General Tips

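The earlier sections of this book discussed Rust-specific techniques. This section gives a brief overview of some general performance principles.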

Once the obvious pitfalls are avoided (for example, using non-release builds), Rust programs generally perform well.

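Optimized code is often more complex and harder to write than unoptimized code. For this reason, it is usually only worth optimizing hot code.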

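The biggest performance improvements often come from changes to algorithms or data structures, rather than from low-level optimizations. Example 1, Example 2.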
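As a minimal sketch of a data-structure change, the two functions below compute the same count. The function names and the membership-test scenario are hypothetical and chosen only for illustration. The first scans a slice for every lookup; the second builds a `HashSet` once, so each lookup is constant time on average.

```rust
use std::collections::HashSet;

/// Hypothetical example: counts how many `items` appear in `allowed`.
/// Each `contains` call scans the `allowed` slice linearly, so the total
/// work grows with `items.len() * allowed.len()`.
fn count_allowed_slice(items: &[u32], allowed: &[u32]) -> usize {
    items.iter().filter(|&&x| allowed.contains(&x)).count()
}

/// Same result, but `allowed` is copied into a hash set once, so each
/// membership test is O(1) on average.
fn count_allowed_set(items: &[u32], allowed: &[u32]) -> usize {
    let allowed: HashSet<u32> = allowed.iter().copied().collect();
    items.iter().filter(|&&x| allowed.contains(&x)).count()
}
```

For very small `allowed` lists the linear scan may still be faster, so a change like this is worth confirming with a profiler.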

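Writing code that works well with modern hardware is not always easy, but it is worth striving for. For example, try to minimize cache misses and branch mispredictions where possible.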
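One common illustration of cache behavior is traversal order over a matrix stored in a flat buffer. The sketch below assumes a row-major layout; the function names are hypothetical. Both functions return the same sum, but the first reads memory sequentially while the second jumps `cols` elements between reads, which typically causes more cache misses on large matrices.

```rust
/// Sums a `rows x cols` matrix stored row-major in a flat slice,
/// visiting elements in the order they are laid out in memory.
fn sum_sequential(data: &[u64], rows: usize, cols: usize) -> u64 {
    assert_eq!(data.len(), rows * cols);
    let mut total = 0;
    for r in 0..rows {
        for c in 0..cols {
            total += data[r * cols + c];
        }
    }
    total
}

/// Same sum, but walks column by column, so consecutive reads are
/// `cols` elements apart (a strided access pattern).
fn sum_strided(data: &[u64], rows: usize, cols: usize) -> u64 {
    assert_eq!(data.len(), rows * cols);
    let mut total = 0;
    for c in 0..cols {
        for r in 0..rows {
            total += data[r * cols + c];
        }
    }
    total
}
```

The size of the difference depends on the matrix dimensions and the machine, so it should be measured rather than assumed.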

Most optimizations result in small speedups. Although no single one may produce a noticeable change, they add up if you apply enough of them.

Different profilers have different strengths, so there is no need to stick to just one.

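When profiling identifies a hot function, there are two common ways to speed things up: make the function faster, or avoid calling it as much.

Translator's note: for example, if memcpy is too expensive, you need to choose the latter.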
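Avoiding repeated calls to a hot function can be as simple as hoisting a loop-invariant call out of the loop. In the sketch below, `normalize` is a hypothetical stand-in for a function that profiling flagged as hot; the two counting functions are likewise made up for illustration.

```rust
/// Hypothetical hot function: trims and lowercases its input.
fn normalize(s: &str) -> String {
    s.trim().to_lowercase()
}

/// Calls `normalize(needle)` once per element of `haystack`,
/// even though the result never changes.
fn count_matches_naive(haystack: &[&str], needle: &str) -> usize {
    haystack
        .iter()
        .filter(|&&h| normalize(h) == normalize(needle))
        .count()
}

/// Hoists the loop-invariant call, so `normalize(needle)` runs once.
fn count_matches_hoisted(haystack: &[&str], needle: &str) -> usize {
    let needle = normalize(needle);
    haystack.iter().filter(|&&h| normalize(h) == needle).count()
}
```

The hoisted version roughly halves the number of `normalize` calls without touching the function itself.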

It is often easier to eliminate silly slowdowns than it is to introduce clever speedups.

Avoid computing things unless necessary. Lazy, on-demand computation is often a win. Example 1, Example 2.
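One way to sketch on-demand computation is with `std::cell::OnceCell` (stable since Rust 1.70). The `Report` type and its fields are hypothetical. The sorted copy is built only when `median` is first called, and reused afterwards; a `Report` whose median is never requested pays nothing for sorting.

```rust
use std::cell::OnceCell;

/// Hypothetical type: holds raw samples and a lazily built sorted copy.
struct Report {
    samples: Vec<u64>,
    /// Empty until `median` is first called.
    sorted: OnceCell<Vec<u64>>,
}

impl Report {
    fn new(samples: Vec<u64>) -> Self {
        Report { samples, sorted: OnceCell::new() }
    }

    /// Returns the middle element of the sorted samples (the upper of the
    /// two middle elements when the length is even), or `None` if empty.
    fn median(&self) -> Option<u64> {
        // The closure runs at most once; later calls reuse the result.
        let sorted = self.sorted.get_or_init(|| {
            let mut v = self.samples.clone();
            v.sort_unstable();
            v
        });
        sorted.get(sorted.len() / 2).copied()
    }
}
```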

Complex general cases can often be avoided by optimistically checking for common special cases that are simpler. Example 1, Example 2, Example 3.
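A minimal sketch of an optimistic special-case check, assuming (hypothetically) that most inputs are pure ASCII: test for that case first and fall back to the general Unicode path otherwise. The function name `lowercase` is made up for this example.

```rust
/// Lowercases `s`, checking first for the all-ASCII case.
fn lowercase(s: &str) -> String {
    if s.is_ascii() {
        // Special case: byte-wise ASCII lowercasing.
        s.to_ascii_lowercase()
    } else {
        // General case: full Unicode lowercasing.
        s.to_lowercase()
    }
}
```

Both branches produce the same result for ASCII input, so the check only changes cost, not behavior. Whether it is an actual win depends on the input mix and on how the standard library implements the general path, so it should be measured.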

In particular, specially handling collections with 0, 1, or 2 elements is often a win when small sizes dominate. Example 1, Example 2, Example 3, Example 4.
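The following sketch special-cases slices of length 0, 1, and 2 so that the general path, which allocates a hash set, runs only for larger inputs. The function `has_duplicates` is a hypothetical example, not taken from any of the linked changes.

```rust
use std::collections::HashSet;

/// Reports whether `items` contains any repeated value.
fn has_duplicates(items: &[u32]) -> bool {
    match items {
        // Zero or one element: nothing to compare, no allocation.
        [] | [_] => false,
        // Two elements: a single comparison.
        [a, b] => a == b,
        // General case: track seen values in a hash set.
        _ => {
            let mut seen = HashSet::with_capacity(items.len());
            // `insert` returns false if the value was already present.
            !items.iter().all(|&x| seen.insert(x))
        }
    }
}
```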

Similarly, when dealing with repetitive data, it is often possible to use a simple form of data compression, by using a compact representation for common values and then having a fallback to a secondary table for unusual values. Example 1, Example 2, Example 3.
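The compact-representation idea can be sketched as follows, assuming (hypothetically) that almost all values fit in a byte. Each slot stores one `u8`; a sentinel byte marks the rare slots whose real value lives in a secondary table. The type `CompactCounts` and the constant `OVERFLOW` are names invented for this sketch.

```rust
use std::collections::HashMap;

/// Sentinel byte meaning "the real value is in the `overflow` table".
const OVERFLOW: u8 = u8::MAX;

/// Hypothetical container: one byte per slot for values 0..=254,
/// plus a side table for the unusual larger values.
struct CompactCounts {
    small: Vec<u8>,
    overflow: HashMap<usize, u64>,
}

impl CompactCounts {
    fn new(len: usize) -> Self {
        CompactCounts { small: vec![0; len], overflow: HashMap::new() }
    }

    fn set(&mut self, i: usize, value: u64) {
        if value < u64::from(OVERFLOW) {
            // Common case: store inline and drop any stale side entry.
            self.small[i] = value as u8;
            self.overflow.remove(&i);
        } else {
            // Unusual case: mark the slot and store the value on the side.
            self.small[i] = OVERFLOW;
            self.overflow.insert(i, value);
        }
    }

    fn get(&self, i: usize) -> u64 {
        match self.small[i] {
            OVERFLOW => self.overflow[&i],
            v => u64::from(v),
        }
    }
}
```

Compared with a `Vec<u64>`, this uses one eighth of the memory for the inline part, at the cost of a hash lookup for the unusual values.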

When code deals with multiple cases, measure case frequencies and handle the most common ones first.

When dealing with lookups that involve high locality, it can be a win to put a small cache in front of a data structure.
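A one-entry cache is about the smallest version of this idea. The sketch below puts one in front of a read-only `BTreeMap`; the `CachedLookup` type and its `hits` counter are hypothetical and exist only to make the behavior visible. If the table could be modified, the cached entry would also need to be invalidated on writes.

```rust
use std::collections::BTreeMap;

/// Hypothetical wrapper: remembers the most recent lookup.
struct CachedLookup {
    table: BTreeMap<u64, u64>,
    /// Most recent key and its result (including "not found").
    last: Option<(u64, Option<u64>)>,
    /// Number of lookups answered from the cache.
    hits: u64,
}

impl CachedLookup {
    fn new(table: BTreeMap<u64, u64>) -> Self {
        CachedLookup { table, last: None, hits: 0 }
    }

    fn get(&mut self, key: u64) -> Option<u64> {
        // Cache check: same key as last time?
        if let Some((k, v)) = self.last {
            if k == key {
                self.hits += 1;
                return v;
            }
        }
        // Miss: consult the underlying map and remember the answer.
        let v = self.table.get(&key).copied();
        self.last = Some((key, v));
        v
    }
}
```

This only pays off when consecutive lookups frequently repeat the same key, which is the high-locality situation described above.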

Optimized code often has a non-obvious structure, which means that explanatory comments are valuable, particularly those that reference profiling measurements. A comment like “99% of the time this vector has 0 or 1 elements, so handle those cases first” can be illuminating.