散列
HashSet
和 HashMap
是很常用的类型。默认的散列算法并未指定,不过目前rust使用 SipHash 1-3,
这是种高质量的算法,提供很好的碰撞保护——也正因如此,它很慢,特别是使用短键(如整数)时。
如果性能分析显示散列是热点,而且你的应用不需要担心HashDoS攻击, 你可以使用散列算法更快的散列表,得到很大的性能提升。
rustc-hash
providesFxHashSet
andFxHashMap
types that are drop-in replacements forHashSet
andHashMap
. Its hashing algorithm is low-quality but very fast, especially for integer keys, and has been found to out-perform all other hash algorithms within rustc. (fxhash
is an older, less well maintained implementation of the same algorithm and types.)fnv
providesFnvHashSet
andFnvHashMap
types. Its hashing algorithm is higher quality thanrustc-hash
’s but a little slower.ahash
providesAHashSet
andAHashMap
. Its hashing algorithm can take advantage of AES instruction support that is available on some processors.
If hashing performance is important in your program, it is worth trying more than one of these alternatives. For example, the following results were seen in rustc.
- The switch from
fnv
tofxhash
gave speedups of up to 6%. - An attempt to switch from
fxhash
toahash
resulted in slowdowns of 1-4%. - An attempt to switch from
fxhash
back to the default hasher resulted in slowdowns ranging from 4-84%!
If you decide to universally use one of the alternatives, such as
FxHashSet
/FxHashMap
, it is easy to accidentally use HashSet
/HashMap
in
some places. You can use clippy
to avoid this problem.
一些类型不需要散列。例如,你可能用newtype包装了一个随机(或者接近随机)的整数。
对于这种类型,分发他们自己的值和分发散列后得到的值没有区别。
这种情况下你可以试试 nohash_hasher
。
散列函数设计是一个复杂的话题,超出本书的范围。ahash
的文档有相关的讨论。