散列

HashSetHashMap 是很常用的类型。默认的散列算法并未指定,不过目前rust使用 SipHash 1-3, 这是种高质量的算法,提供很好的碰撞保护——也正因如此,它很慢,特别是使用短键(如整数)时。

如果性能分析显示散列是热点,而且你的应用不需要担心HashDoS攻击, 你可以使用散列算法更快的散列表,得到很大的性能提升。

  • rustc-hash provides FxHashSet and FxHashMap types that are drop-in replacements for HashSet and HashMap. Its hashing algorithm is low-quality but very fast, especially for integer keys, and has been found to out-perform all other hash algorithms within rustc. (fxhash is an older, less well maintained implementation of the same algorithm and types.)
  • fnv provides FnvHashSet and FnvHashMap types. Its hashing algorithm is higher quality than rustc-hash’s but a little slower.
  • ahash provides AHashSet and AHashMap. Its hashing algorithm can take advantage of AES instruction support that is available on some processors.

If hashing performance is important in your program, it is worth trying more than one of these alternatives. For example, the following results were seen in rustc.

If you decide to universally use one of the alternatives, such as FxHashSet/FxHashMap, it is easy to accidentally use HashSet/HashMap in some places. You can use clippy to avoid this problem.

一些类型不需要散列。例如,你可能用newtype包装了一个随机(或者接近随机)的整数。 对于这种类型,分发他们自己的值和分发散列后得到的值没有区别。 这种情况下你可以试试 nohash_hasher

散列函数设计是一个复杂的话题,超出本书的范围。ahash的文档有相关的讨论。