为何使用for-range遍历map时是乱序的

Kesa...大约 3 分钟golang

在阅读Go设计与实现open in new windowfor-rangeopen in new window章节时,了解到使用for-range循环遍历map时,获取的值是乱序的,并非按照存入顺序。

于是产生了疑问:为什么要乱序?

1. for-range 如何遍历 map

使用for-range遍历map时,会在runtime.mapiterinitopen in new window函数中,初始化开始遍历的元素:

// mapiterinit initializes the hiter struct used for ranging over maps.
// The hiter struct pointed to by 'it' is allocated on the stack
// by the compilers order pass or on the heap by reflect_mapiterinit.
// Both need to have zeroed hiter since the struct contains pointers.
func mapiterinit(t *maptype, h *hmap, it *hiter) {
	...
	// decide where to start
	r := uintptr(fastrand())
	if h.B > 31-bucketCntBits {
		r += uintptr(fastrand()) << 31
	}
	it.startBucket = r & bucketMask(h.B)
	it.offset = uint8(r >> h.B & (bucketCnt - 1))

	// iterator state
	it.bucket = it.startBucket

	...
}

其中的通过r := uintptr(fastrand()) 获取随机数用于计算遍历的首个存储桶的位置,所以每次遍历时的起始位置随机的。

2. 为何要随机遍历

根据 The Go Blog: Go maps in action: Iteration orderopen in new window 中的描述:

When iterating over a map with a range loop, the iteration order is not specified and is not guaranteed to be the same from one iteration to the next. Since the release of Go 1.0, the runtime has randomized map iteration order. Programmers had begun to rely on the stable iteration order of early versions of Go, which varied between implementations, leading to portability bugs. If you require a stable iteration order you must maintain a separate data structure that specifies that order.

Go 1 Release Notes: Iterating in mapsopen in new window中的描述:

The old language specification did not define the order of iteration for maps, and in practice it differed across hardware platforms. This caused tests that iterated over maps to be fragile and non-portable, with the unpleasant property that a test might always pass on one machine but break on another.

In Go 1, the order in which elements are visited when iterating over a map using a for range statement is defined to be unpredictable, even if the same loop is run multiple times with the same map. Code should not assume that the elements are visited in any particular order.

This change means that code that depends on iteration order is very likely to break early and be fixed long before it becomes a problem. Just as important, it allows the map implementation to ensure better map balancing even when programs are using range loops to select an element from a map.

总结来说为了避免开发者依赖哈希表的遍历顺序,因为哈希表的实现可能会改变,不同的实现方式会可能导致原来的依赖遍历顺序的程序失效。

个人认为,因为哈希映射的过程本身就是随机的,若想要在遍历时按照某种固定的顺序获取元素,需要额外的内存和时间开销,会降低哈希表的性能。

3. encoding/jsonopen in new window 如何保证输出的一致性

当使用函数json.Marshal()open in new window将哈希表编码成json格式时,每次的输出都是一样的。

但是 Golang 中哈希表的遍历是随机的,这是如何做到的 ?

当函数func Marshal(v any)传入的参数类型为哈希表时,会选择对应的编码函数 json.mapEncoder.encodeopen in new window

func (me mapEncoder) encode(e *encodeState, v reflect.Value, opts encOpts) {
	...
	// Extract and sort the keys.
	sv := make([]reflectWithString, v.Len())
	mi := v.MapRange()
	for i := 0; mi.Next(); i++ {
		sv[i].k = mi.Key()
		sv[i].v = mi.Value()
		if err := sv[i].resolve(); err != nil {
			e.error(fmt.Errorf("json: encoding error for type %q: %q", v.Type().String(), err.Error()))
		}
	}
	sort.Slice(sv, func(i, j int) bool { return sv[i].ks < sv[j].ks })
	...
}

由上可以看到,在获取哈希表中的所有元素之后,将entry按照key进行了排序,所以对于同一个哈希表来说,能够确保编码的结果的一致性。

Reference

  1. https://stackoverflow.com/questions/55925822/why-are-iterations-over-maps-randomopen in new window
  2. The Go Blog: Go maps in action: Iteration order:open in new window
  3. Go 1 Release Notes: Iterating in maps:open in new window
  4. https://draveness.me/golang/docs/part2-foundation/ch05-keyword/golang-for-range/open in new window
上次编辑于:
评论
  • 按正序
  • 按倒序
  • 按热度
Powered by Waline v2.15.2