map的内部结构

Go中的map使用哈希表实现的,在源码go runtime/hashmap.go中可以看到对应实现。

 1// A header for a Go map.
 2type hmap struct {
 3	// Note: the format of the Hmap is encoded in ../../cmd/internal/gc/reflect.go and
 4	// ../reflect/type.go. Don't change this structure without also changing that code!
 5	count     int // # live cells == size of map.  Must be first (used by len() builtin)
 6	flags     uint8
 7	B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
 8	noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
 9	hash0     uint32 // hash seed
10
11	buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
12	oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
13	nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)
14
15	// If both key and value do not contain pointers and are inline, then we mark bucket
16	// type as containing no pointers. This avoids scanning such maps.
17	// However, bmap.overflow is a pointer. In order to keep overflow buckets
18	// alive, we store pointers to all overflow buckets in hmap.overflow.
19	// Overflow is used only if key and value do not contain pointers.
20	// overflow[0] contains overflow buckets for hmap.buckets.
21	// overflow[1] contains overflow buckets for hmap.oldbuckets.
22	// The first indirection allows us to reduce static size of hmap.
23	// The second indirection allows to store a pointer to the slice in hiter.
24	overflow *[2]*[]*bmap
25}
26
27// A bucket for a Go map.
28type bmap struct {
29	// tophash generally contains the top byte of the hash value
30	// for each key in this bucket. If tophash[0] < minTopHash,
31	// tophash[0] is a bucket evacuation state instead.
32	tophash [bucketCnt]uint8
33	// Followed by bucketCnt keys and then bucketCnt values.
34	// NOTE: packing all the keys together and then all the values together makes the
35	// code a bit more complicated than alternating key/value/key/value/... but it allows
36	// us to eliminate padding which would be needed for, e.g., map[int64]int8.
37	// Followed by an overflow pointer.
38}

先删掉代码原有注释,以便于查看:

 1// A header for a Go map.
 2type hmap struct {
 3	count     int 
 4	flags     uint8
 5	B         uint8
 6	noverflow uint16
 7	hash0     uint32
 8	buckets    unsafe.Pointer
 9	oldbuckets unsafe.Pointer
10	nevacuate  uintptr
11	overflow *[2]*[]*bmap
12}

hmap和bmap是两个关键的结构,其中hmap是go内部map的实现,hmap是由桶数组buckets组成,bmap是桶bucket的结构。

 1const (
 2	// Maximum number of key/value pairs a bucket can hold.
 3	bucketCntBits = 3
 4	bucketCnt     = 1 << bucketCntBits
 5)
 6
 7// A bucket for a Go map.
 8type bmap struct {
 9	tophash [bucketCnt]uint8
10}

bucketCnt为常量8,也就是说每个桶最大持有8个key/value键值对。

Go in Action中给出了map的内部结构的示意图:

go-map

与map中的kev/value交互,首先要定位到桶bucket。根据key的具体类型的hash函数计算key的hash值假设为h,h%len(buckets)得到 LOB HASH(哈希值低位),假设为hl,buckets[hl]则为此key的桶,每个桶最多持有8个key/value,如果桶被装满,造成了桶溢出,也就是发生了hash碰撞,则会形成overflow bucket链表。hash值的高位字节HOB HASH用于定位桶中的key/value。

1const (
2	// Maximum average load of a bucket that triggers growth.
3	loadFactor = 6.5
4)

hashmap.go中的负载因子loadFactor为常量6.5,这是个经验值,每个桶的大小为8,也就是一般情况下不会被装满。

参考