Go源码学习: 切片slice的内部数据结构

2021-11-16

Go语言中的切片slice的底层数据结构是数组，这是即使是Go的初学者都了解的。前面我们学习了string类型在运行时由reflect.StringHeader表示，StringHeader可以理解为string在运行时的"描述符"，StringHeader由Data和Len两个字段组成，Data是指向string底层byte数组的指针，Len是string的长度。

string的底层实现是byte数组，切片slice的底层实现也是数组，可以看出在Go语言中数组更多的是作为底层存存储实现的角色，在日常开发中我们直接使用数组的场景很少，更多的是使用slice。本节将学习切片slice在运行时的描述符reflect.SliceHeader。

切片在运行时的描述符reflect.SliceHeader #

slice在Go中的内部结构是reflect.SliceHeader，位于reflect/value.go中：

 1// SliceHeader is the runtime representation of a slice.
 2// It cannot be used safely or portably and its representation may
 3// change in a later release.
 4// Moreover, the Data field is not sufficient to guarantee the data
 5// it references will not be garbage collected, so programs must keep
 6// a separate, correctly typed pointer to the underlying data.
 7type SliceHeader struct {
 8	Data uintptr
 9	Len  int
10	Cap  int
11}

从SliceHeader的注释中可以看出: SliceHeader是slice在运行时的表示形式，也就是前面说的“描述符”。 SliceHeader它本身不存储slice的数据，而只包含一个指向slice底层数组的指针(Data uinptr)、一个表示当前切片长度的int字段(Len int)、一个表示当前切片容量的字段Cap int。即：

Data: 是指向底层数组的指针
Len: 是切片的长度，即切片中元素的个数
Cap: 是切片的最大容量，Len <= Cap

再对比一下string在运行时的描述符StringHeader代码:

1type StringHeader struct {
2	Data uintptr
3	Len  int
4}

可以看到StringHeader只比SliceHeader少了一个Cap字段，因为string具有不可变性，不能直接向底层数组追加元素，所以不需要Cap，因此有经常有人会说字符串string是一个元素类型为byte的只读的切片类型。

对比了SliceHeade和StringHeader之后，下面我们进入正题，下面的代码通过使用unsafe.Pointer的转换能力逐步得到SliceHeader和slice的底层数组结构，并将它们打印出来。

 1package main
 2
 3import (
 4	"fmt"
 5	"reflect"
 6	"unsafe"
 7)
 8
 9func main() {
10	s := make([]int, 0, 8)
11	s = append(s, 2, 4, 6)
12	sh := (*reflect.SliceHeader)(unsafe.Pointer(&s))
13	fmt.Printf("%+v\n", sh) // &{Data:824633844416 Len:3 Cap:8}
14	ap := (*[8]int)(unsafe.Pointer(sh.Data))
15	fmt.Println(ap) // &[2 4 6 0 0 0 0 0]
16
17	s1 := s[1:3:4]
18	sh1 := (*reflect.SliceHeader)(unsafe.Pointer(&s1))
19	fmt.Printf("%+v\n", sh1) // &{Data:824633844424 Len:2 Cap:3}
20	ap1 := (*[8]int)(unsafe.Pointer(sh.Data))
21	fmt.Println(ap1) // &[2 4 6 0 0 0 0 0]
22}

上面的代码中切片s通过make函数创建，切片s1同构对切片s执行切片操作创建。这两个切片底层内存表示示意如下图:

上图中的切片s和切片s1共享同一个底层数组，因此不管是谁对底层数组的修改都会反映到其他切片中。

runtime.makeslice #

下面看一下使用make函数创建切片时的具体实现，当使用make函数创建切片时，例如s := make([]int, 0, 8)，实际上调用的是go运行时的runtime.makeslice这个函数。slice是由Go的编译器和运行时联合实现的数据类型。

runtime.makeslice函数的签名如下:

1> runtime.makeslice() /usr/local/Cellar/go/1.17.2/libexec/src/runtime/slice.go:83 (hits goroutine(1):1 total:1) (PC: 0x1047d2a)
2    82:
3=>  83: func makeslice(et *_type, len, cap int) unsafe.Pointer {
4    84:         mem, overflow := math.MulUintptr(et.size, uintptr(cap))

makeslice函数有三个参数: et对应切片中元素类型，len为切片的长度，cap为切片的容量。这个函数十分简单，主要是通过调用math.MulUintptr判断内存申请，计算当前切片占用的内存空间，最后在堆上申请一块连续的内存空间，内存空间的大小为切片元素类型大小et.size乘以切片容量cap。在计算占用内存空间时有内存是否溢出(要分配的内存是否大于寻址空间)的判断。

 1func makeslice(et *_type, len, cap int) unsafe.Pointer {
 2	mem, overflow := math.MulUintptr(et.size, uintptr(cap))
 3	if overflow || mem > maxAlloc || len < 0 || len > cap {
 4		// NOTE: Produce a 'len out of range' error instead of a
 5		// 'cap out of range' error when someone does make([]T, bignumber).
 6		// 'cap out of range' is true too, but since the cap is only being
 7		// supplied implicitly, saying len is clearer.
 8		// See golang.org/issue/4085.
 9		mem, overflow := math.MulUintptr(et.size, uintptr(len))
10		if overflow || mem > maxAlloc || len < 0 {
11			panicmakeslicelen()
12		}
13		panicmakeslicecap()
14	}
15
16	return mallocgc(mem, et, true)
17}

makeslice返回的就是指向底层数组的指针(unsafe.Pointer)。在调用makeslice函数时如果内存空间的大小发生了溢出(要分配的内存大于寻址空间)、或者传入切片长度len小于0，或者传入的切片容量cap小于len都会panic。