Introduction to Linux process memory layout

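On Linux, a process's memory follows a regular layout. From high addresses to low addresses, the address space is roughly divided into the following segments:

  • Stack: the user-mode stack. Its maximum size is capped by a resource limit that can be viewed and changed with ulimit -s; the default is typically 8 MB. The stack grows from high addresses toward low addresses as functions are called.
  • Heap: memory that the program allocates and frees dynamically at run time. Heap allocations are not necessarily contiguous; overall the heap grows from low addresses toward high addresses.
  • bss (uninitialized data): holds global variables that have no explicit initial value.
  • data (initialized data): holds global variables that have been initialized.
  • text: holds the program code.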
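One way to see these regions for a live process is to read /proc/self/maps. The following is a minimal sketch, assuming a Linux system with procfs mounted; the region type and the parseMapsLine helper are names made up for this illustration. A Go binary may not show a [heap] entry, since the Go runtime generally obtains its heap with mmap rather than by growing the classic heap.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// region is one line of /proc/<pid>/maps (hypothetical helper type).
type region struct {
	Start, End uint64
	Perms      string // e.g. "r-xp"
	Name       string // e.g. "[stack]" or a file path; may be empty
}

// parseMapsLine parses a line such as
// "00400000-0047f000 r-xp 00000000 08:01 123 /tmp/hello".
func parseMapsLine(line string) (region, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 {
		return region{}, fmt.Errorf("unexpected maps line: %q", line)
	}
	lo, hi, ok := strings.Cut(fields[0], "-")
	if !ok {
		return region{}, fmt.Errorf("bad address range: %q", fields[0])
	}
	start, err := strconv.ParseUint(lo, 16, 64)
	if err != nil {
		return region{}, err
	}
	end, err := strconv.ParseUint(hi, 16, 64)
	if err != nil {
		return region{}, err
	}
	r := region{Start: start, End: end, Perms: fields[1]}
	if len(fields) >= 6 {
		// The name is optional; anonymous mappings have none.
		r.Name = strings.Join(fields[5:], " ")
	}
	return r, nil
}

func main() {
	f, err := os.Open("/proc/self/maps") // Linux only
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		r, err := parseMapsLine(sc.Text())
		if err != nil {
			continue // skip lines this sketch does not understand
		}
		fmt.Printf("%#x-%#x %s %d KiB %s\n",
			r.Start, r.End, r.Perms, (r.End-r.Start)/1024, r.Name)
	}
}
```

The file lists mappings from low to high addresses, so the [stack] entry normally appears near the end.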

memory-layout-c.jpg

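Introduction to the ELF file format

ELF stands for "Executable and Linkable Format"; it is the format Linux uses for executables. As an example, we analyze the executable produced by compiling a Go program on Linux. The sample hello.go is: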

package main

import "fmt"

var (
	fooVar = "hello"
	barVar string
)

func main() {
	barVar = "hellob"
	fmt.Println(fooVar, barVar)
}

Compile hello.go to produce the binary hello:

go build -gcflags "-N -l" hello.go

The -gcflags "-N -l" option disables compiler optimizations and inlining.

Use readelf -e to examine the structure of the hello ELF binary:

readelf -e hello
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x45c1a0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          456 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         23
  Section header string table index: 3

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000401000  00001000
       000000000007d39e  0000000000000000  AX       0     0     32
  [ 2] .rodata           PROGBITS         000000000047f000  0007f000
       0000000000034eac  0000000000000000   A       0     0     32
  [ 3] .shstrtab         STRTAB           0000000000000000  000b3ec0
       000000000000017a  0000000000000000           0     0     1
  [ 4] .typelink         PROGBITS         00000000004b4040  000b4040
       00000000000004d8  0000000000000000   A       0     0     32
  [ 5] .itablink         PROGBITS         00000000004b4520  000b4520
       0000000000000058  0000000000000000   A       0     0     32
  [ 6] .gosymtab         PROGBITS         00000000004b4578  000b4578
       0000000000000000  0000000000000000   A       0     0     1
  [ 7] .gopclntab        PROGBITS         00000000004b4580  000b4580
       0000000000058710  0000000000000000   A       0     0     32
  [ 8] .go.buildinfo     PROGBITS         000000000050d000  0010d000
       0000000000000020  0000000000000000  WA       0     0     16
  [ 9] .noptrdata        PROGBITS         000000000050d020  0010d020
       00000000000105c0  0000000000000000  WA       0     0     32
  [10] .data             PROGBITS         000000000051d5e0  0011d5e0
       0000000000007810  0000000000000000  WA       0     0     32
  [11] .bss              NOBITS           0000000000524e00  00124e00
       000000000002ef28  0000000000000000  WA       0     0     32
  [12] .noptrbss         NOBITS           0000000000553d40  00153d40
       0000000000005360  0000000000000000  WA       0     0     32
  [13] .zdebug_abbrev    PROGBITS         000000000055a000  00125000
       0000000000000119  0000000000000000           0     0     1
  [14] .zdebug_line      PROGBITS         000000000055a119  00125119
       000000000001ae04  0000000000000000           0     0     1
  [15] .zdebug_frame     PROGBITS         0000000000574f1d  0013ff1d
       0000000000005421  0000000000000000           0     0     1
  [16] .debug_gdb_s[...] PROGBITS         000000000057a33e  0014533e
       0000000000000028  0000000000000000           0     0     1
  [17] .zdebug_info      PROGBITS         000000000057a366  00145366
       0000000000031048  0000000000000000           0     0     1
  [18] .zdebug_loc       PROGBITS         00000000005ab3ae  001763ae
       000000000001906f  0000000000000000           0     0     1
  [19] .zdebug_ranges    PROGBITS         00000000005c441d  0018f41d
       0000000000008b8b  0000000000000000           0     0     1
  [20] .note.go.buildid  NOTE             0000000000400f9c  00000f9c
       0000000000000064  0000000000000000   A       0     0     4
  [21] .symtab           SYMTAB           0000000000000000  00197fa8
       000000000000c378  0000000000000018          22   120     8
  [22] .strtab           STRTAB           0000000000000000  001a4320
       000000000000b002  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x0000000000000188 0x0000000000000188  R      0x1000
  NOTE           0x0000000000000f9c 0x0000000000400f9c 0x0000000000400f9c
                 0x0000000000000064 0x0000000000000064  R      0x4
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000007e39e 0x000000000007e39e  R E    0x1000
  LOAD           0x000000000007f000 0x000000000047f000 0x000000000047f000
                 0x000000000008dc90 0x000000000008dc90  R      0x1000
  LOAD           0x000000000010d000 0x000000000050d000 0x000000000050d000
                 0x0000000000017e00 0x000000000004c0a0  RW     0x1000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x8
  LOOS+0x5041580 0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0x8

 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.go.buildid
   02     .text .note.go.buildid
   03     .rodata .typelink .itablink .gosymtab .gopclntab
   04     .go.buildinfo .noptrdata .data .bss .noptrbss
   05
   06
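The same information can be read programmatically with Go's standard debug/elf package. The sketch below prints a rough subset of what readelf -e shows; the summarize function and the output format are choices made for this illustration, not part of any tool.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// summarize opens an ELF file and prints the entry point, the section
// headers, and the program headers.
func summarize(path string) error {
	f, err := elf.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	fmt.Printf("class=%v type=%v machine=%v entry=%#x\n",
		f.Class, f.Type, f.Machine, f.Entry)

	for i, s := range f.Sections {
		fmt.Printf("[%2d] %-18s %-14v addr=%#x off=%#x size=%#x flags=%v\n",
			i, s.Name, s.Type, s.Addr, s.Offset, s.Size, s.Flags)
	}
	for _, p := range f.Progs {
		fmt.Printf("%-14v off=%#x vaddr=%#x filesz=%#x memsz=%#x flags=%v\n",
			p.Type, p.Off, p.Vaddr, p.Filesz, p.Memsz, p.Flags)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: elfsummary <elf-file>")
		os.Exit(2)
	}
	if err := summarize(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Running it against the hello binary should list the same section names and addresses as the readelf output above.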

After simplification, the ELF layout of a Go binary looks roughly like the figure below:

go-elf-fmt.png

A Go ELF binary consists of:

  • ELF Header: contains basic information such as the architecture and ABI version. Note that the Entry point address is the entry address of the Go program.
  • Section Headers
  • Program Headers
  • Sections
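The first 16 bytes of the ELF Header (the Magic line in the readelf output) already identify the file. As a small illustration, the sketch below decodes them using constants from the standard debug/elf package; identInfo and parseIdent are hypothetical helper names.

```go
package main

import (
	"bytes"
	"debug/elf"
	"fmt"
)

// identInfo holds two fields decoded from e_ident (illustrative type).
type identInfo struct {
	Is64         bool
	LittleEndian bool
}

// parseIdent checks the ELF magic and decodes the class and data encoding.
func parseIdent(ident []byte) (identInfo, error) {
	if len(ident) < elf.EI_NIDENT {
		return identInfo{}, fmt.Errorf("need %d bytes, got %d", elf.EI_NIDENT, len(ident))
	}
	if !bytes.HasPrefix(ident, []byte(elf.ELFMAG)) {
		return identInfo{}, fmt.Errorf("not an ELF file: magic % x", ident[:4])
	}
	return identInfo{
		Is64:         elf.Class(ident[elf.EI_CLASS]) == elf.ELFCLASS64,
		LittleEndian: elf.Data(ident[elf.EI_DATA]) == elf.ELFDATA2LSB,
	}, nil
}

func main() {
	// The 16 Magic bytes from the readelf output above.
	magic := []byte{0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
		0, 0, 0, 0, 0, 0, 0, 0}
	info, err := parseIdent(magic)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("64-bit: %v, little endian: %v\n", info.Is64, info.LittleEndian)
	// prints "64-bit: true, little endian: true"
}
```

This matches the Class (ELF64) and Data (little endian) lines that readelf reports.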

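The main sections are:

  • .text holds the program code.
  • .rodata holds constant data; for example, literals in the program are placed in this section at compile time.
  • .data and .noptrdata hold global variables that were initialized at compile time; in Go these should be the initialized package-level variables.
  • .bss and .noptrbss hold uninitialized global variables; in Go these should be the uninitialized package-level variables.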

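Go is a garbage-collected language with a runtime, so initialized data is split into .data, which contains pointers, and .noptrdata, which does not. Uninitialized data is split the same way into .bss and .noptrbss.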
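To get a feel for which variables would count as "containing pointers", the sketch below walks a type with the reflect package. It is only an approximation written for illustration, under the assumption that the split follows whether the variable's type contains pointers; it is not the compiler's or linker's actual logic, and hasPointers and typeHasPointers are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
)

// hasPointers reports whether a value of v's type contains anything the
// garbage collector would have to follow.
func hasPointers(v any) bool {
	return typeHasPointers(reflect.TypeOf(v))
}

func typeHasPointers(t reflect.Type) bool {
	if t == nil {
		return false // an untyped nil carries no type information
	}
	switch t.Kind() {
	case reflect.Ptr, reflect.UnsafePointer, reflect.String, reflect.Slice,
		reflect.Map, reflect.Chan, reflect.Func, reflect.Interface:
		// All of these are, or start with, a pointer word.
		return true
	case reflect.Array:
		return t.Len() > 0 && typeHasPointers(t.Elem())
	case reflect.Struct:
		for i := 0; i < t.NumField(); i++ {
			if typeHasPointers(t.Field(i).Type) {
				return true
			}
		}
		return false
	default:
		// Booleans, integers, floats, and complex numbers.
		return false
	}
}

func main() {
	samples := []struct {
		desc string
		v    any
	}{
		{"string", "hello"},
		{"int", 42},
		{"[4]float64", [4]float64{}},
		{"struct{n int; p *int}", struct {
			n int
			p *int
		}{}},
	}
	for _, s := range samples {
		fmt.Printf("%-24s contains pointers: %v\n", s.desc, hasPointers(s.v))
	}
}
```

A string counts as pointer-containing because its header holds a pointer to the bytes, which is consistent with the string variables of hello.go landing in .data and .bss rather than in the noptr sections.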

The Flags field in the ELF Section Headers gives the permissions of each section:

Section      Flags  Permissions
.text        AX     read, execute
.rodata      A      read
.data        WA     read, write
.noptrdata   WA     read, write
.bss         WA     read, write
.noptrbss    WA     read, write
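The letters in the Flags column are a rendering of bits in the section header's sh_flags field. A minimal sketch of that decoding, using the SHF_* constants from the standard debug/elf package (flagLetters is a hypothetical helper, and only the W, A, and X bits are handled):

```go
package main

import (
	"debug/elf"
	"fmt"
)

// flagLetters renders the W/A/X subset of sh_flags in the order readelf uses.
func flagLetters(raw uint64) string {
	f := elf.SectionFlag(raw)
	out := ""
	if f&elf.SHF_WRITE != 0 {
		out += "W" // writable at run time
	}
	if f&elf.SHF_ALLOC != 0 {
		out += "A" // occupies memory in the running process
	}
	if f&elf.SHF_EXECINSTR != 0 {
		out += "X" // contains executable instructions
	}
	return out
}

func main() {
	sections := []struct {
		name  string
		flags uint64
	}{
		{".text", 0x6},   // SHF_ALLOC | SHF_EXECINSTR
		{".rodata", 0x2}, // SHF_ALLOC
		{".data", 0x3},   // SHF_WRITE | SHF_ALLOC
	}
	for _, s := range sections {
		fmt.Printf("%-8s %s\n", s.name, flagLetters(s.flags))
	}
}
```

Note that A strictly means "allocated in memory at run time"; reading it as "readable" is the practical consequence once the section is mapped.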

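Memory layout of a Go process

When the program is executed, the ELF file is loaded into memory. Sections with the same permissions are mapped into a single segment, giving roughly the load structure shown below: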

elf-mem-load.png
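The section-to-segment mapping can be checked with plain address arithmetic on the numbers from the readelf output. The sketch below hard-codes the three LOAD segments and a few sections from that output; the span type and the helper names are made up for this example.

```go
package main

import "fmt"

// span is a half-open virtual address range [Addr, Addr+Size).
type span struct {
	Name string
	Addr uint64
	Size uint64
}

func (s span) end() uint64 { return s.Addr + s.Size }

// LOAD segments from the Program Headers (VirtAddr, MemSiz).
var segments = []span{
	{"LOAD R E", 0x400000, 0x7e39e},
	{"LOAD R", 0x47f000, 0x8dc90},
	{"LOAD RW", 0x50d000, 0x4c0a0},
}

// A few sections from the Section Headers (Address, Size).
var sections = []span{
	{".text", 0x401000, 0x7d39e},
	{".rodata", 0x47f000, 0x34eac},
	{".data", 0x51d5e0, 0x7810},
	{".bss", 0x524e00, 0x2ef28},
	{".noptrbss", 0x553d40, 0x5360},
}

// segmentOf returns the name of the first segment that fully contains sec,
// or "" if there is none.
func segmentOf(segs []span, sec span) string {
	for _, seg := range segs {
		if sec.Addr >= seg.Addr && sec.end() <= seg.end() {
			return seg.Name
		}
	}
	return ""
}

// zeroFill is the part of a segment that has no bytes in the file and is
// zero-filled by the loader (MemSiz minus FileSiz).
func zeroFill(filesz, memsz uint64) uint64 { return memsz - filesz }

func main() {
	for _, sec := range sections {
		fmt.Printf("%-10s -> %s\n", sec.Name, segmentOf(segments, sec))
	}
	// The RW segment has FileSiz 0x17e00 and MemSiz 0x4c0a0.
	fmt.Printf("file-backed part ends at %#x\n", uint64(0x50d000+0x17e00)) // 0x524e00, where .bss starts
	fmt.Printf("zero-filled bytes: %#x\n", zeroFill(0x17e00, 0x4c0a0))    // 0x342a0
}
```

The RW segment's file-backed part ends at 0x524e00, exactly where .bss begins, which is why the NOBITS sections .bss and .noptrbss take up memory but no space in the file.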

Next we use objdump to locate the contents of the code and data sections for the simple hello.go program.

Inspect .text:

objdump -t -j .text hello
hello:     file format elf64-x86-64

SYMBOL TABLE:
0000000000401000 l     F .text  0000000000000000 runtime.text
0000000000401d40 l     F .text  000000000000022d cmpbody
0000000000401fa0 l     F .text  000000000000013e memeqbody
0000000000402120 l     F .text  0000000000000117 indexbytebody
0000000000457be0 l     F .text  0000000000000040 gogo
0000000000457c20 l     F .text  0000000000000035 callRet
0000000000457c60 l     F .text  000000000000002f gosave_systemstack_switch
0000000000457ca0 l     F .text  000000000000000d setg_gcc
0000000000457cc0 l     F .text  000000000000055a aeshashbody
0000000000458220 l     F .text  000000000000004b debugCall32
0000000000458280 l     F .text  000000000000004b debugCall64
00000000004582e0 l     F .text  000000000000006f debugCall128
0000000000458360 l     F .text  0000000000000072 debugCall256
00000000004583e0 l     F .text  0000000000000072 debugCall512
0000000000458460 l     F .text  0000000000000072 debugCall1024
00000000004584e0 l     F .text  0000000000000072 debugCall2048
0000000000458560 l     F .text  0000000000000076 debugCall4096
00000000004585e0 l     F .text  0000000000000076 debugCall8192
0000000000458660 l     F .text  0000000000000076 debugCall16384
00000000004586e0 l     F .text  0000000000000076 debugCall32768
0000000000458760 l     F .text  0000000000000076 debugCall65536
000000000047e39e l     F .text  0000000000000000 runtime.etext
0000000000401000 g     F .text  0000000000000059 internal/cpu.Initialize
0000000000401060 g     F .text  0000000000000625 internal/cpu.processOptions
00000000004016a0 g     F .text  0000000000000445 internal/cpu.doinit
0000000000401b00 g     F .text  000000000000001b internal/cpu.cpuid.abi0
0000000000401b20 g     F .text  0000000000000011 internal/cpu.xgetbv.abi0
0000000000401b40 g     F .text  0000000000000087 type..eq.internal/cpu.option
0000000000401be0 g     F .text  0000000000000094 type..eq.[15]internal/cpu.option
0000000000401c80 g     F .text  000000000000006f runtime/internal/sys.OnesCount64
0000000000401d00 g     F .text  0000000000000022 internal/bytealg.init.0
0000000000401f80 g     F .text  000000000000000e runtime.cmpstring
00000000004020e0 g     F .text  000000000000001b runtime.memequal
0000000000402100 g     F .text  000000000000001c runtime.memequal_varlen
0000000000402240 g     F .text  0000000000000018 internal/bytealg.IndexByteString.abi0
0000000000402260 g     F .text  0000000000000043 type..eq.internal/abi.RegArgs
00000000004022c0 g     F .text  0000000000000043 runtime.memhash128
0000000000402320 g     F .text  000000000000004a runtime.strhashFallback
0000000000402380 g     F .text  00000000000000e5 runtime.f32hash

Inspect .rodata. The string literals "hello" and "hellob" from hello.go should be in .rodata, so we search for them:

objdump -s -j .rodata hello | grep hello
hello:     file format elf64-x86-64
 494900 616e6863 68616e68 656c6c6f 696e6974  anhchanhelloinit
 494aa0 656e6365 6572726e 6f206865 6c6c6f62  enceerrno hellob
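Each objdump -s line carries a start address, four groups of hex bytes, and an ASCII rendering, so the exact address of a literal can be worked out from it. A small sketch, assuming full 16-byte lines like the two above (decodeDumpLine and find are hypothetical helpers):

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// decodeDumpLine splits one `objdump -s` line into its start address and
// raw bytes. It assumes a full line with four 4-byte hex groups.
func decodeDumpLine(line string) (uint64, []byte, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 {
		return 0, nil, fmt.Errorf("expected an address and four hex groups: %q", line)
	}
	addr, err := strconv.ParseUint(fields[0], 16, 64)
	if err != nil {
		return 0, nil, err
	}
	data, err := hex.DecodeString(strings.Join(fields[1:5], ""))
	if err != nil {
		return 0, nil, err
	}
	return addr, data, nil
}

// find returns the virtual address of needle within the dumped line.
func find(line, needle string) (uint64, bool) {
	addr, data, err := decodeDumpLine(line)
	if err != nil {
		return 0, false
	}
	i := bytes.Index(data, []byte(needle))
	if i < 0 {
		return 0, false
	}
	return addr + uint64(i), true
}

func main() {
	lines := []struct{ dump, needle string }{
		{" 494900 616e6863 68616e68 656c6c6f 696e6974  anhchanhelloinit", "hello"},
		{" 494aa0 656e6365 6572726e 6f206865 6c6c6f62  enceerrno hellob", "hellob"},
	}
	for _, l := range lines {
		if addr, ok := find(l.dump, l.needle); ok {
			fmt.Printf("%q at %#x\n", l.needle, addr)
		}
	}
	// prints:
	// "hello" at 0x494907
	// "hellob" at 0x494aaa
}
```

Both addresses fall inside the .rodata range from the section table (0x47f000 plus 0x34eac bytes). The literals are packed next to unrelated strings because Go strings carry an explicit length instead of a terminating NUL.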

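The package variable fooVar is initialized and lives in .data; the package variable barVar is uninitialized and lives in .bss. In both cases the symbol is the 16-byte (0x10) string header, a pointer plus a length; the string bytes themselves are in .rodata.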

objdump -t hello | grep main.fooVar
000000000051dda0 g     O .data	0000000000000010 main.fooVar

objdump -t hello | grep main.barVar
00000000005250f0 g     O .bss	0000000000000010 main.barVar
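The 0x10 size reported for both symbols is the size of a Go string header on a 64-bit platform, not of the string contents. The sketch below makes the distinction visible at run time. It assumes Go 1.20 or later for unsafe.StringData; headerSize and wordSize are hypothetical helpers, and the printed addresses will only line up with the section table for a non-PIE build of this exact program.

```go
package main

import (
	"fmt"
	"unsafe"
)

var (
	fooVar = "hello"
	barVar string
)

// headerSize is the size of the string variable itself (pointer + length),
// regardless of how long the string is.
func headerSize(s string) uintptr { return unsafe.Sizeof(s) }

// wordSize is the size of one machine word on this platform.
func wordSize() uintptr { return unsafe.Sizeof(uintptr(0)) }

func main() {
	barVar = "hellob"

	fmt.Printf("string header: %d bytes (%d words)\n",
		headerSize(fooVar), headerSize(fooVar)/wordSize())

	// Addresses of the headers: the package variables themselves.
	fmt.Printf("&fooVar: %p\n", &fooVar)
	fmt.Printf("&barVar: %p\n", &barVar)

	// Addresses of the bytes the headers point to: the literals.
	fmt.Printf("fooVar data: %p\n", unsafe.StringData(fooVar))
	fmt.Printf("barVar data: %p\n", unsafe.StringData(barVar))
}
```

Assigning "hellob" to barVar at run time only writes a new pointer and length into the header that sits in .bss; the bytes of the literal stay where the linker placed them.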

References