Tips:以下代码在go1.12.6 windows/amd64版本下测试分析，版本不同在分析源码的时候略有不同。

append

扩展--堆内存分配

问题产生

最近群里有人提了下面一个问题，源码差不多是这样：

func f1()  {
	s := []int{1,2,3,4,5}
	l := len(s)
	copy(s,s[1:])
	s = s[:l-1]
	fmt.Println(s)
}

可以改成下面这样吗？

func f1()  {
	s := []int{1,2,3,4,5}
	l := len(s)
	s = s[1:l] 
	fmt.Println(s)
}

首先，这两种确实都能实现想要的结果，但是源码这样写肯定有它的原因和特定的上下文。现在我们来分析分析，先打印地址看看：

func f1()  {
	s := []int{1,2,3,4,5}
	fmt.Printf("%p %v\n",s,cap(s))
	l := len(s)
	
	// 第一种
	//copy(s,s[1:])
	//s = s[:l-1]
 
	// 第二种
    s = s[1:l]
	
	fmt.Printf("%p %v\n",s,cap(s))
	fmt.Println(s)
}

第一种：
0xc04200a210 5
0xc04200a210 5
[2 3 4 5]
第二种：
0xc042080030 5
0xc042080038 4
[2 3 4 5]

由上打印结果可以看出，第二种s指向的底层地址不一样，而且还是偏移0xc042080038-0xc042080030=8，正好是64位系统的int的内存大小，也就是说s指向的底层数组移动了一位，现在的底层数组是[1,2,3,4,5],s是[2,3,4,5],len=4,cap=4。
源码这样写的好处是，s指向的底层数组[2,3,4,5,5],s是[2,3,4,5],len=4,cap=5。所以源码的方式cap更大，在下次append一个元素时候，直接添加到后面，而第二种就要重新分配底层数组了。

截取

既然说到slice的截取，看一下饶大在深入解密slice中讲解。

先了解一下截取操作，例如：

 data := [...]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
 slice := data[2:4:6] // data[low, high, max]

对 data 使用3个索引值，截取出新的 slice。这里 data 可以是数组或者 slice。low 是最低索引值，这里是闭区间，也就是说第一个元素是 data 位于 low 索引处的元素；而 high 和 max 则是开区间，表示最后一个元素只能是索引 high-1 处的元素，而最大容量则只能是索引 max-1 处的元素。high 和 max 必须在底层数组的容量（cap）范围内，如果超出范围则panic。

不指定max，新slice的cap最大值是：底层数组长度-low。
指定max，新slice的cap最大值是：max-low。

饶大举了雨痕大佬《Go学习笔记》第四版的一个例子：

package main
 
import "fmt"
 
func main() {
    slice := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
    s1 := slice[2:5]
    s2 := s1[2:6:7]
 
    s2 = append(s2, 100)
    s2 = append(s2, 200)
 
    s1[2] = 20
 
    fmt.Println(s1)
    fmt.Println(s2)
    fmt.Println(slice)
}

[2 3 20]
[4 5 6 7 100 200]
[0 1 2 3 20 5 6 7 100 9]

现在，走一遍代码，初始状态如下：

slice := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
s1 := slice[2:5]
s2 := s1[2:6:7]

s1 从 slice 索引2（闭区间）到索引5（开区间，元素真正取到索引4），长度为3，容量默认到数组结尾，为8。
s2 从 s1 的索引2（闭区间）到索引6（开区间，元素真正取到索引5），容量到索引7（开区间，真正到索引6），为5。

slice origin

接着，向 s2 尾部追加一个元素 100：

s2 = append(s2, 100)

s2 容量刚好够，直接追加。不过，这会修改原始数组对应位置的元素。这一改动，数组和 s1 都可以看得到：

append 100

再次向 s2 追加元素200：

s2 = append(s2, 200)

这时，s2 的容量不够用，该扩容了。于是，s2 另起炉灶，将原来的元素复制新的位置，扩大自己的容量。并且为了应对未来可能的 append 带来的再一次扩容，s2 会在此次扩容的时候多留一些 buffer，将新的容量将扩大为原始容量的2倍，也就是10了。

最后，修改 s1 索引为2位置的元素：

s1[2] = 20

这次只会影响原始数组相应位置的元素。它影响不到 s2 了，人家已经远走高飞了。

s1[2]=20

再提一点，打印 s1 的时候，只会打印出 s1 长度以内的元素。所以，只会打印出3个元素，虽然它的底层数组不止3个元素。

上面4张图来自饶大在深入解密slice

传值还是传指针

先看一个例子：

func main() {
	s := []int{1, 1, 1}
	f(s)
	fmt.Println(s)
}
 
func f(s []int) {
	s[0] = 100
	s[1] = 100
	s[2] = 100
}

[100 100 100]

从结果来看，传递[]int，然后在函数内部修改s有影响到外部的s。但是总是会这样吗？再看：

func main() {
	s := []int{1, 1, 1}
	f(s)
	fmt.Println(s)
}
 
func f(s []int)  {
	s = append(s,1)
	fmt.Println(s)
}

[1 1 1 1]
[1 1 1]

可以发现函数内s和外s并不一样。由前面内容来看，函数内的s因为cap不够重新分配一块内存，然后添加1，所以重新分配的内存并不影响外部s，所以导致不一样。

总结，函数传[]int并不能确保外部slice也会改变，那由上面办法让外部slice肯定会改变呢？如下：

func main() {
	s := []int{1, 1, 1}
	//f1(&s)
	s = f2(s)
	fmt.Println(s)
}
 
func f1(s *[]int)  {   //传指针
	*s = append(*s,1)
}
 
func f2(s []int)  []int {  //返回值
	s = append(s,1)
	return s
}

append

append到底会做什么？

package main
 
import "fmt"
 
func main() {
	s := []int{1, 1, 1}
	s = append(s,1)
	fmt.Println(s)
}

通过汇编角度来看append会做什么：

"".main STEXT size=283 args=0x0 locals=0x58
        0x0000 00000 (test.go:5)        TEXT    "".main(SB), ABIInternal, $88-0
        0x0000 00000 (test.go:5)        MOVQ    TLS, CX
        0x0009 00009 (test.go:5)        MOVQ    (CX)(TLS*2), CX
        0x0010 00016 (test.go:5)        CMPQ    SP, 16(CX)
        0x0014 00020 (test.go:5)        JLS     273
        0x001a 00026 (test.go:5)        SUBQ    $88, SP
        0x001e 00030 (test.go:5)        MOVQ    BP, 80(SP)
        0x0023 00035 (test.go:5)        LEAQ    80(SP), BP
        0x0028 00040 (test.go:5)        FUNCDATA        $0, gclocals·69c1753bd5f81501d95132d08af04464(SB)
        0x0028 00040 (test.go:5)        FUNCDATA        $1, gclocals·568470801006e5c0dc3947ea998fe279(SB)
        0x0028 00040 (test.go:5)        FUNCDATA        $3, gclocals·bfec7e55b3f043d1941c093912808913(SB)
        0x0028 00040 (test.go:5)        FUNCDATA        $4, "".main.stkobj(SB)
        0x0028 00040 (test.go:6)        PCDATA  $2, $1
        0x0028 00040 (test.go:6)        PCDATA  $0, $0
        0x0028 00040 (test.go:6)        LEAQ    type.[3]int(SB), AX  //类型是[3]int 放入AX
        0x002f 00047 (test.go:6)        PCDATA  $2, $0              
        0x002f 00047 (test.go:6)        MOVQ    AX, (SP)      //runtime.newobject参数
        0x0033 00051 (test.go:6)        CALL    runtime.newobject(SB)
        0x0038 00056 (test.go:6)        PCDATA  $2, $1
        0x0038 00056 (test.go:6)        MOVQ    8(SP), AX
        0x003d 00061 (test.go:6)        MOVQ    "".statictmp_0(SB), CX
        0x0044 00068 (test.go:6)        MOVQ    CX, (AX)
        0x0047 00071 (test.go:6)        MOVUPS  "".statictmp_0+8(SB), X0
        0x004e 00078 (test.go:6)        MOVUPS  X0, 8(AX)
        0x0052 00082 (test.go:7)        PCDATA  $2, $2
        0x0052 00082 (test.go:7)        LEAQ    type.int(SB), CX
        0x0059 00089 (test.go:7)        PCDATA  $2, $1
        0x0059 00089 (test.go:7)        MOVQ    CX, (SP) //growslice第一个参数 et *_type，从27行看出是int类型参数地址
        0x005d 00093 (test.go:7)        PCDATA  $2, $0
        0x005d 00093 (test.go:7)        MOVQ    AX, 8(SP)  //growslice第二个参数是old slice，slice是一个由底层数组指针、len和cap组成的结构体。 8(SP)表示底层数组指针。
        0x0062 00098 (test.go:7)        MOVQ    $3, 16(SP) //len = 3
        0x006b 00107 (test.go:7)        MOVQ    $3, 24(SP) //cap = 3
        0x0074 00116 (test.go:7)        MOVQ    $4, 32(SP) //growslice第三个参数是 cap int，等于4
        0x007d 00125 (test.go:7)        CALL    runtime.growslice(SB)
        0x0082 00130 (test.go:7)        PCDATA  $2, $1
        0x0082 00130 (test.go:7)        MOVQ    40(SP), AX  //growslice返回值slice。分别是底层数组指针、len和cap。 AX是底层数组指针
        0x0087 00135 (test.go:7)        MOVQ    48(SP), CX  //CX 是len = 3(返回的len是old.len通过下面growslice函数返回可知)
        0x008c 00140 (test.go:7)        MOVQ    56(SP), DX  //DX 是cap = 6
        0x0091 00145 (test.go:7)        MOVQ    $1, 24(AX)  //AX偏移24个字节，正好是新的slice[3]得地址，所以slice[3] = 1
        0x0099 00153 (test.go:8)        PCDATA  $2, $0
        0x0099 00153 (test.go:8)        MOVQ    AX, (SP)   //convTslice函数将slice的底层struct转成eface，因为fmt.Println参数是interface{}。convTslice参数是[]byte, 第一个8字节是 底层数组指针，第二个8字节是len，第三个8字节是cap
        0x009d 00157 (test.go:7)        LEAQ    1(CX), AX   
        0x00a1 00161 (test.go:8)        MOVQ    AX, 8(SP)  //AX= CX+1=4
        0x00a6 00166 (test.go:8)        MOVQ    DX, 16(SP) //第三个8字节是 DX = 6
        0x00ab 00171 (test.go:8)        CALL    runtime.convTslice(SB) 
        0x00b0 00176 (test.go:8)        PCDATA  $2, $1
        0x00b0 00176 (test.go:8)        MOVQ    24(SP), AX //convTslice返回值
        0x00b5 00181 (test.go:8)        PCDATA  $0, $1
        0x00b5 00181 (test.go:8)        XORPS   X0, X0
        0x00b8 00184 (test.go:8)        MOVUPS  X0, ""..autotmp_12+64(SP)
        0x00bd 00189 (test.go:8)        PCDATA  $2, $2
        0x00bd 00189 (test.go:8)        LEAQ    type.[]int(SB), CX
        0x00c4 00196 (test.go:8)        PCDATA  $2, $1
        0x00c4 00196 (test.go:8)        MOVQ    CX, ""..autotmp_12+64(SP) //临时栈变量
        0x00c9 00201 (test.go:8)        PCDATA  $2, $0
        0x00c9 00201 (test.go:8)        MOVQ    AX, ""..autotmp_12+72(SP)
        0x00ce 00206 (test.go:8)        XCHGL   AX, AX
        0x00cf 00207 ($GOROOT\src\fmt\print.go:275)     PCDATA  $2, $1
        0x00cf 00207 ($GOROOT\src\fmt\print.go:275)     MOVQ    os.Stdout(SB), AX
        0x00d6 00214 ($GOROOT\src\fmt\print.go:275)     PCDATA  $2, $2
        0x00d6 00214 ($GOROOT\src\fmt\print.go:275)     LEAQ    go.itab.*os.File,io.Writer(SB), CX
        0x00dd 00221 ($GOROOT\src\fmt\print.go:275)     PCDATA  $2, $1
        0x00dd 00221 ($GOROOT\src\fmt\print.go:275)     MOVQ    CX, (SP)
        0x00e1 00225 ($GOROOT\src\fmt\print.go:275)     PCDATA  $2, $0
        0x00e1 00225 ($GOROOT\src\fmt\print.go:275)     MOVQ    AX, 8(SP)
        0x00e6 00230 ($GOROOT\src\fmt\print.go:275)     PCDATA  $2, $1
        0x00e6 00230 ($GOROOT\src\fmt\print.go:275)     PCDATA  $0, $0
        0x00e6 00230 ($GOROOT\src\fmt\print.go:275)     LEAQ    ""..autotmp_12+64(SP), AX
        0x00eb 00235 ($GOROOT\src\fmt\print.go:275)     PCDATA  $2, $0
        0x00eb 00235 ($GOROOT\src\fmt\print.go:275)     MOVQ    AX, 16(SP)
        0x00f0 00240 ($GOROOT\src\fmt\print.go:275)     MOVQ    $1, 24(SP)
        0x00f9 00249 ($GOROOT\src\fmt\print.go:275)     MOVQ    $1, 32(SP)
        0x0102 00258 ($GOROOT\src\fmt\print.go:275)     CALL    fmt.Fprintln(SB)
        0x0107 00263 (<unknown line number>)    MOVQ    80(SP), BP
        0x010c 00268 (<unknown line number>)    ADDQ    $88, SP
        0x0110 00272 (<unknown line number>)    RET
        0x0111 00273 (<unknown line number>)    NOP
        0x0111 00273 (test.go:5)        PCDATA  $0, $-1
        0x0111 00273 (test.go:5)        PCDATA  $2, $-1
        0x0111 00273 (test.go:5)        CALL    runtime.morestack_noctxt(SB)
        0x0116 00278 (test.go:5)        JMP     0

说明：

newobject(SB) 函数创建对象，在Go/runtime/malloc.go
convTslice旧版本是convT2Eslice函数，在Go/runtime/iface.go
slice底层数据结构：

type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}

从汇编来看：

0x007d 00125 (test.go:7)        CALL    runtime.growslice(SB)

源码第7行CALL(调用)函数runtime.growslice，该函数源码在Go/runtime/slice.go，如下(删减版+注解)：

//et *_type：slice元素类型
//old slice:旧的slice
//cat int:新slice需要的最小cap
func growslice(et *_type, old slice, cap int) slice {
	//...
 
	newcap := old.cap
	doublecap := newcap + newcap
	if cap > doublecap {
		newcap = cap
	} else {
		if old.len < 1024 { //当原 slice 容量小于 1024 的时候，新 slice 容量变成原来的 2 倍
			newcap = doublecap
		} else {  //原 slice 容量超过 1024，新 slice 容量变成原来的1.25倍。
			for 0 < newcap && newcap < cap {
				newcap += newcap / 4
			}
 
			if newcap <= 0 {
				newcap = cap
			}
		}
	}
 
    //字节对齐相关，主要是roundupsize函数
	var overflow bool
	var lenmem, newlenmem, capmem uintptr
	switch {
	case et.size == 1:
		lenmem = uintptr(old.len)
		newlenmem = uintptr(cap)
		capmem = roundupsize(uintptr(newcap))
		overflow = uintptr(newcap) > maxAlloc
		newcap = int(capmem)
	case et.size == sys.PtrSize:
		lenmem = uintptr(old.len) * sys.PtrSize
		newlenmem = uintptr(cap) * sys.PtrSize
		capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
		overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
		newcap = int(capmem / sys.PtrSize)
	case isPowerOfTwo(et.size):
		var shift uintptr
		if sys.PtrSize == 8 {
			// Mask shift for better code generation.
			shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
		} else {
			shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
		}
		lenmem = uintptr(old.len) << shift
		newlenmem = uintptr(cap) << shift
		capmem = roundupsize(uintptr(newcap) << shift)
		overflow = uintptr(newcap) > (maxAlloc >> shift)
		newcap = int(capmem >> shift)
	default:
		lenmem = uintptr(old.len) * et.size
		newlenmem = uintptr(cap) * et.size
		capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
		capmem = roundupsize(capmem)
		newcap = int(capmem / et.size)
	}
 
    //...
 
	if overflow || capmem > maxAlloc {
		panic(errorString("growslice: cap out of range"))
	}
 
	var p unsafe.Pointer
	if et.kind&kindNoPointers != 0 {
		p = mallocgc(capmem, nil, false)  //不包含指针的对象分配内存
		memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem) //先把P指针从p+0移到p+newlenmem，然后把p+newlenmem后面的内存清零
	} else {
		p = mallocgc(capmem, et, true)  //包含指针的对象分配内存
		if writeBarrier.enabled {
			bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem)
		}
	}
	memmove(p, old.array, lenmem)  //将old.array复制lenmem字节到P
 
	return slice{p, old.len, newcap}
}

上面提到roundupsize函数进行内存对齐，根据go内存分配知识，go小内存（67种规格大小的内存）和大内存（大于32kb）。class_to_size通过 spanClass获取 span划分的 object大小。而 size_to_class8 表示通过 size（即newcap*ptrSize）获取它的 spanClass，详细参考详解大彬的Go内存分配。

总之，从growslice函数看出append扩容规则：扩容策略并不是简单的扩为原切片容量的 2 倍或 1.25 倍，还有内存对齐的操作。扩容后的容量 >= 原容量的 2 倍或 1.25 倍。

再看一个例子：

func main() {
	s := []int{1,2}
	s = append(s,4,5,6)
	fmt.Printf("len=%d, cap=%d",len(s),cap(s))
}

len=5, cap=6

要是以前的认知肯定认为cap=8，但是这里确实cap=6？我们再看看growslice源码：中 s 原来只有 2 个元素，len 和 cap 都为 2，append 了三个元素后，长度变为 3，容量最小要变成 5，即调用 growslice 函数时，传入的第三个参数应该为 5。即 cap=5。而一方面，doublecap 是原 slice容量的 2 倍，等于 4。满足第一个 if 条件，所以 newcap 变成了 5。接着调用了 roundupsize 函数，传入 40。（代码中ptrSize是指一个指针的大小，在64位机上是8）。所以(size+smallSizeDiv-1)/smallSizeDiv = 5；获取 size_to_class8 数组中索引为 5 的元素为 4；获取 class_to_size 中索引为 4 的元素为 48。最终，newcap = int(capmem / ptrSize) newcap == 6。

扩展--堆内存分配

上面竟然说到Go语言堆内存分配，关注一下mallocgc分配堆内存的函数，Go语言堆内存大小分为两种：

mallocgc源码(删减版+注解)：

func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
	//...
 
	// Set mp.mallocing to keep from being preempted by GC.
	mp := acquirem()
	if mp.mallocing != 0 {
		throw("malloc deadlock")
	}
	if mp.gsignal == getg() {
		throw("malloc during signal")
	}
	mp.mallocing = 1
 
	shouldhelpgc := false
	dataSize := size
	c := gomcache()   //当前运行goroutine的P上的mcache
	var x unsafe.Pointer
	noscan := typ == nil || typ.kind&kindNoPointers != 0 // noscan == true:不包含指针对象无需GC扫描 
	if size <= maxSmallSize {
		if noscan && size < maxTinySize { //0~16B
			// Tiny allocator.
			//
			// Tiny allocator combines several tiny allocation requests
			// into a single memory block. The resulting memory block
			// is freed when all subobjects are unreachable. The subobjects
			// must be noscan (don't have pointers), this ensures that
			// the amount of potentially wasted memory is bounded.
			//
			// Size of the memory block used for combining (maxTinySize) is tunable.
			// Current setting is 16 bytes, which relates to 2x worst case memory
			// wastage (when all but one subobjects are unreachable).
			// 8 bytes would result in no wastage at all, but provides less
			// opportunities for combining.
			// 32 bytes provides more opportunities for combining,
			// but can lead to 4x worst case wastage.
			// The best case winning is 8x regardless of block size.
			//
			// Objects obtained from tiny allocator must not be freed explicitly.
			// So when an object will be freed explicitly, we ensure that
			// its size >= maxTinySize.
			//
			// SetFinalizer has a special case for objects potentially coming
			// from tiny allocator, it such case it allows to set finalizers
			// for an inner byte of a memory block.
			//
			// The main targets of tiny allocator are small strings and
			// standalone escaping variables. On a json benchmark
			// the allocator reduces number of allocations by ~12% and
			// reduces heap size by ~20%.
			off := c.tinyoffset
			// Align tiny pointer for required (conservative) alignment.
			if size&7 == 0 {
				off = round(off, 8)
			} else if size&3 == 0 {
				off = round(off, 4)
			} else if size&1 == 0 {
				off = round(off, 2)
			}
			if off+size <= maxTinySize && c.tiny != 0 { 
				// The object fits into existing tiny block.
                                //如果tiny block还能放下size大小内存
				x = unsafe.Pointer(c.tiny + off)
				c.tinyoffset = off + size
				c.local_tinyallocs++
				mp.mallocing = 0
				releasem(mp)
				return x
			}
			// Allocate a new maxTinySize block.
                        /*创建一个新的tiny block内存
                        通过源码知道tinySpanClass == 5 且 noscan = true 所以通过
                        makeSpanClass函数知道，span class是2即对应class_to_size[2]=16
                        正好是tiny小对象的最大内存。*/
			span := c.alloc[tinySpanClass]
			v := nextFreeFast(span) //从本地mcache上获取
			if v == 0 {
				v, _, shouldhelpgc = c.nextFree(tinySpanClass) //从公共mcentral上获取
			}
			x = unsafe.Pointer(v)
			(*[2]uint64)(x)[0] = 0  //初始化0-7字节内存
			(*[2]uint64)(x)[1] = 0  //初始化8-15字节内存
			// See if we need to replace the existing tiny block with the new one
			// based on amount of remaining free space.
			if size < c.tinyoffset || c.tiny == 0 {
				c.tiny = uintptr(x)
				c.tinyoffset = size
			}
			size = maxTinySize
		} else { //17B~32KB
			var sizeclass uint8
			if size <= smallSizeMax-8 {
				sizeclass = size_to_class8[(size+smallSizeDiv-1)/smallSizeDiv]
			} else {
				sizeclass = size_to_class128[(size-smallSizeMax+largeSizeDiv-1)/largeSizeDiv]
			}
			size = uintptr(class_to_size[sizeclass])
			spc := makeSpanClass(sizeclass, noscan)
			span := c.alloc[spc]
			v := nextFreeFast(span)
			if v == 0 {
				v, span, shouldhelpgc = c.nextFree(spc)
			}
			x = unsafe.Pointer(v)
			if needzero && span.needzero != 0 {
				memclrNoHeapPointers(unsafe.Pointer(v), size)
			}
		}
	} else { // >32KB
		var s *mspan
		shouldhelpgc = true
		systemstack(func() {       //系统调用，在系统thread上执行大内存分配函数largeAlloc
			s = largeAlloc(size, needzero, noscan)
		})
		s.freeindex = 1
		s.allocCount = 1
		x = unsafe.Pointer(s.base())
		size = s.elemsize
	}
 
        //含指针对象，设置GC bitmap
	var scanSize uintptr
	if !noscan {
		// If allocating a defer+arg block, now that we've picked a malloc size
		// large enough to hold everything, cut the "asked for" size down to
		// just the defer header, so that the GC bitmap will record the arg block
		// as containing nothing at all (as if it were unused space at the end of
		// a malloc block caused by size rounding).
		// The defer arg areas are scanned as part of scanstack.
		if typ == deferType {
			dataSize = unsafe.Sizeof(_defer{})
		}
		heapBitsSetType(uintptr(x), size, dataSize, typ)
		if dataSize > typ.size {
			// Array allocation. If there are any
			// pointers, GC has to scan to the last
			// element.
			if typ.ptrdata != 0 {
				scanSize = dataSize - typ.size + typ.ptrdata
			}
		} else {
			scanSize = typ.ptrdata
		}
		c.local_scan += scanSize
	}
 
	// Ensure that the stores above that initialize x to
	// type-safe memory and set the heap bits occur before
	// the caller can make x observable to the garbage
	// collector. Otherwise, on weakly ordered machines,
	// the garbage collector could follow a pointer to x,
	// but see uninitialized memory or stale heap bits.
	publicationBarrier()
 
	// Allocate black during GC.
	// All slots hold nil so no scanning is needed.
	// This may be racing with GC so do it atomically if there can be
	// a race marking the bit.
	if gcphase != _GCoff {
		gcmarknewobject(uintptr(x), size, scanSize)
	}
 
	if raceenabled {
		racemalloc(x, size)
	}
 
	if msanenabled {
		msanmalloc(x, size)
	}
 
	mp.mallocing = 0
	releasem(mp)
 
	//...
 
	return x
}

有几个小知识点：

系统调用：即OS Thread call，OS thread调用函数。参数一般使用函数定义的字面量，这样函数的参数和返回值可以直接使用上下文的参数。比如上面函数systemstack(func(){ s = largeAlloc(size, needzero, noscan)})
对于不含指针的对象，GC的时候不需要再scan，直接GC对象；对于包含指针的对象，GC的时候需要scan，看看哪些地方都引用该对象，然后对其都要GC。

参考地址：

Go语言学习笔记

Go内存分配

深度解密Go语言之Slice

原文链接：https://blog.csdn.net/qq_31930499/article/details/100632706