A summary of how panic and recover are implemented under the hood

Kesa... about 3 min read

1. Data structures

In the Go runtime source, an active call to the built-in function panic is represented by the data structure runtime._panic:

// A _panic holds information about an active panic.
//
// A _panic value must only ever live on the stack.
//
// The argp and link fields are stack pointers, but don't need special
// handling during stack growth: because they are pointer-typed and
// _panic values only live on the stack, regular stack pointer
// adjustment takes care of them.
type _panic struct {
	argp      unsafe.Pointer // pointer to arguments of deferred call run during panic; cannot move - known to liblink
	arg       any            // argument to panic
	link      *_panic        // link to earlier panic
	pc        uintptr        // where to return to in runtime if this panic is bypassed
	sp        unsafe.Pointer // where to return to in runtime if this panic is bypassed
	recovered bool           // whether this panic is over
	aborted   bool           // the panic was aborted
	goexit    bool
}

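argp: pointer to the arguments of the deferred call that runs during the panic
arg: the argument passed to panic
link: pointer to the earlier runtime._panic structure
recovered: whether the current panic has been recovered
aborted: whether the current panic was forcibly aborted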

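panic can be called several times in a row (for example, from a deferred function that runs while an earlier panic is still unwinding); the resulting _panic values are chained into a linked list through link.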
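
The chaining can be observed from ordinary user code. The following is a minimal sketch, not runtime code: the function name nested is made up for illustration. A deferred function raises a second panic while the first is still in progress, and the recover that runs afterwards sees the newest panic value.

```go
package main

import "fmt"

// nested panics twice: the second panic is raised by a deferred function
// while the first one is still unwinding.
func nested() (got any) {
	// Registered first, so it runs last and sees the newest panic.
	defer func() { got = recover() }()
	// Registered second, so it runs first and starts a second panic.
	defer func() { panic("second") }()
	panic("first")
}

func main() {
	fmt.Println(nested()) // prints "second"
}
```

The first panic is the one reached through link from the second; once the second is recovered, the earlier one has been marked aborted and is skipped.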

2. Triggering a panic

The compiler converts calls to the built-in panic into runtime.gopanic. In the runtime version quoted here, that function performs the following steps:

Create a new runtime._panic and push it onto the front of the current goroutine's _panic list;
In a loop, repeatedly take the next runtime._defer from the current goroutine's _defer list and call runtime.reflectcall to run the deferred function;
Call runtime.fatalpanic to abort the whole program.

func gopanic(e interface{}) {
	gp := getg()
	...
	var p _panic
	p.arg = e
	p.link = gp._panic
	gp._panic = (*_panic)(noescape(unsafe.Pointer(&p)))

	for {
		d := gp._defer
		if d == nil {
			break
		}

		d._panic = (*_panic)(noescape(unsafe.Pointer(&p)))

		reflectcall(nil, unsafe.Pointer(d.fn), deferArgs(d), uint32(d.siz), uint32(d.siz))

		d._panic = nil
		d.fn = nil
		gp._defer = d.link

		freedefer(d)
		if p.recovered {
			...
		}
	}

	fatalpanic(gp._panic)
	*(*int)(nil) = 0
}

func fatalpanic(msgs *_panic) {
	pc := getcallerpc()
	sp := getcallersp()
	gp := getg()

	if startpanic_m() && msgs != nil {
		atomic.Xadd(&runningPanicDefers, -1)
		printpanics(msgs)
	}
	if dopanic_m(gp, pc, sp) {
		crash()
	}

	exit(2)
}
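
The defer-draining loop in gopanic is visible from user code as well. This is a small sketch under the assumption that the panic is recovered by the outermost deferred function so the program does not exit; the name drain and the labels d1, d2, d3 are illustrative only.

```go
package main

import "fmt"

// drain panics after registering three deferred functions and reports
// the order in which they ran.
func drain() (order []string) {
	defer func() {
		_ = recover() // stop the panic so the program does not exit
		order = append(order, "d1")
	}()
	defer func() { order = append(order, "d2") }()
	defer func() { order = append(order, "d3") }()
	panic("boom")
}

func main() {
	fmt.Println(drain()) // prints "[d3 d2 d1]"
}
```

The deferred functions run newest first, which matches the loop taking gp._defer from the head of the list on every iteration.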

3. Performing recovery

The compiler converts calls to the built-in recover into runtime.gorecover:

func gorecover(argp uintptr) interface{} {
	gp := getg()
	p := gp._panic
	if p != nil && !p.recovered && argp == uintptr(p.argp) {
		p.recovered = true
		return p.arg
	}
	return nil
}

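If the current goroutine has no panic in progress, gp._panic is nil and the function returns nil immediately. Deferred functions are the only user code that runs while a panic is in progress, which is why recover has no effect when called outside a deferred function. The argp comparison additionally requires that the call come from the deferred function itself rather than from a function it calls.

gorecover only sets the recovered field of runtime._panic; it contains no logic for resuming the program. Resuming the program is the job of runtime.gopanic: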
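
Both nil cases can be checked from user code. The sketch below is illustrative only (helper and indirect are made-up names): recover returns nil when no panic is active, and also when it is called one frame below the deferred function.

```go
package main

import "fmt"

// helper calls recover, but it is not itself the deferred function.
func helper() any { return recover() }

// indirect reports what helper's recover returned and whether the direct
// recover in the outer deferred function caught the panic.
func indirect() (viaHelper any, caught bool) {
	defer func() {
		// Called directly by a deferred function: this one works.
		caught = recover() != nil
	}()
	defer func() {
		// One frame too deep: recover inside helper returns nil.
		viaHelper = helper()
	}()
	panic("boom")
}

func main() {
	fmt.Println(recover() == nil) // no active panic, so recover returns nil
	fmt.Println(indirect())
}
```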

func gopanic(e interface{}) {
	...

	for {
		// Run the deferred function, which may set p.recovered = true
		...

		pc := d.pc
		sp := unsafe.Pointer(d.sp)

		...
		if p.recovered {
			gp._panic = p.link
			for gp._panic != nil && gp._panic.aborted {
				gp._panic = gp._panic.link
			}
			if gp._panic == nil {
				gp.sig = 0
			}
			gp.sigcode0 = uintptr(sp)
			gp.sigcode1 = pc
			mcall(recovery)
			throw("recovery failed")
		}
	}
	...
}

gopanic takes the program counter pc and stack pointer sp from runtime._defer and calls runtime.recovery, which triggers scheduling of the goroutine. Before scheduling, it prepares sp, pc, and the function's return value:

func recovery(gp *g) {
	sp := gp.sigcode0
	pc := gp.sigcode1

	gp.sched.sp = sp
	gp.sched.pc = pc
	gp.sched.lr = 0
	gp.sched.ret = 1
	gogo(&gp.sched)
}

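When the defer statement was executed, the stack pointer sp and program counter pc at that call site were stored in the runtime._defer structure, so runtime.gogo here jumps back to the point where the defer statement was executed.

During this jump, runtime.recovery sets the function's return value to 1. When runtime.deferproc appears to return 1, the compiler-generated code jumps directly to the point just before the calling function returns and runs runtime.deferreturn: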

func deferproc(siz int32, fn *funcval) {
	...
	return0()
}

Once control reaches runtime.deferreturn, the program has recovered from the panic and continues with normal execution, and runtime.gorecover has already taken the arg passed to panic out of the runtime._panic structure and returned it to its caller.
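
Seen from user code, this means a recovered function does not resume after the panicking statement; it returns to its caller, and the deferred function can still set named results. A minimal sketch, with safeDiv as a hypothetical example function:

```go
package main

import "fmt"

// safeDiv turns a runtime panic (integer division by zero) into an error.
func safeDiv(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			// Runs during the panic; after it returns, safeDiv returns
			// to its caller with these named results.
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	q = a / b // panics when b == 0, so the line below is skipped
	return q, nil
}

func main() {
	fmt.Println(safeDiv(6, 3))
	fmt.Println(safeDiv(1, 0))
}
```

In the second call q keeps its zero value because the assignment never completes, and err carries the value that recover returned.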
