Summary of the underlying implementation of panic and recover
1. Data structures
In the Go runtime source code, an active panic is represented by the runtime._panic data structure:
// A _panic holds information about an active panic.
//
// A _panic value must only ever live on the stack.
//
// The argp and link fields are stack pointers, but don't need special
// handling during stack growth: because they are pointer-typed and
// _panic values only live on the stack, regular stack pointer
// adjustment takes care of them.
type _panic struct {
	argp      unsafe.Pointer // pointer to arguments of deferred call run during panic; cannot move - known to liblink
	arg       any            // argument to panic
	link      *_panic        // link to earlier panic
	pc        uintptr        // where to return to in runtime if this panic is bypassed
	sp        unsafe.Pointer // where to return to in runtime if this panic is bypassed
	recovered bool           // whether this panic is over
	aborted   bool           // the panic was aborted
	goexit    bool
}
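- argp: a pointer to the arguments of the deferred call that is run during the panic;
- arg: the argument that was passed to panic;
- link: a pointer to the earlier runtime._panic structure;
- recovered: whether the current panic has been recovered;
- aborted: whether the current panic has been forcibly aborted.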
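panic can be called several times in succession, for example when a deferred function panics while an earlier panic is still running its deferred calls. The resulting runtime._panic values are chained into a linked list through link.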
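The chaining is visible from user code when a deferred function panics while another panic is still in flight. The following is a minimal sketch at the language level, not runtime code; nestedPanic is a name made up for this illustration.

```go
package main

import "fmt"

// nestedPanic (a hypothetical helper) raises a first panic; while its
// deferred calls are running, one of them raises a second panic.
// The remaining deferred function recovers and reports what it saw.
func nestedPanic() (recovered interface{}) {
	defer func() {
		// Registered first, so it runs last. recover returns the
		// value of the most recent panic on the goroutine.
		recovered = recover()
	}()
	defer func() {
		// Registered last, so it runs first, while "first" is active.
		panic("second")
	}()
	panic("first")
}

func main() {
	fmt.Println(nestedPanic())
}
```

recover hands back the value of the newest panic, which sits at the front of the list; the earlier panic is reachable from it only through link.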
2. Triggering a panic
The compiler turns a call to the built-in panic function into a call to runtime.gopanic, which proceeds in the following steps:
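- Create a new runtime._panic and add it to the front of the current Goroutine's _panic list;
- In a loop, keep taking a runtime._defer from the current Goroutine's _defer list and call runtime.reflectcall to run the deferred function;
- Call runtime.fatalpanic to abort the whole program.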
func gopanic(e interface{}) {
	gp := getg()
	...
	var p _panic
	p.arg = e
	p.link = gp._panic
	gp._panic = (*_panic)(noescape(unsafe.Pointer(&p)))
	for {
		d := gp._defer
		if d == nil {
			break
		}
		d._panic = (*_panic)(noescape(unsafe.Pointer(&p)))
		reflectcall(nil, unsafe.Pointer(d.fn), deferArgs(d), uint32(d.siz), uint32(d.siz))
		d._panic = nil
		d.fn = nil
		gp._defer = d.link
		freedefer(d)
		if p.recovered {
			...
		}
	}
	fatalpanic(gp._panic)
	*(*int)(nil) = 0
}
func fatalpanic(msgs *_panic) {
	pc := getcallerpc()
	sp := getcallersp()
	gp := getg()
	if startpanic_m() && msgs != nil {
		atomic.Xadd(&runningPanicDefers, -1)
		printpanics(msgs)
	}
	if dopanic_m(gp, pc, sp) {
		crash()
	}
	exit(2)
}
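The three steps of runtime.gopanic can be modeled in ordinary Go. The sketch below is a toy under loud assumptions: toyPanic, toyDefer, toyG, toyGopanic and runToy are hypothetical names invented here, a deferred function "recovers" by setting a flag directly, and the fatal path is reduced to a boolean result. It only illustrates the list handling, not the runtime's real behavior.

```go
package main

import "fmt"

// toyPanic and toyDefer are hypothetical, heavily simplified stand-ins
// for runtime._panic and runtime._defer; they are not the real types.
type toyPanic struct {
	arg       interface{}
	link      *toyPanic
	recovered bool
}

type toyDefer struct {
	fn   func(p *toyPanic)
	link *toyDefer
}

// toyG models only the two per-goroutine lists that gopanic touches.
type toyG struct {
	panics *toyPanic
	defers *toyDefer
}

// toyGopanic pushes a panic record, drains the defer list, and reports
// whether any deferred function marked the panic as recovered.
func toyGopanic(gp *toyG, arg interface{}) (recovered bool) {
	p := &toyPanic{arg: arg, link: gp.panics}
	gp.panics = p // step 1: the new panic goes to the front of the list

	for gp.defers != nil { // step 2: run deferred calls, newest first
		d := gp.defers
		d.fn(p)
		gp.defers = d.link
		if p.recovered {
			gp.panics = p.link // drop the recovered panic
			return true
		}
	}
	return false // step 3: the real runtime would call fatalpanic here
}

// runToy registers two deferred functions and returns the order in
// which they ran together with the recovery result.
func runToy(recoverInOuter bool) (order []string, recovered bool) {
	gp := &toyG{}
	push := func(name string, doRecover bool) {
		gp.defers = &toyDefer{
			fn: func(p *toyPanic) {
				order = append(order, name)
				if doRecover {
					p.recovered = true
				}
			},
			link: gp.defers,
		}
	}
	push("outer", recoverInOuter) // registered first, runs last
	push("inner", false)          // registered last, runs first
	recovered = toyGopanic(gp, "boom")
	return order, recovered
}

func main() {
	fmt.Println(runToy(true))
	fmt.Println(runToy(false))
}
```

In both runs the deferred functions execute newest first; the only difference is whether the loop ends through the recovered branch or falls through to the point where the real runtime aborts the program.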
3. Performing recovery
The compiler turns a call to the built-in recover function into a call to runtime.gorecover:
func gorecover(argp uintptr) interface{} {
	gp := getg()
	p := gp._panic
	if p != nil && !p.recovered && argp == uintptr(p.argp) {
		p.recovered = true
		return p.arg
	}
	return nil
}
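If the current Goroutine has not called panic, gp._panic is nil and the function returns nil immediately, which is why recover has no effect when it is called outside a deferred function. The argp comparison adds a second condition: recover must be called directly by the deferred function, not by a function that the deferred function calls.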
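These conditions can be observed from user code. The sketch below is only an illustration at the language level; helper and tryRecover are hypothetical names, and the example relies on nothing beyond the documented behavior of recover.

```go
package main

import "fmt"

// helper calls recover one frame below the deferred function.
func helper() interface{} { return recover() }

// tryRecover (a hypothetical helper) calls recover in three places and
// returns what each call produced.
func tryRecover() (direct, indirect, outside interface{}) {
	// No panic is in progress yet, so this returns nil.
	outside = recover()

	defer func() {
		// Called directly by the deferred function: stops the panic.
		direct = recover()
	}()
	defer func() {
		// Runs first (LIFO). recover is called by helper, not by the
		// deferred function itself, so it returns nil and the panic
		// keeps propagating to the next deferred call.
		indirect = helper()
	}()
	panic("boom")
}

func main() {
	direct, indirect, outside := tryRecover()
	fmt.Println(direct, indirect, outside)
}
```

Only the direct call obtains the panic value; the other two calls return nil.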
runtime.gorecover only sets the recovered field of runtime._panic. It contains no logic for resuming the program; resumption is handled by runtime.gopanic:
func gopanic(e interface{}) {
	...
	for {
		// Run the deferred function, which may set p.recovered = true.
		...
		pc := d.pc
		sp := unsafe.Pointer(d.sp)
		...
		if p.recovered {
			gp._panic = p.link
			for gp._panic != nil && gp._panic.aborted {
				gp._panic = gp._panic.link
			}
			if gp._panic == nil {
				gp.sig = 0
			}
			gp.sigcode0 = uintptr(sp)
			gp.sigcode1 = pc
			mcall(recovery)
			throw("recovery failed")
		}
	}
	...
}
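runtime.gopanic takes the program counter pc and the stack pointer sp from the runtime._defer and calls runtime.recovery, which reschedules the Goroutine. Before doing so, runtime.recovery prepares sp, pc and the function's return value: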
func recovery(gp *g) {
	sp := gp.sigcode0
	pc := gp.sigcode1
	gp.sched.sp = sp
	gp.sched.pc = pc
	gp.sched.lr = 0
	gp.sched.ret = 1
	gogo(&gp.sched)
}
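When a defer statement executes, the stack pointer sp and the program counter pc at that point are stored in the runtime._defer structure, so runtime.gogo here jumps back to the location where the defer statement was executed.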
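During this rescheduling, runtime.recovery sets the return value to 1, so the jump looks like a return from runtime.deferproc with the value 1. When runtime.deferproc returns 1, the compiler-generated code jumps straight to the point just before the calling function returns and runs runtime.deferreturn: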
func deferproc(siz int32, fn *funcval) {
	...
	return0()
}
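Once execution reaches runtime.deferreturn, the program has recovered from the panic and continues with its normal logic. runtime.gorecover, for its part, has already taken the arg that was passed to panic out of the runtime._panic structure and returned it to its caller.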
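Seen from user code, the net effect is that the function containing the recovering defer returns to its caller as if it had returned normally. The sketch below illustrates this with a hypothetical safeDiv function that converts a runtime panic into an error through a named result; it demonstrates the observable behavior only, not the runtime internals.

```go
package main

import "fmt"

// safeDiv (a hypothetical helper) divides a by b. A division by zero
// raises a runtime panic; the deferred function recovers it and
// reports it through the named result err.
func safeDiv(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			// r is the value that was passed to panic.
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	q = a / b // panics when b == 0; the lines below are then skipped
	return q, nil
}

func main() {
	fmt.Println(safeDiv(6, 3))
	fmt.Println(safeDiv(1, 0))
}
```

After the recovery, safeDiv does not resume at the statement that panicked. It returns to its caller with whatever the named results hold once the deferred functions have run.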