DEBUGGING GO PROGRAM WITH GDB #2

I fixed find_goroutine(goid) using yesterday’s Allg class, and I can finally debug goroutines. Thank you, Don!
I uploaded the entire runtime-gdb.py to my GitHub repo at https://github.com/sokoide/go-gdb.

(gdb) info goroutine
  1 waiting  fname=runtime.gopark faddr=0x13415 &g=0xc208000120 waitreason="sleep"
  2 waiting  fname=runtime.gopark faddr=0x13415 &g=0xc208000480 waitreason="force gc (idle)"
  3 waiting  fname=runtime.gopark faddr=0x13415 &g=0xc2080005a0 waitreason="GC sweep wait"
  4 waiting  fname=runtime.gopark faddr=0x13415 &g=0xc2080006c0 waitreason="finalizer wait"
  5 syscall  fname=runtime.switchtoM faddr=0x366c0 &g=0xc2080007e0

(gdb) goroutine 1 bt
#0  runtime.gopark (unlockf=0x2eba0 <runtime.parkunlock_c>, lock=0x1576e0 <runtime.timers>, reason="sleep")
    at /Users/sokoide/repo/go/src/runtime/proc.go:131
#1  0x0000000000013488 in runtime.goparkunlock (lock=0x1576e0 <runtime.timers>, reason="sleep") at /Users/sokoide/repo/go/src/runtime/proc.go:136
#2  0x0000000000017de5 in runtime.timeSleep (ns=2000000000) at /Users/sokoide/repo/go/src/runtime/time.go:58
#3  0x00000000000020f8 in main.f2 (a=40, b=2, ~r2=833358237544) at /Users/sokoide/workspace/go/foo/foo.go:24
#4  0x0000000000002551 in main.main () at /Users/sokoide/workspace/go/foo/main.go:22

(gdb) goroutine 4 bt
#0  runtime.gopark (unlockf=0x2eba0 <runtime.parkunlock_c>, lock=0x15e840 <runtime.finlock>, reason="finalizer wait")
    at /Users/sokoide/repo/go/src/runtime/proc.go:131
#1  0x0000000000013488 in runtime.goparkunlock (lock=0x15e840 <runtime.finlock>, reason="finalizer wait")
    at /Users/sokoide/repo/go/src/runtime/proc.go:136
#2  0x000000000000e66a in runtime.runfinq () at /Users/sokoide/repo/go/src/runtime/malloc.go:727
#3  0x00000000000388f1 in runtime.goexit () at /Users/sokoide/repo/go/src/runtime/asm_amd64.s:2232
#4  0x0000000000000000 in ?? ()

Here is the updated find_goroutine(goid).

def find_goroutine(goid):
	"""
	find_goroutine attempts to find the goroutine identified by goid.
	It returns a tuple holding the program counter and stack pointer
	of the goroutine (as ints), or (None, None) if it is not found.

	@param int goid

	@return tuple (int, int)
	"""
	__allg = Allg()
	while True:
		ptr = __allg.fetch()
		if not ptr:
			break
		if goid == __allg.Goid(ptr):
			pc = __allg.Pc(ptr)
			sp = __allg.Sp(ptr)
			return pc, sp
	return None, None
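
To see the lookup loop in isolation, here is a minimal sketch that runs outside gdb. FakeAllg is a hypothetical stand-in for the Allg class: it serves records from a Python dict instead of reading process memory. The &g and pc values come from the listing above, but the sp values are invented for illustration. Only the loop itself mirrors find_goroutine.

```python
class FakeAllg:
    """Hypothetical stand-in for Allg: serves g records from a dict, not from gdb."""

    def __init__(self, records):
        # records maps a fake g address to its (goid, pc, sp) triple
        self._records = records
        self._addrs = list(records)
        self._position = 0

    def fetch(self):
        # Return the next g address, or None once every record has been served
        if self._position >= len(self._addrs):
            return None
        ptr = self._addrs[self._position]
        self._position += 1
        return ptr

    def Goid(self, ptr):
        return self._records[ptr][0]

    def Pc(self, ptr):
        return self._records[ptr][1]

    def Sp(self, ptr):
        return self._records[ptr][2]


def find_goroutine(goid, allg):
    """Walk allg until a g with a matching goid is found; return (pc, sp)."""
    while True:
        ptr = allg.fetch()
        if not ptr:
            return None, None
        if goid == allg.Goid(ptr):
            return allg.Pc(ptr), allg.Sp(ptr)


# g addresses and pc values are from the listing above; sp values are made up.
records = {
    0xc208000120: (1, 0x13415, 0xc208027e00),
    0xc2080006c0: (4, 0x13415, 0xc208025f00),
}
pc, sp = find_goroutine(4, FakeAllg(records))
print(hex(pc), hex(sp))
```

As far as I can tell, the goroutine command in the real script then points $pc and $sp at the returned values while it runs the requested command (such as bt).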

Debugging GO program with GDB #1

Go provides a Python script, runtime-gdb.py, that extends gdb to show goroutines and their call stacks.

However, it doesn’t work as written.
This patch fixed the first problem, but the script still fails (at least on OS X and Windows).

I found this article, which I think worked with older Go versions, but its approach fails with Go 1.4.2 (the latest as of March 2015).

I changed the offsets and the __init__ function of the Allg class so that it shows a list of goroutines, as below.

(gdb) info goroutines
  1 waiting  fname=runtime.gopark faddr=0x13585 &g=0xc208000120 waitreason="sleep"
  2 waiting  fname=runtime.gopark faddr=0x13585 &g=0xc208000480 waitreason="force gc (idle)"
  3 waiting  fname=runtime.gopark faddr=0x13585 &g=0xc2080005a0 waitreason="GC sweep wait"
  4 waiting  fname=runtime.gopark faddr=0x13585 &g=0xc2080006c0 waitreason="finalizer wait"
  5 syscall  fname=runtime.switchtoM faddr=0x36830 &g=0xc208000a20

More work is needed to fix ‘goroutine $goroutineid bt’.

(gdb) goroutine 1 bt
Python Exception <class 'gdb.error'> There is no member named status.:
Error occurred in Python command: There is no member named status.

Today’s change.

runtime/runtime-gdb.py:
class Allg:
    __allglen = -1
    __position = 0
    __allg = 0

    __offsets = {
            'status': 120,
            'waitreason': 144,
            'goid': 128,
            'm': 176,  # m follows seven 1-byte fields (168..174) plus 1 byte of padding
            'sched': 48,
            'sched.pc': 56,
            'sched.sp': 48,
            'stackguard': 16,
            'stackbase': 0,
        }

    def __init__(self):
        # first, fetch the number of active goroutines
        self.__allglen = int(str(gdb.parse_and_eval("&{uint64}'runtime.allglen'")), 16)
        # print("found allglen = {0}".format(self.__allglen))

        # get the next address in the array
        s = "&*{uint64}(&'runtime.allg')"
        self.__allg = int(gdb.parse_and_eval(s))
        # print("found allg = {0}".format(hex(self.__allg)))

    def fetch(self):
        if self.__position >= self.__allglen:
            return None

        s = "&*{uint64}(" + "{0}+{1})".format(self.__allg, self.__position*8)
        p = int(gdb.parse_and_eval(s))
        self.__position += 1
        return p

    def Status(self, a):
        s = "&*{int16}(" + "{0}+{1})".format(a, self.__offsets['status'])
        return int(gdb.parse_and_eval(s))

    def WaitReason(self, a):
        s = "&*{int64}(" + "{0}+{1})".format(a, self.__offsets['waitreason'])
        x = int(gdb.parse_and_eval(s))
        s = "&{int8}" + "{0}".format(x)
        return str(gdb.parse_and_eval(s))

    def Goid(self, a):
        s = "&*{int64}(" + "{0}+{1})".format(a, self.__offsets['goid'])
        return int(gdb.parse_and_eval(s))

    def M(self, a):
        s = "&*{uint64}(" + "{0}+{1})".format(a, self.__offsets['m'])
        return int(gdb.parse_and_eval(s))

    def Pc(self, a):
        s = "&*{uint64}(" + "{0}+{1})".format(a, self.__offsets['sched.pc'])
        return int(gdb.parse_and_eval(s))

    def Sp(self, a):
        s = "&*{uint64}(" + "{0}+{1})".format(a, self.__offsets['sched.sp'])
        return int(gdb.parse_and_eval(s))

    def Stackguard(self, a):
        s = "&*{uint64}(" + "{0}+{1})".format(a, self.__offsets['stackguard'])
        return int(gdb.parse_and_eval(s))

    def Stackbase(self, a):
        s = "&*{uint64}(" + "{0}+{1})".format(a, self.__offsets['stackbase'])
        return int(gdb.parse_and_eval(s))


class GoroutinesCmd(gdb.Command):
	"List all goroutines."
	__allg = None

	def __init__(self):
		gdb.Command.__init__(self, "info goroutines", gdb.COMMAND_STACK, gdb.COMPLETE_NONE)

	def invoke(self, _arg, _from_tty):
		self.__allg = Allg()
		while True:
			ptr = self.__allg.fetch()
			# print("fetched ptr = {0}".format(hex(ptr)))
			if not ptr:
				break

			st = self.__allg.Status(ptr)
			# print("status is {0}".format(st))
			w = self.__allg.WaitReason(ptr)
			# print("waitreason is {0}".format(w))
			#if st == 6:  # 'gdead'
			    #print("skipping over dead goroutine")
			    #continue

			s = ' '
			m = self.__allg.M(ptr)
			if m:
				s = '*'

			# if the status isn't "waiting" then the waitreason doesn't matter
			if st != 4:
				w = ''
			w2 = w.split('"')
			if len(w2) > 1:
				w = """waitreason="{0}\"""".format(w2[len(w2) - 2])

			pc = self.__allg.Pc(ptr)
			# print("pc is {0}".format(pc))
			blk = gdb.block_for_pc(pc)
			# print("blk is {0}".format(blk))
			goid = self.__allg.Goid(ptr)
			a = "fname={0} faddr={1}".format(blk.function, hex(pc))
			print(s, goid, "{0:8s}".format(sts[st]), a, "&g={0}".format(hex(ptr)), w)
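
Each accessor in Allg boils down to "read N bytes at g + offset". The sketch below shows the same idea without gdb: it packs a fake g record into a byte buffer and decodes it with Python's struct module. The buffer contents, the read_field helper, and the sp value are made up for illustration. The status names are assumed to match the sts tuple that runtime-gdb.py defines elsewhere in the script.

```python
import struct

# Offsets of a few g fields, as in Allg.__offsets
OFFSETS = {'sched.sp': 48, 'sched.pc': 56, 'status': 120, 'goid': 128}

# Status names; assumed to match the sts tuple in runtime-gdb.py
STS = ('idle', 'runnable', 'running', 'syscall',
       'waiting', 'moribund', 'dead', 'recovery')


def read_field(buf, name, fmt):
    """Decode one little-endian field of a fake g record at its known offset."""
    return struct.unpack_from('<' + fmt, buf, OFFSETS[name])[0]


# Build a fake 256-byte g record: goid 1, status 4 (waiting), made-up sp
g = bytearray(256)
struct.pack_into('<Q', g, OFFSETS['sched.sp'], 0xc208027e00)
struct.pack_into('<Q', g, OFFSETS['sched.pc'], 0x13415)
struct.pack_into('<I', g, OFFSETS['status'], 4)   # atomicstatus is a uint32
struct.pack_into('<q', g, OFFSETS['goid'], 1)

status = read_field(g, 'status', 'H')  # like Allg.Status, read only 16 bits
goid = read_field(g, 'goid', 'q')
pc = read_field(g, 'sched.pc', 'Q')
print(goid, STS[status], hex(pc))
```

Reading only the low 16 bits of the 32-bit status works here because amd64 is little-endian and the value is small.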

The __offsets in Allg were calculated from the g struct in the Go runtime’s zruntime_defs_darwin_amd64.go.

type g struct {
	stack        stack          // 0, +16
	stackguard0  uintptr        // 16
	stackguard1  uintptr        // 24
	_panic       *_panic        // 32
	_defer       *_defer        // 40
	sched        gobuf          // 48, +48
	syscallsp    uintptr        // 96
	syscallpc    uintptr        // 104
	param        unsafe.Pointer // 112
	atomicstatus uint32         // 120, padding +4
	goid         int64          // 128
	waitsince    int64          // 136
	waitreason   string         // 144, +16
	schedlink    *g             // 160
	issystem     bool           // 168 (each bool is 1 byte)
	preempt      bool           // 169
	paniconfault bool           // 170
	preemptscan  bool           // 171
	gcworkdone   bool           // 172
	throwsplit   bool           // 173
	raceignore   int8           // 174, padding +1
	m            *m             // 176
	...
}
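
The offsets can be double-checked mechanically. The sketch below applies the usual layout rule (each field starts at the next multiple of its alignment) to the fields listed above. The sizes and alignments in FIELDS are my assumptions for amd64: 8 bytes for pointers, uintptr and int64, 4 for uint32, 1 for bool and int8, 16 for string and stack, and 48 for gobuf. layout is a hypothetical helper, not part of runtime-gdb.py.

```python
# (name, size, alignment) per field; sizes are assumptions for amd64
FIELDS = [
    ('stack', 16, 8),
    ('stackguard0', 8, 8),
    ('stackguard1', 8, 8),
    ('_panic', 8, 8),
    ('_defer', 8, 8),
    ('sched', 48, 8),
    ('syscallsp', 8, 8),
    ('syscallpc', 8, 8),
    ('param', 8, 8),
    ('atomicstatus', 4, 4),
    ('goid', 8, 8),
    ('waitsince', 8, 8),
    ('waitreason', 16, 8),
    ('schedlink', 8, 8),
    ('issystem', 1, 1),
    ('preempt', 1, 1),
    ('paniconfault', 1, 1),
    ('preemptscan', 1, 1),
    ('gcworkdone', 1, 1),
    ('throwsplit', 1, 1),
    ('raceignore', 1, 1),
    ('m', 8, 8),
]


def layout(fields):
    """Return {name: offset}, rounding each field up to its alignment."""
    offsets = {}
    offset = 0
    for name, size, align in fields:
        offset = (offset + align - 1) // align * align  # round up
        offsets[name] = offset
        offset += size
    return offsets


offsets = layout(FIELDS)
for name in ('sched', 'atomicstatus', 'goid', 'waitreason', 'm'):
    print(name, offsets[name])
```

With one-byte bools, the six flags and raceignore pack into offsets 168 to 174, so this rule places m at 176.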

Golang calling convention

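I confirmed that the caller allocates stack space for the arguments and the return value, and the callee uses that space, as shown below. Go did not pass arguments in registers or return the value in rax, which differs from the usual x86-64 C calling conventions (rcx, rdx, r8, r9 in the Microsoft x64 convention; rdi, rsi, rdx, rcx, r8, r9 in System V).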

Sample Function:

c := f1(5,6)
...
func f1(a int, b int) int {
  return f2(a, b)
}

Compiled Code:

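// c := f1(5,6)
// The caller stores both arguments on its own stack and reads the result from [rsp+0x10].
   0x000000000000228f <+191>:	mov    QWORD PTR [rsp],0x5
   0x0000000000002297 <+199>:	mov    QWORD PTR [rsp+0x8],0x6
   0x00000000000022a0 <+208>:	call   0x2000 <main.f1>
   0x00000000000022a5 <+213>:	mov    rcx,QWORD PTR [rsp+0x10]

// func f1(a int, b int) int
// Prologue: stack-bound check; if it fails, call into the runtime and retry.
   0x0000000000002000 <+0>:	mov    rcx,QWORD PTR gs:0x8a0
   0x0000000000002009 <+9>:	cmp    rsp,QWORD PTR [rcx]
   0x000000000000200c <+12>:	ja     0x2015 <main.f1+21>
   0x000000000000200e <+14>:	call   0x27fb0
   0x0000000000002013 <+19>:	jmp    0x2000 <main.f1>
// Body: copy a and b from the caller's slots into f1's own frame, call f2,
// then copy f2's result into f1's return slot at [rsp+0x30].
   0x0000000000002015 <+21>:	sub    rsp,0x18
   0x0000000000002019 <+25>:	mov    rbx,QWORD PTR [rsp+0x20]
   0x000000000000201e <+30>:	mov    QWORD PTR [rsp],rbx
   0x0000000000002022 <+34>:	mov    rbx,QWORD PTR [rsp+0x28]
   0x0000000000002027 <+39>:	mov    QWORD PTR [rsp+0x8],rbx
   0x000000000000202c <+44>:	call   0x2040 <main.f2>
   0x0000000000002031 <+49>:	mov    rbx,QWORD PTR [rsp+0x10]
   0x0000000000002036 <+54>:	mov    QWORD PTR [rsp+0x30],rbx
   0x000000000000203b <+59>:	add    rsp,0x18
   0x000000000000203f <+63>:	ret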
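
The offsets in f1 follow from simple arithmetic. After `sub rsp,0x18`, the caller's slots sit above f1's own 0x18-byte frame and the 8-byte return address pushed by call. The sketch below computes them. callee_slots is a hypothetical helper, and it assumes every argument and result is one 8-byte word, as in this example.

```python
WORD = 8  # bytes per machine word on amd64


def callee_slots(frame_size, n_args, n_results):
    """Offsets from the callee's rsp (after 'sub rsp, frame_size') to the
    argument and result slots that the caller prepared."""
    base = frame_size + WORD  # skip the callee's frame and the return address
    args = [base + i * WORD for i in range(n_args)]
    results = [base + (n_args + i) * WORD for i in range(n_results)]
    return args, results


# f1 has a 0x18-byte frame, two int arguments and one int result
args, results = callee_slots(0x18, 2, 1)
print([hex(a) for a in args], [hex(r) for r in results])
```

This gives 0x20 and 0x28 for a and b and 0x30 for the return value, matching the [rsp+0x20], [rsp+0x28] and [rsp+0x30] operands in the disassembly.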