Debugging a skynet core file with lua-gdb #84

Open
hanxi opened this issue Aug 17, 2022 · 1 comment

hanxi commented Aug 17, 2022

Test environment: Linux version 5.4.0-124-generic (buildd@lcy02-amd64-089) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #140-Ubuntu SMP Thu Aug 4 02:23:37 UTC 2022

Enable core file generation
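
The post gives no commands for this step. A minimal sketch, assuming a bash-like shell on Linux: it raises the soft core-size limit of the current shell to the hard limit and prints the kernel's core_pattern, which decides where a core file is written and how it is named. The function name enable_core_dumps is made up for illustration. Whether a core.<pid> file appears in the current directory, as it does later in this post, depends on that pattern.

```shell
# Hypothetical helper: raise the soft core-size limit to the hard limit.
# The limit applies to this shell and to processes started from it.
enable_core_dumps() {
    ulimit -S -c "$(ulimit -H -c)"
}

enable_core_dumps
echo "core size limit: $(ulimit -S -c)"

# core_pattern controls the name and location of the core file (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
fi
```

Run this in the same shell that will later start ./skynet, since the limit is inherited by child processes only.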

Create a condition that triggers a core dump

It is easiest to start from an existing C library. I modified the md5 library code directly to introduce a NULL pointer. The diff is as follows:

diff --git a/3rd/lua-md5/md5lib.c b/3rd/lua-md5/md5lib.c
index 2580b6a..7117eb9 100644
--- a/3rd/lua-md5/md5lib.c
+++ b/3rd/lua-md5/md5lib.c
@@ -22,7 +22,7 @@
 *  @return  A 128-bit hash string.
 */
 static int lmd5 (lua_State *L) {
-  char buff[16];
+  char *buff = NULL;
   size_t l;
   const char *message = luaL_checklstring(L, 1, &l);
   md5(message, l, buff);
diff --git a/examples/main.lua b/examples/main.lua
index 5a2150d..93e5ca4 100644
--- a/examples/main.lua
+++ b/examples/main.lua
@@ -1,5 +1,6 @@
 local skynet = require "skynet"
 local sprotoloader = require "sprotoloader"
+local md5 = require "md5"

 local max_client = 64

@@ -18,5 +19,6 @@ skynet.start(function()
                nodelay = true,
        })
        skynet.error("Watchdog listen on", 8888)
+       md5.sumhexa("mongo")
        skynet.exit()
 end)

Then running ./skynet examples/config produces the following output:

[:01000002] LAUNCH snlua bootstrap
[:01000003] LAUNCH snlua launcher
[:01000004] LAUNCH snlua cmaster
[:01000004] master listen socket 0.0.0.0:2013
[:01000005] LAUNCH snlua cslave
[:01000005] slave connect to master 127.0.0.1:2013
[:01000004] connect from 127.0.0.1:48190 4
[:01000006] LAUNCH harbor 1 16777221
[:01000004] Harbor 1 (fd=4) report 127.0.0.1:2526
[:01000005] Waiting for 0 harbors
[:01000005] Shakehand ready
[:01000007] LAUNCH snlua datacenterd
[:01000008] LAUNCH snlua service_mgr
[:01000009] LAUNCH snlua main
[:01000009] Server start
[:0100000a] LAUNCH snlua protoloader
[:0100000b] LAUNCH snlua console
[:0100000c] LAUNCH snlua debug_console 8000
[:0100000c] Start debug console at 127.0.0.1:8000
[:0100000d] LAUNCH snlua simpledb
[:0100000e] LAUNCH snlua watchdog
[:0100000f] LAUNCH snlua gate
[:0100000f] Listen on 0.0.0.0:8888
Segmentation fault (core dumped)

At this point a file with a name like core.577461 should appear in the current directory.

Start debugging

First clone https://github.com/xjdrew/lua-gdb.git. Here we assume it is placed in the skynet directory, which is also the current directory:

git clone https://github.com/xjdrew/lua-gdb.git

Run gdb ./skynet core.577461 and you should see output similar to the following. The C call stack is now available.

Reading symbols from ./skynet...
[New LWP 577472]
[New LWP 577461]
[New LWP 577467]
[New LWP 577464]
[New LWP 577469]
[New LWP 577470]
[New LWP 577462]
[New LWP 577471]
[New LWP 577463]
[New LWP 577466]
[New LWP 577468]
[New LWP 577465]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./skynet examples/config'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fea28b41837 in word32tobytes (output=<optimized out>, input=0x7fea21bf9bd0) at 3rd/lua-md5/md5.c:71
71          WORD32 v = *input++;
[Current thread is 1 (Thread 0x7fea21bfc700 (LWP 577472))]
(gdb)

Then source the plugin inside gdb:

(gdb) source lua-gdb/lua-gdb.py
Loading Lua Runtime support.

Running the luacoroutines command right away to list the Lua coroutines does not work. It fails because it cannot find L:

(gdb) luacoroutines
Python Exception <class 'gdb.error'> No symbol "L" in current context.:
Error occurred in Python: No symbol "L" in current context.

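We need to switch to a stack frame in which the variable L is in scope (L is the lua_State pointer that Lua C functions receive). First use the where command to list the frames: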

(gdb) where
#0  0x00007fea28b41837 in word32tobytes (output=<optimized out>, input=0x7fea21bf9bd0) at 3rd/lua-md5/md5.c:71
#1  md5 (message=0x7fea206b0850 "mongo", len=5, output=output@entry=0x0) at 3rd/lua-md5/md5.c:212
#2  0x00007fea28b41b47 in lmd5 (L=0x7fea206590a8) at 3rd/lua-md5/md5lib.c:28
#3  0x000056213553a74e in precallC (f=0x7fea28b41b10 <lmd5>, nresults=1, func=<optimized out>, L=0x7fea206590a8) at ldo.c:506
#4  luaD_precall (L=L@entry=0x7fea206590a8, func=<optimized out>, func@entry=0x7fea20633540, nresults=1) at ldo.c:572
#5  0x0000562135548e38 in luaV_execute (L=L@entry=0x7fea206590a8, ci=<optimized out>, ci@entry=0x7fea206b5bc0) at lvm.c:1638
#6  0x000056213553a322 in unroll (L=0x7fea206590a8, ud=<optimized out>) at ldo.c:717
#7  0x0000562135539a05 in luaD_rawrunprotected (L=L@entry=0x7fea206590a8, f=f@entry=0x56213553a940 <resume>, ud=ud@entry=0x7fea21bf9f3c) at ldo.c:144
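
The search for a frame with L in scope can be sketched as a small shell filter over the backtrace text. This is an illustration only and not part of lua-gdb: first_lua_frame is a hypothetical helper that prints the number of the innermost frame whose argument list contains L=. For the backtrace above it prints 2 (lmd5), which also receives L as an argument. The post itself continues with frame #3.

```shell
# Hypothetical helper: read a gdb backtrace on stdin and print the number
# of the first (innermost) frame line that has an argument named L.
first_lua_frame() {
    awk '/^#[0-9]+/ && /[(, ]L=/ { sub(/^#/, "", $1); print $1; exit }'
}

# Possible use against the core file from this post (requires gdb and the files):
# gdb -batch -ex where ./skynet core.577461 | first_lua_frame
```

The filter only looks at the text of each frame line, so it assumes gdb printed the function arguments, which needs debug symbols as in the session shown here.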

Then use the up 3 command to move to frame #3. Now luacoroutines can list the coroutines:

(gdb) up 3
#3  0x000056213553a74e in precallC (f=0x7fea28b41b10 <lmd5>, nresults=1, func=<optimized out>, L=0x7fea206590a8) at ldo.c:506
506       n = (*f)(L);  /* do the actual call */
(gdb) luacoroutines
m <coroutine 0x7fea2060ef08> = {[source] = [C]:-1, [func] = ?}
  <coroutine 0x7fea206590a8> = {[source] = [C]:-1, [func] = 0x7fea28b41b10 <lmd5>}
  <coroutine 0x7fea20658fc8> = {[source] = [C]:-1, [func] = 0x7fea28f14370 <luaB_coresume>}
  <coroutine 0x7fea2060ef08> = {[source] = [C]:-1, [func] = ?}

Then use the luastack command to view the Lua call stack:

(gdb) luastack 0x7fea206590a8
#0      0x7fea20633550  {val = "mongo", tbclist = {value_ = {gc = 0x7fea206b0830, p = 0x7fea206b0830, f = 0x7fea206b0830, i = 140643542960176, n = 6.9487142886020459e-310}, tt_ = 68 'D', delta = 0}}
#1      0x7fea20633540  {val = <lmd5>, tbclist = {value_ = {gc = 0x7fea28b41b10 <lmd5>, p = 0x7fea28b41b10 <lmd5>, f = 0x7fea28b41b10 <lmd5>, i = 140643681966864, n = 6.9487211564449542e-310}, tt_ = 22 '\026', delta = 0}}
#2      0x7fea20633530  {val = "mongo", tbclist = {value_ = {gc = 0x7fea206b0830, p = 0x7fea206b0830, f = 0x7fea206b0830, i = 140643542960176, n = 6.9487142886020459e-310}, tt_ = 68 'D', delta = 0}}
#3      0x7fea20633520  {val = <lclosure 0x7fea207304c0> = {[file] = "@./lualib/md5.lua", [linestart] = 11, [lineend] = 16, [nupvalues] = 2 '\002'}, tbclist = {value_ = {gc = 0x7fea207304c0, p = 0x7fea207304c0, f = 0x7fea207304c0, i = 140643543483584, n = 6.9487143144618371e-310}, tt_ = 70 'F', delta = 0}}
#4      0x7fea20633510  {val = 16777230, tbclist = {value_ = {gc = 0x100000e, p = 0x100000e, f = 0x100000e, i = 16777230, n = 8.2890529753771368e-317}, tt_ = 3 '\003', delta = 0}}
#5      0x7fea20633500  {val = <lclosure 0x7fea20731600> = {[file] = "@./examples/main.lua", [linestart] = 7, [lineend] = 24, [nupvalues] = 3 '\003'}, tbclist = {value_ = {gc = 0x7fea20731600, p = 0x7fea20731600, f = 0x7fea20731600, i = 140643543488000, n = 6.9487143146800165e-310}, tt_ = 70 'F', delta = 0}}
#6      0x7fea206334f0  {val = <lclosure 0x7fea206ad0a0> = {[file] = "@./lualib/skynet.lua", [linestart] = 934, [lineend] = 937, [nupvalues] = 2 '\002'}, tbclist = {value_ = {gc = 0x7fea206ad0a0, p = 0x7fea206ad0a0, f = 0x7fea206ad0a0, i = 140643542945952, n = 6.9487142878992869e-310}, tt_ = 70 'F', delta = 0}}
#7      0x7fea206334e0  {val = "False", tbclist = {value_ = {gc = 0x0, p = 0x0, f = 0x0, i = 0, n = 0}, tt_ = 17 '\021', delta = 0}}
#8      0x7fea206334d0  {val = <db_traceback>, tbclist = {value_ = {gc = 0x562135554d70 <db_traceback>, p = 0x562135554d70 <db_traceback>, f = 0x562135554d70 <db_traceback>, i = 94700628692336, n = 4.6788327276451069e-310}, tt_ = 22 '\026', delta = 0}}
#9      0x7fea206334c0  {val = <lclosure 0x7fea206ad0a0> = {[file] = "@./lualib/skynet.lua", [linestart] = 934, [lineend] = 937, [nupvalues] = 2 '\002'}, tbclist = {value_ = {gc = 0x7fea206ad0a0, p = 0x7fea206ad0a0, f = 0x7fea206ad0a0, i = 140643542945952, n = 6.9487142878992869e-310}, tt_ = 70 'F', delta = 0}}
#10     0x7fea206334b0  {val = <luaB_xpcall>, tbclist = {value_ = {gc = 0x5621355538a0 <luaB_xpcall>, p = 0x5621355538a0 <luaB_xpcall>, f = 0x5621355538a0 <luaB_xpcall>, i = 94700628687008, n = 4.6788327273818687e-310}, tt_ = 22 '\026', delta = 0}}
#11     0x7fea206334a0  {val = <lclosure 0x7fea206ad0a0> = {[file] = "@./lualib/skynet.lua", [linestart] = 934, [lineend] = 937, [nupvalues] = 2 '\002'}, tbclist = {value_ = {gc = 0x7fea206ad0a0, p = 0x7fea206ad0a0, f = 0x7fea206ad0a0, i = 140643542945952, n = 6.9487142878992869e-310}, tt_ = 70 'F', delta = 0}}
#12     0x7fea20633490  {val = <lclosure 0x7fea20731600> = {[file] = "@./examples/main.lua", [linestart] = 7, [lineend] = 24, [nupvalues] = 3 '\003'}, tbclist = {value_ = {gc = 0x7fea20731600, p = 0x7fea20731600, f = 0x7fea20731600, i = 140643543488000, n = 6.9487143146800165e-310}, tt_ = 70 'F', delta = 0}}
#13     0x7fea20633480  {val = <lclosure 0x7fea20606950> = {[file] = "@./lualib/skynet.lua", [linestart] = 933, [lineend] = 946, [nupvalues] = 5 '\005'}, tbclist = {value_ = {gc = 0x7fea20606950, p = 0x7fea20606950, f = 0x7fea20606950, i = 140643542264144, n = 6.948714254213496e-310}, tt_ = 70 'F', delta = 0}}
#14     0x7fea20633470  {val = <lclosure 0x7fea207316c0> = {[file] = "@./lualib/skynet.lua", [linestart] = 950, [lineend] = 953, [nupvalues] = 3 '\003'}, tbclist = {value_ = {gc = 0x7fea207316c0, p = 0x7fea207316c0, f = 0x7fea207316c0, i = 140643543488192, n = 6.9487143146895025e-310}, tt_ = 70 'F', delta = 0}}
#15     0x7fea20633460  {val = <lclosure 0x7fea20623a40> = {[file] = "@./lualib/skynet.lua", [linestart] = 252, [lineend] = 282, [nupvalues] = 10 '\n'}, tbclist = {value_ = {gc = 0x7fea20623a40, p = 0x7fea20623a40, f = 0x7fea20623a40, i = 140643542383168, n = 6.9487142600940629e-310}, tt_ = 70 'F', delta = 0}}
#16     0x7fea20633450  {val = 1, tbclist = {value_ = {gc = 0x1, p = 0x1, f = 0x1, i = 1, n = 4.9406564584124654e-324}, tt_ = 3 '\003', delta = 0}}
#17     0x7fea20633440  {val = 0, tbclist = {value_ = {gc = 0x0, p = 0x0, f = 0x0, i = 0, n = 0}, tt_ = 3 '\003', delta = 0}}
#18     0x7fea20633430  {val = "<lightuserdata 0x0>", tbclist = {value_ = {gc = 0x0, p = 0x0, f = 0x0, i = 0, n = 0}, tt_ = 2 '\002', delta = 0}}
#19     0x7fea20633420  {val = "False", tbclist = {value_ = {gc = 0x0, p = 0x0, f = 0x0, i = 0, n = 0}, tt_ = 17 '\021', delta = 0}}
#20     0x7fea20633410  {val = <lclosure 0x7fea20623a40> = {[file] = "@./lualib/skynet.lua", [linestart] = 252, [lineend] = 282, [nupvalues] = 10 '\n'}, tbclist = {value_ = {gc = 0x7fea20623a40, p = 0x7fea20623a40, f = 0x7fea20623a40, i = 140643542383168, n = 6.9487142600940629e-310}, tt_ = 70 'F', delta = 0}}

This call stack is quite informative: it includes the Lua source files and line numbers.
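
The interactive session above can also be replayed in one go with gdb's batch mode. The following is a rough sketch under several assumptions: lua_core_report is a hypothetical wrapper name, lua-gdb is assumed to be cloned into the current directory as described earlier, and the frame number and coroutine address have to be taken from a previous look at the same core file.

```shell
# Hypothetical wrapper: load the plugin, select a frame that has L in scope,
# then list the coroutines and dump the Lua stack of one of them.
# Arguments: executable, core file, frame number, coroutine address.
lua_core_report() {
    if [ "$#" -ne 4 ]; then
        echo "usage: lua_core_report EXE CORE FRAME COROUTINE_ADDR" >&2
        return 2
    fi
    gdb -batch \
        -ex 'source lua-gdb/lua-gdb.py' \
        -ex "frame $3" \
        -ex 'luacoroutines' \
        -ex "luastack $4" \
        "$1" "$2"
}

# With the values from this session (requires gdb, ./skynet and the core file):
# lua_core_report ./skynet core.577461 3 0x7fea206590a8
```

Starting from frame 0, frame 3 selects the same frame as up 3 in the interactive session.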

Wrapping up

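People who know the Lua VM well may not need a tool at all and can extract the Lua information from L by hand, for example with the method described here:
https://blog.codingnow.com/2017/05/gdb_coredumplua.html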

Still, I recommend the lua-gdb tool. It really does make things much easier: https://github.com/xjdrew/lua-gdb

That is it for this introductory write-up. It was prompted by someone asking how to use the tool here. @RiceCN cloudwu/skynet#1624

@antsmallant

Thanks a lot, this is very clearly written.
