A terminal debugging library built on cfadmin.
- Clone the code into the 3rd directory.
- Import it with local console = require "debug.console".
Two connection methods are supported:
- Listen on a port - console.start("127.0.0.1", 6666)
- Listen on a socket file - console.startx("local.sock")
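The first method supports only single-process mode. The second also supports multi-process mode, provided you configure the socket file names yourself.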
Write the following into script/main.lua:
local console = require "debug.console"
console.startx("local.sock")
Then start it by running ./cfadmin -e script/main.lua. (In a real service, just place this code before the final startup call.)
Finally, run nc -U local.sock on the command line. If you see the following output, the connection succeeded.
[candy@MacBookPro:~/Documents/cfadmin] $ nc -U local.sock
Welcome! This is cfadmin Debug Console:
gc - Can run/stop/modify/count garbage collectors.
run - Execute the lua script like `main` coroutine.
dump - Prints more information about the specified data structure.
stat - Process usage data analysis report.
>>>
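- stat - Prints the resource usage of the process.
- run - Runs the script with the given file name.
- dump - Pretty-prints the specified data structure.
- gc - Lets the user operate the GC manually.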
Run the stat command without arguments to see its usage help:
>>> stat
stat [command] :
[cpu] - CPU kernel space and user space usage of the current process.
[mem] - Memory usage report of the current process.
[page] - `hard_page_fault` and `soft_page_fault` of the current process.
[all] - Return all of the above information.
>>>
Following the hint, stat all prints everything, as shown below:
>>> stat all
CPU(User): 0.40%
CPU(Kernel): 0.33%
Lua Memory: 239.7256/KB
Swap Memory: 0.0000/KB
Total Memory: 2.1720/MB
Hard Page Faults: 0
Soft Page Faults: 739
>>>
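The Lua Memory figure above is the kind of number that standard Lua exposes through collectgarbage("count"), which returns the memory in use by the Lua state in kilobytes. The sketch below only illustrates that call; how debug.console actually gathers its figures is not shown in this document, and lua_memory_line is a hypothetical helper name.

```lua
-- Hypothetical helper, not part of debug.console: formats the memory used by
-- the current Lua state in the same style as the "Lua Memory" line above.
local function lua_memory_line()
  -- collectgarbage("count") returns the memory in use by Lua, in kilobytes.
  local kb = collectgarbage("count")
  return string.format("Lua Memory: %.4f/KB", kb)
end

print(lua_memory_line())
```

The CPU and page-fault figures need operating-system counters that plain Lua does not expose, so they are left out of this sketch.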
Sometimes we need to inspect data inside the Lua state. The dump command does this:
>>> dump
dump [command] [key1] [key1] [keyN] :
[global] - dump global table (`_G`).
[registery] - dump lua debug registery table.
[filename] - dump already loaded package and its return table .
--
`keyX` means we can get `deep value` like `table[key1][key2]..[keyN]`
e.g :
1. dump cf wait
2. dump global string
>>>
For example, to print the global table _G and see which keys it contains, run:
>>> dump g
global{
['tonumber'] = function: 0x107b22ec0
['error'] = function: 0x107b22550
['setmetatable'] = function: 0x107b22e20
['string'] = table: 0x7ffd8b508120
['pcall'] = function: 0x107b229b0
['rawset'] = function: 0x107b22d10
['rawget'] = function: 0x107b22cc0
['print'] = function: 0x107b22a40
['os'] = table: 0x7ffd8b5070f0
['io'] = table: 0x7ffd8b507620
['loadfile'] = function: 0x107b22670
['require'] = function: 0x7ffd8b506bb0
['coroutine'] = table: 0x7ffd8b5071b0
['utf8'] = table: 0x7ffd8b506870
['assert'] = function: 0x107b22280
['pairs'] = function: 0x107b22920
['rawequal'] = function: 0x107b22c10
['collectgarbage'] = function: 0x107b22300
['warn'] = function: 0x107b22b50
['table'] = table: 0x7ffd8b507420
['NULL'] = userdata: 0x0
['null'] = userdata: 0x0
['debug'] = table: 0x7ffd8b5073c0
['tostring'] = function: 0x107b23110
['math'] = table: 0x7ffd8b508850
['load'] = function: 0x107b22750
['ipairs'] = function: 0x107b22620
['_G'] = table: 0x7ffd8b505c30
['rawlen'] = function: 0x107b22c60
['type'] = function: 0x107b23140
['next'] = function: 0x107b228c0
['_VERSION'] = 'Lua 5.4'
['dofile'] = function: 0x107b224e0
['select'] = function: 0x107b22d70
['package'] = table: 0x7ffd8b506510
['getmetatable'] = function: 0x107b225d0
['xpcall'] = function: 0x107b231a0
}
counter:
total keys count: 37
string value count: 1
function value count: 24
usedata value count: 2
table value count: 10
Done.
>>>
Yes, you read that right! When the dumped value is a table, dump also counts its contents and reports the totals by value type.
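The per-type counter can be pictured as a single pass over the table that groups values by type(). This is a minimal sketch of that idea, not the library's own code; count_values and the sample table are hypothetical.

```lua
-- Hypothetical helper: counts the values of a table by Lua type.
local function count_values(t)
  local counter = { total = 0 }
  for _, value in pairs(t) do
    local kind = type(value)
    counter[kind] = (counter[kind] or 0) + 1
    counter.total = counter.total + 1
  end
  return counter
end

-- Sample data, made up for this example.
local sample = { name = "cfadmin", run = print, opts = {}, retries = 3 }
local counter = count_values(sample)
print("total keys count:", counter.total)
print("string value count:", counter.string)
print("function value count:", counter["function"])
print("table value count:", counter.table)
```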
What about functions? If a function is written in Lua, dump can point to the file and line where it is defined:
>>> dump g package loaded debug.console
debug.console{
['startx'] = function: 0x7ffd8b4118a0(3rd/debug/console.lua:86)
['start'] = function: 0x7ffd8b415760(3rd/debug/console.lua:76)
}
counter:
total keys count: 2
function value count: 2
Done.
>>>
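Standard Lua makes this kind of lookup possible through debug.getinfo, which reports the source file and the line where a Lua function was defined. The following sketch shows the idea under that assumption; describe is a hypothetical helper, and whether the console uses exactly this call is not stated here.

```lua
-- Hypothetical helper: describes where a function was defined.
local function describe(fn)
  local info = debug.getinfo(fn, "S")
  if info.what == "C" then
    return "C function"  -- no Lua source to point at
  end
  return string.format("%s:%d", info.short_src, info.linedefined)
end

local function greet() return "hi" end

print(describe(greet))  -- file and line of the definition above
print(describe(print))  -- print is implemented in C
```

This also explains why the functions in the dump g output carry no location: the base library functions are implemented in C.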
What about the registry? Replace g with r to view its contents:
>>> dump r
registery{
[1] = thread: 0x7ffd8c009a08
[2] = table: 0x7ffd8b505c30
['__Task__'] = table: 0x7ffd8b406620
['FILE*'] = table: 0x7ffd8b507920
['_IO_input'] = file (0x7fff975c5d90)
['__G_UDP__'] = table: 0x7ffd8b510360
['_LOADED'] = table: 0x7ffd8b5062f0
['_UBOX*'] = table: 0x7ffd8b708c30
['_PRELOAD'] = table: 0x7ffd8b507090
['_IO_output'] = file (0x7fff975c5e28)
['__G_TCP__'] = table: 0x7ffd8b413ca0
['__TCP__'] = table: 0x7ffd8b5145f0
['__TIMER__'] = table: 0x7ffd8b409880
['_CLIBS'] = table: 0x7ffd8b506b70
['__G_TIMER__'] = table: 0x7ffd8b40a0a0
['__UDP__'] = table: 0x7ffd8b5102e0
}
counter:
total keys count: 16
usedata value count: 2
thread value count: 1
table value count: 13
Done.
>>>
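From Lua code, the same table is reachable with debug.getregistry(). The sketch below lists the string keys of the registry and checks that the _LOADED entry is the table behind package.loaded. It assumes a standard Lua interpreter; entries such as __TCP__ or __TIMER__ belong to cfadmin and will not appear outside it.

```lua
local registry = debug.getregistry()

-- Collect the string keys and sort them so the listing is stable.
local names = {}
for key in pairs(registry) do
  if type(key) == "string" then
    names[#names + 1] = key
  end
end
table.sort(names)

for _, name in ipairs(names) do
  print(name, type(registry[name]))
end

-- In standard Lua, _LOADED is the same table as package.loaded.
print(registry._LOADED == package.loaded)
```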
As the examples show, the syntax is simply key names separated by spaces. Once you are used to it, you can locate values quickly.
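The key path can be read as table[key1][key2]..[keyN], with each space-separated word indexing one level deeper. A minimal sketch of such a lookup follows; resolve is a hypothetical helper that handles string keys only, and it is not the library's implementation.

```lua
-- Hypothetical helper: follows a space-separated key path from a root table.
local function resolve(root, path)
  local value = root
  for key in path:gmatch("%S+") do
    if type(value) ~= "table" then
      return nil, "cannot index a " .. type(value) .. " with '" .. key .. "'"
    end
    value = value[key]
  end
  return value
end

print(resolve(_G, "string format") == string.format)
print(resolve(_G, "package loaded string") == string)
print(resolve(_G, "_VERSION x"))  -- nil plus a message: _VERSION is a string
```

Because the path is split on whitespace only, a key that contains a dot, such as debug.console, stays in one piece.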
What if I only want to locate a package that has already been loaded with require?
>>> dump cf
cf{
['yield'] = function: 0x7fbc0f609360(lualib/cf/init.lua:35)
['at'] = function: 0x7fbc0f6095d0(lualib/cf/init.lua:30)
['sleep'] = function: 0x7fbc0f609780(lualib/cf/init.lua:46)
['wakeup'] = function: 0x7fbc0f609970(lualib/cf/init.lua:75)
['timeout'] = function: 0x7fbc0f609570(lualib/cf/init.lua:22)
['fork'] = function: 0x7fbc0f609940(lualib/cf/init.lua:68)
['wait'] = function: 0x7fbc0f609910(lualib/cf/init.lua:61)
['self'] = function: 0x7fbc0f609820(lualib/cf/init.lua:56)
['join'] = function: 0x7fbc0f609a10(lualib/cf/init.lua:93)
}
counter:
total keys count: 9
function value count: 9
Done.
>>>
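This helps users locate problems quickly and lowers the barrier for developers getting started.
Suppose our code has a hidden bug whose evidence is lost every time the process restarts.
The service also runs fine for a while after each start, and the bug appears only at a certain moment or once some special condition is met.
In this situation we need more runtime debugging capability, but we do not want to attach a debugger and disturb the running process.
So the framework must offer a way to execute code safely at any time!
Now let's create a file named script/demo.lua with the following code: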
local function f1()
  print("f1")
end

local function f2()
  print("f2")
end


local function f()
  f1()
  f2()
end

f()
Once the file is written, run the script inside the running framework:
>>> run script/demo.lua
Total Running Time: 0.000
Done.
>>>
You will then see that the framework process we started earlier has printed two lines:
[candy@MacBookPro:~/Documents/cfadmin] $ ./cfadmin
f1
f2
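This shows that our code ran successfully!
But that is not enough: sometimes we also need to know how the script actually executed.
For that, append one more argument to the command, and the console also prints the call trace of the script.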
>>> run script/demo.lua true
callstack traceback:
└----> [OK] [NEXT LINE] [script/demo.lua:3]
└----> [OK] [NEXT LINE] [script/demo.lua:7]
└----> [OK] [NEXT LINE] [script/demo.lua:13]
└----> [OK] [NEXT LINE] [script/demo.lua:15]
└--------> [OK] [NEXT LINE] [script/demo.lua:11]
└------------> [OK] [NEXT LINE] [script/demo.lua:2]
└------------> [OK] [NEXT LINE] [script/demo.lua:3]
└------------> [OK] [GOTO BACK] [script/demo.lua:3]
└--------> [OK] [NEXT LINE] [script/demo.lua:12]
└------------> [OK] [NEXT LINE] [script/demo.lua:6]
└------------> [OK] [NEXT LINE] [script/demo.lua:7]
└------------> [OK] [GOTO BACK] [script/demo.lua:7]
└--------> [OK] [NEXT LINE] [script/demo.lua:13]
└--------> [OK] [GOTO BACK] [script/demo.lua:13]
└----> [OK] [GOTO BACK] [script/demo.lua:15]
└----> [OK] [NEXT LINE] [3rd/debug/run.lua:83]
└----> [OK] [NEXT LINE] [3rd/debug/run.lua:84]
Total Running Time: 0.000
Done.
>>>
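A line-by-line trace like this can be produced in standard Lua with debug.sethook and a line hook. The sketch below records every line executed inside a chunk; run_traced and the inline demo chunk are hypothetical, and it does not reproduce the indentation or the GOTO BACK markers of the real output.

```lua
-- Hypothetical helper: runs `chunk` under a line hook and returns the
-- "source:line" entries recorded for code that lives in the chunk's source.
local function run_traced(chunk)
  local source = debug.getinfo(chunk, "S").short_src
  local lines = {}
  debug.sethook(function(_, line)
    -- Level 2 is the function that was running when the hook fired.
    local src = debug.getinfo(2, "S").short_src
    if src == source then
      lines[#lines + 1] = src .. ":" .. line
    end
  end, "l")
  local ok, result = pcall(chunk)
  debug.sethook()  -- always remove the hook again
  return ok, result, lines
end

-- A three-line chunk named "demo", loaded from a string to stay self-contained.
local demo = assert(load("local x = 1\nlocal y = x + 1\nreturn y", "=demo"))
local ok, result, lines = run_traced(demo)
print(ok, result)
for _, entry in ipairs(lines) do
  print(entry)
end
```

Without the source filter, the hook would also report the lines of the runner itself, which is presumably why the trace above ends with entries from 3rd/debug/run.lua.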
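Sometimes we also want to perform special operations on the GC and use those changes to observe differences in the overall behavior of the service.
For that, use the following debugging commands: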
>>> gc
gc [command] [args]:
[count] - Let the garbage collector report memory usage.
[step] - Let the garbage collector do a step garbage collection.
[collect] - Let the garbage collector do a full garbage collection.
[start] - Let the garbage collector (re)start.
[stop] - Let the garbage collector stop working.
[mode] - Let the garbage change work mode(`incremental` or `generational`).
>>>
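These commands cover pausing, restarting, running a full collection, changing the working mode, and more. They let us debug the GC at runtime.
Note: use them with care where performance requirements are high, because some operations may leave the process unable to serve external requests.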
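Each subcommand has an obvious counterpart among the options of the standard collectgarbage function. The sketch below spells out that mapping as an assumption, since the library's source is not shown here; gc_commands and gc are hypothetical names, and the mode switch requires Lua 5.4.

```lua
-- Hypothetical mapping from console-style subcommands to collectgarbage calls.
local gc_commands = {
  count   = function() return collectgarbage("count") end,    -- memory in KB
  step    = function() return collectgarbage("step") end,     -- one GC step
  collect = function() return collectgarbage("collect") end,  -- full cycle
  start   = function() return collectgarbage("restart") end,
  stop    = function() return collectgarbage("stop") end,
  -- Lua 5.4 only: the mode is "incremental" or "generational".
  mode    = function(name)
    assert(name == "incremental" or name == "generational", "unknown mode")
    return collectgarbage(name)
  end,
}

local function gc(command, ...)
  local handler = gc_commands[command]
  if not handler then
    return nil, "unknown gc command: " .. tostring(command)
  end
  return handler(...)
end

gc("stop")
print(collectgarbage("isrunning"))  -- collector is paused
gc("start")
print(collectgarbage("isrunning"))  -- collector is running again
print(gc("count"))
```

A stopped collector never frees memory, so a forgotten stop lets the process grow until it is restarted. This is one way the caution above can come true.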
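The features shown above are only the tip of the iceberg. For reasons of space, among others, we cannot show every feature here.
Still, by combining these commands sensibly with scripts of their own, developers can implement canary testing, hot fixes, hot updates, online debugging, and similar features.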
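As an illustration of the hot-fix idea, a script executed with run shares the Lua state of the live process, so it can replace a function on a module that is already loaded. The sketch below is an assumption about how such a script might look; the module name demo.service and its status function are made up, and the first statement only simulates a module loaded earlier so that the example runs on its own.

```lua
-- Stand-in for a module the service loaded earlier (hypothetical name).
package.loaded["demo.service"] = {
  status = function() return "v1 (buggy)" end,
}

-- Contents of the hot-fix script: patch the function on the shared table.
local service = require "demo.service"
local previous = service.status  -- keep the old version for rollback
service.status = function() return "v2 (fixed)" end

-- Code that looks the function up through the module table sees the fix.
print(require("demo.service").status())
```

Code that stored the old function in a local variable before the patch keeps calling the old version, so this approach works best for functions that are always reached through the module table.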
If you have any other questions, please ask in our discussion group.