How to view memory in gdb

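gdb: copying memory to and from files
dump [format] memory filename start_addr end_addr
    Write the given memory range to a file.
dump [format] value filename expr
    Write the value of an expression to a file.
format is one of:
    binary   raw binary
    ihex     Intel hex
    srec     Motorola S-record
    tekhex   Tektronix hex
append [binary] memory filename start_addr end_addr
    Append the memory range to a file in raw binary form.
append [binary] value filename expr
    Append the value of an expression to a file in raw binary form.
restore filename [binary] bias start_addr end_addr
    Load the contents of a file back into memory. If the file is raw binary, the binary argument must be given; otherwise gdb detects the file format automatically.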
dump command examples:
While debugging in gdb (even when examining a core dump), you can write the contents of the program's memory to a file.
(gdb) dump binary memory ./file START STOP
This writes the memory in the address range [START, STOP) to the file named file.
1) Write the memory in the range [$pc, $pc+450) to ./file
(gdb) p $pc
$1 = (void (*)()) 0x4004a7 <main+11>
(gdb) p $pc + 450
$2 = (void (*)()) 0x400669
(gdb) dump binary memory ./file $1 $2
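The address arithmetic in this example can be checked by hand. The following is a small sketch, not part of the original session: it recomputes the end address from the values gdb printed and shows that a half-open range [start, stop) yields exactly 450 bytes. The variable names are chosen for illustration only.

```python
# Values copied from the gdb session above.
start = 0x4004a7   # value of $pc
length = 450       # number of bytes to dump

stop = start + length       # exclusive end address passed to `dump`
dump_size = stop - start    # bytes written for the half-open range [start, stop)

print(hex(stop))    # 0x400669
print(dump_size)    # 450
```

Because the range excludes STOP, the size of the resulting file should equal STOP minus START.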
2) Write the first 5 bytes of the string s1 to ./dump
int main ()
{
    char s1[] = "abcdefghijklmnopqrstuvwxyz";
    char s2[] = "";
    return 0;
}
[root@ampcommons02 yasi]# gdb ./dump -q
Reading symbols from /home/yasi/s...done.
Breakpoint 1 at 0x4005a4: file s.cpp, line 6.
Starting program: /home/yasi/s
Breakpoint 1, main () at s.cpp:6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.6.x86_64 libgcc-4.4.6-4.el6.x86_64 libstdc++-4.4.6-4.el6.x86_64
(gdb) dump binary memory ./dump s1 s1+5
[root@ampcommons02 yasi]# cat ./dump
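gdb's new find command searches memory directly, which is much more convenient
In the past you had to write a macro for this; now it is built in.
The text below is taken from freshly generated documentation.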
8.20 Search Memory
Memory can be searched for a particular sequence of bytes with the find command.
find [/sn] start_addr, +len, val1 [, val2, ...]
find [/sn] start_addr, end_addr, val1 [, val2, ...]
Search memory for the sequence of bytes specified by val1, val2, etc. The search begins at address start_addr and continues for either len bytes or through to end_addr inclusive.
s and n are optional parameters. They may be specified in either order, apart or together.
s, search query size
The size of each search query value.
b  bytes
h  halfwords (two bytes)
w  words (four bytes)
g  giant words (eight bytes)
All values are interpreted in the current language. This means, for example, that if the current source language is C/C++ then searching for the string "hello" includes the trailing '\0'.
If the value size is not specified, it is taken from the value's type in the current language. This is useful when one wants to specify the search pattern as a mixture of types. Note that this means, for example, that in the case of C-like languages a search for an untyped 0x42 will search for `(int) 0x42' which is typically four bytes.
n, maximum number of finds
The maximum number of matches to print. The default is to print all finds.
You can use strings as search values. Quote them with double-quotes ("). The string value is copied into the search pattern byte by byte, regardless of the endianness of the target and the size specification.
The address of each match found is printed as well as a count of the number of matches found.
The address of the last value found is stored in convenience variable `$_'. A count of the number of matches is stored in `$numfound'.
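To make the size rules concrete, here is a minimal sketch of how typed search values turn into byte patterns. It uses Python's struct module rather than gdb itself, assumes a little-endian target such as x86, and the value 0x11223344 is an arbitrary illustration value, not taken from the manual.

```python
import struct

# "<" selects little-endian byte order and disables padding,
# which mirrors a packed struct on an x86 target (an assumption).
untyped_int = struct.pack("<i", 0x42)   # like searching for (int) 0x42
as_byte = struct.pack("<B", 0x42)       # like find /b ... 0x42

# A mixed pattern: char, short, int laid out back to back.
mixed = struct.pack("<bhI", ord("c"), 0x1234, 0x11223344)

print(untyped_int.hex())  # 42000000
print(as_byte.hex())      # 42
print(mixed.hex())        # 63341244332211
```

An untyped 0x42 therefore matches only where the three following bytes are zero, while the /b form matches any single 0x42 byte.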
For example, if stopped at the printf in this function:
void
hello ()
{
  static char hello[] = "hello-hello";
  static struct { char c; short s; int i; }
    __attribute__ ((packed)) mixed
    = { 'c', 0x1234, 0x87654321 };
  printf ("%s\n", hello);
}
you get during debugging:
(gdb) find &hello[0], +sizeof(hello), "hello"
0x804956d <hello.1620+6>
1 pattern found
(gdb) find &hello[0], +sizeof(hello), 'h', 'e', 'l', 'l', 'o'
0x8049567 <hello.1620>
0x804956d <hello.1620+6>
2 patterns found
(gdb) find /b1 &hello[0], +sizeof(hello), 'h', 0x65, 'l'
0x8049567 <hello.1620>
1 pattern found
(gdb) find &mixed, +sizeof(mixed), (char) 'c', (short) 0x1234, (int) 0x87654321
0x8049560 <mixed.1625>
1 pattern found
(gdb) print $numfound
$1 = 1
(gdb) print $_
$2 = (void *) 0x8049560
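The difference between the first two searches comes down to the trailing NUL byte. The sketch below emulates the search over a Python bytes object; find_all is a hypothetical helper written for illustration, and reporting overlapping matches is an assumption about the search behaviour rather than something the manual text states.

```python
def find_all(memory: bytes, pattern: bytes, max_count=None):
    """Return the offsets of every occurrence of pattern, overlapping ones included."""
    offsets = []
    pos = memory.find(pattern)
    while pos != -1 and (max_count is None or len(offsets) < max_count):
        offsets.append(pos)
        pos = memory.find(pattern, pos + 1)
    return offsets

# static char hello[] = "hello-hello";  (12 bytes including the NUL)
hello = b"hello-hello\x00"

print(find_all(hello, b"hello\x00"))           # [6]
print(find_all(hello, b"hello"))               # [0, 6]
print(find_all(hello, b"hel", max_count=1))    # [0]
```

The string "hello" carries its terminating NUL, so it only matches at offset 6, whereas the five separate characters match at offsets 0 and 6, in line with the addresses printed in the session.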
--------------------------------
GNU gdb (GDB) 6.8.50.
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb) help find
Search memory for a sequence of bytes.
find [/size-char] [/max-count] start-address, end-address, expr1 [, expr2 ...]
find [/size-char] [/max-count] start-address, +length, expr1 [, expr2 ...]
size-char is one of b,h,w,g for 8,16,32,64 bit values respectively,
and if not specified the size is taken from the type of the expression
in the current language.
Note that this means for example that in the case of C-like languages
a search for an untyped 0x42 will search for "(int) 0x42"
which is typically four bytes.
The address of the last match is stored as the value of "$_".
Convenience variable "$numfound" is set to the number of matches.
How to examine memory addresses with gdb
int age = 20;
int *p_age = &age;
NSLog(@"p_age = %p", p_age);
NSLog(@"&p_age = %p", &p_age);
15:54:07.048 Test:858079] p_age = 0x7fff5313f65c
15:54:07.050 Test:858079] &p_age
Can the values stored at the addresses 0x7fff5313f65c and 0x7fff be inspected?
How can one verify that the value at address 0x7fff is 0x7fff5313f65c, and that the value at address 0x7fff5313f65c is 20?
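In the debugger, the examine command (x for short) displays the value stored at a memory address. The session below was recorded in lldb, which accepts the same x syntax:
2017-07-19 16:06:54.160 Test0719[27514:863331] p_age = 0x7fff5c8d965c
2017-07-19 16:06:54.162 Test0719[27514:863331] &p_age = 0x7fff5c8d9650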
(lldb) x/d 0x7fff5c8d965c
0x7fff5c8d965c: 20
(lldb) x/x 0x7fff5c8d9650
0x7fff5c8d9650: 0x5c8d965c
(lldb) x/g 0x7fff5c8d9650
0x7fff5c8d9650: 0x00007fff5c8d965c
(lldb) x/d 0x00007fff5c8d965c
0x7fff5c8d965c: 4768020
(lldb) x/1db 0x00007fff5c8d965c
0x7fff5c8d965c: 20
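x/<n/f/u> <addr>
n, f, and u are optional parameters.
n is a positive integer giving the number of memory units to display, counting forward from the given address. The size of one unit is set by u.
f is the display format (see below). If the address points to a string, the format can be s; if it points to an instruction, the format can be i.
u is the size of each unit in bytes. If it is omitted, GDB initially defaults to 4 bytes. u is one of: b for one byte, h for two bytes, w for four bytes, g for eight bytes. Given a unit size, GDB reads that many bytes starting at the specified address and treats them as a single value.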
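The effect of the unit size can be reproduced outside the debugger. The following sketch assumes a little-endian machine and reuses the pointer value from the session above; examine and UNIT_SIZES are names invented for this illustration.

```python
# Eight bytes as they would sit in little-endian memory at &p_age,
# holding the pointer value 0x00007fff5c8d965c from the session above.
memory = (0x00007fff5c8d965c).to_bytes(8, "little")

UNIT_SIZES = {"b": 1, "h": 2, "w": 4, "g": 8}

def examine(mem: bytes, unit: str) -> int:
    """Read one unit of the given size from the start of mem."""
    return int.from_bytes(mem[:UNIT_SIZES[unit]], "little")

print(hex(examine(memory, "w")))  # 0x5c8d965c
print(hex(examine(memory, "g")))  # 0x7fff5c8d965c
print(hex(examine(memory, "b")))  # 0x5c
```

This matches the session: reading four bytes shows only the low half of the pointer, and the full value appears once the unit is g.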
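Normally GDB prints a variable's value according to its type, but you can also choose the output format yourself, for example to print an integer in hexadecimal, or in binary to inspect its individual bits. GDB's display formats are:
x  hexadecimal
d  signed decimal
u  unsigned decimal
o  octal
t  binary
a  address (hexadecimal, with the nearest symbol and offset when available)
c  character
f  floating point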
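As a rough illustration of what these format letters mean, the sketch below renders one integer in several ways. render is a hypothetical helper, the 32-bit width is an assumption, and the output only approximates what GDB prints (for instance, GDB's c format also shows the numeric value).

```python
def render(value: int, fmt: str, bits: int = 32) -> str:
    """Format an integer roughly the way the gdb format letters describe."""
    unsigned = value & ((1 << bits) - 1)   # two's-complement bit pattern
    if fmt == "x":
        return hex(unsigned)
    if fmt == "d":
        return str(value)
    if fmt == "u":
        return str(unsigned)
    if fmt == "o":
        return "0" + format(unsigned, "o") if unsigned else "0"
    if fmt == "t":
        return format(unsigned, "b")
    if fmt == "c":
        return repr(chr(unsigned & 0xFF))
    raise ValueError(f"unsupported format letter: {fmt}")

print(render(20, "x"))   # 0x14
print(render(20, "o"))   # 024
print(render(20, "t"))   # 10100
print(render(-1, "u"))   # 4294967295
print(render(65, "c"))   # 'A'
```

The d and u lines differ only for negative values, which is why u belongs with the decimal formats rather than the hexadecimal ones.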
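Getting started with gdb debugging: a high-quality guide from an expert
I did not expect someone of Brendan Gregg's stature to write a gdb tutorial like this one: gdb Debugging Full Example (Tutorial): ncurses. But as the article itself says at the start, he was not satisfied with the gdb articles available online, and that is how this high-quality guide came about. It is a gift for gdb beginners. (He Dengcheng)
If you are a system administrator and do not yet know who Brendan Gregg is, you have probably still come across his widely circulated work online. (Bo Xiaole)
A full gdb debugging session on ncurses:
I'm a little frustrated with finding "gdb examples" online that show the commands but not the output. gdb is the GNU Debugger, the standard debugger on Linux. I was reminded of the lack of example output when watching Greg Law's talk at CppCon 2015, which, thankfully, included output! It's well worth the 15 minutes.
It also inspired me to share a full gdb debugging example, with output and every step involved, including dead ends. This isn't a particularly interesting or exotic issue; it's just a routine gdb debugging session. But it covers the basics and could serve as a tutorial of sorts, bearing in mind there's a lot more to gdb than I used here.
I'll be running the following commands as root, since I'm debugging a tool that needs root access (for now). Use sudo to get root where needed. You also aren't expected to read through all of this: I've enumerated each step so you can browse them and find the ones of interest.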
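1. The Problem
A collection of BPF tools had a pull request for cachetop.py, which uses a top-like display to show page cache statistics by process. Great! However, when I tested it, I hit a segfault:
# ./cachetop.py
Segmentation fault
Note that it says "Segmentation fault" and not "Segmentation fault (core dumped)". I'd like a core dump to debug this. (A core dump is a copy of process memory that can be investigated with a debugger; the name comes from the era of magnetic core memory.)
Core dump analysis is one approach to debugging, but not the only one. I could run the program live in gdb to inspect the issue. I could also use an external tracer to grab data and stack traces on segfault events. We'll start with core dumps.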
2. Fixing Core Dumps
I'll check the core dump settings:
# ulimit -c
0
# cat /proc/sys/kernel/core_pattern
core
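ulimit -c shows the maximum size of core dumps created, and it is set to zero: core dumps are disabled (for this process and its children).
/proc/.../core_pattern is set to just "core", which drops a core dump file called "core" in the current directory. That would be fine for now, but I'll show how to set this up for a global location: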
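The same two settings can be read from a program. This is a minimal sketch using Python's standard resource module, assuming a Linux system; core_dump_settings is a name made up for this example, and note that getrlimit reports the limit in bytes, which is not necessarily the unit the shell's ulimit prints.

```python
import resource

def core_dump_settings(pattern_path="/proc/sys/kernel/core_pattern"):
    """Return (soft core size limit, core_pattern string or None)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        with open(pattern_path) as f:
            pattern = f.read().strip()
    except OSError:
        pattern = None  # not Linux, or /proc is not readable
    return soft, pattern

soft, pattern = core_dump_settings()
if soft == 0:
    print("core dumps disabled (soft limit is 0)")
elif soft == resource.RLIM_INFINITY:
    print("core dumps unlimited")
else:
    print(f"core dumps limited to {soft} bytes")
print("core_pattern:", pattern)
```

A soft limit of zero corresponds to the "Segmentation fault" message without "(core dumped)" seen earlier.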
# ulimit -c unlimited
# mkdir /var/cores
# echo "/var/cores/core.%e.%p" > /proc/sys/kernel/core_pattern
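You can customize core_pattern further; for example, %h is the hostname and %t is the time of the dump. The options are documented in the Linux kernel source under Documentation/sysctl/.
To make the core_pattern setting permanent, so that it survives reboots, set "kernel.core_pattern" in /etc/sysctl.conf.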
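To see how such a template turns into a file name, here is a small sketch that expands the specifiers mentioned in the text. expand_core_pattern and its keyword names are hypothetical; the kernel does the real expansion, and this sketch simply leaves unrecognised specifiers untouched.

```python
import re

def expand_core_pattern(pattern: str, **fields) -> str:
    """Expand %e, %p, %h, %t (and %%) in a core_pattern-style template."""
    names = {"e": "exe", "p": "pid", "h": "host", "t": "time"}

    def substitute(match):
        spec = match.group(1)
        if spec == "%":
            return "%"
        return str(fields[names[spec]])

    return re.sub(r"%([epht%])", substitute, pattern)

print(expand_core_pattern("/var/cores/core.%e.%p", exe="python", pid=30520))
# /var/cores/core.python.30520
```

With the executable name python and PID 30520, the template used in this walkthrough produces the core file name that appears in /var/cores.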
Trying again:
# ./cachetop.py
Segmentation fault (core dumped)
# ls -lh /var/cores
total 19M
-rw------- 1 root root 20M Aug  7 22:15 core.python.30520
# file /var/cores/core.python.30520
/var/cores/core.python.30520: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python ./cachetop.py'
That's better: we have our core dump.
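3. Starting GDB
Now I'll run gdb with the target program location (using shell substitution, "`", although you should specify the full path unless you're sure that will work) and the core dump file: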
# gdb `which python` /var/cores/core.python.30520
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
Find the GDB manual and other documentation resources online at:
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New LWP 30520]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.
Core was generated by `python ./cachetop.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xaac40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
The last two lines are especially interesting: they tell us the segmentation fault is in the doupdate() function from the libncursesw library. That's worth a quick web search in case it's a well-known issue. I took a quick look but didn't find a single common cause.
I can already guess what libncursesw is for, but if it were unfamiliar to you, its location under "/lib" and its ".so.*" suffix show that it is a shared library, which might have a man page, website, package description, and so on.
# dpkg -l | grep libncursesw
ii  libncursesw5:amd64  6.0+-1ubuntu1  amd64  shared libraries for terminal handling (wide character support)
I happen to be debugging this on Ubuntu, but the Linux distribution shouldn't matter for gdb usage.
4. Back Trace
Stack back traces show how we arrived at the point of failure, and are often enough to identify a common problem. bt (short for backtrace) is usually the first command I use in a gdb session:
(gdb) bt
#0  0xaac40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
#1  0xaa07e6 in wrefresh () from /lib/x86_64-linux-gnu/libncursesw.so.5
#2  0xa99616 in ?? () from /lib/x86_64-linux-gnu/libncursesw.so.5
#3  0xa9a325 in wgetch () from /lib/x86_64-linux-gnu/libncursesw.so.5
#4  0xcc6ec3 in ?? () from /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
#5  0xc4d5a in PyEval_EvalFrameEx ()
#6  0xc2e05 in PyEval_EvalCodeEx ()
#7  0xdef08 in ?? ()
#8  0xb1153 in PyObject_Call ()
#9  0xc73ec in PyEval_EvalFrameEx ()
#10 0xc2e05 in PyEval_EvalCodeEx ()
#11 0xcaf42 in PyEval_EvalFrameEx ()
#12 0xc2e05 in PyEval_EvalCodeEx ()
#13 0xc2ba9 in PyEval_EvalCode ()
#14 0xf20ef in ?? ()
#15 0xeca72 in PyRun_FileExFlags ()
#16 0xeb1f1 in PyRun_SimpleFileExFlags ()
#17 0xe18a in Py_Main ()
#18 0xbe10830 in __libc_start_main (main=0x49daf0 <main>, argc=2, argv=0x7ffd33d94838, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7ffd33d94828) at ../csu/libc-start.c:291
#19 0xda19 in _start ()
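Read from the bottom up to go from parent to child. The "??" entries are where symbol translation failed. Stack walking, which produces the stack trace, can also fail. In that case you'll likely see a single valid frame followed by a small number of bogus addresses. If symbols or stacks are too badly broken to make sense of the stack trace, there are usually ways to fix it: installing debug info packages (giving gdb more symbols, and letting it do DWARF-based stack walks), or recompiling the software from source with frame pointers and debugging information (-fno-omit-frame-pointer -g). Many of the "??" entries above can be fixed by installing the python-dbg package.
This particular stack doesn't look very helpful: frames 5 to 17 (indexed on the left) are Python internals, although we can't see the Python methods yet. Frame 4 is the _curses library, and then we're in libncursesw. The call order looks like wgetch() -> wrefresh() -> doupdate(). Just based on the names, I'd guess a window refresh. Why would that core dump?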
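When a backtrace is long, it can help to pull out the frame numbers and function names mechanically. The sketch below parses the textual backtrace format shown above with a regular expression; parse_frames and FRAME_RE are names invented here, and the pattern is only an approximation that covers the frame lines in this article.

```python
import re

FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+"        # frame index
    r"(?:0x[0-9a-f]+ in )?"     # optional program counter
    r"(?P<func>\S+) \("         # function name, or ?? when unresolved
)

def parse_frames(backtrace: str):
    """Return (frame number, function name) for each frame line."""
    frames = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((int(m.group("num")), m.group("func")))
    return frames

sample = """\
#0  0xaac40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
#1  0xaa07e6 in wrefresh () from /lib/x86_64-linux-gnu/libncursesw.so.5
#2  0xa99616 in ?? () from /lib/x86_64-linux-gnu/libncursesw.so.5
#3  0xa9a325 in wgetch () from /lib/x86_64-linux-gnu/libncursesw.so.5
"""

frames = parse_frames(sample)
unresolved = [num for num, func in frames if func == "??"]
print(frames)
print(unresolved)  # [2]
```

Listing the unresolved frames this way gives a quick sense of which libraries are missing symbols.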
5. Disassembly
I'll start by disassembling the function we segfaulted in, doupdate():
(gdb) disas doupdate
Dump of assembler code for function doupdate:
   0xaac2e0 <+0>:    push   %r15
   0xaac2e2 <+2>:    push   %r14
   0xaac2e4 <+4>:    push   %r13
   0xaac2e6 <+6>:    push   %r12
   0xaac2e8 <+8>:    push   %rbp
   0xaac2e9 <+9>:    push   %rbx
   0xaac2ea <+10>:   sub    $0xc8,%rsp
[...]
---Type <return> to continue, or q <return> to quit---
[...]
   0xaac3f7 <+279>:  cmpb   $0x0,0x21(%rcx)
   0xaac3fb <+283>:  je     0x7f0a37aacc3b <doupdate+2395>
   0xaac401 <+289>:  mov    0x20cb68(%rip),%rax        # 0x7f0a37cb8f70
   0xaac408 <+296>:  mov    (%rax),%rsi
   0xaac40b <+299>:  xor    %eax,%eax
=> 0xaac40d <+301>:  mov    0x10(%rsi),%rdi
   0xaac411 <+305>:  cmpb   $0x0,0x1c(%rdi)
   0xaac415 <+309>:  jne    0x7f0a37aac6f7 <doupdate+1047>
   0xaac41b <+315>:  movswl 0x4(%rcx),%ecx
   0xaac41f <+319>:  movswl 0x74(%rdx),%edi
   0xaac423 <+323>:  mov    %rax,0x40(%rsp)
[...]
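Output truncated. (I could also have typed just "disas" and it would have defaulted to doupdate.)
The "=>" marks the address of the segfault, which is the instruction mov 0x10(%rsi),%rdi: it loads the value stored at the memory address held in %rsi, plus an offset of 0x10, into the %rdi register. I'll check the state of the registers next.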
6. Check Registers
Printing register state using i r (short for info registers):
(gdb) i r
rax            0x0      0
rbx            0x1993060
rcx            0x19902a0
rdx            0x19ce7d0
rsi            0x0      0
rdi            0x19ce7d0
rbp            0x7f0a3848eb10   0x7f0a3848eb10 <SP>
rsp            0x7ffd33d93c00   0x7ffd33d93c00
r8             0x7f0a37cb93e0
r9             0x0      0
r10            0x8      8
r11            0x202    514
r12            0x0      0
r13            0x0      0
r14            0x7f0a3848eb10
r15            0x19ce7d0
rip            0x7f0a37aac40d   0x7f0a37aac40d <doupdate+301>
eflags         0x10246  [ PF ZF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
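Well, %rsi is zero. There's our problem! Zero is unlikely to be a valid address, and a segfault caused by dereferencing an uninitialized or NULL pointer is a common software bug.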
7. Memory Mappings
You can double-check whether zero is a valid address using i proc m (short for info proc mappings):
(gdb) i proc m
Mapped address spaces:
          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x6e7000   0x2e7000        0x0 /usr/bin/python2.7
            0x8e6000           0x8e8000     0x2000   0x2e6000 /usr/bin/python2.7
            0x8e8000           0x95f000    0x77000   0x2e8000 /usr/bin/python2.7
      0x7f0a37a8b000     0x7f0a37ab8000    0x2d000        0x0 /lib/x86_64-linux-gnu/libncursesw.so.5.9
      0x7f0a37ab8000     0x7f0a37cb8000   0x200000    0x2d000 /lib/x86_64-linux-gnu/libncursesw.so.5.9
      0x7f0a37cb8000     0x7f0a37cb9000     0x1000    0x2d000 /lib/x86_64-linux-gnu/libncursesw.so.5.9
      0x7f0a37cb9000     0x7f0a37cba000     0x1000    0x2e000 /lib/x86_64-linux-gnu/libncursesw.so.5.9
      0x7f0a37cba000     0x7f0a37ccd000    0x13000        0x0 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
      0x7f0a37ccd000     0x7f0a37ecc000   0x1ff000    0x13000 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
      0x7f0a37ecc000     0x7f0a37ecd000     0x1000    0x12000 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
      0x7f0a37ecd000     0x7f0a37ecf000     0x2000    0x13000 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
              0x7f0a             0x7f0a    0x16000        0x0 /lib/x86_64-linux-gnu/libgcc_s.so.1
              0x7f0a             0x7f0a   0x1ff000    0x16000 /lib/x86_64-linux-gnu/libgcc_s.so.1
              0x7f0a             0x7f0a     0x1000    0x15000 /lib/x86_64-linux-gnu/libgcc_s.so.1
              0x7f0a             0x7f0a    0x25000        0x0 /lib/x86_64-linux-gnu/libtinfo.so.5.9
              0x7f0a             0x7f0a   0x1ff000    0x25000 /lib/x86_64-linux-gnu/libtinfo.so.5.9
[...]
The first valid virtual address is 0x400000. Any address below that is invalid and will trigger a segmentation fault if referenced.
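The check being done by eye here can be written down as a range lookup. The following sketch uses a few (start, end) pairs copied from the mappings output; it is not the full address space, and is_mapped is a name chosen for this illustration.

```python
# (start, end) pairs copied from the `info proc mappings` output above;
# only a few rows are included, so this is not the full address space.
MAPPINGS = [
    (0x400000, 0x6e7000),              # /usr/bin/python2.7
    (0x8e6000, 0x8e8000),              # /usr/bin/python2.7
    (0x8e8000, 0x95f000),              # /usr/bin/python2.7
    (0x7f0a37a8b000, 0x7f0a37ab8000),  # libncursesw.so.5.9
]

def is_mapped(addr: int, mappings=MAPPINGS) -> bool:
    """True if addr lies inside any half-open [start, end) mapping."""
    return any(start <= addr < end for start, end in mappings)

rsi = 0x0
fault_addr = rsi + 0x10   # effective address of mov 0x10(%rsi),%rdi

print(is_mapped(fault_addr))       # False
print(is_mapped(0x7f0a37aac40d))   # True
```

The faulting load targets address 0x10, which falls below every mapping, while the instruction pointer itself sits inside the libncursesw text mapping as expected.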
At this point there are several different ways to dig further. I'll start with some instruction stepping.
8. Breakpoints
Back to the disassembly:
   0xaac401 <+289>:  mov    0x20cb68(%rip),%rax        # 0x7f0a37cb8f70
   0xaac408 <+296>:  mov    (%rax),%rsi
   0xaac40b <+299>:  xor    %eax,%eax
=> 0xaac40d <+301>:  mov    0x10(%rsi),%rdi
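Reading these four instructions: the first loads a value from memory into %rax (the operand is relative to %rip, and gdb annotates the resulting address as 0x7f0a37cb8f70), the second dereferences %rax into %rsi, the third sets %eax to zero (the xor is an optimization that replaces moving in a literal 0), and the last dereferences %rsi with an offset, although we know %rsi is zero. This sequence is walking data structures. %rax might have been interesting, but it was zeroed by the preceding instruction, so we can't see its value in the core dump's register state.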
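The four instructions can be modelled with a toy memory to show where the fault comes from. This is a simplified sketch, not an emulator: memory is a dictionary of 64-bit slots, VAR and the healthy values are hypothetical, and the full instruction address 0x7f0a37aac408 is inferred from the %rip value in the register dump.

```python
class SegmentationFault(Exception):
    pass

def load(memory: dict, addr: int) -> int:
    """Read a 64-bit value from the toy memory; unmapped addresses fault."""
    if addr not in memory:
        raise SegmentationFault(hex(addr))
    return memory[addr]

def run(memory: dict, slot: int) -> int:
    rax = load(memory, slot)         # mov 0x20cb68(%rip),%rax
    rsi = load(memory, rax)          # mov (%rax),%rsi
    rax = 0                          # xor %eax,%eax
    rdi = load(memory, rsi + 0x10)   # mov 0x10(%rsi),%rdi
    return rdi

# RIP-relative operand: address of the next instruction plus the displacement.
slot = 0x7f0a37aac408 + 0x20cb68
print(hex(slot))  # 0x7f0a37cb8f70

VAR = 0x1000  # hypothetical address of the pointer variable
healthy = {slot: VAR, VAR: 0x2000, 0x2010: 0x1234}
broken = {slot: VAR, VAR: 0}   # the pointer variable holds NULL

print(hex(run(healthy, slot)))  # 0x1234
try:
    run(broken, slot)
except SegmentationFault as fault:
    print("fault at", fault)    # fault at 0x10
```

The computed slot address agrees with the comment gdb prints next to the first instruction, and the NULL case faults on address 0x10 exactly as the core dump suggests.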
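I can set a breakpoint on doupdate+289, then single-step through each instruction to see how the registers are set and change. First, I need to launch gdb so that the program runs live: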
# gdb `which python`
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
Find the GDB manual and other documentation resources online at:
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
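Now to set a breakpoint using b (short for break):
(gdb) b *doupdate + 289
No symbol table is loaded.  Use the "file" command.
Oops. I wanted to show this error to explain why we often start with a breakpoint on main: by that point the symbols are likely loaded, and the real breakpoint of interest can then be set. I'll go straight to the doupdate function entry, run the program, then set the offset breakpoint once that breakpoint hits: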
(gdb) b doupdate
Function "doupdate" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (doupdate) pending.
(gdb) r cachetop.py
Starting program: /usr/bin/python cachetop.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.
Breakpoint 1, 0x00007ffff34ad2e0 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
(gdb) b *doupdate + 289
Breakpoint 2 at 0x7ffff34ad401
(gdb) c
Continuing.
Breakpoint 2, 0x00007ffff34ad401 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
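We've arrived at our breakpoint.
If you haven't done this before: the r (run) command takes arguments that are passed to the gdb target specified earlier on the command line (python). So this ends up running "python cachetop.py".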
9. Stepping
I'll step one instruction (si, short for stepi), then inspect the registers:
(gdb) si
0x00007ffff34ad408 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
(gdb) i r
rax            0x7ffff3e8f948
rbx            0xaea060
rcx            0xae72a0
rdx            0xa403d0
rsi            0x7ffff7ea8e10
rdi            0xa403d0
rbp            0x7ffff3e8fb10   0x7ffff3e8fb10 <SP>
rsp            0x7fffffffd390   0x7fffffffd390
r8             0x7ffff36ba3e0
r9             0x0      0
r10            0x8      8
r11            0x202    514
r12            0x0      0
r13            0x0      0
r14            0x7ffff3e8fb10
r15            0xa403d0
rip            0x7ffff34ad408   0x7ffff34ad408 <doupdate+296>
eflags         0x202    [ IF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
(gdb) p/a 0x7ffff3e8f948
$1 = 0x7ffff3e8f948 <cur_term>
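Another clue. So the NULL pointer we're dereferencing looks like it comes from a symbol called "cur_term" (p/a is short for print/a, where "/a" means format as an address). Given that this is ncurses, is our TERM environment variable set to something odd?
# echo $TERM
xterm-256color
I tried setting it to vt100 and running the program, but it hit the same segfault.
Note that I've inspected only the first invocation of doupdate(), but it could be called multiple times, and the issue may be in a later invocation. I can step through each one by running c (short for continue). That is fine if it's only called a few times, but if it's called a few thousand times I'll want a different approach. (I'll get back to this in section 15.)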
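10. Reverse Stepping
gdb has a great feature called reverse stepping, which Greg Law included in his talk. Here's an example.
I'll launch a python session again, to show this from the beginning: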
# gdb `which python`
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
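As before, I'll set a breakpoint on doupdate. Once it hits, I'll enable recording and then continue the program until it crashes. Recording adds considerable overhead, so I don't want to turn it on as early as main.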
(gdb) b doupdate
Function "doupdate" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (doupdate) pending.
(gdb) r cachetop.py
Starting program: /usr/bin/python cachetop.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.
Breakpoint 1, 0x00007ffff34ad2e0 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
(gdb) record
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff34ad40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
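At this point I can step backwards by line or by instruction. It works by replaying the register state from our recording. I'll step back two instructions, then print the registers: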
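The idea of replaying recorded state can be sketched with a toy model that snapshots registers before each step. This is only an illustration of the concept, not how gdb implements recording (a real recorder must also track memory changes); the Recorder class is hypothetical and the register values are taken from the sessions in this article.

```python
class Recorder:
    """Toy model of record + reverse-stepi: snapshot registers before each step."""

    def __init__(self, registers: dict):
        self.registers = dict(registers)
        self.history = []

    def stepi(self, updates: dict):
        self.history.append(dict(self.registers))  # save state before executing
        self.registers.update(updates)

    def reverse_stepi(self):
        if not self.history:
            raise RuntimeError("no recorded history to step back through")
        self.registers = self.history.pop()

cpu = Recorder({"rip": 0x7ffff34ad408, "rax": 0x7ffff3e8f948, "rsi": 0x7ffff7ea8e10})
cpu.stepi({"rip": 0x7ffff34ad40b, "rsi": 0x0})   # mov (%rax),%rsi
cpu.stepi({"rip": 0x7ffff34ad40d, "rax": 0x0})   # xor %eax,%eax
cpu.reverse_stepi()                               # undo the xor

print(hex(cpu.registers["rax"]))  # 0x7ffff3e8f948
print(hex(cpu.registers["rip"]))  # 0x7ffff34ad40b
```

Stepping back over the xor restores the %rax value that the core dump could no longer show, which is exactly what makes reverse stepping useful here.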
(gdb) reverse-stepi
0x00007ffff34ad40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
(gdb) reverse-stepi
0x00007ffff34ad40b in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
(gdb) i r
rax            0x7ffff3e8f948
rbx            0xaea060
rcx            0xae72a0
rdx            0xa403d0
rsi            0x0      0
rdi            0xa403d0
rbp            0x7ffff3e8fb10   0x7ffff3e8fb10 <SP>
rsp            0x7fffffffd390   0x7fffffffd390
r8             0x7ffff36ba3e0
r9             0x0      0
r10            0x8      8
r11            0x302    770
r12            0x0      0
r13            0x0      0
r14            0x7ffff3e8fb10
r15            0xa403d0
rip            0x7ffff34ad40b   0x7ffff34ad40b <doupdate+299>
eflags         0x202    [ IF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
(gdb) p/a 0x7ffff3e8f948
$1 = 0x7ffff3e8f948 <cur_term>
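So, another clue pointing at "cur_term". I'd really like to look at the source code at this point, so I'll start with the debug info.
11. Debug Info
This is libncursesw, and I don't have debug info installed (Ubuntu):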
# apt-cache search libncursesw
libncursesw5 - shared libraries for terminal handling (wide character support)
libncursesw5-dbg - debugging/profiling libraries for ncursesw
libncursesw5-dev - developer's libraries for ncursesw
# dpkg -l | grep libncursesw
ii  libncursesw5:amd64  6.0+-1ubuntu1  amd64  shared libraries for terminal handling (wide character support)
I'll add that:
# apt-get install -y libncursesw5-dbg
Reading package lists... Done
Building dependency tree
Reading state information... Done
After this operation, 2,488 kB of additional disk space will be used.
Get:1 http://us-west-1.ec2.archive.ubuntu.com/ubuntu xenial/main amd64 libncursesw5-dbg amd64 6.0+ubuntu1 [729 kB]
Fetched 729 kB in 0s (865 kB/s)
Selecting previously unselected package libncursesw5-dbg.
(Reading database ... 200094 files and directories currently installed.)
Preparing to unpack .../libncursesw5-dbg_6.0+ubuntu1_amd64.deb ...
Unpacking libncursesw5-dbg (6.0+ubuntu1) ...
Setting up libncursesw5-dbg (6.0+ubuntu1) ...
# dpkg -l | grep libncursesw
ii  libncursesw5:amd64  6.0+-1ubuntu1  amd64  shared libraries for terminal handling (wide character support)
ii  libncursesw5-dbg    6.0+-1ubuntu1  amd64  debugging/profiling libraries for ncursesw
Great, the versions match. So what does our segfault look like now?
# gdb `which python` /var/cores/core.python.30520
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
[...]
warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.
Core was generated by `python ./cachetop.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ClrBlank (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:1129
1129        if (back_color_erase)
(gdb) bt
#0  ClrBlank (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:1129
#1  ClrUpdate () at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:1147
#2  doupdate () at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:1010
#3  0xaa07e6 in wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/base/lib_refresh.c:65
#4  0xa99499 in recur_wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/base/lib_getch.c:384
#5  0xa99616 in _nc_wgetch (win=win@entry=0x1993060, result=result@entry=0x7ffd33d93e24, use_meta=1)
    at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/base/lib_getch.c:491
#6  0xa9a325 in wgetch (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/base/lib_getch.c:672
#7  0xcc6ec3 in ?? () from /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
#8  0xc4d5a in PyEval_EvalFrameEx ()
#9  0xc2e05 in PyEval_EvalCodeEx ()
#10 0xdef08 in ?? ()
#11 0xb1153 in PyObject_Call ()
#12 0xc73ec in PyEval_EvalFrameEx ()
#13 0xc2e05 in PyEval_EvalCodeEx ()
#14 0xcaf42 in PyEval_EvalFrameEx ()
#15 0xc2e05 in PyEval_EvalCodeEx ()
#16 0xc2ba9 in PyEval_EvalCode ()
#17 0xf20ef in ?? ()
#18 0xeca72 in PyRun_FileExFlags ()
#19 0xeb1f1 in PyRun_SimpleFileExFlags ()
#20 0xe18a in Py_Main ()
#21 0xbe10830 in __libc_start_main (main=0x49daf0 <main>, argc=2, argv=0x7ffd33d94838, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7ffd33d94828) at ../csu/libc-start.c:291
#22 0xda19 in _start ()
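The stack trace looks a bit different: we are not really in doupdate() but in ClrBlank(), which has been inlined in ClrUpdate(), which in turn has been inlined in doupdate().
Now I really want to see the source.
12. Source Code
With the debug info package installed, gdb can list the source alongside the assembly: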
(gdb) disas/s
Dump of assembler code for function doupdate:
/build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:
759     {
   0xaac2e0 <+0>:   push   %r15
   0xaac2e2 <+2>:   push   %r14
   0xaac2e4 <+4>:   push   %r13
   0xaac2e6 <+6>:   push   %r12
[...]
   0xaac3dd <+253>: jne    0x7f0a37aac6ca <doupdate+1002>
1009        if (CurScreen(SP_PARM)->_clear || NewScreen(SP_PARM)->_clear) {   /* force refresh ? */
   0xaac3e3 <+259>: mov    0x80(%rdx),%rax
   0xaac3ea <+266>: mov    0x88(%rdx),%rcx
   0xaac3f1 <+273>: cmpb   $0x0,0x21(%rax)
   0xaac3f5 <+277>: jne    0x7f0a37aac401 <doupdate+289>
   0xaac3f7 <+279>: cmpb   $0x0,0x21(%rcx)
   0xaac3fb <+283>: je     0x7f0a37aacc3b <doupdate+2395>
1129        if (back_color_erase)
   0xaac401 <+289>: mov    0x20cb68(%rip),%rax        # 0x7f0a37cb8f70
   0xaac408 <+296>: mov    (%rax),%rsi
1128        NCURSES_CH_T blank = blankchar;
   0xaac40b <+299>: xor    %eax,%eax
1129        if (back_color_erase)
=> 0xaac40d <+301>: mov    0x10(%rsi),%rdi
   0xaac411 <+305>: cmpb   $0x0,0x1c(%rdi)
   0xaac415 <+309>: jne    0x7f0a37aac6f7 <doupdate+1047>
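Great! Look at the "=>" arrow and the line of source above it. So we are segfaulting on "if (back_color_erase)"? That doesn't seem possible.
At this point I double-checked that my debug info version was correct, then re-ran the program inside gdb until it segfaulted. Same error.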
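As a sanity check on how to read the disassembly above, the arithmetic behind two of its annotations can be reproduced by hand. The sketch below is only an illustration: the constants are copied from the listing as printed (including its shortened low addresses), and the helper names are made up for this example.

```python
# Values copied from the disassembly listing above.
FUNC_START = 0xaac2e0        # doupdate <+0>, as printed in the listing
FAULT_ADDR = 0xaac40d        # the instruction marked with "=>"


def offset_in_function(addr, start):
    """Byte offset of an instruction from the start of its function."""
    return addr - start


def rip_relative_target(next_insn_addr, displacement):
    """On x86-64, disp(%rip) is resolved against the address of the
    instruction that follows the one being executed."""
    return next_insn_addr + displacement


# The faulting instruction is shown as <+301>.
print(offset_in_function(FAULT_ADDR, FUNC_START))

# "mov 0x20cb68(%rip),%rax" at <+289> is followed by the instruction at
# <+296>, whose full address is 0x7f0a37aac408 (PC 0x7f0a37aac40d minus 5).
print(hex(rip_relative_target(0x7f0a37aac408, 0x20cb68)))
```

The second value matches the "# 0x7f0a37cb8f70" comment gdb prints next to that mov, which is the address of the global pointer being loaded.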
Is there something special about back_color_erase? We are in ClrBlank(), so I'll list its source:
(gdb) list ClrBlank
1124
1125    static NCURSES_INLINE NCURSES_CH_T
1126    ClrBlank(NCURSES_SP_DCLx WINDOW *win)
1127    {
1128        NCURSES_CH_T blank = blankchar;
1129        if (back_color_erase)
1130            AddAttr(blank, (AttrOf(BCE_BKGD(SP_PARM, win)) & BCE_ATTRS));
1131        return blank;
1132    }
1133
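Ah, it isn't defined in this function. Is it a global?
It's worth seeing what this code looks like in gdb's text user interface (TUI). I haven't used it much, but I was inspired after watching Greg's talk.
You can launch it with --tui: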
# gdb --tui `which python` /var/cores/core.python.30520
   ┌───────────────────────────────────────────────────────────────────────────┐
   │                                                                           │
   │                                                                           │
   │                          [ No Source Available ]                          │
   │                                                                           │
   │                                                                           │
   └───────────────────────────────────────────────────────────────────────────┘
None No process In:                                                L??   PC: ??
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
---Type <return> to continue, or q <return> to quit---
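It's complaining that there is no Python source. I could fix that, but we are crashing in libncursesw. So I ignore it and press enter to let it finish loading, and it loads the source from the libncursesw debug info at the point of the crash: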
   ┌──/build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c──────────────┐
   │1124                                                                       │
   │1125    static NCURSES_INLINE NCURSES_CH_T                                 │
   │1126    ClrBlank(NCURSES_SP_DCLx WINDOW *win)                              │
   │1127    {                                                                  │
   │1128        NCURSES_CH_T blank = blankchar;                                │
  >│1129        if (back_color_erase)                                          │
   │1130            AddAttr(blank, (AttrOf(BCE_BKGD(SP_PARM, win)) & BCE_ATTRS)│
   │1131        return blank;                                                  │
   │1132    }                                                                  │
   │1133                                                                       │
   │1134    /*                                                                 │
   │1135    **      ClrUpdate()                                                │
   │1136    **                                                                 │
   └───────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7f0a3c5e87 In: doupdate            L1129 PC: 0x7f0a37aac40d
warning: JITed object file architecture unknown is not compatible with target ar
chitecture i386:x86-64.
---Type <return> to continue, or q <return> to quit---
Core was generated by `python ./cachetop.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ClrBlank (win=0x1993060)
    at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:1129
(gdb)
The ">" marker points at the line of code where we crashed. It gets better: with the layout split command we can view the source and the assembly in separate windows:
   ┌──/build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c──────────────┐
  >│1129        if (back_color_erase)                                          │
   │1130            AddAttr(blank, (AttrOf(BCE_BKGD(SP_PARM, win)) & BCE_ATTRS)│
   │1131        return blank;                                                  │
   │1132    }                                                                  │
   │1133                                                                       │
   │1134    /*                                                                 │
   │1135    **      ClrUpdate()                                                │
   └───────────────────────────────────────────────────────────────────────────┘
  >│0x7f0a37aac40d <doupdate+301>   mov    0x10(%rsi),%rdi                     │
   │0x7f0a37aac411 <doupdate+305>   cmpb   $0x0,0x1c(%rdi)                     │
   │0x7f0a37aac415 <doupdate+309>   jne    0x7f0a37aac6f7 <doupdate+1047>      │
   │0x7f0a37aac41b <doupdate+315>   movswl 0x4(%rcx),%ecx                      │
   │0x7f0a37aac41f <doupdate+319>   movswl 0x74(%rdx),%edi                     │
   │0x7f0a37aac423 <doupdate+323>   mov    %rax,0x40(%rsp)                     │
   │0x7f0a37aac428 <doupdate+328>   movl   $0x20,0x48(%rsp)                    │
   │0x7f0a37aac430 <doupdate+336>   movl   $0x0,0x4c(%rsp)                     │
   └───────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7f0a3c5e87 In: doupdate            L1129 PC: 0x7f0a37aac40d
chitecture i386:x86-64.
Core was generated by `python ./cachetop.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
---Type <return> to continue, or q <return> to quit---
#0  ClrBlank (win=0x1993060)
    at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tty/tty_update.c:1129
(gdb) layout split
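Greg demonstrated this with reverse stepping, so you can imagine following the source and the assembly at the same time (I would need a video to show that here).
14. External Tools: cscope
I need to learn more about back_color_erase. I could try gdb's search command, but I have found it quicker to use an external tool: cscope. cscope is a text-based source code browser that originated at Bell Labs in the 1980s. If you have a modern IDE you prefer, use that instead.
Installing cscope and building its index for the ncurses source: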
# apt-get install -y cscope
# wget http://archive.ubuntu.com/ubuntu/pool/main/n/ncurses/ncurses_6.0+.orig.tar.gz
# tar xvf ncurses_6.0+.orig.tar.gz
# cd ncurses-6.0-
# cscope -bqR
# cscope -dq
cscope -bqR builds the lookup database. cscope -dq then launches cscope.
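If cscope is not at hand, a rough stand-in for its "Find this global definition" query, at least for macros, is a plain scan for #define lines. The sketch below is a different and much cruder technique than cscope's index: `find_defines` is a name invented for this example, and it only recognises object-like or function-like `#define NAME` lines in .c and .h files.

```python
import os
import re


def find_defines(root, name):
    """Return (path, line_number, text) for every '#define name' under root."""
    # Allow leading whitespace and the "#  define" spelling seen in ncurses headers.
    pattern = re.compile(r"^\s*#\s*define\s+" + re.escape(name) + r"\b")
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in sorted(files):
            if not fname.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, fname)
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    if pattern.match(line):
                        hits.append((path, lineno, line.rstrip()))
    return hits
```

Calling `find_defines(".", "back_color_erase")` from the top of an unpacked source tree would list candidate definitions, though without the cross-referencing that makes cscope convenient.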
Looking up the definition of back_color_erase:
Cscope version 15.8b                                   Press the ? key for help

Find this C symbol:
Find this global definition: back_color_erase
Find functions called by this function:
Find functions calling this function:
Find this text string:
Change this text string:
Find this egrep pattern:
Find this file:
Find files #including this file:
Find assignments to this symbol:
[...]
#define non_dest_scroll_region         CUR Booleans[26]
#define can_change                     CUR Booleans[27]
#define back_color_erase               CUR Booleans[28]
#define hue_lightness_saturation       CUR Booleans[29]
#define col_addr_glitch                CUR Booleans[30]
#define cr_cancels_micro_mode          CUR Booleans[31]
[...]
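Oh, a #define. (Macros are conventionally written in capitals; they could at least have capitalized this one.)
OK, so what is CUR? Looking up definitions in cscope is a breeze.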
#define CUR cur_term->type.
At least that #define is capitalized!
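Putting the two macro definitions above together shows what "if (back_color_erase)" really dereferences. The following is a toy textual expander written for this example only: it handles just object-like macros by repeated identifier substitution and is not how a real C preprocessor tokenises input.

```python
import re

# Macro bodies copied from the cscope results above.
MACROS = {
    "back_color_erase": "CUR Booleans[28]",
    "CUR": "cur_term->type.",
}

IDENT = re.compile(r"[A-Za-z_]\w*")


def expand(text, macros, max_passes=10):
    """Substitute macro names until nothing changes (toy model)."""
    for _ in range(max_passes):
        new = IDENT.sub(lambda m: macros.get(m.group(0), m.group(0)), text)
        if new == text:
            break
        text = new
    return text


# Prints the condition with both macros expanded.
print(expand("if (back_color_erase)", MACROS))
```

The expanded condition reads through the cur_term pointer, so a harmless-looking test of a "variable" can fault if cur_term is null.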
So, cur_term, which we ran into earlier while stepping through instructions and examining registers. What is it?
#if 0 && !0
extern NCURSES_EXPORT_VAR(TERMINAL *) cur_term;
#elif 0
NCURSES_WRAPPED_VAR(TERMINAL *, cur_term);
#define cur_term   NCURSES_PUBLIC_VAR(cur_term())
#else
extern NCURSES_EXPORT_VAR(TERMINAL *) cur_term;
#endif
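cscope read /usr/include/term.h for this. So, more macros. The line I think takes effect is the final extern declaration, under #else. Why is there an "#if 0 && !0 ... #elif 0" here? I don't know (I would need to read more source). Programmers sometimes wrap debug code in "#if 0" to disable it in production, but this looks auto-generated.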
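To see why only the last declaration survives preprocessing, the conditions can be evaluated by hand. This is a minimal sketch that hard-codes the three branches from the excerpt above and rewrites the C constant expressions as Python booleans; `active_branch` is an illustrative helper, not part of any tool.

```python
# Each entry: (directive as written, its truth value, the code it guards).
# "0 && !0" in C is false because the left operand is 0.
BRANCHES = [
    ("#if 0 && !0", bool(0 and not 0), "extern NCURSES_EXPORT_VAR(TERMINAL *) cur_term;"),
    ("#elif 0", bool(0), "NCURSES_WRAPPED_VAR(TERMINAL *, cur_term);"),
    ("#else", True, "extern NCURSES_EXPORT_VAR(TERMINAL *) cur_term;"),
]


def active_branch(branches):
    """Return the first (directive, body) whose condition holds,
    mirroring how #if / #elif / #else picks exactly one branch."""
    for directive, condition, body in branches:
        if condition:
            return directive, body
    return None


print(active_branch(BRANCHES))
```

With these values the #else branch is selected, so cur_term is declared as a plain exported TERMINAL pointer rather than as a wrapper macro.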
Looking up NCURSES_EXPORT_VAR finds:
#  define NCURSES_EXPORT_VAR(type) NCURSES_IMPEXP type
... and NCURSES_IMPEXP:
/* Take care of non-cygwin platforms */
#if !defined(NCURSES_IMPEXP)
#  define NCURSES_IMPEXP /* nothing */
#endif
#if !defined(NCURSES_API)
#  define NCURSES_API /* nothing */
#endif
#if !defined(NCURSES_EXPORT)
#  define NCURSES_EXPORT(type) NCURSES_IMPEXP type NCURSES_API
#endif
#if !defined(NCURSES_EXPORT_VAR)
#  define NCURSES_EXPORT_VAR(type) NCURSES_IMPEXP type
#endif
... and TERMINAL:
typedef struct term {           /* describe an actual terminal */
    TERMTYPE    type;           /* terminal type description */
    short       Filedes;        /* file description being written to */
    TTY         Ottyb,          /* original state of the terminal */
                Nttyb;          /* current state of the terminal */
    int         _baudrate;      /* used to compute padding */
    char *      _termname;      /* used for termname() */
} TERMINAL;
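Hey! TERMINAL is capitalized. Mixed in with the macros, this code is not easy to follow...
OK, so who actually sets cur_term? Remember that our problem is that it is zero, perhaps because it was never initialized or because it was explicitly set to zero. Browsing the code paths that assign it may give more clues about why it is uninitialized or why it gets set to zero. Using the first option in cscope: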
Find this C symbol: cur_term
Find this global definition:
Find functions called by this function:
Find functions calling this function:
[...]
Browsing the entries quickly turns up:
NCURSES_EXPORT(TERMINAL *)
NCURSES_SP_NAME(set_curterm) (NCURSES_SP_DCLx TERMINAL * termp)
{
    TERMINAL *oldterm;

    T((T_CALLED("set_curterm(%p)"), (void *) termp));

    _nc_lock_global(curses);
    oldterm = cur_term;
    if (SP_PARM)
        SP_PARM->_term = termp;
#if USE_REENTRANT
    CurTerm = termp;
#else
    cur_term = termp;
#endif
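The lines to note are the ones that read and assign cur_term. Even the function name is wrapped in a macro. But at least we have found how cur_term gets set: via set_curterm(). Maybe it is never called?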
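The behaviour of the excerpt can be modelled in a few lines to make the failure mode concrete. This is a sketch under stated assumptions: `TerminfoState` and `back_color_erase` are hypothetical names for this illustration, the integer 0 stands in for a NULL TERMINAL pointer, the pointer value is made up, and returning the previous value follows the documented set_curterm() behaviour since the excerpt stops before its return statement.

```python
class TerminfoState:
    """Toy stand-in for the library's global cur_term (hypothetical)."""

    def __init__(self):
        self.cur_term = 0          # 0 plays the role of NULL

    def set_curterm(self, termp):
        # Mirrors "oldterm = cur_term; ... cur_term = termp;" from the excerpt.
        oldterm = self.cur_term
        self.cur_term = termp
        return oldterm

    def back_color_erase(self):
        # Stands in for cur_term->type.Booleans[28]; a NULL cur_term
        # would make the real C code fault on the dereference.
        if self.cur_term == 0:
            raise RuntimeError("cur_term is NULL")
        return False


state = TerminfoState()
state.set_curterm(0x1000)          # made-up pointer value
previous = state.set_curterm(0)    # a later call that passes NULL
```

After the second call, any code that tests back_color_erase would be reading through a null pointer, so a call that passes zero is just as suspicious as no call at all.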
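15. External Tools: perf-tools/ftrace/uprobes
I'll cover how to approach this with gdb in a moment, but I couldn't resist trying the uprobe tool from my perf-tools collection, which uses Linux ftrace and uprobes. One advantage of tracers is that they don't pause the target process the way gdb does (although that makes little difference for cachetop.py here). Another is that tracing thousands of processes is as easy as tracing a few.
I should be able to trace calls to set_curterm() in libncursesw, and even print its first argument: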
# /apps/perf-tools/bin/uprobe 'p:/lib/x86_64-linux-gnu/libncursesw.so.5:set_curterm %di'
ERROR: missing symbol "set_curterm" in /lib/x86_64-linux-gnu/libncursesw.so.5
Hm, that didn't work. Where is set_curterm()? There are many ways to find it, such as gdb or objdump:
(gdb) info symbol set_curterm
set_curterm in section .text of /lib/x86_64-linux-gnu/libtinfo.so.5

# objdump -tT /lib/x86_64-linux-gnu/libncursesw.so.5 | grep cur_term
0000      DO *UND*  0000  NCURSES_TINFO_5.0. cur_term
# objdump -tT /lib/x86_64-linux-gnu/libtinfo.so.5 | grep cur_term
8948 g    DO .bss   0008  NCURSES_TINFO_5.0. cur_term
gdb works better here. Also, looking closely at the source, I noticed that this file is built for libtinfo.
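The two objdump lines differ in their section column: *UND* means the library only references the symbol, while a real section such as .bss means the library defines it. A small sketch of that reading, using the two lines exactly as shown above (`defines_symbol` is a helper invented for this example, and it relies on simple whitespace splitting rather than objdump's real column layout):

```python
# objdump -tT lines copied from the output above.
NCURSESW_LINE = "0000      DO *UND*  0000  NCURSES_TINFO_5.0. cur_term"
TINFO_LINE = "8948 g    DO .bss   0008  NCURSES_TINFO_5.0. cur_term"


def defines_symbol(objdump_line, symbol):
    """True if the line describes `symbol` and places it in a real section."""
    fields = objdump_line.split()
    if not fields or fields[-1] != symbol:
        return False
    # "*UND*" marks an undefined (imported) symbol.
    return "*UND*" not in fields


for lib, line in (("libncursesw", NCURSESW_LINE), ("libtinfo", TINFO_LINE)):
    print(lib, defines_symbol(line, "cur_term"))
```

This agrees with what gdb reported: the terminfo symbols live in libtinfo, and libncursesw imports them.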
Trying to trace set_curterm() in libtinfo instead:
# /apps/perf-tools/bin/uprobe 'p:/lib/x86_64-linux-gnu/libtinfo.so.5:set_curterm %di'
Tracing uprobe set_curterm (p:set_curterm /lib/x86_64-linux-gnu/libtinfo.so.5:0xfa80 %di). Ctrl-C to end.
          python-31617 [007] d... 959: set_curterm: (0x7f116fcc2a80) arg1=0x1345d70
          python-31617 [007] d... 033: set_curterm: (0x7f116fcc2a80) arg1=0x13a22e0
          python-31617 [007] d... 804: set_curterm: (0x7f116fcc2a80) arg1=0x14cdfa0
          python-31617 [007] d... 838: set_curterm: (0x7f116fcc2a80) arg1=0x0
^C
That works. So set_curterm() is called, and it is called four times. The last call is passed zero, which looks like the problem.
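With only four lines the null argument is easy to spot by eye, but on a longer trace it helps to filter mechanically. The sketch below assumes the line format shown in the output above; `null_calls` and the regular expression are written for this example and are not part of perf-tools.

```python
import re

# Trace lines copied from the uprobe output above.
TRACE = """\
python-31617 [007] d... 959: set_curterm: (0x7f116fcc2a80) arg1=0x1345d70
python-31617 [007] d... 033: set_curterm: (0x7f116fcc2a80) arg1=0x13a22e0
python-31617 [007] d... 804: set_curterm: (0x7f116fcc2a80) arg1=0x14cdfa0
python-31617 [007] d... 838: set_curterm: (0x7f116fcc2a80) arg1=0x0
""".splitlines()

ARG1 = re.compile(r"set_curterm: \(0x[0-9a-f]+\) arg1=(0x[0-9a-f]+)")


def null_calls(lines):
    """Return 1-based indexes of trace lines where arg1 is zero."""
    hits = []
    for n, line in enumerate(lines, 1):
        m = ARG1.search(line)
        if m and int(m.group(1), 16) == 0:
            hits.append(n)
    return hits


print(null_calls(TRACE))
```

Here it reports the fourth call, the one made with a null pointer.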
If you are wondering how I knew that the %di register holds the first argument, it comes from the AMD64/x86_64 ABI (assuming this library follows the ABI). Here is a reminder:
# man syscall
[...]
       arch/ABI      arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes
       ──────────────────────────────────────────────────────────────────
       arm/OABI      a1    a2    a3    a4    v1    v2    v3
       arm/EABI      r0    r1    r2    r3    r4    r5    r6
       arm64         x0    x1    x2    x3    x4    x5    -
       blackfin      R0    R1    R2    R3    R4    R5    -
       i386          ebx   ecx   edx   esi   edi   ebp   -
       ia64          out0  out1  out2  out3  out4  out5  -
       mips/o32      a0    a1    a2    a3    -     -     -     See below
       mips/n32,64   a0    a1    a2    a3    a4    a5    -
       parisc        r26   r25   r24   r23   r22   r21   -
       s390          r2    r3    r4    r5    r6    r7    -
       s390x         r2    r3    r4    r5    r6    r7    -
       sparc/32      o0    o1    o2    o3    o4    o5    -
       sparc/64      o0    o1    o2    o3    o4    o5    -
       x86_64        rdi   rsi   rdx   r10   r8    r9    -
[...]
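One caveat when using that table for library functions: it lists the system call convention, which on x86_64 differs from the ordinary function call convention in the fourth argument (r10 instead of rcx). The first argument is rdi in both. The lookup below is a sketch for integer and pointer arguments only; `arg_register` and `probe_name` are names made up for this example, and the "%di" spelling assumes the probe syntax names registers without the r prefix, as in the uprobe command above.

```python
# x86_64 System V integer/pointer argument registers.
FUNCTION_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")
# Linux x86_64 system calls use r10 in place of rcx (see the table above).
SYSCALL_ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def arg_register(n, syscall=False):
    """Register holding the n-th (1-based) integer or pointer argument."""
    regs = SYSCALL_ARG_REGS if syscall else FUNCTION_ARG_REGS
    if not 1 <= n <= len(regs):
        raise ValueError("only the first six register arguments are modelled")
    return regs[n - 1]


def probe_name(reg):
    """Spell a register the way the uprobe command above does, e.g. %di."""
    legacy = ("rdi", "rsi", "rdx", "rcx")
    return "%" + (reg[1:] if reg in legacy else reg)


print(probe_name(arg_register(1)))
```

For set_curterm(termp), a user-space function with a single pointer argument, this gives the %di used in the trace.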
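I would also like to see the stack trace for the call with arg1=0x0, but this ftrace tool doesn't support stack traces yet.
16. External Tools: bcc/BPF
Since we are debugging the bcc tool cachetop.py, it is worth noting that bcc's trace.py has capabilities similar to my older uprobe tool: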
# ./trace.py 'p:tinfo:set_curterm "%d", arg1'
TIME     PID    COMM         FUNC             -
01:00:20 31698  python       set_curterm
01:00:20 31698  python       set_curterm
01:00:20 31698  python       set_curterm
01:00:20 31698  python       set_curterm      0
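Yes, we are debugging bcc with bcc!
If you are not familiar with bcc, it is worth a look. It provides Python and Lua interfaces to the new BPF tracing features in the Linux 4.x series. In short, it makes possible many performance tools that were previously impossible or prohibitively expensive to run. I have previously posted instructions for getting it running.
bcc's trace.py tool should have a switch for printing user stack traces, since the kernel has had BPF stack capabilities since Linux 4.6, but at the time of writing we have not added that switch yet.
17. More Breakpoints
I really should have started with gdb and a breakpoint on set_curterm(), but I hope the detour through ftrace and BPF was interesting.
Back to live running mode: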
# gdb `which python`
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
[...]
(gdb) b set_curterm
Function "set_curterm" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (set_curterm) pending.
(gdb) r cachetop.py
Starting program: /usr/bin/python cachetop.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, set_curterm (termp=termp@entry=0xa43150) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tinfo/lib_cur_term.c:80
80      {
(gdb) c
Continuing.

Breakpoint 1, set_curterm (termp=termp@entry=0xab5870) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tinfo/lib_cur_term.c:80
80      {
(gdb) c
Continuing.

Breakpoint 1, set_curterm (termp=termp@entry=0xbecb90) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tinfo/lib_cur_term.c:80
80      {
(gdb) c
Continuing.

Breakpoint 1, set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+/ncurses/tinfo/lib_cur_term.c:80
