linux内核调试之异常调用栈

linux内核调试之根据kernel panic找到异常调用栈

Posted by Albert on August 24, 2020

linux内核调试之异常调用栈

之前做项目的时候,在启动过程中出现
Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 90.967706] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P 3.14.77 #1
[ 90.967714] task: cf88c4c0 ti: cf8b2000 task.ti: cf8b2000
[ 90.967757] PC is at hyfi_netfilter_forwarding_hook+0x124/0x1b8 [hyfi_bridging]
[ 90.967778] LR is at hyfi_netfilter_forwarding_hook+0x11c/0x1b8 [hyfi_bridging]
[ 90.967788] pc : [<bf5cc250>] lr : [<bf5cc248>] psr: 20000113
[ 90.967788] sp : cf8b3d00 ip : 00003a89 fp : c94b8e40
[ 90.967793] r10: bf5df6f0 r9 : cd5aa000 r8 : bf5de610
[ 90.967800] r7 : c94dc000 r6 : c697e000 r5 : 00000000 r4 : cdcf4500
[ 90.967806] r3 : 00000003 r2 : cdcf4500 r1 : 00000000 r0 : 00000001
[ 90.967812] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 90.967819] Control: 10c5387d Table: 8ca4406a DAC: 00000015
[ 90.967824] Process swapper/1 (pid: 0, stack limit = 0xcf8b2238)

首先找到hyfi_netfilter_forwarding_hook的源文件对应的目标文件,使用toolchain的gdb打开,然后执行list即可:

$ /opt/toolchain-arm_cortex-a7_gcc-4.8-linaro_uClibc-1.0.14_eabi/bin/arm-openwrt-linux-gdb hyfi_netfilter.o
GNU gdb (Linaro GDB) 7.6-2013.05
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" 
and "show warranty" for details.
This GDB was configured as "--host=x86_64-linux-gnu --target=arm-openwrt-linux-uclibcgnueabi".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/&gt;...
Reading symbols from /work/router/qca/ipq4018/build_dir/target-arm-openwrt-linux-uclibcgnueabi/linux-ipq806x/qca-hyfi-bridge-g/hyfi-netfilter/hyfi_netfilter.o...done.
(gdb) list *(hyfi_netfilter_forwarding_hook+0x124)
0x124 is in hyfi_netfilter_forwarding_hook (/work/router/qca/ipq4018/build_dir/target-arm-openwrt-linux-uclibcgnueabi/linux-ipq806x/qca-hyfi-bridge-g/./hyfi-netfilter/hyfi_netfilter.c:177).

出问题的代码是177行。

​ 这里是通过gdb来定位异常调用,其中值得注意的是内核kernel panic的时候提供的cpu这一行也十分有用,可以用于排查是否是多核操作临界资源导致异常。这一般和内核使用锁不正确有关系,后面的文章我会列举:诸如踩内存、内存double free的例子

 CPU: 1 PID: 0 Comm: swapper/1 Tainted: P 3.14.77 #1