Addr2line
It’s a way to identify where the bug happens, because I can run into a CallTrace without more informations, look like this:
[ 139.288739] ------------[ cut here ]------------
[ 139.288779] WARNING: CPU: 52 PID: 13556 at fs/iomap.c:66
iomap_seek_data+0xd0/0x240
[ 139.288814] Modules linked in: pptp gre pppoe pppox ppp_generic slhc can_bcm
rfcomm bluetooth rfkill af_key ieee802154_socket ieee802154 atm can_raw can
binfmt_misc ipt_REJECT nf_reject_ipv4 xt_conntrack ip_set nfnetlink ebtable_nat
ebtable_broute bridge stp llc iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
ebtable_filter ebtables ip6_tables iptable_filter i2c_dev nvidiafb sg at24
ofpart nvmem_core shpchp powernv_flash ipmi_powernv ipmi_devintf ipmi_msghandler
i2c_opal mtd uio_pdrv_genirq uio ibmpowernv opal_prd ip_tables xfs libcrc32c
sd_mod nouveau ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm ahci drm libahci libata tg3 ptp pps_core i2c_core dm_mirror
dm_region_hash dm_log dm_mod
[ 139.289139] CPU: 52 PID: 13556 Comm: trinity-c5 Not tainted
4.11.0-44.el7a.ppc64le #1
[ 139.289174] task: c000003f77426800 task.stack: c000003f76314000
[ 139.289204] NIP: c0000000004d4890 LR: c0000000004d486c CTR: c0000000001b4ca0
[ 139.289238] REGS: c000003f76317a80 TRAP: 0700 Not tainted
(4.11.0-44.el7a.ppc64le)
[ 139.289272] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 139.289277] CR: 24004424 XER: 20040000
[ 139.289317] CFAR: c0080000382ee0dc SOFTE: 1
[ 139.289317] GPR00: c0000000004d486c c000003f76317d00 c000000001403100
0000000000000000
[ 139.289317] GPR04: c008000038379628 c000003f76317c8d c000003f76317c8c
00000000036c8200
[ 139.289317] GPR08: 0006d9042fef1c00 0003f76317cc0000 0000000000000001
c008000038342fe0
[ 139.289317] GPR12: 0000000000004000 c000000007afee00 0000000000000000
00000000100344b0
[ 139.289317] GPR16: 0000000010034678 0000000000000000 0000000000000000
0000000012329ba0
[ 139.289317] GPR20: 0000000010034518 0000000010034100 000000001001c490
c008000038356120
[ 139.289317] GPR24: 0000000000000000 c000003fc5fc2d58 00000000109857e8
0000000000000000
[ 139.289317] GPR28: fffffffffffffff7 0000000000000009 0000000000000000
0000000000000000
[ 139.289664] NIP [c0000000004d4890] iomap_seek_data+0xd0/0x240
[ 139.289694] LR [c0000000004d486c] iomap_seek_data+0xac/0x240
[ 139.289739] Call Trace:
[ 139.289761] [c000003f76317d00] [c0000000004d486c] iomap_seek_data+0xac/0x240
(unreliable)
[ 139.289827] [c000003f76317da0] [c0080000382df2c4] xfs_file_llseek+0x114/0x130
[xfs]
[ 139.289864] [c000003f76317de0] [c00000000041a9a8] SyS_lseek+0xf8/0x130
[ 139.289895] [c000003f76317e30] [c00000000000b004] system_call+0x38/0xe0
[ 139.289923] Instruction dump:
[ 139.289941] f8410018 7d2c4b78 4e800421 e8410018 2fa30000 7c7e1b78 7c7f1b78
409e0144
[ 139.289978] e9210028 7fa9e000 7d501026 554af7fe <0b0a0000> 2faa0000 40de0160
e9010030
[ 139.290015] ---[ end trace 451b8a13e0f6d488 ]---
Hey dude! What’s happened? …Let’s start the game.
Usage
$ addr2line -fip IMAGE_DBS MEMORY_ADDRESS
- MEMORY_ADDRESS, so look carefully that address which was called
NIP: c0000000004d4890 LR:c0000000004d486c CTR: c0000000001b4ca0
- What is the IMAGE_DBS? it’s the kernel image compiled with debug symbols for be read by the addr2line then identifies which line of code is came from those address.
What are they, NIP, LR[2] and CTR[3]?
- NIP(Next insctrution pointer) : Where the next instruction is allocated.
- LR(Link Register): The link register can be thought of as the old instruction pointer for our purposes.
- CTR(Count Register): The count register (called CTR) is a special-purpose register designed to hold loop counters.
$ addr2line -fip -e usr/lib/debug/lib/modules/4.11.0-44.el7a.ppc64le/vmlinux
c0000000004d4890
iomap_apply at /usr/src/debug/kernel-alt-4.11.0-44.el7a/linux-4.11.0-44.el7a.ppc64le/fs/iomap.c:66
(inlined by) iomap_seek_data at /usr/src/debug/kernel-alt-4.11.0-44.el7a/linux-4.11.0-44.el7a.ppc64le/fs/iomap.c:670
After found the line of code, look in it:
$ vim kernel_src/linux-4.11.0-44.el7a/fs/iomap.c +670 # Using +N you go directly to line N.
...
658 loff_t
659 iomap_seek_data(struct inode *inode, loff_t offset, const struct iomap_ops
*ops)
660 {
661 loff_t size = i_size_read(inode);
662 loff_t length = size - offset;
663 loff_t ret;
664
665 /* Nothing to be found beyond the end of the file. */
666 if (offset >= size)
667 return -ENXIO;
668
669 while (length > 0) {
670 ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
671 &offset, iomap_seek_data_actor);
672 if (ret < 0)
673 return ret;
674 if (ret == 0)
675 break;
676
677 offset += ret;
678 length -= ret;
679 }
680
681 if (length <= 0)
682 return -ENXIO;
683 return offset;
684 }
685 EXPORT_SYMBOL_GPL(iomap_seek_data);
686
...
Looking at iomap.c to see iomap_apply
44 loff_t
45 iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
46 const struct iomap_ops *ops, void *data, iomap_actor_t
actor)
47 {
48 struct iomap iomap = { 0 };
49 loff_t written = 0, ret;
50
51 /*
52 * Need to map a range from start position for length bytes. This
can
53 * span multiple pages - it is only guaranteed to return a range of
a
54 * single type of pages (e.g. all into a hole, all mapped or all
55 * unwritten). Failure at this point has nothing to undo.
56 *
57 * If allocation is required for this range, reserve the space now
so
58 * that the allocation is guaranteed to succeed later on. Once we
copy
59 * the data into the page cache pages, then we cannot fail
otherwise we
60 * expose transient stale data. If the reserve fails, we can safely
61 * back out at this point as there is nothing to undo.
62 */
63 ret = ops->iomap_begin(inode, pos, length, flags, &iomap);
64 if (ret)
65 return ret;
66 if (WARN_ON(iomap.offset > pos))
67 return -EIO;
68
So, we discovered that iomap_apply function is called by iomap_seek_data function in iomap.c:670, so we need to understand what is wrong in this point iomap.offset > pos
to claim a WARN_ON.