Table of contents
- What is DWARF?
- Inline functions in DWARF
- Calculating call boundaries
- Finding the caller function
- inlinecall(1)
What is DWARF?
From the DWARF Debugging Standard’s documentation:
This document defines a format for describing programs to facilitate user source level debugging. This description can be generated by compilers, assemblers and linkage editors. It can be used by debuggers and other tools.
Debugging information entries (DIEs) are represented as a tree, one per compilation unit (CU). Each DIE has a tag (DW_TAG_*, see DWARF PDF Figure 1) denoting its class, and attributes (DW_AT_*, see DWARF PDF Figure 2) denoting its various characteristics.
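The tree is stored flattened: if a DIE has children, the entry that follows it is its first child. If a DIE doesn’t have children, the next entry is a “sibling”. A chain of siblings is terminated by a null entry.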
Consider the following structure:
CU1 (DW_TAG_compile_unit)
    func1 (DW_TAG_subprogram)
        DW_AT_foo
        DW_AT_bar
        func2 (DW_TAG_subprogram)
            DW_AT_foo
            DW_AT_bar
    myvar (DW_TAG_variable)
        DW_AT_foo
        DW_AT_bar
CU2 (DW_TAG_compile_unit)
    ...
CU1 has func1 and myvar as children and CU2 as a sibling. func1 has func2 as a child.
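To make the child/sibling rule concrete, here is a minimal sketch that rebuilds the tree above from a flat list of entries. It is a simplified model rather than a real DWARF parser: the `Die` class, the `build_tree` helper and the `(name, has_children)` tuples are made up for illustration, and `None` stands for the null entry that ends a sibling chain.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Die:
    name: str
    children: List["Die"] = field(default_factory=list)


# A flat entry is (name, has_children); None models the null entry that
# terminates a chain of siblings. Both are illustrative simplifications.
FlatEntry = Optional[Tuple[str, bool]]


def build_tree(entries: List[FlatEntry]) -> List[Die]:
    roots: List[Die] = []
    stack: List[Die] = []  # DIEs whose child lists are still open
    for entry in entries:
        if entry is None:
            # Null entry: the innermost open child list is finished.
            if stack:
                stack.pop()
            continue
        name, has_children = entry
        die = Die(name)
        # Attach to the innermost open parent, or to the top level.
        (stack[-1].children if stack else roots).append(die)
        if has_children:
            stack.append(die)
    return roots


flat = [
    ("CU1", True),
    ("func1", True),
    ("func2", False),
    None,               # ends func1's children
    ("myvar", False),
    None,               # ends CU1's children
    ("CU2", False),
]
tree = build_tree(flat)
for cu in tree:
    print(cu.name, [child.name for child in cu.children])
```

A real reader would take the "has children" flag from each DIE's abbreviation declaration and would parse every CU from its own header; the sketch skips both.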
The debug file can be generated by compiling with the -g option. To dump DWARF info you can use readelf -wi <file> or dwarfdump <file>.
Inline functions in DWARF
DIEs of inline function declarations have the DW_TAG_subprogram tag and the DW_AT_inline attribute. DIEs of inline copies of this function will have the DW_TAG_inlined_subroutine tag.
Attributes that inline copies can have include:
- DW_AT_abstract_origin: DIE offset to the inline declaration.
- DW_AT_call_file: Integer denoting the file the function is called in.
- DW_AT_call_line: File line.
- DW_AT_call_column: File column.
- DW_AT_low_pc and DW_AT_high_pc: Lower and upper call boundaries. Explained in the next section.
- DW_AT_ranges: Explained in the next section.
For example, if we dump the DWARF info for my FreeBSD kernel:
$ readelf -wi /usr/lib/debug/boot/kernel/kernel.debug > ~/foo
We find that vfs_freevnodes_dec gets inlined:
<1><1dfa144>: Abbrev Number: 94 (DW_TAG_subprogram)
<1dfa145> DW_AT_name : (indirect string) vfs_freevnodes_dec
<1dfa149> DW_AT_decl_file : 1
<1dfa14a> DW_AT_decl_line : 1447
<1dfa14c> DW_AT_prototyped : 1
<1dfa14c> DW_AT_inline : 1
Inline copies will have DW_AT_abstract_origin point to the declaration DIE’s offset, in this case 0x1dfa144. If we look for 0x1dfa144, we do indeed find a few inline copies.
<3><1dfe45e>: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
<1dfe45f> DW_AT_abstract_origin: <0x1dfa144>
<1dfe463> DW_AT_low_pc : 0xffffffff80cf701d
<1dfe46b> DW_AT_high_pc : 0x38
<1dfe46f> DW_AT_call_file : 1
<1dfe470> DW_AT_call_line : 3458
<1dfe472> DW_AT_call_column : 5
<3><1dfd2e2>: Abbrev Number: 58 (DW_TAG_inlined_subroutine)
<1dfd2e3> DW_AT_abstract_origin: <0x1dfa144>
<1dfd2e7> DW_AT_ranges : 0x1f1290
<1dfd2eb> DW_AT_call_file : 1
<1dfd2ec> DW_AT_call_line : 3405
<1dfd2ee> DW_AT_call_column : 3
...there are more
As I described in the first section, DIEs are organized per CU, so a debug file may consist of multiple CUs that each define the same inline function. We want to treat each CU independently; that is, each inline copy is handled relative to its own CU.
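As a rough illustration of handling one CU at a time, the sketch below walks a single CU's DIE tree and collects the inline copies whose DW_AT_abstract_origin matches a given declaration offset. The dictionary layout and the `inline_copies` helper are assumptions made for this example. The offsets and attributes come from the dump above, but the nesting is simplified and the enclosing function DIE is a placeholder.

```python
def inline_copies(die, decl_offset):
    """Yield every DW_TAG_inlined_subroutine under `die` that points at decl_offset."""
    if (die["tag"] == "DW_TAG_inlined_subroutine"
            and die["attrs"].get("DW_AT_abstract_origin") == decl_offset):
        yield die
    for child in die.get("children", []):
        yield from inline_copies(child, decl_offset)


# One CU, modelled as nested dicts (hypothetical layout).
cu = {
    "offset": 0x1dee9fb, "tag": "DW_TAG_compile_unit", "attrs": {},
    "children": [
        {"offset": 0x1dfa144, "tag": "DW_TAG_subprogram",
         "attrs": {"DW_AT_name": "vfs_freevnodes_dec", "DW_AT_inline": 1}},
        # Placeholder for whichever function the copies were inlined into.
        {"offset": None, "tag": "DW_TAG_subprogram", "attrs": {},
         "children": [
             {"offset": 0x1dfe45e, "tag": "DW_TAG_inlined_subroutine",
              "attrs": {"DW_AT_abstract_origin": 0x1dfa144,
                        "DW_AT_call_line": 3458}},
             {"offset": 0x1dfd2e2, "tag": "DW_TAG_inlined_subroutine",
              "attrs": {"DW_AT_abstract_origin": 0x1dfa144,
                        "DW_AT_call_line": 3405}},
         ]},
    ],
}

for copy in inline_copies(cu, 0x1dfa144):
    print(hex(copy["offset"]), copy["attrs"]["DW_AT_call_line"])
```

Running the same walk once per CU keeps copies from different CUs separate, even when they share a function name.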
Calculating call boundaries
There are 2 cases we have to take care of when calculating the actual call boundaries of an inline copy.
The DIE has DW_AT_low_pc and DW_AT_high_pc
<3><1dfe45e>: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
<1dfe45f> DW_AT_abstract_origin: <0x1dfa144>
<1dfe463> DW_AT_low_pc : 0xffffffff80cf701d
<1dfe46b> DW_AT_high_pc : 0x38
<1dfe46f> DW_AT_call_file : 1
<1dfe470> DW_AT_call_line : 3458
<1dfe472> DW_AT_call_column : 5
In this case, the lower boundary is low_pc and the upper boundary is low_pc + high_pc, because high_pc is encoded here as an offset from low_pc rather than as an absolute address. For the DIE shown in this example, the boundaries are:
low = 0xffffffff80cf701d
high = 0xffffffff80cf701d + 0x38 = 0xffffffff80cf7055
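The same arithmetic as a small sketch. The `pc_bounds` helper and its `high_pc_is_offset` flag are hypothetical names. The flag exists because DWARF allows DW_AT_high_pc to be either an offset from low_pc (constant class, as in this dump) or an absolute address (address class).

```python
def pc_bounds(low_pc: int, high_pc: int, high_pc_is_offset: bool = True):
    """Return (low, high) for a DIE that carries DW_AT_low_pc and DW_AT_high_pc."""
    # Constant-class high_pc is relative to low_pc; address-class is absolute.
    high = low_pc + high_pc if high_pc_is_offset else high_pc
    return low_pc, high


low, high = pc_bounds(0xffffffff80cf701d, 0x38)
print(f"low  = {low:#x}")
print(f"high = {high:#x}")
```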
The DIE has DW_AT_ranges
<3><1dfd2e2>: Abbrev Number: 58 (DW_TAG_inlined_subroutine)
<1dfd2e3> DW_AT_abstract_origin: <0x1dfa144>
<1dfd2e7> DW_AT_ranges : 0x1f1290
<1dfd2eb> DW_AT_call_file : 1
<1dfd2ec> DW_AT_call_line : 3405
<1dfd2ee> DW_AT_call_column : 3
This is a bit more involved. DW_AT_ranges refers to the .debug_ranges section found in debug files. We can dump the ranges:
$ dwarfdump -N /usr/lib/debug/boot/kernel/kernel.debug
.debug_ranges
Ranges group 0:
ranges: 3 at .debug_ranges offset 0 (0x00000000) (48 bytes)
[ 0] range entry 0x00000019 0x00000073
[ 1] range entry 0x0000007e 0x00000106
[ 2] range end 0x00000000 0x00000000
Ranges group 1:
ranges: 3 at .debug_ranges offset 48 (0x00000030) (48 bytes)
[ 0] range entry 0x00000022 0x0000006a
[ 1] range entry 0x0000007e 0x00000106
[ 2] range end 0x00000000 0x00000000
...
If we search for 0x1f1290 (the inline copy’s ranges), we find its range group:
Ranges group 38809:
ranges: 3 at .debug_ranges offset 2036368 (0x001f1290) (48 bytes)
[ 0] range entry 0x000025c8 0x000025f9
[ 1] range entry 0x0000261a 0x00002621
[ 2] range end 0x00000000 0x00000000
To get the call boundaries, we add each range entry’s boundaries to the DW_AT_low_pc of the root DIE of the CU. The root DIE is found programmatically, but I happen to know that in this case, the root DIE is:
<0><1dee9fb>: Abbrev Number: 1 (DW_TAG_compile_unit)
<1dee9fc> DW_AT_producer : (indirect string) FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
<1deea00> DW_AT_language : 12 (C99)
<1deea02> DW_AT_name : (indirect string) /usr/src/sys/kern/vfs_subr.c
<1deea06> DW_AT_stmt_list : 0x6cb448
<1deea0a> DW_AT_comp_dir : (indirect string) /usr/obj/usr/src/amd64.amd64/sys/GENERIC
<1deea0e> DW_AT_low_pc : 0xffffffff80cf4020
<1deea16> DW_AT_high_pc : 0xde3d
Finally, we end up with the following boundaries:
low = 0xffffffff80cf4020 + 0x000025c8 = 0xffffffff80cf65e8
high = 0xffffffff80cf4020 + 0x000025f9 = 0xffffffff80cf6619
low = 0xffffffff80cf4020 + 0x0000261a = 0xffffffff80cf663a
high = 0xffffffff80cf4020 + 0x00002621 = 0xffffffff80cf6641
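The range-list walk can be sketched as follows. This is an illustration under stated assumptions, not the code inlinecall(1) uses: `decode_ranges` is a hypothetical helper, the section bytes are assumed to be little-endian with 8-byte addresses (as on amd64), and the sample bytes are built by hand from the range group shown above. The sketch also honours base address selection entries (a begin value of all ones), which do not occur in this example.

```python
import struct


def decode_ranges(section: bytes, offset: int, cu_low_pc: int, addr_size: int = 8):
    """Decode one range list starting at `offset` in .debug_ranges bytes."""
    fmt = "<QQ" if addr_size == 8 else "<II"   # little-endian (begin, end) pair
    step = struct.calcsize(fmt)
    max_addr = (1 << (8 * addr_size)) - 1
    base = cu_low_pc                           # default base: the CU's DW_AT_low_pc
    bounds = []
    while offset + step <= len(section):
        begin, end = struct.unpack_from(fmt, section, offset)
        offset += step
        if begin == 0 and end == 0:            # "range end" entry
            break
        if begin == max_addr:                  # base address selection entry
            base = end
            continue
        bounds.append((base + begin, base + end))
    return bounds


# Hand-built bytes mirroring the range group at offset 0x1f1290.
raw = struct.pack("<6Q", 0x25c8, 0x25f9, 0x261a, 0x2621, 0, 0)
for low, high in decode_ranges(raw, 0, 0xffffffff80cf4020):
    print(f"[{low:#x} - {high:#x}]")
```

With the real file, `section` would be the full contents of .debug_ranges and `offset` the DW_AT_ranges value (0x1f1290 here).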
Finding the caller function
There are cases where we want to know which function an inline function is being called from. Because DWARF does not encode that information, we’ll have to scan ELF symbol tables.
$ readelf -s /usr/lib/debug/boot/kernel/kernel.debug
Since we know the inline copy’s boundaries, we only have to find which symbol’s boundaries the inline copy is inside. In other words, the following condition has to be met:
sym_lower_bound <= inline_lower_bound <= inline_upper_bound <= sym_upper_bound
Because searching through ELF symbol tables manually and doing calculations by hand would take too long, the best way to do this is programmatically through LibELF.
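The containment check itself is small. The sketch below assumes the symbol table has already been read into (name, value, size) triples, for instance with libelf. The `Symbol` type, the `find_caller` helper and the symbol names and addresses in the sample table are all invented for illustration and do not come from the real kernel.

```python
from typing import Iterable, NamedTuple, Optional


class Symbol(NamedTuple):
    name: str
    value: int   # st_value: start address of the symbol
    size: int    # st_size: length in bytes


def find_caller(symbols: Iterable[Symbol], low: int, high: int) -> Optional[str]:
    """Return the first symbol whose bounds enclose [low, high], if any."""
    for sym in symbols:
        sym_low, sym_high = sym.value, sym.value + sym.size
        if sym_low <= low <= high <= sym_high:
            return sym.name
    return None


# Made-up function symbols; only the inline copy's bounds are from the post.
symtab = [
    Symbol("hypothetical_func_a", 0xffffffff80cf5000, 0x800),
    Symbol("hypothetical_func_b", 0xffffffff80cf6400, 0x400),
]
print(find_caller(symtab, 0xffffffff80cf65e8, 0xffffffff80cf6619))
```

A real implementation would likely restrict the search to function symbols (STT_FUNC) so that data objects and section symbols cannot match.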
inlinecall(1)
I wrote a little program that does everything I talked about in this post automatically. It works on FreeBSD as-is, and most likely needs some modification to get it to work on other platforms.
The program takes an inline function name and a debug file as arguments:
inlinecall <function> <file>
And outputs the results in the following form:
cu1_func_declaration_file:line
[low_bound - high_bound] inline_copy1_file:line caller_func()
[low_bound - high_bound] inline_copy2_file:line caller_func()
...
cu2_func_declaration_file:line
...
...
For example:
$ inlinecall critical_enter /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/sys/systm.h:175
[0xffffffff809eb51f - 0xffffffff809eb526] /usr/src/sys/kern/kern_intr.c:1387 intr_event_handle()
/usr/src/sys/sys/systm.h:175
[0xffffffff80a051f4 - 0xffffffff80a05208] /usr/src/sys/kern/kern_malloc.c:431 malloc_type_freed()
[0xffffffff80a0514c - 0xffffffff80a0515b] /usr/src/sys/kern/kern_malloc.c:388 malloc_type_zone_allocated()
/usr/src/sys/sys/systm.h:175
[0xffffffff80a263c4 - 0xffffffff80a263d3] /usr/src/sys/kern/kern_resource.c:509 rtp_to_pri()
/usr/src/sys/sys/systm.h:175
[0xffffffff80a28f59 - 0xffffffff80a28f5f] /usr/src/sys/kern/kern_rmlock.c:775 _rm_assert()
[0xffffffff80a29087 - 0xffffffff80a2908d] /usr/src/sys/kern/kern_rmlock.c:801 _rm_assert()
[0xffffffff80a29eb0 - 0xffffffff80a29eb7] /usr/src/sys/kern/kern_rmlock.c:645 _rm_rlock_debug()
[0xffffffff80a28c4b - 0xffffffff80a28c5a] /usr/src/sys/kern/kern_rmlock.c:160 unlock_rm()
...more
Nested inline functions
inlinecall(1) resolves nested inline functions recursively:
$ ./inlinecall critical_enter /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/sys/systm.h:175
[0xffffffff80a19d7a - 0xffffffff80a19d8b] /usr/src/sys/sys/buf_ring.h:80 drbr_enqueue()
/usr/src/sys/sys/systm.h:175
[0xffffffff80a6387a - 0xffffffff80a6388b] /usr/src/sys/sys/buf_ring.h:80 drbr_enqueue()
...
Looking at the definition of critical_enter()’s caller function in buf_ring.h:
static __inline int
buf_ring_enqueue(struct buf_ring *br, void *buf)
{
	...
	critical_enter();
	...
}
Even though inlinecall(1) reported that critical_enter() is called from drbr_enqueue() in buf_ring.h:80, we see that it’s called from buf_ring_enqueue() instead, but buf_ring_enqueue() is also an inline function:
$ ./inlinecall buf_ring_enqueue /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/sys/buf_ring.h:63
[0xffffffff80a19d7a - 0xffffffff80a19dcd] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
[0xffffffff80a19ddc - 0xffffffff80a19e18] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
[0xffffffff80a19e1f - 0xffffffff80a19e3b] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
/usr/src/sys/sys/buf_ring.h:63
[0xffffffff80a6387a - 0xffffffff80a638cd] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
[0xffffffff80a638dc - 0xffffffff80a63918] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
[0xffffffff80a6391f - 0xffffffff80a6393b] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
/usr/src/sys/sys/buf_ring.h:63
[0xffffffff80d1f81a - 0xffffffff80d1f879] /usr/src/sys/net/ifq.c:57 drbr_enqueue()
[0xffffffff80d1f91d - 0xffffffff80d1f964] /usr/src/sys/net/ifq.c:57 drbr_enqueue()
[0xffffffff80d1f9dd - 0xffffffff80d1f9f5] /usr/src/sys/net/ifq.c:57 drbr_enqueue()
/usr/src/sys/sys/buf_ring.h:63
[0xffffffff80ff07ba - 0xffffffff80ff080d] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
[0xffffffff80ff081c - 0xffffffff80ff0858] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
[0xffffffff80ff085f - 0xffffffff80ff087b] /usr/src/sys/net/ifq.h:337 drbr_enqueue()
Here drbr_enqueue() is defined twice: once in ifq.h and once in ifq.c. The definition in ifq.h is also an inline definition, and the one in ifq.c is a non-inline one. We know that buf_ring_enqueue() is called from the non-inline version of drbr_enqueue(); otherwise inlinecall(1) would have reported the function that calls the inline version of drbr_enqueue().
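The nesting is also visible in the addresses: the critical_enter() copy reported at buf_ring.h:80 lies entirely inside the first buf_ring_enqueue() copy. The sketch below checks this with the ranges printed above. It only illustrates the relationship and is not necessarily how inlinecall(1) resolves nesting. `encloses` is a hypothetical helper.

```python
def encloses(outer, inner):
    """True if the (low, high) pair `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Ranges copied from the inlinecall(1) output above.
critical_enter_copy = (0xffffffff80a19d7a, 0xffffffff80a19d8b)
buf_ring_enqueue_copies = [
    (0xffffffff80a19d7a, 0xffffffff80a19dcd),
    (0xffffffff80a19ddc, 0xffffffff80a19e18),
    (0xffffffff80a19e1f, 0xffffffff80a19e3b),
]

enclosing = [r for r in buf_ring_enqueue_copies
             if encloses(r, critical_enter_copy)]
for low, high in enclosing:
    print(f"critical_enter() sits inside [{low:#x} - {high:#x}]")
```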