Christos Margiolis: programming

Inline function tracing with the kinst DTrace provider

Tue, 18 Jul 2023 00:00:00 +1200

Quick background

DTrace is a framework that gives administrators and kernel developers the ability to observe kernel behavior in real time. DTrace has modules called “providers” that perform a particular kind of instrumentation in the kernel (and sometimes userland) using “probes”.

kinst is a new low-level DTrace provider co-authored by Christos Margiolis and Mark Johnston for the FreeBSD operating system, which allows the user to trace arbitrary instructions in kernel functions. It is part of the base system as of FreeBSD 14.0.

kinst probes take the form of kinst::<function>:<instruction>, where <function> is the kernel function to be traced, and <instruction> is the offset to the instruction, relative to the beginning of the function, and can be obtained from the function’s disassembly. If the <instruction> field is left empty, kinst will trace all instructions in that function. Unlike FBT, kinst can also trace the entry and return points of inline functions (see Inline function tracing).

The name is inspired by an early paper by A. Tamches and B. Miller describing a tracing tool they developed called “KernInst”.

Usage

Find the offset corresponding to the third instruction in vm_fault() and trace it, printing the contents of the RSI register:

# kgdb
(kgdb) disas /r vm_fault
Dump of assembler code for function vm_fault:
   0xffffffff80f4e470 <+0>:     55      push   %rbp
   0xffffffff80f4e471 <+1>:     48 89 e5        mov    %rsp,%rbp
   0xffffffff80f4e474 <+4>:     41 57   push   %r15
...

# dtrace -n 'kinst::vm_fault:4 {printf("%#x", regs[R_RSI]);}'
  2  81500                       vm_fault:4 0x827c56000
  2  81500                       vm_fault:4 0x827878000
  2  81500                       vm_fault:4 0x1fab9bef0000
  2  81500                       vm_fault:4 0xe16cf749000
  0  81500                       vm_fault:4 0x13587c366000
  ^C

Trace the return point of critical_enter(), which is an inline function:

# dtrace -n 'kinst::critical_enter:return'
dtrace: description 'kinst::critical_enter:return' matched 130 probes
CPU     ID                    FUNCTION:NAME
  1  71024                spinlock_enter:53
  0  71024                spinlock_enter:53
  1  70992                uma_zalloc_arg:49
  1  70925    malloc_type_zone_allocated:21
  1  70994                uma_zfree_arg:365
  1  70924             malloc_type_freed:21
  1  71024                spinlock_enter:53
  0  71024                spinlock_enter:53
  0  70947         _epoch_enter_preempt:122
  0  70949           _epoch_exit_preempt:28
  ^C

Inline function tracing

How it works

To trace inline functions, libdtrace uses DWARF debugging information to detect whether the specified function is inlined. If it is, the D syntax is transformed to create kinst probes for each of the inline copies found. All of this work is done in libdtrace rather than in kinst(4). This feature has been added to FreeBSD with this patch.

Contrary to how kinst expects a <function>:<instruction> tuple to create probes, for inline functions, <instruction> is replaced by entry and return.

Syntactic transformations

Suppose the user wants to trace a probe of the form:

kinst::<func>:<entry|return>
/<pred>/
{
	<acts>
}

libdtrace sees that we have specified entry or return, instead of an offset, which is what a regular kinst probe would look like, so it loops through all loaded kernel modules and parses their DWARF and ELF info to see if this function is an inline — if not, the probe is converted to an FBT one, so that we don’t duplicate FBT’s functionality in kinst:

# dtrace -dn 'kinst::malloc:entry {exit(0);}'
fbt::malloc:entry
{
        exit(0x0);
}

dtrace: description 'kinst::malloc:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  2  31144                     malloc:entry

If the function is an inline, however, libdtrace will find all calls referring to this function and create new probes for each of the inline copies found.

# dtrace -dn 'kinst::cam_iosched_has_more_trim:entry { printf("\t%d\t%s", pid, execname); }'
kinst::cam_iosched_get_trim:13,
kinst::cam_iosched_next_bio:13,
kinst::cam_iosched_schedule:40
{
	printf("\t%d\t%s", pid, execname);
}

dtrace: description 'kinst::cam_iosched_has_more_trim:entry ' matched 4 probes
CPU     ID                    FUNCTION:NAME
  0  81502          cam_iosched_schedule:40     2       clock
  0  81501          cam_iosched_next_bio:13     2       clock
  2  81502          cam_iosched_schedule:40     2       clock
  1  81502          cam_iosched_next_bio:13     0	kernel
  1  81503          cam_iosched_schedule:40     0	kernel
^C

There can also be both inline and non-inline definitions of the same function. In this case, kinst creates an additional FBT probe for the non-inline definition.

The -d flag used in these examples dumps the D script after libdtrace has applied its syntactic transformations. It was added to DTrace in commit 1e136a9cbd3a.

Heuristic for calculating the `entry` and `return` offsets

libdtrace reuses parts of the mechanism implemented in my inlinecall(1) program, which finds and prints all call sites of a given inline function:

$ ./inlinecall vm_page_mvqueue /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/vm/vm_page.c:4142
        [0xffffffff80f91541 - 0xffffffff80f91599]       /usr/src/sys/vm/vm_page.c:4195  vm_page_readahead_finish()
        [0xffffffff80f915f5 - 0xffffffff80f91603]       /usr/src/sys/vm/vm_page.c:4195  vm_page_readahead_finish()
        [0xffffffff80f9163d - 0xffffffff80f916c2]       /usr/src/sys/vm/vm_page.c:4184  vm_page_activate()
        [0xffffffff80f916cd - 0xffffffff80f916e5]       /usr/src/sys/vm/vm_page.c:4184  vm_page_activate()
        [0xffffffff80f916fe - 0xffffffff80f91747]       /usr/src/sys/vm/vm_page.c:4195  vm_page_deactivate()
        [0xffffffff80f91750 - 0xffffffff80f91768]       /usr/src/sys/vm/vm_page.c:4195  vm_page_deactivate()
        [0xffffffff80f94a59 - 0xffffffff80f94aa9]       /usr/src/sys/vm/vm_page.c:4195  vm_page_reclaim_contig_domain()
        [0xffffffff80f94de4 - 0xffffffff80f94df9]       /usr/src/sys/vm/vm_page.c:4195  vm_page_reclaim_contig_domain()
        [0xffffffff80f9661e - 0xffffffff80f96667]       /usr/src/sys/vm/vm_page.c:4202  vm_page_deactivate_noreuse()
        [0xffffffff80f96670 - 0xffffffff80f96688]       /usr/src/sys/vm/vm_page.c:4202  vm_page_deactivate_noreuse()
        [0xffffffff80f9669e - 0xffffffff80f966ea]       /usr/src/sys/vm/vm_page.c:4212  vm_page_launder()
        [0xffffffff80f966f3 - 0xffffffff80f9670b]       /usr/src/sys/vm/vm_page.c:4212  vm_page_launder()
        [0xffffffff80f96d4c - 0xffffffff80f96dac]       /usr/src/sys/vm/vm_page.c:4212  vm_page_advise()
        [0xffffffff80f96dac - 0xffffffff80f96e07]       /usr/src/sys/vm/vm_page.c:4202  vm_page_advise()

Most of the entries above appear twice but with different boundaries:

        [0xffffffff80f9163d - 0xffffffff80f916c2]       /usr/src/sys/vm/vm_page.c:4184  vm_page_activate()
        [0xffffffff80f916cd - 0xffffffff80f916e5]       /usr/src/sys/vm/vm_page.c:4184  vm_page_activate()

This means that the inline copy’s boundaries are split into more than one part, which can be caused by early returns inside the inline function. Under this assumption, we can deduce that the entry point of the function is 0xffffffff80f9163d, and that each of the upper boundaries is a return point (i.e. 0xffffffff80f916c2 and 0xffffffff80f916e5), so we end up with one entry and two return trace points.

However, this is not exactly true, because the final return address given by DWARF corresponds to the instruction after the actual return instruction. We can verify this in GDB using the two return addresses from the example above:

# kgdb
(kgdb) disas vm_page_activate
...
   0xffffffff80f916c0 <+144>:   jmp    0xffffffff80f91673 <vm_page_activate+67>
   0xffffffff80f916c2 <+146>:   add    $0x8,%rsp			<-- first address given by DWARF
...
   0xffffffff80f916e0 <+176>:   call   0xffffffff80bed360 <panic>	<-- last instruction
									<-- second address should be here

It turns out that 0xffffffff80f916e5 is, in fact, outside vm_page_activate() altogether!

(kgdb) x/i 0xffffffff80f916e5
   0xffffffff80f916e5:  data16 cs nopw 0x0(%rax,%rax,1)

After running a couple of tests, I came to the conclusion that there are two possible cases with return addresses, plus a third that follows from them:

If the inline copy’s DIE has DW_AT_low_pc and DW_AT_high_pc set, the return address is always outside the inline function’s boundaries.
If the inline copy’s DIE has DW_AT_ranges set, only the last return address is outside the inline function’s boundaries.
Combining the two cases above, if the return address of the inline function is the same as the upper boundary of the caller function, it’s outside both the inline copy’s and the caller function’s boundaries.

To fix this, we have to go back one instruction whenever we come across one of these three cases.

Finally the entry offset is calculated as:

inline_bound_lo - caller_bound_lo

And the return one, including the modifications (if any) discussed above, as:

inline_bound_hi - caller_bound_lo

These offsets are then used to create regular kinst probes of the form kinst::<func>:<instruction>, which is what kinst actually expects:

# dtrace -dn 'kinst::vm_page_mvqueue:entry,kinst::vm_page_mvqueue:return'
dtrace: description 'kinst::vm_page_mvqueue:entry,kinst::vm_page_mvqueue:return' matched 22 probes
kinst::vm_page_readahead_finish:33,
kinst::vm_page_readahead_finish:121,
kinst::vm_page_activate:13,
kinst::vm_page_deactivate:14,
kinst::vm_page_reclaim_contig_domain:1961,
kinst::vm_page_deactivate_noreuse:14,
kinst::vm_page_launder:14,
kinst::vm_page_advise:236,
kinst::vm_page_advise:332,
kinst::vm_page_readahead_finish:220,
kinst::vm_page_activate:146,
kinst::vm_page_activate:176,
kinst::vm_page_deactivate:87,
kinst::vm_page_deactivate:115,
kinst::vm_page_reclaim_contig_domain:2041,
kinst::vm_page_reclaim_contig_domain:2884,
kinst::vm_page_deactivate_noreuse:87,
kinst::vm_page_deactivate_noreuse:115,
kinst::vm_page_launder:90,
kinst::vm_page_launder:118,
kinst::vm_page_advise:330,
kinst::vm_page_advise:421
{
}

CPU     ID                    FUNCTION:NAME
  3  95381              vm_page_activate:13 
  3  95389             vm_page_activate:146 
  2  95381              vm_page_activate:13 
  2  95389             vm_page_activate:146 
  1  95387               vm_page_advise:332 
  1  95400               vm_page_advise:421 
  1  95387               vm_page_advise:332 
  1  95400               vm_page_advise:421 
  1  95387               vm_page_advise:332 
^C
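The offset arithmetic above can be checked by hand with a few lines of C. This is only a sketch: kinst_offset() is a hypothetical helper, not a libdtrace function, and the start address of vm_page_activate() (0xffffffff80f91630) is inferred from the kgdb line showing 0xffffffff80f916c0 at <+144>. The "go one instruction back" step needs a disassembler and is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of an address relative to the start of the caller function,
 * i.e. the <instruction> field of a regular kinst probe.
 * Hypothetical helper, for illustration only.
 */
static uint64_t
kinst_offset(uint64_t addr, uint64_t caller_bound_lo)
{
	/* The address must lie at or after the start of the caller. */
	assert(addr >= caller_bound_lo);
	return (addr - caller_bound_lo);
}
```

With the inferred start address, the entry address 0xffffffff80f9163d gives offset 13, the first return address 0xffffffff80f916c2 gives 146, and 0xffffffff80f916e0 (the panic call, one instruction before 0xffffffff80f916e5) gives 176, which matches the vm_page_activate probes in the dump above.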

Using DWARF to find call sites of inline functions

Tue, 07 Feb 2023 00:00:00 +1200

What is DWARF?

From the DWARF Debugging Standard’s documentation:

This document defines a format for describing programs to facilitate user source level debugging. This description can be generated by compilers, assemblers and linkage editors. It can be used by debuggers and other tools.

Debugging information entries (DIEs) are represented as a tree, one per compilation unit (CU). Each DIE has a tag (DW_TAG_*, see DWARF PDF Figure 1) denoting its class, and attributes (DW_AT_*, see DWARF PDF Figure 2) denoting its various characteristics, associated with it.

The next entry of a DIE is a child DIE. If a DIE doesn’t have children, the next entry is a “sibling”.

Consider the following structure:

CU1 (DW_TAG_compile_unit)
	func1 (DW_TAG_subprogram)
		DW_AT_foo
		DW_AT_bar
		func2 (DW_TAG_subprogram)
			DW_AT_foo
			DW_AT_bar
	myvar (DW_TAG_variable)
		DW_AT_foo
		DW_AT_bar
CU2 (DW_TAG_compile_unit)
	...

CU1 has func1 and myvar as children and CU2 as a sibling. func1 has func2 as a child.
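The child/sibling layout can be pictured as a tree in which each node points to its first child and its next sibling. The sketch below assumes a simplified, hypothetical struct die (real DIEs also carry a tag and attributes, and real code would go through a DWARF library) and searches the tree depth-first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified DIE: only a name and the two tree links. */
struct die {
	const char	*name;
	struct die	*child;		/* first child, or NULL */
	struct die	*sibling;	/* next sibling, or NULL */
};

/* Depth-first search: check a DIE, then its children, then its siblings. */
static const struct die *
die_find(const struct die *d, const char *name)
{
	const struct die *hit;

	assert(name != NULL);
	for (; d != NULL; d = d->sibling) {
		if (strcmp(d->name, name) == 0)
			return (d);
		if ((hit = die_find(d->child, name)) != NULL)
			return (hit);
	}
	return (NULL);
}
```

Starting the search at CU1 reaches func2 through func1's child link and CU2 through CU1's sibling link. Starting at func1 never reaches CU1, because there are no links back to the parent.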

The debug file can be generated by compiling with the -g option. To dump DWARF info you can use readelf -wi <file> and dwarfdump <file>.

Inline functions in DWARF

DIEs of inline function declarations have the DW_TAG_subprogram tag and the DW_AT_inline attribute. DIEs of inline copies of this function will have the DW_TAG_inlined_subroutine tag.

Attributes inline copies can have include:

DW_AT_abstract_origin: DIE offset to the inline declaration.
DW_AT_call_file: Integer denoting the file the function is called in.
DW_AT_call_line: File line.
DW_AT_call_column: File column.
DW_AT_low_pc and DW_AT_high_pc: Lower and upper call boundaries. Explained in the next section.
DW_AT_ranges: Explained in the next section.

For example, if we dump the DWARF info for my FreeBSD kernel:

$ readelf -wi /usr/lib/debug/boot/kernel/kernel.debug > ~/foo

We find that vfs_freevnodes_dec gets inlined:

 <1><1dfa144>: Abbrev Number: 94 (DW_TAG_subprogram)
    <1dfa145>   DW_AT_name        : (indirect string) vfs_freevnodes_dec
    <1dfa149>   DW_AT_decl_file   : 1
    <1dfa14a>   DW_AT_decl_line   : 1447
    <1dfa14c>   DW_AT_prototyped  : 1
    <1dfa14c>   DW_AT_inline      : 1

Inline copies will have DW_AT_abstract_origin point to the declaration’s DIE offset, in this case 0x1dfa144. If we look for 0x1dfa144, we do indeed find a few inline copies.

 <3><1dfe45e>: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
    <1dfe45f>   DW_AT_abstract_origin: <0x1dfa144>
    <1dfe463>   DW_AT_low_pc      : 0xffffffff80cf701d
    <1dfe46b>   DW_AT_high_pc     : 0x38
    <1dfe46f>   DW_AT_call_file   : 1
    <1dfe470>   DW_AT_call_line   : 3458
    <1dfe472>   DW_AT_call_column : 5

 <3><1dfd2e2>: Abbrev Number: 58 (DW_TAG_inlined_subroutine)
    <1dfd2e3>   DW_AT_abstract_origin: <0x1dfa144>
    <1dfd2e7>   DW_AT_ranges      : 0x1f1290
    <1dfd2eb>   DW_AT_call_file   : 1
    <1dfd2ec>   DW_AT_call_line   : 3405
    <1dfd2ee>   DW_AT_call_column : 3

  ...there are more

As I described in the first section, a debug file may consist of multiple CUs that define the same inline function. We want to treat each CU independently, that is, each inline copy is handled relative to its CU.

Calculating call boundaries

There are 2 cases we have to take care of when calculating the actual call boundaries of an inline copy.

The DIE has `DW_AT_low_pc` and `DW_AT_high_pc`

 <3><1dfe45e>: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
    <1dfe45f>   DW_AT_abstract_origin: <0x1dfa144>
    <1dfe463>   DW_AT_low_pc      : 0xffffffff80cf701d
    <1dfe46b>   DW_AT_high_pc     : 0x38
    <1dfe46f>   DW_AT_call_file   : 1
    <1dfe470>   DW_AT_call_line   : 3458
    <1dfe472>   DW_AT_call_column : 5

In this case, the lower boundary is low_pc and the upper boundary is low_pc + high_pc. For the DIE shown in this example, the boundaries are:

low = 0xffffffff80cf701d
high = 0xffffffff80cf701d + 0x38 = 0xffffffff80cf7055
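The same calculation as a small C sketch. The struct and function names are hypothetical, and the code assumes DW_AT_high_pc is encoded as an offset from DW_AT_low_pc, as it is in this dump; DWARF also allows it to be an absolute address, which this sketch does not handle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of call boundaries for one inline copy. */
struct bounds {
	uint64_t	lo;
	uint64_t	hi;
};

/* Boundaries of a DIE carrying DW_AT_low_pc and an offset-form DW_AT_high_pc. */
static struct bounds
bounds_from_pc(uint64_t low_pc, uint64_t high_pc_off)
{
	struct bounds b = { low_pc, low_pc + high_pc_off };

	assert(b.hi >= b.lo);	/* guard against address wrap-around */
	return (b);
}
```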

The DIE has `DW_AT_ranges`

 <3><1dfd2e2>: Abbrev Number: 58 (DW_TAG_inlined_subroutine)
    <1dfd2e3>   DW_AT_abstract_origin: <0x1dfa144>
    <1dfd2e7>   DW_AT_ranges      : 0x1f1290
    <1dfd2eb>   DW_AT_call_file   : 1
    <1dfd2ec>   DW_AT_call_line   : 3405
    <1dfd2ee>   DW_AT_call_column : 3

This is a bit more involved. DW_AT_ranges refers to the .debug_ranges section found in debug files. We can dump the ranges:

$ dwarfdump -N /usr/lib/debug/boot/kernel/kernel.debug
.debug_ranges
 Ranges group 0:
                ranges: 3 at .debug_ranges offset 0 (0x00000000) (48 bytes)
                        [ 0] range entry    0x00000019 0x00000073
                        [ 1] range entry    0x0000007e 0x00000106
                        [ 2] range end      0x00000000 0x00000000
 Ranges group 1:
                ranges: 3 at .debug_ranges offset 48 (0x00000030) (48 bytes)
                        [ 0] range entry    0x00000022 0x0000006a
                        [ 1] range entry    0x0000007e 0x00000106
                        [ 2] range end      0x00000000 0x00000000
 ...

If we search for 0x1f1290 (the inline copy’s ranges), we find its range group:

 Ranges group 38809:
                ranges: 3 at .debug_ranges offset 2036368 (0x001f1290) (48 bytes)
                        [ 0] range entry    0x000025c8 0x000025f9
                        [ 1] range entry    0x0000261a 0x00002621
                        [ 2] range end      0x00000000 0x00000000

To get the call boundaries, we add each range entry’s boundaries to the DW_AT_low_pc of the root DIE of the CU. The root DIE is found programmatically, but I happen to know that in this case, the root DIE is:

 <0><1dee9fb>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <1dee9fc>   DW_AT_producer    : (indirect string) FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
    <1deea00>   DW_AT_language    : 12	(C99)
    <1deea02>   DW_AT_name        : (indirect string) /usr/src/sys/kern/vfs_subr.c
    <1deea06>   DW_AT_stmt_list   : 0x6cb448
    <1deea0a>   DW_AT_comp_dir    : (indirect string) /usr/obj/usr/src/amd64.amd64/sys/GENERIC
    <1deea0e>   DW_AT_low_pc      : 0xffffffff80cf4020
    <1deea16>   DW_AT_high_pc     : 0xde3d

Finally, we end up with the following boundaries:

low = 0xffffffff80cf4020 + 0x000025c8 = 0xffffffff80cf65e8
high = 0xffffffff80cf4020 + 0x000025f9 = 0xffffffff80cf6619

low = 0xffffffff80cf4020 + 0x0000261a = 0xffffffff80cf663a
high = 0xffffffff80cf4020 + 0x00002621 = 0xffffffff80cf6641
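A sketch of the same rebasing step in C, using hypothetical names. It assumes the range list ends with a (0, 0) entry, as in the dwarfdump output above, and it ignores base address selection entries, which a complete implementation would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one .debug_ranges entry. */
struct range {
	uint64_t	lo;
	uint64_t	hi;
};

/*
 * Add the CU's DW_AT_low_pc to every range entry. Stops at the (0, 0)
 * end-of-list entry or after max entries, and returns the number of
 * ranges written to out.
 */
static size_t
ranges_rebase(uint64_t cu_low_pc, const struct range *in, struct range *out,
    size_t max)
{
	size_t n;

	assert(in != NULL && out != NULL);
	for (n = 0; n < max && !(in[n].lo == 0 && in[n].hi == 0); n++) {
		out[n].lo = cu_low_pc + in[n].lo;
		out[n].hi = cu_low_pc + in[n].hi;
	}
	return (n);
}
```

Feeding it ranges group 38809 and the CU base 0xffffffff80cf4020 yields the two boundary pairs listed above.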

Finding the caller function

There are cases where we want to know which function an inline function is being called from. Because DWARF does not encode that information, we’ll have to scan ELF symbol tables.

$ readelf -s /usr/lib/debug/boot/kernel/kernel.debug

Since we know the inline copy’s boundaries, we only have to find which symbol’s boundaries the inline copy is inside. In other words, the following condition has to be met:

sym_lower_bound <= inline_lower_bound <= inline_upper_bound <= sym_upper_bound

Because searching through ELF symbol tables manually and doing calculations by hand would take too long, the best way to do this is programmatically through LibELF.
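The containment check itself is simple once the symbols are in memory. The sketch below does not use LibELF: struct sym is a hypothetical, flattened view of a function symbol (start address and start plus size), and a real implementation would fill it from the ELF symbol table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of an ELF function symbol. */
struct sym {
	const char	*name;
	uint64_t	 lo;	/* symbol value (start address) */
	uint64_t	 hi;	/* symbol value + symbol size */
};

/* Return the first symbol whose boundaries enclose the inline copy. */
static const struct sym *
find_caller(const struct sym *syms, size_t nsyms, uint64_t inl_lo,
    uint64_t inl_hi)
{
	size_t i;

	assert(inl_lo <= inl_hi);
	for (i = 0; i < nsyms; i++) {
		if (syms[i].lo <= inl_lo && inl_hi <= syms[i].hi)
			return (&syms[i]);
	}
	return (NULL);
}
```

A linear scan is enough for a sketch; sorting the symbols by address and using a binary search would be the obvious refinement for a kernel-sized symbol table.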

inlinecall(1)

I wrote a little program that does everything I talked about in this post automatically. It works on FreeBSD as-is, and most likely needs some modification to get it to work on other platforms.

The program takes an inline function name and a debug file as arguments:

inlinecall <function> <file>

And outputs the results in the following form:

cu1_func_declaration_file:line
	[low_bound - high_bound]	inline_copy1_file:line	caller_func()
	[low_bound - high_bound]	inline_copy2_file:line	caller_func()
	...
cu2_func_declaration_file:line
	...
...

For example:

$ inlinecall critical_enter /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/sys/systm.h:175
        [0xffffffff809eb51f - 0xffffffff809eb526]       /usr/src/sys/kern/kern_intr.c:1387      intr_event_handle()
/usr/src/sys/sys/systm.h:175
        [0xffffffff80a051f4 - 0xffffffff80a05208]       /usr/src/sys/kern/kern_malloc.c:431     malloc_type_freed()
        [0xffffffff80a0514c - 0xffffffff80a0515b]       /usr/src/sys/kern/kern_malloc.c:388     malloc_type_zone_allocated()
/usr/src/sys/sys/systm.h:175
        [0xffffffff80a263c4 - 0xffffffff80a263d3]       /usr/src/sys/kern/kern_resource.c:509   rtp_to_pri()
/usr/src/sys/sys/systm.h:175
        [0xffffffff80a28f59 - 0xffffffff80a28f5f]       /usr/src/sys/kern/kern_rmlock.c:775     _rm_assert()
        [0xffffffff80a29087 - 0xffffffff80a2908d]       /usr/src/sys/kern/kern_rmlock.c:801     _rm_assert()
        [0xffffffff80a29eb0 - 0xffffffff80a29eb7]       /usr/src/sys/kern/kern_rmlock.c:645     _rm_rlock_debug()
        [0xffffffff80a28c4b - 0xffffffff80a28c5a]       /usr/src/sys/kern/kern_rmlock.c:160     unlock_rm()
...more

Nested inline functions

inlinecall(1) resolves nested inline functions recursively:

$ ./inlinecall critical_enter /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/sys/systm.h:175
        [0xffffffff80a19d7a - 0xffffffff80a19d8b]       /usr/src/sys/sys/buf_ring.h:80  drbr_enqueue()
/usr/src/sys/sys/systm.h:175
        [0xffffffff80a6387a - 0xffffffff80a6388b]       /usr/src/sys/sys/buf_ring.h:80  drbr_enqueue()
...

Looking at the definition of critical_enter()’s caller function in buf_ring.h:

static __inline int
buf_ring_enqueue(struct buf_ring *br, void *buf)
{
	...
	critical_enter();
	...
}

Even though inlinecall(1) reported that critical_enter() is called from drbr_enqueue() in buf_ring.h:80, we see that it’s called from buf_ring_enqueue() instead, but buf_ring_enqueue() is also an inline function:

$ ./inlinecall buf_ring_enqueue /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/sys/buf_ring.h:63
        [0xffffffff80a19d7a - 0xffffffff80a19dcd]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
        [0xffffffff80a19ddc - 0xffffffff80a19e18]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
        [0xffffffff80a19e1f - 0xffffffff80a19e3b]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
/usr/src/sys/sys/buf_ring.h:63
        [0xffffffff80a6387a - 0xffffffff80a638cd]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
        [0xffffffff80a638dc - 0xffffffff80a63918]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
        [0xffffffff80a6391f - 0xffffffff80a6393b]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
/usr/src/sys/sys/buf_ring.h:63
        [0xffffffff80d1f81a - 0xffffffff80d1f879]       /usr/src/sys/net/ifq.c:57       drbr_enqueue()
        [0xffffffff80d1f91d - 0xffffffff80d1f964]       /usr/src/sys/net/ifq.c:57       drbr_enqueue()
        [0xffffffff80d1f9dd - 0xffffffff80d1f9f5]       /usr/src/sys/net/ifq.c:57       drbr_enqueue()
/usr/src/sys/sys/buf_ring.h:63
        [0xffffffff80ff07ba - 0xffffffff80ff080d]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
        [0xffffffff80ff081c - 0xffffffff80ff0858]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()
        [0xffffffff80ff085f - 0xffffffff80ff087b]       /usr/src/sys/net/ifq.h:337      drbr_enqueue()

Here drbr_enqueue() is defined twice — once in ifq.h and once in ifq.c. The definition in ifq.h is also an inline definition, and in ifq.c it’s a non-inline one. We know that buf_ring_enqueue() is called from the non-inline version of drbr_enqueue(), otherwise inlinecall(1) would have reported the function which calls the inline version of drbr_enqueue().

Making a character device kernel module on FreeBSD

Sun, 10 Jul 2022 00:00:00 +1200

This article assumes advanced knowledge of C and a basic understanding of the FreeBSD kernel and programming environment. It is also meant to serve as a template/reference and not a complete implementation.

Sample code can be found here.

Also mirrored on the FreeBSD Wiki.

Implementing the device

malloc declaration

Kernel modules have their own malloc types, which are defined as follows:

MALLOC_DECLARE(M_MYDEV);
MALLOC_DEFINE(M_MYDEV, "mydev", "device description");

Then, you can use malloc(9) and free(9) as:

p = malloc(sizeof(foo), M_MYDEV, M_WAITOK | M_ZERO);
free(p, M_MYDEV);

`cdevsw` structure

The device’s properties and methods are stored in a cdevsw (Character Device Switch) structure, defined in sys/conf.h. The fields we care about most of the time are the following:

struct cdevsw {
	int			d_version;
	u_int			d_flags;
	const char		*d_name;
	d_open_t		*d_open;
	d_fdopen_t		*d_fdopen;
	d_close_t		*d_close;
	d_read_t		*d_read;
	d_write_t		*d_write;
	d_ioctl_t		*d_ioctl;
	d_poll_t		*d_poll;
	d_mmap_t		*d_mmap;
	d_strategy_t		*d_strategy;
	dumper_t		*d_dump;
	d_kqfilter_t		*d_kqfilter;
	d_purge_t		*d_purge;
	d_mmap_single_t		*d_mmap_single;
	...
};

All the *_t pointers are pointers to functions meant to be implemented by the driver. Not all functions have to be implemented, but we usually do need to implement open(), close(), read(), write() and ioctl().

Declare the functions using some handy typedefs:

static d_open_t		mydev_open;
static d_close_t	mydev_close;
static d_read_t		mydev_read;
static d_write_t	mydev_write;
static d_ioctl_t	mydev_ioctl;

Declare the cdevsw structure:

static struct cdevsw mydev_cdevsw = {
	.d_name     = "mydev",
	.d_version  = D_VERSION,
	.d_flags    = D_TRACKCLOSE,
	.d_open     = mydev_open,
	.d_close    = mydev_close,
	.d_read     = mydev_read,
	.d_write    = mydev_write,
	.d_ioctl    = mydev_ioctl,
};

The D_TRACKCLOSE flag tells the kernel to call d_close() on every close of the device, rather than only on the last one, so that the driver can track each close.

open() and close()

Those two functions are mainly used for resource allocation/deallocation and environment preparation:

static int
mydev_open(struct cdev *dev, int flags, int devtype, struct thread *td)
{
	int error = 0;

	/* do stuff */

	return (error);
}

static int
mydev_close(struct cdev *dev, int flags, int devtype, struct thread *td)
{
	int error = 0;

	/* do stuff */

	return (error);
}

read() and write()

It’s good practice to keep an internal buffer. Below is a very simplified example. The buffer in this example is allocated and deallocated on module load and unload respectively:

#define BUFSIZE (1 << 16)

struct foo {
	char	buf[BUFSIZE + 1];
	size_t	len;
};

static struct foo *foo;

Data to be received or sent back is stored in uio and the copy from user to kernel memory is done through uiomove(9), defined in sys/uio.h:

static int
mydev_read(struct cdev *dev, struct uio *uio, int ioflag)
{
	size_t amnt;
	int v, error = 0;

	/*
	 * Determine how many bytes to copy out: either the bytes remaining
	 * in the buffer past the current offset, or the number of bytes
	 * requested by the caller (uio->uio_resid), whichever is smaller.
	 */
	v = uio->uio_offset >= foo->len + 1 ? 0 : foo->len + 1 - uio->uio_offset;
	amnt = MIN(uio->uio_resid, v);

	/* Move the bytes from foo->buf, starting at the current offset, to uio. */
	if ((error = uiomove(foo->buf + uio->uio_offset, amnt, uio)) != 0) {
		/* error handling */
	}

	/* do stuff */

	return (error);
}

static int
mydev_write(struct cdev *dev, struct uio *uio, int ioflag)
{
	size_t amnt;
	int error = 0;

	/* Do not allow random access. */
	if (uio->uio_offset != 0 && (uio->uio_offset != foo->len))
		return (EINVAL);

	/* We're not appending, reset length. */
	else if (uio->uio_offset == 0)
		foo->len = 0;

	amnt = MIN(uio->uio_resid, (BUFSIZE - foo->len));
	if ((error = uiomove(foo->buf + uio->uio_offset, amnt, uio)) != 0) {
		/* error handling */
	}

	foo->len = uio->uio_offset;
	foo->buf[foo->len] = '\0';

	/* do stuff */

	return (error);
}
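The bounds arithmetic in mydev_read() is easy to get wrong, so it can help to check it in userland first. The sketch below pulls the calculation out into a hypothetical read_amount() helper that takes plain values instead of a struct uio; it is not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Number of bytes a read at `offset` may return from a buffer holding
 * `len` bytes plus a terminating NUL, when the caller asked for `resid`
 * bytes. Hypothetical userland mirror of the logic in mydev_read().
 */
static size_t
read_amount(off_t offset, size_t len, size_t resid)
{
	size_t avail;

	assert(offset >= 0);
	avail = (size_t)offset >= len + 1 ? 0 : len + 1 - (size_t)offset;
	return (resid < avail ? resid : avail);
}
```

For a 5-byte buffer, a large read at offset 0 yields 6 bytes (the data plus the NUL), and a read at offset 6 yields 0, which the caller sees as end of file.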

ioctl()

To create an ioctl, you give it a name and #define it using one of the following _IO* macros defined in sys/ioccom.h:

_IO: No parameters.
_IOR: Copy out parameters. Read from device.
_IOW: Copy in parameters. Write to device.
_IOWR: Copy parameters in and out. Write to device and read the modified data back.

Each of those macros* takes 3 arguments:

An arbitrary one-byte “class” identifier.
A unique ID.
The parameter type (can be anything), which is used to calculate the parameter’s size. The macro expands the type to sizeof(type).

* _IO takes only the first 2 arguments (class and ID) since it doesn’t use parameters.

We can now define a few ioctls that take foo_t as a parameter. This is usually done in a separate header file so that programs can use the ioctls:

#include <sys/ioccom.h>

typedef struct {
	int x;
	int y;
} foo_t;

#define MYDEVIOC_READ	_IOR('a', 1, foo_t)
#define MYDEVIOC_WRITE	_IOW('a', 2, foo_t)
#define MYDEVIOC_RDWR	_IOWR('a', 3, foo_t)

mydev_ioctl() is responsible for handling the ioctls we declared:

static int
mydev_ioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags,
    struct thread *td)
{
	foo_t *fp;
	int error = 0;

	switch (cmd) {
	case MYDEVIOC_READ:
		fp = (foo_t *)addr;
		/* do stuff */
		break;
	case MYDEVIOC_WRITE:
		fp = (foo_t *)addr;
		/* do stuff */
		break;
	case MYDEVIOC_RDWR:
		fp = (foo_t *)addr;
		/* do stuff */
		break;
	default:
		error = ENOTTY;
		break;
	}

	return (error);
}

Creating and destroying the device

Character devices are given a struct cdev handle upon creation, which we usually store as a global variable:

static struct cdev *mydev_cdev;

Devices are created with the make_dev() function, which is defined as:

struct cdev *
make_dev(struct cdevsw *cdevsw, int unit, uid_t uid, gid_t gid, int perms,
    const char *fmt, ...);

sys/conf.h has the definitions of all available flags.

Create the device:

mydev_cdev = make_dev(&mydev_cdevsw, 0, UID_ROOT, GID_WHEEL, 0666, "mydev");

When done, destroy the device:

destroy_dev(mydev_cdev);

Module declaration

Necessary includes:

#include <sys/types.h>
#include <sys/param.h>
#include <sys/conf.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/module.h>
#include <sys/malloc.h>
#include <sys/uio.h>

Implement the module’s event handler. This function is called at module load and unload. Since we’re dealing with a character device, it makes sense to create the device upon load and destroy it upon unload:

static int
mydev_modevent(module_t mod, int type, void *arg)
{
	int error = 0;

	switch (type) {
	case MOD_LOAD:
		mydev_cdev = make_dev(&mydev_cdevsw, 0, UID_ROOT, GID_WHEEL,
		    0666, "mydev");
		foo = malloc(sizeof(*foo), M_MYDEV, M_WAITOK | M_ZERO);
		foo->buf[0] = '\0';
		foo->len = 0;
		break;
	case MOD_UNLOAD: /* FALLTHROUGH */
	case MOD_SHUTDOWN:
		/* Destroy the device first so no handler can use foo after it is freed. */
		destroy_dev(mydev_cdev);
		free(foo, M_MYDEV);
		break;
	default:
		error = EOPNOTSUPP;
		break;
	}

	return (error);
}

Lastly, declare the module. The first argument is the module’s name, the second one is a pointer to the event handler and the last one is any data we want to supply the event handler with, i.e. the arg argument in mydev_modevent():

DEV_MODULE(mydev, mydev_modevent, NULL);

Makefile

KMOD=   mydev
SRCS=   mydev.c

.include <bsd.kmod.mk>

Running the module

$ make
# kldload ./mydev.ko
...
# kldunload ./mydev.ko
$ make clean cleandepend

Testing

To test the module, load it and create a simple program that opens the device and makes a few calls to ioctl(2), read(2) and write(2).
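A minimal sketch of such a test program's core is shown below. It assumes the device node is /dev/mydev and repeats the foo_t type and MYDEVIOC_RDWR definition from the header shown earlier; mydev_exercise() is a hypothetical name. What read(2) returns depends on how the driver's handlers are filled in, so the sketch only checks for errors.

```c
#include <sys/ioctl.h>

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Repeated from the shared header shown earlier. */
typedef struct {
	int x;
	int y;
} foo_t;

#define MYDEVIOC_RDWR	_IOWR('a', 3, foo_t)

/* Open the device, then write, read and issue one ioctl. 0 on success. */
static int
mydev_exercise(const char *path)
{
	const char msg[] = "hello";
	char buf[64];
	foo_t foo = { 1, 2 };
	int fd, rc = -1;

	assert(path != NULL);
	if ((fd = open(path, O_RDWR)) < 0)
		return (-1);
	if (write(fd, msg, strlen(msg)) != (ssize_t)strlen(msg))
		goto out;
	if (read(fd, buf, sizeof(buf)) < 0)
		goto out;
	if (ioctl(fd, MYDEVIOC_RDWR, &foo) < 0)
		goto out;
	rc = 0;
out:
	close(fd);
	return (rc);
}
```

Calling mydev_exercise("/dev/mydev") from main() after kldload should return 0 once the handlers are implemented; if the module is not loaded, open(2) fails and the function returns -1.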

Process queuing using lock files

Wed, 18 May 2022 00:00:00 +1200

Locking will be done using the fcntl(2) system call. I’m aware of lockf(3) and flock(2), but both of them normally use fcntl(2) under the hood, and they are not as portable. More information on file locking.

For a real use-case, I’ve written a notification program which uses the same mechanism, so that notifications can be queued without having to run a daemon, such as D-Bus, in the background.

First create the lock file with write permissions. O_CREAT is used to create the file in case it doesn’t exist already:

#include <err.h>
#include <fcntl.h>
...
char *lockfile = "/tmp/foo.lock";
int fd;

if ((fd = open(lockfile, O_CREAT | O_WRONLY, 0600)) < 0)
	err(1, "open(%s)", lockfile);

Locking commands operate on the flock structure. Before a call to fcntl(2) is made, we need to write the following fields:

struct flock {
	off_t	l_start;	/* starting offset */
	off_t	l_len;		/* len = 0 means until end of file */
	short	l_type;		/* lock type: read/write, etc. */
	short	l_whence;	/* type of l_start */
	...
};

The starting offset, l_start, can be anything, but 0 is what makes the most sense. We’ll set l_len to 0 as well, since we want each process to lock the entire file. The lock type, l_type, needs to be an exclusive lock (F_WRLCK), that is, a lock that prevents any other process from setting a lock on that area before it’s released. l_whence will be set to SEEK_SET to indicate that the relative offset l_start will be measured from the beginning of the file:

struct flock fl;

fl.l_len = 0;
fl.l_start = 0;
fl.l_type = F_WRLCK;
fl.l_whence = SEEK_SET;

The F_SETLKW command will make the calling process wait until the lock request can be satisfied. There’s also F_SETLK, but it returns immediately if the lock is already acquired, which is not very useful for queuing processes:

if (fcntl(fd, F_SETLKW, &fl) < 0)
	err(1, "fcntl(F_SETLKW)");

When we get past the call to fcntl(2), it means that we have acquired the lock until a call to close(2) is made. Here is where we’ll put the part of the code we want to queue, in this case a simple printf followed by a 3-second sleep to make sure queuing really works:

printf("hello from %d\n", getpid());
sleep(3);

When done, release the lock:

close(fd);

To test the code, open two terminals and run the program in both of them. You’ll see that the process started last will not execute the code after fcntl(2) until the first one has finished.
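Putting the steps above together, a minimal sketch might look like the following. `lock_file()` is a made-up helper name for illustration, not part of fcntl(2); it returns the descriptor with the exclusive lock held, and closing that descriptor releases the lock.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open (creating if needed) the lock file and block
 * until an exclusive lock on the whole file is acquired.
 * Returns the locked descriptor, or -1 on error.
 */
int
lock_file(const char *path)
{
	struct flock fl;
	int fd;

	assert(path != NULL);

	if ((fd = open(path, O_CREAT | O_WRONLY, 0600)) < 0)
		return (-1);

	memset(&fl, 0, sizeof(fl));
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means until end of file */
	fl.l_type = F_WRLCK;	/* exclusive lock */
	fl.l_whence = SEEK_SET;

	/* F_SETLKW sleeps until the lock can be granted. */
	if (fcntl(fd, F_SETLKW, &fl) < 0) {
		close(fd);
		return (-1);
	}

	return (fd);
}
```

A caller would obtain the descriptor with `lock_file("/tmp/foo.lock")`, run the code that needs to be queued, and then call `close()` on it to let the next process in.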

FreeBSD sound mixer improvements

Fri, 25 Feb 2022 00:00:00 +1200

This project was part of Google Summer of Code 2021, but development is still active. The development report can be found on the FreeBSD Wiki. The reason behind this project is that FreeBSD’s OSS mixer capabilities were really basic and outdated: even un/muting didn’t exist, and one had to write custom scripts for such a basic task. Setting default audio devices had to be done by tweaking sysctls, and programs needing to use the mixer required DIY implementations, as there was no mixer library available. The project was merged upstream in FreeBSD 14.0.

Kernel patches

Un/muting (commit)

I decided that un/muting is better implemented in sound(4), in order to avoid having to write daemons or use files. This works by implementing the SOUND_MIXER_READ_MUTE and SOUND_MIXER_WRITE_MUTE ioctls, which did exist in older OSS implementations, but were considered obsolete. One thing to note is that their functionality isn’t the same as the old one. In older OSS versions those 2 ioctls took/returned an integer with a value of 0 or 1, which indicated whether the whole mixer was muted or not. My implementation takes/returns a bitmask that tells which devices are muted. This allows us to mute and unmute only the devices we want, instead of the whole mixer. If you’re familiar with the OSS API, this bitmask works the same way as DEVMASK, RECMASK and RECSRC.
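To make the bitmask semantics concrete, here is a small sketch of how such a mask can be manipulated. The `DEV_*` indices and the helper names are made up for illustration (real code would use the SOUND_MIXER_* device constants), and no ioctl is issued here; the resulting mask is simply the kind of value that would be handed to SOUND_MIXER_WRITE_MUTE.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative device indices, assumed for this example only. */
enum { DEV_VOL = 0, DEV_PCM = 4, DEV_MIC = 7 };

/* Return mask with the bit for device dev set (muted). */
unsigned int
mute_set(unsigned int mask, int dev)
{
	assert(dev >= 0 && dev < 32);
	return (mask | (1u << dev));
}

/* Return mask with the bit for device dev cleared (unmuted). */
unsigned int
mute_clear(unsigned int mask, int dev)
{
	assert(dev >= 0 && dev < 32);
	return (mask & ~(1u << dev));
}

/* Tell whether device dev is muted according to mask. */
bool
is_muted(unsigned int mask, int dev)
{
	assert(dev >= 0 && dev < 32);
	return ((mask & (1u << dev)) != 0);
}
```

Each device owns one bit, so muting the microphone leaves every other device's state untouched.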

Playback/recording mode information (commit)

Here I implemented a sysctl (dev.pcm.<N>.mode) which gives information about a device’s playback/recording mode. The rationale for this control is to include /dev/sndstat’s mixer information in the output of the new mixer(8). The sysctl can return the following values (NOTE: these values are OR’ed together if more than one mode is supported):

Value	Meaning
0x01	Mixer
0x02	Playback device
0x04	Recording device
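Since the values are OR’ed together, decoding them is a matter of testing each bit. A minimal sketch follows; the `MODE_*` macro names and `mode_describe()` are made up for this example, only the numeric values come from the table above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Flag values as listed in the table; the macro names are assumptions. */
#define MODE_MIXER	0x01
#define MODE_PLAY	0x02
#define MODE_REC	0x04

/*
 * Write a space-separated description of mode into buf,
 * e.g. "mixer play rec". Returns buf.
 */
char *
mode_describe(int mode, char *buf, size_t size)
{
	size_t len;

	assert(buf != NULL && size > 0);

	snprintf(buf, size, "%s%s%s",
	    (mode & MODE_MIXER) ? "mixer " : "",
	    (mode & MODE_PLAY) ? "play " : "",
	    (mode & MODE_REC) ? "rec " : "");

	/* Drop the trailing space, if any. */
	len = strlen(buf);
	if (len > 0 && buf[len - 1] == ' ')
		buf[len - 1] = '\0';

	return (buf);
}
```

On FreeBSD the value itself could be fetched with sysctlbyname(3) before being passed to a function like this one.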

Userland

mixer(3) implementation (commit)

mixer(3) provides a simple interface for working with the OSS mixer. The man page explains how the library works, including some examples, so there’s no need to repeat myself. You can see the library in action in the source code for mixer(8).

The basic structure of a program looks like this (link with -lmixer):

#include <err.h>
#include <mixer.h>

int
main(int argc, char *argv[])
{
	struct mixer *m;
	const char *name = "/dev/mixer0";

	if ((m = mixer_open(name)) == NULL)
		err(1, "mixer_open(%s)", name);

	/* do stuff */

	mixer_close(m);
	
	return (0);
}

mixer(8) rewrite (commit)

This implementation is a complete rewrite of the old mixer(8) utility. It now uses mixer(3) as a backend and implements all the new features the library provides. It’s got more command line options and works with a control-oriented interface inspired by OpenBSD’s mixerctl(8). Again, everything is detailed in the man page.

Old mixer(8) output:

$ mixer.old

Mixer vol      is currently set to  85:85
Mixer pcm      is currently set to 100:100
Mixer speaker  is currently set to  74:74
Mixer line     is currently set to   1:1
Mixer mic      is currently set to  67:67
Mixer mix      is currently set to  74:74
Mixer rec      is currently set to  37:37
Mixer igain    is currently set to   0:0
Mixer ogain    is currently set to 100:100
Mixer monitor  is currently set to  67:67
Recording source: mic

New mixer(8) output:

$ mixer

pcm0:mixer: <Realtek ALC662 rev3 (Analog 2.0+HP/2.0)> on hdaa0 kld snd_hda (play/rec) (default)
    vol       = 0.85:0.85     pbk
    pcm       = 1.00:1.00     pbk
    speaker   = 0.74:0.74     rec
    line      = 0.01:0.01     rec
    mic       = 0.67:0.67     rec src
    mix       = 0.74:0.74     rec
    rec       = 0.37:0.37     pbk
    igain     = 0.00:0.00     pbk
    ogain     = 1.00:1.00     pbk
    monitor   = 0.67:0.67     rec

Code and manuals

PIC microcontroller development on FreeBSD

Sun, 23 Jan 2022 00:00:00 +1200

Tested on FreeBSD 13.0. This article has also been mirrored on the FreeBSD Wiki.

Prerequisites

sdcc is a C compiler for microprocessors. It says that PIC microprocessors are unmaintained, but I’ve found it to be pretty reliable so far (take this with a grain of salt, I’m no expert). The port can be found under:

lang/sdcc

Page 75 of the sdcc user manual lists the supported PIC devices. Header files can be found under /usr/local/share/sdcc.

For programming the MCU, I’ve found pk2cmd to work alright with PICKit2 (or Chinese clones), but there’s no port for FreeBSD anymore. The Makefile won’t install files properly, so we have some extra work to do afterwards:

$ git clone https://github.com/psmay/pk2cmd.git
$ cd pk2cmd/pk2cmd
# gmake freebsd install clean
# mv /usr/share/pk2/PK2DeviceFile.dat /usr/local/bin
# rm -rf /usr/share/pk2

Supported devices for pk2cmd are listed here.

Detecting and programming the MCU

Avoid using just the -P option to auto-detect the MCU, as the VPP that the PICKit2 applies to the chip while trying to detect it can damage the MCU. Instead, specify the chip name explicitly, as shown below. Also, use the -C option to check if the chip is blank.

If any of the following pk2cmd commands fail, make sure everything really is wired properly:

$ pk2cmd -P PIC16F877A -C
Device is blank

Operation Succeeded

Compile your source code. The target executable is the .hex file sdcc will output. Replace pic14 and 16f877a with the appropriate names for your device:

$ sdcc --use-non-free -mpic14 -p16f877a main.c

Erase the PIC (if it wasn’t already blank) and flash the new code. Again, use the appropriate names:

$ pk2cmd -P PIC16F877A -E
$ pk2cmd -P PIC16F877A -X -M -F main.hex

If all went well, you should get an output similar to this:

PICkit 2 Program Report
23-1-2022, 21:01:29
Device Type: PIC16F877A

Program Succeeded.

Operation Succeeded

C coding style

Sat, 01 Jan 2022 00:00:00 +1200

First of all, I wanna wish all 0 readers of this blog a happy new year! There’s no reason to write a whole article repeating what better articles have already covered, so here’s a list with proper C coding style guides:

Use cases for goto

Tue, 19 Jan 2021 00:00:00 +1200

This article is a response to all my university professors who, for some reason, think goto is useless and should be avoided at all costs.

Some use cases

The most important use case for goto is by far error handling when there is more than one point of failure. In this case, you might want to clean up some resources while also skipping part of the code that should not be executed, without having to deal with flags, helper functions, and other methods that would make the code ugly, slower and error-prone. Try rewriting the following snippet without a goto:

int
foo(int *bar, int *baz)
{
	if (!func1())
		goto fail;
	if (!func2())
		goto fail;
	if (!func3())
		goto fail;

	return 0;

fail:
	warn("foo failed");
	if (bar != NULL)
		free(bar);
	if (baz != NULL)
		free(baz);

	return -1;
}
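As a self-contained illustration of the same pattern, here is a sketch in which the cleanup path owns the resources it frees. `dup_pair()` is a made-up function for this example: it either duplicates both strings or leaves nothing allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example: duplicate two strings, or neither.
 * On success returns 0 and stores the copies in *out_a and *out_b.
 * On failure returns -1 and leaves nothing allocated.
 */
int
dup_pair(const char *a, const char *b, char **out_a, char **out_b)
{
	char *ca = NULL, *cb = NULL;

	assert(out_a != NULL && out_b != NULL);

	if (a == NULL || b == NULL)
		goto fail;
	if ((ca = malloc(strlen(a) + 1)) == NULL)
		goto fail;
	if ((cb = malloc(strlen(b) + 1)) == NULL)
		goto fail;
	strcpy(ca, a);
	strcpy(cb, b);

	*out_a = ca;
	*out_b = cb;

	return (0);

fail:
	/* free(NULL) is a no-op, so no checks are needed here. */
	free(ca);
	free(cb);

	return (-1);
}
```

Every failure point jumps to the same label, so the cleanup code is written once no matter how many steps are added later.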

Another use case is breaking out of deeply nested code. Let’s say you’ve got 3 for loops and there’s a special case in which you really want to break out of all the loops at once. How do you do that? There are multiple ways to go about it, but one way would be to set a flag and check it on every nested level.

flag = 0;

for (i = 0; i < 10; i++) {
	for (j = 0; j < 10; j++) {
		for (k = 0; k < 10; k++) {
			...
			if (flag)
				break;
		}
		if (flag)
			break;
	}
	if (flag)
		break;
}

Another ugly hack you can use is something another colleague from university showed me, and something I would never use; when the flag is set, manually max out all the loop counters.

A pretty straightforward solution would also be to put the loops into a function and use a return statement to break out of all of them. That’s actually a good solution, and I’m aware of it, but I want to provide another one, which is also somewhat faster than using a function since it avoids the additional function call.

An alternative, and in my opinion, better way of solving this problem would be by using a (don’t say it, don’t say it) goto:

flag = 0;

for (i = 0; i < 10; i++) {
	for (j = 0; j < 10; j++) {
		for (k = 0; k < 10; k++) {
			...
			if (flag)
				goto end;
		}
	}
}
end:
...

Who cares, anyway?

In the first use case, the code is much more readable and you avoid code duplication. In the second use case the goto solution also saves some work. The reason is simple: the flag solution checks flag at every nesting level, which means that if flag is never set, we’ll have done 10 * 10 * 10 = 1000 checks in the innermost loop, plus 10 * 10 = 100 in the middle loop and 10 in the outer one, 1110 checks in total. And that’s just with 3 for loops going from 0 to 10 each; think how this scales if you increase the iterations or the nesting depth. The goto solution checks only in the innermost loop, so in the same scenario it does 1000 checks and the 110 outer checks disappear entirely.
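The number of checks each version performs can be measured directly by instrumenting the loops with a counter. The following is a sketch, with made-up function names, for the case where flag is never set:

```c
#include <assert.h>

/* Count the flag checks made by the version that checks at every level. */
int
count_flag_checks(int n)
{
	int i, j, k, flag = 0, checks = 0;

	assert(n >= 0);

	for (i = 0; i < n; i++) {
		for (j = 0; j < n; j++) {
			for (k = 0; k < n; k++) {
				checks++;
				if (flag)
					break;
			}
			checks++;
			if (flag)
				break;
		}
		checks++;
		if (flag)
			break;
	}

	return (checks);
}

/* Count the flag checks made by the goto version. */
int
count_goto_checks(int n)
{
	int i, j, k, flag = 0, checks = 0;

	assert(n >= 0);

	for (i = 0; i < n; i++) {
		for (j = 0; j < n; j++) {
			for (k = 0; k < n; k++) {
				checks++;
				if (flag)
					goto end;
			}
		}
	}
end:
	return (checks);
}
```

Calling both with the same n and comparing the results shows how many checks the outer levels contribute.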

Using a function is almost as fast as using a goto, although avoiding the function call is generally slightly faster. Both solutions are perfectly valid; I just want to show an alternative one.

Final note

goto does have its place, but it should be used carefully; if you overuse it, your code will become either incomprehensible or flat-out broken. The use cases I showcased in this post are very common, and sometimes the code can be vastly improved with just a simple goto if used correctly.

Again, thanks to both my colleagues who helped me improve this article with their recommendations.

Arduino on FreeBSD

Wed, 28 Oct 2020 00:00:00 +1200

This article demonstrates how to develop for Arduino boards using only basic command line utilities, without having to use the Arduino IDE. The article has also been published on the FreeBSD Wiki.

Tested on FreeBSD 12.2 and above.

Prerequisites

Required ports:

devel/arduino-core
devel/arduino-bsd-mk
devel/avr-gcc
devel/avr-libc
devel/avrdude
comms/uarduno

With all the software installed, add the following line to /boot/loader.conf in case you want the Arduino kernel module to load automatically on boot. If you want to manually load the module whenever you need it, skip this step:

uarduno_load="YES"

Load the kernel module:

# kldload uarduno

Check your ~/.arduino/preferences.txt and see if the following lines exist (source):

serial.port=/dev/cuaU0
launcher=/usr/local/bin/firefox

Add your user to the dialer group:

# pw group mod dialer -m $USER

Connecting the board

Standard Arduino boards connect as /dev/cuaU0 and/or /dev/ttyU0 on FreeBSD. In case these serial ports don’t show up in /dev, you might need to press your board’s reset button. After you’ve plugged your board into a USB port, you should get the following output from dmesg. Although the output may vary, the important thing is that your board is connected and detected.

ugen1.5: <Arduino (www.arduino.cc) product 0x0043> at usbus1
uarduno0: <Arduino (www.arduino.cc) product 0x0043, class 2/0, rev 1.10/0.01, addr 5> on usbus1

If dmesg returned information about your board, you should also see cuaU0 and/or ttyU0 in /dev. If your board is still not detected, and assuming it’s not a fake one, try using a different USB cable or reset it again, and make sure you’ve followed the setup steps correctly.

The Makefile

The only thing you’re going to need in order to get started is just a Makefile that’ll be used to compile and upload your Arduino programs. Make a new directory for your Arduino project and a Makefile with the following lines:

ARDUINO_DIR= 	/usr/local/arduino
ARDUINO_MK_DIR= /usr/local/arduino-bsd-mk
#ARDUINO_LIBS=	
AVRDUDE_PORT=	your_board_port
ARDUINO_BOARD= 	your_board_name
SRCS=		your_source_files
TARGET=		your_program_name

include /usr/local/arduino-bsd-mk/bsd.arduino.mk

In my case my board is an Arduino Uno, so I’d have to set ARDUINO_BOARD to uno. You can see which other board types are available in /usr/local/arduino/hardware/arduino/avr/boards.txt. If you want to install new libraries, copy them over to /usr/local/arduino/hardware/arduino/avr/libraries/.

Avoid having source files named main.

Building and uploading a program

Write some Arduino code, and when you’re ready to compile and upload, run the following command:

# make install flash clean cleandepend

If all went well you should see the board executing the new code. If it doesn’t, try to see what errors the Makefile produced.

Monitoring

The Arduino IDE provides a serial monitor feature, but FreeBSD has a built-in monitoring utility which can be accessed directly from the terminal. Run this whenever you want to monitor your board and exit with ~. (use the appropriate port):

$ cu -l /dev/cuaU0

Using board types other than the Uno

As mentioned above, we’re using the uarduno kernel module. Even though the module’s description is “FreeBSD Kernel Driver for the Arduino Uno USB interface”, you can, in fact, use board types other than the Uno. According to uarduno’s website, you can modify /usr/ports/comms/uarduno/files/ids.txt to include more board types; the two fields are Vendor ID and Product ID. Read the comments inside the file for more information.

{ 0x2341, 0x0001 },  // Arduino UNO, vendor 2341H, product 0001H
{ 0x2341, 0x0042 },  // Arduino MEGA (rev 3), vendor 2341H, product 0042H
{ 0x2341, 0x0043 },  // Arduino UNO (rev 3), vendor 2341H, product 0043H
{ 0x2341, 0x0010 },  // Arduino MEGA 2560 R3, vendor 2341H, product 0010H 
{ 0x2341, 0x8037 },  // Arduino Micro

When you’re done, clean and re-build the port.

Known issues and their fixes

Even though you might have plugged your board into your machine, you might notice that there is no device appearing in /dev. Although there is no definite answer as to why this happens, make sure that the USB cable is connected properly; on some boards, you have to hear a click sound.

When trying to use a new library, you might notice that your code doesn’t compile. A common issue is that you haven’t stored the library in the correct path. As mentioned, libraries are stored in /usr/local/arduino/hardware/arduino/avr/libraries/, so you have to move it there.

Simple Brainfuck interpreter in C

Wed, 02 Sep 2020 00:00:00 +1200

How Brainfuck works

There are only 8 symbols supported in Brainfuck:

Symbol	Function
>	Increase position of pointer
<	Decrease position of pointer
+	Increase value of pointer
-	Decrease value of pointer
[	Beginning of loop
]	End of loop
.	Output ASCII code of pointer
,	Read a character and store its ASCII value in pointer

It’s best to imagine Brainfuck programs as arrays of integers, which the pointer can manipulate. Let’s say this is our initial state:

{ 0, 0, 0, 0, 0, 0 }

We can assign values to each position in the array by moving the pointer around. Using the + and - symbols, we can increment or decrement by 1 each time. If we wanted to move the pointer two times to the right and increment the value there by 3, we would have a program that looks like this:

>>+++

The updated version of the array:

{ 0, 0, 3, 0, 0, 0 }

Following the same logic, we can assign specific values to each cell and make an actual program. If we wanted to print the letter “B” on the screen, which corresponds to the ASCII value 66, we could write the following program:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++.

In order to avoid writing things like this we can use loops. A loop runs for as long as the cell the pointer is at is not 0. Essentially, the following program does the multiplication 11 x 6 = 66 and then prints the value:

+++++ +++++ +		# add 11 to cell #0
[			# beginning of loop
	> +++ +++	# add 6 to cell #1
	< -		# subtract 1 from cell #0
]			# end of loop
			# value at cell 1 is now 66 (11 x 6 = 66)
> .			# go to cell 1 and print its value

Or, for compactness:

+++++++++++[>++++++<-]>.
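In C terms, the loop in the compact program above behaves roughly like the following sketch. `bf_multiply()` is a made-up name, and the two-element array stands in for cells #0 and #1.

```c
#include <assert.h>

/*
 * C equivalent of "set cell 0 to count, then [ > add step < - ]":
 * cell 1 ends up holding count * step.
 */
int
bf_multiply(int count, int step)
{
	unsigned short cells[2] = { 0, 0 };

	assert(count >= 0 && step >= 0);

	cells[0] = count;
	while (cells[0] != 0) {		/* [ */
		cells[1] += step;	/* > ++++++ */
		cells[0]--;		/* < - */
	}				/* ] */

	return (cells[1]);
}
```

With eleven iterations of six, `bf_multiply(11, 6)` yields 66, the ASCII code for “B”.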

You can learn more about how Brainfuck works here.

Building the interpreter

We’ll first read the Brainfuck source from stdin into a fixed-size buffer. 50,000 bytes should be large enough to store any Brainfuck program, since I doubt anyone is mad enough to write actual programs in it:

#define BUFSIZE 50000
...
size_t len = 0;
char buf[BUFSIZE];

/* Stop one byte short of the buffer to leave room for the terminator. */
while (len < BUFSIZE - 1 && read(STDIN_FILENO, &buf[len], 1) > 0)
	len++;
buf[len] = '\0';

We’ll declare the rest of the needed variables:

int closed;		/* nesting depth while scanning back for a [ */
int opened;		/* nesting depth while scanning forward for a ] */
int pos = 0;		/* position in the program */
unsigned short *pc;	/* the "pointer" into the cell array */
char *src;		/* source code */

One of the reasons we have a len variable is to allocate just enough memory for src: len bytes plus one for the terminating NUL that strcpy(3) copies. We’ll also zero the whole buffer because we now want to use it to store the values the Brainfuck program will produce:

if ((src = malloc(len + 1)) == NULL) {
	perror("malloc");
	exit(1);
}
strcpy(src, buf);
memset(buf, 0, sizeof(buf));

We can now parse the source code symbol by symbol. pc will act as the “pointer”, moving back and forth in the array. Each symbol will have its own case inside the following switch statement:

for (pc = (unsigned short *)buf, pos = 0; pos < len; pos++) {
	switch (src[pos]) {
		...
	}
}

For the < and > symbols we simply move the pointer:

case '>':
	pc++;
	break;
case '<':
	pc--;
	break;

The + and - symbols in/decrement the value of the cell the pointer is currently at:

case '+':
	(*pc)++;
	break;
case '-':
	(*pc)--;
	break;

To implement the . and , symbols we’ll use the standard library’s putchar() and getchar() functions:

case '.':
	putchar(*pc);
	break;
case ',':
	*pc = getchar();
	break;

Now comes the last, but hardest, part, which is to implement loops. The logic behind my implementation is that instead of keeping track of every bracket to know where a loop starts and ends, the program scans through the source code with a counter: a loop starts or ends where that counter is 0 and the opposite bracket has been found. Also, in each iteration the pos variable changes accordingly so that we can imitate the looping behavior of Brainfuck.

These are the steps the program follows for each of the two symbols:

Beginning of loop: `[`

If the pointer’s value is 0, the loop body must be skipped, so we scan forward for the matching ].
Count how many (if any) nested loops we encounter along the way.
If we encounter a [, increment the opened variable.
If we encounter a ], decrement the opened variable.
Stop at the first ] found while there are no open nested loops (i.e. opened is 0).

case '[':
	if (!(*pc)) {
		for (opened = 0, pos++; pos < len; pos++) {
			if (src[pos] == ']' && !opened)
				break;
			else if (src[pos] == '[')
				opened++;
			else if (src[pos] == ']')
				opened--;
		}
	}
	break;

End of loop: `]`

If the pointer’s value is not 0, the loop is still active, so we scan backward for the matching [.
Count how many (if any) nested loops there are along the way.
If we encounter a ], increment the closed variable.
If we encounter a [, decrement the closed variable.
Stop at the first [ found while there are no closed nested loops (i.e. closed is 0).

case ']':
	if (*pc) {
		for (closed = 0, pos--; pos >= 0; pos--) {
			if (src[pos] == '[' && !closed)
				break;
			else if (src[pos] == ']')
				closed++;
			else if (src[pos] == '[')
				closed--;
		}
	}
	break;
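The two scans can also be looked at in isolation. The sketch below pulls them out into standalone helpers so they can be tried on a small program; the function names and signatures are made up for this example and are not part of the interpreter itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given pos pointing at a '[', return the index of its matching ']',
 * or len if there is none.
 */
size_t
match_forward(const char *src, size_t len, size_t pos)
{
	int depth = 0;

	assert(pos < len && src[pos] == '[');

	for (pos++; pos < len; pos++) {
		if (src[pos] == ']' && depth == 0)
			break;
		else if (src[pos] == '[')
			depth++;
		else if (src[pos] == ']')
			depth--;
	}

	return (pos);
}

/*
 * Given pos pointing at a ']', return the index of its matching '[',
 * or -1 if there is none.
 */
long
match_backward(const char *src, long pos)
{
	int depth = 0;

	assert(pos >= 0 && src[pos] == ']');

	for (pos--; pos >= 0; pos--) {
		if (src[pos] == '[' && depth == 0)
			break;
		else if (src[pos] == ']')
			depth++;
		else if (src[pos] == '[')
			depth--;
	}

	return (pos);
}
```

For the program `+[>[-]<].`, the outer brackets sit at indices 1 and 7 and the inner ones at 3 and 5, and each helper maps one bracket of a pair to the other.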

Putting it all together

Below is the full program. You can also find this program here.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE 50000

int
main(int argc, char *argv[])
{
	size_t len = 0;
	int closed, opened, pos = 0;
	unsigned short *pc;
	char buf[BUFSIZE], *src;

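	/* Stop one byte short of the buffer to leave room for the terminator. */
	while (len < BUFSIZE - 1 && read(STDIN_FILENO, &buf[len], 1) > 0)
		len++;
	buf[len] = '\0';

	/* One extra byte for the terminating NUL copied by strcpy(3). */
	if ((src = malloc(len + 1)) == NULL) {
		perror("malloc");
		exit(1);
	}
	strcpy(src, buf);
	/* Zero the whole buffer: it now holds the cells. */
	memset(buf, 0, sizeof(buf));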

	for (pc = (unsigned short *)buf; pos < len; pos++) {
		switch (src[pos]) {
		case '>':
			pc++;
			break;
		case '<':
			pc--;
			break;
		case '+':
			(*pc)++;
			break;
		case '-':
			(*pc)--;
			break;
		case '.':
			putchar(*pc);
			break;
		case ',':
			*pc = getchar();
			break;
		case '[':
			if (!(*pc)) {
				for (opened = 0, pos++; pos < len; pos++) {
					if (src[pos] == ']' && !opened)
						break;
					else if (src[pos] == '[')
						opened++;
					else if (src[pos] == ']')
						opened--;
				}
			}
			break;
		case ']':
			if (*pc) {
				for (closed = 0, pos--; pos >= 0; pos--) {
					if (src[pos] == '[' && !closed)
						break;
					else if (src[pos] == ']')
						closed++;
					else if (src[pos] == '[')
						closed--;
				}
			}
			break;
		}
	}
	free(src);

	return (0);
}
