Christos Margiolis: programming Christos Margiolis http://margiolis.net/tags/programming/ Inline function tracing with the kinst DTrace provider http://margiolis.net/w/kinst_inline/ Tue, 18 Jul 2023 00:00:00 +1200 <h2 id="table-of-contents">Table of Contents</h2> <ol> <li><a href="#quick-background">Quick background</a></li> <li><a href="#usage">Usage</a></li> <li><a href="#inline-function-tracing">Inline function tracing</a> <ul> <li><a href="#how-it-works">How it works</a></li> <li><a href="#syntactic-transformations">Syntactic transformations</a></li> <li><a href="#heuristic-entry-return">Heuristic for calculating the <code>entry</code> and <code>return</code> offsets</a></li> </ul> </li> </ol> <h2 id="quick-background">Quick background</h2> <p><a href="https://illumos.org/books/dtrace/preface.html">DTrace</a> is a framework that gives administrators and kernel developers the ability to observe kernel behavior in real time. DTrace has modules called &ldquo;providers&rdquo;, that perform a particular instrumentation in the kernel (and sometimes userland) using &ldquo;probes&rdquo;.</p> <p><a href="http://margiolis.net/files/kinst.pdf">kinst</a> is a new low-level DTrace provider co-authored by Christos Margiolis and Mark Johnston for the FreeBSD operating system, which allows the user to trace arbitrary instructions in kernel functions. It is part of the base system as of FreeBSD 14.0.</p> <p>kinst probes take the form of <code>kinst::&lt;function&gt;:&lt;instruction&gt;</code>, where <code>&lt;function&gt;</code> is the kernel function to be traced, and <code>&lt;instruction&gt;</code> is the offset to the instruction, relative to the beginning of the function, and can be obtained from the function&rsquo;s disassembly. If the <code>&lt;instruction&gt;</code> field is left empty, kinst will trace all instructions in that function. 
Unlike <a href="https://illumos.org/books/dtrace/chp-fbt.html">FBT</a>, kinst can also trace the entry and return points of inline functions (see <a href="#inline-function-tracing">Inline function tracing</a>).</p> <p>The name is inspired by an <a href="https://www.usenix.org/legacy/publications/library/proceedings/osdi99/full_papers/tamches/tamches.pdf">early paper written by A. Tamches and B. Miller</a> discussing a tracing tool they developed called &ldquo;KernInst&rdquo;.</p> <h2 id="usage">Usage</h2> <p>Find the offset corresponding to the third instruction in <code>vm_fault()</code> and trace it, printing the contents of the RSI register:</p> <pre tabindex="0"><code># kgdb
(kgdb) disas /r vm_fault
Dump of assembler code for function vm_fault:
0xffffffff80f4e470 &lt;+0&gt;: 55 push %rbp
0xffffffff80f4e471 &lt;+1&gt;: 48 89 e5 mov %rsp,%rbp
0xffffffff80f4e474 &lt;+4&gt;: 41 57 push %r15
...
# dtrace -n &#39;kinst::vm_fault:4 {printf(&#34;%#x&#34;, regs[R_RSI]);}&#39;
2 81500 vm_fault:4 0x827c56000
2 81500 vm_fault:4 0x827878000
2 81500 vm_fault:4 0x1fab9bef0000
2 81500 vm_fault:4 0xe16cf749000
0 81500 vm_fault:4 0x13587c366000
^C
</code></pre><p>Trace the return point of <code>critical_enter()</code>, which is an inline function:</p> <pre tabindex="0"><code># dtrace -n &#39;kinst::critical_enter:return&#39;
dtrace: description &#39;kinst::critical_enter:return&#39; matched 130 probes
CPU ID FUNCTION:NAME
1 71024 spinlock_enter:53
0 71024 spinlock_enter:53
1 70992 uma_zalloc_arg:49
1 70925 malloc_type_zone_allocated:21
1 70994 uma_zfree_arg:365
1 70924 malloc_type_freed:21
1 71024 spinlock_enter:53
0 71024 spinlock_enter:53
0 70947 _epoch_enter_preempt:122
0 70949 _epoch_exit_preempt:28
^C
</code></pre><h2 id="inline-function-tracing">Inline function tracing</h2> <h3 id="how-it-works">How it works</h3> <p>To trace inline functions, libdtrace uses the <a href="http://margiolis.net/w/dwarf_inline">DWARF Debugging Standard</a> to detect if 
the function specified is an inline call. If it is, the D syntax is transformed to create kinst probes for each of the inline copies found. All work is done in libdtrace, instead of kinst(4). This feature has been added to FreeBSD with <a href="https://reviews.freebsd.org/D38825">this patch</a>.</p> <p>Contrary to how kinst expects a <code>&lt;function&gt;:&lt;instruction&gt;</code> tuple to create probes, for inline functions, <code>&lt;instruction&gt;</code> is replaced by <code>entry</code> or <code>return</code>.</p> <h3 id="syntactic-transformations">Syntactic transformations</h3> <p>Suppose the user wants to trace a probe of the form:</p> <pre tabindex="0"><code>kinst::&lt;func&gt;:&lt;entry|return&gt;
/&lt;pred&gt;/
{
	&lt;acts&gt;
}
</code></pre><p>libdtrace sees that we have specified <code>entry</code> or <code>return</code> instead of an offset (which is what a regular kinst probe would use), so it loops through all loaded kernel modules and parses their DWARF and ELF info to see if this function is an inline. If it is <em>not</em>, the probe is converted to an FBT one, so that we don&rsquo;t duplicate FBT&rsquo;s functionality in kinst:</p> <pre tabindex="0"><code># dtrace -dn &#39;kinst::malloc:entry {exit(0);}&#39;
fbt::malloc:entry
{
	exit(0x0);
}
dtrace: description &#39;kinst::malloc:entry &#39; matched 1 probe
CPU ID FUNCTION:NAME
2 31144 malloc:entry
</code></pre><p>If, however, the function <em>is</em> an inline, libdtrace will find all calls referring to this function and create new probes for each one of the inline copies found.</p> <pre tabindex="0"><code># dtrace -dn &#39;kinst::cam_iosched_has_more_trim:entry { printf(&#34;\t%d\t%s&#34;, pid, execname); }&#39;
kinst::cam_iosched_get_trim:13,
kinst::cam_iosched_next_bio:13,
kinst::cam_iosched_schedule:40
{
	printf(&#34;\t%d\t%s&#34;, pid, execname);
}
dtrace: description &#39;kinst::cam_iosched_has_more_trim:entry &#39; matched 4 probes
CPU ID FUNCTION:NAME
0 81502 cam_iosched_schedule:40 2 
clock
0 81501 cam_iosched_next_bio:13 2 clock
2 81502 cam_iosched_schedule:40 2 clock
1 81502 cam_iosched_next_bio:13 0 kernel
1 81503 cam_iosched_schedule:40 0 kernel
^C
</code></pre><p>There can also be both inline and non-inline definitions of the same function. In this case, libdtrace creates an additional FBT probe for the non-inline definition.</p> <p>The <code>-d</code> flag used in these examples dumps the D script after libdtrace has applied syntactic transformations. It was added to DTrace in <a href="https://cgit.freebsd.org/src/commit/?id=1e136a9cbd3a9d137037e47a53c1dba3be7f6925">commit 1e136a9cbd3a</a>.</p> <h3 id="heuristic-entry-return">Heuristic for calculating the <code>entry</code> and <code>return</code> offsets</h3> <p>libdtrace reuses parts of the mechanism implemented in my <a href="https://git.sr.ht/~crm/inlinecall">inlinecall(1)</a> program, which finds and prints all call sites of a given inline function:</p> <pre tabindex="0"><code>$ ./inlinecall vm_page_mvqueue /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/vm/vm_page.c:4142
	[0xffffffff80f91541 - 0xffffffff80f91599] /usr/src/sys/vm/vm_page.c:4195 vm_page_readahead_finish()
	[0xffffffff80f915f5 - 0xffffffff80f91603] /usr/src/sys/vm/vm_page.c:4195 vm_page_readahead_finish()
	[0xffffffff80f9163d - 0xffffffff80f916c2] /usr/src/sys/vm/vm_page.c:4184 vm_page_activate()
	[0xffffffff80f916cd - 0xffffffff80f916e5] /usr/src/sys/vm/vm_page.c:4184 vm_page_activate()
	[0xffffffff80f916fe - 0xffffffff80f91747] /usr/src/sys/vm/vm_page.c:4195 vm_page_deactivate()
	[0xffffffff80f91750 - 0xffffffff80f91768] /usr/src/sys/vm/vm_page.c:4195 vm_page_deactivate()
	[0xffffffff80f94a59 - 0xffffffff80f94aa9] /usr/src/sys/vm/vm_page.c:4195 vm_page_reclaim_contig_domain()
	[0xffffffff80f94de4 - 0xffffffff80f94df9] /usr/src/sys/vm/vm_page.c:4195 vm_page_reclaim_contig_domain()
	[0xffffffff80f9661e - 0xffffffff80f96667] /usr/src/sys/vm/vm_page.c:4202 vm_page_deactivate_noreuse()
	[0xffffffff80f96670 - 0xffffffff80f96688] 
/usr/src/sys/vm/vm_page.c:4202 vm_page_deactivate_noreuse()
	[0xffffffff80f9669e - 0xffffffff80f966ea] /usr/src/sys/vm/vm_page.c:4212 vm_page_launder()
	[0xffffffff80f966f3 - 0xffffffff80f9670b] /usr/src/sys/vm/vm_page.c:4212 vm_page_launder()
	[0xffffffff80f96d4c - 0xffffffff80f96dac] /usr/src/sys/vm/vm_page.c:4212 vm_page_advise()
	[0xffffffff80f96dac - 0xffffffff80f96e07] /usr/src/sys/vm/vm_page.c:4202 vm_page_advise()
</code></pre><p>Most of the entries above appear twice but with different boundaries:</p> <pre tabindex="0"><code>	[0xffffffff80f9163d - 0xffffffff80f916c2] /usr/src/sys/vm/vm_page.c:4184 vm_page_activate()
	[0xffffffff80f916cd - 0xffffffff80f916e5] /usr/src/sys/vm/vm_page.c:4184 vm_page_activate()
</code></pre><p>This means that the inline copy&rsquo;s boundaries are split into more than one part, which can be caused by having early <code>return</code>s inside the inline function. Under this assumption, we can deduce that the entry point of the function is <code>0xffffffff80f9163d</code>, and each upper boundary is a return point (i.e. <code>0xffffffff80f916c2</code> and <code>0xffffffff80f916e5</code>), so we end up with one <code>entry</code> and two <code>return</code> trace points.</p> <p>However, this is <strong>not exactly true</strong>, because the final return address given by DWARF corresponds to the instruction <em>after</em> the actual return instruction. We can verify this in GDB using the two return addresses from the example above:</p> <pre tabindex="0"><code># kgdb
(kgdb) disas vm_page_activate
...
0xffffffff80f916c0 &lt;+144&gt;: jmp 0xffffffff80f91673 &lt;vm_page_activate+67&gt;
0xffffffff80f916c2 &lt;+146&gt;: add $0x8,%rsp &lt;-- first address given by DWARF
... 
0xffffffff80f916e0 &lt;+176&gt;: call 0xffffffff80bed360 &lt;panic&gt; &lt;-- last instruction
                                                          &lt;-- second address should be here
</code></pre><p>It turns out that <code>0xffffffff80f916e5</code> is, in fact, outside <code>vm_page_activate()</code> altogether!</p> <pre tabindex="0"><code>(kgdb) x/i 0xffffffff80f916e5
0xffffffff80f916e5: data16 cs nopw 0x0(%rax,%rax,1)
</code></pre><p>After running a couple of tests, I came to the conclusion that there are three possible cases with return addresses:</p> <ul> <li>If <a href="http://margiolis.net/w/dwarf_inline/#lowpc-highpc">the inline copy&rsquo;s DIE has <code>DW_AT_low_pc</code> and <code>DW_AT_high_pc</code> set</a>, the return address is <em>always</em> outside the inline function&rsquo;s boundaries.</li> <li>If <a href="http://margiolis.net/w/dwarf_inline/#ranges">the inline copy&rsquo;s DIE has <code>DW_AT_ranges</code> set</a>, only the last return address is outside the inline function&rsquo;s boundaries.</li> <li>Combining the two bullets above, if the return address of the inline function is the same as the upper boundary of the caller function, it&rsquo;s outside <em>both</em> the inline copy&rsquo;s and the caller function&rsquo;s boundaries.</li> </ul> <p>In order to fix this, we have to go one instruction back whenever we come across one of those 3 cases.</p> <p>Finally, the <code>entry</code> offset is calculated as:</p> <pre tabindex="0"><code>inline_bound_lo - caller_bound_lo
</code></pre><p>And the <code>return</code> one, including the modifications (if any) discussed above, as:</p> <pre tabindex="0"><code>inline_bound_hi - caller_bound_lo
</code></pre><p>These offsets are then used to create regular kinst probes of the form <code>kinst::&lt;func&gt;:&lt;instruction&gt;</code>, which is what kinst actually expects:</p> <pre tabindex="0"><code># dtrace -dn &#39;kinst::vm_page_mvqueue:entry,kinst::vm_page_mvqueue:return&#39;
dtrace: description 
&#39;kinst::vm_page_mvqueue:entry,kinst::vm_page_mvqueue:return&#39; matched 22 probes kinst::vm_page_readahead_finish:33, kinst::vm_page_readahead_finish:121, kinst::vm_page_activate:13, kinst::vm_page_deactivate:14, kinst::vm_page_reclaim_contig_domain:1961, kinst::vm_page_deactivate_noreuse:14, kinst::vm_page_launder:14, kinst::vm_page_advise:236, kinst::vm_page_advise:332, kinst::vm_page_readahead_finish:220, kinst::vm_page_activate:146, kinst::vm_page_activate:176, kinst::vm_page_deactivate:87, kinst::vm_page_deactivate:115, kinst::vm_page_reclaim_contig_domain:2041, kinst::vm_page_reclaim_contig_domain:2884, kinst::vm_page_deactivate_noreuse:87, kinst::vm_page_deactivate_noreuse:115, kinst::vm_page_launder:90, kinst::vm_page_launder:118, kinst::vm_page_advise:330, kinst::vm_page_advise:421 { } CPU ID FUNCTION:NAME 3 95381 vm_page_activate:13 3 95389 vm_page_activate:146 2 95381 vm_page_activate:13 2 95389 vm_page_activate:146 1 95387 vm_page_advise:332 1 95400 vm_page_advise:421 1 95387 vm_page_advise:332 1 95400 vm_page_advise:421 1 95387 vm_page_advise:332 ^C </code></pre> Using DWARF to find call sites of inline functions http://margiolis.net/w/dwarf_inline/ Tue, 07 Feb 2023 00:00:00 +1200 <h2 id="table-of-contents">Table of contents</h2> <ol> <li><a href="#what-is-dwarf">What is DWARF?</a></li> <li><a href="#inline-functions-in-dwarf">Inline functions in DWARF</a></li> <li><a href="#calculating-call-boundaries">Calculating call boundaries</a> <ul> <li><a href="#lowpc-highpc">The DIE has <code>DW_AT_low_pc</code> and <code>DW_AT_high_pc</code></a></li> <li><a href="#ranges">The DIE has <code>DW_AT_ranges</code></a></li> </ul> </li> <li><a href="#finding-the-caller-function">Finding the caller function</a></li> <li><a href="#inlinecall">inlinecall(1)</a> <ul> <li><a href="#nested-inline-functions">Nested inline functions</a></li> </ul> </li> </ol> <h2 id="what-is-dwarf">What is DWARF?</h2> <p>From the <a href="https://dwarfstd.org/">DWARF Debugging 
Standard&rsquo;s documentation</a>:</p> <blockquote> <p>This document defines a format for describing programs to facilitate user source level debugging. This description can be generated by compilers, assemblers and linkage editors. It can be used by debuggers and other tools.</p> </blockquote> <p>Debugging information entries (DIEs) are represented as a tree, one per compilation unit (CU). Each DIE has a tag (<code>DW_TAG_*</code>, see DWARF PDF Figure 1) denoting its class, and attributes (<code>DW_AT_*</code>, see DWARF PDF Figure 2) denoting its various characteristics.</p> <p>The next entry of a DIE is a child DIE. If a DIE doesn&rsquo;t have children, the next entry is a &ldquo;sibling&rdquo;.</p> <p>Consider the following structure:</p> <pre tabindex="0"><code>CU1 (DW_TAG_compile_unit)
	func1 (DW_TAG_subprogram)
		DW_AT_foo
		DW_AT_bar
		func2 (DW_TAG_subprogram)
			DW_AT_foo
			DW_AT_bar
	myvar (DW_TAG_variable)
		DW_AT_foo
		DW_AT_bar
CU2 (DW_TAG_compile_unit)
	...
</code></pre><p><code>CU1</code> has <code>func1</code> and <code>myvar</code> as children and <code>CU2</code> as a sibling. <code>func1</code> has <code>func2</code> as a child.</p> <p>The debug file can be generated by compiling with the <code>-g</code> option. To dump DWARF info you can use <code>readelf -wi &lt;file&gt;</code> or <code>dwarfdump &lt;file&gt;</code>.</p> <h2 id="inline-functions-in-dwarf">Inline functions in DWARF</h2> <p>DIEs of inline function declarations have the <code>DW_TAG_subprogram</code> tag and the <code>DW_AT_inline</code> attribute. 
DIEs of inline copies of this function will have the <code>DW_TAG_inlined_subroutine</code> tag.</p> <p>Attributes that inline copies can have include:</p> <ul> <li><code>DW_AT_abstract_origin</code>: DIE offset to the inline declaration.</li> <li><code>DW_AT_call_file</code>: Integer denoting the file the function is called in.</li> <li><code>DW_AT_call_line</code>: Line of the call within that file.</li> <li><code>DW_AT_call_column</code>: Column of the call within that line.</li> <li><code>DW_AT_low_pc</code> and <code>DW_AT_high_pc</code>: Lower and upper call boundaries. <a href="#lowpc-highpc">Explained in the next section</a>.</li> <li><code>DW_AT_ranges</code>: <a href="#ranges">Explained in the next section</a>.</li> </ul> <p>For example, if we dump the DWARF info for my FreeBSD kernel:</p> <pre tabindex="0"><code>$ readelf -wi /usr/lib/debug/boot/kernel/kernel.debug &gt; ~/foo
</code></pre><p>We find that <code>vfs_freevnodes_dec</code> gets inlined:</p> <pre tabindex="0"><code> &lt;1&gt;&lt;1dfa144&gt;: Abbrev Number: 94 (DW_TAG_subprogram)
    &lt;1dfa145&gt; DW_AT_name : (indirect string) vfs_freevnodes_dec
    &lt;1dfa149&gt; DW_AT_decl_file : 1
    &lt;1dfa14a&gt; DW_AT_decl_line : 1447
    &lt;1dfa14c&gt; DW_AT_prototyped : 1
    &lt;1dfa14c&gt; DW_AT_inline : 1
</code></pre><p>Inline copies will have <code>DW_AT_abstract_origin</code> point to the declaration&rsquo;s DIE offset, in this case <code>0x1dfa144</code>. 
If we look for <code>0x1dfa144</code>, we do indeed find a few inline copies.</p> <pre tabindex="0"><code> &lt;3&gt;&lt;1dfe45e&gt;: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
    &lt;1dfe45f&gt; DW_AT_abstract_origin: &lt;0x1dfa144&gt;
    &lt;1dfe463&gt; DW_AT_low_pc : 0xffffffff80cf701d
    &lt;1dfe46b&gt; DW_AT_high_pc : 0x38
    &lt;1dfe46f&gt; DW_AT_call_file : 1
    &lt;1dfe470&gt; DW_AT_call_line : 3458
    &lt;1dfe472&gt; DW_AT_call_column : 5
 &lt;3&gt;&lt;1dfd2e2&gt;: Abbrev Number: 58 (DW_TAG_inlined_subroutine)
    &lt;1dfd2e3&gt; DW_AT_abstract_origin: &lt;0x1dfa144&gt;
    &lt;1dfd2e7&gt; DW_AT_ranges : 0x1f1290
    &lt;1dfd2eb&gt; DW_AT_call_file : 1
    &lt;1dfd2ec&gt; DW_AT_call_line : 3405
    &lt;1dfd2ee&gt; DW_AT_call_column : 3
...there are more
</code></pre><p>As I described in the <a href="#what-is-dwarf">first section</a>, a debug file may consist of multiple CUs that define the same inline function. We want to treat each CU independently, that is, each inline copy is handled relative to its CU.</p> <h2 id="calculating-call-boundaries">Calculating call boundaries</h2> <p>There are 2 cases we have to take care of when calculating the actual call boundaries of an inline copy.</p> <h3 id="lowpc-highpc">The DIE has <code>DW_AT_low_pc</code> and <code>DW_AT_high_pc</code></h3> <pre tabindex="0"><code> &lt;3&gt;&lt;1dfe45e&gt;: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
    &lt;1dfe45f&gt; DW_AT_abstract_origin: &lt;0x1dfa144&gt;
    &lt;1dfe463&gt; DW_AT_low_pc : 0xffffffff80cf701d
    &lt;1dfe46b&gt; DW_AT_high_pc : 0x38
    &lt;1dfe46f&gt; DW_AT_call_file : 1
    &lt;1dfe470&gt; DW_AT_call_line : 3458
    &lt;1dfe472&gt; DW_AT_call_column : 5
</code></pre><p>In this case, the lower boundary is <code>low_pc</code> and the upper boundary is <code>low_pc + high_pc</code>. For the DIE shown in this example, the boundaries are:</p> <pre tabindex="0"><code>low = 0xffffffff80cf701d
high = 0xffffffff80cf701d + 0x38 = 0xffffffff80cf7055
</code></pre><h3 id="ranges">The DIE has <code>DW_AT_ranges</code></h3> <pre 
tabindex="0"><code> &lt;3&gt;&lt;1dfd2e2&gt;: Abbrev Number: 58 (DW_TAG_inlined_subroutine) &lt;1dfd2e3&gt; DW_AT_abstract_origin: &lt;0x1dfa144&gt; &lt;1dfd2e7&gt; DW_AT_ranges : 0x1f1290 &lt;1dfd2eb&gt; DW_AT_call_file : 1 &lt;1dfd2ec&gt; DW_AT_call_line : 3405 &lt;1dfd2ee&gt; DW_AT_call_column : 3 </code></pre><p>This is a bit more involved. <code>DW_AT_ranges</code> refers to the <code>.debug_ranges</code> section found in debug files. We can dump the ranges:</p> <pre tabindex="0"><code>$ dwarfdump -N /usr/lib/debug/boot/kernel/kernel.debug .debug_ranges Ranges group 0: ranges: 3 at .debug_ranges offset 0 (0x00000000) (48 bytes) [ 0] range entry 0x00000019 0x00000073 [ 1] range entry 0x0000007e 0x00000106 [ 2] range end 0x00000000 0x00000000 Ranges group 1: ranges: 3 at .debug_ranges offset 48 (0x00000030) (48 bytes) [ 0] range entry 0x00000022 0x0000006a [ 1] range entry 0x0000007e 0x00000106 [ 2] range end 0x00000000 0x00000000 ... </code></pre><p>If we search for <code>0x1f1290</code> (the inline copy&rsquo;s ranges), we find its range group:</p> <pre tabindex="0"><code> Ranges group 38809: ranges: 3 at .debug_ranges offset 2036368 (0x001f1290) (48 bytes) [ 0] range entry 0x000025c8 0x000025f9 [ 1] range entry 0x0000261a 0x00002621 [ 2] range end 0x00000000 0x00000000 </code></pre><p>To get the call boundaries, we add each <code>range entry</code>&rsquo;s boundaries to the <code>DW_AT_low_pc</code> of the root DIE of the CU. 
The root DIE is found programmatically, but I happen to know that in this case, the root DIE is:</p> <pre tabindex="0"><code> &lt;0&gt;&lt;1dee9fb&gt;: Abbrev Number: 1 (DW_TAG_compile_unit) &lt;1dee9fc&gt; DW_AT_producer : (indirect string) FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303) &lt;1deea00&gt; DW_AT_language : 12 (C99) &lt;1deea02&gt; DW_AT_name : (indirect string) /usr/src/sys/kern/vfs_subr.c &lt;1deea06&gt; DW_AT_stmt_list : 0x6cb448 &lt;1deea0a&gt; DW_AT_comp_dir : (indirect string) /usr/obj/usr/src/amd64.amd64/sys/GENERIC &lt;1deea0e&gt; DW_AT_low_pc : 0xffffffff80cf4020 &lt;1deea16&gt; DW_AT_high_pc : 0xde3d </code></pre><p>Finally, we end up with the following boundaries:</p> <pre tabindex="0"><code>low = 0xffffffff80cf4020 + 0x000025c8 = 0xffffffff80cf65e8 high = 0xffffffff80cf4020 + 0x000025f9 = 0xffffffff80cf6619 low = 0xffffffff80cf4020 + 0x0000261a = 0xffffffff80cf663a high = 0xffffffff80cf4020 + 0x00002621 = 0xffffffff80cf6641 </code></pre><h2 id="finding-the-caller-function">Finding the caller function</h2> <p>There are cases where we want to know which function an inline function is being called from. Because DWARF does not encode that information, we&rsquo;ll have to scan ELF symbol tables.</p> <pre tabindex="0"><code>$ readelf -s /usr/lib/debug/boot/kernel/kernel.debug </code></pre><p>Since we know the inline copy&rsquo;s boundaries, we only have to find which symbol&rsquo;s boundaries the inline copy is inside. 
In other words, the following condition has to be met:</p> <pre tabindex="0"><code>sym_lower_bound &lt;= inline_lower_bound &lt;= inline_upper_bound &lt;= sym_upper_bound </code></pre><p>Because searching through ELF symbol tables manually and doing calculations by hand would take too long, the best way to do this is programmatically through <a href="https://sourceforge.net/p/elftoolchain/wiki/libelf/">LibELF</a>.</p> <h2 id="inlinecall">inlinecall(1)</h2> <p><a href="https://git.sr.ht/~crm/inlinecall">I wrote a little program</a> that does everything I talked about in this post automatically. It works on FreeBSD as-is, and most likely needs some modification to get it to work on other platforms.</p> <p>The program takes an inline function name and a debug file as arguments:</p> <pre tabindex="0"><code>inlinecall &lt;function&gt; &lt;file&gt; </code></pre><p>And outputs the results in the following form:</p> <pre tabindex="0"><code>cu1_func_declaration_file:line [low_bound - high_bound] inline_copy1_file:line caller_func() [low_bound - high_bound] inline_copy2_file:line caller_func() ... cu2_func_declaration_file:line ... ... 
</code></pre><p>For example:</p> <pre tabindex="0"><code>$ inlinecall critical_enter /usr/lib/debug/boot/kernel/kernel.debug /usr/src/sys/sys/systm.h:175 [0xffffffff809eb51f - 0xffffffff809eb526] /usr/src/sys/kern/kern_intr.c:1387 intr_event_handle() /usr/src/sys/sys/systm.h:175 [0xffffffff80a051f4 - 0xffffffff80a05208] /usr/src/sys/kern/kern_malloc.c:431 malloc_type_freed() [0xffffffff80a0514c - 0xffffffff80a0515b] /usr/src/sys/kern/kern_malloc.c:388 malloc_type_zone_allocated() /usr/src/sys/sys/systm.h:175 [0xffffffff80a263c4 - 0xffffffff80a263d3] /usr/src/sys/kern/kern_resource.c:509 rtp_to_pri() /usr/src/sys/sys/systm.h:175 [0xffffffff80a28f59 - 0xffffffff80a28f5f] /usr/src/sys/kern/kern_rmlock.c:775 _rm_assert() [0xffffffff80a29087 - 0xffffffff80a2908d] /usr/src/sys/kern/kern_rmlock.c:801 _rm_assert() [0xffffffff80a29eb0 - 0xffffffff80a29eb7] /usr/src/sys/kern/kern_rmlock.c:645 _rm_rlock_debug() [0xffffffff80a28c4b - 0xffffffff80a28c5a] /usr/src/sys/kern/kern_rmlock.c:160 unlock_rm() ...more </code></pre><h3 id="nested-inline-functions">Nested inline functions</h3> <p>inlinecall(1) resolves nested inline functions recursively:</p> <pre tabindex="0"><code>$ ./inlinecall critical_enter /usr/lib/debug/boot/kernel/kernel.debug /usr/src/sys/sys/systm.h:175 [0xffffffff80a19d7a - 0xffffffff80a19d8b] /usr/src/sys/sys/buf_ring.h:80 drbr_enqueue() /usr/src/sys/sys/systm.h:175 [0xffffffff80a6387a - 0xffffffff80a6388b] /usr/src/sys/sys/buf_ring.h:80 drbr_enqueue() ... </code></pre><p>Looking at the definition of <code>critical_enter()</code>&rsquo;s caller function in <code>buf_ring.h</code>:</p> <pre tabindex="0"><code>static __inline int buf_ring_enqueue(struct buf_ring *br, void *buf) { ... critical_enter(); ... 
} </code></pre><p>Even though inlinecall(1) reported that <code>critical_enter()</code> is called from <code>drbr_enqueue()</code> in <code>buf_ring.h:80</code>, we see that it&rsquo;s called from <code>buf_ring_enqueue()</code> instead, but <code>buf_ring_enqueue()</code> is also an inline function:</p> <pre tabindex="0"><code>$ ./inlinecall buf_ring_enqueue /usr/lib/debug/boot/kernel/kernel.debug /usr/src/sys/sys/buf_ring.h:63 [0xffffffff80a19d7a - 0xffffffff80a19dcd] /usr/src/sys/net/ifq.h:337 drbr_enqueue() [0xffffffff80a19ddc - 0xffffffff80a19e18] /usr/src/sys/net/ifq.h:337 drbr_enqueue() [0xffffffff80a19e1f - 0xffffffff80a19e3b] /usr/src/sys/net/ifq.h:337 drbr_enqueue() /usr/src/sys/sys/buf_ring.h:63 [0xffffffff80a6387a - 0xffffffff80a638cd] /usr/src/sys/net/ifq.h:337 drbr_enqueue() [0xffffffff80a638dc - 0xffffffff80a63918] /usr/src/sys/net/ifq.h:337 drbr_enqueue() [0xffffffff80a6391f - 0xffffffff80a6393b] /usr/src/sys/net/ifq.h:337 drbr_enqueue() /usr/src/sys/sys/buf_ring.h:63 [0xffffffff80d1f81a - 0xffffffff80d1f879] /usr/src/sys/net/ifq.c:57 drbr_enqueue() [0xffffffff80d1f91d - 0xffffffff80d1f964] /usr/src/sys/net/ifq.c:57 drbr_enqueue() [0xffffffff80d1f9dd - 0xffffffff80d1f9f5] /usr/src/sys/net/ifq.c:57 drbr_enqueue() /usr/src/sys/sys/buf_ring.h:63 [0xffffffff80ff07ba - 0xffffffff80ff080d] /usr/src/sys/net/ifq.h:337 drbr_enqueue() [0xffffffff80ff081c - 0xffffffff80ff0858] /usr/src/sys/net/ifq.h:337 drbr_enqueue() [0xffffffff80ff085f - 0xffffffff80ff087b] /usr/src/sys/net/ifq.h:337 drbr_enqueue() </code></pre><p>Here <code>drbr_enqueue()</code> is defined twice &mdash; once in <code>ifq.h</code> and once in <code>ifq.c</code>. The definition in <code>ifq.h</code> is also an inline definition, and in <code>ifq.c</code> it&rsquo;s a non-inline one. 
We know that <code>buf_ring_enqueue()</code> is called from the non-inline version of <code>drbr_enqueue()</code>, otherwise inlinecall(1) would have reported the function which calls the inline version of <code>drbr_enqueue()</code>.</p> Making a character device kernel module on FreeBSD http://margiolis.net/w/cdev/ Sun, 10 Jul 2022 00:00:00 +1200 <!-- raw HTML omitted --> <p>This article assumes advanced knowledge of C and a basic understanding of the FreeBSD kernel and programming environment. It is also meant to serve as a template/reference and not a complete implementation.</p> <p><a href="https://git.sr.ht/~crm/random/tree/master/item/mydev_freebsd">Sample code can be found here</a>.</p> <p>Also mirrored on the <a href="https://wiki.freebsd.org/CDevModule">FreeBSD Wiki</a>.</p> <h2 id="table-of-contents">Table of contents</h2> <ul> <li><a href="#implementing-the-device">Implementing the device</a> <ul> <li><a href="#malloc-declaration">malloc declaration</a></li> <li><a href="#cdevsw-structure"><code>cdevsw</code> structure</a></li> <li><a href="#open-and-close">open() and close()</a></li> <li><a href="#read-and-write">read() and write()</a></li> <li><a href="#ioctl">ioctl()</a></li> </ul> </li> <li><a href="#creating-and-destroying-the-device">Creating and destroying the device</a></li> <li><a href="#module-declaration">Module declaration</a></li> <li><a href="#makefile">Makefile</a></li> <li><a href="#running-the-module">Running the module</a></li> <li><a href="#testing">Testing</a></li> </ul> <h2 id="implementing-the-device">Implementing the device</h2> <h3 id="malloc-declaration">malloc declaration</h3> <p>Kernel modules have their own malloc types, which are defined as follows:</p> <pre tabindex="0"><code>MALLOC_DECLARE(M_MYDEV); MALLOC_DEFINE(M_MYDEV, &#34;mydev&#34;, &#34;device description&#34;); </code></pre><p>Then, you can use malloc(9) and free(9) as:</p> <pre tabindex="0"><code>p = malloc(sizeof(foo), M_MYDEV, M_WAITOK | M_ZERO); free(p, 
M_MYDEV); </code></pre><h3 id="cdevsw-structure"><code>cdevsw</code> structure</h3> <p>The device&rsquo;s properties and methods are stored in a <code>cdevsw</code> (Character Device Switch) structure, defined in <code>sys/conf.h</code>. The fields we care about most of the time are the following:</p> <pre tabindex="0"><code>struct cdevsw { int d_version; u_int d_flags; const char *d_name; d_open_t *d_open; d_fdopen_t *d_fdopen; d_close_t *d_close; d_read_t *d_read; d_write_t *d_write; d_ioctl_t *d_ioctl; d_poll_t *d_poll; d_mmap_t *d_mmap; d_strategy_t *d_strategy; dumper_t *d_dump; d_kqfilter_t *d_kqfilter; d_purge_t *d_purge; d_mmap_single_t *d_mmap_single; ... }; </code></pre><p>All the <code>*_t</code> pointers are pointers to functions meant to be implemented by the driver. Not all functions have to be implemented however, but we usually do need to implement open(), close(), read(), write() and ioctl().</p> <p>Declare the functions using some handy typedefs:</p> <pre tabindex="0"><code>static d_open_t mydev_open; static d_close_t mydev_close; static d_read_t mydev_read; static d_write_t mydev_write; static d_ioctl_t mydev_ioctl; </code></pre><p>Declare the <code>cdevsw</code> structure:</p> <pre tabindex="0"><code>static struct cdevsw mydev_cdevsw = { .d_name = &#34;mydev&#34;, .d_version = D_VERSION, .d_flags = D_TRACKCLOSE, .d_open = mydev_open, .d_close = mydev_close, .d_read = mydev_read, .d_write = mydev_write, .d_ioctl = mydev_ioctl, }; </code></pre><p>The <code>D_TRACKCLOSE</code> flag tells the kernel to track when the device closes so that it can close normally in case something goes wrong.</p> <h3 id="open-and-close">open() and close()</h3> <p>Those two functions are mainly used for resource allocation/deallocation and environment preparation:</p> <pre tabindex="0"><code>static int mydev_open(struct cdev *dev, int flags, int devtype, struct thread *td) { int error = 0; /* do stuff */ return (error); } static int mydev_close(struct cdev *dev, int 
flags, int devtype, struct thread *td) { int error = 0; /* do stuff */ return (error); } </code></pre><h3 id="read-and-write">read() and write()</h3> <p>It&rsquo;s good practice to keep an internal buffer. Below is a very simplified example. The buffer in this example is allocated and deallocated on <a href="#module-declaration">module load and unload respectively</a>:</p> <pre tabindex="0"><code>#define BUFSIZE (1 &lt;&lt; 16) struct foo { char buf[BUFSIZE + 1]; size_t len; }; static struct foo *foo; </code></pre><p>Data to be received or sent back is stored in <code>uio</code> and the copy from user to kernel memory is done through uiomove(9), defined in <code>sys/uio.h</code>:</p> <pre tabindex="0"><code>static int mydev_read(struct cdev *dev, struct uio *uio, int ioflag) { size_t amnt; int v, error = 0; /* * Determine how many bytes we have to read. We&#39;ll either read the * remaining bytes (uio-&gt;uio_resid) or the number of bytes requested by * the caller. */ v = uio-&gt;uio_offset &gt;= foo-&gt;len + 1 ? 0 : foo-&gt;len + 1 - uio-&gt;uio_offset; amnt = MIN(uio-&gt;uio_resid, v); /* Move the bytes from foo-&gt;buf to uio. */ if ((error = uiomove(foo-&gt;buf, amnt, uio)) != 0) { /* error handling */ } /* do stuff */ return (error); } static int mydev_write(struct cdev *dev, struct uio *uio, int ioflag) { size_t amnt; int error = 0; /* Do not allow random access. */ if (uio-&gt;uio_offset != 0 &amp;&amp; (uio-&gt;uio_offset != foo-&gt;len)) return (EINVAL); /* We&#39;re not appending, reset length. 
*/
	else if (uio-&gt;uio_offset == 0)
		foo-&gt;len = 0;

	amnt = MIN(uio-&gt;uio_resid, (BUFSIZE - foo-&gt;len));
	if ((error = uiomove(foo-&gt;buf + uio-&gt;uio_offset, amnt, uio)) != 0) {
		/* error handling */
	}
	foo-&gt;len = uio-&gt;uio_offset;
	foo-&gt;buf[foo-&gt;len] = &#39;\0&#39;;

	/* do stuff */

	return (error);
}
</code></pre><h3 id="ioctl">ioctl()</h3> <p>To create an ioctl, you give it a name and <code>#define</code> it using one of the following <code>_IO*</code> macros defined in <code>sys/ioccom.h</code>:</p> <ul> <li><code>_IO</code>: No parameters.</li> <li><code>_IOR</code>: Copy out parameters. Read from device.</li> <li><code>_IOW</code>: Copy in parameters. Write to device.</li> <li><code>_IOWR</code>: Copy parameters in and out. Write to device and read the modified data back.</li> </ul> <p>Each of those macros* takes 3 arguments:</p> <ul> <li>An arbitrary one-byte &ldquo;class&rdquo; identifier.</li> <li>A unique ID.</li> <li>The parameter type (can be anything), which is used to calculate the parameter&rsquo;s size. The macro expands the type to <code>sizeof(type)</code>.</li> </ul> <p>* <code>_IO</code> takes only the first 2 arguments (class and ID) since it doesn&rsquo;t use parameters.</p> <p>We can now define a few ioctls that take <code>foo_t</code> as a parameter. 
This is usually done in a separate header file so that programs can use the ioctls:</p> <pre tabindex="0"><code>#include &lt;sys/ioccom.h&gt; typedef struct { int x; int y; } foo_t; #define MYDEVIOC_READ _IOR(&#39;a&#39;, 1, foo_t) #define MYDEVIOC_WRITE _IOW(&#39;a&#39;, 2, foo_t) #define MYDEVIOC_RDWR _IOWR(&#39;a&#39;, 3, foo_t) </code></pre><p><code>mydev_ioctl()</code> is responsible for handling the ioctls we declared:</p> <pre tabindex="0"><code>static int mydev_ioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, struct thread *td) { foo_t *fp; int error = 0; switch (cmd) { case MYDEVIOC_READ: /* fill in *fp; it is copied out to userland on return */ fp = (foo_t *)addr; /* do stuff */ break; case MYDEVIOC_WRITE: /* *fp holds the data copied in from userland */ fp = (foo_t *)addr; /* do stuff */ break; case MYDEVIOC_RDWR: /* *fp is copied in, and copied back out on return */ fp = (foo_t *)addr; /* do stuff */ break; default: error = ENOTTY; break; } return (error); } </code></pre><h2 id="creating-and-destroying-the-device">Creating and destroying the device</h2> <p>Character devices are given a <code>struct cdev</code> handle upon creation, which we usually store as a global variable:</p> <pre tabindex="0"><code>static struct cdev *mydev_cdev; </code></pre><p>Devices are created with the <code>make_dev()</code> function, which is defined as:</p> <pre tabindex="0"><code>struct cdev * make_dev(struct cdevsw *cdevsw, int unit, uid_t uid, gid_t gid, int perms, const char *fmt, ...); </code></pre><p><code>sys/conf.h</code> has the definitions of the available <code>UID_*</code> and <code>GID_*</code> constants.</p> <p>Create the device:</p> <pre tabindex="0"><code>mydev_cdev = make_dev(&amp;mydev_cdevsw, 0, UID_ROOT, GID_WHEEL, 0666, &#34;mydev&#34;); </code></pre><p>When done, destroy the device:</p> <pre tabindex="0"><code>destroy_dev(mydev_cdev); </code></pre><h2 id="module-declaration">Module declaration</h2> <p>Necessary includes:</p> <pre tabindex="0"><code>#include &lt;sys/types.h&gt; #include &lt;sys/param.h&gt; #include &lt;sys/conf.h&gt; #include &lt;sys/systm.h&gt; #include &lt;sys/kernel.h&gt; #include &lt;sys/module.h&gt; #include 
&lt;sys/malloc.h&gt; #include &lt;sys/uio.h&gt; </code></pre><p>Implement the module&rsquo;s event handler. This function is called at module load and unload. Since we&rsquo;re dealing with a character device, it makes sense to create the device upon load and destroy it upon unload. <code>M_MYDEV</code> is the malloc(9) type the buffer is accounted under, declared with <code>MALLOC_DEFINE</code>:</p> <pre tabindex="0"><code>/* malloc(9) type used for the internal buffer. */ static MALLOC_DEFINE(M_MYDEV, &#34;mydev&#34;, &#34;mydev buffer&#34;); static int mydev_modevent(module_t mod, int type, void *arg) { int error = 0; switch (type) { case MOD_LOAD: /* Allocate the buffer before the device becomes visible. */ foo = malloc(sizeof(*foo), M_MYDEV, M_WAITOK | M_ZERO); foo-&gt;buf[0] = &#39;\0&#39;; foo-&gt;len = 0; mydev_cdev = make_dev(&amp;mydev_cdevsw, 0, UID_ROOT, GID_WHEEL, 0666, &#34;mydev&#34;); break; case MOD_UNLOAD: /* FALLTHROUGH */ case MOD_SHUTDOWN: /* Destroy the device first so no handler can touch a freed buffer. */ destroy_dev(mydev_cdev); free(foo, M_MYDEV); break; default: error = EOPNOTSUPP; break; } return (error); } </code></pre><p>Lastly, declare the module. The first argument is the module&rsquo;s name, the second one is a pointer to the event handler and the last one is any data we want to supply the event handler with, i.e. the <code>arg</code> argument in <code>mydev_modevent()</code>:</p> <pre tabindex="0"><code>DEV_MODULE(mydev, mydev_modevent, NULL); </code></pre><h2 id="makefile">Makefile</h2> <pre tabindex="0"><code>KMOD= mydev SRCS= mydev.c .include &lt;bsd.kmod.mk&gt; </code></pre><h2 id="running-the-module">Running the module</h2> <pre tabindex="0"><code>$ make # kldload ./mydev.ko ... # kldunload ./mydev.ko $ make clean cleandepend </code></pre><h2 id="testing">Testing</h2> <p>To test the module, load it and write a simple program that opens the device and makes a few calls to ioctl(2), read(2) and write(2).</p> Process queuing using lock files http://margiolis.net/w/procqueue/ Wed, 18 May 2022 00:00:00 +1200 <p>Locking will be done using the <a href="https://man.openbsd.org/fcntl">fcntl(2) system call</a>. I&rsquo;m aware of lockf(3) and flock(2), but both of them normally use fcntl(2) under the hood, and they are not as portable. 
<a href="https://en.wikipedia.org/wiki/File_locking">More information on file locking</a>.</p> <p>For a real use-case, I&rsquo;ve written a <a href="http://margiolis.net/w/nfy">notification program</a> which uses the same mechanism, so that notifications can be queued without having to run a daemon, such as D-Bus, in the background.</p> <p>First create the lock file with write permissions. <code>O_CREAT</code> is used to create the file in case it doesn&rsquo;t exist already:</p> <pre tabindex="0"><code>#include &lt;err.h&gt; #include &lt;fcntl.h&gt; ... char *lockfile = &#34;/tmp/foo.lock&#34;; int fd; if ((fd = open(lockfile, O_CREAT | O_WRONLY, 0600)) &lt; 0) err(1, &#34;open(%s)&#34;, lockfile); </code></pre><p>Locking commands operate on the <code>flock</code> structure. Before a call to fcntl(2) is made, we need to fill in the following fields:</p> <pre tabindex="0"><code>struct flock { off_t l_start; /* starting offset */ off_t l_len; /* len = 0 means until end of file */ short l_type; /* lock type: read/write, etc. */ short l_whence; /* type of l_start */ ... }; </code></pre><p>The starting offset, <code>l_start</code>, can be anything, but 0 is what makes the most sense. We&rsquo;ll set <code>l_len</code> to 0 as well, since we want each process to lock the entire file. The lock type, <code>l_type</code>, needs to be an exclusive lock (<code>F_WRLCK</code>), that is, a lock that prevents any other process from setting a lock on that area before it&rsquo;s released. <code>l_whence</code> will be set to <code>SEEK_SET</code> to indicate that the relative offset <code>l_start</code> will be measured from the beginning of the file:</p> <pre tabindex="0"><code>struct flock fl; fl.l_len = 0; fl.l_start = 0; fl.l_type = F_WRLCK; fl.l_whence = SEEK_SET; </code></pre><p>The <code>F_SETLKW</code> command will make the calling process wait until the lock request can be satisfied. 
There&rsquo;s also <code>F_SETLK</code>, but it returns immediately if the lock is already held by another process, which is not very useful for queuing processes:</p> <pre tabindex="0"><code>if (fcntl(fd, F_SETLKW, &amp;fl) &lt; 0) err(1, &#34;fcntl(F_SETLKW)&#34;); </code></pre><p>When we get past the call to fcntl(2), it means that we have acquired the lock until a call to close(2) is made. Here is where we&rsquo;ll put the part of the code we want to queue, in this case a simple <code>printf</code> followed by a 3-second sleep to make sure queuing really works:</p> <pre tabindex="0"><code>printf(&#34;hello from %d\n&#34;, getpid()); sleep(3); </code></pre><p>When done, release the lock:</p> <pre tabindex="0"><code>close(fd); </code></pre><p>To test the code, open two terminals and run the program on both of them. You&rsquo;ll see the process that was run last will not execute the code after fcntl(2) until the first one has finished.</p> FreeBSD sound mixer improvements http://margiolis.net/w/mixer_improvements/ Fri, 25 Feb 2022 00:00:00 +1200 <p>This project was part of Google Summer of Code 2021, but development is still active. The development report can be found on the <a href="https://wiki.freebsd.org/SummerOfCode2021Projects/SoundMixerImprovements">FreeBSD Wiki</a>. The reason behind this project is that FreeBSD&rsquo;s OSS mixer capabilities were really basic and outdated: even un/muting didn&rsquo;t exist and one had to write custom scripts for such a basic task. Setting default audio devices had to be done by tweaking sysctls, and programs needing to use the mixer required DIY implementations as there was no mixer library available. 
The project was merged upstream in FreeBSD 14.0.</p> <h2 id="table-of-contents">Table of contents</h2> <ul> <li><a href="#kernel-patches">Kernel patches</a> <ul> <li><a href="#un-muting">Un/muting</a></li> <li><a href="#mode-configuration">Playback/recording mode information</a></li> </ul> </li> <li><a href="#userland">Userland</a> <ul> <li><a href="#libmixer-implementation">mixer(3) implementation</a></li> <li><a href="#mixer-rewrite">mixer(8) rewrite</a></li> </ul> </li> <li><a href="#code-and-manuals">Code and manuals</a></li> </ul> <h2 id="kernel-patches">Kernel patches</h2> <h3 id="un-muting">Un/muting (<a href="https://cgit.freebsd.org/src/commit/?id=0f8dafb45859569aa36b63ca2bb4a1c35c970d1e">commit</a>)</h3> <p>I decided that un/muting is better implemented in sound(4) in order to avoid having to write daemons or use files. The way this works is by implementing the <code>SOUND_MIXER_READ_MUTE</code> and <code>SOUND_MIXER_WRITE_MUTE</code> ioctls, which <em>did</em> exist in older OSS implementations, but were considered obsolete. One thing to note is that the functionality isn&rsquo;t the same as their old one. Older OSS versions had those 2 ioctls take/return an integer with a value of 0 or 1, which indicated whether the <em>whole</em> mixer is muted or not. My implementation takes/returns a bitmask that tells which devices are muted. This allows us to mute and unmute only the devices we want, instead of the whole mixer. If you&rsquo;re familiar with the <a href="http://manuals.opensound.com/developer/">OSS API</a>, this bitmask works the same way as <code>DEVMASK</code>, <code>RECMASK</code> and <code>RECSRC</code>.</p> <h3 id="mode-configuration">Playback/recording mode information (<a href="https://cgit.freebsd.org/src/commit/?id=ed2196e5df0c8b5b81563d2fffdcb32bb7ebe966">commit</a>)</h3> <p>Here I implemented a sysctl (<code>dev.pcm.&lt;N&gt;.mode</code>) which gives information about a device&rsquo;s playback/recording mode. 
The rationale for this control is to include <code>/dev/sndstat</code>&rsquo;s mixer information in the output of the new mixer(8). The sysctl can return the following values (NOTE: these values are OR&rsquo;ed together if more than one mode is supported):</p> <table border="solid"> <tr> <th>Value</th> <th>Meaning</th> </tr> <tr> <td>0x01</td> <td>Mixer</td> </tr> <tr> <td>0x02</td> <td>Playback device</td> </tr> <tr> <td>0x04</td> <td>Recording device</td> </tr> </table> <h2 id="userland">Userland</h2> <h3 id="libmixer-implementation">mixer(3) implementation (<a href="https://cgit.freebsd.org/src/commit/?id=903873ce15600fc02a0ea42cbf888cff232b411d">commit</a>)</h3> <p>mixer(3) provides a simple interface for working with the OSS mixer. <a href="https://man.freebsd.org/cgi/man.cgi?query=mixer&amp;apropos=0&amp;sektion=3&amp;manpath=FreeBSD+15.0-CURRENT&amp;arch=default&amp;format=html">The man page</a> explains how the library works, including some examples, so there&rsquo;s no need to repeat myself. You can see the library in action in <a href="https://cgit.freebsd.org/src/tree/usr.sbin/mixer/mixer.c">the source code for mixer(8)</a>.</p> <p>The basic structure of a program looks like this (link with <code>-lmixer</code>):</p> <pre tabindex="0"><code>#include &lt;err.h&gt; #include &lt;mixer.h&gt; int main(int argc, char *argv[]) { struct mixer *m; const char *name = &#34;/dev/mixer0&#34;; if ((m = mixer_open(name)) == NULL) err(1, &#34;mixer_open(%s)&#34;, name); /* do stuff */ mixer_close(m); return (0); } </code></pre><h3 id="mixer-rewrite">mixer(8) rewrite (<a href="https://cgit.freebsd.org/src/commit/?id=903873ce15600fc02a0ea42cbf888cff232b411d">commit</a>)</h3> <p>This implementation is a complete rewrite of the old mixer(8) utility. It now uses mixer(3) as a backend and implements all the new features the library provides. 
It&rsquo;s got more command line options and works with a control-oriented interface inspired by <a href="https://man.openbsd.org/mixerctl">OpenBSD&rsquo;s mixerctl(8)</a>. Again, everything is detailed in <a href="https://man.freebsd.org/cgi/man.cgi?query=mixer&amp;apropos=0&amp;sektion=8&amp;manpath=FreeBSD+15.0-CURRENT&amp;arch=default&amp;format=html">the man page</a>.</p> <p>Old mixer(8) output:</p> <pre tabindex="0"><code>$ mixer.old Mixer vol is currently set to 85:85 Mixer pcm is currently set to 100:100 Mixer speaker is currently set to 74:74 Mixer line is currently set to 1:1 Mixer mic is currently set to 67:67 Mixer mix is currently set to 74:74 Mixer rec is currently set to 37:37 Mixer igain is currently set to 0:0 Mixer ogain is currently set to 100:100 Mixer monitor is currently set to 67:67 Recording source: mic </code></pre><p>New mixer(8) output:</p> <pre tabindex="0"><code>$ mixer pcm0:mixer: &lt;Realtek ALC662 rev3 (Analog 2.0+HP/2.0)&gt; on hdaa0 kld snd_hda (play/rec) (default) vol = 0.85:0.85 pbk pcm = 1.00:1.00 pbk speaker = 0.74:0.74 rec line = 0.01:0.01 rec mic = 0.67:0.67 rec src mix = 0.74:0.74 rec rec = 0.37:0.37 pbk igain = 0.00:0.00 pbk ogain = 1.00:1.00 pbk monitor = 0.67:0.67 rec </code></pre><h2 id="code-and-manuals">Code and manuals</h2> <ul> <li><a href="https://man.freebsd.org/cgi/man.cgi?query=mixer&amp;apropos=0&amp;sektion=3&amp;manpath=FreeBSD+15.0-CURRENT&amp;arch=default&amp;format=html">mixer(3) man page</a></li> <li><a href="https://cgit.freebsd.org/src/tree/lib/libmixer">mixer(3) source code</a></li> <li><a href="https://man.freebsd.org/cgi/man.cgi?query=mixer&amp;apropos=0&amp;sektion=8&amp;manpath=FreeBSD+15.0-CURRENT&amp;arch=default&amp;format=html">mixer(8) man page</a></li> <li><a href="https://cgit.freebsd.org/src/tree/usr.sbin/mixer/">mixer(8) source code</a></li> <li><a href="http://manuals.opensound.com/developer/">OSS 4.x Programmer&rsquo;s Guide</a></li> </ul> PIC microcontroller development on FreeBSD 
http://margiolis.net/w/pic_freebsd/ Sun, 23 Jan 2022 00:00:00 +1200 <p>Tested on FreeBSD 13.0. This article has also been mirrored on the <a href="https://wiki.freebsd.org/Microcontrollers/PIC">FreeBSD Wiki</a>.</p> <h2 id="prerequisites">Prerequisites</h2> <p><a href="http://sdcc.sourceforge.net/">sdcc</a> is a C compiler for microprocessors. It says that PIC microprocessors are unmaintained, but I&rsquo;ve found it to be pretty reliable so far (take this with a grain of salt, I&rsquo;m no expert). The port can be found under:</p> <pre tabindex="0"><code>lang/sdcc </code></pre><p>Page 75 of the <a href="http://sdcc.sourceforge.net/doc/sdccman.pdf">sdcc user manual</a> lists the supported PIC devices. Header files can be found under <code>/usr/local/share/sdcc</code>.</p> <p>For programming the MCU, I&rsquo;ve found pk2cmd to work alright with PICKit2 (or Chinese clones), but there&rsquo;s no port for FreeBSD anymore. The Makefile won&rsquo;t install files properly, so we have some extra work to do afterwards:</p> <pre tabindex="0"><code>$ git clone https://github.com/psmay/pk2cmd.git $ cd pk2cmd/pk2cmd # gmake freebsd install clean # mv /usr/share/pk2/PK2DeviceFile.dat /usr/local/bin # rm -rf /usr/share/pk2 </code></pre><p>Supported devices for pk2cmd are listed <a href="https://github.com/psmay/pk2cmd/blob/master/pk2cmd/ReadmeForPK2CMDLinux2-6.txt">here</a>.</p> <h2 id="detecting-and-programming-the-mcu">Detecting and programming the MCU</h2> <p>Avoid using just the -P option to auto-detect the MCU, as the VPP the PICKit2 applies to the chip while trying to detect it can damage the MCU. Instead, pass the chip name to -P explicitly, as shown below. Also, use the -C option to check if the chip is blank.</p> <p>If any of the following pk2cmd commands fails, make sure everything <em>really is</em> wired properly:</p> <pre tabindex="0"><code>$ pk2cmd -P PIC16F877A -C Device is blank Operation Succeeded </code></pre><p>Compile your source code. 
The target executable is the <code>.hex</code> file sdcc will output. Replace <code>pic14</code> and <code>16f877a</code> with the appropriate names for your device:</p> <pre tabindex="0"><code>$ sdcc --use-non-free -mpic14 -p16f877a main.c </code></pre><p>Erase the PIC (if it wasn&rsquo;t already blank) and flash the new code. Again, use the appropriate names:</p> <pre tabindex="0"><code>$ pk2cmd -P PIC16F877A -E $ pk2cmd -P PIC16F877A -X -M -F main.hex </code></pre><p>If all went well, you should get an output similar to this:</p> <pre tabindex="0"><code>PICkit 2 Program Report 23-1-2022, 21:01:29 Device Type: PIC16F877A Program Succeeded. Operation Succeeded </code></pre> C coding style http://margiolis.net/w/cstyle/ Sat, 01 Jan 2022 00:00:00 +1200 <p>First of all, I wanna wish all 0 readers of this blog a happy new year! There&rsquo;s no reason to write a whole article repeating what better articles have already covered, so here&rsquo;s a list with proper C coding style guides:</p> <ul> <li><a href="https://www.lysator.liu.se/c/pikestyle.html">Notes on Programming in C</a> by Rob Pike.</li> <li><a href="https://man.openbsd.org/style">OpenBSD style guide</a></li> <li><a href="https://suckless.org/coding_style/">suckless.org style guide</a></li> <li><a href="https://www.kernel.org/doc/Documentation/process/coding-style.rst">Linux kernel coding style</a></li> </ul> <p><img src="http://margiolis.net/files/dmr.jpg" alt=""></p> Use cases for goto http://margiolis.net/w/goto/ Tue, 19 Jan 2021 00:00:00 +1200 <p>This article is a response to all my university professors who, for some reason, think <code>goto</code> is useless and should be avoided at all costs.</p> <h2 id="some-use-cases">Some use cases</h2> <p>The most important use case there is for <code>goto</code> is by far error handling when there is more than one point of failure. 
In this case, you might want to clean up some resources while also skipping part of the code that should not be executed, without having to deal with flags, helper functions, and other methods that would make the code ugly, slower and error prone. Try rewriting the following snippet <em>without</em> a <code>goto</code>:</p> <pre tabindex="0"><code>int foo(int *bar, int *baz) { if (!func1()) goto fail; if (!func2()) goto fail; if (!func3()) goto fail; return 0; fail: warn(&#34;foo failed&#34;); if (bar != NULL) free(bar); if (baz != NULL) free(baz); return -1; } </code></pre><p>Another use case is breaking out of deeply nested code. Let&rsquo;s say you&rsquo;ve got 3 <code>for</code> loops and there&rsquo;s a special case in which you really want to break out of all the loops at once. How do you do that? There are multiple ways you can go about doing so, but one way would be to set a flag and check it on every nested level.</p> <pre tabindex="0"><code>flag = 0; for (i = 0; i &lt; 10; i++) { for (j = 0; j &lt; 10; j++) { for (k = 0; k &lt; 10; k++) { ... if (flag) break; } if (flag) break; } if (flag) break; } </code></pre><p>Another ugly hack you can use is something another colleague from university showed me, and something I would <em>never</em> use; when the flag is set, manually max out all the loop counters.</p> <p>A pretty straightforward solution would also be to put the loop into a function and use a <code>return</code> statement to break out of all the loops. That&rsquo;s actually a good solution, and I&rsquo;m aware of it, but I want to provide another solution, which can also be slightly faster than using a function since it avoids that additional function call.</p> <p>An alternative, and in my opinion, better way of solving this problem would be by using a (<em>don&rsquo;t say it, don&rsquo;t say it</em>) <code>goto</code>:</p> <pre tabindex="0"><code>flag = 0; for (i = 0; i &lt; 10; i++) { for (j = 0; j &lt; 10; j++) { for (k = 0; k &lt; 10; k++) { ... 
if (flag) goto end; } } } end: ... </code></pre><h2 id="who-cares-anyway">Who cares, anyway?</h2> <p>In the first use case, the code is much more readable and you avoid code duplication. In the second use case the <code>goto</code> solution <em>does</em> save some work as well. The reason why is simple; the flag version checks <code>flag</code> on every nesting level, which means that, in case <code>flag</code> is never set, we&rsquo;ll have done 10 * 10 * 10 = 1000 checks in the innermost loop, plus 100 in the middle loop and 10 in the outer one, 1110 checks in total. And that&rsquo;s just with 3 <code>for</code> loops going from 0 to 10 each; think how this scales if you increase the iterations or the nesting depth. The <code>goto</code> solution checks <code>flag</code> only in the innermost loop, which means that, in the above scenario, we&rsquo;ll have done 1000 checks and dropped the 110 redundant ones. The gain is modest (about 10% fewer checks here), but the extra checks and <code>break</code> statements disappear from the source as well.</p> <p>Using a function is almost just as fast as using a <code>goto</code> without a function, but not having to call a function is generally faster. Both solutions are great and totally valid, I just want to show an alternative one.</p> <h2 id="final-note">Final note</h2> <p><code>goto</code> <em>does</em> have its place but it should be used carefully; if you overuse it, your code will either become incomprehensible, or flat out broken. The use cases I showcased in this post are very common and sometimes the code can be vastly improved with just a simple <code>goto</code> if used correctly.</p> <p>Again, thanks to both my colleagues who helped me improve this article with their recommendations.</p> Arduino on FreeBSD http://margiolis.net/w/arduino_freebsd/ Wed, 28 Oct 2020 00:00:00 +1200 <p>This article demonstrates how to develop for Arduino boards using only basic command line utilities, without having to use the Arduino IDE. 
The article has also been published on the <a href="https://wiki.freebsd.org/Arduino/NativeCLI">FreeBSD Wiki</a>.</p> <p>Tested on FreeBSD 12.2 and above.</p> <h2 id="prerequisites">Prerequisites</h2> <p>Required ports:</p> <pre tabindex="0"><code>devel/arduino-core devel/arduino-bsd-mk devel/avr-gcc devel/avr-libc devel/avrdude comms/uarduno </code></pre><p>With all the software installed, add the following line to <code>/boot/loader.conf</code> in case you want the Arduino kernel module to load automatically on boot. If you want to manually load the module whenever you need it, skip this step:</p> <pre tabindex="0"><code>uarduno_load=&#34;YES&#34; </code></pre><p>Load the kernel module:</p> <pre tabindex="0"><code># kldload uarduno </code></pre><p>Check your <code>~/.arduino/preferences.txt</code> and see if the following lines exist (<a href="https://wiki.freebsd.org/Arduino/NativeIDE">source</a>):</p> <pre tabindex="0"><code>serial.port=/dev/cuaU0 launcher=/usr/local/bin/firefox </code></pre><p>Add your user to the <code>dialer</code> group:</p> <pre tabindex="0"><code># pw group mod dialer -m $USER </code></pre><h2 id="connecting-the-board">Connecting the board</h2> <p>Standard Arduino boards connect as <code>/dev/cuaU0</code> and/or <code>/dev/ttyU0</code> on FreeBSD. In case these serial ports don&rsquo;t show up in <code>/dev</code>, you might need to press your board&rsquo;s reset button. After you&rsquo;ve plugged your board into a USB port, you should get the following output from <code>dmesg</code>. Although the output may vary, the important thing is that your board is connected and detected.</p> <pre tabindex="0"><code>ugen1.5: &lt;Arduino (www.arduino.cc) product 0x0043&gt; at usbus1 uarduno0: &lt;Arduino (www.arduino.cc) product 0x0043, class 2/0, rev 1.10/0.01, addr 5&gt; on usbus1 </code></pre><p>If <code>dmesg</code> returned information about your board, you should also see <code>cuaU0</code> and/or <code>ttyU0</code> in <code>/dev</code>. 
In case your board is still not detected (and assuming it&rsquo;s not a fake one), try using a different USB cable or reset it again and make sure you&rsquo;ve followed the setup steps correctly.</p> <h2 id="the-makefile">The Makefile</h2> <p>The only thing you need in order to get started is a <code>Makefile</code> that&rsquo;ll be used to compile <em>and upload</em> your Arduino programs. Make a new directory for your Arduino project and a <code>Makefile</code> with the following lines:</p> <pre tabindex="0"><code>ARDUINO_DIR= /usr/local/arduino ARDUINO_MK_DIR= /usr/local/arduino-bsd-mk #ARDUINO_LIBS= AVRDUDE_PORT= your_board_port ARDUINO_BOARD= your_board_name SRCS= your_source_files TARGET= your_program_name include /usr/local/arduino-bsd-mk/bsd.arduino.mk </code></pre><p>In my case my board is an Arduino Uno, so I&rsquo;d have to set <code>ARDUINO_BOARD</code> to <code>uno</code>. You can see which other board types are available in <code>/usr/local/arduino/hardware/arduino/avr/boards.txt</code>. If you want to install new libraries, copy them over to <code>/usr/local/arduino/hardware/arduino/avr/libraries/</code>.</p> <p>Avoid having source files named <code>main</code>.</p> <h2 id="building-and-uploading-a-program">Building and uploading a program</h2> <p>Write some Arduino code, and when you&rsquo;re ready to compile and upload, run the following command:</p> <pre tabindex="0"><code># make install flash clean cleandepend </code></pre><p>If all went well you should see the board executing the new code. If it doesn&rsquo;t, check which errors the build produced.</p> <h2 id="monitoring">Monitoring</h2> <p>The Arduino IDE provides a serial monitor feature, but FreeBSD has a builtin monitoring utility which can be accessed directly from the terminal. 
Run this whenever you want to monitor your board and exit with <code>~.</code> (use the appropriate port):</p> <pre tabindex="0"><code>$ cu -l /dev/cuaU0 </code></pre><h2 id="using-board-types-other-than-the-uno">Using board types other than the Uno</h2> <p>As mentioned above, we&rsquo;re using the <code>uarduno</code> kernel module. Even though the module&rsquo;s description is <em>&ldquo;FreeBSD Kernel Driver for the Arduino Uno USB interface&rdquo;</em>, you can, in fact, use board types other than the Uno. According to <code>uarduno</code>&rsquo;s <a href="http://www.mrp3.com/uarduno.html">website</a>, you can modify <code>/usr/ports/comms/uarduno/files/ids.txt</code> to include more board types; the two fields are Vendor ID and Product ID. Read the comments inside the file for more information.</p> <pre tabindex="0"><code>{ 0x2341, 0x0001 }, // Arduino UNO, vendor 2341H, product 0001H { 0x2341, 0x0042 }, // Arduino MEGA (rev 3), vendor 2341H, product 0042H { 0x2341, 0x0043 }, // Arduino UNO (rev 3), vendor 2341H, product 0043H { 0x2341, 0x0010 }, // Arduino MEGA 2560 R3, vendor 2341H, product 0010H { 0x2341, 0x8037 }, // Arduino Micro </code></pre><p>When you&rsquo;re done, clean and re-build the port.</p> <h2 id="known-issues-and-their-fixes">Known issues and their fixes</h2> <p>Even though you might have plugged your board into your machine, you might notice that there is no device appearing in <code>/dev</code>. Although there is no definite answer as to why this is happening, make sure that the USB cable is connected properly; on some boards, you should hear a click when it is fully inserted.</p> <p>When trying to use a new library, you might notice that your code doesn&rsquo;t compile. A common issue is that you haven&rsquo;t stored the library in the correct path. 
As mentioned, libraries are stored in <code>/usr/local/arduino/hardware/arduino/avr/libraries/</code>, so you have to move it there.</p> Simple Brainfuck interpreter in C http://margiolis.net/w/brainfuck/ Wed, 02 Sep 2020 00:00:00 +1200 <h2 id="how-brainfuck-works">How Brainfuck works</h2> <p>There are only 8 symbols supported in Brainfuck:</p> <table border="solid" style="width: auto; table-layout: auto"> <tr> <th>Symbol</th> <th>Function</th> </tr> <tr> <td>&gt;</td> <td>Move the pointer one cell to the right</td> </tr> <tr> <td>&lt;</td> <td>Move the pointer one cell to the left</td> </tr> <tr> <td>&#43;</td> <td>Increase the value of the cell at the pointer</td> </tr> <tr> <td>&#45;</td> <td>Decrease the value of the cell at the pointer</td> </tr> <tr> <td>&#91;</td> <td>Beginning of loop</td> </tr> <tr> <td>&#93;</td> <td>End of loop</td> </tr> <tr> <td>&#46;</td> <td>Output the cell at the pointer as an ASCII character</td> </tr> <tr> <td>&#44;</td> <td>Read a character and store its ASCII value in the cell at the pointer</td> </tr> </table> <p>It&rsquo;s best to imagine Brainfuck programs as arrays of integers, which the pointer can manipulate. Let&rsquo;s say this is our initial state:</p> <pre tabindex="0"><code>{ 0, 0, 0, 0, 0, 0 } </code></pre><p>We can assign values to each position in the array by moving the pointer around. Using the + and - symbols, we can increment or decrement by 1 each time. If we wanted to move the pointer <em>two times to the right</em> and increment the value there by 3, we would have a program that looks like this:</p> <pre tabindex="0"><code>&gt;&gt;+++ </code></pre><p>The updated version of the array:</p> <pre tabindex="0"><code>{ 0, 0, 3, 0, 0, 0 } </code></pre><p>Following the same logic, we can assign specific values to each cell and make an actual program. If we wanted to print the letter &ldquo;B&rdquo; on the screen, which corresponds to the ASCII value 66, we could write the following program:</p> <pre tabindex="0"><code>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++. 
</code></pre><p>In order to avoid writing things like this we can use loops. A loop is executed as long as the value of the cell at the pointer is not 0. Essentially, it&rsquo;s going to do the multiplication 11 x 6 = 66 and then print the value:</p> <pre tabindex="0"><code>+++++ +++++ + # add 11 to cell #0 [ # beginning of loop &gt; +++ +++ # add 6 to cell #1 &lt; - # subtract 1 from cell #0 ] # end of loop # value at cell 1 is now 66 (11 x 6 = 66) &gt; . # go to cell 1 and print its value </code></pre><p>Or, for compactness:</p> <pre tabindex="0"><code>+++++++++++[&gt;++++++&lt;-]&gt;. </code></pre><p>You can learn more about how Brainfuck works <a href="https://esolangs.org/wiki/Brainfuck">here</a>.</p> <h2 id="building-the-interpreter">Building the interpreter</h2> <p>We&rsquo;ll first read the Brainfuck source from <code>stdin</code> into a static size buffer. 50,000 bytes should be large enough to store any Brainfuck program, since I doubt anyone is mad enough to write actual programs in it. Reading stops early if the buffer fills up, leaving room for the terminating NUL:</p> <pre tabindex="0"><code>#define BUFSIZE 50000 . . . size_t len = 0; char buf[BUFSIZE]; while (len &lt; BUFSIZE - 1 &amp;&amp; read(STDIN_FILENO, &amp;buf[len], 1) &gt; 0) len++; buf[len] = &#39;\0&#39;; </code></pre><p>We&rsquo;ll declare the rest of the needed variables:</p> <pre tabindex="0"><code>int closed; /* nesting depth while scanning back from a ] */ int opened; /* nesting depth while scanning forward from a [ */ int pos = 0; /* position in the program */ unsigned short *pc; /* data pointer (current cell) */ char *src; /* source code */ </code></pre><p>One of the reasons we have a <code>len</code> variable is to allocate just enough memory for <code>src</code> (plus one byte for the terminating NUL that <code>strcpy()</code> copies). We&rsquo;ll also zero the whole buffer because we now want to use it to store the values the Brainfuck program will produce:</p> <pre tabindex="0"><code>if ((src = malloc(len + 1)) == NULL) { perror(&#34;malloc&#34;); exit(1); } strcpy(src, buf); memset(buf, 0, sizeof(buf)); </code></pre><p>We can now parse the source code symbol by symbol. 
<code>pc</code> will act as the &ldquo;pointer&rdquo;, moving back and forth in the array. Each symbol will have its own case inside the following <code>switch</code> statement:</p> <pre tabindex="0"><code>for (pc = (unsigned short *)buf, pos = 0; pos &lt; len; pos++) { switch (src[pos]) { ... } } </code></pre><p>For the <code>&lt;</code> and <code>&gt;</code> symbols we simply move the pointer:</p> <pre tabindex="0"><code>case &#39;&gt;&#39;: pc++; break; case &#39;&lt;&#39;: pc--; break; </code></pre><p>The + and - symbols in/decrement the value of the cell the pointer is currently at:</p> <pre tabindex="0"><code>case &#39;+&#39;: (*pc)++; break; case &#39;-&#39;: (*pc)--; break; </code></pre><p>To implement the <code>.</code> and <code>,</code> symbols we&rsquo;ll use the standard library&rsquo;s <code>putchar()</code> and <code>getchar()</code> functions:</p> <pre tabindex="0"><code>case &#39;.&#39;: putchar(*pc); break; case &#39;,&#39;: *pc = getchar(); break; </code></pre><p>Now comes the last, but hardest, part, which is to implement loops. The logic behind my implementation is that instead of keeping track of every bracket to know where a loop starts and ends, the program keeps going through the source code and, using a counter, we know that a loop starts or ends when that counter is 0 <em><strong>and</strong></em> an opposite bracket has been found. 
Also, in each iteration the <code>pos</code> variable changes accordingly so that we can imitate the looping behavior of Brainfuck.</p> <p>These are the steps the program follows for each of the two symbols:</p> <h2 id="beginning-of-loop-">Beginning of loop: <code>[</code></h2> <ul> <li>If the pointer&rsquo;s value <em>is</em> 0, the loop body must be skipped, so we scan forward for the matching <code>]</code>.</li> <li>While scanning, count how many (if any) nested loops we encounter.</li> <li>If we encounter a <code>[</code>, increment the <code>opened</code> variable.</li> <li>If we encounter a <code>]</code>, decrement the <code>opened</code> variable.</li> <li>Keep scanning until we find a <code>]</code> while there are no open nested loops (i.e. <code>opened</code> is 0).</li> </ul> <pre tabindex="0"><code>case &#39;[&#39;: if (!(*pc)) { for (opened = 0, pos++; pos &lt; len; pos++) { if (src[pos] == &#39;]&#39; &amp;&amp; !opened) break; else if (src[pos] == &#39;[&#39;) opened++; else if (src[pos] == &#39;]&#39;) opened--; } } break; </code></pre><h2 id="end-of-loop-">End of loop: <code>]</code></h2> <ul> <li>If the pointer&rsquo;s value <em>is not</em> 0, we have an active loop and must jump back to its <code>[</code>.</li> <li>Start going back and count how many (if any) nested loops there are.</li> <li>If we encounter a <code>]</code>, increment the <code>closed</code> variable.</li> <li>If we encounter a <code>[</code>, decrement the <code>closed</code> variable.</li> <li>Keep scanning until we find a <code>[</code> while there are no closed nested loops (i.e. <code>closed</code> is 0).</li> </ul> <pre tabindex="0"><code>case &#39;]&#39;: if ((*pc)) { for (closed = 0, pos--; pos &gt;= 0; pos--) { if (src[pos] == &#39;[&#39; &amp;&amp; !closed) break; else if (src[pos] == &#39;]&#39;) closed++; else if (src[pos] == &#39;[&#39;) closed--; } } break; </code></pre><h2 id="putting-it-all-together">Putting it all together</h2> <p>Below is the full program. 
You can also find this program <a href="https://git.sr.ht/~crm/random/tree/master/item/bf">here</a>.</p> <pre tabindex="0"><code>#include &lt;stdio.h&gt; #include &lt;stdlib.h&gt; #include &lt;string.h&gt; #include &lt;unistd.h&gt; #define BUFSIZE 50000 int main(int argc, char *argv[]) { size_t len = 0; int closed, opened, pos = 0; unsigned short *pc; char buf[BUFSIZE], *src; /* Read the program, leaving room for the terminating NUL. */ while (len &lt; BUFSIZE - 1 &amp;&amp; read(STDIN_FILENO, &amp;buf[len], 1) &gt; 0) len++; buf[len] = &#39;\0&#39;; if ((src = malloc(len + 1)) == NULL) { perror(&#34;malloc&#34;); exit(1); } strcpy(src, buf); /* Reuse the whole buffer as the zeroed cell array. */ memset(buf, 0, sizeof(buf)); for (pc = (unsigned short *)buf; pos &lt; len; pos++) { switch (src[pos]) { case &#39;&gt;&#39;: pc++; break; case &#39;&lt;&#39;: pc--; break; case &#39;+&#39;: (*pc)++; break; case &#39;-&#39;: (*pc)--; break; case &#39;.&#39;: putchar(*pc); break; case &#39;,&#39;: *pc = getchar(); break; case &#39;[&#39;: if (!(*pc)) { for (opened = 0, pos++; pos &lt; len; pos++) { if (src[pos] == &#39;]&#39; &amp;&amp; !opened) break; else if (src[pos] == &#39;[&#39;) opened++; else if (src[pos] == &#39;]&#39;) opened--; } } break; case &#39;]&#39;: if (*pc) { for (closed = 0, pos--; pos &gt;= 0; pos--) { if (src[pos] == &#39;[&#39; &amp;&amp; !closed) break; else if (src[pos] == &#39;]&#39;) closed++; else if (src[pos] == &#39;[&#39;) closed--; } } break; } } free(src); return (0); } </code></pre>
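<p>To experiment with the loop logic without going through <code>stdin</code>, the same bracket-scanning approach can be packaged as a function. The following is only a sketch, not part of the program above: the name <code>bf_run()</code>, the <code>BF_CELLS</code> tape size, the one-byte cells and the choice to store 0 on end of input are all assumptions made for illustration. Unlike the full program, it also reports unmatched brackets and a pointer that leaves the tape:</p>

```c
#include <stddef.h>
#include <string.h>

#define BF_CELLS 30000	/* hypothetical tape size, not taken from the article */

/*
 * Hypothetical helper: run the Brainfuck program in src, taking input
 * bytes from the string in (may be NULL) and storing output bytes in out.
 * At most outsz - 1 bytes are stored (extra output is dropped) and out is
 * NUL-terminated.  Returns the number of bytes stored, or -1 on an
 * unmatched bracket or when the pointer leaves the tape.
 */
static long
bf_run(const char *src, const char *in, char *out, size_t outsz)
{
	unsigned char tape[BF_CELLS];
	size_t cell = 0, n = 0;
	long pos, len = (long)strlen(src);
	int depth;

	if (outsz == 0)
		return (-1);
	memset(tape, 0, sizeof(tape));
	for (pos = 0; pos < len; pos++) {
		switch (src[pos]) {
		case '>':
			if (++cell >= BF_CELLS)
				return (-1);
			break;
		case '<':
			if (cell-- == 0)
				return (-1);
			break;
		case '+':
			tape[cell]++;
			break;
		case '-':
			tape[cell]--;
			break;
		case '.':
			if (n + 1 < outsz)
				out[n++] = (char)tape[cell];
			break;
		case ',':
			/* Assumption: end of input stores 0 in the cell. */
			tape[cell] = (in != NULL && *in != '\0') ?
			    (unsigned char)*in++ : 0;
			break;
		case '[':
			if (tape[cell] != 0)
				break;
			/* Cell is 0: skip forward to the matching ']'. */
			for (depth = 0, pos++; pos < len; pos++) {
				if (src[pos] == ']' && depth == 0)
					break;
				else if (src[pos] == '[')
					depth++;
				else if (src[pos] == ']')
					depth--;
			}
			if (pos == len)
				return (-1);
			break;
		case ']':
			if (tape[cell] == 0)
				break;
			/* Cell is not 0: jump back to the matching '['. */
			for (depth = 0, pos--; pos >= 0; pos--) {
				if (src[pos] == '[' && depth == 0)
					break;
				else if (src[pos] == ']')
					depth++;
				else if (src[pos] == '[')
					depth--;
			}
			if (pos < 0)
				return (-1);
			break;
		}
	}
	out[n] = '\0';
	return ((long)n);
}
```

<p>For example, <code>bf_run("+++++++++++[&gt;++++++&lt;-]&gt;.", NULL, out, sizeof(out))</code> stores the string "B" in <code>out</code> and returns 1, while an unbalanced program such as <code>"["</code> returns -1 instead of silently running off the end of the source.</p>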