Simple Brainfuck interpreter in C

Brainfuck has a rather descriptive name. However, writing an interpreter for it is not hard, in fact, the only thing required is some basic understanding of pointers. This is not going to be about writing a ridiculously small Brainfuck interpreter like this one.

How Brainfuck works

There are only 8 symbols supported in Brainfuck:

Symbol Function
> Increase position of pointer
< Decrease position of pointer
+ Increase value of pointer
- Decrease value of pointer
[ Beginning of loop
] End of loop
. Output ASCII code of pointer
, Read a character and stores its ASCII value in pointer

It's best to imagine a Brainfuck program as an array of values, which the pointer can access and manipulate. Let's say the array looks like this when the program starts:

	{ 0, 0, 0, 0, 0, 0 }
	

The pointer can now be moved "left" and "right" and assign values to each position in the array. Using the + and - symbols we can increment or decrement by 1 each time. If we wanted to move the pointer two times to the right and increment the value at the given position by 3, we would have a program that looks like this

	>>+++
	

The array now looks like this

	{ 0, 0, 3, 0, 0, 0 }
	

Following the same logic, we can assign specific values to each cell and make an actual program. If we wanted to print the letter "B" on the screen, which corresponds to the ASCII value 66, we could - badly - write the following program

	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
	

In order to avoid writing things like this we can use loops. Inside a loop the code is executed while the value inside it is not 0. So we can rewrite our program in a better way.
Essentially, it's going to do the multiplication 10 x 6 = 66 and then print the value.

	+++++ +++++		# add 10 to cell #0
	[			# beginning of loop
		> +++ +++	# add 6 to cell #1
		< -		# subtract 1 from cell #0
	]			# end of loop
				# value at cell 1 is now 66 (10 x 6 = 66)
	> .			# go to cell 1 and print its value
	

Or, to make it look better

	+++++++++++[>++++++<-]>.
	

You can learn more about how Brainfuck works here.

Building the interpreter

The interpreter's job is to parse the source code and apply the behavior I specified in the beginning when reading a new character. First of all we need to read the Brainfuck source file in our C program and store it in a buffer. We'll be reading from stdin directly so that the program is more extensible. We'll also set a large buffer size so that it can accept a large source files. 50.000 bytes is enough for a Brainfuck program, since it's a useless language and no one writes actual programs in it.

	#define BUFSIZE 50000
	.
	.
	.
	size_t len = 0;
	char buf[BUFSIZE];

	while (read(STDIN_FILENO, &buf[len], 1) > 0)
		len++;
	buf[len] = '\0';
	

The objectives are the following

  1. Store the whole source file in memory
  2. Read the code symbol by symbol and act accordingly
  3. Store the values in an empty buffer with the use of a pointer

We'll declare the rest of the needed variables.

	int closed;		/* number of active loops */
	int opened;		/* number of inactive loops */
	int pos = 0;		/* position in the program */
	unsigned short *pc;	/* program counter */
	char *src;		/* source code */
	

One of the reasons we have a len variable is to allocate just enough memory for src. We'll also empty the buffer because we now want to use it to store the values the Brainfuck program will produce.

	if ((src = malloc(len)) == NULL) {
		perror("malloc");
		exit(1);
	}
	strcpy(src, buf);
	memset(buf, 0, len);
	

Now we can go through the source code array, starting from position 0 (i.e the first symbol in the source code) and manipulate the buffer which we'll store the values in - the pointer (pc) will move inside the buffer. We will do this using a switch statement. The interpreter will ignore any non-Brainfuck symbol.

	for (pc = (unsigned short *)buf, pos = 0; pos < len; pos++) {
		switch (src[pos]) {
			.
			.
			.
			.
		}
	}
	

For the < and > symbols the process is to simply increment or decrement the position of our pointer by 1.

	case '>':
		pc++;
		break;
	case '<':
		pc--;
		break;
	

The + and - symbols increment the value of the cell that the pointer is currently pointing at, so what we have to do is increment the value of the pointer.

	case '+':
		(*pc)++;
		break;
	case '-':
		(*pc)--;
		break;
	

To implement the . and , symbols we'll use the putchar() and getchar() functions.

	case '.':
		putchar(*pc);
		break;
	case ',':
		*pc = getchar();
		break;
	

Now comes the last, but hard part, which is to implement loops. The logic behind my implementation is that instead of keeping track of every bracket to know where a loop starts and ends, the program keeps going through the source code and, using a counter, we know that a loop starts or ends when that counter is 0 and and an opposite bracket has been found.
Also, in each iteration the pos variable changes accordingly so that we can imitate the looping behavior of Brainfuck.

These are the steps the program follows for each of the two symbols.

Beginning of loop: [

	case '[':
	if (!(*pc)) {
		for (opened = 0; pos++; pos < len; pos++) {
			if (src[pos] == ']' && !opened)
				break;
			else if (src[pos] == '[')
				opened++;
			else if (src[pos] == ']')
				opened--;
		}
	}
	break;
	

End of loop: ]

	case ']':
	if ((*pc)) {
		for (closed = 0; pos--; pos >= 0; pos--) {
			if (src[pos] == '[' && !closed)
				break;
			else if (src[pos] == ']')
				closed++;
			else if (src[pos] == '[')
				closed--;
		}
	}
	break;
	

Putting it all together

Below is the full program. You can also find this program here.

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	#define BUFSIZE 50000

	int
	main(int argc, char *argv[])
	{
		size_t len = 0;
		int closed, opened, pos = 0;
		unsigned short *pc;
		char buf[BUFSIZE], *src;

		while (read(STDIN_FILENO, &buf[len], 1) > 0)
			len++;
		buf[len] = '\0';

		if ((src = malloc(len)) == NULL) {
			perror("malloc");
			exit(1);
		}
		strcpy(src, buf);
		memset(buf, 0, len);

		for (pc = (unsigned short *)buf; pos < len; pos++) {
			switch (src[pos]) {
			case '>':
				pc++;
				break;
			case '<':
				pc--;
				break;
			case '+':
				(*pc)++;
				break;
			case '-':
				(*pc)--;
				break;
			case '.':
				putchar(*pc);
				break;
			case ',':
				*pc = getchar();
				break;
			case '[':
				if (!(*pc)) {
					for (opened = 0, pos++; pos < len; pos++) {
						if (src[pos] == ']' && !opened)
							break;
						else if (src[pos] == '[')
							opened++;
						else if (src[pos] == ']')
							opened--;
					}
				}
				break;
			case ']':
				if (*pc) {
					for (closed = 0, pos--; pos >= 0; pos--) {
						if (src[pos] == '[' && !closed)
							break;
						else if (src[pos] == ']')
							closed++;
						else if (src[pos] == '[')
							closed--;
					}
				}
				break;
			}
		}
		free(src);

		return 0;
	}