How the SQLite Virtual Machine Works

Author

Name: Ben Johnson
@benbjohnson: @benbjohnson

A robot furiously making sandwiches. — Image by Annie Ruygt

Fly.io runs apps close to users around the world, by taking containers and upgrading them to full-fledged virtual machines running on our own hardware around the world. Sometimes those containers run SQLite and we make that easy too. Give us a whirl and get up and running quickly.

SQL is a weird concept. You write your application in one language, say JavaScript, and then send commands in a completely different language, called SQL, to the database. The database then compiles and optimizes that SQL command, runs it, and returns your data. It seems terribly inefficient and, yet, your application might do this hundreds of times per second. It’s madness!

But it gets weirder.

SQL was originally designed for non-technical users to interact with the database, however, it’s used almost exclusively by software developers peppering it throughout their applications.

Why would this language made for “business folks” become the industry standard for how applications are built?

One of the key benefits of SQL is that it is declarative. That means you tell the database what you want but not how to do it. Your database knows WAY more about your data than you do so it should be able to make better decisions about how to fetch it and update it. This lets you improve your data layer by adding indexes or even restructuring tables with minimal effects on your application code.

SQLite is unique among embedded databases in that it not only has a transactional, b-tree storage layer but it also includes a robust SQL execution engine. Today, we’re diving into how SQLite parses, optimizes, & executes your SQL queries.

A sandwich-making machine

If you’ve read our previous sandwich-themed SQLite blog posts on the SQLite file format, the rollback journal, & the WAL, then you’re probably feeling pretty hungry by now. You’re also probably tired of the tedium of making sandwiches by hand, so we’ll use a sandwich-making machine as our analogy in this blog post.

This machine will do a few tasks:

Take an order for sandwiches.
Determine the most efficient way to build the sandwiches.
Build the sandwiches and hand them to you.

The process for building and executing SQL queries is similar to this sandwich-building process, albeit less delicious. Let’s dive in.

Teaching our machine to read

The first step is to give our machine an order. We hand it an order slip that says:

Make 3 BLT sandwiches hold the mayo, 1 grilled cheese

To our computer, this order is just a string of individual characters: M, a, k, e, etc… If we want to make sense of it, we first need to group these letters together into words, or more specifically, “tokens”. This process is called “tokenizing” or “lexing”.

After tokenizing, we see this list of tokens:

"MAKE", "3", "BLT", "SANDWICHES", "HOLD", "THE", "MAYO", ",", "1", "GRILLED", "CHEESE"

From there, we start the parsing stage. The parser takes in a stream of tokens and tries to structure it some way that makes sense to a computer. This structure is called an Abstract Syntax Tree, or AST.

This AST for our sandwich command might look like this:

{
  "command": "MAKE",
  "sandwiches": [
    {
      "type":"BLT",
      "count": 3,
      "remove": ["MAYO"]
    },
    {
      "type": "GRILLED CHEESE",
      "count": 1
    }
  ]
}

From here, we can start to see how we might take this definition and begin building sandwiches from it. We’ve added structure to an otherwise structure-less blob of text.

Lexing & Parsing SQL

SQLite does this same process when it reads in SQL queries. First, it groups characters together into tokens such as SELECT or FROM. Then the parser builds a structure to represent it.

The SQLite documentation provides helpful “railroad diagrams” to represent the paths the parser can take when consuming the stream of tokens. The SELECT definition shows how it can start with the WITH keyword (for CTEs) and then move into the SELECT, FROM, and WHERE clauses.

When the parser is done, it outputs the aptly named Select struct. If you had a SQL query like this:

SELECT name, age FROM persons WHERE favorite_color = 'lime green'

Then you’ll end up with an AST that looks something like this:

{
  "ExprList": ["name", "age"],
  "Src": "persons",
  "Where": {
    "Left": "favorite_color",
    "Right: "lime_green",
    "Op": "eq"
  }
}

Determining the best course of action

So now that we have our sandwich order AST, we have a plan to make our sandwich, right? Not quite.

The AST represents what you want—which is a couple of sandwiches. It doesn’t tell us how to make the sandwiches. Before we get to the plan, though, we need to determine the optimal way to make the sandwiches.

Our sandwich-making machine can assemble a plethora of different sandwiches, so we stock all kinds of ingredients. If we were making a monster sandwich loaded with most of our available toppings it might make sense for the machine to visit each ingredient’s location, using it, or not, according to the AST.

But for our BLT, we need only bacon, lettuce & tomato. It’ll be way faster if we can have the machine look up the locations of just these three toppings in an index and jump directly between them.

SQLite has a similar decision to make when planning how to execute a query. For this, it uses statistics about its tables’ contents.

Using statistics for faster queries

When SQLite looks at an AST, there could be hundreds of ways to access the data to fulfill a query. The naive approach would be to simply read through the whole table and check every row to see if it matches. This what we in the biz call a full table scan and it is painfully slow if you only need a few rows from a large table.

Another option would be to use an index to help you quickly jump to the rows you need. An index is a list of row identifiers that are sorted by one or more columns, so if we have an index like this:

CREATE INDEX favorite_color_idx ON persons (favorite_color);

Then all the row identifiers for people who love “mauve” are all grouped together in our index. Using the index for our query means we have to first read from the index and then jump to a row in the table. This has a higher cost per row as it requires two lookups, however almost no one likes mauve so we don’t have too many matching rows.

But what happens if you search for a popular color like “blue”? Searching the index first and then jumping to our table for so many rows would actually be slower than if we simply searched the entire table.

So SQLite does some statistical analysis on our data and uses this information to choose the (probably) optimal recipe for each query.

SQLite’s statistics are stored in several “sqlite_stat” tables. These tables have evolved over the years so there’re 4 different versions of stats but only two are still in use with recent versions of SQLite: sqlite_stat1 & sqlite_stat4.

The sqlite_stat1 table has a simple format. It stores the approximate number of rows for each index and it stores the number of duplicate values for the columns of the index. These coarse-grained stats are the equivalent of tracking basic averages for a data set—they’re not super accurate but they’re quick to calculate and update.

The sqlite_stat4 table is a bit more advanced. It will store a few dozen samples of values that are spread across an index. These finer-grained samples mean that SQLite can understand how unique different values are across the key space.

Executing on our plan

Once we have an optimized plan for building a sandwich, we should have our machine write it down. That way if we get the same order again in the future, we can simply reuse the plan rather than having to parse & optimize the order each time.

So what does this sandwich plan look like?

The plan will be recorded as a list of commands that the machine can execute to build the BLT again in the future. We don’t want a command for each type of sandwich, as we may have a lot of different types. Better to have a set of common instructions that can be reused to compose any sandwich plan.

For example, we might have the following commands:

FIND_INGREDIENT_BIN(ingredient_name) - looks up the bin position of an ingredient.
FETCH_INGREDIENT(bin) - this grabs an ingredient with the machine’s robot arm from the given ingredient bin number.
APPLY_INGREDIENT - this puts the ingredient from the robot arm onto the sandwich.
GRILL - this grills the current sandwich.

We also have one more requirement that’s not immediately obvious. We only have so much space to hold our finished sandwiches so we need to make one sandwich at a time and have the customer take it before making the next sandwich. That way we can handle any number of sandwiches in an order.

This process of handing off is called yielding so we’ll have a YIELD command when where we wait for the customer to take the sandwich.

We’ll also need some control flow so we can make multiple of the same kind of sandwich so we’ll add a FOREACH command.

So putting our commands together, our plan might look like:

// Make our BLT sandwiches
FOREACH 1...3
  bin = FETCH_INGREDIENT_BIN("bacon")
  FETCH_INGREDIENT(bin)
  APPLY_INGREDIENT

  bin = FETCH_INGREDIENT_BIN("lettuce")
  FETCH_INGREDIENT(bin)
  APPLY_INGREDIENT

  bin = FETCH_INGREDIENT_BIN("tomato")
  FETCH_INGREDIENT(bin)
  APPLY_INGREDIENT

  YIELD
END

// Make our grilled cheese
bin = FETCH_INGREDIENT_BIN("cheese")
FETCH_INGREDIENT(bin)
APPLY_INGREDIENT
GRILL
YIELD

This set of domain-specific commands and the execution engine to run it is called a virtual machine. It gives us a level of abstraction that’s appropriate for the task we’re trying to complete (e.g. sandwich making) and it lets us reconfigure commands in different ways for different sandwiches.

Inspecting the SQLite Virtual Machine

SQLite’s virtual machine is structured similarly. It has a set of database-related commands that can execute the steps needed to fetch the results of a query.

For example, let’s start with a table of people with a few rows added:

CREATE TABLE persons (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT,
  favorite_color TEXT
);

INSERT INTO persons
  (name, favorite_color)
VALUES
  ('Vicki', 'black'),
  ('Luther', 'mauve'),
  ('Loren', 'blue');

We can inspect this with two different SQLite commands. The first command is called EXPLAIN QUERY PLAN and it gives a very high level plan of the query. If we run it for a simple SELECT with a conditional then we’ll see that it performs a table scan of the persons table:

sqlite> EXPLAIN QUERY PLAN SELECT * FROM persons WHERE favorite_color = 'blue';

QUERY PLAN
`--SCAN persons

This command can give more information as you do more complex queries. Now let’s look at the other command to further inspect the plan.

Confusingly, it’s called the EXPLAIN command. Simply drop the “QUERY PLAN” part of the first command and it will show a much more detailed plan:

sqlite> EXPLAIN SELECT * FROM persons WHERE favorite_color = 'blue';

addr  opcode         p1    p2    p3    p4             p5  comment      
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     11    0                    0   Start at 11
1     OpenRead       0     2     0     3              0   root=2 iDb=0; persons
2     Rewind         0     10    0                    0   
3       Column       0     2     1                    0   r[1]=persons.favorite_color
4       Ne           2     9     1     BINARY-8       82  if r[1]!=r[2] goto 9
5       Rowid        0     3     0                    0   r[3]=rowid
6       Column       0     1     4                    0   r[4]=persons.name
7       Column       0     2     5                    0   r[5]=persons.favorite_color
8       ResultRow    3     3     0                    0   output=r[3..5]
9     Next           0     3     0                    1   
10    Halt           0     0     0                    0   
11    Transaction    0     0     1     0              1   usesStmtJournal=0
12    String8        0     2     0     blue           0   r[2]='blue'
13    Goto           0     1     0                    0

This is the “plain English” representation of the byte code that your query is compiled down to. This may look confusing but we can walk through it step-by-step to break it down.

The SQLite virtual machine instruction set

Just like how a computer has low-level CPU operations such MOV and JMP, SQLite has a similar instruction set but it’s just at a higher level. As of this writing, there are 186 commands, or opcodes, that the SQLite VM can understand. You can find the full specification on the SQLite web site but we’ll walk through a couple of them here.

The first opcode is an Init which initializes our execution and then jumps to another instruction in our program. The parameters for the opcodes are listed as p1 through p5 and their definition is specific to each command. For the Init opcode, it jumps to the instruction listed in p2 which is 11.

At address 11 we arrive at the Transaction opcode which starts our transaction. For most opcodes, the VM will move to the next address after executing the instruction so we move to address 12. This String8 opcode stores string value "blue" into register r[2]. The registers act like a set of memory addresses and are used to store values during execution. We’ll use this value later for our equality comparison.

Next, we move to address 13 which is a Goto instruction which has us jump to the instruction listed in its p2 parameter, which is address 1.

Now we get into the row processing. The OpenRead instruction opens a cursor on the persons table. A cursor is an object for iterating over or moving around in a table. The next instruction, Rewind, moves the cursor to the first entry of the database to begin our table scan.

The Column instruction reads the favorite_color column into register r[1] and the Ne instruction compares it with the "blue" value in register r[2]. If the values don’t match then we’ll move to the Next instruction at address 9. If they do match, we’ll fill in registers r[3] , r[4], & r[5] with the column id, name, & favorite_color for the row.

Finally, we get to where we can yield the result back to the caller using the ResultRow instruction. This will let the calling application copy out the values in registers r[3…5]. When the calling application calls sqlite3_step(), the program will resume from where it left off by calling Next and jumping back to re-execute the row processing at instruction 3.

When Next no longer produces any more rows, it’ll jump to the Halt instruction and the program is done.

Wrapping up our sandwich processing engine

The query execution side of SQLite follows this simple parse-optimize-execute plan on every query that comes into the database. We can use this knowledge to improve our application performance. By using bind parameters in SQL statements (aka those ? placeholders), we can prepare a statement once and skip the parse & optimize phases every time we reuse it.

SQLite uses a virtual machine approach to its query execution but that’s not the only approach available. Postgres, for example, uses a node-based execution plan which is structured quite differently.

Now that you understand the basics of how an execution plan works, try running EXPLAIN on one of your more complex queries and see if you can understand the step-by-step execution of how your query materializes into a result set for your application.

Next post ↑: Scale-to-Zero Minecraft server with Terraform and Fly Machines
Previous post ↓: How SQLite Scales Read Concurrency