Maximum Speed SQLite Inserts

In my work I tend to reach for SQLite more and more. The type of work I find it useful for most these days is quickly amalgamating, dissecting, collecting and analyzing large data sets. As I have outlined in my Euruko talk on scheduling, a key element of the project was writing a simulator. That simulator outputs metrics - lots and lots of metrics, which resemble what our APM solution collects. Looking at those metrics makes it possible to plot, dissect and examine the performance of various job flows.

You can, of course, store those metrics in plain Ruby objects and then work with them in memory - there is nothing wrong with that. However, I find using SQL vastly superior. And since the simulator only ever runs on one machine, and every session is unique - SQLite is the perfect tool for collecting metrics. Even if it is not a specialized datastore.

One challenge presented itself, though: those metrics get output in very large amounts. Every tick of the simulator can generate thousands of values. Persisting them to SQLite is fast, but with very large amounts that “fast” becomes “not that fast”. I had to go through a number of steps to make these inserts more palatable, which led to a very, very pleasant speed improvement indeed. That seems worth sharing - so strap in and let’s play.

Setting the scene

Let’s generate our data first and see how far we can push our little setup.

rng = Random.new(42)
metrics = %w( foo bar baz bad bleg )
values = (500_000).times.map do |n|
  {name: metrics.sample(random: rng), value: rng.rand}
end

We will assume we are inserting from Hash objects representing column-value mappings. We will use a fresh database for every test and keep it in memory to not even care about the filesystem performance - for now:

def create_db
  db = SQLite3::Database.new(":memory:")
  db.execute("CREATE TABLE metrics (name VARCHAR NOT NULL, value FLOAT NOT NULL)")
  db
end

and add a timing helper:

def timed(&blk)
  t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  delta = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t
  warn "Took #{delta} seconds"
end

First - the naive insert:

timed("Naive") do
  db = create_db
  first_record = values.first
  cols = first_record.keys.join(", ")
  placeholders = (["?"] * first_record.length).join(", ")
  sql = "INSERT INTO metrics (#{cols}) VALUES (#{placeholders})"
  values.each do |cols_to_values|
    db.query(sql, cols_to_values.values)
  end
end

This gives Naive - 2.3065050000150222 seconds. Surely we can do better than that. Transactions for bulk insert are great, let’s use one:

timed("With transaction") do
  db = create_db
  db.transaction do
    first_record = values.first
    cols = first_record.keys.join(", ")
    placeholders = (["?"] * first_record.length).join(", ")
    sql = "INSERT INTO metrics (#{cols}) VALUES (#{placeholders})"
    values.each do |cols_to_values|
      db.query(sql, cols_to_values.values)
    end
  end
end

That gives With transaction - 1.8898840000038035 seconds. Better, but by far not the improvement we need. Let’s use a prepared statement next:

timed("With transaction and prepared statement") do
  db = create_db
  first_record = values.first
  cols = first_record.keys.join(", ")
  placeholders = (["?"] * first_record.length).join(", ")
  sql_stmt = "INSERT INTO metrics (#{cols}) VALUES (#{placeholders})"
  db.transaction do
    prepared_stmt = db.prepare(sql_stmt)
    values.each do |cols_to_values|
      prepared_stmt.execute(cols_to_values.values)
    end
  end
end

This gives: With transaction and prepared statement - 0.6456299999845214 seconds - much better. But we can go further. By default SQLite optimizes for durability (at least on my version). Since we are working with a local database and we do not care about a potential crash, we can “downgrade” the durability of the storage engine to get more speed:

timed("With pragmas, transaction and prepared statement") do
  db = create_db
  db.query("PRAGMA synchronous = OFF")
  db.query("PRAGMA journal_mode = OFF")

  first_record = values.first
  cols = first_record.keys.join(", ")
  placeholders = (["?"] * first_record.length).join(", ")
  sql_stmt = "INSERT INTO metrics (#{cols}) VALUES (#{placeholders})"
  db.transaction do
    prepared_stmt = db.prepare(sql_stmt)
    values.each do |cols_to_values|
      prepared_stmt.execute(cols_to_values.values)
    end
  end
end

Still better: With pragmas, transaction and prepared statement - 0.6219140000175685 seconds - this is already a substantial improvement, but we can give the crank another turn.

Host parameter stuffing

How can we make it even faster than that? Well, the INSERT SQL statement supports multiple tuples in sequence, as long as they have the same cardinality. A bit like so:

  INSERT INTO metrics (name, value) VALUES ('foo', 1.0), ('bar', 2.0), ('baz', 4.0)

We can assign our placeholders in the prepared statement and then pass our bound parameters in the end:

db.query("INSERT INTO metrics (name, value) VALUES (?, ?), (?, ?)", ["foo", 1.0, "bar", 2.0"])

But there is a limit - the maximum number of bound variables per SQL statement, varies with the version of SQLite. Sadly, the sqlite3 gem does not support querying for sqlite3_limit(), but the info says:

To prevent excessive memory allocations, the maximum value of a host parameter number is SQLITE_MAX_VARIABLE_NUMBER, which defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

We can thus assume the value to be 999 for now (but you can query for the version using SQLite3::VERSION or try to find another way to access the sqlite3_limit API). What we need to do is figure out how many of our records we can stuff into a single INSERT - since we cannot really “split” the records, we always need to insert all values pertaining to a single record in one statement.

max_bindvars = 999
first_record = values.first
cardinality = first_record.length
records_per_statement, _ = max_bindvars.divmod(cardinality)

This shows us that we can at most stuff records_per_statement into a single INSERT (the remainder is not really useful here). We will use 2 statements, one of which we will prepare - since it is going to be reused. The first one will fit as many records as we can and bind variables for all of them - 999 bindvars or less, depending on the cardinality of our records. The second one will contain enough bindvars to fit the remaining records, and will be used only once - in fact, we do not even need to prepare it.

timed("With multirow inserts, pragmas, transaction and prepared statement") do
  db = create_db
  db.query("PRAGMA synchronous = OFF")
  db.query("PRAGMA journal_mode = OFF")

  first_record = values.first

  # We need to group our records into blocks of at most max_bindvars values
  cardinality = first_record.length
  row_placeholder = "(" + (["?"] * cardinality).join(", ") + ")" # => (?, ?, ?)
  max_bindvars = 999
  max_records_per_statement, _ = max_bindvars.divmod(cardinality)
  prepared_statement_for_max = nil
  cols = first_record.keys.join(", ")

  db.transaction do
    values.each_slice(max_records_per_statement) do |records_subset|
      bound_params = records_subset.flat_map(&:values)
      if records_subset.length == max_records_per_statement
        prepared_statement_for_max ||= begin
          placeholders_for_larger_chunk = ([row_placeholder] * max_records_per_statement).join(", ")
          sql_max = "INSERT INTO metrics (#{cols}) VALUES #{placeholders_for_larger_chunk}"
          db.prepare(sql_max)
        end
        prepared_statement_for_max.execute(bound_params)
      else
        # This is the last slice which is smaller
        placeholders_for_smaller_chunk = ([row_placeholder] * records_subset.length).join(", ")
        sql_rest = "INSERT INTO metrics (#{cols}) VALUES #{placeholders_for_smaller_chunk}"
        db.query(sql_rest, bound_params)
      end
    end
  end
end

Running all of our implementations then gives us:

Naive - 2.7048650000069756 seconds
With transaction - 2.3600640000076964 seconds
With transaction and prepared statement - 0.637083999987226 seconds
With pragmas, transaction and prepared statement - 0.6406159999896772 seconds
With multirow inserts, pragmas, transaction and prepared statement - 0.3141590000013821 seconds

We can see that using multirow inserts gives us a 2x speedup. Splendid.

Memory databases to disk

Of course, this is with memory databases - so it is probably very fast because of that. But what if I told you that you can actually serialize a memory DB onto disk very quickly, just using the builtin SQLite functions? A little-known feature of SQLite called online backup can be used to prepare your database in memory, do all of the bulk operations – and then write it out onto the filesystem, in a very fast (and consistent) way. The API in the Ruby gem is not pretty - but it is there and it works, and it works well (has been for more than a decade, in fact). Let’s put it to use:

def write_to_disk(source_db, filename)
  destination_db = SQLite3::Database.new(filename)
  b = SQLite3::Backup.new(destination_db, 'main', source_db, 'main')
  begin
    b.step(1)
  end while b.remaining > 0
  b.finish
  destination_db.close
end

Running the code gives us:

Naive - 2.706573000003118 seconds
With a prepared statement - 0.990191999997478 seconds
With transaction and prepared statement - 0.627656000026036 seconds
With pragmas, transaction and prepared statement - 0.6277800000098068 seconds
With multirow inserts, pragmas, transaction and prepared statement - 0.3135960000217892 seconds

These timings include serialization to disk using the backup API. And produces a few SQLite3 files of exactly the same size.

Comparing disk and memory performance

Out of curiosity, I would like to show what kind of performance we can have if we perform the same “accelerated” inserts on a disk DB, with the same 500000 rows:

Disk DBs:

Naive - 474.50496300001396 seconds
With transaction - 2.3882359999988694 seconds
With transaction and prepared statement - 0.7706829999806359 seconds
With pragmas, transaction and prepared statement - 0.7386469999910332 seconds
With multirow inserts, pragmas, transaction and prepared statement - 0.3565039999957662 seconds

Memory DBs:

Naive - 2.3065050000150222 seconds
With transaction - 1.8898840000038035 seconds
With transaction and prepared statement - 0.6456299999845214 seconds
With pragmas, transaction and prepared statement - 0.634888000000501 seconds
With multirow inserts, pragmas, transaction and prepared statement - 0.31626500000129454 seconds

This is curious: using a DB in RAM only helps is in the most pathological case with our “naive” inserts – but for other cases performance is on par. Aren’t modern SSDs marvelous?

So there you have it: a roughly x8 speedup for inserts.