c - Improve INSERT-per-second performance of SQLite

ID : 167

viewed : 323

Tags : cperformancesqliteoptimizationc





Top 5 Answer for c - Improve INSERT-per-second performance of SQLite

vote vote

91

Several tips:

  1. Put inserts/updates in a transaction.
  2. For older versions of SQLite - Consider a less paranoid journal mode (pragma journal_mode). There is NORMAL, and then there is OFF, which can significantly increase insert speed if you're not too worried about the database possibly getting corrupted if the OS crashes. If your application crashes the data should be fine. Note that in newer versions, the OFF/MEMORY settings are not safe for application level crashes.
  3. Playing with page sizes makes a difference as well (PRAGMA page_size). Having larger page sizes can make reads and writes go a bit faster as larger pages are held in memory. Note that more memory will be used for your database.
  4. If you have indices, consider calling CREATE INDEX after doing all your inserts. This is significantly faster than creating the index and then doing your inserts.
  5. You have to be quite careful if you have concurrent access to SQLite, as the whole database is locked when writes are done, and although multiple readers are possible, writes will be locked out. This has been improved somewhat with the addition of a WAL in newer SQLite versions.
  6. Take advantage of saving space...smaller databases go faster. For instance, if you have key value pairs, try making the key an INTEGER PRIMARY KEY if possible, which will replace the implied unique row number column in the table.
  7. If you are using multiple threads, you can try using the shared page cache, which will allow loaded pages to be shared between threads, which can avoid expensive I/O calls.
  8. Don't use !feof(file)!

I've also asked similar questions here and here.

vote vote

88

Try using SQLITE_STATIC instead of SQLITE_TRANSIENT for those inserts.

SQLITE_TRANSIENT will cause SQLite to copy the string data before returning.

SQLITE_STATIC tells it that the memory address you gave it will be valid until the query has been performed (which in this loop is always the case). This will save you several allocate, copy and deallocate operations per loop. Possibly a large improvement.

vote vote

74

Avoid sqlite3_clear_bindings(stmt).

The code in the test sets the bindings every time through which should be enough.

The C API intro from the SQLite docs says:

Prior to calling sqlite3_step() for the first time or immediately after sqlite3_reset(), the application can invoke the sqlite3_bind() interfaces to attach values to the parameters. Each call to sqlite3_bind() overrides prior bindings on the same parameter

There is nothing in the docs for sqlite3_clear_bindings saying you must call it in addition to simply setting the bindings.

More detail: Avoid_sqlite3_clear_bindings()

vote vote

61

On bulk inserts

Inspired by this post and by the Stack Overflow question that led me here -- Is it possible to insert multiple rows at a time in an SQLite database? -- I've posted my first Git repository:

https://github.com/rdpoor/CreateOrUpdate

which bulk loads an array of ActiveRecords into MySQL, SQLite or PostgreSQL databases. It includes an option to ignore existing records, overwrite them or raise an error. My rudimentary benchmarks show a 10x speed improvement compared to sequential writes -- YMMV.

I'm using it in production code where I frequently need to import large datasets, and I'm pretty happy with it.

vote vote

58

Bulk imports seems to perform best if you can chunk your INSERT/UPDATE statements. A value of 10,000 or so has worked well for me on a table with only a few rows, YMMV...

Top 3 video Explaining c - Improve INSERT-per-second performance of SQLite







Related QUESTION?