Meta Sqlite3
Recursive Database Stores
Sqlite3 is a versatile database that can be embedded into larger programs, used as a datastore for web servers, processing scripts, and even as a binary file format for storing configurations. It can use a file as a storage medium, or it can run a database completely in memory. We are going to recursively store a sqlite3 database inside of another sqlite3 database. The only legitimate use case I see for this is like a mini-database-as-a-service, where each database is stored as a single record in a parent database, providing compartmentalization at a low resource cost.
Why? Because I can.
What I’m about to do doesn’t seem to have many applications, and it’s hard to justify. Basically, I did it because I could.
BLOB Storage
Sqlite3 provides 5 primitive “storage classes”:
- NULL
- INTEGER
- REAL
- TEXT
- BLOB
Before we get into storing data that doesn’t qualify as any of these, it’s important to understand that the sqlite backend itself only supports these types. That’s why, if we want to store something that is not one of these, we have to convert it first. For some simple classes, it may be possible to convert to an integer or to text. For example, datetimes could plausibly be converted to either integer or text. A complex number could be converted to text. However, for more obscure or complicated datatypes, it is more practical to convert to a binary format and store it as a blob. When the object is retrieved, it can be converted back to the native datatype. For example, a numpy array can be converted to binary using numpy.ndarray.dump(file)
, and loaded with numpy.load(file)
. Python objects can be “pickled” into byte arrays using pickle.dump(obj,file)
and loaded back with pickle.load(file)
.
Our goal is to save a sqlite database into a BLOB object. To do that, we need a way to dump the sqlite database into a bytestring. sqlite3.Connection
objects profide an iterdump()
method that dumps lines of sql script that will recreate the database when run against a fresh DB using dest.executescript(sqldata.decode('utf-8'))
.
Adapters and Converters
Sqlite3 exposes the adapter and converter interfaces through which callbacks can be registered for custom datatypes when storing them and retrieving them from a database. We are going to be passing the sqlite3.Connection
type to a sql insert statement, so first we write two functions, adapt_sqlite3(con)
, and convert_sqlite3(sqldata)
. adapt_sqlite3(con)
takes in a connection, and using iterdump()
returns a binary blob of sql statements. convert_sqlite3(sqldata)
is called when retrieving the blob from the database. It takes in that binary blob, and returns a new sqlite3.Connection
object after recreating the database. Clearly, this presents some limitations to this strategy: every time you retrieve a database from a row in the parent database, the child database is not simply loaded from memory, but is created as an empty database and filled with the old data. Because of the nature of transactional databases, the databases retrieved from storage are not modified in-place, but have to be re-saved if they are updated. Some work would need to be done to make these more than just effectively read-only meta-databases.
Procedure
1 | #!/usr/bin/env python3 |
Can We Store a Database in Itself? Yes, and it’s an abomination
1 | # sqlite3 DBs to be inserted into parent DB |