This is my take on the "Build your own SQLite" Challenge by CodeCrafters. It's a simple command line tool as well as a library to parse a .db file in the SQLite file format and run some simple queries against the database. It's mostly for didactic purposes, and is therefore not even close to feature complete, fast, or smart about the way it runs its queries and retrieves data from the SQLite file. Still, it was a very nice introduction to the world of relational databases and how they are actually implemented behind the scenes, and it was quite a lot of fun to work on. If you want to give it a spin, you can find some instructions below.
To get started, build the tool with the following command:
cargo build -r

A few sample databases are provided to show the functionality of the tool:
- sample.db: a small database containing two tables, apples and oranges, with a few rows in them and no indices
- superheroes.db: a larger database (~1MB) containing one table, superheroes, with more rows (~7K) and no indices
- companies.db: an even larger database (~7MB) containing one table, companies, with even more rows (~56K) and one index on the country column
Note
The commands below are among the few that are supported, and even they might not work on all SQLite files, depending on the schema of the tables in the database. It doesn't take much straying off this path to run into panics and other more or less user-friendly errors. You can try the commands out on the given sample files, and have some fun adding support for a SQLite database you want to run the tool on if you feel like it :)
You can get some information about the db with the .dbinfo command:
$ sqlzero sample.db '.dbinfo'
database page size: 4096
number of tables: 2
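The page size is read straight from the 100-byte database header at the start of the file; the table count is not stored there, so it has to be derived by reading the sqlite_schema table rooted on page 1 and counting the entries whose type is table. A minimal sketch of the header read (not the actual sqlzero internals):

use std::fs::File;
use std::io::Read;

// The first 100 bytes of a SQLite file are the database header; the page size
// is a big-endian u16 at byte offset 16 (a stored value of 1 means 65536).
fn read_page_size(path: &str) -> std::io::Result<u32> {
    let mut header = [0u8; 100];
    File::open(path)?.read_exact(&mut header)?;
    assert_eq!(&header[..16], &b"SQLite format 3\0"[..], "not a SQLite database");
    let raw = u16::from_be_bytes([header[16], header[17]]);
    Ok(if raw == 1 { 65_536 } else { raw as u32 })
}

For sample.db this should come out as the 4096 shown above.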
You can also get the names of the tables or indices in the file with the .tables and .indices commands:

$ sqlzero sample.db '.tables'
apples oranges
$ sqlzero companies.db '.indices'
idx_companies_country on companies(country)

You can count all the rows in a table:
$ sqlzero companies.db 'select count(*) from companies'
55991
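Under the hood, count(*) boils down to finding the table's root page in sqlite_schema and walking its b-tree, summing the cell counts of the leaf pages. A rough sketch of that traversal, assuming the whole file has already been read into memory (again, not the actual sqlzero code):

// Table leaf pages (type 0x0d) hold one cell per row; interior pages (0x05)
// only point at children, so count(*) can be answered by summing leaf cells.
fn count_rows(db: &[u8], page_size: usize, page_no: usize) -> u64 {
    let base = (page_no - 1) * page_size;
    // Page 1 embeds the 100-byte database header before its b-tree header.
    let hdr = base + if page_no == 1 { 100 } else { 0 };
    let page_type = db[hdr];
    let n_cells = u16::from_be_bytes([db[hdr + 3], db[hdr + 4]]) as usize;
    match page_type {
        0x0d => n_cells as u64, // table leaf: one cell per row
        0x05 => {
            // Table interior: a 4-byte left child pointer at the start of each
            // cell, plus the right-most child pointer at offset 8 of the header.
            let cell_ptrs = hdr + 12;
            let mut total = 0;
            for i in 0..n_cells {
                let off = u16::from_be_bytes([db[cell_ptrs + 2 * i], db[cell_ptrs + 2 * i + 1]]) as usize;
                let child = u32::from_be_bytes([db[base + off], db[base + off + 1], db[base + off + 2], db[base + off + 3]]) as usize;
                total += count_rows(db, page_size, child);
            }
            let right = u32::from_be_bytes([db[hdr + 8], db[hdr + 9], db[hdr + 10], db[hdr + 11]]) as usize;
            total + count_rows(db, page_size, right)
        }
        t => panic!("unexpected page type {t:#x}"),
    }
}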
You can run a select * query to get all the rows in a table:

$ sqlzero sample.db 'select * from apples'
1|Granny Smith|Light Green
2|Fuji|Red
3|Honeycrisp|Blush Red
4|Golden Delicious|Yellow
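Each of those rows comes out of a leaf cell as a record: a header listing one serial type per column, followed by the values themselves, with lengths encoded as SQLite varints. Two sketch helpers for that decoding step (simplified; serial types 10 and 11 are reserved and not handled):

// A varint is 1-9 bytes: 7 bits per byte, high bit means "continue",
// and the 9th byte (if reached) contributes all 8 of its bits.
fn read_varint(buf: &[u8]) -> (u64, usize) {
    let mut value = 0u64;
    for i in 0..8 {
        value = (value << 7) | (buf[i] & 0x7f) as u64;
        if (buf[i] & 0x80) == 0 {
            return (value, i + 1);
        }
    }
    ((value << 8) | buf[8] as u64, 9)
}

// How many payload bytes a serial type occupies (and roughly what it means).
fn serial_type_len(serial_type: u64) -> usize {
    match serial_type {
        0 | 8 | 9 => 0,                 // NULL, literal 0, literal 1
        1 => 1, 2 => 2, 3 => 3, 4 => 4, // big-endian two's-complement ints
        5 => 6, 6 => 8,                 // 48- and 64-bit ints
        7 => 8,                         // IEEE 754 float
        n if n >= 13 && n % 2 == 1 => ((n - 13) / 2) as usize, // text
        n if n >= 12 => ((n - 12) / 2) as usize,               // blob
        _ => panic!("serial types 10 and 11 are reserved"),
    }
}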
This also works against the sqlite_schema table, even though it wasn't listed in the output of .tables:

$ sqlzero sample.db 'select * from sqlite_schema'
table|1|apples|2|CREATE TABLE apples
(
id integer primary key autoincrement,
name text,
color text
)
table|2|sqlite_sequence|3|CREATE TABLE sqlite_sequence(name,seq)
table|3|oranges|4|CREATE TABLE oranges
(
id integer primary key autoincrement,
name text,
description text
)

You can also project individual columns:
$ sqlzero superheroes.db 'select * from sqlite_schema'
table|8|superheroes|2|CREATE TABLE "superheroes" (id integer primary key autoincrement, name text not null, eye_color text, hair_color text, appearance_count integer, first_appearance text, first_appearance_year text)
table|9|sqlite_sequence|3|CREATE TABLE sqlite_sequence(name,seq)
$ sqlzero superheroes.db 'select name, eye_color, first_appearance, first_appearance_year from superheroes'
Batman (Bruce Wayne)|Blue Eyes|1939, May|1939
Superman (Clark Kent)|Blue Eyes|1986, October|1986
Green Lantern (Hal Jordan)|Brown Eyes|1959, October|1959
James Gordon (New Earth)|Brown Eyes|1987, February|1987
Richard Grayson (New Earth)|Blue Eyes|1940, April|1940
Wonder Woman (Diana Prince)|Blue Eyes|1941, December|1941
Aquaman (Arthur Curry)|Blue Eyes|1941, November|1941
...
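A projection like this works by parsing the CREATE TABLE statement stored in sqlite_schema to map column names to their ordinal positions, and then picking those positions out of every decoded record. A minimal sketch under those assumptions (the helper below is illustrative, not the real sqlzero API):

// `columns` is the column order from the CREATE TABLE statement, `selected`
// the columns named in the SELECT, and `rows` the decoded records.
fn project<'a>(
    columns: &[&str],
    selected: &[&str],
    rows: &'a [Vec<String>],
) -> Vec<Vec<&'a str>> {
    let positions: Vec<usize> = selected
        .iter()
        .map(|name| columns.iter().position(|c| c == name).expect("unknown column"))
        .collect();
    rows.iter()
        .map(|row| positions.iter().map(|&i| row[i].as_str()).collect())
        .collect()
}

The selected values are then joined with | for output, mirroring the sqlite3 CLI's default list mode.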
You can also filter on a single text column (mind the single quotes '):
$ sqlzero superheroes.db "select name, eye_color, first_appearance, first_appearance_year from superheroes where first_appearance_year = '1986'"
Superman (Clark Kent)|Blue Eyes|1986, October|1986
Alexander Luthor (New Earth)|Green Eyes|1986, October|1986
James Olsen (New Earth)|Green Eyes|1986, October|1986
Theodore Kord (New Earth)|Blue Eyes|1986, February|1986
Michael Jon Carter (New Earth)|Blue Eyes|1986, February|1986
Perry White (New Earth)|Blue Eyes|1986, October|1986
Kilowog (New Earth)|Red Eyes|1986, June|1986
Martha Clark (New Earth)|Blue Eyes|1986, October|1986
Jonathan Kent (New Earth)|Blue Eyes|1986, October|1986
Amanda Waller (New Earth)|Brown Eyes|1986, November|1986
...

If the column you filter on has a corresponding index, then the index will be used instead of doing a full table scan (one is much faster than the other):
$ sqlzero companies.db '.indices'
idx_companies_country on companies(country)
$ time sqlzero companies.db "select id, name from companies where name = 'unilink s.c.'"
121311|unilink s.c.
sqlzero companies.db 0.05s user 0.02s system 98% cpu 0.068 total
$ time sqlzero companies.db "select id, name from companies where country = 'eritrea'"
121311|unilink s.c.
2102438|orange asmara it solutions
6634629|asmara rental
5729848|zara mining share company
sqlzero companies.db 0.00s user 0.00s system 89% cpu 0.002 total
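When an index is available, the matching rowids come out of the much smaller, sorted index b-tree, and each one is then looked up directly in the table b-tree instead of scanning and decoding every row. The lookup itself descends the table b-tree like a search tree: interior table pages store cells of (left child page, key), where the key is an upper bound on the rowids in that child, so each step discards all the other subtrees. A sketch of that descent, reusing read_varint from the record-format sketch above (not the actual sqlzero code):

// Descend from the table's root page to the leaf page that would contain
// the given rowid.
fn find_leaf_for_rowid(db: &[u8], page_size: usize, mut page_no: usize, rowid: u64) -> usize {
    loop {
        let base = (page_no - 1) * page_size;
        let hdr = base + if page_no == 1 { 100 } else { 0 };
        if db[hdr] == 0x0d {
            return page_no; // table leaf: the row, if present, lives here
        }
        let n_cells = u16::from_be_bytes([db[hdr + 3], db[hdr + 4]]) as usize;
        // Default to the right-most child pointer at offset 8 of the header.
        let mut next = u32::from_be_bytes([db[hdr + 8], db[hdr + 9], db[hdr + 10], db[hdr + 11]]) as usize;
        for i in 0..n_cells {
            let ptr = hdr + 12 + 2 * i;
            let off = base + u16::from_be_bytes([db[ptr], db[ptr + 1]]) as usize;
            let child = u32::from_be_bytes([db[off], db[off + 1], db[off + 2], db[off + 3]]) as usize;
            let (key, _) = read_varint(&db[off + 4..]);
            // Rowids in `child` are <= key, so descend left at the first cell
            // whose key is >= the rowid we are looking for.
            if rowid <= key {
                next = child;
                break;
            }
        }
        page_no = next;
    }
}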
Lastly, a weird little flex: while this re-implementation is not feature complete, I have jerry-rigged the SQL parser and database tuple extraction code just enough to support the schema used by Anki Desktop:

$ sqlzero collection.anki2 'select * from sqlite_schema'
table|1|col|2|CREATE TABLE col (
id integer primary key,
crt integer not null,
mod integer not null,
scm integer not null,
ver integer not null,
dty integer not null,
usn integer not null,
ls integer not null,
conf text not null,
models text not null,
decks text not null,
dconf text not null,
tags text not null
)
table|2|notes|3|CREATE TABLE notes (
id integer primary key, /* 0 */
guid text not null, /* 1 */
mid integer not null, /* 2 */
mod integer not null, /* 3 */
usn integer not null, /* 4 */
tags text not null, /* 5 */
flds text not null, /* 6 */
sfld integer not null, /* 7 */
csum integer not null, /* 8 */
flags integer not null, /* 9 */
data text not null /* 10 */
)
table|3|cards|4|CREATE TABLE cards (
id integer primary key, /* 0 */
nid integer not null, /* 1 */
did integer not null, /* 2 */
ord integer not null, /* 3 */
mod integer not null, /* 4 */
usn integer not null, /* 5 */
type integer not null, /* 6 */
queue integer not null, /* 7 */
due integer not null, /* 8 */
ivl integer not null, /* 9 */
factor integer not null, /* 10 */
reps integer not null, /* 11 */
lapses integer not null, /* 12 */
left integer not null, /* 13 */
odue integer not null, /* 14 */
odid integer not null, /* 15 */
flags integer not null, /* 16 */
data text not null /* 17 */
)
table|4|revlog|5|CREATE TABLE revlog (
id integer primary key,
cid integer not null,
usn integer not null,
ease integer not null,
ivl integer not null,
lastIvl integer not null,
factor integer not null,
time integer not null,
type integer not null
)
table|5|sqlite_stat1|6|CREATE TABLE sqlite_stat1(tbl,idx,stat)
table|6|sqlite_stat4|7|CREATE TABLE sqlite_stat4(tbl,idx,neq,nlt,ndlt,sample)
table|7|android_metadata|8|CREATE TABLE android_metadata (locale TEXT)
table|8|deck_config|9|CREATE TABLE deck_config (
id integer PRIMARY KEY NOT NULL,
name text NOT NULL COLLATE unicase,
mtime_secs integer NOT NULL,
usn integer NOT NULL,
config blob NOT NULL
)
table|9|config|10|CREATE TABLE config (
KEY text NOT NULL PRIMARY KEY,
usn integer NOT NULL,
mtime_secs integer NOT NULL,
val blob NOT NULL
) without rowid
table|10|fields|11|CREATE TABLE fields (
ntid integer NOT NULL,
ord integer NOT NULL,
name text NOT NULL COLLATE unicase,
config blob NOT NULL,
PRIMARY KEY (ntid, ord)
) without rowid
table|11|templates|12|CREATE TABLE templates (
ntid integer NOT NULL,
ord integer NOT NULL,
name text NOT NULL COLLATE unicase,
mtime_secs integer NOT NULL,
usn integer NOT NULL,
config blob NOT NULL,
PRIMARY KEY (ntid, ord)
) without rowid
table|12|notetypes|13|CREATE TABLE notetypes (
id integer NOT NULL PRIMARY KEY,
name text NOT NULL COLLATE unicase,
mtime_secs integer NOT NULL,
usn integer NOT NULL,
config blob NOT NULL
)
table|13|decks|14|CREATE TABLE decks (
id integer PRIMARY KEY NOT NULL,
name text NOT NULL COLLATE unicase,
mtime_secs integer NOT NULL,
usn integer NOT NULL,
common blob NOT NULL,
kind blob NOT NULL
)
table|14|tags|16|CREATE TABLE tags (
tag text NOT NULL PRIMARY KEY COLLATE unicase,
usn integer NOT NULL,
collapsed boolean NOT NULL,
config blob NULL
) without rowid
table|15|graves|18|CREATE TABLE graves (
oid integer NOT NULL,
type integer NOT NULL,
usn integer NOT NULL,
PRIMARY KEY (oid, type)
) WITHOUT ROWID
index|16|notes|19|CREATE INDEX ix_notes_usn on notes (usn)
index|17|cards|20|CREATE INDEX ix_cards_usn on cards (usn)
index|18|revlog|21|CREATE INDEX ix_revlog_usn on revlog (usn)
index|19|cards|22|CREATE INDEX ix_cards_nid on cards (nid)
index|20|cards|23|CREATE INDEX ix_cards_sched on cards (did, queue, due)
index|21|revlog|24|CREATE INDEX ix_revlog_cid on revlog (cid)
index|22|notes|25|CREATE INDEX ix_notes_csum on notes (csum)
index|23|fields|26|CREATE UNIQUE INDEX idx_fields_name_ntid ON fields (name, ntid)
index|24|templates|27|CREATE UNIQUE INDEX idx_templates_name_ntid ON templates (name, ntid)
index|25|templates|28|CREATE INDEX idx_templates_usn ON templates (usn)
index|26|notetypes|29|CREATE UNIQUE INDEX idx_notetypes_name ON notetypes (name)
index|27|notetypes|30|CREATE INDEX idx_notetypes_usn ON notetypes (usn)
index|28|decks|31|CREATE UNIQUE INDEX idx_decks_name ON decks (name)
index|29|notes|32|CREATE INDEX idx_notes_mid ON notes (mid)
index|30|cards|33|CREATE INDEX idx_cards_odid ON cards (odid)
WHERE odid != 0
index|31|graves|34|CREATE INDEX idx_graves_pending ON graves (usn)
$ sqlzero collection.anki2 'select count(*) from revlog'
613165
$ sqlzero collection.anki2 "SELECT * FROM notes where guid = 'mFHgaIL9ZR'"
1593961343452|mFHgaIL9ZR|1586604965499|1694931669|18732||Why can there be unused direct dependencies in a binary/library?It is related to the options used by the linker when building the binary. Although the program does not need an unused library, it was still linked and labelled as <code>NEEDED</code> in the information about the object file|Why can there be unused direct dependencies in a binary/library?|1375559563|0|

The collection used in the example was ~60MB in size and had tuples with enough data to overflow to multiple pages. It's not the most complex case, but it's one I cared particularly about, since I'm an avid user of Anki, and my tool is compatible with a real-world database I use every day :) (somewhat, at least).
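Those overflowing tuples are the interesting part: when a row's payload doesn't fit in its leaf cell, only a prefix is kept in the cell, followed by the page number of the first overflow page, and the rest of the payload is spread over a chain of overflow pages. A sketch of following that chain (assuming the whole file is in memory; not the actual sqlzero code):

// Each overflow page starts with a big-endian u32 pointing at the next page
// in the chain (0 marks the end); the rest of the page is payload.
fn read_overflow_chain(db: &[u8], page_size: usize, mut page_no: u32, mut remaining: usize) -> Vec<u8> {
    let mut payload = Vec::with_capacity(remaining);
    while page_no != 0 && remaining > 0 {
        let base = (page_no as usize - 1) * page_size;
        page_no = u32::from_be_bytes([db[base], db[base + 1], db[base + 2], db[base + 3]]);
        let take = remaining.min(page_size - 4);
        payload.extend_from_slice(&db[base + 4..base + 4 + take]);
        remaining -= take;
    }
    payload
}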