Writing A Simple Database: Part 1


As part of learning by doing, I am trying to implement a simple database in Go. I got the idea from Nikhil’s blog about implementing a DB in Rust. The basic design is based on the paper “A Simple and Efficient Implementation for Small Databases”, which describes the theoretical details of implementing a simple but fully functional DB.

Implementing a database as a side project looked fascinating to me because, until now, I have mostly worked with web technologies. Writing a DB will also expose me to a lot of new systems concepts and the logic behind writing a reliable (hopefully distributed) system. It’s also a good fit for a statically typed language, Go in this case. By the end of this “Writing a Simple Database” series we will have a working key-value database with persistence and fault tolerance.

In the past, I have casually programmed in Go and I quite like it. But I forget the syntax over time as I don’t get to use it much. Working on a side project will help me learn it with a purpose this time. What I like about Go is that it’s readable, compiled, and the language behind a lot of reliable distributed systems like Kubernetes, etcd, and Docker.

High-Level Overview

Functional requirements

  • The DB will have a client and server architecture
  • It will store key-value data
  • The DB will be completely in-memory
  • It will be backed by a write-ahead log (WAL)
  • Fault tolerance and restore after a failure using the WAL
  • The latest snapshot of the in-memory DB will be backed up to the file system from time to time
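
To make the snapshot requirement concrete, here is a minimal sketch of how the in-memory map could be written to disk and loaded back. It is only an illustration under my own assumptions, not MooDB's actual code: the names SaveSnapshot and LoadSnapshot, the JSON encoding, and the file name moodb.snapshot are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// SaveSnapshot writes data to a temporary file in the same directory and
// then renames it over path, so a crash mid-write does not leave a
// half-written snapshot at path. (Hypothetical helper for this sketch.)
func SaveSnapshot(path string, data map[string]string) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), "snapshot-*.tmp")
	if err != nil {
		return err
	}
	// Clean up the temp file on failure; after a successful rename this
	// call fails harmlessly because the temp name no longer exists.
	defer os.Remove(tmp.Name())

	if err := json.NewEncoder(tmp).Encode(data); err != nil {
		tmp.Close()
		return err
	}
	// Flush to stable storage before the rename makes the file visible.
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// LoadSnapshot reads a snapshot back into a map. A missing file is
// treated as an empty database (an assumption of this sketch).
func LoadSnapshot(path string) (map[string]string, error) {
	data := make(map[string]string)
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return data, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	if err := json.NewDecoder(f).Decode(&data); err != nil {
		return nil, err
	}
	return data, nil
}

func main() {
	path := "moodb.snapshot" // hypothetical file name
	if err := SaveSnapshot(path, map[string]string{"name": "moo"}); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	data, err := LoadSnapshot(path)
	fmt.Println(data, err)
}
```

Writing to a temporary file and renaming it is a common way to avoid exposing a partially written file; whether MooDB ends up doing it this way is still open.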

Non-functional requirements

  • Should be fast (somewhat comparable to key-value stores like etcd/RocksDB)
  • Should be highly reliable
  • Should always follow ACID properties

Extension

  • Distributed database
  • If DB is too big to fit in memory, overflow to disk
  • Log compaction

Some of the requirements are likely to change as I start implementing the individual components.

Progress

Source code: https://github.com/amitt001/moodb

DB name: “MooDB” or “Mdb”. I got the name from the “moo point” scene in Friends :) and it sounded like a funny but mildly apt name for a DB that won’t be used in production.

I have already started the implementation. So far I have a working command-line interface with an in-memory key-value store. It supports four operations: GET, SET, UPDATE, and DELETE.
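
As a rough illustration of what such a store can look like, here is a minimal sketch of an in-memory key-value store with the four operations. It is not the actual MooDB code: the Store type, the method names, and ErrKeyNotFound are hypothetical, and I am assuming that SET creates or overwrites a key while UPDATE and DELETE fail on a missing key.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrKeyNotFound is a hypothetical error for this sketch.
var ErrKeyNotFound = errors.New("key not found")

// Store is a minimal in-memory key-value store guarded by a mutex so
// it stays safe if several goroutines use it at once.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{data: make(map[string]string)}
}

// Get returns the value for key, or ErrKeyNotFound.
func (s *Store) Get(key string) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	if !ok {
		return "", ErrKeyNotFound
	}
	return v, nil
}

// Set creates the key or overwrites its value (assumed semantics).
func (s *Store) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// Update changes the value only if the key already exists (assumed semantics).
func (s *Store) Update(key, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.data[key]; !ok {
		return ErrKeyNotFound
	}
	s.data[key] = value
	return nil
}

// Delete removes the key, or returns ErrKeyNotFound if it is absent.
func (s *Store) Delete(key string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.data[key]; !ok {
		return ErrKeyNotFound
	}
	delete(s.data, key)
	return nil
}

func main() {
	s := NewStore()
	s.Set("name", "moo")
	v, err := s.Get("name")
	fmt.Println(v, err)
	fmt.Println(s.Update("missing", "x"))
	fmt.Println(s.Delete("name"))
}
```

The mutex is not strictly needed while a single CLI drives the store, but it costs little and becomes necessary once several clients talk to one datastore.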


MooDB CLI with commands

Currently, the client and server run in the same process. In its current state it behaves like an embedded DB library (like SQLite), and each run starts a new in-memory DB instance.

What’s next?

Next, I plan to implement the following features:

  • Separate client and server: to support multiple clients sharing one datastore
  • Write-ahead log: to make the DB fault-tolerant against power loss or crashes
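
To give a feel for the write-ahead log idea, here is a minimal sketch: every mutation is appended to a file and synced to disk before it is applied in memory, and on startup the log is replayed to rebuild the map. This is an assumption-laden illustration, not the design MooDB will necessarily use: the Record format (one JSON object per line), the names OpenWAL, Append, and Replay, and the file name moodb.wal are all hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// Record is one logged mutation (hypothetical format for this sketch).
type Record struct {
	Op    string `json:"op"` // "SET" or "DELETE"
	Key   string `json:"key"`
	Value string `json:"value,omitempty"`
}

// WAL is an append-only log file.
type WAL struct {
	f *os.File
}

// OpenWAL opens (or creates) the log file in append mode.
func OpenWAL(path string) (*WAL, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &WAL{f: f}, nil
}

// Append writes one record as a JSON line and syncs it to disk, so the
// record survives a crash that happens right after Append returns.
func (w *WAL) Append(r Record) error {
	line, err := json.Marshal(r)
	if err != nil {
		return err
	}
	if _, err := w.f.Write(append(line, '\n')); err != nil {
		return err
	}
	return w.f.Sync()
}

// Close closes the underlying file.
func (w *WAL) Close() error { return w.f.Close() }

// Replay rebuilds the in-memory map by applying every record in order.
// A record that fails to parse is treated as a torn write at the end of
// the log, and replay stops there (a simplifying assumption; it would
// also silently stop at corruption in the middle of the log). Note that
// bufio.Scanner rejects lines longer than 64 KiB by default.
func Replay(path string) (map[string]string, error) {
	data := make(map[string]string)
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return data, nil // no log yet: empty database
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		var r Record
		if err := json.Unmarshal(sc.Bytes(), &r); err != nil {
			break
		}
		switch r.Op {
		case "SET":
			data[r.Key] = r.Value
		case "DELETE":
			delete(data, r.Key)
		}
	}
	return data, sc.Err()
}

func main() {
	path := "moodb.wal" // hypothetical file name
	w, err := OpenWAL(path)
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	// Log first, then apply to the in-memory store (omitted here).
	if err := w.Append(Record{Op: "SET", Key: "name", Value: "moo"}); err != nil {
		fmt.Println("append failed:", err)
	}
	w.Close()

	data, err := Replay(path)
	fmt.Println(data, err)
}
```

Syncing on every append is the simplest way to get durability but also the slowest; batching several records per sync is a common trade-off that I may look at later.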

I have decided to blog as I go to log my learnings. I am enjoying writing the code and learning a lot of new things. For implementing each part I am learning both the language and the underlying concept behind the DB component. Fun!
