Considerations when building embedded databases (memfault.com)
68 points by tzmlab on Aug 2, 2024 | hide | past | favorite | 17 comments


Since it’s not clear from the title: embedded here does not mean sqlite-inside-your-app but small devices with microcontrollers


That's what I initially thought too, though having read and understood little of the article, is sqlite not a consideration too? Or does it take up too much memory?


There's a pretty good write up on running sqlite on an STM32 here: https://github.com/luismeruje/SQLite-STM32

Keeping in mind that STM32 is on the high end of what you might call embedded, and the writeup is pretty clear that it barely fits/works there, lots of caveats, etc.


Another thing to consider is that sqlite dynamically allocates memory. You typically want to avoid that in embedded. Everything is slow, memory is scarce. The sheer code size is also an issue.


Sqlite has a mode where you can give it an area of memory and it will never use any other memory.


It's still going to dynamically allocate its memory in there, isn't it? This is a slow operation.


Memory allocation isn’t that slow (in fact all the RAM is SRAM, which is typically quite fast); it’s just that you only have 256k-1M of RAM in total. This means that any time you save later by filling space now ends up getting wasted when that memory needs to be reclaimed.


SQLite would require a file system. Embedded systems typically do not have this.


SQLite doesn’t require a file system per se, as you can create an in-memory db and then use vfs to load/write the actual data from/to your IO layer.


Though note you can define your own "Virtual File System" (VFS) for SQLite to use[1]. It might be a bit of an undertaking, but I think you could use that to run on just about anything with storage you control. (I haven't done it myself, just some research back in the day)

[1] https://www.sqlite.org/c3ref/vfs_find.html


> However, if you compile on an AWS EC2 instance (as you may want to do if there is a cloud component to your fishtank), then you get a 16-byte structure because int there is 8 bytes

I'm not familiar with any OS that EC2 would offer that uses an 8-byte int. A better example might be that long is the size of a pointer on Unix and microcontroller systems, but always 4 bytes on Windows.
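
For what it's worth, the usual way to sidestep the int/long size question is to use the fixed-width types from <stdint.h> and check the layout at compile time. A minimal sketch, assuming a hypothetical fishtank_record struct (the name and fields are made up for illustration, not taken from the article):

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* fixed-width integer types */

/* Hypothetical record layout; the field names are illustrative only. */
struct fishtank_record {
    uint32_t timestamp;    /* always 4 bytes, unlike long */
    int32_t  temperature;  /* always 4 bytes, whatever int happens to be */
    uint16_t ph_x100;
    uint16_t flags;
};

/* Fail the build if any compiler lays the record out differently. */
static_assert(sizeof(struct fishtank_record) == 12, "unexpected record size");
static_assert(offsetof(struct fishtank_record, ph_x100) == 8, "unexpected field offset");
```

If the same header is compiled for the device and for the cloud side, a layout mismatch then shows up as a build error rather than as corrupted records.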


there are several dubious assertions like this in here. it says a lot of things about flash in general that are only true of NOR flash but false of NAND, for example


> What we learned is that, unless you need to support some very exotic big-endian architecture, you’re better off ignoring endianness and just letting all your structs be little-endian. This greatly simplifies maintainability, reduces confusion for developers unfamiliar with byte-order, saves CPU time, and allows working with constant record data without RAM copies to reverse the byte order before write to flash.

I was hoping to read this. Thank goodness. Although I wonder what embedded systems dev isn’t familiar with endianness? The overlap between network programmers and ES must be close to a circle, for one. Regardless, “just keep it LE” is a good move.
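
If you do end up needing the same records on a big-endian host someday, one option is to keep the stored format little-endian and go through byte-wise helpers. This is a sketch of that alternative, not what the article does (the article just uses the structs directly); put_le32 and get_le32 are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v as 4 little-endian bytes, independent of host byte order. */
static void put_le32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Read 4 little-endian bytes back into a host-order value. */
static uint32_t get_le32(const uint8_t *in)
{
    assert(in != NULL);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

The shifts describe the byte layout explicitly, so the stored bytes are the same on any host.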


The chosen solution isn’t ideal for all cases, especially those with many or large records. It doesn’t require parsing all the bytes of a payload but it does require reading them (because it’s a sequential scan of all content).

A better solution would separate the keys from the values (à la MFT) because reading a page of flash is going to be the slowest step. If you only have to sequentially scan the header table, you read at least an order of magnitude less data to find the record you are searching for.
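
To make the idea concrete, here is a rough sketch of a separate header table, assuming fixed-size entries that point into a value region elsewhere in flash. The rec_header layout and find_record are hypothetical names for illustration, not the article's format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size header entry, stored apart from the payloads. */
struct rec_header {
    uint16_t key;
    uint16_t length;  /* payload length in bytes */
    uint32_t offset;  /* payload offset within the value region */
};

/* Linear scan over the headers only; payload bytes are never read.
 * Returns the matching header, or NULL if the key is absent. */
static const struct rec_header *find_record(const struct rec_header *table,
                                            size_t count, uint16_t key)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].key == key) {
            return &table[i];
        }
    }
    return NULL;
}
```

With 8-byte headers and payloads of, say, 64 bytes or more, the scan touches a small fraction of the flash pages that a scan over interleaved records would.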


A columnar database format might perform well out of the box in that case.

Reading more, it seems like a columnar database built on LSM trees would probably work great, since you could easily move the log around in flash to distribute write wear, and only occasionally compact LSM tree files.


It would, but are there any small and simple enough to use in a standalone (no-OS, non-POSIX) environment?


I found https://github.com/embassy-rs/ekv after some searching, but who knows how battle tested it is.



