SQLite may become foundational for digital progress (venturebeat.com)
165 points by alexrustic on May 22, 2022 | hide | past | favorite | 120 comments


I've yet to see how you can use SQLite effectively for a multi-user, multi-write app. Unless we're all going back to single-tenant, single-user applications, SQLite seems overhyped for the new use cases.

Don't get me wrong, there are plenty of uses for SQLite, but the hype is getting out of hand IMHO.

If you are making an app that only a single person is going to use at a time, then there are plenty of options, including SQLite. Heck, IndexedDB is sufficient. CouchDB tried the whole DB-per-user thing and it didn't end super well.

Happy to be proven wrong though. If anyone has an example of a site with more than say, 10K concurrent writers (edit: changed from users) running on a single SQLite DB I'd probably change my mind.

---

Now, what would be interesting is a way to architect an app such that you can have a central entity, like this site, for instance, but all of your posts are actually referring to your own SQLite, or equivalent store. In that sense you can own your data completely. For performance you can specify a TTL for your data and the consumer (this site) could cache* it accordingly.

Though this would probably end up being a lot less performant than simply using something like Postgres, but at least you'd have more control over your own data.

* - This caching ideally would be enforced by the browser, and not the server, that way the server actually never touches or knows what data you have.


In my experience most SQLite writes take less than 1ms. So if your app is handling less than a thousand writes per second you will probably be OK!

A common pattern (which I've implemented myself in Datasette) is to put your writes in an in-memory queue and apply them using a single dedicated connection.
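A minimal sketch of that pattern using only the stdlib: handlers enqueue writes, and one dedicated thread owns the only write connection (table name and schema here are made up for illustration, not taken from Datasette):

```python
import queue
import sqlite3
import threading

write_queue = queue.Queue()

def writer_loop(db_path):
    # The single dedicated connection lives entirely in this thread.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (body TEXT)")
    while True:
        item = write_queue.get()
        if item is None:          # sentinel: shut down cleanly
            break
        conn.execute("INSERT INTO events (body) VALUES (?)", (item,))
        conn.commit()
    conn.close()

t = threading.Thread(target=writer_loop, args=("app.db",))
t.start()

# Any request handler can enqueue a write without touching SQLite itself.
write_queue.put("hello")
write_queue.put(None)             # stop the writer
t.join()
```

Because only one connection ever writes, there's no lock contention between writers; the queue absorbs bursts instead.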


If you open multiple connections to an sqlite db from a single app you are a bad developer, full stop.

We have a server app on a single beefy server servicing ~100,000 simultaneous users, with an average of 0.2 user actions per second each (~20k actions per second total), requiring some ~60k SQLite writes per second.

A single thread handles that via message passing, with little load. We estimate we can handle 5-10 times as much without changing anything. That's with SSDs and standard OS caching. With either an in-memory db or using zstd compressed ramdisks the limits are ridiculous.


This is pretty much how we've been running in production with several different banks over the last half decade.

1 connection per database is critical. WAL is important. Not double-locking helps (most builds of SQLite serialize all writes internally).

You do all of these things correctly at the same time, then you can easily handle tens of thousands of transactions per second.
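A hedged sketch of what that setup might look like with Python's stdlib driver (the file name is illustrative; the pragmas shown are standard SQLite, but the exact `synchronous` level is a judgment call per workload):

```python
import sqlite3

# One long-lived connection per database, configured once at startup.
conn = sqlite3.connect("bank.db", check_same_thread=False)
conn.execute("PRAGMA journal_mode=WAL")      # readers never block the single writer
conn.execute("PRAGMA synchronous=NORMAL")    # fsync at checkpoints, not every commit
conn.execute("PRAGMA busy_timeout=5000")     # wait on contention instead of erroring
# "Not double-locking": default SQLite builds already serialize access
# internally, so don't also wrap every call in your own mutex.
```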

Additionally, we do a per-session database concept where data is scoped as specifically as possible. Using just one SQLite database for the whole application would be a mistake, IMO.

Synchronous replication is our next step, but we might build something in-house for this.


Are you aware of litestream? It's become quite useful in the last year or two, and recently provides synchronous replication. Worth checking it out (as well as rqlite and dqlite and bedrock) before starting your own project.


Hi, thanks for sharing your experiences. What's "Not double-locking"? Thanks.


With either an in-memory db or using zstd compressed ramdisks the limits are ridiculous.

Who needs durability anyway, right?


There are workloads where some data loss is acceptable, like analytics.

But more seriously if you want the best of both worlds (in-memory speeds and durability) there are NVDIMMs.

SQL Server can take advantage of them: https://docs.microsoft.com/en-us/sql/relational-databases/pe...

I imagine with some effort SQLite could as well.


The problem is (I've seen it) staff turning off or disabling WAL[1] because it makes things faster, or as an unintended consequence of doing something else. Great if 'some data loss' being possible is understood in advance. Often, it isn't.

[1] example: not quite WAL, but disabling full logging (so it was then on simple logging) on MSSQL as part of a log shrink process. On a client site. Who did 3-hourly log backups.


Litestream can ensure a pretty decent level of durability, for a ramdisk anyway.

Suitability depends on what the requirements are.


Do you happen to have a blog post laying out the details? Would love to learn more about what you're doing!


By building a client/server architecture on top of SQLite, we can get almost 100K writes per second, including the object-to-JSON and JSON-to-SQL parsing process:

https://blog.synopse.info/?post/2022/02/15/mORMot-2-ORM-Perf...


Can you elaborate on why you think it's bad to open multiple connections? How would you allow read access while a writer is active without multiple connections?


> If you open multiple connections to an sqlite db from a single app you are a bad developer, full stop.

Even if you are taking advantage of the built in WAL feature?


Quick queuing theory recap: independent requests arriving at ~1000/s and each taking ~1 ms will, in the presence of any variability at all, be guaranteed to lead to an infinite backlog of unserved requests.


I’m not an expert on queueing theory but this little “tidbit” has always bothered me. As far as I can tell, it is true iff your variability causes the average arrival rate to go up, which negates the entire premise of “requests arriving at 1000/s”.

If you’re averaging 1000/s and you get 1010 req in one sec, yes this leads to an infinite queue - but your average is now above 1000/s! Then if a few seconds later you get 990 requests, the queue catches up (because your avg is back to 1000)

IOW - it assumes only positive variability and no negative variability which does not mimic real world systems in my experience.

It’s an academic exercise that’s interesting to think about, but has no practical application. Else we would see infinite queue backups every time a system received a burst of traffic beyond its maximum throughput in any given time period.

What am I missing?


Just code up a 5-line simulation and see what happens?

You do actually expect the server to eventually recover from all load spikes, but that dead time means cumulatively you'll typically have queued more requests than you've handled at long enough time scales. On average that surplus is infinite, despite it sometimes being as low as 0.

It's not just academic, but it's not surprising that doesn't echo your real-world experience either. We don't usually run systems anywhere near capacity, and even when you do you'll still expect the queue to be able to handle the load eventually.
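Here's one way the suggested "5-line simulation" might look (parameters are illustrative): jobs arrive at an average of 1000/s with some jitter, and the server completes at most 1000 jobs/s. The backlog behaves like a random walk reflected at zero, so it tends to drift upward over time rather than hover near zero.

```python
import random

random.seed(0)

def backlog_after(seconds, rate=1000, jitter=30):
    backlog = 0
    for _ in range(seconds):
        # Arrivals average `rate` per second, with some variability.
        arrivals = max(0, round(random.gauss(rate, jitter)))
        # The server clears at most `rate` jobs per second.
        backlog = max(0, backlog + arrivals - rate)
    return backlog

print([backlog_after(1000) for _ in range(5)])  # typically well above zero
```

The backlog does dip back to zero occasionally, but its expected value keeps growing with elapsed time, which is the surprising part of the queueing result.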


I tried sqlite once and had a lot of errors with just 2 writers. Instantly no errors after switching to postgres.


Bloomberg [1] uses SQLite in a custom distributed RDBMS engine of theirs; I lack the knowledge to completely understand its design.

I remember Richard Hipp mentioning Bloomberg in one of his interviews (don't ask me which one, I don't remember), saying that they use SQLite to serve billions of bank transactions on a daily basis without a problem.

[1] https://github.com/bloomberg/comdb2

Update: I have found the following paper that describes the engine in greater detail: http://www.vldb.org/pvldb/vol9/p1377-scotti.pdf


SQLite has been used on top of a lot of things, including FoundationDB (https://apple.github.io/foundationdb/architecture.html).

That's different than using the DB directly though.


That is what "foundational" means.


After reading the paper, I see that it is not built on top of SQLite, but shares some components with SQLite.


It's like asking whether it's possible to write an OS kernel module in Python.

No, not really possible, and attempts to do so would be misguided. But it does not prevent Python from being highly successful in other areas.

Same with SQLite: it's a fine tool for a large spectrum of applications, but high-throughput OLTP is not it; SQLite can't reasonably replace Cassandra or even Postgres.

The point of much of the "SQLite hype" is that there are many applications where read load is high and queries complicated, but the write load is low to non-existent. In situations like these, you can make do with a smaller DB than Postgres.


> The point of much of the "SQLite hype" is that there are many applications where read load is high and queries complicated, but the write load is low to non-existent. In situations like these, you can make do with a smaller DB than Postgres.

Sure, but my point is that unless you know for sure your app will stay the same forever, why not just use Postgres or something more flexible to begin with? With things like Supabase, Hasura, and Managed Postgres why even bother with SQLite? Just seems like you're causing your tech team inevitable pain when you have more writes than expected and have to migrate off of it.

Both Fly.io and Cloudflare's solutions are a good example of this. If you're going to use another service, why even bother with SQLite? I totally get the SQLite use case when you're using it directly like with X-Range-Requests (https://phiresky.github.io/blog/2021/hosting-sqlite-database...) or SQLite.js and a WASM app (https://blog.ouseful.info/2022/02/11/sql-databases-in-the-br...), but idk.

tldr: if you're using a service, why bother with one backed by SQLite instead of Postgres? The underlying data store will be abstracted away anyways. One scales decently, and the other, not so much.


Why don't you solve a bunch of problems you don't have with tools you don't need? You might want them in the future. That's good engineering.


You say that on one of the top over engineering forums in the world. People on HN are running their startups like they will soon be larger than Google or Meta, building their architecture and software like that; not like you suggest.


These are services that replicate flat files and key-value stores. SQLite seems like a step up, particularly if it's similarly priced to flat file replication.


10K users isn’t much, but 10k concurrent writers is a different proposition - I’d be surprised if 100k users had 10k concurrent writers.


That's a fair point. AFAIK SQLite uses a queue (in WAL mode) in order to handle concurrent writes. I imagine it basically couldn't handle 10K concurrent writes to begin with in a practical application since newly read data would be out of date pretty quickly.


To be explicit, SQLite doesn't support concurrent writers. In WAL mode it can handle a single writer with concurrent readers. In "vanilla" mode, a writer requires exclusive access to the database.


I’m not sure as I’ve never used it but if the writes were fast enough perhaps it wouldn’t be a problem


idk, the authors themselves don't recommend it:

>If many threads and/or processes need to write the database at the same instant (and they cannot queue up and take turns) then it is best to select a database engine that supports that capability, which always means a client/server database engine.

https://www.sqlite.org/whentouse.html


That reads to me like a somewhat-sarcastic way to make fun of people who think SQLite can't handle their write load.

Not a lot of applications have many processes that need to write at the same instant in time and cannot be put in a queue instead.


Why build a queue before you really need it when you can just use postgres?


I have a controversial opinion.

We got to the canonical "LAMP" stack incrementally, shaped by the computing environment at the time. The servers that you had available to your web app 15 years ago were maybe 2 cores running at 1.5GHz and you had a couple gigs of RAM, and you wrote the app in a very slow dynamic language. So in order to scale out to the Internet, you had to split your application into replicas that ran on different servers; the traffic you were getting and the architecture of your application meant that you needed more than 1 CPU second per second, and that was the only way to get that at the time. All of these replicas needed some coordination, so you had a database server to handle that coordination, and it ran on the biggest computer possible because it was the single point of failure. (And when one computer wasn't big enough, it was time for ugly hacks like sharding.)

Fast forward 15 years and the landscape has changed dramatically. You can get 448vCPU machines from AWS. You're writing your application that is compiled or JIT'd into native code. With the same hardware, your application of today might be 10-100x faster than your application 15 years ago on the same hardware. But the hardware is even faster, and computers are bigger (you can go get a 64 core processor off the shelf at an electronics store!). So one node might now be 1000-10000x more powerful than where this "I must have a hundred replicas of my application" mindset came from.

With that in mind, if you start right now, you can build a web application that scales beyond your wildest dreams with a single computer. If you only have one computer, why not build the database into the application? You will have made so much money by the time the 448 core computer is inadequate for your application that it doesn't even matter. Hire 100 developers to move the thing to FoundationDB or Spanner or Postgres or whatever "real" database you think you need to scale to the next 10 billion users. (Interestingly, the top article on HN right now is "What if it changes?", a sarcastic reminder that maybe today you should build the system you need for today. If it sucks in a year, fix it in a year.)

I am throwing some pretty important things out the window. You probably update your application, so you want some rolling deployment that aborts if the new version fails to start up or whatever. A tornado blowing up your datacenter on your biggest use day of the year would be bad. Your users are all over the world, so you probably want to serve as much content as you can from a computer near them. But honestly, the state of the art for these concepts are pretty new. Globally consistent ("planet scale") transactions are hard, and not that fast. Automated canarying is also not trivial. So you might build a really complicated application to support those needs, and not even achieve them, because nobody has achieved perfection there yet.

Anyway, my controversial opinion is: don't build a distributed system unless you are absolutely sure you must have a distributed system. Every time you split up your state storage, your work as a programmer becomes harder. A single thread; no need for transactions, everything happens in order. A single computer with multiple threads; you'll need some locks, write barriers, or atomic instructions. Multiple computers with multiple threads? Now you're writing a computer science paper. That's a super fun activity, but check that discovering new classes of computer science problems is what your company's business is before you go out of your way to start doing it. You might be able to make a lot of money by being pretty boring. When you're relaxing on your private island, you might find a recreational activity even more fun than finding bugs in distributed transaction protocols. Who knows.

The currently-accepted simplification is to use a single database server to coordinate your stateless application replicas. That doesn't protect you from tornadoes, give users in Antarctica sub-millisecond page loads, or let you upgrade the database without downtime. If you're OK with that, it's totally reasonable to just build the database into your application as SQLite does. It's simpler, and no worse to an outside observer.

(That said, I tend to reach for Postgres first because it has a lot of polish that I don't think SQLite has. But I don't think anyone is unreasonable or stupid for picking SQLite. Especially if the SQLite instance streams its data to S3 and you have point-in-time recovery options for a disaster or bad release. You're going to want those with Postgres too, and they aren't enabled out of the box.)


> I have a controversial opinion.

How is this controversial. This should be mainstream thinking. Solve the problems that you have, when you're having them, instead of spending time preparing for a future that may never come.


Brilliant idea. Most startups or side projects won't reach the day when a single 448vCPU machine is not enough to handle their traffic.


"This caching ideally would be enforced by the browser, and not the server"

I don't think we'd need a UI served by that central server at that point. A sqlite file and some local program that communicates with other (de)centralized hosts for whatever services or data isn't available locally sounds like a neat departure from the approach the Solid project is taking with plug-able data backends.

Instead, whatever ecosystem springs up around really making use of sqlite at "the edge" (here I mean individuals' computing devices, not the fly.io and Cloudflare edge that TFA mostly references) could invert how web apps are delivered and consumed for the 1% of people who care about this kind of thing. By invert I just mean that the user hosts their own frontends and interfaces with DNS-named hosts / ports providing functionality, CPU, disk, network a la plan9.


> CouchDB tried the whole DB-per-user thing and it didn't end super well.

Can anyone give further context here? Haven't looked at CouchDB for several years, remember it as being good for certain use cases (intermittently offline devices etc).


CouchDB doesn't support document ACLs so inevitably you will have to use a DB per user and map reduce in order to implement anything that has more than one user.


Yeah, it struck me as a weird comment. As if there was some public case of a company doing this and then going under or something.

So I'm wondering as well who it didn't end too well for.


Concurrent writers isn't a thing people directly need. What they need is high write throughput. One way to improve write throughput is with concurrency, but the other way is by just being fast.

Making the database fast has the advantage of not requiring extra concurrency complications throughout the rest of your application.


If I understand your idea correctly, wouldn't that require connecting to the browser of every single person who has posted to load the content?


Yes - hypothetically you could allow for some notion of a remote TTL where the server could cache as opposed to the browser, but when you follow this to its logical conclusion you've just reinvented the sort of thing we already have.


My only experience with sqlite is having to spend time ripping it out and replacing with something that actually works.

Sqlite is almost never the right solution.


The answer is much simpler.

When people created desktop apps - they needed a datastore that was simple and reliable to use. SQLite was perfect for this.

Now no one is creating desktop apps anymore (everything is a web app), but people still need a simple & reliable datastore. Even though SQLite has historically advertised itself as not being for client/server use, people are finding that SQLite works great for most web apps as well.


I really miss desktop apps. They worked so well and didn't have half the problems that come with web apps.

I'm working on a project, for fun not for money, that uses Golang and talks to a SQLite database for things like accounts and user content. So far I really enjoy writing it a lot but I have to admit that it would probably have issues if it scaled to a large number of users.

In fairness to SQLite it was designed in a different era technologically. Maybe there will be a replacement for the web app era if we distill what we like about SQLite


"Native Mobile Apps" are the new "Desktop App". They can both be client apps to a remote server/datastore, or not. "Desktop Apps" didn't go away, the internet-connected apps just overtook them.


My company just finished a new desktop app and will continue to produce more.


Bless you. So sick of cloud everything.


I'm still bitter that we could've had SQLite in the browser as WebSQL but it got killed off in favor of IndexedDB which is terrible.

The code is in the public domain. The spec could just say to include SQLite.

IndexedDB implementations all use SQLite under the hood anyway, AFAIK.


Back then, many NoSQL/MongoDB opportunists were in positions of authority on the web. When WebSQL was discussed, those folks pushed the lie that "SQL is SQLite. Therefore, SQL does not belong in the browser". They pretended that SQLite and SQL APIs were the exact same thing, as if no browser could offer a dialect of SQL without fully exposing the database engine running under the hood. In their rhetoric, encapsulation was impossible; the specification was the implementation. With SQL and SQLite conflated, the opportunists were then able to wield the valid argument "SQLite is not a standard. Browsers should be based on standards" against the simple point "SQL ought to be a standard". Once they beat the more reasonable voices into submission, they were free to assert that only NoSQL could be standardized in the browser, which misdirected years of effort into IndexedDB.

Nowadays, if you look up those opportunists on LinkedIn, you can see their NoSQL careers didn't quite go the way they were expecting. Their wrongdoing is a very, very sore point for them, and it's unfortunate the web had to pay for their hubris.

For more details: https://nolanlawson.com/2014/04/26/web-sql-database-in-memor...


The conspiracy theorist in me thinks Mozilla has it in for Richard Hipp.

Wonder if there's some schadenfreude seeing Firefox fade rapidly into obscurity and SQLite continue to take over the world.


It is possible to run SQLite in WASM and use IndexedDB as a filesystem.


Yep, that'll make things better.


For reasons I won't go into here, I built a system with a similar approach 10 years ago. The system was horizontally scalable. There was no database tier; instead, each server had a local replica of the database which was used for reads. The servers discovered each other and nominated one server as the master, which writes were sent to. Replication was done by having the master send the DML queries to a writer process on each server. When a new server joined the cluster it was sent a copy of the entire database and a stream of queries to catch up on before it joined the cluster. There were other tricks to make sure reads from replicas waited until the replicas were sufficiently up to date.

It worked fine as the system was read heavy and write light. SQLite serialises writes so does not perform well with multiple writers, particularly if the write transactions are long running. Reads were blazingly fast as there was no round-trips across the network to a separate database tier. The plan for dealing with performance problems if/when they arrived was to shard the servers into groups of customers.

I moved on and the next developer ripped it out and replaced it with Postgres because it was such an oddball system. I came back six months later to fix the mess as the new developer messed up transactions with the new database code.

Technically, using SQLite with replication tacked on works fine. Superficially it is all the same because it is SQL. However, the performance characteristics are very different from a conventional multi-version concurrency control (MVCC) database such as Postgres.

This is where the problem lies with this kind of database - developers seeing SQL and assuming they can develop exactly the same way they would with other SQL databases. That said I love approaches that get away from the database architectures of last century.


As a Postgres enthusiast, I am really curious to know how the new developer messed up the transactions.

Could you please elaborate?


SQLite is your file format. If your use case is poorly dealt with by using files, it's poorly dealt with by using SQLite. But a _lot_ of things actually work fine with a file backing until they need to scale, and most projects that use SQLite never need that scale. But when they do, unlike home-rolled file formats, SQLite has "prepping your data for importing into a real data store" baked right in.
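One concrete form of that baked-in export path is the stdlib driver's `iterdump()`, which emits the whole database as portable SQL text (the table and data below are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
conn.commit()

# Dump the database as CREATE TABLE / INSERT statements that can be
# replayed against another SQL store (modulo minor dialect tweaks).
dump = "\n".join(conn.iterdump())
print(dump)
```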


I think the potential exists for many multi-tenancy B2B SaaS products to be converted to an SQLite-per-tenant model.

CRUD Database products like Jira, Salesforce, Airtable, Notion, Clickup, Asana, etc. seem like internal line of business services that just need to be thick clients with a distributed database. Most individual businesses/tenants won’t have to deal with more than 100 concurrent writes at a time (if that) so SQLite could definitely handle that.


How does one handle HA and failover with SQLite ? These products you mentioned are more than just simple CRUD and have full plugin engines and workflows.


I’d be curious how products like GitHub/BitBucket handle this with git repository hosting, because I’d imagine it would not be much different. Also if you took a truly local first approach, then fully distributing the database amongst peers and supporting peer to peer synchronization would be another way (which would work for use cases with smaller data storage requirements)


For most of those products, a commodity server is more reliable than the software itself. A straightforward hourly backup with manual recovery is adequate for most orgs.


Are there really questions about using SQLite?

> Another wrinkle is that it’s not exactly open source. The original developer of SQLite, D. Richard Hipp, placed it in the public domain. Generally, this means that there are no legal restrictions on using the code at all, although there are some questions whether every country recognizes this view.

A bigger issue than the license is that it’s not open for contributions. They don’t accept pull requests and the extensive test suite is proprietary.


> A bigger issue than the license is that it’s not open for contributions

SQLite is open for contributions, as long as you "submit an affidavit dedicating your contribution into the public domain" (https://www.sqlite.org/copyright.html). It's not "Open-Contribution" in the sense that some npm packages are for example, where if you send any PR, it will most likely be merged and then you get write access to the repository. If you're curious how you can contribute, check out this page: https://system.data.sqlite.org/index.html/doc/trunk/www/cont...

Personally, I prefer projects that are limited in what contributions they accept to the main trunk. Code quality tends to be a lot higher in those. As long as the source is open so I can freely maintain my own patches, I have no qualms with that workflow.


This is what I was referring to: “SQLite is open-source but it is not open-contribution. All the code in SQLite is written by a small team of experts. The project does not accept "pull requests" or patches from anonymous passers-by on the internet.”

https://www.sqlite.org/hirely.html


Yeah, that statement means you have to make yourself known before contributing, via the procedure I described earlier.

It doesn't mean they don't accept contributions at all, just that they are strict about who can contribute.


It's not that easy. Have you tried?

The main problem is that with your contribution you're making them do the work of writing tests for their proprietary 100% branch coverage suite to cover your contribution.

Spelling fixes and such are trivially accepted. Everything else is not.


Although that affidavit is still a problem for some. German and Australian citizens cannot revoke their human rights, and as such cannot put things into the public domain. Hence they can never contribute to SQLite.

This may be seen as advantageous however.


I'm sure there are other countries that have restrictions towards who can contribute to US-based projects as well. Unfortunate yes, but hardly the fault of SQLite.

I'm curious about the "citizens cannot revoke their human rights and as such cannot put things into public domain" part of your comment. How does putting something into public domain revoke anything regarding your human rights?


I am not a lawyer, but my understanding is that copyright is considered a human right, hence being non-revocable for citizens of Australia and Germany.

Hence you must explicitly licence to allow reuse.

It's why personally if I want something as free as possible I dual licence under Unlicense and MIT with the user able to take whichever suits them best.


In https://www.sqlite.org/copyright.html they specifically talk about how to use SQLite in a jurisdiction that does not recognize the public domain. And major projects like Android and Safari use SQLite globally, so clearly the legal risks are limited.

If there is any legal risk, Hwaci can avoid employing people from problematic jurisdictions (from their POV). I suspect there is a workaround anyway: Hwaci can require any German developers to assign copyright of code to Hwaci, and then Hwaci can put that code into the public domain.

Regardless, Hwaci mention that SQLite is “Open-Source, not Open-Contribution” and also say “the project does not accept patches. If you would like to suggest a change and you include a patch as a proof-of-concept, that would be great. However, please do not be offended if we rewrite your patch from scratch.”.


100% you can use it even in Germany. I was more pointing out that for some people contributing to public domain software or any work is problematic.


What's the human right angle on me assigning the copyright of my work to the public domain?


In another comment, but my understanding is that you cannot revoke your copyright over anything you create, but can make it free for use by others. Hence you must licence appropriately.


Does this mean they can't contribute to open-source MIT-equivalent licensed code bases? As that's basically giving that code to the public domain, is it not?


"Public Domain Is Not Open Source | Open Source Initiative" - https://opensource.org/node/878


That article's argument is that you can't be sure that you have rights to use public domain software because there are legal differences in how public domain works in different countries.

I do not find this argument very persuasive. Other aspects and of IP law and of licensing related law are different in different countries too.

If I find some software purportedly from a developer in Wheretheheckisthatistan that says it is under some OSI-approved license all that tells me is that the developer (probably) tried to license it under that license. But I have no idea if they actually succeeded in doing so--for instance maybe their employment agreement for their day job says it belongs to the employer and they are in a country where such agreements are enforceable.


No. Explicitly licensing things is fine as you still retain copyright, but are allowing others to use and modify.


This.

The public domain thing is a red herring not worthy of the attention it's gotten, but the infeasibility of making contributions is very much worth talking about.

Basically, Richard Hipp is an open source business genius. He made an exceedingly popular open source thing. Then he found a way to make it so no one would bother using any forks. Then he created a consortium that pays for development and upkeep.

The secret to that success is the proprietary test suite and their refusal to accept contributions. Anyone wanting to fork the code base will not be able to recreate that test suite, nor the upkeep, therefore they won't be anywhere near as trusted as the SQLite dev team.


> “I think the biggest complication for us is that there’s no tooling for it.” said Mackey. “We have people deploy apps. They’re like, ‘How do I connect to my database and like query things? How do I import data?’”

Wut?

Every scripting language includes a SQLite driver.

Python's SQLAlchemy wraps SQLite beautifully. You can develop locally and then deploy to an enterprise server with a simple change of connection string.
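For what it's worth, the swap really is just the engine URL. A minimal sketch (the `DATABASE_URL` env var is my own convention, not anything from the article):

```python
# Develop against SQLite locally; deploy by changing one connection string.
import os
from sqlalchemy import create_engine, text

# Local development: a file (or in-memory) SQLite database.
# Production: set DATABASE_URL to e.g. "postgresql://user:pass@host/dbname".
db_url = os.environ.get("DATABASE_URL", "sqlite:///:memory:")

engine = create_engine(db_url)
with engine.connect() as conn:
    result = conn.execute(text("SELECT 1 + 1")).scalar()
    print(result)  # 2
```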

I must be missing the speaker's point.


I think they mean GUI tooling for ad-hoc inspection of the database. And possibly the stumbling block is difficulty accessing it over a network?


I personally love DBeaver for SQL, but I did stumble across an unintended use case for Datasette [0]. Datasette is intended as a way to share static SQLite databases. But I was recently debugging a Django application and lacked appropriate tooling to inspect what was happening in the database. I'm a huge fan of Datasette, so why not? While the Django debug server was running, I was able to launch Datasette in a separate terminal and have a live view into the database while the application was free to mutate the underlying data. You can even write custom SQL in the browser to select or modify any interesting data.

[0]: https://datasette.io/



I prefer https://sqlitestudio.pl/ to be honest with you.


I've tried over 15 SQLite management tools, but SQLiteStudio is my favorite! And I recently discovered https://github.com/little-brother/sqlite-gui


This sounds like hype that is going to die at some point. SQLite has its uses in client devices and small webservers, but it is not built to scale to millions of transactions per second.

It is possible to build all kinds of cool things on top of SQLite, but most of these articles sound like it is going to replace PostgreSQL or MySQL.


I want to see a mountable file system contained in an SQLite file. It creates possibilities. [1] is Java (yikes) and has been inactive for six years. [2] is in Rust and has been inactive for three years, but does not seem to be available on cargo. [3] is in Go, lacks a LICENSE file, has been inactive for four months, and the README is in Russian, but the demo code works. pkg.go.dev will not display its API.

[1] https://github.com/andyhebear/sqlitefs [2] https://github.com/narumatt/sqlitefs [3] https://github.com/jilio/sqlitefs, https://github-com.translate.goog/jilio/sqlitefs?_x_tr_sl=ru...


Why would that be useful? (I'm genuinely curious)


For example, applications that use pre-prepared content, or that use a lot of relative links? Such as publishing, content management, or any number of web apps?


The SQLite authors say that for many scenarios plain SQLite beats filesystem traversal times.

So for some applications it might be faster to put "files" in a database (where primary keys are hierarchical full paths) as opposed to actually use the filesystem.
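A sketch of that "files as rows" layout (toy schema of my own; SQLite's official sqlar archive format does something similar):

```python
# Blobs keyed by their full path; a "directory listing" becomes a
# prefix query on the primary key, which SQLite serves from its index.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, data BLOB)")

db.execute("INSERT INTO files VALUES (?, ?)", ("/app/static/logo.svg", b"<svg/>"))
db.execute("INSERT INTO files VALUES (?, ?)", ("/app/static/css/site.css", b"body{}"))

rows = db.execute(
    "SELECT path FROM files WHERE path LIKE ? ORDER BY path",
    ("/app/static/%",),
).fetchall()
print([r[0] for r in rows])  # ['/app/static/css/site.css', '/app/static/logo.svg']
```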


This makes sense. I have experience with, and lots of appreciation for, SQLite. It's exemplary across many areas of software engineering: elegant simplicity, no bloat, high performance, broad API library support, and an amazing third-party ecosystem (rqlite [1], litestream [2]). It's wildly nice!

With that said, I've found that I'm a bit hesitant to use SQLite for anything interactive for one silly reason:

It's so hard to query precise date ranges in an ergonomic way interactively. There is no timestamp type, and the general advice I've seen is to use Unix epoch offsets. This works fine except for interactive exploration.

I keep finding myself wanting to port my SQLite databases to PostgreSQL just for the comparatively easy date-range exploration.

[1] https://github.com/rqlite/rqlite

[2] https://github.com/benbjohnson/litestream


SQLite supports computed columns, so you could just add a virtual text column next to the 'real' epoch column defined as the date representation of your choice.

https://www.sqlite.org/draft/gencol.html

Ignore the text column in your application so it won't be computed at runtime, but it will appear when you 'select *' or use a db browser in exploration mode.
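A sketch of that suggestion (generated columns need SQLite 3.31+; the table and column names here are made up):

```python
# A VIRTUAL generated column renders the epoch column as a readable
# timestamp for ad-hoc exploration, while the app keeps using the integer.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        ts INTEGER NOT NULL,  -- Unix epoch offset, what the app stores
        ts_text TEXT GENERATED ALWAYS AS (datetime(ts, 'unixepoch')) VIRTUAL
    )
""")
db.execute("INSERT INTO events (ts) VALUES (0)")

row = db.execute("SELECT ts, ts_text FROM events").fetchone()
print(row)  # (0, '1970-01-01 00:00:00')
```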


Can you elaborate on the problem there? I'm confused because as you described it, it seems just something like

select * from table where startDate >= 123 and endDate <= 456;

would solve the problem.


They have to convert every timestamp to a Unix epoch offset first.


“may become”? It already is the backbone of so much software.


We can get 929,281 inserts per second writing to an SQLite db: https://blog.synopse.info/?post/2022/02/15/mORMot-2-ORM-Perf...


May become? Sounds like the author missed that it's already used in lots of extremely important places. Among other use cases, Android has used SQLite for more than 10 years.


Can’t wait for something better to come along. The text orientedness of SQL is annoying. Something like Linq without the underlying SQL.


Amen. I've made this point over and over. SQL is fairly annoying to deal with; LINQ is very close to perfect for me, and PRQL is close:

https://github.com/prql/prql


Isn't "LINQ without the underlying SQL" simply list comprehensions?
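Something like this, anyway (hypothetical data, just to illustrate the where/order-by/select shape):

```python
# A LINQ-ish query as a plain Python comprehension: filter, sort, project.
users = [
    {"name": "ada", "age": 36},
    {"name": "bob", "age": 17},
    {"name": "cleo", "age": 52},
]

adults = [u["name"] for u in sorted(users, key=lambda u: u["age"]) if u["age"] >= 18]
print(adults)  # ['ada', 'cleo']
```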



Well plus serialization that’s not SQL.


The thing I always liked about LINQ was its lazy evaluation. You could write simple selects on your data model and join and filter in the service, allowing unit tests that operate on the joins.

I've yet to find a better way to write CRUD apps than that pattern.


I'm not sure what you meant, but just mentioning that if one uses Entity Framework or LINQ to SQL in .NET, LINQ should work with SQLite.


Use an object-relational mapper and leave the boring stuff to a library.


I hate ORMs.

After days of fumbling in the dark you finally isolate the problematic query, post it like "guys I found it," and still nobody speaks up. So you hack the ORM to attach tracebacks as comments to each query, retest everything, and finally git-blame your way to somebody who owns that bit of code and they're always like "wait, the ORM did WHAT?"

I'd rather grep code for SQL any day.
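For the record, the "attach tracebacks as comments" hack is only a few lines. A stdlib-only sketch of the idea (the real version hooked the ORM's execute path, e.g. via a statement-execution event; this wrapper just shows the mechanism):

```python
# Wrap execute() so every statement carries a trailing SQL comment naming
# its call site; the comment then shows up in query logs and slow-query traces.
import sqlite3
import traceback

class TracingConnection:
    def __init__(self, conn):
        self._conn = conn

    def execute(self, sql, params=()):
        # Second-to-last frame is whoever called execute().
        frame = traceback.extract_stack()[-2]
        comment = f"-- origin: {frame.filename}:{frame.lineno}"
        return self._conn.execute(f"{sql} {comment}", params)

db = TracingConnection(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE t (x)")
db.execute("INSERT INTO t VALUES (?)", (1,))
row = db.execute("SELECT x FROM t").fetchone()
print(row)  # (1,)
```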


That's why my favorite ORM in Java is MyBatis. It doesn't do SQL generation; it only maps objects to SQL parameters and results to objects.


That's no longer an ORM. These are called data mappers.


Sadly the choice of SQL tends to make the underlying framework very complicated and sometimes incapable.

Some things got easier to express after json_agg in Postgres became a thing (now you can do multi-depth relations without blowing up the result table with a Cartesian product), but it's still a massive pain in the ass, and json_agg has performance overhead too.
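For what it's worth, SQLite's JSON1 functions can pull the same trick: json_group_array is roughly the analogue of Postgres's json_agg. A sketch of a one-to-many fetch without the Cartesian blow-up (toy schema of my own):

```python
# One row per author, with posts aggregated into a JSON array instead of
# the author row being duplicated once per post.
import sqlite3
import json

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ada');
    INSERT INTO posts VALUES (1, 'first'), (1, 'second');
""")

row = db.execute("""
    SELECT a.name, json_group_array(p.title)
    FROM authors a JOIN posts p ON p.author_id = a.id
    GROUP BY a.id
""").fetchone()
print(row[0], json.loads(row[1]))
```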


Especially for reporting, I'm just as happy to write a serious piece of SQL and execute it directly.

This is the joy of SQLAlchemy: great features when you want to use them; escape hatches when you don't.


Right, most ORM tools give an escape hatch to execute raw SQL queries. However, even though SQL is Turing complete, it's not a great language for retrieval imho. It's probably an artifact of the underlying relational model (and the relational output it has to generate), but still.


I was all set to disagree with you, but a bit of googling yields => https://localcoder.org/is-sql-or-even-tsql-turing-complete


It could have been wrong, but the main argument was whether SQL was a good language for retrieval or not, and for me that's a no. It's probably subjective though :)


Obligatory: https://github.com/mathaou/termdbms

Granted I'm the author, but until I get around to rewriting it from the ground up, I really think it saves a lot of time for editing single cells, querying, and viewing in a headless environment.


Wait until these kids discover embedding BerkeleyDB...


Interesting usage of the word folklore.


Just another filesystem container.


But one you can query and play around with. It's absolutely phenomenal as a custom file type for your applications.


"I can probably figure out how to do what I need to do with [Key-Value] but"

It's key/value.


May?



