Hacker News new | past | comments | ask | show | jobs | submit login
There are over one trillion SQLite databases in active use (sqlite.org)
148 points by nothrowaways on Dec 6, 2021 | hide | past | favorite | 86 comments



SQLlite has also got probably the most well-tested code of conduct of any open source project in the world (aside from the other open source projects which use this CoC)

https://sqlite.org/codeofethics.html


I don't think anyone can reasonably get angry over someone else following these. Good rules.


> Speak no useless words or words that move to laughter.

I'm extremely biased coming from a family with a long tradition of atheism and even anticlerical "militantism".


PS: the second line here, though true, is meant as a meta-joke contradicting the quoted commandment


Haha I’d love to agree but it seems there are plenty of people in this thread that are upset about it nonetheless


You can find that just about any statement will result in plenty of people being upset about it. We in the first world society have made life so resource abundant that people need to invent new ways to be seen as a victim, be offended by anything or oppressed in some way.


Well, you can agree, and they can continue being upset about other people's life choices. Letting the pop vox dictate your speech isn't wise.


> Be in dread of hell.

I have thought about this one a bit, as an Atheist, and frankly, I have a problem with it. Ok, heaven, yes. Eternal peace and love, sounds great. Hell, obviously bad, the torture, the pain. But, to suffer pain I need to be conscious, awake, aware.

For all Eternity.

And surely the Demons must take breaks. nip out for a quick fag. Maybe a long lunch at the pub. I mean they might not tell Satan they were doing it - these are evil beings after all, not known for following the rules. So 15 minutes here, an hour there. Over a week it will add up.

And weeks add up over all of eternity for infinite time.

I mean. You could manage. If you knew you had infinite time, to think, to be. Compared to my belief, extinction, Hell is probably good runners up medal.


I wonder if they've ever had to enforce "Be not drowsy" or "Be in dread of hell".


Isn’t SQLite essentially not accepting outside contributions at all? In such a case this CoC really might work just fine.

As would a red “GO AWAY” banner.


> This document continues to be used for its original purpose - providing a reference to fill in the "code of conduct" box on supplier registration forms.


One of the examples of wildly successful "one-man-army" software (conceived of and built by a single person). See also: Redis, Linux, Doom/Quake, Steve Wozniak, jQuery, Python and countless obscure modern SaaS products.


Steve Wozniak was a software project?


>conceived of and built by a single person

Clearly Wozniak is Jesus.


Their extensive test library is proprietary, so they ensure the path of least resistance for companies is to hire themselves to make improvements to the library.

That has helped ensure there are no forks given quality would be inferior.

It also kept SQLite out of the browser since there are no competing implementations of it.


SQLite is extensively used in web browser codebases. If you check your Firefox profile, it's well-packed with SQLite databases.


I’m fairly sure that refers to https://en.wikipedia.org/wiki/Web_SQL_Database:

“Web SQL Database is a deprecated web browser API specification for storing data in databases that can be queried using SQL variant.

The API is supported by Google Chrome, Opera, and the Android Browser.

The W3C Web Applications Working Group ceased working on the specification in November 2010, citing a lack of independent implementations (i.e. using database system other than SQLite as the backend) as the reason the specification could not move forward to become a W3C Recommendation.”


I see, thanks. I don't totally agree, though. The comment implied that SQLite's test suite being proprietary is what "kept SQLite out of the browser" by preventing forks, when what kept SQLite out of the browser was just that there were no other high-quality embeddable SQL libraries. Perhaps the test suite being non-proprietary would have allowed for easier forking, but the lack of an available open test suite certainly doesn't *prevent* forks of a public domain codebase, and nothing prevents competing embedded SQL libraries other than the lack of motivation to compete with SQLite.

Even if SQLite's test suite was open, I wouldn't see any big forks taking off, because why would you really want to build a competitor to something that's already super easy and free to use, extremely high quality, very high performance? What really is the drive? If SQLite ever does something the community really doesn't like, then a fork would be worth starting, but unless that happens, I don't see any point, and I really don't see that happening any time soon, given the consistent quality and power of SQLite.


They don't allow pull requests could be one reason. E.g. if you wanted to take SQLite in a different direction.

https://www.sqlite.org/copyright.html


What a bullshit reason. This is why we can’t have good things.


It's sort-of misleading IMHO. A shared backend would be fine (i.e. I believe the independent IndexedDB implementations all end up putting stuff into sqlite, but nevertheless count as independent since they demonstrate that the API isn't tied to it), but the spec draft said

> User agents must implement the SQL dialect supported by Sqlite 3.6.19.

as the only definition on what's allowed SQL, which is a no-go for a spec. The authors could have provided a specification of a required SQL dialect that's sufficient to implement it from scratch/on top of another database, but decided to abandon the effort instead.


Ah, I see. The wording on the Wikipedia page isn't as good as the wording on the specification draft itself:

> This document was on the W3C Recommendation track but specification work has stopped. The specification reached an impasse: all interested implementors have used the same SQL backend (Sqlite), but we need multiple independent implementations to proceed along a standardisation path.

It seems like they didn't just need a specified SQL dialect, but multiple actual implementations, but I might be reading into the wording incorrectly.


It would likely be necessary to have an independent interpreter for the SQL dialect at least, yes. I.e. I assume that something that parses the SQL from scratch and then down the line hands stuff to sqlite as the storage engine would qualify, since the SQL dialect is the point of contention. (The "multiple independent implementations" requirement is afaik also not 100% hard - if everyone agrees to ignore it, it can be ignored, but it's a strong objection if someone disagrees)


SQLite is one of the most amazing pieces of software I've ever seen. It's reliable, it doesn't add any inconvenience to end users, and it's very easy to use.


https://twitter.com/bagder/status/1467895841541210118?t=pPU8... and https://daniel.haxx.se/blog/2021/10/21/the-most-used-softwar... might be relevant here. TLDR; cURL is likely installed in even more devices (but not listed on SQLite's page).


I would hazard a guess that most used "software module of any description" is dtoa().


Richard Hipp discussed this (alongside a deep-dive on the philosophies underpinning his Fossil SCM) when we had him back on The Changelog podcast recently:

https://changelog.fm/454


Ok tens of billions I believe, but trillion+ is likely an exaggeration


> Since SQLite is used extensively in every smartphone, and there are more than 4.0 billion (4.0e9) smartphones in active use, each holding hundreds of SQLite database files, it is seems likely that there are over one trillion (1e12) SQLite databases in active use.

Sounds reasonable to me.


Hundreds of SQLite files? In active use? Unless iOS itself spawns a few hundred instances and actively uses them all the time I have a hard time thinking of what would use so many databases, much less that they’re actually in active use.


Well. I counted 27 SQLite files in my Chrome profile. On a PC though. But as they are used to store History, Cookie, etc. I guess on iOS/Android situations are similar.

And that's one single app.


What does active use really mean? The databases themselves are only accessed for a few nanoseconds at a time.


I think "something somebody uses will break if this file is missing or corrupt" probably makes a reasonable definition here.


That's true of most software though, even when you're staring at their GUI. It's part of the normal operation of the software, where do you draw the line?


The OS does do that, on behalf of applications CoreData or those using SQLite directly.


No, that still seems inflated. That's like counting installs of Word or Excel by counting the number of documents that exist.


The pays says: "Billions and billions of copies of SQLite exist in the wild". Which seems reasonable and is counting the number of installs as you preferred.

Later on the page says "it is seems likely that there are over one trillion (1e12) SQLite databases in active use" which is the quoted title and seems like a reasonable claim and is counting "documents" not installs.

I don't think "documents" is an uninteresting statistic. It is just a different one.


That's exactly what's being counted. A database is a SQLite document.


That's... Exactly what a SQLite database is - a file. This thread isn't about the number of unique devices that potentially use SQLite


"Active" is a big word here. Each of these databases is in nearly continuous use. You couldn't say that about the vast majority of excel spreadsheets, or any word documents.


Tens of billions of installs of the database _engine_, trillions of databases (ie SQLite files).

The webpage is a bit unclear about the difference, I'll agree


He's counting database files (100/phone), not physical devices.


Geopackage uses sqlite under hood https://en.m.wikipedia.org/wiki/GeoPackage .


Don’t forget to count all the cruise missiles, too!


Those tend to fall out of the “active use” category not long after entering it, so it’s probably not a big factor.

On a more serious note, had there been something special about SQLite supporting military use of their project that prompted this? Otherwise that’s the point of free software, people are free to do (mostly) whatever they want with it. Including building weapons. SQLite is so prevalent, it’s almost like pointing out you have to include cruise missiles in the count of active x86 chips, or $insert_bad_guy as an active user of roads and electricity.


I'm sure it was picked a long time ago and it's too costly to switch. I imagine any code going into multi-million dollar missile needs to be incredibly audited and understood.

(Not that this never happens! https://www.simscale.com/blog/2017/12/nasa-mars-climate-orbi...)


I was just making a funny joke about how it was developed originally for cruise missiles. Though it turns out I was misremembering: it was made for guided missile destroyers. And it makes sense! What a great use case. https://en.wikipedia.org/wiki/SQLite#History


> Those tend to fall out of the “active use”

At least you don't have to write garbage collection code for those. It just crashes.


I remember an interview whereby he mentioned sqlite was initially built for software on warships.


i have a feeling that every time SQLite is mentioned on HN it always gets to the frontpage

great for them i guess!


They have a lot of fans, of the type of people that haunt HN.

High quality project that is useful for for small hobbyist projects, startups and large companies alike. Persisting data something that is used in probably the majority of software projects across many verticals. It is OSS software that is easy to pick up. It is a values driven project (both technical and ethical). And the team maintaining it has kept it going for a long time, so as it get rediscovered by others it still has relevancy for many.

Basically it is good software, with a wide TAM, and just enough quirks to appeal to the folks here.


[flagged]


Let's not get distracted by an extraneous provocation. As you can imagine, this has come up frequently when people run into that bit, and tends to lead to the same discussion over and over (and ultimately, religious flamewar).

There's a new site guideline about this actually:

"Please don't pick the most provocative thing in an article and rush to the thread to complain about it. Find something interesting to comment about instead."

https://news.ycombinator.com/newsguidelines.html

We detached this subthread from https://news.ycombinator.com/item?id=29461959.


> "Please don't pick the most provocative thing in an article and rush to the thread to complain about it. Find something interesting to comment about instead."

So you agree it's a provocative thing. Good, that's all I meant to convey.


Yes, but we're specifically asking you not to convey it, for the same reason that one doesn't convey lit matches to dry forests.


> No one is required to follow The Rule...[The SQLite developers] view The Rule as their promise to all SQLite users of how the developers are expected to behave. This is a one-way promise, or covenant. In other words, the developers are saying: "We will treat you this way regardless of how you treat us."

I don't see a problem here, despite my own disinterest in following the Christian faith or any other. It's not hard to distinguish what's applicable from what's not in the verbatim quote of the Rule as given in the document, and if I had maintained my childhood Catholicism I too might demur from merely excerpting a saint.

Had anyone involved with SQLite mistreated someone for not happening to share their faith, I would expect it quite quickly to become public knowledge at least in the community including HN. I haven't heard such a report. Have you?


It put me off until I read that part about the code that's laid out being optional.

They're not forcing anything on anyone. I'm perfectly fine with that.


In fact, that would go against their code of ethics.


While true, I suspect this may not reliably qualify as a compelling argument for someone disposed to raise prima facie concerns about the same code of ethics because it makes open reference to deity.

Then again, I suppose an argument from behavior may also be less likely to convince an interlocutor more inclined to concern with what someone else believes than with how they actually behave. So it goes.


The code of ethics is a 1500 year old "code of ethics" meant to govern the lives of benedictine monks: https://en.wikipedia.org/wiki/D._Richard_Hipp

That might provide some context.

edit: haha, this is what I get for skimming the wikipedia article. He has a PhD in computer science, but the full "doctorate of philosophy" is written out instead of abbreviation on that wikipedia article. My pattern matcher needs tuning, apparently. Thanks to those who pointed this out.


His PhD is in Computer Science.


I think you are taking what “Ph.D.” stands for to be his actual area of study.


Okay that's embarassing. Serves me right for just skimming his article. Thanks for the correction, edited my post.


Not quite April fool's, but it's definitely a joke, but one with a practical purpose.

> This document continues to be used for its original purpose - providing a reference to fill in the "code of conduct" box on supplier registration forms.

> [we] have pledged to govern [...] in accordance with the "instruments of good works" from chapter 4 of The Rule of St. Benedict (hereafter: "The Rule"). This code of ethics has proven its mettle in thousands of diverse communities for over 1,500 years, and has served as a baseline for many civil law codes since the time of Charlemagne.

So on paper, this makes it a very good code of conduct - something very safe to put on a supplier registration form.


> No one is required to follow The Rule, to know The Rule, or even to think that The Rule is a good idea. The Founder of SQLite believes that anyone who follows The Rule will live a happier and more productive life, but individuals are free to dispute or ignore that advice if they wish.


It's a reproduction of Saint Benedict's code of conduct for overseeing monks; it's not going to be 1:1.

It might be soft evangelism but it's not being used for discrimination.


>must be some sort of forgotten April fool's remnant.

I don't think so. His homepage has a couple of religious references, and you can find stuff like this in the tests: https://github.com/sqlite/sqlite/blob/37d4ec86bfa78c31732132...


That's just because you disagree. Go read the Rust code of ethics, it may be more up your alley.


Did you read 1. History and 2. Purpose?


SQLite code of conduct, §17: bury the dead.

Hmmm… ok.


To be fair it's better than leaving them laying around in the open


It isn't. The SQLite team is 3ish people who don't accept outside contributions and are all Christan.


They'd have one trillion + 1 if they had better JSON support.

I'd be using it in a large healthcare project I'm working on now.

Unfortunately, the JSON support is basically "serialize to a string". Postgres is miles ahead with jsonb.

Shame, I <3 sqlite.


I love PostgreSQL, and would recommend it anytime. Even though jsonb is very powerful, in general if you store JSON in a SQL database you're very likely doing something wrong. (You're breaking first normal form)

But if you really wanted to use SQLite to store JSON à la jsonb in PostgreSQL you can use generated fields[1]

    sqlite> create table t(id integer primary key autoincrement, data text);
    sqlite> insert into t(data) values ('{"foo": "value", "bar": "other value"}'), ('{"foo": "baz", "bar": "qux"}');'
    […]
    sqlite> alter table t add column foo text generated always as (json_extract(data, '$.foo')) virtual;
    sqlite> select * from t;
    id  data                                    foo  
    --  --------------------------------------  -----
    1   {"foo": "value", "bar": "other value"}  value
    2   {"foo": "baz", "bar": "qux"}            baz

You can even use a "stored" (instead of 'virtual') generated field, and create an index on it for fast lookups.

It's not as powerful as Postgres, but it does a pretty good job.

[1] https://www.sqlite.org/gencol.html


Database optimization and the level of normalization is a complex topic that varies widely based on application and workload.

JSON is absolutely a valid approach to data storage when dealing with certain data structures. As is using a relational database and jsonb data types for it. Cherry picking columns gives you the advantages for both a NoSQL document storage database as well as a relational analytics database.

This is a pretty common modern technique that Postgres, for example, has made easy.

Go try modeling a FHIR database structure using standard normalization rules.

You'll quickly discover why no one does it. Even HAPI FHIR server (the most full featured and popular FHIR server) written in Java doesn't attempt it.


Disagree storing json in databases is often the best thing to do. Clickhouse for example allows you to quickly pull out relevant data and feed materialized views.


I never said storing json in databases was wrong. I said storing json in *SQL* databases was an anti-pattern.

https://en.wikipedia.org/wiki/First_normal_form


It's perfectly fine

one column for a timestamp and one column to store the json

You then build materialized views pulling out whatever json fields you need.


Just because its easier doesn't mean its better - json is just one possible materialization for your set.


Could you expand on what Postgres offers that SQLite doesn't in this regard? I'm curious to know what I'm missing out on.


Not GP, but PostgreSQL offers jsonb fields[1]

Since PostgreSQL allows for functional indexes, you can query and index the data in a structured way.

For example you have a table t with "id" and "data" where data is a jsonb field like {foo: ..., bar: ...}. You can do

    SELECT id, data->'foo'
    FROM   t
    WHERE  data->'bar' > 5
Which will yield the value of "foo" (inside the json field) for rows where "bar" (inside the json field) is greater than 5.

[1] https://www.postgresql.org/docs/9.5/functions-json.html


Unless I'm misunderstanding SQLite can do this as well, but with less nice syntax. The above could be written something like:

    SELECT id, json_extract(data, '$.foo')
    FROM   t
    WHERE  json_extract(data, '$.bar') > 5
in SQLite. You can also add an index on one of these expressions: https://dgl.cx/2020/06/sqlite-json-support


It seems like one thing is that Postgres has two types for JSON: JSON and JSONB.

Both validate syntax on insert. The JSONB datatype (not JSON; note the B on the end) further stores the content in a parsed, binary form so future accesses are faster. The schema author does not have to indicate which columns they would like "fast" access to - queries are free to access any arbitrary path in the JSON, and will get reads that don't require JSON parsing.

It seems like SQLite can achieve something similar but not quite the same. It looks like the user has to specifically enumerate and materialize the columns to which they want fast access.

So it seems like json_extract(...) is comparable to Postgres's JSON type, but there's not quite an analagous thing for Postgres's JSONB type.


That is true, but SQLite doesn't support functional indexing. This means that you cannot do

    CREATE INDEX idx_t_bar ON t(json_extract(data, '$.bar'));
for fast lookups. But you can do

    CREATE INDEX idx_t_bar ON t(data->'bar');
in PostgreSQL. You have some workarounds like I explained.[1]

Also PostgreSQL has many operators[2] which makes using JSON fields easier.

[1] https://news.ycombinator.com/item?id=29461926

[2] https://www.postgresql.org/docs/9.5/functions-json.html


> SQLite doesn't support functional indexing

SQLite has "Indexes On Expressions"[1] since 2015. I'm assuming functional indexing in Postgres is the same. So I think the syntax would be:

    CREATE INDEX idx_t_bar ON json_extract(data, '$.bar');
[1]: https://www.sqlite.org/expridx.html

> Also PostgreSQL has many operators[2] which makes using JSON fields easier.

This does look a little more convenient to write than SQLite's JSON syntax, but I don't see any fundamental difference here.


Postgres's JSONB type can essentially turn Postgres into a full NoSQL database. Infact, its so good that it can beat MongoDB at its own game in a lot of benchmarks.

https://www.theguardian.com/info/2018/nov/30/bye-bye-mongo-h...

Example: The Guardian switched from Mongo to Postgres jsonb as a nosql db.


Sqlite is just that, lightweight. If you want better JSON support specialized for what you want, extend it. The product allows extensions easily.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: