Although I agree with his definitions, I meet (and work with) many people who vehemently disagree - especially on things like "unit tests should never use a database". I wish there were some authority who could establish useful working definitions once and for all.
I think it's a generally accepted rule rather than a hard and fast one. That's because unit tests are usually expected to run faster than other test types, and they're expected to be the most deterministic. Using an actual database for a unit test can not only slow things down because of setup, but also cause state leakage between tests (which may not be a problem for the app itself) and introduce more possibilities for race conditions and random test failures, since it relies on a separate process.
But I see no reason why a unit test has to be databaseless. Of course you can write a unit test that uses a database! If it makes sense to do it, then do it. There's no computer god dictating rules to us that we are not allowed to break. However, it probably doesn't make sense to do so. ;)
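To make that concrete, here's roughly what the "no database" style looks like in Python (all the names here are made up, it's just a sketch): the test hands the code an in-memory fake instead of a real repository, so there's no external process to set up or to leak state between tests.

    import unittest

    class UserService:
        def __init__(self, repo):
            self.repo = repo  # anything with a get(name) method, real or fake

        def greeting(self, name):
            user = self.repo.get(name)
            return f"Hello, {user}!" if user else "Hello, stranger!"

    class FakeRepo:
        """In-memory stand-in for a database-backed repository."""
        def __init__(self, rows):
            self.rows = rows

        def get(self, name):
            return self.rows.get(name)

    class GreetingTest(unittest.TestCase):
        def test_known_user(self):
            service = UserService(FakeRepo({"alice": "alice"}))
            self.assertEqual(service.greeting("alice"), "Hello, alice!")

    if __name__ == "__main__":
        unittest.main()

Nothing stops you from passing a real database-backed repo into the same test; it just gets slower and picks up the failure modes above.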
If a term is popular, people will use it in different ways. I don't see much wrong with that, so long as someone is using the terms usefully.
e.g. with "unit test", there's surely more than one useful thing to emphasise:
I suppose if a test is small and doesn't talk to a database, etc., then it will be quick. -- I think that's one thing people mean when they say "unit tests": "tests that I can run quickly".
I think another sense of the word "unit" is more sophisticated, meaning "the system under test should be a coherent unit" or something, in the same way that some function/class/etc. should have a single responsibility.
By definition it's not a unit test if it uses a database. A unit test tests an individual method or function in isolation, specifically without testing anything else.
> I wish there was some authority
It's an interesting idea, since we do have authorities for things like language specifications. In this case, though, I do think the definitions of the different types of tests are pretty well established in our industry [1]. You can find the same definitions given at that source in literally thousands of other places around the web, and I would be very surprised to find someone with experience claiming that unit tests should hit the database. That's not to say tests should not use a database. But that's a different type of test.
Reality is fuzzier than you're making it out to be.
There's no such thing as not testing anything else -- even if you're testing a pure function in a functional language, you're still going to be pulling in behavior from the OS, from your runtime environment, etc. We arbitrarily draw lines around certain interactions and say that a single function can't call into a database or REST endpoint, but can interact with complicated systems built on multiple abstractions and frameworks that are baked into the OS/language.
Units are built out of units! Almost every useful unit test will encompass multiple smaller units of code.
I think a lot of harm in testing comes from people thinking too hard about the distinction between integration and unit tests instead of asking, "what is the actual behavior I want to test, and how can I test just that and nothing else?"
For some codebases, it might make sense to have a test (whether you want to call it integration or unit) that hits the database but mocks a large portion of the rest of the code. For example, if you're writing an app that's being translated into multiple languages, you probably don't want your full-stack integration tests to look at translated strings in each language, since your translation team is going to be changing those strings all the time and it's wrong behavior for a test to fail because your translation team fixed a typo.
That's a situation where it makes sense to have some kind of mock or special translation that sidesteps the issue and doesn't try to test the entire system exactly as a client would see it.
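Concretely, the kind of thing I mean (a Python sketch with invented names, not anyone's actual app): replace the translation lookup with an identity function, so the test asserts on the message id rather than on whatever wording the translators currently use.

    def render_banner(translate):
        # Takes a gettext-style callable so tests can substitute their own.
        return f"<h1>{translate('welcome.title')}</h1>"

    def test_banner_checks_the_message_id_not_the_wording():
        # Fake translator: returns the message id unchanged, so a translator
        # fixing a typo in the English string can't break this test.
        assert render_banner(lambda msg_id: msg_id) == "<h1>welcome.title</h1>"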
The unit/integration test distinction takes something that is fundamentally a continuum and turns it into a binary yes/no question. This leads to dogmatism, where people make bad testing choices purely because, "by definition, my integration test can't mock this 3rd-party service that obviously should be mocked."
In contrast, if you spend all your time thinking about tests purely in terms of refactorability, reliability, and coverage, you will often end up with good tests that catch a lot of bugs regardless of whether anyone else calls them integration tests, functional tests, or unit tests.
You are making things way more complicated than they need to be. If your test needs to hit the database just call it a functional or integration or end-to-end test. What's the problem with that? What's the need to call it a unit test? You even seem to make this point yourself:
> For some codebases, it might make sense to have a test (whether you want to call it integration or unit) that hits the database...
> There's no such thing as not testing anything else -- even if you're testing a pure function in a functional language, you're still going to be pulling in behavior from the OS, from your runtime environment, etc.
I don't see how this point adds to the conversation around automated testing and different types of testing. Sometimes we mock the database to make tests faster. And sometimes we even mock the OS. I just recently wrote a test that mocks the filesystem. And I didn't have to do any work to do that. There's already a package available that does it for me.
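The details of my test aren't important, but in Python, for example, pyfakefs does this kind of thing (a sketch only, and not necessarily the package I used): its pytest fixture swaps the real filesystem for an in-memory fake.

    def save_report(path, text):
        with open(path, "w") as f:
            f.write(text)

    def test_save_report_writes_the_file(fs):
        # "fs" comes from pyfakefs's pytest plugin; open()/os calls hit a fake
        # in-memory filesystem, so nothing touches the real disk.
        fs.create_dir("/reports")
        save_report("/reports/out.txt", "hello")
        with open("/reports/out.txt") as f:
            assert f.read() == "hello"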
> Units are built out of units! Almost every useful unit test will encompass multiple smaller units of code.
This seems like another point that doesn't add to the conversation around how to write useful tests. It sounds like pedantry around definitions to me, but maybe I'm missing some nuanced point that you're trying to make.
> The unit/integration test distinction takes something that is fundamentally a continuum and turns it into a binary yes/no question.
That's a fair observation. Binary definitions are also extremely useful in some cases. Sometimes you need to know whether something is black or white and there is simply a cutoff point between the two. That's useful in all sorts of situations in life, including talking about tests.
> This leads to dogmatism, where people make bad testing choices purely because, "by definition, my integration test can't mock this 3rd-party service that obviously should be mocked."
That's not what leads to dogmatism. If the best test for the purpose needs to hit the database, just call it an integration test. What exactly is the problem?
> In contrast, if you spend all your time thinking about tests purely in terms of refactorability, reliability, and coverage...
Those are good factors to consider. So is the time it takes to run tests. If I can run my unit tests in a few seconds, I will use them often. My current end-to-end tests take well over 30 minutes to run. They obviously do not get run hundreds of times a day. So there is a very useful conversation around what we want to include in our end-to-end tests, and more importantly what we want to exclude. And without definitions like "unit test" and "end-to-end test", those conversations would be needlessly awkward and take longer. Hopefully no one is suggesting we drop "unit tests" in favor of a "tests that run super fast and before code merge" category?
When someone on my team says "let's leave that out of the end-to-end tests and just write some unit tests to cover this lesser used feature" everyone on the team knows exactly what to do. It's not really that hard to develop a shared and useful testing vocabulary on a team, and that shared vocabulary can and should come from the widely used definitions already out there. If you're getting dogmatic about definitions on your team, the problem is not the definitions themselves.
I've seen people define "unit test" as something that could use a database. I've seen others vehemently disagree with this.
"Unit" is not a particularly well defined thing.
I think this lack of agreement is part of the problem, because it makes it impossible to have a conversation using these terms.
For this reason I usually avoid talking about unit tests and try to use something a bit more specific (e.g. xUnit framework test) to highlight what I mean.
> "Unit" is not a particularly well defined thing.
Fair point. I think in this case I would stop focusing on definitions and focus on goals. Say that one goal for our unit tests is for them to be lightning fast, so disk access, including hitting the database, is not allowed. Feel free to write a "unit test" that hits the database. But we are going to run that "unit test" along with our slower-running tests that we call "functional tests".
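Mechanically that split can be as simple as a test marker. A pytest sketch, with the marker name and the sqlite stand-in both invented for illustration:

    import sqlite3
    import pytest

    @pytest.mark.functional  # register in pytest.ini: markers = functional: touches a database
    def test_lookup_against_a_database():
        # Stand-in for a test that talks to an actual database.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (name TEXT)")
        conn.execute("INSERT INTO users VALUES ('alice')")
        assert conn.execute("SELECT count(*) FROM users").fetchone()[0] == 1

    def test_pure_logic():
        # Unmarked, so it stays in the fast pre-merge run.
        assert "alice".title() == "Alice"

    # Fast suite:  pytest -m "not functional"
    # Full suite:  pytest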
I really don't get the position that a test that is potentially harder to write (because mocks) and catches fewer bugs (because mocks) but takes 0.1 seconds to run instead of 3 seconds is intrinsically "better".
It's not intrinsically better. It's just a different type of test. And which type of test you want to use and when is going to be based on your goals as an organization. 5000 unit tests times 3 seconds each is over 4 hours, isn't it? Compared to roughly 8 minutes at 0.1 seconds each. That matters in some organizations.