You are making things way more complicated than they need to be. If your test needs to hit the database, just call it a functional, integration, or end-to-end test. What's the problem with that? What's the need to call it a unit test? You even seem to make this point yourself:
> For some codebases, it might make sense to have a test (whether you want to call it integration or unit) that hits the database...
> There's no such thing as not testing anything else -- even if you're testing a pure function in a functional language, you're still going to be pulling in behavior from the OS, from your runtime environment, etc.
I don't see how this point adds to the conversation about automated testing and the different types of tests. Sometimes we mock the database to make tests faster, and sometimes we even mock the OS. I recently wrote a test that mocks the filesystem, and I didn't have to do any work to do it; there's already a package that does that for me.
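For illustration, here's a minimal sketch of what that kind of test can look like, assuming a Python codebase with pytest and pyfakefs (the original comment doesn't name the language or the package, and `read_config` is a hypothetical function under test):

```python
import json

def read_config(path):
    # Hypothetical code under test: reads a JSON config file from disk.
    with open(path) as f:
        return json.load(f)

def test_read_config(fs):  # "fs" is pyfakefs's fake-filesystem fixture
    # The file exists only in the in-memory fake filesystem, never on disk.
    fs.create_file("/etc/myapp/config.json", contents='{"debug": true}')
    assert read_config("/etc/myapp/config.json") == {"debug": True}
```

The point stands either way: the mocking work is done by the library, not by me.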
> Units are built out of units! Almost every useful unit test will encompass multiple smaller units of code.
This seems like another point that doesn't add to the conversation about how to write useful tests. It sounds like pedantry about definitions to me, but maybe I'm missing some nuanced point that you're trying to make.
> The unit/integration test distinction takes something that is fundamentally a continuum and turns it into a binary yes/no question.
That's a fair observation, but binary distinctions are also extremely useful in some cases. Sometimes you need to know whether something is black or white, and a simple cutoff point between the two does the job. That's useful in all sorts of situations in life, including talking about tests.
> This leads to dogmatism, where people make bad testing choices purely because, "by definition, my integration test can't mock this 3rd-party service that obviously should be mocked."
That's not what leads to dogmatism. If the best test for the purpose needs to hit the database, just call it an integration test. What exactly is the problem?
> In contrast, if you spend all your time thinking about tests purely in terms of refactorability, reliability, and coverage...
Those are good factors to consider. So is the time it takes to run the tests. If I can run my unit tests in a few seconds, I will run them often. My current end-to-end tests take well over 30 minutes, so they obviously do not get run hundreds of times a day. That makes for a very useful conversation about what we want to include in our end-to-end tests and, more importantly, what we want to exclude. Without definitions like "unit test" and "end-to-end test", those conversations would be needlessly awkward and take longer. Hopefully no one is suggesting we drop "unit tests" in favor of a "tests that run super fast before code merge" category?
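For what it's worth, that split can be encoded directly in the test suite. Here's a minimal sketch using pytest markers (assuming pytest; the `e2e` marker name and the test bodies are illustrative, not from the original discussion):

```python
import pytest

def test_discount_calculation():
    # Fast unit test: no I/O, runs on every change.
    assert round(100 * 0.85, 2) == 85.0

@pytest.mark.e2e
def test_checkout_flow():
    # Slow end-to-end test: drives the real services, runs before merge/release.
    ...
```

With the marker registered in `pytest.ini` (`markers = e2e: end-to-end tests`), `pytest -m "not e2e"` gives you the few-seconds loop and `pytest -m e2e` the 30-minute run.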
When someone on my team says "let's leave that out of the end-to-end tests and just write some unit tests to cover this lesser-used feature", everyone on the team knows exactly what to do. It's not really that hard to develop a shared, useful testing vocabulary on a team, and that shared vocabulary can and should come from the widely used definitions already out there. If you're getting dogmatic about definitions on your team, the problem is not the definitions themselves.