I mean, no? None of the AI-generated images managed to be indistinguishable. Some people were much better than others at spotting the differences. He even quotes, at length, an artist giving a detailed breakdown of what's wrong with one of the images he thought was good.
Did you read the article? Respondents performed barely better than chance. Sure, no one was actually 100% wrong[0]. Just wrong a lot of the time, with a noticeable bias towards liking the AI art more.
The detailed breakdown you mention? Maybe it's accurate to that artist's thought process, maybe it's more of a rationalization; either way, it's not a general rule that they, or anyone else, could apply to the other AI images. Most of the images in the article don't exhibit those "telltale signs", and the one that does - the Victorian Megaship - was actually made by a human artist with no AI in the mix.
EDIT:
Another image that stands out to me is Riverside Cafe. I, like apparently a lot of other people going by the article's comments, assumed it was human-made, because we vaguely remembered Van Gogh painting something like it. He did - it's called Café Terrace at Night - and yet, despite immediately evoking the association, Riverside Cafe was made by AI, and is actually nothing like Café Terrace at Night at any level.
(I find it fascinating how this work looks like a copy of Van Gogh at first glance, for no obvious reason, yet looks nothing like him once you pause to examine it more closely. It's like... they have similar low-frequency spectra or something?)
EDIT2:
Played around with the two images in https://ejectamenta.com/imaging-experiments/fourifier/. There are some similarities in the spectra, though I can't put my finger on exactly what they are. It's probably not the whole answer either way. I'll try some more detailed experimentation later.
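Here's roughly what I have in mind - a minimal sketch, not a finished experiment (the filenames are placeholders for wherever the two images are saved, and cosine similarity of the low-frequency block is just one crude way of comparing coarse structure):

  import numpy as np
  from PIL import Image

  def low_freq_spectrum(path, size=256, keep=16):
      # Grayscale and resize so both images share the same frequency grid.
      img = np.asarray(Image.open(path).convert("L").resize((size, size)), dtype=float)
      # 2D FFT magnitude, shifted so the zero frequency sits in the centre.
      spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img)))
      # Keep only the central (low-frequency) block.
      c = size // 2
      block = spectrum[c - keep:c + keep, c - keep:c + keep].copy()
      block[keep, keep] = 0.0  # drop the DC term so overall brightness doesn't dominate
      return block / np.linalg.norm(block)

  a = low_freq_spectrum("riverside_cafe.jpg")           # placeholder filename
  b = low_freq_spectrum("cafe_terrace_at_night.jpg")    # placeholder filename

  # Cosine similarity of the low-frequency blocks: closer to 1.0 means the
  # big masses of light and dark are laid out similarly, even if the fine detail differs.
  print(np.sum(a * b))

If the low-frequency blocks come out markedly more similar than those of two unrelated paintings, that would at least be consistent with the "similar at a squint, nothing alike up close" impression.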
--
[0] - Nor should you expect it - being 100% wrong would mean either perfect (if inverted) calibration, or the equivalent of flipping a coin and getting heads 30 times in a row; it's not impossible, but you shouldn't expect to see it unless you're interviewing something close to the entire population of the planet.
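A quick back-of-envelope check in Python, for anyone who wants to verify (the world-population figure and the ~50-image test length are my own assumptions, not numbers from the post above):

  p_30_heads = 0.5 ** 30           # ~9.3e-10, the coin-flip analogy above
  world_pop = 8.1e9                # rough current world population
  print(world_pop * p_30_heads)    # ~7.5 such people expected planet-wide

  # With ~50 images in the actual test, a 100%-wrong score by pure guessing
  # is far less likely still:
  p_all_50_wrong = 0.5 ** 50       # ~8.9e-16
  print(11_000 * p_all_50_wrong)   # ~1e-11 expected among 11,000 respondents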
> The average participant scored 60%, but people who hated AI art scored 64%, professional artists scored 66%, and people who were both professional artists and hated AI art scored 68%.
> The highest score was 98% (49/50), which 5 out of 11,000 people achieved. Even with 11,000 people, getting scores this high by luck alone is near-impossible.
This accurately boils down to "cannot reliably be binned as AI-generated". Your objection amounts to a vanishingly small number of people - who knew they were being tested - managing to do a pretty good job of it.
If 0.05% of people (5 in 11,000) who are specifically judging art as AI or not AI, in a test which presumably attracts people who would like to be able to do that thing, can do a 98% accurate job, and the average is around 60%: that isn't reliable.
If that doesn't work for you, I encourage you to take the test. Obviously since you've read the article there are some spoilers, but there's still plenty of chances to get it right or wrong. I think you'll discover that you, too, cannot do this reliably. Let us know what happens.
I can't do it reliably and I don't want to - I learnt to spot certain popular video compression artifacts in my youth, and that has not enhanced my life. But any distinction that random people taking a casual internet survey get right 60% of the time is absolutely one that you can make reliably if you put in the effort. Look at something like chicken sexing.