I encountered the same problem a few years ago and indeed realized that using categories to understand what type of article a thing was (person? subject? event?) was utterly useless, for the reasons you describe.
On the other hand, I discovered that infoboxes (the data in the top-right box on most pages) were generally extremely reliable, if frustrating to parse.
As far as I can tell, that is not the case, sadly.
Right now it appears that only 3,975 articles have infoboxes auto-generated from Wikidata. [1] The wikitext contains something like "{{Wikidata Infobox ...}}" instead of just "{{Infobox ...}}".
If you look up a popular article like Barack Obama [2], it's just a traditional hand-edited infobox. In fact, one of the first lines of data says "Vice President = Joe Biden", while the Wikidata entry for Barack Obama [3] doesn't reference Biden anywhere -- so not only is the Wikipedia infobox not generated from Wikidata, but Wikidata isn't pulling all the relevant info from Wikipedia either.
Back when I was working on my project, I'd hoped Wikidata could be a solution, but it was far too incomplete and its information was regularly out of date. Perhaps (hopefully) it's better now, but it's clearly not being used to power infoboxes yet except in a tiny number of cases. (Which actually complicates things further, since anybody parsing Wikipedia infoboxes now has to deal separately with the 3,975 that pull from Wikidata, because none of the actual data is copied over into the wikitext...)
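For anyone writing such a parser, here's a rough sketch of how the two cases could be told apart from the raw wikitext. It only relies on the two template prefixes quoted above ("{{Wikidata Infobox" vs. "{{Infobox"); the function name and the three return labels are made up for illustration, and it assumes you've already fetched the wikitext yourself. It doesn't try to handle redirects or differently named infobox templates.

```python
import re

# Matches the opening of a template call and captures its name,
# e.g. "{{Infobox person" captures "Infobox person".
_TEMPLATE_OPEN = re.compile(r"\{\{\s*([^|{}\n]+)")


def infobox_kind(wikitext: str) -> str:
    """Return 'wikidata', 'traditional', or 'none' (hypothetical labels)."""
    for match in _TEMPLATE_OPEN.finditer(wikitext):
        name = match.group(1).strip().lower()
        # Check the Wikidata-backed prefix first; the data is not in the wikitext.
        if name.startswith("wikidata infobox"):
            return "wikidata"
        # Hand-edited infobox: the key/value pairs follow in the wikitext itself.
        if name.startswith("infobox"):
            return "traditional"
    return "none"
```

Pages that come back as "wikidata" would need a separate lookup against Wikidata, since there's nothing in the wikitext to parse; the "traditional" ones can go through whatever infobox parser you already have.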