
A very long time ago I did some programming to help a friend of a friend analyze a data set. The data came from public schools and was being used to inform some policies at the state level. All I can say is that the data submitted by many schools had very disturbing patterns of regularity: identical records on metrics where you would not expect them. I'd see 10 or 15 records in a row, in the same rough geographic area (we didn't have exact addresses), with the exact same scores in all subjects as well as in the reading and math assessments. Basically it looked like someone had copy-pasted rows of data over and over while keeping the original unique ID numbers. The end result was different ID numbers, representing different students, all with the exact same scores. And guess what: whenever I saw that pattern, the scores were all well above average. You didn't see that pattern with below-average scores.
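For anyone who wants to run the same kind of check, here is a rough sketch of what looking for that pattern might involve. It is not the code from back then. The field names (`student_id`, `reading`, `math`), the run threshold of 10, and the sample rows are all made up for illustration, and it assumes the rows are still in the order they were submitted.

```python
from itertools import groupby


def find_duplicate_runs(records, score_fields, id_field="student_id", min_run=10):
    """Find runs of adjacent records with identical scores but distinct IDs.

    Field names and the min_run threshold are illustrative assumptions.
    """
    runs = []
    start = 0
    # groupby only groups *adjacent* records that share the same score tuple
    for scores, group in groupby(
        records, key=lambda r: tuple(r[f] for f in score_fields)
    ):
        group = list(group)
        distinct_ids = {r[id_field] for r in group}
        # Flag long runs where every row still has its own ID
        if len(group) >= min_run and len(distinct_ids) == len(group):
            runs.append({"start": start, "length": len(group), "scores": scores})
        start += len(group)
    return runs


# Synthetic rows (invented): varied scores, then 12 identical high-scoring rows
records = (
    [{"student_id": i, "reading": 50 + i % 7, "math": 48 + i % 5} for i in range(20)]
    + [{"student_id": 100 + i, "reading": 92, "math": 95} for i in range(12)]
    + [{"student_id": 200 + i, "reading": 55 + i % 6, "math": 60 + i % 4} for i in range(20)]
)

runs = find_duplicate_runs(records, ["reading", "math"])
print(runs)
```

Comparing each flagged run's scores against the overall average would then show whether the duplicates cluster on the high side, which is what I saw.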

I told the person I was working with about the data and suggested it was fraudulent. She became concerned and raised it with her supervisor. Within about 24 hours I no longer had access to the data sets, and the friend of a friend just said that she didn't need help anymore.

I suppose I could have raised a fuss and contacted a journalist, but all I had was columns of data without context. Plus, at the time, I'm ashamed to say, I was playing a lot of World of Warcraft and not inclined to do much else that required effort.




This perfectly matches my experience in similar cases. Just not in academia but in business.

Policy always represents somebody's interests. "Data driven" is an excellent cover when you control the data. You get to eat the cake and look good while doing it.



