
"Science is facing a "reproducibility crisis" where more than two-thirds of researchers have tried and failed to reproduce another scientist's experiments, research suggests. "

Ironically, the article presents this as a bad thing, but in an ideal world this figure would be 100%.

It would be like saying "two-thirds of coders have reviewed a colleague's code and found bugs". Since bugs are basically unavoidable, the fact that a third haven't found any points more towards them not looking hard enough.

edit: pretty much everyone seems to have taken this the opposite way from how I intended it, but re-reading it I can't figure out why. I'll try to rephrase:

Science cannot be perfect every time. It's just too complex. This is why you need thorough peer review, including reproduction. But if that peer review/reproduction is thorough, then it's going to find problems. When the system is working well, basically everyone will at some point have found a problem in something they were reproducing. This is good, because that problem can then be fixed and the work becomes reproducible, or it gets withdrawn. The current situation is that people don't even look for the problems, and no one can trust the results.

edited again to change "peer-reviewed" -> "peer-reviewed including reproduction"




It is a terrible thing, and it is absolutely nothing like finding bugs.

Reproducibility is a core requirement of good science, and if we need to compare it to software engineering, the reproducibility crisis is like the adage "many eyes make all bugs shallow" when the assumption that many eyes are even looking is often untrue. Most studies are never reproduced, but are held as true under the belief that if someone tried, they could.

EDIT: You claimed that in an ideal world, 100% of experiments/studies would not be reproducible. This denotes a profound misunderstanding of the scientific process, or of the whole basis of reproducibility. In an ideal world, 100% of studies would be vetted through reproduction, and 100% of them would be reproducible. This is essentially the fundamental assumption of the scientific process.


No, I claimed that all scientists would have had the experience of failing to reproduce something. Because if they do it a lot, as part of a regular process, then they will eventually find something that doesn't work, because the original scientist didn't document a step correctly, or misread the results, or just got lucky due to random chance.

Just like all developers will eventually find a bug in code they review. This is different from all the code they review having bugs.


While the wording may be vague, they aren't talking about the experiences of a subset of researchers -- they are saying that of the experiments they tried to replicate, two-thirds weren't reproducible. That is terrible, and has absolutely nothing to do with finding bugs.


Science publishing is based on a peer review system. All evident bugs (and many not-so-evident ones) should be caught before they appear in a journal. It's totally different from standard journalism.


Replication happens after publication. Why is everyone misunderstanding ZeroGravitas' point?


I agree that there are no test suites providing experiments to run, and no "make test" to repeat an experiment with little effort. This accounts for the "too complex" part.

However, many experiments should be reproducible. Not making results testable works against the goal of sharing knowledge. But I understand that's an extra effort compared to the current state of the art, and it must be rewarded and acknowledged. In another comment I proposed including reproducibility in the h-index.
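
To make that concrete, here's a rough Python sketch of one way a replication-aware h-index could work. The "replicated" flag and the all-or-nothing weighting are just assumptions for illustration, not a worked-out metric:

    def h_index(citations):
        # Classic h-index: the largest h such that h papers have >= h citations.
        counts = sorted(citations, reverse=True)
        return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)

    def replication_aware_h_index(papers):
        # Same calculation, but a paper only counts once it has been
        # independently replicated. `papers` is a list of
        # (citation_count, replicated) pairs.
        return h_index([c for c, replicated in papers if replicated])

    papers = [(50, True), (30, False), (12, True), (8, True), (3, False)]
    print(h_index([c for c, _ in papers]))    # 4: ordinary h-index
    print(replication_aware_h_index(papers))  # 3: only replicated papers count

A softer variant could weight citations by replication status instead of dropping unreplicated papers entirely; the point is just that the metric is easy to compute once replication attempts are recorded somewhere.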


The whole point of an experiment is to isolate a single variable so you can test a falsifiable statement about it.

It's the exact opposite of building a system, which is what coders do.


I don't think that analogy is at all accurate and I think the conclusion that you reach from it is completely incorrect.


I think I must have expressed myself poorly, as my conclusion is the same as the article suggests, i.e. that the science/code shouldn't be considered "done" until it's been peer-reviewed, since it's easy to fool yourself and others if you're not actually reviewing and testing your code.

What did you take away from my comment?


Peer review does not imply reproducibility, and it's the latter that is the problem.

I can confirm, as a reviewer, that your methodology and analysis look sensible, but the flaws may lie deeper. The fact that you didn't publish the 19 other studies that failed and that this is the "lucky one", or that you simply cherry-picked the data, is not something I can see as a reviewer.

This is especially true if the experiment is nontrivial to re-do.


" Peer review does not imply reproducibility, and it's the latter that is the problem."

I think this is the key to it: I'm suggesting that reproducibility should be part of considering something peer-reviewed, but of course, as currently practised, that isn't true.

Of course, in a software metaphor that would probably cover both code review and QA, which is sometimes done by a different job role; that further muddies the water.


This would be like expecting a car brand to open its code to its competitors before releasing a new product. It would also typically lead to the peers rushing to publish the same discovery in disguise before the original.


After reading your explanations elsewhere, I take back my statements. Your statements are literally correct, although easy to take incorrectly, and I support them: in particular, you seem to be arguing that 100% of researchers should attempt to replicate studies, and do so enough that all of them will eventually fail to replicate at least some studies. I think most of us took this to mean that you thought every study should fail to replicate (by analogy with every piece of software having at least some bugs), but I now see your intent, and that your original wording backed it up.


If there are errors in a study's methods that make it unreplicable, then it shouldn't have passed peer review or been published.


Then you risk losing all of Einstein's work unless you happen to have two Einsteins at the same time. Geniuses are scarce. And you could not publish anything about comets, for example, because it would be unreplicable for the next 20 years. It's not as simple as that.


How do you know it's unreplicable until someone tries to replicate it?


I work with principal investigators (PhDs/MDs) at UPenn, automating some of their data analysis pipelines.

They all have secret checklists for BS detection in the papers they read. Certain labs set off red flags for them, as do certain techniques that are too fuzzy or too easy to mess up.

Everyone seems to have their own heuristics, and no one seems to take any article at face value anymore.

I hear PIs say stuff is unreplicable all the time.



