Hacker News

Agreed. DeepMind's AlphaFold (graph neural nets that solved protein folding according to the community's own definition) is amazing...

...But as a former academic, it's also a huge warning bell about the current effective capability gap between academia and the big ~5 tech co's. The field is lucky Google liked the problem, but there isn't a satisfying reason academics didn't do it first and aren't now ahead. I keep coming back to problems at the structural layers: low $ vs tech (brain drain), localized ownership incentives (small efforts), the broken granting treadmill, etc.

The US gov spends so much here, and could do more, but the core seems broken. The crypto discussion in the article is fun, but the crypto aspect felt like a bit of a head fake: it showed mostly that many scientists want a rethink, and while a DAO isn't necessarily a good idea long-term (and delivers little money today), it provides air cover for such a rethink.




Speaking as someone who's done the whole tenure thing etc., the gap between what has happened in the nonacademic and academic sectors in the last decade or so has been mind-boggling to me at times, at least in my field. It's not even just the big 5 tech corporations. Trends in tech that are completely and overwhelmingly obvious, even normal in daily life, get treated like some kind of bleeding-edge "newfangled" thing. As a result, those in the academic field end up looking like luddites, even though it should be the other way around. Years after you suggest something normal, and after it is well established, someone gets a grant for work that should have been done years earlier, and it's praised as cutting edge.

I have so much to say about this. Rigor is absolutely essential but something is seriously broken about incentives in publicly funded research.


Yes!

To your point, I've been watching the slow uptake of basic ideas in science departments. In this case, neural networks are still largely looked at either with suspicion (wrong principles) or as magic (too difficult) by top non-CS departments at top institutions. There are exceptions, but they're the ones that prove the rule.

The big ~5 (Apple, Microsoft, Google, Facebook, maybe Amazon), and then an even bigger universe of Chinese companies + consulting companies, are interesting to me because they solve for scaling these ideas. They've been putting money into projects that academic teams largely aren't, collecting global-scale data that these teams aren't, and running global-scale real-world experiments that academics aren't.

Ex: For something like healthcare, there isn't a good reason for Flatiron Health to be a leader while, to almost a rounding error, most real-world clinical data researchers are data-poor & AI-poor. While scientists are squabbling over open access for papers, imagine if the NIH+HHS+CDC+... required all publicly funded genomic + health data to be unified into a national database and the DOE provided AI compute infrastructure for working with it. Instead, we have an area PI at every research university getting grants to make believe that their regional hospital network's tiny genome/EHR database will be the one that becomes that national resource.


Unifying clinical and biological data is an incredibly challenging task for a variety of reasons. The private sector is much better suited to tackle such problems because of the engineering and coordination required.


I'd have agreed with you 5-10 years ago.

- I'm interacting with and hearing from a bunch of regional tech companies, hospital networks, and worse, consultant shops, doing what you're describing. For the most part, they're not that special. There are variants with unique twists (edge compute/AI/crypto/reselling/graph/...), but ultimately not that many, and they generally prioritize the same obvious data sources (Epic, ...). Genome data isn't as standardized/centralized as top EHRs afaict, but even there, we're seeing common data formats crop up & get popular. Likewise, every hospital network IT group is independently having to reinvent the wheel on things like access for their researchers.

- I'm not advocating they do all the things. Industry has value. (Ex: imagine Trifacta for everyone!) But this is a commons issue, and when industry owns it, that's a problem (ex: HHS groups ensnared by Palantir, or today's balkanized approach to real-world data). But for the meat and potatoes of EHRs/labs/genomes, as part of the continued digital transformation these systems are undergoing, targeting baseline standards & timely data submission isn't that crazy. Likewise, the VA and other ~federated national groups already have (bad) centralized database + compute facilities. I hear about their problems every ~night ;-) But as a baseline, a lot of what is a scramble today would become easy.

Getting back to the main point: the top 5 tech co's are quite used to working at nation/global scale on data problems, including PII & AI, so there's technical precedent. Smaller nations actually already centralize this stuff, so it's not without precedent in terms of government either. The US gov is, in effect, spending much more money to do it much smaller & weaker.



