This assumes the only way to use LLMs effectively is a monolithic model that does everything from translation (any language to any language) to creative writing to coding to whatever else. And GPT-4 is supposedly itself a mixture of experts (maybe eight of them), not a true monolith.
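For what it's worth, the routing idea is simple enough to sketch. Here's a toy top-2 mixture-of-experts layer in PyTorch; the sizes and structure are illustrative guesses, nothing confirmed about how GPT-4 actually does it:

```python
# Toy top-2 mixture-of-experts layer: a router scores 8 expert FFNs per token
# and only the best two run. All sizes here are made up for illustration.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                                   # x: (n_tokens, d_model)
        scores = self.router(x).softmax(dim=-1)             # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)   # keep the 2 best experts per token
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for i, expert in enumerate(self.experts):
                mask = chosen[:, slot] == i                  # tokens routed to expert i in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

print(ToyMoE()(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```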
Finetuned models are considerably more efficient, at the cost of giving up general-purpose coverage in exchange for doing specific things well. And the disk space for a few dozen local finetunes (or even hundreds for a SaaS service) is peanuts compared to acquiring 80 GB of VRAM on a single device to run a monolithic model.
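To put rough numbers on the disk-vs-VRAM point, here's a back-of-the-envelope sketch. It assumes fully finetuned 7B specialists stored in fp16; every figure is a ballpark assumption, not a measurement of any particular model:

```python
# Ballpark disk cost of "a few dozen local finetunes" vs. the 80 GB of VRAM a
# monolithic model needs on a single card. fp16 weights assumed throughout.
params_per_specialist = 7e9           # assume 7B-parameter specialist finetunes
bytes_per_param_fp16 = 2

gb_per_finetune = params_per_specialist * bytes_per_param_fp16 / 1e9  # ~14 GB each
n_finetunes = 36                      # "a few dozen"
total_disk_gb = n_finetunes * gb_per_finetune                         # ~500 GB of cheap SSD

print(f"{n_finetunes} specialist finetunes: ~{total_disk_gb:.0f} GB of disk")
print("vs. an 80 GB-VRAM accelerator just to hold one monolithic model")
```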
Sutskever says there's a "phase transition" at around 9 billion parameters, after which LLMs begin to become really useful. I don't know much here, but wouldn't the monomodels become overfit if there isn't enough data to support 9+ billion parameters?