
AMD's attempted responses go all the way back to 2007, when CUDA first debuted, with "Close to Metal" (https://en.wikipedia.org/wiki/Close_to_Metal). They've had nearly 20 years to fix the situation and have failed to do so. Maybe some third-party player like Lamini AI will do what they couldn't and get acquired for it.



The thing about modern AI is that the operations involved (e.g. dense matmuls) are a lot simpler and more GPU-friendly than what you'd find in typical HPC applications. This means you can get pretty close to peak hardware performance using high-level languages like Python or OpenAI's Triton. I think it's unlikely that the push to improve ROCm's standard libraries will come from an AI-focused startup.


Even with AMD working hard on specific benchmarks for marketing purposes, they could not get close to peak hardware performance on their brand-new chips.

https://www.semianalysis.com/p/amd-mi300-performance-faster-...



