For throughput data, you have to actually run prompts to gather it, which racks up costs fast, and performance can vary with input prompt length. The two sources I use are OpenRouter's provider breakdown [1] and Unify's runtime benchmarks [2].

[1]: https://openrouter.ai/models/meta-llama/llama-3.1-70b-instru...

[2]: https://unify.ai/benchmarks/llama-3.1-70b-chat
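If you do want to collect your own numbers, here is a rough sketch of the kind of measurement involved: time each call and divide output tokens by elapsed seconds, repeated over prompts of different lengths. Note that `generate` and `fake_generate` are hypothetical stand-ins I made up for illustration, not any provider's real API; you would swap in your own client call that returns the output token count.

```python
import time
from typing import Callable, Iterable


def measure_throughput(
    generate: Callable[[str], int],
    prompts: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> list[dict]:
    """Time each call and report output tokens per second.

    `generate` is a hypothetical stand-in for a real provider call: it
    takes a prompt and returns the number of output tokens produced.
    """
    results = []
    for prompt in prompts:
        start = clock()
        output_tokens = generate(prompt)
        elapsed = clock() - start
        results.append({
            "prompt_chars": len(prompt),
            "output_tokens": output_tokens,
            "seconds": elapsed,
            # Guard against a zero-length interval on coarse clocks.
            "tokens_per_second": (
                output_tokens / elapsed if elapsed > 0 else float("nan")
            ),
        })
    return results


def fake_generate(prompt: str) -> int:
    """Placeholder that simulates latency growing with prompt length."""
    time.sleep(len(prompt) / 100_000)
    return 50


if __name__ == "__main__":
    # Vary prompt length, since throughput can depend on it.
    prompts = ["x" * n for n in (100, 1000, 4000)]
    for row in measure_throughput(fake_generate, prompts):
        print(row)
```

This times the whole request, so it folds time-to-first-token into the rate. With a real provider every row costs money, which is why I lean on the published numbers instead.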



