
We built the cheapest Llama 3.1 70B inference API, specialized for tasks that are not time sensitive (e.g., batch processing jobs).

Without any quantization, our current price is 30 cents per million input tokens and 50 cents per million output tokens. [1]

1: https://withexxa.com/#pricing
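
For a rough sense of what that pricing means for a batch job, here is a minimal cost-estimation sketch in Python. The rates come from the listed pricing above; the dollar currency assumption and the example token counts are illustrative, not from the original post.

    # Rough cost estimate at the listed rates:
    # 30 cents per million input tokens, 50 cents per million output tokens.
    INPUT_COST_PER_M = 0.30   # assumed dollars per million input tokens
    OUTPUT_COST_PER_M = 0.50  # assumed dollars per million output tokens

    def batch_cost(input_tokens: int, output_tokens: int) -> float:
        """Return the estimated cost (in dollars) for one batch job."""
        return (input_tokens / 1_000_000) * INPUT_COST_PER_M \
             + (output_tokens / 1_000_000) * OUTPUT_COST_PER_M

    # Example (hypothetical volumes): 10M input + 2M output tokens
    # -> $3.00 + $1.00 = $4.00
    print(f"${batch_cost(10_000_000, 2_000_000):.2f}")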




Amazing! Please don't hesitate to open an issue or a PR. We will update our dataset and add it.



