
We built the cheapest Llama 3.1 70B inference API, specialized for tasks that are not time sensitive (e.g., batch processing jobs).

Without any quantization, our current price is 30 cents per million input tokens and 50 cents per million output tokens. [1]

1: https://withexxa.com/#pricing
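
For a rough sense of what that pricing means for a batch job, here is a minimal cost-estimation sketch in Python. The rates come from the listed pricing above; the dollar currency assumption and the example token counts are illustrative, not from the original post.

    # Rough cost estimate at the listed rates:
    # 30 cents per million input tokens, 50 cents per million output tokens.
    INPUT_COST_PER_M = 0.30   # assumed dollars per million input tokens
    OUTPUT_COST_PER_M = 0.50  # assumed dollars per million output tokens

    def batch_cost(input_tokens: int, output_tokens: int) -> float:
        """Return the estimated cost (in dollars) for one batch job."""
        return (input_tokens / 1_000_000) * INPUT_COST_PER_M \
             + (output_tokens / 1_000_000) * OUTPUT_COST_PER_M

    # Example (hypothetical volumes): 10M input + 2M output tokens
    # -> $3.00 + $1.00 = $4.00
    print(f"${batch_cost(10_000_000, 2_000_000):.2f}")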




Amazing! Please don't hesitate to open an issue or a PR. We will update our dataset and add it.



