Hacker News

The problem with these open-weight LLMs hosted by these providers is that we don't know what precision the model is served at, and that makes a huge difference in speed and cost (compute).

I think Together recently introduced different price tiers based on precision, but otherwise providers usually keep it in the dark.
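To see why precision matters so much for compute cost, here's a rough back-of-the-envelope sketch (my own numbers, not from the thread): the weight memory of a model scales linearly with bytes per parameter, so serving Llama 3.1 405B at FP8 halves the footprint versus FP16.

```python
# Rough weight-memory footprint of a 405B-parameter model at different
# precisions (weights only; ignores KV cache, activations, and runtime
# overhead, which add on top of this).
PARAMS = 405e9  # Llama 3.1 405B

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "fp8": 1.0,
    "int4": 0.5,
}

for precision, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes / 1e9
    print(f"{precision:>10}: ~{gb:,.0f} GB of weights")
```

At FP16 that's roughly 810 GB of weights, i.e. a multi-node deployment; FP8 halves it, which is exactly why a provider's choice of precision dominates their cost per token.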




OpenRouter gets close to that: https://openrouter.ai/models/meta-llama/llama-3.1-405b-instr...


Amazing, I just noticed the mention of precision there.


I noticed that in my benchmarks [1]: Llama 3 was significantly better than Llama 3.1, which was puzzling.

Then I realized that I had changed the provider, and the new one served Llama 3.1 quantized to FP8.

Then I tried Hyperbolic [2], because they offer the model in different quantizations. As a result, Llama 3.1 was better than Llama 3, or at least on par.

[1] https://github.com/s-macke/AdventureAI

[2] https://app.hyperbolic.xyz/models
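The quality gap described above comes from quantization round-off error. The snippet below is a generic illustration of where that error comes from, not the FP8 scheme any particular provider uses: it round-trips a synthetic weight tensor through symmetric per-tensor 8-bit integer quantization and measures the worst-case reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)
# Fake weight tensor with a typical small standard deviation.
w = rng.normal(scale=0.02, size=100_000).astype(np.float32)

# Symmetric per-tensor 8-bit quantization:
# map [-max|w|, +max|w|] onto integer levels [-127, 127].
scale = np.abs(w).max() / 127.0
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize back to float and measure the round-trip error.
w_hat = q.astype(np.float32) * scale
max_err = np.abs(w - w_hat).max()
print(f"scale={scale:.6f}, worst-case round-trip error={max_err:.6f}")
```

The worst-case error is bounded by scale/2, so a single outlier weight inflates the scale and degrades every other weight in the tensor. That per-weight noise is invisible in a demo prompt but can show up in benchmarks, which is why the same model name can score differently across providers.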


Exactly, it's always best to rely on your own hardware. We need to collect more data from self-hosted models on different GPUs/clouds to compare.


We have 15+ clouds you can try on our platform if you're looking for a place to compare inference engines.

Email me at ed at shadeform dot ai if we can help.




