I noticed that in my benchmarks [1], Llama 3 was significantly better than Llama 3.1, which was puzzling.

Then I realized that I had changed providers, and the new one serves Llama 3.1 quantized to fp8.

So I tried Hyperbolic [2], because they offer the model in different quantizations. As a result, Llama 3.1 was better than Llama 3, or at least on par.

[1] https://github.com/s-macke/AdventureAI

[2] https://app.hyperbolic.xyz/models