Then I realized that I had changed providers, and the new one serves Llama 3.1 quantized to fp8.
So I tried Hyperbolic [2], because they offer the model in different quantizations. The result: Llama 3.1 was better than Llama 3, or at least on par.
[1] https://github.com/s-macke/AdventureAI
[2] https://app.hyperbolic.xyz/models