I love what llamafile is doing, but I'm primarily interested in a frontend for ollama, as I prefer their method of model/weights distribution. Unless I'm wrong, llamafile serves as both the frontend and backend.
If I understand the distinction correctly, I run llamafile as a backend: I start it with the filename of a model on the command line (via the -m flag, I believe), and it brings up a chat prompt in the terminal but also opens a port serving an OpenAI-compatible API that a frontend can connect to (in my case usually gptel in Emacs).
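For anyone wanting to wire that last part up, here's roughly what my setup looks like. After launching the backend with something like ./llamafile -m model.gguf (the model filename is just a placeholder), gptel can be pointed at it with its gptel-make-openai helper, which works for any OpenAI-compatible server. This is a sketch assuming llamafile's default port of 8080; the backend name and model symbol are arbitrary labels:

    ;; Register a local llamafile server as a gptel backend.
    ;; gptel-make-openai targets any OpenAI-compatible API.
    (gptel-make-openai "llamafile"     ; display name, pick anything
      :stream t                        ; llamafile supports streaming responses
      :protocol "http"                 ; local server, no TLS
      :host "localhost:8080"           ; llamafile's default port
      :models '(test))                 ; placeholder; the server uses whatever model it loaded

After that, the "llamafile" backend shows up in gptel's menu alongside any others you've configured.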