This is awesome! I've done a lot of little projects exploring the use of graphs with LLMs, and it's great to see that this approach really pays off. Stupid me for trying to prematurely optimize when the solution is just prompt engineering and burning a bunch of tokens on multiple passes. Going to give this a try and see if my jaw drops. If it's as good as it looks, I'll have to put in the work to get it out of Python-land.