I can't share anything much more specific than what I already have, unfortunately.
All I can really add - Binary classification is like a super power once you understand the statistics around it. If you can find clever ways to combine multiple binary classifiers, you can quickly narrow down relevant context. You can also use the statistics to do things like determine if a query is too vague to be serviced by an LLM backend. You can also answer questions like "Why didn't we consider a table or join when writing the user's query?".
All I can really add - Binary classification is like a super power once you understand the statistics around it. If you can find clever ways to combine multiple binary classifiers, you can quickly narrow down relevant context. You can also use the statistics to do things like determine if a query is too vague to be serviced by an LLM backend. You can also answer questions like "Why didn't we consider a table or join when writing the user's query?".