You can usually even ask the same LLM:

- do a task

- criticize your job on that task

- redo that task based on criticism

I find giving the LLM a process greatly improves the results.
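For anyone who wants to try it, here is a minimal sketch of those three steps in Python. The `llm` argument is a hypothetical stand-in for whatever function sends a prompt to your model and returns its text; the prompt wording and the `refine` name are my own assumptions, not any particular vendor's API.

```python
from typing import Callable


def refine(task: str, llm: Callable[[str], str], rounds: int = 1) -> str:
    """Draft a task, then critique and redo it `rounds` times with the same model.

    `llm` is a hypothetical callable: prompt text in, model reply out.
    """
    # Step 1: do the task.
    draft = llm(f"Do this task:\n{task}")
    for _ in range(rounds):
        # Step 2: ask the same model to criticize its own answer.
        critique = llm(
            f"Task:\n{task}\n\nAnswer:\n{draft}\n\nCriticize this answer."
        )
        # Step 3: redo the task with the criticism in the prompt.
        draft = llm(
            f"Task:\n{task}\n\nAnswer:\n{draft}\n\nCriticism:\n{critique}\n\n"
            "Redo the task, addressing the criticism."
        )
    return draft
```

Passing the model in as a plain function keeps the loop easy to try with a stub before spending real API calls on it.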




What’s fun is that you can skip step 1. The LLM will happily critique its own nonexistent output.


So?

I too can write made up criticism if that’s what my boss wants in the workplace — but that doesn’t suddenly invalidate my ability to criticize my own work to improve it.


That's a smart idea I didn't think of.

I went back and forth with Copilot after it gave me a half-working solution that seemed overly complicated, but since I was new to the tech used, I couldn't say exactly what was wrong. After a couple of hours, I googled the background, trusted my instinct, and was able to simplify the code.

In that situation, I iteratively improved the solution by telling Copilot that things seemed too complicated and that this or that wasn't working. That led the LLM to actually come back with better ideas. I kept asking myself why something like what you propose isn't baked into the system.


The papers I've read have shown LLM critics to be quite bad at their work. If you give an LLM a few known good and bad results, I think you'll see the LLM is just as likely to make good results bad as it is to make bad results good.
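If you want to check this on your own data, a rough sketch of that experiment is below. Everything here is assumed for illustration: `revise` stands for your critique-and-redo step, `is_correct` for whatever ground-truth check you have, and `flip_counts` is just a name I picked.

```python
from typing import Callable, Dict, Iterable, Tuple


def flip_counts(
    cases: Iterable[Tuple[str, str]],
    revise: Callable[[str, str], str],
    is_correct: Callable[[str, str], bool],
) -> Dict[str, int]:
    """Count how often a revision step flips answers between good and bad.

    cases: (question, answer) pairs with known good and bad answers mixed in.
    revise: hypothetical critique-and-redo step, returns a new answer.
    is_correct: hypothetical ground-truth check for (question, answer).
    """
    counts = {"good_to_bad": 0, "bad_to_good": 0, "unchanged": 0}
    for question, answer in cases:
        before = is_correct(question, answer)
        after = is_correct(question, revise(question, answer))
        if before and not after:
            counts["good_to_bad"] += 1
        elif after and not before:
            counts["bad_to_good"] += 1
        else:
            counts["unchanged"] += 1
    return counts
```

If `good_to_bad` comes out roughly equal to `bad_to_good`, the critic is not helping on that set.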


How do you know the second result is correct? Or the third? Or the fourth?


I approach it the same way as the things I build myself - testing and measuring.

Although if I’m truly honest with myself, even after many years of developing, my true cycle of writing code is: overconfidence, then shock that it didn’t work 100% the first time, then wondering if there is a bug in the compiler, and then reality setting in that of course the compiler is fine and I just made my 15th off-by-one error of the day :)



