
>suppose you find that the model has a bias in terms of labeling African Americans as criminals; or women as lousy computer programmers. If all you have is the model weights of the trained model, how easily can you fix the model?

That's textbook fine-tuning and is basically trivial. Adding another layer and training that is many orders of magnitude more efficient than retraining the whole model and works ~exactly as well.
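A rough sketch of that idea in PyTorch, assuming a generic pretrained network and a curated, rebalanced dataset (both are stand-ins here): freeze every original weight, bolt on one new layer, and train only that layer.

    import torch
    import torch.nn as nn

    # Stand-in for the pretrained model whose weights we don't want to retrain.
    base_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
    for p in base_model.parameters():
        p.requires_grad = False               # freeze all original weights

    head = nn.Linear(64, 2)                   # the single new layer to be trained

    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Stand-in for a curated dataset targeting the bias (e.g. rebalanced labels).
    inputs = torch.randn(32, 128)
    labels = torch.randint(0, 2, (32,))

    for _ in range(100):
        with torch.no_grad():
            features = base_model(inputs)     # frozen features, no gradients stored
        logits = head(features)
        loss = loss_fn(logits, labels)
        optimizer.zero_grad()
        loss.backward()                       # gradients flow only into the head
        optimizer.step()

Because only the head's parameters receive gradients, each step is cheap and the original model is left untouched.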

Models are data, not instructions. Analogies to software are actively harmful. We do not fix bugs in models any more than we fix bugs in a JPEG.




Instructions are exactly what weights are. We just have no idea what those instructions are.



