>suppose you find that the model has a bias in terms of labeling African Americans as criminals; or women as lousy computer programmers. If all you have is the model weights of the trained model, how easily can you fix the model?
That's textbook fine-tuning and is basically trivial. Adding another layer and training that is many orders of magnitude more efficient than retraining the whole model and works ~exactly as well.
Models are data, not instructions. Analogies to software are actively harmful. We do not fix bugs in models any more than we fix bugs in a JPEG.
That's textbook fine-tuning and is basically trivial. Adding another layer and training that is many orders of magnitude more efficient than retraining the whole model and works ~exactly as well.
Models are data, not instructions. Analogies to software are actively harmful. We do not fix bugs in models any more than we fix bugs in a JPEG.