I don’t think you’d need to download on the fly. You can imagine models being installed like extensions, with Chrome shipping Gemini installed by default. The API could then either fall back to the default (Gemini) or throw an error when no model is available. I’d contend that this is a better API design because the user can remove all models to save space on devices where AI is not needed (e.g., a kiosk).
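
To make the idea concrete, here is a rough sketch of what "fall back to the default or throw" could look like. Everything in it is hypothetical: `ModelRegistry`, `NoModelAvailableError`, `resolve`, and the `fallbackToDefault` option are names invented for illustration, not part of Chrome or any proposed API.

```typescript
// Hypothetical sketch only: none of these names exist in Chrome or any spec.
type ModelId = string;

class NoModelAvailableError extends Error {
  constructor(requested?: ModelId) {
    super(
      requested
        ? `Model "${requested}" is not installed and no fallback was used`
        : "No models are installed"
    );
    this.name = "NoModelAvailableError";
  }
}

class ModelRegistry {
  private installed = new Set<ModelId>();
  private defaultModel: ModelId | null;

  // The first preinstalled model is treated as the default (e.g. "gemini").
  constructor(preinstalled: ModelId[] = []) {
    for (const id of preinstalled) this.installed.add(id);
    this.defaultModel = preinstalled[0] ?? null;
  }

  install(id: ModelId): void {
    this.installed.add(id);
    if (this.defaultModel === null) this.defaultModel = id;
  }

  // Removing every model is allowed, e.g. to save space on a kiosk.
  uninstall(id: ModelId): void {
    this.installed.delete(id);
    if (this.defaultModel === id) {
      this.defaultModel = Array.from(this.installed)[0] ?? null;
    }
  }

  // Returns the requested model if installed; otherwise falls back to the
  // default when allowed; otherwise throws.
  resolve(
    requested?: ModelId,
    opts: { fallbackToDefault: boolean } = { fallbackToDefault: true }
  ): ModelId {
    if (requested !== undefined && this.installed.has(requested)) {
      return requested;
    }
    const canFallBack = requested === undefined || opts.fallbackToDefault;
    if (canFallBack && this.defaultModel !== null) {
      return this.defaultModel;
    }
    throw new NoModelAvailableError(requested);
  }
}

const registry = new ModelRegistry(["gemini"]);
const chosen = registry.resolve("some-other-model"); // not installed, so the default is used
console.log(chosen);
```

With this shape, a page that insists on a specific model passes `{ fallbackToDefault: false }` and handles the error, while a device with every model removed simply gets the error on any call.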
That doesn't really seem possible (mobile data connection) or convenient (Chrome binary size, disk space) for the user.