I think it's possible AI models will generate dynamic UI for each client and stream it to them (maybe eventually client devices will generate their UI on the fly), similar to Google Stadia. Maybe some offshoot of video streaming that lets the remote end control it. Maybe Wasm-based: just stream Wasm bytecode around? The guy behind VLC is building a library for ultra-low latency: https://www.kyber.video/techology.
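A minimal sketch of what I mean by "just stream Wasm bytecode around" (names and setup are my own assumptions, not from any of the linked projects): the server ships raw Wasm bytes to the client, which instantiates and runs them on arrival. Here the "stream" is faked with a local byte buffer; in a real system the bytes would arrive over WebSocket/WebTransport.

```javascript
// A hand-encoded Wasm module exporting `add(a, b) -> a + b`.
// In the streaming-UI idea, this buffer would be generated server-side
// and pushed to the client over the wire.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0; local.get 1; i32.add; end
]);

// Client side: instantiate whatever bytecode arrived and call into it.
// (`runStreamedModule` is a hypothetical name for illustration.)
async function runStreamedModule(bytes) {
  const { instance } = await WebAssembly.instantiate(bytes);
  return instance.exports.add(2, 3);
}

runStreamedModule(wasmBytes).then((result) => {
  console.log(result); // → 5
});
```

In a browser you'd use `WebAssembly.instantiateStreaming(fetch(url))` instead, which compiles while the bytes are still downloading, which is the property that makes the "stream code, not pixels" angle interesting latency-wise.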
I was playing around with the idea in this: https://github.com/StreamUI/StreamUI. The thinking is to take the ideas of Elixir LiveView to the extreme.