I love Steve Jobs' "bicycle for the mind" metaphor, and what you describe is the best possible example of this concept. A computer that does that would enable us to do so much more.
This is the sort of AI I want; a true personal assistant, not a bullshit generator.
It appears that we are tantalizingly close to have the perfect voice assistant. But for some inexplicable reason, it does not exist yet. Siri was introduced over a decade ago, and it seems that its development has not progressed as anticipated. Meanwhile, language models have made significant advancements. I am uncertain as to what is preventing Apple, a company with boundless resources, from enhancing Siri. Perhaps it is the absence of competition and the duopoly maintained by Apple and Google, both of whom seem reluctant to engage in a competitive battle within this domain.
It is probably a people problem. The people who really understood Siri have probably left, the managers left running it are scored primarily on not making any mistakes and staying off the headlines. Any engineers who understand what it would take to upgrade it aren't given the resources and spend their days on maintenance tasks that nobody really sees.
It's more likely a perverse incentive problem. Voice activated "assistants" weren't viewed as assistance for end users. They were universally viewed as one of two things: A way of treating the consumer as a product, or a feature check-box.
That Siri went from useful to far less useful had more to do with the aim to push products at you rather than actually accomplishing the task you set for Siri. If Apple actually delivers an assistant that works locally, doesn't make me the product, and generally makes it easier to accomplish my tasks, then that's a product worth paying for.
When anyone asks "who benefits from 'AI'?" the answer is almost invariably "the people running the AI." Microsoft and OpenAI get more user data, and subscriptions. Google gets another vehicle for attention-injection. But if I run Vicuna or Alpaca (or some eventual equivalent) on my hardware, I can ensure I get what I need, and that there's much less hijacking of my intentions.
So Microsoft, if you're listening: I don't want Bing Chat search, I want Cortana Local.
When was Siri ever useful? I have yet to encounter a voice "assistant" that can do more than search Google and set timers reliably, and Siri itself can't even do those very well.
I use it around 50 - 100 times per day. Mostly playing music, sending messages, controlling lights in the home, weather, timers, and turning on/off/opening apps on the TV
There are definite frustrations, mostly around playing music. Around 5% of the time, Siri will play the wrong album or artist because the artist name sounds like some other album name, or vice versa. I wish, here, that it used my Music playback history to figure out which one I meant
Doing what Siri is doing is not rocket science. It’s a simple intent based system where you give it patterns to understand intents and you trigger some API based on it.
Once you have the intents parsing, it should be just a matter of throwing man power at it and giving it better intents.
Yes, I have experience with building on top of such a system.
But the group managing Siri has probably been gutted in the past 10 years, and while the core is always simple the integrations and the QA testing to make sure it all keeps working is probably brittle and time consuming, and the core code is likely highly-patched spaghetti at this point.
It would be easy to write Siri again and make it a hundred times better, if you could start all over and only write the core features, and not have to validate against the whole product/feature matrix.
The problem with the rewrite of course would be that you won't be able to deliver that minimal viable product any more and you will have 10 years worth of product requirements and user expectations that you MUST hit for the 1.0 release (which must be a 1.0 and not an 0.1).
I've worked on lots of "simple" and "not rocket science" systems that were 10-years old, and it is always incredibly difficult due to the state of the code, the lack of resources, and the organizational inertia.
This is the sort of AI I want; a true personal assistant, not a bullshit generator.