I certainly recognize that possibility, but I also realize that systems can be extremely useful, and have a great 'understanding' (for some definition of 'understanding') of linguistic and visual data, without any need for 'sentience', 'consciousness', or any other completely ill-defined idea anyone wants to throw around.
There are a few cases where overlaps in sensory cortex beyond visual, audio, and linguistic processing (the main systems every decent AI already has as inputs, which are a very small fraction of the brain) would be very helpful, though clearly not strictly necessary, in improving the capability of a world model - for example, knowing that a metal container half full of water will slosh differently than a full or empty one. That requires proprioception and motor skills as well as visual input. So cases like this will be slightly less performant, but they're typically not relevant to the tasks we are interested in automating.