Same reason the same models don't fundamentally understand all languages: they're not trained to. Frankly, the design changes needed to make this work in training are minimal, but English is where the interest and money are, so expect most corporate LLMs to struggle outside it.
Give it time until we have truly globally multilingual models with superior context awareness.
A byte-tokenized model is naturally 100% multilingual across every language in its dataset. There just isn't much reason for teams to spend the extra training time to build that sort of model.
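To make that concrete, here's a minimal Python sketch of the general byte-level tokenization idea (illustrative only, not any particular model's tokenizer): every string in every language maps onto the same fixed 256-token vocabulary, so nothing is ever out-of-vocabulary.

```python
def byte_tokenize(text: str) -> list[int]:
    """Encode text as UTF-8 and use each byte value (0-255) as a token ID."""
    return list(text.encode("utf-8"))

def byte_detokenize(ids: list[int]) -> str:
    """Reverse the encoding; round-trips losslessly for valid UTF-8."""
    return bytes(ids).decode("utf-8")

# Works identically for any script, with no special-casing per language.
for sample in ["hello", "こんにちは", "مرحبا"]:
    ids = byte_tokenize(sample)
    assert byte_detokenize(ids) == sample
    print(sample, "->", ids[:8], f"({len(ids)} byte tokens)")
```

The catch is sequence length: one token per byte instead of one per subword means the same text takes several times more tokens, which is where that extra training cost comes from.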