What's arbitrary isn't that languages are different from each other; what's arbitrary is where you draw the line. Take two languages from opposite sides of the world and they're unquestionably different languages. But as you move gradually from one variety to the next, how many languages you split off and which dialects fall under which language is arbitrary.
> My understanding is that there's something called lexical similarity, and if it's over a certain percentage it's a dialect.
Even if you tried to use a method like this to draw lines, you would have to pick a "center" dialect to compare all the other prospective dialects/languages against. Which dialect you pick as your center determines which dialects end up under your umbrella language, and picking a different center can yield very different results. Which dialect you pick as your center is inherently a political question, the kind that would be settled by a sovereign state.
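To make the "center" problem concrete, here's a rough toy sketch. Everything in it is made up for illustration: five hypothetical varieties A through E strung along a line, a pretend similarity score that drops 15 points per step apart, and a 70% cutoff. None of it is real data or a method anyone actually uses.

```python
# Toy dialect chain: five made-up varieties spread along a line.
# The positions and similarity numbers are invented, not real data.
POSITIONS = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}


def similarity(x, y):
    """Hypothetical lexical similarity in percent: 15 points lost per step apart."""
    return 100 - 15 * abs(POSITIONS[x] - POSITIONS[y])


def dialects_of(center, threshold=70):
    """Varieties that count as 'dialects' of the chosen center under the cutoff."""
    return sorted(v for v in POSITIONS if similarity(center, v) >= threshold)


# Same data, same cutoff; only the choice of center changes.
for center in POSITIONS:
    print(center, dialects_of(center))
```

With A as the center, the "language" is A, B, and C. With E as the center, it's C, D, and E. With C as the center, all five are dialects of one language. Same numbers, same threshold, three different maps.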
And aside from that problem, lexical similarity is not used to define languages. All it measures is how much vocabulary two varieties share, and language variation is way more complicated than vocabulary alone: pronunciation and grammar differ too. No serious linguist would ever try to use a single metric like that to draw lines between languages (and again, most serious linguists aren't actually interested in drawing general-purpose lines, because they understand that the lines are not real).