Fun fact (not fabricated BECAUSE IT'S APRIL 2ND OVER HERE):
Japanese alphabets (kana) don't have a concept of upper/lower case. There are two different types of kana (round ones and square ones), but they are roughly equal in terms of strength/stress, so they can't be used to express anger.
People on the 2-Channel forum (Japan's 4chan, basically) came up with a brilliant idea, which is to insert a space between each letter so that the text looks a bit wider and has some extra oomph.
Example:
- normal: 今日はいい天気だね。 (it's a fine day, isn't it.)
- angry: 今 日 は い い 天 気 だ ね 。 (IT'S A F*KING NICE DAY ISN'T IT)
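For anyone who wants to try the trick, here is a minimal sketch in Python. The function name `space_out` and the `sep` parameter are made up for illustration, and it assumes a plain ASCII space is what gets inserted between characters.

```python
def space_out(text: str, sep: str = " ") -> str:
    """Insert `sep` between every character of `text` (hypothetical helper)."""
    # A str is already a sequence of characters, so join() does the work.
    return sep.join(text)


print(space_out("今日はいい天気だね。"))
```

Note that this treats every code point as one "letter", so the closing 。 gets spaced out too, as in the angry example.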
When you land on a site with a delayed paywall, you can copy out the content just before the popup, and then paste it into <body contenteditable> to view at your own leisure.
Writing horizontally the dot would be above, so I'd assume there's support by composition.
It would still be a problem of IME and font support.
On input, there are already so many shortcuts and hacks (e.g. SHIFT/CAPS is already taken to force a switch from hiragana to katakana) that it's hard to imagine a natural combination that could be memorized.
On the font side, there's already a battle raging over getting proper Japanese fonts on smartphones instead of the Chinese ones, so additional support for marginal features is an uphill battle, to say the least.
That's an interesting fact, but I can't help but feel like your comment is a bit too unaware of its own culture. After all, there's nothing that makes capitals inherently shouty, it's just another convention.
Saying "they didn't have capitals so they used spaces" would sound odd to an alien, who would wonder why capitals were necessary in the first place.
But he's not talking to aliens though. We all know what putting stress or emphasis on words means. I'd be surprised if that was an exclusively western phenomenon.
It's fun to hear how it's done in scripts that don't support our (as in the average HN reader's) default way of doing it (which would be caps or italics/bold).
To assume that an English reader would know what "dots" are in your example seems unreasonable. As the sibling comment said, if you wrote that in Japanese it would make perfect sense. The sort of meta statement would be "That other language doesn't have X, which the one I'm using has. It instead uses Y, which has the same approximate meaning."
The equivalent would be "in English you cannot use special marks for emphasis (as they do not exist), so they use different variants of their alphabet".
The comment is premised on how in Japanese there are "two" sets of "letters" with slightly different uses, but unlike lowercase/uppercase, the difference does not translate into emphasis.
Uppercase letters are linguistically related to emphasis independently from net-speak (proper names, I, beginning of sentences), so to me it reads as "the thing that naturally works here is impossible there, so they have a different solution to the problem".
It is fine to take the perspective of your own cultural situation.
Many of the Asian languages do not have the concepts of upper or lower case.
They contain vowels that go beyond the basic 5 "aeiou" of the English language (eg. see [0]). These suffice to let the speaker know how exactly to say the word unlike in English where it has to be learned case by case based on whatever is popular or acceptable pronunciation.
A subset of these languages are tonal languages, which also have special characters that let the speaker set the pitch of the word correctly, which in turn changes the meaning of the word [1].
"These suffice to let the speaker know how exactly to say the word unlike in English where it has to be learned case by case based on whatever is popular or acceptable pronunciation."
This is more due to how widespread English is, and how vowels/pronunciation have shifted over time. For example, The Great Vowel Shift. [0]
Korean has exactly the same issues, albeit to a smaller degree. There are plenty of words that aren't pronounced like how they're spelled, due to grammatical rules. 종로 as an example. Or cases where words sound exactly the same and you just need to know the context/spelling- 쫗다, 쫒다, 쫃다, 쫏다, 쫓다.
Then there's regional slang/pronunciation/dialect. Busan dialect is fairly different from 표준어, or the "official" standard language. This phenomenon is not unique to English in any way. Any language scaled up will develop these issues over time.
At least you can try to pronounce "cough" or "종로", instead of not being able to pronounce "願" at all because you don't already know the pronunciation.
> This is more due to how widespread English is, and how vowels/pronunciation have shifted over time.
I'd go so far as to say that English is now spoken by so many people in so many different regions, all of whom can now be heard by each other on a reasonably frequent basis, that it has forced English speakers to become so adept at vowel-reconstruction that one could pronounce words with completely arbitrary vowels and still be understood.
"it has forced english speakers to become so adept at vowel-reconstruction that one could pronounce words with completely arbitrary vowels and still be understood."
This was still occurring before people were able to widely hear other regions' speakers, though.
However things like cough, plough, although, thorough, etc having different, -correct- pronunciations are due to English taking in words from other languages.
> These suffice to let the speaker know how exactly to say the word unlike in English where it has to be learned case by case based on whatever is popular or acceptable pronunciation.
You can have three things:
1. A spoken language that evolves over time.
2. A writing system that accurately describes pronunciation.
3. A writing system that indicates history and etymology.
But you only get to pick two. English went with 1 and 3, which is arguably the optimal choice.
Why do you think #3 is more important than #2? Why is preserving history and etymology more important than ease of learning for new writers? Put another way, why should kids have to struggle with spelling so we can have a writing system that preserves linguistic history?
1. Because it's more important to know what words mean than it is to know how they sound.
My daughter reads a ton and learns a lot of words from reading. Fairly often, she mispronounces them, and that's OK. What's more valuable is that she can often infer the correct meaning of the word both from the surrounding context and from the parts that the word is made of. If we normalize spelling to match pronunciation, much of the latter gets lost.
It's easier to see that "mean" and "meant" are related than "meen" and "ment". "History" and "story" versus "histery" and "story".
2. Because pronunciation changes over time. If we continuously change spelling to match, it means older printed works get harder to read. In the worst case, they can appear to be saying different words than they intended.
3. Because pronunciation isn't uniform across regions.
Should "lawyer" be spelled "loyer" or "lawyer"? Is "crayon" spelled "crayon", "crayawn", "cran", or "crown"? Is it "caramel" or "carmel"?
Probably they mean spoken English, rather than written English. Written languages are traditionally considered secondary to spoken ones in linguistics, perhaps because they tend to be acquired several years later in childhood and several millennia† later in history. English is normally considered to have about 13–15 vowels, if we exclude the rhotics, depending on dialect: TRAP BATH PALM LOT CLOTH THOUGHT KIT DRESS STRUT FOOT FACE GOAT FLEECE GOOSE PRICE CHOICE MOUTH COMMA LETTER HAPPY, in Wells's standard lexical sets.
But, you say, that's 20 lexical sets, not 13–15? Well, no dialect distinguishes all 20. My idiolect (a slight variant of General American) realizes TRAP and BATH as [æ], PALM and LOT as [a], CLOTH and THOUGHT as [ɔ], KIT as [ɪ], DRESS as [ɛ], STRUT as [ʌ], FOOT as [ʊ], FACE as [ei], GOAT as [ʌu], FLEECE and HAPPY as [i], GOOSE as [u], PRICE as [ai], CHOICE as [ɔi], MOUTH as [æu], and COMMA as [ə]. That's 15, or 12 if you leave out PRICE, CHOICE, and MOUTH, which are diphthongs made of vowels that also occur isolated. (GOAT is debatable, usually analyzed as [oʊ] or [ou].)
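The tally above can be checked mechanically. This is just a sketch that copies the realizations listed in the comment for that one idiolect; LETTER is left out because no realization is given for it (rhotics were excluded), and the variable names are made up for illustration.

```python
# Lexical set -> realization, as listed in the comment (one idiolect only).
realizations = {
    "TRAP": "æ", "BATH": "æ",
    "PALM": "a", "LOT": "a",
    "CLOTH": "ɔ", "THOUGHT": "ɔ",
    "KIT": "ɪ", "DRESS": "ɛ", "STRUT": "ʌ", "FOOT": "ʊ",
    "FACE": "ei", "GOAT": "ʌu",
    "FLEECE": "i", "HAPPY": "i",
    "GOOSE": "u",
    "PRICE": "ai", "CHOICE": "ɔi", "MOUTH": "æu",
    "COMMA": "ə",
}

# Merged sets collapse to a single entry once we look at distinct values.
distinct = set(realizations.values())

# The three diphthongs the comment sets aside (GOAT is kept, being debatable).
set_aside = {realizations[name] for name in ("PRICE", "CHOICE", "MOUTH")}

print(len(distinct))              # distinct vowels in this idiolect
print(len(distinct - set_aside))  # the same count without those diphthongs
```

The two printed counts come out as 15 and 12, matching the figures in the comment.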
Different dialects draw the boundaries in different places; for example, dialects with the "trap–bath split", such as RP, famously realize TRAP and BATH differently ([æ] and [a] in RP). Some dialects have fewer vowels; if we consider Indian English to be a single dialect, it may have more speakers than even GA, and most varieties of Indian English have fewer vowels than 12. I haven't found a good phonological analysis, but if you know any Indian English speakers and also know phonology, you know what I mean. https://en.wikipedia.org/wiki/Regional_differences_and_diale... goes into some detail.
______
† The historical gap might be much larger than this.
Sumerian cuneiform and Egyptian hieroglyphs date back about 5300 years, and they provide evidence that spoken language was considered to be universal among humans at the time—there is no suggestion of tribes that lacked language anywhere in the written record. Today there are still peoples without written language, and a few who only acquired written language within the last generation. So we have good evidence that it has taken at least 5300 years.
But Homo sapiens has been around for sixty times that long, over 300 millennia, and stone tools date back 2 million years. It strains credibility to imagine that the authors of the Lascaux cave paintings or the Denisovans who invented sewing were so unlike us as to lack speech; the origin of spoken language is usually dated to before 40kya. Unfortunately, no tape recorders have yet been found from that epoch, so the uncertainty of the antiquity of spoken language ranges over nearly a factor of 100. Maybe spoken language is a million years older than written language, or five million. Probably not ten million, though, or we'd be studying chimpanzee folklore.
I love how explicit some written languages seem to be. It sounds great to be able to reliably pronounce any word perfectly. I suppose it's a trade off of complexity though. Learning these more explicit languages seems really daunting. Maybe it's just bias?
English and French are the oddballs here. In virtually all other European languages you can reliably pronounce any written word of the language. You don't have to go to Asia for this.
I don't know about that. I'm Swedish and there are a lot of words in our language whose pronunciation is impossible to deduce from the spelling. I'd assume the same is true for the other Scandinavian languages as well, since they are very similar. Perhaps we are oddballs as well, but it seems unlikely.
You might find the IPA interesting. With maybe an hour of studying to learn the letters/symbols and mouth movements you can reliably pronounce any word in any language so long as you’ve got the IPA spelling.
I've actually looked into IPA at one point. It is extremely useful when learning the basics of a new language. It would be very tedious to try to look up every new word you come across though. Alas, the worst part is that you don't know which words are pronounced differently than you assumed until you hear them or someone raises their eyebrows.
I don't understand people who read ALL CAPS as shouty, and in fact, it was one of my first culture clashes on HN. I find the italicized emphasis mode to be harder to read and recognize.
Maybe it seems so odd to me because there are so many licenses, contracts, or government forms that use ALL CAPS as emphasis. I don't know.
It just doesn't translate to shouty in reading mental voice.
To me (and probably most others), license texts and such absolutely look like they are shouting. I do not understand whence the convention of having them in ALL CAPS, and can only assume it's in itself some sort of a cultural association between ALL CAPS and IMPORTANCE. I have only seen it in English legal texts, anyway – is it even used in other languages? It looks to me like ALL CAPS was originally used to emphasize key points, and then an inevitable race to the bottom happened until EVERYTHING WAS IMPORTANT which really means that nothing is important.
When it comes to typography, nearly every type of emphasis employed in Western text except italics (and sᴍᴀʟʟ ᴄᴀᴘs, which see too little use these days methinks) only exist due to technological limitations, particularly the extremely limited typographic options available to typewriters and, later, 7- or 8-bit text terminals. This includes ALL CAPS, s p a c i n g, and u͟n͟d͟e͟r͟l͟i͟n͟e͟d͟, never mind ASCII crutches like /pseudoitalics/, _pseudounderlined_, and ∗pseudoboldface∗.
I don't tend to read it as shouting unless the contents are clearly angry. I've always seemed to parse it as more of a monotone 80s/90s computer voice. I think growing up using DOS and BASIC made me just associate all-caps with computers.
Honestly I'm a little surprised that people who've likely seen their share of BAD COMMAND OR FILE NAME would still read that as shouting.
There is a similar thing in Korean Hangul (also unicase) where you put a full stop between each letter: "알겠습니다." ("I see.") vs. "알.겠.습.니.다." I believe it is an independent invention.
That kind [1] of language-script mismatch, mainly for humorous purposes, actually exists in Korean and is considered a kind of 한본어 (a portmanteau of 한국어, Korean language, and 일본어, Japanese language).
[1] In this case, Japanese そうですね "I see" written in Hangul.
Chinese phonology is a bit restrictive, but I have great fun writing short messages in other languages (Japanese, French, English) with Chinese characters. Of course the number of friends I can do that with is very limited, which in a way makes it even nicer.
Every time I read these letters or the mega-wide numbers in Japanese uploads on YouTube, I have a heart attack thinking my font cache is broken again.
Yes. Writing a word in kanji or katakana sometimes works as emphasis. In other words, writing a word in hiragana marks it as not emphasized.
Writing a word in katakana that is usually written in kanji also works as emphasis, with a slightly different meaning. It tends to be used to evoke a stereotype.
For example, 福島 (Fukushima) / 広島 (Hiroshima) is just the name of a prefecture, but it is sometimes written フクシマ / ヒロシマ to refer to the nuclear plant accident / the atomic bombing. (I really dislike this usage.)
Sometimes all-katakana is used in fiction to indicate foreigner or robotic voices (like the Starmen in Mother 2). Writing a scream as a string of ア's instead of あ's gives it a piercing quality, moreso when you add the dakuten marker (゛), even though it doesn't change the pronunciation in this case.
More of an indication that implications exist, e.g. ガンバる implies the word is supposed to be in kanji but deliberately isn't, and ヒロシマ or フクシマ implies a nuclear context.
They can be used to convey tones in text similarly to how italics, caps, symbols and other decorations work in general, I think that’s what they meant to say by emphasis.
No concept of case is also true of Chinese, Korean, Arabic, and if I'm not mistaken, most South Asian scripts as well.
There are an incredible amount of other ways to add emphasis in Chinese though, so it's not lacking anything, and I imagine the same is true of the other languages.
Case is a uniquely European script phenomenon, and one that came late in the development of most of the scripts (Cyrillic, because it was the last of the European scripts to be developed, has the shortest time between its unicameral origins and the development of upper- and lowercase).
I have a book published in the 1920s with a foreword by Stanley Morison, by an author who attempted to enhance the Hebrew alphabet by introducing upper and lower case letterforms to it as well as to bring the letter forms more in line with the styles of the Latin-Greek-Cyrillic alphabets. It's... odd.
> I have a book published in the 1920s with a foreword by Stanley Morison, by an author who attempted to enhance the Hebrew alphabet by introducing upper and lower case letterforms to it as well as to bring the letter forms more in line with the styles of the Latin-Greek-Cyrillic alphabets. It's... odd.
Before the eventual standardization of Hangul around the early 20th century there were numerous attempts to "linearize" Hangul's characteristic syllabic blocks ("풀어쓰기" [1]). Many of them were influenced by Western alphabets and had two cases, and none were successful. And yes, they are also odd.
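To get a rough feel for what "linearized" Hangul looks like, here is a small Python sketch that unpacks each precomposed syllable into its letters, using the arithmetic layout of the Unicode Hangul Syllables block. It is only an approximation: the historical proposals used redesigned letterforms (and cases), whereas this just emits ordinary compatibility jamo. The names `INITIALS`, `VOWELS`, `FINALS`, and `linearize` are made up for illustration.

```python
# Compatibility jamo, in the order used by the Unicode Hangul Syllables block.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
# Index 0 means "no final consonant".
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

SYLLABLE_BASE = 0xAC00  # first precomposed syllable, 가
SYLLABLE_COUNT = len(INITIALS) * len(VOWELS) * len(FINALS)


def linearize(text: str) -> str:
    """Replace each precomposed Hangul syllable with its letters in a row."""
    out = []
    for ch in text:
        offset = ord(ch) - SYLLABLE_BASE
        if 0 <= offset < SYLLABLE_COUNT:
            # Syllables are laid out as (initial, vowel, final) in row-major order.
            lead, rest = divmod(offset, len(VOWELS) * len(FINALS))
            vowel, final = divmod(rest, len(FINALS))
            out.append(INITIALS[lead] + VOWELS[vowel] + FINALS[final])
        else:
            out.append(ch)  # leave anything that is not a Hangul syllable alone
    return "".join(out)


print(linearize("한글"))
```

For "한글" this prints the six letters ㅎㅏㄴㄱㅡㄹ side by side instead of two blocks.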
When I was an undergrad, I wrote some algorithms for composing Hangul letters into the "ideographs" in Metafont. It was kind of fun to build. The whole east-Asian font project was too ambitious and never got finished though. I was trying to enable algorithmic composition not just of Hangul but also Kanji/Hanzi from radicals but the latter was not as amenable to algorithmic composition.
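Not the Metafont glyph work described above, but the code-point side of the same idea is easy to sketch: Unicode lays out precomposed Hangul syllables arithmetically, so a syllable can be computed from its conjoining jamo. This is a minimal Python sketch under that assumption; `compose_syllable` and the constant names are made up for illustration.

```python
import unicodedata

S_BASE = 0xAC00  # first precomposed syllable
L_BASE = 0x1100  # first leading consonant (conjoining jamo)
V_BASE = 0x1161  # first vowel (conjoining jamo)
T_BASE = 0x11A7  # one below the first trailing consonant, so index 0 = "none"
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_syllable(lead: str, vowel: str, tail: str = "") -> str:
    """Combine conjoining jamo into a single precomposed syllable."""
    l = ord(lead) - L_BASE
    v = ord(vowel) - V_BASE
    t = ord(tail) - T_BASE if tail else 0
    if not (0 <= l < L_COUNT and 0 <= v < V_COUNT and 0 <= t < T_COUNT):
        raise ValueError("not a modern conjoining jamo sequence")
    return chr(S_BASE + (l * V_COUNT + v) * T_COUNT + t)


# U+1112 (hieuh) + U+1161 (a) + U+11AB (nieun)
jamo = "\u1112\u1161\u11ab"
print(compose_syllable(*jamo))
# Unicode normalization performs the same composition.
print(unicodedata.normalize("NFC", jamo))
```

Both lines print 한. Drawing the block itself (sizing and positioning each letter inside the square) is the hard part that the font project had to solve, and this sketch does not touch it.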
And it's worth mentioning how relatively recently case was "invented" even for Latin and Greek/Greek-derived scripts (including Cyrillic, and probably Latin itself technically...)
The way we write modern text is more modern than most people realize, I think. The letter 'j' wasn't really used as a separate letter indicating a separate sound until sometime around the 16th century I think! Case is older, but not ancient. I've seen some Greek on 3rd/4th century middle eastern ruins and I struggle to read the "all upper case with no spaces" writing sometimes, but that's just how it was! No "lower case" until much later...
The characters we think of as lower case were starting to take familiar forms around the 3rd century, but they were just the handwritten form of the language. One set of letters with clear sharp lines that can be worked into stone, and another with curves that can be quickly handwritten.
Arabic script has a vaguely similar concept of a given letter having multiple forms (up to 4: isolated, initial, medial, and final). The form depends on what letters precede and/or succeed it (if any).
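One way to see these forms is through Unicode's legacy "presentation form" characters. In normal text each letter is a single code point and the renderer picks the shape from context; the presentation forms exist mainly for compatibility. A small Python sketch for the letter beh (U+0628), with the variable name `beh_forms` made up for illustration:

```python
import unicodedata

# U+FE8F..U+FE92 are the presentation forms of beh in Arabic Presentation Forms-B.
beh_forms = [chr(cp) for cp in range(0xFE8F, 0xFE93)]

for ch in beh_forms:
    # NFKC folds each positional form back to the one abstract letter.
    base = unicodedata.normalize("NFKC", ch)
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "-> U+%04X" % ord(base))
```

All of them normalize back to the same letter, which is the sense in which these are forms of one letter rather than separate cases.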
Neither do uppercase letterforms. Uppercase individual letters aren't emphasized, that's just grammar, much like Hebrew's final letters that are only used at the end of words (I'm guessing these indicated the ends of words to assist reading when spacing wasn't used--I've certainly found it useful as such).
I learned Thai in the last 10 years and they don't have letter cases. I naturally began to use spacings for emphasis in "Chat language". It's so natural to do this.
They do have small versions of some kana, though, which are sometimes used when typing (あ -> ぁ), for example if you want to indicate stretching vowel sounds out for emphasis.