
Yeah, it's a good gripe to bring up. I'd love to come up with some indication of "confidence" in the parsing. Or maybe when there are two possible parses, give the user the ability to see both somehow?


I think offering two possible parses would confuse learners. They wouldn't know which one is the right one to pick! So adding a second option just adds to the confusion...

I think showing a confidence level, based on the number of available readings for the kanji in its given context, would be a good solution.

生 would be a good example. When used by itself, the confidence that the reading is correct should be low (since there are many possible options!), but if it is used in 生まれる the confidence level is very high (because there is only one possible reading). When used in 生す it could be 50/50 (since it could be read な or む).
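To make that concrete, here is a rough sketch of the idea, assuming confidence is simply 1 divided by the number of candidate readings for the surface form. The `CANDIDATE_READINGS` table and the `reading_confidence` name are made up for illustration, and the list for 生 on its own is deliberately incomplete; a real tool would pull readings from a dictionary.

```python
# Hypothetical lookup table: surface form -> candidate readings.
# Illustrative only; the entry for 生 on its own is far from complete.
CANDIDATE_READINGS = {
    "生": ["なま", "せい", "しょう", "き"],
    "生まれる": ["うまれる"],
    "生す": ["なす", "むす"],
}


def reading_confidence(surface):
    """Return 1 / (number of candidate readings), or None if unknown.

    Assumes every candidate is equally likely, which is the naive
    "count the options" heuristic and nothing smarter.
    """
    readings = CANDIDATE_READINGS.get(surface)
    if not readings:
        return None
    return 1 / len(readings)


for word in ["生", "生まれる", "生す"]:
    print(word, reading_confidence(word))
```

With this toy table, 生まれる comes out at 1.0 and 生す at 0.5, while 生 alone scores lowest because it has the most candidates.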

Explaining this to learners is a little tricky.


Yeah, 生 is a good example. Fortunately, there are corpora out there with hand-parsed sentences, so we can at least train on those to pick the most common reading for 生 on its own (I just tried - it's なま). We could use this to get our confidence even better than 50-50 for 生す.
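Roughly what I have in mind, as a sketch rather than what the parser actually does: count how often each reading shows up for a surface form in the hand-parsed data, then report the most common reading along with its share of the total. The function names and `TOY_CORPUS` are hypothetical, and the counts are invented purely to show the shape of the calculation.

```python
from collections import Counter, defaultdict


def train(parsed_tokens):
    """Count readings per surface form from (surface, reading) pairs."""
    counts = defaultdict(Counter)
    for surface, reading in parsed_tokens:
        counts[surface][reading] += 1
    return counts


def best_reading(counts, surface):
    """Return (most common reading, its relative frequency), or None."""
    readings = counts.get(surface)
    if not readings:
        return None
    reading, n = readings.most_common(1)[0]
    return reading, n / sum(readings.values())


# Invented toy data, NOT real corpus counts.
TOY_CORPUS = [
    ("生", "なま"), ("生", "なま"), ("生", "なま"), ("生", "せい"),
    ("生す", "なす"), ("生す", "なす"), ("生す", "むす"),
]

COUNTS = train(TOY_CORPUS)
print(best_reading(COUNTS, "生"))
print(best_reading(COUNTS, "生す"))
```

The point is that once one reading of 生す is more frequent than the other in the training data, the confidence moves away from a flat 50-50.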

I can see how giving users more options could just bring more confusion. Such a tricky problem! I'll be thinking about this one.



