
Wow, the first one, 'は', seems really good. Out of 279 messages in my Spam folder, 216 matched that character, missing only 24 other Japanese spam messages. (That means 240/279, or 86%, of my spam is Japanese, god damn it.)

The second one 'す' matched only 10 Japanese spam messages.



You could also try the possessive particle の, which is a bit like " 's " in English. It should appear frequently as a standalone hiragana character, since it is a particle much like the first one they suggested (は, written "ha" but pronounced "wa" when it marks the topic of a sentence). The second one (す, "su") tends to come at the end of maybe half the verbs, which might be why it matches less often.

Another thing that might match is Japanese punctuation, such as the comma 、 and the period 。

https://www.tofugu.com/japanese-grammar/particle-no-noun-mod...

https://en.m.wikipedia.org/wiki/Japanese_punctuation
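
For anyone who wants to run the same check outside a mail client's search box, here is a minimal sketch in Python. The names `MARKERS`, `matched_markers`, and `contains_hiragana` are made up for illustration, and the marker list is just the characters suggested in this thread, not a tested spam rule.

```python
# Hypothetical helpers: check a message body for the characters
# suggested in this thread (common particles and Japanese punctuation).
MARKERS = ("は", "の", "す", "、", "。")


def matched_markers(text: str) -> list[str]:
    """Return the marker characters that occur in text, in MARKERS order."""
    return [marker for marker in MARKERS if marker in text]


def contains_hiragana(text: str) -> bool:
    """True if any character is in the Unicode Hiragana block (U+3040 to U+309F)."""
    return any("\u3040" <= ch <= "\u309f" for ch in text)


if __name__ == "__main__":
    sample = "これは私の本です。"
    print(matched_markers(sample))
    print(contains_hiragana(sample))
```

Checking the whole Hiragana block is a broader test than any single particle, so it should not depend on which particle a given message happens to use.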


Nice! The 'の' (237 matches) is even better than 'は' (216 matches). The 'の' matches every Japanese spam message in my Spam folder.

I was not able to use the comma 、 and the period 。; those matched nothing, I think because FastMail disables searches on common punctuation.

(In case people are wondering, I sometimes scan through my Spam folder to check for false positives, i.e. things which were incorrectly marked as spam. It's difficult to do that when it is flooded with Japanese spam.)



