All of these methods require a decent amount of text, so unless someone is writing hundreds of words per sample, they're unlikely to be distinguishable from the crowd. That, to me, rules out typical comments on threads: lots of little samples are just too noisy, with too much overlap among authors, to allow for unique information-based fingerprints.
However, if the data is sufficiently large, in both number and length of samples (say, lots of essay-type blog posts), I'd expect some classifier-based machine learning techniques could match authors. That is, take a sample of 100 bloggers, split each blogger's data in half, train the classifiers on one half, then test by matching up the other half. Under those conditions you could probably get 90-95% accuracy.
The question, I think, is how small you could push the training set in terms of the fewest words and the fewest posts.
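To make the split-train-match idea concrete, here's a minimal sketch of that protocol. It is only an illustration, not any particular published method: it uses a small hand-picked list of function words as style features (real stylometric systems use hundreds of features, character n-grams, and stronger classifiers) and attributes a held-out post to the author whose training centroid is closest by cosine similarity. All names and the word list are hypothetical.

```python
from collections import Counter
import math

# Hypothetical function-word list; real stylometry uses far richer features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it",
                  "is", "was", "i", "for", "on", "you", "with", "but"]

def features(text):
    """Relative frequency of each function word in a text."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Component-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def attribute(train, test_text):
    """train maps author -> list of training posts. Returns the author
    whose training centroid is most similar to the test text."""
    probe = features(test_text)
    centroids = {author: centroid([features(p) for p in posts])
                 for author, posts in train.items()}
    return max(centroids, key=lambda a: cosine(centroids[a], probe))
```

With enough text per author, even a toy feature set like this separates writers whose function-word habits differ; the interesting empirical question, as above, is how little training text you can get away with before the centroids blur together.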