Actually, "plain" regular expressions/FSAs *can* parse HTML up to any arbitrary ...

_delirium · on June 20, 2015

Fwiw, browsers do this in practice, so the subset of HTML usable on the web is already regular. For example, WebKit-based browsers impose a nesting depth limit of 512.
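
To make the "bounded depth is regular" point concrete, here is a rough sketch rather than anything a browser actually does. It builds an ordinary regular expression for a toy language with a single made-up tag, `<d>`, nested at most `max_depth` deep. The tag name and the helpers `bounded_nesting_pattern` and `matches` are assumptions for illustration only; real HTML tokenization is far messier.

```python
import re


def bounded_nesting_pattern(max_depth: int) -> str:
    """Regex source for text with <d>...</d> nested at most max_depth deep.

    Hypothetical toy grammar: one tag name, no attributes, no comments.
    """
    text = r"[^<>]*"  # depth 0: plain text, no tags at all
    pattern = text
    for _ in range(max_depth):
        # One more level: text interleaved with <d>(previous level)</d> blocks.
        pattern = text + r"(?:<d>" + pattern + r"</d>" + text + r")*"
    return pattern


def matches(doc: str, max_depth: int) -> bool:
    """True if doc is well nested and no deeper than max_depth."""
    return re.fullmatch(bounded_nesting_pattern(max_depth), doc) is not None


if __name__ == "__main__":
    print(matches("<d>a<d>b</d></d>", 2))        # nested two deep, limit 2
    print(matches("<d><d><d></d></d></d>", 2))   # nested three deep, limit 2
```

The previous level appears exactly once in each new level, so the pattern grows linearly with the depth limit. No counting or stack is needed, which is the whole point: once the depth is capped, a finite automaton is enough.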

gsnedders · on June 21, 2015

In case anyone is curious:

WebKit and Blink both use 512. Gecko uses 200. Trident uses a limit beyond what I've tested quickly (over 4096). Presto uses 500.