Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It doesn't work for me with regex101. "The preceding token is not quantifiable" on this part:

  | < (? \w+ )


See, this is kinda what I mean. Maybe you can detect tags with regex, but maybe you shouldn't, given the widespread but subtle differences in regex engines.

Perhaps the entire approach of "why are you trying to parse X?" Needs to be traced and re-evaluated.


> Maybe you can detect tags with regex, but maybe you shouldn't...

So what do you think would be a more appropriate choice for writing a tokenizer?


You want (?:, not (?

Without the colon, the parser appears to be interpreting (? as "one or more instances of (", but ( is no a full expression by itself and therefore cannot be modified with a quantifier.


I actually meant (?<tag> in order to create a named capture.


It was supposed to be (?<tag> \w+ ) in order to create a named capture. The <tag> was apparently lost in editing. Thanks for the heads-up.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: