
Is speed really that much of a concern with grep? I typically use :vimgrep inside of vim, not because it's faster (it's orders of magnitude slower due to being interpreted vimscript), but because I hate remembering the differences between pcre/vim/gnu/posix regex syntax.


I regularly search my whole Firefox clone for keywords. If this takes 2s, that's plenty fast; if it takes 20s, I'd have to come up with some other way of doing it.


Ctags?


Firefox is quite complicated; we have code written in C, C++, JS, Python, Make, m4, plus at least three custom IDL formats. grep handles these with ease.


I use grep in some pipelines to bulk-process data, because if you have a fast grep, using it to pre-filter input files to remove definitely-not-matching lines is one of the quickest ways to speed up some kinds of scripts without rewriting the whole thing. And in that case, sometimes processing gigabytes+ of data, it's nice if it's fast.

One common case: I have a Perl script processing a giant file, but it only processes certain lines that match a test. You can move that test to grep, to remove nonmatching lines before Perl even hits them, which will typically be much faster than making Perl loop through them.

Say your script.pl is doing something like:

    next unless /relevant/;

You can replace that with:

    grep "relevant" filename | perl ./script.pl
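
For anyone who wants to try the pre-filter idea, here is a minimal sketch on a tiny sample file. The file contents and the line-counting stand-in for script.pl are made up for illustration. The -F flag (fixed-string match) and LC_ALL=C are optional extras, not part of the original command; they may speed things up on some systems, so measure on your own data.

```shell
# Build a small sample file; in practice this would be the multi-gigabyte input.
sample=$(mktemp)
printf '%s\n' 'relevant: alpha' 'noise: beta' 'another relevant line' 'noise: gamma' > "$sample"

# Pre-filter with grep so the downstream consumer only sees candidate lines.
# -F treats the pattern as a fixed string rather than a regex.
# LC_ALL=C switches off locale-aware matching (an optional tweak, not required).
filtered=$(LC_ALL=C grep -F 'relevant' "$sample")

# Hypothetical stand-in for `perl ./script.pl`: count the lines that get through.
count=$(printf '%s\n' "$filtered" | wc -l | tr -d ' ')
echo "$count"

rm -f "$sample"
```

This only works when the grep pattern accepts every line the Perl test would accept. Leaving the original check in script.pl is harmless and guards against the two patterns drifting apart.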


At the scale you are talking about (10GB+ files), it's far more efficient to put primitive filtering in the application generating the lines in the first place. You pay two penalties for using grep: having another process touch the data, and having to generate superfluous lines in the first place.


This doesn't work if you're processing logs. You might need those other lines in other places.


In this case, alas, I'm processing third-party data I didn't generate, so one way or another I have to scan through it at least once.


:vimgrep is slower because it loads each file in memory with all the filetype-specific stuff going on each time before the actual searching.



