9504 shaares
222 private links
222 private links
Summary:
Use Boyer-Moore (and unroll its inner loop a few times).
Roll your own unbuffered input using raw system calls. Avoid copying
the input bytes before searching them. (Do, however, use buffered
output. The normal grep scenario is that the amount of output is
small compared to the amount of input, so the overhead of output
buffer copying is small, while savings due to avoiding many small
unbuffered writes can be large.)Don't look for newlines in the input until after you've found a match.
Try to set things up (page-aligned buffers, page-sized read chunks,
optionally use mmap) so the kernel can ALSO avoid copying the bytes.The key to making programs fast is to make them do practically nothing. ;-)