Good article on search theory & problems
I'm not much for reading up on theory, because it's boring and it's a lot of reading for very little application. But here's a rather interesting article on some website's problems with searching 7 gigs of data. Like many computing problems, this one goes away as we get more and faster RAM, CPU & bus, but they do discuss some interesting ideas, like a massive binary index to prevent the same piece of data being in more than one record (a per-word index), and even touch on some ins and outs of assembly vs. high-level coding of the parser.
http://www.kuro5hin.org/story/2004/5/1/154819/1324
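
For anyone wondering what a per-word index looks like in practice, here's a rough sketch of the general idea. This is my own toy version, not the article's code: the names `build_index` and `search` and the sample `docs` are made up for illustration, and it uses a plain in-memory dict rather than the binary on-disk format they talk about. Each word is stored once and points at the records that contain it, so a search looks up words instead of scanning all the text.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to the sorted list of record ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, query):
    """Return the ids of records containing every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)

# Hypothetical sample records, just to exercise the functions.
docs = {
    1: "fast search over big data",
    2: "search theory is boring",
    3: "big fast ram",
}
index = build_index(docs)
print(search(index, "fast big"))
```

A real index over gigabytes of text would need a compact on-disk layout and a proper tokenizer, which is presumably where the parser discussion in the article comes in.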