Tuesday, December 06, 2005
Search engine from a Googler
The released Funes search engine indexes mbox e-mail archives (mbox is Thunderbird's native format, and most other e-mail clients can at least export to it) and is used from the command line. I will give it a try and let you know how it works.
I have eight years of email archives. These archives are my memory. But my memory is awkward, it is hard for me to find things in it. When did I last write my old friend? What all have I thought about Jorge Luis Borges? Who wrote me email in early April, 2001?
Funes is a Java program that enables you to search your memories. At its core is a search engine: Funes indexes all of your email into a quickly-searchable database and then lets you query that database. The search engine itself is implemented with Lucene. Funes adds the glue to parse mailboxes and interact with you via a command line interface.
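To make the index-then-query idea above concrete, here is a minimal sketch in Python. It is not Funes's code (Funes is written in Java and relies on Lucene), and every name in it (split_mbox, build_index, search, SAMPLE) is made up for illustration. It assumes a simplified mbox layout in which each message begins with a line starting with "From ", and it swaps Lucene for a plain in-memory inverted index.

```python
import re
from collections import defaultdict

# Hypothetical two-message mbox archive, used only for illustration.
SAMPLE = """From alice@example.com Mon Apr  2 10:00:00 2001
Subject: Borges

Have you read Funes the Memorious?
From bob@example.com Tue Apr  3 11:00:00 2001
Subject: Lunch

Lunch on Friday?
"""

def split_mbox(text):
    """Split mbox text into messages.

    Assumption: every message starts with a separator line beginning
    with "From " and no body line begins that way (real mbox files
    escape such body lines, which this sketch ignores).
    """
    messages, current = [], None
    for line in text.splitlines():
        if line.startswith("From "):
            if current is not None:
                messages.append("\n".join(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("\n".join(current))
    return messages

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(messages):
    """Map each word to the set of message numbers that contain it."""
    index = defaultdict(set)
    for msg_id, msg in enumerate(messages):
        for word in tokenize(msg):
            index[word].add(msg_id)
    return index

def search(index, query):
    """Return the message numbers that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

messages = split_mbox(SAMPLE)
index = build_index(messages)
print(sorted(search(index, "funes")))
print(sorted(search(index, "lunch friday")))
```

A real tool would also persist the index to disk, parse headers such as dates and senders into separate fields, and rank results, which is the kind of work Lucene takes care of.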
There are other search tools out there for email, for example grepmail or mg. I wrote Funes because I wanted my own tool to work my own way, because I needed something to keep out of trouble while I was looking for work, and because Lucene was so cool. Mostly, I wrote it because my memory is important to me.
Funes is currently minimally usable and has much work to go. I am not likely to work on it in the near future. It is available as free software according to the GNU General Public License. If you try it, please let me know.
Funes actually did work when I released it, but I don't use it. I don't think anyone does. These days I use Gmail or a little Perl script called grepmail.
I wonder whether there will someday be a Thunderbird extension capable of matching a post's date against date-like strings in the post body (I use Thunderbird as an RSS aggregator).