A collection of awesome web crawling, scraping, and spidering projects in different languages.
The homepage of a distributed search engine project. Participants download and run a cross-platform spider (available for Windows, Linux, FreeBSD, Mac OS X, and pretty much any OS that can run Mono), which crawls the web and uploads what it finds to the project. The spider can use a lot of bandwidth, so consider carefully before joining in.
A search engine for Tor hidden services. The catch is that the engine itself is on the public Net, so if you don't want your services known, you'll have to take additional measures. Using it also exposes your activity on the public Net, so don't expect much privacy.