A CLI utility which scans websites for broken links. Sitemap aware.
Autonomous (self-hosted) BitTorrent DHT search engine suite. Designed specifically for end-users. Has a back-end daemon and a front-end UI. Uses SQLite as its backing store. Goes from DHT node to node, indexing what it finds.
REST API docs: https://app.swaggerhub.com/apis/boramalper/magneticow-api/v0
(v1 not published yet)
Does not have any built-in rate limiter yet, and it will literally suck the hell out of your bandwidth.
The homepage of a distributed search engine project. The project involves downloading and running a cross-platform spider (available for Windows, Linux, FreeBSD, MacOSX, and pretty much any OS which can run Mono) that will then crawl the web and upload what it finds to the project. This can use lots of bandwidth so consider carefully before joining in.
Specific documentation for telling YaCy to start crawling a site, the mode in which to index it, where to start, how deeply, et cetera. indexing exocortex spider
3749 links, including 200 private