The web version of the classic Linux productivity tool xroach. Cockroaches will scamper around the page until they find an image to hide behind.
I'm pushing these thoughts into the public sphere for two primary reasons:
(1) To facilitate a discussion around web search in which I can learn from others. I believe what I write here has solid merits but I also believe that we do our best work when we are challenged, encouraged, and refocused by others.
(2) To create a community of individuals who are interested in the future of web search, particularly those who want to actively participate in that future.
Note that you needn't be part of the second for me to value your input on the first. I don't want to miss out on wisdom from those who have commitments or priorities other than this project.
A webpage bookmarking and snapshotting service.
Omnom consists of two parts: a multi-user web application that accepts bookmarks and snapshots, and a browser extension responsible for bookmark and snapshot creation.
Omnom is a rebooted implementation of @stef's original omnom project; big thanks for it.
Library of Alexandria (LoA in short) is a project that aims to collect and archive documents from the internet.
In our modern age, new text documents are born in the blink of an eye, then (often just as quickly) disappear from the internet. We find it a noble task to save these documents for future generations.
This project aims to support this noble goal in a scalable way. We want to make archiving streamlined and easy to do even at a huge (terabyte/petabyte) scale. This way, we hope that more and more people can start their own collections and help the archiving effort.
tinysearch is a lightweight, fast, full-text search engine. It is designed for static websites.
tinysearch is written in Rust, and then compiled to WebAssembly to run in a browser. It can be used together with static site generators such as Jekyll, Hugo, Zola, Cobalt, or Pelican.
The test index file of my blog with around 40 posts creates a WASM payload of 99kB (49kB gzipped, 40kB brotli).
Only finds entire words. As a consequence there are no search suggestions (yet). This is a necessary tradeoff for reducing memory usage. A trie data structure was about 10x bigger than the xor filters. New research on compact data structures for prefix searches might lift this limitation in the future.
Since we bundle all search indices for all articles into one static binary, we recommend using it only for small- to medium-size websites. Expect around 2 kB uncompressed per article (~1 kB compressed).
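To make the whole-word limitation concrete, here is a minimal sketch in Python. It is not tinysearch's code and it does not implement a xor filter; it swaps in a plain set of 64-bit word fingerprints per article, which behaves the same way from the searcher's point of view: only hashes of complete words are stored, so a prefix has nothing to match against. The names fingerprint, build_index, and search are made up for illustration.

```python
import hashlib
import re


def fingerprint(word):
    """Map a whole word to a 64-bit fingerprint (case-insensitive)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_index(articles):
    """Build one fingerprint set per article from a {title: body} mapping."""
    return {
        title: frozenset(fingerprint(w) for w in re.findall(r"\w+", body))
        for title, body in articles.items()
    }


def search(index, query):
    """Return titles of articles that contain every whole word in the query."""
    wanted = [fingerprint(w) for w in re.findall(r"\w+", query)]
    return [
        title
        for title, fingerprints in index.items()
        if wanted and all(fp in fingerprints for fp in wanted)
    ]
```

With an article body containing "WebAssembly", search(index, "webassembly") finds it but search(index, "web") does not, because the index never sees the individual characters of a word, only its hash.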
WarcDB is an SQLite-based file format that makes web crawl data easier to share and query. It is based on the standardized Web ARChive format, used by web archivers.
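Because the format is SQLite-based, Python's standard sqlite3 module should be able to open such a file directly. The sketch below makes no assumption about WarcDB's actual schema; it only asks SQLite which tables exist, which is a reasonable first step before writing real queries. The function name list_tables is made up for illustration.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in an open SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```

Usage would look like list_tables(sqlite3.connect("archive.warcdb")), where archive.warcdb is a hypothetical file name.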
Online demo: https://lmorchard.github.io/tarot-thing/
When used inside a browser, Python has full access to the Web APIs.
Online REPL console: https://pyodide.org/en/stable/console.html
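As a small illustration of the Web API access mentioned above: under Pyodide, the js module exposes the browser's global scope to Python, so document can be imported like any other name. The sketch below guards the import so it also runs (and returns None) under ordinary CPython, where no browser exists. The function name current_page_title is made up for illustration.

```python
def current_page_title():
    """Return document.title under Pyodide in a browser, else None."""
    try:
        # The js module only exists inside Pyodide; it proxies browser globals.
        from js import document
    except ImportError:
        return None  # plain CPython: no browser APIs to talk to
    return document.title
```

Pasting this into the online console and calling current_page_title() should return the title of the console page itself.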
uBlacklist subscription list for developers.
Subscribe to this list to block useless websites from Google Search results, such as machine-translated Stack Overflow clones.
Transforms tkinter, Qt, Remi, WxPython into portable people-friendly Pythonic interfaces, especially if you primarily do CLI tools. Tries to make it easy to build GUIs for applications, because ordinarily the process sucks. Supports several toolkits, including Qt, WxPython, and Remi (if you want to turn something into a webapp); you can switch between those toolkits with a single line. No callback functions, that's all handled for you. Has a built-in debugger.
The free tier has only 1000 API calls. Multiple tiers of features.
CORS (Cross-Origin Resource Sharing) is hard. It's hard because it's part of how browsers fetch stuff, and that's a set of behaviours that started with the very first web browser over thirty years ago. Since then, it's been a constant source of development; adding features, improving defaults, and papering over past mistakes without breaking too much of the web.
Anyway, I figured I'd write down pretty much everything I know about CORS, and to make things interactive, I built an exciting new app.
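For a feel of the simplest slice of CORS, here is a sketch of the server-side decision for a plain cross-origin GET: echo the Origin header back in Access-Control-Allow-Origin only if it is on an allowlist, and add Vary: Origin so caches keep responses for different origins apart. It deliberately ignores preflight requests and credentials, which is where most of the difficulty lives. The function name cors_headers and the example origins are made up for illustration.

```python
def cors_headers(request_origin, allowed_origins):
    """Build CORS response headers for a simple (non-preflight) request.

    request_origin is the value of the request's Origin header, or None
    if the header was absent. Returns an empty dict when access is not granted.
    """
    if request_origin is None or request_origin not in allowed_origins:
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Vary": "Origin",  # the response differs per origin, so tell caches
    }
```

When the returned dict is empty, the browser still receives the response but refuses to hand it to the calling script.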
Raindrop.io is the best place to keep all your favorite books, songs, articles or whatever else you come across while browsing.
We're not trying to reinvent the wheel; we're working on a tool that does everything you expect from a modern bookmark manager.
Collections of links. Folksonomy tags. Filters. Finds duplicates and broken links for you. Full text search. Automatically makes copies of every page you bookmark to prevent link rot.
Unlimited bookmarks, collections, and devices indefinitely at the free level. Additional features (probably collaboration) at paid tiers.
BadWolf is a minimalist and privacy-oriented WebKitGTK+ browser.
Minimalist - Small codebase (~1 500 LoC), reuses existing components when available or makes them available.
Customizable - WebKitGTK native extensions, Interface customizable through CSS.
Powerful & Usable - Stable user interface; the common shortcuts are available, and no vi-style modal editing or single-key shortcuts are used.
Git repo: https://hacktivis.me/git/badwolf/
In the AUR.
A multithreaded hyperlink checker that crawls a site and looks for 404s. Unfortunately, it is no longer maintained and is written in Python 2. Still useful.
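Since the tool itself is stuck on Python 2, here is a rough Python 3 sketch of the core idea, using only the standard library. It is not that tool's code and it does not crawl: it takes a list of URLs you already have, checks them concurrently with a thread pool, and reports the ones that answer 404. The names fetch_status and find_broken_links are made up for illustration, and the fetch parameter exists so the checker can be exercised without touching the network.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 4xx and 5xx responses are raised as HTTPError
    except (URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and so on


def find_broken_links(urls, fetch=fetch_status, workers=8):
    """Check urls concurrently and return those that answered 404."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(fetch, urls))  # map preserves input order
    return [url for url, status in zip(urls, statuses) if status == 404]
```

Some servers reject HEAD requests, so a more careful version would fall back to GET before declaring a link dead.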
Join the most popular Internet of Things platform with free Cloud, iOS and Android mobile apps, Web dashboard, and Machine Learning. Has mobile apps for interacting with interfaced devices. Assemble custom apps with a drag-and-drop builder. If it's networked and you can mess with it, you can get it talking to Blynk.
If you want to use their service, developer accounts are free but are limited to five (5) devices at a time. Paid service starts at US$415.
Open source: You can download the server's source code and run it yourself if you want. It's written in Java.
Wiby is a search engine for older style pages, lightweight and based on a subject of interest. Building a web more reminiscent of the early internet.
Futuristic sci-fi and cyberpunk graphical user interface framework for web apps. If you ever wanted to build a theme that looks like JARVIS or something out of Blade Runner, this seems like a good place to start.
GitHub repo: https://github.com/arwes/arwes
Used by the Internet Archive as one of its online media players.
List of libraries, tools and APIs for web scraping and data processing.