A command-line tool that extracts the main content from a webpage, as the "Reader View" feature of most modern browsers does. It's intended for use with terminal RSS readers, to make articles more readable in text-based web browsers such as lynx. The code is closely adapted from the Firefox version, and the output is expected to be mostly equivalent.
This tool is young and written in C, so it's reasonable to wonder about the potential for memory-safety issues. To be safe, all HTML parsing happens inside a sandboxed subprocess. The sandbox is built with seccomp on Linux, pledge on OpenBSD, and Capsicum on FreeBSD.
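To illustrate the idea, here is a minimal sketch of the sandboxed-subprocess pattern. It is not the tool's actual code. The names `enter_sandbox`, `strip_tags`, and `parse_sandboxed` are hypothetical, the tag stripper is only a stand-in for a real HTML parser, and the seccomp filter shown (allow `write` and `exit_group`, kill on anything else) is an assumed policy chosen for brevity. The child confines itself before touching the untrusted input and hands the result back over a pipe.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__linux__)
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#elif defined(__FreeBSD__)
#include <sys/capsicum.h>
#endif

/* Hypothetical helper: confine the calling process. Returns 0 on success. */
static int enter_sandbox(void)
{
#if defined(__linux__)
    /* Assumed policy: allow write(2) and exit_group(2), kill on anything
     * else. A production filter would also check seccomp_data.arch. */
    struct sock_filter filter[] = {
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_write, 2, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_exit_group, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
#elif defined(__OpenBSD__)
    return pledge("stdio", NULL);
#elif defined(__FreeBSD__)
    return cap_enter();
#else
    return 0; /* no confinement here in this sketch; a real tool should refuse */
#endif
}

/* Stand-in for the real parser: copy the text found outside <...> tags. */
static size_t strip_tags(const char *html, char *out, size_t cap)
{
    size_t n = 0;
    int in_tag = 0;
    for (; *html != '\0' && n < cap; html++) {
        if (*html == '<')
            in_tag = 1;
        else if (*html == '>')
            in_tag = 0;
        else if (!in_tag)
            out[n++] = *html;
    }
    return n;
}

/* Parse html in a confined child process and copy the NUL-terminated
 * result into out. Returns 0 on success, -1 on any failure. */
static int parse_sandboxed(const char *html, char *out, size_t outlen)
{
    int fds[2];
    if (outlen == 0 || pipe(fds) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: confine first, then touch the untrusted input. */
        char buf[4096];
        if (enter_sandbox() != 0)
            _exit(1);
        size_t n = strip_tags(html, buf, sizeof(buf));
        size_t off = 0;
        while (off < n) {
            ssize_t w = write(fds[1], buf + off, n - off);
            if (w <= 0)
                _exit(1);
            off += (size_t)w;
        }
        _exit(0);
    }
    /* Parent: read until EOF or until out is full, then reap the child. */
    close(fds[1]);
    size_t got = 0;
    ssize_t r;
    while (got < outlen - 1 &&
           (r = read(fds[0], out + got, outlen - 1 - got)) > 0)
        got += (size_t)r;
    out[got] = '\0';
    close(fds[0]);
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would pass the downloaded page and a buffer, for example `parse_sandboxed(page, text, sizeof(text))`, and treat a nonzero return as a failed or killed parser. Because the parent only reads bytes from the pipe, a memory error triggered by hostile HTML stays inside a process that can do little beyond writing to that pipe and exiting.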