You are viewing a humanly curated list of fine personal & independent blogs that are updated regularly. No algorithms ever!
Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. When given a list of files it will hash their contents on an extent by extent basis and compare those hashes to each other, finding and categorizing extents that match each other. Optionally, a per-block hash can be applied for further duplication lookup. When given the -d option, duperemove will submit those extents for deduplication using the Linux kernel FIDEDUPRANGE ioctl, which only applies to btrfs and xfs.
Duperemove can store the hashes it computes in a 'hashfile'. If given an existing hashfile, duperemove will only compute hashes for those files which have changed since the last run. Thus you can run duperemove repeatedly on your data as it changes, without having to re-checksum unchanged data.
Requires kernel v3.13 or later.
It's in the Arch extra package repository.
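The hash-and-compare idea can be sketched in a few lines of Python. This is only an illustration, not how duperemove is implemented: it hashes fixed-size chunks rather than real filesystem extents, it never calls FIDEDUPRANGE, and the names hash_blocks, find_duplicates and BLOCK_SIZE are made up for the example.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 128 * 1024  # assumed chunk size, chosen for this sketch only


def hash_blocks(path, block_size=BLOCK_SIZE):
    """Yield (offset, hex digest) for each fixed-size chunk of a file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            yield offset, hashlib.sha256(chunk).hexdigest()
            offset += len(chunk)


def find_duplicates(paths, block_size=BLOCK_SIZE):
    """Group (path, offset) locations by digest, keeping digests seen more than once."""
    seen = defaultdict(list)
    for path in paths:
        for offset, digest in hash_blocks(path, block_size):
            seen[digest].append((path, offset))
    return {digest: locs for digest, locs in seen.items() if len(locs) > 1}
```

Calling find_duplicates(["a.img", "b.img"]) returns a dict mapping each repeated digest to the places it occurs. A real tool would still have to hand those ranges to the kernel to share the storage.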
Hancho is a simple, pleasant build system with few moving parts.
Hancho fits comfortably in a single Python file and requires no installation, just copy-paste it into your source tree. Hancho is inspired by Ninja (for speed and simplicity) and Bazel (for syntax and extensibility). Like Ninja, it knows nothing about your build tools and is only trying to assemble and run commands as fast as possible. Unlike Ninja, you can use glob("*.cpp") and such to make things far less verbose. Like Bazel, you invoke build rules by calling them as if they were functions with keyword arguments. Unlike Bazel, you can create build rules that call arbitrary Python code (for better or worse). Hancho should suffice for small to medium sized projects.
A text-mode X display manager for the console. Lightweight, not trying to be pretty. Enter your username and password, get a desktop. Has an extensive list of window managers and desktop environments it's been tested with. Designed to not require systemd (though it can work under it if necessary).
Schnoz is a tool that I wrote in Python to monitor network traffic and analyze potential threats. I compiled all of the small scripts regarding network analysis to create a multirange tool. Please make sure that you have scapy installed. Implements active network sniffing, pulling from pcap files, alerting on specific traffic parameters, and analysis of captured HTTP traffic.
An open standard for a common interconnect between headsets and radios.
An opinionated (and incomplete) ActivityPub service implementation in Go. The documentation for this package is incomplete, reflecting the nature of our work to first understand the mechanics, and second explore the tolerances, of the ActivityPub protocols. The closest thing to "quick start" documentation can be found in the Example section of this README.
XFiles is a file manager for X11. It can navigate through directories, show icons for files, select files, call a command to open files, generate thumbnails, and call a command to run on right mouse button click. Supports running scripts when the user selects a file.
This is an old-school X11-style X application. No toolkit, no desktop environment, no skinning, just a file manager.
All of the RSS feeds provided by the Straits Times' international edition.
A huge blocklist of sites (~850) that contain AI generated content, for the purposes of cleaning image search engines (Google Search, DuckDuckGo, and Bing) with uBlock Origin or uBlacklist.
list.txt can probably be processed and used to build a blocking database for search bots.
A Shaarli browser extension using the API for both Firefox and Chrome based browsers. It features add/edit and search of bookmarks for your Shaarli instance.
Firefox: https://addons.mozilla.org/firefox/addon/shaanti/
Chrome: https://chromewebstore.google.com/detail/shaanti/bfecpppjnokkpdegijfgbldholankami
Pint is a Python package to define, operate and manipulate physical quantities: the product of a numerical value and a unit of measurement. It allows arithmetic operations between them and conversions from and to different units.
It is distributed with a comprehensive list of physical units, prefixes and constants. Due to its modular design, you can extend (or even rewrite!) the complete list without changing the source code. It supports a lot of numpy mathematical operations without monkey patching or wrapping numpy.
A command-line script pint-convert provides a quick way to convert between units or get conversion factors.
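To make "the product of a numerical value and a unit" concrete, here is a toy, standard-library-only sketch. It is not Pint's API or implementation: the Length class and the TO_METRES table are invented for illustration, and it only handles three length units.

```python
from dataclasses import dataclass

# Conversion factors to metres; a tiny hand-written table for this sketch.
TO_METRES = {"m": 1.0, "cm": 0.01, "km": 1000.0}


@dataclass(frozen=True)
class Length:
    """A numerical value paired with a length unit."""
    value: float
    unit: str

    def to(self, unit):
        """Convert to another unit by going through metres."""
        metres = self.value * TO_METRES[self.unit]
        return Length(metres / TO_METRES[unit], unit)

    def __add__(self, other):
        """Add two lengths, expressing the result in the left operand's unit."""
        return Length(self.value + other.to(self.unit).value, self.unit)
```

For example, Length(1.5, "km").to("m") gives a Length in metres, and adding Length(1, "m") to Length(50, "cm") converts the centimetres before summing. A real unit library generalizes this to arbitrary dimensions, prefixes and constants.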
Webcrawlers/bots often identify themselves in the user agent string. Well it turns out, up until now, a huge majority of my bandwidth usage has come from bots scraping my site thousands of times a day.
A robots.txt file can advertise that you don't want bots to crawl your site. But it's completely voluntary—a bot may happily ignore it and scrape your site anyway. And I'm fine with webcrawlers indexing my site, so that it might be more discoverable. It's the bandwidth hogs that I want to block.
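Since robots.txt is voluntary, the blocking has to happen on the server. A minimal sketch of a user-agent check, assuming you keep your own list of offenders: the substrings below are placeholders, not the ones from the post, and is_blocked and status_for are hypothetical helper names.

```python
# Hypothetical substrings to match; not taken from the original post.
BLOCKED_AGENT_SUBSTRINGS = ("examplebot", "scraperbot")


def is_blocked(user_agent, blocked=BLOCKED_AGENT_SUBSTRINGS):
    """Return True if the User-Agent header contains a blocked substring."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in blocked)


def status_for(user_agent):
    """Pick an HTTP status code: 403 for blocked bots, 200 otherwise."""
    return 403 if is_blocked(user_agent) else 200
```

This only stops bots that identify themselves honestly. A scraper that fakes a browser user agent would pass, so rate limiting may still be needed.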
The Python standard library once included a basic SMTP server in the smtpd module, based on the old asynchronous libraries asyncore and asynchat. It was formally removed in v3.12.
This package provides such an implementation of both the SMTP and LMTP protocols using the asyncio module (which has been standard since Python v3.4). Supports the relevant RFCs natively.
Can be executed from the command line, defaulting to port 8025/tcp: python3 -m aiosmtpd -n
or aiosmtpd -n
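With the server running on its default port, a quick way to poke it is the standard library's smtplib. A small sketch, assuming the server listens on localhost:8025. The addresses and the helper names build_message and send_to_local_server are made up for the example.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_to_local_server(msg, host="localhost", port=8025):
    """Deliver the message over SMTP; needs a server listening on host:port."""
    with smtplib.SMTP(host, port) as client:
        client.send_message(msg)


# Usage, with the server already started in another terminal:
# send_to_local_server(build_message("me@example.com", "you@example.com", "test", "hello"))
```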
403JUMP is a tool designed for penetration testers and bug bounty hunters to audit the security of web applications. It aims to bypass HTTP 403 (Forbidden) pages using various techniques, including different HTTP verbs, different HTTP headers, and path fuzzing.
CryFS encrypts your files, so you can safely store them anywhere. It works well together with cloud services like Dropbox, iCloud, OneDrive and others. Easy to set up and works with a lot of cloud storage providers. It runs in the background - you won't notice it when accessing your files in your daily workflow. Your data only leaves your computer in encrypted form. File contents, metadata and directory structure are all secure from someone who hacked your cloud. Released under LGPL.
Can be used locally but that's not its primary use case.
Two directories: A basedir that holds the encrypted files, and a mountdir which you interact with. The basedir is what gets stored remotely, synced, or whatever. Note: Not safe for concurrent access!
Files are split into equal size blocks, encrypted individually. Metadata and directory structures are also represented as those blocks for obfuscation. Block cipher used, random key generated, key encrypted with passphrase.
In Apt, Pacman, Homebrew, Nix repositories.
Default encryption algorithm: XChaCha20-Poly1305, scrypt for key derivation.
Github: https://github.com/cryfs/cryfs
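Two of the ideas in the notes above, equal-size blocks and a passphrase-derived key, can be sketched with the Python standard library. This is not CryFS's actual on-disk format: the block size, the zero padding, the scrypt parameters and the function names are all assumptions for illustration, and the encryption step itself (XChaCha20-Poly1305) is left out because the standard library has no such cipher.

```python
import hashlib

BLOCK_SIZE = 32 * 1024  # assumed block size, for this sketch only


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split bytes into equal-size blocks, zero-padding the last one.

    Returns (blocks, original_length) so the padding can be removed later.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\0")
    return blocks, len(data)


def derive_key(passphrase, salt, length=32):
    """Derive a key from a passphrase with scrypt (parameters are assumptions)."""
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=length,
    )
```

Because every block has the same size, an observer of the basedir cannot read file sizes directly off the stored objects. In the scheme described above, the derived key would protect the randomly generated data key rather than encrypt the blocks itself.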
Documentation for the Go AWS SDK library.