This project presents and utilizes YAML Paths, which are a powerful, intuitive means of identifying one or more nodes within YAML, EYAML, or compatible data structures like JSON. Both dot-notation (inspired by Hiera) and forward-slash-notation (influenced by XPath) are supported. The libraries (modules) and several command-line tool implementations are provided. With these, you can build YAML Path support right into your own application or easily use its capabilities right away from the command-line to retrieve, update, merge, validate, and scan YAML/JSON/Compatible data.
This implementation of YAML Path is a query language in addition to a node descriptor. With it, you can describe or select a single precise node or search for any number of nodes that match some criteria. Keys, values, elements, anchors, and aliases can all be searched at any number of levels within the data structure using the same query. Collectors can also be used to gather and further select from otherwise disparate parts of the source data.
Basically, it's like JSONpath, but for YAML.
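To make the two notations concrete, here is a minimal sketch of resolving a path against already-parsed data. It is not the project's implementation or its syntax: `split_path` and `resolve` are hypothetical helpers, only plain keys are handled, and treating a bare integer segment as a list index is an assumption of this sketch.

```python
def split_path(path):
    """Split a dot- or forward-slash-notation path into its segments."""
    separator = "/" if path.startswith("/") else "."
    return [segment for segment in path.split(separator) if segment]


def resolve(data, path):
    """Walk nested dicts and lists one segment at a time."""
    node = data
    for segment in split_path(path):
        if isinstance(node, list):
            node = node[int(segment)]  # assumption: integer segment = list index
        else:
            node = node[segment]
    return node


doc = {"hosts": [{"name": "alpha", "port": 22}, {"name": "beta", "port": 2222}]}
print(resolve(doc, "hosts.1.name"))   # dot notation
print(resolve(doc, "/hosts/0/port"))  # forward-slash notation
```

Search expressions, anchors, aliases, and collectors are what the real thing adds on top of this kind of walk.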
chezmoi helps you manage your personal configuration files (dotfiles, like ~/.gitconfig) across multiple machines. chezmoi provides many features beyond symlinking or using a bare git repo.
GitHub: https://github.com/twpayne/chezmoi
In Homebrew. In Arch's extra package repo.
Torrent file parsing and creation with pydantic (and models for other bittorrent things too). Can create and parse v1, v2, hybrid, and other BEPs. Is focused on library usage (but does cli things too). Validates torrent files. Treats .torrent files as an extensible rather than fixed format.
Gain another host's network access permissions by establishing a stateful TCP connection with a spoofed source IP. Requires all of the hosts in question to be on the same subnet; uses ARP cache poisoning.
A collection of utilities for ripping, dumping, analysing, and modifying disk images. Written with the Greaseweazle in mind, which only makes sense because the primary developer is its inventor. Targets Linux, Mac OS X, and Windows (using Cygwin or MinGW), and should be very POSIX portable. amiga-native/ targets classic Amiga m68k, tested with SAS/C 6.50.
Straightforward compilation, just a Makefile.
A Rust application to discover and verify professional email addresses based on contact names and company websites. This tool helps you find valid email addresses for business contacts when you have their name and company domain.
Creates common email patterns based on first and last names. Crawls company websites for email addresses. Validates email existence via direct mail server communication. Uses provider-specific APIs for enhanced accuracy. Simulates login attempts with a headless web browser to validate emails. Uses DNS (MX records) to find mail servers. Handles multiple contacts simultaneously. Ranks possible email addresses by confidence.
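The first step, building candidate addresses from a name, might look roughly like the sketch below. The tool itself is written in Rust and its actual pattern list is not given here, so `candidate_addresses` and the patterns in it are assumptions chosen for illustration.

```python
def candidate_addresses(first, last, domain):
    """Build common local-part patterns for one contact (illustrative list only)."""
    first, last = first.strip().lower(), last.strip().lower()
    local_parts = [
        f"{first}.{last}",     # ada.lovelace
        f"{first}{last}",      # adalovelace
        f"{first[0]}{last}",   # alovelace
        f"{first}_{last}",     # ada_lovelace
        f"{last}.{first}",     # lovelace.ada
        first,                 # ada
    ]
    return [f"{local}@{domain}" for local in local_parts]


for address in candidate_addresses("Ada", "Lovelace", "example.com"):
    print(address)
```

Each candidate would then be checked and ranked by the later steps (MX lookup, mail server conversation, and so on), which this sketch does not attempt.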
Floptool is a tool for the maintenance and manipulation of floppy images that MAME users need to deal with. MAME directly supports .WAV audio formatted images, but many of the existing images out there may come in forms such as .TAP for Commodore 64 tapes, .CAS for Tandy Color Computer tapes, and so forth. Castool will convert these other formats to .WAV for use in MAME.
Floptool is part of the MAME project. It shares large portions of code with MAME and would not exist without it. As such, the distribution terms are the same as MAME's. Please read the MAME license thoroughly.
Supports dozens of image formats.
Tantivy is closer to Apache Lucene than to Elasticsearch or Apache Solr in the sense it is not an off-the-shelf search engine server, but rather a crate that can be used to build such a search engine. Tantivy is, in fact, strongly inspired by Lucene's design.
Full-text search. Configurable tokenizer (stemming available for 17 Latin languages) with third party support for Chinese, Japanese, and Korean. Fast. Tiny startup time (<10ms), perfect for command-line tools. BM25 scoring (the same as Lucene). Natural query language and phrase query search. Incremental indexing of data. Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop).
There is a CLI tool (tantivy-cli) that lets you do all the configuration and setting up from the command line.
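For a feel of what BM25 scoring does, here is a small sketch of the textbook formula over whitespace-split documents. It is not Tantivy's code: `bm25_scores` is a hypothetical function, the k1 = 1.2 and b = 0.75 values are the commonly quoted defaults, and real engines differ in tokenization and in details of the formula.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with textbook BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the quick brown fox",
    "the lazy dog",
    "quick quick fox jumps over the lazy dog",
]
scores = bm25_scores("quick fox", docs)
print(scores)
```

Rare terms get a higher weight, repeated terms give diminishing returns, and shorter documents are favored when the match counts are similar.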
Write an Assembler program that translates programs written in the symbolic Hack assembly language into binary code that can execute on the Hack hardware platform built in the previous projects.
To run this program, execute the shell script make.sh from the command line. This will generate all the necessary .hack files from .asm files using a Python script (Assembler.py).
Once this has been completed open up the Assembler.sh and load up the .asm and corresponding hack files and compare their output.
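As a taste of what such an assembler does, here is a sketch of translating just the A-instruction (`@value`), which becomes a leading 0 bit followed by the value as 15 binary digits. This is not the repository's Assembler.py: `assemble_a_instruction` and the tiny symbol table are hypothetical, and C-instructions, labels, and variables are left out.

```python
# A few of the Hack platform's predefined symbols (partial table for illustration).
PREDEFINED = {"SP": 0, "LCL": 1, "ARG": 2, "SCREEN": 16384, "KBD": 24576}


def assemble_a_instruction(line, symbols=PREDEFINED):
    """Translate '@value' or '@SYMBOL' into a 16-character binary string."""
    token = line.strip()[1:]  # drop the leading '@'
    value = int(token) if token.isdigit() else symbols[token]
    if not 0 <= value < 2 ** 15:
        raise ValueError(f"address out of range: {value}")
    return format(value, "016b")  # top bit is 0, which marks an A-instruction


print(assemble_a_instruction("@2"))
print(assemble_a_instruction("@SCREEN"))
```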
pwncat is a sophisticated bind and reverse shell handler with many features as well as a drop-in replacement or compatible complement to netcat, ncat or socat. Fully scriptable with Python. Self-injecting mode to deploy itself and auto-start multiple unbreakable reverse shells back to you. Reverse shells will reconnect to you if you accidentally kill pwncat or are cut off. Connections over TCP or UDP. Bind shells, reverse shells, port forwarding.
It can wrap your network traffic in any other protocol to obfuscate it or encrypt it.
Written using only Python core libraries so that it runs without having to install anything.
In the AUR. Also installable with pip.
modemu2k adds telnet capability to a comm program. It can redirect telnet I/O to a pty so that a comm program can handle the pty as a tty with a real modem, and allows you to use a comm program's scripting and file transfer features over telnet. Now supports IPv6 connections.
It works like file transfer protocols do in minicom (rx/sx, ry/sy, rz/sz).
Or you can use it as a stand-alone CLI client.
str2speech is a simple command-line tool for converting text to speech using Transformer-based text-to-speech (TTS) models. It supports multiple models and voice presets, allowing users to generate high-quality speech audio from text.
Supports multiple TTS models, including suno/bark-small, suno/bark, and various facebook/mms-tts models. Allows selection of voice presets. Supports text input via command-line arguments or files. Outputs speech in .wav format. Works with both CPU and GPU.
Looks like the speech models have to be installed locally to work.
Subtrace is Wireshark for your Docker containers. It lets developers see all incoming and outgoing requests in their backend server so that they can resolve production issues faster. Works out-of-the-box, no code changes needed. Supports all languages (Python + Node + Go + everything else). See full payload, headers, status code, and latency.
Why learn actual skills when you can just look impressive instead?
Introducing rust-stakeholder - a CLI tool that generates absolutely meaningless but impressive-looking terminal output to convince everyone you're a coding genius without writing a single line of useful code.
Remember, it's not about your actual contribution to the codebase, it's about how complicated your terminal looks when the VP of Engineering walks by. Nothing says "I'm vital to this company" like 15 progress bars, cryptic error messages you seem unfazed by, and technical jargon nobody understands.
A multi-threaded PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks. Supports wordlist-based dictionary attacks, date, number range, and alphanumeric brute-forcing, and a custom query builder for password formats. Performs about 50k-100k+ passwords per second utilizing full CPU cores. You can write your own queries like STRING{69-420} which would generate and use a wordlist with the full number range. Specify a maximum and optionally a minimum length for the password search and all passwords of length 4 up to the specified maximum consisting of letters and numbers (a-zA-Z0-9) will be tried.
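Here is a guess at how a query like STRING{69-420} could expand into a wordlist. The tool's real parser is not shown in its description, so `expand_range_query`, the single-range limit, and the lack of zero padding are all assumptions of this sketch.

```python
import re


def expand_range_query(query):
    """Expand one PREFIX{low-high}SUFFIX range into every candidate string."""
    match = re.fullmatch(r"(.*?)\{(\d+)-(\d+)\}(.*)", query)
    if match is None:
        return [query]  # no range present: the query is a single candidate
    prefix, low, high, suffix = match.groups()
    return [f"{prefix}{number}{suffix}" for number in range(int(low), int(high) + 1)]


candidates = expand_range_query("STRING{69-420}")
print(len(candidates), candidates[0], candidates[-1])
```

Each generated candidate would then be tried against the PDF, which is where the multi-threading pays off.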
A commandline utility to search text in PDF files. Tries to be compatible with GNU Grep, where it makes sense. Many of your favorite grep options are supported (such as -r, -i, -n or -c).
Git: https://gitlab.com/pdfgrep/pdfgrep
I wonder if I can plug this into SearxNG.
A data hoarder’s dream come true: bundle any web page into a single HTML file. You can finally replace that gazillion of open tabs with a gazillion of .html files stored somewhere on your precious little drive.
Unlike the conventional “Save page as”, monolith not only saves the target document, it embeds CSS, image, and JavaScript assets all at once, producing a single HTML5 document that is a joy to store and share.
If compared to saving websites with wget -mpk, this tool embeds all assets as data URLs and therefore lets browsers render the saved page exactly the way it was on the Internet, even when no network connection is available.
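The data URL trick is simple enough to sketch: the asset's bytes are base64-encoded and placed inline where the link used to be. This is only an illustration of the format, not monolith's code, and `to_data_url` is a hypothetical helper.

```python
import base64


def to_data_url(data, mime_type):
    """Encode raw asset bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


css = b"body { color: black; }"
# The stylesheet now travels inside the HTML instead of being fetched separately.
print(f'<link rel="stylesheet" href="{to_data_url(css, "text/css")}">')
```

The price is size: base64 makes every embedded asset about a third larger than the original file.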
In the Arch package repos.
A little CLI utility that calculates and lists all of the numbers between 1 and 1,000,000,000. The algorithm used runs with complexity of O(√n) and took about 27 minutes 11 seconds. There's also a link to just download its output (50 megs compressed with 7z, 502 megs uncompressed).
A bit of glue between components that is able to textually summarize videos and podcasts - offline. The script takes a URL as argument, downloads and extracts the audio, transcribes the spoken words to text and then finally prints a summary of the content. No external services are used by this script except for the initial audio download. Examples of URLs that work are Youtube videos and Apple podcasts, see the yt-dlp project for the full list.
This script doesn't do anything clever, it just makes use of the great work done by other projects. The purpose is to not have to sit through 8-12 minutes of someone explaining what should've just been a short blog post. The default model used is LLaMa-3 to support medium spec hardware. If you have a large system, Mixtral 8x7b is another great option with a much larger context window (= able to work with longer transcriptions).
The script saves transcriptions to a folder in the same directory, and if the same URL is later used again it will not re-download the audio and create a new transcription but use the existing one. This means it's possible to later use the conversational mode to ask questions on the content, even if not done the first time.
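The reuse-by-URL behavior described above can be sketched as a small file cache. How the script actually names its files is not stated, so hashing the URL into a filename is an assumption, and `transcript_path`, `get_transcript`, and the `transcribe` callback are hypothetical names.

```python
import hashlib
from pathlib import Path


def transcript_path(url, cache_dir):
    """Map a URL to a stable filename inside the cache folder."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.txt"


def get_transcript(url, cache_dir, transcribe):
    """Return the saved transcription, or create and save it on first use."""
    path = transcript_path(url, cache_dir)
    if path.exists():
        return path.read_text(encoding="utf-8")  # no download, no re-transcription
    text = transcribe(url)  # the expensive step: download audio and transcribe it
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return text
```

A second call with the same URL only reads the text file, which is what makes it cheap to come back later and ask questions about the content.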
Relies upon a locally hosted LLM to do the heavy lifting so you don't have to ship the data off to another service. Entirely self hosted.
A CLI file sharing utility that serves data over the Veilid network in a BitTorrent-like fashion. The data is available as long as the share is running.