freedb.org announced its services would shut down entirely on 2020-03-31. Many legacy software applications have FreeDB/CDDB support built in for fetching CD metadata such as artist, title, and track names. To keep these apps functioning in their full glory, this is meant as a drop-in replacement for FreeDB/CDDB.
This application does not use the original CDDB database, but fetches disc information from MusicBrainz, which has an open API and excellent up-to-date disc metadata.
You can use their public service as documented, or stand up and run your own. Written in Euphoria, a language I've never heard of.
We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone. The Common Crawl corpus contains petabytes of data collected since 2008. It contains raw web page data, extracted metadata and text extractions. The Common Crawl dataset lives on Amazon S3 as part of the Amazon Web Services’ Open Data Sponsorships program. You can download the files entirely free using HTTP(S) or S3. Our goal is to democratize the data so everyone, not just big companies, can do high quality research and analysis.
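A minimal sketch of what "HTTP(S) or S3" means in practice, assuming the public endpoints are https://data.commoncrawl.org/ and the s3://commoncrawl/ bucket (check the Common Crawl documentation before relying on either). The helper name and the crawl ID in the example path are illustrative only.

```python
# Assumed public endpoints; verify against the Common Crawl documentation.
HTTP_BASE = "https://data.commoncrawl.org/"
S3_BASE = "s3://commoncrawl/"


def crawl_locations(relative_path):
    """Return the HTTPS and S3 locations for one file in the corpus."""
    path = relative_path.lstrip("/")
    return {"https": HTTP_BASE + path, "s3": S3_BASE + path}


# Example path with an illustrative crawl ID; substitute the crawl you need.
locations = crawl_locations("crawl-data/CC-MAIN-2023-50/warc.paths.gz")
print(locations["https"])
print(locations["s3"])
```

The same relative path works for both access methods, so a script can switch between them without changing its file lists.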
A community-driven, FLOSS-licensed wiki documenting unsolicited requests, metadata leaks, and privacy-invasive features in applications. Privacy is a complex topic, and can be very context-specific. I, NetNauseam, believe the best approach to privacy is to simply stop all applications from leaking data and metadata to and through the network.
All information in this repo should be viewed as an opinion, not a fact, and I do not claim your privacy will be improved in any way by following any of these recommendations. These are complex topics with many edge cases and any guarantees are difficult, if not impossible, to make.
Source code: https://codeberg.org/netnauseam/wiki/
binlist.net is a public web service for looking up credit and debit card metadata.
The first 6 or 8 digits of a payment card number (credit cards, debit cards, etc.) are known as the Issuer Identification Number (IIN), previously known as the Bank Identification Number (BIN). They identify the institution that issued the card to the cardholder.
Requests are throttled at 10 per minute with a burst allowance of 10. If you hit the limit, the service returns an HTTP 429 status code.
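A rough client-side sketch of the two ideas above: pulling the 6- or 8-digit IIN off a card number, and staying under a 10-per-minute limit with a burst of 10 by using a token bucket. Nothing here calls binlist.net; `iin_prefix` and `TokenBucket` are hypothetical helper names, and the token bucket is one assumed way to model the throttle, not necessarily how the service enforces it.

```python
import time


def iin_prefix(card_number, length=6):
    """Return the leading IIN digits (6 or 8) of a card number string."""
    if length not in (6, 8):
        raise ValueError("IIN length must be 6 or 8")
    digits = "".join(ch for ch in card_number if ch.isdigit())
    if len(digits) < length:
        raise ValueError("not enough digits for an IIN")
    return digits[:length]


class TokenBucket:
    """Client-side throttle: refill at rate_per_minute, capped at burst."""

    def __init__(self, rate_per_minute=10, burst=10, clock=time.monotonic):
        self.rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)
        self.tokens = float(burst)          # start with a full burst
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; return False rather than sleeping."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket()
if bucket.try_acquire():
    # A real client would send its lookup request here.
    print(iin_prefix("1234 5678 9012 3456"))
```

Checking `try_acquire()` before every lookup should keep a well-behaved client from ever seeing the 429 response, provided the service's accounting matches this model.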
Someone also assembled a directory of metadata tags used by Pelican in its templates.
pText is a pure Python library to read, write and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives. Extract and edit metadata, extract and edit text and images, add annotations.
Seems like it would be useful for a large-scale indexing effort.
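To illustrate why a JSON-like representation suits indexing, here is a small sketch in plain Python. It does not use pText's API; the `document` dictionary is a hand-written stand-in whose layout is an assumption, and `find_values` is a hypothetical helper that walks any nesting of lists and dictionaries.

```python
def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


# Hypothetical stand-in for a parsed document; not actual pText output.
document = {
    "Trailer": {"Info": {"Title": "Example", "Author": "A. Writer"}},
    "Pages": [{"Annots": []}, {"Annots": [{"Subtype": "Link"}]}],
}

print(list(find_values(document, "Author")))
```

Because the structure is just lists, dictionaries and primitives, the same walker would work on any document without format-specific code.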
With Meta Tags you can edit and experiment with your content then preview how your webpage will look on Google, Facebook, Twitter and more!
A site that tests the VPN you're using for privacy leaks.
MusicBrainz is an open music encyclopedia that collects music metadata and makes it available to the public.
A personal file system indexing and search application. Part of the GNOME desktop. Indexes file contents, metadata, and location to help you find things. Also lets you add your own tags to the files it tracks. Uses D-Bus for IPC and SPARQL for search. Uses multiple ontologies for different kinds of files (including multimedia content).
MAT2 is a set of tools for scrubbing the metadata (data about the origin and nature of files) from document files, images, audio recordings, and more. This data can be dangerous if anonymity is important to you.
Supports PNG, JPG, DOC, DOCX, PPT, ODT, TAR, BZ2, GZ, MP3, .torrent files, and too many others to list here.
How matrix algebra can be used on a table of names and membership checkmarks to develop a detailed social connection network.
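A minimal sketch of the idea, assuming the table is encoded as a 0/1 matrix with one row per person and one column per group. Multiplying that matrix by its own transpose gives a person-by-person matrix whose off-diagonal entries count shared groups. The names and memberships below are made up for illustration.

```python
def co_membership(membership):
    """Multiply the person-by-group matrix by its transpose (A x A^T)."""
    n = len(membership)
    return [
        [sum(a * b for a, b in zip(membership[i], membership[j]))
         for j in range(n)]
        for i in range(n)
    ]


people = ["Ann", "Bob", "Cy"]  # hypothetical names
# Rows are people, columns are groups; 1 marks a membership checkmark.
membership = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]

links = co_membership(membership)
for name, row in zip(people, links):
    print(name, row)
```

Each diagonal entry is the number of groups that person belongs to; each off-diagonal entry is the number of groups two people share, which is the edge weight in the resulting social network. Transposing first (groups by people) would instead link groups through shared members.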
MediaInfo is a utility that parses the metadata of media files and tells you the file and/or container format and the codecs used.
A utility for turning fanfic from online archives into ebooks to load onto a tablet, phone, or reader. Written in Python. Consists of a plugin for Calibre, a CLI utility, and a web service.
A howto for activists that describes how to capture and archive video footage. Includes archival of metadata, keeping files intact, raw and edited video concerns, organization, storage concerns, cataloging, sharing, and preservation. Treats it in a verifiable, library-like manner. Can be downloaded, too.
Python module for extracting text from different kinds of documents. Can also be used as a CLI utility. Works with text-based formats like CSV, JSON, and HTML, and with binary formats like MS Word, MP3, and PDF. The list is fairly extensive.