An image viewer and browser utility. Pix is part of the X-Apps project, which aims at producing cross-distribution and cross-desktop software.
As an image browser, it lets you browse your hard disk, showing thumbnails of image files. Thumbnails are saved in the same database used by Nautilus, so you don't waste disk space. It implements all of the file management functions you'd expect. As an image viewer, it displays just about every image format out there, from BMP to JPG, with optional support for RAW and HDR (high dynamic range) images. Add comments to images. Organize images in catalogs, and catalogs in libraries. Search for images on your hard disk and save the result as a catalog. Search criteria remain attached to the catalog, so you can update it when you want. Minor image editing and conversion features.
Especially handy is the capability to rename files in a series (normalizing filenames), edit EXIF data, and deduplicate by image (and not just by file hash). Deduplication can recurse directory structures. It's incredibly fast, too: 500,000 images took less than an hour to process (Geeqie ran for three days straight and still hadn't finished).
In the AUR.
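For contrast with Pix's deduplication by image, here is a minimal sketch of the simpler file-hash approach mentioned above, recursing through a directory tree. It is not how Pix works internally; the function names `file_digest` and `find_duplicate_files` are hypothetical, and the choice of SHA-256 is an assumption made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_files(root):
    """Walk root recursively and group paths whose bytes are identical."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only hash regular files.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    # Keep only digests shared by two or more files.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

A byte-identical match like this misses the same picture saved at a different size or compression level, which is the case that deduplication by image is meant to catch.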
Czkawka (tch•kav•ka (IPA: [ˈʧ̑kafka]), "hiccup" in Polish) is a simple, fast and free app to remove unnecessary files from your computer.
Krokiet (IPA: [ˈkrɔcɛt], "croquet" in Polish) is the same as above, but uses a Slint frontend.
Amazingly fast, thanks to more or less advanced algorithms and multithreading. Cache support: second and further scans should be much faster than the first one. CLI and GUI (GTK 4 or Slint) frontends. Multiple tools for flexibility.
Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. When given a list of files it will hash their contents on an extent by extent basis and compare those hashes to each other, finding and categorizing extents that match each other. Optionally, a per-block hash can be applied for further duplication lookup. When given the -d option, duperemove will submit those extents for deduplication using the Linux kernel FIDEDUPRANGE ioctl, which only applies to btrfs and xfs.
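The hash-and-compare idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: it hashes fixed-size blocks rather than real filesystem extents, it uses SHA-256, and it only reports matches without submitting anything to the kernel for deduplication. The names `BLOCK_SIZE`, `hash_blocks`, and `find_matching_blocks` are hypothetical and not part of duperemove.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 128 * 1024  # hypothetical block size chosen for this sketch


def hash_blocks(path, block_size=BLOCK_SIZE):
    """Yield (offset, digest) for each fixed-size block of a file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield offset, hashlib.sha256(block).hexdigest()
            offset += len(block)


def find_matching_blocks(paths, block_size=BLOCK_SIZE):
    """Group (path, offset) locations by digest; keep digests seen more than once."""
    locations = defaultdict(list)
    for path in paths:
        for offset, digest in hash_blocks(path, block_size):
            locations[digest].append((path, offset))
    return {d: locs for d, locs in locations.items() if len(locs) > 1}
```

Each entry in the result lists every place a given block of data occurs, which is the kind of grouping a deduplication step would then act on.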
Duperemove can store the hashes it computes in a 'hashfile'. If given an existing hashfile, duperemove will only compute hashes for those files which have changed since the last run. Thus you can run duperemove repeatedly on your data as it changes, without having to re-checksum unchanged data.
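The incremental behaviour described above can be sketched as a small cache keyed by file path. This is an illustration, not duperemove's actual hashfile format: the JSON layout, the change test based on size and modification time, and the names `load_cache` and `update_hashes` are all assumptions made for the example.

```python
import hashlib
import json
import os


def load_cache(cache_path):
    """Load the hypothetical JSON cache, or start empty if it does not exist."""
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as f:
            return json.load(f)
    return {}


def update_hashes(paths, cache_path):
    """Rehash only files whose size or mtime changed; return (cache, rehashed)."""
    cache = load_cache(cache_path)
    rehashed = []
    for path in paths:
        st = os.stat(path)
        entry = cache.get(path)
        unchanged = (
            entry is not None
            and entry["size"] == st.st_size
            and entry["mtime_ns"] == st.st_mtime_ns
        )
        if unchanged:
            continue  # reuse the stored digest, no re-checksum needed
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        cache[path] = {"size": st.st_size, "mtime_ns": st.st_mtime_ns, "digest": digest}
        rehashed.append(path)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    return cache, rehashed
```

On a second run over unchanged data the `rehashed` list comes back empty, which mirrors why repeated runs with a hashfile are cheap.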
Requires kernel v3.13 or later.
It's in the Arch extra package repository.