Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. When given a list of files it will hash their contents on an extent by extent basis and compare those hashes to each other, finding and categorizing extents that match each other. Optionally, a per-block hash can be applied for further duplication lookup. When given the -d option, duperemove will submit those extents for deduplication using the Linux kernel FIDEDUPRANGE ioctl, which only applies to btrfs and xfs.
Duperemove can store the hashes it computes in a 'hashfile'. If given an existing hashfile, duperemove will only compute hashes for those files which have changed since the last run. Thus you can run duperemove repeatedly on your data as it changes, without having to re-checksum unchanged data.
Requrires kernel v3.13 or later.
It's in the Arch extra package repository.