You may have a lot of duplicate images on your hard drives. Exact duplicates are easy to find and delete with fdupes, but what about images that differ in size, are cropped, or are just slightly different visually?
findimagedupes is a command-line tool that finds visually similar images, even if they differ in resolution and file size. A threshold setting defines how similar two images have to be to count as a match.
Install findimagedupes with
apt-get install findimagedupes
The simplest form to run is
findimagedupes -R ~/images
This finds duplicate images recursively in ~/images and all of its subdirectories.
A more complete form which I generally use is
findimagedupes -v=fp -R -f=fp_data ~/images
- -v : verbose mode. Use fp to see the fingerprint of each file, or md5 to see its MD5 hash. The -v=LIST shown in the command help doesn’t work if typed literally; LIST seems to be a placeholder, and only -v=md5 or -v=fp worked for me.
- -R : recursively search all subdirectories
- -f : the file that each image's fingerprint is written to and read from. Especially useful for large directories with lots of images.
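As a rough sketch of how the -f file from the list above might be used in a script, the snippet below reports whether a fingerprint file already exists before starting a scan. It assumes a POSIX shell; the function names fp_status and scan_images are made up for this example and are not part of findimagedupes.

```shell
#!/bin/sh
# fp_status FILE: print "reuse" if FILE exists and is non-empty, else "build".
# (hypothetical helper, only inspects the file; it does not read fingerprints)
fp_status() {
    if [ -s "$1" ]; then
        echo "reuse"
    else
        echo "build"
    fi
}

# scan_images DIR [FP_FILE]: scan DIR recursively, keeping fingerprints in FP_FILE.
# FP_FILE defaults to fp_data in the current directory, as in the commands above.
scan_images() {
    dir=$1
    fp_file=${2:-fp_data}
    echo "fingerprint file $fp_file: $(fp_status "$fp_file")" >&2
    findimagedupes -R -f="$fp_file" "$dir"
}

# Usage: scan_images ~/images fp_data
```

Keeping the fingerprint file in a fixed location means every later scan of the same directory can pick it up again.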
Generally I let it run for a while the first time so that all the files are scanned and their fingerprints are saved in fp_data. On the next run, the program reads the fp_data file and runs much faster.
To use a photo viewer like feh, use the -p switch with the full path of the program. This is what you should do if you want to see each set of similar images and delete one or more of them. feh lets you navigate within each set, delete images, and then move on to the next set.
-p doesn’t work with a bare program name, even if the program is on your system path. You always have to specify the full path of the program.
findimagedupes -v=fp -R -f=fp_data -p=/usr/bin/feh ~/images
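Because -p wants a full path, one way to avoid hard-coding it is to let the shell look the viewer up first. The following is a minimal sketch, assuming a POSIX shell; the function names viewer_path and review_dupes are made up for illustration, and feh is only an example viewer.

```shell
#!/bin/sh
# viewer_path NAME: print the full path of program NAME, or fail if not found.
# (hypothetical helper, not part of findimagedupes)
viewer_path() {
    resolved=$(command -v "$1") || return 1
    # command -v prints a bare name for builtins and aliases,
    # so accept only results that are absolute paths.
    case $resolved in
        /*) printf '%s\n' "$resolved" ;;
        *)  return 1 ;;
    esac
}

# review_dupes DIR [VIEWER]: run findimagedupes, passing the viewer by full path.
# VIEWER defaults to feh.
review_dupes() {
    dir=$1
    viewer=$(viewer_path "${2:-feh}") || {
        echo "viewer not found: ${2:-feh}" >&2
        return 1
    }
    findimagedupes -R -f=fp_data -p="$viewer" "$dir"
}

# Usage: review_dupes ~/images feh
```

If the viewer is not installed, review_dupes stops with a message instead of starting a scan that cannot display anything.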
You can also set the threshold with -t. To find only images that are visually identical, set the threshold to 100, for example
findimagedupes -v=fp -R -f=fp_data -p=/usr/bin/feh -t=100 ~/images
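To get a feel for the threshold, a small loop can run the same scan at several values and count how many sets come back. This is only a sketch: it assumes findimagedupes prints one set of similar files per line, and the names is_percent and count_sets are made up for the example. As a simplification, the check only accepts whole numbers from 0 to 100.

```shell
#!/bin/sh
# is_percent VALUE: succeed if VALUE is a whole number from 0 to 100.
# (hypothetical helper, not part of findimagedupes)
is_percent() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -le 100 ]
}

# count_sets DIR THRESHOLD: print the number of output lines at THRESHOLD.
# Assumption: each output line is one set of similar images.
count_sets() {
    is_percent "$2" || { echo "bad threshold: $2" >&2; return 1; }
    findimagedupes -R -f=fp_data -t="$2" "$1" | wc -l
}

# Usage:
# for t in 80 90 100; do
#     echo "threshold $t: $(count_sets ~/images "$t") sets"
# done
```

Since the fingerprints are already stored in fp_data, repeating the scan at different thresholds should be much quicker than the first run.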