If you need to delete duplicate e-mail messages on an IMAP server, take a look at this useful IMAP de-duplicator script:
IMAP de-duplicator – IMAPdedup
As IMAPdedup is a command-line tool (a Python script), it’s particularly useful for:
- automated deletion of duplicates (as it can be called from other scripts)
- very large mailboxes or accounts with many subfolders (as no user intervention is required)
- servers on which you have console/shell access (as you can then run the script on the server itself, which speeds up de-duplication further)
I also found that it deals relatively well with failures (e.g. when a folder is read-only and hence messages can’t be deleted): it simply reports them on the screen and carries on.
Here’s a quick-and-dirty shell script to de-dup the inbox and all subfolders of the specified account:
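#!/bin/sh
# Delete all duplicate messages in all folders of said account.
# Note that we connect through SSL (-x) to the default port.
SERVER="my.server.com"
USER="mylogin"
PASS="mypass"
# List all folders (-l) and de-duplicate them one at a time.
# Reading the list line by line keeps folder names with spaces intact.
imapdedup.py -s "$SERVER" -x -u "$USER" -w "$PASS" -l |
while IFS= read -r folder
do
    # </dev/null stops the inner call from consuming the folder list.
    imapdedup.py -s "$SERVER" -x -u "$USER" -w "$PASS" "$folder" </dev/null
done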
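If the script is called from another script or a scheduled job, it can help to know afterwards whether any folder failed. The following is only a rough sketch of that idea, not part of IMAPdedup: the function name `dedup_folders` and the variables `DEDUP_CMD` and `USER_NAME` are made up for illustration, and it assumes that imapdedup.py signals a failed run with a non-zero exit status, which I have not verified.

```shell
#!/bin/sh
# Sketch: de-duplicate each folder and remember whether any run failed.
# DEDUP_CMD, USER_NAME and dedup_folders are hypothetical names.
DEDUP_CMD="${DEDUP_CMD:-imapdedup.py}"
SERVER="${SERVER:-my.server.com}"
USER_NAME="${USER_NAME:-mylogin}"
PASS="${PASS:-mypass}"

# Read folder names from stdin (one per line) and process each in turn.
# Returns 1 if the command exited non-zero for at least one folder.
# ASSUMPTION: a failed imapdedup.py run yields a non-zero exit status.
dedup_folders() {
    failed=0
    while IFS= read -r folder; do
        # Skip empty lines in the folder list.
        [ -n "$folder" ] || continue
        # </dev/null stops the command from consuming the folder list.
        if ! "$DEDUP_CMD" -s "$SERVER" -x -u "$USER_NAME" -w "$PASS" "$folder" </dev/null; then
            echo "de-dup failed for: $folder" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (commented out so that nothing runs by accident):
# "$DEDUP_CMD" -s "$SERVER" -x -u "$USER_NAME" -w "$PASS" -l | dedup_folders
```

A calling script can then test the exit status of the pipeline and, for example, send a notification when it is non-zero.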
If you only have to de-duplicate messages in a small folder, you could also use the following de-duplication add-on for Mozilla Thunderbird:
Remove Duplicate Messages Add-on for Thunderbird
Note, however, that the ‘Remove Duplicate Messages’ add-on is intended for interactive use only, not for batch processing. I also noticed that it fails to clean big mail folders (e.g. ones containing 50,000 messages).