Nikos Papagiannopoulos committed

DuplicateEmailFinder is a CLI application built with PHP that finds
duplicate email messages (files) in a Maildir directory structure.

Features:
- Maildir directory structure validation
- Duplicate email detection based on rules,
  e.g. comparing email addresses and dates.
  Existing rules are:
  - attachment content
  - attachment filename
  - attachment mimetype
  - body html
  - body text
  - cc email
  - cc name
  - date
  - from email
  - from name
  - to email
  - to name
- Output duplicate emails
- Output as PHP array
- Option to use cache file, which speeds up the process
- Option to output to a file
- Option to output only a list of email file paths
- Filters to exclude the first/last duplicate email from the output
- Option to delete the duplicate emails
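
To illustrate what rule-based detection means, here is a minimal shell sketch that treats two messages as duplicates when their `From` and `Date` headers match. It is not the tool's PHP implementation. The function names `header_value` and `list_duplicates` are hypothetical, and the sketch ignores folded (multi-line) headers and file names containing newlines.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: NOT how DuplicateEmailFinder is implemented.
# Builds a comparison key from the From and Date headers of each message.

# header_value FILE NAME
# Prints the value of the first NAME header found in the header block.
header_value() {
  awk -v name="$2" '
    $0 == "" || $0 == "\r" { exit }   # a blank line ends the header block
    tolower(substr($0, 1, length(name) + 1)) == (tolower(name) ":") {
      sub(/^[^:]*:[ \t]*/, "")        # drop the header name and separator
      sub(/\r$/, "")                  # tolerate CRLF line endings
      print
      exit
    }
  ' "$1"
}

# list_duplicates MAILDIR
# Prints every message whose From+Date key was already seen, so the first
# message of each group (in byte-wise path order) is left out.
list_duplicates() {
  find "$1" -type f \( -path '*/cur/*' -o -path '*/new/*' \) | LC_ALL=C sort |
  while IFS= read -r file; do
    printf '%s\t%s\t%s\n' \
      "$(header_value "$file" From)" "$(header_value "$file" Date)" "$file"
  done |
  # Skip messages missing either header, then report repeated keys.
  awk -F '\t' '$1 != "" && $2 != "" && seen[$1 FS $2]++ { print $3 }'
}
```

Adding more fields to the key makes the match stricter, which is roughly what enabling more rules would be expected to do.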

Usage:

```
bin/duplicate-email-finder --help

Description:
  Find duplicate Emails in a Maildir directory.

Usage:
  duplicate-email-finder [options] [--] <maildir>

Arguments:
  maildir                        Maildir path

Options:
      --no-cache                 Disables the use of the cache and refreshes the cache
      --files                    Outputs only email file paths
      --exclude-first            Exclude first duplicate email file path from output
      --exclude-last             Exclude last duplicate email file path from output
      --output=OUTPUT            Save output to a file instead of displaying it
      --delete                   Delete the selected emails
      --use-attachment-content   Add E-mail Attachment Contents to the comparison criteria
      --use-attachment-filename  Add E-mail Attachment Filenames to the comparison criteria
      --use-attachment-mimetype  Add E-mail Attachment Mimetypes to the comparison criteria
      --use-body-html            Add E-mail Body HTML to the comparison criteria
      --use-body-text            Add E-mail Body Text to the comparison criteria
      --use-cc-email             Add E-mail CC Addresses to the comparison criteria
      --use-cc-name              Add E-mail CC Names to the comparison criteria
      --use-date                 Add E-mail Date to the comparison criteria
      --use-from-email           Add E-mail From Addresses to the comparison criteria
      --use-from-name            Add E-mail From Names to the comparison criteria
      --use-to-email             Add E-mail to-address to the comparison criteria
      --use-to-name              Add E-mail to-name to the comparison criteria
  -h, --help                     Display help for the given command. When no command is given display help for the duplicate-email-finder command
  -q, --quiet                    Do not output any message
  -V, --version                  Display this application version
      --ansi                     Force ANSI output
      --no-ansi                  Disable ANSI output
  -n, --no-interaction           Do not ask any interactive question
  -v|vv|vvv, --verbose           Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug
```

You can use the Maildir in `tests/fixtures/Maildir` to test the command.

Example:
```bash
devilbox@php-7.3.16 in /shared/httpd/DuplicateEmailFinder $ bin/duplicate-email-finder  tests/fixtures/Maildir --files --exclude-first
tests/fixtures/Maildir/.Personal/cur/1601332792.M790307P28397.Somehost,S=6682,W=6777:2,S
tests/fixtures/Maildir/.Personal/cur/1601332792.M661090P28397.Somehost,S=4334,W=4401:2,S
```
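
The following sketch saves the file list for review instead of printing it. It uses only options from the help text (`--files`, `--exclude-first`, `--output`). The names `DEF_BIN` and `preview_duplicates` are hypothetical, and it assumes the saved file holds one path per line, like the console output above.

```bash
#!/usr/bin/env bash
# Sketch only: wraps the documented options --files, --exclude-first and --output.
# DEF_BIN and preview_duplicates are hypothetical names used for this example.
DEF_BIN="${DEF_BIN:-bin/duplicate-email-finder}"

# preview_duplicates MAILDIR LISTFILE
# Saves the duplicate file paths (all but the first of each group) to LISTFILE
# and reports how many paths were written.
preview_duplicates() {
  "$DEF_BIN" "$1" --files --exclude-first --output="$2" || return 1
  # Assumption: the saved list contains one path per line.
  printf '%s duplicate file(s) listed in %s\n' "$(grep -c '' "$2")" "$2"
}
```

For example, `preview_duplicates tests/fixtures/Maildir duplicates.txt` would write the list to `duplicates.txt`, which can be inspected before running the command again with `--delete`.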