Integrate --debug-imap option into yet-to-be-implemented -vv switch? 
I had the idea to provide separate debugging info levels anyway, see --debug
below.

Gracefully close IMAP connection upon unexpected error (currently archivemail
just terminates).
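
A rough sketch of the idea, not current archivemail code: wrap the IMAP work in
try/finally so that logout() is always attempted.  FakeConnection is a made-up
stand-in for imaplib.IMAP4 (whose logout() method is real) so the sketch runs
offline; with_clean_logout() is a hypothetical helper name.

```python
class FakeConnection(object):
    """Stand-in for imaplib.IMAP4, only here to keep the sketch runnable offline."""

    def __init__(self):
        self.logged_out = False

    def logout(self):
        self.logged_out = True


def with_clean_logout(conn, work):
    """Run work(conn) and always attempt LOGOUT, even if work() raises."""
    try:
        return work(conn)
    finally:
        try:
            conn.logout()
        except Exception:
            # The connection may already be dead; nothing more we can do.
            pass


def failing_work(conn):
    raise RuntimeError("unexpected error")


conn = FakeConnection()
try:
    with_clean_logout(conn, failing_work)
except RuntimeError:
    # The error still propagates, but the connection was closed first.
    pass
```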

LOCKING & Co:
* switch to or add fcntl() locking; when combining with flock, be careful not to
  break on Solaris/FreeBSD, where flock() is just fcntl() so using both would
  block. 
* Ensure that we don't accidentally lose fcntl locks when closing some file
  descriptor; this applies even for flock, since again, it might be emulated
  with fcntl.
* Block signals while writing changed mailbox back. Also, we probably shouldn't
  use shutil.copy2() for this; at least we have to handle symlinked targets in a
  sane way, see Debian bug #349068.  mbox_sync_mailbox() in the mutt code might
  be an example how to write back a changed mailbox. 
* Double-check the entire locking code. 
* FIXME: no locking at all is applied to the archives. 
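
For the fcntl() item above, a minimal sketch of what this could look like with
Python's standard fcntl module (Unix only).  The helper names try_lock_fcntl()
and unlock_fcntl() are invented for illustration.  Note that the temporary
descriptor is closed before the lock is taken, because closing any descriptor
for the file would drop the process's fcntl lock on it.

```python
import fcntl
import os
import tempfile


def try_lock_fcntl(fileobj):
    """Try to take an exclusive fcntl lock without blocking; True on success."""
    try:
        # lockf() is implemented on top of fcntl(2) record locks.
        fcntl.lockf(fileobj.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        return False
    return True


def unlock_fcntl(fileobj):
    fcntl.lockf(fileobj.fileno(), fcntl.LOCK_UN)


fd, path = tempfile.mkstemp()
os.close(fd)  # closed before locking, so this close cannot drop our lock
try:
    with open(path, "r+") as mbox:
        got_lock = try_lock_fcntl(mbox)
        unlock_fcntl(mbox)
finally:
    os.remove(path)
```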

Seems like existing archives are not read or validated in any way.  New archive
data is blindly appended...  Probably okay, but should be documented. 

I don't like the representation of mboxes as Python objects, it's unclean.
E.g. the archive mbox object is a subclass of mailbox.UnixMailbox, but this
superclass is not used in any way, it's not even initialized.  Makes sense,
since we need write-only access, and UnixMailbox is read-only.  But should it be
a subclass at all, then?

Also, the original mbox and the "retain mbox" are separate objects.  This is
debatable.  E.g. the finalise() method of the retain mbox overwrites the
existing original mbox, and I feel that's really unclean.

IMAP SEARCH BEFORE disregards time and timezone information.  This should at
least be documented.  E.g. I've found that '-d 0' didn't match all messages in
an IMAP mailbox.  This is because the SEARCH key is (BEFORE 14-Nov-2007) on 14
November, not matching messages that arrived today.  (This problem is probably 
fixed for most use cases by the --all option.) 
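
To illustrate the problem: a sketch of how a date-only SEARCH key comes out,
assuming the cutoff is simply "now minus N days".  imap_before_key() is a
hypothetical helper, not necessarily how archivemail builds the key.  The month
names are hard-coded to avoid locale-dependent strftime output.

```python
import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def imap_before_key(days, now):
    """Build a SEARCH BEFORE key; the time of day in 'now' is simply dropped."""
    cutoff = now - datetime.timedelta(days=days)
    return "(BEFORE %d-%s-%d)" % (cutoff.day, MONTHS[cutoff.month - 1],
                                  cutoff.year)


# With '-d 0' on the afternoon of 14 November 2007:
key = imap_before_key(0, datetime.datetime(2007, 11, 14, 15, 30))
```

Here key is "(BEFORE 14-Nov-2007)", so a message that arrived at 09:00 the same
day is not matched although it is older than the cutoff time.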

Document mbox format issues: link to
http://homepages.tesco.net/~J.deBoynePollard/FGA/mail-mbox-formats.html, 
qmail mbox manpage, Debian manpage, RFC 4155.  Document what mbox format we can
read, and what we write. 
FIXME: we cannot yet parse RFC 2822 addr-spec stuff like quoted local-parts in
return-path addresses.

Minor annoyance: when a From_ line is generated, guess_delivery_time() reports
the used date header a second time. 

Check sf.net and Debian BTS for new bugs.  Again.

IMAP: ensure mailbox archives are properly named.  Currently IMAP folder names
are mapped like this:

  IMAP URL    |  resulting mbox_archive
  ------------+------------------------
  test.box    |  test.box_archive.gz
  test/box    |  box_archive.gz
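
The table is what you get if the mapping is effectively "basename plus suffix".
A sketch under that assumption (archive_name_for_folder() is a made-up name, and
the real code may differ) shows why this is a problem: folders with the same
last path component end up in the same archive.

```python
import posixpath


def archive_name_for_folder(folder, suffix="_archive", compress=True):
    """Map an IMAP folder name to an archive file name (assumed behaviour)."""
    # Only the last '/'-separated component survives; '.' is left alone.
    name = posixpath.basename(folder) + suffix
    if compress:
        name += ".gz"
    return name


names = [archive_name_for_folder(f) for f in ("test.box", "test/box", "other/box")]
```

With this mapping, "test/box" and "other/box" both yield box_archive.gz.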


Implement --include-draft.  But before, think about it again.  (This is feature
request #1569305.)

Create temporary archive mbox in /tmp only if we don't have write permissions in
the mbox directory.  Currently, if /tmp resides on another filesystem, we have
to copy the entire box to its destination. 
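
A possible shape for this, as a sketch only: try the archive's own directory
first and fall back to the system temp directory.  make_temp_archive() and the
"archivemail-" prefix are hypothetical; os.access() and tempfile.mkstemp() are
standard library calls.

```python
import os
import tempfile


def make_temp_archive(final_path):
    """Create a temp file next to final_path if possible, else in the temp dir."""
    target_dir = os.path.dirname(os.path.abspath(final_path))
    if os.access(target_dir, os.W_OK | os.X_OK):
        # Same filesystem as the destination, so a later rename() is cheap.
        tmp_dir = target_dir
    else:
        tmp_dir = tempfile.gettempdir()
    return tempfile.mkstemp(prefix="archivemail-", dir=tmp_dir)
```

Note that os.access() checks the real uid, which matters if we seteuid().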

Implement a fallback if an IMAP server doesn't support SEARCH. (Ouch!)

Add IMAP tests to the testsuite (upload test messages with IMAP "APPEND
date-string"). 

Try to port archivemail to email.message and the new mailboxes in Python 2.5.
Are these flexible enough for our needs?

Add recursive archiving of mail subfolders? 

Work out what we want with respect to multiple selection criteria.
Some combinations make sense, but this easily gets too complex, if only because
adding all the options is a hassle.  Hm.

Reject patch #1036022 "Added option to inverse date compare" after cooling down
because the patch is both stupid (copy+paste code) and broken.  Don't see why
anyone should want this/we should support it. 
If this is reasonable *at all*, I think we'd better go for all the complexity
to honour _two_ cut off dates (see Debian bug "#184124: archivemail: -D and -d
should not be incompatible", which is a comparably half-baked thought). </rant>

Add --debug or -vv switch, and move the printing of diagnostic info for each
message to --debug.  

Perhaps add some more nice stuff like printing of subject, sender... 
See tracker #868714 "added stats option to archivemail", which has a point.
Message-Ids are useful for diagnosis, but not very nice to read for humans. 

Regarding the --archive-name option:
* Do we want this?  Probably, it adds flexibility.
* I think we should expand date format strings like we do with --suffix
* Hmm, --output-dir overrides os.path.dirname(archive_name)...
  If no output_dir is given, use $PWD like we do for IMAP, or require -o?
* Provide short option -a?  Not sure. 
* The patch in #905657 is not bad.  The Debian package also has a custom
  --archive-name option, but with a worse implementation.
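
For the date format item above, the expansion itself could be as small as this
sketch.  expand_archive_name() is a hypothetical helper, and which date gets
passed in (cutoff date, current date) is left open here.

```python
import time


def expand_archive_name(template, when):
    """Expand strftime-style conversions such as %Y and %m in an archive name."""
    return time.strftime(template, when)


# 'when' is a struct_time; here built from a fixed date for illustration.
expanded = expand_archive_name("inbox-%Y-%m",
                               time.strptime("2007-11-14", "%Y-%m-%d"))
```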

Be a nicer citizen with respect to mailbox locking. 

Perhaps prune/shorten IMAP mailbox URLs in messages?
They may be quite long and may contain the sensitive password.
Also shows up in the process list... 
Perhaps find a clean, lean replacement for all that clutter in the IMAP urls.
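
For messages at least, masking the password is cheap.  A sketch, assuming URLs
of the form imap://user:password@server/folder (or imaps://) with an unquoted
user name; redact_imap_url() is a made-up name.  This does not help with the
process list, and a folder name containing '@' would confuse it.

```python
import re


def redact_imap_url(url):
    """Replace the password part of an IMAP URL with asterisks."""
    # Greedy '.*@' runs up to the last '@', so passwords containing '@' work.
    return re.sub(r"^(imaps?://[^:/@]+):.*@", r"\1:***@", url)


shown = redact_imap_url("imap://user:secret@example.org/INBOX")
```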

Require --output-dir for IMAP archiving?  Otherwise we just drop the archive in
the current working directory.

Switch to fcntl(2) locking?  That would be NFS-safe.  Perhaps make the locking
method configurable?

Check all items below, which are from the original author. :-)

.archivemailrc support

Specify an option to not seteuid() when run as root?

When you get a file-not-found in the 6th mailbox of 10, it aborts the whole
run. Better to fail gracefully and keep going.
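
Roughly like this sketch, where archive_all() and the archive_one callback are
hypothetical names: catch the error per mailbox, report it, and carry on.

```python
import sys


def archive_all(mailboxes, archive_one):
    """Archive each mailbox in turn; return the names that failed."""
    failed = []
    for name in mailboxes:
        try:
            archive_one(name)
        except (IOError, OSError) as err:
            sys.stderr.write("skipping %s: %s\n" % (name, err))
            failed.append(name)
    return failed


def fake_archive_one(name):
    # Stand-in for the real per-mailbox work; pretends one mailbox is missing.
    if name == "missing":
        raise IOError("No such file or directory")


failed = archive_all(["a", "missing", "b"], fake_archive_one)
```

The exit status could then be non-zero whenever the returned list is non-empty.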

Think about the best way to specify the names of created archives, possibly
with an --archive-name option.

Add more tests (see top of test_archivemail.py)

We need some better checking to see if we are really looking at a valid
mbox-format mailbox.
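
A very cheap first check, as a sketch: an mbox is either empty or starts with a
"From " line.  looks_like_mbox() is a made-up name, and this is only a
plausibility test, not real validation of the whole file.

```python
def looks_like_mbox(path):
    """Return True if the file is empty or its first line is a From_ line."""
    with open(path, "rb") as mbox:
        first_line = mbox.readline()
    return first_line == b"" or first_line.startswith(b"From ")
```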

Lock any original .gz files 
- is this necessary?

Check for symlink attacks for tempfiles (although we don't use /var/tmp)

Add an option to not cut threads.

Add MMDF mailbox support

Add Babyl mailbox support

Add option to archive depending on mailbox size threshold 
- is this a good idea?

Add option to archive depending on number of messages
- is this a good idea?