author | Eric S. Raymond <esr@thyrsus.com> | 1997-07-08 21:23:54 +0000 |
---|---|---|
committer | Eric S. Raymond <esr@thyrsus.com> | 1997-07-08 21:23:54 +0000 |
commit | f914515cdb80edb1627cb6a6a95ae047e33a65ab | |
tree | 32fba8e7751707d5adbb5512ef66db422bacb7aa /design-notes.html | |
parent | c93b1bc93b6827eedaaa94751efeff1fafd5e9d5 | |
Initial revision
svn path=/trunk/; revision=1161
Diffstat (limited to 'design-notes.html')
-rw-r--r-- | design-notes.html | 382 |
1 files changed, 382 insertions, 0 deletions
diff --git a/design-notes.html b/design-notes.html
new file mode 100644
index 00000000..8b08efe8
--- /dev/null
+++ b/design-notes.html
@@ -0,0 +1,382 @@
+<!doctype HTML public "-//W3O//DTD W3 HTML 2.0//EN">
+<HTML>
+<HEAD>
+<TITLE>Design notes on fetchmail</TITLE>
+<link rev=made href=mailto:esr@snark.thyrsus.com>
+<meta name="description" content="Design notes on fetchmail.">
+<meta name="keywords" content="fetchmail, POP, POP2, POP3, IMAP, remote mail">
+</HEAD>
+<BODY>
+<H1><center>Design Notes On Fetchmail</center></H1>
+
+Back to <A HREF="index.html">Fetchmail Home Page</A>.
+<hr>
+
+These notes are for the benefit of future hackers and maintainers.
+The following sections are both functional and narrative, read from
+beginning to end.<P>
+
+<H1>History</H1>
+
+A direct ancestor of the fetchmail program was originally authored
+(under the name popclient) by Carl Harris <ceharris@mal.com>. I took
+over development in June 1996 and subsequently renamed the program
+`fetchmail' to reflect the addition of IMAP support. In early
+November 1996 Carl officially ended support for the last popclient
+versions.<P>
+
+Before accepting responsibility for the popclient sources from Carl, I
+had investigated and used and tinkered with every other UNIX
+remote-mail forwarder I could find, including fetchpop1.9,
+PopTart-0.9.3, get-mail, gwpop, pimp-1.0, pop-perl5-1.2, popc,
+popmail-1.6 and upop. My major goal was to get a header-rewrite
+feature like fetchmail's working so I wouldn't have reply problems
+anymore.<P>
+
+Despite having done a good bit of work on fetchpop1.9, when I found
+popclient I quickly concluded that it offered the solidest base for
+future development. I was convinced of this primarily by the presence
+of multiple-protocol support. The competition didn't do
+POP2/RPOP/APOP, and I was already having vague thoughts of maybe
+adding IMAP. (This would advance two other goals: learn IMAP and get
+comfortable writing TCP/IP client software.)<P>
+
+Until popclient 3.05 I was simply following out the implications of
+Carl's basic design. He already had daemon.c in the distribution,
+and I wanted daemon mode almost as badly as I wanted the header
+rewrite feature. The other things I added were bug fixes or
+minor extensions.<P>
+
+After 3.1, when I put in SMTP-forwarding support (more about this
+below) the nature of the project changed -- it became a
+carefully-thought-out attempt to render obsolete every other program
+in its class. The name change quickly followed.<P>
+
+<H1>The rewrite option</H1>
+
+RFC 1123 stipulates that MTAs ought to canonicalize the addresses of
+outgoing mail so that From:, To:, Cc:, Bcc: and other address headers
+contain only fully qualified domain names. Failure to do so can break
+the reply function on many mailers.<P>
+
+This problem only becomes obvious when a reply is generated on a
+machine different from where the message was delivered. The
+two machines will have different local username spaces, potentially
+leading to misrouted mail.<P>
+
+Most MTAs (and sendmail in particular) do not canonicalize address headers
+in this way (violating RFC 1123). Fetchmail therefore has to do it. This
+is the first feature I added to the ancestral popclient.<P>
+
+<H1>Reorganization</H1>
+
+The second thing I did was reorganize and simplify popclient a lot. Carl
+Harris's implementation was very sound, but exhibited a kind of
+unnecessary complexity common to many C programmers. He treated the
+code as central and the data structures as support for the code. As a
+result, the code was beautiful but the data structure design ad-hoc
+and rather ugly (at least to this old LISP hacker).<P>
+
+I was able to improve matters significantly by reorganizing most of the
+program around the `query' data structure and eliminating a bunch of
+global context. This especially simplified the main sequence in
+fetchmail.c and was critical in enabling the daemon mode changes.<P>
+
+<H1>IMAP support and the method table</H1>
+
+The next step was IMAP support. I initially wrote the IMAP code
+as a generic query driver and a method table. The idea was to have
+all the protocol-independent setup logic and flow of control in the
+driver, and the protocol-specific stuff in the method table.<P>
+
+Once this worked, I rewrote the POP3 code to use the same organization.
+The POP2 code kept its own driver for a couple more releases, until
+I found sources of a POP2 server to test against (the breed seems
+to be nearly extinct).<P>
+
+The purpose of this reorganization, of course, is to trivialize
+the development of support for future protocols as much as possible.
+All mail-retrieval protocols have to have pretty similar logical
+design by the nature of the task. By abstracting out that common
+logic and its interface to the rest of the program, both the common
+and protocol-specific parts become easier to understand.<P>
+
+Furthermore, many kinds of new features can instantly be supported
+across all protocols by modifying the one driver module.<P>
+
+<H1>Implications of smtp forwarding</H1>
+
+The direction of the project changed radically when Harry Hochheiser
+sent me his scratch code for forwarding fetched mail to the SMTP port.
+I realized almost immediately that a reliable implementation of this
+feature would make all the other delivery modes obsolete.<P>
+
+Why mess with all the complexity of configuring an MDA or setting up
+lock-and-append on a mailbox when port 25 is guaranteed to be there on
+any platform with TCP/IP support in the first place? Especially when
+this means retrieved mail is guaranteed to look like normal sender-
+initiated SMTP mail, which is really what we want anyway.<P>
+
+Clearly, the right thing to do was (1) hack SMTP forwarding support
+into the generic driver, (2) make it the default mode, and (3) eventually
+throw out all the other delivery modes. <P>
+
+I hesitated over step 3 for some time, fearing to upset long-time
+popclient users dependent on the alternate delivery mechanisms. In
+theory, they could immediately switch to .forward files or their
+non-sendmail equivalents to get the same effects. In practice the
+transition might have been messy.<P>
+
+But when I did it (see the NEWS note on the great options massacre)
+the benefits proved huge. The cruftiest parts of the driver code
+vanished. Configuration got radically simpler -- no more grovelling
+around for the system MDA and user's mailbox, no more worries about
+whether the underlying OS supports file locking.<P>
+
+Also, the only way to lose mail vanished. If you specified localfolder
+and the disk got full, your mail got lost. This can't happen with
+SMTP forwarding because your SMTP listener won't return OK unless
+the message can be spooled or processed.<P>
+
+Also, performance improved (though not so you'd notice it in a single
+run). Another not insignificant benefit of this change was that the
+manual page got a lot simpler.<P>
+
+Later, I had to bring --mda back in order to allow handling of some
+obscure situations involving dynamic SLIP. But I found a much simpler
+way to do it.<P>
+
+The moral? Don't hesitate to throw away superannuated features when
+you can do it without loss of effectiveness. I tanked a couple I'd
+added myself and have no regrets at all. As Saint-Exupery said,
+"Perfection [in design] is achieved not when there is nothing more to
+add, but rather when there is nothing more to take away." This
+program isn't perfect, but it's trying.<P>
+
+<H1>The most-requested features that I will never add, and why not:</H1>
+
+<H2>1. Password encryption in .fetchmailrc</H2>
+
+The reason there's no facility to store passwords encrypted in the
+.fetchmailrc file is because this doesn't actually add protection.<P>
+
+Anyone who's acquired the 0600 permissions needed to read your
+.fetchmailrc file will be able to run fetchmail as you anyway -- and
+if it's your password they're after, they'd be able to rip the
+necessary decoder out of the fetchmail code itself to get it.<P>
+
+All .fetchmailrc encryption would do is give a false sense of
+security to people who don't think very hard.<P>
+
+<H2>2. Truly concurrent queries to multiple hosts</H2>
+
+Occasionally I get a request for this on "efficiency" grounds. These
+people aren't thinking either. True concurrency would do nothing to lessen
+fetchmail's total IP volume. The best it could possibly do is change the
+usage profile to shorten the duration of the active part of a poll cycle
+at the cost of increasing its demand on IP volume per unit time.<P>
+
+If one could thread the protocol code so that fetchmail didn't block
+on waiting for a protocol response, but rather switched to trying to
+process another host query, one might get an efficiency gain (close to
+constant loading at the single-host level).<P>
+
+Fortunately, I've only seldom seen a server that incurred significant
+wait time on an individual response. I judge the gain from this not
+worth the hideous complexity increase it would require in the code.<P>
+
+<H1>Multidrop and alias handling</H1>
+
+I decided to add the multidrop support partly because some users were
+clamoring for it, but mostly because I thought it would shake bugs out
+of the single-drop code by forcing me to deal with addressing in full
+generality. And so it proved.<P>
+
+There are three important aspects of the features for handling
+multiple-drop aliases and mailing lists which future hackers should be
+careful to preserve.<P>
+
+<OL>
+<LI>
+ The logic path for single-recipient mailboxes doesn't involve header
+ parsing or DNS lookups at all. This is important -- it means the code
+ for the most common case can be much simpler and more robust.<P>
+
+<LI>
+ The multidrop handling does <EM>not</EM> rely on doing the equivalent of passing
+ the message to sendmail -oem -t. Instead, it explicitly mines members
+ of a specified set of local usernames out of the header.<P>
+
+<LI>
+ We do <EM>not</EM> attempt delivery to multidrop mailboxes in the presence of DNS
+ errors. Before each multidrop poll we probe DNS to see if we have a
+ nameserver handy. If not, the poll is skipped. If DNS crashes during a
+ poll, the error return from the next nameserver lookup aborts message
+ delivery and ends the poll. The daemon mode will then quietly spin until
+ DNS comes up again, at which point it will resume delivering mail.<P>
+</OL>
+
+When I designed this support, I was terrified of doing anything that could
+conceivably cause a mail loop (you should be too). That's why the code
+as written can only append <EM>local</EM> names (never @-addresses) to the
+recipients list.<P>
+
+The code in mxget.c is nasty, no two ways about it. But it's utterly
+necessary; there are a lot of MX pointers out there. It really ought
+to be a (documented!) entry point in the bind library.<P>
+
+<H1>DNS error handling</H1>
+
+Fetchmail's behavior on DNS errors is to suppress forwarding and
+deletion of the individual message that each occurs in, leaving it
+queued on the server for retrieval on a subsequent poll. The
+assumption is that DNS errors are transient, due to temporary server
+outages.<P>
+
+Unfortunately this means that if a DNS error is permanent a message
+can be perpetually stuck in the server mailbox. We've had a couple
+bug reports of this kind due to subtle RFC822 parsing errors in the fetchmail
+code that resulted in impossible things getting passed to the DNS lookup
+routines.<P>
+
+Alternative ways to handle the problem: ignore DNS errors (treating
+them as a non-match on the mailserver domain), or forward messages
+with errors to fetchmail's invoking user in addition to any other
+recipients. These would fit an assumption that DNS lookup errors are
+likely to be permanent problems associated with an address.<P>
+
+<H1>Lessons learned</H1>
+
+<H3>1. Server-side state is essential</H3>
+
+The person(s) responsible for removing LAST from POP3 deserve to suffer.
+Without it, a client has no way to know which messages in a box have been
+read by other means, such as an MUA running on the server.<P>
+
+The POP3 UID feature described in RFC1725 to replace LAST is
+insufficient. The only problem it solves is tracking which messages
+have been read <EM>by this client</EM> -- and even that requires
+tricky, fragile implementation.<P>
+
+The underlying lesson is that maintaining accessible server-side
+`seen' state bits associated with Status headers is indispensable in a
+Unix/RFC822 mail server protocol. IMAP gets this right.<P>
+
+<H3>2. Readable text protocol transactions are a Good Thing</H3>
+
+A nice thing about the general class of text-based protocols that SMTP,
+POP2, POP3, and IMAP belong to is that client/server transactions are
+easy to watch and transaction code correspondingly easy to debug. Given
+a decent layer of socket utility functions (which Carl provided) it's
+easy to write protocol engines and not hard to show that they're working
+correctly.<P>
+
+This is an advantage not to be despised! Because of it, this project has
+been interesting and fun -- no serious or persistent bugs, no long
+hours spent looking for subtle pathologies.<P>
+
+<H3>3. IMAP is a Good Thing.</H3>
+
+If there were a standard IMAP equivalent of the POP3 APOP validation,
+POP3 would be completely obsolete.<P>
+
+<H3>4. SMTP is the Right Thing</H3>
+
+In retrospect it seems clear that this program (and others like it)
+should have been designed to forward via SMTP from the beginning.
+This lesson may be applicable to other Unix programs that now call the
+local MDA/MTA as a program.<P>
+
+<H3>5. Syntactic noise can be your friend</H3>
+
+The optional `noise' keywords in the rc file syntax started out as
+a late-night experiment. The English-like syntax they allow is
+considerably more readable than the traditional terse keyword-value
+pairs you get when you strip them all out. I think there may be a
+wider lesson here.<P>
+
+<H1>Motivation and validation</H1>
+
+It is truly written: the best hacks start out as personal solutions to
+the author's everyday problems, and spread because the problem turns
+out to be typical for a large class of users. So it was with Carl Harris
+and the ancestral popclient, and so with me and fetchmail.<P>
+
+It's gratifying that fetchmail has become so popular. Until just before
+1.9 I was designing strictly to my own taste. The multi-drop mailbox
+support and the new --limit option were the first features to go in that
+I didn't need myself.<P>
+
+By 1.9, four months after I started hacking on popclient and a month
+after the first fetchmail release, there were literally a hundred
+people on the fetchmail-friends contact list. That's pretty powerful
+motivation. And they were a good crowd, too, sending fixes and
+intelligent bug reports in volume. A user population like that is
+a gift from the gods, and this is my expression of gratitude.<P>
+
+The beta testers didn't know it at the time, but they were also the
+subjects of a sociological experiment. The results are described in
+my paper, <cite>The Cathedral And The Bazaar</cite>, available on the
+<a href="http://www.ccil.org/~esr/fetchmail">Fetchmail home page</a>.
+
+<H1>Credits</H1>
+
+Special thanks go to Carl Harris, who built a good solid code base
+and then tolerated me hacking it out of recognition. And to Harry
+Hochheiser, who gave me the idea of the SMTP-forwarding delivery mode.<P>
+
+Other significant contributors to the code have included Dave Bodenstab
+(error.c code and --syslog), George Sipe (--monitor and --interface),
+Gordon Matzigkeit (netrc.c), Al Longyear (UIDL support), and Nalin
+Dahyabhai (Kerberos V4 support).<P>
+
+<H1>Conclusion</H1>
+
+At this point, the fetchmail code appears to be pretty stable.
+It will probably undergo substantial change only if and when support
+for a new retrieval protocol or authentication method is added.<P>
+
+<H1>Relevant RFCs</H1>
+
+Not all of these describe standards explicitly used in fetchmail, but they
+all shaped the design in one way or another.<P>
+
+<DL>
+<DT>RFC821<DD> SMTP protocol
+<DT>RFC822<DD> Mail header format
+<DT>RFC937<DD> Post Office Protocol - Version 2
+<DT>RFC974<DD> MX routing
+<DT>RFC976<DD> UUCP mail format
+<DT>RFC1081<DD> Post Office Protocol - Version 3
+<DT>RFC1123<DD> Host requirements (modifies 821, 822, and 974)
+<DT>RFC1176<DD> Interactive Mail Access Protocol - Version 2
+<DT>RFC1203<DD> Interactive Mail Access Protocol - Version 3
+<DT>RFC1225<DD> Post Office Protocol - Version 3
+<DT>RFC1344<DD> Implications of MIME for Internet Mail Gateways
+<DT>RFC1413<DD> Identification server
+<DT>RFC1428<DD> Transition of Internet Mail from Just-Send-8 to 8-bit SMTP/MIME
+<DT>RFC1460<DD> Post Office Protocol - Version 3
+<DT>RFC1521<DD> MIME: Multipurpose Internet Mail Extensions
+<DT>RFC1869<DD> SMTP Service Extensions (ESMTP spec)
+<DT>RFC1652<DD> SMTP Service Extension for 8bit-MIMEtransport
+<DT>RFC1725<DD> Post Office Protocol - Version 3
+<DT>RFC1730<DD> Interactive Mail Access Protocol - Version 4
+<DT>RFC1731<DD> IMAP4 Authentication Mechanisms
+<DT>RFC1732<DD> IMAP4 Compatibility With IMAP2 And IMAP2bis
+<DT>RFC1734<DD> POP3 AUTHentication command
+<DT>RFC1870<DD> SMTP Service Extension for Message Size Declaration
+<DT>RFC1891<DD> SMTP Service Extension for Delivery Status Notifications
+<DT>RFC1893<DD> Enhanced Mail System Status Codes
+<DT>RFC1894<DD> An Extensible Message Format for Delivery Status Notifications
+<DT>RFC1939<DD> Post Office Protocol - Version 3
+<DT>RFC1985<DD> SMTP Service Extension for Remote Message Queue Starting
+<DT>RFC2060<DD> Internet Message Access Protocol - Version 4rev1
+<DT>RFC2061<DD> IMAP4 Compatibility With IMAP2bis
+<DT>RFC2062<DD> Internet Message Access Protocol - Obsolete Syntax
+</DL>
+
+<HR>
+Back to <A HREF="index.html">Fetchmail Home Page</A>.<P>
+<ADDRESS>Eric S. Raymond <A HREF="mailto:esr@thyrsus.com"><esr@snark.thyrsus.com></A></ADDRESS>
+</BODY>
+</HTML>