Forward to Updated design notes Back to Fetchmail Home Page $Date$

Eric S. Raymond's former Design Notes On Fetchmail

These notes are for the benefit of future hackers and maintainers. The following sections are both functional and narrative, read from beginning to end.

History

A direct ancestor of the fetchmail program was originally authored (under the name popclient) by Carl Harris <ceharris@mal.com>. I took over development in June 1996 and subsequently renamed the program `fetchmail' to reflect the addition of IMAP support and SMTP delivery. In early November 1996 Carl officially ended support for the last popclient versions.

Before accepting responsibility for the popclient sources from Carl, I had investigated and used and tinkered with every other UNIX remote-mail forwarder I could find, including fetchpop1.9, PopTart-0.9.3, get-mail, gwpop, pimp-1.0, pop-perl5-1.2, popc, popmail-1.6 and upop. My major goal was to get a header-rewrite feature like fetchmail's working so I wouldn't have reply problems anymore.

Despite having done a good bit of work on fetchpop1.9, when I found popclient I quickly concluded that it offered the solidest base for future development. I was convinced of this primarily by the presence of multiple-protocol support. The competition didn't do POP2/RPOP/APOP, and I was already having vague thoughts of maybe adding IMAP. (This would advance two other goals: learn IMAP and get comfortable writing TCP/IP client software.)

Until popclient 3.05 I was simply following out the implications of Carl's basic design. He already had daemon.c in the distribution, and I wanted daemon mode almost as badly as I wanted the header rewrite feature. The other things I added were bug fixes or minor extensions.

After 3.1, when I put in SMTP-forwarding support (more about this below) the nature of the project changed -- it became a carefully-thought-out attempt to render obsolete every other program in its class. The name change quickly followed.

The rewrite option

MTAs ought to canonicalize the addresses of outgoing non-local mail so that From:, To:, Cc:, Bcc: and other address headers contain only fully qualified domain names. Failure to do so can break the reply function on many mailers. (Sendmail has an option to do this.)

This problem only becomes obvious when a reply is generated on a machine different from where the message was delivered. The two machines will have different local username spaces, potentially leading to misrouted mail.

Most MTAs (and sendmail in particular) do not canonicalize address headers in this way (violating RFC 1123). Fetchmail therefore has to do it. This is the first feature I added to the ancestral popclient.
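
A minimal sketch of the idea, assuming the simplest possible case of one bare address with no RFC 822 comments, quoting, or group syntax. The function name `qualify_address` and its buffer interface are hypothetical and are not taken from the fetchmail sources.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Copy addr into out; if addr has no '@', append "@host" so the
 * address still means something on a machine other than the server.
 * Returns 0 on success, -1 if the result would not fit in out.
 */
int qualify_address(const char *addr, const char *host,
                    char *out, size_t outlen)
{
    int n;

    if (strchr(addr, '@') != NULL)
        n = snprintf(out, outlen, "%s", addr);          /* already qualified */
    else
        n = snprintf(out, outlen, "%s@%s", addr, host); /* bare local name */

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Called with `"esr"` and `"mail.example.com"`, this fills the buffer with `esr@mail.example.com`; an address that already contains an `@` is copied through unchanged.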

Reorganization

The second thing I did was reorganize and simplify popclient a lot. Carl Harris's implementation was very sound, but exhibited a kind of unnecessary complexity common to many C programmers. He treated the code as central and the data structures as support for the code. As a result, the code was beautiful but the da

I maintain an open-source POP and IMAP client called fetchmail.  It is
widely used in the Linux and open-source community, and is probably
the single most popular remote-mail client in that world.  You can
find out more about this project at
<http://fetchmail.berlios.de/>.

In order to be able to do thorough regression testing before each release,
I collect test accounts on as many different kinds of POP3, IMAP, and
ODMR servers as possible.  Because fetchmail is strictly conformant to the 
remote-mail RFCs, many server developers have found fetchmail a useful
standards-conformance test.

I'm writing to request test accounts on your server.  I support all flavors 
of POP2, POP3, IMAP and ODMR with either plain-password, CRAM-MD5, NTLM, 
GSSAPI, or Kerberos authentication.  I also support SSL/TLS.

It would be very helpful if I could have a separate test account for
each protocol you support (that is, separate POP3, IMAP, and ODMR
accounts) so I can do automated regression testing without worrying
about mailbox race conditions.
orts before clearing its semaphore, and how do we recover reliably?).

I'm just not satisfied that there's enough functional gain here to pay for the large increase in complexity that adding these semaphores would entail.

Multidrop and alias handling

I decided to add the multidrop support partly because some users were clamoring for it, but mostly because I thought it would shake bugs out of the single-drop code by forcing me to deal with addressing in full generality. And so it proved.

There are three important aspects of the features for handling multiple-drop aliases and mailing lists which future hackers should be careful to preserve.

  1. The logic path for single-recipient mailboxes doesn't involve header parsing or DNS lookups at all. This is important -- it means the code for the most common case can be much simpler and more robust.

  2. The multidrop handling does not rely on doing the equivalent of passing the message to sendmail -t. Instead, it explicitly mines members of a specified set of local usernames out of the header.

  3. We do not attempt delivery to multidrop mailboxes in the presence of DNS errors. Before each multidrop poll we probe DNS to see if we have a nameserver handy. If not, the poll is skipped. If DNS crashes during a poll, the error return from the next nameserver lookup aborts message delivery and ends the poll. The daemon mode will then quietly spin until DNS comes up again, at which point it will resume delivering mail.

When I designed this support, I was terrified of doing anything that could conceivably cause a mail loop (you should be too). That's why the code as written can only append local names (never @-addresses) to the recipients list.
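
The "mine local names, never @-addresses" rule can be sketched roughly as follows. This is an illustration under stated assumptions, not fetchmail's code: the function name `local_recipient` is hypothetical, and a plain case-insensitive string comparison stands in for the DNS-based check of whether a header address really points at the mailserver.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/*
 * Hypothetical helper, for illustration only.
 * Given one address mined from a header, return the matching entry
 * from the configured list of local names, or NULL if the address is
 * not ours. The result is always a bare local name, so nothing that
 * contains an '@' can ever reach the recipient list.
 */
const char *local_recipient(const char *addr, const char *server,
                            const char *const *locals, size_t nlocals)
{
    const char *at = strchr(addr, '@');
    size_t userlen = at ? (size_t)(at - addr) : strlen(addr);
    size_t i;

    /* Assumption: string equality replaces the real DNS comparison. */
    if (at != NULL && strcasecmp(at + 1, server) != 0)
        return NULL;            /* addressed to some other host */

    for (i = 0; i < nlocals; i++) {
        if (strlen(locals[i]) == userlen
            && strncmp(locals[i], addr, userlen) == 0)
            return locals[i];   /* a bare name from our own list */
    }
    return NULL;                /* unknown user: do not guess */
}
```

Because the function hands back a pointer into the configured list rather than a copy of the header text, a malformed or hostile header cannot inject a new destination.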

The code in mxget.c is nasty, no two ways about it. But it's utterly necessary; there are a lot of MX pointers out there. It really ought to be a (documented!) entry point in the bind library.

DNS error handling

Fetchmail's behavior on DNS errors is to suppress forwarding and deletion of the individual message that each occurs in, leaving it queued on the server for retrieval on a subsequent poll. The assumption is that DNS errors are transient, due to temporary server outages.

Unfortunately this means that if a DNS error is permanent a message can be perpetually stuck in the server mailbox. We've had a couple bug reports of this kind due to subtle RFC822 parsing errors in the fetchmail code that resulted in impossible things getting passed to the DNS lookup routines.

Alternative ways to handle the problem: ignore DNS errors (treating them as a non-match on the mailserver domain), or forward messages with errors to fetchmail's invoking user in addition to any other recipients. These would fit an assumption that DNS lookup errors are likely to be permanent problems associated with an address.
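
One way to picture the trade-off between the two assumptions is a small classifier over resolver results. The sketch below is only an assumption-laden illustration: it uses the `EAI_*` codes returned by `getaddrinfo()` rather than whatever lookup routines fetchmail actually calls, and the enum and function names are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose EAI_* under strict -std modes */
#endif
#include <netdb.h>

/* Hypothetical names, for illustration only. */
enum dns_action {
    DNS_PROCEED,        /* lookup worked; handle the message normally  */
    DNS_LEAVE_QUEUED,   /* looks transient; keep message on the server */
    DNS_TREAT_NOMATCH   /* looks permanent; treat as a non-match       */
};

enum dns_action classify_dns_error(int gai_err)
{
    switch (gai_err) {
    case 0:
        return DNS_PROCEED;
    case EAI_NONAME:    /* the name does not resolve */
    case EAI_FAIL:      /* non-recoverable resolver failure */
        return DNS_TREAT_NOMATCH;
    case EAI_AGAIN:     /* temporary failure, try again later */
    default:            /* when unsure, take the conservative path */
        return DNS_LEAVE_QUEUED;
    }
}
```

The default branch keeps the current behavior (leave the message queued), so only errors the resolver itself reports as permanent would change handling.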

IPv6 and IPSEC

The IPv6 support patches are really more protocol-family independence patches. Because of this, in most places, "ports" (numbers) have been replaced with "services" (strings, that may be digits). This allows us to run with certain protocols that use strings as "service names" where we in the IP world think of port numbers. Someday we'll plumb strings all over and then, if inet6 is not enabled, do a getservbyname() down in SocketOpen. The IPv6 support patches use getaddrinfo(), which is a POSIX p1003.1g mandated function. So, in the not too distant future, we'll zap the ifdefs and just let autoconf check for getaddrinfo. IPv6 support comes pretty much automatically once you have protocol family independence.
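
A minimal sketch of what protocol-family independence looks like with `getaddrinfo()`: the service is passed as a string, which may be a name such as "pop3" or a string of digits such as "110". The function name `open_by_service` is a hypothetical stand-in and is not the real SocketOpen.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo under strict -std modes */
#endif
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Connect to host using a service string. Returns a connected
 * descriptor, or -1 if resolution or every connection attempt fails.
 */
int open_by_service(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6, whichever resolves */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Try each returned address in order until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                    /* success: keep this descriptor */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

Nothing in the caller mentions an address family or a numeric port, which is why IPv6 comes along almost for free once the code is written this way.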

Internationalization

Internationalization is handled using GNU gettext (see the file ABOUT_NLS in the source distribution). This places some minor constraints on the code.

Strings that must be subject to translation should be wrapped with GT_() or N_() -- the former in function arguments, the latter in static initializers and other non-function-argument contexts.
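
The two contexts can be illustrated as follows. The macro definitions here are local assumptions that mirror the usual gettext conventions (with `ENABLE_NLS` as the assumed build-time switch); the real definitions live in the fetchmail sources, and the array and function names are hypothetical.

```c
#include <stddef.h>

#ifdef ENABLE_NLS
#include <libintl.h>
#define GT_(s) gettext(s)   /* translate at run time */
#else
#define GT_(s) (s)          /* NLS disabled: pass the string through */
#endif
#define N_(s) s             /* mark for extraction only; no run-time effect */

/* Static initializer: a function call is not allowed here, so use N_(). */
static const char *const state_names[] = {
    N_("idle"),
    N_("polling"),
};

/* Function context: translate the marked string when it is used. */
const char *state_name(size_t state)
{
    if (state >= sizeof state_names / sizeof state_names[0])
        return GT_("unknown");
    return GT_(state_names[state]);
}
```

The point of the split is that `N_()` only makes the string visible to the message extractor, while the actual lookup is deferred to the `GT_()` call at the point of use.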

Checklist for Adding Options

Adding a control option is not complicated in principle, but there are a lot of fiddly details in the process. You'll need to do the following minimum steps.

There may be other things you have to do in the way of logic, of course.

Before you implement an option, though, think hard. Is there any way to make fetchmail automatically detect the circumstances under which it should change its behavior? If so, don't write an option. Just do the check!

Lessons learned

1. Server-side state is essential

The person(s) responsible for removing LAST from POP3 deserve to suffer. Without it, a client has no way to know which messages in a box have been read by other means, such as an MUA running on the server.

The POP3 UID feature described in RFC1725 to replace LAST is insufficient. The only problem it solves is tracking which messages have been read by this client -- and even that requires tricky, fragile implementation.

The underlying lesson is that maintaining accessible server-side `seen' state bits associated with Status headers is indispensable in a Unix/RFC822 mail server protocol. IMAP gets this right.

2. Readable text protocol transactions are a Good Thing

A nice thing about the general class of text-based protocols that SMTP, POP2, POP3, and IMAP belong to is that client/server transactions are easy to watch and transaction code correspondingly easy to debug. Given a decent layer of socket utility functions (which Carl provided) it's easy to write protocol engines and not hard to show that they're working correctly.

This is an advantage not to be despised! Because of it, this project has been interesting and fun -- no serious or persistent bugs, no long hours spent looking for subtle pathologies.
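
To give a feel for how little code a text protocol demands, here is a sketch that classifies a POP3 reply line by its status indicator ("+OK" or "-ERR", as defined in RFC 1939). The helper names are hypothetical and this is not the engine fetchmail uses.

```c
#include <string.h>

/* Return 1 if line starts with kw followed by a space or end of line. */
static int has_keyword(const char *line, const char *kw)
{
    size_t n = strlen(kw);

    return strncmp(line, kw, n) == 0
        && (line[n] == '\0' || line[n] == ' '
            || line[n] == '\r' || line[n] == '\n');
}

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 for a positive POP3 reply, 0 for a negative one,
 * and -1 if the line carries neither status indicator.
 */
int pop3_status(const char *line)
{
    if (has_keyword(line, "+OK"))
        return 1;
    if (has_keyword(line, "-ERR"))
        return 0;
    return -1;
}
```

Since the replies are plain text, the same lines the code inspects can be read directly in a session log, which is what makes this class of protocol easy to debug.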

3. IMAP is a Good Thing.

Now that there is a standard IMAP equivalent of the POP3 APOP validation in CRAM-MD5, POP3 is completely obsolete.

4. SMTP is the Right Thing

In retrospect it seems clear that this program (and others like it) should have been designed to forward via SMTP from the beginning. This lesson may be applicable to other Unix programs that now call the local MDA/MTA as a program.

5. Syntactic noise can be your friend

The optional `noise' keywords in the rc file syntax started out as a late-night experiment. The English-like syntax they allow is considerably more readable than the traditional terse keyword-value pairs you get when you strip them all out. I think there may be a wider lesson here.

Motivation and validation

It is truly written: the best hacks start out as personal solutions to the author's everyday problems, and spread because the problem turns out to be typical for a large class of users. So it was with Carl Harris and the ancestral popclient, and so with me and fetchmail.

It's gratifying that fetchmail has become so popular. Until just before 1.9 I was designing strictly to my own taste. The multi-drop mailbox support and the new --limit option were the first features to go in that I didn't need myself.

By 1.9, four months after I started hacking on popclient and a month after the first fetchmail release, there were literally a hundred people on the fetchmail-friends contact list. That's pretty powerful motivation. And they were a good crowd, too, sending fixes and intelligent bug reports in volume. A user population like that is a gift from the gods, and this is my expression of gratitude.

The beta testers didn't know it at the time, but they were also the subjects of a sociological experiment. The results are described in my paper, The Cathedral And The Bazaar.

Credits

Special thanks go to Carl Harris, who built a good solid code base and then tolerated me hacking it out of recognition. And to Harry Hochheiser, who gave me the idea of the SMTP-forwarding delivery mode.

Other significant contributors to the code have included Dave Bodenstab (error.c code and --syslog), George Sipe (--monitor and --interface), Gordon Matzigkeit (netrc.c), Al Longyear (UIDL support), Chris Hanson (Kerberos V4 support), and Craig Metz (OPIE, IPv6, IPSEC).

Conclusion

At this point, the fetchmail code appears to be pretty stable. It will probably undergo substantial change only if and when support for a new retrieval protocol or authentication method is added.

Relevant RFCs

Not all of these describe standards explicitly used in fetchmail, but they all shaped the design in one way or another.

RFC821
SMTP protocol
RFC822
Mail header format
RFC937
Post Office Protocol - Version 2
RFC974
MX routing
RFC976
UUCP mail format
RFC1081
Post Office Protocol - Version 3
RFC1123
Host requirements (modifies 821, 822, and 974)
RFC1176
Interactive Mail Access Protocol - Version 2
RFC1203
Interactive Mail Access Protocol - Version 3
RFC1225
Post Office Protocol - Version 3
RFC1344
Implications of MIME for Internet Mail Gateways
RFC1413
Identification server
RFC1428
Transition of Internet Mail from Just-Send-8 to 8-bit SMTP/MIME
RFC1460
Post Office Protocol - Version 3
RFC1508
Generic Security Service Application Program Interface
RFC1521
MIME: Multipurpose Internet Mail Extensions
RFC1869
SMTP Service Extensions (ESMTP spec)
RFC1652
SMTP Service Extension for 8bit-MIMEtransport
RFC1725
Post Office Protocol - Version 3
RFC1730
Interactive Mail Access Protocol - Version 4
RFC1731
IMAP4 Authentication Mechanisms
RFC1732
IMAP4 Compatibility With IMAP2 And IMAP2bis
RFC1734
POP3 AUTHentication command
RFC1870
SMTP Service Extension for Message Size Declaration
RFC1891
SMTP Service Extension for Delivery Status Notifications
RFC1892
The Multipart/Report Content Type for the Reporting of Mail System Administrative Messages
RFC1894
An Extensible Message Format for Delivery Status Notifications
RFC1893
Enhanced Mail System Status Codes
RFC1938
A One-Time Password System
RFC1939
Post Office Protocol - Version 3
RFC1957
Some Observations on Implementations of the Post Office Protocol (POP3)
RFC1985
SMTP Service Extension for Remote Message Queue Starting
RFC2033
Local Mail Transfer Protocol
RFC2060
Internet Message Access Protocol - Version 4rev1
RFC2061
IMAP4 Compatibility With IMAP2bis
RFC2062
Internet Message Access Protocol - Obsolete Syntax
RFC2195
IMAP/POP AUTHorize Extension for Simple Challenge/Response
RFC2177
IMAP IDLE command
RFC2449
POP3 Extension Mechanism
RFC2554
SMTP Service Extension for Authentication
RFC2595
Using TLS with IMAP, POP3 and ACAP
RFC2645
On-Demand Mail Relay: SMTP with Dynamic IP Addresses
RFC2683
IMAP4 Implementation Recommendations
RFC2821
Simple Mail Transfer Protocol
RFC2822
Internet Message Format

Other useful documents

http://www.faqs.org/faqs/LANs/mail-protocols/
LAN Mail Protocols Summary


Eric S. Raymond <esr@snark.thyrsus.com>