From 68f099a09fdc59fd1e246729214fe4caf7c80c28 Mon Sep 17 00:00:00 2001
From: Matthias Andree
Date: Wed, 20 Jul 2005 09:37:39 +0000
Subject: Rename design-notes.html to esrs-design-notes.html. Remove ~esr/ path from links.

svn path=/trunk/; revision=4124
---
 Makefile.am             |   6 +-
 design-notes.html       | 763 ------------------------------------------------
 esrs-design-notes.html  | 761 +++++++++++++++++++++++++++++++++++++++++++++++
 fetchmail-FAQ.html      |   6 +-
 fetchmail-features.html |   2 -
 history.html            |   4 -
 specgen.sh              |   2 +-
 7 files changed, 766 insertions(+), 778 deletions(-)
 delete mode 100644 design-notes.html
 create mode 100644 esrs-design-notes.html

diff --git a/Makefile.am b/Makefile.am
index 72f09f5a..6166c4a3 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -71,7 +71,7 @@ fetchmail.spec: Makefile.in specgen.sh
 	$(srcdir)/specgen.sh $(VERSION) >fetchmail.spec
 
 DISTDOCS= FAQ FEATURES NOTES OLDNEWS fetchmail-man.html \
-	fetchmail-FAQ.html design-notes.html todo.html \
+	fetchmail-FAQ.html esrs-design-notes.html todo.html \
 	fetchmail-features.html README.SSL README.NTLM
 
 # extra directories to ship
@@ -86,8 +86,8 @@ FAQ: fetchmail-FAQ.html
 FEATURES: fetchmail-features.html
 	AWK=$(AWK) $(SHELL) $(srcdir)/dist-tools/html2txt $(srcdir)/fetchmail-features.html >$@ || { rm -f $@ ; exit 1 ; }
 
-NOTES: design-notes.html
-	AWK=$(AWK) $(SHELL) $(srcdir)/dist-tools/html2txt $(srcdir)/design-notes.html >$@ || { rm -f $@ ; exit 1 ; }
+NOTES: esrs-design-notes.html
+	AWK=$(AWK) $(SHELL) $(srcdir)/dist-tools/html2txt $(srcdir)/esrs-design-notes.html >$@ || { rm -f $@ ; exit 1 ; }
 
 TODO: todo.html
 	AWK=$(AWK) $(SHELL) $(srcdir)/dist-tools/html2txt $(srcdir)/todo.html >$@ || { rm -f $@ ; exit 1 ; }
diff --git a/design-notes.html b/design-notes.html
deleted file mode 100644
index 8d4a841c..00000000
--- a/design-notes.html
+++ /dev/null
@@ -1,763 +0,0 @@
-Design notes on fetchmail
Back to Fetchmail Home Page | To Site Map | $Date: 2003/02/28 11:26:47 $
- -
-

Design Notes On Fetchmail

- -

These notes are for the benefit of future hackers and -maintainers. The following sections are both functional and -narrative; read them from beginning to end.

- -

History

- -

A direct ancestor of the fetchmail program was originally -authored (under the name popclient) by Carl Harris -<ceharris@mal.com>. I took over development in June 1996 and -subsequently renamed the program `fetchmail' to reflect the -addition of IMAP support and SMTP delivery. In early November 1996 -Carl officially ended support for the last popclient versions.

- -

Before accepting responsibility for the popclient sources from -Carl, I had investigated and used and tinkered with every other -UNIX remote-mail forwarder I could find, including fetchpop1.9, -PopTart-0.9.3, get-mail, gwpop, pimp-1.0, pop-perl5-1.2, popc, -popmail-1.6 and upop. My major goal was to get a header-rewrite -feature like fetchmail's working so I wouldn't have reply problems -anymore.

- -

Despite having done a good bit of work on fetchpop1.9, when I -found popclient I quickly concluded that it offered the solidest -base for future development. I was convinced of this primarily by -the presence of multiple-protocol support. The competition didn't -do POP2/RPOP/APOP, and I was already having vague thoughts of maybe -adding IMAP. (This would advance two other goals: learn IMAP and -get comfortable writing TCP/IP client software.)

- -

Until popclient 3.05 I was simply following out the implications -of Carl's basic design. He already had daemon.c in the -distribution, and I wanted daemon mode almost as badly as I wanted -the header rewrite feature. The other things I added were bug fixes -or minor extensions.

- -

After 3.1, when I put in SMTP-forwarding support (more about -this below) the nature of the project changed -- it became a -carefully-thought-out attempt to render obsolete every other -program in its class. The name change quickly followed.

- -

The rewrite option

- -

MTAs ought to canonicalize the addresses of outgoing non-local -mail so that From:, To:, Cc:, Bcc: and other address headers -contain only fully qualified domain names. Failure to do so can -break the reply function on many mailers. (Sendmail has an option -to do this.)

- -

This problem only becomes obvious when a reply is generated on a -machine different from where the message was delivered. The two -machines will have different local username spaces, potentially -leading to misrouted mail.

- -

Most MTAs (and sendmail in particular) do not canonicalize -address headers in this way (violating RFC 1123). Fetchmail -therefore has to do it. This is the first feature I added to the -ancestral popclient.
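As a rough illustration of the kind of canonicalization described above, the sketch below qualifies a bare local name with a host name and leaves already-qualified addresses alone. The function name `qualify_address` and its signature are hypothetical, not taken from the fetchmail sources, and real header rewriting must also cope with full RFC 822 address syntax, which this sketch ignores.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the fetchmail sources): qualify a bare
 * local name with the mail server's host name. Addresses that already
 * contain '@' are copied through unchanged.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
int qualify_address(const char *addr, const char *host,
                    char *out, size_t outlen)
{
    int n;

    if (strchr(addr, '@') != NULL)
        n = snprintf(out, outlen, "%s", addr);           /* already qualified */
    else
        n = snprintf(out, outlen, "%s@%s", addr, host);  /* append server host */

    /* snprintf reports the length it wanted; treat truncation as failure */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this in place, a reply to `joe` fetched from `mail.example.com` would be addressed to `joe@mail.example.com` rather than to whatever `joe` means on the replying machine.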

- -

Reorganization

- -

The second thing I did was reorganize and simplify popclient a lot. -Carl Harris's implementation was very sound, but exhibited a kind -of unnecessary complexity common to many C programmers. He treated -the code as central and the data structures as support for the -code. As a result, the code was beautiful but the data structure -design ad-hoc and rather ugly (at least to this old LISP -hacker).

- -

I was able to improve matters significantly by reorganizing most -of the program around the `query' data structure and eliminating a -bunch of global context. This especially simplified the main -sequence in fetchmail.c and was critical in enabling the daemon -mode changes.

- -

IMAP support and the method table

- -

The next step was IMAP support. I initially wrote the IMAP code -as a generic query driver and a method table. The idea was to have -all the protocol-independent setup logic and flow of control in the -driver, and the protocol-specific stuff in the method table.

- -

Once this worked, I rewrote the POP3 code to use the same -organization. The POP2 code kept its own driver for a couple more -releases, until I found sources of a POP2 server to test against -(the breed seems to be nearly extinct).

- -

The purpose of this reorganization, of course, is to trivialize -the development of support for future protocols as much as -possible. All mail-retrieval protocols have to have pretty similar -logical design by the nature of the task. By abstracting out that -common logic and its interface to the rest of the program, both the -common and protocol-specific parts become easier to understand.

- -

Furthermore, many kinds of new features can instantly be -supported across all protocols by modifying the one driver -module.
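The driver-plus-method-table split can be sketched roughly as follows. This is a minimal illustration only: the struct layout, the field names, `run_driver` and the toy protocol are all assumptions made for the example and do not reproduce fetchmail's actual definitions.

```c
#include <stddef.h>

/* Hypothetical per-protocol method table; field names are illustrative. */
struct method {
    const char *name;               /* protocol name, e.g. "IMAP" */
    int (*getrange)(int *count);    /* how many messages are waiting? */
    int (*fetch)(int msgnum);       /* retrieve one message */
    int (*delete_msg)(int msgnum);  /* optional; NULL if unsupported */
};

/*
 * Generic driver: all flow of control lives here, and every
 * protocol-specific step goes through the table.
 * Returns the number of messages fetched, or -1 on error.
 */
int run_driver(const struct method *proto)
{
    int count = 0, fetched = 0;

    if (proto->getrange(&count) != 0)
        return -1;

    for (int i = 1; i <= count; i++) {
        if (proto->fetch(i) != 0)
            return -1;
        fetched++;
        if (proto->delete_msg != NULL && proto->delete_msg(i) != 0)
            return -1;
    }
    return fetched;
}

/* A toy "protocol" that always reports three messages and cannot delete. */
static int toy_getrange(int *count) { *count = 3; return 0; }
static int toy_fetch(int msgnum)    { (void)msgnum; return 0; }

const struct method toy_protocol = { "TOY", toy_getrange, toy_fetch, NULL };
```

Supporting a new protocol in this scheme means filling in one more table; a feature added inside `run_driver` reaches every protocol at once.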

- -

Implications of SMTP forwarding

- -

The direction of the project changed radically when Harry -Hochheiser sent me his scratch code for forwarding fetched mail to -the SMTP port. I realized almost immediately that a reliable -implementation of this feature would make all the other delivery -modes obsolete.

- -

Why mess with all the complexity of configuring an MDA or -setting up lock-and-append on a mailbox when port 25 is guaranteed -to be there on any platform with TCP/IP support in the first place? -Especially when this means retrieved mail is guaranteed to look -like normal sender-initiated SMTP mail, which is really what we -want anyway.

- -

Clearly, the right thing to do was (1) hack SMTP forwarding -support into the generic driver, (2) make it the default mode, and -(3) eventually throw out all the other delivery modes.

- -

I hesitated over step 3 for some time, fearing to upset -long-time popclient users dependent on the alternate delivery -mechanisms. In theory, they could immediately switch to .forward -files or their non-sendmail equivalents to get the same effects. In -practice the transition might have been messy.

- -

But when I did it (see the NEWS note on the great options -massacre) the benefits proved huge. The cruftiest parts of the -driver code vanished. Configuration got radically simpler -- no -more grovelling around for the system MDA and user's mailbox, no -more worries about whether the underlying OS supports file -locking.

- -

Also, the only way to lose mail vanished. If you specified -localfolder and the disk got full, your mail got lost. This can't -happen with SMTP forwarding because your SMTP listener won't return -OK unless the message can be spooled or processed.

- -

Also, performance improved (though not so you'd notice it in a -single run). Another not insignificant benefit of this change was -that the manual page got a lot simpler.

- -

Later, I had to bring --mda back in order to allow handling of -some obscure situations involving dynamic SLIP. But I found a much -simpler way to do it.

- -

The moral? Don't hesitate to throw away superannuated features -when you can do it without loss of effectiveness. I tanked a couple -I'd added myself and have no regrets at all. As Saint-Exupery said, -"Perfection [in design] is achieved not when there is nothing more -to add, but rather when there is nothing more to take away." This -program isn't perfect, but it's trying.

- -

The most-requested features that I will never add, and why -not:

- -

Password encryption in .fetchmailrc

- -

The reason there's no facility to store passwords encrypted in -the .fetchmailrc file is that this doesn't actually add -protection.

- -

Anyone who's acquired the 0600 permissions needed to read your -.fetchmailrc file will be able to run fetchmail as you anyway -- -and if it's your password they're after, they'd be able to rip the -necessary decoder out of the fetchmail code itself to get it.

- -

All .fetchmailrc encryption would do is give a false sense of -security to people who don't think very hard.

- -

Truly concurrent queries to multiple hosts

- -

Occasionally I get a request for this on "efficiency" grounds. -These people aren't thinking either. True concurrency would do -nothing to lessen fetchmail's total IP volume. The best it could -possibly do is change the usage profile to shorten the duration of -the active part of a poll cycle at the cost of increasing its -demand on IP volume per unit time.

- -

If one could thread the protocol code so that fetchmail didn't -block on waiting for a protocol response, but rather switched to -trying to process another host query, one might get an efficiency -gain (close to constant loading at the single-host level).

- -

Fortunately, I've only seldom seen a server that incurred -significant wait time on an individual response. I judge the gain -from this not worth the hideous complexity increase it would -require in the code.

- -

Multiple concurrent instances of fetchmail

- -

Fetchmail locking is on a per-invoking-user basis because -finer-grained locks would be really hard to implement in a portable -way. The problem is that you don't want two fetchmails querying the -same site for the same remote user at the same time.

- -

To handle this optimally, multiple fetchmails would have to -associate a system-wide semaphore with each active pair of a remote -user and host canonical address. A fetchmail would have to block -until getting this semaphore at the start of a query, and release -it at the end of a query.

- -

This would be way too complicated to do just for an "it might be -nice" feature. Instead, you can run a single root fetchmail polling -for multiple users in either single-drop or multidrop mode.

- -

The fundamental problem here is how an instance of fetchmail -polling host foo can assert that it's doing so in a way visible to -all other fetchmails. System V semaphores would be ideal for this -purpose, but they're not portable.

- -

I've thought about this a lot and roughed up several designs. -All are complicated and fragile, with a bunch of the standard -problems (what happens if a fetchmail aborts before clearing its -semaphore, and how do we recover reliably?).

- -

I'm just not satisfied that there's enough functional gain here -to pay for the large increase in complexity that adding these -semaphores would entail.

- -

Multidrop and alias handling

- -

I decided to add the multidrop support partly because some users -were clamoring for it, but mostly because I thought it would shake -bugs out of the single-drop code by forcing me to deal with -addressing in full generality. And so it proved.

- -

There are three important aspects of the features for handling -multiple-drop aliases and mailing lists which future hackers should -be careful to preserve.

- -
    -
  1. -

    The logic path for single-recipient mailboxes doesn't involve -header parsing or DNS lookups at all. This is important -- it means -the code for the most common case can be much simpler and more -robust.

    -
  2. -

    The multidrop handling does not rely on doing the -equivalent of passing the message to sendmail -t. Instead, it -explicitly mines members of a specified set of local usernames out -of the header.

    -
  3. -

    We do not attempt delivery to multidrop mailboxes in -the presence of DNS errors. Before each multidrop poll we probe DNS -to see if we have a nameserver handy. If not, the poll is skipped. -If DNS crashes during a poll, the error return from the next -nameserver lookup aborts message delivery and ends the poll. The -daemon mode will then quietly spin until DNS comes up again, at -which point it will resume delivering mail.

    -
- -

When I designed this support, I was terrified of doing anything -that could conceivably cause a mail loop (you should be too). -That's why the code as written can only append local names -(never @-addresses) to the recipients list.
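The idea of mining a fixed set of local usernames out of a header, and only ever emitting bare local names, might be sketched like this. The function `mine_local_names` is hypothetical and heavily simplified: it splits on whitespace, commas, angle brackets and quotes instead of parsing RFC 822 properly, and it ignores the domain part entirely, whereas the real multidrop code has to decide whether the domain belongs to the mail server.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: collect every name from `locals` that occurs as the
 * local part of an address token in `header`. Only pointers into `locals`
 * (bare local names, never @-addresses) are stored in `found`.
 * Returns the number of distinct names found.
 */
size_t mine_local_names(const char *header,
                        const char *const *locals, size_t nlocals,
                        const char **found, size_t maxfound)
{
    static const char delims[] = " \t,<>\"";
    size_t nfound = 0;
    const char *p = header;

    while (*p != '\0') {
        p += strspn(p, delims);                 /* skip separators */
        size_t len = strcspn(p, delims);        /* length of this token */
        const char *at = memchr(p, '@', len);   /* split off the local part */
        size_t local_len = at ? (size_t)(at - p) : len;

        for (size_t i = 0; i < nlocals && local_len > 0; i++) {
            if (strlen(locals[i]) == local_len &&
                strncmp(p, locals[i], local_len) == 0) {
                size_t j;
                for (j = 0; j < nfound; j++)    /* skip duplicates */
                    if (found[j] == locals[i])
                        break;
                if (j == nfound && nfound < maxfound)
                    found[nfound++] = locals[i];
            }
        }
        p += len;
    }
    return nfound;
}
```

Because the output can only ever be a member of the caller's own list, nothing taken from the message itself is fed back into the recipient list, which is the loop-avoidance property the paragraph above insists on.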

- -

The code in mxget.c is nasty, no two ways about it. But it's -utterly necessary; there are a lot of MX pointers out there. It -really ought to be a (documented!) entry point in the bind -library.

- -

DNS error handling

- -

Fetchmail's behavior on DNS errors is to suppress forwarding and -deletion of the individual message that each occurs in, leaving it -queued on the server for retrieval on a subsequent poll. The -assumption is that DNS errors are transient, due to temporary -server outages.

- -

Unfortunately this means that if a DNS error is permanent a -message can be perpetually stuck in the server mailbox. We've had a -couple bug reports of this kind due to subtle RFC822 parsing errors -in the fetchmail code that resulted in impossible things getting -passed to the DNS lookup routines.

- -

Alternative ways to handle the problem: ignore DNS errors -(treating them as a non-match on the mailserver domain), or forward -messages with errors to fetchmail's invoking user in addition to -any other recipients. These would fit an assumption that DNS lookup -errors are likely to be permanent problems associated with an -address.

- -

IPv6 and IPSEC

- -

The IPv6 support patches are really more protocol-family -independence patches. Because of this, in most places, "ports" -(numbers) have been replaced with "services" (strings that may be -digits). This allows us to run with certain protocols that use -strings as "service names" where we in the IP world think of port -numbers. Someday we'll plumb strings all over and then, if inet6 is -not enabled, do a getservbyname() down in SocketOpen. The IPv6 -support patches use getaddrinfo(), which is a POSIX p1003.1g -mandated function. So, in the not too distant future, we'll zap the -ifdefs and just let autoconf check for getaddrinfo. IPv6 support -comes pretty much automatically once you have protocol family -independence.
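To make the "services are strings" point concrete, here is a small sketch of resolving a host together with a service string through getaddrinfo(). The helper name `service_port` is an assumption made for this example; it is not the interface of fetchmail's SocketOpen.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

/*
 * Hypothetical helper: resolve `host` plus a service string (a name such
 * as "pop3", or digits such as "110") in a protocol-family-independent way.
 * Returns the resolved port in host byte order, or -1 on failure.
 */
int service_port(const char *host, const char *service)
{
    struct addrinfo hints, *res;
    int port = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Inspect the first result only; a real client would try each in turn. */
    if (res->ai_family == AF_INET)
        port = ntohs(((struct sockaddr_in *)res->ai_addr)->sin_port);
    else if (res->ai_family == AF_INET6)
        port = ntohs(((struct sockaddr_in6 *)res->ai_addr)->sin6_port);

    freeaddrinfo(res);
    return port;
}
```

The caller never handles a numeric port or an address family directly, which is why IPv6 support falls out almost for free once the code is written this way.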

- -

Internationalization

- -

Internationalization is handled using GNU gettext (see the file -ABOUT_NLS in the source distribution). This places some minor -constraints on the code.

- -

Strings that must be subject to translation should be wrapped -with GT_() or N_() -- the former in function arguments, the latter -in static initializers and other non-function-argument -contexts.
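The two wrappers divide the work as in the sketch below. The macro definitions shown are the conventional gettext ones and are assumed here rather than copied from fetchmail's headers; `state_names` and `state_name` are hypothetical names invented for the example.

```c
#include <libintl.h>

/* Assumed, conventional definitions; fetchmail's own header may differ. */
#define GT_(s) gettext(s)   /* translate at run time */
#define N_(s)  (s)          /* only mark the string for extraction */

/* A static initializer cannot call a function, so N_() is used here. */
static const char *const state_names[] = {
    N_("idle"),
    N_("connected"),
};

/* Return the (possibly translated) name of a state. */
const char *state_name(int state)
{
    int nstates = (int)(sizeof state_names / sizeof state_names[0]);

    if (state < 0 || state >= nstates)
        return GT_("unknown");      /* literal in a function argument */

    /* Strings marked with N_() are translated where they are used. */
    return GT_(state_names[state]);
}
```

When no message catalog is installed, gettext() returns its argument unchanged, so the untranslated build behaves exactly as if the wrappers were absent.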

- -

Checklist for Adding Options

- -

Adding a control option is not complicated in principle, but -there are a lot of fiddly details in the process. You'll need to do -the following minimum steps.

- - - -

There may be other things you have to do in the way of logic, of -course.

- -

Before you implement an option, though, think hard. Is there any -way to make fetchmail automatically detect the circumstances under -which it should change its behavior? If so, don't write an option. -Just do the check!

- -

Lessons learned

- -

1. Server-side state is essential

- -

The person(s) responsible for removing LAST from POP3 deserve to -suffer. Without it, a client has no way to know which messages in a -box have been read by other means, such as an MUA running on the -server.

- -

The POP3 UID feature described in RFC1725 to replace LAST is -insufficient. The only problem it solves is tracking which messages -have been read by this client -- and even that requires -tricky, fragile implementation.

- -

The underlying lesson is that maintaining accessible server-side -`seen' state bits associated with Status headers is indispensable -in a Unix/RFC822 mail server protocol. IMAP gets this right.

- -

2. Readable text protocol transactions are a Good Thing

- -

A nice thing about the general class of text-based protocols -that SMTP, POP2, POP3, and IMAP belong to is that client/server -transactions are easy to watch and transaction code correspondingly -easy to debug. Given a decent layer of socket utility functions -(which Carl provided) it's easy to write protocol engines and not -hard to show that they're working correctly.

- -

This is an advantage not to be despised! Because of it, this -project has been interesting and fun -- no serious or persistent -bugs, no long hours spent looking for subtle pathologies.

- -

3. IMAP is a Good Thing.

- -

Now that there is a standard IMAP equivalent of the POP3 APOP -validation in CRAM-MD5, POP3 is completely obsolete.

- -

4. SMTP is the Right Thing

- -

In retrospect it seems clear that this program (and others like -it) should have been designed to forward via SMTP from the -beginning. This lesson may be applicable to other Unix programs -that now call the local MDA/MTA as a program.

- -

5. Syntactic noise can be your friend

- -

The optional `noise' keywords in the rc file syntax started out -as a late-night experiment. The English-like syntax they allow is -considerably more readable than the traditional terse keyword-value -pairs you get when you strip them all out. I think there may be a -wider lesson here.

- -

Motivation and validation

- -

It is truly written: the best hacks start out as personal -solutions to the author's everyday problems, and spread because the -problem turns out to be typical for a large class of users. So it -was with Carl Harris and the ancestral popclient, and so with me -and fetchmail.

- -

It's gratifying that fetchmail has become so popular. Until just -before 1.9 I was designing strictly to my own taste. The multi-drop -mailbox support and the new --limit option were the first features -to go in that I didn't need myself.

- -

By 1.9, four months after I started hacking on popclient and a -month after the first fetchmail release, there were literally a -hundred people on the fetchmail-friends contact list. That's pretty -powerful motivation. And they were a good crowd, too, sending fixes -and intelligent bug reports in volume. A user population like that -is a gift from the gods, and this is my expression of -gratitude.

- -

The beta testers didn't know it at the time, but they were also -the subjects of a sociological experiment. The results are -described in my paper, The -Cathedral And The Bazaar.

- -

Credits

- -

Special thanks go to Carl Harris, who built a good solid code -base and then tolerated me hacking it out of recognition. And to -Harry Hochheiser, who gave me the idea of the SMTP-forwarding -delivery mode.

- -

Other significant contributors to the code have included Dave -Bodenstab (error.c code and --syslog), George Sipe (--monitor and ---interface), Gordon Matzigkeit (netrc.c), Al Longyear (UIDL -support), Chris Hanson (Kerberos V4 support), and Craig Metz (OPIE, -IPv6, IPSEC).

- -

Conclusion

- -

At this point, the fetchmail code appears to be pretty stable. -It will probably undergo substantial change only if and when -support for a new retrieval protocol or authentication method is -added.

- -

Relevant RFCs

- -

Not all of these describe standards explicitly used in -fetchmail, but they all shaped the design in one way or -another.

- -
-
RFC821
- -
SMTP protocol
- -
RFC822
- -
Mail header format
- -
RFC937
- -
Post Office Protocol - Version 2
- -
RFC974
- -
MX routing
- -
RFC976
- -
UUCP mail format
- -
RFC1081
- -
Post Office Protocol - Version 3
- -
RFC1123
- -
Host requirements (modifies 821, 822, and 974)
- -
RFC1176
- -
Interactive Mail Access Protocol - Version 2
- -
RFC1203
- -
Interactive Mail Access Protocol - Version 3
- -
RFC1225
- -
Post Office Protocol - Version 3
- -
RFC1344
- -
Implications of MIME for Internet Mail Gateways
- -
RFC1413
- -
Identification server
- -
RFC1428
- -
Transition of Internet Mail from Just-Send-8 to 8-bit -SMTP/MIME
- -
RFC1460
- -
Post Office Protocol - Version 3
- -
RFC1508
- -
Generic Security Service Application Program Interface
- -
RFC1521
- -
MIME: Multipurpose Internet Mail Extensions
- -
RFC1869
- -
SMTP Service Extensions (ESMTP spec)
- -
RFC1652
- -
SMTP Service Extension for 8bit-MIMEtransport
- -
RFC1725
- -
Post Office Protocol - Version 3
- -
RFC1730
- -
Interactive Mail Access Protocol - Version 4
- -
RFC1731
- -
IMAP4 Authentication Mechanisms
- -
RFC1732
- -
IMAP4 Compatibility With IMAP2 And IMAP2bis
- -
RFC1734
- -
POP3 AUTHentication command
- -
RFC1870
- -
SMTP Service Extension for Message Size Declaration
- -
RFC1891
- -
SMTP Service Extension for Delivery Status Notifications
- -
RFC1892
- -
The Multipart/Report Content Type for the Reporting of Mail -System Administrative Messages
- -
RFC1893
- -
Enhanced Mail System Status Codes
- -
RFC1894
- -
An Extensible Message Format for Delivery Status -Notifications
- -
RFC1938
- -
A One-Time Password System
- -
RFC1939
- -
Post Office Protocol - Version 3
- -
RFC1957
- -
Some Observations on Implementations of the Post Office -Protocol (POP3)
- -
RFC1985
- -
SMTP Service Extension for Remote Message Queue Starting
- -
RFC2033
- -
Local Mail Transfer Protocol
- -
RFC2060
- -
Internet Message Access Protocol - Version 4rev1
- -
RFC2061
- -
IMAP4 Compatibility With IMAP2bis
- -
RFC2062
- -
Internet Message Access Protocol - Obsolete Syntax
- -
RFC2195
- -
IMAP/POP AUTHorize Extension for Simple Challenge/Response
- -
RFC2177
- -
IMAP IDLE command
- -
RFC2449
- -
POP3 Extension Mechanism
- -
RFC2554
- -
SMTP Service Extension for Authentication
- -
RFC2595
- -
Using TLS with IMAP, POP3 and ACAP
- -
RFC2645
- -
On-Demand Mail Relay: SMTP with Dynamic IP Addresses
- -
RFC2683
- -
IMAP4 Implementation Recommendations
- -
RFC2821
- -
Simple Mail Transfer Protocol
- -
RFC2822
- -
Internet Message Format
-
- - - -

Other useful documents

- -
-
http://www.faqs.org/faqs/LANs/mail-protocols/
- -
LAN Mail Protocols Summary
-
- -
- - - - - - -
Back to Fetchmail Home Page | To Site Map | $Date: 2003/02/28 11:26:47 $
- -
-
Eric S. Raymond <esr@snark.thyrsus.com>
- - - diff --git a/esrs-design-notes.html b/esrs-design-notes.html new file mode 100644 index 00000000..29ba0fb9 --- /dev/null +++ b/esrs-design-notes.html @@ -0,0 +1,761 @@ + + + + +Design notes on fetchmail + + + + + + + + + + + +
Back to Fetchmail Home Page | $Date: 2003/02/28 11:26:47 $
+ +
+

Design Notes On Fetchmail

+ +

These notes are for the benefit of future hackers and +maintainers. The following sections are both functional and +narrative; read them from beginning to end.

+ +

History

+ +

A direct ancestor of the fetchmail program was originally +authored (under the name popclient) by Carl Harris +<ceharris@mal.com>. I took over development in June 1996 and +subsequently renamed the program `fetchmail' to reflect the +addition of IMAP support and SMTP delivery. In early November 1996 +Carl officially ended support for the last popclient versions.

+ +

Before accepting responsibility for the popclient sources from +Carl, I had investigated and used and tinkered with every other +UNIX remote-mail forwarder I could find, including fetchpop1.9, +PopTart-0.9.3, get-mail, gwpop, pimp-1.0, pop-perl5-1.2, popc, +popmail-1.6 and upop. My major goal was to get a header-rewrite +feature like fetchmail's working so I wouldn't have reply problems +anymore.

+ +

Despite having done a good bit of work on fetchpop1.9, when I +found popclient I quickly concluded that it offered the solidest +base for future development. I was convinced of this primarily by +the presence of multiple-protocol support. The competition didn't +do POP2/RPOP/APOP, and I was already having vague thoughts of maybe +adding IMAP. (This would advance two other goals: learn IMAP and +get comfortable writing TCP/IP client software.)

+ +

Until popclient 3.05 I was simply following out the implications +of Carl's basic design. He already had daemon.c in the +distribution, and I wanted daemon mode almost as badly as I wanted +the header rewrite feature. The other things I added were bug fixes +or minor extensions.

+ +

After 3.1, when I put in SMTP-forwarding support (more about +this below) the nature of the project changed -- it became a +carefully-thought-out attempt to render obsolete every other +program in its class. The name change quickly followed.

+ +

The rewrite option

+ +

MTAs ought to canonicalize the addresses of outgoing non-local +mail so that From:, To:, Cc:, Bcc: and other address headers +contain only fully qualified domain names. Failure to do so can +break the reply function on many mailers. (Sendmail has an option +to do this.)

+ +

This problem only becomes obvious when a reply is generated on a +machine different from where the message was delivered. The two +machines will have different local username spaces, potentially +leading to misrouted mail.

+ +

Most MTAs (and sendmail in particular) do not canonicalize +address headers in this way (violating RFC 1123). Fetchmail +therefore has to do it. This is the first feature I added to the +ancestral popclient.

+ +

Reorganization

+ +

The second thing I did was reorganize and simplify popclient a lot. +Carl Harris's implementation was very sound, but exhibited a kind +of unnecessary complexity common to many C programmers. He treated +the code as central and the data structures as support for the +code. As a result, the code was beautiful but the data structure +design ad-hoc and rather ugly (at least to this old LISP +hacker).

+ +

I was able to improve matters significantly by reorganizing most +of the program around the `query' data structure and eliminating a +bunch of global context. This especially simplified the main +sequence in fetchmail.c and was critical in enabling the daemon +mode changes.

+ +

IMAP support and the method table

+ +

The next step was IMAP support. I initially wrote the IMAP code +as a generic query driver and a method table. The idea was to have +all the protocol-independent setup logic and flow of control in the +driver, and the protocol-specific stuff in the method table.

+ +

Once this worked, I rewrote the POP3 code to use the same +organization. The POP2 code kept its own driver for a couple more +releases, until I found sources of a POP2 server to test against +(the breed seems to be nearly extinct).

+ +

The purpose of this reorganization, of course, is to trivialize +the development of support for future protocols as much as +possible. All mail-retrieval protocols have to have pretty similar +logical design by the nature of the task. By abstracting out that +common logic and its interface to the rest of the program, both the +common and protocol-specific parts become easier to understand.

+ +

Furthermore, many kinds of new features can instantly be +supported across all protocols by modifying the one driver +module.

+ +

Implications of SMTP forwarding

+ +

The direction of the project changed radically when Harry +Hochheiser sent me his scratch code for forwarding fetched mail to +the SMTP port. I realized almost immediately that a reliable +implementation of this feature would make all the other delivery +modes obsolete.

+ +

Why mess with all the complexity of configuring an MDA or +setting up lock-and-append on a mailbox when port 25 is guaranteed +to be there on any platform with TCP/IP support in the first place? +Especially when this means retrieved mail is guaranteed to look +like normal sender-initiated SMTP mail, which is really what we +want anyway.

+ +

Clearly, the right thing to do was (1) hack SMTP forwarding +support into the generic driver, (2) make it the default mode, and +(3) eventually throw out all the other delivery modes.

+ +

I hesitated over step 3 for some time, fearing to upset +long-time popclient users dependent on the alternate delivery +mechanisms. In theory, they could immediately switch to .forward +files or their non-sendmail equivalents to get the same effects. In +practice the transition might have been messy.

+ +

But when I did it (see the NEWS note on the great options +massacre) the benefits proved huge. The cruftiest parts of the +driver code vanished. Configuration got radically simpler -- no +more grovelling around for the system MDA and user's mailbox, no +more worries about whether the underlying OS supports file +locking.

+ +

Also, the only way to lose mail vanished. If you specified +localfolder and the disk got full, your mail got lost. This can't +happen with SMTP forwarding because your SMTP listener won't return +OK unless the message can be spooled or processed.

+ +

Also, performance improved (though not so you'd notice it in a +single run). Another not insignificant benefit of this change was +that the manual page got a lot simpler.

+ +

Later, I had to bring --mda back in order to allow handling of +some obscure situations involving dynamic SLIP. But I found a much +simpler way to do it.

+ +

The moral? Don't hesitate to throw away superannuated features +when you can do it without loss of effectiveness. I tanked a couple +I'd added myself and have no regrets at all. As Saint-Exupery said, +"Perfection [in design] is achieved not when there is nothing more +to add, but rather when there is nothing more to take away." This +program isn't perfect, but it's trying.

+ +

The most-requested features that I will never add, and why +not:

+ +

Password encryption in .fetchmailrc

+ +

The reason there's no facility to store passwords encrypted in +the .fetchmailrc file is that this doesn't actually add +protection.

+ +

Anyone who's acquired the 0600 permissions needed to read your +.fetchmailrc file will be able to run fetchmail as you anyway -- +and if it's your password they're after, they'd be able to rip the +necessary decoder out of the fetchmail code itself to get it.

+ +

All .fetchmailrc encryption would do is give a false sense of +security to people who don't think very hard.

+ +

Truly concurrent queries to multiple hosts

+ +

Occasionally I get a request for this on "efficiency" grounds. +These people aren't thinking either. True concurrency would do +nothing to lessen fetchmail's total IP volume. The best it could +possibly do is change the usage profile to shorten the duration of +the active part of a poll cycle at the cost of increasing its +demand on IP volume per unit time.

+ +

If one could thread the protocol code so that fetchmail didn't +block on waiting for a protocol response, but rather switched to +trying to process another host query, one might get an efficiency +gain (close to constant loading at the single-host level).

+ +

Fortunately, I've only seldom seen a server that incurred +significant wait time on an individual response. I judge the gain +from this not worth the hideous complexity increase it would +require in the code.

+ +

Multiple concurrent instances of fetchmail

+ +

Fetchmail locking is on a per-invoking-user basis because +finer-grained locks would be really hard to implement in a portable +way. The problem is that you don't want two fetchmails querying the +same site for the same remote user at the same time.

+ +

To handle this optimally, multiple fetchmails would have to +associate a system-wide semaphore with each active pair of a remote +user and host canonical address. A fetchmail would have to block +until getting this semaphore at the start of a query, and release +it at the end of a query.

+ +

This would be way too complicated to do just for an "it might be +nice" feature. Instead, you can run a single root fetchmail polling +for multiple users in either single-drop or multidrop mode.

+ +

The fundamental problem here is how an instance of fetchmail +polling host foo can assert that it's doing so in a way visible to +all other fetchmails. System V semaphores would be ideal for this +purpose, but they're not portable.

+ +

I've thought about this a lot and roughed up several designs. +All are complicated and fragile, with a bunch of the standard +problems (what happens if a fetchmail aborts before clearing its +semaphore, and how do we recover reliably?).

+ +

I'm just not satisfied that there's enough functional gain here +to pay for the large increase in complexity that adding these +semaphores would entail.
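
For a sense of what a per-pair lock involves, here is a rough Python sketch. It swaps in advisory `flock` file locks for the System V semaphores discussed above, and every name in it (`lock_path`, `try_acquire`, `release`, the lock directory) is hypothetical. The kernel drops a `flock` lock when its holder exits, which eases the stale-lock problem, but `flock` is Unix-only and unreliable on network filesystems, so the portability objection stands. The caller is assumed to have canonicalized the host name already.

```python
import fcntl
import os


def lock_path(lock_dir, remote_user, host):
    # Hypothetical naming scheme: one lock file per (remote user, host) pair.
    safe = f"{remote_user}@{host}".replace(os.sep, "_")
    return os.path.join(lock_dir, f"fetch-{safe}.lock")


def try_acquire(lock_dir, remote_user, host):
    """Return an open fd holding the lock, or None if someone else has it."""
    path = lock_path(lock_dir, remote_user, host)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking exclusive lock; fails at once if already held.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock at the end of a query."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A blocking variant would omit `LOCK_NB` so that a second fetchmail waits at the start of its query instead of skipping it.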

+ +

Multidrop and alias handling

+ +

I decided to add the multidrop support partly because some users +were clamoring for it, but mostly because I thought it would shake +bugs out of the single-drop code by forcing me to deal with +addressing in full generality. And so it proved.

+ +

There are two important aspects of the features for handling +multiple-drop aliases and mailing lists which future hackers should +be careful to preserve.

+ +
  1. The logic path for single-recipient mailboxes doesn't involve +header parsing or DNS lookups at all. This is important -- it means +the code for the most common case can be much simpler and more +robust.

  2. The multidrop handling does not rely on doing the +equivalent of passing the message to sendmail -t. Instead, it +explicitly mines members of a specified set of local usernames out +of the header.

  3. We do not attempt delivery to multidrop mailboxes in +the presence of DNS errors. Before each multidrop poll we probe DNS +to see if we have a nameserver handy. If not, the poll is skipped. +If DNS crashes during a poll, the error return from the next +nameserver lookup aborts message delivery and ends the poll. The +daemon mode will then quietly spin until DNS comes up again, at +which point it will resume delivering mail.
+ +

When I designed this support, I was terrified of doing anything +that could conceivably cause a mail loop (you should be too). +That's why the code as written can only append local names +(never @-addresses) to the recipients list.
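
The "mine local names, never append @-addresses" rule can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not fetchmail's C code: `mine_local_recipients` and its parameters are invented names, and a fixed set of mailserver domains stands in for the DNS checks the real code performs.

```python
from email.utils import getaddresses


def mine_local_recipients(header_values, local_names, server_domains):
    """Collect local usernames named in the headers; never @-addresses."""
    recipients = []
    for _display_name, address in getaddresses(header_values):
        if "@" in address:
            localpart, domain = address.rsplit("@", 1)
        else:
            localpart, domain = address, ""
        # Skip addresses that belong to some other site entirely.
        if domain and domain.lower() not in server_domains:
            continue
        # Only names from the configured set are ever appended, so
        # nothing taken from the header can route mail off the host.
        if localpart in local_names and localpart not in recipients:
            recipients.append(localpart)
    return recipients
```

Because the output is always a subset of `local_names`, a hostile or malformed header cannot inject a remote address into the recipient list.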

+ +

The code in mxget.c is nasty, no two ways about it. But it's +utterly necessary; there are a lot of MX pointers out there. It +really ought to be a (documented!) entry point in the bind +library.

+ +

DNS error handling

+ +

Fetchmail's behavior on DNS errors is to suppress forwarding and +deletion of the individual message that each occurs in, leaving it +queued on the server for retrieval on a subsequent poll. The +assumption is that DNS errors are transient, due to temporary +server outages.

+ +

Unfortunately this means that if a DNS error is permanent a +message can be perpetually stuck in the server mailbox. We've had a +couple bug reports of this kind due to subtle RFC822 parsing errors +in the fetchmail code that resulted in impossible things getting +passed to the DNS lookup routines.

+ +

Alternative ways to handle the problem: ignore DNS errors +(treating them as a non-match on the mailserver domain), or forward +messages with errors to fetchmail's invoking user in addition to +any other recipients. These would fit an assumption that DNS lookup +errors are likely to be permanent problems associated with an +address.
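
The current behavior and the two alternatives can be laid side by side in a small Python sketch. The policy names (`retry`, `ignore`, `notify`) and the function itself are invented for illustration; only the three behaviors come from the text above.

```python
def on_dns_failure(policy, recipients, invoking_user):
    """Return (recipients_to_deliver_to, delete_from_server)."""
    if policy == "retry":
        # Current behavior: no forwarding, no deletion; the message
        # stays queued on the server for a later poll.
        return [], False
    if policy == "ignore":
        # Treat the failed lookup as a non-match and carry on.
        return list(recipients), True
    if policy == "notify":
        # Deliver as usual, plus a copy to fetchmail's invoking user.
        extra = [] if invoking_user in recipients else [invoking_user]
        return list(recipients) + extra, True
    raise ValueError(f"unknown policy: {policy}")
```

Only `retry` can leave a message stuck forever when the error is permanent; the other two trade that risk for possible misdelivery when the error was in fact transient.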

+ +

IPv6 and IPSEC

+ +

The IPv6 support patches are really more protocol-family +independence patches. Because of this, in most places, "ports" +(numbers) have been replaced with "services" (strings, that may be +digits). This allows us to run with certain protocols that use +strings as "service names" where we in the IP world think of port +numbers. Someday we'll plumb strings all over and then, if inet6 is +not enabled, do a getservbyname() down in SocketOpen. The IPv6 +support patches use getaddrinfo(), which is a POSIX p1003.1g +mandated function. So, in the not too distant future, we'll zap the +ifdefs and just let autoconf check for getaddrinfo. IPv6 support +comes pretty much automatically once you have protocol family +independence.
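
A short Python sketch shows why passing the service as a string falls out naturally from `getaddrinfo()`. Python's `socket.getaddrinfo` wraps the same C call; the helper names `resolve` and `open_socket` are made up here and do not correspond to fetchmail's SocketOpen.

```python
import socket


def resolve(host, service, numeric_host=False):
    """Resolve host/service to (family, sockaddr) pairs for any family."""
    flags = socket.AI_NUMERICHOST if numeric_host else 0
    results = socket.getaddrinfo(
        host, service,
        family=socket.AF_UNSPEC, type=socket.SOCK_STREAM, flags=flags,
    )
    return [(family, sockaddr) for family, _t, _p, _canon, sockaddr in results]


def open_socket(host, service):
    """Try each resolved address in turn, IPv4 or IPv6 alike."""
    last_error = None
    for family, sockaddr in resolve(host, service):
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError as err:
            last_error = err
            continue
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            last_error = err
            sock.close()
    raise last_error or OSError("no addresses found")
```

The service argument may be a name such as `"pop3"` or a string of digits such as `"110"`; the caller never touches an address family or a port integer, which is the protocol-family independence described above.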

+ +

Internationalization

+ +

Internationalization is handled using GNU gettext (see the file +ABOUT_NLS in the source distribution). This places some minor +constraints on the code.

+ +

Strings that must be subject to translation should be wrapped +with GT_() or N_() -- the former in function arguments, the latter +in static initializers and other non-function-argument +contexts.
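
The split between the two wrappers can be illustrated with Python's `gettext` module. This is an analogy, not fetchmail's C macros: `GT_` and `N_` are defined locally to mirror the names, and the message table is invented. In C the split is forced because a static initializer cannot contain a function call; the Python parallel is a table built before any message catalog has been selected.

```python
import gettext

# Translate immediately: the form used in function arguments.
GT_ = gettext.gettext


def N_(message):
    """Mark a string for extraction without translating it yet."""
    return message


# Static table: entries are only marked here, then translated at the
# point of use, once the right catalog is in effect.
ERROR_TEXT = {
    1: N_("no mail for this user"),
    2: N_("socket error"),
}


def describe(code):
    """Look up a marked string and translate it at the point of use."""
    return GT_(ERROR_TEXT[code])


def report(count):
    """Function-argument context: translate right at the call site."""
    return GT_("%d messages waiting") % count
```

With no catalog installed, both paths return the original English text, so the wrappers cost nothing in an untranslated build.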

+ +

Checklist for Adding Options

+ +

Adding a control option is not complicated in principle, but +there are a lot of fiddly details in the process. You'll need to do +the following minimum steps.

+ + + +

There may be other things you have to do in the way of logic, of +course.

+ +

Before you implement an option, though, think hard. Is there any +way to make fetchmail automatically detect the circumstances under +which it should change its behavior? If so, don't write an option. +Just do the check!

+ +

Lessons learned

+ +

1. Server-side state is essential

+ +

The person(s) responsible for removing LAST from POP3 deserve to +suffer. Without it, a client has no way to know which messages in a +box have been read by other means, such as an MUA running on the +server.

+ +

The POP3 UID feature described in RFC1725 to replace LAST is +insufficient. The only problem it solves is tracking which messages +have been read by this client -- and even that requires +tricky, fragile implementation.

+ +

The underlying lesson is that maintaining accessible server-side +`seen' state bits associated with Status headers is indispensable +in a Unix/RFC822 mail server protocol. IMAP gets this right.
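
A minimal Python sketch of client-side UID bookkeeping shows how narrow the guarantee is. The function names and the in-memory set are assumptions for illustration; a real client must also persist this state between runs.

```python
def unseen_by_this_client(server_uids, seen_uids):
    """Return UIDs on the server that this client has not fetched before."""
    return [uid for uid in server_uids if uid not in seen_uids]


def update_seen(server_uids, seen_uids, fetched):
    """Remember what was fetched; forget UIDs the server no longer lists."""
    return (set(seen_uids) | set(fetched)) & set(server_uids)
```

A message already read on the server with an MUA still shows up in `unseen_by_this_client`, because the only state consulted is the client's own list.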

+ +

2. Readable text protocol transactions are a Good Thing

+ +

A nice thing about the general class of text-based protocols +that SMTP, POP2, POP3, and IMAP belong to is that client/server +transactions are easy to watch and transaction code correspondingly +easy to debug. Given a decent layer of socket utility functions +(which Carl provided) it's easy to write protocol engines and not +hard to show that they're working correctly.

+ +

This is an advantage not to be despised! Because of it, this +project has been interesting and fun -- no serious or persistent +bugs, no long hours spent looking for subtle pathologies.

+ +

3. IMAP is a Good Thing.

+ +

Now that there is a standard IMAP equivalent of the POP3 APOP +validation in CRAM-MD5, POP3 is completely obsolete.

+ +

4. SMTP is the Right Thing

+ +

In retrospect it seems clear that this program (and others like +it) should have been designed to forward via SMTP from the +beginning. This lesson may be applicable to other Unix programs +that now call the local MDA/MTA as a program.

+ +

5. Syntactic noise can be your friend

+ +

The optional `noise' keywords in the rc file syntax started out +as a late-night experiment. The English-like syntax they allow is +considerably more readable than the traditional terse keyword-value +pairs you get when you strip them all out. I think there may be a +wider lesson here.
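
The effect is easy to see with a toy Python filter. It assumes the noise words are the ones listed in the fetchmail manual page (`and`, `with`, `has`, `wants`, `options`); the function is illustrative and ignores quoting, which a real rc-file parser has to respect.

```python
# Assumed noise words, as listed in the fetchmail manual page.
NOISE_WORDS = {"and", "with", "has", "wants", "options"}


def strip_noise(tokens):
    """Drop noise keywords, leaving the terse keyword-value form."""
    return [tok for tok in tokens if tok.lower() not in NOISE_WORDS]
```

Fed a made-up line such as `poll pop.example.com with protocol pop3 and user jdoe`, the filter leaves `poll pop.example.com protocol pop3 user jdoe`: the same meaning to the parser, noticeably harder on the eye.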

+ +

Motivation and validation

+ +

It is truly written: the best hacks start out as personal +solutions to the author's everyday problems, and spread because the +problem turns out to be typical for a large class of users. So it +was with Carl Harris and the ancestral popclient, and so with me +and fetchmail.

+ +

It's gratifying that fetchmail has become so popular. Until just +before 1.9 I was designing strictly to my own taste. The multi-drop +mailbox support and the new --limit option were the first features +to go in that I didn't need myself.

+ +

By 1.9, four months after I started hacking on popclient and a +month after the first fetchmail release, there were literally a +hundred people on the fetchmail-friends contact list. That's pretty +powerful motivation. And they were a good crowd, too, sending fixes +and intelligent bug reports in volume. A user population like that +is a gift from the gods, and this is my expression of +gratitude.

+ +

The beta testers didn't know it at the time, but they were also +the subjects of a sociological experiment. The results are +described in my paper, The +Cathedral And The Bazaar.

+ +

Credits

+ +

Special thanks go to Carl Harris, who built a good solid code +base and then tolerated me hacking it out of recognition. And to +Harry Hochheiser, who gave me the idea of the SMTP-forwarding +delivery mode.

+ +

Other significant contributors to the code have included Dave +Bodenstab (error.c code and --syslog), George Sipe (--monitor and +--interface), Gordon Matzigkeit (netrc.c), Al Longyear (UIDL +support), Chris Hanson (Kerberos V4 support), and Craig Metz (OPIE, +IPv6, IPSEC).

+ +

Conclusion

+ +

At this point, the fetchmail code appears to be pretty stable. +It will probably undergo substantial change only if and when +support for a new retrieval protocol or authentication method is +added.

+ +

Relevant RFCs

+ +

Not all of these describe standards explicitly used in +fetchmail, but they all shaped the design in one way or +another.

+ +
+
RFC821
+ +
SMTP protocol
+ +
RFC822
+ +
Mail header format
+ +
RFC937
+ +
Post Office Protocol - Version 2
+ +
RFC974
+ +
MX routing
+ +
RFC976
+ +
UUCP mail format
+ +
RFC1081
+ +
Post Office Protocol - Version 3
+ +
RFC1123
+ +
Host requirements (modifies 821, 822, and 974)
+ +
RFC1176
+ +
Interactive Mail Access Protocol - Version 2
+ +
RFC1203
+ +
Interactive Mail Access Protocol - Version 3
+ +
RFC1225
+ +
Post Office Protocol - Version 3
+ +
RFC1344
+ +
Implications of MIME for Internet Mail Gateways
+ +
RFC1413
+ +
Identification server
+ +
RFC1428
+ +
Transition of Internet Mail from Just-Send-8 to 8-bit +SMTP/MIME
+ +
RFC1460
+ +
Post Office Protocol - Version 3
+ +
RFC1508
+ +
Generic Security Service Application Program Interface
+ +
RFC1521
+ +
MIME: Multipurpose Internet Mail Extensions
+ +
RFC1869
+ +
SMTP Service Extensions (ESMTP spec)
+ +
RFC1652
+ +
SMTP Service Extension for 8bit-MIMEtransport
+ +
RFC1725
+ +
Post Office Protocol - Version 3
+ +
RFC1730
+ +
Interactive Mail Access Protocol - Version 4
+ +
RFC1731
+ +
IMAP4 Authentication Mechanisms
+ +
RFC1732
+ +
IMAP4 Compatibility With IMAP2 And IMAP2bis
+ +
RFC1734
+ +
POP3 AUTHentication command
+ +
RFC1870
+ +
SMTP Service Extension for Message Size Declaration
+ +
RFC1891
+ +
SMTP Service Extension for Delivery Status Notifications
+ +
RFC1892
+ +
The Multipart/Report Content Type for the Reporting of Mail +System Administrative Messages
+ +
RFC1893
+ +
Enhanced Mail System Status Codes
+ +
RFC1894
+ +
An Extensible Message Format for Delivery Status +Notifications
+ +
RFC1938
+ +
A One-Time Password System
+ +
RFC1939
+ +
Post Office Protocol - Version 3
+ +
RFC1957
+ +
Some Observations on Implementations of the Post Office +Protocol (POP3)
+ +
RFC1985
+ +
SMTP Service Extension for Remote Message Queue Starting
+ +
RFC2033
+ +
Local Mail Transfer Protocol
+ +
RFC2060
+ +
Internet Message Access Protocol - Version 4rev1
+ +
RFC2061
+ +
IMAP4 Compatibility With IMAP2bis
+ +
RFC2062
+ +
Internet Message Access Protocol - Obsolete Syntax
+ +
RFC2195
+ +
IMAP/POP AUTHorize Extension for Simple Challenge/Response
+ +
RFC2177
+ +
IMAP IDLE command
+ +
RFC2449
+ +
POP3 Extension Mechanism
+ +
RFC2554
+ +
SMTP Service Extension for Authentication
+ +
RFC2595
+ +
Using TLS with IMAP, POP3 and ACAP
+ +
RFC2645
+ +
On-Demand Mail Relay: SMTP with Dynamic IP Addresses
+ +
RFC2683
+ +
IMAP4 Implementation Recommendations
+ +
RFC2821
+ +
Simple Mail Transfer Protocol
+ +
RFC2822
+ +
Internet Message Format
+
+ + + +

Other useful documents

+ +
+
http://www.faqs.org/faqs/LANs/mail-protocols/
+ +
LAN Mail Protocols Summary
+
+ +
+ + + + + +
Back to Fetchmail Home Page$Date: 2003/02/28 11:26:47 $
+ +
+
Eric S. Raymond <esr@snark.thyrsus.com>
+ + + diff --git a/fetchmail-FAQ.html b/fetchmail-FAQ.html index 9968034f..acf0b27c 100644 --- a/fetchmail-FAQ.html +++ b/fetchmail-FAQ.html @@ -14,8 +14,6 @@ content="Frequently asked questions about fetchmail."/> Back to Fetchmail Home Page -To Site -Map $Date: 2004/01/13 08:46:00 $ @@ -417,7 +415,7 @@ fetchmail simple so it stays reliable.

For reasons fetchmail doesn't have other commonly-requested features (such as password encryption, or multiple concurrent polls from the same instance of fetchmail) see the design +href="http://fetchmail.berlios.de/esrs-design-notes.html">design notes.

Fetchmail is a mature project, no longer in constant active @@ -3546,8 +3544,6 @@ does something like "date >> $HOME/Procmail/fetchmail.log".

Back to Fetchmail Home Page -To Site -Map $Date: 2004/01/13 08:46:00 $ diff --git a/fetchmail-features.html b/fetchmail-features.html index 7a07d266..95f8db55 100644 --- a/fetchmail-features.html +++ b/fetchmail-features.html @@ -17,7 +17,6 @@ -
Back to Fetchmail Home PageTo Site Map $Date: 2003/07/22 02:32:06 $
@@ -280,7 +279,6 @@ be unique to fetchmail if I hadn't added it to fetchpop.) -
Back to Fetchmail Home PageTo Site Map $Date: 2003/07/22 02:32:06 $
diff --git a/history.html b/history.html index da683297..63d0355f 100644 --- a/history.html +++ b/history.html @@ -22,8 +22,6 @@ content="Fetchmail participation statistics" /> - -
Back to Eric's Home PageUp to Site Map $Date: 2003/12/09 16:59:26 $
@@ -115,8 +113,6 @@ of eligible programmers are rising on trend curves of the same
- -
Back to Eric's Home PageUp to Site Map $Date: 2003/12/09 16:59:26 $
diff --git a/specgen.sh b/specgen.sh index 651efc3a..bb493cc0 100755 --- a/specgen.sh +++ b/specgen.sh @@ -167,7 +167,7 @@ rm -rf \$RPM_BUILD_ROOT %doc ABOUT-NLS FAQ COPYING FEATURES NEWS %doc NOTES OLDNEWS README README.SSL %doc contrib -%doc fetchmail-features.html fetchmail-FAQ.html design-notes.html +%doc fetchmail-features.html fetchmail-FAQ.html esrs-design-notes.html %attr(644, root, man) %{_mandir}/man1/fetchmail.1* %attr(755, root, root) /usr/bin/fetchmail # Uncomment the following to support internationalization -- cgit v1.2.3