<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
 <HEAD>
   <TITLE> [fetchmail]fetchmail vs Maillenium; mail truncated to 80K
   </TITLE>
   <LINK REL="Index" HREF="index.html" >
   <LINK REL="made" HREF="mailto:jcfoley%40comcast.net">
   <META NAME="robots" CONTENT="index,nofollow">
   
   <LINK REL="Previous"  HREF="008522.html">
   <LINK REL="Next"  HREF="008524.html">
 </HEAD>
 <BODY BGCOLOR="#ffffff">
   <H1>[fetchmail]fetchmail vs Maillenium; mail truncated to 80K
   </H1>
    <B>jcfoley@comcast.net
    </B> 
    <A HREF="mailto:jcfoley%40comcast.net"
       TITLE="[fetchmail]fetchmail vs Maillenium; mail truncated to 80K">jcfoley@comcast.net
       </A><BR>
    <I>Fri, 23 Apr 2004 02:51:22 +0000</I>
    <P><UL>
        <LI> Previous message: <A HREF="008522.html">[fetchmail]fetchmail vs Maillenium; mail truncated to 80K
</A></li>
        <LI> Next message: <A HREF="008524.html">[fetchmail]fetchmail vs Maillenium; mail truncated to 80K
</A></li>
         <LI> <B>Messages sorted by:</B> 
              <a href="date.html#8523">[ date ]</a>
              <a href="thread.html#8523">[ thread ]</a>
              <a href="subject.html#8523">[ subject ]</a>
              <a href="author.html#8523">[ author ]</a>
         </LI>
       </UL>
    <HR>  
<!--beginarticle-->
<PRE>You're probably using a Comcast POP3 server.  Many others have
experienced this problem.  The problem is that the server truncates
the amount of data returned by the POP3 TOP command.  Comcast changed
to the Maillennium POP3 server in Summer 2003.  For several months
they refused to acknowledge any issue at their end that would account
for email truncation.  Recently the Comcast Government Affairs Manager
at Comcast of Montgomery (Maryland) sent me the information at the end
of this message.

I believe the Outlook Express flaw they reference was fixed a few
years ago.  Regardless, it does seem to be a strange and non-conforming
server implementation that silently does the wrong thing, contrary to
what is specified by the RFC and done by every other server I've used.

On the other hand, people have made the comment that fetchmail should
not be relying on TOP because a) that's not what it is for and/or b)
it is an optional POP3 command.

Item I8 of the fetchmail FAQ, which appears to be maintained by Eric
S. Raymond, says, &quot;Don't mistake this for a fetchmail bug.&quot;

It would be nice to hear from a fetchmail expert/authority on whether
fetchmail is doing the right thing by using TOP and for a rationale of
the FAQ's response.

If fetchmail's use of TOP is legitimate then maybe Comcast would
uncripple their server if more people complained.

Jim Foley

=======================================================================
=======================================================================

Date: Wed, 3 Mar 2004 11:59:17 -0500

Mr. Foley, this email responds to the questions you posed following our
conference call.

First, Comcast does support POP 3 TOP commands, however Comcast has found
that increasing the amount of data TOP returns beyond the value of 64K has a
tendency to crash Microsoft Outlook Express when an abnormally large header
is sent.  Increasing the value beyond 64K would open the platform to
malicious use of large headers that adversely impacts system performance.
Virtually all of Comcast's high-speed Internet customers use Outlook
Express. Comcast has not received requests from other subscribers who seek
to use the TOP command in the manner you have requested.  Further, Comcast
has not received any other complaints regarding email truncation with the
TOP command.  Should you wish to continue checking your mail through manual
commands you might try using the RETR command, which will return the entire
message.

...



Date: Fri, 5 Mar 2004 16:28:11 -0500

Mr. Foley:

This is in response to your question regarding &quot;POP 3 RFC compliance.&quot;  We
have tried to answer your question about Comcast's services by talking about
the specific application in which you are interested and how that
application relates to technical information regarding the configuration of
Comcast's Internet service.  We have provided you all the information that
we can by explaining that Comcast limits the optional POP 3 Top Command to a
value of 64k because any larger value has a tendency to crash Microsoft
Outlook and could leave Comcast's system open to the malicious use of large
headers intended to impair system performance.

The decision by Comcast to place limitations on the optional POP 3 TOP email
commands is a technical business decision made by Comcast in the best
interest of all its customers and its system. ...

...

With respect to the specific RFC at issue, RFC 1939, POP 3, it is our
understanding that it is a protocol &quot;intended to permit a workstation to
dynamically access a maildrop on a server host in a useful fashion.
Usually, this means that the POP3 protocol is used to allow a workstation to
retrieve mail that the server is holding for it.  Pop 3 is not intended to
provide extensive manipulation operations of mail on the server.&quot;  POP 3 was
created in May 1996 and has not been revised since, despite the many changes
in computer hardware and software related to handling of email since that
time.  In any event, the TOP command is identified as an optional POP 3
command in RFC 1939.

...


</PRE>
<!--endarticle-->
    <HR>
</body></html>
Until popclient 3.05 I was simply following out the implications of Carl's basic design. He already had daemon.c in the distribution, and I wanted daemon mode almost as badly as I wanted the header rewrite feature. The other things I added were bug fixes or minor extensions.<P>
After 3.1, when I put in SMTP-forwarding support (more about this below) the nature of the project changed -- it became a carefully-thought-out attempt to render obsolete every other program in its class. The name change quickly followed.<P>
<H1>The rewrite option</H1>
RFC 1123 stipulates that MTAs ought to canonicalize the addresses of outgoing mail so that From:, To:, Cc:, Bcc: and other address headers contain only fully qualified domain names. Failure to do so can break the reply function on many mailers.<P>
This problem only becomes obvious when a reply is generated on a machine different from where the message was delivered. The two machines will have different local username spaces, potentially leading to misrouted mail.<P>
Most MTAs (and sendmail in particular) do not canonicalize address headers in this way (violating RFC 1123). Fetchmail therefore has to do it. This is the first feature I added to the ancestral popclient.<P>
<H1>Reorganization</H1>
The second thing I did was reorganize and simplify popclient a lot. Carl Harris's implementation was very sound, but exhibited a kind of unnecessary complexity common to many C programmers. He treated the code as central and the data structures as support for the code. As a result, the code was beautiful but the data structure design ad-hoc and rather ugly (at least to this old LISP hacker).<P>
I was able to improve matters significantly by reorganizing most of the program around the `query' data structure and eliminating a bunch of global context. This especially simplified the main sequence in fetchmail.c and was critical in enabling the daemon mode changes.<P>
<H1>IMAP support and the method table</H1>
The next step was IMAP support.
I initially wrote the IMAP code as a generic query driver and a method table. The idea was to have all the protocol-independent setup logic and flow of control in the driver, and the protocol-specific stuff in the method table.<P>
Once this worked, I rewrote the POP3 code to use the same organization. The POP2 code kept its own driver for a couple more releases, until I found sources of a POP2 server to test against (the breed seems to be nearly extinct).<P>
The purpose of this reorganization, of course, is to trivialize the development of support for future protocols as much as possible. All mail-retrieval protocols have to have pretty similar logical design by the nature of the task. By abstracting out that common logic and its interface to the rest of the program, both the common and protocol-specific parts become easier to understand.<P>
Furthermore, many kinds of new features can instantly be supported across all protocols by modifying the one driver module.<P>
<H1>Implications of SMTP forwarding</H1>
The direction of the project changed radically when Harry Hochheiser sent me his scratch code for forwarding fetched mail to the SMTP port. I realized almost immediately that a reliable implementation of this feature would make all the other delivery modes obsolete.<P>
Why mess with all the complexity of configuring an MDA or setting up lock-and-append on a mailbox when port 25 is guaranteed to be there on any platform with TCP/IP support in the first place? Especially when this means retrieved mail is guaranteed to look like normal sender-initiated SMTP mail, which is really what we want anyway.<P>
Clearly, the right thing to do was (1) hack SMTP forwarding support into the generic driver, (2) make it the default mode, and (3) eventually throw out all the other delivery modes.<P>
I hesitated over step 3 for some time, fearing to upset long-time popclient users dependent on the alternate delivery mechanisms.
In theory, they could immediately switch to .forward files or their non-sendmail equivalents to get the same effects. In practice the transition might have been messy.<P>
But when I did it (see the NEWS note on the great options massacre) the benefits proved huge. The cruftiest parts of the driver code vanished. Configuration got radically simpler -- no more grovelling around for the system MDA and user's mailbox, no more worries about whether the underlying OS supports file locking.<P>
Also, the only way to lose mail vanished. If you specified localfolder and the disk got full, your mail got lost. This can't happen with SMTP forwarding because your SMTP listener won't return OK unless the message can be spooled or processed.<P>
Also, performance improved (though not so you'd notice it in a single run). Another not insignificant benefit of this change was that the manual page got a lot simpler.<P>
Later, I had to bring --mda back in order to allow handling of some obscure situations involving dynamic SLIP. But I found a much simpler way to do it.<P>
The moral? Don't hesitate to throw away superannuated features when you can do it without loss of effectiveness. I tanked a couple I'd added myself and have no regrets at all. As Saint-Exupery said, "Perfection [in design] is achieved not when there is nothing more to add, but rather when there is nothing more to take away." This program isn't perfect, but it's trying.<P>
<H1>The most-requested features that I will never add, and why not:</H1>
<H2>1. 
Password encryption in .fetchmailrc</H2>
The reason there's no facility to store passwords encrypted in the .fetchmailrc file is that this doesn't actually add protection.<P>
Anyone who's acquired the 0600 permissions needed to read your .fetchmailrc file will be able to run fetchmail as you anyway -- and if it's your password they're after, they'd be able to rip the necessary decoder out of the fetchmail code itself to get it.<P>
All .fetchmailrc encryption would do is give a false sense of security to people who don't think very hard.<P>
<H2>2. Truly concurrent queries to multiple hosts</H2>
Occasionally I get a request for this on "efficiency" grounds. These people aren't thinking either. True concurrency would do nothing to lessen fetchmail's total IP volume. The best it could possibly do is change the usage profile to shorten the duration of the active part of a poll cycle at the cost of increasing its demand on IP volume per unit time.<P>
If one could thread the protocol code so that fetchmail didn't block on waiting for a protocol response, but rather switched to trying to process another host query, one might get an efficiency gain (close to constant loading at the single-host level).<P>
Fortunately, I've only seldom seen a server that incurred significant wait time on an individual response. I judge the gain from this not worth the hideous complexity increase it would require in the code.<P>
<H1>Multidrop and alias handling</H1>
I decided to add the multidrop support partly because some users were clamoring for it, but mostly because I thought it would shake bugs out of the single-drop code by forcing me to deal with addressing in full generality. And so it proved.<P>
There are three important aspects of the features for handling multiple-drop aliases and mailing lists which future hackers should be careful to preserve.<P>
<OL>
<LI> The logic path for single-recipient mailboxes doesn't involve header parsing or DNS lookups at all.
This is important -- it means the code for the most common case can be much simpler and more robust.<P>
<LI> The multidrop handling does <EM>not</EM> rely on doing the equivalent of passing the message to sendmail -oem -t. Instead, it explicitly mines members of a specified set of local usernames out of the header.<P>
<LI> We do <EM>not</EM> attempt delivery to multidrop mailboxes in the presence of DNS errors. Before each multidrop poll we probe DNS to see if we have a nameserver handy. If not, the poll is skipped. If DNS crashes during a poll, the error return from the next nameserver lookup aborts message delivery and ends the poll. The daemon mode will then quietly spin until DNS comes up again, at which point it will resume delivering mail.<P>
</OL>
When I designed this support, I was terrified of doing anything that could conceivably cause a mail loop (you should be too). That's why the code as written can only append <EM>local</EM> names (never @-addresses) to the recipients list.<P>
The code in mxget.c is nasty, no two ways about it. But it's utterly necessary; there are a lot of MX pointers out there. It really ought to be a (documented!) entry point in the bind library.<P>
<H1>DNS error handling</H1>
Fetchmail's behavior on DNS errors is to suppress forwarding and deletion of the individual message in which each error occurs, leaving it queued on the server for retrieval on a subsequent poll. The assumption is that DNS errors are transient, due to temporary server outages.<P>
Unfortunately this means that if a DNS error is permanent a message can be perpetually stuck in the server mailbox.
We've had a couple of bug reports of this kind due to subtle RFC822 parsing errors in the fetchmail code that resulted in impossible things getting passed to the DNS lookup routines.<P>
Alternative ways to handle the problem: ignore DNS errors (treating them as a non-match on the mailserver domain), or forward messages with errors to fetchmail's invoking user in addition to any other recipients. These would fit an assumption that DNS lookup errors are likely to be permanent problems associated with an address.<P>
<H1>Lessons learned</H1>
<H3>1. Server-side state is essential</H3>
The person(s) responsible for removing LAST from POP3 deserve to suffer. Without it, a client has no way to know which messages in a box have been read by other means, such as an MUA running on the server.<P>
The POP3 UID feature described in RFC1725 to replace LAST is insufficient. The only problem it solves is tracking which messages have been read <EM>by this client</EM> -- and even that requires tricky, fragile implementation.<P>
The underlying lesson is that maintaining accessible server-side `seen' state bits associated with Status headers is indispensable in a Unix/RFC822 mail server protocol. IMAP gets this right.<P>
<H3>2. Readable text protocol transactions are a Good Thing</H3>
A nice thing about the general class of text-based protocols that SMTP, POP2, POP3, and IMAP belong to is that client/server transactions are easy to watch and transaction code correspondingly easy to debug. Given a decent layer of socket utility functions (which Carl provided) it's easy to write protocol engines and not hard to show that they're working correctly.<P>
This is an advantage not to be despised! Because of it, this project has been interesting and fun -- no serious or persistent bugs, no long hours spent looking for subtle pathologies.<P>
<H3>3. IMAP is a Good Thing.</H3>
If there were a standard IMAP equivalent of the POP3 APOP validation, POP3 would be completely obsolete.<P>
<H3>4. 
SMTP is the Right Thing</H3>
In retrospect it seems clear that this program (and others like it) should have been designed to forward via SMTP from the beginning. This lesson may be applicable to other Unix programs that now call the local MDA/MTA as a program.<P>
<H3>5. Syntactic noise can be your friend</H3>
The optional `noise' keywords in the rc file syntax started out as a late-night experiment. The English-like syntax they allow is considerably more readable than the traditional terse keyword-value pairs you get when you strip them all out. I think there may be a wider lesson here.<P>
<H1>Motivation and validation</H1>
It is truly written: the best hacks start out as personal solutions to the author's everyday problems, and spread because the problem turns out to be typical for a large class of users. So it was with Carl Harris and the ancestral popclient, and so with me and fetchmail.<P>
It's gratifying that fetchmail has become so popular. Until just before 1.9 I was designing strictly to my own taste. The multi-drop mailbox support and the new --limit option were the first features to go in that I didn't need myself.<P>
By 1.9, four months after I started hacking on popclient and a month after the first fetchmail release, there were literally a hundred people on the fetchmail-friends contact list. That's pretty powerful motivation. And they were a good crowd, too, sending fixes and intelligent bug reports in volume. A user population like that is a gift from the gods, and this is my expression of gratitude.<P>
The beta testers didn't know it at the time, but they were also the subjects of a sociological experiment. The results are described in my paper, <cite>The Cathedral And The Bazaar</cite>, available on the <a href="http://www.ccil.org/~esr/fetchmail">Fetchmail home page</a>.
<H1>Credits</H1>
Special thanks go to Carl Harris, who built a good solid code base and then tolerated me hacking it out of recognition.
And to Harry Hochheiser, who gave me the idea of the SMTP-forwarding delivery mode.<P>
Other significant contributors to the code have included Dave Bodenstab (error.c code and --syslog), George Sipe (--monitor and --interface), Gordon Matzigkeit (netrc.c), Al Longyear (UIDL support), and Nalin Dahyabhai (Kerberos V4 support).<P>
<H1>Conclusion</H1>
At this point, the fetchmail code appears to be pretty stable. It will probably undergo substantial change only if and when support for a new retrieval protocol or authentication method is added.<P>
<H1>Relevant RFCs</H1>
Not all of these describe standards explicitly used in fetchmail, but they all shaped the design in one way or another.<P>
<DL>
<DT>RFC821<DD> SMTP protocol
<DT>RFC822<DD> Mail header format
<DT>RFC937<DD> Post Office Protocol - Version 2
<DT>RFC974<DD> MX routing
<DT>RFC976<DD> UUCP mail format
<DT>RFC1081<DD> Post Office Protocol - Version 3
<DT>RFC1123<DD> Host requirements (modifies 821, 822, and 974)
<DT>RFC1176<DD> Interactive Mail Access Protocol - Version 2
<DT>RFC1203<DD> Interactive Mail Access Protocol - Version 3
<DT>RFC1225<DD> Post Office Protocol - Version 3
<DT>RFC1344<DD> Implications of MIME for Internet Mail Gateways
<DT>RFC1413<DD> Identification server
<DT>RFC1428<DD> Transition of Internet Mail from Just-Send-8 to 8-bit SMTP/MIME
<DT>RFC1460<DD> Post Office Protocol - Version 3
<DT>RFC1521<DD> MIME: Multipurpose Internet Mail Extensions
<DT>RFC1869<DD> SMTP Service Extensions (ESMTP spec)
<DT>RFC1652<DD> SMTP Service Extension for 8bit-MIMEtransport
<DT>RFC1725<DD> Post Office Protocol - Version 3
<DT>RFC1730<DD> Interactive Mail Access Protocol - Version 4
<DT>RFC1731<DD> IMAP4 Authentication Mechanisms
<DT>RFC1732<DD> IMAP4 Compatibility With IMAP2 And IMAP2bis
<DT>RFC1734<DD> POP3 AUTHentication command
<DT>RFC1870<DD> SMTP Service Extension for Message Size Declaration
<DT>RFC1891<DD> SMTP Service Extension for Delivery Status Notifications
<DT>RFC1893<DD> Enhanced Mail System 
Status Codes
<DT>RFC1894<DD> An Extensible Message Format for Delivery Status Notifications
<DT>RFC1938<DD> A One-Time Password System
<DT>RFC1939<DD> Post Office Protocol - Version 3
<DT>RFC1985<DD> SMTP Service Extension for Remote Message Queue Starting
<DT>RFC2060<DD> Internet Message Access Protocol - Version 4rev1
<DT>RFC2061<DD> IMAP4 Compatibility With IMAP2bis
<DT>RFC2062<DD> Internet Message Access Protocol - Obsolete Syntax
</DL>
<HR>
<table width="100%" cellpadding=0><tr>
<td width="30%">Back to <a href="index.html">Fetchmail Home Page</a>
<td width="30%" align=center>To <a href="/~esr/sitemap.html">Site Map</a>
<td width="30%" align=right>$Date: 1997/08/05 04:14:49 $
</table>
<P><ADDRESS>Eric S. Raymond <A HREF="mailto:esr@thyrsus.com">&lt;esr@snark.thyrsus.com&gt;</A></ADDRESS>
</BODY> </HTML>