Relaying

Here are some possible headers from a message that had a very different "life cycle" than anything described so far:

Received: from unwilling.intermediary.com (unwilling.intermediary.com
[98.134.11.32]) by mail.someisp.com (8.8.5) id 004B32 for
; Wed, Jul 30 1997 16:39:50 -0800 (PST)
Received: from turmeric.com ([104.128.23.115]) by unwilling.intermediary.com
(8.6.5/8.5.8) with SMTP id LAA12741; Wed, Jul 30 1997 19:36:28 -0500 (EST)
From: Anonymous Spammer
To: (recipient list suppressed)
Message-Id:
X-Mailer: Massive Annoyance
Subject: WANT HOT PORN???

A variety of things in this header might clue the reader in to the fact that this is a piece of electronic junk mail, but the thing to focus on here is the Received: lines. This message originated at turmeric.com, was passed from there to unwilling.intermediary.com, and from there to its final destination at mail.someisp.com. All well and good, but how did unwilling.intermediary.com get involved, since it has nothing to do with either the sender or the recipient? Understanding the answer requires some knowledge of SMTP. In essence, turmeric.com simply connected to the SMTP port at unwilling.intermediary.com and told it "Send this message to diane@someisp.com". It probably did this in the most direct manner imaginable, by saying RCPT TO: diane@someisp.com. At that point, unwilling.intermediary.com took over processing the message. Since it had been told to send the message to a user at some other domain (someisp.com), it went out and found the mail server for that domain and handed off the mail in the usual manner. This process is known as mail relaying.
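The path just described can be read off the Received: lines mechanically. What follows is only a rough sketch in Python, not a robust parser: the function name trace_hops and the regular expression are illustrative, and the sample headers are modeled on the example above with the "for" clause and Message-Id left out, since they are not needed for tracing.

```python
import re
from email import message_from_string

# Sample headers modeled on the example above (assumption: the "for"
# clause and the Message-Id value are omitted; they play no part here).
RAW = """\
Received: from unwilling.intermediary.com (unwilling.intermediary.com
 [98.134.11.32]) by mail.someisp.com (8.8.5) id 004B32;
 Wed, Jul 30 1997 16:39:50 -0800 (PST)
Received: from turmeric.com ([104.128.23.115]) by unwilling.intermediary.com
 (8.6.5/8.5.8) with SMTP id LAA12741; Wed, Jul 30 1997 19:36:28 -0500 (EST)
From: Anonymous Spammer
Subject: example

body
"""

# Illustrative pattern: the name after "from" and the name after "by".
HOP = re.compile(r"from\s+(\S+).*?\bby\s+(\S+)", re.DOTALL)


def trace_hops(raw_message):
    """Return (sender, receiver) pairs in the order the message travelled."""
    msg = message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        match = HOP.search(value)
        if match:
            hops.append((match.group(1), match.group(2)))
    # Each machine adds its line at the top, so the oldest hop comes last.
    hops.reverse()
    return hops


for sender, receiver in trace_hops(RAW):
    print(sender, "->", receiver)
```

Keep in mind that the name after "from" is only what the sending machine claimed to be, so the output is a starting point for investigation rather than proof.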

Historically, there are good reasons for allowing relaying. On much of the Net until about the late 1980s, machines rarely sent mail by talking directly to each other. Rather, they worked out a route for a message to travel, and sent it step-by-step along that route. It was a cumbersome system, especially since the sender often had to work out the route by hand! By way of analogy, imagine sending a letter from Phoenix to New York, and having to address the envelope thus:

Phoenix, Flagstaff, Albuquerque, Salt Lake City, Rock Springs, Laramie, North Platte, Lincoln, Omaha, Des Moines, Cedar Rapids, Dubuque, Rockford, Chicago, Gary, Elkhart, Fort Wayne, Toledo, Cleveland, Erie, Elmira, Williamsport, Newark, New York City, Greenwich Village, #86 Deadbeat Row, Apt. #2b, Bob Dylan.

It's clear why this is a useful addressing model if you're a postal worker: the post office in Gary, Indiana only has to be able to communicate with the adjacent offices in Chicago and Elkhart, rather than having to devote its resources to figuring out how to get something to New York. (It's also clear why this isn't a good idea from the standpoint of the letter-writer, and why email is no longer commonly routed this way!) This is exactly how email was sent, so it was important that one machine be able to give another instructions that said "I have email for diane@someisp.com, to be sent from you to turmeric.com to galangal.org to asafoetida.com to diane@someisp.com". Hence relaying.

In modern times, however, relaying is usually used by unethical advertisers as a technique for concealing the source of their messages, deflecting complaints to the innocent relay site rather than to the spammers' own ISPs. It also offloads the work of processing addresses and contacting recipients from the spammers' machines to those of an uninvolved third party. For that reason, it's widely felt that relaying, especially large-scale relaying, constitutes theft of service. Reporting the abuse to the ISP involved is therefore the best you can do, since an ISP that is an unwilling participant is truly motivated to stop it. An ISP that is not unwilling still has reason to act, because it may be open to a lawsuit under the Telecommunications Act.

The essential point here is that the content of the message was formulated at the sending point: turmeric.com, in the example above. The machine unwilling.intermediary.com is involved only as an unwilling intermediary. Its operators have no control over the sender, any more than the Flagstaff post office has any real influence over someone writing letters in Phoenix. They do, however, have the power to turn off relaying at their site!
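What "turning off relaying" means in practice varies from one mail program to another, but the underlying decision is simple: accept mail addressed to your own users, and pass mail along to other domains only for clients you trust. Here is a minimal sketch of that decision in Python. The domain list, the network range, and the function name may_relay are all hypothetical, chosen only for illustration.

```python
import ipaddress

# Hypothetical site configuration (assumptions, not taken from any real site).
LOCAL_DOMAINS = {"intermediary.com"}
TRUSTED_NETWORKS = [ipaddress.ip_network("98.134.11.0/24")]


def may_relay(client_ip, recipient):
    """Decide whether to accept a RCPT TO address from this client."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in LOCAL_DOMAINS:
        return True  # final delivery to a local user, not relaying
    # Mail for another domain: pass it on only for our own machines.
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


# A stranger asking us to forward mail to someisp.com is refused.
print(may_relay("104.128.23.115", "diane@someisp.com"))
# One of our own machines may send mail anywhere.
print(may_relay("98.134.11.40", "diane@someisp.com"))
```

Under a policy like this, the connection from turmeric.com in the example would have been turned away at the RCPT TO step.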

One more thing to notice in the sample headers: The Message-Id: line was filled in, not by the sending machine (turmeric.com), but by the relayer (unwilling.intermediary.com). This is a common feature of relayed mail. It just reflects the fact that the sending machine didn't supply a Message-Id.

Envelope Headers

The section on SMTP, above, alluded to a distinction between "message" and "envelope" headers. This distinction and some of its consequences are detailed here.

Briefly, the "envelope" headers are actually generated by the machine that receives a message, rather than by the sender. By this definition, Received: headers are envelope headers. However, the term usually refers to the "envelope From" and "envelope To" only.

The envelope From header is the header derived from the information in a MAIL FROM command. For instance, if a sending machine says MAIL FROM: ginger@turmeric.com, the receiving machine will generate an envelope From header that looks like this:

From ginger@turmeric.com

Notice the absence of the colon---"From", not "From:". Frequently, envelope headers don't have colons after them. This convention is not universal, but it is common enough to pay attention to.

Symmetrically, the envelope To is derived from a RCPT TO command. If the sender says RCPT TO: diane@someisp.com, then the envelope To is diane@someisp.com. There often isn't an actual header containing this information; sometimes it's embedded in the Received: headers.

An important consequence of the existence of envelope information is that the message From: and To: headers are meaningless as far as delivery is concerned. The contents of the From: header are provided by the sender; and so, counterintuitively, are the contents of the To: header. Mail is routed only based on the envelope To, never based on the message To: header.

To see this in action, consider an SMTP transaction like this:

HELO galangal.org
250 mail.someisp.com Hello turmeric.com [104.128.23.115], pleased to meet you
MAIL FROM: forged-address@galangal.org
250 forged-address@galangal.org... Sender ok
RCPT TO: diane@someisp.com
250 diane@someisp.com... Recipient OK
DATA
354 Enter mail, end with "." on a line by itself
From: another-forged-address@lemongrass.org
To: (your address suppressed for stealth mailing and annoyance)
.
250 OAA08757 Message accepted for delivery

Here are the corresponding headers (excerpted for clarity):

From forged-address@galangal.org
Received: from galangal.org ([104.128.23.115]) by mail.someisp.com (8.8.5)
for ...
From: another-forged-address@lemongrass.org
To: (your address suppressed for stealth mailing and annoyance)

Notice that the contents of the envelope From, the message From:, and the message To: are all dictated by the sender, and have no bearing whatsoever on reality! This example illustrates why the From, From:, and To: headers can never be trusted in mail that might be forged. They are simply too easy to falsify.
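To make the separation concrete, here is a small Python sketch that reads the client's side of a transaction like the one above and sorts what it finds into envelope information and message headers. It is a toy for illustration, not a real SMTP server: the function name read_transaction and the dictionary layout are invented here, and only the commands that appear in the example are understood.

```python
def read_transaction(client_lines):
    """Split what an SMTP client sends into envelope data and message headers."""
    envelope = {"helo": None, "mail_from": None, "rcpt_to": []}
    headers = {}
    in_data = False     # True between DATA and the closing "."
    in_headers = False  # True until the first non-header line of the message
    for line in client_lines:
        if in_data:
            if line == ".":
                in_data = False          # a lone dot ends the message
            elif in_headers and ":" in line:
                name, value = line.split(":", 1)
                headers[name.strip()] = value.strip()
            else:
                in_headers = False       # blank line or body text
            continue
        upper = line.upper()
        if upper.startswith("HELO "):
            envelope["helo"] = line[5:].strip()
        elif upper.startswith("MAIL FROM:"):
            envelope["mail_from"] = line[10:].strip()
        elif upper.startswith("RCPT TO:"):
            envelope["rcpt_to"].append(line[8:].strip())
        elif upper == "DATA":
            in_data = True
            in_headers = True
    return envelope, headers


CLIENT_LINES = [
    "HELO galangal.org",
    "MAIL FROM: forged-address@galangal.org",
    "RCPT TO: diane@someisp.com",
    "DATA",
    "From: another-forged-address@lemongrass.org",
    "To: (your address suppressed for stealth mailing and annoyance)",
    ".",
]

envelope, headers = read_transaction(CLIENT_LINES)
print("Deliver to:", envelope["rcpt_to"])  # taken from RCPT TO
print("To: header:", headers["To"])        # plays no part in delivery
```

Only envelope["rcpt_to"] decides where the message goes. Everything collected in headers is simply text the sender chose to type.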

The Importance of Received: Headers

We've seen already, in the examples above, that the Received: headers provide a detailed log of a message's history, and so make it possible to draw some conclusions about the origin of a piece of email even when other headers have been forged. This section explores some details associated with these singularly important headers, and, in particular, how to circumvent common forgery techniques.

Unquestionably, the single most valuable forgery protection in the Received: headers is the information logged by the receiving host from the sender. Recall that the sender can lie about its identity by putting garbage in its HELO command to the receiver. Fortunately, modern mail transfer programs are able to detect such false information and correct it.

If, for instance, the machine turmeric.com, whose IP address is 104.128.23.115, sends a message to diane@someisp.com, but falsely says HELO galangal.org, the resultant Received: line might start like this:

Received: from galangal.org ([104.128.23.115]) by mail.someisp.com (8.8.5)...

(The rest of the line is omitted for clarity.)

Notice that, although the someisp.com machine doesn't explicitly announce that galangal.org wasn't really the sending machine, it does record the correct IP address of the sender. If someone receiving the mail had reason to think that galangal.org appeared in the headers through the work of a forger, they could look up the IP address 104.128.23.115 (you can go to www.amnesi.com for this) and find that that address in fact belonged to turmeric.com (and not galangal.org). In other words, logging the IP address of the sending machine provides enough information to confirm a suspected forgery.
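The same lookup can be done from a program. The sketch below uses Python's standard socket.gethostbyaddr call. The function names reverse_lookup and helo_matches are made up for this example, and the resolver parameter is an added convenience so the logic can be tried without a network connection.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name registered for ip, or None if there is none."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both kinds of OSError.
        return None
    return hostname


def helo_matches(helo_name, ip, resolver=socket.gethostbyaddr):
    """True if the name given in HELO agrees with the name found for ip."""
    actual = reverse_lookup(ip, resolver)
    return actual is not None and actual.lower() == helo_name.lower()


def example_resolver(ip):
    """Stand-in resolver that answers as the example in the text would."""
    return ("turmeric.com", [], [ip])


print(reverse_lookup("104.128.23.115", example_resolver))
print(helo_matches("galangal.org", "104.128.23.115", example_resolver))
```

A mismatch is a reason for suspicion rather than proof, since a machine can legitimately go by more than one name, and some addresses have no name registered at all.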

Many modern mail programs actually automate this process, looking up the name of the sending machine on their own. The lookup process is called reverse DNS (for Domain Name System); it is "reverse" because it reverses the usual process of translating a name to an address for routing purposes. If mail.someisp.com were using software that did this, the Received: line would start something like this:

Received: from galangal.org (turmeric.com [104.128.23.115]) by mail.someisp.com...

Here the forgery is crystal clear; this line effectively says "turmeric.com, whose address is 104.128.23.115, reported its name as galangal.org." Needless to say, information like this is extremely helpful in identifying and tracking forged email! (For this very reason, spammers try to avoid using relaying machines that report reverse DNS information. Sometimes they even find machines that don't do the kind of IP logging described in the previous paragraph, though there aren't very many of those around on the net any more.)

Another trick used by forgers of email, which is increasingly common, is adding spurious Received: headers before sending the offending mail. This means that the hypothetical email sent from turmeric.com might have Received: lines that looked something like this:

Received: from galangal.org ([104.128.23.115]) by mail.someisp.com (8.8.5)...
Received: from nowhere by fictitious-site (8.8.3/8.7.2)...
Received: No Information Here, Go Away!

Obviously, the last two lines are complete nonsense, written by the sender and attached to the message before it was sent.

Since the sender has no control over the message once it leaves turmeric.com, and Received: headers are always added at the top, the forged lines have to appear at the bottom of the list. This means that someone reading the lines from top to bottom, tracing the history of the message, can safely throw out anything after the first forged line. Even if the Received: lines after that point look plausible, they're guaranteed to be forgeries.
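The top-to-bottom reading can be sketched in Python as follows. The test used here for spotting the first forged line is an assumption made for the sake of the example, not an established rule: a line is treated as suspect if it doesn't have the "from ... by ..." shape, or if the machine it names after "by" is not the machine the line above it named after "from". The function name trusted_prefix is likewise invented.

```python
import re

# Illustrative pattern: the name after "from" and the name after "by".
HOP = re.compile(r"^from\s+(\S+).*?\bby\s+(\S+)", re.DOTALL)


def trusted_prefix(received_values):
    """Return the leading Received: values that form an unbroken chain."""
    trusted = []
    expected_by = None  # machine that should appear after "by" on the next line
    for value in received_values:
        match = HOP.search(value.strip())
        if match is None:
            break  # not even shaped like a real hop
        sender, receiver = (name.lower() for name in match.groups())
        if expected_by is not None and receiver != expected_by:
            break  # chain broken: discard this line and everything below it
        trusted.append(value)
        expected_by = sender
    return trusted


RECEIVED = [
    "from galangal.org ([104.128.23.115]) by mail.someisp.com (8.8.5)",
    "from nowhere by fictitious-site (8.8.3/8.7.2)",
    "No Information Here, Go Away!",
]

for value in trusted_prefix(RECEIVED):
    print("Received:", value)
```

A check like this only catches careless forgeries, and a machine that logs itself under a different name than its neighbors use for it would trip the test, so the result is a hint rather than a verdict.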

Of course, the sender doesn't have to use obvious garbage; a really devious forger could create a plausible list of Received: lines like this:

Received: from galangal.org ([104.128.23.115]) by mail.someisp.com (8.8.5)...
Received: from lemongrass.org by galangal.org (8.7.3/8.5.1)...
Received: from graprao.com by lemongrass.org (8.6.4)...

Here the only dead giveaway is the IP address recorded for galangal.org in the very first Received: line, which does not actually belong to galangal.org. The forgery would be harder still to detect if the forger had written in correct IP addresses for lemongrass.org and graprao.com. However, the IP mismatch in the first line would still reveal that the message had been forged and "injected" into the network at the site 104.128.23.115 (i.e., turmeric.com). Most header forgeries are considerably less sophisticated, and the extra Received: lines are obvious garbage.

