A Short History of Spam

The Age of Open Relays

In the beginning there were open relays, and they were good. I am speaking here of a time when the net was much smaller, much slower, and less reliable than it is today. Open mail relays were provided as a courtesy to others. Eventually spammers discovered them, and open relays became evil.

The earliest posting that I can find on Google in news.admin.net-abuse.* regarding open mail relays is one from November 1996 by Ronald Guilmette, responding to "trebor @ sirius.com", who states that "Just like with open news servers, the time will come when open mail servers will be considered gross negligence". Ronald points out that the then-current version of Sendmail is very difficult to configure to NOT be an open relay. Later in the thread, Claus Aßmann provides hacks to more easily configure sendmail.

The Spambone / Cyber Promotions / Wallace

Cyber Promotions (owned by Sanford Wallace) was disconnected by their ISP Agis in September 1997. They sued Agis in an attempt to get reconnected, and Judge Brody issued a preliminary injunction on 1997-09-30 ordering Agis to reconnect Cyber Promotions, but in December that lawsuit was dismissed with prejudice by Judge Brody.

By late November 1997, Sanford Wallace and his company, Cyber Promotions, were talking about teaming up with Walt Rines and a start-up telephone company to form their own ISP, which would be dedicated to spamming. That never got off the ground.

The Age of MAPS, ORBS, and others

By late in 1997, MAPS was running a blackhole list suitable for use in mail servers to block mail coming from spam sources, potentially including open relays. However, they were very slow to list open relays. As a result, others started dns based blacklists of open relays.

Alan Hodgson started Dorkslayers in September 1998. By November 1998 he was forced to close, since his upstream BCTel considered the open relay scanning to be abusive. The ORBS project was then moved to Alan Brown in New Zealand. By June 2001 that project closed due to legal issues.

Al Iverson of Radparker started the RRSS around May 1999. By September 1999 that project was folded into the MAPS group of dns based lists as the RSS.

In August 1999, MAPS listed the ORBS mail servers, since the ORBS relay testing was thought to be abusive.

In June 2001, ORBS was sued in New Zealand, and shortly thereafter closed down.

In July 2001, MAPS moved to a subscription model, which essentially forced the mass of small users to move to other blocking lists. This eventually led a LOT of folks to create their own dns based blocking lists, some of which are shown here.

ORDB started around that time, and for the next five years was the major free list of open relays.

SPEWS first came to public attention around August 2001, and seems to have ceased operations August 24th, 2006. They were notable for making an attempt to stay anonymous, presumably to avoid lawsuits. That attempt seems to have been successful.

The open relay spam problem essentially went away by 2004, partly due to the effectiveness of open relay lists such as ORDB, but mostly because the spammers discovered the wonders of spam zombies. ORDB shut down in December 2006, since it was no longer needed.

Split routing, spamming on two links

Around 2000 and 2001, spammers discovered an interesting mechanism to hide part of their network connectivity. They obtained connectivity from two different providers, over links with properties that were fairly easy to arrange. One is a high speed link where the provider does not filter outbound packets with source ip addresses belonging to other folks. Such filtering is called egress filtering, and there are still many providers that don't do it. The other link can be slower, since it is only used to receive the ACK packets sent in response to the data packets sent thru the high speed link. This combination makes it appear that the spam arrived from the ip address associated with the slower link. This caused problems for AOL and others, since their initial approach to port 25 blocking only looked at the remote port on outbound packets. The widely deployed fix is to block all traffic from or to a remote tcp port 25 (for those links where you are blocking access to remote smtp servers).
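The difference between the two blocking approaches can be sketched in a few lines of Python. This is only an illustration of the rule described above, not any provider's actual filter; the Packet class and both function names are made up for the example.

```python
from dataclasses import dataclass

SMTP_PORT = 25


@dataclass(frozen=True)
class Packet:
    """Hypothetical, minimal view of a TCP packet seen on a customer link."""
    outbound: bool     # True if the packet is leaving the customer toward the net
    remote_port: int   # TCP port on the remote (non-customer) end


def naive_block(pkt: Packet) -> bool:
    """Initial approach: only outbound packets to remote port 25 are dropped."""
    return pkt.outbound and pkt.remote_port == SMTP_PORT


def fixed_block(pkt: Packet) -> bool:
    """Fix: drop anything to or from a remote port 25, in either direction."""
    return pkt.remote_port == SMTP_PORT


# The ACKs coming back from the victim's mail server arrive on the slow
# link as inbound packets whose remote port is 25.
returning_ack = Packet(outbound=False, remote_port=SMTP_PORT)
```

With the naive rule the returning ACK is passed, so the slow link completes the TCP session even though it never sent a packet to port 25. With the fixed rule the ACK is dropped and the split route stops working.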

Matt Wright and FormMail

From early 2001 thru early 2003, spammers abused the widely distributed Matt Wright FormMail scripts. I don't recall many widely used dnsbls listing these, but the resulting spam was trivial to filter with procmail. Many places must still have filters that will reject any mail containing the string "Below is the result of your feedback form".
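The filters in question were procmail recipes, which are not reproduced here. As a rough stand-in, the same check can be sketched in Python with the standard email package. The function name is hypothetical, and the only thing taken from the text above is the marker string.

```python
from email import message_from_string

FORMMAIL_MARKER = "Below is the result of your feedback form"


def looks_like_formmail(raw_message: str) -> bool:
    """Return True if any text part of the message contains the marker."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and attachments
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # latin-1 never fails to decode, and the marker is plain ASCII.
        if FORMMAIL_MARKER in payload.decode("latin-1"):
            return True
    return False


sample = (
    "From: nobody@example.com\n"
    "Subject: feedback\n"
    "\n"
    "Below is the result of your feedback form. It was submitted by\n"
)
```

Decoding each part first means a base64 or quoted-printable body is also checked, which a plain string match on the raw message would miss.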

Rise of the Zombies

I think it was around the middle of 2003 when spammers and virus writers joined forces to create spam messages carrying a payload of executable code that caused vulnerable machines to become spam proxies, or zombies. These machines have essentially been taken over by the spammer, and are used to send even more spam.

By the end of 2006, the zombies were the major spam distribution channel. Spamhaus starts the Policy Block List (PBL), and zen.spamhaus.org includes the SBL, XBL, and PBL. The very wide usage of the Spamhaus lists, combined with the PBL listing wide swaths of ip address space, puts pressure on spammers to adapt. They did so in two different ways: using vulnerable web servers, and having their zombies send thru ISP smarthosts.
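For readers who have not used a dns based list, here is a minimal sketch of how a mail server queries one. It assumes the usual DNSBL convention (reverse the IPv4 octets and append the list's zone); the function names are hypothetical and the meaning of the returned address is left to the list operator's documentation.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dnsbl_lookup(ip: str, zone: str = "zen.spamhaus.org"):
    """Return the A record answer as a string, or None if the name does not resolve."""
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return None
```

For example, dnsbl_query_name("127.0.0.2") gives "2.0.0.127.zen.spamhaus.org". An answer inside 127.0.0.0/8 conventionally means the address is listed, but the exact return codes differ per list and should be checked against that list's own documentation before rejecting mail on them.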

Temporary address space hijacking

We have a short digression here to talk about a technical possibility that alarmed some folks. In a paper from 2006, Ramachandran and Feamster claim evidence for the statement that spammers are using short-lived bogus BGP route announcements to send spam from hijacked parts of the IPv4 address space. Their paper was referenced and given popularity in this article. They also claim that "even the most aggressive blacklist has a false negative rate of about 50%", which seems unreasonable. Their dataset covers 2004-08 thru 2005-12. I thought that the SBL alone was blocking much more than 50% of the incoming spam in that period, but apparently not for their sample.

Team Cymru has some current evidence for this. Today, 2008-02-16, they are showing bogus route announcements within the last month for 100/8 and some scattered /16 and /24 blocks within that 100/8. However, my mail server logs that cover the time of those bogus route announcements do not show any SMTP connection attempts from 100/8. So either they were not sending spam from those addresses, or they just did not send any spam here. However, these are announcements of bogon space, which may be filtered by many BGP routers. Therefore, such address space is probably not attractive for spammers attempting to use this technique.

PHAS is another system that attempts to detect address space hijacking, but it is not correlated with SMTP connections or spam attempts.

IAR is another system that attempts to detect address space hijacking, but it is not correlated with SMTP connections or spam attempts. IAR uses methods detailed in PGBGP to detect suspicious routes.

We built a system that monitors the BGP update stream, detects suspicious routes, and correlates that with SMTP connections. The results indicate that spammers are not currently (2008-07) using bogus BGP announcements to hijack ip address space to send spam.
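The correlation step can be sketched as follows. This is not the system described above, only an illustration of the idea under simple assumptions: suspicious announcements are already available as a prefix plus a time window, and SMTP connections as a timestamp plus client ip address. All of the names are hypothetical.

```python
import ipaddress
from datetime import datetime
from typing import Iterable, List, NamedTuple, Tuple


class Announcement(NamedTuple):
    prefix: str        # e.g. "100.0.0.0/8"
    start: datetime    # when the suspicious route first appeared
    end: datetime      # when it was withdrawn


class Connection(NamedTuple):
    when: datetime
    client_ip: str


def correlate(
    announcements: Iterable[Announcement],
    connections: Iterable[Connection],
) -> List[Tuple[Connection, Announcement]]:
    """Pair each SMTP connection with any suspicious route covering it in time."""
    parsed = [(ipaddress.ip_network(a.prefix), a) for a in announcements]
    matches = []
    for conn in connections:
        addr = ipaddress.ip_address(conn.client_ip)
        for net, ann in parsed:
            if addr in net and ann.start <= conn.when <= ann.end:
                matches.append((conn, ann))
    return matches
```

An empty result over a long enough log is the kind of evidence referred to above: routes were announced, but no mail arrived from inside them while they were live.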

Vulnerable web servers

By early 2007, almost all of the spam leaking thru the filters here is coming from vulnerable web scripts. This is not as easy to filter as the earlier spew from FormMail, since it is arriving from a wider variety of individual php scripts. The web clients talking to these vulnerable web scripts are mostly zombies, but they don't necessarily show up on the XBL since these zombies don't talk on port 25 to systems that might report them.

Zombies talking to smarthosts

By mid 2007, the spammers were targeting their zombies at the zombies' own ISP smarthosts. If the outbound mailers provided by the ISP for use by their customers don't require SMTP AUTH, then it is trivial for the zombie to send its spam out thru the ISP smarthosts. There are a number of countermeasures that ISPs can and should take to avoid this, including volume limits per account, outbound spam filtering using the DCC or something like it, and requiring the use of SMTP AUTH even for connections from their own customers from their own ip address space. Tiscali apparently did none of that, and as a result experienced significant mail delivery problems.
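Of those countermeasures, the per-account volume limit is the easiest to illustrate. The following is a minimal sliding-window sketch, not a description of what any ISP actually runs; the class name and the limits are assumptions, and it presumes the account is known, which in practice means SMTP AUTH is already required.

```python
from collections import defaultdict, deque


class VolumeLimiter:
    """Allow at most max_messages per account within window_seconds."""

    def __init__(self, max_messages: int, window_seconds: float):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # account -> timestamps of accepted mail

    def allow(self, account: str, now: float) -> bool:
        sent = self._sent[account]
        # Forget messages that have aged out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_messages:
            return False  # over the limit: reject or defer this message
        sent.append(now)
        return True
```

A limit well above what a normal customer sends in an hour costs legitimate users nothing, but it caps what a single zombie can push thru the smarthost before somebody notices.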

Backup MX servers

Consider a domain example.com, with multiple MX servers with different precedence values. Spammers used to vastly prefer sending spam to the backup MX servers, under the perhaps valid assumption that the backup servers had weaker spam filtering than the primary servers, and that the primary servers would simply accept all the mail forwarded from the backup servers. That behaviour has now changed, but I don't have any data on exactly when it changed. However, as of early 2008, on two mail servers that accept mail for the same collection of domain names, we observe that the backup machine only sees about two percent of the SMTP connections seen by the primary machine. By the middle of 2010, that was up to five percent.

Movement back to web host spamming

The increasing utilization and coverage of the various Spamhaus lists like the SBL has reduced the deliverability of spam from botnets. Some spammers now seem to be rediscovering the utility of low cost web hosting. The web hosting market is a sufficiently low margin, high volume business that the operators cannot really know their customers. Yet those operators are selling unrestricted access to machines on high speed networks, using ip address space that is generally not listed on widely used blacklists like the SBL. By the middle of 2008, a significant percentage of the spam that makes it thru the ip address filters here is from such web hosts, but almost all of that is caught by the deeper body filters (DCC and SpamAssassin).

By late 2008 this becomes known as snowshoe spamming - using a large number of domain names and spreading the outbound load over large chunks of address space, sometimes much larger than a /24, in an attempt to stay under the radar. It was generally used with rapidly rotating domain names. However, using the DBL or Surbl on various names associated with incoming connections stops almost all of this.

By the middle of 2010 some of these spammers have begun to realize that setting up proper reverse dns names pointing to their bogus domain names just makes them more visible. Some of them are now going back to either removing reverse dns names completely, or setting up generic consecutive reverse dns names for their snowshoe blocks.

IP address blocking vs. body content filtering

The SBL and other lists used to block spam by source ip address are much faster than body content filtering. In the middle of 2008, a significant percentage of the spam made it past the ip address based checks, and into the body content filtering stage. By the end of 2008, the body content filtering is hardly catching any spam. Rather than the content filtering becoming less effective, I think this is due to the address based blocking becoming much more effective. There does not seem to be a corresponding rise in the amount of spam in user mailboxes. Essentially all of the spam is now caught in the initial ip address based filtering layer. The following is a summary from a mail server here at the end of 2008.

result              count   % of recipients
bad domain            705     0%
relay denied          941     1%
relay tmp deny         12     0%
no such user         3620     3%
invalid route         151     0%
local dnsbl         61738    43%
sbl ip              73059    51%
sbl content            19     0%
surbl content           1     0%
spam assassin         194     0%
dcc                   330     0%
unknown                 1     0%
ok                   1838     1%
total messages      62008    43%
total recipients   142609   100%

Those rejections are listed in the order that they are checked. The bad domain, relay denied, no such user and invalid route checks are all done by the stock sendmail. The local and SBL dnsbl lists are then checked in that order, which accounts for the relatively small 51% hitting the SBL. Then the content filters for host names on the SBL or Surbl, spam assassin and DCC are run in parallel, but the results are checked in that order.
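Because the checks run in sequence, the percentage for a later check understates how much of the mail it would catch on its own. A short calculation using the counts from the summary above makes the point; the helper names are made up for this example.

```python
# Counts copied from the summary, in the order the checks are applied.
RESULTS = [
    ("bad domain", 705),
    ("relay denied", 941),
    ("relay tmp deny", 12),
    ("no such user", 3620),
    ("invalid route", 151),
    ("local dnsbl", 61738),
    ("sbl ip", 73059),
    ("sbl content", 19),
    ("surbl content", 1),
    ("spam assassin", 194),
    ("dcc", 330),
    ("unknown", 1),
    ("ok", 1838),
]

TOTAL_RECIPIENTS = sum(count for _, count in RESULTS)


def percent_of_total(name: str) -> int:
    """Share of all recipients that ended at this check (the table's column)."""
    return round(100 * dict(RESULTS)[name] / TOTAL_RECIPIENTS)


def percent_of_remaining(name: str) -> int:
    """Share of the recipients that got as far as this check and ended there."""
    names = [n for n, _ in RESULTS]
    idx = names.index(name)
    remaining = sum(count for _, count in RESULTS[idx:])
    return round(100 * RESULTS[idx][1] / remaining)
```

The counts sum to the 142609 total recipients. The SBL stopped 51% of all recipients, but of the 75442 recipients that actually reached the SBL check, it stopped about 97%.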

This pattern continues as of August 2010.

Routing unused address blocks

Some spammers have discovered that it is relatively easy to find old address allocations that were given to organizations that no longer exist. They then pretend to be the successor to that old organization, and simply take over that ip address space. Some examples, identified by the former holder of the address space: gigaipnet.com; ZAO Russian Telecommunications Group; FTP Software/NetManage; Gold Hill Computers; Oracle (according to Spamhaus); SF Bay Packet Radio; Automation Intelligence; RF Engineering; Cayenne Software Inc; Precipice, Inc.; Vitalink Communications; Hoechst Celanese Corporation; General Instrument; Symbolics, Inc.; Central Notts Healthcare Trust; H-Line Communications; DataStream; Westin Hotels & Resorts.

As of 2016-11-10, two more blocks that were Ross Technology.

Some folks have discovered they can take over an ASN via the same mechanism. One example is AS30186, pretending to be the old Ross Technology in Austin.

Vietnam - the new China

For a long time, China and Korea were the primary sources of huge amounts of spam seen here, and that relationship was relatively stable with Korea making about one half as many delivery attempts as China. In the early part of 2009 that shifted to Korea making twice as many delivery attempts as China. Starting in August of 2009, we noticed a huge increase in the delivery attempts from Vietnam, to the point where Vietnam is now ahead of Korea. Some numbers from August 2010:


India - the new China

By late 2010, India surpassed both China and Korea as a spam source. Some numbers from October 2010 from two different mail systems:

17455 .cn    7987 .kr
18005 .kr    10522 .ch
40434 .id    14694 .id
78943 .vn    37941 .vn