What is spam and how can it be reduced?

By Lincoln Armstrong


Photo courtesy Buggolo

Spam. It's one of the most intractable problems of the Internet, and in discussions of web communications it is sometimes difficult even to define precisely what the word means. One thing is for sure: spam affects almost every communications channel on the Internet, including blogs, message boards, instant messaging, SMS chat, web pages, RSS feeds, and yes, the dreaded, the infuriating, spam e-mail.

But what exactly is spam?

The original definition of spam, from back in the paleolithic Internet, when dial-up seemed to get faster every six months, was fairly specific. Spam was then defined as a bulk, unsolicited, off-topic message. It didn't necessarily have to be commercial, and it didn't necessarily have to include an advertisement for a mortgage refinance.

The first spam to be popularly opposed appeared on the Usenet newsgroup networks, where commercially minded participants would start thread after thread with the same commercial advertisement or link to a web page. The result was that the threads having to do with the newsgroup topic became more and more difficult to find as first dozens, then hundreds of threads pushed everything else aside.

The origin of the word is the very famous Monty Python "Spam" skit, where a group of Vikings would begin chanting the word over and over again if another character repeated the word enough times. The singing Vikings would inevitably interrupt whatever other conversations were going on at the time, which is actually a fairly accurate representation of what spam does on electronic communications networks like Usenet and e-mail.

Ironically, the very freedom which is so treasured by those who make use of the Internet and its many powerful technologies is what makes spam possible. If anyone can freely contribute to the development of the Internet, then it is inevitable that some of those contributors will seek to communicate commercial messages to as large an audience as possible.

The other factor is cost. Spamming is incredibly cheap compared to other forms of mass advertising, despite the fact that such a small percentage of people actually responds to a spam message. On a cost-per-response basis, spamming is probably the least expensive form of advertising there is. Sending e-mail or putting a message up on a message board is practically zero cost, and even that cost can be avoided entirely through the use of bots or zombified computers.
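The cost argument above comes down to a single division: total cost divided by the number of responses. A minimal sketch of that arithmetic follows. The function name and every figure in the usage section are made-up placeholders chosen only to show the shape of the calculation; they are not measurements of any real campaign.

```python
def cost_per_response(total_cost: float, messages_sent: int, response_rate: float) -> float:
    """Return the cost of one response.

    response_rate is a fraction between 0 and 1 (0.0001 means 1 in 10,000).
    """
    expected_responses = messages_sent * response_rate
    if expected_responses <= 0:
        raise ValueError("no responses expected, so cost per response is undefined")
    return total_cost / expected_responses


if __name__ == "__main__":
    # All numbers below are hypothetical, for illustration only.
    bulk_email = cost_per_response(total_cost=100.0, messages_sent=1_000_000, response_rate=0.0001)
    print_mailing = cost_per_response(total_cost=5_000.0, messages_sent=10_000, response_rate=0.01)
    print(f"bulk e-mail:   {bulk_email:.2f} per response")
    print(f"print mailing: {print_mailing:.2f} per response")
```

The point of the sketch is that a tiny response rate can still produce a low cost per response when the cost of sending is close to zero and the number of messages is enormous.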

But the freedom to speak, the freedom to transmit commercial advertising and the freedom to start hundreds of spammy message board threads does not confer upon an audience any obligation to listen, and so the active opposition to spam began.

Almost every publicly accessible form of Internet communication now has at least an informal policy that prohibits using that network to transmit spam messages. The problem with many of these policies is the expansion of the definition of spam to the point where a policy with good intentions becomes a tool to shout down viewpoints or participants that happen to be at a particular rhetorical disadvantage. "Spam" becomes a word for "anything I see more than once that I don't recognize" or "any message I disagree with." The only practical result of this kind of "policy creep" is arguments about the policy, which only add to the clutter the policy was supposed to help reduce.

So, communications networks had to take a more flexible approach to the problem and place the responsibility in the hands of users. The first successful examples of user-driven anti-spam tools were things like article cancelling on newsgroups, and e-mail filters like procmail, which sorted mail according to rules the user wrote. The learning filters that built on this idea were extraordinarily effective because they took a heuristic approach to defeating spam. As users flagged messages they thought were spam or at least unwanted commercial messages, the filters would analyze the messages for links, web domains, and certain combinations of keywords and subject lines, storing the data and using it to tune the filter. Each new spam message only served to make the filters stronger and able to determine more quickly and more accurately which messages were spam and which were not.
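To make the idea of a filter that gets stronger with every flagged message concrete, here is a minimal sketch in Python. It is not how procmail or any particular product works; the class name `LearningFilter`, the tokenizer, and the scoring rule (a sum of smoothed log-likelihood ratios, in the spirit of a naive Bayes filter) are all assumptions made for illustration, and the sample messages are invented.

```python
import math
import re
from collections import Counter


def tokenize(message: str) -> set:
    """Lowercase the text and return its distinct words and domain-like tokens."""
    raw = re.findall(r"[a-z0-9.-]+", message.lower())
    return {token.strip(".-") for token in raw} - {""}


class LearningFilter:
    """Toy filter that tunes itself from messages a user flags."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam messages containing it
        self.ham_counts = Counter()   # token -> number of wanted messages containing it
        self.spam_total = 0
        self.ham_total = 0

    def flag(self, message: str, is_spam: bool) -> None:
        """Record one user decision and update the stored token statistics."""
        tokens = tokenize(message)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_total += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_total += 1

    def score(self, message: str) -> float:
        """Positive scores lean toward spam, negative scores toward wanted mail."""
        total = 0.0
        for token in tokenize(message):
            # Add-one smoothing keeps unseen tokens from producing log(0).
            p_spam = (self.spam_counts[token] + 1) / (self.spam_total + 2)
            p_ham = (self.ham_counts[token] + 1) / (self.ham_total + 2)
            total += math.log(p_spam / p_ham)
        return total

    def is_spam(self, message: str, threshold: float = 0.0) -> bool:
        return self.score(message) > threshold


if __name__ == "__main__":
    spam_filter = LearningFilter()
    # Hypothetical training messages, as if flagged by a user.
    spam_filter.flag("Refinance your mortgage now at cheap-loans.example", is_spam=True)
    spam_filter.flag("Can someone review the FAQ draft for the newsgroup?", is_spam=False)
    print(spam_filter.is_spam("Mortgage refinance offer"))
```

Every call to `flag` shifts the stored counts, so the same message can score differently before and after the user reports a few more examples, which is the tuning behavior the paragraph above describes.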

As these kinds of tools grew in popularity, the major providers of network connectivity and, of course, of e-mail and communications services began to implement filtering technology in their networks at an even lower level of operation. This had the immediate effect of filtering spam before it ever reached an end user's inbox. Any message that did get through would be re-analyzed by that user's own filter. Even if it managed to get past that second filter, it could still be used to train both filters to exclude similar messages in the future.
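The two-layer arrangement described above can be sketched in a few lines of Python. Everything here is a simplification assumed for illustration: `DomainFilter` only remembers domain-like strings from reported messages, the regular expression is a crude stand-in for real link extraction, and `deliver` and `report_spam` are hypothetical helper names rather than part of any real mail system.

```python
import re


class DomainFilter:
    """Toy filter that blocks any message mentioning a domain it has learned."""

    def __init__(self):
        self.blocked_domains = set()

    @staticmethod
    def domains(message: str) -> set:
        # Crude pattern for things that look like host names, e.g. "shop.example".
        return set(re.findall(r"\b[a-z0-9-]+(?:\.[a-z0-9-]+)+\b", message.lower()))

    def blocks(self, message: str) -> bool:
        return bool(self.domains(message) & self.blocked_domains)

    def train(self, message: str) -> None:
        self.blocked_domains |= self.domains(message)


def deliver(message: str, provider: DomainFilter, user: DomainFilter) -> str:
    """Run the provider-level filter first, then the user's own filter."""
    if provider.blocks(message):
        return "dropped by provider"
    if user.blocks(message):
        return "dropped by user filter"
    return "inbox"


def report_spam(message: str, provider: DomainFilter, user: DomainFilter) -> None:
    """A message that slipped past both layers is fed back to train both."""
    provider.train(message)
    user.train(message)


if __name__ == "__main__":
    provider, user = DomainFilter(), DomainFilter()
    first = "Refinance now at cheap-loans.example"  # invented sample message
    print(deliver(first, provider, user))
    report_spam(first, provider, user)
    print(deliver("New deal at cheap-loans.example", provider, user))
```

Because the provider's filter runs first, anything it has learned never reaches the user at all, while the user's filter catches whatever is specific to that one inbox.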

At the other end of the transmission, providers of e-mail and network service began to implement policies prohibiting the use of their network, or any of the services on it, for the purpose of transmitting bulk unsolicited commercial e-mail, which made it more difficult to send those messages in the first place.

Spam, like so many other problems peculiar to the Internet, is likely to remain at least a potential problem for some time. The good news is that the technology and the knowledge needed to reduce spam, and to keep electronic communications available, continue to grow in both capability and convenience.
