Usenet Spam Filter Policy (DRAFT)

There are a number of different definitions of spam. This document covers the Usenet spam filter policy on campus news servers; it does not attempt to define spam. For different definitions of spam, see: www.uiuc.edu/ph/www/tskirvin/faqs/spam.html

The development of a spam policy is closely related to the type of service provided to users and news peers. The intention is to give users access to news resources with a minimal amount of spam, and to give peers a newsfeed that carries as little spam as possible. Less spam in the feed reduces the load on both the sending server and the remote server, and reduces the network bandwidth required for news peering.

Spam has grown steadily worse, and now accounts for the majority of Usenet traffic (with cancel messages included). Cancel messages can no longer cope with the high volume of spam traffic, and they are too resource intensive: each canceled spam article means writing and storing the article to disk, writing a cancel message to disk, and then removing the canceled article.

Because of this, a spam filtering policy has been established for campus news servers. It filters on the following criteria:

  • Excessive Cross Posting (ECP). Cross-posting to more than a given number of groups (currently 8).
  • Excessive Multi-Posting (EMP). Posting of an identical article numerous times.
  • Articles with an invalid "From:" header. Addresses in the "From:" header which do not match a potentially valid user name and FQDN (Fully Qualified Domain Name). The Perl regular expression which the "From:" header must match is:
      (.+?)\@([-\w\d]+\.)*([-\w\d]+)\.([-\w\d]{2,})
  • Make Money Fast (MMF) posts. MMF posts are typically identified by certain header elements, such as the appearance of the phrase "Make Money Fast" in the "Subject:" header. MMF and similar "subject-matter based" filtering is not currently implemented.
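
As an illustration only, the cross-posting and "From:" criteria above can be sketched in Python. This is not the code running on the campus servers. The function names, the use of an unanchored search, and the re.ASCII flag (chosen to keep \w and \d close to Perl's behaviour on plain ASCII headers) are all assumptions; only the limit of 8 groups and the pattern itself come from the policy text.

```python
import re

# Limit quoted in the policy text ("currently 8").
MAX_CROSSPOST_GROUPS = 8

# Python transcription of the Perl pattern quoted in the policy.
# Assumption: re.ASCII keeps \w and \d restricted to ASCII characters.
FROM_PATTERN = re.compile(
    r"(.+?)@([-\w\d]+\.)*([-\w\d]+)\.([-\w\d]{2,})", re.ASCII
)


def is_excessive_crosspost(newsgroups_header, limit=MAX_CROSSPOST_GROUPS):
    """Return True if the Newsgroups: header names more than `limit` groups."""
    groups = [g.strip() for g in newsgroups_header.split(",") if g.strip()]
    return len(groups) > limit


def has_valid_from(from_header):
    """Return True if the From: header contains something shaped like user@FQDN.

    Assumption: the pattern is applied as an unanchored search, so a header
    such as 'Jane Doe <jane@example.edu>' also passes.
    """
    return FROM_PATTERN.search(from_header) is not None
```

Under these assumptions, has_valid_from("nobody@localhost") is rejected because the host part has no dot, while an article posted to exactly 8 groups still passes the cross-posting check.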

A number of methods for dealing with spam have been developed which focus on refusing spam articles before they are processed by the news host. They typically check the headers and body for characteristics which identify the article as spam, and refuse it if any are found. The resources required for basic header and body checking are typically much smaller than the disk I/O that would result from processing the articles.
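
A minimal sketch of such a refuse-before-processing check, applied here to the multi-posting (EMP) criterion, might look like the following Python. It is hypothetical: the class name, the SHA-256 body digest, and the threshold of 3 copies are assumptions made for illustration, since the policy does not state how identical articles are detected or how many copies count as "numerous".

```python
import hashlib
from collections import Counter


class EmpFilter:
    """Hypothetical multi-post detector: counts articles with identical bodies."""

    def __init__(self, max_copies=3):
        # Assumption: the policy gives no number, so 3 is only a placeholder.
        self.max_copies = max_copies
        self.seen = Counter()

    def should_refuse(self, body):
        """Record one more copy of `body`; return True once the limit is exceeded.

        Only a fixed-size digest is kept in memory, so the check happens
        before anything would be written to disk.
        """
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        self.seen[digest] += 1
        return self.seen[digest] > self.max_copies
```

A production filter would also need to expire old digests and would probably normalize whitespace or signatures before hashing; both are left out to keep the sketch short.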

The filtering policy employed is often referred to in Usenet circles as "non-aggressive" filtering. No attempt is made to filter on the subject matter or content of articles. Filtering relies instead on characteristic spam features such as repetitive posting or crossposting, and on common indications of spam attempts, such as a "From:" address (required by RFC 1036) which is not in "Internet Syntax".

Send comments to usenet@agate.berkeley.edu.


Last revised: August 07, 2003