The Problem with Spam Filters

2007-03-08 10:33:40

( Computers )

Spam filtering is the most common method used to control spam. A filter sorts your incoming mail into spam and non-spam categories. There are several types of filters, and the results they produce depend on how they are set up and maintained. Some need to be updated regularly, while others can learn on their own. However, you may find that spam filters in general are not very effective at stopping spam in its tracks.

The first type is the Word List filter. It holds a list, simple or complex, of words known to be associated with spam, and it checks the content of your incoming mail for those words. This type of filter is the easiest to create; you can even make your own set of rules to sort out spam if your e-mail provider allows it. Because Word List filters are easy to make, they are also easy to break. Spammers have adapted by deliberately misspelling spam words to fool the filter. Unless you have the dedication to constantly update your Word Lists, you’ll find that spam continues to leak into your mail.
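The idea can be illustrated with a minimal Python sketch. The word list and the function name `is_spam_by_word_list` are made up for illustration; a real list would be far longer. The second call shows how a deliberate misspelling slips past the check.

```python
import re

# Hypothetical word list for illustration; a real one would be much longer
# and would need regular updates.
SPAM_WORDS = {"viagra", "lottery", "winner"}


def is_spam_by_word_list(message, spam_words=SPAM_WORDS):
    """Return True if any word in the message appears in the spam word list."""
    # Lowercase the text and split it into runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", message.lower())
    return any(word in spam_words for word in words)


print(is_spam_by_word_list("You are a lottery WINNER"))  # True
print(is_spam_by_word_list("You are a l0ttery w1nner"))  # False: misspellings evade the list
```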

Black Lists and White Lists are databases of IP addresses that the filter checks against the IP address each incoming message comes from. If a message comes from a known trusted source on the White List, it proceeds to your inbox. Messages from addresses on the Black List are filtered out. You can add your own trusted sources, such as friends and family, to the White List. The downside is that you are still likely to receive spam from the people you know.
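A rough sketch of such a lookup, using Python's standard `ipaddress` module, might look like this. The addresses come from ranges reserved for documentation, and the `classify_sender` function and its return labels are assumptions made for the example, not part of any particular filter.

```python
import ipaddress

# Hypothetical lists built from documentation-reserved address ranges.
WHITE_LIST = {ipaddress.ip_address("192.0.2.10")}        # trusted senders
BLACK_LIST = [ipaddress.ip_network("203.0.113.0/24")]    # known spam sources


def classify_sender(ip_text):
    """Return 'inbox', 'blocked', or 'unknown' for a sender's IP address."""
    ip = ipaddress.ip_address(ip_text)
    if ip in WHITE_LIST:
        return "inbox"
    if any(ip in network for network in BLACK_LIST):
        return "blocked"
    # Not on either list: some other type of filter has to decide.
    return "unknown"


print(classify_sender("192.0.2.10"))    # inbox
print(classify_sender("203.0.113.77"))  # blocked
print(classify_sender("198.51.100.5"))  # unknown
```

Note that a white-listed address goes straight to the inbox whatever the message contains, which is why spam sent from a trusted address still gets through.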

Hash Table filters extract the main points of an e-mail and check them against a database of values known to be spam. The more words in the e-mail that appear in the hash table, the more likely the message is spam. Spammers get around this type of filter by inserting different sequences of random characters in the body of the e-mail, so that it no longer matches the values listed in the Hash Table.
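One way to picture this is the simplified sketch below. It assumes the filter hashes each word and scores a message by the share of words whose hash is in the table; real products may hash larger parts of a message. The names `word_hash`, `KNOWN_SPAM_HASHES`, and `spam_score` are invented for the example.

```python
import hashlib


def word_hash(word):
    """Hash a single lowercased word (SHA-256 is chosen only for illustration)."""
    return hashlib.sha256(word.lower().encode("utf-8")).hexdigest()


# Hypothetical table of hashes for words previously seen in spam.
KNOWN_SPAM_HASHES = {word_hash(w) for w in ("cheap", "pills", "offer")}


def spam_score(message):
    """Return the fraction of words in the message whose hash is in the table."""
    words = message.split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if word_hash(w) in KNOWN_SPAM_HASHES)
    return hits / len(words)


print(spam_score("cheap pills offer"))                 # 1.0
print(spam_score("cheap pills offer xq7z kk29 zzp1"))  # 0.5: random padding lowers the score
```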

Artificial Intelligence and Probabilistic Systems improve upon the Hash Table method by adding mathematical reasoning to decide whether your mail fits the spam or non-spam category. Such a filter studies the word frequencies of the e-mails you receive and keeps a hash table from which it computes the probability that an incoming e-mail is spam.
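A common example of this approach is a naive Bayes classifier. The sketch below is one possible minimal version, not how any specific product works: the class name, the tiny training set, and the add-one smoothing are all assumptions made for the example. The word counts are kept in `Counter` objects, which are hash tables.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy naive Bayes spam filter; needs at least one message of each label."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, message, label):
        """Record the word frequencies of a message labeled 'spam' or 'ham'."""
        self.word_counts[label].update(message.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, message):
        """Estimate the probability that the message is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Start from the share of training messages with this label.
            log_score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in message.lower().split():
                # Add-one smoothing so unseen words do not zero out the score.
                count = self.word_counts[label][word]
                log_score += math.log((count + 1) / (total_words + len(vocabulary)))
            log_scores[label] = log_score
        # Turn the two log scores into a probability without overflow.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)


spam_filter = NaiveBayesFilter()
spam_filter.train("win cheap pills now", "spam")
spam_filter.train("cheap offer win money", "spam")
spam_filter.train("lunch meeting at noon", "ham")
spam_filter.train("project meeting notes attached", "ham")

print(spam_filter.spam_probability("cheap pills") > 0.5)    # True
print(spam_filter.spam_probability("meeting notes") < 0.5)  # True
```

Because the counts come from the mail you actually receive, the filter keeps adjusting as you train it on new messages, which is the "learning on its own" behavior.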

Even though an anti-spam filter can effectively sort your incoming mail into spam and non-spam categories, it does not stop spam from reaching you in the first place. Filter efficiency is measured by the number of spam messages that pass the filter (false negatives) and the number of legitimate messages that get filtered out (false positives). Zero false positives and minimal false negatives are the ideal result. To achieve this, however, spam filters need to be constantly updated to keep up with the ever-evolving techniques of spammers.
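The two error types can be counted with a short sketch like the following. The function name `filter_error_rates` and the sample data are hypothetical; "ham" is used here as a label for legitimate mail.

```python
def filter_error_rates(results):
    """Compute error rates from (actual, predicted) label pairs.

    Each label is either 'spam' or 'ham' (legitimate mail).
    """
    false_positives = sum(1 for actual, predicted in results
                          if actual == "ham" and predicted == "spam")
    false_negatives = sum(1 for actual, predicted in results
                          if actual == "spam" and predicted == "ham")
    ham_total = sum(1 for actual, _ in results if actual == "ham")
    spam_total = sum(1 for actual, _ in results if actual == "spam")
    return {
        # Share of legitimate messages wrongly filtered out.
        "false_positive_rate": false_positives / ham_total if ham_total else 0.0,
        # Share of spam messages that slipped through to the inbox.
        "false_negative_rate": false_negatives / spam_total if spam_total else 0.0,
    }


# Made-up sample: one of two spam messages slipped through, no legitimate mail was lost.
sample = [("spam", "spam"), ("spam", "ham"), ("ham", "ham"), ("ham", "ham")]
print(filter_error_rates(sample))
# {'false_positive_rate': 0.0, 'false_negative_rate': 0.5}
```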

BLOGSHARP: Cutting Edge Posts
