Wednesday 21 December 2005

Spam Quarantines Should Be Sorted by Score

When spam filters decide what's spam and what's legitimate email, they often assign a score to the message. You can think of this score as the filter's confidence that the message is spam. For example, filters based on SpamAssassin typically treat a message as spam once its score reaches a threshold of 5.0. However, spam filters can make mistakes and occasionally flag legitimate messages as spam (known as false positives). Usually these false positives have a relatively low score, only a little above the threshold.
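As a rough illustration of where such a score can be read from: SpamAssassin records its verdict in an X-Spam-Status header, in a form like "Yes, score=5.4 required=5.0 ...". The Python sketch below pulls the score and threshold out of that header. The function name spam_score and the sample message are made up for this example, and other filters will store their scores differently.

```python
import re
from email import message_from_string

# Matches the score and threshold inside an X-Spam-Status header value.
# Older SpamAssassin releases wrote "hits=" where newer ones write "score=".
STATUS_RE = re.compile(
    r"(?:score|hits)=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)"
)

def spam_score(raw_message):
    """Return (score, required) from X-Spam-Status, or None if not present."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-Status", "")
    match = STATUS_RE.search(header)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

# Hypothetical message, for illustration only.
sample = (
    "From: someone@example.com\n"
    "Subject: Meeting notes\n"
    "X-Spam-Status: Yes, score=5.4 required=5.0 tests=HTML_MESSAGE\n"
    "\n"
    "Body text.\n"
)

score, required = spam_score(sample)
print(score, required, score >= required)
```

A message like this one, scoring only slightly above the threshold, is the kind most likely to be a false positive.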

Most spam filters maintain a quarantine or spam folder where they put the spam messages. Users or administrators can browse the quarantine folder in an attempt to find false positives.

Searching for false positives is a laborious task, so it's very helpful to sort the quarantine list by the messages' scores, lowest first. Because false positives usually score only just above the spam threshold, this ordering puts them near the top of the quarantine list. The Pareto Principle -- the "80/20 rule" -- applies. In other words, to get roughly 80% of the benefit, the user need only browse the first 20% of the quarantined messages.
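The sort order described above can be sketched in a few lines of Python. This is only an illustration under assumed names: QuarantinedMessage, sort_for_review, and review_slice are hypothetical, the sample subjects and scores are invented, and the 20% cut-off is the rule of thumb from the paragraph above rather than a measured figure.

```python
import math
from dataclasses import dataclass

@dataclass
class QuarantinedMessage:
    subject: str
    score: float

def sort_for_review(messages):
    """Order quarantined messages lowest score first."""
    return sorted(messages, key=lambda m: m.score)

def review_slice(messages, fraction=0.2):
    """Return the lowest-scoring fraction of the quarantine (at least one message)."""
    ordered = sort_for_review(messages)
    if not ordered:
        return []
    count = max(1, math.ceil(len(ordered) * fraction))
    return ordered[:count]

# Invented sample data, for illustration only.
quarantine = [
    QuarantinedMessage("Cheap pills online", 23.7),
    QuarantinedMessage("Re: lunch on Friday?", 5.3),
    QuarantinedMessage("You have won a prize", 14.1),
    QuarantinedMessage("Refinance your mortgage today", 9.8),
    QuarantinedMessage("Hot stock tip", 18.2),
]

for message in review_slice(quarantine):
    print(f"{message.score:5.1f}  {message.subject}")
```

With this sample data, the single message shown for review is the low-scoring "Re: lunch on Friday?", which is the sort of message a user would want to rescue.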

An example of a quarantine that does this is Electric Mail's PerimeterProtect hosted service. A surprising number of spam filter quarantines don't even allow this sort order as an option.

1 comment:

Anonymous said...

There are way too many false positives if you set SpamAssassin to filter or tag at 5 stars. I set it to tag at 6, and deal with a few more false negatives. And I filter at higher scores, leaving the lower scores to be sorted by the recipients.

If you have SpamAssassin available to you through cPanel, you may want to set it to filter at 8 stars. That cuts down on the spam considerably, without your clueless friends being filtered away.


