Last week, I wrote about the CEAS 2007 Live Spam Challenge (CEAS is the Conference on Email and Anti-Spam). I opined that fair comparative testing of spam control technologies is extremely difficult, especially when behavioural analysis techniques such as greylisting and OS fingerprinting are part of the spam control technology mix.
I wanted to clarify that the test isn't intended to evaluate the relative strengths and weaknesses of existing spam control products (that would be extremely difficult to do fairly, as last week's post pointed out). The intention is to compare some promising new content-based filtering techniques -- techniques that might be employed as components in a cocktail of techniques used by a spam control product.
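To give a concrete sense of what "content-based filtering" means here, below is a minimal sketch of one classic technique in that family: a naive Bayes classifier over word tokens. This is purely illustrative -- the class name, the toy training messages, and the tokenizer are all invented for this example, and real filters in the challenge would be far more sophisticated.

```python
import math
from collections import Counter

def tokenize(text):
    # Crude whitespace tokenizer; real filters use much richer features.
    return text.lower().split()

class NaiveBayesFilter:
    """Toy naive Bayes spam/ham classifier (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        tokens = tokenize(text)
        self.counts[label].update(tokens)
        self.totals[label] += len(tokens)
        self.docs[label] += 1

    def classify(self, text):
        n = self.docs["spam"] + self.docs["ham"]
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        scores = {}
        for label in ("spam", "ham"):
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(self.docs[label] / n)
            for tok in tokenize(text):
                score += math.log(
                    (self.counts[label][tok] + 1) / (self.totals[label] + vocab)
                )
            scores[label] = score
        return max(scores, key=scores.get)

# Invented toy training data, just to exercise the classifier.
f = NaiveBayesFilter()
f.train("cheap meds buy now", "spam")
f.train("win free money now", "spam")
f.train("meeting agenda for friday", "ham")
f.train("project status report attached", "ham")
print(f.classify("buy cheap meds"))  # prints: spam
```

A technique like this would be only one ingredient in the "cocktail" mentioned above, combined with behavioural measures such as greylisting.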
As Gordon Cormack, one of the test's co-organizers, wrote:
- An open competition attracts all sorts of techniques that can be vetted. The methods that are uncompetitive can be discounted, and the "greatest hits" can be tested ... in combination with greylisting ... and other intrusive techniques.
- One popular fallacy that I run into all the time is, "this test has limitations, so it shouldn't be done." All tests and experiments have limitations; the scientific method involves identifying them and constructing specific experiments to see how much the limitations matter, not withholding all tests until the perfect one can be done (which, of course, it never can be).