Monday, 6 August 2007

C/R and "Spam Index" Conversation Roundup

I wanted to pull together some of the conversations that have been flying around recently about challenge/response spam filtering and this "spam index" idea. As is often the case, much of the value lies in the conversation as well as in the original posts; hence this roundup...

Anonymous:
As the holder of a domain name frequently forged into the From: or Reply-To: fields of spam, I can testify for certain that it doesn't work. In fact, whenever I receive a challenge to one of those forged addresses, I make sure to reply to it to make sure the spam gets through. Petty, perhaps, but I'm not being paid to filter C/R users' spam, so I'll pass it through.

Dean Harding:
I'll admit I was a bit suspicious that if challenge/response was such a panacea why were there not more people using it? My point was not that people should start using challenge/response, though, it was more to just point out that many people are still not happy with their spam filtering.

Len Dressler:
[Richi,] you're really kind of a dork ... It appears you have some sort of agenda of your own, fairly skewed towards blacklist and the like, which from an IT manager's perspective, is a joke.

Richi:
Len, you're entitled to your opinion, and I will defend your right to express it to the best of my ability. Fact is, state of the art spam filters catch 95-99% of spam, with a vanishingly-small false positive rate. Such spam filters use a combination of techniques ... I see no evidence that a single approach—such as IP blacklisting—is viable.

Anonymous:
I was interested in learning of Peter's methodology ... I attempted to register on his web site in order to download a copy of his report. I'm still waiting for a response; who knows, maybe his acceptance e-mail was justifiably intercepted by my spam filter.

Sandman:
If it's my inbox, it is a communication tool for me, and I own the right to ask people to verify they are who they say they are.

Don Marti:
I see lots of “I just started using C-R, it’s great” posts, but no “I’ve been using C-R for years and it’s great” posts. C-R is something that you try and give up on. Or, in my case, watch other people try and give up on.

Anonymous:
Effective spam control is possible. It doesn't require cumbersome and work-flow disruptive band-aid solutions like C/R ... What's needed and has been proven to be most effective is a human feedback component. Several of the best anti-spam products available today include this as part of their toolset.

This is not to say that you need a solution where YOU have to be the human in the loop. The best vendors in the space do that for you and push new rules out to their customers every 10 mins or so.

Devil's Advocate:
Asking various people "how happy" they are with their present anti-spam product has absolutely no bearing on the effectiveness of those products ... if you ask if a C/R user sees less spam, you're going to get a "yes". But, what if you ask all the innocent 3rd parties that receive the challenges (which the C/R user doesn't see)? ... All C/R succeeds in doing is displacing the original spam volume in favour of its own variety of spam ... [and] shows a blatant disrespect for the health of the Internet.

Anonymous:
Nonsense - I am no expert, just a user, but every fact you make is wrong.

Richi:
In my spamtrap archive, I have several samples of inappropriate challenges from every C/R system known to me. Just in the past month, I've got challenge-spam from: [long list deleted]
...
Still don't believe that C/R systems send spam to innocent 3rd parties?

Peter Brockmann:
Your last post proves precisely the point. Users don't care and shouldn't have to care about what falls into YOUR inbox, only what falls into THEIRS.

Richi:
So users don't care that they're sending spam, as long as they don't get any?
...
Increasingly, the main issue with C/R isn't that it annoys innocent 3rd parties -- it's that the backscatter hits spamtraps, causing legitimate challenges to go undelivered. Hence, the false positive rate of C/R is actually surprisingly high.

Ask a C/R user about this though, and they'll often be blissfully unaware. It's hard to know when one is missing a legitimate unsolicited message from someone you don't know.

David Merrill:
For recipients, challenge-response and sender verification methods are good, but their use can get your domain blacklisted. Why? Because each incoming message, spam or not, generates an outgoing message, and spammers can (and do) use those in denial-of-service attacks.

Justin Mason:
Focussing the debate on the “user’s inbox” ignores the overall picture, including everyone else’s mailbox, which is where C/R fails.

But my favourite comment has to be from Al Iverson, on the membership-only list, SPAM-L (Al kindly gave me his permission to be quoted here):
C/R is trapped in this eternal September of newbie solution developers who think they're the bee's knees because they figured out how to implement a "new" version of C/R (which is usually exactly the same as every other one). Then they act like a kicked puppy when we don't jump for joy over how awesome it is to see...yet another implementation of C/R.

Eternal September of newbie solution developers? Priceless!

5 comments:

John Foster said...

Rich, I've been using C/R since 2003 and I like it. I pay Spamarrest about $2/mo for the service and it costs less than Postini at $2.50/mo. 95%-99% is unacceptable at near the same price and certainly unacceptable at a more expensive price than C/R. When I'm paying for a service, my Inbox is the only thing to consider. I'm so surprised that this whole SPAM argument consists of this kum ba ya world of other people's inboxes, when the main purpose of SPAM filtering is to prevent unsolicited commercial email. BTW, if it isn't trying to sell you something or scam you, it isn't SPAM under most legal statutes. Don't confuse C/R mail with the legal definition of SPAM.

Richi Jennings said...

Well, "Joh," here's the thing...

If you don't care about SpamArrest polluting my inbox with your misdirected challenges, you won't care that the company also sends challenges to spamtraps.

This causes SpamArrest to be added to blacklists.

This causes challenges to not be delivered.

This causes SpamArrest users to experience a significant false positive problem.

This isn't the place for fanboi-ism. The criticisms that damn C/R are well understood by anti-spam technologists. Mischaracterizing them as kumbaya simply makes it clear that you've not grasped the point.

John Foster said...

Rich, I wouldn't call it fan-boi-ism. It's a matter of what works. If your typical anti-spam product whacks an email it's gone forever. At least with C/R the sender knows he's been intercepted and can white list himself.

I've dealt with several government agencies where you get a response with no challenge. "Please alter the content of your message and resend it so our filters don't catch it" Well, that's really useful.

There's no perfect solution. I have friends and business associates who have moved off email to Facebook or Myspace for their personal and business communication. With the sheer number of people there, semi-proprietary closed systems such as those might be serious contenders for a real SPAM solution. What do you think about just dropping SMTP?

Richi Jennings said...

Joh, it's an interesting question.

Some time ago, I wrote about the "People are stopping using email" meme. I said that it's not so much that people are turning their backs on email as a medium, but that they have more media available to them now -- such as IM, SMS, and social network websites.

Nothing's changed my mind since then.

As Meng said recently, all such media attract spammers if they become sufficiently popular. Don't forget that spam was first a big problem on USENET -- email came later.

tzink said...

Measuring the effectiveness of spam filters, and then comparing them, is a very difficult task.

A static test is unrealistic because, while you might be able to get a corpus of spam messages, the content of spam changes very rapidly, so how do you know your corpus is representative of the spam that is being seen today?

That's one problem, but a bigger problem is capturing non-spam. Many static tests I've seen attempt to generate non-spam and send it from a single IP address. That's unrealistic and not representative at all.

Metrics require a lot of precision, and the task of acquiring them accurately, on a continual, ongoing basis, is non-trivial.
