From: Howard Chu <hyc@symas.com>
To: John Graham-Cumming <antispam@jgc.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Spam, bogofilter, etc
Date: Tue, 03 Oct 2006 01:52:55 -0700
Message-ID: <452224E7.9060105@symas.com>
In-Reply-To: <loom.20061003T100646-668@post.gmane.org>

John Graham-Cumming wrote:
> Linus Torvalds <torvalds <at> osdl.org> writes:
>> I'm sorry, but spam-filtering is simply harder than the bayesian 
>> word-count weenies think it is. I even used to _know_ something about 
>> bayesian filtering, since it was one of the projects I worked on at uni, 
>> and dammit, it's not a good approach, as shown by the fact that it's 
>> trivial to get around.

> Have you actually followed any of the research into Bayesian (and similar
> machine learning based) anti-spam filtering, and attacks on such filters?  Are
> you making a claim that these filters are 'trivial to get around' based on a
> project you did at University over 10 years ago?

Well, the recent spate of spam with technical jargon keywords in the 
subject line was enough to make my Seamonkey client start marking all 
incoming mail as spam. It's interesting that recent journals describe 
this as a way to get spam past current filters; here it had the 
reverse effect.

So much for email management at our hosting provider. At least on my 
highlandsun.com domain I've got my own sendmail milter blocking spam 
before it gets into the server. It's basically the equivalent of a 
sendmail accessdb in LDAP, plus simple rules to reject relays from 
unregistered IP addresses, or from addresses with dynamically generated 
hostnames. Rejecting with a 451 temporary failure is also useful: most 
bulk mailer programs fail immediately and go away. Real mail servers 
will retry; by looking at the logs of the envelope FROM and RCPT I can 
pick out any emails that should have been let through and add an OK 
exception to LDAP so the message eventually gets redelivered. I suppose 
I could put a URL in the reject error message, and let the sender 
confirm it from there. At this point the only spam that gets through is 
from dedicated mass marketers with legitimate DNS registrations, and I 
just manually add their subnets to my blacklist.
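
For anyone who wants to try the same idea, the decision order described 
above can be sketched roughly as follows. This is not the actual milter: 
the plain dict standing in for the LDAP accessdb, the names decide() and 
looks_dynamic(), the keyword list, and the reading of "unregistered" as 
"no reverse DNS record" are all assumptions made for illustration.

```python
import ipaddress
import re

# Hypothetical keyword hints for dynamically generated hostnames.
DYNAMIC_WORDS = re.compile(
    r"(?:^|[.\-])(?:dyn|dynamic|dhcp|dsl|pool|ppp|cable)(?:[.\-\d]|$)",
    re.IGNORECASE,
)


def looks_dynamic(hostname, ip):
    """Guess whether a PTR hostname was generated from the IP address."""
    digit_runs = re.findall(r"\d+", hostname)
    # Count how many octets of the address show up as numbers in the name.
    embedded = sum(1 for octet in ip.split(".") if octet in digit_runs)
    return embedded >= 3 or bool(DYNAMIC_WORDS.search(hostname))


def decide(ip, ptr_hostname, access_db, blacklist):
    """Return an (SMTP code, reason) pair for a connecting relay.

    access_db is a dict standing in for the LDAP accessdb (assumption);
    blacklist is a list of CIDR strings added by hand.
    """
    # Manually added OK exceptions win over every other rule.
    if access_db.get(ip) == "OK":
        return 250, "accepted (explicit exception)"
    addr = ipaddress.ip_address(ip)
    if any(addr in ipaddress.ip_network(net) for net in blacklist):
        return 550, "rejected (blacklisted subnet)"
    # Temporary failures: real mail servers retry, most bulk mailers do not.
    if ptr_hostname is None:
        return 451, "temporary failure (no reverse DNS)"
    if looks_dynamic(ptr_hostname, ip):
        return 451, "temporary failure (dynamic-looking hostname)"
    return 250, "accepted"
```

Checking the exception table first is what lets a legitimate sender that 
was tempfailed get through on a later retry once an OK entry has been 
added for it.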

(One then is faced with the interesting question - what if someone from 
one of those companies was actually trying to hire my services? Their 
loss I guess, sometimes money really is tainted...)
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc
   OpenLDAP Core Team            http://www.openldap.org/project/
