From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Graf
Subject: Re: [RFC] string matching ematch
Date: Wed, 26 Jan 2005 22:41:19 +0100
Message-ID: <20050126214119.GP31837@postel.suug.ch>
References: <20050126150714.GL31837@postel.suug.ch> <20050126130323.2dc10187.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: hadi@cyberus.ca, kaber@trash.net, netdev@oss.sgi.com
Return-path:
To: "David S. Miller"
Content-Disposition: inline
In-Reply-To: <20050126130323.2dc10187.davem@davemloft.net>
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
List-Id: netdev.vger.kernel.org

* David S. Miller <20050126130323.2dc10187.davem@davemloft.net> 2005-01-26 13:03
> On Wed, 26 Jan 2005 16:07:14 +0100
> Thomas Graf wrote:
>
> > I'd like to discuss the string matching ematch, I don't care about the
> > algorithm used but rather whether to make it stateful, match over
> > fragments, etc.
>
> I think you'll need to make it stateful.
>
> I assume this is meant to be used for things like catching references
> to "Falun Gong" in SMTP sessions and stuff like that. Not that I know
> any entity interested in such applications :-)

Hehe, its main purpose is to catch mail from your sweetie and redirect
it through a low latency link but of course you can also use it to
match on text based protocols without strict header ordering. ;->

> Anyways, if the string goes across the TCP data portion of multiple
> packets, statefulness becomes necessary to catch it. Right?

Yes and no, it is of course necessary if one wants to match any
string at any position without limitation. OTOH, it gets quite complex.
We'd have to store the state of every configured kmp ematch just to be
able to tell the result. On top of that, the whole classification
process is stateless and should be kept like this.
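To make "store the state of every configured kmp ematch" concrete, here is a
rough userspace sketch in Python, not kernel code. The names build_failure
and kmp_feed and the two payloads are made up for illustration, and the
pattern is assumed to be non-empty. The only state that has to survive
between packets is the number of pattern bytes matched so far.

```python
def build_failure(pattern):
    """Classic KMP failure table: fail[i] is the length of the longest
    proper prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_feed(pattern, fail, state, payload):
    """Scan one packet payload, resuming from `state` (the number of
    pattern bytes already matched). Returns (matched, new_state)."""
    for byte in payload:
        while state > 0 and byte != pattern[state]:
            state = fail[state - 1]
        if byte == pattern[state]:
            state += 1
        if state == len(pattern):
            return True, 0
    return False, state


pattern = b"Falun Gong"
fail = build_failure(pattern)
pkt1, pkt2 = b"Subject: Falun", b" Gong meeting"  # hypothetical payloads

# Stateless: every packet starts from state 0, so the split string is missed.
stateless = [kmp_feed(pattern, fail, 0, p)[0] for p in (pkt1, pkt2)]

# Stateful: the saved state from packet 1 is handed to packet 2.
hit1, saved = kmp_feed(pattern, fail, 0, pkt1)
hit2, _ = kmp_feed(pattern, fail, saved, pkt2)
```

One integer per flow and per configured kmp ematch looks cheap here, but
somebody has to look up the flow, keep that integer around and expire it,
which is exactly what the classifier does not do today.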
Assuming one configures three ematches like this:

  u32(ip dport 333 0xff) and (
      kmp("Falun Gong" from 20 layer transport) and
      nbyte("SMTP" at 0 layer application)
  )

Assuming the u32 and nbyte ematches match in the first packet, the
string matches only partially. We can't regard the ematch tree as
matched so we must return false. The next packet in the flow completes
the string but the nbyte ematch doesn't match anymore so no match
either. In fact a stateless filter can't do any better but it doesn't
consume as many resources.

There are cases where stateful string matching would be of use,
one of them is when it doesn't matter which packet you actually
classify, e.g. dropping connections so as to protect your web server
from silly requests.

I'm not sure if mixing stateful with stateless stuff is such a
good idea. I think it should be separated and have stateful filters
only be executed when the flow matters, not packets.
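The two-packet case from the ematch tree above can be played through with a
small Python simulation. Everything in it is an assumption for illustration:
packets are plain dicts, the "from 20" and "at 0" offsets are simplified
away, and the stateful string matcher is a carry-over buffer of the last
len(PATTERN) - 1 bytes of the flow rather than real KMP state.

```python
PATTERN = b"Falun Gong"


def u32_dport(pkt, port):
    # Stand-in for u32(ip dport 333 0xff)
    return pkt["dport"] == port


def nbyte_at0(pkt, needle):
    # Stand-in for nbyte("SMTP" at 0 layer application)
    return pkt["payload"].startswith(needle)


def string_match(pkt, flow_state):
    """Stand-in for a stateful kmp ematch: remembers the tail of the
    previous payloads so a string split across packets is still found."""
    window = flow_state.get("tail", b"") + pkt["payload"]
    flow_state["tail"] = window[-(len(PATTERN) - 1):]
    return PATTERN in window


def classify(pkt, flow_state):
    # Run the string matcher unconditionally so it sees every packet,
    # then combine the three results the way the ematch tree does.
    kmp_hit = string_match(pkt, flow_state)
    return u32_dport(pkt, 333) and kmp_hit and nbyte_at0(pkt, b"SMTP")


flow = {}
packets = [
    {"dport": 333, "payload": b"SMTP hello Falun"},  # nbyte hits, string partial
    {"dport": 333, "payload": b" Gong and more"},    # string completes, nbyte misses
]
results = [classify(p, flow) for p in packets]
```

Even with the string matcher carrying state, neither packet satisfies the
whole tree, because the other ematches are still evaluated per packet.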