Linux Netfilter discussions
 help / color / mirror / Atom feed
From: "James King" <t.james.king@gmail.com>
To: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>,
	Jan Engelhardt <jengelh@medozas.de>, Dave <finalglide@gmail.com>,
	netfilter@vger.kernel.org,
	Netfilter Development Mailinglist
	<netfilter-devel@vger.kernel.org>
Subject: Re: POM Xtables???
Date: Fri, 15 Aug 2008 01:17:41 -0700	[thread overview]
Message-ID: <38bcb3ec0808150117n729cb7b1yadcfa9018b9714be@mail.gmail.com> (raw)
In-Reply-To: <48884ED3.8080901@trash.net>

(Sorry for the spam Patrick, I just realized that I didn't reply-all
the first time around)

On Thu, Jul 24, 2008 at 2:43 AM, Patrick McHardy <kaber@trash.net> wrote:
>> James King wrote:
>>> One thing
>>> I'm not sure of is whether the license used by the Henry Spencer regex
>>> library it depends on is acceptable by kernel standards (or whether
>>> it's permissive enough to relicense under GPL, as IANAL).
>
> I know of the regexp.old.zip library, which IIRC used a GPL
> compatible license (and a non-POSIX conform interface).
>
>> If we want to do this in-kernel I think that it's better if it must use
>> the textsearch infrastructure. Probably it would require some patches to
>> extend the existing infrastructure.
>
> I'd also prefer that over adding a regex library to the kernel.
> I think one of the bigger problems is that there are dependencies
> in the match that can't be easily expressed, like "byte 4 has
> skb->len - 10". At least ipp2p does something like that. But
> maybe thats not necessary, since l7filter already uses regexes
> there's apparently a different method for doing this.
>
> It would be useful if someone could post an excerpt from the
> l7filter expressions or simply the entire patch.

ipp2p and l7filter both use different strategies for DPI
classification, each having their own pros and cons.

ipp2p has the potential to classify traffic faster since it uses magic
values/offsets before moving on to more detailed searches.  As you
noted above, the match can also be a bit "deeper" since it can express
more complex relationships.  It's also better for memory usage, since
it doesn't store packet data.  The downside is poor configurability
and maintainability.

l7filter adds a buffer (default size of 2048 bytes, kmalloc'd on the
first packet of a flow, and freed after classification is complete) to
the conntrack and appends inbound and outbound packets into this
buffer (in the order received), up to a configurable number of packets
before giving up on classification.  It runs each configured regex
rule against this buffer in sequence, which can get very expensive
especially when you have a lot of matches or your patterns include a
lot of branches, since a large amount of search work is potentially
being duplicated by each rule.  However, being able to setup these
matches at runtime is what makes l7filter much more convenient to use
overall.

After reviewing the l7filter code a bit, I've found that it does some
nasty things at present like holding a BH spinlock over the entire
match function, and allocating a per-conntrack character buffer
containing the string name of the pattern that we've matched (eg.
http-audio, bittorrent, battlefield2, etc) for the entire life of the
conntrack.  Also, the work necessary to convert the backend to use
textsearch and still maintain compatibility with the existing patterns
seems a bit prohibitive at the moment.

I think the best thing to do for now is to move it into xtables-addons
(providing Jan has no objections) for further review and cleanup.
I've had a bit of time to work on this during my vacation, but I was
wondering if I could get some clarification on a couple things:
-Under what circumstances does it become mandatory to implement the
ct_extend move callback?
-The description from Krzystof's recent accounting patch (which btw
doesn't appear to have been merged during the last window after all)
notes:

"...newly created connections get additional acct extend.  Old
connections are not changed as it is not possible to add a ct_extend
area to confirmed conntrack"

His patch adds a call to nf_conntrack_acct_init() inside of
conntrack_init() to add the private area.  For an out of tree module,
obviously this isn't a good solution (I'd like to use this module
against a stock kernel if possible).  I suppose I could include
something like this near the beginning of the match function:

layer7 = nf_ct_ext_find(ct, NF_CT_EXT_LAYER7);
if (!layer7 && !nf_ct_is_confirmed(ct))
	ret = nf_ct_ext_add(ct, NF_CT_EXT_LAYER7, GFP_ATOMIC);

...but this seems kludgy.  Is there a better way of doing this?  If
not, maybe we could add an init callback to ct_extend (and basically
clone nf_ct_ext_destroy() and __nf_ct_ext_destroy()), and call it from
conntrack_init(), so any number of extensions can be added without
having to patch conntrack_init() each time.

-In the same thread of avoiding a kernel recompile, it would be great
if there was a way to add a new extension type on the fly, or at least
had one enumeration reserved for use by out of tree modules (eg.
NF_CT_EXT_RESERVED).  Any thoughts here?

  reply	other threads:[~2008-08-15  8:17 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-27 17:54 POM Xtables??? Dave
2008-06-27 18:58 ` Jan Engelhardt
2008-06-27 20:08   ` Dave
2008-06-27 21:16     ` Jan Engelhardt
2008-06-29  2:20   ` Grant Taylor
2008-06-30 16:04     ` Dave
2008-06-30 16:20       ` Patrick McHardy
2008-06-30 20:46         ` Jan Engelhardt
2008-06-30 20:52           ` Patrick McHardy
2008-07-01  9:43             ` Jozsef Kadlecsik
2008-07-01  9:46               ` Patrick McHardy
2008-07-01 11:38                 ` Jan Engelhardt
2008-07-01 11:43                   ` Patrick McHardy
2008-07-01 11:50                     ` Jan Engelhardt
2008-07-01 11:57                       ` Patrick McHardy
2008-07-01 14:05                     ` Grant Taylor
2008-07-01 14:10                       ` Patrick McHardy
2008-07-01 14:27                         ` Grant Taylor
2008-07-01 14:34                           ` Patrick McHardy
2008-07-01 14:30                       ` Jan Engelhardt
2008-07-23 20:19             ` Jan Engelhardt
2008-07-23 23:21               ` Patrick McHardy
2008-07-24  8:31                 ` James King
2008-07-24  9:21                   ` Pablo Neira Ayuso
2008-07-24  9:43                     ` Patrick McHardy
2008-08-15  8:17                       ` James King [this message]
2008-08-19 11:35                         ` Brent Clark
2008-08-15  8:48                     ` James King
2008-06-30 21:11         ` Jozsef Kadlecsik
2008-06-30 21:47           ` Jan Engelhardt
2008-07-01 10:00             ` Jozsef Kadlecsik
2008-07-01 11:19               ` Jan Engelhardt
2008-06-30 20:18       ` Jan Engelhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38bcb3ec0808150117n729cb7b1yadcfa9018b9714be@mail.gmail.com \
    --to=t.james.king@gmail.com \
    --cc=finalglide@gmail.com \
    --cc=jengelh@medozas.de \
    --cc=kaber@trash.net \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=netfilter@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox