* [RFC] string matching based packet classification/filtering
@ 2005-02-15 20:32 Thomas Graf
2005-02-15 21:41 ` Pablo Neira
0 siblings, 1 reply; 6+ messages in thread
From: Thomas Graf @ 2005-02-15 20:32 UTC (permalink / raw)
To: Pablo Neira, Harald Welte; +Cc: netdev, netfilter-devel
We have been discussing string matching based packet classification and
filterings a few times already and I'd like to make it serious this time
to get the string matching ematch ready for 2.6.12 inclusion. I'm aware
of the bayer-moore based patch by Emmanuel Roger, Gianni Tedesco, and Pablo
but I also heard about a generic string matching architecture supporting
various algorithms I haven't found that patchset though.
Is there any effort going into the generic architecture? Any plans for
a stateful string matching netfilter module? As it was mentioned already
we could share some code between the ematch and netfilter. I do not care
for the algorithm, actually I think it doesn't matter at all as long as
it's not a naive linear search. The essential parts are to be able to
define a searching range and to support paged skbs. If there is someone
going for the generic architecture fullfilling the essential parts
just described then I'll be more than happy to use that bit of code
otherwise I'd be happy to discuss the requirements of both sides and
try to find a compromise both sides can live with.
The requirements from my side:
In:
o pattern as byte stream
o length of pattern
o begin of search range (skb layer + offset)
o end of search range (skb layer + offset)
o (p)skb
Out:
o true or false
Applying this on the recently posted implementation by Pablo it shows
that it nearly fits already except for the search range. Additionaly
it could be improved by using prefix optimizations for the fragment
border regions instead of a naive string search which would help for
large patterns on paged skbs.
If needed an additional input argument could be added specifying the
algorithm to be used. Eventually it requires an additional algoirthm
specific argument carrying meta data such as prefix lookup tables.
Thoughts?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] string matching based packet classification/filtering
2005-02-15 20:32 [RFC] string matching based packet classification/filtering Thomas Graf
@ 2005-02-15 21:41 ` Pablo Neira
2005-02-15 21:56 ` Thomas Graf
2005-02-16 22:30 ` Thomas Graf
0 siblings, 2 replies; 6+ messages in thread
From: Pablo Neira @ 2005-02-15 21:41 UTC (permalink / raw)
To: Thomas Graf; +Cc: netdev, netfilter-devel, Harald Welte
Hi Thomas,
Thomas Graf wrote:
>We have been discussing string matching based packet classification and
>filterings a few times already and I'd like to make it serious this time
>to get the string matching ematch ready for 2.6.12 inclusion.
>
I agree with you. I'd like to finish see something usable. On the other
hand, there's no need to rush anyway.
> I'm aware
>of the bayer-moore based patch by Emmanuel Roger, Gianni Tedesco, and Pablo
>but I also heard about a generic string matching architecture supporting
>various algorithms I haven't found that patchset though.
>
>Is there any effort going into the generic architecture?
>
Actually I wanted to contact Harald after this friday, once I'm done
with my exams. I'd like to merge my work with his libqsearch hackings
that were about to be finished.
> Any plans for
>a stateful string matching netfilter module? As it was mentioned already
>we could share some code between the ematch and netfilter.
>
Indeed, I think that we must share that code.
> I do not care
>for the algorithm, actually I think it doesn't matter at all as long as
>it's not a naive linear search.
>
An infrastructure based on libqsearch let us have different algorithms,
so we could use whatever algorithm. This won't be a problem.
> The essential parts are to be able t define a searching range and to support paged skbs.
>
I've been working on this, I'll pass you what I have as soon as I get
time to review it again.
> If there is someone
>going for the generic architecture fullfilling the essential parts
>just described then I'll be more than happy to use that bit of code
>otherwise I'd be happy to discuss the requirements of both sides and
>try to find a compromise both sides can live with.
>
>The requirements from my side:
> In:
> o pattern as byte stream
> o length of pattern
> o begin of search range (skb layer + offset)
> o end of search range (skb layer + offset)
> o (p)skb
> Out:
> o true or false
>
>
Those look fine for me. Actually I also need more than a true of false,
a pointer to where a matching happens. This is a requirement for
conntrack helpers.
>Applying this on the recently posted implementation by Pablo it shows
>that it nearly fits already except for the search range. Additionaly
>it could be improved by using prefix optimizations for the fragment
>border regions instead of a naive string search which would help for
>large patterns on paged skbs.
>
>
Sure that you can improve current way of doing things :)
>If needed an additional input argument could be added specifying the
>algorithm to be used. Eventually it requires an additional algoirthm
>specific argument carrying meta data such as prefix lookup tables.
>
>Thoughts?
>
>
Harald,any thoughts?
--
Pablo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] string matching based packet classification/filtering
2005-02-15 21:41 ` Pablo Neira
@ 2005-02-15 21:56 ` Thomas Graf
2005-02-16 22:30 ` Thomas Graf
1 sibling, 0 replies; 6+ messages in thread
From: Thomas Graf @ 2005-02-15 21:56 UTC (permalink / raw)
To: Pablo Neira; +Cc: netdev, netfilter-devel, Harald Welte
Pablo,
> I agree with you. I'd like to finish see something usable. On the other
> hand, there's no need to rush anyway.
Agreed, I need for work though and want to make a decision wether to
keep putting effort in em_kmp.c or focus on a new way.
> Actually I wanted to contact Harald after this friday, once I'm done
> with my exams. I'd like to merge my work with his libqsearch hackings
> that were about to be finished.
libqsearch, exactly. Let me know once it's finished.
> Those look fine for me. Actually I also need more than a true of false,
> a pointer to where a matching happens. This is a requirement for
> conntrack helpers.
Fine with me.
I'll sit thight and wait for libqsearch to come up. Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] string matching based packet classification/filtering
2005-02-15 21:41 ` Pablo Neira
2005-02-15 21:56 ` Thomas Graf
@ 2005-02-16 22:30 ` Thomas Graf
2005-02-17 1:00 ` Pablo Neira
1 sibling, 1 reply; 6+ messages in thread
From: Thomas Graf @ 2005-02-16 22:30 UTC (permalink / raw)
To: Pablo Neira; +Cc: netdev, netfilter-devel, Harald Welte
> Actually I wanted to contact Harald after this friday, once I'm done
> with my exams. I'd like to merge my work with his libqsearch hackings
> that were about to be finished.
Thinking about it, we can use the architecture to implement a regular
expression match. The pattern would consist of an array of tokens
in the form of:
struct regexp_token
{
__u8 type;
__u8 recur;
__u8 value;
__u8 unused;
};
where type would be
- specific character (must match `value')
- wildcard
- digit
- xdigit
- alpha ...
and recur
- 1
- 0..1
- 0..n
- 1..0
The matching algorithm would parse the array as a finite automation
eating up byte by byte in the text. I think it would be efficient
enough, easy to implement, not too error prone and powerful enough
for most needs but we can discuss this in more details once libqsearch
is ready. It would be one step closer to the increasing need for
application level based filtering and classification.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] string matching based packet classification/filtering
2005-02-16 22:30 ` Thomas Graf
@ 2005-02-17 1:00 ` Pablo Neira
2005-02-17 13:31 ` Thomas Graf
0 siblings, 1 reply; 6+ messages in thread
From: Pablo Neira @ 2005-02-17 1:00 UTC (permalink / raw)
To: Thomas Graf; +Cc: netdev, Netfilter Development Mailinglist, Harald Welte
Thomas Graf wrote:
>>Actually I wanted to contact Harald after this friday, once I'm done
>>with my exams. I'd like to merge my work with his libqsearch hackings
>>that were about to be finished.
>>
>>
>
>Thinking about it, we can use the architecture to implement a regular
>expression match. The pattern would consist of an array of tokens
>in the form of:
>
>struct regexp_token
>{
> __u8 type;
> __u8 recur;
> __u8 value;
> __u8 unused;
>};
>
>
Yes, this is a good idea. This could be useful to match things like
interrogation marks or asterisks that are usually used as wildcard.
>where type would be
> - specific character (must match `value')
> - wildcard
> - digit
> - xdigit
> - alpha ...
>
>and recur
> - 1
> - 0..1
> - 0..n
> - 1..0
>
>The matching algorithm would parse the array as a finite automation
>eating up byte by byte in the text.
>
Looks good but Boyer-Moore algorithm doesn't need this at all. BM
doesn't support wildcards like '*' because of the shiftings that it uses
to look for matches. This issue together with memory consumption are two
limitations that we have to live with if we want to use BM. Anyway I
think it's worth it because we can perform search in O(n/m).
--
Pablo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] string matching based packet classification/filtering
2005-02-17 1:00 ` Pablo Neira
@ 2005-02-17 13:31 ` Thomas Graf
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Graf @ 2005-02-17 13:31 UTC (permalink / raw)
To: Pablo Neira; +Cc: netdev, Netfilter Development Mailinglist, Harald Welte
> >The matching algorithm would parse the array as a finite automation
> >eating up byte by byte in the text.
> >
>
> Looks good but Boyer-Moore algorithm doesn't need this at all. BM
> doesn't support wildcards like '*' because of the shiftings that it uses
> to look for matches. This issue together with memory consumption are two
> limitations that we have to live with if we want to use BM. Anyway I
> think it's worth it because we can perform search in O(n/m).
This would be _additional_ to BM/KMP/naive/finite automata/you name it.
The benefit of libqsearch is hide various algorithms behind a single
interface, isn't it? It's argueable whether it would make sense to try
and rule out bad shifts while parsing a chain of non-wildcard regular
expression tokens but I guess that would be over the top.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2005-02-17 13:31 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-02-15 20:32 [RFC] string matching based packet classification/filtering Thomas Graf
2005-02-15 21:41 ` Pablo Neira
2005-02-15 21:56 ` Thomas Graf
2005-02-16 22:30 ` Thomas Graf
2005-02-17 1:00 ` Pablo Neira
2005-02-17 13:31 ` Thomas Graf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).