From: jamal <hadi@cyberus.ca>
To: Thomas Graf <tgraf@suug.ch>
Cc: netdev@oss.sgi.com, Pablo Neira <pablo@eurodev.net>
Subject: Re: [RFC] textsearch infrastructure + skb_find_text()
Date: Thu, 05 May 2005 08:42:17 -0400
Message-ID: <1115296937.7680.52.camel@localhost.localdomain>
In-Reply-To: <20050504234036.GH18452@postel.suug.ch>

On Thu, 2005-05-05 at 01:40 +0200, Thomas Graf wrote:
> The patch below is a report on the current state of the textsearch
> infrastructure and its first user skb_find_text(). The textsearch
> is kept as simple as possible but advanced enough to handle non-linear
> data such as skb fragments. Unlike in many other approaches the text
> input is not seen as a single pointer but rather as a continuously
> called callback get_text() until 0 is returned allowing to search
> on any kind of data and to implement customized from-to limits.
> 

How is this different from libqsearch? IIRC, it also kept pointers and
callbacks.

BTW, I hope there's sync with libqsearch - at least some cannibalization
of ideas.
Also, hopefully plugging in new algorithms is trivial (e.g. Boyer-Moore
could be included in addition to KMP etc.)
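
One way the pluggability asked about here could look is sketched below in userspace C. This is only an illustration under stated assumptions: every name (`demo_ts_ops`, `demo_ts_lookup`, `demo_naive_find`, `demo_bmh_find`) is hypothetical and not taken from the patch or from libqsearch, and the second algorithm is Boyer-Moore-Horspool, a simplified variant of Boyer-Moore, rather than full Boyer-Moore.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-algorithm operations table; not the patch's API. */
struct demo_ts_ops {
	const char *name;
	/* Returns the offset of the first match, or -1 if there is none. */
	long (*find)(const unsigned char *text, size_t text_len,
		     const unsigned char *pat, size_t pat_len);
};

/* Brute-force search, used here as the reference algorithm. */
static long demo_naive_find(const unsigned char *text, size_t text_len,
			    const unsigned char *pat, size_t pat_len)
{
	size_t i;

	if (pat_len == 0 || pat_len > text_len)
		return -1;
	for (i = 0; i + pat_len <= text_len; i++)
		if (memcmp(text + i, pat, pat_len) == 0)
			return (long)i;
	return -1;
}

/* Boyer-Moore-Horspool: bad-character shifts only, no good-suffix rule. */
static long demo_bmh_find(const unsigned char *text, size_t text_len,
			  const unsigned char *pat, size_t pat_len)
{
	size_t shift[256];
	size_t i;

	if (pat_len == 0 || pat_len > text_len)
		return -1;
	for (i = 0; i < 256; i++)
		shift[i] = pat_len;
	/* The last pattern byte keeps the default shift. */
	for (i = 0; i + 1 < pat_len; i++)
		shift[pat[i]] = pat_len - 1 - i;

	i = 0;
	while (i + pat_len <= text_len) {
		if (memcmp(text + i, pat, pat_len) == 0)
			return (long)i;
		/* Skip ahead based on the byte under the window's last slot. */
		i += shift[text[i + pat_len - 1]];
	}
	return -1;
}

/* Adding an algorithm means adding one entry to this table. */
static const struct demo_ts_ops demo_algos[] = {
	{ "naive", demo_naive_find },
	{ "bmh",   demo_bmh_find },
};

/* Look an algorithm up by name; NULL if it is not registered. */
static const struct demo_ts_ops *demo_ts_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_algos) / sizeof(demo_algos[0]); i++)
		if (strcmp(demo_algos[i].name, name) == 0)
			return &demo_algos[i];
	return NULL;
}
```

A caller would pick an algorithm with `demo_ts_lookup("bmh")` and invoke its `find` pointer, so the calling code does not change when another algorithm is added to the table.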

> The patch is separated into 3 parts, the first one being the textsearch
> infrastructure itself followed by a simple Knuth-Morris-Pratt
> implementation for reference. I'm also working on what could be called
> the smallest regular expression implementation ever but I left that
> out for now since it still has issues. Last but not least the
> function skb_find_text() written in a hurry and probably not yet
> correct but you should get the idea. From a userspace perspective
> the first user will be an ematch but writing it will be peanuts
> so I left it out for now.
> 

nice

> Basically what it looks like right now is:
> 
> int pos;
> struct ts_state state;
> struct ts_config *conf = textsearch_prepare("kmp", "hanky", 5, GFP_KERNEL, 1);
>
> /* search for "hanky" at offset 20 until end of packet */
> for (pos = skb_find_text(skb, 20, INT_MAX, conf, &state);
>      pos >= 0;
>      pos = textsearch_next(conf, &state)) {
>         printk("Need a hanky? I found one at offset %d.\n", pos);
> }
> 

I have a lot of questions:
- does a string have to be terminated by \0?
- do you keep state of the string from the beginning? ex: how do you know
that preceding "hanky" was "Need a"?
- all sorts of limits: how long is the string? etc
- what happens if a string spans multiple skbs or even multiple
fragments?
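
The fragment and termination questions can be illustrated with a userspace sketch. It assumes a KMP matcher whose match state is kept across successive chunk callbacks and a pattern passed with an explicit length, so no \0 terminator is needed. All names (`demo_kmp`, `demo_get_text_t`, `demo_next_frag`, `DEMO_MAX_PATTERN`) are hypothetical; this is not the code from the patch, and it does not answer how the patch itself behaves.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PATTERN 64	/* arbitrary limit for this sketch */

/* Hypothetical chunk source: returns the chunk length, 0 when exhausted. */
typedef size_t (*demo_get_text_t)(void *ctx, const unsigned char **chunk);

struct demo_kmp {
	const unsigned char *pattern;
	size_t len;			/* explicit length, no \0 needed */
	size_t fail[DEMO_MAX_PATTERN];	/* KMP failure table */
};

/* Build the failure table for the pattern. */
static void demo_kmp_init(struct demo_kmp *k, const void *pattern, size_t len)
{
	size_t i, j = 0;

	assert(len > 0 && len <= DEMO_MAX_PATTERN);
	k->pattern = pattern;
	k->len = len;
	k->fail[0] = 0;
	for (i = 1; i < len; i++) {
		while (j > 0 && k->pattern[i] != k->pattern[j])
			j = k->fail[j - 1];
		if (k->pattern[i] == k->pattern[j])
			j++;
		k->fail[i] = j;
	}
}

/* Return the stream offset of the first match, or -1 if none is found. */
static long demo_kmp_find(const struct demo_kmp *k,
			  demo_get_text_t get_text, void *ctx)
{
	const unsigned char *chunk;
	size_t n, i, matched = 0;	/* matched survives chunk boundaries */
	long consumed = 0;		/* bytes seen in earlier chunks */

	while ((n = get_text(ctx, &chunk)) != 0) {
		for (i = 0; i < n; i++) {
			while (matched > 0 && chunk[i] != k->pattern[matched])
				matched = k->fail[matched - 1];
			if (chunk[i] == k->pattern[matched])
				matched++;
			if (matched == k->len)
				return consumed + (long)i + 1 - (long)k->len;
		}
		consumed += (long)n;
	}
	return -1;
}

/* Example chunk source: an array of fragments, standing in for skb frags. */
struct demo_frag {
	const char *data;
	size_t len;
};

struct demo_frag_iter {
	const struct demo_frag *frags;
	size_t count, next;
};

static size_t demo_next_frag(void *ctx, const unsigned char **chunk)
{
	struct demo_frag_iter *it = ctx;

	while (it->next < it->count) {
		const struct demo_frag *f = &it->frags[it->next++];

		if (f->len) {	/* skip empty fragments */
			*chunk = (const unsigned char *)f->data;
			return f->len;
		}
	}
	return 0;
}
```

Because KMP never re-reads earlier input, the only state that has to survive a fragment boundary is the count of pattern bytes matched so far, which is why a match split across fragments is still found without linearizing the data.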

> textsearch_put(conf);
> kfree(conf);
> 
> You might wonder about the 1 given to _prepare(),  it indicates whether
> to autoload modules because the ematches will need it to be able to drop
> rtnl sem.
> 

Do you really want to leave that decision up to the user?

> The code is not tested and certainly not bug free yet but should compile.
> 
> Thoughts?

I don't have time right now to look at the patch closely enough to
critique it, but it looks like a good start - maybe this weekend.
It would be nice to have other utilities which could be loaded, e.g. case
compare, regular expressions, strchr after you match, etc.
Of course all this to be followed by actions such as strtok etc.
Probably all of that is a layer above this - but keep those uses in mind
while you are designing it.

cheers,
jamal

Thread overview: 14+ messages
2005-05-04 23:40 [RFC] textsearch infrastructure + skb_find_text() Thomas Graf
2005-05-05 12:42 ` jamal [this message]
2005-05-05 14:12   ` Thomas Graf
2005-05-05 17:02 ` Pablo Neira
2005-05-05 17:42   ` Thomas Graf
2005-05-06  1:33     ` Pablo Neira
2005-05-06 12:36       ` Thomas Graf
2005-05-06 13:04         ` jamal
2005-05-06 14:43           ` Thomas Graf
2005-05-07 13:03             ` Jamal Hadi Salim
2005-05-08 11:45               ` Thomas Graf
2005-05-06 21:44 ` Thomas Graf
2005-05-07  0:17   ` YOSHIFUJI Hideaki / 吉藤英明
2005-05-07  0:36     ` Thomas Graf
