From: jamal <hadi@cyberus.ca>
To: Thomas Graf <tgraf@suug.ch>
Cc: netdev@oss.sgi.com, Pablo Neira <pablo@eurodev.net>
Subject: Re: [RFC] textsearch infrastructure + skb_find_text()
Date: Thu, 05 May 2005 08:42:17 -0400
Message-ID: <1115296937.7680.52.camel@localhost.localdomain>
In-Reply-To: <20050504234036.GH18452@postel.suug.ch>

On Thu, 2005-05-05 at 01:40 +0200, Thomas Graf wrote:
> The patch below is a report on the current state of the textsearch
> infrastructure and its first user skb_find_text(). The textsearch
> is kept as simple as possible but advanced enough to handle non-linear
> data such as skb fragments. Unlike in many other approaches, the text
> input is not seen as a single pointer but rather as a callback,
> get_text(), which is called repeatedly until it returns 0. This allows
> searching any kind of data and implementing customized from-to limits.
> 

How is this different from libqsearch? IIRC, it also kept pointers and
callbacks.

BTW, I hope there's sync with libqsearch - at least some cannibalization
of ideas.
Also, hopefully plugging in new algorithms is trivial (e.g. Boyer-Moore
could be included in addition to KMP etc.)
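
A minimal userspace sketch of what "trivial plugging" could look like,
assuming a table of named algorithms selected by string, much like the
"kmp" argument in the RFC. Every name here (search_ops, search_lookup,
the "naive" and "bmh" entries) is hypothetical and not taken from the
patch, and Boyer-Moore-Horspool stands in for full Boyer-Moore to keep
the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ops table: one entry per pluggable search algorithm. */
struct search_ops {
	const char *name;
	/* Returns the offset of the first match in text[0..tlen), or -1. */
	long (*find)(const unsigned char *text, size_t tlen,
		     const unsigned char *pat, size_t plen);
};

/* Brute-force reference implementation. */
static long naive_find(const unsigned char *text, size_t tlen,
		       const unsigned char *pat, size_t plen)
{
	size_t i;

	if (plen == 0 || plen > tlen)
		return -1;
	for (i = 0; i + plen <= tlen; i++)
		if (memcmp(text + i, pat, plen) == 0)
			return (long)i;
	return -1;
}

/* Boyer-Moore-Horspool: Boyer-Moore reduced to the bad-character shift. */
static long bmh_find(const unsigned char *text, size_t tlen,
		     const unsigned char *pat, size_t plen)
{
	size_t shift[256], i;

	assert(text != NULL && pat != NULL);
	if (plen == 0 || plen > tlen)
		return -1;
	for (i = 0; i < 256; i++)
		shift[i] = plen;		/* byte not in pattern */
	for (i = 0; i + 1 < plen; i++)
		shift[pat[i]] = plen - 1 - i;	/* last byte excluded */
	/* Shift by the table entry for the last byte of the window. */
	for (i = 0; i + plen <= tlen; i += shift[text[i + plen - 1]])
		if (memcmp(text + i, pat, plen) == 0)
			return (long)i;
	return -1;
}

/* Adding an algorithm means adding one line here. */
static const struct search_ops search_algos[] = {
	{ "naive", naive_find },
	{ "bmh",   bmh_find },
};

/* Look an algorithm up by name; NULL if it is not registered. */
static const struct search_ops *search_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(search_algos) / sizeof(search_algos[0]); i++)
		if (strcmp(search_algos[i].name, name) == 0)
			return &search_algos[i];
	return NULL;
}
```

A caller would do search_lookup("bmh") and invoke ops->find() on its
buffer; a kernel version would presumably use a registration list and
module loading rather than a static array.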

> The patch is separated into 3 parts, the first one being the textsearch
> infrastructure itself followed by a simple Knuth-Morris-Pratt
> implementation for reference. I'm also working on what could be called
> the smallest regular expression implementation ever but I left that
> out for now since it still has issues. Last but not least the
> function skb_find_text() written in a hurry and probably not yet
> correct but you should get the idea. From a userspace perspective
> the first user will be an ematch but writing it will be peanuts
> so I left it out for now.
> 

nice

> Basically what it looks like right now is:
> 
> int pos;
> struct ts_state state;
> struct ts_config *conf = textsearch_prepare("kmp", "hanky", 5, GFP_KERNEL, 1);
>
> /* search for "hanky" at offset 20 until end of packet */
> for (pos = skb_find_text(skb, 20, INT_MAX, conf, &state);
>      pos >= 0;
>      pos = textsearch_next(conf, &state)) {
>         printk("Need a hanky? I found one at offset %d.\n", pos);
> }
> 

I have a lot of questions:
- does a string have to be terminated by \0?
- do you keep state of the string from the beginning? e.g. how do you know
that what preceded "hanky" was "Need a"?
- all sorts of limits: how long is the string? etc
- what happens if a string spans multiple skbs or even multiple
fragments?
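
To make the fragment question concrete, here is a userspace sketch of a
KMP search fed by a get_text()-style callback. It assumes the pattern is
passed with an explicit length (so no \0 is needed) and that the match
state lives outside the chunks, which is what lets a match straddle a
fragment boundary. All names (kmp_config, kmp_state, kmp_next,
frag_list) and the 64-byte pattern limit are made up for illustration;
this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TS_MAX_PATTERN 64	/* hypothetical limit for this sketch */

struct kmp_config {
	const unsigned char *pattern;	/* need not be NUL-terminated */
	size_t len;
	size_t prefix[TS_MAX_PATTERN];	/* KMP failure table */
};

struct kmp_state {			/* zero-initialize before use */
	const unsigned char *chunk;	/* chunk currently being scanned */
	size_t chunk_len, chunk_pos;
	size_t matched;			/* survives chunk boundaries */
	size_t consumed;		/* bytes scanned across all chunks */
};

/* Stores the next chunk in *dst and returns its length, 0 at the end. */
typedef size_t (*get_text_fn)(void *src, const unsigned char **dst);

static void kmp_prepare(struct kmp_config *conf, const void *pattern,
			size_t len)
{
	size_t k = 0, i;

	assert(len > 0 && len <= TS_MAX_PATTERN);
	conf->pattern = pattern;
	conf->len = len;
	conf->prefix[0] = 0;
	for (i = 1; i < len; i++) {
		while (k > 0 && conf->pattern[k] != conf->pattern[i])
			k = conf->prefix[k - 1];
		if (conf->pattern[k] == conf->pattern[i])
			k++;
		conf->prefix[i] = k;
	}
}

/* Returns the absolute offset of the next match, or -1 when text ends. */
static long kmp_next(const struct kmp_config *conf, struct kmp_state *st,
		     get_text_fn get_text, void *src)
{
	for (;;) {
		while (st->chunk_pos < st->chunk_len) {
			unsigned char c = st->chunk[st->chunk_pos++];

			st->consumed++;
			while (st->matched > 0 &&
			       conf->pattern[st->matched] != c)
				st->matched = conf->prefix[st->matched - 1];
			if (conf->pattern[st->matched] == c)
				st->matched++;
			if (st->matched == conf->len) {
				/* Fall back so overlapping matches are found. */
				st->matched = conf->prefix[st->matched - 1];
				return (long)(st->consumed - conf->len);
			}
		}
		st->chunk_len = get_text(src, &st->chunk);
		st->chunk_pos = 0;
		if (st->chunk_len == 0)
			return -1;
	}
}

/* Demo text source: an array of C strings standing in for fragments. */
struct frag_list {
	const char **frags;
	size_t nfrags;
	size_t next;
};

static size_t frag_get_text(void *src, const unsigned char **dst)
{
	struct frag_list *fl = src;

	while (fl->next < fl->nfrags) {
		const char *f = fl->frags[fl->next++];
		size_t n = strlen(f);

		if (n > 0) {		/* skip empty fragments */
			*dst = (const unsigned char *)f;
			return n;
		}
	}
	return 0;
}
```

With the fragments "Need a han", "ky? Another ha" and "nky.", repeated
calls to kmp_next() for "hanky" return 7, then 22, then -1, even though
both matches cross a fragment boundary. Spanning multiple skbs would
only need a callback that walks from one skb to the next.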

> textsearch_put(conf);
> kfree(conf);
> 
> You might wonder about the 1 given to _prepare(): it indicates whether
> to autoload modules because the ematches will need it to be able to drop
> rtnl sem.
> 

Do you really want to leave that decision up to the user?

> The code is not tested and certainly not bug free yet but should compile.
> 
> Thoughts?

I don't have time to look at the patch closely enough to critique it, but
it looks like a good start - maybe this weekend.
It would be nice to have other utilities which could be loaded, e.g. case
comparison, regular expressions, strchr after you match, etc.
Of course, all this would be followed by actions such as strtok etc.
Probably all of this is a layer above yours, but keep these needs in mind
while you are working on it.

cheers,
jamal
