Re: [RFC] textsearch infrastructure + skb_find_text()

From: Jamal Hadi Salim <hadi@znyx.com>
To: Thomas Graf <tgraf@suug.ch>
Cc: netdev@oss.sgi.com, Pablo Neira <pablo@eurodev.net>
Subject: Re: [RFC] textsearch infrastructure + skb_find_text()
Date: Sat, 07 May 2005 09:03:04 -0400
Message-ID: <1115470985.19561.58.camel@localhost.localdomain>
In-Reply-To: <20050506144308.GF28419@postel.suug.ch>

On Fri, 2005-05-06 at 16:43 +0200, Thomas Graf wrote:

> As you can see, it expects a char * in args[0] and the length of it
> in args[1]. All it does is check whether all bytes have been read
> already and if not return the remaining part of the buffer so even
> if the search algorithm can't consume all the bytes returned it will
> still work as expected.
> 

OK, makes sense. In the case of a string spanning multiple skbs, I
suppose it wouldn't matter, correct?
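
For readers without the patch at hand, the callback described in the
quote can be sketched roughly as below. This is only an illustration
under assumptions, not the code from the RFC: the names text_state,
get_text_sketch and find_text_sketch are hypothetical, and buf/len
stand in for what the quote calls args[0] and args[1]. The matcher
deliberately looks at no more than `step` bytes per call to show that
leaving part of the returned text unconsumed is harmless.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical state: buf/len play the role of args[0]/args[1]. */
struct text_state {
	const char *buf;	/* start of the text to search */
	size_t len;		/* total length of the text */
};

/*
 * Hand out whatever part of the buffer has not been read yet.
 * `consumed` is the number of bytes the caller has already processed.
 * Returns the number of bytes available at *dst, or 0 when done.
 */
static size_t get_text_sketch(size_t consumed, const char **dst,
			      const struct text_state *st)
{
	if (consumed >= st->len)
		return 0;

	*dst = st->buf + consumed;
	return st->len - consumed;
}

/*
 * Naive matcher that tries at most `step` start positions per call to
 * get_text_sketch(), so it usually consumes less than was returned.
 * Returns the offset of the first match, or -1 if there is none.
 */
static long find_text_sketch(const struct text_state *st,
			     const char *pat, size_t patlen, size_t step)
{
	size_t consumed = 0, n, i;
	const char *p;

	if (patlen == 0 || step == 0)
		return -1;

	while ((n = get_text_sketch(consumed, &p, st)) > 0) {
		size_t window = n < step ? n : step;

		/* Only try positions where the whole pattern still fits. */
		for (i = 0; i < window && i + patlen <= n; i++)
			if (memcmp(p + i, pat, patlen) == 0)
				return (long)(consumed + i);

		consumed += window;
	}

	return -1;
}
```

Calling find_text_sketch() with a small step and with a step larger
than the buffer gives the same result, because the callback always
resumes from the caller's `consumed` offset.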

[..] 

> Not sure if this is clear out of context but maybe it gives you an idea
> why it is easier to maintain state of get_text() rather than the state
> of a whole searching algorithm.
> 

I got it. I suppose that for text contained within a single skb this
would be an improvement. For text spanning multiple skbs it should make
no difference, but it is an improvement nonetheless.
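
To make the trade-off concrete, here is a rough sketch of the "keep the
state in the text source" approach for text split across several
fragments. Everything in it is an assumption made for illustration: the
frag and frag_state types, next_frag_sketch(), kmp_find_sketch() and
the MAX_PAT limit are hypothetical and are not the interfaces from the
RFC, and KMP is used only because it is a convenient streaming matcher.
The point is that the only state saved between fetches is a fragment
index, while the matcher's own state stays in local variables.

```c
#include <stddef.h>

#define MAX_PAT 32	/* arbitrary limit for this sketch */

struct frag {
	const char *data;
	size_t len;
};

/* The only state kept between fetches: which fragment comes next. */
struct frag_state {
	const struct frag *frags;
	size_t nfrags;
	size_t next;
};

/* Return the next non-empty fragment, or 0 when none are left. */
static size_t next_frag_sketch(struct frag_state *st, const char **dst)
{
	while (st->next < st->nfrags) {
		const struct frag *f = &st->frags[st->next++];

		if (f->len) {
			*dst = f->data;
			return f->len;
		}
	}
	return 0;
}

/*
 * KMP search over the fragment stream. The match progress (q) lives in
 * a local variable and simply carries over from one fragment to the
 * next, so a pattern may straddle a fragment boundary.
 * Returns the offset of the first match in the stream, or -1.
 */
static long kmp_find_sketch(struct frag_state *st,
			    const char *pat, size_t m)
{
	size_t fail[MAX_PAT];
	size_t i, k, n, q = 0, pos = 0;
	const char *p;

	if (m == 0 || m > MAX_PAT)
		return -1;

	/* Build the KMP failure table for the pattern. */
	fail[0] = 0;
	for (k = 0, i = 1; i < m; i++) {
		while (k > 0 && pat[i] != pat[k])
			k = fail[k - 1];
		if (pat[i] == pat[k])
			k++;
		fail[i] = k;
	}

	while ((n = next_frag_sketch(st, &p)) > 0) {
		for (i = 0; i < n; i++, pos++) {
			while (q > 0 && p[i] != pat[q])
				q = fail[q - 1];
			if (p[i] == pat[q])
				q++;
			if (q == m)
				return (long)(pos + 1 - m);
		}
	}
	return -1;
}
```

With a single fragment the fetch function is called once and the loop
degenerates into a plain linear scan, which matches the observation
above that the single-skb case is where the structure is simplest.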

> > 
> > I am trying to sink this in; prefetching would be valuable for regexp,
> > but why would the other scheme not be able to do it? 
> 
> I'm really not an expert on the validity of L1 caches and how to optimize
> it best but I believe that the less memory movement is in between the
> more likely prefetching helps? Both schemes involve a switch to another
> stack namespace but get_text() tends to be a lot smaller and less intrusive
> than a store & reload of a complex state machine. I really can't tell
> which is better regarding this subject without trying it out actually.
> 

Sorry - I thought you were talking about prefetching text, as in
lookahead for text in a regexp state machine.
I am not sure I see the L1 cache connection. Both seem to have tight for
loops and, depending on the algorithm, there would be no difference
in cache warmth afaics. In fact, your scheme may suffer more because it
has a lot of stuff on the stack. However, playing around with the code
is the only way to find out.

cheers,
jamal
