public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: James Bottomley <James.Bottomley@SteelEye.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] make radix tree gang lookup faster by using a bitmap search
Date: Sun, 28 Aug 2005 14:43:20 -0500	[thread overview]
Message-ID: <1125258200.5048.18.camel@mulgrave> (raw)
In-Reply-To: <20050827105355.360bd26a.akpm@osdl.org>

On Sat, 2005-08-27 at 10:53 -0700, Andrew Morton wrote:
> I'd say the main naivety in gang lookup is the awkward top-level iteration
> algorithm.  The way it bales out all the way to the top level of the tree
> once __lookup() hits the end of the slots[] array, even though results[]
> isn't full yet.  It's surely possible to go back up the tree just a
> sufficient distance to resume the iteration, rather than all the way to the
> top.  But it's hard, and it's all in CPU cache anyway there.
> 
> It would be much simpler if it was using recursion, of course.

I agree; I actually checked this point: most page gang lookups do have
to restart the search.  At least using a bitmap gets it back on point
much faster.  The page radix tree lookups are usually at most four
levels deep, anyway.

> >  This patch replaces
> > the integer count with an unsigned long representing the bitmap of
> > occupied elements.  We then use that bitmap to find the first occupied
> > entry instead of looping over all the entries from the beginning of the
> > radix node.
> 
> But only in __lookup().  I think most gang lookups use
> radix_tree_gang_lookup_tag() -> __lookup_tag().
> 
> And __lookup_tag() could use find_next_bit() on the tags anyway, as the
> comment says.  I spent a bit of time doing that, had some bug, shelved it,
> never got on to fixing it up.
> 
> There's a userspace test/devel setup at
> http://www.zip.com.au/~akpm/linux/patches/stuff/rtth.tar.gz, btw.

OK ... I'll take a look.  I didn't mean to do this, it's just that for
the idr replacement code I had to use bitmap lookup, so this seemed like
a natural precursor.

> > The penalty of doing this is that on 32 bit machines, the size of the
> > radix tree array is reduced from 64 to 32 (so an unsigned long can
> > represent the bitmap).
> 
> If we did the bitmap lookup in __lookup_tag() we wouldn't have this
> restriction.
> 
> Maybe we can
> 
> a) fix radix_tree_gang_lookup() to use find_next_bit()

Well, not quite; with the size changes, the tag map now never overflows
an unsigned long.

> b) remove radix_tree_node.count

yes, did that.

> c) Add a new tag field which simply means "present"

> d) remove radix_tree_gang_lookup() and __lookup() altogether

> e) Implement radix_tree_gang_lookup() via radix_tree_gang_lookup_tag()
> 
> That would involve setting and clearing bit in the "present" tag field when
> adding and removing items.

OK, I see how to do all of this using the currently implemented logic
(the occupied word is what you would like to be the present tag).  I'll
see what I can do.

> > I also exported radix_tree_preload() so modules can make use of radix
> > trees.
> 
> uh, OK.  Note that radix_tree_preload() uses prempt_disable() protection. 
> So it has the limitation that the guarantee which it provides will become
> unreliable, kernel-wide, if anyone anywhere tries to do a
> radix_tree_insert() from interrupt context.

radix_tree_insert() is reliable from IRQ provided you don't try to use
radix_tree_preload() and you defined your radix tree gfp flag to be
GFP_ATOMIC.  preloading is only optional, and should only be done really
if you have process context to preload with GFP_KERNEL.  Preloading with
GFP_ATOMIC is pretty pointless since radix_tree_insert() will also try
to allocate with the radix tree flags.

James



  reply	other threads:[~2005-08-28 19:43 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-08-27 16:26 [PATCH] make radix tree gang lookup faster by using a bitmap search James Bottomley
2005-08-27 17:53 ` Andrew Morton
2005-08-28 19:43   ` James Bottomley [this message]
2005-08-28 20:00     ` Linus Torvalds
2005-08-28 20:39       ` Andrew Morton
2005-08-29  0:45   ` James Bottomley
2005-08-29  0:52     ` Andrew Morton
2005-08-29  1:19       ` James Bottomley
2005-08-29  1:35         ` Andrew Morton
2005-08-29  3:26           ` James Bottomley
2005-08-29  3:37             ` Nick Piggin
2005-08-29  3:54               ` Trond Myklebust
2005-08-29 13:16                 ` Luben Tuikov
2005-08-29 15:01               ` James Bottomley
     [not found]               ` <20050829164144.GC9508@localhost.localdomain>
2005-08-30  0:56                 ` Nick Piggin
2005-08-30  1:54                   ` Andrew Morton
2005-08-30  2:46                   ` James Bottomley
2005-08-30  2:53                     ` Nick Piggin
     [not found]                       ` <20050830052405.GB20843@localhost.localdomain>
2005-08-30 13:06                         ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1125258200.5048.18.camel@mulgrave \
    --to=james.bottomley@steeleye.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox