From: James Bottomley <James.Bottomley@SteelEye.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] make radix tree gang lookup faster by using a bitmap search
Date: Sun, 28 Aug 2005 14:43:20 -0500 [thread overview]
Message-ID: <1125258200.5048.18.camel@mulgrave> (raw)
In-Reply-To: <20050827105355.360bd26a.akpm@osdl.org>
On Sat, 2005-08-27 at 10:53 -0700, Andrew Morton wrote:
> I'd say the main naivety in gang lookup is the awkward top-level iteration
> algorithm. The way it bales out all the way to the top level of the tree
> once __lookup() hits the end of the slots[] array, even though results[]
> isn't full yet. It's surely possible to go back up the tree just a
> sufficient distance to resume the iteration, rather than all the way to the
> top. But it's hard, and it's all in CPU cache anyway there.
>
> It would be much simpler if it was using recursion, of course.
I agree; I actually checked this point: most page gang lookups do have
to restart the search. At least using a bitmap gets it back on point
much faster. The page radix tree lookups are usually at most four
levels deep, anyway.
> > This patch replaces
> > the integer count with an unsigned long representing the bitmap of
> > occupied elements. We then use that bitmap to find the first occupied
> > entry instead of looping over all the entries from the beginning of the
> > radix node.
>
> But only in __lookup(). I think most gang lookups use
> radix_tree_gang_lookup_tag() -> __lookup_tag().
>
> And __lookup_tag() could use find_next_bit() on the tags anyway, as the
> comment says. I spent a bit of time doing that, had some bug, shelved it,
> never got on to fixing it up.
>
> There's a userspace test/devel setup at
> http://www.zip.com.au/~akpm/linux/patches/stuff/rtth.tar.gz, btw.
OK ... I'll take a look. I didn't mean to do this, it's just that for
the idr replacement code I had to use bitmap lookup, so this seemed like
a natural precursor.
> > The penalty of doing this is that on 32 bit machines, the size of the
> > radix tree array is reduced from 64 to 32 (so an unsigned long can
> > represent the bitmap).
>
> If we did the bitmap lookup in __lookup_tag() we wouldn't have this
> restriction.
>
> Maybe we can
>
> a) fix radix_tree_gang_lookup() to use find_next_bit()
Well, not quite; with the size changes, the tag map now never overflows
an unsigned long.
> b) remove radix_tree_node.count
yes, did that.
> c) Add a new tag field which simply means "present"
> d) remove radix_tree_gang_lookup() and __lookup() altogether
> e) Implement radix_tree_gang_lookup() via radix_tree_gang_lookup_tag()
>
> That would involve setting and clearing bit in the "present" tag field when
> adding and removing items.
OK, I see how to do all of this using the currently implemented logic
(the occupied word is what you would like to be the present tag). I'll
see what I can do.
> > I also exported radix_tree_preload() so modules can make use of radix
> > trees.
>
> uh, OK. Note that radix_tree_preload() uses prempt_disable() protection.
> So it has the limitation that the guarantee which it provides will become
> unreliable, kernel-wide, if anyone anywhere tries to do a
> radix_tree_insert() from interrupt context.
radix_tree_insert() is reliable from IRQ provided you don't try to use
radix_tree_preload() and you defined your radix tree gfp flag to be
GFP_ATOMIC. preloading is only optional, and should only be done really
if you have process context to preload with GFP_KERNEL. Preloading with
GFP_ATOMIC is pretty pointless since radix_tree_insert() will also try
to allocate with the radix tree flags.
James
next prev parent reply other threads:[~2005-08-28 19:43 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-08-27 16:26 [PATCH] make radix tree gang lookup faster by using a bitmap search James Bottomley
2005-08-27 17:53 ` Andrew Morton
2005-08-28 19:43 ` James Bottomley [this message]
2005-08-28 20:00 ` Linus Torvalds
2005-08-28 20:39 ` Andrew Morton
2005-08-29 0:45 ` James Bottomley
2005-08-29 0:52 ` Andrew Morton
2005-08-29 1:19 ` James Bottomley
2005-08-29 1:35 ` Andrew Morton
2005-08-29 3:26 ` James Bottomley
2005-08-29 3:37 ` Nick Piggin
2005-08-29 3:54 ` Trond Myklebust
2005-08-29 13:16 ` Luben Tuikov
2005-08-29 15:01 ` James Bottomley
[not found] ` <20050829164144.GC9508@localhost.localdomain>
2005-08-30 0:56 ` Nick Piggin
2005-08-30 1:54 ` Andrew Morton
2005-08-30 2:46 ` James Bottomley
2005-08-30 2:53 ` Nick Piggin
[not found] ` <20050830052405.GB20843@localhost.localdomain>
2005-08-30 13:06 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1125258200.5048.18.camel@mulgrave \
--to=james.bottomley@steeleye.com \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox