public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [Slightly off topic] A question about R/B trees.
@ 2008-10-17 21:34 Maxim Levitsky
  2008-10-17 22:15 ` Chris Snook
  0 siblings, 1 reply; 7+ messages in thread
From: Maxim Levitsky @ 2008-10-17 21:34 UTC (permalink / raw)
  To: linux-kernel

Hi,

I am working on my small project, and I need a fast container to hold a large sparse array.
Balanced trees seem to fit perfectly.

I decided to implement a red/black tree, and took a look at kernel rb tree for reference,
and I noticed that tree item has no parent pointer, while it seems that it should have it.

I know now that it has parent pointer, but it is mixed with current and parent node colour.
Thus it is assumed that last two bits of this pointer are zero.

I can see anywhere that this restriction is applied.
I see that structure is "aligned" but that I think only ensures that compiler places it
aligned in static data, does the alignment ensures that it will always place it on aligned address in a structure?
But then, the whole container structure can be misaligned, can't it?

Besides a comment there states that alignment is only for CRIS

How about a check for misalignment?

Best regards,
	Maxim Levitsky

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Slightly off topic] A question about R/B trees.
  2008-10-17 21:34 [Slightly off topic] A question about R/B trees Maxim Levitsky
@ 2008-10-17 22:15 ` Chris Snook
  2008-10-17 22:47   ` Maxim Levitsky
  2008-10-18  7:53   ` Andi Kleen
  0 siblings, 2 replies; 7+ messages in thread
From: Chris Snook @ 2008-10-17 22:15 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linux-kernel

Maxim Levitsky wrote:
> I am working on my small project, and I need a fast container to hold a 
> large sparse array.
> Balanced trees seem to fit perfectly.

Balanced trees take O(log n) to perform a great many operations, and traversing 
a binary tree is a particularly bad case for branch prediction.  Hash tables 
will perform much better, unless you get them horribly wrong.

> I decided to implement a red/black tree, and took a look at kernel rb 
> tree for reference,
> and I noticed that tree item has no parent pointer, while it seems that 
> it should have it.
> 
> I know now that it has parent pointer, but it is mixed with current and 
> parent node colour.
> Thus it is assumed that last two bits of this pointer are zero.

Not quite.  Read this:

http://lwn.net/Articles/184495/

> I can see anywhere that this restriction is applied.
> I see that structure is "aligned" but that I think only ensures that 
> compiler places it
> aligned in static data, does the alignment ensures that it will always 
> place it on aligned address in a structure?
> But then, the whole container structure can be misaligned, can't it?

GCC will only misalign the contents of a struct if you explicitly tell it to 
pack the struct.  That's one of those things you only do if you're 100% certain 
it's the right thing, and you're prepared to accept the consequences if you 
screw it up.

> Besides a comment there states that alignment is only for CRIS

I'm not sure this check is still necessary, but CRIS is a rather niche 
architecture.  On most architectures, word-aligning structures boosts 
performance at negligible memory cost, so compilers do it automatically.

> How about a check for misalignment?

The kernel is written in a dialect of C that makes several assumptions about the 
compiler, among them that the compiler won't screw this up unless you tell it 
to.  Any compiler that has alignment problems with the rbtree code is going to 
have similar problems in lots of other places too.  We don't support those 
compilers.

-- Chris

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Slightly off topic] A question about R/B trees.
  2008-10-17 22:15 ` Chris Snook
@ 2008-10-17 22:47   ` Maxim Levitsky
  2008-10-20 14:54     ` Chris Friesen
  2008-10-18  7:53   ` Andi Kleen
  1 sibling, 1 reply; 7+ messages in thread
From: Maxim Levitsky @ 2008-10-17 22:47 UTC (permalink / raw)
  To: Chris Snook; +Cc: linux-kernel

Chris Snook wrote:
> Maxim Levitsky wrote:
>> I am working on my small project, and I need a fast container to hold 
>> a large sparse array.
>> Balanced trees seem to fit perfectly.
> 
> Balanced trees take O(log n) to perform a great many operations, and 
> traversing a binary tree is a particularly bad case for branch 
> prediction.  Hash tables will perform much better, unless you get them 
> horribly wrong.
Let me explain.

I am writing a userspace packet writing application.

One of things I need is to have a cache of the disk.

I need an 'array' that will hold cache of written blocks in ascending order,
I need to be able to insert a block anywhere in the array, and be able to read it
from lowest block to highest.

Hash tables can't be read this way, right?

I could use a linked list, but insertion will be slower.


> 
>> I decided to implement a red/black tree, and took a look at kernel rb 
>> tree for reference,
>> and I noticed that tree item has no parent pointer, while it seems 
>> that it should have it.
>>
>> I know now that it has parent pointer, but it is mixed with current 
>> and parent node colour.
>> Thus it is assumed that last two bits of this pointer are zero.
> 
> Not quite.  Read this:
> 
> http://lwn.net/Articles/184495/

What do you mean?

I have read this article, I haven't yet spotted anything suspicious about parent pointer there yet.
> 
>> I can see anywhere that this restriction is applied.
>> I see that structure is "aligned" but that I think only ensures that 
>> compiler places it
>> aligned in static data, does the alignment ensures that it will always 
>> place it on aligned address in a structure?
>> But then, the whole container structure can be misaligned, can't it?
> 
> GCC will only misalign the contents of a struct if you explicitly tell 
> it to pack the struct.  That's one of those things you only do if you're 
> 100% certain it's the right thing, and you're prepared to accept the 
> consequences if you screw it up.

Why gcc?

Say you allocate a piece of memory using kmalloc, and write there, a structure that contains a r/b tree item.
I agree that gcc will ensure that offset from start of that structure to first byte of the tree item will be aligned.

But what if malloc returned a misaligned pointer?
This will ensure that virtual address of the tree item won't be aligned.
(I know it doesn't, but this isn't a assumption about gcc anymore)



> 
>> Besides a comment there states that alignment is only for CRIS
> 
> I'm not sure this check is still necessary, but CRIS is a rather niche 
> architecture.  On most architectures, word-aligning structures boosts 
> performance at negligible memory cost, so compilers do it automatically.
> 
>> How about a check for misalignment?
> 
> The kernel is written in a dialect of C that makes several assumptions 
> about the compiler, among them that the compiler won't screw this up 
> unless you tell it to.  Any compiler that has alignment problems with 
> the rbtree code is going to have similar problems in lots of other 
> places too.  We don't support those compilers.
> 

Best regards,
	Maxim Levitsky

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Slightly off topic] A question about R/B trees.
  2008-10-17 22:15 ` Chris Snook
  2008-10-17 22:47   ` Maxim Levitsky
@ 2008-10-18  7:53   ` Andi Kleen
  2008-10-18  8:45     ` Pekka Enberg
  1 sibling, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2008-10-18  7:53 UTC (permalink / raw)
  To: Chris Snook; +Cc: Maxim Levitsky, linux-kernel

Chris Snook <csnook@redhat.com> writes:

> Maxim Levitsky wrote:
>> I am working on my small project, and I need a fast container to
>> hold a large sparse array.
>> Balanced trees seem to fit perfectly.
>
> Balanced trees take O(log n) to perform a great many operations, and
> traversing a binary tree is a particularly bad case for branch
> prediction.  Hash tables will perform much better, unless you get them
> horribly wrong.

That seems like a unfair generalization.

The problem with hash tables is that if they're big enough
or if the rest of the workload is memory intensive
each hit will be a cache miss. And you can do a lot of branch
mispredicts in the time of a single cache miss.

In general trees can be much better for cache usage, although
it's generally better to use some tree that has nodes near 
the cache line size. Binary trees like RB are too small for that.

The other advantage of trees is that they scale naturally.
For hash tables you either have it being sized for the worst
case (usually wasting a lot of memory[1] and making the cache miss
problem worse) or you need to do dynamic rehashing which 
is complex and difficult to get right especially in multi threaded
situations.

That is why the trend in Linux at least is to move away from
hash tables.

[1] Just take a look at how much memory that various hash 
tables in Linux use: dmesg | grep hash

> The kernel is written in a dialect of C that makes several assumptions
> about the compiler, among them that the compiler won't screw this up
> unless you tell it to.  Any compiler that has alignment problems with
> the rbtree code is going to have similar problems in lots of other
> places too.  We don't support those compilers.

At least upto 4 bytes or so it's generally a safe assumption
that objects will be naturally aligned. Also except for tree 
root the objects are typically allocated with malloc() (or
equivalent kmalloc) anyways and malloc()s guarantee
worst case alignment.


-Andi
-- 
ak@linux.intel.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Slightly off topic] A question about R/B trees.
  2008-10-18  7:53   ` Andi Kleen
@ 2008-10-18  8:45     ` Pekka Enberg
  2008-10-20 15:57       ` Maxim Levitsky
  0 siblings, 1 reply; 7+ messages in thread
From: Pekka Enberg @ 2008-10-18  8:45 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Chris Snook, Maxim Levitsky, linux-kernel

Hi Andi,

On Sat, Oct 18, 2008 at 10:53 AM, Andi Kleen <andi@firstfloor.org> wrote:
> The problem with hash tables is that if they're big enough
> or if the rest of the workload is memory intensive
> each hit will be a cache miss. And you can do a lot of branch
> mispredicts in the time of a single cache miss.
>
> In general trees can be much better for cache usage, although
> it's generally better to use some tree that has nodes near
> the cache line size. Binary trees like RB are too small for that.

Right, but even for binary trees, you can get some of the benefits by
packing all the nodes in a slab cache of their own. That way many of
the neighboring nodes will share the same cache line if you're
allocating memory for the nodes in a top-down order. Of course, you
lose the benefits if the tree is updated a lot because you're back to
worst case allocation again.

                       Pekka

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Slightly off topic] A question about R/B trees.
  2008-10-17 22:47   ` Maxim Levitsky
@ 2008-10-20 14:54     ` Chris Friesen
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Friesen @ 2008-10-20 14:54 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Chris Snook, linux-kernel

Maxim Levitsky wrote:

> Say you allocate a piece of memory using kmalloc, and write there, a
> structure that contains a r/b tree item. I agree that gcc will ensure
> that offset from start of that structure to first byte of the tree
> item will be aligned.
> 
> But what if malloc returned a misaligned pointer? This will ensure
> that virtual address of the tree item won't be aligned. (I know it
> doesn't, but this isn't a assumption about gcc anymore)

malloc() can't return a misaligned pointer.  From the spec:  "The 
pointer returned if the allocation succeeds shall be suitably aligned so 
that it may be assigned to a pointer to any type of object..."

Chris

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Slightly off topic] A question about R/B trees.
  2008-10-18  8:45     ` Pekka Enberg
@ 2008-10-20 15:57       ` Maxim Levitsky
  0 siblings, 0 replies; 7+ messages in thread
From: Maxim Levitsky @ 2008-10-20 15:57 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Andi Kleen, Chris Snook, linux-kernel

Pekka Enberg wrote:
> Hi Andi,
> 
> On Sat, Oct 18, 2008 at 10:53 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> The problem with hash tables is that if they're big enough
>> or if the rest of the workload is memory intensive
>> each hit will be a cache miss. And you can do a lot of branch
>> mispredicts in the time of a single cache miss.
>>
>> In general trees can be much better for cache usage, although
>> it's generally better to use some tree that has nodes near
>> the cache line size. Binary trees like RB are too small for that.
> 
> Right, but even for binary trees, you can get some of the benefits by
> packing all the nodes in a slab cache of their own. That way many of
> the neighboring nodes will share the same cache line if you're
> allocating memory for the nodes in a top-down order. Of course, you
> lose the benefits if the tree is updated a lot because you're back to
> worst case allocation again.
> 
>                        Pekka

Thank you very much.

Best regards,
	Maxim Levitsky

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-10-20 15:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-10-17 21:34 [Slightly off topic] A question about R/B trees Maxim Levitsky
2008-10-17 22:15 ` Chris Snook
2008-10-17 22:47   ` Maxim Levitsky
2008-10-20 14:54     ` Chris Friesen
2008-10-18  7:53   ` Andi Kleen
2008-10-18  8:45     ` Pekka Enberg
2008-10-20 15:57       ` Maxim Levitsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox