* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:05 ` Linus Torvalds
@ 2001-10-10 5:17 ` BALBIR SINGH
2001-10-10 5:29 ` Davide Libenzi
2001-10-10 5:46 ` Linus Torvalds
2001-10-10 6:16 ` [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion Paul Mackerras
` (2 subsequent siblings)
3 siblings, 2 replies; 26+ messages in thread
From: BALBIR SINGH @ 2001-10-10 5:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1675 bytes --]
Linus Torvalds wrote:
>In article <20011010040502.A726@athlon.random>,
>Andrea Arcangeli <andrea@suse.de> wrote:
>
>>On Tue, Oct 09, 2001 at 08:45:15AM -0700, Paul McKenney wrote:
>>
>>>Please see the example above. I do believe that my algorithms are
>>>reliably forcing proper read ordering using IPIs, just in an different
>>>way. Please note that I have discussed this algorithm with Alpha
>>>architects, who believe that it is sound.
>>>
>>The IPI way is certainly safe.
>>
>
>Now, before people get all excited, what is this particular code
>actually _good_ for?
>
>Creating a lock-free list that allows insertion concurrently with lookup
>is _easy_.
>
>But what's the point? If you insert stuff, you eventually have to remove
>it. What goes up must come down. Insert-inane-quote-here.
>
>And THAT is the hard part. Doing lookup without locks ends up being
>pretty much worthless, because you need the locks for the removal
>anyway, at which point the whole thing looks pretty moot.
>
>Did I miss something?
>
> Linus
>
What about cases like the pci device list or any other such list. Sometimes
you do not care if somebody added something, while you were looking through
the list as long as you do not get illegal addresses or data.
Wouldn't this be very useful there? Most of these lists come up
at system startup and change rearly, but we look through them often.
Me too, Did I miss something?
Balbir
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
[-- Attachment #2: Wipro_Disclaimer.txt --]
[-- Type: text/plain, Size: 853 bytes --]
----------------------------------------------------------------------------------------------------------------------
Information transmitted by this E-MAIL is proprietary to Wipro and/or its Customers and
is intended for use only by the individual or entity to which it is
addressed, and may contain information that is privileged, confidential or
exempt from disclosure under applicable law. If you are not the intended
recipient or it appears that this mail has been forwarded to you without
proper authority, you are notified that any use or dissemination of this
information in any manner is strictly prohibited. In such cases, please
notify us immediately at mailto:mailadmin@wipro.com and delete this mail
from your records.
----------------------------------------------------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:17 ` BALBIR SINGH
@ 2001-10-10 5:29 ` Davide Libenzi
2001-10-10 5:46 ` Linus Torvalds
1 sibling, 0 replies; 26+ messages in thread
From: Davide Libenzi @ 2001-10-10 5:29 UTC (permalink / raw)
To: BALBIR SINGH; +Cc: Linus Torvalds, linux-kernel
On Wed, 10 Oct 2001, BALBIR SINGH wrote:
> What about cases like the pci device list or any other such list. Sometimes
> you do not care if somebody added something, while you were looking through
> the list as long as you do not get illegal addresses or data.
> Wouldn't this be very useful there? Most of these lists come up
> at system startup and change rearly, but we look through them often.
>
> Me too, Did I miss something?
What means that changes rarely that you're going to use rarely_lock() ?
If you're going to remove even only 1 element in 1000000 jiffies then
you've to lock.
If your list can only grow, that's different.
- Davide
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:17 ` BALBIR SINGH
2001-10-10 5:29 ` Davide Libenzi
@ 2001-10-10 5:46 ` Linus Torvalds
2001-10-10 6:01 ` BALBIR SINGH
2001-10-10 7:14 ` kdb requires kallsyms Kirill Ratkin
1 sibling, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2001-10-10 5:46 UTC (permalink / raw)
To: BALBIR SINGH; +Cc: linux-kernel
On Wed, 10 Oct 2001, BALBIR SINGH wrote:
> >
> >And THAT is the hard part. Doing lookup without locks ends up being
> >pretty much worthless, because you need the locks for the removal
> >anyway, at which point the whole thing looks pretty moot.
>
> What about cases like the pci device list or any other such list. Sometimes
> you do not care if somebody added something, while you were looking through
> the list as long as you do not get illegal addresses or data.
> Wouldn't this be very useful there? Most of these lists come up
> at system startup and change rearly, but we look through them often.
It's not about "change rarely". You cannot use a non-locking lookup if
they _ever_ remove anything.
I can't think of many lists like that. The PCI lists certainly are both
add/remove: cardbus bridges and hotplug-PCI means that they are not just
purely "enumerate at bootup".
Sure, maybe there are _some_ things that don't need to ever be removed,
but I can't think of any interesting data structure off-hand. Truly static
stuff tends to be allocated in an array that is sized once - array lookups
are much faster than traversing a linked list anyway.
So the linked list approach tends to make sense for things that _aren't_
just static, but I don't know of anything that only grows and grows. In
fact, that would sound like a horrible memory leak to me if we had
something like that. Even slow growth is bad if you want up-times measured
in years.
Now, in all fairness I can imagine hacky lock-less removals too. To get
them to work, you have to (a) change the "next" pointer to point to the
next->next (and have some serialization between removals, but removals
only, to make sure you don't have next->next also going away from you) and
(b) leave the old "next" pointer (and thus the data structure) around
until you can _prove_ that nobody is looking anything up any more, and
that the now-defunct data structure can truly be removed.
However, apart from locking, there aren't all that many ways to "prove"
that non-use. You could probably do it with interesting atomic sequence
numbers etc, although by the time you generate atomic sequence numbers
your lookup is already getting heavier - to the point where locking
probably isn't so bad an idea any more.
Linus
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:46 ` Linus Torvalds
@ 2001-10-10 6:01 ` BALBIR SINGH
2001-10-10 15:23 ` Victor Yodaiken
2001-10-10 7:14 ` kdb requires kallsyms Kirill Ratkin
1 sibling, 1 reply; 26+ messages in thread
From: BALBIR SINGH @ 2001-10-10 6:01 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2995 bytes --]
Linus Torvalds wrote:
>On Wed, 10 Oct 2001, BALBIR SINGH wrote:
>
>>>And THAT is the hard part. Doing lookup without locks ends up being
>>>pretty much worthless, because you need the locks for the removal
>>>anyway, at which point the whole thing looks pretty moot.
>>>
>>What about cases like the pci device list or any other such list. Sometimes
>>you do not care if somebody added something, while you were looking through
>>the list as long as you do not get illegal addresses or data.
>>Wouldn't this be very useful there? Most of these lists come up
>>at system startup and change rearly, but we look through them often.
>>
>
>It's not about "change rarely". You cannot use a non-locking lookup if
>they _ever_ remove anything.
>
>I can't think of many lists like that. The PCI lists certainly are both
>add/remove: cardbus bridges and hotplug-PCI means that they are not just
>purely "enumerate at bootup".
>
I agree, I just thought of one case quickly. Assume that somebody did a cat /proc/pci.
Meanwhile somebody is adding a new pci device (hotplug PCI) simultaneously (which is very rare).
I would not care if the new device showed up in the output of /proc/pci this time. It would
definitely show up next time. Meanwhile locking the list (just in case it changes) is an
overhead in the case above. I was referring to these cases in my earlier mail.
>
>Sure, maybe there are _some_ things that don't need to ever be removed,
>but I can't think of any interesting data structure off-hand. Truly static
>stuff tends to be allocated in an array that is sized once - array lookups
>are much faster than traversing a linked list anyway.
>
>So the linked list approach tends to make sense for things that _aren't_
>just static, but I don't know of anything that only grows and grows. In
>fact, that would sound like a horrible memory leak to me if we had
>something like that. Even slow growth is bad if you want up-times measured
>in years.
>
>Now, in all fairness I can imagine hacky lock-less removals too. To get
>them to work, you have to (a) change the "next" pointer to point to the
>next->next (and have some serialization between removals, but removals
>only, to make sure you don't have next->next also going away from you) and
>(b) leave the old "next" pointer (and thus the data structure) around
>until you can _prove_ that nobody is looking anything up any more, and
>that the now-defunct data structure can truly be removed.
>
The problem with removals is that somebody could be adding an element simulataneously
to the "next" element. Which might cause problems. I hope I correctly understood
the scenario here.
>
>However, apart from locking, there aren't all that many ways to "prove"
>that non-use. You could probably do it with interesting atomic sequence
>numbers etc, although by the time you generate atomic sequence numbers
>your lookup is already getting heavier - to the point where locking
>probably isn't so bad an idea any more.
>
> Linus
>
>
Balbir
[-- Attachment #2: Wipro_Disclaimer.txt --]
[-- Type: text/plain, Size: 853 bytes --]
----------------------------------------------------------------------------------------------------------------------
Information transmitted by this E-MAIL is proprietary to Wipro and/or its Customers and
is intended for use only by the individual or entity to which it is
addressed, and may contain information that is privileged, confidential or
exempt from disclosure under applicable law. If you are not the intended
recipient or it appears that this mail has been forwarded to you without
proper authority, you are notified that any use or dissemination of this
information in any manner is strictly prohibited. In such cases, please
notify us immediately at mailto:mailadmin@wipro.com and delete this mail
from your records.
----------------------------------------------------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 6:01 ` BALBIR SINGH
@ 2001-10-10 15:23 ` Victor Yodaiken
0 siblings, 0 replies; 26+ messages in thread
From: Victor Yodaiken @ 2001-10-10 15:23 UTC (permalink / raw)
To: BALBIR SINGH; +Cc: Linus Torvalds, linux-kernel
On Wed, Oct 10, 2001 at 11:31:08AM +0530, BALBIR SINGH wrote:
> Linus Torvalds wrote:
> >I can't think of many lists like that. The PCI lists certainly are both
> >add/remove: cardbus bridges and hotplug-PCI means that they are not just
> >purely "enumerate at bootup".
> >
>
> I agree, I just thought of one case quickly. Assume that somebody did a cat /proc/pci.
> Meanwhile somebody is adding a new pci device (hotplug PCI) simultaneously (which is very rare).
> I would not care if the new device showed up in the output of /proc/pci this time. It would
> definitely show up next time. Meanwhile locking the list (just in case it changes) is an
> overhead in the case above. I was referring to these cases in my earlier mail.
So you make the data structure and algorithms more complex and hard to maintain in
order to get an undetectable improvement in the speed of something that almost
never happens and and that is not even in the same neighborhood as being a
bottleneck?
^ permalink raw reply [flat|nested] 26+ messages in thread
* kdb requires kallsyms
2001-10-10 5:46 ` Linus Torvalds
2001-10-10 6:01 ` BALBIR SINGH
@ 2001-10-10 7:14 ` Kirill Ratkin
2001-10-10 7:38 ` BALBIR SINGH
1 sibling, 1 reply; 26+ messages in thread
From: Kirill Ratkin @ 2001-10-10 7:14 UTC (permalink / raw)
To: linux-kernel
Hi.
I've trouble with kdb. I've patched my kernel (stable
2.4.10) and tried to make kernel with kdb (from SGI)
and I see :
/bin/sh: /sbin/kallsyms: No such file or directory
make[1]: *** [kallsyms] error 127
Where can I find this kallsyms?
Regards,
__________________________________________________
Do You Yahoo!?
Make a great connection at Yahoo! Personals.
http://personals.yahoo.com
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: kdb requires kallsyms
2001-10-10 7:14 ` kdb requires kallsyms Kirill Ratkin
@ 2001-10-10 7:38 ` BALBIR SINGH
0 siblings, 0 replies; 26+ messages in thread
From: BALBIR SINGH @ 2001-10-10 7:38 UTC (permalink / raw)
To: Kirill Ratkin; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 848 bytes --]
Kirill Ratkin wrote:
>Hi.
>
>I've trouble with kdb. I've patched my kernel (stable
>2.4.10) and tried to make kernel with kdb (from SGI)
>and I see :
>/bin/sh: /sbin/kallsyms: No such file or directory
>make[1]: *** [kallsyms] error 127
>
>Where can I find this kallsyms?
>
modutils provides this executable. please use
rpm -q --whatprovides `type <name of file>' to see which package provides
the executable.
Regards,
Balbir
>
>Regards,
>
>
>
>
>
>__________________________________________________
>Do You Yahoo!?
>Make a great connection at Yahoo! Personals.
>http://personals.yahoo.com
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
[-- Attachment #2: Wipro_Disclaimer.txt --]
[-- Type: text/plain, Size: 853 bytes --]
----------------------------------------------------------------------------------------------------------------------
Information transmitted by this E-MAIL is proprietary to Wipro and/or its Customers and
is intended for use only by the individual or entity to which it is
addressed, and may contain information that is privileged, confidential or
exempt from disclosure under applicable law. If you are not the intended
recipient or it appears that this mail has been forwarded to you without
proper authority, you are notified that any use or dissemination of this
information in any manner is strictly prohibited. In such cases, please
notify us immediately at mailto:mailadmin@wipro.com and delete this mail
from your records.
----------------------------------------------------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:05 ` Linus Torvalds
2001-10-10 5:17 ` BALBIR SINGH
@ 2001-10-10 6:16 ` Paul Mackerras
2001-10-10 6:30 ` Linus Torvalds
2001-10-10 7:36 ` Paul Mackerras
2001-10-10 11:54 ` Keith Owens
3 siblings, 1 reply; 26+ messages in thread
From: Paul Mackerras @ 2001-10-10 6:16 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
Linus Torvalds writes:
> And THAT is the hard part. Doing lookup without locks ends up being
> pretty much worthless, because you need the locks for the removal
> anyway, at which point the whole thing looks pretty moot.
>
> Did I miss something?
I believe this all becomes (much more) useful when you are doing
read-copy-update.
There is an assumption that anyone modifying the list (inserting or
deleting) would take a lock first, so the deletion is just a pointer
assignment. Any reader traversing the list (without a lock) sees
either the old pointer or the new, which is fine.
The difficulty is in making sure that no reader is still inspecting
the list element you just removed before you free it, or modify any
field that the reader would be looking at (particularly the `next'
field :). One way of doing that is to defer the free or modification
to a quiescent point. If you have a separate `next_free' field, you
could safely put the element on a list of elements to be freed at the
next quiescent point.
Paul.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 6:16 ` [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion Paul Mackerras
@ 2001-10-10 6:30 ` Linus Torvalds
0 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2001-10-10 6:30 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linux-kernel
On Wed, 10 Oct 2001, Paul Mackerras wrote:
>
> There is an assumption that anyone modifying the list (inserting or
> deleting) would take a lock first, so the deletion is just a pointer
> assignment. Any reader traversing the list (without a lock) sees
> either the old pointer or the new, which is fine.
Right, that's not the problem.
> The difficulty is in making sure that no reader is still inspecting
> the list element you just removed before you free it, or modify any
> field that the reader would be looking at (particularly the `next'
> field :).
..which implies _some_ sort of locking, even if it may be deferred.
The locking can be per-CPU, whatever - have a per-CPU count for "this many
traversals in progress", and have every lookup increment it before
starting, and decrement it after stopping.
Then the deleter can actually free the thing once he has seen every CPU go
down to zero, with the proper memory barriers.
Yes, I see that it can work. But it sounds like a _lot_ of complexity for
not very much gain.
Right now, you already have to have eight CPU's to see locking as a large
problem in normal life. People have normal patches to bring that easily up
to 16. Then how much hard-to-debug-with-subtle-memory-ordering-issues code
do you want to handle the few machines that aren't in that range?
I'm just trying to inject a bit of realism here.
Linus
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:05 ` Linus Torvalds
2001-10-10 5:17 ` BALBIR SINGH
2001-10-10 6:16 ` [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion Paul Mackerras
@ 2001-10-10 7:36 ` Paul Mackerras
2001-10-10 15:54 ` Victor Yodaiken
2001-10-10 11:54 ` Keith Owens
3 siblings, 1 reply; 26+ messages in thread
From: Paul Mackerras @ 2001-10-10 7:36 UTC (permalink / raw)
To: torvalds, linux-kernel
Linus Torvalds writes:
> And THAT is the hard part. Doing lookup without locks ends up being
> pretty much worthless, because you need the locks for the removal
> anyway, at which point the whole thing looks pretty moot.
>
> Did I miss something?
I believe this all becomes (much more) useful when you are doing
read-copy-update.
There is an assumption that anyone modifying the list (inserting or
deleting) would take a lock first, so the deletion is just a pointer
assignment. Any reader traversing the list (without a lock) sees
either the old pointer or the new, which is fine.
The difficulty is in making sure that no reader is still inspecting
the list element you just removed before you free it, or modify any
field that the reader would be looking at (particularly the `next'
field :). One way of doing that is to defer the free or modification
to a quiescent point. If you have a separate `next_free' field, you
could safely put the element on a list of elements to be freed at the
next quiescent point.
Paul.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 7:36 ` Paul Mackerras
@ 2001-10-10 15:54 ` Victor Yodaiken
2001-10-10 21:56 ` Keith Owens
0 siblings, 1 reply; 26+ messages in thread
From: Victor Yodaiken @ 2001-10-10 15:54 UTC (permalink / raw)
To: Paul Mackerras; +Cc: torvalds, linux-kernel
On Wed, Oct 10, 2001 at 05:36:18PM +1000, Paul Mackerras wrote:
> Linus Torvalds writes:
>
> > And THAT is the hard part. Doing lookup without locks ends up being
> > pretty much worthless, because you need the locks for the removal
> > anyway, at which point the whole thing looks pretty moot.
> >
> > Did I miss something?
>
> I believe this all becomes (much more) useful when you are doing
> read-copy-update.
>
> There is an assumption that anyone modifying the list (inserting or
> deleting) would take a lock first, so the deletion is just a pointer
> assignment. Any reader traversing the list (without a lock) sees
> either the old pointer or the new, which is fine.
>
> The difficulty is in making sure that no reader is still inspecting
> the list element you just removed before you free it, or modify any
> field that the reader would be looking at (particularly the `next'
> field :). One way of doing that is to defer the free or modification
> to a quiescent point. If you have a separate `next_free' field, you
> could safely put the element on a list of elements to be freed at the
> next quiescent point.
And the "next quiescent point" must be a synchronization point and that must
have spinlocks around it!
Although I kind of like the idea of
normal operation create mess by avoiding synchronization
when system seems idle, get BKL, and clean up.
>
> Paul.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 15:54 ` Victor Yodaiken
@ 2001-10-10 21:56 ` Keith Owens
2001-10-10 22:24 ` Victor Yodaiken
0 siblings, 1 reply; 26+ messages in thread
From: Keith Owens @ 2001-10-10 21:56 UTC (permalink / raw)
To: Victor Yodaiken; +Cc: Paul Mackerras, torvalds, linux-kernel
On Wed, 10 Oct 2001 09:54:36 -0600,
Victor Yodaiken <yodaiken@fsmlabs.com> wrote:
>Although I kind of like the idea of
> normal operation create mess by avoiding synchronization
> when system seems idle, get BKL, and clean up.
That does not work. A process can read an entry from a list then
perform an operation that puts the process to sleep. When it wakes up
again, how can it tell if the list has been changed? How can the
cleanup code tell if any process has slept while it was traversing a
list? An idle system does not prevent processes from sleeping at the
wrong point.
Don't even think of requiring "processes must not sleep while
traversing a lock free list". Programmers cannot get "processes must
not sleep while holding a lock" correct and that error is much easier
to detect.
The inability for update code to know when a lock free list is being
traversed is the reason that most lock free work requires type stable
storage. You can mark a list entry as free and put it on a free chain
but you can never unmap the storage nor use the free entry for data of
a different type. That is the only way to guarantee that a process can
safely determine if the list has changed under it during traversal
while still allowing the process to be interrupted or to sleep.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 21:56 ` Keith Owens
@ 2001-10-10 22:24 ` Victor Yodaiken
2001-10-10 23:46 ` David S. Miller
0 siblings, 1 reply; 26+ messages in thread
From: Victor Yodaiken @ 2001-10-10 22:24 UTC (permalink / raw)
To: Keith Owens; +Cc: Victor Yodaiken, Paul Mackerras, torvalds, linux-kernel
On Thu, Oct 11, 2001 at 07:56:43AM +1000, Keith Owens wrote:
> On Wed, 10 Oct 2001 09:54:36 -0600,
> Victor Yodaiken <yodaiken@fsmlabs.com> wrote:
> >Although I kind of like the idea of
> > normal operation create mess by avoiding synchronization
> > when system seems idle, get BKL, and clean up.
>
> That does not work. A process can read an entry from a list then
> perform an operation that puts the process to sleep. When it wakes up
> again, how can it tell if the list has been changed? How can the
In general you're right, and always its better to
reduce contention than to come up with silly algorithms for
reducing the cost of contention, but if you want to live
dangerously:
reader:
atomic increment read_count
do stuff; skip queue elements marked
as zombie
atomic decrement read_count
writer:
spin lock queue
to delete element
mark element as zombie
unspin
cleanup:
spin lock queue
if(read_count == 0){
get big kernel lock
if(read_count is still 0)
// now nobody will be able to get to queue
clean up queue
unlock kernel
}
unlock spin
So you accomplished making the code harder to read and maintain, slowing
down worst case, and maybe reducing average case read spinlock delay.
For some very carefully selected cases this may be a win, but ...
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 22:24 ` Victor Yodaiken
@ 2001-10-10 23:46 ` David S. Miller
2001-10-11 0:24 ` Davide Libenzi
0 siblings, 1 reply; 26+ messages in thread
From: David S. Miller @ 2001-10-10 23:46 UTC (permalink / raw)
To: yodaiken; +Cc: kaos, paulus, torvalds, linux-kernel
From: Victor Yodaiken <yodaiken@fsmlabs.com>
Date: Wed, 10 Oct 2001 16:24:19 -0600
In general you're right, and always its better to
reduce contention than to come up with silly algorithms for
reducing the cost of contention,
I want to second this and remind people that the "cost" of spinlocks
is mostly not "spinning idly waiting for lock", rather the big cost
is shuffling the dirty cacheline ownership between the processors.
Any scheme involving shared data which is written (the read counts
in the various "lockless" schemes are examples) have the same "cost"
assosciated with them.
In short, I see no performance gain from the lockless algorithms
even in the places where they can be applied.
I spent some time oogling over lockless algorithms a few years ago,
but I stopped once I realized where the true costs were. In my view,
the lockless algorithms perhaps are a win in parallel processing
environments (in fact, the supercomputing field is where a lot of the
lockless algorithm research comes from) but not in the kinds of places
and with the kinds of data structure usage the Linux kernel has.
Franks a lot,
David S. Miller
davem@redhat.com
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 23:46 ` David S. Miller
@ 2001-10-11 0:24 ` Davide Libenzi
0 siblings, 0 replies; 26+ messages in thread
From: Davide Libenzi @ 2001-10-11 0:24 UTC (permalink / raw)
To: David S. Miller; +Cc: yodaiken, kaos, paulus, torvalds, linux-kernel
On Wed, 10 Oct 2001, David S. Miller wrote:
> From: Victor Yodaiken <yodaiken@fsmlabs.com>
> Date: Wed, 10 Oct 2001 16:24:19 -0600
>
> In general you're right, and always its better to
> reduce contention than to come up with silly algorithms for
> reducing the cost of contention,
>
> I want to second this and remind people that the "cost" of spinlocks
> is mostly not "spinning idly waiting for lock", rather the big cost
> is shuffling the dirty cacheline ownership between the processors.
>
> Any scheme involving shared data which is written (the read counts
> in the various "lockless" schemes are examples) have the same "cost"
> assosciated with them.
>
> In short, I see no performance gain from the lockless algorithms
> even in the places where they can be applied.
>
> I spent some time oogling over lockless algorithms a few years ago,
> but I stopped once I realized where the true costs were. In my view,
> the lockless algorithms perhaps are a win in parallel processing
> environments (in fact, the supercomputing field is where a lot of the
> lockless algorithm research comes from) but not in the kinds of places
> and with the kinds of data structure usage the Linux kernel has.
Agree.
To have a complete list handling you've to end up locking or
write stressing counter variables to simulate locks.
Just only having special handled lists that removes write ops from irq
handlers can make you save a big cost of ..._irqsave()/restore().
And these can be realized by having an extra list of insert/remove ops
that is filled with no locks by irq handlers and is truncated/flushed by a
__xchg() done by readers.
- Davide
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion
2001-10-10 5:05 ` Linus Torvalds
` (2 preceding siblings ...)
2001-10-10 7:36 ` Paul Mackerras
@ 2001-10-10 11:54 ` Keith Owens
2001-10-10 13:42 ` AIC7XXX war
3 siblings, 1 reply; 26+ messages in thread
From: Keith Owens @ 2001-10-10 11:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
On Wed, 10 Oct 2001 05:05:10 +0000 (UTC),
torvalds@transmeta.com (Linus Torvalds) wrote:
>Now, before people get all excited, what is this particular code
>actually _good_ for?
>
>Creating a lock-free list that allows insertion concurrently with lookup
>is _easy_.
>
>But what's the point? If you insert stuff, you eventually have to remove
>it. What goes up must come down. Insert-inane-quote-here.
>
>And THAT is the hard part. Doing lookup without locks ends up being
>pretty much worthless, because you need the locks for the removal
>anyway, at which point the whole thing looks pretty moot.
<pedantic>
It is possible to do completely lock free queue management, I have done
it (once). There are several major requirements :-
* Storage must never change its type (type stable storage). Once a
block has been assigned as struct foo, it must always contain struct
foo. This can be relaxed slightly as long as the data types have a
common header format.
The biggest problem this causes is that storage can never be released
and unmapped. You never know if somebody is going to follow an old
pointer to access storage that you freed. You must put the storage
on a free list for struct foo instead of unmapping it, to maintain
the type stable storage.
* Every piece of code that traverses the data tree for structure foo
must take a copy of each struct foo into local storage. After taking
a copy it must verify that its local copy is self consistent, via
validity indicators in each struct. The validity indicators are
typically generation numbers, whatever you use, the indicators must
be atomically updated and atomically read, with suitable wmb and rmb
barriers for machines with weak memory ordering.
* After following a pointer in your local copy of struct foo to access
another data area, you must verify that the master version of your
local copy has not been changed. If the master copy has changed then
the pointer you just followed is no longer reliable. In almost every
case you have to start the traversal from the beginning. It makes
for very convoluted reader and updater code.
</pedantic>
The only reason I used lock free read and update was to hook into an
existing system that mandated sub second responses. Some of the new
operations that were performed during list traversal and update were
subject to unbounded delays that were completely outside my control. I
could not afford to lock the data structures during traversal because
any delay in the new operations would completely stall the existing
system, also the existing system had to be able to retrieve and delete
data for the new operations at any time.
I don't recommend doing lock free unless you have absolutely no
alternative. It requires convoluted code, it is difficult to debug, it
is expensive on memory bandwidth (forget zero copy) and tends to be
expensive on storage as well.
If you are interested in lock free work, take a look at
http://www.cs.pitt.edu/~moir/papers.html, particularly Practical
Implementations of Non-Blocking Synchronization Primitives. But
beware, that paper has an error which caused intermittent bugs that
took me 5 months to track down. Section 3.3, proc Copy, line 6 should
be
6: y := addr->data[i];
^ permalink raw reply [flat|nested] 26+ messages in thread