* bkl cleanup in do_sysctl
@ 2004-08-10 16:58 Josh Aas
2004-08-10 17:28 ` Dave Hansen
0 siblings, 1 reply; 7+ messages in thread
From: Josh Aas @ 2004-08-10 16:58 UTC
To: Linux Kernel Mailing List, steiner
I'd like to hear people's thoughts on replacing the bkl in do_sysctl
with a localized spin lock that protects the sysctl structures. Instead
of grabbing the bkl, anyone that needs to mess with those values could
grab the localized lock (1 to protect all structures). Such a localized
lock would allow us to get rid of bkl usage in at least one other place
as well (do_coredump). In order to do this though, we would have to make
sure all code that grabs the bkl instead of the localized lock while
using sysctl values switches to the new lock. Might be a big job, but
perhaps it would be a good one to start after 2.6.8 is out the door.
Thoughts? Comments?
--
Josh Aas
Silicon Graphics, Inc. (SGI)
Linux System Software
651-683-3068
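
The proposal above (one localized lock guarding all sysctl values, taken by every reader and writer) can be sketched in userspace C. This is only an analogy, not kernel code: it uses a pthread mutex in place of a kernel spin lock, and the names demo_sysctl_lock, demo_sysctl_values, demo_sysctl_read and demo_sysctl_write are hypothetical and do not exist in the kernel.

```c
#include <pthread.h>

#define DEMO_NVALUES 4

/* Hypothetical stand-in for "1 lock to protect all structures". */
static pthread_mutex_t demo_sysctl_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_sysctl_values[DEMO_NVALUES];

/* Store val at idx under the single lock; returns 0 on success, -1 on a bad index. */
int demo_sysctl_write(int idx, int val)
{
    if (idx < 0 || idx >= DEMO_NVALUES)
        return -1;
    pthread_mutex_lock(&demo_sysctl_lock);
    demo_sysctl_values[idx] = val;
    pthread_mutex_unlock(&demo_sysctl_lock);
    return 0;
}

/* Copy the value at idx into *out under the same lock; returns 0 or -1. */
int demo_sysctl_read(int idx, int *out)
{
    if (idx < 0 || idx >= DEMO_NVALUES)
        return -1;
    pthread_mutex_lock(&demo_sysctl_lock);
    *out = demo_sysctl_values[idx];
    pthread_mutex_unlock(&demo_sysctl_lock);
    return 0;
}
```

The point of the sketch is that the scheme only works if every path that touches the values goes through the same lock, which is why the proposal requires converting all current bkl users of sysctl values.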
* Re: bkl cleanup in do_sysctl
2004-08-10 16:58 bkl cleanup in do_sysctl Josh Aas
@ 2004-08-10 17:28 ` Dave Hansen
2004-08-10 18:51 ` Lee Revell
0 siblings, 1 reply; 7+ messages in thread
From: Dave Hansen @ 2004-08-10 17:28 UTC
To: Josh Aas; +Cc: Linux Kernel Mailing List, steiner
On Tue, 2004-08-10 at 09:58, Josh Aas wrote:
> I'd like to hear people's thoughts on replacing the bkl in do_sysctl
> with a localized spin lock that protects the sysctl structures. Instead
> of grabbing the bkl, anyone that needs to mess with those values could
> grab the localized lock (1 to protect all structures). Such a localized
> lock would allow us to get rid of bkl usage in at least one other place
> as well (do_coredump). In order to do this though, we would have to make
> sure all code that grabs the bkl instead of the localized lock while
> using sysctl values switches to the new lock. Might be a big job, but
> perhaps it would be a good one to start after 2.6.8 is out the door.
Remember that the BKL isn't a plain-old spinlock. You're allowed to
sleep while holding it and it can be recursively held, which isn't true
for other spinlocks.
So, if you want to replace it with a spinlock, you'll need to do audits
looking for sysctl users that might_sleep() or get called recursively
somehow. The might_sleep() debugging checks should help immensely for
the first part, but all you'll get are deadlocks at runtime for any
recursive holders. But, those cases are increasingly rare, so you might
luck out and not have any.
Or, you could just make it a semaphore and forget about the no sleeping
requirement.
-- Dave
* Re: bkl cleanup in do_sysctl
2004-08-10 17:28 ` Dave Hansen
@ 2004-08-10 18:51 ` Lee Revell
2004-08-10 19:16 ` Roland Dreier
2004-08-10 19:46 ` Hans Reiser
0 siblings, 2 replies; 7+ messages in thread
From: Lee Revell @ 2004-08-10 18:51 UTC
To: Dave Hansen; +Cc: Josh Aas, Linux Kernel Mailing List, steiner
On Tue, 2004-08-10 at 13:28, Dave Hansen wrote:
> Remember that the BKL isn't a plain-old spinlock. You're allowed to
> sleep while holding it and it can be recursively held, which isn't true
> for other spinlocks.
>
> So, if you want to replace it with a spinlock, you'll need to do audits
> looking for sysctl users that might_sleep() or get called recursively
> somehow. The might_sleep() debugging checks should help immensely for
> the first part, but all you'll get are deadlocks at runtime for any
> recursive holders. But, those cases are increasingly rare, so you might
> luck out and not have any.
>
> Or, you could just make it a semaphore and forget about the no sleeping
> requirement.
>
Someone once suggested that newbies who show up on LKML wanting to learn
kernel hacking should be assigned to find one use of the BKL and replace
it with proper locking. Something similar worked very well at my
previous employer: before being given root, new hires would first be
assigned some task like writing a script to take the user account
database and generate a report of old accounts on a bunch of machines,
or rewriting the RADIUS accounting scripts, where the point was really to
get them familiar with the system.
This way, even if they come back with a totally botched fix, someone
will probably just post a correct one. We could get rid of the BKL very
soon; I count only 247 files with lock_kernel in them.
For example, reiserfs uses the BKL for all write locking (!), but it
probably would not be too hard to fix, because you can just look at
another filesystem that has proper locking.
Maybe this should be added to the FAQ:
Q: I want to hack the kernel, and I *think* I know what I am doing.
Where do I start?
Lee
* Re: bkl cleanup in do_sysctl
2004-08-10 18:51 ` Lee Revell
@ 2004-08-10 19:16 ` Roland Dreier
2004-08-10 19:21 ` Lee Revell
2004-08-10 19:46 ` Hans Reiser
1 sibling, 1 reply; 7+ messages in thread
From: Roland Dreier @ 2004-08-10 19:16 UTC
To: Lee Revell; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner
Lee> Someone once suggested that newbies who show up on LKML
Lee> wanting to learn kernel hacking should be assigned to find
Lee> one use of the BKL and replace it with proper locking.
Unfortunately most of the remaining BKL uses seem to be very subtle.
Removing lock_kernel() correctly requires a deep understanding of the
global locking semantics and may be very invasive. In the end it's
also hard to be sure bugs haven't been introduced.
Lee> For example reiserfs uses the BKL for all write locking (!),
Lee> but it probably would not be too hard to fix, because you can
Lee> just look at another filesystem that has proper locking.
Fixing up a filesystem's write locking doesn't sound like a very good
newbie project to me.
- R.
* Re: bkl cleanup in do_sysctl
2004-08-10 19:16 ` Roland Dreier
@ 2004-08-10 19:21 ` Lee Revell
0 siblings, 0 replies; 7+ messages in thread
From: Lee Revell @ 2004-08-10 19:21 UTC
To: Roland Dreier; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner
On Tue, 2004-08-10 at 15:16, Roland Dreier wrote:
> Lee> For example reiserfs uses the BKL for all write locking (!),
> Lee> but it probably would not be too hard to fix, because you can
> Lee> just look at another filesystem that has proper locking.
>
> Fixing up a filesystem's write locking doesn't sound like a very good
> newbie project to me.
>
Exactly. The point isn't to get them to fix it, but to give them a way
to learn the internals of Linux, with the side benefit that someone might
propose a correct fix. Also, many people might be new to Linux kernel
hacking but knowledgeable about operating systems. Just a thought.
Lee
* Re: bkl cleanup in do_sysctl
2004-08-10 18:51 ` Lee Revell
2004-08-10 19:16 ` Roland Dreier
@ 2004-08-10 19:46 ` Hans Reiser
2004-08-10 20:10 ` Lee Revell
1 sibling, 1 reply; 7+ messages in thread
From: Hans Reiser @ 2004-08-10 19:46 UTC
To: Lee Revell; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner
Lee Revell wrote:
>On Tue, 2004-08-10 at 13:28, Dave Hansen wrote:
>
>>Remember that the BKL isn't a plain-old spinlock. You're allowed to
>>sleep while holding it and it can be recursively held, which isn't true
>>for other spinlocks.
>>
>>So, if you want to replace it with a spinlock, you'll need to do audits
>>looking for sysctl users that might_sleep() or get called recursively
>>somehow. The might_sleep() debugging checks should help immensely for
>>the first part, but all you'll get are deadlocks at runtime for any
>>recursive holders. But, those cases are increasingly rare, so you might
>>luck out and not have any.
>>
>>Or, you could just make it a semaphore and forget about the no sleeping
>>requirement.
>>
>
>Someone once suggested that newbies who show up on LKML wanting to learn
>kernel hacking should be assigned to find one use of the BKL and replace
>it with proper locking. Something similar worked very well with my
>previous employer, before giving someone root, new hires would first be
>assigned some task like writing a script to take the user account
>database and generate a report of old accounts on a bunch of machines,
>or rewrite the RADIUS accounting scripts, where the point was really to
>get them familiar with the system.
>
>This way, even if they come back with a totally botched fix, someone
>will probably just post a correct one. We could get rid of the BKL very
>soon, I count only 247 files with lock_kernel in them.
>
>For example reiserfs uses the BKL for all write locking (!), but it
>probably would not be too hard to fix, because you can just look at
>another filesystem that has proper locking.
>
>
Wrong. ;-) Balancing makes it way hard. Use reiser4. That has very
sophisticated locking that pushes the research envelope, if you want to
read code to learn about locking.....
>Maybe this should be added to the FAQ:
>
>Q: I want to hack the kernel, and I *think* I know what I am doing.
>Where do I start?
>
>Lee
>
* Re: bkl cleanup in do_sysctl
2004-08-10 19:46 ` Hans Reiser
@ 2004-08-10 20:10 ` Lee Revell
0 siblings, 0 replies; 7+ messages in thread
From: Lee Revell @ 2004-08-10 20:10 UTC
To: Hans Reiser; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner
On Tue, 2004-08-10 at 15:46, Hans Reiser wrote:
> Lee Revell wrote:
> >For example reiserfs uses the BKL for all write locking (!), but it
> >probably would not be too hard to fix, because you can just look at
> >another filesystem that has proper locking.
> >
> >
> Wrong. ;-) Balancing makes it way hard. Use reiser4. That has very
> sophisticated locking that pushes the research envelope, if you want to
> read code to learn about locking.....
>
OK, I will give it a try. Are any of the reiser3 latency issues
reported in the voluntary preemption thread addressed in reiser4?
Lee