* bkl cleanup in do_sysctl
From: Josh Aas @ 2004-08-10 16:58 UTC
To: Linux Kernel Mailing List, steiner

I'd like to hear people's thoughts on replacing the bkl in do_sysctl
with a localized spin lock that protects the sysctl structures. Instead
of grabbing the bkl, anyone that needs to mess with those values could
grab the localized lock (1 to protect all structures). Such a localized
lock would allow us to get rid of bkl usage in at least one other place
as well (do_coredump). In order to do this though, we would have to make
sure all code that grabs the bkl instead of the localized lock while
using sysctl values switches to the new lock. Might be a big job, but
perhaps it would be a good one to start after 2.6.8 is out the door.

Thoughts? Comments?

--
Josh Aas
Silicon Graphics, Inc. (SGI)
Linux System Software
651-683-3068

* Re: bkl cleanup in do_sysctl
From: Dave Hansen @ 2004-08-10 17:28 UTC
To: Josh Aas; +Cc: Linux Kernel Mailing List, steiner

On Tue, 2004-08-10 at 09:58, Josh Aas wrote:
> I'd like to hear people's thoughts on replacing the bkl in do_sysctl
> with a localized spin lock that protects the sysctl structures. Instead
> of grabbing the bkl, anyone that needs to mess with those values could
> grab the localized lock (1 to protect all structures). Such a localized
> lock would allow us to get rid of bkl usage in at least one other place
> as well (do_coredump). In order to do this though, we would have to make
> sure all code that grabs the bkl instead of the localized lock while
> using sysctl values switches to the new lock. Might be a big job, but
> perhaps it would be a good one to start after 2.6.8 is out the door.

Remember that the BKL isn't a plain-old spinlock. You're allowed to
sleep while holding it and it can be recursively held, which isn't true
for other spinlocks.

So, if you want to replace it with a spinlock, you'll need to do audits
looking for sysctl users that might_sleep() or get called recursively
somehow. The might_sleep() debugging checks should help immensely for
the first part, but all you'll get are deadlocks at runtime for any
recursive holders. But, those cases are increasingly rare, so you might
luck out and not have any.

Or, you could just make it a semaphore and forget about the no sleeping
requirement.

-- Dave
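
[Editor's illustration] The recursion point above can be tried out in userspace. This is only a rough analogy under stated assumptions: it uses POSIX mutexes, not kernel locks, and `relock_result` is a hypothetical helper name, not anything from the kernel tree. A recursive mutex stands in for the BKL's "can be held again by the same holder" behaviour; an error-checking mutex stands in for a non-recursive lock, reporting EDEADLK where a real spinlock would simply hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper: take a mutex of the given type twice from the
 * same thread and return the result of the second lock attempt.
 * Userspace analogy only; none of this is kernel API.
 */
static int relock_result(int mutex_type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
    int rc, second;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, mutex_type);
    assert(rc == 0);
    rc = pthread_mutex_init(&lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    /* First acquisition always succeeds on a fresh mutex. */
    rc = pthread_mutex_lock(&lock);
    assert(rc == 0);

    /* Second acquisition by the same thread: the "recursive holder" case. */
    second = pthread_mutex_lock(&lock);

    /* Drop the extra hold only if the second lock was actually granted. */
    if (second == 0)
        pthread_mutex_unlock(&lock);
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);

    return second;
}
```

Calling `relock_result(PTHREAD_MUTEX_RECURSIVE)` returns 0, while `relock_result(PTHREAD_MUTEX_ERRORCHECK)` returns EDEADLK. An audit for recursive holders is essentially a search for call paths that would hit the second case once the lock type changes.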

* Re: bkl cleanup in do_sysctl
From: Lee Revell @ 2004-08-10 18:51 UTC
To: Dave Hansen; +Cc: Josh Aas, Linux Kernel Mailing List, steiner

On Tue, 2004-08-10 at 13:28, Dave Hansen wrote:
> Remember that the BKL isn't a plain-old spinlock. You're allowed to
> sleep while holding it and it can be recursively held, which isn't true
> for other spinlocks.
>
> So, if you want to replace it with a spinlock, you'll need to do audits
> looking for sysctl users that might_sleep() or get called recursively
> somehow. The might_sleep() debugging checks should help immensely for
> the first part, but all you'll get are deadlocks at runtime for any
> recursive holders. But, those cases are increasingly rare, so you might
> luck out and not have any.
>
> Or, you could just make it a semaphore and forget about the no sleeping
> requirement.

Someone once suggested that newbies who show up on LKML wanting to learn
kernel hacking should be assigned to find one use of the BKL and replace
it with proper locking. Something similar worked very well with my
previous employer, before giving someone root, new hires would first be
assigned some task like writing a script to take the user account
database and generate a report of old accounts on a bunch of machines,
or rewrite the RADIUS accounting scripts, where the point was really to
get them familiar with the system.

This way, even if they come back with a totally botched fix, someone
will probably just post a correct one. We could get rid of the BKL very
soon, I count only 247 files with lock_kernel in them.

For example reiserfs uses the BKL for all write locking (!), but it
probably would not be too hard to fix, because you can just look at
another filesystem that has proper locking.

Maybe this should be added to the FAQ:

Q: I want to hack the kernel, and I *think* I know what I am doing.
Where do I start?

Lee

* Re: bkl cleanup in do_sysctl
From: Roland Dreier @ 2004-08-10 19:16 UTC
To: Lee Revell; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner

    Lee> Someone once suggested that newbies who show up on LKML
    Lee> wanting to learn kernel hacking should be assigned to find
    Lee> one use of the BKL and replace it with proper locking.

Unfortunately most of the remaining BKL uses seem to be very subtle.
Removing lock_kernel() correctly requires a deep understanding of the
global locking semantics and may be very invasive. In the end it's also
hard to be sure bugs haven't been introduced.

    Lee> For example reiserfs uses the BKL for all write locking (!),
    Lee> but it probably would not be too hard to fix, because you can
    Lee> just look at another filesystem that has proper locking.

Fixing up a filesystem's write locking doesn't sound like a very good
newbie project to me.

 - R.

* Re: bkl cleanup in do_sysctl
From: Lee Revell @ 2004-08-10 19:21 UTC
To: Roland Dreier; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner

On Tue, 2004-08-10 at 15:16, Roland Dreier wrote:
> Lee> For example reiserfs uses the BKL for all write locking (!),
> Lee> but it probably would not be too hard to fix, because you can
> Lee> just look at another filesystem that has proper locking.
>
> Fixing up a filesystem's write locking doesn't sound like a very good
> newbie project to me.

Exactly, the point isn't to get them to fix it, but as a way to learn
the internals of Linux, with the side benefit that someone might propose
a correct fix. Also, many people might be new to Linux kernel hacking
but are knowledgeable re: operating systems.

Just a thought.

Lee

* Re: bkl cleanup in do_sysctl
From: Hans Reiser @ 2004-08-10 19:46 UTC
To: Lee Revell; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner

Lee Revell wrote:

>On Tue, 2004-08-10 at 13:28, Dave Hansen wrote:
>
>>Remember that the BKL isn't a plain-old spinlock. You're allowed to
>>sleep while holding it and it can be recursively held, which isn't true
>>for other spinlocks.
>>
>>So, if you want to replace it with a spinlock, you'll need to do audits
>>looking for sysctl users that might_sleep() or get called recursively
>>somehow. The might_sleep() debugging checks should help immensely for
>>the first part, but all you'll get are deadlocks at runtime for any
>>recursive holders. But, those cases are increasingly rare, so you might
>>luck out and not have any.
>>
>>Or, you could just make it a semaphore and forget about the no sleeping
>>requirement.
>
>Someone once suggested that newbies who show up on LKML wanting to learn
>kernel hacking should be assigned to find one use of the BKL and replace
>it with proper locking. Something similar worked very well with my
>previous employer, before giving someone root, new hires would first be
>assigned some task like writing a script to take the user account
>database and generate a report of old accounts on a bunch of machines,
>or rewrite the RADIUS accounting scripts, where the point was really to
>get them familiar with the system.
>
>This way, even if they come back with a totally botched fix, someone
>will probably just post a correct one. We could get rid of the BKL very
>soon, I count only 247 files with lock_kernel in them.
>
>For example reiserfs uses the BKL for all write locking (!), but it
>probably would not be too hard to fix, because you can just look at
>another filesystem that has proper locking.

Wrong. ;-) Balancing makes it way hard. Use reiser4. That has very
sophisticated locking that pushes the research envelope, if you want to
read code to learn about locking.....

>Maybe this should be added to the FAQ:
>
>Q: I want to hack the kernel, and I *think* I know what I am doing.
>Where do I start?
>
>Lee

* Re: bkl cleanup in do_sysctl
From: Lee Revell @ 2004-08-10 20:10 UTC
To: Hans Reiser; +Cc: Dave Hansen, Josh Aas, Linux Kernel Mailing List, steiner

On Tue, 2004-08-10 at 15:46, Hans Reiser wrote:
> Lee Revell wrote:
> >For example reiserfs uses the BKL for all write locking (!), but it
> >probably would not be too hard to fix, because you can just look at
> >another filesystem that has proper locking.
>
> Wrong. ;-) Balancing makes it way hard. Use reiser4. That has very
> sophisticated locking that pushes the research envelope, if you want to
> read code to learn about locking.....

OK, I will give it a try. Are any of the reiser3 latency issues reported
in the voluntary preemption thread addressed in reiser4?

Lee