From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org, arjan@infradead.org
Subject: Re: lock validator [2.6.18 -mm merge plans]
Date: Tue, 6 Jun 2006 22:47:05 +0200 [thread overview]
Message-ID: <20060606204705.GA17787@elte.hu> (raw)
In-Reply-To: <20060606090258.004e6f5f.akpm@osdl.org>
* Andrew Morton <akpm@osdl.org> wrote:
> - I think we still have a problem with the raid/bdev changes in
> block_dev.c.
> - the changes to block_dev.c _do_ impact non-lockdep kernels
yes, there are a few more functions that do things explicitly. If you
worry about the stack footprint, there should be little if any impact:
the stack footprint you looked at yesterday was on a kernel that
included patches that are not in -mm (the -fno-sibling-calls patch) and
the lockdep tracer (-pg) which both increase stack footprint.
> - we need to take a second look to see which other
> dont-affect-non-lockdep-kernels patches are in fact affecting
> non-lockdep kernels
there should be no change to the logic of non-lockdep kernels. There
might be some minimal impact to the code built. Wanna have an explicit
list of what the affected functions are? [but unless you think it causes
problems i dont think we should worry about it - we do changes all the
time that affect the generated code but dont affect the logic. The
inlining overhead thing is a red herring i believe.]
> - the changes to block_dev.c were pretty awful anyway
yeah - but it just matches the code there. I'll think about finding
better ways.
> - did the various review comments I sent get disposed of in some
> fashion?
they were not forgotten at all (and there are others too who have sent
feedback), they are next on my list.
> My overarching concern is the rate at which false-positive workaround
> patches are piling up. [...]
i feel a bit frustrated that my arguments regarding these "false
positives" remain apparently unanswered. I expressed it lots of times
that i find most of the semantics restrictions a necessary step towards
having a more robust kernel. The restrictions are:
- unordered unlocks. I still think we want to document all of them.
They pointed to real bugs multiple times. They pointed to suboptimal
code. There has been _one_ case so far that i'd declare a true false
positive. Nevertheless i gave up resistance and implemented the
CONFIG_DEBUG_NON_NESTED_UNLOCKS (default-off) option, which makes
these messages totally voluntary.
- stealth locking via disable_irq(). We know that this is both wrong
and dangerous. It's wrong because it affects all other handlers on
the same IRQ line, for a possibly long period of time. This also
uncovered the real deadlock and design flaw related to irqpoll.
- nested locking of the same lock-type. These are not broken,
nevertheless it's useful to extend validation to these categories too
- it found real bugs (about 5) in the networking code for example -
some of them were in core networking code. The majority of these are
related to i_mutex: here too i think we want to document all the
locking rules.
(and there are some other restrictions too, which i started enforcing
via earlier cleanups of the locking code. As i mentioned in the big
mutex flamewar^H^H^H^Hdiscussion, restriction of semantics to a natural
model is what i believe leads to a kernel that can be trusted more.)
> (I'm actually quite surprised at how few real bugs this checker has
> revealed. We must rock, or something).
The number of deadlocks found was actually much higher (and happened
much sooner) than i expected. I expected there to be less than 10 bugs
left, which would trickle in the timeframe of weeks. (because we thought
we covered alot of code in our testing) What happened is that we are
well above 10 deadlocks found so far, in just a few days.
The focus of the validator is on _new code_. Lock dependencies can now
reach near-perfect quality almost immediately, while they needed to sit
in the kernel for many months before. (and even then the validator found
deadlock bugs that were years old) The fact that we now check and
document the existing and pretty well-tested kernel too is just an added
bonus.
Also, the validator was in the works for months and we fixed a bunch of
bugs before we posted the patches. In fact in the past month it was
based on -mm and was tested on an allyesconfig bzImage bootup which
further decreased the rate of detection. I guess i should quote Davem's
blog:
http://vger.kernel.org/~davem/cgi-bin/blog.cgi
"[...] I've known about this for some time, because as he was writing
it Ingo passed along some networking locking bugs that this sucker
has found. "
The networking code is the subsystem with probably the best locking
design in place. It's nearly half a million lines of code (not counting
drivers) and needed only about 5 annotations so far.
Another thing that also reduced the number of deadlock bugs is the
effect of the -rt kernel's deadlock checker and agressive preemption
model: we found dozens of deadlocks there too. The -rt kernel's deadlock
checker covers all lock types too. (while the upstream kernel only
covered about 50% of all locking APIs via in-situ deadlock checking.)
So there has been alot of focus on the locking APIs in the past year or
so, so dont be surprised that the quality of locking in existing kernel
code isnt all that bad.
Regarding annotations, my current estimation is that we'll have at most
~0.2% rate of explicit annotations (== 'false positives'), which with
the ~50,000 locking APIs will be at most 100 places.
The deadlock rate for newly released locking code is at least 0.5%-2%,
and we introduce about 2000 new locking API uses per kernel release
(about 5% of all the existing locking code in the kernel), which means
the validator can find 10-40 new deadlocks per kernel release, at the
price of 2 new annotations. (where 90% of the annotations are trivial,
and the rest is only difficult because no-one knows/remembers the
locking rules anymore ...) Furthermore, chances are that problematic
locking constructs will be introduced with a lower probability due to
the validator. Also, code around existing annotations could be cleaned
up too. (as it happened in a number of cases already) So maybe the
annotation rate will go down as well.
Furtermore, untold amount of developer time is wasted on finding
deadlocks. The fresher the code is, the more likely it is that it has a
deadlock bug. The bug rate of lock uses could be as high as 10% in
totally new code. With the validator, many of these bugs will be found
_much_ earlier, improving productivity - and putting less crap into your
tree! That will also mean that we wont see most of those bugs.
Plus Linux support engineers at sw/hw vendors are spending significant
amount of time fixing deadlocks. This has the added twist that changing
locking code is always risky in a product, and it's not bad to have some
good proof in place that the code they trigger during re-QA is actually
correct in terms of locking dependencies.
Users also get totally frustrated at deadlocks. If something doesnt work
as expected that's an usually an annoyance, but if the box locks up
totally which necessiates the destruction of its current volatile data
(its memory, via a hard reset) that's a complete and immediate
showstopper. If you look at the kind of bugs that annoy users most
you'll find 'lockups' really high on the list.
The validator found bugs in my _own code_ that i thought to be
production quality, and i thought i can write correct locking code. We
should really let the computer do this job.
Ingo
next prev parent reply other threads:[~2006-06-06 20:47 UTC|newest]
Thread overview: 166+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-04 20:50 2.6.18 -mm merge plans Andrew Morton
2006-06-04 21:20 ` 2.6.18 hdrinstall (Re: 2.6.18 -mm merge plans) Bernhard Rosenkraenzer
2006-06-04 21:33 ` header cleanup and install David Woodhouse
2006-06-04 21:43 ` Andrew Morton
2006-06-05 10:52 ` Jens Axboe
2006-06-05 10:54 ` David Woodhouse
2006-06-05 10:59 ` Jens Axboe
2006-06-05 10:57 ` David Woodhouse
2006-06-05 11:03 ` Jens Axboe
2006-06-05 18:09 ` Andrew Morton
2006-06-05 19:19 ` David Woodhouse
2006-06-17 20:35 ` Alistair John Strachan
2006-06-17 21:20 ` David Woodhouse
2006-06-04 21:36 ` 2.6.18 -mm merge plans Alan Cox
2006-06-04 21:41 ` kbuild, kconfig and hrdinstall stuff Sam Ravnborg
2006-06-04 21:54 ` David Woodhouse
2006-06-04 23:04 ` klibc (was: 2.6.18 -mm merge plans) H. Peter Anvin
2006-06-05 18:09 ` Roman Zippel
2006-06-06 15:20 ` Pavel Machek
2006-06-06 20:56 ` Rafael J. Wysocki
2006-06-07 3:37 ` H. Peter Anvin
2006-06-07 4:00 ` Nigel Cunningham
2006-06-07 4:10 ` H. Peter Anvin
2006-06-07 4:25 ` Nigel Cunningham
2006-06-07 4:26 ` klibc H. Peter Anvin
2006-06-07 6:22 ` klibc Nigel Cunningham
2006-06-07 6:38 ` klibc H. Peter Anvin
2006-06-07 6:51 ` klibc (was: 2.6.18 -mm merge plans) Joshua Hudson
2006-06-07 21:12 ` H. Peter Anvin
2006-06-09 8:03 ` klibc Nix
2006-06-09 18:45 ` klibc H. Peter Anvin
[not found] ` <bda6d13a0606091050n40fda044v668eef09af3c29a7@mail.gmail.com>
[not found] ` <871wty6rl9.fsf@hades.wkstn.nix>
2006-06-09 22:28 ` klibc Joshua Hudson
2006-06-09 22:48 ` klibc H. Peter Anvin
2006-06-09 23:13 ` klibc Joshua Hudson
2006-06-09 23:44 ` klibc H. Peter Anvin
2006-06-16 6:02 ` klibc Joshua Hudson
2006-06-16 19:19 ` klibc H. Peter Anvin
2006-06-07 8:44 ` klibc (was: 2.6.18 -mm merge plans) Pavel Machek
2006-06-07 9:44 ` Rafael J. Wysocki
2006-06-04 23:50 ` clocksource Roman Zippel
2006-06-05 20:20 ` clocksource john stultz
2006-06-05 20:53 ` clocksource john stultz
2006-06-05 21:07 ` clocksource Roman Zippel
2006-06-06 19:42 ` clocksource john stultz
2006-06-07 0:41 ` clocksource Roman Zippel
2006-06-08 8:05 ` clocksource john stultz
2006-06-15 11:40 ` clocksource Roman Zippel
2006-06-16 3:21 ` clocksource john stultz
2006-06-16 3:35 ` clocksource john stultz
2006-06-16 15:33 ` clocksource Roman Zippel
2006-06-16 18:48 ` clocksource john stultz
2006-06-17 19:45 ` clocksource Roman Zippel
2006-06-17 17:04 ` clocksource Andrew Morton
2006-06-05 0:02 ` utsname/hostname Randy.Dunlap
2006-06-05 1:06 ` utsname/hostname Andrew Morton
2006-06-05 3:10 ` utsname/hostname Randy.Dunlap
[not found] ` <20060605002807.GA4919@mail.ustc.edu.cn>
2006-06-05 0:28 ` readahead benchmark Fengguang Wu
2006-06-05 1:02 ` Andrew Morton
2006-06-05 0:32 ` new SCSI drivers (was Re: 2.6.18 -mm merge plans) Jeff Garzik
2006-06-05 1:06 ` wireless " Jeff Garzik
2006-06-05 1:15 ` Andrew Morton
2006-06-05 8:33 ` Andreas Mohr
2006-06-05 8:45 ` Arjan van de Ven
2006-06-05 10:26 ` Alan Cox
2006-06-05 10:35 ` Arjan van de Ven
2006-06-05 10:59 ` Alan Cox
2006-06-10 6:58 ` Pavel Machek
2006-06-05 8:54 ` Christoph Hellwig
2006-06-05 12:33 ` Jeff Garzik
2006-06-05 12:48 ` Arjan van de Ven
2006-06-05 12:52 ` Jeff Garzik
2006-06-05 14:02 ` Linux kernel and laws Adrian Bunk
2006-06-05 14:21 ` linux-os (Dick Johnson)
2006-06-06 5:33 ` Evgeniy Polyakov
2006-06-05 13:27 ` wireless (was Re: 2.6.18 -mm merge plans) John W. Linville
2006-06-05 13:31 ` Christoph Hellwig
2006-06-05 13:42 ` Arjan van de Ven
2006-06-05 16:24 ` Alan Cox
2006-06-29 14:26 ` ACX100 (softmac-based) driver ready to merge, but is it legal? -- " John W. Linville
[not found] ` <20060629144233.GB24463@tuxdriver.com>
2006-06-29 14:47 ` [Acx100-users] Denis Vlasenko, where are you? (mail bounced) Andreas Mohr
2006-06-05 1:32 ` merging new drivers (was Re: 2.6.18 -mm merge plans) Jeff Garzik
2006-06-05 1:47 ` Andrew Morton
2006-06-05 8:59 ` Christoph Hellwig
2006-06-05 9:10 ` Andrew Morton
2006-06-05 9:16 ` Arjan van de Ven
2006-06-05 11:10 ` Ivan Novick
2006-06-05 11:26 ` Adrian Bunk
2006-06-05 6:58 ` Francois Romieu
2006-06-05 10:32 ` Alan Cox
2006-06-05 10:36 ` Arjan van de Ven
2006-06-06 2:02 ` Chris Wright
2006-06-06 7:01 ` Andi Kleen
2006-06-06 13:04 ` Steven Rostedt
2006-06-05 13:38 ` 2.6.18 -mm merge plans -- GFS David Woodhouse
2006-06-05 14:10 ` Russell King
2006-06-05 15:01 ` Steven Whitehouse
2006-06-07 7:12 ` Steven Whitehouse
2006-06-05 14:08 ` 2.6.18 -mm merge plans Oleg Nesterov
2006-06-05 14:43 ` Serge E. Hallyn
2006-06-08 19:56 ` Eric W. Biederman
2006-06-09 13:02 ` Serge E. Hallyn
2006-06-09 23:25 ` Serge E. Hallyn
2006-06-10 0:39 ` Eric W. Biederman
2006-06-10 1:23 ` Serge E. Hallyn
2006-06-10 7:52 ` Eric W. Biederman
2006-06-10 8:09 ` Eric W. Biederman
2006-06-10 9:53 ` Christoph Hellwig
[not found] ` <20060605010501.GA4931@mail.ustc.edu.cn>
2006-06-05 1:05 ` statistics infrastructure Fengguang Wu
2006-06-05 16:30 ` Greg KH
2006-06-13 23:47 ` statistics infrastructure (in -mm tree) review Greg KH
2006-06-14 0:18 ` Randy.Dunlap
2006-06-14 16:45 ` Greg KH
2006-06-14 22:48 ` Martin Peschke
2006-06-19 22:12 ` Greg KH
2006-06-20 15:40 ` Martin Peschke
2006-06-20 16:50 ` Randy.Dunlap
2006-06-21 18:51 ` Martin Peschke
2006-06-21 19:38 ` Matthew Frost
2006-06-22 11:43 ` Martin Peschke
2006-06-14 5:04 ` Andi Kleen
2006-06-14 22:49 ` Martin Peschke
2006-06-16 20:40 ` Greg KH
2006-06-16 21:34 ` Martin Peschke
2006-06-17 6:51 ` Andi Kleen
2006-06-17 11:03 ` Martin Peschke
2006-06-17 10:30 ` Martin Peschke
2006-06-06 0:54 ` Merge of per task delay accounting (was Re: 2.6.18 -mm merge plans) Balbir Singh
2006-06-06 22:28 ` Shailabh Nagar
2006-06-06 22:40 ` Andrew Morton
2006-06-08 14:27 ` Shailabh Nagar
2006-06-08 17:42 ` Andrew Morton
2006-06-08 18:36 ` Shailabh Nagar
2006-06-08 19:33 ` Balbir Singh
2006-06-06 22:52 ` Jay Lan
2006-06-06 22:55 ` Shailabh Nagar
2006-06-12 12:02 ` Martin Peschke
2006-06-12 13:28 ` Shailabh Nagar
2006-06-06 12:32 ` 2.6.18 -mm pi-futex merge Steven Rostedt
2006-06-06 13:34 ` Roman Zippel
2006-06-06 13:44 ` Steven Rostedt
2006-06-06 14:42 ` genirq Ingo Molnar
2006-06-06 16:56 ` genirq Daniel Walker
2006-06-07 8:42 ` genirq Ingo Molnar
2006-06-07 3:46 ` genirq Benjamin Herrenschmidt
2006-06-06 14:53 ` 2.6.18 -mm merge plans Ingo Molnar
2006-06-06 16:02 ` Andrew Morton
2006-06-06 16:35 ` Arjan van de Ven
2006-06-06 20:47 ` Ingo Molnar [this message]
2006-06-07 3:52 ` mutex vs. local irqs (Was: 2.6.18 -mm merge plans) Benjamin Herrenschmidt
2006-06-07 4:29 ` Andrew Morton
2006-06-07 5:04 ` Benjamin Herrenschmidt
2006-06-07 5:29 ` Andrew Morton
2006-06-07 6:44 ` Benjamin Herrenschmidt
2006-06-07 7:03 ` Andrew Morton
2006-06-07 13:21 ` Ingo Molnar
2006-06-08 0:31 ` Benjamin Herrenschmidt
2006-06-08 10:49 ` David Woodhouse
2006-06-08 10:53 ` Ingo Molnar
2006-06-08 11:01 ` David Woodhouse
2006-06-08 11:17 ` Roman Zippel
2006-06-08 13:38 ` Benjamin Herrenschmidt
2006-06-08 14:02 ` Roman Zippel
2006-06-08 23:40 ` Benjamin Herrenschmidt
2006-06-08 22:59 ` Paul Mackerras
2006-06-10 10:22 ` 2.6.18 -mm merge plans Christoph Hellwig
2006-06-14 15:18 ` Michael Halcrow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060606204705.GA17787@elte.hu \
--to=mingo@elte.hu \
--cc=akpm@osdl.org \
--cc=arjan@infradead.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox