public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Maciej Rutecki" <maciej.rutecki@gmail.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Thomas Gleixner" <tglx@linutronix.de>
Subject: Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
Date: Fri, 30 Jan 2009 15:06:20 +0100
Message-ID: <20090130140620.GD17401@elte.hu>
In-Reply-To: <200901292329.59121.rjw@sisk.pl>


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Thursday 29 January 2009, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > On Tuesday 27 January 2009, Ingo Molnar wrote:
> > > > 
> > > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > > 
> > > > > > In fact whatever check you put in it's _always_ going to be 
> > > > > > fundamentally more fragile than direct instrumentation: you cannot 
> > > > > > possibly check all possible places that enable interrupts. (they could 
> > > > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > > > 
> > > > > In this particular case, I'm not really interested in that.  What I'm 
> > > > > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > > > > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > > > > to check directly.
> > > > 
> > > > But this is exactly what it does - without any need for debug checks 
> > > > spread around!
> > > > 
> > > > You'll get a _full stack dump_ from the very driver that is enabling 
> > > > interrupts! You dont get a trace - you get a stack dump of the very place 
> > > > that is buggy. It does not get any better than that.
> > > 
> > > I'm not going to argue.
> > > 
> > > Nevertheless, IMO something like the patch below should be sufficient to catch
> > > these bugs.
> > > 
> > > Thanks,
> > > Rafael
> > > 
> > > 
> > > ---
> > >  drivers/base/power/main.c |   12 ++++++++++++
> > >  drivers/base/sys.c        |   21 ++++++++++++++++-----
> > >  include/linux/pm.h        |   18 ++++++++++++++++++
> > >  3 files changed, 46 insertions(+), 5 deletions(-)
> > 
> > hm, so now you sprinkle debug checks all around the code, instead of 
> > putting in a single pair of:
> > 
> >     force_irqs_off_start();
> >     ...
> >     force_irqs_off_end();
> 
> And what debug options exactly would that require to be set to work?

hm, if you worry about that aspect: we could make it seamlessly enabled if 
PM_DEBUG is enabled.

> > which would catch everything that your checks would catch - and it 
> > would catch more.
> 
> Except that the checks trigger in specific places, so if a check 
> triggers you know precisely where the bug happened regardless of what 
> garbage is in the call trace.

This argument is a 100% mystery to me. Do you really not see the quality 
difference between a stack trace generated _right at the buggy piece of 
code_ and a warning later on that might (or might not) trigger?

Especially considering that your approach won't catch such bugs:

   ...
   spin_unlock_irq();
   ...
   spin_lock_irq();
   ...

Or such bugs:

   local_irq_enable();
   ...
   local_irq_disable();

Or such bugs:

   spin_lock_irqsave(&lock1, flags);
   ...
           spin_lock_irqsave(&lock2, flags);
           ...
           spin_unlock_irq(&lock2);          /* accidental bug */
   ...
   spin_unlock_irqrestore(&lock1, flags);

Such types of bugs might be especially hard to find in practice, if the 
window where irqs are enabled is small. There is no guarantee at all that 
accidental irq enabling survives a critical section - it can be 
re-disabled in the normal flow of things very easily.
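
To make the difference concrete, here is a minimal userspace sketch (a model, 
not kernel code). Every name in it is an assumption made for illustration: 
sim_irq_enable(), sim_irq_disable(), buggy_resume_early() and the two checker 
functions are hypothetical, and force_irqs_off_start()/force_irqs_off_end() 
only mimic the pair proposed above. It compares a check done after the 
callback returns with instrumentation at the point where irqs get enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simulated state; a userspace model only, not real kernel state. */
static bool irqs_enabled;     /* models the CPU interrupt flag */
static bool irqs_forced_off;  /* true between force_irqs_off_start/end */
static int  violations;       /* enable attempts caught at the call site */

static void sim_irq_disable(void)
{
	irqs_enabled = false;
}

static void sim_irq_enable(const char *caller)
{
	if (irqs_forced_off) {
		/* Direct instrumentation: report right where the bug is.
		 * A real implementation would dump the stack here. */
		violations++;
		printf("violation: irqs enabled in %s\n", caller);
		return;	/* enforce: irqs stay off */
	}
	irqs_enabled = true;
}

/* Hypothetical equivalents of the proposed start/end pair. */
static void force_irqs_off_start(void)
{
	sim_irq_disable();
	irqs_forced_off = true;
}

static void force_irqs_off_end(void)
{
	assert(!irqs_enabled);	/* enforcement kept irqs off throughout */
	irqs_forced_off = false;
}

/* Hypothetical buggy callback: opens a short window with irqs enabled. */
static void buggy_resume_early(void)
{
	sim_irq_enable(__func__);
	sim_irq_disable();
}

/* Approach 1: look at the irq state only after the callback returns. */
static bool post_check_catches(void (*cb)(void))
{
	sim_irq_disable();
	cb();
	return irqs_enabled;
}

/* Approach 2: enforce irqs-off for the whole callback, count violations. */
static int instrumented_catches(void (*cb)(void))
{
	violations = 0;
	force_irqs_off_start();
	cb();
	force_irqs_off_end();
	return violations;
}
```

In this model, post_check_catches(buggy_resume_early) returns false, because 
irqs are already off again by the time the check runs, while 
instrumented_catches(buggy_resume_early) returns 1 and names the function 
that tried to enable them.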

And even if we are lucky and the irqs are still enabled by the time the 
callback returns, what if your warning flags some big and complex driver, 
one line of which is buggy?

If you had the choice, what would you prefer - a stack dump done at the 
point of incident (pinpointing the driver, the subsystem and the buggy 
function with its full callframe), or your "this driver is buggy" generic 
warning with no specificity about where that bug might be?

Stacktraces _at the point of incident_, and a _guaranteed_ facility that 
_enforces_ that irqs are off during the whole resume cycle are just about 
the highest quality debug info and debug protection we can get in such 
situations.

I really don't understand your points about this.

	Ingo

Thread overview: 19+ messages
2009-01-17  8:58 [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible Maciej Rutecki
2009-01-25  2:37 ` Andrew Morton
2009-01-25 11:57   ` Thomas Gleixner
2009-01-25 14:19   ` Maciej Rutecki
2009-01-25 16:30   ` Maciej Rutecki
2009-01-26 16:07 ` Linus Torvalds
2009-01-26 16:53   ` Rafael J. Wysocki
2009-01-26 17:41     ` Ingo Molnar
2009-01-26 19:21       ` Rafael J. Wysocki
2009-01-26 20:35         ` Ingo Molnar
2009-01-26 20:48           ` Rafael J. Wysocki
2009-01-26 21:35             ` Ingo Molnar
2009-01-27 15:28               ` Rafael J. Wysocki
2009-01-27 15:49                 ` Ingo Molnar
2009-01-27 21:18                   ` Rafael J. Wysocki
2009-01-29 15:07                     ` Ingo Molnar
2009-01-29 22:29                       ` Rafael J. Wysocki
2009-01-30 14:06                         ` Ingo Molnar [this message]
2009-01-30 21:30                           ` Rafael J. Wysocki
