public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: Tejun Heo <tj@kernel.org>, Nick Piggin <npiggin@suse.de>
Cc: Jiri Kosina <jkosina@suse.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	Yinghai Lu <yhlu.kernel@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	cl@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: irq lock inversion
Date: Fri, 6 Nov 2009 08:58:20 +0100
Message-ID: <20091106075820.GA28227@elte.hu>
In-Reply-To: <4AF3D428.8000804@kernel.org>


* Tejun Heo <tj@kernel.org> wrote:

> Ingo Molnar wrote:
> >>> This warning is bogus -- sched_init() is being called very early with IRQs
> >>> disabled, and the irqsave/restore code paths in pcpu_alloc() are only for early
> >>> init. The path can never be called from irq context once the early init
> >>> finishes. Rationale for this is explained in changelog of the commit mentioned
> >>> above.
> >>>
> >>> This problem can be encountered generally in any other early code running
> >>> with IRQs off and using irqsave/irqrestore.
> >>>
> >>> Reported-by: Yinghai Lu <yhlu.kernel@gmail.com>
> >>> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> >> Looks good to me.  Ingo, what do you think?
> > 
> > Ugh, this explanation is _BOGUS_. As i said, taking a lock with irqs 
> > disabled does _NOT_ mark a lock as 'irq safe' - if it did, we'd have 
> > false positives left and right.
> > 
> > Read the lockdep message please, consider all the backtraces it prints, 
> > it says something different.
> 
> Ah... okay, the pcpu_free() path is correctly marking the lock 
> irqsafe.  I assumed this was caused by recent pcpu_alloc() change. 
> Sorry about that.  The lock inversion problem has always been there, 
> it just never showed up because no one has used an allocation map that 
> large, I suppose.
> 
> So, the correct fix would be either 1. push irqsafeness down to 
> vmalloc locks or 2. the rather ugly unlock-lock dancing in 
> pcpu_extend_area_map() I posted earlier.  For 2.6.32, I guess we'll 
> have to go with #2.  For longer term, we'll probably have to do #1 as 
> it's required to implement atomic percpu allocations too.
> 
> I'll try to reproduce the problem here and verify the previous locking 
> dance patch.
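
The distinction argued over in the quoted exchange above (a lock taken with irqs merely disabled is not thereby 'irq safe') can be illustrated with a toy model. This is a minimal userspace sketch, not lockdep's implementation: the bit names, `struct lock_class`, and all helper functions are hypothetical, and it assumes only two pieces of per-lock state, "ever acquired from irq context" and "ever acquired with irqs enabled".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical usage bits, loosely modelled on per-lock-class tracking. */
enum {
	USED_IN_IRQ  = 1 << 0,	/* acquired at least once from irq context */
	IRQS_ENABLED = 1 << 1,	/* acquired at least once with irqs enabled */
};

struct lock_class {
	const char *name;
	unsigned int usage;
};

/* Record one acquisition of the lock and the context it happened in. */
static void note_acquire(struct lock_class *lc, bool in_irq, bool irqs_on)
{
	if (in_irq)
		lc->usage |= USED_IN_IRQ;
	if (irqs_on)
		lc->usage |= IRQS_ENABLED;
	/* Process context with irqs off sets neither bit. */
}

static bool is_irq_safe(const struct lock_class *lc)
{
	return (lc->usage & USED_IN_IRQ) != 0;
}

static bool is_irq_unsafe(const struct lock_class *lc)
{
	return (lc->usage & IRQS_ENABLED) != 0;
}

/*
 * 'outer' is held while 'inner' is acquired. The dependency is an
 * inversion if the outer lock is irq-safe and the inner one irq-unsafe:
 * an irq arriving while 'inner' is held elsewhere could then deadlock.
 */
static bool is_inversion(const struct lock_class *outer,
			 const struct lock_class *inner)
{
	return is_irq_safe(outer) && is_irq_unsafe(inner);
}
```

In this model an early-boot acquisition with irqs off leaves the lock unmarked; only a later acquisition from irq context (the free path, in the thread's case) makes the outer lock irq-safe and turns the existing dependency into an inversion.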

I haven't looked deeply but at first sight i'm not 100% sure that even 
the lock dance hack is safe - doesn't vfree() do TLB flushes, which must 
be done with irqs enabled in general? If yes, then the whole notion of 
using the allocator from irqs-off sections is wrong and the flags 
save/restore is misguided (or at least incomplete).
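
For reference, the general shape of an unlock-lock dance can be sketched as below. This is a userspace illustration under stated assumptions, not the patch under discussion: `struct toy_lock`, `struct area_map`, `slow_alloc()` and `extend_map()` are all hypothetical names, the lock only tracks whether it is held, and irq state is not modelled at all. It therefore shows only that the allocator is never entered with the lock held; it says nothing about a caller that already runs with irqs off, which is the concern raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a spinlock: it only tracks whether it is held. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct area_map {
	struct toy_lock lock;
	int *slots;
	size_t nr_slots;
};

/* Stand-in for an allocator that must not be entered under the lock. */
static void *slow_alloc(const struct toy_lock *l, size_t bytes)
{
	assert(!l->held);
	return calloc(1, bytes);
}

/*
 * Called with m->lock held and returns with it held.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int extend_map(struct area_map *m, size_t want)
{
	int *new_slots, *old;

	if (m->nr_slots >= want)
		return 0;

	/* Drop the lock around the allocation, then retake it. */
	toy_release(&m->lock);
	new_slots = slow_alloc(&m->lock, want * sizeof(*new_slots));
	toy_acquire(&m->lock);
	if (!new_slots)
		return -1;

	/* The map may have changed while unlocked, so recheck. */
	if (m->nr_slots >= want) {
		old = new_slots;	/* someone else extended it */
	} else {
		if (m->nr_slots)
			memcpy(new_slots, m->slots,
			       m->nr_slots * sizeof(*new_slots));
		old = m->slots;
		m->slots = new_slots;
		m->nr_slots = want;
	}

	/* Freeing also happens outside the lock. */
	toy_release(&m->lock);
	free(old);
	toy_acquire(&m->lock);
	return 0;
}
```

The recheck after retaking the lock is what makes the pattern awkward: every caller has to tolerate the protected state changing across the call.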

So the real problem right now i think is the use of the pcpu allocator 
from within a BH section (and from irqs-off sections) - that usage 
should be eliminated from .32, or the allocator should be fixed. (which 
looks non-trivial - vmalloc/vfree was never really intended to be used 
in irq-atomic contexts)

	Ingo

Thread overview: 18+ messages
     [not found] <86802c440911041008q4969b9bdk15b4598c40bb84bd@mail.gmail.com>
     [not found] ` <4AF25FC7.4000502@kernel.org>
     [not found]   ` <20091105082102.GA2870@elte.hu>
     [not found]     ` <4AF28D7A.6020209@kernel.org>
2009-11-05 14:31       ` irq lock inversion Jiri Kosina
2009-11-06  5:53         ` Tejun Heo
2009-11-06  7:17           ` Ingo Molnar
2009-11-06  7:45             ` Tejun Heo
2009-11-06  7:58               ` Ingo Molnar [this message]
2009-11-06  8:24                 ` Tejun Heo
2009-11-06  8:40                   ` Ingo Molnar
2009-11-06  8:52                     ` Tejun Heo
2009-11-06 16:08                       ` Christoph Lameter
2009-11-06 16:38                         ` Tejun Heo
2009-11-06 17:03                           ` Christoph Lameter
2009-11-07 16:13                             ` Peter Zijlstra
2009-11-09  5:46                               ` [PATCH percpu#for-linus] percpu: fix possible deadlock via " Tejun Heo
2009-11-06  9:59             ` Jens Axboe
2009-11-08  9:38               ` Ingo Molnar
2009-11-09 15:34                 ` Jens Axboe
2009-11-09 15:45                   ` Ingo Molnar
2009-11-09 15:49                     ` Jens Axboe
