From: Byungchul Park <byungchul.park@lge.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: willy@linux.intel.com, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, akinobu.mita@gmail.com,
jack@suse.cz, sergey.senozhatsky.work@gmail.com,
peter@hurleysoftware.com, torvalds@linux-foundation.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] lock/semaphore: Avoid a deadlock within __up()
Date: Tue, 2 Feb 2016 18:00:52 +0900 [thread overview]
Message-ID: <20160202090052.GH29804@X58A-UD3R> (raw)
In-Reply-To: <20160202081355.GA30393@gmail.com>
On Tue, Feb 02, 2016 at 09:13:55AM +0100, Ingo Molnar wrote:
>
> * Byungchul Park <byungchul.park@lge.com> wrote:
>
> > Since I faced a infinite recursive printk() bug, I've tried to propose
> > patches the title of which is "lib/spinlock_debug.c: prevent a recursive
> > cycle in the debug code". But I noticed the root problem cannot be fixed
> > by that, through some discussion thanks to Sergey and Peter. So I focused
> > on preventing the DEADLOCK.
> >
> > -----8<-----
> > From 94a66990677735459a7790b637179d8600479639 Mon Sep 17 00:00:00 2001
> > From: Byungchul Park <byungchul.park@lge.com>
> > Date: Tue, 2 Feb 2016 15:35:48 +0900
> > Subject: [PATCH] lock/semaphore: Avoid a deadlock within __up()
> >
> > When the semaphore __up() is called from within printk() with
> > console_sem.lock, a DEADLOCK can happen, since the wake_up_process() can
> > call printk() again, esp. if defined CONFIG_DEBUG_SPINLOCK. And the
> > wake_up_process() don't need to be within a critical section.
> >
> > The scenario the bad thing can happen is,
> >
> > printk
> > console_trylock
> > console_unlock
> > up_console_sem
> > up
> > raw_spin_lock_irqsave(&sem->lock, flags)
> > __up
> > wake_up_process
> > try_to_wake_up
> > raw_spin_lock_irqsave(&p->pi_lock)
> > __spin_lock_debug
> > spin_dump
> > printk
> > console_trylock
> > raw_spin_lock_irqsave(&sem->lock, flags)
> >
> > *** DEADLOCK ***
> >
> > Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> > ---
> > kernel/locking/semaphore.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
> > index b8120ab..d3a28dc 100644
> > --- a/kernel/locking/semaphore.c
> > +++ b/kernel/locking/semaphore.c
> > @@ -259,5 +259,14 @@ static noinline void __sched __up(struct semaphore *sem)
> > struct semaphore_waiter, list);
> > list_del(&waiter->list);
> > waiter->up = true;
> > +
> > + /*
> > + * Trying to acquire this sem->lock in wake_up_process() leads a
> > + * DEADLOCK unless we unlock it here. For example, it's possile
> > + * in the case that called from within printk() since
> > + * wake_up_process() might call printk().
> > + */
> > + raw_spin_unlock_irq(&sem->lock);
> > wake_up_process(waiter->task);
> > + raw_spin_lock_irq(&sem->lock);
>
> So I'm pretty sad about this solution, as it penalizes every semaphore user -
Yeh... That was on my mind. Then... What about this alternative?
before
======
up
spin_lock
add count
__up
wake_up_process
spin_unlock
thispatch
=========
up
spin_lock
add count
__up
spin_unlock
wake_up_process
spin_lock
spin_unlock
alternative
===========
up
spin_lock
add count
spin_unlock
wake_up_process
This alternative does not have additional overhead and seems to be
reasonable, doesn't it? The reason why I proposed patches like this
including this alternative is that I thought it define the critical
section wider than it needs.
> while the deadlock is a really obscure one occuring within the scheduler or a
> console driver, which are very narrow code paths!
>
> (Also, please don't shout in comments, unless there's some really good reason to
> do it.)
Do you mean the upper case e.i. DEADLOCK? Okay I will keep in mind.
>
> Why doesn't spin_dump() break the console lock instead, if it detects that it's
> spinning on it, before doing the printk()? It's a likely fail state anyway - and
> this way we push any intrusive debug oriented action towards the unlikely fail
> state.
>
> Alternatively: why not improve down_trylock() to be lockless? The main reason for
> the lockup is that a trylock op takes the semaphore spinlock unconditionally.
> Which is fine for legacy code, but could perhaps be improved upon - I think we
> could in fact do it without turning sem->count into atomics.
>
> Alternatively #2: move printk() away from semaphores - it's pretty special code
> anyway and semaphore semanthics are far from obvious.
>
Thank you for your advice, and these approaches also look good. Could you
answer my question? If you don't think so, I will try it as you advised.
Thanks,
Byungchul
> Thanks,
>
> Ingo
prev parent reply other threads:[~2016-02-02 9:01 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-02 7:14 [PATCH] lock/semaphore: Avoid a deadlock within __up() Byungchul Park
2016-02-02 8:13 ` Ingo Molnar
2016-02-02 9:00 ` Byungchul Park [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160202090052.GH29804@X58A-UD3R \
--to=byungchul.park@lge.com \
--cc=a.p.zijlstra@chello.nl \
--cc=akinobu.mita@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peter@hurleysoftware.com \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=willy@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox