From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Byungchul Park <byungchul.park@lge.com>
Cc: akpm@linux-foundation.org, mingo@kernel.org,
linux-kernel@vger.kernel.org, akinobu.mita@gmail.com,
jack@suse.cz, torvalds@linux-foundation.org,
peter@hurleysoftware.com, sergey.senozhatsky.work@gmail.com,
sergey.senozhatsky@gmail.com
Subject: Re: [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
Date: Thu, 28 Jan 2016 11:37:50 +0900 [thread overview]
Message-ID: <20160128023750.GB1834@swordfish> (raw)
In-Reply-To: <20160128014253.GC1538@X58A-UD3R>
Hello,
On (01/28/16 10:42), Byungchul Park wrote:
> +cc peter@hurleysoftware.com
> +cc sergey.senozhatsky.work@gmail.com
thanks for Cc-ing.
> On Wed, Jan 27, 2016 at 09:01:01PM +0900, Byungchul Park wrote:
> > changes form v3 to v4
> > - reuse a existing code as much as possible for preventing an infinite
> > recursive cycle.
> >
> > changes from v2 to v3
> > - avoid printk() only in case of lockup suspected, not real lockup in
> > which case it does not help at all.
> > - consider not only console_sem.lock but also logbuf_lock which is used
> > by printk().
> >
> > changes from v1 to v2
> > - only change comment and commit message esp. replacing "deadlock" with
> > "infinite recursive cycle", since it is not an actual deadlock.
[..]
> > It causes an infinite recursive cycle when using CONFIG_DEBUG_SPINLOCK,
> > in the spin_dump(). Backtrace prints printk() -> console_trylock() ->
> > do_raw_spin_lock() -> spin_dump() -> printk()... infinitely.
> >
> > When the debug spinlock code is called from printk(), we should prevent
> > calling spin_dump() and the like, calling printk() again in that context.
ok, I'll repeat the questions.
under what circumstances you hit this problem? ...memory corruption, cpu
core has been powered off, while it owned the spin_lock... your irqsave
didn't work?
the thing is, it's really-really hard to lockup in console_trylock()...
int down_trylock(struct semaphore *sem)
{
unsigned long flags;
int count;
raw_spin_lock_irqsave(&sem->lock, flags); <<<<<< um...
count = sem->count - 1;
if (likely(count >= 0))
sem->count = count;
raw_spin_unlock_irqrestore(&sem->lock, flags);
return (count < 0);
}
was not able to dereference sem->count? `*sem' was corrupted? or because
sem->lock was corrupted? in any of those cases you solve the problem from
the wrong side. if you have a memory corruption then it can corrupt anything,
including, for example, console driver spin_lock.
suppose, that you hit do_raw_spin_lock()->spin_dump(), which means that the
spin_lock was not corrupted, it passed debug_spin_lock_before() after all.
and that spin_lock was taken for longer than `loops_per_jiffy * HZ', while
other CPU was doing
count = sem->count - 1;
if (likely(count >= 0))
sem->count = count;
???
was the CPU that owned the lock alive? (h/w fault, perhaps?).
dunno... yes, this
printk()->console_trylock()->do_raw_spin_lock()->spin_dump()->printk()
is possible, but it's possible only when your system is screwed up badly, so
badly that this spin_dump() loop is not really a problem, IMHO.
if I'm missing something terribly obvious here, then please give more details.
-ss
> > Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> > ---
> > include/linux/debug_locks.h | 4 ++++
> > kernel/locking/spinlock_debug.c | 14 +++++++++++---
> > 2 files changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
> > index 822c135..b0f977e 100644
> > --- a/include/linux/debug_locks.h
> > +++ b/include/linux/debug_locks.h
> > @@ -10,6 +10,10 @@ struct task_struct;
> > extern int debug_locks;
> > extern int debug_locks_silent;
> >
> > +static inline void __debug_locks_on(void)
> > +{
> > + debug_locks = 1;
> > +}
> >
> > static inline int __debug_locks_off(void)
> > {
> > diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
> > index 0374a59..65177ba 100644
> > --- a/kernel/locking/spinlock_debug.c
> > +++ b/kernel/locking/spinlock_debug.c
> > @@ -113,11 +113,19 @@ static void __spin_lock_debug(raw_spinlock_t *lock)
> > return;
> > __delay(1);
> > }
> > - /* lockup suspected: */
> > - spin_dump(lock, "lockup suspected");
> > +
> > + /*
> > + * We should prevent calling printk() further, since it would cause
> > + * an infinite recursive cycle if it's called from printk()!
> > + */
> > + if (__debug_locks_off()) {
> > + /* lockup suspected: */
> > + spin_dump(lock, "lockup suspected");
> > #ifdef CONFIG_SMP
> > - trigger_all_cpu_backtrace();
> > + trigger_all_cpu_backtrace();
> > #endif
> > + __debug_locks_on();
> > + }
> >
> > /*
> > * The trylock above was causing a livelock. Give the lower level arch
> > --
> > 1.9.1
>
next prev parent reply other threads:[~2016-01-28 2:36 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-27 12:01 [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in the debug code Byungchul Park
2016-01-27 22:49 ` Peter Hurley
2016-01-28 7:15 ` Byungchul Park
2016-01-29 8:19 ` Byungchul Park
2016-01-28 1:42 ` Byungchul Park
2016-01-28 2:37 ` Sergey Senozhatsky [this message]
2016-01-28 4:36 ` byungchul.park
2016-01-28 6:05 ` Sergey Senozhatsky
2016-01-28 8:13 ` Byungchul Park
2016-01-28 10:41 ` Sergey Senozhatsky
2016-01-28 10:53 ` Sergey Senozhatsky
2016-01-28 15:42 ` Sergey Senozhatsky
2016-01-28 23:08 ` Peter Hurley
2016-01-28 23:54 ` Byungchul Park
2016-01-29 0:54 ` Sergey Senozhatsky
2016-01-29 3:00 ` Byungchul Park
2016-01-29 4:05 ` Sergey Senozhatsky
2016-01-29 12:15 ` Byungchul Park
2016-01-29 0:27 ` Sergey Senozhatsky
2016-01-29 4:32 ` Peter Hurley
2016-01-29 5:28 ` Sergey Senozhatsky
2016-01-29 5:48 ` Peter Hurley
2016-01-29 6:16 ` Sergey Senozhatsky
2016-01-29 6:37 ` Sergey Senozhatsky
2016-01-31 12:30 ` Sergey Senozhatsky
2016-01-31 12:33 ` [PATCH 1/3] printk: introduce console_reset_on_panic() function Sergey Senozhatsky
2016-01-31 12:33 ` [PATCH 2/3] printk: introduce reset_console_drivers() Sergey Senozhatsky
2016-01-31 12:47 ` kbuild test robot
2016-01-31 12:33 ` [PATCH 3/3] spinlock_debug: panic on recursive lock spin_dump() Sergey Senozhatsky
2016-02-01 16:14 ` Sergey Senozhatsky
2016-02-02 7:59 ` Sergey Senozhatsky
2016-01-31 12:42 ` [PATCH 1/3] printk: introduce console_reset_on_panic() function kbuild test robot
2016-01-29 6:54 ` [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in the debug code Byungchul Park
2016-01-29 7:13 ` Sergey Senozhatsky
2016-01-29 8:13 ` Byungchul Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160128023750.GB1834@swordfish \
--to=sergey.senozhatsky.work@gmail.com \
--cc=akinobu.mita@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=byungchul.park@lge.com \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peter@hurleysoftware.com \
--cc=sergey.senozhatsky@gmail.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox