linux-kernel.vger.kernel.org archive mirror
From: Byungchul Park <byungchul.park@lge.com>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Peter Hurley <peter@hurleysoftware.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	akpm@linux-foundation.org, mingo@kernel.org,
	linux-kernel@vger.kernel.org, akinobu.mita@gmail.com,
	jack@suse.cz, torvalds@linux-foundation.org
Subject: Re: [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
Date: Fri, 29 Jan 2016 12:00:36 +0900
Message-ID: <20160129030036.GD31266@X58A-UD3R>
In-Reply-To: <20160129005406.GB4820@swordfish>

On Fri, Jan 29, 2016 at 09:54:06AM +0900, Sergey Senozhatsky wrote:
> because you don't give any details and don't answer any questions.

There are two ways to make the kernel better and more stable.

1) Remove the possibility that makes the system go crazy, even though it
would hardly ever happen because the probability is so low.

2) Fix it after facing some problems in practice and debugging it.

I started writing this patch for the 2nd reason, after seeing the
backtrace in gdb. But I have since lost the data I would need to debug it,
because I wrongly believed the problem was solved. So I could not answer
your questions about memory corruption and CPU off. Sorry for not
telling you these facts in advance. But please keep in mind that I was
proceeding by the 1st way.

> it took a while to even find out that you are reporting this issue
> not against a real H/W, but a qemu. I suppose qemu-arm running on
> x86_64 box.

It does not matter what kind of box I used, because I was only talking
about the possibility. It does not depend on the box at all.

> 
> now, what else we don't know?
> 
> explain STEP-BY-STEP why do you think spinlock debug code can lockup
> itself. not just "I don't think this is the case, I don't think that
> is the case".

I did explain the reason in detail, even though there was something I
missed. I never said "I don't think this is the case" in the
description explaining the problem. Anyway, I am not sure about my patch
now, thanks to your advice.

> 
> on every spin_dump recursive call it waits for the spin_lock and when
> it eventually grabs it, it does the job that it wanted to do under
> that spin lock, unlock it and return back. and the only case when it
> never "return back" is when it never "eventually grabs it".

Right. I missed it.
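
To make the quoted point concrete, here is a minimal user-space sketch of
the argument. It is only a toy model, not the code in lib/spinlock_debug.c
and not the patch under discussion: every name in it (toy_lock,
toy_trylock, toy_lock_and_print, toy_run, SPINS_BEFORE_DUMP, STACK_LIMIT)
is hypothetical, and the "other CPU" that holds the lock is simulated by a
simple countdown. It assumes that the dump path prints, and that printing
needs the same lock again.

```c
#include <stdio.h>

#define SPINS_BEFORE_DUMP 3  /* hypothetical stand-in for the lockup threshold */
#define STACK_LIMIT 8        /* hypothetical stand-in for a finite kernel stack */

struct toy_lock {
	long polls_until_free;   /* failed polls left before the owner unlocks; -1 = never */
};

static int max_depth;        /* deepest nesting observed in the last run */

/* One poll of the lock: succeeds only once the simulated owner has released it. */
static int toy_trylock(struct toy_lock *l)
{
	if (l->polls_until_free == 0)
		return 1;
	if (l->polls_until_free > 0)
		l->polls_until_free--;
	return 0;
}

/*
 * Spin on the lock. After SPINS_BEFORE_DUMP failed polls, act like a
 * "suspected lockup" dump: the dump prints, and printing needs the same
 * lock, so we recurse. Returns 0 once the lock was taken and the frame
 * unwound, -1 if the toy stack limit was hit first.
 */
static int toy_lock_and_print(struct toy_lock *l, int depth)
{
	if (depth > max_depth)
		max_depth = depth;
	if (depth >= STACK_LIMIT)
		return -1;

	for (;;) {
		for (int i = 0; i < SPINS_BEFORE_DUMP; i++) {
			if (toy_trylock(l))
				return 0; /* got it: do the work, unlock, return */
		}
		if (toy_lock_and_print(l, depth + 1) < 0)
			return -1;
	}
}

/* Run one scenario from depth 0 and report whether the stack unwound. */
static int toy_run(long polls_until_free)
{
	struct toy_lock l = { polls_until_free };

	max_depth = 0;
	return toy_lock_and_print(&l, 0);
}
```

In this model, toy_run(7) returns 0 with max_depth at 2: the lock is
eventually released, the innermost call grabs it, and every frame unwinds.
toy_run(-1) models a lock that is never released, and it returns -1 only
after the nesting reaches STACK_LIMIT.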

> 
> so I still don't see what issue you fix here -- the possibility to
> consume the entire kernel stack doing recursive spin_dump->spin_lock()
> calls because:
>   a) something never unlocks the lock (no matter why.. corruption, HW
> fault, etc.)
> or
>   b) everything was OK, but we attempted to printk() already
> being in a very-very deep callstack, so doing 5 extra
> printk->spin_dump->printk->spin_dump would simply kill it.
> 
> 
> if none of the above. then what you report and fix is simply non
> realistic. spin_dump must eventually unwind the stack back. yes,
> you'll see a lot of dump_stack() and all cpus backtraces done on
> every rollback stack. but you would still see some of them anyway,
> even w/o the spinlock debug code -- because you'd just
> raw_spin_lock_irqsave() on that lock for a very long time; which
> does upset watchdog, etc.

I am no longer sure whether it can be fixed by the 1st way, that is, by
removing the possibility that makes the system go crazy. There is something
I missed. Now I have to solve this problem by the 2nd way, after reproducing
it and debugging it in detail. I am still trying to reproduce it.

Anyway. Thank you very much.

Thanks,
Byungchul

> 
> 
> please start explaining the things.
> 
> 	-ss
