From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: RCU stall
Date: Tue, 22 Mar 2016 18:59:32 -0700
Message-ID: <20160323015932.GX4287@linux.vnet.ibm.com>
In-Reply-To: <56F1DAF6.3030804@sandisk.com>

On Tue, Mar 22, 2016 at 04:53:26PM -0700, Bart Van Assche wrote:
> On 03/22/2016 01:45 PM, Paul E. McKenney wrote:
> >You are getting a soft lockup as well as an RCU CPU stall warning, so
> >it looks like something is taking a very long time in blk_done_softirq().
> >
> >You have multiple occurrences at different times, so it looks to be
> >a long time as opposed to an infinite time.  Are you perhaps doing
> >something that would make a huge amount of work for blk_done_softirq()?
> >
> >See Documentation/RCU/stallwarn.txt in the kernel source tree for more
> >info on how to debug this sort of thing.
> 
> Hello Paul,
> 
> None of the drivers involved in the test I ran contain RCU code that
> has been changed recently. The block and SCSI subsystems process
> I/O completions in softirq context but until last week I hadn't seen
> any RCU lockup complaints when I ran an SRP test against a kernel
> with lockdep and several other kernel debugging options enabled.
> This is why I sent an e-mail to you. I have read
> Documentation/RCU/stallwarn.txt after I received your reply but this
> didn't provide me any clue about where to look for the root cause.
> Any further help would be appreciated.

My suggestion would be to check the block/SCSI softirq handler for
trace events.  If there are some, enable them and see what the loop
is doing.  Documentation/trace/ftrace.txt describes how to enable
existing event tracing.
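
A minimal sketch of turning on the existing block-layer trace events,
assuming tracefs is reachable at /sys/kernel/debug/tracing (some systems
mount it at /sys/kernel/tracing instead) and that the script runs as
root.  The helper names write_control() and enable_block_events() are
made up for illustration; the control files events/block/enable and
tracing_on are the ones described in the ftrace documentation.

```python
import os

# Assumption: tracefs is reachable under debugfs at this path; on some
# systems it is mounted at /sys/kernel/tracing instead.
TRACING_ROOT = "/sys/kernel/debug/tracing"


def write_control(root, relpath, value):
    """Write a value to one tracing control file and return its path."""
    path = os.path.join(root, relpath)
    with open(path, "w") as f:
        f.write(value)
    return path


def enable_block_events(root=TRACING_ROOT):
    """Turn on every event in the 'block' group, then turn tracing on."""
    return [
        write_control(root, "events/block/enable", "1"),
        write_control(root, "tracing_on", "1"),
    ]


if __name__ == "__main__":
    # Needs root.  Read the resulting records from <root>/trace.
    for path in enable_block_events():
        print("wrote 1 to", path)
```

The tracing root is a parameter so that the same function can be pointed
at an alternate mount point.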

If there is no event tracing, consider adding some in your local
view.  Failing that, there is always printk().  ;-)

Or perhaps you have some sort of debug setup.

Either way, the next step is to work out why that CPU is spending
so much time in that loop.
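
One possible way to put numbers on "so much time", sketched under
assumptions: the irq:softirq_entry and irq:softirq_exit events have been
enabled, the trace text has been saved to a file, and each record looks
like the sample in the comment below.  The exact column layout varies
between kernel versions, so the regular expression may need adjusting.
softirq_durations() is a hypothetical helper, not an existing tool.

```python
import re

# Assumed record shape, for example:
#   <idle>-0  [001] ..s.  100.000010: softirq_entry: vec=4 [action=BLOCK]
LINE_RE = re.compile(
    r"\[(?P<cpu>\d+)\].*?\s(?P<ts>\d+\.\d+): "
    r"(?P<event>softirq_entry|softirq_exit): "
    r"vec=\d+ \[action=(?P<action>\w+)\]"
)


def softirq_durations(lines, action="BLOCK"):
    """Return (cpu, seconds) for each entry/exit pair of `action`."""
    started = {}    # cpu -> timestamp of the pending softirq_entry
    durations = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None or m.group("action") != action:
            continue
        cpu, ts = int(m.group("cpu")), float(m.group("ts"))
        if m.group("event") == "softirq_entry":
            started[cpu] = ts
        elif cpu in started:
            durations.append((cpu, ts - started.pop(cpu)))
    return durations
```

Sorting the result by duration would show which invocations of the block
softirq ran long and on which CPU.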

							Thanx, Paul
