From: Guenter Roeck <linux@roeck-us.net>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
sparclinux@vger.kernel.org, davem@davemloft.net
Subject: Re: next: Commit 'mm: Prevent __alloc_pages_nodemask() RCU CPU stall ...' causing hang on sparc32 qemu
Date: Wed, 30 Nov 2016 15:18:46 -0800
Message-ID: <20161130231846.GB17244@roeck-us.net>
In-Reply-To: <20161130210152.GL3924@linux.vnet.ibm.com>
On Wed, Nov 30, 2016 at 01:01:52PM -0800, Paul E. McKenney wrote:
> On Wed, Nov 30, 2016 at 11:21:59AM -0800, Guenter Roeck wrote:
> > On Wed, Nov 30, 2016 at 04:03:33AM -0800, Paul E. McKenney wrote:
> > > On Wed, Nov 30, 2016 at 02:52:11AM -0800, Guenter Roeck wrote:
> > > > On 11/29/2016 11:02 PM, Paul E. McKenney wrote:
> > > > >On Tue, Nov 29, 2016 at 08:32:51PM -0800, Guenter Roeck wrote:
> > > > >>On 11/29/2016 05:28 PM, Paul E. McKenney wrote:
> > > > >>>On Tue, Nov 29, 2016 at 01:23:08PM -0800, Guenter Roeck wrote:
> > > > >>>>Hi Paul,
> > > > >>>>
> > > > >>>>most of my qemu tests for sparc32 targets started to fail in next-20161129.
> > > > >>>>The problem is only seen in SMP builds; non-SMP builds are fine.
> > > > >>>>Bisect points to commit 2d66cccd73436 ("mm: Prevent __alloc_pages_nodemask()
> > > > >>>>RCU CPU stall warnings"); reverting that commit fixes the problem.
> > >
> > > And I have dropped this patch. Michal Hocko showed me the error of
> > > my ways with this patch.
> > >
> >
> > :-)
> >
> > On another note, I still get RCU tracebacks in the s390 tests.
> >
> > BUG: sleeping function called from invalid context at mm/page_alloc.c:3775
> >
> > That is caused by 'rcu: Maintain special bits at bottom of ->dynticks counter';
> > if I recall correctly we had discussed that earlier.
>
> Indeed, I had missed a dyntick counter update back on Nov 11, which meant
> that some of the code was still looking at the low-order bit instead of
> the next bit up. This is now fixed.
>
> So to get to the error message you call out above, I need to have improperly
> left the system in bh state or left irqs disabled, while the system was
> running normally without an oops. I am having a hard time seeing how this
> patch can do that.
>
> I would be more suspicious of f2a471ffc8a8 ("rcu: Allow boot-time use
> of cond_resched_rcu_qs()").
>
> So you bisected or did a revert to work out which was the offending commit?
>
My most recent bisect was with the November 10 image, so that would have missed
any later fix. Comparing the log messages, the current message is indeed
different. Sorry, I mixed that up; I just assumed that the problem would be
the same without really checking. My bad.

Bisect would be tricky, since the s390 image was broken for some time after
November 10. The first time I have seen the above BUG: was with next-20161128
(which is the first build after the crash was fixed). That version did not
include f2a471ffc8a8, so that can not be the cause.

I'll try to set up a bisect tonight, working around the crash problem.
I'll let you know how it goes.

Guenter
Thread overview: 13+ messages
2016-11-29 21:23 next: Commit 'mm: Prevent __alloc_pages_nodemask() RCU CPU stall ...' causing hang on sparc32 qemu Guenter Roeck
2016-11-30 1:28 ` Paul E. McKenney
2016-11-30 4:32 ` Guenter Roeck
2016-11-30 7:02 ` Paul E. McKenney
2016-11-30 10:52 ` Guenter Roeck
2016-11-30 12:03 ` Paul E. McKenney
2016-11-30 19:21 ` Guenter Roeck
2016-11-30 21:01 ` Paul E. McKenney
2016-11-30 23:18 ` Guenter Roeck [this message]
2016-12-01 1:19 ` Paul E. McKenney
2016-12-01 6:56 ` Guenter Roeck
2016-12-01 12:34 ` Paul E. McKenney
2016-12-01 12:50 ` Guenter Roeck