From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	sparclinux@vger.kernel.org, davem@davemloft.net
Subject: Re: next: Commit 'mm: Prevent __alloc_pages_nodemask() RCU CPU stall ...' causing hang on sparc32 qemu
Date: Wed, 30 Nov 2016 21:01:52 +0000
Message-ID: <20161130210152.GL3924@linux.vnet.ibm.com>
In-Reply-To: <20161130192159.GB22216@roeck-us.net>

On Wed, Nov 30, 2016 at 11:21:59AM -0800, Guenter Roeck wrote:
> On Wed, Nov 30, 2016 at 04:03:33AM -0800, Paul E. McKenney wrote:
> > On Wed, Nov 30, 2016 at 02:52:11AM -0800, Guenter Roeck wrote:
> > > On 11/29/2016 11:02 PM, Paul E. McKenney wrote:
> > > >On Tue, Nov 29, 2016 at 08:32:51PM -0800, Guenter Roeck wrote:
> > > >>On 11/29/2016 05:28 PM, Paul E. McKenney wrote:
> > > >>>On Tue, Nov 29, 2016 at 01:23:08PM -0800, Guenter Roeck wrote:
> > > >>>>Hi Paul,
> > > >>>>
> > > >>>>most of my qemu tests for sparc32 targets started to fail in next-20161129.
> > > >>>>The problem is only seen in SMP builds; non-SMP builds are fine.
> > > >>>>Bisect points to commit 2d66cccd73436 ("mm: Prevent __alloc_pages_nodemask()
> > > >>>>RCU CPU stall warnings"); reverting that commit fixes the problem.
> > 
> > And I have dropped this patch.  Michal Hocko showed me the error of
> > my ways with this patch.
> > 
> 
> :-)
> 
> On another note, I still get RCU tracebacks in the s390 tests.
> 
> BUG: sleeping function called from invalid context at mm/page_alloc.c:3775
> 
> That is caused by 'rcu: Maintain special bits at bottom of ->dynticks counter';
> if I recall correctly we had discussed that earlier.

Indeed, I had missed a dyntick counter update back on Nov 11, which meant
that some of the code was still looking at the low-order bit instead of
the next bit up.  This is now fixed.
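
To illustrate the low-order-bit issue, here is a minimal user-space sketch.
It assumes a counter whose bit 0 is reserved for a special flag, so that the
counter proper starts at bit 1.  The names SPECIAL_BIT, COUNTER_STEP,
is_idle(), is_idle_stale(), and dynticks_demo() are hypothetical and are not
the kernel's identifiers; this is not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not the kernel's identifiers. */
#define SPECIAL_BIT  0x1u               /* bit 0: reserved for a special flag */
#define COUNTER_STEP (SPECIAL_BIT << 1) /* bit 1: lowest bit of the counter */

/* Stale check: still assumes the counter's lowest bit is bit 0. */
static inline bool is_idle_stale(unsigned int dynticks)
{
	return !(dynticks & SPECIAL_BIT);
}

/* Updated check: tests the next bit up, where the counter now starts. */
static inline bool is_idle(unsigned int dynticks)
{
	return !(dynticks & COUNTER_STEP);
}

/* Walks one non-idle -> idle transition and compares the two checks. */
static inline void dynticks_demo(void)
{
	unsigned int dynticks = COUNTER_STEP;  /* counter bit set: non-idle */

	assert(!is_idle(dynticks));
	assert(is_idle_stale(dynticks));       /* stale check misreports idle */

	dynticks += COUNTER_STEP;              /* counter bit clear: idle */
	dynticks |= SPECIAL_BIT;               /* special flag set while idle */

	assert(is_idle(dynticks));
	assert(!is_idle_stale(dynticks));      /* stale check misreports non-idle */
}
```

In this model, any code path that keeps using the stale check reads the
flag bit as if it were the idle/non-idle state, so it can be wrong in both
directions.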

So to get to the error message you call out above, I need to have improperly
left the system in bh state or left irqs disabled, while the system was
running normally without an oops.  I am having a hard time seeing how this
patch can do that.
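
For readers less familiar with that message, here is a rough user-space
model of the condition being described: sleeping is only acceptable when
bottom halves are not disabled and irqs are enabled.  The struct ctx_model
type and all of the function names below are hypothetical; this is a
sketch of the general pattern, not the kernel's actual check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of execution context; not a kernel data structure. */
struct ctx_model {
	unsigned int bh_disable_depth; /* nesting depth of bh disabling */
	bool irqs_disabled;
};

/* Sleeping is allowed only with bh enabled and irqs enabled. */
static inline bool may_sleep(const struct ctx_model *c)
{
	return c->bh_disable_depth == 0 && !c->irqs_disabled;
}

static inline void model_bh_disable(struct ctx_model *c)
{
	c->bh_disable_depth++;
}

static inline void model_bh_enable(struct ctx_model *c)
{
	assert(c->bh_disable_depth > 0); /* must pair with a disable */
	c->bh_disable_depth--;
}

/* Buggy pattern: the early return skips the matching enable. */
static inline int leaky_op(struct ctx_model *c, bool fail)
{
	model_bh_disable(c);
	if (fail)
		return -1;       /* bh is left disabled here */
	model_bh_enable(c);
	return 0;
}
```

After a failing leaky_op() call in this model, may_sleep() stays false for
everything that runs afterwards, which is the kind of leftover state that
would be needed to produce the message during otherwise normal operation.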

I would be more suspicious of f2a471ffc8a8 ("rcu: Allow boot-time use
of cond_resched_rcu_qs()").

So you bisected or did a revert to work out which was the offending commit?

							Thanx, Paul


Thread overview:
2016-11-29 21:23 next: Commit 'mm: Prevent __alloc_pages_nodemask() RCU CPU stall ...' causing hang on sparc32 qemu Guenter Roeck
2016-11-30  1:28 ` Paul E. McKenney
2016-11-30  4:32   ` Guenter Roeck
2016-11-30  7:02     ` Paul E. McKenney
2016-11-30 10:52       ` Guenter Roeck
2016-11-30 12:03         ` Paul E. McKenney
2016-11-30 19:21           ` Guenter Roeck
2016-11-30 21:01             ` Paul E. McKenney [this message]
2016-11-30 23:18               ` Guenter Roeck
2016-12-01  1:19                 ` Paul E. McKenney
2016-12-01  6:56                   ` Guenter Roeck
2016-12-01 12:34                     ` Paul E. McKenney
2016-12-01 12:50                       ` Guenter Roeck
