From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?
Date: Fri, 23 Jun 2017 09:41:42 -0700
Message-ID: <20170623164142.GA14685@linux.vnet.ibm.com>
In-Reply-To: <20170621153035.GA31181@linux.vnet.ibm.com>

On Wed, Jun 21, 2017 at 08:30:35AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 20, 2017 at 09:45:23AM -0700, Paul E. McKenney wrote:
> > On Sun, Jun 18, 2017 at 06:40:00AM -0400, Tejun Heo wrote:
> > > Hello,
> > > 
> > > On Sat, Jun 17, 2017 at 10:31:05AM -0700, Paul E. McKenney wrote:
> > > > On Sat, Jun 17, 2017 at 07:53:14AM -0400, Tejun Heo wrote:
> > > > > Hello,
> > > > > 
> > > > > On Fri, Jun 16, 2017 at 10:36:58AM -0700, Paul E. McKenney wrote:
> > > > > > And no test failures from yesterday evening.  So it looks like we get
> > > > > > somewhere on the order of one failure per 138 hours of TREE07 rcutorture
> > > > > > runtime with your printk() in the mix.
> > > > > >
> > > > > > Was the above output from your printk() output of any help?
> > > > > 
> > > > > Yeah, if my suspicion is correct, it'd require new kworker creation
> > > > > racing against CPU offline, which would explain why it's so difficult
> > > > > to repro.  Can you please see whether the following patch resolves the
> > > > > issue?
> > > > 
> > > > That could explain why only Steve Rostedt and I saw the issue.  As far
> > > > as I know, we are the only ones who regularly run CPU-hotplug stress
> > > > tests.  ;-)
> > > 
> > > I was a bit confused.  It has to be racing against either new kworker
> > > being created on the wrong CPU or rescuer trying to migrate to the
> > > CPU, and it looks like we're mostly seeing the rescuer condition, but,
> > > yeah, this would only get triggered rarely.  Another contributing
> > > factor could be the vmstat work being put on a workqueue w/ rescuer
> > > recently.  It runs quite often, so probably has increased the chance
> > > of hitting the right condition.
> > 
> > Sounds like too much fun!  ;-)
> > 
> > But more constructively...  If I understand correctly, it is now possible
> > to take a CPU partially offline and put it back online again.  This should
> > allow much more intense testing of this sort of interaction.
> > 
> > And no, I haven't yet tried this with RCU because I would probably need
> > to do some mix of just-RCU online/offline and full-up online-offline.
> > Plus RCU requires pretty much a full online/offline cycle to fully
> > exercise it.  :-/
> > 
> > > > I have a weekend-long run going, but will give this a shot overnight on
> > > > Monday, Pacific Time.  Thank you for putting it together, looking forward
> > > > to seeing what it does!
> > > 
> > > Thanks a lot for the testing and patience.  Sorry that it took so
> > > long.  I'm not completely sure the patch is correct.  It might have to
> > > be more specific about which type of migration or require further
> > > synchronization around migration, but hopefully it'll at least be able
> > > to show that this was the cause of the problem.
> > 
> > And last night's tests had no failures.  Which might actually mean
> > something, will get more info when I run without your patch this
> > evening.  ;-)
> 
> And it didn't fail without the patch, either.  45 hours of test vs.
> 60 hours with the patch.  This one is not going to be easy to prove
> either way.  I will try again this evening without the patch and see
> what that gets us.

And another 36 hours (total of 81 hours) without the patch, still no
failure.  Sigh.

In the sense that the patch doesn't cause any new problem:

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

But I clearly have nothing of statistical significance, so any confidence
in the fix is coming from your reproducer.

							Thanx, Paul
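
The "nothing of statistical significance" point can be made concrete with a
rough back-of-the-envelope calculation. The sketch below is not from the
thread. It assumes failures arrive as a Poisson process at the roughly
one-per-138-hours rate quoted earlier (a rate measured with the debugging
printk() applied, so it may not carry over exactly), and the function names
are hypothetical. Under those assumptions it estimates how likely a
failure-free run is even when the bug is still present.

```python
import math

# Assumed failure rate: about one failure per 138 hours of TREE07
# rcutorture runtime, as reported earlier in the thread.
MTBF_HOURS = 138.0


def prob_no_failure(hours, mtbf_hours=MTBF_HOURS):
    """Probability of zero failures in `hours` of testing, assuming a
    Poisson process with a mean of one failure per `mtbf_hours`."""
    return math.exp(-hours / mtbf_hours)


def hours_for_confidence(confidence, mtbf_hours=MTBF_HOURS):
    """Failure-free hours needed before a clean run would happen by chance
    with probability at most (1 - confidence) if the bug were still there."""
    return -mtbf_hours * math.log(1.0 - confidence)


# 81 hours without the patch and 60 hours with it, both failure-free.
p_without_patch = prob_no_failure(81.0)
p_with_patch = prob_no_failure(60.0)
hours_needed_95 = hours_for_confidence(0.95)

print(f"P(no failure in 81 h, unpatched): {p_without_patch:.2f}")
print(f"P(no failure in 60 h, patched):   {p_with_patch:.2f}")
print(f"Clean hours needed for 95%:       {hours_needed_95:.0f}")
```

Under these assumptions an unpatched kernel survives 81 hours cleanly a bit
more than half the time (about 0.56), and a 60-hour clean run is about as
unsurprising (about 0.65), so neither run distinguishes patched from
unpatched. Roughly 413 failure-free hours would be needed before a clean run
counted as evidence at the 95% level.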

