From: Frederic Weisbecker <fweisbec@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, hpa@zytor.com,
paulmck@linux.vnet.ibm.com, akpm@linux-foundation.org,
khilman@linaro.org, tglx@linutronix.de, axboe@fb.com,
linux-tip-commits@vger.kernel.org
Subject: Re: [tip:timers/nohz] nohz: Move full nohz kick to its own IPI
Date: Wed, 7 May 2014 18:05:08 +0200
Message-ID: <20140507160504.GC16694@localhost.localdomain>
In-Reply-To: <20140507153736.GS30445@twins.programming.kicks-ass.net>
On Wed, May 07, 2014 at 05:37:36PM +0200, Peter Zijlstra wrote:
> On Wed, May 07, 2014 at 05:29:24PM +0200, Frederic Weisbecker wrote:
> > On Wed, May 07, 2014 at 05:17:35PM +0200, Peter Zijlstra wrote:
> > > On Mon, May 05, 2014 at 05:34:08PM +0200, Frederic Weisbecker wrote:
> > > > On Mon, May 05, 2014 at 05:12:28PM +0200, Peter Zijlstra wrote:
> > > > > > Note the current ordering:
> > > > > >
> > > > > > cmpxchg(&qsd->pending, 0, 1)      get ipi
> > > > > > csd_lock(qsd->csd)                xchg(&qsd->pending, 1)
> > > > > > send ipi                          csd_unlock(qsd->csd)
> > > > > >
> > > > > >
> > > > > > So there shouldn't be racing updaters. Also ipi sender shouldn't
> > > > > > race with ipi receiver, the update should always eventually see
> > > > > > the unlock happening.
> > > > >
> > > > > Yeah, I've not spotted how this particular train wreck happens either.
> > > > >
> > > > > The problem is reproduction, it took me 9 hours to confirm I could
> > > > > reproduce the problem on my machine. So how long do I run it with this
> > > > > patch reverted to show it's gone..
> > > >
> > > > Maybe it could be favoured by cpu hotplug. Anyway converting to irq_work should
> > > > fix it.
> > >
> > > Ingo needs a commit msg for the revert of this patch; do you think you
> > > have time to look into _why_ this patch is broken and write such a
> > > thing?
> >
> > I can try but I need to reproduce it. Do you have any clue on how to do so?
> > Also which HEAD were you guys using?
>
> Ha!, so I was running a tip/master with that commit in -- a few days
> ago, v3.15-rc4-1644-g5c658b0cdf22 might've been it.
>
> Then I ran it on my dual socket AMD interlagos, with:
>
> while :; do make O=allyesconfig-build/ clean; make O=allyesconfig-build/ -j96 -s; done
>
> for 9 hours, and then got empty RCU stall warns and a bricked machine.
>
> I might still have the .config, but I don't think there was anything
> particularly odd about the config other than having NOHZ_FULL enabled.
>
> The only way I found this patch was by staring at some RCU stall warns
> Ingo managed to get, sometimes they actually got backtraces in them
> apparently.
>
> According to Ingo the bigger the machine the faster it reproduces, but
> reproduction times, even for these 32 cpu machines, are in the many
> hours range.
Ok then, I'll try something.
But note that those commits aren't upstream yet and they are in a separate
branch tip:timers/nohz with no other non-upstream commits.
And I work alone on this branch.
So we can as well zap these commits and replace them with the irq_work_on()
conversion (still preparing that).
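
For reference, the ordering quoted at the top of this mail can be modelled
in userspace roughly as below. This is only a sketch with C11 atomics, not
the kernel code: struct qsd_model, try_queue() and handle_ipi() are
hypothetical names, csd_locked merely stands in for csd_lock()/csd_unlock(),
and it assumes the receiver's xchg resets pending to 0 so that a later
cmpxchg(0, 1) can succeed again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the queue data, not the kernel struct. */
struct qsd_model {
	atomic_int pending;	/* 1 while a kick is queued and not yet handled */
	atomic_int csd_locked;	/* stands in for the csd lock flag */
	int handled;		/* how many times the receiver ran */
};

/* Sender side: only the caller that flips pending 0 -> 1 sends the IPI. */
bool try_queue(struct qsd_model *q)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&q->pending, &expected, 1))
		return false;	/* already queued, nothing to send */

	/* csd_lock(): wait for the previous unlock, then take the lock. */
	while (atomic_load(&q->csd_locked))
		;
	atomic_store(&q->csd_locked, 1);

	/* "send ipi" would happen here. */
	return true;
}

/* Receiver side: clear pending first, then release the csd. */
void handle_ipi(struct qsd_model *q)
{
	assert(atomic_load(&q->csd_locked));	/* sender locked before sending */

	atomic_exchange(&q->pending, 0);	/* assumption: reset to 0 */
	q->handled++;
	atomic_store(&q->csd_locked, 0);	/* csd_unlock() */
}
```

In this model a second try_queue() before the receiver has run is dropped at
the cmpxchg, and one issued after handle_ipi() queues again. It says nothing
about why the real patch misbehaves.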