From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: tglx@linutronix.de, peterz@infradead.org,
	preeti@linux.vnet.ibm.com, viresh.kumar@linaro.org,
	mtosatti@redhat.com, fweisbec@gmail.com
Cc: linux-kernel@vger.kernel.org, sasha.levin@oracle.com
Subject: Possible issue with commit 4961b6e11825?
Date: Fri, 4 Dec 2015 15:20:22 -0800
Message-ID: <20151204232022.GA15891@linux.vnet.ibm.com>

Hello!

Are there any known issues with commit 4961b6e11825 (sched: core: Use
hrtimer_start[_expires]())?

The reason that I ask is that I am about 90% sure that an rcutorture
failure bisects to that commit.  I will be running more tests on
3497d206c4d9 (perf: core: Use hrtimer_start()), which is the predecessor
of 4961b6e11825, and which, unlike 4961b6e11825, passes a 12-hour
rcutorture test with scenario TREE03.  In contrast, 4961b6e11825 gets
131 RCU CPU stall warnings, 132 reports of one of RCU's grace-period
kthreads being starved, and 525 reports of one of rcutorture's kthreads
being starved.  Most of the test runs hang on shutdown, which is no
surprise if an RCU CPU stall is happening at about that time.

But perhaps 3497d206c4d9 was just getting lucky, hence additional testing
over the weekend.

Reproducing this takes some doing.  A multisocket x86 box with significant
background computational noise seems to be able to reproduce this with
high probability in a twelve-hour test.  I -can- make it happen on
a single-socket four-core system (eight hardware threads, and with
significant background computational noise), but I ran the test for
several days before seeing the first error.  In addition, the probability
of hitting this is greatly reduced when running the tests on the
multisocket x86 box without the background computational noise.
(I recently taught some IBMers about ppcmem and herd, and gave them
some problems to solve, which is where the background noise came from,
in case you were wondering.  An unexpected benefit from those tools!)

The starving of RCU's grace-period kthreads is quite surprising, as
diagnostics indicate that they are in a wait_event_interruptible_timeout()
with a three-jiffy timeout.  The starvation is not subtle: 21-second
starvation periods are quite common, and 84-second starvation periods
occur from time to time.  In addition, rcutorture goes idle every few
seconds in order to test ramp-up and ramp-down effects, which should rule
out starvation due to heavy load.  Besides, I never see any softlockup
warnings, which should appear in the heavy-load-starvation case.
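
To put the numbers above in perspective, here is a minimal userspace
sketch of the arithmetic.  It is not kernel code.  The tick rate is an
assumption, since this message does not state CONFIG_HZ, and the helper
names are hypothetical.  It estimates how many back-to-back three-jiffy
timeouts would have to be missed to account for a stall of a given length.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Convert a timeout expressed in jiffies to milliseconds.
 * hz is an ASSUMED tick rate; the message does not state CONFIG_HZ.
 */
double jiffies_to_ms(unsigned long jiffies, unsigned int hz)
{
	assert(hz > 0);
	return (double)jiffies * 1000.0 / (double)hz;
}

/*
 * Number of consecutive timeouts of timeout_jiffies each that fit into
 * a stall lasting stall_seconds.
 */
double missed_timeouts(double stall_seconds, unsigned long timeout_jiffies,
		       unsigned int hz)
{
	return stall_seconds * 1000.0 / jiffies_to_ms(timeout_jiffies, hz);
}

/* Print the comparison for the two stall lengths quoted in the message. */
void print_starvation_report(unsigned int hz)
{
	const double stalls[] = { 21.0, 84.0 };
	unsigned int i;

	for (i = 0; i < 2; i++)
		printf("HZ=%u: %.0f s stall = %.0f missed 3-jiffy timeouts\n",
		       hz, stalls[i], missed_timeouts(stalls[i], 3, hz));
}
```

Calling print_starvation_report(1000) from a main() shows that, at an
assumed HZ=1000, a 21-second stall spans 7000 three-jiffy timeouts and an
84-second stall spans 28000, which is why a merely late wakeup seems an
unlikely explanation.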

The commit log for 4961b6e11825 is as follows:

	sched: core: Use hrtimer_start[_expires]()

	hrtimer_start() now enforces a timer interrupt when an already
	expired timer is enqueued.

	Get rid of the __hrtimer_start_range_ns() invocations and the
	loops around it.

Is it possible that I need to adjust RCU or rcutorture code to account
for these newly enforced timer interrupts?  Or is there a known bug with
this commit whose fix I need to apply when bisecting?  (I already had to
apply two other fixes while bisecting, so I figured I should ask.)

							Thanx, Paul

