All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, andrea.parri@amarulasolutions.com
Subject: Re: Question about sched_setaffinity()
Date: Thu, 9 May 2019 12:36:25 -0700	[thread overview]
Message-ID: <20190509193625.GA12455@linux.ibm.com> (raw)
In-Reply-To: <20190509173654.GA23530@linux.ibm.com>

On Thu, May 09, 2019 at 10:36:54AM -0700, Paul E. McKenney wrote:
> On Tue, May 07, 2019 at 03:16:13PM -0700, Paul E. McKenney wrote:
> > On Wed, May 01, 2019 at 01:27:13PM -0700, Paul E. McKenney wrote:
> > > On Wed, May 01, 2019 at 03:16:55PM -0400, Steven Rostedt wrote:
> > > > On Wed, 1 May 2019 12:12:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > > > 
> > > > 
> > > > > OK, what I did was to apply the patch at the end of this email to -rcu
> > > > > branch dev, then run rcutorture as follows:
> > > > > 
> > > > > nohup tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 2 --configs "TRIVIAL" --bootargs "trace_event=sched:sched_switch,sched:sched_wakeup ftrace=function_graph ftrace_graph_filter=sched_setaffinity,migration_cpu_stop"
> > > > > 
> > > > > This resulted in the console output that I placed here:
> > > > > 
> > > > > http://www2.rdrop.com/~paulmck/submission/console.log.gz
> > > > > 
> > > > > But I don't see calls to sched_setaffinity() or migration_cpu_stop().
> > > > > Steve, is something else needed on the kernel command line in addition to
> > > > > the following?
> > > > > 
> > > > > ftrace=function_graph ftrace_graph_filter=sched_setaffinity,migration_cpu_stop
> > > > 
> > > > Do you have function graph enabled in the config?
> > > > 
> > > > [    2.098303] ftrace bootup tracer 'function_graph' not registered.
> > > 
> > > I guess I don't!  Thank you, will fix.
> > > 
> > > Let's see...
> > > 
> > > My .config has CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y.  It looks like I
> > > need CONFIG_FUNCTION_GRAPH_TRACER=y, which I don't have.  And it looks
> > > like that needs CONFIG_FUNCTION_TRACER=y, which I also don't have.
> > > But I do have CONFIG_HAVE_FUNCTION_TRACER=y.  So I should add this
> > > to my rcutorture command line:
> > > 
> > > --kconfig "CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y".
> > > 
> > > I fired this up.  Here is hoping!  ;-)
> > > 
> > > And it does have sched_setaffinity(), woo-hoo!!!  I overwrote the old file:
> > > 
> > > 	http://www2.rdrop.com/~paulmck/submission/console.log.gz
> > 
> > And I reran after adding a trace_printk(), which shows up as follows:
> > 
> > [  211.409565]  6)               |  /* On unexpected CPU 6, expected 4!!! */
> > 
> > I placed the console log here:
> > 
> > 	http://www2.rdrop.com/~paulmck/submission/console.tpk.log.gz
> > 
> > Just in case the earlier log proves useful.
> > 
> > And it is acting as if the destination CPU proved to be offline.  Except
> > that this rcutorture scenario doesn't offline anything, and I don't see
> > any CPU-hotplug removal messages.  So I added another trace_printk() to
> > print out cpu_online_mask.  This gets me the following:
> > 
> > [   31.565605]  0)               |  /* On unexpected CPU 0, expected 1!!! */
> > [   31.565605]  0)               |  /* Online CPUs: 0-7 */
> > 
> > So we didn't make it to CPU 1 despite its being online.  I placed the
> > console log here:
> > 
> > 	http://www2.rdrop.com/~paulmck/submission/console.tpkol.log.gz
> > 
> > Thoughts?
> > 
> > Updated patch against -rcu below in case it is useful.
> 
> I added more debug and got this:
> 
> [  215.097318]  4)               |  /* On unexpected CPU 4, expected 5!!! */
> [  215.098285]  4)               |  /* Online CPUs: 0-7  Active CPUs: 0-7 */
> [  215.099283]  4)               |  /* ret = 0, ->cpus_allowed 0-4,6-7 */
> 
> http://www.rdrop.com/~paulmck/submission/console.tpkm.log.gz
> 
> The task's ->cpus_allowed got set to the bitwise not of the requested
> mask.  I took a quick scan through the code but don't see how this
> happens.
> 
> Very strange.
> 
> Any suggestions for further debugging?  Left to myself, I would copy
> the requested cpumask to a per-task location and insert checks in the
> sched_setaffinity() code path, crude and ugly though that might seem.
> (Ugly enough that I will wait until tomorrow to try it out.)

And of course the nicer approach is to simply drop a few trace_printk()
calls into sched_setaffinity(), which got me this (with other CPU
tracing removed):

[  207.440153]  0)               |  /* sched_setaffinity: new_mask 5 */
[  207.445567]  0)               |  /* On unexpected CPU 0, expected 5!!! */
[  207.450017]  0)               |  /* Online CPUs: 0-7  Active CPUs: 0-7 */
[  207.454943]  0)               |  /* ret = 0, ->cpus_allowed 0-4,6-7 */

http://www.rdrop.com/~paulmck/submission/console.tkpms.log.gz
	(Search for "->cpus_allowed".)

I forward-ported the relevant patches from -rcu and placed them on -rcu
branch peterz.2019.05.09a, and this is what produced the output above.

Any other debugging thoughts?

Or, if you wish, you can reproduce by running the following:

nohup tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 2 --configs "TRIVIAL" --bootargs "trace_event=sched:sched_switch,sched:sched_wakeup ftrace=function_graph ftrace_graph_filter=sched_setaffinity,migration_cpu_stop" --kconfig "CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y"

This gets me the following summary output:

 --- Thu May  9 12:08:31 PDT 2019 Test summary:
 Results directory: /home/git/linux-2.6-tip/tools/testing/selftests/rcutorture/res/2019.05.09-12:08:31
 tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 2 --configs TRIVIAL --bootargs trace_event=sched:sched_switch,sched:sched_wakeup ftrace=function_graph ftrace_graph_filter=sched_setaffinity,migration_cpu_stop --kconfig CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y
 TRIVIAL ------- 2177 GPs (18.1417/s) [trivial: g0 f0x0 ]
 :CONFIG_HOTPLUG_CPU: improperly set
 WARNING: BAD SEQ 2176:2176 last:2177 version 4
     /home/git/linux-2.6-tip/tools/testing/selftests/rcutorture/res/2019.05.09-12:08:31/TRIVIAL/console.log
     WARNING: Assertion failure in /home/git/linux-2.6-tip/tools/testing/selftests/rcutorture/res/2019.05.09-12:08:31/TRIVIAL/console.log
     WARNING: Summary: Warnings: 1 Bugs: 1 Call Traces: 5 Stalls: 8

							Thanx, Paul


  reply	other threads:[~2019-05-09 19:37 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-27 18:02 Question about sched_setaffinity() Paul E. McKenney
2019-04-30 10:03 ` Peter Zijlstra
2019-04-30 10:51   ` Paul E. McKenney
2019-04-30 11:55     ` Peter Zijlstra
2019-05-01 19:12       ` Paul E. McKenney
2019-05-01 19:16         ` Steven Rostedt
2019-05-01 20:27           ` Paul E. McKenney
2019-05-07 22:16             ` Paul E. McKenney
2019-05-09 17:36               ` Paul E. McKenney
2019-05-09 19:36                 ` Paul E. McKenney [this message]
2019-05-10 12:08                   ` Peter Zijlstra
2019-05-10 23:07                     ` Paul E. McKenney
2019-05-11 21:45                       ` Andrea Parri
2019-05-12  0:39                         ` Paul E. McKenney
2019-05-12  1:05                           ` Andrea Parri
2019-05-13 12:20                             ` Paul E. McKenney
2019-05-13 15:37                               ` Joel Fernandes
2019-05-13 15:53                                 ` Paul E. McKenney
2019-05-13  8:10                       ` Peter Zijlstra
2019-05-13 12:19                         ` Paul E. McKenney
2019-05-09 21:40                 ` Andrea Parri
2019-05-09 21:56                   ` Andrea Parri
2019-05-09 22:17                     ` Paul E. McKenney
2019-05-10  6:32                       ` Andrea Parri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190509193625.GA12455@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=andrea.parri@amarulasolutions.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.