From: joe.korty@concurrent-rt.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and cpus_allowed
Date: Mon, 20 Nov 2017 11:30:40 -0500 [thread overview]
Message-ID: <20171120163040.GA25993@zipoli.concurrent-rt.com> (raw)
In-Reply-To: <20171117174851.2a253785@gandalf.local.home>
Hi Steve,
A quick perusal of 4.11.12-rt16 shows that it has an
entirely new version of migrate_disable which to me appears
correct.
In that new implementation, migrate_enable() recalculates
p->nr_cpus_allowed when it switches the task back to
using p->cpus_mask. This brings the two back into sync
if anything had happened to get them out of sync while
migration was disabled (as would happen on an affinity
change during that disable period).
4.9.47-rt37 has the old implementation and it appears to
have same bug as 4.4-rt though I have yet to test 4.9-rt.
The fix in these older versions could take one of two
forms: either we recalculate p->nr_cpus_allowed when
migrate_enable goes back to using p->cpus_allowed,
as the 4.11-rt version does, or the one place where we
allow p->nr_cpus_allowed to diverge from p->cpus_allowed
be fixed. The patch I submitted earlier takes this second
approach.
Regards,
Joe
On Fri, Nov 17, 2017 at 05:48:51PM -0500, Steven Rostedt wrote:
> On Wed, 15 Nov 2017 14:25:29 -0500
> joe.korty@concurrent-rt.com wrote:
>
> > 4.4.86-rt99's patch
> >
> > 0037-Intrduce-migrate_disable-cpu_light.patch
> >
> > introduces a place where a task's cpus_allowed mask is
> > updated without a corresponding update to nr_cpus_allowed.
> >
> > This path is executed when task affinity is changed while
> > migrate_disabled() is true. As there is no code present
> > to set nr_cpus_allowed when the migrate_disable state is
> > dropped, the scheduler at that point on may make incorrect
> > scheduling decisions for this task.
> >
> > My testing consists of temporarily adding a
> >
> > if (tsk_nr_cpus_allowed(p) == cpumask_weight(tsk_cpus_allowed(p))
> > printk_ratelimited(...)
>
> Have you tested v4.9-rt or 4.13-rt if it has the same bug? If it is a
> bug in 4.13-rt then it needs to go there first, and then backported to
> the stable releases (which I'm actually working on now).
>
> -- Steve
>
> >
> > stmt to schedule() and running a simple affinity rotation
> > program I wrote, one that rotates the threads of stress(1).
> > While rotating, I got the expected kernel error messages.
> > With this patch applied the messages disappeared.
> >
> > Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
> >
> > Index: b/kernel/sched/core.c
> > ===================================================================
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1220,6 +1220,7 @@ void do_set_cpus_allowed(struct task_str
> > lockdep_assert_held(&p->pi_lock);
> >
> > if (__migrate_disabled(p)) {
> > + p->nr_cpus_allowed = cpumask_weight(new_mask);
> > cpumask_copy(&p->cpus_allowed, new_mask);
> > return;
> > }
next prev parent reply other threads:[~2017-11-20 16:30 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-15 19:25 [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and cpus_allowed joe.korty
2017-11-17 22:48 ` Steven Rostedt
2017-11-20 16:30 ` joe.korty [this message]
2017-11-21 4:02 ` Steven Rostedt
2017-11-21 4:57 ` Steven Rostedt
2017-11-21 14:33 ` joe.korty
2017-11-21 15:33 ` joe.korty
2017-11-29 0:22 ` Steven Rostedt
2017-11-29 14:24 ` joe.korty
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171120163040.GA25993@zipoli.concurrent-rt.com \
--to=joe.korty@concurrent-rt.com \
--cc=a.p.zijlstra@chello.nl \
--cc=linux-kernel@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).