From: Juri Lelli <juri.lelli@arm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Clark Williams <williams@redhat.com>,
John Kacur <jkacur@redhat.com>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Juri Lelli <juri.lelli@gmail.com>
Subject: Re: [BUG] Corrupted SCHED_DEADLINE bandwidth with cpusets
Date: Thu, 4 Feb 2016 12:27:45 +0000
Message-ID: <20160204122745.GC29586@e106622-lin>
In-Reply-To: <20160204120412.GA29586@e106622-lin>
On 04/02/16 12:04, Juri Lelli wrote:
> On 04/02/16 09:54, Juri Lelli wrote:
> > Hi Steve,
> >
> > first of all thanks a lot for your detailed report, if only all bug
> > reports were like this.. :)
> >
> > On 03/02/16 13:55, Steven Rostedt wrote:
>
> [...]
>
> >
> > Right. I think this is the same thing that happens after hotplug. IIRC
> > the code paths are actually the same. Hotplug and cpuset
> > reconfiguration operations are destructive w.r.t. root_domains, so we
> > lose bandwidth information when they happen. The underlying issue is
> > that we only store cumulative bandwidth information in root_domain,
> > while information about which task belongs to which cpuset is stored
> > in the cpuset data structures.
> >
> > I tried to fix this a while back, but my attempt was broken: I failed
> > to get the locking right and, even though it seemed to fix the issue
> > for me, it was prone to race conditions. You might still want to have
> > a look at that for reference: https://lkml.org/lkml/2015/9/2/162
> >
>
> [...]
>
> >
> > It's good that we can recover, but that's still a bug yes :/.
> >
> > I'll try to see if my broken patch makes what you are seeing
> > disappear, so that we can at least confirm that we are seeing the same
> > problem; you could do the same if you want, I pushed that here
> >
>
> No, it doesn't solve this :/. I placed the restoring code in the hotplug
> workfn, so updates generated by toggling sched_load_balance don't get
> caught, of course. But this at least tells us that we should solve this
> someplace else.
>
Well, if I call an unlocked version of my cpuset_hotplug_update_rd()
from kernel/cpuset.c:update_flag() the issue seems to go away. But we
end up overcommitting the default null domain (try toggling
sched_load_balance multiple times). I updated the branch, but I still
think we should solve this differently.
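
To make the accounting problem concrete, here is a minimal user-space
sketch. It is not kernel code: struct model_task, struct
model_root_domain, rd_rebuild() and rd_restore() are hypothetical names
made up for illustration, and bandwidth is just an integer. It only
models the idea that the domain keeps a cumulative sum while per-task
membership lives elsewhere, so a rebuild drops the sum and a restore
has to walk the tasks again.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a deadline task: its bandwidth and its cpuset. */
struct model_task {
	uint64_t dl_bw;
	int cpuset_id;
};

/* Hypothetical model of a root domain: only a cumulative sum is kept. */
struct model_root_domain {
	uint64_t total_bw;
};

/* A rebuild creates a fresh domain, so the cumulative sum is lost. */
static void rd_rebuild(struct model_root_domain *rd)
{
	rd->total_bw = 0;
}

/*
 * Restore the sum by walking the tasks of one cpuset. This adds to
 * whatever is already accounted, so calling it without a preceding
 * rebuild counts the same tasks twice.
 */
static void rd_restore(struct model_root_domain *rd,
		       const struct model_task *tasks, size_t n,
		       int cpuset_id)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].cpuset_id == cpuset_id)
			rd->total_bw += tasks[i].dl_bw;
}
```

In this toy model, rd_rebuild() followed by rd_restore() brings the sum
back, while two restores in a row double it. Whether that is what
happens to the default domain in the real code is an assumption on my
side, not something I have verified.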
Best,
- Juri