Re: [PATCH 0/7] sched/deadline: fix cpusets bandwidth accounting


From: Luca Abeni <luca.abeni@santannapisa.it>
To: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	tj@kernel.org, vbabka@suse.cz, Li Zefan <lizefan@huawei.com>,
	akpm@linux-foundation.org, weiyongjun1@huawei.com,
	Juri Lelli <juri.lelli@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Claudio Scordino <claudio@evidence.eu.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>
Subject: Re: [PATCH 0/7] sched/deadline: fix cpusets bandwidth accounting
Date: Fri, 13 Oct 2017 10:04:04 +0200
Message-ID: <20171013100404.41cefbe0@luca>
In-Reply-To: <CANLsYkz=FixsuvRU-=-Gge5bHQvdiUZTrzh2JttY1PvKTEDaTw@mail.gmail.com>

Hi Mathieu,

On Thu, 12 Oct 2017 10:57:09 -0600
Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
[...]
> >> Regardless of how we proceed (using existing CPUset list or new ones) we
> >> need to deal with DL tasks that span more than one root domain,  something
> >> that will typically happen after a CPUset operation.  For example, if we
> >> split the number of available CPUs on a system in two CPUsets and then turn
> >> off the 'sched_load_balance' flag on the parent CPUset, DL tasks in the
> >> parent CPUset will end up spanning two root domains.
> >>
> >> One way to deal with this is to prevent CPUset operations from happening
> >> when such condition is detected, as enacted in this set.  Although simple
> >> this approach feels brittle and akin to a "whack-a-mole" game.  A better
> >> and more reliable approach would be to teach the DL scheduler to deal with
> >> tasks that span multiple root domains, a serious and substantial
> >> undertaking.
> >>
> >> I am sending this as a starting point for discussion.  I would be grateful
> >> if you could take the time to comment on the approach and most importantly
> >> provide input on how to deal with the open issue underlined above.  
> >
> > Right, so teaching DEADLINE about arbitrary affinities is 'interesting'.
> >
> > Although the rules proposed by Tommaso, if found sufficient, would
> > greatly simplify things. Also the online semi-partition approach to SMP
> > could help with that.  
> 
> The "rules" proposed by Tommaso, are you referring to patches or the
> deadline/cgroup extension work that he presented at OSPM?

No, that is an unrelated thing... Tommaso previously proposed some
improvements to the admission control mechanism to take arbitrary
affinities into account.


I think Tommaso's proposal is similar to what I previously proposed in
this thread (to admit a SCHED_DEADLINE task with utilization
u = runtime / period and affinity to N runqueues, we can account u / N
to each one of the runqueues, and check if the sum of the utilizations
on each runqueue is < 1).
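
To make the check above concrete, here is a minimal user-space sketch in
Python. It is only a model, not kernel code: the function name admit(),
the rq_util list and the use of exact fractions are assumptions made for
illustration. A task with u = runtime / period and affinity to N
runqueues adds u / N to each of them, and is rejected if any of those
runqueues would reach a total utilization of 1.

```python
from fractions import Fraction


def admit(rq_util, affinity, runtime, period):
    """Try to admit a task; update rq_util and return True on success.

    rq_util  -- list of per-runqueue utilizations (hypothetical model)
    affinity -- indices of the runqueues the task may run on
    """
    if not affinity:
        raise ValueError("affinity must contain at least one runqueue")

    # u = runtime / period, split evenly across the N allowed runqueues.
    share = Fraction(runtime, period) / len(affinity)

    # Reject if any affected runqueue would no longer stay below 1.
    if any(rq_util[cpu] + share >= 1 for cpu in affinity):
        return False

    # Admission succeeded: account u / N on each runqueue.
    for cpu in affinity:
        rq_util[cpu] += share
    return True


# Example: 4 runqueues, a task with u = 0.3 allowed on CPUs 0 and 1.
rq_util = [Fraction(0)] * 4
admitted = admit(rq_util, [0, 1], runtime=30, period=100)
```

Note that this sketch checks and updates all the runqueues in one go,
which is exactly the part that would need the root domain to be locked
in a naive implementation.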

As Peter previously noted, this might have some scalability issues
(a naive implementation would lock the root domain while iterating over
all the runqueues). A few days ago, I was discussing with Tommaso a
possible solution based on not locking the root domain structure, and
possibly using a roll-back strategy if the status of the root domain
changes while we are updating it. I think in a previous email you
mentioned RCU, which might result in a similar solution.
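
Just to illustrate the roll-back idea, here is a single-threaded Python
model. Everything in it is hypothetical: the RootDomain class, its
generation counter, rebuild() and the on_step hook (used only to
simulate a concurrent root domain change) are assumptions of this
sketch, not an existing kernel interface, and a real implementation
would need proper atomic or per-runqueue synchronization. The point is
only the control flow: update the runqueues one by one without holding a
root-domain lock, and undo the partial updates if the root domain
changed in the meantime.

```python
from fractions import Fraction


class RootDomain:
    """Toy root domain: per-runqueue utilizations plus a change counter."""

    def __init__(self, nr_rq):
        self.generation = 0              # bumped on every rebuild
        self.util = [Fraction(0)] * nr_rq

    def rebuild(self):
        # Stands in for a cpuset operation that changes the root domain.
        self.generation += 1


def try_admit(rd, affinity, runtime, period, on_step=None):
    """Optimistically account u / N per runqueue; roll back on failure."""
    start_gen = rd.generation
    share = Fraction(runtime, period) / len(affinity)
    applied = []                         # runqueues already updated

    def roll_back():
        for cpu in applied:
            rd.util[cpu] -= share
        return False

    for cpu in affinity:
        if on_step is not None:
            on_step(cpu)                 # hook to simulate a racing change
        # Give up if the root domain changed or this runqueue would fill up.
        if rd.generation != start_gen or rd.util[cpu] + share >= 1:
            return roll_back()
        rd.util[cpu] += share
        applied.append(cpu)

    # Final check: a change may have happened after the last update.
    if rd.generation != start_gen:
        return roll_back()
    return True
```

The caller would then retry (or fail the admission) when try_admit()
returns False because of a concurrent change.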

Anyway, I am adding Tommaso to Cc so that he can comment more.


> I'd also be
> interested to know more about this "online semi-partition approach to
> SMP" you mentioned.

It is basically an implementation (and extension to arbitrary
affinities) of this work:
http://drops.dagstuhl.de/opus/volltexte/2017/7165/


				Luca

> Maybe that's a conversation we could have at the
> upcoming RT summit in Prague.
> 
> >
> > But yes, that's fairly massive surgery. For now I think we'll have to
> > live and accept the limitations. So failing the various cpuset
> > operations when they violate rules seems fine. Relaxing rules is always
> > easier than tightening them (later).  
> 
> Agreed.
> 
> >
> > One 'series' you might be interested in when respinning these is:
> >
> >   https://lkml.kernel.org/r/20171011094833.pdp4torvotvjdmkt@hirez.programming.kicks-ass.net
> >
> > By doing synchronous domain rebuild we lose a bunch of funnies.
> 
> Getting rid of the asynchronous nature of the hotplug path would be a
> delight - I'll start keeping track of that effort as well.
> 
> Thanks for the review,
> Mathieu
