From: Joel Fernandes <joel@joelfernandes.org>
To: Phil Auld <pauld@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
    Nishanth Aravamudan <naravamudan@digitalocean.com>,
    Julien Desfossez <jdesfossez@digitalocean.com>,
    Tim Chen <tim.c.chen@linux.intel.com>,
    mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
    torvalds@linux-foundation.org, vpillai <vpillai@digitalocean.com>,
    linux-kernel@vger.kernel.org, fweisbec@gmail.com,
    keescook@chromium.org, Aaron Lu <aaron.lwe@gmail.com>,
    Aubrey Li <aubrey.intel@gmail.com>,
    aubrey.li@linux.intel.com,
    Valentin Schneider <valentin.schneider@arm.com>,
    Mel Gorman <mgorman@techsingularity.net>,
    Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
    Paolo Bonzini <pbonzini@redhat.com>,
    derkling@google.com
Subject: Re: [PATCH RFC] sched: Add a per-thread core scheduling interface
Date: Thu, 28 May 2020 10:51:46 -0400
Message-ID: <20200528145146.GB87103@google.com>
In-Reply-To: <20200524140046.GA5598@lorien.usersys.redhat.com>

On Sun, May 24, 2020 at 10:00:46AM -0400, Phil Auld wrote:
> On Fri, May 22, 2020 at 05:35:24PM -0400 Joel Fernandes wrote:
> > On Fri, May 22, 2020 at 02:59:05PM +0200, Peter Zijlstra wrote:
> > [..]
> > > > > > It doesn't allow tasks to form their own groups (by for example setting
> > > > > the key to that of another task).
> > > >
> > > > So for this, I was thinking of making the prctl pass in an integer. And 0
> > > > would mean untagged. Does that sound good to you?
> > >
> > > A TID, I think. If you pass your own TID, you tag yourself as
> > > > not-sharing. If you tag yourself with another task's TID, you can do
> > > ptrace tests to see if you're allowed to observe their junk.
> >
> > But that would require a bunch of tasks agreeing on which TID to tag with.
> > For example, if 2 tasks tag with each other's TID, then they would have
> > different tags and not share.
> >
> > What's wrong with passing in an integer instead? In any case, we would do the
> > CAP_SYS_ADMIN check to limit who can do it.
> >
> > Also, one thing CGroup interface allows is an external process to set the
> > cookie, so I am wondering if we should use sched_setattr(2) instead of, or in
> > addition to, the prctl(2). That way, we can drop the CGroup interface
> > completely. How do you feel about that?
> >
>
> I think it should be an arbitrary 64-bit value, in both interfaces to avoid
> any potential reuse security issues.
>
> I think the cgroup interface could be extended not to be a boolean but take
> the value. With 0 being untagged as now.
>
> And sched_setattr could be used to set it on a per task basis.

Yeah, something like this will be needed.

> > > > More seriously, the reason I did it this way is the prctl-tagging is a bit
> > > > incompatible with CGroup tagging:
> > > >
> > > > 1. What happens if 2 tasks are in a tagged CGroup and one of them changes
> > > > their cookie through prctl? Do they still remain in the tagged CGroup but are
> > > > now going to not trust each other? Do they get removed from the CGroup? This
> > > > is why I made the prctl fail with -EBUSY in such cases.

In uclamp's (utilization clamping) design, which has both a task-specific
attribute and a task-group attribute, it seems that the task-specific value
takes priority, then the group one, then the system-wide one.

Perhaps a similar design can be adopted for this interface. So we should
probably let the per-task interface succeed even if the task is already in a
tagged CGroup, and prioritize the per-task value over the group one?

Uclamp's comments:

 * The effective clamp bucket index of a task depends on, by increasing
 * priority:
 * - the task specific clamp value, when explicitly requested from userspace
 * - the task group effective clamp value, for tasks not either in the root
 *   group or in an autogroup
 * - the system default clamp value, defined by the sysadmin

> > > >
> > > > 2. What happens if 2 tagged tasks with different cookies are added to a
> > > > tagged CGroup? Do we fail the addition of the tasks to the group, or do we
> > > > override their cookie (like I'm doing)?
> > >
> > > For #2 I think I prefer failure.
> > >
> > > But having the rationale spelled out in documentation (man-pages for
> > > example) is important.
> >
> > If we drop the CGroup interface, this would avoid both #1 and #2.
> >
>
> I believe both are useful. Personally, I think the per-task setting should
> win over the cgroup tagging. In that case #1 just falls out.

Cool, this is similar to what I mentioned above.

> And #2 pretty
> much as well. Nothing would happen to the tagged task as they were added
> to the cgroup. They'd keep their explicitly assigned tags and everything
> should "just work". There are other reasons to be in a cpu cgroup together
> than just the core scheduling tag.

Well OK, so there's no reason to fail the addition of a prctl-tagged task to
a CGroup then. We can let it succeed but prioritize the task-specific
attribute over the group-specific one.

> There are a few other edge cases, like if you are in a cgroup, but have
> been tagged explicitly with sched_setattr and then get untagged (presumably
> by setting 0) do you get the cgroup tag or just stay untagged? I think based
> on per-task winning you'd stay untagged. I suppose you could move out and
> back in the cgroup to get the tag reapplied (Or maybe the cgroup interface
> could just be reused with the same value to re-tag everyone who's untagged).

If we maintain a task-specific tag and a group-specific tag, then I think
both tags can coexist and the final tag is decided on the priority basis
mentioned above.
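
To make the priority idea concrete, here is a rough sketch of how an
effective tag could be resolved if both a per-task and a per-group tag are
kept. Everything in it is an assumption for illustration only: the struct
names, the explicit_set flag, and the helpers are hypothetical and are not
taken from any posted patch. It also assumes Phil's suggestion that an
explicit per-task value of 0 means "stay untagged" even inside a tagged
CGroup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the kernel's data structures. */
struct group_tag {
	uint64_t cookie;		/* 0 means the group is untagged */
};

struct task_tag {
	bool explicit_set;		/* true once the per-task interface was used */
	uint64_t cookie;		/* per-task value; 0 means explicitly untagged */
	const struct group_tag *group;	/* NULL if the task is in no tagged group */
};

/* The task-specific value wins; otherwise fall back to the group's value. */
static uint64_t effective_cookie(const struct task_tag *t)
{
	if (t->explicit_set)
		return t->cookie;
	if (t->group)
		return t->group->cookie;
	return 0;
}

/* Assumed rule: two tasks may share a core only if effective cookies match. */
static bool may_share_core(const struct task_tag *a, const struct task_tag *b)
{
	return effective_cookie(a) == effective_cookie(b);
}
```

With this shape, adding an explicitly tagged task to a tagged CGroup never
has to fail: the group tag is simply not consulted for that task.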

So before getting into CGroup, I think we should first develop the
task-specific tagging mechanism like Peter was suggesting. So let us talk
about that. I will reply to the other thread Vineeth started while CC'ing
you. In particular, I like Peter's idea about userland passing a TID to share
a core with.
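
For reference, the concern quoted earlier about tasks having to agree on a
TID can be shown with a tiny model. This is only a sketch under a naive
assumption, namely that the tag simply becomes the TID value passed in; the
struct and helper names are hypothetical and the real interface may well
behave differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: the tag is simply the TID value that was passed in. */
struct task {
	int32_t tid;
	uint64_t cookie;	/* 0 means untagged */
};

/* Naive semantics assumed for illustration: the cookie becomes the TID. */
static void tag_with_tid(struct task *t, int32_t tid)
{
	t->cookie = (uint64_t)tid;
}

/* Tasks are treated as sharing only when their cookies are equal. */
static bool same_cookie(const struct task *a, const struct task *b)
{
	return a->cookie == b->cookie;
}
```

Under this model, if task A tags itself with B's TID and B tags itself with
A's TID, the two end up with different cookies and do not share. They share
only when both pick the same TID.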
thanks,
- Joel
>
>
>
> Cheers,
> Phil
>
>
> > thanks,
> >
> > - Joel
> >
>
> --
>