From: Waiman Long <llong@redhat.com>
To: David Hildenbrand <david@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Nico Pache <npache@redhat.com>, Phil Auld <pauld@redhat.com>,
	John Coleman <jocolema@redhat.com>
Subject: Re: [PATCH 1/2] sched/core: Enable full cpumask to clear user cpumask in sched_setaffinity()
Date: Mon, 20 Oct 2025 16:21:36 -0400	[thread overview]
Message-ID: <6967a07f-d48c-4fdf-8adc-414d5127b576@redhat.com> (raw)
In-Reply-To: <21ade241-76b9-4f0a-8e99-be033dcc882c@redhat.com>


On 10/20/25 4:13 PM, David Hildenbrand wrote:
> On 23.09.25 19:54, Waiman Long wrote:
>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>> cpumask"), user-provided CPU affinity via sched_setaffinity(2) is
>> preserved even if the task is being moved to a different cpuset.
>> However, that affinity is also inherited by any subsequently created
>> child processes, which may not want or be aware of that affinity.
>
> So I assume setting the affinity to the full bitmap would then allow 
> any child to essentially reset to the default, correct?
Yes, that is the point.
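
For illustration, here is a minimal userspace sketch of how a task
would back off from its user-provided affinity once this patch is
applied (reset_affinity() is just an illustrative name; glibc wrappers
assumed):

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>

  /* Reset the calling task's affinity to its cpuset default. */
  static int reset_affinity(void)
  {
          cpu_set_t set;
          int cpu;

          /* Set every bit so the kernel sees a full cpumask. */
          CPU_ZERO(&set);
          for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
                  CPU_SET(cpu, &set);

          /*
           * On a patched kernel this clears user_cpus_ptr instead of
           * storing the mask, so children forked afterwards no longer
           * inherit a stale user mask.
           */
          if (sched_setaffinity(0, sizeof(set), &set)) {
                  perror("sched_setaffinity");
                  return -1;
          }
          return 0;
  }

The effect can be verified via the Cpus_allowed_list field in
/proc/self/status (and, with patch 2/2 of this series, the
user_cpus_ptr content shown there).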
>
>>
>> One way to solve this problem is to provide a way to back off from
>> that user-provided CPU affinity.  This patch implements such a scheme
>> by using a full cpumask (a cpumask with all bits set) to signal that
>> the user cpumask should be cleared so the task follows the default
>> affinity allowed by its current cpuset.  In fact, with a full cpumask
>> in user_cpus_ptr, the task behavior should be the same as with a NULL
>> user_cpus_ptr.  This patch just formalizes that behavior without
>> causing any incompatibility, and discards an otherwise useless cpumask.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   kernel/sched/syscalls.c | 20 ++++++++++++++------
>>   1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
>> index 77ae87f36e84..d68c7a4ee525 100644
>> --- a/kernel/sched/syscalls.c
>> +++ b/kernel/sched/syscalls.c
>> @@ -1229,14 +1229,22 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
>>           return retval;
>>         /*
>> -     * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
>> -     * alloc_user_cpus_ptr() returns NULL.
>> +     * If a full cpumask is passed in, clear user_cpus_ptr and reset the
>> +     * current cpu affinity to the default for the current cpuset.
>>        */
>> -    user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
>> -    if (user_mask) {
>> -        cpumask_copy(user_mask, in_mask);
>> +    if (cpumask_full(in_mask)) {
>> +        user_mask = NULL;
>>       } else {
>> -        return -ENOMEM;
>> +        /*
>> +         * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
>> +         * alloc_user_cpus_ptr() returns NULL.
>> +         */
>> +        user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
>> +        if (user_mask) {
>> +            cpumask_copy(user_mask, in_mask);
>> +        } else {
>> +            return -ENOMEM;
>> +        }
>>       }
>>         ac = (struct affinity_context){
>
> Not an expert on this code.
>
> I'm only wondering if there is somehow, some way we could be breaking 
> user space by doing that.
>
I don't think so. Setting user_cpus_ptr to a full cpumask makes the
task strictly follow the cpumask restriction imposed by its current
cpuset, exactly as if user_cpus_ptr weren't set at all.
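
FWIW, the read side already treats a NULL user_cpus_ptr as "no user
restriction": task_user_cpus() in kernel/sched/sched.h (current
mainline, shown abridged here) falls back to cpu_possible_mask, which
is semantically the same as a stored full cpumask:

  static inline const struct cpumask *task_user_cpus(struct task_struct *p)
  {
          if (!p->user_cpus_ptr)
                  return cpu_possible_mask; /* &init_task.cpus_mask */
          return p->user_cpus_ptr;
  }

So discarding a full mask instead of storing it changes nothing for
any consumer of the user mask.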

Cheers,
Longman


Thread overview: 5+ messages
2025-09-23 17:54 [PATCH 1/2] sched/core: Enable full cpumask to clear user cpumask in sched_setaffinity() Waiman Long
2025-09-23 17:54 ` [PATCH 2/2] fs/proc: Show the content of task->user_cpus_ptr in /proc/<pid>/status Waiman Long
2025-10-20 20:06 ` [PATCH 1/2] sched/core: Enable full cpumask to clear user cpumask in sched_setaffinity() Waiman Long
2025-10-20 20:13 ` David Hildenbrand
2025-10-20 20:21   ` Waiman Long [this message]
