From: Waiman Long <llong@redhat.com>
To: "Chen Ridong" <chenridong@huaweicloud.com>,
"Waiman Long" <llong@redhat.com>, "Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Juri Lelli" <juri.lelli@redhat.com>,
"Vincent Guittot" <vincent.guittot@linaro.org>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ben Segall" <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
"Valentin Schneider" <vschneid@redhat.com>,
"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
"Frederic Weisbecker" <frederic@kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Shuah Khan" <shuah@kernel.org>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH/for-next v4 3/4] cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock
Date: Tue, 10 Feb 2026 09:01:30 -0500
Message-ID: <6552b863-6d2b-4537-8155-d87985e77628@redhat.com>
In-Reply-To: <f1c47301-58a6-425b-b248-913a2a7dbaf9@huaweicloud.com>
On 2/9/26 8:29 PM, Chen Ridong wrote:
>
> On 2026/2/10 4:29, Waiman Long wrote:
>> On 2/9/26 2:12 AM, Chen Ridong wrote:
>>>> return;
>>>> }
>>>> - WARN_ON_ONCE(housekeeping_update(isolated_cpus) < 0);
>>>> - isolated_cpus_updating = false;
>>>> + /*
>>>> + * update_isolation_cpumasks() may be called more than once in the
>>>> + * same cpuset_mutex critical section.
>>>> + */
>>>> + lockdep_assert_held(&cpuset_top_mutex);
>>>> + if (isolcpus_twork_queued)
>>>> + return;
>>>> +
>>>> + init_task_work(&twork_cb, isolcpus_tworkfn);
>>>> + if (!task_work_add(current, &twork_cb, TWA_RESUME))
>>>> + isolcpus_twork_queued = true;
>>>> + else
>>>> + WARN_ON_ONCE(1); /* Current task shouldn't be exiting */
>>>> }
>>>>
>>> Timeline:
>>>
>>> user A                          user B
>>> write isolated cpus             write isolated cpus
>>> isolated_cpus_update
>>> update_isolation_cpumasks
>>>   task_work_add
>>>   isolcpus_twork_queued =true
>>>
>>> // before returning userspace
>>> // waiting for worker
>>>                                 isolated_cpus_update
>>>                                 if (isolcpus_twork_queued)
>>>                                     return // Early exit
>>>                                 // return to userspace
>>>
>>> // workqueue finishes
>>> // return to userspace
>>>
>>> For User B, the isolated_cpus value appears to be set and the syscall returns
>>> successfully to userspace. However, because isolcpus_twork_queued was already
>>> true (set by User A), User B's call skipped the actual mask update
>>> (update_isolation_cpumasks).
>>> Thus, the new isolated_cpus value is not yet effective in the kernel, even
>>> though User B's write operation returned without error.
>>>
>>> Is this a valid issue? Should User B's write be blocked?
>> It is perfectly possible that isolated_cpus can be modified more than one time
>> from different tasks before a work or task_work function is executed. When that
>> function is invoked, isolated_cpus should contain changes for both. It will copy
>> isolated_cpus to isolated_hk_cpus and pass it to housekeeping_update(). When the
> It is clear about isolated_hk_cpus and isolated_cpus.
>
>> 2nd work or task_work function is invoked, it will see that isolated_cpus match
>> isolated_hk_cpus and skip the housekeeping_update() action. There is no need to
>> block user B's write as only one task can update isolated_cpus at any time.
>>
> The main question remains: user B receives a success return even though
> isolated_hk_cpus has not yet taken effect (i.e.,
> /sys/devices/system/cpu/isolated does not reflect the change). In that case, how
> can user B confirm whether their configuration is actually applied?
The task_work function is synchronous. IOW, if a user writes to a cpuset
control file to modify an isolated partition, then by the time control
returns to userspace, the task_work function, if queued, is guaranteed
to have been executed.

The wq work function, OTOH, is asynchronous. So if a user takes an
isolated CPU offline and thereby makes an isolated partition invalid, the
corresponding changes to the sched domains may not be complete by the
time the offline operation returns. However, this is not something normal
users should do on a production system anyway, and those who try do so at
their own risk.
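
To make the coalescing behaviour discussed in this thread concrete, here is a
minimal userspace sketch in C. It is not the kernel code. All model_* names
are hypothetical, a 64-bit word stands in for a cpumask, and clearing the
queued flag inside the deferred function is an assumption of this model
rather than something shown in the quoted patch. It only illustrates two
points: the deferred function is queued at most once while the flag is set,
and a later invocation that finds isolated_cpus equal to isolated_hk_cpus
skips the update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one 64-bit word stands in for a cpumask. */
typedef uint64_t model_mask_t;

static model_mask_t isolated_cpus;    /* desired isolation mask */
static model_mask_t isolated_hk_cpus; /* mask last handed to housekeeping */
static bool twork_queued;             /* models isolcpus_twork_queued */
static int hk_update_calls;           /* number of modelled updates */

/* Stand-in for housekeeping_update(); it only counts invocations. */
static void model_housekeeping_update(model_mask_t mask)
{
	(void)mask;
	hk_update_calls++;
}

/* Models a writer changing isolated_cpus under the cpuset locks. */
static void model_set_isolated(model_mask_t mask)
{
	isolated_cpus = mask;
}

/* Queue the deferred function at most once; true if newly queued. */
static bool model_queue_deferred(void)
{
	if (twork_queued)
		return false;
	twork_queued = true;
	return true;
}

/*
 * Models the deferred work or task_work function. Assumption: the flag
 * is cleared here. The update is skipped when nothing has changed.
 */
static void model_deferred_fn(void)
{
	twork_queued = false;
	if (isolated_cpus == isolated_hk_cpus)
		return;
	isolated_hk_cpus = isolated_cpus;
	model_housekeeping_update(isolated_hk_cpus);
}

/* Two changes land before the deferred function runs twice. */
static void model_demo(void)
{
	model_set_isolated(0x3);
	assert(model_queue_deferred());  /* first request queues */
	assert(!model_queue_deferred()); /* second request is a no-op */
	model_set_isolated(0x7);         /* second change before it runs */
	model_deferred_fn();             /* sees 0x7, updates once */
	model_deferred_fn();             /* masks match, update skipped */
}
```

Calling model_demo() from a main() function leaves hk_update_calls at 1 and
isolated_hk_cpus at 0x7, i.e. both changes are applied by a single update.
The sketch does not model which task runs the deferred function or when it
runs relative to the return to userspace.
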
Cheers,
Longman
Thread overview: 22+ messages
2026-02-06 20:37 [PATCH/for-next v4 0/4] cgroup/cpuset: Fix partition related locking issues Waiman Long
2026-02-06 20:37 ` [PATCH/for-next v4 1/4] cgroup/cpuset: Clarify exclusion rules for cpuset internal variables Waiman Long
2026-02-09 3:41 ` Chen Ridong
2026-02-09 19:58 ` Waiman Long
2026-02-06 20:37 ` [PATCH/for-next v4 2/4] cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue Waiman Long
2026-02-06 22:28 ` Frederic Weisbecker
2026-02-08 2:00 ` Waiman Long
2026-02-10 15:46 ` Frederic Weisbecker
2026-02-10 18:53 ` Waiman Long
2026-02-09 6:57 ` Chen Ridong
2026-02-06 20:37 ` [PATCH/for-next v4 3/4] cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock Waiman Long
2026-02-09 7:12 ` Chen Ridong
2026-02-09 20:29 ` Waiman Long
2026-02-10 1:29 ` Chen Ridong
2026-02-10 14:01 ` Waiman Long [this message]
2026-02-09 7:23 ` Chen Ridong
2026-02-09 20:20 ` Waiman Long
2026-02-10 1:39 ` Chen Ridong
2026-02-10 14:39 ` Waiman Long
2026-02-06 20:37 ` [PATCH/for-next v4 4/4] cgroup/cpuset: Eliminate some duplicated rebuild_sched_domains() calls Waiman Long
2026-02-09 7:53 ` Chen Ridong
2026-02-09 20:47 ` Waiman Long