From: Thomas Gleixner <tglx@linutronix.de>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Gabriele Monaco <gmonaco@redhat.com>,
Michael Jeanson <mjeanson@efficios.com>,
Jens Axboe <axboe@kernel.dk>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
Florian Weimer <fweimer@redhat.com>,
Tim Chen <tim.c.chen@intel.com>,
Yury Norov <yury.norov@gmail.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>
Subject: Re: [patch V3 17/20] sched/mmcid: Provide CID ownership mode fixup functions
Date: Fri, 31 Oct 2025 17:54:07 +0100
Message-ID: <87wm4brp00.ffs@tglx>
In-Reply-To: <c2e4fed9-b207-4d28-93f5-b09f0fe78e35@efficios.com>
On Thu, Oct 30 2025 at 11:51, Mathieu Desnoyers wrote:
> On 2025-10-29 09:09, Thomas Gleixner wrote:
>> At the point of switching to per CPU mode the new user is not yet visible
>> in the system, so the task which initiated the fork() runs the fixup
>> function: mm_cid_fixup_tasks_to_cpus() walks the thread list and either
>> transfers each task's owned CID to the CPU the task runs on or drops it into
>> the CID pool if a task is not on a CPU at that point in time. Tasks which
>> schedule in before the task walk reaches them do the handover in
>> mm_cid_schedin(). When mm_cid_fixup_tasks_to_cpus() completes it's
>> guaranteed that no task related to that MM owns a CID anymore.
>>
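A minimal user-space sketch of that walk, as described in the changelog text above. It is not the kernel code: struct task, struct mm_model, CID_UNSET and the body of fixup_tasks_to_cpus() are hypothetical stand-ins made up for illustration, and the locking as well as the concurrent handover in mm_cid_schedin() are left out.

```c
#include <stddef.h>

#define MODEL_NR_CPUS	4
#define CID_UNSET	(-1)

/* Hypothetical stand-ins for the kernel's task and MM CID state */
struct task {
	int	cid;	/* CID owned by the task, or CID_UNSET */
	int	cpu;	/* CPU the task runs on, -1 if not on a CPU */
};

struct mm_model {
	int		cpu_cid[MODEL_NR_CPUS];	/* CID owned by each CPU */
	unsigned long	pool;			/* Bit N set: CID N is allocated */
};

/* Walk all tasks and strip task ownership from every CID */
static void fixup_tasks_to_cpus(struct mm_model *mm, struct task *tasks, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct task *t = &tasks[i];

		if (t->cid == CID_UNSET)
			continue;
		if (t->cpu >= 0)
			mm->cpu_cid[t->cpu] = t->cid;	/* Running: hand over to the CPU */
		else
			mm->pool &= ~(1UL << t->cid);	/* Not on a CPU: back into the pool */
		t->cid = CID_UNSET;
	}
}
```

After the loop no task in the model owns a CID, which is the guarantee the text states for the real function.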
>> Switching back to task mode happens when the user count goes below the
>> threshold which was recorded on the per CPU mode switch:
>>
>> pcpu_thrs = min(opt_cids - (opt_cids / 4), nr_cpu_ids / 2);
>>
>
> AFAIU this provides a hysteresis so we don't switch back and
> forth between modes if a single thread is forked/exits repeatedly,
> right?
Yes. We could do that with a timer too, but the hysteresis worked fine
so far.
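
For illustration, the threshold formula quoted above can be written as a small stand-alone helper. This is a sketch, not the code from the patch: pcpu_threshold() and drop_to_task_mode() are hypothetical names, and nr_cpus stands for whichever CPU count the series ends up using.

```c
#include <stdbool.h>

/* Hypothetical helper which mirrors the quoted formula */
static unsigned int pcpu_threshold(unsigned int opt_cids, unsigned int nr_cpus)
{
	unsigned int by_cids = opt_cids - (opt_cids / 4);
	unsigned int by_cpus = nr_cpus / 2;

	return by_cids < by_cpus ? by_cids : by_cpus;
}

/* The switch back to task mode happens when users drop below the threshold */
static bool drop_to_task_mode(unsigned int users, unsigned int pcpu_thrs)
{
	return users < pcpu_thrs;
}
```

With opt_cids = 16 and 64 CPUs the recorded threshold is 12, so the user count has to fall to 11 before the mode switches back. Assuming the switch to per CPU mode happened at a higher user count, a single thread which is forked and exits repeatedly does not flip the mode.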
>> This transition from CPU to per task ownership happens in two phases:
>>
>> 1) mm:mm_cid.transit contains MM_CID_TRANSIT. This is OR'ed on the task
>> CID and denotes that the CID is only temporarily owned by the
>> task. When it schedules out the task drops the CID back into the
>> pool if this bit is set.
>
> OK, so the mm_drop_cid() on sched out only happens due to a transition
> from per-cpu back to per-task. This answers my question in the previous
> patch.
:)
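
A rough model of that first phase, to make the bit handling concrete. The bit position, MODEL_CID_UNSET and all function names are assumptions for illustration. The real MM_CID_TRANSIT value and the sched out path are defined by the series, not by this sketch.

```c
#include <stdbool.h>

/* Assumed values, picked for the model only */
#define MODEL_CID_TRANSIT	(1U << 30)
#define MODEL_CID_UNSET		(~0U)

/* Mark a CID as only temporarily owned by the task */
static inline unsigned int cid_set_transit(unsigned int cid)
{
	return cid | MODEL_CID_TRANSIT;
}

static inline bool cid_in_transit(unsigned int cid)
{
	return (cid & MODEL_CID_TRANSIT) != 0;
}

/* Strip the transit bit to get the plain CID number */
static inline unsigned int cid_value(unsigned int cid)
{
	return cid & ~MODEL_CID_TRANSIT;
}

/*
 * Model of sched out: a CID held in transit goes back into the pool.
 * Bit N set in @pool means CID N is allocated. Returns true if dropped.
 */
static bool model_schedout(unsigned int *task_cid, unsigned long *pool)
{
	if (*task_cid == MODEL_CID_UNSET || !cid_in_transit(*task_cid))
		return false;

	*pool &= ~(1UL << cid_value(*task_cid));
	*task_cid = MODEL_CID_UNSET;
	return true;
}
```

A task which owns its CID without the transit bit keeps it across sched out. Only the transitional ownership is dropped.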
>> + * Switching back to task mode happens when the user count goes below the
>> + * threshold which was recorded on the per CPU mode switch:
>> + *
>> + * pcpu_thrs = min(opt_cids - (opt_cids / 4), num_possible_cpus() / 2);
>
> I notice that mm_update_cpus_allowed() calls __mm_update_max_cids()
> before updating the pcpu_thrs threshold.
>
> sched_mm_cid_{add,remove}_user() only invoke mm_update_max_cids(mm)
> without updating pcpu_thrs first.
>
> Are those done on purpose?

Yes. Update of pcpu_thrs is only possible when a resulting transition
can be handled in the calling context. max_cids update is always possible.

That's why mm_update_cpus_allowed() only updates max_cids and then
schedules work to defer a potential transition to the worker thread
context.

sched_mm_cid_{add,remove}_user() does:

    mm_update_max_cids()
      __mm_update_max_cids()  <- Updates max_cids
      update threshold and potentially switch ownership mode

As this holds the mutex it prevents new tasks coming in or other tasks
exiting until it has completed the transition.

mm_cid_work_fn() does the same thing unless a
sched_mm_cid_{add,remove}_user() already handled it.

Thanks,

        tglx