From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
    Gabriele Monaco <gmonaco@redhat.com>,
    Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
    Michael Jeanson <mjeanson@efficios.com>,
    Jens Axboe <axboe@kernel.dk>,
    "Paul E. McKenney" <paulmck@kernel.org>,
    "Gautham R. Shenoy" <gautham.shenoy@amd.com>,
    Florian Weimer <fweimer@redhat.com>,
    Tim Chen <tim.c.chen@intel.com>,
    TCMalloc Team <tcmalloc-eng@google.com>
Subject: Re: [patch 00/19] sched: Rewrite MM CID management
Date: Fri, 17 Oct 2025 09:09:33 +0200
Message-ID: <877bwu0zte.ffs@tglx>
In-Reply-To: <20251015164952.694882104@linutronix.de>

On Wed, Oct 15 2025 at 19:29, Thomas Gleixner wrote:
> Thread create teardown
>
> I wrote a micro benchmark which spawns pools, each of which creates
> threads and lets them die right after creation. The 32 pools / 32
> threads case triggers the ownership mode change in both directions.
> The source is appended at the end of this mail.
>
> I initially wrote it to stress the mode change mechanics, but then I
> noticed the massive difference when I ran it on upstream:
>
>                 8 pools / 8 threads      32 pools / 32 threads
>
> v6.17           23666 thr/sec            16161 thr/sec
> +rseq/perf      23656 thr/sec    0%      16196 thr/sec    0%
> +cid rework     32025 thr/sec  +35%      21004 thr/sec  +30%
>
> Both v6.17 and v6.17 + rseq/perf show this in perf top:
>
>   14.62%  [kernel]  [k] update_sd_lb_stats.constprop.0
>   13.08%  [kernel]  [k] native_queued_spin_lock_slowpath
>    4.66%  [kernel]  [k] osq_lock
>    3.06%  [kernel]  [k] _find_next_and_bit
>    2.21%  [kernel]  [k] __schedule
>    2.16%  [kernel]  [k] sched_balance_rq
>
> with the CID rewrite this becomes:
>
>   13.48%  [kernel]  [k] native_queued_spin_lock_slowpath
>    8.98%  [kernel]  [k] update_sd_lb_stats.constprop.0
>    5.16%  [kernel]  [k] osq_lock
>    2.28%  [kernel]  [k] _find_next_and_bit
>    2.11%  [kernel]  [k] __schedule
>    1.75%  [kernel]  [k] psi_group_change
>    ...
>    1.32%  [kernel]  [k] sched_balance_rq
>
> I haven't been able to understand that massive difference yet.

I looked deeper into it and it turns out that the problem is caused by
the upstream MM_CID implementation. The extra work in the task migration
code increases the rq lock hold time enough to cause that difference.

When I make CONFIG_SCHED_MM_CID a real knob and disable it on top of
rseq/perf, the result is on par with the rewritten CID code. Toggling
it on top of the CID rewrite series does not really make a difference.
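
For readers without the original mail at hand, the shape of such a
thread create/teardown loop can be sketched roughly as below. This is
not the benchmark from the series (its source is appended to the cover
letter), only an illustration under assumptions: pools are modelled as
threads of one process, so all threads share a single mm, and the names
pool_worker() and run_benchmark() are made up for this sketch. Python
interpreter overhead dominates here, so the absolute thr/sec values are
not comparable to the numbers quoted above.

```python
import threading
import time


def pool_worker(threads_per_pool, deadline, counts, idx):
    """One pool: start a batch of threads that exit at once, join, repeat."""
    created = 0
    while True:
        batch = [threading.Thread(target=lambda: None)
                 for _ in range(threads_per_pool)]
        for t in batch:
            t.start()
        for t in batch:
            t.join()
        created += threads_per_pool
        # Always run at least one batch, then stop once time is up.
        if time.monotonic() >= deadline:
            break
    counts[idx] = created


def run_benchmark(pools, threads_per_pool, seconds=1.0):
    """Return (threads created, threads created per second)."""
    counts = [0] * pools
    start = time.monotonic()
    deadline = start + seconds
    workers = [threading.Thread(target=pool_worker,
                                args=(threads_per_pool, deadline, counts, i))
               for i in range(pools)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    elapsed = time.monotonic() - start
    total = sum(counts)
    return total, total / elapsed


if __name__ == "__main__":
    for pools, threads in ((8, 8), (32, 32)):
        total, rate = run_benchmark(pools, threads)
        print(f"{pools} pools / {threads} threads: {rate:.0f} thr/sec")
```

Running the same loop on kernels with and without CONFIG_SCHED_MM_CID
would be one way to look for the effect described above, assuming the
thread creation rate is high enough to put pressure on the rq lock.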

Thanks,

        tglx