The Linux Kernel Mailing List
From: Thomas Gleixner <tglx@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>,
	Waiman Long <longman@redhat.com>
Cc: Jing Wu <realwujing@gmail.com>,
	linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
	cgroups@vger.kernel.org, Qiliang Yuan <yuanql9@chinatelecom.cn>
Subject: Re: [PATCH-next 00/23] cgroup/cpuset: Enable runtime update of nohz_full and managed_irq CPUs
Date: Thu, 02 Jul 2026 17:00:03 +0200
Message-ID: <871pdlphcc.ffs@fw13>
In-Reply-To: <akUii2CyEi7SRid7@localhost.localdomain>

On Wed, Jul 01 2026 at 16:22, Frederic Weisbecker wrote:
> On Thu, Jun 25, 2026 at 01:27:54AM -0400, Waiman Long wrote:
>> That will require some adjustments to the nohz_full related hotplug
>> functions. I have some ideas of what needs to be done. However, I haven't
>> looked into RCU yet. I know RCU supports changing the nocb mask for fully
>> offline CPUs; I will need to find out if it is possible to do that for
>> partially offline CPUs.
>
> No, because callbacks can still be enqueued at this stage. But we could
> manage to make it work with CPUHP_AP_IDLE_DEAD.

Well, if you go down to CPUHP_AP_IDLE_DEAD then that's not any different
from going down all the way because the latency spike of stomp_machine()
for bringing it down is the same.

You are right that with the current code this is not possible, but it
should be possible to avoid that altogether.

The only critical path is when a CPU switches to offload mode. Switching
to 'yes queue callbacks here' mode is not really interesting.

Let's look at how RCU hot-unplug works:

  1) CPU is marked !active

  2) rcutree_offline_cpu() removes the CPU from the fully functional CPU
     mask
  
  3) stomp_machine()

  4) rcutree_cpu_dying() just traces that the CPU is about to vanish

  5) Wait for the CPU to report DEAD

  6) rcutree_migrate_callbacks() mops up the leftover callbacks on the
     dead CPU
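
As a rough illustration of that ordering, here is a user-space toy in C.
It is not kernel code: struct model_cpu, model_queue_cb(),
model_migrate_cbs() and model_unplug() are made-up names, and a callback
queue is reduced to a counter. It only shows why step 6 exists: callbacks
can still land on the outgoing CPU after step 2, so something has to mop
them up once the CPU is dead.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): one callback queue per CPU, as a counter. */
struct model_cpu {
	bool active;   /* step 1 clears this */
	bool dead;     /* set once the CPU has reported DEAD (step 5) */
	int  pending;  /* callbacks queued but not yet invoked */
};

/* call_rcu() stand-in: a CPU keeps queueing locally until it is dead. */
bool model_queue_cb(struct model_cpu *cpu)
{
	if (cpu->dead)
		return false;
	cpu->pending++;
	return true;
}

/* Step 6 stand-in: splice leftovers from the dead CPU onto a survivor. */
int model_migrate_cbs(struct model_cpu *from, struct model_cpu *to)
{
	int moved = from->pending;

	assert(from->dead);
	to->pending += moved;
	from->pending = 0;
	return moved;
}

/* Walk steps 1-6 for one outgoing CPU; returns the number migrated. */
int model_unplug(struct model_cpu *out, struct model_cpu *survivor)
{
	out->active = false;   /* 1) marked !active */
	model_queue_cb(out);   /* 2) queueing still works after this step */
	model_queue_cb(out);   /* 3)-4) ... and right up to the point of dying */
	out->dead = true;      /* 5) CPU reported DEAD */
	return model_migrate_cbs(out, survivor);   /* 6) mop up */
}
```

Calling model_unplug() on a CPU that already has one callback pending
moves three callbacks to the survivor, which is exactly the window the
rest of this mail tries to close.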

So if the whole machinery changes to:

  1) CPU is marked !active

  2) rcutree_offline_cpu() removes the CPU from the fully functional CPU
     mask _AND_ marks the CPU as "lightweight offloaded", which means:

        - no new callbacks can be queued on it anymore, neither from the
          CPU itself nor from truly offloaded CPUs

        - the CPU is still processing already queued callbacks and
          participates in the GP magic

  3) Before CPUHP_AP_SCHED_WAIT_EMPTY add a new CPUHP_AP_RCU_SYNC state,
     which does:

       - a full RCU synchronization to end all outstanding read side
         critical sections

       - drain the now ready callbacks on this CPU

  4) Proceed to CPUHP_TEARDOWN_CPU, where the operation stops

  5) Do the magic cpuset changes for the CPU

  6) Bring CPU back up
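
The partial teardown in steps 1-6 can be pictured as a walk down an
ordered state list that stops early. The following is a user-space sketch
in C under loud assumptions: the enum is an abbreviated stand-in for the
hotplug state list, MODEL_RCU_SYNC stands for the proposed state and does
not exist anywhere, and its position (torn down ahead of the WAIT_EMPTY
state on the way down) is only one reading of "before". All function and
type names are hypothetical.

```c
#include <assert.h>

/* Abbreviated, hypothetical stand-in for the state list, lowest first. */
enum model_state {
	MODEL_OFFLINE,
	MODEL_IDLE_DEAD,
	MODEL_TEARDOWN_CPU,      /* leaving this state downwards stops the machine */
	MODEL_SCHED_WAIT_EMPTY,
	MODEL_RCU_SYNC,          /* assumption: the proposed new state */
	MODEL_RCUTREE_ONLINE,    /* teardown side: rcutree_offline_cpu() */
	MODEL_ACTIVE,            /* teardown side: CPU is marked !active */
	MODEL_ONLINE,
};

struct model_walk {
	enum model_state state;
	int teardowns_run;       /* number of teardown callbacks invoked */
	int stop_machine_calls;  /* number of machine-wide stops incurred */
};

/* Run teardown callbacks one state at a time until @target is reached. */
void model_cpu_down_to(struct model_walk *w, enum model_state target)
{
	while (w->state > target) {
		/* Only the teardown of TEARDOWN_CPU itself is the expensive one. */
		if (w->state == MODEL_TEARDOWN_CPU)
			w->stop_machine_calls++;
		w->state = (enum model_state)(w->state - 1);
		w->teardowns_run++;
	}
}

/* Step 6 stand-in: bring the CPU all the way back up. */
void model_cpu_up(struct model_walk *w)
{
	w->state = MODEL_ONLINE;
}
```

In this model, stopping at MODEL_TEARDOWN_CPU runs the teardown
callbacks of every state above it, including the hypothetical sync
state, and never reaches the machine-wide stop. Going further down to
MODEL_IDLE_DEAD or below incurs it once.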

At #4 the half unplugged CPU is not in NOHZ full mode and the tick keeps
running, so all GP processing works as before except that the CPU itself
is not handling any callbacks, because all queued ones are drained and no
new ones can be queued. When it comes back up it turns into a fully
offloaded one.
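
To make the invariant at #4 concrete, here is a toy user-space model in C
of the "lightweight offloaded" flag and the drain step. Everything in it
is hypothetical: the struct, the function names, and in particular the
choice to divert new callbacks to a fallback CPU, since where they really
end up is one of the open details. Queues are counters and the grace
period wait is folded into a single call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the model. */
struct model_rcu_cpu {
	bool lw_offloaded;  /* assumption: the "lightweight offloaded" mark */
	bool offloaded;     /* fully offloaded after coming back up */
	bool tick_running;  /* stays true while the CPU is half unplugged */
	int  pending;       /* callbacks queued on this CPU */
};

/*
 * Queue one callback aimed at @cpu. Once the mark is set nothing new
 * lands on @cpu; this model diverts it to @fallback (an assumption).
 * Returns the CPU that actually received the callback.
 */
struct model_rcu_cpu *model_queue(struct model_rcu_cpu *cpu,
				  struct model_rcu_cpu *fallback)
{
	struct model_rcu_cpu *dst = cpu->lw_offloaded ? fallback : cpu;

	dst->pending++;
	return dst;
}

/* Sync stand-in: wait for readers, then invoke what is already queued. */
int model_rcu_sync_and_drain(struct model_rcu_cpu *cpu)
{
	int invoked = cpu->pending;

	assert(cpu->lw_offloaded);  /* otherwise the queue could refill */
	cpu->pending = 0;
	return invoked;
}

/* Steps 5-6 stand-in: switch mode while the queue is empty, then come up. */
void model_bring_up_offloaded(struct model_rcu_cpu *cpu)
{
	assert(cpu->pending == 0);
	cpu->offloaded = true;
	cpu->lw_offloaded = false;
}
```

The two assert() calls inside the model carry the argument: the drain is
only final because the mark was set first, and the mode switch is only
safe because the queue is empty when it happens.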

There are obviously a gazillion details and corner cases to handle,
but I don't see why this can't be made to work in principle.

Thanks,

        tglx


     
