public inbox for rcu@vger.kernel.org
From: Waiman Long <longman@redhat.com>
To: paulmck@kernel.org, "Tejun Heo" <tj@kernel.org>,
	"Chen Ridong" <chenridong@huaweicloud.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org, frederic@kernel.org
Subject: Re: [BUG] cgroups/cpusets: Spurious CPU-hotplug failures
Date: Wed, 18 Mar 2026 11:02:16 -0400
Message-ID: <5f142f48-653d-430b-90a6-400f87c88921@redhat.com>
In-Reply-To: <049415be-0be8-4e01-bba9-530e302bf655@paulmck-laptop>

On 3/18/26 8:53 AM, Paul E. McKenney wrote:
> Hello!
>
> Running rcutorture on v7.0-rc3 results in spurious CPU-hotplug failures,
> most frequently on the TREE03 scenario, which suffers about ten such
> failures per hundred hours of test time.  Repeat-by is as follows:
>
> tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 80 --duration 100h --configs "100*TREE03" --trust-make
>
> Though a faster repeat-by instead uses kvm-remote.sh and lots of systems.
>
> Bisection converges here:
>
> 6df415aa46ec ("cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue")
>
> Reverting this commit gets rid of the spurious CPU-hotplug failures.
> Of course, this also gets rid of some ability to do dynamic nohz_full
> processing.
>
> Now, the problem might be that the workqueue handler might still be
> in flight by the time that rcutorture fired up the next CPU-hotplug
> operation, especially given that the TREE03 scenario only waits 200
> milliseconds between these operations.  This suggests waiting for this
> handler before ending each CPU-hotplug operation.  And the crude patch
> below does make the problem go away.
>
> This alleged fix is quite heavy-handed, and also fragile in that if
> hk_sd_workfn() uses a different workqueue, this breaks.  It might be
> better to call into the cgroups/cpusets code and to use flush_work()
> to wait only on hk_sd_workfn() and nothing else.  But it seemed best to
> keep things trivial to start with.
>
> Either way, please consider the patch below to be part of this bug report
> rather than a proper fix.
>
> Thoughts?
>
> 							Thanx, Paul

There is a fix commit, ca174c705db5 ("cgroup/cpuset: Call
rebuild_sched_domains() directly in hotplug"), in rc4 that may help. Could
you try the rc4 kernel to see whether it resolves the problem?

Thanks,
Longman


