From: Frederic Weisbecker <frederic@kernel.org>
To: Waiman Long <llong@redhat.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
Marco Crivellari <marco.crivellari@suse.com>,
Michal Hocko <mhocko@suse.com>,
Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>,
linux-pci@vger.kernel.org
Subject: Re: [PATCH 02/33] PCI: Protect against concurrent change of housekeeping cpumask
Date: Thu, 18 Sep 2025 16:00:17 +0200
Message-ID: <aMwQcVZeTwuk2Q8A@localhost.localdomain>
In-Reply-To: <458c5db8-0c31-4c02-9c41-b7eca851d04a@redhat.com>
On Fri, Aug 29, 2025 at 06:01:17PM -0400, Waiman Long wrote:
> On 8/29/25 11:47 AM, Frederic Weisbecker wrote:
> > HK_TYPE_DOMAIN will soon integrate cpuset isolated partitions and
> > therefore be made modifiable at runtime. Synchronize against the cpumask
> > update using RCU.
> >
> > The RCU locked section includes both the housekeeping CPU target
> > election for the PCI probe work and the work enqueue.
> >
> > This way the housekeeping update side will simply need to flush the
> > pending related works after updating the housekeeping mask in order to
> > make sure that no PCI work ever executes on an isolated CPU.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> > drivers/pci/pci-driver.c | 40 +++++++++++++++++++++++++++++++---------
> > 1 file changed, 31 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 63665240ae87..cf2b83004886 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -302,9 +302,8 @@ struct drv_dev_and_id {
> >  	const struct pci_device_id *id;
> >  };
> >
> > -static long local_pci_probe(void *_ddi)
> > +static int local_pci_probe(struct drv_dev_and_id *ddi)
> >  {
> > -	struct drv_dev_and_id *ddi = _ddi;
> >  	struct pci_dev *pci_dev = ddi->dev;
> >  	struct pci_driver *pci_drv = ddi->drv;
> >  	struct device *dev = &pci_dev->dev;
> > @@ -338,6 +337,19 @@ static long local_pci_probe(void *_ddi)
> >  	return 0;
> >  }
> >
> > +struct pci_probe_arg {
> > +	struct drv_dev_and_id *ddi;
> > +	struct work_struct work;
> > +	int ret;
> > +};
> > +
> > +static void local_pci_probe_callback(struct work_struct *work)
> > +{
> > +	struct pci_probe_arg *arg = container_of(work, struct pci_probe_arg, work);
> > +
> > +	arg->ret = local_pci_probe(arg->ddi);
> > +}
> > +
> >  static bool pci_physfn_is_probed(struct pci_dev *dev)
> >  {
> >  #ifdef CONFIG_PCI_IOV
> > @@ -362,34 +374,44 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
> >  	dev->is_probed = 1;
> >
> >  	cpu_hotplug_disable();
> > -
> >  	/*
> >  	 * Prevent nesting work_on_cpu() for the case where a Virtual Function
> >  	 * device is probed from work_on_cpu() of the Physical device.
> >  	 */
> >  	if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
> >  	    pci_physfn_is_probed(dev)) {
> > -		cpu = nr_cpu_ids;
> > +		error = local_pci_probe(&ddi);
> >  	} else {
> >  		cpumask_var_t wq_domain_mask;
> > +		struct pci_probe_arg arg = { .ddi = &ddi };
> > +
> > +		INIT_WORK_ONSTACK(&arg.work, local_pci_probe_callback);
> >
> >  		if (!zalloc_cpumask_var(&wq_domain_mask, GFP_KERNEL)) {
> >  			error = -ENOMEM;
> >  			goto out;
> >  		}
> > +
> > +		rcu_read_lock();
> >  		cpumask_and(wq_domain_mask,
> >  			    housekeeping_cpumask(HK_TYPE_WQ),
> >  			    housekeeping_cpumask(HK_TYPE_DOMAIN));
> >
> >  		cpu = cpumask_any_and(cpumask_of_node(node),
> >  				      wq_domain_mask);
> > +		if (cpu < nr_cpu_ids) {
> > +			schedule_work_on(cpu, &arg.work);
> > +			rcu_read_unlock();
> > +			flush_work(&arg.work);
> > +			error = arg.ret;
> > +		} else {
> > +			rcu_read_unlock();
> > +			error = local_pci_probe(&ddi);
> > +		}
> > +
> >  		free_cpumask_var(wq_domain_mask);
> > +		destroy_work_on_stack(&arg.work);
> >  	}
> > -
> > -	if (cpu < nr_cpu_ids)
> > -		error = work_on_cpu(cpu, local_pci_probe, &ddi);
> > -	else
> > -		error = local_pci_probe(&ddi);
> >  out:
> >  	dev->is_probed = 0;
> >  	cpu_hotplug_enable();
>
> A question: is the purpose of open-coding work_on_cpu() to avoid calling
> INIT_WORK_ONSTACK() and destroy_work_on_stack() inside an RCU read-side
> critical section? Both may call debugobjects code, and I don't know
> whether that is allowed inside an rcu_read_lock() critical section.
>
> Cheers, Longman
No, the point is that I need to keep the target selection
(the housekeeping_cpumask() read) and the work enqueue within the same
RCU read-side critical section, so that things are synchronized this way:

    CPU 0                                           CPU 1
    -----                                           -----
    rcu_read_lock()                                 housekeeping_update()
    cpu = cpumask_any(housekeeping_cpumask(...))    housekeeping_cpumask &= ~val
    queue_work_on(cpu, pci_probe_wq, work)          synchronize_rcu()
    rcu_read_unlock()                               flush_workqueue(pci_probe_wq)
    flush_work(work)

And I can't include the whole work_on_cpu() within rcu_read_lock() because
flush_work() may sleep.
Also now that you mention it, I need to create that pci_probe_wq and flush it :-)
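
The ordering argument above can be illustrated with a small single-threaded
userspace model. This is only a sketch under loud assumptions: struct hk_model
and every helper below are hypothetical names invented for illustration, not
kernel APIs. The readers counter is a coarse stand-in for RCU (a real grace
period only waits for readers that started before synchronize_rcu() was
called), and pending[] stands in for the pci_probe_wq that still has to be
created.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4
#define NO_CPU  NR_CPUS

/* Toy state (hypothetical, for illustration only). */
struct hk_model {
	bool housekeeping[NR_CPUS]; /* stand-in for housekeeping_cpumask() */
	int  pending[NR_CPUS];      /* works queued per CPU on the probe wq */
	int  readers;               /* tasks inside rcu_read_lock() */
};

/* Reader, step 1: rcu_read_lock() + housekeeping target election. */
static int reader_lock_and_pick(struct hk_model *m)
{
	m->readers++;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (m->housekeeping[cpu])
			return cpu;
	return NO_CPU;
}

/* Reader, step 2: queue_work_on() + rcu_read_unlock(), in that order. */
static void reader_queue_and_unlock(struct hk_model *m, int cpu)
{
	assert(m->readers > 0);
	if (cpu != NO_CPU)
		m->pending[cpu]++;
	m->readers--;
}

/* Updater, step 1: housekeeping_cpumask &= ~val. */
static void updater_isolate(struct hk_model *m, int cpu)
{
	m->housekeeping[cpu] = false;
}

/* Updater, step 2: synchronize_rcu() may only return once no reader is
 * left inside its critical section (coarse model, see above). */
static bool updater_grace_period_done(const struct hk_model *m)
{
	return m->readers == 0;
}

/* Updater, step 3: flush_workqueue() runs everything already queued. */
static int updater_flush(struct hk_model *m)
{
	int flushed = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		flushed += m->pending[cpu];
		m->pending[cpu] = 0;
	}
	return flushed;
}

/* Worst case: the reader elects CPU 0 just before CPU 0 gets isolated.
 * Returns the number of works caught by the flush, or -1 if the model
 * let the updater get ahead of the reader. */
static int racing_probe_scenario(void)
{
	struct hk_model m = { .housekeeping = { true, true, true, true } };
	int cpu = reader_lock_and_pick(&m);   /* sees the old mask: CPU 0 */

	updater_isolate(&m, 0);
	if (updater_grace_period_done(&m))
		return -1;                    /* updater must still wait */

	reader_queue_and_unlock(&m, cpu);     /* stale work is now queued */
	if (!updater_grace_period_done(&m))
		return -1;

	return updater_flush(&m);             /* the flush catches it */
}
```

In this model racing_probe_scenario() returns 1: the work queued against the
stale mask is already on the queue when the flush runs. Were the reader to
unlock before queueing, the grace period could end and the flush could run
first, and the work would then land on an isolated CPU unnoticed.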
--
Frederic Weisbecker
SUSE Labs