From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
Stewart Hildebrand <stewart.hildebrand@amd.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
George Dunlap <george.dunlap@citrix.com>,
Jan Beulich <jbeulich@suse.com>, Julien Grall <julien@xen.org>,
Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
Paul Durrant <paul@xen.org>, Kevin Tian <kevin.tian@intel.com>
Subject: Re: [PATCH v9 01/16] pci: introduce per-domain PCI rwlock
Date: Tue, 19 Sep 2023 16:09:39 +0200 [thread overview]
Message-ID: <ZQmro968jYGOb5sY@MacBookPdeRoger> (raw)
In-Reply-To: <20230829231912.4091958-2-volodymyr_babchuk@epam.com>
On Tue, Aug 29, 2023 at 11:19:42PM +0000, Volodymyr Babchuk wrote:
> Add per-domain d->pci_lock that protects access to
> d->pdev_list. Purpose of this lock is to give guarantees to VPCI code
> that underlying pdev will not disappear under feet. This is a rw-lock,
> but this patch adds only write_lock()s. There will be read_lock()
> users in the next patches.
>
> This lock should be taken in write mode every time d->pdev_list is
> altered. This covers both accesses to d->pdev_list and accesses to
> pdev->domain_list fields.
Why do you mention pdev->domain_list here? I don't think the lock
covers accesses to pdev->domain_list, unless that domain_list field
happens to be part of the linked list in d->pdev_list. I find it kind
of odd to mention here.
> All write accesses also should be protected
> by pcidevs_lock() as well. Idea is that any user that wants read
> access to the list or to the devices stored in the list should use
> either this new d->pci_lock or old pcidevs_lock(). Usage of any of
> this two locks will ensure only that pdev of interest will not
> disappear from under feet and that the pdev still will be assigned to
> the same domain. Of course, any new users should use pcidevs_lock()
> when it is appropriate (e.g. when accessing any other state that is
> protected by the said lock). In case both the newly introduced
> per-domain rwlock and the pcidevs lock is taken, the later must be
> acquired first.
>
> Any write access to pdev->domain_list should be protected by both
> pcidevs_lock() and d->pci_lock in the write mode.
>
> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>
> ---
>
> Changes in v9:
> - returned back "pdev->domain = target;" in AMD IOMMU code
> - used "source" instead of pdev->domain in IOMMU functions
> - added comment about lock ordering in the commit message
> - reduced locked regions
> - minor changes non-functional changes in various places
>
> Changes in v8:
> - New patch
>
> Changes in v8 vs RFC:
> - Removed all read_locks after discussion with Roger in #xendevel
> - pci_release_devices() now returns the first error code
> - extended commit message
> - added missing lock in pci_remove_device()
> - extended locked region in pci_add_device() to protect list_del() calls
> ---
> xen/common/domain.c | 1 +
> xen/drivers/passthrough/amd/pci_amd_iommu.c | 9 ++-
> xen/drivers/passthrough/pci.c | 71 +++++++++++++++++----
> xen/drivers/passthrough/vtd/iommu.c | 9 ++-
> xen/include/xen/sched.h | 1 +
> 5 files changed, 78 insertions(+), 13 deletions(-)
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 304aa04fa6..9b04a20160 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -651,6 +651,7 @@ struct domain *domain_create(domid_t domid,
>
> #ifdef CONFIG_HAS_PCI
> INIT_LIST_HEAD(&d->pdev_list);
> + rwlock_init(&d->pci_lock);
> #endif
>
> /* All error paths can depend on the above setup. */
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index bea70db4b7..d219bd9453 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -476,7 +476,14 @@ static int cf_check reassign_device(
>
> if ( devfn == pdev->devfn && pdev->domain != target )
> {
> - list_move(&pdev->domain_list, &target->pdev_list);
> + write_lock(&source->pci_lock);
> + list_del(&pdev->domain_list);
> + write_unlock(&source->pci_lock);
> +
> + write_lock(&target->pci_lock);
> + list_add(&pdev->domain_list, &target->pdev_list);
> + write_unlock(&target->pci_lock);
> +
> pdev->domain = target;
While I don't think this is strictly an issue right now, it would be
better to set pdev->domain before the device is added to domain_list.
A pattern like:
read_lock(d->pci_lock);
for_each_pdev(d, pdev)
foo(pdev->domain);
read_unlock(d->pci_lock);
Wouldn't work currently if the pdev is added to domain_list before the
pdev->domain field is updated to reflect the new owner.
> }
>
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 33452791a8..79ca928672 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -454,7 +454,9 @@ static void __init _pci_hide_device(struct pci_dev *pdev)
> if ( pdev->domain )
> return;
> pdev->domain = dom_xen;
> + write_lock(&dom_xen->pci_lock);
> list_add(&pdev->domain_list, &dom_xen->pdev_list);
> + write_unlock(&dom_xen->pci_lock);
> }
>
> int __init pci_hide_device(unsigned int seg, unsigned int bus,
> @@ -748,7 +750,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
> if ( !pdev->domain )
> {
> pdev->domain = hardware_domain;
> + write_lock(&hardware_domain->pci_lock);
> list_add(&pdev->domain_list, &hardware_domain->pdev_list);
> + write_unlock(&hardware_domain->pci_lock);
>
> /*
> * For devices not discovered by Xen during boot, add vPCI handlers
> @@ -758,7 +762,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
> if ( ret )
> {
> printk(XENLOG_ERR "Setup of vPCI failed: %d\n", ret);
> + write_lock(&hardware_domain->pci_lock);
> list_del(&pdev->domain_list);
> + write_unlock(&hardware_domain->pci_lock);
> pdev->domain = NULL;
> goto out;
> }
> @@ -766,7 +772,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
> if ( ret )
> {
> vpci_remove_device(pdev);
> + write_lock(&hardware_domain->pci_lock);
> list_del(&pdev->domain_list);
> + write_unlock(&hardware_domain->pci_lock);
> pdev->domain = NULL;
> goto out;
> }
> @@ -816,7 +824,11 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> pci_cleanup_msi(pdev);
> ret = iommu_remove_device(pdev);
> if ( pdev->domain )
> + {
> + write_lock(&pdev->domain->pci_lock);
> list_del(&pdev->domain_list);
> + write_unlock(&pdev->domain->pci_lock);
> + }
> printk(XENLOG_DEBUG "PCI remove device %pp\n", &pdev->sbdf);
> free_pdev(pseg, pdev);
> break;
> @@ -887,26 +899,61 @@ static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
>
> int pci_release_devices(struct domain *d)
> {
> - struct pci_dev *pdev, *tmp;
> - u8 bus, devfn;
> - int ret;
> + int combined_ret;
> + LIST_HEAD(failed_pdevs);
>
> pcidevs_lock();
> - ret = arch_pci_clean_pirqs(d);
> - if ( ret )
> +
> + combined_ret = arch_pci_clean_pirqs(d);
> + if ( combined_ret )
> {
> pcidevs_unlock();
> - return ret;
> + return combined_ret;
> }
> - list_for_each_entry_safe ( pdev, tmp, &d->pdev_list, domain_list )
> +
> + write_lock(&d->pci_lock);
> +
> + while ( !list_empty(&d->pdev_list) )
> {
> - bus = pdev->bus;
> - devfn = pdev->devfn;
> - ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
> + struct pci_dev *pdev = list_first_entry(&d->pdev_list,
> + struct pci_dev,
> + domain_list);
> + uint16_t seg = pdev->seg;
> + uint8_t bus = pdev->bus;
> + uint8_t devfn = pdev->devfn;
> + int ret;
> +
> + write_unlock(&d->pci_lock);
> + ret = deassign_device(d, seg, bus, devfn);
> + write_lock(&d->pci_lock);
> + if ( ret )
> + {
> + const struct pci_dev *tmp;
> +
> + /*
> + * We need to check if deassign_device() left our pdev in
> + * domain's list. As we dropped the lock, we can't be sure
> + * that list wasn't permutated in some random way, so we
> + * need to traverse the whole list.
> + */
> + for_each_pdev ( d, tmp )
> + {
> + if ( tmp == pdev )
> + {
> + list_move_tail(&pdev->domain_list, &failed_pdevs);
> + break;
> + }
> + }
> +
> + combined_ret = combined_ret ?: ret;
> + }
> }
> +
> + list_splice(&failed_pdevs, &d->pdev_list);
> + write_unlock(&d->pci_lock);
> pcidevs_unlock();
>
> - return ret;
> + return combined_ret;
> }
>
> #define PCI_CLASS_BRIDGE_HOST 0x0600
> @@ -1125,7 +1172,9 @@ static int __hwdom_init cf_check _setup_hwdom_pci_devices(
> if ( !pdev->domain )
> {
> pdev->domain = ctxt->d;
> + write_lock(&ctxt->d->pci_lock);
> list_add(&pdev->domain_list, &ctxt->d->pdev_list);
> + write_unlock(&ctxt->d->pci_lock);
> setup_one_hwdom_device(ctxt, pdev);
> }
> else if ( pdev->domain == dom_xen )
> diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> index 0e3062c820..3228900c97 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2806,7 +2806,14 @@ static int cf_check reassign_device_ownership(
>
> if ( devfn == pdev->devfn && pdev->domain != target )
> {
> - list_move(&pdev->domain_list, &target->pdev_list);
> + write_lock(&source->pci_lock);
> + list_del(&pdev->domain_list);
> + write_unlock(&source->pci_lock);
> +
> + write_lock(&target->pci_lock);
> + list_add(&pdev->domain_list, &target->pdev_list);
> + write_unlock(&target->pci_lock);
> +
> pdev->domain = target;
Same comment as in reassign_device() above.
Thanks, Roger.
next prev parent reply other threads:[~2023-09-19 14:10 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-29 23:19 [PATCH v9 00/16] PCI devices passthrough on Arm, part 3 Volodymyr Babchuk
2023-08-29 23:19 ` [PATCH v9 03/16] vpci: restrict unhandled read/write operations for guests Volodymyr Babchuk
2023-08-29 23:19 ` [PATCH v9 02/16] vpci: use per-domain PCI lock to protect vpci structure Volodymyr Babchuk
2023-09-19 15:39 ` Roger Pau Monné
2023-09-19 15:55 ` Jan Beulich
2023-09-20 8:12 ` Roger Pau Monné
2023-09-19 16:20 ` Stewart Hildebrand
2023-09-20 8:09 ` Roger Pau Monné
2023-09-20 13:56 ` Stewart Hildebrand
2023-09-21 7:42 ` Jan Beulich
2023-09-21 9:00 ` Roger Pau Monné
2023-09-20 19:16 ` Stewart Hildebrand
2023-09-21 9:41 ` Roger Pau Monné
2023-09-25 23:03 ` Volodymyr Babchuk
2023-08-29 23:19 ` [PATCH v9 01/16] pci: introduce per-domain PCI rwlock Volodymyr Babchuk
2023-09-19 14:09 ` Roger Pau Monné [this message]
2023-09-25 22:44 ` Volodymyr Babchuk
2023-08-29 23:19 ` [PATCH v9 05/16] vpci/header: rework exit path in init_bars Volodymyr Babchuk
2023-09-20 8:49 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 06/16] vpci/header: implement guest BAR register handlers Volodymyr Babchuk
2023-09-01 5:25 ` Stewart Hildebrand
2023-09-20 9:49 ` Roger Pau Monné
2023-09-20 14:18 ` Stewart Hildebrand
2023-08-29 23:19 ` [PATCH v9 04/16] vpci: add hooks for PCI device assign/de-assign Volodymyr Babchuk
2023-09-12 9:37 ` Jan Beulich
2023-09-12 23:41 ` Volodymyr Babchuk
2023-09-13 5:58 ` Jan Beulich
2023-09-13 23:53 ` Volodymyr Babchuk
2023-09-20 8:41 ` Roger Pau Monné
2023-09-20 8:39 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 07/16] rangeset: add RANGESETF_no_print flag Volodymyr Babchuk
2023-08-29 23:19 ` [PATCH v9 09/16] vpci/header: program p2m with guest BAR view Volodymyr Babchuk
2023-09-21 10:34 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 08/16] vpci/header: handle p2m range sets per BAR Volodymyr Babchuk
2023-09-20 11:35 ` Roger Pau Monné
2023-09-27 18:18 ` Stewart Hildebrand
2023-08-29 23:19 ` [PATCH v9 10/16] vpci/header: emulate PCI_COMMAND register for guests Volodymyr Babchuk
2023-09-01 5:23 ` Stewart Hildebrand
2023-09-21 13:18 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 11/16] vpci/header: reset the command register when adding devices Volodymyr Babchuk
2023-09-21 13:30 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 13/16] xen/arm: translate virtual PCI bus topology for guests Volodymyr Babchuk
2023-09-22 8:32 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 14/16] xen/arm: account IO handlers for emulated PCI MSI-X Volodymyr Babchuk
2023-08-29 23:19 ` [PATCH v9 12/16] vpci: add initial support for virtual PCI bus topology Volodymyr Babchuk
2023-08-30 7:37 ` Jan Beulich
2023-08-31 21:12 ` Volodymyr Babchuk
2023-09-21 16:03 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 15/16] xen/arm: vpci: check guest range Volodymyr Babchuk
2023-09-22 8:44 ` Roger Pau Monné
2023-09-25 21:49 ` Stewart Hildebrand
2023-09-26 8:07 ` Roger Pau Monné
2023-09-26 15:27 ` Stewart Hildebrand
2023-09-26 15:48 ` Roger Pau Monné
2023-09-27 18:03 ` Stewart Hildebrand
2023-09-28 8:28 ` Roger Pau Monné
2023-09-28 18:28 ` Stewart Hildebrand
2023-10-02 11:49 ` Roger Pau Monné
2023-08-29 23:19 ` [PATCH v9 16/16] xen/arm: vpci: permit access to guest vpci space Volodymyr Babchuk
2023-09-26 0:12 ` Stewart Hildebrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZQmro968jYGOb5sY@MacBookPdeRoger \
--to=roger.pau@citrix.com \
--cc=Volodymyr_Babchuk@epam.com \
--cc=andrew.cooper3@citrix.com \
--cc=george.dunlap@citrix.com \
--cc=jbeulich@suse.com \
--cc=julien@xen.org \
--cc=kevin.tian@intel.com \
--cc=paul@xen.org \
--cc=sstabellini@kernel.org \
--cc=stewart.hildebrand@amd.com \
--cc=wl@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.