* [PATCH v20 0/2] PCI devices passthrough on Arm, part 3
@ 2025-04-18 18:58 Stewart Hildebrand
2025-04-18 18:58 ` [PATCH v20 1/2] xen/arm: check read handler behavior Stewart Hildebrand
2025-04-18 18:58 ` [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests Stewart Hildebrand
0 siblings, 2 replies; 12+ messages in thread
From: Stewart Hildebrand @ 2025-04-18 18:58 UTC (permalink / raw)
To: xen-devel
Cc: Stewart Hildebrand, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Roger Pau Monné, Anthony PERARD, Jiqian Chen, Mykyta Poturai
This is next version of vPCI rework. Aim of this series is to prepare
ground for introducing PCI support on ARM platform.
in v20:
- drop ("vpci: acquire d->pci_lock in I/O handlers") as it was
unnecessary
in v19:
- ("xen/arm: check read handler behavior") is ready to be committed
- add ("vpci: acquire d->pci_lock in I/O handlers")
in v18:
- address warning in vpci test suite
in v17:
- add ("xen/arm: check read handler behavior")
- drop ("xen/arm: account IO handlers for emulated PCI MSI-X") as it
should wait for future work
- drop committed patches
in v16:
- minor updates - see individual patches
in v15:
- reorder so ("arm/vpci: honor access size when returning an error")
comes first
in v14:
- drop first 9 patches as they were committed
- updated ("vpci/header: emulate PCI_COMMAND register for guests")
in v13:
- drop ("xen/arm: vpci: permit access to guest vpci space") as it was
unnecessary
in v12:
- I (Stewart) coordinated with Volodomyr to send this whole series. So,
add my (Stewart) Signed-off-by to all patches.
- The biggest change is to re-work the PCI_COMMAND register patch.
Additional feedback has also been addressed - see individual patches.
- Drop ("pci: msi: pass pdev to pci_enable_msi() function") and
("pci: introduce per-domain PCI rwlock") as they were committed
- Rename ("rangeset: add rangeset_empty() function")
to ("rangeset: add rangeset_purge() function")
- Rename ("vpci/header: rework exit path in init_bars")
to ("vpci/header: rework exit path in init_header()")
in v11:
- Added my (Volodymyr) Signed-off-by tag to all patches
- Patch "vpci/header: emulate PCI_COMMAND register for guests" is in
intermediate state, because it was agreed to rework it once Stewart's
series on register handling are in.
- Addressed comments, please see patch descriptions for details.
in v10:
- Removed patch ("xen/arm: vpci: check guest range"), proper fix
for the issue is part of ("vpci/header: emulate PCI_COMMAND
register for guests")
- Removed patch ("pci/header: reset the command register when adding
devices")
- Added patch ("rangeset: add rangeset_empty() function") because
this function is needed in ("vpci/header: handle p2m range sets
per BAR")
- Added ("vpci/header: handle p2m range sets per BAR") which addressed
an issue discovered by Andrii Chepurnyi during virtio integration
- Added ("pci: msi: pass pdev to pci_enable_msi() function"), which is
prereq for ("pci: introduce per-domain PCI rwlock")
- Fixed "Since v9/v8/... " comments in changelogs to reduce confusion.
I left "Since" entries for older versions, because they were added
by original author of the patches.
in v9:
v9 includes addressed commentes from a previous one. Also it
introduces a couple patches from Stewart. This patches are related to
vPCI use on ARM. Patch "vpci/header: rework exit path in init_bars"
was factored-out from "vpci/header: handle p2m range sets per BAR".
in v8:
The biggest change from previous, mistakenly named, v7 series is how
locking is implemented. Instead of d->vpci_rwlock we introduce
d->pci_lock which has broader scope, as it protects not only domain's
vpci state, but domain's list of PCI devices as well.
As we discussed in IRC with Roger, it is not feasible to rework all
the existing code to use the new lock right away. It was agreed that
any write access to d->pdev_list will be protected by **both**
d->pci_lock in write mode and pcidevs_lock(). Read access on other
hand should be protected by either d->pci_lock in read mode or
pcidevs_lock(). It is expected that existing code will use
pcidevs_lock() and new users will use new rw lock. Of course, this
does not mean that new users shall not use pcidevs_lock() when it is
appropriate.
Changes from previous versions are described in each separate patch.
Oleksandr Andrushchenko (1):
vpci: translate virtual PCI bus topology for guests
Stewart Hildebrand (1):
xen/arm: check read handler behavior
tools/tests/vpci/emul.h | 2 +-
xen/arch/arm/io.c | 2 ++
xen/arch/arm/vpci.c | 4 +++
xen/drivers/vpci/vpci.c | 73 +++++++++++++++++++++++++++++++++++++----
4 files changed, 73 insertions(+), 8 deletions(-)
base-commit: 64cf7a80a17a017a6b88ea17c7f490e42baefcca
--
2.49.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v20 1/2] xen/arm: check read handler behavior
2025-04-18 18:58 [PATCH v20 0/2] PCI devices passthrough on Arm, part 3 Stewart Hildebrand
@ 2025-04-18 18:58 ` Stewart Hildebrand
2025-04-18 18:58 ` [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests Stewart Hildebrand
1 sibling, 0 replies; 12+ messages in thread
From: Stewart Hildebrand @ 2025-04-18 18:58 UTC (permalink / raw)
To: xen-devel
Cc: Stewart Hildebrand, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
Roger Pau Monné, Julien Grall
We expect mmio read handlers to leave the bits above the access size
zeroed. Add an ASSERT to check this aspect of read handler behavior.
Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Acked-by: Julien Grall <jgrall@amazon.com>
---
There's no need to wait for the rest of the series to commit this patch.
v19->v20:
* no change
v18->v19:
* add Julien's A-b
* s/GENMASK_ULL/GENMASK/
v17->v18:
* no change
v16->v17:
* new patch
See https://lore.kernel.org/xen-devel/bc6660ef-59f1-4514-9792-067d987e3fbc@xen.org/
Also see 7db7bd0f319f ("arm/vpci: honor access size when returning an error")
Also see xen/arch/arm/ioreq.c:handle_ioserv()
---
xen/arch/arm/io.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
index 653428e16c1f..5a4b0e8f25c6 100644
--- a/xen/arch/arm/io.c
+++ b/xen/arch/arm/io.c
@@ -37,6 +37,8 @@ static enum io_state handle_read(const struct mmio_handler *handler,
if ( !handler->ops->read(v, info, &r, handler->priv) )
return IO_ABORT;
+ ASSERT((r & ~GENMASK((1U << info->dabt.size) * 8 - 1, 0)) == 0);
+
r = sign_extend(dabt, r);
set_user_reg(regs, dabt.reg, r);
--
2.49.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-04-18 18:58 [PATCH v20 0/2] PCI devices passthrough on Arm, part 3 Stewart Hildebrand
2025-04-18 18:58 ` [PATCH v20 1/2] xen/arm: check read handler behavior Stewart Hildebrand
@ 2025-04-18 18:58 ` Stewart Hildebrand
2025-04-22 7:48 ` Jan Beulich
2025-05-06 11:16 ` Roger Pau Monné
1 sibling, 2 replies; 12+ messages in thread
From: Stewart Hildebrand @ 2025-04-18 18:58 UTC (permalink / raw)
To: xen-devel
Cc: Oleksandr Andrushchenko, Roger Pau Monné, Stewart Hildebrand,
Anthony PERARD, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Jiqian Chen,
Mykyta Poturai, Volodymyr Babchuk
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
There are two originators for the PCI configuration space access:
1. The domain that owns physical host bridge: MMIO handlers are
there so we can update vPCI register handlers with the values
written by the hardware domain, e.g. physical view of the registers
vs guest's view on the configuration space.
2. Guest access to the passed through PCI devices: we need to properly
map virtual bus topology to the physical one, e.g. pass the configuration
space access to the corresponding physical devices.
In vpci_read(), use the access size to create a mask so as to not set
any bits above the access size when returning an error.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
In v20:
* call translate_virtual_device() from within locked context of
vpci_{read,write}
* update commit message
In v19:
* move locking to pre-patch
In v18:
* address warning in vpci test suite
In v17:
* move lock to inside vpci_translate_virtual_device()
* acks were previously given for Arm [0] and vPCI [1], but since it was
two releases ago and the patch has changed, I didn't pick them up
[0] https://lore.kernel.org/xen-devel/4afe33f2-72e6-4755-98ce-d7f9da374e90@xen.org/
[1] https://lore.kernel.org/xen-devel/Zk70udmiriruMt0r@macbook/
In v15:
- base on top of ("arm/vpci: honor access size when returning an error")
In v11:
- Fixed format issues
- Added ASSERT_UNREACHABLE() to the dummy implementation of
vpci_translate_virtual_device()
- Moved variable in vpci_sbdf_from_gpa(), now it is easier to follow
the logic in the function
Since v9:
- Commend about required lock replaced with ASSERT()
- Style fixes
- call to vpci_translate_virtual_device folded into vpci_sbdf_from_gpa
Since v8:
- locks moved out of vpci_translate_virtual_device()
Since v6:
- add pcidevs locking to vpci_translate_virtual_device
- update wrt to the new locking scheme
Since v5:
- add vpci_translate_virtual_device for #ifndef CONFIG_HAS_VPCI_GUEST_SUPPORT
case to simplify ifdefery
- add ASSERT(!is_hardware_domain(d)); to vpci_translate_virtual_device
- reset output register on failed virtual SBDF translation
Since v4:
- indentation fixes
- constify struct domain
- updated commit message
- updates to the new locking scheme (pdev->vpci_lock)
Since v3:
- revisit locking
- move code to vpci.c
Since v2:
- pass struct domain instead of struct vcpu
- constify arguments where possible
- gate relevant code with CONFIG_HAS_VPCI_GUEST_SUPPORT
New in v2
---
tools/tests/vpci/emul.h | 2 +-
xen/arch/arm/vpci.c | 4 +++
xen/drivers/vpci/vpci.c | 73 +++++++++++++++++++++++++++++++++++++----
3 files changed, 71 insertions(+), 8 deletions(-)
diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h
index da446bba86b4..dd048cffbf9d 100644
--- a/tools/tests/vpci/emul.h
+++ b/tools/tests/vpci/emul.h
@@ -89,7 +89,7 @@ typedef union {
#define __hwdom_init
-#define is_hardware_domain(d) ((void)(d), false)
+#define is_hardware_domain(d) ((void)(d), true)
#define has_vpci(d) true
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index b63a356bb4a8..618ddb7f6547 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -34,6 +34,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
/* data is needed to prevent a pointer cast on 32bit */
unsigned long data;
+ ASSERT(!bridge == !is_hardware_domain(v->domain));
+
if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
1U << info->dabt.size, &data) )
{
@@ -52,6 +54,8 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
struct pci_host_bridge *bridge = p;
pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
+ ASSERT(!bridge == !is_hardware_domain(v->domain));
+
return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
1U << info->dabt.size, r);
}
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 1e6aa5d799b9..fc409f3fc346 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -174,6 +174,41 @@ int vpci_assign_device(struct pci_dev *pdev)
}
#endif /* __XEN__ */
+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+/*
+ * Find the physical device which is mapped to the virtual device
+ * and translate virtual SBDF to the physical one.
+ */
+static const struct pci_dev *translate_virtual_device(const struct domain *d,
+ pci_sbdf_t *sbdf)
+{
+ const struct pci_dev *pdev;
+
+ ASSERT(!is_hardware_domain(d));
+ ASSERT(rw_is_locked(&d->pci_lock));
+
+ for_each_pdev ( d, pdev )
+ {
+ if ( pdev->vpci && (pdev->vpci->guest_sbdf.sbdf == sbdf->sbdf) )
+ {
+ /* Replace guest SBDF with the physical one. */
+ *sbdf = pdev->sbdf;
+ return pdev;
+ }
+ }
+
+ return NULL;
+}
+#else
+static const struct pci_dev *translate_virtual_device(const struct domain *d,
+ pci_sbdf_t *sbdf)
+{
+ ASSERT_UNREACHABLE();
+
+ return NULL;
+}
+#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
+
static int vpci_register_cmp(const struct vpci_register *r1,
const struct vpci_register *r2)
{
@@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
const struct pci_dev *pdev;
const struct vpci_register *r;
unsigned int data_offset = 0;
- uint32_t data = ~(uint32_t)0;
+ uint32_t data = 0xffffffffU >> (32 - 8 * size);
if ( !size )
{
@@ -453,9 +488,21 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
* pci_lock is sufficient.
*/
read_lock(&d->pci_lock);
- pdev = pci_get_pdev(d, sbdf);
- if ( !pdev && is_hardware_domain(d) )
- pdev = pci_get_pdev(dom_xen, sbdf);
+ if ( is_hardware_domain(d) )
+ {
+ pdev = pci_get_pdev(d, sbdf);
+ if ( !pdev )
+ pdev = pci_get_pdev(dom_xen, sbdf);
+ }
+ else
+ {
+ pdev = translate_virtual_device(d, &sbdf);
+ if ( !pdev )
+ {
+ read_unlock(&d->pci_lock);
+ return data;
+ }
+ }
if ( !pdev || !pdev->vpci )
{
read_unlock(&d->pci_lock);
@@ -571,9 +618,21 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
* are modifying BARs, so there is a room for improvement.
*/
write_lock(&d->pci_lock);
- pdev = pci_get_pdev(d, sbdf);
- if ( !pdev && is_hardware_domain(d) )
- pdev = pci_get_pdev(dom_xen, sbdf);
+ if ( is_hardware_domain(d) )
+ {
+ pdev = pci_get_pdev(d, sbdf);
+ if ( !pdev )
+ pdev = pci_get_pdev(dom_xen, sbdf);
+ }
+ else
+ {
+ pdev = translate_virtual_device(d, &sbdf);
+ if ( !pdev )
+ {
+ write_unlock(&d->pci_lock);
+ return;
+ }
+ }
if ( !pdev || !pdev->vpci )
{
/* Ignore writes to read-only devices, which have no ->vpci. */
--
2.49.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-04-18 18:58 ` [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests Stewart Hildebrand
@ 2025-04-22 7:48 ` Jan Beulich
2025-05-06 11:16 ` Roger Pau Monné
1 sibling, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2025-04-22 7:48 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Oleksandr Andrushchenko, Roger Pau Monné, Anthony PERARD,
Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai, xen-devel
On 18.04.2025 20:58, Stewart Hildebrand wrote:
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -174,6 +174,41 @@ int vpci_assign_device(struct pci_dev *pdev)
> }
> #endif /* __XEN__ */
>
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +/*
> + * Find the physical device which is mapped to the virtual device
> + * and translate virtual SBDF to the physical one.
> + */
> +static const struct pci_dev *translate_virtual_device(const struct domain *d,
> + pci_sbdf_t *sbdf)
> +{
> + const struct pci_dev *pdev;
> +
> + ASSERT(!is_hardware_domain(d));
> + ASSERT(rw_is_locked(&d->pci_lock));
> +
> + for_each_pdev ( d, pdev )
> + {
> + if ( pdev->vpci && (pdev->vpci->guest_sbdf.sbdf == sbdf->sbdf) )
> + {
> + /* Replace guest SBDF with the physical one. */
> + *sbdf = pdev->sbdf;
> + return pdev;
> + }
> + }
> +
> + return NULL;
> +}
> +#else
> +static const struct pci_dev *translate_virtual_device(const struct domain *d,
> + pci_sbdf_t *sbdf)
> +{
> + ASSERT_UNREACHABLE();
> +
> + return NULL;
> +}
> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
I'm not a maintainer of this code, but if I was I'd strictly request that the
#ifdef move inside the function, to limit redundancy. Even the "return NULL"
is common between both forms.
Jan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-04-18 18:58 ` [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests Stewart Hildebrand
2025-04-22 7:48 ` Jan Beulich
@ 2025-05-06 11:16 ` Roger Pau Monné
2025-05-07 3:05 ` Stewart Hildebrand
1 sibling, 1 reply; 12+ messages in thread
From: Roger Pau Monné @ 2025-05-06 11:16 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
Hello,
Sorry I haven't looked at this before, I was confused with the cover
letter having ARM in the subject and somehow assumed all the code was
ARM related.
On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> There are two originators for the PCI configuration space access:
> 1. The domain that owns physical host bridge: MMIO handlers are
> there so we can update vPCI register handlers with the values
> written by the hardware domain, e.g. physical view of the registers
> vs guest's view on the configuration space.
> 2. Guest access to the passed through PCI devices: we need to properly
> map virtual bus topology to the physical one, e.g. pass the configuration
> space access to the corresponding physical devices.
>
> In vpci_read(), use the access size to create a mask so as to not set
> any bits above the access size when returning an error.
I see this here, and the code below, yet I'm not following why this is
a requirement for the code added here. It seems like an unrelated
change?
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> In v20:
> * call translate_virtual_device() from within locked context of
> vpci_{read,write}
> * update commit message
> In v19:
> * move locking to pre-patch
> In v18:
> * address warning in vpci test suite
> In v17:
> * move lock to inside vpci_translate_virtual_device()
> * acks were previously given for Arm [0] and vPCI [1], but since it was
> two releases ago and the patch has changed, I didn't pick them up
>
> [0] https://lore.kernel.org/xen-devel/4afe33f2-72e6-4755-98ce-d7f9da374e90@xen.org/
> [1] https://lore.kernel.org/xen-devel/Zk70udmiriruMt0r@macbook/
>
> In v15:
> - base on top of ("arm/vpci: honor access size when returning an error")
> In v11:
> - Fixed format issues
> - Added ASSERT_UNREACHABLE() to the dummy implementation of
> vpci_translate_virtual_device()
> - Moved variable in vpci_sbdf_from_gpa(), now it is easier to follow
> the logic in the function
> Since v9:
> - Commend about required lock replaced with ASSERT()
> - Style fixes
> - call to vpci_translate_virtual_device folded into vpci_sbdf_from_gpa
> Since v8:
> - locks moved out of vpci_translate_virtual_device()
> Since v6:
> - add pcidevs locking to vpci_translate_virtual_device
> - update wrt to the new locking scheme
> Since v5:
> - add vpci_translate_virtual_device for #ifndef CONFIG_HAS_VPCI_GUEST_SUPPORT
> case to simplify ifdefery
> - add ASSERT(!is_hardware_domain(d)); to vpci_translate_virtual_device
> - reset output register on failed virtual SBDF translation
> Since v4:
> - indentation fixes
> - constify struct domain
> - updated commit message
> - updates to the new locking scheme (pdev->vpci_lock)
> Since v3:
> - revisit locking
> - move code to vpci.c
> Since v2:
> - pass struct domain instead of struct vcpu
> - constify arguments where possible
> - gate relevant code with CONFIG_HAS_VPCI_GUEST_SUPPORT
> New in v2
> ---
> tools/tests/vpci/emul.h | 2 +-
> xen/arch/arm/vpci.c | 4 +++
> xen/drivers/vpci/vpci.c | 73 +++++++++++++++++++++++++++++++++++++----
> 3 files changed, 71 insertions(+), 8 deletions(-)
>
> diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h
> index da446bba86b4..dd048cffbf9d 100644
> --- a/tools/tests/vpci/emul.h
> +++ b/tools/tests/vpci/emul.h
> @@ -89,7 +89,7 @@ typedef union {
>
> #define __hwdom_init
>
> -#define is_hardware_domain(d) ((void)(d), false)
> +#define is_hardware_domain(d) ((void)(d), true)
>
> #define has_vpci(d) true
>
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> index b63a356bb4a8..618ddb7f6547 100644
> --- a/xen/arch/arm/vpci.c
> +++ b/xen/arch/arm/vpci.c
> @@ -34,6 +34,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
> /* data is needed to prevent a pointer cast on 32bit */
> unsigned long data;
>
> + ASSERT(!bridge == !is_hardware_domain(v->domain));
> +
> if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
> 1U << info->dabt.size, &data) )
> {
> @@ -52,6 +54,8 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
> struct pci_host_bridge *bridge = p;
> pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>
> + ASSERT(!bridge == !is_hardware_domain(v->domain));
> +
> return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
> 1U << info->dabt.size, r);
> }
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 1e6aa5d799b9..fc409f3fc346 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -174,6 +174,41 @@ int vpci_assign_device(struct pci_dev *pdev)
> }
> #endif /* __XEN__ */
>
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +/*
> + * Find the physical device which is mapped to the virtual device
> + * and translate virtual SBDF to the physical one.
> + */
> +static const struct pci_dev *translate_virtual_device(const struct domain *d,
> + pci_sbdf_t *sbdf)
> +{
> + const struct pci_dev *pdev;
> +
> + ASSERT(!is_hardware_domain(d));
> + ASSERT(rw_is_locked(&d->pci_lock));
> +
> + for_each_pdev ( d, pdev )
> + {
> + if ( pdev->vpci && (pdev->vpci->guest_sbdf.sbdf == sbdf->sbdf) )
> + {
> + /* Replace guest SBDF with the physical one. */
> + *sbdf = pdev->sbdf;
> + return pdev;
> + }
> + }
> +
> + return NULL;
> +}
> +#else
> +static const struct pci_dev *translate_virtual_device(const struct domain *d,
> + pci_sbdf_t *sbdf)
> +{
> + ASSERT_UNREACHABLE();
> +
> + return NULL;
> +}
> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
Jan's suggestion avoids having duplicate headers, and results in less
code overall:
static const struct pci_dev *translate_virtual_device(const struct domain *d,
pci_sbdf_t *sbdf)
{
#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
const struct pci_dev *pdev;
...
#else /* !CONFIG_HAS_VPCI_GUEST_SUPPORT */
ASSERT_UNREACHABLE()
#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
return NULL;
}
I've not overly fuzzed either way, but if changes are required you
might as well change to this form.
> static int vpci_register_cmp(const struct vpci_register *r1,
> const struct vpci_register *r2)
> {
> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> const struct pci_dev *pdev;
> const struct vpci_register *r;
> unsigned int data_offset = 0;
> - uint32_t data = ~(uint32_t)0;
> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
This seems kind of unrelated to the rest of the code in the patch,
why is this needed? Isn't it always fine to return all ones, and let
the caller truncate to the required size?
Otherwise the code in vpci_read_hw() also needs to be adjusted. And
you can likely use GENMASK(size * 8, 0) as it's easier to read.
>
> if ( !size )
> {
> @@ -453,9 +488,21 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> * pci_lock is sufficient.
> */
> read_lock(&d->pci_lock);
> - pdev = pci_get_pdev(d, sbdf);
> - if ( !pdev && is_hardware_domain(d) )
> - pdev = pci_get_pdev(dom_xen, sbdf);
> + if ( is_hardware_domain(d) )
> + {
> + pdev = pci_get_pdev(d, sbdf);
> + if ( !pdev )
> + pdev = pci_get_pdev(dom_xen, sbdf);
> + }
> + else
> + {
> + pdev = translate_virtual_device(d, &sbdf);
> + if ( !pdev )
> + {
> + read_unlock(&d->pci_lock);
> + return data;
> + }
You don't need this checking here, as the check below will already be
enough AFAICT, iow:
if ( is_hardware_domain(d) )
{
pdev = pci_get_pdev(d, sbdf);
if ( !pdev )
pdev = pci_get_pdev(dom_xen, sbdf);
}
else
pdev = translate_virtual_device(d, &sbdf);
if ( !pdev || !pdev->vpci )
{
read_unlock(&d->pci_lock);
return vpci_read_hw(sbdf, reg, size);
}
Achieves the same and is more compact by having a single return for
all the possible cases? Same for the write case below.
Thanks, Roger.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-06 11:16 ` Roger Pau Monné
@ 2025-05-07 3:05 ` Stewart Hildebrand
2025-05-07 7:44 ` Roger Pau Monné
0 siblings, 1 reply; 12+ messages in thread
From: Stewart Hildebrand @ 2025-05-07 3:05 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On 5/6/25 07:16, Roger Pau Monné wrote:
> Hello,
>
> Sorry I haven't looked at this before, I was confused with the cover
> letter having ARM in the subject and somehow assumed all the code was
> ARM related.
No worries, thanks for taking a look.
> On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> There are two originators for the PCI configuration space access:
>> 1. The domain that owns physical host bridge: MMIO handlers are
>> there so we can update vPCI register handlers with the values
>> written by the hardware domain, e.g. physical view of the registers
>> vs guest's view on the configuration space.
>> 2. Guest access to the passed through PCI devices: we need to properly
>> map virtual bus topology to the physical one, e.g. pass the configuration
>> space access to the corresponding physical devices.
>>
>> In vpci_read(), use the access size to create a mask so as to not set
>> any bits above the access size when returning an error.
>
> I see this here, and the code below, yet I'm not following why this is
> a requirement for the code added here. It seems like an unrelated
> change?
See below
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> In v20:
>> * call translate_virtual_device() from within locked context of
>> vpci_{read,write}
>> * update commit message
>> In v19:
>> * move locking to pre-patch
>> In v18:
>> * address warning in vpci test suite
>> In v17:
>> * move lock to inside vpci_translate_virtual_device()
>> * acks were previously given for Arm [0] and vPCI [1], but since it was
>> two releases ago and the patch has changed, I didn't pick them up
>>
>> [0] https://lore.kernel.org/xen-devel/4afe33f2-72e6-4755-98ce-d7f9da374e90@xen.org/
>> [1] https://lore.kernel.org/xen-devel/Zk70udmiriruMt0r@macbook/
>>
>> In v15:
>> - base on top of ("arm/vpci: honor access size when returning an error")
>> In v11:
>> - Fixed format issues
>> - Added ASSERT_UNREACHABLE() to the dummy implementation of
>> vpci_translate_virtual_device()
>> - Moved variable in vpci_sbdf_from_gpa(), now it is easier to follow
>> the logic in the function
>> Since v9:
>> - Commend about required lock replaced with ASSERT()
>> - Style fixes
>> - call to vpci_translate_virtual_device folded into vpci_sbdf_from_gpa
>> Since v8:
>> - locks moved out of vpci_translate_virtual_device()
>> Since v6:
>> - add pcidevs locking to vpci_translate_virtual_device
>> - update wrt to the new locking scheme
>> Since v5:
>> - add vpci_translate_virtual_device for #ifndef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> case to simplify ifdefery
>> - add ASSERT(!is_hardware_domain(d)); to vpci_translate_virtual_device
>> - reset output register on failed virtual SBDF translation
>> Since v4:
>> - indentation fixes
>> - constify struct domain
>> - updated commit message
>> - updates to the new locking scheme (pdev->vpci_lock)
>> Since v3:
>> - revisit locking
>> - move code to vpci.c
>> Since v2:
>> - pass struct domain instead of struct vcpu
>> - constify arguments where possible
>> - gate relevant code with CONFIG_HAS_VPCI_GUEST_SUPPORT
>> New in v2
>> ---
>> tools/tests/vpci/emul.h | 2 +-
>> xen/arch/arm/vpci.c | 4 +++
>> xen/drivers/vpci/vpci.c | 73 +++++++++++++++++++++++++++++++++++++----
>> 3 files changed, 71 insertions(+), 8 deletions(-)
>>
>> diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h
>> index da446bba86b4..dd048cffbf9d 100644
>> --- a/tools/tests/vpci/emul.h
>> +++ b/tools/tests/vpci/emul.h
>> @@ -89,7 +89,7 @@ typedef union {
>>
>> #define __hwdom_init
>>
>> -#define is_hardware_domain(d) ((void)(d), false)
>> +#define is_hardware_domain(d) ((void)(d), true)
>>
>> #define has_vpci(d) true
>>
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> index b63a356bb4a8..618ddb7f6547 100644
>> --- a/xen/arch/arm/vpci.c
>> +++ b/xen/arch/arm/vpci.c
>> @@ -34,6 +34,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>> /* data is needed to prevent a pointer cast on 32bit */
>> unsigned long data;
>>
>> + ASSERT(!bridge == !is_hardware_domain(v->domain));
>> +
>> if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>> 1U << info->dabt.size, &data) )
>> {
>> @@ -52,6 +54,8 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>> struct pci_host_bridge *bridge = p;
>> pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>
>> + ASSERT(!bridge == !is_hardware_domain(v->domain));
>> +
>> return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>> 1U << info->dabt.size, r);
>> }
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 1e6aa5d799b9..fc409f3fc346 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -174,6 +174,41 @@ int vpci_assign_device(struct pci_dev *pdev)
>> }
>> #endif /* __XEN__ */
>>
>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> +/*
>> + * Find the physical device which is mapped to the virtual device
>> + * and translate virtual SBDF to the physical one.
>> + */
>> +static const struct pci_dev *translate_virtual_device(const struct domain *d,
>> + pci_sbdf_t *sbdf)
>> +{
>> + const struct pci_dev *pdev;
>> +
>> + ASSERT(!is_hardware_domain(d));
>> + ASSERT(rw_is_locked(&d->pci_lock));
>> +
>> + for_each_pdev ( d, pdev )
>> + {
>> + if ( pdev->vpci && (pdev->vpci->guest_sbdf.sbdf == sbdf->sbdf) )
>> + {
>> + /* Replace guest SBDF with the physical one. */
>> + *sbdf = pdev->sbdf;
>> + return pdev;
>> + }
>> + }
>> +
>> + return NULL;
>> +}
>> +#else
>> +static const struct pci_dev *translate_virtual_device(const struct domain *d,
>> + pci_sbdf_t *sbdf)
>> +{
>> + ASSERT_UNREACHABLE();
>> +
>> + return NULL;
>> +}
>> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>
> Jan's suggestion avoids having duplicate headers, and results in less
> code overall:
>
> static const struct pci_dev *translate_virtual_device(const struct domain *d,
> pci_sbdf_t *sbdf)
> {
> #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> const struct pci_dev *pdev;
> ...
> #else /* !CONFIG_HAS_VPCI_GUEST_SUPPORT */
> ASSERT_UNREACHABLE()
> #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>
> return NULL;
> }
>
> I've not overly fuzzed either way, but if changes are required you
> might as well change to this form.
Will do
>> static int vpci_register_cmp(const struct vpci_register *r1,
>> const struct vpci_register *r2)
>> {
>> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>> const struct pci_dev *pdev;
>> const struct vpci_register *r;
>> unsigned int data_offset = 0;
>> - uint32_t data = ~(uint32_t)0;
>> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
>
> This seems kind of unrelated to the rest of the code in the patch,
> why is this needed? Isn't it always fine to return all ones, and let
> the caller truncate to the required size?
>
> Otherwise the code in vpci_read_hw() also needs to be adjusted.
On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
assert that the read handlers don't set any bits above the access size.
I had adjusted data here due to returning it directly from vpci_read()
in the current form of the patch. With your suggestion below we would
only need to adjust vpci_read_hw() (and then data here would not
strictly need adjusting).
> And
> you can likely use GENMASK(size * 8, 0) as it's easier to read.
OK. I'll then also provide a definition for GENMASK in
tools/tests/vpci/emul.h.
>>
>> if ( !size )
>> {
>> @@ -453,9 +488,21 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>> * pci_lock is sufficient.
>> */
>> read_lock(&d->pci_lock);
>> - pdev = pci_get_pdev(d, sbdf);
>> - if ( !pdev && is_hardware_domain(d) )
>> - pdev = pci_get_pdev(dom_xen, sbdf);
>> + if ( is_hardware_domain(d) )
>> + {
>> + pdev = pci_get_pdev(d, sbdf);
>> + if ( !pdev )
>> + pdev = pci_get_pdev(dom_xen, sbdf);
>> + }
>> + else
>> + {
>> + pdev = translate_virtual_device(d, &sbdf);
>> + if ( !pdev )
>> + {
>> + read_unlock(&d->pci_lock);
>> + return data;
>> + }
>
> You don't need this checking here, as the check below will already be
> enough AFAICT, iow:
>
> if ( is_hardware_domain(d) )
> {
> pdev = pci_get_pdev(d, sbdf);
> if ( !pdev )
> pdev = pci_get_pdev(dom_xen, sbdf);
> }
> else
> pdev = translate_virtual_device(d, &sbdf);
>
> if ( !pdev || !pdev->vpci )
> {
> read_unlock(&d->pci_lock);
> return vpci_read_hw(sbdf, reg, size);
> }
>
> Achieves the same and is more compact by having a single return for
> all the possible cases? Same for the write case below.
Seeing as there is a is_hardware_domain check inside vpci_read_hw(),
that is okay. I'll make the adjustment here and in vpci_write.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-07 3:05 ` Stewart Hildebrand
@ 2025-05-07 7:44 ` Roger Pau Monné
2025-05-07 13:38 ` Stewart Hildebrand
0 siblings, 1 reply; 12+ messages in thread
From: Roger Pau Monné @ 2025-05-07 7:44 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On Tue, May 06, 2025 at 11:05:13PM -0400, Stewart Hildebrand wrote:
> On 5/6/25 07:16, Roger Pau Monné wrote:
> > Hello,
> >
> > Sorry I haven't looked at this before, I was confused with the cover
> > letter having ARM in the subject and somehow assumed all the code was
> > ARM related.
>
> No worries, thanks for taking a look.
>
> > On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
> >> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >> static int vpci_register_cmp(const struct vpci_register *r1,
> >> const struct vpci_register *r2)
> >> {
> >> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> >> const struct pci_dev *pdev;
> >> const struct vpci_register *r;
> >> unsigned int data_offset = 0;
> >> - uint32_t data = ~(uint32_t)0;
> >> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
> >
> > This seems kind of unrelated to the rest of the code in the patch,
> > why is this needed? Isn't it always fine to return all ones, and let
> > the caller truncate to the required size?
> >
> > Otherwise the code in vpci_read_hw() also needs to be adjusted.
>
> On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
> assert that the read handlers don't set any bits above the access size.
I see. That kind of diverges from x86 behavior, that AFAICT (see
memcpy() at tail of hvmemul_do_io()) instead truncates the memcpy to
the size of the access.
Maybe it would be better to instead of asserting just truncate the
returned value to the given size, as that would allow to just return
~0 from handlers without having to care about the specific access
size.
> I had adjusted data here due to returning it directly from vpci_read()
> in the current form of the patch. With your suggestion below we would
> only need to adjust vpci_read_hw() (and then data here would not
> strictly need adjusting).
Both returns would need adjusting IMO, and it should have been part of
9a5e22b64266 I think, since that's the commit that introduced the
checking.
>
> > And
> > you can likely use GENMASK(size * 8, 0) as it's easier to read.
>
> OK. I'll then also provide a definition for GENMASK in
> tools/tests/vpci/emul.h.
>
> >>
> >> if ( !size )
> >> {
> >> @@ -453,9 +488,21 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> >> * pci_lock is sufficient.
> >> */
> >> read_lock(&d->pci_lock);
> >> - pdev = pci_get_pdev(d, sbdf);
> >> - if ( !pdev && is_hardware_domain(d) )
> >> - pdev = pci_get_pdev(dom_xen, sbdf);
> >> + if ( is_hardware_domain(d) )
> >> + {
> >> + pdev = pci_get_pdev(d, sbdf);
> >> + if ( !pdev )
> >> + pdev = pci_get_pdev(dom_xen, sbdf);
> >> + }
> >> + else
> >> + {
> >> + pdev = translate_virtual_device(d, &sbdf);
> >> + if ( !pdev )
> >> + {
> >> + read_unlock(&d->pci_lock);
> >> + return data;
> >> + }
> >
> > You don't need this checking here, as the check below will already be
> > enough AFAICT, iow:
> >
> > if ( is_hardware_domain(d) )
> > {
> > pdev = pci_get_pdev(d, sbdf);
> > if ( !pdev )
> > pdev = pci_get_pdev(dom_xen, sbdf);
> > }
> > else
> > pdev = translate_virtual_device(d, &sbdf);
> >
> > if ( !pdev || !pdev->vpci )
> > {
> > read_unlock(&d->pci_lock);
> > return vpci_read_hw(sbdf, reg, size);
> > }
> >
> > Achieves the same and is more compact by having a single return for
> > all the possible cases? Same for the write case below.
>
> Seeing as there is a is_hardware_domain check inside vpci_read_hw(),
> that is okay. I'll make the adjustment here and in vpci_write.
Yup, vpci_read_hw() is already prepared to handle calls from
!is_hardware_domain() contexts.
Thanks, Roger.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-07 7:44 ` Roger Pau Monné
@ 2025-05-07 13:38 ` Stewart Hildebrand
2025-05-07 17:44 ` Roger Pau Monné
0 siblings, 1 reply; 12+ messages in thread
From: Stewart Hildebrand @ 2025-05-07 13:38 UTC (permalink / raw)
To: Roger Pau Monné, Julien Grall
Cc: xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On 5/7/25 03:44, Roger Pau Monné wrote:
> On Tue, May 06, 2025 at 11:05:13PM -0400, Stewart Hildebrand wrote:
>> On 5/6/25 07:16, Roger Pau Monné wrote:
>>> On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>> static int vpci_register_cmp(const struct vpci_register *r1,
>>>> const struct vpci_register *r2)
>>>> {
>>>> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>>>> const struct pci_dev *pdev;
>>>> const struct vpci_register *r;
>>>> unsigned int data_offset = 0;
>>>> - uint32_t data = ~(uint32_t)0;
>>>> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
>>>
>>> This seems kind of unrelated to the rest of the code in the patch,
>>> why is this needed? Isn't it always fine to return all ones, and let
>>> the caller truncate to the required size?
>>>
>>> Otherwise the code in vpci_read_hw() also needs to be adjusted.
>>
>> On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
>> assert that the read handlers don't set any bits above the access size.
>
> I see. That kind of diverges from x86 behavior, that AFAICT (see
> memcpy() at tail of hvmemul_do_io()) instead truncates the memcpy to
> the size of the access.
>
> Maybe it would be better to instead of asserting just truncate the
> returned value to the given size, as that would allow to just return
> ~0 from handlers without having to care about the specific access
> size.
The impression I get from [0] is that that on Arm, there's no benefit to
performing truncation in xen/arch/arm/io.c. Doing so would needlessly
affect other Arm internal read handlers (e.g. vGIC). For vPCI
specifically, however, we could potentially perform truncation in
xen/arch/arm/vpci.c. So I guess it's a question of whether we want to
give special treatment to vPCI compared to all other read handlers on
Arm?
>> I had adjusted data here due to returning it directly from vpci_read()
>> in the current form of the patch. With your suggestion below we would
>> only need to adjust vpci_read_hw() (and then data here would not
>> strictly need adjusting).
>
> Both returns would need adjusting IMO,
OK
> and it should have been part of
> 9a5e22b64266 I think, since that's the commit that introduced the
> checking.
If we proceed with adjusting vpci_read() and vpci_read_hw(): are you OK
with these adjustments included in this patch, or would you prefer them
being split into a pre-patch?
[0] https://lore.kernel.org/xen-devel/20240522225927.77398-1-stewart.hildebrand@amd.com/T/#t
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-07 13:38 ` Stewart Hildebrand
@ 2025-05-07 17:44 ` Roger Pau Monné
2025-05-07 21:17 ` Stewart Hildebrand
0 siblings, 1 reply; 12+ messages in thread
From: Roger Pau Monné @ 2025-05-07 17:44 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Julien Grall, xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On Wed, May 07, 2025 at 09:38:51AM -0400, Stewart Hildebrand wrote:
> On 5/7/25 03:44, Roger Pau Monné wrote:
> > On Tue, May 06, 2025 at 11:05:13PM -0400, Stewart Hildebrand wrote:
> >> On 5/6/25 07:16, Roger Pau Monné wrote:
> >>> On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
> >>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>>> static int vpci_register_cmp(const struct vpci_register *r1,
> >>>> const struct vpci_register *r2)
> >>>> {
> >>>> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> >>>> const struct pci_dev *pdev;
> >>>> const struct vpci_register *r;
> >>>> unsigned int data_offset = 0;
> >>>> - uint32_t data = ~(uint32_t)0;
> >>>> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
> >>>
> >>> This seems kind of unrelated to the rest of the code in the patch,
> >>> why is this needed? Isn't it always fine to return all ones, and let
> >>> the caller truncate to the required size?
> >>>
> >>> Otherwise the code in vpci_read_hw() also needs to be adjusted.
> >>
> >> On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
> >> assert that the read handlers don't set any bits above the access size.
> >
> > I see. That kind of diverges from x86 behavior, that AFAICT (see
> > memcpy() at tail of hvmemul_do_io()) instead truncates the memcpy to
> > the size of the access.
> >
> > Maybe it would be better to instead of asserting just truncate the
> > returned value to the given size, as that would allow to just return
> > ~0 from handlers without having to care about the specific access
> > size.
>
> The impression I get from [0] is that that on Arm, there's no benefit to
> performing truncation in xen/arch/arm/io.c. Doing so would needlessly
> affect other Arm internal read handlers (e.g. vGIC).
But isn't this truncation desirable in order to avoid possibly setting
bits outside of the access size?
> For vPCI
> specifically, however, we could potentially perform truncation in
> xen/arch/arm/vpci.c. So I guess it's a question of whether we want to
> give special treatment to vPCI compared to all other read handlers on
> Arm?
I would think doing the truncation uniformly for all reads would be
better, as we then ensure the value propagated to the registers always
matches the access size?
I'm not expert on ARM, but it seems cumbersome to force this to all
internal handlers, instead of just truncating the value in a single
place.
> >> I had adjusted data here due to returning it directly from vpci_read()
> >> in the current form of the patch. With your suggestion below we would
> >> only need to adjust vpci_read_hw() (and then data here would not
> >> strictly need adjusting).
> >
> > Both returns would need adjusting IMO,
>
> OK
>
> > and it should have been part of
> > 9a5e22b64266 I think, since that's the commit that introduced the
> > checking.
>
> If we proceed with adjusting vpci_read() and vpci_read_hw(): are you OK
> with these adjustments included in this patch, or would you prefer them
> being split into a pre-patch?
Ideally it should be a separate patch with a "Fixes:" tag that points
to the patch that added the ASSERT(). As I think the current vPCI code is
kind of broken without that truncation on ARM?
Thanks, Roger.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-07 17:44 ` Roger Pau Monné
@ 2025-05-07 21:17 ` Stewart Hildebrand
2025-05-08 8:32 ` Roger Pau Monné
0 siblings, 1 reply; 12+ messages in thread
From: Stewart Hildebrand @ 2025-05-07 21:17 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Julien Grall, xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On 5/7/25 13:44, Roger Pau Monné wrote:
> On Wed, May 07, 2025 at 09:38:51AM -0400, Stewart Hildebrand wrote:
>> On 5/7/25 03:44, Roger Pau Monné wrote:
>>> On Tue, May 06, 2025 at 11:05:13PM -0400, Stewart Hildebrand wrote:
>>>> On 5/6/25 07:16, Roger Pau Monné wrote:
>>>>> On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>> static int vpci_register_cmp(const struct vpci_register *r1,
>>>>>> const struct vpci_register *r2)
>>>>>> {
>>>>>> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>>>>>> const struct pci_dev *pdev;
>>>>>> const struct vpci_register *r;
>>>>>> unsigned int data_offset = 0;
>>>>>> - uint32_t data = ~(uint32_t)0;
>>>>>> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
>>>>>
>>>>> This seems kind of unrelated to the rest of the code in the patch,
>>>>> why is this needed? Isn't it always fine to return all ones, and let
>>>>> the caller truncate to the required size?
>>>>>
>>>>> Otherwise the code in vpci_read_hw() also needs to be adjusted.
>>>>
>>>> On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
>>>> assert that the read handlers don't set any bits above the access size.
>>>
>>> I see. That kind of diverges from x86 behavior, that AFAICT (see
>>> memcpy() at tail of hvmemul_do_io()) instead truncates the memcpy to
>>> the size of the access.
>>>
>>> Maybe it would be better to instead of asserting just truncate the
>>> returned value to the given size, as that would allow to just return
>>> ~0 from handlers without having to care about the specific access
>>> size.
>>
>> The impression I get from [0] is that that on Arm, there's no benefit to
>> performing truncation in xen/arch/arm/io.c. Doing so would needlessly
>> affect other Arm internal read handlers (e.g. vGIC).
>
> But isn't this truncation desirable in order to avoid possibly setting
> bits outside of the access size?
On Arm we expect the read handlers to have the bits above the access
size zeroed. If a read handler sets bits above the access size, it could
indicate a bug in the read handler. As a reminder, this was already
discussed at [0] and a patch was already committed 9a5e22b64266
("xen/arm: check read handler behavior"). Perhaps we could both keep the
ASSERT (for debug builds) and perform truncation (for release builds) in
xen/arch/arm/io.c:handle_read(), but that's patch for another day.
[0] https://lore.kernel.org/xen-devel/20240522225927.77398-1-stewart.hildebrand@amd.com/T/#t
>> For vPCI
>> specifically, however, we could potentially perform truncation in
>> xen/arch/arm/vpci.c. So I guess it's a question of whether we want to
>> give special treatment to vPCI compared to all other read handlers on
>> Arm?
>
> I would think doing the truncation uniformly for all reads would be
> better, as we then ensure the value propagated to the registers always
> matches the access size?
>
> I'm not expert on ARM, but it seems cumbersome to force this to all
> internal handlers, instead of just truncating the value in a single
> place.
To move this forward, I suggest performing this truncation in
xen/arch/arm/vpci.c:vpci_mmio_read(). This will be a single place to
perform truncation for Arm vPCI, and will not affect other Arm internal
mmio handlers.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-07 21:17 ` Stewart Hildebrand
@ 2025-05-08 8:32 ` Roger Pau Monné
2025-05-08 10:34 ` Stewart Hildebrand
0 siblings, 1 reply; 12+ messages in thread
From: Roger Pau Monné @ 2025-05-08 8:32 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Julien Grall, xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On Wed, May 07, 2025 at 05:17:58PM -0400, Stewart Hildebrand wrote:
> On 5/7/25 13:44, Roger Pau Monné wrote:
> > On Wed, May 07, 2025 at 09:38:51AM -0400, Stewart Hildebrand wrote:
> >> On 5/7/25 03:44, Roger Pau Monné wrote:
> >>> On Tue, May 06, 2025 at 11:05:13PM -0400, Stewart Hildebrand wrote:
> >>>> On 5/6/25 07:16, Roger Pau Monné wrote:
> >>>>> On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
> >>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>>>>> static int vpci_register_cmp(const struct vpci_register *r1,
> >>>>>> const struct vpci_register *r2)
> >>>>>> {
> >>>>>> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> >>>>>> const struct pci_dev *pdev;
> >>>>>> const struct vpci_register *r;
> >>>>>> unsigned int data_offset = 0;
> >>>>>> - uint32_t data = ~(uint32_t)0;
> >>>>>> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
> >>>>>
> >>>>> This seems kind of unrelated to the rest of the code in the patch,
> >>>>> why is this needed? Isn't it always fine to return all ones, and let
> >>>>> the caller truncate to the required size?
> >>>>>
> >>>>> Otherwise the code in vpci_read_hw() also needs to be adjusted.
> >>>>
> >>>> On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
> >>>> assert that the read handlers don't set any bits above the access size.
> >>>
> >>> I see. That kind of diverges from x86 behavior, that AFAICT (see
> >>> memcpy() at tail of hvmemul_do_io()) instead truncates the memcpy to
> >>> the size of the access.
> >>>
> >>> Maybe it would be better to instead of asserting just truncate the
> >>> returned value to the given size, as that would allow to just return
> >>> ~0 from handlers without having to care about the specific access
> >>> size.
> >>
> >> The impression I get from [0] is that that on Arm, there's no benefit to
> >> performing truncation in xen/arch/arm/io.c. Doing so would needlessly
> >> affect other Arm internal read handlers (e.g. vGIC).
> >
> > But isn't this truncation desirable in order to avoid possibly setting
> > bits outside of the access size?
>
> On Arm we expect the read handlers to have the bits above the access
> size zeroed. If a read handler sets bits above the access size, it could
> indicate a bug in the read handler. As a reminder, this was already
> discussed at [0] and a patch was already committed 9a5e22b64266
> ("xen/arm: check read handler behavior"). Perhaps we could both keep the
> ASSERT (for debug builds) and perform truncation (for release builds) in
> xen/arch/arm/io.c:handle_read(), but that's patch for another day.
>
> [0] https://lore.kernel.org/xen-devel/20240522225927.77398-1-stewart.hildebrand@amd.com/T/#t
Oh, I see. I already expressed concerns on that thread about forcing
the truncation to be done by handler implementations vs truncating in
a generic place ahead of propagating to the registers.
My main concern is when returning ~0, as it seems cumbersome to have
to truncate that, and I think we do blindly return ~0 on more than one
x86 IO handler.
> >> For vPCI
> >> specifically, however, we could potentially perform truncation in
> >> xen/arch/arm/vpci.c. So I guess it's a question of whether we want to
> >> give special treatment to vPCI compared to all other read handlers on
> >> Arm?
> >
> > I would think doing the truncation uniformly for all reads would be
> > better, as we then ensure the value propagated to the registers always
> > matches the access size?
> >
> > I'm not expert on ARM, but it seems cumbersome to force this to all
> > internal handlers, instead of just truncating the value in a single
> > place.
>
> To move this forward, I suggest performing this truncation in
> xen/arch/arm/vpci.c:vpci_mmio_read(). This will be a single place to
> perform truncation for Arm vPCI, and will not affect other Arm internal
> mmio handlers.
You already have the mask there, so it should be easy to do:
*r = data & invalid;
To truncate the value. Could you send that as a separate patch with a
Fixes tag?
Thanks, I'm sorry for not realizing about this before.
Roger.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests
2025-05-08 8:32 ` Roger Pau Monné
@ 2025-05-08 10:34 ` Stewart Hildebrand
0 siblings, 0 replies; 12+ messages in thread
From: Stewart Hildebrand @ 2025-05-08 10:34 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Julien Grall, xen-devel, Oleksandr Andrushchenko, Anthony PERARD,
Stefano Stabellini, Bertrand Marquis, Michal Orzel,
Volodymyr Babchuk, Jiqian Chen, Mykyta Poturai
On 5/8/25 04:32, Roger Pau Monné wrote:
> On Wed, May 07, 2025 at 05:17:58PM -0400, Stewart Hildebrand wrote:
>> On 5/7/25 13:44, Roger Pau Monné wrote:
>>> On Wed, May 07, 2025 at 09:38:51AM -0400, Stewart Hildebrand wrote:
>>>> On 5/7/25 03:44, Roger Pau Monné wrote:
>>>>> On Tue, May 06, 2025 at 11:05:13PM -0400, Stewart Hildebrand wrote:
>>>>>> On 5/6/25 07:16, Roger Pau Monné wrote:
>>>>>>> On Fri, Apr 18, 2025 at 02:58:37PM -0400, Stewart Hildebrand wrote:
>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>>> static int vpci_register_cmp(const struct vpci_register *r1,
>>>>>>>> const struct vpci_register *r2)
>>>>>>>> {
>>>>>>>> @@ -438,7 +473,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>>>>>>>> const struct pci_dev *pdev;
>>>>>>>> const struct vpci_register *r;
>>>>>>>> unsigned int data_offset = 0;
>>>>>>>> - uint32_t data = ~(uint32_t)0;
>>>>>>>> + uint32_t data = 0xffffffffU >> (32 - 8 * size);
>>>>>>>
>>>>>>> This seems kind of unrelated to the rest of the code in the patch,
>>>>>>> why is this needed? Isn't it always fine to return all ones, and let
>>>>>>> the caller truncate to the required size?
>>>>>>>
>>>>>>> Otherwise the code in vpci_read_hw() also needs to be adjusted.
>>>>>>
>>>>>> On Arm, since 9a5e22b64266 ("xen/arm: check read handler behavior") we
>>>>>> assert that the read handlers don't set any bits above the access size.
>>>>>
>>>>> I see. That kind of diverges from x86 behavior, that AFAICT (see
>>>>> memcpy() at tail of hvmemul_do_io()) instead truncates the memcpy to
>>>>> the size of the access.
>>>>>
>>>>> Maybe it would be better to instead of asserting just truncate the
>>>>> returned value to the given size, as that would allow to just return
>>>>> ~0 from handlers without having to care about the specific access
>>>>> size.
>>>>
>>>> The impression I get from [0] is that that on Arm, there's no benefit to
>>>> performing truncation in xen/arch/arm/io.c. Doing so would needlessly
>>>> affect other Arm internal read handlers (e.g. vGIC).
>>>
>>> But isn't this truncation desirable in order to avoid possibly setting
>>> bits outside of the access size?
>>
>> On Arm we expect the read handlers to have the bits above the access
>> size zeroed. If a read handler sets bits above the access size, it could
>> indicate a bug in the read handler. As a reminder, this was already
>> discussed at [0] and a patch was already committed 9a5e22b64266
>> ("xen/arm: check read handler behavior"). Perhaps we could both keep the
>> ASSERT (for debug builds) and perform truncation (for release builds) in
>> xen/arch/arm/io.c:handle_read(), but that's patch for another day.
>>
>> [0] https://lore.kernel.org/xen-devel/20240522225927.77398-1-stewart.hildebrand@amd.com/T/#t
>
> Oh, I see. I already expressed concerns on that thread about forcing
> the truncation to be done by handler implementations vs truncating in
> a generic place ahead of propagating to the registers.
>
> My main concern is when returning ~0, as it seems cumbersome to have
> to truncate that, and I think we do blindly return ~0 on more than one
> x86 IO handler.
>
>>>> For vPCI
>>>> specifically, however, we could potentially perform truncation in
>>>> xen/arch/arm/vpci.c. So I guess it's a question of whether we want to
>>>> give special treatment to vPCI compared to all other read handlers on
>>>> Arm?
>>>
>>> I would think doing the truncation uniformly for all reads would be
>>> better, as we then ensure the value propagated to the registers always
>>> matches the access size?
>>>
>>> I'm not expert on ARM, but it seems cumbersome to force this to all
>>> internal handlers, instead of just truncating the value in a single
>>> place.
>>
>> To move this forward, I suggest performing this truncation in
>> xen/arch/arm/vpci.c:vpci_mmio_read(). This will be a single place to
>> perform truncation for Arm vPCI, and will not affect other Arm internal
>> mmio handlers.
>
> You already have the mask there, so it should be easy to do:
>
> *r = data & invalid;
>
> To truncate the value. Could you send that as a separate patch with a
> Fixes tag?
Yes, will do
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-05-08 10:35 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-18 18:58 [PATCH v20 0/2] PCI devices passthrough on Arm, part 3 Stewart Hildebrand
2025-04-18 18:58 ` [PATCH v20 1/2] xen/arm: check read handler behavior Stewart Hildebrand
2025-04-18 18:58 ` [PATCH v20 2/2] vpci: translate virtual PCI bus topology for guests Stewart Hildebrand
2025-04-22 7:48 ` Jan Beulich
2025-05-06 11:16 ` Roger Pau Monné
2025-05-07 3:05 ` Stewart Hildebrand
2025-05-07 7:44 ` Roger Pau Monné
2025-05-07 13:38 ` Stewart Hildebrand
2025-05-07 17:44 ` Roger Pau Monné
2025-05-07 21:17 ` Stewart Hildebrand
2025-05-08 8:32 ` Roger Pau Monné
2025-05-08 10:34 ` Stewart Hildebrand
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.