All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 0/2] vPCI capabilities filtering
@ 2023-11-28 19:44 Stewart Hildebrand
  2023-11-28 19:44 ` [PATCH v8 1/2] xen/vpci: header: status register handler Stewart Hildebrand
  2023-11-28 19:44 ` [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities Stewart Hildebrand
  0 siblings, 2 replies; 9+ messages in thread
From: Stewart Hildebrand @ 2023-11-28 19:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Roger Pau Monné, Wei Liu, Anthony PERARD,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini

This small series enables vPCI to filter which PCI capabilities we expose to a
domU. This series adds vPCI register handlers within
xen/drivers/vpci/header.c:init_bars(), along with some supporting functions.

Note there are minor rebase conflicts with the in-progress vPCI series [1].
These conflicts fall into the category of functions and code being added
adjacent to one another, so are easily resolved. I did not identify any
dependency on the vPCI locking work, and the two series deal with different
aspects of emulating the PCI header.

Future work may involve adding handlers for more registers in the vPCI header,
such as VID/DID, etc. Future work may also involve exposing additional
capabilities to the guest for broader device/driver support.

v7->v8:
* address feedback

v6->v7:
* address feedback in ("xen/vpci: header: status register handler")
* drop ("xen/pci: convert pci_find_*cap* to pci_sbdf_t") and
  ("x86/msi: rearrange read_pci_mem_bar slightly") as they were committed

v5->v6:
* drop ("xen/pci: update PCI_STATUS_* constants") as it has been committed

v4->v5:
* drop ("x86/msi: remove some unused-but-set-variables") as it has been
  committed
* add ("xen/pci: update PCI_STATUS_* constants")
* squash ro_mask patch

v3->v4:
* drop "xen/pci: address a violation of MISRA C:2012 Rule 8.3" as it has been
  committed
* re-order status register handler and capabilities filtering patches
* split an unrelated change from ("xen/pci: convert pci_find_*cap* to pci_sbdf_t")
  into its own patch
* add new patch ("x86/msi: rearrange read_pci_mem_bar slightly") based on
  feedback
* add new RFC patch ("xen/vpci: support ro mask")

v2->v3:
* drop RFC "xen/vpci: header: avoid cast for value passed to vpci_read_val"
* minor misra C violation fixup in preparatory patch
* switch to pci_sbdf_t in preparatory patch
* introduce status handler

v1->v2:
* squash helper functions into the patch where they are used to avoid transient
  dead code situation
* add new RFC patch, possibly throwaway, to get an idea of what it would look
  like to get rid of the (void *)(uintptr_t) cast by introducing a new memory
  allocation

[1] https://lists.xenproject.org/archives/html/xen-devel/2023-08/msg02361.html

Stewart Hildebrand (2):
  xen/vpci: header: status register handler
  xen/vpci: header: filter PCI capabilities

 tools/tests/vpci/main.c    | 98 ++++++++++++++++++++++++++++++++++++++
 xen/drivers/pci/pci.c      | 31 ++++++++----
 xen/drivers/vpci/header.c  | 75 +++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c    | 74 +++++++++++++++++++++++-----
 xen/include/xen/pci.h      |  3 ++
 xen/include/xen/pci_regs.h |  9 ++++
 xen/include/xen/vpci.h     | 14 ++++++
 7 files changed, 282 insertions(+), 22 deletions(-)


base-commit: 18540a313cc66a04eb15a67d74c7992a8576fbec
-- 
2.43.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v8 1/2] xen/vpci: header: status register handler
  2023-11-28 19:44 [PATCH v8 0/2] vPCI capabilities filtering Stewart Hildebrand
@ 2023-11-28 19:44 ` Stewart Hildebrand
  2023-11-29 11:03   ` Roger Pau Monné
  2023-11-28 19:44 ` [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities Stewart Hildebrand
  1 sibling, 1 reply; 9+ messages in thread
From: Stewart Hildebrand @ 2023-11-28 19:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Roger Pau Monné, Wei Liu, Anthony PERARD,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini

Introduce a handler for the PCI status register, with ability to mask
the capabilities bit. The status register contains RsvdZ bits,
read-only bits, and write-1-to-clear bits. Additionally, we use RsvdP to
mask the capabilities bit. Introduce bitmasks to handle these in vPCI.
If a bit in the bitmask is set, then the special meaning applies:

  ro_mask: read normal, guest write ignore (preserve on write to hardware)
  rw1c_mask: read normal, write 1 to clear
  rsvdp_mask: read as zero, guest write ignore (preserve on write to hardware)
  rsvdz_mask: read as zero, guest write ignore (write zero to hardware)

The RO/RW1C/RsvdP/RsvdZ naming and definitions were borrowed from the
PCI Express Base 6.1 specification. RsvdP/RsvdZ bits help Xen enforce
our view of the world. Xen preserves the value of read-only bits on
write to hardware, discarding the guests write value. This is done in
case hardware wrongly implements R/O bits as R/W.

The mask_cap_list flag will be set in a follow-on patch.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
v7->v8:
* move PCI_STATUS_UDF to rsvdz_mask (per PCI Express Base 6 spec)
* add support for rsvdp bits
* add tests for ro/rw1c/rsvdp/rsvdz bits in tools/tests/vpci/main.c
* dropped R-b tag [1] since the patch has changed moderately since the last rev

[1] https://lists.xenproject.org/archives/html/xen-devel/2023-09/msg00909.html

v6->v7:
* re-work args passed to vpci_add_register_mask() (called in init_bars())
* also check for overlap of (rsvdz_mask & ro_mask) in add_register()
* slightly adjust masking operation in vpci_write_helper()

v5->v6:
* remove duplicate PCI_STATUS_CAP_LIST in constant definition
* style fixup in constant definitions
* s/res_mask/rsvdz_mask/
* combine a new masking operation into single line
* preserve r/o bits on write
* get rid of status_read. Instead, use rsvdz_mask for conditionally masking
  PCI_STATUS_CAP_LIST bit
* add comment about PCI_STATUS_CAP_LIST and rsvdp behavior
* add sanity checks in add_register
* move mask_cap_list from struct vpci_header to local variable

v4->v5:
* add support for res_mask
* add support for ro_mask (squash ro_mask patch)
* add constants for reserved, read-only, and rw1c masks

v3->v4:
* move mask_cap_list setting to the capabilities patch
* single pci_conf_read16 in status_read
* align mask_cap_list bitfield in struct vpci_header
* change to rw1c bit mask instead of treating whole register as rw1c
* drop subsystem prefix on renamed add_register function

v2->v3:
* new patch
---
 tools/tests/vpci/main.c    | 98 ++++++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/header.c  | 12 +++++
 xen/drivers/vpci/vpci.c    | 62 +++++++++++++++++++-----
 xen/include/xen/pci_regs.h |  9 ++++
 xen/include/xen/vpci.h     |  9 ++++
 5 files changed, 178 insertions(+), 12 deletions(-)

diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
index b9a0a6006bb9..b0bb993be297 100644
--- a/tools/tests/vpci/main.c
+++ b/tools/tests/vpci/main.c
@@ -70,6 +70,26 @@ static void vpci_write32(const struct pci_dev *pdev, unsigned int reg,
     *(uint32_t *)data = val;
 }
 
+struct mask_data {
+    uint32_t val;
+    uint32_t rw1c_mask;
+};
+
+static uint32_t vpci_read32_mask(const struct pci_dev *pdev, unsigned int reg,
+                                 void *data)
+{
+    struct mask_data *md = data;
+    return md->val;
+}
+
+static void vpci_write32_mask(const struct pci_dev *pdev, unsigned int reg,
+                              uint32_t val, void *data)
+{
+    struct mask_data *md = data;
+    md->val  = val | (md->val & md->rw1c_mask);
+    md->val &= ~(val & md->rw1c_mask);
+}
+
 #define VPCI_READ(reg, size, data) ({                           \
     data = vpci_read((pci_sbdf_t){ .sbdf = 0 }, reg, size);     \
 })
@@ -94,9 +114,20 @@ static void vpci_write32(const struct pci_dev *pdev, unsigned int reg,
     assert(!vpci_add_register(test_pdev.vpci, fread, fwrite, off, size,     \
                               &store))
 
+#define VPCI_ADD_REG_MASK(fread, fwrite, off, size, store,                     \
+                          ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask)          \
+    assert(!vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, size,   \
+                                   &store,                                     \
+                                   ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask))
+
 #define VPCI_ADD_INVALID_REG(fread, fwrite, off, size)                      \
     assert(vpci_add_register(test_pdev.vpci, fread, fwrite, off, size, NULL))
 
+#define VPCI_ADD_INVALID_REG_MASK(fread, fwrite, off, size,                   \
+                                  ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask) \
+    assert(vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, size,   \
+                                  NULL, ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask))
+
 #define VPCI_REMOVE_REG(off, size)                                          \
     assert(!vpci_remove_register(test_pdev.vpci, off, size))
 
@@ -154,6 +185,7 @@ main(int argc, char **argv)
     uint16_t r20[2] = { };
     uint32_t r24 = 0;
     uint8_t r28, r30;
+    struct mask_data r32;
     unsigned int i;
     int rc;
 
@@ -213,6 +245,14 @@ main(int argc, char **argv)
     /* Try to add a register with missing handlers. */
     VPCI_ADD_INVALID_REG(NULL, NULL, 8, 2);
 
+    /* Try to add registers with the same bits set in multiple masks. */
+    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 1, 0, 0);
+    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 0, 1, 0);
+    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 0, 0, 1);
+    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 1, 1, 0);
+    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 1, 0, 1);
+    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 0, 1, 1);
+
     /* Read/write of unset register. */
     VPCI_READ_CHECK(8, 4, 0xffffffff);
     VPCI_READ_CHECK(8, 2, 0xffff);
@@ -287,6 +327,64 @@ main(int argc, char **argv)
     VPCI_ADD_REG(vpci_read8, vpci_write8, 30, 1, r30);
     VPCI_WRITE_CHECK(28, 4, 0xffacffdc);
 
+    /*
+     * Test ro/rw1c/rsvdp/rsvdz masks.
+     *
+     * 32     24     16      8      0
+     *  +---------------------------+
+     *  |            r32            | 32
+     *  +---------------------------+
+     *
+     */
+    r32.rw1c_mask = 0x0000ff00U;
+    VPCI_ADD_REG_MASK(vpci_read32_mask, vpci_write32_mask, 32, 4, r32,
+                      0x000000ffU   /* RO    */,
+                      r32.rw1c_mask /* RW1C  */,
+                      0x00ff0000U   /* RsvdP */,
+                      0xff000000U   /* RsvdZ */);
+
+    /* ro */
+    r32.val = 0x0f0f0f0fU;
+    VPCI_READ_CHECK(32, 1, 0x0f);
+    VPCI_WRITE(32, 1, 0x5a);
+    VPCI_READ_CHECK(32, 1, 0x0f);
+    assert(r32.val == 0x000f0f0fU);
+
+    /* rw1c */
+    r32.val = 0x0f0f0f0fU;
+    VPCI_READ_CHECK(33, 1, 0x0f);
+    VPCI_WRITE(33, 1, 0x5a);
+    VPCI_READ_CHECK(33, 1, 0x05);
+    assert(r32.val == 0x000f050fU);
+
+    /* rsvdp */
+    r32.val = 0x0f0f0f0fU;
+    VPCI_READ_CHECK(34, 1, 0);
+    VPCI_WRITE(34, 1, 0x5a);
+    VPCI_READ_CHECK(34, 1, 0);
+    assert(r32.val == 0x000f0f0fU);
+
+    /* rsvdz */
+    r32.val = 0x0f0f0f0fU;
+    VPCI_READ_CHECK(35, 1, 0);
+    VPCI_WRITE(35, 1, 0x5a);
+    VPCI_READ_CHECK(35, 1, 0);
+    assert(r32.val == 0x000f0f0fU);
+
+    /* write all 0's */
+    r32.val = 0x0f0f0f0fU;
+    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
+    VPCI_WRITE(32, 4, 0);
+    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
+    assert(r32.val == 0x000f0f0fU);
+
+    /* write all 1's */
+    r32.val = 0x0f0f0f0fU;
+    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
+    VPCI_WRITE(32, 4, 0xffffffffU);
+    VPCI_READ_CHECK(32, 4, 0x0000000fU);
+    assert(r32.val == 0x000f000fU);
+
     /* Finally try to remove a couple of registers. */
     VPCI_REMOVE_REG(28, 1);
     VPCI_REMOVE_REG(24, 4);
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 767c1ba718d7..351318121e48 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -521,6 +521,7 @@ static int cf_check init_bars(struct pci_dev *pdev)
     struct vpci_header *header = &pdev->vpci->header;
     struct vpci_bar *bars = header->bars;
     int rc;
+    bool mask_cap_list = false;
 
     switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
     {
@@ -544,6 +545,17 @@ static int cf_check init_bars(struct pci_dev *pdev)
     if ( rc )
         return rc;
 
+    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
+    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
+                                PCI_STATUS, 2, NULL,
+                                PCI_STATUS_RO_MASK &
+                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
+                                PCI_STATUS_RW1C_MASK,
+                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
+                                PCI_STATUS_RSVDZ_MASK);
+    if ( rc )
+        return rc;
+
     if ( pdev->ignore_bars )
         return 0;
 
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 3bec9a4153da..96187b70141b 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -29,6 +29,10 @@ struct vpci_register {
     unsigned int offset;
     void *private;
     struct list_head node;
+    uint32_t ro_mask;
+    uint32_t rw1c_mask;
+    uint32_t rsvdp_mask;
+    uint32_t rsvdz_mask;
 };
 
 #ifdef __XEN__
@@ -145,9 +149,17 @@ uint32_t cf_check vpci_hw_read32(
     return pci_conf_read32(pdev->sbdf, reg);
 }
 
-int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
-                      vpci_write_t *write_handler, unsigned int offset,
-                      unsigned int size, void *data)
+void cf_check vpci_hw_write16(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+    pci_conf_write16(pdev->sbdf, reg, val);
+}
+
+static int add_register(struct vpci *vpci, vpci_read_t *read_handler,
+                        vpci_write_t *write_handler, unsigned int offset,
+                        unsigned int size, void *data, uint32_t ro_mask,
+                        uint32_t rw1c_mask, uint32_t rsvdp_mask,
+                        uint32_t rsvdz_mask)
 {
     struct list_head *prev;
     struct vpci_register *r;
@@ -155,7 +167,10 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
     /* Some sanity checks. */
     if ( (size != 1 && size != 2 && size != 4) ||
          offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) ||
-         (!read_handler && !write_handler) )
+         (!read_handler && !write_handler) || (ro_mask & rw1c_mask) ||
+         (ro_mask & rsvdp_mask) || (ro_mask & rsvdz_mask) ||
+         (rw1c_mask & rsvdp_mask) || (rw1c_mask & rsvdz_mask) ||
+         (rsvdp_mask & rsvdz_mask) )
         return -EINVAL;
 
     r = xmalloc(struct vpci_register);
@@ -167,6 +182,10 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
     r->size = size;
     r->offset = offset;
     r->private = data;
+    r->ro_mask = ro_mask & (0xffffffffU >> (32 - 8 * size));
+    r->rw1c_mask = rw1c_mask & (0xffffffffU >> (32 - 8 * size));
+    r->rsvdp_mask = rsvdp_mask & (0xffffffffU >> (32 - 8 * size));
+    r->rsvdz_mask = rsvdz_mask & (0xffffffffU >> (32 - 8 * size));
 
     spin_lock(&vpci->lock);
 
@@ -193,6 +212,24 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
     return 0;
 }
 
+int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
+                      vpci_write_t *write_handler, unsigned int offset,
+                      unsigned int size, void *data)
+{
+    return add_register(vpci, read_handler, write_handler, offset, size, data,
+                        0, 0, 0, 0);
+}
+
+int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
+                           vpci_write_t *write_handler, unsigned int offset,
+                           unsigned int size, void *data, uint32_t ro_mask,
+                           uint32_t rw1c_mask, uint32_t rsvdp_mask,
+                           uint32_t rsvdz_mask)
+{
+    return add_register(vpci, read_handler, write_handler, offset, size, data,
+                        ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask);
+}
+
 int vpci_remove_register(struct vpci *vpci, unsigned int offset,
                          unsigned int size)
 {
@@ -376,6 +413,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
         }
 
         val = r->read(pdev, r->offset, r->private);
+        val &= ~(r->rsvdp_mask | r->rsvdz_mask);
 
         /* Check if the read is in the middle of a register. */
         if ( r->offset < emu.offset )
@@ -407,26 +445,26 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
 
 /*
  * Perform a maybe partial write to a register.
- *
- * Note that this will only work for simple registers, if Xen needs to
- * trap accesses to rw1c registers (like the status PCI header register)
- * the logic in vpci_write will have to be expanded in order to correctly
- * deal with them.
  */
 static void vpci_write_helper(const struct pci_dev *pdev,
                               const struct vpci_register *r, unsigned int size,
                               unsigned int offset, uint32_t data)
 {
+    uint32_t val = 0;
+    uint32_t preserved_mask = r->ro_mask | r->rsvdp_mask;
+
     ASSERT(size <= r->size);
 
-    if ( size != r->size )
+    if ( (size != r->size) || preserved_mask )
     {
-        uint32_t val;
-
         val = r->read(pdev, r->offset, r->private);
+        val &= ~r->rw1c_mask;
         data = merge_result(val, data, size, offset);
     }
 
+    data &= ~(preserved_mask | r->rsvdz_mask);
+    data |= val & preserved_mask;
+
     r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)),
              r->private);
 }
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 84b18736a85d..9909b27425a5 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -66,6 +66,15 @@
 #define  PCI_STATUS_REC_MASTER_ABORT	0x2000 /* Set on master abort */
 #define  PCI_STATUS_SIG_SYSTEM_ERROR	0x4000 /* Set when we drive SERR */
 #define  PCI_STATUS_DETECTED_PARITY	0x8000 /* Set on parity error */
+#define  PCI_STATUS_RSVDZ_MASK		0x0046 /* Includes PCI_STATUS_UDF */
+
+#define  PCI_STATUS_RO_MASK (PCI_STATUS_IMM_READY | PCI_STATUS_INTERRUPT | \
+    PCI_STATUS_CAP_LIST | PCI_STATUS_66MHZ | PCI_STATUS_FAST_BACK | \
+    PCI_STATUS_DEVSEL_MASK)
+#define  PCI_STATUS_RW1C_MASK (PCI_STATUS_PARITY | \
+    PCI_STATUS_SIG_TARGET_ABORT | PCI_STATUS_REC_TARGET_ABORT | \
+    PCI_STATUS_REC_MASTER_ABORT | PCI_STATUS_SIG_SYSTEM_ERROR | \
+    PCI_STATUS_DETECTED_PARITY)
 
 #define PCI_CLASS_REVISION	0x08	/* High 24 bits are class, low 8 revision */
 #define PCI_REVISION_ID		0x08	/* Revision ID */
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index d743d96a10b8..8e8e42372ec1 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -37,6 +37,13 @@ int __must_check vpci_add_register(struct vpci *vpci,
                                    vpci_write_t *write_handler,
                                    unsigned int offset, unsigned int size,
                                    void *data);
+int __must_check vpci_add_register_mask(struct vpci *vpci,
+                                        vpci_read_t *read_handler,
+                                        vpci_write_t *write_handler,
+                                        unsigned int offset, unsigned int size,
+                                        void *data, uint32_t ro_mask,
+                                        uint32_t rw1c_mask, uint32_t rsvdp_mask,
+                                        uint32_t rsvdz_mask);
 int __must_check vpci_remove_register(struct vpci *vpci, unsigned int offset,
                                       unsigned int size);
 
@@ -50,6 +57,8 @@ uint32_t cf_check vpci_hw_read16(
     const struct pci_dev *pdev, unsigned int reg, void *data);
 uint32_t cf_check vpci_hw_read32(
     const struct pci_dev *pdev, unsigned int reg, void *data);
+void cf_check vpci_hw_write16(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
 
 /*
  * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities
  2023-11-28 19:44 [PATCH v8 0/2] vPCI capabilities filtering Stewart Hildebrand
  2023-11-28 19:44 ` [PATCH v8 1/2] xen/vpci: header: status register handler Stewart Hildebrand
@ 2023-11-28 19:44 ` Stewart Hildebrand
  2023-11-29 14:05   ` Roger Pau Monné
  1 sibling, 1 reply; 9+ messages in thread
From: Stewart Hildebrand @ 2023-11-28 19:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Roger Pau Monné

Currently, Xen vPCI only supports virtualizing the MSI and MSI-X capabilities.
Hide all other PCI capabilities (including extended capabilities) from domUs for
now, even though there may be certain devices/drivers that depend on being able
to discover certain capabilities.

We parse the physical PCI capabilities linked list and add vPCI register
handlers for the next elements, inserting our own next value, thus presenting a
modified linked list to the domU.

Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val
helper function returns a fixed value, which may be used for RAZ registers, or
registers whose value doesn't change.

Introduce pci_find_next_cap_ttl() helper while adapting the logic from
pci_find_next_cap() to suit our needs, and implement the existing
pci_find_next_cap() in terms of the new helper.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
v7->v8:
* use to array instead of match function
* include lib.h for ARRAY_SIZE
* don't emulate PCI_CAPABILITY_LIST register if PCI_STATUS_CAP_LIST bit is not
  set in hardware
* spell out RAZ/WI acronym
* dropped R-b tag since the patch has changed moderately since the last rev

v6->v7:
* no change

v5->v6:
* add register handlers before status register handler in init_bars()
* s/header->mask_cap_list/mask_cap_list/

v4->v5:
* use more appropriate types, continued
* get rid of unnecessary hook function
* add Jan's R-b

v3->v4:
* move mask_cap_list setting to this patch
* leave pci_find_next_cap signature alone
* use more appropriate types

v2->v3:
* get rid of > 0 in loop condition
* implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function so
  that hypothetical future callers wouldn't be required to pass &ttl.
* change NULL to (void *)0 for RAZ value passed to vpci_read_val
* change type of ttl to unsigned int
* remember to mask off the low 2 bits of next in the initial loop iteration
* change return type of pci_find_next_cap and pci_find_next_cap_ttl
* avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0

v1->v2:
* change type of ttl to int
* use switch statement instead of if/else
* adapt existing pci_find_next_cap helper instead of rolling our own
* pass ttl as in/out
* "pass through" the lower 2 bits of the next pointer
* squash helper functions into this patch to avoid transient dead code situation
* extended capabilities RAZ/WI
---
 xen/drivers/pci/pci.c     | 31 ++++++++++++-------
 xen/drivers/vpci/header.c | 63 +++++++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c   | 12 ++++++++
 xen/include/xen/pci.h     |  3 ++
 xen/include/xen/vpci.h    |  5 ++++
 5 files changed, 104 insertions(+), 10 deletions(-)

diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c
index 3569ccb24e9e..1645b3118220 100644
--- a/xen/drivers/pci/pci.c
+++ b/xen/drivers/pci/pci.c
@@ -39,31 +39,42 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap)
     return 0;
 }
 
-unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
-                               unsigned int cap)
+unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
+                                   unsigned int *cap, unsigned int n,
+                                   unsigned int *ttl)
 {
-    u8 id;
-    int ttl = 48;
+    unsigned int id, i;
 
-    while ( ttl-- )
+    while ( (*ttl)-- )
     {
         pos = pci_conf_read8(sbdf, pos);
         if ( pos < 0x40 )
             break;
 
-        pos &= ~3;
-        id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID);
+        id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID);
 
         if ( id == 0xff )
             break;
-        if ( id == cap )
-            return pos;
+        for ( i = 0; i < n; i++ )
+        {
+            if ( id == cap[i] )
+                return pos;
+        }
 
-        pos += PCI_CAP_LIST_NEXT;
+        pos = (pos & ~3) + PCI_CAP_LIST_NEXT;
     }
+
     return 0;
 }
 
+unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
+                               unsigned int cap)
+{
+    unsigned int ttl = 48;
+
+    return pci_find_next_cap_ttl(sbdf, pos, &cap, 1, &ttl) & ~3;
+}
+
 /**
  * pci_find_ext_capability - Find an extended capability
  * @sbdf: PCI device to query
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 351318121e48..d7dc0c82a6ba 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -18,6 +18,7 @@
  */
 
 #include <xen/iocap.h>
+#include <xen/lib.h>
 #include <xen/sched.h>
 #include <xen/softirq.h>
 #include <xen/vpci.h>
@@ -545,6 +546,68 @@ static int cf_check init_bars(struct pci_dev *pdev)
     if ( rc )
         return rc;
 
+    if ( !is_hardware_domain(pdev->domain) )
+    {
+        if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
+        {
+            /* Only expose capabilities to the guest that vPCI can handle. */
+            unsigned int next, ttl = 48;
+            unsigned int supported_caps[] = {
+                PCI_CAP_ID_MSI,
+                PCI_CAP_ID_MSIX,
+            };
+
+            next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
+                                         supported_caps,
+                                         ARRAY_SIZE(supported_caps), &ttl);
+
+            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+                                   PCI_CAPABILITY_LIST, 1,
+                                   (void *)(uintptr_t)next);
+            if ( rc )
+                return rc;
+
+            next &= ~3;
+
+            if ( !next )
+                /*
+                 * If we don't have any supported capabilities to expose to the
+                 * guest, mask the PCI_STATUS_CAP_LIST bit in the status
+                 * register.
+                 */
+                mask_cap_list = true;
+
+            while ( next && ttl )
+            {
+                unsigned int pos = next;
+
+                next = pci_find_next_cap_ttl(pdev->sbdf,
+                                             pos + PCI_CAP_LIST_NEXT,
+                                             supported_caps,
+                                             ARRAY_SIZE(supported_caps), &ttl);
+
+                rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
+                                       pos + PCI_CAP_LIST_ID, 1, NULL);
+                if ( rc )
+                    return rc;
+
+                rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+                                       pos + PCI_CAP_LIST_NEXT, 1,
+                                       (void *)(uintptr_t)next);
+                if ( rc )
+                    return rc;
+
+                next &= ~3;
+            }
+        }
+
+        /* Extended capabilities read as zero, write ignore */
+        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
+                               (void *)0);
+        if ( rc )
+            return rc;
+    }
+
     /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
     rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
                                 PCI_STATUS, 2, NULL,
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 96187b70141b..99307e310bbb 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -137,6 +137,18 @@ static void cf_check vpci_ignored_write(
 {
 }
 
+uint32_t cf_check vpci_read_val(
+    const struct pci_dev *pdev, unsigned int reg, void *data)
+{
+    return (uintptr_t)data;
+}
+
+uint32_t cf_check vpci_hw_read8(
+    const struct pci_dev *pdev, unsigned int reg, void *data)
+{
+    return pci_conf_read8(pdev->sbdf, reg);
+}
+
 uint32_t cf_check vpci_hw_read16(
     const struct pci_dev *pdev, unsigned int reg, void *data)
 {
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 50d7dfb2a2fd..b2dcef01a1cf 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -205,6 +205,9 @@ int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 int pci_mmcfg_write(unsigned int seg, unsigned int bus,
                     unsigned int devfn, int reg, int len, u32 value);
 unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap);
+unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
+                                   unsigned int *cap, unsigned int n,
+                                   unsigned int *ttl);
 unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
                                unsigned int cap);
 unsigned int pci_find_ext_capability(pci_sbdf_t sbdf, unsigned int cap);
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 8e8e42372ec1..3c14a74d6255 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -52,7 +52,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
 void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
                 uint32_t data);
 
+uint32_t cf_check vpci_read_val(
+    const struct pci_dev *pdev, unsigned int reg, void *data);
+
 /* Passthrough handlers. */
+uint32_t cf_check vpci_hw_read8(
+    const struct pci_dev *pdev, unsigned int reg, void *data);
 uint32_t cf_check vpci_hw_read16(
     const struct pci_dev *pdev, unsigned int reg, void *data);
 uint32_t cf_check vpci_hw_read32(
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] xen/vpci: header: status register handler
  2023-11-28 19:44 ` [PATCH v8 1/2] xen/vpci: header: status register handler Stewart Hildebrand
@ 2023-11-29 11:03   ` Roger Pau Monné
  2023-11-29 15:18     ` Stewart Hildebrand
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2023-11-29 11:03 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: xen-devel, Wei Liu, Anthony PERARD, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini

On Tue, Nov 28, 2023 at 02:44:24PM -0500, Stewart Hildebrand wrote:
> Introduce a handler for the PCI status register, with ability to mask
> the capabilities bit. The status register contains RsvdZ bits,
> read-only bits, and write-1-to-clear bits. Additionally, we use RsvdP to
> mask the capabilities bit. Introduce bitmasks to handle these in vPCI.
> If a bit in the bitmask is set, then the special meaning applies:
> 
>   ro_mask: read normal, guest write ignore (preserve on write to hardware)
>   rw1c_mask: read normal, write 1 to clear
>   rsvdp_mask: read as zero, guest write ignore (preserve on write to hardware)
>   rsvdz_mask: read as zero, guest write ignore (write zero to hardware)
> 
> The RO/RW1C/RsvdP/RsvdZ naming and definitions were borrowed from the
> PCI Express Base 6.1 specification. RsvdP/RsvdZ bits help Xen enforce
> our view of the world. Xen preserves the value of read-only bits on
> write to hardware, discarding the guests write value. This is done in
> case hardware wrongly implements R/O bits as R/W.
> 
> The mask_cap_list flag will be set in a follow-on patch.
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>

Thanks for adding the tests, this is looking very good, just a couple
of cosmetics comments mostly, and a question whether we should refuse
masks that have bit set outside the register size instead of
attempting to adjust them.

> ---
> v7->v8:
> * move PCI_STATUS_UDF to rsvdz_mask (per PCI Express Base 6 spec)
> * add support for rsvdp bits
> * add tests for ro/rw1c/rsvdp/rsvdz bits in tools/tests/vpci/main.c
> * dropped R-b tag [1] since the patch has changed moderately since the last rev
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-09/msg00909.html
> 
> v6->v7:
> * re-work args passed to vpci_add_register_mask() (called in init_bars())
> * also check for overlap of (rsvdz_mask & ro_mask) in add_register()
> * slightly adjust masking operation in vpci_write_helper()
> 
> v5->v6:
> * remove duplicate PCI_STATUS_CAP_LIST in constant definition
> * style fixup in constant definitions
> * s/res_mask/rsvdz_mask/
> * combine a new masking operation into single line
> * preserve r/o bits on write
> * get rid of status_read. Instead, use rsvdz_mask for conditionally masking
>   PCI_STATUS_CAP_LIST bit
> * add comment about PCI_STATUS_CAP_LIST and rsvdp behavior
> * add sanity checks in add_register
> * move mask_cap_list from struct vpci_header to local variable
> 
> v4->v5:
> * add support for res_mask
> * add support for ro_mask (squash ro_mask patch)
> * add constants for reserved, read-only, and rw1c masks
> 
> v3->v4:
> * move mask_cap_list setting to the capabilities patch
> * single pci_conf_read16 in status_read
> * align mask_cap_list bitfield in struct vpci_header
> * change to rw1c bit mask instead of treating whole register as rw1c
> * drop subsystem prefix on renamed add_register function
> 
> v2->v3:
> * new patch
> ---
>  tools/tests/vpci/main.c    | 98 ++++++++++++++++++++++++++++++++++++++
>  xen/drivers/vpci/header.c  | 12 +++++
>  xen/drivers/vpci/vpci.c    | 62 +++++++++++++++++++-----
>  xen/include/xen/pci_regs.h |  9 ++++
>  xen/include/xen/vpci.h     |  9 ++++
>  5 files changed, 178 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
> index b9a0a6006bb9..b0bb993be297 100644
> --- a/tools/tests/vpci/main.c
> +++ b/tools/tests/vpci/main.c
> @@ -70,6 +70,26 @@ static void vpci_write32(const struct pci_dev *pdev, unsigned int reg,
>      *(uint32_t *)data = val;
>  }
>  
> +struct mask_data {
> +    uint32_t val;
> +    uint32_t rw1c_mask;
> +};
> +
> +static uint32_t vpci_read32_mask(const struct pci_dev *pdev, unsigned int reg,
> +                                 void *data)
> +{
> +    struct mask_data *md = data;

Newline, and possibly const for md.

> +    return md->val;
> +}
> +
> +static void vpci_write32_mask(const struct pci_dev *pdev, unsigned int reg,
> +                              uint32_t val, void *data)
> +{
> +    struct mask_data *md = data;

Newline.

> +    md->val  = val | (md->val & md->rw1c_mask);
> +    md->val &= ~(val & md->rw1c_mask);
> +}
> +
>  #define VPCI_READ(reg, size, data) ({                           \
>      data = vpci_read((pci_sbdf_t){ .sbdf = 0 }, reg, size);     \
>  })
> @@ -94,9 +114,20 @@ static void vpci_write32(const struct pci_dev *pdev, unsigned int reg,
>      assert(!vpci_add_register(test_pdev.vpci, fread, fwrite, off, size,     \
>                                &store))
>  
> +#define VPCI_ADD_REG_MASK(fread, fwrite, off, size, store,                     \
> +                          ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask)          \
> +    assert(!vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, size,   \
> +                                   &store,                                     \
> +                                   ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask))
> +
>  #define VPCI_ADD_INVALID_REG(fread, fwrite, off, size)                      \
>      assert(vpci_add_register(test_pdev.vpci, fread, fwrite, off, size, NULL))
>  
> +#define VPCI_ADD_INVALID_REG_MASK(fread, fwrite, off, size,                   \
> +                                  ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask) \
> +    assert(vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, size,   \
> +                                  NULL, ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask))
> +
>  #define VPCI_REMOVE_REG(off, size)                                          \
>      assert(!vpci_remove_register(test_pdev.vpci, off, size))
>  
> @@ -154,6 +185,7 @@ main(int argc, char **argv)
>      uint16_t r20[2] = { };
>      uint32_t r24 = 0;
>      uint8_t r28, r30;
> +    struct mask_data r32;
>      unsigned int i;
>      int rc;
>  
> @@ -213,6 +245,14 @@ main(int argc, char **argv)
>      /* Try to add a register with missing handlers. */
>      VPCI_ADD_INVALID_REG(NULL, NULL, 8, 2);
>  
> +    /* Try to add registers with the same bits set in multiple masks. */
> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 1, 0, 0);
> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 0, 1, 0);
> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 0, 0, 1);
> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 1, 1, 0);
> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 1, 0, 1);
> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 0, 1, 1);
> +
>      /* Read/write of unset register. */
>      VPCI_READ_CHECK(8, 4, 0xffffffff);
>      VPCI_READ_CHECK(8, 2, 0xffff);
> @@ -287,6 +327,64 @@ main(int argc, char **argv)
>      VPCI_ADD_REG(vpci_read8, vpci_write8, 30, 1, r30);
>      VPCI_WRITE_CHECK(28, 4, 0xffacffdc);
>  
> +    /*
> +     * Test ro/rw1c/rsvdp/rsvdz masks.
> +     *
> +     * 32     24     16      8      0
> +     *  +---------------------------+
> +     *  |            r32            | 32
> +     *  +---------------------------+

Might be even better to clarify which region is using each mask:

32     24     16      8      0
 +------+------+------+------+
 |rsvdz |rsvdp | rw1c |  ro  | 32
 +------+------+------+------+

> +     *
> +     */
> +    r32.rw1c_mask = 0x0000ff00U;
> +    VPCI_ADD_REG_MASK(vpci_read32_mask, vpci_write32_mask, 32, 4, r32,
> +                      0x000000ffU   /* RO    */,
> +                      r32.rw1c_mask /* RW1C  */,
> +                      0x00ff0000U   /* RsvdP */,
> +                      0xff000000U   /* RsvdZ */);
> +
> +    /* ro */
> +    r32.val = 0x0f0f0f0fU;
> +    VPCI_READ_CHECK(32, 1, 0x0f);
> +    VPCI_WRITE(32, 1, 0x5a);
> +    VPCI_READ_CHECK(32, 1, 0x0f);
> +    assert(r32.val == 0x000f0f0fU);
> +
> +    /* rw1c */
> +    r32.val = 0x0f0f0f0fU;
> +    VPCI_READ_CHECK(33, 1, 0x0f);
> +    VPCI_WRITE(33, 1, 0x5a);
> +    VPCI_READ_CHECK(33, 1, 0x05);
> +    assert(r32.val == 0x000f050fU);
> +
> +    /* rsvdp */
> +    r32.val = 0x0f0f0f0fU;
> +    VPCI_READ_CHECK(34, 1, 0);
> +    VPCI_WRITE(34, 1, 0x5a);
> +    VPCI_READ_CHECK(34, 1, 0);
> +    assert(r32.val == 0x000f0f0fU);
> +
> +    /* rsvdz */
> +    r32.val = 0x0f0f0f0fU;
> +    VPCI_READ_CHECK(35, 1, 0);
> +    VPCI_WRITE(35, 1, 0x5a);
> +    VPCI_READ_CHECK(35, 1, 0);
> +    assert(r32.val == 0x000f0f0fU);
> +
> +    /* write all 0's */
> +    r32.val = 0x0f0f0f0fU;
> +    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
> +    VPCI_WRITE(32, 4, 0);
> +    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
> +    assert(r32.val == 0x000f0f0fU);
> +
> +    /* write all 1's */
> +    r32.val = 0x0f0f0f0fU;
> +    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
> +    VPCI_WRITE(32, 4, 0xffffffffU);
> +    VPCI_READ_CHECK(32, 4, 0x0000000fU);
> +    assert(r32.val == 0x000f000fU);
> +
>      /* Finally try to remove a couple of registers. */
>      VPCI_REMOVE_REG(28, 1);
>      VPCI_REMOVE_REG(24, 4);
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 767c1ba718d7..351318121e48 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -521,6 +521,7 @@ static int cf_check init_bars(struct pci_dev *pdev)
>      struct vpci_header *header = &pdev->vpci->header;
>      struct vpci_bar *bars = header->bars;
>      int rc;
> +    bool mask_cap_list = false;
>  
>      switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
>      {
> @@ -544,6 +545,17 @@ static int cf_check init_bars(struct pci_dev *pdev)
>      if ( rc )
>          return rc;
>  
> +    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
> +    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
> +                                PCI_STATUS, 2, NULL,
> +                                PCI_STATUS_RO_MASK &
> +                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
> +                                PCI_STATUS_RW1C_MASK,
> +                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
> +                                PCI_STATUS_RSVDZ_MASK);
> +    if ( rc )
> +        return rc;
> +
>      if ( pdev->ignore_bars )
>          return 0;
>  
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 3bec9a4153da..96187b70141b 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -29,6 +29,10 @@ struct vpci_register {
>      unsigned int offset;
>      void *private;
>      struct list_head node;
> +    uint32_t ro_mask;
> +    uint32_t rw1c_mask;
> +    uint32_t rsvdp_mask;
> +    uint32_t rsvdz_mask;
>  };
>  
>  #ifdef __XEN__
> @@ -145,9 +149,17 @@ uint32_t cf_check vpci_hw_read32(
>      return pci_conf_read32(pdev->sbdf, reg);
>  }
>  
> -int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
> -                      vpci_write_t *write_handler, unsigned int offset,
> -                      unsigned int size, void *data)
> +void cf_check vpci_hw_write16(
> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> +{
> +    pci_conf_write16(pdev->sbdf, reg, val);
> +}
> +
> +static int add_register(struct vpci *vpci, vpci_read_t *read_handler,
> +                        vpci_write_t *write_handler, unsigned int offset,
> +                        unsigned int size, void *data, uint32_t ro_mask,
> +                        uint32_t rw1c_mask, uint32_t rsvdp_mask,
> +                        uint32_t rsvdz_mask)
>  {
>      struct list_head *prev;
>      struct vpci_register *r;
> @@ -155,7 +167,10 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>      /* Some sanity checks. */
>      if ( (size != 1 && size != 2 && size != 4) ||
>           offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) ||
> -         (!read_handler && !write_handler) )
> +         (!read_handler && !write_handler) || (ro_mask & rw1c_mask) ||
> +         (ro_mask & rsvdp_mask) || (ro_mask & rsvdz_mask) ||
> +         (rw1c_mask & rsvdp_mask) || (rw1c_mask & rsvdz_mask) ||
> +         (rsvdp_mask & rsvdz_mask) )

It would also be helpful to check that the masks don't have bits set
above the given register size, ie:

if ( size != 4 &&
     (ro_mask | rw1c_mask | rsvdp_mask | rsvdz_mask) >> (size * 8) )
    return -EINVAL;

>          return -EINVAL;
>  
>      r = xmalloc(struct vpci_register);
> @@ -167,6 +182,10 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>      r->size = size;
>      r->offset = offset;
>      r->private = data;
> +    r->ro_mask = ro_mask & (0xffffffffU >> (32 - 8 * size));
> +    r->rw1c_mask = rw1c_mask & (0xffffffffU >> (32 - 8 * size));
> +    r->rsvdp_mask = rsvdp_mask & (0xffffffffU >> (32 - 8 * size));
> +    r->rsvdz_mask = rsvdz_mask & (0xffffffffU >> (32 - 8 * size));

Oh, you adjust the masks to match the expected width.  I think it
might be more sensible to instead make sure the caller has provided
appropriate masks, as providing a mask that doesn't match the register
size likely points out to issues in the caller.

>  
>      spin_lock(&vpci->lock);
>  
> @@ -193,6 +212,24 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>      return 0;
>  }
>  
> +int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
> +                      vpci_write_t *write_handler, unsigned int offset,
> +                      unsigned int size, void *data)
> +{
> +    return add_register(vpci, read_handler, write_handler, offset, size, data,
> +                        0, 0, 0, 0);
> +}
> +
> +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
> +                           vpci_write_t *write_handler, unsigned int offset,
> +                           unsigned int size, void *data, uint32_t ro_mask,
> +                           uint32_t rw1c_mask, uint32_t rsvdp_mask,
> +                           uint32_t rsvdz_mask)
> +{
> +    return add_register(vpci, read_handler, write_handler, offset, size, data,
> +                        ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask);
> +}
> +
>  int vpci_remove_register(struct vpci *vpci, unsigned int offset,
>                           unsigned int size)
>  {
> @@ -376,6 +413,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>          }
>  
>          val = r->read(pdev, r->offset, r->private);
> +        val &= ~(r->rsvdp_mask | r->rsvdz_mask);
>  
>          /* Check if the read is in the middle of a register. */
>          if ( r->offset < emu.offset )
> @@ -407,26 +445,26 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>  
>  /*
>   * Perform a maybe partial write to a register.
> - *
> - * Note that this will only work for simple registers, if Xen needs to
> - * trap accesses to rw1c registers (like the status PCI header register)
> - * the logic in vpci_write will have to be expanded in order to correctly
> - * deal with them.
>   */
>  static void vpci_write_helper(const struct pci_dev *pdev,
>                                const struct vpci_register *r, unsigned int size,
>                                unsigned int offset, uint32_t data)
>  {
> +    uint32_t val = 0;

Nit: might be clearer to name this 'current': it's easy to get
confused whether val or data holds the user-provided input.

> +    uint32_t preserved_mask = r->ro_mask | r->rsvdp_mask;
> +
>      ASSERT(size <= r->size);
>  
> -    if ( size != r->size )
> +    if ( (size != r->size) || preserved_mask )
>      {
> -        uint32_t val;
> -
>          val = r->read(pdev, r->offset, r->private);
> +        val &= ~r->rw1c_mask;
>          data = merge_result(val, data, size, offset);
>      }
>  
> +    data &= ~(preserved_mask | r->rsvdz_mask);
> +    data |= val & preserved_mask;
> +
>      r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)),
>               r->private);
>  }
> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
> index 84b18736a85d..9909b27425a5 100644
> --- a/xen/include/xen/pci_regs.h
> +++ b/xen/include/xen/pci_regs.h
> @@ -66,6 +66,15 @@
>  #define  PCI_STATUS_REC_MASTER_ABORT	0x2000 /* Set on master abort */
>  #define  PCI_STATUS_SIG_SYSTEM_ERROR	0x4000 /* Set when we drive SERR */
>  #define  PCI_STATUS_DETECTED_PARITY	0x8000 /* Set on parity error */
> +#define  PCI_STATUS_RSVDZ_MASK		0x0046 /* Includes PCI_STATUS_UDF */
> +
> +#define  PCI_STATUS_RO_MASK (PCI_STATUS_IMM_READY | PCI_STATUS_INTERRUPT | \
> +    PCI_STATUS_CAP_LIST | PCI_STATUS_66MHZ | PCI_STATUS_FAST_BACK | \
> +    PCI_STATUS_DEVSEL_MASK)
> +#define  PCI_STATUS_RW1C_MASK (PCI_STATUS_PARITY | \
> +    PCI_STATUS_SIG_TARGET_ABORT | PCI_STATUS_REC_TARGET_ABORT | \
> +    PCI_STATUS_REC_MASTER_ABORT | PCI_STATUS_SIG_SYSTEM_ERROR | \
> +    PCI_STATUS_DETECTED_PARITY)
>  
>  #define PCI_CLASS_REVISION	0x08	/* High 24 bits are class, low 8 revision */
>  #define PCI_REVISION_ID		0x08	/* Revision ID */
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index d743d96a10b8..8e8e42372ec1 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -37,6 +37,13 @@ int __must_check vpci_add_register(struct vpci *vpci,
>                                     vpci_write_t *write_handler,
>                                     unsigned int offset, unsigned int size,
>                                     void *data);
> +int __must_check vpci_add_register_mask(struct vpci *vpci,
> +                                        vpci_read_t *read_handler,
> +                                        vpci_write_t *write_handler,
> +                                        unsigned int offset, unsigned int size,
> +                                        void *data, uint32_t ro_mask,
> +                                        uint32_t rw1c_mask, uint32_t rsvdp_mask,
> +                                        uint32_t rsvdz_mask);

Instead of exporting two functions, you could export only
vpci_add_register_mask() and make vpci_add_register() a static inline
defined in the header as a wrapper around vpci_add_register_mask().

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities
  2023-11-28 19:44 ` [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities Stewart Hildebrand
@ 2023-11-29 14:05   ` Roger Pau Monné
  2023-11-29 15:55     ` Stewart Hildebrand
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2023-11-29 14:05 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: xen-devel, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

On Tue, Nov 28, 2023 at 02:44:25PM -0500, Stewart Hildebrand wrote:
> Currently, Xen vPCI only supports virtualizing the MSI and MSI-X capabilities.
> Hide all other PCI capabilities (including extended capabilities) from domUs for
> now, even though there may be certain devices/drivers that depend on being able
> to discover certain capabilities.
> 
> We parse the physical PCI capabilities linked list and add vPCI register
> handlers for the next elements, inserting our own next value, thus presenting a
> modified linked list to the domU.
> 
> Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val
> helper function returns a fixed value, which may be used for RAZ registers, or
                                                               ^ read as zero
> registers whose value doesn't change.
> 
> Introduce pci_find_next_cap_ttl() helper while adapting the logic from
> pci_find_next_cap() to suit our needs, and implement the existing
> pci_find_next_cap() in terms of the new helper.
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>

LGTM, some nits below:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> v7->v8:
> * use to array instead of match function
> * include lib.h for ARRAY_SIZE
> * don't emulate PCI_CAPABILITY_LIST register if PCI_STATUS_CAP_LIST bit is not
>   set in hardware
> * spell out RAZ/WI acronym
> * dropped R-b tag since the patch has changed moderately since the last rev
> 
> v6->v7:
> * no change
> 
> v5->v6:
> * add register handlers before status register handler in init_bars()
> * s/header->mask_cap_list/mask_cap_list/
> 
> v4->v5:
> * use more appropriate types, continued
> * get rid of unnecessary hook function
> * add Jan's R-b
> 
> v3->v4:
> * move mask_cap_list setting to this patch
> * leave pci_find_next_cap signature alone
> * use more appropriate types
> 
> v2->v3:
> * get rid of > 0 in loop condition
> * implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function so
>   that hypothetical future callers wouldn't be required to pass &ttl.
> * change NULL to (void *)0 for RAZ value passed to vpci_read_val
> * change type of ttl to unsigned int
> * remember to mask off the low 2 bits of next in the initial loop iteration
> * change return type of pci_find_next_cap and pci_find_next_cap_ttl
> * avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0
> 
> v1->v2:
> * change type of ttl to int
> * use switch statement instead of if/else
> * adapt existing pci_find_next_cap helper instead of rolling our own
> * pass ttl as in/out
> * "pass through" the lower 2 bits of the next pointer
> * squash helper functions into this patch to avoid transient dead code situation
> * extended capabilities RAZ/WI
> ---
>  xen/drivers/pci/pci.c     | 31 ++++++++++++-------
>  xen/drivers/vpci/header.c | 63 +++++++++++++++++++++++++++++++++++++++
>  xen/drivers/vpci/vpci.c   | 12 ++++++++
>  xen/include/xen/pci.h     |  3 ++
>  xen/include/xen/vpci.h    |  5 ++++
>  5 files changed, 104 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c
> index 3569ccb24e9e..1645b3118220 100644
> --- a/xen/drivers/pci/pci.c
> +++ b/xen/drivers/pci/pci.c
> @@ -39,31 +39,42 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap)
>      return 0;
>  }
>  
> -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
> -                               unsigned int cap)
> +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
> +                                   unsigned int *cap, unsigned int n,
> +                                   unsigned int *ttl)
>  {
> -    u8 id;
> -    int ttl = 48;
> +    unsigned int id, i;

Nit: those can be defined inside the while loop.

> -    while ( ttl-- )
> +    while ( (*ttl)-- )
>      {
>          pos = pci_conf_read8(sbdf, pos);
>          if ( pos < 0x40 )
>              break;
>  
> -        pos &= ~3;
> -        id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID);
> +        id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID);
>  
>          if ( id == 0xff )
>              break;
> -        if ( id == cap )
> -            return pos;
> +        for ( i = 0; i < n; i++ )
> +        {
> +            if ( id == cap[i] )
> +                return pos;
> +        }
>  
> -        pos += PCI_CAP_LIST_NEXT;
> +        pos = (pos & ~3) + PCI_CAP_LIST_NEXT;
>      }
> +
>      return 0;
>  }
>  
> +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
> +                               unsigned int cap)
> +{
> +    unsigned int ttl = 48;
> +
> +    return pci_find_next_cap_ttl(sbdf, pos, &cap, 1, &ttl) & ~3;
> +}
> +
>  /**
>   * pci_find_ext_capability - Find an extended capability
>   * @sbdf: PCI device to query
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 351318121e48..d7dc0c82a6ba 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -18,6 +18,7 @@
>   */
>  
>  #include <xen/iocap.h>
> +#include <xen/lib.h>
>  #include <xen/sched.h>
>  #include <xen/softirq.h>
>  #include <xen/vpci.h>
> @@ -545,6 +546,68 @@ static int cf_check init_bars(struct pci_dev *pdev)

Could you please rename to init_header now that we do much more than
dealing with the BARs?

>      if ( rc )
>          return rc;
>  
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
> +        {
> +            /* Only expose capabilities to the guest that vPCI can handle. */
> +            unsigned int next, ttl = 48;
> +            unsigned int supported_caps[] = {

const?

We likely need to find a way to do this programmatically, so that when
a new capability is supported we don't need to go and modify the list
here every time.  We can sort that out at a later point however.

> +                PCI_CAP_ID_MSI,
> +                PCI_CAP_ID_MSIX,
> +            };
> +
> +            next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> +                                         supported_caps,
> +                                         ARRAY_SIZE(supported_caps), &ttl);
> +
> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> +                                   PCI_CAPABILITY_LIST, 1,
> +                                   (void *)(uintptr_t)next);
> +            if ( rc )
> +                return rc;
> +
> +            next &= ~3;
> +
> +            if ( !next )
> +                /*
> +                 * If we don't have any supported capabilities to expose to the
> +                 * guest, mask the PCI_STATUS_CAP_LIST bit in the status
> +                 * register.
> +                 */
> +                mask_cap_list = true;
> +
> +            while ( next && ttl )
> +            {
> +                unsigned int pos = next;
> +
> +                next = pci_find_next_cap_ttl(pdev->sbdf,
> +                                             pos + PCI_CAP_LIST_NEXT,
> +                                             supported_caps,
> +                                             ARRAY_SIZE(supported_caps), &ttl);
> +
> +                rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> +                                       pos + PCI_CAP_LIST_ID, 1, NULL);
> +                if ( rc )
> +                    return rc;
> +
> +                rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> +                                       pos + PCI_CAP_LIST_NEXT, 1,
> +                                       (void *)(uintptr_t)next);
> +                if ( rc )
> +                    return rc;
> +
> +                next &= ~3;
> +            }
> +        }
> +
> +        /* Extended capabilities read as zero, write ignore */
> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
> +                               (void *)0);
> +        if ( rc )
> +            return rc;
> +    }
> +
>      /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
>      rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
>                                  PCI_STATUS, 2, NULL,
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 96187b70141b..99307e310bbb 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -137,6 +137,18 @@ static void cf_check vpci_ignored_write(
>  {
>  }
>  
> +uint32_t cf_check vpci_read_val(
> +    const struct pci_dev *pdev, unsigned int reg, void *data)
> +{
> +    return (uintptr_t)data;
> +}
> +
> +uint32_t cf_check vpci_hw_read8(
> +    const struct pci_dev *pdev, unsigned int reg, void *data)
> +{
> +    return pci_conf_read8(pdev->sbdf, reg);
> +}
> +
>  uint32_t cf_check vpci_hw_read16(
>      const struct pci_dev *pdev, unsigned int reg, void *data)
>  {
> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
> index 50d7dfb2a2fd..b2dcef01a1cf 100644
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -205,6 +205,9 @@ int pci_mmcfg_read(unsigned int seg, unsigned int bus,
>  int pci_mmcfg_write(unsigned int seg, unsigned int bus,
>                      unsigned int devfn, int reg, int len, u32 value);
>  unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap);
> +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
> +                                   unsigned int *cap, unsigned int n,
> +                                   unsigned int *ttl);
>  unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
>                                 unsigned int cap);
>  unsigned int pci_find_ext_capability(pci_sbdf_t sbdf, unsigned int cap);
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 8e8e42372ec1..3c14a74d6255 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -52,7 +52,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
>  void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>                  uint32_t data);
>  
> +uint32_t cf_check vpci_read_val(
> +    const struct pci_dev *pdev, unsigned int reg, void *data);

A small comment could be helpful: helper to return the value passed in the data
parameter.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] xen/vpci: header: status register handler
  2023-11-29 11:03   ` Roger Pau Monné
@ 2023-11-29 15:18     ` Stewart Hildebrand
  2023-11-30  8:40       ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Stewart Hildebrand @ 2023-11-29 15:18 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Wei Liu, Anthony PERARD, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini

On 11/29/23 06:03, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 02:44:24PM -0500, Stewart Hildebrand wrote:
>> Introduce a handler for the PCI status register, with ability to mask
>> the capabilities bit. The status register contains RsvdZ bits,
>> read-only bits, and write-1-to-clear bits. Additionally, we use RsvdP to
>> mask the capabilities bit. Introduce bitmasks to handle these in vPCI.
>> If a bit in the bitmask is set, then the special meaning applies:
>>
>>   ro_mask: read normal, guest write ignore (preserve on write to hardware)
>>   rw1c_mask: read normal, write 1 to clear
>>   rsvdp_mask: read as zero, guest write ignore (preserve on write to hardware)
>>   rsvdz_mask: read as zero, guest write ignore (write zero to hardware)
>>
>> The RO/RW1C/RsvdP/RsvdZ naming and definitions were borrowed from the
>> PCI Express Base 6.1 specification. RsvdP/RsvdZ bits help Xen enforce
>> our view of the world. Xen preserves the value of read-only bits on
>> write to hardware, discarding the guests write value. This is done in
>> case hardware wrongly implements R/O bits as R/W.
>>
>> The mask_cap_list flag will be set in a follow-on patch.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> 
> Thanks for adding the tests, this is looking very good, just a couple
> of cosmetics comments mostly, and a question whether we should refuse
> masks that have bit set outside the register size instead of
> attempting to adjust them.
> 
>> ---
>> v7->v8:
>> * move PCI_STATUS_UDF to rsvdz_mask (per PCI Express Base 6 spec)
>> * add support for rsvdp bits
>> * add tests for ro/rw1c/rsvdp/rsvdz bits in tools/tests/vpci/main.c
>> * dropped R-b tag [1] since the patch has changed moderately since the last rev
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-09/msg00909.html
>>
>> v6->v7:
>> * re-work args passed to vpci_add_register_mask() (called in init_bars())
>> * also check for overlap of (rsvdz_mask & ro_mask) in add_register()
>> * slightly adjust masking operation in vpci_write_helper()
>>
>> v5->v6:
>> * remove duplicate PCI_STATUS_CAP_LIST in constant definition
>> * style fixup in constant definitions
>> * s/res_mask/rsvdz_mask/
>> * combine a new masking operation into single line
>> * preserve r/o bits on write
>> * get rid of status_read. Instead, use rsvdz_mask for conditionally masking
>>   PCI_STATUS_CAP_LIST bit
>> * add comment about PCI_STATUS_CAP_LIST and rsvdp behavior
>> * add sanity checks in add_register
>> * move mask_cap_list from struct vpci_header to local variable
>>
>> v4->v5:
>> * add support for res_mask
>> * add support for ro_mask (squash ro_mask patch)
>> * add constants for reserved, read-only, and rw1c masks
>>
>> v3->v4:
>> * move mask_cap_list setting to the capabilities patch
>> * single pci_conf_read16 in status_read
>> * align mask_cap_list bitfield in struct vpci_header
>> * change to rw1c bit mask instead of treating whole register as rw1c
>> * drop subsystem prefix on renamed add_register function
>>
>> v2->v3:
>> * new patch
>> ---
>>  tools/tests/vpci/main.c    | 98 ++++++++++++++++++++++++++++++++++++++
>>  xen/drivers/vpci/header.c  | 12 +++++
>>  xen/drivers/vpci/vpci.c    | 62 +++++++++++++++++++-----
>>  xen/include/xen/pci_regs.h |  9 ++++
>>  xen/include/xen/vpci.h     |  9 ++++
>>  5 files changed, 178 insertions(+), 12 deletions(-)
>>
>> diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
>> index b9a0a6006bb9..b0bb993be297 100644
>> --- a/tools/tests/vpci/main.c
>> +++ b/tools/tests/vpci/main.c
>> @@ -70,6 +70,26 @@ static void vpci_write32(const struct pci_dev *pdev, unsigned int reg,
>>      *(uint32_t *)data = val;
>>  }
>>  
>> +struct mask_data {
>> +    uint32_t val;
>> +    uint32_t rw1c_mask;
>> +};
>> +
>> +static uint32_t vpci_read32_mask(const struct pci_dev *pdev, unsigned int reg,
>> +                                 void *data)
>> +{
>> +    struct mask_data *md = data;
> 
> Newline, and possibly const for md.

Will do, and will do

> 
>> +    return md->val;
>> +}
>> +
>> +static void vpci_write32_mask(const struct pci_dev *pdev, unsigned int reg,
>> +                              uint32_t val, void *data)
>> +{
>> +    struct mask_data *md = data;
> 
> Newline.

Will do

> 
>> +    md->val  = val | (md->val & md->rw1c_mask);
>> +    md->val &= ~(val & md->rw1c_mask);
>> +}
>> +
>>  #define VPCI_READ(reg, size, data) ({                           \
>>      data = vpci_read((pci_sbdf_t){ .sbdf = 0 }, reg, size);     \
>>  })
>> @@ -94,9 +114,20 @@ static void vpci_write32(const struct pci_dev *pdev, unsigned int reg,
>>      assert(!vpci_add_register(test_pdev.vpci, fread, fwrite, off, size,     \
>>                                &store))
>>  
>> +#define VPCI_ADD_REG_MASK(fread, fwrite, off, size, store,                     \
>> +                          ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask)          \
>> +    assert(!vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, size,   \
>> +                                   &store,                                     \
>> +                                   ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask))
>> +
>>  #define VPCI_ADD_INVALID_REG(fread, fwrite, off, size)                      \
>>      assert(vpci_add_register(test_pdev.vpci, fread, fwrite, off, size, NULL))
>>  
>> +#define VPCI_ADD_INVALID_REG_MASK(fread, fwrite, off, size,                   \
>> +                                  ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask) \
>> +    assert(vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, size,   \
>> +                                  NULL, ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask))
>> +
>>  #define VPCI_REMOVE_REG(off, size)                                          \
>>      assert(!vpci_remove_register(test_pdev.vpci, off, size))
>>  
>> @@ -154,6 +185,7 @@ main(int argc, char **argv)
>>      uint16_t r20[2] = { };
>>      uint32_t r24 = 0;
>>      uint8_t r28, r30;
>> +    struct mask_data r32;
>>      unsigned int i;
>>      int rc;
>>  
>> @@ -213,6 +245,14 @@ main(int argc, char **argv)
>>      /* Try to add a register with missing handlers. */
>>      VPCI_ADD_INVALID_REG(NULL, NULL, 8, 2);
>>  
>> +    /* Try to add registers with the same bits set in multiple masks. */
>> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 1, 0, 0);
>> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 0, 1, 0);
>> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 1, 0, 0, 1);
>> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 1, 1, 0);
>> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 1, 0, 1);
>> +    VPCI_ADD_INVALID_REG_MASK(vpci_read32, vpci_write32, 8, 4, 0, 0, 1, 1);
>> +
>>      /* Read/write of unset register. */
>>      VPCI_READ_CHECK(8, 4, 0xffffffff);
>>      VPCI_READ_CHECK(8, 2, 0xffff);
>> @@ -287,6 +327,64 @@ main(int argc, char **argv)
>>      VPCI_ADD_REG(vpci_read8, vpci_write8, 30, 1, r30);
>>      VPCI_WRITE_CHECK(28, 4, 0xffacffdc);
>>  
>> +    /*
>> +     * Test ro/rw1c/rsvdp/rsvdz masks.
>> +     *
>> +     * 32     24     16      8      0
>> +     *  +---------------------------+
>> +     *  |            r32            | 32
>> +     *  +---------------------------+
> 
> Might be even better to clarify which region is using each mask:
> 
> 32     24     16      8      0
>  +------+------+------+------+
>  |rsvdz |rsvdp | rw1c |  ro  | 32
>  +------+------+------+------+

Will do

> 
>> +     *
>> +     */
>> +    r32.rw1c_mask = 0x0000ff00U;
>> +    VPCI_ADD_REG_MASK(vpci_read32_mask, vpci_write32_mask, 32, 4, r32,
>> +                      0x000000ffU   /* RO    */,
>> +                      r32.rw1c_mask /* RW1C  */,
>> +                      0x00ff0000U   /* RsvdP */,
>> +                      0xff000000U   /* RsvdZ */);
>> +
>> +    /* ro */
>> +    r32.val = 0x0f0f0f0fU;
>> +    VPCI_READ_CHECK(32, 1, 0x0f);
>> +    VPCI_WRITE(32, 1, 0x5a);
>> +    VPCI_READ_CHECK(32, 1, 0x0f);
>> +    assert(r32.val == 0x000f0f0fU);
>> +
>> +    /* rw1c */
>> +    r32.val = 0x0f0f0f0fU;
>> +    VPCI_READ_CHECK(33, 1, 0x0f);
>> +    VPCI_WRITE(33, 1, 0x5a);
>> +    VPCI_READ_CHECK(33, 1, 0x05);
>> +    assert(r32.val == 0x000f050fU);
>> +
>> +    /* rsvdp */
>> +    r32.val = 0x0f0f0f0fU;
>> +    VPCI_READ_CHECK(34, 1, 0);
>> +    VPCI_WRITE(34, 1, 0x5a);
>> +    VPCI_READ_CHECK(34, 1, 0);
>> +    assert(r32.val == 0x000f0f0fU);
>> +
>> +    /* rsvdz */
>> +    r32.val = 0x0f0f0f0fU;
>> +    VPCI_READ_CHECK(35, 1, 0);
>> +    VPCI_WRITE(35, 1, 0x5a);
>> +    VPCI_READ_CHECK(35, 1, 0);
>> +    assert(r32.val == 0x000f0f0fU);
>> +
>> +    /* write all 0's */
>> +    r32.val = 0x0f0f0f0fU;
>> +    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
>> +    VPCI_WRITE(32, 4, 0);
>> +    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
>> +    assert(r32.val == 0x000f0f0fU);
>> +
>> +    /* write all 1's */
>> +    r32.val = 0x0f0f0f0fU;
>> +    VPCI_READ_CHECK(32, 4, 0x00000f0fU);
>> +    VPCI_WRITE(32, 4, 0xffffffffU);
>> +    VPCI_READ_CHECK(32, 4, 0x0000000fU);
>> +    assert(r32.val == 0x000f000fU);
>> +
>>      /* Finally try to remove a couple of registers. */
>>      VPCI_REMOVE_REG(28, 1);
>>      VPCI_REMOVE_REG(24, 4);
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index 767c1ba718d7..351318121e48 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -521,6 +521,7 @@ static int cf_check init_bars(struct pci_dev *pdev)
>>      struct vpci_header *header = &pdev->vpci->header;
>>      struct vpci_bar *bars = header->bars;
>>      int rc;
>> +    bool mask_cap_list = false;
>>  
>>      switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
>>      {
>> @@ -544,6 +545,17 @@ static int cf_check init_bars(struct pci_dev *pdev)
>>      if ( rc )
>>          return rc;
>>  
>> +    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
>> +    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
>> +                                PCI_STATUS, 2, NULL,
>> +                                PCI_STATUS_RO_MASK &
>> +                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
>> +                                PCI_STATUS_RW1C_MASK,
>> +                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
>> +                                PCI_STATUS_RSVDZ_MASK);
>> +    if ( rc )
>> +        return rc;
>> +
>>      if ( pdev->ignore_bars )
>>          return 0;
>>  
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 3bec9a4153da..96187b70141b 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -29,6 +29,10 @@ struct vpci_register {
>>      unsigned int offset;
>>      void *private;
>>      struct list_head node;
>> +    uint32_t ro_mask;
>> +    uint32_t rw1c_mask;
>> +    uint32_t rsvdp_mask;
>> +    uint32_t rsvdz_mask;
>>  };
>>  
>>  #ifdef __XEN__
>> @@ -145,9 +149,17 @@ uint32_t cf_check vpci_hw_read32(
>>      return pci_conf_read32(pdev->sbdf, reg);
>>  }
>>  
>> -int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>> -                      vpci_write_t *write_handler, unsigned int offset,
>> -                      unsigned int size, void *data)
>> +void cf_check vpci_hw_write16(
>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>> +{
>> +    pci_conf_write16(pdev->sbdf, reg, val);
>> +}
>> +
>> +static int add_register(struct vpci *vpci, vpci_read_t *read_handler,
>> +                        vpci_write_t *write_handler, unsigned int offset,
>> +                        unsigned int size, void *data, uint32_t ro_mask,
>> +                        uint32_t rw1c_mask, uint32_t rsvdp_mask,
>> +                        uint32_t rsvdz_mask)
>>  {
>>      struct list_head *prev;
>>      struct vpci_register *r;
>> @@ -155,7 +167,10 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>>      /* Some sanity checks. */
>>      if ( (size != 1 && size != 2 && size != 4) ||
>>           offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) ||
>> -         (!read_handler && !write_handler) )
>> +         (!read_handler && !write_handler) || (ro_mask & rw1c_mask) ||
>> +         (ro_mask & rsvdp_mask) || (ro_mask & rsvdz_mask) ||
>> +         (rw1c_mask & rsvdp_mask) || (rw1c_mask & rsvdz_mask) ||
>> +         (rsvdp_mask & rsvdz_mask) )
> 
> It would also be helpful to check that the masks don't have bits set
> above the given register size, ie:

Will do

> 
> if ( size != 4 &&
>      (ro_mask | rw1c_mask | rsvdp_mask | rsvdz_mask) >> (size * 8) )>     return -EINVAL;
> 
>>          return -EINVAL;
>>  
>>      r = xmalloc(struct vpci_register);
>> @@ -167,6 +182,10 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>>      r->size = size;
>>      r->offset = offset;
>>      r->private = data;
>> +    r->ro_mask = ro_mask & (0xffffffffU >> (32 - 8 * size));
>> +    r->rw1c_mask = rw1c_mask & (0xffffffffU >> (32 - 8 * size));
>> +    r->rsvdp_mask = rsvdp_mask & (0xffffffffU >> (32 - 8 * size));
>> +    r->rsvdz_mask = rsvdz_mask & (0xffffffffU >> (32 - 8 * size));
> 
> Oh, you adjust the masks to match the expected width.  I think it
> might be more sensible to instead make sure the caller has provided
> appropriate masks, as providing a mask that doesn't match the register
> size likely points out to issues in the caller.

Got it, I will do the more sensible thing, and add tests for it :)

> 
>>  
>>      spin_lock(&vpci->lock);
>>  
>> @@ -193,6 +212,24 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>>      return 0;
>>  }
>>  
>> +int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>> +                      vpci_write_t *write_handler, unsigned int offset,
>> +                      unsigned int size, void *data)
>> +{
>> +    return add_register(vpci, read_handler, write_handler, offset, size, data,
>> +                        0, 0, 0, 0);
>> +}
>> +
>> +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
>> +                           vpci_write_t *write_handler, unsigned int offset,
>> +                           unsigned int size, void *data, uint32_t ro_mask,
>> +                           uint32_t rw1c_mask, uint32_t rsvdp_mask,
>> +                           uint32_t rsvdz_mask)
>> +{
>> +    return add_register(vpci, read_handler, write_handler, offset, size, data,
>> +                        ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask);
>> +}
>> +
>>  int vpci_remove_register(struct vpci *vpci, unsigned int offset,
>>                           unsigned int size)
>>  {
>> @@ -376,6 +413,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>>          }
>>  
>>          val = r->read(pdev, r->offset, r->private);
>> +        val &= ~(r->rsvdp_mask | r->rsvdz_mask);
>>  
>>          /* Check if the read is in the middle of a register. */
>>          if ( r->offset < emu.offset )
>> @@ -407,26 +445,26 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>>  
>>  /*
>>   * Perform a maybe partial write to a register.
>> - *
>> - * Note that this will only work for simple registers, if Xen needs to
>> - * trap accesses to rw1c registers (like the status PCI header register)
>> - * the logic in vpci_write will have to be expanded in order to correctly
>> - * deal with them.
>>   */
>>  static void vpci_write_helper(const struct pci_dev *pdev,
>>                                const struct vpci_register *r, unsigned int size,
>>                                unsigned int offset, uint32_t data)
>>  {
>> +    uint32_t val = 0;
> 
> Nit: might be clearer to name this 'current': it's easy to get
> confused whether val or data holds the user-provided input.

The name 'current' shadows an existing global variable/macro. How about current_val?

> 
>> +    uint32_t preserved_mask = r->ro_mask | r->rsvdp_mask;
>> +
>>      ASSERT(size <= r->size);
>>  
>> -    if ( size != r->size )
>> +    if ( (size != r->size) || preserved_mask )
>>      {
>> -        uint32_t val;
>> -
>>          val = r->read(pdev, r->offset, r->private);
>> +        val &= ~r->rw1c_mask;
>>          data = merge_result(val, data, size, offset);
>>      }
>>  
>> +    data &= ~(preserved_mask | r->rsvdz_mask);
>> +    data |= val & preserved_mask;
>> +
>>      r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)),
>>               r->private);
>>  }
>> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
>> index 84b18736a85d..9909b27425a5 100644
>> --- a/xen/include/xen/pci_regs.h
>> +++ b/xen/include/xen/pci_regs.h
>> @@ -66,6 +66,15 @@
>>  #define  PCI_STATUS_REC_MASTER_ABORT	0x2000 /* Set on master abort */
>>  #define  PCI_STATUS_SIG_SYSTEM_ERROR	0x4000 /* Set when we drive SERR */
>>  #define  PCI_STATUS_DETECTED_PARITY	0x8000 /* Set on parity error */
>> +#define  PCI_STATUS_RSVDZ_MASK		0x0046 /* Includes PCI_STATUS_UDF */
>> +
>> +#define  PCI_STATUS_RO_MASK (PCI_STATUS_IMM_READY | PCI_STATUS_INTERRUPT | \
>> +    PCI_STATUS_CAP_LIST | PCI_STATUS_66MHZ | PCI_STATUS_FAST_BACK | \
>> +    PCI_STATUS_DEVSEL_MASK)
>> +#define  PCI_STATUS_RW1C_MASK (PCI_STATUS_PARITY | \
>> +    PCI_STATUS_SIG_TARGET_ABORT | PCI_STATUS_REC_TARGET_ABORT | \
>> +    PCI_STATUS_REC_MASTER_ABORT | PCI_STATUS_SIG_SYSTEM_ERROR | \
>> +    PCI_STATUS_DETECTED_PARITY)
>>  
>>  #define PCI_CLASS_REVISION	0x08	/* High 24 bits are class, low 8 revision */
>>  #define PCI_REVISION_ID		0x08	/* Revision ID */
>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>> index d743d96a10b8..8e8e42372ec1 100644
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -37,6 +37,13 @@ int __must_check vpci_add_register(struct vpci *vpci,
>>                                     vpci_write_t *write_handler,
>>                                     unsigned int offset, unsigned int size,
>>                                     void *data);
>> +int __must_check vpci_add_register_mask(struct vpci *vpci,
>> +                                        vpci_read_t *read_handler,
>> +                                        vpci_write_t *write_handler,
>> +                                        unsigned int offset, unsigned int size,
>> +                                        void *data, uint32_t ro_mask,
>> +                                        uint32_t rw1c_mask, uint32_t rsvdp_mask,
>> +                                        uint32_t rsvdz_mask);
> 
> Instead of exporting two functions, you could export only
> vpci_add_register_mask() and make vpci_add_register() a static inline
> defined in the header as a wrapper around vpci_add_register_mask().

Will do

> 
> Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities
  2023-11-29 14:05   ` Roger Pau Monné
@ 2023-11-29 15:55     ` Stewart Hildebrand
  0 siblings, 0 replies; 9+ messages in thread
From: Stewart Hildebrand @ 2023-11-29 15:55 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

On 11/29/23 09:05, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 02:44:25PM -0500, Stewart Hildebrand wrote:
>> Currently, Xen vPCI only supports virtualizing the MSI and MSI-X capabilities.
>> Hide all other PCI capabilities (including extended capabilities) from domUs for
>> now, even though there may be certain devices/drivers that depend on being able
>> to discover certain capabilities.
>>
>> We parse the physical PCI capabilities linked list and add vPCI register
>> handlers for the next elements, inserting our own next value, thus presenting a
>> modified linked list to the domU.
>>
>> Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val
>> helper function returns a fixed value, which may be used for RAZ registers, or
>                                                                ^ read as zero

I'll change it

>> registers whose value doesn't change.
>>
>> Introduce pci_find_next_cap_ttl() helper while adapting the logic from
>> pci_find_next_cap() to suit our needs, and implement the existing
>> pci_find_next_cap() in terms of the new helper.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> 
> LGTM, some nits below:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks!

> 
>> ---
>> v7->v8:
>> * use to array instead of match function
>> * include lib.h for ARRAY_SIZE
>> * don't emulate PCI_CAPABILITY_LIST register if PCI_STATUS_CAP_LIST bit is not
>>   set in hardware
>> * spell out RAZ/WI acronym
>> * dropped R-b tag since the patch has changed moderately since the last rev
>>
>> v6->v7:
>> * no change
>>
>> v5->v6:
>> * add register handlers before status register handler in init_bars()
>> * s/header->mask_cap_list/mask_cap_list/
>>
>> v4->v5:
>> * use more appropriate types, continued
>> * get rid of unnecessary hook function
>> * add Jan's R-b
>>
>> v3->v4:
>> * move mask_cap_list setting to this patch
>> * leave pci_find_next_cap signature alone
>> * use more appropriate types
>>
>> v2->v3:
>> * get rid of > 0 in loop condition
>> * implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function so
>>   that hypothetical future callers wouldn't be required to pass &ttl.
>> * change NULL to (void *)0 for RAZ value passed to vpci_read_val
>> * change type of ttl to unsigned int
>> * remember to mask off the low 2 bits of next in the initial loop iteration
>> * change return type of pci_find_next_cap and pci_find_next_cap_ttl
>> * avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0
>>
>> v1->v2:
>> * change type of ttl to int
>> * use switch statement instead of if/else
>> * adapt existing pci_find_next_cap helper instead of rolling our own
>> * pass ttl as in/out
>> * "pass through" the lower 2 bits of the next pointer
>> * squash helper functions into this patch to avoid transient dead code situation
>> * extended capabilities RAZ/WI
>> ---
>>  xen/drivers/pci/pci.c     | 31 ++++++++++++-------
>>  xen/drivers/vpci/header.c | 63 +++++++++++++++++++++++++++++++++++++++
>>  xen/drivers/vpci/vpci.c   | 12 ++++++++
>>  xen/include/xen/pci.h     |  3 ++
>>  xen/include/xen/vpci.h    |  5 ++++
>>  5 files changed, 104 insertions(+), 10 deletions(-)
>>
>> diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c
>> index 3569ccb24e9e..1645b3118220 100644
>> --- a/xen/drivers/pci/pci.c
>> +++ b/xen/drivers/pci/pci.c
>> @@ -39,31 +39,42 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap)
>>      return 0;
>>  }
>>  
>> -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
>> -                               unsigned int cap)
>> +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
>> +                                   unsigned int *cap, unsigned int n,
>> +                                   unsigned int *ttl)
>>  {
>> -    u8 id;
>> -    int ttl = 48;
>> +    unsigned int id, i;
> 
> Nit: those can be defined inside the while loop.

I'll move them

> 
>> -    while ( ttl-- )
>> +    while ( (*ttl)-- )
>>      {
>>          pos = pci_conf_read8(sbdf, pos);
>>          if ( pos < 0x40 )
>>              break;
>>  
>> -        pos &= ~3;
>> -        id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID);
>> +        id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID);
>>  
>>          if ( id == 0xff )
>>              break;
>> -        if ( id == cap )
>> -            return pos;
>> +        for ( i = 0; i < n; i++ )
>> +        {
>> +            if ( id == cap[i] )
>> +                return pos;
>> +        }
>>  
>> -        pos += PCI_CAP_LIST_NEXT;
>> +        pos = (pos & ~3) + PCI_CAP_LIST_NEXT;
>>      }
>> +
>>      return 0;
>>  }
>>  
>> +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
>> +                               unsigned int cap)
>> +{
>> +    unsigned int ttl = 48;
>> +
>> +    return pci_find_next_cap_ttl(sbdf, pos, &cap, 1, &ttl) & ~3;
>> +}
>> +
>>  /**
>>   * pci_find_ext_capability - Find an extended capability
>>   * @sbdf: PCI device to query
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index 351318121e48..d7dc0c82a6ba 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -18,6 +18,7 @@
>>   */
>>  
>>  #include <xen/iocap.h>
>> +#include <xen/lib.h>
>>  #include <xen/sched.h>
>>  #include <xen/softirq.h>
>>  #include <xen/vpci.h>
>> @@ -545,6 +546,68 @@ static int cf_check init_bars(struct pci_dev *pdev)
> 
> Could you please rename to init_header now that we do much more than
> dealing with the BARs?

Yes. Hm. Do you think it's deserving of a separate patch? It's a 2-line change.

> 
>>      if ( rc )
>>          return rc;
>>  
>> +    if ( !is_hardware_domain(pdev->domain) )
>> +    {
>> +        if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
>> +        {
>> +            /* Only expose capabilities to the guest that vPCI can handle. */
>> +            unsigned int next, ttl = 48;
>> +            unsigned int supported_caps[] = {
> 
> const?

Will do

> 
> We likely need to find a way to do this programmatically, so that when
> a new capability is supported we don't need to go and modify the list
> here every time.  We can sort that out at a later point however.
> 
>> +                PCI_CAP_ID_MSI,
>> +                PCI_CAP_ID_MSIX,
>> +            };
>> +
>> +            next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
>> +                                         supported_caps,
>> +                                         ARRAY_SIZE(supported_caps), &ttl);
>> +
>> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>> +                                   PCI_CAPABILITY_LIST, 1,
>> +                                   (void *)(uintptr_t)next);
>> +            if ( rc )
>> +                return rc;
>> +
>> +            next &= ~3;
>> +
>> +            if ( !next )
>> +                /*
>> +                 * If we don't have any supported capabilities to expose to the
>> +                 * guest, mask the PCI_STATUS_CAP_LIST bit in the status
>> +                 * register.
>> +                 */
>> +                mask_cap_list = true;
>> +
>> +            while ( next && ttl )
>> +            {
>> +                unsigned int pos = next;
>> +
>> +                next = pci_find_next_cap_ttl(pdev->sbdf,
>> +                                             pos + PCI_CAP_LIST_NEXT,
>> +                                             supported_caps,
>> +                                             ARRAY_SIZE(supported_caps), &ttl);
>> +
>> +                rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>> +                                       pos + PCI_CAP_LIST_ID, 1, NULL);
>> +                if ( rc )
>> +                    return rc;
>> +
>> +                rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>> +                                       pos + PCI_CAP_LIST_NEXT, 1,
>> +                                       (void *)(uintptr_t)next);
>> +                if ( rc )
>> +                    return rc;
>> +
>> +                next &= ~3;
>> +            }
>> +        }
>> +
>> +        /* Extended capabilities read as zero, write ignore */
>> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
>> +                               (void *)0);
>> +        if ( rc )
>> +            return rc;
>> +    }
>> +
>>      /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
>>      rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
>>                                  PCI_STATUS, 2, NULL,
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 96187b70141b..99307e310bbb 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -137,6 +137,18 @@ static void cf_check vpci_ignored_write(
>>  {
>>  }
>>  
>> +uint32_t cf_check vpci_read_val(
>> +    const struct pci_dev *pdev, unsigned int reg, void *data)
>> +{
>> +    return (uintptr_t)data;
>> +}
>> +
>> +uint32_t cf_check vpci_hw_read8(
>> +    const struct pci_dev *pdev, unsigned int reg, void *data)
>> +{
>> +    return pci_conf_read8(pdev->sbdf, reg);
>> +}
>> +
>>  uint32_t cf_check vpci_hw_read16(
>>      const struct pci_dev *pdev, unsigned int reg, void *data)
>>  {
>> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
>> index 50d7dfb2a2fd..b2dcef01a1cf 100644
>> --- a/xen/include/xen/pci.h
>> +++ b/xen/include/xen/pci.h
>> @@ -205,6 +205,9 @@ int pci_mmcfg_read(unsigned int seg, unsigned int bus,
>>  int pci_mmcfg_write(unsigned int seg, unsigned int bus,
>>                      unsigned int devfn, int reg, int len, u32 value);
>>  unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap);
>> +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
>> +                                   unsigned int *cap, unsigned int n,
>> +                                   unsigned int *ttl);
>>  unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos,
>>                                 unsigned int cap);
>>  unsigned int pci_find_ext_capability(pci_sbdf_t sbdf, unsigned int cap);
>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>> index 8e8e42372ec1..3c14a74d6255 100644
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -52,7 +52,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
>>  void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>>                  uint32_t data);
>>  
>> +uint32_t cf_check vpci_read_val(
>> +    const struct pci_dev *pdev, unsigned int reg, void *data);
> 
> A small comment could be helpful: helper to return the value passed in the data
> parameter.

Will do

> 
> Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] xen/vpci: header: status register handler
  2023-11-29 15:18     ` Stewart Hildebrand
@ 2023-11-30  8:40       ` Jan Beulich
  2023-11-30  9:08         ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2023-11-30  8:40 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: xen-devel, Wei Liu, Anthony PERARD, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Roger Pau Monné

On 29.11.2023 16:18, Stewart Hildebrand wrote:
> On 11/29/23 06:03, Roger Pau Monné wrote:
>> On Tue, Nov 28, 2023 at 02:44:24PM -0500, Stewart Hildebrand wrote:
>>> @@ -407,26 +445,26 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
>>>  
>>>  /*
>>>   * Perform a maybe partial write to a register.
>>> - *
>>> - * Note that this will only work for simple registers, if Xen needs to
>>> - * trap accesses to rw1c registers (like the status PCI header register)
>>> - * the logic in vpci_write will have to be expanded in order to correctly
>>> - * deal with them.
>>>   */
>>>  static void vpci_write_helper(const struct pci_dev *pdev,
>>>                                const struct vpci_register *r, unsigned int size,
>>>                                unsigned int offset, uint32_t data)
>>>  {
>>> +    uint32_t val = 0;
>>
>> Nit: might be clearer to name this 'current': it's easy to get
>> confused whether val or data holds the user-provided input.
> 
> The name 'current' shadows an existing global variable/macro. How about current_val?

Or curval? Or just cur (to not collide with our common use of curr)?

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v8 1/2] xen/vpci: header: status register handler
  2023-11-30  8:40       ` Jan Beulich
@ 2023-11-30  9:08         ` Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2023-11-30  9:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stewart Hildebrand, xen-devel, Wei Liu, Anthony PERARD,
	Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini

On Thu, Nov 30, 2023 at 09:40:14AM +0100, Jan Beulich wrote:
> On 29.11.2023 16:18, Stewart Hildebrand wrote:
> > On 11/29/23 06:03, Roger Pau Monné wrote:
> >> On Tue, Nov 28, 2023 at 02:44:24PM -0500, Stewart Hildebrand wrote:
> >>> @@ -407,26 +445,26 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> >>>  
> >>>  /*
> >>>   * Perform a maybe partial write to a register.
> >>> - *
> >>> - * Note that this will only work for simple registers, if Xen needs to
> >>> - * trap accesses to rw1c registers (like the status PCI header register)
> >>> - * the logic in vpci_write will have to be expanded in order to correctly
> >>> - * deal with them.
> >>>   */
> >>>  static void vpci_write_helper(const struct pci_dev *pdev,
> >>>                                const struct vpci_register *r, unsigned int size,
> >>>                                unsigned int offset, uint32_t data)
> >>>  {
> >>> +    uint32_t val = 0;
> >>
> >> Nit: might be clearer to name this 'current': it's easy to get
> >> confused whether val or data holds the user-provided input.
> > 
> > The name 'current' shadows an existing global variable/macro. How about current_val?
> 
> Or curval? Or just cur (to not collide with our common use of curr)?

Any would be fine, curval or cur_val are short and clear IMO.  Sorry
for having suggested a wrong name in my previous reply.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-11-30  9:08 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-28 19:44 [PATCH v8 0/2] vPCI capabilities filtering Stewart Hildebrand
2023-11-28 19:44 ` [PATCH v8 1/2] xen/vpci: header: status register handler Stewart Hildebrand
2023-11-29 11:03   ` Roger Pau Monné
2023-11-29 15:18     ` Stewart Hildebrand
2023-11-30  8:40       ` Jan Beulich
2023-11-30  9:08         ` Roger Pau Monné
2023-11-28 19:44 ` [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities Stewart Hildebrand
2023-11-29 14:05   ` Roger Pau Monné
2023-11-29 15:55     ` Stewart Hildebrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.