* [PATCH v3 00/11] Support hiding capability when its initialization fails
@ 2025-04-21 6:18 Jiqian Chen
2025-04-21 6:18 ` [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function Jiqian Chen
` (10 more replies)
0 siblings, 11 replies; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel
Cc: Huang Rui, Jiqian Chen, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Hi,
This series is to
emulate legacy and extended capability list for dom0, including patch #1, #2, #3, #4.
hide legacy and extended capability when its initialization fails, including patch #5, #6, #7.
remove all related registers and other resources when initializing capability fails, including patch #8, #9, #10, #11.
Best regards,
Jiqian Chen.
---
cc: Andrew Cooper <andrew.cooper3@citrix.com>
cc: Anthony PERARD <anthony.perard@vates.tech>
cc: Michal Orzel <michal.orzel@amd.com>
cc: Jan Beulich <jbeulich@suse.com>
cc: Julien Grall <julien@xen.org>
cc: "Roger Pau Monné" <roger.pau@citrix.com>
cc: Stefano Stabellini <sstabellini@kernel.org>
---
Jiqian Chen (11):
vpci/header: Move emulating cap list logic into new function
driver/pci: Get next capability without passing caps
vpci/header: Emulate legacy capability list for dom0
vpci/header: Emulate extended capability list for dom0
vpci: Refactor REGISTER_VPCI_INIT
vpci: Hide legacy capability when it fails to initialize
vpci: Hide extended capability when it fails to initialize
vpci: Refactor vpci_remove_register to remove matched registers
vpci/rebar: Remove registers when init_rebar() fails
vpci/msi: Free MSI resources when init_msi() fails
vpci/msix: Add function to clean MSIX resources
tools/tests/vpci/main.c | 4 +-
xen/drivers/pci/pci.c | 3 +
xen/drivers/vpci/header.c | 174 +++++++++++++---------
xen/drivers/vpci/msi.c | 28 +++-
xen/drivers/vpci/msix.c | 27 +++-
xen/drivers/vpci/rebar.c | 35 +++--
xen/drivers/vpci/vpci.c | 290 ++++++++++++++++++++++++++++++++-----
xen/include/xen/pci_regs.h | 1 +
xen/include/xen/vpci.h | 34 +++--
xen/include/xen/xen.lds.h | 2 +-
10 files changed, 459 insertions(+), 139 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-04-29 7:10 ` Chen, Jiqian
2025-05-06 13:30 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 02/11] driver/pci: Get next capability without passing caps Jiqian Chen
` (9 subsequent siblings)
10 siblings, 2 replies; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
No functional changes.
Follow-on changes will benifit from this.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
new patch.
Best regards,
Jiqian Chen.
---
| 138 ++++++++++++++++++++------------------
1 file changed, 73 insertions(+), 65 deletions(-)
--git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ef6c965c081c..3e9b44454b43 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -745,6 +745,75 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
return !bar->mem ? -ENOMEM : 0;
}
+static int vpci_init_capability_list(struct pci_dev *pdev)
+{
+ int rc;
+ bool mask_cap_list = false;
+
+ if ( !is_hardware_domain(pdev->domain) &&
+ pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
+ {
+ /* Only expose capabilities to the guest that vPCI can handle. */
+ unsigned int next, ttl = 48;
+ static const unsigned int supported_caps[] = {
+ PCI_CAP_ID_MSI,
+ PCI_CAP_ID_MSIX,
+ };
+
+ next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
+ supported_caps,
+ ARRAY_SIZE(supported_caps), &ttl);
+
+ rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+ PCI_CAPABILITY_LIST, 1,
+ (void *)(uintptr_t)next);
+ if ( rc )
+ return rc;
+
+ next &= ~3;
+
+ if ( !next )
+ /*
+ * If we don't have any supported capabilities to expose to the
+ * guest, mask the PCI_STATUS_CAP_LIST bit in the status
+ * register.
+ */
+ mask_cap_list = true;
+
+ while ( next && ttl )
+ {
+ unsigned int pos = next;
+
+ next = pci_find_next_cap_ttl(pdev->sbdf,
+ pos + PCI_CAP_LIST_NEXT,
+ supported_caps,
+ ARRAY_SIZE(supported_caps), &ttl);
+
+ rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
+ pos + PCI_CAP_LIST_ID, 1, NULL);
+ if ( rc )
+ return rc;
+
+ rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+ pos + PCI_CAP_LIST_NEXT, 1,
+ (void *)(uintptr_t)next);
+ if ( rc )
+ return rc;
+
+ next &= ~3;
+ }
+ }
+
+ /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
+ return vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
+ PCI_STATUS, 2, NULL,
+ PCI_STATUS_RO_MASK &
+ ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
+ PCI_STATUS_RW1C_MASK,
+ mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
+ PCI_STATUS_RSVDZ_MASK);
+}
+
static int cf_check init_header(struct pci_dev *pdev)
{
uint16_t cmd;
@@ -753,7 +822,6 @@ static int cf_check init_header(struct pci_dev *pdev)
struct vpci_header *header = &pdev->vpci->header;
struct vpci_bar *bars = header->bars;
int rc;
- bool mask_cap_list = false;
bool is_hwdom = is_hardware_domain(pdev->domain);
ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));
@@ -794,61 +862,12 @@ static int cf_check init_header(struct pci_dev *pdev)
if ( rc )
return rc;
+ rc = vpci_init_capability_list(pdev);
+ if ( rc )
+ return rc;
+
if ( !is_hwdom )
{
- if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
- {
- /* Only expose capabilities to the guest that vPCI can handle. */
- unsigned int next, ttl = 48;
- static const unsigned int supported_caps[] = {
- PCI_CAP_ID_MSI,
- PCI_CAP_ID_MSIX,
- };
-
- next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
- supported_caps,
- ARRAY_SIZE(supported_caps), &ttl);
-
- rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
- PCI_CAPABILITY_LIST, 1,
- (void *)(uintptr_t)next);
- if ( rc )
- return rc;
-
- next &= ~3;
-
- if ( !next )
- /*
- * If we don't have any supported capabilities to expose to the
- * guest, mask the PCI_STATUS_CAP_LIST bit in the status
- * register.
- */
- mask_cap_list = true;
-
- while ( next && ttl )
- {
- unsigned int pos = next;
-
- next = pci_find_next_cap_ttl(pdev->sbdf,
- pos + PCI_CAP_LIST_NEXT,
- supported_caps,
- ARRAY_SIZE(supported_caps), &ttl);
-
- rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
- pos + PCI_CAP_LIST_ID, 1, NULL);
- if ( rc )
- return rc;
-
- rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
- pos + PCI_CAP_LIST_NEXT, 1,
- (void *)(uintptr_t)next);
- if ( rc )
- return rc;
-
- next &= ~3;
- }
- }
-
/* Extended capabilities read as zero, write ignore */
rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
(void *)0);
@@ -856,17 +875,6 @@ static int cf_check init_header(struct pci_dev *pdev)
return rc;
}
- /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
- rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
- PCI_STATUS, 2, NULL,
- PCI_STATUS_RO_MASK &
- ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
- PCI_STATUS_RW1C_MASK,
- mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
- PCI_STATUS_RSVDZ_MASK);
- if ( rc )
- return rc;
-
if ( pdev->ignore_bars )
return 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 02/11] driver/pci: Get next capability without passing caps
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
2025-04-21 6:18 ` [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-04-22 15:59 ` Jan Beulich
2025-04-21 6:18 ` [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0 Jiqian Chen
` (8 subsequent siblings)
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel
Cc: Huang Rui, Jiqian Chen, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Modify function pci_find_next_cap_ttl to support returning position
of next capability when size "n" is zero.
That can help caller to get next capability offset if caller just
has a information of current capability offset.
That will be used in a follow-on change.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: Andrew Cooper <andrew.cooper3@citrix.com>
cc: Anthony PERARD <anthony.perard@vates.tech>
cc: Michal Orzel <michal.orzel@amd.com>
cc: Jan Beulich <jbeulich@suse.com>
cc: Julien Grall <julien@xen.org>
cc: "Roger Pau Monné" <roger.pau@citrix.com>
cc: Stefano Stabellini <sstabellini@kernel.org>
---
v2->v3 changes:
* Only check if n == 0 and add assertion for array "caps".
* Not to change pci_find_next_cap_ttl definition.
v1->v2 changes:
new patch.
Best regards,
Jiqian Chen.
---
xen/drivers/pci/pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c
index edf5b9f7ae9f..804f4e1e6066 100644
--- a/xen/drivers/pci/pci.c
+++ b/xen/drivers/pci/pci.c
@@ -55,6 +55,9 @@ unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
if ( id == 0xff )
break;
+ if ( n == 0 )
+ return pos;
+ ASSERT(caps);
for ( i = 0; i < n; i++ )
{
if ( id == caps[i] )
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
2025-04-21 6:18 ` [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function Jiqian Chen
2025-04-21 6:18 ` [PATCH v3 02/11] driver/pci: Get next capability without passing caps Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-04-22 16:01 ` Jan Beulich
` (2 more replies)
2025-04-21 6:18 ` [PATCH v3 04/11] vpci/header: Emulate extended " Jiqian Chen
` (7 subsequent siblings)
10 siblings, 3 replies; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
Current logic of emulating legacy capability list is only for domU.
So, expand it to emulate for dom0 too. Then it will be easy to hide
a capability whose initialization fails in a function.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* Not to add handler of PCI_CAP_LIST_ID when domain is dom0.
v1->v2 changes:
new patch.
Best regards,
Jiqian Chen.
---
| 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
--git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 3e9b44454b43..c98cd211d9d7 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -749,9 +749,9 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
{
int rc;
bool mask_cap_list = false;
+ bool is_hwdom = is_hardware_domain(pdev->domain);
- if ( !is_hardware_domain(pdev->domain) &&
- pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
+ if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
{
/* Only expose capabilities to the guest that vPCI can handle. */
unsigned int next, ttl = 48;
@@ -759,10 +759,11 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
PCI_CAP_ID_MSI,
PCI_CAP_ID_MSIX,
};
+ const unsigned int *caps = is_hwdom ? NULL : supported_caps;
+ const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(supported_caps);
next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
- supported_caps,
- ARRAY_SIZE(supported_caps), &ttl);
+ caps, n, &ttl);
rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
PCI_CAPABILITY_LIST, 1,
@@ -772,7 +773,7 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
next &= ~3;
- if ( !next )
+ if ( !next && !is_hwdom )
/*
* If we don't have any supported capabilities to expose to the
* guest, mask the PCI_STATUS_CAP_LIST bit in the status
@@ -786,13 +787,15 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
next = pci_find_next_cap_ttl(pdev->sbdf,
pos + PCI_CAP_LIST_NEXT,
- supported_caps,
- ARRAY_SIZE(supported_caps), &ttl);
+ caps, n, &ttl);
- rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
- pos + PCI_CAP_LIST_ID, 1, NULL);
- if ( rc )
- return rc;
+ if ( !is_hwdom )
+ {
+ rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
+ pos + PCI_CAP_LIST_ID, 1, NULL);
+ if ( rc )
+ return rc;
+ }
rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
pos + PCI_CAP_LIST_NEXT, 1,
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 04/11] vpci/header: Emulate extended capability list for dom0
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (2 preceding siblings ...)
2025-04-21 6:18 ` [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0 Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-05-06 14:14 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT Jiqian Chen
` (6 subsequent siblings)
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
Add a new function to emulate extended capability list for dom0,
and call it in init_header(). So that it will be easy to hide a
extended capability whose initialization fails.
As for the extended capability list of domU, just move the logic
into above function and keep hiding it for domU.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* In vpci_init_ext_capability_list(), when domain is domU, directly return after adding a handler(hiding all extended capability for domU).
* In vpci_init_ext_capability_list(), change condition to be "while ( pos >= 0x100U && ttl-- )" instead of "while ( pos && ttl-- )".
* Add new function vpci_hw_write32, and pass it to extended capability handler for dom0.
v1->v2 changes:
new patch
Best regards,
Jiqian Chen.
---
| 36 ++++++++++++++++++++++++++++--------
xen/drivers/vpci/vpci.c | 6 ++++++
xen/include/xen/vpci.h | 2 ++
3 files changed, 36 insertions(+), 8 deletions(-)
--git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index c98cd211d9d7..ee94ad8e5037 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -817,6 +817,31 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
PCI_STATUS_RSVDZ_MASK);
}
+static int vpci_init_ext_capability_list(struct pci_dev *pdev)
+{
+ unsigned int pos = PCI_CFG_SPACE_SIZE, ttl = 480;
+
+ if ( !is_hardware_domain(pdev->domain) )
+ /* Extended capabilities read as zero, write ignore for guest */
+ return vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+ pos, 4, (void *)0);
+
+ while ( pos >= PCI_CFG_SPACE_SIZE && ttl-- )
+ {
+ uint32_t header = pci_conf_read32(pdev->sbdf, pos);
+ int rc;
+
+ rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
+ pos, 4, (void *)(uintptr_t)header);
+ if ( rc )
+ return rc;
+
+ pos = PCI_EXT_CAP_NEXT(header);
+ }
+
+ return 0;
+}
+
static int cf_check init_header(struct pci_dev *pdev)
{
uint16_t cmd;
@@ -869,14 +894,9 @@ static int cf_check init_header(struct pci_dev *pdev)
if ( rc )
return rc;
- if ( !is_hwdom )
- {
- /* Extended capabilities read as zero, write ignore */
- rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
- (void *)0);
- if ( rc )
- return rc;
- }
+ rc = vpci_init_ext_capability_list(pdev);
+ if ( rc )
+ return rc;
if ( pdev->ignore_bars )
return 0;
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 1e6aa5d799b9..3349b98389b8 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
pci_conf_write16(pdev->sbdf, reg, val);
}
+void cf_check vpci_hw_write32(
+ const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+ pci_conf_write32(pdev->sbdf, reg, val);
+}
+
int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
vpci_write_t *write_handler, unsigned int offset,
unsigned int size, void *data, uint32_t ro_mask,
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 807401b2eaa2..9d47b8c1a50e 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -78,6 +78,8 @@ uint32_t cf_check vpci_hw_read32(
const struct pci_dev *pdev, unsigned int reg, void *data);
void cf_check vpci_hw_write16(
const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
+void cf_check vpci_hw_write32(
+ const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
/*
* Check for pending vPCI operations on this vcpu. Returns true if the vcpu
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (3 preceding siblings ...)
2025-04-21 6:18 ` [PATCH v3 04/11] vpci/header: Emulate extended " Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-04-22 16:03 ` Jan Beulich
2025-05-06 14:37 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize Jiqian Chen
` (5 subsequent siblings)
10 siblings, 2 replies; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel
Cc: Huang Rui, Jiqian Chen, Roger Pau Monné, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Stefano Stabellini
Refactor REGISTER_VPCI_INIT to contain more capability specific
information, this is benifit for follow-on changes to hide capability
which initialization fails.
What's more, change the definition of init_header() since it is
not a capability and it is needed for all devices' PCI config space.
Note:
Call vpci_make_msix_hole() in the end of init_msix() since the
change of sequence of init_header() and init_msix().
The fini hook will be implemented in follow-on changes.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
cc: Andrew Cooper <andrew.cooper3@citrix.com>
cc: Anthony PERARD <anthony.perard@vates.tech>
cc: Michal Orzel <michal.orzel@amd.com>
cc: Jan Beulich <jbeulich@suse.com>
cc: Julien Grall <julien@xen.org>
cc: Stefano Stabellini <sstabellini@kernel.org>
---
v2->v3 changes:
* This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
* Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
* Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
v1->v2 changes:
* Removed the "priorities" of initializing capabilities since it isn't used anymore.
* Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove failed capability from list.
* Called vpci_make_msix_hole() in the end of init_msix().
Best regards,
Jiqian Chen.
---
| 3 +--
xen/drivers/vpci/msi.c | 2 +-
xen/drivers/vpci/msix.c | 8 +++++--
xen/drivers/vpci/rebar.c | 2 +-
xen/drivers/vpci/vpci.c | 48 +++++++++++++++++++++++++++++++--------
xen/include/xen/vpci.h | 28 ++++++++++++++++-------
xen/include/xen/xen.lds.h | 2 +-
7 files changed, 68 insertions(+), 25 deletions(-)
--git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ee94ad8e5037..afe4bcdfcb30 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -842,7 +842,7 @@ static int vpci_init_ext_capability_list(struct pci_dev *pdev)
return 0;
}
-static int cf_check init_header(struct pci_dev *pdev)
+int vpci_init_header(struct pci_dev *pdev)
{
uint16_t cmd;
uint64_t addr, size;
@@ -1038,7 +1038,6 @@ static int cf_check init_header(struct pci_dev *pdev)
pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
return rc;
}
-REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
/*
* Local variables:
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 66e5a8a116be..ea7dc0c060ea 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
return 0;
}
-REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
+REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSI, init_msi, NULL);
void vpci_dump_msi(void)
{
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 6bd8c55bb48e..0228ffd9fda9 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -751,9 +751,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
pdev->vpci->msix = msix;
list_add(&msix->next, &d->arch.hvm.msix_tables);
- return 0;
+ spin_lock(&pdev->vpci->lock);
+ rc = vpci_make_msix_hole(pdev);
+ spin_unlock(&pdev->vpci->lock);
+
+ return rc;
}
-REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
+REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSIX, init_msix, NULL);
/*
* Local variables:
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
index 793937449af7..026f8f7972d9 100644
--- a/xen/drivers/vpci/rebar.c
+++ b/xen/drivers/vpci/rebar.c
@@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
return 0;
}
-REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
+REGISTER_VPCI_EXTENDED_CAP(PCI_EXT_CAP_ID_REBAR, init_rebar, NULL);
/*
* Local variables:
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 3349b98389b8..5474b66668c1 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -36,8 +36,8 @@ struct vpci_register {
};
#ifdef __XEN__
-extern vpci_register_init_t *const __start_vpci_array[];
-extern vpci_register_init_t *const __end_vpci_array[];
+extern vpci_capability_t *const __start_vpci_array[];
+extern vpci_capability_t *const __end_vpci_array[];
#define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
@@ -83,6 +83,36 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
+static int vpci_init_capabilities(struct pci_dev *pdev)
+{
+ for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
+ {
+ const vpci_capability_t *capability = __start_vpci_array[i];
+ const unsigned int cap = capability->id;
+ const bool is_ext = capability->is_ext;
+ unsigned int pos;
+ int rc;
+
+ if ( !is_hardware_domain(pdev->domain) && is_ext )
+ continue;
+
+ if ( !is_ext )
+ pos = pci_find_cap_offset(pdev->sbdf, cap);
+ else
+ pos = pci_find_ext_capability(pdev->sbdf, cap);
+
+ if ( !pos || !capability->init )
+ continue;
+
+ rc = capability->init(pdev);
+
+ if ( rc )
+ return rc;
+ }
+
+ return 0;
+}
+
void vpci_deassign_device(struct pci_dev *pdev)
{
unsigned int i;
@@ -128,7 +158,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
int vpci_assign_device(struct pci_dev *pdev)
{
- unsigned int i;
const unsigned long *ro_map;
int rc = 0;
@@ -159,14 +188,13 @@ int vpci_assign_device(struct pci_dev *pdev)
goto out;
#endif
- for ( i = 0; i < NUM_VPCI_INIT; i++ )
- {
- rc = __start_vpci_array[i](pdev);
- if ( rc )
- break;
- }
+ rc = vpci_init_header(pdev);
+ if ( rc )
+ goto out;
+
+ rc = vpci_init_capabilities(pdev);
- out: __maybe_unused;
+ out:
if ( rc )
vpci_deassign_device(pdev);
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 9d47b8c1a50e..8e815b418b7d 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
uint32_t val, void *data);
-typedef int vpci_register_init_t(struct pci_dev *dev);
-
-#define VPCI_PRIORITY_HIGH "1"
-#define VPCI_PRIORITY_MIDDLE "5"
-#define VPCI_PRIORITY_LOW "9"
+typedef struct {
+ unsigned int id;
+ bool is_ext;
+ int (*init)(struct pci_dev *pdev);
+ void (*fini)(struct pci_dev *pdev);
+} vpci_capability_t;
#define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12)
@@ -29,9 +30,20 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
*/
#define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1)
-#define REGISTER_VPCI_INIT(x, p) \
- static vpci_register_init_t *const x##_entry \
- __used_section(".data.vpci." p) = (x)
+#define REGISTER_VPCI_CAP(cap, x, y, ext) \
+ static vpci_capability_t x##_t = { \
+ .id = (cap), \
+ .init = (x), \
+ .fini = (y), \
+ .is_ext = (ext), \
+ }; \
+ static vpci_capability_t *const x##_entry \
+ __used_section(".data.vpci.") = &(x##_t)
+
+#define REGISTER_VPCI_LEGACY_CAP(cap, x, y) REGISTER_VPCI_CAP(cap, x, y, false)
+#define REGISTER_VPCI_EXTENDED_CAP(cap, x, y) REGISTER_VPCI_CAP(cap, x, y, true)
+
+int __must_check vpci_init_header(struct pci_dev *pdev);
/* Assign vPCI to device by adding handlers. */
int __must_check vpci_assign_device(struct pci_dev *pdev);
diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
index 16a9b1ba03db..c73222112dd3 100644
--- a/xen/include/xen/xen.lds.h
+++ b/xen/include/xen/xen.lds.h
@@ -187,7 +187,7 @@
#define VPCI_ARRAY \
. = ALIGN(POINTER_ALIGN); \
__start_vpci_array = .; \
- *(SORT(.data.vpci.*)) \
+ *(.data.vpci.*) \
__end_vpci_array = .;
#else
#define VPCI_ARRAY
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (4 preceding siblings ...)
2025-04-21 6:18 ` [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-05-06 16:00 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 07/11] vpci: Hide extended " Jiqian Chen
` (4 subsequent siblings)
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
When vpci fails to initialize a legacy capability of device, it just
return error instead of catching and processing exception. That makes
the entire device unusable.
So, add new a function to hide legacy capability when initialization
fails. And remove the failed legacy capability from the vpci emulated
legacy capability list.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* Separated from the last version patch "vpci: Hide capability when it fails to initialize"
* Whole implementation changed because last version is wrong.
This version adds a new helper function vpci_get_register() and uses it to get
target handler and previous handler from vpci->handlers, then remove the target.
v1->v2 changes:
* Removed the "priorities" of initializing capabilities since it isn't used anymore.
* Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
remove failed capability from list.
* Called vpci_make_msix_hole() in the end of init_msix().
Best regards,
Jiqian Chen.
---
xen/drivers/vpci/vpci.c | 133 +++++++++++++++++++++++++++++++++-------
1 file changed, 112 insertions(+), 21 deletions(-)
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 5474b66668c1..f97c7cc460a0 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -35,6 +35,22 @@ struct vpci_register {
uint32_t rsvdz_mask;
};
+static int vpci_register_cmp(const struct vpci_register *r1,
+ const struct vpci_register *r2)
+{
+ /* Return 0 if registers overlap. */
+ if ( r1->offset < r2->offset + r2->size &&
+ r2->offset < r1->offset + r1->size )
+ return 0;
+ if ( r1->offset < r2->offset )
+ return -1;
+ if ( r1->offset > r2->offset )
+ return 1;
+
+ ASSERT_UNREACHABLE();
+ return 0;
+}
+
#ifdef __XEN__
extern vpci_capability_t *const __start_vpci_array[];
extern vpci_capability_t *const __end_vpci_array[];
@@ -83,7 +99,91 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
-static int vpci_init_capabilities(struct pci_dev *pdev)
+static struct vpci_register *vpci_get_register(struct vpci *vpci,
+ const unsigned int offset,
+ const unsigned int size)
+{
+ const struct vpci_register r = { .offset = offset, .size = size };
+ struct vpci_register *rm;
+
+ ASSERT(spin_is_locked(&vpci->lock));
+ list_for_each_entry ( rm, &vpci->handlers, node )
+ {
+ int cmp = vpci_register_cmp(&r, rm);
+
+ if ( !cmp && rm->offset == offset && rm->size == size )
+ return rm;
+ if ( cmp <= 0 )
+ break;
+ }
+
+ return NULL;
+}
+
+static struct vpci_register *vpci_get_previous_cap_register
+ (struct vpci *vpci, const unsigned int offset)
+{
+ uint32_t next;
+ struct vpci_register *r;
+
+ if ( offset < 0x40 )
+ return NULL;
+
+ r = vpci_get_register(vpci, PCI_CAPABILITY_LIST, 1);
+ ASSERT(r);
+
+ next = (uint32_t)(uintptr_t)r->private;
+ while ( next >= 0x40 && next != offset )
+ {
+ r = vpci_get_register(vpci, next + PCI_CAP_LIST_NEXT, 1);
+ ASSERT(r);
+ next = (uint32_t)(uintptr_t)r->private;
+ }
+
+ if ( next < 0x40 )
+ return NULL;
+
+ return r;
+}
+
+static void vpci_capability_mask(struct pci_dev *pdev,
+ const unsigned int cap)
+{
+ const unsigned int offset = pci_find_cap_offset(pdev->sbdf, cap);
+ struct vpci_register *prev_next_r, *next_r;
+ struct vpci *vpci = pdev->vpci;
+
+ spin_lock(&vpci->lock);
+ next_r = vpci_get_register(vpci, offset + PCI_CAP_LIST_NEXT, 1);
+ if ( !next_r )
+ {
+ spin_unlock(&vpci->lock);
+ return;
+ }
+
+ prev_next_r = vpci_get_previous_cap_register(vpci, offset);
+ ASSERT(prev_next_r);
+
+ prev_next_r->private = next_r->private;
+
+ if ( !is_hardware_domain(pdev->domain) )
+ {
+ struct vpci_register *id_r =
+ vpci_get_register(vpci, offset + PCI_CAP_LIST_ID, 1);
+
+ ASSERT(id_r);
+ /* PCI_CAP_LIST_ID register of target capability */
+ list_del(&id_r->node);
+ xfree(id_r);
+ }
+
+ /* PCI_CAP_LIST_NEXT register of target capability */
+ list_del(&next_r->node);
+ spin_unlock(&vpci->lock);
+ xfree(next_r);
+}
+
+static void vpci_init_capabilities(struct pci_dev *pdev)
{
for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
{
@@ -107,10 +207,17 @@ static int vpci_init_capabilities(struct pci_dev *pdev)
rc = capability->init(pdev);
if ( rc )
- return rc;
+ {
+ if ( capability->fini )
+ capability->fini(pdev);
+
+ printk(XENLOG_WARNING "%pd %pp: %s cap %u init fail rc=%d, mask it\n",
+ pdev->domain, &pdev->sbdf,
+ is_ext ? "extended" : "legacy", cap, rc);
+ if ( !is_ext )
+ vpci_capability_mask(pdev, cap);
+ }
}
-
- return 0;
}
void vpci_deassign_device(struct pci_dev *pdev)
@@ -192,7 +299,7 @@ int vpci_assign_device(struct pci_dev *pdev)
if ( rc )
goto out;
- rc = vpci_init_capabilities(pdev);
+ vpci_init_capabilities(pdev);
out:
if ( rc )
@@ -202,22 +309,6 @@ int vpci_assign_device(struct pci_dev *pdev)
}
#endif /* __XEN__ */
-static int vpci_register_cmp(const struct vpci_register *r1,
- const struct vpci_register *r2)
-{
- /* Return 0 if registers overlap. */
- if ( r1->offset < r2->offset + r2->size &&
- r2->offset < r1->offset + r1->size )
- return 0;
- if ( r1->offset < r2->offset )
- return -1;
- if ( r1->offset > r2->offset )
- return 1;
-
- ASSERT_UNREACHABLE();
- return 0;
-}
-
/* Dummy hooks, writes are ignored, reads return 1's */
static uint32_t cf_check vpci_ignored_read(
const struct pci_dev *pdev, unsigned int reg, void *data)
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (5 preceding siblings ...)
2025-04-21 6:18 ` [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize Jiqian Chen
@ 2025-04-21 6:18 ` Jiqian Chen
2025-04-22 16:06 ` Jan Beulich
2025-05-06 16:21 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 08/11] vpci: Refactor vpci_remove_register to remove matched registers Jiqian Chen
` (3 subsequent siblings)
10 siblings, 2 replies; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:18 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
When vpci fails to initialize a extended capability of device for dom0,
it just return error instead of catching and processing exception. That
makes the entire device unusable.
So, add new a function to hide extended capability when initialization
fails. And remove the failed extended capability handler from vpci
extended capability list.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* Separated from the last version patch "vpci: Hide capability when it fails to initialize".
* Whole implementation changed because last version is wrong.
This version gets target handler and previous handler from vpci->handlers, then remove the target.
* Note: a case in function vpci_ext_capability_mask() needs to be discussed,
because it may change the offset of next capability when the offset of target
capability is 0x100U(the first extended capability), my implementation is just to
ignore and let hardware to handle the target capability.
v1->v2 changes:
* Removed the "priorities" of initializing capabilities since it isn't used anymore.
* Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
remove failed capability from list.
* Called vpci_make_msix_hole() in the end of init_msix().
Best regards,
Jiqian Chen.
---
xen/drivers/vpci/vpci.c | 79 ++++++++++++++++++++++++++++++++++++++
xen/include/xen/pci_regs.h | 1 +
2 files changed, 80 insertions(+)
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index f97c7cc460a0..8ff5169bdd18 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -183,6 +183,83 @@ static void vpci_capability_mask(struct pci_dev *pdev,
xfree(next_r);
}
+static struct vpci_register *vpci_get_previous_ext_cap_register
+ (struct vpci *vpci, const unsigned int offset)
+{
+ uint32_t header;
+ unsigned int pos = PCI_CFG_SPACE_SIZE;
+ struct vpci_register *r;
+
+ if ( offset <= PCI_CFG_SPACE_SIZE )
+ return NULL;
+
+ r = vpci_get_register(vpci, pos, 4);
+ ASSERT(r);
+
+ header = (uint32_t)(uintptr_t)r->private;
+ pos = PCI_EXT_CAP_NEXT(header);
+ while ( pos > PCI_CFG_SPACE_SIZE && pos != offset )
+ {
+ r = vpci_get_register(vpci, pos, 4);
+ ASSERT(r);
+ header = (uint32_t)(uintptr_t)r->private;
+ pos = PCI_EXT_CAP_NEXT(header);
+ }
+
+ if ( pos <= PCI_CFG_SPACE_SIZE )
+ return NULL;
+
+ return r;
+}
+
+static void vpci_ext_capability_mask(struct pci_dev *pdev,
+ const unsigned int cap)
+{
+ const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
+ struct vpci_register *rm, *prev_r;
+ struct vpci *vpci = pdev->vpci;
+ uint32_t header, pre_header;
+
+ spin_lock(&vpci->lock);
+ rm = vpci_get_register(vpci, offset, 4);
+ if ( !rm )
+ {
+ spin_unlock(&vpci->lock);
+ return;
+ }
+
+ header = (uint32_t)(uintptr_t)rm->private;
+ if ( offset == PCI_CFG_SPACE_SIZE )
+ {
+ if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
+ rm->private = (void *)(uintptr_t)0;
+ else
+ /*
+ * Else case needs to remove the capability in position 0x100U and
+ * moves the next capability to be in position 0x100U, that would
+ * cause the offset of next capability in vpci different from the
+ * hardware, then cause error accesses, so just ignore it here and
+ * hope hardware would handle the capability well.
+ */
+ printk(XENLOG_ERR "%pd %pp: ext cap %u is first cap, can't mask it\n",
+ pdev->domain, &pdev->sbdf, cap);
+ spin_unlock(&vpci->lock);
+ return;
+ }
+
+ prev_r = vpci_get_previous_ext_cap_register(vpci, offset);
+ ASSERT(prev_r);
+
+ pre_header = (uint32_t)(uintptr_t)prev_r->private;
+ prev_r->private = (void *)(uintptr_t)((pre_header &
+ ~PCI_EXT_CAP_NEXT_MASK) |
+ (header & PCI_EXT_CAP_NEXT_MASK));
+
+ list_del(&rm->node);
+ spin_unlock(&vpci->lock);
+ xfree(rm);
+}
+
static void vpci_init_capabilities(struct pci_dev *pdev)
{
for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
@@ -216,6 +293,8 @@ static void vpci_init_capabilities(struct pci_dev *pdev)
is_ext ? "extended" : "legacy", cap, rc);
if ( !is_ext )
vpci_capability_mask(pdev, cap);
+ else
+ vpci_ext_capability_mask(pdev, cap);
}
}
}
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 27b4f44eedf3..5fe6653fded4 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -449,6 +449,7 @@
#define PCI_EXT_CAP_ID(header) ((header) & 0x0000ffff)
#define PCI_EXT_CAP_VER(header) (((header) >> 16) & 0xf)
#define PCI_EXT_CAP_NEXT(header) (((header) >> 20) & 0xffc)
+#define PCI_EXT_CAP_NEXT_MASK 0xFFC00000U
#define PCI_EXT_CAP_ID_ERR 1
#define PCI_EXT_CAP_ID_VC 2
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 08/11] vpci: Refactor vpci_remove_register to remove matched registers
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (6 preceding siblings ...)
2025-04-21 6:18 ` [PATCH v3 07/11] vpci: Hide extended " Jiqian Chen
@ 2025-04-21 6:19 ` Jiqian Chen
2025-05-06 16:29 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails Jiqian Chen
` (2 subsequent siblings)
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:19 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné, Anthony PERARD
vpci_remove_register() only supports removing a register in a time,
but the follow-on changes need to remove all registers within a
range. And vpci_remove_register() is only used for test currently.
So, refactor it to support removing all matched registers in a
calling time.
And it is no matter to remove a non exist register, so remove the
__must_check prefix.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
cc: Anthony PERARD <anthony.perard@vates.tech>
---
v2->v3 changes:
* Add new check to return error if registers overlap but not inside range.
v1->v2 changes:
new patch
Best regards,
Jiqian Chen.
---
tools/tests/vpci/main.c | 4 ++--
xen/drivers/vpci/vpci.c | 34 ++++++++++++++++++++--------------
xen/include/xen/vpci.h | 4 ++--
3 files changed, 24 insertions(+), 18 deletions(-)
diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
index 33223db3eb77..ca72877d60cd 100644
--- a/tools/tests/vpci/main.c
+++ b/tools/tests/vpci/main.c
@@ -132,10 +132,10 @@ static void vpci_write32_mask(const struct pci_dev *pdev, unsigned int reg,
rsvdz_mask))
#define VPCI_REMOVE_REG(off, size) \
- assert(!vpci_remove_register(test_pdev.vpci, off, size))
+ assert(!vpci_remove_registers(test_pdev.vpci, off, size))
#define VPCI_REMOVE_INVALID_REG(off, size) \
- assert(vpci_remove_register(test_pdev.vpci, off, size))
+ assert(vpci_remove_registers(test_pdev.vpci, off, size))
/* Read a 32b register using all possible sizes. */
void multiread4_check(unsigned int reg, uint32_t val)
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 8ff5169bdd18..904770628a2a 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -497,34 +497,40 @@ int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
return 0;
}
-int vpci_remove_register(struct vpci *vpci, unsigned int offset,
- unsigned int size)
+int vpci_remove_registers(struct vpci *vpci, unsigned int start,
+ unsigned int size)
{
- const struct vpci_register r = { .offset = offset, .size = size };
struct vpci_register *rm;
+ unsigned int end = start + size;
spin_lock(&vpci->lock);
list_for_each_entry ( rm, &vpci->handlers, node )
{
- int cmp = vpci_register_cmp(&r, rm);
-
- /*
- * NB: do not use a switch so that we can use break to
- * get out of the list loop earlier if required.
- */
- if ( !cmp && rm->offset == offset && rm->size == size )
+ /* Remove rm if rm is inside the range. */
+ if ( rm->offset >= start && rm->offset + rm->size <= end )
{
+ struct vpci_register *prev =
+ list_entry(rm->node.prev, struct vpci_register, node);
+
list_del(&rm->node);
- spin_unlock(&vpci->lock);
xfree(rm);
- return 0;
+ rm = prev;
+ continue;
}
- if ( cmp <= 0 )
+
+ /* Return error if registers overlap but not inside. */
+ if ( rm->offset + rm->size > start && rm->offset < end )
+ {
+ spin_unlock(&vpci->lock);
+ return -EINVAL;
+ }
+
+ if ( start < rm->offset )
break;
}
spin_unlock(&vpci->lock);
- return -ENOENT;
+ return 0;
}
/* Wrappers for performing reads/writes to the underlying hardware. */
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 8e815b418b7d..4e226331fdf3 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -69,8 +69,8 @@ static inline int __must_check vpci_add_register(struct vpci *vpci,
size, data, 0, 0, 0, 0);
}
-int __must_check vpci_remove_register(struct vpci *vpci, unsigned int offset,
- unsigned int size);
+int vpci_remove_registers(struct vpci *vpci, unsigned int start,
+ unsigned int size);
/* Generic read/write handlers for the PCI config space. */
uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size);
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (7 preceding siblings ...)
2025-04-21 6:19 ` [PATCH v3 08/11] vpci: Refactor vpci_remove_register to remove matched registers Jiqian Chen
@ 2025-04-21 6:19 ` Jiqian Chen
2025-05-08 9:39 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 10/11] vpci/msi: Free MSI resources when init_msi() fails Jiqian Chen
2025-04-21 6:19 ` [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources Jiqian Chen
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:19 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
When init_rebar() fails, the previous new changes will hide Rebar
capability, it can't rely on vpci_deassign_device() to remove all
Rebar related registers anymore, those registers must be removed
fini_rebar().
To do that, call vpci_remove_registers() to remove all possible
registered registers.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* Use fini_rebar() to remove all register instead of in the failure path of init_rebar();
v1->v2 changes:
* Called vpci_remove_registers() to remove all possible registered registers instead of using a array to record all registered register.
Best regards,
Jiqian Chen.
---
xen/drivers/vpci/rebar.c | 35 ++++++++++++++++++++++++-----------
1 file changed, 24 insertions(+), 11 deletions(-)
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
index 026f8f7972d9..325090afb0f8 100644
--- a/xen/drivers/vpci/rebar.c
+++ b/xen/drivers/vpci/rebar.c
@@ -49,6 +49,26 @@ static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
bar->guest_addr = bar->addr;
}
+static void fini_rebar(struct pci_dev *pdev)
+{
+ uint32_t ctrl;
+ unsigned int nbars;
+ unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
+ PCI_EXT_CAP_ID_REBAR);
+
+ if ( !rebar_offset || !is_hardware_domain(pdev->domain) )
+ return;
+
+ ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
+ nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
+ /*
+ * Remove all possible registered registers except header.
+ * Header register will be removed in mask function.
+ */
+ vpci_remove_registers(pdev->vpci, rebar_offset + PCI_REBAR_CAP(0),
+ PCI_REBAR_CTRL(nbars - 1));
+}
+
static int cf_check init_rebar(struct pci_dev *pdev)
{
uint32_t ctrl;
@@ -80,7 +100,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
{
printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
pdev->domain, &pdev->sbdf, index);
- continue;
+ return -EINVAL;
}
bar = &pdev->vpci->header.bars[index];
@@ -88,7 +108,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
{
printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
pdev->domain, &pdev->sbdf, index);
- continue;
+ return -EINVAL;
}
rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
@@ -97,14 +117,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
{
printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL rc=%d\n",
pdev->domain, &pdev->sbdf, index, rc);
- /*
- * Ideally we would hide the ReBar capability on error, but code
- * for doing so still needs to be written. Use continue instead
- * to keep any already setup register hooks, as returning an
- * error will cause the hardware domain to get unmediated access
- * to all device registers.
- */
- continue;
+ return rc;
}
bar->resizable_sizes =
@@ -118,7 +131,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
return 0;
}
-REGISTER_VPCI_EXTENDED_CAP(PCI_EXT_CAP_ID_REBAR, init_rebar, NULL);
+REGISTER_VPCI_EXTENDED_CAP(PCI_EXT_CAP_ID_REBAR, init_rebar, fini_rebar);
/*
* Local variables:
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 10/11] vpci/msi: Free MSI resources when init_msi() fails
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (8 preceding siblings ...)
2025-04-21 6:19 ` [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails Jiqian Chen
@ 2025-04-21 6:19 ` Jiqian Chen
2025-05-08 9:57 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources Jiqian Chen
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:19 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
When init_msi() fails, the previous new changes will hide MSI
capability, it can't rely on vpci_deassign_device() to remove
all MSI related resources anymore, those resources must be
cleaned up in failure path of init_msi.
To do that, add a new function to free MSI resources.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* Remove all fail path, and use fini_msi() hook instead.
* Change the method to calculating the size of msi registers.
v1->v2 changes:
* Added a new function fini_msi to free all MSI resources instead of using an array to record registered registers.
Best regards,
Jiqian Chen.
---
xen/drivers/vpci/msi.c | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index ea7dc0c060ea..18b06b789827 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -193,6 +193,32 @@ static void cf_check mask_write(
msi->mask = val;
}
+static void fini_msi(struct pci_dev *pdev)
+{
+ unsigned int end, size;
+ struct vpci *vpci = pdev->vpci;
+ const unsigned int msi_pos = pdev->msi_pos;
+
+ if ( !msi_pos || !vpci->msi )
+ return;
+
+ if ( vpci->msi->masking )
+ end = msi_pending_bits_reg(msi_pos, vpci->msi->address64);
+ else
+ end = msi_mask_bits_reg(msi_pos, vpci->msi->address64) - 2;
+
+ size = end - msi_control_reg(msi_pos);
+
+ /*
+ * Remove all possible registered registers except capability ID
+ * register if guest and next capability pointer register, which
+ * will be removed in mask function.
+ */
+ vpci_remove_registers(vpci, msi_control_reg(msi_pos), size);
+ xfree(vpci->msi);
+ vpci->msi = NULL;
+}
+
static int cf_check init_msi(struct pci_dev *pdev)
{
unsigned int pos = pdev->msi_pos;
@@ -270,7 +296,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
return 0;
}
-REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSI, init_msi, NULL);
+REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSI, init_msi, fini_msi);
void vpci_dump_msi(void)
{
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
` (9 preceding siblings ...)
2025-04-21 6:19 ` [PATCH v3 10/11] vpci/msi: Free MSI resources when init_msi() fails Jiqian Chen
@ 2025-04-21 6:19 ` Jiqian Chen
2025-05-08 10:04 ` Roger Pau Monné
10 siblings, 1 reply; 53+ messages in thread
From: Jiqian Chen @ 2025-04-21 6:19 UTC (permalink / raw)
To: xen-devel; +Cc: Huang Rui, Jiqian Chen, Roger Pau Monné
When init_msix() fails, it needs to clean all MSIX resources.
So, add a new function to do that.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v2->v3 changes:
* Remove unnecessary clean operations in fini_msix().
v1->v2 changes:
new patch.
Best regards,
Jiqian Chen.
---
xen/drivers/vpci/msix.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 0228ffd9fda9..e322c260f6bc 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -703,6 +703,25 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
return 0;
}
+static void fini_msix(struct pci_dev *pdev)
+{
+ struct vpci *vpci = pdev->vpci;
+ unsigned int msix_pos = pdev->msix_pos;
+
+ if ( !msix_pos || !vpci->msix )
+ return;
+
+ list_del(&vpci->msix->next);
+
+ for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
+ if ( vpci->msix->table[i] )
+ iounmap(vpci->msix->table[i]);
+
+ vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
+ xfree(vpci->msix);
+ vpci->msix = NULL;
+}
+
static int cf_check init_msix(struct pci_dev *pdev)
{
struct domain *d = pdev->domain;
@@ -757,7 +776,7 @@ static int cf_check init_msix(struct pci_dev *pdev)
return rc;
}
-REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSIX, init_msix, NULL);
+REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSIX, init_msix, fini_msix);
/*
* Local variables:
--
2.34.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v3 02/11] driver/pci: Get next capability without passing caps
2025-04-21 6:18 ` [PATCH v3 02/11] driver/pci: Get next capability without passing caps Jiqian Chen
@ 2025-04-22 15:59 ` Jan Beulich
2025-04-23 3:13 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2025-04-22 15:59 UTC (permalink / raw)
To: Jiqian Chen
Cc: Huang Rui, Andrew Cooper, Anthony PERARD, Michal Orzel,
Julien Grall, Roger Pau Monné, Stefano Stabellini, xen-devel
On 21.04.2025 08:18, Jiqian Chen wrote:
> Modify function pci_find_next_cap_ttl to support returning position
> of next capability when size "n" is zero.
>
> That can help caller to get next capability offset if caller just
> has a information of current capability offset.
>
> That will be used in a follow-on change.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
albeit ...
> --- a/xen/drivers/pci/pci.c
> +++ b/xen/drivers/pci/pci.c
> @@ -55,6 +55,9 @@ unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
>
> if ( id == 0xff )
> break;
> + if ( n == 0 )
> + return pos;
> + ASSERT(caps);
> for ( i = 0; i < n; i++ )
> {
> if ( id == caps[i] )
... blank lines around you insertion might have been nice. I may take the
liberty of adding them while committing.
Jan
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-04-21 6:18 ` [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0 Jiqian Chen
@ 2025-04-22 16:01 ` Jan Beulich
2025-04-23 3:31 ` Chen, Jiqian
2025-05-06 13:47 ` Roger Pau Monné
2025-05-06 13:50 ` Roger Pau Monné
2 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2025-04-22 16:01 UTC (permalink / raw)
To: Jiqian Chen; +Cc: Huang Rui, Roger Pau Monné, xen-devel
On 21.04.2025 08:18, Jiqian Chen wrote:
> @@ -759,10 +759,11 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
> PCI_CAP_ID_MSI,
> PCI_CAP_ID_MSIX,
> };
> + const unsigned int *caps = is_hwdom ? NULL : supported_caps;
> + const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(supported_caps);
>
> next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> - supported_caps,
> - ARRAY_SIZE(supported_caps), &ttl);
> + caps, n, &ttl);
As per the v3 adjustment to patch 02, you can pass supported_caps here in
all cases. Only n needs to be zero for the hwdom case.
Jan
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-04-21 6:18 ` [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT Jiqian Chen
@ 2025-04-22 16:03 ` Jan Beulich
2025-04-23 3:49 ` Chen, Jiqian
2025-05-06 14:37 ` Roger Pau Monné
1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2025-04-22 16:03 UTC (permalink / raw)
To: Jiqian Chen
Cc: Huang Rui, Roger Pau Monné, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel
On 21.04.2025 08:18, Jiqian Chen wrote:
> Refactor REGISTER_VPCI_INIT to contain more capability specific
> information, this is benifit for follow-on changes to hide capability
> which initialization fails.
>
> What's more, change the definition of init_header() since it is
> not a capability and it is needed for all devices' PCI config space.
>
> Note:
> Call vpci_make_msix_hole() in the end of init_msix() since the
> change of sequence of init_header() and init_msix().
> The fini hook will be implemented in follow-on changes.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
From the description I can't derive the need for ...
> --- a/xen/include/xen/xen.lds.h
> +++ b/xen/include/xen/xen.lds.h
> @@ -187,7 +187,7 @@
> #define VPCI_ARRAY \
> . = ALIGN(POINTER_ALIGN); \
> __start_vpci_array = .; \
> - *(SORT(.data.vpci.*)) \
> + *(.data.vpci.*) \
> __end_vpci_array = .;
> #else
> #define VPCI_ARRAY
... this change.
Jan
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-04-21 6:18 ` [PATCH v3 07/11] vpci: Hide extended " Jiqian Chen
@ 2025-04-22 16:06 ` Jan Beulich
2025-05-08 9:16 ` Chen, Jiqian
2025-05-06 16:21 ` Roger Pau Monné
1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2025-04-22 16:06 UTC (permalink / raw)
To: Jiqian Chen; +Cc: Huang Rui, Roger Pau Monné, xen-devel
On 21.04.2025 08:18, Jiqian Chen wrote:
> --- a/xen/include/xen/pci_regs.h
> +++ b/xen/include/xen/pci_regs.h
> @@ -449,6 +449,7 @@
> #define PCI_EXT_CAP_ID(header) ((header) & 0x0000ffff)
> #define PCI_EXT_CAP_VER(header) (((header) >> 16) & 0xf)
> #define PCI_EXT_CAP_NEXT(header) (((header) >> 20) & 0xffc)
> +#define PCI_EXT_CAP_NEXT_MASK 0xFFC00000U
To avoid introducing redundancy, imo this addition calls for
#define PCI_EXT_CAP_NEXT(header) MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK)
now.
Jan
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 02/11] driver/pci: Get next capability without passing caps
2025-04-22 15:59 ` Jan Beulich
@ 2025-04-23 3:13 ` Chen, Jiqian
0 siblings, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-04-23 3:13 UTC (permalink / raw)
To: Jan Beulich
Cc: Huang, Ray, Andrew Cooper, Anthony PERARD, Orzel, Michal,
Julien Grall, Roger Pau Monné, Stefano Stabellini,
xen-devel@lists.xenproject.org, Chen, Jiqian
On 2025/4/22 23:59, Jan Beulich wrote:
> On 21.04.2025 08:18, Jiqian Chen wrote:
>> Modify function pci_find_next_cap_ttl to support returning position
>> of next capability when size "n" is zero.
>>
>> That can help caller to get next capability offset if caller just
>> has a information of current capability offset.
>>
>> That will be used in a follow-on change.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>
> Acked-by: Jan Beulich <jbeulich@suse.com>
Thank you.
> albeit ...
>
>> --- a/xen/drivers/pci/pci.c
>> +++ b/xen/drivers/pci/pci.c
>> @@ -55,6 +55,9 @@ unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos,
>>
>> if ( id == 0xff )
>> break;
>> + if ( n == 0 )
>> + return pos;
>> + ASSERT(caps);
>> for ( i = 0; i < n; i++ )
>> {
>> if ( id == caps[i] )
>
> ... blank lines around you insertion might have been nice. I may take the
> liberty of adding them while committing.
That's fine, please add.
>
> Jan
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-04-22 16:01 ` Jan Beulich
@ 2025-04-23 3:31 ` Chen, Jiqian
2025-04-23 7:27 ` Jan Beulich
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-04-23 3:31 UTC (permalink / raw)
To: Jan Beulich
Cc: Roger Pau Monné, xen-devel@lists.xenproject.org, Huang, Ray,
Chen, Jiqian
On 2025/4/23 00:01, Jan Beulich wrote:
> On 21.04.2025 08:18, Jiqian Chen wrote:
>> @@ -759,10 +759,11 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>> PCI_CAP_ID_MSI,
>> PCI_CAP_ID_MSIX,
>> };
>> + const unsigned int *caps = is_hwdom ? NULL : supported_caps;
>> + const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(supported_caps);
>>
>> next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
>> - supported_caps,
>> - ARRAY_SIZE(supported_caps), &ttl);
>> + caps, n, &ttl);
>
> As per the v3 adjustment to patch 02, you can pass supported_caps here in
> all cases. Only n needs to be zero for the hwdom case.
Oh, right. I will change in next version.
>
> Jan
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-04-22 16:03 ` Jan Beulich
@ 2025-04-23 3:49 ` Chen, Jiqian
2025-04-23 7:36 ` Jan Beulich
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-04-23 3:49 UTC (permalink / raw)
To: Jan Beulich
Cc: Huang, Ray, Roger Pau Monné, Andrew Cooper, Anthony PERARD,
Orzel, Michal, Julien Grall, Stefano Stabellini,
xen-devel@lists.xenproject.org, Chen, Jiqian
On 2025/4/23 00:03, Jan Beulich wrote:
> On 21.04.2025 08:18, Jiqian Chen wrote:
>> Refactor REGISTER_VPCI_INIT to contain more capability specific
>> information, this is benifit for follow-on changes to hide capability
>> which initialization fails.
>>
>> What's more, change the definition of init_header() since it is
>> not a capability and it is needed for all devices' PCI config space.
>>
>> Note:
>> Call vpci_make_msix_hole() in the end of init_msix() since the
>> change of sequence of init_header() and init_msix().
>> The fini hook will be implemented in follow-on changes.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>
> From the description I can't derive the need for ...
>
>> --- a/xen/include/xen/xen.lds.h
>> +++ b/xen/include/xen/xen.lds.h
>> @@ -187,7 +187,7 @@
>> #define VPCI_ARRAY \
>> . = ALIGN(POINTER_ALIGN); \
>> __start_vpci_array = .; \
>> - *(SORT(.data.vpci.*)) \
>> + *(.data.vpci.*) \
>> __end_vpci_array = .;
>> #else
>> #define VPCI_ARRAY
>
> ... this change.
As I understand this, this is used for initializing all capabilities according to priority before.
That is msix > header > other capabilities.
My patch removes the priority and initializing all capabilities doesn't depend on priority anymore.
So I think this is not needed anymore.
Do you mean I should add some explanation in the commit message?
>
> Jan
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-04-23 3:31 ` Chen, Jiqian
@ 2025-04-23 7:27 ` Jan Beulich
0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2025-04-23 7:27 UTC (permalink / raw)
To: Chen, Jiqian
Cc: Roger Pau Monné, xen-devel@lists.xenproject.org, Huang, Ray
On 23.04.2025 05:31, Chen, Jiqian wrote:
> On 2025/4/23 00:01, Jan Beulich wrote:
>> On 21.04.2025 08:18, Jiqian Chen wrote:
>>> @@ -759,10 +759,11 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>>> PCI_CAP_ID_MSI,
>>> PCI_CAP_ID_MSIX,
>>> };
>>> + const unsigned int *caps = is_hwdom ? NULL : supported_caps;
>>> + const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(supported_caps);
>>>
>>> next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
>>> - supported_caps,
>>> - ARRAY_SIZE(supported_caps), &ttl);
>>> + caps, n, &ttl);
>>
>> As per the v3 adjustment to patch 02, you can pass supported_caps here in
>> all cases. Only n needs to be zero for the hwdom case.
> Oh, right. I will change in next version.
And, at the risk of stating the obvious, a brief comment might be a good
idea here.
Jan
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-04-23 3:49 ` Chen, Jiqian
@ 2025-04-23 7:36 ` Jan Beulich
2025-04-23 8:17 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2025-04-23 7:36 UTC (permalink / raw)
To: Chen, Jiqian
Cc: Huang, Ray, Roger Pau Monné, Andrew Cooper, Anthony PERARD,
Orzel, Michal, Julien Grall, Stefano Stabellini,
xen-devel@lists.xenproject.org
On 23.04.2025 05:49, Chen, Jiqian wrote:
> On 2025/4/23 00:03, Jan Beulich wrote:
>> On 21.04.2025 08:18, Jiqian Chen wrote:
>>> Refactor REGISTER_VPCI_INIT to contain more capability specific
>>> information, this is benifit for follow-on changes to hide capability
>>> which initialization fails.
>>>
>>> What's more, change the definition of init_header() since it is
>>> not a capability and it is needed for all devices' PCI config space.
>>>
>>> Note:
>>> Call vpci_make_msix_hole() in the end of init_msix() since the
>>> change of sequence of init_header() and init_msix().
>>> The fini hook will be implemented in follow-on changes.
>>>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>
>> From the description I can't derive the need for ...
>>
>>> --- a/xen/include/xen/xen.lds.h
>>> +++ b/xen/include/xen/xen.lds.h
>>> @@ -187,7 +187,7 @@
>>> #define VPCI_ARRAY \
>>> . = ALIGN(POINTER_ALIGN); \
>>> __start_vpci_array = .; \
>>> - *(SORT(.data.vpci.*)) \
>>> + *(.data.vpci.*) \
>>> __end_vpci_array = .;
>>> #else
>>> #define VPCI_ARRAY
>>
>> ... this change.
> As I understand this, this is used for initializing all capabilities according to priority before.
> That is msix > header > other capabilities.
> My patch removes the priority and initializing all capabilities doesn't depend on priority anymore.
> So I think this is not needed anymore.
Perhaps, but the word "priority" doesn't even appear in the description. So
yes, ...
> Do you mean I should add some explanation in the commit message?
... there's something to add there. But there's also the question of why the
change doesn't go further: With the SORT() dropped, what's the trailing .*
in the section name for? That's apparently connected to the puzzling
+ static vpci_capability_t *const x##_entry \
+ __used_section(".data.vpci.") = &(x##_t)
What's the trailing dot for here?
(As a nit - I also don't see why x##_t would need parenthesizing when
x##_entry doesn't. Is there another Misra gem which makes this necessary?)
Jan
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-04-23 7:36 ` Jan Beulich
@ 2025-04-23 8:17 ` Chen, Jiqian
0 siblings, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-04-23 8:17 UTC (permalink / raw)
To: Jan Beulich
Cc: Roger Pau Monné, Andrew Cooper, Anthony PERARD,
Orzel, Michal, Julien Grall, Stefano Stabellini,
xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/4/23 15:36, Jan Beulich wrote:
> On 23.04.2025 05:49, Chen, Jiqian wrote:
>> On 2025/4/23 00:03, Jan Beulich wrote:
>>> On 21.04.2025 08:18, Jiqian Chen wrote:
>>>> Refactor REGISTER_VPCI_INIT to contain more capability specific
>>>> information, this is benifit for follow-on changes to hide capability
>>>> which initialization fails.
>>>>
>>>> What's more, change the definition of init_header() since it is
>>>> not a capability and it is needed for all devices' PCI config space.
>>>>
>>>> Note:
>>>> Call vpci_make_msix_hole() in the end of init_msix() since the
>>>> change of sequence of init_header() and init_msix().
>>>> The fini hook will be implemented in follow-on changes.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>
>>> From the description I can't derive the need for ...
>>>
>>>> --- a/xen/include/xen/xen.lds.h
>>>> +++ b/xen/include/xen/xen.lds.h
>>>> @@ -187,7 +187,7 @@
>>>> #define VPCI_ARRAY \
>>>> . = ALIGN(POINTER_ALIGN); \
>>>> __start_vpci_array = .; \
>>>> - *(SORT(.data.vpci.*)) \
>>>> + *(.data.vpci.*) \
>>>> __end_vpci_array = .;
>>>> #else
>>>> #define VPCI_ARRAY
>>>
>>> ... this change.
>> As I understand this, this is used for initializing all capabilities according to priority before.
>> That is msix > header > other capabilities.
>> My patch removes the priority and initializing all capabilities doesn't depend on priority anymore.
>> So I think this is not needed anymore.
>
> Perhaps, but the word "priority" doesn't even appear in the description. So
> yes, ...
I will add description about "priority" removal in commit message in next version.
>
>> Do you mean I should add some explanation in the commit message?
>
> ... there's something to add there. But there's also the question of why the
> change doesn't go further: With the SORT() dropped, what's the trailing .*
> in the section name for? That's apparently connected to the puzzling
>
> + static vpci_capability_t *const x##_entry \
> + __used_section(".data.vpci.") = &(x##_t)
>
> What's the trailing dot for here?
Thanks for catching this problem.
I forgot to delete the dot and ".*", will delete them in next version.
>
> (As a nit - I also don't see why x##_t would need parenthesizing when
> x##_entry doesn't. Is there another Misra gem which makes this necessary?)
Oh, I will delete the parentheses in next version.
>
> Jan
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function
2025-04-21 6:18 ` [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function Jiqian Chen
@ 2025-04-29 7:10 ` Chen, Jiqian
2025-05-06 13:30 ` Roger Pau Monné
1 sibling, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-04-29 7:10 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
Hi,
On 2025/4/21 14:18, Jiqian Chen wrote:
> No functional changes.
> Follow-on changes will benifit from this.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> new patch.
Are there any other changes that need to be made to this series of patches?
If have, I can modify them in my upcoming new version.
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/header.c | 138 ++++++++++++++++++++------------------
> 1 file changed, 73 insertions(+), 65 deletions(-)
>
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index ef6c965c081c..3e9b44454b43 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -745,6 +745,75 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
> return !bar->mem ? -ENOMEM : 0;
> }
>
> +static int vpci_init_capability_list(struct pci_dev *pdev)
> +{
> + int rc;
> + bool mask_cap_list = false;
> +
> + if ( !is_hardware_domain(pdev->domain) &&
> + pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
> + {
> + /* Only expose capabilities to the guest that vPCI can handle. */
> + unsigned int next, ttl = 48;
> + static const unsigned int supported_caps[] = {
> + PCI_CAP_ID_MSI,
> + PCI_CAP_ID_MSIX,
> + };
> +
> + next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> + supported_caps,
> + ARRAY_SIZE(supported_caps), &ttl);
> +
> + rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> + PCI_CAPABILITY_LIST, 1,
> + (void *)(uintptr_t)next);
> + if ( rc )
> + return rc;
> +
> + next &= ~3;
> +
> + if ( !next )
> + /*
> + * If we don't have any supported capabilities to expose to the
> + * guest, mask the PCI_STATUS_CAP_LIST bit in the status
> + * register.
> + */
> + mask_cap_list = true;
> +
> + while ( next && ttl )
> + {
> + unsigned int pos = next;
> +
> + next = pci_find_next_cap_ttl(pdev->sbdf,
> + pos + PCI_CAP_LIST_NEXT,
> + supported_caps,
> + ARRAY_SIZE(supported_caps), &ttl);
> +
> + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> + pos + PCI_CAP_LIST_ID, 1, NULL);
> + if ( rc )
> + return rc;
> +
> + rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> + pos + PCI_CAP_LIST_NEXT, 1,
> + (void *)(uintptr_t)next);
> + if ( rc )
> + return rc;
> +
> + next &= ~3;
> + }
> + }
> +
> + /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
> + return vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
> + PCI_STATUS, 2, NULL,
> + PCI_STATUS_RO_MASK &
> + ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
> + PCI_STATUS_RW1C_MASK,
> + mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
> + PCI_STATUS_RSVDZ_MASK);
> +}
> +
> static int cf_check init_header(struct pci_dev *pdev)
> {
> uint16_t cmd;
> @@ -753,7 +822,6 @@ static int cf_check init_header(struct pci_dev *pdev)
> struct vpci_header *header = &pdev->vpci->header;
> struct vpci_bar *bars = header->bars;
> int rc;
> - bool mask_cap_list = false;
> bool is_hwdom = is_hardware_domain(pdev->domain);
>
> ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));
> @@ -794,61 +862,12 @@ static int cf_check init_header(struct pci_dev *pdev)
> if ( rc )
> return rc;
>
> + rc = vpci_init_capability_list(pdev);
> + if ( rc )
> + return rc;
> +
> if ( !is_hwdom )
> {
> - if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
> - {
> - /* Only expose capabilities to the guest that vPCI can handle. */
> - unsigned int next, ttl = 48;
> - static const unsigned int supported_caps[] = {
> - PCI_CAP_ID_MSI,
> - PCI_CAP_ID_MSIX,
> - };
> -
> - next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> - supported_caps,
> - ARRAY_SIZE(supported_caps), &ttl);
> -
> - rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> - PCI_CAPABILITY_LIST, 1,
> - (void *)(uintptr_t)next);
> - if ( rc )
> - return rc;
> -
> - next &= ~3;
> -
> - if ( !next )
> - /*
> - * If we don't have any supported capabilities to expose to the
> - * guest, mask the PCI_STATUS_CAP_LIST bit in the status
> - * register.
> - */
> - mask_cap_list = true;
> -
> - while ( next && ttl )
> - {
> - unsigned int pos = next;
> -
> - next = pci_find_next_cap_ttl(pdev->sbdf,
> - pos + PCI_CAP_LIST_NEXT,
> - supported_caps,
> - ARRAY_SIZE(supported_caps), &ttl);
> -
> - rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> - pos + PCI_CAP_LIST_ID, 1, NULL);
> - if ( rc )
> - return rc;
> -
> - rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> - pos + PCI_CAP_LIST_NEXT, 1,
> - (void *)(uintptr_t)next);
> - if ( rc )
> - return rc;
> -
> - next &= ~3;
> - }
> - }
> -
> /* Extended capabilities read as zero, write ignore */
> rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
> (void *)0);
> @@ -856,17 +875,6 @@ static int cf_check init_header(struct pci_dev *pdev)
> return rc;
> }
>
> - /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
> - rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
> - PCI_STATUS, 2, NULL,
> - PCI_STATUS_RO_MASK &
> - ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
> - PCI_STATUS_RW1C_MASK,
> - mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
> - PCI_STATUS_RSVDZ_MASK);
> - if ( rc )
> - return rc;
> -
> if ( pdev->ignore_bars )
> return 0;
>
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function
2025-04-21 6:18 ` [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function Jiqian Chen
2025-04-29 7:10 ` Chen, Jiqian
@ 2025-05-06 13:30 ` Roger Pau Monné
1 sibling, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 13:30 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:18:53PM +0800, Jiqian Chen wrote:
> No functional changes.
> Follow-on changes will benifit from this.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-04-21 6:18 ` [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0 Jiqian Chen
2025-04-22 16:01 ` Jan Beulich
@ 2025-05-06 13:47 ` Roger Pau Monné
2025-05-06 13:50 ` Roger Pau Monné
2 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 13:47 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:18:55PM +0800, Jiqian Chen wrote:
> Current logic of emulating legacy capability list is only for domU.
> So, expand it to emulate for dom0 too. Then it will be easy to hide
> a capability whose initialization fails in a function.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
With the comment from Jan addressed:
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-04-21 6:18 ` [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0 Jiqian Chen
2025-04-22 16:01 ` Jan Beulich
2025-05-06 13:47 ` Roger Pau Monné
@ 2025-05-06 13:50 ` Roger Pau Monné
2025-05-07 2:46 ` Chen, Jiqian
2 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 13:50 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:18:55PM +0800, Jiqian Chen wrote:
> Current logic of emulating legacy capability list is only for domU.
> So, expand it to emulate for dom0 too. Then it will be easy to hide
> a capability whose initialization fails in a function.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Sorry, one nit I've noticed while looking at the next patch.
> @@ -786,13 +787,15 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>
> next = pci_find_next_cap_ttl(pdev->sbdf,
> pos + PCI_CAP_LIST_NEXT,
> - supported_caps,
> - ARRAY_SIZE(supported_caps), &ttl);
> + caps, n, &ttl);
>
> - rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> - pos + PCI_CAP_LIST_ID, 1, NULL);
> - if ( rc )
> - return rc;
> + if ( !is_hwdom )
> + {
> + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> + pos + PCI_CAP_LIST_ID, 1, NULL);
> + if ( rc )
> + return rc;
> + }
>
> rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
For the hardware domain the write handler should be vpci_hw_write8
instead of NULL.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 04/11] vpci/header: Emulate extended capability list for dom0
2025-04-21 6:18 ` [PATCH v3 04/11] vpci/header: Emulate extended " Jiqian Chen
@ 2025-05-06 14:14 ` Roger Pau Monné
2025-05-07 3:32 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 14:14 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:18:56PM +0800, Jiqian Chen wrote:
> Add a new function to emulate extended capability list for dom0,
> and call it in init_header(). So that it will be easy to hide a
> extended capability whose initialization fails.
>
> As for the extended capability list of domU, just move the logic
> into above function and keep hiding it for domU.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> * In vpci_init_ext_capability_list(), when domain is domU, directly return after adding a handler(hiding all extended capability for domU).
> * In vpci_init_ext_capability_list(), change condition to be "while ( pos >= 0x100U && ttl-- )" instead of "while ( pos && ttl-- )".
> * Add new function vpci_hw_write32, and pass it to extended capability handler for dom0.
>
> v1->v2 changes:
> new patch
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/header.c | 36 ++++++++++++++++++++++++++++--------
> xen/drivers/vpci/vpci.c | 6 ++++++
> xen/include/xen/vpci.h | 2 ++
> 3 files changed, 36 insertions(+), 8 deletions(-)
>
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index c98cd211d9d7..ee94ad8e5037 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -817,6 +817,31 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
> PCI_STATUS_RSVDZ_MASK);
> }
>
> +static int vpci_init_ext_capability_list(struct pci_dev *pdev)
> +{
> + unsigned int pos = PCI_CFG_SPACE_SIZE, ttl = 480;
> +
> + if ( !is_hardware_domain(pdev->domain) )
> + /* Extended capabilities read as zero, write ignore for guest */
> + return vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> + pos, 4, (void *)0);
> +
> + while ( pos >= PCI_CFG_SPACE_SIZE && ttl-- )
> + {
> + uint32_t header = pci_conf_read32(pdev->sbdf, pos);
> + int rc;
I'm thinking it might be helpful to avoid setting the handler for the
last capability on the list, or simply for devices that have no
extended capabilities at all:
if ( PCI_EXT_CAP_NEXT(header) >= PCI_CFG_SPACE_SIZE )
{
int rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
pos, 4, (void *)(uintptr_t)header);
if ( rc )
return rc;
}
Otherwise on systems with a lot of devices it can be quite wasteful to
set a handler to just return the next capability as 0.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-04-21 6:18 ` [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT Jiqian Chen
2025-04-22 16:03 ` Jan Beulich
@ 2025-05-06 14:37 ` Roger Pau Monné
2025-05-07 5:59 ` Chen, Jiqian
1 sibling, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 14:37 UTC (permalink / raw)
To: Jiqian Chen
Cc: xen-devel, Huang Rui, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Stefano Stabellini
On Mon, Apr 21, 2025 at 02:18:57PM +0800, Jiqian Chen wrote:
> Refactor REGISTER_VPCI_INIT to contain more capability specific
> information, this is benifit for follow-on changes to hide capability
> which initialization fails.
>
> What's more, change the definition of init_header() since it is
> not a capability and it is needed for all devices' PCI config space.
>
> Note:
> Call vpci_make_msix_hole() in the end of init_msix() since the
> change of sequence of init_header() and init_msix().
> The fini hook will be implemented in follow-on changes.
I would maybe add that the cleanup hook is also added in this change,
even if it's still unused. Further changes will make use of it.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> cc: Andrew Cooper <andrew.cooper3@citrix.com>
> cc: Anthony PERARD <anthony.perard@vates.tech>
> cc: Michal Orzel <michal.orzel@amd.com>
> cc: Jan Beulich <jbeulich@suse.com>
> cc: Julien Grall <julien@xen.org>
> cc: Stefano Stabellini <sstabellini@kernel.org>
> ---
> v2->v3 changes:
> * This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
> * Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
> * Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
>
> v1->v2 changes:
> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove failed capability from list.
> * Called vpci_make_msix_hole() in the end of init_msix().
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/header.c | 3 +--
> xen/drivers/vpci/msi.c | 2 +-
> xen/drivers/vpci/msix.c | 8 +++++--
> xen/drivers/vpci/rebar.c | 2 +-
> xen/drivers/vpci/vpci.c | 48 +++++++++++++++++++++++++++++++--------
> xen/include/xen/vpci.h | 28 ++++++++++++++++-------
> xen/include/xen/xen.lds.h | 2 +-
> 7 files changed, 68 insertions(+), 25 deletions(-)
>
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index ee94ad8e5037..afe4bcdfcb30 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -842,7 +842,7 @@ static int vpci_init_ext_capability_list(struct pci_dev *pdev)
> return 0;
> }
>
> -static int cf_check init_header(struct pci_dev *pdev)
> +int vpci_init_header(struct pci_dev *pdev)
> {
> uint16_t cmd;
> uint64_t addr, size;
> @@ -1038,7 +1038,6 @@ static int cf_check init_header(struct pci_dev *pdev)
> pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
> return rc;
> }
> -REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
>
> /*
> * Local variables:
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index 66e5a8a116be..ea7dc0c060ea 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
>
> return 0;
> }
> -REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
> +REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSI, init_msi, NULL);
>
> void vpci_dump_msi(void)
> {
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 6bd8c55bb48e..0228ffd9fda9 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -751,9 +751,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
> pdev->vpci->msix = msix;
> list_add(&msix->next, &d->arch.hvm.msix_tables);
>
> - return 0;
> + spin_lock(&pdev->vpci->lock);
> + rc = vpci_make_msix_hole(pdev);
> + spin_unlock(&pdev->vpci->lock);
> +
> + return rc;
> }
> -REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
> +REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSIX, init_msix, NULL);
>
> /*
> * Local variables:
> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
> index 793937449af7..026f8f7972d9 100644
> --- a/xen/drivers/vpci/rebar.c
> +++ b/xen/drivers/vpci/rebar.c
> @@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
>
> return 0;
> }
> -REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
> +REGISTER_VPCI_EXTENDED_CAP(PCI_EXT_CAP_ID_REBAR, init_rebar, NULL);
>
> /*
> * Local variables:
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 3349b98389b8..5474b66668c1 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -36,8 +36,8 @@ struct vpci_register {
> };
>
> #ifdef __XEN__
> -extern vpci_register_init_t *const __start_vpci_array[];
> -extern vpci_register_init_t *const __end_vpci_array[];
> +extern vpci_capability_t *const __start_vpci_array[];
> +extern vpci_capability_t *const __end_vpci_array[];
> #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>
> #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> @@ -83,6 +83,36 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
>
> #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>
> +static int vpci_init_capabilities(struct pci_dev *pdev)
> +{
> + for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
> + {
> + const vpci_capability_t *capability = __start_vpci_array[i];
> + const unsigned int cap = capability->id;
> + const bool is_ext = capability->is_ext;
> + unsigned int pos;
> + int rc;
> +
> + if ( !is_hardware_domain(pdev->domain) && is_ext )
> + continue;
> +
> + if ( !is_ext )
> + pos = pci_find_cap_offset(pdev->sbdf, cap);
> + else
> + pos = pci_find_ext_capability(pdev->sbdf, cap);
> +
> + if ( !pos || !capability->init )
Isn't it bogus to have a vpci_capability_t entry with a NULL init
function?
> + continue;
> +
> + rc = capability->init(pdev);
> +
I wouldn't add a newline between the function call and checking the
return value, but that's just my taste.
> + if ( rc )
> + return rc;
> + }
> +
> + return 0;
> +}
> +
> void vpci_deassign_device(struct pci_dev *pdev)
> {
> unsigned int i;
> @@ -128,7 +158,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
>
> int vpci_assign_device(struct pci_dev *pdev)
> {
> - unsigned int i;
> const unsigned long *ro_map;
> int rc = 0;
>
> @@ -159,14 +188,13 @@ int vpci_assign_device(struct pci_dev *pdev)
> goto out;
> #endif
>
> - for ( i = 0; i < NUM_VPCI_INIT; i++ )
> - {
> - rc = __start_vpci_array[i](pdev);
> - if ( rc )
> - break;
> - }
> + rc = vpci_init_header(pdev);
> + if ( rc )
> + goto out;
> +
> + rc = vpci_init_capabilities(pdev);
>
> - out: __maybe_unused;
> + out:
> if ( rc )
> vpci_deassign_device(pdev);
>
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 9d47b8c1a50e..8e815b418b7d 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
> typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
> uint32_t val, void *data);
>
> -typedef int vpci_register_init_t(struct pci_dev *dev);
> -
> -#define VPCI_PRIORITY_HIGH "1"
> -#define VPCI_PRIORITY_MIDDLE "5"
> -#define VPCI_PRIORITY_LOW "9"
> +typedef struct {
> + unsigned int id;
> + bool is_ext;
> + int (*init)(struct pci_dev *pdev);
> + void (*fini)(struct pci_dev *pdev);
> +} vpci_capability_t;
>
> #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12)
>
> @@ -29,9 +30,20 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
> */
> #define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1)
>
> -#define REGISTER_VPCI_INIT(x, p) \
> - static vpci_register_init_t *const x##_entry \
> - __used_section(".data.vpci." p) = (x)
> +#define REGISTER_VPCI_CAP(cap, x, y, ext) \
x and y are not very helpful identifier names, better use some more
descriptive naming, init and fini? Same below.
> + static vpci_capability_t x##_t = { \
> + .id = (cap), \
> + .init = (x), \
> + .fini = (y), \
> + .is_ext = (ext), \
> + }; \
> + static vpci_capability_t *const x##_entry \
> + __used_section(".data.vpci.") = &(x##_t)
> +
> +#define REGISTER_VPCI_LEGACY_CAP(cap, x, y) REGISTER_VPCI_CAP(cap, x, y, false)
> +#define REGISTER_VPCI_EXTENDED_CAP(cap, x, y) REGISTER_VPCI_CAP(cap, x, y, true)
> +
> +int __must_check vpci_init_header(struct pci_dev *pdev);
>
> /* Assign vPCI to device by adding handlers. */
> int __must_check vpci_assign_device(struct pci_dev *pdev);
> diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
> index 16a9b1ba03db..c73222112dd3 100644
> --- a/xen/include/xen/xen.lds.h
> +++ b/xen/include/xen/xen.lds.h
> @@ -187,7 +187,7 @@
> #define VPCI_ARRAY \
> . = ALIGN(POINTER_ALIGN); \
> __start_vpci_array = .; \
> - *(SORT(.data.vpci.*)) \
> + *(.data.vpci.*) \
Aside from Jan comment, please keep the '\' aligned with the others on
the block.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize
2025-04-21 6:18 ` [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize Jiqian Chen
@ 2025-05-06 16:00 ` Roger Pau Monné
2025-05-07 6:38 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 16:00 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:18:58PM +0800, Jiqian Chen wrote:
> When vpci fails to initialize a legacy capability of device, it just
> return error instead of catching and processing exception. That makes
> the entire device unusable.
I think "catching and processing exception" is a weird terminology to
use when writing C. It's IMo more accurate to use:
"When vpci fails to initialize a legacy capability of device, it just
returns an error and vPCI gets disabled for the whole device. That
most likely renders the device unusable, plus possibly causing issues
to Xen itself if guest attempts to program the native MSI or MSI-X
capabilities if present."
> So, add new a function to hide legacy capability when initialization
> fails. And remove the failed legacy capability from the vpci emulated
> legacy capability list.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> * Separated from the last version patch "vpci: Hide capability when it fails to initialize"
> * Whole implementation changed because last version is wrong.
> This version adds a new helper function vpci_get_register() and uses it to get
> target handler and previous handler from vpci->handlers, then remove the target.
>
> v1->v2 changes:
> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
> remove failed capability from list.
> * Called vpci_make_msix_hole() in the end of init_msix().
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/vpci.c | 133 +++++++++++++++++++++++++++++++++-------
> 1 file changed, 112 insertions(+), 21 deletions(-)
>
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 5474b66668c1..f97c7cc460a0 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -35,6 +35,22 @@ struct vpci_register {
> uint32_t rsvdz_mask;
> };
>
> +static int vpci_register_cmp(const struct vpci_register *r1,
> + const struct vpci_register *r2)
> +{
> + /* Return 0 if registers overlap. */
> + if ( r1->offset < r2->offset + r2->size &&
> + r2->offset < r1->offset + r1->size )
> + return 0;
> + if ( r1->offset < r2->offset )
> + return -1;
> + if ( r1->offset > r2->offset )
> + return 1;
> +
> + ASSERT_UNREACHABLE();
> + return 0;
> +}
> +
> #ifdef __XEN__
> extern vpci_capability_t *const __start_vpci_array[];
> extern vpci_capability_t *const __end_vpci_array[];
> @@ -83,7 +99,91 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
>
> #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>
> -static int vpci_init_capabilities(struct pci_dev *pdev)
> +static struct vpci_register *vpci_get_register(struct vpci *vpci,
> + const unsigned int offset,
> + const unsigned int size)
We don't usually use const attributes for scalar function parameters.
> +{
> + const struct vpci_register r = { .offset = offset, .size = size };
> + struct vpci_register *rm;
> +
> + ASSERT(spin_is_locked(&vpci->lock));
> + list_for_each_entry ( rm, &vpci->handlers, node )
> + {
> + int cmp = vpci_register_cmp(&r, rm);
> +
> + if ( !cmp && rm->offset == offset && rm->size == size )
> + return rm;
> + if ( cmp <= 0 )
> + break;
> + }
> +
> + return NULL;
> +}
> +
> +static struct vpci_register *vpci_get_previous_cap_register
> + (struct vpci *vpci, const unsigned int offset)
The style preference here would be:
static struct vpci_register *vpci_get_previous_cap_register(
struct vpci *vpci, unsigned int offset)
{
...
> +{
> + uint32_t next;
> + struct vpci_register *r;
> +
> + if ( offset < 0x40 )
I would possibly add an ASSERT_UNREACHABLE() here, as attempting to
pass an offset below 0x40 is a sign of a bug elsewhere?
> + return NULL;
> +
> + r = vpci_get_register(vpci, PCI_CAPABILITY_LIST, 1);
> + ASSERT(r);
> +
> + next = (uint32_t)(uintptr_t)r->private;
> + while ( next >= 0x40 && next != offset )
> + {
> + r = vpci_get_register(vpci, next + PCI_CAP_LIST_NEXT, 1);
> + ASSERT(r);
> + next = (uint32_t)(uintptr_t)r->private;
> + }
> +
> + if ( next < 0x40 )
> + return NULL;
> +
> + return r;
> +}
> +
> +static void vpci_capability_mask(struct pci_dev *pdev,
This possibly needs to return an error code, as it can fail, and just
adding ASSERTs all around seems a bit clumsy, plus we might really
want to prevent assigning the device to the domain if
vpci_capability_mask() fails for whatever reason.
> + const unsigned int cap)
> +{
> + const unsigned int offset = pci_find_cap_offset(pdev->sbdf, cap);
> + struct vpci_register *prev_next_r, *next_r;
> + struct vpci *vpci = pdev->vpci;
> +
> + spin_lock(&vpci->lock);
> + next_r = vpci_get_register(vpci, offset + PCI_CAP_LIST_NEXT, 1);
> + if ( !next_r )
> + {
> + spin_unlock(&vpci->lock);
> + return;
> + }
> +
> + prev_next_r = vpci_get_previous_cap_register(vpci, offset);
> + ASSERT(prev_next_r);
> +
> + prev_next_r->private = next_r->private;
> +
> + if ( !is_hardware_domain(pdev->domain) )
> + {
> + struct vpci_register *id_r =
> + vpci_get_register(vpci, offset + PCI_CAP_LIST_ID, 1);
> +
> + ASSERT(id_r);
> + /* PCI_CAP_LIST_ID register of target capability */
> + list_del(&id_r->node);
> + xfree(id_r);
You could use vpci_remove_register() here?
> + }
> +
> + /* PCI_CAP_LIST_NEXT register of target capability */
> + list_del(&next_r->node);
> + spin_unlock(&vpci->lock);
> + xfree(next_r);
Here vpci_remove_register() could also be used, but it will involve
searching again for the register to remove, which is a bit pointless.
> +}
> +
> +static void vpci_init_capabilities(struct pci_dev *pdev)
> {
> for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
> {
> @@ -107,10 +207,17 @@ static int vpci_init_capabilities(struct pci_dev *pdev)
> rc = capability->init(pdev);
>
> if ( rc )
> - return rc;
> + {
> + if ( capability->fini )
> + capability->fini(pdev);
> +
> + printk(XENLOG_WARNING "%pd %pp: %s cap %u init fail rc=%d, mask it\n",
Best to split to next line:
printk(XENLOG_WARNING
"%pd %pp: %s cap %u init fail rc=%d, mask it\n",
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-04-21 6:18 ` [PATCH v3 07/11] vpci: Hide extended " Jiqian Chen
2025-04-22 16:06 ` Jan Beulich
@ 2025-05-06 16:21 ` Roger Pau Monné
2025-05-07 7:26 ` Chen, Jiqian
1 sibling, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 16:21 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:18:59PM +0800, Jiqian Chen wrote:
> When vpci fails to initialize a extended capability of device for dom0,
> it just return error instead of catching and processing exception. That
> makes the entire device unusable.
>
> So, add new a function to hide extended capability when initialization
> fails. And remove the failed extended capability handler from vpci
> extended capability list.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> * Separated from the last version patch "vpci: Hide capability when it fails to initialize".
> * Whole implementation changed because last version is wrong.
> This version gets target handler and previous handler from vpci->handlers, then remove the target.
> * Note: a case in function vpci_ext_capability_mask() needs to be discussed,
> because it may change the offset of next capability when the offset of target
> capability is 0x100U(the first extended capability), my implementation is just to
> ignore and let hardware to handle the target capability.
>
> v1->v2 changes:
> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
> remove failed capability from list.
> * Called vpci_make_msix_hole() in the end of init_msix().
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/vpci.c | 79 ++++++++++++++++++++++++++++++++++++++
> xen/include/xen/pci_regs.h | 1 +
> 2 files changed, 80 insertions(+)
>
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index f97c7cc460a0..8ff5169bdd18 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -183,6 +183,83 @@ static void vpci_capability_mask(struct pci_dev *pdev,
> xfree(next_r);
> }
>
> +static struct vpci_register *vpci_get_previous_ext_cap_register
> + (struct vpci *vpci, const unsigned int offset)
> +{
> + uint32_t header;
> + unsigned int pos = PCI_CFG_SPACE_SIZE;
> + struct vpci_register *r;
> +
> + if ( offset <= PCI_CFG_SPACE_SIZE )
> + return NULL;
> +
> + r = vpci_get_register(vpci, pos, 4);
> + ASSERT(r);
> +
> + header = (uint32_t)(uintptr_t)r->private;
> + pos = PCI_EXT_CAP_NEXT(header);
> + while ( pos > PCI_CFG_SPACE_SIZE && pos != offset )
> + {
> + r = vpci_get_register(vpci, pos, 4);
> + ASSERT(r);
> + header = (uint32_t)(uintptr_t)r->private;
> + pos = PCI_EXT_CAP_NEXT(header);
> + }
> +
> + if ( pos <= PCI_CFG_SPACE_SIZE )
> + return NULL;
> +
> + return r;
> +}
> +
> +static void vpci_ext_capability_mask(struct pci_dev *pdev,
> + const unsigned int cap)
> +{
> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
> + struct vpci_register *rm, *prev_r;
> + struct vpci *vpci = pdev->vpci;
> + uint32_t header, pre_header;
Maybe sanity check that offset is correct?
> + spin_lock(&vpci->lock);
> + rm = vpci_get_register(vpci, offset, 4);
> + if ( !rm )
> + {
> + spin_unlock(&vpci->lock);
> + return;
> + }
> +
> + header = (uint32_t)(uintptr_t)rm->private;
> + if ( offset == PCI_CFG_SPACE_SIZE )
> + {
> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
> + rm->private = (void *)(uintptr_t)0;
> + else
> + /*
> + * Else case needs to remove the capability in position 0x100U and
> + * moves the next capability to be in position 0x100U, that would
> + * cause the offset of next capability in vpci different from the
> + * hardware, then cause error accesses, so just ignore it here and
> + * hope hardware would handle the capability well.
> + */
> + printk(XENLOG_ERR "%pd %pp: ext cap %u is first cap, can't mask it\n",
> + pdev->domain, &pdev->sbdf, cap);
In this case, could you maybe replace just the capability ID part of
the header to return 0? That will likely cause the OS to continue
scanning the list, while ID 0 won't match which any known
capability.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 08/11] vpci: Refactor vpci_remove_register to remove matched registers
2025-04-21 6:19 ` [PATCH v3 08/11] vpci: Refactor vpci_remove_register to remove matched registers Jiqian Chen
@ 2025-05-06 16:29 ` Roger Pau Monné
0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-06 16:29 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui, Anthony PERARD
On Mon, Apr 21, 2025 at 02:19:00PM +0800, Jiqian Chen wrote:
> vpci_remove_register() only supports removing a register in a time,
> but the follow-on changes need to remove all registers within a
> range. And vpci_remove_register() is only used for test currently.
> So, refactor it to support removing all matched registers in a
> calling time.
>
> And it is no matter to remove a non exist register, so remove the
> __must_check prefix.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> cc: Anthony PERARD <anthony.perard@vates.tech>
> ---
> v2->v3 changes:
> * Add new check to return error if registers overlap but not inside range.
>
> v1->v2 changes:
> new patch
>
> Best regards,
> Jiqian Chen.
> ---
> tools/tests/vpci/main.c | 4 ++--
> xen/drivers/vpci/vpci.c | 34 ++++++++++++++++++++--------------
> xen/include/xen/vpci.h | 4 ++--
> 3 files changed, 24 insertions(+), 18 deletions(-)
>
> diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c
> index 33223db3eb77..ca72877d60cd 100644
> --- a/tools/tests/vpci/main.c
> +++ b/tools/tests/vpci/main.c
> @@ -132,10 +132,10 @@ static void vpci_write32_mask(const struct pci_dev *pdev, unsigned int reg,
> rsvdz_mask))
>
> #define VPCI_REMOVE_REG(off, size) \
> - assert(!vpci_remove_register(test_pdev.vpci, off, size))
> + assert(!vpci_remove_registers(test_pdev.vpci, off, size))
>
> #define VPCI_REMOVE_INVALID_REG(off, size) \
> - assert(vpci_remove_register(test_pdev.vpci, off, size))
> + assert(vpci_remove_registers(test_pdev.vpci, off, size))
>
> /* Read a 32b register using all possible sizes. */
> void multiread4_check(unsigned int reg, uint32_t val)
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 8ff5169bdd18..904770628a2a 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -497,34 +497,40 @@ int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
> return 0;
> }
>
> -int vpci_remove_register(struct vpci *vpci, unsigned int offset,
> - unsigned int size)
> +int vpci_remove_registers(struct vpci *vpci, unsigned int start,
> + unsigned int size)
> {
> - const struct vpci_register r = { .offset = offset, .size = size };
> struct vpci_register *rm;
> + unsigned int end = start + size;
>
> spin_lock(&vpci->lock);
> list_for_each_entry ( rm, &vpci->handlers, node )
You might want to use list_for_each_entry_safe ( ) so that...
> {
> - int cmp = vpci_register_cmp(&r, rm);
> -
> - /*
> - * NB: do not use a switch so that we can use break to
> - * get out of the list loop earlier if required.
> - */
> - if ( !cmp && rm->offset == offset && rm->size == size )
> + /* Remove rm if rm is inside the range. */
> + if ( rm->offset >= start && rm->offset + rm->size <= end )
> {
> + struct vpci_register *prev =
> + list_entry(rm->node.prev, struct vpci_register, node);
... you don't need to find prev here.
> list_del(&rm->node);
> - spin_unlock(&vpci->lock);
> xfree(rm);
> - return 0;
> + rm = prev;
> + continue;
> }
> - if ( cmp <= 0 )
> +
> + /* Return error if registers overlap but not inside. */
> + if ( rm->offset + rm->size > start && rm->offset < end )
> + {
> + spin_unlock(&vpci->lock);
> + return -EINVAL;
-ERANGE might be more descriptive here.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-05-06 13:50 ` Roger Pau Monné
@ 2025-05-07 2:46 ` Chen, Jiqian
2025-05-07 2:50 ` Chen, Jiqian
2025-05-07 7:49 ` Roger Pau Monné
0 siblings, 2 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 2:46 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/6 21:50, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:18:55PM +0800, Jiqian Chen wrote:
>> Current logic of emulating legacy capability list is only for domU.
>> So, expand it to emulate for dom0 too. Then it will be easy to hide
>> a capability whose initialization fails in a function.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>
> Sorry, one nit I've noticed while looking at the next patch.
>
>> @@ -786,13 +787,15 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>>
>> next = pci_find_next_cap_ttl(pdev->sbdf,
>> pos + PCI_CAP_LIST_NEXT,
>> - supported_caps,
>> - ARRAY_SIZE(supported_caps), &ttl);
>> + caps, n, &ttl);
>>
>> - rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
The same here, NULL -> vpci_hw_write8, I think.
>> - pos + PCI_CAP_LIST_ID, 1, NULL);
>> - if ( rc )
>> - return rc;
>> + if ( !is_hwdom )
>> + {
>> + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>> + pos + PCI_CAP_LIST_ID, 1, NULL);
>> + if ( rc )
>> + return rc;
>> + }
>>
>> rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>
> For the hardware domain the write handler should be vpci_hw_write8
> instead of NULL.
OK, I think I need to add definition of vpci_hw_write8.
But I have a question, if hardware domain write this register through vpci_hw_write8,
then the "next address data" of hardware will be in consistent with vpci.
Is it fine? Or should I update vpci's cache?
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-05-07 2:46 ` Chen, Jiqian
@ 2025-05-07 2:50 ` Chen, Jiqian
2025-05-07 7:49 ` Roger Pau Monné
1 sibling, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 2:50 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/7 10:46, Chen, Jiqian wrote:
> On 2025/5/6 21:50, Roger Pau Monné wrote:
>> On Mon, Apr 21, 2025 at 02:18:55PM +0800, Jiqian Chen wrote:
>>> Current logic of emulating legacy capability list is only for domU.
>>> So, expand it to emulate for dom0 too. Then it will be easy to hide
>>> a capability whose initialization fails in a function.
>>>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>
>> Sorry, one nit I've noticed while looking at the next patch.
>>
>>> @@ -786,13 +787,15 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>>>
>>> next = pci_find_next_cap_ttl(pdev->sbdf,
>>> pos + PCI_CAP_LIST_NEXT,
>>> - supported_caps,
>>> - ARRAY_SIZE(supported_caps), &ttl);
>>> + caps, n, &ttl);
>>>
>>> - rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> The same here, NULL -> vpci_hw_write8, I think.
>
>>> - pos + PCI_CAP_LIST_ID, 1, NULL);
>>> - if ( rc )
>>> - return rc;
>>> + if ( !is_hwdom )
>>> + {
>>> + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>>> + pos + PCI_CAP_LIST_ID, 1, NULL);
>>> + if ( rc )
>>> + return rc;
>>> + }
>>>
>>> rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>>
>> For the hardware domain the write handler should be vpci_hw_write8
>> instead of NULL.
> OK, I think I need to add definition of vpci_hw_write8.
> But I have a question, if hardware domain write this register through vpci_hw_write8,
> then the "next address data" of hardware will be in consistent with vpci.
be in consistent with -> be inconsistent with
I am sorry.
> Is it fine? Or should I update vpci's cache?
>
>>
>> Thanks, Roger.
>
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 04/11] vpci/header: Emulate extended capability list for dom0
2025-05-06 14:14 ` Roger Pau Monné
@ 2025-05-07 3:32 ` Chen, Jiqian
2025-05-07 7:55 ` Roger Pau Monné
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 3:32 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/6 22:14, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:18:56PM +0800, Jiqian Chen wrote:
>> Add a new function to emulate extended capability list for dom0,
>> and call it in init_header(). So that it will be easy to hide a
>> extended capability whose initialization fails.
>>
>> As for the extended capability list of domU, just move the logic
>> into above function and keep hiding it for domU.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v2->v3 changes:
>> * In vpci_init_ext_capability_list(), when domain is domU, directly return after adding a handler(hiding all extended capability for domU).
>> * In vpci_init_ext_capability_list(), change condition to be "while ( pos >= 0x100U && ttl-- )" instead of "while ( pos && ttl-- )".
>> * Add new function vpci_hw_write32, and pass it to extended capability handler for dom0.
>>
>> v1->v2 changes:
>> new patch
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>> xen/drivers/vpci/header.c | 36 ++++++++++++++++++++++++++++--------
>> xen/drivers/vpci/vpci.c | 6 ++++++
>> xen/include/xen/vpci.h | 2 ++
>> 3 files changed, 36 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index c98cd211d9d7..ee94ad8e5037 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -817,6 +817,31 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>> PCI_STATUS_RSVDZ_MASK);
>> }
>>
>> +static int vpci_init_ext_capability_list(struct pci_dev *pdev)
>> +{
>> + unsigned int pos = PCI_CFG_SPACE_SIZE, ttl = 480;
>> +
>> + if ( !is_hardware_domain(pdev->domain) )
>> + /* Extended capabilities read as zero, write ignore for guest */
>> + return vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>> + pos, 4, (void *)0);
>> +
>> + while ( pos >= PCI_CFG_SPACE_SIZE && ttl-- )
>> + {
>> + uint32_t header = pci_conf_read32(pdev->sbdf, pos);
>> + int rc;
>
> I'm thinking it might be helpful to avoid setting the handler for the
> last capability on the list, or simply for devices that have no
> extended capabilities at all:
>
> if ( PCI_EXT_CAP_NEXT(header) >= PCI_CFG_SPACE_SIZE )
> {
> int rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
> pos, 4, (void *)(uintptr_t)header);
>
> if ( rc )
> return rc;
> }
But if adding this check, there is a problem, think about this situation:
a device only has one extended capability, then under your check, it does not add handler for it,
if the initialization of that extended capability fails, we can't hide it by removing handler from vpci.
If you want to avoid adding handler for devices that have no extended capabilities.
I think adding check
If ( header == 0 )
return 0;
is enough.
>
> Otherwise on systems with a lot of devices it can be quite wasteful to
> set a handler to just return the next capability as 0.
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-05-06 14:37 ` Roger Pau Monné
@ 2025-05-07 5:59 ` Chen, Jiqian
2025-05-07 8:04 ` Roger Pau Monné
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 5:59 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Andrew Cooper,
Anthony PERARD, Orzel, Michal, Jan Beulich, Julien Grall,
Stefano Stabellini, Chen, Jiqian
On 2025/5/6 22:37, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:18:57PM +0800, Jiqian Chen wrote:
>> Refactor REGISTER_VPCI_INIT to contain more capability specific
>> information, this is benifit for follow-on changes to hide capability
>> which initialization fails.
>>
>> What's more, change the definition of init_header() since it is
>> not a capability and it is needed for all devices' PCI config space.
>>
>> Note:
>> Call vpci_make_msix_hole() in the end of init_msix() since the
>> change of sequence of init_header() and init_msix().
>> The fini hook will be implemented in follow-on changes.
>
> I would maybe add that the cleanup hook is also added in this change,
> even if it's still unused. Further changes will make use of it.
Do you mean I need to add empty fini_x() function for MSI, MSIx, Rebar in this patch?
Or just need to add this sentence to the commit message?
>
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> cc: Anthony PERARD <anthony.perard@vates.tech>
>> cc: Michal Orzel <michal.orzel@amd.com>
>> cc: Jan Beulich <jbeulich@suse.com>
>> cc: Julien Grall <julien@xen.org>
>> cc: Stefano Stabellini <sstabellini@kernel.org>
>> ---
>> v2->v3 changes:
>> * This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
>> * Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
>> * Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
>>
>> v1->v2 changes:
>> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
>> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove failed capability from list.
>> * Called vpci_make_msix_hole() in the end of init_msix().
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>> xen/drivers/vpci/header.c | 3 +--
>> xen/drivers/vpci/msi.c | 2 +-
>> xen/drivers/vpci/msix.c | 8 +++++--
>> xen/drivers/vpci/rebar.c | 2 +-
>> xen/drivers/vpci/vpci.c | 48 +++++++++++++++++++++++++++++++--------
>> xen/include/xen/vpci.h | 28 ++++++++++++++++-------
>> xen/include/xen/xen.lds.h | 2 +-
>> 7 files changed, 68 insertions(+), 25 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index ee94ad8e5037..afe4bcdfcb30 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -842,7 +842,7 @@ static int vpci_init_ext_capability_list(struct pci_dev *pdev)
>> return 0;
>> }
>>
>> -static int cf_check init_header(struct pci_dev *pdev)
>> +int vpci_init_header(struct pci_dev *pdev)
>> {
>> uint16_t cmd;
>> uint64_t addr, size;
>> @@ -1038,7 +1038,6 @@ static int cf_check init_header(struct pci_dev *pdev)
>> pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>> return rc;
>> }
>> -REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
>>
>> /*
>> * Local variables:
>> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
>> index 66e5a8a116be..ea7dc0c060ea 100644
>> --- a/xen/drivers/vpci/msi.c
>> +++ b/xen/drivers/vpci/msi.c
>> @@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
>>
>> return 0;
>> }
>> -REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
>> +REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSI, init_msi, NULL);
>>
>> void vpci_dump_msi(void)
>> {
>> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
>> index 6bd8c55bb48e..0228ffd9fda9 100644
>> --- a/xen/drivers/vpci/msix.c
>> +++ b/xen/drivers/vpci/msix.c
>> @@ -751,9 +751,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
>> pdev->vpci->msix = msix;
>> list_add(&msix->next, &d->arch.hvm.msix_tables);
>>
>> - return 0;
>> + spin_lock(&pdev->vpci->lock);
>> + rc = vpci_make_msix_hole(pdev);
>> + spin_unlock(&pdev->vpci->lock);
>> +
>> + return rc;
>> }
>> -REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
>> +REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSIX, init_msix, NULL);
>>
>> /*
>> * Local variables:
>> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
>> index 793937449af7..026f8f7972d9 100644
>> --- a/xen/drivers/vpci/rebar.c
>> +++ b/xen/drivers/vpci/rebar.c
>> @@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
>>
>> return 0;
>> }
>> -REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
>> +REGISTER_VPCI_EXTENDED_CAP(PCI_EXT_CAP_ID_REBAR, init_rebar, NULL);
>>
>> /*
>> * Local variables:
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 3349b98389b8..5474b66668c1 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -36,8 +36,8 @@ struct vpci_register {
>> };
>>
>> #ifdef __XEN__
>> -extern vpci_register_init_t *const __start_vpci_array[];
>> -extern vpci_register_init_t *const __end_vpci_array[];
>> +extern vpci_capability_t *const __start_vpci_array[];
>> +extern vpci_capability_t *const __end_vpci_array[];
>> #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>>
>> #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> @@ -83,6 +83,36 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
>>
>> #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>>
>> +static int vpci_init_capabilities(struct pci_dev *pdev)
>> +{
>> + for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
>> + {
>> + const vpci_capability_t *capability = __start_vpci_array[i];
>> + const unsigned int cap = capability->id;
>> + const bool is_ext = capability->is_ext;
>> + unsigned int pos;
>> + int rc;
>> +
>> + if ( !is_hardware_domain(pdev->domain) && is_ext )
>> + continue;
>> +
>> + if ( !is_ext )
>> + pos = pci_find_cap_offset(pdev->sbdf, cap);
>> + else
>> + pos = pci_find_ext_capability(pdev->sbdf, cap);
>> +
>> + if ( !pos || !capability->init )
>
> Isn't it bogus to have a vpci_capability_t entry with a NULL init
> function?
Since I don't add fini_x() function for capabilities and also add check "if ( capability->fini )" below,
so I add this NULL check here.
I will remove it if you think it is unnecessary.
Should I also remove the NULL check for fini?
>
>> + continue;
>> +
>> + rc = capability->init(pdev);
>> +
>
> I wouldn't add a newline between the function call and checking the
> return value, but that's just my taste.
Will do.
>
>> + if ( rc )
>> + return rc;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> void vpci_deassign_device(struct pci_dev *pdev)
>> {
>> unsigned int i;
>> @@ -128,7 +158,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
>>
>> int vpci_assign_device(struct pci_dev *pdev)
>> {
>> - unsigned int i;
>> const unsigned long *ro_map;
>> int rc = 0;
>>
>> @@ -159,14 +188,13 @@ int vpci_assign_device(struct pci_dev *pdev)
>> goto out;
>> #endif
>>
>> - for ( i = 0; i < NUM_VPCI_INIT; i++ )
>> - {
>> - rc = __start_vpci_array[i](pdev);
>> - if ( rc )
>> - break;
>> - }
>> + rc = vpci_init_header(pdev);
>> + if ( rc )
>> + goto out;
>> +
>> + rc = vpci_init_capabilities(pdev);
>>
>> - out: __maybe_unused;
>> + out:
>> if ( rc )
>> vpci_deassign_device(pdev);
>>
>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>> index 9d47b8c1a50e..8e815b418b7d 100644
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
>> typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
>> uint32_t val, void *data);
>>
>> -typedef int vpci_register_init_t(struct pci_dev *dev);
>> -
>> -#define VPCI_PRIORITY_HIGH "1"
>> -#define VPCI_PRIORITY_MIDDLE "5"
>> -#define VPCI_PRIORITY_LOW "9"
>> +typedef struct {
>> + unsigned int id;
>> + bool is_ext;
>> + int (*init)(struct pci_dev *pdev);
>> + void (*fini)(struct pci_dev *pdev);
>> +} vpci_capability_t;
>>
>> #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12)
>>
>> @@ -29,9 +30,20 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>> */
>> #define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1)
>>
>> -#define REGISTER_VPCI_INIT(x, p) \
>> - static vpci_register_init_t *const x##_entry \
>> - __used_section(".data.vpci." p) = (x)
>> +#define REGISTER_VPCI_CAP(cap, x, y, ext) \
>
> x and y are not very helpful identifier names, better use some more
> descriptive naming, init and fini? Same below.
init and fini seems not good. They are conflict with the hook name of below vpci_capability_t.
Maybe init_func and fini_func ?
>
>> + static vpci_capability_t x##_t = { \
>> + .id = (cap), \
>> + .init = (x), \
>> + .fini = (y), \
>> + .is_ext = (ext), \
>> + }; \
>> + static vpci_capability_t *const x##_entry \
>> + __used_section(".data.vpci.") = &(x##_t)
>> +
>> +#define REGISTER_VPCI_LEGACY_CAP(cap, x, y) REGISTER_VPCI_CAP(cap, x, y, false)
>> +#define REGISTER_VPCI_EXTENDED_CAP(cap, x, y) REGISTER_VPCI_CAP(cap, x, y, true)
>> +
>> +int __must_check vpci_init_header(struct pci_dev *pdev);
>>
>> /* Assign vPCI to device by adding handlers. */
>> int __must_check vpci_assign_device(struct pci_dev *pdev);
>> diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
>> index 16a9b1ba03db..c73222112dd3 100644
>> --- a/xen/include/xen/xen.lds.h
>> +++ b/xen/include/xen/xen.lds.h
>> @@ -187,7 +187,7 @@
>> #define VPCI_ARRAY \
>> . = ALIGN(POINTER_ALIGN); \
>> __start_vpci_array = .; \
>> - *(SORT(.data.vpci.*)) \
>> + *(.data.vpci.*) \
>
> Aside from Jan comment, please keep the '\' aligned with the others on
> the block.
Will do.
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize
2025-05-06 16:00 ` Roger Pau Monné
@ 2025-05-07 6:38 ` Chen, Jiqian
2025-05-07 8:07 ` Roger Pau Monné
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 6:38 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/7 00:00, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:18:58PM +0800, Jiqian Chen wrote:
>> When vpci fails to initialize a legacy capability of device, it just
>> return error instead of catching and processing exception. That makes
>> the entire device unusable.
>
> I think "catching and processing exception" is a weird terminology to
> use when writing C. It's IMo more accurate to use:
>
> "When vpci fails to initialize a legacy capability of device, it just
> returns an error and vPCI gets disabled for the whole device. That
> most likely renders the device unusable, plus possibly causing issues
> to Xen itself if guest attempts to program the native MSI or MSI-X
> capabilities if present."
Thanks, will change.
>
>> So, add new a function to hide legacy capability when initialization
>> fails. And remove the failed legacy capability from the vpci emulated
>> legacy capability list.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v2->v3 changes:
>> * Separated from the last version patch "vpci: Hide capability when it fails to initialize"
>> * Whole implementation changed because last version is wrong.
>> This version adds a new helper function vpci_get_register() and uses it to get
>> target handler and previous handler from vpci->handlers, then remove the target.
>>
>> v1->v2 changes:
>> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
>> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
>> remove failed capability from list.
>> * Called vpci_make_msix_hole() in the end of init_msix().
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>> xen/drivers/vpci/vpci.c | 133 +++++++++++++++++++++++++++++++++-------
>> 1 file changed, 112 insertions(+), 21 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 5474b66668c1..f97c7cc460a0 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -35,6 +35,22 @@ struct vpci_register {
>> uint32_t rsvdz_mask;
>> };
>>
>> +static int vpci_register_cmp(const struct vpci_register *r1,
>> + const struct vpci_register *r2)
>> +{
>> + /* Return 0 if registers overlap. */
>> + if ( r1->offset < r2->offset + r2->size &&
>> + r2->offset < r1->offset + r1->size )
>> + return 0;
>> + if ( r1->offset < r2->offset )
>> + return -1;
>> + if ( r1->offset > r2->offset )
>> + return 1;
>> +
>> + ASSERT_UNREACHABLE();
>> + return 0;
>> +}
>> +
>> #ifdef __XEN__
>> extern vpci_capability_t *const __start_vpci_array[];
>> extern vpci_capability_t *const __end_vpci_array[];
>> @@ -83,7 +99,91 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
>>
>> #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>>
>> -static int vpci_init_capabilities(struct pci_dev *pdev)
>> +static struct vpci_register *vpci_get_register(struct vpci *vpci,
>> + const unsigned int offset,
>> + const unsigned int size)
>
> We don't usually use const attributes for scalar function parameters.
>
>> +{
>> + const struct vpci_register r = { .offset = offset, .size = size };
>> + struct vpci_register *rm;
>> +
>> + ASSERT(spin_is_locked(&vpci->lock));
>> + list_for_each_entry ( rm, &vpci->handlers, node )
>> + {
>> + int cmp = vpci_register_cmp(&r, rm);
>> +
>> + if ( !cmp && rm->offset == offset && rm->size == size )
>> + return rm;
>> + if ( cmp <= 0 )
>> + break;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static struct vpci_register *vpci_get_previous_cap_register
>> + (struct vpci *vpci, const unsigned int offset)
>
> The style preference here would be:
>
> static struct vpci_register *vpci_get_previous_cap_register(
> struct vpci *vpci, unsigned int offset)
> {
> ...
>
>> +{
>> + uint32_t next;
>> + struct vpci_register *r;
>> +
>> + if ( offset < 0x40 )
>
> I would possibly add an ASSERT_UNREACHABLE() here, as attempting to
> pass an offset below 0x40 is a sign of a bug elsewhere?
Probably yes, I will add an ASSERT_UNREACHABLE() here.
>
>> + return NULL;
>> +
>> + r = vpci_get_register(vpci, PCI_CAPABILITY_LIST, 1);
>> + ASSERT(r);
>> +
>> + next = (uint32_t)(uintptr_t)r->private;
>> + while ( next >= 0x40 && next != offset )
>> + {
>> + r = vpci_get_register(vpci, next + PCI_CAP_LIST_NEXT, 1);
>> + ASSERT(r);
>> + next = (uint32_t)(uintptr_t)r->private;
>> + }
>> +
>> + if ( next < 0x40 )
>> + return NULL;
>> +
>> + return r;
>> +}
>> +
>> +static void vpci_capability_mask(struct pci_dev *pdev,
>
> This possibly needs to return an error code, as it can fail, and just
> adding ASSERTs all around seems a bit clumsy, plus we might really
> want to prevent assigning the device to the domain if
> vpci_capability_mask() fails for whatever reason.
Make sense. Will change to return error instead of ASSERT.
>
>> + const unsigned int cap)
>> +{
>> + const unsigned int offset = pci_find_cap_offset(pdev->sbdf, cap);
>> + struct vpci_register *prev_next_r, *next_r;
>> + struct vpci *vpci = pdev->vpci;
>> +
>> + spin_lock(&vpci->lock);
>> + next_r = vpci_get_register(vpci, offset + PCI_CAP_LIST_NEXT, 1);
>> + if ( !next_r )
>> + {
>> + spin_unlock(&vpci->lock);
>> + return;
>> + }
>> +
>> + prev_next_r = vpci_get_previous_cap_register(vpci, offset);
>> + ASSERT(prev_next_r);
>> +
>> + prev_next_r->private = next_r->private;
>> +
>> + if ( !is_hardware_domain(pdev->domain) )
>> + {
>> + struct vpci_register *id_r =
>> + vpci_get_register(vpci, offset + PCI_CAP_LIST_ID, 1);
>> +
>> + ASSERT(id_r);
>> + /* PCI_CAP_LIST_ID register of target capability */
>> + list_del(&id_r->node);
>> + xfree(id_r);
>
> You could use vpci_remove_register() here?
Right.
>
>> + }
>> +
>> + /* PCI_CAP_LIST_NEXT register of target capability */
>> + list_del(&next_r->node);
>> + spin_unlock(&vpci->lock);
>> + xfree(next_r);
>
> Here vpci_remove_register() could also be used, but it will involve
> searching again for the register to remove, which is a bit pointless.
Yes, so just keeping things here instead of calling vpci_remove_register()?
>
>> +}
>> +
>> +static void vpci_init_capabilities(struct pci_dev *pdev)
>> {
>> for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
>> {
>> @@ -107,10 +207,17 @@ static int vpci_init_capabilities(struct pci_dev *pdev)
>> rc = capability->init(pdev);
>>
>> if ( rc )
>> - return rc;
>> + {
>> + if ( capability->fini )
>> + capability->fini(pdev);
>> +
>> + printk(XENLOG_WARNING "%pd %pp: %s cap %u init fail rc=%d, mask it\n",
>
> Best to split to next line:
>
> printk(XENLOG_WARNING
> "%pd %pp: %s cap %u init fail rc=%d, mask it\n",
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-05-06 16:21 ` Roger Pau Monné
@ 2025-05-07 7:26 ` Chen, Jiqian
2025-05-07 8:09 ` Roger Pau Monné
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 7:26 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/7 00:21, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:18:59PM +0800, Jiqian Chen wrote:
>> When vpci fails to initialize a extended capability of device for dom0,
>> it just return error instead of catching and processing exception. That
>> makes the entire device unusable.
>>
>> So, add new a function to hide extended capability when initialization
>> fails. And remove the failed extended capability handler from vpci
>> extended capability list.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v2->v3 changes:
>> * Separated from the last version patch "vpci: Hide capability when it fails to initialize".
>> * Whole implementation changed because last version is wrong.
>> This version gets target handler and previous handler from vpci->handlers, then remove the target.
>> * Note: a case in function vpci_ext_capability_mask() needs to be discussed,
>> because it may change the offset of next capability when the offset of target
>> capability is 0x100U(the first extended capability), my implementation is just to
>> ignore and let hardware to handle the target capability.
>>
>> v1->v2 changes:
>> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
>> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
>> remove failed capability from list.
>> * Called vpci_make_msix_hole() in the end of init_msix().
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>> xen/drivers/vpci/vpci.c | 79 ++++++++++++++++++++++++++++++++++++++
>> xen/include/xen/pci_regs.h | 1 +
>> 2 files changed, 80 insertions(+)
>>
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index f97c7cc460a0..8ff5169bdd18 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -183,6 +183,83 @@ static void vpci_capability_mask(struct pci_dev *pdev,
>> xfree(next_r);
>> }
>>
>> +static struct vpci_register *vpci_get_previous_ext_cap_register
>> + (struct vpci *vpci, const unsigned int offset)
>> +{
>> + uint32_t header;
>> + unsigned int pos = PCI_CFG_SPACE_SIZE;
>> + struct vpci_register *r;
>> +
>> + if ( offset <= PCI_CFG_SPACE_SIZE )
>> + return NULL;
>> +
>> + r = vpci_get_register(vpci, pos, 4);
>> + ASSERT(r);
>> +
>> + header = (uint32_t)(uintptr_t)r->private;
>> + pos = PCI_EXT_CAP_NEXT(header);
>> + while ( pos > PCI_CFG_SPACE_SIZE && pos != offset )
>> + {
>> + r = vpci_get_register(vpci, pos, 4);
>> + ASSERT(r);
>> + header = (uint32_t)(uintptr_t)r->private;
>> + pos = PCI_EXT_CAP_NEXT(header);
>> + }
>> +
>> + if ( pos <= PCI_CFG_SPACE_SIZE )
>> + return NULL;
>> +
>> + return r;
>> +}
>> +
>> +static void vpci_ext_capability_mask(struct pci_dev *pdev,
>> + const unsigned int cap)
>> +{
>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
>> + struct vpci_register *rm, *prev_r;
>> + struct vpci *vpci = pdev->vpci;
>> + uint32_t header, pre_header;
>
> Maybe sanity check that offset is correct?
What do you mean sanity check?
Do I need to add something?
>
>> + spin_lock(&vpci->lock);
>> + rm = vpci_get_register(vpci, offset, 4);
>> + if ( !rm )
>> + {
>> + spin_unlock(&vpci->lock);
>> + return;
>> + }
>> +
>> + header = (uint32_t)(uintptr_t)rm->private;
>> + if ( offset == PCI_CFG_SPACE_SIZE )
>> + {
>> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
>> + rm->private = (void *)(uintptr_t)0;
>> + else
>> + /*
>> + * Else case needs to remove the capability in position 0x100U and
>> + * moves the next capability to be in position 0x100U, that would
>> + * cause the offset of next capability in vpci different from the
>> + * hardware, then cause error accesses, so just ignore it here and
>> + * hope hardware would handle the capability well.
>> + */
>> + printk(XENLOG_ERR "%pd %pp: ext cap %u is first cap, can't mask it\n",
>> + pdev->domain, &pdev->sbdf, cap);
>
> In this case, could you maybe replace just the capability ID part of
> the header to return 0? That will likely cause the OS to continue
> scanning the list, while ID 0 won't match which any known
> capability.
OK, will do in next version.
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-05-07 2:46 ` Chen, Jiqian
2025-05-07 2:50 ` Chen, Jiqian
@ 2025-05-07 7:49 ` Roger Pau Monné
2025-05-07 8:13 ` Chen, Jiqian
1 sibling, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-07 7:49 UTC (permalink / raw)
To: Chen, Jiqian; +Cc: xen-devel@lists.xenproject.org, Huang, Ray
On Wed, May 07, 2025 at 02:46:52AM +0000, Chen, Jiqian wrote:
> On 2025/5/6 21:50, Roger Pau Monné wrote:
> > On Mon, Apr 21, 2025 at 02:18:55PM +0800, Jiqian Chen wrote:
> >> Current logic of emulating legacy capability list is only for domU.
> >> So, expand it to emulate for dom0 too. Then it will be easy to hide
> >> a capability whose initialization fails in a function.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >
> > Sorry, one nit I've noticed while looking at the next patch.
> >
> >> @@ -786,13 +787,15 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
> >>
> >> next = pci_find_next_cap_ttl(pdev->sbdf,
> >> pos + PCI_CAP_LIST_NEXT,
> >> - supported_caps,
> >> - ARRAY_SIZE(supported_caps), &ttl);
> >> + caps, n, &ttl);
> >>
> >> - rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> The same here, NULL -> vpci_hw_write8, I think.
No, not here, since the PCI_CAP_LIST_ID handler is only added for
non-hardware domains, and in that case we do want to ignore writes to
the register.
> >> - pos + PCI_CAP_LIST_ID, 1, NULL);
> >> - if ( rc )
> >> - return rc;
> >> + if ( !is_hwdom )
> >> + {
> >> + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> >> + pos + PCI_CAP_LIST_ID, 1, NULL);
> >> + if ( rc )
> >> + return rc;
> >> + }
> >>
> >> rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> >
> > For the hardware domain the write handler should be vpci_hw_write8
> > instead of NULL.
> OK, I think I need to add definition of vpci_hw_write8.
> But I have a question, if hardware domain write this register through vpci_hw_write8,
> then the "next address data" of hardware will be in consistent with vpci.
> Is it fine? Or should I update vpci's cache?
According to the spec this field is read-only, so writes should be
ignored. We allow hardware domain full access because for hardware
domain we aim to trap as little as possible to not diverge behavior
from native, and to allow possible device quirks to work.
It could be conceivable that some vendor has a hidden specific
functionality that somehow triggered by a write to this field.
Regards, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 04/11] vpci/header: Emulate extended capability list for dom0
2025-05-07 3:32 ` Chen, Jiqian
@ 2025-05-07 7:55 ` Roger Pau Monné
0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-07 7:55 UTC (permalink / raw)
To: Chen, Jiqian; +Cc: xen-devel@lists.xenproject.org, Huang, Ray
On Wed, May 07, 2025 at 03:32:47AM +0000, Chen, Jiqian wrote:
> On 2025/5/6 22:14, Roger Pau Monné wrote:
> > On Mon, Apr 21, 2025 at 02:18:56PM +0800, Jiqian Chen wrote:
> >> Add a new function to emulate extended capability list for dom0,
> >> and call it in init_header(). So that it will be easy to hide a
> >> extended capability whose initialization fails.
> >>
> >> As for the extended capability list of domU, just move the logic
> >> into above function and keep hiding it for domU.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> ---
> >> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> >> ---
> >> v2->v3 changes:
> >> * In vpci_init_ext_capability_list(), when domain is domU, directly return after adding a handler(hiding all extended capability for domU).
> >> * In vpci_init_ext_capability_list(), change condition to be "while ( pos >= 0x100U && ttl-- )" instead of "while ( pos && ttl-- )".
> >> * Add new function vpci_hw_write32, and pass it to extended capability handler for dom0.
> >>
> >> v1->v2 changes:
> >> new patch
> >>
> >> Best regards,
> >> Jiqian Chen.
> >> ---
> >> xen/drivers/vpci/header.c | 36 ++++++++++++++++++++++++++++--------
> >> xen/drivers/vpci/vpci.c | 6 ++++++
> >> xen/include/xen/vpci.h | 2 ++
> >> 3 files changed, 36 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >> index c98cd211d9d7..ee94ad8e5037 100644
> >> --- a/xen/drivers/vpci/header.c
> >> +++ b/xen/drivers/vpci/header.c
> >> @@ -817,6 +817,31 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
> >> PCI_STATUS_RSVDZ_MASK);
> >> }
> >>
> >> +static int vpci_init_ext_capability_list(struct pci_dev *pdev)
> >> +{
> >> + unsigned int pos = PCI_CFG_SPACE_SIZE, ttl = 480;
> >> +
> >> + if ( !is_hardware_domain(pdev->domain) )
> >> + /* Extended capabilities read as zero, write ignore for guest */
> >> + return vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> >> + pos, 4, (void *)0);
> >> +
> >> + while ( pos >= PCI_CFG_SPACE_SIZE && ttl-- )
> >> + {
> >> + uint32_t header = pci_conf_read32(pdev->sbdf, pos);
> >> + int rc;
> >
> > I'm thinking it might be helpful to avoid setting the handler for the
> > last capability on the list, or simply for devices that have no
> > extended capabilities at all:
> >
> > if ( PCI_EXT_CAP_NEXT(header) >= PCI_CFG_SPACE_SIZE )
> > {
> > int rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
> > pos, 4, (void *)(uintptr_t)header);
> >
> > if ( rc )
> > return rc;
> > }
> But if adding this check, there is a problem, think about this situation:
> a device only has one extended capability, then under your check, it does not add handler for it,
> if the initialization of that extended capability fails, we can't hide it by removing handler from vpci.
> If you want to avoid adding handler for devices that have no extended capabilities.
> I think adding check
> If ( header == 0 )
> return 0;
> is enough.
Hm, yes, extended PCI capabilities don't have a start pointer like
legacy ones, so masking the initial capability (as you have discovered)
is not easy. I agree with checking whether the initial header == 0
and then not adding any handler at all.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-05-07 5:59 ` Chen, Jiqian
@ 2025-05-07 8:04 ` Roger Pau Monné
2025-05-07 8:23 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-07 8:04 UTC (permalink / raw)
To: Chen, Jiqian
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Andrew Cooper,
Anthony PERARD, Orzel, Michal, Jan Beulich, Julien Grall,
Stefano Stabellini
On Wed, May 07, 2025 at 05:59:52AM +0000, Chen, Jiqian wrote:
> On 2025/5/6 22:37, Roger Pau Monné wrote:
> > On Mon, Apr 21, 2025 at 02:18:57PM +0800, Jiqian Chen wrote:
> >> Refactor REGISTER_VPCI_INIT to contain more capability specific
> >> information, this is benifit for follow-on changes to hide capability
> >> which initialization fails.
> >>
> >> What's more, change the definition of init_header() since it is
> >> not a capability and it is needed for all devices' PCI config space.
> >>
> >> Note:
> >> Call vpci_make_msix_hole() in the end of init_msix() since the
> >> change of sequence of init_header() and init_msix().
> >> The fini hook will be implemented in follow-on changes.
> >
> > I would maybe add that the cleanup hook is also added in this change,
> > even if it's still unused. Further changes will make use of it.
> Do you mean I need to add empty fini_x() function for MSI, MSIx, Rebar in this patch?
> Or just need to add this sentence to the commit message?
Oh, no, sorry if it wasn't clear, I meant just adding the sentence to
the commit message.
> >
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> ---
> >> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> >> cc: Andrew Cooper <andrew.cooper3@citrix.com>
> >> cc: Anthony PERARD <anthony.perard@vates.tech>
> >> cc: Michal Orzel <michal.orzel@amd.com>
> >> cc: Jan Beulich <jbeulich@suse.com>
> >> cc: Julien Grall <julien@xen.org>
> >> cc: Stefano Stabellini <sstabellini@kernel.org>
> >> ---
> >> v2->v3 changes:
> >> * This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
> >> * Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
> >> * Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
> >>
> >> v1->v2 changes:
> >> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> >> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove failed capability from list.
> >> * Called vpci_make_msix_hole() in the end of init_msix().
> >>
> >> Best regards,
> >> Jiqian Chen.
> >> ---
> >> xen/drivers/vpci/header.c | 3 +--
> >> xen/drivers/vpci/msi.c | 2 +-
> >> xen/drivers/vpci/msix.c | 8 +++++--
> >> xen/drivers/vpci/rebar.c | 2 +-
> >> xen/drivers/vpci/vpci.c | 48 +++++++++++++++++++++++++++++++--------
> >> xen/include/xen/vpci.h | 28 ++++++++++++++++-------
> >> xen/include/xen/xen.lds.h | 2 +-
> >> 7 files changed, 68 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >> index ee94ad8e5037..afe4bcdfcb30 100644
> >> --- a/xen/drivers/vpci/header.c
> >> +++ b/xen/drivers/vpci/header.c
> >> @@ -842,7 +842,7 @@ static int vpci_init_ext_capability_list(struct pci_dev *pdev)
> >> return 0;
> >> }
> >>
> >> -static int cf_check init_header(struct pci_dev *pdev)
> >> +int vpci_init_header(struct pci_dev *pdev)
> >> {
> >> uint16_t cmd;
> >> uint64_t addr, size;
> >> @@ -1038,7 +1038,6 @@ static int cf_check init_header(struct pci_dev *pdev)
> >> pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
> >> return rc;
> >> }
> >> -REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
> >>
> >> /*
> >> * Local variables:
> >> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> >> index 66e5a8a116be..ea7dc0c060ea 100644
> >> --- a/xen/drivers/vpci/msi.c
> >> +++ b/xen/drivers/vpci/msi.c
> >> @@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
> >>
> >> return 0;
> >> }
> >> -REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
> >> +REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSI, init_msi, NULL);
> >>
> >> void vpci_dump_msi(void)
> >> {
> >> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> >> index 6bd8c55bb48e..0228ffd9fda9 100644
> >> --- a/xen/drivers/vpci/msix.c
> >> +++ b/xen/drivers/vpci/msix.c
> >> @@ -751,9 +751,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
> >> pdev->vpci->msix = msix;
> >> list_add(&msix->next, &d->arch.hvm.msix_tables);
> >>
> >> - return 0;
> >> + spin_lock(&pdev->vpci->lock);
> >> + rc = vpci_make_msix_hole(pdev);
> >> + spin_unlock(&pdev->vpci->lock);
> >> +
> >> + return rc;
> >> }
> >> -REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
> >> +REGISTER_VPCI_LEGACY_CAP(PCI_CAP_ID_MSIX, init_msix, NULL);
> >>
> >> /*
> >> * Local variables:
> >> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
> >> index 793937449af7..026f8f7972d9 100644
> >> --- a/xen/drivers/vpci/rebar.c
> >> +++ b/xen/drivers/vpci/rebar.c
> >> @@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
> >>
> >> return 0;
> >> }
> >> -REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
> >> +REGISTER_VPCI_EXTENDED_CAP(PCI_EXT_CAP_ID_REBAR, init_rebar, NULL);
> >>
> >> /*
> >> * Local variables:
> >> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> >> index 3349b98389b8..5474b66668c1 100644
> >> --- a/xen/drivers/vpci/vpci.c
> >> +++ b/xen/drivers/vpci/vpci.c
> >> @@ -36,8 +36,8 @@ struct vpci_register {
> >> };
> >>
> >> #ifdef __XEN__
> >> -extern vpci_register_init_t *const __start_vpci_array[];
> >> -extern vpci_register_init_t *const __end_vpci_array[];
> >> +extern vpci_capability_t *const __start_vpci_array[];
> >> +extern vpci_capability_t *const __end_vpci_array[];
> >> #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
> >>
> >> #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> >> @@ -83,6 +83,36 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
> >>
> >> #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
> >>
> >> +static int vpci_init_capabilities(struct pci_dev *pdev)
> >> +{
> >> + for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
> >> + {
> >> + const vpci_capability_t *capability = __start_vpci_array[i];
> >> + const unsigned int cap = capability->id;
> >> + const bool is_ext = capability->is_ext;
> >> + unsigned int pos;
> >> + int rc;
> >> +
> >> + if ( !is_hardware_domain(pdev->domain) && is_ext )
> >> + continue;
> >> +
> >> + if ( !is_ext )
> >> + pos = pci_find_cap_offset(pdev->sbdf, cap);
> >> + else
> >> + pos = pci_find_ext_capability(pdev->sbdf, cap);
> >> +
> >> + if ( !pos || !capability->init )
> >
> > Isn't it bogus to have a vpci_capability_t entry with a NULL init
> > function?
> Since I don't add fini_x() function for capabilities and also add check "if ( capability->fini )" below,
> so I add this NULL check here.
> I will remove it if you think it is unnecessary.
> Should I also remove the NULL check for fini?
I think `fini` is fine to be NULL, but I don't see a case for `init`
being NULL?
Maybe I'm missing some use-case, but I expect capabilities will always
need some kind of initialization (iow: setting up handlers) otherwise
it's just a no-op.
> >> + if ( rc )
> >> + return rc;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> void vpci_deassign_device(struct pci_dev *pdev)
> >> {
> >> unsigned int i;
> >> @@ -128,7 +158,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
> >>
> >> int vpci_assign_device(struct pci_dev *pdev)
> >> {
> >> - unsigned int i;
> >> const unsigned long *ro_map;
> >> int rc = 0;
> >>
> >> @@ -159,14 +188,13 @@ int vpci_assign_device(struct pci_dev *pdev)
> >> goto out;
> >> #endif
> >>
> >> - for ( i = 0; i < NUM_VPCI_INIT; i++ )
> >> - {
> >> - rc = __start_vpci_array[i](pdev);
> >> - if ( rc )
> >> - break;
> >> - }
> >> + rc = vpci_init_header(pdev);
> >> + if ( rc )
> >> + goto out;
> >> +
> >> + rc = vpci_init_capabilities(pdev);
> >>
> >> - out: __maybe_unused;
> >> + out:
> >> if ( rc )
> >> vpci_deassign_device(pdev);
> >>
> >> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> >> index 9d47b8c1a50e..8e815b418b7d 100644
> >> --- a/xen/include/xen/vpci.h
> >> +++ b/xen/include/xen/vpci.h
> >> @@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
> >> typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
> >> uint32_t val, void *data);
> >>
> >> -typedef int vpci_register_init_t(struct pci_dev *dev);
> >> -
> >> -#define VPCI_PRIORITY_HIGH "1"
> >> -#define VPCI_PRIORITY_MIDDLE "5"
> >> -#define VPCI_PRIORITY_LOW "9"
> >> +typedef struct {
> >> + unsigned int id;
> >> + bool is_ext;
> >> + int (*init)(struct pci_dev *pdev);
> >> + void (*fini)(struct pci_dev *pdev);
> >> +} vpci_capability_t;
> >>
> >> #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12)
> >>
> >> @@ -29,9 +30,20 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
> >> */
> >> #define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1)
> >>
> >> -#define REGISTER_VPCI_INIT(x, p) \
> >> - static vpci_register_init_t *const x##_entry \
> >> - __used_section(".data.vpci." p) = (x)
> >> +#define REGISTER_VPCI_CAP(cap, x, y, ext) \
> >
> > x and y are not very helpful identifier names, better use some more
> > descriptive naming, init and fini? Same below.
> init and fini seems not good. They are conflict with the hook name of below vpci_capability_t.
> Maybe init_func and fini_func ?
Oh, I see. Can I recommend to name fields init and destroy or cleanup
(instead of fini), and then use finit and fdestroy/fclean as macro
parameters?
I don't think it's common in Xen to name cleanup functions 'fini'. I
understand this is a question of taste, it's mostly for coherence with
the rest of the code base.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize
2025-05-07 6:38 ` Chen, Jiqian
@ 2025-05-07 8:07 ` Roger Pau Monné
0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-07 8:07 UTC (permalink / raw)
To: Chen, Jiqian; +Cc: xen-devel@lists.xenproject.org, Huang, Ray
On Wed, May 07, 2025 at 06:38:45AM +0000, Chen, Jiqian wrote:
> On 2025/5/7 00:00, Roger Pau Monné wrote:
> > On Mon, Apr 21, 2025 at 02:18:58PM +0800, Jiqian Chen wrote:
> >> + }
> >> +
> >> + /* PCI_CAP_LIST_NEXT register of target capability */
> >> + list_del(&next_r->node);
> >> + spin_unlock(&vpci->lock);
> >> + xfree(next_r);
> >
> > Here vpci_remove_register() could also be used, but it will involve
> > searching again for the register to remove, which is a bit pointless.
> Yes, so just keeping things here instead of calling vpci_remove_register()?
I would add a small comment that we avoid calling
vpci_remove_register() to not have to redo the register search.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-05-07 7:26 ` Chen, Jiqian
@ 2025-05-07 8:09 ` Roger Pau Monné
2025-05-07 8:49 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-07 8:09 UTC (permalink / raw)
To: Chen, Jiqian; +Cc: xen-devel@lists.xenproject.org, Huang, Ray
On Wed, May 07, 2025 at 07:26:21AM +0000, Chen, Jiqian wrote:
> On 2025/5/7 00:21, Roger Pau Monné wrote:
> > On Mon, Apr 21, 2025 at 02:18:59PM +0800, Jiqian Chen wrote:
> >> When vpci fails to initialize a extended capability of device for dom0,
> >> it just return error instead of catching and processing exception. That
> >> makes the entire device unusable.
> >>
> >> So, add new a function to hide extended capability when initialization
> >> fails. And remove the failed extended capability handler from vpci
> >> extended capability list.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> ---
> >> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> >> ---
> >> v2->v3 changes:
> >> * Separated from the last version patch "vpci: Hide capability when it fails to initialize".
> >> * Whole implementation changed because last version is wrong.
> >> This version gets target handler and previous handler from vpci->handlers, then remove the target.
> >> * Note: a case in function vpci_ext_capability_mask() needs to be discussed,
> >> because it may change the offset of next capability when the offset of target
> >> capability is 0x100U(the first extended capability), my implementation is just to
> >> ignore and let hardware to handle the target capability.
> >>
> >> v1->v2 changes:
> >> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> >> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
> >> remove failed capability from list.
> >> * Called vpci_make_msix_hole() in the end of init_msix().
> >>
> >> Best regards,
> >> Jiqian Chen.
> >> ---
> >> xen/drivers/vpci/vpci.c | 79 ++++++++++++++++++++++++++++++++++++++
> >> xen/include/xen/pci_regs.h | 1 +
> >> 2 files changed, 80 insertions(+)
> >>
> >> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> >> index f97c7cc460a0..8ff5169bdd18 100644
> >> --- a/xen/drivers/vpci/vpci.c
> >> +++ b/xen/drivers/vpci/vpci.c
> >> @@ -183,6 +183,83 @@ static void vpci_capability_mask(struct pci_dev *pdev,
> >> xfree(next_r);
> >> }
> >>
> >> +static struct vpci_register *vpci_get_previous_ext_cap_register
> >> + (struct vpci *vpci, const unsigned int offset)
> >> +{
> >> + uint32_t header;
> >> + unsigned int pos = PCI_CFG_SPACE_SIZE;
> >> + struct vpci_register *r;
> >> +
> >> + if ( offset <= PCI_CFG_SPACE_SIZE )
> >> + return NULL;
> >> +
> >> + r = vpci_get_register(vpci, pos, 4);
> >> + ASSERT(r);
> >> +
> >> + header = (uint32_t)(uintptr_t)r->private;
> >> + pos = PCI_EXT_CAP_NEXT(header);
> >> + while ( pos > PCI_CFG_SPACE_SIZE && pos != offset )
> >> + {
> >> + r = vpci_get_register(vpci, pos, 4);
> >> + ASSERT(r);
> >> + header = (uint32_t)(uintptr_t)r->private;
> >> + pos = PCI_EXT_CAP_NEXT(header);
> >> + }
> >> +
> >> + if ( pos <= PCI_CFG_SPACE_SIZE )
> >> + return NULL;
> >> +
> >> + return r;
> >> +}
> >> +
> >> +static void vpci_ext_capability_mask(struct pci_dev *pdev,
> >> + const unsigned int cap)
> >> +{
> >> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
> >> + struct vpci_register *rm, *prev_r;
> >> + struct vpci *vpci = pdev->vpci;
> >> + uint32_t header, pre_header;
> >
> > Maybe sanity check that offset is correct?
> What do you mean sanity check?
> Do I need to add something?
I would probably do something like:
if ( !offset )
{
ASSERT_UNREACHABLE();
return;
}
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0
2025-05-07 7:49 ` Roger Pau Monné
@ 2025-05-07 8:13 ` Chen, Jiqian
0 siblings, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 8:13 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/7 15:49, Roger Pau Monné wrote:
> On Wed, May 07, 2025 at 02:46:52AM +0000, Chen, Jiqian wrote:
>> On 2025/5/6 21:50, Roger Pau Monné wrote:
>>> On Mon, Apr 21, 2025 at 02:18:55PM +0800, Jiqian Chen wrote:
>>>> Current logic of emulating legacy capability list is only for domU.
>>>> So, expand it to emulate for dom0 too. Then it will be easy to hide
>>>> a capability whose initialization fails in a function.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>
>>> Sorry, one nit I've noticed while looking at the next patch.
>>>
>>>> @@ -786,13 +787,15 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>>>>
>>>> next = pci_find_next_cap_ttl(pdev->sbdf,
>>>> pos + PCI_CAP_LIST_NEXT,
>>>> - supported_caps,
>>>> - ARRAY_SIZE(supported_caps), &ttl);
>>>> + caps, n, &ttl);
>>>>
>>>> - rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>> The same here, NULL -> vpci_hw_write8, I think.
>
> No, not here, since the PCI_CAP_LIST_ID handler is only added for
> non-hardware domains, and in that case we do want to ignore writes to
> the register.
Oh, I write the wrong place. I mean the codes to add handler for PCI_CAPABILITY_LIST when hardware domain.
>
>>>> - pos + PCI_CAP_LIST_ID, 1, NULL);
>>>> - if ( rc )
>>>> - return rc;
>>>> + if ( !is_hwdom )
>>>> + {
>>>> + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>>>> + pos + PCI_CAP_LIST_ID, 1, NULL);
>>>> + if ( rc )
>>>> + return rc;
>>>> + }
>>>>
>>>> rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>>>
>>> For the hardware domain the write handler should be vpci_hw_write8
>>> instead of NULL.
>> OK, I think I need to add definition of vpci_hw_write8.
>> But I have a question, if hardware domain write this register through vpci_hw_write8,
>> then the "next address data" of hardware will be in consistent with vpci.
>> Is it fine? Or should I update vpci's cache?
>
> According to the spec this field is read-only, so writes should be
> ignored. We allow hardware domain full access because for hardware
> domain we aim to trap as little as possible to not diverge behavior
> from native, and to allow possible device quirks to work.
>
> It could be conceivable that some vendor has a hidden specific
> functionality that somehow triggered by a write to this field.
Got it.
>
> Regards, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT
2025-05-07 8:04 ` Roger Pau Monné
@ 2025-05-07 8:23 ` Chen, Jiqian
0 siblings, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 8:23 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Andrew Cooper,
Anthony PERARD, Orzel, Michal, Jan Beulich, Julien Grall,
Stefano Stabellini, Chen, Jiqian
On 2025/5/7 16:04, Roger Pau Monné wrote:
> On Wed, May 07, 2025 at 05:59:52AM +0000, Chen, Jiqian wrote:
>> On 2025/5/6 22:37, Roger Pau Monné wrote:
>>> On Mon, Apr 21, 2025 at 02:18:57PM +0800, Jiqian Chen wrote:
>>>>
>>>> + if ( !is_ext )
>>>> + pos = pci_find_cap_offset(pdev->sbdf, cap);
>>>> + else
>>>> + pos = pci_find_ext_capability(pdev->sbdf, cap);
>>>> +
>>>> + if ( !pos || !capability->init )
>>>
>>> Isn't it bogus to have a vpci_capability_t entry with a NULL init
>>> function?
>> Since I don't add fini_x() function for capabilities and also add check "if ( capability->fini )" below,
>> so I add this NULL check here.
>> I will remove it if you think it is unnecessary.
>> Should I also remove the NULL check for fini?
>
> I think `fini` is fine to be NULL, but I don't see a case for `init`
> being NULL?
>
> Maybe I'm missing some use-case, but I expect capabilities will always
> need some kind of initialization (iow: setting up handlers) otherwise
> it's just a no-op.
Got it. I will just remove the check of init.
>
>>>> + if ( rc )
>>>> + return rc;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> void vpci_deassign_device(struct pci_dev *pdev)
>>>> {
>>>> unsigned int i;
>>>> @@ -128,7 +158,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
>>>>
>>>> int vpci_assign_device(struct pci_dev *pdev)
>>>> {
>>>> - unsigned int i;
>>>> const unsigned long *ro_map;
>>>> int rc = 0;
>>>>
>>>> @@ -159,14 +188,13 @@ int vpci_assign_device(struct pci_dev *pdev)
>>>> goto out;
>>>> #endif
>>>>
>>>> - for ( i = 0; i < NUM_VPCI_INIT; i++ )
>>>> - {
>>>> - rc = __start_vpci_array[i](pdev);
>>>> - if ( rc )
>>>> - break;
>>>> - }
>>>> + rc = vpci_init_header(pdev);
>>>> + if ( rc )
>>>> + goto out;
>>>> +
>>>> + rc = vpci_init_capabilities(pdev);
>>>>
>>>> - out: __maybe_unused;
>>>> + out:
>>>> if ( rc )
>>>> vpci_deassign_device(pdev);
>>>>
>>>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>>>> index 9d47b8c1a50e..8e815b418b7d 100644
>>>> --- a/xen/include/xen/vpci.h
>>>> +++ b/xen/include/xen/vpci.h
>>>> @@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
>>>> typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
>>>> uint32_t val, void *data);
>>>>
>>>> -typedef int vpci_register_init_t(struct pci_dev *dev);
>>>> -
>>>> -#define VPCI_PRIORITY_HIGH "1"
>>>> -#define VPCI_PRIORITY_MIDDLE "5"
>>>> -#define VPCI_PRIORITY_LOW "9"
>>>> +typedef struct {
>>>> + unsigned int id;
>>>> + bool is_ext;
>>>> + int (*init)(struct pci_dev *pdev);
>>>> + void (*fini)(struct pci_dev *pdev);
>>>> +} vpci_capability_t;
>>>>
>>>> #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12)
>>>>
>>>> @@ -29,9 +30,20 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>>>> */
>>>> #define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1)
>>>>
>>>> -#define REGISTER_VPCI_INIT(x, p) \
>>>> - static vpci_register_init_t *const x##_entry \
>>>> - __used_section(".data.vpci." p) = (x)
>>>> +#define REGISTER_VPCI_CAP(cap, x, y, ext) \
>>>
>>> x and y are not very helpful identifier names, better use some more
>>> descriptive naming, init and fini? Same below.
>> init and fini seems not good. They are conflict with the hook name of below vpci_capability_t.
>> Maybe init_func and fini_func ?
>
> Oh, I see. Can I recommend to name fields init and destroy or cleanup
> (instead of fini), and then use finit and fdestroy/fclean as macro
> parameters?
>
> I don't think it's common in Xen to name cleanup functions 'fini'. I
> understand this is a question of taste, it's mostly for coherence with
> the rest of the code base.
OK, will change to "init cleanup" and "finit fclean"
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-05-07 8:09 ` Roger Pau Monné
@ 2025-05-07 8:49 ` Chen, Jiqian
2025-05-07 9:05 ` Roger Pau Monné
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-07 8:49 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/7 16:09, Roger Pau Monné wrote:
> On Wed, May 07, 2025 at 07:26:21AM +0000, Chen, Jiqian wrote:
>> On 2025/5/7 00:21, Roger Pau Monné wrote:
>>> On Mon, Apr 21, 2025 at 02:18:59PM +0800, Jiqian Chen wrote:
>>>> When vpci fails to initialize a extended capability of device for dom0,
>>>> it just return error instead of catching and processing exception. That
>>>> makes the entire device unusable.
>>>>
>>>> So, add new a function to hide extended capability when initialization
>>>> fails. And remove the failed extended capability handler from vpci
>>>> extended capability list.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>>> ---
>>>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>>>> ---
>>>> v2->v3 changes:
>>>> * Separated from the last version patch "vpci: Hide capability when it fails to initialize".
>>>> * Whole implementation changed because last version is wrong.
>>>> This version gets target handler and previous handler from vpci->handlers, then remove the target.
>>>> * Note: a case in function vpci_ext_capability_mask() needs to be discussed,
>>>> because it may change the offset of next capability when the offset of target
>>>> capability is 0x100U(the first extended capability), my implementation is just to
>>>> ignore and let hardware to handle the target capability.
>>>>
>>>> v1->v2 changes:
>>>> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
>>>> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
>>>> remove failed capability from list.
>>>> * Called vpci_make_msix_hole() in the end of init_msix().
>>>>
>>>> Best regards,
>>>> Jiqian Chen.
>>>> ---
>>>> xen/drivers/vpci/vpci.c | 79 ++++++++++++++++++++++++++++++++++++++
>>>> xen/include/xen/pci_regs.h | 1 +
>>>> 2 files changed, 80 insertions(+)
>>>>
>>>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>>>> index f97c7cc460a0..8ff5169bdd18 100644
>>>> --- a/xen/drivers/vpci/vpci.c
>>>> +++ b/xen/drivers/vpci/vpci.c
>>>> @@ -183,6 +183,83 @@ static void vpci_capability_mask(struct pci_dev *pdev,
>>>> xfree(next_r);
>>>> }
>>>>
>>>> +static struct vpci_register *vpci_get_previous_ext_cap_register
>>>> + (struct vpci *vpci, const unsigned int offset)
>>>> +{
>>>> + uint32_t header;
>>>> + unsigned int pos = PCI_CFG_SPACE_SIZE;
>>>> + struct vpci_register *r;
>>>> +
>>>> + if ( offset <= PCI_CFG_SPACE_SIZE )
>>>> + return NULL;
>>>> +
>>>> + r = vpci_get_register(vpci, pos, 4);
>>>> + ASSERT(r);
>>>> +
>>>> + header = (uint32_t)(uintptr_t)r->private;
>>>> + pos = PCI_EXT_CAP_NEXT(header);
>>>> + while ( pos > PCI_CFG_SPACE_SIZE && pos != offset )
>>>> + {
>>>> + r = vpci_get_register(vpci, pos, 4);
>>>> + ASSERT(r);
>>>> + header = (uint32_t)(uintptr_t)r->private;
>>>> + pos = PCI_EXT_CAP_NEXT(header);
>>>> + }
>>>> +
>>>> + if ( pos <= PCI_CFG_SPACE_SIZE )
>>>> + return NULL;
>>>> +
>>>> + return r;
>>>> +}
>>>> +
>>>> +static void vpci_ext_capability_mask(struct pci_dev *pdev,
>>>> + const unsigned int cap)
>>>> +{
>>>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
>>>> + struct vpci_register *rm, *prev_r;
>>>> + struct vpci *vpci = pdev->vpci;
>>>> + uint32_t header, pre_header;
>>>
>>> Maybe sanity check that offset is correct?
>> What do you mean sanity check?
>> Do I need to add something?
>
> I would probably do something like:
>
> if ( !offset )
> {
> ASSERT_UNREACHABLE();
> return;
> }
How about adding check?
if ( offset < PCI_CFG_SPACE_SIZE )
{
ASSERT_UNREACHABLE();
return -EINVAL;
}
Do I need to add similar check in vpci_capability_mask()?
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-05-07 8:49 ` Chen, Jiqian
@ 2025-05-07 9:05 ` Roger Pau Monné
0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-07 9:05 UTC (permalink / raw)
To: Chen, Jiqian; +Cc: xen-devel@lists.xenproject.org, Huang, Ray
On Wed, May 07, 2025 at 08:49:46AM +0000, Chen, Jiqian wrote:
> On 2025/5/7 16:09, Roger Pau Monné wrote:
> > On Wed, May 07, 2025 at 07:26:21AM +0000, Chen, Jiqian wrote:
> >> On 2025/5/7 00:21, Roger Pau Monné wrote:
> >>> On Mon, Apr 21, 2025 at 02:18:59PM +0800, Jiqian Chen wrote:
> >>>> When vpci fails to initialize a extended capability of device for dom0,
> >>>> it just return error instead of catching and processing exception. That
> >>>> makes the entire device unusable.
> >>>>
> >>>> So, add new a function to hide extended capability when initialization
> >>>> fails. And remove the failed extended capability handler from vpci
> >>>> extended capability list.
> >>>>
> >>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >>>> ---
> >>>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> >>>> ---
> >>>> v2->v3 changes:
> >>>> * Separated from the last version patch "vpci: Hide capability when it fails to initialize".
> >>>> * Whole implementation changed because last version is wrong.
> >>>> This version gets target handler and previous handler from vpci->handlers, then remove the target.
> >>>> * Note: a case in function vpci_ext_capability_mask() needs to be discussed,
> >>>> because it may change the offset of next capability when the offset of target
> >>>> capability is 0x100U(the first extended capability), my implementation is just to
> >>>> ignore and let hardware to handle the target capability.
> >>>>
> >>>> v1->v2 changes:
> >>>> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> >>>> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
> >>>> remove failed capability from list.
> >>>> * Called vpci_make_msix_hole() in the end of init_msix().
> >>>>
> >>>> Best regards,
> >>>> Jiqian Chen.
> >>>> ---
> >>>> xen/drivers/vpci/vpci.c | 79 ++++++++++++++++++++++++++++++++++++++
> >>>> xen/include/xen/pci_regs.h | 1 +
> >>>> 2 files changed, 80 insertions(+)
> >>>>
> >>>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> >>>> index f97c7cc460a0..8ff5169bdd18 100644
> >>>> --- a/xen/drivers/vpci/vpci.c
> >>>> +++ b/xen/drivers/vpci/vpci.c
> >>>> @@ -183,6 +183,83 @@ static void vpci_capability_mask(struct pci_dev *pdev,
> >>>> xfree(next_r);
> >>>> }
> >>>>
> >>>> +static struct vpci_register *vpci_get_previous_ext_cap_register
> >>>> + (struct vpci *vpci, const unsigned int offset)
> >>>> +{
> >>>> + uint32_t header;
> >>>> + unsigned int pos = PCI_CFG_SPACE_SIZE;
> >>>> + struct vpci_register *r;
> >>>> +
> >>>> + if ( offset <= PCI_CFG_SPACE_SIZE )
> >>>> + return NULL;
> >>>> +
> >>>> + r = vpci_get_register(vpci, pos, 4);
> >>>> + ASSERT(r);
> >>>> +
> >>>> + header = (uint32_t)(uintptr_t)r->private;
> >>>> + pos = PCI_EXT_CAP_NEXT(header);
> >>>> + while ( pos > PCI_CFG_SPACE_SIZE && pos != offset )
> >>>> + {
> >>>> + r = vpci_get_register(vpci, pos, 4);
> >>>> + ASSERT(r);
> >>>> + header = (uint32_t)(uintptr_t)r->private;
> >>>> + pos = PCI_EXT_CAP_NEXT(header);
> >>>> + }
> >>>> +
> >>>> + if ( pos <= PCI_CFG_SPACE_SIZE )
> >>>> + return NULL;
> >>>> +
> >>>> + return r;
> >>>> +}
> >>>> +
> >>>> +static void vpci_ext_capability_mask(struct pci_dev *pdev,
> >>>> + const unsigned int cap)
> >>>> +{
> >>>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
> >>>> + struct vpci_register *rm, *prev_r;
> >>>> + struct vpci *vpci = pdev->vpci;
> >>>> + uint32_t header, pre_header;
> >>>
> >>> Maybe sanity check that offset is correct?
> >> What do you mean sanity check?
> >> Do I need to add something?
> >
> > I would probably do something like:
> >
> > if ( !offset )
> > {
> > ASSERT_UNREACHABLE();
> > return;
> > }
> How about adding check?
>
> if ( offset < PCI_CFG_SPACE_SIZE )
> {
> ASSERT_UNREACHABLE();
> return -EINVAL;
> }
That would work also, however note that pci_find_ext_capability()
should only return 0 if the capability is not found, and other callers
already assume that != 0 implies a valid position. I will simply
check !offset as that's inline with all the other checks Xen does for
return values of pci_find_ext_capability().
> Do I need to add similar check in vpci_capability_mask()?
Possibly - seems like I didn't comment on that one, sorry.
Regards, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-04-22 16:06 ` Jan Beulich
@ 2025-05-08 9:16 ` Chen, Jiqian
2025-05-08 9:47 ` Roger Pau Monné
0 siblings, 1 reply; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-08 9:16 UTC (permalink / raw)
To: Jan Beulich
Cc: Huang, Ray, Roger Pau Monné, xen-devel@lists.xenproject.org,
Chen, Jiqian
On 2025/4/23 00:06, Jan Beulich wrote:
> On 21.04.2025 08:18, Jiqian Chen wrote:
>> --- a/xen/include/xen/pci_regs.h
>> +++ b/xen/include/xen/pci_regs.h
>> @@ -449,6 +449,7 @@
>> #define PCI_EXT_CAP_ID(header) ((header) & 0x0000ffff)
>> #define PCI_EXT_CAP_VER(header) (((header) >> 16) & 0xf)
>> #define PCI_EXT_CAP_NEXT(header) (((header) >> 20) & 0xffc)
>> +#define PCI_EXT_CAP_NEXT_MASK 0xFFC00000U
>
> To avoid introducing redundancy, imo this addition calls for
>
> #define PCI_EXT_CAP_NEXT(header) MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK)
When I tested this locally, I encountered errors: every next position address loss two bits zero.
The next register has 12 bits, according to PCI spec, the bottom two bits are reserved zero,
so "#define PCI_EXT_CAP_NEXT_MASK 0xFFC00000U" is fine,
but if change this "#define PCI_EXT_CAP_NEXT(header) MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK)",
I need to change PCI_EXT_CAP_NEXT_MASK to be 0xFFF00000U, is it fine?
>
> now.
>
> Jan
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails
2025-04-21 6:19 ` [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails Jiqian Chen
@ 2025-05-08 9:39 ` Roger Pau Monné
2025-05-09 5:27 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-08 9:39 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:19:01PM +0800, Jiqian Chen wrote:
> When init_rebar() fails, the previous new changes will hide Rebar
> capability, it can't rely on vpci_deassign_device() to remove all
> Rebar related registers anymore, those registers must be removed
> fini_rebar().
>
> To do that, call vpci_remove_registers() to remove all possible
> registered registers.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> * Use fini_rebar() to remove all register instead of in the failure path of init_rebar();
>
> v1->v2 changes:
> * Called vpci_remove_registers() to remove all possible registered registers instead of using a array to record all registered register.
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/rebar.c | 35 ++++++++++++++++++++++++-----------
> 1 file changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
> index 026f8f7972d9..325090afb0f8 100644
> --- a/xen/drivers/vpci/rebar.c
> +++ b/xen/drivers/vpci/rebar.c
> @@ -49,6 +49,26 @@ static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> bar->guest_addr = bar->addr;
> }
>
> +static void fini_rebar(struct pci_dev *pdev)
> +{
> + uint32_t ctrl;
> + unsigned int nbars;
> + unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> + PCI_EXT_CAP_ID_REBAR);
> +
> + if ( !rebar_offset || !is_hardware_domain(pdev->domain) )
Maybe add an ASSERT_UNREACHABLE() here? I don't think we are expected
to get into the cleanup function for the capability if it's not
present, or if the owner of the device is not the hardware domain.
> + return;
> +
> + ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> + nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> + /*
> + * Remove all possible registered registers except header.
> + * Header register will be removed in mask function.
> + */
> + vpci_remove_registers(pdev->vpci, rebar_offset + PCI_REBAR_CAP(0),
> + PCI_REBAR_CTRL(nbars - 1));
> +}
> +
> static int cf_check init_rebar(struct pci_dev *pdev)
> {
> uint32_t ctrl;
> @@ -80,7 +100,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
> {
> printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> pdev->domain, &pdev->sbdf, index);
> - continue;
> + return -EINVAL;
-E2BIG might be better here. In general I try to avoid using EINVAL,
as it's a catch all that makes differentiating error later on harder.
> }
>
> bar = &pdev->vpci->header.bars[index];
> @@ -88,7 +108,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
> {
> printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> pdev->domain, &pdev->sbdf, index);
> - continue;
> + return -EINVAL;
Maybe -EDOM here? -ENXIO or EIO might also be appropriate.
Overall looks good.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 07/11] vpci: Hide extended capability when it fails to initialize
2025-05-08 9:16 ` Chen, Jiqian
@ 2025-05-08 9:47 ` Roger Pau Monné
0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-08 9:47 UTC (permalink / raw)
To: Chen, Jiqian; +Cc: Jan Beulich, Huang, Ray, xen-devel@lists.xenproject.org
On Thu, May 08, 2025 at 09:16:49AM +0000, Chen, Jiqian wrote:
> On 2025/4/23 00:06, Jan Beulich wrote:
> > On 21.04.2025 08:18, Jiqian Chen wrote:
> >> --- a/xen/include/xen/pci_regs.h
> >> +++ b/xen/include/xen/pci_regs.h
> >> @@ -449,6 +449,7 @@
> >> #define PCI_EXT_CAP_ID(header) ((header) & 0x0000ffff)
> >> #define PCI_EXT_CAP_VER(header) (((header) >> 16) & 0xf)
> >> #define PCI_EXT_CAP_NEXT(header) (((header) >> 20) & 0xffc)
> >> +#define PCI_EXT_CAP_NEXT_MASK 0xFFC00000U
> >
> > To avoid introducing redundancy, imo this addition calls for
> >
> > #define PCI_EXT_CAP_NEXT(header) MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK)
> When I tested this locally, I encountered errors: every next position address loss two bits zero.
> The next register has 12 bits, according to PCI spec, the bottom two bits are reserved zero,
> so "#define PCI_EXT_CAP_NEXT_MASK 0xFFC00000U" is fine,
> but if change this "#define PCI_EXT_CAP_NEXT(header) MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK)",
> I need to change PCI_EXT_CAP_NEXT_MASK to be 0xFFF00000U, is it fine?
Oh, I see. You might want to do:
#define PCI_EXT_CAP_NEXT_MASK 0xFFF00000U
/* Bottom two bits of next capability position are reserved. */
#define PCI_EXT_CAP_NEXT(header) (MASK_EXTR(header,
PCI_EXT_CAP_NEXT_MASK)
& 0xFFCU)
The spec says:
"The bottom 2 bits of this offset are Reserved and must be implemented
as 00b although software must mask them to allow for future uses of
these bits."
So we need to make sure they are masked.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 10/11] vpci/msi: Free MSI resources when init_msi() fails
2025-04-21 6:19 ` [PATCH v3 10/11] vpci/msi: Free MSI resources when init_msi() fails Jiqian Chen
@ 2025-05-08 9:57 ` Roger Pau Monné
0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-08 9:57 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:19:02PM +0800, Jiqian Chen wrote:
> When init_msi() fails, the previous new changes will hide MSI
> capability, it can't rely on vpci_deassign_device() to remove
> all MSI related resources anymore, those resources must be
> cleaned up in failure path of init_msi.
>
> To do that, add a new function to free MSI resources.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> * Remove all fail path, and use fini_msi() hook instead.
> * Change the method to calculating the size of msi registers.
>
> v1->v2 changes:
> * Added a new function fini_msi to free all MSI resources instead of using an array to record registered registers.
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/msi.c | 28 +++++++++++++++++++++++++++-
> 1 file changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index ea7dc0c060ea..18b06b789827 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -193,6 +193,32 @@ static void cf_check mask_write(
> msi->mask = val;
> }
>
> +static void fini_msi(struct pci_dev *pdev)
> +{
> + unsigned int end, size;
> + struct vpci *vpci = pdev->vpci;
> + const unsigned int msi_pos = pdev->msi_pos;
> +
> + if ( !msi_pos || !vpci->msi )
> + return;
> +
> + if ( vpci->msi->masking )
> + end = msi_pending_bits_reg(msi_pos, vpci->msi->address64);
> + else
> + end = msi_mask_bits_reg(msi_pos, vpci->msi->address64) - 2;
> +
> + size = end - msi_control_reg(msi_pos);
> +
> + /*
> + * Remove all possible registered registers except capability ID
> + * register if guest and next capability pointer register, which
> + * will be removed in mask function.
The above text seems very convoluted. I prefer re-using the same
comment that you had in the ReBAR cleanup helper:
/*
* Remove all possible registered registers except header.
* Header register will be removed in mask function.
*/
> + */
> + vpci_remove_registers(vpci, msi_control_reg(msi_pos), size);
> + xfree(vpci->msi);
> + vpci->msi = NULL;
XFREE(vpci->msi);
Will be more compact.
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources
2025-04-21 6:19 ` [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources Jiqian Chen
@ 2025-05-08 10:04 ` Roger Pau Monné
2025-05-09 5:45 ` Chen, Jiqian
0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2025-05-08 10:04 UTC (permalink / raw)
To: Jiqian Chen; +Cc: xen-devel, Huang Rui
On Mon, Apr 21, 2025 at 02:19:03PM +0800, Jiqian Chen wrote:
> When init_msix() fails, it needs to clean all MSIX resources.
> So, add a new function to do that.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v2->v3 changes:
> * Remove unnecessary clean operations in fini_msix().
>
> v1->v2 changes:
> new patch.
>
> Best regards,
> Jiqian Chen.
> ---
> xen/drivers/vpci/msix.c | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 0228ffd9fda9..e322c260f6bc 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -703,6 +703,25 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
> return 0;
> }
>
> +static void fini_msix(struct pci_dev *pdev)
> +{
> + struct vpci *vpci = pdev->vpci;
> + unsigned int msix_pos = pdev->msix_pos;
> +
> + if ( !msix_pos || !vpci->msix )
That's not fully correct here. See how in init_msix() vpci->msix is
set at the tail of the function, after having added the register
handlers.
I think you instead want:
if ( !msix_pos )
return;
vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
if ( !(vpci->msix )
return;
list_del(&vpci->msix->next);
...
> + return;
> +
> + list_del(&vpci->msix->next);
> +
> + for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
> + if ( vpci->msix->table[i] )
> + iounmap(vpci->msix->table[i]);
> +
Since you have added to all previous cleanup functions, do you also
need a comment here to mention the capability header is not handled?
TBH I'm not sure whether that's relevant in the context here (and
other cleanup functions), as the capability header traps are not added
by the REGISTER_VPCI_{LEGACY,EXTENDED}_CAP() init hooks either, so it
would seem asymmetric for the cleanup hook to attempt to remove those
in the first place.
> + vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
> + xfree(vpci->msix);
> + vpci->msix = NULL;
XFREE();
Thanks, Roger.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails
2025-05-08 9:39 ` Roger Pau Monné
@ 2025-05-09 5:27 ` Chen, Jiqian
0 siblings, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-09 5:27 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/8 17:39, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:19:01PM +0800, Jiqian Chen wrote:
>> When init_rebar() fails, the previous new changes will hide Rebar
>> capability, it can't rely on vpci_deassign_device() to remove all
>> Rebar related registers anymore, those registers must be removed
>> fini_rebar().
>>
>> To do that, call vpci_remove_registers() to remove all possible
>> registered registers.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v2->v3 changes:
>> * Use fini_rebar() to remove all register instead of in the failure path of init_rebar();
>>
>> v1->v2 changes:
>> * Called vpci_remove_registers() to remove all possible registered registers instead of using a array to record all registered register.
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>> xen/drivers/vpci/rebar.c | 35 ++++++++++++++++++++++++-----------
>> 1 file changed, 24 insertions(+), 11 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
>> index 026f8f7972d9..325090afb0f8 100644
>> --- a/xen/drivers/vpci/rebar.c
>> +++ b/xen/drivers/vpci/rebar.c
>> @@ -49,6 +49,26 @@ static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>> bar->guest_addr = bar->addr;
>> }
>>
>> +static void fini_rebar(struct pci_dev *pdev)
By the way, I will rename this to be cleanup_rebar since the hook name will be changed in next version.
>> +{
>> + uint32_t ctrl;
>> + unsigned int nbars;
>> + unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
>> + PCI_EXT_CAP_ID_REBAR);
>> +
>> + if ( !rebar_offset || !is_hardware_domain(pdev->domain) )
>
> Maybe add an ASSERT_UNREACHABLE() here? I don't think we are expected
> to get into the cleanup function for the capability if it's not
> present, or if the owner of the device is not the hardware domain.
Yes, we don't expect that.
Will add an ASSERT_UNREACHABLE() here in next version.
>
>> + return;
>> +
>> + ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
>> + nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>> + /*
>> + * Remove all possible registered registers except header.
>> + * Header register will be removed in mask function.
>> + */
>> + vpci_remove_registers(pdev->vpci, rebar_offset + PCI_REBAR_CAP(0),
>> + PCI_REBAR_CTRL(nbars - 1));
>> +}
>> +
>> static int cf_check init_rebar(struct pci_dev *pdev)
>> {
>> uint32_t ctrl;
>> @@ -80,7 +100,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
>> {
>> printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
>> pdev->domain, &pdev->sbdf, index);
>> - continue;
>> + return -EINVAL;
>
> -E2BIG might be better here. In general I try to avoid using EINVAL,
> as it's a catch all that makes differentiating error later on harder.
Got it, will change.
>
>> }
>>
>> bar = &pdev->vpci->header.bars[index];
>> @@ -88,7 +108,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
>> {
>> printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
>> pdev->domain, &pdev->sbdf, index);
>> - continue;
>> + return -EINVAL;
>
> Maybe -EDOM here? -ENXIO or EIO might also be appropriate.
Will change to -ENXIO, it seems more suitable.
Thanks.
>
> Overall looks good.
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources
2025-05-08 10:04 ` Roger Pau Monné
@ 2025-05-09 5:45 ` Chen, Jiqian
0 siblings, 0 replies; 53+ messages in thread
From: Chen, Jiqian @ 2025-05-09 5:45 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel@lists.xenproject.org, Huang, Ray, Chen, Jiqian
On 2025/5/8 18:04, Roger Pau Monné wrote:
> On Mon, Apr 21, 2025 at 02:19:03PM +0800, Jiqian Chen wrote:
>> When init_msix() fails, it needs to clean all MSIX resources.
>> So, add a new function to do that.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v2->v3 changes:
>> * Remove unnecessary clean operations in fini_msix().
>>
>> v1->v2 changes:
>> new patch.
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>> xen/drivers/vpci/msix.c | 21 ++++++++++++++++++++-
>> 1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
>> index 0228ffd9fda9..e322c260f6bc 100644
>> --- a/xen/drivers/vpci/msix.c
>> +++ b/xen/drivers/vpci/msix.c
>> @@ -703,6 +703,25 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
>> return 0;
>> }
>>
>> +static void fini_msix(struct pci_dev *pdev)
>> +{
>> + struct vpci *vpci = pdev->vpci;
>> + unsigned int msix_pos = pdev->msix_pos;
>> +
>> + if ( !msix_pos || !vpci->msix )
>
> That's not fully correct here. See how in init_msix() vpci->msix is
> set at the tail of the function, after having added the register
> handlers.
Thanks! You are more meticulous. I didn't notice that.
Will change in next version.
>
> I think you instead want:
>
> if ( !msix_pos )
> return;
>
> vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
>
> if ( !(vpci->msix )
> return;
>
> list_del(&vpci->msix->next);
> ...
>
>> + return;
>> +
>> + list_del(&vpci->msix->next);
>> +
>> + for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
>> + if ( vpci->msix->table[i] )
>> + iounmap(vpci->msix->table[i]);
>> +
>
> Since you have added to all previous cleanup functions, do you also
> need a comment here to mention the capability header is not handled?
>
> TBH I'm not sure whether that's relevant in the context here (and
> other cleanup functions), as the capability header traps are not added
> by the REGISTER_VPCI_{LEGACY,EXTENDED}_CAP() init hooks either, so it
> would seem asymmetric for the cleanup hook to attempt to remove those
> in the first place.
Indeed, you are right.
For symmetry consistency, I should not have to add these comments.
I will remove them for MSI and Rebar in next version.
>
>> + vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
>> + xfree(vpci->msix);
>> + vpci->msix = NULL;
>
> XFREE();
>
> Thanks, Roger.
--
Best regards,
Jiqian Chen.
^ permalink raw reply [flat|nested] 53+ messages in thread
end of thread, other threads:[~2025-05-09 5:46 UTC | newest]
Thread overview: 53+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-21 6:18 [PATCH v3 00/11] Support hiding capability when its initialization fails Jiqian Chen
2025-04-21 6:18 ` [PATCH v3 01/11] vpci/header: Move emulating cap list logic into new function Jiqian Chen
2025-04-29 7:10 ` Chen, Jiqian
2025-05-06 13:30 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 02/11] driver/pci: Get next capability without passing caps Jiqian Chen
2025-04-22 15:59 ` Jan Beulich
2025-04-23 3:13 ` Chen, Jiqian
2025-04-21 6:18 ` [PATCH v3 03/11] vpci/header: Emulate legacy capability list for dom0 Jiqian Chen
2025-04-22 16:01 ` Jan Beulich
2025-04-23 3:31 ` Chen, Jiqian
2025-04-23 7:27 ` Jan Beulich
2025-05-06 13:47 ` Roger Pau Monné
2025-05-06 13:50 ` Roger Pau Monné
2025-05-07 2:46 ` Chen, Jiqian
2025-05-07 2:50 ` Chen, Jiqian
2025-05-07 7:49 ` Roger Pau Monné
2025-05-07 8:13 ` Chen, Jiqian
2025-04-21 6:18 ` [PATCH v3 04/11] vpci/header: Emulate extended " Jiqian Chen
2025-05-06 14:14 ` Roger Pau Monné
2025-05-07 3:32 ` Chen, Jiqian
2025-05-07 7:55 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 05/11] vpci: Refactor REGISTER_VPCI_INIT Jiqian Chen
2025-04-22 16:03 ` Jan Beulich
2025-04-23 3:49 ` Chen, Jiqian
2025-04-23 7:36 ` Jan Beulich
2025-04-23 8:17 ` Chen, Jiqian
2025-05-06 14:37 ` Roger Pau Monné
2025-05-07 5:59 ` Chen, Jiqian
2025-05-07 8:04 ` Roger Pau Monné
2025-05-07 8:23 ` Chen, Jiqian
2025-04-21 6:18 ` [PATCH v3 06/11] vpci: Hide legacy capability when it fails to initialize Jiqian Chen
2025-05-06 16:00 ` Roger Pau Monné
2025-05-07 6:38 ` Chen, Jiqian
2025-05-07 8:07 ` Roger Pau Monné
2025-04-21 6:18 ` [PATCH v3 07/11] vpci: Hide extended " Jiqian Chen
2025-04-22 16:06 ` Jan Beulich
2025-05-08 9:16 ` Chen, Jiqian
2025-05-08 9:47 ` Roger Pau Monné
2025-05-06 16:21 ` Roger Pau Monné
2025-05-07 7:26 ` Chen, Jiqian
2025-05-07 8:09 ` Roger Pau Monné
2025-05-07 8:49 ` Chen, Jiqian
2025-05-07 9:05 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 08/11] vpci: Refactor vpci_remove_register to remove matched registers Jiqian Chen
2025-05-06 16:29 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 09/11] vpci/rebar: Remove registers when init_rebar() fails Jiqian Chen
2025-05-08 9:39 ` Roger Pau Monné
2025-05-09 5:27 ` Chen, Jiqian
2025-04-21 6:19 ` [PATCH v3 10/11] vpci/msi: Free MSI resources when init_msi() fails Jiqian Chen
2025-05-08 9:57 ` Roger Pau Monné
2025-04-21 6:19 ` [PATCH v3 11/11] vpci/msix: Add function to clean MSIX resources Jiqian Chen
2025-05-08 10:04 ` Roger Pau Monné
2025-05-09 5:45 ` Chen, Jiqian
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.