* [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available
@ 2026-01-07 16:54 Teddy Astie
2026-01-07 16:54 ` [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling Teddy Astie
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Teddy Astie @ 2026-01-07 16:54 UTC (permalink / raw)
To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné
Currently, Xen uses legacy method to access the configuration space unless the
access cannot be made with it, where Xen fallbacks to MMCFG. This is not really
great, as MMCFG is more flexible and doesn't require a dedicated lock, so it would
be preferable to use it whenever possible.
Teddy Astie (2):
x86/pci: Improve pci_mmcfg_{read,write} error handling
x86/pci: Prefer using mmcfg for accessing configuration space
xen/arch/x86/x86_64/mmconfig_64.c | 10 +++---
xen/arch/x86/x86_64/pci.c | 52 ++++++++++++++-----------------
2 files changed, 28 insertions(+), 34 deletions(-)
--
2.52.0
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling 2026-01-07 16:54 [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Teddy Astie @ 2026-01-07 16:54 ` Teddy Astie 2026-01-08 9:51 ` Jan Beulich 2026-01-07 16:54 ` [PATCH v1 2/2] x86/pci: Prefer using mmcfg for accessing configuration space Teddy Astie 2026-01-07 17:22 ` [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Roger Pau Monné 2 siblings, 1 reply; 9+ messages in thread From: Teddy Astie @ 2026-01-07 16:54 UTC (permalink / raw) To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné Return -ENODEV in case no mmcfg information is available instead of -EINVAL, that is also returned when the parameters are incorrect. It helps us distinguish between incorrect usage and no MMCFG support. Signed-off-by: Teddy Astie <teddy.astie@vates.tech> --- xen/arch/x86/x86_64/mmconfig_64.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/x86_64/mmconfig_64.c b/xen/arch/x86/x86_64/mmconfig_64.c index ffdc62700d..1a2803d2a3 100644 --- a/xen/arch/x86/x86_64/mmconfig_64.c +++ b/xen/arch/x86/x86_64/mmconfig_64.c @@ -60,15 +60,15 @@ int pci_mmcfg_read(unsigned int seg, unsigned int bus, { char __iomem *addr; + *value = -1; + /* Why do we have this when nobody checks it. How about a BUG()!? -AK */ - if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095))) { -err: *value = -1; + if ( unlikely((bus > 255) || (devfn > 255) || (reg > 4095)) ) return -EINVAL; - } addr = pci_dev_base(seg, bus, devfn); if (!addr) - goto err; + return -ENODEV; switch (len) { case 1: @@ -96,7 +96,7 @@ int pci_mmcfg_write(unsigned int seg, unsigned int bus, addr = pci_dev_base(seg, bus, devfn); if (!addr) - return -EINVAL; + return -ENODEV; switch (len) { case 1: -- 2.52.0 -- Teddy Astie | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling 2026-01-07 16:54 ` [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling Teddy Astie @ 2026-01-08 9:51 ` Jan Beulich 0 siblings, 0 replies; 9+ messages in thread From: Jan Beulich @ 2026-01-08 9:51 UTC (permalink / raw) To: Teddy Astie; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel On 07.01.2026 17:54, Teddy Astie wrote: > --- a/xen/arch/x86/x86_64/mmconfig_64.c > +++ b/xen/arch/x86/x86_64/mmconfig_64.c > @@ -60,15 +60,15 @@ int pci_mmcfg_read(unsigned int seg, unsigned int bus, > { > char __iomem *addr; > > + *value = -1; > + > /* Why do we have this when nobody checks it. How about a BUG()!? -AK */ While it may be okay to retain this comment right here, the next patch the latest should drop it (and its counterpart below). Question though is whether earlier replies to the cover letter won't result in there not actually appearing any users of the return values, in which case the change here would be meaningless. Hence while in principle I'm okay with it, I won't ack it for the time being. Jan ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v1 2/2] x86/pci: Prefer using mmcfg for accessing configuration space 2026-01-07 16:54 [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Teddy Astie 2026-01-07 16:54 ` [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling Teddy Astie @ 2026-01-07 16:54 ` Teddy Astie 2026-01-08 9:56 ` Jan Beulich 2026-01-07 17:22 ` [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Roger Pau Monné 2 siblings, 1 reply; 9+ messages in thread From: Teddy Astie @ 2026-01-07 16:54 UTC (permalink / raw) To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné Current logic prefer using CFC/CF8 and fallbacks on mmcfg when accessing >255 registers or a non-zero segment. Change the logic to always rely on mmcfg unless it is not available to avoid locking on pci_config_lock if possible. Signed-off-by: Teddy Astie <teddy.astie@vates.tech> --- Are there x86 platforms where MMCFG is the only way to access PCI configuration space ? xen/arch/x86/x86_64/pci.c | 52 +++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 29 deletions(-) diff --git a/xen/arch/x86/x86_64/pci.c b/xen/arch/x86/x86_64/pci.c index 8d33429103..3b3df8014d 100644 --- a/xen/arch/x86/x86_64/pci.c +++ b/xen/arch/x86/x86_64/pci.c @@ -14,62 +14,56 @@ uint8_t pci_conf_read8(pci_sbdf_t sbdf, unsigned int reg) { uint32_t value; + int ret = pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 1, &value); - if ( sbdf.seg || reg > 255 ) - { - pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 1, &value); - return value; - } + if ( unlikely(ret == -ENODEV) && !sbdf.seg && reg <= 255 ) + return pci_conf_read(PCI_CONF_ADDRESS(sbdf, reg), reg & 3, 1); - return pci_conf_read(PCI_CONF_ADDRESS(sbdf, reg), reg & 3, 1); + return value; } uint16_t pci_conf_read16(pci_sbdf_t sbdf, unsigned int reg) { - if ( sbdf.seg || reg > 255 ) - { - uint32_t value; + uint32_t value; + int ret = pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 2, &value); - pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 2, &value); - return value; - } + if ( unlikely(ret == -ENODEV) && !sbdf.seg && reg <= 255 ) + return pci_conf_read(PCI_CONF_ADDRESS(sbdf, reg), reg & 2, 2); - return pci_conf_read(PCI_CONF_ADDRESS(sbdf, reg), reg & 2, 2); + return value; } uint32_t pci_conf_read32(pci_sbdf_t sbdf, unsigned int reg) { - if ( sbdf.seg || reg > 255 ) - { - uint32_t value; + uint32_t value; + int ret = pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 4, &value); - pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 4, &value); - return value; - } + if ( unlikely(ret == -ENODEV) && !sbdf.seg && reg <= 255 ) + return pci_conf_read(PCI_CONF_ADDRESS(sbdf, reg), 0, 4); - return pci_conf_read(PCI_CONF_ADDRESS(sbdf, reg), 0, 4); + return value; } void pci_conf_write8(pci_sbdf_t sbdf, unsigned int reg, uint8_t data) { - if ( sbdf.seg || reg > 255 ) - pci_mmcfg_write(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 1, data); - else + int ret = pci_mmcfg_write(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 1, data); + + if ( unlikely(ret == -ENODEV) && !sbdf.seg && reg <= 255 ) pci_conf_write(PCI_CONF_ADDRESS(sbdf, reg), reg & 3, 1, data); } void pci_conf_write16(pci_sbdf_t sbdf, unsigned int reg, uint16_t data) { - if ( sbdf.seg || reg > 255 ) - pci_mmcfg_write(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 2, data); - else + int ret = pci_mmcfg_write(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 2, data); + + if ( unlikely(ret == -ENODEV) && !sbdf.seg && reg <= 255 ) pci_conf_write(PCI_CONF_ADDRESS(sbdf, reg), reg & 2, 2, data); } void pci_conf_write32(pci_sbdf_t sbdf, unsigned int reg, uint32_t data) { - if ( sbdf.seg || reg > 255 ) - pci_mmcfg_write(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 4, data); - else + int ret = pci_mmcfg_write(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 4, data); + + if ( unlikely(ret == -ENODEV) && !sbdf.seg && reg <= 255 ) pci_conf_write(PCI_CONF_ADDRESS(sbdf, reg), 0, 4, data); } -- 2.52.0 -- Teddy Astie | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v1 2/2] x86/pci: Prefer using mmcfg for accessing configuration space 2026-01-07 16:54 ` [PATCH v1 2/2] x86/pci: Prefer using mmcfg for accessing configuration space Teddy Astie @ 2026-01-08 9:56 ` Jan Beulich 0 siblings, 0 replies; 9+ messages in thread From: Jan Beulich @ 2026-01-08 9:56 UTC (permalink / raw) To: Teddy Astie; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel On 07.01.2026 17:54, Teddy Astie wrote: > Current logic prefer using CFC/CF8 and fallbacks on mmcfg when accessing >> 255 registers or a non-zero segment. Change the logic to always rely (Minor: Many mail programs, like mine, will mistake a > in the first column as being reply quoting.) > on mmcfg unless it is not available to avoid locking on pci_config_lock > if possible. > > Signed-off-by: Teddy Astie <teddy.astie@vates.tech> > --- > Are there x86 platforms where MMCFG is the only way to access PCI configuration space ? If there were, how would that fact be communicated? > --- a/xen/arch/x86/x86_64/pci.c > +++ b/xen/arch/x86/x86_64/pci.c > @@ -14,62 +14,56 @@ > uint8_t pci_conf_read8(pci_sbdf_t sbdf, unsigned int reg) > { > uint32_t value; > + int ret = pci_mmcfg_read(sbdf.seg, sbdf.bus, sbdf.devfn, reg, 1, &value); Along the lines of what in particular Roger said in reply to the cover letter, I'm unconvinced we want to slow down (even if just minimally) things by unconditionally making this call (and similar ones below). Jan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available 2026-01-07 16:54 [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Teddy Astie 2026-01-07 16:54 ` [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling Teddy Astie 2026-01-07 16:54 ` [PATCH v1 2/2] x86/pci: Prefer using mmcfg for accessing configuration space Teddy Astie @ 2026-01-07 17:22 ` Roger Pau Monné 2026-01-07 17:58 ` Teddy Astie 2 siblings, 1 reply; 9+ messages in thread From: Roger Pau Monné @ 2026-01-07 17:22 UTC (permalink / raw) To: Teddy Astie; +Cc: xen-devel, Jan Beulich, Andrew Cooper On Wed, Jan 07, 2026 at 04:54:55PM +0000, Teddy Astie wrote: > Currently, Xen uses legacy method to access the configuration space unless the > access cannot be made with it, where Xen fallbacks to MMCFG. This is not really > great, as MMCFG is more flexible and doesn't require a dedicated lock, so it would > be preferable to use it whenever possible. > > Teddy Astie (2): > x86/pci: Improve pci_mmcfg_{read,write} error handling > x86/pci: Prefer using mmcfg for accessing configuration space AFAICT Linux is using the same approach as Xen to perform PCI accesses. Registers below 256 on segment 0 are accessed using the legacy method (IO ports), while the extended space is accessed using MMCFG. Do you know the reason for this? I fear there might be legacy devices/bridges (or root complexes?) where MMCFG is not working as expected? I think we need to understand why Xen (and Linux) do it this way so it can be properly justified why it's safe to switch to a different approach. Thanks, Roger. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available 2026-01-07 17:22 ` [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Roger Pau Monné @ 2026-01-07 17:58 ` Teddy Astie 2026-01-07 18:07 ` Andrew Cooper 0 siblings, 1 reply; 9+ messages in thread From: Teddy Astie @ 2026-01-07 17:58 UTC (permalink / raw) To: Roger Pau Monné; +Cc: xen-devel, Jan Beulich, Andrew Cooper Le 07/01/2026 à 18:25, Roger Pau Monné a écrit : > On Wed, Jan 07, 2026 at 04:54:55PM +0000, Teddy Astie wrote: >> Currently, Xen uses legacy method to access the configuration space unless the >> access cannot be made with it, where Xen fallbacks to MMCFG. This is not really >> great, as MMCFG is more flexible and doesn't require a dedicated lock, so it would >> be preferable to use it whenever possible. >> >> Teddy Astie (2): >> x86/pci: Improve pci_mmcfg_{read,write} error handling >> x86/pci: Prefer using mmcfg for accessing configuration space > > AFAICT Linux is using the same approach as Xen to perform PCI > accesses. Registers below 256 on segment 0 are accessed using the > legacy method (IO ports), while the extended space is accessed using > MMCFG. Do you know the reason for this? I fear there might be > legacy devices/bridges (or root complexes?) where MMCFG is not > working as expected? > There is apparently a errata on some K8 chipset according to FreeBSD code that uses MMCFG whenever possible. https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/pci/pci_cfgreg.c#L261-L277 > I think we need to understand why Xen (and Linux) do it this way so it > can be properly justified why it's safe to switch to a different > approach. > > Thanks, Roger. > -- Teddy Astie | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available 2026-01-07 17:58 ` Teddy Astie @ 2026-01-07 18:07 ` Andrew Cooper 2026-01-07 20:02 ` Roger Pau Monné 0 siblings, 1 reply; 9+ messages in thread From: Andrew Cooper @ 2026-01-07 18:07 UTC (permalink / raw) To: Teddy Astie, Roger Pau Monné; +Cc: Andrew Cooper, xen-devel, Jan Beulich On 07/01/2026 5:58 pm, Teddy Astie wrote: > Le 07/01/2026 à 18:25, Roger Pau Monné a écrit : >> On Wed, Jan 07, 2026 at 04:54:55PM +0000, Teddy Astie wrote: >>> Currently, Xen uses legacy method to access the configuration space unless the >>> access cannot be made with it, where Xen fallbacks to MMCFG. This is not really >>> great, as MMCFG is more flexible and doesn't require a dedicated lock, so it would >>> be preferable to use it whenever possible. >>> >>> Teddy Astie (2): >>> x86/pci: Improve pci_mmcfg_{read,write} error handling >>> x86/pci: Prefer using mmcfg for accessing configuration space >> AFAICT Linux is using the same approach as Xen to perform PCI >> accesses. I think you mean "Xen inherited it's PCI code from Linux". :) >> Registers below 256 on segment 0 are accessed using the >> legacy method (IO ports), while the extended space is accessed using >> MMCFG. Do you know the reason for this? I fear there might be >> legacy devices/bridges (or root complexes?) where MMCFG is not >> working as expected? >> > There is apparently a errata on some K8 chipset according to FreeBSD > code that uses MMCFG whenever possible. > > https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/pci/pci_cfgreg.c#L261-L277 Using MMCFG is *far* more efficient than IO ports, not least because we can avoid serialising accesses across the system. If it really is only some K8's which were the problem then lets go the FreeBSD way. Both Linux and Xen both talk about AMD Fam10h issues, which is the fact that early AMD CPUs only allow MMCFG accesses for MOV-EAX instructions. There's also loads of code improvements to be had. The current APIs are inefficient to use, and I did some cleanup for them for Trenchboot where there is a 64k limit. (But this can strictly come later). ~Andrew ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available 2026-01-07 18:07 ` Andrew Cooper @ 2026-01-07 20:02 ` Roger Pau Monné 0 siblings, 0 replies; 9+ messages in thread From: Roger Pau Monné @ 2026-01-07 20:02 UTC (permalink / raw) To: Andrew Cooper; +Cc: Teddy Astie, xen-devel, Jan Beulich On Wed, Jan 07, 2026 at 06:07:56PM +0000, Andrew Cooper wrote: > On 07/01/2026 5:58 pm, Teddy Astie wrote: > > Le 07/01/2026 à 18:25, Roger Pau Monné a écrit : > >> On Wed, Jan 07, 2026 at 04:54:55PM +0000, Teddy Astie wrote: > >>> Currently, Xen uses legacy method to access the configuration space unless the > >>> access cannot be made with it, where Xen fallbacks to MMCFG. This is not really > >>> great, as MMCFG is more flexible and doesn't require a dedicated lock, so it would > >>> be preferable to use it whenever possible. > >>> > >>> Teddy Astie (2): > >>> x86/pci: Improve pci_mmcfg_{read,write} error handling > >>> x86/pci: Prefer using mmcfg for accessing configuration space > >> AFAICT Linux is using the same approach as Xen to perform PCI > >> accesses. > > I think you mean "Xen inherited it's PCI code from Linux". :) > > >> Registers below 256 on segment 0 are accessed using the > >> legacy method (IO ports), while the extended space is accessed using > >> MMCFG. Do you know the reason for this? I fear there might be > >> legacy devices/bridges (or root complexes?) where MMCFG is not > >> working as expected? > >> > > There is apparently a errata on some K8 chipset according to FreeBSD > > code that uses MMCFG whenever possible. > > > > https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/pci/pci_cfgreg.c#L261-L277 > > Using MMCFG is *far* more efficient than IO ports, not least because we > can avoid serialising accesses across the system. > > If it really is only some K8's which were the problem then lets go the > FreeBSD way. Both Linux and Xen both talk about AMD Fam10h issues, > which is the fact that early AMD CPUs only allow MMCFG accesses for > MOV-EAX instructions. Sorry if my previous reply made it look the opposite, I'm all fine with switching to MMCFG by default, but we need the K8 workaround, plus this needs to be noted in the commit message. I wonder if we could use alternative calls to patch MMCFG only access on capable systems, thus removing the check for the fallback legacy access on such systems. Thanks, Roger. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-01-08 9:56 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-07 16:54 [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Teddy Astie
2026-01-07 16:54 ` [PATCH v1 1/2] x86/pci: Improve pci_mmcfg_{read,write} error handling Teddy Astie
2026-01-08 9:51 ` Jan Beulich
2026-01-07 16:54 ` [PATCH v1 2/2] x86/pci: Prefer using mmcfg for accessing configuration space Teddy Astie
2026-01-08 9:56 ` Jan Beulich
2026-01-07 17:22 ` [PATCH v1 0/2] x86/pci: MMCFG improvements and always use it if available Roger Pau Monné
2026-01-07 17:58 ` Teddy Astie
2026-01-07 18:07 ` Andrew Cooper
2026-01-07 20:02 ` Roger Pau Monné
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.