* [PATCH v2] pci-bridge/xio3130_downstream: fix invalid link speed and link width
@ 2024-05-29 20:17 Nam Cao
  2024-05-31 10:14 ` Jonathan Cameron via
  0 siblings, 1 reply; 4+ messages in thread
From: Nam Cao @ 2024-05-29 20:17 UTC (permalink / raw)
  To: Alex Williamson, Michael S . Tsirkin, Marcel Apfelbaum,
	qemu-devel
  Cc: Philippe Mathieu-Daudé, Nam Cao

Set link width to x1 and link speed to 2.5 Gb/s as specified by the
datasheet. Without this, these fields in the link status register read
zero, which is incorrect.

This problem has existed since 3d67447fe7c2 ("pcie: Fill PCIESlot link fields
to support higher speeds and widths"), which allows a PCIe slot to set link
width and link speed. However, if the PCIe slot does not explicitly set these
properties, they will be zero. Before that commit, the width and speed
defaulted to x1 and 2.5 Gb/s.

Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths")
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
v2: implement this in .realize() instead
---
 hw/pci-bridge/xio3130_downstream.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c
index 38a2361fa2..2df1ee203d 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -72,6 +72,9 @@ static void xio3130_downstream_realize(PCIDevice *d, Error **errp)
     pci_bridge_initfn(d, TYPE_PCIE_BUS);
     pcie_port_init_reg(d);
 
+    s->speed = QEMU_PCI_EXP_LNK_2_5GT;
+    s->width = QEMU_PCI_EXP_LNK_X1;
+
     rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
                   XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
                   XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
-- 
2.39.2




* Re: [PATCH v2] pci-bridge/xio3130_downstream: fix invalid link speed and link width
  2024-05-29 20:17 [PATCH v2] pci-bridge/xio3130_downstream: fix invalid link speed and link width Nam Cao
@ 2024-05-31 10:14 ` Jonathan Cameron via
  2024-05-31 10:36   ` Nam Cao
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron via @ 2024-05-31 10:14 UTC (permalink / raw)
  To: Nam Cao
  Cc: Alex Williamson, Michael S . Tsirkin, Marcel Apfelbaum,
	qemu-devel, Philippe Mathieu-Daudé

On Wed, 29 May 2024 22:17:44 +0200
Nam Cao <namcao@linutronix.de> wrote:

> Set link width to x1 and link speed to 2.5 Gb/s as specified by the
> datasheet. Without this, these fields in the link status register read
> zero, which is incorrect.
> 
> This problem appeared since 3d67447fe7c2 ("pcie: Fill PCIESlot link fields
> to support higher speeds and widths"), which allows PCIe slot to set link
> width and link speed. However, if PCIe slot does not explicitly set these
> properties, they will be zero. Before this commit, the width and speed
> default to x1 and 2.5 Gb/s.
> 
> Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths")
> Signed-off-by: Nam Cao <namcao@linutronix.de>
Hi Nam,

I'm feeling a bit guilty about this one, as I've known it was there for a while.

I was lazy when fixing the equivalent CXL case a while back, on the
basis that no one had noticed. Unlike CXL (where migration is broken for a
lot of reasons), fixing this may need to take into account migration from
broken to fixed versions.  Have you tested that?

I did the CXL fix slightly differently.  Can't remember why though - looking
at the fact it uses an instance_post_init, is there an issue with accidentally
overwriting the parameters?  Or did I just over engineer the fix?

https://gitlab.com/jic23/qemu/-/commit/314f5033c639ebe8218078a17513935747f15d9d

> ---
> v2: implement this in .realize() instead
> ---
>  hw/pci-bridge/xio3130_downstream.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c
> index 38a2361fa2..2df1ee203d 100644
> --- a/hw/pci-bridge/xio3130_downstream.c
> +++ b/hw/pci-bridge/xio3130_downstream.c
> @@ -72,6 +72,9 @@ static void xio3130_downstream_realize(PCIDevice *d, Error **errp)
>      pci_bridge_initfn(d, TYPE_PCIE_BUS);
>      pcie_port_init_reg(d);
>  
> +    s->speed = QEMU_PCI_EXP_LNK_2_5GT;
> +    s->width = QEMU_PCI_EXP_LNK_X1;
> +
>      rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
>                    XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
>                    XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,




* Re: [PATCH v2] pci-bridge/xio3130_downstream: fix invalid link speed and link width
  2024-05-31 10:14 ` Jonathan Cameron via
@ 2024-05-31 10:36   ` Nam Cao
  2024-06-03 17:08     ` Jonathan Cameron via
  0 siblings, 1 reply; 4+ messages in thread
From: Nam Cao @ 2024-05-31 10:36 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Alex Williamson, Michael S . Tsirkin, Marcel Apfelbaum,
	qemu-devel, Philippe Mathieu-Daudé

On Fri, May 31, 2024 at 11:14:00AM +0100, Jonathan Cameron wrote:
> On Wed, 29 May 2024 22:17:44 +0200
> Nam Cao <namcao@linutronix.de> wrote:
> 
> > Set link width to x1 and link speed to 2.5 Gb/s as specified by the
> > datasheet. Without this, these fields in the link status register read
> > zero, which is incorrect.
> > 
> > This problem appeared since 3d67447fe7c2 ("pcie: Fill PCIESlot link fields
> > to support higher speeds and widths"), which allows PCIe slot to set link
> > width and link speed. However, if PCIe slot does not explicitly set these
> > properties, they will be zero. Before this commit, the width and speed
> > default to x1 and 2.5 Gb/s.
> > 
> > Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths")
> > Signed-off-by: Nam Cao <namcao@linutronix.de>
> Hi Nam,
> 
> I'm feeling a bit guilty about this one, as I've known it was there for a while.
> 
> I was lazy when fixing the equivalent CXL case a while back, on the
> basis that no one had noticed. Unlike CXL (where migration is broken for a
> lot of reasons), fixing this may need to take into account migration from
> broken to fixed versions.  Have you tested that?

I tested this patch with Linux kernel.

I noticed this bug when Linux complained that the PCI link was broken.
Linux determines whether a link is up by checking if these speed/width
fields have valid values.

Repro:
	qemu-system-x86_64 \
	-machine pc-q35-2.10 \
	-kernel bzImage \
	-drive "file=img,format=raw" \
	-m 2048 -smp 1 -enable-kvm \
	-append "console=ttyS0 root=/dev/sda debug" \
	-nographic \
	-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=253 \
	-device x3130-upstream,id=up1,bus=rp1 \
	-device xio3130-downstream,id=dp1,bus=up1,chassis=1,slot=1

Then after Linux has booted:
	device_add e1000,bus=dp1,id=eth0

Then Linux complains that something is wrong with the link:
pcieport 0000:02:00.0: pciehp: Slot(1-1): Cannot train link: status 0x2000
 
This patch gets rid of Linux's complaint, and the hot-plug now works fine.
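
The "status 0x2000" in that message can be decoded by hand. What follows is
a minimal sketch, assuming the Link Status field layout from the PCIe
specification. The macros are local copies named after the ones in Linux's
pci_regs.h, and link_looks_trained() is a hypothetical helper that only
approximates the kind of check pciehp performs; it is not the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Status register fields (local copies, named after pci_regs.h). */
#define LNKSTA_CLS    0x000f  /* Current Link Speed, bits 3:0 */
#define LNKSTA_NLW    0x03f0  /* Negotiated Link Width, bits 9:4 */
#define LNKSTA_LT     0x0800  /* Link Training in progress */
#define LNKSTA_DLLLA  0x2000  /* Data Link Layer Link Active */

/* Extract the speed encoding (1 means 2.5 GT/s). */
static unsigned lnksta_speed(uint16_t lnksta)
{
    return lnksta & LNKSTA_CLS;
}

/* Extract the negotiated width in lanes (1 means x1). */
static unsigned lnksta_width(uint16_t lnksta)
{
    return (lnksta & LNKSTA_NLW) >> 4;
}

/*
 * Hypothetical helper: treat the link as trained only when training is
 * not in progress and the negotiated width is non-zero.
 */
static bool link_looks_trained(uint16_t lnksta)
{
    return !(lnksta & LNKSTA_LT) && (lnksta & LNKSTA_NLW);
}
```

With 0x2000, only the Data Link Layer Link Active bit is set; speed and
width both decode to zero, so the helper rejects the link. With the speed
and width fields each holding 1 (2.5 GT/s and x1), the same register would
read 0x2011 and the helper accepts it.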

> I did the CXL fix slightly differently.  Can't remember why though - looking
> at the fact it uses an instance_post_init, is there an issue with accidentally
> overwriting the parameters?  Or did I just over engineer the fix?

I would say over engineer. I think CXL does not take link speed and link
width as parameters.

Best regards,
Nam



* Re: [PATCH v2] pci-bridge/xio3130_downstream: fix invalid link speed and link width
  2024-05-31 10:36   ` Nam Cao
@ 2024-06-03 17:08     ` Jonathan Cameron via
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Cameron via @ 2024-06-03 17:08 UTC (permalink / raw)
  To: Nam Cao
  Cc: Alex Williamson, Michael S . Tsirkin, Marcel Apfelbaum,
	qemu-devel, Philippe Mathieu-Daudé

On Fri, 31 May 2024 12:36:35 +0200
Nam Cao <namcao@linutronix.de> wrote:

> On Fri, May 31, 2024 at 11:14:00AM +0100, Jonathan Cameron wrote:
> > On Wed, 29 May 2024 22:17:44 +0200
> > Nam Cao <namcao@linutronix.de> wrote:
> >   
> > > Set link width to x1 and link speed to 2.5 Gb/s as specified by the
> > > datasheet. Without this, these fields in the link status register read
> > > zero, which is incorrect.
> > > 
> > > This problem appeared since 3d67447fe7c2 ("pcie: Fill PCIESlot link fields
> > > to support higher speeds and widths"), which allows PCIe slot to set link
> > > width and link speed. However, if PCIe slot does not explicitly set these
> > > properties, they will be zero. Before this commit, the width and speed
> > > default to x1 and 2.5 Gb/s.
> > > 
> > > Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths")
> > > Signed-off-by: Nam Cao <namcao@linutronix.de>  
> > Hi Nam,
> > 
> > I'm feeling a bit guilty about this one, as I've known it was there for a while.
> > 
> > I was lazy when fixing the equivalent CXL case a while back, on the
> > basis that no one had noticed. Unlike CXL (where migration is broken for a
> > lot of reasons), fixing this may need to take into account migration from
> > broken to fixed versions.  Have you tested that?  
> 

I've run into problems in the past around updating config space registers
because when we migrate from a pre-patch QEMU instance to a post-patch one, the
config space registers are compared. I'm not sure if LNKCAP is included
in that.  LNKSTA is explicitly ruled out I think.

For examples, see all the machine version checks in
hw/core/machine.c.

The one that bit me was fixed with x-pcie-err-unc-mask
when I was fixing a register that didn't match the spec defined values.
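
The general shape of such a machine version check can be modelled in a few
lines. This is a simplified, self-contained sketch of the idea only: a
per-machine-version table of (driver, property, value) overrides that is
consulted before the device default is used. The struct, the lookup helper
and the property name "x-link-defaults" are all invented for illustration
and are not QEMU's actual definitions.

```c
#include <stddef.h>
#include <string.h>

/* One compat entry: force <property> of <driver> to <value>. */
struct compat_prop {
    const char *driver;
    const char *property;
    const char *value;
};

/*
 * Hypothetical entries for an older machine type. The property name
 * "x-link-defaults" is made up for this sketch.
 */
static const struct compat_prop old_machine_compat[] = {
    { "xio3130-downstream", "x-link-defaults", "off" },
};

/* Return the override for (driver, property) if present, else dflt. */
static const char *compat_lookup(const struct compat_prop *tbl, size_t n,
                                 const char *driver, const char *property,
                                 const char *dflt)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(tbl[i].driver, driver) &&
            !strcmp(tbl[i].property, property)) {
            return tbl[i].value;
        }
    }
    return dflt;
}
```

An old machine type would resolve the property to "off" and keep the
pre-fix register contents, while a new machine type with no matching entry
falls through to the default and gets the fixed values.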


> I tested this patch with Linux kernel.
> 
> I noticed this bug when Linux complained that the PCI link was broken.
> Linux determines whether a link is up by checking if these speed/width
> fields have valid values.
> 
> Repro:
> 	qemu-system-x86_64 \
> 	-machine pc-q35-2.10 \
> 	-kernel bzImage \
> 	-drive "file=img,format=raw" \
> 	-m 2048 -smp 1 -enable-kvm \
> 	-append "console=ttyS0 root=/dev/sda debug" \
> 	-nographic \
> 	-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=253 \
> 	-device x3130-upstream,id=up1,bus=rp1 \
> 	-device xio3130-downstream,id=dp1,bus=up1,chassis=1,slot=1
> 
> Then after Linux has booted:
> 	device_add e1000,bus=dp1,id=eth0
> 
> Then Linux complains that something is wrong with the link:
> pcieport 0000:02:00.0: pciehp: Slot(1-1): Cannot train link: status 0x2000
>  
> This patch gets rid of Linux's complaint, and the hot-plug now works fine.
> 
> > I did the CXL fix slightly differently.  Can't remember why though - looking
> > at the fact it uses an instance_post_init, is there an issue with accidentally
> > overwriting the parameters?  Or did I just over engineer the fix?  
> 
> I would say over engineer. I think CXL does not take link speed and link
> width as parameters.

I've implemented control, but this still ends up over-engineered because
the reason I want to control this is to vary access parameters for calculating
latency and bandwidth.  That is easiest done by controlling the EP status
to degrade the link.  For that I just set the CAP register on the switch DSP
to allow suitably high values and let pcie_sync_bridge() match this to
the status of the EP (which I have properties to control).
There seems to be only a one-way 'negotiation' of these parameters, so it
needs to be EP driven.
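
The EP-driven behaviour described above can be illustrated with a small
model. This is a sketch under assumptions, not QEMU's sync code: it assumes
the port's reported speed and width simply follow the endpoint's status,
clamped to what the port's capability register allows. The function name
sync_port_lnksta() and the macros are hypothetical.

```c
#include <stdint.h>

/* Speed and width occupy the same bit positions in LNKCAP and LNKSTA. */
#define LNK_SPEED_MASK  0x000f  /* bits 3:0 */
#define LNK_WIDTH_MASK  0x03f0  /* bits 9:4 */

/*
 * Hypothetical model: derive the downstream port's speed/width status
 * from the endpoint's status, limited by the port's capability.
 */
static uint16_t sync_port_lnksta(uint32_t port_lnkcap, uint16_t ep_lnksta)
{
    uint16_t speed = ep_lnksta & LNK_SPEED_MASK;
    uint16_t width = ep_lnksta & LNK_WIDTH_MASK;
    uint16_t max_speed = port_lnkcap & LNK_SPEED_MASK;
    uint16_t max_width = port_lnkcap & LNK_WIDTH_MASK;

    if (speed > max_speed) {
        speed = max_speed;
    }
    if (width > max_width) {
        width = max_width;
    }
    return (uint16_t)(speed | width);
}
```

In this model, a port whose capability allows speed encoding 4 and x16
(0x104) reports whatever a slower endpoint offers, for example 0x43 for
speed encoding 3 at x4, so degrading the endpoint degrades the link.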

Jonathan
> 
> Best regards,
> Nam


