From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Nam Cao <namcao@linutronix.de>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
qemu-devel@nongnu.org,
"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [PATCH v2] pci-bridge/xio3130_downstream: fix invalid link speed and link width
Date: Mon, 3 Jun 2024 18:08:10 +0100
Message-ID: <20240603180810.0000751b@Huawei.com>
In-Reply-To: <20240531103635.x9vzCtCv@linutronix.de>
On Fri, 31 May 2024 12:36:35 +0200
Nam Cao <namcao@linutronix.de> wrote:
> On Fri, May 31, 2024 at 11:14:00AM +0100, Jonathan Cameron wrote:
> > On Wed, 29 May 2024 22:17:44 +0200
> > Nam Cao <namcao@linutronix.de> wrote:
> >
> > > Set link width to x1 and link speed to 2.5 Gb/s as specified by the
> > > datasheet. Without this, these fields in the link status register read
> > > zero, which is incorrect.
> > >
> > > This problem appeared since 3d67447fe7c2 ("pcie: Fill PCIESlot link fields
> > > to support higher speeds and widths"), which allows PCIe slot to set link
> > > width and link speed. However, if PCIe slot does not explicitly set these
> > > properties, they will be zero. Before this commit, the width and speed
> > > default to x1 and 2.5 Gb/s.
> > >
> > > Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths")
> > > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > Hi Nam,
> >
> > I'm feeling a bit guilty about this one, as I've known it was there for a while.
> >
> > I was lazy when fixing the equivalent CXL case a while back, on the
> > basis that no one had noticed and, unlike CXL (where migration is broken for a lot
> > of reasons), fixing this may need to take into account migration from broken to
> > fixed versions. Have you tested that?
>
I've run into problems in the past around updating config space registers,
because when we migrate from a pre-patch QEMU instance to a post-patch one the
config space registers are compared. I'm not sure if LNKCAP is included
in that. LNKSTA is explicitly ruled out, I think.
For examples, see all the machine version checks in
hw/core/machine.c
The one that bit me was fixed with x-pcie-err-unc-mask,
when I was fixing a register that didn't match the spec-defined values.
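
To make the machine version mechanism concrete, here is a rough standalone
model of a per-machine-version compat table. It is not the code from
hw/core/machine.c: the CompatProp type, the compat_lookup() helper and the
table contents (including the "off" value) are illustrative assumptions,
and only the property name x-pcie-err-unc-mask comes from the above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One compat override: "for this driver, force this property to this value". */
typedef struct CompatProp {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/*
 * Hypothetical table for an older machine version. The entry keeps the
 * old (pre-fix) register behaviour so that config space still matches
 * what a pre-patch QEMU would have exposed. Values are illustrative.
 */
static const CompatProp old_machine_compat[] = {
    { "pci-device", "x-pcie-err-unc-mask", "off" },
};

/* Return the override for (driver, property), or dflt if there is none. */
static const char *compat_lookup(const CompatProp *tbl, size_t n,
                                 const char *driver, const char *property,
                                 const char *dflt)
{
    assert(driver && property);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].driver, driver) == 0 &&
            strcmp(tbl[i].property, property) == 0) {
            return tbl[i].value;
        }
    }
    return dflt;
}
```

The point of the model: a new machine type has no entry and gets the fixed
default, while an old machine type carries an entry that keeps the old
value, so a guest migrated from a pre-patch instance sees no change.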
> I tested this patch with Linux kernel.
>
> I noticed this bug when Linux complained that the PCI link was broken.
> Linux determines whether a link is up by checking if these speed/width
> fields have valid values.
>
> Repro:
> qemu-system-x86_64 \
> -machine pc-q35-2.10 \
> -kernel bzImage \
> -drive "file=img,format=raw" \
> -m 2048 -smp 1 -enable-kvm \
> -append "console=ttyS0 root=/dev/sda debug" \
> -nographic \
> -device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=253 \
> -device x3130-upstream,id=up1,bus=rp1 \
> -device xio3130-downstream,id=dp1,bus=up1,chassis=1,slot=1
>
> Then after Linux has booted:
> device_add e1000,bus=dp1,id=eth0
>
> Then Linux complains that something is wrong with the link:
> pcieport 0000:02:00.0: pciehp: Slot(1-1): Cannot train link: status 0x2000
>
> This patch gets rid of Linux's complaint, and the hot-plug now works fine.
>
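
For reference, a minimal sketch of how that status value decodes. The field
masks are the Link Status layout from the PCIe spec; the helper names are
made up for this mail, and link_looks_trained() is only my reading of the
kind of check pciehp makes (Link Training clear and a non-zero negotiated
width), not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* PCIe Link Status register fields. */
#define LNKSTA_CLS    0x000f  /* Current Link Speed (1 = 2.5 GT/s) */
#define LNKSTA_NLW    0x03f0  /* Negotiated Link Width (1 = x1) */
#define LNKSTA_LT     0x0800  /* Link Training in progress */
#define LNKSTA_DLLLA  0x2000  /* Data Link Layer Link Active */

static unsigned lnksta_speed(uint16_t sta) { return sta & LNKSTA_CLS; }
static unsigned lnksta_width(uint16_t sta) { return (sta & LNKSTA_NLW) >> 4; }

/* Build a status word from a speed code, a width and extra flag bits. */
static uint16_t lnksta_encode(unsigned speed, unsigned width, uint16_t flags)
{
    assert(speed <= 0xf && width <= 0x3f);
    return (uint16_t)(flags | (width << 4) | speed);
}

/* Assumed check: not still training, and the negotiated width is non-zero. */
static int link_looks_trained(uint16_t sta)
{
    return !(sta & LNKSTA_LT) && (sta & LNKSTA_NLW) != 0;
}
```

With these helpers, 0x2000 is DLLLA set with speed 0 and width 0, so the
check fails. x1 at 2.5 GT/s with DLLLA set would be
lnksta_encode(1, 1, LNKSTA_DLLLA), i.e. 0x2011, which passes.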
> > I did the CXL fix slightly differently. Can't remember why though - looking
> > at the fact it uses an instance_post_init, is there an issue with accidentally
> > overwriting the parameters? Or did I just over-engineer the fix?
>
> I would say over-engineered. I think CXL does not take link speed and link
> width as parameters.
I've implemented control, but this still ends up over-engineered because
the reason I want to control this is to vary access parameters for calculating
latency and bandwidth. That is easiest done by controlling the EP status
to degrade the link. For that I just set the CAP register on the switch DSP
to allow suitably high values and let pcie_sync_bridge() match this to
the status of the EP (which I have properties to control).
There seems to be only one-way 'negotiation' of these parameters, so it
needs to be EP driven.
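
As a rough model of that one-way matching (a sketch only, not the QEMU
function): the DSP status follows the EP status, clamped to what the DSP
capability allows. The Link type and sync_bridge_link() are hypothetical
names. Reporting the capability when no EP is present and falling back to
x1 / 2.5 GT/s for zero fields are assumptions of this sketch.

```c
#include <assert.h>

/* Speed is the PCIe speed code (1 = 2.5 GT/s, 2 = 5 GT/s, ...); width is lanes. */
typedef struct Link {
    unsigned speed;
    unsigned width;
} Link;

/*
 * Derive the downstream port's link status from the endpoint's status.
 * The endpoint drives the result; the port capability only caps it.
 */
static Link sync_bridge_link(Link dsp_cap, int ep_present, Link ep_sta)
{
    Link out;

    assert(dsp_cap.speed >= 1 && dsp_cap.width >= 1);

    if (!ep_present) {
        return dsp_cap;              /* assumption: nothing below, report cap */
    }

    out.speed = ep_sta.speed > dsp_cap.speed ? dsp_cap.speed : ep_sta.speed;
    out.width = ep_sta.width > dsp_cap.width ? dsp_cap.width : ep_sta.width;

    if (out.speed == 0) {
        out.speed = 1;               /* assumption: fall back to 2.5 GT/s */
    }
    if (out.width == 0) {
        out.width = 1;               /* assumption: fall back to x1 */
    }
    return out;
}
```

So with a DSP capability of speed code 4 and x16, an EP reporting speed
code 3 and x4 gives a DSP status of speed code 3 and x4: degrading the EP
is enough to degrade the link, and nothing flows the other way.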
Jonathan
>
> Best regards,
> Nam