* [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine
@ 2024-12-03 10:58 Gao,Shiyuan via
2024-12-03 11:08 ` Daniel P. Berrangé
0 siblings, 1 reply; 6+ messages in thread
From: Gao,Shiyuan via @ 2024-12-03 10:58 UTC (permalink / raw)
To: eduardo@habkost.net, marcel.apfelbaum@gmail.com, mst@redhat.com,
zhao1.liu@intel.com, alex.williamson@redhat.com
Cc: qemu-devel@nongnu.org
> Some hardware devices now support PCIe 5.0, so change the default
> speed of the PCIe root port on new machine types.
>
> For passthrough Nvidia H20, this will be able to increase the h2d/d2h
> bandwidth ~17%.
>
> Origin:
> [CUDA Bandwidth Test] - Starting...
> Running on...
>
> Device 0: NVIDIA H20
> Quick Mode
>
> Host to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 45915.4
>
> Device to Host Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 45980.3
>
> Device to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 1842886.8
>
> Result = PASS
>
> With this patch:
> [CUDA Bandwidth Test] - Starting...
> Running on...
>
> Device 0: NVIDIA H20
> Quick Mode
>
> Host to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 53682.0
>
> Device to Host Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 53766.0
>
> Device to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 1842555.1
>
> Result = PASS
>
> Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
> ---
> hw/core/machine.c | 1 +
> hw/pci-bridge/gen_pcie_root_port.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index a35c4a8fae..afef55626d 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -38,6 +38,7 @@
>
> GlobalProperty hw_compat_9_1[] = {
> { TYPE_PCI_DEVICE, "x-pcie-ext-tag", "false" },
> + { "pcie-root-port", "x-speed", "16" },
> };
> const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1);
>
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 784507c826..c24ce1f2d1 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -142,7 +142,7 @@ static Property gen_rp_props[] = {
> DEFINE_PROP_SIZE("pref64-reserve", GenPCIERootPort,
> res_reserve.mem_pref_64, -1),
> DEFINE_PROP_PCIE_LINK_SPEED("x-speed", PCIESlot,
> - speed, PCIE_LINK_SPEED_16),
> + speed, PCIE_LINK_SPEED_32),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_32),
> DEFINE_PROP_END_OF_LIST()
> --
> 2.34.1
Ping.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine
2024-12-03 10:58 [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine Gao,Shiyuan via
@ 2024-12-03 11:08 ` Daniel P. Berrangé
0 siblings, 0 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2024-12-03 11:08 UTC (permalink / raw)
To: Gao,Shiyuan
Cc: eduardo@habkost.net, marcel.apfelbaum@gmail.com, mst@redhat.com,
zhao1.liu@intel.com, alex.williamson@redhat.com,
qemu-devel@nongnu.org
On Tue, Dec 03, 2024 at 10:58:22AM +0000, Gao,Shiyuan via wrote:
> > Some hardware devices now support PCIe 5.0, so change the default
> > speed of the PCIe root port on new machine types.
> >
> > For passthrough Nvidia H20, this will be able to increase the h2d/d2h
> > bandwidth ~17%.
> >
> > Origin:
> > [CUDA Bandwidth Test] - Starting...
> > Running on...
> >
> > Device 0: NVIDIA H20
> > Quick Mode
> >
> > Host to Device Bandwidth, 1 Device(s)
> > PINNED Memory Transfers
> > Transfer Size (Bytes) Bandwidth(MB/s)
> > 33554432 45915.4
> >
> > Device to Host Bandwidth, 1 Device(s)
> > PINNED Memory Transfers
> > Transfer Size (Bytes) Bandwidth(MB/s)
> > 33554432 45980.3
> >
> > Device to Device Bandwidth, 1 Device(s)
> > PINNED Memory Transfers
> > Transfer Size (Bytes) Bandwidth(MB/s)
> > 33554432 1842886.8
> >
> > Result = PASS
> >
> > With this patch:
> > [CUDA Bandwidth Test] - Starting...
> > Running on...
> >
> > Device 0: NVIDIA H20
> > Quick Mode
> >
> > Host to Device Bandwidth, 1 Device(s)
> > PINNED Memory Transfers
> > Transfer Size (Bytes) Bandwidth(MB/s)
> > 33554432 53682.0
> >
> > Device to Host Bandwidth, 1 Device(s)
> > PINNED Memory Transfers
> > Transfer Size (Bytes) Bandwidth(MB/s)
> > 33554432 53766.0
> >
> > Device to Device Bandwidth, 1 Device(s)
> > PINNED Memory Transfers
> > Transfer Size (Bytes) Bandwidth(MB/s)
> > 33554432 1842555.1
> >
> > Result = PASS
> >
> > Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
> > ---
> > hw/core/machine.c | 1 +
> > hw/pci-bridge/gen_pcie_root_port.c | 2 +-
> > 2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index a35c4a8fae..afef55626d 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -38,6 +38,7 @@
> >
> > GlobalProperty hw_compat_9_1[] = {
> > { TYPE_PCI_DEVICE, "x-pcie-ext-tag", "false" },
> > + { "pcie-root-port", "x-speed", "16" },
> > };
> > const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1);
> >
> > diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> > index 784507c826..c24ce1f2d1 100644
> > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > @@ -142,7 +142,7 @@ static Property gen_rp_props[] = {
> > DEFINE_PROP_SIZE("pref64-reserve", GenPCIERootPort,
> > res_reserve.mem_pref_64, -1),
> > DEFINE_PROP_PCIE_LINK_SPEED("x-speed", PCIESlot,
> > - speed, PCIE_LINK_SPEED_16),
> > + speed, PCIE_LINK_SPEED_32),
> > DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> > width, PCIE_LINK_WIDTH_32),
> > DEFINE_PROP_END_OF_LIST()
> > --
> > 2.34.1
>
> Ping.
There was a question from Jonathan Cameron on the original posting of this
patch that is awaiting your answer....
Regardless, at this time in the release cycle it's too late for 9.2, so this
patch would likely need to be adapted for the 10.0 release and to use the
hw_compat_9_2 that will then be added.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine
@ 2024-12-03 12:20 Gao,Shiyuan via
0 siblings, 0 replies; 6+ messages in thread
From: Gao,Shiyuan via @ 2024-12-03 12:20 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: eduardo@habkost.net, marcel.apfelbaum@gmail.com, mst@redhat.com,
zhao1.liu@intel.com, alex.williamson@redhat.com,
qemu-devel@nongnu.org
>
> There was a question from Jonathan Cameron on the original posting of this
> patch that is awaiting your answer....
Sorry, I forgot to cc qemu-devel on the reply; I have resent it.
>
>
> Regardless, at this time in the release cycle it's too late for 9.2, so this
> patch would likely need to be adapted for the 10.0 release and to use the
> hw_compat_9_2 that will then be added.
>
Thanks, I'll adapt for the 10.0 release.
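
As a rough illustration of the adaptation discussed above, the compat entry would move from hw_compat_9_1 to the hw_compat_9_2 array once that exists, so that 9.2 and older machine types keep advertising 16 GT/s. The sketch below is self-contained and does not use QEMU headers: CompatProp is a simplified stand-in for GlobalProperty, and compat_9_2_sketch and compat_lookup are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for QEMU's GlobalProperty; not the real definition. */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Hypothetical compat table: values pinned for 9.2 and older machine types. */
const CompatProp compat_9_2_sketch[] = {
    { "pcie-root-port", "x-speed", "16" },
};
const size_t compat_9_2_sketch_len =
    sizeof(compat_9_2_sketch) / sizeof(compat_9_2_sketch[0]);

/* Return the pinned value for driver/property, or NULL if it is not pinned. */
const char *compat_lookup(const CompatProp *tbl, size_t n,
                          const char *driver, const char *property)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].driver, driver) == 0 &&
            strcmp(tbl[i].property, property) == 0) {
            return tbl[i].value;
        }
    }
    return NULL;
}
```

In this sketch an old machine type finds "16" for x-speed, while a property with no entry (x-width, for example) returns NULL and keeps the device default.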
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine
@ 2024-12-03 12:15 Gao,Shiyuan via
0 siblings, 0 replies; 6+ messages in thread
From: Gao,Shiyuan via @ 2024-12-03 12:15 UTC (permalink / raw)
To: Jonathan Cameron, Gao Shiyuan via
Cc: eduardo@habkost.net, marcel.apfelbaum@gmail.com, mst@redhat.com,
zhao1.liu@intel.com, Daniel P. Berrangé,
alex.williamson@redhat.com
> > Some hardware devices now support PCIe 5.0, so change the default
> > speed of the PCIe root port on new machine types.
> >
> > For passthrough Nvidia H20, this will be able to increase the h2d/d2h
> > bandwidth ~17%.
>
>
> I'm curious. Why are you seeing the perf improvement?
>
>
> Maybe my mental model is wrong, but I thought we just faked these
> registers so there should be no actual change of the link speed
> just a change in what the guest sees. Is the driver using this
> information to tune something else?
>
>
> I recently added support for the equivalent to CXL port emulation
> to support performance characteristic discovery but so far I think
> that's only informational for userspace software rather than affecting
> hardware usage directly.
Sorry, I forgot to cc qemu-devel on this email.
I'm also very curious about this. Maybe, as you mentioned,
the driver is using this information to make some settings?
But when I use ioh3420 (which only supports PCIe gen1) or a pcie root port
set to gen1, I get the same bandwidth as with gen4.
|---------------------------|------------|------------|
| root port                 | d2h (MB/s) | h2d (MB/s) |
|---------------------------|------------|------------|
| ioh3420 (gen1)            | 45976.2    | 45681.9    |
|---------------------------|------------|------------|
| pcie root port (gen1)     | 45846.5    | 45993.7    |
|---------------------------|------------|------------|
| pcie root port (gen4)     | 45980.3    | 45915.4    |
|---------------------------|------------|------------|
| pcie root port (gen5)     | 53766.0    | 53682.0    |
|---------------------------|------------|------------|
I would be glad if someone could explain it. Anyway, supporting Gen 5
won’t make things any worse.
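
For reference, the "~17%" figure can be checked against the gen4 and gen5 rows of the table above. This is only a small arithmetic sketch; percent_gain and print_gen5_gain are names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from `before` to `after`, in percent. */
double percent_gain(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Print the gen4 -> gen5 gain using the MB/s values from the table. */
void print_gen5_gain(void)
{
    printf("h2d gain: %.1f%%\n", percent_gain(45915.4, 53682.0));
    printf("d2h gain: %.1f%%\n", percent_gain(45980.3, 53766.0));
}
```

Calling print_gen5_gain() from a main() prints both gains. By the same formula, the gen1 and gen4 rows differ from each other by well under 1%.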
Some information from the VM follows:
Original pcie root port, which only supports PCIe gen4:
dmesg
[ 21.930515] pci 0000:61:00.0: [10de:2329] type 00 class 0x030200
[ 22.339006] pci 0000:61:00.0: reg 0x10: [mem 0x26002000000-0x26002ffffff 64bit pref]
[ 22.462006] pci 0000:61:00.0: reg 0x18: [mem 0x24000000000-0x25fffffffff 64bit pref]
[ 22.588005] pci 0000:61:00.0: reg 0x20: [mem 0x26000000000-0x26001ffffff 64bit pref]
[ 22.713135] pci 0000:61:00.0: Max Payload Size set to 128 (was 256, max 256)
[ 22.714223] pci 0000:61:00.0: Enabling HDA controller
[ 22.716635] pci 0000:61:00.0: 252.048 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x16 link at 0000:60:00.0 (capable of 504.112 Gb/s with 32.0 GT/s PCIe x16 link)
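
The "available PCIe bandwidth" figures in the dmesg lines of this message (252.048, 504.112 and 2.000 Gb/s) are consistent with per-lane rate times lane count after line-encoding overhead. The sketch below reproduces them under the assumption that 2.5 and 5 GT/s use 8b/10b encoding, that 8 GT/s and above use 128b/130b, and that the per-lane value is truncated to whole Mb/s before multiplying. I believe this matches how the guest kernel derives the number, but treat it as an assumption; lane_mbps and link_mbps are hypothetical names.

```c
#include <assert.h>

/*
 * Usable per-lane rate in Mb/s for a raw rate given in MT/s.
 * Assumes 8b/10b encoding up to 5 GT/s and 128b/130b above that.
 * Integer division truncates to whole Mb/s.
 */
unsigned int lane_mbps(unsigned int raw_mts)
{
    assert(raw_mts > 0);
    if (raw_mts <= 5000) {
        return raw_mts * 8 / 10;
    }
    return raw_mts * 128 / 130;
}

/* Usable bandwidth of a whole link in Mb/s. */
unsigned int link_mbps(unsigned int raw_mts, unsigned int lanes)
{
    assert(lanes > 0);
    return lane_mbps(raw_mts) * lanes;
}
```

With these assumptions, link_mbps(16000, 16) gives 252048 Mb/s for the root port's 16 GT/s x16 link and link_mbps(32000, 16) gives 504112 Mb/s for what the endpoint is capable of.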
PCIe capability
60:00.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
Capabilities: [54] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 16GT/s, Width x32, ASPM L0s, Exit Latency L0s <64ns
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 16GT/s (ok), Width x16 (downgraded)
TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
SltCtl: Enable: AttnBtn+ PwrFlt- MRL- PresDet- CmdCplt+ HPIrq+ LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet- LinkState-
RootCap: CRSVisible-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 4
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
61:00.0 3D controller: NVIDIA Corporation Device 2329 (rev a1)
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 32GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-32GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer+ 2Retimers- CrosslinkRes: unsupported
pcie root port configured for PCIe gen1:
dmesg
[ 21.419206] pci 0000:61:00.0: [10de:2329] type 00 class 0x030200
[ 21.827005] pci 0000:61:00.0: reg 0x10: [mem 0x5e002000000-0x5e002ffffff 64bit pref]
[ 21.952005] pci 0000:61:00.0: reg 0x18: [mem 0x5c000000000-0x5dfffffffff 64bit pref]
[ 22.074005] pci 0000:61:00.0: reg 0x20: [mem 0x5e000000000-0x5e001ffffff 64bit pref]
[ 22.196136] pci 0000:61:00.0: Max Payload Size set to 128 (was 256, max 256)
[ 22.197232] pci 0000:61:00.0: Enabling HDA controller
[ 22.199566] pci 0000:61:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x1 link at 0000:60:00.0 (capable of 504.112 Gb/s with 32.0 GT/s PCIe x16 link)
PCIe capability
60:00.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
Capabilities: [54] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
Changed: MRL- PresDet- LinkState-
RootCap: CRSVisible-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 4
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
61:00.0 3D controller: NVIDIA Corporation Device 2329 (rev a1)
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 32GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-32GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer+ 2Retimers- CrosslinkRes: unsupported
With this patch, the pcie root port supports PCIe gen5:
dmesg does not print this information.
PCIe capability
60:00.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
Capabilities: [54] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 32GT/s, Width x32, ASPM L0s, Exit Latency L0s <64ns
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 32GT/s (ok), Width x16 (downgraded)
TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
Changed: MRL- PresDet- LinkState-
RootCap: CRSVisible-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 4
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
61:00.0 3D controller: NVIDIA Corporation Device 2329 (rev a1)
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 32GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-32GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer+ 2Retimers- CrosslinkRes: unsupported
Using ioh3420 as the root port:
dmesg
[ 19.922501] pci 0000:61:00.0: [10de:2329] type 00 class 0x030200
[ 20.325006] pci 0000:61:00.0: reg 0x10: [mem 0x26002000000-0x26002ffffff 64bit pref]
[ 20.443005] pci 0000:61:00.0: reg 0x18: [mem 0x24000000000-0x25fffffffff 64bit pref]
[ 20.557005] pci 0000:61:00.0: reg 0x20: [mem 0x26000000000-0x26001ffffff 64bit pref]
[ 20.671137] pci 0000:61:00.0: Max Payload Size set to 128 (was 256, max 256)
[ 20.672207] pci 0000:61:00.0: Enabling HDA controller
[ 20.674744] pci 0000:61:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x1 link at 0000:60:00.0 (capable of 504.112 Gb/s with 32.0 GT/s PCIe x16 link)
PCIe capability
60:00.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI Express Root Port 0 (rev 02) (prog-if 00 [Normal decode])
Capabilities: [90] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
SltCtl: Enable: AttnBtn+ PwrFlt- MRL- PresDet- CmdCplt+ HPIrq+ LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet- LinkState-
RootCap: CRSVisible-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 4
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
61:00.0 3D controller: NVIDIA Corporation Device 2329 (rev a1)
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 32GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-32GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer+ 2Retimers- CrosslinkRes: unsupported
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine
@ 2024-11-17 14:39 Gao Shiyuan via
2024-11-18 11:43 ` Jonathan Cameron via
0 siblings, 1 reply; 6+ messages in thread
From: Gao Shiyuan via @ 2024-11-17 14:39 UTC (permalink / raw)
To: eduardo, marcel.apfelbaum, mst, zhao1.liu; +Cc: qemu-devel, gaoshiyuan
Some hardware devices now support PCIe 5.0, so change the default
speed of the PCIe root port on new machine types.
For a passthrough Nvidia H20, this increases the h2d/d2h bandwidth
by ~17%.
Origin:
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: NVIDIA H20
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 45915.4
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 45980.3
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 1842886.8
Result = PASS
With this patch:
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: NVIDIA H20
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 53682.0
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 53766.0
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 1842555.1
Result = PASS
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
---
hw/core/machine.c | 1 +
hw/pci-bridge/gen_pcie_root_port.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index a35c4a8fae..afef55626d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -38,6 +38,7 @@
GlobalProperty hw_compat_9_1[] = {
{ TYPE_PCI_DEVICE, "x-pcie-ext-tag", "false" },
+ { "pcie-root-port", "x-speed", "16" },
};
const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1);
diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
index 784507c826..c24ce1f2d1 100644
--- a/hw/pci-bridge/gen_pcie_root_port.c
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -142,7 +142,7 @@ static Property gen_rp_props[] = {
DEFINE_PROP_SIZE("pref64-reserve", GenPCIERootPort,
res_reserve.mem_pref_64, -1),
DEFINE_PROP_PCIE_LINK_SPEED("x-speed", PCIESlot,
- speed, PCIE_LINK_SPEED_16),
+ speed, PCIE_LINK_SPEED_32),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
width, PCIE_LINK_WIDTH_32),
DEFINE_PROP_END_OF_LIST()
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] pcie-root-port: Fast PCIe root ports for new machine
2024-11-17 14:39 Gao Shiyuan via
@ 2024-11-18 11:43 ` Jonathan Cameron via
0 siblings, 0 replies; 6+ messages in thread
From: Jonathan Cameron via @ 2024-11-18 11:43 UTC (permalink / raw)
To: Gao Shiyuan via; +Cc: Gao Shiyuan, eduardo, marcel.apfelbaum, mst, zhao1.liu
On Sun, 17 Nov 2024 22:39:17 +0800
Gao Shiyuan via <qemu-devel@nongnu.org> wrote:
> Some hardware devices now support PCIe 5.0, so change the default
> speed of the PCIe root port on new machine types.
>
> For passthrough Nvidia H20, this will be able to increase the h2d/d2h
> bandwidth ~17%.
I'm curious. Why are you seeing the perf improvement?
Maybe my mental model is wrong, but I thought we just faked these
registers so there should be no actual change of the link speed
just a change in what the guest sees. Is the driver using this
information to tune something else?
I recently added support for the equivalent to CXL port emulation
to support performance characteristic discovery but so far I think
that's only informational for userspace software rather than affecting
hardware usage directly.
Jonathan
>
> Origin:
> [CUDA Bandwidth Test] - Starting...
> Running on...
>
> Device 0: NVIDIA H20
> Quick Mode
>
> Host to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 45915.4
>
> Device to Host Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 45980.3
>
> Device to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 1842886.8
>
> Result = PASS
>
> With this patch:
> [CUDA Bandwidth Test] - Starting...
> Running on...
>
> Device 0: NVIDIA H20
> Quick Mode
>
> Host to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 53682.0
>
> Device to Host Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 53766.0
>
> Device to Device Bandwidth, 1 Device(s)
> PINNED Memory Transfers
> Transfer Size (Bytes) Bandwidth(MB/s)
> 33554432 1842555.1
>
> Result = PASS
>
> Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
> ---
> hw/core/machine.c | 1 +
> hw/pci-bridge/gen_pcie_root_port.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index a35c4a8fae..afef55626d 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -38,6 +38,7 @@
>
> GlobalProperty hw_compat_9_1[] = {
> { TYPE_PCI_DEVICE, "x-pcie-ext-tag", "false" },
> + { "pcie-root-port", "x-speed", "16" },
> };
> const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1);
>
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 784507c826..c24ce1f2d1 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -142,7 +142,7 @@ static Property gen_rp_props[] = {
> DEFINE_PROP_SIZE("pref64-reserve", GenPCIERootPort,
> res_reserve.mem_pref_64, -1),
> DEFINE_PROP_PCIE_LINK_SPEED("x-speed", PCIESlot,
> - speed, PCIE_LINK_SPEED_16),
> + speed, PCIE_LINK_SPEED_32),
> DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
> width, PCIE_LINK_WIDTH_32),
> DEFINE_PROP_END_OF_LIST()
^ permalink raw reply [flat|nested] 6+ messages in thread