* about mpss with pcie_bus_perf
@ 2014-01-10 0:46 Yinghai Lu
2014-01-14 22:54 ` Bjorn Helgaas
0 siblings, 1 reply; 7+ messages in thread
From: Yinghai Lu @ 2014-01-10 0:46 UTC
To: Myron Stowe, Bjorn Helgaas, linux-pci@vger.kernel.org
looks like we have some problem with MPSS.
+-02.2-[10-1f]----00.0-[11-13]--+-02.0-[12]--+-00.0
|                               |            \-00.1
|                               \-03.0-[13]----00.0
kernel boot with pcie_bus_perf:
00:02.2: cap/ctl: 256/256
10:00.0: cap/ctl: 256/256
11:02.0: cap/ctl: 256/256
12:00.0: cap/ctl: 128/128
12:00.1: cap/ctl: 128/128
11:03.0: cap/ctl: 256/256
13:00.0: cap/ctl: 256/256
Should we set MPSS to 128?
Thanks
Yinghai
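
Editor's sketch: the sizes in the listing above are easier to reason about once the register encoding is explicit. PCIe stores MPSS, MPS and MRRS as 3-bit fields where bytes = 128 << field, so only 128, 256, 512, 1024, 2048 and 4096 can be expressed. The helper names below are hypothetical, and the register values in the example are made up to match a device that reports cap/ctl 256/256; reading "cap/ctl" as MPSS/MPS is an assumption about the listing.

```python
# Simplified decoding of the PCIe payload-size fields (illustrative sketch).

def decode_size(field: int) -> int:
    """Convert a 3-bit size field to bytes: 0 -> 128, 1 -> 256, ... 5 -> 4096."""
    if not 0 <= field <= 5:
        raise ValueError(f"reserved size encoding: {field}")
    return 128 << field


def encode_size(nbytes: int) -> int:
    """Convert a byte count to its 3-bit field; only powers of two from 128 to 4096 exist."""
    for field in range(6):
        if 128 << field == nbytes:
            return field
    raise ValueError(f"{nbytes} bytes has no encoding")


def parse_cap_ctl(devcap: int, devctl: int) -> dict:
    """Extract MPSS, MPS and MRRS (in bytes) from raw register values."""
    return {
        "mpss": decode_size(devcap & 0x7),          # Device Capabilities, bits 2:0
        "mps": decode_size((devctl >> 5) & 0x7),    # Device Control, bits 7:5
        "mrrs": decode_size((devctl >> 12) & 0x7),  # Device Control, bits 14:12
    }


# Hypothetical register values for a device reporting cap/ctl 256/256 with MRRS 512.
example = parse_cap_ctl(devcap=0x1, devctl=(1 << 5) | (2 << 12))
print(example)
```

Keeping MPSS (what the device can do) separate from MPS (what it is programmed to do) matters for the rest of the thread, where the two terms are sometimes used interchangeably.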
* Re: about mpss with pcie_bus_perf
From: Bjorn Helgaas @ 2014-01-14 22:54 UTC
To: Yinghai Lu; +Cc: Myron Stowe, linux-pci@vger.kernel.org, Yijing Wang, Jon Mason

[+cc Jon, Yijing]

On Thu, Jan 9, 2014 at 5:46 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> looks like we have some problem with MPSS.
>
> Should we set MPSS to 128?

Please propose a patch and/or open a bug report.  I don't do enough
with MPS to make the problem and its solution immediately obvious to
me.

Bjorn
* Re: about mpss with pcie_bus_perf
From: Jon Mason @ 2014-01-15 0:34 UTC
To: Bjorn Helgaas
Cc: Yinghai Lu, Myron Stowe, linux-pci@vger.kernel.org, Yijing Wang, Jon Mason

On Tue, Jan 14, 2014 at 3:54 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Please propose a patch and/or open a bug report.  I don't do enough
> with MPS to make the problem and its solution immediately obvious to
> me.

Not a lot of verbiage in here, but I believe this is the expected
behavior for the "pcie_bus_perf" kernel boot parm.  With it, each pci
device sets its MPS to the max of the parent

From the commit log:

 - A more optimal way is possible, if it falls within a couple of
   constraints:
   * The top-level host bridge will never generate packets larger than the
     smallest TLP (or if it can be controlled independently from its MPS at
     least)
   * The device will never generate packets larger than MPS (which can be
     configured via MRRS)
   * No support of direct PCI-E <-> PCI-E transfers between devices without
     some additional code to specifically deal with that case

   Then we can use an approach that basically ignores downstream requests
   and focuses exclusively on upstream requests.  In that case, all we need
   to care about is that a device MPS is no larger than its parent MPS,
   which allows us to keep all switches/bridges to the max MPS supported by
   their parent and eventually the PHB.

If this is not behaving as described (which I can't tell from the log
above), then feel free to assign the bug to me.

Thanks,
Jon
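
Editor's sketch: the rule quoted from the commit log (a device's MPS is no larger than its parent's MPS, and otherwise as large as the device supports) can be modelled as a walk over the bus tree. This is a simplified model, not the kernel's code. The `Dev` class and `configure_perf` function are hypothetical names, and the MPSS values are taken from the "cap" column of Yinghai's listing on the assumption that "cap" means MPSS.

```python
from dataclasses import dataclass, field


@dataclass
class Dev:
    name: str
    mpss: int                      # largest payload the device supports, in bytes
    children: list = field(default_factory=list)
    mps: int = 0                   # programmed payload size, filled in by configure_perf


def configure_perf(dev, parent_mps=None):
    """Give each device the largest MPS it supports without exceeding its parent's."""
    dev.mps = dev.mpss if parent_mps is None else min(dev.mpss, parent_mps)
    for child in dev.children:
        configure_perf(child, dev.mps)


def flatten(dev):
    """Yield the device and everything below it, parents first."""
    yield dev
    for child in dev.children:
        yield from flatten(child)


# Topology and MPSS values from the first message in this thread.
root = Dev("00:02.2", 256, [
    Dev("10:00.0", 256, [
        Dev("11:02.0", 256, [Dev("12:00.0", 128), Dev("12:00.1", 128)]),
        Dev("11:03.0", 256, [Dev("13:00.0", 256)]),
    ]),
])

configure_perf(root)
result = {d.name: d.mps for d in flatten(root)}
for d in flatten(root):
    print(f"{d.name}: cap/ctl: {d.mpss}/{d.mps}")
```

Under this model only the two functions behind 11:02.0 end up at 128 while every bridge stays at 256, which is consistent with the listing Yinghai posted.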
* Re: about mpss with pcie_bus_perf
From: Yijing Wang @ 2014-01-15 2:12 UTC
To: Jon Mason, Bjorn Helgaas
Cc: Yinghai Lu, Myron Stowe, linux-pci@vger.kernel.org, Jon Mason

On 2014/1/15 8:34, Jon Mason wrote:
> Not a lot of verbiage in here, but I believe this is the expected
> behavior for the "pcie_bus_perf" kernel boot parm.  With it, each pci
> device sets its MPS to the max of the parent

Yes, it's the expected behavior for "pcie_bus_perf".
pcie_write_mrrs() will additionally set MRRS to the largest supported value, to be safe.

> Then we can use an approach that basically ignores downstream requests
> and focuses exclusively on upstream requests.  In that case, all we need

Hi Jon, I do not quite understand why we can ignore downstream here. Take a model like Yinghai's PCIe topology:

 mps 256        mps 256                mps 256
root port ------Switch port(UP) -------Switch port(DP) A --------PCIe Endpoint Device (mps = 128)
                      |     <------- Read Request to upstream is safe because MRRS is set to a proper value.
                      |     <------- TLP payload won't exceed mps=128 as a transmitter, so this is also safe.
                      |     -------> Downstream TLPs such as read completions and other TLP writes to the PCIe EP device.
                      |              My question is here, how can we ensure Downstream is safe?
                      |
                      |
                      |-------Switch port(DP) B

Sorry to disturb you. I would appreciate any advice you can give me. Thanks!

--
Thanks!
Yijing
* Re: about mpss with pcie_bus_perf
From: Jon Mason @ 2014-01-15 18:18 UTC
To: Yijing Wang
Cc: Bjorn Helgaas, Yinghai Lu, Myron Stowe, linux-pci@vger.kernel.org, Jon Mason

On Tue, Jan 14, 2014 at 7:12 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> My question is here, how can we ensure Downstream is safe?

If all inter-device communication is removed, then the only
communication is CPU, Endpoint, and switches in-between.  Going from
CPU to Endpoint, the MPS is actually going to be the Cache Line size.
Since the Cache line size is 64B on x86 and most other architectures,
there is no worry that the endpoint will get a PCIE packet larger than
the MPS.  Also, using the MRRS to clamp down the endpoint to the MPS
of the switches should ensure no reads larger than the MPS.  Going
from Endpoint to CPU, we must ensure that all switches have a MPSS
large enough for any device under them.  If not, then we must clamp
down the Endpoint MPS.

If all of this works, then we can ensure a much larger MPS for all of
the PCI devices under a switch and not be bound by the smallest MPSS
of an endpoint on the switch.

Thanks,
Jon
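
Editor's sketch: Jon's argument covers three kinds of traffic on the path between CPU and endpoint, and each can be written as a simple inequality. The function below is a simplified model under the assumptions stated in his message: no peer-to-peer traffic, CPU writes no larger than a 64-byte cache line, and a read completion that is never larger than either the request (bounded by MRRS) or the completer's MPS. The names `path_check` and `CACHE_LINE` are hypothetical.

```python
CACHE_LINE = 64  # bytes; the CPU write size assumed in Jon's message


def path_check(port_mps, endpoint_mps, endpoint_mrrs, cpu_write=CACHE_LINE):
    """Check one CPU <-> endpoint path.

    port_mps lists the MPS of every port on the path, root port first.
    """
    # Receivers of downstream traffic: every port below the root, then the endpoint.
    downstream_receivers = list(port_mps[1:]) + [endpoint_mps]

    # Endpoint to CPU: the endpoint sends up to its own MPS, each port must accept it.
    upstream_ok = all(endpoint_mps <= mps for mps in port_mps)

    # CPU to endpoint writes: bounded by the cache line size.
    writes_ok = all(cpu_write <= mps for mps in downstream_receivers)

    # Read completions: bounded by the request size and by the completer's MPS.
    completion = min(endpoint_mrrs, port_mps[0])
    completions_ok = all(completion <= mps for mps in downstream_receivers)

    return {"upstream": upstream_ok, "cpu_writes": writes_ok,
            "completions": completions_ok}


# Yijing's diagram: three ports at 256, endpoint at 128, MRRS clamped to the MPS.
report = path_check(port_mps=[256, 256, 256], endpoint_mps=128, endpoint_mrrs=128)
print(report)
```

With MRRS clamped to the endpoint's MPS all three checks pass in this model, which is the case Yijing asked about. The model says nothing about peer-to-peer transfers, which the commit log explicitly excludes.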
* Re: about mpss with pcie_bus_perf
From: Yijing Wang @ 2014-01-16 1:56 UTC
To: Jon Mason
Cc: Bjorn Helgaas, Yinghai Lu, Myron Stowe, linux-pci@vger.kernel.org, Jon Mason

> If all of this works, then we can ensure a much larger MPS for all of
> the PCI devices under a switch and not be bound by the smallest MPSS
> of an endpoint on the switch.

Hi Jon, I got it. Thanks for your explanation.

--
Thanks!
Yijing
* Re: about mpss with pcie_bus_perf
From: Yinghai Lu @ 2014-01-16 4:27 UTC
To: Jon Mason
Cc: Yijing Wang, Bjorn Helgaas, Myron Stowe, linux-pci@vger.kernel.org, Jon Mason

On Wed, Jan 15, 2014 at 10:18 AM, Jon Mason <jdmason@kudzu.us> wrote:
>
> If all inter-device communication is removed, then the only
> communication is CPU, Endpoint, and switches in-between.  Going from
> CPU to Endpoint, the MPS is actually going to be the Cache Line size.
> Since the Cache line size is 64B on x86 and most other architectures,
> there is no worry that the endpoint will get a PCIE packet larger than
> the MPS.  Also, using the MRRS to clamp down the endpoint to the MPS
> of the switches should ensure no reads larger than the MPS.  Going
> from Endpoint to CPU, we must ensure that all switches have a MPSS
> large enough for any device under them.  If not, then we must clamp
> down the Endpoint MPS.
>
> If all of this works, then we can ensure a much larger MPS for all of
> the PCI devices under a switch and not be bound by the smallest MPSS
> of an endpoint on the switch.

I'm confused by the above statement.

On a system with PCIe hotplug support, the BIOS sets the root port MPS to 256
and the end device MPS to 256 during POST.

After hot-removing and hot-adding cards, the new device's MPS will be the
default, 128. When the driver puts load on the end device, we get lots of
AER errors about TLPs etc.

After changing the root port MPS and the end device MPS to 128, we do not
get AER errors anymore.

So the question is: should root port MPSS 256 and end device MPSS 128
work well without any problem?

Also, I have noticed that the BIOS sets MPSS to 256 and MRRS to 512.
So what is the reason for the current pcie_bus_perf code limiting MRRS to MPSS?

Thanks

Yinghai
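
Editor's sketch: the MRRS question can be put into numbers. In this simplified model a single read completion is no larger than the read request (bounded by MRRS) and no larger than the completer's MPS, and it overruns the requester if it exceeds the requester's own MPS. The function names are hypothetical, and the assumption that a hot-added endpoint has MRRS 512 is taken from the BIOS setting reported in the message, not from any log. The result does not establish that this is what caused the AER errors; it only shows that the reported observations are consistent with the model.

```python
def largest_completion(mrrs, completer_mps):
    """Upper bound on the payload of one read completion, in bytes."""
    return min(mrrs, completer_mps)


def overruns(root_mps, endpoint_mps, mrrs):
    """True if a completion from the root could exceed what the endpoint accepts."""
    return largest_completion(mrrs, root_mps) > endpoint_mps


# (root port MPS, endpoint MPS, endpoint MRRS) for three configurations.
scenarios = {
    "root 256, endpoint 128, MRRS 512": (256, 128, 512),  # after hot-add, as reported
    "root 128, endpoint 128, MRRS 512": (128, 128, 512),  # both lowered to 128
    "root 256, endpoint 128, MRRS 128": (256, 128, 128),  # MRRS limited to the MPS
}

results = {name: overruns(*values) for name, values in scenarios.items()}
for name, bad in results.items():
    print(f"{name}: {'completion may exceed endpoint MPS' if bad else 'ok'}")
```

Only the first configuration can produce an oversized completion in this model. Lowering both MPS values to 128 avoids it at the cost of smaller payloads everywhere, while limiting MRRS to the endpoint's MPS avoids it and leaves the root port at 256.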