* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
[not found] <187a1a7d-80bd-a0e9-a0d9-7fc53bff8907@linux.intel.com>
@ 2019-10-22 12:59 ` Bjorn Helgaas
2019-10-29 9:31 ` Dilip Kota
2019-10-29 10:42 ` Rafael J. Wysocki
0 siblings, 2 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-22 12:59 UTC
To: Dilip Kota
Cc: Andrew Murray, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm
[+cc Rafael, linux-pm, beginning of discussion at
https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > of changing link width and speed on the fly.
> > Please add more details about why this is needed. Since you're adding
> > sysfs files, it sounds like it's not actually the *driver* that needs
> > this; it's something in userspace?
> We have use cases to change the link speed and width on the fly.
> One is EMI check and other is power saving. Some battery backed
> applications have to switch PCIe link from higher GEN to GEN1 and
> width to x1. During the cases like external power supply got
> disconnected or broken. Once external power supply is connected then
> switch PCIe link to higher GEN and width.
That sounds plausible, but of course nothing there is specific to the
Intel Gateway, so we should implement this generically so it would
work on all hardware.
I'm not sure what the interface should look like -- should it be a
low-level interface as you propose where userspace would have to
identify each link of interest, or is there some system-wide
power/performance knob that could tune all links? Cc'd Rafael and
linux-pm in case they have ideas.
Bjorn
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-22 12:59 ` [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link Bjorn Helgaas
@ 2019-10-29 9:31 ` Dilip Kota
2019-10-30 22:14 ` Bjorn Helgaas
2019-10-29 10:42 ` Rafael J. Wysocki
1 sibling, 1 reply; 13+ messages in thread
From: Dilip Kota @ 2019-10-29 9:31 UTC
To: Bjorn Helgaas
Cc: Andrew Murray, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm
On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> [+cc Rafael, linux-pm, beginning of discussion at
> https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
>
> On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
>> On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
>>> On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
>>>> On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
>>>>> PCIe RC driver on Intel Gateway SoCs have a requirement
>>>>> of changing link width and speed on the fly.
>>> Please add more details about why this is needed. Since you're adding
>>> sysfs files, it sounds like it's not actually the *driver* that needs
>>> this; it's something in userspace?
>> We have use cases to change the link speed and width on the fly.
>> One is EMI check and other is power saving. Some battery backed
>> applications have to switch PCIe link from higher GEN to GEN1 and
>> width to x1. During the cases like external power supply got
>> disconnected or broken. Once external power supply is connected then
>> switch PCIe link to higher GEN and width.
> That sounds plausible, but of course nothing there is specific to the
> Intel Gateway, so we should implement this generically so it would
> work on all hardware.
Agree.
>
> I'm not sure what the interface should look like -- should it be a
> low-level interface as you propose where userspace would have to
> identify each link of interest, or is there some system-wide
> power/performance knob that could tune all links? Cc'd Rafael and
> linux-pm in case they have ideas.
To my knowledge, sysfs is the appropriate way to go.
If there are better knobs for this, suggestions would be helpful.
Regards,
Dilip
>
> Bjorn
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-22 12:59 ` [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link Bjorn Helgaas
2019-10-29 9:31 ` Dilip Kota
@ 2019-10-29 10:42 ` Rafael J. Wysocki
2019-10-29 12:36 ` Bjorn Helgaas
1 sibling, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2019-10-29 10:42 UTC
To: Bjorn Helgaas
Cc: Dilip Kota, Andrew Murray, Jingoo Han, gustavo.pimentel,
Lorenzo Pieralisi, Rob Herring, martin.blumenstingl, Linux PCI,
Christoph Hellwig, devicetree@vger.kernel.org,
Linux Kernel Mailing List, Shevchenko, Andriy, cheol.yong.kim,
chuanhua.lei, qi-ming.wu, Rafael J. Wysocki, Linux PM
On Tue, Oct 22, 2019 at 2:59 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rafael, linux-pm, beginning of discussion at
> https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
>
> On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > of changing link width and speed on the fly.
> > > Please add more details about why this is needed. Since you're adding
> > > sysfs files, it sounds like it's not actually the *driver* that needs
> > > this; it's something in userspace?
>
> > We have use cases to change the link speed and width on the fly.
> > One is EMI check and other is power saving. Some battery backed
> > applications have to switch PCIe link from higher GEN to GEN1 and
> > width to x1. During the cases like external power supply got
> > disconnected or broken. Once external power supply is connected then
> > switch PCIe link to higher GEN and width.
>
> That sounds plausible, but of course nothing there is specific to the
> Intel Gateway, so we should implement this generically so it would
> work on all hardware.
>
> I'm not sure what the interface should look like -- should it be a
> low-level interface as you propose where userspace would have to
> identify each link of interest, or is there some system-wide
> power/performance knob that could tune all links? Cc'd Rafael and
> linux-pm in case they have ideas.
Frankly, I need some time to think about this and, in case you are
wondering about whether or not it has been discussed with me already,
it hasn't.
At this point I can only say that since we have an ASPM interface,
which IMO is not fantastic, it may be good to come up with a common
link management interface.
Cheers!
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-29 10:42 ` Rafael J. Wysocki
@ 2019-10-29 12:36 ` Bjorn Helgaas
0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-29 12:36 UTC
To: Rafael J. Wysocki
Cc: Dilip Kota, Andrew Murray, Jingoo Han, gustavo.pimentel,
Lorenzo Pieralisi, Rob Herring, martin.blumenstingl, Linux PCI,
Christoph Hellwig, devicetree@vger.kernel.org,
Linux Kernel Mailing List, Shevchenko, Andriy, cheol.yong.kim,
chuanhua.lei, qi-ming.wu, Rafael J. Wysocki, Linux PM,
Heiner Kallweit
[+cc Heiner for ASPM conversation]
On Tue, Oct 29, 2019 at 11:42:53AM +0100, Rafael J. Wysocki wrote:
> On Tue, Oct 22, 2019 at 2:59 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > [+cc Rafael, linux-pm, beginning of discussion at
> > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> >
> > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > of changing link width and speed on the fly.
> > > > Please add more details about why this is needed. Since you're adding
> > > > sysfs files, it sounds like it's not actually the *driver* that needs
> > > > this; it's something in userspace?
> >
> > > We have use cases to change the link speed and width on the fly.
> > > One is EMI check and other is power saving. Some battery backed
> > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > width to x1. During the cases like external power supply got
> > > disconnected or broken. Once external power supply is connected then
> > > switch PCIe link to higher GEN and width.
> >
> > That sounds plausible, but of course nothing there is specific to the
> > Intel Gateway, so we should implement this generically so it would
> > work on all hardware.
> >
> > I'm not sure what the interface should look like -- should it be a
> > low-level interface as you propose where userspace would have to
> > identify each link of interest, or is there some system-wide
> > power/performance knob that could tune all links? Cc'd Rafael and
> > linux-pm in case they have ideas.
>
> Frankly, I need some time to think about this and, in case you are
> wondering about whether or not it has been discussed with me already,
> it hasn't.
>
> At this point I can only say that since we have an ASPM interface,
> which IMO is not fantastic, it may be good to come up with a common
> link management interface.
The ASPM interface hasn't been merged yet, so if you have better
ideas, now is the time. That one is definitely very low-level, partly
because the first use case is working around defects in a specific
device.
Some sort of unification of link management does sound like a good
idea.
Bjorn
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-29 9:31 ` Dilip Kota
@ 2019-10-30 22:14 ` Bjorn Helgaas
2019-10-30 23:31 ` Rafael J. Wysocki
2019-10-31 10:47 ` Dilip Kota
0 siblings, 2 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-30 22:14 UTC
To: Dilip Kota
Cc: Andrew Murray, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm, Rajat Jain,
Heiner Kallweit
[+cc Heiner, Rajat]
On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > [+cc Rafael, linux-pm, beginning of discussion at
> > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> >
> > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > of changing link width and speed on the fly.
> > > > Please add more details about why this is needed. Since you're adding
> > > > sysfs files, it sounds like it's not actually the *driver* that needs
> > > > this; it's something in userspace?
> > > We have use cases to change the link speed and width on the fly.
> > > One is EMI check and other is power saving. Some battery backed
> > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > width to x1. During the cases like external power supply got
> > > disconnected or broken. Once external power supply is connected then
> > > switch PCIe link to higher GEN and width.
> > That sounds plausible, but of course nothing there is specific to the
> > Intel Gateway, so we should implement this generically so it would
> > work on all hardware.
> Agree.
> >
> > I'm not sure what the interface should look like -- should it be a
> > low-level interface as you propose where userspace would have to
> > identify each link of interest, or is there some system-wide
> > power/performance knob that could tune all links? Cc'd Rafael and
> > linux-pm in case they have ideas.
>
> To my knowledge sysfs is the appropriate way to go.
> If there are any other best possible knobs, will be helpful.
I agree sysfs is the right place for it; my question was whether we
should have files like:
/sys/.../0000:00:1f.3/pcie_speed
/sys/.../0000:00:1f.3/pcie_width
as I think this patch would add (BTW, please include sample paths like
the above in the commit log), or whether there should be a more global
thing that would affect all the links in the system.
I think the low-level files like you propose would be better because
one might want to tune link performance differently for different
types of devices and workloads.
We also have to decide if these files should be associated with the
device at the upstream or downstream end of the link. For ASPM, the
current proposal [1] has the files at the downstream end on the theory
that the GPU, NIC, NVMe device, etc is the user-recognizable one.
Also, neither ASPM nor link speed/width make any sense unless there
*is* a device at the downstream end, so putting them there
automatically makes them visible only when they're useful.
Rafael had some concerns about the proposed ASPM interface [2], but I
don't know what they are yet.
For ASPM we added a "link_pm" directory, and maybe that's too
specific. Maybe it should be a generic "link_mgt" or even "pcie"
directory that could contain both the ASPM and width/speed files.
There's also a change coming to put AER stats in something like this:
/sys/.../0000:00:1f.3/aer_stats/correctable_rx_err
/sys/.../0000:00:1f.3/aer_stats/correctable_timeout
/sys/.../0000:00:1f.3/aer_stats/fatal_TLP
...
It would certainly be good to have some organizational scheme or we'll
end up with a real hodge-podge.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/aspm&id=ad46fe1c733656611788e2cd59793e891ed7ded7
[2] https://lore.kernel.org/r/CAJZ5v0jdxR4roEUC_Hs3puCzGY4ThdLsi_XcxfBUUxqruP4z7A@mail.gmail.com
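The per-device layout and the upstream/downstream question discussed above can be sketched from userspace. This is only an illustration, not the interface proposed in the patch: it reads the read-only `current_link_speed`, `current_link_width`, `max_link_speed` and `max_link_width` attributes that the kernel already documents for PCIe devices, and it assumes the usual sysfs nesting in which a device directory sits inside the directory of the bridge above it. The helper names `read_link_attrs` and `upstream_end` are hypothetical.

```python
import re
from pathlib import Path

# Existing read-only link attributes (Documentation/ABI/testing/sysfs-bus-pci).
LINK_ATTRS = ("current_link_speed", "current_link_width",
              "max_link_speed", "max_link_width")

# PCI address such as 0000:00:1f.3 (domain:bus:device.function).
BDF_RE = re.compile(r"^[0-9a-f]{4,}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def read_link_attrs(dev_dir):
    """Return the link attributes that exist in one device directory."""
    attrs = {}
    for name in LINK_ATTRS:
        path = Path(dev_dir) / name
        try:
            if path.is_file():
                attrs[name] = path.read_text().strip()
        except OSError:
            # Attribute vanished or is unreadable; skip it.
            continue
    return attrs


def upstream_end(dev_dir):
    """Return the directory of the device at the upstream end of the link.

    Returns None when the parent directory is not a PCI device, for
    example a host bridge directory such as pci0000:00.
    """
    parent = Path(dev_dir).resolve().parent
    return parent if BDF_RE.match(parent.name) else None


if __name__ == "__main__":
    # /sys/bus/pci/devices holds one symlink per device on Linux.
    for dev in sorted(Path("/sys/bus/pci/devices").glob("*")):
        attrs = read_link_attrs(dev)
        if attrs:
            up = upstream_end(dev)
            print(dev.name, "upstream:", up.name if up else "-", attrs)
```

Listing the attributes from the downstream device and resolving its parent shows both ends of a link, which is the choice the placement discussion is about.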
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-30 22:14 ` Bjorn Helgaas
@ 2019-10-30 23:31 ` Rafael J. Wysocki
2019-10-31 2:56 ` Bjorn Helgaas
2019-10-31 10:47 ` Dilip Kota
1 sibling, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2019-10-30 23:31 UTC
To: Bjorn Helgaas
Cc: Dilip Kota, Andrew Murray, Jingoo Han, gustavo.pimentel,
Lorenzo Pieralisi, Rob Herring, martin.blumenstingl, Linux PCI,
Christoph Hellwig, devicetree@vger.kernel.org,
Linux Kernel Mailing List, Shevchenko, Andriy, cheol.yong.kim,
chuanhua.lei, qi-ming.wu, Rafael J. Wysocki, Linux PM, Rajat Jain,
Heiner Kallweit
On Wed, Oct 30, 2019 at 11:14 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Heiner, Rajat]
>
> On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> > On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > > [+cc Rafael, linux-pm, beginning of discussion at
> > > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> > >
> > > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > > of changing link width and speed on the fly.
> > > > > Please add more details about why this is needed. Since you're adding
> > > > > sysfs files, it sounds like it's not actually the *driver* that needs
> > > > > this; it's something in userspace?
> > > > We have use cases to change the link speed and width on the fly.
> > > > One is EMI check and other is power saving. Some battery backed
> > > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > > width to x1. During the cases like external power supply got
> > > > disconnected or broken. Once external power supply is connected then
> > > > switch PCIe link to higher GEN and width.
> > > That sounds plausible, but of course nothing there is specific to the
> > > Intel Gateway, so we should implement this generically so it would
> > > work on all hardware.
> > Agree.
> > >
> > > I'm not sure what the interface should look like -- should it be a
> > > low-level interface as you propose where userspace would have to
> > > identify each link of interest, or is there some system-wide
> > > power/performance knob that could tune all links? Cc'd Rafael and
> > > linux-pm in case they have ideas.
> >
> > To my knowledge sysfs is the appropriate way to go.
> > If there are any other best possible knobs, will be helpful.
>
> I agree sysfs is the right place for it; my question was whether we
> should have files like:
>
> /sys/.../0000:00:1f.3/pcie_speed
> /sys/.../0000:00:1f.3/pcie_width
>
> as I think this patch would add (BTW, please include sample paths like
> the above in the commit log), or whether there should be a more global
> thing that would affect all the links in the system.
>
> I think the low-level files like you propose would be better because
> one might want to tune link performance differently for different
> types of devices and workloads.
>
> We also have to decide if these files should be associated with the
> device at the upstream or downstream end of the link. For ASPM, the
> current proposal [1] has the files at the downstream end on the theory
> that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> Also, neither ASPM nor link speed/width make any sense unless there
> *is* a device at the downstream end, so putting them there
> automatically makes them visible only when they're useful.
>
> Rafael had some concerns about the proposed ASPM interface [2], but I
> don't know what they are yet.
I was talking about the existing ASPM interface in sysfs. The new one
I still have to review, but I'm kind of wondering what about people
who used the old one? Would it be supported going forward?
> For ASPM we added a "link_pm" directory, and maybe that's too
> specific. Maybe it should be a generic "link_mgt" or even "pcie"
> directory that could contain both the ASPM and width/speed files.
>
> There's also a change coming to put AER stats in something like this:
>
> /sys/.../0000:00:1f.3/aer_stats/correctable_rx_err
> /sys/.../0000:00:1f.3/aer_stats/correctable_timeout
> /sys/.../0000:00:1f.3/aer_stats/fatal_TLP
> ...
>
> It would certainly be good to have some organizational scheme or we'll
> end up with a real hodge-podge.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/aspm&id=ad46fe1c733656611788e2cd59793e891ed7ded7
> [2] https://lore.kernel.org/r/CAJZ5v0jdxR4roEUC_Hs3puCzGY4ThdLsi_XcxfBUUxqruP4z7A@mail.gmail.com
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-30 23:31 ` Rafael J. Wysocki
@ 2019-10-31 2:56 ` Bjorn Helgaas
2019-10-31 9:13 ` Rafael J. Wysocki
0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-31 2:56 UTC
To: Rafael J. Wysocki
Cc: Dilip Kota, Andrew Murray, Jingoo Han, gustavo.pimentel,
Lorenzo Pieralisi, Rob Herring, martin.blumenstingl, Linux PCI,
Christoph Hellwig, devicetree@vger.kernel.org,
Linux Kernel Mailing List, Shevchenko, Andriy, cheol.yong.kim,
chuanhua.lei, qi-ming.wu, Rafael J. Wysocki, Linux PM, Rajat Jain,
Heiner Kallweit
On Thu, Oct 31, 2019 at 12:31:44AM +0100, Rafael J. Wysocki wrote:
> On Wed, Oct 30, 2019 at 11:14 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > [+cc Heiner, Rajat]
> >
> > On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> > > On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > > > [+cc Rafael, linux-pm, beginning of discussion at
> > > > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> > > >
> > > > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > > > of changing link width and speed on the fly.
> > > > > > Please add more details about why this is needed. Since you're adding
> > > > > > sysfs files, it sounds like it's not actually the *driver* that needs
> > > > > > this; it's something in userspace?
> > > > > We have use cases to change the link speed and width on the fly.
> > > > > One is EMI check and other is power saving. Some battery backed
> > > > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > > > width to x1. During the cases like external power supply got
> > > > > disconnected or broken. Once external power supply is connected then
> > > > > switch PCIe link to higher GEN and width.
> > > > That sounds plausible, but of course nothing there is specific to the
> > > > Intel Gateway, so we should implement this generically so it would
> > > > work on all hardware.
> > > Agree.
> > > >
> > > > I'm not sure what the interface should look like -- should it be a
> > > > low-level interface as you propose where userspace would have to
> > > > identify each link of interest, or is there some system-wide
> > > > power/performance knob that could tune all links? Cc'd Rafael and
> > > > linux-pm in case they have ideas.
> > >
> > > To my knowledge sysfs is the appropriate way to go.
> > > If there are any other best possible knobs, will be helpful.
> >
> > I agree sysfs is the right place for it; my question was whether we
> > should have files like:
> >
> > /sys/.../0000:00:1f.3/pcie_speed
> > /sys/.../0000:00:1f.3/pcie_width
> >
> > as I think this patch would add (BTW, please include sample paths like
> > the above in the commit log), or whether there should be a more global
> > thing that would affect all the links in the system.
> >
> > I think the low-level files like you propose would be better because
> > one might want to tune link performance differently for different
> > types of devices and workloads.
> >
> > We also have to decide if these files should be associated with the
> > device at the upstream or downstream end of the link. For ASPM, the
> > current proposal [1] has the files at the downstream end on the theory
> > that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> > Also, neither ASPM nor link speed/width make any sense unless there
> > *is* a device at the downstream end, so putting them there
> > automatically makes them visible only when they're useful.
> >
> > Rafael had some concerns about the proposed ASPM interface [2], but I
> > don't know what they are yet.
>
> I was talking about the existing ASPM interface in sysfs. The new one
> I still have to review, but I'm kind of wondering what about people
> who used the old one? Would it be supported going forward?
The old interface was enabled by CONFIG_PCIEASPM_DEBUG. Red Hat
doesn't enable that. Ubuntu does. I *thought* we heard from a
Canonical person who said they didn't have any tools that used it, but
I can't find that now. I don't know about SUSE.
So the idea was to drop it on the theory that nobody is using it.
Possibly that's too aggressive.
Bjorn
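Whether a given distribution kernel was built with CONFIG_PCIEASPM_DEBUG can be checked from its build configuration. The sketch below is a rough illustration only: it assumes the distribution ships the config as a text file at /boot/config-<release> (a common convention, but an assumption here), and `config_state` is a hypothetical helper name.

```python
import platform
import re
from pathlib import Path


def config_state(config_text, option):
    """Return the value assigned to a config option ('y', 'm', ...).

    Returns None when the option is absent or appears only as
    '# CONFIG_FOO is not set'.
    """
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    cfg = Path("/boot/config-" + platform.release())
    if cfg.is_file():
        state = config_state(cfg.read_text(), "CONFIG_PCIEASPM_DEBUG")
        print("CONFIG_PCIEASPM_DEBUG:", state or "not set")
    else:
        print("no kernel config found at", cfg)
```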
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-31 2:56 ` Bjorn Helgaas
@ 2019-10-31 9:13 ` Rafael J. Wysocki
2019-10-31 13:01 ` Bjorn Helgaas
0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2019-10-31 9:13 UTC
To: Bjorn Helgaas
Cc: Rafael J. Wysocki, Dilip Kota, Andrew Murray, Jingoo Han,
gustavo.pimentel, Lorenzo Pieralisi, Rob Herring,
martin.blumenstingl, Linux PCI, Christoph Hellwig,
devicetree@vger.kernel.org, Linux Kernel Mailing List,
Shevchenko, Andriy, cheol.yong.kim, chuanhua.lei, qi-ming.wu,
Linux PM, Rajat Jain, Heiner Kallweit
On Thursday, October 31, 2019 3:56:37 AM CET Bjorn Helgaas wrote:
> On Thu, Oct 31, 2019 at 12:31:44AM +0100, Rafael J. Wysocki wrote:
> > On Wed, Oct 30, 2019 at 11:14 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > > [+cc Heiner, Rajat]
> > >
> > > On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> > > > On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > > > > [+cc Rafael, linux-pm, beginning of discussion at
> > > > > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> > > > >
> > > > > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > > > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > > > > of changing link width and speed on the fly.
> > > > > > > Please add more details about why this is needed. Since you're adding
> > > > > > > sysfs files, it sounds like it's not actually the *driver* that needs
> > > > > > > this; it's something in userspace?
> > > > > > We have use cases to change the link speed and width on the fly.
> > > > > > One is EMI check and other is power saving. Some battery backed
> > > > > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > > > > width to x1. During the cases like external power supply got
> > > > > > disconnected or broken. Once external power supply is connected then
> > > > > > switch PCIe link to higher GEN and width.
> > > > > That sounds plausible, but of course nothing there is specific to the
> > > > > Intel Gateway, so we should implement this generically so it would
> > > > > work on all hardware.
> > > > Agree.
> > > > >
> > > > > I'm not sure what the interface should look like -- should it be a
> > > > > low-level interface as you propose where userspace would have to
> > > > > identify each link of interest, or is there some system-wide
> > > > > power/performance knob that could tune all links? Cc'd Rafael and
> > > > > linux-pm in case they have ideas.
> > > >
> > > > To my knowledge sysfs is the appropriate way to go.
> > > > If there are any other best possible knobs, will be helpful.
> > >
> > > I agree sysfs is the right place for it; my question was whether we
> > > should have files like:
> > >
> > > /sys/.../0000:00:1f.3/pcie_speed
> > > /sys/.../0000:00:1f.3/pcie_width
> > >
> > > as I think this patch would add (BTW, please include sample paths like
> > > the above in the commit log), or whether there should be a more global
> > > thing that would affect all the links in the system.
> > >
> > > I think the low-level files like you propose would be better because
> > > one might want to tune link performance differently for different
> > > types of devices and workloads.
> > >
> > > We also have to decide if these files should be associated with the
> > > device at the upstream or downstream end of the link. For ASPM, the
> > > current proposal [1] has the files at the downstream end on the theory
> > > that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> > > Also, neither ASPM nor link speed/width make any sense unless there
> > > *is* a device at the downstream end, so putting them there
> > > automatically makes them visible only when they're useful.
> > >
> > > Rafael had some concerns about the proposed ASPM interface [2], but I
> > > don't know what they are yet.
> >
> > I was talking about the existing ASPM interface in sysfs. The new one
> > I still have to review, but I'm kind of wondering what about people
> > who used the old one? Would it be supported going forward?
>
> The old one interface was enabled by CONFIG_PCIEASPM_DEBUG. Red Hat
> doesn't enable that. Ubuntu does. I *thought* we heard from a
> Canonical person who said they didn't have any tools that used it, but
> I can't find that now. I don't know about SUSE.
>
> So the idea was to drop it on the theory that nobody is using it.
> Possibly that's too aggressive.
Well, one problem is that the "old" (actually existing) I/F has made it
to one of my OSS EU presentation slides (I did not talk to this particular
slide, but it is there in the deck that's available for downloading), so who
knows who is going to use it. :-)
So I guess that there's a risk that needs to be taken into consideration.
What could be done, in principle, would be to make the new I/F depend on
CONFIG_PCIEASPM_DEBUG being unset and provide the "old" one when it is set.
In any case, the pcie_aspm.policy module parameter cannot be dropped, because
AFAICS there is quite a bit of user space using it (e.g. the TLP power
management tool).
Cheers!
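For reference, user space typically reaches the pcie_aspm.policy parameter through /sys/module/pcie_aspm/parameters/policy, which lists the available policies with the active one in square brackets. The following is a minimal sketch of how a tool might parse that text; `parse_aspm_policy` is a hypothetical helper name, and the sample string in the usage note is an assumed typical value rather than output captured from any particular system.

```python
import re
from pathlib import Path

POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def parse_aspm_policy(text):
    """Split the policy text into (active_policy, available_policies)."""
    active = None
    available = []
    for token in text.split():
        bracketed = re.fullmatch(r"\[(.+)\]", token)
        if bracketed:
            # The bracketed entry is the policy currently in effect.
            active = bracketed.group(1)
            available.append(active)
        else:
            available.append(token)
    return active, available


if __name__ == "__main__":
    if POLICY_PATH.is_file():
        active, available = parse_aspm_policy(POLICY_PATH.read_text())
        print("active:", active, "available:", available)
    else:
        print("pcie_aspm policy parameter not present")
```

Given the text `[default] performance powersave powersupersave`, the helper returns `default` as the active policy together with all four names.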
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-30 22:14 ` Bjorn Helgaas
2019-10-30 23:31 ` Rafael J. Wysocki
@ 2019-10-31 10:47 ` Dilip Kota
2019-10-31 13:22 ` Bjorn Helgaas
1 sibling, 1 reply; 13+ messages in thread
From: Dilip Kota @ 2019-10-31 10:47 UTC
To: Bjorn Helgaas
Cc: Andrew Murray, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm, Rajat Jain,
Heiner Kallweit
On 10/31/2019 6:14 AM, Bjorn Helgaas wrote:
> [+cc Heiner, Rajat]
>
> On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
>> On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
>>> [+cc Rafael, linux-pm, beginning of discussion at
>>> https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
>>>
>>> On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
>>>> On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
>>>>> On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
>>>>>> On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
>>>>>>> PCIe RC driver on Intel Gateway SoCs have a requirement
>>>>>>> of changing link width and speed on the fly.
>>>>> Please add more details about why this is needed. Since you're adding
>>>>> sysfs files, it sounds like it's not actually the *driver* that needs
>>>>> this; it's something in userspace?
>>>> We have use cases to change the link speed and width on the fly.
>>>> One is EMI check and other is power saving. Some battery backed
>>>> applications have to switch PCIe link from higher GEN to GEN1 and
>>>> width to x1. During the cases like external power supply got
>>>> disconnected or broken. Once external power supply is connected then
>>>> switch PCIe link to higher GEN and width.
>>> That sounds plausible, but of course nothing there is specific to the
>>> Intel Gateway, so we should implement this generically so it would
>>> work on all hardware.
>> Agree.
>>> I'm not sure what the interface should look like -- should it be a
>>> low-level interface as you propose where userspace would have to
>>> identify each link of interest, or is there some system-wide
>>> power/performance knob that could tune all links? Cc'd Rafael and
>>> linux-pm in case they have ideas.
>> To my knowledge sysfs is the appropriate way to go.
>> If there are any other best possible knobs, will be helpful.
> I agree sysfs is the right place for it; my question was whether we
> should have files like:
>
> /sys/.../0000:00:1f.3/pcie_speed
> /sys/.../0000:00:1f.3/pcie_width
>
> as I think this patch would add (BTW, please include sample paths like
> the above in the commit log), or whether there should be a more global
> thing that would affect all the links in the system.
Sure, I will add them.
>
> I think the low-level files like you propose would be better because
> one might want to tune link performance differently for different
> types of devices and workloads.
>
> We also have to decide if these files should be associated with the
> device at the upstream or downstream end of the link. For ASPM, the
> current proposal [1] has the files at the downstream end on the theory
> that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> Also, neither ASPM nor link speed/width make any sense unless there
> *is* a device at the downstream end, so putting them there
> automatically makes them visible only when they're useful.
This patch places the speed and width in the host controller directory:
/sys/.../xxx.pcie/pcie_speed
/sys/.../xxx.pcie/pcie_width
I agree with you partially. A couple of points led me to keep the speed
and width change entries in the controller directory:
-- If the speed/width entries are on the device node, software ends up
traversing from the device to the controller to do the operations.
-- Speed and width changes are performed at the controller level.
-- Keeping speed and width in the controller directory gives the user the
perspective of changing them only once, irrespective of the number of
devices.
-- For speed and link width change in the Synopsys PCIe controller,
specific registers need to be configured. This prevents or complicates
adding the speed and width change functionality in pci-sysfs or the PCI
framework.
Regards,
Dilip
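The upstream/downstream placement question can be illustrated from
userspace. In sysfs, a PCI device directory is nested inside the directory
of the bridge above it, so the upstream end of a link is the parent
directory of the downstream device. Below is a minimal Python sketch that
only manipulates path strings; the topology in it is a made-up example, the
function name link_ends is hypothetical, and the pcie_speed/pcie_width
attributes discussed in this thread are a proposal, not an existing ABI.

```python
import posixpath
import re

# A PCI function address as it appears in sysfs: domain:bus:device.function
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")

def link_ends(device_path):
    """Return (upstream, downstream) PCI addresses for the link above device_path.

    upstream is None when the parent directory is not a PCI function,
    for example a host bridge directory such as pci0000:00.
    """
    path = device_path.rstrip("/")
    downstream = posixpath.basename(path)
    if not PCI_ADDR.match(downstream):
        raise ValueError("not a PCI device directory: %r" % device_path)
    parent = posixpath.basename(posixpath.dirname(path))
    upstream = parent if PCI_ADDR.match(parent) else None
    return upstream, downstream

# Hypothetical topology: an endpoint at 0000:01:00.0 below Root Port 0000:00:1c.0
nic = "/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0"
print(link_ends(nic))
```

With files at the downstream end, a tool would write to the endpoint's
directory and the kernel would locate the upstream port; with files at the
upstream end, the tool would have to do the parent lookup shown here itself.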
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-31 9:13 ` Rafael J. Wysocki
@ 2019-10-31 13:01 ` Bjorn Helgaas
0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-31 13:01 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Rafael J. Wysocki, Dilip Kota, Andrew Murray, Jingoo Han,
gustavo.pimentel, Lorenzo Pieralisi, Rob Herring,
martin.blumenstingl, Linux PCI, Christoph Hellwig,
devicetree@vger.kernel.org, Linux Kernel Mailing List,
Shevchenko, Andriy, cheol.yong.kim, chuanhua.lei, qi-ming.wu,
Linux PM, Rajat Jain, Heiner Kallweit
On Thu, Oct 31, 2019 at 10:13:11AM +0100, Rafael J. Wysocki wrote:
> On Thursday, October 31, 2019 3:56:37 AM CET Bjorn Helgaas wrote:
> > On Thu, Oct 31, 2019 at 12:31:44AM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Oct 30, 2019 at 11:14 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > Rafael had some concerns about the proposed ASPM interface [2], but I
> > > > don't know what they are yet.
> > >
> > > I was talking about the existing ASPM interface in sysfs. The new one
> > > I still have to review, but I'm kind of wondering what about people
> > > who used the old one? Would it be supported going forward?
> >
> > The old interface was enabled by CONFIG_PCIEASPM_DEBUG. Red Hat
> > doesn't enable that. Ubuntu does. I *thought* we heard from a
> > Canonical person who said they didn't have any tools that used it, but
> > I can't find that now. I don't know about SUSE.
> >
> > So the idea was to drop it on the theory that nobody is using it.
> > Possibly that's too aggressive.
>
> Well, one problem is that the "old" (actually existing) I/F has made it
> to one of my OSS EU presentation slides (I did not talk to this particular
> slide, but it is there in the deck that's available for downloading), so who
> knows who is going to use it. :-)
>
> So I guess that there's a risk that needs to be taken into consideration.
>
> What could be done, in principle, would be to make the new I/F depend on
> CONFIG_PCIEASPM_DEBUG being unset and provide the "old" one when it is set.
I would prefer to enable the new interface unconditionally to make it
easier for userspace tools like powertop to use it.
I think the existing and new interfaces could coexist, with the
existing interface being enabled by CONFIG_PCIEASPM_DEBUG as it is
today. The patch that removes the existing interface is the last in
the series and could easily be dropped.
> In any case, the pcie_aspm.policy module parameter cannot be dropped, because
> AFAICS there is quite a bit of user space using it (e.g. TLP).
What is TLP? Since CONFIG_PCIEASPM is a bool, aspm.o is built in
statically if enabled, so pcie_aspm.policy is effectively a boot-time
kernel parameter, right? We don't have a plan to remove it.
Bjorn
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-31 10:47 ` Dilip Kota
@ 2019-10-31 13:22 ` Bjorn Helgaas
2019-11-01 5:47 ` Dilip Kota
0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-31 13:22 UTC (permalink / raw)
To: Dilip Kota
Cc: Andrew Murray, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm, Rajat Jain,
Heiner Kallweit
On Thu, Oct 31, 2019 at 06:47:10PM +0800, Dilip Kota wrote:
> On 10/31/2019 6:14 AM, Bjorn Helgaas wrote:
> > On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> > > On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > > > [+cc Rafael, linux-pm, beginning of discussion at
> > > > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> > > >
> > > > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > > > of changing link width and speed on the fly.
> > > > > > Please add more details about why this is needed. Since
> > > > > > you're adding sysfs files, it sounds like it's not
> > > > > > actually the *driver* that needs this; it's something in
> > > > > > userspace?
> > > > > We have use cases to change the link speed and width on the fly.
> > > > > One is EMI check and other is power saving. Some battery backed
> > > > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > > > width to x1. During the cases like external power supply got
> > > > > disconnected or broken. Once external power supply is connected then
> > > > > switch PCIe link to higher GEN and width.
> > > > That sounds plausible, but of course nothing there is specific to the
> > > > Intel Gateway, so we should implement this generically so it would
> > > > work on all hardware.
> > > Agree.
> > > > I'm not sure what the interface should look like -- should it be a
> > > > low-level interface as you propose where userspace would have to
> > > > identify each link of interest, or is there some system-wide
> > > > power/performance knob that could tune all links? Cc'd Rafael and
> > > > linux-pm in case they have ideas.
> > > To my knowledge sysfs is the appropriate way to go.
> > > If there are any other best possible knobs, will be helpful.
> > I agree sysfs is the right place for it; my question was whether we
> > should have files like:
> >
> > /sys/.../0000:00:1f.3/pcie_speed
> > /sys/.../0000:00:1f.3/pcie_width
> >
> > as I think this patch would add (BTW, please include sample paths like
> > the above in the commit log), or whether there should be a more global
> > thing that would affect all the links in the system.
> Sure, i will add them.
> >
> > I think the low-level files like you propose would be better because
> > one might want to tune link performance differently for different
> > types of devices and workloads.
> >
> > We also have to decide if these files should be associated with the
> > device at the upstream or downstream end of the link. For ASPM, the
> > current proposal [1] has the files at the downstream end on the theory
> > that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> > Also, neither ASPM nor link speed/width make any sense unless there
> > *is* a device at the downstream end, so putting them there
> > automatically makes them visible only when they're useful.
>
> This patch places the speed and width in the host controller directory.
> /sys/.../xxx.pcie/pcie_speed
> /sys/.../xxx.pcie/pcie_width
>
> I agree with you partially, because i am having couple of points
> making me to keep speed and width change entries in controller
> directory:
>
> -- For changing the speed/width with device node, software ends up
> traversing to the controller from the device and do the
> operations.
> -- Change speed and width are performed at controller level,
The controller is effectively a Root Complex, which may contain
several Root Ports. I have the impression that the Synopsys
controller only supports a single Root Port, but that's just a detail
of the Synopsys implementation. I think it should be possible to
configure the width/speed of each Root Port individually.
> -- Keeping speed and width in controller gives a perspective (to the
> user) of changing them only once irrespective of no. of devices.
What if there's a switch? If we change the width/speed of the link
between the Root Port and the Switch Upstream Port, that doesn't do
anything about the links from the Switch Downstream Ports.
> -- For speed and link change in Synopsys PCIe controller, specific
> registers need to be configured. This prevents or complicates
> adding the speed and width change functionality in pci-sysfs or
> pci framework.
Don't the Link Control and related registers in PCIe spec give us
enough control to manage the link width/speed of *all* links,
including those from Root Ports and Switch Downstream Ports?
If the Synopsys controller requires controller-specific registers,
that sounds to me like it doesn't quite conform to the spec. Maybe
that means we would need some sort of quirk or controller callback?
Bjorn
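One way to picture the "quirk or controller callback" idea is a generic
path that programs the standard registers, plus an optional
controller-supplied hook that takes over when the hardware needs its own
registers. The sketch below is Python rather than kernel C, and every name
in it (HostBridge, set_link_speed, dwc_set_speed) is hypothetical; it shows
only the dispatch shape under that assumption, not an actual kernel
interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class HostBridge:
    name: str
    # Optional controller-specific hook; None means the standard registers suffice.
    set_speed_hook: Optional[Callable[["HostBridge", int], None]] = None
    log: List[str] = field(default_factory=list)

def generic_set_speed(bridge, gen):
    # Stand-in for writing Target Link Speed in Link Control 2 and retraining.
    bridge.log.append("generic: target speed gen%d, retrain" % gen)

def set_link_speed(bridge, gen):
    """Use the controller hook when one is registered, else the generic path."""
    if bridge.set_speed_hook is not None:
        bridge.set_speed_hook(bridge, gen)
    else:
        generic_set_speed(bridge, gen)

def dwc_set_speed(bridge, gen):
    # Hypothetical controller-specific sequence.
    bridge.log.append("controller-specific: vendor registers for gen%d" % gen)

standard = HostBridge("standard")
quirky = HostBridge("quirky", set_speed_hook=dwc_set_speed)
set_link_speed(standard, 1)
set_link_speed(quirky, 1)
print(standard.log, quirky.log)
```

The point of this shape is that the sysfs-facing code stays generic and
only the last step differs per controller.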
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-10-31 13:22 ` Bjorn Helgaas
@ 2019-11-01 5:47 ` Dilip Kota
2019-11-01 11:30 ` Andrew Murray
0 siblings, 1 reply; 13+ messages in thread
From: Dilip Kota @ 2019-11-01 5:47 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Andrew Murray, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm, Rajat Jain,
Heiner Kallweit
On 10/31/2019 9:22 PM, Bjorn Helgaas wrote:
> On Thu, Oct 31, 2019 at 06:47:10PM +0800, Dilip Kota wrote:
>> On 10/31/2019 6:14 AM, Bjorn Helgaas wrote:
>>> On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
>>>> On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
>>>>> [+cc Rafael, linux-pm, beginning of discussion at
>>>>> https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
>>>>>
>>>>> On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
>>>>>> On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
>>>>>>> On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
>>>>>>>> On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
>>>>>>>>> PCIe RC driver on Intel Gateway SoCs have a requirement
>>>>>>>>> of changing link width and speed on the fly.
>>>>>>> Please add more details about why this is needed. Since
>>>>>>> you're adding sysfs files, it sounds like it's not
>>>>>>> actually the *driver* that needs this; it's something in
>>>>>>> userspace?
>>>>>> We have use cases to change the link speed and width on the fly.
>>>>>> One is EMI check and other is power saving. Some battery backed
>>>>>> applications have to switch PCIe link from higher GEN to GEN1 and
>>>>>> width to x1. During the cases like external power supply got
>>>>>> disconnected or broken. Once external power supply is connected then
>>>>>> switch PCIe link to higher GEN and width.
>>>>> That sounds plausible, but of course nothing there is specific to the
>>>>> Intel Gateway, so we should implement this generically so it would
>>>>> work on all hardware.
>>>> Agree.
>>>>> I'm not sure what the interface should look like -- should it be a
>>>>> low-level interface as you propose where userspace would have to
>>>>> identify each link of interest, or is there some system-wide
>>>>> power/performance knob that could tune all links? Cc'd Rafael and
>>>>> linux-pm in case they have ideas.
>>>> To my knowledge sysfs is the appropriate way to go.
>>>> If there are any other best possible knobs, will be helpful.
>>> I agree sysfs is the right place for it; my question was whether we
>>> should have files like:
>>>
>>> /sys/.../0000:00:1f.3/pcie_speed
>>> /sys/.../0000:00:1f.3/pcie_width
>>>
>>> as I think this patch would add (BTW, please include sample paths like
>>> the above in the commit log), or whether there should be a more global
>>> thing that would affect all the links in the system.
>> Sure, i will add them.
>>> I think the low-level files like you propose would be better because
>>> one might want to tune link performance differently for different
>>> types of devices and workloads.
>>>
>>> We also have to decide if these files should be associated with the
>>> device at the upstream or downstream end of the link. For ASPM, the
>>> current proposal [1] has the files at the downstream end on the theory
>>> that the GPU, NIC, NVMe device, etc is the user-recognizable one.
>>> Also, neither ASPM nor link speed/width make any sense unless there
>>> *is* a device at the downstream end, so putting them there
>>> automatically makes them visible only when they're useful.
>> This patch places the speed and width in the host controller directory.
>> /sys/.../xxx.pcie/pcie_speed
>> /sys/.../xxx.pcie/pcie_width
>>
>> I agree with you partially, because i am having couple of points
>> making me to keep speed and width change entries in controller
>> directory:
>>
>> -- For changing the speed/width with device node, software ends up
>> traversing to the controller from the device and do the
>> operations.
>> -- Change speed and width are performed at controller level,
> The controller is effectively a Root Complex, which may contain
> several Root Ports. I have the impression that the Synopsys
> controller only supports a single Root Port, but that's just a detail
> of the Synopsys implementation. I think it should be possible to
> configure the width/speed of each Root Port individually.
>
>> -- Keeping speed and width in controller gives a perspective (to the
>> user) of changing them only once irrespective of no. of devices.
> What if there's a switch? If we change the width/speed of the link
> between the Root Port and the Switch Upstream Port, that doesn't do
> anything about the links from the Switch Downstream Ports.
I missed evaluating the multiple Root Port and switch scenarios; thanks
for pointing that out.
In that case, placing the link speed and width change entries in the
device node is appropriate.
Software will traverse to the respective port or bus through the device
node and make the changes.
>
>> -- For speed and link change in Synopsys PCIe controller, specific
>> registers need to be configured. This prevents or complicates
>> adding the speed and width change functionality in pci-sysfs or
>> pci framework.
> Don't the Link Control and related registers in PCIe spec give us
> enough control to manage the link width/speed of *all* links,
> including those from Root Ports and Switch Downstream Ports?
>
> If the Synopsys controller requires controller-specific registers,
> that sounds to me like it doesn't quite conform to the spec. Maybe
> that means we would need some sort of quirk or controller callback?
Yes, Synopsys requires specific register configuration for link width
resizing and speed change.
I will evaluate possible mechanisms for plugging the controller-specific
changes into the framework.
Regards,
Dilip
>
> Bjorn
* Re: [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link
2019-11-01 5:47 ` Dilip Kota
@ 2019-11-01 11:30 ` Andrew Murray
0 siblings, 0 replies; 13+ messages in thread
From: Andrew Murray @ 2019-11-01 11:30 UTC (permalink / raw)
To: Dilip Kota
Cc: Bjorn Helgaas, jingoohan1, gustavo.pimentel, lorenzo.pieralisi,
robh, martin.blumenstingl, linux-pci, hch, devicetree,
linux-kernel, andriy.shevchenko, cheol.yong.kim, chuanhua.lei,
qi-ming.wu, Rafael J. Wysocki, linux-pm, Rajat Jain,
Heiner Kallweit
On Fri, Nov 01, 2019 at 01:47:39PM +0800, Dilip Kota wrote:
>
> On 10/31/2019 9:22 PM, Bjorn Helgaas wrote:
> > On Thu, Oct 31, 2019 at 06:47:10PM +0800, Dilip Kota wrote:
> > > On 10/31/2019 6:14 AM, Bjorn Helgaas wrote:
> > > > On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> > > > > On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > > > > > [+cc Rafael, linux-pm, beginning of discussion at
> > > > > > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@linux.intel.com]
> > > > > >
> > > > > > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > > > > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > > > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > > > > > of changing link width and speed on the fly.
> > > > > > > > Please add more details about why this is needed. Since
> > > > > > > > you're adding sysfs files, it sounds like it's not
> > > > > > > > actually the *driver* that needs this; it's something in
> > > > > > > > userspace?
> > > > > > > We have use cases to change the link speed and width on the fly.
> > > > > > > One is EMI check and other is power saving. Some battery backed
> > > > > > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > > > > > width to x1. During the cases like external power supply got
> > > > > > > disconnected or broken. Once external power supply is connected then
> > > > > > > switch PCIe link to higher GEN and width.
> > > > > > That sounds plausible, but of course nothing there is specific to the
> > > > > > Intel Gateway, so we should implement this generically so it would
> > > > > > work on all hardware.
> > > > > Agree.
> > > > > > I'm not sure what the interface should look like -- should it be a
> > > > > > low-level interface as you propose where userspace would have to
> > > > > > identify each link of interest, or is there some system-wide
> > > > > > power/performance knob that could tune all links? Cc'd Rafael and
> > > > > > linux-pm in case they have ideas.
> > > > > To my knowledge sysfs is the appropriate way to go.
> > > > > If there are any other best possible knobs, will be helpful.
> > > > I agree sysfs is the right place for it; my question was whether we
> > > > should have files like:
> > > >
> > > > /sys/.../0000:00:1f.3/pcie_speed
> > > > /sys/.../0000:00:1f.3/pcie_width
> > > >
> > > > as I think this patch would add (BTW, please include sample paths like
> > > > the above in the commit log), or whether there should be a more global
> > > > thing that would affect all the links in the system.
> > > Sure, i will add them.
> > > > I think the low-level files like you propose would be better because
> > > > one might want to tune link performance differently for different
> > > > types of devices and workloads.
> > > >
> > > > We also have to decide if these files should be associated with the
> > > > device at the upstream or downstream end of the link. For ASPM, the
> > > > current proposal [1] has the files at the downstream end on the theory
> > > > that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> > > > Also, neither ASPM nor link speed/width make any sense unless there
> > > > *is* a device at the downstream end, so putting them there
> > > > automatically makes them visible only when they're useful.
> > > This patch places the speed and width in the host controller directory.
> > > /sys/.../xxx.pcie/pcie_speed
> > > /sys/.../xxx.pcie/pcie_width
> > >
> > > I agree with you partially, because i am having couple of points
> > > making me to keep speed and width change entries in controller
> > > directory:
> > >
> > > -- For changing the speed/width with device node, software ends up
> > > traversing to the controller from the device and do the
> > > operations.
> > > -- Change speed and width are performed at controller level,
> > The controller is effectively a Root Complex, which may contain
> > several Root Ports. I have the impression that the Synopsys
> > controller only supports a single Root Port, but that's just a detail
> > of the Synopsys implementation. I think it should be possible to
> > configure the width/speed of each Root Port individually.
> >
> > > -- Keeping speed and width in controller gives a perspective (to the
> > > user) of changing them only once irrespective of no. of devices.
> > What if there's a switch? If we change the width/speed of the link
> > between the Root Port and the Switch Upstream Port, that doesn't do
> > anything about the links from the Switch Downstream Ports.
> I missed to evaluate the multiple root port and switch scenarios, thanks for
> pointing it.
> Then, placing the link speed and width change entries in the device node
> will be appropriate.
> Software will traverse to the respective port or bus through the device node
> and does the changes.
> >
> > > -- For speed and link change in Synopsys PCIe controller, specific
> > > registers need to be configured. This prevents or complicates
> > > adding the speed and width change functionality in pci-sysfs or
> > > pci framework.
> > Don't the Link Control and related registers in PCIe spec give us
> > enough control to manage the link width/speed of *all* links,
> > including those from Root Ports and Switch Downstream Ports?
> >
> > If the Synopsys controller requires controller-specific registers,
> > that sounds to me like it doesn't quite conform to the spec. Maybe
> > that means we would need some sort of quirk or controller callback?
> Yes, Synopsys has specific registers configuration for link width resizing
> and speed change.
> I will evaluate the possible mechanism for plugging in the controller
> specific changes to the framework.
According to the spec, "Software is permitted to restrict the maximum speed
of Link operation and set the preferred Link speed by setting the value in the
Target Link Speed field in the Upstream component." - This is the Link Control
2 Register, and a link retrain should then be triggered.
With regards to this proposed sysfs API - I wonder if this implies we should
also disable 'Hardware Autonomous Speed Disable' to prevent a link speed
change for device specific reasons?
In my view, this means we *can* have a sysfs control for limiting the link
speed using standard PCI means - though callbacks and quirks may be needed
for host bridge controllers and similar.
With regards to link width, I can't see any obvious software-initiated means
to change the link width (the width fields are all RO) - though a device can
change its own link width so long as its 'Hardware Autonomous Width Disable' bit is
clear. So whilst there may be some benefit for the initial links of a few
host bridge controllers that may opt-in to some framework for this - such an
API wouldn't benefit the majority of links in a PCI fabric. Perhaps this
(width) should be DWC specific.
Thanks,
Andrew Murray
>
> Regards,
> Dilip
> >
> > Bjorn
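The speed-limiting sequence described above can be sketched as register
arithmetic. This Python sketch models only how the bits would be computed:
the masks mirror the PCI_EXP_LNKCTL2_TLS, PCI_EXP_LNKCTL2_HASD and
PCI_EXP_LNKCTL_RL definitions in the kernel's pci_regs.h, while the
function names limit_speed and request_retrain are hypothetical and nothing
here reads or writes config space.

```python
LNKCTL_RL = 0x0020      # Link Control: Retrain Link
LNKCTL2_TLS = 0x000F    # Link Control 2: Target Link Speed field
LNKCTL2_HASD = 0x0020   # Link Control 2: Hardware Autonomous Speed Disable

def limit_speed(lnkctl2, gen, disable_autonomous=False):
    """Return a new Link Control 2 value with Target Link Speed set to gen."""
    if not 1 <= gen <= LNKCTL2_TLS:
        raise ValueError("speed encoding out of range: %r" % gen)
    # Clear the old target speed, keep every other bit, then set the new one.
    value = (lnkctl2 & ~LNKCTL2_TLS) | gen
    if disable_autonomous:
        value |= LNKCTL2_HASD
    return value & 0xFFFF

def request_retrain(lnkctl):
    """Return a Link Control value with Retrain Link set."""
    return (lnkctl | LNKCTL_RL) & 0xFFFF

# Example: a port targeting 8 GT/s (encoding 3) limited to 2.5 GT/s (encoding 1).
print(hex(limit_speed(0x0003, 1)))
print(hex(limit_speed(0x0003, 1, disable_autonomous=True)))
print(hex(request_retrain(0x0040)))
```

In a real implementation the new Link Control 2 value would be written to
the upstream component first, followed by the Retrain Link write; whether
the autonomous-speed bit should also be set is the open question raised
above.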
Thread overview: 13+ messages
[not found] <187a1a7d-80bd-a0e9-a0d9-7fc53bff8907@linux.intel.com>
2019-10-22 12:59 ` [PATCH v4 3/3] pci: intel: Add sysfs attributes to configure pcie link Bjorn Helgaas
2019-10-29 9:31 ` Dilip Kota
2019-10-30 22:14 ` Bjorn Helgaas
2019-10-30 23:31 ` Rafael J. Wysocki
2019-10-31 2:56 ` Bjorn Helgaas
2019-10-31 9:13 ` Rafael J. Wysocki
2019-10-31 13:01 ` Bjorn Helgaas
2019-10-31 10:47 ` Dilip Kota
2019-10-31 13:22 ` Bjorn Helgaas
2019-11-01 5:47 ` Dilip Kota
2019-11-01 11:30 ` Andrew Murray
2019-10-29 10:42 ` Rafael J. Wysocki
2019-10-29 12:36 ` Bjorn Helgaas