Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Ian Kumlien <ian.kumlien@gmail.com>
Cc: linux-pci@vger.kernel.org,
	"Saheed O. Bolarinwa" <refactormyself@gmail.com>,
	Puranjay Mohan <puranjay12@gmail.com>,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>
Subject: Re: [PATCH] Use maximum latency when determining L1/L0s ASPM v2
Date: Wed, 23 Sep 2020 16:23:18 -0500	[thread overview]
Message-ID: <20200923212318.GA2295660@bjorn-Precision-5520> (raw)
In-Reply-To: <CAA85sZs5f09uh+eCcZ+2Mh4Hj=GVVncZjyGR8Ru3vBQ3Z-_nNA@mail.gmail.com>

On Wed, Sep 23, 2020 at 01:29:18AM +0200, Ian Kumlien wrote:
> On Wed, Sep 23, 2020 at 1:00 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Sep 22, 2020 at 11:02:35PM +0200, Ian Kumlien wrote:

> > > commit db3d9c4baf4ab177d87b5cd41f624f5901e7390f
> > > Author: Ian Kumlien <ian.kumlien@gmail.com>
> > > Date:   Sun Jul 26 16:01:15 2020 +0200
> > >
> > >     Use maximum latency when determining L1 ASPM
> > >
> > >     If it's not, we clear the link for the path that had too large latency.
> > >
> > >     Currently we check the maximum latency of upstream and downstream
> > >     per link, not the maximum for the path
> > >
> > >     This would work if all links have the same latency, but:
> > >     endpoint -> c -> b -> a -> root  (in the order we walk the path)
> > >
> > >     If c or b has the higest latency, it will not register
> > >
> > >     Fix this by maintaining the maximum latency value for the path
> > >
> > >     See this bugzilla for more information:
> > >     https://bugzilla.kernel.org/show_bug.cgi?id=208741
> > >
> > >     This fixes an issue for me where my desktops machines maximum bandwidth
> > >     for remote connections dropped from 933 MBit to ~40 MBit.
> > >
> > >     The bug became obvious once we enabled ASPM on all links:
> > >     66ff14e59e8a (PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges)
> >
> > I can't connect the dots here yet.  I don't see a PCIe-to-PCI/PCI-X
> > bridge in your lspci, so I can't figure out why this commit would make
> > a difference for you.
> >
> > IIUC, the problem device is 03:00.0, the Intel I211 NIC.  Here's the
> > path to it:
> >
> >   00:01.2 Root Port              to [bus 01-07]
> >   01:00.0 Switch Upstream Port   to [bus 02-07]
> >   02:03.0 Switch Downstream Port to [bus 03]
> >   03:00.0 Endpoint (Intel I211 NIC)
> >
> > And I think this is the relevant info:
> >
> >                                                     LnkCtl    LnkCtl
> >            ------DevCap-------  ----LnkCap-------  -Before-  -After--
> >   00:01.2                                L1 <32us       L1+       L1-
> >   01:00.0                                L1 <32us       L1+       L1-
> >   02:03.0                                L1 <32us       L1+       L1+
> >   03:00.0  L0s <512ns L1 <64us  L0s <2us L1 <16us  L0s- L1-  L0s- L1-
> >
> > The NIC says it can tolerate at most 512ns of L0s exit latency and at
> > most 64us of L1 exit latency.
> >
> > 02:03.0 doesn't support L0s, and the NIC itself can't exit L0s that
> > fast anyway (it can only do <2us), so L0s should be out of the picture
> > completely.
> >
> > Before your patch, apparently we (or BIOS) enabled L1 on the link from
> > 00:01.2 to 01:00.0, and partially enabled it on the link from 02:03.0
> > to 03:00.0.
> 
> According to the spec, this is managed by the OS - which was the
> change introduced...

BIOS frequently enables ASPM and, if CONFIG_PCIEASPM_DEFAULT=y, I
don't think Linux touches it unless a user requests it via sysfs.

What does "grep ASPM .config" tell you?

Boot with "pci=earlydump" and the dmesg will tell us what the BIOS
did.

If you do this in the unmodified kernel:

  # echo 1 > /sys/.../0000:03:00.0/l1_aspm

it should enable L1 for 03:00.0.  I'd like to know whether it actually
does, and whether the NIC behaves any differently when L1 is enabled
for the entire path instead of just the first three components.

If the above doesn't work, you should be able to enable ASPM manually:

  # setpci -s03:00.0 CAP_EXP+0x10.w=0x0042

> > It looks like we *should* be able to enable L1 on both links since the
> > exit latency should be <33us (first link starts exit at T=0, completes
> > by T=32; second link starts exit at T=1, completes by T=33), and
> > 03:00.0 can tolerate up to 64us.
> >
> > I guess the effect of your patch is to disable L1 on the 00:01.2 -
> > 01:00.0 link?  And that makes the NIC work better?  I am obviously
> > missing something because I don't understand why the patch does that
> > or why it works better.
> 
> It makes it work like normal again, like if i disable ASPM on the
> nic itself...

I wonder if ASPM is just broken on this device.
__e1000e_disable_aspm() mentions hardware errata on a different Intel
NIC.

> I don't know which value that reflects, up or down - since we do max
> of both values and
> it actually disables ASPM.
> 
> What we can see is that the first device that passes the threshold
> is 01:00.0

I don't understand this.  03:00.0 claims to be able to tolerate 64us
of L1 exit latency.  The path should only have <33us of latency, so
it's not even close to the 64us threshold.

> How can I read more data from PCIe without needing to add kprint...
> 
> This is what lspci uses apparently:
> #define  PCI_EXP_LNKCAP_L0S     0x07000 /* L0s Exit Latency */
> #define  PCI_EXP_LNKCAP_L1      0x38000 /* L1 Exit Latency */
> 
> But which latencies are those? up or down?

I think the idea in aspm.c that exit latencies depend on which
direction traffic is going is incorrect.  The components at the
upstream and downstream ends of the link may have different exit
latencies, of course.

  parent reply	other threads:[~2020-09-23 21:23 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-27 21:30 [PATCH] Use maximum latency when determining L1 ASPM Ian Kumlien
2020-07-29 22:27 ` Bjorn Helgaas
2020-07-29 22:43   ` Ian Kumlien
2020-08-03 14:58   ` [PATCH] Use maximum latency when determining L1/L0s ASPM v2 Ian Kumlien
2020-08-15 19:39     ` Ian Kumlien
2020-09-18 22:47     ` Ian Kumlien
2020-09-22 20:19     ` Bjorn Helgaas
2020-09-22 21:02       ` Ian Kumlien
2020-09-22 23:00         ` Bjorn Helgaas
2020-09-22 23:29           ` Ian Kumlien
2020-09-22 23:31             ` Ian Kumlien
2020-09-23 21:23             ` Bjorn Helgaas [this message]
2020-09-23 21:36               ` Ian Kumlien
2020-09-23 21:48                 ` Ian Kumlien
2020-09-24 16:24                 ` Bjorn Helgaas
2020-09-25  8:06                   ` Ian Kumlien
2020-10-05 18:31     ` Bjorn Helgaas
2020-10-05 18:38       ` Ian Kumlien
2020-10-05 19:09         ` Bjorn Helgaas
2020-10-07 11:31           ` Ian Kumlien
2020-10-07 13:03             ` Bjorn Helgaas
     [not found] <CAA85sZvrPApeAYPVSYdVuKnp84xCpLBLf+f32e=R9tdPC0dvOw@mail.gmail.com>
2020-09-25 15:49 ` Bjorn Helgaas
2020-09-25 22:26   ` Ian Kumlien
2020-09-28  0:06     ` Bjorn Helgaas
2020-09-28 10:24       ` Ian Kumlien
2020-09-28 17:09         ` Bjorn Helgaas
2020-09-28 17:41           ` Ian Kumlien
2020-09-28 19:53             ` Alexander Duyck
2020-09-28 20:04               ` Ian Kumlien
2020-09-28 20:33                 ` Ian Kumlien
2020-09-28 23:30                   ` Alexander Duyck
2020-09-29 12:51                     ` Ian Kumlien
2020-09-29 16:23                       ` Alexander Duyck
2020-09-29 21:13                         ` Ian Kumlien
2020-09-28 21:43               ` Ian Kumlien
2020-09-28 18:10           ` Alexander Duyck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200923212318.GA2295660@bjorn-Precision-5520 \
    --to=helgaas@kernel.org \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=ian.kumlien@gmail.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=puranjay12@gmail.com \
    --cc=refactormyself@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox