From: Bjorn Helgaas <helgaas@kernel.org>
To: "Limonciello, Mario" <mario.limonciello@amd.com>
Cc: Basavaraj Natikar <bnatikar@amd.com>,
"Natikar, Basavaraj" <Basavaraj.Natikar@amd.com>,
"bhelgaas@google.com" <bhelgaas@google.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"thomas@glanzmann.de" <thomas@glanzmann.de>
Subject: Re: [PATCH] PCI: Add quirk to clear MSI-X
Date: Thu, 9 Mar 2023 16:30:51 -0600 [thread overview]
Message-ID: <20230309223051.GA1178661@bhelgaas> (raw)
In-Reply-To: <3edd370c-e9e2-733c-2d79-51a08dd10e9d@amd.com>
On Thu, Mar 09, 2023 at 12:32:41PM -0600, Limonciello, Mario wrote:
> On 3/9/2023 12:25, Bjorn Helgaas wrote:
> ...
> > > > https://gitlab.freedesktop.org/agd5f/linux/-/commit/07494a25fc8881e122c242a46b5c53e0e4403139
> >
> > That nbio_v7.2.c patch and this patch don't look anything alike. It
> > looks like the nbio_v7.2.c patch might run once? Could *this* be done
> > once at enumeration-time, too?
>
> They don't look anything alike because they're attacking the problem from
> different angles.
Why do we need different angles?
> The NBIO patch fixes the initialization value for the internal registers.
> This is what the BIOS "should" have done. When the internal registers are
> configured properly then the behavior the kernel expects works as well.
>
> The NBIO patch will run both at amdgpu startup as well as when resuming from
> suspend.
If initializing something as BIOS should have done makes the hardware
work correctly, isn't once enough? Why does the NBIO patch need to
run at resume-time?
> This patch we're discussing treats the symptoms of the deficiency and avoids
> the impact.
> This patch runs any time the controller is runtime resumed. So, yes it will
> run more frequently. Because this patch is treating the symptoms it needs
> to be applied every single time the controller exits D3.
This patch runs at *suspend*-time (DECLARE_PCI_FIXUP_SUSPEND), not
resume-time.
The difference is important because with this broken BIOS, MSI-X is
disabled between the suspend quirk and some distant point in resume.
With non-broken BIOS, MSI-X remains *enabled* for at least part of
that period, and I don't want to have to figure out whether that
difference is important.
We have fragments of a coherent commit log, but it's not quite a
complete story yet. I think so far we have:
- Issue affects only the 1022:15b8 USB controller (well, I guess it
also affects some GPU device?)
- Only a problem when BIOS doesn't initialize controller correctly
- Controller claims to preserve internal state on D3hot->D0
transition, but it doesn't
- D0->D3hot->D0 transitions do preserve external PCI_MSIX_FLAGS
state; only internal state is lost
- When MSI-X is enabled and controller transitions D0->D3hot->D0,
MSI-X appears enabled per PCI_MSIX_FLAGS, but is actually
*disabled* because the internal state was lost
- MSI-X being disabled leads to xhci_hcd command timeouts because
interrupts are missed
- Not possible for an enumeration-time quirk to fix the controller
initialization problem (why not?)
- Writing PCI_MSIX_FLAGS with a *different* value fixes the internal
state; writing the same value does nothing
- A suspend- or resume-time quirk can work around this, and this is
safe on *all* 1022:15b8 devices regardless of whether the BIOS is
broken
- The same approach can't be used for both 1022:15b8 and the GPU
device because ...?
Bjorn
next prev parent reply other threads:[~2023-03-09 22:30 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-06 7:23 [PATCH] PCI: Add quirk to clear MSI-X Basavaraj Natikar
2023-03-06 8:14 ` Thomas Glanzmann
2023-03-08 22:44 ` Bjorn Helgaas
2023-03-08 23:04 ` Limonciello, Mario
2023-03-09 7:34 ` Basavaraj Natikar
2023-03-09 18:25 ` Bjorn Helgaas
2023-03-09 18:32 ` Limonciello, Mario
2023-03-09 22:30 ` Bjorn Helgaas [this message]
2023-03-10 0:57 ` Mario Limonciello
2023-03-10 7:41 ` Basavaraj Natikar
2023-03-10 22:13 ` Bjorn Helgaas
2023-03-20 1:32 ` Limonciello, Mario
2023-03-20 17:14 ` Bjorn Helgaas
2023-03-20 17:20 ` Limonciello, Mario
2023-03-20 19:36 ` Bjorn Helgaas
2023-03-20 19:47 ` Limonciello, Mario
2023-03-20 21:30 ` Bjorn Helgaas
2023-03-20 21:37 ` Limonciello, Mario
2023-03-20 22:08 ` Bjorn Helgaas
2023-03-20 22:52 ` Mario Limonciello
2023-03-21 11:07 ` Bjorn Helgaas
2023-03-28 13:15 ` Basavaraj Natikar
2023-03-28 13:25 ` Limonciello, Mario
2023-03-28 17:42 ` Bjorn Helgaas
2023-03-10 7:22 ` Basavaraj Natikar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230309223051.GA1178661@bhelgaas \
--to=helgaas@kernel.org \
--cc=Basavaraj.Natikar@amd.com \
--cc=bhelgaas@google.com \
--cc=bnatikar@amd.com \
--cc=linux-pci@vger.kernel.org \
--cc=mario.limonciello@amd.com \
--cc=thomas@glanzmann.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox