Linux PCI subsystem development
From: Bjorn Helgaas <helgaas@kernel.org>
To: "Limonciello, Mario" <mario.limonciello@amd.com>
Cc: Basavaraj Natikar <bnatikar@amd.com>,
	"Natikar, Basavaraj" <Basavaraj.Natikar@amd.com>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"thomas@glanzmann.de" <thomas@glanzmann.de>
Subject: Re: [PATCH] PCI: Add quirk to clear MSI-X
Date: Thu, 9 Mar 2023 16:30:51 -0600
Message-ID: <20230309223051.GA1178661@bhelgaas>
In-Reply-To: <3edd370c-e9e2-733c-2d79-51a08dd10e9d@amd.com>

On Thu, Mar 09, 2023 at 12:32:41PM -0600, Limonciello, Mario wrote:
> On 3/9/2023 12:25, Bjorn Helgaas wrote:
> ...

> > > > https://gitlab.freedesktop.org/agd5f/linux/-/commit/07494a25fc8881e122c242a46b5c53e0e4403139
> > 
> > That nbio_v7.2.c patch and this patch don't look anything alike.  It
> > looks like the nbio_v7.2.c patch might run once?  Could *this* be done
> > once at enumeration-time, too?
> 
> They don't look anything alike because they're attacking the problem from
> different angles.

Why do we need different angles?

> The NBIO patch fixes the initialization value for the internal registers.
> This is what the BIOS "should" have done.  When the internal registers are
> configured properly then the behavior the kernel expects works as well.
> 
> The NBIO patch will run both at amdgpu startup as well as when resuming from
> suspend.

If initializing something as BIOS should have done makes the hardware
work correctly, isn't once enough?  Why does the NBIO patch need to
run at resume-time?

> This patch we're discussing treats the symptoms of the deficiency and avoids
> the impact.
> This patch runs any time the controller is runtime resumed.  So, yes it will
> run more frequently.  Because this patch is treating the symptoms it needs
> to be applied every single time the controller exits D3.

This patch runs at *suspend*-time (DECLARE_PCI_FIXUP_SUSPEND), not
resume-time.

The difference is important because with this broken BIOS, MSI-X is
disabled between the suspend quirk and some distant point in resume.
With non-broken BIOS, MSI-X remains *enabled* for at least part of
that period, and I don't want to have to figure out whether that
difference is important.

We have fragments of a coherent commit log, but it's not quite a
complete story yet.  I think so far we have:

  - Issue affects only the 1022:15b8 USB controller (well, I guess it
    also affects some GPU device?)
  - Only a problem when BIOS doesn't initialize controller correctly
  - Controller claims to preserve internal state on D3hot->D0
    transition, but it doesn't
  - D0->D3hot->D0 transitions do preserve external PCI_MSIX_FLAGS
    state; only internal state is lost
  - When MSI-X is enabled and controller transitions D0->D3hot->D0,
    MSI-X appears enabled per PCI_MSIX_FLAGS, but is actually
    *disabled* because the internal state was lost
  - MSI-X being disabled leads to xhci_hcd command timeouts because
    interrupts are missed
  - Not possible for an enumeration-time quirk to fix the controller
    initialization problem (why not?)
  - Writing PCI_MSIX_FLAGS with a *different* value fixes the internal
    state; writing the same value does nothing
  - A suspend- or resume-time quirk can work around this, and this is
    safe on *all* 1022:15b8 devices regardless of whether the BIOS is
    broken
  - The same approach can't be used for both 1022:15b8 and the GPU
    device because ...?
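
To make the points above concrete, here is a toy userspace model of the
behavior as described so far.  It is only a sketch: struct toy_ctrl and
the toy_*() helpers are hypothetical names invented for illustration,
nothing here touches real hardware or kernel APIs, and the only value
taken from the kernel is PCI_MSIX_FLAGS_ENABLE (0x8000 in pci_regs.h).
It assumes the workaround clears the enable bit at suspend-time so that
the later restore writes a *different* value.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_MSIX_FLAGS_ENABLE 0x8000	/* MSI-X Enable bit, as in pci_regs.h */

/* Toy model of the controller; not real config-space access. */
struct toy_ctrl {
	uint16_t msix_flags;	/* externally visible PCI_MSIX_FLAGS */
	bool internal_msix_on;	/* internal state that actually gates interrupts */
};

/* Config write: internal state is refreshed only when the value changes. */
static void toy_write_msix_flags(struct toy_ctrl *c, uint16_t val)
{
	if (val == c->msix_flags)
		return;		/* same value: modeled as having no effect */
	c->msix_flags = val;
	c->internal_msix_on = (val & PCI_MSIX_FLAGS_ENABLE) != 0;
}

/* D0->D3hot->D0: the register survives, the internal state is lost. */
static void toy_d3hot_cycle(struct toy_ctrl *c)
{
	c->internal_msix_on = false;
}

/*
 * Assumed suspend-time workaround: clear the enable bit before D3hot so
 * the restore on resume writes a different value and resyncs the device.
 */
static void toy_suspend_quirk(struct toy_ctrl *c)
{
	toy_write_msix_flags(c, (uint16_t)(c->msix_flags & ~PCI_MSIX_FLAGS_ENABLE));
}
```

In this model, a D3hot cycle followed by rewriting the same
PCI_MSIX_FLAGS value leaves internal_msix_on false even though the
register reads as enabled.  Calling toy_suspend_quirk() before the cycle
makes the later restore take effect.  Whether the real controller
behaves exactly this way is part of what the commit log still needs to
establish.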

Bjorn

