The Linux Kernel Mailing List
From: Brian Norris <briannorris@chromium.org>
To: Radu Rendec <radu@rendec.net>
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
	"Manivannan Sadhasivam" <mani@kernel.org>,
	"Daniel Tsai" <danielsftsai@google.com>,
	"Marek Behún" <kabel@kernel.org>,
	"Krishna Chaitanya Chundru" <quic_krichai@quicinc.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Rob Herring" <robh@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Jingoo Han" <jingoohan1@gmail.com>,
	"Brian Masney" <bmasney@redhat.com>,
	"Eric Chanudet" <echanude@redhat.com>,
	"Jared Kangas" <jkangas@redhat.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 3/3] PCI: dwc: Enable MSI affinity support
Date: Fri, 29 May 2026 17:35:13 -0700
Message-ID: <ahowweKTZXteLgZS@google.com>
In-Reply-To: <73f04f467e65b13fd455a3650dc2bd106af1e5a6.camel@rendec.net>

Hi Radu,

On Mon, May 25, 2026 at 12:48:09PM -0400, Radu Rendec wrote:
> On Fri, 2026-05-22 at 17:07 -0700, Brian Norris wrote:
> > (Updating Radu's email; dropping another bouncing email)
> 
> Thanks for doing that! Obviously, I no longer have access to the email
> address that was used to post the patch, and I was lazy in setting up
> scripts that follow the mailing lists to catch messages that are
> addressed to me directly but using old email addresses.

No worries. The email bounce notified me rather quickly, and in the end,
you saw this within a few hours of the first report anyway :)

> > On Fri, May 22, 2026 at 01:27:43PM -0700, Brian Norris wrote:
> > > I'll see if I can learn anything more here on my own, but I figured I'd
> > > report it in case you have any thoughts or leads I should investigate.
> 
> Thanks for reporting it! I do not have any thoughts or leads yet, but I
> do plan to look at it during the next few days and hopefully come up
> with something. I also apologize for the slowness in my replies.

I'm only getting occasional time to spend on this too, so I'm a bit slow
as well.

> > In an hour or two of poking, all I've learned so far is that the problem
> > also seems to go away if I:
> > 
> > (a) add a few dump_stack() and other noisy logs to a few key places (for
> >     now, __pci_write_msi_msg(), pci_power_up() failures, and
> >     irq_chip_redirect_set_affinity() -- I think __pci_write_msi_msg()
> >     was the most significant, possibly because it produced the most log
> >     text) and
> > 
> > (b) leave a 115200 baud UART kernel console running.
> > 
> > (This is on a sample size of 20+ suspend cycles, whereas previous
> > bisection would fail 100%.)
> > 
> > > It then reappears when I quiet the kernel logging a bit with `dmesg -n3`.
> > 
> > I think that simply tells me that there's some timing issue or race
> > condition involved.
> 
> That's very useful! Interrupts are migrated on suspend to the main CPU
> and then migrated back on resume, and the ordering and synchronization
> around that is tricky. The stack trace in your previous message tells
> me that the nvme driver is waiting for IO completion, which is normally
> signaled by an interrupt, except that interrupt never arrives.

That's true. But the first failure is:

  nvme 0001:01:00.0: Unable to change power state from unknown to D0, device inaccessible

That means the PCI config read of the PM status register is returning
all-0xff, so we're not really able to guarantee the NVMe PCIe device has
powered back up at all. In my experience, that's indicative that the
PCIe link has failed in some way, or the root complex is otherwise
misbehaving. If the link is not functional, we won't receive any NVMe
MSIs.
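
As a rough illustration of that check (a userspace sketch, not the kernel's
code), the helper below classifies a 16-bit PM control/status (PMCSR) value.
The function name pmcsr_state is hypothetical. It assumes the value is passed
as bare hex digits, and it treats all-ones as "no response from the device",
since the low two bits (the PowerState field) mean nothing in that case.

```shell
#!/bin/sh
# Hypothetical helper: classify a 16-bit PMCSR value given as hex digits
# without a 0x prefix (e.g. "0008" or "ffff").
pmcsr_state() {
    val=$((0x$1))

    # All-ones is what a config read returns when the device does not
    # respond, so the register contents cannot be trusted at all.
    if [ "$val" -eq 65535 ]; then
        echo "inaccessible"
        return 0
    fi

    # PMCSR bits 1:0 hold the PowerState field.
    case $((val & 3)) in
        0) echo "D0" ;;
        1) echo "D1" ;;
        2) echo "D2" ;;
        3) echo "D3hot" ;;
    esac
}
```

On a live system this could presumably be fed from pciutils, for example
`pmcsr_state "$(setpci -s 0001:01:00.0 CAP_PM+4.w)"`, assuming setpci is
installed and the device exposes a PM capability.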

I still can't explain what about this patch is causing a PCIe link
failure though.

> With my patch included, the demultiplexed interrupt (the nvme interrupt
> in this case) has an opportunity to be migrated during suspend/resume,
> whereas previously it did not. That's one more moving part, and I'll
> have to look closer at the code and think what could go wrong. I agree
> it's likely a race condition or a timing issue because it works with
> that extra logging, which adds small delays as a side effect.

I also tried hot-unplugging all non-boot CPUs before suspending the
system:

  for i in /sys/devices/system/cpu/cpu{1,2,3,4,5,6,7}/online; do echo 0 > "$i"; done
  echo +10 > /sys/class/rtc/rtc0/wakealarm
  echo mem > /sys/power/state

I believe that means all the affinity/migration will occur while the
system is fully online, so we're less likely to run into power
management race conditions. In this test, NVMe is still functional after
the first step (CPU offline), but it still fails after suspend-to-mem.
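
A small sanity check before suspending, sketched under the assumption that
the boot CPU is CPU 0: confirm that the sysfs "online" list contains only
"0". The helper name only_boot_cpu_online is hypothetical, and the path
argument exists only so the check can be exercised against a test file.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given "online" file lists CPU 0
# alone. Defaults to the real sysfs file when no argument is given.
only_boot_cpu_online() {
    online_file=${1:-/sys/devices/system/cpu/online}
    # The file holds a range list such as "0-7"; after offlining CPUs 1-7
    # it should read just "0".
    [ "$(cat "$online_file")" = "0" ]
}
```

Running `only_boot_cpu_online || echo "other CPUs still online"` between the
offline loop and the suspend would catch a CPU that refused to go offline.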

This seems to tell me the irq_set_affinity()/migration process isn't
really what's killing things, but something else.

Brian


Thread overview: 32+ messages
2025-11-28 21:20 [PATCH v3 0/3] Enable MSI affinity support for dwc PCI Radu Rendec
2025-11-28 21:20 ` [PATCH v3 1/3] genirq: Add interrupt redirection infrastructure Radu Rendec
2025-12-15 21:34   ` [tip: irq/msi] " tip-bot2 for Radu Rendec
2025-11-28 21:20 ` [PATCH v3 2/3] PCI: dwc: Code cleanup Radu Rendec
2025-12-15 21:34   ` [tip: irq/msi] " tip-bot2 for Radu Rendec
2025-11-28 21:20 ` [PATCH v3 3/3] PCI: dwc: Enable MSI affinity support Radu Rendec
2025-12-15 21:34   ` [tip: irq/msi] " tip-bot2 for Radu Rendec
2026-01-06  9:53     ` Jon Hunter
2026-01-06 15:07       ` Radu Rendec
2026-01-07  1:13         ` Radu Rendec
2026-01-20 18:01   ` [PATCH v3 3/3] " Jon Hunter
2026-01-20 22:30     ` Radu Rendec
2026-01-21 14:00       ` Jon Hunter
2026-01-22 23:31         ` Radu Rendec
2026-01-23 13:25           ` Jon Hunter
2026-01-26  7:59           ` Thomas Gleixner
2026-01-26 22:07             ` Jon Hunter
2026-01-26 22:26               ` Radu Rendec
2026-01-27 10:30                 ` Thomas Gleixner
2026-01-27 13:34                   ` Thomas Gleixner
2026-01-27 17:09                     ` Jon Hunter
2026-01-27 21:30                       ` [PATCH] genirq/redirect: Prevent writing MSI message on affinity change Thomas Gleixner
2026-01-29 22:51                         ` [tip: irq/msi] " tip-bot2 for Thomas Gleixner
2026-03-26  3:48                       ` [PATCH v3 3/3] PCI: dwc: Enable MSI affinity support Tsai Sung-Fu
2026-03-26 12:52                         ` Thomas Gleixner
2026-05-22 20:27   ` Brian Norris
2026-05-23  0:07     ` Brian Norris
2026-05-25 16:48       ` Radu Rendec
2026-05-30  0:35         ` Brian Norris [this message]
2026-06-01  8:20           ` Niklas Cassel
2026-06-01 19:02             ` Brian Norris
2026-06-01 19:09               ` Brian Norris
