From: Bjorn Helgaas <helgaas@kernel.org>
To: fk1xdcio@duck.com
Cc: linux-pci@vger.kernel.org, Lukas Wunner <lukas@wunner.de>,
Oliver O'Halloran <oohall@gmail.com>
Subject: Re: ASMedia ASM1812 PCIe switch causes system to freeze hard
Date: Mon, 13 Mar 2023 16:57:18 -0500
Message-ID: <20230313215718.GA1546868@bhelgaas>
In-Reply-To: <20230308204942.GA1032495@bhelgaas>

On Wed, Mar 08, 2023 at 02:49:42PM -0600, Bjorn Helgaas wrote:
> On Sat, Feb 25, 2023 at 01:37:23PM -0500, fk1xdcio@duck.com wrote:
> > I'm testing a generic 4-port PCIe x4 2.5Gbps Ethernet NIC. It uses an
> > ASM1812 for the PCI packet switch to four RTL8125BG network controllers.
> >
> > The more load I put on the NIC the faster the system freezes. For example if
> > I activate four 2.5Gbps fully saturated network connections then the system
> > hard freezes almost immediately. When the system freezes it seems completely
> > dead. SysRq doesn't work, serial consoles are dead, etc. so I haven't been
> > able to get much debugging information. I have tested on various different
> > physical systems, Xeon E5, Xeon E3, i7, and they all behave the same so it
> > doesn't seem like a system hardware issue.
> >
> > Disabling IOMMU makes it run for a little longer before crashing.
> >
> > The tiny bit of error information I have been able to get under various
> > conditions (eg. disabling ASPM, forcing D0, etc):
> > Test #1:
> > pcieport 0000:04:02.0: Unable to change power state from D3hot to D0,
> > device inaccessible
> >
> > Test #2:
> > pcieport 0000:04:02.0: can't change power state from D3cold to D0 (config
> > space inaccessible)
> > pcieport 0000:03:00.0: Wakeup disabled by ACPI
> > pcieport 0000:04:02.0: PME# disabled
> >
> > Test #3:
> > enp7s0: cmd = 0xff, should be 0x07 \x0a.
> > enp7s0: pci link is down \x0a.
> >
> > At times there are several of those errors printed for the different PCI
> > devices of the NIC before the system locks up.
> >
> > Setting "pci=nommconf" on the kernel command line is the only thing that
> > seems to fix the issue but performance is degraded when using bidirectional
> > transfers. 2.5Gbps TX but only 1.5Gbps RX compared to MMCONFIG enabled which
> > gets full 2.5Gbps bidirectional.
> >
> > So it seems the MMCONFIG works sometimes but eventually something happens
> > and it becomes inaccessible at which point the system freezes. Is there a
> > way to keep MMCONFIG enabled for other devices but not this ASM1812 device?
> > Or better, is there a way to debug and fix MMCONFIG for the device?
>
> Thanks for the report!
>
> So IIUC, "pci=nommconf" avoids the system hang completely, but network
> performance is lower. Do the NIC stats show packet drops that might
> explain the performance problem?
>
> You mentioned later that you see AER errors caused by ASPM, and they
> go away if you disable power management (but the hard lockups still
> happen). Is it "pcie_aspm=off" or "pcie_port_pm=off" or something
> else that makes this difference?

I don't want to forget about this issue. Have you learned anything
new, e.g., any answers to the questions above? I don't have any good
ideas yet, but if we keep pushing on it, we might be able to figure
out something.

Bjorn