Linux USB
 help / color / mirror / Atom feed
From: "Pali Rohár" <pali@kernel.org>
To: Lukas Wunner <lukas@wunner.de>
Cc: "Greg KH" <gregkh@linuxfoundation.org>,
	linux-usb@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, "Marek Behún" <kabel@kernel.org>
Subject: Re: xhci_pci & PCIe hotplug crash
Date: Wed, 5 May 2021 15:02:40 +0200	[thread overview]
Message-ID: <20210505130240.lmryb26xffzkg4pl@pali> (raw)
In-Reply-To: <20210505124402.GB29101@wunner.de>

On Wednesday 05 May 2021 14:44:02 Lukas Wunner wrote:
> On Wed, May 05, 2021 at 02:33:46PM +0200, Pali Rohár wrote:
> > I just spotted this crash during debugging PCIe controller driver
> > pci-aardvark.c with trying to expose its link down events via "hot plug"
> > interrupt and corresponding link layer state flags.
> > 
> > And because in whole call trace I see only generic PCIe and USB code
> > path without any driver specific parts, I suspect that this is not PCIe
> > controller-specific issue but rather something "wrong" in genetic PCIe
> > (or USB) code. That is why I sent this email, so maybe somebody else
> > find something suspicious here.
> > 
> > But still there is a chance that issue can be also in pci-aardvark.c
> > driver and somehow it masked its issue and propagated it into generic
> > PCIe hot plug code path.
> 
> If you hot-remove the XHCI controller, accesses to its MMIO space
> will fail.  xhci_irq() seems to perform such MMIO accesses.

That abort happens at offset 4d00, here is part of objdump:

        if (!arch_irqs_disabled_flags(flags))
    4ccc:       340014a0        cbz     w0, 4f60 <xhci_irq+0x2d0>
    4cd0:       d2800000        mov     x0, #0x0                        // #0
    4cd4:       910a7276        add     x22, x19, #0x29c
    4cd8:       52800022        mov     w2, #0x1                        // #1
    4cdc:       f98002d1        prfm    pstl1strm, [x22]
    4ce0:       885ffec1        ldaxr   w1, [x22]
    4ce4:       4a000023        eor     w3, w1, w0
    4ce8:       35000063        cbnz    w3, 4cf4 <xhci_irq+0x64>
    4cec:       88037ec2        stxr    w3, w2, [x22]
    4cf0:       35ffff83        cbnz    w3, 4ce0 <xhci_irq+0x50>
    4cf4:       35002741        cbnz    w1, 51dc <xhci_irq+0x54c>
        status = readl(&xhci->op_regs->status);
    4cf8:       f9400f41        ldr     x1, [x26, #24]
    4cfc:       91001021        add     x1, x1, #0x4
    4d00:       b9400021        ldr     w1, [x1]

So it looks like it is that MMIO access, right?

> Normally this should happen silently and MMIO accesses just return
> with a fabricated "all ones" response.  Chances are however that the
> Aardvark controller raises a synchronous external abort instead.

This makes sense. Good catch lso with fact that it is from threaded
context!

> Perhaps you can teach it not to do that.

No :-( I read all documentation which is available for this PCIe
controller, part of Marvell A3720 SoC and I have not found anything
which allows me to configure raising external aborts.

I already figured out that CPU receive external abort also when trying
to issue a new PIO transfer for accessing PCI config space while
previous transfer has not finished yet. And also there is no way (at
least in documentation) which allows to "mask" this external abort. But
this issue can be fixed in pci-aardvark.c driver to disallow access to
config space while previous transfer is still running (I will send patch
for this one).

So seems that PCIe controller HW triggers these external aborts when
device on PCIe bus is not accessible anymore.

If this issue is really caused by MMIO access from xhci driver when
device is not accessible on the bus anymore, can we do something to
prevent this kernel crash? Somehow mask that external abort in kernel
for a time during MMIO access?

> Thanks,
> 
> Lukas

  reply	other threads:[~2021-05-05 13:02 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-05 12:01 xhci_pci & PCIe hotplug crash Pali Rohár
2021-05-05 12:09 ` Greg KH
2021-05-05 12:33   ` Pali Rohár
2021-05-05 12:40     ` Greg KH
2021-05-05 12:44     ` Lukas Wunner
2021-05-05 13:02       ` Pali Rohár [this message]
2021-05-05 15:20         ` David Laight
2021-05-05 15:39           ` Pali Rohár
2021-06-19  7:53             ` Lukas Wunner
2021-06-19  8:55               ` Pali Rohár
2021-05-05 12:37   ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210505130240.lmryb26xffzkg4pl@pali \
    --to=pali@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=kabel@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=lukas@wunner.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox