linux-arm-kernel.lists.infradead.org archive mirror
From: lorenzo.pieralisi@arm.com (Lorenzo Pieralisi)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH V7 08/11] drivers: acpi: Handle IOMMU lookup failure with deferred probing or error
Date: Wed, 1 Feb 2017 18:52:30 +0000
Message-ID: <20170201185230.GA1922@red-moon>
In-Reply-To: <f401e27e-d619-0eec-2401-50d8e2fc2f0f@codeaurora.org>

On Mon, Jan 30, 2017 at 03:03:06PM -0500, Sinan Kaya wrote:
> On 1/30/2017 11:51 AM, Lorenzo Pieralisi wrote:
> > On Mon, Jan 30, 2017 at 10:46:39AM -0500, Sinan Kaya wrote:
> >> On 1/30/2017 9:54 AM, Nate Watterson wrote:
> >>> On 2017-01-30 09:38, Will Deacon wrote:
> >>>> On Mon, Jan 30, 2017 at 09:33:50AM -0500, Sinan Kaya wrote:
> >>>>> On 1/30/2017 9:23 AM, Nate Watterson wrote:
> >>>>>> On 2017-01-30 08:59, Sinan Kaya wrote:
> >>>>>>> On 1/30/2017 7:22 AM, Robin Murphy wrote:
> >>>>>>>> On 29/01/17 17:53, Sinan Kaya wrote:
> >>>>>>>>> On 1/24/2017 7:37 AM, Lorenzo Pieralisi wrote:
> >>>>>>>>>> [+hanjun, tomasz, sinan]
> >>>>>>>>>>
> >>>>>>>>>> It is quite a key patchset, I would be glad if they can test on their
> >>>>>>>>>> respective platforms with IORT.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Tested on top of 4.10-rc5.
> >>>>>>>>>
> >>>>>>>>> 1. Platform Hidma device passed dmatest
> >>>>>>>>> 2. Seeing some USB stalls on a platform USB device.
> >>>>>>>>> 3. PCIe NVMe drive probed and worked fine with MSI interrupts after boot.
> >>>>>>>>> 4. NVMe driver didn't probe following a hotplug insertion and received an
> >>>>>>>>>    SMMU error event during the insertion.
> >>>>>>>>
> >>>>>>>> What was the SMMU error - a translation/permission fault (implying the
> >>>>>>>> wrong DMA ops) or a bad STE fault (implying we totally failed to tell
> >>>>>>>> the SMMU about the device at all)?
> >>>>>>>>
> >>>>>>>
> >>>>>>> root@ubuntu:/sys/bus/pci/slots/4# echo 0 > power
> >>>>>>>
> >>>>>>> [  204.698522] iommu: Removing device 0003:01:00.0 from group 0
> >>>>>>> [  204.708704] pciehp 0003:00:00.0:pcie004: Slot(4): Link Down
> >>>>>>> [  204.708723] pciehp 0003:00:00.0:pcie004: Slot(4): Link Down event
> >>>>>>> ignored; already powering off
> >>>>>>>
> >>>>>>> root@ubuntu:/sys/bus/pci/slots/4#
> >>>>>>>
> >>>>>>> [  254.820440] iommu: Adding device 0003:01:00.0 to group 8
> >>>>>>> [  254.820599] nvme nvme0: pci function 0003:01:00.0
> >>>>>>> [  254.820621] nvme 0003:01:00.0: enabling device (0000 -> 0002)
> >>>>>>> [  261.948558] arm-smmu-v3 arm-smmu-v3.0.auto: event 0x0a received:
> >>>>>>> [  261.948561] arm-smmu-v3 arm-smmu-v3.0.auto:  0x000001000000000a
> >>>>>>> [  261.948563] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000
> >>>>>>> [  261.948564] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000
> >>>>>>> [  261.948566] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000
> >>>>>
> >>>>>> Looks like C_BAD_CD. Can you please try with:
> >>>>>> iommu/arm-smmu-v3: Clear prior settings when updating STEs
> >>>>>
> >>>>> This resolved the issue. Can we pull Nate's patch to 4.10 so that I don't see
> >>>>> this issue again.
> >>>>
> >>>> I already sent the pull request to Joerg for 4.11. Do you see this problem
> >>>> without Sricharan's patches (i.e. vanilla mainline)? If so, we'll need to
> >>>> send the patch to stable after -rc1.
> >>> Using vanilla mainline, I see it most commonly when directly assigning
> >>> a device to a guest machine. I think I've also seen it after removing then
> >>> re-adding a PCI device. Basically anytime an STE's CTX pointer is changed
> >>> from a non-NULL value and STE[CFG] indicates translation will be performed.
> >>>
> >>
> >> I was not able to reproduce the issue with Vanilla kernel. I only
> >> tested hotplug.
> > 
> > I would like to get the complete code path leading to this issue, it is
> > not clear to me why the probe deferral code triggers it and why we are
> > not able to trigger it with vanilla mainline, we must understand that first
> > before applying any fix to this series.
> > 
> > I do not have a platform to reproduce this issue, so I will send you a
> > patch to trace what's going on here; please help us debug it.
> 
> Sure, send it to both Nate and me.

I debugged the issue and Nate's fix is correct. The fact that you
can't hit it with mainline is just a matter of timing: it has
to do with the CTX pointer value (we OR it with the existing value), so
it may or may not work depending on how the cdptr memory allocation
pattern turns out (which explains why Nate and I can hit it with
a simple PCI device remove/add too).
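
To illustrate the failure mode, here is a minimal sketch in plain C. It
is not the arm-smmu-v3 driver code: the field layout (CD_PTR_SHIFT,
CD_PTR_MASK) and all function names are hypothetical, chosen only to
show why OR-ing a new pointer into a field that already holds one
works for some address pairs and not for others.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits [47:6] of the word hold the context pointer. */
#define CD_PTR_SHIFT 6
#define CD_PTR_MASK  (((UINT64_C(1) << 42) - 1) << CD_PTR_SHIFT)

/* Read back the pointer field. */
static uint64_t ste_get_cdptr(uint64_t ste)
{
	return ste & CD_PTR_MASK;
}

/* Problematic update: ORs the new pointer into whatever is already there. */
static uint64_t ste_set_cdptr_or(uint64_t ste, uint64_t cd_dma)
{
	return ste | (cd_dma & CD_PTR_MASK);
}

/* Safe update: clear the old field before writing the new pointer. */
static uint64_t ste_set_cdptr_clear(uint64_t ste, uint64_t cd_dma)
{
	return (ste & ~CD_PTR_MASK) | (cd_dma & CD_PTR_MASK);
}

/* Walk through the cases; the addresses are made up for illustration. */
static void cdptr_demo(void)
{
	/* Field starts out zero: OR gives the right answer. */
	assert(ste_get_cdptr(ste_set_cdptr_or(0, 0x1000)) == 0x1000);

	/* New address happens to contain all the old bits: still right. */
	assert(ste_get_cdptr(ste_set_cdptr_or(0x1000, 0x3000)) == 0x3000);

	/* Unrelated new address: the result is a mix of both pointers. */
	assert(ste_get_cdptr(ste_set_cdptr_or(0x1000, 0x2000)) != 0x2000);

	/* Clearing first is correct regardless of the previous value. */
	assert(ste_get_cdptr(ste_set_cdptr_clear(0x1000, 0x2000)) == 0x2000);
}
```

Whether the OR variant misbehaves depends only on which addresses the
allocator hands back for the old and new descriptors, which is
consistent with the issue showing up on some runs and not others.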

So it is neither an ACPI nor an IOMMU probe deferral issue per se;
the fix is already queued, so it is all good.

What about the USB stalls?

Thanks!
Lorenzo

