From: Linda Knippers <linda.knippers@hp.com>
To: Bjorn Helgaas <bhelgaas@google.com>,
	"Woodhouse, David" <david.woodhouse@intel.com>
Cc: "bhe@redhat.com" <bhe@redhat.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"James.Bottomley@hansenpartnership.com"
	<James.Bottomley@hansenpartnership.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"davidlohr@hp.com" <davidlohr@hp.com>,
	"scameron@beardog.cce.hp.com" <scameron@beardog.cce.hp.com>,
	"jiang.liu@linux.intel.com" <jiang.liu@linux.intel.com>
Subject: Re: hpsa driver bug crack kernel down!
Date: Thu, 10 Apr 2014 11:36:51 -0400
Message-ID: <5346BA93.5090904@hp.com>
In-Reply-To: <CAErSpo6EORf02WS1UqP8EMWfHij7W2J5B3ZOoBT90txV85uZZw@mail.gmail.com>

On 4/10/2014 11:14 AM, Bjorn Helgaas wrote:
> On Thu, Apr 10, 2014 at 2:46 AM, Woodhouse, David
> <david.woodhouse@intel.com> wrote:
> 
>>>>>>>>>>> DMAR:[fault reason 02] Present bit in context entry is clear
>>>>>>>>>>> dmar: DRHD: handling fault status reg 602
>>>>>>>>>>> dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 7f61e000
>>
>> That "Present bit in context entry is clear" fault means that we have
>> not set up *any* mappings for this PCI device… on this IOMMU.
>>
>>>> Yes, specifically (finally done bisecting):
>>>>
>>>> commit 2e45528930388658603ea24d49cf52867b928d3e
>>>> Author: Jiang Liu <jiang.liu@linux.intel.com>
>>>> Date:   Wed Feb 19 14:07:36 2014 +0800
>>>>
>>>>     iommu/vt-d: Unify the way to process DMAR device scope array
>>
>> This commit is about how we decide which IOMMU a given PCI device is
>> attached to.
>>
>> Thus, my first guess would be that we are quite happily setting up the
>> requested DMA maps on the *wrong* IOMMU, and then taking faults when the
>> device actually tries to do DMA.
>>
>> However, I'm not 100% convinced of that. The fault address looks
>> suspiciously like a true physical address, not a virtual bus address of
>> the type that we'd normally allocate for a dma_map_* operation. Those
>> would start at 0xfffff000 and work downwards, typically.
> 
> I like the "wrong IOMMU (or no IOMMU at all)" theory.  If we didn't
> connect the device with an IOMMU at all, that would explain the device
> DMAing directly to a physical address, wouldn't it?
> 
>> Do you have 'iommu=pt' on the kernel command line? Can I see the full
>> dmesg as this system boots, and also a copy of the DMAR table?

This will be really helpful information.  This box has devices with
RMRR records and if they're not set up correctly, DMAR faults can occur.
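
As a rough sketch of the kind of check the DMAR table would allow, the
snippet below parses a fault line in the format quoted above and looks
the fault address up in a list of RMRR ranges. The function names are
illustrative, and the RMRR ranges are invented purely so that the lookup
has something to match; the real ranges would have to come from this
system's ACPI DMAR table.

```python
import re

# Matches the device and address fields of a fault line such as:
#   dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 7f61e000
FAULT_RE = re.compile(
    r"Request device \[(?P<bdf>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr (?P<addr>[0-9a-f]+)"
)

# HYPOTHETICAL RMRR entries as (device, start, end inclusive).
# These values are made up for illustration and are NOT from this box.
HYPOTHETICAL_RMRRS = [
    ("02:00.0", 0x7F61E000, 0x7F61FFFF),
    ("03:00.0", 0x7F7E7000, 0x7F7ECFFF),
]


def parse_fault(line):
    """Return (bus:dev.fn, fault address) or None if the line does not match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    return m.group("bdf"), int(m.group("addr"), 16)


def matching_rmrr(bdf, addr, rmrrs):
    """Return the (start, end) RMRR range covering addr for this device, if any."""
    for dev, start, end in rmrrs:
        if dev == bdf and start <= addr <= end:
            return (start, end)
    return None
```

A fault address that falls inside an RMRR listed for the faulting device
would suggest that the region was not mapped for it; a miss would point
elsewhere.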

>>
>> We should also rate-limit DMA faults, which would avoid the lockup
>> failure mode. Bjorn, what should an IOMMU driver *do* when it detects
>> that a device is creating an endless stream of DMA faults and isn't
>> aborting the transaction?
> 
> You mentioned that POWER with EEH does something intelligent in this
> case, but I'm not familiar with that code.  We have AER support, which
> can result in resetting a device, but I think DMA faults are reported
> differently, and I don't think there's any nice existing way for PCI
> to deal with them.  Maybe there should be, though.
> 
> Bjorn

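The rate-limiting suggestion quoted above could be sketched roughly as
follows. This is a minimal interval/burst limiter, not the kernel's
implementation; the class name, parameters, and the use of an explicit
`now` timestamp are all assumptions made for illustration.

```python
class FaultRateLimiter:
    """Allow at most `burst` fault reports per `interval` seconds."""

    def __init__(self, interval, burst):
        self.interval = interval    # window length in seconds
        self.burst = burst          # reports allowed per window
        self.window_start = None    # start time of the current window
        self.reported = 0           # reports let through in this window
        self.suppressed = 0         # total reports dropped so far

    def allow(self, now):
        """Return True if a fault seen at time `now` should be reported."""
        # Open a new window on first use or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.reported = 0
        if self.reported < self.burst:
            self.reported += 1
            return True
        self.suppressed += 1
        return False
```

This only bounds how often faults are reported. It does not answer the
open question in the quoted text of what to do about a device that keeps
generating them.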
