linux-kernel.vger.kernel.org archive mirror
From: Joerg Roedel <joerg.roedel@amd.com>
To: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
	Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Bruce Allan <bruce.w.allan@intel.com>,
	Carolyn Wyborny <carolyn.wyborny@intel.com>,
	Don Skidmore <donald.c.skidmore@intel.com>,
	Greg Rose <gregory.v.rose@intel.com>,
	Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>,
	John Ronciak <john.ronciak@intel.com>,
	<e1000-devel@lists.sourceforge.net>,
	<linux-kernel@vger.kernel.org>
Subject: Re: IO_PAGE_FAULTS with igb or igbvf on AMD IOMMU system
Date: Wed, 20 Jun 2012 11:48:44 +0200
Message-ID: <20120620094844.GL2624@amd.com>
In-Reply-To: <4FE0C2A8.50602@intel.com>

Hi Alexander,

On Tue, Jun 19, 2012 at 11:19:20AM -0700, Alexander Duyck wrote:
> Based on the faults it would look like accessing the descriptor rings is
> probably triggering the errors.  We allocate the descriptor rings using
> dma_alloc_coherent so the rings should be mapped correctly.

Can this happen before the driver actually allocated the descriptors? As
I said, the faults appear before any DMA-API call was made for that
device (hence, domain=0x0000, because the domain is assigned on the
first call to the DMA-API for a device).
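
To make the domain=0x0000 point concrete, here is a minimal userspace
model in C. It is only a sketch of the behaviour described above, not
the AMD IOMMU driver code: every name in it (model_dev, model_dma_map,
model_device_dma) is hypothetical, and treating domain ID 0 as "nothing
assigned yet" is an assumption taken from the fault output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state; not the kernel's real structures. */
struct model_dev {
	uint16_t devid;     /* PCI requester ID of the device */
	uint16_t domain_id; /* 0 = no domain assigned yet (assumption) */
};

static uint16_t next_domain_id = 1;

/* Models the first DMA-API call for a device: the domain is assigned lazily. */
static void model_dma_map(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->domain_id == 0)
		dev->domain_id = next_domain_id++;
}

/*
 * Models a DMA request from the device arriving at the IOMMU.
 * Returns -1 if the access would be translated (no fault), otherwise
 * the domain ID that would show up in the logged fault.
 */
static int model_device_dma(const struct model_dev *dev, int address_is_mapped)
{
	if (dev->domain_id != 0 && address_is_mapped)
		return -1;
	return dev->domain_id;
}
```

In this model a device that issues DMA before model_dma_map() was ever
called can only fault with domain 0, which matches what the logs show.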

Also, I don't see the faults every time. Roughly one time out of ten
there are no faults. Is it possible that this is a race condition, e.g.
that the card tries to access its descriptor rings before the driver
has allocated them (or something like that)?

> The PF and VF will end up being locked out since they are hung on an
> uncompleted DMA transaction.  Normally we recommend that PCIe Advanced
> Error Reporting be enabled if an IOMMU is enabled so the device can be
> reset after triggering a page fault event.
> 
> The first thing that pops into my head for possible issues would be that
> maybe the VF pci_dev structure or the device structure isn't being
> correctly initialized when SR-IOV is enabled on the igb interface.  Do
> you know if there are any AMD IOMMU specific values on those structures,
> such as the domain, that are supposed to be initialized prior to calling
> the DMA API calls?  If so, have you tried adding debug output to verify
> if those values are initialized on a VF prior to bringing up a VF interface?

Well, when the device appears in the system the IOMMU driver gets
notified about it using the device_change notifiers. It will then
allocate all necessary data structures. I also verified that this works
correctly while debugging this issue. So I am pretty sure the problem
isn't there :)
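
The notifier flow described above can be sketched as a small userspace
model in C. This is an illustration only: in the kernel the hook is a
bus notifier, while everything below (model_register_notifier,
model_bus_announce, model_iommu_notifier, the fixed-size table) is
hypothetical and much simpler than the real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum model_event { MODEL_ADD_DEVICE, MODEL_DEL_DEVICE };

#define MODEL_MAX_DEVS 8

/* Hypothetical per-device IOMMU data, allocated when a device appears. */
struct model_dev_data {
	uint16_t devid;
	int in_use;
};

static struct model_dev_data dev_table[MODEL_MAX_DEVS];

typedef int (*model_notifier_fn)(enum model_event ev, uint16_t devid);
static model_notifier_fn model_registered;

static struct model_dev_data *model_find(uint16_t devid)
{
	for (size_t i = 0; i < MODEL_MAX_DEVS; i++)
		if (dev_table[i].in_use && dev_table[i].devid == devid)
			return &dev_table[i];
	return NULL;
}

/* The IOMMU side: set up or tear down per-device data on bus events. */
static int model_iommu_notifier(enum model_event ev, uint16_t devid)
{
	struct model_dev_data *d = model_find(devid);

	switch (ev) {
	case MODEL_ADD_DEVICE:
		if (d)
			return 0; /* already tracked */
		for (size_t i = 0; i < MODEL_MAX_DEVS; i++) {
			if (!dev_table[i].in_use) {
				dev_table[i].devid = devid;
				dev_table[i].in_use = 1;
				return 0;
			}
		}
		return -1; /* table full */
	case MODEL_DEL_DEVICE:
		if (d)
			d->in_use = 0;
		return 0;
	}
	return -1;
}

static void model_register_notifier(model_notifier_fn fn)
{
	assert(fn != NULL);
	model_registered = fn;
}

/* The bus side: tell whoever registered that a device came or went. */
static int model_bus_announce(enum model_event ev, uint16_t devid)
{
	return model_registered ? model_registered(ev, devid) : 0;
}
```

The point of the model is the ordering: the per-device data exists as
soon as the bus announces the VF, before its driver has made any DMA-API
call.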

> Also have you tried any other SR-IOV capable devices on this system? 
> That would be a valuable data point because we could then exclude the
> SR-IOV code as being a possible cause for the issues if other SR-IOV
> devices are working without any issues.

I have another SR-IOV device, but that one fails to even enable SR-IOV
because the BIOS did not leave enough MMIO resources. So I couldn't try
it with that device. With the 82576 card, enabling SR-IOV works fine but
results in the faults from the VF.

Regards,

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


