From: Joerg Roedel <joerg.roedel@amd.com>
To: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
Jesse Brandeburg <jesse.brandeburg@intel.com>,
Bruce Allan <bruce.w.allan@intel.com>,
Carolyn Wyborny <carolyn.wyborny@intel.com>,
Don Skidmore <donald.c.skidmore@intel.com>,
Greg Rose <gregory.v.rose@intel.com>,
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>,
John Ronciak <john.ronciak@intel.com>,
<e1000-devel@lists.sourceforge.net>,
<linux-kernel@vger.kernel.org>
Subject: Re: IO_PAGE_FAULTS with igb or igbvf on AMD IOMMU system
Date: Wed, 20 Jun 2012 11:48:44 +0200
Message-ID: <20120620094844.GL2624@amd.com>
In-Reply-To: <4FE0C2A8.50602@intel.com>
Hi Alexander,
On Tue, Jun 19, 2012 at 11:19:20AM -0700, Alexander Duyck wrote:
> Based on the faults it would look like accessing the descriptor rings is
> probably triggering the errors. We allocate the descriptor rings using
> dma_alloc_coherent so the rings should be mapped correctly.
Can this happen before the driver has actually allocated the descriptors?
As I said, the faults appear before any DMA-API call has been made for
that device (hence domain=0x0000, because the domain is assigned on the
first call to the DMA-API for a device).
Also, I don't see the faults every time. Roughly one time in ten
(estimated) there are no faults. Is it possible that this is a race
condition, e.g. that the card tries to access its descriptor rings before
the driver has allocated them (or something like that)?
> The PF and VF will end up being locked out since they are hung on an
> uncompleted DMA transaction. Normally we recommend that PCIe Advanced
> Error Reporting be enabled if an IOMMU is enabled so the device can be
> reset after triggering a page fault event.
>
> The first thing that pops into my head for possible issues would be that
> maybe the VF pci_dev structure or the device structure isn't being
> correctly initialized when SR-IOV is enabled on the igb interface. Do
> you know if there are any AMD IOMMU specific values on those structures,
> such as the domain, that are supposed to be initialized prior to calling
> the DMA API calls? If so, have you tried adding debug output to verify
> if those values are initialized on a VF prior to bringing up a VF interface?
Well, when the device appears in the system, the IOMMU driver gets
notified about it through the device_change notifiers. It then
allocates all necessary data structures. I also verified that this works
correctly while debugging this issue. So I am pretty sure the problem
isn't there :)
> Also have you tried any other SR-IOV capable devices on this system?
> That would be a valuable data point because we could then exclude the
> SR-IOV code as being a possible cause for the issues if other SR-IOV
> devices are working without any issues.
I have another SR-IOV device, but that one fails to even enable SR-IOV
because the BIOS did not leave enough MMIO resources. So I couldn't
try it with that device. With the 82576 card, enabling SR-IOV works fine
but results in the faults from the VF.
Regards,
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632