* [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
@ 2014-05-09 15:28 Alex Williamson
2014-05-14 23:40 ` Andrew Cooks
0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2014-05-09 15:28 UTC (permalink / raw)
To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel
v2:
- Several new Marvell controllers added to quirks. There's been
a lot of success reported with this series in
https://bugzilla.kernel.org/show_bug.cgi?id=42679
- Add quirk for ASMedia and Tundra PCIe-to-PCI bridges that do
not expose a PCIe capability. These have been shown to use
the standard PCIe-to-PCI bridge requester ID.
- Fix copy/paste duplicate Ricoh quirk ID
- Fixed AMD IOMMU for the "ghost" function case where the DMA
alias is for an absent device. The iommu rlookup table and
data fields need to be initializes.
- Fixed Intel interrupt remapping, I wasn't passing the target
bus number, only the alias bus number.
The only outstanding issue that I know about is that Andrew Cooks
reported that the Ricoh quirk doesn't work. AFAIK, the existing
Ricoh quirk has only ever worked for IOMMU groups, not for dma_ops.
I don't have access to a system with this device to test on my
own. I'd appreciate feedback from anyone that can test on these
devices.
These patches are split across PCI and IOMMU, but I've front-loaded
all of the PCI infrastructure so that the first 7 patches can be
applied to PCI-core, the IOMMU maintainers can pickup their patches,
then we can finish with dead code removal. Bjorn might also be
willing to carry the IOMMU changes if the maintainers want to ack
them.
Original description:
This series attempts to fix a couple issues we've had outstanding in
the PCI/IOMMU code for a while. The first issue is with devices that
use the wrong requester ID for DMA transactions. We already have a
sort of half-baked attempt to fix this for several Ricoh devices, but
the fix only helps them be useful through IOMMU groups, not the
general DMA case. There are also several Marvell devices which use
use a different wrong requester ID and don't even fit into the DMA
source idea. This series creates a DMA alias iterator that will
step through each possible alias of a device, allowing IOMMUs to
insert mappings for both the device and its aliases.
Hand-in-hand with this is our broken pci_find_upstream_pcie_bridge()
function, which is known to blowup when it finds itself suddenly at
a PCIe device without crossing a PCIe-to-PCI bridge (as identified by
the PCIe capability). It also likes to make the invalid assumption
that a PCIe device never has its requester ID masked by any usptream
bus. We can fix this using the above new DMA alias iterator, since
that's effectively what this function was meant to do.
Finally, with all these helpers, it makes sense to consolidate code
for determining IOMMU groups. The first step in finding the root
of a group is finding the final upstream DMA alias for the device,
then applying additional ACS rules and incorporating device specific
aliases. As this is all common to PCI, create a single implementation
and remove piles of code from the individual IOMMU drivers.
This series allows devices like the Marvell 88SE9123 to finally work
on Linux with either AMD-Vi or VT-d enabled on the box. I've
collected device IDs from various bugs to support as many SKUs of
these devices as possible, but I'm sure there are others that I've
missed.
This should also enable motherboards with an onboard ASmedia
ASM1083/1085 PCIe-to-PCI bridge to work with VT-d enabled. I've
acquired an adapter board with this chip, but it actually exposes
a PCIe capability, unlike most of the onboard controllers. Therefore
I expect this series will fix the WARN_ON currently hit during boot,
but there's a 50/50 chance whether the device behaves like a PCI
bridge or a PCIe bridge with regard to the requester ID that it uses
to take ownership of the transaction. If it turns out to use the
PCIe bridge model, I expect we can quirk it using a dev_flags bit
to identify a PCI bridge that takes ownership as if it was a PCIe
bridge.
Please test and provide feedback. I expect IOMMU group topology
should not change from this series, but if a case is found where it
does, please share. Also, if there are additional quirks we need
to add, please either file new or add to the existing bugs. Thanks,
Alex
---
Alex Williamson (15):
PCI: Add DMA alias iterator
PCI: quirk pci_for_each_dma_alias()
PCI: quirk dma_func_alias for Ricoh devices
PCI: quirk dma_func_alias for Marvell devices
PCI: Quirk pci_for_each_dma_alias() for bridges
PCI: Add quirks for ASMedia and Tundra bridges
PCI: Consolidate isolation domain code
iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups
iommu/amd: Update to use PCI DMA aliases
iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups
iommu/intel: Update to use PCI DMA aliases
iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups
iommu: Remove pci.h
PCI: Remove pci_find_upstream_pcie_bridge()
PCI: Remove pci_get_dma_source()
drivers/iommu/amd_iommu.c | 195 +++++------------------
drivers/iommu/amd_iommu_types.h | 1
drivers/iommu/fsl_pamu_domain.c | 67 --------
drivers/iommu/intel-iommu.c | 293 +++++++++++++----------------------
drivers/iommu/intel_irq_remapping.c | 55 +++++--
drivers/iommu/pci.h | 29 ---
drivers/pci/quirks.c | 114 ++++++++------
drivers/pci/search.c | 240 ++++++++++++++++++++++++++---
include/linux/pci.h | 23 +--
9 files changed, 486 insertions(+), 531 deletions(-)
delete mode 100644 drivers/iommu/pci.h
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
@ 2014-05-10 12:15 George Spelvin
2014-05-10 12:53 ` Andrew Cooks
2014-05-10 13:17 ` Alex Williamson
0 siblings, 2 replies; 11+ messages in thread
From: George Spelvin @ 2014-05-10 12:15 UTC (permalink / raw)
To: alex.williamson, linux; +Cc: acooks, bhelgaas, iommu, linux-kernel, linux-pci
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 13636 bytes --]
Well, chalk up one test failure.
I'm running a GA-X79-UP4 motherboard with VT-d enabled.
It has 3x Marvell SATA controllers:
05:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
07:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
With Andrew's patch from https://bugzilla.kernel.org/show_bug.cgi?id=42679#c22
the Marvell SATA ports work. With your dma-alias branch (rebased onto 3.15-rc5),
the ports don't work:
-ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
-dmar: DRHD: handling fault status reg 502
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffe0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-dmar: DRHD: handling fault status reg 700
-dmar: DRHD: handling fault status reg 702
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffe0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-dmar: DRHD: handling fault status reg 100
-ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
-ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
-dmar: DRHD: handling fault status reg 102
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-ata8.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
-ata8: limiting SATA link speed to 1.5 Gbps
-ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
-ata7: limiting SATA link speed to 1.5 Gbps
-dmar: DRHD: handling fault status reg 202
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffe0000
-DMAR:[fault reason 02] Present bit in context entry is clear
-dmar: DRHD: handling fault status reg 400
Andrew's patch was on top of 3.14.3; my initial attempts to forward-port it
to the 3.15 PCI changes (appended below) failed to boot.
commit 2323d20989aca44ce0e630e7d264b3973c1ee357
Author: Andrew Cooks <acooks@gmail.com>
Date: Mon Jan 20 07:06:58 2014 -0500
drivers/pci: Add quirk for devices that use multiple functions.
Patch taken from https://bugzilla.kernel.org/show_bug.cgi?id=42679#c22
and improved locally.
WARNING WARNING WARNING: This is a BROKEN attempt to make it work with 3.15;
it DOES NOT BOOT in the form shown here.
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index f256ffc02e..04d430d464 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1841,6 +1841,118 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
return 0;
}
+static void iommu_detach_dev(struct intel_iommu *iommu, u8 bus, u8 devfn);
+
+static void quirk_unmap_multi_requesters(struct pci_dev *pdev, u8 fn_map)
+{
+ int fn;
+ u8 bus, devfn;
+ struct intel_iommu *iommu = device_to_iommu(&pdev->dev, &bus, &devfn);
+
+ /* something must be seriously wrong if we can't lookup the iommu. */
+ BUG_ON(!iommu);
+
+ fn_map &= ~(1<<PCI_FUNC(devfn)); /* Skip the normal case */
+
+ for (fn = 0; fn_map >> fn; fn++) {
+ if (fn_map & (1<<fn)) {
+ iommu_detach_dev(iommu,
+ pdev->bus->number,
+ PCI_DEVFN(PCI_SLOT(devfn), fn));
+ dev_dbg(&pdev->dev,
+ "requester id quirk; ghost func %d unmapped",
+ fn);
+ }
+ }
+}
+
+/* For quirky devices that use multiple requester ids. */
+static int quirk_map_multi_requester_ids(struct dmar_domain *domain,
+ struct pci_dev *pdev,
+ int translation)
+{
+ int fn, err = 0;
+ u8 fn_map = pci_multi_requesters(pdev);
+ u8 bus, devfn;
+ struct intel_iommu *iommu;
+
+ /* this is the common, non-quirky case. */
+ if (!fn_map)
+ return 0;
+
+ fn_map &= ~(1<<PCI_FUNC(pdev->devfn)); /* Skip the normal case */
+
+ iommu = device_to_iommu(&pdev->dev, &bus, &devfn);
+
+ for (fn = 0; fn_map >> fn; fn++) {
+ if (fn_map & (1<<fn)) {
+ err = domain_context_mapping_one(domain, iommu, bus,
+ PCI_DEVFN(PCI_SLOT(pdev->devfn), fn),
+ translation);
+ if (err) {
+ dev_err(&pdev->dev,
+ "mapping ghost func %d failed", fn);
+ quirk_unmap_multi_requesters(pdev,
+ fn_map & ((1<<fn)-1));
+ return err;
+ }
+ dev_dbg(&pdev->dev,
+ "requester id quirk; ghost func %d mapped", fn);
+ }
+ }
+ return 0;
+}
+
+
+static void quirk_unmap_requester_id(struct pci_dev *pdev)
+{
+ u8 bus, devfn;
+ struct intel_iommu *iommu = device_to_iommu(&pdev->dev, &bus, &devfn);
+ u8 devfn2 = pci_requester(pdev);
+
+ /* something must be seriously wrong if we can't lookup the iommu. */
+ BUG_ON(!iommu);
+
+
+ if (devfn == devfn2)
+ return;
+
+ iommu_detach_dev(iommu, bus, devfn2);
+ dev_dbg(&pdev->dev, "requester id quirk; bugged device unmapped");
+}
+
+static int quirk_map_requester_id(struct dmar_domain *domain,
+ struct pci_dev *pdev,
+ int translation)
+{
+ u8 bus, devfn;
+ struct intel_iommu *iommu = device_to_iommu(&pdev->dev, &bus, &devfn);
+ u8 devfn2 = pci_requester(pdev);
+ int err;
+
+ if (!iommu)
+ return -ENODEV;
+
+ dev_dbg(&pdev->dev,
+ "checking for incorrect pci requester id quirk...");
+
+ if (devfn == devfn2)
+ return 0;
+
+ err = domain_context_mapping_one(domain, iommu, bus, devfn2,
+ translation);
+ if (err) {
+ dev_err(&pdev->dev,
+ "requester id quirk: mapping dev %02x:%02x.%d failed",
+ bus, PCI_SLOT(devfn2), PCI_FUNC(devfn2));
+ return err;
+ }
+ dev_dbg(&pdev->dev,
+ "requester id quirk; dmar context entry added: %02x:%02x.%d",
+ bus, PCI_SLOT(devfn2), PCI_FUNC(devfn2));
+ return 0;
+}
+
static int
domain_context_mapping(struct dmar_domain *domain, struct device *dev,
int translation)
@@ -1859,6 +1971,16 @@ domain_context_mapping(struct dmar_domain *domain, struct device *dev,
if (ret || !dev_is_pci(dev))
return ret;
+ /* quirk for devices using multiple pci requester ids */
+ ret = quirk_map_multi_requester_ids(domain, pdev, translation);
+ if (ret)
+ return ret;
+
+ /* quirk for devices single incorrect pci requester id */
+ ret = quirk_map_requester_id(domain, pdev, translation);
+ if (ret)
+ return ret;
+
/* dependent device mapping */
pdev = to_pci_dev(dev);
tmp = pci_find_upstream_pcie_bridge(pdev);
@@ -4076,12 +4198,18 @@ static void domain_remove_one_dev_info(struct dmar_domain *domain,
list_for_each_entry_safe(info, tmp, &domain->devices, link) {
if (info->iommu == iommu && info->bus == bus &&
info->devfn == devfn) {
+ struct pci_dev *pdev;
+
unlink_domain_info(info);
spin_unlock_irqrestore(&device_domain_lock, flags);
iommu_disable_dev_iotlb(info);
iommu_detach_dev(iommu, info->bus, info->devfn);
iommu_detach_dependent_devices(iommu, dev);
+ pdev = to_pci_dev(dev);
+ quirk_unmap_multi_requesters(pdev,
+ pci_multi_requesters(pdev));
+ quirk_unmap_requester_id(pdev);
free_devinfo_mem(info);
spin_lock_irqsave(&device_domain_lock, flags);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e7292065a1..d7a8296cc3 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3341,6 +3341,10 @@ static struct pci_dev *pci_func_0_dma_source(struct pci_dev *dev)
return pci_get_slot(dev->bus, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
}
+/* Table of source functions for real devices. The DMA requests for the
+ * device are tagged with a different real function as source. This is
+ * relevant to multifunction devices.
+ */
static const struct pci_dev_dma_source {
u16 vendor;
u16 device;
@@ -3367,7 +3371,8 @@ static const struct pci_dev_dma_source {
* the device doing the DMA, but sometimes hardware is broken and will
* tag the DMA as being sourced from a different device. This function
* allows that translation. Note that the reference count of the
- * returned device is incremented on all paths.
+ * returned device is incremented on all paths. Translation is done when
+ * the device is added to an IOMMU group.
*/
struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
{
@@ -3483,6 +3488,144 @@ static int pci_quirk_intel_pch_acs(struct pci_dev *dev, u16 acs_flags)
return acs_flags & ~flags ? 0 : 1;
}
+/* Table of multiple (ghost) source functions. Devices that may need this quirk
+ * show the following behaviour:
+ * 1. the device may use multiple PCI requester IDs during operation,
+ * (eg. one pci transaction uses xx:yy.0, the next uses xx:yy.1)
+ * 2. the requester ID may not match a known device.
+ * (eg. lspci does not show xx:yy.1 to be present)
+ *
+ * The bitmap contains all of the functions "in use" by the device.
+ * See https://bugzilla.redhat.com/show_bug.cgi?id=757166,
+ * https://bugzilla.kernel.org/show_bug.cgi?id=42679
+ * https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1089768
+ */
+static const struct pci_dev_dma_multi_source_map {
+ u16 vendor;
+ u16 device;
+ u8 func_map; /* bit map. lsb is fn 0. */
+} pci_dev_dma_multi_source_map[] = {
+ /* Reported by Patrick Bregman
+ * https://bugzilla.redhat.com/show_bug.cgi?id=863653 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9120, (1<<0)|(1<<1)},
+
+ /* Reported by Paweł Żak, Korneliusz Jarzębski, Daniel Mayer
+ * https://bugzilla.kernel.org/show_bug.cgi?id=42679 and by
+ * Justin Piszcz https://lkml.org/lkml/2012/11/24/94 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9123, (1<<0)|(1<<1)},
+
+ /* Used in a patch by Ying Chu
+ * https://bugzilla.redhat.com/show_bug.cgi?id=757166 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9125, (1<<0)|(1<<1)},
+
+ /* Reported by Robert Cicconetti
+ * https://bugzilla.kernel.org/show_bug.cgi?id=42679 and by
+ * Fernando https://bugzilla.redhat.com/show_bug.cgi?id=757166 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9128, (1<<0)|(1<<1)},
+
+ /* Reported by Stijn Tintel
+ * https://bugzilla.kernel.org/show_bug.cgi?id=42679 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9130, (1<<0)|(1<<1)},
+
+ /* Reported by Gaudenz Steinlin
+ * https://lkml.org/lkml/2013/3/5/288 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9143, (1<<0)|(1<<1)},
+
+ /* Reported by Andrew Cooks
+ * https://bugzilla.kernel.org/show_bug.cgi?id=42679 */
+ { PCI_VENDOR_ID_MARVELL_EXT, 0x9172, (1<<0)|(1<<1)},
+
+ { 0 }
+};
+
+/*
+ * The mapping of quirky requester ids is used when the device driver sets up
+ * dma, if iommu is enabled.
+ */
+u8 pci_multi_requesters(struct pci_dev *dev)
+{
+ const struct pci_dev_dma_multi_source_map *i;
+
+ for (i = pci_dev_dma_multi_source_map; i->func_map; i++) {
+ if ((i->vendor == dev->vendor ||
+ i->vendor == (u16)PCI_ANY_ID) &&
+ (i->device == dev->device ||
+ i->device == (u16)PCI_ANY_ID)) {
+ return i->func_map;
+ }
+ }
+ return 0;
+}
+
+/* These are one-to-one translations for devices that use a single incorrect
+ * requester ID. The requester id may not be the BDF of a real device.
+ */
+static const struct pci_dev_dma_source_map {
+ u16 vendor;
+ u16 device;
+ u8 devfn;
+ u8 dma_devfn;
+} pci_dev_dma_source_map[] = {
+
+ /* Ricoh IEEE 1394 Controller */
+ {
+ PCI_VENDOR_ID_RICOH,
+ 0xe832,
+ PCI_DEVFN(0x00, 3),
+ PCI_DEVFN(0x00, 0)
+ },
+
+ /* Nils Caspar - Adaptec 3405
+ * http://www.mail-archive.com/centos@centos.org/msg90986.html
+ * Jonathan McCune
+ * http://old-list-archives.xen.org/archives/html/xen-users/2010-04/msg00535.html */
+ {
+ PCI_VENDOR_ID_ADAPTEC2,
+ 0x028b,
+ PCI_DEVFN(0x0e, 0),
+ PCI_DEVFN(0x01, 0)
+ },
+
+ /* Mateusz Murawski - LSI SAS based MegaRAID
+ * https://lkml.org/lkml/2011/9/12/104
+ * M. Nunberg - Dell PERC 5/i Integrated RAID Controller
+ * http://lists.xen.org/archives/html/xen-devel/2010-05/msg01563.html */
+ {
+ PCI_VENDOR_ID_LSI_LOGIC,
+ 0x0411,
+ PCI_DEVFN(0x0e, 0),
+ PCI_DEVFN(0x08, 0)
+ },
+
+ /* Steven Dake, Markus Stockhausen - Mellanox 26428
+ * https://bugzilla.redhat.com/show_bug.cgi?id=713774
+ * Note: mellanox uses decimal product numbers, convert to hex for PCI
+ * device ID. ie. 26428 == 0x673c */
+ {
+ PCI_VENDOR_ID_MELLANOX,
+ 0x673c,
+ PCI_DEVFN(0x00, 0),
+ PCI_DEVFN(0x00, 6)
+ },
+
+ { 0 }
+};
+
+u8 pci_requester(struct pci_dev *dev)
+{
+ const struct pci_dev_dma_source_map *i;
+
+ for (i = pci_dev_dma_source_map; i->devfn; i++) {
+ if ((i->vendor == dev->vendor) &&
+ (i->device == dev->device) &&
+ (i->devfn == dev->devfn)) {
+ return i->dma_devfn;
+ }
+ }
+ return dev->devfn;
+}
+
+
static const struct pci_dev_acs_enabled {
u16 vendor;
u16 device;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index aab57b4abe..63582ad30f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1529,6 +1529,8 @@ enum pci_fixup_pass {
#ifdef CONFIG_PCI_QUIRKS
void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev);
struct pci_dev *pci_get_dma_source(struct pci_dev *dev);
+u8 pci_multi_requesters(struct pci_dev *dev);
+u8 pci_requester(struct pci_dev *dev);
int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
void pci_dev_specific_enable_acs(struct pci_dev *dev);
#else
@@ -1538,6 +1540,14 @@ static inline struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
{
return pci_dev_get(dev);
}
+u8 pci_multi_requesters(struct pci_dev *dev)
+{
+ return 0;
+}
+u8 pci_requester(struct pci_dev *dev)
+{
+ return dev->devfn;
+}
static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
u16 acs_flags)
{
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-10 12:15 [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems George Spelvin
@ 2014-05-10 12:53 ` Andrew Cooks
2014-05-10 13:17 ` Alex Williamson
1 sibling, 0 replies; 11+ messages in thread
From: Andrew Cooks @ 2014-05-10 12:53 UTC (permalink / raw)
To: George Spelvin
Cc: Alex Williamson, Bjorn Helgaas, open list:INTEL IOMMU (VT-d),
linux-kernel, open list:PCI SUBSYSTEM
On Sat, May 10, 2014 at 8:15 PM, George Spelvin <linux@horizon.com> wrote:
> Well, chalk up one test failure.
>
> I'm running a GA-X79-UP4 motherboard with VT-d enabled.
> It has 3x Marvell SATA controllers:
> 05:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
> 07:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
>
> With Andrew's patch from https://bugzilla.kernel.org/show_bug.cgi?id=42679#c22
> the Marvell SATA ports work. With your dma-alias branch (rebased onto 3.15-rc5),
> the ports don't work:
>
> -ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> -dmar: DRHD: handling fault status reg 502
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
When I checked, about 12 hours ago, the dma-alias branch on github
didn't have all the v2 changes, so I applied the v2 patch set on
3.15-rc5. You may want to try that as well.
I didn't notice any problems on the board I used: GA-X79-UD5, with the
same 88SE9172 controllers as above and with drives attached to each of
the internal SATA ports.
a.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-10 12:15 [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems George Spelvin
2014-05-10 12:53 ` Andrew Cooks
@ 2014-05-10 13:17 ` Alex Williamson
2014-05-10 14:08 ` George Spelvin
1 sibling, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2014-05-10 13:17 UTC (permalink / raw)
To: George Spelvin; +Cc: acooks, bhelgaas, iommu, linux-kernel, linux-pci
On Sat, 2014-05-10 at 08:15 -0400, George Spelvin wrote:
> Well, chalk up one test failure.
>
> I'm running a GA-X79-UP4 motherboard with VT-d enabled.
> It has 3x Marvell SATA controllers:
> 05:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
> 07:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
What's the device ID of these devices?
> With Andrew's patch from https://bugzilla.kernel.org/show_bug.cgi?id=42679#c22
> the Marvell SATA ports work. With your dma-alias branch (rebased onto 3.15-rc5),
> the ports don't work:
The dma-alias branch only has the v1 patchset. There are several more
Marvell devices included in the quirks in the v2 set. Please re-test
with this code. Thanks,
Alex
> -ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> -dmar: DRHD: handling fault status reg 502
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffe0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -dmar: DRHD: handling fault status reg 700
> -dmar: DRHD: handling fault status reg 702
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffe0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -dmar: DRHD: handling fault status reg 100
> -ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> -ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> -dmar: DRHD: handling fault status reg 102
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -ata8.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> -ata8: limiting SATA link speed to 1.5 Gbps
> -ata7.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> -ata7: limiting SATA link speed to 1.5 Gbps
> -dmar: DRHD: handling fault status reg 202
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffc0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -dmar: DMAR:[DMA Write] Request device [05:00.1] fault addr fffe0000
> -DMAR:[fault reason 02] Present bit in context entry is clear
> -dmar: DRHD: handling fault status reg 400
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-10 13:17 ` Alex Williamson
@ 2014-05-10 14:08 ` George Spelvin
2014-05-10 14:15 ` Alex Williamson
0 siblings, 1 reply; 11+ messages in thread
From: George Spelvin @ 2014-05-10 14:08 UTC (permalink / raw)
To: alex.williamson, linux; +Cc: acooks, bhelgaas, iommu, linux-kernel, linux-pci
> What's the device ID of these devices?
Oops! It's
05:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172] (rev 11)
06:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172] (rev 11)
07:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172] (rev 11)
> The dma-alias branch only has the v1 patchset. There are several more
> Marvell devices included in the quirks in the v2 set. Please re-test
> with this code. Thanks,
Oh, okay! I got the patches from github because it was a lot easier
to get the whole set that way. (I don't actually have an SMTP LKML
subscription, so I have to copy them all from a web archive, then
manually edit the headers into something RFC822-like enough for "git am"
to accept.)
Mea culpa for forgetting to check the version. And thank you for the response!
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-10 14:08 ` George Spelvin
@ 2014-05-10 14:15 ` Alex Williamson
2014-05-10 16:16 ` George Spelvin
0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2014-05-10 14:15 UTC (permalink / raw)
To: George Spelvin; +Cc: acooks, bhelgaas, iommu, linux-kernel, linux-pci
On Sat, 2014-05-10 at 10:08 -0400, George Spelvin wrote:
> > What's the device ID of these devices?
>
> Oops! It's
>
> 05:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172] (rev 11)
> 06:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172] (rev 11)
> 07:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172] (rev 11)
>
> > The dma-alias branch only has the v1 patchset. There are several more
> > Marvell devices included in the quirks in the v2 set. Please re-test
> > with this code. Thanks,
>
> Oh, okay! I got the patches from github because it was a lot easier
> to get the whole set that way. (I don't actually have an SMTP LKML
> subscription, so I have to copy them all from a web archive, then
> manually edit the headers into something RFC822-like enough for "git am"
> to accept.)
>
> Mea culpa for forgetting to check the version. And thank you for the response!
Hi George,
I'll post a v3 soon, I think we figured out the problem Andrew was
having, a couple typos on my part. I'll push out a new branch for that
so you don't need to piece it together yourself. Thanks,
Alex
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-10 14:15 ` Alex Williamson
@ 2014-05-10 16:16 ` George Spelvin
0 siblings, 0 replies; 11+ messages in thread
From: George Spelvin @ 2014-05-10 16:16 UTC (permalink / raw)
To: alex.williamson, linux; +Cc: acooks, bhelgaas, iommu, linux-kernel, linux-pci
> I'll post a v3 soon, I think we figured out the problem Andrew was
> having, a couple typos on my part. I'll push out a new branch for that
> so you don't need to piece it together yourself. Thanks,
Don't worry; I imported v2, and have it running successfully right now.
I ran e2fsck and hdparm -t on a couple of drives on the controller
(including in parallel), with no problems.
Again, sorry for my earlier laziness.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-09 15:28 Alex Williamson
@ 2014-05-14 23:40 ` Andrew Cooks
2014-05-15 17:43 ` Alex Williamson
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooks @ 2014-05-14 23:40 UTC (permalink / raw)
To: Alex Williamson
Cc: open list:PCI SUBSYSTEM, open list:INTEL IOMMU (VT-d),
Bjorn Helgaas, open list
Hi Alex
On Fri, May 9, 2014 at 11:28 PM, Alex Williamson
<alex.williamson@redhat.com> wrote:
> ....
>
> Original description:
>
> This series attempts to fix a couple issues we've had outstanding in
> the PCI/IOMMU code for a while. The first issue is with devices that
> use the wrong requester ID for DMA transactions. We already have a
> sort of half-baked attempt to fix this for several Ricoh devices, but
> the fix only helps them be useful through IOMMU groups, not the
> general DMA case. There are also several Marvell devices which use
> use a different wrong requester ID and don't even fit into the DMA
> source idea. This series creates a DMA alias iterator that will
> step through each possible alias of a device, allowing IOMMUs to
> insert mappings for both the device and its aliases.
>
> Hand-in-hand with this is our broken pci_find_upstream_pcie_bridge()
> function, which is known to blowup when it finds itself suddenly at
> a PCIe device without crossing a PCIe-to-PCI bridge (as identified by
> the PCIe capability). It also likes to make the invalid assumption
> that a PCIe device never has its requester ID masked by any usptream
> bus. We can fix this using the above new DMA alias iterator, since
> that's effectively what this function was meant to do.
>
There are two cases where the DMA requester id seems to use the wrong
slot (as opposed to function) in the patch I attached to
https://bugzilla.kernel.org/show_bug.cgi?id=42679
The original bug reports are listed in the patch.
Unfortunately, I wasn't able to get test feedback on those two cases,
but I'm wondering...
Did I understand correctly that a slot alias is something that could be needed?
If so, is it a variation of the PCIe-to-PCI bridge case that's already
covered or will it require a different approach?
Thanks,
a.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-14 23:40 ` Andrew Cooks
@ 2014-05-15 17:43 ` Alex Williamson
2014-05-15 23:45 ` Andrew Cooks
0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2014-05-15 17:43 UTC (permalink / raw)
To: Andrew Cooks
Cc: open list:PCI SUBSYSTEM, open list:INTEL IOMMU (VT-d),
Bjorn Helgaas, open list
On Thu, 2014-05-15 at 07:40 +0800, Andrew Cooks wrote:
> Hi Alex
>
> On Fri, May 9, 2014 at 11:28 PM, Alex Williamson
> <alex.williamson@redhat.com> wrote:
> > ....
> >
> > Original description:
> >
> > This series attempts to fix a couple issues we've had outstanding in
> > the PCI/IOMMU code for a while. The first issue is with devices that
> > use the wrong requester ID for DMA transactions. We already have a
> > sort of half-baked attempt to fix this for several Ricoh devices, but
> > the fix only helps them be useful through IOMMU groups, not the
> > general DMA case. There are also several Marvell devices which use
> > use a different wrong requester ID and don't even fit into the DMA
> > source idea. This series creates a DMA alias iterator that will
> > step through each possible alias of a device, allowing IOMMUs to
> > insert mappings for both the device and its aliases.
> >
> > Hand-in-hand with this is our broken pci_find_upstream_pcie_bridge()
> > function, which is known to blowup when it finds itself suddenly at
> > a PCIe device without crossing a PCIe-to-PCI bridge (as identified by
> > the PCIe capability). It also likes to make the invalid assumption
> > that a PCIe device never has its requester ID masked by any usptream
> > bus. We can fix this using the above new DMA alias iterator, since
> > that's effectively what this function was meant to do.
> >
>
> There are two cases where the DMA requester id seems to use the wrong
> slot (as opposed to function) in the patch I attached to
> https://bugzilla.kernel.org/show_bug.cgi?id=42679
> The original bug reports are listed in the patch.
>
> Unfortunately, I wasn't able to get test feedback on those two cases,
> but I'm wondering...
> Did I understand correctly that a slot alias is something that could be needed?
> If so, is it a variation of the PCIe-to-PCI bridge case that's already
> covered or will it require a different approach?
Wow, I didn't think that kind of broken was possible. Maybe instead of
a bitmap of function aliases we could have a single devfn alias for a
device. That means we'd only be able to support a single alias for a
device, but since I don't think we've seen devices that use more than a
single alias, maybe that's ok. I'll see what changes we'd need to make
for that, but I probably won't go adding the actual quirk based on those
old reports. Thanks,
Alex
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-15 17:43 ` Alex Williamson
@ 2014-05-15 23:45 ` Andrew Cooks
2014-05-16 3:55 ` Alex Williamson
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooks @ 2014-05-15 23:45 UTC (permalink / raw)
To: Alex Williamson
Cc: open list:PCI SUBSYSTEM, open list:INTEL IOMMU (VT-d),
Bjorn Helgaas, open list
Hi Alex
On Fri, May 16, 2014 at 1:43 AM, Alex Williamson
<alex.williamson@redhat.com> wrote:
> On Thu, 2014-05-15 at 07:40 +0800, Andrew Cooks wrote:
>> Hi Alex
>>
>> On Fri, May 9, 2014 at 11:28 PM, Alex Williamson
>> <alex.williamson@redhat.com> wrote:
>> > ....
>> >
>> > Original description:
>> >
>> > This series attempts to fix a couple issues we've had outstanding in
>> > the PCI/IOMMU code for a while. The first issue is with devices that
>> > use the wrong requester ID for DMA transactions. We already have a
>> > sort of half-baked attempt to fix this for several Ricoh devices, but
>> > the fix only helps them be useful through IOMMU groups, not the
>> > general DMA case. There are also several Marvell devices which use
>> > use a different wrong requester ID and don't even fit into the DMA
>> > source idea. This series creates a DMA alias iterator that will
>> > step through each possible alias of a device, allowing IOMMUs to
>> > insert mappings for both the device and its aliases.
>> >
>> > Hand-in-hand with this is our broken pci_find_upstream_pcie_bridge()
>> > function, which is known to blowup when it finds itself suddenly at
>> > a PCIe device without crossing a PCIe-to-PCI bridge (as identified by
>> > the PCIe capability). It also likes to make the invalid assumption
>> > that a PCIe device never has its requester ID masked by any usptream
>> > bus. We can fix this using the above new DMA alias iterator, since
>> > that's effectively what this function was meant to do.
>> >
>>
>> There are two cases where the DMA requester id seems to use the wrong
>> slot (as opposed to function) in the patch I attached to
>> https://bugzilla.kernel.org/show_bug.cgi?id=42679
>> The original bug reports are listed in the patch.
>>
>> Unfortunately, I wasn't able to get test feedback on those two cases,
>> but I'm wondering...
>> Did I understand correctly that a slot alias is something that could be needed?
>> If so, is it a variation of the PCIe-to-PCI bridge case that's already
>> covered or will it require a different approach?
>
> Wow, I didn't think that kind of broken was possible. Maybe instead of
> a bitmap of function aliases we could have a single devfn alias for a
> device. That means we'd only be able to support a single alias for a
> device, but since I don't think we've seen devices that use more than a
> single alias, maybe that's ok.
The current patch creates a context entry for the reported device
(function 0), plus it's alias (function 1). Is there reason to always
add a context entry for the reported devfn and define 'alias' to mean
'one additional devfn'? That will work for all the Marvell cases.
In the testing I did, the Marvell controllers needed context entries
for both function 0 and 1. It would be nice if someone could confirm
or debunk this. I tested with a 88SE9172 with both ports of the
controller in use.
Thanks,
a.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems
2014-05-15 23:45 ` Andrew Cooks
@ 2014-05-16 3:55 ` Alex Williamson
0 siblings, 0 replies; 11+ messages in thread
From: Alex Williamson @ 2014-05-16 3:55 UTC (permalink / raw)
To: Andrew Cooks
Cc: open list:PCI SUBSYSTEM, open list:INTEL IOMMU (VT-d),
Bjorn Helgaas, open list
On Fri, 2014-05-16 at 07:45 +0800, Andrew Cooks wrote:
> Hi Alex
>
> On Fri, May 16, 2014 at 1:43 AM, Alex Williamson
> <alex.williamson@redhat.com> wrote:
> > On Thu, 2014-05-15 at 07:40 +0800, Andrew Cooks wrote:
> >> Hi Alex
> >>
> >> On Fri, May 9, 2014 at 11:28 PM, Alex Williamson
> >> <alex.williamson@redhat.com> wrote:
> >> > ....
> >> >
> >> > Original description:
> >> >
> >> > This series attempts to fix a couple issues we've had outstanding in
> >> > the PCI/IOMMU code for a while. The first issue is with devices that
> >> > use the wrong requester ID for DMA transactions. We already have a
> >> > sort of half-baked attempt to fix this for several Ricoh devices, but
> >> > the fix only helps them be useful through IOMMU groups, not the
> >> > general DMA case. There are also several Marvell devices which use
> >> > use a different wrong requester ID and don't even fit into the DMA
> >> > source idea. This series creates a DMA alias iterator that will
> >> > step through each possible alias of a device, allowing IOMMUs to
> >> > insert mappings for both the device and its aliases.
> >> >
> >> > Hand-in-hand with this is our broken pci_find_upstream_pcie_bridge()
> >> > function, which is known to blowup when it finds itself suddenly at
> >> > a PCIe device without crossing a PCIe-to-PCI bridge (as identified by
> >> > the PCIe capability). It also likes to make the invalid assumption
> >> > that a PCIe device never has its requester ID masked by any usptream
> >> > bus. We can fix this using the above new DMA alias iterator, since
> >> > that's effectively what this function was meant to do.
> >> >
> >>
> >> There are two cases where the DMA requester id seems to use the wrong
> >> slot (as opposed to function) in the patch I attached to
> >> https://bugzilla.kernel.org/show_bug.cgi?id=42679
> >> The original bug reports are listed in the patch.
> >>
> >> Unfortunately, I wasn't able to get test feedback on those two cases,
> >> but I'm wondering...
> >> Did I understand correctly that a slot alias is something that could be needed?
> >> If so, is it a variation of the PCIe-to-PCI bridge case that's already
> >> covered or will it require a different approach?
> >
> > Wow, I didn't think that kind of broken was possible. Maybe instead of
> > a bitmap of function aliases we could have a single devfn alias for a
> > device. That means we'd only be able to support a single alias for a
> > device, but since I don't think we've seen devices that use more than a
> > single alias, maybe that's ok.
>
> The current patch creates a context entry for the reported device
> (function 0), plus it's alias (function 1). Is there reason to always
> add a context entry for the reported devfn and define 'alias' to mean
> 'one additional devfn'? That will work for all the Marvell cases.
Right, that's effectively what it would become, probably making use of a
flag bit to indicate whether the alias devfn is valid. The
pci_for_each_dma_alias() would just drop the loop over the
dma_alias_funcs bitmap at replace it with a test of the flag and single
dma alias devfn. I need to think about whether the IOMMU group code can
is such a simple replacement.
I think it makes sense to always include both the actual devfn and, if
it exists, an alias devfn. Otherwise we'd just end up with more quirks
to add later.
> In the testing I did, the Marvell controllers needed context entries
> for both function 0 and 1. It would be nice if someone could confirm
> or debunk this. I tested with a 88SE9172 with both ports of the
> controller in use.
I think 0 needs to be quirked to 1, but I don't think 1 needs to be
quirked to 0. Unfortunately intel-iommu doesn't do any of the reference
counting that amd_iommu does, so if we have 0 aliased to 1 and we unbind
function 0 from the driver, intel-iommu will also teardown the mapping
for function 1. It's horrible. That's a problem beyond what I'm trying
to do here though, it already exists if you have two devices behind a
PCIe-to-PCI bridge. Thanks,
Alex
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2014-05-16 3:55 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-10 12:15 [PATCH v2 00/15] PCI/iommu: Fix DMA alias problems George Spelvin
2014-05-10 12:53 ` Andrew Cooks
2014-05-10 13:17 ` Alex Williamson
2014-05-10 14:08 ` George Spelvin
2014-05-10 14:15 ` Alex Williamson
2014-05-10 16:16 ` George Spelvin
-- strict thread matches above, loose matches on Subject: below --
2014-05-09 15:28 Alex Williamson
2014-05-14 23:40 ` Andrew Cooks
2014-05-15 17:43 ` Alex Williamson
2014-05-15 23:45 ` Andrew Cooks
2014-05-16 3:55 ` Alex Williamson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).