* oops in pci_acs_path_enabled
From: David Ahern @ 2012-08-03 17:39 UTC
To: Alex Williamson; +Cc: LKML

[-- Attachment #1: Type: text/plain, Size: 132 bytes --]

Hi Alex:

Hitting an oops with 3.6-rc1. Backtrace from console attached. git blame
for the top function points to ad805758.

David

[-- Attachment #2: oops2.png --]
[-- Type: image/png, Size: 45774 bytes --]

* Re: oops in pci_acs_path_enabled
From: Alex Williamson @ 2012-08-03 20:21 UTC
To: David Ahern; +Cc: LKML

On Fri, 2012-08-03 at 11:39 -0600, David Ahern wrote:
> Hi Alex:
>
> Hitting an oops with 3.6-rc1. Backtrace from console attached. git blame
> for the top function points to ad805758.

Hey David,

Hmm, what's special about your system?  I've got an 82576 here and the
same path works fine.  Any way you can get the top of the oops message?
Thanks,

Alex

* Re: oops in pci_acs_path_enabled
From: David Ahern @ 2012-08-03 21:12 UTC
To: Alex Williamson; +Cc: LKML

[-- Attachment #1: Type: text/plain, Size: 2216 bytes --]

On 8/3/12 2:21 PM, Alex Williamson wrote:
> On Fri, 2012-08-03 at 11:39 -0600, David Ahern wrote:
>> Hi Alex:
>>
>> Hitting an oops with 3.6-rc1. Backtrace from console attached. git blame
>> for the top function points to ad805758.
>
> Hey David,
>
> Hmm, what's special about your system?  I've got an 82576 here and the
> same path works fine.  Any way you can get the top of the oops message?
> Thanks,
>
> Alex
>

Dell R410 I believe. Pair of 5620 processors. 3 overlapping screen shots
attached. objdump on pci.o suggests the pdev is NULL:

/opt/sw/ahern/kernels/kernel.git/drivers/pci/pci.c:2454

        ret = pci_dev_specific_acs_enabled(pdev, acs_flags);
        if (ret >= 0)
                return ret > 0;

        if (!pci_is_pcie(pdev))
    408a:       41 80 7c 24 4a 00       cmpb   $0x0,0x4a(%r12)
    4090:       74 e8                   je     407a <pci_acs_enabled+0x2a>


Perhaps this bug explains the larger issue, which is that device
passthrough in 3.6-rc1 (0d7614f) is broken for me -- the config file for
the PCI device does not exist. e.g.,

pcilib: Cannot open /sys/bus/pci/devices/0000:06:10.0/config
lspci: Unable to read the standard configuration space header of device
0000:06:10.0
pcilib: Cannot open /sys/bus/pci/devices/0000:06:10.0/config
lspci: Unable to read the standard configuration space header of device
0000:06:10.0
failed to find vendor-product id for PCI id "06:10.0"
Failed to claim PCI device 06:10.0

git bisect points to:

783f157bc5a7fa30ee17b4099b27146bd1b68af4 is the first bad commit
commit 783f157bc5a7fa30ee17b4099b27146bd1b68af4
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Wed May 30 14:19:43 2012 -0600

     intel-iommu: Make use of DMA quirks and ACS checks in IOMMU groups

     Work around broken devices and adhere to ACS support when determining
     IOMMU grouping.

     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
     Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

:040000 040000 83890398dabbf225fd0f5b3c8c3713a75b3fb5e1
b674ce2ecb315393a8c6c1ac98b3796d5ba09708 M      drivers

I triggered the oops in a number of the bisect points as well -- in
those cases the machine had to be power cycled.

David

[-- Attachment #2: oops1.png --]
[-- Type: image/png, Size: 42116 bytes --]

[-- Attachment #3: oops2.png --]
[-- Type: image/png, Size: 40206 bytes --]

[-- Attachment #4: oops3.png --]
[-- Type: image/png, Size: 39749 bytes --]

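As a rough illustration of the objdump reading in the message above, the following userspace sketch models the single byte load that the `cmpb $0x0,0x4a(%r12)` instruction performs. It assumes, as David's reading implies, that `%r12` holds `pdev`. The struct `model_pci_dev`, its field name `pcie_cap`, and both helper functions are hypothetical stand-ins invented for this sketch; only the 0x4a offset comes from the listing, and the real `struct pci_dev` layout is not shown in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct pci_dev. Only the byte that the
 * disassembled cmpb reads is modelled, padded out to offset 0x4a to
 * match the listing. The field name is an assumption.
 */
struct model_pci_dev {
	uint8_t pad[0x4a];
	uint8_t pcie_cap;	/* nonzero when a PCIe capability is present */
};

/*
 * Same shape as the test in the listing: one byte load at pdev + 0x4a.
 * Calling this with pdev == NULL would fault, which is the suspected oops.
 */
static bool model_is_pcie(const struct model_pci_dev *pdev)
{
	return pdev->pcie_cap != 0;
}

/* Address the byte load would touch for a given pdev pointer value. */
static uintptr_t model_load_address(uintptr_t pdev)
{
	return pdev + offsetof(struct model_pci_dev, pcie_cap);
}
```

Under these assumptions a NULL `pdev` turns the load into an access at address 0x4a, so a fault address of that value at the top of the oops would be consistent with the NULL-pdev theory.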
* Re: oops in pci_acs_path_enabled
From: Alex Williamson @ 2012-08-03 21:52 UTC
To: David Ahern; +Cc: LKML

On Fri, 2012-08-03 at 15:12 -0600, David Ahern wrote:
> On 8/3/12 2:21 PM, Alex Williamson wrote:
> > On Fri, 2012-08-03 at 11:39 -0600, David Ahern wrote:
> >> Hi Alex:
> >>
> >> Hitting an oops with 3.6-rc1. Backtrace from console attached. git blame
> >> for the top function points to ad805758.
> >
> > Hey David,
> >
> > Hmm, what's special about your system?  I've got an 82576 here and the
> > same path works fine.  Any way you can get the top of the oops message?
> > Thanks,
> >
> > Alex
> >
>
> Dell R410 I believe. Pair of 5620 processors. 3 overlapping screen shots
> attached. objdump on pci.o suggests the pdev is NULL:
>
> /opt/sw/ahern/kernels/kernel.git/drivers/pci/pci.c:2454
>
>          ret = pci_dev_specific_acs_enabled(pdev, acs_flags);
>          if (ret >= 0)
>                  return ret > 0;
>
>          if (!pci_is_pcie(pdev))
>      408a:       41 80 7c 24 4a 00       cmpb   $0x0,0x4a(%r12)
>      4090:       74 e8                   je     407a <pci_acs_enabled+0x2a>
>
>
> Perhaps this bug explains the larger issue, which is that device
> passthrough in 3.6-rc1 (0d7614f) is broken for me -- the config file for
> the PCI device does not exist. e.g.,
>
> pcilib: Cannot open /sys/bus/pci/devices/0000:06:10.0/config
> lspci: Unable to read the standard configuration space header of device
> 0000:06:10.0
> pcilib: Cannot open /sys/bus/pci/devices/0000:06:10.0/config
> lspci: Unable to read the standard configuration space header of device
> 0000:06:10.0
> failed to find vendor-product id for PCI id "06:10.0"
> Failed to claim PCI device 06:10.0
>
> git bisect points to:
>
> 783f157bc5a7fa30ee17b4099b27146bd1b68af4 is the first bad commit
> commit 783f157bc5a7fa30ee17b4099b27146bd1b68af4
> Author: Alex Williamson <alex.williamson@redhat.com>
> Date:   Wed May 30 14:19:43 2012 -0600
>
>      intel-iommu: Make use of DMA quirks and ACS checks in IOMMU groups
>
>      Work around broken devices and adhere to ACS support when determining
>      IOMMU grouping.
>
>      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
>
> :040000 040000 83890398dabbf225fd0f5b3c8c3713a75b3fb5e1
> b674ce2ecb315393a8c6c1ac98b3796d5ba09708 M      drivers
>
> I triggered the oops in a number of the bisect points as well -- in
> those cases the machine had to be power cycled.

Is this the chunk that's causing the oops?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7469b53..27d8c97 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4133,6 +4133,7 @@ static int intel_iommu_add_device(struct device *dev)
 					  PCI_DEVFN(PCI_SLOT(dma_pdev->devfn),
 					  0)));
 
+#if 0
 	while (!pci_is_root_bus(dma_pdev->bus)) {
 		if (pci_acs_path_enabled(dma_pdev->bus->self,
 					 NULL, REQ_ACS_FLAGS))
@@ -4140,6 +4141,7 @@ static int intel_iommu_add_device(struct device *dev)
 
 		swap_pci_ref(&dma_pdev, pci_dev_get(dma_pdev->bus->self));
 	}
+#endif
 
 	group = iommu_group_get(&dma_pdev->dev);
 	pci_dev_put(dma_pdev);

* Re: oops in pci_acs_path_enabled
From: David Ahern @ 2012-08-03 22:08 UTC
To: Alex Williamson; +Cc: LKML

On 8/3/12 3:52 PM, Alex Williamson wrote:
> Is this the chunk that's causing the oops?

Yes. And taking it out fixes passthrough as well.

David

>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 7469b53..27d8c97 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4133,6 +4133,7 @@ static int intel_iommu_add_device(struct device *dev)
>   					  PCI_DEVFN(PCI_SLOT(dma_pdev->devfn),
>   					  0)));
>
> +#if 0
>   	while (!pci_is_root_bus(dma_pdev->bus)) {
>   		if (pci_acs_path_enabled(dma_pdev->bus->self,
>   					 NULL, REQ_ACS_FLAGS))
> @@ -4140,6 +4141,7 @@ static int intel_iommu_add_device(struct device *dev)
>
>   		swap_pci_ref(&dma_pdev, pci_dev_get(dma_pdev->bus->self));
>   	}
> +#endif
>
>   	group = iommu_group_get(&dma_pdev->dev);
>   	pci_dev_put(dma_pdev);
>
>

* Re: oops in pci_acs_path_enabled
From: Alex Williamson @ 2012-08-04  1:41 UTC
To: David Ahern; +Cc: LKML

On Fri, 2012-08-03 at 16:08 -0600, David Ahern wrote:
> On 8/3/12 3:52 PM, Alex Williamson wrote:
> > Is this the chunk that's causing the oops?
>
> Yes. And taking it out fixes passthrough as well.

Hey David,

One more test please.  It looks like sriov creates buses where bus->self
is NULL.  I think what we want to do in this case is to look at
bus->parent->self.  The patch below redefines pci_acs_path_enabled
slightly to allow it to do this.  The caller needs to change too, but
this also allows us to be more consistent about applying quirks and
dealing with multifunction devices.  If this works I'll apply the same
change to amd_iommu and submit.  Thanks,

Alex

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7469b53..4e37e9b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4124,8 +4124,14 @@ static int intel_iommu_add_device(struct device *dev)
 	} else
 		dma_pdev = pci_dev_get(pdev);
 
+acs_retest:
+	/* Account for quirked devices */
 	swap_pci_ref(&dma_pdev, pci_get_dma_source(dma_pdev));
 
+	/*
+	 * If it's a multifunction device that does not support our
+	 * required ACS flags, add to the same group as function 0.
+	 */
 	if (dma_pdev->multifunction &&
 	    !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS))
 		swap_pci_ref(&dma_pdev,
@@ -4133,14 +4139,29 @@ static int intel_iommu_add_device(struct device *dev)
 					  PCI_DEVFN(PCI_SLOT(dma_pdev->devfn),
 					  0)));
 
-	while (!pci_is_root_bus(dma_pdev->bus)) {
-		if (pci_acs_path_enabled(dma_pdev->bus->self,
-					 NULL, REQ_ACS_FLAGS))
-			break;
+	/*
+	 * Test ACS support from our current DMA device up to the top of the
+	 * hierarchy.  If the test fails, go to the next upstream device and
+	 * try again.  Devices on the root bus always go through the iommu.
+	 */
+	if (!pci_is_root_bus(dma_pdev->bus)) {
+		struct pci_bus *bus = dma_pdev->bus;
+
+		if (pci_acs_path_enabled(bus, NULL, REQ_ACS_FLAGS))
+			goto done;
+
+		while (!bus->self) {
+			if (!pci_is_root_bus(bus))
+				bus = bus->parent;
+			else
+				goto done;
+		}
 
-		swap_pci_ref(&dma_pdev, pci_dev_get(dma_pdev->bus->self));
+		swap_pci_ref(&dma_pdev, pci_dev_get(bus->self));
+		goto acs_retest;
 	}
 
+done:
 	group = iommu_group_get(&dma_pdev->dev);
 	pci_dev_put(dma_pdev);
 	if (!group) {
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f3ea977..995c13f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2475,21 +2475,28 @@ bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags)
 }
 
 /**
- * pci_acs_path_enable - test ACS flags from start to end in a hierarchy
- * @start: starting downstream device
+ * pci_acs_path_enabled - test ACS flags from a starting bus to an end device
+ * @bus: starting downstream bus
  * @end: ending upstream device or NULL to search to the root bus
  * @acs_flags: required flags
  *
- * Walk up a device tree from start to end testing PCI ACS support.  If
+ * Walk up a PCI hierarchy from bus to end testing PCI ACS support.  If
  * any step along the way does not support the required flags, return false.
  */
-bool pci_acs_path_enabled(struct pci_dev *start,
+bool pci_acs_path_enabled(struct pci_bus *bus,
 			  struct pci_dev *end, u16 acs_flags)
 {
-	struct pci_dev *pdev, *parent = start;
+	struct pci_dev *pdev;
 
 	do {
-		pdev = parent;
+		while (!bus->self) {
+			if (!pci_is_root_bus(bus))
+				bus = bus->parent;
+			else
+				return (end == NULL);
+		}
+
+		pdev = bus->self;
 
 		if (!pci_acs_enabled(pdev, acs_flags))
 			return false;
@@ -2497,7 +2504,7 @@ bool pci_acs_path_enabled(struct pci_dev *start,
 		if (pci_is_root_bus(pdev->bus))
 			return (end == NULL);
 
-		parent = pdev->bus->self;
+		bus = bus->self->bus;
 	} while (pdev != end);
 
 	return true;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5faa831..eb9773c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1652,7 +1652,7 @@ static inline bool pci_is_pcie(struct pci_dev *dev)
 
 void pci_request_acs(void);
 bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags);
-bool pci_acs_path_enabled(struct pci_dev *start,
+bool pci_acs_path_enabled(struct pci_bus *bus,
 			  struct pci_dev *end, u16 acs_flags);
 
 #define PCI_VPD_LRDT			0x80	/* Large Resource Data Type */

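The bus walk in the patch above can be tried outside the kernel with a small userspace model. This is only a sketch: `model_pci_bus`, `model_pci_dev`, the `acs_ok` flag (standing in for the result of `pci_acs_enabled()`), and the rule that a root bus is one with no parent are all assumptions made for illustration, not kernel definitions. The loop body follows the control flow of the revised `pci_acs_path_enabled()` as posted.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_pci_bus;

/* Hypothetical stand-in for struct pci_dev. */
struct model_pci_dev {
	struct model_pci_bus *bus;	/* bus this device sits on */
	bool acs_ok;			/* stand-in for pci_acs_enabled() */
};

/* Hypothetical stand-in for struct pci_bus. */
struct model_pci_bus {
	struct model_pci_bus *parent;	/* NULL on the root bus (assumption) */
	struct model_pci_dev *self;	/* bridge above this bus, may be NULL */
};

static bool model_is_root_bus(const struct model_pci_bus *bus)
{
	return bus->parent == NULL;
}

/* Same control flow as the revised pci_acs_path_enabled() in the patch. */
static bool model_acs_path_enabled(struct model_pci_bus *bus,
				   struct model_pci_dev *end)
{
	struct model_pci_dev *pdev;

	do {
		/* Skip buses with no bridge device, e.g. the sriov case. */
		while (!bus->self) {
			if (!model_is_root_bus(bus))
				bus = bus->parent;
			else
				return end == NULL;
		}

		pdev = bus->self;

		if (!pdev->acs_ok)
			return false;

		if (model_is_root_bus(pdev->bus))
			return end == NULL;

		bus = pdev->bus;
	} while (pdev != end);

	return true;
}
```

With a topology of root bus, one bridge, the bridge's secondary bus, and below that a bus whose `self` is NULL, the walk starting at the lowest bus steps to its parent instead of dereferencing a NULL bridge, and the result then depends only on the bridge's `acs_ok` value.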