* Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
@ 2012-11-24 19:40 Justin Piszcz
  2012-11-26 21:42 ` Bruno Prémont
  0 siblings, 1 reply; 24+ messages in thread
From: Justin Piszcz @ 2012-11-24 19:40 UTC
To: support, linux-kernel

Hi,

Is the following normal on an X9SRL-F board (bios 1.0a)?

In the manual it states:

Data Direct I/O
Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
significantly reduces CPU overhead by leveraging CPU architectural
improvements and freeing the system resource for other tasks. The options
are Disabled and Enabled.

Default is Enabled.

When enabled in the kernel, I see the following:

[ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
[ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
[ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
[ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
[ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
[ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
[ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
[ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
[ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
[ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
[ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
[ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
[ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
[ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
[ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
[ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
[ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
[ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
[ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
[ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
[ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
[ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
[ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
[ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
[ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed

--

Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
ignored, it fails to work:
[ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored

I assume this is something Supermicro has to fix?

Justin.

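[Editorial aside] The log in this message repeats the same three-line failure for each of the eight functions 00:04.0 through 00:04.7. A minimal sketch of how such output could be grouped per device follows. The helper names (`group_ioat_messages`, `failed_devices`) are hypothetical and not part of any kernel tooling, and the regular expression assumes the `[ timestamp] ioatdma <pci-address>: <message>` layout shown in the log.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the log in this thread:
#   [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
IOAT_LINE = re.compile(
    r"\[\s*\d+\.\d+\]\s+ioatdma\s+"
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+(?P<msg>.+)"
)


def group_ioat_messages(dmesg_text):
    """Return {pci_address: [message, ...]} for every per-device ioatdma line."""
    grouped = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = IOAT_LINE.search(line)
        if match:
            grouped[match.group("dev")].append(match.group("msg").strip())
    return dict(grouped)


def failed_devices(dmesg_text):
    """Return the sorted PCI addresses whose log contains an 'init failed' line."""
    grouped = group_ioat_messages(dmesg_text)
    return sorted(
        dev for dev, msgs in grouped.items()
        if any("init failed" in msg for msg in msgs)
    )
```

Feeding the text of `dmesg` into `failed_devices()` gives one address per failing function, which is easier to compare across kernel builds than the raw log. The driver banner line has no PCI address and is skipped on purpose.
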
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-24 19:40 Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question Justin Piszcz
@ 2012-11-26 21:42 ` Bruno Prémont
  2012-11-27 0:50 ` Justin Piszcz
  2012-11-27 0:56 ` Bjorn Helgaas
  0 siblings, 2 replies; 24+ messages in thread
From: Bruno Prémont @ 2012-11-26 21:42 UTC
To: Justin Piszcz; +Cc: support, linux-kernel

Hi Justin,

On Sat, 24 November 2012 "Justin Piszcz" wrote:
> Is the following normal on an X9SRL-F board (bios 1.0a)?
>
> In the manual it states:
>
> Data Direct I/O
> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
> significantly reduces CPU overhead by leveraging CPU architectural
> improvements and freeing the system resource for other tasks. The options
> are Disabled and Enabled.
>
> Default is Enabled.
>
> When enabled in the kernel, I see the following:
>
> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
> [ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
> [ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
> [ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
> [ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
> [ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
> [ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
> [ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
> [ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
> [ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
> [ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
> [ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
> [ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
> [ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
> [ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
> [ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
> [ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
> [ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
> [ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
> [ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
> [ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
> [ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
> [ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
> [ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>
> --
>
> Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
> ignored, it fails to work:
> [ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>
> I assume this is something Supermicro has to fix?

You are probably missing some kernel config option(s) :) - I did fight similar
issues on a Fujitsu SandyBridge Xeon based server.

Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.

Bruno

* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-26 21:42 ` Bruno Prémont
@ 2012-11-27 0:50 ` Justin Piszcz
  2012-11-27 0:56 ` Bjorn Helgaas
  1 sibling, 0 replies; 24+ messages in thread
From: Justin Piszcz @ 2012-11-27 0:50 UTC
To: 'Bruno Prémont'; +Cc: support, linux-kernel

> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable

> I assume this is something Supermicro has to fix?

You are probably missing some kernel config option(s) :) - I did fight similar
issues on a Fujitsu SandyBridge Xeon based server.

Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.

Bruno

=> Enabled:
CONFIG_IOMMU_SUPPORT
CONFIG_INTEL_IOMMU
CONFIG_INTEL_IOMMU_DEFAULT_ON
CONFIG_IRQ_REMAP

Also tried enabling NUMA, etc:

[ 0.330998] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[ 0.331068] ACPI: bus type pci registered
[ 0.615234] ACPI: Dynamic OEM Table Load:
[ 0.615373] ACPI: PRAD (null) 000BE (v02 PRADID PRADTID 00000001 MSFT 04000000)
[ 0.615631] \_SB_:_OSC invalid UUID
[ 0.615633] _OSC request data:1 7
[ 0.663138] pci 0000:ff:13.5: [8086:3c44] type 00 class 0x110100
[ 0.663170] pci 0000:ff:13.6: [8086:3c45] type 00 class 0x088000
[ 0.663211] pci0000:ff: ACPI _OSC support notification failed, disabling PCIe ASPM
[ 0.663281] pci0000:ff: Unable to request _OSC control (_OSC support mask: 0x08)

:(

Justin.

* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-26 21:42 ` Bruno Prémont
  2012-11-27 0:50 ` Justin Piszcz
@ 2012-11-27 0:56 ` Bjorn Helgaas
  2012-11-27 1:00 ` Bjorn Helgaas
  2012-11-27 1:11 ` Dan Williams
  1 sibling, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2012-11-27 0:56 UTC
To: Bruno Prémont; +Cc: Justin Piszcz, support, linux-kernel, Dan Williams

[+cc Dan]

On Mon, Nov 26, 2012 at 2:42 PM, Bruno Prémont
<bonbons@linux-vserver.org> wrote:
> Hi Justin,
>
> On Sat, 24 November 2012 "Justin Piszcz" wrote:
>> Is the following normal on an X9SRL-F board (bios 1.0a)?
>>
>> In the manual it states:
>>
>> Data Direct I/O
>> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
>> significantly reduces CPU overhead by leveraging CPU architectural
>> improvements and freeing the system resource for other tasks. The options
>> are Disabled and Enabled.
>>
>> Default is Enabled.
>>
>> When enabled in the kernel, I see the following:
>>
>> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
>> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
>> [ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
>> [ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
>> [ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
>> [ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
>> [ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
>> [ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
>> [ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
>> [ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
>> [ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
>> [ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
>> [ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
>> [ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
>> [ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
>> [ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
>> [ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
>> [ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
>> [ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
>> [ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
>> [ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
>> [ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
>> [ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
>> [ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
>> [ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>>
>> --
>>
>> Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
>> ignored, it fails to work:
>> [ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>>
>> I assume this is something Supermicro has to fix?
>
> You are probably missing some kernel config option(s) :) - I did fight similar
> issues on a Fujitsu SandyBridge Xeon based server.
>
> Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.

Changing config options is not a valid fix for error messages like
this. We should be able to make the config smarter by adding
dependencies or something, or else make the driver smart enough to
give a more useful diagnostic.

The "channel error register unreachable" message indicates that
pci_read_config_dword() failed. The register in question
(IOAT_PCI_CHANERR_INT_OFFSET) is at 0x180, so possibly we don't have
PCI config accessors for the extended config space (0x100-0xfff). A
complete dmesg log should show that.

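[Editorial aside] The reasoning in this message is that a 4-byte read at offset 0x180 falls outside the 256-byte conventional config space and therefore needs extended config space access. A minimal sketch of that arithmetic follows. The constant names mirror ones used in the kernel, but the helper functions are hypothetical; reading the size of the sysfs `config` file is an assumption about how one might check this from user space, not something stated in the thread.

```python
import os

PCI_CFG_SPACE_SIZE = 256             # conventional config space: 0x00-0xff
PCI_CFG_SPACE_EXP_SIZE = 4096        # extended config space: 0x000-0xfff
IOAT_PCI_CHANERR_INT_OFFSET = 0x180  # offset quoted in the message above


def offset_reachable(offset, cfg_size, width=4):
    """True if a `width`-byte read at `offset` fits inside `cfg_size` bytes."""
    return offset >= 0 and offset + width <= cfg_size


def sysfs_config_size(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Size in bytes of the device's sysfs config file (assumed 256 or 4096)."""
    return os.path.getsize(os.path.join(sysfs_root, pci_address, "config"))


def chanerr_register_reachable(pci_address):
    """Hypothetical check: can offset 0x180 be read for this device?"""
    size = sysfs_config_size(pci_address)
    return offset_reachable(IOAT_PCI_CHANERR_INT_OFFSET, size)
```

With only conventional accessors available, `offset_reachable(0x180, PCI_CFG_SPACE_SIZE)` is false, which matches the failure mode described above. `chanerr_register_reachable("0000:00:04.0")` would only be meaningful on a Linux machine that actually has that device.
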
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 0:56 ` Bjorn Helgaas
@ 2012-11-27 1:00 ` Bjorn Helgaas
  2012-11-27 1:00 ` Justin Piszcz
  2012-11-27 1:11 ` Dan Williams
  1 sibling, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2012-11-27 1:00 UTC
To: Bruno Prémont; +Cc: Justin Piszcz, support, linux-kernel, Dan Williams

[Try Dan's current email address; sorry Dan]

On Mon, Nov 26, 2012 at 5:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [+cc Dan]
>
> On Mon, Nov 26, 2012 at 2:42 PM, Bruno Prémont
> <bonbons@linux-vserver.org> wrote:
>> Hi Justin,
>>
>> On Sat, 24 November 2012 "Justin Piszcz" wrote:
>>> Is the following normal on an X9SRL-F board (bios 1.0a)?
>>>
>>> In the manual it states:
>>>
>>> Data Direct I/O
>>> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
>>> significantly reduces CPU overhead by leveraging CPU architectural
>>> improvements and freeing the system resource for other tasks. The options
>>> are Disabled and Enabled.
>>>
>>> Default is Enabled.
>>>
>>> When enabled in the kernel, I see the following:
>>>
>>> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
>>> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
>>> [ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
>>> [ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
>>> [ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
>>> [ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
>>> [ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
>>> [ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
>>> [ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
>>> [ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
>>> [ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
>>> [ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
>>> [ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
>>> [ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
>>> [ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
>>> [ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
>>> [ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
>>> [ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>>>
>>> --
>>>
>>> Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
>>> ignored, it fails to work:
>>> [ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>>>
>>> I assume this is something Supermicro has to fix?
>>
>> You are probably missing some kernel config option(s) :) - I did fight similar
>> issues on a Fujitsu SandyBridge Xeon based server.
>>
>> Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.
>
> Changing config options is not a valid fix for error messages like
> this. We should be able to make the config smarter by adding
> dependencies or something, or else make the driver smart enough to
> give a more useful diagnostic.
>
> The "channel error register unreachable" message indicates that
> pci_read_config_dword() failed. The register in question
> (IOAT_PCI_CHANERR_INT_OFFSET) is at 0x180, so possibly we don't have
> PCI config accessors for the extended config space (0x100-0xfff). A
> complete dmesg log should show that.

* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 1:00 ` Bjorn Helgaas
@ 2012-11-27 1:00 ` Justin Piszcz
  2012-11-27 1:11 ` Bjorn Helgaas
  0 siblings, 1 reply; 24+ messages in thread
From: Justin Piszcz @ 2012-11-27 1:00 UTC
To: 'Bjorn Helgaas', 'Bruno Prémont'
Cc: support, linux-kernel, 'Dan Williams'

-----Original Message-----
From: Bjorn Helgaas [mailto:bhelgaas@google.com]
Sent: Monday, November 26, 2012 8:00 PM
To: Bruno Prémont
Cc: Justin Piszcz; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams
Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question

[Try Dan's current email address; sorry Dan]

On Mon, Nov 26, 2012 at 5:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [+cc Dan]
>
> On Mon, Nov 26, 2012 at 2:42 PM, Bruno Prémont
> <bonbons@linux-vserver.org> wrote:
>> Hi Justin,
>>
>> On Sat, 24 November 2012 "Justin Piszcz" wrote:
>>> Is the following normal on an X9SRL-F board (bios 1.0a)?
>>>
>>> In the manual it states:
>>>
>>> Data Direct I/O
>>> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
>>> significantly reduces CPU overhead by leveraging CPU architectural
>>> improvements and freeing the system resource for other tasks. The options
>>> are Disabled and Enabled.
>>>
>>> Default is Enabled.
>>>
>>> When enabled in the kernel, I see the following:
>>>
>>> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
>>> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
>>> [ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
>>> [ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
>>> [ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
>>> [ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
>>> [ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
>>> [ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
>>> [ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
>>> [ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
>>> [ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
>>> [ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
>>> [ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
>>> [ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
>>> [ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
>>> [ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
>>> [ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
>>> [ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
>>> [ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>>>
>>> --
>>>
>>> Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
>>> ignored, it fails to work:
>>> [ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>>>
>>> I assume this is something Supermicro has to fix?
>>
>> You are probably missing some kernel config option(s) :) - I did fight similar
>> issues on a Fujitsu SandyBridge Xeon based server.
>>
>> Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.
>
> Changing config options is not a valid fix for error messages like
> this. We should be able to make the config smarter by adding
> dependencies or something, or else make the driver smart enough to
> give a more useful diagnostic.
>
> The "channel error register unreachable" message indicates that
> pci_read_config_dword() failed. The register in question
> (IOAT_PCI_CHANERR_INT_OFFSET) is at 0x180, so possibly we don't have
> PCI config accessors for the extended config space (0x100-0xfff). A
> complete dmesg log should show that.

--

Here is the full dmesg: (I went back to my older kernel, let me know if you
need a dmesg w/ those options enabled)
http://home.comcast.net/~jpiszcz/20121126/dmesg.txt

Justin.

* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 1:00 ` Justin Piszcz
@ 2012-11-27 1:11 ` Bjorn Helgaas
  2012-11-27 13:33 ` Justin Piszcz
  0 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2012-11-27 1:11 UTC
To: Justin Piszcz; +Cc: Bruno Prémont, support, linux-kernel, Dan Williams

On Mon, Nov 26, 2012 at 6:00 PM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>
> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Monday, November 26, 2012 8:00 PM
> To: Bruno Prémont
> Cc: Justin Piszcz; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams
> Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
>
> [Try Dan's current email address; sorry Dan]
>
> On Mon, Nov 26, 2012 at 5:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> [+cc Dan]
>>
>> On Mon, Nov 26, 2012 at 2:42 PM, Bruno Prémont
>> <bonbons@linux-vserver.org> wrote:
>>> Hi Justin,
>>>
>>> On Sat, 24 November 2012 "Justin Piszcz" wrote:
>>>> Is the following normal on an X9SRL-F board (bios 1.0a)?
>>>>
>>>> In the manual it states:
>>>>
>>>> Data Direct I/O
>>>> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
>>>> significantly reduces CPU overhead by leveraging CPU architectural
>>>> improvements and freeing the system resource for other tasks. The options
>>>> are Disabled and Enabled.
>>>>
>>>> Default is Enabled.
>>>>
>>>> When enabled in the kernel, I see the following:
>>>>
>>>> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
>>>> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
>>>> [ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
>>>> [ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
>>>> [ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
>>>> [ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
>>>> [ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
>>>> [ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
>>>> [ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
>>>> [ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
>>>> [ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
>>>> [ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
>>>> [ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
>>>> [ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
>>>> [ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
>>>> [ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
>>>> [ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
>>>> [ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>>>>
>>>> --
>>>>
>>>> Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
>>>> ignored, it fails to work:
>>>> [ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>>>>
>>>> I assume this is something Supermicro has to fix?
>>>
>>> You are probably missing some kernel config option(s) :) - I did fight similar
>>> issues on a Fujitsu SandyBridge Xeon based server.
>>>
>>> Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.
>>
>> Changing config options is not a valid fix for error messages like
>> this. We should be able to make the config smarter by adding
>> dependencies or something, or else make the driver smart enough to
>> give a more useful diagnostic.
>>
>> The "channel error register unreachable" message indicates that
>> pci_read_config_dword() failed. The register in question
>> (IOAT_PCI_CHANERR_INT_OFFSET) is at 0x180, so possibly we don't have
>> PCI config accessors for the extended config space (0x100-0xfff). A
>> complete dmesg log should show that.
>
> --
>
> Here is the full dmesg: (I went back to my older kernel, let me know if you
> need a dmesg w/ those options enabled)
> http://home.comcast.net/~jpiszcz/20121126/dmesg.txt

It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on?

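[Editorial aside] The suggestion in this message is to check whether CONFIG_PCI_MMCONFIG is enabled in the running kernel's configuration. A minimal sketch of such a check follows. The function names are hypothetical, and where the config text comes from (for example a `/boot/config-*` file, or `/proc/config.gz` if the kernel exposes it) is an assumption that varies between systems.

```python
import re


def parse_kernel_config(config_text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if match:
            options[match.group(1)] = "n"
    return options


def missing_options(config_text, wanted):
    """Return the options in `wanted` that are neither built in (y) nor modular (m)."""
    options = parse_kernel_config(config_text)
    return [name for name in wanted if options.get(name, "n") not in ("y", "m")]
```

Calling `missing_options(text, ["CONFIG_PCI_MMCONFIG", "CONFIG_X86_X2APIC"])` returns whichever of the options discussed in this thread are absent or disabled. Options that do not appear in the file at all are treated as disabled.
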
* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 1:11 ` Bjorn Helgaas
@ 2012-11-27 13:33 ` Justin Piszcz
  2012-11-27 13:49 ` Justin Piszcz
  0 siblings, 1 reply; 24+ messages in thread
From: Justin Piszcz @ 2012-11-27 13:33 UTC
To: 'Bjorn Helgaas'
Cc: 'Bruno Prémont', support, linux-kernel, 'Dan Williams'

-----Original Message-----
From: Bjorn Helgaas [mailto:bhelgaas@google.com]
Sent: Monday, November 26, 2012 8:12 PM
To: Justin Piszcz
Cc: Bruno Prémont; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams
Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question

On Mon, Nov 26, 2012 at 6:00 PM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>
> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Monday, November 26, 2012 8:00 PM
> To: Bruno Prémont
> Cc: Justin Piszcz; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams
> Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
>
> [Try Dan's current email address; sorry Dan]
>
> On Mon, Nov 26, 2012 at 5:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> [+cc Dan]
>>
>> On Mon, Nov 26, 2012 at 2:42 PM, Bruno Prémont
>> <bonbons@linux-vserver.org> wrote:
>>> Hi Justin,
>>>
>>> On Sat, 24 November 2012 "Justin Piszcz" wrote:
>>>> Is the following normal on an X9SRL-F board (bios 1.0a)?
>>>>
>>>> In the manual it states:
>>>>
>>>> Data Direct I/O
>>>> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
>>>> significantly reduces CPU overhead by leveraging CPU architectural
>>>> improvements and freeing the system resource for other tasks. The
>
> Here is the full dmesg: (I went back to my older kernel, let me know if you
> need a dmesg w/ those options enabled)
> http://home.comcast.net/~jpiszcz/20121126/dmesg.txt

It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on?

Hi,

I have two Supermicro boards I am trying this on. On the other system
(X8DTH-6F), with all of these options enabled, the system does not boot:
it cannot talk to the SATA boot drive.

" 5520 chips built in, the X8DTH-6/X8DTH-6F/X8DTH-i/X8DTH-iF offers
...... The Intel I/OAT (I/O Acceleration Technology) significantly reduces CPU
overhead by ..."

When the following options are enabled, the system does not boot:
+CONFIG_HAVE_INTEL_TXT=y
+CONFIG_IOMMU_API=y
+CONFIG_IOMMU_SUPPORT=y
+CONFIG_DMAR_TABLE=y
+CONFIG_INTEL_IOMMU=y
+CONFIG_INTEL_IOMMU_DEFAULT_ON=y
+CONFIG_INTEL_IOMMU_FLOPPY_WA=y

It fails like so:
(Fails to talk to the SSD)
http://home.comcast.net/~jpiszcz/20121127/photo1-resize.jpg
(then, a few moments later: Kernel panic)
http://home.comcast.net/~jpiszcz/20121127/photo2-resize.jpg

With those options disabled, the system boots (and always has booted fine).

Is there a certain combination of parameters that allows I/OAT to be enabled
_and_ allows the system to boot?

Justin.

* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 13:33 ` Justin Piszcz
@ 2012-11-27 13:49 ` Justin Piszcz
  2012-11-27 13:56 ` Justin Piszcz
  ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Justin Piszcz @ 2012-11-27 13:49 UTC
To: 'Bjorn Helgaas'
Cc: 'Bruno Prémont', support, linux-kernel, 'Dan Williams'

> It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on?

===> FOR I/OAT DMA

Latest status, it _appears_ it's working on the X9SRL-F now, thank you!

1) Supermicro X9SRL-F (GOOD)

[ 0.738510] ioatdma: Intel(R) QuickData Technology Driver 4.00
[ 0.738719] ioatdma 0000:00:04.0: irq 75 for MSI/MSI-X
[ 0.739088] ioatdma 0000:00:04.1: irq 76 for MSI/MSI-X
[ 0.739408] ioatdma 0000:00:04.2: irq 77 for MSI/MSI-X
[ 0.739739] ioatdma 0000:00:04.3: irq 78 for MSI/MSI-X
[ 0.740040] ioatdma 0000:00:04.4: irq 79 for MSI/MSI-X
[ 0.740342] ioatdma 0000:00:04.5: irq 80 for MSI/MSI-X
[ 0.740670] ioatdma 0000:00:04.6: irq 81 for MSI/MSI-X
[ 0.740971] ioatdma 0000:00:04.7: irq 82 for MSI/MSI-X

It is _not_ working on the:

2) Supermicro X8DTH-6F (the boot drive in this system is running off a PCI-e
card, could the IRQ for the I/O controller be getting re-mapped and fail?)--
worst case I can move the SSD from the 6.0gbps SATA card to the motherboard
and see if that works, but that kind of defeats the purpose of a 6.0gbps
SATA SSD.

(Fails to talk to the SSD)
http://home.comcast.net/~jpiszcz/20121127/photo1-resize.jpg
(then, a few moments later: Kernel panic)
http://home.comcast.net/~jpiszcz/20121127/photo2-resize.jpg

Would be curious if anyone had any suggestions besides removing the
controller card?

--

==> Further issues with the X9SRL-F -- does this board support ASPM or is
this a Linux/ASPM implementation issue?

[ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling PCIe ASPM
[ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support mask: 0x08)

Justin.

* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 13:49 ` Justin Piszcz
@ 2012-11-27 13:56 ` Justin Piszcz
  2012-11-27 14:35 ` Justin Piszcz
  2012-11-28 23:54 ` Bjorn Helgaas
  2012-11-29 0:34 ` Robert Hancock
  2 siblings, 1 reply; 24+ messages in thread
From: Justin Piszcz @ 2012-11-27 13:56 UTC
To: 'Bjorn Helgaas'
Cc: 'Bruno Prémont', support, linux-kernel, 'Dan Williams'

> It is _not_ working on the:

> 2) Supermicro X8DTH-6F (the boot drive in this system is running off a PCI-e
> card, could the IRQ for the I/O controller be getting re-mapped and fail?)--
> worst case I can move the SSD from the 6.0gbps SATA card to the motherboard
> and see if that works, but that kind of defeats the purpose of a 6.0gbps
> SATA SSD.

When IOMMU is disabled, I/OAT DMA is successful on the second motherboard
(X8DTH-6F).

Specifically:
--- DMA Engine support
[*] Intel I/OAT DMA support
[*] Network: TCP receive copy offload
[*] Async_tx: Offload support for the async_tx api

When IOMMU/X2APIC is enabled on the X8DTH-6F it fails to boot.
Will keep doing more testing to see if I get anywhere w/regards to the
IOMMU.

Proof of success:

[ 0.757467] ioatdma: Intel(R) QuickData Technology Driver 4.00
[ 0.757690] ioatdma 0000:00:16.0: irq 88 for MSI/MSI-X
[ 0.757948] ioatdma 0000:00:16.1: irq 89 for MSI/MSI-X
[ 0.758166] ioatdma 0000:00:16.2: irq 90 for MSI/MSI-X
[ 0.758377] ioatdma 0000:00:16.3: irq 91 for MSI/MSI-X
[ 0.758577] ioatdma 0000:00:16.4: irq 92 for MSI/MSI-X
[ 0.758794] ioatdma 0000:00:16.5: irq 93 for MSI/MSI-X
[ 0.759000] ioatdma 0000:00:16.6: irq 94 for MSI/MSI-X
[ 0.759214] ioatdma 0000:00:16.7: irq 95 for MSI/MSI-X
[ 0.759461] ioatdma 0000:80:16.0: irq 96 for MSI/MSI-X
[ 0.759720] ioatdma 0000:80:16.1: irq 97 for MSI/MSI-X
[ 0.759963] ioatdma 0000:80:16.2: irq 98 for MSI/MSI-X
[ 0.760190] ioatdma 0000:80:16.3: irq 99 for MSI/MSI-X
[ 0.760414] ioatdma 0000:80:16.4: irq 100 for MSI/MSI-X
[ 0.760630] ioatdma 0000:80:16.5: irq 101 for MSI/MSI-X
[ 0.760862] ioatdma 0000:80:16.6: irq 102 for MSI/MSI-X
[ 0.761081] ioatdma 0000:80:16.7: irq 103 for MSI/MSI-X

--

==> Further issues with the X9SRL-F -- does this board support ASPM or is
this a Linux/ASPM implementation issue?

[ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling PCIe ASPM
[ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support mask: 0x08)

Justin.

* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-27 13:56 ` Justin Piszcz @ 2012-11-27 14:35 ` Justin Piszcz 2012-11-29 0:08 ` Bjorn Helgaas 0 siblings, 1 reply; 24+ messages in thread From: Justin Piszcz @ 2012-11-27 14:35 UTC (permalink / raw) To: 'Bjorn Helgaas' Cc: 'Bruno Prémont', support, linux-kernel, 'Dan Williams' -----Original Message----- From: Justin Piszcz [mailto:jpiszcz@lucidpixels.com] Sent: Tuesday, November 27, 2012 8:56 AM To: 'Bjorn Helgaas' Cc: 'Bruno Prémont'; support@supermicro.com; linux-kernel@vger.kernel.org; 'Dan Williams' Subject: RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question > It is _not_ working on the: > 2) Supermicro X8DTH-F (the boot drive in this system is running off a PCI-e > card, could the IRQ for the I/O controller be getting re-mapped and fail?)-- > worse case I can move the SSD from the 6.0gbpa SATA card to the motherboard > and see if that works, but that kind of defeats the purpose of a 6.0gbps > SATA SSD. When I removed the Highpoint 2-port SATA card and plugged it into the motherboard, the system boots (plugged the SSD into the motherboard). So if you use a HIGHPOINT 2-PORT SATA 6.0gbps card, do NOT enable IOMMU or it will fail to initialize the Highpoint 2-port SATA controller card! I also tried upgrading the BIOS (of the mobo, no diff) I also tried just leaving the SATA card in and plugging it into the motherboard (no diff) Removed the Highpoint 2-port SATA card and then success, it would be nice to use that card with IOMMU support though, is it just not compatible (marvell-problem?) or is a driver bug? Based on the pictures/etc sent earlier? 
$ dmesg|grep -i iommu [ 0.055134] dmar: IOMMU 0: reg_base_addr cfdfe000 ver 1:0 cap c90780106f0462 ecap f020f6 [ 0.055396] dmar: IOMMU 1: reg_base_addr fecfe000 ver 1:0 cap c90780106f0462 ecap f020f6 [ 0.760665] IOMMU 0 0xcfdfe000: using Queued invalidation [ 0.760803] IOMMU 1 0xfecfe000: using Queued invalidation [ 0.760937] IOMMU: Setting RMRR: [ 0.761102] IOMMU: Setting identity map for device 0000:00:1d.0 [0xbf7ec000 - 0xbf7fffff] [ 0.761329] IOMMU: Setting identity map for device 0000:00:1d.1 [0xbf7ec000 - 0xbf7fffff] [ 0.761542] IOMMU: Setting identity map for device 0000:00:1d.2 [0xbf7ec000 - 0xbf7fffff] [ 0.761758] IOMMU: Setting identity map for device 0000:00:1d.7 [0xbf7ec000 - 0xbf7fffff] [ 0.761974] IOMMU: Setting identity map for device 0000:00:1a.0 [0xbf7ec000 - 0xbf7fffff] [ 0.762190] IOMMU: Setting identity map for device 0000:00:1a.1 [0xbf7ec000 - 0xbf7fffff] [ 0.762407] IOMMU: Setting identity map for device 0000:00:1a.2 [0xbf7ec000 - 0xbf7fffff] [ 0.762620] IOMMU: Setting identity map for device 0000:00:1a.7 [0xbf7ec000 - 0xbf7fffff] [ 0.762816] IOMMU: Setting identity map for device 0000:00:1d.0 [0xec000 - 0xeffff] [ 0.763010] IOMMU: Setting identity map for device 0000:00:1d.1 [0xec000 - 0xeffff] [ 0.763197] IOMMU: Setting identity map for device 0000:00:1d.2 [0xec000 - 0xeffff] [ 0.763382] IOMMU: Setting identity map for device 0000:00:1d.7 [0xec000 - 0xeffff] [ 0.763567] IOMMU: Setting identity map for device 0000:00:1a.0 [0xec000 - 0xeffff] [ 0.763749] IOMMU: Setting identity map for device 0000:00:1a.1 [0xec000 - 0xeffff] [ 0.763934] IOMMU: Setting identity map for device 0000:00:1a.2 [0xec000 - 0xeffff] [ 0.764127] IOMMU: Setting identity map for device 0000:00:1a.7 [0xec000 - 0xeffff] [ 0.764311] IOMMU: Prepare 0-16MiB unity mapping for LPC [ 0.764465] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff] -- ==> Further issues with the X9SRL-F -- does this board support ASPM or is this a Linux/ASPM implementation issue? 
[ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling PCIe ASPM
[ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support mask: 0x08)

Justin.
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-27 14:35 ` Justin Piszcz @ 2012-11-29 0:08 ` Bjorn Helgaas 2012-11-29 0:49 ` Justin Piszcz 0 siblings, 1 reply; 24+ messages in thread From: Bjorn Helgaas @ 2012-11-29 0:08 UTC (permalink / raw) To: Justin Piszcz; +Cc: Bruno Prémont, support, linux-kernel, Dan Williams On Tue, Nov 27, 2012 at 7:35 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > -----Original Message----- > From: Justin Piszcz [mailto:jpiszcz@lucidpixels.com] > Sent: Tuesday, November 27, 2012 8:56 AM > To: 'Bjorn Helgaas' > Cc: 'Bruno Prémont'; support@supermicro.com; linux-kernel@vger.kernel.org; > 'Dan Williams' > Subject: RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware > bug question > > >> It is _not_ working on the: > >> 2) Supermicro X8DTH-F (the boot drive in this system is running off a > PCI-e >> card, could the IRQ for the I/O controller be getting re-mapped and > fail?)-- >> worse case I can move the SSD from the 6.0gbpa SATA card to the > motherboard >> and see if that works, but that kind of defeats the purpose of a 6.0gbps >> SATA SSD. > > When I removed the Highpoint 2-port SATA card and plugged it into the > motherboard, the system boots (plugged the SSD into the motherboard). > So if you use a HIGHPOINT 2-PORT SATA 6.0gbps card, do NOT enable IOMMU or > it will fail to initialize the Highpoint 2-port SATA controller card! > I also tried upgrading the BIOS (of the mobo, no diff) > I also tried just leaving the SATA card in and plugging it into the > motherboard (no diff) > Removed the Highpoint 2-port SATA card and then success, it would be nice to > use that card with IOMMU support though, is it just not compatible > (marvell-problem?) or is a driver bug? Based on the pictures/etc sent > earlier? I would guess this is a core bug, but it's hard to tell without more information. 
If you boot with "intel_iommu=off", I would guess the Highpoint card would work (this should have the same effect as turning off CONFIG_INTEL_IOMMU). I'd like to compare the complete dmesg log for that boot with the one that fails. It sounds like it might be hard to collect the log for the failing case -- you said the boot fails when the Highpoint card is in the system even if the SSD is connected to the motherboard instead of the Highpoint card. The panic in the photo2 image looks like it's just a failure to mount the root filesystem, which is what I'd expect if we can't find the SSD. It seems like we ought to be able to *boot* with the SSD connected to the motherboard, even if the Highpoint card doesn't work. But worst-case, a video of the failing boot might be enough, especially if you can slow it down with "boot_delay=" > $ dmesg|grep -i iommu > [ 0.055134] dmar: IOMMU 0: reg_base_addr cfdfe000 ver 1:0 cap > c90780106f0462 ecap f020f6 > [ 0.055396] dmar: IOMMU 1: reg_base_addr fecfe000 ver 1:0 cap > c90780106f0462 ecap f020f6 > [ 0.760665] IOMMU 0 0xcfdfe000: using Queued invalidation > [ 0.760803] IOMMU 1 0xfecfe000: using Queued invalidation > [ 0.760937] IOMMU: Setting RMRR: > [ 0.761102] IOMMU: Setting identity map for device 0000:00:1d.0 > [0xbf7ec000 - 0xbf7fffff] > [ 0.761329] IOMMU: Setting identity map for device 0000:00:1d.1 > [0xbf7ec000 - 0xbf7fffff] > [ 0.761542] IOMMU: Setting identity map for device 0000:00:1d.2 > [0xbf7ec000 - 0xbf7fffff] > [ 0.761758] IOMMU: Setting identity map for device 0000:00:1d.7 > [0xbf7ec000 - 0xbf7fffff] > [ 0.761974] IOMMU: Setting identity map for device 0000:00:1a.0 > [0xbf7ec000 - 0xbf7fffff] > [ 0.762190] IOMMU: Setting identity map for device 0000:00:1a.1 > [0xbf7ec000 - 0xbf7fffff] > [ 0.762407] IOMMU: Setting identity map for device 0000:00:1a.2 > [0xbf7ec000 - 0xbf7fffff] > [ 0.762620] IOMMU: Setting identity map for device 0000:00:1a.7 > [0xbf7ec000 - 0xbf7fffff] > [ 0.762816] IOMMU: Setting identity map for 
device 0000:00:1d.0 [0xec000 > - 0xeffff] > [ 0.763010] IOMMU: Setting identity map for device 0000:00:1d.1 [0xec000 > - 0xeffff] > [ 0.763197] IOMMU: Setting identity map for device 0000:00:1d.2 [0xec000 > - 0xeffff] > [ 0.763382] IOMMU: Setting identity map for device 0000:00:1d.7 [0xec000 > - 0xeffff] > [ 0.763567] IOMMU: Setting identity map for device 0000:00:1a.0 [0xec000 > - 0xeffff] > [ 0.763749] IOMMU: Setting identity map for device 0000:00:1a.1 [0xec000 > - 0xeffff] > [ 0.763934] IOMMU: Setting identity map for device 0000:00:1a.2 [0xec000 > - 0xeffff] > [ 0.764127] IOMMU: Setting identity map for device 0000:00:1a.7 [0xec000 > - 0xeffff] > [ 0.764311] IOMMU: Prepare 0-16MiB unity mapping for LPC > [ 0.764465] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - > 0xffffff] > > -- > > > ==> Further issues with the X9SRL-F -- does this board support ASPM or is > this a Linux/ASPM implementation issue? > [ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling > PCIe ASPM > [ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support > mask: 0x08) I'm going to ignore this issue for the time being. I know we complain about this on many machines, and I don't know whether it's a real problem or just an overly alarming message. Bjorn ^ permalink raw reply [flat|nested] 24+ messages in thread
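For the comparison suggested above (a complete dmesg from an "intel_iommu=off" boot against one from the failing boot), the two logs are easier to read once the kernel timestamps are stripped, since otherwise nearly every line differs. What follows is only a minimal Python sketch of that idea and was not posted in the thread; the function names and the two labels on the logs are made up for illustration, and it assumes both logs are plain dmesg output with the usual "[ seconds.micros]" prefix.

```python
import difflib
import re

# Matches the leading "[    0.760665] " kernel timestamp on a dmesg line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(lines):
    """Drop the kernel timestamp so identical messages compare equal."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def diff_dmesg(good_lines, bad_lines):
    """Return a unified diff of two boot logs, ignoring timestamps.

    good_lines: lines from the boot that works (e.g. intel_iommu=off)
    bad_lines:  lines from the boot that fails (IOMMU enabled)
    The fromfile/tofile labels are illustrative names only.
    """
    return list(
        difflib.unified_diff(
            strip_timestamps(good_lines),
            strip_timestamps(bad_lines),
            fromfile="intel_iommu=off",
            tofile="iommu-enabled",
            lineterm="",
        )
    )
```

Passing it the lines of two files saved with "dmesg > file" on each boot should leave only the messages that really differ between the two boots, such as the ata timeouts and DMAR faults.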
* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-29 0:08 ` Bjorn Helgaas @ 2012-11-29 0:49 ` Justin Piszcz 0 siblings, 0 replies; 24+ messages in thread From: Justin Piszcz @ 2012-11-29 0:49 UTC (permalink / raw) To: 'Bjorn Helgaas', 'Robert Hancock' Cc: 'Bruno Prémont', support, linux-kernel, 'Dan Williams' -----Original Message----- From: Bjorn Helgaas [mailto:bhelgaas@google.com] Sent: Wednesday, November 28, 2012 7:09 PM To: Justin Piszcz Cc: Bruno Prémont; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question On Tue, Nov 27, 2012 at 7:35 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > -----Original Message----- > From: Justin Piszcz [mailto:jpiszcz@lucidpixels.com] > Sent: Tuesday, November 27, 2012 8:56 AM > To: 'Bjorn Helgaas' > Cc: 'Bruno Prémont'; support@supermicro.com; linux-kernel@vger.kernel.org; > 'Dan Williams' > Subject: RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware > bug question > > >> It is _not_ working on the: > >> 2) Supermicro X8DTH-F (the boot drive in this system is running off a > PCI-e >> card, could the IRQ for the I/O controller be getting re-mapped and > fail?)-- >> worse case I can move the SSD from the 6.0gbpa SATA card to the > motherboard >> and see if that works, but that kind of defeats the purpose of a 6.0gbps >> SATA SSD. > > When I removed the Highpoint 2-port SATA card and plugged it into the > motherboard, the system boots (plugged the SSD into the motherboard). > So if you use a HIGHPOINT 2-PORT SATA 6.0gbps card, do NOT enable IOMMU or > it will fail to initialize the Highpoint 2-port SATA controller card! 
> I also tried upgrading the BIOS (of the mobo, no diff) > I also tried just leaving the SATA card in and plugging it into the > motherboard (no diff) > Removed the Highpoint 2-port SATA card and then success, it would be nice to > use that card with IOMMU support though, is it just not compatible > (marvell-problem?) or is a driver bug? Based on the pictures/etc sent > earlier? I would guess this is a core bug, but it's hard to tell without more information. If you boot with "intel_iommu=off", I would guess the Highpoint card would work (this should have the same effect as turning off CONFIG_INTEL_IOMMU). I'd like to compare the complete dmesg log for that boot with the one that fails. It sounds like it might be hard to collect the log for the failing case -- you said the boot fails when the Highpoint card is in the system even if the SSD is connected to the motherboard instead of the Highpoint card. The panic in the photo2 image looks like it's just a failure to mount the root filesystem, which is what I'd expect if we can't find the SSD. It seems like we ought to be able to *boot* with the SSD connected to the motherboard, even if the Highpoint card doesn't work. But worst-case, a video of the failing boot might be enough, especially if you can slow it down with "boot_delay=" -- SUMMARY: Card fails with iommu support in the kernel: (but system does now boot (3.6.8) with the card in as long as the system disk isn't attached to it, not sure what was wrong earlier). It seems to be working now: => SSD on motherboard => PCI-e card (highpoint in the system but not used, no disks attached) (After I enabled nouveau, not sure that has anything to do with it) I put the card in, and it errors as usual but the SSD now on the motherboard it does boot successfully. 
Here are the errors from the kernel trying to initialize the board with iommu enabled (retrieved via netconsole) also picture below (w/help from boot_delay=100 && nouveau enabled):

http://home.comcast.net/~jpiszcz/20121128/highpoint.jpg

Nov 28 19:30:16 p34 [ 7.771060] ata14.00: qc timeout (cmd 0xa1)
Nov 28 19:30:16 p34 [ 8.270153] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 28 19:30:17 p34 [ 9.073935] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 28 19:30:27 p34 [ 19.058915] ata14.00: qc timeout (cmd 0xa1)
Nov 28 19:30:28 p34 [ 19.557885] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 28 19:30:28 p34 [ 19.558478] ata14: limiting SATA link speed to 1.5 Gbps
Nov 28 19:30:29 p34 [ 20.363658] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 28 19:30:48 p34 [ 39.568234] dmar: DRHD: handling fault status reg 502
Nov 28 19:30:48 p34 [ 39.571508] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
[ 39.571508] DMAR:[fault reason 06] PTE Read access is not set
Nov 28 19:30:59 p34 [ 50.318146] ata14.00: qc timeout (cmd 0xa1)
Nov 28 19:30:59 p34 [ 50.818061] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 28 19:31:00 p34 [ 51.621827] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Justin.
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 13:49 ` Justin Piszcz
  2012-11-27 13:56   ` Justin Piszcz
@ 2012-11-28 23:54   ` Bjorn Helgaas
  2012-11-29  0:48     ` Justin Piszcz
  2012-11-29  0:34   ` Robert Hancock
  2 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2012-11-28 23:54 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Bruno Prémont, support, linux-kernel, Dan Williams

On Tue, Nov 27, 2012 at 6:49 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>> It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on?
>
> ===> FOR I/OAT DMA
> Latest status, it _appears_ its working on the X9SRL-F now, thank you!
>
> 1) Supermicro X9SRL-F (GOOD)
> [ 0.738510] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [ 0.738719] ioatdma 0000:00:04.0: irq 75 for MSI/MSI-X
> [ 0.739088] ioatdma 0000:00:04.1: irq 76 for MSI/MSI-X
> [ 0.739408] ioatdma 0000:00:04.2: irq 77 for MSI/MSI-X
> [ 0.739739] ioatdma 0000:00:04.3: irq 78 for MSI/MSI-X
> [ 0.740040] ioatdma 0000:00:04.4: irq 79 for MSI/MSI-X
> [ 0.740342] ioatdma 0000:00:04.5: irq 80 for MSI/MSI-X
> [ 0.740670] ioatdma 0000:00:04.6: irq 81 for MSI/MSI-X
> [ 0.740971] ioatdma 0000:00:04.7: irq 82 for MSI/MSI-X

Good. You have two issues, and I'm going to separate them and only address the first one here.

I opened a bug report [1] against the IOAT driver. It should do something more useful when CONFIG_PCI_MMCONFIG=n so we don't have to debug this again in the future. But otherwise, it sounds like this issue is resolved.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=51101
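Since the I/OAT failure came down to CONFIG_PCI_MMCONFIG being off, it can help to check that option before chasing the driver. The following is a rough Python sketch of such a check, not something from the thread; the helper names are hypothetical. It assumes the running kernel's config is available either as /proc/config.gz (present only when the kernel was built with CONFIG_IKCONFIG_PROC) or as /boot/config-<release>, which is a distribution convention rather than a guarantee.

```python
import gzip
import os
import platform


def config_value(config_text, option):
    """Return the value of a kernel config option ('y', 'm', or a literal),
    or None if the option is absent or marked "is not set"."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return None
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def read_running_config():
    """Read the running kernel's config from the usual locations.

    Returns None when neither location exists on this machine.
    """
    candidates = ["/proc/config.gz", f"/boot/config-{platform.release()}"]
    for path in candidates:
        if os.path.exists(path):
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt") as fh:
                return fh.read()
    return None
```

With these, config_value(read_running_config() or "", "CONFIG_PCI_MMCONFIG") returning None would point at the same misconfiguration discussed above.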
* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-28 23:54 ` Bjorn Helgaas @ 2012-11-29 0:48 ` Justin Piszcz 0 siblings, 0 replies; 24+ messages in thread From: Justin Piszcz @ 2012-11-29 0:48 UTC (permalink / raw) To: 'Bjorn Helgaas', 'Robert Hancock' Cc: 'Bruno Prémont', support, linux-kernel, 'Dan Williams' -----Original Message----- From: Bjorn Helgaas [mailto:bhelgaas@google.com] Sent: Wednesday, November 28, 2012 6:54 PM To: Justin Piszcz Cc: Bruno Prémont; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question On Tue, Nov 27, 2012 at 6:49 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > >> It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on? > > ===> FOR I/OAT DMA > Latest status, it _appears_ its working on the X9SRL-F now, thank you! > > 1) Supermicro X9SRL-F (GOOD) > [ 0.738510] ioatdma: Intel(R) QuickData Technology Driver 4.00 > [ 0.738719] ioatdma 0000:00:04.0: irq 75 for MSI/MSI-X > [ 0.739088] ioatdma 0000:00:04.1: irq 76 for MSI/MSI-X > [ 0.739408] ioatdma 0000:00:04.2: irq 77 for MSI/MSI-X > [ 0.739739] ioatdma 0000:00:04.3: irq 78 for MSI/MSI-X > [ 0.740040] ioatdma 0000:00:04.4: irq 79 for MSI/MSI-X > [ 0.740342] ioatdma 0000:00:04.5: irq 80 for MSI/MSI-X > [ 0.740670] ioatdma 0000:00:04.6: irq 81 for MSI/MSI-X > [ 0.740971] ioatdma 0000:00:04.7: irq 82 for MSI/MSI-X Good. You have two issues, and I'm going to separate them and only address the first one here. I opened a bug report [1] against the IOAT driver. It should do something more useful when CONFIG_PCI_MMCONFIG=n so we don't have to debug this again in the future. But otherwise, it sounds like this issue is resolved. [1] https://bugzilla.kernel.org/show_bug.cgi?id=51101 -- Yes--(agree w/ config option) Thank you! Justin. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-27 13:49 ` Justin Piszcz 2012-11-27 13:56 ` Justin Piszcz 2012-11-28 23:54 ` Bjorn Helgaas @ 2012-11-29 0:34 ` Robert Hancock 2012-11-29 0:49 ` Justin Piszcz 2 siblings, 1 reply; 24+ messages in thread From: Robert Hancock @ 2012-11-29 0:34 UTC (permalink / raw) To: Justin Piszcz Cc: 'Bjorn Helgaas', 'Bruno Prémont', support, linux-kernel, 'Dan Williams' On 11/27/2012 07:49 AM, Justin Piszcz wrote: > >> It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on? > > ===> FOR I/OAT DMA > Latest status, it _appears_ its working on the X9SRL-F now, thank you! > > 1) Supermicro X9SRL-F (GOOD) > [ 0.738510] ioatdma: Intel(R) QuickData Technology Driver 4.00 > [ 0.738719] ioatdma 0000:00:04.0: irq 75 for MSI/MSI-X > [ 0.739088] ioatdma 0000:00:04.1: irq 76 for MSI/MSI-X > [ 0.739408] ioatdma 0000:00:04.2: irq 77 for MSI/MSI-X > [ 0.739739] ioatdma 0000:00:04.3: irq 78 for MSI/MSI-X > [ 0.740040] ioatdma 0000:00:04.4: irq 79 for MSI/MSI-X > [ 0.740342] ioatdma 0000:00:04.5: irq 80 for MSI/MSI-X > [ 0.740670] ioatdma 0000:00:04.6: irq 81 for MSI/MSI-X > [ 0.740971] ioatdma 0000:00:04.7: irq 82 for MSI/MSI-X > > It is _not_ working on the: > > 2) Supermicro X8DTH-F (the boot drive in this system is running off a PCI-e > card, could the IRQ for the I/O controller be getting re-mapped and fail?)-- > worse case I can move the SSD from the 6.0gbpa SATA card to the motherboard > and see if that works, but that kind of defeats the purpose of a 6.0gbps > SATA SSD. > > (Fails to talk to the SSD) > http://home.comcast.net/~jpiszcz/20121127/photo1-resize.jpg > > (then, a few moments later: Kernel panic) > http://home.comcast.net/~jpiszcz/20121127/photo2-resize.jpg > > Would be curious if anyone had any suggestions besides removing the > controller card? What does lspci -vv show on that controller? 
Not sure what actual chipset that controller is, but there's a known issue with some Marvell 6Gbps SATA controllers with DMAR enabled - it seems the device issues memory read/write requests from the wrong PCI function ID and the IOMMU rightly denies access as the function listed in the requests doesn't have any mapping to that memory. I don't think there's presently a workaround other than disabling DMAR. We could (and likely should) be detecting that device and adding some kind of quirk for it.

>
> --
>
>
> ==> Further issues with the X9SRL-F -- does this board support ASPM or is
> this a Linux/ASPM implementation issue?
> [ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling
> PCIe ASPM
> [ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support
> mask: 0x08)

What's the full dmesg from this machine (or is it already posted somewhere)?
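The "wrong PCI function ID" point above can be illustrated with a toy model. A PCIe request carries a 16-bit requester ID made of the bus (bits 15:8), device (bits 7:3) and function (bits 2:0), and the IOMMU looks up its mappings by that ID. The Python sketch below is only an illustration under that assumption: ToyIommu is a made-up class, not kernel code, and the 84:00.0 / 84:00.1 addresses are used as examples.

```python
def parse_bdf(bdf):
    """Split a 'bb:dd.f' PCI address (hex fields) into (bus, device, function)."""
    bus, rest = bdf.split(":")
    dev, fn = rest.split(".")
    return int(bus, 16), int(dev, 16), int(fn, 16)


def requester_id(bdf):
    """16-bit requester ID: bus in bits 15:8, device in 7:3, function in 2:0."""
    bus, dev, fn = parse_bdf(bdf)
    return (bus << 8) | (dev << 3) | fn


class ToyIommu:
    """Toy model: a DMA is allowed only if its requester ID has a mapping."""

    def __init__(self):
        self.mapped = set()

    def map_device(self, bdf):
        # The driver for this function set up DMA mappings for it.
        self.mapped.add(requester_id(bdf))

    def allows(self, bdf):
        # A request tagged with any other ID finds no mapping and is rejected.
        return requester_id(bdf) in self.mapped
```

If only function 0 has mappings, a request that the hardware tags as function 1 is refused even though it comes from the same physical chip, which is the behaviour described above.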
* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-29 0:34 ` Robert Hancock @ 2012-11-29 0:49 ` Justin Piszcz 2012-11-29 0:55 ` Robert Hancock 0 siblings, 1 reply; 24+ messages in thread From: Justin Piszcz @ 2012-11-29 0:49 UTC (permalink / raw) To: 'Robert Hancock' Cc: 'Bjorn Helgaas', 'Bruno Prémont', support, linux-kernel, 'Dan Williams' -----Original Message----- From: Robert Hancock [mailto:hancockrwd@gmail.com] Sent: Wednesday, November 28, 2012 7:35 PM To: Justin Piszcz Cc: 'Bjorn Helgaas'; 'Bruno Prémont'; support@supermicro.com; linux-kernel@vger.kernel.org; 'Dan Williams' Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question What does lspci -vv show on that controller? Not sure what actual chipset that controller is, but there's a known issue with some Marvell 6Gbps SATA controllers with DMAR enabled - it seems the device issues memory read/write requests from the wrong PCI function ID and the IOMMU rightly denies access as the function listed in the requests doesn't have any mapping to that memory. I don't think there's presently a workaround other than disabling DMAR. We could (and likely should) be detecting that device and adding some kind of quirk for it. That sounds likely... It is shown below: Card name: HighPoint Rocket 620 Dual Port SATA 6 Gbps PCI Express 2.0 Host Adapter lspci -vv output: 84:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller (rev 11) (prog-if 01 [AHCI 1.0]) Subsystem: Marvell Technology Group Ltd. 
88SE9123 PCIe SATA 6.0 Gb/s controller Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 256 bytes Interrupt: pin A routed to IRQ 119 Region 0: I/O ports at e000 [size=8] Region 1: I/O ports at dc00 [size=4] Region 2: I/O ports at ec00 [size=8] Region 3: I/O ports at e800 [size=4] Region 4: I/O ports at e400 [size=16] Region 5: Memory at cfeee000 (32-bit, non-prefetchable) [size=2K] Expansion ROM at cfef0000 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: fee20000 Data: 4076 Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 256 bytes, MaxReadReq 512 bytes DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Not Supported, TimeoutDis+ DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- 
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn- Kernel driver in use: ahci > > -- > > > ==> Further issues with the X9SRL-F -- does this board support ASPM or is > this a Linux/ASPM implementation issue? > [ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling > PCIe ASPM > [ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support > mask: 0x08) What's the full dmesg from this machine (or is it already posted somewhere)? It is now available here: http://home.comcast.net/~jpiszcz/20121128/dmesg.txt Justin. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-29 0:49 ` Justin Piszcz @ 2012-11-29 0:55 ` Robert Hancock 2012-11-29 8:55 ` Justin Piszcz 0 siblings, 1 reply; 24+ messages in thread From: Robert Hancock @ 2012-11-29 0:55 UTC (permalink / raw) To: Justin Piszcz Cc: Bjorn Helgaas, Bruno Prémont, support, linux-kernel, Dan Williams On Wed, Nov 28, 2012 at 6:49 PM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > -----Original Message----- > From: Robert Hancock [mailto:hancockrwd@gmail.com] > Sent: Wednesday, November 28, 2012 7:35 PM > To: Justin Piszcz > Cc: 'Bjorn Helgaas'; 'Bruno Prémont'; support@supermicro.com; > linux-kernel@vger.kernel.org; 'Dan Williams' > Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware > bug question > > > What does lspci -vv show on that controller? Not sure what actual > chipset that controller is, but there's a known issue with some Marvell > 6Gbps SATA controllers with DMAR enabled - it seems the device issues > memory read/write requests from the wrong PCI function ID and the IOMMU > rightly denies access as the function listed in the requests doesn't > have any mapping to that memory. I don't think there's presently a > workaround other than disabling DMAR. We could (and likely should) be > detecting that device and adding some kind of quirk for it. > > That sounds likely... > It is shown below: > > Card name: HighPoint Rocket 620 Dual Port SATA 6 Gbps PCI Express 2.0 Host > Adapter > > lspci -vv output: > > 84:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA > 6.0 Gb/s controller (rev 11) (prog-if 01 [AHCI 1.0]) > Subsystem: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s > controller Yeah, that's one of those controllers I think. But I can't tell from the bit of the dmesg you posted exactly what's going on. 
Can you post a full boot log from having the card installed and some drive attached (by putting the boot drive on another controller for example)?

>> ==> Further issues with the X9SRL-F -- does this board support ASPM or is
>> this a Linux/ASPM implementation issue?
>> [ 0.632170] pci0000:ff: ACPI _OSC support notification failed, disabling PCIe ASPM
>> [ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support mask: 0x08)
>
> What's the full dmesg from this machine (or is it already posted somewhere)?
>
> It is now available here:
> http://home.comcast.net/~jpiszcz/20121128/dmesg.txt

Is that the same boot log? It doesn't have this error in it.
* RE: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-29 0:55 ` Robert Hancock @ 2012-11-29 8:55 ` Justin Piszcz 2012-11-29 18:16 ` Bjorn Helgaas 0 siblings, 1 reply; 24+ messages in thread From: Justin Piszcz @ 2012-11-29 8:55 UTC (permalink / raw) To: 'Robert Hancock' Cc: 'Bjorn Helgaas', 'Bruno Prémont', support, linux-kernel, 'Dan Williams' -----Original Message----- From: Robert Hancock [mailto:hancockrwd@gmail.com] Sent: Wednesday, November 28, 2012 7:55 PM To: Justin Piszcz Cc: Bjorn Helgaas; Bruno Prémont; support@supermicro.com; linux-kernel@vger.kernel.org; Dan Williams Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question On Wed, Nov 28, 2012 at 6:49 PM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > -----Original Message----- > From: Robert Hancock [mailto:hancockrwd@gmail.com] > Sent: Wednesday, November 28, 2012 7:35 PM > To: Justin Piszcz > Cc: 'Bjorn Helgaas'; 'Bruno Prémont'; support@supermicro.com; > linux-kernel@vger.kernel.org; 'Dan Williams' > Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware > bug question > > > What does lspci -vv show on that controller? Not sure what actual > chipset that controller is, but there's a known issue with some Marvell > 6Gbps SATA controllers with DMAR enabled - it seems the device issues > memory read/write requests from the wrong PCI function ID and the IOMMU > rightly denies access as the function listed in the requests doesn't > have any mapping to that memory. I don't think there's presently a > workaround other than disabling DMAR. We could (and likely should) be > detecting that device and adding some kind of quirk for it. > > That sounds likely... > It is shown below: > > Card name: HighPoint Rocket 620 Dual Port SATA 6 Gbps PCI Express 2.0 Host > Adapter > > lspci -vv output: > > 84:00.0 SATA controller: Marvell Technology Group Ltd. 
88SE9123 PCIe SATA > 6.0 Gb/s controller (rev 11) (prog-if 01 [AHCI 1.0]) > Subsystem: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s > controller Yeah, that's one of those controllers I think. But I can't tell from the bit of the dmesg you posted exactly what's going on. Can you post a full boot log from having the card installed and some drive attached (by putting the boot drive on another controller for example)? >> ==> Further issues with the X9SRL-F -- does this board support ASPM or is >> this a Linux/ASPM implementation issue? >> [ 0.632170] pci0000:ff: ACPI _OSC support notification failed, > disabling >> PCIe ASPM >> [ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support >> mask: 0x08) > > What's the full dmesg from this machine (or is it already posted somewhere)? > > It is now available here: > http://home.comcast.net/~jpiszcz/20121128/dmesg.txt > Is that the same boot log? It doesn't have this error in it. Yes, the error is here: (its towards the bottom) [ 7.973015] ata14.00: qc timeout (cmd 0xa1) [ 8.472120] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4) [ 9.275922] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 19.260667] ata14.00: qc timeout (cmd 0xa1) [ 19.759828] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4) [ 19.760451] ata14: limiting SATA link speed to 1.5 Gbps [ 20.566598] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [ 50.521078] ata14.00: qc timeout (cmd 0xa1) [ 51.020880] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4) [ 51.824664] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [ 51.824682] dmar: DRHD: handling fault status reg 502 [ 51.824686] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0 [ 51.824686] DMAR:[fault reason 06] PTE Read access is not set [ 52.338871] EXT3-fs (sdb2): error: couldn't mount because of unsupported optional features (240) [ 52.348938] EXT2-fs (sdb2): error: couldn't mount because of unsupported optional features (240) [ 
52.360314] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: (null)

The system does not boot when the SSD is on that SATA controller.

I can no longer reproduce the kernel panic we were trying to capture earlier; it stopped after I added nouveau, for whatever reason.

To recap: the system now boots with nothing connected to the controller, but the controller itself is unusable, as shown above. With the SSD attached to it, the kernel cannot mount the root filesystem.

Justin.
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question 2012-11-29 8:55 ` Justin Piszcz @ 2012-11-29 18:16 ` Bjorn Helgaas 2012-11-30 2:39 ` Robert Hancock 0 siblings, 1 reply; 24+ messages in thread From: Bjorn Helgaas @ 2012-11-29 18:16 UTC (permalink / raw) To: Justin Piszcz Cc: Robert Hancock, Bruno Prémont, support, linux-kernel, Dan Williams On Thu, Nov 29, 2012 at 1:55 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > -----Original Message----- > From: Robert Hancock [mailto:hancockrwd@gmail.com] > Sent: Wednesday, November 28, 2012 7:55 PM > To: Justin Piszcz > Cc: Bjorn Helgaas; Bruno Prémont; support@supermicro.com; > linux-kernel@vger.kernel.org; Dan Williams > Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware > bug question > > On Wed, Nov 28, 2012 at 6:49 PM, Justin Piszcz <jpiszcz@lucidpixels.com> > wrote: >> >> >> -----Original Message----- >> From: Robert Hancock [mailto:hancockrwd@gmail.com] >> Sent: Wednesday, November 28, 2012 7:35 PM >> To: Justin Piszcz >> Cc: 'Bjorn Helgaas'; 'Bruno Prémont'; support@supermicro.com; >> linux-kernel@vger.kernel.org; 'Dan Williams' >> Subject: Re: Supermicro X9SRL-F - channel enumeration error & > ACPI/firmware >> bug question >> >> >> What does lspci -vv show on that controller? Not sure what actual >> chipset that controller is, but there's a known issue with some Marvell >> 6Gbps SATA controllers with DMAR enabled - it seems the device issues >> memory read/write requests from the wrong PCI function ID and the IOMMU >> rightly denies access as the function listed in the requests doesn't >> have any mapping to that memory. I don't think there's presently a >> workaround other than disabling DMAR. We could (and likely should) be >> detecting that device and adding some kind of quirk for it. >> >> That sounds likely... 
>> It is shown below:
>>
>> Card name: HighPoint Rocket 620 Dual Port SATA 6 Gbps PCI Express 2.0 Host
>> Adapter
>>
>> lspci -vv output:
>>
>> 84:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA
>> 6.0 Gb/s controller (rev 11) (prog-if 01 [AHCI 1.0])
>> Subsystem: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s
>> controller
>
> Yeah, that's one of those controllers I think. But I can't tell from
> the bit of the dmesg you posted exactly what's going on. Can you post
> a full boot log from having the card installed and some drive attached
> (by putting the boot drive on another controller for example)?
>
>>> ==> Further issues with the X9SRL-F -- does this board support ASPM or is
>>> this a Linux/ASPM implementation issue?
>>> [ 0.632170] pci0000:ff: ACPI _OSC support notification failed,
>> disabling
>>> PCIe ASPM
>>> [ 0.632239] pci0000:ff: Unable to request _OSC control (_OSC support
>>> mask: 0x08)
>>
>> What's the full dmesg from this machine (or is it already posted
> somewhere)?
>>
>> It is now available here:
>> http://home.comcast.net/~jpiszcz/20121128/dmesg.txt
>
>> Is that the same boot log? It doesn't have this error in it.
>
> Yes, the error is here: (its towards the bottom)
>
> [ 7.973015] ata14.00: qc timeout (cmd 0xa1)
> [ 8.472120] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 9.275922] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [ 19.260667] ata14.00: qc timeout (cmd 0xa1)
> [ 19.759828] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 19.760451] ata14: limiting SATA link speed to 1.5 Gbps
> [ 20.566598] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [ 50.521078] ata14.00: qc timeout (cmd 0xa1)
> [ 51.020880] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 51.824664] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [ 51.824682] dmar: DRHD: handling fault status reg 502
> [ 51.824686] dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
> [ 51.824686] DMAR:[fault reason 06] PTE Read access is not set

You have these devices:

  pci 0000:04:00.0: [10de:01d3] type 00 class 0x030000 nVidia G72
  pci 0000:84:00.0: [1b4b:9123] type 00 class 0x010601 Marvell 88SE9123 SATA
  pci 0000:84:00.1: [1b4b:91a4] type 00 class 0x01018f Marvell 88SE9128 IDE

I think the 04:00.0 DMAR errors are symptoms of nouveau driver issues,
and if you get rid of that driver, they'll probably go away.

But this 84:00.1 DMAR error:

  dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr fff00000
  DMAR:[fault reason 02] Present bit in context entry is clear

looks like the probable cause of the Marvell issue. It looks similar
to https://bugzilla.kernel.org/show_bug.cgi?id=42679, although the
reports there show a bb:dd.0 device (but no bb:dd.1 device), and the
DMAR rejects DMA that appears to be from bb:dd.1.

Another report that's even more similar is
https://bugzilla.redhat.com/show_bug.cgi?id=757166 . In that case,
both bb:dd.0 and bb:dd.1 exist (as in your system), and the DMAR fault
is exactly like what you're seeing.

So you're not alone, but unfortunately, nobody seems to be working on
either bug report.
I took the liberty to add you to the cc: list of both.

I don't really know what else to do at this point. Maybe a SATA
expert with some Marvell docs could figure out why we're seeing DMA
from the IDE controller, but I'm not that person :)

> [ 52.338871] EXT3-fs (sdb2): error: couldn't mount because of unsupported
> optional features (240)
> [ 52.348938] EXT2-fs (sdb2): error: couldn't mount because of unsupported
> optional features (240)
> [ 52.360314] EXT4-fs (sdb2): mounted filesystem with ordered data mode.
> Opts: (null)
>
> The system does not boot when the SSD is on that SATA controller.
> The error we were trying to get earlier (kernel panic)-- I cannot reproduce
> that anymore after adding nouveau for whatever reason.
> So to re-cap it boots now with nothing connected to the controller but the
> controller is non-workable/useless, as shown above.
> When you put the SSD on it, it cannot mount rootfs.
>
> Justin.
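The comparison made in this message, matching the device named in each DMAR fault against the enumerated PCI functions, can be sketched in a few lines of Python. This is only an illustration: the regular expression assumes the exact log format quoted in this thread (other kernel versions may print it differently), and `dmar_faults` and `sibling_functions` are hypothetical helper names, not part of any existing tool.

```python
import re

# Matches fault lines in the format quoted in this thread, for example:
#   dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr fff00000
FAULT_RE = re.compile(
    r"DMAR:\[DMA (?P<op>Read|Write)\] Request device "
    r"\[(?P<bdf>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] fault addr (?P<addr>[0-9a-f]+)"
)


def dmar_faults(log_text):
    """Return (operation, bus:dev.fn, fault address) for each fault line."""
    return [
        (m.group("op"), m.group("bdf"), int(m.group("addr"), 16))
        for m in FAULT_RE.finditer(log_text)
    ]


def sibling_functions(bdf, known_devices):
    """List the other enumerated functions in the same bus:device slot."""
    slot = bdf.rsplit(".", 1)[0]
    return sorted(
        d for d in known_devices
        if d != bdf and d.rsplit(".", 1)[0] == slot
    )


# Fault lines and device list taken from the messages above.
sample = """\
dmar: DMAR:[DMA Read] Request device [04:00.0] fault addr 0
dmar: DMAR:[DMA Read] Request device [84:00.1] fault addr fff00000
"""
devices = ["04:00.0", "84:00.0", "84:00.1"]

for op, bdf, addr in dmar_faults(sample):
    print(op, bdf, hex(addr), "siblings:", sibling_functions(bdf, devices))
```

A fault whose requesting function has a sibling in the same slot (here 84:00.1 next to the AHCI function 84:00.0) is a candidate for the wrong-function-ID problem discussed in this thread; a fault on a single-function device such as 04:00.0 is not.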
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-29 18:16 ` Bjorn Helgaas
@ 2012-11-30 2:39 ` Robert Hancock
  2012-11-30 3:38 ` Bjorn Helgaas
  0 siblings, 1 reply; 24+ messages in thread
From: Robert Hancock @ 2012-11-30 2:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Justin Piszcz, Bruno Prémont, support, linux-kernel, Dan Williams

On Thu, Nov 29, 2012 at 12:16 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> So you're not alone, but unfortunately, nobody seems to be working on
> either bug report. I took the liberty to add you to the cc: list of
> both.
>
> I don't really know what else to do at this point. Maybe a SATA
> expert with some Marvell docs could figure out why we're seeing DMA
> from the IDE controller, but I'm not that person :)

I doubt any Marvell docs would really be very helpful (except for
maybe an errata list but that likely would just tell us what we can
already figure out). The SATA controller part of the device seems to
just be issuing accesses with the wrong PCI function ID.

The only solution I can think of would be at the PCI/DMAR layer -
basically functions 0 and 1 on this device should be allowed to access
each other's DMA regions.
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-30 2:39 ` Robert Hancock
@ 2012-11-30 3:38 ` Bjorn Helgaas
  2012-12-02 13:26 ` Joerg Roedel
  0 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2012-11-30 3:38 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Justin Piszcz, Bruno Prémont, support, linux-kernel, Dan Williams, Jeff Garzik, linux-ide, David Woodhouse, Joerg Roedel, iommu

[+cc Jeff, linux-ide, David, Joerg, iommu]

On Thu, Nov 29, 2012 at 7:39 PM, Robert Hancock <hancockrwd@gmail.com> wrote:
> On Thu, Nov 29, 2012 at 12:16 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>
>> So you're not alone, but unfortunately, nobody seems to be working on
>> either bug report. I took the liberty to add you to the cc: list of
>> both.
>>
>> I don't really know what else to do at this point. Maybe a SATA
>> expert with some Marvell docs could figure out why we're seeing DMA
>> from the IDE controller, but I'm not that person :)
>
> I doubt any Marvell docs would really be very helpful (except for
> maybe an errata list but that likely would just tell us what we can
> already figure out). The SATA controller part of the device seems to
> just be issuing accesses with the wrong PCI function ID.
>
> The only solution I can think of would be at the PCI/DMAR layer -
> basically functions 0 and 1 on this device should be allowed to access
> each other's DMA regions.

That's essentially the patch at
https://bugzilla.redhat.com/show_bug.cgi?id=757166#c16, which in my
opinion is too ugly to consider. But fortunately, I'm not the
maintainer for any IOMMU drivers.

My point about the docs is that often we think "this hardware is
clearly broken and the only workaround is X," but sometimes it's just
that we don't understand the hardware designer's intent. It may be
that the hardware was just never tested with DMAR and is indeed
broken, or it may be that it does work with DMAR given a different
driver structure or different device initialization. I just don't
want lack of imagination to force us to assume there's only one
workaround.
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-30 3:38 ` Bjorn Helgaas
@ 2012-12-02 13:26 ` Joerg Roedel
  0 siblings, 0 replies; 24+ messages in thread
From: Joerg Roedel @ 2012-12-02 13:26 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Robert Hancock, Justin Piszcz, Bruno Prémont, support, linux-kernel, Dan Williams, Jeff Garzik, linux-ide, David Woodhouse, iommu

On Thu, Nov 29, 2012 at 08:38:53PM -0700, Bjorn Helgaas wrote:
> That's essentially the patch at
> https://bugzilla.redhat.com/show_bug.cgi?id=757166#c16, which in my
> opinion is too ugly to consider. But fortunately, I'm not the
> maintainer for any IOMMU drivers.

There is a quirk infrastructure for these kinds of broken devices in
drivers/pci/quirks.c. Have a look at the function
pci_get_dma_source(). This function is used by the IOMMU drivers to
create the correct mappings.

Joerg
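The idea discussed in the last few messages (install the mappings under the function ID that really appears in the DMA requests) can be illustrated with a toy model. This is a conceptual Python sketch only, not kernel code and not how pci_get_dma_source() is implemented; the class `ToyIommu` and all of its method names are hypothetical. It assumes, as suggested above, that the AHCI function 84:00.0 issues DMA tagged as 84:00.1.

```python
class ToyIommu:
    """Toy model: a DMA request succeeds only if its requester ID has a mapping."""

    def __init__(self):
        # Requester ID ("bus:dev.fn") -> set of mapped addresses.
        self.mappings = {}
        # Function a driver is bound to -> function ID its DMA is tagged with.
        self.dma_source = {}

    def add_dma_source_quirk(self, function, source):
        """Record that DMA set up for `function` arrives tagged as `source`."""
        self.dma_source[function] = source

    def map_page(self, function, addr):
        """Map `addr` for `function`, and for its quirk source if one is set."""
        requesters = {function, self.dma_source.get(function, function)}
        for requester in requesters:
            self.mappings.setdefault(requester, set()).add(addr)

    def dma_allowed(self, requester_id, addr):
        """Return True if a request from `requester_id` to `addr` would pass."""
        return addr in self.mappings.get(requester_id, set())


# Without a quirk, the mapping exists only for 84:00.0, so a request
# tagged 84:00.1 is rejected (the fault seen in the log).
plain = ToyIommu()
plain.map_page("84:00.0", 0xFFF00000)
print(plain.dma_allowed("84:00.1", 0xFFF00000))

# With the quirk, the same mapping is also installed for 84:00.1.
quirked = ToyIommu()
quirked.add_dma_source_quirk("84:00.0", "84:00.1")
quirked.map_page("84:00.0", 0xFFF00000)
print(quirked.dma_allowed("84:00.1", 0xFFF00000))
```

The model installs the mapping under both IDs, which corresponds to the proposal that functions 0 and 1 share DMA regions; whether a real quirk should share in both directions is a design question the thread leaves open.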
* Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question
  2012-11-27 0:56 ` Bjorn Helgaas
  2012-11-27 1:00 ` Bjorn Helgaas
@ 2012-11-27 1:11 ` Dan Williams
  1 sibling, 0 replies; 24+ messages in thread
From: Dan Williams @ 2012-11-27 1:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bruno Prémont, Justin Piszcz, support, linux-kernel, Dan Williams, dave.jiang

On Mon, Nov 26, 2012 at 4:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> The "channel error register unreachable" message indicates that
> pci_read_config_dword() failed. The register in question
> (IOAT_PCI_CHANERR_INT_OFFSET) is at 0x180, so possibly we don't have
> PCI config accessors for the extended config space (0x100-0xfff). A
> complete dmesg log should show that.

Yes, this happens when extended PCI configuration space is not
reachable. However, it has since been found that the driver no longer
needs to access this register. So it appears this patch [1] from Dave
should be modified to stop touching that register altogether and then
go to -stable.

--
Dan

[1]: http://marc.info/?l=linux-kernel&m=135310841032707&w=2
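The point about offset 0x180 lying in extended config space can be checked from user space. The following Python sketch assumes the Linux sysfs `config` file for the device, whose length is expected to be 256 bytes when only conventional config space is accessible and 4096 bytes when extended space is reachable; reading past the first 64 bytes typically needs root. The helper names are hypothetical, and the device path is simply the first I/OAT function from the log at the top of the thread.

```python
import os
import struct

PCI_CFG_SPACE_SIZE = 256    # conventional config space, offsets 0x00-0xff
CHANERR_INT_OFFSET = 0x180  # register offset quoted in this thread


def needs_extended_space(offset):
    """True if a dword at `offset` lies beyond conventional config space."""
    return offset + 4 > PCI_CFG_SPACE_SIZE


def read_config_dword(config_path, offset):
    """Read a little-endian 32-bit value from a config-space file.

    Returns None if the file is too short to contain the register,
    which is what happens when extended config space is unreachable.
    """
    with open(config_path, "rb") as f:
        f.seek(offset)
        data = f.read(4)
    if len(data) < 4:
        return None
    return struct.unpack("<I", data)[0]


if __name__ == "__main__":
    # Assumed sysfs path for the first I/OAT function in the log.
    path = "/sys/bus/pci/devices/0000:00:04.0/config"
    if os.path.exists(path):
        try:
            value = read_config_dword(path, CHANERR_INT_OFFSET)
        except OSError as exc:
            print("could not read config space:", exc)
        else:
            print("unreachable" if value is None else hex(value))
```

A `None` result for offset 0x180 on a device that should have extended config space would be consistent with the missing config accessors suggested above, although the dmesg log remains the more direct evidence.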
Thread overview: 24+ messages
2012-11-24 19:40 Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question Justin Piszcz
2012-11-26 21:42 ` Bruno Prémont
2012-11-27 0:50 ` Justin Piszcz
2012-11-27 0:56 ` Bjorn Helgaas
2012-11-27 1:00 ` Bjorn Helgaas
2012-11-27 1:00 ` Justin Piszcz
2012-11-27 1:11 ` Bjorn Helgaas
2012-11-27 13:33 ` Justin Piszcz
2012-11-27 13:49 ` Justin Piszcz
2012-11-27 13:56 ` Justin Piszcz
2012-11-27 14:35 ` Justin Piszcz
2012-11-29 0:08 ` Bjorn Helgaas
2012-11-29 0:49 ` Justin Piszcz
2012-11-28 23:54 ` Bjorn Helgaas
2012-11-29 0:48 ` Justin Piszcz
2012-11-29 0:34 ` Robert Hancock
2012-11-29 0:49 ` Justin Piszcz
2012-11-29 0:55 ` Robert Hancock
2012-11-29 8:55 ` Justin Piszcz
2012-11-29 18:16 ` Bjorn Helgaas
2012-11-30 2:39 ` Robert Hancock
2012-11-30 3:38 ` Bjorn Helgaas
2012-12-02 13:26 ` Joerg Roedel
2012-11-27 1:11 ` Dan Williams