linux-ide.vger.kernel.org archive mirror
* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
       [not found] <bug-13001-10286@http.bugzilla.kernel.org/>
@ 2009-04-04 18:45 ` Andrew Morton
  2009-04-05  9:39   ` Данила Жукоцкий
  2009-04-06 22:48   ` Alan Cox
  0 siblings, 2 replies; 16+ messages in thread
From: Andrew Morton @ 2009-04-04 18:45 UTC
  To: linux-ide; +Cc: bugme-daemon, optimusgd, x86


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 3 Apr 2009 09:30:19 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=13001
> 
>            Summary: PCI-DMA: Out of IOMMU space
>            Product: Platform Specific/Hardware
>            Version: 2.5
>     Kernel Version: 2.6.29-gentoo
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: x86-64
>         AssignedTo: platform_x86_64@kernel-bugs.osdl.org
>         ReportedBy: optimusgd@gmail.com
>         Regression: Yes
> 
> 
> Created an attachment (id=20789)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=20789)
> hwreport generated info
> 
> After some IO activity the "PCI-DMA: Out of IOMMU space" message appears.
> 2.6.28-gentoo-r4 works OK, so it is a regression.

It is indeed a regression.

> Dmesg fragments:
> 
> 
> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
> 4096 bytes
> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
> 4096 bytes
> Apr  3 13:38:46 rngmhpamd ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
> Apr  3 13:38:46 rngmhpamd ata1: SWNCQ:qc_active 0x0 defer_bits 0x0
> last_issue_tag 0xfafbfcfd
> Apr  3 13:38:46 rngmhpamd dhfis 0x0 dmafis 0x0 sdbfis 0x0
> Apr  3 13:38:46 rngmhpamd ata1: ATA_REG 0x50 ERR_REG 0x0
> Apr  3 13:38:46 rngmhpamd ata1: tag : dhfis dmafis sdbfis sacitve
> Apr  3 13:38:46 rngmhpamd ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action
> 0x6
> Apr  3 13:38:46 rngmhpamd ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag
> 0 ncq 4096 in
> Apr  3 13:38:46 rngmhpamd res 50/00:00:00:00:00/00:45:00:00:00/a0 Emask 0x40
> (internal error)
> Apr  3 13:38:46 rngmhpamd ata1.00: status: { DRDY }
> Apr  3 13:38:46 rngmhpamd ata1: hard resetting link

Are these scary-looking messages also present in 2.6.28?

If so, perhaps the ata code is leaking DMA memory on the error-handling path?

> Apr  3 13:38:47 rngmhpamd ata1: SATA link up 3.0 Gbps (SStatus 123 SControl
> 300)
> Apr  3 13:38:47 rngmhpamd ata1.00: configured for UDMA/100
> Apr  3 13:38:47 rngmhpamd ata1: EH complete
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
> sectors: (250 GB/232 GiB)
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
> sectors: (250 GB/232 GiB)
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> 
> And
> 
> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 4608 bytes
> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 69632 bytes
> Mar 31 20:56:48 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> and address 8
> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 11776 bytes
> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 69632 bytes
> Mar 31 20:57:19 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> and address 8
> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 11776 bytes
> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 69632 bytes
> Mar 31 20:57:50 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> and address 8
> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 11776 bytes
> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 69632 bytes
> Mar 31 20:58:21 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> and address 8
> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 11776 bytes
> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 69632 bytes
> Mar 31 20:58:52 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> and address 8
> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 11776 bytes
> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> for 69632 bytes
> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Unhandled error code
> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Result: hostbyte=0x07
> driverbyte=0x00
> Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
> Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed

Do we have any debugging option for dumping the current PCI DMA
allocations, find out where it has all gone?
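
Short of a kernel-side option, a first step is to see which drivers hit
the failure and how often. A minimal Python sketch, assuming the syslog
format quoted above with wrapped lines rejoined; the function name and the
sample text are illustrative, not an existing tool:

```python
import re

# Matches e.g. "sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes"
PATTERN = re.compile(
    r"(?P<driver>\S+) (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCI-DMA: Out of IOMMU space for (?P<size>\d+) bytes"
)


def tally_iommu_failures(log_text):
    """Return {(driver, pci_address): [failure_count, total_bytes_requested]}."""
    totals = {}
    for m in PATTERN.finditer(log_text):
        key = (m.group("driver"), m.group("bdf"))
        entry = totals.setdefault(key, [0, 0])
        entry[0] += 1
        entry[1] += int(m.group("size"))
    return totals


# Sample lines taken from the report above (wrapped lines rejoined).
SAMPLE = """\
Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 4608 bytes
Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
"""

if __name__ == "__main__":
    for (driver, bdf), (count, total) in sorted(tally_iommu_failures(SAMPLE).items()):
        print(f"{driver} {bdf}: {count} failures, {total} bytes requested")
```

Note that this only shows which drivers fail to get a mapping. The driver
that exhausted the space is not necessarily one of them.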



* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-04 18:45 ` [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space Andrew Morton
@ 2009-04-05  9:39   ` Данила Жукоцкий
  2009-04-07 16:14     ` Grant Grundler
  2009-04-06 22:48   ` Alan Cox
  1 sibling, 1 reply; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-05  9:39 UTC
  To: Andrew Morton; +Cc: linux-ide, bugme-daemon, x86

I upgraded to 2.6.29-gentoo-r1 (2.6.29.1); the problem is still there and I
can trigger it easily. I booted with the default aperture, 64 MB, and while
writing to a USB HDD I get this:

Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
space for 65536 bytes
Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
space for 65536 bytes
Apr  5 14:29:27 rngmhpamd usb 1-4: reset high speed USB device using
ehci_hcd and address 6
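
For scale, a back-of-the-envelope calculation in Python. It assumes 4 KiB
IOMMU pages (the x86-64 base page size; the logs do not state this) and
page-aligned requests. The helper names are made up for illustration:

```python
PAGE_SIZE = 4096  # assumed IOMMU page size in bytes (x86-64 base page)


def aperture_pages(aperture_mib):
    """Number of page-sized slots in an aperture of the given size in MiB."""
    return aperture_mib * 1024 * 1024 // PAGE_SIZE


def pages_needed(request_bytes):
    """Pages needed to map a request, assuming it starts page-aligned."""
    return -(-request_bytes // PAGE_SIZE)  # ceiling division


if __name__ == "__main__":
    total = aperture_pages(64)
    need = pages_needed(65536)
    print(f"64 MiB aperture: {total} pages")
    print(f"65536-byte request: {need} pages ({need / total:.4%} of the aperture)")
```

Under these assumptions a single failing 64 KiB request is a tiny fraction
of an empty 64 MB aperture, which would fit the space having been used up
by mappings that were never released rather than by one oversized request.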


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-04 18:45 ` [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space Andrew Morton
  2009-04-05  9:39   ` Данила Жукоцкий
@ 2009-04-06 22:48   ` Alan Cox
  2009-04-07  1:49     ` FUJITA Tomonori
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Cox @ 2009-04-06 22:48 UTC
  To: Andrew Morton; +Cc: linux-ide, bugme-daemon, optimusgd, x86

> > Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
> > Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
> 
> Do we have any debugging option for dumping the current PCI DMA
> allocations, find out where it has all gone?

Turn on CONFIG_IOMMU_LEAK and CONFIG_IOMMU_DEBUG for a test run.


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-06 22:48   ` Alan Cox
@ 2009-04-07  1:49     ` FUJITA Tomonori
  2009-04-07  5:05       ` Данила Жукоцкий
  0 siblings, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2009-04-07  1:49 UTC
  To: alan; +Cc: akpm, linux-ide, bugme-daemon, optimusgd, x86

On Mon, 6 Apr 2009 23:48:16 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > > Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
> > > Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
> > 
> > Do we have any debugging option for dumping the current PCI DMA
> > allocations, find out where it has all gone?
> 
> Turn on CONFIG_IOMMU_LEAK and CONFIG_IOMMU_DEBUG for a test run.

These options don't tell you who leaks DMA mappings, so they may not be
very helpful.

There are no interesting changes in the GART IOMMU code between 2.6.28 and
2.6.29, so this is probably a driver bug. We need to find out which
driver leaks DMA mappings.


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07  1:49     ` FUJITA Tomonori
@ 2009-04-07  5:05       ` Данила Жукоцкий
  2009-04-07  8:17         ` Данила Жукоцкий
  2009-04-07  8:38         ` FUJITA Tomonori
  0 siblings, 2 replies; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07  5:05 UTC
  To: FUJITA Tomonori; +Cc: alan, akpm, linux-ide, bugme-daemon, x86

I can't turn on CONFIG_IOMMU_LEAK and CONFIG_IOMMU_DEBUG because
2.6.26.1 does not have such options in .config. Maybe the git kernel has
these options? The bug is triggered when I try to mount a SATA drive on
sata_nv and a USB HDD on ehci_hcd. In all other cases I have no problem.

I also have a mysterious problem similar to
http://lkml.org/lkml/2009/1/11/302 . It prevents me from returning to the
previously working 2.6.28 kernel. After I booted 2.6.29 once and found the
DMA bug, I rebooted into the old, working 2.6.28 and saw that my eth0 and
eth1 (forcedeth) had died with "no link during initialization". Restarting
the interfaces and removing and reinserting the module don't help.
I did not change the configs and did not rebuild the old kernel; I did not
touch it at all. When I reboot into 2.6.29, the network works. A cold
restart doesn't help, and I don't want to reset the BIOS. I looked at the
2.6.28 dmesg and saw that the IRQs for forcedeth look different from those
in other 2.6.28 dmesgs. So 2.6.29 does strange things to my machine, and
those things survive a cold restart. I don't know what is happening; I
can't understand it, and I can't open the case and hard-reset the BIOS
every time I want to return to the old working kernel. Can you explain
what is wrong?


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07  5:05       ` Данила Жукоцкий
@ 2009-04-07  8:17         ` Данила Жукоцкий
  2009-04-07  8:38         ` FUJITA Tomonori
  1 sibling, 0 replies; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07  8:17 UTC
  To: FUJITA Tomonori; +Cc: alan, akpm, linux-ide, bugme-daemon, x86

I tried 2.6.29-git14 and can't find CONFIG_IOMMU_LEAK or
CONFIG_IOMMU_DEBUG there. Something strange happens with modules: I get
"Invalid module format" for all of them, so the kernel is completely
unusable and the USB keyboard does not work.

-- 
Best regards, Данила Жукоцкий, system administrator at ЗАО "Роснефтегазмаш"


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07  5:05       ` Данила Жукоцкий
  2009-04-07  8:17         ` Данила Жукоцкий
@ 2009-04-07  8:38         ` FUJITA Tomonori
  2009-04-07  8:43           ` FUJITA Tomonori
  1 sibling, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2009-04-07  8:38 UTC
  To: optimusgd; +Cc: fujita.tomonori, alan, akpm, linux-ide, bugme-daemon, x86

> I can't turn on CONFIG_IOMMU_LEAK and CONFIG_IOMMU_DEBUG because
> 2.6.26.1 does not have such options in .config. Maybe the git kernel has
> these options?

2.6.26.1 has these options. You probably haven't enabled
DEBUG_KERNEL. But the feature is not very useful, as I said.
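
A quick way to check this against an existing .config, as a Python sketch.
The option names are the ones mentioned in this thread; the sample .config
text and the helper names are made up for illustration. An option whose
dependencies are unmet is typically absent from .config altogether, which
would match "not have such options in .config":

```python
def config_enabled(config_text, option):
    """True if `option` is set to y or m in the given .config text."""
    for line in config_text.splitlines():
        if line.strip() in (f"{option}=y", f"{option}=m"):
            return True
    return False


def missing_options(config_text, options):
    """Return the options from `options` that are not enabled."""
    return [opt for opt in options if not config_enabled(config_text, opt)]


# Hypothetical .config fragment, not taken from the reporter's machine.
SAMPLE_CONFIG = """\
# CONFIG_DEBUG_KERNEL is not set
CONFIG_GART_IOMMU=y
"""

WANTED = ["CONFIG_DEBUG_KERNEL", "CONFIG_IOMMU_DEBUG", "CONFIG_IOMMU_LEAK"]

if __name__ == "__main__":
    print("not enabled:", missing_options(SAMPLE_CONFIG, WANTED))
```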


> The bug is triggered when I try to mount a SATA drive on sata_nv and a
> USB HDD on ehci_hcd. In all other cases I have no problem.

One of them might be causing your problem. As I wrote in the previous mail,
we need to know which driver is bad.

1. Build a kernel without USB storage or any NICs. Then do lots of
I/O for several minutes to see if sata_nv is fine.

2. If sata_nv seems to be fine, build a kernel without any NICs (I
guess your box can't boot without sata_nv). Then do lots of I/O on
USB storage.

3. If both are fine, you could have another bad driver (it might be a NIC).
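
The three steps above amount to a simple elimination procedure. A sketch
in Python, where the test labels and the function name are hypothetical
and the inputs are the results observed by hand after each test run:

```python
def isolate_bad_driver(fails_under_io):
    """Walk the three isolation steps and name the suspect.

    `fails_under_io` maps a test label to True if "Out of IOMMU space"
    showed up during that test. Labels (hypothetical):
      "sata_only": kernel without USB storage or NICs, heavy I/O on sata_nv
      "sata_usb":  kernel without NICs, heavy I/O on USB storage
    """
    if fails_under_io["sata_only"]:
        return "sata_nv"
    if fails_under_io["sata_usb"]:
        return "USB storage"
    return "another driver (possibly a NIC)"


if __name__ == "__main__":
    print(isolate_bad_driver({"sata_only": False, "sata_usb": True}))
```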


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07  8:38         ` FUJITA Tomonori
@ 2009-04-07  8:43           ` FUJITA Tomonori
  2009-04-07  8:52             ` Данила Жукоцкий
  0 siblings, 1 reply; 16+ messages in thread
From: FUJITA Tomonori @ 2009-04-07  8:43 UTC
  To: fujita.tomonori; +Cc: optimusgd, alan, akpm, linux-ide, bugme-daemon, x86

On Tue, 7 Apr 2009 17:38:30 +0900
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> > I can't turn on CONFIG_IOMMU_LEAK and CONFIG_IOMMU_DEBUG because
> > 2.6.26.1 does not have such options in .config. Maybe the git kernel has
> > these options?
> 
> 2.6.26.1 has these options. You probably haven't enabled
> DEBUG_KERNEL. But the feature is not very useful, as I said.

Wait, you use 2.6.29, right?


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07  8:43           ` FUJITA Tomonori
@ 2009-04-07  8:52             ` Данила Жукоцкий
  2009-04-07 11:08               ` Данила Жукоцкий
  0 siblings, 1 reply; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07  8:52 UTC
  To: FUJITA Tomonori; +Cc: alan, akpm, linux-ide, bugme-daemon, x86

My bad, just a typo; I use 2.6.29.1, of course. My boot device is a 3ware
RAID, so I will now try to isolate the bug as you suggest.

-- 
Best regards, Данила Жукоцкий, system administrator at ЗАО "Роснефтегазмаш"


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07  8:52             ` Данила Жукоцкий
@ 2009-04-07 11:08               ` Данила Жукоцкий
  2009-04-07 14:21                 ` Данила Жукоцкий
  0 siblings, 1 reply; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07 11:08 UTC
  To: FUJITA Tomonori; +Cc: alan, akpm, linux-ide, bugme-daemon, x86

OK, here is what I have done.
I built a kernel with CONFIG_IOMMU_LEAK and CONFIG_IOMMU_DEBUG, and with
forcedeth and all the USB and SATA stuff as modules. I rebooted with the
default aperture, 64 MB, and ran rmmod on ehci_usb and forcedeth. From the
sata_nv HDD /dev/sdb1 I can read 80 GB of data without problems, but dmesg
is flooded with "dma_map_sg overflow" messages. When I try to write data
to /dev/sdb1, the computer hangs completely after one or two seconds. No
oopses, clean logs; it just hangs and I have to hard-reset it.



* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07 11:08               ` Данила Жукоцкий
@ 2009-04-07 14:21                 ` Данила Жукоцкий
  0 siblings, 0 replies; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07 14:21 UTC
  To: FUJITA Tomonori; +Cc: alan, akpm, linux-ide, bugme-daemon, x86

I repeated the test on a bare console in single-user mode, with all modules
removed from the kernel and only libata, ahci, sata_nv and pata_amd
inserted. After reading and writing ~30 GB of data, dmesg showed the first
"dma_map_sg overflow" message, on a write operation. Maybe this
information helps.

-- 
Best regards, Данила Жукоцкий, system administrator at ЗАО "Роснефтегазмаш"


* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-05  9:39   ` Данила Жукоцкий
@ 2009-04-07 16:14     ` Grant Grundler
       [not found]       ` <db2b43030904070946x495cf76et2fd4571f0ac96634@mail.gmail.com>
  0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2009-04-07 16:14 UTC
  To: Данила Жукоцкий
  Cc: Andrew Morton, linux-ide, bugme-daemon, x86

2009/4/5 Данила Жукоцкий <optimusgd@gmail.com>:
>>> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
>>> 4096 bytes

The bug report has a "dmesg" attachment but I wasn't able to find the
"Out of IOMMU space" message in the dmesg.  Can that be corrected?
I was looking for IDE/SATA errors *before* the IOMMU errors.

But I was surprised to find these bits:
...
Kernel command line: mce=bootlog root=/dev/ram0
real_root=/dev/evms/root init=/linuxrc iommu=allowdac,merge,memaper=3
3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
Initializing CPU#0
...
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 256 MB of RAM
...

I'm not familiar with the iommu= parameter or the warning about the BIOS.
Any comments on that?

thanks,
grant
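
For reference, a Python sketch of how the iommu= part of that command line
breaks down. It assumes the memaper=<order> semantics described in the
kernel's x86-64 boot-options documentation (an aperture of 32 MB << order,
default order 1, i.e. 64 MB); that formula should be checked against the
documentation for the kernel version in use. The function names are
illustrative:

```python
def parse_iommu_options(cmdline):
    """Return the comma-separated iommu= options as a dict (None if bare)."""
    options = {}
    for word in cmdline.split():
        if word.startswith("iommu="):
            for item in word[len("iommu="):].split(","):
                name, _, value = item.partition("=")
                options[name] = value or None
    return options


def memaper_size_mb(order):
    """Aperture size in MB for memaper=<order>, assuming 32 MB << order."""
    return 32 << order


# Kernel command line as quoted from the attached dmesg.
CMDLINE = (
    "mce=bootlog root=/dev/ram0 real_root=/dev/evms/root init=/linuxrc "
    "iommu=allowdac,merge,memaper=3 3w_9xxx.use_msi=1 "
    "snd-hda-intel.enable_msi=1 doevms quiet"
)

if __name__ == "__main__":
    opts = parse_iommu_options(CMDLINE)
    print(opts)
    print(memaper_size_mb(int(opts["memaper"])), "MB")
```

Under that assumption memaper=3 gives 256 MB, which lines up with the
"This costs you 256 MB of RAM" line in the same dmesg.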

>>> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
>>> 4096 bytes
>>> Apr  3 13:38:46 rngmhpamd ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
>>> Apr  3 13:38:46 rngmhpamd ata1: SWNCQ:qc_active 0x0 defer_bits 0x0
>>> last_issue_tag 0xfafbfcfd
>>> Apr  3 13:38:46 rngmhpamd dhfis 0x0 dmafis 0x0 sdbfis 0x0
>>> Apr  3 13:38:46 rngmhpamd ata1: ATA_REG 0x50 ERR_REG 0x0
>>> Apr  3 13:38:46 rngmhpamd ata1: tag : dhfis dmafis sdbfis sacitve
>>> Apr  3 13:38:46 rngmhpamd ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action
>>> 0x6
>>> Apr  3 13:38:46 rngmhpamd ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag
>>> 0 ncq 4096 in
>>> Apr  3 13:38:46 rngmhpamd res 50/00:00:00:00:00/00:45:00:00:00/a0 Emask 0x40
>>> (internal error)
>>> Apr  3 13:38:46 rngmhpamd ata1.00: status: { DRDY }
>>> Apr  3 13:38:46 rngmhpamd ata1: hard resetting link
>>
>> Are these scary-looking messages also present in 2.6.28?
>>
>> If so, perhaps the ata code is leaking DMA memory on the error-handling path?
>>
>>> Apr  3 13:38:47 rngmhpamd ata1: SATA link up 3.0 Gbps (SStatus 123 SControl
>>> 300)
>>> Apr  3 13:38:47 rngmhpamd ata1.00: configured for UDMA/100
>>> Apr  3 13:38:47 rngmhpamd ata1: EH complete
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
>>> sectors: (250 GB/232 GiB)
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
>>> enabled, doesn't support DPO or FUA
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
>>> sectors: (250 GB/232 GiB)
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
>>> enabled, doesn't support DPO or FUA
>>>
>>> And
>>>
>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 4608 bytes
>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 69632 bytes
>>> Mar 31 20:56:48 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>> and address 8
>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 11776 bytes
>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 69632 bytes
>>> Mar 31 20:57:19 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>> and address 8
>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 11776 bytes
>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 69632 bytes
>>> Mar 31 20:57:50 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>> and address 8
>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 11776 bytes
>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 69632 bytes
>>> Mar 31 20:58:21 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>> and address 8
>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 11776 bytes
>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 69632 bytes
>>> Mar 31 20:58:52 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>> and address 8
>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 11776 bytes
>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>> for 69632 bytes
>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Unhandled error code
>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Result: hostbyte=0x07
>>> driverbyte=0x00
>>> Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
>>> Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
>>
>> Do we have any debugging option for dumping the current PCI DMA
>> allocations, find out where it has all gone?
>>
>>
>
> Upgraded to 2.6.29-gentoo-r1 (2.6.29.1); the problem is still there and I can
> easily trigger it. I booted with the default aperture, 64 MB, and while
> writing to a USB HDD got this:
>
> Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
> space for 65536 bytes
> Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
> space for 65536 bytes
> Apr  5 14:29:27 rngmhpamd usb 1-4: reset high speed USB device using
> ehci_hcd and address 6
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Fwd: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
       [not found]       ` <db2b43030904070946x495cf76et2fd4571f0ac96634@mail.gmail.com>
@ 2009-04-07 16:52         ` Данила Жукоцкий
  2009-04-07 20:23           ` Данила Жукоцкий
  0 siblings, 1 reply; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07 16:52 UTC (permalink / raw)
  To: akpm, linux-ide, bugme-daemon, x86

Forgot to reply to all, sorry.

---------- Forwarded message ----------
From: Данила Жукоцкий <optimusgd@gmail.com>
Date: 2009/4/7
Subject: Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
To: Grant Grundler <grundler@google.com>


Yes, the attachment is a clean dmesg; the warnings are in the body of the bug report.

>Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
>Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes

I got "PCI-DMA: Out of IOMMU space" while trying to write data to a USB or
SATA HDD, before all the other error messages. After that the USB and SATA
drives were lost. All the other noise is attempts to communicate with the
dead devices.
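
To separate the IOMMU failures from the follow-on noise, the log can be filtered and tallied per device. The following is only a rough sketch: the function name tally_iommu_failures and the regular expression are made up for this illustration, and it assumes unwrapped syslog lines in the format shown above (the mail client wrapped some of the lines quoted in this thread).

```python
import re
from collections import defaultdict

# Assumed line format, taken from the excerpts in this thread:
#   "<date> <host> <driver> <pci-address>: PCI-DMA: Out of IOMMU space for <N> bytes"
PATTERN = re.compile(
    r"(?P<driver>\S+) "
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): "
    r"PCI-DMA: Out of IOMMU space for (?P<size>\d+) bytes"
)

def tally_iommu_failures(lines):
    """Return {(driver, pci_address): (failure_count, total_bytes_requested)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not an IOMMU failure; skip resets, EH messages, etc.
        key = (match.group("driver"), match.group("dev"))
        totals[key][0] += 1
        totals[key][1] += int(match.group("size"))
    return {key: tuple(value) for key, value in totals.items()}

sample = [
    "Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes",
    "Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes",
    "Apr  3 13:38:46 rngmhpamd ata1: hard resetting link",
]

for (driver, dev), (count, total) in tally_iommu_failures(sample).items():
    print(f"{driver} {dev}: {count} failures, {total} bytes requested")
```

This only counts failed mapping requests; it says nothing about which allocations are still holding the aperture.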

>Kernel command line:
>mce=bootlog root=/dev/ram0 real_root=/dev/evms/root init=/linuxrc
I boot from a 3ware RAID with EVMS.
>iommu=allowdac,merge,memaper=3
I added iommu=allowdac to try to avoid a DMA bug. Maybe it is not needed.
This is from Documentation/x86/x86_64/boot-options.txt:
allowdac           Allow double-address cycle (DAC) mode, i.e. DMA >4GB.
                      DAC is used with 32-bit PCI to push a 64-bit address in
                      two cycles. When off all DMA over >4GB is forced through
                      an IOMMU or software bounce buffering.
merge              Do scatter-gather (SG) merging.
memaper[=<order>]  Allocate an own aperture over RAM with size 32MB<<order.
                      (default: order=1, i.e. 64MB)
With the default aperture, 64 MB, DMA space leaks very fast. I now use
memaper=5 (1 GB), because I have to do my job and can't roll back to 2.6.28
due to a strange, mysterious problem with forcedeth NICs that I can't
explain or solve. If no solution for the DMA leak is found, I will try
to file a bug report about the problem with the NICs.
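
As a quick check of the aperture sizes mentioned here, the documented formula (32MB<<order) can be worked through in a few lines. This is just an illustrative sketch; the helper name aperture_mb is made up for the example and is not a kernel interface.

```python
# Illustrative only: aperture_mb is a hypothetical helper, not a kernel API.
# It applies the formula quoted from boot-options.txt: size = 32MB << order.

def aperture_mb(order: int) -> int:
    """Return the memaper aperture size in MB for the given order."""
    return 32 << order

for order in (1, 3, 5):
    print(f"memaper={order}: {aperture_mb(order)} MB")
```

So order=1 is the 64 MB default, order=3 matches the 256 MB in the boot warning, and order=5 is the 1 GB I use now.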

>3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
I prefer to use MSI on this hardware.
>...
>Your BIOS doesn't leave a aperture memory hole
>Please enable the IOMMU option in the BIOS setup
>This costs you 256 MB of RAM

The xw9400 BIOS does not have an IOMMU option in its setup. Now this costs
me 1 GB of RAM.

Anyway, I can reliably reproduce the bug without all these bells and whistles.

2009/4/7 Grant Grundler <grundler@google.com>:
> 2009/4/5 Данила Жукоцкий <optimusgd@gmail.com>:
>> 2009/4/4 Andrew Morton <akpm@linux-foundation.org>:
>>>
>>> (switched to email.  Please respond via emailed reply-to-all, not via the
>>> bugzilla web interface).
>>>
>>> On Fri, 3 Apr 2009 09:30:19 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
>>>
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=13001
>>>>
>>>>            Summary: PCI-DMA: Out of IOMMU space
>>>>            Product: Platform Specific/Hardware
>>>>            Version: 2.5
>>>>     Kernel Version: 2.6.29-gentoo
>>>>           Platform: All
>>>>         OS/Version: Linux
>>>>               Tree: Mainline
>>>>             Status: NEW
>>>>           Severity: normal
>>>>           Priority: P1
>>>>          Component: x86-64
>>>>         AssignedTo: platform_x86_64@kernel-bugs.osdl.org
>>>>         ReportedBy: optimusgd@gmail.com
>>>>         Regression: Yes
>>>>
>>>>
>>>> Created an attachment (id=20789)
>>>>  --> (http://bugzilla.kernel.org/attachment.cgi?id=20789)
>>>> hwreport generated info
>>>>
>>>> After some IO activity the "PCI-DMA: Out of IOMMU space" message appears.
>>>> 2.6.28-gentoo-r4 works OK, so it is a regression.
>>>
>>> It is indeed a regression.
>>>
>>>> Dmesg fragments:
>>>>
>>>>
>>>> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
>>>> 4096 bytes
>
> The bug report has a "dmesg" attachment but I wasn't able to find the
> "Out of IOMMU space" message in the dmesg.  Can that be corrected?
> I was looking for IDE/SATA errors *before* the IOMMU errors.
>
> But I was surprised to find these bits:
> ...
> Kernel command line: mce=bootlog root=/dev/ram0
> real_root=/dev/evms/root init=/linuxrc iommu=allowdac,merge,memaper=3
> 3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
> Initializing CPU#0
> ...
> Your BIOS doesn't leave a aperture memory hole
> Please enable the IOMMU option in the BIOS setup
> This costs you 256 MB of RAM
> ...
>
> I'm not familiar with iommu= parameter nor the warning about the BIOS.
> Any comments on that?
>
> thanks,
> grant
>
>>>> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
>>>> 4096 bytes
>>>> Apr  3 13:38:46 rngmhpamd ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
>>>> Apr  3 13:38:46 rngmhpamd ata1: SWNCQ:qc_active 0x0 defer_bits 0x0
>>>> last_issue_tag 0xfafbfcfd
>>>> Apr  3 13:38:46 rngmhpamd dhfis 0x0 dmafis 0x0 sdbfis 0x0
>>>> Apr  3 13:38:46 rngmhpamd ata1: ATA_REG 0x50 ERR_REG 0x0
>>>> Apr  3 13:38:46 rngmhpamd ata1: tag : dhfis dmafis sdbfis sacitve
>>>> Apr  3 13:38:46 rngmhpamd ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action
>>>> 0x6
>>>> Apr  3 13:38:46 rngmhpamd ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag
>>>> 0 ncq 4096 in
>>>> Apr  3 13:38:46 rngmhpamd res 50/00:00:00:00:00/00:45:00:00:00/a0 Emask 0x40
>>>> (internal error)
>>>> Apr  3 13:38:46 rngmhpamd ata1.00: status: { DRDY }
>>>> Apr  3 13:38:46 rngmhpamd ata1: hard resetting link
>>>
>>> Are these scary-looking messages also present in 2.6.28?
>>>
>>> If so, perhaps the ata code is leaking DMA memory on the error-handling path?
>>>
>>>> Apr  3 13:38:47 rngmhpamd ata1: SATA link up 3.0 Gbps (SStatus 123 SControl
>>>> 300)
>>>> Apr  3 13:38:47 rngmhpamd ata1.00: configured for UDMA/100
>>>> Apr  3 13:38:47 rngmhpamd ata1: EH complete
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
>>>> sectors: (250 GB/232 GiB)
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
>>>> enabled, doesn't support DPO or FUA
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
>>>> sectors: (250 GB/232 GiB)
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
>>>> enabled, doesn't support DPO or FUA
>>>>
>>>> And
>>>>
>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 4608 bytes
>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 69632 bytes
>>>> Mar 31 20:56:48 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>>> and address 8
>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 11776 bytes
>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 69632 bytes
>>>> Mar 31 20:57:19 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>>> and address 8
>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 11776 bytes
>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 69632 bytes
>>>> Mar 31 20:57:50 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>>> and address 8
>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 11776 bytes
>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 69632 bytes
>>>> Mar 31 20:58:21 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>>> and address 8
>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 11776 bytes
>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 69632 bytes
>>>> Mar 31 20:58:52 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
>>>> and address 8
>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 11776 bytes
>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
>>>> for 69632 bytes
>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Unhandled error code
>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Result: hostbyte=0x07
>>>> driverbyte=0x00
>>>> Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
>>>> Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
>>>
>>> Do we have any debugging option for dumping the current PCI DMA
>>> allocations, find out where it has all gone?
>>>
>>>
>>
>> Upgraded to 2.6.29-gentoo-r1 (2.6.29.1); the problem is still there and I can
>> easily trigger it. I booted with the default aperture, 64 MB, and while
>> writing to a USB HDD got this:
>>
>> Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
>> space for 65536 bytes
>> Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
>> space for 65536 bytes
>> Apr  5 14:29:27 rngmhpamd usb 1-4: reset high speed USB device using
>> ehci_hcd and address 6
>>
>



-- 
Best regards, Данила Жукоцкий, system administrator, ЗАО "Роснефтегазмаш"

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07 16:52         ` Fwd: " Данила Жукоцкий
@ 2009-04-07 20:23           ` Данила Жукоцкий
  2009-04-07 20:59             ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Данила Жукоцкий @ 2009-04-07 20:23 UTC (permalink / raw)
  To: akpm, linux-ide, bugme-daemon, x86

Thank you, Grant, for your simple questions! Without "allowdac", after a
couple of hours of testing I cannot reproduce the bug! So it is my stupid
mistake; I don't understand why I added this absolutely unusual
parameter to the boot string. I apologize to all of you for this stupid
mindfuck. You may close the bug, because it is a bug in my damn head.




-- 
Best regards, Данила Жукоцкий, system administrator, ЗАО "Роснефтегазмаш"

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07 20:23           ` Данила Жукоцкий
@ 2009-04-07 20:59             ` Grant Grundler
  2009-04-07 21:14               ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2009-04-07 20:59 UTC (permalink / raw)
  To: Данила Жукоцкий
  Cc: akpm, linux-ide, bugme-daemon, x86

2009/4/7 Данила Жукоцкий <optimusgd@gmail.com>
>
> Thank you, Grant, for your simple questions!

Welcome!
I was just trying to point out some additional information to folks
who might understand the code. I don't in this case but was looking
for more clues.

>
> Without "allowdac", after a
> couple of hours of testing I cannot reproduce the bug! So it is my stupid
> mistake; I don't understand why I added this absolutely unusual
> parameter to the boot string.

I thought I understood IOMMUs, having written code for four different
implementations. I don't understand why the "allowdac" parameter exists.
The dma_mask stuff should be handling this already.
Can someone explain *why* this parameter exists? (Данила already posted
the docs.) That is, what is the use case where the dma_mask APIs don't work?

> I apologize to all of you for this stupid
> mindfuck. You may close the bug, because it is a bug in my damn head.

My first impression: the bug is either that allowdac exists instead of
using the DMA API (see Documentation/DMA-API.txt), OR that the
documentation for allowdac is missing warnings about "this could break
your system" and does not clearly specify when it should be used.

hth,
grant

>
> 2009/4/7 Данила Жукоцкий <optimusgd@gmail.com>:
> > Forgot reply to all, sorry
> >
> > ---------- Forwarded message ----------
> > From: Данила Жукоцкий <optimusgd@gmail.com>
> > Date: 2009/4/7
> > Subject: Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
> > To: Grant Grundler <grundler@google.com>
> >
> >
> > Yes, in attachment clear dmesg, warnings in bugreport body
> >
> >>Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
> >>Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
> >
> > I got "PCI-DMA: Out of IOMMU space" while trying write data to usb or
> > sata hdd before all other error messages. After that usb and sata
> > drives lost. All other noise is attempts to communicate with died
> > devices.
> >
> >>Kernel command line:
> >>mce=bootlog root=/dev/ram0 real_root=/dev/evms/root init=/linuxrc
> > I'm boot from 3ware raid with evmc
> >>iommu=allowdac,merge,memaper=3
> > This is from Documentation/x86/x86_64/boot-options.txt
> > iommu=allowdac Im try to avoid DMA bug. May be that not need.
> > allowdac           Allow double-address cycle (DAC) mode, i.e. DMA >4GB.
> >                       DAC is used with 32-bit PCI to push a 64-bit address in
> >                       two cycles. When off all DMA over >4GB is forced through
> >                       an IOMMU or software bounce buffering.
> > merge              Do scatter-gather (SG) merging.
> > memaper[=<order>]  Allocate an own aperture over RAM with size 32MB<<order.
> >                       (default: order=1, i.e. 64MB)
> > With default apperture, 64mb, DMA leak very fast, now i have
> > memaper=5, 1 gb, becouse i must do my job and can't rollback to 2.6.28
> > due strange mysterious problem with forcedeth nics that i can't
> > explain and solve. If solution for DMA leak will not be found, i'm try
> > to fill bugreport about problem with nics.
> >
> >>3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
> > I prefer use msi on that hardware.
> >>...
> >>Your BIOS doesn't leave a aperture memory hole
> >>Please enable the IOMMU option in the BIOS setup
> >>This costs you 256 MB of RAM
> >
> > xw9400 BIOS do not have IOMMU option in the BIOS setup. Now this costs
> > me 1gb of ram
> >
> > Anyway, i can stable reproduce bug without all this whistlers
> >
> > 2009/4/7 Grant Grundler <grundler@google.com>:
> >> 2009/4/5 Данила Жукоцкий <optimusgd@gmail.com>:
> >>> 2009/4/4 Andrew Morton <akpm@linux-foundation.org>:
> >>>>
> >>>> (switched to email.  Please respond via emailed reply-to-all, not via the
> >>>> bugzilla web interface).
> >>>>
> >>>> On Fri, 3 Apr 2009 09:30:19 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> >>>>
> >>>>> http://bugzilla.kernel.org/show_bug.cgi?id=13001
> >>>>>
> >>>>>            Summary: PCI-DMA: Out of IOMMU space
> >>>>>            Product: Platform Specific/Hardware
> >>>>>            Version: 2.5
> >>>>>     Kernel Version: 2.6.29-gentoo
> >>>>>           Platform: All
> >>>>>         OS/Version: Linux
> >>>>>               Tree: Mainline
> >>>>>             Status: NEW
> >>>>>           Severity: normal
> >>>>>           Priority: P1
> >>>>>          Component: x86-64
> >>>>>         AssignedTo: platform_x86_64@kernel-bugs.osdl.org
> >>>>>         ReportedBy: optimusgd@gmail.com
> >>>>>         Regression: Yes
> >>>>>
> >>>>>
> >>>>> Created an attachment (id=20789)
> >>>>>  --> (http://bugzilla.kernel.org/attachment.cgi?id=20789)
> >>>>> hwreport generated info
> >>>>>
> >>>>> After some IO activity the "PCI-DMA: Out of IOMMU space" message appear.
> >>>>> 2.6.28-gentoo-r4 work ok, so it is regression.
> >>>>
> >>>> It is indeed a regression.
> >>>>
> >>>>> Dmesg fragments:
> >>>>>
> >>>>>
> >>>>> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
> >>>>> 4096 bytes
> >>
> >> The bug report has a "dmesg" attachment but I wasn't able to find the
> >> "Out of IOMMU space" message in the dmesg.  Can that be corrected?
> >> I was looking for IDE/SATA errors *before* the IOMMU errors.
> >>
> >> But I was surprised to find these bits:
> >> ...
> >> Kernel command line: mce=bootlog root=/dev/ram0
> >> real_root=/dev/evms/root init=/linuxrc iommu=allowdac,merge,memaper=3
> >> 3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
> >> Initializing CPU#0
> >> ...
> >> Your BIOS doesn't leave a aperture memory hole
> >> Please enable the IOMMU option in the BIOS setup
> >> This costs you 256 MB of RAM
> >> ...
> >>
> >> I'm not familiar with iommu= parameter nor the warning about the BIOS.
> >> Any comments on that?
> >>
> >> thanks,
> >> grant
> >>
> >>>>> Apr  3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for
> >>>>> 4096 bytes
> >>>>> Apr  3 13:38:46 rngmhpamd ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
> >>>>> Apr  3 13:38:46 rngmhpamd ata1: SWNCQ:qc_active 0x0 defer_bits 0x0
> >>>>> last_issue_tag 0xfafbfcfd
> >>>>> Apr  3 13:38:46 rngmhpamd dhfis 0x0 dmafis 0x0 sdbfis 0x0
> >>>>> Apr  3 13:38:46 rngmhpamd ata1: ATA_REG 0x50 ERR_REG 0x0
> >>>>> Apr  3 13:38:46 rngmhpamd ata1: tag : dhfis dmafis sdbfis sacitve
> >>>>> Apr  3 13:38:46 rngmhpamd ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action
> >>>>> 0x6
> >>>>> Apr  3 13:38:46 rngmhpamd ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag
> >>>>> 0 ncq 4096 in
> >>>>> Apr  3 13:38:46 rngmhpamd res 50/00:00:00:00:00/00:45:00:00:00/a0 Emask 0x40
> >>>>> (internal error)
> >>>>> Apr  3 13:38:46 rngmhpamd ata1.00: status: { DRDY }
> >>>>> Apr  3 13:38:46 rngmhpamd ata1: hard resetting link
> >>>>
> >>>> Are these scary-looking messages also present in 2.6.28?
> >>>>
> >>>> If so, perhaps the ata code is leaking DMA memory on the error-handling path?
> >>>>
> >>>>> Apr  3 13:38:47 rngmhpamd ata1: SATA link up 3.0 Gbps (SStatus 123 SControl
> >>>>> 300)
> >>>>> Apr  3 13:38:47 rngmhpamd ata1.00: configured for UDMA/100
> >>>>> Apr  3 13:38:47 rngmhpamd ata1: EH complete
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
> >>>>> sectors: (250 GB/232 GiB)
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
> >>>>> enabled, doesn't support DPO or FUA
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware
> >>>>> sectors: (250 GB/232 GiB)
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> >>>>> Apr  3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache:
> >>>>> enabled, doesn't support DPO or FUA
> >>>>>
> >>>>> And
> >>>>>
> >>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 4608 bytes
> >>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 69632 bytes
> >>>>> Mar 31 20:56:48 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> >>>>> and address 8
> >>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 11776 bytes
> >>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 69632 bytes
> >>>>> Mar 31 20:57:19 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> >>>>> and address 8
> >>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 11776 bytes
> >>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 69632 bytes
> >>>>> Mar 31 20:57:50 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> >>>>> and address 8
> >>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 11776 bytes
> >>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 69632 bytes
> >>>>> Mar 31 20:58:21 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> >>>>> and address 8
> >>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 11776 bytes
> >>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 69632 bytes
> >>>>> Mar 31 20:58:52 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd
> >>>>> and address 8
> >>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 11776 bytes
> >>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space
> >>>>> for 69632 bytes
> >>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Unhandled error code
> >>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Result: hostbyte=0x07
> >>>>> driverbyte=0x00
> >>>>> Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
> >>>>> Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
> >>>>
> >>>> Do we have any debugging option for dumping the current PCI DMA
> >>>> allocations, find out where it has all gone?
> >>>>
> >>>>
> >>>
> >>> Upgrade to 2.6.29-gentoo-r1 (2.6.29.1), problem is still here, can
> >>> easyly trigger it. I boot with default apperture, 64mb, and while
> >>> write to usb-hdd get this:
> >>>
> >>> Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
> >>> space for 65536 bytes
> >>> Apr  5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU
> >>> space for 65536 bytes
> >>> Apr  5 14:29:27 rngmhpamd usb 1-4: reset high speed USB device using
> >>> ehci_hcd and address 6
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >>
> >
> >
> >
> > --
> > Best regards, Данила Жукоцкий, system administrator, ZAO "Роснефтегазмаш"
> >
> >
> >
> > --
> > Best regards, Данила Жукоцкий, system administrator, ZAO "Роснефтегазмаш"
> >
>
>
>
> --
> Best regards, Данила Жукоцкий, system administrator, ZAO "Роснефтегазмаш"

* Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
  2009-04-07 20:59             ` Grant Grundler
@ 2009-04-07 21:14               ` Grant Grundler
  0 siblings, 0 replies; 16+ messages in thread
From: Grant Grundler @ 2009-04-07 21:14 UTC (permalink / raw)
  To: Данила Жукоцкий
  Cc: akpm, linux-ide, bugme-daemon, x86

Forgot one other possibility...

2009/4/7 Grant Grundler <grundler@google.com>:
....
> My first impression: the bug is either that allocdma exists at all
> instead of the DMA API being used (see Documentation/DMA-API.txt), OR
> that the documentation for allocdma is missing warnings about "this
> could break your system" and does not clearly specify when it should
> be used.

or it's a bug in the allowdac implementation.

BTW, this whole thread probably needs to be re-posted to either
linux-pci or linux-kernel. I don't think linux-ide is the right place
to resolve this.

grant

end of thread, other threads:[~2009-04-07 21:14 UTC | newest]

Thread overview: 16+ messages
     [not found] <bug-13001-10286@http.bugzilla.kernel.org/>
2009-04-04 18:45 ` [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space Andrew Morton
2009-04-05  9:39   ` Данила Жукоцкий
2009-04-07 16:14     ` Grant Grundler
     [not found]       ` <db2b43030904070946x495cf76et2fd4571f0ac96634@mail.gmail.com>
2009-04-07 16:52         ` Fwd: " Данила Жукоцкий
2009-04-07 20:23           ` Данила Жукоцкий
2009-04-07 20:59             ` Grant Grundler
2009-04-07 21:14               ` Grant Grundler
2009-04-06 22:48   ` Alan Cox
2009-04-07  1:49     ` FUJITA Tomonori
2009-04-07  5:05       ` Данила Жукоцкий
2009-04-07  8:17         ` Данила Жукоцкий
2009-04-07  8:38         ` FUJITA Tomonori
2009-04-07  8:43           ` FUJITA Tomonori
2009-04-07  8:52             ` Данила Жукоцкий
2009-04-07 11:08               ` Данила Жукоцкий
2009-04-07 14:21                 ` Данила Жукоцкий
