public inbox for linux-scsi@vger.kernel.org
* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
       [not found] <200610021508.k92F8bmq011159@fire-2.osdl.org>
@ 2006-10-02 17:57 ` Andrew Morton
  2006-10-02 18:06   ` adam radford
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2006-10-02 17:57 UTC
  To: adam radford, James Bottomley; +Cc: bugme-daemon, linux-scsi

On Mon, 2 Oct 2006 08:08:37 -0700
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7246
> 
>            Summary: 3w-xxxx, IOMMU and >1go RAM
>     Kernel Version: 2.6.18
>             Status: NEW
>           Severity: normal
>              Owner: scsi_drivers-other@kernel-bugs.osdl.org
>          Submitter: aarnoud@agematis.com
> 
> 
> Most recent kernel where this bug did not occur: 2.6.15-1
> Distribution: Debian Testing
> Hardware Environment: SuperMicro X6DVL-EG2, 4 GB RAM, 3ware 8006-2LP, 2xHitachi
> 2x80 GB
> Software Environment:
> Problem Description: Data corruption with kernels newer than 2.6.15-1.
> 
> Steps to reproduce: tested with most 2.6.16, 2.6.17 and 2.6.18 kernels; all
> show the same corruption.
> 
> Solved the problem by leaving only 1 GB of RAM and setting IOMMU=off at boot
> time; with 4 GB of RAM the kernel hangs, saying:
> 
> 3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
> nommu_map_sg: overflow 2053d9000+4096 of device mask ffffffff
> 3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
> nommu_map_sg: overflow 2053d9000+4096 of device mask ffffffff
> 3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
> nommu_map_sg: overflow 2053d9000+4096 of device mask ffffffff
> 
> and hangs badly.
> 

James, Adam: I recall that we had a scsi driver recently which had problems
similar to this, but I think it was with addresses over 4G, not over 1G. 
It also might not have been the 3ware driver.  Do you recall?

Alexandre, can you please send the full dmesg output for that machine?

Thanks.


* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
  2006-10-02 17:57 ` [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM Andrew Morton
@ 2006-10-02 18:06   ` adam radford
  2006-10-02 18:17     ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: adam radford @ 2006-10-02 18:06 UTC
  To: Andrew Morton; +Cc: James Bottomley, bugme-daemon, linux-scsi, ak

This driver is for the older 3ware controllers.  They can only DMA to 32-bit
addresses, so I call pci_set_dma_mask(pdev, DMA_32BIT_MASK).  When you
have >= 4GB of RAM, you should use the IOMMU.  The driver has a large
default queue depth, so it can use up a lot of mappings.  Perhaps the IOMMU is
turned off in the BIOS, or the aperture isn't large enough for the
mappings?
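
For illustration, the overflow message in the report can be checked by hand
against a 32-bit mask.  The following is a minimal userspace sketch, not the
kernel's code; dma_range_fits() is a hypothetical helper name, and the values
to try are the address, length and mask from the log line above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true if every byte of
 * [addr, addr + size) is addressable under the given DMA mask.
 * Written to avoid overflow when addr + size would wrap.
 */
static bool dma_range_fits(uint64_t addr, uint64_t size, uint64_t mask)
{
    if (size == 0 || addr > mask)
        return false;
    /* The last byte is addr + size - 1; it must not exceed the mask. */
    return size - 1 <= mask - addr;
}
```

With the values from the log (0x2053d9000, 4096, 0xffffffff) the helper
returns false: the page lies above 4 GB, so a device limited to 32-bit
addresses cannot reach it without an IOMMU remapping or a bounce buffer.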

-Adam


* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
  2006-10-02 18:06   ` adam radford
@ 2006-10-02 18:17     ` Andi Kleen
  2006-10-02 18:24       ` James Bottomley
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-10-02 18:17 UTC
  To: adam radford; +Cc: Andrew Morton, James Bottomley, bugme-daemon, linux-scsi


> > > 3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
> > > nommu_map_sg: overflow 2053d9000+4096 of device mask ffffffff
> > > 3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
> > > nommu_map_sg: overflow 2053d9000+4096 of device mask ffffffff
> > > 3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
> > > nommu_map_sg: overflow 2053d9000+4096 of device mask ffffffff

nommu_* means he didn't compile in the IOMMU code.

Operator error -> invalid.

-Andi


* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
  2006-10-02 18:17     ` Andi Kleen
@ 2006-10-02 18:24       ` James Bottomley
  2006-10-02 18:35         ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: James Bottomley @ 2006-10-02 18:24 UTC
  To: Andi Kleen; +Cc: adam radford, Andrew Morton, bugme-daemon, linux-scsi

On Mon, 2006-10-02 at 20:17 +0200, Andi Kleen wrote:
> nommu_* means he didn't compile in the IOMMU code.
> 
> Operator error -> invalid.

But how did we get sg segments over the mask?  The block layer should
have bounced them, I think.

James




* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
  2006-10-02 18:24       ` James Bottomley
@ 2006-10-02 18:35         ` Andi Kleen
  2006-10-02 18:58           ` James Bottomley
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-10-02 18:35 UTC
  To: James Bottomley; +Cc: adam radford, Andrew Morton, bugme-daemon, linux-scsi

On Monday 02 October 2006 20:24, James Bottomley wrote:
> On Mon, 2006-10-02 at 20:17 +0200, Andi Kleen wrote:
> > nommu_* means he didn't compile in the IOMMU code.
> > 
> > Operator error -> invalid.
> 
> But how did we get sg segments over the mask?  The block layer should
> have bounced them, I think.

Yes, but it can't when the bounce code is not compiled in.
In this case it just printks and you saw those printks.

-Andi



* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
  2006-10-02 18:35         ` Andi Kleen
@ 2006-10-02 18:58           ` James Bottomley
  2006-10-02 19:28             ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: James Bottomley @ 2006-10-02 18:58 UTC
  To: Andi Kleen; +Cc: adam radford, Andrew Morton, bugme-daemon, linux-scsi

On Mon, 2006-10-02 at 20:35 +0200, Andi Kleen wrote:
> Yes, but it can't when the bounce code is not compiled in.
> In this case it just printks and you saw those printks.

That's this bit of subversion in ll_rw_blk.c:blk_queue_bounce_limit()

#if BITS_PER_LONG == 64
	/* Assume anything <= 4GB can be handled by IOMMU.
	   Actually some IOMMUs can handle everything, but I don't
	   know of a way to test this here. */
	if (bounce_pfn < (min_t(u64,0xffffffff,BLK_BOUNCE_HIGH) >> PAGE_SHIFT))
		dma = 1;
	q->bounce_pfn = max_low_pfn;
#else

?

That really looks wrong ... it will fail, as we've seen for 64 bit
platforms with no iommu.  We should init the isa pool in that case.
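
To make the concern concrete, here is a simplified userspace model of the
64-bit branch quoted above.  It is a sketch under assumptions, not the
kernel's implementation: the struct and function names are hypothetical, a
4 KB page size is assumed, BLK_BOUNCE_HIGH is modelled as
max_low_pfn << PAGE_SHIFT, and the bounce test is reduced to a single pfn
comparison.  Code that follows the excerpt in the real function is not
modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KB pages */

/* Hypothetical stand-in for the fields the excerpt touches. */
struct model_queue {
    uint64_t bounce_pfn; /* pages above this pfn would be bounced */
    int dma;             /* 1 if the excerpt would request the ISA pool */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Models only the BITS_PER_LONG == 64 branch shown in the excerpt. */
static void model_bounce_limit_64bit(struct model_queue *q,
                                     uint64_t dma_addr,
                                     uint64_t max_low_pfn)
{
    uint64_t bounce_pfn = dma_addr >> MODEL_PAGE_SHIFT;
    /* Assumption: BLK_BOUNCE_HIGH corresponds to the top of low memory. */
    uint64_t bounce_high = max_low_pfn << MODEL_PAGE_SHIFT;

    q->dma = 0;
    /* Only limits strictly below the 4 GB page frame select the ISA pool. */
    if (bounce_pfn < (min_u64(0xffffffffULL, bounce_high) >> MODEL_PAGE_SHIFT))
        q->dma = 1;
    /* Everything else is assumed to be handled by an IOMMU. */
    q->bounce_pfn = max_low_pfn;
}

/* Simplified bounce test: a page is bounced only above the queue limit. */
static bool model_page_would_bounce(const struct model_queue *q,
                                    uint64_t phys_addr)
{
    return (phys_addr >> MODEL_PAGE_SHIFT) > q->bounce_pfn;
}
```

For a device limit of 0xffffffff on a machine whose low memory extends past
4 GB, the model leaves dma at 0 and sets the bounce limit to max_low_pfn, so a
page at 0x2053d9000 is passed to the driver unbounced.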

James




* Re: [Bugme-new] [Bug 7246] New: 3w-xxxx, IOMMU and >1go RAM
  2006-10-02 18:58           ` James Bottomley
@ 2006-10-02 19:28             ` Andi Kleen
  0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2006-10-02 19:28 UTC
  To: James Bottomley; +Cc: adam radford, Andrew Morton, bugme-daemon, linux-scsi


> That really looks wrong ... it will fail, as we've seen for 64 bit
> platforms with no iommu.

That is just an operator error.  It's like compiling your kernel
without the driver for your root file system. Yes, CONFIG_* gives
you plenty of rope ...

> We should init the isa pool in that case. 

No.

-Andi
