netdev.vger.kernel.org archive mirror
* atl1 driver corrupting memory?
@ 2007-07-25 18:33 Chuck Ebbert
  2007-07-25 21:22 ` Chris Snook
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2007-07-25 18:33 UTC
  To: Jay Cliburn; +Cc: Netdev

I have a report of random errors when using the atl1 driver
with kernel 2.6.22.1. Could that be a problem fixed by the
recent changes to DMA setup in 2.6.23-rc?




* Re: atl1 driver corrupting memory?
  2007-07-25 18:33 atl1 driver corrupting memory? Chuck Ebbert
@ 2007-07-25 21:22 ` Chris Snook
  2007-07-25 21:31   ` Chuck Ebbert
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Snook @ 2007-07-25 21:22 UTC
  To: Chuck Ebbert; +Cc: Jay Cliburn, Netdev

Chuck Ebbert wrote:
> I have a report of random errors when using the atl1 driver
> with kernel 2.6.22.1. Could that be a problem fixed by the
> recent changes to DMA setup in 2.6.23-rc?

I hope so.  As far as we can tell the driver and the NIC itself are doing the 
right thing, and the pci layer or chipset is screwing up the 64-bit DMA.  This 
only manifests when physical memory addresses cross the 4 GB boundary, and as 
far as I'm aware atl1 is only used on desktop boards, so we don't have a lot of 
testers.  If someone wants to buy me and Jay more RAM so we can test it 
ourselves, I guess we wouldn't object :)

I favor disabling 64-bit DMA in atl1 until Atheros can track this down in the 
lab.  If we don't get confirmation that this bug is fixed by the DMA changes, I 
think we should revert to 32-bit DMA for 2.6.23.  Limiting ourselves to 32-bit 
DMA on desktop systems is a lot less bad than allowing arbitrary memory corruption.
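
The address arithmetic behind the two paragraphs above can be sketched as 
follows.  This is only an illustration, not atl1 or kernel code: the helper 
names are hypothetical, the mask constants merely borrow the names of the 
kernel's DMA mask macros, and the model assumes bus addresses equal physical 
addresses (no IOMMU remapping).

```python
FOUR_GIB = 1 << 32  # first physical address that does not fit in 32 bits

# Illustrative mask values; the names are borrowed, nothing here calls the kernel.
DMA_32BIT_MASK = (1 << 32) - 1
DMA_64BIT_MASK = (1 << 64) - 1


def reachable(addr: int, length: int, mask: int) -> bool:
    """Return True if every byte of [addr, addr + length) is addressable under mask."""
    if length <= 0:
        raise ValueError("length must be positive")
    return addr + length - 1 <= mask


def classify(addr: int, length: int) -> str:
    """Place a buffer relative to the 4 GiB boundary."""
    end = addr + length - 1  # address of the last byte in the buffer
    if end < FOUR_GIB:
        return "below"
    if addr >= FOUR_GIB:
        return "above"
    return "straddles"


if __name__ == "__main__":
    buf = 0x1_2000_0000  # hypothetical buffer placed at 4.5 GiB
    print(classify(buf, 2048))                   # above
    print(reachable(buf, 2048, DMA_32BIT_MASK))  # False
    print(reachable(buf, 2048, DMA_64BIT_MASK))  # True
```

Under a 32-bit mask the device is never handed a buffer in the "above" or 
"straddles" cases, which is why falling back to 32-bit DMA would sidestep the 
corruption regardless of where the underlying fault lies.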

	-- Chris


* Re: atl1 driver corrupting memory?
  2007-07-25 21:22 ` Chris Snook
@ 2007-07-25 21:31   ` Chuck Ebbert
  2007-07-25 21:35     ` Chris Snook
  2007-07-26  0:24     ` Jay Cliburn
  0 siblings, 2 replies; 5+ messages in thread
From: Chuck Ebbert @ 2007-07-25 21:31 UTC
  To: Chris Snook; +Cc: Jay Cliburn, Netdev

On 07/25/2007 05:22 PM, Chris Snook wrote:
> Chuck Ebbert wrote:
>> I have a report of random errors when using the atl1 driver
>> with kernel 2.6.22.1. Could that be a problem fixed by the
>> recent changes to DMA setup in 2.6.23-rc?
> 
> I hope so.  As far as we can tell the driver and the NIC itself are
> doing the right thing, and the pci layer or chipset is screwing up the
> 64-bit DMA.  This only manifests when physical memory addresses cross
> the 4 GB boundary, and as far as I'm aware atl1 is only used on desktop
> boards, so we don't have a lot of testers.  If someone wants to buy me
> and Jay more RAM so we can test it ourselves, I guess we wouldn't object :)
> 

Our reporter has 8GB of memory in an x86_64 machine.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=249511


> I favor disabling 64-bit DMA in atl1 until Atheros can track this down
> in the lab.  If we don't get confirmation that this bug is fixed by the
> DMA changes, I think we should revert to 32-bit DMA for 2.6.23. 
> Limiting ourselves to 32-bit DMA on desktop systems is a lot less bad
> than allowing arbitrary memory corruption.
> 

This is what was committed.

http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3f516c00d416bd39aab6cfb348b68919e295fe23
http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ef76e3e2505db01f7d4b537854f4a177220c26c8



* Re: atl1 driver corrupting memory?
  2007-07-25 21:31   ` Chuck Ebbert
@ 2007-07-25 21:35     ` Chris Snook
  2007-07-26  0:24     ` Jay Cliburn
  1 sibling, 0 replies; 5+ messages in thread
From: Chris Snook @ 2007-07-25 21:35 UTC
  To: Chuck Ebbert; +Cc: Jay Cliburn, Netdev, atl1-devel

Chuck Ebbert wrote:
> On 07/25/2007 05:22 PM, Chris Snook wrote:
>> Chuck Ebbert wrote:
>>> I have a report of random errors when using the atl1 driver
>>> with kernel 2.6.22.1. Could that be a problem fixed by the
>>> recent changes to DMA setup in 2.6.23-rc?
>> I hope so.  As far as we can tell the driver and the NIC itself are
>> doing the right thing, and the pci layer or chipset is screwing up the
>> 64-bit DMA.  This only manifests when physical memory addresses cross
>> the 4 GB boundary, and as far as I'm aware atl1 is only used on desktop
>> boards, so we don't have a lot of testers.  If someone wants to buy me
>> and Jay more RAM so we can test it ourselves, I guess we wouldn't object :)
>>
> 
> Our reporter has 8GB of memory in an x86_64 machine.
> 
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=249511
> 
> 
>> I favor disabling 64-bit DMA in atl1 until Atheros can track this down
>> in the lab.  If we don't get confirmation that this bug is fixed by the
>> DMA changes, I think we should revert to 32-bit DMA for 2.6.23. 
>> Limiting ourselves to 32-bit DMA on desktop systems is a lot less bad
>> than allowing arbitrary memory corruption.
>>
> 
> This is what was committed.
> 
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3f516c00d416bd39aab6cfb348b68919e295fe23
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ef76e3e2505db01f7d4b537854f4a177220c26c8

Oh, I thought you were referring to a problem reproduced *after* those changes, 
to be fixed by some generic DMA setup patch.  Has anyone reproduced the problem 
after those changes?

CCing atl1-devel to see if we can get some more testing...

	-- Chris


* Re: atl1 driver corrupting memory?
  2007-07-25 21:31   ` Chuck Ebbert
  2007-07-25 21:35     ` Chris Snook
@ 2007-07-26  0:24     ` Jay Cliburn
  1 sibling, 0 replies; 5+ messages in thread
From: Jay Cliburn @ 2007-07-26  0:24 UTC
  To: Chuck Ebbert; +Cc: Chris Snook, Netdev, jeff, Huang Xiong

On Wed, 25 Jul 2007 17:31:02 -0400
Chuck Ebbert <cebbert@redhat.com> wrote:

> This is what was committed.
> 
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3f516c00d416bd39aab6cfb348b68919e295fe23
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ef76e3e2505db01f7d4b537854f4a177220c26c8

I'm doubtful these patches will fix the highmem corruption problem
we've seen in the L1.  I actually extracted the changes in the
referenced commits from the vendor's current out-of-tree driver, and
unfortunately he was able to duplicate the problem in his lab using
that driver.

As a workaround, Chuck, your reporter can boot with mem=3900M until the
problem is resolved.
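
A quick check of why that limit helps, taking the 3900 figure as megabytes
(an assumption; the function name is hypothetical): capping usable memory
at 3900 MiB keeps every physical address below the 4 GiB boundary.

```python
MIB = 1 << 20
FOUR_GIB = 1 << 32


def headroom_below_4gib(mem_mib: int) -> int:
    """MiB between the memory limit and 4 GiB; negative if the limit is above it."""
    return (FOUR_GIB - mem_mib * MIB) // MIB


if __name__ == "__main__":
    print(headroom_below_4gib(3900))  # 196
    print(headroom_below_4gib(8192))  # -4096
```

With the limit in place no buffer can land above 4 GiB, at the cost of
leaving the rest of the reporter's 8 GB unused.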

I go on record with Chris:  we should apply the patch at
http://lkml.org/lkml/2007/6/25/293 until we get to the bottom of it.
The patch is in Jeff's queue, but I think he suspects a driver bug and
so far hasn't chosen to apply the patch.

Jeff, can we ask you to please reconsider?

Jay
