linux-ide.vger.kernel.org archive mirror
* Any known issues to cause data miscompares on sg read/write on 2.6.15.4 and 2.6.18-rc2?
@ 2006-08-20  9:30 Fajun Chen
  2006-08-20 17:42 ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Fajun Chen @ 2006-08-20  9:30 UTC (permalink / raw)
  To: linux-scsi, linux-ide; +Cc: dougg, jgarzik, htejun, alan, akpm, rmk

Hi Folks,

I use ATA pass-through via the sg ioctl interface for data read/write.
Two kernels were tested:
Linux 2.6.15.4 with Jeff Garzik's libata patch (pata support)
Linux 2.6.18-rc2 with Jeff Garzik's git libata patch (new EH, hotplug,
pata support)
Hardware: ARM IOP80321 with PCI-X
Host adapters: sata Sil3124 and pata Sil680
Test algorithm: random write-read-compare (same range)

I've tried sg mmap, direct, and indirect IO on both 2.6.18-rc2 and
2.6.15.4; none of the combinations survived an overnight data-compare
test.  I also tried changing the cache policy to write-through on
2.6.15.4, with no luck either.  Several different symptoms were
observed among the failures:
1. A few bytes in a data pattern were not written correctly to the
disc (low data miscompare rate)
2. Pretty much none of the data was written correctly to the disc
(high data miscompare rate)
3. Data was written to the disc correctly but miscompared when read
back. What's weird is that when the read buffer was printed out
right after the miscompare, it contained the correct data!
Sg write failures (symptoms #1 and #2) are more typical than read
failures (symptom #3).  This problem has been observed on many
different test machines with both pata and sata drives.
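
The symptoms listed above can be told apart mechanically. What follows
is a minimal sketch, not the original test application: it assumes the
test keeps the expected pattern plus two views of the read buffer (one
snapshot taken at compare time, one taken when the buffer is examined
again afterwards). The function names and the 50% cut-off for a "high"
miscompare rate are illustrative assumptions.

```python
def miscompare_offsets(expected, actual):
    """Return the byte offsets at which two equal-length buffers differ."""
    if len(expected) != len(actual):
        raise ValueError("buffers must be the same length")
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


def classify(expected, first_view, second_view, high_rate=0.5):
    """Classify one compare result.

    first_view  -- read buffer contents at compare time
    second_view -- the same buffer examined again afterwards
    high_rate   -- assumed threshold separating "low" from "high" rates
    """
    if not miscompare_offsets(expected, first_view):
        return "match"
    bad_later = miscompare_offsets(expected, second_view)
    if not bad_later:
        # Wrong at compare time, correct when looked at again (symptom 3).
        return "transient"
    rate = len(bad_later) / len(expected)
    return "persistent-high" if rate >= high_rate else "persistent-low"
```

A "transient" result means the buffer changed between the compare and
the second look, which points away from the disc contents and toward
how the buffer is being observed.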

I don't know of a good way to trace and isolate the problem yet. Since
a data miscompare can be caused by problems in several different
subsystems, I cc'ed some subsystem maintainers here.  Any information
or suggestions would be greatly appreciated.

Thanks,
Fajun


* Re: Any known issues to cause data miscompares on sg read/write on 2.6.15.4 and 2.6.18-rc2?
  2006-08-20  9:30 Any known issues to cause data miscompares on sg read/write on 2.6.15.4 and 2.6.18-rc2? Fajun Chen
@ 2006-08-20 17:42 ` Andrew Morton
  2006-08-20 23:19   ` Fajun Chen
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2006-08-20 17:42 UTC (permalink / raw)
  To: Fajun Chen; +Cc: linux-scsi, linux-ide, dougg, jgarzik, htejun, alan, rmk

On Sun, 20 Aug 2006 03:30:56 -0600
"Fajun Chen" <fajunchen@gmail.com> wrote:

> Hi Folks,
> 
> I use ATA pass through via sg ioctl interface for data read/write. Two
> kernels were tested:
> Linux 2.6.15.4 with Jeff Garzik's libata patch (pata support)
> Linux 2.6.18-rc2 with Jeff Garzik's git libata patch (new EH, hotplug,
> pata support)
> Hardware: ARM IOP80321 with PCI-X
> Host adapters: sata Sil3124 and pata Sil680
> Test algorithm: random write-read-compare (same range)
> 
> I've tried sg mmap, direct and indirect IO on both 2.6.18-rc2 and
> 2.6.15.4 release, none of the combinations survived data compare
> overnight test.  I also tried to change cache policy to write through
> on 2.6.15.4,  no luck either.  Several different symptoms were
> observed among the failures:
> 1. A few bytes in a data pattern were not written correctly to the
> disc (low data miscompare rate)
> 2. Pretty much none of the data were written correctly to the disc
> (high data miscompare rate)
> 3. Data were written to the disc correctly but miscompares when read
> it back. What's weird is that when the read buffer was printed out
> right after data miscompare , it contains the correct data!
> Sg write failures (symptom #1 and #2) are more typical than read
> failure (symptom #3).  This problem has been observed on many
> different test machines with both pata and sata drives.
> 
> I don't know of a good way to trace and isolate the problem yet. Since
> data miscompare issue can be caused by issues from different
> subsystems, I cc'ed some subsystem maintainers here.  Any information
> or suggestions are greatly appreciated.
> 

Have you tested the same hardware to the same extent without using the SG
interface?  Say, using read() and write()?  That'll help us determine
whether the problem is related to the sg stuff, or to something else, like
a hardware failure.
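
A baseline of the kind suggested here could look like the sketch below:
a random write-read-compare loop that uses only plain positional
write() and read() calls, with no sg involvement. The function name,
parameters, and defaults are assumptions for illustration. Note that
without O_DIRECT the read may be satisfied from the page cache rather
than the disc; O_DIRECT needs aligned buffers, which this sketch does
not set up.

```python
import os
import random


def write_read_compare(path, region_bytes, block_size=4096,
                       iterations=100, seed=0):
    """Write random blocks at random offsets, read each back, compare.

    Returns a list of (offset, miscompared_byte_count) tuples; an empty
    list means every block compared clean.
    """
    blocks = region_bytes // block_size
    if blocks < 1:
        raise ValueError("region must hold at least one block")
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    failures = []
    fd = os.open(path, os.O_RDWR)
    try:
        for _ in range(iterations):
            offset = rng.randrange(blocks) * block_size
            pattern = bytes(rng.getrandbits(8) for _ in range(block_size))
            if os.pwrite(fd, pattern, offset) != block_size:
                raise OSError("short write at offset %d" % offset)
            os.fsync(fd)  # push the write out before reading back
            readback = os.pread(fd, block_size, offset)
            if readback != pattern:
                bad = sum(1 for e, a in zip(pattern, readback) if e != a)
                bad += abs(len(pattern) - len(readback))
                failures.append((offset, bad))
    finally:
        os.close(fd)
    return failures
```

Pointing `path` at a scratch block device (destructive!) or at a file
on the drive under test gives a result that can be compared directly
with the sg-based runs on the same hardware.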


* Re: Any known issues to cause data miscompares on sg read/write on 2.6.15.4 and 2.6.18-rc2?
  2006-08-20 17:42 ` Andrew Morton
@ 2006-08-20 23:19   ` Fajun Chen
  2006-08-21  1:22     ` Douglas Gilbert
  0 siblings, 1 reply; 4+ messages in thread
From: Fajun Chen @ 2006-08-20 23:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-scsi, linux-ide, dougg, jgarzik, htejun, alan, rmk

Hi Andrew,

Part of the puzzle is that I have not found a baseline where it
works. I do need to do more controlled and comparative testing. One
item on my list is to test the same code on i386 hardware.  It's also
a good idea to test without using SG as you suggested. But what are my
alternatives for getting both PATA and SATA support without sg? The sd
interface, or sg read/write instead of sg ioctl (probably little code
difference here)?  For PATA, the IDE path may be worth testing, but it
would require quite extensive changes to my test application.

Thanks,
Fajun


* Re: Any known issues to cause data miscompares on sg read/write on 2.6.15.4 and 2.6.18-rc2?
  2006-08-20 23:19   ` Fajun Chen
@ 2006-08-21  1:22     ` Douglas Gilbert
  0 siblings, 0 replies; 4+ messages in thread
From: Douglas Gilbert @ 2006-08-21  1:22 UTC (permalink / raw)
  To: Fajun Chen
  Cc: Andrew Morton, linux-scsi, linux-ide, jgarzik, htejun, alan, rmk

Fajun Chen wrote:
> Hi Andrew,
> 
> The part of the puzzle is that I have not found a baseline where it
> works. I do need to do more controlled and comparative testing. One on
> my list is to test the same code on i386 hardware.  It's also a good
> idea to test without using SG as you suggested. But what are my
> alternatives to get both PATA and SATA support without sg? sd
> interface, sg read/write instead of sg ioctl (probably little code

Fajun,
I assume you meant "_sd_ read/write instead of sg ioctl". If
so, then FYI, the SG_IO ioctl is available on all block devices
that accept SCSI commands in the lk 2.6 series. There are
some small differences (see www.torque.net/sg/sg_io.html )
but I would think that your test programs using the sg
driver should work ok with the sd driver. Direct and
mmap()-ed IO are done differently.
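
As a rough illustration of the point above, the sketch below issues
SG_IO on whatever file descriptor it is given, so the same code can be
pointed at an sg node or an sd block device. The SG_IO request number,
SG_DXFER_FROM_DEV, and the struct layout follow <scsi/sg.h>; the helper
names, the choice of a plain SCSI READ(10) rather than the ATA
pass-through command used in the test application, and the defaults
are assumptions.

```python
import ctypes
import fcntl
import struct

SG_IO = 0x2285            # ioctl request number from <scsi/sg.h>
SG_DXFER_FROM_DEV = -3    # data flows from device to host


class SgIoHdr(ctypes.Structure):
    """Mirror of struct sg_io_hdr from <scsi/sg.h>."""
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]


def read10_cdb(lba, blocks):
    """Build a 10-byte READ(10) CDB: opcode 0x28, big-endian LBA and length."""
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)


def sg_read10(fd, lba, blocks, block_size=512, timeout_ms=20000):
    """Read `blocks` blocks at `lba` through SG_IO; returns the data bytes."""
    cdb = (ctypes.c_ubyte * 10).from_buffer_copy(read10_cdb(lba, blocks))
    data = ctypes.create_string_buffer(blocks * block_size)
    sense = ctypes.create_string_buffer(32)
    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_FROM_DEV
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.dxfer_len = len(data)
    hdr.dxferp = ctypes.addressof(data)
    hdr.cmdp = ctypes.addressof(cdb)
    hdr.sbp = ctypes.addressof(sense)
    hdr.timeout = timeout_ms
    fcntl.ioctl(fd, SG_IO, hdr)   # hdr is updated in place by the kernel
    if hdr.status or hdr.host_status or hdr.driver_status:
        raise OSError("SG_IO failed: status=%#x host=%#x driver=%#x"
                      % (hdr.status, hdr.host_status, hdr.driver_status))
    return data.raw[:len(data) - hdr.resid]
```

The only thing that changes between the two cases is the path passed to
os.open(); which device node names apply is system-specific and not
assumed here.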

> difference here).  For PATA, IDE path may be worthy of testing, but it
> would require quite extensive change to my test application.

From Russell King's earlier response it sounds like
it could be an architecture issue. Quite a few
folks use the sg driver for SCSI and SATA device
bashing.

Doug Gilbert
