* st corruption
From: Geert Uytterhoeven @ 2001-03-22 19:41 UTC
To: Linux/PPC Development
(cf. my posting on linux-kernel)
I'm seeing data corruption when writing to tape. Not when reading, not when
copying between disks.
The corruption affects 32 bytes on a 32-byte boundary. The corrupted data are
always a copy of the data exactly 10240 bytes before. Note that 32 bytes is the
cache line size of a 604e, while 10240 is the default block size for tar.
Perhaps a missing sync before PCI busmastering?
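The pattern described above can be checked mechanically once both the original file and the copy restored from tape are on disk. The following is a minimal Python sketch, not something posted in the thread: the function name and the two constants are made up for illustration, and the values 32 and 10240 are taken from the report. Because a tar header is 512 bytes (a multiple of 32), offsets inside the file keep the same 32-byte alignment they have in the tar stream.

```python
CACHE_LINE = 32      # 604e cache line size, as stated in the report
TAR_BLOCK = 10240    # default tar blocking (20 x 512 bytes), as stated in the report


def find_stale_lines(original, restored, line=CACHE_LINE, lag=TAR_BLOCK):
    """Return (offset, is_stale_copy) for every aligned line that differs.

    is_stale_copy is True when the corrupted line is identical to the
    original data found exactly `lag` bytes earlier.
    """
    hits = []
    for off in range(0, min(len(original), len(restored)), line):
        good = original[off:off + line]
        bad = restored[off:off + line]
        if good != bad:
            # Compare the damaged line with what the original held `lag` bytes back.
            stale = off >= lag and bad == original[off - lag:off - lag + line]
            hits.append((off, stale))
    return hits
```

With both files read into memory as bytes, every tuple whose second element is True matches the reported symptom; a False entry would point at a different kind of damage.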
My hardware: CHRP LongTrail, HP C1536 DDS1 tape drive connected to Sym53c875.
The problem happens with 2.4.3-pre4, but also with the good old
2.4.0-test1-ac10. This means all backups I have may be corrupted :-(
Anybody out there with a SCSI tape drive who's willing to do some tests?
Someone already tried with a Pentium and saw no corruption, so it may be a
PPC-specific problem. Just create some large files, make md5sums, tar them to tape,
untar them from tape, and verify the md5sums. I see approx. 7 blocks of
corrupted data for 256 MB of data.
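For anyone who prefers to script the round trip, here is a hedged Python sketch of the same procedure. It is an illustration only and has not been run against a real tape drive: the function names are hypothetical, the device path is whatever the caller passes in, and it relies on the standard tarfile module's stream modes ("w|" and "r|"), whose default buffer size is 10240 bytes. It also assumes the device rewinds when it is closed, so the second open starts reading at the beginning of the archive.

```python
import hashlib
import os
import tarfile


def md5_of(fileobj, chunk=1 << 20):
    """Return the hex MD5 digest of everything readable from fileobj."""
    digest = hashlib.md5()
    for block in iter(lambda: fileobj.read(chunk), b""):
        digest.update(block)
    return digest.hexdigest()


def tape_round_trip(src_path, device):
    """Write src_path to device as a tar stream, read it back, return both MD5s."""
    with open(src_path, "rb") as fh:
        before = md5_of(fh)

    # Stream mode writes sequentially, which is what a tape device needs.
    with tarfile.open(device, "w|") as tar:
        tar.add(src_path, arcname=os.path.basename(src_path))

    after = None
    with tarfile.open(device, "r|") as tar:
        for member in tar:
            if member.isfile():
                # In stream mode the member must be read while iterating.
                after = md5_of(tar.extractfile(member))
    return before, after
```

A mismatch between the two digests reproduces the problem. For scale: 256 MB is about 26,214 blocks of 10240 bytes, so 7 corrupted blocks is roughly one block in 3,700.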
Many thanks in advance!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: st corruption
From: Tony Mantler @ 2001-03-22 23:14 UTC
To: Geert Uytterhoeven, Linux/PPC Development
At 1:41 PM -0600 3/22/2001, Geert Uytterhoeven wrote:
[...]
>Just create some large files, make md5sums, tar them to tape,
>untar them from tape, and verify the md5sums. I see approx. 7 blocks of
>corrupted data for 256 MB of data.
merida:/home/nicoya# modprobe mesh
merida:/home/nicoya# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: QUANTUM Model: FIREBALL ST4300S Rev: 0F0D
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 03 Lun: 00
Vendor: MATSHITA Model: CD-ROM CR-8012 Rev: 1.0f
Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
Vendor: ARCHIVE Model: Python 25501-XXX Rev: 2.96
Type: Sequential-Access ANSI SCSI revision: 02
merida:/home/nicoya# mt status
drive type = Generic SCSI-2 tape
drive status = 318767616
sense key error = 0
residue count = 0
file number = 0
block number = 0
Tape block size 512 bytes. Density code 0x13 (DDS (61000 bpi)).
Soft error count since last status=0
General status bits on (41010000):
BOT ONLINE IM_REP_EN
merida:/home/nicoya# dd if=/dev/cdrom of=testfile bs=1024k count=256
256+0 records in
256+0 records out
merida:/home/nicoya# md5sum testfile
118c94df7aae2df0fb26dce3b13312f9 testfile
merida:/home/nicoya# tar -c testfile >/dev/st0
merida:/home/nicoya# mv testfile testfile.1
merida:/home/nicoya# tar -x </dev/st0
merida:/home/nicoya# md5sum testfile
118c94df7aae2df0fb26dce3b13312f9 testfile
merida:/home/nicoya# uname -a
Linux merida 2.4.1 #1 SMP Mon Feb 5 17:32:52 CST 2001 ppc unknown
This is with my 9600/200mp. (For the curious, the CD in the drive was my
copy of the Marathon trilogy)
Cheers - Tony 'Nicoya' Mantler :)
--
Tony "Nicoya" Mantler - Renaissance Nerd Extraordinaire - nicoya@apia.dhs.org
Winnipeg, Manitoba, Canada -- http://nicoya.feline.pp.se/
* Re: st corruption
From: Geert Uytterhoeven @ 2001-03-23 7:20 UTC
To: Tony Mantler; +Cc: Linux/PPC Development
On Thu, 22 Mar 2001, Tony Mantler wrote:
> At 1:41 PM -0600 3/22/2001, Geert Uytterhoeven wrote:
> [...]
> >Just create some large files, make md5sums, tar them to tape,
> >untar them from tape, and verify the md5sums. I see approx. 7 blocks of
> >corrupted data for 256 MB of data.
>
> merida:/home/nicoya# modprobe mesh
> merida:/home/nicoya# cat /proc/scsi/scsi
> Attached devices:
> Host: scsi0 Channel: 00 Id: 00 Lun: 00
> Vendor: QUANTUM Model: FIREBALL ST4300S Rev: 0F0D
> Type: Direct-Access ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 03 Lun: 00
> Vendor: MATSHITA Model: CD-ROM CR-8012 Rev: 1.0f
> Type: CD-ROM ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 06 Lun: 00
> Vendor: ARCHIVE Model: Python 25501-XXX Rev: 2.96
> Type: Sequential-Access ANSI SCSI revision: 02
Ugh... I don't dare to connect my DDS to the MESH. Before I had the '875, I did
it, but from time to time I got lost arbitrations corrupting data.
> 118c94df7aae2df0fb26dce3b13312f9 testfile
> merida:/home/nicoya# uname -a
> Linux merida 2.4.1 #1 SMP Mon Feb 5 17:32:52 CST 2001 ppc unknown
Hmmm... Perhaps I should retry on the MESH, just to see whether it's a MESH or
Sym53c875 problem.
Thanks for testing!
Gr{oetje,eeting}s,
Geert
* Re: st corruption
From: Tony Mantler @ 2001-03-23 13:22 UTC
To: Geert Uytterhoeven; +Cc: Linux/PPC Development
At 1:20 AM -0600 3/23/2001, Geert Uytterhoeven wrote:
>On Thu, 22 Mar 2001, Tony Mantler wrote:
>> At 1:41 PM -0600 3/22/2001, Geert Uytterhoeven wrote:
>> [...]
>> >Just create some large files, make md5sums, tar them to tape,
>> >untar them from tape, and verify the md5sums. I see approx. 7 blocks of
>> >corrupted data for 256 MB of data.
>>
>> merida:/home/nicoya# modprobe mesh
>> merida:/home/nicoya# cat /proc/scsi/scsi
>> Attached devices:
>> Host: scsi0 Channel: 00 Id: 00 Lun: 00
>> Vendor: QUANTUM Model: FIREBALL ST4300S Rev: 0F0D
>> Type: Direct-Access ANSI SCSI revision: 02
>> Host: scsi0 Channel: 00 Id: 03 Lun: 00
>> Vendor: MATSHITA Model: CD-ROM CR-8012 Rev: 1.0f
>> Type: CD-ROM ANSI SCSI revision: 02
>> Host: scsi0 Channel: 00 Id: 06 Lun: 00
>> Vendor: ARCHIVE Model: Python 25501-XXX Rev: 2.96
>> Type: Sequential-Access ANSI SCSI revision: 02
>
>Ugh... I don't dare to connect my DDS to the MESH. Before I had the '875, I did
>it, but from time to time I got lost arbitrations corrupting data.
Well, the tape drive itself is actually the assembled parts of 2 broken
tape drives, so I wouldn't exactly trust it with my life anyways. ;)
It's really just sitting in my 9600 because I didn't have anywhere else
interesting to stick it.
>> 118c94df7aae2df0fb26dce3b13312f9 testfile
>> merida:/home/nicoya# uname -a
>> Linux merida 2.4.1 #1 SMP Mon Feb 5 17:32:52 CST 2001 ppc unknown
>
>Hmmm... Perhaps I should retry on the MESH, just to see whether it's a MESH or
>Sym53c875 problem.
Finding someone with Sym53c875 SCSI in a non-pmac non-x86 might help too.
It could also be that my SMP machine has a different cache flushing
profile, since both tar and the st driver would've likely been bouncing
from CPU to CPU a bit. (Am I the only one who would like to see stronger
CPU binding in SMP linux? Especially on platforms with larger caches)
Cheers - Tony 'Nicoya' Mantler :)
* Re: st corruption
From: Guillaume Laures @ 2001-03-25 15:08 UTC
To: Geert Uytterhoeven; +Cc: Tony Mantler, Linux/PPC Development
Time to try out the DDS-2 that came with my ANS 700 :-)
On 22 Mar 2001 17:14:01 -0600, Tony Mantler wrote:
>
> At 1:41 PM -0600 3/22/2001, Geert Uytterhoeven wrote:
> [...]
> >Just create some large files, make md5sums, tar them to tape,
> >untar them from tape, and verify the md5sums. I see approx. 7 blocks of
> >corrupted data for 256 MB of data.
>
> merida:/home/nicoya# cat /proc/scsi/scsi
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: MATSHITA Model: CD-ROM CR-8005A Rev: 4.0i
Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Vendor: HP Model: C1533A Rev: 9503
Type: Sequential-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 02 Lun: 00
Vendor: SEAGATE Model: ST15150W_APL Rev: 9503
Type: Direct-Access ANSI SCSI revision: 02
scsi0 is :
sym53c8xx: at PCI bus 0, device 17, function 0
sym53c8xx: setting PCI_COMMAND_IO...
sym53c8xx: setting PCI_COMMAND_PARITY...(fix-up)
sym53c8xx: 53c825a detected
sym53c8xx: at PCI bus 0, device 18, function 0
sym53c8xx: setting PCI_COMMAND_IO...
sym53c8xx: setting PCI_COMMAND_PARITY...(fix-up)
sym53c8xx: 53c825a detected
sym53c825a-0: rev 0x11 on pci bus 0 device 17 function 0 irq 22
sym53c825a-0: ID 7, Fast-10, Parity Checking
sym53c825a-1: rev 0x11 on pci bus 0 device 18 function 0 irq 26
sym53c825a-1: ID 7, Fast-10, Parity Checking
scsi0 : sym53c8xx-1.7.1-20000726
scsi1 : sym53c8xx-1.7.1-20000726
scsi2 : 53C94
scsi : 3 hosts.
> merida:/home/nicoya# mt status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x13 (DDS (61000 bpi)).
Soft error count since last status=0
General status bits on (41010000):
BOT ONLINE IM_REP_EN
> merida:/home/nicoya# dd if=/dev/cdrom of=testfile bs=1024k count=256
256+0 records in
256+0 records out
> merida:/home/nicoya# md5sum testfile
13b37214355ea84d906a54bb14c1c0be testfile
> merida:/home/nicoya# tar -c testfile >/dev/st0
> merida:/home/nicoya# mv testfile testfile.1
> merida:/home/nicoya# tar -x </dev/st0
> merida:/home/nicoya# md5sum testfile
13b37214355ea84d906a54bb14c1c0be testfile
uname -a :
Linux shiner.gom.net 2.2.18-snd-bkport #26 Wed Feb 28 23:19:57 CET 2001 ppc unknown
So no prob here either with close hardware.
Cheers
--
GoM
* Re: st corruption
From: Geert Uytterhoeven @ 2001-03-25 19:21 UTC
To: Guillaume Laures; +Cc: Tony Mantler, Linux/PPC Development
On 25 Mar 2001, Guillaume Laures wrote:
> > merida:/home/nicoya# md5sum testfile
>
> 13b37214355ea84d906a54bb14c1c0be testfile
>
> > merida:/home/nicoya# tar -c testfile >/dev/st0
> > merida:/home/nicoya# mv testfile testfile.1
> > merida:/home/nicoya# tar -x </dev/st0
> > merida:/home/nicoya# md5sum testfile
>
> 13b37214355ea84d906a54bb14c1c0be testfile
>
> uname -a :
> Linux shiner.gom.net 2.2.18-snd-bkport #26 Wed Feb 28 23:19:57 CET 2001
> ppc unknown
>
> So no prob here either with close hardware.
Thanks! Do you also have a 2.4.x kernel around?
Or perhaps I should try 2.2.18... No idea whether 2.2.18 works on LongTrail,
though. I switched to 2.3.x a long time ago...
Gr{oetje,eeting}s,
Geert
* Re: st corruption
From: Guillaume Laures @ 2001-03-25 20:08 UTC
To: Geert Uytterhoeven; +Cc: Tony Mantler, Linux/PPC Development
On 25 Mar 2001 21:21:20 +0200, Geert Uytterhoeven wrote:
> On 25 Mar 2001, Guillaume Laures wrote:
> > > merida:/home/nicoya# md5sum testfile
> >
> > 13b37214355ea84d906a54bb14c1c0be testfile
> >
> > > merida:/home/nicoya# tar -c testfile >/dev/st0
> > > merida:/home/nicoya# mv testfile testfile.1
> > > merida:/home/nicoya# tar -x </dev/st0
> > > merida:/home/nicoya# md5sum testfile
> >
> > 13b37214355ea84d906a54bb14c1c0be testfile
> >
> > uname -a :
> > Linux shiner.gom.net 2.2.18-snd-bkport #26 Wed Feb 28 23:19:57 CET 2001
> > ppc unknown
> >
> > So no prob here either with close hardware.
>
> Thanks! Do you also have a 2.4.x kernel around?
Not yet for this machine, but soon. I'll keep you posted.
I may run the test on my G4 with an external enclosure (2.4.3-pre3, scsi =
sym53c1010-33-0 through sym53c8xx-1.7.3a-20010304).
Later,
--
GoM
* Re: st corruption
From: Geert Uytterhoeven @ 2001-03-27 18:26 UTC
To: Linux/PPC Development
Status update:
- When I connect my DDS1 to the MESH, I see no corruption (as long as I get
no `lost arbitration' messages from the MESH driver. I never get those with
the disk BTW). So the tape drive seems to be fine.
- I wanted to try different tape drives, but all retired DDS drives I found
at work seem to be in a non-functional state. I tried 3 of them, without
any luck.
- I wanted to try a 2.2.x kernel, but linuxppc_2_2 (2.2.19-pre3) just says
`illegal instruction' and returns me to the OF prompt.
My next steps:
- Try to understand the sym53c8xx and st drivers.
- Look for missing sync()s and cache flushes in PCI and SCSI busmastering
code.
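To show why a missing cache flush would fit the symptom, here is a toy Python model. It is purely illustrative and makes assumptions the thread does not confirm: that each 10240-byte tar block passes through the same memory buffer, that the CPU cache is write-back with 32-byte lines, and that the SCSI controller reads memory directly without seeing dirty cache lines. The function and parameter names are invented for this sketch and have nothing to do with the real sym53c8xx or st code.

```python
LINE = 32       # cache line size in bytes
BLOCK = 10240   # bytes handed to the device per transfer


def dma_write_blocks(payload, skip_flush):
    """Simulate busmaster transfers of payload, one BLOCK at a time.

    skip_flush is a set of (block_no, line_no) pairs whose dirty cache
    line is never written back to RAM before the device reads it.
    Returns the bytes the device would put on tape.
    """
    ram = bytearray(BLOCK)   # one buffer, reused for every block
    tape = bytearray()
    for block_no in range(len(payload) // BLOCK):
        new = payload[block_no * BLOCK:(block_no + 1) * BLOCK]
        for line_no in range(BLOCK // LINE):
            if (block_no, line_no) in skip_flush:
                continue     # line stays in cache; RAM keeps the old bytes
            lo = line_no * LINE
            ram[lo:lo + LINE] = new[lo:lo + LINE]
        tape += ram          # the device reads RAM, bypassing the cache
    return bytes(tape)
```

In this model a single skipped line in block N leaves block N-1's bytes at that offset, so the tape receives 32 aligned bytes that duplicate the data 10240 bytes earlier, which is the pattern reported at the start of the thread. It says nothing about where such a flush might actually be missing.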
Anyone else with a clue?
Gr{oetje,eeting}s,
Geert
Thread overview:
2001-03-22 19:41 st corruption Geert Uytterhoeven
2001-03-22 23:14 ` Tony Mantler
2001-03-23 7:20 ` Geert Uytterhoeven
2001-03-23 13:22 ` Tony Mantler
2001-03-25 15:08 ` Guillaume Laures
2001-03-25 19:21 ` Geert Uytterhoeven
2001-03-25 20:08 ` Guillaume Laures
2001-03-27 18:26 ` Geert Uytterhoeven