* sata_mv dropping disks
@ 2006-05-18 21:31 Onis
2006-05-19 21:06 ` Mark Lord
0 siblings, 1 reply; 3+ messages in thread
From: Onis @ 2006-05-18 21:31 UTC (permalink / raw)
To: linux-ide
Hello,
I got warnings while rebuilding an md raid5 array. The controller is an 88SX5081 with
8 x Maxtor 300GB 7V300F0. I've run badblocks -w on all disks, and smartctl doesn't
report errors.
----
BUG: warning at drivers/scsi/sata_mv.c:1884/mv_channel_reset()
Call Trace: <IRQ> <ffffffff803a39ce>{mv_channel_reset+238}
<ffffffff803a4277>{mv_stop_and_reset+55}
<ffffffff803a45f7>{mv_interrupt+631}
<ffffffff8024e9fc>{handle_IRQ_event+44}
<ffffffff8024eae0>{__do_IRQ+176}
<ffffffff8020c5d2>{do_IRQ+66} <ffffffff80209c88>{ret_from_intr+0} <EOI>
<ffffffff80322773>{get_request_wait+35}
<ffffffff803b78bf>{xor_sse_5+191}
<ffffffff803b09fd>{compute_block+221}
<ffffffff80323b5f>{generic_make_request+495}
<ffffffff803b3520>{handle_stripe+7840} <ffffffff803b487d>{raid5d+349}
<ffffffff80241268>{prepare_to_wait+24}
<ffffffff80240dd0>{keventd_create_kthread+0}
<ffffffff803bc2ac>{md_thread+300}
<ffffffff802413e0>{autoremove_wake_function+0}
<ffffffff802413e0>{autoremove_wake_function+0}
<ffffffff803bc180>{md_thread+0}
<ffffffff80240d89>{kthread+217} <ffffffff8020a5da>{child_rip+8}
<ffffffff80240dd0>{keventd_create_kthread+0}
<ffffffff80240cb0>{kthread+0}
<ffffffff8020a5d2>{child_rip+0}
BUG: warning at drivers/scsi/sata_mv.c:1904/__msleep()
Call Trace: <IRQ> <ffffffff803a3f21>{__mv_phy_reset+241}
<ffffffff803a39da>{mv_channel_reset+250}
<ffffffff803a45f7>{mv_interrupt+631}
<ffffffff8024e9fc>{handle_IRQ_event+44}
<ffffffff8024eae0>{__do_IRQ+176}
<ffffffff8020c5d2>{do_IRQ+66} <ffffffff80209c88>{ret_from_intr+0} <EOI>
<ffffffff80322773>{get_request_wait+35}
<ffffffff803b78bf>{xor_sse_5+191}
<ffffffff803b09fd>{compute_block+221}
<ffffffff80323b5f>{generic_make_request+495}
<ffffffff803b3520>{handle_stripe+7840} <ffffffff803b487d>{raid5d+349}
<ffffffff80241268>{prepare_to_wait+24}
<ffffffff80240dd0>{keventd_create_kthread+0}
<ffffffff803bc2ac>{md_thread+300}
<ffffffff802413e0>{autoremove_wake_function+0}
<ffffffff802413e0>{autoremove_wake_function+0}
<ffffffff803bc180>{md_thread+0}
<ffffffff80240d89>{kthread+217} <ffffffff8020a5da>{child_rip+8}
<ffffffff80240dd0>{keventd_create_kthread+0}
<ffffffff80240cb0>{kthread+0}
<ffffffff8020a5d2>{child_rip+0}
ata4: translated ATA stat/err 0x50/01 to SCSI SK/ASC/ASCQ 0x3/13/00
ata4: status=0x50 { DriveReady SeekComplete }
ata4: error=0x01 { AddrMarkNotFound }
sata_mv: PCI ERROR; PCI IRQ cause=0x28000020
What does "PCI IRQ cause=0x28000020" mean?
A few minutes after that, the rebuild stopped:
----
sd 6:0:0:0: SCSI error: return code = 0x40000
end_request: I/O error, dev sdg, sector 403739536
sd 6:0:0:0: SCSI error: return code = 0x40000
end_request: I/O error, dev sdg, sector 403739544
sd 6:0:0:0: SCSI error: return code = 0x40000
end_request: I/O error, dev sdg, sector 403739552
md: md0: sync done.
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda[0] sdg[8](F) sdh[7] sdf[5] sde[4] sdd[3] sdc[2] sdb[1]
2051400960 blocks level 5, 128k chunk, algorithm 2 [8/7] [UUUUUU_U]
# hdparm -I /dev/sdg
/dev/sdg:
HDIO_DRIVE_CMD(identify) failed: Input/output error
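For readers unfamiliar with the mdstat output above: `[8/7]` means 8 configured devices with 7 active, and each `U` or `_` in `[UUUUUU_U]` marks a slot that is up or missing. A minimal sketch of reading that field, assuming the status line has the same layout as the one pasted here (the function name `parse_md_status` is made up for illustration):

```python
import re

def parse_md_status(line):
    """Return (configured, active, missing_slots) from an mdstat status line."""
    # Looks for the "[n/m] [UU_U]" pair at the end of the status line.
    m = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
    if m is None:
        raise ValueError("no [n/m] [UU_U] field found")
    configured, active = int(m.group(1)), int(m.group(2))
    # Slot indices are 0-based positions of '_' in the map.
    missing = [i for i, c in enumerate(m.group(3)) if c == "_"]
    return configured, active, missing

line = "2051400960 blocks level 5, 128k chunk, algorithm 2 [8/7] [UUUUUU_U]"
print(parse_md_status(line))  # (8, 7, [6])
```

Here slot index 6 (counting from 0) is the one reported as missing.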
Also, I'm getting lots of these on all ports at boot. smartctl also triggers
them:
----
ata3: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata3: status=0xd0 { Busy }
ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata1: status=0xd0 { Busy }
...
System
------
* Tyan Thunder S2882 Dual Opteron 240
* Marvell Technology Group Ltd. MV88SX5081 8-port SATA I PCI-X Controller
* 8 x Maxtor Maxline III 300GB SATA2
* Debian Sarge AMD64
* kernel 2.6.17-rc4-mm1
- Onis
* Re: sata_mv dropping disks
2006-05-18 21:31 sata_mv dropping disks Onis
@ 2006-05-19 21:06 ` Mark Lord
2006-05-19 22:25 ` Onis
0 siblings, 1 reply; 3+ messages in thread
From: Mark Lord @ 2006-05-19 21:06 UTC (permalink / raw)
To: Onis; +Cc: linux-ide
Onis wrote:
> Hello
>
> I got warnings while rebuilding an md raid5 array. The controller is an 88SX5081 with
> 8 x Maxtor 300GB 7V300F0. I've run badblocks -w on all disks, and smartctl doesn't
> report errors.
>
> ----
> BUG: warning at drivers/scsi/sata_mv.c:1884/mv_channel_reset()
>
> Call Trace: <IRQ> <ffffffff803a39ce>{mv_channel_reset+238}
> <ffffffff803a4277>{mv_stop_and_reset+55}
> <ffffffff803a45f7>{mv_interrupt+631}
> <ffffffff8024e9fc>{handle_IRQ_event+44}
> <ffffffff8024eae0>{__do_IRQ+176}
...
I'm not sure what the complaint is about there.
I see this on line 1884: mdelay(1);
But maybe the 2.6.17-rc4-mm1 version is different from
the 2.6.17-rc4-git2-libata1 that I have handy right now. (?)
> BUG: warning at drivers/scsi/sata_mv.c:1904/__msleep()
Similarly, on that line I see: mdelay(20);
Is there something different about mdelay() in -mm now?
..
> What does "PCI IRQ cause=0x28000020" mean?
"MWrPerr: SErr# asserted upon a PErr# response to write data by the PCI master"
In other words, a PCI bus parity error was detected.
Noisy bus, or buggy hardware.
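As a side note, the cause value is a bitmask, so it can help to list which bits are set before looking them up. The sketch below only shows bit positions; which bit corresponds to the MWrPerr description is not stated in this thread and would have to be checked against the chip documentation or the driver source.

```python
def set_bits(value):
    """Return the positions of the bits set in a 32-bit register value."""
    return [bit for bit in range(32) if value & (1 << bit)]

cause = 0x28000020  # value from the "PCI IRQ cause" log line
bits = set_bits(cause)
print(bits)                          # [5, 27, 29]
print([hex(1 << b) for b in bits])   # ['0x20', '0x8000000', '0x20000000']
```

So three separate cause bits were latched when the error was logged.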
> ata4: translated ATA stat/err 0x50/01 to SCSI SK/ASC/ASCQ 0x3/13/00
> ata4: status=0x50 { DriveReady SeekComplete }
> ata4: error=0x01 { AddrMarkNotFound }
That is wrong (bug). I *think* this may be fixed by the sata_mv
patch series I just posted today. The response should be to reset
the bus (well, at least that's what it does now) and then retry
the operation, not fail it immediately.
..
> Also, I'm getting lots of these on all ports at boot. smartctl also triggers
> them:
> ----
> ata3: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
> ata3: status=0xd0 { Busy }
> ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
> ata1: status=0xd0 { Busy }
> ...
That's due to a Marvell chip bug. A workaround for that got posted in
my patch series today.
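For anyone decoding the status bytes in these log lines by hand, here is a small sketch using the classic ATA status register bit names (the table and the function name `decode_status` are written out for illustration, not copied from the driver):

```python
# Classic ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # busy
    (0x40, "DRDY"),  # drive ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]

def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]

print(decode_status(0x50))  # ['DRDY', 'DSC']
print(decode_status(0xD0))  # ['BSY', 'DRDY', 'DSC']
```

Note that 0x50 has the ERR bit clear, and that when BSY is set (as in 0xd0) the other status bits are not meaningful, which matches the log printing only "Busy" for that value.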
Cheers
* Re: sata_mv dropping disks
2006-05-19 21:06 ` Mark Lord
@ 2006-05-19 22:25 ` Onis
0 siblings, 0 replies; 3+ messages in thread
From: Onis @ 2006-05-19 22:25 UTC (permalink / raw)
To: Mark Lord; +Cc: linux-ide
Mark Lord wrote:
> >BUG: warning at drivers/scsi/sata_mv.c:1884/mv_channel_reset()
> ...
> >What does "PCI IRQ cause=0x28000020" mean?
>
> "MWrPerr: SErr# asserted upon a PErr# response to write data by the PCI
> master"
>
> In other words, a PCI bus parity error was detected.
> Noisy bus, or buggy hardware.
Yes, that was fixed by relaxing the bus speed from 133 to 66 MHz; ignore it. My bad.
> > ata4: translated ATA stat/err 0x50/01 to SCSI SK/ASC/ASCQ 0x3/13/00
> > ata4: status=0x50 { DriveReady SeekComplete }
> > ata4: error=0x01 { AddrMarkNotFound }
>
> That is wrong (bug). I *think* this may be fixed by the sata_mv
> patch series I just posted today. The response should be to reset
> the bus (well, at least that's what it does now) and then retry
> the operation, not fail it immediately.
I think this was also related to bus speed. I hadn't seen this error before.
> >Also, I'm getting lots of these on all ports at boot. smartctl also
> >triggers them:
> >----
> >ata3: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
> >ata3: status=0xd0 { Busy }
> >ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
> >ata1: status=0xd0 { Busy }
> >...
>
> That's due to a Marvell chip bug. A workaround for that got posted in
> my patch series today.
Thanks a lot for the patch, Mark! I grabbed it immediately and applied it against
2.6.17-rc4. Is that okay?
I'm now running the rebuild with a 6081 controller. Everything seems great. No
ata busy warnings or anything.
> Cheers
Cheers!