* Re: What's in linux-2.6-block.git for 2.6.24
  [not found] <20070921085711.GG2367@kernel.dk>
From: Torsten Kaiser @ 2007-09-23 13:19 UTC
To: Jens Axboe; +Cc: linux-kernel, akpm, linux-scsi, linux-ide

On 9/21/07, Jens Axboe <jens.axboe@oracle.com> wrote:
> SG chaining bits:
> - This is the bulk of the patchset. It consists of three major
>   components:
>
>   - sglist-core, which adds helpers for iterating sg lists and
>     switches the block layer and SCSI to use those. Should not
>     have any functional changes.
>   - sglist-drivers, which converts drivers to use the sg list
>     helpers. Again, should not contain functional changes.
>   - sglist-arch, which adds support to most architectures and
>     actually enables sg chaining.

Adding linux-ide and linux-scsi as CC like Andrew did with my last report.

I still have trouble with my Silicon Image, Inc. SiI 3132 Serial ATA
Raid II Controller as reported on 2.6.23-rc4-mm1, now also on the new
2.6.23-rc6-mm1.

I'm not 100% sure if this is caused by the sg chaining, but the patch
from http://lkml.org/lkml/2007/9/10/251 which touches that chaining
makes a difference, so it might be related.

First report: http://lkml.org/lkml/2007/9/1/92
With patch it fails fewer times: http://lkml.org/lkml/2007/9/14/107

To update the statistics:
prior to 2.6.23-rc4-mm1: no trouble with any drives on the SiI 3132.
2.6.23-rc4-mm1 without patch: 2 out of 2 bad.
back to 2.6.23-rc3-mm1: 18x good.
2.6.23-rc4-mm1 with patch: 2 out of 8 bad
after that second mail:
2.6.23-rc4-mm1 with patch: 1 out of 5 bad
2.6.23-rc6-mm1: 1 out of 2 bad
switching back to 2.6.23-rc3-mm1 to rule out the hardware:
2.6.23-rc3-mm1: 6x good

The error messages from the failed 2.6.23-rc6-mm1:
Sep 18 18:50:01 treogen [ 33.340000] md1: bitmap initialized from disk: read 10/10 pages, set 0 bits
Sep 18 18:50:01 treogen [ 33.340000] created bitmap (145 pages) for device md1
Sep 18 18:50:01 treogen [ 63.440000] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Sep 18 18:50:01 treogen [ 63.440000] ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
Sep 18 18:50:01 treogen [ 63.440000] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 18 18:50:01 treogen [ 63.440000] ata1.00: status: {DRDY }
Sep 18 18:50:01 treogen [ 63.440000] ata1: hard resetting link
Sep 18 18:50:01 treogen [ 65.740000] ata1: softreset failed (port not ready)
Sep 18 18:50:01 treogen [ 65.740000] ata1: reset failed (errno=-5), retrying in 8 secs
Sep 18 18:50:01 treogen [ 73.440000] ata1: hard resetting link
Sep 18 18:50:01 treogen [ 75.740000] ata1: softreset failed (port not ready)
Sep 18 18:50:01 treogen [ 75.740000] ata1: reset failed (errno=-5), retrying in 8 secs
Sep 18 18:50:01 treogen [ 83.440000] ata1: hard resetting link
Sep 18 18:50:01 treogen [ 85.740000] ata1: softreset failed (port not ready)
Sep 18 18:50:01 treogen [ 85.740000] ata1: reset failed (errno=-5), retrying in 33 secs
Sep 18 18:50:01 treogen [ 118.440000] ata1: limiting SATA link speed to 1.5 Gbps
Sep 18 18:50:01 treogen [ 118.440000] ata1: hard resetting link
Sep 18 18:50:01 treogen [ 120.740000] ata1: softreset failed (port not ready)
Sep 18 18:50:01 treogen [ 120.740000] ata1: reset failed, giving up
Sep 18 18:50:01 treogen [ 120.740000] ata1.00: disabled
Sep 18 18:50:01 treogen [ 120.740000] ata1: EH complete
Sep 18 18:50:01 treogen [ 120.740000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 18 18:50:01 treogen [ 120.740000] end_request: I/O error, dev sda, sector 625137161
Sep 18 18:50:01 treogen [ 120.740000] md: super_written gets error=-5, uptodate=0
Sep 18 18:50:01 treogen [ 120.740000] raid5: Disk failure on sda2, disabling device. Operation continuing on 2 devices

After that many more errors like this, only differing in the sector number:
Sep 18 18:50:01 treogen [ 120.810000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 18 18:50:01 treogen [ 120.810000] end_request: I/O error, dev sda, sector 19550919

Is any more information needed?

Torsten
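The `cmd` line in the log above can be tied to the later `end_request` error by decoding its bytes. The following is a minimal sketch, assuming the field order libata's error handler uses for this dump (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and assuming that command 0x61 is the NCQ write, which carries its sector count in the feature bytes and its tag in the upper bits of nsect. `decode_taskfile` is a hypothetical helper written for illustration, not a kernel or libata function.

```python
# Hypothetical helper for illustration only (not part of libata).
# Assumed field order of the dump:
#   command/feature:nsect:lbal:lbam:lbah/
#   hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device

def decode_taskfile(dump):
    """Split a libata 'cmd' dump and rebuild the 48-bit LBA from its bytes."""
    groups = [[int(b, 16) for b in part.split(":")] for part in dump.split("/")]
    (command,), low, high, (device,) = groups
    feature, nsect, lbal, lbam, lbah = low
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = high

    # LBA48: the "hob" (high order byte) registers hold bits 24..47.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)

    result = {"command": command, "device": device, "lba": lba}
    # Assumption: 0x60/0x61 are the NCQ read/write commands, which keep the
    # sector count in the feature registers and the tag in nsect bits 7:3.
    if command in (0x60, 0x61):
        result["sectors"] = hob_feature << 8 | feature
        result["tag"] = nsect >> 3
    return result


info = decode_taskfile("61/08:00:09:d6:42/00:00:25:00:00/40")
print(info["lba"], info["sectors"] * 512, info["tag"])
```

Under these assumptions the decoded LBA is 625137161, the same sector that `end_request` reports, and 8 sectors of 512 bytes match the `data 4096 out` in the log. That suggests the timed-out command is the 4 KiB write that md's `super_written` then reports as failed.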
* Re: What's in linux-2.6-block.git for 2.6.24
From: FUJITA Tomonori @ 2007-09-23 13:55 UTC
To: just.for.lkml
Cc: jens.axboe, linux-kernel, akpm, linux-scsi, linux-ide, fujita.tomonori

On Sun, 23 Sep 2007 15:19:13 +0200
"Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:

> On 9/21/07, Jens Axboe <jens.axboe@oracle.com> wrote:
> > SG chaining bits:
> > - This is the bulk of the patchset. It consists of three major
> >   components:
> >
> >   - sglist-core, which adds helpers for iterating sg lists and
> >     switches the block layer and SCSI to use those. Should not
> >     have any functional changes.
> >   - sglist-drivers, which converts drivers to use the sg list
> >     helpers. Again, should not contain functional changes.
> >   - sglist-arch, which adds support to most architectures and
> >     actually enables sg chaining.
>
> Adding linux-ide and linux-scsi as CC like Andrew did with my last report.
>
> I still have trouble with my Silicon Image, Inc. SiI 3132 Serial ATA
> Raid II Controller as reported on 2.6.23-rc4-mm1, now also on the new
> 2.6.23-rc6-mm1.
>
> I'm not 100% sure if this is caused by the sg chaining, but the patch
> from http://lkml.org/lkml/2007/9/10/251 which touches that chaining
> makes a difference, so it might be related.
>
> First report: http://lkml.org/lkml/2007/9/1/92
> With patch it fails fewer times: http://lkml.org/lkml/2007/9/14/107
>
> To update the statistics:
> prior to 2.6.23-rc4-mm1: no trouble with any drives on the SiI 3132.
> 2.6.23-rc4-mm1 without patch: 2 out of 2 bad.
> back to 2.6.23-rc3-mm1: 18x good.
> 2.6.23-rc4-mm1 with patch: 2 out of 8 bad
> after that second mail:
> 2.6.23-rc4-mm1 with patch: 1 out of 5 bad
> 2.6.23-rc6-mm1: 1 out of 2 bad

git-block.patch in 2.6.23-rc6-mm1 includes my patch that disables sg
chaining for libata, but it still includes libata's sg chaining
changes. So either these changes break libata, or libata was broken
after 2.6.23-rc3-mm1.

Can you try Jens's sglist-arch branch? If it works, libata in -mm
probably has bugs.

For your convenience, I put a sglist-arch branch patch against v2.6.23-rc7:

http://www.kernel.org/pub/linux/kernel/people/tomo/misc/v2.6.23-rc7-sglist-arch.diff.bz2

> switching back to 2.6.23-rc3-mm1 to rule out the hardware:
> 2.6.23-rc3-mm1: 6x good
>
> The error messages from the failed 2.6.23-rc6-mm1:
> Sep 18 18:50:01 treogen [ 33.340000] md1: bitmap initialized from
> disk: read 10/10 pages, set 0 bits
> Sep 18 18:50:01 treogen [ 33.340000] created bitmap (145 pages) for device md1
> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: exception Emask 0x0
> SAct 0x1 SErr 0x0 action 0x6 frozen
> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: cmd
> 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
> Sep 18 18:50:01 treogen [ 63.440000] res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: status: {DRDY }
> Sep 18 18:50:01 treogen [ 63.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 65.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 65.740000] ata1: reset failed (errno=-5),
> retrying in 8 secs
> Sep 18 18:50:01 treogen [ 73.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 75.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 75.740000] ata1: reset failed (errno=-5),
> retrying in 8 secs
> Sep 18 18:50:01 treogen [ 83.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 85.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 85.740000] ata1: reset failed (errno=-5),
> retrying in 33 secs
> Sep 18 18:50:01 treogen [ 118.440000] ata1: limiting SATA link speed
> to 1.5 Gbps
> Sep 18 18:50:01 treogen [ 118.440000] ata1: hard resetting link
> Sep 18 18:50:01 treogen [ 120.740000] ata1: softreset failed (port not ready)
> Sep 18 18:50:01 treogen [ 120.740000] ata1: reset failed, giving up
> Sep 18 18:50:01 treogen [ 120.740000] ata1.00: disabled
> Sep 18 18:50:01 treogen [ 120.740000] ata1: EH complete
> Sep 18 18:50:01 treogen [ 120.740000] sd 0:0:0:0: [sda] Result:
> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
> Sep 18 18:50:01 treogen [ 120.740000] end_request: I/O error, dev
> sda, sector 625137161
> Sep 18 18:50:01 treogen [ 120.740000] md: super_written gets
> error=-5, uptodate=0
> Sep 18 18:50:01 treogen [ 120.740000] raid5: Disk failure on sda2,
> disabling device. Operation continuing on 2 devices
>
> After that many more errors like this, only differing in the sector number:
> Sep 18 18:50:01 treogen [ 120.810000] sd 0:0:0:0: [sda] Result:
> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
> Sep 18 18:50:01 treogen [ 120.810000] end_request: I/O error, dev
> sda, sector 19550919
>
> Is any more information needed?
>
> Torsten
* Re: What's in linux-2.6-block.git for 2.6.24
From: Torsten Kaiser @ 2007-09-23 15:31 UTC
To: FUJITA Tomonori
Cc: jens.axboe, linux-kernel, akpm, linux-scsi, linux-ide, fujita.tomonori

On 9/23/07, FUJITA Tomonori <tomof@acm.org> wrote:
> On Sun, 23 Sep 2007 15:19:13 +0200
> "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
> > To update the statistics:
> > prior to 2.6.23-rc4-mm1: no trouble with any drives on the SiI 3132.
> > 2.6.23-rc4-mm1 without patch: 2 out of 2 bad.
> > back to 2.6.23-rc3-mm1: 18x good.
> > 2.6.23-rc4-mm1 with patch: 2 out of 8 bad
> > after that second mail:
> > 2.6.23-rc4-mm1 with patch: 1 out of 5 bad
> > 2.6.23-rc6-mm1: 1 out of 2 bad
>
> git-block.patch in 2.6.23-rc6-mm1 includes my patch that disables sg
> chaining for libata, but it still includes libata's sg chaining
> changes. So either these changes break libata, or libata was broken
> after 2.6.23-rc3-mm1.
>
> Can you try Jens's sglist-arch branch? If it works, libata in -mm
> probably has bugs.
>
> For your convenience, I put a sglist-arch branch patch against v2.6.23-rc7:
>
> http://www.kernel.org/pub/linux/kernel/people/tomo/misc/v2.6.23-rc7-sglist-arch.diff.bz2

Thanks for the patch.
I tried it and 3 out of 3 boot attempts worked without problems.
But I can't rule out that the bug is still there, as I have no way to
trigger it on demand.

Torsten
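Torsten's caution that three clean boots cannot rule out the bug can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that every boot fails independently with the same probability, and pools the two "2.6.23-rc4-mm1 with patch" series quoted above (2 of 8 and 1 of 5 bad, so 3 of 13). The thread does not establish that boots are independent or that the rate is stable, and `p_all_good` is a hypothetical helper name.

```python
from fractions import Fraction


def p_all_good(bad, total, boots):
    """Chance that `boots` consecutive boots all succeed.

    Assumes each boot fails independently with probability bad/total.
    That is a modelling assumption, not a measured property of this bug.
    """
    return (1 - Fraction(bad, total)) ** boots


# Pooled "2.6.23-rc4-mm1 with patch" results from the thread: 3 bad of 13.
bad, total = 2 + 1, 8 + 5
for boots in (3, 5, 10):
    print(boots, float(p_all_good(bad, total, boots)))
```

With that failure rate, three good boots in a row would still happen a little under half of the time even if nothing had been fixed, so a short run of clean boots is weak evidence either way.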
* Re: What's in linux-2.6-block.git for 2.6.24
From: Torsten Kaiser @ 2007-09-24 18:48 UTC
To: FUJITA Tomonori
Cc: jens.axboe, linux-kernel, akpm, linux-scsi, linux-ide, fujita.tomonori

On 9/23/07, Torsten Kaiser <just.for.lkml@googlemail.com> wrote:
> On 9/23/07, FUJITA Tomonori <tomof@acm.org> wrote:
> > Can you try Jens's sglist-arch branch? If it works, libata in -mm
> > probably has bugs.
> >
> > For your convenience, I put a sglist-arch branch patch against v2.6.23-rc7:
> >
> > http://www.kernel.org/pub/linux/kernel/people/tomo/misc/v2.6.23-rc7-sglist-arch.diff.bz2
>
> Thanks for the patch.
> I tried it and 3 out of 3 boot attempts worked without problems.
> But I can't rule out that the bug is still there, as I have no way to
> trigger it on demand.

Short update: 2 more boots with that kernel also worked.

I have just installed 2.6.23-rc7-mm1 and booted it three times. No
problems with that version either.

I will keep on using 2.6.23-rc7-mm1 and post again if the error shows up again.

Torsten
* Re: What's in linux-2.6-block.git for 2.6.24
From: Torsten Kaiser @ 2007-09-25 5:52 UTC
To: FUJITA Tomonori
Cc: jens.axboe, linux-kernel, akpm, linux-scsi, linux-ide, fujita.tomonori

On 9/24/07, Torsten Kaiser <just.for.lkml@googlemail.com> wrote:
> I will keep on using 2.6.23-rc7-mm1 and post again if the error shows up again.

On the next boot it did show up again, so 2.6.23-rc7-mm1 still has the bug.

[ 33.810000] md1: bitmap initialized from disk: read 10/10 pages, set 0 bits
[ 33.810000] created bitmap (145 pages) for device md1
[ 63.910000] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[ 63.910000] ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
[ 63.910000] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 63.910000] ata1.00: status: {DRDY }
[ 63.910000] ata1: hard resetting link
[ 66.210000] ata1: softreset failed (port not ready)
[ 66.210000] ata1: reset failed (errno=-5), retrying in 8 secs
[ 73.910000] ata1: hard resetting link
[ 76.210000] ata1: softreset failed (port not ready)
[ 76.210000] ata1: reset failed (errno=-5), retrying in 8 secs
[ 83.910000] ata1: hard resetting link
[ 86.210000] ata1: softreset failed (port not ready)
[ 86.210000] ata1: reset failed (errno=-5), retrying in 33 secs
[ 118.910000] ata1: limiting SATA link speed to 1.5 Gbps
[ 118.910000] ata1: hard resetting link
[ 121.210000] ata1: softreset failed (port not ready)
[ 121.210000] ata1: reset failed, giving up
[ 121.210000] ata1.00: disabled
[ 121.210000] ata1: EH complete
[ 121.210000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[ 121.210000] end_request: I/O error, dev sda, sector 625137161
[ 121.210000] md: super_written gets error=-5, uptodate=0
[ 121.210000] raid5: Disk failure on sda2, disabling device. Operation continuing on 2 devices

After that there are many more errors like this in the log:
[ 135.760000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[ 135.760000] end_request: I/O error, dev sda, sector 19551113
[ 135.760000] Buffer I/O error on device sda2, logical block 1
or:
[ 135.760000] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[ 135.760000] end_request: I/O error, dev sda, sector 19551105

Torsten
* Re: What's in linux-2.6-block.git for 2.6.24
From: Alan Cox @ 2007-09-23 14:11 UTC
To: Torsten Kaiser; +Cc: Jens Axboe, linux-kernel, akpm, linux-scsi, linux-ide

> Sep 18 18:50:01 treogen [ 63.440000] ata1.00: status: {DRDY }
> Sep 18 18:50:01 treogen [ 63.440000] ata1: hard resetting link

Timed out waiting for data transfers to complete that didn't. Does sound
like the device got told the wrong sized transfer.

It then falls off the bus because Jeff hasn't merged Mark Lord's DRQ
draining patch.

Alan
* Re: What's in linux-2.6-block.git for 2.6.24
From: Torsten Kaiser @ 2007-09-23 15:40 UTC
To: Alan Cox; +Cc: Jens Axboe, linux-kernel, akpm, linux-scsi, linux-ide

On 9/23/07, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > Sep 18 18:50:01 treogen [ 63.440000] ata1.00: status: {DRDY }
> > Sep 18 18:50:01 treogen [ 63.440000] ata1: hard resetting link
>
> Timed out waiting for data transfers to complete that didn't. Does sound
> like the device got told the wrong sized transfer.
>
>
> It then falls off the bus because Jeff hasn't merged Mark Lord's DRQ
> draining patch.

One time the error was different:
Sep 11 19:19:24 treogen [ 33.340000] ata1.00: exception Emask 0x20 SAct 0x1 SErr 0x0 action 0x2
Sep 11 19:19:24 treogen [ 33.340000] ata1.00: irq_stat 0x00020002, PCI master abort while fetching SGT
Sep 11 19:19:24 treogen [ 33.340000] ata1.00: cmd 61/08:00:09:d6:42/00:00:25:00:00/40 tag 0 cdb 0x0 data 4096 out
Sep 11 19:19:24 treogen [ 33.340000] res 50/00:00:af:ea:42/00:00:25:00:00/e0 Emask 0x20 (host bus error)
Sep 11 19:19:24 treogen [ 33.340000] ata1.00: status: {DRDY }
Sep 11 19:19:24 treogen [ 33.670000] ata1: soft resetting link
Sep 11 19:19:24 treogen [ 33.710000] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 11 19:19:24 treogen [ 33.800000] ata1.00: configured for UDMA/100
Sep 11 19:19:24 treogen [ 33.800000] ata1: EH complete

This was repeated 12 times.
(A diff between a good boot and one with that error is here:
http://lkml.org/lkml/2007/9/14/107 )

Torsten
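The two failure modes in this thread differ in their `Emask` value: 0x4 in the timeout case and 0x20 in the "PCI master abort while fetching SGT" case. The sketch below decodes such a mask into names. Only the two mappings printed in the logs above (0x4 as "timeout", 0x20 as "host bus error") are confirmed by this thread. The remaining entries are my recollection of libata's AC_ERR_* flags and should be treated as an assumption and checked against the kernel source in use. `decode_emask` is a hypothetical helper.

```python
# Bit-to-name table for libata error masks. Only 0x004 and 0x020 are
# confirmed by the logs in this thread; the other entries are assumed
# from memory of libata's AC_ERR_* flags and may differ between kernels.
AC_ERR_BITS = {
    0x001: "device error",
    0x002: "HSM violation",
    0x004: "timeout",            # seen as "Emask 0x4 (timeout)"
    0x008: "media error",
    0x010: "ATA bus error",
    0x020: "host bus error",     # seen as "Emask 0x20 (host bus error)"
    0x040: "internal error",
    0x080: "invalid argument",
    0x100: "unknown error",
}


def decode_emask(emask):
    """Return the names of all known error bits set in `emask`."""
    return [name for bit, name in AC_ERR_BITS.items() if emask & bit]


for value in (0x4, 0x20):
    print(hex(value), decode_emask(value))
```

Read this way, the usual failure is a command that never completes, while the one-off case is the controller reporting a bus-level fault while fetching its scatter/gather table, which would be consistent with the suspicion that the sg list handling is involved.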