Re: 2.6.24-rc3-mm1

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: 2.6.24-rc3-mm1
       [not found] <20071120204525.ff27ac98.akpm@linux-foundation.org>
@ 2007-11-23  1:39 ` Gabriel C
  2007-11-23  4:12   ` 2.6.24-rc3-mm1 Andrew Morton
       [not found] ` <4744A6F2.4030302@free.fr>
  1 sibling, 1 reply; 35+ messages in thread
From: Gabriel C @ 2007-11-23  1:39 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-scsi

I have some warnings on each SCSI disc:


...

[   30.724410] scsi 0:0:0:0: Direct-Access     SEAGATE  ST318406LW       0109 PQ: 0 ANSI: 3
[   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
[   30.724435]  target0:0:0: Beginning Domain Validation
[   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
[   30.724572]  target0:0:0: Ending Domain Validation
[   30.729747] scsi 0:0:1:0: Direct-Access     FUJITSU  MAH3182MP        0114 PQ: 0 ANSI: 4
[   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
[   30.729771]  target0:0:1: Beginning Domain Validation
[   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
[   30.729908]  target0:0:1: Ending Domain Validation

...

no idea whatever this is related but buffered disk reads are 2.XX MB/sec and the box is somewhat laggy.

hdparm -t on sda and sdb reports :

/dev/sda:
 Timing buffered disk reads:    8 MB in  3.26 seconds =   2.46 MB/sec

/dev/sdb:
 Timing buffered disk reads:    8 MB in  3.56 seconds =   2.25 MB/sec

My IDE discs are fine.

Please let me know if you need my config or any other informations.


Gabriel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-11-23  1:39 ` 2.6.24-rc3-mm1 Gabriel C
@ 2007-11-23  4:12   ` Andrew Morton
  2007-11-23  5:55     ` 2.6.24-rc3-mm1 Gabriel C
  0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2007-11-23  4:12 UTC (permalink / raw)
  To: Gabriel C; +Cc: linux-kernel, linux-scsi

On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <nix.or.die@googlemail.com> wrote:

> I have some warnings on each SCSI disc:
> 
> 
> ...
> 
> [   30.724410] scsi 0:0:0:0: Direct-Access     SEAGATE  ST318406LW       0109 PQ: 0 ANSI: 3
> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
> [   30.724435]  target0:0:0: Beginning Domain Validation
> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
> [   30.724572]  target0:0:0: Ending Domain Validation
> [   30.729747] scsi 0:0:1:0: Direct-Access     FUJITSU  MAH3182MP        0114 PQ: 0 ANSI: 4
> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
> [   30.729771]  target0:0:1: Beginning Domain Validation
> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
> [   30.729908]  target0:0:1: Ending Domain Validation
> 

Don't know what would have caused that.  But yes, something is wrong in
scsi land.

> 
> no idea whatever this is related but buffered disk reads are 2.XX MB/sec and the box is somewhat laggy.
> 
> hdparm -t on sda and sdb reports :
> 
> /dev/sda:
>  Timing buffered disk reads:    8 MB in  3.26 seconds =   2.46 MB/sec
> 
> /dev/sdb:
>  Timing buffered disk reads:    8 MB in  3.56 seconds =   2.25 MB/sec
> 
> My IDE discs are fine.
> 
> Please let me know if you need my config or any other informations.
> 

And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-11-23  4:12   ` 2.6.24-rc3-mm1 Andrew Morton
@ 2007-11-23  5:55     ` Gabriel C
  2007-11-27  6:15       ` 2.6.24-rc3-mm1 Andrew Morton
  0 siblings, 1 reply; 35+ messages in thread
From: Gabriel C @ 2007-11-23  5:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-scsi, Hannes Reinecke, James Bottomley

Andrew Morton wrote:
> On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <nix.or.die@googlemail.com> wrote:
> 
>> I have some warnings on each SCSI disc:
>>
>>
>> ...
>>
>> [   30.724410] scsi 0:0:0:0: Direct-Access     SEAGATE  ST318406LW       0109 PQ: 0 ANSI: 3
>> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
>> [   30.724435]  target0:0:0: Beginning Domain Validation
>> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
>> [   30.724572]  target0:0:0: Ending Domain Validation
>> [   30.729747] scsi 0:0:1:0: Direct-Access     FUJITSU  MAH3182MP        0114 PQ: 0 ANSI: 4
>> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
>> [   30.729771]  target0:0:1: Beginning Domain Validation
>> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
>> [   30.729908]  target0:0:1: Ending Domain Validation
>>
> 
> Don't know what would have caused that.  But yes, something is wrong in
> scsi land.

Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c and I still can boot ;)

> 
>> no idea whatever this is related but buffered disk reads are 2.XX MB/sec and the box is somewhat laggy.
>>
>> hdparm -t on sda and sdb reports :
>>
>> /dev/sda:
>>  Timing buffered disk reads:    8 MB in  3.26 seconds =   2.46 MB/sec
>>
>> /dev/sdb:
>>  Timing buffered disk reads:    8 MB in  3.56 seconds =   2.25 MB/sec
>>
>> My IDE discs are fine.
>>
>> Please let me know if you need my config or any other informations.
>>
> 
> And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
> 

I found the commit which cause these problems , it is in git-scsi-misc patch and reverting it fixes both problems for me.

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
       [not found]   ` <20071121144116.c932727b.akpm@linux-foundation.org>
@ 2007-11-23  7:29     ` Laurent Riffard
  2007-11-23  7:51       ` Hannes Reinecke
  0 siblings, 1 reply; 35+ messages in thread
From: Laurent Riffard @ 2007-11-23  7:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-ide, James Bottomley, linux-scsi


Le 21.11.2007 23:41, Andrew Morton a écrit :
> On Wed, 21 Nov 2007 22:45:22 +0100
> Laurent Riffard <laurent.riffard@free.fr> wrote:
> 
>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>> Hello, 
>>
>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>> that a bunch of task are blocked in "D" state, they seem to wait for
>> some I/O completion. I can try to hand-copy some data if requested.
>>
>> I found these messages in dmesg:
>>
>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>> EXT3-fs: mounted filesystem with ordered data mode.
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sda, sector 16460
>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>> ReiserFS: sda7: using ordered data mode
>> --
>> ReiserFS: sda7: Using r5 hash to sort names
>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdb, sector 19632
>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdb, sector 40037363
>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>> lp0: using parport0 (interrupt-driven).
>>
>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>
>> Maybe something is broken in pata_via driver ?
>>
> 
> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> touch pata_via.c.

None of the above...

I did a bisection, it spotted git-scsi-misc.patch. 
I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.

I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
commits are touching documentation or drivers I don't use. I'll try 
to revert only this one this evening.

-- 
laurent


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-23  7:29     ` 2.6.24-rc3-mm1: I/O error, system hangs Laurent Riffard
@ 2007-11-23  7:51       ` Hannes Reinecke
  2007-11-23 11:38         ` Hannes Reinecke
  0 siblings, 1 reply; 35+ messages in thread
From: Hannes Reinecke @ 2007-11-23  7:51 UTC (permalink / raw)
  To: Laurent Riffard
  Cc: Andrew Morton, linux-kernel, linux-ide, James Bottomley,
	linux-scsi

Laurent Riffard wrote:
> Le 21.11.2007 23:41, Andrew Morton a écrit :
>> On Wed, 21 Nov 2007 22:45:22 +0100
>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>
>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>> Hello, 
>>>
>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>> some I/O completion. I can try to hand-copy some data if requested.
>>>
>>> I found these messages in dmesg:
>>>
>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>> EXT3-fs: mounted filesystem with ordered data mode.
>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sda, sector 16460
>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>> ReiserFS: sda7: using ordered data mode
>>> --
>>> ReiserFS: sda7: Using r5 hash to sort names
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 19632
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 40037363
>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>> lp0: using parport0 (interrupt-driven).
>>>
>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>
>>> Maybe something is broken in pata_via driver ?
>>>
>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>> touch pata_via.c.
> 
> None of the above...
> 
> I did a bisection, it spotted git-scsi-misc.patch. 
> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> 
> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> commits are touching documentation or drivers I don't use. I'll try 
> to revert only this one this evening.
> 
Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
I shouldn't. Checking ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-23  7:51       ` Hannes Reinecke
@ 2007-11-23 11:38         ` Hannes Reinecke
  2007-11-23 17:52           ` Laurent Riffard
  2007-11-24 17:44           ` James Bottomley
  0 siblings, 2 replies; 35+ messages in thread
From: Hannes Reinecke @ 2007-11-23 11:38 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Laurent Riffard, Andrew Morton, linux-kernel, linux-ide,
	James Bottomley, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 2784 bytes --]

Hannes Reinecke wrote:
> Laurent Riffard wrote:
>> Le 21.11.2007 23:41, Andrew Morton a écrit :
>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>>
>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>> Hello, 
>>>>
>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>
>>>> I found these messages in dmesg:
>>>>
>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>> end_request: I/O error, dev sda, sector 16460
>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>> ReiserFS: sda7: using ordered data mode
>>>> --
>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>> end_request: I/O error, dev sdb, sector 19632
>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>> end_request: I/O error, dev sdb, sector 40037363
>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>> lp0: using parport0 (interrupt-driven).
>>>>
>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>
>>>> Maybe something is broken in pata_via driver ?
>>>>
>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>> touch pata_via.c.
>> None of the above...
>>
>> I did a bisection, it spotted git-scsi-misc.patch. 
>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>
>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>> commits are touching documentation or drivers I don't use. I'll try 
>> to revert only this one this evening.
>>
> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
> I shouldn't. Checking ...
> 
Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.

James, please apply.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

[-- Attachment #2: scsi-misc-fix-spi-validation --]
[-- Type: text/plain, Size: 1015 bytes --]

Fix SPI Domain validation

This fixes a thinko of the FAILFAST handling: when we get
a request with FAILFAST set, we still have to evaluate the
PREEMPT flag to decide if this request should be passed through.

Signed-off-by: Hannes Reinecke <hare@suse.de>

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 13e7e09..9ec1566 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1284,13 +1284,15 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 			/*
 			 * If the devices is blocked we defer normal commands.
 			 */
-			if (!(req->cmd_flags & REQ_PREEMPT))
-				ret = BLKPREP_DEFER;
-			/*
-			 * Return failfast requests immediately
-			 */
-			if (req->cmd_flags & REQ_FAILFAST)
-				ret = BLKPREP_KILL;
+			if (!(req->cmd_flags & REQ_PREEMPT)) {
+				/*
+				 * Return failfast requests immediately
+				 */
+				if (req->cmd_flags & REQ_FAILFAST)
+					ret = BLKPREP_KILL;
+				else
+					ret = BLKPREP_DEFER;
+			}
 			break;
 		default:
 			/*

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-23 11:38         ` Hannes Reinecke
@ 2007-11-23 17:52           ` Laurent Riffard
  2007-11-24  6:42             ` James Bottomley
  2007-11-24 17:44           ` James Bottomley
  1 sibling, 1 reply; 35+ messages in thread
From: Laurent Riffard @ 2007-11-23 17:52 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Andrew Morton, linux-kernel, linux-ide, James Bottomley,
	linux-scsi

Le 23.11.2007 12:38, Hannes Reinecke a écrit :
> Hannes Reinecke wrote:
>> Laurent Riffard wrote:
>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>>>
>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>> Hello, 
>>>>>
>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>
>>>>> I found these messages in dmesg:
>>>>>
>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>> end_request: I/O error, dev sda, sector 16460
>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>> ReiserFS: sda7: using ordered data mode
>>>>> --
>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>> lp0: using parport0 (interrupt-driven).
>>>>>
>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>
>>>>> Maybe something is broken in pata_via driver ?
>>>>>
>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>> touch pata_via.c.
>>> None of the above...
>>>
>>> I did a bisection, it spotted git-scsi-misc.patch. 
>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>
>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>>> commits are touching documentation or drivers I don't use. I'll try 
>>> to revert only this one this evening.

I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
does fix the problem.

>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>> I shouldn't. Checking ...
>>
> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.

Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.

-- 
laurent


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-23 17:52           ` Laurent Riffard
@ 2007-11-24  6:42             ` James Bottomley
  2007-11-24 12:57               ` Laurent Riffard
  0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2007-11-24  6:42 UTC (permalink / raw)
  To: Laurent Riffard
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi


On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
> > Hannes Reinecke wrote:
> >> Laurent Riffard wrote:
> >>> Le 21.11.2007 23:41, Andrew Morton a écrit :
> >>>> On Wed, 21 Nov 2007 22:45:22 +0100
> >>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
> >>>>
> >>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
> >>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> >>>>> Hello, 
> >>>>>
> >>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> >>>>> that a bunch of task are blocked in "D" state, they seem to wait for
> >>>>> some I/O completion. I can try to hand-copy some data if requested.
> >>>>>
> >>>>> I found these messages in dmesg:
> >>>>>
> >>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> >>>>> EXT3-fs: mounted filesystem with ordered data mode.
> >>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>> end_request: I/O error, dev sda, sector 16460
> >>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> >>>>> ReiserFS: sda7: using ordered data mode
> >>>>> --
> >>>>> ReiserFS: sda7: Using r5 hash to sort names
> >>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>> end_request: I/O error, dev sdb, sector 19632
> >>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>> end_request: I/O error, dev sdb, sector 40037363
> >>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
> >>>>> lp0: using parport0 (interrupt-driven).
> >>>>>
> >>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
> >>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> >>>>>
> >>>>> Maybe something is broken in pata_via driver ?
> >>>>>
> >>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> >>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> >>>> touch pata_via.c.
> >>> None of the above...
> >>>
> >>> I did a bisection, it spotted git-scsi-misc.patch. 
> >>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> >>>
> >>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> >>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> >>> commits are touching documentation or drivers I don't use. I'll try 
> >>> to revert only this one this evening.
> 
> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
> does fix the problem.
> 
> >> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
> >> I shouldn't. Checking ...
> >>
> > Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
> > when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
> 
> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.

I think the problem is the way we treat BLOCKED and QUIESCED (the latter
is the state that the domain validation uses and which we cannot kill
fastfail on).  It's definitely wrong to kill fastfail requests when the
state is QUIESCE.

This patch (which is applied on top of Hannes original) separates the
BLOCK and QUIESCE states correctly ... does this fix the problem?

James

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 13e7e09..a7cf23a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1279,18 +1279,21 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 				    "rejecting I/O to dead device\n");
 			ret = BLKPREP_KILL;
 			break;
-		case SDEV_QUIESCE:
 		case SDEV_BLOCK:
 			/*
-			 * If the devices is blocked we defer normal commands.
-			 */
-			if (!(req->cmd_flags & REQ_PREEMPT))
-				ret = BLKPREP_DEFER;
-			/*
 			 * Return failfast requests immediately
 			 */
 			if (req->cmd_flags & REQ_FAILFAST)
 				ret = BLKPREP_KILL;
+
+			/* fall through */
+
+		case SDEV_QUIESCE:
+			/*
+			 * If the devices is blocked we defer normal commands.
+			 */
+			if (!(req->cmd_flags & REQ_PREEMPT))
+				ret = BLKPREP_DEFER;
 			break;
 		default:
 			/*



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24  6:42             ` James Bottomley
@ 2007-11-24 12:57               ` Laurent Riffard
  2007-11-24 13:26                 ` James Bottomley
  0 siblings, 1 reply; 35+ messages in thread
From: Laurent Riffard @ 2007-11-24 12:57 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi



Le 24.11.2007 07:42, James Bottomley a écrit :
> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
>>> Hannes Reinecke wrote:
>>>> Laurent Riffard wrote:
>>>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>>>>>
>>>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>> Hello, 
>>>>>>>
>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>
>>>>>>> I found these messages in dmesg:
>>>>>>>
>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>> --
>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>
>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>
>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>
>>>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>> touch pata_via.c.
>>>>> None of the above...
>>>>>
>>>>> I did a bisection, it spotted git-scsi-misc.patch. 
>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>
>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>>>>> commits are touching documentation or drivers I don't use. I'll try 
>>>>> to revert only this one this evening.
>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>> does fix the problem.
>>
>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>> I shouldn't. Checking ...
>>>>
>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
> 
> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
> is the state that the domain validation uses and which we cannot kill
> fastfail on).  It's definitely wrong to kill fastfail requests when the
> state is QUIESCE.
> 
> This patch (which is applied on top of Hannes original) separates the
> BLOCK and QUIESCE states correctly ... does this fix the problem?


No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)


> James
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 13e7e09..a7cf23a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c

> @@ -1279,18 +1279,21 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>  				    "rejecting I/O to dead device\n");
>  			ret = BLKPREP_KILL;
>  			break;
> -		case SDEV_QUIESCE:
>  		case SDEV_BLOCK:
>  			/*
> -			 * If the devices is blocked we defer normal commands.
> -			 */
> -			if (!(req->cmd_flags & REQ_PREEMPT))
> -				ret = BLKPREP_DEFER;
> -			/*
>  			 * Return failfast requests immediately
>  			 */
>  			if (req->cmd_flags & REQ_FAILFAST)
>  				ret = BLKPREP_KILL;
> +
> +			/* fall through */
> +
> +		case SDEV_QUIESCE:
> +			/*
> +			 * If the devices is blocked we defer normal commands.
> +			 */
> +			if (!(req->cmd_flags & REQ_PREEMPT))
> +				ret = BLKPREP_DEFER;
>  			break;
>  		default:
>  			/*
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 12:57               ` Laurent Riffard
@ 2007-11-24 13:26                 ` James Bottomley
  2007-11-24 17:54                   ` Gabriel C
  2007-11-24 22:59                   ` Laurent Riffard
  0 siblings, 2 replies; 35+ messages in thread
From: James Bottomley @ 2007-11-24 13:26 UTC (permalink / raw)
  To: Laurent Riffard
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
> Le 24.11.2007 07:42, James Bottomley a écrit :
> > On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
> >> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
> >>> Hannes Reinecke wrote:
> >>>> Laurent Riffard wrote:
> >>>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
> >>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
> >>>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
> >>>>>>
> >>>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
> >>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> >>>>>>> Hello, 
> >>>>>>>
> >>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> >>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
> >>>>>>> some I/O completion. I can try to hand-copy some data if requested.
> >>>>>>>
> >>>>>>> I found these messages in dmesg:
> >>>>>>>
> >>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> >>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
> >>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>> end_request: I/O error, dev sda, sector 16460
> >>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> >>>>>>> ReiserFS: sda7: using ordered data mode
> >>>>>>> --
> >>>>>>> ReiserFS: sda7: Using r5 hash to sort names
> >>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>> end_request: I/O error, dev sdb, sector 19632
> >>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>> end_request: I/O error, dev sdb, sector 40037363
> >>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
> >>>>>>> lp0: using parport0 (interrupt-driven).
> >>>>>>>
> >>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
> >>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> >>>>>>>
> >>>>>>> Maybe something is broken in pata_via driver ?
> >>>>>>>
> >>>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> >>>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> >>>>>> touch pata_via.c.
> >>>>> None of the above...
> >>>>>
> >>>>> I did a bisection, it spotted git-scsi-misc.patch. 
> >>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> >>>>>
> >>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> >>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> >>>>> commits are touching documentation or drivers I don't use. I'll try 
> >>>>> to revert only this one this evening.
> >> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
> >> does fix the problem.
> >>
> >>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
> >>>> I shouldn't. Checking ...
> >>>>
> >>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
> >>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
> >> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
> > 
> > I think the problem is the way we treat BLOCKED and QUIESCED (the latter
> > is the state that the domain validation uses and which we cannot kill
> > fastfail on).  It's definitely wrong to kill fastfail requests when the
> > state is QUIESCE.
> > 
> > This patch (which is applied on top of Hannes original) separates the
> > BLOCK and QUIESCE states correctly ... does this fix the problem?
> 
> 
> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)

OK, could you post dmesgs again, please.  I actually tested this with an
aic79xx card, and for me it does cause Domain Validation to succeed
again.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-23 11:38         ` Hannes Reinecke
  2007-11-23 17:52           ` Laurent Riffard
@ 2007-11-24 17:44           ` James Bottomley
  2007-11-26  7:54             ` Hannes Reinecke
  1 sibling, 1 reply; 35+ messages in thread
From: James Bottomley @ 2007-11-24 17:44 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Laurent Riffard, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

Probing intermittent failures in Domain Validation, even with the fixes
applied leads me to the conclusion that there are further problems with
this commit:

commit fc5eb4facedbd6d7117905e775cee1975f894e79
Author: Hannes Reinecke <hare@suse.de>
Date:   Tue Nov 6 09:23:40 2007 +0100

    [SCSI] Do not requeue requests if REQ_FAILFAST is set

The essence of the problems is that you're causing REQ_FAILFAST to
terminate commands with error on requeuing conditions, some of which are
relatively common on most SCSI devices.  While this may be the correct
behaviour for multi-path, it's certainly wrong for the previously
understood meaning of REQ_FAILFAST, which was don't retry on error,
which is why domain validation and other applications use it to control
error handling, but don't expect to get failures for a simple requeue
are now spitting errors.

I honestly can't see that, even for the multi-path case, returning an
error when we're over queue depth is the correct thing to do (it may not
matter to something like a symmetrix, but an array that has a non-zero
cost associated with a path change, like a CPQ HSV or the AVT
controllers, will show fairly large slow downs if you do this).  Even if
this is the desired behaviour (and I think that's a policy issue),
DID_NO_CONNECT is almost certainly the wrong error to be sending back.

This patch fixes up domain validation to work again correctly, however,
I really think it's just a bandaid.  Do you want to rethink the above
commit?

James

Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c	2007-11-24 11:25:20.000000000 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c	2007-11-24 11:26:22.000000000 -0600
@@ -1552,7 +1552,8 @@ static void scsi_request_fn(struct reque
 			break;

 		if (!scsi_dev_queue_ready(q, sdev)) {
-			if (req->cmd_flags & REQ_FAILFAST) {
+			if ((req->cmd_flags & REQ_FAILFAST) &&
+			    !(req->cmd_flags & REQ_PREEMPT)) {
 				scsi_kill_request(req, q);
 				continue;
 			}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 13:26                 ` James Bottomley
@ 2007-11-24 17:54                   ` Gabriel C
  2007-11-24 18:04                     ` James Bottomley
  2007-11-24 22:59                   ` Laurent Riffard
  1 sibling, 1 reply; 35+ messages in thread
From: Gabriel C @ 2007-11-24 17:54 UTC (permalink / raw)
  To: James Bottomley
  Cc: Laurent Riffard, Hannes Reinecke, Andrew Morton, linux-kernel,
	linux-ide, linux-scsi

James Bottomley wrote:
> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>> Le 24.11.2007 07:42, James Bottomley a écrit :
>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
>>>>> Hannes Reinecke wrote:
>>>>>> Laurent Riffard wrote:
>>>>>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
>>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>>>>>>>
>>>>>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>>>> Hello, 
>>>>>>>>>
>>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>>>
>>>>>>>>> I found these messages in dmesg:
>>>>>>>>>
>>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>>>> --
>>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>>>
>>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>>>
>>>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>>>
>>>>>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>>>> touch pata_via.c.
>>>>>>> None of the above...
>>>>>>>
>>>>>>> I did a bisection, it spotted git-scsi-misc.patch. 
>>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>>>
>>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>>>>>>> commits are touching documentation or drivers I don't use. I'll try 
>>>>>>> to revert only this one this evening.
>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>>>> does fix the problem.
>>>>
>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>> I shouldn't. Checking ...
>>>>>>
>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>> is the state that the domain validation uses and which we cannot kill
>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>> state is QUIESCE.
>>>
>>> This patch (which is applied on top of Hannes original) separates the
>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>
>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> 
> OK, could you post dmesgs again, please.  I actually tested this with an
> aic79xx card, and for me it does cause Domain Validation to succeed
> again.
> 

Are the patches indeed to fix that problem as well ? 

http://lkml.org/lkml/2007/11/23/5

> James

Gabriel 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 17:54                   ` Gabriel C
@ 2007-11-24 18:04                     ` James Bottomley
  2007-11-24 18:08                       ` Gabriel C
  0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2007-11-24 18:04 UTC (permalink / raw)
  To: Gabriel C
  Cc: Laurent Riffard, Hannes Reinecke, Andrew Morton, linux-kernel,
	linux-ide, linux-scsi


On Sat, 2007-11-24 at 18:54 +0100, Gabriel C wrote:
> James Bottomley wrote:
> > On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
> >> Le 24.11.2007 07:42, James Bottomley a écrit :
> >>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
> >>>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
> >>>>> Hannes Reinecke wrote:
> >>>>>> Laurent Riffard wrote:
> >>>>>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
> >>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
> >>>>>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
> >>>>>>>>
> >>>>>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
> >>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> >>>>>>>>> Hello, 
> >>>>>>>>>
> >>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> >>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
> >>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
> >>>>>>>>>
> >>>>>>>>> I found these messages in dmesg:
> >>>>>>>>>
> >>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> >>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
> >>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>>>> end_request: I/O error, dev sda, sector 16460
> >>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> >>>>>>>>> ReiserFS: sda7: using ordered data mode
> >>>>>>>>> --
> >>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
> >>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>>>> end_request: I/O error, dev sdb, sector 19632
> >>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
> >>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
> >>>>>>>>> lp0: using parport0 (interrupt-driven).
> >>>>>>>>>
> >>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
> >>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> >>>>>>>>>
> >>>>>>>>> Maybe something is broken in pata_via driver ?
> >>>>>>>>>
> >>>>>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> >>>>>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> >>>>>>>> touch pata_via.c.
> >>>>>>> None of the above...
> >>>>>>>
> >>>>>>> I did a bisection, it spotted git-scsi-misc.patch. 
> >>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> >>>>>>>
> >>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> >>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> >>>>>>> commits are touching documentation or drivers I don't use. I'll try 
> >>>>>>> to revert only this one this evening.
> >>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
> >>>> does fix the problem.
> >>>>
> >>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
> >>>>>> I shouldn't. Checking ...
> >>>>>>
> >>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
> >>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
> >>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
> >>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
> >>> is the state that the domain validation uses and which we cannot kill
> >>> fastfail on).  It's definitely wrong to kill fastfail requests when the
> >>> state is QUIESCE.
> >>>
> >>> This patch (which is applied on top of Hannes original) separates the
> >>> BLOCK and QUIESCE states correctly ... does this fix the problem?
> >>
> >> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> > 
> > OK, could you post dmesgs again, please.  I actually tested this with an
> > aic79xx card, and for me it does cause Domain Validation to succeed
> > again.
> > 
> 
> Are the patches indeed to fix that problem as well ? 
> 
> http://lkml.org/lkml/2007/11/23/5

That dmesg is from an unknown SCSI card exhibiting Domain Validation
problems, so it's a reasonable probability, yes ... but you'll need the
additional hack I just did to prevent further intermittent failures.

James



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 18:04                     ` James Bottomley
@ 2007-11-24 18:08                       ` Gabriel C
  2007-11-24 18:28                         ` Gabriel C
  0 siblings, 1 reply; 35+ messages in thread
From: Gabriel C @ 2007-11-24 18:08 UTC (permalink / raw)
  To: James Bottomley
  Cc: Laurent Riffard, Hannes Reinecke, Andrew Morton, linux-kernel,
	linux-ide, linux-scsi

James Bottomley wrote:
> On Sat, 2007-11-24 at 18:54 +0100, Gabriel C wrote:
>> James Bottomley wrote:
>>> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>>>> Le 24.11.2007 07:42, James Bottomley a écrit :
>>>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>>>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
>>>>>>> Hannes Reinecke wrote:
>>>>>>>> Laurent Riffard wrote:
>>>>>>>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
>>>>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>>>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>>>>>>>>>
>>>>>>>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>>>>>> Hello, 
>>>>>>>>>>>
>>>>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>>>>>
>>>>>>>>>>> I found these messages in dmesg:
>>>>>>>>>>>
>>>>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>>>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>>>>>> --
>>>>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>>>>>
>>>>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>>>>>
>>>>>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>>>>>
>>>>>>>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>>>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>>>>>> touch pata_via.c.
>>>>>>>>> None of the above...
>>>>>>>>>
>>>>>>>>> I did a bisection, it spotted git-scsi-misc.patch. 
>>>>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>>>>>
>>>>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>>>>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>>>>>>>>> commits are touching documentation or drivers I don't use. I'll try 
>>>>>>>>> to revert only this one this evening.
>>>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>>>>>> does fix the problem.
>>>>>>
>>>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>>>> I shouldn't. Checking ...
>>>>>>>>
>>>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>>>> is the state that the domain validation uses and which we cannot kill
>>>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>>>> state is QUIESCE.
>>>>>
>>>>> This patch (which is applied on top of Hannes original) separates the
>>>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
>>> OK, could you post dmesgs again, please.  I actually tested this with an
>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>> again.
>>>
>> Are the patches indeed to fix that problem as well ? 
>>
>> http://lkml.org/lkml/2007/11/23/5
> 
> That dmesg is from an unknown SCSI card exhibiting Domain Validation
> problems, so it's a reasonable probability, yes ... but you'll need the
> additional hack I just did to prevent further intermittent failures.

My controller is:

03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] (rev 02)

I'll try the patches in a bit.

> 
> James
> 

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 18:08                       ` Gabriel C
@ 2007-11-24 18:28                         ` Gabriel C
  0 siblings, 0 replies; 35+ messages in thread
From: Gabriel C @ 2007-11-24 18:28 UTC (permalink / raw)
  To: James Bottomley
  Cc: Laurent Riffard, Hannes Reinecke, Andrew Morton, linux-kernel,
	linux-ide, linux-scsi

Gabriel C wrote:
> James Bottomley wrote:
>> On Sat, 2007-11-24 at 18:54 +0100, Gabriel C wrote:
>>> James Bottomley wrote:
>>>> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>>>>> Le 24.11.2007 07:42, James Bottomley a écrit :
>>>>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>>>>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
>>>>>>>> Hannes Reinecke wrote:
>>>>>>>>> Laurent Riffard wrote:
>>>>>>>>>> Le 21.11.2007 23:41, Andrew Morton a écrit :
>>>>>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>>>>>>> Laurent Riffard <laurent.riffard@free.fr> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>>>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>>>>>>> Hello, 
>>>>>>>>>>>>
>>>>>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>>>>>>
>>>>>>>>>>>> I found these messages in dmesg:
>>>>>>>>>>>>
>>>>>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>>>>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>>>>>>> --
>>>>>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>>>>>>
>>>>>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>>>>>>
>>>>>>>>>>> Could be - libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>>>>>>> and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>>>>>>> touch pata_via.c.
>>>>>>>>>> None of the above...
>>>>>>>>>>
>>>>>>>>>> I did a bisection, it spotted git-scsi-misc.patch. 
>>>>>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>>>>>>
>>>>>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>>>>>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>>>>>>>>>> commits are touching documentation or drivers I don't use. I'll try 
>>>>>>>>>> to revert only this one this evening.
>>>>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>>>>>>> does fix the problem.
>>>>>>>
>>>>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>>>>> I shouldn't. Checking ...
>>>>>>>>>
>>>>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>>>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>>>>> is the state that the domain validation uses and which we cannot kill
>>>>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>>>>> state is QUIESCE.
>>>>>>
>>>>>> This patch (which is applied on top of Hannes original) separates the
>>>>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>>>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
>>>> OK, could you post dmesgs again, please.  I actually tested this with an
>>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>>> again.
>>>>
>>> Are the patches indeed to fix that problem as well ? 
>>>
>>> http://lkml.org/lkml/2007/11/23/5
>> That dmesg is from an unknown SCSI card exhibiting Domain Validation
>> problems, so it's a reasonable probability, yes ... but you'll need the
>> additional hack I just did to prevent further intermittent failures.
> 
> My controller is:
> 
> 03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] (rev 02)
> 
> I'll try the patches in a bit.

With your patches my problem(s) are solved. Domain Validation works again.

...

[   32.179521] scsi 0:0:0:0: Direct-Access     SEAGATE  ST318406LW       0109 PQ: 0 ANSI: 3
[   32.179540] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
[   32.179554]  target0:0:0: Beginning Domain Validation
[   32.188553]  target0:0:0: wide asynchronous
[   32.195302]  target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
[   32.206510]  target0:0:0: Ending Domain Validation
[   32.211699] scsi 0:0:1:0: Direct-Access     FUJITSU  MAH3182MP        0114 PQ: 0 ANSI: 4
[   32.211707] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
[   32.211717]  target0:0:1: Beginning Domain Validation
[   32.213980]  target0:0:1: wide asynchronous
[   32.215682]  target0:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
[   32.220205]  target0:0:1: Ending Domain Validation

...

 Thx James :)


Gabriel

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 13:26                 ` James Bottomley
  2007-11-24 17:54                   ` Gabriel C
@ 2007-11-24 22:59                   ` Laurent Riffard
  2007-11-25  7:37                     ` James Bottomley
  1 sibling, 1 reply; 35+ messages in thread
From: Laurent Riffard @ 2007-11-24 22:59 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

[-- Attachment #1: Type: text/plain, Size: 2041 bytes --]

Le 24.11.2007 14:26, James Bottomley a écrit :
> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>> Le 24.11.2007 07:42, James Bottomley a écrit :
>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
[snip]
>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>>>> does fix the problem.
>>>>
>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>> I shouldn't. Checking ...
>>>>>>
>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>> is the state that the domain validation uses and which we cannot kill
>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>> state is QUIESCE.
>>>
>>> This patch (which is applied on top of Hannes original) separates the
>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>
>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> 
> OK, could you post dmesgs again, please.  I actually tested this with an
> aic79xx card, and for me it does cause Domain Validation to succeed
> again.

James, 

Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates the 
BLOCK and QUIESCE states correctly" (http://lkml.org/lkml/2007/11/24/8).

How to reproduce :
- boot
- switch to a text console
- capture dmesg in a file, sync, etc. There are 3 I/O errors, but the 
  system does work.
- switch to X console, log in the Gnome Desktop, the system partially 
  hangs.
- switch back to a text console: dmesg(1) still works, it shows some 
  additonal I/O errors. At this point, any disk access makes the system 
  completely hung.

Additionnal data:
- the I/O errors always happen on the same blocks.

-- 
laurent

[-- Attachment #2: dmesg-2.6.24-rc3-mm1-patched --]
[-- Type: text/plain, Size: 30536 bytes --]

[    0.000000] Linux version 2.6.24-rc3-mm1 (laurent@calimero) (gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #122 PREEMPT Fri Nov 23 18:47:58 CET 2007
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000001ffec000 (usable)
[    0.000000]  BIOS-e820: 000000001ffec000 - 000000001ffef000 (ACPI data)
[    0.000000]  BIOS-e820: 000000001ffef000 - 000000001ffff000 (reserved)
[    0.000000]  BIOS-e820: 000000001ffff000 - 0000000020000000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
[    0.000000] 511MB LOWMEM available.
[    0.000000] Entering add_active_range(0, 0, 131052) 0 entries of 256 used
[    0.000000] sizeof(struct page) = 32
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA             0 ->     4096
[    0.000000]   Normal       4096 ->   131052
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[1] active PFN ranges
[    0.000000]     0:        0 ->   131052
[    0.000000] On node 0 totalpages: 131052
[    0.000000] Node 0 memmap at 0xC1000000 size 4194304 first pfn 0xC1000000
[    0.000000]   DMA zone: 32 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 4064 pages, LIFO batch:0
[    0.000000]   Normal zone: 991 pages used for memmap
[    0.000000]   Normal zone: 125965 pages, LIFO batch:31
[    0.000000]   Movable zone: 0 pages used for memmap
[    0.000000] DMI 2.3 present.
[    0.000000] ACPI: RSDP 000F6A80, 0014 (r0 ASUS  )
[    0.000000] ACPI: RSDT 1FFEC000, 002C (r1 ASUS   A7V133-C 30303031 MSFT 31313031)
[    0.000000] ACPI: FACP 1FFEC080, 0074 (r1 ASUS   A7V133-C 30303031 MSFT 31313031)
[    0.000000] ACPI: DSDT 1FFEC100, 2CE1 (r1   ASUS A7V133-C     1000 MSFT  100000B)
[    0.000000] ACPI: FACS 1FFFF000, 0040
[    0.000000] ACPI: BOOT 1FFEC040, 0028 (r1 ASUS   A7V133-C 30303031 MSFT 31313031)
[    0.000000] ACPI: PM-Timer IO Port: 0xe408
[    0.000000] Allocating PCI resources starting at 30000000 (gap: 20000000:dfff0000)
[    0.000000] swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
[    0.000000] swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000f0000
[    0.000000] swsusp: Registered nosave memory region: 00000000000f0000 - 0000000000100000
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 130029
[    0.000000] Kernel command line: root=/dev/mapper/vglinux1-lv_ubuntu2 ro locale=fr_FR video=radeonfb:1280x1024@60 resume=/dev/mapper/vglinux1-lvswap
[    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
[    0.000000] mapped APIC to ffffb000 (01406000)
[    0.000000] Enabling fast FPU save and restore... done.
[    0.000000] Enabling unmasked SIMD FPU exception support... done.
[    0.000000] Initializing CPU#0
[    0.000000] CPU 0 irqstacks, hard=C03C2000 soft=C03C1000
[    0.000000] PID hash table entries: 2048 (order: 11, 8192 bytes)
[    0.000000] Detected 1410.216 MHz processor.
[   22.884407] Console: colour VGA+ 80x25
[   22.884416] console [tty0] enabled
[   22.886762] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[   22.886826] ... MAX_LOCKDEP_SUBCLASSES:    8
[   22.886874] ... MAX_LOCK_DEPTH:          30
[   22.886923] ... MAX_LOCKDEP_KEYS:        2048
[   22.886971] ... CLASSHASH_SIZE:           1024
[   22.887020] ... MAX_LOCKDEP_ENTRIES:     8192
[   22.887067] ... MAX_LOCKDEP_CHAINS:      16384
[   22.887115] ... CHAINHASH_SIZE:          8192
[   22.887164]  memory used by lock dependency info: 992 kB
[   22.887214]  per task-struct memory footprint: 1200 bytes
[   22.887265] ------------------------
[   22.887312] | Locking API testsuite:
[   22.887358] ----------------------------------------------------------------------------
[   22.887422]                                  | spin |wlock |rlock |mutex | wsem | rsem |
[   22.887484]   --------------------------------------------------------------------------
[   22.887557]                      A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.889187]                  A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.890642]              A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.892120]              A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.893598]          A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.895101]          A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.896609]          A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.898121]                     double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.899557]                   initialize held:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.900996]                  bad unlock order:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[   22.902460]   --------------------------------------------------------------------------
[   22.902523]               recursive read-lock:             |  ok  |             |  ok  |
[   22.903128]            recursive read-lock #2:             |  ok  |             |  ok  |
[   22.903732]             mixed read-write-lock:             |  ok  |             |  ok  |
[   22.904337]             mixed write-read-lock:             |  ok  |             |  ok  |
[   22.904945]   --------------------------------------------------------------------------
[   22.905009]      hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
[   22.905772]      soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
[   22.906532]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
[   22.907291]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
[   22.908049]        sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
[   22.908809]        sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
[   22.909568]          hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
[   22.910327]          soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
[   22.911086]          hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
[   22.911846]          soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
[   22.912605]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
[   22.913377]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
[   22.914148]     hard-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
[   22.914916]     soft-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
[   22.915686]     hard-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
[   22.916456]     soft-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
[   22.917227]     hard-safe-A + unsafe-B #1/231:  ok  |  ok  |  ok  |
[   22.917996]     soft-safe-A + unsafe-B #1/231:  ok  |  ok  |  ok  |
[   22.918766]     hard-safe-A + unsafe-B #1/312:  ok  |  ok  |  ok  |
[   22.919526]     soft-safe-A + unsafe-B #1/312:  ok  |  ok  |  ok  |
[   22.920290]     hard-safe-A + unsafe-B #1/321:  ok  |  ok  |  ok  |
[   22.921059]     soft-safe-A + unsafe-B #1/321:  ok  |  ok  |  ok  |
[   22.921827]     hard-safe-A + unsafe-B #2/123:  ok  |  ok  |  ok  |
[   22.922597]     soft-safe-A + unsafe-B #2/123:  ok  |  ok  |  ok  |
[   22.923368]     hard-safe-A + unsafe-B #2/132:  ok  |  ok  |  ok  |
[   22.924137]     soft-safe-A + unsafe-B #2/132:  ok  |  ok  |  ok  |
[   22.924907]     hard-safe-A + unsafe-B #2/213:  ok  |  ok  |  ok  |
[   22.925675]     soft-safe-A + unsafe-B #2/213:  ok  |  ok  |  ok  |
[   22.926446]     hard-safe-A + unsafe-B #2/231:  ok  |  ok  |  ok  |
[   22.927214]     soft-safe-A + unsafe-B #2/231:  ok  |  ok  |  ok  |
[   22.927984]     hard-safe-A + unsafe-B #2/312:  ok  |  ok  |  ok  |
[   22.928754]     soft-safe-A + unsafe-B #2/312:  ok  |  ok  |  ok  |
[   22.929525]     hard-safe-A + unsafe-B #2/321:  ok  |  ok  |  ok  |
[   22.930295]     soft-safe-A + unsafe-B #2/321:  ok  |  ok  |  ok  |
[   22.931066]       hard-irq lock-inversion/123:  ok  |  ok  |  ok  |
[   22.931839]       soft-irq lock-inversion/123:  ok  |  ok  |  ok  |
[   22.932615]       hard-irq lock-inversion/132:  ok  |  ok  |  ok  |
[   22.933389]       soft-irq lock-inversion/132:  ok  |  ok  |  ok  |
[   22.934162]       hard-irq lock-inversion/213:  ok  |  ok  |  ok  |
[   22.934934]       soft-irq lock-inversion/213:  ok  |  ok  |  ok  |
[   22.935709]       hard-irq lock-inversion/231:  ok  |  ok  |  ok  |
[   22.936479]       soft-irq lock-inversion/231:  ok  |  ok  |  ok  |
[   22.937251]       hard-irq lock-inversion/312:  ok  |  ok  |  ok  |
[   22.938021]       soft-irq lock-inversion/312:  ok  |  ok  |  ok  |
[   22.938796]       hard-irq lock-inversion/321:  ok  |  ok  |  ok  |
[   22.939568]       soft-irq lock-inversion/321:  ok  |  ok  |  ok  |
[   22.940341]       hard-irq read-recursion/123:  ok  |
[   22.940652]       soft-irq read-recursion/123:  ok  |
[   22.940963]       hard-irq read-recursion/132:  ok  |
[   22.941274]       soft-irq read-recursion/132:  ok  |
[   22.941587]       hard-irq read-recursion/213:  ok  |
[   22.941897]       soft-irq read-recursion/213:  ok  |
[   22.942208]       hard-irq read-recursion/231:  ok  |
[   22.942518]       soft-irq read-recursion/231:  ok  |
[   22.942829]       hard-irq read-recursion/312:  ok  |
[   22.943139]       soft-irq read-recursion/312:  ok  |
[   22.943450]       hard-irq read-recursion/321:  ok  |
[   22.943759]       soft-irq read-recursion/321:  ok  |
[   22.944071] -------------------------------------------------------
[   22.944126] Good, all 218 testcases passed! |
[   22.944174] ---------------------------------
[   22.944870] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[   22.945989] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[   22.969934] Memory: 510792k/524208k available (1734k kernel code, 12768k reserved, 883k data, 184k init, 0k highmem)
[   22.970014] virtual kernel memory layout:
[   22.970016]     fixmap  : 0xfffb5000 - 0xfffff000   ( 296 kB)
[   22.970019]     vmalloc : 0xe0800000 - 0xfffb3000   ( 503 MB)
[   22.970021]     lowmem  : 0xc0000000 - 0xdffec000   ( 511 MB)
[   22.970023]       .init : 0xc0390000 - 0xc03be000   ( 184 kB)
[   22.970025]       .data : 0xc02b18bc - 0xc038e504   ( 883 kB)
[   22.970027]       .text : 0xc0100000 - 0xc02b18bc   (1734 kB)
[   22.970368] Checking if this processor honours the WP bit even in supervisor mode... Ok.
[   23.051065] Calibrating delay using timer specific routine.. 2824.37 BogoMIPS (lpj=5648753)
[   23.051383] Mount-cache hash table entries: 512
[   23.052131] CPU: After generic identify, caps: 0383f9ff c1cbf9ff 00000000 00000000 00000000 00000000 00000000 00000000
[   23.052147] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[   23.052205] CPU: L2 Cache: 256K (64 bytes/line)
[   23.052255] CPU: After all inits, caps: 0383f9ff c1cbf9ff 00000000 00000420 00000000 00000000 00000000 00000000
[   23.052267] Intel machine check architecture supported.
[   23.052319] Intel machine check reporting enabled on CPU#0.
[   23.052376] Compat vDSO mapped to ffffe000.
[   23.052441] CPU: AMD Athlon(TM) XP 1600+ stepping 02
[   23.052557] Checking 'hlt' instruction... OK.
[   23.067300] Freeing SMP alternatives: 0k freed
[   23.067349] ACPI: Core revision 20070126
[   23.072552] Parsing all Control Methods:
[   23.072681] Table [DSDT](id 0001) - 356 Objects with 38 Devices 115 Methods 24 Regions
[   23.072782]  tbxface-0598 [00] tb_load_namespace     : ACPI Tables successfully acquired
[   23.072914] ACPI: setting ELCR to 0200 (from 0820)
[   23.073065] evxfevnt-0091 [00] enable                : Transition to ACPI mode successful
[   23.074112] khelper used greatest stack depth: 3172 bytes left
[   23.074581] net_namespace: 76 bytes
[   23.075809] NET: Registered protocol family 16
[   23.077047] ACPI: bus type pci registered
[   23.079899] PCI: PCI BIOS revision 2.10 entry at 0xf1180, last bus=1
[   23.080300] PCI: Using configuration type 1
[   23.080349] Setting up standard PCI resources
[   23.089819] khelper used greatest stack depth: 3064 bytes left
[   23.094177] evgpeblk-0956 [00] ev_create_gpe_block   : GPE 00 to 0F [_GPE] 2 regs on int 0x9
[   23.096633] evgpeblk-1052 [00] ev_initialize_gpe_bloc: Found 4 Wake, Enabled 0 Runtime GPEs in this block
[   23.096774] ACPI: EC: Look up EC in DSDT
[   23.100425] Completing Region/Field/Buffer/Package initialization:................................................
[   23.107399] Initialized 16/24 Regions 2/2 Fields 19/19 Buffers 11/14 Packages (365 nodes)
[   23.107501] Initializing Device/Processor/Thermal objects by executing _INI methods:
[   23.107592] Executed 0 _INI methods requiring 0 _STA executions (examined 41 objects)
[   23.107780] ACPI: Interpreter enabled
[   23.107830] ACPI: (supports S0 S1 S3 S4 S5)
[   23.108118] ACPI: Using PIC for interrupt routing
[   23.134108] ACPI: PCI Root Bridge [PCI0] (0000:00)
[   23.135036] PCI quirk: region e200-e27f claimed by vt82c686 HW-mon
[   23.135123] PCI quirk: region e800-e80f claimed by vt82c686 SMB
[   23.135717] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[   23.287800] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
[   23.288493] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
[   23.289235] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
[   23.290017] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
[   23.290844] Linux Plug and Play Support v0.97 (c) Adam Belay
[   23.291166] pnp: PnP ACPI init
[   23.291262] ACPI: bus type pnp registered
[   23.300586] pnp: PnP ACPI: found 14 devices
[   23.300648] ACPI: ACPI bus type pnp unregistered
[   23.302100] PCI: Using ACPI for IRQ routing
[   23.302153] PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
[   23.327027] Time: tsc clocksource has been installed.
[   23.335311] system 00:00: iomem range 0x0-0x9ffff could not be reserved
[   23.335371] system 00:00: iomem range 0xf0000-0xfffff could not be reserved
[   23.335430] system 00:00: iomem range 0x100000-0x1fffffff could not be reserved
[   23.335496] system 00:00: iomem range 0xfffe0000-0xffffffff could not be reserved
[   23.335579] system 00:02: ioport range 0x3f0-0x3f1 has been reserved
[   23.335636] system 00:02: ioport range 0x4d0-0x4d1 has been reserved
[   23.335708] system 00:03: ioport range 0xe400-0xe47f has been reserved
[   23.335765] system 00:03: ioport range 0xe800-0xe80f has been reserved
[   23.368445] PCI: Bridge: 0000:00:01.0
[   23.368500]   IO window: d000-dfff
[   23.368551]   MEM window: b7000000-b7ffffff
[   23.368602]   PREFETCH window: b8000000-cfffffff
[   23.368677] PCI: Setting latency timer of device 0000:00:01.0 to 64
[   23.368757] NET: Registered protocol family 2
[   23.403206] IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
[   23.404092] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[   23.404621] TCP bind hash table entries: 16384 (order: 7, 524288 bytes)
[   23.406957] TCP: Hash tables configured (established 16384 bind 16384)
[   23.407128] TCP reno registered
[   23.420600] checking if image is initramfs... it is
[   23.870982] Switched to NOHz mode on CPU #0
[   23.918866] Freeing initrd memory: 3197k freed
[   23.919595] Simple Boot Flag at 0x3a set to 0x1
[   23.923543] io scheduler noop registered
[   23.923604] io scheduler anticipatory registered
[   23.923654] io scheduler deadline registered
[   23.923727] io scheduler cfq registered (default)
[   23.923792] Applying VIA southbridge workaround.
[   23.923845] PCI: VIA PCI bridge detected. Disabling DAC.
[   23.923898] PCI: Disabling Via external APIC routing
[   23.923989] Boot video device is 0000:01:00.0
[   23.925776] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
[   23.925837] PCI: setting IRQ 11 as level-triggered
[   23.925844] ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
[   23.926143] radeonfb: Found Intel x86 BIOS ROM Image
[   23.926200] radeonfb: Retrieved PLL infos from BIOS
[   23.926252] radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=250.00 Mhz, System=166.00 MHz
[   23.926317] radeonfb: PLL min 20000 max 40000
[   24.006702] i2c-adapter i2c-1: unable to read EDID block.
[   24.130631] i2c-adapter i2c-1: unable to read EDID block.
[   24.254602] i2c-adapter i2c-1: unable to read EDID block.
[   24.378592] i2c-adapter i2c-3: unable to read EDID block.
[   24.502559] i2c-adapter i2c-3: unable to read EDID block.
[   24.626531] i2c-adapter i2c-3: unable to read EDID block.
[   24.890696] radeonfb: Monitor 1 type CRT found
[   24.890763] radeonfb: EDID probed
[   24.890810] radeonfb: Monitor 2 type no found
[   24.939657] Console: switching to colour frame buffer device 160x64
[   24.978303] radeonfb (0000:01:00.0): ATI Radeon Ya 
[   24.979794] input: Power Button (FF) as /class/input/input0
[   24.979882] ACPI: Power Button (FF) [PWRF]
[   24.980293] input: Power Button (CM) as /class/input/input1
[   24.980367] ACPI: Power Button (CM) [PWRB]
[   24.981341] ACPI: CPU0 (power states: C1[C1] C2[C2])
[   24.981457] ACPI: Processor [CPU0] (supports 16 throttling states)
[   25.015443] RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
[   25.016038] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
[   25.016973] serio: i8042 KBD port at 0x60,0x64 irq 1
[   25.017057] serio: i8042 AUX port at 0x60,0x64 irq 12
[   25.018012] mice: PS/2 mouse device common for all mice
[   25.029741] Marking TSC unstable due to: possible TSC halt in C2.
[   25.030719] Time: acpi_pm clocksource has been installed.
[   25.039605] input: AT Translated Set 2 keyboard as /class/input/input2
[   25.054670] TCP cubic registered
[   25.054759] NET: Registered protocol family 1
[   25.054863] Using IPI Shortcut mode
[   25.056562] BIOS EDD facility v0.16 2004-Jun-25, 6 devices found
[   25.058363] Freeing unused kernel memory: 184k freed
[   25.058540] Write protecting the kernel text: 1736k
[   25.058606] Write protecting the kernel read-only data: 700k
[   25.061832] busybox used greatest stack depth: 2900 bytes left
[   25.072401] exe used greatest stack depth: 2864 bytes left
[   25.175794] exe used greatest stack depth: 2652 bytes left
[   25.240831] input: ImExPS/2 Generic Explorer Mouse as /class/input/input3
[   25.261513] consolechars used greatest stack depth: 2556 bytes left
[   25.483644] device-mapper: uevent: version 1.0.3
[   25.484060] device-mapper: ioctl: 4.13.0-ioctl (2007-10-18) initialised: dm-devel@redhat.com
[   25.506481] SCSI subsystem initialized
[   25.517545] libata version 3.00 loaded.
[   25.520583] pata_via 0000:00:04.1: version 0.3.3
[   25.521256] scsi0 : pata_via
[   25.521711] scsi1 : pata_via
[   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xb800 irq 14
[   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xb808 irq 15
[   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
[   25.683208] ata1.00: 78165360 sectors, multi 16: LBA 
[   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
[   25.684116] ata1.01: 160086528 sectors, multi 16: LBA 
[   25.691127] ata1.00: configured for UDMA/100
[   25.699142] ata1.01: configured for UDMA/100
[   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max UDMA/33
[   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
[   26.330839] ata2.00: configured for UDMA/33
[   26.490828] ata2.01: configured for MWDMA2
[   26.503014] scsi 0:0:0:0: Direct-Access     ATA      ST340016A        3.75 PQ: 0 ANSI: 5
[   26.504670] scsi 0:0:1:0: Direct-Access     ATA      Maxtor 6Y080L0   YAR4 PQ: 0 ANSI: 5
[   26.509842] scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRAM GSA-4165B DL05 PQ: 0 ANSI: 5
[   26.511673] scsi 1:0:1:0: CD-ROM            E-IDE    CD-950E/AKU      A4Q  PQ: 0 ANSI: 5
[   26.513278] modprobe used greatest stack depth: 2152 bytes left
[   26.527262] sd 0:0:0:0: [sda] 78165360 512-byte hardware sectors (40021 MB)
[   26.528154] sd 0:0:0:0: [sda] Write Protect is off
[   26.528961] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   26.529033] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   26.530146] sd 0:0:0:0: [sda] 78165360 512-byte hardware sectors (40021 MB)
[   26.531224] sd 0:0:0:0: [sda] Write Protect is off
[   26.532082] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   26.532151] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   26.533090]  sda: sda1 sda2 sda3 < sda5 sda6 sda7 > sda4
[   26.581919] sd 0:0:0:0: [sda] Attached SCSI disk
[   26.583274] sd 0:0:1:0: [sdb] 160086528 512-byte hardware sectors (81964 MB)
[   26.584268] sd 0:0:1:0: [sdb] Write Protect is off
[   26.585211] sd 0:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[   26.585282] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   26.586508] sd 0:0:1:0: [sdb] 160086528 512-byte hardware sectors (81964 MB)
[   26.587530] sd 0:0:1:0: [sdb] Write Protect is off
[   26.588492] sd 0:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[   26.588564] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   26.589603]  sdb: sdb1 sdb2 < sdb5 sdb6 sdb7 >
[   26.639851] sd 0:0:1:0: [sdb] Attached SCSI disk
[   26.641160] modprobe used greatest stack depth: 1964 bytes left
[   29.780014] Attempting manual resume
[   29.844555] ReiserFS: dm-10: found reiserfs format "3.6" with standard journal
[   29.845623] ReiserFS: dm-10: using ordered data mode
[   29.876338] ReiserFS: dm-10: journal params: device dm-10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   29.881733] ReiserFS: dm-10: checking transaction log (dm-10)
[   29.994198] ReiserFS: dm-10: Using r5 hash to sort names
[   29.995523] exe used greatest stack depth: 1620 bytes left
[   41.338877] Linux agpgart interface v0.102
[   41.365338] agpgart: Detected VIA Twister-K/KT133x/KM133 chipset
[   41.455791] agpgart: AGP aperture is 256M @ 0xd0000000
[   41.639237] parport_pc: VIA 686A/8231 detected
[   41.639244] parport_pc: probing current configuration
[   41.639266] parport_pc: Current parallel port base: 0x378
[   41.639446] parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE]
[   41.736002] parport_pc: VIA parallel port: io=0x378, irq=7
[   41.826262] usbcore: registered new interface driver usbfs
[   41.827814] usbcore: registered new interface driver hub
[   41.838329] usbcore: registered new device driver usb
[   41.842033] via686a 0000:00:04.4: Forcing ISA address 0xe200
[   41.843094] via686a 0000:00:04.4: Enabling sensors
[   41.857018] ACPI: I/O resource vt596_smbus [0xe800-0xe807] conflicts with ACPI region SM00 [0xe800-0xe806]
[   41.858112] ACPI: Device needs an ACPI driver
[   41.859147] vt596_smbus: probe of 0000:00:04.4 failed with error -16
[   41.874600] PCI: Enabling device 0000:00:09.0 (0014 -> 0016)
[   41.876608] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5
[   41.877642] PCI: setting IRQ 5 as level-triggered
[   41.877649] ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKD] -> GSI 5 (level, low) -> IRQ 5
[   41.892852] input: PC Speaker as /class/input/input4
[   41.941506] ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
[   41.958451] USB Universal Host Controller Interface driver v3.0
[   42.017891] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   42.032784] sd 0:0:1:0: Attached scsi generic sg1 type 0
[   42.034064] scsi 1:0:0:0: Attached scsi generic sg2 type 5
[   42.035203] scsi 1:0:1:0: Attached scsi generic sg3 type 5
[   42.104236] sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
[   42.105276] Uniform CD-ROM driver Revision: 3.20
[   42.123106] sr 1:0:0:0: Attached scsi CD-ROM sr0
[   42.135665] sr1: scsi3-mmc drive: 0x/48x cd/rw xa/form2 cdda tray
[   42.145075] Real Time Clock Driver v1.12ac
[   42.155668] sr 1:0:1:0: Attached scsi CD-ROM sr1
[   42.277609] Floppy drive(s): fd0 is 1.44M
[   42.299599] FDC 0 is a post-1991 82077
[   42.336396] ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[5]  MMIO=[b6800000-b68007ff]  Max Packet=[2048]  IR/IT contexts=[8/8]
[   42.376753] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
[   42.392166] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[   42.419779] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[   42.467724] PCI: Enabling device 0000:00:0b.0 (0000 -> 0001)
[   42.470235] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 11
[   42.471237] ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKB] -> GSI 11 (level, low) -> IRQ 11
[   42.474128] eth0: RealTek RTL-8029 found at 0x9400, IRQ 11, 00:40:95:46:6e:2d.
[   42.475610] ACPI: PCI Interrupt 0000:00:04.2[D] -> Link [LNKD] -> GSI 5 (level, low) -> IRQ 5
[   42.476711] uhci_hcd 0000:00:04.2: UHCI Host Controller
[   42.478965] uhci_hcd 0000:00:04.2: new USB bus registered, assigned bus number 1
[   42.480365] uhci_hcd 0000:00:04.2: irq 5, io base 0x0000b400
[   42.482290] usb usb1: configuration #1 chosen from 1 choice
[   42.483685] hub 1-0:1.0: USB hub found
[   42.484993] hub 1-0:1.0: 2 ports detected
[   42.552689] 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[   42.554316] 00:0b: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[   42.588475] usb usb1: new device found, idVendor=0000, idProduct=0000
[   42.589655] usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
[   42.590616] usb usb1: Product: UHCI Host Controller
[   42.592260] usb usb1: Manufacturer: Linux 2.6.24-rc3-mm1 uhci_hcd
[   42.593218] usb usb1: SerialNumber: 0000:00:04.2
[   42.594289] ACPI: PCI Interrupt 0000:00:04.3[D] -> Link [LNKD] -> GSI 5 (level, low) -> IRQ 5
[   42.595339] uhci_hcd 0000:00:04.3: UHCI Host Controller
[   42.596598] uhci_hcd 0000:00:04.3: new USB bus registered, assigned bus number 2
[   42.597652] uhci_hcd 0000:00:04.3: irq 5, io base 0x0000b000
[   42.599191] usb usb2: configuration #1 chosen from 1 choice
[   42.600584] hub 2-0:1.0: USB hub found
[   42.601596] hub 2-0:1.0: 2 ports detected
[   42.704030] usb usb2: new device found, idVendor=0000, idProduct=0000
[   42.705073] usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
[   42.706080] usb usb2: Product: UHCI Host Controller
[   42.707068] usb usb2: Manufacturer: Linux 2.6.24-rc3-mm1 uhci_hcd
[   42.708509] usb usb2: SerialNumber: 0000:00:04.3
[   42.811540] usb 1-1: new full speed USB device using uhci_hcd and address 2
[   42.974026] usb 1-1: configuration #1 chosen from 1 choice
[   42.978609] hub 1-1:1.0: USB hub found
[   42.981379] hub 1-1:1.0: 4 ports detected
[   43.099312] usb 1-1: new device found, idVendor=05e3, idProduct=0606
[   43.100430] usb 1-1: new device strings: Mfr=0, Product=1, SerialNumber=0
[   43.101457] usb 1-1: Product: USB2.0 Hub
[   43.467102] PCI: Enabling device 0000:00:0d.0 (0004 -> 0005)
[   43.470645] ACPI: PCI Interrupt 0000:00:0d.0[A] -> Link [LNKD] -> GSI 5 (level, low) -> IRQ 5
[   43.728051] ieee1394: Host added: ID:BUS[0-00:1023]  GUID[00308d0120e085ca]
[   59.341540] ReiserFS: dm-0: found reiserfs format "3.6" with standard journal
[   59.341569] ReiserFS: dm-0: using ordered data mode
[   59.346608] ReiserFS: dm-0: journal params: device dm-0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   59.349822] ReiserFS: dm-0: checking transaction log (dm-0)
[   59.404729] ReiserFS: dm-0: Using r5 hash to sort names
[   59.614980] Loading Reiser4. See www.namesys.com for a description of Reiser4.
[   59.616355] modprobe used greatest stack depth: 1552 bytes left
[   59.631731] reiser4: dm-5: found disk format 4.0.0.
[   59.890236] reiser4: dm-4: found disk format 4.0.0.
[   60.091592] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
[   60.091681] kjournald starting.  Commit interval 5 seconds
[   60.091911] EXT3 FS on sda5, internal journal
[   60.091989] EXT3-fs: mounted filesystem with ordered data mode.
[   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
[   60.216124] end_request: I/O error, dev sda, sector 16460
[   60.245789] ReiserFS: sda7: found reiserfs format "3.6" with standard journal
[   60.245829] ReiserFS: sda7: using ordered data mode
[   60.251474] ReiserFS: sda7: journal params: device sda7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   60.254845] ReiserFS: sda7: checking transaction log (sda7)
[   60.297870] ReiserFS: sda7: Using r5 hash to sort names
[   60.351096] sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
[   60.351107] end_request: I/O error, dev sdb, sector 19632
[   60.392984] sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
[   60.392995] end_request: I/O error, dev sdb, sector 40037363
[   60.611920] Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
[   67.273626] lp0: using parport0 (interrupt-driven).
[   67.991472] [drm] Initialized drm 1.1.0 20060810
[   68.024210] [drm] Initialized radeon 1.28.0 20060524 on minor 0
[   69.882699] agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
[   69.883411] agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
[   69.883933] agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[   70.305388] [drm] Setting GART location based on new memory map
[   70.305410] [drm] Loading R200 Microcode
[   70.305452] [drm] writeback test succeeded in 1 usecs
[   70.920327] warning: process `avahi-daemon' sets w/ old libcap
[   70.927975] warning: process `avahi-daemon' sets w/ old libcap
[   70.931271] warning: process `avahi-daemon' sets w/ old libcap
[   70.945490] warning: process `avahi-daemon' sets w/ old libcap
[  323.784537] agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
[  323.784574] agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
[  323.784636] agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[  323.784658] [drm] Loading R200 Microcode
[  334.964720] dmesg used greatest stack depth: 1292 bytes left

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 22:59                   ` Laurent Riffard
@ 2007-11-25  7:37                     ` James Bottomley
  2007-11-25 20:39                       ` Laurent Riffard
  0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2007-11-25  7:37 UTC (permalink / raw)
  To: Laurent Riffard
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

On Sat, 2007-11-24 at 23:59 +0100, Laurent Riffard wrote:
> Le 24.11.2007 14:26, James Bottomley a écrit :
> > OK, could you post dmesgs again, please.  I actually tested this
> with an
> > aic79xx card, and for me it does cause Domain Validation to succeed
> > again.
> 
> James, 
> 
> Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates
> the 
> BLOCK and QUIESCE states
> correctly" (http://lkml.org/lkml/2007/11/24/8).
> 
> How to reproduce :
> - boot
> - switch to a text console
> - capture dmesg in a file, sync, etc. There are 3 I/O errors, but the 
>   system does work.
> - switch to X console, log in the Gnome Desktop, the system partially 
>   hangs.
> - switch back to a text console: dmesg(1) still works, it shows some 
>   additonal I/O errors. At this point, any disk access makes the
> system 
>   completely hung.
> 
> Additionnal data:
> - the I/O errors always happen on the same blocks.
> 
> plain text document attachment (dmesg-2.6.24-rc3-mm1-patched)
[...]
> [   25.521256] scsi0 : pata_via
> [   25.521711] scsi1 : pata_via
> [   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma
> 0xb800 irq 14
> [   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma
> 0xb808 irq 15
> [   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
> [   25.683208] ata1.00: 78165360 sectors, multi 16: LBA 
> [   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
> [   25.684116] ata1.01: 160086528 sectors, multi 16: LBA 
> [   25.691127] ata1.00: configured for UDMA/100
> [   25.699142] ata1.01: configured for UDMA/100
> [   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max
> UDMA/33
> [   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
> [   26.330839] ata2.00: configured for UDMA/33
> [   26.490828] ata2.01: configured for MWDMA2
> [   26.503014] scsi 0:0:0:0: Direct-Access     ATA      ST340016A
> 3.75 PQ: 0 ANSI: 5
> [   26.504670] scsi 0:0:1:0: Direct-Access     ATA      Maxtor 6Y080L0
> YAR4 PQ: 0 ANSI: 5
> [   26.509842] scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRAM
> GSA-4165B DL05 PQ: 0 ANSI: 5
> [   26.511673] scsi 1:0:1:0: CD-ROM            E-IDE    CD-950E/AKU
> A4Q  PQ: 0 ANSI: 5
[...]
> [   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT
> driverbyte=DRIVER_OK,SUGGEST_OK
> [   60.216124] end_request: I/O error, dev sda, sector 16460

I think this one's quite easy:  PATA devices in libata are queue depth 1
(since they don't do NCQ).  Thus, they're peculiarly sensitive to the
bug where we fail over queue depth requests.

On the other hand, I don't see how a filesystem request is getting
REQ_FAILFAST ... unless there's a bio or readahead issue involved.
Anyway, could you try this patch:

http://marc.info/?l=linux-scsi&m=119592627425498

Which should fix the queue depth issue, and see if the errors go away?

Thanks,

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-25  7:37                     ` James Bottomley
@ 2007-11-25 20:39                       ` Laurent Riffard
  2007-11-28 21:38                         ` Laurent Riffard
  0 siblings, 1 reply; 35+ messages in thread
From: Laurent Riffard @ 2007-11-25 20:39 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

Le 25.11.2007 08:37, James Bottomley a écrit :
> On Sat, 2007-11-24 at 23:59 +0100, Laurent Riffard wrote:
>> Le 24.11.2007 14:26, James Bottomley a écrit :
>>> OK, could you post dmesgs again, please.  I actually tested this
>> with an
>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>> again.
>> James, 
>>
>> Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates
>> the 
>> BLOCK and QUIESCE states
>> correctly" (http://lkml.org/lkml/2007/11/24/8).
>>
>> How to reproduce :
>> - boot
>> - switch to a text console
>> - capture dmesg in a file, sync, etc. There are 3 I/O errors, but the 
>>   system does work.
>> - switch to X console, log in the Gnome Desktop, the system partially 
>>   hangs.
>> - switch back to a text console: dmesg(1) still works, it shows some 
>>   additonal I/O errors. At this point, any disk access makes the system 
>>   completely hung.
>>
>> Additionnal data:
>> - the I/O errors always happen on the same blocks.
>>
>> plain text document attachment (dmesg-2.6.24-rc3-mm1-patched)
> [...]
>> [   25.521256] scsi0 : pata_via
>> [   25.521711] scsi1 : pata_via
>> [   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xb800 irq 14
>> [   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xb808 irq 15
>> [   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
>> [   25.683208] ata1.00: 78165360 sectors, multi 16: LBA 
>> [   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
>> [   25.684116] ata1.01: 160086528 sectors, multi 16: LBA 
>> [   25.691127] ata1.00: configured for UDMA/100
>> [   25.699142] ata1.01: configured for UDMA/100
>> [   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max UDMA/33
>> [   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
>> [   26.330839] ata2.00: configured for UDMA/33
>> [   26.490828] ata2.01: configured for MWDMA2
>> [   26.503014] scsi 0:0:0:0: Direct-Access     ATA      ST340016A 3.75 PQ: 0 ANSI: 5
>> [   26.504670] scsi 0:0:1:0: Direct-Access     ATA      Maxtor 6Y080L0 YAR4 PQ: 0 ANSI: 5
>> [   26.509842] scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRAM GSA-4165B DL05 PQ: 0 ANSI: 5
>> [   26.511673] scsi 1:0:1:0: CD-ROM            E-IDE    CD-950E/AKU A4Q  PQ: 0 ANSI: 5
> [...]
>> [   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>> [   60.216124] end_request: I/O error, dev sda, sector 16460
> 
> I think this one's quite easy:  PATA devices in libata are queue depth 1
> (since they don't do NCQ).  Thus, they're peculiarly sensitive to the
> bug where we fail over queue depth requests.
> 
> On the other hand, I don't see how a filesystem request is getting
> REQ_FAILFAST ... unless there's a bio or readahead issue involved.
> Anyway, could you try this patch:
> 
> http://marc.info/?l=linux-scsi&m=119592627425498
> 
> Which should fix the queue depth issue, and see if the errors go away?

No, this one doesn't help...

-- 
laurent
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-24 17:44           ` James Bottomley
@ 2007-11-26  7:54             ` Hannes Reinecke
  0 siblings, 0 replies; 35+ messages in thread
From: Hannes Reinecke @ 2007-11-26  7:54 UTC (permalink / raw)
  To: James Bottomley
  Cc: Laurent Riffard, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

On Sat, Nov 24, 2007 at 07:44:13PM +0200, James Bottomley wrote:
> Probing intermittent failures in Domain Validation, even with the fixes
> applied leads me to the conclusion that there are further problems with
> this commit:
> 
> commit fc5eb4facedbd6d7117905e775cee1975f894e79
> Author: Hannes Reinecke <hare@suse.de>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
>     [SCSI] Do not requeue requests if REQ_FAILFAST is set
>  
> The essence of the problems is that you're causing REQ_FAILFAST to
> terminate commands with error on requeuing conditions, some of which are
> relatively common on most SCSI devices.  While this may be the correct
> behaviour for multi-path, it's certainly wrong for the previously
> understood meaning of REQ_FAILFAST, which was don't retry on error,
> which is why domain validation and other applications use it to control
> error handling, but don't expect to get failures for a simple requeue
> are now spitting errors.
> 
> I honestly can't see that, even for the multi-path case, returning an
> error when we're over queue depth is the correct thing to do (it may not
> matter to something like a symmetrix, but an array that has a non-zero
> cost associated with a path change, like a CPQ HSV or the AVT
> controllers, will show fairly large slow downs if you do this).  Even if
> this is the desired behaviour (and I think that's a policy issue),
> DID_NO_CONNECT is almost certainly the wrong error to be sending back.
> 
> This patch fixes up domain validation to work again correctly, however,
> I really think it's just a bandaid.  Do you want to rethink the above
> commit?
> 
Given the amounted error, yes, I'll have to.
But we still face the initial problem that requeued requests will be
stuck in the queue forever (ie until the timeout catches it), causing
failover to be painfully slow.

Anyway, I'll think it over.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Markus Rex, HRB 16746 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-11-23  5:55     ` 2.6.24-rc3-mm1 Gabriel C
@ 2007-11-27  6:15       ` Andrew Morton
  2007-12-11 16:33         ` 2.6.24-rc3-mm1 James Bottomley
  0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2007-11-27  6:15 UTC (permalink / raw)
  To: Gabriel C; +Cc: linux-kernel, linux-scsi, Hannes Reinecke, James Bottomley

On Fri, 23 Nov 2007 06:55:41 +0100 Gabriel C <nix.or.die@googlemail.com> wrote:

> Andrew Morton wrote:
> > On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <nix.or.die@googlemail.com> wrote:
> > 
> >> I have some warnings on each SCSI disc:
> >>
> >>
> >> ...
> >>
> >> [   30.724410] scsi 0:0:0:0: Direct-Access     SEAGATE  ST318406LW       0109 PQ: 0 ANSI: 3
> >> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
> >> [   30.724435]  target0:0:0: Beginning Domain Validation
> >> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
> >> [   30.724572]  target0:0:0: Ending Domain Validation
> >> [   30.729747] scsi 0:0:1:0: Direct-Access     FUJITSU  MAH3182MP        0114 PQ: 0 ANSI: 4
> >> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
> >> [   30.729771]  target0:0:1: Beginning Domain Validation
> >> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
> >> [   30.729908]  target0:0:1: Ending Domain Validation
> >>
> > 
> > Don't know what would have caused that.  But yes, something is wrong in
> > scsi land.
> 
> Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c and I still can boot ;)
> 
> > 
> >> no idea whatever this is related but buffered disk reads are 2.XX MB/sec and the box is somewhat laggy.
> >>
> >> hdparm -t on sda and sdb reports :
> >>
> >> /dev/sda:
> >>  Timing buffered disk reads:    8 MB in  3.26 seconds =   2.46 MB/sec
> >>
> >> /dev/sdb:
> >>  Timing buffered disk reads:    8 MB in  3.56 seconds =   2.25 MB/sec
> >>
> >> My IDE discs are fine.
> >>
> >> Please let me know if you need my config or any other informations.
> >>
> > 
> > And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
> > 
> 
> I found the commit which cause these problems , it is in git-scsi-misc patch and reverting it fixes both problems for me.
> 
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d
> 

OK, thanks.  I'll assume that James and Hannes have this in hand (or will
have, by mid-week) and I won't do anything here.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1: I/O error, system hangs
  2007-11-25 20:39                       ` Laurent Riffard
@ 2007-11-28 21:38                         ` Laurent Riffard
  0 siblings, 0 replies; 35+ messages in thread
From: Laurent Riffard @ 2007-11-28 21:38 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Andrew Morton, linux-kernel, linux-ide,
	linux-scsi

Le 25.11.2007 21:39, Laurent Riffard a écrit :
> Le 25.11.2007 08:37, James Bottomley a écrit :
>> On Sat, 2007-11-24 at 23:59 +0100, Laurent Riffard wrote:
>>> Le 24.11.2007 14:26, James Bottomley a écrit :
>>>> OK, could you post dmesgs again, please.  I actually tested this
>>> with an
>>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>>> again.
>>> James, 
>>>
>>> Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates
>>> the 
>>> BLOCK and QUIESCE states
>>> correctly" (http://lkml.org/lkml/2007/11/24/8).
>>>
[...]
>>> [   25.521256] scsi0 : pata_via
>>> [   25.521711] scsi1 : pata_via
>>> [   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xb800 irq 14
>>> [   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xb808 irq 15
>>> [   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
>>> [   25.683208] ata1.00: 78165360 sectors, multi 16: LBA 
>>> [   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
>>> [   25.684116] ata1.01: 160086528 sectors, multi 16: LBA 
>>> [   25.691127] ata1.00: configured for UDMA/100
>>> [   25.699142] ata1.01: configured for UDMA/100
>>> [   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max UDMA/33
>>> [   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
>>> [   26.330839] ata2.00: configured for UDMA/33
>>> [   26.490828] ata2.01: configured for MWDMA2
>>> [   26.503014] scsi 0:0:0:0: Direct-Access     ATA      ST340016A 3.75 PQ: 0 ANSI: 5
>>> [   26.504670] scsi 0:0:1:0: Direct-Access     ATA      Maxtor 6Y080L0 YAR4 PQ: 0 ANSI: 5
>>> [   26.509842] scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRAM GSA-4165B DL05 PQ: 0 ANSI: 5
>>> [   26.511673] scsi 1:0:1:0: CD-ROM            E-IDE    CD-950E/AKU A4Q  PQ: 0 ANSI: 5
>> [...]
>>> [   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>> [   60.216124] end_request: I/O error, dev sda, sector 16460
>> I think this one's quite easy:  PATA devices in libata are queue depth 1
>> (since they don't do NCQ).  Thus, they're peculiarly sensitive to the
>> bug where we fail over queue depth requests.
>>
>> On the other hand, I don't see how a filesystem request is getting
>> REQ_FAILFAST ... unless there's a bio or readahead issue involved.
>> Anyway, could you try this patch:
>>
>> http://marc.info/?l=linux-scsi&m=119592627425498
>>
>> Which should fix the queue depth issue, and see if the errors go away?
> 
> No, this one doesn't help...
 
still happens with 2.6.24-rc3-mm2...
-- 
laurent

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-11-27  6:15       ` 2.6.24-rc3-mm1 Andrew Morton
@ 2007-12-11 16:33         ` James Bottomley
  2007-12-12 10:08           ` 2.6.24-rc3-mm1 Boaz Harrosh
  2007-12-14  9:00           ` 2.6.24-rc3-mm1 Hannes Reinecke
  0 siblings, 2 replies; 35+ messages in thread
From: James Bottomley @ 2007-12-11 16:33 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Gabriel C, linux-kernel, linux-scsi, Hannes Reinecke


On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> have, by mid-week) and I won't do anything here.

Just to confirm what I think I'm going to be doing:  rebasing the
scsi-misc tree to remove this commit:

commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
Author: Hannes Reinecke <hare@suse.de>
Date:   Tue Nov 6 09:23:40 2007 +0100

    [SCSI] Do not requeue requests if REQ_FAILFAST is set

And its allied fix ups:

commit 983289045faa96fba8841d3c51b98bb8623d9504
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Sat Nov 24 19:47:25 2007 +0200

    [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE

commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Sat Nov 24 19:55:53 2007 +0200

    [SCSI] fix domain validation to work again

James



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-12-11 16:33         ` 2.6.24-rc3-mm1 James Bottomley
@ 2007-12-12 10:08           ` Boaz Harrosh
  2007-12-12 11:03             ` [PATCH] REQ-flags to/from BIO-flags bugfix Boaz Harrosh
  2007-12-12 11:36             ` 2.6.24-rc3-mm1 Jens Axboe
  2007-12-14  9:00           ` 2.6.24-rc3-mm1 Hannes Reinecke
  1 sibling, 2 replies; 35+ messages in thread
From: Boaz Harrosh @ 2007-12-12 10:08 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe
  Cc: Andrew Morton, Gabriel C, linux-kernel, linux-scsi,
	Hannes Reinecke

On Tue, Dec 11 2007 at 18:33 +0200, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
>> have, by mid-week) and I won't do anything here.
> 
> Just to confirm what I think I'm going to be doing:  rebasing the
> scsi-misc tree to remove this commit:
> 
> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> Author: Hannes Reinecke <hare@suse.de>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
>     [SCSI] Do not requeue requests if REQ_FAILFAST is set
> 
> And its allied fix ups:
> 
> commit 983289045faa96fba8841d3c51b98bb8623d9504
> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> Date:   Sat Nov 24 19:47:25 2007 +0200
> 
>     [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> 
> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> Date:   Sat Nov 24 19:55:53 2007 +0200
> 
>     [SCSI] fix domain validation to work again
> 
> James
> 

The problems caused by this patch where nagging me at the back of my head
from the begging. Why should we fail on a check of FAIL_FAST in all kind
of weird places like boots, when the only place that should ever set the 
flag should be one of the multi-path drivers. finally it struck me:

It might be a bug in ll_rw_blk at blk_rq_bio_prep() there is this:

static void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
			    struct bio *bio)
{
	/* first two bits are identical in rq->cmd_flags and bio->bi_rw */
	rq->cmd_flags |= (bio->bi_rw & 3);
	...

Now this is no longer true and is a bug.
Second bit of bio->bi_rw defined in bio.h is:
#define BIO_RW_AHEAD	1
but
Second bit of rq->cmd_flags is __REQ_FAILFAST

so maybe we are getting FAILFAST in the wrong places?

(I will look for an old patch I sent a year ago that fixes
this bug)

Boaz
 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] REQ-flags to/from BIO-flags bugfix
  2007-12-12 10:08           ` 2.6.24-rc3-mm1 Boaz Harrosh
@ 2007-12-12 11:03             ` Boaz Harrosh
  2007-12-12 15:18               ` Matthew Wilcox
  2007-12-12 11:36             ` 2.6.24-rc3-mm1 Jens Axboe
  1 sibling, 1 reply; 35+ messages in thread
From: Boaz Harrosh @ 2007-12-12 11:03 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe
  Cc: Andrew Morton, Gabriel C, linux-kernel, linux-scsi,
	Hannes Reinecke


 - BIO flags bio->bi_rw and REQ flags req->cmd_flags no longer match.
   Remove comments and do a proper translation between the 2 systems.

 (Please look in ll_rw_blk.c/blk_rq_bio_prep() below if we need more flags)

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 block/ll_rw_blk.c            |   23 +++++++++++++++++------
 include/linux/blktrace_api.h |    8 +++++++-
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 8b91994..c6a84bb 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1990,10 +1990,6 @@ blk_alloc_request(struct request_queue *q, int rw, int priv, gfp_t gfp_mask)
 	if (!rq)
 		return NULL;
 
-	/*
-	 * first three bits are identical in rq->cmd_flags and bio->bi_rw,
-	 * see bio.h and blkdev.h
-	 */
 	rq->cmd_flags = rw | REQ_ALLOCED;
 
 	if (priv) {
@@ -3772,8 +3768,23 @@ EXPORT_SYMBOL(end_request);
 static void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 			    struct bio *bio)
 {
-	/* first two bits are identical in rq->cmd_flags and bio->bi_rw */
-	rq->cmd_flags |= (bio->bi_rw & 3);
+	if (bio_data_dir(bio))
+		rq->cmd_flags |= REQ_RW;
+	else
+		rq->cmd_flags &= ~REQ_RW;
+
+	if (bio->bi_rw & (1<<BIO_RW_SYNC))
+		rq->cmd_flags |= REQ_RW_SYNC;
+	else
+		rq->cmd_flags &= ~REQ_RW_SYNC;
+	/* FIXME: what about other flags, should we sync these too? */
+	/*
+	BIO_RW_AHEAD	==> ??
+	BIO_RW_BARRIER	==> REQ_SOFTBARRIER/REQ_HARDBARRIER
+	BIO_RW_FAILFAST	==> REQ_FAILFAST
+	BIO_RW_SYNC	==> REQ_RW_SYNC
+	BIO_RW_META	==> REQ_RW_META
+	*/
 
 	rq->nr_phys_segments = bio_phys_segments(q, bio);
 	rq->nr_hw_segments = bio_hw_segments(q, bio);
diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index 7e11d23..9e7ce65 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -165,7 +165,13 @@ static inline void blk_add_trace_rq(struct request_queue *q, struct request *rq,
 				    u32 what)
 {
 	struct blk_trace *bt = q->blk_trace;
-	int rw = rq->cmd_flags & 0x03;
+	/* blktrace.c prints them according to bio flags */
+	int rw = (((rq_rw_dir(rq) == WRITE) << BIO_RW) |
+	          (((rq->cmd_flags & (REQ_SOFTBARRIER|REQ_HARDBARRIER)) != 0) <<
+	           BIO_RW_BARRIER) |
+	          (((rq->cmd_flags & REQ_FAILFAST) != 0) << BIO_RW_FAILFAST) |
+	          (((rq->cmd_flags & REQ_RW_SYNC) != 0) << BIO_RW_SYNC) |
+	          (((rq->cmd_flags & REQ_RW_META) != 0) << BIO_RW_META));
 
 	if (likely(!bt))
 		return;
-- 
1.5.3.3



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-12-12 10:08           ` 2.6.24-rc3-mm1 Boaz Harrosh
  2007-12-12 11:03             ` [PATCH] REQ-flags to/from BIO-flags bugfix Boaz Harrosh
@ 2007-12-12 11:36             ` Jens Axboe
  1 sibling, 0 replies; 35+ messages in thread
From: Jens Axboe @ 2007-12-12 11:36 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: James Bottomley, Andrew Morton, Gabriel C, linux-kernel,
	linux-scsi, Hannes Reinecke

On Wed, Dec 12 2007, Boaz Harrosh wrote:
> On Tue, Dec 11 2007 at 18:33 +0200, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> >> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> >> have, by mid-week) and I won't do anything here.
> > 
> > Just to confirm what I think I'm going to be doing:  rebasing the
> > scsi-misc tree to remove this commit:
> > 
> > commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> > Author: Hannes Reinecke <hare@suse.de>
> > Date:   Tue Nov 6 09:23:40 2007 +0100
> > 
> >     [SCSI] Do not requeue requests if REQ_FAILFAST is set
> > 
> > And its allied fix ups:
> > 
> > commit 983289045faa96fba8841d3c51b98bb8623d9504
> > Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Date:   Sat Nov 24 19:47:25 2007 +0200
> > 
> >     [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> > 
> > commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> > Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Date:   Sat Nov 24 19:55:53 2007 +0200
> > 
> >     [SCSI] fix domain validation to work again
> > 
> > James
> > 
> 
> The problems caused by this patch where nagging me at the back of my head
> from the begging. Why should we fail on a check of FAIL_FAST in all kind
> of weird places like boots, when the only place that should ever set the 
> flag should be one of the multi-path drivers. finally it struck me:
> 
> It might be a bug in ll_rw_blk at blk_rq_bio_prep() there is this:
> 
> static void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
> 			    struct bio *bio)
> {
> 	/* first two bits are identical in rq->cmd_flags and bio->bi_rw */
> 	rq->cmd_flags |= (bio->bi_rw & 3);
> 	...
> 
> Now this is no longer true and is a bug.
> Second bit of bio->bi_rw defined in bio.h is:
> #define BIO_RW_AHEAD	1
> but
> Second bit of rq->cmd_flags is __REQ_FAILFAST
> 
> so maybe we are getting FAILFAST in the wrong places?

But that's actually on purpose, though the comment is pretty much crap.
We don't want to be retrying readahead requests, those should always
just be tossable.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] REQ-flags to/from BIO-flags bugfix
  2007-12-12 11:03             ` [PATCH] REQ-flags to/from BIO-flags bugfix Boaz Harrosh
@ 2007-12-12 15:18               ` Matthew Wilcox
  2007-12-12 15:54                 ` Matthew Wilcox
  2007-12-12 16:06                 ` Boaz Harrosh
  0 siblings, 2 replies; 35+ messages in thread
From: Matthew Wilcox @ 2007-12-12 15:18 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: James Bottomley, Jens Axboe, Andrew Morton, Gabriel C,
	linux-kernel, linux-scsi, Hannes Reinecke

On Wed, Dec 12, 2007 at 01:03:10PM +0200, Boaz Harrosh wrote:
>  - BIO flags bio->bi_rw and REQ flags req->cmd_flags no longer match.
>    Remove comments and do a proper translation between the 2 systems.

I'd rather see them resynchronised ... in a way that makes it obvious
that they should be desynced again:

I don't know whether BIO_RW_BARRIER is __REQ_SOFTBARRIER or
__REQ_HARDBARRIER, so I didn't include that in this patch.  There also
doesn't seem to be a __REQ equivalent to BIO_RW_AHEAD, but we can do
the other four bits (and leave gaps for those two).

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d18ee67..6aef34b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -167,11 +167,13 @@ enum {
 };
 
 /*
- * request type modified bits. first three bits match BIO_RW* bits, important
+ * request type modified bits.  Don't change without looking at bi_rw flags
  */
 enum rq_flag_bits {
-	__REQ_RW,		/* not set, read. set, write */
-	__REQ_FAILFAST,		/* no low level driver retries */
+	__REQ_RW = BIO_RW,	/* not set, read. set, write */
+	__REQ_FAILFAST = BIO_RW_FAILFAST, /* no low level driver retries */
+	__REQ_RW_SYNC = BIO_RW_SYNC,	/* request is sync (O_DIRECT) */
+	__REQ_RW_META = BIO_RW_META,	/* metadata io request */
 	__REQ_SORTED,		/* elevator knows about this request */
 	__REQ_SOFTBARRIER,	/* may not be passed by ioscheduler */
 	__REQ_HARDBARRIER,	/* may not be passed by drive either */
@@ -185,9 +187,7 @@ enum rq_flag_bits {
 	__REQ_QUIET,		/* don't worry about errors */
 	__REQ_PREEMPT,		/* set for "ide_preempt" requests */
 	__REQ_ORDERED_COLOR,	/* is before or after barrier */
-	__REQ_RW_SYNC,		/* request is sync (O_DIRECT) */
 	__REQ_ALLOCED,		/* request came from our alloc pool */
-	__REQ_RW_META,		/* metadata io request */
 	__REQ_NR_BITS,		/* stops here */
 };
 

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] REQ-flags to/from BIO-flags bugfix
  2007-12-12 15:18               ` Matthew Wilcox
@ 2007-12-12 15:54                 ` Matthew Wilcox
  2007-12-13  5:36                   ` David Chinner
  2007-12-12 16:06                 ` Boaz Harrosh
  1 sibling, 1 reply; 35+ messages in thread
From: Matthew Wilcox @ 2007-12-12 15:54 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: James Bottomley, Jens Axboe, Andrew Morton, Gabriel C,
	linux-kernel, linux-scsi, Hannes Reinecke

On Wed, Dec 12, 2007 at 08:18:14AM -0700, Matthew Wilcox wrote:
> I don't know whether BIO_RW_BARRIER is __REQ_SOFTBARRIER or
> __REQ_HARDBARRIER, so I didn't include that in this patch.  There also
> doesn't seem to be a __REQ equivalent to BIO_RW_AHEAD, but we can do
> the other four bits (and leave gaps for those two).

Hm.  BIO_RW_AHEAD seems unused:

willy@honeydew:~/kernel/linux-2.6$ grep -r BIO_RW_AHEAD *
block/blktrace.c:       (((rw) & (1 << BIO_RW_AHEAD)) << (2 - BIO_RW_AHEAD))
include/linux/bio.h:#define BIO_RW_AHEAD        1
include/linux/bio.h:#define bio_rw_ahead(bio)   ((bio)->bi_rw & (1 << BIO_RW_AHEAD))
willy@honeydew:~/kernel/linux-2.6$ grep -r bio_rw_ahead *
block/ll_rw_blk.c:      if (bio_rw_ahead(bio) || bio_failfast(bio))
drivers/md/dm-mpath.c:  if ((error == -EWOULDBLOCK) && bio_rw_ahead(bio))
drivers/md/multipath.c: else if (!bio_rw_ahead(bio)) {
include/linux/bio.h:#define bio_rw_ahead(bio)   ((bio)->bi_rw & (1 << BIO_RW_AHEAD))

BIO_RW_BARRIER seems to be __REQ_HARDBARRIER, but we set it
explicitly in init_request_from_bio().  We could probably simplify
init_request_from_bio() with the patch in the previous message.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] REQ-flags to/from BIO-flags bugfix
  2007-12-12 15:18               ` Matthew Wilcox
  2007-12-12 15:54                 ` Matthew Wilcox
@ 2007-12-12 16:06                 ` Boaz Harrosh
  2007-12-12 16:33                   ` Matthew Wilcox
  1 sibling, 1 reply; 35+ messages in thread
From: Boaz Harrosh @ 2007-12-12 16:06 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: James Bottomley, Jens Axboe, Andrew Morton, Gabriel C,
	linux-kernel, linux-scsi, Hannes Reinecke

On Wed, Dec 12 2007 at 17:18 +0200, Matthew Wilcox <matthew@wil.cx> wrote:
> On Wed, Dec 12, 2007 at 01:03:10PM +0200, Boaz Harrosh wrote:
>>  - BIO flags bio->bi_rw and REQ flags req->cmd_flags no longer match.
>>    Remove comments and do a proper translation between the 2 systems.
> 
> I'd rather see them resynchronised ... in a way that makes it obvious
> that they should be desynced again:
> 
> I don't know whether BIO_RW_BARRIER is __REQ_SOFTBARRIER or
> __REQ_HARDBARRIER, so I didn't include that in this patch.  There also
> doesn't seem to be a __REQ equivalent to BIO_RW_AHEAD, but we can do
> the other four bits (and leave gaps for those two).
> 
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index d18ee67..6aef34b 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -167,11 +167,13 @@ enum {
>  };
>  
>  /*
> - * request type modified bits. first three bits match BIO_RW* bits, important
> + * request type modified bits.  Don't change without looking at bi_rw flags
>   */
>  enum rq_flag_bits {
> -	__REQ_RW,		/* not set, read. set, write */
> -	__REQ_FAILFAST,		/* no low level driver retries */
> +	__REQ_RW = BIO_RW,	/* not set, read. set, write */
> +	__REQ_FAILFAST = BIO_RW_FAILFAST, /* no low level driver retries */
> +	__REQ_RW_SYNC = BIO_RW_SYNC,	/* request is sync (O_DIRECT) */
> +	__REQ_RW_META = BIO_RW_META,	/* metadata io request */
>  	__REQ_SORTED,		/* elevator knows about this request */
>  	__REQ_SOFTBARRIER,	/* may not be passed by ioscheduler */
>  	__REQ_HARDBARRIER,	/* may not be passed by drive either */
> @@ -185,9 +187,7 @@ enum rq_flag_bits {
>  	__REQ_QUIET,		/* don't worry about errors */
>  	__REQ_PREEMPT,		/* set for "ide_preempt" requests */
>  	__REQ_ORDERED_COLOR,	/* is before or after barrier */
> -	__REQ_RW_SYNC,		/* request is sync (O_DIRECT) */
>  	__REQ_ALLOCED,		/* request came from our alloc pool */
> -	__REQ_RW_META,		/* metadata io request */
>  	__REQ_NR_BITS,		/* stops here */
>  };
>  
> 
Thats not enough You still need to fix code in ll_rw_blk(), I would
define a rq_flags_bio_match_mask = 0xf for that. 
(and also add what Jens called "needed" with the
BIO_RW_AHEAD selects REQ_FAILFAST.)

And I still don't understand why, for example, "Domain Validation" fails
with the original patch. What sets BIO_RW_FAILFAST and than panics
on Errors?
(All I see is this flag set in dm/multipath.c & dm-mpath.c)

Boaz


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] REQ-flags to/from BIO-flags bugfix
  2007-12-12 16:06                 ` Boaz Harrosh
@ 2007-12-12 16:33                   ` Matthew Wilcox
  0 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2007-12-12 16:33 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: James Bottomley, Jens Axboe, Andrew Morton, Gabriel C,
	linux-kernel, linux-scsi, Hannes Reinecke

On Wed, Dec 12, 2007 at 06:06:45PM +0200, Boaz Harrosh wrote:
> Thats not enough You still need to fix code in ll_rw_blk(), I would
> define a rq_flags_bio_match_mask = 0xf for that. 
> (and also add what Jens called "needed" with the
> BIO_RW_AHEAD selects REQ_FAILFAST.)

Yes, I appreciate it's not enough; that's why I didn't sign-off on it.

Nobody currently sets BIO_RW_AHEAD, so I don't understand why we need to
do anything ;-)

> And I still don't understand why, for example, "Domain Validation" fails
> with the original patch. What sets BIO_RW_FAILFAST and than panics
> on Errors?
> (All I see is this flag set in dm/multipath.c & dm-mpath.c)

No idea ...

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] REQ-flags to/from BIO-flags bugfix
  2007-12-12 15:54                 ` Matthew Wilcox
@ 2007-12-13  5:36                   ` David Chinner
  0 siblings, 0 replies; 35+ messages in thread
From: David Chinner @ 2007-12-13  5:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Boaz Harrosh, James Bottomley, Jens Axboe, Andrew Morton,
	Gabriel C, linux-kernel, linux-scsi, Hannes Reinecke

On Wed, Dec 12, 2007 at 08:54:07AM -0700, Matthew Wilcox wrote:
> On Wed, Dec 12, 2007 at 08:18:14AM -0700, Matthew Wilcox wrote:
> > I don't know whether BIO_RW_BARRIER is __REQ_SOFTBARRIER or
> > __REQ_HARDBARRIER, so I didn't include that in this patch.  There also
> > doesn't seem to be a __REQ equivalent to BIO_RW_AHEAD, but we can do
> > the other four bits (and leave gaps for those two).
> 
> Hm.  BIO_RW_AHEAD seems unused:
> 
> willy@honeydew:~/kernel/linux-2.6$ grep -r BIO_RW_AHEAD *
> block/blktrace.c:       (((rw) & (1 << BIO_RW_AHEAD)) << (2 - BIO_RW_AHEAD))
> include/linux/bio.h:#define BIO_RW_AHEAD        1
> include/linux/bio.h:#define bio_rw_ahead(bio)   ((bio)->bi_rw & (1 << BIO_RW_AHEAD))
> willy@honeydew:~/kernel/linux-2.6$ grep -r bio_rw_ahead *
> block/ll_rw_blk.c:      if (bio_rw_ahead(bio) || bio_failfast(bio))
> drivers/md/dm-mpath.c:  if ((error == -EWOULDBLOCK) && bio_rw_ahead(bio))
> drivers/md/multipath.c: else if (!bio_rw_ahead(bio)) {
> include/linux/bio.h:#define bio_rw_ahead(bio)   ((bio)->bi_rw & (1 << BIO_RW_AHEAD))

That would say to me that READA is not hooked up correctly. i.e:

#define READ 0
#define WRITE 1
#define READA 2         /* read-ahead  - don't block if no resources */
#define SWRITE 3        /* for ll_rw_block() - wait for buffer lock */
#define READ_SYNC       (READ | (1 << BIO_RW_SYNC))
#define READ_META       (READ | (1 << BIO_RW_META))
#define WRITE_SYNC      (WRITE | (1 << BIO_RW_SYNC))
#define WRITE_BARRIER   ((1 << BIO_RW) | (1 << BIO_RW_BARRIER))

i.e. it should be:

#define READA		(1 << BIO_RW_AHEAD)

Right?

FWIW, dm does this:

		if (bio_rw(bio) != READA)

Which really should be if (bio_rw_ahead(bio)).....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-12-11 16:33         ` 2.6.24-rc3-mm1 James Bottomley
  2007-12-12 10:08           ` 2.6.24-rc3-mm1 Boaz Harrosh
@ 2007-12-14  9:00           ` Hannes Reinecke
  2007-12-14 14:26             ` 2.6.24-rc3-mm1 James Bottomley
  1 sibling, 1 reply; 35+ messages in thread
From: Hannes Reinecke @ 2007-12-14  9:00 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Gabriel C, linux-kernel, linux-scsi

James Bottomley wrote:
> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
>> have, by mid-week) and I won't do anything here.
> 
> Just to confirm what I think I'm going to be doing:  rebasing the
> scsi-misc tree to remove this commit:
> 
> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> Author: Hannes Reinecke <hare@suse.de>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
>     [SCSI] Do not requeue requests if REQ_FAILFAST is set
> 
> And its allied fix ups:
> 
> commit 983289045faa96fba8841d3c51b98bb8623d9504
> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> Date:   Sat Nov 24 19:47:25 2007 +0200
> 
>     [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> 
> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> Date:   Sat Nov 24 19:55:53 2007 +0200
> 
>     [SCSI] fix domain validation to work again
> 
> James
> 
> 
Or just apply my latest patch (cf Undo __scsi_kill_request).
The main point is that we shouldn't retry requests
with FAILFAST set when the queue is blocked. AFAICS
only FC and iSCSI transports set the queue to blocked,
and use this to indicate a loss of connection. So any
retry with queue blocked is futile.

Cheers,

Hannes

-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: 2.6.24-rc3-mm1
  2007-12-14  9:00           ` 2.6.24-rc3-mm1 Hannes Reinecke
@ 2007-12-14 14:26             ` James Bottomley
  2008-01-07 14:05               ` Multipath failover handling (Was: Re: 2.6.24-rc3-mm1) Hannes Reinecke
  0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2007-12-14 14:26 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Andrew Morton, Gabriel C, linux-kernel, linux-scsi


On Fri, 2007-12-14 at 10:00 +0100, Hannes Reinecke wrote:
> James Bottomley wrote:
> > On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> >> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> >> have, by mid-week) and I won't do anything here.
> > 
> > Just to confirm what I think I'm going to be doing:  rebasing the
> > scsi-misc tree to remove this commit:
> > 
> > commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> > Author: Hannes Reinecke <hare@suse.de>
> > Date:   Tue Nov 6 09:23:40 2007 +0100
> > 
> >     [SCSI] Do not requeue requests if REQ_FAILFAST is set
> > 
> > And its allied fix ups:
> > 
> > commit 983289045faa96fba8841d3c51b98bb8623d9504
> > Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Date:   Sat Nov 24 19:47:25 2007 +0200
> > 
> >     [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> > 
> > commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> > Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Date:   Sat Nov 24 19:55:53 2007 +0200
> > 
> >     [SCSI] fix domain validation to work again
> > 
> > James
> > 
> > 
> Or just apply my latest patch (cf Undo __scsi_kill_request).
> The main point is that we shouldn't retry requests
> with FAILFAST set when the queue is blocked. AFAICS
> only FC and iSCSI transports set the queue to blocked,
> and use this to indicate a loss of connection. So any
> retry with queue blocked is futile.

I still don't think this is the right approach.

For link up/down events, those are direct pathing events and should be
signalled along a kernel notifier, not by mucking with the SCSI state
machine.  However, there's still devloss_tmo to consider ... even in
multipath, I don't think you want to signal path failure until
devloss_tmo has fired otherwise you'll get too many transient up/down
events which damage performance if the array has an expensive failover
model.

The other problem is what to do with in-flight commands at the time the
link went down.  With your current patch, they're still stuck until they
time out ... surely there needs to be some type of recovery mechanism
for these?

James



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)
  2007-12-14 14:26             ` 2.6.24-rc3-mm1 James Bottomley
@ 2008-01-07 14:05               ` Hannes Reinecke
  2008-01-07 17:57                 ` James Bottomley
  0 siblings, 1 reply; 35+ messages in thread
From: Hannes Reinecke @ 2008-01-07 14:05 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Gabriel C, linux-kernel, linux-scsi

James Bottomley wrote:
> On Fri, 2007-12-14 at 10:00 +0100, Hannes Reinecke wrote:
>> James Bottomley wrote:
>>> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
>>>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
>>>> have, by mid-week) and I won't do anything here.
>>> Just to confirm what I think I'm going to be doing:  rebasing the
>>> scsi-misc tree to remove this commit:
>>>
>>> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
>>> Author: Hannes Reinecke <hare@suse.de>
>>> Date:   Tue Nov 6 09:23:40 2007 +0100
>>>
>>>     [SCSI] Do not requeue requests if REQ_FAILFAST is set
>>>
>>> And its allied fix ups:
>>>
>>> commit 983289045faa96fba8841d3c51b98bb8623d9504
>>> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
>>> Date:   Sat Nov 24 19:47:25 2007 +0200
>>>
>>>     [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
>>>
>>> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
>>> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
>>> Date:   Sat Nov 24 19:55:53 2007 +0200
>>>
>>>     [SCSI] fix domain validation to work again
>>>
>>> James
>>>
>>>
>> Or just apply my latest patch (cf Undo __scsi_kill_request).
>> The main point is that we shouldn't retry requests
>> with FAILFAST set when the queue is blocked. AFAICS
>> only FC and iSCSI transports set the queue to blocked,
>> and use this to indicate a loss of connection. So any
>> retry with queue blocked is futile.
> 
> I still don't think this is the right approach.
> 
> For link up/down events, those are direct pathing events and should be
> signalled along a kernel notifier, not by mucking with the SCSI state
> machine.
Of course they will be signalled. And eventually we should patch up
mutltipath-tools to read the exising events from the uevent socket.
But even with that patch there is a quite largish window during
which IOs will be sent to the blocked device, and hence will be
stuck in the request queue until the timer expires.

> However, there's still devloss_tmo to consider ... even in
> multipath, I don't think you want to signal path failure until
> devloss_tmo has fired otherwise you'll get too many transient up/down
> events which damage performance if the array has an expensive failover
> model.
> 
Yes. But currently we have a very high failover latency as we always have
to wait for the requeued commands to time-out.
Hence we're damaging performance on arrays with inexpensive failover.

> The other problem is what to do with in-flight commands at the time the
> link went down.  With your current patch, they're still stuck until they
> time out ... surely there needs to be some type of recovery mechanism
> for these?
> 
Well, the in-flight commands are owned by the HBA driver, which should
have the proper code to terminate / return those commands with the
appriopriate codes. They will then be rescheduled and will be caught
like 'normal' IO requests.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)
  2008-01-07 14:05               ` Multipath failover handling (Was: Re: 2.6.24-rc3-mm1) Hannes Reinecke
@ 2008-01-07 17:57                 ` James Bottomley
  2008-01-07 18:24                   ` Mike Christie
  0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2008-01-07 17:57 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Andrew Morton, Gabriel C, linux-kernel, linux-scsi

On Mon, 2008-01-07 at 15:05 +0100, Hannes Reinecke wrote:
> James Bottomley wrote:
> > On Fri, 2007-12-14 at 10:00 +0100, Hannes Reinecke wrote:
> >> James Bottomley wrote:
> >>> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> >>>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> >>>> have, by mid-week) and I won't do anything here.
> >>> Just to confirm what I think I'm going to be doing:  rebasing the
> >>> scsi-misc tree to remove this commit:
> >>>
> >>> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> >>> Author: Hannes Reinecke <hare@suse.de>
> >>> Date:   Tue Nov 6 09:23:40 2007 +0100
> >>>
> >>>     [SCSI] Do not requeue requests if REQ_FAILFAST is set
> >>>
> >>> And its allied fix ups:
> >>>
> >>> commit 983289045faa96fba8841d3c51b98bb8623d9504
> >>> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> >>> Date:   Sat Nov 24 19:47:25 2007 +0200
> >>>
> >>>     [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> >>>
> >>> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> >>> Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> >>> Date:   Sat Nov 24 19:55:53 2007 +0200
> >>>
> >>>     [SCSI] fix domain validation to work again
> >>>
> >>> James
> >>>
> >>>
> >> Or just apply my latest patch (cf Undo __scsi_kill_request).
> >> The main point is that we shouldn't retry requests
> >> with FAILFAST set when the queue is blocked. AFAICS
> >> only FC and iSCSI transports set the queue to blocked,
> >> and use this to indicate a loss of connection. So any
> >> retry with queue blocked is futile.
> > 
> > I still don't think this is the right approach.
> > 
> > For link up/down events, those are direct pathing events and should be
> > signalled along a kernel notifier, not by mucking with the SCSI state
> > machine.
> Of course they will be signalled. And eventually we should patch up
> mutltipath-tools to read the exising events from the uevent socket.
> But even with that patch there is a quite largish window during
> which IOs will be sent to the blocked device, and hence will be
> stuck in the request queue until the timer expires.

But the assumption your code makes is that if REQ_FAILFAST is set then
it's a dm request ... and that's not true.  The code in question
negatively impacts other users of REQ_FAILFAST.  For every user other
than dm, the right thing to do is to wait out the block.

> > However, there's still devloss_tmo to consider ... even in
> > multipath, I don't think you want to signal path failure until
> > devloss_tmo has fired otherwise you'll get too many transient up/down
> > events which damage performance if the array has an expensive failover
> > model.
> > 
> Yes. But currently we have a very high failover latency as we always have
> to wait for the requeued commands to time-out.
> Hence we're damaging performance on arrays with inexpensive failover.

If it's a either/or choice between the two that's showing our current
approach to multi-path is broken.

> > The other problem is what to do with in-flight commands at the time the
> > link went down.  With your current patch, they're still stuck until they
> > time out ... surely there needs to be some type of recovery mechanism
> > for these?
> > 
> Well, the in-flight commands are owned by the HBA driver, which should
> have the proper code to terminate / return those commands with the
> appriopriate codes. They will then be rescheduled and will be caught
> like 'normal' IO requests.

But my point is that if a driver goes blocked, those commands will be
forced to wait the blocked timeout anyway, so your proposed patch does
nothing to improve the case for dm anyway ... you only avoid commands
stuck when a device goes blocked if by chance its request queue was
empty.

James



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)
  2008-01-07 17:57                 ` James Bottomley
@ 2008-01-07 18:24                   ` Mike Christie
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Christie @ 2008-01-07 18:24 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Andrew Morton, Gabriel C, linux-kernel,
	linux-scsi

James Bottomley wrote:
>>> However, there's still devloss_tmo to consider ... even in
>>> multipath, I don't think you want to signal path failure until
>>> devloss_tmo has fired otherwise you'll get too many transient up/down
>>> events which damage performance if the array has an expensive failover
>>> model.
>>>
>> Yes. But currently we have a very high failover latency as we always have
>> to wait for the requeued commands to time-out.
>> Hence we're damaging performance on arrays with inexpensive failover.
> 
> If it's a either/or choice between the two that's showing our current
> approach to multi-path is broken.
> 
>>> The other problem is what to do with in-flight commands at the time the
>>> link went down.  With your current patch, they're still stuck until they
>>> time out ... surely there needs to be some type of recovery mechanism
>>> for these?
>>>
>> Well, the in-flight commands are owned by the HBA driver, which should
>> have the proper code to terminate / return those commands with the
>> appriopriate codes. They will then be rescheduled and will be caught
>> like 'normal' IO requests.
> 
> But my point is that if a driver goes blocked, those commands will be
> forced to wait the blocked timeout anyway, so your proposed patch does
> nothing to improve the case for dm anyway ... you only avoid commands
> stuck when a device goes blocked if by chance its request queue was
> empty.

How about my patches to use new transport error values and make the 
iscsi and fc behave the same.

The problem I think Hannes and I are both trying to solve is this:

1. We do not want to wait for dev_loss_tmo seconds for failover.

2. The FC drivers can hook into fast_io_fail_tmo related callouts and 
with that set that tmo to a very low value like a couple of seconds if 
they are using multipath, so failovers are fast. However, there is a bug 
with where when the fast_io_fail_tmo fires requests that made it to the 
driver get failed and returned to the multipath layer, but commands in 
the blocked request queue are stuck in there until dev_loss_tmo fires.

With my patches here (need to be rediffed and for FC I need to handle 
JamesS's comments about not using a new field for the fast_fail_timeout 
state bit):

http://marc.info/?l=linux-scsi&m=117399843216280&w=2
http://marc.info/?l=linux-scsi&m=117399544112073&w=2
http://marc.info/?l=linux-scsi&m=117399844316771&w=2
http://marc.info/?l=linux-scsi&m=117400203324693&w=2
http://marc.info/?l=linux-scsi&m=117400203324690&w=2

For FC we can use the fast_io_fail_tmo for fast failovers, and commands 
will not get stuck in a blocked queue for dev_loss_tmo seconds because 
when the fast_io_fail_tmo fires the target's queues are unblocked and 
fc_remote_port_chkready() ready kicks in (iSCSI does the same with the 
patches in the links). And with the patches if multipath-tools is 
sending its path testing IO it will get a DID_TRANSPORT_* error code 
that it can use to make a decent path failing decision with.

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2008-01-07 18:25 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20071120204525.ff27ac98.akpm@linux-foundation.org>
2007-11-23  1:39 ` 2.6.24-rc3-mm1 Gabriel C
2007-11-23  4:12   ` 2.6.24-rc3-mm1 Andrew Morton
2007-11-23  5:55     ` 2.6.24-rc3-mm1 Gabriel C
2007-11-27  6:15       ` 2.6.24-rc3-mm1 Andrew Morton
2007-12-11 16:33         ` 2.6.24-rc3-mm1 James Bottomley
2007-12-12 10:08           ` 2.6.24-rc3-mm1 Boaz Harrosh
2007-12-12 11:03             ` [PATCH] REQ-flags to/from BIO-flags bugfix Boaz Harrosh
2007-12-12 15:18               ` Matthew Wilcox
2007-12-12 15:54                 ` Matthew Wilcox
2007-12-13  5:36                   ` David Chinner
2007-12-12 16:06                 ` Boaz Harrosh
2007-12-12 16:33                   ` Matthew Wilcox
2007-12-12 11:36             ` 2.6.24-rc3-mm1 Jens Axboe
2007-12-14  9:00           ` 2.6.24-rc3-mm1 Hannes Reinecke
2007-12-14 14:26             ` 2.6.24-rc3-mm1 James Bottomley
2008-01-07 14:05               ` Multipath failover handling (Was: Re: 2.6.24-rc3-mm1) Hannes Reinecke
2008-01-07 17:57                 ` James Bottomley
2008-01-07 18:24                   ` Mike Christie
     [not found] ` <4744A6F2.4030302@free.fr>
     [not found]   ` <20071121144116.c932727b.akpm@linux-foundation.org>
2007-11-23  7:29     ` 2.6.24-rc3-mm1: I/O error, system hangs Laurent Riffard
2007-11-23  7:51       ` Hannes Reinecke
2007-11-23 11:38         ` Hannes Reinecke
2007-11-23 17:52           ` Laurent Riffard
2007-11-24  6:42             ` James Bottomley
2007-11-24 12:57               ` Laurent Riffard
2007-11-24 13:26                 ` James Bottomley
2007-11-24 17:54                   ` Gabriel C
2007-11-24 18:04                     ` James Bottomley
2007-11-24 18:08                       ` Gabriel C
2007-11-24 18:28                         ` Gabriel C
2007-11-24 22:59                   ` Laurent Riffard
2007-11-25  7:37                     ` James Bottomley
2007-11-25 20:39                       ` Laurent Riffard
2007-11-28 21:38                         ` Laurent Riffard
2007-11-24 17:44           ` James Bottomley
2007-11-26  7:54             ` Hannes Reinecke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).