From: "BERTRAND Joël" <joel.bertrand@systella.fr>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Justin Piszcz <jpiszcz@lucidpixels.com>,
	Neil Brown <neilb@suse.de>,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	Jeff@lessem.org
Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state
Date: Wed, 07 Nov 2007 12:20:25 +0100
Message-ID: <47319F79.8060002@systella.fr>
In-Reply-To: <1194398700.2970.18.camel@dwillia2-linux.ch.intel.com>

Dan Williams wrote:
> On Tue, 2007-11-06 at 03:19 -0700, BERTRAND Joël wrote:
>>         Done. Here is the output obtained:
> 
> Much appreciated.
>> [ 1260.969314] handling stripe 7629696, state=0x14 cnt=1, pd_idx=2 ops=0:0:0
>> [ 1260.980606] check 5: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800ffcffcc0 written 0000000000000000
>> [ 1260.994808] check 4: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800fdd4e360 written 0000000000000000
>> [ 1261.009325] check 3: state 0x1 toread 0000000000000000 read 0000000000000000 write 0000000000000000 written 0000000000000000
>> [ 1261.244478] check 2: state 0x1 toread 0000000000000000 read 0000000000000000 write 0000000000000000 written 0000000000000000
>> [ 1261.270821] check 1: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800ff517e40 written 0000000000000000
>> [ 1261.312320] check 0: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800fd4cae60 written 0000000000000000
>> [ 1261.361030] locked=4 uptodate=2 to_read=0 to_write=4 failed=0 failed_num=0
>> [ 1261.443120] for sector 7629696, rmw=0 rcw=0
> [..]
> 
> This looks as if the blocks were prepared to be written out, but were
> never handled in ops_run_biodrain(), so they remain locked forever.  The
> operations flags are all clear which means handle_stripe thinks nothing
> else needs to be done.
> 
> The following patch, also attached, cleans up cases where the code looks
> at sh->ops.pending when it should be looking at the consistent
> stack-based snapshot of the operations flags.
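
For readers following the quoted debug output, here is a minimal sketch that decodes the per-device "state" values and re-derives the summary line. The bit positions (R5_UPTODATE = 0, R5_LOCKED = 1, R5_OVERWRITE = 2) are an assumption based on the raid5.h of that kernel era, and counting to_write from a non-null "write" pointer is likewise assumed; the names summarize() and decode_flags() are purely illustrative.

```python
import re

# Assumed bit positions of the per-device flags (raid5.h, 2.6.23 era).
R5_FLAGS = {0: "R5_UPTODATE", 1: "R5_LOCKED", 2: "R5_OVERWRITE"}

# The six "check" lines from the quoted log, timestamps stripped.
DUMP = """\
check 5: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800ffcffcc0 written 0000000000000000
check 4: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800fdd4e360 written 0000000000000000
check 3: state 0x1 toread 0000000000000000 read 0000000000000000 write 0000000000000000 written 0000000000000000
check 2: state 0x1 toread 0000000000000000 read 0000000000000000 write 0000000000000000 written 0000000000000000
check 1: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800ff517e40 written 0000000000000000
check 0: state 0x6 toread 0000000000000000 read 0000000000000000 write fffff800fd4cae60 written 0000000000000000
"""

CHECK_RE = re.compile(
    r"check (\d+): state (0x[0-9a-f]+) toread (\w+) read (\w+) "
    r"write (\w+) written (\w+)"
)


def decode_flags(state):
    """Return the names of the assumed flag bits set in a state word."""
    return [name for bit, name in sorted(R5_FLAGS.items()) if state & (1 << bit)]


def summarize(dump):
    """Tally locked / uptodate / to_write over all 'check' lines."""
    locked = uptodate = to_write = 0
    for m in CHECK_RE.finditer(dump):
        flags = decode_flags(int(m.group(2), 16))
        locked += "R5_LOCKED" in flags
        uptodate += "R5_UPTODATE" in flags
        # A non-null "write" pointer is taken to mean a pending write.
        to_write += int(m.group(5), 16) != 0
    return {"locked": locked, "uptodate": uptodate, "to_write": to_write}


if __name__ == "__main__":
    print(summarize(DUMP))
```

Under those assumptions the tallies agree with the log's own summary (locked=4 uptodate=2 to_write=4): four blocks are locked with writes queued, while ops=0:0:0 shows no operation pending to drain them.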

	Thanks for this patch. I have been testing it for three hours and am
rebuilding a 1.5 TB raid1 array over iSCSI without any trouble.

gershwin:[/usr/scripts] > cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md7 : active raid1 sdi1[2] md_d0p1[0]
       1464725632 blocks [2/1] [U_]
       [=>...................]  recovery =  6.7% (99484736/1464725632) finish=1450.9min speed=15679K/sec
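
As a cross-check on the progress line above, the following sketch recomputes the completed percentage and the remaining time from the block counts and the speed. It assumes that the counts are in 1 KiB blocks and that "K/sec" means KiB per second; recovery_eta() is a hypothetical helper, not part of mdadm.

```python
import re

MDSTAT_LINE = (
    "[=>...................]  recovery =  6.7% (99484736/1464725632) "
    "finish=1450.9min speed=15679K/sec"
)


def recovery_eta(line):
    """Recompute progress and ETA from an mdstat recovery line.

    Assumes block counts are 1 KiB blocks and speed is KiB per second.
    """
    m = re.search(
        r"\((\d+)/(\d+)\)\s+finish=([\d.]+)min\s+speed=(\d+)K/sec", line
    )
    if m is None:
        raise ValueError("not an mdstat progress line")
    done, total = int(m.group(1)), int(m.group(2))
    reported_min, speed = float(m.group(3)), int(m.group(4))
    remaining = total - done
    return {
        "percent": 100.0 * done / total,
        "eta_min": remaining / speed / 60.0,  # KiB / (KiB/s) -> s -> min
        "reported_min": reported_min,
    }


if __name__ == "__main__":
    print(recovery_eta(MDSTAT_LINE))
```

The recomputed estimate comes out within a minute of the reported finish=1450.9min, so the figures shown are self-consistent with roughly a day of rebuild time left at this speed.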

Without your patch, I never reached 1%... I hope it fixes this bug. I
will report back once my raid1 volume has finished resynchronizing.

	Regards,

	JKB

