linux-raid.vger.kernel.org archive mirror
* raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
@ 2013-03-04 13:50 Jes Sorensen
  2013-03-04 21:00 ` NeilBrown
  0 siblings, 1 reply; 14+ messages in thread
From: Jes Sorensen @ 2013-03-04 13:50 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, Shaohua Li

Hi,

I have been hitting raid5 lockups with recent kernels. A bunch of
bisecting narrowed it down to be caused by this commit:

ca64cae96037de16e4af92678814f5d4bf0c1c65

So far I can only reproduce the problem when running a test script
creating raid5 arrays on top of loop devices and then running mkfs on
those. I haven't managed to reproduce it on real disk devices yet, but I
suspect it is possible too.

Basically it looks like a race condition where R5_LOCKED doesn't get
cleared for the device, however it is unclear to me how we get to that
point. Since I am not really deeply familiar with the discard related
changes, I figured someone might have a better idea what could go wrong.

Cheers,
Jes



[ 4799.312280] sector=97f8 i=1           (null)           (null)           (null) ffff88022f5963c0 0
[ 4799.322174] ------------[ cut here ]------------
[ 4799.327330] WARNING: at drivers/md/raid5.c:352 init_stripe+0x2d2/0x360 [raid456]()
[ 4799.335775] Hardware name: S1200BTL
[ 4799.339668] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sunrpc bnep bluetooth rfkill sg coretemp e1000e raid1 dm_mirror kvm_intel kvm crc32c_intel iTCO_wdt iTCO_vendor_support dm_region_hash ghash_clmulni_intel lpc_ich dm_log dm_mod mfd_core i2c_i801 video pcspkr microcode uinput xfs usb_storage mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core mpt2sas raid_class scsi_transport_sas [last unloaded: raid456]
[ 4799.386633] Pid: 8204, comm: mkfs.ext4 Not tainted 3.7.0-rc1+ #17
[ 4799.393431] Call Trace:
[ 4799.396163]  [<ffffffff810602ff>] warn_slowpath_common+0x7f/0xc0
[ 4799.402868]  [<ffffffff8106035a>] warn_slowpath_null+0x1a/0x20
[ 4799.409375]  [<ffffffffa0423b92>] init_stripe+0x2d2/0x360 [raid456]
[ 4799.416368]  [<ffffffffa042400b>] get_active_stripe+0x3eb/0x480 [raid456]
[ 4799.423944]  [<ffffffffa0427beb>] make_request+0x3eb/0x6b0 [raid456]
[ 4799.431037]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
[ 4799.437062]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
[ 4799.443379]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
[ 4799.450082]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
[ 4799.456881]  [<ffffffff812c7199>] submit_bio+0x79/0x160
[ 4799.462714]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
[ 4799.469321]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
[ 4799.476218]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
[ 4799.482338]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
[ 4799.488170]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
[ 4799.494185]  [<ffffffff8128a19a>] ? inode_has_perm.isra.30.constprop.60+0x2a/0x30
[ 4799.502535]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
[ 4799.508755]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
[ 4799.514384]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
[ 4799.521475]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
[ 4799.528177] ---[ end trace 583fffce97b9ddd9 ]---
[ 4799.533327] sector=97f8 i=0           (null)           (null)           (null) ffff88022f5963c0 0
[ 4799.543227] ------------[ cut here ]------------
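
(For context: the WARNING above fires in a sanity check in init_stripe().
In raid5.c of this era that check looks roughly like the sketch below; it is
a from-memory illustration, not a verbatim copy. The fields printed in the
"sector=... i=..." line are the stripe sector, the device index, the
toread/read/towrite/written pointers and the R5_LOCKED bit, so the single
non-null pointer in the lines above is a leftover ->written.)

	for (i = sh->disks; i--; ) {
		struct r5dev *dev = &sh->dev[i];

		/* a stripe being reused must carry no stale per-device state */
		if (dev->toread || dev->read || dev->towrite || dev->written ||
		    test_bit(R5_LOCKED, &dev->flags)) {
			printk(KERN_ERR "sector=%llx i=%d %p %p %p %p %d\n",
			       (unsigned long long)sh->sector, i, dev->toread,
			       dev->read, dev->towrite, dev->written,
			       test_bit(R5_LOCKED, &dev->flags));
			WARN_ON(1);
		}
	}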

* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-04 13:50 raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65 Jes Sorensen
@ 2013-03-04 21:00 ` NeilBrown
  2013-03-05  8:44   ` Jes Sorensen
  0 siblings, 1 reply; 14+ messages in thread
From: NeilBrown @ 2013-03-04 21:00 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid, Shaohua Li

On Mon, 04 Mar 2013 14:50:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
wrote:

> Hi,
> 
> I have been hitting raid5 lockups with recent kernels. A bunch of
> bisecting narrowed it down to be caused by this commit:
> 
> ca64cae96037de16e4af92678814f5d4bf0c1c65
> 
> So far I can only reproduce the problem when running a test script
> creating raid5 arrays on top of loop devices and then running mkfs on
> those. I haven't managed to reproduce it on real disk devices yet, but I
> suspect it is possible too.
> 
> Basically it looks like a race condition where R5_LOCKED doesn't get
> cleared for the device, however it is unclear to me how we get to that
> point. Since I am not really deeply familiar with the discard related
> changes, I figured someone might have a better idea what could go wrong.
> 
> Cheers,
> Jes
> 
> 
> 
> [ 4799.312280] sector=97f8 i=1           (null)           (null)           (null) ffff88022f5963c0 0
> [ 4799.322174] ------------[ cut here ]------------
> [ 4799.327330] WARNING: at drivers/md/raid5.c:352 init_stripe+0x2d2/0x360 [raid456]()
> [ 4799.335775] Hardware name: S1200BTL
> [ 4799.339668] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sunrpc bnep bluetooth rfkill sg coretemp e1000e raid1 dm_mirror kvm_intel kvm crc32c_intel iTCO_wdt iTCO_vendor_support dm_region_hash ghash_clmulni_intel lpc_ich dm_log dm_mod mfd_core i2c_i801 video pcspkr microcode uinput xfs usb_storage mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core mpt2sas raid_class scsi_transport_sas [last unloaded: raid456]
> [ 4799.386633] Pid: 8204, comm: mkfs.ext4 Not tainted 3.7.0-rc1+ #17
> [ 4799.393431] Call Trace:
> [ 4799.396163]  [<ffffffff810602ff>] warn_slowpath_common+0x7f/0xc0
> [ 4799.402868]  [<ffffffff8106035a>] warn_slowpath_null+0x1a/0x20
> [ 4799.409375]  [<ffffffffa0423b92>] init_stripe+0x2d2/0x360 [raid456]
> [ 4799.416368]  [<ffffffffa042400b>] get_active_stripe+0x3eb/0x480 [raid456]
> [ 4799.423944]  [<ffffffffa0427beb>] make_request+0x3eb/0x6b0 [raid456]
> [ 4799.431037]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
> [ 4799.437062]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
> [ 4799.443379]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
> [ 4799.450082]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
> [ 4799.456881]  [<ffffffff812c7199>] submit_bio+0x79/0x160
> [ 4799.462714]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
> [ 4799.469321]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
> [ 4799.476218]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
> [ 4799.482338]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
> [ 4799.488170]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
> [ 4799.494185]  [<ffffffff8128a19a>] ? inode_has_perm.isra.30.constprop.60+0x2a/0x30
> [ 4799.502535]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
> [ 4799.508755]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
> [ 4799.514384]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
> [ 4799.521475]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
> [ 4799.528177] ---[ end trace 583fffce97b9ddd9 ]---
> [ 4799.533327] sector=97f8 i=0           (null)           (null)           (null) ffff88022f5963c0 0
> [ 4799.543227] ------------[ cut here ]------------

Does this fix it?

NeilBrown


From 29d90fa2adbdd9f21ea73864ff333e31305df04b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 4 Mar 2013 12:37:14 +1100
Subject: [PATCH] md/raid5: schedule_construction should abort if nothing to
 do.

Since commit 1ed850f356a0a422013846b5291acff08815008b
    md/raid5: make sure to_read and to_write never go negative.

It has been possible for handle_stripe_dirtying to be called
when there isn't actually any work to do.
It then calls schedule_reconstruction() which will set R5_LOCKED
on the parity block(s) even when nothing else is happening.
This then causes problems in do_release_stripe().

So add checks to schedule_reconstruction() so that if it doesn't
find anything to do, it just aborts.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 203a558..fbd40c1 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2309,17 +2309,6 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
 	int level = conf->level;
 
 	if (rcw) {
-		/* if we are not expanding this is a proper write request, and
-		 * there will be bios with new data to be drained into the
-		 * stripe cache
-		 */
-		if (!expand) {
-			sh->reconstruct_state = reconstruct_state_drain_run;
-			set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
-		} else
-			sh->reconstruct_state = reconstruct_state_run;
-
-		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
 
 		for (i = disks; i--; ) {
 			struct r5dev *dev = &sh->dev[i];
@@ -2332,6 +2321,21 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
 				s->locked++;
 			}
 		}
+		/* if we are not expanding this is a proper write request, and
+		 * there will be bios with new data to be drained into the
+		 * stripe cache
+		 */
+		if (!expand) {
+			if (!s->locked)
+				/* False alarm, nothing to do */
+				return;
+			sh->reconstruct_state = reconstruct_state_drain_run;
+			set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
+		} else
+			sh->reconstruct_state = reconstruct_state_run;
+
+		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
+
 		if (s->locked + conf->max_degraded == disks)
 			if (!test_and_set_bit(STRIPE_FULL_WRITE, &sh->state))
 				atomic_inc(&conf->pending_full_writes);
@@ -2340,11 +2344,6 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
 		BUG_ON(!(test_bit(R5_UPTODATE, &sh->dev[pd_idx].flags) ||
 			test_bit(R5_Wantcompute, &sh->dev[pd_idx].flags)));
 
-		sh->reconstruct_state = reconstruct_state_prexor_drain_run;
-		set_bit(STRIPE_OP_PREXOR, &s->ops_request);
-		set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
-		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
-
 		for (i = disks; i--; ) {
 			struct r5dev *dev = &sh->dev[i];
 			if (i == pd_idx)
@@ -2359,6 +2358,13 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
 				s->locked++;
 			}
 		}
+		if (!s->locked)
+			/* False alarm - nothing to do */
+			return;
+		sh->reconstruct_state = reconstruct_state_prexor_drain_run;
+		set_bit(STRIPE_OP_PREXOR, &s->ops_request);
+		set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
+		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
 	}
 
 	/* keep the parity disk(s) locked while asynchronous operations


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-04 21:00 ` NeilBrown
@ 2013-03-05  8:44   ` Jes Sorensen
  2013-03-06  2:18     ` NeilBrown
  0 siblings, 1 reply; 14+ messages in thread
From: Jes Sorensen @ 2013-03-05  8:44 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, Shaohua Li

NeilBrown <neilb@suse.de> writes:
> On Mon, 04 Mar 2013 14:50:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> wrote:
>
>> Hi,
>> 
>> I have been hitting raid5 lockups with recent kernels. A bunch of
>> bisecting narrowed it down to be caused by this commit:
>> 
>> ca64cae96037de16e4af92678814f5d4bf0c1c65
>> 
>> So far I can only reproduce the problem when running a test script
>> creating raid5 arrays on top of loop devices and then running mkfs on
>> those. I haven't managed to reproduce it on real disk devices yet, but I
>> suspect it is possible too.
>> 
>> Basically it looks like a race condition where R5_LOCKED doesn't get
>> cleared for the device, however it is unclear to me how we get to that
>> point. Since I am not really deeply familiar with the discard related
>> changes, I figured someone might have a better idea what could go wrong.
>> 
>> Cheers,
>> Jes
>> 
>> 
>> 
>> [ 4799.312280] sector=97f8 i=1 (null) (null) (null) ffff88022f5963c0
>> 0
>> [ 4799.322174] ------------[ cut here ]------------
>> [ 4799.327330] WARNING: at drivers/md/raid5.c:352
>> init_stripe+0x2d2/0x360 [raid456]()
>> [ 4799.335775] Hardware name: S1200BTL
>> [ 4799.339668] Modules linked in: raid456 async_raid6_recov
>> async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sunrpc
>> bnep bluetooth rfkill sg coretemp e1000e raid1 dm_mirror kvm_intel
>> kvm crc32c_intel iTCO_wdt iTCO_vendor_support dm_region_hash
>> ghash_clmulni_intel lpc_ich dm_log dm_mod mfd_core i2c_i801 video
>> pcspkr microcode uinput xfs usb_storage mgag200 i2c_algo_bit
>> drm_kms_helper ttm drm i2c_core mpt2sas raid_class
>> scsi_transport_sas [last unloaded: raid456]
>> [ 4799.386633] Pid: 8204, comm: mkfs.ext4 Not tainted 3.7.0-rc1+ #17
>> [ 4799.393431] Call Trace:
>> [ 4799.396163]  [<ffffffff810602ff>] warn_slowpath_common+0x7f/0xc0
>> [ 4799.402868]  [<ffffffff8106035a>] warn_slowpath_null+0x1a/0x20
>> [ 4799.409375]  [<ffffffffa0423b92>] init_stripe+0x2d2/0x360 [raid456]
>> [ 4799.416368]  [<ffffffffa042400b>] get_active_stripe+0x3eb/0x480 [raid456]
>> [ 4799.423944]  [<ffffffffa0427beb>] make_request+0x3eb/0x6b0 [raid456]
>> [ 4799.431037]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
>> [ 4799.437062]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
>> [ 4799.443379]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
>> [ 4799.450082]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
>> [ 4799.456881]  [<ffffffff812c7199>] submit_bio+0x79/0x160
>> [ 4799.462714]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
>> [ 4799.469321]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
>> [ 4799.476218]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
>> [ 4799.482338]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
>> [ 4799.488170]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
>> [ 4799.494185] [<ffffffff8128a19a>] ?
>> inode_has_perm.isra.30.constprop.60+0x2a/0x30
>> [ 4799.502535]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
>> [ 4799.508755]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
>> [ 4799.514384]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
>> [ 4799.521475]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
>> [ 4799.528177] ---[ end trace 583fffce97b9ddd9 ]---
>> [ 4799.533327] sector=97f8 i=0 (null) (null) (null) ffff88022f5963c0
>> 0
>> [ 4799.543227] ------------[ cut here ]------------
>
> Does this fix it?
>
> NeilBrown

Unfortunately no, I still see these crashes with this one applied :(

Cheers,
Jes

>
>
> From 29d90fa2adbdd9f21ea73864ff333e31305df04b Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Mon, 4 Mar 2013 12:37:14 +1100
> Subject: [PATCH] md/raid5: schedule_construction should abort if nothing to
>  do.
>
> Since commit 1ed850f356a0a422013846b5291acff08815008b
>     md/raid5: make sure to_read and to_write never go negative.
>
> It has been possible for handle_stripe_dirtying to be called
> when there isn't actually any work to do.
> It then calls schedule_reconstruction() which will set R5_LOCKED
> on the parity block(s) even when nothing else is happening.
> This then causes problems in do_release_stripe().
>
> So add checks to schedule_reconstruction() so that if it doesn't
> find anything to do, it just aborts.
>
> Reported-by: majianpeng <majianpeng@gmail.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 203a558..fbd40c1 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -2309,17 +2309,6 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
>  	int level = conf->level;
>  
>  	if (rcw) {
> -		/* if we are not expanding this is a proper write request, and
> -		 * there will be bios with new data to be drained into the
> -		 * stripe cache
> -		 */
> -		if (!expand) {
> -			sh->reconstruct_state = reconstruct_state_drain_run;
> -			set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
> -		} else
> -			sh->reconstruct_state = reconstruct_state_run;
> -
> -		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
>  
>  		for (i = disks; i--; ) {
>  			struct r5dev *dev = &sh->dev[i];
> @@ -2332,6 +2321,21 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
>  				s->locked++;
>  			}
>  		}
> +		/* if we are not expanding this is a proper write request, and
> +		 * there will be bios with new data to be drained into the
> +		 * stripe cache
> +		 */
> +		if (!expand) {
> +			if (!s->locked)
> +				/* False alarm, nothing to do */
> +				return;
> +			sh->reconstruct_state = reconstruct_state_drain_run;
> +			set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
> +		} else
> +			sh->reconstruct_state = reconstruct_state_run;
> +
> +		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
> +
>  		if (s->locked + conf->max_degraded == disks)
>  			if (!test_and_set_bit(STRIPE_FULL_WRITE, &sh->state))
>  				atomic_inc(&conf->pending_full_writes);
> @@ -2340,11 +2344,6 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
>  		BUG_ON(!(test_bit(R5_UPTODATE, &sh->dev[pd_idx].flags) ||
>  			test_bit(R5_Wantcompute, &sh->dev[pd_idx].flags)));
>  
> -		sh->reconstruct_state = reconstruct_state_prexor_drain_run;
> -		set_bit(STRIPE_OP_PREXOR, &s->ops_request);
> -		set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
> -		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
> -
>  		for (i = disks; i--; ) {
>  			struct r5dev *dev = &sh->dev[i];
>  			if (i == pd_idx)
> @@ -2359,6 +2358,13 @@ schedule_reconstruction(struct stripe_head *sh, struct stripe_head_state *s,
>  				s->locked++;
>  			}
>  		}
> +		if (!s->locked)
> +			/* False alarm - nothing to do */
> +			return;
> +		sh->reconstruct_state = reconstruct_state_prexor_drain_run;
> +		set_bit(STRIPE_OP_PREXOR, &s->ops_request);
> +		set_bit(STRIPE_OP_BIODRAIN, &s->ops_request);
> +		set_bit(STRIPE_OP_RECONSTRUCT, &s->ops_request);
>  	}
>  
>  	/* keep the parity disk(s) locked while asynchronous operations

* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-05  8:44   ` Jes Sorensen
@ 2013-03-06  2:18     ` NeilBrown
  2013-03-06  9:31       ` Jes Sorensen
  0 siblings, 1 reply; 14+ messages in thread
From: NeilBrown @ 2013-03-06  2:18 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid, Shaohua Li

On Tue, 05 Mar 2013 09:44:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
wrote:

> NeilBrown <neilb@suse.de> writes:
> > On Mon, 04 Mar 2013 14:50:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> > wrote:
> >
> >> Hi,
> >> 
> >> I have been hitting raid5 lockups with recent kernels. A bunch of
> >> bisecting narrowed it down to be caused by this commit:
> >> 
> >> ca64cae96037de16e4af92678814f5d4bf0c1c65
> >> 
> >> So far I can only reproduce the problem when running a test script
> >> creating raid5 arrays on top of loop devices and then running mkfs on
> >> those. I haven't managed to reproduce it on real disk devices yet, but I
> >> suspect it is possible too.
> >> 
> >> Basically it looks like a race condition where R5_LOCKED doesn't get
> >> cleared for the device, however it is unclear to me how we get to that
> >> point. Since I am not really deeply familiar with the discard related
> >> changes, I figured someone might have a better idea what could go wrong.
> >> 
> >> Cheers,
> >> Jes
> >> 
> >> 
> >> 
> >> [ 4799.312280] sector=97f8 i=1 (null) (null) (null) ffff88022f5963c0
> >> 0
> >> [ 4799.322174] ------------[ cut here ]------------
> >> [ 4799.327330] WARNING: at drivers/md/raid5.c:352
> >> init_stripe+0x2d2/0x360 [raid456]()
> >> [ 4799.335775] Hardware name: S1200BTL
> >> [ 4799.339668] Modules linked in: raid456 async_raid6_recov
> >> async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sunrpc
> >> bnep bluetooth rfkill sg coretemp e1000e raid1 dm_mirror kvm_intel
> >> kvm crc32c_intel iTCO_wdt iTCO_vendor_support dm_region_hash
> >> ghash_clmulni_intel lpc_ich dm_log dm_mod mfd_core i2c_i801 video
> >> pcspkr microcode uinput xfs usb_storage mgag200 i2c_algo_bit
> >> drm_kms_helper ttm drm i2c_core mpt2sas raid_class
> >> scsi_transport_sas [last unloaded: raid456]
> >> [ 4799.386633] Pid: 8204, comm: mkfs.ext4 Not tainted 3.7.0-rc1+ #17
> >> [ 4799.393431] Call Trace:
> >> [ 4799.396163]  [<ffffffff810602ff>] warn_slowpath_common+0x7f/0xc0
> >> [ 4799.402868]  [<ffffffff8106035a>] warn_slowpath_null+0x1a/0x20
> >> [ 4799.409375]  [<ffffffffa0423b92>] init_stripe+0x2d2/0x360 [raid456]
> >> [ 4799.416368]  [<ffffffffa042400b>] get_active_stripe+0x3eb/0x480 [raid456]
> >> [ 4799.423944]  [<ffffffffa0427beb>] make_request+0x3eb/0x6b0 [raid456]
> >> [ 4799.431037]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
> >> [ 4799.437062]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
> >> [ 4799.443379]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
> >> [ 4799.450082]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
> >> [ 4799.456881]  [<ffffffff812c7199>] submit_bio+0x79/0x160
> >> [ 4799.462714]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
> >> [ 4799.469321]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
> >> [ 4799.476218]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
> >> [ 4799.482338]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
> >> [ 4799.488170]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
> >> [ 4799.494185] [<ffffffff8128a19a>] ?
> >> inode_has_perm.isra.30.constprop.60+0x2a/0x30
> >> [ 4799.502535]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
> >> [ 4799.508755]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
> >> [ 4799.514384]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
> >> [ 4799.521475]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
> >> [ 4799.528177] ---[ end trace 583fffce97b9ddd9 ]---
> >> [ 4799.533327] sector=97f8 i=0 (null) (null) (null) ffff88022f5963c0
> >> 0
> >> [ 4799.543227] ------------[ cut here ]------------
> >
> > Does this fix it?
> >
> > NeilBrown
> 
> Unfortunately no, I still see these crashes with this one applied :(
> 

Thanks - the symptom looked  similar, but now that I look more closely I can
see it is quite different.

How about this then?  I can't really see what is happening, but based on the
patch that you identified it must be related to these flags.
It seems that handle_stripe_clean_event() is being called too early, and it
doesn't clear out the ->written bios because they are still locked or
something.  But it does clear R5_Discard on the parity block, so
handle_stripe_clean_event doesn't get called again.

This makes the handling of the various flags somewhat more uniform, which is
probably a good thing.

Thanks for testing,
NeilBrown




diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 277d9c2..a005dcc 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1246,8 +1246,7 @@ static void ops_complete_reconstruct(void *stripe_head_ref)
 		struct r5dev *dev = &sh->dev[i];
 
 		if (dev->written || i == pd_idx || i == qd_idx) {
-			if (!discard)
-				set_bit(R5_UPTODATE, &dev->flags);
+			set_bit(R5_UPTODATE, &dev->flags);
 			if (fua)
 				set_bit(R5_WantFUA, &dev->flags);
 			if (sync)
@@ -2784,8 +2783,7 @@ static void handle_stripe_clean_event(struct r5conf *conf,
 		if (sh->dev[i].written) {
 			dev = &sh->dev[i];
 			if (!test_bit(R5_LOCKED, &dev->flags) &&
-			    (test_bit(R5_UPTODATE, &dev->flags) ||
-			     test_bit(R5_Discard, &dev->flags))) {
+			    test_bit(R5_UPTODATE, &dev->flags)) {
 				/* We can return any write requests */
 				struct bio *wbi, *wbi2;
 				pr_debug("Return write for disc %d\n", i);
@@ -2808,8 +2806,11 @@ static void handle_stripe_clean_event(struct r5conf *conf,
 					 !test_bit(STRIPE_DEGRADED, &sh->state),
 						0);
 			}
-		} else if (test_bit(R5_Discard, &sh->dev[i].flags))
-			clear_bit(R5_Discard, &sh->dev[i].flags);
+		} else if (!test_bit(R5_LOCKED, &sh->dev[i].flags) &&
+			   test_bit(R5_UPTODATE, &sh->dev[i].flags)) {
+			if (test_and_clear_bit(R5_Discard, &dev->flags))
+				clear_bit(R5_UPTODATE, &dev->flags);
+		}
 
 	if (test_and_clear_bit(STRIPE_FULL_WRITE, &sh->state))
 		if (atomic_dec_and_test(&conf->pending_full_writes))


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-06  2:18     ` NeilBrown
@ 2013-03-06  9:31       ` Jes Sorensen
  2013-03-11 22:32         ` NeilBrown
  0 siblings, 1 reply; 14+ messages in thread
From: Jes Sorensen @ 2013-03-06  9:31 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, Shaohua Li, Eryu Guan

NeilBrown <neilb@suse.de> writes:
> On Tue, 05 Mar 2013 09:44:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> wrote:
>> > Does this fix it?
>> >
>> > NeilBrown
>> 
>> Unfortunately no, I still see these crashes with this one applied :(
>> 
>
> Thanks - the symptom looked  similar, but now that I look more closely I can
> see it is quite different.
>
> How about this then?  I can't really see what is happening, but based on the
> patch that you identified it must be related to these flags.
> It seems that handle_stripe_clean_event() is being called too early, and it
> doesn't clear out the ->written bios because they are still locked or
> something.  But it does clear R5_Discard on the parity block, so
> handle_stripe_clean_event doesn't get called again.
>
> This makes the handling of the various flags somewhat more uniform, which is
> probably a good thing.

Hi Neil,

With this one applied I end up with an OOPS instead. Note I had to
modify the last test/clear bit sequence to use &sh->dev[i].flags instead
of &dev->flags to avoid a compiler warning.
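
(Concretely, the last hunk as applied here ended up looking roughly like
this, with the inner flag accesses spelled out via sh->dev[i]:)

		} else if (!test_bit(R5_LOCKED, &sh->dev[i].flags) &&
			   test_bit(R5_UPTODATE, &sh->dev[i].flags)) {
			/* no ->written bio on this device, just drop the markers */
			if (test_and_clear_bit(R5_Discard, &sh->dev[i].flags))
				clear_bit(R5_UPTODATE, &sh->dev[i].flags);
		}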

I am attaching the test script I am running too. It was written by Eryu
Guan.

Cheers,
Jes




[ 2623.554780] kernel BUG at drivers/md/raid5.c:2954!
[ 2623.560126] invalid opcode: 0000 [#1] SMP 
[ 2623.564722] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx nls_utf8 lockd sunrpc bnep bluetooth rfkill sg dm_mirror dm_region_hash dm_log dm_mod raid1 coretemp kvm_intel kvm crc32c_intel iTCO_wdt ghash_clmulni_intel e1000e iTCO_vendor_support lpc_ich microcode mfd_core i2c_i801 video pcspkr uinput xfs mgag200 i2c_algo_bit drm_kms_helper ttm drm mpt2sas i2c_core raid_class scsi_transport_sas usb_storage [last unloaded: raid456]
[ 2623.612586] CPU 3 
[ 2623.614639] Pid: 20177, comm: md42_raid5 Not tainted 3.7.0-rc1+ #17 Intel Corporation S1200BTL/S1200BTL
[ 2623.625329] RIP: 0010:[<ffffffffa0438dd7>]  [<ffffffffa0438dd7>] handle_stripe+0x2297/0x2320 [raid456]
[ 2623.635732] RSP: 0018:ffff8801dd70db68  EFLAGS: 00010246
[ 2623.641660] RAX: ffff8801fc62cf18 RBX: ffff8801fc62cbf8 RCX: 0000000000000001
[ 2623.649623] RDX: 0000000000000000 RSI: 0000000000008d88 RDI: ffff8801edb63e00
[ 2623.657585] RBP: ffff8801dd70dcb8 R08: 0000000000000000 R09: ffff8801fc62cb10
[ 2623.665547] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[ 2623.673509] R13: ffff8801fc62cbf8 R14: 0000000000000000 R15: 0000000000000001
[ 2623.681472] FS:  0000000000000000(0000) GS:ffff880236860000(0000) knlGS:0000000000000000
[ 2623.690503] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2623.696915] CR2: 00007fb484fcc950 CR3: 00000000018fd000 CR4: 00000000001407e0
[ 2623.704878] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2623.712841] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2623.720804] Process md42_raid5 (pid: 20177, threadinfo ffff8801dd70c000, task ffff88022fadcbf0)
[ 2623.730512] Stack:
[ 2623.732757]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2623.741067]  ffff880232900400 00000001002386d6 ffff880236874100 0000000000000003
[ 2623.749376]  ffff8801dd70dcb8 ffff8801fc62cc38 ffff8801edb63f78 ffff8801edb63f60
[ 2623.757686] Call Trace:
[ 2623.760419]  [<ffffffffa0439c1e>] handle_active_stripes+0x18e/0x2a0 [raid456]
[ 2623.768387]  [<ffffffffa043a79b>] raid5d+0x43b/0x5a0 [raid456]
[ 2623.774902]  [<ffffffff814a6acd>] md_thread+0x10d/0x140
[ 2623.780736]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
[ 2623.786764]  [<ffffffff814a69c0>] ? md_rdev_init+0x140/0x140
[ 2623.793081]  [<ffffffff81083810>] kthread+0xc0/0xd0
[ 2623.798529]  [<ffffffff81083750>] ? kthread_create_on_node+0x120/0x120
[ 2623.805815]  [<ffffffff8161e6ac>] ret_from_fork+0x7c/0xb0
[ 2623.811842]  [<ffffffff81083750>] ? kthread_create_on_node+0x120/0x120
[ 2623.819126] Code: 83 be a4 00 00 00 00 74 0e e8 a6 39 07 e1 e9 21 de ff ff 0f 0b 0f 0b e8 58 ad ff ff 0f 1f 84 00 00 00 00 00 e9 0b de ff ff 0f 0b <0f> 0b 8b 43 58 44 8b 43 48 48 c7 c6 88 e1 43 a0 44 0f bf 4b 38 
[ 2623.841056] RIP  [<ffffffffa0438dd7>] handle_stripe+0x2297/0x2320 [raid456]
[ 2623.848840]  RSP <ffff8801dd70db68>

[-- Attachment #2: md-2.sh --]
[-- Type: application/x-sh, Size: 2103 bytes --]

* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-06  9:31       ` Jes Sorensen
@ 2013-03-11 22:32         ` NeilBrown
  2013-03-12  1:32           ` NeilBrown
  0 siblings, 1 reply; 14+ messages in thread
From: NeilBrown @ 2013-03-11 22:32 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid, Shaohua Li, Eryu Guan

On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
wrote:

> NeilBrown <neilb@suse.de> writes:
> > On Tue, 05 Mar 2013 09:44:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> > wrote:
> >> > Does this fix it?
> >> >
> >> > NeilBrown
> >> 
> >> Unfortunately no, I still see these crashes with this one applied :(
> >> 
> >
> > Thanks - the symptom looked  similar, but now that I look more closely I can
> > see it is quite different.
> >
> > How about this then?  I can't really see what is happening, but based on the
> > patch that you identified it must be related to these flags.
> > It seems that handle_stripe_clean_event() is being called too early, and it
> > doesn't clear out the ->written bios because they are still locked or
> > something.  But it does clear R5_Discard on the parity block, so
> > handle_stripe_clean_event doesn't get called again.
> >
> > This makes the handling of the various flags somewhat more uniform, which is
> > probably a good thing.
> 
> Hi Neil,
> 
> With this one applied I end up with an OOPS instead. Note I had to
> modify the last test/clear bit sequence to use &sh->dev[i].flags instead
> of &dev->flags to avoid a compiler warning.

Oops.

> 
> I am attaching the test script I am running too. It was written by Eryu
> Guan.

Thanks for that.  I've tried using it but haven't managed to trigger a BUG
yet.  What size are the loop files?  I mostly use fairly small ones, but
maybe it needs to be bigger to trigger the problem.

My current guess is that recovery and discard are both called on the stripe
at the same time and they race and leave the stripe in a slightly confused
state.  But I haven't found the exact state yet.

The discard code always attaches a 'discard' request to every device in a
stripe_head all at once, under stripe_lock.  However when ops_bio_drain picks
those discard requests off ->towrite and puts them on ->written, it takes
stripe_lock once per device.
Maybe we just need to change the coverage of stripe_lock there so it is held for
the entire loop.  We would still want to drop it before calling
async_copy_data() and reclaim it afterwards, but that wouldn't affect the
'discard' case.
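
Very roughly, that wider coverage might look like the sketch below (the
function name is made up, and it ignores R5_Wantdrain, the per-device bio
list walk and the FUA/SYNC handling that the real ops_run_biodrain does):

	static struct dma_async_tx_descriptor *
	biodrain_sketch(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
	{
		int i;

		spin_lock_irq(&sh->stripe_lock);
		for (i = sh->disks; i--; ) {
			struct r5dev *dev = &sh->dev[i];
			struct bio *wbi = dev->towrite;

			if (!wbi)
				continue;
			/* move the request from ->towrite to ->written while locked */
			dev->towrite = NULL;
			dev->written = wbi;

			if (wbi->bi_rw & REQ_DISCARD) {
				/* discard: no data to copy, keep holding the lock */
				set_bit(R5_Discard, &dev->flags);
				continue;
			}
			/* a real write copies data, so drop the lock around the copy */
			spin_unlock_irq(&sh->stripe_lock);
			tx = async_copy_data(1, wbi, dev->page, dev->sector, tx);
			spin_lock_irq(&sh->stripe_lock);
		}
		spin_unlock_irq(&sh->stripe_lock);
		return tx;
	}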

> 
> 
> [ 2623.554780] kernel BUG at drivers/md/raid5.c:2954!

Could you confirm exactly which line this was - there are a few BUG_ON()s
around there.  They are all related to R5_UPTODATE not being set I think,
but it might help to know exactly when it isn't set.


Thanks,
NeilBrown



* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-11 22:32         ` NeilBrown
@ 2013-03-12  1:32           ` NeilBrown
  2013-03-12 11:12             ` joystick
  2013-03-12 13:45             ` Jes Sorensen
  0 siblings, 2 replies; 14+ messages in thread
From: NeilBrown @ 2013-03-12  1:32 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid, Shaohua Li, Eryu Guan

On Tue, 12 Mar 2013 09:32:31 +1100 NeilBrown <neilb@suse.de> wrote:

> On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> wrote:
> 

> > 
> > I am attaching the test script I am running too. It was written by Eryu
> > Guan.
> 
> Thanks for that.  I've tried using it but haven't managed to trigger a BUG
> yet.  What size are the loop files?  I mostly use fairly small ones, but
> maybe it needs to be bigger to trigger the problem.

Shortly after I wrote that I got a bug-on!  It hasn't happened again though.

This was using code without that latest patch I sent.  The bug was
		BUG_ON(s->uptodate != disks);

in the check_state_compute_result case of handle_parity_checks5() which is
probably the same cause as your most recent BUG.

I've revised my thinking a bit and am now running with this patch which I
think should fix a problem that probably caused the symptoms we have seen.

If you could run your tests for a while too and see whether it will still crash
for you, I'd really appreciate it.

Thanks,
NeilBrown



Subject: [PATCH] md/raid5: ensure sync and recovery don't happen at the same
 time.

A number of problems can occur due to races between
resync/recovery and discard.

- if sync_request calls handle_stripe() while a discard is
  happening on the stripe, it might call handle_stripe_clean_event
  before all of the individual discard requests have completed
  (so some devices are still locked, but not all).
  Since commit ca64cae96037de16e4af92678814f5d4bf0c1c65
     md/raid5: Make sure we clear R5_Discard when discard is finished.
  this will cause R5_Discard to be cleared for the parity device,
  so handle_stripe_clean_event will not be called when the other
  devices do become unlocked, so their ->written will not be cleared.
  This ultimately leads to a WARN_ON in init_stripe and a lock-up.

- If handle_stripe_clean_event does clear R5_UPTODATE at an awkward
  time for resync, it can lead to s->uptodate being less than disks
  in handle_parity_checks5(), which triggers a BUG (because it is
  one).

So:
 - keep R5_Discard  on the parity device until all other devices have
   completed their discard request
 - make sure we don't try to have a 'discard' and a 'sync' action at
   the same time.

Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 51af9da..c216dd3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2609,6 +2609,8 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
 	int i;
 
 	clear_bit(STRIPE_SYNCING, &sh->state);
+	if (test_and_clear_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags))
+		wake_up(&conf->wait_for_overlap);
 	s->syncing = 0;
 	s->replacing = 0;
 	/* There is nothing more to do for sync/check/repair.
@@ -2782,6 +2784,7 @@ static void handle_stripe_clean_event(struct r5conf *conf,
 {
 	int i;
 	struct r5dev *dev;
+	int discard_pending = 0;
 
 	for (i = disks; i--; )
 		if (sh->dev[i].written) {
@@ -2810,9 +2813,23 @@ static void handle_stripe_clean_event(struct r5conf *conf,
 						STRIPE_SECTORS,
 					 !test_bit(STRIPE_DEGRADED, &sh->state),
 						0);
-			}
-		} else if (test_bit(R5_Discard, &sh->dev[i].flags))
-			clear_bit(R5_Discard, &sh->dev[i].flags);
+			} else if (test_bit(R5_Discard, &dev->flags))
+				discard_pending = 1;
+		}
+	if (!discard_pending &&
+	    test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
+		clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
+		clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
+		if (sh->qd_idx >= 0) {
+			clear_bit(R5_Discard, &sh->dev[sh->qd_idx].flags);
+			clear_bit(R5_UPTODATE, &sh->dev[sh->qd_idx].flags);
+		}
+		/* now that discard is done we can proceed with any sync */
+		clear_bit(STRIPE_DISCARD, &sh->state);
+		if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
+			set_bit(STRIPE_HANDLE, &sh->state);
+
+	}
 
 	if (test_and_clear_bit(STRIPE_FULL_WRITE, &sh->state))
 		if (atomic_dec_and_test(&conf->pending_full_writes))
@@ -3464,9 +3481,15 @@ static void handle_stripe(struct stripe_head *sh)
 		return;
 	}
 
-	if (test_and_clear_bit(STRIPE_SYNC_REQUESTED, &sh->state)) {
-		set_bit(STRIPE_SYNCING, &sh->state);
-		clear_bit(STRIPE_INSYNC, &sh->state);
+	if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state)) {
+		spin_lock(&sh->stripe_lock);
+		/* Cannot process 'sync' concurrently with 'discard' */
+		if (!test_bit(STRIPE_DISCARD, &sh->state) &&
+		    test_and_clear_bit(STRIPE_SYNC_REQUESTED, &sh->state)) {
+			set_bit(STRIPE_SYNCING, &sh->state);
+			clear_bit(STRIPE_INSYNC, &sh->state);
+		}
+		spin_unlock(&sh->stripe_lock);
 	}
 	clear_bit(STRIPE_DELAYED, &sh->state);
 
@@ -3626,6 +3649,8 @@ static void handle_stripe(struct stripe_head *sh)
 	    test_bit(STRIPE_INSYNC, &sh->state)) {
 		md_done_sync(conf->mddev, STRIPE_SECTORS, 1);
 		clear_bit(STRIPE_SYNCING, &sh->state);
+		if (test_and_clear_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags))
+			wake_up(&conf->wait_for_overlap);
 	}
 
 	/* If the failed drives are just a ReadError, then we might need
@@ -4222,6 +4247,13 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
 		prepare_to_wait(&conf->wait_for_overlap, &w,
 				TASK_UNINTERRUPTIBLE);
 		spin_lock_irq(&sh->stripe_lock);
+		if (test_bit(STRIPE_SYNCING, &sh->state)) {
+			set_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags);
+			spin_unlock_irq(&sh->stripe_lock);
+			release_stripe(sh);
+			schedule();
+			goto again;
+		}
 		for (d = 0; d < conf->raid_disks; d++) {
 			if (d == sh->pd_idx || d == sh->qd_idx)
 				continue;
@@ -4233,6 +4265,7 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
 				goto again;
 			}
 		}
+		set_bit(STRIPE_DISCARD, &sh->state);
 		finish_wait(&conf->wait_for_overlap, &w);
 		for (d = 0; d < conf->raid_disks; d++) {
 			if (d == sh->pd_idx || d == sh->qd_idx)
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 2afd835..996bdf3 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -324,6 +324,7 @@ enum {
 	STRIPE_COMPUTE_RUN,
 	STRIPE_OPS_REQ_PENDING,
 	STRIPE_ON_UNPLUG_LIST,
+	STRIPE_DISCARD,
 };
 
 /*


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-12  1:32           ` NeilBrown
@ 2013-03-12 11:12             ` joystick
  2013-03-20  0:54               ` NeilBrown
  2013-03-12 13:45             ` Jes Sorensen
  1 sibling, 1 reply; 14+ messages in thread
From: joystick @ 2013-03-12 11:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 03/12/13 02:32, NeilBrown wrote:
> Subject: [PATCH] md/raid5: ensure sync and recovery don't happen at the same
>   time.

Watch out for a probable mistake.


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-12  1:32           ` NeilBrown
  2013-03-12 11:12             ` joystick
@ 2013-03-12 13:45             ` Jes Sorensen
  2013-03-12 23:35               ` NeilBrown
  1 sibling, 1 reply; 14+ messages in thread
From: Jes Sorensen @ 2013-03-12 13:45 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, Shaohua Li, Eryu Guan

NeilBrown <neilb@suse.de> writes:
> On Tue, 12 Mar 2013 09:32:31 +1100 NeilBrown <neilb@suse.de> wrote:
>
>> On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
>> wrote:
>> 
>
>> > 
>> > I am attaching the test script I am running too. It was written by Eryu
>> > Guan.
>> 
>> Thanks for that.  I've tried using it but haven't managed to trigger a BUG
>> yet.  What size are the loop files?  I mostly use fairly small ones, but
>> maybe it needs to be bigger to trigger the problem.
>
> Shortly after I wrote that I got a bug-on!  It hasn't happened again though.
>
> This was using code without that latest patch I sent.  The bug was
> 		BUG_ON(s->uptodate != disks);
>
> in the check_state_compute_result case of handle_parity_checks5() which is
> probably the same cause as your most recent BUG.
>
> I've revised my thinking a bit and am now running with this patch which I
> think should fix a problem that probably caused the symptoms we have seen.
>
> If you could run your tests for a while too and see whether it will still crash
> for you, I'd really appreciate it.

Hi Neil,

Sorry I can't verify the line numbers of my old test since I managed to
mess up my git tree in the process :(

However running with this new patch I have just hit another but
different case. Looks like a deadlock.

This is basically running ca64cae96037de16e4af92678814f5d4bf0c1c65 with
your patch applied on top, and nothing else.

If you want me to try a more uptodate Linus tree, please let me know.

Cheers,
Jes


[17635.205927] INFO: task mkfs.ext4:20060 blocked for more than 120 seconds.
[17635.213543] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17635.222291] mkfs.ext4       D ffff880236814100     0 20060  20026 0x00000080
[17635.230199]  ffff8801bc8bbb98 0000000000000082 ffff88022f0be540 ffff8801bc8bbfd8
[17635.238518]  ffff8801bc8bbfd8 ffff8801bc8bbfd8 ffff88022d47b2a0 ffff88022f0be540
[17635.246837]  ffff8801cea1f430 000000000001d5f0 ffff8801c7f4f430 ffff88022169a400
[17635.255161] Call Trace:
[17635.257891]  [<ffffffff81614f79>] schedule+0x29/0x70
[17635.263433]  [<ffffffffa0386ada>] make_request+0x6da/0x6f0 [raid456]
[17635.270525]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
[17635.276560]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
[17635.282884]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
[17635.289586]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
[17635.296393]  [<ffffffff812c7199>] submit_bio+0x79/0x160
[17635.302232]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
[17635.308844]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
[17635.315748]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
[17635.321877]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
[17635.327714]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
[17635.333745]  [<ffffffff8128a19a>] ? inode_has_perm.isra.30.constprop.60+0x2a/0x30
[17635.342103]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
[17635.348329]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
[17635.353972]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
[17635.361070]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-12 13:45             ` Jes Sorensen
@ 2013-03-12 23:35               ` NeilBrown
  2013-03-13  7:32                 ` Jes Sorensen
  2013-03-14  7:35                 ` Jes Sorensen
  0 siblings, 2 replies; 14+ messages in thread
From: NeilBrown @ 2013-03-12 23:35 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid, Shaohua Li, Eryu Guan

On Tue, 12 Mar 2013 14:45:44 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
wrote:

> NeilBrown <neilb@suse.de> writes:
> > On Tue, 12 Mar 2013 09:32:31 +1100 NeilBrown <neilb@suse.de> wrote:
> >
> >> On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> >> wrote:
> >> 
> >
> >> > 
> >> > I am attaching the test script I am running too. It was written by Eryu
> >> > Guan.
> >> 
> >> Thanks for that.  I've tried using it but haven't managed to trigger a BUG
> >> yet.  What size are the loop files?  I mostly use fairly small ones, but
> >> maybe it needs to be bigger to trigger the problem.
> >
> > Shortly after I wrote that I got a bug-on!  It hasn't happened again though.
> >
> > This was using code without that latest patch I sent.  The bug was
> > 		BUG_ON(s->uptodate != disks);
> >
> > in the check_state_compute_result case of handle_parity_checks5() which is
> > probably the same cause as your most recent BUG.
> >
> > I've revised my thinking a bit and am now running with this patch which I
> > think should fix a problem that probably caused the symptoms we have seen.
> >
> > If you could run your tests for a while too and see whether it will still crash
> > for you, I'd really appreciate it.
> 
> Hi Neil,
> 
> Sorry I can't verify the line numbers of my old test since I managed to
> mess up my git tree in the process :(
> 
> However running with this new patch I have just hit another but
> different case. Looks like a deadlock.

Your test setup is clearly different from mine.  I've been running all night
without a single hiccup.

> 
> This is basically running ca64cae96037de16e4af92678814f5d4bf0c1c65 with
> your patch applied on top, and nothing else.
> 
> If you want me to try a more uptodate Linus tree, please let me know.
> 
> Cheers,
> Jes
> 
> 
> [17635.205927] INFO: task mkfs.ext4:20060 blocked for more than 120 seconds.
> [17635.213543] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [17635.222291] mkfs.ext4       D ffff880236814100     0 20060  20026 0x00000080
> [17635.230199]  ffff8801bc8bbb98 0000000000000082 ffff88022f0be540 ffff8801bc8bbfd8
> [17635.238518]  ffff8801bc8bbfd8 ffff8801bc8bbfd8 ffff88022d47b2a0 ffff88022f0be540
> [17635.246837]  ffff8801cea1f430 000000000001d5f0 ffff8801c7f4f430 ffff88022169a400
> [17635.255161] Call Trace:
> [17635.257891]  [<ffffffff81614f79>] schedule+0x29/0x70
> [17635.263433]  [<ffffffffa0386ada>] make_request+0x6da/0x6f0 [raid456]
> [17635.270525]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
> [17635.276560]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
> [17635.282884]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
> [17635.289586]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
> [17635.296393]  [<ffffffff812c7199>] submit_bio+0x79/0x160
> [17635.302232]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
> [17635.308844]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
> [17635.315748]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
> [17635.321877]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
> [17635.327714]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
> [17635.333745]  [<ffffffff8128a19a>] ? inode_has_perm.isra.30.constprop.60+0x2a/0x30
> [17635.342103]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
> [17635.348329]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
> [17635.353972]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
> [17635.361070]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b

There is a small race in the exclusion between discard and recovery.
This patch on top should fix it (I hope).
Thanks for testing.

NeilBrown

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c216dd3..636d492 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4246,14 +4246,14 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
 		sh = get_active_stripe(conf, logical_sector, 0, 0, 0);
 		prepare_to_wait(&conf->wait_for_overlap, &w,
 				TASK_UNINTERRUPTIBLE);
-		spin_lock_irq(&sh->stripe_lock);
+		set_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags);
 		if (test_bit(STRIPE_SYNCING, &sh->state)) {
-			set_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags);
-			spin_unlock_irq(&sh->stripe_lock);
 			release_stripe(sh);
 			schedule();
 			goto again;
 		}
+		clear_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags);
+		spin_lock_irq(&sh->stripe_lock);
 		for (d = 0; d < conf->raid_disks; d++) {
 			if (d == sh->pd_idx || d == sh->qd_idx)
 				continue;


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-12 23:35               ` NeilBrown
@ 2013-03-13  7:32                 ` Jes Sorensen
  2013-03-14  7:35                 ` Jes Sorensen
  1 sibling, 0 replies; 14+ messages in thread
From: Jes Sorensen @ 2013-03-13  7:32 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, Shaohua Li, Eryu Guan

NeilBrown <neilb@suse.de> writes:
> On Tue, 12 Mar 2013 14:45:44 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> wrote:
> There is a small race in the exclusion between discard and recovery.
> This patch on top should fix it (I hope).
> Thanks for testing.

I'll give it a spin. I am running on a quad-core Xeon system:
Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz, with 8GB of RAM, and the
tests on a 3TB SATA III drive.

My loop image files are like this:
-rw-r--r--. 1 root root 268435456 Mar 12 09:53 img0
-rw-r--r--. 1 root root 268435456 Mar 12 09:53 img1
-rw-r--r--. 1 root root 268435456 Mar 12 09:53 img2
-rw-r--r--. 1 root root 268435456 Mar 12 09:26 img3

I've tried storing them both on ext4 and xfs, didn't seem to make a
difference for the test results.

Cheers,
Jes

* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-12 23:35               ` NeilBrown
  2013-03-13  7:32                 ` Jes Sorensen
@ 2013-03-14  7:35                 ` Jes Sorensen
  2013-03-20  0:55                   ` NeilBrown
  1 sibling, 1 reply; 14+ messages in thread
From: Jes Sorensen @ 2013-03-14  7:35 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, Shaohua Li, Eryu Guan

NeilBrown <neilb@suse.de> writes:
> On Tue, 12 Mar 2013 14:45:44 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> wrote:
>
>> NeilBrown <neilb@suse.de> writes:
>> > On Tue, 12 Mar 2013 09:32:31 +1100 NeilBrown <neilb@suse.de> wrote:
>> >
>> >> On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
>> >> wrote:
>> >> 
>> >
>> >> > 
>> >> > I am attaching the test script I am running too. It was written by Eryu
>> >> > Guan.
>> >> 
>> >> Thanks for that.  I've tried using it but haven't managed to trigger a BUG
>> >> yet.  What size are the loop files?  I mostly use fairly small ones, but
>> >> maybe it needs to be bigger to trigger the problem.
>> >
>> > Shortly after I wrote that I got a bug-on!  It hasn't happened again though.
>> >
>> > This was using code without that latest patch I sent.  The bug was
>> > 		BUG_ON(s->uptodate != disks);
>> >
>> > in the check_state_compute_result case of handle_parity_checks5() which is
>> > probably the same cause as your most recent BUG.
>> >
>> > I've revised my thinking a bit and am now running with this patch which I
>> > think should fix a problem that probably caused the symptoms we have seen.
>> >
>> > If you could run your tests for a while too and see whether it will
>> > still crash
>> > for you, I'd really appreciate it.
>> 
>> Hi Neil,
>> 
>> Sorry I can't verify the line numbers of my old test since I managed to
>> mess up my git tree in the process :(
>> 
>> However running with this new patch I have just hit another but
>> different case. Looks like a deadlock.
>
> Your test setup is clearly different from mine.  I've been running all night
> without a single hiccup.
>
>> 
>> This is basically running ca64cae96037de16e4af92678814f5d4bf0c1c65 with
>> your patch applied on top, and nothing else.
>> 
>> If you want me to try a more uptodate Linus tree, please let me know.
>> 
>> Cheers,
>> Jes
>> 
>> 
>> [17635.205927] INFO: task mkfs.ext4:20060 blocked for more than 120 seconds.
>> [17635.213543] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [17635.222291] mkfs.ext4 D ffff880236814100 0 20060 20026 0x00000080
>> [17635.230199] ffff8801bc8bbb98 0000000000000082 ffff88022f0be540
>> ffff8801bc8bbfd8
>> [17635.238518] ffff8801bc8bbfd8 ffff8801bc8bbfd8 ffff88022d47b2a0
>> ffff88022f0be540
>> [17635.246837] ffff8801cea1f430 000000000001d5f0 ffff8801c7f4f430
>> ffff88022169a400
>> [17635.255161] Call Trace:
>> [17635.257891]  [<ffffffff81614f79>] schedule+0x29/0x70
>> [17635.263433]  [<ffffffffa0386ada>] make_request+0x6da/0x6f0 [raid456]
>> [17635.270525]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
>> [17635.276560]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
>> [17635.282884]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
>> [17635.289586]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
>> [17635.296393]  [<ffffffff812c7199>] submit_bio+0x79/0x160
>> [17635.302232]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
>> [17635.308844]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
>> [17635.315748]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
>> [17635.321877]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
>> [17635.327714]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
>> [17635.333745] [<ffffffff8128a19a>] ?
>> inode_has_perm.isra.30.constprop.60+0x2a/0x30
>> [17635.342103]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
>> [17635.348329]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
>> [17635.353972]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
>> [17635.361070]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
>
> There is a small race in the exclusion between discard and recovery.
> This patch on top should fix it (I hope).
> Thanks for testing.

Ok I spent most of yesterday running tests on this. With this additional
patch applied I haven't been able to reproduce the hang so far - without
it I could do it in about an hour, so I suspect it solves the problem.

Thanks!
Jes

* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-12 11:12             ` joystick
@ 2013-03-20  0:54               ` NeilBrown
  0 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2013-03-20  0:54 UTC (permalink / raw)
  To: joystick; +Cc: linux-raid

On Tue, 12 Mar 2013 12:12:46 +0100 joystick <joystick@shiftmail.org> wrote:

> On 03/12/13 02:32, NeilBrown wrote:
> > Subject: [PATCH] md/raid5: ensure sync and recovery don't happen at the same
> >   time.
> 
> Watch out for probable mistake

I assume you mean that it isn't 'sync and recovery' but rather 'sync and
DISCARD'?

Thanks for noticing that - fixed.

NeilBrown


* Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
  2013-03-14  7:35                 ` Jes Sorensen
@ 2013-03-20  0:55                   ` NeilBrown
  0 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2013-03-20  0:55 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid, Shaohua Li, Eryu Guan

On Thu, 14 Mar 2013 08:35:05 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
wrote:

> NeilBrown <neilb@suse.de> writes:
> > On Tue, 12 Mar 2013 14:45:44 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> > wrote:
> >
> >> NeilBrown <neilb@suse.de> writes:
> >> > On Tue, 12 Mar 2013 09:32:31 +1100 NeilBrown <neilb@suse.de> wrote:
> >> >
> >> >> On Wed, 06 Mar 2013 10:31:55 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> >> >> wrote:
> >> >> 
> >> >
> >> >> > 
> >> >> > I am attaching the test script I am running too. It was written by Eryu
> >> >> > Guan.
> >> >> 
> >> >> Thanks for that.  I've tried using it but haven't managed to trigger a BUG
> >> >> yet.  What size are the loop files?  I mostly use fairly small ones, but
> >> >> maybe it needs to be bigger to trigger the problem.
> >> >
> >> > Shortly after I wrote that I got a bug-on!  It hasn't happened again though.
> >> >
> >> > This was using code without that latest patch I sent.  The bug was
> >> > 		BUG_ON(s->uptodate != disks);
> >> >
> >> > in the check_state_compute_result case of handle_parity_checks5() which is
> >> > probably the same cause as your most recent BUG.
> >> >
> >> > I've revised my thinking a bit and am now running with this patch which I
> >> > think should fix a problem that probably caused the symptoms we have seen.
> >> >
> >> > If you could run your tests for a while too and see whether it will
> >> > still crash
> >> > for you, I'd really appreciate it.
> >> 
> >> Hi Neil,
> >> 
> >> Sorry I can't verify the line numbers of my old test since I managed to
> >> mess up my git tree in the process :(
> >> 
> >> However running with this new patch I have just hit another but
> >> different case. Looks like a deadlock.
> >
> > Your test setup is clearly different from mine.  I've been running all night
> > without a single hiccup.
> >
> >> 
> >> This is basically running ca64cae96037de16e4af92678814f5d4bf0c1c65 with
> >> your patch applied on top, and nothing else.
> >> 
> >> If you want me to try a more uptodate Linus tree, please let me know.
> >> 
> >> Cheers,
> >> Jes
> >> 
> >> 
> >> [17635.205927] INFO: task mkfs.ext4:20060 blocked for more than 120 seconds.
> >> [17635.213543] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [17635.222291] mkfs.ext4 D ffff880236814100 0 20060 20026 0x00000080
> >> [17635.230199] ffff8801bc8bbb98 0000000000000082 ffff88022f0be540
> >> ffff8801bc8bbfd8
> >> [17635.238518] ffff8801bc8bbfd8 ffff8801bc8bbfd8 ffff88022d47b2a0
> >> ffff88022f0be540
> >> [17635.246837] ffff8801cea1f430 000000000001d5f0 ffff8801c7f4f430
> >> ffff88022169a400
> >> [17635.255161] Call Trace:
> >> [17635.257891]  [<ffffffff81614f79>] schedule+0x29/0x70
> >> [17635.263433]  [<ffffffffa0386ada>] make_request+0x6da/0x6f0 [raid456]
> >> [17635.270525]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
> >> [17635.276560]  [<ffffffff814a6633>] md_make_request+0xc3/0x200
> >> [17635.282884]  [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
> >> [17635.289586]  [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
> >> [17635.296393]  [<ffffffff812c7199>] submit_bio+0x79/0x160
> >> [17635.302232]  [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
> >> [17635.308844]  [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
> >> [17635.315748]  [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
> >> [17635.321877]  [<ffffffff811cb971>] block_ioctl+0x41/0x50
> >> [17635.327714]  [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
> >> [17635.333745] [<ffffffff8128a19a>] ?
> >> inode_has_perm.isra.30.constprop.60+0x2a/0x30
> >> [17635.342103]  [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
> >> [17635.348329]  [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
> >> [17635.353972]  [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
> >> [17635.361070]  [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
> >
> > There is a small race in the exclusion between discard and recovery.
> > This patch on top should fix it (I hope).
> > Thanks for testing.
> 
> Ok I spent most of yesterday running tests on this. With this additional
> patch applied I haven't been able to reproduce the hang so far - without
> it I could do it in about an hour, so I suspect it solves the problem.
> 
> Thanks!
> Jes

Thanks.  I'll get these queued for Linus and -stable shortly.

NeilBrown

