linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] md: avoid deadlock when md_set_badblocks.
@ 2013-10-12  5:10 ycbzzjlby
  2013-10-21 23:30 ` NeilBrown
  0 siblings, 1 reply; 2+ messages in thread
From: ycbzzjlby @ 2013-10-12  5:10 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, linux-kernel, majianpeng, Bian Yu

From: Bian Yu <bianyu@kedacom.com>

When operate harddisk and hit errors, md_set_badblocks is called after
scsi_restart_operations which already disabled the irq. but md_set_badblocks
will call write_sequnlock_irq and enable irq. so softirq can preempt the
current thread and that may cause a deadlock. I think this situation should
use write_sequnlock_irqsave/irqrestore instead.

I met the situation and the call trace is below:
[  638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010
[  638.921923]  lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0
[  638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37
[  638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013
[  638.927816]  ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007
[  638.929829]  ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8
[  638.931848]  ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8
[  638.933884] Call Trace:
[  638.935867]  <IRQ>  [<ffffffff8172ff85>] dump_stack+0x55/0x76
[  638.937878]  [<ffffffff81730030>] spin_dump+0x8a/0x8f
[  638.939861]  [<ffffffff81730056>] spin_bug+0x21/0x26
[  638.941836]  [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0
[  638.943801]  [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80
[  638.945747]  [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0
[  638.947672]  [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50
[  638.949595]  [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0
[  638.951504]  [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0
[  638.953388]  [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140
[  638.955248]  [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90
[  638.957116]  [<ffffffff8104fddd>] __do_softirq+0xfd/0x330
[  638.958987]  [<ffffffff810b964f>] ? __lock_release+0x6f/0x100
[  638.960861]  [<ffffffff8174a5cc>] call_softirq+0x1c/0x30
[  638.962724]  [<ffffffff81004c7d>] do_softirq+0x8d/0xc0
[  638.964565]  [<ffffffff8105024e>] irq_exit+0x10e/0x150
[  638.966390]  [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60
[  638.968223]  [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80
[  638.970079]  <EOI>  [<ffffffff810b964f>] ? __lock_release+0x6f/0x100
[  638.971899]  [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50
[  638.973691]  [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50
[  638.975475]  [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0
[  638.977243]  [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80
[  638.978988]  [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456]
[  638.980723]  [<ffffffff811b5a1d>] bio_endio+0x1d/0x40
[  638.982463]  [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0
[  638.984214]  [<ffffffff81306b9f>] blk_update_request+0x7f/0x350
[  638.985967]  [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90
[  638.987710]  [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50
[  638.989439]  [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30
[  638.991149]  [<ffffffff81308746>] blk_peek_request+0x106/0x250
[  638.992861]  [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130
[  638.994561]  [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0
[  638.996251]  [<ffffffff813040a7>] __blk_run_queue+0x37/0x50
[  638.997900]  [<ffffffff813045af>] blk_run_queue+0x2f/0x50
[  638.999553]  [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0
[  639.001185]  [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40
[  639.002798]  [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200
[  639.004391]  [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0
[  639.005996]  [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0
[  639.007600]  [<ffffffff81072f6b>] kthread+0xdb/0xe0
[  639.009205]  [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170
[  639.010821]  [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0
[  639.012437]  [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170

Signed-off-by: Bian Yu <bianyu@kedacom.com>
Reviewed-by: Jianpeng Ma <majianpeng@gmail.com>
---
 drivers/md/md.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8a0d762..2445fec 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8111,6 +8111,7 @@ static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors,
 	u64 *p;
 	int lo, hi;
 	int rv = 1;
+	unsigned long flags;
 
 	if (bb->shift < 0)
 		/* badblocks are disabled */
@@ -8125,7 +8126,7 @@ static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors,
 		sectors = next - s;
 	}
 
-	write_seqlock_irq(&bb->lock);
+	write_seqlock_irqsave(&bb->lock, flags);
 
 	p = bb->page;
 	lo = 0;
@@ -8241,7 +8242,7 @@ static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors,
 	bb->changed = 1;
 	if (!acknowledged)
 		bb->unacked_exist = 1;
-	write_sequnlock_irq(&bb->lock);
+	write_sequnlock_irqrestore(&bb->lock, flags);
 
 	return rv;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] md: avoid deadlock when md_set_badblocks.
  2013-10-12  5:10 [PATCH] md: avoid deadlock when md_set_badblocks ycbzzjlby
@ 2013-10-21 23:30 ` NeilBrown
  0 siblings, 0 replies; 2+ messages in thread
From: NeilBrown @ 2013-10-21 23:30 UTC (permalink / raw)
  To: ycbzzjlby; +Cc: linux-raid, linux-kernel, majianpeng, Bian Yu

[-- Attachment #1: Type: text/plain, Size: 5351 bytes --]

On Sat, 12 Oct 2013 01:10:03 -0400 ycbzzjlby@gmail.com wrote:

> From: Bian Yu <bianyu@kedacom.com>
> 
> When operate harddisk and hit errors, md_set_badblocks is called after
> scsi_restart_operations which already disabled the irq. but md_set_badblocks
> will call write_sequnlock_irq and enable irq. so softirq can preempt the
> current thread and that may cause a deadlock. I think this situation should
> use write_sequnlock_irqsave/irqrestore instead.
> 
> I met the situation and the call trace is below:
> [  638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010
> [  638.921923]  lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0
> [  638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37
> [  638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013
> [  638.927816]  ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007
> [  638.929829]  ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8
> [  638.931848]  ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8
> [  638.933884] Call Trace:
> [  638.935867]  <IRQ>  [<ffffffff8172ff85>] dump_stack+0x55/0x76
> [  638.937878]  [<ffffffff81730030>] spin_dump+0x8a/0x8f
> [  638.939861]  [<ffffffff81730056>] spin_bug+0x21/0x26
> [  638.941836]  [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0
> [  638.943801]  [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80
> [  638.945747]  [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0
> [  638.947672]  [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50
> [  638.949595]  [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0
> [  638.951504]  [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0
> [  638.953388]  [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140
> [  638.955248]  [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90
> [  638.957116]  [<ffffffff8104fddd>] __do_softirq+0xfd/0x330
> [  638.958987]  [<ffffffff810b964f>] ? __lock_release+0x6f/0x100
> [  638.960861]  [<ffffffff8174a5cc>] call_softirq+0x1c/0x30
> [  638.962724]  [<ffffffff81004c7d>] do_softirq+0x8d/0xc0
> [  638.964565]  [<ffffffff8105024e>] irq_exit+0x10e/0x150
> [  638.966390]  [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60
> [  638.968223]  [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80
> [  638.970079]  <EOI>  [<ffffffff810b964f>] ? __lock_release+0x6f/0x100
> [  638.971899]  [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50
> [  638.973691]  [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50
> [  638.975475]  [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0
> [  638.977243]  [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80
> [  638.978988]  [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456]
> [  638.980723]  [<ffffffff811b5a1d>] bio_endio+0x1d/0x40
> [  638.982463]  [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0
> [  638.984214]  [<ffffffff81306b9f>] blk_update_request+0x7f/0x350
> [  638.985967]  [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90
> [  638.987710]  [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50
> [  638.989439]  [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30
> [  638.991149]  [<ffffffff81308746>] blk_peek_request+0x106/0x250
> [  638.992861]  [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130
> [  638.994561]  [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0
> [  638.996251]  [<ffffffff813040a7>] __blk_run_queue+0x37/0x50
> [  638.997900]  [<ffffffff813045af>] blk_run_queue+0x2f/0x50
> [  638.999553]  [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0
> [  639.001185]  [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40
> [  639.002798]  [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200
> [  639.004391]  [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0
> [  639.005996]  [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0
> [  639.007600]  [<ffffffff81072f6b>] kthread+0xdb/0xe0
> [  639.009205]  [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170
> [  639.010821]  [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0
> [  639.012437]  [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170
> 
> Signed-off-by: Bian Yu <bianyu@kedacom.com>
> Reviewed-by: Jianpeng Ma <majianpeng@gmail.com>
> ---
>  drivers/md/md.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 8a0d762..2445fec 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -8111,6 +8111,7 @@ static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors,
>  	u64 *p;
>  	int lo, hi;
>  	int rv = 1;
> +	unsigned long flags;
>  
>  	if (bb->shift < 0)
>  		/* badblocks are disabled */
> @@ -8125,7 +8126,7 @@ static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors,
>  		sectors = next - s;
>  	}
>  
> -	write_seqlock_irq(&bb->lock);
> +	write_seqlock_irqsave(&bb->lock, flags);
>  
>  	p = bb->page;
>  	lo = 0;
> @@ -8241,7 +8242,7 @@ static int md_set_badblocks(struct badblocks *bb, sector_t s, int sectors,
>  	bb->changed = 1;
>  	if (!acknowledged)
>  		bb->unacked_exist = 1;
> -	write_sequnlock_irq(&bb->lock);
> +	write_sequnlock_irqrestore(&bb->lock, flags);
>  
>  	return rv;
>  }


Applied, thanks.
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2013-10-21 23:30 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-12  5:10 [PATCH] md: avoid deadlock when md_set_badblocks ycbzzjlby
2013-10-21 23:30 ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).