* scsi_debug module deadlock on 3.17-rc2
@ 2014-08-30 20:56 Milan Broz
2014-08-30 23:14 ` Douglas Gilbert
0 siblings, 1 reply; 4+ messages in thread
From: Milan Broz @ 2014-08-30 20:56 UTC (permalink / raw)
To: dgilbert; +Cc: linux-scsi, Linux Kernel Mailing List
Hi,
I am using scsi_debug in cryptsetup testsuite and with recent 3.17-rc kernel
it deadlocks on rmmod of scsi_debug module.
For me even this simple reproducer causes deadlock:
modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1
DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)
mkfs -t ext4 $DEV
rmmod scsi_debug
(adding small delay before rmmod obviously helps here)
Bisect tracked it to commit
commit cbf67842c3d9e7af8ccc031332b79e88d9cca592
Author: Douglas Gilbert <dgilbert@interlog.com>
Date: Sat Jul 26 11:55:35 2014 -0400
scsi_debug: support scsi-mq, queues and locks
I guess that with the introduction of mq, del_timer_sync() must not be called
while queued_arr_lock is held.
(to me it looks like the situation described in the comment before
del_timer_sync() in kernel/time/timer.c...)
Here is the log (running on vmware VM and i686 arch):
[ 67.916472] scsi_debug: host protection
[ 67.916483] scsi host3: scsi_debug, version 1.84 [20140706], dev_size_mb=16, opts=0x0
[ 67.917446] scsi 3:0:0:0: Direct-Access Linux scsi_debug 0184 PQ: 0 ANSI: 5
[ 67.920539] sd 3:0:0:0: Attached scsi generic sg8 type 0
[ 67.940542] sd 3:0:0:0: [sdh] 32768 512-byte logical blocks: (16.7 MB/16.0 MiB)
[ 67.940548] sd 3:0:0:0: [sdh] 4096-byte physical blocks
[ 67.950705] sd 3:0:0:0: [sdh] Write Protect is off
[ 67.950715] sd 3:0:0:0: [sdh] Mode Sense: 73 00 10 08
[ 67.970514] sd 3:0:0:0: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 68.040566] sdh: unknown partition table
[ 68.090618] sd 3:0:0:0: [sdh] Attached SCSI disk
[ 68.799699] sdh: unknown partition table
[ 69.072314]
[ 69.072387] ======================================================
[ 69.072433] [ INFO: possible circular locking dependency detected ]
[ 69.072487] 3.17.0-rc2+ #80 Not tainted
[ 69.072518] -------------------------------------------------------
[ 69.072560] rmmod/2890 is trying to acquire lock:
[ 69.072595] ((sqcp->cmnd_timerp)){+.-...}, at: [<c10846c0>] del_timer_sync+0x0/0xb0
[ 69.072704]
[ 69.072704] but task is already holding lock:
[ 69.072743] (queued_arr_lock){..-...}, at: [<e1271887>] stop_all_queued+0x17/0xc0 [scsi_debug]
[ 69.072852]
[ 69.072852] which lock already depends on the new lock.
[ 69.072852]
[ 69.072902]
[ 69.072902] the existing dependency chain (in reverse order) is:
[ 69.072949]
[ 69.072949] -> #1 (queued_arr_lock){..-...}:
[ 69.073045] [<c1072689>] lock_acquire+0x59/0xa0
[ 69.073114] [<c1465cb1>] _raw_spin_lock_irqsave+0x31/0x70
[ 69.073438] [<e1271cf7>] sdebug_q_cmd_complete+0x27/0x190 [scsi_debug]
[ 69.073515] [<c108434b>] call_timer_fn+0x5b/0xd0
[ 69.073581] [<c10848d1>] run_timer_softirq+0x161/0x200
[ 69.073649] [<c103dbc9>] __do_softirq+0x119/0x230
[ 69.073726] [<c1003c77>] do_softirq_own_stack+0x27/0x30
[ 69.073811] [<c103de1e>] irq_exit+0x7e/0xa0
[ 69.073889] [<c102a893>] smp_apic_timer_interrupt+0x33/0x40
[ 69.073969] [<c14672ae>] apic_timer_interrupt+0x32/0x38
[ 69.074254] [<c100a529>] arch_cpu_idle+0x9/0x10
[ 69.074318] [<c10674ec>] cpu_startup_entry+0x22c/0x280
[ 69.074381] [<c145e1ec>] rest_init+0x9c/0xb0
[ 69.074441] [<c16799e0>] start_kernel+0x2e9/0x2ee
[ 69.074504] [<c16792ab>] i386_start_kernel+0x79/0x7d
[ 69.074567]
[ 69.074567] -> #0 ((sqcp->cmnd_timerp)){+.-...}:
[ 69.074794] [<c1071b94>] __lock_acquire+0x16e4/0x1c30
[ 69.074859] [<c1072689>] lock_acquire+0x59/0xa0
[ 69.074919] [<c10846e9>] del_timer_sync+0x29/0xb0
[ 69.074981] [<e12718fa>] stop_all_queued+0x8a/0xc0 [scsi_debug]
[ 69.075050] [<e1276f85>] scsi_debug_exit+0x16/0xac [scsi_debug]
[ 69.075117] [<c109a7bd>] SyS_delete_module+0xfd/0x180
[ 69.075181] [<c1466b2e>] syscall_after_call+0x0/0x4
[ 69.075243]
[ 69.075243] other info that might help us debug this:
[ 69.075243]
[ 69.075321] Possible unsafe locking scenario:
[ 69.075321]
[ 69.075380] CPU0 CPU1
[ 69.075424] ---- ----
[ 69.075468] lock(queued_arr_lock);
[ 69.075534] lock((sqcp->cmnd_timerp));
[ 69.075613] lock(queued_arr_lock);
[ 69.075690] lock((sqcp->cmnd_timerp));
[ 69.075758]
[ 69.075758] *** DEADLOCK ***
[ 69.075758]
[ 69.075827] 1 lock held by rmmod/2890:
[ 69.075867] #0: (queued_arr_lock){..-...}, at: [<e1271887>] stop_all_queued+0x17/0xc0 [scsi_debug]
[ 69.076009]
[ 69.076009] stack backtrace:
[ 69.076064] CPU: 1 PID: 2890 Comm: rmmod Not tainted 3.17.0-rc2+ #80
[ 69.076117] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
[ 69.076200] c1c93200 00000000 da25fe30 c146081f c1c93330 da25fe60 c145fbf6 c158bbfc
[ 69.076375] c158bb99 c158bb7c c158bb91 c158bb7c da25fe9c df319070 df319098 df319618
[ 69.076550] df3195f0 da25fecc c1071b94 df3195f0 c1c4a0b0 00000001 00000000 00000001
[ 69.076724] Call Trace:
[ 69.076760] [<c146081f>] dump_stack+0x4b/0x75
[ 69.076805] [<c145fbf6>] print_circular_bug+0x22d/0x23a
[ 69.076863] [<c1071b94>] __lock_acquire+0x16e4/0x1c30
[ 69.076922] [<c1072689>] lock_acquire+0x59/0xa0
[ 69.076968] [<c10846c0>] ? try_to_del_timer_sync+0x60/0x60
[ 69.077017] [<c10846e9>] del_timer_sync+0x29/0xb0
[ 69.077063] [<c10846c0>] ? try_to_del_timer_sync+0x60/0x60
[ 69.077113] [<e12718fa>] stop_all_queued+0x8a/0xc0 [scsi_debug]
[ 69.077166] [<e1276f85>] scsi_debug_exit+0x16/0xac [scsi_debug]
[ 69.077218] [<c109a7bd>] SyS_delete_module+0xfd/0x180
[ 69.077265] [<c1466b61>] ? restore_all+0xf/0xf
[ 69.077311] [<c12893ff>] ? __this_cpu_preempt_check+0xf/0x20
[ 69.077361] [<c106eab9>] ? trace_hardirqs_on_caller+0xe9/0x220
[ 69.077412] [<c1466b2e>] syscall_call+0x7/0x7
Milan
* Re: scsi_debug module deadlock on 3.17-rc2
  2014-08-30 20:56 scsi_debug module deadlock on 3.17-rc2 Milan Broz
@ 2014-08-30 23:14 ` Douglas Gilbert
  2014-08-31  9:49   ` Milan Broz
  0 siblings, 1 reply; 4+ messages in thread
From: Douglas Gilbert @ 2014-08-30 23:14 UTC (permalink / raw)
To: Milan Broz; +Cc: linux-scsi, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 3558 bytes --]

On 14-08-30 04:56 PM, Milan Broz wrote:
> Hi,
>
> I am using scsi_debug in cryptsetup testsuite and with recent 3.17-rc kernel
> it deadlocks on rmmod of scsi_debug module.
>
> For me even this simple reproducer causes deadlock:
> modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1
> DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)
> mkfs -t ext4 $DEV
> rmmod scsi_debug
>
> (adding small delay before rmmod obviously helps here)

So I used this slight variation for testing:

modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1 num_parts=1
DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)"1"
echo "mkfs -t ext4 ${DEV}"
mkfs -t ext4 ${DEV}
sleep 0.1
rmmod scsi_debug

> Bisect tracked it to commit
> commit cbf67842c3d9e7af8ccc031332b79e88d9cca592
> Author: Douglas Gilbert <dgilbert@interlog.com>
> Date: Sat Jul 26 11:55:35 2014 -0400
>     scsi_debug: support scsi-mq, queues and locks
>
> I guess that with the introduction of mq, del_timer_sync() must not be called
> while queued_arr_lock is held.
> (to me it looks like the situation described in the comment before
> del_timer_sync() in kernel/time/timer.c...)

Looks like something a lawyer would write.
> Here is the log (running on vmware VM and i686 arch):
>
> [ 67.916472] scsi_debug: host protection
> [ 67.916483] scsi host3: scsi_debug, version 1.84 [20140706], dev_size_mb=16, opts=0x0
> [ 67.917446] scsi 3:0:0:0: Direct-Access Linux scsi_debug 0184 PQ: 0 ANSI: 5
> [ 67.920539] sd 3:0:0:0: Attached scsi generic sg8 type 0
> [ 67.940542] sd 3:0:0:0: [sdh] 32768 512-byte logical blocks: (16.7 MB/16.0 MiB)
> [ 67.940548] sd 3:0:0:0: [sdh] 4096-byte physical blocks
> [ 67.950705] sd 3:0:0:0: [sdh] Write Protect is off
> [ 67.950715] sd 3:0:0:0: [sdh] Mode Sense: 73 00 10 08
> [ 67.970514] sd 3:0:0:0: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [ 68.040566] sdh: unknown partition table
> [ 68.090618] sd 3:0:0:0: [sdh] Attached SCSI disk
> [ 68.799699] sdh: unknown partition table
> [ 69.072314]
> [ 69.072387] ======================================================
> [ 69.072433] [ INFO: possible circular locking dependency detected ]
> [ 69.072487] 3.17.0-rc2+ #80 Not tainted
> [ 69.072518] -------------------------------------------------------
> [ 69.072560] rmmod/2890 is trying to acquire lock:
> [ 69.072595]  ((sqcp->cmnd_timerp)){+.-...}, at: [<c10846c0>] del_timer_sync+0x0/0xb0
> [ 69.072704]
> [ 69.072704] but task is already holding lock:
> [ 69.072743]  (queued_arr_lock){..-...}, at: [<e1271887>] stop_all_queued+0x17/0xc0 [scsi_debug]
> [ 69.072852]
> [ 69.072852] which lock already depends on the new lock.
> [ 69.072852]

<snip>

> [ 69.075321]  Possible unsafe locking scenario:
> [ 69.075321]
> [ 69.075380]        CPU0                    CPU1
> [ 69.075424]        ----                    ----
> [ 69.075468]   lock(queued_arr_lock);
> [ 69.075534]                                lock((sqcp->cmnd_timerp));
> [ 69.075613]                                lock(queued_arr_lock);
> [ 69.075690]   lock((sqcp->cmnd_timerp));
> [ 69.075758]
> [ 69.075758]  *** DEADLOCK ***

Interesting analysis, somewhat confusing because cmnd_timerp is a
pointer. Also my guess is the sqcp pointers in the two threads were
different.
Anyway the attached patch removes the lock(queued_arr_lock)
from around the del_timer calls. Could you try it and report
back.

Doug Gilbert

[-- Attachment #2: sdebug317rc2_dlock1.patch --]
[-- Type: text/x-patch, Size: 2029 bytes --]

--- a/drivers/scsi/scsi_debug.c	2014-08-26 13:24:51.646948507 -0400
+++ b/drivers/scsi/scsi_debug.c	2014-08-30 18:04:54.589226679 -0400
@@ -2743,6 +2743,13 @@ static int stop_queued_cmnd(struct scsi_
 		if (test_bit(k, queued_in_use_bm)) {
 			sqcp = &queued_arr[k];
 			if (cmnd == sqcp->a_cmnd) {
+				devip = (struct sdebug_dev_info *)
+					cmnd->device->hostdata;
+				if (devip)
+					atomic_dec(&devip->num_in_q);
+				sqcp->a_cmnd = NULL;
+				spin_unlock_irqrestore(&queued_arr_lock,
+						       iflags);
 				if (scsi_debug_ndelay > 0) {
 					if (sqcp->sd_hrtp)
 						hrtimer_cancel(
@@ -2755,18 +2762,13 @@ static int stop_queued_cmnd(struct scsi_
 					if (sqcp->tletp)
 						tasklet_kill(sqcp->tletp);
 				}
-				__clear_bit(k, queued_in_use_bm);
-				devip = (struct sdebug_dev_info *)
-					cmnd->device->hostdata;
-				if (devip)
-					atomic_dec(&devip->num_in_q);
-				sqcp->a_cmnd = NULL;
-				break;
+				clear_bit(k, queued_in_use_bm);
+				return 1;
 			}
 		}
 	}
 	spin_unlock_irqrestore(&queued_arr_lock, iflags);
-	return (k < qmax) ? 1 : 0;
+	return 0;
 }
 
 /* Deletes (stops) timers or tasklets of all queued commands */
@@ -2782,6 +2784,13 @@ static void stop_all_queued(void)
 		if (test_bit(k, queued_in_use_bm)) {
 			sqcp = &queued_arr[k];
 			if (sqcp->a_cmnd) {
+				devip = (struct sdebug_dev_info *)
+					sqcp->a_cmnd->device->hostdata;
+				if (devip)
+					atomic_dec(&devip->num_in_q);
+				sqcp->a_cmnd = NULL;
+				spin_unlock_irqrestore(&queued_arr_lock,
+						       iflags);
 				if (scsi_debug_ndelay > 0) {
 					if (sqcp->sd_hrtp)
 						hrtimer_cancel(
@@ -2794,12 +2803,8 @@ static void stop_all_queued(void)
 					if (sqcp->tletp)
 						tasklet_kill(sqcp->tletp);
 				}
-				__clear_bit(k, queued_in_use_bm);
-				devip = (struct sdebug_dev_info *)
-					sqcp->a_cmnd->device->hostdata;
-				if (devip)
-					atomic_dec(&devip->num_in_q);
-				sqcp->a_cmnd = NULL;
+				clear_bit(k, queued_in_use_bm);
+				spin_lock_irqsave(&queued_arr_lock, iflags);
 			}
 		}
 	}
* Re: scsi_debug module deadlock on 3.17-rc2
  2014-08-30 23:14 ` Douglas Gilbert
@ 2014-08-31  9:49   ` Milan Broz
  2014-08-31 15:40     ` Douglas Gilbert
  0 siblings, 1 reply; 4+ messages in thread
From: Milan Broz @ 2014-08-31 9:49 UTC (permalink / raw)
To: dgilbert; +Cc: linux-scsi, Linux Kernel Mailing List

On 08/31/2014 01:14 AM, Douglas Gilbert wrote:
> On 14-08-30 04:56 PM, Milan Broz wrote:
>> Hi,
>>
>> I am using scsi_debug in cryptsetup testsuite and with recent 3.17-rc kernel
>> it deadlocks on rmmod of scsi_debug module.
>>
>> For me even this simple reproducer causes deadlock:
>> modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1
>> DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)
>> mkfs -t ext4 $DEV
>> rmmod scsi_debug
>>
>> (adding small delay before rmmod obviously helps here)
>
> So I used this slight variation for testing:
>
> modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1 num_parts=1
> DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)"1"
> echo "mkfs -t ext4 ${DEV}"
> mkfs -t ext4 ${DEV}
> sleep 0.1
> rmmod scsi_debug
>
>> Bisect tracked it to commit
>> commit cbf67842c3d9e7af8ccc031332b79e88d9cca592
>> Author: Douglas Gilbert <dgilbert@interlog.com>
>> Date: Sat Jul 26 11:55:35 2014 -0400
>>     scsi_debug: support scsi-mq, queues and locks
>>
>> I guess that with the introduction of mq, del_timer_sync() must not be called
>> while queued_arr_lock is held.
>> (to me it looks like the situation described in the comment before
>> del_timer_sync() in kernel/time/timer.c...)
>
> Looks like something a lawyer would write.

:-)

...

> Anyway the attached patch removes the lock(queued_arr_lock)
> from around the del_timer calls. Could you try it and report
> back.

Yes, the patch fixes the deadlock.
Thanks!

Tested-and-reported-by: Milan Broz <gmazyland@gmail.com>

However, removing the device (rmmod scsi_debug) now takes up to 20 or more seconds.
(Nothing like flush buffers or udev settle seems to help...)

It is somehow related to udev events processing (killing udev process
will remove this delay).

I have full process trace (let me know if you want to send it),
this is just where rmmod is while waiting (note the delay in the end):

[ 132.918141] sd 3:0:0:0: [sdh] Attached SCSI disk
[ 137.476448] SysRq : Show State
...
[ 137.485711] kworker/0:2     D c16ce000     0   274      2 0x00000000
[ 137.485724] Workqueue: events_freezable_power_ disk_events_workfn
[ 137.485728]  dc915c2c 00000046 dc915bcc c16ce000 debad340 0000001f de15e250 c00530f0
[ 137.485741]  00000002 c00530f0 00000001 c00530f0 c0053698 dc915c0c c106e934 c1465cdd
[ 137.485753]  c00530f0 dc915c44 dc915c0c c128925f dc915c1c c106ea49 00000282 c1ff7d80
[ 137.485765] Call Trace:
[ 137.485772]  [<c106e934>] ? mark_held_locks+0x64/0x90
[ 137.485779]  [<c1465cdd>] ? _raw_spin_unlock_irqrestore+0x4d/0x70
[ 137.485785]  [<c128925f>] ? __this_cpu_preempt_check+0xf/0x20
[ 137.485792]  [<c106ea49>] ? trace_hardirqs_on_caller+0xe9/0x220
[ 137.485798]  [<c146194e>] schedule+0x1e/0x60
[ 137.485804]  [<c146520f>] schedule_timeout+0x11f/0x1b0
[ 137.485811]  [<c1084270>] ? internal_add_timer+0x60/0x60
[ 137.485817]  [<c106ea49>] ? trace_hardirqs_on_caller+0xe9/0x220
[ 137.485824]  [<c1461db6>] io_schedule_timeout+0x46/0x60
[ 137.485831]  [<c146230b>] wait_for_common_io.constprop.1+0x7b/0xd0
[ 137.485836]  [<c1059c60>] ? wake_up_process+0x40/0x40
[ 137.485843]  [<c1462368>] wait_for_completion_io_timeout+0x8/0x10
[ 137.485849]  [<c12614a1>] blk_execute_rq+0x81/0x100
[ 137.485856]  [<c14622a7>] ? wait_for_common_io.constprop.1+0x17/0xd0
[ 137.485863]  [<c12730f0>] ? cfq_exit_icq+0x40/0x40
[ 137.485870]  [<c125a2d5>] ? elv_set_request+0x15/0x30
[ 137.485877]  [<c125bd4a>] ? get_request+0x2ea/0x490
[ 137.485884]  [<c1066fd0>] ? __wake_up_sync+0x10/0x10
[ 137.485891]  [<c125bf59>] ? blk_get_request+0x69/0xd0
[ 137.485897]  [<c13312bc>] scsi_execute+0xfc/0x190
[ 137.485904]  [<c1331423>] scsi_execute_req_flags+0x53/0xb0
[ 137.485912]  [<c13319be>] scsi_test_unit_ready+0x5e/0x100
[ 137.485918]  [<c1349df3>] sd_check_events+0xe3/0x120
[ 137.485925]  [<c12697f7>] disk_check_events+0x37/0x100
[ 137.485932]  [<c12698d3>] disk_events_workfn+0x13/0x20
[ 137.485939]  [<c104d096>] process_one_work+0x156/0x390
[ 137.485945]  [<c104d043>] ? process_one_work+0x103/0x390
[ 137.485951]  [<c104d8f9>] worker_thread+0x39/0x440
[ 137.485958]  [<c104d8c0>] ? execute_in_process_context+0x70/0x70
[ 137.485965]  [<c10513f3>] kthread+0xa3/0xc0
[ 137.485970]  [<c106cc9d>] ? put_lock_stats.isra.21+0xd/0x30
[ 137.485976]  [<c106eb8b>] ? trace_hardirqs_on+0xb/0x10
[ 137.485982]  [<c14667c1>] ret_from_kernel_thread+0x21/0x30
[ 137.485989]  [<c1051350>] ? kthread_create_on_node+0x160/0x160
...
[ 137.493862] rmmod           D c16ce000     0  1434   1422 0x00000000
[ 137.493869]  da1bbca4 00000046 deb5e050 c16ce000 199fe46e 0000001f df11c190 deb5e050
[ 137.493881]  b02da1a3 da1bbcc0 c10708cb c1c0f9c8 00000004 00000000 c1c75a90 00000000
[ 137.493893]  0ac0a782 b02da1a3 00000003 deb5e670 00000001 deb5e050 deb5e050 deb5e670
[ 137.493905] Call Trace:
[ 137.493911]  [<c10708cb>] ? __lock_acquire+0x48b/0x1c30
[ 137.493917]  [<c146194e>] schedule+0x1e/0x60
[ 137.493924]  [<c146523d>] schedule_timeout+0x14d/0x1b0
[ 137.493930]  [<c106e934>] ? mark_held_locks+0x64/0x90
[ 137.493936]  [<c1465d22>] ? _raw_spin_unlock_irq+0x22/0x50
[ 137.493942]  [<c128925f>] ? __this_cpu_preempt_check+0xf/0x20
[ 137.493948]  [<c106ea49>] ? trace_hardirqs_on_caller+0xe9/0x220
[ 137.493954]  [<c106eb8b>] ? trace_hardirqs_on+0xb/0x10
[ 137.493960]  [<c1462156>] wait_for_common+0x86/0x110
[ 137.493965]  [<c1059c60>] ? wake_up_process+0x40/0x40
[ 137.493971]  [<c14621f2>] wait_for_completion+0x12/0x20
[ 137.493977]  [<c104c9bf>] flush_work+0x13f/0x1e0
[ 137.493982]  [<c104abe0>] ? work_busy+0x80/0x80
[ 137.493988]  [<c14620ea>] ? wait_for_common+0x1a/0x110
[ 137.493994]  [<c104d7a7>] __cancel_work_timer+0x67/0xf0
[ 137.494000]  [<c104d84d>] cancel_delayed_work_sync+0xd/0x10
[ 137.494006]  [<c126ad93>] disk_block_events+0x73/0x80
[ 137.494011]  [<c126ae32>] del_gendisk+0x22/0x1d0
[ 137.494017]  [<c1318de1>] ? device_del+0x131/0x180
[ 137.494022]  [<c134a929>] sd_remove+0x49/0xa0
[ 137.494029]  [<c131bc59>] __device_release_driver+0x49/0xb0
[ 137.494034]  [<c131bcda>] device_release_driver+0x1a/0x30
[ 137.494040]  [<c131b7cb>] bus_remove_device+0xbb/0xf0
[ 137.494045]  [<c1318d9c>] device_del+0xec/0x180
[ 137.494051]  [<c1337981>] __scsi_remove_device+0xb1/0xc0
[ 137.494058]  [<c13364f7>] scsi_forget_host+0x57/0x60
[ 137.494064]  [<c132c75e>] scsi_remove_host+0x5e/0xf0
[ 137.494074]  [<e10e4fdb>] sdebug_driver_remove+0x1b/0x80 [scsi_debug]
[ 137.494081]  [<c131bc59>] __device_release_driver+0x49/0xb0
[ 137.494086]  [<c131bcda>] device_release_driver+0x1a/0x30
[ 137.494092]  [<c131b7cb>] bus_remove_device+0xbb/0xf0
[ 137.494097]  [<c1318d9c>] device_del+0xec/0x180
[ 137.494103]  [<c1318e3b>] device_unregister+0xb/0x20
[ 137.494110]  [<e10e48bd>] sdebug_remove_adapter+0x4d/0x70 [scsi_debug]
[ 137.494118]  [<e10e8ffd>] scsi_debug_exit+0x6e/0xac [scsi_debug]
[ 137.494125]  [<c109a74d>] SyS_delete_module+0xfd/0x180
[ 137.494131]  [<c14669e1>] ? restore_all+0xf/0xf
[ 137.494137]  [<c128925f>] ? __this_cpu_preempt_check+0xf/0x20
[ 137.494143]  [<c106ea49>] ? trace_hardirqs_on_caller+0xe9/0x220
[ 137.494148]  [<c14669ae>] syscall_call+0x7/0x7
...
[ 137.496322] Showing all locks held in the system:
[ 137.496332] 2 locks held by kworker/0:2/274:
[ 137.496334]  #0:  ("events_freezable_power_efficient"){.+.+.+}, at: [<c104d043>] process_one_work+0x103/0x390
[ 137.496347]  #1:  ((&(&ev->dwork)->work)){+.+.+.}, at: [<c104d043>] process_one_work+0x103/0x390
[ 137.496360] 2 locks held by agetty/1395:
[ 137.496363]  #0:  (&tty->ldisc_sem){++++++}, at: [<c12d8c83>] tty_ldisc_ref_wait+0x13/0x40
[ 137.496374]  #1:  (&ldata->atomic_read_lock){+.+...}, at: [<c12d6c04>] n_tty_read+0x84/0x920
[ 137.496386] 3 locks held by bash/1416:
[ 137.496388]  #0:  (sb_writers#5){.+.+.+}, at: [<c10f4282>] vfs_write+0x152/0x1a0
[ 137.496399]  #1:  (rcu_read_lock){......}, at: [<c12dc9d0>] __handle_sysrq+0x0/0x150
[ 137.496409]  #2:  (tasklist_lock){.+.+..}, at: [<c106d636>] debug_show_all_locks+0x36/0x1b0
[ 137.496420] 4 locks held by rmmod/1434:
[ 137.496423]  #0:  (&dev->mutex){......}, at: [<c131bcd3>] device_release_driver+0x13/0x30
[ 137.496433]  #1:  (&shost->scan_mutex){+.+.+.}, at: [<c132c714>] scsi_remove_host+0x14/0xf0
[ 137.496443]  #2:  (&dev->mutex){......}, at: [<c131bcd3>] device_release_driver+0x13/0x30
[ 137.496453]  #3:  (&ev->block_mutex){+.+...}, at: [<c126ad42>] disk_block_events+0x22/0x80
[ 137.496464]
[ 137.496467] =============================================
[ 137.496467]
[ 164.035571] sd 3:0:0:0: [sdh] Synchronizing SCSI cache

Milan
* Re: scsi_debug module deadlock on 3.17-rc2
  2014-08-31  9:49 ` Milan Broz
@ 2014-08-31 15:40   ` Douglas Gilbert
  0 siblings, 0 replies; 4+ messages in thread
From: Douglas Gilbert @ 2014-08-31 15:40 UTC (permalink / raw)
To: Milan Broz; +Cc: linux-scsi, Linux Kernel Mailing List

On 14-08-31 05:49 AM, Milan Broz wrote:
> On 08/31/2014 01:14 AM, Douglas Gilbert wrote:
>> On 14-08-30 04:56 PM, Milan Broz wrote:
>>> Hi,
>>>
>>> I am using scsi_debug in cryptsetup testsuite and with recent 3.17-rc kernel
>>> it deadlocks on rmmod of scsi_debug module.
>>>
>>> For me even this simple reproducer causes deadlock:
>>> modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1
>>> DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)
>>> mkfs -t ext4 $DEV
>>> rmmod scsi_debug
>>>
>>> (adding small delay before rmmod obviously helps here)
>>
>> So I used this slight variation for testing:
>>
>> modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1 num_parts=1
>> DEV="/dev/"$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)"1"
>> echo "mkfs -t ext4 ${DEV}"
>> mkfs -t ext4 ${DEV}
>> sleep 0.1
>> rmmod scsi_debug
>>
>>> Bisect tracked it to commit
>>> commit cbf67842c3d9e7af8ccc031332b79e88d9cca592
>>> Author: Douglas Gilbert <dgilbert@interlog.com>
>>> Date: Sat Jul 26 11:55:35 2014 -0400
>>>     scsi_debug: support scsi-mq, queues and locks
>>>
>>> I guess that with the introduction of mq, del_timer_sync() must not be called
>>> while queued_arr_lock is held.
>>> (to me it looks like the situation described in the comment before
>>> del_timer_sync() in kernel/time/timer.c...)
>>
>> Looks like something a lawyer would write.
>
> :-)
>
> ...
>
>> Anyway the attached patch removes the lock(queued_arr_lock)
>> from around the del_timer calls. Could you try it and report
>> back.
>
> Yes, the patch fixes the deadlock.
> Thanks!
>
> Tested-and-reported-by: Milan Broz <gmazyland@gmail.com>
>
> However, removing the device (rmmod scsi_debug) now takes up to 20 or more seconds.
> (Nothing like flush buffers or udev settle seems to help...)

I did not see that in my tests with your script; perhaps I need to
increase the 'delay=<jiffies>' argument to scsi_debug and run rmmod
in a loop until it succeeds. And are you saying that in lk 3.16 and
earlier you did not observe a 20 second delay?

Now, what your test is doing is telling a low level driver at the
bottom of the scsi stack to shut down before all the WRITE commands
associated with the mkfs may have completed, and perhaps before the
upper layers have assumed they are completed. The default timeout on
those WRITE commands (set in the upper layers) may well be the 20 or
more seconds you are observing.

As far as I can see the scsi_debug driver has two options when rmmod
is issued:
1) shut down everything, in as orderly a fashion as possible (with no
   deadlocks). This may mean deleting the timers associated with
   "in play" commands and forgetting about them. This will cause pain
   for the upper levels but they should be able to cope.
2) if there are no scsi commands "in play" then shut down as in 1).
   Otherwise reject any new commands and create a work queue to handle
   the remaining "in play" commands. When they are completed, do 1).

The scsi_debug driver takes the simpler approach and does 1) ***.
This also has the advantage of exercising the error handling (EH)
logic in the scsi mid-layer. And it may not be that unrealistic,
perhaps mimicking a real HBA driver whose firmware has crashed, as
seen from the upper layers.

Also, for drivers that do 2), how do they cope with
'rmmod scsi_debug; modprobe scsi_debug ...'? Perhaps an EBUSY on that
modprobe if the rmmod hasn't really finished?

> It is somehow related to udev events processing (killing udev process
> will remove this delay).
>
> I have full process trace (let me know if you want to send it),

Yes, please send me the full trace. I will probably be talking to the
scsi EH maintainer about it. Perhaps the scsi_debug driver needs to
make more calls back to the scsi mid-layer (its parent) in this
situation.

Doug Gilbert

*** there is no change in lk 3.17 to scsi_debug's shutdown policy
compared to 3.16 and earlier
end of thread, other threads: [~2014-08-31 15:40 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-30 20:56 scsi_debug module deadlock on 3.17-rc2 Milan Broz
2014-08-30 23:14 ` Douglas Gilbert
2014-08-31  9:49   ` Milan Broz
2014-08-31 15:40     ` Douglas Gilbert