From: Michael Reed <mdr@sgi.com>
To: linux-scsi <linux-scsi@vger.kernel.org>
Cc: James Smart <James.Smart@Emulex.Com>,
Andrew Vasquez <andrew.vasquez@qlogic.com>,
Jeremy Higdon <jeremy@sgi.com>
Subject: 2.6.31 - scsi scanning / target deletion deadlock
Date: Wed, 28 Oct 2009 15:18:22 -0500
Message-ID: <4AE8A70E.9070504@sgi.com>
Hi All,
I encountered the following deadlock on the Scsi_Host's scan_lock.
Target device glitches caused the qla2xxx driver to delete, and later
attempt to re-add, a scsi device. (Sorry, I cannot reconstruct the
exact sequence of events.)
* scsi_wq_3 is executing a scan on host 3 and holds the host's scan_lock.
  I/O has been queued to target3:0:0 on rport 0xe00000b0f02d6c20.

* qla2xxx_3_dpc is changing rport roles on rport 0xe00000b0f02d6c20. Until
  this completes, the work on scsi_wq_3 cannot progress. The change in
  rport roles results in a call to flush the target delete work on fc_wq_3.

* fc_wq_3 is trying to remove scsi target 0xe0000030f5e86488 on rport
  0xe0000030f1f432d0 and needs to acquire the scan_lock held by scsi_wq_3.

So each thread is waiting on the next: scsi_wq_3 -> qla2xxx_3_dpc ->
fc_wq_3 -> scsi_wq_3.
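To make the cycle explicit, here is a small sketch (plain Python, nothing read from the machine) that encodes the three dependencies above as a wait-for graph and searches it for a loop. The thread names come from the kdb tracebacks; the edges are my reading of the blockage, not anything derived automatically:

```python
# Wait-for graph built from the three tracebacks below.
# An edge A -> B means "thread A is blocked waiting on thread B".
waits_on = {
    "scsi_wq_3":     ["qla2xxx_3_dpc"],  # scan i/o stalled behind the rport role change
    "qla2xxx_3_dpc": ["fc_wq_3"],        # fc_remote_port_rolechg flushes fc_wq_3
    "fc_wq_3":       ["scsi_wq_3"],      # scsi_remove_target needs scan_lock
}

def find_cycle(graph):
    """Return one cycle as a list of node names (first == last), or None."""
    def dfs(node, path, seen):
        if node in path:                      # closed a loop
            return path[path.index(node):] + [node]
        if node in seen:                      # already explored, no cycle via here
            return None
        seen.add(node)
        for nxt in graph.get(node, ()):
            cycle = dfs(nxt, path + [node], seen)
            if cycle:
                return cycle
        return None

    seen = set()
    for start in graph:
        cycle = dfs(start, [], seen)
        if cycle:
            return cycle
    return None

print(" -> ".join(find_cycle(waits_on)))
# prints: scsi_wq_3 -> qla2xxx_3_dpc -> fc_wq_3 -> scsi_wq_3
```

Every thread in the graph is on the loop, so no thread can make progress; breaking any one edge (e.g. not holding scan_lock across the scan i/o) would break the deadlock.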
Perhaps scan_lock is held at too coarse a granularity?
Would anyone have any thoughts on how best to eliminate this deadlock?
Thanks,
Mike
[0]kdb> btp 3790
Stack traceback for pid 3790
0xe0000034f5d30000 3790 2 0 1 D 0xe0000034f5d30570 fc_wq_3
0xa0000001007280a0 schedule+0x14e0
args (0x4000, 0x0, 0x0, 0xa000000100729720, 0x813, 0xe0000034f5d3fdb0, 0x1111111111111111, 0x0, 0x1010095a6000)
0xa000000100729840 __mutex_lock_slowpath+0x320
args (0xe0000034f4f24cf0, 0xe0000034f5d30000, 0x10095a6010, 0xe0000034f4f24cf4, 0xe0000034f4f24cf8, 0xa0000001011c2600, 0xa0000001011c1cb0, 0x7ffff00)
0xa000000100729ad0 mutex_lock+0x30
args (0xe0000034f4f24d08, 0xa000000100471d30, 0x286, 0x10095a6010)
0xa000000100471d30 scsi_remove_device+0x30
args (0xe0000030f5ea57a8, 0xe0000034f4f24cf0, 0xa000000100471f40, 0x48b, 0xe0000034f4f24c90)
0xa000000100471f40 __scsi_remove_target+0x180
args (0xe0000030f5e86488, 0xe0000030f5ea57a8, 0xe0000034f4f24c90, 0xe0000034f4f24ce8, 0xe0000030f5e865f0, 0xe0000030f5e865ec, 0xa000000100472120, 0x205, 0xa00000010096c950)
0xa000000100472120 __remove_child+0x40
args (0xe0000030f5e864b0, 0xa0000001004152c0, 0x389, 0x0)
0xa0000001004152c0 device_for_each_child+0x80
args (0xe0000030f1f43338, 0x0, 0xa00000010096c200, 0x0, 0xa0000001004720b0, 0x288, 0xa0000001013a6540)
0xa0000001004720b0 scsi_remove_target+0x90
args (0xe0000030f1f43330, 0xe0000030f1f43330, 0xa000000100485630, 0x205, 0xa0000001013a6540)
0xa000000100485630 fc_starget_delete+0x30
args (0xe0000030f1f43528, 0xa0000001000cbd00, 0x50e, 0xa0000001000cbb80)
0xa0000001000cbd00 worker_thread+0x2a0
args (0xe0000034f7f1b098, 0xa00000010096cec0, 0xe0000034f7f1b0a0, 0xe0000034f7f1b0c8, 0xe0000034f7f1b0a0, 0xe0000034f7f1b0b0, 0xffffffffbfffffff, 0xa0000001000d5bb0, 0x389)
0xa0000001000d5bb0 kthread+0x110
args (0xe00000b073a1fcf8, 0xe0000034f5d3fe18, 0xe0000034f7f1b098, 0xa00000010096f650, 0xa000000100014a30, 0x286, 0xa0000001013a6540)
0xa000000100014a30 kernel_thread_helper+0xd0
args (0xa00000010096ffd0, 0xe00000b073a1fcf8, 0xa00000010000a4c0, 0x2, 0xa0000001013a6540)
0xa00000010000a4c0 start_kernel_thread+0x20
args (0xa00000010096ffd0, 0xe00000b073a1fcf8)
[0]kdb> btp 3789
Stack traceback for pid 3789
0xe0000034f5b50000 3789 2 0 1 D 0xe0000034f5b50570 scsi_wq_3
0xa0000001007280a0 schedule+0x14e0
args (0xe0000034f4ec7008, 0xe0000034f4f24d70, 0xe0000034f4ec6fe8, 0xe0000034f3669508, 0xe0000034f3669500, 0xe0000034f3669508, 0xe0000034f36694f8, 0xa0000001011c2cd0, 0x1010095a6000)
0xa000000100728640 schedule_timeout+0x40
args (0x7fffffffffffffff, 0x0, 0x0, 0xe0000034f64a6928, 0xa000000100726840, 0x50d, 0xe0000034f4ec7000)
0xa000000100726840 wait_for_common+0x1a0
args (0xe0000034f5b5fce0, 0x7fffffffffffffff, 0x2, 0xe0000034f5b5fce8, 0xe0000034f5b50000, 0xe0000034f5b5fce8, 0xa000000100726ba0, 0x207, 0xa0000001013a6540)
0xa000000100726ba0 wait_for_completion+0x40
args (0xe0000034f5b5fce0, 0xa0000001002b8460, 0x48e, 0x1)
0xa0000001002b8460 blk_execute_rq+0x140
args (0xe0000034f36692d0, 0x0, 0xe000003441024250, 0x1, 0xa0000001002b7b60, 0xe000003441024360, 0xa0000001002b8510, 0x38b, 0xe000003441024300)
0xa0000001002b8510 scsi_execute_rq+0x30
args (0xe0000034f36692d0, 0xe0000034f4ec6fb8, 0xe000003441024250, 0x1, 0xa000000100469050, 0x713, 0x713)
0xa000000100469050 scsi_execute+0x190
args (0xe0000034f4ec6fb8, 0xe000003441024250, 0xe0000034f03ec500, 0x1000, 0xe000003440f3e278, 0x5dc, 0x3, 0x4000000)
0xa000000100469200 scsi_execute_req+0xe0
args (0xe0000034f4ec6fb8, 0xe0000034f5b5fd8c, 0x2, 0xe0000034f03ec500, 0x1000, 0xe0000034f5b5fd84, 0x5dc, 0x3, 0xe000003440f3e278)
0xa00000010046da70 __scsi_scan_target+0x530
args (0x0, 0x0, 0x1000, 0xe0000034f03ec500, 0x1, 0xe0000034f4ec6fb8, 0xe0000030f14b55e0, 0xa0000001011c2cd0, 0xe0000034f5b5fd70)
0xa00000010046f000 scsi_scan_target+0x120
args (0xe00000b0f02d6c80, 0x0, 0x0, 0xffffffffffffffff, 0x1, 0xe0000034f4f24c90, 0xe0000034f4f24cf0, 0xa000000100485c20, 0x28a)
0xa000000100485c20 fc_scsi_scan_rport+0x140
args (0xe00000b0f02d6c20, 0xe0000034f4f24ce8, 0xa0000001000cbd00, 0x50e, 0x50e)
0xa0000001000cbd00 worker_thread+0x2a0
args (0xe0000034f7f1ada0, 0xa00000010096ceb0, 0xe0000034f7f1ada8, 0xe0000034f7f1add0, 0xe0000034f7f1ada8, 0xe0000034f7f1adb8, 0xffffffffbfffffff, 0xa0000001000d5bb0, 0x389)
0xa0000001000d5bb0 kthread+0x110
args (0xe00000b073a1fd18, 0xe0000034f5b5fe18, 0xe0000034f7f1ada0, 0xa00000010096f650, 0xa000000100014a30, 0x286, 0xa0000001013a6540)
0xa000000100014a30 kernel_thread_helper+0xd0
args (0xa00000010096ffd0, 0xe00000b073a1fd18, 0xa00000010000a4c0, 0x2, 0xa0000001013a6540)
0xa00000010000a4c0 start_kernel_thread+0x20
args (0xa00000010096ffd0, 0xe00000b073a1fd18)
[0]kdb> btp 3788
Stack traceback for pid 3788
0xe0000034f3e40000 3788 2 0 0 D 0xe0000034f3e40570 qla2xxx_3_dpc
0xa0000001007280a0 schedule+0x14e0
args (0x0, 0x1, 0xf, 0x43, 0xa000000100f6d300, 0x0, 0x0, 0xa0000001011e5c80, 0x1010095a6000)
0xa000000100728640 schedule_timeout+0x40
args (0x7fffffffffffffff, 0x0, 0x0, 0xa0000001000cc150, 0xa000000100726840, 0x50d, 0xe0000034f7f1b0b0)
0xa000000100726840 wait_for_common+0x1a0
args (0xe0000034f3e4fd00, 0x7fffffffffffffff, 0x2, 0xe0000034f3e4fd08, 0xe0000034f3e40000, 0xe0000034f3e4fd08, 0xa000000100726ba0, 0x207, 0xe0000034f7f1b0b0)
0xa000000100726ba0 wait_for_completion+0x40
args (0xe0000034f3e4fd00, 0xa0000001000cc390, 0x288, 0xa0000001000cc350)
0xa0000001000cc390 flush_cpu_workqueue+0x110
args (0xe0000034f7f1b098, 0x1, 0xa0000001000cc750, 0x38a, 0xe0000034f7f1b458)
0xa0000001000cc750 flush_workqueue+0x90
args (0xe0000034f5c68140, 0x0, 0xa0000001007c13a8, 0xa000000100bd0200, 0xa000000100483850, 0x206, 0x4000)
0xa000000100483850 fc_flush_work+0xb0
args (0xe0000034f4f24c90, 0xa000000100483b70, 0x48b, 0xe0000034f4f24ce0)
0xa000000100483b70 fc_remote_port_rolechg+0x2f0
args (0xe00000b0f02d6c20, 0x1, 0xe00000b0f02d6c68, 0xe0000034f4f24ce8, 0xe0000030f442a608, 0xe0000034f4f24c90, 0xa000000206fdfa20, 0x38f, 0xe0000034f7d4d0c8)
0xa000000206fdfa20 [qla2xxx]qla2x00_update_fcport+0x880
args (0xe00000b0f02d6c20, 0xe0000030f442a5b0, 0xe0000034f62131c8, 0xe0000030f442a5c0, 0xa000000206fdfc00, 0x38c, 0xa00000020700d058)
0xa000000206fdfc00 [qla2xxx]qla2x00_fabric_dev_login+0x160
args (0xe0000034f4f250a8, 0xe0000030f442a5b0, 0x0, 0xe0000034f62131c8, 0xa000000206fe2900, 0x1634, 0xa00000020700d058)
0xa000000206fe2900 [qla2xxx]qla2x00_configure_loop+0x2cc0
args (0xe0000034f4f250a8, 0xe0000030f442a5b0, 0xe0000034f4f251a4, 0xe0000034f3e4fd88, 0x300000000, 0x0, 0x1000, 0xe0000034f4f25108, 0xe000003440efdda2)
0xa000000206fe32b0 [qla2xxx]qla2x00_loop_resync+0x1b0
args (0xe0000034f4f250a8, 0xe0000034f4f25108, 0x0, 0xfe, 0xe0000034f56dc000, 0xe0000034f4f251a4, 0xe0000034f4f2511c, 0xe0000034f4f25104, 0xe0000034f7f1bab0)
0xa000000206fd6d40 [qla2xxx]qla2x00_do_dpc+0x9a0
args (0xe0000034f62131c8, 0x1, 0xe0000034f3e4fe00, 0xe0000034f4f25108, 0xe0000034f4f250a8, 0xe0000034f3e4fe00, 0xe0000034f4f250e8, 0xa000000207048958, 0xe0000034f4f250c8)
0xa0000001000d5bb0 kthread+0x110
args (0xe00000b073a1fd28, 0xe0000034f3e4fe18, 0xe0000034f62131c8, 0xa00000020703cfd8, 0xa000000100014a30, 0x286, 0xa0000001013a6540)
0xa000000100014a30 kernel_thread_helper+0xd0
args (0xa00000010096ffd0, 0xe00000b073a1fd28, 0xa00000010000a4c0, 0x2, 0xa0000001013a6540)
0xa00000010000a4c0 start_kernel_thread+0x20
args (0xa00000010096ffd0, 0xe00000b073a1fd28)
Thread overview (3 messages):
2009-10-28 20:18 Michael Reed [this message]
2010-03-23 20:28 ` [RFC PATCH] fc_transport: reduce scan_mutex contention. (was: Re: 2.6.31 - scsi scanning / target deletion deadlock) Andrew Vasquez
2010-03-24 19:12   ` [RFC PATCH] fc_transport: reduce scan_mutex contention Michael Reed