From: Boaz Harrosh <bharrosh@panasas.com>
To: Christof Schmitt <christof.schmitt@de.ibm.com>
Cc: Jeff Garzik <jeff@garzik.org>,
James Bottomley <James.Bottomley@suse.de>,
linux-scsi@vger.kernel.org, linux-s390@vger.kernel.org,
schwidefsky@de.ibm.com, heiko.carstens@de.ibm.com,
"Nicholas A. Bellinger" <nab@linux-iscsi.org>
Subject: Re: [patch 5/5] zfcp: Issue FCP command without holding SCSI host_lock
Date: Wed, 17 Nov 2010 16:35:52 +0200 [thread overview]
Message-ID: <4CE3E848.3010008@panasas.com> (raw)
In-Reply-To: <20101117141024.GA3589@schmichrtp.ibm.com>
On 11/17/2010 04:10 PM, Christof Schmitt wrote:
> On Wed, Nov 17, 2010 at 03:43:34PM +0200, Boaz Harrosh wrote:
>> On 11/17/2010 03:23 PM, Christof Schmitt wrote:
>>> From: Christof Schmitt <christof.schmitt@de.ibm.com>
>>>
>>> Interrupting the connection to the FCP channel while I/O requests are
>>> being issues can lead to this deadlock. scsi_dispatch_cmd already
>>> holds the host_lock while the recovery trigger tries to acquire the
>>> host_lock again when iterating through the scsi_devices.
>>>
>>> INFO: lockdep is turned off.
>>> BUG: spinlock lockup on CPU#1, blast/9660, 0000000078f38878
>>> CPU: 1 Not tainted 2.6.35.7SWEN2 #2
>>> Process blast (pid: 9660, task: 0000000071f75940, ksp: 0000000074393ac0)
>>> 0000000074393640 00000000743935c0 0000000000000002 0000000000000000
>>> 0000000074393660 00000000743935d8 00000000743935d8 00000000005590c2
>>> 0000000000000000 0000000078f38878 0000000026ede800 0000000078f38878
>>> 000000000000000d 040000000000000c 0000000074393628 0000000000000000
>>> 0000000000000000 0000000000100b2a 00000000743935c0 0000000074393600
>>> Call Trace:
>>> ([<0000000000100a32>] show_trace+0xee/0x144)
>>> [<00000000003be202>] do_raw_spin_lock+0x112/0x178
>>> [<000000000055d408>] _raw_spin_lock_irqsave+0x90/0xb0
>>> [<00000000003f1514>] __scsi_iterate_devices+0x38/0xbc
>>> [<00000000004849b0>] zfcp_erp_clear_adapter_status+0xd0/0x16c
>>> [<000000000048587a>] zfcp_erp_adapter_reopen+0x3a/0xb4
>>> [<0000000000489812>] zfcp_fsf_req_send+0x166/0x180
>>> [<000000000048c8d6>] zfcp_fsf_fcp_cmnd+0x272/0x408
>>> [<000000000048f864>] zfcp_scsi_queuecommand+0x11c/0x1e0
>>> [<00000000003f1f2a>] scsi_dispatch_cmd+0x1d6/0x324
>>> [<00000000003f9910>] scsi_request_fn+0x42c/0x56c
>>> [<00000000003828ae>] __blk_run_queue+0x86/0x140
>>> [<000000000037f742>] elv_insert+0x11a/0x208
>>> [<000000000038104c>] blk_insert_cloned_request+0x84/0xe4
>>> [<000003c0032b7c64>] dm_dispatch_request+0x6c/0x94 [dm_mod]
>>> [<000003c0032b7d5c>] map_request+0xd0/0x100 [dm_mod]
>>> [<000003c0032b9a78>] dm_request_fn+0xec/0x1bc [dm_mod]
>>> [<0000000000382c0e>] generic_unplug_device+0x5a/0x6c
>>> [<000003c0032b7f98>] dm_unplug_all+0x74/0x9c [dm_mod]
>>> [<00000000001d1272>] sync_page+0x76/0x9c
>>> [<00000000001d12ba>] sync_page_killable+0x22/0x60
>>> [<000000000055a768>] __wait_on_bit_lock+0xc0/0x124
>>> [<00000000001d1140>] __lock_page_killable+0x78/0x84
>>> [<00000000001d351c>] generic_file_aio_read+0x5a4/0x7e8
>>> [<0000000000228ec0>] do_sync_read+0xc8/0x12c
>>> [<0000000000229edc>] vfs_read+0xac/0x1ac
>>> [<000000000022a0d8>] SyS_read+0x58/0xa8
>>> [<00000000001146de>] sysc_noemu+0x10/0x16
>>> [<00000200000493c4>] 0x200000493c4
>>> INFO: lockdep is turned off.
>>>
>>> Call zfcp_fsf_fcp_cmnd without the host_lock and disable the
>>> interrupts when acquiring the req_q_lock.
>>>
>>> Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
>>> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
>>> ---
>>>
>>> drivers/s390/scsi/zfcp_fsf.c | 5 +++--
>>> drivers/s390/scsi/zfcp_scsi.c | 3 +++
>>> 2 files changed, 6 insertions(+), 2 deletions(-)
>>>
>>> --- a/drivers/s390/scsi/zfcp_fsf.c
>>> +++ b/drivers/s390/scsi/zfcp_fsf.c
>>> @@ -2170,12 +2170,13 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *
>>> struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
>>> struct zfcp_qdio *qdio = adapter->qdio;
>>> struct fsf_qtcb_bottom_io *io;
>>> + unsigned long flags;
>>>
>>> if (unlikely(!(atomic_read(&zfcp_sdev->status) &
>>> ZFCP_STATUS_COMMON_UNBLOCKED)))
>>> return -EBUSY;
>>>
>>> - spin_lock(&qdio->req_q_lock);
>>> + spin_lock_irqsave(&qdio->req_q_lock, flags);
>>> if (atomic_read(&qdio->req_q_free) <= 0) {
>>> atomic_inc(&qdio->req_q_full);
>>> goto out;
>>> @@ -2239,7 +2240,7 @@ failed_scsi_cmnd:
>>> zfcp_fsf_req_free(req);
>>> scsi_cmnd->host_scribble = NULL;
>>> out:
>>> - spin_unlock(&qdio->req_q_lock);
>>> + spin_unlock_irqrestore(&qdio->req_q_lock, flags);
>>> return retval;
>>> }
>>>
>>> --- a/drivers/s390/scsi/zfcp_scsi.c
>>> +++ b/drivers/s390/scsi/zfcp_scsi.c
>>> @@ -83,6 +83,7 @@ static int zfcp_scsi_queuecommand_lck(st
>>> struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
>>> struct fc_rport *rport = starget_to_rport(scsi_target(scpnt->device));
>>> int status, scsi_result, ret;
>>> + struct scsi_device *sdev = scpnt->device;
>>>
>>> /* reset the status for this request */
>>> scpnt->result = 0;
>>> @@ -118,7 +119,9 @@ static int zfcp_scsi_queuecommand_lck(st
>>> return 0;
>>> }
>>>
>>> + spin_unlock_irq(sdev->host->host_lock);
>>> ret = zfcp_fsf_fcp_cmnd(scpnt);
>>> + spin_lock_irq(sdev->host->host_lock);
>>
>> CCing Jeff
>>
>> that locks is taken in your own driver three lines below at the
>> DEF_SCSI_QCMD macro invocation
>>
>> Please do the proper host-lock-removal. The first time you are
>> touching this code. (See example patch to libata by Jeff Garzik)
>
> With the current code, the serial_number has to be updated for each
> command since scsi_error still has the check for the serial_number in
> scsi_try_to_abort_cmd. The change above is a bug fix for the zfcp
> changes introduced in 2.6.37-rc1, and i would like to fix this now. I
> remember from the host_lock discussion that the serial_number changes
> will happen in the 2.6.38 kernel. To me, it looks like there a two
> changes needed anyway.
>
Sigh! yes this is a deficiency I don't like about the current
push-down. I think we could do Nic's atomic serial_number as an
intermediate solution, or the pending fix to scsi_eh, and not force
a double change to every driver like we have now.
But I still think it's better to open code DEF_SCSI_QCMD than
to have that ugly
unlock
...
lock
That looks like an ugly typo (And is slower actually)
Jeff ?
Boaz
> Either
> - make the above change now (as bug fix)
> - remove the host_lock from zfcp's queuecommand function when the
> serial_number becomes optional
> or
> - change the queuecommand function now to include:
> take host_lock, call scsi_cmd_get_serial, release host_lock
> - remove the sequence again when the serial_number becomes optional
>
> I opted for the first approach, to have a smaller patch now. If the
> second approach is preferred, i can send an updated patch.
>
> Christof
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2010-11-17 14:35 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-11-17 13:23 [patch 0/5] zfcp fixes for 2.6.37-rc Christof Schmitt
2010-11-17 13:23 ` [patch 1/5] zfcp: Fix common FCP request reception Christof Schmitt
2010-11-17 13:23 ` [patch 2/5] zfcp: Correct false abort data assignment Christof Schmitt
2010-11-17 13:23 ` [patch 3/5] zfcp: No ERP escalation on gpn_ft eval Christof Schmitt
2010-11-17 13:23 ` [patch 4/5] zfcp: Prevent usage w/o holding a reference Christof Schmitt
2010-11-17 13:23 ` [patch 5/5] zfcp: Issue FCP command without holding SCSI host_lock Christof Schmitt
2010-11-17 13:43 ` Boaz Harrosh
2010-11-17 14:10 ` Christof Schmitt
2010-11-17 14:35 ` Boaz Harrosh [this message]
2010-11-18 8:34 ` Christof Schmitt
2010-11-18 13:53 ` [PATCH 5/5 v2] " Christof Schmitt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CE3E848.3010008@panasas.com \
--to=bharrosh@panasas.com \
--cc=James.Bottomley@suse.de \
--cc=christof.schmitt@de.ibm.com \
--cc=heiko.carstens@de.ibm.com \
--cc=jeff@garzik.org \
--cc=linux-s390@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=nab@linux-iscsi.org \
--cc=schwidefsky@de.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.