From: Christof Schmitt <christof.schmitt@de.ibm.com>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Jeff Garzik <jeff@garzik.org>,
	James Bottomley <James.Bottomley@suse.de>,
	linux-scsi@vger.kernel.org, linux-s390@vger.kernel.org,
	schwidefsky@de.ibm.com, heiko.carstens@de.ibm.com
Subject: Re: [patch 5/5] zfcp: Issue FCP command without holding SCSI host_lock
Date: Wed, 17 Nov 2010 15:10:25 +0100
Message-ID: <20101117141024.GA3589@schmichrtp.ibm.com>
In-Reply-To: <4CE3DC06.2010402@panasas.com>

On Wed, Nov 17, 2010 at 03:43:34PM +0200, Boaz Harrosh wrote:
> On 11/17/2010 03:23 PM, Christof Schmitt wrote:
> > From: Christof Schmitt <christof.schmitt@de.ibm.com>
> > 
> > Interrupting the connection to the FCP channel while I/O requests are
> > being issued can lead to this deadlock. scsi_dispatch_cmd already
> > holds the host_lock while the recovery trigger tries to acquire the
> > host_lock again when iterating through the scsi_devices.
> > 
> >  INFO: lockdep is turned off.
> >  BUG: spinlock lockup on CPU#1, blast/9660, 0000000078f38878
> >  CPU: 1 Not tainted 2.6.35.7SWEN2 #2
> >  Process blast (pid: 9660, task: 0000000071f75940, ksp: 0000000074393ac0)
> >         0000000074393640 00000000743935c0 0000000000000002 0000000000000000
> >         0000000074393660 00000000743935d8 00000000743935d8 00000000005590c2
> >         0000000000000000 0000000078f38878 0000000026ede800 0000000078f38878
> >         000000000000000d 040000000000000c 0000000074393628 0000000000000000
> >         0000000000000000 0000000000100b2a 00000000743935c0 0000000074393600
> >  Call Trace:
> >  ([<0000000000100a32>] show_trace+0xee/0x144)
> >   [<00000000003be202>] do_raw_spin_lock+0x112/0x178
> >   [<000000000055d408>] _raw_spin_lock_irqsave+0x90/0xb0
> >   [<00000000003f1514>] __scsi_iterate_devices+0x38/0xbc
> >   [<00000000004849b0>] zfcp_erp_clear_adapter_status+0xd0/0x16c
> >   [<000000000048587a>] zfcp_erp_adapter_reopen+0x3a/0xb4
> >   [<0000000000489812>] zfcp_fsf_req_send+0x166/0x180
> >   [<000000000048c8d6>] zfcp_fsf_fcp_cmnd+0x272/0x408
> >   [<000000000048f864>] zfcp_scsi_queuecommand+0x11c/0x1e0
> >   [<00000000003f1f2a>] scsi_dispatch_cmd+0x1d6/0x324
> >   [<00000000003f9910>] scsi_request_fn+0x42c/0x56c
> >   [<00000000003828ae>] __blk_run_queue+0x86/0x140
> >   [<000000000037f742>] elv_insert+0x11a/0x208
> >   [<000000000038104c>] blk_insert_cloned_request+0x84/0xe4
> >   [<000003c0032b7c64>] dm_dispatch_request+0x6c/0x94 [dm_mod]
> >   [<000003c0032b7d5c>] map_request+0xd0/0x100 [dm_mod]
> >   [<000003c0032b9a78>] dm_request_fn+0xec/0x1bc [dm_mod]
> >   [<0000000000382c0e>] generic_unplug_device+0x5a/0x6c
> >   [<000003c0032b7f98>] dm_unplug_all+0x74/0x9c [dm_mod]
> >   [<00000000001d1272>] sync_page+0x76/0x9c
> >   [<00000000001d12ba>] sync_page_killable+0x22/0x60
> >   [<000000000055a768>] __wait_on_bit_lock+0xc0/0x124
> >   [<00000000001d1140>] __lock_page_killable+0x78/0x84
> >   [<00000000001d351c>] generic_file_aio_read+0x5a4/0x7e8
> >   [<0000000000228ec0>] do_sync_read+0xc8/0x12c
> >   [<0000000000229edc>] vfs_read+0xac/0x1ac
> >   [<000000000022a0d8>] SyS_read+0x58/0xa8
> >   [<00000000001146de>] sysc_noemu+0x10/0x16
> >   [<00000200000493c4>] 0x200000493c4
> >  INFO: lockdep is turned off.
> > 
> > Call zfcp_fsf_fcp_cmnd without the host_lock and disable the
> > interrupts when acquiring the req_q_lock.
> > 
> > Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
> > Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
> > ---
> > 
> >  drivers/s390/scsi/zfcp_fsf.c  |    5 +++--
> >  drivers/s390/scsi/zfcp_scsi.c |    3 +++
> >  2 files changed, 6 insertions(+), 2 deletions(-)
> > 
> > --- a/drivers/s390/scsi/zfcp_fsf.c
> > +++ b/drivers/s390/scsi/zfcp_fsf.c
> > @@ -2170,12 +2170,13 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *
> >  	struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
> >  	struct zfcp_qdio *qdio = adapter->qdio;
> >  	struct fsf_qtcb_bottom_io *io;
> > +	unsigned long flags;
> >  
> >  	if (unlikely(!(atomic_read(&zfcp_sdev->status) &
> >  		       ZFCP_STATUS_COMMON_UNBLOCKED)))
> >  		return -EBUSY;
> >  
> > -	spin_lock(&qdio->req_q_lock);
> > +	spin_lock_irqsave(&qdio->req_q_lock, flags);
> >  	if (atomic_read(&qdio->req_q_free) <= 0) {
> >  		atomic_inc(&qdio->req_q_full);
> >  		goto out;
> > @@ -2239,7 +2240,7 @@ failed_scsi_cmnd:
> >  	zfcp_fsf_req_free(req);
> >  	scsi_cmnd->host_scribble = NULL;
> >  out:
> > -	spin_unlock(&qdio->req_q_lock);
> > +	spin_unlock_irqrestore(&qdio->req_q_lock, flags);
> >  	return retval;
> >  }
> >  
> > --- a/drivers/s390/scsi/zfcp_scsi.c
> > +++ b/drivers/s390/scsi/zfcp_scsi.c
> > @@ -83,6 +83,7 @@ static int zfcp_scsi_queuecommand_lck(st
> >  	struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
> >  	struct fc_rport *rport = starget_to_rport(scsi_target(scpnt->device));
> >  	int    status, scsi_result, ret;
> > +	struct scsi_device *sdev = scpnt->device;
> >  
> >  	/* reset the status for this request */
> >  	scpnt->result = 0;
> > @@ -118,7 +119,9 @@ static int zfcp_scsi_queuecommand_lck(st
> >  		return 0;
> >  	}
> >  
> > +	spin_unlock_irq(sdev->host->host_lock);
> >  	ret = zfcp_fsf_fcp_cmnd(scpnt);
> > +	spin_lock_irq(sdev->host->host_lock);
> 
> CCing Jeff
> 
> That lock is taken in your own driver three lines below, at the
> DEF_SCSI_QCMD macro invocation.
> 
> Please do the proper host-lock removal the first time you are
> touching this code. (See the example patch to libata by Jeff Garzik.)

With the current code, the serial_number has to be updated for each
command, since scsi_error still has the check for the serial_number in
scsi_try_to_abort_cmd. The change above is a bug fix for the zfcp
changes introduced in 2.6.37-rc1, and I would like to fix this now. I
remember from the host_lock discussion that the serial_number changes
will happen in the 2.6.38 kernel. To me, it looks like there are two
changes needed anyway.

Either
 - make the above change now (as a bug fix)
 - remove the host_lock from zfcp's queuecommand function when the
   serial_number becomes optional
or
 - change the queuecommand function now to include:
   take host_lock, call scsi_cmd_get_serial, release host_lock
 - remove the sequence again when the serial_number becomes optional

I opted for the first approach, to have a smaller patch now. If the
second approach is preferred, I can send an updated patch.
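
To make the two options concrete, here is a minimal userspace sketch of
the lock ordering. It is not zfcp code and is only a single-threaded
model: model_lock/model_trylock, struct cmd, recovery_trigger, issue_cmd
and the three queue_* functions are hypothetical stand-ins, and the
non-recursive host_lock is modelled with a C11 atomic_flag. The recovery
path uses a trylock so that the model reports the conflict instead of
spinning forever as in the trace above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the non-recursive SCSI host_lock. */
static atomic_flag host_lock = ATOMIC_FLAG_INIT;
static unsigned long next_serial = 1;

static void model_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		;	/* spin, like a spinlock would */
}

static bool model_trylock(atomic_flag *l)
{
	return !atomic_flag_test_and_set(l);
}

static void model_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

struct cmd {
	unsigned long serial_number;
	bool recovery_ok;	/* could recovery take host_lock? */
};

/* Models the recovery trigger, which needs host_lock to iterate devices. */
static bool recovery_trigger(void)
{
	if (!model_trylock(&host_lock))
		return false;	/* the real code would spin here forever */
	model_unlock(&host_lock);
	return true;
}

/* Models issuing a command when the send fails and recovery is triggered. */
static void issue_cmd(struct cmd *c)
{
	c->recovery_ok = recovery_trigger();
}

/* Behaviour before the patch: host_lock is held across issuing. */
static void queue_locked(struct cmd *c)
{
	model_lock(&host_lock);
	c->serial_number = next_serial++;
	issue_cmd(c);
	model_unlock(&host_lock);
}

/* First approach: the caller holds host_lock; drop it around issuing. */
static void queue_drop_lock(struct cmd *c)
{
	model_lock(&host_lock);		/* taken by the wrapper */
	c->serial_number = next_serial++;
	model_unlock(&host_lock);	/* dropped before issuing */
	issue_cmd(c);
	model_lock(&host_lock);		/* retaken for the wrapper */
	model_unlock(&host_lock);	/* wrapper releases it */
}

/* Second approach: take host_lock only to assign the serial number. */
static void queue_serial_only(struct cmd *c)
{
	model_lock(&host_lock);
	c->serial_number = next_serial++;
	model_unlock(&host_lock);
	issue_cmd(c);
}
```

Calling the three functions in turn on zero-initialized commands leaves
recovery_ok false only for queue_locked. Both alternatives assign the
serial number under the lock and issue the command without it; they
differ only in whether the lock is taken again afterwards for the
caller's benefit.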

Christof

Thread overview: 11+ messages
2010-11-17 13:23 [patch 0/5] zfcp fixes for 2.6.37-rc Christof Schmitt
2010-11-17 13:23 ` [patch 1/5] zfcp: Fix common FCP request reception Christof Schmitt
2010-11-17 13:23 ` [patch 2/5] zfcp: Correct false abort data assignment Christof Schmitt
2010-11-17 13:23 ` [patch 3/5] zfcp: No ERP escalation on gpn_ft eval Christof Schmitt
2010-11-17 13:23 ` [patch 4/5] zfcp: Prevent usage w/o holding a reference Christof Schmitt
2010-11-17 13:23 ` [patch 5/5] zfcp: Issue FCP command without holding SCSI host_lock Christof Schmitt
2010-11-17 13:43   ` Boaz Harrosh
2010-11-17 14:10     ` Christof Schmitt [this message]
2010-11-17 14:35       ` Boaz Harrosh
2010-11-18  8:34         ` Christof Schmitt
2010-11-18 13:53           ` [PATCH 5/5 v2] " Christof Schmitt
