public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: Boaz Harrosh <bharrosh@panasas.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Hans de Goede <j.w.r.degoede@hhs.nl>,
	SCSI development list <linux-scsi@vger.kernel.org>,
	USB list <linux-usb@vger.kernel.org>,
	Antonio Ospite <ospite@studenti.unina.it>
Subject: Re: BUG in handling of last_sector_bug flag
Date: Wed, 13 Aug 2008 14:13:57 +0300	[thread overview]
Message-ID: <48A2C1F5.6000708@panasas.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0808121648190.3153-100000@iolanthe.rowland.org>

Alan Stern wrote:
> On Tue, 12 Aug 2008, Boaz Harrosh wrote:
> 
>> What you are doing here is changing the semantics of what used to
>> work.
> 
> Yes, and I don't want to do that unless the current code is truly
> wrong.  That's why I wasn't certain this was the right thing to do.
> 
>>  Now I'm not saying it's a bad thing, but You must audit all
>> scsi ULDs to make sure they are not now broken because of  the new
>> assumptions. Let me explain.
>>
>> This is what happens now (before the patch):
>> A request comes in with some size. And gets prepared by ULD.
>> If the ULD does not like the size for some reason it will
>> chunk off scsi_bufflen and will submit the command with
>> bufflen != req->size. The midlayer and drivers will only
>> respond to bufflen. Until ...
>>
>> Until this loop magic in scsi_io_completion(). Since the
>> request is not done it will be requeued, the ULD will inspect
>> new size, adjust scsi_bufflen and so on until done.
>>
>> In case of an error the request goes back to the ULD and the ULD
>> should decide what to do with the reminder.
> 
> I don't understand this point.  _Every_ non-BLOCK_PC request
> automatically goes back to the ULD via the ->done callback, not only
> those which get an error.  This callback does not get to decide what to
> do with the remainder of the request, as far as I can tell; all it can
> do is return the number of bytes actually handled.
> 
>> scsi-ml until now
>> did not see any-other size but scsi_bufflen. So in theory it is
>> sd.c job to check for errors on split up requests and decide if
>> to complete the reminder or re-submit. The scsi-ml did not make
>> that policy.
> 
> How can sd.c decide whether or not to resubmit?  It doesn't know 
> whether scsi_io_completion() will go ahead and requeue the request 
> anyway.  And besides, you can't resubmit the request until the current 
> portion has been sent back to the block layer, which doesn't happen 
> until after ->done returns.
> 
>> If scsi-ml decides that it wants to set a new error policy here, then
>> you should audit other ULDs to make sure they did not rely on the
>> old behavior.
>>
>> sd could check for errors in it's drv->done(cmd) function. and return
>> the reminder of the request size in case of error. look at
>> scsi.c::scsi_finish_command(). ULD has control here.
> 
> The ->done function does not return the remainder of the request size; 
> it returns good_bytes, which is the number of bytes that were handled.
> 
> I agree that since the decision to split the request up into multiple 
> commands was made in sd.c, the decision of what to do about the 
> remainder should also be made in sd.c.  However the current structure 
> of the code doesn't seem to allow this to happen.  Requeue decisions 
> are all made in scsi_io_completion() or scsi_end_request().
> 
> Alan Stern
> 
I agree with all of what you said. Even if drv->done(cmd) would magically
guess what scsi_io_completion() is going to do and will return not good_bytes
(It's just a name) but lets call it bytes_to_complete. It will still be a
bug in the requeue case.

But now that I stared at the code harder. Current code is a complete bug.
the: "this_count = scsi_bufflen()" Has nothing to do with anything. 
Maybe it was meant to be: "this_count = scsi_bufflen() - good_bytes"
That is the count left over after the first complete. But after we
have completed good_bytes, the second complete is never scsi_bufflen().
(It used to work because most of the times it is bigger then the reminder
but then we can just use: "this_count = ~0")

Also what you did is not enough. What if the error is one of the known cases
above, where the "complete and return" is inside the case. They will hang also. 
Here is what I think should be:

---
git diff --stat -p drivers/scsi/
 drivers/scsi/scsi_lib.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ff5d56b..36995e5 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -852,7 +852,7 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
 	int result = cmd->result;
-	int this_count = scsi_bufflen(cmd);
+	int this_count;
 	struct request_queue *q = cmd->device->request_queue;
 	struct request *req = cmd->request;
 	int error = 0;
@@ -909,9 +909,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
 		return;
 
-	/* good_bytes = 0, or (inclusive) there were leftovers and
-	 * result = 0, so scsi_end_request couldn't retry.
-	 */
+	this_count = blk_rq_bytes(req);
+
 	if (sense_valid && !sense_deferred) {
 		switch (sshdr.sense_key) {
 		case UNIT_ATTENTION:



  reply	other threads:[~2008-08-13 11:14 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-11 17:03 BUG in handling of last_sector_bug flag Alan Stern
     [not found] ` <Pine.LNX.4.44L0.0808111209100.2142-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-08-12  9:08   ` Alan Jenkins
     [not found]     ` <48A15318.30600-cCz0Lq7MMjm9FHfhHBbuYA@public.gmane.org>
2008-08-12 15:24       ` Alan Stern
     [not found]         ` <Pine.LNX.4.44L0.0808121120130.2248-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-08-12 16:33           ` Boaz Harrosh
     [not found]             ` <48A1BB71.5050905-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2008-08-12 21:00               ` Alan Stern
2008-08-13 11:13                 ` Boaz Harrosh [this message]
     [not found]                   ` <48A2C1F5.6000708-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2008-08-13 14:50                     ` Alan Stern
     [not found]                       ` <Pine.LNX.4.44L0.0808131036310.2455-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-08-13 16:37                         ` Boaz Harrosh
     [not found]                           ` <48A30DC7.8070501-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2008-08-13 16:45                             ` Boaz Harrosh
2008-08-13 19:17                           ` Alan Stern
2008-08-14 19:41                           ` Alan Stern
     [not found]                             ` <Pine.LNX.4.44L0.0808141535350.2148-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-08-15  8:31                               ` Alan Jenkins
2008-08-15 17:43                             ` Antonio Ospite
2008-08-26 21:13                             ` Antonio Ospite
     [not found]                               ` <20080826231301.aac6f0e5.ospite-aNJ+ML1ZbiP93QAQaVx+gl6hYfS7NtTn@public.gmane.org>
2008-08-27 14:21                                 ` Alan Stern
2008-08-27 14:33                                   ` James Bottomley
2008-08-27 15:54                                     ` Alan Stern
     [not found]                                     ` <1219847626.3292.17.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-08-27 18:50                                       ` [PATCH] Fix handling of failed requests in scsi_io_completion Alan Stern
2008-09-05 19:35                                         ` Alan Stern
     [not found]                                           ` <Pine.LNX.4.44L0.0809051533280.4482-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-09-19  7:17                                             ` Antonio Ospite
     [not found]                                               ` <20080919091748.9438726b.ospite-aNJ+ML1ZbiP93QAQaVx+gl6hYfS7NtTn@public.gmane.org>
2008-09-19 15:53                                                 ` Alan Stern
2008-09-20  0:31                                                   ` James Bottomley
2008-09-20 17:06                                                     ` Alan Stern
     [not found]                                                       ` <Pine.LNX.4.44L0.0809201241160.13839-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-09-20 17:55                                                         ` James Bottomley
2008-09-20 20:49                                                           ` Alan Stern
2008-09-20 21:09                                                             ` James Bottomley
2008-09-21 12:55                                                               ` Boaz Harrosh
     [not found]                                                                 ` <48D64459.2010802-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2008-09-21 19:57                                                                   ` James Bottomley
     [not found]                                                               ` <1221944955.3152.58.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-09-21 16:30                                                                 ` Alan Stern
     [not found]                                                                   ` <Pine.LNX.4.44L0.0809211201470.1154-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2008-09-21 19:59                                                                     ` James Bottomley
     [not found]                                                                       ` <1222027177.3152.121.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2008-09-21 21:03                                                                         ` Alan Stern
2008-09-22  7:24                                                                           ` Boaz Harrosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48A2C1F5.6000708@panasas.com \
    --to=bharrosh@panasas.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=alan-jenkins@tuffmail.co.uk \
    --cc=j.w.r.degoede@hhs.nl \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=ospite@studenti.unina.it \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox