SCSI eats error from flush failure during hot plug

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

* SCSI eats error from flush failure during hot plug
@ 2014-06-05 23:52 Steven Haber
  2014-06-07 19:29 ` James Bottomley
  2014-06-26 15:02 ` Christoph Hellwig
  0 siblings, 2 replies; 24+ messages in thread
From: Steven Haber @ 2014-06-05 23:52 UTC (permalink / raw)
  To: linux-scsi

Hello,

I am testing ATA device durability during hot unplug. I have a power
fault test suite that has turned up issues with the fsync->SCSI->ATA
codepath. If a device is unplugged while an fsync is in progress, ATA
returns a flush error to the SCSI driver but scsi_io_completion()
seems to disregard it. fsync() returns no error, which should mean
that the write was durably flushed to disk. I expect fsync() to return
EIO or something similar when the flush isn't acked by the device.

When the failure occurs, the SCSI sense key is set to ABORTED_COMMAND.
However, scsi_end_command() is called without any of the sense context
and scsi_io_completion() returns early without checking sense at all.

This commit may be related:
d6b0c53723753fc0cfda63f56735b225c43e1e9a
(http://git.opencores.org/?a=commitdiff&p=linux&h=d6b0c53723753fc0cfda63f56735b225c43e1e9a)

The following patch fixes the issue, but it's a hack. I only have a
vague understanding of Linux drivers, so I'm looking for advice on how
to make this fix better and get it upstream.

Thanks!

Steven Haber
Qumulo, Inc.

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9db097a..789af39 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -822,40 +822,44 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
unsigned int good_bytes)
  /*
  * Recovered errors need reporting, but they're always treated
  * as success, so fiddle the result code here.  For BLOCK_PC
  * we already took a copy of the original into rq->errors which
  * is what gets returned to the user
  */
  if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
  /* if ATA PASS-THROUGH INFORMATION AVAILABLE skip
  * print since caller wants ATA registers. Only occurs on
  * SCSI ATA PASS_THROUGH commands when CK_COND=1
  */
  if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
  ;
  else if (!(req->cmd_flags & REQ_QUIET))
  scsi_print_sense("", cmd);
  result = 0;
  /* BLOCK_PC may have set error */
  error = 0;
  }

+ if (sense_valid && (sshdr.sense_key == ABORTED_COMMAND) &&
+    good_bytes == 0)
+ error = (sshdr.asc == 0x10) ? -EILSEQ : -EIO;
+
  /*
  * A number of bytes were successfully read.  If there
  * are leftovers and there is some kind of error
  * (result != 0), retry the rest.
  */
  if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
  return;

  error = __scsi_error_from_host_byte(cmd, result);

  if (host_byte(result) == DID_RESET) {
  /* Third party bus reset or reset for error recovery
  * reasons.  Just retry the command and see what
  * happens.
  */
  action = ACTION_RETRY;
  } else if (sense_valid && !sense_deferred) {
  switch (sshdr.sense_key) {
  case UNIT_ATTENTION:
  if (cmd->device->removable) {

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-05 23:52 SCSI eats error from flush failure during hot plug Steven Haber
@ 2014-06-07 19:29 ` James Bottomley
  2014-06-09 17:21   ` Steven Haber
  2014-06-26 15:02 ` Christoph Hellwig
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-07 19:29 UTC (permalink / raw)
  To: Steven Haber; +Cc: linux-scsi, Jens Axboe

On Thu, 2014-06-05 at 16:52 -0700, Steven Haber wrote:
> Hello,
> 
> I am testing ATA device durability during hot unplug. I have a power
> fault test suite that has turned up issues with the fsync->SCSI->ATA
> codepath. If a device is unplugged while an fsync is in progress, ATA
> returns a flush error to the SCSI driver but scsi_io_completion()
> seems to disregard it. fsync() returns no error, which should mean
> that the write was durably flushed to disk. I expect fsync() to return
> EIO or something similar when the flush isn't acked by the device.
> 
> When the failure occurs, the SCSI sense key is set to ABORTED_COMMAND.
> However, scsi_end_command() is called without any of the sense context
> and scsi_io_completion() returns early without checking sense at all.
> 
> This commit may be related:
> d6b0c53723753fc0cfda63f56735b225c43e1e9a
> (http://git.opencores.org/?a=commitdiff&p=linux&h=d6b0c53723753fc0cfda63f56735b225c43e1e9a)
> 
> The following patch fixes the issue, but it's a hack. I only have a
> vague understanding of Linux drivers, so I'm looking for advice on how
> to make this fix better and get it upstream.

OK, so for some reason this is a zero size REQ_TYPE_FS command, which
the logic actually assumes we cannot have.

I suspect REQ_TYPE_FLUSH subtly broke this logic because it's coming
back to us as REQ_TYPE_FS even though it's been set up (in SCSI) as
REQ_TYPE_BLOCK_PC.

On this theory, we'd eat the return codes of all no data transfer
commands that fail.  I think the generic fix is to make sure that all
commands initiallised as REQ_TYPE_BLOCK_PC actually have the ->cmd_type
set to that.

There may be some knock on side effects because it doesn't look like the
block flush code expects us to change the request->cmd_type.  Cc'd Jens
for opinions on this.

Can you try this out and see if it fixes the problem?  If it doesn't,
we're going to have to get into debugging exactly what this zero length
request is.

Thanks,

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9db097a..78229ec7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1171,6 +1171,11 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
 	
 	cmd->transfersize = blk_rq_bytes(req);
 	cmd->allowed = req->retries;
+	/*
+	 * Thanks to flush and other PC prepared commands, we may
+	 * not be a REQ_TYPE_BLOCK_PC; make sure we are
+	 */
+	req->cmd_type = REQ_TYPE_BLOCK_PC;
 	return BLKPREP_OK;
 }
 EXPORT_SYMBOL(scsi_setup_blk_pc_cmnd);



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-07 19:29 ` James Bottomley
@ 2014-06-09 17:21   ` Steven Haber
  2014-06-09 17:29     ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Steven Haber @ 2014-06-09 17:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Jens Axboe

Thanks a bunch James! I patched and ran our suite over the weekend and
didn't see any failures at all. I also didn't notice any side effects.
I poked through the kernel logs and everything looked like it did
before. I'll let you know if anything else weird crops up. Any idea
when this will get checked in?

Steven

On Sat, Jun 7, 2014 at 12:29 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Thu, 2014-06-05 at 16:52 -0700, Steven Haber wrote:
>> Hello,
>>
>> I am testing ATA device durability during hot unplug. I have a power
>> fault test suite that has turned up issues with the fsync->SCSI->ATA
>> codepath. If a device is unplugged while an fsync is in progress, ATA
>> returns a flush error to the SCSI driver but scsi_io_completion()
>> seems to disregard it. fsync() returns no error, which should mean
>> that the write was durably flushed to disk. I expect fsync() to return
>> EIO or something similar when the flush isn't acked by the device.
>>
>> When the failure occurs, the SCSI sense key is set to ABORTED_COMMAND.
>> However, scsi_end_command() is called without any of the sense context
>> and scsi_io_completion() returns early without checking sense at all.
>>
>> This commit may be related:
>> d6b0c53723753fc0cfda63f56735b225c43e1e9a
>> (http://git.opencores.org/?a=commitdiff&p=linux&h=d6b0c53723753fc0cfda63f56735b225c43e1e9a)
>>
>> The following patch fixes the issue, but it's a hack. I only have a
>> vague understanding of Linux drivers, so I'm looking for advice on how
>> to make this fix better and get it upstream.
>
> OK, so for some reason this is a zero size REQ_TYPE_FS command, which
> the logic actually assumes we cannot have.
>
> I suspect REQ_TYPE_FLUSH subtly broke this logic because it's coming
> back to us as REQ_TYPE_FS even though it's been set up (in SCSI) as
> REQ_TYPE_BLOCK_PC.
>
> On this theory, we'd eat the return codes of all no data transfer
> commands that fail.  I think the generic fix is to make sure that all
> commands initiallised as REQ_TYPE_BLOCK_PC actually have the ->cmd_type
> set to that.
>
> There may be some knock on side effects because it doesn't look like the
> block flush code expects us to change the request->cmd_type.  Cc'd Jens
> for opinions on this.
>
> Can you try this out and see if it fixes the problem?  If it doesn't,
> we're going to have to get into debugging exactly what this zero length
> request is.
>
> Thanks,
>
> James
>
> ---
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 9db097a..78229ec7 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1171,6 +1171,11 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
>
>         cmd->transfersize = blk_rq_bytes(req);
>         cmd->allowed = req->retries;
> +       /*
> +        * Thanks to flush and other PC prepared commands, we may
> +        * not be a REQ_TYPE_BLOCK_PC; make sure we are
> +        */
> +       req->cmd_type = REQ_TYPE_BLOCK_PC;
>         return BLKPREP_OK;
>  }
>  EXPORT_SYMBOL(scsi_setup_blk_pc_cmnd);
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-09 17:21   ` Steven Haber
@ 2014-06-09 17:29     ` James Bottomley
  2014-06-11 13:37       ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-09 17:29 UTC (permalink / raw)
  To: Steven Haber; +Cc: linux-scsi, Jens Axboe

On Mon, 2014-06-09 at 10:21 -0700, Steven Haber wrote:
> Thanks a bunch James! I patched and ran our suite over the weekend and
> didn't see any failures at all. I also didn't notice any side effects.
> I poked through the kernel logs and everything looked like it did
> before. I'll let you know if anything else weird crops up. Any idea
> when this will get checked in?

I'll do it as a bug fix, but I do need Jens to make sure nothing else
breaks first.  Best I can tell, the state model for compound commands
like flush doesn't expect us to change the request type ... nothing puts
it back to REQ_TYPE_FS.  In your case, the flush is the last command
sent, so there's no problem ... I just worry we will get an obscure
problem later on from something that does a BLOCK_PC prepared first
command.

James

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-09 17:29     ` James Bottomley
@ 2014-06-11 13:37       ` Christoph Hellwig
  2014-06-19 18:05         ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-11 13:37 UTC (permalink / raw)
  To: James Bottomley; +Cc: Steven Haber, linux-scsi, Jens Axboe

On Mon, Jun 09, 2014 at 10:29:06AM -0700, James Bottomley wrote:
> I'll do it as a bug fix, but I do need Jens to make sure nothing else
> breaks first.  Best I can tell, the state model for compound commands
> like flush doesn't expect us to change the request type ... nothing puts
> it back to REQ_TYPE_FS.  In your case, the flush is the last command
> sent, so there's no problem ... I just worry we will get an obscure
> problem later on from something that does a BLOCK_PC prepared first
> command.

Yes, I don't think resetting cmd_type is a good idea.  I'd much rather
see a special case for rq->cmd_flags & REQ_FLUSH in the completion
handler - we already treat it special during setup anyway.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-11 13:37       ` Christoph Hellwig
@ 2014-06-19 18:05         ` James Bottomley
  2014-06-25 13:13           ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-19 18:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Steven Haber, linux-scsi, Jens Axboe

On Wed, 2014-06-11 at 06:37 -0700, Christoph Hellwig wrote:
> On Mon, Jun 09, 2014 at 10:29:06AM -0700, James Bottomley wrote:
> > I'll do it as a bug fix, but I do need Jens to make sure nothing else
> > breaks first.  Best I can tell, the state model for compound commands
> > like flush doesn't expect us to change the request type ... nothing puts
> > it back to REQ_TYPE_FS.  In your case, the flush is the last command
> > sent, so there's no problem ... I just worry we will get an obscure
> > problem later on from something that does a BLOCK_PC prepared first
> > command.
> 
> Yes, I don't think resetting cmd_type is a good idea.  I'd much rather
> see a special case for rq->cmd_flags & REQ_FLUSH in the completion
> handler - we already treat it special during setup anyway.

That's not really a good idea either ... I did think of it.  We'll end
up with a cmd_type of REQ_TYPE_FS which because of REQ_FLUSH (or REQ_FUA
or REQ_DISCARD or any number of other things) we have to treat as though
it were REQ_TYPE_BLOCK_PC.  It's much better to tune handling
expectations according to req->cmd_type because that's what we already
do.  These commands are actually set up by our handlers, so it's up to
us to mark the request type correctly.

This, I think, does everything we need

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f7e3163..d1458ec 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1103,6 +1103,11 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
 	
 	cmd->transfersize = blk_rq_bytes(req);
 	cmd->allowed = req->retries;
+	/*
+	 * Thanks to flush and other PC prepared commands, we may
+	 * not be a REQ_TYPE_BLOCK_PC; make sure we are
+	 */
+	req->cmd_type = REQ_TYPE_BLOCK_PC;
 	return BLKPREP_OK;
 }
 EXPORT_SYMBOL(scsi_setup_blk_pc_cmnd);
@@ -1129,6 +1134,11 @@ int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
 	BUG_ON(!req->nr_phys_segments);
 
 	memset(cmd->cmnd, 0, BLK_MAX_CDB);
+	/*
+	 * Thanks to flush and other PC prepared commands, we may
+	 * not be a REQ_TYPE_FS; make sure we are
+	 */
+	req->cmd_type = REQ_TYPE_FS;
 	return scsi_init_io(cmd, GFP_ATOMIC);
 }
 EXPORT_SYMBOL(scsi_setup_fs_cmnd);



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-19 18:05         ` James Bottomley
@ 2014-06-25 13:13           ` Christoph Hellwig
  2014-06-26 14:02             ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-25 13:13 UTC (permalink / raw)
  To: James Bottomley; +Cc: Christoph Hellwig, Steven Haber, linux-scsi, Jens Axboe

On Thu, Jun 19, 2014 at 11:05:59AM -0700, James Bottomley wrote:
> That's not really a good idea either ... I did think of it.  We'll end
> up with a cmd_type of REQ_TYPE_FS which because of REQ_FLUSH (or REQ_FUA
> or REQ_DISCARD or any number of other things) we have to treat as though
> it were REQ_TYPE_BLOCK_PC.  It's much better to tune handling
> expectations according to req->cmd_type because that's what we already
> do.  These commands are actually set up by our handlers, so it's up to
> us to mark the request type correctly.

Looking at the places where the SCSI midlayer cares about the request
type:

 - scsi_finish_command to call ->done for non-PC requests.  Given that
   we called into the driver to setup flush/discard/etc we should also
   call into the driver on request completion
 - scsi_eh_action: ditto for error handling
 - scsi_noretry_cmd: I don't see why we'd want to treat flush request
	as having an implicit failfast flag
 - scsi_io_completion: this mostly opts out of all kinds of error
	handling and retries, not really what we'd want either
 - scsi_unprep_fn: calls ->uninit_command only for !PC request,
   	so your patch introduces a leak for discard requests


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-25 13:13           ` Christoph Hellwig
@ 2014-06-26 14:02             ` James Bottomley
  2014-06-26 15:00               ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-26 14:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Steven Haber, linux-scsi, Jens Axboe


On Wed, 2014-06-25 at 06:13 -0700, Christoph Hellwig wrote:
> On Thu, Jun 19, 2014 at 11:05:59AM -0700, James Bottomley wrote:
> > That's not really a good idea either ... I did think of it.  We'll end
> > up with a cmd_type of REQ_TYPE_FS which because of REQ_FLUSH (or REQ_FUA
> > or REQ_DISCARD or any number of other things) we have to treat as though
> > it were REQ_TYPE_BLOCK_PC.  It's much better to tune handling
> > expectations according to req->cmd_type because that's what we already
> > do.  These commands are actually set up by our handlers, so it's up to
> > us to mark the request type correctly.
> 
> Looking at the places where the SCSI midlayer cares about the request
> type:
> 
>  - scsi_finish_command to call ->done for non-PC requests.  Given that
>    we called into the driver to setup flush/discard/etc we should also
>    call into the driver on request completion
>  - scsi_eh_action: ditto for error handling
>  - scsi_noretry_cmd: I don't see why we'd want to treat flush request
> 	as having an implicit failfast flag
>  - scsi_io_completion: this mostly opts out of all kinds of error
> 	handling and retries, not really what we'd want either
>  - scsi_unprep_fn: calls ->uninit_command only for !PC request,
>    	so your patch introduces a leak for discard requests

Yes, but think what you're proposing.  Every block command with a
special setup goes via scsi_setup_blk_pc_cmnd() because they usually
have to translate to something special.  If we did what you propose,
every time we add one, we'd have to modify these five places in the code
plus the setup ... it's a bit insane plus a maintenance nightmare.

Since block gave us the command to set up as we please, we're entitled
to change the fields including cmd_type.

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 14:02             ` James Bottomley
@ 2014-06-26 15:00               ` Christoph Hellwig
  2014-06-26 15:04                 ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-26 15:00 UTC (permalink / raw)
  To: James Bottomley; +Cc: Christoph Hellwig, Steven Haber, linux-scsi, Jens Axboe

On Thu, Jun 26, 2014 at 10:02:47AM -0400, James Bottomley wrote:
> Yes, but think what you're proposing.  Every block command with a
> special setup goes via scsi_setup_blk_pc_cmnd() because they usually
> have to translate to something special.

That logic is backwards.  The "special" commands use
scsi_setup_blk_pc_cmnd because that seemed the easiest we to do it
when they were introduce.  Looking back they should have just called
scsi_init_io directly instead of doing enough setup to pretend to be
BLOCK_PC enough to use scsi_setup_blk_pc_cmnd().

> If we did what you propose,
> every time we add one, we'd have to modify these five places in the code
> plus the setup ... it's a bit insane plus a maintenance nightmare.

It's not.  They need to be treated special because they _are_ special.

If you look at the command types there's a pretty clear difference
between BLOCK_PC and everything else:

			BLOCK_PC	read/write	discard/flush/write_same
need to generate CDB	no		yes		yes
calls into the ULD	no		yes		yes
needs error handling	no		yes		yes
passes sense data up	yes		no		no

so they surely aren;t like BLOCK_PC, and treating them like BLOCK_PC
would require major changes to our architecture.  Not that I'm overly
happy with them being _FS requests either - they probably should have
a cmd_type assigned for each of them and come down from the block
layer set up that way.  I'll look into that once I get a little bit of
time.

> Since block gave us the command to set up as we please, we're entitled
> to change the fields including cmd_type.

The block layer doesn't care, but that doesn't make it right inside the
SCSI code.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 15:00               ` Christoph Hellwig
@ 2014-06-26 15:04                 ` James Bottomley
  2014-06-26 15:08                   ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-26 15:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Steven Haber, linux-scsi, Jens Axboe

On Thu, 2014-06-26 at 08:00 -0700, Christoph Hellwig wrote:
> On Thu, Jun 26, 2014 at 10:02:47AM -0400, James Bottomley wrote:
> > Yes, but think what you're proposing.  Every block command with a
> > special setup goes via scsi_setup_blk_pc_cmnd() because they usually
> > have to translate to something special.
> 
> That logic is backwards.  The "special" commands use
> scsi_setup_blk_pc_cmnd because that seemed the easiest we to do it
> when they were introduce.  Looking back they should have just called
> scsi_init_io directly instead of doing enough setup to pretend to be
> BLOCK_PC enough to use scsi_setup_blk_pc_cmnd().
> 
> > If we did what you propose,
> > every time we add one, we'd have to modify these five places in the code
> > plus the setup ... it's a bit insane plus a maintenance nightmare.
> 
> It's not.  They need to be treated special because they _are_ special.

If there's a problem outside the normal processing then it needs to be
solved once, not once for every new special block command type.

> If you look at the command types there's a pretty clear difference
> between BLOCK_PC and everything else:
> 
> 			BLOCK_PC	read/write	discard/flush/write_same
> need to generate CDB	no		yes		yes
> calls into the ULD	no		yes		yes
> needs error handling	no		yes		yes
> passes sense data up	yes		no		no

Right, but look what we do for the specials.  The only difference is the
ULD has a hook to generate the CDB.  For everything else we follow the
BLOCK_PC path, including error handling and sense data processing.

> so they surely aren;t like BLOCK_PC, and treating them like BLOCK_PC
> would require major changes to our architecture.  Not that I'm overly
> happy with them being _FS requests either - they probably should have
> a cmd_type assigned for each of them and come down from the block
> layer set up that way.  I'll look into that once I get a little bit of
> time.

OK, we'll us this as a fix for the bug.  If you come up with something
more elegant we can replace it.

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 15:04                 ` James Bottomley
@ 2014-06-26 15:08                   ` Christoph Hellwig
  2014-06-26 18:52                     ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-26 15:08 UTC (permalink / raw)
  To: James Bottomley; +Cc: Steven Haber, linux-scsi, Jens Axboe

On Thu, Jun 26, 2014 at 11:04:56AM -0400, James Bottomley wrote:
> Right, but look what we do for the specials.  The only difference is the
> ULD has a hook to generate the CDB.  For everything else we follow the
> BLOCK_PC path, including error handling and sense data processing.

No, we don't use the BLOCK_PC path for anything but a tiny bit of setup
code right now.  We will with your patch, which will cause all kinds of
behavior change.  See my last mail for the list of differences, which
includes the introduction of a major memory leak.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 15:08                   ` Christoph Hellwig
@ 2014-06-26 18:52                     ` James Bottomley
  2014-06-27  7:53                       ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-26 18:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Steven Haber, linux-scsi, Jens Axboe

On Thu, 2014-06-26 at 08:08 -0700, Christoph Hellwig wrote:
> On Thu, Jun 26, 2014 at 11:04:56AM -0400, James Bottomley wrote:
> > Right, but look what we do for the specials.  The only difference is the
> > ULD has a hook to generate the CDB.  For everything else we follow the
> > BLOCK_PC path, including error handling and sense data processing.
> 
> No, we don't use the BLOCK_PC path for anything but a tiny bit of setup
> code right now.  We will with your patch, which will cause all kinds of
> behavior change.  See my last mail for the list of differences, which
> includes the introduction of a major memory leak.

OK, I give up, what memory leak (you don't actually mention it in your
emails)?  All memory handling is correctly done in unprep as far as I
can tell.

James



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 18:52                     ` James Bottomley
@ 2014-06-27  7:53                       ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-27  7:53 UTC (permalink / raw)
  To: James Bottomley; +Cc: Christoph Hellwig, Steven Haber, linux-scsi, Jens Axboe

On Thu, Jun 26, 2014 at 02:52:56PM -0400, James Bottomley wrote:
> OK, I give up, what memory leak (you don't actually mention it in your
> emails)?  All memory handling is correctly done in unprep as far as I
> can tell.

To quote my earlier mail:

>  - scsi_unprep_fn: calls ->uninit_command only for !PC request,
>	so your patch introduces a leak for discard requests

This list both very clearly what we leak and why.  Before 3.16 we'd
indeed call sd_unprep_fn for every kind of request, but we're
fortunately back to my old model from 10 years ago where BLOCK_PC
driver stay entirely away from the ULD as they should.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-05 23:52 SCSI eats error from flush failure during hot plug Steven Haber
  2014-06-07 19:29 ` James Bottomley
@ 2014-06-26 15:02 ` Christoph Hellwig
  2014-06-26 18:13   ` Steven Haber
  2014-06-26 18:52   ` James Bottomley
  1 sibling, 2 replies; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-26 15:02 UTC (permalink / raw)
  To: Steven Haber; +Cc: linux-scsi

Hi Steven,

can you test the patch below?

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f7e3163..9b56e48 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -733,6 +733,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 			scsi_next_command(cmd);
 			return;
 		}
+	} else if (req->cmd_flags & REQ_FLUSH) {
+		/*
+		 * Flush request don't transfer data, so we can't try
+		 * to complete the good bytes first before checking
+		 * the error.
+		 */
+		if (result && !sense_deferred)
+			error = __scsi_error_from_host_byte(cmd, result);
 	}
 
 	/* no bidi support for !REQ_TYPE_BLOCK_PC yet */

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 15:02 ` Christoph Hellwig
@ 2014-06-26 18:13   ` Steven Haber
  2014-06-26 18:52   ` James Bottomley
  1 sibling, 0 replies; 24+ messages in thread
From: Steven Haber @ 2014-06-26 18:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-scsi

Sure, I'll run it overnight.

On Thu, Jun 26, 2014 at 8:02 AM, Christoph Hellwig <hch@infradead.org> wrote:
> Hi Steven,
>
> can you test the patch below?
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f7e3163..9b56e48 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -733,6 +733,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>                         scsi_next_command(cmd);
>                         return;
>                 }
> +       } else if (req->cmd_flags & REQ_FLUSH) {
> +               /*
> +                * Flush request don't transfer data, so we can't try
> +                * to complete the good bytes first before checking
> +                * the error.
> +                */
> +               if (result && !sense_deferred)
> +                       error = __scsi_error_from_host_byte(cmd, result);
>         }
>
>         /* no bidi support for !REQ_TYPE_BLOCK_PC yet */

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 15:02 ` Christoph Hellwig
  2014-06-26 18:13   ` Steven Haber
@ 2014-06-26 18:52   ` James Bottomley
  2014-06-27  7:55     ` Christoph Hellwig
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2014-06-26 18:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Steven Haber, linux-scsi

On Thu, 2014-06-26 at 08:02 -0700, Christoph Hellwig wrote:
> Hi Steven,
> 
> can you test the patch below?
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f7e3163..9b56e48 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -733,6 +733,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  			scsi_next_command(cmd);
>  			return;
>  		}
> +	} else if (req->cmd_flags & REQ_FLUSH) {
> +		/*
> +		 * Flush request don't transfer data, so we can't try
> +		 * to complete the good bytes first before checking
> +		 * the error.
> +		 */
> +		if (result && !sense_deferred)
> +			error = __scsi_error_from_host_byte(cmd, result);

Well, no, that's wrong.  To catch all of the corner cases that slip
through here, the check is for a zero length command with and error
otherwise we get the same problem for failed discards.

This is what the check should be.

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f7e3163..83e447e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -733,6 +733,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 			scsi_next_command(cmd);
 			return;
 		}
+	} else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+		/*
+		 * Certain non BLOCK_PC requests are commands that don't
+		 * actually transfer anything (FLUSH, DISCARD), so cannot use
+		 * good_bytes == 0 as the signal for an error.  This sets the
+		 * error explicitly for the problem case
+		 */
+		error = __scsi_error_from_host_byte(cmd, result);
 	}
 
 	/* no bidi support for !REQ_TYPE_BLOCK_PC yet */



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-26 18:52   ` James Bottomley
@ 2014-06-27  7:55     ` Christoph Hellwig
       [not found]       ` <CAPK7rjdpxAALA-oLZJ3wDBPc3kr5Nw4jLALLxEaT0zSiAOD+wg@mail.gmail.com>
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-06-27  7:55 UTC (permalink / raw)
  To: James Bottomley; +Cc: Steven Haber, linux-scsi

On Thu, Jun 26, 2014 at 02:52:14PM -0400, James Bottomley wrote:
> > +	} else if (req->cmd_flags & REQ_FLUSH) {
> > +		/*
> > +		 * Flush request don't transfer data, so we can't try
> > +		 * to complete the good bytes first before checking
> > +		 * the error.
> > +		 */
> > +		if (result && !sense_deferred)
> > +			error = __scsi_error_from_host_byte(cmd, result);
> 
> Well, no, that's wrong.  To catch all of the corner cases that slip
> through here, the check is for a zero length command with and error
> otherwise we get the same problem for failed discards.

It's correct for the current code base where FLUSH is the only zero
length !BLOCK_PC case.  Your version is fine too, except for the
incorrect comment:

> +	} else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
> +		/*
> +		 * Certain non BLOCK_PC requests are commands that don't
> +		 * actually transfer anything (FLUSH, DISCARD), so cannot use
> +		 * good_bytes == 0 as the signal for an error.  This sets the
> +		 * error explicitly for the problem case
> +		 */

There aren't BLOCK_PC commands, otherwise they wouldn't end up here.

Note that REQ_DISCARD has been rewritten into either an WRITE_SAME or
UNMAP at this point, which always transfer data.


^ permalink raw reply	[flat|nested] 24+ messages in thread

[parent not found: <CAPK7rjdpxAALA-oLZJ3wDBPc3kr5Nw4jLALLxEaT0zSiAOD+wg@mail.gmail.com>]

* Re: SCSI eats error from flush failure during hot plug
       [not found]       ` <CAPK7rjdpxAALA-oLZJ3wDBPc3kr5Nw4jLALLxEaT0zSiAOD+wg@mail.gmail.com>
@ 2014-06-30 18:45         ` Steven Haber
  2014-07-01  8:15           ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Steven Haber @ 2014-06-30 18:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: James Bottomley, linux-scsi

Our qual rig is having some woes. I'll let you know how the test works
out after I finish resuscitating it.

On Mon, Jun 30, 2014 at 11:44 AM, Steven Haber <steven@qumulo.com> wrote:
> Our qual rig is having some woes. I'll let you know how the test works out
> after I finish resuscitating it.
>
>
> On Fri, Jun 27, 2014 at 12:55 AM, Christoph Hellwig <hch@infradead.org>
> wrote:
>>
>> On Thu, Jun 26, 2014 at 02:52:14PM -0400, James Bottomley wrote:
>> > > +   } else if (req->cmd_flags & REQ_FLUSH) {
>> > > +           /*
>> > > +            * Flush request don't transfer data, so we can't try
>> > > +            * to complete the good bytes first before checking
>> > > +            * the error.
>> > > +            */
>> > > +           if (result && !sense_deferred)
>> > > +                   error = __scsi_error_from_host_byte(cmd, result);
>> >
>> > Well, no, that's wrong.  To catch all of the corner cases that slip
>> > through here, the check is for a zero length command with and error
>> > otherwise we get the same problem for failed discards.
>>
>> It's correct for the current code base where FLUSH is the only zero
>> length !BLOCK_PC case.  Your version is fine too, except for the
>> incorrect comment:
>>
>> > +     } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
>> > +             /*
>> > +              * Certain non BLOCK_PC requests are commands that don't
>> > +              * actually transfer anything (FLUSH, DISCARD), so cannot
>> > use
>> > +              * good_bytes == 0 as the signal for an error.  This sets
>> > the
>> > +              * error explicitly for the problem case
>> > +              */
>>
>> There aren't BLOCK_PC commands, otherwise they wouldn't end up here.
>>
>> Note that REQ_DISCARD has been rewritten into either an WRITE_SAME or
>> UNMAP at this point, which always transfer data.
>>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-06-30 18:45         ` Steven Haber
@ 2014-07-01  8:15           ` Christoph Hellwig
  2014-07-03 16:57             ` Steven Haber
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-07-01  8:15 UTC (permalink / raw)
  To: Steven Haber; +Cc: James Bottomley, linux-scsi

On Mon, Jun 30, 2014 at 11:45:31AM -0700, Steven Haber wrote:
> Our qual rig is having some woes. I'll let you know how the test works
> out after I finish resuscitating it.

Might be worth to test that patch James send in reply then.  They should
have an identical effect and James seems to prefer the version testing
for the length.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-07-01  8:15           ` Christoph Hellwig
@ 2014-07-03 16:57             ` Steven Haber
  2014-07-03 17:18               ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Steven Haber @ 2014-07-03 16:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: James Bottomley, linux-scsi

Both patches work fine -- zero uncommitted writes over several hundred
power cycles.

On Tue, Jul 1, 2014 at 1:15 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jun 30, 2014 at 11:45:31AM -0700, Steven Haber wrote:
>> Our qual rig is having some woes. I'll let you know how the test works
>> out after I finish resuscitating it.
>
> Might be worth to test that patch James send in reply then.  They should
> have an identical effect and James seems to prefer the version testing
> for the length.
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-07-03 16:57             ` Steven Haber
@ 2014-07-03 17:18               ` Christoph Hellwig
  2014-07-09 22:38                 ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2014-07-03 17:18 UTC (permalink / raw)
  To: Steven Haber; +Cc: Christoph Hellwig, James Bottomley, linux-scsi

On Thu, Jul 03, 2014 at 09:57:32AM -0700, Steven Haber wrote:
> Both patches work fine -- zero uncommitted writes over several hundred
> power cycles.

Thanks for the testing.

Below is James' patch with a trivial comment fix and a proper changelog.
James, can I get your signoff for this one?  I'll throw it into the
core-for-3.16 tree then.

---
>From 92510736caa3b14a964af981ba288950042e79e3 Mon Sep 17 00:00:00 2001
From: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Thu, 3 Jul 2014 19:17:34 +0200
Subject: scsi: handle flush errors properly

Flush commands don't transfer data and thus need to be special cased
in the I/O completion handler so that we can propagate errors to
the block layer and filesystem.

Reported-by: Steven Haber <steven@qumulo.com>
Tested-by: Steven Haber <steven@qumulo.com>
---
 drivers/scsi/scsi_lib.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f7e3163..143d0e2 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -733,6 +733,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 			scsi_next_command(cmd);
 			return;
 		}
+	} else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+		/*
+		 * Certain non BLOCK_PC requests are commands that don't
+		 * actually transfer anything (FLUSH), so cannot use
+		 * good_bytes == 0 as the signal for an error.  This sets the
+		 * error explicitly for the problem case.
+		 */
+		error = __scsi_error_from_host_byte(cmd, result);
 	}
 
 	/* no bidi support for !REQ_TYPE_BLOCK_PC yet */
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-07-03 17:18               ` Christoph Hellwig
@ 2014-07-09 22:38                 ` James Bottomley
  2014-07-10  7:44                   ` Christoph Hellwig
  2014-07-17 15:27                   ` Martin K. Petersen
  0 siblings, 2 replies; 24+ messages in thread
From: James Bottomley @ 2014-07-09 22:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Steven Haber, linux-scsi

On Thu, 2014-07-03 at 10:18 -0700, Christoph Hellwig wrote:
> On Thu, Jul 03, 2014 at 09:57:32AM -0700, Steven Haber wrote:
> > Both patches work fine -- zero uncommitted writes over several hundred
> > power cycles.
> 
> Thanks for the testing.
> 
> Below is James' patch with a trivial comment fix and a proper changelog.
> James, can I get your signoff for this one?  I'll throw it into the
> core-for-3.16 tree then.

OK, here it is.  I tweaked the comment to be accurate.

James

---

>From 66aba46049037b55c5c77b45974a639f8f48b58b Mon Sep 17 00:00:00 2001
From: James Bottomley <JBottomley@Parallels.com>
Date: Thu, 3 Jul 2014 19:17:34 +0200
Subject: [PATCH] [SCSI] handle flush errors properly

Flush commands don't transfer data and thus need to be special cased
in the I/O completion handler so that we can propagate errors to
the block layer and filesystem.

Reported-by: Steven Haber <steven@qumulo.com>
Tested-by: Steven Haber <steven@qumulo.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f7e3163..3f50dfc 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -733,6 +733,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 			scsi_next_command(cmd);
 			return;
 		}
+	} else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+		/*
+		 * Certain non BLOCK_PC requests are commands that don't
+		 * actually transfer anything (FLUSH), so cannot use
+		 * good_bytes != blk_rq_bytes(req) as the signal for an error.
+		 * This sets the error explicitly for the problem case.
+		 */
+		error = __scsi_error_from_host_byte(cmd, result);
 	}
 
 	/* no bidi support for !REQ_TYPE_BLOCK_PC yet */



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-07-09 22:38                 ` James Bottomley
@ 2014-07-10  7:44                   ` Christoph Hellwig
  2014-07-17 15:27                   ` Martin K. Petersen
  1 sibling, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2014-07-10  7:44 UTC (permalink / raw)
  To: James Bottomley; +Cc: Christoph Hellwig, Steven Haber, linux-scsi

On Wed, Jul 09, 2014 at 03:38:18PM -0700, James Bottomley wrote:
> On Thu, 2014-07-03 at 10:18 -0700, Christoph Hellwig wrote:
> > On Thu, Jul 03, 2014 at 09:57:32AM -0700, Steven Haber wrote:
> > > Both patches work fine -- zero uncommitted writes over several hundred
> > > power cycles.
> > 
> > Thanks for the testing.
> > 
> > Below is James' patch with a trivial comment fix and a proper changelog.
> > James, can I get your signoff for this one?  I'll throw it into the
> > core-for-3.16 tree then.
> 
> OK, here it is.  I tweaked the comment to be accurate.

Thanks.  Can someone give me a second review for it as well?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: SCSI eats error from flush failure during hot plug
  2014-07-09 22:38                 ` James Bottomley
  2014-07-10  7:44                   ` Christoph Hellwig
@ 2014-07-17 15:27                   ` Martin K. Petersen
  1 sibling, 0 replies; 24+ messages in thread
From: Martin K. Petersen @ 2014-07-17 15:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: Christoph Hellwig, Steven Haber, linux-scsi

>>>>> "James" == James Bottomley <James.Bottomley@HansenPartnership.com> writes:
>> Below is James' patch with a trivial comment fix and a proper
>> changelog.  James, can I get your signoff for this one?  I'll throw
>> it into the core-for-3.16 tree then.

James> OK, here it is.  I tweaked the comment to be accurate.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2014-07-17 15:27 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-05 23:52 SCSI eats error from flush failure during hot plug Steven Haber
2014-06-07 19:29 ` James Bottomley
2014-06-09 17:21   ` Steven Haber
2014-06-09 17:29     ` James Bottomley
2014-06-11 13:37       ` Christoph Hellwig
2014-06-19 18:05         ` James Bottomley
2014-06-25 13:13           ` Christoph Hellwig
2014-06-26 14:02             ` James Bottomley
2014-06-26 15:00               ` Christoph Hellwig
2014-06-26 15:04                 ` James Bottomley
2014-06-26 15:08                   ` Christoph Hellwig
2014-06-26 18:52                     ` James Bottomley
2014-06-27  7:53                       ` Christoph Hellwig
2014-06-26 15:02 ` Christoph Hellwig
2014-06-26 18:13   ` Steven Haber
2014-06-26 18:52   ` James Bottomley
2014-06-27  7:55     ` Christoph Hellwig
     [not found]       ` <CAPK7rjdpxAALA-oLZJ3wDBPc3kr5Nw4jLALLxEaT0zSiAOD+wg@mail.gmail.com>
2014-06-30 18:45         ` Steven Haber
2014-07-01  8:15           ` Christoph Hellwig
2014-07-03 16:57             ` Steven Haber
2014-07-03 17:18               ` Christoph Hellwig
2014-07-09 22:38                 ` James Bottomley
2014-07-10  7:44                   ` Christoph Hellwig
2014-07-17 15:27                   ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox