linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* EH method APIs
@ 2014-04-04  7:04 Christoph Hellwig
  2014-04-04  7:17 ` Hannes Reinecke
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2014-04-04  7:04 UTC (permalink / raw)
  To: linux-scsi; +Cc: Hannes Reinecke

One think I noticed when doing the SCSI MQ work is that our EH method
signature are starting to really get into the way by passing a scsi_cmnd
as the only argument.  While we'll obviously need the command we want
to abort for eh_abort the resets aren't command specific at all and
passing the command doesn't seem too helpful in general.

There's two specific reasons why it's getting in it's way:

 - With the cmd_size field in the host template we can now allocate
   driver specific data as part of the scsi_cmnd, but we'll usually
   still need driver specific data to do the actual error handling.
   The virtio_scsi driver conversion I posted is a good example of that.
 - The scsi_reset_provider situation is getting worse: this fakes up
   a request on stack, then allocates a scsi_command which doesn't get
   fully set up and points to it and calls the eh_reset* methods on it.
   For now we can keep doing that even with blk-mq, but if we eventually
   want to remove the old code we need a way to fake up the request/cmnd
   combo.  Even until then drivers get a command that subtly different
   from normal ones from scsi_reset_provider in the old code case, and
   even more subtly different for scsi-mq.

I wonder if we should start adding new methods that pass a tmf context
soon.  I think Hannes was looking into EH methods that match the SAM
error handling concepts better anyway, so this might be a good synergy.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: EH method APIs
  2014-04-04  7:04 EH method APIs Christoph Hellwig
@ 2014-04-04  7:17 ` Hannes Reinecke
  2014-04-04  7:24   ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Hannes Reinecke @ 2014-04-04  7:17 UTC (permalink / raw)
  To: Christoph Hellwig, linux-scsi

On 04/04/2014 09:04 AM, Christoph Hellwig wrote:
> One think I noticed when doing the SCSI MQ work is that our EH method
> signature are starting to really get into the way by passing a scsi_cmnd
> as the only argument.  While we'll obviously need the command we want
> to abort for eh_abort the resets aren't command specific at all and
> passing the command doesn't seem too helpful in general.
> 
> There's two specific reasons why it's getting in it's way:
> 
>  - With the cmd_size field in the host template we can now allocate
>    driver specific data as part of the scsi_cmnd, but we'll usually
>    still need driver specific data to do the actual error handling.
>    The virtio_scsi driver conversion I posted is a good example of that.
>  - The scsi_reset_provider situation is getting worse: this fakes up
>    a request on stack, then allocates a scsi_command which doesn't get
>    fully set up and points to it and calls the eh_reset* methods on it.
>    For now we can keep doing that even with blk-mq, but if we eventually
>    want to remove the old code we need a way to fake up the request/cmnd
>    combo.  Even until then drivers get a command that subtly different
>    from normal ones from scsi_reset_provider in the old code case, and
>    even more subtly different for scsi-mq.
> 
> I wonder if we should start adding new methods that pass a tmf context
> soon.  I think Hannes was looking into EH methods that match the SAM
> error handling concepts better anyway, so this might be a good synergy.
> 
My next step for the EH updates would be to get rid of the annoying
'holding on to the original scmd until my last breath' strategy.

Plan here is to allocate a dedicated EH command which then could be
used to send the SCSI commands during EH.
This would allow use to return the original command as soon as we've
entered SCSI EH. Of course this will trigger a retry, but EH will
block command submission until it is done.
Thereby we should be having much the same behaviour as we have now,
with the marked difference that control of the command is now being
passed back to the block layer during SCSI EH.
This allows for a much saner request handling, and allows things
like multipathing to work far better even when EH is running.
_And_ we can basically terminate SCSI EH at any time as no commands
are pending within the SCSI midlayer anymore.

Plus we don't meddle with block request allocation intrinsics
anymore; the SCSI EH command is allocate within the SCSI midlayer,
and requests originating from the block layer won't be messed with.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: EH method APIs
  2014-04-04  7:17 ` Hannes Reinecke
@ 2014-04-04  7:24   ` Christoph Hellwig
  2014-04-04  7:36     ` Hannes Reinecke
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2014-04-04  7:24 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Christoph Hellwig, linux-scsi

On Fri, Apr 04, 2014 at 09:17:21AM +0200, Hannes Reinecke wrote:
> Plus we don't meddle with block request allocation intrinsics
> anymore; the SCSI EH command is allocate within the SCSI midlayer,
> and requests originating from the block layer won't be messed with.

Any chance we could also switch to a new scsi_tmf structure instead
of reusing struct scsi_cmnd for that?  It should be a lot smaller
without all the baggage, and make it clear we're not dealing with a
command.  That would also solve the scsi_reset_provide issue.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: EH method APIs
  2014-04-04  7:24   ` Christoph Hellwig
@ 2014-04-04  7:36     ` Hannes Reinecke
  2014-04-04  7:42       ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Hannes Reinecke @ 2014-04-04  7:36 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-scsi

On 04/04/2014 09:24 AM, Christoph Hellwig wrote:
> On Fri, Apr 04, 2014 at 09:17:21AM +0200, Hannes Reinecke wrote:
>> Plus we don't meddle with block request allocation intrinsics
>> anymore; the SCSI EH command is allocate within the SCSI midlayer,
>> and requests originating from the block layer won't be messed with.
> 
> Any chance we could also switch to a new scsi_tmf structure instead
> of reusing struct scsi_cmnd for that?  It should be a lot smaller
> without all the baggage, and make it clear we're not dealing with a
> command.  That would also solve the scsi_reset_provide issue.
> 
Have to look into this. Problem is that some eh_XXX callback
implementations actually are using the command to send the TMF;
will need to look into them how and if they could be changed.

Actually, alongside with that change I would update the arguments
to the callback functions to align with the strategy level;
eh_device_reset() would be getting struct scsi_device as argument,
eh_target_reset() would get getting struct scsi_target as argument etc.

So the actual SCSI EH would not be tied to struct scsi_cmnd, and we
would just be needing something to send TUR etc in the course of
SCSI EH. Whether that needs to be a struct scsi_cmnd or something
else can discussed.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: EH method APIs
  2014-04-04  7:36     ` Hannes Reinecke
@ 2014-04-04  7:42       ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2014-04-04  7:42 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi

On Fri, Apr 04, 2014 at 09:36:54AM +0200, Hannes Reinecke wrote:
> Have to look into this. Problem is that some eh_XXX callback
> implementations actually are using the command to send the TMF;
> will need to look into them how and if they could be changed.

That's indeed a bit of a problem, guess we need to do an audit
on how the commands are used at the moment.

> Actually, alongside with that change I would update the arguments
> to the callback functions to align with the strategy level;
> eh_device_reset() would be getting struct scsi_device as argument,
> eh_target_reset() would get getting struct scsi_target as argument etc.
> 
> So the actual SCSI EH would not be tied to struct scsi_cmnd, and we
> would just be needing something to send TUR etc in the course of
> SCSI EH.

Right.  I think the commands should go through the normal BLOCK_PC code
path, we'd just need to make sure they get priority in the request
allocator.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-04-04  7:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-04  7:04 EH method APIs Christoph Hellwig
2014-04-04  7:17 ` Hannes Reinecke
2014-04-04  7:24   ` Christoph Hellwig
2014-04-04  7:36     ` Hannes Reinecke
2014-04-04  7:42       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).