[PATCH] make the SCSI mid-layer obey the device online flag

public inbox for linux-scsi@vger.kernel.org

* [PATCH] make the SCSI mid-layer obey the device online flag
@ 2003-06-04 16:01 James Bottomley
  2003-06-04 16:51 ` Mike Anderson
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2003-06-04 16:01 UTC (permalink / raw)
  To: SCSI Mailing List; +Cc: Alan Stern

[-- Attachment #1: Type: text/plain, Size: 893 bytes --]

It has been pointed out by the USB people that the mid-layer doesn't
obey its own online flag.

The attached patch should fix this.  However, there are a few caveats to
offlining (read that as devices should still be prepared to process
commands).

1. Any special command will still be accepted (that's a command either
via the SCSI_IOCTL_SEND_COMMAND, or an internally generated command).
2. Outstanding, already processed commands in the queue will still be sent (i.e. commands
which have already been through the upper layer drivers but needed
requeuing for some reason like QUEUE_FULL or device busy).

I'm willing to consider changing 2.; it just requires more specialised
logic to distinguish between a command that has been prepared by the
upper level drivers and a command sent via 1.

However, note that LLDs may not assume they will receive no commands just
because scsi_device->online is zero.

James

[-- Attachment #2: tmp.diff --]
[-- Type: text/plain, Size: 825 bytes --]

===== drivers/scsi/scsi_lib.c 1.92 vs edited =====
--- 1.92/drivers/scsi/scsi_lib.c	Mon May 26 05:50:43 2003
+++ edited/drivers/scsi/scsi_lib.c	Wed Jun  4 11:43:01 2003
@@ -945,6 +945,18 @@
 			cmd = req->special;
 	} else if (req->flags & (REQ_CMD | REQ_BLOCK_PC)) {
 		/*
+		 * Just check to see if the device is online.  If
+		 * it isn't, we refuse to process ordinary commands
+		 * (we will allow specials just in case someone needs
+		 * to send a command to an offline device without bringing
+		 * it back online)
+		 */
+		if(!sdev->online) {
+			printk(KERN_ERR "scsi%d (%d:%d): rejecting I/O to offline device\n",
+			       sdev->host->host_no, sdev->id, sdev->lun);
+			return BLKPREP_KILL;
+		}
+		/*
 		 * Now try and find a command block that we can use.
 		 */
 		if (!req->special) {

^ permalink raw reply	[flat|nested] 23+ messages in thread
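
The decision the patch adds to the prep function can be modelled outside the kernel. The following is only a minimal sketch, assuming the branch shown in the diff context ahead of the new check handles special requests; the enum values, `struct model_sdev` and `model_prep()` are hypothetical stand-ins, not kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the request flags named in the patch. */
enum model_req_flags {
	MODEL_REQ_CMD      = 1 << 0,	/* ordinary filesystem read/write */
	MODEL_REQ_BLOCK_PC = 1 << 1,	/* packet command from the block layer */
	MODEL_REQ_SPECIAL  = 1 << 2,	/* internal or SCSI_IOCTL_SEND_COMMAND */
};

/* Hypothetical stand-ins for the prep function's verdict. */
enum model_prep_result { MODEL_PREP_OK, MODEL_PREP_KILL };

/* Only the field the check looks at is modelled. */
struct model_sdev {
	bool online;
};

/*
 * Specials are accepted without looking at the online flag; ordinary
 * commands are killed while the device is offline.
 */
static enum model_prep_result model_prep(const struct model_sdev *sdev,
					 unsigned int flags)
{
	if (flags & MODEL_REQ_SPECIAL)
		return MODEL_PREP_OK;
	if (flags & (MODEL_REQ_CMD | MODEL_REQ_BLOCK_PC))
		return sdev->online ? MODEL_PREP_OK : MODEL_PREP_KILL;
	/* Unknown request type: rejecting it is a choice of this sketch. */
	return MODEL_PREP_KILL;
}
```

With `online` cleared, `model_prep()` kills a `MODEL_REQ_CMD` request but still accepts a `MODEL_REQ_SPECIAL` one, which is caveat 1 above.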

* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-04 16:01 [PATCH] make the SCSI mid-layer obey the device online flag James Bottomley
@ 2003-06-04 16:51 ` Mike Anderson
  2003-06-04 19:14   ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2003-06-04 16:51 UTC (permalink / raw)
  To: James Bottomley; +Cc: SCSI Mailing List, Alan Stern

James Bottomley [James.Bottomley@steeleye.com] wrote:
> It has been pointed out by the USB people that the mid-layer doesn't
> obey its own online flag.
> 
> The attached patch should fix this.  However, there are a few caveats to
> offlining (read that as devices should still be prepared to process
> commands).
> 
> 1. Any special command will still be accepted (that's a command either
> via the SCSI_IOCTL_SEND_COMMAND, or an internally generated command).
> 2. Outstanding, already processed commands in the queue will still be sent (i.e. commands
> which have already been through the upper layer drivers but needed
> requeuing for some reason like QUEUE_FULL or device busy).
> 
> I'm willing to consider changing 2.; it just requires more specialised
> logic to distinguish between a command that has been prepared by the
> upper level drivers and a command sent via 1.
> 
> However, note that LLDs may not assume they will receive no commands just
> because scsi_device->online is zero.
> 
> James
> 

Doesn't this patch just re-implement in the prep_fn what is already
being done by sd and sr in their init_command functions?

Why allow I/O after online goes to zero? The user could bring the device
back online if they needed to send IO. I was counting on no IO so we could
do faster cleanup in the scsi_remove_host function.

I have been looking at this, but it is not very clean. To use existing
common functionality and avoid deadlock from calling back into the
request_fn the command needs wasted preparation just to use common
interfaces.

-andmike
--
Michael Anderson
andmike@us.ibm.com



* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-04 16:51 ` Mike Anderson
@ 2003-06-04 19:14   ` James Bottomley
  2003-06-05  0:34     ` Patrick Mansfield
  2003-06-06  6:36     ` Christoph Hellwig
  0 siblings, 2 replies; 23+ messages in thread
From: James Bottomley @ 2003-06-04 19:14 UTC (permalink / raw)
  To: Mike Anderson; +Cc: SCSI Mailing List, Alan Stern

On Wed, 2003-06-04 at 12:51, Mike Anderson wrote:
> Doesn't this patch just re-implement in the prep_fn what is already
> being done by sd and sr in their init_command functions?

Yes, it's a precursor to consolidating them.

> Why allow I/O after online goes to zero? The user could bring the device
> back online if they needed to send IO. I was counting on no IO so we could
> do faster cleanup in the scsi_remove_host function.

Fundamentally, a queue is an asynchronous thing.  It is difficult (but
not impossible) to make the setting offline atomically guarantee no more
commands will be sent down.

However, on a philosophical level, it isn't necessarily desirable. 
Suppose we use offline to disconnect from a mounted filesystem (say
USB/Firewire unplug).  The user level might want to probe the device
before setting the online flag (which will resume the unerrored fs
transactions).

> I have been looking at this, but it is not very clean. To use existing
> common functionality and avoid deadlock from calling back into the
> request_fn the command needs wasted preparation just to use common
> interfaces.

Yes, that's why I think forbidding *all* I/O after offlining is too much
effort.  Offlining should be a precursor to device destruction, but
actual destruction probably relies on detaching the queue from the block
device interface and sitting on it until all use counts drop to zero.

James


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-04 19:14   ` James Bottomley
@ 2003-06-05  0:34     ` Patrick Mansfield
  2003-06-05 12:59       ` James Bottomley
  2003-06-05 13:41       ` Alan Stern
  2003-06-06  6:36     ` Christoph Hellwig
  1 sibling, 2 replies; 23+ messages in thread
From: Patrick Mansfield @ 2003-06-05  0:34 UTC (permalink / raw)
  To: James Bottomley; +Cc: Mike Anderson, SCSI Mailing List, Alan Stern

I thought that USB sending a command after online cleared was likely the
last prepped request being sent. This can't be fixed within scsi_prep_fn,
since it will not be called for the last request after online is cleared.

So we might as well move all the checking of online into the
scsi_request_fn.

-- Patrick Mansfield


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-05  0:34     ` Patrick Mansfield
@ 2003-06-05 12:59       ` James Bottomley
  2003-06-05 13:41       ` Alan Stern
  1 sibling, 0 replies; 23+ messages in thread
From: James Bottomley @ 2003-06-05 12:59 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: Mike Anderson, SCSI Mailing List, Alan Stern

On Wed, 2003-06-04 at 20:34, Patrick Mansfield wrote:
> I thought that USB sending a command after online cleared was likely the
> last prepped request being sent. This can't be fixed within scsi_prep_fn,
> since it will not be called for the last request after online is cleared.
> 
> So we might as well move all the checking of online into the
> scsi_request_fn.

Not unless there's agreement that prepared commands need killing.  To do
this in the simple fashion I outlined in the email (with the two
conditions), the prep function is the correct place for the check.

If the check is moved into the request function you have to worry about
freeing the allocated structures and terminating it yourself, which adds
unnecessary complexity.

Since the LLD knows it must handle commands until the slave_destroy, I
don't see a compelling reason to go to extraordinary lengths to prevent
it from seeing them.

James
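
The difference between the two placements can be seen in a toy model. This is a sketch under stated assumptions only: `struct model_req` and `model_reaches_lld()` are hypothetical, and how the kernel actually marks a requeued command is not modelled. It only encodes the two conditions from the first mail: specials and already prepared commands bypass the online check.

```c
#include <stdbool.h>

/* Hypothetical request: just the two properties the thread discusses. */
struct model_req {
	bool special;	/* a special command (caveat 1) */
	bool prepared;	/* already through the upper layer drivers (caveat 2) */
};

/*
 * Returns true if the request would still be handed to the LLD when the
 * online check lives in the prep function.
 */
static bool model_reaches_lld(bool online, const struct model_req *rq)
{
	if (rq->prepared)
		return true;	/* requeued earlier: the prep check is not repeated */
	if (rq->special)
		return true;	/* specials are always accepted */
	return online;		/* fresh ordinary command: the flag decides */
}
```

A command requeued after QUEUE_FULL therefore reaches the LLD even with `online` cleared, which is why the LLD must keep handling commands until slave_destroy.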


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-05  0:34     ` Patrick Mansfield
  2003-06-05 12:59       ` James Bottomley
@ 2003-06-05 13:41       ` Alan Stern
  1 sibling, 0 replies; 23+ messages in thread
From: Alan Stern @ 2003-06-05 13:41 UTC (permalink / raw)
  To: Patrick Mansfield; +Cc: James Bottomley, Mike Anderson, SCSI Mailing List

On Wed, 4 Jun 2003, Patrick Mansfield wrote:

> I thought that USB sending a command after online cleared was likely the
> last prepped request being sent. This can't be fixed within scsi_prep_fn,
> since it will not be called for the last request after online is cleared.
> 
> So we might as well move all the checking of online into the
> scsi_request_fn.

As far as USB is concerned, it doesn't matter very much if any commands
are sent after online is cleared, provided it's a relatively limited
number of commands.  I just would like to clarify whether or not there is
a guarantee that _no_ commands will be sent.  If it's explicitly agreed
(and it would help to put it into a comment or kerneldoc) that there is no
such guarantee, then little code needs to be changed.

Of course, it goes without saying that if a command is sent when online is
clear, the midlayer must be prepared to deal gracefully with the
inevitable error or failure of that command -- don't start a lengthy
series of retries or are-you-still-alive probes.

Alan Stern


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-04 19:14   ` James Bottomley
  2003-06-05  0:34     ` Patrick Mansfield
@ 2003-06-06  6:36     ` Christoph Hellwig
  2003-06-06 15:19       ` James Bottomley
  2003-06-06 15:28       ` Luben Tuikov
  1 sibling, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2003-06-06  6:36 UTC (permalink / raw)
  To: James Bottomley; +Cc: Mike Anderson, SCSI Mailing List, Alan Stern

On Wed, Jun 04, 2003 at 03:14:56PM -0400, James Bottomley wrote:
> Yes, that's why I think forbidding *all* I/O after offlining is too much
> effort.  Offlining should be a precursor to device destruction, but
> actual destruction probably relies on detaching the queue from the block
> device interface and sitting on it until all use counts drop to zero.

We need a way to disable _all_ I/O to a device due to the way the driver
model works.  The driver model ->remove always is a surprise removal,
so the underlying PCI (or whatever) device for a scsi host can go away
anytime.  Because of that we need to make damn sure no call to
->queuecommand will happen after scsi_remove_host is called.  Whether
this is implemented with the same mechanisms as the current sdev->online
is another question.



* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06  6:36     ` Christoph Hellwig
@ 2003-06-06 15:19       ` James Bottomley
  2003-06-06 15:51         ` Oliver Neukum
  2003-06-06 16:02         ` Luben Tuikov
  2003-06-06 15:28       ` Luben Tuikov
  1 sibling, 2 replies; 23+ messages in thread
From: James Bottomley @ 2003-06-06 15:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Mike Anderson, SCSI Mailing List, Alan Stern

On Fri, 2003-06-06 at 02:36, Christoph Hellwig wrote:
> We need a way to disable _all_ I/O to a device due to the way the driver
> model works.  The driver model ->remove always is a surprise removal,
> so the underlying PCI (or whatever) device for a scsi host can go away
> anytime.  Because of that we need to make damn sure no call to
> ->queuecommand will happen after scsi_remove_host is called.  Whether
> this is implemented with the same mechanisms as the current sdev->online
> is another question.

I'm not entirely convinced.  Sure, the device has gone away, but the
driver is still there.  If you rip out a real SCSI disc, all subsequent
commands will error with DID_NO_CONNECT, I don't see why this should be
so hard for other types of storage to follow.

The question is whether we should allow (and any LLD actually wants) the
ability to tell the mid-layer that it will accept no further commands at
all.

At the moment, I think we can handle both ejection scenarios adequately
without forbidding all commands to the LLD:

Surprise ejection:

driver starts returning DID_NO_CONNECT to commands
possibly hotplug notify of surprise ejection
device->online to be reset
command queue drains
command queue drops to zero, ->remove can be called and everything
cleaned up from user level (including remove-single-device)

Nice ejection:

hotplug notify of ejection request
eject script cleans up, unmounts, possibly resets online
command queue drains
command queue drops to zero, ->remove can be called and everything
cleaned up from user level (including remove-single-device)
ejection may now proceed

For the remove_host scenario, it would be convenient just to do
nice/surprise ejections on all the devices (from user level) and then do
the clean up when the host's device count falls to zero.  That may imply
some type of host->offline flag which disallows the addition of new scsi
devices to the host.

James
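
The surprise ejection sequence above can be walked through with a small model. It is a sketch, not mid-layer code: `struct model_dev` and the `model_*` helpers are hypothetical, and the `DID_*` values and the 16-bit shift into the host byte are assumed to match the kernel's definitions.

```c
#include <stdbool.h>

#define DID_OK         0x00
#define DID_NO_CONNECT 0x01	/* assumed to match the kernel's host byte code */

/* Hypothetical device: only what the ejection sequence needs. */
struct model_dev {
	bool present;		/* hardware still attached */
	bool online;
	int  outstanding;	/* commands queued but not yet completed */
};

/* The device vanishes: the driver will now fail commands, online is reset. */
static void model_surprise_eject(struct model_dev *d)
{
	d->present = false;
	d->online = false;
}

/*
 * LLD side: complete one queued command.  Returns the result word with the
 * host byte set, or -1 if the queue is already empty.
 */
static int model_complete_one(struct model_dev *d)
{
	if (d->outstanding == 0)
		return -1;
	d->outstanding--;
	return (d->present ? DID_OK : DID_NO_CONNECT) << 16;
}

/* ->remove may run only once the device is offline and the queue drained. */
static bool model_can_remove(const struct model_dev *d)
{
	return !d->online && d->outstanding == 0;
}
```

Starting with two outstanding commands, `model_can_remove()` stays false after the ejection until both have completed with DID_NO_CONNECT in the host byte.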


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06  6:36     ` Christoph Hellwig
  2003-06-06 15:19       ` James Bottomley
@ 2003-06-06 15:28       ` Luben Tuikov
  2003-06-06 15:39         ` James Bottomley
                           ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Luben Tuikov @ 2003-06-06 15:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James Bottomley, Mike Anderson, SCSI Mailing List, Alan Stern

Christoph Hellwig wrote:
> On Wed, Jun 04, 2003 at 03:14:56PM -0400, James Bottomley wrote:
> 
>>Yes, that's why I think forbidding *all* I/O after offlining is too much
>>effort.  Offlining should be a precursor to device destruction, but
>>actual destruction probably relies on detaching the queue from the block
>>device interface and sitting on it until all use counts drop to zero.
> 
> 
> We need a way to disable _all_ I/O to a device due to the way the driver
> model works.  The driver model ->remove always is a surprise removal,
> so the underlying PCI (or whatever) device for a scsi host can go away
> anytime.  Because of that we need to make damn sure no call to
> ->queuecommand will happen after scsi_remove_host is called.  Whether
> this is implemented with the same mechanisms as the current sdev->online
> is another question.

James was talking about ``device server not ready'' aliased to
online -- i.e. if online is 0 then only INQUIRY or TUR should
be sent from SCSI Core (unless SCSI Core decides to _know_ about
the different ULP).  The semantics on this are somewhat touched
in the INQUIRY description in SPC-3.

(This means that the device is on the SAN, but not ready to
process commands.)

You're talking about hardware removal -- more specifically
logical removal of a host -- in which case of course there's no
_host_, so how can online be checked.... (online being
member of sdev, sdev being parented by shost...)

IOW, you probably need either a flag in shost, to mean that
it's logically removed (whether hdwr or not) until it is
actually gone, or an external flag per host... it's your call.

-- 
Luben


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:28       ` Luben Tuikov
@ 2003-06-06 15:39         ` James Bottomley
  2003-06-06 15:52           ` Luben Tuikov
  2003-06-06 20:23         ` Mike Anderson
  2003-06-06 20:49         ` Christoph Hellwig
  2 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2003-06-06 15:39 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Christoph Hellwig, Mike Anderson, SCSI Mailing List, Alan Stern

On Fri, 2003-06-06 at 11:28, Luben Tuikov wrote:
> James was talking about ``device server not ready'' aliased to
> online -- i.e. if online is 0 then only INQUIRY or TUR should
> be sent from SCSI Core (unless SCSI Core decides to _know_ about
> the different ULP).  The semantics on this are somewhat touched
> in the INQUIRY description in SPC-3.

Actually, not just TUR and INQUIRY.  Think about errors on removable
media: we offline the device because of them and now the user needs to
unlock the door and eject the cartridge...

Thus, we allow any special command because we assume it's part of error
handling (or post error clean up).

James




* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:19       ` James Bottomley
@ 2003-06-06 15:51         ` Oliver Neukum
  2003-06-06 16:02         ` Luben Tuikov
  1 sibling, 0 replies; 23+ messages in thread
From: Oliver Neukum @ 2003-06-06 15:51 UTC (permalink / raw)
  To: James Bottomley, Christoph Hellwig
  Cc: Mike Anderson, SCSI Mailing List, Alan Stern


> For the remove_host scenario, it would be convenient just to do
> nice/surprise ejections on all the devices (from user level) and then do
> the clean up when the host's device count falls to zero.  That may imply
> some type of host->offline flag which disallows the addition of new scsi
> devices to the host.

There's no use in that distinction. The hard, _common_  case to be solved is
surprise ejection. If you have solved that you've solved everything else with
it. Keeping any distinction here just complicates things. It just blurs the
issue.

	Regards
		Oliver




* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:39         ` James Bottomley
@ 2003-06-06 15:52           ` Luben Tuikov
  2003-06-06 16:04             ` James Bottomley
  2003-06-06 20:51             ` Christoph Hellwig
  0 siblings, 2 replies; 23+ messages in thread
From: Luben Tuikov @ 2003-06-06 15:52 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, Mike Anderson, SCSI Mailing List, Alan Stern

James Bottomley wrote:
> On Fri, 2003-06-06 at 11:28, Luben Tuikov wrote:
> 
>>James was talking about ``device server not ready'' aliased to
>>online -- i.e. if online is 0 then only INQUIRY or TUR should
>>be sent from SCSI Core (unless SCSI Core decides to _know_ about
>>the different ULP).  The semantics on this are somewhat touched
>>in the INQUIRY description in SPC-3.
> 
> 
> Actually, not just TUR and INQUIRY.  Think about errors on removable
> media: we offline the device because of them and now the user needs to
> unlock the door and eject the cartridge...
> 
> Thus, we allow any special command because we assume it's part of error
> handling (or post error clean up).

So SCSI Core _has_ decided to know about ULP (block, tape, optical).

Unless we (SCSI Core) generate those, it's a pickle to decide
which are ``special'' enough commands.  I think that we'll see
more user space drivers controlling devices via sg sending
commands to the device for exactly those kinds of problems...
Else the burden on SCSI Core will/might be too great.

-- 
Luben





* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:19       ` James Bottomley
  2003-06-06 15:51         ` Oliver Neukum
@ 2003-06-06 16:02         ` Luben Tuikov
  1 sibling, 0 replies; 23+ messages in thread
From: Luben Tuikov @ 2003-06-06 16:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, Mike Anderson, SCSI Mailing List, Alan Stern

James Bottomley wrote:
> Surprise ejection:
> 
> driver starts returning DID_NO_CONNECT to commands
> possibly hotplug notify of surprise ejection
> device->online to be reset
> command queue drains
> command queue drops to zero, ->remove can be called and everything
> cleaned up from user level (including remove-single-device)
> 
> Nice ejection:
> 
> hotplug notify of ejection request
> eject script cleans up, unmounts, possibly resets online
> command queue drains
> command queue drops to zero, ->remove can be called and everything
> cleaned up from user level (including remove-single-device)
> ejection may now proceed

Exactly right!  (This is ``ejection'' of a device off the SAN.)
 
> For the remove_host scenario, it would be convenient just to do
> nice/surprise ejections on all the devices (from user level) and then do
> the clean up when the host's device count falls to zero.  That may imply
> some type of host->offline flag which disallows the addition of new scsi
> devices to the host.

Just to be clear: this is when a PCI host (SCSI portal) becomes
unavailable. (quite different from the above)

In which case SCSI host should NOT call the queuecommand
(it being a property of the portal) and if user space drivers submit
commands to a device thereof or the host, SCSI Core should return said
commands with response SERVICE DELIVERY OR TARGET FAILURE (this is NOT
the status SAM_xxx).

-- 
Luben
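
A host-level gate of the kind described above might look like the following. This is purely illustrative: the `removed` flag, `struct model_host` and `model_dispatch()` are hypothetical and do not exist in the mid-layer as discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical outcome of handing a command to the core. */
enum model_response {
	MODEL_SENT_TO_LLD,
	MODEL_SERVICE_DELIVERY_FAILURE,	/* a response, not a SAM status */
};

/* Hypothetical host: a removal flag plus a counter standing in for the LLD. */
struct model_host {
	bool removed;			/* hypothetical "logically removed" flag */
	int  queuecommand_calls;	/* how often the LLD entry point was invoked */
};

/* Once the host is marked removed, the LLD entry point is never called. */
static enum model_response model_dispatch(struct model_host *h)
{
	if (h->removed)
		return MODEL_SERVICE_DELIVERY_FAILURE;
	h->queuecommand_calls++;
	return MODEL_SENT_TO_LLD;
}
```

The point of the sketch is that the check sits in the core, ahead of the LLD, so the answer does not depend on any per-device online flag.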




* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:52           ` Luben Tuikov
@ 2003-06-06 16:04             ` James Bottomley
  2003-06-06 20:51             ` Christoph Hellwig
  1 sibling, 0 replies; 23+ messages in thread
From: James Bottomley @ 2003-06-06 16:04 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Christoph Hellwig, Mike Anderson, SCSI Mailing List, Alan Stern

On Fri, 2003-06-06 at 11:52, Luben Tuikov wrote:
> So SCSI Core _has_ decided to know about ULP (block, tape, optical).
> 
> Unless we (SCSI Core) generate those, it's a pickle to decide
> which are ``special'' enough commands.  I think that we'll see
> more user space drivers controlling devices via sg sending
> commands to the device for exactly those kinds of problems...
> Else the burden on SCSI Core will/might be too great.

By "special" I mean requests marked REQ_SPECIAL.  These are generated
either internally or via SCSI_IOCTL_SEND_COMMAND.

James




* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:28       ` Luben Tuikov
  2003-06-06 15:39         ` James Bottomley
@ 2003-06-06 20:23         ` Mike Anderson
  2003-06-06 20:52           ` Christoph Hellwig
  2003-06-10  0:00           ` Mike Anderson
  2003-06-06 20:49         ` Christoph Hellwig
  2 siblings, 2 replies; 23+ messages in thread
From: Mike Anderson @ 2003-06-06 20:23 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Christoph Hellwig, James Bottomley, SCSI Mailing List, Alan Stern

Luben Tuikov [tluben@rogers.com] wrote:
> James was talking about ``device server not ready'' aliased to
> online -- i.e. if online is 0 then only INQUIRY or TUR should
> be sent from SCSI Core (unless SCSI Core decides to _know_ about
> the different ULP).  The semantics on this are somewhat touched
> in the INQUIRY description in SPC-3.
> 
> (This means that the device is on the SAN, but not ready to
> process commands.)
> 
> You're talking about hardware removal -- more specifically
> logical removal of a host -- in which case of course there's no
> _host_, so how can online be checked.... (online being
> member of sdev, sdev being parented by shost...)
> 
> IOW, you probably need either a flag in shost, to mean that
> it's logically removed (whether hdwr or not) until it is
> actually gone, or an external flag per host... it's your call.
> 

I agree we need some indication that the host is gone, as mentioned in the
above paragraph and previously by others in similar threads.

Instead of talking bit fields it might be good to write down the states
for scsi_device and Scsi_Host and the policy for each state. This is
partially being done in this thread, but I believe it is confusing as we
keep mapping to an already overloaded bit field.

I am currently updating my previously posted SCSI Mid refcounting text
to match recent updates by Christoph. I am also adding current bit fields
in scsi_device and Scsi_Host used for state.

It might be helpful to indicate when a hotplug event will go out and the
expected action. I currently do not have this in the document.

-andmike
--
Michael Anderson
andmike@us.ibm.com
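
Writing the states down as a table, rather than as bit fields, could take a form like this. The three states and two request classes are assumptions drawn from the positions taken in this thread, and every name here is hypothetical; it is not the document referred to above.

```c
#include <stdbool.h>

/* Hypothetical states, one per situation discussed in the thread. */
enum model_state {
	MODEL_DEV_ONLINE,
	MODEL_DEV_OFFLINE,
	MODEL_HOST_REMOVED,
	MODEL_STATE_COUNT
};

/* Hypothetical request classes. */
enum model_req_class {
	MODEL_REQ_ORDINARY,	/* REQ_CMD or REQ_BLOCK_PC */
	MODEL_REQ_SPECIAL_CLASS,	/* REQ_SPECIAL */
	MODEL_REQ_CLASS_COUNT
};

/* Policy table: may a request of this class be sent in this state? */
static const bool model_policy[MODEL_STATE_COUNT][MODEL_REQ_CLASS_COUNT] = {
	[MODEL_DEV_ONLINE]   = { [MODEL_REQ_ORDINARY] = true,
				 [MODEL_REQ_SPECIAL_CLASS] = true },
	[MODEL_DEV_OFFLINE]  = { [MODEL_REQ_ORDINARY] = false,
				 [MODEL_REQ_SPECIAL_CLASS] = true },
	[MODEL_HOST_REMOVED] = { [MODEL_REQ_ORDINARY] = false,
				 [MODEL_REQ_SPECIAL_CLASS] = false },
};

static bool model_allowed(enum model_state s, enum model_req_class c)
{
	return model_policy[s][c];
}
```

Keeping the policy in one table makes each state's rules explicit; adding a state means adding a row instead of overloading an existing bit.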


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:28       ` Luben Tuikov
  2003-06-06 15:39         ` James Bottomley
  2003-06-06 20:23         ` Mike Anderson
@ 2003-06-06 20:49         ` Christoph Hellwig
  2003-06-06 23:21           ` Luben Tuikov
  2 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2003-06-06 20:49 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: James Bottomley, Mike Anderson, SCSI Mailing List, Alan Stern

On Fri, Jun 06, 2003 at 11:28:22AM -0400, Luben Tuikov wrote:
> James was talking about ``device server not ready'' aliased to
> online -- i.e. if online is 0 then only INQUIRY or TUR should
> be sent from SCSI Core (unless SCSI Core decides to _know_ about
> the different ULP).  The semantics on this are somewhat touched
> in the INQUIRY description in SPC-3.

Okay, I was thinking of !online as meaning the device is physically no
longer present on the bus.

> 
> (This means that the device is on the SAN, but not ready to
> process commands.)
> 
> You're talking about hardware removal -- more specifically
> logical removal of a host

host or device.  Think about sbp2/ieee1394 or FC.



* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 15:52           ` Luben Tuikov
  2003-06-06 16:04             ` James Bottomley
@ 2003-06-06 20:51             ` Christoph Hellwig
  2003-06-06 23:27               ` Luben Tuikov
  1 sibling, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2003-06-06 20:51 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: James Bottomley, Mike Anderson, SCSI Mailing List, Alan Stern

On Fri, Jun 06, 2003 at 11:52:56AM -0400, Luben Tuikov wrote:
> > Thus, we allow any special command because we assume it's part of error
> > handling (or post error clean up).
> 
> So SCSI Core _has_ decided to know about ULP (block, tape, optical).
> 
> Unless we (SCSI Core) generate those, it's a pickle to decide
> which are ``special'' enough commands.  I think that we'll see
> more user space drivers controlling devices via sg sending
> commands to the device for exactly those kinds of problems...
> Else the burden on SCSI Core will/might be too great.

I think James means REQ_SPECIAL request.  But some of them aren't
special enough so we might need another flag for those.  But
having knowledge about the upper drivers in the core sounds like
a really bad idea.


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 20:23         ` Mike Anderson
@ 2003-06-06 20:52           ` Christoph Hellwig
  2003-06-10  0:00           ` Mike Anderson
  1 sibling, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2003-06-06 20:52 UTC (permalink / raw)
  To: Luben Tuikov, Christoph Hellwig, James Bottomley,
	SCSI Mailing List, Alan Stern

On Fri, Jun 06, 2003 at 01:23:50PM -0700, Mike Anderson wrote:
> Instead of talking bit fields it might be good to write down the states
> for scsi_device and Scsi_Host and the policy for each state. This is
> partially being done in this thread, but I believe it is confusing as we
> keep mapping to an already overloaded bit field.

Yes, please, please, please! :)



* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 20:49         ` Christoph Hellwig
@ 2003-06-06 23:21           ` Luben Tuikov
  0 siblings, 0 replies; 23+ messages in thread
From: Luben Tuikov @ 2003-06-06 23:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James Bottomley, Mike Anderson, SCSI Mailing List, Alan Stern

Christoph Hellwig wrote:
> On Fri, Jun 06, 2003 at 11:28:22AM -0400, Luben Tuikov wrote:
> 
>>James was talking about ``device server not ready'' aliased to
>>online -- i.e. if online is 0 then only INQUIRY or TUR should
>>be sent from SCSI Core (unless SCSI Core decides to _know_ about
>>the different ULP).  The semantics on this are somewhat touched
>>in the INQUIRY description in SPC-3.
> 
> 
> Okay, I was thinking of !online as meaning the device is physically no
> longer present on the bus.
> 
> 
>>(This means that the device is on the SAN, but not ready to
>>process commands.)
>>
>>You're talking about hardware removal -- more specifically
>>logical removal of a host
> 
> 
> host or device.  Think about sbp2/ieee1394 or FC.

Did you mean to write something more?

-- 
Luben





* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 20:51             ` Christoph Hellwig
@ 2003-06-06 23:27               ` Luben Tuikov
  2003-06-06 23:43                 ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Luben Tuikov @ 2003-06-06 23:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James Bottomley, Mike Anderson, SCSI Mailing List, Alan Stern

Christoph Hellwig wrote:
> On Fri, Jun 06, 2003 at 11:52:56AM -0400, Luben Tuikov wrote:
> 
>>>Thus, we allow any special command because we assume it's part of error
>>>handling (or post error clean up).
>>
>>So SCSI Core _has_ decided to know about ULP (block, tape, optical).
>>
>>Unless we (SCSI Core) generate those, it's a pickle to decide
>>which are ``special'' enough commands.  I think that we'll see
>>more user space drivers controlling devices via sg sending
>>commands to the device for exactly those kinds of problems...
>>Else the burden on SCSI Core will/might be too great.
> 
> 
> I think James means REQ_SPECIAL request.  But some of them aren't
> special enough so we might need another flag for those.  But
> having knowledge about the upper drivers in the core sounds like
> a really bad idea.

*If* eh is to be done in SCSI Core you just cannot avoid it --
this is *the whole point* which James had in mind to allow
to send cmnds if !online. (see prev msgs on this thread)

(More things I want to mention here on this, but 1. No time
(I'm late for a meeting) and 2. I see no point -- you'll manage
just fine.)

-- 
Luben





* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 23:27               ` Luben Tuikov
@ 2003-06-06 23:43                 ` James Bottomley
  2003-06-07  5:20                   ` Luben Tuikov
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2003-06-06 23:43 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Christoph Hellwig, Mike Anderson, SCSI Mailing List, Alan Stern

On Fri, 2003-06-06 at 19:27, Luben Tuikov wrote:
> *If* eh is to be done in SCSI Core you just cannot avoid it --
> this is *the whole point* which James had in mind to allow
> to send cmnds if !online. (see prev msgs on this thread)

Actually, I didn't.  Error handling has its own separate entry into
->queuecommand() (scsi_send_eh_cmnd()), so it wouldn't be affected by my
changes.

My purpose is quite simple: Any normal block actions (read/write from
mounted fs, or CD burning via SG_IO) would be errored.  Any special
commands (from ioctls, direct commands or stack generated) would not be
affected.

i.e.

REQ_CMD	errors
REQ_BLOCK_PC errors
REQ_SPECIAL is allowed
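
The policy above can be sketched as a small self-contained C function.
This is only an illustration: the SK_ flag values, struct sk_device and
sk_request_allowed() are hypothetical stand-ins, not the kernel's
definitions.  Only the decision per request type follows the list above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the block-layer request flags; the values
 * are illustrative only and do not match the kernel headers. */
enum {
	SK_REQ_CMD      = 1 << 0,	/* normal filesystem read/write */
	SK_REQ_BLOCK_PC = 1 << 1,	/* packet command, e.g. via SG_IO */
	SK_REQ_SPECIAL  = 1 << 2,	/* ioctl, direct or stack generated */
};

/* Minimal model of the one device field the policy looks at. */
struct sk_device {
	bool online;
};

/* Return true if a request with these flags may be passed on. */
static bool sk_request_allowed(const struct sk_device *sdev,
			       unsigned int flags)
{
	if (sdev->online)
		return true;	/* online: everything is allowed */
	if (flags & SK_REQ_SPECIAL)
		return true;	/* offline: specials still pass */
	return false;		/* offline: REQ_CMD, REQ_BLOCK_PC error */
}
```

Note that error handling does not go through this check at all, since
it has its own entry into ->queuecommand().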

James




* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 23:43                 ` James Bottomley
@ 2003-06-07  5:20                   ` Luben Tuikov
  0 siblings, 0 replies; 23+ messages in thread
From: Luben Tuikov @ 2003-06-07  5:20 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, Mike Anderson, SCSI Mailing List, Alan Stern

James Bottomley wrote:
> On Fri, 2003-06-06 at 19:27, Luben Tuikov wrote:
> 
>>*If* eh is to be done in SCSI Core you just cannot avoid it --
>>this is *the whole point* which James had in mind to allow
>>to send cmnds if !online. (see prev msgs on this thread)
> 
> 
> Actually, I didn't.  Error handling has its own separate entry into
> ->queuecommand() (scsi_send_eh_cmnd()), so it wouldn't be affected by my
> changes.
> 
> My purpose is quite simple: Any normal block actions (read/write from
> mounted fs, or CD burning via SG_IO) would be errored.  Any special
> commands (from ioctls, direct commands or stack generated) would not be
> affected.
> 
> i.e.
> 
> REQ_CMD	errors
> REQ_BLOCK_PC errors
> REQ_SPECIAL is allowed

So you're saying: online <--> all allowed,
!online <--> only REQ_SPECIAL allowed.

I doubt this will work.  (But we can try it - hey, what the heck!)

The reason is that it adds another layer of
implementation on the upper layers which is
non-existent.

ULDDs do *NOT* have eh capabilities, which is where your
"ejecting the cartridge" and "unlocking the door"
commands fall (ULDD eh is non-existent in Linux).

BTW, above I was talking about user space eh of SCSI Devices,
or more appropriately upper layer device drivers (ULDD, sd, st, etc)
error handling -- i.e. of their own type of devices.
Now if this kind of eh is to be done/generated in SCSI Core
then you cannot avoid it.

(I wasn't talking of the *transport* eh.)

BTW, on another note, it is, and has always been, my desire that SCSI
Core should not know anything about upper level device drivers.
I'd also like to see SCSI Core shrink, rather than get bigger.

-- 
Luben


* Re: [PATCH] make the SCSI mid-layer obey the device online flag
  2003-06-06 20:23         ` Mike Anderson
  2003-06-06 20:52           ` Christoph Hellwig
@ 2003-06-10  0:00           ` Mike Anderson
  1 sibling, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2003-06-10  0:00 UTC (permalink / raw)
  To: Luben Tuikov, Christoph Hellwig, James Bottomley,
	SCSI Mailing List, Alan Stern

Mike Anderson [andmike@us.ibm.com] wrote:
> I am currently updating my previously posted SCSI Mid refcounting text
> to match recent updates by Christoph. I am also adding current bit fields
> in scsi_device and Scsi_Host used for state.
> 

I updated the previously posted refcounting document. I added the
scsi_host and scsi_device bitfields "related" to state. I also added some
thoughts on host and device states (really status bits, as there is no
indication of valid state transitions in this document).

-andmike
--
Michael Anderson
andmike@us.ibm.com

<ref counting rules>

1.) Reference counting for scsi_host
	Private = Private to SCSI mid
	Public = SCSI mid API

	(Public) scsi_host_alloc:
		(A) kmalloc Scsi_Host + xtr_bytes.
		(B) Call device_initialize, class_device_initialize
			refcount is 1 on host_gendev.
			refcount is 1 on class_dev.
	(Public) scsi_register:
		(A) Call scsi_host_alloc
		(B) Add Scsi_Host to legacy_hosts.

	(Public) scsi_host_get:
		(A) Call get_device on host_gendev
			refcount +1 on host_gendev
		(B) Call class_device_get on class_dev
			refcount +1 on class_dev
	(Public) scsi_host_put:
		(A) Call put_device on host_gendev
			refcount -1 on host_gendev
		(B) Call class_device_put on class_dev
			refcount -1 on class_dev
	(Public) scsi_add_host:
		(A) Call scsi_sysfs_add_host
			(1) Call device_add
				refcount +1 on parent struct device
				host_gendev now visible in sysfs tree.
			(2) Call class_device_add
				refcount +1 on scsi_host class
				refcount +1 on parent struct device
				class_dev now visible in sysfs tree
		(C) Call scsi_proc_host_add
			Scsi_Host now visible in proc.
		(D) Call scsi_scan_host
			refcount +1 on host_gendev for each scsi_device
			discovered

	(Public) scsi_remove_host:
		(A) "scsi_offline_host" NOT implemented.
		(B) Call scsi_proc_host_rm
		(C) Call scsi_forget_host
			refcount -1 on host_gendev for each scsi_device
			unregistered
		(D) Call scsi_sysfs_remove_host
			(1) Call class_device_del
			(2) Call device_del

	(Public) scsi_unregister:
		(A) Call scsi_host_put

	(Private) scsi_free_shost:
		(A) Kill error recovery thread.
		(B) Call scsi_destroy_command_freelist
		(C) kfree(shost)

	(Private) scsi_host_release:
		(A) Call scsi_free_shost

	(Private) init_this_scsi_driver: (Legacy only)
		(A) Call template detect
			(1) Call scsi_register for each instance
		(B) Call scsi_add_host for each instance
	
	(Private) exit_this_scsi_driver: (Legacy only)
		(A) Call scsi_remove_host for each instance
		(B) Call template release for each instance
			(1) Call scsi_unregister for Scsi_Host

2.) Reference counting for scsi_device

	(Private) scsi_add_lun:
		(A) Call scsi_device_register
		refcount 1 on sdev_driverfs_dev
		refcount +1 on parent struct device (i.e host_gendev)

	(Public) scsi_add_device:
		(A) Call scsi_probe_and_add_lun

	(Public) scsi_remove_device:
		(C) Call scsi_device_unregister
		refcount -1 on sdev_driverfs_dev
		refcount -1 on parent struct device (i.e host_gendev)

	(Public) scsi_device_get:
		(A) try_module_get
		(B) access_count +1

	(Public) scsi_device_put:
		(A) access_count -1
		(B) module_put on host module

</ref counting rules>
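
The scsi_device_get / scsi_device_put pairing in section 2 can be
modelled in a few lines of C.  This is a minimal sketch, not the mid-layer
code: struct sk_module, struct sk_sdev and all sk_ functions are
hypothetical, and the module "live" flag is only an assumption standing in
for whatever makes try_module_get fail.

```c
#include <stdbool.h>

/* Hypothetical model of the host driver module. */
struct sk_module {
	bool live;	/* false once the module is going away */
	int refs;	/* module reference count */
};

/* Hypothetical model of the device fields involved. */
struct sk_sdev {
	struct sk_module *owner;
	int access_count;
};

/* Models try_module_get: fails if the module is going away. */
static bool sk_try_module_get(struct sk_module *m)
{
	if (!m->live)
		return false;
	m->refs++;
	return true;
}

/* Models module_put. */
static void sk_module_put(struct sk_module *m)
{
	m->refs--;
}

/* (A) try_module_get, (B) access_count +1.  Returns 0 on success. */
static int sk_device_get(struct sk_sdev *sdev)
{
	if (!sk_try_module_get(sdev->owner))
		return -1;
	sdev->access_count++;
	return 0;
}

/* (A) access_count -1, (B) module_put on host module. */
static void sk_device_put(struct sk_sdev *sdev)
{
	sdev->access_count--;
	sk_module_put(sdev->owner);
}
```

The put side undoes the two steps in reverse order, so a failed get leaves
both counters untouched and needs no matching put.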

<device and host states>

3.) struct scsi_device states

	(A) Bitfields
		- online (device responding / connected. Currently
		  overloaded)
		- access_count (Number of openers, ?other get/put callers?)

		- device_busy (if non-zero IO in-flight)
		- device_blocked (no processing)

		- locked (SCSI_REMOVAL_PREVENT active)
		- changed (possible media change)

		- was_reset (device reset or bus reset)
		- expecting_cc_ua (redundant was_reset bit field)

	(B) States (?? Needs review ??)
		(1) Init
		(2) Ready (PQ of 000b, TUR Good)
		(3) Not_Ready (sense key not ready)
		(4) Lun_Not_Connected (PQ of 001b)
		(5) No_Response (DID_NO_CONNECT) (Surprise removal here?)

		(6) Blocked (QUEUE_FULL or BUSY status)
		(7) Timed_Out (A command timed out on this lun)
		(8) Shutdown (Device going away)

4.) struct Scsi_Host states

	(A) Bitfields
		- host_blocked (no processing)
		- host_self_blocked ("host requested no processing")
		- host_busy (if non-zero IO in-flight)

		- host_failed (if non-zero IO failure(s))
		- in_recovery (error recovery scheduled or running)

		- resetting (delay calling queuecommand due to reset)
		- last_reset (used in conjunction with resetting).

	(B) States (?? Needs review ??)
		(1) Init
		(2) Ready (registered)
		(3) In_Recovery (error recovery thread scheduled to run)
		(4) Blocked (queuecommand returned non-zero)
		(5) Shutdown (Host going away)
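
To illustrate how the host bitfields in (A) might relate to the states in
(B), here is a rough sketch in C.  The mapping and its priority order are
assumptions made for illustration, not something the mid-layer defines,
and struct sk_host, enum sk_host_state and sk_host_state_of() are
hypothetical names.  Init and Shutdown cannot be derived from the listed
bits at all, which is part of why these are really status bits.

```c
/* Hypothetical subset of the Scsi_Host status bits listed in (A). */
struct sk_host {
	unsigned host_blocked:1;
	unsigned host_self_blocked:1;
	unsigned in_recovery:1;
};

/* The states from (B) that can be derived from those bits. */
enum sk_host_state {
	SK_HOST_READY,
	SK_HOST_IN_RECOVERY,
	SK_HOST_BLOCKED,
};

/* Assumed priority: recovery first, then either blocked bit. */
static enum sk_host_state sk_host_state_of(const struct sk_host *h)
{
	if (h->in_recovery)
		return SK_HOST_IN_RECOVERY;
	if (h->host_blocked || h->host_self_blocked)
		return SK_HOST_BLOCKED;
	return SK_HOST_READY;
}
```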



end of thread, other threads:[~2003-06-09 23:45 UTC | newest]

Thread overview: 23+ messages
2003-06-04 16:01 [PATCH] make the SCSI mid-layer obey the device online flag James Bottomley
2003-06-04 16:51 ` Mike Anderson
2003-06-04 19:14   ` James Bottomley
2003-06-05  0:34     ` Patrick Mansfield
2003-06-05 12:59       ` James Bottomley
2003-06-05 13:41       ` Alan Stern
2003-06-06  6:36     ` Christoph Hellwig
2003-06-06 15:19       ` James Bottomley
2003-06-06 15:51         ` Oliver Neukum
2003-06-06 16:02         ` Luben Tuikov
2003-06-06 15:28       ` Luben Tuikov
2003-06-06 15:39         ` James Bottomley
2003-06-06 15:52           ` Luben Tuikov
2003-06-06 16:04             ` James Bottomley
2003-06-06 20:51             ` Christoph Hellwig
2003-06-06 23:27               ` Luben Tuikov
2003-06-06 23:43                 ` James Bottomley
2003-06-07  5:20                   ` Luben Tuikov
2003-06-06 20:23         ` Mike Anderson
2003-06-06 20:52           ` Christoph Hellwig
2003-06-10  0:00           ` Mike Anderson
2003-06-06 20:49         ` Christoph Hellwig
2003-06-06 23:21           ` Luben Tuikov
