read capacity 16

public inbox for linux-scsi@vger.kernel.org

* read capacity 16
@ 2004-12-08 21:07 Frank Borich
  0 siblings, 0 replies; 22+ messages in thread
From: Frank Borich @ 2004-12-08 21:07 UTC (permalink / raw)
  To: linux-scsi

Please forgive my ignorance here.  

I am unable to get the kernel (2.4.21) to automatically send a READ
CAPACITY 16 to a LUN that requires more than 4 bytes to specify its last
LBA.  If I use the sg driver to send a READ CAPACITY 16 straight away, it
works fine.  I am using the QLogic device driver 7.00.03 (which I believe
supports 16-byte commands).

When I do the bus scan (using modprobe), a READ CAPACITY 10 is sent and
FFFFFFFF is returned in the LBA field.  This should signal the Linux kernel
to send the larger READ CAPACITY 16 to get the correct size, but it does
not.  Instead the LUN is seen as 0 MB in size.
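
A minimal sketch of the check described above, assuming the standard 8-byte READ CAPACITY 10 response (a 4-byte big-endian last LBA followed by a 4-byte block length). The function names are hypothetical and do not come from any kernel. The last helper illustrates one plausible reason for the 0 MB report (an assumption; the thread does not confirm it): a sector count computed as last LBA + 1 wraps to zero in 32-bit arithmetic.

```python
import struct

# Sentinel named in the message above: RC10 cannot express the real last LBA.
RC10_OVERFLOW = 0xFFFFFFFF


def parse_rc10(data: bytes):
    """Decode an 8-byte READ CAPACITY 10 response into (last_lba, block_len)."""
    if len(data) < 8:
        raise ValueError("READ CAPACITY 10 response must be at least 8 bytes")
    last_lba, block_len = struct.unpack(">II", data[:8])
    return last_lba, block_len


def needs_rc16(last_lba: int) -> bool:
    """True when the RC10 result signals that READ CAPACITY 16 is required."""
    return last_lba == RC10_OVERFLOW


def sector_count_32bit(last_lba: int) -> int:
    """Sector count as a 32-bit quantity (hypothetical illustration of wrap)."""
    return (last_lba + 1) & 0xFFFFFFFF


# Example response: last LBA FFFFFFFF, 512-byte blocks.
response = bytes.fromhex("ffffffff00000200")
last_lba, block_len = parse_rc10(response)
```

With this response, `needs_rc16(last_lba)` is true, and a host that ignores the sentinel and computes the size in 32 bits ends up with zero sectors.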

I can't seem to find any good documentation on this.  Can I patch this
kernel to do this?

Regards,

Frank

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: read capacity 16
@ 2004-12-09 14:33 Frank Borich
  2004-12-09 15:02 ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Frank Borich @ 2004-12-09 14:33 UTC (permalink / raw)
  To: Frank Borich, linux-scsi

This works as expected using SGI's 2.6.5-rc3-mm4 on ia64 and on Windows
Server 2003 SP1 (still a beta).
Does anyone know how to get kernels newer than 2.4.19 to work?

I tried setting max_cmd_len to 16 in hosts.c, but it doesn't change
anything.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: read capacity 16
  2004-12-09 14:33 read capacity 16 Frank Borich
@ 2004-12-09 15:02 ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2004-12-09 15:02 UTC (permalink / raw)
  To: Frank Borich; +Cc: linux-scsi

On Thu, Dec 09, 2004 at 06:33:15AM -0800, Frank Borich wrote:
> This works as expected using SGI's 2.6.5-rc3-mm4 on ia64 and on Windows
> Server 2003 SP1 (still a beta).
> Does anyone know how to get kernels newer than 2.4.19 to work?
> 
> I tried setting max_cmd_len to 16 in hosts.c, but it doesn't change
> anything.

The 2.4 scsi layer doesn't issue 16 byte commands at all. 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* READ CAPACITY 16
@ 2008-12-17 16:42 Matthew Wilcox
  2008-12-17 17:50 ` Grant Grundler
  2008-12-18 20:41 ` Douglas Gilbert
  0 siblings, 2 replies; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-17 16:42 UTC (permalink / raw)
  To: linux-scsi

I'm looking at the UNMAP support again, and we now have a bit that tells
us whether the device supports UNMAP or not, it's called TPE (Thin
Provisioning Enabled) and is found in byte 14 of the result from READ
CAPACITY 16.  The problem is that we do our best to avoid calling READ
CAPACITY 16.

Presumably, there are many devices which do not support RC16.  That
isn't a problem, we can try RC16 and fall back to RC10 if the device
returns an error.  The question is what to do about devices that either
hang or take a long time to respond to an RC16 command.

This kind of problem isn't going to be limited to UNMAP.  DIF/DIX
already has to use RC16 to get the protection type.  Once 4k sector size
drives become common, we're going to want the "LOGICAL BLOCKS PER
PHYSICAL BLOCK EXPONENT" and the "LOWEST ALIGNED LOGICAL BLOCK ADDRESS"
information that RC16 returns and RC10 doesn't.  There's another 16
bytes and a couple of reserved 4-bit fields to be assigned too, and I
can imagine them getting used for new features in the future.
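
To make the fields listed above concrete, here is a hedged sketch of decoding the first 16 bytes of a READ CAPACITY 16 response. The message itself only confirms that TPE lives in byte 14; the exact bit positions used below (TPE as bit 7 of byte 14, the protection bits in byte 12, the exponent in the low nibble of byte 13, the lowest aligned LBA in the low 14 bits of bytes 14-15) are assumptions based on the layout as later published in SBC-3, and the function name is hypothetical.

```python
import struct


def parse_rc16(data: bytes) -> dict:
    """Decode the first 16 bytes of a READ CAPACITY 16 response.

    Bit positions are assumptions (see the text above), not taken from
    the thread itself.
    """
    if len(data) < 16:
        raise ValueError("need at least 16 bytes of READ CAPACITY 16 data")
    # Bytes 0-7: last LBA (big-endian); bytes 8-11: logical block length.
    last_lba, block_len = struct.unpack(">QI", data[:12])
    return {
        "last_lba": last_lba,
        "block_len": block_len,
        "prot_en": bool(data[12] & 0x01),        # protection enabled
        "p_type": (data[12] >> 1) & 0x07,        # protection type
        "lb_per_pb_exp": data[13] & 0x0F,        # logical blocks per physical block exponent
        "tpe": bool(data[14] & 0x80),            # thin provisioning enabled
        "lowest_aligned_lba": ((data[14] & 0x3F) << 8) | data[15],
    }


# Example: last LBA beyond 32 bits, 512-byte logical blocks,
# 8 logical blocks per physical block (exponent 3), TPE set.
sample = struct.pack(">QIBBBB", 0x1FFFFFFFF, 512, 0x00, 0x03, 0x80, 0x00) + bytes(16)
info = parse_rc16(sample)
physical_block_size = info["block_len"] << info["lb_per_pb_exp"]
```

None of these fields can be recovered from the 8-byte RC10 response, which is why the question of when it is safe to issue RC16 matters.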

So what strategy should we adopt for trying harder to issue RC16?

Algorithm A (a perfect world):

Issue RC16
 -> If it fails, issue RC10
 -> If it times out, reset the device, issue RC10

Algorithm B:

Issue RC10
Issue RC16
 -> If it succeeds, use its results in preference to those from RC10
 -> If it fails, carry on with the results from RC10
 -> If it times out, reset the device, carry on with the results from RC10

Algorithm C:

As algorithm B, except:
 -> If it succeeds, use the RC10 results for LBA unless the LBA is 0xffffffff
    but use the RC16 results for TPE, PROT, etc.

Algorithm D:

Go back to T10 and say "Excuse me, kind sirs, would you mind adding an
INQUIRY bit to indicate that the device supports UNMAP?  I know you've
added a bit to RC16, but there's this nasty real world out there where
devices are apt to blow up if you send them an RC16 when they're not
expecting it."

Critiques should be expressed in the form of new algorithms.
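
For comparison, Algorithms A, B and C above can be sketched side by side. This is only an illustration against a made-up device model: `FakeDevice`, `CommandFailed`, `CommandTimedOut` and the dictionary results are hypothetical names introduced here, not kernel interfaces, and a real timeout would of course cost wall-clock time that this model does not show.

```python
class CommandFailed(Exception):
    """The device rejected the command with an error."""


class CommandTimedOut(Exception):
    """The device never answered the command."""


class FakeDevice:
    """Hypothetical device whose reaction to RC16 is configurable."""

    def __init__(self, rc16_behaviour="ok", last_lba=1000):
        self.rc16_behaviour = rc16_behaviour
        self.last_lba = last_lba
        self.resets = 0

    def rc10(self):
        # RC10 saturates at 0xffffffff when the real value does not fit.
        return {"last_lba": min(self.last_lba, 0xFFFFFFFF), "block_len": 512}

    def rc16(self):
        if self.rc16_behaviour == "fail":
            raise CommandFailed()
        if self.rc16_behaviour == "timeout":
            raise CommandTimedOut()
        return {"last_lba": self.last_lba, "block_len": 512, "tpe": True}

    def reset(self):
        self.resets += 1


def algorithm_a(dev):
    """RC16 first; fall back to RC10 on failure or (after a reset) timeout."""
    try:
        return dev.rc16()
    except CommandFailed:
        return dev.rc10()
    except CommandTimedOut:
        dev.reset()
        return dev.rc10()


def algorithm_b(dev):
    """RC10 first, then RC16; prefer the RC16 results when it succeeds."""
    rc10 = dev.rc10()
    try:
        return dev.rc16()
    except CommandFailed:
        return rc10
    except CommandTimedOut:
        dev.reset()
        return rc10


def algorithm_c(dev):
    """As B, but keep the RC10 LBA unless it is the 0xffffffff sentinel."""
    rc10 = dev.rc10()
    try:
        rc16 = dev.rc16()
    except CommandFailed:
        return rc10
    except CommandTimedOut:
        dev.reset()
        return rc10
    if rc10["last_lba"] != 0xFFFFFFFF:
        rc16 = dict(rc16, last_lba=rc10["last_lba"])
    return rc16
```

In this model all three give the same answer for a well-behaved device; they differ only in how many commands are sent and in which value of the LBA is trusted.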

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: READ CAPACITY 16
@ 2008-12-17 17:20 bburk
  2008-12-17 17:25 ` Matthew Wilcox
  0 siblings, 1 reply; 22+ messages in thread
From: bburk @ 2008-12-17 17:20 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-scsi

Algorithm D is rather pointless.  Even if T10 added an INQUIRY bit,
there's no guarantee that all devices will support it, especially older
devices made before the change.  So even if it got put in, you STILL can't
rely on it and will have to fall back to other methods to get the
information that you're looking for.

You may as well get T10 to require that all devices support RC16 without
locking up, for as much good as it would do.  No device should be locking
up, so the ones that aren't handling it properly are out of spec anyway.

Brent Burkholder
Extreme Protocol Solutions

-------- Original Message --------
Subject: READ CAPACITY 16
From: Matthew Wilcox <matthew@wil.cx>
Date: Wed, December 17, 2008 11:42 am
To: linux-scsi@vger.kernel.org

Go back to T10 and say "Excuse me, kind sirs, would you mind adding an
INQUIRY bit to indicate that the device supports UNMAP? I know you've
added a bit to RC16, but there's this nasty real world out there where
devices are apt to blow up if you send them an RC16 when they're not
expecting it."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 17:20 bburk
@ 2008-12-17 17:25 ` Matthew Wilcox
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-17 17:25 UTC (permalink / raw)
  To: bburk; +Cc: linux-scsi

On Wed, Dec 17, 2008 at 10:20:48AM -0700, bburk@extremeprotocol.com wrote:
> Algorithm D is rather pointless.  Even if T10 added an Inquiry bit,
> there's no guarantee that all devices will support it, especially older
> devices made before the change so even if it got put in you STILL can't
> rely on it and will have to fall back to other methods to get the
> information that you're looking for.

The UNMAP command is still in the draft stages.  Now is exactly the
right time to push for the TPE bit to be moved from RC16 to INQUIRY.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 16:42 READ CAPACITY 16 Matthew Wilcox
@ 2008-12-17 17:50 ` Grant Grundler
  2008-12-17 18:06   ` Matthew Wilcox
  2008-12-18 20:41 ` Douglas Gilbert
  1 sibling, 1 reply; 22+ messages in thread
From: Grant Grundler @ 2008-12-17 17:50 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-scsi

On Wed, Dec 17, 2008 at 8:42 AM, Matthew Wilcox <matthew@wil.cx> wrote:
>
> I'm looking at the UNMAP support again, and we now have a bit that tells
> us whether the device supports UNMAP or not, it's called TPE (Thin
> Provisioning Enabled) and is found in byte 14 of the result from READ
> CAPACITY 16.  The problem is that we do our best to avoid calling READ
> CAPACITY 16.
>
> Presumably, there are many devices which do not support RC16.  That
> isn't a problem, we can try RC16 and fall back to RC10 if the device
> returns an error.  The question is what to do about devices that either
> hang or take a long time to respond to an RC16 command.
>
> This kind of problem isn't going to be limited to UNMAP.  DIF/DIX
> already has to use RC16 to get the protection type.  Once 4k sector size
> drives become common, we're going to want the "LOGICAL BLOCKS PER
> PHYSICAL BLOCK EXPONENT" and the "LOWEST ALIGNED LOGICAL BLOCK ADDRESS"
> information that RC16 returns and RC10 doesn't.  There's another 16
> bytes and a couple of reserved 4-bit fields to be assigned too, and I
> can imagine them getting used for new features in the future.
>
> So what strategy should we adopt for trying harder to issue RC16?
>
> Algorithm A (a perfect world):
>
> Issue RC16
>  -> If it fails, issue RC10
>  -> If it times out, reset the device, issue RC10
>
> Algorithm B:
>
> Issue RC10
> Issue RC16
>  -> If it succeeds, use its results in preference to those from RC10
>  -> If it fails, carry on with the results from RC10
>  -> If it times out, reset the device, carry on with the results from RC10

I fail to see an effective difference between Algo A and B.
The question really is one you already asked:
> ...The question is what to do about devices that either
> hang or take a long time to respond to an RC16 command.

A few ideas:
1) maintain a blacklist

2) anything in RC10 or IDENTIFY that would clue us about RC16 functionality?
    If so, then something like B or C would make sense.

3) How long does Read Capacity16 normally take?  e.g. at boot time with drive
   that isn't spun up yet or equivalent from RAID device.
   If it's not that long (e.g. < 1 sec or so) then just use a shorter
   timeout in general?
   With parallel scanning, it should be tolerably painful.

hth,
grant

> Algorithm C:
>
> As algorithm B, except:
>  -> If it succeeds, use the RC10 results for LBA unless the LBA is 0xffffffff
>    but use the RC16 results for TPE, PROT, etc.
>
> Algorithm D:
>
> Go back to T10 and say "Excuse me, kind sirs, would you mind adding an
> INQUIRY bit to indicate that the device supports UNMAP?  I know you've
> added a bit to RC16, but there's this nasty real world out there where
> devices are apt to blow up if you send them an RC16 when they're not
> expecting it."
>
>
> Critiques should be expressed in the form of new algorithms.
>
> --
> Matthew Wilcox                          Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours.  We can't possibly take such
> a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 17:50 ` Grant Grundler
@ 2008-12-17 18:06   ` Matthew Wilcox
  2008-12-17 18:57     ` Grant Grundler
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-17 18:06 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-scsi

On Wed, Dec 17, 2008 at 09:50:52AM -0800, Grant Grundler wrote:
> > Algorithm A (a perfect world):
> >
> > Issue RC16
> >  -> If it fails, issue RC10
> >  -> If it times out, reset the device, issue RC10
> >
> > Algorithm B:
> >
> > Issue RC10
> > Issue RC16
> >  -> If it succeeds, use its results in preference to those from RC10
> >  -> If it fails, carry on with the results from RC10
> >  -> If it times out, reset the device, carry on with the results from RC10
> 
> I fail to see an effective difference between Algo A and B.

Whether to issue an RC10 before issuing an RC16 or not.  It matches what
we currently do better (we currently issue an RC10 and then issue an
RC16 if RC10 reports we have 0xffffffff LBAs).

> The question really is one you already asked:
> > ...The question is what to do about devices that either
> > hang or take a long time to respond to an RC16 command.
> 
> A few ideas:
> 1) maintain a blacklist

Which is obviously what we're trying to avoid doing.

> 2) anything in RC10 or IDENTIFY that would clue us about RC16 functionality?
>     If so, then something like B or C would make sense.

RC10 only returns number of LBAs and how many bytes per LBA.  I don't
see anything in the INQUIRY data (other than the protection bit, which
we already use to know that RC16 is supported).  We could maybe key off
scsi_level > SCSI_2 like scsi_device_protection() does.  This would work
for ATA SSDs because libata reports SCSI ANSI revision 05, but it won't
work for USB devices because they get mangled down to SCSI_2, no matter
what they support.
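
The gate being floated here can be sketched as follows. The numeric values are assumptions recalled from the kernel's scsi_level constants (where SPC-3, i.e. ANSI revision 05, is stored as 6), and `may_try_rc16_first` is a hypothetical name rather than an existing kernel function.

```python
# Assumed to mirror the kernel's scsi_level constants.
SCSI_2 = 3
SCSI_3 = 4
SCSI_SPC_2 = 5
SCSI_SPC_3 = 6


def may_try_rc16_first(scsi_level: int, protect: bool) -> bool:
    """Hypothetical gate: trust RC16 if the INQUIRY protection bit is set
    or the device claims a level newer than SCSI_2."""
    return protect or scsi_level > SCSI_2


# An ATA SSD behind libata (reports SPC-3) passes the gate; a USB device
# whose level has been forced down to SCSI_2 does not.
ata_ssd_ok = may_try_rc16_first(SCSI_SPC_3, protect=False)
usb_ok = may_try_rc16_first(SCSI_2, protect=False)
```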

> 3) How long does Read Capacity16 normally take?  e.g. at boot time with drive
>    that isn't spun up yet or equivalent from RAID device.
>    If it's not that long (e.g < 1sec or so) then just use a shorter
> timeout in general?
>    With parallel scanning, it should be tolerably painful.

I don't know how long it'll take.  I was hoping people with experience
in this matter would chime in.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 18:06   ` Matthew Wilcox
@ 2008-12-17 18:57     ` Grant Grundler
  2008-12-17 19:04     ` James Bottomley
  2008-12-18  9:05     ` Boaz Harrosh
  2 siblings, 0 replies; 22+ messages in thread
From: Grant Grundler @ 2008-12-17 18:57 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-scsi

On Wed, Dec 17, 2008 at 10:06 AM, Matthew Wilcox <matthew@wil.cx> wrote:
> On Wed, Dec 17, 2008 at 09:50:52AM -0800, Grant Grundler wrote:
>> > Algorithm A (a perfect world):
>> >
>> > Issue RC16
>> >  -> If it fails, issue RC10
>> >  -> If it times out, reset the device, issue RC10
>> >
>> > Algorithm B:
>> >
>> > Issue RC10
>> > Issue RC16
>> >  -> If it succeeds, use its results in preference to those from RC10
>> >  -> If it fails, carry on with the results from RC10
>> >  -> If it times out, reset the device, carry on with the results from RC10
>>
>> I fail to see an effective difference between Algo A and B.
>
> Whether to issue an RC10 before issuing an RC16 or not.  It matches what
> we currently do better (we currently issue an RC10 and then issue an
> RC16 if RC10 reports we have 0xffffffff LBAs).

Sorry, I was thinking of the case of RC16 timing out.
We end up timing out on the RC16 for the given device in both algorithms.
Current behavior is shaped by device behaviors and the need to get a
valid LBA count.
New behavior will be shaped by the same thing PLUS the additional fields
you pointed out.

>
>> The question really is one you already asked:
>> > ...The question is what to do about devices that either
>> > hang or take a long time to respond to an RC16 command.
>>
>> A few ideas:
>> 1) maintain a blacklist
>
> Which is obviously what we're trying to avoid doing.

Yes.  But it won't be the first blacklist the Linux kernel has.
Similarly, adding a kernel command line flag to modify behavior also sucks.

>> 2) anything in RC10 or IDENTIFY that would clue us about RC16 functionality?
>>     If so, then something like B or C would make sense.
>
> RC10 only returns number of LBAs and how many bytes per LBA.  I don't
> see anything in the INQUIRY data (other than the protection bit, which
> we already use to know that RC16 is supported).  We could maybe key off
> scsi_level > SCSI_2 like scsi_device_protection() does.  This would work
> for ATA SSDs because libata reports SCSI ANSI revision 05, but it won't
> work for USB devices because they get mangled down to SCSI_2, no matter
> what they support.

Ugh. So fixing this for USB would require sorting out why USB devices
get "downgraded".

>
>> 3) How long does Read Capacity16 normally take?  e.g. at boot time with drive
>>    that isn't spun up yet or equivalent from RAID device.
>>    If it's not that long (e.g < 1sec or so) then just use a shorter
>> timeout in general?
>>    With parallel scanning, it should be tolerably painful.
>
> I don't know how long it'll take.  I was hoping people with experience
> in this matter would chime in.

Since this is forward looking, I'd think someone would need to ping HD
and other storage vendors. Are any subscribed to linux-scsi?

thanks,
grant

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 18:06   ` Matthew Wilcox
  2008-12-17 18:57     ` Grant Grundler
@ 2008-12-17 19:04     ` James Bottomley
  2008-12-17 19:11       ` Matthew Wilcox
  2008-12-18  9:05     ` Boaz Harrosh
  2 siblings, 1 reply; 22+ messages in thread
From: James Bottomley @ 2008-12-17 19:04 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Grant Grundler, linux-scsi

On Wed, 2008-12-17 at 11:06 -0700, Matthew Wilcox wrote:
> On Wed, Dec 17, 2008 at 09:50:52AM -0800, Grant Grundler wrote:
> > > Algorithm A (a perfect world):
> > >
> > > Issue RC16
> > >  -> If it fails, issue RC10
> > >  -> If it times out, reset the device, issue RC10
> > >
> > > Algorithm B:
> > >
> > > Issue RC10
> > > Issue RC16
> > >  -> If it succeeds, use its results in preference to those from RC10
> > >  -> If it fails, carry on with the results from RC10
> > >  -> If it times out, reset the device, carry on with the results from RC10
> > 
> > I fail to see an effective difference between Algo A and B.
> 
> Whether to issue an RC10 before issuing an RC16 or not.  It matches what
> we currently do better (we currently issue an RC10 and then issue an
> RC16 if RC10 reports we have 0xffffffff LBAs).
> 
> > The question really is one you already asked:
> > > ...The question is what to do about devices that either
> > > hang or take a long time to respond to an RC16 command.
> > 
> > A few ideas:
> > 1) maintain a blacklist
> 
> Which is obviously what we're trying to avoid doing.

I don't really see a way of avoiding this ... for USB devices it's
probably going to be a requirement.

> > 2) anything in RC10 or IDENTIFY that would clue us about RC16 functionality?
> >     If so, then something like B or C would make sense.
> 
> RC10 only returns number of LBAs and how many bytes per LBA.  I don't
> see anything in the INQUIRY data (other than the protection bit, which
> we already use to know that RC16 is supported).  We could maybe key off
> scsi_level > SCSI_2 like scsi_device_protection() does.  This would work
> for ATA SSDs because libata reports SCSI ANSI revision 05, but it won't
> work for USB devices because they get mangled down to SCSI_2, no matter
> what they support.

That latter piece is fixable.  We can also go with the INQUIRY version
descriptor information which I don't think USB mangles.

> > 3) How long does Read Capacity16 normally take?  e.g. at boot time with drive
> >    that isn't spun up yet or equivalent from RAID device.
> >    If it's not that long (e.g < 1sec or so) then just use a shorter
> > timeout in general?
> >    With parallel scanning, it should be tolerably painful.
> 
> I don't know how long it'll take.  I was hoping people with experience
> in this matter would chime in.

Actually, we can't afford to send READ CAPACITY(16) to failing devices;
some of them never come back.

James



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 19:04     ` James Bottomley
@ 2008-12-17 19:11       ` Matthew Wilcox
  2008-12-17 19:14         ` James Bottomley
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-17 19:11 UTC (permalink / raw)
  To: James Bottomley; +Cc: Grant Grundler, linux-scsi

On Wed, Dec 17, 2008 at 02:04:52PM -0500, James Bottomley wrote:
> Actually, we can't afford to send READ CAPACITY(16) to failing devices;
> some of them never come back.

When you say 'never come back', do you mean:

a) The drive discards the command silently
b) The drive hangs until a reset is issued
c) The drive hangs until it's power-cycled
d) The drive turns into a paperweight

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 19:11       ` Matthew Wilcox
@ 2008-12-17 19:14         ` James Bottomley
  2008-12-17 19:32           ` Matthew Wilcox
  0 siblings, 1 reply; 22+ messages in thread
From: James Bottomley @ 2008-12-17 19:14 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Grant Grundler, linux-scsi

On Wed, 2008-12-17 at 12:11 -0700, Matthew Wilcox wrote:
> On Wed, Dec 17, 2008 at 02:04:52PM -0500, James Bottomley wrote:
> > Actually, we can't afford to send READ CAPACITY(16) to failing devices;
> > some of them never come back.
> 
> When you say 'never come back', do you mean:
> 
> a) The drive discards the command silently
> b) The drive hangs until a reset is issued
> c) The drive hangs until it's power-cycled
> d) The drive turns into a paperweight

All of the above ... this is USB ... well, I don't *know* of a D
case ... but I wouldn't bet one doesn't exist.

James



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 19:14         ` James Bottomley
@ 2008-12-17 19:32           ` Matthew Wilcox
  2008-12-17 19:36             ` James Bottomley
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-17 19:32 UTC (permalink / raw)
  To: James Bottomley; +Cc: Grant Grundler, linux-scsi

On Wed, Dec 17, 2008 at 02:14:09PM -0500, James Bottomley wrote:
> On Wed, 2008-12-17 at 12:11 -0700, Matthew Wilcox wrote:
> > On Wed, Dec 17, 2008 at 02:04:52PM -0500, James Bottomley wrote:
> > > Actually, we can't afford to send READ CAPACITY(16) to failing devices;
> > > some of them never come back.
> > 
> > When you say 'never come back', do you mean:
> > 
> > a) The drive discards the command silently
> > b) The drive hangs until a reset is issued
> > c) The drive hangs until it's power-cycled
> > d) The drive turns into a paperweight
> 
> All of the above ... this is USB ... well, I don't *know* of a D
> case ... but I wouldn't bet one doesn't exist.

The unfortunate thing is that we don't have a collection of INQUIRY
results from these devices, so we can't say whether checking for SCSI_2
would eliminate those in categories C and D.

Are you willing to take a patch that sends RC16 for devices claiming
SCSI_2, and falls back to RC10 if that doesn't work?  Or shall I try to
implement algorithm D and talk to T10?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 19:32           ` Matthew Wilcox
@ 2008-12-17 19:36             ` James Bottomley
  2008-12-17 19:49               ` Matthew Wilcox
  0 siblings, 1 reply; 22+ messages in thread
From: James Bottomley @ 2008-12-17 19:36 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Grant Grundler, linux-scsi

On Wed, 2008-12-17 at 12:32 -0700, Matthew Wilcox wrote:
> On Wed, Dec 17, 2008 at 02:14:09PM -0500, James Bottomley wrote:
> > On Wed, 2008-12-17 at 12:11 -0700, Matthew Wilcox wrote:
> > > On Wed, Dec 17, 2008 at 02:04:52PM -0500, James Bottomley wrote:
> > > > Actually, we can't afford to send READ CAPACITY(16) to failing devices;
> > > > some of them never come back.
> > > 
> > > When you say 'never come back', do you mean:
> > > 
> > > a) The drive discards the command silently
> > > b) The drive hangs until a reset is issued
> > > c) The drive hangs until it's power-cycled
> > > d) The drive turns into a paperweight
> > 
> > All of the above ... this is USB ... well, I don't *know* of a D
> > case ... but I wouldn't bet one doesn't exist.
> 
> The unfortunate thing is that we don't have a collection of INQUIRY
> results from these devices, so we can't say whether checking for SCSI_2
> would eliminate those in categories C and D.
> 
> Are you willing to take a patch that sends RC16 for devices claiming
> SCSI_2, and falls back to RC10 if that doesn't work?  Or shall I try to
> implement algorithm D and talk to T10?

Not really ... SCSI_2 is where the problems are.  SCSI 3 would be much
more acceptable.  Then you can add an inquiry passthrough to USB
mangling for devices you need to work.

James



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 19:36             ` James Bottomley
@ 2008-12-17 19:49               ` Matthew Wilcox
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-17 19:49 UTC (permalink / raw)
  To: James Bottomley; +Cc: Grant Grundler, linux-scsi

On Wed, Dec 17, 2008 at 02:36:57PM -0500, James Bottomley wrote:
> On Wed, 2008-12-17 at 12:32 -0700, Matthew Wilcox wrote:
> > Are you willing to take a patch that sends RC16 for devices claiming
> > SCSI_2, and falls back to RC10 if that doesn't work?  Or shall I try to
> > implement algorithm D and talk to T10?
> 
> Not really ... SCSI_2 is where the problems are.  SCSI 3 would be much
> more acceptable.  Then you can add an inquiry passthrough to USB
> mangling for devices you need to work.

That works for me.  Best of all, I don't care about any USB devices
today, so I won't even have to do the second part!

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-17 18:06   ` Matthew Wilcox
  2008-12-17 18:57     ` Grant Grundler
  2008-12-17 19:04     ` James Bottomley
@ 2008-12-18  9:05     ` Boaz Harrosh
  2008-12-18 14:08       ` Matthew Wilcox
  2 siblings, 1 reply; 22+ messages in thread
From: Boaz Harrosh @ 2008-12-18  9:05 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Grant Grundler, linux-scsi

Matthew Wilcox wrote:
> On Wed, Dec 17, 2008 at 09:50:52AM -0800, Grant Grundler wrote:
>>> Algorithm A (a perfect world):
>>>
>>>
>>> Algorithm B:
>>>
>>> Issue RC10
>>> Issue RC16
>>>  -> If it succeeds, use its results in preference to those from RC10
>>>  -> If it fails, carry on with the results from RC10
>>>  -> If it times out, reset the device, carry on with the results from RC10
>> I fail to see an effective difference between Algo A and B.
> 
> Whether to issue an RC10 before issuing an RC16 or not.  It matches what
> we currently do better (we currently issue an RC10 and then issue an
> RC16 if RC10 reports we have 0xffffffff LBAs).
> 

Sorry to barge in but I think this is the most practical solution and the one
to go to T10 with.

If a (new) device supports RC16 it should return LBAs==0xffffffff for RC10 even
if its capacity is smaller, to indicate that the host should issue RC16.
If LBAs!=0xffffffff and !SCSI_3 then do not risk RC16 unless the device is on
a white list or a load parameter overrides this.
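
Read as a decision rule, the proposal above might look like the sketch below. It assumes that "!SCSI_3" means a scsi_level below SCSI_3, that SCSI_3 is stored as 4 as in the kernel's constants, and that the white list and load parameter are simple booleans; `should_issue_rc16` and its parameters are hypothetical names.

```python
RC10_SENTINEL = 0xFFFFFFFF
SCSI_3 = 4  # assumed to mirror the kernel's scsi_level constant


def should_issue_rc16(rc10_last_lba: int, scsi_level: int,
                      whitelisted: bool = False,
                      forced_by_parameter: bool = False) -> bool:
    """Hypothetical rule: send RC16 only when RC10 asks for it, the device
    claims at least SCSI_3, or an explicit override applies."""
    if rc10_last_lba == RC10_SENTINEL:
        return True
    if scsi_level >= SCSI_3:
        return True
    return whitelisted or forced_by_parameter


# An old device with a small capacity is left alone unless overridden.
risky = should_issue_rc16(0x00FFFFFF, scsi_level=3)
overridden = should_issue_rc16(0x00FFFFFF, scsi_level=3, whitelisted=True)
```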

Since you are going to T10 with this, the white list should be, as you said
in the other mail, zero length.

>> The question really is one you already asked:
>>> ...The question is what to do about devices that either
>>> hang or take a long time to respond to an RC16 command.
>> A few ideas:
>> 1) maintain a blacklist
> 
> Which is obviously what we're trying to avoid doing.
> 

If you are going to T10 with this, a white list should be much shorter.

>> 2) anything in RC10 or IDENTIFY that would clue us about RC16 functionality?
>>     If so, then something like B or C would make sense.
> 
> RC10 only returns number of LBAs and how many bytes per LBA.  I don't
> see anything in the INQUIRY data (other than the protection bit, which
> we already use to know that RC16 is supported).  We could maybe key off
> scsi_level > SCSI_2 like scsi_device_protection() does.  This would work
> for ATA SSDs because libata reports SCSI ANSI revision 05, but it won't
> work for USB devices because they get mangled down to SCSI_2, no matter
> what they support.
> 
<snip>

This is certainly a bug in the standard (a draft, as you say). It must be fixed
in a backward compatible way. Practical matters aside, the standard cannot stay
as it is.

Thanks
Boaz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-18  9:05     ` Boaz Harrosh
@ 2008-12-18 14:08       ` Matthew Wilcox
  2008-12-18 14:38         ` Boaz Harrosh
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-18 14:08 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: Grant Grundler, linux-scsi

On Thu, Dec 18, 2008 at 11:05:54AM +0200, Boaz Harrosh wrote:
> Matthew Wilcox wrote:
> >>> Algorithm B:
> >>>
> >>> Issue RC10
> >>> Issue RC16
> >>>  -> If it succeeds, use its results in preference to those from RC10
> >>>  -> If it fails, carry on with the results from RC10
> >>>  -> If it times out, reset the device, carry on with the results from RC10
> >> I fail to see an effective difference between Algo A and B.
> > 
> > Whether to issue an RC10 before issuing an RC16 or not.  It matches what
> > we currently do better (we currently issue an RC10 and then issue an
> > RC16 if RC10 reports we have 0xffffffff LBAs).
> > 
> 
> Sorry to barge in but I think this is the most practical solution and the one
> to go to T10 with.
> 
> If a (new) device supports RC16 it should return LBAs==0xffffffff for RC10 even
> if its capacity is smaller, to indicate an RC16 request.

That breaks compatibility with older software that doesn't know that
RC16 exists.

> If LBAs!=0xffffffff and !SCSI_3 then do not risk RC16 unless a white list
> or load parameter.
> 
> Since you are going to T10 with this the white list should be, as you said
> in other mail, zero length.

I don't need to go to T10 for anything except Algorithm D.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: READ CAPACITY 16
  2008-12-18 14:08       ` Matthew Wilcox
@ 2008-12-18 14:38         ` Boaz Harrosh
  2008-12-18 14:49           ` Matthew Wilcox
  2008-12-18 14:52           ` James Bottomley
  0 siblings, 2 replies; 22+ messages in thread
From: Boaz Harrosh @ 2008-12-18 14:38 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Grant Grundler, linux-scsi

Matthew Wilcox wrote:
> On Thu, Dec 18, 2008 at 11:05:54AM +0200, Boaz Harrosh wrote:
>> Matthew Wilcox wrote:
>>>>> Algorithm B:
>>>>>
>>>>> Issue RC10
>>>>> Issue RC16
>>>>>  -> If it succeeds, use its results in preference to those from RC10
>>>>>  -> If it fails, carry on with the results from RC10
>>>>>  -> If it times out, reset the device, carry on with the results from RC10
>>>> I fail to see an effective difference between Algo A and B.
>>> Whether to issue an RC10 before issuing an RC16 or not.  It matches what
>>> we currently do better (we currently issue an RC10 and then issue an
>>> RC16 if RC10 reports we have 0xffffffff LBAs).
>>>
>> Sorry to barge in but I think this is the most practical solution and the one
>> to go to T10 with.
>>
>> If a (new) device supports RC16 it should return LBAs==0xffffffff for RC10 even
>> if it's capacity is smaller, to indicate an RC16 request.
> 
> That breaks compatibility with older software that doesn't know that
> RC16 exists.
> 
>> If LBAs!=0xffffffff and !SCSI_3 then do not risk RC16 unless a white list
>> or load parameter.
>>
>> Since you are going to T10 with this the white list should be, as you said
>> in other mail, zero length.
> 
> I don't need to go to T10 for anything except Algorithm D.
> 

OK, then I say D: go to T10, and meanwhile whitelist the (0) devices that
currently report !SCSI_3 but do support UNMAP. These are only USB, right?

Do your tested devices report SCSI_3? Are all devices with scsi_level > SCSI_2
supposed to support RC16?

My point is to make the standard, which is still a draft, crystal clear
in a backward-compatible way. All new, supporting devices can then be easily
identified, and the very few devices that do support the new feature but
were released prior to the finalization of the draft can be whitelisted.
In any event, don't let the standard be broken like that.

Boaz


* Re: READ CAPACITY 16
  2008-12-18 14:38         ` Boaz Harrosh
@ 2008-12-18 14:49           ` Matthew Wilcox
  2008-12-18 14:52           ` James Bottomley
  1 sibling, 0 replies; 22+ messages in thread
From: Matthew Wilcox @ 2008-12-18 14:49 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: Grant Grundler, linux-scsi

On Thu, Dec 18, 2008 at 04:38:28PM +0200, Boaz Harrosh wrote:
> OK Then I say D, go to T10, while white list the (0) devices that currently
> report !SCSI_3 but do support UNMAP. These are only USB right?

I'm not sure I've explained myself correctly.

 - There are new features from T10 (UNMAP being one of them) that are
   reported only through the RC16 command.
 - We don't currently use RC16 unless:
   - The device claims to have more than 4 billion sectors (~= 2TB with
     512-byte sectors) OR
   - The device claims to support protection information

We need a way to be able to use RC16, or we need to persuade T10 that
using RC16 is basically impossible in the real world, so they should
stop putting features in it.

James and I seem to have come to a conclusion -- that we'll try RC16
for drives which claim SCSI_3 compliance (which excludes all the current
USB devices).  It's then up to the USB people to implement a whitelist
for not mangling USB devices down to SCSI_2.
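
As a rough illustration of the two rules described above (not the actual
sd.c code), the decision can be sketched as below. The level codes are
assumed values chosen only to preserve the ordering SCSI_2 < SCSI_3 <
SCSI_SPC_2, and the function names are hypothetical.

```python
# Illustrative level codes; only their ordering matters in this sketch.
SCSI_2, SCSI_3, SCSI_SPC_2 = 3, 4, 5

# RC10 reports this value when the last LBA does not fit in 4 bytes.
RC10_MAX_LBA = 0xFFFFFFFF


def current_rule(rc10_last_lba, has_protection):
    """RC16 is issued today only in the two cases listed above."""
    return rc10_last_lba == RC10_MAX_LBA or has_protection


def proposed_rule(scsi_level, rc10_last_lba, has_protection):
    """Additionally try RC16 on any device claiming more than SCSI_2."""
    return scsi_level > SCSI_2 or current_rule(rc10_last_lba, has_protection)


# 2**32 blocks of 512 bytes is exactly 2 TiB, the RC10 ceiling.
rc10_limit_bytes = (RC10_MAX_LBA + 1) * 512
```

Under this sketch a small libata disk claiming SCSI_SPC_2 would get an
RC16, while a USB device mangled down to SCSI_2 would not unless its RC10
result overflows.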

> Your tested devices report SCSI_3? Do all devices that are scsi_level > SCSI_2
> suppose to support RC16?

My tested devices are all libata which claims SCSI_SPC_2 compliance.
Obviously, this is faked.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: READ CAPACITY 16
  2008-12-18 14:38         ` Boaz Harrosh
  2008-12-18 14:49           ` Matthew Wilcox
@ 2008-12-18 14:52           ` James Bottomley
  2008-12-18 14:59             ` Boaz Harrosh
  1 sibling, 1 reply; 22+ messages in thread
From: James Bottomley @ 2008-12-18 14:52 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: Matthew Wilcox, Grant Grundler, linux-scsi

On Thu, 2008-12-18 at 16:38 +0200, Boaz Harrosh wrote:
> OK Then I say D, go to T10, while white list the (0) devices that currently
> report !SCSI_3 but do support UNMAP. These are only USB right?
> 
> Your tested devices report SCSI_3? Do all devices that are scsi_level > SCSI_2
> suppose to support RC16?

The problem isn't whether they support it or not.  A proper standards
compliant SCSI device can be sent READ CAPACITY(16) and just return
ILLEGAL REQUEST sense quite normally.  If those were all the devices in
the world, we'd send 16 first and fall back to 10.

The problem is that there are devices (USB devices) that go haywire when
sent a READ CAPACITY 16 command (or, indeed, any other SCSI command not
in their vocabulary).  It's for these devices that we do the 10->16
dance the way we do in sd.c.

Our problem is to identify devices that can reliably receive READ
CAPACITY 16 (which doesn't mean process it, just return a standards
compliant response without crashing or going out to lunch), because the
current Thin Provisioning draft requires this command to indicate thin
provisioning support.

My take is still that TP devices have to be SCSI-3 SBC-3 or higher, so
we just check this and for them do READ CAPACITY 16 with fallback to 10
on ILLEGAL REQUEST return.  USB has to whitelist the TP compliant
devices and not mangle the inquiry version field down to SCSI_2 for them
and the world will just work.
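
A minimal sketch of that flow, assuming a hypothetical device object with
read_capacity_10() and read_capacity_16() methods and a hypothetical
CheckCondition exception carrying the sense key. Only the ILLEGAL REQUEST
sense key value (0x05) comes from the SCSI standard; everything else here
is made up for illustration, including the FakeDisk stand-in.

```python
ILLEGAL_REQUEST = 0x05  # SCSI sense key
SCSI_2 = 3              # illustrative level code; only the ordering matters
RC10_MAX_LBA = 0xFFFFFFFF


class CheckCondition(Exception):
    """Hypothetical exception carrying the sense key of a failed command."""

    def __init__(self, sense_key):
        super().__init__(f"sense key {sense_key:#x}")
        self.sense_key = sense_key


def read_capacity(dev, scsi_level):
    """Return (last_lba, used_rc16) for a hypothetical device object."""
    if scsi_level > SCSI_2:
        try:
            return dev.read_capacity_16(), True
        except CheckCondition as err:
            if err.sense_key != ILLEGAL_REQUEST:
                raise  # only ILLEGAL REQUEST triggers the fallback
    last_lba = dev.read_capacity_10()
    if last_lba == RC10_MAX_LBA:
        # Existing 10->16 dance: the capacity does not fit in 32 bits.
        return dev.read_capacity_16(), True
    return last_lba, False


class FakeDisk:
    """Stand-in device that records which commands it was sent."""

    def __init__(self, last_lba, supports_rc16):
        self.last_lba = last_lba
        self.supports_rc16 = supports_rc16
        self.commands = []

    def read_capacity_10(self):
        self.commands.append("RC10")
        return min(self.last_lba, RC10_MAX_LBA)

    def read_capacity_16(self):
        self.commands.append("RC16")
        if not self.supports_rc16:
            raise CheckCondition(ILLEGAL_REQUEST)
        return self.last_lba
```

The point of the level check is that a device reporting SCSI_2 never sees
an RC16 unless its RC10 result overflows, so the fragile devices are left
alone. A timeout is deliberately not modelled here.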

> My point is make the standard, which is still a draft, crystal clear
> in a backward compatible way. All new, supporting, devices can be easily
> identified, and the very few devices that do support the new fixture but
> were released prior to the finalization of the draft be white-listed.
> And in any event don't let the standard be broken like that.

James


* Re: READ CAPACITY 16
  2008-12-18 14:52           ` James Bottomley
@ 2008-12-18 14:59             ` Boaz Harrosh
  0 siblings, 0 replies; 22+ messages in thread
From: Boaz Harrosh @ 2008-12-18 14:59 UTC (permalink / raw)
  To: James Bottomley; +Cc: Matthew Wilcox, Grant Grundler, linux-scsi

James Bottomley wrote:
> On Thu, 2008-12-18 at 16:38 +0200, Boaz Harrosh wrote:
>> OK Then I say D, go to T10, while white list the (0) devices that currently
>> report !SCSI_3 but do support UNMAP. These are only USB right?
>>
>> Your tested devices report SCSI_3? Do all devices that are scsi_level > SCSI_2
>> suppose to support RC16?
> 
> The problem isn't whether they support it or not.  A proper standards
> compliant SCSI device can be sent READ CAPACITY(16) and just return
> ILLEGAL REQUEST sense quite normally.  If those were all the devices in
> the world, we'd send 16 first and fall back to 10.
> 
> The problem is that there are devices (USB devices) that go haywire when
> sent a READ CAPACITY 16 command (or, indeed, any other SCSI command not
> in their vocabulary).  It's for these devices that we do the 10->16
> dance the way we do in sd.c
> 
> Our problem is to identify devices that could reliably receive (and this
> doesn't mean process it just means return a standards compliant response
> without crashing or going out to lunch) READ CAPACITY 16 because the
> current Thin Provisioning draft requires this to indicate thin
> provisioning support.
> 
> My take is still that TP devices have to be SCSI-3 SBC-3 or higher, so
> we just check this and for them do READ CAPACITY 16 with fallback to 10
> on ILLEGAL REQUEST return.  USB has to whitelist the TP compliant
> devices and not mangle the inquiry version field down to SCSI_2 for them
> and the world will just work.
> 
>> My point is make the standard, which is still a draft, crystal clear
>> in a backward compatible way. All new, supporting, devices can be easily
>> identified, and the very few devices that do support the new fixture but
>> were released prior to the finalization of the draft be white-listed.
>> And in any event don't let the standard be broken like that.
> 
> James
> 
> 

OK James, Matthew, thanks, that makes sense.

All these emulation layers, whether in HW (USB) or SW (libata), will have to
claim a level higher than SCSI_2 if they want the RC16 stuff, full stop. That
sounds safe to me, and the market will win.

Boaz


* Re: READ CAPACITY 16
  2008-12-17 16:42 READ CAPACITY 16 Matthew Wilcox
  2008-12-17 17:50 ` Grant Grundler
@ 2008-12-18 20:41 ` Douglas Gilbert
  1 sibling, 0 replies; 22+ messages in thread
From: Douglas Gilbert @ 2008-12-18 20:41 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-scsi

Matthew Wilcox wrote:
> I'm looking at the UNMAP support again, and we now have a bit that tells
> us whether the device supports UNMAP or not, it's called TPE (Thin
> Provisioning Enabled) and is found in byte 14 of the result from READ
> CAPACITY 16.  The problem is that we do our best to avoid calling READ
> CAPACITY 16.
> 
> Presumably, there are many devices which do not support RC16.  That
> isn't a problem, we can try RC16 and fall back to RC10 if the device
> returns an error.  The question is what to do about devices that either
> hang or take a long time to respond to an RC16 command.
> 
> This kind of problem isn't going to be limited to UNMAP.  DIF/DIX
> already has to use RC16 to get the protection type.  Once 4k sector size
> drives become common, we're going to want the "LOGICAL BLOCKS PER
> PHYSICAL BLOCK EXPONENT" and the "LOWEST ALIGNED LOGICAL BLOCK ADDRESS"
> information that RC16 returns and RC10 doesn't.  There's another 16
> bytes and a couple of reserved 4-bit fields to be assigned too, and I
> can imagine them getting used for new features in the future.
> 
> So what strategy should we adopt for trying harder to issue RC16?
> 
> Algorithm A (a perfect world):
> 
> Issue RC16
>  -> If it fails, issue RC10
>  -> If it times out, reset the device, issue RC10
> 
> Algorithm B:
> 
> Issue RC10
> Issue RC16
>  -> If it succeeds, use its results in preference to those from RC10
>  -> If it fails, carry on with the results from RC10
>  -> If it times out, reset the device, carry on with the results from RC10
> 
> Algorithm C:
> 
> As algorithm B, except:
>  -> If it succeeds, use the RC10 results for LBA unless the LBA is 0xffffffff
>     but use the RC16 results for TPE, PROT, etc.
> 
> Algorithm D:
> 
> Go back to T10 and say "Excuse me, kind sirs, would you mind adding an
> INQUIRY bit to indicate that the device supports UNMAP?  I know you've
> added a bit to RC16, but there's this nasty real world out there where
> devices are apt to blow up if you send them an RC16 when they're not
> expecting it."

T10 proposal 08-149r7 on thin provisioning does add two
extra fields to the Block Limits VPD page. A value greater
than zero in the first extra field ("Maximum UNMAP LBA
count") indicates that thin provisioning is supported.

In my experience it is reasonably safe to fire a "36 byte"
INQUIRY command with the EVPD bit set (with a page code of
B0h in this case) and examine the response. Crappy devices
just ignore the EVPD bit and respond as if it was a standard
INQUIRY, and this is easy to detect. The chances of such
devices supporting thin provisioning are extremely remote.

So if a properly formatted Block Limits VPD page is returned
with "Maximum UNMAP LBA count" > 0 then do a READ CAPACITY 16.
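
The check described above might look roughly like the sketch below. It
assumes the usual VPD layout (page code in byte 1, big-endian page length
in bytes 2-3). The offset of "Maximum UNMAP LBA count" is an assumption
(byte 20, a 4-byte big-endian field) and should be checked against the
draft actually in use; the function names are hypothetical.

```python
import struct

BLOCK_LIMITS_PAGE = 0xB0
# Assumed offset of "Maximum UNMAP LBA count"; verify against the draft.
MAX_UNMAP_LBA_COUNT_OFFSET = 20


def max_unmap_lba_count(resp, offset=MAX_UNMAP_LBA_COUNT_OFFSET):
    """Return the field value, or None if resp is not a usable page."""
    if len(resp) < 4 or resp[1] != BLOCK_LIMITS_PAGE:
        # A device that ignores EVPD returns standard INQUIRY data,
        # whose byte 1 is never B0h.
        return None
    (page_length,) = struct.unpack_from(">H", resp, 2)
    end = offset + 4
    if end > len(resp) or end > page_length + 4:
        return None  # page too short to carry the extra fields
    (count,) = struct.unpack_from(">I", resp, offset)
    return count


def should_issue_rc16(resp):
    """True only for a well-formed page with a non-zero UNMAP LBA count."""
    count = max_unmap_lba_count(resp)
    return count is not None and count > 0
```

With a 36-byte response, the assumed field at bytes 20-23 is still within
the returned data even when the full page is longer.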

It wouldn't be a bad idea if the block subsystem used some
of the other fields in the Block Limits VPD page.

Doug Gilbert



Thread overview: 22+ messages
2008-12-17 16:42 READ CAPACITY 16 Matthew Wilcox
2008-12-17 17:50 ` Grant Grundler
2008-12-17 18:06   ` Matthew Wilcox
2008-12-17 18:57     ` Grant Grundler
2008-12-17 19:04     ` James Bottomley
2008-12-17 19:11       ` Matthew Wilcox
2008-12-17 19:14         ` James Bottomley
2008-12-17 19:32           ` Matthew Wilcox
2008-12-17 19:36             ` James Bottomley
2008-12-17 19:49               ` Matthew Wilcox
2008-12-18  9:05     ` Boaz Harrosh
2008-12-18 14:08       ` Matthew Wilcox
2008-12-18 14:38         ` Boaz Harrosh
2008-12-18 14:49           ` Matthew Wilcox
2008-12-18 14:52           ` James Bottomley
2008-12-18 14:59             ` Boaz Harrosh
2008-12-18 20:41 ` Douglas Gilbert
  -- strict thread matches above, loose matches on Subject: below --
2008-12-17 17:20 bburk
2008-12-17 17:25 ` Matthew Wilcox
2004-12-09 14:33 read capacity 16 Frank Borich
2004-12-09 15:02 ` Christoph Hellwig
2004-12-08 21:07 Frank Borich
