* Alternative TRIM proposal
From: Matthew Wilcox @ 2008-10-02 15:24 UTC (permalink / raw)
To: Knight, Frederick; +Cc: t10, linux-scsi, dougg, Martin Petersen
There's a meeting tomorrow to discuss the T10 TRIM command. The current
proposal can be seen at http://t10.org/ftp/t10/document.08/08-149r2.pdf
A related document (discussing READ after TRIM) can be found at
http://t10.org/ftp/t10/document.08/08-347r1.pdf
I'm not keen on the 'pass a list of blocks to be trimmed' model. I would
prefer TRIM to be a real command like READ or WRITE. To that end, here
are my notes on creating such commands, followed by an actual proposal.
I would welcome feedback on this, and it'd be most useful if such feedback
occurred within the next 24 hours so I can refine the proposal before
the meeting.
Notes
=====
SBC-3 specifies 6, 10, 12, 16 and 32 byte commands for each of READ and
WRITE as well as 10, 12, 16 and 32 byte commands for VERIFY. While it
is tempting to only define a 32-byte TRIM command, that would prevent
older controllers from supporting TRIM, as well as being wasteful in the
on-wire encoding. All drivers in Linux support at least 12-byte commands,
so I think we can avoid defining 6 and 10 byte variants of TRIM in order
to conserve the number of operation codes required for this proposal.
The 12-byte commands allow 32 bits for LBA and 32 bits for transfer length
(remember these are specified in sectors, normally 512 bytes, so they support
drives up to 2TB in size). The 16-byte commands expand the LBA size
to 64-bit, supporting drive sizes over 9000 Exabytes (8192 exbibytes,
I suppose). The 32-byte commands add support for application tags.
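As a quick check of the capacity figures above, here is a small sketch. It assumes 512-byte logical blocks, as the text does; the function name is only illustrative.

```python
SECTOR_BYTES = 512  # assumed logical block size ("normally 512 bytes")


def max_capacity_bytes(lba_bits: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Bytes addressable when the LBA field is lba_bits wide."""
    return (1 << lba_bits) * sector_bytes


cap_32bit_lba = max_capacity_bytes(32)  # 12-byte commands
cap_64bit_lba = max_capacity_bytes(64)  # 16-byte commands

print(cap_32bit_lba // 2**40, "TiB")   # 2 TiB
print(cap_64bit_lba // 2**60, "EiB")   # 8192 EiB
print(cap_64bit_lba // 10**18, "EB")   # 9444 EB, i.e. "over 9000 Exabytes"
```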
The commands also include various fields which may or may not make sense
for TRIM. Here's a list:
WRPROTECT    | The application may want the device to check protection
RDPROTECT    | information before allowing the TRIM to succeed. This is
VRPROTECT    | the same case as VERIFY with BYTCHK=0. See table 67 in
             | SAM 3 r14.
DPO          | Disable Page Out is not relevant to TRIM since the blocks
             | are being discarded. Checking application tags may require
             | the blocks to be accessed, but they can always be discarded
             | immediately. Recommend this bit be reserved.
FUA          | I don't see a reason to force unit access, recommend these
FUA_NV       | bits be reserved.
BYTCHK       | There might be a case to be made for allowing the device
             | to discard only if the data is still what it used to be,
             | but this would add additional complexity and I don't know
             | if it's worth it. Reserve this bit.
GROUP NUMBER | I can see it being useful to account TRIMs to different
             | groups and produce statistics about them, so recommend that
             | GROUP NUMBER be specified as it is for other commands.
CONTROL      | All commands shall contain the CONTROL byte as specified by
             | SAM 4.
Proposal
========
Define three new commands, TRIM (12), TRIM (16) and TRIM (32):
TRIM (12)
  byte 0      OPERATION CODE (to be assigned)
  byte 1      bits 7-5: VRPROTECT, bits 4-0: Reserved
  byte 2-5    LOGICAL BLOCK ADDRESS
  byte 6-9    TRANSFER LENGTH
  byte 10     bits 7-5: Reserved, bits 4-0: GROUP NUMBER
  byte 11     CONTROL
TRIM (16)
  byte 0      OPERATION CODE (to be assigned)
  byte 1      bits 7-5: VRPROTECT, bits 4-0: Reserved
  byte 2-9    LOGICAL BLOCK ADDRESS
  byte 10-13  TRANSFER LENGTH
  byte 14     bits 7-5: Reserved, bits 4-0: GROUP NUMBER
  byte 15     CONTROL
TRIM (32)
  byte 0      OPERATION CODE (7Fh)
  byte 1      CONTROL
  byte 2-5    Reserved
  byte 6      bits 7-5: Reserved, bits 4-0: GROUP NUMBER
  byte 7      ADDITIONAL CDB LENGTH (18h)
  byte 8-9    SERVICE ACTION (to be assigned)
  byte 10     bits 7-5: VRPROTECT, bits 4-0: Reserved
  byte 11     Reserved
  byte 12-19  LOGICAL BLOCK ADDRESS
  byte 20-23  EXPECTED INITIAL LOGICAL BLOCK REFERENCE TAG
  byte 24-25  EXPECTED LOGICAL BLOCK APPLICATION TAG
  byte 26-27  LOGICAL BLOCK APPLICATION TAG MASK
  byte 28-31  TRANSFER LENGTH
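To make the three layouts concrete, here is a rough sketch that packs each CDB into bytes. The operation code and service action values are placeholders, since the proposal leaves them to be assigned, and all function and constant names are invented for this illustration. Multi-byte fields are packed big-endian, as is usual for SCSI CDBs.

```python
import struct

# Placeholders only: the proposal leaves these values "to be assigned".
TRIM12_OPCODE = 0x00
TRIM16_OPCODE = 0x00
TRIM32_SERVICE_ACTION = 0x0000

VARIABLE_LENGTH_CDB = 0x7F    # OPERATION CODE for the 32-byte form
ADDITIONAL_CDB_LENGTH = 0x18  # 32 bytes total minus the first 8


def trim12(lba, length, vrprotect=0, group=0, control=0):
    """TRIM (12): 32-bit LBA, 32-bit TRANSFER LENGTH."""
    return struct.pack(">BBIIBB", TRIM12_OPCODE, (vrprotect & 0x7) << 5,
                       lba, length, group & 0x1F, control)


def trim16(lba, length, vrprotect=0, group=0, control=0):
    """TRIM (16): 64-bit LBA, 32-bit TRANSFER LENGTH."""
    return struct.pack(">BBQIBB", TRIM16_OPCODE, (vrprotect & 0x7) << 5,
                       lba, length, group & 0x1F, control)


def trim32(lba, length, ref_tag=0, app_tag=0, app_tag_mask=0,
           vrprotect=0, group=0, control=0):
    """TRIM (32): adds the reference tag, application tag and tag mask."""
    return struct.pack(">BB4xBBHBxQIHHI",
                       VARIABLE_LENGTH_CDB, control,
                       group & 0x1F, ADDITIONAL_CDB_LENGTH,
                       TRIM32_SERVICE_ACTION, (vrprotect & 0x7) << 5,
                       lba, ref_tag, app_tag, app_tag_mask, length)


print(len(trim12(0, 8)), len(trim16(0, 8)), len(trim32(0, 8)))  # 12 16 32
```

The `4x` and `x` entries in the format strings emit zero bytes for the Reserved fields (bytes 2-5 and byte 11 of the 32-byte form).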
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* RE: Alternative TRIM proposal
From: Kevin_Marks @ 2008-10-02 19:37 UTC (permalink / raw)
To: matthew, Frederick.Knight; +Cc: t10, linux-scsi, dougg
Matthew,
What is your reasoning behind opposing a descriptor-based model? I am not
an FS expert, but it would seem that when a file system deletes a file, for
example, the list of LBAs that are no longer allocated and could be Trimmed
(going to be renamed again to Punch) would not always be contiguous. Having
a descriptor-based command allows communicating this in a single command
vs. multiple commands in your proposal.
Thanks
Kevin
*
* For T10 Reflector information, send a message with
* 'info t10' (no quotes) in the message body to majordomo@t10.org
* Re: Alternative TRIM proposal
From: Gerry.Houlder @ 2008-10-02 20:07 UTC (permalink / raw)
To: matthew; +Cc: dougg, Knight, Frederick, linux-scsi, owner-t10
I will not be able to attend the call tomorrow, so I will offer my opinions
on the T10 reflector.
(a) I don't see any point to doing a 12 byte TRIM command. The 16 byte
command is more future proof and any SCSI system (except maybe ATAPI) that
can do 10 and 12 byte commands can also do 16 byte commands.
(b) Likewise I don't see any point to a 32 byte TRIM command. The extra
fields in the CDB are for checking Protection Information fields attached
to user data; since the trim command won't transfer any user data these
fields add no value.
(c) You mention things like "checking protection information data before
trimming" or doing a verify operation before trimming. This is a pointless
waste of time for a bunch of blocks that you wish to delete. All you really
care about is that only good blocks are reused in the future, when new
information is written. Let the storage device's scrubbing or wear leveling
algorithms take care of that.
(d) I'm not sure why you don't like the T13 approach, where multiple
extents can be trimmed in one command instead of having to send a separate
command for each extent to be trimmed. If you tell me that when a person
goes to the GUI file manager and marks 10 files for deletion, some of which
might be fragmented into several extents, the operating system will only
trim one extent at a time (waiting for each to be trimmed before doing the
next) rather than combining multiple extents into one interface command,
then perhaps there is no advantage to being able to do multiple extents.
The advantage of multiple extents in one command is using less interface
bus bandwidth. The only disadvantage is error recovery; if the command
fails or is aborted for other reasons it is messier because the host has to
figure out which (if any) of the extents were done and which have to be
retried. I hope you will expound on your reason(s) for liking the one
extent at a time approach.
* RE: Alternative TRIM proposal
From: Kevin_Marks @ 2008-10-02 22:39 UTC (permalink / raw)
To: Gerry.Houlder, matthew; +Cc: dougg, Frederick.Knight, linux-scsi, owner-t10
Gerry,
The intent of the Trim command (Fred can correct me if I'm wrong) is not
the same as that of the ATA TRIM command, which is why it is being renamed
back to Punch. It is for thin-provisioned LUNs and not really applicable to
HDDs. With ATA TRIM, I believe there are always enough physical sectors to
map to the reported LBAs; in a thin-provisioned model, this is not the
case. The Punch command would allow one to shrink the set of physical
blocks used by a given LUN when they are not needed and allocate them to a
different LUN.
I do agree that anything to do with protection information is pointless,
and I don't see why it would be included, regardless of whether a
descriptor-based or non-descriptor-based command format is chosen.
With little time spent thinking about it, I also could not find a reason
that this command would fail, other than sending it to a LUN that is not
thin provisioned or providing an LBA range above the reported capacity.
Kevin
* Re: Alternative TRIM proposal
From: Matthew Wilcox @ 2008-10-03 15:42 UTC (permalink / raw)
To: Gerry.Houlder; +Cc: dougg, Knight, Frederick, linux-scsi, t10, David Woodhouse
On Thu, Oct 02, 2008 at 03:07:34PM -0500, Gerry.Houlder@seagate.com wrote:
> I will not be able to attend the call tomorrow, so I will offer my opinions
> on the T10 reflector.
Thanks, Gerry.
> (a) I don't see any point to doing a 12 byte TRIM command. The 16 byte
> command is more future proof and any SCSI system (except maybe ATAPI) that
> can do 10 and 12 byte commands can also do 16 byte commands.
Unfortunately, that's not true. A survey of the SCSI host adapters in
Linux shows that many only support 12-byte commands. A reasonable
person might decide that it's not worth supporting these adapters any
more, but I don't want to see support disappear because we didn't know
about it.
> (b) Likewise I don't see any point to a 32 byte TRIM command. The extra
> fields in the CDB are for checking Protection Information fields attached
> to user data; since the trim command won't transfer any user data these
> fields add no value.
I may well have been mistaken in my understanding of the protection
information. I thought there was potential for the drive to check the
protection information currently written to the media and to fail the
TRIM if it wasn't there. It would be just a verification step.
> (d) I'm not sure why you don't like the T13 approach, where multiple
> extents can be trimmed in one command instead of having to send a separate
> command for each extent to be trimmed. If you tell me that when a person
> goes to the GUI file manager and marks 10 files for deletion, some of which
> might be fragmented into several extents, the operating system will only
> trim one extent at a time (waiting for each to be trimmed before doing the
> next) rather than combining multiple extents into one interface command
> then perhaps there is no advantage to being able to do multiple extents.
> The advantage of multiple extents in one command is using less interface
> bus bandwidth. The only disadvantage is error recovery; if the command
> fails or is aborted for other reasons it is messier because the host has to
> figure out which (if any) of the extents were done and which have to be
> retried. I hope you will expound on your reason(s) for liking the one
> extent at a time approach.
The most recent version of the T13 proposal I've seen is e07154r6 and
that only allows for a single extent per command. If T13 have changed
their proposal to allow multiple extents in a single command, then I
withdraw my opposition because synchronising the capabilities between
T10 and T13 is my main goal.
I don't think it's necessary for each TRIM to be completed before sending
the next; I think it's entirely reasonable to use tags to send multiple
TRIMs at a time.
Filesystems already do their best to avoid fragmentation (as I'm sure
we're all aware of the performance penalties for fragmented files on
rotating media), so I suspect there are not too many extents per file
already.
It is today the case in Linux that each extent will be sent from the
filesystem to the block layer individually. I can't speak to other
operating systems, maybe some of them allow filesystems to send a list
of extents to the SCSI layer, maybe their SCSI layer can take extents
out of the queue and bundle them together into a single command.
Are there devices that would operate more efficiently if given a list of
(discontiguous) extents to trim rather than getting several consecutive
commands each with a single trim extent in it?
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: Alternative TRIM proposal
From: Jens Axboe @ 2008-10-07 11:39 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Gerry.Houlder, dougg, Knight, Frederick, linux-scsi, t10,
David Woodhouse
On Fri, Oct 03 2008, Matthew Wilcox wrote:
> It is today the case in Linux that each extent will be sent from the
> filesystem to the block layer individually. I can't speak to other
> operating systems, maybe some of them allow filesystems to send a list
> of extents to the SCSI layer, maybe their SCSI layer can take extents
> out of the queue and bundle them together into a single command.
We do that for lots of regular IO as well, but still support merging
them into a single command - the same could be done for trim, basically
making any of them mergeable into a single command.
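As a rough illustration of the merging idea, here is a sketch that coalesces single-extent trim requests that touch or overlap before they are issued. It is not the Linux block layer's merging code; the function name and the (lba, nblocks) tuple representation are assumptions made for this example.

```python
def coalesce(extents):
    """Merge (lba, nblocks) extents that touch or overlap.

    Returns a sorted list in which no two extents are adjacent,
    so each entry could be issued as one single-extent TRIM.
    """
    merged = []
    for lba, nblocks in sorted(extents):
        if merged and lba <= merged[-1][0] + merged[-1][1]:
            # This extent starts inside or right at the end of the
            # previous one: extend the previous extent instead.
            last_lba, last_n = merged[-1]
            merged[-1] = (last_lba, max(last_n, lba + nblocks - last_lba))
        else:
            merged.append((lba, nblocks))
    return merged


queued = [(100, 8), (0, 8), (8, 8), (108, 4)]
print(coalesce(queued))  # [(0, 16), (100, 12)]
```

Extents that remain discontiguous after this step would still need either one command each or a descriptor-list command, which is the open question in this thread.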
--
Jens Axboe