* TRIM vs UNMAP vs WRITE SAME and thin devices
[not found] ` <1232721777.4430.7.camel@macbook.infradead.org>
@ 2009-02-07 14:53 ` Ric Wheeler
2009-02-07 15:09 ` James Bottomley
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Ric Wheeler @ 2009-02-07 14:53 UTC (permalink / raw)
To: David Woodhouse, James Bottomley, Martin K. Petersen
Cc: Matthew Wilcox, Jeff Garzik, linux-scsi, linux-fsdevel,
IDE/ATA development list
I have been poked by some vendors about the status of our support for
virtually/thinly provisioned LUNs, since they are getting close to
being able to test with real devices.
My quick summary is that most of the work so far has been done
without any real hardware to play with - in 2.6.29-rc3, I don't see any
low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
into the specific ATA or SCSI commands. Did I miss something & if not,
do we have plans to push anything upstream soonish?
One note on the SCSI devices, there was a T10 proposal to add an "UNMAP"
bit to the "WRITE SAME" command for SCSI. The details of the proposed
interface are at:
http://www.t11.org/t10/document.08/08-356r4.pdf
The up side of using WRITE SAME with unmap is that there are no fuzzy
semantics about what the unmapped sectors will be - they will all be
whatever the WRITE SAME command would have set (usually zeroes I assume).
The summary of write same is that you send down one sector (say 512
bytes of zeroes) and a count so you can do a zeroing of the target
without having to send all of the data over the wire. Very useful for
initializing members of a RAID device for example to a known pattern.
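To make the "one sector plus a count" idea concrete, here is a minimal sketch of filling in a WRITE SAME (16) CDB. The byte layout (opcode 0x93, UNMAP as bit 3 of byte 1, big-endian LBA in bytes 2-9 and block count in bytes 10-13) is the one in the later published SBC-3, so it is an assumption that it matches the 08-356r4 proposal exactly. The function and macro names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define WRITE_SAME_16_OPCODE 0x93 /* WRITE SAME (16) operation code */
#define WRITE_SAME_UNMAP_BIT 0x08 /* UNMAP flag: byte 1, bit 3 (per published SBC-3) */

/*
 * Fill a 16-byte CDB asking the target to repeat the single block of
 * data sent with the command over nblocks logical blocks starting at
 * lba.  If unmap is non-zero, the UNMAP bit is set as well.
 * Illustrative helper, not a kernel function.
 */
void build_write_same16(uint8_t cdb[16], uint64_t lba,
                        uint32_t nblocks, int unmap)
{
    int i;

    memset(cdb, 0, 16);
    cdb[0] = WRITE_SAME_16_OPCODE;
    if (unmap)
        cdb[1] |= WRITE_SAME_UNMAP_BIT;

    /* bytes 2..9: starting LBA, big-endian */
    for (i = 0; i < 8; i++)
        cdb[2 + i] = (uint8_t)(lba >> (56 - 8 * i));

    /* bytes 10..13: number of logical blocks, big-endian */
    for (i = 0; i < 4; i++)
        cdb[10 + i] = (uint8_t)(nblocks >> (24 - 8 * i));
}
```

The data-out buffer that goes with this CDB is just one logical block (e.g. 512 bytes of zeroes), regardless of how large nblocks is.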
The down side would be that if we incorrectly send down a WRITE SAME
command to a non-thin device, I think that we would kick off a potentially
extremely long IO. For example, imagine doing a write same of a full TB
- that could take an hour which might be an issue :-) Of course, we
should not be doing that if we get the code right.
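As a rough back-of-the-envelope check on the "could take an hour" figure, the sketch below just divides the size by a sustained write rate. The rates are assumptions picked for illustration, not measurements of any particular device: at 280 MB/s a full TB (10^12 bytes) takes about 3571 seconds, i.e. close to an hour, and at 100 MB/s it takes 10000 seconds, i.e. nearly three hours.

```c
#include <stdint.h>

/*
 * Seconds needed to write total_bytes at a sustained rate of
 * bytes_per_sec (integer division, rounds down).
 * Returns 0 if the rate is 0 to avoid dividing by zero.
 */
uint64_t write_seconds(uint64_t total_bytes, uint64_t bytes_per_sec)
{
    if (bytes_per_sec == 0)
        return 0;
    return total_bytes / bytes_per_sec;
}
```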
I don't see the other advantages that the PDF claims for file systems
as really all that useful.
With either the write same and its proposed unmap bit or with the
original T10 unmap, do we have a short list of infrastructure that needs
to be fleshed out? Anything we can do to help get people's patches to test
with their non-GA thin enabled devices?
Is there a similar short list of things to be done for T13 devices with
TRIM? Anyone have a chance to test on real hardware yet?
Thanks!
Ric
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 14:53 ` TRIM vs UNMAP vs WRITE SAME and thin devices Ric Wheeler
@ 2009-02-07 15:09 ` James Bottomley
2009-02-07 16:14 ` Ric Wheeler
2009-02-07 22:50 ` Matthew Wilcox
2009-02-07 22:47 ` Matthew Wilcox
2009-02-08 20:06 ` Greg Freemyer
2 siblings, 2 replies; 18+ messages in thread
From: James Bottomley @ 2009-02-07 15:09 UTC (permalink / raw)
To: Ric Wheeler
Cc: David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
> I have been poked at by some vendors about the status of our support for
> the virtually/thinly provisioned luns since they are getting close to
> being able to test with real devices.
With my LSF hat on, a certain array vendor might be sponsoring to get
the opportunity to raise this issue more fully. The impression (mostly
correct) is that we're thinking about trim/unmap purely from the SSD FTL
point of view and perhaps not being as useful as we might to virtually
provisioned LUNs ... so you could mention to the other vendors that they
might have an interest in coming (and even possibly sponsoring).
> My quick summary is that most of the work so far has been done
> without any real hardware to play with - in 2.6.29-rc3, I don't see any
> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
> into the specific ATA or SCSI commands. Did I miss something & if not,
> do we have plans to push anything upstream soonish?
With no devices it's a bit hard. Also we need at least three pieces for
SSDs: Devices supporting trim, the T13 implementation of TRIM and the
SAT for UNMAP. We can get the latter two out of the proposals, but it's
still a bit of a moving target.
> One note on the SCSI devices, there was a T10 proposal to add an "UNMAP"
> bit to the "WRITE SAME" command for SCSI. The details of the proposed
> interface are at:
>
> http://www.t11.org/t10/document.08/08-356r4.pdf
>
> The up side of using WRITE SAME with unmap is that there are no fuzzy
> semantics about what the unmapped sectors will be - they will all be
> whatever the WRITE SAME command would have set (usually zeroes I assume).
>
> The summary of write same is that you send down one sector (say 512
> bytes of zeroes) and a count so you can do a zeroing of the target
> without having to send all of the data over the wire. Very useful for
> initializing members of a RAID device for example to a known pattern.
>
> The down side would be that if we incorrectly send down a WRITE SAME
> command to a non-thin device, I think that we would kick off a potentially
> extremely long IO. For example, imagine doing a write same of a full TB
> - that could take an hour which might be an issue :-) Of course, we
> should not be doing that if we get the code right.
As I read it, non-thin-provisioned devices can be identified (and may
not even accept WRITE SAME).
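As a sketch of what "can be identified" might look like on the host side: in the later published SBC-3, the READ CAPACITY (16) parameter data carries a thin provisioning flag in bit 7 of byte 14 (named TPE in the drafts, LBPME in the final text) and a "reads return zeroes" flag in bit 6. Whether the proposals under discussion here use exactly these positions is an assumption, and the helper names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define RC16_THIN_BIT       0x80 /* byte 14, bit 7: thin provisioning enabled */
#define RC16_READ_ZERO_BIT  0x40 /* byte 14, bit 6: unmapped blocks read as zero */

/*
 * Returns 1 if the READ CAPACITY (16) parameter data in rc16 reports a
 * thin-provisioned logical unit, 0 otherwise (including a short buffer).
 */
int lun_is_thin(const uint8_t *rc16, size_t len)
{
    if (rc16 == NULL || len < 15)
        return 0;
    return (rc16[14] & RC16_THIN_BIT) != 0;
}

/*
 * Returns 1 only if the unit is thin provisioned and also reports that
 * unmapped blocks read back as zeroes.
 */
int lun_unmapped_reads_zero(const uint8_t *rc16, size_t len)
{
    if (!lun_is_thin(rc16, len))
        return 0;
    return (rc16[14] & RC16_READ_ZERO_BIT) != 0;
}
```

A check like this would let the discard path skip WRITE SAME with unmap entirely on devices that do not advertise thin provisioning.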
> I don't see another of the PDF's claims of advantages for file systems
> to be really all that useful.
>
> With either the write same and its proposed unmap bit or with the
> original T10 unmap, do we have a short list of infrastructure that needs
> to be fleshed out? Anything we can do to help get people's patches to test with
> their non-GA thin enabled devices?
Yes, REQ_DISCARD simply isn't broad enough to cope with all the
potential uses of WRITE SAME. If it's just a mechanism to get known
data into a discard sector, fine, we can set that at the lower level.
However, WRITE SAME has uses beyond TRIM in that it can be used as an
engine for data deduplication. If vendors are thinking of doing this,
then REQ_DISCARD isn't flexible enough.
> Is there a similar short list of things to be done for T13 devices with
> TRIM? Anyone have a chance to test on real hardware yet?
Not that I know of yet. It's all sort of on hold until actual devices
become available.
James
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 15:09 ` James Bottomley
@ 2009-02-07 16:14 ` Ric Wheeler
2009-02-12 13:51 ` Eyal Shani
2009-02-07 22:50 ` Matthew Wilcox
1 sibling, 1 reply; 18+ messages in thread
From: Ric Wheeler @ 2009-02-07 16:14 UTC (permalink / raw)
To: James Bottomley
Cc: David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list, Eyal Shani
James Bottomley wrote:
> On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
>
>> I have been poked at by some vendors about the status of our support for
>> the virtually/thinly provisioned luns since they are getting close to
>> being able to test with real devices.
>>
>
> With my LSF hat on, a certain array vendor might be sponsoring to get
> the opportunity to raise this issue more fully. The impression (mostly
> correct) is that we're thinking about trim/unmap purely from the SSD FTL
> point of view and perhaps not being as useful as we might to virtually
> provisioned LUNs ... so you could mention to the other vendors that they
> might have an interest in coming (and even possibly sponsoring).
>
That is probably worth bringing up - I don't see this as a large project
and it should be reasonably quick to get completed given all the work that
David and others have already put into it. If you (with your LF hat on
:-)) have a standard form or offer process, you might want to poke at
NetApp, EMC, Hitachi, IBM, HP and Dell. We both know the names of some
people in storage in a few of those companies; at others I have fewer
contacts.
On the other hand, this might also be an opportunity to get them and
their engineers on the array side more directly and personally involved.
>
>> My quick summary is that most of the work so far has been done
>> without any real hardware to play with - in 2.6.29-rc3, I don't see any
>> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
>> into the specific ATA or SCSI commands. Did I miss something & if not,
>> do we have plans to push anything upstream soonish?
>>
>
> With no devices it's a bit hard. Also we need at least three pieces for
> SSDs: Devices supporting trim, the T13 implementation of TRIM and the
> SAT for UNMAP. We can get the latter two out of the proposals, but it's
> still a bit of a moving target.
>
I think that it has settled a bit - do we have a good sense of the
status of the various proposals in T13 and T10?
>
>> One note on the SCSI devices, there was a T10 proposal to add an "UNMAP"
>> bit to the "WRITE SAME" command for SCSI. The details of the proposed
>> interface are at:
>>
>> http://www.t11.org/t10/document.08/08-356r4.pdf
>>
>> The up side of using WRITE SAME with unmap is that there are no fuzzy
>> semantics about what the unmapped sectors will be - they will all be
>> whatever the WRITE SAME command would have set (usually zeroes I assume).
>>
>> The summary of write same is that you send down one sector (say 512
>> bytes of zeroes) and a count so you can do a zeroing of the target
>> without having to send all of the data over the wire. Very useful for
>> initializing members of a RAID device for example to a known pattern.
>>
>> The down side would be that if we incorrectly send down a WRITE SAME
>> command to a non-thin device, I think that we would kick off a potentially
>> extremely long IO. For example, imagine doing a write same of a full TB
>> - that could take an hour which might be an issue :-) Of course, we
>> should not be doing that if we get the code right.
>>
>
> As I read it, non thin provisioned devices can be identified (and may
> not even accept WRITE SAME).
>
I agree that the intersection of write same and thin devices is not
going to be 100%. We might end up needing both for SCSI in the worst
case I suppose.
>
>> I don't see another of the PDF's claims of advantages for file systems
>> to be really all that useful.
>>
>> With either the write same and its proposed unmap bit or with the
>> original T10 unmap, do we have a short list of infrastructure that needs
>> to be fleshed out? Anything we can do to help get people's patches to test with
>> their non-GA thin enabled devices?
>>
>
> Yes, REQ_DISCARD simply isn't broad enough to cope with all the
> potential uses of WRITE SAME. If it's just a mechanism to get known
> data into a discard sector, fine, we can set that at the lower level.
> However, WRITE SAME has uses beyond TRIM in that it can be used as an
> engine for data deduplication. If vendors are thinking of doing this,
> then REQ_DISCARD isn't flexible enough.
>
I am more interested personally in the sparse support. On the dedup
side, I think that most implementations do not rely on write same. They
tend to compute hashes on the various blocks and so on.
>
>> Is there a similar short list of things to be done for T13 devices with
>> TRIM? Anyone have a chance to test on real hardware yet?
>>
>
> Not that I know of yet. It's all sort of on hold until actual devices
> become available.
>
> James
>
>
>
The vendors certainly have things that they could try in their labs if
we can get bits and pieces together for them to test with. We will need
to avoid the chicken and egg scenario where they wait for us and we wait
for them :-)
Ric
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 14:53 ` TRIM vs UNMAP vs WRITE SAME and thin devices Ric Wheeler
2009-02-07 15:09 ` James Bottomley
@ 2009-02-07 22:47 ` Matthew Wilcox
2009-02-07 23:36 ` David Woodhouse
2009-02-07 23:46 ` Jeff Garzik
2009-02-08 20:06 ` Greg Freemyer
2 siblings, 2 replies; 18+ messages in thread
From: Matthew Wilcox @ 2009-02-07 22:47 UTC (permalink / raw)
To: Ric Wheeler
Cc: David Woodhouse, James Bottomley, Martin K. Petersen, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
On Sat, Feb 07, 2009 at 09:53:06AM -0500, Ric Wheeler wrote:
> I have been poked at by some vendors about the status of our support for
> the virtually/thinly provisioned luns since they are getting close to
> being able to test with real devices.
>
> My quick summary is that most of the work so far has been done
> without any real hardware to play with - in 2.6.29-rc3, I don't see any
> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
> into the specific ATA or SCSI commands. Did I miss something & if not,
> do we have plans to push anything upstream soonish?
Bearing in mind that I'm now three weeks behind on email, you might want
to look at
http://git.kernel.org/?p=linux/kernel/git/willy/ssd.git;a=shortlog;h=trim-20081231
which has at least one known bug (fixed by Dave Woodhouse and Ben
Herrenschmidt). I'll be able to give a more coherent answer in a few
days. Or maybe Dave will beat me to it ;-)
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 15:09 ` James Bottomley
2009-02-07 16:14 ` Ric Wheeler
@ 2009-02-07 22:50 ` Matthew Wilcox
2009-02-07 23:03 ` James Bottomley
2009-02-08 16:47 ` Ric Wheeler
1 sibling, 2 replies; 18+ messages in thread
From: Matthew Wilcox @ 2009-02-07 22:50 UTC (permalink / raw)
To: James Bottomley
Cc: Ric Wheeler, David Woodhouse, Martin K. Petersen, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
On Sat, Feb 07, 2009 at 09:09:32AM -0600, James Bottomley wrote:
> On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
> > I have been poked at by some vendors about the status of our support for
> > the virtually/thinly provisioned luns since they are getting close to
> > being able to test with real devices.
>
> With my LSF hat on, a certain array vendor might be sponsoring to get
> the opportunity to raise this issue more fully. The impression (mostly
> correct) is that we're thinking about trim/unmap purely from the SSD FTL
> point of view and perhaps not being as useful as we might to virtually
> provisioned LUNs ... so you could mention to the other vendors that they
> might have an interest in coming (and even possibly sponsoring).
I thought we had agreed on a plan which satisfied the SSD and insane
array vendors. That is that we would do no tracking of allocation units
in the filesystem, but instead extend each trim out to cover the maximum
possible size. I've confirmed with Intel's SSD people that this would
cause them no harm at all (trimming already trimmed sectors won't even
cause a slowdown). Whether the filesystem people have taken note of
this, I have no idea.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 22:50 ` Matthew Wilcox
@ 2009-02-07 23:03 ` James Bottomley
2009-02-08 16:47 ` Ric Wheeler
1 sibling, 0 replies; 18+ messages in thread
From: James Bottomley @ 2009-02-07 23:03 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Ric Wheeler, David Woodhouse, Martin K. Petersen, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
On Sat, 2009-02-07 at 15:50 -0700, Matthew Wilcox wrote:
> On Sat, Feb 07, 2009 at 09:09:32AM -0600, James Bottomley wrote:
> > On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
> > > I have been poked at by some vendors about the status of our support for
> > > the virtually/thinly provisioned luns since they are getting close to
> > > being able to test with real devices.
> >
> > With my LSF hat on, a certain array vendor might be sponsoring to get
> > the opportunity to raise this issue more fully. The impression (mostly
> > correct) is that we're thinking about trim/unmap purely from the SSD FTL
> > point of view and perhaps not being as useful as we might to virtually
> > provisioned LUNs ... so you could mention to the other vendors that they
> > might have an interest in coming (and even possibly sponsoring).
>
> I thought we had agreed on a plan which satisfied the SSD and insane
> array vendors.
I don't think we got any input from array vendors, so it's rather hard
to claim this. So part of this idea would be gathering the necessary
inputs.
> That is that we would do no tracking of allocation units
> in the filesystem, but instead extend each trim out to cover the maximum
> possible size. I've confirmed with Intel's SSD people that this would
> cause them no harm at all (trimming already trimmed sectors won't even
> cause a slowdown). Whether the filesystem people have taken note of
> this, I have no idea.
It's one idea, but absent requirements from array vendors, we don't
really know if it's the right one.
James
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 22:47 ` Matthew Wilcox
@ 2009-02-07 23:36 ` David Woodhouse
2009-02-07 23:46 ` Jeff Garzik
1 sibling, 0 replies; 18+ messages in thread
From: David Woodhouse @ 2009-02-07 23:36 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen,
Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list
> On Sat, Feb 07, 2009 at 09:53:06AM -0500, Ric Wheeler wrote:
>> I have been poked at by some vendors about the status of our support for
>> the virtually/thinly provisioned luns since they are getting close to
>> being able to test with real devices.
>>
>> My quick summary is that most of the work so far has been done
>> without any real hardware to play with - in 2.6.29-rc3, I don't see any
>> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
>> into the specific ATA or SCSI commands. Did I miss something & if not,
>> do we have plans to push anything upstream soonish?
>
> Bearing in mind that I'm now three weeks behind on email, you might want
> to look at
> http://git.kernel.org/?p=linux/kernel/git/willy/ssd.git;a=shortlog;h=trim-20081231
> which has at least one known bug (fixed by Dave Woodhouse and Ben
> Herrenschmidt). I'll be able to give a more coherent answer in a few
> days. Or maybe Dave will beat me to it ;-)
Ben's suggestion was that the IDE core wouldn't be sending the payload of
the command because it looks at the R/W bit... which is clear (read) in
our discard requests ATM. Making them appear to be writes is simple enough
though. I gave an updated test kernel to the SanDisk folks but haven't got
results back from them yet.
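A stripped-down sketch of the problem Ben pointed out, for anyone following along. The flag values and function names below are hypothetical and do not match the kernel's actual bit numbers or the IDE core's real code; the point is only that a driver which derives the data direction from the write bit alone will treat a discard with that bit clear as a read and never send the payload.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define EX_REQ_WRITE   (1u << 0) /* the R/W bit: set means data goes to the device */
#define EX_REQ_DISCARD (1u << 1) /* request is a discard */

enum ex_dir { EX_DIR_IN, EX_DIR_OUT };

/*
 * Low-level driver's view: only the write bit decides whether a
 * payload is transferred to the device.
 */
enum ex_dir ex_data_direction(uint32_t flags)
{
    return (flags & EX_REQ_WRITE) ? EX_DIR_OUT : EX_DIR_IN;
}

/* Discard flags as described above: the write bit is clear. */
uint32_t ex_discard_flags_old(void)
{
    return EX_REQ_DISCARD;
}

/* The simple fix: make discards appear to be writes. */
uint32_t ex_discard_flags_fixed(void)
{
    return EX_REQ_DISCARD | EX_REQ_WRITE;
}
```

With the old flags the driver sees a data-in request, so the range list that a TRIM-style command needs never reaches the device; with the write bit set it is sent like any other write payload.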
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 22:47 ` Matthew Wilcox
2009-02-07 23:36 ` David Woodhouse
@ 2009-02-07 23:46 ` Jeff Garzik
2009-02-08 0:24 ` Matthew Wilcox
1 sibling, 1 reply; 18+ messages in thread
From: Jeff Garzik @ 2009-02-07 23:46 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen,
linux-scsi, linux-fsdevel, IDE/ATA development list
Matthew Wilcox wrote:
> On Sat, Feb 07, 2009 at 09:53:06AM -0500, Ric Wheeler wrote:
>> I have been poked at by some vendors about the status of our support for
>> the virtually/thinly provisioned luns since they are getting close to
>> being able to test with real devices.
>>
>> My quick summary is that most of the work so far has been done
>> without any real hardware to play with - in 2.6.29-rc3, I don't see any
>> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
>> into the specific ATA or SCSI commands. Did I miss something & if not,
>> do we have plans to push anything upstream soonish?
>
> Bearing in mind that I'm now three weeks behind on email, you might want
> to look at
> http://git.kernel.org/?p=linux/kernel/git/willy/ssd.git;a=shortlog;h=trim-20081231
> which has at least one known bug (fixed by Dave Woodhouse and Ben
> Herrenschmidt). I'll be able to give a more coherent answer in a few
> days. Or maybe Dave will beat me to it ;-)
BTW when will somebody send me the 4k sector patches? :)
Jeff
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 23:46 ` Jeff Garzik
@ 2009-02-08 0:24 ` Matthew Wilcox
0 siblings, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2009-02-08 0:24 UTC (permalink / raw)
To: Jeff Garzik
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen,
linux-scsi, linux-fsdevel, IDE/ATA development list
On Sat, Feb 07, 2009 at 06:46:42PM -0500, Jeff Garzik wrote:
> BTW when will somebody send me the 4k sector patches? :)
I'll get to that on Monday; just arrived back from holiday today.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 22:50 ` Matthew Wilcox
2009-02-07 23:03 ` James Bottomley
@ 2009-02-08 16:47 ` Ric Wheeler
2009-02-08 20:50 ` Matthew Wilcox
1 sibling, 1 reply; 18+ messages in thread
From: Ric Wheeler @ 2009-02-08 16:47 UTC (permalink / raw)
To: Matthew Wilcox
Cc: James Bottomley, David Woodhouse, Martin K. Petersen, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
Matthew Wilcox wrote:
> On Sat, Feb 07, 2009 at 09:09:32AM -0600, James Bottomley wrote:
>
>> On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
>>
>>> I have been poked at by some vendors about the status of our support for
>>> the virtually/thinly provisioned luns since they are getting close to
>>> being able to test with real devices.
>>>
>> With my LSF hat on, a certain array vendor might be sponsoring to get
>> the opportunity to raise this issue more fully. The impression (mostly
>> correct) is that we're thinking about trim/unmap purely from the SSD FTL
>> point of view and perhaps not being as useful as we might to virtually
>> provisioned LUNs ... so you could mention to the other vendors that they
>> might have an interest in coming (and even possibly sponsoring).
>>
>
> I thought we had agreed on a plan which satisfied the SSD and insane
> array vendors. That is that we would do no tracking of allocation units
> in the filesystem, but instead extend each trim out to cover the maximum
> possible size. I've confirmed with Intel's SSD people that this would
> cause them no harm at all (trimming already trimmed sectors won't even
> cause a slowdown). Whether the filesystem people have taken note of
> this, I have no idea.
>
>
That should be helpful for the array people, but for some of them with
really large delete chunk sizes, they will still miss a lot since their
size is larger than the average file size :-) I guess that we could do
something to resync - Ted mentioned some ideas for ext4.
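To illustrate why a large chunk size hurts: an array that can only release whole chunks gets nothing out of a discard that does not fully cover at least one chunk. The sketch below counts the whole chunks inside a discarded byte range. The 1 MiB chunk size used in the example is an assumption for illustration only, and the function name is made up.

```c
#include <stdint.h>

/*
 * Number of whole chunks fully contained in the range
 * [start, start + len).  All values are in the same unit (bytes here).
 * Overflow of start + len is not handled in this sketch.
 */
uint64_t whole_chunks_covered(uint64_t start, uint64_t len, uint64_t chunk)
{
    uint64_t first, last;

    if (chunk == 0 || len == 0)
        return 0;

    first = (start + chunk - 1) / chunk; /* first chunk starting at or after start */
    last = (start + len) / chunk;        /* one past the last chunk ending by the end */

    return last > first ? last - first : 0;
}
```

For example, a 4 KiB discard at offset 8 KiB with a 1 MiB chunk covers zero whole chunks, while a 2 MiB discard starting at 512 KiB covers exactly one.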
On another note, they are pondering either using write same with the
unmap bit set or the unmap command. It would seem that for thin
provisioning alone, either would work.
ric
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 14:53 ` TRIM vs UNMAP vs WRITE SAME and thin devices Ric Wheeler
2009-02-07 15:09 ` James Bottomley
2009-02-07 22:47 ` Matthew Wilcox
@ 2009-02-08 20:06 ` Greg Freemyer
2009-02-08 20:44 ` Matthew Wilcox
2 siblings, 1 reply; 18+ messages in thread
From: Greg Freemyer @ 2009-02-08 20:06 UTC (permalink / raw)
To: Ric Wheeler
Cc: David Woodhouse, James Bottomley, Martin K. Petersen,
Matthew Wilcox, Jeff Garzik, linux-scsi, linux-fsdevel,
IDE/ATA development list
On Sat, Feb 7, 2009 at 9:53 AM, Ric Wheeler <rwheeler@redhat.com> wrote:
>
> I have been poked at by some vendors about the status of our support for the
> virtually/thinly provisioned luns since they are getting close to being able
> to test with real devices.
I found a list of T10 activities just since Dec. 1, 2008 and it
is a bit overwhelming. (i.e., 08-356r4 is but one of many recent
reports)
http://www.t10.org/new_a.htm
For those of us that don't live and breathe the SCSI spec, is there an
overview site describing what is going on?
Maybe:
09-059r0 T10 Project Summary - January 2009 John Lohmeyer PDF (34729) 2009/01/22
http://www.t10.org/cgi-bin/ac.pl?t=d&f=09-059r0.pdf
I have not read any of the post-Dec. 1 stuff, including the above
project summary, but based on the names these seem potentially
relevant:
09-055r0 T13 Liaison Report January 09 Dan Colegrove PDF (4770) 2009/01/15
08-356r4 SBC-3: WRITE SAME unmap bit David L. Black PDF (56608) 2008/12/10
08-356r5 SBC-3 Thin Provisioning Commands Fred Knight, David L. Black PDF (387549) 2009/01/15
08-149r7 SBC - Thin Provisioning Frederick Knight PDF (281001) 2008/12/08
08-149r8 SBC - Thin Provisioning Frederick Knight PDF (281387) 2009/01/09
09-011r1 SBC-3 Thin Provisioning Threshold Notification Frederick Knight PDF (32757) 2009/01/09
08-149r9 SBC - Thin Provisioning Frederick Knight PDF (353888) 2009/01/15
09-012r0 Minutes: CAP - Thin Provisioning 12/4 con-call Frederick Knight PDF (38063) 2008/12/08
09-011r0 SBC-3 Thin Provisioning Threshold Notification Frederick Knight PDF (48523) 2008/12/08
08-396r3 SPC-4: Reporting support for all DIF types George Penokie PDF (85358) 2009/01/14
09-058r0 Agenda for T10 Meeting #90 March 2009 John Lohmeyer PDF (61437) 2009/01/19
09-020r0 T11 Liaison Report, December 2008 Robert Snively PDF (13117) 2008/12/19
09-032r0 Minutes of T10 Plenary Meeting #89 - January 15, 2009 Weber & Lohmeyer HTM (141593) 2009/01/23
09-032r0 Minutes of T10 Plenary Meeting #89 - January 15, 2009 Weber & Lohmeyer PDF (344891) 2009/01/23
Greg
--
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-08 20:06 ` Greg Freemyer
@ 2009-02-08 20:44 ` Matthew Wilcox
2009-02-09 0:01 ` Ric Wheeler
0 siblings, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2009-02-08 20:44 UTC (permalink / raw)
To: Greg Freemyer
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen,
Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list
On Sun, Feb 08, 2009 at 03:06:44PM -0500, Greg Freemyer wrote:
> I found a list of T10 activities just since Dec. 1, 2008 and it
> is a bit overwhelming. (ie. 08-356r4 is but one of many recent
> reports)
>
> http://www.t10.org/new_a.htm
>
> For those of us that don't live and breathe the SCSI spec, is there an
> overview site describing what is going on.
I've been working off 08-149r7.pdf. I'm sure that's been superseded by
now.
> 08-356r4 SBC-3: WRITE SAME unmap bit David L. Black PDF (56608) 2008/12/10
Probably interesting. Haven't read it myself.
> 08-356r5 SBC-3 Thin Provisioning Commands Fred Knight, David L.
> Black PDF (387549) 2009/01/15
Fred Knight seems to be the main coordinator of this effort, so yes.
> 08-149r7 SBC - Thin Provisioning Frederick Knight PDF (281001) 2008/12/08
That's the one I'm working from.
> 08-149r8 SBC - Thin Provisioning Frederick Knight PDF (281387) 2009/01/09
A newer version ... thought so.
> 09-011r1 SBC-3 Thin Provisioning Threshold Notification Frederick
> Knight PDF (32757) 2009/01/09
Clearly related.
> 08-149r9 SBC - Thin Provisioning Frederick Knight PDF (353888) 2009/01/15
Even newer version of what I've been working from.
> 09-012r0 Minutes: CAP - Thin Provisioning 12/4 con-call Frederick
> Knight PDF (38063) 2008/12/08
Probably tedious.
> 08-396r3 SPC-4: Reporting support for all DIF types George Penokie
> PDF (85358) 2009/01/14
Unrelated, I would think.
I'd go with 08-149r9 to get a good overview.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-08 16:47 ` Ric Wheeler
@ 2009-02-08 20:50 ` Matthew Wilcox
2009-02-08 23:58 ` Ric Wheeler
0 siblings, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2009-02-08 20:50 UTC (permalink / raw)
To: Ric Wheeler
Cc: James Bottomley, David Woodhouse, Martin K. Petersen, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
On Sun, Feb 08, 2009 at 11:47:25AM -0500, Ric Wheeler wrote:
> Matthew Wilcox wrote:
> >I thought we had agreed on a plan which satisfied the SSD and insane
> >array vendors. That is that we would do no tracking of allocation units
> >in the filesystem, but instead extend each trim out to cover the maximum
> >possible size. I've confirmed with Intel's SSD people that this would
> >cause them no harm at all (trimming already trimmed sectors won't even
> >cause a slowdown). Whether the filesystem people have taken note of
> >this, I have no idea.
>
> That should be helpful for the array people, but for some of them with
> really large delete chunk sizes, they will still miss a lot since their
> size is larger than the average file size :-) I guess that we could do
> something to resync - Ted mentioned some ideas for ext4.
I'm not sure I communicated the plan effectively.
Let's consider deleting a 4k file.
The DISCARD that the filesystem sends down does not just cover the 4k
of data. It covers all adjacent free space to that 4k of data, so it
might end up sending a DISCARD of several megabytes or even gigabytes,
assuming there's that much contiguous free space.
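Roughly, in code, the extension step could look like the sketch below. It assumes the filesystem can ask whether a given block is free, modelled here as a simple byte-per-block map; the structure and function names are invented for illustration and do not correspond to any existing filesystem code.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_range {
    size_t start; /* first block of the range */
    size_t len;   /* number of blocks */
};

/*
 * free_map[i] != 0 means block i is free.  The blocks in
 * [start, start + len) have just been freed and are already marked
 * free in free_map; start + len must not exceed nblocks.
 * Returns the range grown in both directions over all adjacent free
 * blocks, which is what would be sent down as the DISCARD.
 */
struct ex_range ex_extend_discard(const uint8_t *free_map, size_t nblocks,
                                  size_t start, size_t len)
{
    size_t lo = start;       /* inclusive lower bound */
    size_t hi = start + len; /* exclusive upper bound */
    struct ex_range r;

    while (lo > 0 && free_map[lo - 1])
        lo--;
    while (hi < nblocks && free_map[hi])
        hi++;

    r.start = lo;
    r.len = hi - lo;
    return r;
}
```

If the freed block sits in the middle of a large free region, the returned range covers the whole region; if its neighbours are both in use, the range stays exactly the freed block.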
Now, filesystems which fragment their free space will not do well on
thin provisioned devices, but then they won't do well on any devices --
keeping your free space compacted is an essential part of any filesystem's
job, even on SSDs.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-08 20:50 ` Matthew Wilcox
@ 2009-02-08 23:58 ` Ric Wheeler
0 siblings, 0 replies; 18+ messages in thread
From: Ric Wheeler @ 2009-02-08 23:58 UTC (permalink / raw)
To: Matthew Wilcox
Cc: James Bottomley, David Woodhouse, Martin K. Petersen, Jeff Garzik,
linux-scsi, linux-fsdevel, IDE/ATA development list
Matthew Wilcox wrote:
> On Sun, Feb 08, 2009 at 11:47:25AM -0500, Ric Wheeler wrote:
>
>> Matthew Wilcox wrote:
>>
>>> I thought we had agreed on a plan which satisfied the SSD and insane
>>> array vendors. That is that we would do no tracking of allocation units
>>> in the filesystem, but instead extend each trim out to cover the maximum
>>> possible size. I've confirmed with Intel's SSD people that this would
>>> cause them no harm at all (trimming already trimmed sectors won't even
>>> cause a slowdown). Whether the filesystem people have taken note of
>>> this, I have no idea.
>>>
>> That should be helpful for the array people, but for some of them with
>> really large delete chunk sizes, they will still miss a lot since their
>> size is larger than the average file size :-) I guess that we could do
>> something to resync - Ted mentioned some ideas for ext4.
>>
>
> I'm not sure I communicated the plan effectively.
>
> Let's consider deleting a 4k file.
>
> The DISCARD that the filesystem sends down does not just cover the 4k
> of data. It covers all adjacent free space to that 4k of data, so it
> might end up sending a DISCARD of several megabytes or even gigabytes,
> assuming there's that much contiguous free space.
>
> Now, filesystems which fragment their free space will not do well on
> thin provisioned devices, but then they won't do well on any devices --
> keeping your free space compacted is an essential part of any filesystem's
> job, even on SSDs.
>
>
Thanks - that does sound like it will in fact help clean up. I suppose
the worst case would be deleting lots of non-contiguous small files from
a full file system (say every other 4KB or something obscure like that).
I will see what the vendors I know have come up with; I think that this
should give them something interesting to play with....
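
To put rough numbers on that worst case, here is a small sketch. The 1 MiB array chunk size, the 4 KiB block size and the helper name `reclaimable_chunks` are assumptions made up for illustration, not figures from any vendor:

```python
def reclaimable_chunks(free, chunk_blocks):
    """Count aligned chunks in which every block is free.

    `free` is a hypothetical per-block free map (True = free);
    `chunk_blocks` is the array's allocation unit in blocks.
    """
    count = 0
    for base in range(0, len(free) - chunk_blocks + 1, chunk_blocks):
        if all(free[base:base + chunk_blocks]):
            count += 1
    return count


blocks = 1024   # 4 MiB worth of 4 KiB blocks
chunk = 256     # assumed 1 MiB array chunk

# Full file system, then every other 4 KiB block deleted.
alternating = [i % 2 == 0 for i in range(blocks)]
# The same amount of free space, but contiguous.
contiguous = [i < blocks // 2 for i in range(blocks)]

print(sum(alternating), reclaimable_chunks(alternating, chunk))  # 512 0
print(sum(contiguous), reclaimable_chunks(contiguous, chunk))    # 512 2
```

Half of the device is free in both layouts, but in the alternating one no discard can grow past a single block, so an array with a large chunk size gets nothing back.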
Ric
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-08 20:44 ` Matthew Wilcox
@ 2009-02-09 0:01 ` Ric Wheeler
0 siblings, 0 replies; 18+ messages in thread
From: Ric Wheeler @ 2009-02-09 0:01 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Greg Freemyer, David Woodhouse, James Bottomley,
Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel,
IDE/ATA development list
Matthew Wilcox wrote:
> On Sun, Feb 08, 2009 at 03:06:44PM -0500, Greg Freemyer wrote:
>
>> I found a list of T10 activities since just Dec. 1, 2008 and it
>> is a bit overwhelming. (ie. 08-356r4 is but one of many recent
>> reports)
>>
>> http://www.t10.org/new_a.htm
>>
>> For those of us that don't live and breathe the SCSI spec, is there an
>> overview site describing what is going on?
>>
>
> I've been working off 08-149r7.pdf. I'm sure that's been superseded by
> now.
>
>
>> 08-356r4 SBC-3: WRITE SAME unmap bit David L. Black PDF (56608) 2008/12/10
>>
>
> Probably interesting. Haven't read it myself.
>
This is only a four-page proposal. Basically, we would use the WRITE
SAME command with a special unmap bit set to tell the target that it may
(at its option) unmap the blocks. If it chooses not to unmap them, it
must in fact write the pattern supplied with the command to those blocks,
which I presume would be all zeros in the normal case.
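
A toy model of those semantics, purely for illustration: the class `ToyTarget` and its methods are invented here and are not taken from the proposal. It also assumes that unmapped blocks read back as zeros and that the target only unmaps when the pattern is all zeros, which may not match the T10 text:

```python
BLOCK = 512  # bytes per logical block


class ToyTarget:
    """Hypothetical target that stores only explicitly written blocks."""

    def __init__(self, thin):
        self.thin = thin      # True if the target can unmap blocks
        self.mapped = {}      # lba -> bytes actually stored

    def write_same(self, lba, count, pattern, unmap=False):
        # The unmap bit is only a hint: the target may unmap, and
        # otherwise must write the pattern to every block in the range.
        can_unmap = unmap and self.thin and pattern == bytes(BLOCK)
        for b in range(lba, lba + count):
            if can_unmap:
                self.mapped.pop(b, None)
            else:
                self.mapped[b] = pattern

    def read(self, lba):
        # Assumption: unmapped blocks read back as zeros.
        return self.mapped.get(lba, bytes(BLOCK))


thin, thick = ToyTarget(thin=True), ToyTarget(thin=False)
for target in (thin, thick):
    target.write_same(0, 8, b"\xff" * BLOCK)            # fill with data
    target.write_same(0, 8, bytes(BLOCK), unmap=True)   # zero, unmap allowed

print(len(thin.mapped), len(thick.mapped))   # 0 8
print(thin.read(0) == thick.read(0))         # True
```

Either way the initiator reads back the same data afterwards; the only difference is whether the target still holds backing storage for the range.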
>
>> 08-356r5 SBC-3 Thin Provisioning Commands Fred Knight, David L.
>> Black PDF (387549) 2009/01/15
>>
>
> Fred Knight seems to be the main coordinator of this effort, so yes.
>
Fred and David Black both have been quite active.
>
>> 08-149r7 SBC - Thin Provisioning Frederick Knight PDF (281001) 2008/12/08
>>
>
> That's the one I'm working from.
>
>
>> 08-149r8 SBC - Thin Provisioning Frederick Knight PDF (281387) 2009/01/09
>>
>
> A newer version ... thought so.
>
>
>> 09-011r1 SBC-3 Thin Provisioning Threshold Notification Frederick
>> Knight PDF (32757) 2009/01/09
>>
>
> Clearly related.
>
>
>> 08-149r9 SBC - Thin Provisioning Frederick Knight PDF (353888) 2009/01/15
>>
>
> Even newer version of what I've been working from.
>
>
>> 09-012r0 Minutes: CAP - Thin Provisioning 12/4 con-call Frederick
>> Knight PDF (38063) 2008/12/08
>>
>
> Probably tedious.
>
>
>> 08-396r3 SPC-4: Reporting support for all DIF types George Penokie
>> PDF (85358) 2009/01/14
>>
>
> Unrelated, I would think.
>
> I'd go with 08-149r9 to get a good overview.
>
>
* RE: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-07 16:14 ` Ric Wheeler
@ 2009-02-12 13:51 ` Eyal Shani
2009-03-23 19:05 ` Greg Freemyer
0 siblings, 1 reply; 18+ messages in thread
From: Eyal Shani @ 2009-02-12 13:51 UTC (permalink / raw)
To: Ric Wheeler, James Bottomley
Cc: David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik,
linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org,
IDE/ATA development list, Eyal Shani
Adding my 5 cents.
T13 added Trim to the latest ATA8 proposal.
http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1-ATAATAPI_Command_Set_-_2_ACS-2.pdf
This is after the changes put into the definition, with 'Deterministic Read after Trim'.
This is not STANDARDIZED, but pretty much accepted by all sides.
I was hoping that would settle the differences between T10/T13 on this - little did I know...
We are working with David W. on his implementation of the Trim feature, and hope to finish debugging soon.
Hope to update soon...
Regards,
Eyal Shani.
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-02-12 13:51 ` Eyal Shani
@ 2009-03-23 19:05 ` Greg Freemyer
2009-03-23 19:23 ` Mark Lord
0 siblings, 1 reply; 18+ messages in thread
From: Greg Freemyer @ 2009-03-23 19:05 UTC (permalink / raw)
To: Eyal Shani
Cc: Ric Wheeler, James Bottomley, David Woodhouse, Martin K. Petersen,
Matthew Wilcox, Jeff Garzik, linux-scsi@vger.kernel.org,
linux-fsdevel@vger.kernel.org, IDE/ATA development list,
Theodore Tso
On Thu, Feb 12, 2009 at 9:51 AM, Eyal Shani <Eyal.Shani@sandisk.com> wrote:
> Adding my 5 cents.
>
> T13 added Trim to the latest ATA8 proposal.
> http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1-ATAATAPI_Command_Set_-_2_ACS-2.pdf
>
> This is after the changes put into the definition, with 'Deterministic Read after Trim'.
> This is not STANDARDIZED, but pretty much accepted by all sides.
>
> I was hoping that would settle the differences between T10/T13 on this - little did I know...
>
> We are working with David W. on his implementation of the Trim feature, and hope to finish debugging soon.
> Hope to update soon...
>
>
> Regards,
> Eyal Shani.
FYI:
Several of you remember I've been concerned about the lack of
"audit-ability" associated with the new Trim feature as it relates to the
T13 spec.
I finally found a contact that is on the T13 committee and have
expressed my concern. He said the issue was raised at a recent
meeting of the committee and that a sub-group was tasked with making a
recommendation. He said that he understands my concern and said he
would push to ensure that some sort of "reliable data" flag be in the
eventual spec.
Obviously he is just one person, so no guarantees, but I am happy to
have finally connected with someone on the committee.
Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
2009-03-23 19:05 ` Greg Freemyer
@ 2009-03-23 19:23 ` Mark Lord
0 siblings, 0 replies; 18+ messages in thread
From: Mark Lord @ 2009-03-23 19:23 UTC (permalink / raw)
To: Greg Freemyer
Cc: Eyal Shani, Ric Wheeler, James Bottomley, David Woodhouse,
Martin K. Petersen, Matthew Wilcox, Jeff Garzik,
linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org,
IDE/ATA development list, Theodore Tso
..
> On Thu, Feb 12, 2009 at 9:51 AM, Eyal Shani <Eyal.Shani@sandisk.com> wrote:
>> Adding my 5 cents.
>>
>> T13 added Trim to the latest ATA8 proposal.
>> http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1-ATAATAPI_Command_Set_-_2_ACS-2.pdf
..
Note that there is also a Rev.1a edition, same link as above
except change the d2015r1 to d2015r1a:
http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1a-ATAATAPI_Command_Set_-_2_ACS-2.pdf
Thread overview: 18+ messages
[not found] <20090123041558.GC24652@parisc-linux.org>
[not found] ` <4979AF62.7070409@redhat.com>
[not found] ` <1232721777.4430.7.camel@macbook.infradead.org>
2009-02-07 14:53 ` TRIM vs UNMAP vs WRITE SAME and thin devices Ric Wheeler
2009-02-07 15:09 ` James Bottomley
2009-02-07 16:14 ` Ric Wheeler
2009-02-12 13:51 ` Eyal Shani
2009-03-23 19:05 ` Greg Freemyer
2009-03-23 19:23 ` Mark Lord
2009-02-07 22:50 ` Matthew Wilcox
2009-02-07 23:03 ` James Bottomley
2009-02-08 16:47 ` Ric Wheeler
2009-02-08 20:50 ` Matthew Wilcox
2009-02-08 23:58 ` Ric Wheeler
2009-02-07 22:47 ` Matthew Wilcox
2009-02-07 23:36 ` David Woodhouse
2009-02-07 23:46 ` Jeff Garzik
2009-02-08 0:24 ` Matthew Wilcox
2009-02-08 20:06 ` Greg Freemyer
2009-02-08 20:44 ` Matthew Wilcox
2009-02-09 0:01 ` Ric Wheeler