* TRIM vs UNMAP vs WRITE SAME and thin devices
From: Ric Wheeler @ 2009-02-07 14:53 UTC
To: David Woodhouse, James Bottomley, Martin K. Petersen
Cc: Matthew Wilcox, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

I have been poked at by some vendors about the status of our support for
the virtually/thinly provisioned luns since they are getting close to
being able to test with real devices.

My quick summary is that most of the work so far has been done
without any real hardware to play with - in 2.6.29-rc3, I don't see any
low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
into the specific ATA or SCSI commands. Did I miss something & if not,
do we have plans to push anything upstream soonish?

One note on the SCSI devices, there was a T10 proposal to add an "UNMAP"
bit to the "WRITE SAME" command for SCSI. The details of the proposed
interface are at:

http://www.t11.org/t10/document.08/08-356r4.pdf

The up side of using WRITE SAME with unmap is that there are no fuzzy
semantics about what the unmapped sectors will be - they will all be
whatever the WRITE SAME command would have set (usually zeroes I assume).

The summary of write same is that you send down one sector (say 512
bytes of zeroes) and a count so you can do a zeroing of the target
without having to send all of the data over the wire. Very useful for
initializing members of a RAID device for example to a known pattern.

The down side would be that if we incorrectly send down a WRITE SAME
command to a non-thin device, I think that we would kick off a potentially
extremely long IO. For example, imagine doing a write same of a full TB
- that could take an hour which might be an issue :-) Of course, we
should not be doing that if we get the code right.

I don't see another of the PDF's claims of advantages for file systems
to be really all that useful.

With either the write same and its proposed unmap bit or with the
original T10 unmap, do we have a short list of infrastructure that needs
to be fleshed out? Anything we can do to help get people's patches to test
with their non-GA thin enabled devices?

Is there a similar short list of things to be done for T13 devices with
TRIM? Anyone have a chance to test on real hardware yet?

Thanks!

Ric
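As a rough illustration of the WRITE SAME description above (one block of data plus a count), here is a minimal Python sketch that only builds the 16-byte command descriptor block. The opcode and the UNMAP bit position follow WRITE SAME (16) as later published in SBC-3; whether the 08-356r4 proposal used exactly this layout is an assumption, and the function name `build_write_same_16` is made up for this example. Nothing here talks to a device.

```python
import struct

WRITE_SAME_16 = 0x93  # operation code for WRITE SAME (16)
UNMAP_BIT = 0x08      # bit 3 of CDB byte 1 (published SBC-3; assumed to match the proposal)


def build_write_same_16(lba, num_blocks, unmap=False):
    """Return a 16-byte WRITE SAME (16) CDB (hypothetical helper).

    The data-out buffer that accompanies the command is a single logical
    block (for example 512 zero bytes); the device repeats it num_blocks
    times starting at lba.
    """
    if not 0 <= lba < 2**64:
        raise ValueError("LBA must fit in 64 bits")
    if not 0 <= num_blocks < 2**32:
        raise ValueError("block count must fit in 32 bits")
    flags = UNMAP_BIT if unmap else 0
    # Layout: opcode, flags, 8-byte LBA, 4-byte block count,
    # group number, control byte (all big-endian, no padding).
    return struct.pack(">BBQIBB", WRITE_SAME_16, flags, lba, num_blocks, 0, 0)


cdb = build_write_same_16(lba=0x1000, num_blocks=2048, unmap=True)
payload = bytes(512)  # the one block of zeroes sent over the wire
```

Note that a full TiB of 512-byte blocks is 2**31 blocks, which still fits in the 32-bit count field, so a single command really could cover the "full TB" case mentioned above.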
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: James Bottomley @ 2009-02-07 15:09 UTC
To: Ric Wheeler
Cc: David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
> I have been poked at by some vendors about the status of our support for
> the virtually/thinly provisioned luns since they are getting close to
> being able to test with real devices.

With my LSF hat on, a certain array vendor might be sponsoring to get
the opportunity to raise this issue more fully. The impression (mostly
correct) is that we're thinking about trim/unmap purely from the SSD FTL
point of view and perhaps not being as useful as we might to virtually
provisioned LUNs ... so you could mention to the other vendors that they
might have an interest in coming (and even possibly sponsoring).

> My quick summary is that most of the work so far has been done
> without any real hardware to play with - in 2.6.29-rc3, I don't see any
> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
> into the specific ATA or SCSI commands. Did I miss something & if not,
> do we have plans to push anything upstream soonish?

With no devices it's a bit hard. Also we need at least three pieces for
SSDs: Devices supporting trim, the T13 implementation of TRIM and the
SAT for UNMAP. We can get the latter two out of the proposals, but it's
still a bit of a moving target.

> One note on the SCSI devices, there was a T10 proposal to add an "UNMAP"
> bit to the "WRITE SAME" command for SCSI. The details of the proposed
> interface are at:
>
> http://www.t11.org/t10/document.08/08-356r4.pdf
>
> The up side of using WRITE SAME with unmap is that there are no fuzzy
> semantics about what the unmapped sectors will be - they will all be
> whatever the WRITE SAME command would have set (usually zeroes I assume).
>
> The summary of write same is that you send down one sector (say 512
> bytes of zeroes) and a count so you can do a zeroing of the target
> without having to send all of the data over the wire. Very useful for
> initializing members of a RAID device for example to a known pattern.
>
> The down side would be that if we incorrectly send down a WRITE SAME
> command to a non-thin device, I think that we would kick off a potentially
> extremely long IO. For example, imagine doing a write same of a full TB
> - that could take an hour which might be an issue :-) Of course, we
> should not be doing that if we get the code right.

As I read it, non thin provisioned devices can be identified (and may
not even accept WRITE SAME).

> I don't see another of the PDF's claims of advantages for file systems
> to be really all that useful.
>
> With either the write same and its proposed unmap bit or with the
> original T10 unmap, do we have a short list of infrastructure that needs
> to be fleshed out? Anything we can do to help get people's patches to test
> with their non-GA thin enabled devices?

Yes, REQ_DISCARD simply isn't broad enough to cope with all the
potential uses of WRITE SAME. If it's just a mechanism to get known
data into a discard sector, fine, we can set that at the lower level.
However, WRITE SAME has uses beyond TRIM in that it can be used as an
engine for data deduplication. If vendors are thinking of doing this,
then REQ_DISCARD isn't flexible enough.

> Is there a similar short list of things to be done for T13 devices with
> TRIM? Anyone have a chance to test on real hardware yet?

Not that I know of yet. It's all sort of on hold until actual devices
become available.

James
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Ric Wheeler @ 2009-02-07 16:14 UTC
To: James Bottomley
Cc: David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list, Eyal Shani

James Bottomley wrote:
> On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
>
>> I have been poked at by some vendors about the status of our support for
>> the virtually/thinly provisioned luns since they are getting close to
>> being able to test with real devices.
>
> With my LSF hat on, a certain array vendor might be sponsoring to get
> the opportunity to raise this issue more fully. The impression (mostly
> correct) is that we're thinking about trim/unmap purely from the SSD FTL
> point of view and perhaps not being as useful as we might to virtually
> provisioned LUNs ... so you could mention to the other vendors that they
> might have an interest in coming (and even possibly sponsoring).

That is probably worth bringing up - I don't see this as a large project
and it should be reasonably quick to get completed given all the work that
David and others have already put into it.

If you (with your LF hat on :-)) have a standard form or offer process,
you might want to poke at NetApp, EMC, Hitachi, IBM, HP and Dell. We both
know the names of some people in storage in a few of those companies; in
the others I have fewer contacts.

On the other hand, this might also be an opportunity to get them and
their engineers on the array side more directly and personally involved.

>> My quick summary is that most of the work so far has been done
>> without any real hardware to play with - in 2.6.29-rc3, I don't see any
>> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
>> into the specific ATA or SCSI commands. Did I miss something & if not,
>> do we have plans to push anything upstream soonish?
>
> With no devices it's a bit hard. Also we need at least three pieces for
> SSDs: Devices supporting trim, the T13 implementation of TRIM and the
> SAT for UNMAP. We can get the latter two out of the proposals, but it's
> still a bit of a moving target.

I think that it has settled a bit - do we have a good sense of the
status of the various proposals in T13 and T10?

>> One note on the SCSI devices, there was a T10 proposal to add an "UNMAP"
>> bit to the "WRITE SAME" command for SCSI. The details of the proposed
>> interface are at:
>>
>> http://www.t11.org/t10/document.08/08-356r4.pdf
>>
>> The up side of using WRITE SAME with unmap is that there are no fuzzy
>> semantics about what the unmapped sectors will be - they will all be
>> whatever the WRITE SAME command would have set (usually zeroes I assume).
>>
>> The summary of write same is that you send down one sector (say 512
>> bytes of zeroes) and a count so you can do a zeroing of the target
>> without having to send all of the data over the wire. Very useful for
>> initializing members of a RAID device for example to a known pattern.
>>
>> The down side would be that if we incorrectly send down a WRITE SAME
>> command to a non-thin device, I think that we would kick off a potentially
>> extremely long IO. For example, imagine doing a write same of a full TB
>> - that could take an hour which might be an issue :-) Of course, we
>> should not be doing that if we get the code right.
>
> As I read it, non thin provisioned devices can be identified (and may
> not even accept WRITE SAME).

I agree that the intersection of write same and thin devices is not
going to be 100%. We might end up needing both for SCSI in the worst
case I suppose.

>> I don't see another of the PDF's claims of advantages for file systems
>> to be really all that useful.
>>
>> With either the write same and its proposed unmap bit or with the
>> original T10 unmap, do we have a short list of infrastructure that needs
>> to be fleshed out? Anything we can do to help get people's patches to test
>> with their non-GA thin enabled devices?
>
> Yes, REQ_DISCARD simply isn't broad enough to cope with all the
> potential uses of WRITE SAME. If it's just a mechanism to get known
> data into a discard sector, fine, we can set that at the lower level.
> However, WRITE SAME has uses beyond TRIM in that it can be used as an
> engine for data deduplication. If vendors are thinking of doing this,
> then REQ_DISCARD isn't flexible enough.

I am more interested personally in the sparse support. On the dedup
side, I think that most implementations do not rely on write same. They
tend to compute hashes on the various blocks and so on.

>> Is there a similar short list of things to be done for T13 devices with
>> TRIM? Anyone have a chance to test on real hardware yet?
>
> Not that I know of yet. It's all sort of on hold until actual devices
> become available.
>
> James

The vendors certainly have things that they could try in their labs if
we can get bits and pieces together for them to test with. We will need
to avoid the chicken and egg scenario where they wait for us and we wait
for them :-)

Ric
* RE: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Eyal Shani @ 2009-02-12 13:51 UTC
To: Ric Wheeler, James Bottomley
Cc: David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik, linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org, IDE/ATA development list, Eyal Shani

Adding my 5 cents.

T13 added Trim to the latest ATA8 proposal.
http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1-ATAATAPI_Command_Set_-_2_ACS-2.pdf

This is after the changes put into the definition, with 'Deterministic Read after Trim'.
This is not STANDARDIZED, but pretty much accepted by all sides.

I was hoping that would settle the differences between T10/T13 on this - little did I know...

We are working with David W. on his implementation of the Trim feature, and hope to get to the bottom of the debug process soon.
Hope to update soon...

Regards,
Eyal Shani.
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Greg Freemyer @ 2009-03-23 19:05 UTC
To: Eyal Shani
Cc: Ric Wheeler, James Bottomley, David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik, linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org, IDE/ATA development list, Theodore Tso

On Thu, Feb 12, 2009 at 9:51 AM, Eyal Shani <Eyal.Shani@sandisk.com> wrote:
> Adding my 5 cents.
>
> T13 added Trim to the latest ATA8 proposal.
> http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1-ATAATAPI_Command_Set_-_2_ACS-2.pdf
>
> This is after the changes put into the definition, with 'Deterministic Read after Trim'.
> This is not STANDARDIZED, but pretty much accepted by all sides.
>
> I was hoping that would settle the differences between T10/T13 on this - little did I know...
>
> We are working with David W. on his implementation for Trim feature, and hope to get to the bottom of debug process soon.
> Hope to update soon...
>
> Regards,
> Eyal Shani.

FYI: Several of you remember I've been concerned about the lack of
"audit-ability" associated with the new Trim feature as it relates to the
T13 spec.

I finally found a contact who is on the T13 committee and have expressed
my concern. He said the issue was raised at a recent meeting of the
committee and that a sub-group was tasked with making a recommendation.

He said that he understands my concern and that he would push to ensure
that some sort of "reliable data" flag be in the eventual spec.

Obviously he is just one person, so no guarantees, but I am happy to
have finally connected with someone on the committee.

Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Mark Lord @ 2009-03-23 19:23 UTC
To: Greg Freemyer
Cc: Eyal Shani, Ric Wheeler, James Bottomley, David Woodhouse, Martin K. Petersen, Matthew Wilcox, Jeff Garzik, linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org, IDE/ATA development list, Theodore Tso

..
> On Thu, Feb 12, 2009 at 9:51 AM, Eyal Shani <Eyal.Shani@sandisk.com> wrote:
>> Adding my 5 cents.
>>
>> T13 added Trim to the latest ATA8 proposal.
>> http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1-ATAATAPI_Command_Set_-_2_ACS-2.pdf
..

Note that there is also a Rev.1a edition, same link as above
except change the d2015r1 to d2015r1a:

http://www.t13.org/Documents/UploadedDocuments/docs2009/d2015r1a-ATAATAPI_Command_Set_-_2_ACS-2.pdf
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Matthew Wilcox @ 2009-02-07 22:50 UTC
To: James Bottomley
Cc: Ric Wheeler, David Woodhouse, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sat, Feb 07, 2009 at 09:09:32AM -0600, James Bottomley wrote:
> On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
> > I have been poked at by some vendors about the status of our support for
> > the virtually/thinly provisioned luns since they are getting close to
> > being able to test with real devices.
>
> With my LSF hat on, a certain array vendor might be sponsoring to get
> the opportunity to raise this issue more fully. The impression (mostly
> correct) is that we're thinking about trim/unmap purely from the SSD FTL
> point of view and perhaps not being as useful as we might to virtually
> provisioned LUNs ... so you could mention to the other vendors that they
> might have an interest in coming (and even possibly sponsoring).

I thought we had agreed on a plan which satisfied the SSD and insane
array vendors. That is that we would do no tracking of allocation units
in the filesystem, but instead extend each trim out to cover the maximum
possible size. I've confirmed with Intel's SSD people that this would
cause them no harm at all (trimming already trimmed sectors won't even
cause a slowdown). Whether the filesystem people have taken note of
this, I have no idea.

--
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: James Bottomley @ 2009-02-07 23:03 UTC
To: Matthew Wilcox
Cc: Ric Wheeler, David Woodhouse, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sat, 2009-02-07 at 15:50 -0700, Matthew Wilcox wrote:
> On Sat, Feb 07, 2009 at 09:09:32AM -0600, James Bottomley wrote:
> > On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
> > > I have been poked at by some vendors about the status of our support for
> > > the virtually/thinly provisioned luns since they are getting close to
> > > being able to test with real devices.
> >
> > With my LSF hat on, a certain array vendor might be sponsoring to get
> > the opportunity to raise this issue more fully. The impression (mostly
> > correct) is that we're thinking about trim/unmap purely from the SSD FTL
> > point of view and perhaps not being as useful as we might to virtually
> > provisioned LUNs ... so you could mention to the other vendors that they
> > might have an interest in coming (and even possibly sponsoring).
>
> I thought we had agreed on a plan which satisfied the SSD and insane
> array vendors.

I don't think we got any input from array vendors, so it's rather hard
to claim this. So part of this idea would be gathering the necessary
inputs.

> That is that we would do no tracking of allocation units
> in the filesystem, but instead extend each trim out to cover the maximum
> possible size. I've confirmed with Intel's SSD people that this would
> cause them no harm at all (trimming already trimmed sectors won't even
> cause a slowdown). Whether the filesystem people have taken note of
> this, I have no idea.

It's one idea, but absent requirements from array vendors, we don't
really know if it's the right one.

James
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Ric Wheeler @ 2009-02-08 16:47 UTC
To: Matthew Wilcox
Cc: James Bottomley, David Woodhouse, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

Matthew Wilcox wrote:
> On Sat, Feb 07, 2009 at 09:09:32AM -0600, James Bottomley wrote:
>
>> On Sat, 2009-02-07 at 09:53 -0500, Ric Wheeler wrote:
>>
>>> I have been poked at by some vendors about the status of our support for
>>> the virtually/thinly provisioned luns since they are getting close to
>>> being able to test with real devices.
>>>
>> With my LSF hat on, a certain array vendor might be sponsoring to get
>> the opportunity to raise this issue more fully. The impression (mostly
>> correct) is that we're thinking about trim/unmap purely from the SSD FTL
>> point of view and perhaps not being as useful as we might to virtually
>> provisioned LUNs ... so you could mention to the other vendors that they
>> might have an interest in coming (and even possibly sponsoring).
>>
>
> I thought we had agreed on a plan which satisfied the SSD and insane
> array vendors. That is that we would do no tracking of allocation units
> in the filesystem, but instead extend each trim out to cover the maximum
> possible size. I've confirmed with Intel's SSD people that this would
> cause them no harm at all (trimming already trimmed sectors won't even
> cause a slowdown). Whether the filesystem people have taken note of
> this, I have no idea.
>

That should be helpful for the array people, but for some of them with
really large delete chunk sizes, they will still miss a lot since their
size is larger than the average file size :-) I guess that we could do
something to resync - Ted mentioned some ideas for ext4.

On another note, they are pondering either using write same with the
discard bit set or the unmap command. It would seem that for thin
provisioning alone, either would work.

ric
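To make the chunk-size concern above concrete, here is a small Python sketch. It assumes (this is an assumption for illustration, not something a vendor has stated in this thread) that an array can only reclaim whole, aligned chunks, so a discard frees space only for the chunks it covers completely. The function name `whole_chunks_covered` and the byte-based units are likewise made up for the example.

```python
def whole_chunks_covered(start, length, chunk_size):
    """Return (first_chunk_index, chunk_count) for the aligned chunks that
    lie entirely inside the byte range [start, start + length).

    Hypothetical model: only fully covered chunks can be reclaimed.
    """
    end = start + length
    first = -(-start // chunk_size)  # round the start up to a chunk boundary
    last = end // chunk_size         # round the end down (exclusive index)
    return first, max(0, last - first)


MiB = 1024 * 1024

# A 4 KiB discard inside a 1 MiB chunk reclaims nothing on such an array.
small = whole_chunks_covered(start=8 * 1024, length=4 * 1024, chunk_size=MiB)

# A 3 MiB discard starting half way into chunk 0 fully covers chunks 1 and 2.
large = whole_chunks_covered(start=MiB // 2, length=3 * MiB, chunk_size=MiB)
```

Under this model a discard smaller than the chunk size never reclaims anything by itself, which is why discards for small files would "miss a lot" unless ranges are merged or the array is resynced by some other means.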
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Matthew Wilcox @ 2009-02-08 20:50 UTC
To: Ric Wheeler
Cc: James Bottomley, David Woodhouse, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sun, Feb 08, 2009 at 11:47:25AM -0500, Ric Wheeler wrote:
> Matthew Wilcox wrote:
> >I thought we had agreed on a plan which satisfied the SSD and insane
> >array vendors. That is that we would do no tracking of allocation units
> >in the filesystem, but instead extend each trim out to cover the maximum
> >possible size. I've confirmed with Intel's SSD people that this would
> >cause them no harm at all (trimming already trimmed sectors won't even
> >cause a slowdown). Whether the filesystem people have taken note of
> >this, I have no idea.
>
> That should be helpful for the array people, but for some of them with
> really large delete chunk sizes, they will still miss a lot since their
> size is larger than the average file size :-) I guess that we could do
> something to resync - Ted mentioned some ideas for ext4.

I'm not sure I communicated the plan effectively.

Let's consider deleting a 4k file.

The DISCARD that the filesystem sends down does not just cover the 4k
of data. It covers all adjacent free space to that 4k of data, so it
might end up sending a DISCARD of several megabytes or even gigabytes,
assuming there's that much contiguous free space.

Now, filesystems which fragment their free space will not do well on
thin provisioned devices, but then they won't do well on any devices --
keeping your free space compacted is an essential part of any filesystem's
job, even on SSDs.

--
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
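The plan described above (a discard that grows to cover all free space adjacent to the freed extent) can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in any filesystem or in the trim tree: the free-space map is modelled as a plain list of booleans and the name `extend_discard` is hypothetical.

```python
def extend_discard(free, start, length):
    """Grow the extent [start, start + length) to include every contiguous
    free block on either side of it.

    free   -- list of bools, True where the block is free; the blocks of
              the just-deleted file are assumed to be marked free already
    return -- (new_start, new_length) of the discard to send down
    """
    lo = start
    while lo > 0 and free[lo - 1]:
        lo -= 1                       # extend downwards over free blocks
    hi = start + length
    while hi < len(free) and free[hi]:
        hi += 1                       # extend upwards over free blocks
    return lo, hi - lo


# Block 3 was just freed and sits in a run of free blocks 1..4.
compact = [False, True, True, True, True, False]
merged = extend_discard(compact, start=3, length=1)

# With fragmented free space there is nothing to merge with.
fragmented = [False, True, False, True, False]
isolated = extend_discard(fragmented, start=1, length=1)
```

In the compact case the one-block delete turns into a four-block discard; in the fragmented case it stays a single block, which is the situation the last paragraph above warns about.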
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Ric Wheeler @ 2009-02-08 23:58 UTC
To: Matthew Wilcox
Cc: James Bottomley, David Woodhouse, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

Matthew Wilcox wrote:
> On Sun, Feb 08, 2009 at 11:47:25AM -0500, Ric Wheeler wrote:
>
>> Matthew Wilcox wrote:
>>
>>> I thought we had agreed on a plan which satisfied the SSD and insane
>>> array vendors. That is that we would do no tracking of allocation units
>>> in the filesystem, but instead extend each trim out to cover the maximum
>>> possible size. I've confirmed with Intel's SSD people that this would
>>> cause them no harm at all (trimming already trimmed sectors won't even
>>> cause a slowdown). Whether the filesystem people have taken note of
>>> this, I have no idea.
>>>
>> That should be helpful for the array people, but for some of them with
>> really large delete chunk sizes, they will still miss a lot since their
>> size is larger than the average file size :-) I guess that we could do
>> something to resync - Ted mentioned some ideas for ext4.
>>
>
> I'm not sure I communicated the plan effectively.
>
> Let's consider deleting a 4k file.
>
> The DISCARD that the filesystem sends down does not just cover the 4k
> of data. It covers all adjacent free space to that 4k of data, so it
> might end up sending a DISCARD of several megabytes or even gigabytes,
> assuming there's that much contiguous free space.
>
> Now, filesystems which fragment their free space will not do well on
> thin provisioned devices, but then they won't do well on any devices --
> keeping your free space compacted is an essential part of any filesystem's
> job, even on SSDs.
>

Thanks - that does sound like it will in fact help clean up. I suppose
the worst case would be deleting lots of non-contiguous small files from
a full file system (say every other 4KB or something obscure like that).

I will see what the vendors I know have come up with, I think that this
should give them something interesting to play with....

Ric
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Matthew Wilcox @ 2009-02-07 22:47 UTC
To: Ric Wheeler
Cc: David Woodhouse, James Bottomley, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sat, Feb 07, 2009 at 09:53:06AM -0500, Ric Wheeler wrote:
> I have been poked at by some vendors about the status of our support for
> the virtually/thinly provisioned luns since they are getting close to
> being able to test with real devices.
>
> My quick summary is that most of the work so far has been done
> without any real hardware to play with - in 2.6.29-rc3, I don't see any
> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
> into the specific ATA or SCSI commands. Did I miss something & if not,
> do we have plans to push anything upstream soonish?

Bearing in mind that I'm now three weeks behind on email, you might want
to look at
http://git.kernel.org/?p=linux/kernel/git/willy/ssd.git;a=shortlog;h=trim-20081231
which has at least one known bug (fixed by Dave Woodhouse and Ben
Herrenschmidt). I'll be able to give a more coherent answer in a few
days. Or maybe Dave will beat me to it ;-)

--
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: David Woodhouse @ 2009-02-07 23:36 UTC
To: Matthew Wilcox
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

> On Sat, Feb 07, 2009 at 09:53:06AM -0500, Ric Wheeler wrote:
>> I have been poked at by some vendors about the status of our support for
>> the virtually/thinly provisioned luns since they are getting close to
>> being able to test with real devices.
>>
>> My quick summary is that most of the work so far has been done
>> without any real hardware to play with - in 2.6.29-rc3, I don't see any
>> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
>> into the specific ATA or SCSI commands. Did I miss something & if not,
>> do we have plans to push anything upstream soonish?
>
> Bearing in mind that I'm now three weeks behind on email, you might want
> to look at
> http://git.kernel.org/?p=linux/kernel/git/willy/ssd.git;a=shortlog;h=trim-20081231
> which has at least one known bug (fixed by Dave Woodhouse and Ben
> Herrenschmidt). I'll be able to give a more coherent answer in a few
> days. Or maybe Dave will beat me to it ;-)

Ben's suggestion was that the IDE core wouldn't be sending the payload of
the command because it looks at the R/W bit... which is clear (read) in
our discard requests ATM. Making them appear to be writes is simple enough
though. I gave an updated test kernel to the Sandisk folks but haven't got
results back from them yet.
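For readers wondering what the "payload of the command" mentioned above looks like, here is a hedged Python sketch of a TRIM data buffer. It follows the LBA range entry layout of DATA SET MANAGEMENT as later published in ACS-2 (8-byte little-endian entries, 48-bit LBA in the low bits, 16-bit block count in the high bits, buffer padded to 512-byte blocks); the drafts cited in this thread may differ in detail, and this is not the code from the trim tree. The helper name `trim_payload` is made up for the example.

```python
import struct

MAX_RANGE_BLOCKS = 0xFFFF  # the length field of one entry is 16 bits wide
PAYLOAD_BLOCK = 512        # the data buffer is sent in 512-byte units


def trim_payload(ranges):
    """Encode (lba, count) pairs as TRIM range entries (hypothetical helper).

    Ranges longer than 65535 blocks are split across several entries and
    the result is zero-padded to a multiple of 512 bytes; all-zero entries
    describe an empty range.
    """
    entries = []
    for lba, count in ranges:
        while count > 0:
            if not 0 <= lba < 2**48:
                raise ValueError("LBA must fit in 48 bits")
            n = min(count, MAX_RANGE_BLOCKS)
            # bits 47:0 = starting LBA, bits 63:48 = number of blocks
            entries.append(struct.pack("<Q", (n << 48) | lba))
            lba += n
            count -= n
    data = b"".join(entries)
    return data + bytes(-len(data) % PAYLOAD_BLOCK)


buf = trim_payload([(0x10, 8)])
```

Because this buffer travels from host to device, the request has to be treated as a data-out (write-direction) transfer, which fits the observation above that a request flagged as a read would not get its payload sent.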
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Jeff Garzik @ 2009-02-07 23:46 UTC
To: Matthew Wilcox
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen, linux-scsi, linux-fsdevel, IDE/ATA development list

Matthew Wilcox wrote:
> On Sat, Feb 07, 2009 at 09:53:06AM -0500, Ric Wheeler wrote:
>> I have been poked at by some vendors about the status of our support for
>> the virtually/thinly provisioned luns since they are getting close to
>> being able to test with real devices.
>>
>> My quick summary is that most of the work so far has been done
>> without any real hardware to play with - in 2.6.29-rc3, I don't see any
>> low level ATA or SCSI bits that turn requests tagged with REQ_DISCARD
>> into the specific ATA or SCSI commands. Did I miss something & if not,
>> do we have plans to push anything upstream soonish?
>
> Bearing in mind that I'm now three weeks behind on email, you might want
> to look at
> http://git.kernel.org/?p=linux/kernel/git/willy/ssd.git;a=shortlog;h=trim-20081231
> which has at least one known bug (fixed by Dave Woodhouse and Ben
> Herrenschmidt). I'll be able to give a more coherent answer in a few
> days. Or maybe Dave will beat me to it ;-)

BTW when will somebody send me the 4k sector patches? :)

Jeff
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Matthew Wilcox @ 2009-02-08 0:24 UTC
To: Jeff Garzik
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sat, Feb 07, 2009 at 06:46:42PM -0500, Jeff Garzik wrote:
> BTW when will somebody send me the 4k sector patches? :)

I'll get to that on Monday; just arrived back from holiday today.

--
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Greg Freemyer @ 2009-02-08 20:06 UTC
To: Ric Wheeler
Cc: David Woodhouse, James Bottomley, Martin K. Petersen, Matthew Wilcox, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sat, Feb 7, 2009 at 9:53 AM, Ric Wheeler <rwheeler@redhat.com> wrote:
>
> I have been poked at by some vendors about the status of our support for the
> virtually/thinly provisioned luns since they are getting close to being able
> to test with real devices.

I found a list of T10 activities since just Dec. 1, 2008, and it is a
bit overwhelming (i.e., 08-356r4 is but one of many recent reports):

http://www.t10.org/new_a.htm

For those of us who don't live and breathe the SCSI spec, is there an
overview site describing what is going on?

Maybe:

09-059r0 T10 Project Summary - January 2009 John Lohmeyer PDF (34729) 2009/01/22
http://www.t10.org/cgi-bin/ac.pl?t=d&f=09-059r0.pdf

I have not read any of the post-Dec. 1 material, including the above
project summary, but based on the names these seem potentially relevant:

09-055r0 T13 Liaison Report January 09 Dan Colegrove PDF (4770) 2009/01/15
08-356r4 SBC-3: WRITE SAME unmap bit David L. Black PDF (56608) 2008/12/10
08-356r5 SBC-3 Thin Provisioning Commands Fred Knight, David L. Black PDF (387549) 2009/01/15
08-149r7 SBC - Thin Provisioning Frederick Knight PDF (281001) 2008/12/08
08-149r8 SBC - Thin Provisioning Frederick Knight PDF (281387) 2009/01/09
09-011r1 SBC-3 Thin Provisioning Threshold Notification Frederick Knight PDF (32757) 2009/01/09
08-149r9 SBC - Thin Provisioning Frederick Knight PDF (353888) 2009/01/15
09-012r0 Minutes: CAP - Thin Provisioning 12/4 con-call Frederick Knight PDF (38063) 2008/12/08
09-011r0 SBC-3 Thin Provisioning Threshold Notification Frederick Knight PDF (48523) 2008/12/08
08-396r3 SPC-4: Reporting support for all DIF types George Penokie PDF (85358) 2009/01/14
09-058r0 Agenda for T10 Meeting #90 March 2009 John Lohmeyer PDF (61437) 2009/01/19
09-020r0 T11 Liaison Report, December 2008 Robert Snively PDF (13117) 2008/12/19
09-032r0 Minutes of T10 Plenary Meeting #89 - January 15, 2009 Weber & Lohmeyer HTM (141593) 2009/01/23
09-032r0 Minutes of T10 Plenary Meeting #89 - January 15, 2009 Weber & Lohmeyer PDF (344891) 2009/01/23

Greg
-- 
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Matthew Wilcox @ 2009-02-08 20:44 UTC
To: Greg Freemyer
Cc: Ric Wheeler, David Woodhouse, James Bottomley, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

On Sun, Feb 08, 2009 at 03:06:44PM -0500, Greg Freemyer wrote:
> I found a list of T10 activities just since just Dec. 1, 2008 and it
> is a bit overwhelming. (ie. 08-356r4 is but one of many recent
> reports)
>
> http://www.t10.org/new_a.htm
>
> For those of us that don't live and breath the SCSI spec, is there an
> overview site describing what is going on.

I've been working off 08-149r7.pdf. I'm sure that's been superseded by
now.

> 08-356r4 SBC-3: WRITE SAME unmap bit David L. Black PDF (56608) 2008/12/10

Probably interesting. Haven't read it myself.

> 08-356r5 SBC-3 Thin Provisioning Commands Fred Knight, David L.
> Black PDF (387549) 2009/01/15

Fred Knight seems to be the main coordinator of this effort, so yes.

> 08-149r7 SBC - Thin Provisioning Frederick Knight PDF (281001) 2008/12/08

That's the one I'm working from.

> 08-149r8 SBC - Thin Provisioning Frederick Knight PDF (281387) 2009/01/09

A newer version ... thought so.

> 09-011r1 SBC-3 Thin Provisioning Threshold Notification Frederick
> Knight PDF (32757) 2009/01/09

Clearly related.

> 08-149r9 SBC - Thin Provisioning Frederick Knight PDF (353888) 2009/01/15

Even newer version of what I've been working from.

> 09-012r0 Minutes: CAP - Thin Provisioning 12/4 con-call Frederick
> Knight PDF (38063) 2008/12/08

Probably tedious.

> 08-396r3 SPC-4: Reporting support for all DIF types George Penokie
> PDF (85358) 2009/01/14

Unrelated, I would think.

I'd go with 08-149r9 to get a good overview.

-- 
Matthew Wilcox
Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: TRIM vs UNMAP vs WRITE SAME and thin devices
From: Ric Wheeler @ 2009-02-09 0:01 UTC
To: Matthew Wilcox
Cc: Greg Freemyer, David Woodhouse, James Bottomley, Martin K. Petersen, Jeff Garzik, linux-scsi, linux-fsdevel, IDE/ATA development list

Matthew Wilcox wrote:
> On Sun, Feb 08, 2009 at 03:06:44PM -0500, Greg Freemyer wrote:
>
>> I found a list of T10 activities just since just Dec. 1, 2008 and it
>> is a bit overwhelming. (ie. 08-356r4 is but one of many recent
>> reports)
>>
>> http://www.t10.org/new_a.htm
>>
>> For those of us that don't live and breath the SCSI spec, is there an
>> overview site describing what is going on.
>
> I've been working off 08-149r7.pdf. I'm sure that's been superseded by
> now.
>
>> 08-356r4 SBC-3: WRITE SAME unmap bit David L. Black PDF (56608) 2008/12/10
>
> Probably interesting. Haven't read it myself.

This is only a four-page proposal. Basically, we would send the WRITE
SAME command with a special unmap bit set to tell the target that it
may (at its option) unmap the blocks. If the target does not unmap
them, it must actually write the pattern given in the command to those
blocks, which I presume would be all zeros in the normal case.

>> 08-356r5 SBC-3 Thin Provisioning Commands Fred Knight, David L.
>> Black PDF (387549) 2009/01/15
>
> Fred Knight seems to be the main coordinator of this effort, so yes.

Fred and David Black have both been quite active.

>> 08-149r7 SBC - Thin Provisioning Frederick Knight PDF (281001) 2008/12/08
>
> That's the one I'm working from.
>
>> 08-149r8 SBC - Thin Provisioning Frederick Knight PDF (281387) 2009/01/09
>
> A newer version ... thought so.
>
>> 09-011r1 SBC-3 Thin Provisioning Threshold Notification Frederick
>> Knight PDF (32757) 2009/01/09
>
> Clearly related.
>
>> 08-149r9 SBC - Thin Provisioning Frederick Knight PDF (353888) 2009/01/15
>
> Even newer version of what I've been working from.
>
>> 09-012r0 Minutes: CAP - Thin Provisioning 12/4 con-call Frederick
>> Knight PDF (38063) 2008/12/08
>
> Probably tedious.
>
>> 08-396r3 SPC-4: Reporting support for all DIF types George Penokie
>> PDF (85358) 2009/01/14
>
> Unrelated, I would think.
>
> I'd go with 08-149r9 to get a good overview.
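The WRITE SAME with unmap behaviour described in the message above can be sketched as a toy model. This is only an illustration under stated assumptions, not kernel code and not the T10 command format: the class `ThinLun`, its `can_unmap` flag, the 512-byte block size, and the rule that only an all-zero pattern may be unmapped (so that later reads still match the pattern) are all hypothetical choices made for this sketch.

```python
BLOCK_SIZE = 512  # assumed logical block size for this toy model


class ThinLun:
    """Hypothetical in-memory model of a LUN; not a real driver API."""

    def __init__(self, num_blocks, can_unmap=True):
        self.num_blocks = num_blocks
        self.can_unmap = can_unmap  # False models a non-thin device
        self.mapped = {}  # lba -> block contents; absent LBAs are unmapped

    def read(self, lba):
        # Assumption: unmapped blocks read back as zeroes.
        return self.mapped.get(lba, bytes(BLOCK_SIZE))

    def write(self, lba, data):
        self.mapped[lba] = data

    def write_same(self, lba, count, pattern, unmap=False):
        """Set `count` blocks starting at `lba` to the one-block `pattern`.

        With unmap=True the target may, at its option, drop the mapping
        instead of writing. Either way, a later read returns the pattern.
        """
        if len(pattern) != BLOCK_SIZE:
            raise ValueError("pattern must be exactly one block")
        if lba < 0 or count < 0 or lba + count > self.num_blocks:
            raise ValueError("range outside the device")
        zero_pattern = pattern == bytes(BLOCK_SIZE)
        for block in range(lba, lba + count):
            if unmap and self.can_unmap and zero_pattern:
                self.mapped.pop(block, None)  # free the backing storage
            else:
                self.mapped[block] = pattern  # really write every block


if __name__ == "__main__":
    zeros = bytes(BLOCK_SIZE)
    thin = ThinLun(num_blocks=1024)
    thin.write(10, b"\xff" * BLOCK_SIZE)
    thin.write_same(0, 1024, zeros, unmap=True)
    print(len(thin.mapped), thin.read(10) == zeros)

    thick = ThinLun(num_blocks=1024, can_unmap=False)
    thick.write_same(0, 1024, zeros, unmap=True)
    print(len(thick.mapped), thick.read(10) == zeros)
```

In this model both devices return zeroes on every read afterwards, but the device that cannot unmap has written all 1024 blocks. That is the cost raised at the start of the thread for sending WRITE SAME to a non-thin device by mistake.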
Thread overview: 18+ messages
[not found] <20090123041558.GC24652@parisc-linux.org>
[not found] ` <4979AF62.7070409@redhat.com>
[not found] ` <1232721777.4430.7.camel@macbook.infradead.org>
2009-02-07 14:53 ` TRIM vs UNMAP vs WRITE SAME and thin devices Ric Wheeler
2009-02-07 15:09 ` James Bottomley
2009-02-07 16:14 ` Ric Wheeler
2009-02-12 13:51 ` Eyal Shani
2009-03-23 19:05 ` Greg Freemyer
2009-03-23 19:23 ` Mark Lord
2009-02-07 22:50 ` Matthew Wilcox
2009-02-07 23:03 ` James Bottomley
2009-02-08 16:47 ` Ric Wheeler
2009-02-08 20:50 ` Matthew Wilcox
2009-02-08 23:58 ` Ric Wheeler
2009-02-07 22:47 ` Matthew Wilcox
2009-02-07 23:36 ` David Woodhouse
2009-02-07 23:46 ` Jeff Garzik
2009-02-08 0:24 ` Matthew Wilcox
2009-02-08 20:06 ` Greg Freemyer
2009-02-08 20:44 ` Matthew Wilcox
2009-02-09 0:01 ` Ric Wheeler