From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Greg Freemyer <greg.freemyer@gmail.com>
Cc: Ric Wheeler <rwheeler@redhat.com>,
sandeen@redhat.com, Eric Sandeen <esandeen@redhat.com>,
Jeff Moyer <jmoyer@redhat.com>, Mark Lord <kernel@teksavvy.com>,
Lukas Czerner <lczerner@redhat.com>,
linux-ext4@vger.kernel.org, Edward Shishkin <eshishki@redhat.com>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH 2/2] Add batched discard support for ext4.
Date: Wed, 21 Apr 2010 17:56:47 -0400 [thread overview]
Message-ID: <1271887007.2893.352.camel@mulgrave.site> (raw)
In-Reply-To: <n2x87f94c371004211447w1e57ae43j4c988cf0205d7ec2@mail.gmail.com>
On Wed, 2010-04-21 at 17:47 -0400, Greg Freemyer wrote:
> Adding James Bottomley because high-end scsi is entering the
> discussion. James, I have a couple scsi questions for you at the end.
>
> On Wed, Apr 21, 2010 at 5:03 PM, Ric Wheeler <rwheeler@redhat.com> wrote:
> > On 04/21/2010 05:01 PM, Eric Sandeen wrote:
> >>
> >> On 04/21/2010 03:44 PM, Greg Freemyer wrote:
> >>
> >>
> >>>
> >>> Mark's benchmarks showed this as doable in seconds which seems like a
> >>> reasonable amount of time for a mount time operation.
> >>>
> >>
> >> All the other things aside, mount-time is interesting, but it's an
> >> infrequent operation, at least in my world. I think we need something
> >> that can be done runtime.
> >>
> >> For anything with uptime, I don't think it's acceptable to wait until
> >> the next mount to trim unused blocks.
So what's wrong with using wiper.sh? It can do online discard of
filesystems that support delayed allocation (ext4, xfs etc.)?
> >> But as long as the mechanism can be called either at mount time and/or
> >> kicked off runtime somehow, I'm happy.
> >>
> >> -Eric
> >>
> >
> > That makes sense to me. Most enterprise servers will go without remounting
> > a file system for (hopefully!) a very long time.
> >
> > It is really important to keep in mind that this is not just a laptop
> > feature for laptop SSD's, this is also used by high end arrays and *could*
> > be useful for virt IO, etc as well :-)
> >
> > ric
>
> I'm not arguing that a runtime solution is not needed.
>
> I'm arguing that at least for SSD backed filesystems Mark's userspace
> implementation shows how the mount time initialization of the runtime
> bitmap can be accomplished in a few seconds by leveraging the hardware
> and using vector'ed trims as opposed to having to build an additional
> on-disk structure.
>
> At least for SSDs, the primary purpose of the proposed on-disk
> structure seems to be to overcome the current lack of a vector'ed
> discard implementation.
>
> If it is too difficult to implement a fully functional vector'ed
> discard in the block layer due to locking issues, possibly a special
> purpose version could be written that is only used at mount time when
> one can be assured no other i/o is occurring to the filesystem.
>
> James,
>
> The ATA-8 spec. supports vectored trims and requires a minimum of 255
> sectors worth of range payload be supported. That equates to a single
> trim being able to trim thousands of ranges in one command.
>
> Mark Lord has benchmarked in found a vectored trim to be drastically
> faster than calling trim individually for each of those ranges.
>
> Does scsi support vector'ed discard? (ie. write-same commands)
only with UNMAP. WRITE SAME is effectively single range.
> Or are high-end scsi arrays so fast they can process tens of thousands
> of discard commands in a reasonable amount of time, unlike the SSDs
> have so far proven to do.
No ... they actually have two problems: firstly they can only use
discard ranges which align with their internal block size (usually
something huge like 3/4MB) and then a trim operation tends to be O(1)
and slow, so they'd actually like discard accumulation.
> It would be interesting to find out that a SSD can discard thousands
> of ranges drastically faster than a high-end scsi device can. But if
> true, that might argue for the on-disk bitmap to track previously
> discarded blocks/extents.
I think SSDs and Arrays both have discard problems, arrays more to do
with the time and expense of the operation, SSDs because the TRIM
command isn't queued.
James
next prev parent reply other threads:[~2010-04-21 21:56 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-19 10:55 Ext4: batched discard support Lukas Czerner
2010-04-19 10:55 ` [PATCH 1/2] Add ioctl FITRIM Lukas Czerner
2010-04-19 10:55 ` [PATCH 2/2] Add batched discard support for ext4 Lukas Czerner
2010-04-20 21:21 ` Greg Freemyer
2010-04-21 2:26 ` Mark Lord
2010-04-21 2:45 ` Eric Sandeen
2010-04-21 18:59 ` Greg Freemyer
2010-04-21 19:04 ` Ric Wheeler
2010-04-21 19:22 ` Jeff Moyer
2010-04-21 20:44 ` Greg Freemyer
2010-04-21 20:53 ` Greg Freemyer
2010-04-21 21:01 ` Eric Sandeen
2010-04-21 21:03 ` Ric Wheeler
2010-04-21 21:47 ` Greg Freemyer
2010-04-21 21:56 ` James Bottomley [this message]
2010-04-21 21:59 ` Mark Lord
2010-04-23 8:23 ` Lukas Czerner
2010-04-24 13:24 ` Greg Freemyer
2010-04-24 13:48 ` Ric Wheeler
2010-04-24 14:30 ` Greg Freemyer
2010-04-24 14:43 ` Eric Sandeen
2010-04-24 15:03 ` Greg Freemyer
2010-04-24 17:04 ` Ric Wheeler
2010-04-24 18:30 ` Greg Freemyer
2010-04-24 18:41 ` Ric Wheeler
2010-04-26 14:00 ` Mark Lord
2010-04-26 14:42 ` Martin K. Petersen
2010-04-26 15:27 ` Greg Freemyer
2010-04-26 15:51 ` Lukas Czerner
2010-04-28 1:25 ` Mark Lord
2010-04-26 15:48 ` Ric Wheeler
2010-04-24 19:06 ` Martin K. Petersen
2010-04-26 14:03 ` Mark Lord
2010-04-24 18:39 ` Martin K. Petersen
2010-04-26 16:55 ` Jan Kara
2010-04-26 17:46 ` Lukas Czerner
2010-04-26 17:52 ` Ric Wheeler
2010-04-26 18:14 ` Lukas Czerner
2010-04-26 18:28 ` Jeff Moyer
2010-04-26 18:38 ` [PATCH 2/2] Add batched discard support for ext4 - using rbtree Lukas Czerner
2010-04-26 18:42 ` Lukas Czerner
2010-04-27 15:29 ` Edward Shishkin
2010-04-21 20:52 ` [PATCH 2/2] Add batched discard support for ext4 Greg Freemyer
2010-04-19 16:20 ` Ext4: batched discard support Greg Freemyer
2010-04-19 16:30 ` Eric Sandeen
2010-04-19 17:58 ` Greg Freemyer
2010-04-19 18:04 ` Ric Wheeler
2010-04-20 20:24 ` Mark Lord
2010-04-20 20:34 ` Mark Lord
-- strict thread matches above, loose matches on Subject: below --
2010-07-07 7:53 Ext4: batched discard support - simplified version Lukas Czerner
2010-07-07 7:53 ` [PATCH 2/2] Add batched discard support for ext4 Lukas Czerner
2010-07-14 8:33 ` Dmitry Monakhov
2010-07-14 9:40 ` Lukas Czerner
2010-07-14 10:03 ` Dmitry Monakhov
2010-07-14 11:43 ` Lukas Czerner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1271887007.2893.352.camel@mulgrave.site \
--to=james.bottomley@hansenpartnership.com \
--cc=esandeen@redhat.com \
--cc=eshishki@redhat.com \
--cc=greg.freemyer@gmail.com \
--cc=hch@infradead.org \
--cc=jmoyer@redhat.com \
--cc=kernel@teksavvy.com \
--cc=lczerner@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=rwheeler@redhat.com \
--cc=sandeen@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).