From: David Chinner <dgc@sgi.com>
To: Theodore Tso <tytso@mit.edu>
Cc: David Chinner <dgc@sgi.com>, Jeff Garzik <jeff@garzik.org>,
Barry Naujok <bnaujok@melbourne.sgi.com>,
"'Dave Kleikamp'" <shaggy@austin.ibm.com>,
"'Alex Tomas'" <alex@clusterfs.com>, "'Jan Kara'" <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [RFC] Ext3 online defrag
Date: Thu, 26 Oct 2006 16:36:48 +1000
Message-ID: <20061026063648.GE8394166@melbourne.sgi.com>
In-Reply-To: <20061026033316.GC27858@thunk.org>
On Wed, Oct 25, 2006 at 11:33:16PM -0400, Theodore Tso wrote:
> On Thu, Oct 26, 2006 at 11:40:20AM +1000, David Chinner wrote:
> > We don't need to expose anything filesystem specific to userspace to
> > implement this. Online data movement (i.e. the defrag mechanism)
> > becomes something like:
> >
> > 	do {
> > 		get_free_list(dst_fd, location, len, list)
> > 		/* select extent to use */
> > 		alloc_from_list(dst_fd, list[X], off, len)
> > 	} while (ENOALLOC)
> > 	move_data(src_fd, dst_fd, off, len);
> >
> > And this would work on any filesystem type that implemented these
> > interfaces. Hence tools like a startup file optimiser would
> > only need to be written once, rather than needing a different
> > tool for every different filesystem type.....
>
> Yeah, but that's simply not enough.
Not enough for what?
> A good defragger needs to know
Oh, we're back to defrag again. :/
> about a filesystem's allocation policies, and move files so they are
> optimally located, given the filesystem layout. For example, in
> ext2/3/4 we will want to move blocks so they are in the same block group
> as the inode. That's filesystem specific information; other
> filesystems will require different policies.
And a good chunk of those policies will be common. The above policy
has been around for many, many years and is implemented in many, many
filesystems (even XFS).
> > get_free_list(dst_fd, location, len, list)
location == allocation policy, e.g. give me a list of free blocks:
- anywhere (default filesystem policy applies)
- near block number X
- at block X
- in block/allocation group Y
- of the largest contiguous regions in (one of the above)
- at least N blocks in length
- near inode src_fd
- in storage tier 3
Then you select one of the regions that was returned and attempt
to allocate it.
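To make the "select one of the regions" step concrete, here is a minimal
sketch in C. Nothing in it is an existing interface: struct free_extent,
block_distance() and pick_extent() are hypothetical names, and the selection
rule (closest start block among regions that are long enough) is just one
policy an application might choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One free region as get_free_list() might return it (hypothetical layout). */
struct free_extent {
	uint64_t start;	/* first free block */
	uint64_t len;	/* number of contiguous free blocks */
};

/* Distance between two block numbers, avoiding unsigned wraparound. */
static uint64_t block_distance(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}

/*
 * Return the index of the extent with at least min_len blocks whose
 * start is closest to block "target", or -1 if none is long enough.
 * On a tie the earlier list entry wins.
 */
static long pick_extent(const struct free_extent *list, size_t n,
			uint64_t min_len, uint64_t target)
{
	long best = -1;

	assert(list != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (list[i].len < min_len)
			continue;
		if (best < 0 ||
		    block_distance(list[i].start, target) <
		    block_distance(list[best].start, target))
			best = (long)i;
	}
	return best;
}
```

The index that comes back is the list[X] that would be handed to
alloc_from_list(); a -1 means the caller has to ask again with a looser
location.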
You can put whatever filesystem-specific stuff you need around this
to arrive at the decision of where to put the file, but you've got
to allocate the new blocks, move the data to them, and swap them
over. Every defragger needs to do this, regardless of the filesystem
type. So why not provide a framework for it, especially as the
framework is useful for far more than just the data movement part
of a defrag application?
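The allocate/move/swap sequence can be sketched against a toy in-memory
"filesystem". This is only an illustration under loud assumptions: struct
toy_fs, struct extent and relocate() are made up, the three calls take
simplified arguments instead of file descriptors, ENOALLOC is the
placeholder error from the pseudocode and not a real errno, and move_data()
stands in for the copy plus extent swap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NBLOCKS  32
#define ENOALLOC 1	/* placeholder error from the pseudocode, not a real errno */

/* Toy filesystem: an in-use map plus one byte of data per block. */
struct toy_fs {
	bool used[NBLOCKS];
	char data[NBLOCKS];
};

struct extent { size_t start, len; };

/* Collect free runs of at least min_len blocks; returns how many were stored. */
static size_t get_free_list(const struct toy_fs *fs, size_t min_len,
			    struct extent *list, size_t max)
{
	size_t n = 0, i = 0;

	while (i < NBLOCKS && n < max) {
		if (fs->used[i]) {
			i++;
			continue;
		}
		size_t start = i;
		while (i < NBLOCKS && !fs->used[i])
			i++;
		if (i - start >= min_len) {
			list[n].start = start;
			list[n].len = i - start;
			n++;
		}
	}
	return n;
}

/* Claim len blocks at start; fails if any of them was taken in the meantime. */
static int alloc_from_list(struct toy_fs *fs, size_t start, size_t len)
{
	assert(start + len <= NBLOCKS);
	for (size_t i = start; i < start + len; i++)
		if (fs->used[i])
			return -ENOALLOC;
	for (size_t i = start; i < start + len; i++)
		fs->used[i] = true;
	return 0;
}

/* Copy the data to the new blocks and release the old ones. */
static void move_data(struct toy_fs *fs, size_t src, size_t dst, size_t len)
{
	memcpy(&fs->data[dst], &fs->data[src], len);
	memset(&fs->used[src], 0, len * sizeof(fs->used[0]));
}

/* Move len blocks starting at src; returns the new start, or -1 if nothing fits. */
static long relocate(struct toy_fs *fs, size_t src, size_t len)
{
	struct extent list[8];

	for (;;) {
		if (get_free_list(fs, len, list, 8) == 0)
			return -1;
		/* select extent to use: first fit, purely for illustration */
		if (alloc_from_list(fs, list[0].start, len) == 0) {
			move_data(fs, src, list[0].start, len);
			return (long)list[0].start;
		}
		/* lost a race for that region; fetch a fresh list and retry */
	}
}
```

In this single-threaded toy the allocation never actually loses a race, so
the retry branch is there only to mirror the do/while (ENOALLOC) shape of
the pseudocode. The filesystem-specific part is confined to how the extent
is chosen; the loop around it is the same for any filesystem.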
> > Remember, I'm not just talking about defrag - I'm talking about
> > an interface that is actually useful to apps that might care
> > about how data is laid out on disk but the application writers
> > don't know anything about how filesystem X or Y or Z is
> > implemented. Putting the burden of learning about filesystem
> > internals on application developers is not the correct solution.
>
> Unfortunately, if you want to do a good job, a defragger *has* to know
> about some very low-level filesystem specific information, if it wants
> to do a good job.
Back to defrag. Again. Bigger picture, guys, bigger picture.....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group