From: Andreas Dilger <adilger@sun.com>
To: Mark Fasheh <mfasheh@suse.com>
Cc: linux-fsdevel@vger.kernel.org, Josef Bacik <jbacik@redhat.com>
Subject: Re: [PATCH 1/5] vfs: vfs-level fiemap interface
Date: Wed, 28 May 2008 13:42:15 -0600
Message-ID: <20080528194215.GI7263@webber.adilger.int>
In-Reply-To: <20080525000157.GK8325@wotan.suse.de>
On May 24, 2008 17:01 -0700, Mark Fasheh wrote:
> Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
> inode operation.
Mark, I was looking at a way to remove the special-casing of NUM_EXTENTS
from ioctl_fiemap() in an effort to remove Christoph's objection to
keeping these in the same ioctl.
I think it is possible and reasonable to move the special-case handling
into fiemap_fill_next_extent().
> +static int ioctl_fiemap(struct file *filp, unsigned long arg)
> +{
> +
> + if (!(fiemap.fm_flags & FIEMAP_FLAG_NUM_EXTENTS) &&
> + (fiemap.fm_extent_count == 0 ||
> + fiemap.fm_extent_count > FIEMAP_MAX_EXTENTS))
> + return -EINVAL;
This can be changed to check only:
	if (fiemap.fm_extent_count > FIEMAP_MAX_EXTENTS)
		return -EINVAL;
> + fieinfo.fi_flags = fiemap.fm_flags;
> + if (!(fiemap.fm_flags & FIEMAP_FLAG_NUM_EXTENTS)) {
> + fieinfo.fi_extents_max = fiemap.fm_extent_count;
> + fieinfo.fi_extents_start = (char *)arg + sizeof(fiemap);
> +
> + if (!access_ok(VERIFY_WRITE, fieinfo.fi_extents_start,
> + fieinfo.fi_extents_max * sizeof(struct fiemap_extent)))
> + return -EFAULT;
> + }
It is harmless to set fi_extents_max and fi_extents_start unconditionally,
as both are ignored for NUM_EXTENTS. fiemap_fill_next_extent() already
checks via copy_to_user() whether the fi_extents_start pointer is valid,
and in the NUM_EXTENTS case it doesn't even get far enough to look at
fi_extents_max or fi_extents_start. We just do:
	fieinfo.fi_extents_max = fiemap.fm_extent_count;
	fieinfo.fi_extents_start = (struct fiemap_extent *)((char *)arg +
							    sizeof(fiemap));
This leaves us with no checks for FIEMAP_FLAG_NUM_EXTENTS in ioctl_fiemap()
at all, and no changes needed in fiemap_fill_next_extent().
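To make the control flow concrete, here is a rough userspace sketch of a fill helper that handles the count-only case itself. It is not the patch code: the sk_* names and SK_FLAG_NUM_EXTENTS are hypothetical stand-ins, and a plain array store stands in for copy_to_user(). The point is only that the count-only branch returns before extents_max or extents_start are read, so the caller can set them unconditionally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions, for illustration only. */
#define SK_FLAG_NUM_EXTENTS 0x1u	/* caller only wants the extent count */

struct sk_extent {
	uint64_t logical;
	uint64_t physical;
	uint64_t length;
};

struct sk_extent_info {
	unsigned int flags;		/* copy of the caller's flags */
	unsigned int extents_mapped;	/* extents seen so far */
	unsigned int extents_max;	/* capacity of the destination array */
	struct sk_extent *extents_start; /* destination array, may be NULL */
};

/*
 * Returns 0 to continue, 1 once the destination array is full.
 * In count-only mode extents_max and extents_start are never read.
 */
static int sk_fill_next_extent(struct sk_extent_info *info, uint64_t logical,
			       uint64_t physical, uint64_t length)
{
	struct sk_extent *dest;

	if (info->flags & SK_FLAG_NUM_EXTENTS) {
		info->extents_mapped++;
		return 0;
	}

	if (info->extents_mapped >= info->extents_max)
		return 1;

	/* The kernel would copy_to_user() here and fail with -EFAULT instead. */
	assert(info->extents_start != NULL);
	dest = &info->extents_start[info->extents_mapped];
	dest->logical = logical;
	dest->physical = physical;
	dest->length = length;
	info->extents_mapped++;

	return info->extents_mapped == info->extents_max ? 1 : 0;
}
```

With the flag set, a caller can pass extents_max = 0 and extents_start = NULL and still get a correct count in extents_mapped.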
> > What about the idea to have fiemap_fill_next_extent() do "extent" merging
> > for filesystems that use the generic helper but do not return multiple
> > blocks via get_blocks()? I don't think that is too hard to implement,
> > and makes the output more useful, otherwise we get an extent per block.
> > The above is what I _think_ will work, haven't actually tried it out.
>
> I don't think we want to automatically merge extents within this helper
> function. Otherwise we would diverge from the actual disk layout for extent
> based file systems where an extent might be broken up between two records
> for some other reason, such as maximum extent length being exceeded.
Do we really want to expose the filesystem-specific extent-length limits
to userspace? In some sense, a block-based filesystem has a maximum
extent length of the blocksize, but it seems totally reasonable to merge
contiguous blocks into a single "extent" for return to userspace. I
don't see this as significantly different for ext4, even though it can
have extents up to 128MB, or unwritten extents up to 64MB.
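For illustration, the kind of merging I have in mind could look roughly like the untested userspace sketch below. Everything in it is an assumption made for the example: the sk_merged_extent struct and sk_merge_blocks() are hypothetical names, the file is modeled as a hole-free array of physical block numbers indexed by logical block, and no extent flags are considered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record, byte-based like the proposed interface. */
struct sk_merged_extent {
	uint64_t logical;
	uint64_t physical;
	uint64_t length;
};

/*
 * phys[i] is the physical block number backing logical block i.
 * Physically contiguous runs are merged into one extent each.
 * Returns the number of extents written, at most max_out.
 */
static size_t sk_merge_blocks(const uint64_t *phys, size_t nblocks,
			      uint64_t blocksize,
			      struct sk_merged_extent *out, size_t max_out)
{
	size_t n = 0;
	size_t i;

	assert(out != NULL || max_out == 0);

	for (i = 0; i < nblocks; i++) {
		uint64_t physical = phys[i] * blocksize;

		/* Extend the previous extent if this block follows it on disk. */
		if (n > 0 && out[n - 1].physical + out[n - 1].length == physical) {
			out[n - 1].length += blocksize;
			continue;
		}

		if (n == max_out)
			break;

		out[n].logical = (uint64_t)i * blocksize;
		out[n].physical = physical;
		out[n].length = blocksize;
		n++;
	}

	return n;
}
```

Six blocks laid out as 100, 101, 102, 200, 201, 50 would come back as three extents rather than six single-block records.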
> Btw, how many block-based file systems that don't return multiple blocks via
> get_blocks() are there that we actually care about enough to write this
> code?
That I have no clue about. Josef?
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.