From: Andreas Dilger <adilger@sun.com>
To: Mark Fasheh <mark.fasheh@oracle.com>
Cc: linux-fsdevel@vger.kernel.org, David Chinner <dgc@sgi.com>,
	linux-ext4@vger.kernel.org, xfs@oss.sgi.com, hch@infradead.org,
	Anton Altaparmakov <aia21@cam.ac.uk>,
	Mike Waychison <mikew@google.com>,
	ocfs2-devel@oss.oracle.com
Subject: Re: [RFC] add FIEMAP ioctl to efficiently map file allocation
Date: Mon, 29 Oct 2007 16:13:02 -0600
Message-ID: <20071029221302.GD3042@webber.adilger.int>
In-Reply-To: <20071029205744.GB28607@ca-server1.us.oracle.com>

On Oct 29, 2007  13:57 -0700, Mark Fasheh wrote:
> 	Thanks for posting this. I believe that an interface such as FIEMAP
> would be very useful to Ocfs2 as well. (I added ocfs2-devel to the e-mail)

I tried to make it as Lustre-agnostic as possible...

> On Mon, Oct 29, 2007 at 01:45:07PM -0600, Andreas Dilger wrote:
> > The FIEMAP ioctl (FIle Extent MAP) is similar to the existing FIBMAP
> > block device ioctl used for mapping an individual logical block
> > address in a file to a physical block address in the block device. The
> > FIEMAP ioctl will return the logical to physical mapping for the extent
> > that contains the specified logical byte address.
> > 
> > struct fiemap_extent {
> >     __u64 fe_offset; /* offset in bytes for the start of the extent */
> 
> I'm a little bit confused by fe_offset. Is it a physical offset, or a
> logical offset? The reason I ask is that your description above says "FIEMAP
> ioctl will return the logical to physical mapping for the extent that
> contains the specified logical byte address." Which seems to imply physical,
> but your math to get to the next logical start in a very fragmented file,
> implies that fe_offset is a logical offset:
> 
>        fm_start = fm_extents[fm_extent_count - 1].fe_offset +
>                          fm_extents[fm_extent_count - 1].fe_length + 1; 

Note the distinction between "fe_offset" (which is a physical offset for
a single extent) and "fm_offset" (which is a logical offset for that file).

> > We do this until we find an extent with FIEMAP_EXTENT_LAST flag set. We
> > will also need to re-initialise the fiemap flags, fm_extent_count, fm_end.
> 
> I think you meant 'fm_length' instead of 'fm_end' there.

You're right, thanks.
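
As an illustration of the continuation loop discussed above, here is a
minimal sketch in C. It is not the RFC's interface: the struct layouts,
the separate fe_logical/fe_physical fields, the sketch_* names, and the
table-backed mapper standing in for the ioctl are all assumptions made
so the example can run on its own. It treats each extent as the
half-open byte range [fe_logical, fe_logical + fe_length), so it resumes
at offset + length with no extra "+ 1".

```c
#include <stddef.h>
#include <stdint.h>

#define FIEMAP_EXTENT_LAST 0x00000020u /* value quoted from the RFC in this thread */
#define SKETCH_BATCH 2                 /* hypothetical per-call extent capacity */

/* Hypothetical extent record; carrying both offsets is an assumption. */
struct sketch_extent {
    uint64_t fe_logical;  /* assumed: byte offset within the file */
    uint64_t fe_physical; /* assumed: byte offset on the device */
    uint64_t fe_length;   /* length in bytes */
    uint32_t fe_flags;
};

/* Hypothetical request, loosely modelled on the fm_* fields discussed above. */
struct sketch_request {
    uint64_t fm_start;        /* in: logical byte offset to map from */
    uint64_t fm_length;       /* in: logical length to map */
    uint32_t fm_flags;        /* in/out */
    uint32_t fm_extent_count; /* in: array capacity, out: extents filled */
    struct sketch_extent *fm_extents;
};

typedef int (*sketch_map_fn)(struct sketch_request *req, void *ctx);

/* Stand-in for the ioctl: answers from a fixed in-memory extent table. */
struct sketch_table {
    const struct sketch_extent *ext;
    size_t count;
};

/* For brevity this mapper honours fm_start but ignores fm_length. */
int sketch_table_map(struct sketch_request *req, void *ctx)
{
    const struct sketch_table *t = ctx;
    uint32_t cap = req->fm_extent_count, n = 0;

    for (size_t i = 0; i < t->count && n < cap; i++) {
        const struct sketch_extent *e = &t->ext[i];
        if (e->fe_logical + e->fe_length <= req->fm_start)
            continue; /* extent ends before the requested start */
        req->fm_extents[n] = *e;
        if (i + 1 == t->count)
            req->fm_extents[n].fe_flags |= FIEMAP_EXTENT_LAST;
        n++;
    }
    req->fm_extent_count = n;
    return 0;
}

/* Collect every extent into out[]; returns the count, or -1 on error. */
long sketch_walk(sketch_map_fn map, void *ctx,
                 struct sketch_extent *out, size_t out_cap)
{
    struct sketch_extent batch[SKETCH_BATCH];
    struct sketch_request req;
    uint64_t next = 0;
    size_t total = 0;

    for (;;) {
        /* Re-initialise the inputs on every pass: the call overwrites them. */
        req.fm_start = next;
        req.fm_length = UINT64_MAX - next;
        req.fm_flags = 0;
        req.fm_extent_count = SKETCH_BATCH;
        req.fm_extents = batch;

        if (map(&req, ctx) != 0)
            return -1;
        if (req.fm_extent_count == 0)
            return (long)total; /* nothing mapped at or after next */

        for (uint32_t i = 0; i < req.fm_extent_count; i++) {
            if (total == out_cap)
                return -1; /* caller's buffer is too small */
            out[total++] = batch[i];
        }

        const struct sketch_extent *last = &batch[req.fm_extent_count - 1];
        if (last->fe_flags & FIEMAP_EXTENT_LAST)
            return (long)total;
        next = last->fe_logical + last->fe_length; /* first byte after it */
    }
}
```

Calling sketch_walk(sketch_table_map, &table, out, out_cap) with a table
of five extents takes three passes at a batch size of two. Real code
would replace the mapper with the actual ioctl call and its real
structures.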

> > #define FIEMAP_EXTENT_LAST      0x00000020 /* last extent in the file */
> > #define FIEMAP_EXTENT_EOF       0x00000100 /* fm_start + fm_len beyond EOF */
> 
> Is "EOF" here considering "beyond i_size" or "beyond allocation"?

_EOF == beyond i_size.
_LAST == last extent in the file.

In most cases FIEMAP_EXTENT_EOF will be set at the same time as
FIEMAP_EXTENT_LAST, but in the case of, e.g., preallocation beyond i_size,
the EOF flag may also be set on one or more earlier extents.
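
To make the difference between the two flags concrete, here is a small
sketch in C. The flag values are the ones quoted above; everything else
is an assumption for illustration: the sketch_lext struct, the function
name, and in particular the rule that an extent counts as "beyond
i_size" when its logical end reaches or passes i_size.

```c
#include <stddef.h>
#include <stdint.h>

#define FIEMAP_EXTENT_LAST 0x00000020u /* last extent in the file */
#define FIEMAP_EXTENT_EOF  0x00000100u /* extent reaches beyond i_size */

/* Hypothetical logical-extent record used only by this sketch. */
struct sketch_lext {
    uint64_t logical; /* byte offset within the file */
    uint64_t length;  /* length in bytes */
    uint32_t flags;
};

/*
 * Recompute LAST and EOF for an array of extents sorted by logical offset.
 * Assumed rule: EOF is set when logical + length >= i_size, so preallocated
 * extents past i_size carry EOF even when they are not the last extent.
 */
void sketch_flag_extents(struct sketch_lext *ext, size_t count, uint64_t i_size)
{
    for (size_t i = 0; i < count; i++) {
        ext[i].flags &= ~(FIEMAP_EXTENT_LAST | FIEMAP_EXTENT_EOF);
        if (ext[i].logical + ext[i].length >= i_size)
            ext[i].flags |= FIEMAP_EXTENT_EOF;
        if (i + 1 == count)
            ext[i].flags |= FIEMAP_EXTENT_LAST;
    }
}
```

For a file with i_size 10000 and extents at 0 (4096 bytes), 4096 (8192
bytes), and a preallocated one at 12288 (4096 bytes), this marks the
second extent EOF only and the third both EOF and LAST.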

> > FIEMAP_EXTENT_NO_DIRECT means data cannot be directly accessed (maybe
> > encrypted, compressed, etc.)
> 
> Would it be valid to use FIEMAP_EXTENT_NO_DIRECT for marking in-inode data?
> Btrfs, Ocfs2, and Gfs2 pack small amounts of user data directly in inode
> blocks.

Hmm, but part of the issue would be how to request the extra data, and
what offset it would be given.  One could, for example, use negative
offsets to represent metadata, or add a FIEMAP_EXTENT_META flag or
similar; I hadn't given that much thought.  The other issue is that
I'd like to get the basics of the API in place before it gets too complex.
We can always add functionality with more FIEMAP_FLAG_* (whether in the
INCOMPAT range or not, depending on what is being done).

Cheers, Andreas
--
Andreas Dilger
Sr. Software Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

