linux-ext4.vger.kernel.org archive mirror
* xfs preallocation writeup, for comparison
@ 2006-11-28 21:04 Eric Sandeen
From: Eric Sandeen @ 2006-11-28 21:04 UTC
  To: ext4 development

As promised, here is a writeup of xfs preallocation routines.

I don't hold these up as the perfect or best way to do this task, but it
is worth looking at what has been done before, to get ideas, find better
ways, and avoid pitfalls for ext4.

XFS preallocation interfaces.
=============================

The xfs preallocation interfaces are described in the xfsctl(3) manpage.
It's not the best doc, so I'll summarize:

XFS has these ioctl calls for space management of files:

       XFS_IOC_ALLOCSP
       XFS_IOC_FREESP
       XFS_IOC_RESVSP
       XFS_IOC_UNRESVSP

All of these interfaces take an flock-style argument (an
xfs_flock64_t), which you use to specify the range of bytes in the file
to operate on, essentially an offset and a length.
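
To make the flock-style argument concrete, here is a minimal sketch of
how an (offset, length) pair maps onto such a structure.  It uses the
standard struct flock from <fcntl.h> as a stand-in so that it builds
without the xfs headers; the real calls take an xfs_flock64_t, which
carries the same l_whence, l_start and l_len fields.  The helper name
make_range() is made up for illustration.

```c
#include <fcntl.h>     /* struct flock */
#include <string.h>    /* memset */
#include <sys/types.h> /* off_t */
#include <unistd.h>    /* SEEK_SET */

/*
 * Hypothetical helper: describe "length bytes starting at offset".
 * struct flock is only a stand-in here; with the xfs headers installed
 * the same three fields would be set in an xfs_flock64_t instead.
 */
static struct flock make_range(off_t offset, off_t length)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_whence = SEEK_SET; /* l_start counts from the start of the file */
    fl.l_start  = offset;   /* first byte of the range */
    fl.l_len    = length;   /* number of bytes in the range */
    return fl;
}
```

With the xfs headers, a structure filled in this way is what gets passed
as the third argument of the ioctl, e.g. ioctl(fd, XFS_IOC_RESVSP, &fl).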

The real work for all of these calls is done in xfs_change_file_space()
in xfs_vnodeops.c.

The main difference between resvsp and allocsp is that resvsp marks the
blocks as "unwritten", meaning that they are allocated but not yet
written to; if they are read, they return zeros.  allocsp actually
writes zeros into the allocated blocks.  We can use the xfs_io tool to
demonstrate.

resvsp example:
==============

[root@magnesium test]# touch resvsp
[root@magnesium test]# xfs_io resvsp
xfs_io> resvsp 0 10g
xfs_io> bmap -vp
resvsp:
 EXT: FILE-OFFSET           BLOCK-RANGE        AG AG-OFFSET         TOTAL    FLAGS
   0: [0..16657327]:        16657456..33314783  1 (64..16657391)    16657328 10000
   1: [16657328..20971519]: 96..4314287         0 (96..4314287)     4314192  10000

So we got 2 extents for this 10g file, and those are actual filesystem
blocks, allocated on disk.  The file is still 0 length, but it is using
10g of blocks:

[root@magnesium test]# ls -lh resvsp
-rw-r--r--  1 root root 0 Nov 28 14:11 resvsp
[root@magnesium test]# du -hc resvsp
10G     resvsp
10G     total

The extents are simply flagged as unwritten (0x10000 above), so very
little IO occurs and the space reservation is fast.
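
As a sanity check on the bmap output above: bmap reports offsets and
totals in 512-byte units, so the two extent totals should add up to the
10g that was requested.  A small sketch of that arithmetic (the helper
name is made up):

```c
#include <stdint.h>

/* bmap reports extent lengths in 512-byte blocks. */
#define BMAP_UNIT_BYTES 512ULL

/* Hypothetical helper: convert a value from the TOTAL column to bytes. */
static uint64_t bmap_total_to_bytes(uint64_t total)
{
    return total * BMAP_UNIT_BYTES;
}
```

bmap_total_to_bytes(16657328 + 4314192) comes to 10737418240 bytes,
which is exactly 10 * 2^30, so the two extents cover the whole request.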

allocsp example:
===============
(note there's a bit of a buglet in xfs_io, hence the swapped arguments)

[root@magnesium test]# touch allocsp
[root@magnesium test]# xfs_io allocsp
xfs_io> allocsp 10g 0
<wait for IO...>
xfs_io> bmap -vp
allocsp:
 EXT: FILE-OFFSET           BLOCK-RANGE        AG AG-OFFSET           TOTAL
   0: [0..16657327]:        33314848..49972175  2 (64..16657391)      16657328
   1: [16657328..20971519]: 4314288..8628479    0 (4314288..8628479)  4314192

We also got 2 extents here, but they are not flagged as unwritten -
those filesystem blocks were all actually filled with zeros.

[root@magnesium test]# ls -lh allocsp
-rw-r--r--  1 root root 10G Nov 28 14:19 allocsp
[root@magnesium test]# du -hc allocsp
10G     allocsp
10G     total
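
The ls and du numbers in both examples come from different fields of
stat(2): ls -l shows st_size, while du is based on st_blocks, which on
Linux is counted in 512-byte units.  A rough sketch of reading both from
a program, with a made-up struct and helper name:

```c
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical container for the two numbers being compared. */
struct file_usage {
    int64_t size;      /* st_size: what ls -l reports */
    int64_t allocated; /* st_blocks * 512: roughly what du reports */
};

/* Fill *out for path.  Returns 0 on success, -1 if stat() fails. */
static int get_file_usage(const char *path, struct file_usage *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->size = (int64_t)st.st_size;
    /* Assumes the Linux convention of 512-byte st_blocks units. */
    out->allocated = (int64_t)st.st_blocks * 512;
    return 0;
}
```

For the resvsp file one would expect size 0 with allocated around 10g;
for the allocsp file both should be around 10g.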

It would be very nice to see posix_fallocate hooked up to the underlying
filesystem, so that it can make smart decisions about how to efficiently
reserve space...
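
For reference, this is roughly what the caller's side of
posix_fallocate looks like.  It is only a sketch: the helper name
reserve_space() and the choice to open by path are assumptions, and
error handling is kept to a minimum.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* ask for the posix_fallocate() prototype */
#endif

#include <errno.h>
#include <fcntl.h>     /* open, posix_fallocate */
#include <sys/types.h> /* off_t */
#include <unistd.h>    /* close */

/*
 * Hypothetical helper: make sure the first `length` bytes of `path`
 * have storage behind them.  Returns 0 on success or an errno value.
 */
static int reserve_space(const char *path, off_t length)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    int err;

    if (fd < 0)
        return errno;
    /* posix_fallocate returns the error number; it does not set errno. */
    err = posix_fallocate(fd, 0, length);
    close(fd);
    return err;
}
```

POSIX specifies that the file size grows if the range extends past the
current end of file, so from the caller's side the result looks more
like allocsp than resvsp; how the space actually gets reserved is left
to the implementation.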

-Eric
