* Getting direct support for fallocate(2) into glibc
@ 2008-12-09 6:50 Theodore Ts'o
2008-12-09 16:58 ` Eric Sandeen
0 siblings, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2008-12-09 6:50 UTC (permalink / raw)
To: Eric Sandeen; +Cc: linux-ext4
Hey Eric,
Can you or Ric put in a good word with Ulrich (through appropriate Red
Hat channels, if that would be helpful) for this glibc enhancement request:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=7083
Given that glibc 2.9 was just released, presumably this won't show up
until glibc 2.10, and it'll probably be a *long* time before this
shows up in real distributions, but it would be good to get this
functionality into glibc ASAP.
I've been doing an analysis of which files are ending up getting
fragmented on my system, and a relatively common case is logfiles (since
they are written via appending). Another common case is gaim log files,
and of course /var/spool/mail files. All of these cases could be
addressed by using fallocate() with the FALLOC_FL_KEEP_SIZE flag; and
unfortunately, there is no way to do this currently through a
glibc-mediated interface (posix_fallocate() is mandated by the POSIX
specification to _not_ pass the FALLOC_FL_KEEP_SIZE flag).
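
For illustration, a minimal sketch of the KEEP_SIZE case described above.
It assumes Linux and a glibc that exports fallocate() (which, per this
thread, 2.9 does not). The helper name open_log_reserved() and the choice
to carry on when preallocation is unsupported are hypothetical, not taken
from any existing program.

```c
#define _GNU_SOURCE            /* exposes fallocate() in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>      /* FALLOC_FL_KEEP_SIZE */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open an append-only log and reserve `reserve` bytes
 * of disk space past its current end, leaving st_size unchanged.
 * Returns the open fd, or -1 with errno set.
 */
static int open_log_reserved(const char *path, off_t reserve)
{
    struct stat st;
    int fd, saved;

    assert(reserve > 0);
    fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0)
        goto fail;

    /* KEEP_SIZE allocates the blocks but does not move end-of-file. */
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, st.st_size, reserve) != 0 &&
        errno != EOPNOTSUPP && errno != ENOSYS)
        goto fail;              /* e.g. ENOSPC */

    return fd;                  /* no preallocation support: carry on */

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

Because the file size is untouched, readers and `tail -f` style tools see
the same file as before; only the block allocation changes.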
Changing various programs to use fallocate() when the final size of the
file is known (e.g., cp, tar, cpio, rpm, etc.) helps for large files.
And of course, it *definitely* helps for BitTorrent clients. So if the
goal is to optimize filesystem aging, this would be a perfect project
for an intern or two, or maybe an intro-level Google Summer of Code
student, who could sweep through some of the more common programs. For
programs like cp, tar, et al., posix_fallocate() will work just fine.
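
As a rough sketch of the "final size is known" case, here is what a
cp-like copy could do with posix_fallocate(). The function name
copy_prealloc() is hypothetical, and treating EINVAL/EOPNOTSUPP as "no
preallocation available, copy anyway" is an assumption about the desired
policy rather than anything cp or tar actually does.

```c
#define _POSIX_C_SOURCE 200112L   /* declares posix_fallocate() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst, preallocating dst to src's size. */
static int copy_prealloc(const char *src, const char *dst)
{
    char buf[65536];
    struct stat st;
    ssize_t n;
    int in, out, err, rc = -1;

    assert(src != NULL && dst != NULL);
    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    if (fstat(in, &st) != 0)
        goto done;

    if (st.st_size > 0) {
        /* Returns an error number directly; it does not set errno. */
        err = posix_fallocate(out, 0, st.st_size);
        if (err != 0 && err != EINVAL && err != EOPNOTSUPP) {
            errno = err;        /* e.g. ENOSPC, before any data is copied */
            goto done;
        }
    }

    while ((n = read(in, buf, sizeof buf)) > 0) {
        char *p = buf;
        while (n > 0) {         /* handle short writes */
            ssize_t w = write(out, p, (size_t)n);
            if (w < 0)
                goto done;
            p += w;
            n -= w;
        }
    }
    if (n == 0)                 /* clean end of file */
        rc = 0;
done:
    close(in);
    close(out);
    return rc;
}
```

Since posix_fallocate() extends the file to its final size up front, this
only suits files whose length is known before the first write.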
For things like logrotate, we'd either need direct fallocate() support
in glibc, or logrotate would have to call the system call directly,
which would be evil and tricky given that different architectures use
different system call numbers.
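
To make the "call the system call directly" option concrete, a sketch
under stated assumptions: the installed kernel headers define
SYS_fallocate for the build architecture, and the ABI is 64-bit so each
64-bit argument fits in one register. On 32-bit ABIs the offset and
length have to be split into halves in an ABI-specific way, which this
sketch deliberately refuses to attempt. The wrapper name raw_fallocate()
is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/falloc.h>   /* FALLOC_FL_KEEP_SIZE */
#include <sys/stat.h>
#include <sys/syscall.h>    /* SYS_fallocate, resolved per architecture */
#include <sys/types.h>
#include <unistd.h>         /* syscall() */

#if LONG_MAX == 0x7fffffffL
#error "sketch assumes a 64-bit ABI; 32-bit needs split 64-bit arguments"
#endif

/* Hypothetical wrapper: same contract as fallocate(2), 0 or -1 with errno. */
static int raw_fallocate(int fd, int mode, off_t offset, off_t len)
{
    assert(len > 0);
    return (int)syscall(SYS_fallocate, fd, mode, (long)offset, (long)len);
}
```

A caller would use it exactly like the libc function, for example
`raw_fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 4096)`.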
(Logrotate could use fallocate to preallocate the new /var/log/syslog to
the same size as the previous day's syslog, which has the bonus of
guaranteeing that log messages won't get lost --- and if there's not
enough disk space for the fallocate to succeed, the system administrator
can get notified right away, while there is still free space on the
system. For paranoid DoD types who want to make sure that the audit
logs are written, this could be quite useful.)
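
A sketch of that rotation step, again assuming a glibc that exports
fallocate(). The function name precreate_log() and the message format
are hypothetical; logrotate itself is not known to do any of this.

```c
#define _GNU_SOURCE            /* exposes fallocate() in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>      /* FALLOC_FL_KEEP_SIZE */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical rotation step: create `next` and reserve as much space as
 * the rotated-out log `prev` occupied. Returns 0 on success, -1 on error.
 */
static int precreate_log(const char *prev, const char *next)
{
    struct stat st;
    int fd, rc = 0;

    assert(prev != NULL && next != NULL);
    if (stat(prev, &st) != 0)
        return -1;

    fd = open(next, O_WRONLY | O_CREAT | O_APPEND, 0640);
    if (fd < 0)
        return -1;

    if (st.st_size > 0 &&
        fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, st.st_size) != 0) {
        if (errno == ENOSPC)    /* warn while there is still free space */
            fprintf(stderr, "%s: cannot reserve %lld bytes\n",
                    next, (long long)st.st_size);
        if (errno != EOPNOTSUPP && errno != ENOSYS)
            rc = -1;
    }
    close(fd);
    return rc;
}
```

The new log still starts at size zero; the reservation only guarantees
that roughly one day's worth of appends cannot fail for lack of space.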
If it's going to take too long to get this into glibc, I guess I could
put an fallocate() interface into libe2p, although that would not be my
first preference. After all, this sort of thing will be useful for all
filesystems, not just ext4; it should be helpful for xfs and btrfs as
well.
- Ted
* Re: Getting direct support for fallocate(2) into glibc
2008-12-09 6:50 Getting direct support for fallocate(2) into glibc Theodore Ts'o
@ 2008-12-09 16:58 ` Eric Sandeen
2008-12-09 17:31 ` Theodore Tso
2008-12-11 19:00 ` Andreas Dilger
0 siblings, 2 replies; 5+ messages in thread
From: Eric Sandeen @ 2008-12-09 16:58 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-ext4
Theodore Ts'o wrote:
> Hey Eric,
>
> Can you or Ric put in a good word with Ulrich (through appropriate Red
> Hat channels, if that would be helpful) for this glibc enhancement request:
>
> http://sources.redhat.com/bugzilla/show_bug.cgi?id=7083
>
> Given that glibc 2.9 was just released, presumably this won't show up
> until glibc 2.10, and it'll probably be a *long* time before this
> shows up in real distributions, but it would be good to get this
> functionality into glibc ASAP.
Hm... I thought it was already there. posix_fallocate() is indeed
calling sys_fallocate; doing a very simple test:
fd = open("testfile", O_RDWR|O_CREAT, 0644);
error = posix_fallocate(fd, 0, 16384);
and then asking xfs about the allocation (sorry for the xfs, but
xfs_bmap is still so handy for this...):
# xfs_bmap -vv testfile
testfile:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
0: [0..31]: 34984576..34984607 2 (1256112..1256143) 32 10000
FLAG Values:
010000 Unwritten preallocated extent
it looks like it's doing the Right Thing. This is on:
# rpm -q glibc
glibc-2.9-2.x86_64
... oh, ok, but plain old fallocate() isn't yet hooked up, odd.
# gcc -o test-fallocate test-fallocate.c
/tmp/ccQTk6an.o: In function `main':
test-fallocate.c:(.text+0x3e): undefined reference to `fallocate'
Especially odd because we have a man page shipping, but no actual
implemented interface :) OK, I'll go bug people.
-Eric
* Re: Getting direct support for fallocate(2) into glibc
2008-12-09 16:58 ` Eric Sandeen
@ 2008-12-09 17:31 ` Theodore Tso
2008-12-09 18:03 ` Eric Sandeen
2008-12-11 19:00 ` Andreas Dilger
1 sibling, 1 reply; 5+ messages in thread
From: Theodore Tso @ 2008-12-09 17:31 UTC (permalink / raw)
To: Eric Sandeen; +Cc: linux-ext4
On Tue, Dec 09, 2008 at 10:58:19AM -0600, Eric Sandeen wrote:
> ... oh, ok, but plain old fallocate() isn't yet hooked up, odd.
>
> # gcc -o test-fallocate test-fallocate.c
> /tmp/ccQTk6an.o: In function `main':
> test-fallocate.c:(.text+0x3e): undefined reference to `fallocate'
>
> Especially odd because we have a man page shipping, but no actual
> implemented interface :) OK, I'll go bug people.
Yep, that was what I was referring to. We have the fallocate(2) man
page because Michael Kerrisk has documented the userspace/kernel
interface. But that doesn't necessarily mean that glibc has wired up
the interface. (Sigh, if only we had been able to push this issue *before*
glibc 2.9 shipped last month, but we were all too busy....)
- Ted
* Re: Getting direct support for fallocate(2) into glibc
2008-12-09 17:31 ` Theodore Tso
@ 2008-12-09 18:03 ` Eric Sandeen
0 siblings, 0 replies; 5+ messages in thread
From: Eric Sandeen @ 2008-12-09 18:03 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux-ext4
Theodore Tso wrote:
> On Tue, Dec 09, 2008 at 10:58:19AM -0600, Eric Sandeen wrote:
>> ... oh, ok, but plain old fallocate() isn't yet hooked up, odd.
>>
>> # gcc -o test-fallocate test-fallocate.c
>> /tmp/ccQTk6an.o: In function `main':
>> test-fallocate.c:(.text+0x3e): undefined reference to `fallocate'
>>
>> Especially odd because we have a man page shipping, but no actual
>> implemented interface :) OK, I'll go bug people.
>
> Yep, that was what I was referring to. We have the fallocate(2) man
> page because Michael Kerrisk has documented the userspace/kernel
> interface. But that doesn't necessarily mean that glibc has wired up
> the interface. (Sigh, if only we had been able to push this issue *before*
> glibc 2.9 shipped last month, but we were all too busy....)
OK, yeah, I honestly didn't realize that the bare interface wasn't
there. :(
-Eric
* Re: Getting direct support for fallocate(2) into glibc
2008-12-09 16:58 ` Eric Sandeen
2008-12-09 17:31 ` Theodore Tso
@ 2008-12-11 19:00 ` Andreas Dilger
1 sibling, 0 replies; 5+ messages in thread
From: Andreas Dilger @ 2008-12-11 19:00 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Theodore Ts'o, linux-ext4
On Dec 09, 2008 10:58 -0600, Eric Sandeen wrote:
> sorry for the xfs, but xfs_bmap is still so handy for this...
>
> # xfs_bmap -vv testfile
> testfile:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
> 0: [0..31]: 34984576..34984607 2 (1256112..1256143) 32 10000
> FLAG Values:
> 010000 Unwritten preallocated extent
Is the FIEMAP filefrag patch not in e2fsprogs yet? It should provide
equivalent output.
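
For reference, a minimal sketch of how the FIEMAP ioctl reports the same
"unwritten preallocated extent" information. It assumes a kernel and
filesystem with FIEMAP support; the function name print_extents() and the
32-extent cap are hypothetical, and this is not the filefrag patch itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fiemap.h>   /* struct fiemap, FIEMAP_EXTENT_UNWRITTEN */
#include <linux/fs.h>       /* FS_IOC_FIEMAP */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_EXTENTS 32      /* arbitrary cap for this sketch */

/* Print up to MAX_EXTENTS extents of `path`; returns the count or -1. */
static int print_extents(const char *path)
{
    size_t sz = sizeof(struct fiemap) +
                MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm;
    int fd, n = -1;
    unsigned int i;

    assert(path != NULL);
    fm = calloc(1, sz);
    if (fm == NULL)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(fm);
        return -1;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;     /* flush dirty data first */
    fm->fm_extent_count = MAX_EXTENTS;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
        n = (int)fm->fm_mapped_extents;
        for (i = 0; i < fm->fm_mapped_extents; i++) {
            struct fiemap_extent *e = &fm->fm_extents[i];
            printf("%u: logical %llu physical %llu length %llu%s\n", i,
                   (unsigned long long)e->fe_logical,
                   (unsigned long long)e->fe_physical,
                   (unsigned long long)e->fe_length,
                   (e->fe_flags & FIEMAP_EXTENT_UNWRITTEN)
                       ? " unwritten" : "");
        }
    }
    close(fd);
    free(fm);
    return n;
}
```

Offsets and lengths here are in bytes, unlike the 512-byte units in the
xfs_bmap output quoted above.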
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.