* 2.6.21-ext4-1
@ 2007-04-30 15:14 Theodore Ts'o
From: Theodore Ts'o @ 2007-04-30 15:14 UTC
To: linux-ext4; +Cc: Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel
I've respun the ext4 development patchset, with Amit's updated fallocate
patches. I've added Dave's patch to add ia64 support to the fallocate
system call, but *not* the XFS fallocate support patches. (Probably
better for them to live in an xfs tree, where they can more easily be
tested and updated.) Yes, we haven't reached complete closure on the
fallocate system call calling convention, but it's enough for us to get
more testing in -mm.
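For readers following the fallocate discussion, here is a minimal user-space sketch of preallocation. It assumes the fallocate(fd, mode, offset, len) wrapper that later glibc releases provide, which is not necessarily the calling convention still under discussion in this thread, and it does not use the FA_* modes from the patches. prealloc_range() and file_size() are hypothetical helper names chosen for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reserve blocks for the byte range [off, off + len) of fd.
 * Returns 0 on success or an errno value on failure.
 *
 * ASSUMPTION: relies on the glibc fallocate() wrapper with mode 0
 * (allocate and extend the file size), not on the interface from
 * the patchset discussed here.
 */
int prealloc_range(int fd, off_t off, off_t len)
{
    assert(off >= 0 && len > 0);

    if (fallocate(fd, 0, off, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP && errno != ENOSYS)
        return errno;

    /* No native support on this filesystem: let libc emulate it.
     * posix_fallocate() returns the error number directly. */
    return posix_fallocate(fd, off, len);
}

/* Logical file size in bytes, or -1 if fstat() fails. */
off_t file_size(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : -1;
}
```

With mode 0 the file size grows to cover the reserved range, so checking file_size() after a successful prealloc_range() is a quick sanity test on a new platform.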
Also added Johann's jbd2-stats-through-procfs patches; they provide
useful help in tuning the size of the journal, which will be useful in
benchmarking efforts. In addition, Alex Tomas's patch to free
just-allocated blocks when there is an error inserting the extent into
the extent tree has also been included.
The patches have been compile-tested on x86, and compile/run-tested on
x86/UML. Would appreciate reports about testing on other platforms.
Thanks,
- Ted
P.S. One bug which I've noted --- if there is a failure due to disk
filling up, running e2fsck on the filesystem will show that the i_blocks
fields on the inodes where there was a failure to allocate disk blocks
are left incorrect. I'm guessing this is a bug in the delayed
allocation patches. Alex, when you have a moment, could you take a
look? Thanks!!
* Re: 2.6.21-ext4-1
@ 2007-04-30 15:58 Theodore Tso
From: Theodore Tso @ 2007-04-30 15:58 UTC
To: linux-ext4; +Cc: Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel

Sorry, I forgot to include the URL where the ext4 development patchset
can be found:

ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-patches/2.6.21-ext4-1

- Ted

On Mon, Apr 30, 2007 at 11:14:57AM -0400, Theodore Ts'o wrote:
>
> I've respun the ext4 development patchset, with Amit's updated fallocate
> patches. I've added Dave's patch to add ia64 support to the fallocate
> system call, but *not* the XFS fallocate support patches. (Probably
> better for them to live in an xfs tree, where they can more easily
> tested and updated.) Yes, we haven't reached complete closure on the
> fallocate system call calling convention, but it's enough for us to get
> more testing in -mm.
>
> Also added Johann's jbd2-stats-through-procfs patches; it provides
> useful help in turning the size of the journal, which will be useful in
> benchmarking efforts. In addition, Alex Tomas's patch to free
> just-allocated patches when there is an error inserting the extent into
> the extent tree has also been included.
>
> The patches have been compile-tested on x86, and compile/run-tested on
> x86/UML. Would appreciate reports about testing on other platforms.
>
> Thanks,
>
> - Ted
>
> P.S. One bug which I've noted --- if there is a failure due to disk
> filling up, running e2fsck on the filesystem will show that the i_blocks
> fields on the inodes where there was a failure to allocate disk blocks
> are left incorrect. I'm guessing this is a bug in the delayed
> allocation patches. Alex, when you have a moment, could you take a
> look? Thanks!!
* Re: 2.6.21-ext4-1
@ 2007-04-30 16:24 Alex Tomas
From: Alex Tomas @ 2007-04-30 16:24 UTC
To: Theodore Ts'o
Cc: linux-ext4, Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel

Theodore Ts'o wrote:
> P.S. One bug which I've noted --- if there is a failure due to disk
> filling up, running e2fsck on the filesystem will show that the i_blocks
> fields on the inodes where there was a failure to allocate disk blocks
> are left incorrect. I'm guessing this is a bug in the delayed
> allocation patches. Alex, when you have a moment, could you take a
> look? Thanks!!

definitely. thanks for the report.

thanks, Alex
* Re: 2.6.21-ext4-1
@ 2007-04-30 17:16 Jeff Garzik
From: Jeff Garzik @ 2007-04-30 17:16 UTC
To: Theodore Ts'o
Cc: linux-ext4, Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel, Andrew Morton, Linus Torvalds

Theodore Ts'o wrote:
> I've respun the ext4 development patchset, with Amit's updated fallocate
> patches. I've added Dave's patch to add ia64 support to the fallocate
> system call, but *not* the XFS fallocate support patches. (Probably
> better for them to live in an xfs tree, where they can more easily
> tested and updated.) Yes, we haven't reached complete closure on the
> fallocate system call calling convention, but it's enough for us to get
> more testing in -mm.
>
> Also added Johann's jbd2-stats-through-procfs patches; it provides
> useful help in turning the size of the journal, which will be useful in
> benchmarking efforts. In addition, Alex Tomas's patch to free
> just-allocated patches when there is an error inserting the extent into
> the extent tree has also been included.
>
> The patches have been compile-tested on x86, and compile/run-tested on
> x86/UML. Would appreciate reports about testing on other platforms.

Why isn't this stuff going upstream rapidly? AFAICT nothing much at all
has happened upstream besides a mass renaming?

The whole point of having ext4 in the kernel is to do development
upstream, in the public view, getting new stuff in ASAP (even if that
means changing or pulling some stuff later).

As it stands now, ext4 in the upstream tree is completely useless --
it's the same as ext3, and has been for months (since Oct 11).

Hello? Upstream development? Ever heard of it?

Jeff
* Re: 2.6.21-ext4-1
@ 2007-04-30 17:45 Theodore Tso
From: Theodore Tso @ 2007-04-30 17:45 UTC
To: Jeff Garzik
Cc: linux-ext4, Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel, Andrew Morton, Linus Torvalds

On Mon, Apr 30, 2007 at 01:16:19PM -0400, Jeff Garzik wrote:
> Why isn't this stuff going upstream rapidly?

Some of the patches are ready to be pushed upstream, and that will be
happening shortly. In the case of the fallocate patches, the system
call interface hadn't been completely closed, so we don't want to push
it until we have closure and consensus. The previous versions of the
patches used an ioctl interface that would have gotten potshots from
the all-ioctls-are-evil camp, and it was clear that a unified system
call interface was the right thing. So we wanted to make sure the XFS
folks were happy with the interface as well before we pushed it.

In general, yes, ext4 development has been a little slow; part of the
problem is that we have a lot of people, but a number of folks are new
and their patches need review before they are ready for upstream
acceptance, and a number of other folks who should be doing the review
have been overloaded with multiple other projects and have been
time-sharing.

> The whole point of having ext4 in the kernel is to do development
> upstream, in the public view, getting new stuff in ASAP (even if that
> means changing or pulling some stuff later).

That's true, but we also get flamed when the patches don't meet various
criteria, up to and including breaking on ia64. We are in the process
of setting up automated testing to help address that problem, but it's
taken a little while to get that going. I'm also trying to schedule
more time so I can do the needed review of the patches so they meet
basic upstream standards so we *can* push them.

If other folks would like to help with the review process, that would
be more than welcome. But yes, we will try to get more of the patches
pushed sooner rather than later. Point taken.

- Ted
* Re: 2.6.21-ext4-1 2007-04-30 15:14 2.6.21-ext4-1 Theodore Ts'o ` (2 preceding siblings ...) 2007-04-30 17:16 ` 2.6.21-ext4-1 Jeff Garzik @ 2007-05-07 20:56 ` Mingming Cao 2007-05-08 2:50 ` 2.6.21-ext4-1 David Chinner 3 siblings, 1 reply; 11+ messages in thread From: Mingming Cao @ 2007-05-07 20:56 UTC (permalink / raw) To: Theodore Ts'o Cc: linux-ext4, Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel On Mon, 2007-04-30 at 11:14 -0400, Theodore Ts'o wrote: > I've respun the ext4 development patchset, with Amit's updated fallocate > patches. I've added Dave's patch to add ia64 support to the fallocate > system call, but *not* the XFS fallocate support patches. (Probably > better for them to live in an xfs tree, where they can more easily > tested and updated.) Yes, we haven't reached complete closure on the > fallocate system call calling convention, but it's enough for us to get > more testing in -mm. > > Also added Johann's jbd2-stats-through-procfs patches; it provides > useful help in turning the size of the journal, which will be useful in > benchmarking efforts. In addition, Alex Tomas's patch to free > just-allocated patches when there is an error inserting the extent into > the extent tree has also been included. > > The patches have been compile-tested on x86, and compile/run-tested on > x86/UML. Would appreciate reports about testing on other platforms. > I have tested this patch series on ppc64, x86_64 with dbench/tiobench/fsx, all runs fine. I am not sure what level of testing Amit has done about the fallocate() and preallocation code on various archs. I couldn't find a available s390 and ia64 machines with free partition yet. In any case, it would be useful to add a new set of testsuites for the new fallocate() syscall and fsstress in LTP testsuites to automatically the preallocation code in ext4/XFS. thanks, Mingming > Thanks, > > - Ted > > P.S. 
One bug which I've noted --- if there is a failure due to disk > filling up, running e2fsck on the filesystem will show that the i_blocks > fields on the inodes where there was a failure to allocate disk blocks > are left incorrect. I'm guessing this is a bug in the delayed > allocation patches. Alex, when you have a moment, could you take a > look? Thanks!! > - > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.21-ext4-1
@ 2007-05-08 2:50 David Chinner
From: David Chinner @ 2007-05-08 2:50 UTC
To: Mingming Cao
Cc: Theodore Ts'o, linux-ext4, Johann Lombardi, Amit K. Arora, Dave Chinner, linux-kernel

On Mon, May 07, 2007 at 01:56:23PM -0700, Mingming Cao wrote:
> In any case, it would be useful to add a new set of testsuites for the
> new fallocate() syscall and fsstress in LTP testsuites to automatically
> the preallocation code in ext4/XFS.

I hacked an existing XFS test prog to do manual testing of the fallocate()
syscall. In the XFSQA suite we have various pre-alloc enhanced utils (e.g.
fsstress, fsx, etc) that we should probably update to be able to use both
fallocate and xfsctl so we can test both.

Here's all the programs we use that have preallocation awareness:

chook 982% grep RESVSP ltp/*.c | awk '/^ltp/ { split($1,a,":"); print a[1] ;}' | uniq
ltp/doio.c
ltp/fsstress.c
ltp/fsx.c
ltp/growfiles.c
ltp/iogen.c
chook 983% grep RESVSP src/* | awk '/^src/ { split($1,a,":"); print a[1] ;}' | uniq
src/alloc.c
src/fstest.c
src/iopat.c
src/randholes.c
src/resvtest.c
src/unwritten_mmap.c
src/unwritten_sync.c

BTW, have you guys tested mmap writes into unwritten extents? ;)

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: 2.6.21-ext4-1 2007-05-08 2:50 ` 2.6.21-ext4-1 David Chinner @ 2007-05-08 22:05 ` Mingming Cao 2007-05-08 23:24 ` 2.6.21-ext4-1 David Chinner 0 siblings, 1 reply; 11+ messages in thread From: Mingming Cao @ 2007-05-08 22:05 UTC (permalink / raw) To: David Chinner Cc: Theodore Ts'o, linux-ext4, Johann Lombardi, Amit K. Arora, linux-kernel On Tue, 2007-05-08 at 12:50 +1000, David Chinner wrote: > On Mon, May 07, 2007 at 01:56:23PM -0700, Mingming Cao wrote: > > In any case, it would be useful to add a new set of testsuites for the > > new fallocate() syscall and fsstress in LTP testsuites to automatically > > the preallocation code in ext4/XFS. > > I hacked an existing XFS test prog to do manual testing of the fallocate() > syscall. In the XFSQA suite we have various pre-alloc enhanced utils (e.g. > fsstress, fsx, etc) that we should probably update to be able to use both > fallocate and xfsctl so we can test both. > > Here's all the programs we use that have preallocation awareness: > > chook 982% grep RESVSP ltp/*.c | awk '/^ltp/ { split($1,a,":"); print a[1] ;}' | uniq > ltp/doio.c > ltp/fsstress.c > ltp/fsx.c > ltp/growfiles.c > ltp/iogen.c > chook 983% grep RESVSP src/* | awk '/^src/ { split($1,a,":"); print a[1] ;}' | uniq > src/alloc.c > src/fstest.c > src/iopat.c > src/randholes.c > src/resvtest.c > src/unwritten_mmap.c > src/unwritten_sync.c > Thanks for sharing the info. I think Amit used a fsx version with preallocation awareness test, but I don't know if we have other ltp tests that aware preallocation. I would very appreciate if you could share your preallocation test with us. > BTW, have you guys tested mmap writes into unwritten extents? ;) > I am not sure, Amit, have you done some mmap write test into uninitialized extents? Sorry, I still not quite clear what's the mapped problem you are worry about. Could you explain to me a bit more? thanks! Mingming > Cheers, > > Dave. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.21-ext4-1
@ 2007-05-08 23:24 David Chinner
From: David Chinner @ 2007-05-08 23:24 UTC
To: Mingming Cao
Cc: David Chinner, Theodore Ts'o, linux-ext4, Johann Lombardi, Amit K. Arora, linux-kernel

On Tue, May 08, 2007 at 03:05:56PM -0700, Mingming Cao wrote:
> On Tue, 2007-05-08 at 12:50 +1000, David Chinner wrote:
> > On Mon, May 07, 2007 at 01:56:23PM -0700, Mingming Cao wrote:
> > > In any case, it would be useful to add a new set of testsuites for the
> > > new fallocate() syscall and fsstress in LTP testsuites to automatically
> > > the preallocation code in ext4/XFS.
> >
> > I hacked an existing XFS test prog to do manual testing of the fallocate()
> > syscall. In the XFSQA suite we have various pre-alloc enhanced utils (e.g.
> > fsstress, fsx, etc) that we should probably update to be able to use both
> > fallocate and xfsctl so we can test both.
> >
> > Here's all the programs we use that have preallocation awareness:
> >
> > chook 982% grep RESVSP ltp/*.c | awk '/^ltp/ { split($1,a,":"); print a[1] ;}' | uniq
> > ltp/doio.c
> > ltp/fsstress.c
> > ltp/fsx.c
> > ltp/growfiles.c
> > ltp/iogen.c
> > chook 983% grep RESVSP src/* | awk '/^src/ { split($1,a,":"); print a[1] ;}' | uniq
> > src/alloc.c
> > src/fstest.c
> > src/iopat.c
> > src/randholes.c
> > src/resvtest.c
> > src/unwritten_mmap.c
> > src/unwritten_sync.c
> >
> Thanks for sharing the info. I think Amit used a fsx version with
> preallocation awareness test, but I don't know if we have other ltp
> tests that aware preallocation. I would very appreciate if you could
> share your preallocation test with us.

All the tests are in CVS:

http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/

I took this one:

http://oss.sgi.com/cgi-bin/cvsweb.cgi/~checkout~/xfs-cmds/xfstests/src/alloc.c?rev=1.9;content-type=text%2Fplain

and modified it. Patch is below - it's full of conditional XFS
functionality still - you should be able to get it to run easily on
ext4, though.

> > BTW, have you guys tested mmap writes into unwritten extents? ;)
> >
> I am not sure, Amit, have you done some mmap write test into
> uninitialized extents?
>
> Sorry, I still not quite clear what's the mapped problem you are worry
> about. Could you explain to me a bit more? thanks!

XFS needs a ->page_mkwrite() callout to correctly map pages that
have been dirtied by mmap that span unwritten extents. mmap reads
(i.e. when the fault first occurred) treat unwritten extents like
holes and so we need to remap them when they are dirtied to set all
the unwritten state in the bufferheads correctly for writeback.

See test 166 here:

http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/src/unwritten_mmap.c

The same behaviour is needed for delalloc extents to prevent ENOSPC
errors on writeback - the mmap write needs to do the freespace
accounting at the time the page is dirtied and that can only be done
through the ->page_mkwrite callout. Otherwise ENOSPC will occur in
the writeback path and that is a major pain....

This may not be a problem for ext4, but I thought I'd better point
out a couple of the more subtle problems mmap can introduce....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 xfstests/src/Makefile | 7
 xfstests/src/falloc.c | 376 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 381 insertions(+), 2 deletions(-)

Index: xfs-cmds/xfstests/src/falloc.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ xfs-cmds/xfstests/src/falloc.c 2007-04-30 12:41:13.862302450 +1000
@@ -0,0 +1,376 @@
+/*
+ * Copyright (c) 2000-2003,2007 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "global.h" + +/* should end up include somewhere */ +#define FA_ALLOCATE 0x1 +#define FA_DEALLOCATE 0x2 +#define FA_PREALLOCATE 0x3 + +/* ia64 */ +#define __NR_fallocate 1305 +/* + * Block I/O parameterization. A basic block (BB) is the lowest size of + * filesystem allocation, and must equal 512. Length units given to bio + * routines are in BB's. 
+ */ + +/* Assume that if we have BTOBB, then we have the rest */ +#ifndef BTOBB +#define BBSHIFT 9 +#define BBSIZE (1<<BBSHIFT) +#define BBMASK (BBSIZE-1) +#define BTOBB(bytes) (((__u64)(bytes) + BBSIZE - 1) >> BBSHIFT) +#define BTOBBT(bytes) ((__u64)(bytes) >> BBSHIFT) +#define BBTOB(bbs) ((bbs) << BBSHIFT) +#define OFFTOBBT(bytes) ((__u64)(bytes) >> BBSHIFT) + +#define SEEKLIMIT32 0x7fffffff +#define BBSEEKLIMIT32 BTOBBT(SEEKLIMIT32) +#define SEEKLIMIT 0x7fffffffffffffffLL +#define BBSEEKLIMIT OFFTOBBT(SEEKLIMIT) +#endif + +#ifndef OFFTOBB +#define OFFTOBB(bytes) (((__u64)(bytes) + BBSIZE - 1) >> BBSHIFT) +#define BBTOOFF(bbs) ((__u64)(bbs) << BBSHIFT) +#endif + +#define FSBTOBB(f) (OFFTOBBT(FSBTOOFF(f))) +#define BBTOFSB(b) (OFFTOFSB(BBTOOFF(b))) +#define OFFTOFSB(o) ((o) / blocksize) +#define FSBTOOFF(f) ((f) * blocksize) + +void usage(void) +{ + printf("usage: alloc [-b blocksize] [-d dir] [-f file] [-n] [-r] [-t]\n" + "flags:\n" + " -n - non-interractive mode\n" + " -r - real time file\n" + " -t - truncate on open\n" + "\n" + "commands:\n" + " a [offset] [length] - alloc\n" + " p [offset] [length] - prealloc\n" + " f [offset] [length] - free\n" + " m [offset] [length] - print map\n" + " s - sync file\n" + " t [offset] - truncate\n" + " q - quit\n" + " h/? - this help\n"); + +} + +int fd = -1; +int blocksize; +char *filename; + +/* params are in bytes */ +void map(off64_t off, off64_t len) +{ + struct getbmap bm[2]; + + bzero(bm, sizeof(bm)); + + bm[0].bmv_count = 2; + bm[0].bmv_offset = OFFTOBB(off); + if (len==(off64_t)-1) { /* unsigned... 
*/ + bm[0].bmv_length = -1; + printf(" MAP off=%lld, len=%lld [%lld-]\n", + (long long)off, (long long)len, + (long long)BBTOFSB(bm[0].bmv_offset)); + } else { + bm[0].bmv_length = OFFTOBB(len); + printf(" MAP off=%lld, len=%lld [%lld,%lld]\n", + (long long)off, (long long)len, + (long long)BBTOFSB(bm[0].bmv_offset), + (long long)BBTOFSB(bm[0].bmv_length)); + } + + printf(" [ofs,count]: start..end\n"); + for (;;) { +#ifdef XFS_IOC_GETBMAP + if (xfsctl(filename, fd, XFS_IOC_GETBMAP, bm) < 0) { +#else +#ifdef F_GETBMAP + if (fcntl(fd, F_GETBMAP, bm) < 0) { +#else +bozo! +#endif +#endif + perror("getbmap"); + break; + } + + if (bm[0].bmv_entries == 0) + break; + + printf(" [%lld,%lld]: ", + (long long)BBTOFSB(bm[1].bmv_offset), + (long long)BBTOFSB(bm[1].bmv_length)); + + if (bm[1].bmv_block == -1) + printf("hole"); + else + printf("%lld..%lld", + (long long)BBTOFSB(bm[1].bmv_block), + (long long)BBTOFSB(bm[1].bmv_block + + bm[1].bmv_length - 1)); + printf("\n"); + } +} + +int +main(int argc, char **argv) +{ + int c; + char *dirname = NULL; + int done = 0; + int status = 0; + struct flock64 f; + off64_t len; + char line[1024]; + off64_t off; + int oflags; + static char *opnames[] = { "alloc", + "prealloc", + "free" }; + int opno; + static int optab[] = { FA_ALLOCATE, + FA_PREALLOCATE, + FA_DEALLOCATE }; + int rflag = 0; + struct statvfs64 svfs; + int tflag = 0; + int nflag = 0; + int unlinkit = 0; + __int64_t v; + + while ((c = getopt(argc, argv, "b:d:f:rtn")) != -1) { + switch (c) { + case 'b': + blocksize = atoi(optarg); + break; + case 'd': + if (filename) { + printf("can't specify both -d and -f\n"); + exit(1); + } + dirname = optarg; + break; + case 'f': + if (dirname) { + printf("can't specify both -d and -f\n"); + exit(1); + } + filename = optarg; + break; + case 'r': + rflag = 1; + break; + case 't': + tflag = 1; + break; + case 'n': + nflag++; + break; + default: + printf("unknown option\n"); + usage(); + exit(1); + } + } + if (!dirname && !filename) + 
dirname = "."; + if (!filename) { + static char tmpfile[] = "allocXXXXXX"; + + mkstemp(tmpfile); + filename = malloc(strlen(tmpfile) + strlen(dirname) + 2); + sprintf(filename, "%s/%s", dirname, tmpfile); + unlinkit = 1; + } + oflags = O_RDWR | O_CREAT | (tflag ? O_TRUNC : 0); + fd = open(filename, oflags, 0666); + if (!nflag) { + printf("alloc:\n"); + printf(" filename %s\n", filename); + } + if (fd < 0) { + perror(filename); + exit(1); + } + if (!blocksize) { + if (fstatvfs64(fd, &svfs) < 0) { + perror(filename); + status = 1; + goto done; + } + blocksize = (int)svfs.f_bsize; + } + if (blocksize<0) { + fprintf(stderr,"illegal blocksize %d\n", blocksize); + status = 1; + goto done; + } + printf(" blocksize %d\n", blocksize); + if (rflag) { + struct fsxattr a; + +#ifdef XFS_IOC_FSGETXATTR + if (xfsctl(filename, fd, XFS_IOC_FSGETXATTR, &a) < 0) { + perror("XFS_IOC_FSGETXATTR"); + status = 1; + goto done; + } +#else +#ifdef F_FSGETXATTR + if (fcntl(fd, F_FSGETXATTR, &a) < 0) { + perror("F_FSGETXATTR"); + status = 1; + goto done; + } +#else +bozo! +#endif +#endif + + a.fsx_xflags |= XFS_XFLAG_REALTIME; + +#ifdef XFS_IOC_FSSETXATTR + if (xfsctl(filename, fd, XFS_IOC_FSSETXATTR, &a) < 0) { + perror("XFS_IOC_FSSETXATTR"); + status = 1; + goto done; + } +#else +#ifdef F_FSSETXATTR + if (fcntl(fd, F_FSSETXATTR, &a) < 0) { + perror("F_FSSETXATTR"); + status = 1; + goto done; + } +#else +bozo! 
+#endif +#endif + } + while (!done) { + char *p; + + if (!nflag) printf("alloc> "); + fflush(stdout); + if (!fgets(line, 1024, stdin)) break; + + p=line+strlen(line); + if (p!=line&&p[-1]=='\n') p[-1]=0; + + opno = 0; + switch (line[0]) { + case 'f': + opno++; + case 'p': + opno++; + case 'a': + v = strtoll(&line[2], &p, 0); + if (*p == 'b') { + off = FSBTOOFF(v); + p++; + } else + off = v; + if (*p == '\0') + v = -1; + else + v = strtoll(p, &p, 0); + if (*p == 'b') { + len = FSBTOOFF(v); + p++; + } else + len = v; + + printf(" CMD %s, off=%lld, len=%lld\n", + opnames[opno], (long long)off, (long long)len); + + c = syscall(__NR_fallocate, fd, optab[opno], off, len); + + if (c < 0) { + perror(opnames[opno]); + break; + } + + map(off,len); + break; + case 'm': + p = &line[1]; + v = strtoll(p, &p, 0); + if (*p == 'b') { + off = FSBTOOFF(v); + p++; + } else + off = v; + if (*p == '\0') + len = -1; + else { + v = strtoll(p, &p, 0); + if (*p == 'b') + len = FSBTOOFF(v); + else + len = v; + } + map(off,len); + break; + case 't': + p = &line[1]; + v = strtoll(p, &p, 0); + if (*p == 'b') + off = FSBTOOFF(v); + else + off = v; + printf(" TRUNCATE off=%lld\n", (long long)off); + if (ftruncate64(fd, off) < 0) { + perror("ftruncate"); + break; + } + break; + case 's': + printf(" SYNC\n"); + fsync(fd); + break; + case 'q': + printf(" QUIT\n"); + done = 1; + break; + case '?': + case 'h': + usage(); + break; + default: + printf("unknown command '%s'\n", line); + break; + } + } + if (!nflag) printf("\n"); +done: + if (fd != -1) + close(fd); + if (unlinkit) + unlink(filename); + exit(status); + /* NOTREACHED */ +} Index: xfs-cmds/xfstests/src/Makefile =================================================================== --- xfs-cmds.orig/xfstests/src/Makefile 2007-04-23 16:22:06.000000000 +1000 +++ xfs-cmds/xfstests/src/Makefile 2007-04-30 12:18:42.126750949 +1000 @@ -14,7 +14,7 @@ TARGETS = dirstress fill fill2 getpagesi LINUX_TARGETS = loggen xfsctl bstat t_mtab getdevicesize \ 
preallo_rw_pattern_reader preallo_rw_pattern_writer ftrunc trunc \ - fs_perms testx looptest locktest unwritten_mmap + fs_perms testx looptest locktest unwritten_mmap falloc IRIX_TARGETS = open_unlink @@ -98,7 +98,10 @@ trunc: trunc.o fs_perms: fs_perms.o $(LINKTEST) - + +falloc: falloc.o + $(LINKTEST) + testx: testx.o $(LINKTEST) ^ permalink raw reply [flat|nested] 11+ messages in thread
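Dave Chinner's question about mmap writes into unwritten extents can be approximated from user space with a sketch like the one below. It is not the xfstests unwritten_mmap.c program and makes several assumptions: it uses posix_fallocate() for the reservation (which libc may emulate by writing zeros, in which case no unwritten extents are involved), and mmap_write_prealloc() is a hypothetical helper name. The read-back is normally served from the page cache, so a passing result says nothing conclusive about on-disk state unless caches are dropped or the filesystem is remounted first.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Preallocate len bytes on fd, dirty them through a shared mapping,
 * flush, and read them back with pread().
 * Returns 0 if every byte reads back as `fill`, -1 otherwise.
 */
int mmap_write_prealloc(int fd, size_t len, unsigned char fill)
{
    unsigned char buf[4096];
    unsigned char *map;
    size_t done = 0;
    int err;

    assert(len > 0);

    /* Reserve space without writing data through write(). */
    err = posix_fallocate(fd, 0, (off_t)len);
    if (err != 0) {
        errno = err;
        return -1;
    }

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    /* Each first touch of a page takes a write fault on the mapping. */
    memset(map, fill, len);

    if (msync(map, len, MS_SYNC) < 0) {
        munmap(map, len);
        return -1;
    }
    if (munmap(map, len) < 0 || fsync(fd) < 0)
        return -1;

    /* Verify the contents through the regular read path. */
    while (done < len) {
        size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t got = pread(fd, buf, want, (off_t)done);
        ssize_t i;

        if (got <= 0)
            return -1;
        for (i = 0; i < got; i++)
            if (buf[i] != fill)
                return -1;
        done += (size_t)got;
    }
    return 0;
}
```

Running this against a nearly full filesystem with delayed allocation is one way to look for the late ENOSPC behaviour described in the message above, though the outcome depends on the kernel and filesystem in use.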
* Re: 2.6.21-ext4-1 2007-05-08 23:24 ` 2.6.21-ext4-1 David Chinner @ 2007-05-09 14:36 ` Amit K. Arora 2007-05-09 14:52 ` 2.6.21-ext4-1 Eric Sandeen 0 siblings, 1 reply; 11+ messages in thread From: Amit K. Arora @ 2007-05-09 14:36 UTC (permalink / raw) To: David Chinner Cc: Mingming Cao, Theodore Ts'o, linux-ext4, Johann Lombardi, linux-kernel On Wed, May 09, 2007 at 09:24:49AM +1000, David Chinner wrote: > On Tue, May 08, 2007 at 03:05:56PM -0700, Mingming Cao wrote: > > On Tue, 2007-05-08 at 12:50 +1000, David Chinner wrote: > > > BTW, have you guys tested mmap writes into unwritten extents? ;) > > > > > I am not sure, Amit, have you done some mmap write test into > > uninitialized extents? > > > > Sorry, I still not quite clear what's the mapped problem you are worry > > about. Could you explain to me a bit more? thanks! > > XFS needs a ->page_mkwrite() callout to correctly map pages that > have been dirtied by mmap that span unwritten extents. mmap reads > (i.e. when the fault first occurred) treat unwritten extents like > holes and so we need to remap them when they are dirtied to set all > the unwritten state in the bufferheads correctly for writeback. > > See test 166 here: > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/ > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/src/unwritten_mmap.c Hi David, I updated the above testcase to use fallocate() and ran it on the ext4 (with the fallocate patches applied). 
It threw following message on console : # ./a.out 2000 /home/test/mnt/testfile BUG: at fs/buffer.c:1640 __block_write_full_page() 08c6fcd0: [<080572a7>] dump_stack+0x1b/0x1d 08c6fce8: [<080c0d07>] __block_write_full_page+0xc4/0x24a 08c6fd10: [<080c21e0>] block_write_full_page+0xb0/0xb8 08c6fd40: [<0810c6a3>] ext4_ordered_writepage+0xcb/0x139 08c6fd80: [<08091ba8>] generic_writepages+0x178/0x2a0 08c6fdfc: [<08091cfd>] do_writepages+0x2d/0x38 08c6fe10: [<0808cea2>] __filemap_fdatawrite_range+0x62/0x6d 08c6fe88: [<0808cec5>] filemap_fdatawrite+0x18/0x1d 08c6fea8: [<080bf4bf>] do_fsync+0x26/0x67 08c6fec0: [<080bf521>] __do_fsync+0x21/0x35 08c6fed8: [<080bf542>] sys_fsync+0xd/0xf 08c6fee8: [<08058cac>] handle_syscall+0x8c/0xa4 08c6ff64: [<0806728a>] handle_trap+0xc1/0xc9 08c6ff80: [<08067683>] userspace+0x123/0x166 08c6ffd8: [<080589db>] fork_handler+0xa0/0xa2 08c6fffc: [<a55a5a5a>] 0xa55a5a5a This is coming from: fs/buffer.c 1628 if (block > last_block) { 1629 .......... ... ......... 1639 } else if (!buffer_mapped(bh) && buffer_dirty(bh)) { => 1640 WARN_ON(bh->b_size != blocksize); 1641 err = get_block(inode, block, bh, 1); 1642 .......... ... ......... 1649 } Thus, I think in ext4 also we may need to have ->page_mkwrite implemented. I came across a patch you had submitted couple of months back which implemented a generic block_page_mkwrite() function, to which any file system could hook easily. Here is the link: http://lkml.org/lkml/2007/3/18/198 Any idea when is it going to be in the mainline ? Not sure if it is already part of some -mm kernel, but I did not find it in 2.6.21. Or, since there was a talk of ->fault() replacing ->page_mkwrite() the patch is not in the pipeline now ? And, how does XFS behave now if we write to mmapped preallocated blocks, since XFS also doesn't have ->page_mkwrite() implemented as of date ? Thanks! 
-- 
Regards,
Amit Arora
>
> The same behaviour is needed for delalloc extents to prevent ENOSPC
> errors on writeback - the mmap write needs to do the freespace
> accounting at the time the page is dirtied and that can only be done
> through the ->page_mkwrite callout. Otherwise ENOSPC will occur in
> the writeback path and that is a major pain....
>
> This may not be a problem for ext4, but I thought I better point
> out a couple of the more subtle problems mmap can introduce....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
>
>
> ---
>  xfstests/src/Makefile | 7
>  xfstests/src/falloc.c | 376 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 381 insertions(+), 2 deletions(-)
>
> Index: xfs-cmds/xfstests/src/falloc.c
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ xfs-cmds/xfstests/src/falloc.c 2007-04-30 12:41:13.862302450 +1000
> @@ -0,0 +1,376 @@
> +/*
> + * Copyright (c) 2000-2003,2007 Silicon Graphics, Inc.
> + * All Rights Reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation,
> + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "global.h"
> +
> +/* should end up include somewhere */
> +#define FA_ALLOCATE 0x1
> +#define FA_DEALLOCATE 0x2
> +#define FA_PREALLOCATE 0x3
> +
> +/* ia64 */
> +#define __NR_fallocate 1305
> +/*
> + * Block I/O parameterization. A basic block (BB) is the lowest size of
> + * filesystem allocation, and must equal 512. Length units given to bio
> + * routines are in BB's.
> + */
> +
> +/* Assume that if we have BTOBB, then we have the rest */
> +#ifndef BTOBB
> +#define BBSHIFT 9
> +#define BBSIZE (1<<BBSHIFT)
> +#define BBMASK (BBSIZE-1)
> +#define BTOBB(bytes) (((__u64)(bytes) + BBSIZE - 1) >> BBSHIFT)
> +#define BTOBBT(bytes) ((__u64)(bytes) >> BBSHIFT)
> +#define BBTOB(bbs) ((bbs) << BBSHIFT)
> +#define OFFTOBBT(bytes) ((__u64)(bytes) >> BBSHIFT)
> +
> +#define SEEKLIMIT32 0x7fffffff
> +#define BBSEEKLIMIT32 BTOBBT(SEEKLIMIT32)
> +#define SEEKLIMIT 0x7fffffffffffffffLL
> +#define BBSEEKLIMIT OFFTOBBT(SEEKLIMIT)
> +#endif
> +
> +#ifndef OFFTOBB
> +#define OFFTOBB(bytes) (((__u64)(bytes) + BBSIZE - 1) >> BBSHIFT)
> +#define BBTOOFF(bbs) ((__u64)(bbs) << BBSHIFT)
> +#endif
> +
> +#define FSBTOBB(f) (OFFTOBBT(FSBTOOFF(f)))
> +#define BBTOFSB(b) (OFFTOFSB(BBTOOFF(b)))
> +#define OFFTOFSB(o) ((o) / blocksize)
> +#define FSBTOOFF(f) ((f) * blocksize)
> +
> +void usage(void)
> +{
> +	printf("usage: alloc [-b blocksize] [-d dir] [-f file] [-n] [-r] [-t]\n"
> +	       "flags:\n"
> +	       " -n - non-interractive mode\n"
> +	       " -r - real time file\n"
> +	       " -t - truncate on open\n"
> +	       "\n"
> +	       "commands:\n"
> +	       " a [offset] [length] - alloc\n"
> +	       " p [offset] [length] - prealloc\n"
> +	       " f [offset] [length] - free\n"
> +	       " m [offset] [length] - print map\n"
> +	       " s - sync file\n"
> +	       " t [offset] - truncate\n"
> +	       " q - quit\n"
> +	       " h/? - this help\n");
> +
> +}
> +
> +int fd = -1;
> +int blocksize;
> +char *filename;
> +
> +/* params are in bytes */
> +void map(off64_t off, off64_t len)
> +{
> +	struct getbmap bm[2];
> +
> +	bzero(bm, sizeof(bm));
> +
> +	bm[0].bmv_count = 2;
> +	bm[0].bmv_offset = OFFTOBB(off);
> +	if (len==(off64_t)-1) { /* unsigned... */
> +		bm[0].bmv_length = -1;
> +		printf(" MAP off=%lld, len=%lld [%lld-]\n",
> +			(long long)off, (long long)len,
> +			(long long)BBTOFSB(bm[0].bmv_offset));
> +	} else {
> +		bm[0].bmv_length = OFFTOBB(len);
> +		printf(" MAP off=%lld, len=%lld [%lld,%lld]\n",
> +			(long long)off, (long long)len,
> +			(long long)BBTOFSB(bm[0].bmv_offset),
> +			(long long)BBTOFSB(bm[0].bmv_length));
> +	}
> +
> +	printf(" [ofs,count]: start..end\n");
> +	for (;;) {
> +#ifdef XFS_IOC_GETBMAP
> +		if (xfsctl(filename, fd, XFS_IOC_GETBMAP, bm) < 0) {
> +#else
> +#ifdef F_GETBMAP
> +		if (fcntl(fd, F_GETBMAP, bm) < 0) {
> +#else
> +bozo!
> +#endif
> +#endif
> +			perror("getbmap");
> +			break;
> +		}
> +
> +		if (bm[0].bmv_entries == 0)
> +			break;
> +
> +		printf(" [%lld,%lld]: ",
> +			(long long)BBTOFSB(bm[1].bmv_offset),
> +			(long long)BBTOFSB(bm[1].bmv_length));
> +
> +		if (bm[1].bmv_block == -1)
> +			printf("hole");
> +		else
> +			printf("%lld..%lld",
> +				(long long)BBTOFSB(bm[1].bmv_block),
> +				(long long)BBTOFSB(bm[1].bmv_block +
> +					bm[1].bmv_length - 1));
> +		printf("\n");
> +	}
> +}
> +
> +int
> +main(int argc, char **argv)
> +{
> +	int c;
> +	char *dirname = NULL;
> +	int done = 0;
> +	int status = 0;
> +	struct flock64 f;
> +	off64_t len;
> +	char line[1024];
> +	off64_t off;
> +	int oflags;
> +	static char *opnames[] = { "alloc",
> +				   "prealloc",
> +				   "free" };
> +	int opno;
> +	static int optab[] = { FA_ALLOCATE,
> +			       FA_PREALLOCATE,
> +			       FA_DEALLOCATE };
> +	int rflag = 0;
> +	struct statvfs64 svfs;
> +	int tflag = 0;
> +	int nflag = 0;
> +	int unlinkit = 0;
> +	__int64_t v;
> +
> +	while ((c = getopt(argc, argv, "b:d:f:rtn")) != -1) {
> +		switch (c) {
> +		case 'b':
> +			blocksize = atoi(optarg);
> +			break;
> +		case 'd':
> +			if (filename) {
> +				printf("can't specify both -d and -f\n");
> +				exit(1);
> +			}
> +			dirname = optarg;
> +			break;
> +		case 'f':
> +			if (dirname) {
> +				printf("can't specify both -d and -f\n");
> +				exit(1);
> +			}
> +			filename = optarg;
> +			break;
> +		case 'r':
> +			rflag = 1;
> +			break;
> +		case 't':
> +			tflag = 1;
> +			break;
> +		case 'n':
> +			nflag++;
> +			break;
> +		default:
> +			printf("unknown option\n");
> +			usage();
> +			exit(1);
> +		}
> +	}
> +	if (!dirname && !filename)
> +		dirname = ".";
> +	if (!filename) {
> +		static char tmpfile[] = "allocXXXXXX";
> +
> +		mkstemp(tmpfile);
> +		filename = malloc(strlen(tmpfile) + strlen(dirname) + 2);
> +		sprintf(filename, "%s/%s", dirname, tmpfile);
> +		unlinkit = 1;
> +	}
> +	oflags = O_RDWR | O_CREAT | (tflag ? O_TRUNC : 0);
> +	fd = open(filename, oflags, 0666);
> +	if (!nflag) {
> +		printf("alloc:\n");
> +		printf(" filename %s\n", filename);
> +	}
> +	if (fd < 0) {
> +		perror(filename);
> +		exit(1);
> +	}
> +	if (!blocksize) {
> +		if (fstatvfs64(fd, &svfs) < 0) {
> +			perror(filename);
> +			status = 1;
> +			goto done;
> +		}
> +		blocksize = (int)svfs.f_bsize;
> +	}
> +	if (blocksize<0) {
> +		fprintf(stderr,"illegal blocksize %d\n", blocksize);
> +		status = 1;
> +		goto done;
> +	}
> +	printf(" blocksize %d\n", blocksize);
> +	if (rflag) {
> +		struct fsxattr a;
> +
> +#ifdef XFS_IOC_FSGETXATTR
> +		if (xfsctl(filename, fd, XFS_IOC_FSGETXATTR, &a) < 0) {
> +			perror("XFS_IOC_FSGETXATTR");
> +			status = 1;
> +			goto done;
> +		}
> +#else
> +#ifdef F_FSGETXATTR
> +		if (fcntl(fd, F_FSGETXATTR, &a) < 0) {
> +			perror("F_FSGETXATTR");
> +			status = 1;
> +			goto done;
> +		}
> +#else
> +bozo!
> +#endif
> +#endif
> +
> +		a.fsx_xflags |= XFS_XFLAG_REALTIME;
> +
> +#ifdef XFS_IOC_FSSETXATTR
> +		if (xfsctl(filename, fd, XFS_IOC_FSSETXATTR, &a) < 0) {
> +			perror("XFS_IOC_FSSETXATTR");
> +			status = 1;
> +			goto done;
> +		}
> +#else
> +#ifdef F_FSSETXATTR
> +		if (fcntl(fd, F_FSSETXATTR, &a) < 0) {
> +			perror("F_FSSETXATTR");
> +			status = 1;
> +			goto done;
> +		}
> +#else
> +bozo!
> +#endif
> +#endif
> +	}
> +	while (!done) {
> +		char *p;
> +
> +		if (!nflag) printf("alloc> ");
> +		fflush(stdout);
> +		if (!fgets(line, 1024, stdin)) break;
> +
> +		p=line+strlen(line);
> +		if (p!=line&&p[-1]=='\n') p[-1]=0;
> +
> +		opno = 0;
> +		switch (line[0]) {
> +		case 'f':
> +			opno++;
> +		case 'p':
> +			opno++;
> +		case 'a':
> +			v = strtoll(&line[2], &p, 0);
> +			if (*p == 'b') {
> +				off = FSBTOOFF(v);
> +				p++;
> +			} else
> +				off = v;
> +			if (*p == '\0')
> +				v = -1;
> +			else
> +				v = strtoll(p, &p, 0);
> +			if (*p == 'b') {
> +				len = FSBTOOFF(v);
> +				p++;
> +			} else
> +				len = v;
> +
> +			printf(" CMD %s, off=%lld, len=%lld\n",
> +				opnames[opno], (long long)off, (long long)len);
> +
> +			c = syscall(__NR_fallocate, fd, optab[opno], off, len);
> +
> +			if (c < 0) {
> +				perror(opnames[opno]);
> +				break;
> +			}
> +
> +			map(off,len);
> +			break;
> +		case 'm':
> +			p = &line[1];
> +			v = strtoll(p, &p, 0);
> +			if (*p == 'b') {
> +				off = FSBTOOFF(v);
> +				p++;
> +			} else
> +				off = v;
> +			if (*p == '\0')
> +				len = -1;
> +			else {
> +				v = strtoll(p, &p, 0);
> +				if (*p == 'b')
> +					len = FSBTOOFF(v);
> +				else
> +					len = v;
> +			}
> +			map(off,len);
> +			break;
> +		case 't':
> +			p = &line[1];
> +			v = strtoll(p, &p, 0);
> +			if (*p == 'b')
> +				off = FSBTOOFF(v);
> +			else
> +				off = v;
> +			printf(" TRUNCATE off=%lld\n", (long long)off);
> +			if (ftruncate64(fd, off) < 0) {
> +				perror("ftruncate");
> +				break;
> +			}
> +			break;
> +		case 's':
> +			printf(" SYNC\n");
> +			fsync(fd);
> +			break;
> +		case 'q':
> +			printf(" QUIT\n");
> +			done = 1;
> +			break;
> +		case '?':
> +		case 'h':
> +			usage();
> +			break;
> +		default:
> +			printf("unknown command '%s'\n", line);
> +			break;
> +		}
> +	}
> +	if (!nflag) printf("\n");
> +done:
> +	if (fd != -1)
> +		close(fd);
> +	if (unlinkit)
> +		unlink(filename);
> +	exit(status);
> +	/* NOTREACHED */
> +}
> Index: xfs-cmds/xfstests/src/Makefile
> ===================================================================
> --- xfs-cmds.orig/xfstests/src/Makefile 2007-04-23 16:22:06.000000000 +1000
> +++ xfs-cmds/xfstests/src/Makefile 2007-04-30 12:18:42.126750949 +1000
> @@ -14,7 +14,7 @@ TARGETS = dirstress fill fill2 getpagesi
>
>  LINUX_TARGETS = loggen xfsctl bstat t_mtab getdevicesize \
>  	preallo_rw_pattern_reader preallo_rw_pattern_writer ftrunc trunc \
> -	fs_perms testx looptest locktest unwritten_mmap
> +	fs_perms testx looptest locktest unwritten_mmap falloc
>
>  IRIX_TARGETS = open_unlink
>
> @@ -98,7 +98,10 @@ trunc: trunc.o
>
>  fs_perms: fs_perms.o
>  	$(LINKTEST)
> -
> +
> +falloc: falloc.o
> +	$(LINKTEST)
> +
>  testx: testx.o
>  	$(LINKTEST)
^ permalink raw reply [flat|nested] 11+ messages in thread
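The falloc test program quoted above works in two unit systems: 512-byte basic blocks (the BBSHIFT macros) and filesystem blocks (a trailing 'b' on a command argument). A minimal standalone sketch of that arithmetic follows. The function names are hypothetical and not part of the patch, and blocksize is passed as a parameter instead of read from the global the patch uses.

```c
#include <stdint.h>
#include <stdlib.h>

#define BB_SHIFT 9 /* 512-byte basic block, same value as the patch's BBSHIFT */

/* Round a byte count up to whole basic blocks (same arithmetic as BTOBB/OFFTOBB). */
uint64_t bytes_to_bb_up(uint64_t bytes)
{
	return (bytes + (1ULL << BB_SHIFT) - 1) >> BB_SHIFT;
}

/* Truncate a byte count to whole basic blocks (same arithmetic as BTOBBT/OFFTOBBT). */
uint64_t bytes_to_bb_down(uint64_t bytes)
{
	return bytes >> BB_SHIFT;
}

/*
 * Parse "<number>[b]" the way the command loop does: a trailing 'b'
 * scales the value by the filesystem block size, otherwise it is bytes.
 * *end is left pointing just past what was consumed.
 */
long long parse_units(const char *s, long long blocksize, char **end)
{
	long long v = strtoll(s, end, 0);

	if (**end == 'b') {
		v *= blocksize;
		(*end)++;
	}
	return v;
}
```

With a 4096-byte block size, an argument of "4b" would come out as 16384 bytes, while "0x10" stays 16 bytes because strtoll is called with base 0.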
* Re: 2.6.21-ext4-1
2007-05-09 14:36 ` 2.6.21-ext4-1 Amit K. Arora
@ 2007-05-09 14:52 ` Eric Sandeen
0 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2007-05-09 14:52 UTC (permalink / raw)
To: Amit K. Arora
Cc: David Chinner, Mingming Cao, Theodore Ts'o, linux-ext4, Johann Lombardi, linux-kernel
Amit K. Arora wrote:
> And, how does XFS behave now if we write to mmapped preallocated blocks,
> since XFS also doesn't have ->page_mkwrite() implemented as of date ?
unwritten extents remain unwritten after mmap() modifies them
http://oss.sgi.com/bugzilla/show_bug.cgi?id=418 :)
-Eric
^ permalink raw reply [flat|nested] 11+ messages in thread
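The exchange above concerns writes that reach preallocated blocks through a shared mapping. A minimal userspace sketch of that access pattern follows, assuming a POSIX system. It uses posix_fallocate() as a stand-in, since the calling convention of the fallocate system call was still unsettled in this thread. The function name is hypothetical. The check only confirms that the data reads back through the file descriptor; it does not inspect on-disk extent state, so it cannot by itself detect the bug linked above.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Preallocate len bytes in a temporary file, write into the range through
 * a MAP_SHARED mapping, flush with msync(), and read the bytes back with
 * pread(). Returns 0 if the data matches, -1 on any failure.
 */
int mmap_write_preallocated(size_t len)
{
	static const char msg[] = "written through mmap";
	char path[] = "/tmp/prealloc-mmapXXXXXX";
	char buf[sizeof(msg)];
	char *map;
	int ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path); /* the file disappears once fd is closed */

	/* posix_fallocate() returns an error number, not -1/errno */
	if (len < sizeof(msg) || posix_fallocate(fd, 0, (off_t)len) != 0)
		goto out;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	memcpy(map, msg, sizeof(msg)); /* dirties the first page of the mapping */

	if (msync(map, len, MS_SYNC) == 0 &&
	    pread(fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(buf) &&
	    memcmp(buf, msg, sizeof(msg)) == 0)
		ret = 0;

	munmap(map, len);
out:
	close(fd);
	return ret;
}
```

The memcpy() into the mapping is the point at which a ->page_mkwrite callout, where a filesystem implements one, would get the chance to do space accounting or extent conversion before the page is dirtied.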
end of thread, other threads:[~2007-05-09 14:55 UTC | newest]
Thread overview: 11+ messages
2007-04-30 15:14 2.6.21-ext4-1 Theodore Ts'o
2007-04-30 15:58 ` 2.6.21-ext4-1 Theodore Tso
2007-04-30 16:24 ` 2.6.21-ext4-1 Alex Tomas
2007-04-30 17:16 ` 2.6.21-ext4-1 Jeff Garzik
2007-04-30 17:45 ` 2.6.21-ext4-1 Theodore Tso
2007-05-07 20:56 ` 2.6.21-ext4-1 Mingming Cao
2007-05-08 2:50 ` 2.6.21-ext4-1 David Chinner
2007-05-08 22:05 ` 2.6.21-ext4-1 Mingming Cao
2007-05-08 23:24 ` 2.6.21-ext4-1 David Chinner
2007-05-09 14:36 ` 2.6.21-ext4-1 Amit K. Arora
2007-05-09 14:52 ` 2.6.21-ext4-1 Eric Sandeen