* Using O_DIRECT in ext4
From: Xiang Wang @ 2009-07-21 1:41 UTC
To: linux-ext4

Hi,

Recently I've been experimenting with O_DIRECT in ext4 to get a feel
for how much file fragmentation will be generated.

On a newly formatted ext4 partition (no journal), I created a top-level
directory and under this top-level directory I ran a test program to
generate some files.

The test program does the following:
-- creates multiple threads (in my test case: 16 threads)
-- each thread creates a file with the O_DIRECT flag and keeps
   extending the file to 1MB
Since these threads run concurrently, they compete in block allocation.

After the program ran to completion, I ran filefrag on each file and
measured how many extents there are in the file.
And here is a sample result:
file0: 6 extents found
file1: 20 extents found
file2: 7 extents found
file3: 6 extents found
file4: 6 extents found
file5: 5 extents found
file6: 6 extents found
file7: 20 extents found
file8: 20 extents found
file9: 20 extents found
file10: 20 extents found
file11: 20 extents found
file12: 20 extents found
file13: 19 extents found
file14: 19 extents found
file15: 19 extents found

Looks like these files are quite heavily fragmented.

For comparison, I did the same experiment on an ext2 partition,
resulting in each file having only 1 extent.

I also ran the experiment using buffered writes (by removing the
O_DIRECT flag) on ext2 and ext4, both resulting in each file having
only 1 extent.

I am wondering whether this kind of file fragmentation is already a
known issue in ext4 when O_DIRECT is used? Is it something by design?
Since it seems like ext2 does not have this issue under my test case,
is it necessary that we make the behavior of ext4 similar to ext2
under situations like this?

Thanks,
Xiang

* Re: Using O_DIRECT in ext4
From: Eric Sandeen @ 2009-07-21 3:41 UTC
To: Xiang Wang
Cc: linux-ext4

Xiang Wang wrote:
> Hi,
>
> Recently I've been experimenting with O_DIRECT in ext4 to get a
> feel for how much file fragmentation will be generated.
>
> On a newly formatted ext4 partition (no journal), I created a top-level
> directory and under this top-level directory I ran a test program to
> generate some files.
>
> The test program does the following:
> -- creates multiple threads (in my test case: 16 threads)
> -- each thread creates a file with the O_DIRECT flag and keeps
>    extending the file to 1MB
> Since these threads run concurrently, they compete in block allocation.
>
> After the program ran to completion, I ran filefrag on each file and
> measured how many extents there are in the file.
> And here is a sample result:
> file0: 6 extents found
> file1: 20 extents found
> file2: 7 extents found
> file3: 6 extents found
> file4: 6 extents found
> file5: 5 extents found
> file6: 6 extents found
> file7: 20 extents found
> file8: 20 extents found
> file9: 20 extents found
> file10: 20 extents found
> file11: 20 extents found
> file12: 20 extents found
> file13: 19 extents found
> file14: 19 extents found
> file15: 19 extents found
>
> Looks like these files are quite heavily fragmented.

Multiple parallel extending DIOs in a single dir is a tough case for a
filesystem - it has no hints about what to do, and can't use delalloc to
wait to see what's happening; it just has to allocate things as they
come, more or less.

> For comparison, I did the same experiment on an ext2 partition,
> resulting in each file having only 1 extent.

Interesting, not sure I would have expected that.

> I also ran the experiment using buffered writes (by removing the
> O_DIRECT flag) on ext2 and ext4, both resulting in each file having
> only 1 extent.

Delayed allocation at work, I suppose.

> I am wondering whether this kind of file fragmentation is already a
> known issue in ext4 when O_DIRECT is used? Is it something by design?
> Since it seems like ext2 does not have this issue under my test case,
> is it necessary that we make the behavior of ext4 similar to ext2
> under situations like this?

Is this representative of a real workload?

-Eric

> Thanks,
> Xiang

* Re: Using O_DIRECT in ext4
From: Curt Wohlgemuth @ 2009-07-21 14:45 UTC
To: Eric Sandeen
Cc: Xiang Wang, linux-ext4

On Mon, Jul 20, 2009 at 8:41 PM, Eric Sandeen<sandeen@redhat.com> wrote:
> Xiang Wang wrote:
>> Hi,
>>
>> Recently I've been experimenting with O_DIRECT in ext4 to get a
>> feel for how much file fragmentation will be generated.
>>
>> On a newly formatted ext4 partition (no journal), I created a top-level
>> directory and under this top-level directory I ran a test program to
>> generate some files.
>>
>> The test program does the following:
>> -- creates multiple threads (in my test case: 16 threads)
>> -- each thread creates a file with the O_DIRECT flag and keeps
>>    extending the file to 1MB
>> Since these threads run concurrently, they compete in block allocation.
>>
>> After the program ran to completion, I ran filefrag on each file and
>> measured how many extents there are in the file.
>> And here is a sample result:
>> file0: 6 extents found
>> file1: 20 extents found
>> file2: 7 extents found
>> file3: 6 extents found
>> file4: 6 extents found
>> file5: 5 extents found
>> file6: 6 extents found
>> file7: 20 extents found
>> file8: 20 extents found
>> file9: 20 extents found
>> file10: 20 extents found
>> file11: 20 extents found
>> file12: 20 extents found
>> file13: 19 extents found
>> file14: 19 extents found
>> file15: 19 extents found
>>
>> Looks like these files are quite heavily fragmented.
>
> Multiple parallel extending DIOs in a single dir is a tough case for a
> filesystem - it has no hints about what to do, and can't use delalloc to
> wait to see what's happening; it just has to allocate things as they
> come, more or less.
>
>> For comparison, I did the same experiment on an ext2 partition,
>> resulting in each file having only 1 extent.
>
> Interesting, not sure I would have expected that.

Same with us; we're looking into more variables to understand it.

>> I also ran the experiment using buffered writes (by removing the
>> O_DIRECT flag) on ext2 and ext4, both resulting in each file having
>> only 1 extent.
>
> Delayed allocation at work, I suppose.
>
>> I am wondering whether this kind of file fragmentation is already a
>> known issue in ext4 when O_DIRECT is used? Is it something by design?
>> Since it seems like ext2 does not have this issue under my test case,
>> is it necessary that we make the behavior of ext4 similar to ext2
>> under situations like this?
>
> Is this representative of a real workload?

Not exactly perhaps, but we do have apps that are showing significantly
more fragmentation in their files on ext4 than with ext2, while using
O_DIRECT (e.g., 8 extents on ext4 vs 1 on ext2, as reported by
filefrag). The experiment above is synthetic, but fairly
representative.

(Hence the related questions about fallocate, since this is one
possible, though ugly, workaround.)

Curt

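[Editorial sketch, not part of the original message.] The fallocate
workaround mentioned above could look roughly like the following. This
is a minimal illustration only: the helper name open_preallocated() and
its parameters are hypothetical, and whether preallocation actually
reduces the extent count on a given kernel has to be checked with
filefrag. It relies only on the standard posix_fallocate() call.

```c
#define _GNU_SOURCE             /* lets callers pass O_DIRECT in extra_flags */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) `path` and reserve `len` bytes of
 * disk space up front, so that later writes fill already-allocated blocks.
 * `extra_flags` is OR-ed into the open flags (for example O_DIRECT).
 * Returns an open file descriptor, or -1 with errno set.
 */
int open_preallocated(const char *path, int extra_flags, off_t len)
{
    /* O_CREAT requires an explicit mode argument. */
    int fd = open(path, O_RDWR | O_CREAT | extra_flags, 0644);
    if (fd == -1)
        return -1;

    /* posix_fallocate() returns the error number; it does not set errno. */
    int err = posix_fallocate(fd, 0, len);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }
    return fd;
}
```

Note that posix_fallocate() also sets the file size to len, so a writer
that opens the file with O_APPEND would land after the preallocated
region. With this approach the data should be written from offset 0
(plain write() on the returned descriptor, or pwrite() at explicit
offsets).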
* Re: Using O_DIRECT in ext4
From: Eric Sandeen @ 2009-07-21 16:38 UTC
To: Curt Wohlgemuth
Cc: Xiang Wang, linux-ext4

Curt Wohlgemuth wrote:
> On Mon, Jul 20, 2009 at 8:41 PM, Eric Sandeen<sandeen@redhat.com> wrote:
>> Xiang Wang wrote:

>>> For comparison, I did the same experiment on an ext2 partition,
>>> resulting in each file having only 1 extent.
>> Interesting, not sure I would have expected that.
>
> Same with us; we're looking into more variables to understand it.

To be more clear, I would not have expected ext2 to deal well with it
either, is more what I meant ;)  I'm not terribly surprised that ext4
gets fragmented.

For the numbers posted, how big were the files (how many 1M chunks were
written)?

Just FWIW; I did something like:

# for I in `seq 1 16`; do dd if=/dev/zero of=testfile$I bs=1M count=16 oflag=direct & done

on a rhel5.4 beta kernel and got:

~5 extents per file on ext4 (per filefrag output)
between 41 and 234 extents on ext2.
~6 extents per file on ext3.
~16 extents per file on xfs

If I created a subdir for each file:

# for I in `seq 1 16`; do mkdir dir$I; dd if=/dev/zero of=dir$I/testfile$I bs=1M count=16 oflag=direct & done

~5 extents per file on ext4
1 or 2 extents per file on ext2
1 or 2 extents per file on ext3
~16 extents per file on xfs.

-Eric

* Re: Using O_DIRECT in ext4
From: Xiang Wang @ 2009-07-21 20:46 UTC
To: Eric Sandeen
Cc: Curt Wohlgemuth, linux-ext4

On Tue, Jul 21, 2009 at 9:38 AM, Eric Sandeen<sandeen@redhat.com> wrote:
> Curt Wohlgemuth wrote:
>> On Mon, Jul 20, 2009 at 8:41 PM, Eric Sandeen<sandeen@redhat.com> wrote:
>>> Xiang Wang wrote:
>
>>>> For comparison, I did the same experiment on an ext2 partition,
>>>> resulting in each file having only 1 extent.
>>> Interesting, not sure I would have expected that.
>>
>> Same with us; we're looking into more variables to understand it.
>
> To be more clear, I would not have expected ext2 to deal well with it
> either, is more what I meant ;)  I'm not terribly surprised that ext4
> gets fragmented.
>
> For the numbers posted, how big were the files (how many 1M chunks were
> written)?
>
> Just FWIW; I did something like:
>
> # for I in `seq 1 16`; do dd if=/dev/zero of=testfile$I bs=1M count=16
> oflag=direct & done
>
> on a rhel5.4 beta kernel and got:
>
> ~5 extents per file on ext4 (per filefrag output)
> between 41 and 234 extents on ext2.
> ~6 extents per file on ext3.
> ~16 extents per file on xfs
>

I repeated this test (bs=1M count=16) by tuning some parameters in my
test program. I got the following results (per filefrag output):

ext4:
5 extents per file

ext2:
file0: 5 extents found, perfection would be 1 extent
file1: 5 extents found, perfection would be 1 extent
file2: 6 extents found, perfection would be 1 extent
file3: 4 extents found, perfection would be 1 extent
file4: 4 extents found, perfection would be 1 extent
file5: 6 extents found, perfection would be 1 extent
file6: 4 extents found, perfection would be 1 extent
file7: 5 extents found, perfection would be 1 extent
file8: 6 extents found, perfection would be 1 extent
file9: 4 extents found, perfection would be 1 extent
file10: 5 extents found, perfection would be 1 extent
file11: 6 extents found, perfection would be 1 extent
file12: 6 extents found, perfection would be 1 extent
file13: 8 extents found, perfection would be 1 extent
file14: 4 extents found, perfection would be 1 extent
file15: 7 extents found, perfection would be 1 extent

The results on ext4 look comparable to yours while the results on ext2
look very different.

I am attaching the test program I use in case you want to try it. It is
at the end of the message.
I invoked it like: ./mt_writes 16 1 to have 16 threads writing using
O_DIRECT.

> if I created a subdir for each file:
>
> # for I in `seq 1 16`; do mkdir dir$I; dd if=/dev/zero
> of=dir$I/testfile$I bs=1M count=16 oflag=direct & done
>
> ~5 extents per file on ext4
> 1 or 2 extents per file on ext2
> 1 or 2 extents per file on ext3
> ~16 extents per file on xfs.
>
> -Eric
>

======
/*
 * mt_write.c -- multiple threads extending files concurrently.
 */
#define _GNU_SOURCE	/* exposes O_DIRECT; must come before any include */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <pthread.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MAX_THREAD 1000
#define BUFSIZE 1048576
#define COUNT 16

typedef struct {
	int id;
	int odirect;
} parm;

void *expand(void *arg)
{
	void *buf;
	char fname[16];
	int fd;
	int i;
	ssize_t count;
	int flags = O_RDWR | O_CREAT | O_APPEND;
	parm *p = (parm *)arg;

	/* O_DIRECT needs to work with aligned memory */
	if (posix_memalign(&buf, 512, BUFSIZE) != 0) {
		fprintf(stderr, "cannot allocate aligned mem!\n");
		return NULL;
	}
	/* do not write uninitialized memory to disk */
	memset(buf, 0, BUFSIZE);

	snprintf(fname, sizeof(fname), "file%d", p->id);
	if (p->odirect)
		flags |= O_DIRECT;
	/* O_CREAT requires an explicit mode argument */
	fd = open(fname, flags, 0644);
	if (fd == -1) {
		fprintf(stderr, "Open %s failed!\n", fname);
		free(buf);
		return NULL;
	}

	for (i = 0; i < COUNT; i++) {
		count = write(fd, buf, BUFSIZE);
		if (count == -1) {
			fprintf(stderr, "Only able to finish %d blocks of data\n", i);
			close(fd);
			free(buf);
			return NULL;
		}
	}
	/* buffered writes: force the data out before reporting completion */
	if (!p->odirect)
		fsync(fd);
	printf("Done with writing %d blocks of data\n", COUNT);
	close(fd);
	free(buf);
	return NULL;
}

int main(int argc, char *argv[])
{
	int n, i, odirect;
	pthread_t *threads;
	pthread_attr_t pthread_custom_attr;
	parm *p;

	if (argc != 3) {
		printf("Usage: %s <# of threads> <O_DIRECT? 1:0>\n", argv[0]);
		exit(1);
	}

	n = atoi(argv[1]);
	odirect = atoi(argv[2]);

	if ((n < 1) || (n > MAX_THREAD)) {
		printf("The # of threads should be between 1 and %d.\n", MAX_THREAD);
		exit(1);
	}

	threads = malloc(n * sizeof(*threads));
	p = malloc(n * sizeof(*p));
	if (threads == NULL || p == NULL) {
		fprintf(stderr, "cannot allocate thread bookkeeping\n");
		exit(1);
	}
	pthread_attr_init(&pthread_custom_attr);

	/* Start up the threads */
	for (i = 0; i < n; i++) {
		p[i].id = i;
		p[i].odirect = odirect;
		pthread_create(&threads[i], &pthread_custom_attr, expand, (void *)(p + i));
	}

	/* Synchronize the completion of each thread. */
	for (i = 0; i < n; i++)
		pthread_join(threads[i], NULL);

	pthread_attr_destroy(&pthread_custom_attr);
	free(threads);
	free(p);
	return 0;
}

* Re: Using O_DIRECT in ext4
From: Frank Mayhar @ 2009-07-21 21:08 UTC
To: Eric Sandeen
Cc: Curt Wohlgemuth, Xiang Wang, linux-ext4

On Tue, 2009-07-21 at 11:38 -0500, Eric Sandeen wrote:
> Curt Wohlgemuth wrote:
> > On Mon, Jul 20, 2009 at 8:41 PM, Eric Sandeen<sandeen@redhat.com> wrote:
> >> Xiang Wang wrote:
>
> >>> For comparison, I did the same experiment on an ext2 partition,
> >>> resulting in each file having only 1 extent.
> >> Interesting, not sure I would have expected that.
> >
> > Same with us; we're looking into more variables to understand it.
>
> To be more clear, I would not have expected ext2 to deal well with it
> either, is more what I meant ;)  I'm not terribly surprised that ext4
> gets fragmented.

Ext2 deals with it via the block reservation code added some time ago.
It turns out it works pretty well for this case. Ext4, of course,
doesn't use the block reservation code.
--
Frank Mayhar <fmayhar@google.com>
Google, Inc.

* Re: Using O_DIRECT in ext4
From: Mingming Cao @ 2009-07-21 23:46 UTC
To: Frank Mayhar
Cc: Eric Sandeen, Curt Wohlgemuth, Xiang Wang, linux-ext4

Frank Mayhar wrote:
> On Tue, 2009-07-21 at 11:38 -0500, Eric Sandeen wrote:
>
>> Curt Wohlgemuth wrote:
>>
>>> On Mon, Jul 20, 2009 at 8:41 PM, Eric Sandeen<sandeen@redhat.com> wrote:
>>>
>>>> Xiang Wang wrote:
>>>>
>>>>> For comparison, I did the same experiment on an ext2 partition,
>>>>> resulting in each file having only 1 extent.
>>>>>
>>>> Interesting, not sure I would have expected that.
>>>>
>>> Same with us; we're looking into more variables to understand it.
>>>
>> To be more clear, I would not have expected ext2 to deal well with it
>> either, is more what I meant ;)  I'm not terribly surprised that ext4
>> gets fragmented.
>>
>
> Ext2 deals with it via the block reservation code added some time ago.
> It turns out it works pretty well for this case. Ext4, of course,
> doesn't use the block reservation code.
>

The ext4 mballoc code uses per-CPU preallocation, so all threads
running on the same CPU that need new blocks will be assigned blocks
next to each other. As a result, the files created by those threads
interleave with each other, causing fragmentation.

Preallocation will help, but that is persistent preallocation.

Mingming

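[Editorial sketch, not part of the original message.] The interleaving
effect described above can be illustrated with a toy model. This is
emphatically not ext4's mballoc or ext2's reservation code: it assumes
writers take strict round-robin turns, that free space is one linear
range, and that each file reserves a fixed "window" of contiguous
chunks at a time. The function name simulate_extents() and all of its
parameters are hypothetical.

```c
#include <stdlib.h>

/*
 * Toy model: `nfiles` writers append one chunk each in round-robin order
 * until every file holds `nchunks` chunks. Whenever a file has used up its
 * reservation, it reserves the next `window` contiguous chunks from a
 * shared cursor. window == 1 means every chunk comes straight from the
 * shared cursor, so neighbouring chunks belong to different files.
 *
 * Returns the number of extents (runs of physically contiguous chunks) in
 * file 0, or -1 on invalid arguments or allocation failure. All files are
 * symmetric in this model, so file 0 is representative.
 */
int simulate_extents(int nfiles, int nchunks, int window)
{
    if (nfiles < 1 || nchunks < 1 || window < 1)
        return -1;

    long *res_next = calloc((size_t)nfiles, sizeof *res_next); /* next reserved chunk */
    long *res_left = calloc((size_t)nfiles, sizeof *res_left); /* chunks left in window */
    if (res_next == NULL || res_left == NULL) {
        free(res_next);
        free(res_left);
        return -1;
    }

    long next_free = 0; /* shared allocation cursor */
    long prev = -1;     /* last chunk handed to file 0 */
    int extents = 0;

    for (int round = 0; round < nchunks; round++) {
        for (int f = 0; f < nfiles; f++) {
            if (res_left[f] == 0) {
                /* Reservation exhausted: take a new window at the cursor. */
                res_next[f] = next_free;
                next_free += window;
                res_left[f] = window;
            }
            long blk = res_next[f]++;
            res_left[f]--;

            /* A new extent starts whenever file 0's chunk is not adjacent
             * to its previous one. */
            if (f == 0) {
                if (round == 0 || blk != prev + 1)
                    extents++;
                prev = blk;
            }
        }
    }

    free(res_next);
    free(res_left);
    return extents;
}
```

In this model, simulate_extents(16, 16, 1) gives 16 extents per file,
while a window covering the whole file, simulate_extents(16, 16, 16),
gives 1. Real allocators are far more elaborate, so the numbers are
only meant to show the direction of the effect, not to predict
filefrag output.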