* Preallocation with direct IO?

From: amit.sahrawat83 @ 2011-12-29 13:10 UTC (permalink / raw)
To: hch@infradead.org, david@fromorbit.com, xfs@oss.sgi.com

Hi,

I am using a test setup that writes to files from multiple threads
using direct IO, with a write buffer size of 512KB. After running this
continuously for a long duration, I observe that the number of extents
in each file grows huge (2K..4K..), and that each extent is 512KB
(aligned to the write buffer size). I wish to keep the number of
extents low, i.e. reduce fragmentation.

With buffered IO, preallocation works well together with the
'allocsize' mount option. Is there anything that can be done for
direct IO? Please advise on how to reduce fragmentation with direct
IO.

Thanks & Regards,
Amit Sahrawat

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Preallocation with direct IO?

From: Dave Chinner @ 2011-12-29 20:57 UTC (permalink / raw)
To: amit.sahrawat83@gmail.com; +Cc: hch@infradead.org, xfs@oss.sgi.com

On Thu, Dec 29, 2011 at 01:10:49PM +0000, amit.sahrawat83@gmail.com wrote:
> Hi, I am using a test setup which is doing write using multiple
> threads using direct IO. The buffer size which is used to write is
> 512KB. After continously running this for long duration - i
> observe that number of extents in each file is getting
> huge(2K..4K..). I observed that each extent is of 512KB(aligned to
> write buffer size). I wish to have low number of extents(i.e,
> reduce fragmentation)... In case of buffered IO- preallocation
> works good alongwith the mount option 'allocsize'. Is there
> anything which can be done for Direct IO? Please advice for
> reducing fragmentation with direct IO.

Direct IO does not do any implicit preallocation. The filesystem
simply gets out of the way of direct IO as it is assumed you know
what you are doing. i.e. you know how to use the fallocate() or
ioctl(XFS_IOC_RESVSP64) calls to preallocate space, or to set up
extent size hints to use larger allocations than the IO being done
during syscalls...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
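[Editor's note] The first of Dave's two suggestions, explicit preallocation, can be sketched as follows. This is a hedged illustration, not the poster's code: it uses `os.posix_fallocate()`, which maps to the `fallocate(2)` path Dave refers to; the older `ioctl(XFS_IOC_RESVSP64)` route takes a `struct xfs_flock64` instead. The file name and sizes are illustrative.

```python
import os
import tempfile

# Hedged sketch of up-front preallocation before direct IO writes:
# reserving the file's blocks in one call lets the allocator create one
# large extent instead of one 512KB extent per 512KB write.
# posix_fallocate() maps to fallocate(2) on Linux; sizes are illustrative.
PREALLOC = 16 * 1024 * 1024  # reserve 16 MiB up front

fd, path = tempfile.mkstemp()
try:
    os.posix_fallocate(fd, 0, PREALLOC)  # allocate the blocks now
    print(os.fstat(fd).st_size)          # file size reflects the reservation
finally:
    os.close(fd)
    os.unlink(path)
```

With the space reserved in one call, subsequent 512KB direct IO writes land inside the already-allocated extent rather than each triggering a fresh 512KB allocation.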
* Re: Preallocation with direct IO?

From: Amit Sahrawat @ 2011-12-30 3:07 UTC (permalink / raw)
To: Dave Chinner; +Cc: hch@infradead.org, xfs@oss.sgi.com

On Fri, Dec 30, 2011 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
> Direct IO does not do any implicit preallocation. The filesystem
> simply gets out of the way of direct IO as it is assumed you know
> what you are doing.

This is the supporting line I was looking for.

> i.e. you know how to use the fallocate() or ioctl(XFS_IOC_RESVSP64)
> calls to preallocate space or to set up extent size hints to use
> larger allocations than the IO being done during syscalls...

I tried preallocating space using ioctl(XFS_IOC_RESVSP64), but over
time this also does not work well with direct I/O. Is there also a
call to set up the extent size? Please let me know and I will try
that as well.

Thanks & Regards,
Amit Sahrawat
* Re: Preallocation with direct IO?

From: Dave Chinner @ 2011-12-30 20:43 UTC (permalink / raw)
To: Amit Sahrawat; +Cc: hch@infradead.org, xfs@oss.sgi.com

On Fri, Dec 30, 2011 at 08:37:00AM +0530, Amit Sahrawat wrote:
> I tried to make use of preallocating space using
> ioctl(XFS_IOC_RESVSP64) - but over time - this is also not working
> well with the Direct I/O.

Without knowing how you are using preallocation, I cannot comment on
this. Can you describe how your application does IO (size, frequency,
location in file, etc) and preallocation (same again), as well as
provide the xfs_bmap -vp <file> output of fragmented files? That way
I have some idea of what your problem is and so might be able to
suggest fixes...

> Is there any call to set up extent size
> also? please update I can try to make use of that also.

`man xfsctl` and search for XFS_IOC_FSSETXATTR.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
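[Editor's note] The extent size hint Dave points at is set through the FSGETXATTR/FSSETXATTR ioctl pair on an open file. The sketch below is an assumption-laden illustration: the `struct fsxattr` layout and the `'X'` ioctl numbers are derived from `<xfs/xfs_fs.h>` as best I recall and must be checked against your kernel headers, and the ioctl itself only succeeds on an XFS file.

```python
import fcntl
import struct

# Hedged sketch: set a per-file extent size hint so XFS allocates in
# large chunks even when each direct IO write is only 512KB.
# Struct layout and ioctl numbers derived from <xfs/xfs_fs.h>; verify
# against your headers. The ioctl only works on files on XFS.
FSXATTR_FMT = "IIII12x"         # fsx_xflags, fsx_extsize, fsx_nextents, fsx_projid, pad
XFS_XFLAG_EXTSIZE = 0x00000800  # "extent size hint is set" flag

def _ioc(direction, nr):
    # Minimal Linux _IOC() builder for the 'X' (XFS) ioctl family:
    # direction (1 = write, 2 = read), argument size, type, number.
    return (direction << 30) | (struct.calcsize(FSXATTR_FMT) << 16) \
        | (ord("X") << 8) | nr

XFS_IOC_FSGETXATTR = _ioc(2, 31)  # _IOR('X', 31, struct fsxattr)
XFS_IOC_FSSETXATTR = _ioc(1, 32)  # _IOW('X', 32, struct fsxattr)

def set_extsize_hint(fd, extsize_bytes):
    """Ask XFS to allocate extsize_bytes at a time for this file."""
    cur = fcntl.ioctl(fd, XFS_IOC_FSGETXATTR,
                      bytes(struct.calcsize(FSXATTR_FMT)))
    xflags, _, nextents, projid = struct.unpack(FSXATTR_FMT, cur)
    new = struct.pack(FSXATTR_FMT, xflags | XFS_XFLAG_EXTSIZE,
                      extsize_bytes, nextents, projid)
    fcntl.ioctl(fd, XFS_IOC_FSSETXATTR, new)
```

From the command line, the same hint can (if I recall the tooling correctly) be set with `xfs_io -c "extsize 1g" <file>` on an existing file, avoiding the raw ioctl entirely.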
* Re: Preallocation with direct IO?

From: Amit Sahrawat @ 2011-12-31 12:46 UTC (permalink / raw)
To: Dave Chinner; +Cc: hch@infradead.org, xfs@oss.sgi.com

On Sat, Dec 31, 2011 at 2:13 AM, Dave Chinner <david@fromorbit.com> wrote:
> Without knowing how you are using preallocation, I cannot comment on
> this. Can you describe how your application does IO (size,
> frequency, location in file, etc) and preallocation (same again), as
> well as xfs_bmap -vp <file> output of fragmented files? That way I
> have some idea of what your problem is and so might be able to
> suggest fixes...

Preallocation was done using snippets like these:

    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = (long long) PREALLOC; /* 1GB */
    printf("Preallocating %lld MB\n", (fl.l_len / (1024 * 1024)));
    err = ioctl(hFile, XFS_IOC_RESVSP64, &fl);

I verified that the preallocation worked by looking at the file size
('ls -l'), the disk usage ('df -kh'), and the file extents (xfs_bmap).
xfs_bmap shows an extent of the preallocated length, i.e.
preallocation was working as expected.

For some reasons I cannot share the exact test code, but it works like
this. There are 5 threads, with WRITE_SIZE = 512KB and TRUNCSIZE =
250MB.

1st thread - this one does the actual work amongst all the threads:

    buffer = valloc(WRITE_SIZE);
    fd = open64(file, O_CREAT|O_DIRECT|O_WRONLY|O_TRUNC);
    /* initial write of 5GB of data to the file using a 512KB buffer */
    for (i = 0; i < WRITE_COUNT; i++) {
        write(fd, buffer, WRITE_SIZE);
    }
    fsync(fd);
    while (1) {
        if (ncount++ < TRUNCSIZE) {
            write(fd, buffer, WRITE_SIZE);
        } else {
            close(fd);
            open(fd, O_RDWR|O_CREAT);
            gettimeofday();  /* start point */
            sync();          /* at times this sync takes around 5 sec,
                                even though the test case does its I/O
                                using O_DIRECT */
            gettimeofday();  /* end point */
            if (sync time greater than 2 sec)
                exit(0);
            gettimeofday();  /* start point */
            ftruncate(fd, TRUNCSIZE);
            gettimeofday();  /* end point */
            if (truncate time greater than 2 sec)
                exit(0);
            fsync(fd);
            close(fd);
            open64(file, O_WRONLY|O_APPEND|O_DIRECT);
            ncount = 0;
        }
        fsync(fd);
    }

2nd thread - writes to a second file in a loop:

    while (1) {
        write(10 bytes);
        fsync();
        usleep(100 * 1000);
    }

3rd thread - reads the file written by the 2nd thread:

    while (1) {
        read(file, buffer, 10);
        lseek(file, 0, 0);
        usleep(10000);
    }

4th thread - just prints the size information for the '2' files which
are written.

5th thread - also reads the file written by the 2nd thread.

> `man xfsctl` and search for XFS_IOC_FSSETXATTR.

Thanks Dave, this is exactly what was needed - it is working as of
now.

But there continues to be a problem with the sync time. Even though
there is no dirty data, sync still sometimes takes around 5 sec (this
is very rare, observed only a few times in overnight runs), so it is
also very difficult to debug what the issue could be and what the
culprit is. At one time I captured a trace during this sync time
issue, please find it below:

    (dump_backtrace+0x0/0x11c) from [<c0389520>] (dump_stack+0x20/0x24)
    (dump_stack+0x0/0x24) from [<c0067b70>] (__schedule_bug+0x7c/0x8c)
    (__schedule_bug+0x0/0x8c) from [<c0389bc0>] (schedule+0x88/0x5fc)
    (schedule+0x0/0x5fc) from [<c020a0c8>] (_xfs_log_force+0x238/0x28c)
    (_xfs_log_force+0x0/0x28c) from [<c020a320>] (xfs_log_force+0x20/0x40)
    (xfs_log_force+0x0/0x40) from [<c02308c4>] (xfs_commit_dummy_trans+0xc8/0xd4)
    (xfs_commit_dummy_trans+0x0/0xd4) from [<c0231468>] (xfs_quiesce_data+0x60/0x88)
    (xfs_quiesce_data+0x0/0x88) from [<c022e080>] (xfs_fs_sync_fs+0x2c/0xe8)
    (xfs_fs_sync_fs+0x0/0xe8) from [<c015cccc>] (__sync_filesystem+0x8c/0xa8)
    (__sync_filesystem+0x0/0xa8) from [<c015cd1c>] (sync_one_sb+0x34/0x38)
    (sync_one_sb+0x0/0x38) from [<c013b1f0>] (iterate_supers+0x7c/0xc0)
    (iterate_supers+0x0/0xc0) from [<c015cbf4>] (sync_filesystems+0x28/0x34)
    (sync_filesystems+0x0/0x34) from [<c015cd68>] (sys_sync+0x48/0x78)
    (sys_sync+0x0/0x78) from [<c003b4c0>] (ret_fast_syscall+0x0/0x48)

In order to resolve this, I applied the patch below:

    xfs: dummy transactions should not dirty VFS state
    http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commitdiff;h=1a387d3be2b30c90f20d49a3497a8fc0693a9d18

but still continued to observe the sync timing issue.

One thing: do we need fsync() when performing writes using O_DIRECT?
I think 'no'. Also, should sync() take time when there is no dirty
data? Please share your opinion.

Thanks & Regards,
Amit Sahrawat
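[Editor's note] On the closing question, a caveat is worth adding: O_DIRECT bypasses the page cache for the data itself, but metadata (the new file size, freshly allocated extents) still lives only in memory and the log until it is flushed, so fsync()/fdatasync() is still required for durability. A minimal sketch, with illustrative sizes and a buffered fallback for filesystems that reject O_DIRECT (os.O_DIRECT is Linux-specific):

```python
import mmap
import os
import tempfile

# Hedged sketch: a direct IO write still needs fsync() for durability,
# because only the data bypasses the page cache; the metadata does not.
# O_DIRECT also demands sector-aligned buffers and IO sizes; an anonymous
# mmap() gives a page-aligned buffer. Sizes here are illustrative.
BLOCK = 4096

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    fd = os.open(path, os.O_WRONLY | os.O_DIRECT)
except OSError:
    fd = os.open(path, os.O_WRONLY)  # filesystem without O_DIRECT (e.g. tmpfs)

buf = mmap.mmap(-1, BLOCK)           # page-aligned, page-sized buffer
buf.write(b"x" * BLOCK)
os.write(fd, buf)                    # aligned buffer and length
os.fsync(fd)                         # flush the metadata (size, extents) too
os.close(fd)
print(os.path.getsize(path))
os.unlink(path)
```

This would suggest the fsync() calls in the test loop above are not redundant; the separate question of why sync() occasionally stalls for ~5 sec is what the quoted trace and patch address.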
Thread overview: 5+ messages (newest: ~2011-12-31 12:46 UTC)

2011-12-29 13:10 Preallocation with direct IO? amit.sahrawat83
2011-12-29 20:57 ` Dave Chinner
2011-12-30  3:07   ` Amit Sahrawat
2011-12-30 20:43     ` Dave Chinner
2011-12-31 12:46       ` Amit Sahrawat