* Preallocation with direct IO?
From: amit.sahrawat83 @ 2011-12-29 13:10 UTC (permalink / raw)
To: hch@infradead.org, david@fromorbit.com, xfs@oss.sgi.com
Hi,
I am running a test setup that writes from multiple threads using direct IO. The write buffer size is 512KB.
After running this continuously for a long duration, I observe that the number of extents in each file grows very large (2K..4K..), and that each extent is 512KB (aligned to the write buffer size). I would like to keep the number of extents low (i.e., reduce fragmentation). With buffered IO, preallocation works well together with the mount option 'allocsize'. Is there anything that can be done for direct IO?
Please advise on how to reduce fragmentation with direct IO.
Thanks & Regards,
Amit Sahrawat
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Preallocation with direct IO?
From: Dave Chinner @ 2011-12-29 20:57 UTC (permalink / raw)
To: amit.sahrawat83@gmail.com; +Cc: hch@infradead.org, xfs@oss.sgi.com
On Thu, Dec 29, 2011 at 01:10:49PM +0000, amit.sahrawat83@gmail.com wrote:
> Hi, I am running a test setup that writes from multiple threads
> using direct IO. The write buffer size is 512KB. After running
> this continuously for a long duration, I observe that the number
> of extents in each file grows very large (2K..4K..), and that
> each extent is 512KB (aligned to the write buffer size). I would
> like to keep the number of extents low (i.e., reduce
> fragmentation). With buffered IO, preallocation works well
> together with the mount option 'allocsize'. Is there anything
> that can be done for direct IO? Please advise on how to reduce
> fragmentation with direct IO.
Direct IO does not do any implicit preallocation. The filesystem
simply gets out of the way of direct IO as it is assumed you know
what you are doing.
i.e. you know how to use the fallocate() or ioctl(XFS_IOC_RESVSP64)
calls to preallocate space or to set up extent size hints to use
larger allocations than the IO being done during syscalls...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Preallocation with direct IO?
From: Amit Sahrawat @ 2011-12-30 3:07 UTC (permalink / raw)
To: Dave Chinner; +Cc: hch@infradead.org, xfs@oss.sgi.com
On Fri, Dec 30, 2011 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Dec 29, 2011 at 01:10:49PM +0000, amit.sahrawat83@gmail.com wrote:
>> [...]
>
> Direct IO does not do any implicit preallocation. The filesystem
> simply gets out of the way of direct IO as it is assumed you know
> what you are doing.
This is the supporting line I was looking for.
>
> i.e. you know how to use the fallocate() or ioctl(XFS_IOC_RESVSP64)
> calls to preallocate space or to set up extent size hints to use
> larger allocations than the IO being done during syscalls...
I tried preallocating space using ioctl(XFS_IOC_RESVSP64), but over
time this also does not work well with direct I/O. Is there also a
call to set up the extent size? Please let me know and I will try
that as well.
Thanks & Regards,
Amit Sahrawat
* Re: Preallocation with direct IO?
From: Dave Chinner @ 2011-12-30 20:43 UTC (permalink / raw)
To: Amit Sahrawat; +Cc: hch@infradead.org, xfs@oss.sgi.com
On Fri, Dec 30, 2011 at 08:37:00AM +0530, Amit Sahrawat wrote:
> On Fri, Dec 30, 2011 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Thu, Dec 29, 2011 at 01:10:49PM +0000, amit.sahrawat83@gmail.com wrote:
> >> [...]
> >
> > Direct IO does not do any implicit preallocation. The filesystem
> > simply gets out of the way of direct IO as it is assumed you know
> > what you are doing.
> This is the supporting line I was looking for.
> >
> > i.e. you know how to use the fallocate() or ioctl(XFS_IOC_RESVSP64)
> > calls to preallocate space or to set up extent size hints to use
> > larger allocations than the IO being done during syscalls...
> I tried preallocating space using ioctl(XFS_IOC_RESVSP64), but
> over time this also does not work well with direct I/O.
Without knowing how you are using preallocation, I cannot comment on
this. Can you describe how your application does IO (size,
frequency, location in file, etc) and preallocation (same again), as
well as xfs_bmap -vp <file> output of fragmented files? That way I
have some idea of what your problem is and so might be able to
suggest fixes...
> Is there also a call to set up the extent size? Please let me
> know and I will try that as well.
`man xfsctl` and search for XFS_IOC_FSSETXATTR.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Preallocation with direct IO?
From: Amit Sahrawat @ 2011-12-31 12:46 UTC (permalink / raw)
To: Dave Chinner; +Cc: hch@infradead.org, xfs@oss.sgi.com
On Sat, Dec 31, 2011 at 2:13 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Dec 30, 2011 at 08:37:00AM +0530, Amit Sahrawat wrote:
>> On Fri, Dec 30, 2011 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Thu, Dec 29, 2011 at 01:10:49PM +0000, amit.sahrawat83@gmail.com wrote:
>> >> [...]
>> >
>> > Direct IO does not do any implicit preallocation. The filesystem
>> > simply gets out of the way of direct IO as it is assumed you know
>> > what you are doing.
>> This is the supporting line I was looking for.
>> >
>> > i.e. you know how to use the fallocate() or ioctl(XFS_IOC_RESVSP64)
>> > calls to preallocate space or to set up extent size hints to use
>> > larger allocations than the IO being done during syscalls...
>> I tried preallocating space using ioctl(XFS_IOC_RESVSP64), but
>> over time this also does not work well with direct I/O.
>
> Without knowing how you are using preallocation, I cannot comment on
> this. Can you describe how your application does IO (size,
> frequency, location in file, etc) and preallocation (same again), as
> well as xfs_bmap -vp <file> output of fragmented files? That way I
> have some idea of what your problem is and so might be able to
> suggest fixes...
Preallocation was done using snippets like this:

    xfs_flock64_t fl;

    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = (long long) PREALLOC; /* 1GB */
    printf("Preallocating %lld MB\n", fl.l_len / (1024 * 1024));
    err = ioctl(hFile, XFS_IOC_RESVSP64, &fl);

I verified that preallocation was working by checking the file size
(ls -l), the disk usage (df -kh), and the file extents with
xfs_bmap; xfs_bmap shows an extent covering the preallocated
length, i.e., preallocation was working as expected.
For some reasons I cannot share the exact test code, but it works
like this. The test has 5 threads, with:

    WRITE_SIZE - 512KB
    TRUNCSIZE  - 250MB

1st thread - this one does the actual heavy I/O amongst all the threads:
    buffer = valloc(WRITE_SIZE);
    fd = open64(file, O_CREAT|O_DIRECT|O_WRONLY|O_TRUNC);
    /* initial write of 5GB of data using the 512KB buffer */
    for (i = 0; i < WRITE_COUNT; i++)
        write(fd, buffer, WRITE_SIZE);
    fsync(fd);
    while (1)
    {
        if (ncount++ < TRUNCSIZE)
        {
            write(fd, buffer, WRITE_SIZE);
        }
        else
        {
            close(fd);
            fd = open(file, O_RDWR|O_CREAT);
            gettimeofday()  /* start point */
            sync();  /* at times this sync takes around 5sec, even
                        though the test case does I/O using O_DIRECT */
            gettimeofday()  /* end point */
            if (sync time greater than 2sec)
                exit(0);
            gettimeofday()  /* start point */
            ftruncate(fd, TRUNCSIZE);
            gettimeofday()  /* end point */
            if (truncate time greater than 2sec)
                exit(0);
            fsync(fd);
            close(fd);
            fd = open64(file, O_WRONLY|O_APPEND|O_DIRECT);
            ncount = 0;
        }
        fsync(fd);
    }
2nd thread - writing to a file in a loop:

    while (1)
    {
        write(file2, buffer, 10);
        fsync(file2);
        usleep(100 * 1000);
    }

3rd thread - reading the file written by the 2nd thread:

    while (1)
    {
        read(file2, buffer, 10);
        lseek(file2, 0, SEEK_SET);
        usleep(10000);
    }

4th thread - just printing the size information for the two files
being written.
5th thread - also reading the file from the 2nd thread.
>
>> Is there also a call to set up the extent size? Please let me
>> know and I will try that as well.
>
> `man xfsctl` and search for XFS_IOC_FSSETXATTR.
Thanks Dave, this is exactly what was needed; it is working now.
But there continues to be a problem with the sync time. Even though
there is no dirty data, sync still takes around 5sec at times (this
is very rare, observed only a few times during overnight runs), so
it is very difficult to debug what the issue is and what the
culprit could be. I managed to capture a trace during one of these
slow syncs; please find it below:
(dump_backtrace+0x0/0x11c) from [<c0389520>] (dump_stack+0x20/0x24)
(dump_stack+0x0/0x24) from [<c0067b70>] (__schedule_bug+0x7c/0x8c)
(__schedule_bug+0x0/0x8c) from [<c0389bc0>] (schedule+0x88/0x5fc)
(schedule+0x0/0x5fc) from [<c020a0c8>] (_xfs_log_force+0x238/0x28c)
(_xfs_log_force+0x0/0x28c) from [<c020a320>] (xfs_log_force+0x20/0x40)
(xfs_log_force+0x0/0x40) from [<c02308c4>] (xfs_commit_dummy_trans+0xc8/0xd4)
(xfs_commit_dummy_trans+0x0/0xd4) from [<c0231468>] (xfs_quiesce_data+0x60/0x88)
(xfs_quiesce_data+0x0/0x88) from [<c022e080>] (xfs_fs_sync_fs+0x2c/0xe8)
(xfs_fs_sync_fs+0x0/0xe8) from [<c015cccc>] (__sync_filesystem+0x8c/0xa8)
(__sync_filesystem+0x0/0xa8) from [<c015cd1c>] (sync_one_sb+0x34/0x38)
(sync_one_sb+0x0/0x38) from [<c013b1f0>] (iterate_supers+0x7c/0xc0)
(iterate_supers+0x0/0xc0) from [<c015cbf4>] (sync_filesystems+0x28/0x34)
(sync_filesystems+0x0/0x34) from [<c015cd68>] (sys_sync+0x48/0x78)
(sys_sync+0x0/0x78) from [<c003b4c0>] (ret_fast_syscall+0x0/0x48)
To resolve this I applied the patch below:

xfs: dummy transactions should not dirty VFS state
http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commitdiff;h=1a387d3be2b30c90f20d49a3497a8fc0693a9d18

but I still observe the sync timing issue.
One thing: do we need fsync() at all when writing with O_DIRECT? I
think 'no'.
Also, should sync() take this long when there is no dirty data?
Please share your opinion.
Thanks & Regards,
Amit Sahrawat