* Re: file streams allocator behavior
From: Richard Scobie @ 2014-10-25 18:56 UTC (permalink / raw)
To: xfs

Stan Hoeppner said:

> How can I disable or change the filestreams behavior so all files go
> into the one AG for the single directory test?

Hi Stan,

Instead of mounting with -o filestreams, would using the chattr flag
instead help?

See
http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch06s16.html

Regards,

Richard

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: file streams allocator behavior
From: Stan Hoeppner @ 2014-10-25 21:26 UTC (permalink / raw)
To: Richard Scobie, xfs

On 10/25/2014 01:56 PM, Richard Scobie wrote:
> Stan Hoeppner said:
>
>> How can I disable or change the filestreams behavior so all files go
>> into the one AG for the single directory test?
>
> Hi Stan,
>
> Instead of mounting with -o filestreams, would using the chattr flag
> instead help?
>
> See
> http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch06s16.html

That won't help.  That turns it on (if it's not enabled by default these
days).  I need to turn off the behavior I'm seeing, whether it's due to
the filestreams allocator or default inode64.  Then again it may not be
possible to turn it off...

Anyone have other ideas on how to accomplish my goal?  Parallel writes
to a single AG on the outer platter edge vs the same to all AGs across
the entire platter?  I'm simply trying to demonstrate the differences in
aggregate bandwidth due to the extra seek latency of the all-AGs case.

Thanks,
Stan
* Re: file streams allocator behavior
From: Brian Foster @ 2014-10-26 14:26 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: Richard Scobie, xfs

On Sat, Oct 25, 2014 at 04:26:54PM -0500, Stan Hoeppner wrote:
> That won't help.  That turns it on (if it's not enabled by default these
> days).  I need to turn off the behavior I'm seeing, whether it's due to
> the filestreams allocator or default inode64.  Then again it may not be
> possible to turn it off...
>
> Anyone have other ideas on how to accomplish my goal?  Parallel writes
> to a single AG on the outer platter edge vs the same to all AGs across
> the entire platter?  I'm simply trying to demonstrate the differences in
> aggregate bandwidth due to the extra seek latency of the all-AGs case.

What about just preallocating the files?  Obviously this removes block
allocation contention from your experiment, but it's not clear if that's
relevant to your test.  If I create a smaller, but analogous fs to yours,
I seem to get this behavior from just doing an fallocate of each file in
advance.

E.g., create directory 0, fallocate 44 files, all of which end up in AG
0.  Create directory 1, fallocate 44 files, which end up in AG 1, etc.
From there you can do direct I/O overwrites to 44 files across each AG
or 44 files in any single AG.

Brian
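A minimal sketch of Brian's preallocation scheme, scaled down so it runs on any filesystem. The directory layout, file count, and sizes below are illustrative — on the 44-AG filesystem in the thread you would use 44 directories of 44 files each, sized like the real workload, and confirm placement with `xfs_bmap -v`:

```shell
#!/bin/sh
# Scaled-down sketch of per-directory preallocation: N directories,
# N files each.  On XFS, files fallocated under one directory are
# allocated from that directory's AG.
DEMO="${TMPDIR:-/tmp}/prealloc-demo"
N=4                 # the thread uses 44 (one directory per AG)
SIZE=1048576        # 1 MiB here; the thread preallocates ~1.6 GB files

for d in $(seq 0 $((N - 1))); do
    mkdir -p "$DEMO/$d"
    for f in $(seq 0 $((N - 1))); do
        # Reserve the blocks up front without writing any data
        fallocate -l "$SIZE" "$DEMO/$d/test-$f"
    done
done

# On XFS you would then verify each directory's files share one AG:
#   xfs_bmap -v /mnt/VOL1/0/test-0
ls "$DEMO/0"
```

On a real XFS volume the `xfs_bmap -v` output's AG column should show every extent of a directory's files in that directory's AG, which is the layout the single-AG test needs.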
* Re: file streams allocator behavior
From: Stan Hoeppner @ 2014-10-26 17:26 UTC (permalink / raw)
To: Brian Foster; +Cc: Richard Scobie, xfs

On 10/26/2014 09:26 AM, Brian Foster wrote:
> What about just preallocating the files?  Obviously this removes block
> allocation contention from your experiment, but it's not clear if that's
> relevant to your test.  If I create a smaller, but analogous fs to yours,
> I seem to get this behavior from just doing an fallocate of each file in
> advance.
>
> E.g., create directory 0, fallocate 44 files, all of which end up in AG
> 0.  Create directory 1, fallocate 44 files, which end up in AG 1, etc.
> From there you can do direct I/O overwrites to 44 files across each AG
> or 44 files in any single AG.

I figured preallocating would get me what I want, but I've never used
fallocate, nor dd'd into fallocated files.  Is there anything special
required here with dd, or can I simply specify the filename to dd and
make sure bs * count doesn't go beyond EOF?

Thanks,
Stan
* Re: file streams allocator behavior
From: Brian Foster @ 2014-10-26 22:18 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: Richard Scobie, xfs

On Sun, Oct 26, 2014 at 12:26:57PM -0500, Stan Hoeppner wrote:
> I figured preallocating would get me what I want, but I've never used
> fallocate, nor dd'd into fallocated files.  Is there anything special
> required here with dd, or can I simply specify the filename to dd and
> make sure bs * count doesn't go beyond EOF?

Ah, yeah.  dd will truncate the file by default iirc, which would free
the preallocated blocks and start from scratch.  Specify 'conv=notrunc'
as part of the command line to get around that.

Brian
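Brian's `conv=notrunc` advice in command form. The file name and sizes are illustrative; `oflag=direct` is attempted first with a cached-I/O fallback so the sketch also runs on filesystems without O_DIRECT support:

```shell
#!/bin/sh
# Overwrite a preallocated file in place.  Without conv=notrunc, dd
# would truncate the file first, freeing the preallocated extents.
F="${TMPDIR:-/tmp}/notrunc-demo"
fallocate -l 16M "$F"        # stand-in for the real preallocation step

# bs * count (16 MiB) matches the file size, so the write stays
# within EOF.  O_DIRECT overwrite as in the thread; fall back to
# cached I/O if the filesystem rejects O_DIRECT (e.g. tmpfs).
dd if=/dev/zero of="$F" bs=1M count=16 conv=notrunc oflag=direct 2>/dev/null ||
dd if=/dev/zero of="$F" bs=1M count=16 conv=notrunc

stat -c %s "$F"              # still 16777216: no truncation occurred
```

Because the file is never truncated, the extents `fallocate` reserved stay where they are, and the overwrite exercises the on-disk layout rather than triggering new allocation.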
* file streams allocator behavior
From: Stan Hoeppner @ 2014-10-25 18:12 UTC (permalink / raw)
To: xfs

I recall reading a while back something about disabling the filestreams
allocator, or at least changing its behavior, but I'm unable to find that.

What I'm trying to do is use parallel dd w/O_DIRECT to write 44 files in
parallel to 44 directories, thus all 44 AGs, in one test, then write 44
files to one dir, one AG, in another test.  The purpose of this
quick/dirty exercise is to demonstrate throughput differences due to
full-platter seeking in the former case and localized seeking in the
latter case.

But of course the problem I'm running into in the single directory case
is that the filestreams allocator starts writing all of the 44 files
into the appropriate AG, but then begins allocating extents for each
file in other AGs.  This is of course defeating the purpose of the tests.

> /mnt/VOL1/43# for i in `seq 0 43`; do xfs_bmap -v test-$i; done
> test-0:
>  EXT: FILE-OFFSET       BLOCK-RANGE               AG AG-OFFSET             TOTAL FLAGS
>    0: [0..1535]:        92341791520..92341793055  43 (160..1695)            1536 01111
>    1: [1536..3071]:     92341794688..92341796223  43 (3328..4863)           1536 00011
...
>   88: [135168..136703]: 9972480..9974015           0 (9972480..9974015)     1536 00011
>   89: [136704..138239]: 9984768..9986303           0 (9984768..9986303)     1536 00011
...
>  146: [224256..225791]: 2158167552..2158169087     1 (10684032..10685567)   1536
>  147: [225792..227327]: 2158181376..2158182911     1 (10697856..10699391)   1536
...
>  160: [245760..254975]: 10744866688..10744875903   5 (7449088..7458303)     9216 00011
>  161: [254976..256511]: 10744877440..10744878975   5 (7459840..7461375)     1536 00011
...
...
> test-43:
>  EXT: FILE-OFFSET       BLOCK-RANGE               AG AG-OFFSET             TOTAL FLAGS
>    0: [0..1535]:        92341936000..92341937535  43 (144640..146175)       1536 00011
>    1: [1536..3071]:     92342003584..92342005119  43 (212224..213759)       1536 00011
...
>   69: [105984..107519]: 4303912064..4303913599     2 (8945024..8946559)     1536 00011
>   70: [107520..109055]: 4303922816..4303924351     2 (8955776..8957311)     1536 00011
...
...
>  180: [276480..278015]: 8598943744..8598945279     4 (9009664..9011199)     1536 00011
...
>  181: [278016..279551]: 10744961920..10744963455   5 (7544320..7545855)     1536 00011
>  182: [279552..281087]: 10744968064..10744969599   5 (7550464..7551999)     1536 00011
...
...

Files being created are 1.6 GB.  Filesystem is 44 TB.  AGs are 1 TB,
numbered 0-43.  Directories are /mnt/VOL1/0 - /mnt/VOL1/43.  Device is a
single RAID5 LUN.

How can I disable or change the filestreams behavior so all files go
into the one AG for the single directory test?

Thanks,
Stan
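The two test layouts described above could be driven by a loop like this scaled-down sketch. Paths, sizes, and writer counts are illustrative — the real test used 44 parallel writers, ~1.6 GB files, `oflag=direct`, and directories under /mnt/VOL1, with each case timed to compare aggregate bandwidth:

```shell
#!/bin/sh
# Scaled-down driver for the two layouts: "spread" writes one file per
# directory (one per AG on XFS); "single" writes every file into one
# directory (targeting a single AG).
DEMO="${TMPDIR:-/tmp}/ag-test"
N=3                                   # writers 0..N; the thread uses 0..43

run_case() {                          # $1 = spread | single
    for i in $(seq 0 $N); do
        if [ "$1" = spread ]; then
            dir="$DEMO/spread/$i"     # one directory per writer
        else
            dir="$DEMO/single/0"      # all writers share directory 0
        fi
        mkdir -p "$dir"
        # Writers run in the background, like the 44 parallel dd
        # processes in the thread (which add oflag=direct for O_DIRECT)
        dd if=/dev/zero of="$dir/test-$i" bs=1M count=2 status=none &
    done
    wait                              # all writers must finish before timing stops
}

# In the real test each invocation would be wrapped with time(1)
run_case spread
run_case single
ls "$DEMO/single/0"
```

On XFS the "single" case is where the allocator behavior described above bites: all writers start in directory 0's AG and contend for its allocation lock, which is what scatters extents into other AGs.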
* Re: file streams allocator behavior
From: Dave Chinner @ 2014-10-26 23:56 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: xfs

On Sat, Oct 25, 2014 at 01:12:48PM -0500, Stan Hoeppner wrote:
> But of course the problem I'm running into in the single directory case
> is that the filestreams allocator starts writing all of the 44 files
> into the appropriate AG, but then begins allocating extents for each
> file in other AGs.  This is of course defeating the purpose of the tests.

That's caused by allocator contention.  When you try to write 44
files to the same dir in parallel, they'll all start with the same
target AG, but when one thread is allocating into AG 43 and has
the AG locked, a second attempt to allocate in that AG will see the
AG locked and so will move on to find the next AG that is not
locked.

Remember, AGs were not originally designed for confining physical
locality - they are designed to allow allocator parallelism.  Hence
once the file has jumped to a new AG it will try to allocate
sequentially from that point onwards in that same AG, until either
ENOSPC or further contention.

Hence with a workload like this, if the writes continue for long
enough each file will end up finding its own uncontended AG and
hence mostly end up contiguous on disk and not getting blocked
waiting for allocation on other files.  When you have as many writers
as there are AGs, however, such a steady state is generally not
possible as there will always be files trying to write into the same
AG.

As it is, filestreams is not designed for this sort of parallel
workload.  Filestreams is designed to separate single-threaded
streams of IO into different locations, not to handle concurrent
writes into multiple files in the same directory.

Inode64 will probably demonstrate exactly the same behaviour, too,
because it will start by trying to write all the files to the same
AG and hence hit allocator contention.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: file streams allocator behavior
From: Stan Hoeppner @ 2014-10-27 23:24 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On 10/26/2014 06:56 PM, Dave Chinner wrote:
> That's caused by allocator contention.  When you try to write 44
> files to the same dir in parallel, they'll all start with the same
> target AG, but when one thread is allocating into AG 43 and has
> the AG locked, a second attempt to allocate in that AG will see the
> AG locked and so will move on to find the next AG that is not
> locked.

That's what I suspected given what I was seeing.

> Remember, AGs were not originally designed for confining physical
> locality - they are designed to allow allocator parallelism.

Right.  But they sure do come in handy when used this way with
preallocated files.  I suspect we'll make heavy use of this when they
have me back to implement my recommendations.

> Hence with a workload like this, if the writes continue for long
> enough each file will end up finding its own uncontended AG and
> hence mostly end up contiguous on disk and not getting blocked
> waiting for allocation on other files.  When you have as many writers
> as there are AGs, however, such a steady state is generally not
> possible as there will always be files trying to write into the same
> AG.

Yep.  That's exactly what xfs_bmap was showing.  Some files had extents
in 3-4 AGs, some in 8 or more AGs.

> As it is, filestreams is not designed for this sort of parallel
> workload.  Filestreams is designed to separate single-threaded
> streams of IO into different locations, not to handle concurrent
> writes into multiple files in the same directory.

Right.  I incorrectly referred to the filestreams allocator because I
didn't know of the inode64 congestion control mechanism at that time.  I
simply recalled one of your emails where you and Christoph were
discussing mods to the filestreams allocator, and the file pattern
across the AGs looked familiar.

> Inode64 will probably demonstrate exactly the same behaviour, too,
> because it will start by trying to write all the files to the same
> AG and hence hit allocator contention.

I was using the inode64 allocator, and yes, it did.  Brian saved my
bacon, as I wanted to use preallocated files but didn't know exactly
how.  I knew all their extents would be in the AG I wanted them in,
given the way I intended to do it.  And doing this demonstrated what I
anticipated.

Writing 2 GB into each of 132 files with 132 parallel dd processes with
O_DIRECT, 264 GB total, I achieved:

AG 0 only:  1377 MB/s
AGs 0-43:    767 MB/s

Due to the nature of the application, we should be able to distribute
the files in such a manner that only 1 or 2 adjacent AGs are being
accessed concurrently.  This will greatly reduce head seeking,
increasing throughput, as demonstrated above.  All thanks to the
allocation group architecture of XFS.  We'd not be able to do this with
any other filesystem AFAIK.

Thanks,
Stan
end of thread, newest message: ~2014-10-27 23:23 UTC

Thread overview: 8+ messages
-- links below jump to the message on this page --
2014-10-25 18:56 file streams allocator behavior  Richard Scobie
2014-10-25 21:26 ` Stan Hoeppner
2014-10-26 14:26   ` Brian Foster
2014-10-26 17:26     ` Stan Hoeppner
2014-10-26 22:18       ` Brian Foster
-- strict thread matches above, loose matches on Subject: below --
2014-10-25 18:12 file streams allocator behavior  Stan Hoeppner
2014-10-26 23:56 ` Dave Chinner
2014-10-27 23:24   ` Stan Hoeppner