* allocsize mount option
@ 2010-01-11 17:25 Gim Leong Chin
2010-01-11 18:16 ` Eric Sandeen
0 siblings, 1 reply; 14+ messages in thread
From: Gim Leong Chin @ 2010-01-11 17:25 UTC (permalink / raw)
To: xfs
Hi,
Mount options for xfs
allocsize=size
Sets the buffered I/O end-of-file preallocation size when doing delayed allocation writeout (default size is 64KiB).
I read that setting allocsize to a big value can be used to combat filesystem fragmentation when writing big files.
I do not understand how allocsize works. Say I set allocsize=1g, but my file size is only 1 MB or even smaller. Will the rest of the 1 GB file extent be allocated, resulting in wasted space and even file fragmentation problem?
Does setting allocsize to a big value result in performance gain when writing big files? Is performance hurt by a big value setting when writing files smaller than the allocsize value?
I am setting up a system for HPC, where two different applications have different file size characteristics, one writes files of GBs and even 128 GB, the other is in MBs to tens of MBs.
I am not able to find documentation on the behaviour of allocsize mount option.
Thank you.
Chin Gim Leong
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: allocsize mount option
2010-01-11 17:25 Gim Leong Chin
@ 2010-01-11 18:16 ` Eric Sandeen
0 siblings, 0 replies; 14+ messages in thread
From: Eric Sandeen @ 2010-01-11 18:16 UTC (permalink / raw)
To: Gim Leong Chin; +Cc: xfs
Gim Leong Chin wrote:
> Hi,
>
> Mount options for xfs allocsize=size Sets the buffered I/O
> end-of-file preallocation size when doing delayed allocation writeout
> (default size is 64KiB).
>
>
> I read that setting allocsize to a big value can be used to combat
> filesystem fragmentation when writing big files.
That's not universally necessary though, depending on how you are
writing them. I've only used it in the very specific case of mythtv
calling "sync" every couple seconds, and defeating delalloc.
> I do not understand how allocsize works. Say I set allocsize=1g, but
> my file size is only 1 MB or even smaller. Will the rest of the 1 GB
> file extent be allocated, resulting in wasted space and even file
> fragmentation problem?
possibly :) It's only speculatively allocated, though, so you won't
have 1g for every file; when it's closed the preallocation goes
away, IIRC.
> Does setting allocsize to a big value result in performance gain when
> writing big files? Is performance hurt by a big value setting when
> writing files smaller than the allocsize value?
>
> I am setting up a system for HPC, where two different applications
> have different file size characteristics, one writes files of GBs and
> even 128 GB, the other is in MBs to tens of MBs.
We should probably back up and say: are you seeing fragmentation
problems -without- the mount option, and if so, what is your write pattern?
-Eric
> I am not able to find documentation on the behaviour of allocsize
> mount option.
>
> Thank you.
>
>
> Chin Gim Leong
>
>
>
> _______________________________________________ xfs mailing list
> xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs
>
* Re: allocsize mount option
@ 2010-01-13 9:42 Gim Leong Chin
2010-01-13 10:50 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Gim Leong Chin @ 2010-01-13 9:42 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
Hi,
The application is ANSYS, which writes 128 GB files. The existing computer, which runs SUSE Linux Enterprise Desktop 11 and is used for running ANSYS, has two software RAID 0 devices made up of five 1 TB drives. The /home partition is 4.5 TB, and it is now 4 TB full. I see fragmentation > 19%.
I have just set up a new computer with 16 WD Caviar Black 1 TB drives connected to an Areca ARC-1680ix-16 RAID controller with 4 GB cache. 14 of these drives are in RAID 6 with 128 kB stripes. The OS is also SLED 11. The system has 16 GB memory and an AMD Phenom II X4 965 CPU.
I have done tests writing 100 files of 30 MB, and 1 GB, 10 GB and 20 GB files, with single and multiple instances.
There is a big difference in writing speed when writing 20 GB files when using allocsize=1g and not using the option. That is without the inode64 option, which gives further speed gains.
I use dd for writing the 1 GB, 10 GB and 20 GB files.
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
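For readers following along, the sw value in that mkfs.xfs line comes from the number of data-bearing disks: a 14-drive RAID 6 dedicates the equivalent of two disks to parity, leaving 12 data disks, and su matches the controller's 128 kB stripe. A minimal sketch of the arithmetic (the drive counts come from this thread; sw = drives − parity is the usual rule of thumb for RAID 6):

```shell
# Deriving the mkfs.xfs stripe parameters for the 14-drive RAID 6 above.
# RAID 6 uses two disks' worth of parity, so the stripe width (sw)
# counts only the data-bearing disks.
drives=14
parity=2
su_kb=128                      # controller stripe unit in kB
sw=$(( drives - parity ))      # data-bearing disks
full_stripe_kb=$(( su_kb * sw ))
echo "su=${su_kb}k sw=${sw} full-stripe=${full_stripe_kb}k"
```

This prints `su=128k sw=12 full-stripe=1536k`, matching the su=128k,sw=12 geometry in the mkfs.xfs command.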
The mount options are:
defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc
The start of the partition has been set to LBA 3072 using GPT Fdisk to align the stripes.
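The choice of LBA 3072 can be sanity-checked: with 512-byte sectors the partition starts at 1536 KiB into the device, which is exactly one full stripe width (128 kB stripe unit × 12 data disks). A quick sketch of the check:

```shell
# Verify the partition start falls on a full-stripe boundary (a sketch).
start_lba=3072
sector_bytes=512
stripe_kb=$(( 128 * 12 ))                        # su * sw from the mkfs.xfs line
start_kb=$(( start_lba * sector_bytes / 1024 ))
if [ $(( start_kb % stripe_kb )) -eq 0 ]; then
    echo "start ${start_kb}k is stripe-aligned (stripe ${stripe_kb}k)"
else
    echo "start ${start_kb}k is NOT stripe-aligned"
fi
```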
The dd command is:
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
Single instance of 20 GB dd repeats were 214, 221, 123 MB/s with allocsize=1g, compared to 94, 126 MB/s without.
Two instances of 20 GB dd repeats were aggregate 331, 372 MB/s with allocsize=1g, compared to 336, 296 MB/s without.
Three instances of 20 GB dd was aggregate 400 MB/s with, 326 MB/s without.
Six instances of 20 GB dd was 606 MB/s with, 473 MB/s without.
My production configuration is
defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc,inode64
for which I got up to 297 MB/s for single instance 20 GB dd.
Chin Gim Leong
--- On Tue, 12/1/10, Eric Sandeen <sandeen@sandeen.net> wrote:
> From: Eric Sandeen <sandeen@sandeen.net>
> Subject: Re: allocsize mount option
> To: "Gim Leong Chin" <chingimleong@yahoo.com.sg>
> Cc: xfs@oss.sgi.com
> Date: Tuesday, 12 January, 2010, 2:16 AM
> Gim Leong Chin wrote:
> > Hi,
> >
> > Mount options for xfs allocsize=size Sets the buffered I/O
> > end-of-file preallocation size when doing delayed allocation writeout
> > (default size is 64KiB).
> >
> > I read that setting allocsize to a big value can be used to combat
> > filesystem fragmentation when writing big files.
>
> That's not universally necessary though, depending on how you are
> writing them. I've only used it in the very specific case of mythtv
> calling "sync" every couple seconds, and defeating delalloc.
>
> > I do not understand how allocsize works. Say I set allocsize=1g, but
> > my file size is only 1 MB or even smaller. Will the rest of the 1 GB
> > file extent be allocated, resulting in wasted space and even file
> > fragmentation problem?
>
> possibly :) It's only speculatively allocated, though, so you won't
> have 1g for every file; when it's closed the preallocation goes
> away, IIRC.
>
> > Does setting allocsize to a big value result in performance gain when
> > writing big files? Is performance hurt by a big value setting when
> > writing files smaller than the allocsize value?
> >
> > I am setting up a system for HPC, where two different applications
> > have different file size characteristics, one writes files of GBs and
> > even 128 GB, the other is in MBs to tens of MBs.
>
> We should probably back up and say: are you seeing fragmentation
> problems -without- the mount option, and if so, what is your write pattern?
>
> -Eric
>
> > I am not able to find documentation on the behaviour of allocsize
> > mount option.
> >
> > Thank you.
> >
> > Chin Gim Leong
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
* Re: allocsize mount option
2010-01-13 9:42 allocsize mount option Gim Leong Chin
@ 2010-01-13 10:50 ` Dave Chinner
2010-01-13 22:59 ` xfstests: Clean up build output Alex Elder
0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2010-01-13 10:50 UTC (permalink / raw)
To: Gim Leong Chin; +Cc: Eric Sandeen, xfs
On Wed, Jan 13, 2010 at 05:42:16PM +0800, Gim Leong Chin wrote:
> Hi,
>
>
> The application is ANSYS, which writes 128 GB files. The existing
> computer with SUSE Linux Enterprise Desktop 11 which is used for
> running ANSYS, has two software RAID 0 devices made up of five 1
> TB drives. The /home partition is 4.5 T, and it is now 4 TB
> full. I see a fragmentation > 19%.
XFS will start to fragment when the filesystem gets beyond 85%
full - it seems that you are very close to that threshold.
That being said, if you've pulled the figure of 19% from the xfs_db
measure of fragmentation, that doesn't mean the filesystem is badly
fragmented, it just means that there are 19% more fragments
than the ideal. In 4TB of data with 1GB sized files, that would mean
there are 4800 extents (average length ~800MB, which is excellent)
instead of the perfect 4000 extents (@1GB each). Hence you can see
how misleading this "19% fragmentation" number can be on an extent
based filesystem...
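A sketch of the arithmetic behind such a fragmentation figure, using the round numbers from the paragraph above (the formula shown is the commonly cited one for xfs_db's frag command; treat the exact numbers as illustrative):

```shell
# Fragmentation factor as commonly computed by xfs_db "frag"
# (assumed formula): (actual_extents - ideal_extents) / actual_extents * 100
actual=4800    # extents observed (illustrative, from the paragraph above)
ideal=4000     # one extent per 1 GB file would be ideal
awk -v a="$actual" -v i="$ideal" \
    'BEGIN { printf "fragmentation factor: %.1f%%\n", (a - i) / a * 100 }'
```

Even a factor approaching 20% still corresponds to extents averaging hundreds of megabytes, which is why the raw percentage is misleading on an extent-based filesystem.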
> I have just set up a new computer with 16 WD Cavair Black 1 TB
> drives connected to an Areca 1680ix-16 RAID with 4 GB cache. 14
> of these drives are in RAID 6 with 128 kB stripes. The OS is also
> SLED 11. The system has 16 GB memory, and AMD Phenom II X4 965
> CPU.
>
> I have done tests writing 100 30 MB files and 1 GB, 10 GB and 20
> GB files, with single instance and multiple instances.
>
> There is a big difference in writing speed when writing 20 GB
> files when using allocsize=1g and not using the option. That is
> without the inode64 option, which gives further speed gains.
> I use dd for writing the 1 GB, 10 GB and 20 GB files.
>
> mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
>
>
> defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc
>
> The start of the partition has been set to LBA 3072 using GPT Fdisk to align the stripes.
This all looks good - it certainly seems that you have done your
research. ;) The only thing I'd do differently is that if you have
only one partition on the drives, I wouldn't even put a partition on it.
> The dd command is:
>
> chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
I'd significantly reduce the size of that buffer - too large a
buffer can slow down IO due to the memory it consumes and TLB misses
it causes. I'd typically use something like:
$ dd if=/dev/zero of=bigfile bs=1024k count=20k
Which does 20,000 writes of 1MB each and ensures the dd process
doesn't consume over a GB of RAM.
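The buffer arithmetic in that suggestion can be checked directly: bs=1024k with count=20k writes 20 GiB in 20,480 one-megabyte chunks, so dd's working buffer stays at 1 MiB instead of 1 GiB:

```shell
# Total written by the suggested dd invocation (a sketch of the arithmetic).
bs=$(( 1024 * 1024 ))        # bs=1024k -> 1 MiB per write
count=$(( 20 * 1024 ))       # count=20k -> 20480 writes
total=$(( bs * count ))
echo "$(( total / 1024 / 1024 / 1024 )) GiB in ${count} writes of 1 MiB"
```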
> Single instance of 20 GB dd repeats were 214, 221, 123 MB/s with
> allocsize=1g, compared to 94, 126 MB/s without.
This seems rather low for a buffered write on hardware that can
clearly go faster. SLED11 is based on 2.6.27, right? I suspect that
many of the buffered writeback issues that have been fixed since
2.6.30 are present in the SLED11 kernel, and if that is the case I
can see why the allocsize mount option makes such a big
difference.
It might be worth checking how well direct IO writes run to take
buffered writeback issues out of the equation. In that case, I'd use
stripe width multiple sized buffers like:
$ dd if=/dev/zero of=bigfile bs=3072k count=7k oflag=direct
I'd suggest that you might need to look at increasing the maximum IO
size for the block device (/sys/block/sdb/queue/max_sectors_kb),
maybe the request queue depth as well to get larger IOs to be pushed
to the RAID controller. If you can, at least get it to the stripe
width of 1536k....
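The numbers in the direct IO suggestion also line up: bs=3072k is exactly two full stripe widths (2 × 1536k), and count=7k gives 21 GiB total, roughly matching the 20 GB test files. A sketch:

```shell
# Relate the suggested direct-IO buffer size to the array's stripe geometry.
stripe_kb=$(( 128 * 12 ))     # full stripe width: su * sw = 1536k
bs_kb=3072                    # bs=3072k from the suggested dd command
count=$(( 7 * 1024 ))         # count=7k
echo "bs = $(( bs_kb / stripe_kb )) full stripes"
echo "total = $(( bs_kb * count / 1024 / 1024 )) GiB"
```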
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: xfstests: Clean up build output
2010-01-13 10:50 ` Dave Chinner
@ 2010-01-13 22:59 ` Alex Elder
0 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2010-01-13 22:59 UTC (permalink / raw)
To: Dave Chinner; +Cc: Eric Sandeen, xfs
> We don't need to see every compiler command line for every file that
> is compiled. This makes it hard to see warnings and errors during
> compile. For progress notification, we really only need to see the
> directory/file being operated on.
>
> Turn down the verbosity of output by suppressing various make output
> and provide better overall visibility of which directory is being
> operated on, what the operation is and what is being done to the
> files by the build/clean process.
>
> While doing this, remove explicit target-per-file rules in the
> subdirectories being built and replace them with target based rules
> using the buildrules hooks for doing this. This results in the
> makefiles being simpler, smaller and more consistent.
>
> This patch does not address the dmapi subdirectory of the xfstests
> build system.
>
> The old style verbose builds can still be run via "make V=1 ..."
>
> Signed-off-by: Dave Chinner <david@fromorbit.com>
Didn't get this in my mail box but it looks good to me.
Reviewed-by: Alex Elder <aelder@sgi.com>
* Re: allocsize mount option
@ 2010-01-14 17:25 Gim Leong Chin
2010-01-14 17:42 ` Eric Sandeen
2010-01-14 23:28 ` Dave Chinner
0 siblings, 2 replies; 14+ messages in thread
From: Gim Leong Chin @ 2010-01-14 17:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: Eric Sandeen, xfs
[-- Attachment #1: Type: text/plain, Size: 7631 bytes --]
Hi Dave,
> fragmented, it just means that there are 19% more fragments
> than the ideal. In 4TB of data with 1GB sized files, that would mean
> there are 4800 extents (average length ~800MB, which is excellent)
> instead of the perfect 4000 extents (@1GB each). Hence you can see
> how misleading this "19% fragmentation" number can be on an extent
> based filesystem...
There are many files that are 128 GB.
When I did the tests with dd on this computer, the 20 GB files had up to > 50 extents.
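For scale (a back-of-envelope sketch using the figures in this thread): even at 50 extents, a 20 GB file averages roughly 400 MB per extent, which by the ~800 MB yardstick mentioned earlier is still fairly large:

```shell
# Average extent size for a 20 GB file in ~50 extents (illustrative numbers).
file_mb=$(( 20 * 1024 ))
extents=50
echo "average extent: $(( file_mb / extents )) MB"
```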
> This all looks good - it certainly seems that you have done your
> research. ;) The only thing I'd do differently is that if you have
> only one partition on the drives, I wouldn't even put a partition on it.
>
>
I just learnt from you that I can have a filesystem without a partition table! That takes care of having to calculate the start of the partition! Are there any other benefits, and are there any downsides to not having a partition table?
> I'd significantly reduce the size of that buffer - too large a
> buffer can slow down IO due to the memory it consumes and TLB misses
> it causes. I'd typically use something like:
>
> $ dd if=/dev/zero of=bigfile bs=1024k count=20k
>
> Which does 20,000 writes of 1MB each and ensures the dd process
> doesn't consume over a GB of RAM.
>
>
I did try with 1 MB. I have attached the raw test result file. As you can see from line 261, in writing 10 GB with bs=1MB, the speed was no faster two out of three times, so I dropped it. I could re-try that next time.
> This seems rather low for a buffered write on hardware that can
> clearly go faster. SLED11 is based on 2.6.27, right? I suspect that
> many of the buffered writeback issues that have been fixed since
> 2.6.30 are present in the SLED11 kernel, and if that is the case I
> can see why the allocsize mount option makes such a big
> difference.
Is it possible for the fixes in the 2.6.30 kernel to be backported to the 2.6.27 kernel in SLE 11?
If so, I would like to open a service request to Novell to do that to fix the performance issues in SLE 11.
> It might be worth checking how well direct IO writes run to take
> buffered writeback issues out of the equation. In that case, I'd use
> stripe width multiple sized buffers like:
>
> $ dd if=/dev/zero of=bigfile bs=3072k count=7k oflag=direct
>
>
I would like to do that tomorrow when I go back to work, but on my openSUSE 11.1 AMD Turion RM-74 notebook with 2.6.27.39-0.2-default kernel, on the system WD Scorpio Black 7200 RPM drive, I get 62 MB/s with dd bs=1GB for writing 20 GB file with Direct IO, and 56 MB/s without Direct IO. You are on to something!
As for the hardware performance potential, see below.
> I'd suggest that you might need to look at increasing the maximum IO
> size for the block device (/sys/block/sdb/queue/max_sectors_kb),
> maybe the request queue depth as well to get larger IOs to be pushed
> to the RAID controller. If you can, at least get it to the stripe
> width of 1536k....
>
>
Could you give a good reference for performance tuning of these parameters? I am at a total loss here.
As seen from the results file, I have tried different configurations of RAID 0, 5 and 6, with different number of drives. I am pretty confused by the results I see, although only the 20 GB file writes were done with allocsize=1g. I also did not lock the CPU frequency governor at the top clock except for the RAID 6 tests.
I decided on the allocsize=1g after checking that the multiple instance 30 MB writes have only one extent for each file, without holes or unused space.
It appears that RAID 6 writes are faster than RAID 5! And RAID 6 can even match RAID 0! The system seems to thrive on throughput, when doing multiple instances of writes, for getting high aggregate bandwidth.
I will put the performance potential of the system in context by giving some details.
The system has four Kingston DDR2-800 MHz CL6 4 GB unbuffered ECC DIMMs, set to unganged mode, so each thread has up to 6.4 GB of memory bandwidth, from one of two independent memory channels.
The AMD Phenom II X4 965 has three levels of cache, and data from memory goes directly to the L1 caches. The four cores have dedicated L1 and L2 caches, and a shared 6 MB L3. Thread switching will result in cache misses if more than four threads are running.
The IO through the HyperTransport 3.0 from CPU to the AMD 790FX chipset is at 8 GB/s. The Areca ARC-1680ix-16 is PCI-E Gen 1 x8, so the maximum bandwidth is 2 GB/s. The cache is Kingston DDR-667 CL5 4 GB unbuffered ECC, although it runs at 533 MHz, so the maximum bandwidth is 4.2 GB/s. The Intel IOP 348 1200 MHz on the card has two cores.
There are sixteen WD Caviar Black 1 TB drives in the Lian-Li PC-V2110 chassis. For the folks reading this, please do not follow this set-up, as the Caviar Blacks are a mistake. WD quietly disabled its time-limited error recovery (TLER) utility on Caviar Black drives manufactured from September 2009 onwards, so I have an array of drives that can drop out of the RAID at any time if I am unlucky, and I got screwed here.
There is a battery back-up module for the cache, and the drive caches are disabled. Tests run with the drive caches enabled showed quite a bit of speed-up in RAID 0.
We previously did tests of the Caviar Black 1 TB writing 100 MB chunks to the device without a file system, with the drive connected to the SATA ports on a Tyan Opteron motherboard with an nVidia nForce 4 Professional chipset. With the drive cache disabled, the sequential write speed was 30+ MB/s if I remember correctly, versus just under 100 MB/s with cache enabled. That is a big fall-off in speed, and that was writing at the outer diameter of the platter; speed would be halved at the inner diameter. It seems the controller firmware is meant to work with the cache enabled for proper functioning.
The desktop Caviar Black also does not have rotary vibration compensation, unlike the Caviar RE nearline drives. WD has a document showing the performance difference that rotary vibration compensation makes. I am not trying to save pennies here, but the local distributor refuses to bring in the Caviar REs, and I am stuck in no man's land.
The system has sixteen hard drives, and ten fans of different sizes and purposes in total, so that is quite a bit of rotary vibration, which I can feel when I place my hand on the side panels. I really do not know how badly the drive performance suffers as a result. The drives are attached with rubber dampers on the mounting screws.
I did the 20 GB dd test on the RAID 1 system drive, also with XFS, and got 53 MB/s with disabled drive caches, 63 MB/s enabled. That is pretty disappointing, but in light of all the above considerations, plus the kernel buffer issues, I do not really know if that is a good figure.
NCQ is enabled at depth 32. NCQ should cause a performance loss for single write streams, but gains for multiple concurrent writes.
Areca has a document showing that this card can do 800 MB/s in RAID 6 with Seagate nearline drives, with the standard 512 MB cache. That is in Windows Server; I do not know if the caches are disabled. The benchmark is IOMeter's workstation sequential write. IOMeter requires Windows for the front end, which causes me great difficulties, so I gave up trying to figure it out, and I do not understand what the workstation test does. However, in writing 30 MB files, I already exceed 1 GB/s.
[-- Attachment #2: xfstesting --]
[-- Type: application/octet-stream, Size: 38719 bytes --]
Testing for Areca data volume
mkfs.xfs -f -d agcount=32 -i align=1 -L /data /dev/sdb1
Config 1
14 drive RAID 0 128 kB stripes
3584 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=14 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
4 instances of iotesttyphoon, 100 files of 30 MB
5, 6, 4, 4 s (600, 500, 750, 750 MB/s)
2600 MB/s
Test 2
repeat the above
4, 4, 4, 6 s (750, 750, 750, 500 MB/s)
2750 MB/s
Test 3
1 instance of iotesttyphoon
2 s (1500 MB/s)
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
8, 15, 15, 19, 17, 18 s (375, 200, 200, 157.89, 176.47, 166.67 MB/s)
1276 MB/s
Test 5
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.43212 s, 750 MB/s
Test 6
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 11.0262 s, 974 MB/s
930 MB/s
Test 7
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 11.888 s, 903 MB/s
861 MB/s
Test 8
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 61.8759 s, 347 MB/s
330.98 MB/s
Test 9
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 65.8656 s, 326 MB/s
310.94 MB/s
Test 10
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 9.60774 s, 1.1 GB/s
1065 MB/s
Config 2
12 drive RAID 0 128 kB stripes
3072 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
1 instance of iotesttyphoon, 100 files of 30 MB
2 s (1500 MB/s)
Test 2
4 instances of iotesttyphoon, 100 files of 30 MB
3, 4, 4, 3 s (1000, 750, 750, 1000 MB/s)
3500 MB/s
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
5, 5, 7, 6 s (600, 600, 428.57, 500 MB/s)
2128 MB/s
17, 18, 16, 16, 13, 11 s (176, 166, 187.5, 187.5, 230, 272 MB/s)
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
9, 14, 16, 21, 21, 21 s (333, 214, 187, 142, 142, 142 MB/s)
1160 MB/s
Test 5
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.776428 s, 1.4 GB/s
3863 MB/s
Test 6
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 10.1086 s, 1.1 GB/s
1012 MB/s
Test 7
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 10.7962 s, 995 MB/s
948 MB/s
Test 8
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 65.0962 s, 330 MB/s
314 MB/s
Test 9
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 66.1583 s, 325 MB/s
309 MB/s
Config 3
14 drive RAID 5 128 kB stripes
3328 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=13 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
1 instance of iotesttyphoon, 100 files of 30 MB
3 s (1000 MB/s)
Test 2
4 instances of iotesttyphoon, 100 files of 30 MB
4, 8, 10, 15 s (750, 375, 300, 200 MB/s)
1625 MB/s
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
14, 13, 11, 5 s (214, 230, 272, 600 MB/s)
1316 MB/s
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
46, 46, 46, 46, 23, 9 s (65, 65, 65, 65, 130, 333 MB/s)
723 MB/s
Test 5
6 instances of iotesttyphoon, 100 files of 30 MB
14, 21, 41, 40, 41, 46 s (214, 142, 73, 75, 73, 65 MB/s)
642 MB/s
Test 6
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.704663 s, 1.5 GB/s
1453 MB/s
Test 7
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.669676 s, 1.6 GB/s
1529 MB/s
Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 60.8726 s, 176 MB/s
168 MB/s
Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 79.5942 s, 135 MB/s
128 MB/s
Test 10
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1048576 count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 42.4831 s, 253 MB/s
241 MB/s
Test 11
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBd bs=1048576 count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 86.3433 s, 124 MB/s
118 MB/s
Test 12
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1048576 count=20480
20480+0 records in
20480+0 records out
21474836480 bytes (21 GB) copied, 169.453 s, 127 MB/s
120 MB/s
Test 13
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 162.702 s, 132 MB/s
125 MB/s
Test 14
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 213.533 s, 101 MB/s
95 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 246.291 s, 87.2 MB/s
83 MB/s
Config 4
10 drive RAID 5 128 kB stripes, drives 7 to 16 1000 GB (931 GB)
2304 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=9 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
1 instance of iotesttyphoon, 100 files of 30 MB
2 s (1500 MB/s)
Test 2
4 instances of iotesttyphoon, 100 files of 30 MB
12, 13, 10, 5 s (250, 230, 300, 600 MB/s)
1380 MB/s
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
5, 7, 10, 12 s (600, 428, 300, 250 MB/s)
1578 MB/s
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
25, 21, 26, 22, 13, 9 s (120, 142, 115, 136, 230, 333 MB/s)
1076 MB/s
Test 5
6 instances of iotesttyphoon, 100 files of 30 MB
20, 18, 21, 14, 13, 11 s (150, 166, 142, 214, 230, 272 MB/s)
1174 MB/s
Test 6
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.674449 s, 1.6 GB/s
1518 MB/s
Test 7
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.768957 s, 1.4 GB/s
1331 MB/s
Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 83.6194 s, 128 MB/s
122 MB/s
Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 85.5271 s, 126 MB/s
119 MB/s
Test 10
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 178.256 s, 120 MB/s
114 MB/s
Test 11
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 187.051 s, 115 MB/s
109 MB/s
Test 12
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 186.364 s, 115 MB/s
109 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 198.787 s, 108 MB/s
103 MB/s
Total 223 MB/s
drive caches enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 165.58 s, 130 MB/s
123 MB/s
drive caches enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 117.277 s, 183 MB/s
174 MB/s
313 MB/s
chingl@dragon:~/testsw> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 62.6958 s, 343 MB/s
326 MB/s
chingl@dragon:~/testsw> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 73.8873 s, 291 MB/s
277 MB/s
chingl@dragon:~/testsw> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 75.4624 s, 285 MB/s
271 MB/s
in dragon, 6 instances of iotesttyphoon, 100 files of 30 MB
66, 84, 107, 130, 130, 129 s (45, 35, 28, 23, 23, 23 MB/s)
on another day, ansys using 100% CPU
2 instances of 20 GB file
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 140.058 s, 153 MB/s
146 MB/s
chingl@dragon:~> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 97.8646 s, 219 MB/s
209 MB/s
Total 355 MB/s
6 instances of 20 GB file
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 511.881 s, 42.0 MB/s
40 MB/s
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 510.087 s, 42.1 MB/s
40 MB/s
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 434.338 s, 49.4 MB/s
47 MB/s
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 339.075 s, 63.3 MB/s
60 MB/s
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 360.107 s, 59.6 MB/s
56 MB/s
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 571.768 s, 37.6 MB/s
35 MB/s
Total 278 MB/s
tornado had full CPUs and little free memory
chingl@tornado:~/testsw/iotest/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 158.466 seconds, 136 MB/s
129 MB/s
chingl@tornado:~/testsw/iotest/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 153.538 seconds, 140 MB/s
133 MB/s
chingl@typhoon:~/testsw/iotesting/w1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 133.32 seconds, 161 MB/s
153 MB/s
chingl@typhoon:~/testsw/iotesting/w1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 89.4894 seconds, 240 MB/s
228 MB/s
tornado had free CPUs
chingl@tornado:~> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 141.092 seconds, 152 MB/s
Config 5
10 drive RAID 5 128 kB stripes, drives 7 to 16 1000 GB (931 GB)
2304 LBA blocks for one set of stripes
mkfs.xfs -f -L /data /dev/sdb1
tsunami:~ # mkfs.xfs -f -L /data /dev/sdb1
meta-data=/dev/sdb1 isize=256 agcount=4, agsize=61035047 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=244140187, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=32768, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
Test 1
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 189.239 s, 113 MB/s
108 MB/s
Test 2
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 172.743 s, 124 MB/s
118 MB/s
Config 6
writing to /
Test 1
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 380.755 s, 56.4 MB/s
53 MB/s
Config 7
writing to /home
Test 1
chingl@tsunami:~/testsw> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 400.869 s, 53.6 MB/s
51 MB/s
Config 8
14 drive RAID 0 128 kB stripes
3584 LBA blocks for one set of stripes
mkfs.xfs -f -L /data /dev/sdb1
meta-data=/dev/sdb1 isize=256 agcount=13, agsize=268435455 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=3417967163, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=32768, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
Test 1
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 62.716 s, 342 MB/s
326 MB/s
Config 9
14 drive RAID 0 128 kB stripes
3584 LBA blocks for one set of stripes
drive cache enabled
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=14 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
Write 20 GB file
drive cache enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 38.9335 s, 552 MB/s
526 MB/s
Test 2
Write 20 GB file
drive cache enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 47.8844 s, 448 MB/s
427 MB/s
Config 10
drive cache enabled
Write to /home
chingl@tsunami:~/testsw> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 343.33 s, 62.5 MB/s
59 MB/s
Config 11
10 drive RAID 6 128 kB stripes drives 7 to 16 1000 GB (931 GB)
2048 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=8 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
1 instance of iotesttyphoon, 100 files of 30 MB
2 s (1500 MB/s)
Test 2
4 instances of iotesttyphoon, 100 files of 30 MB
16, 16, 9, 5 s (187, 187, 333, 600 MB/s)
1307 MB/s
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
5, 6, 11, 11 s (600, 500, 272, 272 MB/s)
1644 MB/s
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
22, 23, 20, 15, 12, 10 s (136, 130, 150, 200, 250, 300 MB/s)
1166 MB/s
Test 5
6 instances of iotesttyphoon, 100 files of 30 MB
23, 24, 22, 20, 8, 2 s (130, 125, 136, 150, 375, 1500 MB/s)
2416 MB/s
Test 6
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.710389 s, 1.5 GB/s
1441 MB/s
Test 7
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.742939 s, 1.4 GB/s
1378 MB/s
Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 31.1474 s, 345 MB/s
328 MB/s
Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 31.8968 s, 337 MB/s
321 MB/s
Test 10
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 277.032 s, 77.5 MB/s
73 MB/s
Test 11
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 262.301 s, 81.9 MB/s
78 MB/s
Test 12
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 269.545 s, 79.7 MB/s
75 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 287.517 s, 74.7 MB/s
71 MB/s
Total 146 MB/s
Test 13
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 260.578 s, 82.4 MB/s
78 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 257.186 s, 83.5 MB/s
79 MB/s
Total 157 MB/s
Test 14
3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 207.924 s, 103 MB/s
98 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 211.839 s, 101 MB/s
96 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 168.803 s, 127 MB/s
121 MB/s
Total 315 MB/s
Test 15
6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBj bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 298.946 s, 71.8 MB/s
68 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBk bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 354.162 s, 60.6 MB/s
57 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBl bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 300.646 s, 71.4 MB/s
68 MB/s
chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GBm bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 355.955 s, 60.3 MB/s
57 MB/s
chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GBn bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 299.459 s, 71.7 MB/s
68 MB/s
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GBo bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 238.459 s, 90.1 MB/s
85 MB/s
Total 403 MB/s
Config 12
14 drive RAID 0 128 kB stripes
3584 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=14 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
Test 1
6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 149.758 s, 143 MB/s
136 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 167.87 s, 128 MB/s
121 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 88.1136 s, 244 MB/s
232 MB/s
chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 161.687 s, 133 MB/s
126 MB/s
chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 82.8711 s, 259 MB/s
247 MB/s
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 91.1491 s, 236 MB/s
224 MB/s
Total 1086 MB/s
Config 13
14 drive RAID 6 128 kB stripes
3072 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1m,logbufs=8,logbsize=256k,largeio,swalloc
CPU locked at highest frequency
Test 1
4 instances of iotesttyphoon, 100 files of 30 MB
4, 12, 19, 20 s (750, 250, 157, 150 MB/s)
1307 MB/s
Test 2
1 instance of iotesttyphoon, 100 files of 30 MB
2 s (1500 MB/s)
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
4, 8, 16, 15 s (750, 375, 187, 200 MB/s)
1512 MB/s
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
5, 20, 26, 38, 38, 38 s (600, 150, 115, 78, 78, 78 MB/s)
1099 MB/s
Test 5
6 instances of iotesttyphoon, 100 files of 30 MB
5, 18, 30, 35, 36, 37 s (600, 166, 100, 85, 83, 81 MB/s)
1115 MB/s
Test 6
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.622066 s, 1.7 GB/s
1646 MB/s
Test 7
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.622326 s, 1.7 GB/s
1645 MB/s
Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 9.2308 s, 1.2 GB/s
1109 MB/s
Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 40.6836 s, 264 MB/s
251 MB/s
Test 10
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 7.13037 s, 1.5 GB/s
1436 MB/s
Test 11
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 106.851 s, 201 MB/s
191 MB/s
Test 12
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 63.7209 s, 337 MB/s
321 MB/s
Test 13
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 144.502 s, 149 MB/s
141 MB/s
Test 14
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 160.793 s, 134 MB/s
127 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 132.718 s, 162 MB/s
154 MB/s
Total 281 MB/s
Test 15
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 86.6886 s, 248 MB/s
236 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 132.328 s, 162 MB/s
154 MB/s
Total 390 MB/s
Test 16
3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 155.836 s, 138 MB/s
131 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 220.47 s, 97.4 MB/s
92 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBj bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 193.131 s, 111 MB/s
106 MB/s
Total 329 MB/s
Test 17
6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 193.71 s, 111 MB/s
105 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 185.017 s, 116 MB/s
110 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 320.172 s, 67.1 MB/s
63 MB/s
chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 275.41 s, 78.0 MB/s
74 MB/s
chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 312.053 s, 68.8 MB/s
65 MB/s
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 230.911 s, 93.0 MB/s
88 MB/s
Total 505 MB/s
Config 14
14 drive RAID 6 128 kB stripes
3072 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,logbufs=8,logbsize=256k,largeio,swalloc
CPU locked at highest frequency
Test 1
1 instance of iotesttyphoon, 100 files of 30 MB
2 s (1500 MB/s)
Test 2
4 instances of iotesttyphoon, 100 files of 30 MB
4, 4, 9, 9 s (750, 750, 333, 333 MB/s)
2166 MB/s
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
4, 5, 6, 13 s (750, 600, 500, 230 MB/s)
2080 MB/s
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
6, 20, 30, 42, 40, 50 s (500, 150, 100, 71, 75, 60 MB/s)
956 MB/s
Test 5
6 instances of iotesttyphoon, 100 files of 30 MB
5, 20, 31, 36, 39, 36 s (600, 150, 96, 83, 76, 83 MB/s)
1088 MB/s
Test 6
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.617209 s, 1.7 GB/s
1659 MB/s
Test 7
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.612309 s, 1.8 GB/s
1672 MB/s
Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 7.23929 s, 1.5 GB/s
1414 MB/s
Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 12.8803 s, 834 MB/s
795 MB/s
Test 10
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 73.2835 s, 147 MB/s
139 MB/s
Test 11
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 216.946 s, 99.0 MB/s
94 MB/s
Test 12
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 161.701 s, 133 MB/s
126 MB/s
Test 13
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 125.613 s, 171 MB/s
163 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 117.773 s, 182 MB/s
173 MB/s
336 MB/s
Test 14
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 130.653 s, 164 MB/s
156 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 145.43 s, 148 MB/s
140 MB/s
296 MB/s
Test 15
3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 192.95 s, 111 MB/s
106 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 181.372 s, 118 MB/s
112 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 188.389 s, 114 MB/s
108 MB/s
Total 326 MB/s
Test 16
6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 270.999 s, 79.2 MB/s
75 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 209.593 s, 102 MB/s
97 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 268.001 s, 80.1 MB/s
76 MB/s
chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 233.568 s, 91.9 MB/s
87 MB/s
chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 307.85 s, 69.8 MB/s
66 MB/s
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 283.836 s, 75.7 MB/s
72 MB/s
Total 473 MB/s
Config 15
14 drive RAID 6 128 kB stripes
3072 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc
CPU locked at highest frequency
Test 1
1 instance of iotesttyphoon, 100 files of 30 MB
2 s (1500 MB/s)
Test 2
4 instances of iotesttyphoon, 100 files of 30 MB
4, 8, 12, 11 s (750, 375, 250, 272 MB/s)
Total 1647 MB/s
Test 3
4 instances of iotesttyphoon, 100 files of 30 MB
4, 5, 9, 9 s (750, 600, 333, 333 MB/s)
Total 2016 MB/s
Test 4
6 instances of iotesttyphoon, 100 files of 30 MB
5, 17, 23, 32, 39, 38 s (600, 176, 130, 93, 76, 78 MB/s)
1153 MB/s
Test 5
6 instances of iotesttyphoon, 100 files of 30 MB
4, 17, 35, 31, 40, 37 s (750, 176, 85, 96, 75, 81 MB/s)
1263 MB/s
Test 6
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.607413 s, 1.8 GB/s
1685 MB/s
Test 7
Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.618797 s, 1.7 GB/s
1654 MB/s
Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 16.1841 s, 663 MB/s
632 MB/s
Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 9.05751 s, 1.2 GB/s
1130 MB/s
Test 10
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 13.5591 s, 792 MB/s
755 MB/s
Test 11
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 95.4952 s, 225 MB/s
214 MB/s
Test 12
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 92.3127 s, 233 MB/s
221 MB/s
Test 13
Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 166.29 s, 129 MB/s
123 MB/s
Test 14
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 106.278 s, 202 MB/s
192 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 146.547 s, 147 MB/s
139 MB/s
Total 331 MB/s
Test 15
2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 98.9542 s, 217 MB/s
206 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 122.802 s, 175 MB/s
166 MB/s
Total 372 MB/s
Test 16
3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 142.879 s, 150 MB/s
143 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 202.268 s, 106 MB/s
101 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBj bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 130.598 s, 164 MB/s
156 MB/s
400 MB/s
Test 17
6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 155.144 s, 138 MB/s
132 MB/s
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 265.973 s, 80.7 MB/s
77 MB/s
chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 156.72 s, 137 MB/s
130 MB/s
chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 309.921 s, 69.3 MB/s
66 MB/s
chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 268.82 s, 79.9 MB/s
76 MB/s
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 163.64 s, 131 MB/s
125 MB/s
Total 606 MB/s
Config 16
14 drive RAID 6 128 kB stripes
3072 LBA blocks for one set of stripes
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1
defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc,inode64
CPU locked at highest frequency
Test 1
Write 20 GB file
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GBxx bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 77.9626 s, 275 MB/s
262 MB/s
Test 2
Write 20 GB file
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GBxy bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 110.69 s, 194 MB/s
185 MB/s
Test 3
disk quota set
Write 20 GB file
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 72.4197 s, 297 MB/s
Test 4
disk quota set
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 116.401 s, 184 MB/s
Test 5
no disk quota
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 163.027 s, 132 MB/s
Test 6
no disk quota
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 83.8251 s, 256 MB/s
* Re: allocsize mount option
2010-01-14 17:25 Gim Leong Chin
@ 2010-01-14 17:42 ` Eric Sandeen
2010-01-14 23:28 ` Dave Chinner
1 sibling, 0 replies; 14+ messages in thread
From: Eric Sandeen @ 2010-01-14 17:42 UTC (permalink / raw)
To: Gim Leong Chin; +Cc: xfs
Gim Leong Chin wrote:
> Hi Dave,
>
>
>> fragmented, it just means that there are 19% more fragments
>> than the ideal. In 4TB of data with 1GB sized files, that would
>> mean there are 4800 extents (average length ~800MB, which is
>> excellent) instead of the perfect 4000 extents (@1GB each). Hence
>> you can see how misleading this "19% fragmentation" number can be
>> on an extent based filesystem...
>
> There are many files that are 128 GB.
>
> When I did the tests with dd on this computer, the 20 GB files had up
> to > 50 extents.
which is at least 400 MB per extent, which is really not so bad.
-Eric
* Re: allocsize mount option
2010-01-14 17:25 Gim Leong Chin
2010-01-14 17:42 ` Eric Sandeen
@ 2010-01-14 23:28 ` Dave Chinner
1 sibling, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2010-01-14 23:28 UTC (permalink / raw)
To: Gim Leong Chin; +Cc: Eric Sandeen, xfs
On Fri, Jan 15, 2010 at 01:25:15AM +0800, Gim Leong Chin wrote:
> Hi Dave,
>
>
> > fragmented, it just means that there are 19% more
> > fragments
> > than the ideal. In 4TB of data with 1GB sized files, that
> > would mean
> > there are 4800 extents (average length ~800MB, which is
> > excellent)
> > instead of the perfect 4000 extents (@1GB each). Hence you
> > can see
> > how misleading this "19% fragmentation" number can be on an
> > extent
> > based filesystem...
>
> There are many files that are 128 GB.
>
> When I did the tests with dd on this computer, the 20 GB files had
> up to > 50 extents.
That's still an average of 400MB extents, which is more than large
enough to guarantee optimal disk bandwidth when reading or writing them
on your setup....
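As a quick sanity check of that arithmetic (the 20 GB file size and ~50
extent count come from the dd tests earlier in this thread; the script is
only an illustrative sketch):

```shell
#!/bin/sh
# Average extent size for a 20 GiB file split into ~50 extents,
# using the numbers reported for the dd tests in this thread.
FILE_MIB=$((20 * 1024))   # 20 GiB expressed in MiB
EXTENTS=50

awk -v s="$FILE_MIB" -v n="$EXTENTS" \
    'BEGIN { printf "average extent size: %.1f MiB\n", s / n }'
```

The actual extent list for a given file can be dumped with
`xfs_bmap -v <file>` and the extents counted from there.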
> > This all looks good - it certainly seems that you have done
> > your
> > research. ;) The only thing I'd do differently is that if
> > you have
> > only one partition on the drives, I wouldn't even put a
> > partition on it.
> >
>
> I just learnt from you that I can have a filesystem without a
> partition table! That takes care of having to calculate the start
> of the partition! Are there any other benefits? But are there
> any down sides to not having a partition table?
That's the main benefit, though there are others like no limit on
the partition size (e.g. msdos partitions are a max of 2TB) but
you avoided most of those problems by using GPT labels.
There aren't any real downsides that I am aware of, except
maybe that future flexibility of the volume is reduced, e.g.
if you grow the volume, then you can still only have one filesystem
on it....
> > This seems rather low for a buffered write on hardware that
> > can
> > clearly go faster. SLED11 is based on 2.6.27, right? I
> > suspect that
> > many of the buffered writeback issues that have been fixed
> > since
> > 2.6.30 are present in the SLED11 kernel, and if that is the
> > case I
> > can see why the allocsize mount option makes such a big
> > difference.
>
> Is it possible for the fixes in the 2.6.30 kernel to be backported to the 2.6.27 kernel in SLE 11?
> If so, I would like to open a service request to Novell to do that to fix the performance issues in SLE 11.
You'd have to get all the fixes from 2.6.30 to 2.6.32, and the
backport would be very difficult to get right. Better would
be just to upgrade the kernel to 2.6.32 ;)
> > I'd suggest that you might need to look at increasing the
> > maximum IO
> > size for the block device
> > (/sys/block/sdb/queue/max_sectors_kb),
> > maybe the request queue depth as well to get larger IOs to
> > be pushed
> > to the raid controller. if you can, at least get it to the
> > stripe
> > width of 1536k....
>
> Could you give a good reference for performance tuning of these
> parameters? I am at a total loss here.
Welcome to the black art of storage subsystem tuning ;)
I'm not sure there is a good reference for tuning the block device
parameters - most of what I know was handed down by word of mouth
from gurus on high mountains.
The overriding principle, though, is to try to ensure that the
stripe width sized IOs can be issued right through the IO stack to
the hardware, and that those IOs are correctly aligned to the
stripes. You've got the filesystem configuration and layout part
correct, now it's just tuning the block layer to pass the IOs
through.
I'd be looking in the Documentation/block directory
of the kernel source and googling for other documentation....
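A minimal sketch of the block-layer side of that tuning, assuming the device is sdb and the 1536k stripe width from earlier in the thread (the nr_requests value is illustrative; max_sectors_kb cannot be raised above the hardware ceiling):

```shell
# Check the hardware ceiling first; max_sectors_kb cannot exceed it.
cat /sys/block/sdb/queue/max_hw_sectors_kb
# Allow stripe-width-sized (1536k) requests to reach the RAID controller.
echo 1536 > /sys/block/sdb/queue/max_sectors_kb
# Deepen the request queue so larger IOs can be built before dispatch
# (512 is an illustrative value, not a recommendation).
echo 512 > /sys/block/sdb/queue/nr_requests
```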
>
> As seen from the results file, I have tried different
> configurations of RAID 0, 5 and 6, with different number of
> drives. I am pretty confused by the results I see, although only
> the 20 GB file writes were done with allocsize=1g. I also did not
> lock the CPU frequency governor at the top clock except for the
> RAID 6 tests.
FWIW, your tests are not timing how long it takes for all the
data to hit the disk, only how long it takes to get into cache.
You really need to do for single threads:
$ time (dd if=/dev/zero of=<file> bs=XXX count=YYY; sync)
and something like this for multiple (N) threads:
time (
for i in `seq 0 1 N`; do
dd if=/dev/zero of=<file>.$i bs=XXX count=YYY &
done
wait
sync
)
And that will give you a much more accurate measure of the throughput
rate across all file sizes. You'll need to calculate the rate manually
from the output of the time command and the amount of data the test
writes.
Or, alternatively, you could just use direct IO, which avoids
such cache effects by bypassing it....
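A runnable, scaled-down version of that multi-writer sketch (the sizes here are tiny and purely illustrative — scale bs/count up to your real workload; the point is that the sync sits inside the timed region and the rate is derived manually as described):

```shell
#!/bin/sh
# Timed multi-writer test: throughput is only meaningful if the final
# sync is inside the timed region, as discussed above.
dir=$(mktemp -d)
bs=$((1024 * 1024)); count=8; nthreads=4        # 4 x 8 MiB, illustrative only
start=$(date +%s.%N)
i=0
while [ $i -lt $nthreads ]; do
    dd if=/dev/zero of="$dir/file.$i" bs=$bs count=$count 2>/dev/null &
    i=$((i + 1))
done
wait
sync
end=$(date +%s.%N)
total=$((bs * count * nthreads))                # bytes written across threads
awk -v b="$total" -v s="$start" -v e="$end" \
    'BEGIN { printf "wrote %d bytes in %.2fs: %.0f MiB/s\n", b, e - s, b / (e - s) / 1048576 }'
rm -rf "$dir"
```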
> I decided on the allocsize=1g after checking that the multiple
> instance 30 MB writes have only one extent for each file, without
> holes or unused space.
>
> It appears that RAID 6 writes are faster than RAID 5! And RAID 6
> can even match RAID 0! The system seems to thrive on throughput,
> when doing multiple instances of writes, for getting high
> aggregate bandwidth.
Given my above comments, that may not be true.
[....]
> We previously did tests of the Caviar Black 1 TB writing 100 MB
> chunks to the device without a file system, with the drive
> connected to the SATA ports on a Tyan Opteron motherboard with
> nVidia nForce 4 Professional chipset. With the drive cache
> disabled, the sequential write speed was 30+ MB/s if I remember
> correctly, versus sub 100 MB/s with cache enabled. That is a big
> fall-off in speed, and that was writing at the outer diameter of
> the platter; speed would be halved at the inner diameter. It
> seems the controller firmware is meant to work with cache enabled
> for proper functioning.
That sounds wrong - it sounds like NCQ is not functioning properly,
as with NCQ enabled, disabling the drive cache should not impact
throughput at all....
FWIW, for SAS and SCSI drives, I recommend turning the drive caches
off as the impact of filesystem issued barrier writes on performance
is worse than disabling the drive caches....
> The desktop Caviar Black also does not have rotary
> compensation, unlike the Caviar RE nearline drives. WD has a
> document showing the performance difference having rotary
> vibration compensation makes. I am not trying to save pennies
> here, but the local distributor refuses to bring in the Caviar
> REs, and I am stuck in no man's land.
I'd suggest trying to find another distributor that will bring them
in for you. Putting that many drives in a single chassis is almost
certainly going to cause vibration problems, especially if you get
all the disk heads moving in close synchronisation (which is what
happens when you get all your IO sizing and alignment right).
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: allocsize mount option
@ 2010-01-15 3:08 Gim Leong Chin
0 siblings, 0 replies; 14+ messages in thread
From: Gim Leong Chin @ 2010-01-15 3:08 UTC (permalink / raw)
To: Dave Chinner; +Cc: Eric Sandeen, xfs
Hi Dave,
Thank you for the advice!
I have done Direct IO dd tests writing the same 20 GB files. The results are an eye opener! bs=1GB, count=2
Single instance repeats of 830, 800 MB/s, compared to >100 to under 300 MB/s for buffered.
Two instances aggregate of 304 MB/s, six instances aggregate of 587 MB/s.
System drive /home RAID 1 of 130 MB/s compared to 51 MB/s buffered.
So the problem is with the buffered writes.
> You'd have to get all the fixes from 2.6.30 to 2.6.32,
> and the
> backport would be very difficult to get right. Better
> would
> be just to upgrade the kernel to 2.6.32 ;)
If I change the kernel, I would have no support from Novell. I would try my luck and convince them.
> > > I'd suggest that you might need to look at
> increasing the
> > > maximum IO
> > > size for the block device
> > > (/sys/block/sdb/queue/max_sectors_kb),
> > > maybe the request queue depth as well to get
> larger IOs to
> > > be pushed
> > > to the raid controller. if you can, at least get
> it to the
> > > stripe
> > > width of 1536k....
> >
> > Could you give a good reference for performance tuning
> of these
> > parameters? I am at a total loss here.
>
> Welcome to the black art of storage subsystem tuning ;)
>
> I'm not sure there is a good reference for tuning the block
> device
> parameters - most of what I know was handed down by word of
> mouth
> from gurus on high mountains.
>
> The overriding principle, though, is to try to ensure that
> the
> stripe width sized IOs can be issued right through the IO
> stack to
> the hardware, and that those IOs are correctly aligned to
> the
> stripes. You've got the filesystem configuration and layout
> part
> correct, now it's just tuning the block layer to pass the
> IO's
> through.
Can I confirm that /sys/block/sdb/queue/max_sectors_kb should be set
to the stripe width of 1536 kB?
Which parameter is "request queue depth"? What should be the value?
> FWIW, your tests are not timing how long it takes for all
> the
> data to hit the disk, only how long it takes to get into
> cache.
Thank you! I do know that XFS buffers writes extensively. The drive LEDs remain lit long after the OS says the writes are completed. Plus some timings are physically impossible.
> That sounds wrong - it sounds like NCQ is not functioning
> properly
> as with NCQ enabled, disabling the drive cache should not
> impact
> throughput at all....
I do not remember clearly if NCQ is available for that motherboard (it runs 32-bit Ubuntu), but I do remember seeing queue depth in the kernel. I will check it out next week.
But what I read is that NCQ hurts single write performance. That is also what I found with another Areca SATA RAID in Windows XP.
What I found with all the drives we tested was that disabling the cache badly hurt sequential write performance (no file system, write data directly to designated LBA).
> I'd suggest trying to find another distributor that will
> bring them
> in for you. Putting that many drives in a single chassis is
> almost
> certainly going to cause vibration problems, especially if
> you get
> all the disk heads moving in close synchronisation (which
> is what
> happens when you get all your IO sizing and alignment
> right).
I am working on changing to the WD Caviar RE4 drives. Not sure if I can pull it off.
Chin Gim Leong
* Re: allocsize mount option
@ 2010-01-24 6:44 Gim Leong Chin
0 siblings, 0 replies; 14+ messages in thread
From: Gim Leong Chin @ 2010-01-24 6:44 UTC (permalink / raw)
To: Dave Chinner; +Cc: Eric Sandeen, xfs
Hi Guys,
An update:
SUSE Linux Enterprise 11 SP1 will be based on kernel 2.6.32, according to Novell support, when I asked for backporting. That will solve my problems.
I do confirm that all the drives (no file system testing) were tested with NCQ enabled on the Tyan Opteron motherboard. What I did learn was that notebook drives nowadays come with NCQ, and that all the SATA desktop and notebook drives showed very poor sequential write performance with cache disabled.
On another note, there was a post by Michael Monnerie on 20 NOV 2009 about kernel 2.6.27, XFS, inode64 and NFS.
I have the 10.9 TB inode64 XFS file system exported via NFS 4 by SLED 11, and mounted by SLES 10 SP2, and I have no problems with it.
Finally, a big thank you to Dave Chinner and Eric Sandeen for your kind assistance!
GL
* allocsize mount option
@ 2010-09-28 18:53 Ivan.Novick
2010-09-29 0:31 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Ivan.Novick @ 2010-09-28 18:53 UTC (permalink / raw)
To: xfs; +Cc: Timothy.Heath
Hi all,
According to the documentation the allocsize mount option: "Sets the
buffered I/O end-of-file preallocation size when doing delayed allocation
writeout"
Will this value limit "extent" sizes to be no smaller than the allocsize?
I have set the following mount options:
(rw,noatime,nodiratime,logbufs=8,allocsize=512m)
And yet, depending on the workload, the extent sizes are often 1 or 2 orders
of magnitude lower than 512 MB ...
If I wanted to do further reading on the subject, can someone point me to an
approximate location in the code where the size of a newly created extent is
determined?
Cheers,
Ivan Novick
* Re: allocsize mount option
2010-09-28 18:53 Ivan.Novick
@ 2010-09-29 0:31 ` Dave Chinner
0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2010-09-29 0:31 UTC (permalink / raw)
To: Ivan.Novick; +Cc: Timothy.Heath, xfs
On Tue, Sep 28, 2010 at 02:53:46PM -0400, Ivan.Novick@emc.com wrote:
> Hi all,
>
> According to the documentation the allocsize mount option: "Sets the
> buffered I/O end-of-file preallocation size when doing delayed allocation
> writeout"
>
> Will this value limit "extent" sizes to be be no smaller than the allocsize?
No - it's speculative preallocation.
> I have set the following mount options:
> (rw,noatime,nodiratime,logbufs=8,allocsize=512m)
/me wishes he could run a sed script across the internet.
noatime implies nodiratime, and logbufs=8 is the default, so you
only need "noatime,allocsize=512m"
> And yet, depending on the workload, the extent sizes are often 1 or 2 orders
> of magnitude lower than 512 MB ...
It's speculative and there's no guarantee that it can find a big
enough extent to complete the full allocsize allocation. Also, when
you close the file the speculative allocation beyond EOF is
truncated away. This is a particular problem with NFS servers.
> If I wanted to do further reading on the subject, can someone point me to an
> approximate location in the code where the size of a newly created extent is
> determined?
Start here:
fs/xfs/xfs_iomap.c::xfs_iomap_write_delay()
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* allocsize mount option
@ 2011-01-20 20:41 Peter Vajgel
2011-01-21 0:48 ` Dave Chinner
0 siblings, 1 reply; 14+ messages in thread
From: Peter Vajgel @ 2011-01-20 20:41 UTC (permalink / raw)
To: xfs@oss.sgi.com
We write about 100 100GB files into a single 10TB volume with xfs. We are using allocsize=1g to limit the fragmentation, with great success. We also need to reserve some space (~200GB) on each filesystem for processing the files and writing new versions of the files. Once we have only 200GB available we stop writing to the files. However with allocsize it's not that easy - we see +/- 100GB added or taken away depending on whether writes are still in progress and whether the file was reopened... Is there a way to programmatically disable allocsize speculative preallocation once we exceed a certain threshold, and also return the current speculative preallocation back to the free space (without closing the file)?
Thx
Peter Vajgel
* Re: allocsize mount option
2011-01-20 20:41 allocsize mount option Peter Vajgel
@ 2011-01-21 0:48 ` Dave Chinner
0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2011-01-21 0:48 UTC (permalink / raw)
To: Peter Vajgel; +Cc: xfs@oss.sgi.com
On Thu, Jan 20, 2011 at 08:41:19PM +0000, Peter Vajgel wrote:
> We write about 100 100GB files into a single 10TB volume with xfs.
> We are using allocsize=1g to limit the fragmentation with a great
> success. We also need to reserve some space (~200GB) on each
> filesystem for processing the files and writing new versions of
> the files. Once we have only 200GB available we stop writing to
> the files. However with allocsize it's not that easy - we see +/-
> 100GB added or taken depending if there are still writes going and
> if the file was reopened ... Is there a way to programmatically
> disable allocsize speculative preallocation once we exceed certain
> threshold and also return the current speculative preallocation
> back to the free space (without closing the file)?
No and no.
However, if you take a look at the new dynamic speculative
allocation code in 2.6.38-rc1, it scales back the preallocation as
ENOSPC is approached but doesn't do any reclaiming of existing
preallocation. It will also preallocate much larger extents, so it
may not be ideal for you, either. I've appended the commit message
below.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
commit 055388a3188f56676c21e92962fc366ac8b5cb72
Author: Dave Chinner <dchinner@redhat.com>
Date: Tue Jan 4 11:35:03 2011 +1100
xfs: dynamic speculative EOF preallocation
Currently the size of the speculative preallocation during delayed
allocation is fixed by either the allocsize mount option or a
default size. We are seeing a lot of cases where we need to
recommend using the allocsize mount option to prevent fragmentation
when buffered writes land in the same AG.
Rather than using a fixed preallocation size by default (up to 64k),
make it dynamic by basing it on the current inode size. That way the
EOF preallocation will increase as the file size increases. Hence
for streaming writes we are much more likely to get large
preallocations exactly when we need it to reduce fragmentation.
For default settings, the size of the initial extents is determined
by the number of parallel writers and the amount of memory in the
machine. For 4GB RAM and 4 concurrent 32GB file writes:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET T
0: [0..1048575]: 1048672..2097247 0 (1048672..2097247) 10
1: [1048576..2097151]: 5242976..6291551 0 (5242976..6291551) 10
2: [2097152..4194303]: 12583008..14680159 0 (12583008..14680159) 20
3: [4194304..8388607]: 25165920..29360223 0 (25165920..29360223) 41
4: [8388608..16777215]: 58720352..67108959 0 (58720352..67108959) 83
5: [16777216..33554423]: 117440584..134217791 0 (117440584..134217791) 167
6: [33554424..50331511]: 184549056..201326143 0 (184549056..201326143) 167
7: [50331512..67108599]: 251657408..268434495 0 (251657408..268434495) 167
and for 16 concurrent 16GB file writes:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET
0: [0..262143]: 2490472..2752615 0 (2490472..2752615) 2
1: [262144..524287]: 6291560..6553703 0 (6291560..6553703) 2
2: [524288..1048575]: 13631592..14155879 0 (13631592..14155879) 5
3: [1048576..2097151]: 30408808..31457383 0 (30408808..31457383) 10
4: [2097152..4194303]: 52428904..54526055 0 (52428904..54526055) 20
5: [4194304..8388607]: 104857704..109052007 0 (104857704..109052007) 41
6: [8388608..16777215]: 209715304..218103911 0 (209715304..218103911) 83
7: [16777216..33554423]: 452984848..469762055 0 (452984848..469762055) 167
Because it is hard to take back speculative preallocation, cases
where there are large slow growing log files on a nearly full
filesystem may cause premature ENOSPC. Hence as the filesystem nears
full, the maximum dynamic prealloc size is reduced according to this
table (based on 4k block size):
freespace max prealloc size
>5% full extent (8GB)
4-5% 2GB (8GB >> 2)
3-4% 1GB (8GB >> 3)
2-3% 512MB (8GB >> 4)
1-2% 256MB (8GB >> 5)
<1% 128MB (8GB >> 6)
This should reduce the amount of space held in speculative
preallocation for such cases.
The allocsize mount option turns off the dynamic behaviour and fixes
the prealloc size to whatever the mount option specifies. i.e. the
behaviour is unchanged.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
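The fullness-based scaling in the quoted table can be sketched as a simple right-shift of the 8GB maximum (integer percentages only; `max_prealloc_mb` is a name made up here for illustration, not a kernel function):

```shell
# Returns the max speculative prealloc size in MiB for a given integer
# percentage of free space, following the quoted table (8 GiB base).
max_prealloc_mb() {
    base=$((8 * 1024))              # 8 GiB expressed in MiB
    if   [ "$1" -gt 5 ]; then echo "$base"            # >5% free: full extent
    elif [ "$1" -ge 4 ]; then echo $((base >> 2))     # 4-5%: 2 GiB
    elif [ "$1" -ge 3 ]; then echo $((base >> 3))     # 3-4%: 1 GiB
    elif [ "$1" -ge 2 ]; then echo $((base >> 4))     # 2-3%: 512 MiB
    elif [ "$1" -ge 1 ]; then echo $((base >> 5))     # 1-2%: 256 MiB
    else                      echo $((base >> 6))     # <1%: 128 MiB
    fi
}
max_prealloc_mb 10   # prints 8192
max_prealloc_mb 3    # prints 1024
```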
end of thread, other threads:[~2011-01-21 0:46 UTC | newest]
Thread overview: 14+ messages
2010-01-13 9:42 allocsize mount option Gim Leong Chin
2010-01-13 10:50 ` Dave Chinner
2010-01-13 22:59 ` xfstests: Clean up build output Alex Elder
-- strict thread matches above, loose matches on Subject: below --
2011-01-20 20:41 allocsize mount option Peter Vajgel
2011-01-21 0:48 ` Dave Chinner
2010-09-28 18:53 Ivan.Novick
2010-09-29 0:31 ` Dave Chinner
2010-01-24 6:44 Gim Leong Chin
2010-01-15 3:08 Gim Leong Chin
2010-01-14 17:25 Gim Leong Chin
2010-01-14 17:42 ` Eric Sandeen
2010-01-14 23:28 ` Dave Chinner
2010-01-11 17:25 Gim Leong Chin
2010-01-11 18:16 ` Eric Sandeen