public inbox for linux-xfs@vger.kernel.org
* allocsize mount option
@ 2010-01-11 17:25 Gim Leong Chin
  2010-01-11 18:16 ` Eric Sandeen
  0 siblings, 1 reply; 13+ messages in thread
From: Gim Leong Chin @ 2010-01-11 17:25 UTC (permalink / raw)
  To: xfs

Hi,

Mount options for xfs
       allocsize=size
       Sets  the buffered I/O end-of-file preallocation size when doing delayed allocation writeout (default size is 64KiB). 


I read that setting allocsize to a big value can be used to combat filesystem fragmentation when writing big files.

I do not understand how allocsize works.  Say I set allocsize=1g, but my file size is only 1 MB or even smaller.  Will the rest of the 1 GB extent still be allocated, resulting in wasted space and even a file fragmentation problem?

Does setting allocsize to a big value result in performance gain when writing big files?  Is performance hurt by a big value setting when writing files smaller than the allocsize value?

I am setting up a system for HPC, where two applications have different file size characteristics: one writes files of several GB, up to 128 GB, and the other writes files of a few MB to tens of MB.

I am not able to find documentation on the behaviour of the allocsize mount option.

Thank you.


Chin Gim Leong



_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: allocsize mount option
@ 2010-01-13  9:42 Gim Leong Chin
  2010-01-13 10:50 ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Gim Leong Chin @ 2010-01-13  9:42 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

Hi,


The application is ANSYS, which writes 128 GB files.  The existing computer, which runs SUSE Linux Enterprise Desktop 11 and is used for running ANSYS, has two software RAID 0 devices made up of five 1 TB drives.  The /home partition is 4.5 TB, of which 4 TB is now used.  I see fragmentation of more than 19%.


I have just set up a new computer with 16 WD Caviar Black 1 TB drives connected to an Areca 1680ix-16 RAID controller with 4 GB cache.  14 of these drives are in RAID 6 with 128 kB stripes.  The OS is also SLED 11.  The system has 16 GB of memory and an AMD Phenom II X4 965 CPU.

I have done tests writing 100 files of 30 MB each, and single files of 1 GB, 10 GB and 20 GB, with single and multiple instances.

There is a big difference in write speed for 20 GB files between mounting with allocsize=1g and mounting without it.  That is without the inode64 option, which gives further speed gains.

I use dd for writing the 1 GB, 10 GB and 20 GB files.

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc

The start of the partition has been set to LBA 3072 using GPT Fdisk to align the stripes.
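
As a rough check of the alignment arithmetic, one full stripe is the stripe unit multiplied by the number of data disks, expressed in 512-byte sectors.  The sketch below uses the su and sw values from the mkfs.xfs line above; the function name stripe_sectors is made up for illustration and is not part of any XFS tool.

```shell
# Sectors in one full stripe: stripe unit (KiB) x data disks, in sectors.
# stripe_sectors is a hypothetical helper, not part of any XFS tool.
stripe_sectors() {
    su_kib=$1; sw=$2; sector_bytes=${3:-512}
    echo $(( su_kib * 1024 * sw / sector_bytes ))
}

# su=128k, sw=12 (a 14-drive RAID 6 has 12 data disks)
stripe_sectors 128 12
```

With these inputs the result is 3072, which is the starting LBA chosen for the partition.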

The dd command is:

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20

Single-instance 20 GB dd runs gave 214, 221 and 123 MB/s with allocsize=1g, compared to 94 and 126 MB/s without.

Two concurrent 20 GB dd instances gave aggregates of 331 and 372 MB/s with allocsize=1g, compared to 336 and 296 MB/s without.

Three concurrent 20 GB dd instances gave an aggregate of 400 MB/s with, 326 MB/s without.

Six concurrent 20 GB dd instances gave 606 MB/s with, 473 MB/s without.
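
To put the figures above in relative terms, a small sketch of the percentage gain follows.  The helper name pct_gain is hypothetical, and the result is truncated to a whole percent because shell arithmetic is integer-only.

```shell
# Percentage gain of "with" over "without", truncated to a whole percent.
# pct_gain is a hypothetical helper used only for illustration.
pct_gain() {
    with=$1; without=$2
    echo $(( (with - without) * 100 / without ))
}

pct_gain 400 326   # three instances
pct_gain 606 473   # six instances
```

For the three-instance and six-instance runs this works out to gains of about 22% and 28%.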


My production configuration is

defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc,inode64

for which I got up to 297 MB/s for single instance 20 GB dd.



Chin Gim Leong


--- On Tue, 12/1/10, Eric Sandeen <sandeen@sandeen.net> wrote:

> From: Eric Sandeen <sandeen@sandeen.net>
> Subject: Re: allocsize mount option
> To: "Gim Leong Chin" <chingimleong@yahoo.com.sg>
> Cc: xfs@oss.sgi.com
> Date: Tuesday, 12 January, 2010, 2:16 AM
> Gim Leong Chin wrote:
> > Hi,
> > 
> > Mount options for xfs allocsize=size Sets the buffered I/O
> > end-of-file preallocation size when doing delayed allocation writeout
> > (default size is 64KiB).
> > 
> > 
> > I read that setting allocsize to a big value can be used to combat
> > filesystem fragmentation when writing big files.
> 
> That's not universally necessary though, depending on how you are
> writing them.  I've only used it in the very specific case of mythtv
> calling "sync" every couple seconds, and defeating delalloc.
> 
> > I do not understand how allocsize works.  Say I set allocsize=1g, but
> > my file size is only 1 MB or even smaller.  Will the rest of the 1 GB
> > file extent be allocated, resulting in wasted space and even file
> > fragmentation problem?
> 
> possibly :)  It's only speculatively allocated, though, so you won't
> have 1g for every file; when it's closed the preallocation goes
> away, IIRC.
> 
> > Does setting allocsize to a big value result in performance gain when
> > writing big files?  Is performance hurt by a big value setting when
> > writing files smaller than the allocsize value?
> > 
> > I am setting up a system for HPC, where two different applications
> > have different file size characteristics, one writes files of GBs and
> > even 128 GB, the other is in MBs to tens of MBs.
> 
> We should probably back up and say:  are you seeing fragmentation
> problems -without- the mount option, and if so, what is your write pattern?
> 
> -Eric
> 
> > I am not able to find documentation on the behaviour of allocsize
> > mount option.
> > 
> > Thank you.
> > 
> > 
> > Chin Gim Leong




* Re: allocsize mount option
@ 2010-01-14 17:25 Gim Leong Chin
  2010-01-14 17:42 ` Eric Sandeen
  2010-01-14 23:28 ` Dave Chinner
  0 siblings, 2 replies; 13+ messages in thread
From: Gim Leong Chin @ 2010-01-14 17:25 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, xfs

[-- Attachment #1: Type: text/plain, Size: 7631 bytes --]

Hi Dave,


> fragmented, it just means that there are 19% more fragments
> than the ideal. In 4TB of data with 1GB sized files, that would mean
> there are 4800 extents (average length ~800MB, which is excellent)
> instead of the perfect 4000 extents (@1GB each). Hence you can see
> how misleading this "19% fragmentation" number can be on an extent
> based filesystem...

There are many files that are 128 GB.

When I did the tests with dd on this computer, the 20 GB files had as many as 50 or more extents.
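
The arithmetic in the quoted example can be reproduced with a rough sketch.  It reads "19% fragmentation" the way the quote does, as 19% more extents than the ideal; this is an assumption about the interpretation and not necessarily how the reporting tool computes its figure.  The function names are hypothetical, and 1 GB is taken as 1000 MB to keep the numbers round.

```shell
# Extent count if there are pct percent more extents than the ideal.
# Both helpers are hypothetical and exist only for this illustration.
extents_with_excess() {
    ideal=$1; pct=$2
    echo $(( ideal * (100 + pct) / 100 ))
}

# Average extent length in MB for a given total size in MB.
avg_extent_mb() {
    total_mb=$1; extents=$2
    echo $(( total_mb / extents ))
}

n=$(extents_with_excess 4000 19)   # 4 TB held as 1 GB files
echo "$n extents, about $(avg_extent_mb 4000000 "$n") MB each"
```

The quote rounds these figures to 4800 extents and roughly 800 MB.  The same calculation with 128 GB files and 50 extents per 20 GB would give a very different picture, which is the point of the reply above.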


> This all looks good - it certainly seems that you have done your
> research. ;) The only thing I'd do differently is that if you have
> only one partition on the drives, I wouldn't even put a partition on it.
> 

I just learnt from you that I can have a filesystem without a partition table!  That takes care of having to calculate the start of the partition!  Are there any other benefits?  And are there any downsides to not having a partition table?


> I'd significantly reduce the size of that buffer - too large a
> buffer can slow down IO due to the memory it consumes and TLB misses
> it causes. I'd typically use something like:
> 
> $ dd if=/dev/zero of=bigfile bs=1024k count=20k
> 
> Which does 20,000 writes of 1MB each and ensures the dd process
> doesn't consume over a GB of RAM.
> 

I did try with 1 MB.  I have attached the raw test result file.  As you can see from line 261, in writing 10 GB with bs=1MB, the speed was no faster two out of three times, so I dropped it.  I could re-try that next time.
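
For comparing runs with different block sizes, it helps to confirm that the amount written is the same.  A minimal sketch, assuming only that dd writes bs x count bytes; the helper name dd_total_bytes is made up for illustration.

```shell
# Total bytes written by dd for a given block size (bytes) and count.
# dd_total_bytes is a hypothetical helper, not a dd feature.
dd_total_bytes() {
    bs_bytes=$1; count=$2
    echo $(( bs_bytes * count ))
}

dd_total_bytes 1073741824 20      # bs=1 GiB, count=20
dd_total_bytes 1048576 20480      # bs=1 MiB, count=20480 (20k)
```

Both forms write 21474836480 bytes, so the two dd invocations differ only in the size of the buffer that dd holds in memory.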


> This seems rather low for a buffered write on hardware that can
> clearly go faster. SLED11 is based on 2.6.27, right? I suspect that
> many of the buffered writeback issues that have been fixed since
> 2.6.30 are present in the SLED11 kernel, and if that is the case I
> can see why the allocsize mount option makes such a big difference.

Is it possible for the fixes in the 2.6.30 kernel to be backported to the 2.6.27 kernel in SLE 11?
If so, I would like to open a service request to Novell to do that to fix the performance issues in SLE 11.


> It might be worth checking how well direct IO writes run to take
> buffered writeback issues out of the equation. In that case, I'd use
> stripe width multiple sized buffers like:
> 
> $ dd if=/dev/zero of=bigfile bs=3072k count=7k oflag=direct
> 

I would like to do that tomorrow when I go back to work.  Meanwhile, on my openSUSE 11.1 AMD Turion RM-74 notebook with the 2.6.27.39-0.2-default kernel, writing a 20 GB file to the system drive (WD Scorpio Black, 7200 RPM) with dd bs=1GB gives 62 MB/s with direct IO and 56 MB/s without.  You are on to something!
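
For the direct IO test on the RAID volume, the block size can be derived from the stripe geometry.  The sketch below only builds the dd command line and does not run it; direct_dd_cmd is a hypothetical helper, and the 20 GiB target size is my assumption to match the earlier tests.

```shell
# Build (but do not run) a direct-IO dd command whose block size is a
# whole multiple of the stripe width. direct_dd_cmd is a hypothetical helper.
direct_dd_cmd() {
    su_kib=$1; sw=$2; mult=$3; total_kib=$4; out=$5
    bs_kib=$(( su_kib * sw * mult ))
    # Round the count up so that at least total_kib is written.
    count=$(( (total_kib + bs_kib - 1) / bs_kib ))
    echo "dd if=/dev/zero of=$out bs=${bs_kib}k count=$count oflag=direct"
}

# 128 KiB stripe unit, 12 data disks, 2 stripe widths per write, 20 GiB total
direct_dd_cmd 128 12 2 $(( 20 * 1024 * 1024 )) bigfile
```

With su=128k and sw=12 this yields bs=3072k, the same block size as in the quoted suggestion, with the count rounded up to cover 20 GiB.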

As for the hardware performance potential, see below.

> I'd suggest that you might need to look at increasing the maximum IO
> size for the block device (/sys/block/sdb/queue/max_sectors_kb),
> maybe the request queue depth as well to get larger IOs to be pushed
> to the raid controller. If you can, at least get it to the stripe
> width of 1536k....
> 

Could you give a good reference for performance tuning of these parameters?  I am at a total loss here.
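
As a starting point before changing anything, the current limits can simply be read back.  This is a minimal sketch: show_queue_limits is a hypothetical helper, the attribute names are the standard Linux sysfs files under the queue directory, and sdb is the device name used in this thread.

```shell
# Print selected block-queue limits found in a queue directory.
# show_queue_limits is a hypothetical helper; the attribute names are
# standard Linux sysfs files under /sys/block/<dev>/queue/.
show_queue_limits() {
    qdir=$1
    for attr in max_sectors_kb max_hw_sectors_kb nr_requests; do
        if [ -r "$qdir/$attr" ]; then
            echo "$attr=$(cat "$qdir/$attr")"
        fi
    done
}

# Read-only usage; prints nothing if the device does not exist.
show_queue_limits /sys/block/sdb/queue

# Raising the limit needs root and cannot exceed max_hw_sectors_kb, e.g.:
#   echo 1536 > /sys/block/sdb/queue/max_sectors_kb
```

The function takes the directory as an argument so that it can be pointed at any device, and it makes no changes to the system.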


As seen from the results file, I have tried different configurations of RAID 0, 5 and 6, with different numbers of drives.  I am pretty confused by the results I see, although only the 20 GB file writes were done with allocsize=1g.  I also did not lock the CPU frequency governor at the top clock except for the RAID 6 tests.

I decided on the allocsize=1g after checking that the multiple instance 30 MB writes have only one extent for each file, without holes or unused space.
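
The extent check can be scripted along these lines.  This is a sketch that assumes the default xfs_bmap output format, one line per extent of the form "N: [start..end]: start..end" with holes marked by the word "hole"; count_extents is a hypothetical helper and the file path in the comment is illustrative only.

```shell
# Count allocated extents in xfs_bmap output read from stdin.
# count_extents is a hypothetical helper; it assumes the default
# xfs_bmap line format "N: [start..end]: start..end" and skips holes.
count_extents() {
    grep -E '^[[:space:]]*[0-9]+: \[' | grep -vc 'hole$'
}

# Possible usage on a test file (path is illustrative only):
#   xfs_bmap /data/test/t1/somefile | count_extents
```

A result of 1 for every file corresponds to the "one extent for each file, without holes" observation above.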

It appears that RAID 6 writes are faster than RAID 5!  And RAID 6 can even match RAID 0!  The system seems to thrive on throughput, when doing multiple instances of writes, for getting high aggregate bandwidth.

I will put the performance potential of the system in context by giving some details.

The system has four Kingston DDR2-800 MHz CL6 4 GB unbuffered ECC DIMMs, set to unganged mode, so each thread has up to 6.4 GB/s of memory bandwidth, from one of two independent memory channels.

The AMD Phenom II X4 965 has three levels of cache, and data from memory goes directly to the L1 caches. The four cores have dedicated L1 and L2 caches, and a shared 6 MB L3.  Thread switching will result in cache misses if more than four threads are running.

The IO through the HyperTransport 3.0 from CPU to the AMD 790FX chipset is at 8 GB/s.  The Areca ARC-1680ix-16 is PCI-E Gen 1 x8, so the maximum bandwidth is 2 GB/s.  The cache is Kingston DDR-667 CL5 4 GB unbuffered ECC, although it runs at 533 MHz, so the maximum bandwidth is 4.2 GB/s.  The Intel IOP 348 1200 MHz on the card has two cores.
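
The peak figures quoted for the memory, the controller cache and the PCI-E link follow from simple multiplication.  A rough sketch, assuming 64-bit (8-byte) memory buses and 250 MB/s per PCI-E Gen 1 lane per direction; peak_mb_s is a made-up helper name.

```shell
# Peak transfer rate in MB/s = units per second (millions) x bytes per unit.
# peak_mb_s is a hypothetical helper used only for this illustration.
peak_mb_s() {
    echo $(( $1 * $2 ))
}

peak_mb_s 800 8   # DDR2-800, one 64-bit channel
peak_mb_s 533 8   # controller cache at 533 MT/s, 64-bit bus assumed
peak_mb_s 250 8   # PCI-E Gen 1: 250 MB/s per lane per direction, 8 lanes
```

These come to 6400, 4264 and 2000 MB/s, matching the 6.4 GB/s, 4.2 GB/s and 2 GB/s figures given above.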

There are sixteen WD Caviar Black 1 TB drives in the Lian-Li PC-V2110 chassis.  For the folks reading this, please do not follow this set-up, as the Caviar Blacks are a mistake.  Starting with Caviar Black drives manufactured in September 2009, WD quietly disabled the use of its time limited error recovery utility, so I have an array of drives that can pop out of the RAID any time if I am unlucky, and I got screwed here.

There is a battery back-up module for the cache, and the drive caches are disabled.  Tests run with the drive caches enabled showed a noticeable speed-up in RAID 0.

We previously did tests of the Caviar Black 1 TB, writing 100 MB chunks to the device without a file system, with the drive connected to the SATA ports on a Tyan Opteron motherboard with the nVidia nForce 4 Professional chipset.  With the drive cache disabled, the sequential write speed was 30+ MB/s if I remember correctly, versus just under 100 MB/s with the cache enabled.  That is a big fall-off in speed, and that was writing at the outer diameter of the platter; speed would be halved at the inner diameter.  It seems the controller firmware is meant to work with cache enabled for proper functioning.

The desktop Caviar Black also does not have rotary vibration compensation, unlike the Caviar RE nearline drives.  WD has a document showing the performance difference that rotary vibration compensation makes.  I am not trying to save pennies here, but the local distributor refuses to bring in the Caviar REs, and I am stuck in no man's land.

The system has sixteen hard drives, and ten fans of different sizes and purposes in total, so there is quite a bit of rotary vibration, which I can feel when I place my hand on the side panels.  I really do not know how badly the drive performance suffers as a result.  The drives are attached with rubber dampers on the mounting screws.

I did the 20 GB dd test on the RAID 1 system drive, also with XFS, and got 53 MB/s with disabled drive caches, 63 MB/s enabled.  That is pretty disappointing, but in light of all the above considerations, plus the kernel buffer issues, I do not really know if that is a good figure.

NCQ is enabled at depth 32.  NCQ should cause performance loss for single writes, but gains for multiple writes.

Areca has a document showing that this card can do 800 MB/s in RAID 6 with Seagate nearline drives, with the standard 512 MB cache.  That is in Windows Server.  I do not know if the caches are disabled.  The benchmark is the IO Meter workstation sequential write.  IO Meter requires Windows for the front end, which causes me great difficulties, so I gave up trying to figure it out, and I do not understand what the workstation test does.  However, in writing 30 MB files, I already exceed 1 GB/s.



      

[-- Attachment #2: xfstesting --]
[-- Type: application/octet-stream, Size: 38719 bytes --]

Testing for Areca data volume


mkfs.xfs -f -d agcount=32 -i align=1 -L /data /dev/sdb1


Config 1
14 drive RAID 0 128 kB stripes

3584 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=14 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


Test 1
4 instances of iotesttyphoon, 100 files of 30 MB

5, 6, 4, 4 s (600, 500, 750, 750 MB/s)

2600 MB/s

Test 2
repeat the above

4, 4, 4, 6 s (750, 750, 750, 500 MB/s)

2750 MB/s

Test 3
1 instance of iotesttyphoon

2 s (1500 MB/s)

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

8, 15, 15, 19, 17, 18 s (375, 200, 200, 157.89, 176.47, 166.67 MB/s)

1276 MB/s

Test 5

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.43212 s, 750 MB/s

Test 6

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 11.0262 s, 974 MB/s

930 MB/s

Test 7

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 11.888 s, 903 MB/s

861 MB/s

Test 8

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 61.8759 s, 347 MB/s

330.98 MB/s

Test 9

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 65.8656 s, 326 MB/s

310.94 MB/s

Test 10

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 9.60774 s, 1.1 GB/s

1065 MB/s


Config 2
12 drive RAID 0 128 kB stripes

3072 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1

Test 1

1 instance of iotesttyphoon, 100 files of 30 MB

2 s (1500 MB/s)

Test 2

4 instances of iotesttyphoon, 100 files of 30 MB

3, 4, 4, 3 s (1000, 750, 750, 1000 MB/s)

3500 MB/s

Test 3

4 instances of iotesttyphoon, 100 files of 30 MB

5, 5, 7, 6 s (600, 600, 428.57, 500 MB/s)

2128 MB/s

17, 18, 16, 16, 13, 11 s (176, 166, 187.5, 187.5, 230, 272 MB/s)

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

9, 14, 16, 21, 21, 21 s (333, 214, 187, 142, 142, 142 MB/s)

1160 MB/s

Test 5

Write 1 GB file

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.776428 s, 1.4 GB/s
1319 MB/s

Test 6

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 10.1086 s, 1.1 GB/s
1012 MB/s

Test 7

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 10.7962 s, 995 MB/s
948 MB/s

Test 8

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 65.0962 s, 330 MB/s
314 MB/s

Test 9

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 66.1583 s, 325 MB/s
309 MB/s


Config 3
14 drive RAID 5 128 kB stripes

3328 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=13 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


Test 1

1 instance of iotesttyphoon, 100 files of 30 MB

3 s (1000 MB/s)

Test 2

4 instances of iotesttyphoon, 100 files of 30 MB

4, 8, 10, 15 s (750, 375, 300, 200 MB/s)

1625 MB/s

Test 3

4 instances of iotesttyphoon, 100 files of 30 MB

14, 13, 11, 5 s (214, 230, 272, 600 MB/s)

1316 MB/s

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

46, 46, 46, 46, 23, 9 s (65, 65, 65, 65, 130, 333 MB/s)

723 MB/s

Test 5

6 instances of iotesttyphoon, 100 files of 30 MB

14, 21, 41, 40, 41, 46 s (214, 142, 73, 75, 73, 65 MB/s)

642 MB/s

Test 6

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.704663 s, 1.5 GB/s
1453 MB/s

Test 7

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.669676 s, 1.6 GB/s
1529 MB/s

Test 8

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 60.8726 s, 176 MB/s
168 MB/s

Test 9

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 79.5942 s, 135 MB/s
128 MB/s

Test 10

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1048576 count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 42.4831 s, 253 MB/s
241 MB/s

Test 11

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBd bs=1048576 count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 86.3433 s, 124 MB/s
118 MB/s

Test 12

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1048576 count=20480
20480+0 records in
20480+0 records out
21474836480 bytes (21 GB) copied, 169.453 s, 127 MB/s
120 MB/s

Test 13

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 162.702 s, 132 MB/s
125 MB/s

Test 14

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 213.533 s, 101 MB/s
95 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 246.291 s, 87.2 MB/s
83 MB/s


Config 4
10 drive RAID 5 128 kB stripes, drives 7 to 16 1000 GB (931 GB)

2304 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=9 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


Test 1

1 instance of iotesttyphoon, 100 files of 30 MB

2 s (1500 MB/s)

Test 2

4 instances of iotesttyphoon, 100 files of 30 MB

12, 13, 10, 5 s (250, 230, 300, 600 MB/s)

1380 MB/s

Test 3

4 instances of iotesttyphoon, 100 files of 30 MB

5, 7, 10, 12 s (600, 428, 300, 250 MB/s)

1578 MB/s

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

25, 21, 26, 22, 13, 9 s (120, 142, 115, 136, 230, 333 MB/s)

1076 MB/s

Test 5

6 instances of iotesttyphoon, 100 files of 30 MB

20, 18, 21, 14, 13, 11 s (150, 166, 142, 214, 230, 272 MB/s)

1174 MB/s

Test 6

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.674449 s, 1.6 GB/s
1518 MB/s

Test 7

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.768957 s, 1.4 GB/s
1331 MB/s

Test 8
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 83.6194 s, 128 MB/s
122 MB/s

Test 9
Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 85.5271 s, 126 MB/s
119 MB/s

Test 10

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 178.256 s, 120 MB/s
114 MB/s

Test 11

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 187.051 s, 115 MB/s
109 MB/s

Test 12

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 186.364 s, 115 MB/s
109 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 198.787 s, 108 MB/s
103 MB/s

Total 223 MB/s

drive caches enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 165.58 s, 130 MB/s
123 MB/s

drive caches enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 117.277 s, 183 MB/s
174 MB/s

313 MB/s


chingl@dragon:~/testsw> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 62.6958 s, 343 MB/s
326 MB/s

chingl@dragon:~/testsw> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 73.8873 s, 291 MB/s
277 MB/s

chingl@dragon:~/testsw> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 75.4624 s, 285 MB/s
271 MB/s

in dragon, 6 instances of iotesttyphoon, 100 files of 30 MB

66, 84, 107, 130, 130, 129 s (45, 35, 28, 23, 23, 23 MB/s)

on another day, ansys using 100% CPU

2 instances of 20 GB file
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 140.058 s, 153 MB/s
146 MB/s 

chingl@dragon:~> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 97.8646 s, 219 MB/s
209 MB/s

Total 355 MB/s

6 instances of 20 GB file
chingl@dragon:~> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 511.881 s, 42.0 MB/s
40 MB/s

chingl@dragon:~> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 510.087 s, 42.1 MB/s
40 MB/s

chingl@dragon:~> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 434.338 s, 49.4 MB/s
47 MB/s

chingl@dragon:~> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 339.075 s, 63.3 MB/s
60 MB/s

chingl@dragon:~> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 360.107 s, 59.6 MB/s
56 MB/s

chingl@dragon:~> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 571.768 s, 37.6 MB/s
35 MB/s

Total 278 MB/s

tornado had full CPUs and little free memory
chingl@tornado:~/testsw/iotest/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 158.466 seconds, 136 MB/s
129 MB/s

chingl@tornado:~/testsw/iotest/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 153.538 seconds, 140 MB/s
133 MB/s

chingl@typhoon:~/testsw/iotesting/w1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 133.32 seconds, 161 MB/s
153 MB/s

chingl@typhoon:~/testsw/iotesting/w1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 89.4894 seconds, 240 MB/s
228 MB/s

tornado had free CPUs
chingl@tornado:~> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 141.092 seconds, 152 MB/s


Config 5
10 drive RAID 5 128 kB stripes, drives 7 to 16 1000 GB (931 GB)

2304 LBA blocks for one set of stripes

mkfs.xfs -f -L /data /dev/sdb1

tsunami:~ # mkfs.xfs -f -L /data /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=61035047 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=244140187, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0



Test 1

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 189.239 s, 113 MB/s
108 MB/s

Test 2

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 172.743 s, 124 MB/s
118 MB/s

Config 6
writing to /

Test 1

chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 380.755 s, 56.4 MB/s
53 MB/s

Config 7
writing to /home

Test 1

chingl@tsunami:~/testsw> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 400.869 s, 53.6 MB/s
51 MB/s

Config 8
14 drive RAID 0 128 kB stripes

3584 LBA blocks for one set of stripes

mkfs.xfs -f -L /data /dev/sdb1

meta-data=/dev/sdb1              isize=256    agcount=13, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3417967163, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

Test 1

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 62.716 s, 342 MB/s
326 MB/s


Config 9
14 drive RAID 0 128 kB stripes

3584 LBA blocks for one set of stripes

drive cache enabled
mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=14 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


Test 1

Write 20 GB file
drive cache enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 38.9335 s, 552 MB/s
526 MB/s

Test 2

Write 20 GB file
drive cache enabled
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 47.8844 s, 448 MB/s
427 MB/s

Config 10

drive cache enabled
Write to /home
chingl@tsunami:~/testsw> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 343.33 s, 62.5 MB/s
59 MB/s

Config 11
10 drive RAID 6 128 kB stripes drives 7 to 16 1000 GB (931 GB)

2048 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=8 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


Test 1

1 instance of iotesttyphoon, 100 files of 30 MB

2 s (1500 MB/s)

Test 2

4 instances of iotesttyphoon, 100 files of 30 MB

16, 16, 9, 5 s (187, 187, 333, 600 MB/s)

1307 MB/s

Test 3

4 instances of iotesttyphoon, 100 files of 30 MB

5, 6, 11, 11 s (600, 500, 272, 272 MB/s)

1644 MB/s

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

22, 23, 20, 15, 12, 10 s (136, 130, 150, 200, 250, 300 MB/s)

1166 MB/s

Test 5

6 instances of iotesttyphoon, 100 files of 30 MB

23, 24, 22, 20, 8, 2 s (130, 125, 136, 150, 375, 1500 MB/s)

2416 MB/s

Test 6

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.710389 s, 1.5 GB/s
1441 MB/s

Test 7

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.742939 s, 1.4 GB/s
1378 MB/s

Test 8

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 31.1474 s, 345 MB/s
328 MB/s

Test 9

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 31.8968 s, 337 MB/s

Test 10

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 277.032 s, 77.5 MB/s
73 MB/s

Test 11

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 262.301 s, 81.9 MB/s
78 MB/s

Test 12

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 269.545 s, 79.7 MB/s
75 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 287.517 s, 74.7 MB/s
71 MB/s

Total 146 MB/s

Test 12

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 260.578 s, 82.4 MB/s
78 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 257.186 s, 83.5 MB/s
79 MB/s

Total 157 MB/s

Test 13

3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 207.924 s, 103 MB/s
98 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 211.839 s, 101 MB/s
96 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 168.803 s, 127 MB/s
121 MB/s

Total 315 MB/s

Test 14

6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBj bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 298.946 s, 71.8 MB/s
68 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBk bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 354.162 s, 60.6 MB/s
57 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBl bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 300.646 s, 71.4 MB/s
68 MB/s

chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GBm bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 355.955 s, 60.3 MB/s
57 MB/s

chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GBn bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 299.459 s, 71.7 MB/s
68 MB/s

chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GBo bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 238.459 s, 90.1 MB/s
85 MB/s

Total 403 MB/s

Config 12
14 drive RAID 0 128 kB stripes

3584 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=14 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1


Test 1
6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 149.758 s, 143 MB/s
136 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 167.87 s, 128 MB/s
121 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 88.1136 s, 244 MB/s
232 MB/s

chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 161.687 s, 133 MB/s
126 MB/s

chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 82.8711 s, 259 MB/s
247 MB/s

chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 91.1491 s, 236 MB/s
224 MB/s

Total 1086 MB/s


Config 13
14 drive RAID 6 128 kB stripes

3072 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1

defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1m,logbufs=8,logbsize=256k,largeio,swalloc

CPU locked at highest frequency

Test 1
4 instances of iotesttyphoon, 100 files of 30 MB

4, 12, 19, 20 s (750, 250, 157, 150 MB/s)

1307 MB/s

Test 2
1 instance of iotesttyphoon, 100 files of 30 MB

2 s (1500 MB/s)

Test 3
4 instances of iotesttyphoon, 100 files of 30 MB

4, 8, 16, 15 s (750, 375, 187, 200 MB/s)

1512 MB/s

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

5, 20, 26, 38, 38, 38 s (600, 150, 115, 78, 78, 78 MB/s)

1099 MB/s

Test 5

6 instances of iotesttyphoon, 100 files of 30 MB

5, 18, 30, 35, 36, 37 s (600, 166, 100, 85, 83, 81 MB/s)

1115 MB/s

Test 6

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.622066 s, 1.7 GB/s
1646 MB/s

Test 7

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.622326 s, 1.7 GB/s
1645 MB/s

Test 8

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 9.2308 s, 1.2 GB/s
1109 MB/s

Test 9

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 40.6836 s, 264 MB/s
251 MB/s

Test 10

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 7.13037 s, 1.5 GB/s
1436 MB/s

Test 10

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 106.851 s, 201 MB/s
191 MB/s

Test 11

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 63.7209 s, 337 MB/s
321 MB/s

Test 12

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 144.502 s, 149 MB/s
141 MB/s

Test 12

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 160.793 s, 134 MB/s
127 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 132.718 s, 162 MB/s
154 MB/s

Total 281 MB/s

Test 13

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 86.6886 s, 248 MB/s
236 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 132.328 s, 162 MB/s
154 MB/s

Total 390 MB/s

Test 14

3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 155.836 s, 138 MB/s
131 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 220.47 s, 97.4 MB/s
92 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBj bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 193.131 s, 111 MB/s
106 MB/s

Total 329 MB/s

Test 15

6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 193.71 s, 111 MB/s
105 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 185.017 s, 116 MB/s
110 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 320.172 s, 67.1 MB/s
63 MB/s

chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 275.41 s, 78.0 MB/s
74 MB/s

chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 312.053 s, 68.8 MB/s
65 MB/s

chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 230.911 s, 93.0 MB/s
88 MB/s

Total 505 MB/s


Config 14
14 drive RAID 6 128 kB stripes

3072 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1

defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,logbufs=8,logbsize=256k,largeio,swalloc

CPU locked at highest frequency

Test 1
1 instance of iotesttyphoon, 100 files of 30 MB

2 s (1500 MB/s)

Test 2
4 instances of iotesttyphoon, 100 files of 30 MB

4, 4, 9, 9 s (750, 750, 333, 333 MB/s)

2166 MB/s

Test 3
4 instances of iotesttyphoon, 100 files of 30 MB

4, 5, 6, 13 s (750, 600, 500, 230 MB/s)

2080 MB/s

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

6, 20, 30, 42, 40, 50 s (500, 150, 100, 71, 75, 60 MB/s)

956 MB/s

Test 5

6 instances of iotesttyphoon, 100 files of 30 MB

5, 20, 31, 36, 39, 36 s (600, 150, 96, 83, 76, 83 MB/s)

1088 MB/s

Test 6

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.617209 s, 1.7 GB/s
1659 MB/s

Test 7

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.612309 s, 1.8 GB/s
1672 MB/s

Test 8

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 7.23929 s, 1.5 GB/s
1414 MB/s

Test 9

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 12.8803 s, 834 MB/s
795 MB/s

Test 10

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 73.2835 s, 147 MB/s
139 MB/s

Test 11

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 216.946 s, 99.0 MB/s
94 MB/s

Test 12

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 161.701 s, 133 MB/s
126 MB/s

Test 13

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 125.613 s, 171 MB/s
163 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 117.773 s, 182 MB/s
173 MB/s

Total 336 MB/s

Test 14

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 130.653 s, 164 MB/s
156 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 145.43 s, 148 MB/s
140 MB/s

Total 296 MB/s

Test 15

3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 192.95 s, 111 MB/s
106 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 181.372 s, 118 MB/s
112 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 188.389 s, 114 MB/s
108 MB/s

Total 326 MB/s

Test 16

6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 270.999 s, 79.2 MB/s
75 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 209.593 s, 102 MB/s
97 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 268.001 s, 80.1 MB/s
76 MB/s

chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 233.568 s, 91.9 MB/s
87 MB/s

chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 307.85 s, 69.8 MB/s
66 MB/s

chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 283.836 s, 75.7 MB/s
72 MB/s

Total 473 MB/s

Config 15
14 drive RAID 6 128 kB stripes

3072 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1

defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc

CPU locked at highest frequency

Test 1
1 instance of iotesttyphoon, 100 files of 30 MB

2 s (1500 MB/s)

Test 2
4 instances of iotesttyphoon, 100 files of 30 MB

4, 8, 12, 11 s (750, 375, 250, 272 MB/s)

Total 1647 MB/s

Test 3
4 instances of iotesttyphoon, 100 files of 30 MB

4, 5, 9, 9 s (750, 600, 333, 333 MB/s)

Total 2016 MB/s

Test 4

6 instances of iotesttyphoon, 100 files of 30 MB

5, 17, 23, 32, 39, 38 s (600, 176, 130, 93, 76, 78 MB/s)

1153 MB/s

Test 5

6 instances of iotesttyphoon, 100 files of 30 MB

4, 17, 35, 31, 40, 37 s (750, 176, 85, 96, 75, 81 MB/s)

1263 MB/s

Test 6

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GB bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.607413 s, 1.8 GB/s
1685 MB/s

Test 7

Write 1 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile1GBb bs=1073741824 count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.618797 s, 1.7 GB/s
1654 MB/s

Test 8

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GB bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 16.1841 s, 663 MB/s
632 MB/s

Test 9

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBb bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 9.05751 s, 1.2 GB/s
1130 MB/s

Test 10

Write 10 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile10GBc bs=1073741824 count=10
10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 13.5591 s, 792 MB/s
755 MB/s

Test 11

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 95.4952 s, 225 MB/s
214 MB/s

Test 12

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBb bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 92.3127 s, 233 MB/s
221 MB/s

Test 13

Write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBc bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 166.29 s, 129 MB/s
123 MB/s

Test 14

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBd bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 106.278 s, 202 MB/s
192 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBe bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 146.547 s, 147 MB/s
139 MB/s

Total 331 MB/s

Test 15

2 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBf bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 98.9542 s, 217 MB/s
206 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBg bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 122.802 s, 175 MB/s
166 MB/s

Total 372 MB/s

Test 16

3 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GBh bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 142.879 s, 150 MB/s
143 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GBi bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 202.268 s, 106 MB/s
101 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GBj bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 130.598 s, 164 MB/s
156 MB/s

Total 400 MB/s

Test 17

6 instances of write 20 GB file
chingl@tsunami:/data/test/t1> dd if=/dev/zero of=bigfile20GB1 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 155.144 s, 138 MB/s
132 MB/s

chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 265.973 s, 80.7 MB/s
77 MB/s

chingl@tsunami:/data/test/t3> dd if=/dev/zero of=bigfile20GB3 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 156.72 s, 137 MB/s
130 MB/s

chingl@tsunami:/data/test/t4> dd if=/dev/zero of=bigfile20GB4 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 309.921 s, 69.3 MB/s
66 MB/s

chingl@tsunami:/data/test/t5> dd if=/dev/zero of=bigfile20GB5 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 268.82 s, 79.9 MB/s
76 MB/s

chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GB6 bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 163.64 s, 131 MB/s
125 MB/s

Total 606 MB/s


Config 16
14 drive RAID 6 128 kB stripes

3072 LBA blocks for one set of stripes

mkfs.xfs -f -b size=4k -d agcount=32,su=128k,sw=12 -i size=256,align=1,attr=2 -l version=2,su=128k,lazy-count=1 -n version=2 -s size=512 -L /data /dev/sdb1

defaults,nobarrier,usrquota,grpquota,noatime,nodiratime,allocsize=1g,logbufs=8,logbsize=256k,largeio,swalloc,inode64

CPU locked at highest frequency

Test 1

Write 20 GB file
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GBxx bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 77.9626 s, 275 MB/s
262 MB/s

Test 2

Write 20 GB file
chingl@tsunami:/data/test/t6> dd if=/dev/zero of=bigfile20GBxy bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 110.69 s, 194 MB/s
185 MB/s

Test 3

disk quota set
Write 20 GB file
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 72.4197 s, 297 MB/s

Test 4

disk quota set
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 116.401 s, 184 MB/s

Test 5

no disk quota
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 163.027 s, 132 MB/s

Test 6

no disk quota
chingl@tsunami:/data/test/t2> dd if=/dev/zero of=bigfile20GB2y bs=1073741824 count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB) copied, 83.8251 s, 256 MB/s



^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: allocsize mount option
@ 2010-01-15  3:08 Gim Leong Chin
  0 siblings, 0 replies; 13+ messages in thread
From: Gim Leong Chin @ 2010-01-15  3:08 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, xfs

Hi Dave,

Thank you for the advice!

I have run direct I/O dd tests writing the same 20 GB files (bs=1GB, count=2).  The results are an eye-opener!

Single-instance repeats gave 830 and 800 MB/s, compared with more than 100 to under 300 MB/s for buffered writes.

Two instances gave an aggregate of 304 MB/s, and six instances an aggregate of 587 MB/s.

The system drive (/home, RAID 1) gave 130 MB/s, compared with 51 MB/s buffered.

So the problem is with the buffered writes.
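For reference, a sketch of how such a comparison can be set up with GNU dd. The exact flags used for the direct I/O runs are not given here, so the ones below are assumptions: oflag=direct opens the output with O_DIRECT and bypasses the page cache, while conv=fdatasync keeps the write buffered but makes dd flush the file before it reports its timing. The helper name dd_write_cmd is hypothetical and only prints the command line.

```shell
#!/bin/sh
# dd_write_cmd MODE FILE COUNT_GIB
# Prints a dd command line; it does not run it.
dd_write_cmd() {
    case $1 in
        direct)   extra="oflag=direct" ;;     # bypass the page cache
        buffered) extra="conv=fdatasync" ;;   # buffered, flushed before dd reports timing
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    echo "dd if=/dev/zero of=$2 bs=1G count=$3 $extra"
}

dd_write_cmd direct bigfile20GB 20
dd_write_cmd buffered bigfile20GB 20
```

Timing the buffered run with conv=fdatasync should give a figure that is closer to what actually reaches the disks than a plain buffered dd.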


> You'd have to get all the fixes from 2.6.30 to 2.6.32, and the
> backport would be very difficult to get right. Better would
> be just to upgrade the kernel to 2.6.32 ;)


If I change the kernel, I will lose support from Novell.  I will try my luck and see whether I can convince them.

> > > I'd suggest that you might need to look at increasing the
> > > maximum IO size for the block device
> > > (/sys/block/sdb/queue/max_sectors_kb), maybe the request queue
> > > depth as well to get larger IOs to be pushed to the raid
> > > controller. If you can, at least get it to the stripe width
> > > of 1536k....
> >
> > Could you give a good reference for performance tuning of these
> > parameters?  I am at a total loss here.
>
> Welcome to the black art of storage subsystem tuning ;)
>
> I'm not sure there is a good reference for tuning the block device
> parameters - most of what I know was handed down by word of mouth
> from gurus on high mountains.
>
> The overriding principle, though, is to try to ensure that the
> stripe width sized IOs can be issued right through the IO stack to
> the hardware, and that those IOs are correctly aligned to the
> stripes. You've got the filesystem configuration and layout part
> correct, now it's just tuning the block layer to pass the IO's
> through.

Can I confirm that /sys/block/sdb/queue/max_sectors_kb should be set to the stripe width, 1536 kB?

Which parameter is "request queue depth"?  What should be the value?
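As a starting point for looking at these settings, here is a sketch that reads the relevant sysfs files. It assumes that "request queue depth" refers to the block layer's nr_requests file, which is my reading and is not confirmed in this thread. max_hw_sectors_kb is the read-only hardware ceiling for max_sectors_kb, so the sketch caps the target at that value. The helper name show_queue_limits is hypothetical.

```shell
#!/bin/sh
# show_queue_limits QUEUE_DIR STRIPE_WIDTH_KB
# QUEUE_DIR would normally be /sys/block/sdb/queue.
show_queue_limits() {
    q=$1
    want_kb=$2
    cur=$(cat "$q/max_sectors_kb")      # current maximum IO size, writable
    hw=$(cat "$q/max_hw_sectors_kb")    # hardware limit, read-only
    depth=$(cat "$q/nr_requests")       # block-layer request queue size, writable
    # max_sectors_kb cannot be raised above max_hw_sectors_kb
    if [ "$want_kb" -gt "$hw" ]; then target=$hw; else target=$want_kb; fi
    echo "max_sectors_kb: current=$cur hw_limit=$hw target=$target"
    echo "nr_requests: $depth"
}

# Read-only inspection; only runs where the device exists.
if [ -d /sys/block/sdb/queue ]; then
    show_queue_limits /sys/block/sdb/queue 1536
fi
# Applying a change needs root, for example:
#   echo 1536 > /sys/block/sdb/queue/max_sectors_kb
```

Values written this way do not survive a reboot, so they would have to be reapplied from a boot script.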


 
> FWIW, your tests are not timing how long it takes for all the
> data to hit the disk, only how long it takes to get into cache.


Thank you!  I do know that XFS buffers writes extensively.  The drive LEDs stay lit long after the OS says the writes are completed.  Besides, some of the timings are physically impossible.

 
> That sounds wrong - it sounds like NCQ is not functioning properly
> as with NCQ enabled, disabling the drive cache should not impact
> throughput at all....

I do not remember clearly whether NCQ is available on that motherboard (it runs 32-bit Ubuntu), but I do remember seeing queue depth in the kernel.  I will check it out next week.

But what I read is that NCQ hurts single write performance.  That is also what I found with another Areca SATA RAID in Windows XP.

What I found with all the drives we tested was that disabling the cache badly hurt sequential write performance (no file system, write data directly to designated LBA).



> I'd suggest trying to find another distributor that will bring them
> in for you. Putting that many drives in a single chassis is almost
> certainly going to cause vibration problems, especially if you get
> all the disk heads moving in close synchronisation (which is what
> happens when you get all your IO sizing and alignment right).

I am working on changing to the WD Caviar RE4 drives.  Not sure if I can pull it off.



Chin Gim Leong



^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: allocsize mount option
@ 2010-01-24  6:44 Gim Leong Chin
  0 siblings, 0 replies; 13+ messages in thread
From: Gim Leong Chin @ 2010-01-24  6:44 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, xfs

Hi Guys,


An update:

When I asked Novell support about backporting, they told me that SUSE Linux Enterprise 11 SP1 will be based on kernel 2.6.32.  That will solve my problems.

I can confirm that all the drives (no file system testing) were tested with NCQ enabled on the Tyan Opteron motherboard.  What I did learn was that notebook drives nowadays come with NCQ, and that all the SATA desktop and notebook drives showed very poor sequential write performance with the cache disabled.

On another note, there was a post by Michael Monnerie on 20 NOV 2009 about kernel 2.6.27, XFS, inode64 and NFS.
I have the 10.9 TB inode64 XFS file system exported via NFS 4 by SLED 11, and mounted by SLES 10 SP2, and I have no problems with it.

Finally, a big thank you to Dave Chinner and Eric Sandeen for your kind assistance!


GL



^ permalink raw reply	[flat|nested] 13+ messages in thread
* allocsize mount option
@ 2010-09-28 18:53 Ivan.Novick
  2010-09-29  0:31 ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Ivan.Novick @ 2010-09-28 18:53 UTC (permalink / raw)
  To: xfs; +Cc: Timothy.Heath

Hi all,

According to the documentation the allocsize mount option: "Sets the
buffered I/O end-of-file preallocation size when doing delayed allocation
writeout"

Will this value limit "extent" sizes to be no smaller than the allocsize?

I have set the following mount options:
(rw,noatime,nodiratime,logbufs=8,allocsize=512m)

And yet, depending on the workload, the extent sizes are often 1 or 2 orders
of magnitude lower than 512 MB ...

If I wanted to do further reading on the subject, can someone point me to an
approximate location in the code where the size of a newly created extent is
determined?
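To put numbers on the extent sizes, one option is to summarise the output of xfs_bmap, which lists each extent as a [start..end] file-offset range in 512-byte blocks. The sketch below is a hypothetical helper (extent_summary) that reads that output on standard input, skips holes, and reports the extent count and the average extent size. It assumes the usual xfs_bmap line layout and has not been checked against every xfsprogs version.

```shell
#!/bin/sh
# extent_summary: reads xfs_bmap output on stdin.
# Each extent line carries a file-offset range such as [0..2047],
# expressed in 512-byte blocks; hole lines are ignored.
extent_summary() {
    awk '
        /hole/ { next }
        match($0, /\[[0-9]+\.\.[0-9]+\]/) {
            range = substr($0, RSTART + 1, RLENGTH - 2)   # strip [ and ]
            split(range, ends, /\.\./)
            total += ends[2] - ends[1] + 1                # blocks in this extent
            n++
        }
        END {
            if (n) printf "%d extents, average %.1f MiB\n", n, total * 512 / n / 1048576
            else print "no extents"
        }'
}

# Usage (the path is a placeholder):
#   xfs_bmap /path/to/file | extent_summary
```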

Cheers,
Ivan Novick


^ permalink raw reply	[flat|nested] 13+ messages in thread
* allocsize mount option
@ 2011-01-20 20:41 Peter Vajgel
  2011-01-21  0:48 ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Vajgel @ 2011-01-20 20:41 UTC (permalink / raw)
  To: xfs@oss.sgi.com

We write about 100 files of 100GB each into a single 10TB volume with xfs. We are using allocsize=1g to limit fragmentation, with great success. We also need to reserve some space (~200GB) on each filesystem for processing the files and writing new versions of them. Once only 200GB is available we stop writing to the files. With allocsize, however, it is not that easy: we see +/- 100GB added or taken depending on whether writes are still in progress and whether the file was reopened ... Is there a way to programmatically disable allocsize speculative preallocation once we exceed a certain threshold, and also to return the current speculative preallocation to free space (without closing the file)?

Thx

Peter Vajgel
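
For readers following along, the threshold check described above could be sketched roughly as below. This is only an illustration, not anything from the thread: the names free_bytes, allocated_beyond_size, may_keep_writing and RESERVE_BYTES are hypothetical, and the 200GB figure is simply taken from the message. It uses os.statvfs for free space and compares st_blocks (counted in 512-byte units) with st_size to estimate how much space a file currently holds beyond its apparent size. The free-space number can still move as preallocation is added or trimmed, so it is a rough gauge at best.

```python
import os

# Hypothetical reserve, taken from the ~200GB figure in the message above.
RESERVE_BYTES = 200 * 1024**3


def free_bytes(path):
    """Bytes available to unprivileged writers on the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def allocated_beyond_size(path):
    """Bytes allocated to the file in excess of its apparent size.

    st_blocks is counted in 512-byte units. The result is clamped to 0
    for sparse files, where fewer blocks are allocated than st_size implies.
    """
    st = os.stat(path)
    return max(0, st.st_blocks * 512 - st.st_size)


def may_keep_writing(mount_point, reserve=RESERVE_BYTES):
    """True while the filesystem still has more free space than the reserve."""
    return free_bytes(mount_point) > reserve
```

A writer could call may_keep_writing() before each large append and log allocated_beyond_size() per file to see how much of the swing comes from space held past end-of-file.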


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2011-01-21  0:46 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-11 17:25 allocsize mount option Gim Leong Chin
2010-01-11 18:16 ` Eric Sandeen
  -- strict thread matches above, loose matches on Subject: below --
2010-01-13  9:42 Gim Leong Chin
2010-01-13 10:50 ` Dave Chinner
2010-01-14 17:25 Gim Leong Chin
2010-01-14 17:42 ` Eric Sandeen
2010-01-14 23:28 ` Dave Chinner
2010-01-15  3:08 Gim Leong Chin
2010-01-24  6:44 Gim Leong Chin
2010-09-28 18:53 Ivan.Novick
2010-09-29  0:31 ` Dave Chinner
2011-01-20 20:41 Peter Vajgel
2011-01-21  0:48 ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox