public inbox for linux-xfs@vger.kernel.org
From: Brian Candler <B.Candler@pobox.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: Performance problem - reads slower than writes
Date: Tue, 31 Jan 2012 14:16:04 +0000	[thread overview]
Message-ID: <20120131141604.GB46571@nsrc.org> (raw)
In-Reply-To: <20120131103126.GA46170@nsrc.org>

Updates:

(1) The bug in bonnie++ is to do with memory allocation; you can work
around it by putting '-n' before '-s' on the command line and using the same
custom chunk size for both (or by using '-n' with '-s 0').

# time bonnie++ -d /data/sdc -n 98:800k:500k:1000:32k -s 16384k:32k -u root

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine   Size:chnk K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
storage1    16G:32k  2061  91 101801   3 49405   4  5054  97 126748   6 130.9   3
Latency             15446us     222ms     412ms   23149us   83913us     452ms
Version  1.96       ------Sequential Create------ --------Random Create--------
storage1            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
98:819200:512000/1000   128   3    37   1 10550  25   108   3    38   1  8290  33
Latency              6874ms   99117us   45394us    4462ms   12582ms    4027ms
1.96,1.96,storage1,1,1328002525,16G,32k,2061,91,101801,3,49405,4,5054,97,126748,6,130.9,3,98,819200,512000,,1000,128,3,37,1,10550,25,108,3,38,1,8290,33,15446us,222ms,412ms,23149us,83913us,452ms,6874ms,99117us,45394us,4462ms,12582ms,4027ms

This shows that using 32k transfers instead of 8k doesn't really help; I'm
still only seeing 37-38 reads per second, either sequential or random.


(2) In case extents aren't being kept in the inode, I decided to build a
filesystem with '-i size=1024'
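For reference, the rebuild would have looked something like the following sketch. The device name, the '-f' flag, and the mount point are assumptions; the only options taken from this message are '-i size=1024' and the noatime,nodiratime mount options mentioned below:

```shell
# Sketch only: /dev/sdb and -f are assumptions. Larger inodes leave more
# room in the inode literal area for inline extent records.
mkfs.xfs -f -i size=1024 /dev/sdb
mount -o noatime,nodiratime /dev/sdb /data/sdb
```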

# time bonnie++ -d /data/sdb -n 98:800k:500k:1000:32k -s0 -u root

Version  1.96       ------Sequential Create------ --------Random Create--------
storage1            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
98:819200:512000/1000   110   3   131   5  3410  10   110   3    33   1   387   1
Latency              6038ms   92092us   87730us    5202ms     117ms    7653ms
1.96,1.96,storage1,1,1328003901,,,,,,,,,,,,,,98,819200,512000,,1000,110,3,131,5,3410,10,110,3,33,1,387,1,,,,,,,6038ms,92092us,87730us,5202ms,117ms,7653ms

Wow! The sequential read just blows away the previous results. What's even
more amazing is the number of transactions per second reported by iostat
while bonnie++ was sequentially stat()ing and read()ing the files:

# iostat 5
...
sdb             820.80     86558.40         0.00     432792          0
                  !!

820 tps on a bog-standard hard drive would be unbelievable if these were
real seeks, although the total throughput of 86MB/sec is believable.  It
could be that either NCQ or drive read-ahead is scoring big-time here.
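A quick sanity check on those two figures (assuming this iostat is reporting kB/s, so 86558.40 ≈ 86MB/sec): the average request size comes out at roughly 105kB, so each 500-800k file is being read in a handful of large requests rather than one seek per block:

```shell
# Figures taken from the iostat sample above; kB/s is an assumption
# about the iostat units on this box.
awk 'BEGIN {
    tps  = 820.80     # transactions/sec reported by iostat
    kbps = 86558.40   # kB read/sec reported by iostat
    printf "avg request size: %.1f kB\n", kbps / tps
}'
```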

However during random stat()+read() the performance drops:

# iostat 5
...
sdb             225.40     21632.00         0.00     108160          0

Here we appear to be limited by real seeks. 225 seeks/sec is still very good
for a hard drive, but it means the filesystem is generating about 7 seeks
for every file (stat+open+read+close).  Indeed the random read performance
appears to be a bit worse than the default (-i size=256) filesystem, where
I was getting 25MB/sec on iostat, and 38 files per second instead of 33.
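The "about 7 seeks per file" figure is just the ratio of the two measurements above (225.4 tps from the random-read iostat sample, 33 files/sec from bonnie++):

```shell
# Both inputs are taken from the measurements quoted above.
awk 'BEGIN { printf "I/Os per file: %.1f\n", 225.4 / 33 }'
```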

There are only 1000 directories in this test, and I would expect those to
become cached quickly.

According to Wikipedia, XFS has variable-length extents. I think that as
long as the file data is contiguous, each file should occupy only a
single extent, and this is what xfs_bmap seems to be telling me:

# xfs_bmap -n1 -l -v /data/sdc/Bonnie.25448/00449/* | head
/data/sdc/Bonnie.25448/00449/000000b125mpBap4gg7U:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET            TOTAL
   0: [0..1559]:       4446598752..4446600311  3 (51198864..51200423)  1560
/data/sdc/Bonnie.25448/00449/000000b1262hBudG6gV:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET            TOTAL
   0: [0..1551]:       1484870256..1484871807  1 (19736960..19738511)  1552
/data/sdc/Bonnie.25448/00449/000000b127fM:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET            TOTAL
   0: [0..1111]:       2954889944..2954891055  2 (24623352..24624463)  1112
/data/sdc/Bonnie.25448/00449/000000b128:

It looks like I need to get familiar with xfs_db and
http://oss.sgi.com/projects/xfs/papers/xfs_filesystem_structure.pdf
to find out what's going on.
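A starting point for that digging might look like this (the device name is an assumption, and the inode number is a placeholder; '-r' opens the filesystem read-only):

```shell
# Hypothetical invocation: dump the superblock, then inspect one of the
# test files' inodes by number (get the number from 'ls -i' first).
xfs_db -r -c "sb 0" -c "print" /dev/sdc
xfs_db -r -c "inode 12345" -c "print" /dev/sdc   # 12345 is a placeholder
```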

(These filesystems are mounted with noatime,nodiratime, incidentally.)

Regards,

Brian.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

Thread overview: 30+ messages
2012-01-30 22:00 Performance problem - reads slower than writes Brian Candler
2012-01-31  2:05 ` Dave Chinner
2012-01-31 10:31   ` Brian Candler
2012-01-31 14:16     ` Brian Candler [this message]
2012-01-31 20:25       ` Dave Chinner
2012-02-01  7:29         ` Stan Hoeppner
2012-02-03 18:47         ` Brian Candler
2012-02-03 19:03           ` Christoph Hellwig
2012-02-03 21:01             ` Brian Candler
2012-02-03 21:17               ` Brian Candler
2012-02-05 22:50                 ` Dave Chinner
2012-02-05 22:43               ` Dave Chinner
2012-01-31 14:52     ` Christoph Hellwig
2012-01-31 21:52       ` Brian Candler
2012-02-01  0:50         ` Raghavendra D Prabhu
2012-02-01  3:59         ` Dave Chinner
2012-02-03 11:54       ` Brian Candler
2012-02-03 19:42         ` Stan Hoeppner
2012-02-03 22:10           ` Brian Candler
2012-02-04  9:59             ` Stan Hoeppner
2012-02-04 11:24               ` Brian Candler
2012-02-04 12:49                 ` Stan Hoeppner
2012-02-04 20:04                   ` Brian Candler
2012-02-04 20:44                     ` Joe Landman
2012-02-06 10:40                       ` Brian Candler
2012-02-07 17:30                       ` Brian Candler
2012-02-05  5:16                     ` Stan Hoeppner
2012-02-05  9:05                       ` Brian Candler
2012-01-31 20:06     ` Dave Chinner
2012-01-31 21:35       ` Brian Candler
