From: "Jörn Engel" <joern@logfs.org>
To: linux-fsdevel@vger.kernel.org
Subject: Filesystem benchmarks on reasonably fast hardware
Date: Sun, 17 Jul 2011 18:05:01 +0200
Message-ID: <20110717160501.GA1437@logfs.org>
Hello everyone!
Recently I have had the pleasure of working with some nice hardware
and the displeasure of seeing it fail commercially. However, when
trying to optimize performance I noticed that in some cases the
bottlenecks were not in the hardware or my driver, but rather in the
filesystem on top of it. So maybe all this can still be useful for
improving said filesystems.
Hardware is basically a fast SSD. Performance tops out at about
650MB/s and is fairly insensitive to random access behaviour. Latency
is about 50us for 512B reads and near 0 for writes, through the usual
cheating.
The numbers below were created with sysbench, using direct I/O. Each
block is a matrix of results in requests per second, for block sizes
from 512B to 16384B and thread counts from 1 to 128. There are four
blocks per filesystem: reads and writes, both sequential and random.
As a sanity check, 650MB/s divided by 16384B is roughly 39.7k
requests/s, which is about where the 16384B rows saturate.
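
The exact sysbench invocation is not part of this mail, so take the
details here as assumptions: each cell presumably corresponds to a run
of sysbench's fileio test with --file-extra-flags=direct and matching
--file-test-mode, --file-block-size and --num-threads values. Stripped
of the tool itself, the per-thread loop that such a cell (say rndrd,
4096B, 16 threads) exercises boils down to roughly this sketch, with
block size, thread count and runtime as hard-coded placeholders:

/* Sketch of one matrix cell (rndrd); not sysbench source, just the
 * equivalent syscall pattern.  Build with gcc -O2 -pthread. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define BLOCKSIZE   4096   /* 512 ... 16384 in the tables */
#define NR_THREADS  16     /* 1 ... 128 in the tables */
#define RUNTIME     10     /* seconds */

static const char *path;
static off_t filesize;
static volatile int stop;
static long done[NR_THREADS];

static void *worker(void *arg)
{
        long id = (long)arg;
        unsigned int seed = id;
        void *buf;
        int fd;

        fd = open(path, O_RDONLY | O_DIRECT);
        if (fd < 0 || posix_memalign(&buf, 4096, BLOCKSIZE))
                exit(1);
        while (!stop) {
                /* block-aligned offset - O_DIRECT requires alignment */
                off_t block = rand_r(&seed) % (filesize / BLOCKSIZE);

                if (pread(fd, buf, BLOCKSIZE, block * (off_t)BLOCKSIZE)
                                != BLOCKSIZE)
                        exit(1);
                done[id]++;
        }
        close(fd);
        return NULL;
}

int main(int argc, char **argv)
{
        pthread_t threads[NR_THREADS];
        struct stat st;
        long total = 0;
        long i;

        path = argc > 1 ? argv[1] : "testfile";
        if (stat(path, &st) || st.st_size < BLOCKSIZE)
                return 1;
        filesize = st.st_size;

        for (i = 0; i < NR_THREADS; i++)
                pthread_create(&threads[i], NULL, worker, (void *)i);
        sleep(RUNTIME);
        stop = 1;
        for (i = 0; i < NR_THREADS; i++) {
                pthread_join(threads[i], NULL);
                total += done[i];
        }
        printf("%ld requests/s\n", total / RUNTIME);
        return 0;
}

Point it at a large pre-allocated file; for the sequential variants,
replace the random offset with a per-thread running offset, and for
writes, open with O_WRONLY|O_DIRECT and use pwrite.
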
Ext4:
=====
seqrd 1 2 4 8 16 32 64 128
16384 4867 8717 16367 29249 39131 39140 39135 39123
8192 6324 10889 19980 37239 66346 78444 78429 78409
4096 9158 15810 26072 45999 85371 148061 157222 157294
2048 15019 24555 35934 59698 106541 198986 313969 315566
1024 24271 36914 51845 80230 136313 252832 454153 484120
512 37803 62144 78952 111489 177844 314896 559295 615744
rndrd 1 2 4 8 16 32 64 128
16384 4770 8539 14715 23465 33630 39073 39101 39103
8192 6138 11398 20873 35785 56068 75743 78364 78374
4096 8338 15657 29648 53927 91854 136595 157279 157349
2048 11985 22894 43495 81211 148029 239962 314183 315695
1024 16529 31306 61307 114721 222700 387439 561810 632719
512 20580 40160 73642 135711 294583 542828 795607 821025
seqwr 1 2 4 8 16 32 64 128
16384 37588 37600 37730 37680 37631 37664 37670 37662
8192 77621 77737 77947 77967 77875 77939 77833 77574
4096 124083 123171 121159 120947 120202 120315 119917 120236
2048 158540 153993 151128 150663 150686 151159 150358 147827
1024 183300 176091 170527 170919 169608 169900 169662 168622
512 229167 231672 221629 220416 223490 217877 222390 219718
rndwr 1 2 4 8 16 32 64 128
16384 38932 38290 38200 38306 38421 38404 38329 38326
8192 79790 77297 77464 77447 77420 77460 77495 77545
4096 163985 157626 158232 158212 158102 158169 158273 158236
2048 272261 322637 320032 320932 321597 322008 322242 322699
1024 339647 609192 652655 644903 654604 658292 659119 659667
512 403366 718426 1227643 1149850 1155541 1157633 1173567 1180710
Sequential writes are significantly worse than random writes. If
someone is interested, I can dig out which lock is causing all this.
Sequential reads below 2k are also worse than their random
counterparts, although one might wonder whether direct I/O in 1k
chunks makes sense at all. Random reads in the last column scale very
nicely with block size down to 1k, but hit some problem at 512B. The
machine could be cpu-bound at that point.
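
A side note on whether 1k direct I/O makes sense at all: O_DIRECT is
generally fine with anything aligned to the device's logical block
size, which is what the 512B and 1k runs rely on. Whether a given
device advertises 512B or 4k logical blocks can be checked with a
snippet like this (the device path is a placeholder):

/* Print the logical and physical block sizes a block device reports;
 * O_DIRECT transfers must be aligned to the logical block size. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        unsigned int logical = 0, physical = 0;
        int fd = open(argc > 1 ? argv[1] : "/dev/sda", O_RDONLY);

        if (fd < 0)
                return 1;
        ioctl(fd, BLKSSZGET, &logical);    /* logical sector size */
        ioctl(fd, BLKPBSZGET, &physical);  /* physical sector size */
        printf("logical %u, physical %u\n", logical, physical);
        close(fd);
        return 0;
}
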
Btrfs:
======
seqrd 1 2 4 8 16 32 64 128
16384 3270 6582 12919 24866 36424 39682 39726 39721
8192 4394 8348 16483 32165 54221 79256 79396 79415
4096 6337 12024 21696 40569 74924 131763 158292 158763
2048 297222 298299 294727 294740 296496 298517 300118 300740
1024 583891 595083 584272 580965 584030 589115 599634 598054
512 1103026 1175523 1134172 1133606 1123684 1123978 1156758 1130354
rndrd 1 2 4 8 16 32 64 128
16384 3252 6621 12437 20354 30896 39365 39115 39746
8192 4273 8749 17871 32135 51812 72715 79443 79456
4096 5842 11900 24824 48072 84485 128721 158631 158812
2048 7177 12540 20244 27543 32386 34839 35728 35916
1024 7178 12577 20341 27473 32656 34763 36056 35960
512 7176 12554 20289 27603 32504 34781 35983 35919
seqwr 1 2 4 8 16 32 64 128
16384 13357 12838 12604 12596 12588 12641 12716 12814
8192 21426 20471 20090 20097 20287 20236 20445 20528
4096 30740 29187 28528 28525 28576 28580 28883 29258
2048 2949 3214 3360 3431 3440 3498 3396 3498
1024 2167 2205 2412 2376 2473 2221 2410 2420
512 1888 1876 1926 1981 1935 1938 1957 1976
rndwr 1 2 4 8 16 32 64 128
16384 10985 19312 27430 27813 28157 28528 28308 28234
8192 16505 29420 35329 34925 36020 34976 35897 35174
4096 21894 31724 34106 34799 36119 36608 37571 36274
2048 3637 8031 15225 22599 30882 31966 32567 32427
1024 3704 8121 15219 23670 31784 33156 31469 33547
512 3604 7988 15206 23742 32007 31933 32523 33667
Sequential writes below 4k perform drastically worse, which is quite
unexpected. Write performance across the board is horrible compared
to ext4. Sequential reads are much better, in particular in the <4k
cases; I would assume some sort of readahead is happening, even though
the reads were submitted with direct I/O. Random reads <4k again drop
off significantly.
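
The readahead theory can be checked from userspace: if the sub-4k
"direct" reads are actually served from the page cache, the file's
pages will be resident afterwards. A rough sketch using mincore(), to
be pointed at whatever file the benchmark ran against:

/* Count how many pages of a file are resident in the page cache.
 * If O_DIRECT reads were really direct, this should stay near zero. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        long page = sysconf(_SC_PAGESIZE);
        size_t pages, resident = 0, i;
        unsigned char *vec;
        struct stat st;
        void *map;
        int fd;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0 || fstat(fd, &st) || st.st_size == 0)
                return 1;
        pages = (st.st_size + page - 1) / page;
        map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        vec = malloc(pages);
        if (map == MAP_FAILED || !vec || mincore(map, st.st_size, vec))
                return 1;
        for (i = 0; i < pages; i++)
                resident += vec[i] & 1;
        printf("%zu of %zu pages resident\n", resident, pages);
        return 0;
}
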
xfs:
====
seqrd 1 2 4 8 16 32 64 128
16384 4698 4424 4397 4402 4394 4398 4642 4679
8192 6234 5827 5797 5801 5795 6114 5793 5812
4096 9100 8835 8882 8896 8874 8890 8910 8906
2048 14922 14391 14259 14248 14264 14264 14269 14273
1024 23853 22690 22329 22362 22338 22277 22240 22301
512 37353 33990 33292 33332 33306 33296 33224 33271
rndrd 1 2 4 8 16 32 64 128
16384 4585 8248 14219 22533 32020 38636 39033 39054
8192 6032 11186 20294 34443 53112 71228 78197 78284
4096 8247 15539 29046 52090 86744 125835 154031 157143
2048 11950 22652 42719 79562 140133 218092 286111 314870
1024 16526 31294 59761 112494 207848 348226 483972 574403
512 20635 39755 73010 130992 270648 484406 686190 726615
seqwr 1 2 4 8 16 32 64 128
16384 39956 39695 39971 39913 37042 37538 36591 32179
8192 67934 66073 30963 29038 29852 25210 23983 28272
4096 89250 81417 28671 18685 12917 14870 22643 22237
2048 140272 120588 140665 140012 137516 139183 131330 129684
1024 217473 147899 210350 218526 219867 220120 219758 215166
512 328260 181197 211131 263533 294009 298203 301698 298013
rndwr 1 2 4 8 16 32 64 128
16384 38447 38153 38145 38140 38156 38199 38208 38236
8192 78001 76965 76908 76945 77023 77174 77166 77106
4096 160721 156000 157196 157084 157078 157123 156978 157149
2048 325395 317148 317858 318442 318750 318981 319798 320393
1024 434084 649814 650176 651820 653928 654223 655650 655818
512 501067 876555 1290292 1217671 1244399 1267729 1285469 1298522
Sequential reads are pretty horrible and barely scale with the thread
count at all. Sequential writes are hitting a hot lock again.
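
About the hot lock, here and in the ext4 sequential write case: on a
kernel with CONFIG_LOCK_STAT the culprit shows up in /proc/lock_stat.
The proc files below are the standard lock_stat interface; the wrapper
around them is just one way one might drive it:

/* Clear and enable lock statistics, run the given workload, then dump
 * the beginning of /proc/lock_stat.  Needs CONFIG_LOCK_STAT and root. */
#include <stdio.h>
#include <stdlib.h>

static void write_file(const char *path, const char *val)
{
        FILE *f = fopen(path, "w");

        if (!f || fputs(val, f) == EOF)
                exit(1);
        fclose(f);
}

int main(int argc, char **argv)
{
        char line[512];
        FILE *f;
        int i;

        if (argc < 2) {
                fprintf(stderr, "usage: %s \"<command>\"\n", argv[0]);
                return 1;
        }
        write_file("/proc/lock_stat", "0");            /* clear old stats */
        write_file("/proc/sys/kernel/lock_stat", "1"); /* start collecting */
        if (system(argv[1]) == -1)                     /* run the workload */
                return 1;
        write_file("/proc/sys/kernel/lock_stat", "0"); /* stop collecting */

        f = fopen("/proc/lock_stat", "r");
        if (!f)
                return 1;
        for (i = 0; i < 40 && fgets(line, sizeof(line), f); i++)
                fputs(line, stdout);
        fclose(f);
        return 0;
}
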
So, if anyone would like to improve one of these filesystems and needs
more data, feel free to ping me.
Jörn
--
Victory in war is not repetitious.
-- Sun Tzu