* Filesystem benchmarks on reasonably fast hardware
@ 2011-07-17 16:05 Jörn Engel
  2011-07-17 23:32 ` Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Jörn Engel @ 2011-07-17 16:05 UTC (permalink / raw)
  To: linux-fsdevel

Hello everyone!

Recently I have had the pleasure of working with some nice hardware
and the displeasure of seeing it fail commercially.  However, when
trying to optimize performance I noticed that in some cases the
bottlenecks were not in the hardware or my driver, but rather in the
filesystem on top of it.  So maybe all this can still be useful in
improving said filesystems.

The hardware is basically a fast SSD.  Performance tops out at about
650MB/s and is fairly insensitive to random access behaviour.  Latency
is about 50us for 512B reads and near 0 for writes, through the usual
cheating.

The numbers below were created with sysbench, using direct IO.  Each
block is a matrix with results for blocksizes from 512B to 16384B and
thread counts from 1 to 128.  Four blocks for reads and writes, both
sequential and random.

Ext4:
=====
seqrd       1       2       4       8      16      32      64     128
16384    4867    8717   16367   29249   39131   39140   39135   39123
 8192    6324   10889   19980   37239   66346   78444   78429   78409
 4096    9158   15810   26072   45999   85371  148061  157222  157294
 2048   15019   24555   35934   59698  106541  198986  313969  315566
 1024   24271   36914   51845   80230  136313  252832  454153  484120
  512   37803   62144   78952  111489  177844  314896  559295  615744

rndrd       1       2       4       8      16      32      64     128
16384    4770    8539   14715   23465   33630   39073   39101   39103
 8192    6138   11398   20873   35785   56068   75743   78364   78374
 4096    8338   15657   29648   53927   91854  136595  157279  157349
 2048   11985   22894   43495   81211  148029  239962  314183  315695
 1024   16529   31306   61307  114721  222700  387439  561810  632719
  512   20580   40160   73642  135711  294583  542828  795607  821025

seqwr       1       2       4       8      16      32      64     128
16384   37588   37600   37730   37680   37631   37664   37670   37662
 8192   77621   77737   77947   77967   77875   77939   77833   77574
 4096  124083  123171  121159  120947  120202  120315  119917  120236
 2048  158540  153993  151128  150663  150686  151159  150358  147827
 1024  183300  176091  170527  170919  169608  169900  169662  168622
  512  229167  231672  221629  220416  223490  217877  222390  219718

rndwr       1       2       4       8      16      32      64     128
16384   38932   38290   38200   38306   38421   38404   38329   38326
 8192   79790   77297   77464   77447   77420   77460   77495   77545
 4096  163985  157626  158232  158212  158102  158169  158273  158236
 2048  272261  322637  320032  320932  321597  322008  322242  322699
 1024  339647  609192  652655  644903  654604  658292  659119  659667
  512  403366  718426 1227643 1149850 1155541 1157633 1173567 1180710

Sequential writes are significantly worse than random writes.  If
someone is interested, I can see which lock is causing all this.
Sequential reads below 2k are also worse than their random
counterparts, although one might wonder whether direct IO in 1k chunks
makes sense at all.  Random reads in the last column scale very nicely
with block size down to 1k, but hit some problem at 512B.  The machine
could be cpu-bound at this point.
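
For reference, one cell of such a matrix can be produced with a sysbench
fileio run along the following lines (0.4.x syntax).  This is only a
minimal sketch: file size, runtime and target directory are placeholders,
and the options in the script Jörn posts later in the thread may well
differ.

  # random reads, 4096B blocks, 16 threads, direct IO, files in /mnt/test
  cd /mnt/test
  sysbench --test=fileio --file-total-size=8G prepare
  sysbench --test=fileio --file-total-size=8G \
      --file-test-mode=rndrd --file-block-size=4096 \
      --file-extra-flags=direct --num-threads=16 \
      --max-time=60 --max-requests=0 run
  sysbench --test=fileio --file-total-size=8G cleanup

Looping over --file-test-mode={seqrd,rndrd,seqwr,rndwr}, the six block
sizes and the eight thread counts yields the four matrices per filesystem.
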
Btrfs:
======
seqrd       1       2       4       8      16      32      64     128
16384    3270    6582   12919   24866   36424   39682   39726   39721
 8192    4394    8348   16483   32165   54221   79256   79396   79415
 4096    6337   12024   21696   40569   74924  131763  158292  158763
 2048  297222  298299  294727  294740  296496  298517  300118  300740
 1024  583891  595083  584272  580965  584030  589115  599634  598054
  512 1103026 1175523 1134172 1133606 1123684 1123978 1156758 1130354

rndrd       1       2       4       8      16      32      64     128
16384    3252    6621   12437   20354   30896   39365   39115   39746
 8192    4273    8749   17871   32135   51812   72715   79443   79456
 4096    5842   11900   24824   48072   84485  128721  158631  158812
 2048    7177   12540   20244   27543   32386   34839   35728   35916
 1024    7178   12577   20341   27473   32656   34763   36056   35960
  512    7176   12554   20289   27603   32504   34781   35983   35919

seqwr       1       2       4       8      16      32      64     128
16384   13357   12838   12604   12596   12588   12641   12716   12814
 8192   21426   20471   20090   20097   20287   20236   20445   20528
 4096   30740   29187   28528   28525   28576   28580   28883   29258
 2048    2949    3214    3360    3431    3440    3498    3396    3498
 1024    2167    2205    2412    2376    2473    2221    2410    2420
  512    1888    1876    1926    1981    1935    1938    1957    1976

rndwr       1       2       4       8      16      32      64     128
16384   10985   19312   27430   27813   28157   28528   28308   28234
 8192   16505   29420   35329   34925   36020   34976   35897   35174
 4096   21894   31724   34106   34799   36119   36608   37571   36274
 2048    3637    8031   15225   22599   30882   31966   32567   32427
 1024    3704    8121   15219   23670   31784   33156   31469   33547
  512    3604    7988   15206   23742   32007   31933   32523   33667

Sequential writes below 4k perform drastically worse.  Quite
unexpected.  Write performance across the board is horrible when
compared to ext4.  Sequential reads are much better, in particular for
<4k cases.  I would assume some sort of readahead is happening.
Random reads <4k again drop off significantly.

xfs:
====
seqrd       1       2       4       8      16      32      64     128
16384    4698    4424    4397    4402    4394    4398    4642    4679
 8192    6234    5827    5797    5801    5795    6114    5793    5812
 4096    9100    8835    8882    8896    8874    8890    8910    8906
 2048   14922   14391   14259   14248   14264   14264   14269   14273
 1024   23853   22690   22329   22362   22338   22277   22240   22301
  512   37353   33990   33292   33332   33306   33296   33224   33271

rndrd       1       2       4       8      16      32      64     128
16384    4585    8248   14219   22533   32020   38636   39033   39054
 8192    6032   11186   20294   34443   53112   71228   78197   78284
 4096    8247   15539   29046   52090   86744  125835  154031  157143
 2048   11950   22652   42719   79562  140133  218092  286111  314870
 1024   16526   31294   59761  112494  207848  348226  483972  574403
  512   20635   39755   73010  130992  270648  484406  686190  726615

seqwr       1       2       4       8      16      32      64     128
16384   39956   39695   39971   39913   37042   37538   36591   32179
 8192   67934   66073   30963   29038   29852   25210   23983   28272
 4096   89250   81417   28671   18685   12917   14870   22643   22237
 2048  140272  120588  140665  140012  137516  139183  131330  129684
 1024  217473  147899  210350  218526  219867  220120  219758  215166
  512  328260  181197  211131  263533  294009  298203  301698  298013

rndwr       1       2       4       8      16      32      64     128
16384   38447   38153   38145   38140   38156   38199   38208   38236
 8192   78001   76965   76908   76945   77023   77174   77166   77106
 4096  160721  156000  157196  157084  157078  157123  156978  157149
 2048  325395  317148  317858  318442  318750  318981  319798  320393
 1024  434084  649814  650176  651820  653928  654223  655650  655818
  512  501067  876555 1290292 1217671 1244399 1267729 1285469 1298522

Sequential reads are pretty horrible.  Sequential writes are hitting a
hot lock again.

So, if anyone would like to improve one of these filesystems and needs
more data, feel free to ping me.

Jörn

-- 
Victory in war is not repetitious.
-- Sun Tzu

* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-17 16:05 Filesystem benchmarks on reasonably fast hardware Jörn Engel @ 2011-07-17 23:32 ` Dave Chinner [not found] ` <20110718075339.GB1437@logfs.org> 2011-07-18 12:07 ` Ted Ts'o 2011-07-19 13:19 ` Dave Chinner 2 siblings, 1 reply; 21+ messages in thread From: Dave Chinner @ 2011-07-17 23:32 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > Hello everyone! > > Recently I have had the pleasure of working with some nice hardware > and the displeasure of seeing it fail commercially. However, when > trying to optimize performance I noticed that in some cases the > bottlenecks were not in the hardware or my driver, but rather in the > filesystem on top of it. So maybe all this may still be useful in > improving said filesystem. > > Hardware is basically a fast SSD. Performance tops out at about > 650MB/s and is fairly insensitive to random access behaviour. Latency > is about 50us for 512B reads and near 0 for writes, through the usual > cheating. > > Numbers below were created with sysbench, using directIO. Each block > is a matrix with results for blocksizes from 512B to 16384B and thread > count from 1 to 128. Four blocks for reads and writes, both > sequential and random. What's the command line/script used to generate the result matrix? And what kernel are you running on? > xfs: > ==== > seqrd 1 2 4 8 16 32 64 128 > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > 512 37353 33990 33292 33332 33306 33296 33224 33271 Something is single threading completely there - something is very wrong. Someone want to send me a nice fast pci-e SSD - my disks don't spin that fast... :/ > rndrd 1 2 4 8 16 32 64 128 > 16384 4585 8248 14219 22533 32020 38636 39033 39054 > 8192 6032 11186 20294 34443 53112 71228 78197 78284 > 4096 8247 15539 29046 52090 86744 125835 154031 157143 > 2048 11950 22652 42719 79562 140133 218092 286111 314870 > 1024 16526 31294 59761 112494 207848 348226 483972 574403 > 512 20635 39755 73010 130992 270648 484406 686190 726615 > > seqwr 1 2 4 8 16 32 64 128 > 16384 39956 39695 39971 39913 37042 37538 36591 32179 > 8192 67934 66073 30963 29038 29852 25210 23983 28272 > 4096 89250 81417 28671 18685 12917 14870 22643 22237 > 2048 140272 120588 140665 140012 137516 139183 131330 129684 > 1024 217473 147899 210350 218526 219867 220120 219758 215166 > 512 328260 181197 211131 263533 294009 298203 301698 298013 > > rndwr 1 2 4 8 16 32 64 128 > 16384 38447 38153 38145 38140 38156 38199 38208 38236 > 8192 78001 76965 76908 76945 77023 77174 77166 77106 > 4096 160721 156000 157196 157084 157078 157123 156978 157149 > 2048 325395 317148 317858 318442 318750 318981 319798 320393 > 1024 434084 649814 650176 651820 653928 654223 655650 655818 > 512 501067 876555 1290292 1217671 1244399 1267729 1285469 1298522 I'm assuming that is the h/w can do 650MB/s then the numbers are in iops? from 4 threads up all results equate to 650MB/s. > Sequential reads are pretty horrible. Sequential writes are hitting a > hot lock again. lockstat output? > So, if anyone would like to improve one of these filesystems and needs > more data, feel free to ping me. Of course I'm interested. ;) Cheers, Dave. 
-- Dave Chinner david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
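
The lockstat output requested above comes from the kernel's lock
statistics facility (CONFIG_LOCK_STAT).  A minimal collection sequence,
assuming a lock-stat enabled kernel, looks roughly like this:

  echo 0 > /proc/lock_stat                # clear previously collected data
  echo 1 > /proc/sys/kernel/lock_stat     # enable collection
  # ... run the sysbench workload ...
  echo 0 > /proc/sys/kernel/lock_stat     # stop collection
  cat /proc/lock_stat > seqwr.lockstat    # placeholder output file name

The resulting report lists contention counts and wait times per lock
class, which is how the XFS iolock and the generic i_mutex show up later
in this thread.
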
[parent not found: <20110718075339.GB1437@logfs.org>]
* Re: Filesystem benchmarks on reasonably fast hardware [not found] ` <20110718075339.GB1437@logfs.org> @ 2011-07-18 10:57 ` Dave Chinner 2011-07-18 11:40 ` Jörn Engel 2011-07-18 14:34 ` Jörn Engel [not found] ` <20110718103956.GE1437@logfs.org> 1 sibling, 2 replies; 21+ messages in thread From: Dave Chinner @ 2011-07-18 10:57 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > > > > > > Numbers below were created with sysbench, using directIO. Each block > > > is a matrix with results for blocksizes from 512B to 16384B and thread > > > count from 1 to 128. Four blocks for reads and writes, both > > > sequential and random. > > > > What's the command line/script used to generate the result matrix? > > And what kernel are you running on? > > Script is attached. Kernel is git from July 13th (51414d41). Ok, thanks. > > > xfs: > > > ==== > > > seqrd 1 2 4 8 16 32 64 128 > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 > > > > Something is single threading completely there - something is very > > wrong. Someone want to send me a nice fast pci-e SSD - my disks > > don't spin that fast... :/ > > I wish I could just go down the shop and pick one from the > manufacturing line. :/ Heh. At this point any old pci-e ssd would be an improvement ;) > > > rndwr 1 2 4 8 16 32 64 128 > > > 16384 38447 38153 38145 38140 38156 38199 38208 38236 > > > 8192 78001 76965 76908 76945 77023 77174 77166 77106 > > > 4096 160721 156000 157196 157084 157078 157123 156978 157149 > > > 2048 325395 317148 317858 318442 318750 318981 319798 320393 > > > 1024 434084 649814 650176 651820 653928 654223 655650 655818 > > > 512 501067 876555 1290292 1217671 1244399 1267729 1285469 1298522 > > > > I'm assuming that is the h/w can do 650MB/s then the numbers are in > > iops? from 4 threads up all results equate to 650MB/s. > > Correct. Writes are spread automatically across all chips. They are > further cached, so until every chip is busy writing, their effective > latency is pretty much 0. Makes for a pretty flat graph, I agree. > > > > Sequential reads are pretty horrible. Sequential writes are hitting a > > > hot lock again. > > > > lockstat output? > > Attached for the bottom right case each of seqrd and seqwr. I hope > the filenames are descriptive enough. Looks like you attached the seqrd lockstat twice. > Lockstat itself hurts > performance. Writes were at 32245 IO/s from 298013, reads at 22458 > IO/s from 33271. In a way we are measuring oranges to figure out why > our apples are so small. Yeah, but at least it points out the lock in question - the iolock. We grab it exclusively for a very short period of time on each direct IO read to check the page cache state, then demote it to shared. I can see that when IO times are very short, this will, in fact, serialise multiple readers to a single file. 
A single thread shows this locking pattern: sysbench-3087 [000] 2192558.643146: xfs_ilock: dev 253:0 ino 0x83 flags IOLOCK_EXCL caller xfs_rw_ilock sysbench-3087 [000] 2192558.643147: xfs_ilock_demote: dev 253:0 ino 0x83 flags IOLOCK_EXCL caller T.1428 sysbench-3087 [000] 2192558.643150: xfs_ilock: dev 253:0 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_map_shared sysbench-3087 [001] 2192558.643877: xfs_ilock: dev 253:0 ino 0x83 flags IOLOCK_EXCL caller xfs_rw_ilock sysbench-3087 [001] 2192558.643879: xfs_ilock_demote: dev 253:0 ino 0x83 flags IOLOCK_EXCL caller T.1428 sysbench-3087 [007] 2192558.643881: xfs_ilock: dev 253:0 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_map_shared Two threads show this: sysbench-3096 [005] 2192697.678308: xfs_ilock: dev 253:0 ino 0x1c02c2 flags IOLOCK_EXCL caller xfs_rw_ilock sysbench-3096 [005] 2192697.678314: xfs_ilock_demote: dev 253:0 ino 0x1c02c2 flags IOLOCK_EXCL caller T.1428 sysbench-3096 [005] 2192697.678335: xfs_ilock: dev 253:0 ino 0x1c02c2 flags ILOCK_SHARED caller xfs_ilock_map_shared sysbench-3097 [006] 2192697.678556: xfs_ilock: dev 253:0 ino 0x1c02c2 flags IOLOCK_EXCL caller xfs_rw_ilock sysbench-3097 [006] 2192697.678556: xfs_ilock_demote: dev 253:0 ino 0x1c02c2 flags IOLOCK_EXCL caller T.1428 sysbench-3097 [006] 2192697.678577: xfs_ilock: dev 253:0 ino 0x1c02c2 flags ILOCK_SHARED caller xfs_ilock_map_shared sysbench-3096 [007] 2192697.678976: xfs_ilock: dev 253:0 ino 0x1c02c2 flags IOLOCK_EXCL caller xfs_rw_ilock sysbench-3096 [007] 2192697.678978: xfs_ilock_demote: dev 253:0 ino 0x1c02c2 flags IOLOCK_EXCL caller T.1428 sysbench-3096 [007] 2192697.679000: xfs_ilock: dev 253:0 ino 0x1c02c2 flags ILOCK_SHARED caller xfs_ilock_map_shared Which shows the exclusive lock on the concurrent IO serialising on the IO in progress. Oops, that's not good. Ok, the patch below takes the numbers on my test setup on a 16k IO size: seqrd 1 2 4 8 16 vanilla 3603 2798 2563 not tested... patches 3707 5746 10304 12875 11016 So those numbers look a lot healthier. The patch is below, > -- > Fancy algorithms are slow when n is small, and n is usually small. > Fancy algorithms have big constants. Until you know that n is > frequently going to be big, don't get fancy. > -- Rob Pike Heh. XFS always assumes n will be big. Because where XFS is used, it just is. Cheers, Dave. -- Dave Chinner david@fromorbit.com xfs: don't serialise direct IO reads on page cache checks From: Dave Chinner <dchinner@redhat.com> There is no need to grab the i_mutex of the IO lock in exclusive mode if we don't need to invalidate the page cache. Taking hese locks on every direct IO effective serialisaes them as taking the IO lock in exclusive mode has to wait for all shared holders to drop the lock. That only happens when IO is complete, so effective it prevents dispatch of concurrent direct IO reads to the same inode. Fix this by taking the IO lock shared to check the page cache state, and only then drop it and take the IO lock exclusively if there is work to be done. Hence for the normal direct IO case, no exclusive locking will occur. 
Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/xfs/linux-2.6/xfs_file.c | 17 ++++++++++++++--- 1 files changed, 14 insertions(+), 3 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c index 1e641e6..16a4bf0 100644 --- a/fs/xfs/linux-2.6/xfs_file.c +++ b/fs/xfs/linux-2.6/xfs_file.c @@ -321,7 +321,19 @@ xfs_file_aio_read( if (XFS_FORCED_SHUTDOWN(mp)) return -EIO; - if (unlikely(ioflags & IO_ISDIRECT)) { + /* + * Locking is a bit tricky here. If we take an exclusive lock + * for direct IO, we effectively serialise all new concurrent + * read IO to this file and block it behind IO that is currently in + * progress because IO in progress holds the IO lock shared. We only + * need to hold the lock exclusive to blow away the page cache, so + * only take lock exclusively if the page cache needs invalidation. + * This allows the normal direct IO case of no page cache pages to + * proceeed concurrently without serialisation. + */ + xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); + if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) { + xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); xfs_rw_ilock(ip, XFS_IOLOCK_EXCL); if (inode->i_mapping->nrpages) { @@ -334,8 +346,7 @@ xfs_file_aio_read( } } xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL); - } else - xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); + } trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags); -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 21+ messages in thread
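
The xfs_ilock/xfs_ilock_demote lines quoted above come from the standard
XFS tracepoints.  One way to capture an equivalent trace, assuming debugfs
is mounted at /sys/kernel/debug, is roughly:

  cd /sys/kernel/debug/tracing
  echo 1 > events/xfs/xfs_ilock/enable
  echo 1 > events/xfs/xfs_ilock_demote/enable
  echo > trace                            # clear the ring buffer
  # ... run one or two sysbench read threads ...
  cat trace > xfs-ilock.trace             # placeholder output file name
  echo 0 > events/xfs/enable              # disable all xfs events again
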
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-18 10:57 ` Dave Chinner @ 2011-07-18 11:40 ` Jörn Engel 2011-07-19 2:41 ` Dave Chinner 2011-07-18 14:34 ` Jörn Engel 1 sibling, 1 reply; 21+ messages in thread From: Jörn Engel @ 2011-07-18 11:40 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-fsdevel On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote: > On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: > > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: > > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > > > > > xfs: > > > > ==== > > > > seqrd 1 2 4 8 16 32 64 128 > > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 Your patch definitely helps. Bottom right number is 584741 now. Still slower than ext4 or btrfs, but in the right ballpark. Will post the entire block once it has been generated. Jörn -- Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming. -- Rob Pike -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-18 11:40 ` Jörn Engel @ 2011-07-19 2:41 ` Dave Chinner 2011-07-19 7:36 ` Jörn Engel 0 siblings, 1 reply; 21+ messages in thread From: Dave Chinner @ 2011-07-19 2:41 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Mon, Jul 18, 2011 at 01:40:36PM +0200, Jörn Engel wrote: > On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote: > > On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: > > > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: > > > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > > > > > > > xfs: > > > > > ==== > > > > > seqrd 1 2 4 8 16 32 64 128 > > > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > > > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > > > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > > > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > > > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > > > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 > > Your patch definitely helps. Bottom right number is 584741 now. > Still slower than ext4 or btrfs, but in the right ballpark. Will > post the entire block once it has been generated. The btrfs numbers are through doing different IO. have a look at all the sub-filesystem block size numbers for btrfs. No matter the thread count, the number is the same - hardware limits. btrfs is not doing an IO per read syscall there - I'd say it's falling back to buffered IO unlink ext4 and xfs.... ..... > seqrd 1 2 4 8 16 32 64 128 > 16384 4542 8311 15738 28955 38273 36644 38530 38527 > 8192 6000 10413 19208 33878 65927 76906 77083 77102 > 4096 8931 14971 24794 44223 83512 144867 147581 150702 > 2048 14375 23489 34364 56887 103053 192662 307167 309222 > 1024 21647 36022 49649 77163 132886 243296 421389 497581 > 512 31832 61257 79545 108782 176341 303836 517814 584741 > > Quite a nice improvement for such a small patch. As they say, "every > small factor of 17 helps". ;) And in general the numbers are within a couple of percent of the ext4 numbers, which is probably a reflection of the slightly higher CPU cost of the XFS read path compared to ext4. > What bothers me a bit is that the single-threaded numbers took such a > noticeable hit... Is it reproducable? I did notice quite a bit of run-to-run variation in the numbers I ran. For single threaded numbers, they appear to be in the order of +/-100 ops @ 16k block size. > > > Ok, the patch below takes the numbers on my test setup on a 16k IO > > size: > > > > seqrd 1 2 4 8 16 > > vanilla 3603 2798 2563 not tested... > > patches 3707 5746 10304 12875 11016 > > ...in particular when your numbers improve even for a single thread. > Wonder what's going on here. And these were just quoted from a single test run. > Anyway, feel free to add a Tested-By: or something from me. And maybe > fix the two typos below. Will do. Cheers, Dave. -- Dave Chinner david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-19 2:41 ` Dave Chinner @ 2011-07-19 7:36 ` Jörn Engel 2011-07-19 9:23 ` srimugunthan dhandapani 2011-07-19 10:15 ` Dave Chinner 0 siblings, 2 replies; 21+ messages in thread From: Jörn Engel @ 2011-07-19 7:36 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-fsdevel On Tue, 19 July 2011 12:41:38 +1000, Dave Chinner wrote: > On Mon, Jul 18, 2011 at 01:40:36PM +0200, Jörn Engel wrote: > > On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote: > > > On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: > > > > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: > > > > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > > > > > > > > > xfs: > > > > > > ==== > > > > > > seqrd 1 2 4 8 16 32 64 128 > > > > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > > > > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > > > > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > > > > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > > > > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > > > > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 > > > seqrd 1 2 4 8 16 32 64 128 > > 16384 4542 8311 15738 28955 38273 36644 38530 38527 > > 8192 6000 10413 19208 33878 65927 76906 77083 77102 > > 4096 8931 14971 24794 44223 83512 144867 147581 150702 > > 2048 14375 23489 34364 56887 103053 192662 307167 309222 > > 1024 21647 36022 49649 77163 132886 243296 421389 497581 > > 512 31832 61257 79545 108782 176341 303836 517814 584741 > > > What bothers me a bit is that the single-threaded numbers took such a > > noticeable hit... > > Is it reproducable? I did notice quite a bit of run-to-run variation > in the numbers I ran. For single threaded numbers, they appear to be > in the order of +/-100 ops @ 16k block size. Ime the numbers are stable within about 10%. And given that out of six measurements every single one is a regression, I would feel confident to bet a beverage without further measurements. Regression is 3.4%, 3.9%, 1.9%, 3.8%, 10% and 17% respectively, so the effect appears to be more visible with smaller block numbers as well. Jörn -- Schrödinger's cat is <BLINK>not</BLINK> dead. -- Illiad -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-19 7:36 ` Jörn Engel @ 2011-07-19 9:23 ` srimugunthan dhandapani 2011-07-21 19:05 ` Jörn Engel 2011-07-19 10:15 ` Dave Chinner 1 sibling, 1 reply; 21+ messages in thread From: srimugunthan dhandapani @ 2011-07-19 9:23 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Tue, Jul 19, 2011 at 1:06 PM, Jörn Engel <joern@logfs.org> wrote: > On Tue, 19 July 2011 12:41:38 +1000, Dave Chinner wrote: >> On Mon, Jul 18, 2011 at 01:40:36PM +0200, Jörn Engel wrote: >> > On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote: >> > > On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: >> > > > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: >> > > > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: >> > > >> > > > > > xfs: >> > > > > > ==== >> > > > > > seqrd 1 2 4 8 16 32 64 128 >> > > > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 >> > > > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 >> > > > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 >> > > > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 >> > > > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 >> > > > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 >> >> > seqrd 1 2 4 8 16 32 64 128 >> > 16384 4542 8311 15738 28955 38273 36644 38530 38527 >> > 8192 6000 10413 19208 33878 65927 76906 77083 77102 >> > 4096 8931 14971 24794 44223 83512 144867 147581 150702 >> > 2048 14375 23489 34364 56887 103053 192662 307167 309222 >> > 1024 21647 36022 49649 77163 132886 243296 421389 497581 >> > 512 31832 61257 79545 108782 176341 303836 517814 584741 >> >> > What bothers me a bit is that the single-threaded numbers took such a >> > noticeable hit... >> >> Is it reproducable? I did notice quite a bit of run-to-run variation >> in the numbers I ran. For single threaded numbers, they appear to be >> in the order of +/-100 ops @ 16k block size. > > Ime the numbers are stable within about 10%. And given that out of > six measurements every single one is a regression, I would feel > confident to bet a beverage without further measurements. Regression > is 3.4%, 3.9%, 1.9%, 3.8%, 10% and 17% respectively, so the effect > appears to be more visible with smaller block numbers as well. > > Jörn Hi Joern Is the hardware the "Drais card" that you described in the following link www.linux-kongress.org/2010/slides/logfs-engel.pdf Since the driver exposes an mtd device, do you mount the ext4,btrfs filesystem over any FTL? Is it possible to have logfs over the PCIe-SSD card? Pardon me for asking the following in this thread. I have been trying to mount logfs and i face seg fault during unmount . I have tested it in 2.6.34 and 2.39.1. I have asked about the problem here. http://comments.gmane.org/gmane.linux.file-systems/55008 Two other people have also faced umount problem in logfs 1. http://comments.gmane.org/gmane.linux.file-systems/46630 2. http://eeek.borgchat.net/lists/linux-embedded/msg02970.html My apologies again for asking it here. Since the logfs@logfs.org mailing list(and the wiki) doesnt work any more , i am asking the question here. I am thankful for your reply. Thanks, mugunthan -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-19 9:23 ` srimugunthan dhandapani @ 2011-07-21 19:05 ` Jörn Engel 0 siblings, 0 replies; 21+ messages in thread From: Jörn Engel @ 2011-07-21 19:05 UTC (permalink / raw) To: srimugunthan dhandapani; +Cc: linux-fsdevel On Tue, 19 July 2011 14:53:08 +0530, srimugunthan dhandapani wrote: > > Is the hardware the "Drais card" that you described in the following link > www.linux-kongress.org/2010/slides/logfs-engel.pdf Yes. > Since the driver exposes an mtd device, do you mount the ext4,btrfs > filesystem over any FTL? That was last year. In the mean time I've added an FTL to the driver, so the card behaves like a regular ssd. Well, mostly. > Is it possible to have logfs over the PCIe-SSD card? YeaaaNo! Not anymore. Could be lack of error correction in the current driver or could be bitrot. Logfs over loopback seems to work just fine, so if it is bitrot, it is limited to the mtd interface. > Pardon me for asking the following in this thread. > I have been trying to mount logfs and i face seg fault during unmount > . I have tested it in 2.6.34 and 2.39.1. I have asked about the > problem here. > http://comments.gmane.org/gmane.linux.file-systems/55008 > > Two other people have also faced umount problem in logfs > > 1. http://comments.gmane.org/gmane.linux.file-systems/46630 > 2. http://eeek.borgchat.net/lists/linux-embedded/msg02970.html > > My apologies again for asking it here. Since the logfs@logfs.org > mailing list(and the wiki) doesnt work any more , i am asking the > question here. I am thankful for your reply. Yes, ever since that machine died I have basically been the non-maintainer of logfs. In a different century I would have been hanged, drawn and quartered for it. Give me some time to test the mtd side and see what's up. Jörn -- Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface. -- Doug MacIlroy -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-19 7:36 ` Jörn Engel 2011-07-19 9:23 ` srimugunthan dhandapani @ 2011-07-19 10:15 ` Dave Chinner 1 sibling, 0 replies; 21+ messages in thread From: Dave Chinner @ 2011-07-19 10:15 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Tue, Jul 19, 2011 at 09:36:33AM +0200, Jörn Engel wrote: > On Tue, 19 July 2011 12:41:38 +1000, Dave Chinner wrote: > > On Mon, Jul 18, 2011 at 01:40:36PM +0200, Jörn Engel wrote: > > > On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote: > > > > On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: > > > > > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: > > > > > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > > > > > > > > > > > xfs: > > > > > > > ==== > > > > > > > seqrd 1 2 4 8 16 32 64 128 > > > > > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > > > > > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > > > > > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > > > > > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > > > > > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > > > > > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 > > > > > seqrd 1 2 4 8 16 32 64 128 > > > 16384 4542 8311 15738 28955 38273 36644 38530 38527 > > > 8192 6000 10413 19208 33878 65927 76906 77083 77102 > > > 4096 8931 14971 24794 44223 83512 144867 147581 150702 > > > 2048 14375 23489 34364 56887 103053 192662 307167 309222 > > > 1024 21647 36022 49649 77163 132886 243296 421389 497581 > > > 512 31832 61257 79545 108782 176341 303836 517814 584741 > > > > > What bothers me a bit is that the single-threaded numbers took such a > > > noticeable hit... > > > > Is it reproducable? I did notice quite a bit of run-to-run variation > > in the numbers I ran. For single threaded numbers, they appear to be > > in the order of +/-100 ops @ 16k block size. > > Ime the numbers are stable within about 10%. And given that out of > six measurements every single one is a regression, I would feel > confident to bet a beverage without further measurements. Regression > is 3.4%, 3.9%, 1.9%, 3.8%, 10% and 17% respectively, so the effect > appears to be more visible with smaller block numbers as well. Only thing I can think of then is that taking the lock shared is more expensive than taking it exclusive. Otherwise there is little change to the code path.... /me shrugs and cares not all that much right now Cheers, Dave. -- Dave Chinner david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-18 10:57 ` Dave Chinner 2011-07-18 11:40 ` Jörn Engel @ 2011-07-18 14:34 ` Jörn Engel 1 sibling, 0 replies; 21+ messages in thread From: Jörn Engel @ 2011-07-18 14:34 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-fsdevel On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote: > On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote: > > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote: > > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > > > > > xfs: > > > > ==== > > > > seqrd 1 2 4 8 16 32 64 128 > > > > 16384 4698 4424 4397 4402 4394 4398 4642 4679 > > > > 8192 6234 5827 5797 5801 5795 6114 5793 5812 > > > > 4096 9100 8835 8882 8896 8874 8890 8910 8906 > > > > 2048 14922 14391 14259 14248 14264 14264 14269 14273 > > > > 1024 23853 22690 22329 22362 22338 22277 22240 22301 > > > > 512 37353 33990 33292 33332 33306 33296 33224 33271 seqrd 1 2 4 8 16 32 64 128 16384 4542 8311 15738 28955 38273 36644 38530 38527 8192 6000 10413 19208 33878 65927 76906 77083 77102 4096 8931 14971 24794 44223 83512 144867 147581 150702 2048 14375 23489 34364 56887 103053 192662 307167 309222 1024 21647 36022 49649 77163 132886 243296 421389 497581 512 31832 61257 79545 108782 176341 303836 517814 584741 Quite a nice improvement for such a small patch. As they say, "every small factor of 17 helps". ;) What bothers me a bit is that the single-threaded numbers took such a noticeable hit... > Ok, the patch below takes the numbers on my test setup on a 16k IO > size: > > seqrd 1 2 4 8 16 > vanilla 3603 2798 2563 not tested... > patches 3707 5746 10304 12875 11016 ...in particular when your numbers improve even for a single thread. Wonder what's going on here. Anyway, feel free to add a Tested-By: or something from me. And maybe fix the two typos below. > xfs: don't serialise direct IO reads on page cache checks > > From: Dave Chinner <dchinner@redhat.com> > > There is no need to grab the i_mutex of the IO lock in exclusive > mode if we don't need to invalidate the page cache. Taking hese ^ > locks on every direct IO effective serialisaes them as taking the IO ^ > lock in exclusive mode has to wait for all shared holders to drop > the lock. That only happens when IO is complete, so effective it > prevents dispatch of concurrent direct IO reads to the same inode. > > Fix this by taking the IO lock shared to check the page cache state, > and only then drop it and take the IO lock exclusively if there is > work to be done. Hence for the normal direct IO case, no exclusive > locking will occur. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > --- > fs/xfs/linux-2.6/xfs_file.c | 17 ++++++++++++++--- > 1 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c > index 1e641e6..16a4bf0 100644 > --- a/fs/xfs/linux-2.6/xfs_file.c > +++ b/fs/xfs/linux-2.6/xfs_file.c > @@ -321,7 +321,19 @@ xfs_file_aio_read( > if (XFS_FORCED_SHUTDOWN(mp)) > return -EIO; > > - if (unlikely(ioflags & IO_ISDIRECT)) { > + /* > + * Locking is a bit tricky here. If we take an exclusive lock > + * for direct IO, we effectively serialise all new concurrent > + * read IO to this file and block it behind IO that is currently in > + * progress because IO in progress holds the IO lock shared. We only > + * need to hold the lock exclusive to blow away the page cache, so > + * only take lock exclusively if the page cache needs invalidation. 
> + * This allows the normal direct IO case of no page cache pages to > + * proceeed concurrently without serialisation. > + */ > + xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); > + if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) { > + xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); > xfs_rw_ilock(ip, XFS_IOLOCK_EXCL); > > if (inode->i_mapping->nrpages) { > @@ -334,8 +346,7 @@ xfs_file_aio_read( > } > } > xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL); > - } else > - xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); > + } > > trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags); > Jörn -- Everything should be made as simple as possible, but not simpler. -- Albert Einstein -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20110718103956.GE1437@logfs.org>]
* Re: Filesystem benchmarks on reasonably fast hardware [not found] ` <20110718103956.GE1437@logfs.org> @ 2011-07-18 11:10 ` Dave Chinner 0 siblings, 0 replies; 21+ messages in thread From: Dave Chinner @ 2011-07-18 11:10 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Mon, Jul 18, 2011 at 12:39:56PM +0200, Jörn Engel wrote: > Write lockstat (I mistakenly sent the read one twice). Yeah, that's the i_mutex that is the issue there. We are definitely taking exclusive locks during the IO submission process there. I suspect I might be able to write a patch that does all the checks under a shared lock - similar to the patch for the read side - but it is definitely more complex and I'll have to have a bit of a think about it. Thanks for the bug report! Cheers, Dave. -- Dave Chinner david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-17 16:05 Filesystem benchmarks on reasonably fast hardware Jörn Engel 2011-07-17 23:32 ` Dave Chinner @ 2011-07-18 12:07 ` Ted Ts'o 2011-07-18 12:42 ` Jörn Engel 2011-07-19 13:19 ` Dave Chinner 2 siblings, 1 reply; 21+ messages in thread From: Ted Ts'o @ 2011-07-18 12:07 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel Hey Jörn, Can you send me your script and the lockstat for ext4? (Please cc the linux-ext4@vger.kernel.org list if you don't mind. Thanks!!) Thanks, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-18 12:07 ` Ted Ts'o @ 2011-07-18 12:42 ` Jörn Engel 2011-07-25 15:18 ` Ted Ts'o 0 siblings, 1 reply; 21+ messages in thread From: Jörn Engel @ 2011-07-18 12:42 UTC (permalink / raw) To: Ted Ts'o; +Cc: linux-fsdevel, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 666 bytes --] On Mon, 18 July 2011 08:07:51 -0400, Ted Ts'o wrote: > > Can you send me your script and the lockstat for ext4? Attached. The first script generates a bunch of files, the second condenses them into the tabular form. Will need some massaging to work on anything other than my particular setup, sorry. > (Please cc the linux-ext4@vger.kernel.org list if you don't mind. > Thanks!!) Sure. Lockstat will come later today. The machine is currently busy regenerating xfs seqrd numbers. Jörn -- I've never met a human being who would want to read 17,000 pages of documentation, and if there was, I'd kill him to get him out of the gene pool. -- Joseph Costello [-- Attachment #2: sysbench.sh --] [-- Type: application/x-sh, Size: 1612 bytes --] [-- Attachment #3: sysbench_result.sh --] [-- Type: application/x-sh, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-18 12:42 ` Jörn Engel @ 2011-07-25 15:18 ` Ted Ts'o 2011-07-25 18:20 ` Jörn Engel 0 siblings, 1 reply; 21+ messages in thread From: Ted Ts'o @ 2011-07-25 15:18 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel, linux-ext4 On Mon, Jul 18, 2011 at 02:42:29PM +0200, Jörn Engel wrote: > On Mon, 18 July 2011 08:07:51 -0400, Ted Ts'o wrote: > > > > Can you send me your script and the lockstat for ext4? > > Attached. The first script generates a bunch of files, the second > condenses them into the tabular form. Will need some massaging to > work on anything other than my particular setup, sorry. > > > (Please cc the linux-ext4@vger.kernel.org list if you don't mind. > > Thanks!!) > > Sure. Lockstat will come later today. The machine is currently busy > regenerating xfs seqrd numbers. Hi Jörn, Did you have a chance to do an ext4 lockstat run? Many thanks!! - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-25 15:18 ` Ted Ts'o @ 2011-07-25 18:20 ` Jörn Engel 2011-07-25 21:18 ` Ted Ts'o 2011-07-26 14:57 ` Ted Ts'o 0 siblings, 2 replies; 21+ messages in thread From: Jörn Engel @ 2011-07-25 18:20 UTC (permalink / raw) To: Ted Ts'o; +Cc: linux-fsdevel, linux-ext4 On Mon, 25 July 2011 11:18:25 -0400, Ted Ts'o wrote: > > Did you have a chance to do an ext4 lockstat run? Yes, I did. But your mails keep bouncing, so you have to look at the list to see it (or this mail). Yes, I lack a proper reverse DNS record, as the IP belongs to my provider, not me. Most people don't care, some bounce, some silently ignore my mail. The joys of spam filtering. Jörn -- The rabbit runs faster than the fox, because the rabbit is rinning for his life while the fox is only running for his dinner. -- Aesop -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-25 18:20 ` Jörn Engel @ 2011-07-25 21:18 ` Ted Ts'o 2011-07-26 14:57 ` Ted Ts'o 1 sibling, 0 replies; 21+ messages in thread From: Ted Ts'o @ 2011-07-25 21:18 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel, linux-ext4 On Mon, Jul 25, 2011 at 08:20:37PM +0200, Jörn Engel wrote: > On Mon, 25 July 2011 11:18:25 -0400, Ted Ts'o wrote: > > > > Did you have a chance to do an ext4 lockstat run? > > Yes, I did. But your mails keep bouncing, so you have to look at the > list to see it (or this mail). Yes, I lack a proper reverse DNS > record, as the IP belongs to my provider, not me. Most people don't > care, some bounce, some silently ignore my mail. The joys of spam > filtering. I didn't see the ext4 lockstat on the list. Can you resend it to tytso@google.com or theodore.tso@gmail.com? MIT is using an outsourced SPAM provider (Brightmail anti-spam), and I can't do anything about that, unfortunately. From what I can tell the Brightmail doesn't drop all e-mails from non-resolving IP's, but if it's in a "bad neighborhood" (i.e., your neighbors are all spammers, or belong to Windows users where 80% of the machines are spambots), Brightmail is probably going to flag your mail as spam. :-( Thanks! - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-25 18:20 ` Jörn Engel 2011-07-25 21:18 ` Ted Ts'o @ 2011-07-26 14:57 ` Ted Ts'o 2011-07-27 3:39 ` Yongqiang Yang 1 sibling, 1 reply; 21+ messages in thread From: Ted Ts'o @ 2011-07-26 14:57 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel, linux-ext4 On Mon, Jul 25, 2011 at 08:20:37PM +0200, Jörn Engel wrote: > On Mon, 25 July 2011 11:18:25 -0400, Ted Ts'o wrote: > > > > Did you have a chance to do an ext4 lockstat run? Hi Jörn, Thanks for forwarding it to me. It's the same problem as in XFS, the excessive coverage of the i_mutex lock. In ext4's case, it's in the generic generic_file_aio_write() machinery where we need to do the lock busting. (XFS apparently doesn't use the generic routines, so the fix that Dave did won't help ext3 and ext4.) I don't have the time to look at it now, but I'll put it on my todo list; or maybe someone with a bit more time can look into how we might be able to use a similar approach in the generic file system code. - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-26 14:57 ` Ted Ts'o @ 2011-07-27 3:39 ` Yongqiang Yang 0 siblings, 0 replies; 21+ messages in thread From: Yongqiang Yang @ 2011-07-27 3:39 UTC (permalink / raw) To: Jörn Engel, Ted Ts'o; +Cc: linux-fsdevel, linux-ext4 Hi Jörn and Ted, Could you anyone send out the ext4 lock stat on the list? Thank you, Yongqiang. On Tue, Jul 26, 2011 at 10:57 PM, Ted Ts'o <tytso@mit.edu> wrote: > On Mon, Jul 25, 2011 at 08:20:37PM +0200, Jörn Engel wrote: >> On Mon, 25 July 2011 11:18:25 -0400, Ted Ts'o wrote: >> > >> > Did you have a chance to do an ext4 lockstat run? > > Hi Jörn, > > Thanks for forwarding it to me. It's the same problem as in XFS, the > excessive coverage of the i_mutex lock. In ext4's case, it's in the > generic generic_file_aio_write() machinery where we need to do the > lock busting. (XFS apparently doesn't use the generic routines, so > the fix that Dave did won't help ext3 and ext4.) > > I don't have the time to look at it now, but I'll put it on my todo > list; or maybe someone with a bit more time can look into how we might > be able to use a similar approach in the generic file system code. > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Best Wishes Yongqiang Yang -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware 2011-07-17 16:05 Filesystem benchmarks on reasonably fast hardware Jörn Engel 2011-07-17 23:32 ` Dave Chinner 2011-07-18 12:07 ` Ted Ts'o @ 2011-07-19 13:19 ` Dave Chinner 2011-07-21 10:42 ` Jörn Engel 2 siblings, 1 reply; 21+ messages in thread From: Dave Chinner @ 2011-07-19 13:19 UTC (permalink / raw) To: Jörn Engel; +Cc: linux-fsdevel On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote: > xfs: > ==== ..... > seqwr 1 2 4 8 16 32 64 128 > 16384 39956 39695 39971 39913 37042 37538 36591 32179 > 8192 67934 66073 30963 29038 29852 25210 23983 28272 > 4096 89250 81417 28671 18685 12917 14870 22643 22237 > 2048 140272 120588 140665 140012 137516 139183 131330 129684 > 1024 217473 147899 210350 218526 219867 220120 219758 215166 > 512 328260 181197 211131 263533 294009 298203 301698 298013 OK, I can explain the pattern here where throughput drops off at 2-4 threads. It's not as simple as the seqrd case, but it's related to the fact that this workload is an append write workload. See the patch description below for why that matters. As it is, the numbers I get for 16k seqwr on my hardawre are as follows: seqwr 1 2 4 8 16 vanilla 3072 2734 2506 not tested... patched 2984 4156 4922 5175 5120 Looks like my hardware is topping out at ~5-6kiops no matter the block size here. Which, no matter how you look at it, is a significant improvement. ;) Cheers, Dave. -- Dave Chinner david@fromorbit.com xfs: don't serialise adjacent concurrent direct IO appending writes For append write workloads, extending the file requires a certain amount of exclusive locking to be done up front to ensure sanity in things like ensuring that we've zeroed any allocated regions between the old EOF and the start of the new IO. For single threads, this typically isn't a problem, and for large IOs we don't serialise enough for it to be a problem for two threads on really fast block devices. However for smaller IO and larger thread counts we have a problem. Take 4 concurrent sequential, single block sized and aligned IOs. After the first IO is submitted but before it completes, we end up with this state: IO 1 IO 2 IO 3 IO 4 +-------+-------+-------+-------+ ^ ^ | | | | | | | \- ip->i_new_size \- ip->i_size And the IO is done without exclusive locking because offset <= ip->i_size. When we submit IO 2, we see offset > ip->i_size, and grab the IO lock exclusive, because there is a chance we need to do EOF zeroing. However, there is already an IO in progress that avoids the need for IO zeroing because offset <= ip->i_new_size. hence we could avoid holding the IO lock exlcusive for this. Hence after submission of the second IO, we'd end up this state: IO 1 IO 2 IO 3 IO 4 +-------+-------+-------+-------+ ^ ^ | | | | | | | \- ip->i_new_size \- ip->i_size There is no need to grab the i_mutex of the IO lock in exclusive mode if we don't need to invalidate the page cache. Taking these locks on every direct IO effective serialises them as taking the IO lock in exclusive mode has to wait for all shared holders to drop the lock. That only happens when IO is complete, so effective it prevents dispatch of concurrent direct IO writes to the same inode. And so you can see that for the third concurrent IO, we'd avoid exclusive locking for the same reason we avoided the exclusive lock for the second IO. 
Fixing this is a bit more complex than that, because we need to hold a write-submission local value of ip->i_new_size to that clearing the value is only done if no other thread has updated it before our IO completes..... Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/xfs/linux-2.6/xfs_aops.c | 7 ++++ fs/xfs/linux-2.6/xfs_file.c | 69 ++++++++++++++++++++++++++++++++++--------- 2 files changed, 62 insertions(+), 14 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c index 63e971e..dda9a9e 100644 --- a/fs/xfs/linux-2.6/xfs_aops.c +++ b/fs/xfs/linux-2.6/xfs_aops.c @@ -176,6 +176,13 @@ xfs_setfilesize( if (unlikely(ioend->io_error)) return 0; + /* + * If the IO is clearly not beyond the on-disk inode size, + * return before we take locks. + */ + if (ioend->io_offset + ioend->io_size <= ip->i_d.di_size) + return 0; + if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) return EAGAIN; diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c index 16a4bf0..5b6703a 100644 --- a/fs/xfs/linux-2.6/xfs_file.c +++ b/fs/xfs/linux-2.6/xfs_file.c @@ -422,11 +422,13 @@ xfs_aio_write_isize_update( */ STATIC void xfs_aio_write_newsize_update( - struct xfs_inode *ip) + struct xfs_inode *ip, + xfs_fsize_t new_size) { - if (ip->i_new_size) { + if (new_size == ip->i_new_size) { xfs_rw_ilock(ip, XFS_ILOCK_EXCL); - ip->i_new_size = 0; + if (new_size == ip->i_new_size) + ip->i_new_size = 0; if (ip->i_d.di_size > ip->i_size) ip->i_d.di_size = ip->i_size; xfs_rw_iunlock(ip, XFS_ILOCK_EXCL); @@ -478,7 +480,7 @@ xfs_file_splice_write( count, flags); xfs_aio_write_isize_update(inode, ppos, ret); - xfs_aio_write_newsize_update(ip); + xfs_aio_write_newsize_update(ip, new_size); xfs_iunlock(ip, XFS_IOLOCK_EXCL); return ret; } @@ -675,6 +677,7 @@ xfs_file_aio_write_checks( struct file *file, loff_t *pos, size_t *count, + xfs_fsize_t *new_sizep, int *iolock) { struct inode *inode = file->f_mapping->host; @@ -682,6 +685,8 @@ xfs_file_aio_write_checks( xfs_fsize_t new_size; int error = 0; +restart: + *new_sizep = 0; error = generic_write_checks(file, pos, count, S_ISBLK(inode->i_mode)); if (error) { xfs_rw_iunlock(ip, XFS_ILOCK_EXCL | *iolock); @@ -689,9 +694,18 @@ xfs_file_aio_write_checks( return error; } + /* + * if we are writing beyond the current EOF, only update the + * ip->i_new_size if it is larger than any other concurrent write beyond + * EOF. Regardless of whether we update ip->i_new_size, return the + * updated new_size to the caller. + */ new_size = *pos + *count; - if (new_size > ip->i_size) - ip->i_new_size = new_size; + if (new_size > ip->i_size) { + if (new_size > ip->i_new_size) + ip->i_new_size = new_size; + *new_sizep = new_size; + } if (likely(!(file->f_mode & FMODE_NOCMTIME))) file_update_time(file); @@ -699,10 +713,22 @@ xfs_file_aio_write_checks( /* * If the offset is beyond the size of the file, we need to zero any * blocks that fall between the existing EOF and the start of this - * write. + * write. Don't issue zeroing if this IO is adjacent to an IO already in + * flight. If we are currently holding the iolock shared, we need to + * update it to exclusive which involves dropping all locks and + * relocking to maintain correct locking order. If we do this, restart + * the function to ensure all checks and values are still valid. 
*/ - if (*pos > ip->i_size) + if ((ip->i_new_size && *pos > ip->i_new_size) || + (!ip->i_new_size && *pos > ip->i_size)) { + if (*iolock == XFS_IOLOCK_SHARED) { + xfs_rw_iunlock(ip, XFS_ILOCK_EXCL | *iolock); + *iolock = XFS_IOLOCK_EXCL; + xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock); + goto restart; + } error = -xfs_zero_eof(ip, *pos, ip->i_size); + } xfs_rw_iunlock(ip, XFS_ILOCK_EXCL); if (error) @@ -749,6 +775,7 @@ xfs_file_dio_aio_write( unsigned long nr_segs, loff_t pos, size_t ocount, + xfs_fsize_t *new_size, int *iolock) { struct file *file = iocb->ki_filp; @@ -769,13 +796,25 @@ xfs_file_dio_aio_write( if ((pos & mp->m_blockmask) || ((pos + count) & mp->m_blockmask)) unaligned_io = 1; - if (unaligned_io || mapping->nrpages || pos > ip->i_size) + /* + * Tricky locking alert: if we are doing multiple concurrent sequential + * writes (e.g. via aio), we don't need to do EOF zeroing if the current + * IO is adjacent to an in-flight IO. That means for such IO we can + * avoid taking the IOLOCK exclusively. Hence we avoid checking for + * writes beyond EOF at this point when deciding what lock to take. + * We will take the IOLOCK exclusive later if necessary. + * + * This, however, means that we need a local copy of the ip->i_new_size + * value from this IO if we change it so that we can determine if we can + * clear the value from the inode when this IO completes. + */ + if (unaligned_io || mapping->nrpages) *iolock = XFS_IOLOCK_EXCL; else *iolock = XFS_IOLOCK_SHARED; xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock); - ret = xfs_file_aio_write_checks(file, &pos, &count, iolock); + ret = xfs_file_aio_write_checks(file, &pos, &count, new_size, iolock); if (ret) return ret; @@ -814,6 +853,7 @@ xfs_file_buffered_aio_write( unsigned long nr_segs, loff_t pos, size_t ocount, + xfs_fsize_t *new_size, int *iolock) { struct file *file = iocb->ki_filp; @@ -827,7 +867,7 @@ xfs_file_buffered_aio_write( *iolock = XFS_IOLOCK_EXCL; xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock); - ret = xfs_file_aio_write_checks(file, &pos, &count, iolock); + ret = xfs_file_aio_write_checks(file, &pos, &count, new_size, iolock); if (ret) return ret; @@ -867,6 +907,7 @@ xfs_file_aio_write( ssize_t ret; int iolock; size_t ocount = 0; + xfs_fsize_t new_size = 0; XFS_STATS_INC(xs_write_calls); @@ -886,10 +927,10 @@ xfs_file_aio_write( if (unlikely(file->f_flags & O_DIRECT)) ret = xfs_file_dio_aio_write(iocb, iovp, nr_segs, pos, - ocount, &iolock); + ocount, &new_size, &iolock); else ret = xfs_file_buffered_aio_write(iocb, iovp, nr_segs, pos, - ocount, &iolock); + ocount, &new_size, &iolock); xfs_aio_write_isize_update(inode, &iocb->ki_pos, ret); @@ -914,7 +955,7 @@ xfs_file_aio_write( } out_unlock: - xfs_aio_write_newsize_update(ip); + xfs_aio_write_newsize_update(ip, new_size); xfs_rw_iunlock(ip, iolock); return ret; } -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: Filesystem benchmarks on reasonably fast hardware
  2011-07-19 13:19               ` Dave Chinner
@ 2011-07-21 10:42                 ` Jörn Engel
  2011-07-22 18:51                   ` Jörn Engel
  0 siblings, 1 reply; 21+ messages in thread
From: Jörn Engel @ 2011-07-21 10:42 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-fsdevel

On Tue, 19 July 2011 23:19:58 +1000, Dave Chinner wrote:
> On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote:
> > xfs:
> > ====
> .....
> > seqwr       1      2      4      8     16     32     64    128
> > 16384   39956  39695  39971  39913  37042  37538  36591  32179
> >  8192   67934  66073  30963  29038  29852  25210  23983  28272
> >  4096   89250  81417  28671  18685  12917  14870  22643  22237
> >  2048  140272 120588 140665 140012 137516 139183 131330 129684
> >  1024  217473 147899 210350 218526 219867 220120 219758 215166
> >   512  328260 181197 211131 263533 294009 298203 301698 298013
>
> OK, I can explain the pattern here where throughput drops off at 2-4
> threads. It's not as simple as the seqrd case, but it's related to
> the fact that this workload is an append write workload. See the
> patch description below for why that matters.
>
> As it is, the numbers I get for 16k seqwr on my hardware are as
> follows:
>
> seqwr       1     2     4     8    16
> vanilla  3072  2734  2506   not tested...
> patched  2984  4156  4922  5175  5120
>
> Looks like my hardware is topping out at ~5-6kiops no matter the
> block size here. Which, no matter how you look at it, is a
> significant improvement. ;)

My numbers include some regressions, although the improvements clearly
dominate. Below is a diff (or div) between the new kernel with both of
your patches applied and vanilla. >1 means improvement, <1 means
regression.

seqrd       1      2      4      8     16     32     64    128
16384   1.037  1.975  3.726  6.643  8.901  8.902  8.431  8.365
 8192   1.015  1.871  3.459  6.424 11.457 12.829 13.542 13.490
 4096   1.009  1.790  2.942  5.179  9.634 16.667 17.652 17.666
 2048   1.005  1.709  2.525  4.196  7.479 14.022 22.032 22.100
 1024   1.017  1.624  2.328  3.587  6.112 11.365 20.311 21.315
  512   1.012  1.829  2.374  3.365  5.352  9.459 16.809 18.771

rndrd       1      2      4      8     16     32     64    128
16384   1.042  1.037  1.036  1.043  1.051  1.011  1.002  1.001
 8192   1.020  1.020  1.028  1.040  1.057  1.064  1.002  1.001
 4096   1.011  1.007  1.021  1.036  1.059  1.086  1.021  1.001
 2048   1.002  1.010  1.018  1.025  1.057  1.100  1.098  1.003
 1024   1.001  1.002  1.023  1.007  1.072  1.112  1.162  1.102
  512   0.998  1.010  1.004  1.035  1.088  1.121  1.156  1.127

seqwr       1      2      4      8     16     32     64    128
16384   0.942  0.949  0.942  0.945  1.017  1.004  1.030  1.172
 8192   1.144  1.177  2.517  2.687  2.611  3.091  3.246  2.741
 4096   1.389  1.506  4.228  6.443  9.313  8.064  5.276  5.394
 2048   1.139  1.278  1.080  1.076  1.094  1.087  1.142  1.148
 1024   0.852  1.190  0.806  0.783  0.776  0.774  0.769  0.774
  512   0.709  1.273  1.055  0.847  0.758  0.744  0.738  0.746

rndwr       1      2      4      8     16     32     64    128
16384   1.013  1.003  1.002  1.005  1.007  1.006  1.003  1.002
 8192   1.023  1.005  1.007  1.006  1.006  1.004  1.004  1.006
 4096   1.020  1.007  1.007  1.007  1.007  1.007  1.008  1.007
 2048   0.901  1.017  1.007  1.008  1.008  1.009  1.008  1.007
 1024   0.848  0.949  1.003  0.990  1.001  1.006  1.006  1.005
  512   0.821  0.833  0.948  0.956  0.935  0.929  0.921  0.914

Raw results:

seqrd       1      2      4      8     16     32     64    128
16384    4873   8738  16382  29241  39111  39152  39137  39140
 8192    6326  10900  20054  37263  66391  78437  78449  78404
 4096    9181  15816  26130  46073  85492 148172 157276 157329
 2048   14995  24588  36009  59790 106685 200012 314373 315440
 1024   24248  36841  51972  80207 136529 253175 451709 475353
  512   37813  62164  79048 112175 178246 314959 558458 624534

rndrd       1      2      4      8     16     32     64    128
16384    4778   8554  14724  23507  33666  39065  39109  39104
 8192    6152  11409  20862  35814  56123  75776  78370  78380
 4096    8335  15643  29662  53953  91867 136643 157314 157325
 2048   11973  22885  43474  81545 148087 239997 314198 315680
 1024   16547  31345  61123 113283 222737 387234 562457 632767
  512   20590  40134  73333 135621 294448 543117 793329 818861

seqwr       1      2      4      8     16     32     64    128
16384   37629  37651  37667  37711  37658  37674  37687  37727
 8192   77691  77747  77948  78017  77940  77931  77847  77488
 4096  123997 122607 121219 120394 120301 119908 119457 119939
 2048  159816 154063 151987 150608 150449 151298 150016 148852
 1024  185215 175977 169562 171078 170649 170420 169076 166614
  512  232890 230669 222830 223140 222877 221812 222588 222369

rndwr       1      2      4      8     16     32     64    128
16384   38944  38256  38227  38312  38438  38432  38331  38313
 8192   79773  77378  77453  77425  77473  77500  77458  77535
 4096  163925 157167 158258 158192 158244 158281 158229 158252
 2048  293295 322480 320206 321022 321375 321926 322298 322558
 1024  368010 616516 652359 645514 654715 658132 659513 659125
  512  411236 730015 1223437 1164632 1163705 1178235 1184450 1186594

Jörn

-- 
Ninety percent of everything is crap.
	-- Sturgeon's Law
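As an aside on how such comparison tables can be produced: each entry is
simply the patched result divided by the corresponding vanilla result for
the same block size and thread count. A rough sketch of that arithmetic,
using made-up sample numbers rather than figures from any of the runs
above:

/* ratio_table.c - element-wise division of patched vs vanilla results.
 * The two sample rows are invented numbers, not data from this thread. */
#include <stdio.h>

#define NTHREADS 8

int main(void)
{
        static const int threads[NTHREADS] = { 1, 2, 4, 8, 16, 32, 64, 128 };
        static const double vanilla[NTHREADS] =
                { 5000, 9000, 16000, 29000, 39000, 39000, 39000, 39000 };
        static const double patched[NTHREADS] =
                { 5100, 9300, 17000, 30500, 40200, 41000, 40800, 40500 };

        printf("%8s", "threads");
        for (int i = 0; i < NTHREADS; i++)
                printf("%8d", threads[i]);
        printf("\n%8s", "ratio");
        for (int i = 0; i < NTHREADS; i++)
                printf("%8.3f", patched[i] / vanilla[i]);  /* >1 = improvement */
        printf("\n");
        return 0;
}

Anything printed above 1.000 is an improvement and anything below a
regression, following the same convention as the tables in this thread.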
* Re: Filesystem benchmarks on reasonably fast hardware
  2011-07-21 10:42                 ` Jörn Engel
@ 2011-07-22 18:51                   ` Jörn Engel
  0 siblings, 0 replies; 21+ messages in thread
From: Jörn Engel @ 2011-07-22 18:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-fsdevel

On Thu, 21 July 2011 12:42:46 +0200, Jörn Engel wrote:
> On Tue, 19 July 2011 23:19:58 +1000, Dave Chinner wrote:
> > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote:
> >

[ Crap ] I had tested ext4 with two xfs patches. Try these numbers
instead. Both patches have my endorsement. Excellent work!

seqrd       1      2      4      8     16     32     64    128
16384   1.000  1.880  3.456  6.297  8.727  8.703  8.271  8.208
 8192   1.001  1.811  3.304  6.153 10.061 12.567 13.248 12.077
 4096   1.001  1.752  2.832  4.968  9.199 15.937 17.228 17.139
 2048   1.001  1.689  2.459  4.053  7.152 13.241 21.565 21.694
 1024   1.011  1.619  2.296  3.521  5.935 10.849 19.649 27.848
  512   1.008  1.825  2.371  3.310  5.230  9.146 16.591 27.234

rndrd       1      2      4      8     16     32     64    128
16384   1.003  1.005  1.009  1.021  1.032  1.009  1.001  1.001
 8192   1.002  1.004  1.013  1.024  1.041  1.051  1.001  1.001
 4096   1.003  1.004  1.013  1.027  1.049  1.071  1.020  1.000
 2048   1.004  1.010  1.019  1.011  1.052  1.091  1.091  1.002
 1024   1.003  1.009  1.028  1.027  1.068  1.109  1.155  1.099
  512   1.002  1.014  1.016  1.044  1.083  1.125  1.196  1.236

seqwr       1      2      4      8     16     32     64    128
16384   1.003  1.001  0.981  0.953  0.995  0.947  1.057  1.203
 8192   0.999  1.048  2.120  2.060  1.799  1.991  2.093  1.998
 4096   0.991  1.074  2.901  3.878  5.218  4.030  2.358  2.601
 2048   1.005  1.273  1.058  1.077  1.112  1.123  1.137  1.161
 1024   0.999  1.605  1.147  1.059  1.059  1.047  1.064  1.069
  512   0.947  1.978  1.618  1.317  1.181  1.156  1.149  1.134

rndwr       1      2      4      8     16     32     64    128
16384   1.000  0.999  1.000  1.001  1.000  1.000  1.001  0.999
 8192   0.999  1.000  1.000  1.001  1.000  1.001  1.001  1.003
 4096   0.997  0.998  1.000  1.000  1.001  1.000  1.001  1.000
 2048   1.002  1.001  1.001  1.003  1.001  1.002  1.000  1.000
 1024   0.998  1.001  1.000  1.001  1.000  1.001  0.999  1.001
  512   1.044  0.999  1.003  1.001  1.001  1.001  1.002  0.998

seqrd       1      2      4      8     16     32     64    128
16384    4700   8316  15197  27721  38348  38277  38394  38406
 8192    6241  10551  19156  35692  58304  76835  76743  70192
 4096    9110  15477  25155  44196  81632 141681 153499 152642
 2048   14942  24309  35063  57754 102009 188865 307705 309641
 1024   24104  36724  51278  78737 132577 241681 437003 621032
  512   37646  62022  78943 110334 174203 304532 551212 906087

rndrd       1      2      4      8     16     32     64    128
16384    4598   8288  14352  22999  33051  38977  39072  39086
 8192    6042  11233  20566  35279  55300  74863  78278  78359
 4096    8268  15604  29428  53514  91016 134799 157045 157144
 2048   11997  22877  43550  80430 147372 237967 312170 315369
 1024   16578  31577  61419 115548 221986 386119 558797 631441
  512   20668  40293  74185 136774 293068 545050 820771 897897

seqwr       1      2      4      8     16     32     64    128
16384   40074  39718  39198  38027  36846  35562  38659  38726
 8192   67896  69240  65628  59807  53713  50181  50208  56486
 4096   88439  87416  83167  72468  67401  59932  53383  57845
 2048  141003 153543 148813 150740 152966 156238 149370 150576
 1024  217311 237402 241186 231341 232902 230429 233877 230095
  512  310980 358427 341578 347183 347281 344722 346779 337970

rndwr       1      2      4      8     16     32     64    128
16384   38436  38112  38154  38161  38174  38208  38250  38197
 8192   77890  76972  76938  76993  77031  77255  77274  77301
 4096  160246 155612 157142 157090 157213 157081 157193 157160
 2048  326008 317372 318089 319273 318994 319596 319773 320299
 1024  433107 650226 649868 652195 653764 654760 655299 656246
  512  523091 875267 1294281 1218935 1245993 1269267 1287429 1296046

Jörn

-- 
Fools ignore complexity. Pragmatists suffer it. Some can avoid it.
Geniuses remove it.
	-- Perlis's Programming Proverb #58, SIGPLAN Notices, Sept. 1982