All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tomasz Chmielewski <mangoo@wpkg.org>
To: Theodore Tso <tytso@mit.edu>, Andi Kleen <andi@firstfloor.org>,
	LKML <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: very poor ext3 write performance on big filesystems?
Date: Wed, 27 Feb 2008 12:20:19 +0100	[thread overview]
Message-ID: <47C54773.4040402@wpkg.org> (raw)
In-Reply-To: <20080218141640.GC12568@mit.edu>

Theodore Tso schrieb:
> On Mon, Feb 18, 2008 at 03:03:44PM +0100, Andi Kleen wrote:
>> Tomasz Chmielewski <mangoo@wpkg.org> writes:
>>> Is it normal to expect the write speed go down to only few dozens of
>>> kilobytes/s? Is it because of that many seeks? Can it be somehow
>>> optimized? 
>> I have similar problems on my linux source partition which also
>> has a lot of hard linked files (although probably not quite
>> as many as you do). It seems like hard linking prevents
>> some of the heuristics ext* uses to generate non fragmented
>> disk layouts and the resulting seeking makes things slow.

A follow-up to this thread.
Using small optimizations like playing with /proc/sys/vm/* didn't help 
much, increasing "commit=" ext3 mount option helped only a tiny bit.

What *did* help a lot was... disabling the internal bitmap of the RAID-5 
array. "rm -rf" doesn't "pause" for several seconds any more.

If md and dm supported barriers, it would be even better I guess (I 
could enable write cache with some degree of confidence).


This is "iostat sda -d 10" output without the internal bitmap.
The system mostly tries to read (Blk_read/s), and once in a while it
does a big commit (Blk_wrtn/s):

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             164,67      2088,62         0,00      20928          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             180,12      1999,60         0,00      20016          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             172,63      2587,01         0,00      25896          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             156,53      2054,64         0,00      20608          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             170,20      3013,60         0,00      30136          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             119,46      1377,25      5264,67      13800      52752

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             154,05      1897,10         0,00      18952          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             197,70      2177,02         0,00      21792          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             166,47      1805,19         0,00      18088          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             150,95      1552,05         0,00      15536          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             158,44      1792,61         0,00      17944          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             132,47      1399,40      3781,82      14008      37856



With the bitmap enabled, it sometimes behave similarly, but mostly, I
can see as reads compete with writes, and both have very low numbers then:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             112,57       946,11      5837,13       9480      58488

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             157,24      1858,94         0,00      18608          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             116,90      1173,60        44,00      11736        440

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              24,05        85,43       172,46        856       1728

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              25,60        90,40       165,60        904       1656

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              25,05       276,25       180,44       2768       1808

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22,70        65,60       229,60        656       2296

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              21,66       202,79       786,43       2032       7880

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              20,90        83,20      1800,00        832      18000

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              51,75       237,36       479,52       2376       4800

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              35,43       129,34       245,91       1296       2464

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              34,50        88,00       270,40        880       2704


Now, let's disable the bitmap in the RAID-5 array:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             110,59       536,26       973,43       5368       9744

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             119,68       533,07      1574,43       5336      15760

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             123,78       368,43      2335,26       3688      23376

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             122,48       315,68      1990,01       3160      19920

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             117,08       580,22      1009,39       5808      10104

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             119,50       324,00      1080,80       3240      10808

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             118,36       353,69      1926,55       3544      19304


And let's enable it again - after a while, it degrades again, and I can
see "rm -rf" stops for longer periods:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             162,70      2213,60         0,00      22136          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             165,73      1639,16         0,00      16408          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             119,76      1192,81      3722,16      11952      37296

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             178,70      1855,20         0,00      18552          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             162,64      1528,07         0,80      15296          8

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             182,87      2082,07         0,00      20904          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             168,93      1692,71         0,00      16944          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             177,45      1572,06         0,00      15752          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             123,10      1436,00      4941,60      14360      49416

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             201,30      1984,03         0,00      19880          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             165,50      1555,20        22,40      15552        224

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              25,35       273,05       189,22       2736       1896

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22,58        63,94       165,43        640       1656

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              69,40       435,20       262,40       4352       2624



There is a related thread (although not much kernel-related) on a 
BackupPC mailing list:

http://thread.gmane.org/gmane.comp.sysutils.backup.backuppc.general/14009

As it's BackupPC software which makes this amount of hardlinks (but hey, 
I can keep ~14 TB of data on a 1.2 TB filesystem which is not even 65% 
full).


-- 
Tomasz Chmielewski
http://wpkg.org

  parent reply	other threads:[~2008-02-27 11:20 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-18 12:57 very poor ext3 write performance on big filesystems? Tomasz Chmielewski
2008-02-18 14:03 ` Andi Kleen
2008-02-18 14:16   ` Theodore Tso
2008-02-18 15:02     ` Tomasz Chmielewski
2008-02-18 15:16       ` Theodore Tso
2008-02-18 15:57         ` Andi Kleen
2008-02-18 15:35           ` Theodore Tso
2008-02-20 10:57             ` Jan Engelhardt
2008-02-20 17:44               ` David Rees
2008-02-20 18:08                 ` Jan Engelhardt
2008-02-18 16:16         ` Tomasz Chmielewski
2008-02-18 18:45           ` Theodore Tso
2008-02-18 15:18     ` Andi Kleen
2008-02-18 15:03       ` Theodore Tso
2008-02-19 14:54     ` Tomasz Chmielewski
2008-02-19 15:06       ` Chris Mason
2008-02-19 15:21         ` Tomasz Chmielewski
2008-02-19 16:04           ` Chris Mason
2008-02-19 18:29     ` Mark Lord
2008-02-19 18:41       ` Mark Lord
2008-02-19 18:58       ` Paulo Marques
2008-02-19 22:33         ` Mark Lord
2008-02-27 11:20     ` Tomasz Chmielewski [this message]
2008-02-27 20:03       ` Andreas Dilger
2008-02-27 20:25         ` Tomasz Chmielewski
2008-03-01 20:04         ` Bill Davidsen
2008-02-19  9:24 ` Vladislav Bolkhovitin
     [not found] <9YdLC-75W-51@gated-at.bofh.it>
     [not found] ` <9YeRh-Gq-39@gated-at.bofh.it>
     [not found]   ` <9Yf0W-SX-19@gated-at.bofh.it>
     [not found]     ` <9YfNi-2da-23@gated-at.bofh.it>
     [not found]       ` <9YfWL-2pZ-1@gated-at.bofh.it>
     [not found]         ` <9Yg6H-2DJ-23@gated-at.bofh.it>
2008-02-19 13:14           ` Paul Slootman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47C54773.4040402@wpkg.org \
    --to=mangoo@wpkg.org \
    --cc=andi@firstfloor.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.