linux-btrfs.vger.kernel.org archive mirror
* abysmal rm performance?
From: Tomasz Chmielewski @ 2013-07-20  5:37 UTC (permalink / raw)
  To: linux-btrfs

I'm using 3.10 with a btrfs filesystem with RAID-1 (on two drives),
with extended inode refs and skinny metadata extent refs enabled (-r
and -x options in btrfstune).

Server has 32 GB RAM.

Filesystem is mounted with noatime,compress-force=zlib mount options.


btrfs performs really, really poorly when removing files.


Some examples - removing files for 10 seconds, repeated 10 times in a
row.
Each time, we measure how many files were removed, and the amount of
memory still waiting to be written to disk after the rm operation
("Dirty" from /proc/meminfo):

TIMEOUT=10s
sync
timeout $TIMEOUT rm -rfv trash_dir/ &>/tmp/rmout.log
wc -l /tmp/rmout.log
grep Dirty /proc/meminfo 
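
The measurement above can be wrapped in a small function so the ten
rounds need no retyping. This is only a sketch: the names dirty_kb and
rm_round, the optional meminfo path argument, and the temporary log
file are illustrative assumptions, not part of the original test.

```shell
# Sketch only: function names and the temp log are illustrative assumptions.

# Print the "Dirty" value (in kB) from a meminfo-style file.
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }' "${1:-/proc/meminfo}"
}

# Remove files under directory $1 for at most $2 (e.g. 10s), then report
# how many lines "rm -v" logged (files and directories alike) and how
# much dirty memory is left afterwards.
rm_round() {
    local dir=$1 limit=$2 log
    log=$(mktemp)
    sync
    timeout "$limit" rm -rfv "$dir" >"$log" 2>&1
    printf 'Removed files:    %s\n' "$(wc -l <"$log")"
    printf 'Dirty:            %s kB\n' "$(dirty_kb)"
    rm -f "$log"
}

# Usage (destructive, so left commented out):
# for i in $(seq 10); do rm_round trash_dir/ 10s; echo; done
```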


Removed files:    4319
Dirty:            211956 kB

Removed files:    3392
Dirty:            190764 kB

Removed files:    4011
Dirty:            174636 kB

Removed files:    5197
Dirty:            191500 kB

Removed files:    6395
Dirty:            202532 kB

Removed files:    4613
Dirty:            354764 kB

Removed files:    5469
Dirty:            170664 kB

Removed files:    4654
Dirty:            170876 kB

Removed files:    2245
Dirty:            152108 kB

Removed files:    2214
Dirty:            149848 kB


Compare it to ext4 - note "Dirty" is at least an order of magnitude
lower for ext4:

Removed files:      7346
Dirty:              4896 kB

Removed files:      11770
Dirty:              3536 kB

Removed files:      4266
Dirty:              80 kB

Removed files:      7541
Dirty:              4164 kB

Removed files:      8046
Dirty:              5428 kB

Removed files:      9630
Dirty:              5884 kB

Removed files:      14276
Dirty:              8384 kB

Removed files:      34234
Dirty:              10968 kB

Removed files:      10594
Dirty:              4348 kB

Removed files:      22672
Dirty:              4164 kB



File removal is actually quite fast until we reach around 350000 kB in
"Dirty" (this is with 32 GB RAM). Then, it's super slow.
Let's see what happens if we remove the files for 1 minute (above was
for just 10 secs):

btrfs:

Removed files:      18360
Dirty:              98276 kB

Removed files:      9913
Dirty:              60664 kB

Removed files:      10973
Dirty:              62284 kB

Removed files:      16606
Dirty:              275156 kB

Removed files:      13002
Dirty:              165844 kB

Removed files:      8349
Dirty:              178448 kB

Removed files:      20316
Dirty:              394912 kB

Removed files:      19109
Dirty:              321252 kB

Removed files:      22738
Dirty:              277964 kB

Removed files:      15288
Dirty:              41400 kB



ext4:

Removed files:      91714
Dirty:              7060 kB

Removed files:      79574
Dirty:              400 kB

Removed files:      105167
Dirty:              5384 kB

Removed files:      37123
Dirty:              25572 kB

Removed files:      94048
Dirty:              13708 kB

Removed files:      149079
Dirty:              48592 kB

Removed files:      136770
Dirty:              528 kB

Removed files:      169513
Dirty:              21024 kB

Removed files:      171877
Dirty:              1936 kB

Removed files:      95442
Dirty:              7780 kB



So it looks like removing files with btrfs needs many more metadata
updates?
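
To put a number on the gap, the "Removed files" lines can be averaged
and divided by the run length. A rough sketch, assuming the results
were saved to text files in the format shown above; the function name
rm_rate and the file names in the usage lines are hypothetical.

```shell
# Average the "Removed files" counts read from stdin and convert the
# mean to files per second. $1 is the length of each run in seconds.
rm_rate() {
    awk -v secs="$1" '
        /^Removed files:/ { total += $3; runs++ }
        END {
            if (runs == 0) { print "no runs found"; exit 1 }
            printf "runs=%d mean=%.1f files_per_sec=%.1f\n",
                   runs, total / runs, total / runs / secs
        }'
}

# Usage, e.g.:
#   rm_rate 10 <btrfs-10s.txt
#   rm_rate 10 <ext4-10s.txt
```

For the 10-second runs listed above, this arithmetic comes to roughly
425 removals per second on btrfs against roughly 1300 on ext4.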


-- 
Tomasz Chmielewski
http://wpkg.org


* Re: abysmal rm performance?
From: Duncan @ 2013-07-20 12:54 UTC (permalink / raw)
  To: linux-btrfs

Tomasz Chmielewski posted on Sat, 20 Jul 2013 13:37:26 +0800 as excerpted:

> So it looks like removing files with btrfs [as opposed to ext4] needs
> many more metadata updates?

You /really/ need to read up on the btrfs wiki.

The short answer is yes, btrfs does a LOT more metadata processing due to
the checksumming it does by default.  (Consider that it must have all the
metadata from a leaf available in order to rechecksum it when one
file's metadata from that leaf gets deleted.)  Additionally, btrfs keeps
two copies of metadata by default, in raid1 mode if there are multiple
devices (btrfs raid1), DUP mode if not (other forms of raid, which would
appear to btrfs as a single device).

Then there's the whole problem that you didn't provide nearly enough 
information about your test to tell what it was actually comparing.  What 
sort of raid1, btrfs/md/dm/hardware/what, and if btrfs raid1, was that 
for both data and metadata or just one of the two and what was the other 
one if they weren't both raid1?  And if you were testing btrfs raid1, 
what did you do with the ext4 test to try to make it comparable since 
ext4 doesn't have a native raid1 mode, or was it on a single device?

So... read up on the wiki a bit, then come back with questions you have 
that aren't answered there.  (I certainly had some I didn't see directly 
answered there when I first started with btrfs.)

https://btrfs.wiki.kernel.org/

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: abysmal rm performance?
From: Clemens Eisserer @ 2013-07-20 13:36 UTC (permalink / raw)
  To: linux-btrfs

> So... read up on the wiki a bit, then come back with questions you have
> that aren't answered there.  (I certainly had some I didn't see directly
> answered there when I first started with btrfs.)

I guess the original email was meant more as a bug report than a
question, since the question was more like "can it really be that
slow?".
The wiki most likely won't help explain or solve the high metadata
overhead either...

Regards


* Re: abysmal rm performance?
From: Tomasz Chmielewski @ 2013-07-22  5:22 UTC (permalink / raw)
  To: linux-btrfs, 1i5t5.duncan, linuxhippy

> You /really/ need to read up on the btrfs wiki.
> 
> The short answer is yes, btrfs does a LOT more metadata processing
> due to the checksumming it does by default.

According to the wiki, checksumming has barely any influence, so I
guess the above advice is not really helpful?

https://btrfs.wiki.kernel.org/index.php/Mount_options

	nodatasum
	(...)
	On most modern CPUs this option does not result in any
	reasonable performance improvement.


> Then there's the whole problem that you didn't provide nearly enough
> information about your test to tell what it was actually comparing.
> What sort of raid1, btrfs/md/dm/hardware/what, and if btrfs raid1, was
> that for both data and metadata or just one of the two and what was
> the other one if they weren't both raid1?  And if you were testing
> btrfs raid1, what did you do with the ext4 test to try to make it
> comparable since ext4 doesn't have a native raid1 mode, or was it on
> a single device?

ext4: using md RAID


btrfs:

Data, RAID1: total=1.73TB, used=1.36TB
System, RAID1: total=32.00MB, used=264.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=79.00GB, used=70.23GB


Quite high metadata usage here.
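
As a quick worked check of "quite high", the lines above (which look
like `btrfs filesystem df` output) can be turned into fill percentages.
This is a sketch under assumptions: the function name chunk_fill is
made up for illustration, the input is assumed to have exactly the
format shown above, and the unit suffixes are assumed to be binary
multiples (1 MB = 1024 KB).

```shell
# Read "btrfs filesystem df"-style lines from stdin and print, for each
# chunk type, the percentage of allocated space that is in use.
chunk_fill() {
    awk '
        # Convert a field such as "total=1.73TB" to a number of KB.
        function to_kb(s,   n, u) {
            sub(/^[a-z]+=/, "", s)        # drop "total=" or "used="
            n = s + 0                     # numeric prefix, e.g. 1.73
            u = s; gsub(/[0-9.]/, "", u)  # unit suffix, e.g. TB
            if (u == "TB") return n * 1024 * 1024 * 1024
            if (u == "GB") return n * 1024 * 1024
            if (u == "MB") return n * 1024
            return n                      # KB, or no unit at all
        }
        /total=/ {
            split($0, part, ": ")         # label, then "total=..., used=..."
            split(part[2], kv, ", ")
            total = to_kb(kv[1]); used = to_kb(kv[2])
            if (total > 0)
                printf "%s: %.1f%% used\n", part[1], 100 * used / total
        }'
}

# Usage (the mount point is a placeholder):
#   btrfs filesystem df /mnt/btrfs | chunk_fill
```

For the listing above, the Metadata line works out to about 88.9% of
the allocated 79.00GB in use, and the Data line to about 78.6%.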


The filesystems on ext4 and btrfs are copies; there are >30 million
inodes on ext4; most of the files have multiple hardlinks.


So paraphrasing my question: is there anything to improve "rm"
performance with btrfs?

"nodatacow" might help a bit, but then, it disables compression,
which is a major drawback.


-- 
Tomasz Chmielewski
http://wpkg.org



* Re: abysmal rm performance?
From: Duncan @ 2013-07-22 10:39 UTC (permalink / raw)
  To: linux-btrfs

Tomasz Chmielewski posted on Mon, 22 Jul 2013 12:22:11 +0700 as excerpted:

>> You /really/ need to read up on the btrfs wiki.
>> 
>> The short answer is yes, btrfs does a LOT more metadata processing due
>> to the checksumming it does by default.
> 
> According to the wiki, checksumming has barely any influence, so I guess
> the above advice is not really helpful?
> 
> https://btrfs.wiki.kernel.org/index.php/Mount_options
> 
> 	nodatasum (...)
> 	On most modern CPUs this option does not result in any reasonable
> 	performance improvement.

It's worth noting that in the context of the full description, that's 
referencing data write performance as that's where the checksumming would 
be done and the CPU performance would matter, not really delete 
performance, where the bottleneck is likely to be the storage device seek 
times.

However, being a user not a btrfs dev, and not having actually tested it, 
what I do NOT know is whether that option disables just the calculation, 
so the same seeks would be done and the same "unmetadata" (given the file 
was written with nodatasum) would be erased in any case, or if it short 
circuits the entire process.

It might be worth some benchmarks to see...

> btrfs:
> 
> Data, RAID1: total=1.73TB, used=1.36TB
> System, RAID1: total=32.00MB, used=264.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=79.00GB, used=70.23GB
> 
> 
> Quite high metadata usage here.

Yes.  It's worth noting, however, that btrfs does store small files 
directly in the inode metadata itself, rather than in separate data 
extents.  So that can be considered too and may be part of it.

> The filesystems on ext4 and btrfs are copies; there are >30 million
> inodes on ext4; most of the files have multiple hardlinks.

Hardlinks:  Until recently btrfs had problems if there were too many
hardlinks in a directory.  They fixed that, but if you're doing a LOT of
hardlinking, it may well be that this is playing some part, as I don't
know how performant the new code is.  It may be worth reading the list
archives on that topic.

> So paraphrasing my question: is there anything to improve "rm"
> performance with btrfs?
> 
> "nodatacow" might help a bit, but then, it disables compression,
> which is a major drawback.

I have a strong suspicion nobarrier may help quite a bit with high-number 
delete loads, tho of course it DOES come with data corruption risks in 
the event of a power failure.

It's also likely that this will get better as the number of bugs goes
down, as it is beginning to now, and the devs focus more on performance
tuning.

Other than that, and of course the hardware/ssd option (I'm using btrfs 
in btrfs raid1 mode on a pair of ssds here and the zero-seek-time DOES 
make a difference, but I'm not doing terabytes of data either; that's 
still on reiserfs on spinning rust, here), it may simply be that btrfs 
isn't a filesystem choice well matched to your needs.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


