* Btrfs + compression = slow performance and high cpu usage
[not found] <33040946.535.1501254718807.JavaMail.gkos@dynomob>
@ 2017-07-28 16:40 ` Konstantin V. Gavrilenko
2017-07-28 17:48 ` Roman Mamedov
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Konstantin V. Gavrilenko @ 2017-07-28 16:40 UTC (permalink / raw)
To: linux-btrfs
Hello list,
I am stuck with a problem of btrfs slow performance when using compression.
when the compress-force=lzo mount flag is enabled, the performance drops to 30-40 mb/s and one of the btrfs processes utilises 100% cpu time.
mount options: btrfs relatime,discard,autodefrag,compress=lzo,compress-force,space_cache=v2,commit=10
The command I am using to test the write throughput is:
# pv -tpreb /dev/sdb | dd of=./testfile bs=1M oflag=direct
# top -d 1
top - 15:49:13 up 1:52, 2 users, load average: 5.28, 2.32, 1.39
Tasks: 320 total, 2 running, 318 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 2.0 sy, 0.0 ni, 77.0 id, 21.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 1.0 sy, 0.0 ni, 90.0 id, 9.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 1.0 sy, 0.0 ni, 72.0 id, 27.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us,100.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.0 us, 1.0 sy, 0.0 ni, 57.0 id, 42.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.0 sy, 0.0 ni, 96.0 id, 4.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.0 us, 0.0 sy, 0.0 ni, 94.0 id, 6.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 1.0 sy, 0.0 ni, 95.1 id, 3.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8 : 1.0 us, 2.0 sy, 0.0 ni, 24.0 id, 73.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9 : 0.0 us, 0.0 sy, 0.0 ni, 81.8 id, 18.2 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 1.0 us, 0.0 sy, 0.0 ni, 98.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 0.0 us, 2.0 sy, 0.0 ni, 83.3 id, 14.7 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 32934136 total, 10137496 free, 602244 used, 22194396 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 30525664 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
37017 root 20 0 0 0 0 R 100.0 0.0 0:32.42 kworker/u49:8
36732 root 20 0 0 0 0 D 4.0 0.0 0:02.40 btrfs-transacti
40105 root 20 0 8388 3040 2000 D 4.0 0.0 0:02.88 dd
The kworker process that causes the high CPU usage is most likely searching for free space.
# echo l > /proc/sysrq-trigger
# dmesg -T
[Fri Jul 28 15:57:51 2017] CPU: 1 PID: 36430 Comm: kworker/u49:2 Not tainted 4.10.0-28-generic #32~16.04.2-Ubuntu
[Fri Jul 28 15:57:51 2017] Hardware name: Supermicro X8DTL/X8DTL, BIOS 2.1b 11/16/2012
[Fri Jul 28 15:57:51 2017] Workqueue: btrfs-delalloc btrfs_delalloc_helper [btrfs]
[Fri Jul 28 15:57:51 2017] task: ffff9ddce6206a40 task.stack: ffffaa9121f6c000
[Fri Jul 28 15:57:51 2017] RIP: 0010:rb_next+0x1e/0x40
[Fri Jul 28 15:57:51 2017] RSP: 0018:ffffaa9121f6fb40 EFLAGS: 00000282
[Fri Jul 28 15:57:51 2017] RAX: ffff9dddc34df1b0 RBX: 0000000000010000 RCX: 0000000000001000
[Fri Jul 28 15:57:51 2017] RDX: ffff9dddc34df708 RSI: ffff9ddccaf470a4 RDI: ffff9dddc34df2d0
[Fri Jul 28 15:57:51 2017] RBP: ffffaa9121f6fb40 R08: 0000000000000001 R09: 0000000000003000
[Fri Jul 28 15:57:51 2017] R10: 0000000000000000 R11: 0000000000020000 R12: ffff9ddccaf47080
[Fri Jul 28 15:57:51 2017] R13: 0000000000001000 R14: ffffaa9121f6fc50 R15: ffff9dddc34df2d0
[Fri Jul 28 15:57:51 2017] FS: 0000000000000000(0000) GS:ffff9ddcefa40000(0000) knlGS:0000000000000000
[Fri Jul 28 15:57:51 2017] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Jul 28 15:57:51 2017] Call Trace:
[Fri Jul 28 15:57:51 2017] btrfs_find_space_for_alloc+0xde/0x270 [btrfs]
[Fri Jul 28 15:57:51 2017] find_free_extent.isra.68+0x3c6/0x1040 [btrfs]
[Fri Jul 28 15:57:51 2017] btrfs_reserve_extent+0xab/0x210 [btrfs]
[Fri Jul 28 15:57:51 2017] submit_compressed_extents+0x154/0x580 [btrfs]
[Fri Jul 28 15:57:51 2017] ? submit_compressed_extents+0x580/0x580 [btrfs]
[Fri Jul 28 15:57:51 2017] async_cow_submit+0x82/0x90 [btrfs]
[Fri Jul 28 15:57:51 2017] btrfs_scrubparity_helper+0x1fe/0x300 [btrfs]
[Fri Jul 28 15:57:51 2017] btrfs_delalloc_helper+0xe/0x10 [btrfs]
[Fri Jul 28 15:57:51 2017] process_one_work+0x16b/0x4a0
[Fri Jul 28 15:57:51 2017] worker_thread+0x4b/0x500
[Fri Jul 28 15:57:51 2017] kthread+0x109/0x140
When the compression is turned off, I am able to get the maximum 500-600 mb/s write speed on this disk (raid array) with minimal cpu usage.
mount options: relatime,discard,autodefrag,space_cache=v2,commit=10
# iostat -m 1
avg-cpu: %user %nice %system %iowait %steal %idle
0.08 0.00 7.74 10.77 0.00 81.40
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 2376.00 0.00 594.01 0 594
I have tried mounting with nospace_cache and clear_cache, and then rebuilding with space_cache=v2,
but it doesn't make any difference. The same sluggish performance is seen when I write over NFS.
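For reference, the cache rebuild was done roughly like this (options abbreviated, from memory):
# umount /mnt/arh-backup1
# mount -o clear_cache,space_cache=v2 /dev/sda /mnt/arh-backup1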
Any ideas why the compression makes such a big difference and causes a bottleneck?
# uname -a
Linux backup1 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
# btrfs --version
btrfs-progs v4.8.1
# cat /etc/issue
Ubuntu 16.04.2 LTS \n \l
# btrfs fi show
Label: none uuid: f56bdc4a-239d-4268-81d8-01cdd7a3c1c9
Total devices 1 FS bytes used 9.32TiB
devid 2 size 21.83TiB used 9.33TiB path /dev/sda
# btrfs fi df /mnt/arh-backup1/
Data, single: total=9.28TiB, used=9.28TiB
System, single: total=32.00MiB, used=1.00MiB
Metadata, single: total=46.00GiB, used=44.20GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
# btrfs device usage /mnt/arh-backup1/
/dev/sda, ID: 2
Device size: 21.83TiB
Device slack: 0.00B
Data,single: 9.29TiB
Metadata,single: 46.00GiB
System,single: 32.00MiB
Unallocated: 12.49TiB
thanks in advance.
kos
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-28 16:40 ` Btrfs + compression = slow performance and high cpu usage Konstantin V. Gavrilenko
@ 2017-07-28 17:48 ` Roman Mamedov
2017-07-28 18:20 ` William Muriithi
2017-07-28 18:08 ` Peter Grandi
2017-07-28 18:44 ` Peter Grandi
2 siblings, 1 reply; 17+ messages in thread
From: Roman Mamedov @ 2017-07-28 17:48 UTC (permalink / raw)
To: Konstantin V. Gavrilenko; +Cc: linux-btrfs
On Fri, 28 Jul 2017 17:40:50 +0100 (BST)
"Konstantin V. Gavrilenko" <k.gavrilenko@arhont.com> wrote:
> Hello list,
>
> I am stuck with a problem of btrfs slow performance when using compression.
>
> when the compress-force=lzo mount flag is enabled, the performance drops to 30-40 mb/s and one of the btrfs processes utilises 100% cpu time.
> mount options: btrfs relatime,discard,autodefrag,compress=lzo,compress-force,space_cache=v2,commit=10
It does not work like that: you need to set compress-force=lzo (and remove
compress=).
With your setup I believe you currently use compress-force[=zlib](default),
overriding compress=lzo, since it's later in the options order.
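I.e. something along these lines (mount point and the rest of your options assumed unchanged; a remount is enough to switch compression):
# mount -o remount,compress-force=lzo /mnt/arh-backup1
# grep arh-backup1 /proc/mounts    # check which compression option actually took effect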
Secondly,
> autodefrag
This sure sounded like a good thing to enable? on paper? right?...
The moment you see anything remotely weird about btrfs, this is the first
thing you have to disable and retest without. Oh wait, the first would be
qgroups, this one is second.
Finally, what is the reasoning behind "commit=10", and did you check with the
default value of 30?
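Easy enough to flip back for a test (path assumed):
# mount -o remount,commit=30 /mnt/arh-backup1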
--
With respect,
Roman
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-28 16:40 ` Btrfs + compression = slow performance and high cpu usage Konstantin V. Gavrilenko
2017-07-28 17:48 ` Roman Mamedov
@ 2017-07-28 18:08 ` Peter Grandi
2017-07-30 13:42 ` Konstantin V. Gavrilenko
2017-07-28 18:44 ` Peter Grandi
2 siblings, 1 reply; 17+ messages in thread
From: Peter Grandi @ 2017-07-28 18:08 UTC (permalink / raw)
To: Linux fs Btrfs
> I am stuck with a problem of btrfs slow performance when using
> compression. [ ... ]
That to me looks like an issue with speed, not performance, and
in particular with PEBCAK issues.
As to high CPU usage, when you find a way to do both compression
and checksumming without using much CPU time, please send patches
urgently :-).
In your case the increase in CPU time is bizarre. I have the
Ubuntu 4.4 "lts-xenial" kernel and what you report does not
happen here (with a few little changes):
soft# grep 'model name' /proc/cpuinfo | sort -u
model name : AMD FX(tm)-6100 Six-Core Processor
soft# cpufreq-info | grep 'current CPU frequency'
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
soft# lsscsi | grep 'sd[ae]'
[0:0:0:0] disk ATA HFS256G32MNB-220 3L00 /dev/sda
[5:0:0:0] disk ATA ST2000DM001-1CH1 CC44 /dev/sde
soft# mkfs.btrfs -f /dev/sde3
[ ... ]
soft# mount -t btrfs -o discard,autodefrag,compress=lzo,compress-force,commit=10 /dev/sde3 /mnt/sde3
soft# df /dev/sda6 /mnt/sde3
Filesystem 1M-blocks Used Available Use% Mounted on
/dev/sda6 90048 76046 14003 85% /
/dev/sde3 237568 19 235501 1% /mnt/sde3
The above is useful context information that was "amazingly"
omitted from your report.
In dmesg I see (note the "force zlib compression"):
[327730.917285] BTRFS info (device sde3): turning on discard
[327730.917294] BTRFS info (device sde3): enabling auto defrag
[327730.917300] BTRFS info (device sde3): setting 8 feature flag
[327730.917304] BTRFS info (device sde3): force zlib compression
[327730.917313] BTRFS info (device sde3): disk space caching is enabled
[327730.917315] BTRFS: has skinny extents
[327730.917317] BTRFS: flagging fs with big metadata feature
[327730.920740] BTRFS: creating UUID tree
and the result is:
soft# pv -tpreb /dev/sda6 | time dd iflag=fullblock of=/mnt/sde3/testfile bs=1M count=10000 oflag=direct
10000+0 records in17MB/s] [==> ] 11% ETA 0:15:06
10000+0 records out
10485760000 bytes (10 GB) copied, 112.845 s, 92.9 MB/s
0.05user 9.93system 1:53.20elapsed 8%CPU (0avgtext+0avgdata 3016maxresident)k
120inputs+20496000outputs (1major+346minor)pagefaults 0swaps
9.77GB 0:01:53 [88.3MB/s] [==> ]
11%
soft# btrfs fi df /mnt/sde3/
Data, single: total=10.01GiB, used=9.77GiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=11.66MiB
GlobalReserve, single: total=16.00MiB, used=0.00B
As it was running, system CPU time was under 20% of one CPU:
top - 18:57:29 up 3 days, 19:27, 4 users, load average: 5.44, 2.82, 1.45
Tasks: 325 total, 1 running, 324 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 2.3 sy, 0.0 ni, 91.3 id, 6.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 1.3 sy, 0.0 ni, 78.5 id, 20.2 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.3 us, 5.8 sy, 0.0 ni, 81.0 id, 12.5 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu3 : 0.3 us, 3.4 sy, 0.0 ni, 91.9 id, 4.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.3 us, 10.6 sy, 0.0 ni, 55.4 id, 33.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 8120660 total, 5162236 used, 2958424 free, 4440100 buffers
KiB Swap: 0 total, 0 used, 0 free. 351848 cached Mem
PID PPID USER PR NI VIRT RES DATA %CPU %MEM TIME+ TTY COMMAND
21047 21046 root 20 0 8872 2616 1364 12.9 0.0 0:02.31 pts/3 dd iflag=fullblo+
21045 3535 root 20 0 7928 1948 460 12.3 0.0 0:00.72 pts/3 pv -tpreb /dev/s+
21019 2 root 20 0 0 0 0 1.3 0.0 0:42.88 ? [kworker/u16:1]
Of course "oflag=direct" is a rather "optimistic" option in this
context, so I tried again with something more sensible:
soft# pv -tpreb /dev/sda6 | time dd iflag=fullblock of=/mnt/sde3/testfile bs=1M count=10000 conv=fsync
10000+0 records in.4MB/s] [==> ] 11% ETA 0:14:41
10000+0 records out
10485760000 bytes (10 GB) copied, 110.523 s, 94.9 MB/s
0.03user 8.94system 1:50.71elapsed 8%CPU (0avgtext+0avgdata 3024maxresident)k
136inputs+20499648outputs (1major+348minor)pagefaults 0swaps
9.77GB 0:01:50 [90.3MB/s] [==> ] 11%
soft# btrfs fi df /mnt/sde3/
Data, single: total=7.01GiB, used=6.35GiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=15.81MiB
GlobalReserve, single: total=16.00MiB, used=0.00B
As it was running, system CPU time was, as expected, around 55-60%
on each of the 6 CPUs:
top - 18:56:03 up 3 days, 19:26, 4 users, load average: 2.31, 1.39, 0.90
Tasks: 325 total, 5 running, 320 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 57.9 sy, 0.0 ni, 28.3 id, 13.8 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 46.8 sy, 0.0 ni, 36.9 id, 16.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 72.8 sy, 0.0 ni, 13.4 id, 12.8 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu3 : 0.3 us, 63.8 sy, 0.0 ni, 17.4 id, 17.4 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu4 : 0.0 us, 53.3 sy, 0.0 ni, 29.7 id, 17.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 54.0 sy, 0.0 ni, 32.7 id, 13.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 8120660 total, 7988368 used, 132292 free, 3646496 buffers
KiB Swap: 0 total, 0 used, 0 free. 3967692 cached Mem
PID PPID USER PR NI VIRT RES DATA %CPU %MEM TIME+ TTY COMMAND
21022 2 root 20 0 0 0 0 45.2 0.0 0:19.69 ? [kworker/u16:5]
21028 2 root 20 0 0 0 0 39.9 0.0 0:27.84 ? [kworker/u16:11]
21043 2 root 20 0 0 0 0 39.9 0.0 0:04.26 ? [kworker/u16:19]
21009 2 root 20 0 0 0 0 38.2 0.0 0:24.50 ? [kworker/u16:0]
21037 2 root 20 0 0 0 0 34.2 0.0 0:17.94 ? [kworker/u16:17]
21021 2 root 20 0 0 0 0 19.9 0.0 0:14.83 ? [kworker/u16:3]
21019 2 root 20 0 0 0 0 19.3 0.0 0:29.98 ? [kworker/u16:1]
21034 2 root 20 0 0 0 0 19.3 0.0 0:28.18 ? [kworker/u16:14]
21030 2 root 20 0 0 0 0 17.9 0.0 0:24.85 ? [kworker/u16:13]
21035 2 root 20 0 0 0 0 17.6 0.0 0:20.75 ? [kworker/u16:15]
21023 2 root 20 0 0 0 0 15.0 0.0 0:28.20 ? [kworker/u16:6]
21020 2 root 20 0 0 0 0 9.6 0.0 0:27.02 ? [kworker/u16:2]
21040 3535 root 20 0 7928 1872 460 8.6 0.0 0:04.11 pts/3 pv -tpreb /dev/s+
21042 21041 root 20 0 8872 2628 1364 8.3 0.0 0:05.19 pts/3 dd iflag=fullblo+
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Btrfs + compression = slow performance and high cpu usage
2017-07-28 17:48 ` Roman Mamedov
@ 2017-07-28 18:20 ` William Muriithi
2017-07-28 18:37 ` Hugo Mills
0 siblings, 1 reply; 17+ messages in thread
From: William Muriithi @ 2017-07-28 18:20 UTC (permalink / raw)
To: Roman Mamedov, Konstantin V. Gavrilenko; +Cc: linux-btrfs@vger.kernel.org
Hi Roman,
> autodefrag
This sure sounded like a good thing to enable? on paper? right?...
The moment you see anything remotely weird about btrfs, this is the first thing you have to disable and retest without. Oh wait, the first would be qgroups, this one is second.
What's the problem with autodefrag? I am also using it, so you caught my attention when you implied that it shouldn't be used. According to the docs, it seems like one of the more mature features of the filesystem. See below for the doc I am referring to:
https://btrfs.wiki.kernel.org/index.php/Status
I am using it as I assumed it would prevent the filesystem from becoming too fragmented long term, but I never thought there was a price to pay for using it.
Regards,
William
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-28 18:20 ` William Muriithi
@ 2017-07-28 18:37 ` Hugo Mills
0 siblings, 0 replies; 17+ messages in thread
From: Hugo Mills @ 2017-07-28 18:37 UTC (permalink / raw)
To: William Muriithi
Cc: Roman Mamedov, Konstantin V. Gavrilenko,
linux-btrfs@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1485 bytes --]
On Fri, Jul 28, 2017 at 06:20:14PM +0000, William Muriithi wrote:
> Hi Roman,
>
> > autodefrag
>
> This sure sounded like a good thing to enable? on paper? right?...
>
> The moment you see anything remotely weird about btrfs, this is the first thing you have to disable and retest without. Oh wait, the first would be qgroups, this one is second.
>
> What's the problem with autodefrag? I am also using it, so you caught my attention when you implied that it shouldn't be used. According to docs, it seem like one of the very mature feature of the filesystem. See below for the doc I am referring to
>
> https://btrfs.wiki.kernel.org/index.php/Status
>
> I am using it as I assumed it could prevent the filesystem being too fragmented long term, but never thought there was price to pay for using it
It introduces additional I/O on writes, as it modifies a small area
surrounding any write or cluster of writes.
I'm not aware of it causing massive slowdowns in the way that
qgroups do in some situations.
If your system is already marginal in terms of being able to
support the I/O required, then turning on autodefrag will make things
worse (but you may be heading for _much_ worse performance in the
future as the FS becomes more fragmented -- depending on your write
patterns and use case).
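Disabling it for a test is just a remount (the mount point below is a placeholder), then rerun whatever benchmark showed the problem:
# mount -o remount,noautodefrag /mnt/yourfs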
Hugo.
--
Hugo Mills | Great oxymorons of the world, no. 6:
hugo@... carfax.org.uk | Mature Student
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-28 16:40 ` Btrfs + compression = slow performance and high cpu usage Konstantin V. Gavrilenko
2017-07-28 17:48 ` Roman Mamedov
2017-07-28 18:08 ` Peter Grandi
@ 2017-07-28 18:44 ` Peter Grandi
2 siblings, 0 replies; 17+ messages in thread
From: Peter Grandi @ 2017-07-28 18:44 UTC (permalink / raw)
To: Linux fs Btrfs
In addition to my previous "it does not happen here" comment, if
someone is reading this thread, there are some other interesting
details:
> When the compression is turned off, I am able to get the
> maximum 500-600 mb/s write speed on this disk (raid array)
> with minimal cpu usage.
No details on whether it is a parity RAID or not.
> btrfs device usage /mnt/arh-backup1/
> /dev/sda, ID: 2
> Device size: 21.83TiB
> Device slack: 0.00B
> Data,single: 9.29TiB
> Metadata,single: 46.00GiB
> System,single: 32.00MiB
> Unallocated: 12.49TiB
That's exactly 24TB of "Device size", of which around 43% is
used, and the string "backup" may suggest that the content is
backups, which may indicate very fragmented free space.
Of course compression does not help with that; in my freshly
created Btrfs volume I get, as expected:
soft# umount /mnt/sde3
soft# mount -t btrfs -o commit=10 /dev/sde3 /mnt/sde3
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sde3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 103.747 s, 101 MB/s
0.00user 11.56system 1:44.86elapsed 11%CPU (0avgtext+0avgdata 3072maxresident)k
20480672inputs+20498272outputs (1major+349minor)pagefaults 0swaps
soft# filefrag /mnt/sde3/testfile
/mnt/sde3/testfile: 11 extents found
versus:
soft# umount /mnt/sde3
soft# mount -t btrfs -o commit=10,compress=lzo,compress-force /dev/sde3 /mnt/sde3
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sde3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 109.051 s, 96.2 MB/s
0.02user 13.03system 1:49.49elapsed 11%CPU (0avgtext+0avgdata 3068maxresident)k
20494784inputs+20492320outputs (1major+347minor)pagefaults 0swaps
soft# filefrag /mnt/sde3/testfile
/mnt/sde3/testfile: 49287 extents found
Most of the latter extents are mercifully rather contiguous; their
size is just limited by the compression code. Here is an extract
from 'filefrag -v' from around the middle:
24757: 1321888.. 1321919: 11339579.. 11339610: 32: 11339594:
24758: 1321920.. 1321951: 11339597.. 11339628: 32: 11339611:
24759: 1321952.. 1321983: 11339615.. 11339646: 32: 11339629:
24760: 1321984.. 1322015: 11339632.. 11339663: 32: 11339647:
24761: 1322016.. 1322047: 11339649.. 11339680: 32: 11339664:
24762: 1322048.. 1322079: 11339667.. 11339698: 32: 11339681:
24763: 1322080.. 1322111: 11339686.. 11339717: 32: 11339699:
24764: 1322112.. 1322143: 11339703.. 11339734: 32: 11339718:
24765: 1322144.. 1322175: 11339720.. 11339751: 32: 11339735:
24766: 1322176.. 1322207: 11339737.. 11339768: 32: 11339752:
24767: 1322208.. 1322239: 11339754.. 11339785: 32: 11339769:
24768: 1322240.. 1322271: 11339771.. 11339802: 32: 11339786:
24769: 1322272.. 1322303: 11339789.. 11339820: 32: 11339803:
But again this is on a fresh empty Btrfs volume.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-28 18:08 ` Peter Grandi
@ 2017-07-30 13:42 ` Konstantin V. Gavrilenko
2017-07-31 11:41 ` Peter Grandi
0 siblings, 1 reply; 17+ messages in thread
From: Konstantin V. Gavrilenko @ 2017-07-30 13:42 UTC (permalink / raw)
To: Linux fs Btrfs; +Cc: Peter Grandi
Thanks for the comments. Initially the system performed well; I don't have the benchmark details written down, but the compressed vs. non-compressed speeds were more or less similar. However, after several weeks of usage the system started experiencing the described slowdowns, so I started investigating the problem. This is indeed a backup drive, but it predominantly contains large files.
# ls -lahR | awk '/^-/ {print $5}' | sort | uniq -c | sort -n | tail -n 15
5 322
5 396
5 400
6 1000G
6 11
6 200G
8 24G
8 48G
13 500G
20 8.0G
25 165G
32 20G
57 100G
103 50G
201 10G
# grep 'model name' /proc/cpuinfo | sort -u
model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
# lsscsi | grep 'sd[ae]'
[4:2:0:0] disk LSI MR9260-8i 2.13 /dev/sda
The sda device is a hardware RAID5 consisting of 4x8TB drives.
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
Size : 21.830 TB
Sector Size : 512
Is VD emulated : Yes
Parity Size : 7.276 TB
State : Optimal
Strip Size : 256 KB
Number Of Drives : 4
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: No
Number of Spans: 1
Span: 0 - Number of PDs: 4
I have changed the mount flags as suggested, and I no longer see the previously reported behaviour of one kworker consuming 100% of the CPU time, but the write speed difference between compression ON and OFF is still pretty large.
I have run several tests with zlib, lzo and no compression, and the results are rather strange.
mountflags: (rw,relatime,compress-force=zlib,space_cache=v2,subvolid=5,subvol=/)
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 93.3418 s, 57.5 MB/s
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 26.0685 s, 206 MB/s
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 77.4845 s, 69.3 MB/s
mountflags: (rw,relatime,compress-force=lzo,space_cache=v2,subvolid=5,subvol=/)
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 116.246 s, 46.2 MB/s
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 14.704 s, 365 MB/s
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 122.321 s, 43.9 MB/s
mountflags: (rw,relatime,space_cache=v2,subvolid=5,subvol=/)
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 32.2551 s, 166 MB/s
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 19.9464 s, 269 MB/s
dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 10.1033 s, 531 MB/s
The CPU usage is pretty low as well now. For example, when compress-force=zlib is in effect:
Linux 4.10.0-28-generic (ais-backup1) 30/07/17 _x86_64_ (12 CPU)
14:31:27 CPU %user %nice %system %iowait %steal %idle
14:31:28 all 0.00 0.00 1.50 0.00 0.00 98.50
14:31:29 all 0.00 0.00 4.78 3.52 0.00 91.69
14:31:30 all 0.08 0.00 4.92 3.75 0.00 91.25
14:31:31 all 0.00 0.00 4.76 3.76 0.00 91.49
14:31:32 all 0.00 0.00 4.76 3.76 0.00 91.48
14:31:33 all 0.08 0.00 4.67 3.76 0.00 91.49
14:31:34 all 0.00 0.00 4.76 3.68 0.00 91.56
14:31:35 all 0.08 0.00 4.76 3.76 0.00 91.40
14:31:36 all 0.00 0.00 4.60 3.77 0.00 91.63
14:31:37 all 0.00 0.00 4.68 3.68 0.00 91.64
14:31:38 all 0.08 0.00 4.52 3.76 0.00 91.64
14:31:39 all 0.08 0.00 4.68 3.76 0.00 91.48
14:31:40 all 0.08 0.00 4.52 3.76 0.00 91.64
14:31:41 all 0.00 0.00 4.61 3.77 0.00 91.62
14:31:42 all 0.08 0.00 5.07 3.74 0.00 91.10
14:31:43 all 0.00 0.00 4.68 3.68 0.00 91.64
14:31:44 all 0.00 0.00 4.84 5.09 0.00 90.08
14:31:45 all 0.17 0.00 4.67 4.75 0.00 90.42
14:31:46 all 0.00 0.00 4.60 3.76 0.00 91.64
14:31:47 all 0.08 0.00 5.07 3.66 0.00 91.18
14:31:48 all 0.00 0.00 5.01 3.68 0.00 91.31
14:31:49 all 0.00 0.00 4.76 3.68 0.00 91.56
14:31:50 all 0.08 0.00 4.59 3.59 0.00 91.73
14:31:51 all 0.00 0.00 2.67 1.92 0.00 95.41
----- Original Message -----
From: "Peter Grandi" <pg@btrfs.list.sabi.co.UK>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Friday, 28 July, 2017 8:08:47 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage
[ ... ]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-30 13:42 ` Konstantin V. Gavrilenko
@ 2017-07-31 11:41 ` Peter Grandi
2017-07-31 12:33 ` Peter Grandi
2017-08-01 9:58 ` Konstantin V. Gavrilenko
0 siblings, 2 replies; 17+ messages in thread
From: Peter Grandi @ 2017-07-31 11:41 UTC (permalink / raw)
To: Linux fs Btrfs
[ ... ]
> grep 'model name' /proc/cpuinfo | sort -u
> model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
Good, contemporary CPU with all accelerations.
> The sda device is a hardware RAID5 consisting of 4x8TB drives.
[ ... ]
> Strip Size : 256 KB
So, with 3 data strips plus parity, the full RMW data stripe length is 3 x 256KiB = 768KiB.
> [ ... ] don't see the previously reported behaviour of one of
> the kworker consuming 100% of the cputime, but the write speed
> difference between the compression ON vs OFF is pretty large.
That's weird; of course 'lzo' is a lot cheaper than 'zlib', but
in my test the much higher CPU time of the latter was spread
across many CPUs, while in your case it wasn't, even though the
E5645 has 6 cores and can run 12 threads. That seemed to point to
some high cost of finding free blocks, that is, a very fragmented
free list, or something else.
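One rough way to check that (a sketch; needs the perf tooling for your running kernel) is to profile while the dd test runs and see whether the free-space / find_free_extent code dominates the kworker's time:
# perf top -g     # look for btrfs free space cache / find_free_extent symbols near the top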
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 26.0685 s, 206 MB/s
The results with 'oflag=direct' are not relevant, because Btrfs
behaves "differently" with that.
> mountflags: (rw,relatime,compress-force=zlib,space_cache=v2,subvolid=5,subvol=/)
[ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 77.4845 s, 69.3 MB/s
> mountflags: (rw,relatime,compress-force=lzo,space_cache=v2,subvolid=5,subvol=/)
[ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 122.321 s, 43.9 MB/s
That's pretty good for a RAID5 with 128KiB writes and a 768KiB
stripe size on that LSI MegaRAID, and it looks like the HW host
adapter does not have a persistent (usually battery-backed) write
cache. My guess is that watching transfer rates and latencies with
'iostat -dk -zyx 1' did not happen.
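That is, in a second terminal while the test runs:
# iostat -dk -zyx 1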
> mountflags: (rw,relatime,space_cache=v2,subvolid=5,subvol=/)
[ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 10.1033 s, 531 MB/s
I had mentioned the output of 'filefrag' in my previous reply.
That seems relevant to me here, because of RAID5 RMW and the maximum
extent size with Btrfs compression versus the strip/stripe size.
Perhaps redoing the tests with a 128KiB 'bs' *without*
compression would be interesting, perhaps even with 'oflag=sync'
instead of 'conv=fsync'.
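For example, keeping the same 5GiB total as your runs (count chosen accordingly; adjust as you see fit):
# dd if=/dev/sdb of=./testing bs=128k count=40960 status=progress oflag=sync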
It is hard for me to see a speed issue here with Btrfs: for
comparison I have done a simple test with both a 3+1 MD RAID5
set with a 256KiB chunk size and a single block device on
"contemporary" 1T/2TB drives, capable of sequential transfer
rates of 150-190MB/s:
soft# grep -A2 sdb3 /proc/mdstat
md127 : active raid5 sde3[4] sdd3[2] sdc3[1] sdb3[0]
729808128 blocks super 1.0 level 5, 256k chunk, algorithm 2 [4/4] [UUUU]
with compression:
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/md/test5 /mnt/test5
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdg3 /mnt/sdg3
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 94.3605 s, 111 MB/s
0.01user 12.59system 1:34.36elapsed 13%CPU (0avgtext+0avgdata 2932maxresident)k
13042144inputs+20482144outputs (3major+345minor)pagefaults 0swaps
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 93.5885 s, 112 MB/s
0.03user 12.35system 1:33.59elapsed 13%CPU (0avgtext+0avgdata 2940maxresident)k
13042144inputs+20482400outputs (3major+346minor)pagefaults 0swaps
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 48945 extents found
/mnt/sdg3/testfile: 49029 extents found
soft# btrfs fi df /mnt/test5/ | grep Data
Data, single: total=7.00GiB, used=6.55GiB
soft# btrfs fi df /mnt/sdg3 | grep Data
Data, single: total=7.00GiB, used=6.55GiB
soft# sysctl vm/drop_caches=3
vm.drop_caches = 3
soft# /usr/bin/time dd iflag=fullblock if=/mnt/test5/testfile bs=1M count=10000 of=/dev/zero
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 23.2975 s, 450 MB/s
0.01user 7.59system 0:23.32elapsed 32%CPU (0avgtext+0avgdata 2932maxresident)k
13759624inputs+0outputs (3major+344minor)pagefaults 0swaps
soft# sysctl vm/drop_caches=3
vm.drop_caches = 3
soft# /usr/bin/time dd iflag=fullblock if=/mnt/sdg3/testfile bs=1M count=10000 of=/dev/zero
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 35.0032 s, 300 MB/s
0.01user 8.46system 0:35.03elapsed 24%CPU (0avgtext+0avgdata 2924maxresident)k
13750568inputs+0outputs (3major+345minor)pagefaults 0swaps
and without compression:
soft# mount -t btrfs -o commit=10 /dev/sdg3 /mnt/sdg3
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 74.7256 s, 140 MB/s
0.02user 13.31system 1:14.72elapsed 17%CPU (0avgtext+0avgdata 2936maxresident)k
13047640inputs+20483808outputs (3major+345minor)pagefaults 0swaps
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 102.002 s, 103 MB/s
0.02user 14.49system 1:42.00elapsed 14%CPU (0avgtext+0avgdata 2972maxresident)k
13030592inputs+20484032outputs (3major+345minor)pagefaults 0swaps
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 23 extents found
/mnt/sdg3/testfile: 13 extents found
> The CPU usage is pretty low as well. For example when the
> force-compress=zlib is in effect, the cpu usage is pretty low
> now.
That's 24 threads at around 4-5% CPU each, i.e. around 100% of one
CPU of system time spread around, for 70MB/s.
That's quite low. My report, which is mirrored by using 'pigz'
at the user level (very similar algorithms), was that 90MB/s
took 300% of an FX-6100 CPU at 3.3GHz, and that is not that much
less efficient than a Xeon E5645 at 2.4GHz.
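The user-level comparison is easy to reproduce, roughly (source device assumed, default pigz level, which is not exactly what the kernel uses):
# pv -tpreb /dev/sdb | pigz -c > /dev/null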
I have redone the test on a faster CPU:
base# grep 'model name' /proc/cpuinfo | sort -u
model name : AMD FX-8370E Eight-Core Processor
base# cpufreq-info | grep 'current CPU frequency'
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
And the result is (from a fast flash Samsung SSD to a fast 2TB
Toshiba drive):
base# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb6 /mnt/sdb6
base# /usr/bin/time dd iflag=fullblock if=/dev/sde6 of=/mnt/sdb6/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 41.7702 s, 251 MB/s
0.00user 11.41system 0:43.41elapsed 26%CPU (0avgtext+0avgdata 3132maxresident)k
20482288inputs+20503368outputs (1major+339minor)pagefaults 0swaps
With CPU usage as:
top - 09:20:38 up 20:48, 4 users, load average: 5.04, 2.03, 2.06
Tasks: 576 total, 10 running, 566 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 94.0 sy, 0.7 ni, 0.0 id, 2.0 wa, 0.0 hi, 3.3 si, 0.0 st
%Cpu1 : 0.3 us, 97.0 sy, 0.0 ni, 2.0 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 3.0 us, 94.4 sy, 0.0 ni, 1.7 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.7 us, 95.4 sy, 0.0 ni, 2.3 id, 1.3 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu4 : 1.0 us, 95.7 sy, 0.3 ni, 2.6 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.3 us, 94.7 sy, 0.0 ni, 4.3 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 1.7 us, 94.3 sy, 0.0 ni, 3.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 97.3 sy, 0.0 ni, 2.3 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 16395076 total, 15987476 used, 407600 free, 5206304 buffers
KiB Swap: 0 total, 0 used, 0 free. 8392648 cached Mem
so around 7 CPUs for 250MB/s, or around 35MB/s per CPU (more or
less what I also get user-level with 'pigz'), and it is hard for
me to imagine the Xeon E5645 being twice as fast per-CPU for
"integer" work, but that's another discussion.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-31 11:41 ` Peter Grandi
@ 2017-07-31 12:33 ` Peter Grandi
2017-07-31 12:49 ` Peter Grandi
2017-08-01 9:58 ` Konstantin V. Gavrilenko
1 sibling, 1 reply; 17+ messages in thread
From: Peter Grandi @ 2017-07-31 12:33 UTC (permalink / raw)
To: Linux fs Btrfs
> [ ... ] It is hard for me to see a speed issue here with
> Btrfs: for comparison I have done a simple test with a both a
> 3+1 MD RAID5 set with a 256KiB chunk size and a single block
> device on "contemporary" 1T/2TB drives, capable of sequential
> transfer rates of 150-190MB/s: [ ... ]
The figures after this are a bit on the low side because, looking
at 'vmstat', I realized that the source block device 'sda6' was
being a bottleneck: the host has only 8GiB instead of the 16GiB I
misremembered, and 'sda' is also a relatively slow flash SSD that
reads at most around 220MB/s. So I have redone the simple tests
with a transfer size of 3GB, which ensures that all reads come
from the memory cache:
with compression:
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/md/test5 /mnt/test5
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdg3 /mnt/sdg3
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=3000 conv=fsync
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 15.8869 s, 198 MB/s
0.00user 2.80system 0:15.88elapsed 17%CPU (0avgtext+0avgdata 3056maxresident)k
0inputs+6148256outputs (0major+346minor)pagefaults 0swaps
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=3000 conv=fsync
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 16.9663 s, 185 MB/s
0.00user 2.61system 0:16.96elapsed 15%CPU (0avgtext+0avgdata 3056maxresident)k
0inputs+6144672outputs (0major+346minor)pagefaults 0swaps
soft# btrfs fi df /mnt/test5/ | grep Data
Data, single: total=3.00GiB, used=2.28GiB
soft# btrfs fi df /mnt/sdg3 | grep Data
Data, single: total=3.00GiB, used=2.28GiB
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 8811 extents found
/mnt/sdg3/testfile: 8759 extents found
Slightly weird that, scaled for size, the 10GB run ended up with
almost double the extents per GiB of this 3GB one, but I guess
that depends on speed.
Then without compression:
soft# mount -t btrfs -o commit=10 /dev/md/test5 /mnt/test5
soft# mount -t btrfs -o commit=10 /dev/sdg3 /mnt/sdg3
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=3000 conv=fsync
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 8.06841 s, 390 MB/s
0.00user 3.90system 0:08.80elapsed 44%CPU (0avgtext+0avgdata 2880maxresident)k
0inputs+6153856outputs (0major+345minor)pagefaults 0swaps
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=3000 conv=fsync
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 30.215 s, 104 MB/s
0.00user 4.82system 0:30.93elapsed 15%CPU (0avgtext+0avgdata 2888maxresident)k
0inputs+6152128outputs (0major+347minor)pagefaults 0swaps
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 5 extents found
/mnt/sdg3/testfile: 3 extents found
Also added:
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=128k count=3000 | dd iflag=fullblock of=/mnt/test5/testfile bs=128k oflag=sync
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 160.315 s, 2.5 MB/s
0.02user 0.46system 2:40.31elapsed 0%CPU (0avgtext+0avgdata 1992maxresident)k
0inputs+0outputs (0major+124minor)pagefaults 0swaps
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 160.365 s, 2.5 MB/s
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=128k count=3000 | dd iflag=fullblock of=/mnt/sdg3/testfile bs=128k oflag=sync
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 113.51 s, 3.5 MB/s
0.02user 0.56system 1:53.51elapsed 0%CPU (0avgtext+0avgdata 2156maxresident)k
0inputs+0outputs (0major+120minor)pagefaults 0swaps
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 113.544 s, 3.5 MB/s
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 1 extent found
/mnt/sdg3/testfile: 22 extents found
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=1M count=1000 | dd iflag=fullblock of=/mnt/test5/testfile bs=1M oflag=sync
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 68.5037 s, 15.3 MB/s
0.00user 1.16system 1:08.50elapsed 1%CPU (0avgtext+0avgdata 2888maxresident)k
0inputs+0outputs (0major+347minor)pagefaults 0swaps
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 68.5859 s, 15.3 MB/s
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=1M count=1000 | dd iflag=fullblock of=/mnt/sdg3/testfile bs=1M oflag=sync
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 56.6714 s, 18.5 MB/s
0.00user 1.21system 0:56.67elapsed 2%CPU (0avgtext+0avgdata 3056maxresident)k
0inputs+0outputs (0major+345minor)pagefaults 0swaps
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 56.7116 s, 18.5 MB/s
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-31 12:33 ` Peter Grandi
@ 2017-07-31 12:49 ` Peter Grandi
0 siblings, 0 replies; 17+ messages in thread
From: Peter Grandi @ 2017-07-31 12:49 UTC (permalink / raw)
To: Linux fs Btrfs
[ ... ]
> Also added:
Feeling very generous :-) today, adding these too:
soft# mkfs.btrfs -mraid10 -draid10 -L test5 /dev/sd{b,c,d,e}3
[ ... ]
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb3 /mnt/test5
soft# rm -f /mnt/test5/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=3000 conv=fsync
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 14.2166 s, 221 MB/s
0.00user 2.54system 0:14.21elapsed 17%CPU (0avgtext+0avgdata 3056maxresident)k
0inputs+6144768outputs (0major+346minor)pagefaults 0swaps
soft# rm -f /mnt/test5/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=128k count=3000 conv=fsync
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 2.05933 s, 191 MB/s
0.00user 0.32system 0:02.06elapsed 15%CPU (0avgtext+0avgdata 1996maxresident)k
0inputs+772512outputs (0major+124minor)pagefaults 0swaps
soft# rm -f /mnt/test5/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=1M count=1000 | dd iflag=fullblock of=/mnt/test5/testfile bs=1M oflag=sync
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 60.6019 s, 17.3 MB/s
0.01user 1.04system 1:00.60elapsed 1%CPU (0avgtext+0avgdata 2888maxresident)k
0inputs+0outputs (0major+348minor)pagefaults 0swaps
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 60.4116 s, 17.4 MB/s
soft# rm -f /mnt/test5/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=128k count=3000 | dd iflag=fullblock of=/mnt/test5/testfile bs=128k oflag=sync
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 148.04 s, 2.7 MB/s
0.00user 0.62system 2:28.04elapsed 0%CPU (0avgtext+0avgdata 1996maxresident)k
0inputs+0outputs (0major+125minor)pagefaults 0swaps
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 148.083 s, 2.7 MB/s
soft# sysctl vm/drop_caches=3
vm.drop_caches = 3
soft# /usr/bin/time dd iflag=fullblock if=/mnt/test5/testfile bs=128k count=3000 of=/dev/zero
3000+0 records in
3000+0 records out
393216000 bytes (393 MB) copied, 1.09729 s, 358 MB/s
0.00user 0.24system 0:01.10elapsed 23%CPU (0avgtext+0avgdata 2164maxresident)k
459768inputs+0outputs (3major+121minor)pagefaults 0swaps
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-07-31 11:41 ` Peter Grandi
2017-07-31 12:33 ` Peter Grandi
@ 2017-08-01 9:58 ` Konstantin V. Gavrilenko
2017-08-01 10:53 ` Paul Jones
2017-08-01 13:14 ` Peter Grandi
1 sibling, 2 replies; 17+ messages in thread
From: Konstantin V. Gavrilenko @ 2017-08-01 9:58 UTC (permalink / raw)
To: Peter Grandi; +Cc: Linux fs Btrfs
Peter, I don't think filefrag is showing the correct fragmentation status of a file when compression is used.
At least not the one that is installed by default in Ubuntu 16.04 (e2fsprogs 1.42.13-1ubuntu1).
So, for example, the reported fragmentation of a compressed file is roughly 320 times that of an uncompressed one.
root@homenas:/mnt/storage/NEW# filefrag test5g-zeroes
test5g-zeroes: 40903 extents found
root@homenas:/mnt/storage/NEW# filefrag test5g-data
test5g-data: 129 extents found
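For what it's worth, the verbose view shows whether those compressed extents are at least physically contiguous, as in your earlier -v extract:
# filefrag -v test5g-zeroes | head -n 20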
I am currently defragmenting that mountpoint, ensuring that everything is compressed with zlib.
# btrfs fi defragment -rv -czlib /mnt/arh-backup
My guess is that it will take another 24-36 hours to complete, and then I will redo the test to see if that has helped.
Will keep the list posted.
P.S. Any other suggestions that might help with the fragmentation and data allocation? Should I try to rebalance the data on the drive?
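(E.g., if rebalancing, perhaps a filtered balance first rather than a full one; the usage threshold below is just a guess:)
# btrfs balance start -dusage=50 /mnt/arh-backup1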
kos
----- Original Message -----
From: "Peter Grandi" <pg@btrfs.list.sabi.co.UK>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Monday, 31 July, 2017 1:41:07 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage
[ ... ]
> grep 'model name' /proc/cpuinfo | sort -u
> model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
Good, contemporary CPU with all accelerations.
> The sda device is a hardware RAID5 consisting of 4x8TB drives.
[ ... ]
> Strip Size : 256 KB
So the full RMW data stripe length is 768KiB.
> [ ... ] don't see the previously reported behaviour of one of
> the kworker consuming 100% of the cputime, but the write speed
> difference between the compression ON vs OFF is pretty large.
That's weird; of course 'lzo' is a lot cheaper than 'zlib', but
in my test the much higher CPU time of the latter was spread
across many CPUs, while in your case it wasn't, even if the
E5645 has 6 CPUs and can do 12 threads. That seemed to point to
some high cost of finding free blocks, that is a very fragmented
free list, or something else.
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 26.0685 s, 206 MB/s
The results with 'oflag=direct' are not relevant, because Btrfs
behaves "differently" with that.
> mountflags: (rw,relatime,compress-force=zlib,space_cache=v2,subvolid=5,subvol=/)
[ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 77.4845 s, 69.3 MB/s
> mountflags: (rw,relatime,compress-force=lzo,space_cache=v2,subvolid=5,subvol=/)
[ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 122.321 s, 43.9 MB/s
That's pretty good for a RAID5 with 128KiB writes and a 768KiB
stripe size, on a 3ware, and looks like that the hw host adapter
does not have a persistent cache (battery backed usually). My
guess that watching transfer rates and latencies with 'iostat
-dk -zyx 1' did not happen.
> mountflags: (rw,relatime,space_cache=v2,subvolid=5,subvol=/)
[ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 10.1033 s, 531 MB/s
I had mentioned in my previous reply the output of 'filefrag'.
That to me seems relevant here, because of RAID5 RMW and maximum
extent size with Brfs compression and strip/stripe size.
Perhaps redoing the tests with a 128KiB 'bs' *without*
compression would be interesting, perhaps even with 'oflag=sync'
instead of 'conv=fsync'.
It is hard for me to see a speed issue here with Btrfs: for
comparison I have done a simple test with a both a 3+1 MD RAID5
set with a 256KiB chunk size and a single block device on
"contemporary" 1T/2TB drives, capable of sequential transfer
rates of 150-190MB/s:
soft# grep -A2 sdb3 /proc/mdstat
md127 : active raid5 sde3[4] sdd3[2] sdc3[1] sdb3[0]
729808128 blocks super 1.0 level 5, 256k chunk, algorithm 2 [4/4] [UUUU]
with compression:
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/md/test5 /mnt/test5
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdg3 /mnt/sdg3
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 94.3605 s, 111 MB/s
0.01user 12.59system 1:34.36elapsed 13%CPU (0avgtext+0avgdata 2932maxresident)k
13042144inputs+20482144outputs (3major+345minor)pagefaults 0swaps
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 93.5885 s, 112 MB/s
0.03user 12.35system 1:33.59elapsed 13%CPU (0avgtext+0avgdata 2940maxresident)k
13042144inputs+20482400outputs (3major+346minor)pagefaults 0swaps
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 48945 extents found
/mnt/sdg3/testfile: 49029 extents found
soft# btrfs fi df /mnt/test5/ | grep Data
Data, single: total=7.00GiB, used=6.55GiB
soft# btrfs fi df /mnt/sdg3 | grep Data
Data, single: total=7.00GiB, used=6.55GiB
soft# sysctl vm/drop_caches=3
vm.drop_caches = 3
soft# /usr/bin/time dd iflag=fullblock if=/mnt/test5/testfile bs=1M count=10000 of=/dev/zero
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 23.2975 s, 450 MB/s
0.01user 7.59system 0:23.32elapsed 32%CPU (0avgtext+0avgdata 2932maxresident)k
13759624inputs+0outputs (3major+344minor)pagefaults 0swaps
soft# sysctl vm/drop_caches=3
vm.drop_caches = 3
soft# /usr/bin/time dd iflag=fullblock if=/mnt/sdg3/testfile bs=1M count=10000 of=/dev/zero
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 35.0032 s, 300 MB/s
0.01user 8.46system 0:35.03elapsed 24%CPU (0avgtext+0avgdata 2924maxresident)k
13750568inputs+0outputs (3major+345minor)pagefaults 0swaps
and without compression:
soft# mount -t btrfs -o commit=10 /dev/sdg3 /mnt/sdg3
soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 74.7256 s, 140 MB/s
0.02user 13.31system 1:14.72elapsed 17%CPU (0avgtext+0avgdata 2936maxresident)k
13047640inputs+20483808outputs (3major+345minor)pagefaults 0swaps
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 102.002 s, 103 MB/s
0.02user 14.49system 1:42.00elapsed 14%CPU (0avgtext+0avgdata 2972maxresident)k
13030592inputs+20484032outputs (3major+345minor)pagefaults 0swaps
soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile
/mnt/test5/testfile: 23 extents found
/mnt/sdg3/testfile: 13 extents found
> The CPU usage is pretty low as well. For example when the
> force-compress=zlib is in effect, the cpu usage is pretty low
> now.
That's 24 threads around 4-5% CPU each, that's around a 100% CPU
of system time spread around, for 70MB/s.
That's quite low. My report, which is mirrored by using 'pigz'
at the user level (very similar algorithms), was that 90MB/s
took 300% of an FX-6100 CPU at 3.3Ghz, and it is not that much
less efficient than a Xeon-E5645 at 2.4Ghz.
I have redone the test on a faster CPU:
base# grep 'model name' /proc/cpuinfo | sort -u
model name : AMD FX-8370E Eight-Core Processor
base# cpufreq-info | grep 'current CPU frequency'
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
current CPU frequency is 3.30 GHz (asserted by call to hardware).
And the result is (from a fast flash Samsung SSD to a fast 2TB
Toshiba drive):
base# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb6 /mnt/sdb6
base# /usr/bin/time dd iflag=fullblock if=/dev/sde6 of=/mnt/sdb6/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 41.7702 s, 251 MB/s
0.00user 11.41system 0:43.41elapsed 26%CPU (0avgtext+0avgdata 3132maxresident)k
20482288inputs+20503368outputs (1major+339minor)pagefaults 0swaps
With CPU usage as:
top - 09:20:38 up 20:48, 4 users, load average: 5.04, 2.03, 2.06
Tasks: 576 total, 10 running, 566 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 94.0 sy, 0.7 ni, 0.0 id, 2.0 wa, 0.0 hi, 3.3 si, 0.0 st
%Cpu1 : 0.3 us, 97.0 sy, 0.0 ni, 2.0 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 3.0 us, 94.4 sy, 0.0 ni, 1.7 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.7 us, 95.4 sy, 0.0 ni, 2.3 id, 1.3 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu4 : 1.0 us, 95.7 sy, 0.3 ni, 2.6 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.3 us, 94.7 sy, 0.0 ni, 4.3 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 1.7 us, 94.3 sy, 0.0 ni, 3.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 97.3 sy, 0.0 ni, 2.3 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 16395076 total, 15987476 used, 407600 free, 5206304 buffers
KiB Swap: 0 total, 0 used, 0 free. 8392648 cached Mem
so around 7 CPUs for 250MB/s, or around 35MB/s per CPU (more or
less what I also get at user level with 'pigz'), and it is hard for
me to imagine the Xeon-E5645 being twice as fast per CPU for
"integer" work, but that's another discussion.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Btrfs + compression = slow performance and high cpu usage
2017-08-01 9:58 ` Konstantin V. Gavrilenko
@ 2017-08-01 10:53 ` Paul Jones
2017-08-01 13:14 ` Peter Grandi
1 sibling, 0 replies; 17+ messages in thread
From: Paul Jones @ 2017-08-01 10:53 UTC (permalink / raw)
To: Konstantin V. Gavrilenko, Peter Grandi; +Cc: Linux fs Btrfs
> -----Original Message-----
> From: linux-btrfs-owner@vger.kernel.org [mailto:linux-btrfs-
> owner@vger.kernel.org] On Behalf Of Konstantin V. Gavrilenko
> Sent: Tuesday, 1 August 2017 7:58 PM
> To: Peter Grandi <pg@btrfs.list.sabi.co.UK>
> Cc: Linux fs Btrfs <linux-btrfs@vger.kernel.org>
> Subject: Re: Btrfs + compression = slow performance and high cpu usage
>
> Peter, I don't think filefrag is showing the correct fragmentation status of
> the file when compression is used.
> At least not the one installed by default in Ubuntu 16.04 - e2fsprogs |
> 1.42.13-1ubuntu1
>
> So, for example, the fragmentation of the compressed file is 320 times higher
> than that of the uncompressed one.
>
> root@homenas:/mnt/storage/NEW# filefrag test5g-zeroes
> test5g-zeroes: 40903 extents found
>
> root@homenas:/mnt/storage/NEW# filefrag test5g-data
> test5g-data: 129 extents found
Compressed extents are at most about 128KiB, uncompressed extents
can be up to about 128MiB (I can't remember the exact numbers).
I've had trouble with slow filesystems when using compression; the
problem seems to go away when compression is removed.
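A quick way to see that cap directly is 'filefrag -v' (a sketch,
using the same example file as above):
# filefrag -v /mnt/storage/NEW/test5g-zeroes | head -n 20
On a compressed file the length column sits at 32 blocks, i.e.
32 x 4KiB = 128KiB, and the extents are flagged 'encoded', while an
uncompressed file shows far larger extents.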
Paul.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-08-01 9:58 ` Konstantin V. Gavrilenko
2017-08-01 10:53 ` Paul Jones
@ 2017-08-01 13:14 ` Peter Grandi
2017-08-01 18:09 ` Konstantin V. Gavrilenko
1 sibling, 1 reply; 17+ messages in thread
From: Peter Grandi @ 2017-08-01 13:14 UTC (permalink / raw)
To: Linux fs Btrfs
> Peter, I don't think filefrag is showing the correct
> fragmentation status of the file when compression is used.
As reported in a previous message, the output of 'filefrag -v'
can be used to see what is going on:
>>>> filefrag /mnt/sde3/testfile
>>>> /mnt/sde3/testfile: 49287 extents found
>>>> Most of the latter extents are mercifully rather contiguous, their
>>>> size is just limited by the compression code, here is an extract
>>>> from 'filefrag -v' from around the middle:
>>>> 24757: 1321888.. 1321919: 11339579.. 11339610: 32: 11339594:
>>>> 24758: 1321920.. 1321951: 11339597.. 11339628: 32: 11339611:
>>>> 24759: 1321952.. 1321983: 11339615.. 11339646: 32: 11339629:
>>>> 24760: 1321984.. 1322015: 11339632.. 11339663: 32: 11339647:
>>>> 24761: 1322016.. 1322047: 11339649.. 11339680: 32: 11339664:
>>>> 24762: 1322048.. 1322079: 11339667.. 11339698: 32: 11339681:
>>>> 24763: 1322080.. 1322111: 11339686.. 11339717: 32: 11339699:
>>>> 24764: 1322112.. 1322143: 11339703.. 11339734: 32: 11339718:
>>>> 24765: 1322144.. 1322175: 11339720.. 11339751: 32: 11339735:
>>>> 24766: 1322176.. 1322207: 11339737.. 11339768: 32: 11339752:
>>>> 24767: 1322208.. 1322239: 11339754.. 11339785: 32: 11339769:
>>>> 24768: 1322240.. 1322271: 11339771.. 11339802: 32: 11339786:
>>>> 24769: 1322272.. 1322303: 11339789.. 11339820: 32: 11339803:
>>>> But again this is on a fresh empty Btrfs volume.
As I wrote, "their size is just limited by the compression code"
which results in "128KiB writes". On a "fresh empty Btrfs volume"
the compressed extents limited to 128KiB also happen to be pretty
physically contiguous, but on a more fragmented free space list
they can be more scattered.
As I already wrote, the main issue here seems to be that we are
talking about a "RAID5 with 128KiB writes and a 768KiB stripe
size". On MD RAID5 the slowdown because of RMW seems to be only
around 30-40%, but it looks like several back-to-back 128KiB
writes get merged by the Linux IO subsystem (not sure whether
that's thoroughly legal), and perhaps they get merged by the 3ware
firmware only if it has a persistent cache, and maybe your 3ware
does not have one, but you have kept your counsel as to that.
My impression is that you read the Btrfs documentation and my
replies with a lot less attention than I write them. Some of the
things you have done and said make me think that you did not read
https://btrfs.wiki.kernel.org/index.php/Compression and 'man 5
btrfs', for example:
"How does compression interact with direct IO or COW?
Compression does not work with DIO, does work with COW and
does not work for NOCOW files. If a file is opened in DIO
mode, it will fall back to buffered IO.
Are there speed penalties when doing random access to a
compressed file?
Yes. The compression processes ranges of a file of maximum
size 128 KiB and compresses each 4 KiB (or page-sized) block
separately."
> I am currently defragmenting that mountpoint, ensuring that
> everything is compressed with zlib.
Defragmenting the used space might help find more contiguous
allocations.
> p.s. any other suggestion that might help with the fragmentation
> and data allocation. Should I try and rebalance the data on the
> drive?
Yes, regularly, as that defragments the unused space.
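For reference, the two operations look roughly like this (a sketch;
the mount point is a placeholder and '-dusage=50' is just one common
filter choice, not something from this thread):
# btrfs filesystem defragment -r -v -czlib /mnt/data
# btrfs balance start -dusage=50 /mnt/data
The first rewrites and recompresses the existing files; the second
rewrites the most sparsely used data chunks, which is what frees and
consolidates the unused space.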
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-08-01 13:14 ` Peter Grandi
@ 2017-08-01 18:09 ` Konstantin V. Gavrilenko
2017-08-01 20:09 ` Peter Grandi
0 siblings, 1 reply; 17+ messages in thread
From: Konstantin V. Gavrilenko @ 2017-08-01 18:09 UTC (permalink / raw)
To: Peter Grandi; +Cc: Linux fs Btrfs
----- Original Message -----
From: "Peter Grandi" <pg@btrfs.list.sabi.co.UK>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Tuesday, 1 August, 2017 3:14:07 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage
> Peter, I don't think filefrag is showing the correct
> fragmentation status of the file when compression is used.
<SNIP>
As I wrote, "their size is just limited by the compression code"
which results in "128KiB writes". On a "fresh empty Btrfs volume"
the compressed extents limited to 128KiB also happen to be pretty
physically contiguous, but on a more fragmented free space list
they can be more scattered.
KOS: OK, thanks for pointing it out. I have compared the 'filefrag -v' output on another btrfs filesystem that is not fragmented
and I can see the difference from what is happening on the sluggish one.
5824: 186368.. 186399: 2430093383..2430093414: 32: 2430093414: encoded
5825: 186400.. 186431: 2430093384..2430093415: 32: 2430093415: encoded
5826: 186432.. 186463: 2430093385..2430093416: 32: 2430093416: encoded
5827: 186464.. 186495: 2430093386..2430093417: 32: 2430093417: encoded
5828: 186496.. 186527: 2430093387..2430093418: 32: 2430093418: encoded
5829: 186528.. 186559: 2430093388..2430093419: 32: 2430093419: encoded
5830: 186560.. 186591: 2430093389..2430093420: 32: 2430093420: encoded
As I already wrote, the main issue here seems to be that we are
talking about a "RAID5 with 128KiB writes and a 768KiB stripe
size". On MD RAID5 the slowdown because of RMW seems to be only
around 30-40%, but it looks like several back-to-back 128KiB
writes get merged by the Linux IO subsystem (not sure whether
that's thoroughly legal), and perhaps they get merged by the 3ware
firmware only if it has a persistent cache, and maybe your 3ware
does not have one, but you have kept your counsel as to that.
KOS: No, I don't have a persistent cache, only the 512 MB cache on board the controller, which is
battery-backed (BBU). If I had additional SSD caching on the controller I would have mentioned it.
I was also under the impression that in a situation where mostly extra-large files will be stored on the array, a bigger strip size would indeed increase the speed, so I went with the 256 KiB strip size. Would I be correct in assuming that a RAID strip size of 128 KiB would be a better choice if one plans to use Btrfs with compression?
thanks,
kos
<SNIP>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-08-01 18:09 ` Konstantin V. Gavrilenko
@ 2017-08-01 20:09 ` Peter Grandi
2017-08-01 23:54 ` Peter Grandi
2017-08-31 10:56 ` Konstantin V. Gavrilenko
0 siblings, 2 replies; 17+ messages in thread
From: Peter Grandi @ 2017-08-01 20:09 UTC (permalink / raw)
To: Linux fs Btrfs
>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe
>> size". [ ... ] several back-to-back 128KiB writes [ ... ] get
>> merged by the 3ware firmware only if it has a persistent
>> cache, and maybe your 3ware does not have one,
> KOS: No, I don't have a persistent cache, only the 512 MB cache
> on board the controller, which is battery-backed (BBU).
A battery-backed cache is a persistent cache (as I wrote, but it
seems that you don't have much time to read replies), so the size
of the write, 128KiB or not, should not matter much; the write
will be reported complete when it hits the persistent cache
(whichever technology it uses), and then the HA firmware will
spill the cached writes to the disks using the optimal operation
width.
Unless, that is, the 3ware firmware is really terrible (and
depending on model and vintage it can be amazingly terrible), or
the battery is no longer recharging and the host adapter has
switched to write-through.
That you see very different rates between uncompressed and
compressed writes, where the main difference is the limitation
on the segment size, seems to indicate that compressed writes
involve a lot of RMW, that is, sub-stripe updates. As I mentioned
already, it would be interesting to retry 'dd' with different
'bs' values without compression and with 'oflag=sync' (or
'oflag=direct', which only makes sense without compression).
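Something like this would cover that matrix (a sketch; the source
device, target mount point and 2GiB total size are placeholders
standing in for whatever is being tested):
# for b in 32 64 128 256 512 1024; do /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/tf-$b bs=${b}k count=$((2*1024*1024/b)) oflag=sync; done
and then the same sweep again with 'oflag=direct' instead of
'oflag=sync' on the mount without compression.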
> If I had additional SSD caching on the controller I would have
> mentioned it.
So far you had not mentioned the presence of the BBU cache either,
which is equivalent, even though in one of your previous messages
(which I try to read carefully) there were these lines:
>>>> Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
>>>> Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
So perhaps someone else would have checked long ago the status
of the BBU and whether the "No Write Cache if Bad BBU" case has
happened. If the BBU is still working and the policy is still
"WriteBack" then things are stranger still.
> I was also under the impression that in a situation where mostly
> extra-large files will be stored on the array, a bigger
> strip size would indeed increase the speed, so I went with
> the 256 KiB strip size.
That runs counter to this simple story: suppose a program is
doing 64KiB IO:
* For *reads*, there are 4 data drives and the strip size is
16KiB: the 64KiB will be read in parallel on 4 drives. If the
strip size is 256KiB then the 64KiB will be read sequentially
from just one disk, and 4 successive reads will be read
sequentially from the same drive.
* For *writes* on a parity RAID like RAID5 things are much, much
more extreme: the 64KiB will be written with 16KiB strips on a
5-wide RAID5 set in parallel to 5 drives, with 4 strips (plus
parity) being updated with RMW. But with 256KiB strips it will
partially update 5 drives, because the stripe is 1024+256KiB, and
it needs to do RMW, and four successive 64KiB writes will need to
do that too, even if only one drive is updated. Usually for
RAID5 there is an optimization that means that only the
specific target drive and the parity drive(s) need RMW, but
it is still very expensive.
This is the "storage for beginners" version, what happens in
practice however depends a lot on specific workload profile
(typical read/write size and latencies and rates), caching and
queueing algorithms in both Linux and the HA firmware.
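To put rough numbers on the case in this thread: with 256KiB strips
and 3 data + 1 parity drives the full data stripe is 3 x 256KiB =
768KiB, so each 128KiB compressed extent covers only a sixth of a
stripe and has to be completed by a read-modify-write of the rest;
with 32KiB strips the data stripe is 3 x 32KiB = 96KiB, so a 128KiB
write fills one whole stripe and only the 32KiB remainder needs RMW.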
> Would I be correct in assuming that a RAID strip size of 128
> KiB would be a better choice if one plans to use Btrfs with
> compression?
That would need to be tested, because it "depends a lot on the
specific workload profile, caching and queueing algorithms", but
my expectation is that the lower the better. Given that you have
4 drives giving a 3+1 RAID set, perhaps a 32KiB or 64KiB strip
size, giving a data stripe size of 96KiB or 192KiB, would be
better.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-08-01 20:09 ` Peter Grandi
@ 2017-08-01 23:54 ` Peter Grandi
2017-08-31 10:56 ` Konstantin V. Gavrilenko
1 sibling, 0 replies; 17+ messages in thread
From: Peter Grandi @ 2017-08-01 23:54 UTC (permalink / raw)
To: Linux fs Btrfs
[ ... ]
> This is the "storage for beginners" version, what happens in
> practice however depends a lot on specific workload profile
> (typical read/write size and latencies and rates), caching and
> queueing algorithms in both Linux and the HA firmware.
To add a bit of slightly more advanced discussion, the main
reason for larger strips ("chunk size") is to avoid the huge
latencies of disk rotation with unsynchronized disk drives, as
detailed here:
http://www.sabi.co.uk/blog/12-thr.html?120310#120310
That relates only weakly to Btrfs.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Btrfs + compression = slow performance and high cpu usage
2017-08-01 20:09 ` Peter Grandi
2017-08-01 23:54 ` Peter Grandi
@ 2017-08-31 10:56 ` Konstantin V. Gavrilenko
1 sibling, 0 replies; 17+ messages in thread
From: Konstantin V. Gavrilenko @ 2017-08-31 10:56 UTC (permalink / raw)
To: Linux fs Btrfs; +Cc: Peter Grandi
Hello again, list. I thought I would clear things up and describe what is happening with my troubled RAID setup.
Having received the help from the list, I initially ran a full defragmentation of all the data and recompressed everything with zlib.
That didn't help. Then I ran a full rebalance of the data, and that didn't help either.
So I had to take a disk out of the RAID, copy all the data onto it, recreate the RAID device with a 32 KiB strip size and a 96 KiB data stripe, and copy the data back. Then I added the disk back and resynced the RAID.
So currently the RAID device is
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
Size : 21.830 TB
Sector Size : 512
Is VD emulated : Yes
Parity Size : 7.276 TB
State : Optimal
Strip Size : 32 KB
Number Of Drives : 4
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: No
It is about 40% full with compressed data
# btrfs fi usage /mnt/arh-backup1/
Overall:
Device size: 21.83TiB
Device allocated: 8.98TiB
Device unallocated: 12.85TiB
Device missing: 0.00B
Used: 8.98TiB
Free (estimated): 12.85TiB (min: 6.43TiB)
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
I've decided to run a set of tests where a 5 GB file was written using different block sizes and different flags.
One file was generated with urandom data and another one was filled with zeroes. The data was written with and without compression, and it seems that without compression it is possible to gain 30-40% in speed, while the CPU was still about 50% idle during the highest loads.
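The runs were roughly of this shape (a sketch of the procedure
described above, not the exact script used; paths and file names
are placeholders):
# head -c 5G /dev/urandom > /tmp/rand5g
# head -c 5G /dev/zero > /tmp/zero5g
# for b in 32 64 128 256 512 1024; do /usr/bin/time dd if=/tmp/rand5g of=/mnt/arh-backup1/tf-$b bs=${b}k conv=fsync; done
with the same loop repeated for the zero-filled file, for the
'oflag=sync' and no-flag cases, and on mounts with
compress-force=zlib and with compression disabled.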
dd write speeds (MB/s)

flags: conv=fsync
            compress-force=zlib      compress-force=none
            RAND      ZERO           RAND      ZERO
bs=1024k    387       407            584       577
bs=512k     389       414            532       547
bs=256k     412       409            558       585
bs=128k     412       403            572       583
bs=64k      409       419            563       574
bs=32k      407       404            569       572

flags: oflag=sync
            compress-force=zlib      compress-force=none
            RAND      ZERO           RAND      ZERO
bs=1024k    86.1      97.0           203       210
bs=512k     50.6      64.4           85.0      170
bs=256k     25.0      29.8           67.6      67.5
bs=128k     13.2      16.4           48.4      49.8
bs=64k      7.4       8.3            24.5      27.9
bs=32k      3.8       4.1            14.0      13.7

flags: no flags
            compress-force=zlib      compress-force=none
            RAND      ZERO           RAND      ZERO
bs=1024k    480       419            681       595
bs=512k     422       412            633       585
bs=256k     413       384            707       712
bs=128k     414       387            695       704
bs=64k      482       467            622       587
bs=32k      416       412            610       598
I have also run a test where I filled the array to about 97% capacity and the write speed went down by about 50% compared with the empty RAID.
thanks for the help.
----- Original Message -----
From: "Peter Grandi" <pg@btrfs.list.sabi.co.UK>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Tuesday, 1 August, 2017 10:09:03 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage
>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe
>> size". [ ... ] several back-to-back 128KiB writes [ ... ] get
>> merged by the 3ware firmware only if it has a persistent
>> cache, and maybe your 3ware does not have one,
> KOS: No I don't have persistent cache. Only the 512 Mb cache
> on board of a controller, that is BBU.
If it is a persistent cache, that can be battery-backed (as I
wrote, but it seems that you don't have too much time to read
replies) then the size of the write, 128KiB or not, should not
matter much; the write will be reported complete when it hits
the persistent cache (whichever technology it used), and then
the HA fimware will spill write cached data to the disks using
the optimal operation width.
Unless the 3ware firmware is really terrible (and depending on
model and vintage it can be amazingly terrible) or the battery
is no longer recharging and then the host adapter switches to
write-through.
That you see very different rates between uncompressed and
compressed writes, where the main difference is the limitation
on the segment size, seems to indicate that compressed writes
involve a lot of RMW, that is sub-stripe updates. As I mentioned
already, it would be interesting to retry 'dd' with different
'bs' values without compression and with 'sync' (or 'direct'
which only makes sense without compression).
> If I had additional SSD caching on the controller I would have
> mentioned it.
So far you had not mentioned the presence of BBU cache either,
which is equivalent, even if in one of your previous message
(which I try to read carefully) there were these lines:
>>>> Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
>>>> Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
So perhaps someone else would have checked long ago the status
of the BBU and whether the "No Write Cache if Bad BBU" case has
happened. If the BBU is still working and the policy is still
"WriteBack" then things are stranger still.
> I was also under impression, that in a situation where mostly
> extra large files will be stored on the massive, the bigger
> strip size would indeed increase the speed, thus I went with
> with the 256 Kb strip size.
That runs counter to this simple story: suppose a program is
doing 64KiB IO:
* For *reads*, there are 4 data drives and the strip size is
16KiB: the 64KiB will be read in parallel on 4 drives. If the
strip size is 256KiB then the 64KiB will be read sequentially
from just one disk, and 4 successive reads will be read
sequentially from the same drive.
* For *writes* on a parity RAID like RAID5 things are much, much
more extreme: the 64KiB will be written with 16KiB strips on a
5-wide RAID5 set in parallel to 5 drives, with 4 stripes being
updated with RMW. But with 256KiB strips it will partially
update 5 drives, because the stripe is 1024+256KiB, and it
needs to do RMW, and four successive 64KiB drives will need to
do that too, even if only one drive is updated. Usually for
RAID5 there is an optimization that means that only the
specific target drive and the parity drives(s) need RMW, but
it is still very expensive.
This is the "storage for beginners" version, what happens in
practice however depends a lot on specific workload profile
(typical read/write size and latencies and rates), caching and
queueing algorithms in both Linux and the HA firmware.
> Would I be correct in assuming that the RAID strip size of 128
> Kb will be a better choice if one plans to use the BTRFS with
> compression?
That would need to be tested, because of "depends a lot on
specific workload profile, caching and queueing algorithms", but
my expectation is the the lower the better. Given that you have
4 drives giving a 3+1 RAID set, perhaps a 32KiB or 64KiB strip
size, given a data stripe size of 96KiB or 192KiB, would be
better.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread [~2017-08-31 10:56 UTC | newest]
Thread overview: 17+ messages
[not found] <33040946.535.1501254718807.JavaMail.gkos@dynomob>
2017-07-28 16:40 ` Btrfs + compression = slow performance and high cpu usage Konstantin V. Gavrilenko
2017-07-28 17:48 ` Roman Mamedov
2017-07-28 18:20 ` William Muriithi
2017-07-28 18:37 ` Hugo Mills
2017-07-28 18:08 ` Peter Grandi
2017-07-30 13:42 ` Konstantin V. Gavrilenko
2017-07-31 11:41 ` Peter Grandi
2017-07-31 12:33 ` Peter Grandi
2017-07-31 12:49 ` Peter Grandi
2017-08-01 9:58 ` Konstantin V. Gavrilenko
2017-08-01 10:53 ` Paul Jones
2017-08-01 13:14 ` Peter Grandi
2017-08-01 18:09 ` Konstantin V. Gavrilenko
2017-08-01 20:09 ` Peter Grandi
2017-08-01 23:54 ` Peter Grandi
2017-08-31 10:56 ` Konstantin V. Gavrilenko
2017-07-28 18:44 ` Peter Grandi