From: Martin Steigerwald <Martin@lichtvoll.de>
To: linux-btrfs@vger.kernel.org
Subject: Re: How to refresh degraded BTRFS? free space fragmentation, file fragmentation...
Date: Wed, 16 Jan 2013 21:39:10 +0100
Message-ID: <201301162139.11140.Martin@lichtvoll.de>
In-Reply-To: <201212091212.26248.Martin@lichtvoll.de>
On Sunday, 9 December 2012, Martin Steigerwald wrote:
> Hi!
>
> I have been running BTRFS on some systems for more than two years. My experience so
> far is: Performance at the beginning is pretty good, but some of my more
> often used BTRFS filesystems degrade badly in different areas. On some
> workloads this happens pretty quickly.
>
> There are, however, also some filesystems that did not degrade that badly. These were
> the ones that have way more free space left than the ones that degraded
> badly. About 900 GB free space left on my eSATA backup disk with BTRFS,
> which is also quite new. About 80 GB left on my BTRFS RAID 1 local home disk,
> where I can build Debian packages or kernels and such without the restrictions
> NFS brings (root squash). These still appear to be fine, but I redid the local
> home one with mkfs.btrfs -n 32768 and -l 32768 not too long ago, but I
> think it was quite fine before anyway, so I might have overdone it here.
> This already points at a way to prevent some degradation of BTRFS filesystems:
> Leave more free space.
>
>
> 1) fsync speed on my ThinkPad T23 has gone down that much that I use
[…]
It would be interesting to retry this after the latest fsync improvements.
> 2) File fragmentation: Example with a SUSE Manager VirtualBox on an
[…]
> 3) Freespace fragmentation on the / filesystem on this ThinkPad T520 with
> Intel SSD 320:
>
> === fstrim ===
>
> merkaba:~> /usr/bin/time fstrim -v /
> /: 6849871872 bytes were trimmed
> 0.00user 5.99system 0:44.69elapsed 13%CPU (0avgtext+0avgdata 752maxresident)k
> 0inputs+0outputs (0major+237minor)pagefaults 0swaps
>
> It took a second or two in the beginning.
>
>
> atop:
>
> LVM | rkaba-debian | busy 91% | read 0 | write 10313 | MBw/s 67.48 | avio 0.20 ms |
> […]
> DSK | sda | busy 90% | read 0 | write 10319 | MBw/s 67.54 | avio 0.19 ms |
> […]
>
> PID TID RUID THR SYSCPU USRCPU VGROW RGROW RDDSK WRDSK ST EXC S CPUNR CPU CMD 1/2
> 6085 - root 1 0.29s 0.00s 0K 0K 0K 0K -- - D 0 13% fstrim
>
>
> 10000 write requests in 10 seconds.
I was able to refresh my BTRFS with regard to this issue on 11 January:
merkaba:~> btrfs filesystem df /
Data: total=15.10GB, used=11.06GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.75GB, used=654.12MB
Metadata: total=8.00MB, used=0.00
merkaba:~> btrfs balance start -dusage=5 /
Done, had to relocate 0 out of 25 chunks
merkaba:~> btrfs filesystem df /
Data: total=15.01GB, used=11.06GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.75GB, used=654.05MB
Metadata: total=8.00MB, used=0.00
merkaba:~> btrfs balance start -d /
Done, had to relocate 16 out of 25 chunks
merkaba:~> btrfs filesystem df /
Data: total=11.09GB, used=11.06GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.75GB, used=647.72MB
Metadata: total=8.00MB, used=0.00
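The effect of the data balance is easiest to see as the gap between allocated and used space in each chunk type. As a rough sketch (not one of the commands that were actually run), the helper below computes that gap from `btrfs filesystem df` output. The function name `chunk_slack` is made up for illustration, and the parser assumes the older "total=..., used=..." format shown here and only handles lines where both values are in GB.

```sh
#!/bin/sh
# Report allocated-but-unused space per chunk type from `btrfs filesystem df`
# output on stdin. Assumption: the older "Type: total=X, used=Y" format shown
# in this mail, and only lines where both values are given in GB.
chunk_slack() {
    LC_ALL=C awk '/total=[0-9.]+GB, used=[0-9.]+GB/ {
        type = $0;  sub(/: total=.*/, "", type)      # text before ": total="
        total = $0; sub(/.*total=/, "", total); sub(/GB.*/, "", total)
        used = $0;  sub(/.*used=/, "", used);   sub(/GB.*/, "", used)
        printf "%s: %.2fGB allocated but unused\n", type, total - used
    }'
}

# Data lines before and after the full data balance, copied from above.
printf '%s\n' 'Data: total=15.10GB, used=11.06GB' \
              'Data: total=11.09GB, used=11.06GB' | chunk_slack
```

For the two Data lines this reports 4.04GB of slack before the full data balance and 0.03GB after it.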
merkaba:~> /usr/bin/time -v fstrim -v /
/: 2246623232 bytes were trimmed
Command being timed: "fstrim -v /"
User time (seconds): 0.00
System time (seconds): 2.34
Percent of CPU this job got: 10%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:21.84
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 748
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 239
Voluntary context switches: 110690
Involuntary context switches: 1426
Swaps: 0
File system inputs: 16
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
merkaba:~> btrfs balance start -fmconvert=single /
Done, had to relocate 8 out of 20 chunks
merkaba:~> btrfs filesystem df /
Data: total=11.09GB, used=11.06GB
System: total=36.00MB, used=4.00KB
Metadata: total=1.75GB, used=642.92MB
[406005.831307] btrfs: balance will reduce metadata integrity, use force if you want this
[406129.187057] btrfs: force reducing metadata integrity
[406129.199133] btrfs: relocating block group 9290383360 flags 36
[406132.645299] btrfs: found 6989 extents
[406132.673390] btrfs: relocating block group 8082423808 flags 36
[406135.807065] btrfs: found 6906 extents
[406135.841572] btrfs: relocating block group 7948206080 flags 36
[406138.413270] btrfs: found 4514 extents
[406138.435382] btrfs: relocating block group 6740246528 flags 36
[406142.572004] btrfs: found 10667 extents
[406142.638079] btrfs: relocating block group 6606028800 flags 36
[406146.272095] btrfs: found 19844 extents
[406146.289729] btrfs: relocating block group 6471811072 flags 36
[406149.136422] btrfs: found 14850 extents
[406149.159510] btrfs: relocating block group 29360128 flags 36
[406183.637010] btrfs: found 116645 extents
[406183.653225] btrfs: relocating block group 20971520 flags 34
[406183.671958] btrfs: found 1 extents
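The "flags" number in these kernel messages is a bit mask combining the chunk type and the replication profile. The sketch below decodes it, assuming the usual BTRFS_BLOCK_GROUP_* bit values from the kernel headers (DATA=1, SYSTEM=2, METADATA=4, RAID0=8, RAID1=16, DUP=32, RAID10=64). The function name `decode_bg_flags` is hypothetical. A value with no profile bit set corresponds to the single profile.

```sh
#!/bin/sh
# Decode the numeric "flags" value from
# "btrfs: relocating block group N flags F" kernel messages.
# Assumption: bit values as in the kernel's BTRFS_BLOCK_GROUP_* constants.
decode_bg_flags() {
    flags=$1
    out=""
    for pair in 1:DATA 2:SYSTEM 4:METADATA 8:RAID0 16:RAID1 32:DUP 64:RAID10; do
        bit=${pair%%:*}     # number before the colon
        name=${pair#*:}     # label after the colon
        if [ $(( flags & bit )) -ne 0 ]; then
            out="${out:+$out|}$name"
        fi
    done
    printf '%s\n' "${out:-none}"
}

decode_bg_flags 36   # METADATA|DUP
decode_bg_flags 34   # SYSTEM|DUP
```

Read this way, the block groups relocated above (flags 36 and 34) are the DUP metadata and DUP system chunks that the conversion to single rewrites.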
The metadata allocation was still at its old size, so I ran a regular metadata rebalance:
merkaba:~> btrfs balance start -m /
Done, had to relocate 8 out of 20 chunks
merkaba:~> btrfs filesystem df /
Data: total=11.09GB, used=11.06GB
System: total=36.00MB, used=4.00KB
Metadata: total=768.00MB, used=643.38MB
[406270.880962] btrfs: relocating block group 31801212928 flags 2
[406270.961955] btrfs: found 1 extents
[406270.976857] btrfs: relocating block group 31532777472 flags 4
[406270.990729] btrfs: relocating block group 31264342016 flags 4
[406271.006172] btrfs: relocating block group 30995906560 flags 4
[406271.020158] btrfs: relocating block group 30727471104 flags 4
[406271.480442] btrfs: found 5187 extents
[406271.515768] btrfs: relocating block group 30459035648 flags 4
[406277.158280] btrfs: found 54593 extents
[406277.173024] btrfs: relocating block group 30190600192 flags 4
[406284.680294] btrfs: found 63749 extents
[406284.756582] btrfs: relocating block group 29922164736 flags 4
[406290.907101] btrfs: found 59530 extents
merkaba:~> df -hT /
Filesystem Type Size Used Avail Use% Mounted on
/dev/dm-0 btrfs 19G 12G 6.8G 64% /
merkaba:~> /usr/bin/time -v fstrim -v /
/: 5472256 bytes were trimmed
Command being timed: "fstrim -v /"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 50%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 748
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 238
Voluntary context switches: 12
Involuntary context switches: 3
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
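For reference, the sequence used above can be collected into one place. The following is only a sketch: the hypothetical function `refresh_plan` prints the commands for a given mount point instead of running them, so nothing is changed until the output has been reviewed. The DUP to single metadata conversion is deliberately left out, since it reduces metadata redundancy and is a separate decision.

```sh
#!/bin/sh
# Print (do not run) the balance and trim sequence from this mail for one
# mount point. The usage=5 filter is simply the value tried above; it is not
# a recommendation.
refresh_plan() {
    mnt=${1:-/}
    printf '%s\n' \
        "btrfs filesystem df $mnt" \
        "btrfs balance start -dusage=5 $mnt" \
        "btrfs balance start -d $mnt" \
        "btrfs balance start -m $mnt" \
        "btrfs filesystem df $mnt" \
        "fstrim -v $mnt"
}

refresh_plan /
```

Printing the plan keeps the sketch harmless. The commands need root, and a full balance rewrites every data chunk, so it can take a while on a larger filesystem.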
Today it is still fast:
merkaba:~#1> /usr/bin/time -v fstrim /
Command being timed: "fstrim /"
User time (seconds): 0.00
System time (seconds): 0.03
Percent of CPU this job got: 17%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.19
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 708
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 227
Voluntary context switches: 736
Involuntary context switches: 35
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Boot time seems a tad bit slower, though:
merkaba:~> systemd-analyze
Startup finished in 5495ms (kernel) + 6331ms (userspace) = 11827ms
merkaba:~> systemd-analyze blame
3051ms cups.service
2330ms dirmngr.service
2267ms postfix.service
1411ms schroot.service
1385ms lvm2.service
1230ms network-manager.service
1128ms ssh.service
1117ms acpi-fakekey.service
1112ms avahi-daemon.service
1061ms privoxy.service
1010ms systemd-logind.service
721ms loadcpufreq.service
646ms colord.service
552ms kdm.service
533ms networking.service
532ms keyboard-setup.service
463ms remount-rootfs.service
368ms bootlogs.service
349ms udev.service
327ms console-kit-log-system-start.service
326ms postgresql.service
322ms binfmt-support.service
316ms acpi-support.service
315ms qemu-kvm.service
310ms sys-kernel-debug.mount
309ms dev-mqueue.mount
309ms anacron.service
303ms atd.service
297ms sys-kernel-security.mount
282ms cron.service
282ms dev-hugepages.mount
272ms lightdm.service
271ms console-kit-daemon.service
271ms lirc.service
268ms lxc.service
259ms cpufrequtils.service
259ms mdadm.service
252ms openntpd.service
240ms smartmontools.service
240ms alsa-utils.service
237ms run-user.mount
237ms speech-dispatcher.service
230ms udftools.service
229ms run-lock.mount
229ms systemd-remount-api-vfs.service
224ms ebtables.service
214ms openbsd-inetd.service
208ms motd.service
199ms hdparm.service
198ms irqbalance.service
190ms mountdebugfs.service
181ms saned.service
160ms systemd-user-sessions.service
157ms polkitd.service
147ms screen-cleanup.service
146ms console-setup.service
141ms networking-routes.service
140ms pppd-dns.service
130ms rc.local.service
130ms jove.service
128ms sysstat.service
112ms rsyslog.service
111ms udev-trigger.service
103ms home.mount
93ms systemd-sysctl.service
89ms boot.mount
85ms dns-clean.service
84ms kbd.service
66ms upower.service
60ms systemd-tmpfiles-setup.service
53ms openvpn.service
37ms boot-efi.mount
27ms udisks.service
22ms sysfsutils.service
22ms mdadm-raid.service
20ms proc-sys-fs-binfmt_misc.mount
18ms tmp.mount
2ms sys-fs-fuse-connections.mount
> vmstat 1:
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 3 0 1963688 3943380 156972 1827836 0 0 0 0 5421 15781 6 6 88 0
> 0 0 1963688 3943132 156972 1827852 0 0 0 0 5733 16478 9 7 83 0
> 1 0 1963688 3943008 156972 1827992 0 0 0 0 5050 14434 0 4 96 0
> 1 0 1963688 3949768 156972 1826708 0 0 0 0 5246 14960 2 5 93 0
> 0 0 1963688 3949644 156980 1826712 0 0 0 36 5104 14996 1 4 94 0
> 0 0 1963688 3949768 156980 1826720 0 0 0 0 5102 15210 2 4 94 0
> 3 0 1963688 3949644 156980 1826720 0 0 0 0 5321 15995 4 7 89 0
> 0 0 1963688 3949396 156980 1827188 0 0 0 0 5316 15616 6 5 88 0
> 1 0 1963688 3949148 156980 1827188 0 0 0 0 5102 14944 1 4 95 0
> 1 0 1963688 3949272 156980 1827188 0 0 0 0 5510 15928 5 6 89 0
> 1 0 1963688 3949272 156980 1827188 0 0 0 52 5107 15054 2 4 94 0
> 0 0 1963688 3949396 156980 1826868 0 0 0 4 4930 14567 1 4 95 0
> 1 0 1963688 3949396 156988 1826828 0 0 0 52 5132 15014 2 5 93 0
> 3 0 1963688 3949396 156988 1826836 0 0 0 0 5015 14447 1 4 95 0
> 0 0 1963688 3949520 156988 1826836 0 0 0 0 5233 15652 3 6 91 0
> 1 0 1963684 3949612 156988 1827172 0 0 0 3032 2546 7555 6 4 84 6
>
> After fstrim:
>
> 0 0 1963684 3944244 157016 1827752 0 0 0 0 357 1018 2 1 97 0
> 1 0 1963684 3943776 157024 1827776 0 0 0 64 634 1660 4 2 93 0
> 0 0 1963684 3943872 157024 1827784 0 0 0 0 180 473 0 0 99 0
>
>
> The I/O activity does not seem to be reflected in vmstat, I bet because the page
> cache is not involved.
> === fallocate ===
>
> merkaba:/var/tmp> /usr/bin/time fallocate -l 2G fallocate-test
> 0.00user 118.85system 2:00.50elapsed 98%CPU (0avgtext+0avgdata 720maxresident)k
> 14912inputs+49112outputs (0major+227minor)pagefaults 0swaps
Now, let's try this:
merkaba:/var/tmp> /usr/bin/time -v fallocate -l 2G fallocate-test
Command being timed: "fallocate -l 2G fallocate-test"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 80%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 724
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 231
Voluntary context switches: 5
Involuntary context switches: 6
Swaps: 0
File system inputs: 80
File system outputs: 72
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
There we go :)
> Filesystem type is: 9123683e
> File size of fallocate-test is 2147483648 (524288 blocks, blocksize 4096)
> ext logical physical expected length flags
> 0 0 2626450 2048
> 1 2048 3215128 2628498 2040
> 2 4088 3408631 3217168 2032
> 3 6120 3430045 3410663 2024
> 4 8144 3439999 3432069 2016
> 5 10160 3474610 3442015 1004
> 6 11164 3743715 3475614 1002
[…]
> fallocate-test: 4556 extents found
merkaba:/var/tmp> filefrag -v fallocate-test
Filesystem type is: 9123683e
File size of fallocate-test is 2147483648 (524288 blocks, blocksize 4096)
ext logical physical expected length flags
0 0 8501248 524288 eof
fallocate-test: 1 extent found
Yes, that's the same filesystem :)
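To put the two filefrag results side by side, here is a small sketch that pulls the extent count out of the filefrag summary line and derives the average extent length in blocks. Both function names are made up for illustration, and the summary line format is assumed to match what is shown in this mail.

```sh
#!/bin/sh
# Extract the extent count from a filefrag summary line on stdin
# ("NAME: N extents found" or "NAME: 1 extent found").
extent_count() {
    sed -n 's/^.*: \([0-9][0-9]*\) extents\{0,1\} found$/\1/p'
}

# Average extent length in filesystem blocks (integer division):
# $1 = file size in blocks, $2 = number of extents.
avg_extent_blocks() {
    echo $(( $1 / $2 ))
}

before=$(echo 'fallocate-test: 4556 extents found' | extent_count)
after=$(echo 'fallocate-test: 1 extent found' | extent_count)
avg_extent_blocks 524288 "$before"   # 115 blocks, roughly 460 KiB at 4096 bytes per block
avg_extent_blocks 524288 "$after"    # 524288 blocks, the whole 2 GiB file in one extent
```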
> But:
>
> merkaba:/var/tmp> /usr/bin/time rm fallocate-test
> 0.00user 0.24system 0:00.38elapsed 63%CPU (0avgtext+0avgdata 784maxresident)k
> 4464inputs+36184outputs (0major+243minor)pagefaults 0swaps
merkaba:/var/tmp> /usr/bin/time rm fallocate-test
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 784maxresident)k
0inputs+24outputs (0major+243minor)pagefaults 0swaps
> Some more information on the filesystem in question:
>
> merkaba:/home/martin/Linux/Dateisysteme/BTRFS/btrfs-progs-unstable> ./btrfs fi sh
> failed to read /dev/sr0
> Label: 'debian' uuid: […]
> Total devices 1 FS bytes used 13.56GB
> devid 1 size 18.62GB used 18.62GB path /dev/dm-0
>
> Btrfs v0.19-239-g0155e84
>
>
> merkaba:/home/martin/Linux/Dateisysteme/BTRFS/btrfs-progs-unstable> ./btrfs fi df /
> Disk size: 18.62GB
> Disk allocated: 18.62GB
> Disk unallocated: 0.00
> Used: 13.56GB
> Free (Estimated): 3.31GB (Max: 3.31GB, min: 3.31GB)
> Data to disk ratio: 91 %
>
>
> merkaba:/home/martin/Linux/Dateisysteme/BTRFS/btrfs-progs-unstable> ./btrfs fi disk-usage /
> Data,Single: Size:15.10GB, Used:12.94GB
> /dev/dm-0 15.10GB
>
> Metadata,Single: Size:8.00MB, Used:0.00
> /dev/dm-0 8.00MB
>
> Metadata,DUP: Size:1.75GB, Used:630.11MB
> /dev/dm-0 3.50GB
>
> System,Single: Size:4.00MB, Used:0.00
> /dev/dm-0 4.00MB
>
> System,DUP: Size:8.00MB, Used:4.00KB
> /dev/dm-0 16.00MB
>
> Unallocated:
> /dev/dm-0 0.00
> merkaba:/home/martin/Linux/Dateisysteme/BTRFS/btrfs-progs-unstable> ./btrfs dev disk-usage /
> /dev/dm-0 18.62GB
> Data,Single: 15.10GB
> Metadata,Single: 8.00MB
> Metadata,DUP: 3.50GB
> System,Single: 4.00MB
> System,DUP: 16.00MB
> Unallocated: 0.00
Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7