* lots of snapshots, forcing defragmentation, and figuring out which files to defrag or nodatacow
@ 2015-09-29 5:51 James Cook
2015-09-30 6:33 ` Duncan
0 siblings, 1 reply; 2+ messages in thread
From: James Cook @ 2015-09-29 5:51 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 2183 bytes --]
(Trying again in plain text and without the 100KiB log file.)
Hi linux-btrfs,
The context of these three questions is that I'm experiencing
occasional hangs for several seconds while btrfs-transacti works, and
very long boot times. General comments welcome. System info at bottom,
end part of dmesg.log attached.
Q1:
I keep a lot of read-only snapshots (> 300 total) of all of my
subvolumes and haven't deleted any so far. Is this in itself a problem
or unanticipated use of btrfs?
Q2:
I have some files that remain heavily fragmented (according to
filefrag) even after I run btrfs fi defragment. I think this happens
because btrfs doesn't want to unlink parts of the files from their
snapshotted copies. Can I tell btrfs to defragment them anyway, and
not worry about wasting space? And can I make the autodefrag mount
option do this?
For example (not all output shown):
# filefrag *
...
system@1973a03e3af1449ba5dd93362953fd5f-0000000000000001-00051f9377f11af6.journal:
553 extents found
...
# btrfs fi defragment -rf .
# filefrag *
...
system@1973a03e3af1449ba5dd93362953fd5f-0000000000000001-00051f9377f11af6.journal:
331 extents found
...
Q3:
What's the best way to tell which files are causing the hangs? My
current method is to make an educated guess (e.g. think of programs
that store large database files) then use filefrag to see if there's
fragmentation. I'm not confident I've found all the sources of bad
performance.
Thanks,
James
System info:
[root@angel-nixos:~]# uname -a
Linux angel-nixos 3.18.20 #1-NixOS SMP Thu Jan 1 00:00:01 UTC 1970
x86_64 GNU/Linux
[root@angel-nixos:~]# btrfs --version
btrfs-progs v4.2
[root@angel-nixos:~]# btrfs fi show
Label: 'AngelBtrfs' uuid: 7f4b4b5d-1ba5-46cc-b782-938e3600a427
Total devices 2 FS bytes used 1.06TiB
devid 5 size 2.00TiB used 1.07TiB path /dev/mapper/KfricBtrSec
devid 6 size 2.00TiB used 1.07TiB path /dev/mapper/SfricBtrSec
btrfs-progs v4.2
[root@angel-nixos:~]# btrfs fi df /
Data, RAID1: total=1.01TiB, used=1.01TiB
System, RAID1: total=32.00MiB, used=192.00KiB
Metadata, RAID1: total=60.00GiB, used=58.65GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
[-- Attachment #2: end_of_dmesg.log --]
[-- Type: application/octet-stream, Size: 37047 bytes --]
[ 284.033786] systemd[1]: Stopping Journal Service...
[ 284.038198] systemd[1]: Starting Journal Service...
[ 286.215587] systemd-journald[564]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 344.151729] systemd[1]: systemd-journald.service watchdog timeout (limit 1min)!
[ 434.389389] systemd[1]: systemd-journald.service stop-sigabrt timed out. Terminating.
[ 446.750751] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[ 446.755534] systemd[1]: Failed to start Journal Service.
[ 446.775552] systemd[1]: Dependency failed for Trigger Flushing of Journal to Persistent Storage.
[ 446.786049] systemd[1]: Unit systemd-journald.service entered failed state.
[ 446.786108] systemd[1]: systemd-journald.service failed.
[ 446.789210] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[ 446.809674] systemd[1]: Stopping Journal Service...
[ 446.960987] systemd[1]: Starting Journal Service...
[ 446.970223] systemd-journald[571]: File /run/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 480.134384] INFO: task btrfs-transacti:223 blocked for more than 120 seconds.
[ 480.136736] Not tainted 3.18.20 #1-NixOS
[ 480.139076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 480.141437] btrfs-transacti D ffff88042fc92c40 0 223 2 0x00000000
[ 480.141441] ffff8800cbb73d78 0000000000000046 ffff880418a48880 0000000000012c40
[ 480.141444] ffff8800cbb73fd8 0000000000012c40 ffff88041953cc80 ffff880418a48880
[ 480.141446] ffff8800cbb73d88 ffff880418cf4000 ffff880418cf4070 ffff8800cb4181a0
[ 480.141448] Call Trace:
[ 480.141455] [<ffffffff814b2459>] schedule+0x29/0x70
[ 480.141482] [<ffffffffa005c6d5>] wait_for_commit.isra.14+0x45/0x80 [btrfs]
[ 480.141486] [<ffffffff8109f990>] ? wait_woken+0x60/0x60
[ 480.141497] [<ffffffffa005d4d2>] btrfs_commit_transaction+0x132/0xa70 [btrfs]
[ 480.141507] [<ffffffffa005dea3>] ? start_transaction+0x93/0x5a0 [btrfs]
[ 480.141517] [<ffffffffa0059345>] transaction_kthread+0x235/0x250 [btrfs]
[ 480.141527] [<ffffffffa0059110>] ? btrfs_cleanup_transaction+0x540/0x540 [btrfs]
[ 480.141530] [<ffffffff81083bd9>] kthread+0xc9/0xe0
[ 480.141534] [<ffffffff81010000>] ? xen_smp_intr_init+0x10/0x220
[ 480.141536] [<ffffffff81083b10>] ? kthread_create_on_node+0x180/0x180
[ 480.141538] [<ffffffff814b65d8>] ret_from_fork+0x58/0x90
[ 480.141541] [<ffffffff81083b10>] ? kthread_create_on_node+0x180/0x180
[ 505.704286] BTRFS info (device dm-0): disk space caching is enabled
[ 505.716195] systemd[1]: Started Remount Root and Kernel File Systems.
[ 505.716910] systemd[1]: Starting Load/Save RF Kill Switch Status of rfkill0...
[ 505.719830] systemd[1]: Starting Local File Systems (Pre).
[ 505.722322] systemd[1]: Reached target Local File Systems (Pre).
[ 505.722360] systemd[1]: Mounting /tmp...
[ 505.725196] systemd[1]: Mounting /home...
[ 505.727987] systemd[1]: Mounting /btr...
[ 505.730711] systemd[1]: Mounting /boot...
[ 505.733387] systemd[1]: Starting Load/Save Random Seed...
[ 505.736061] systemd[1]: Starting Update UTMP about System Boot/Shutdown...
[ 505.741789] systemd[1]: Mounted /tmp.
[ 505.749331] systemd[1]: Mounted /btr.
[ 505.764618] systemd[1]: Started Load/Save RF Kill Switch Status of rfkill0.
[ 505.816026] EXT2-fs (sdb3): warning: mounting unchecked fs, running e2fsck is recommended
[ 505.818618] systemd[1]: Mounted /boot.
[ 505.845271] systemd[1]: Mounted /home.
[ 505.846029] systemd[1]: Starting Local File Systems.
[ 505.848239] systemd[1]: Reached target Local File Systems.
[ 505.897926] systemd[1]: Started Update UTMP about System Boot/Shutdown.
[ 505.899460] systemd-journald[571]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 505.902188] systemd[1]: Started Load/Save Random Seed.
[ 507.129606] systemd[1]: systemd-journald.service watchdog timeout (limit 1min)!
[ 510.964005] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[ 510.968902] systemd[1]: Failed to start Journal Service.
[ 510.977288] systemd[1]: Dependency failed for Trigger Flushing of Journal to Persistent Storage.
[ 510.979311] systemd[1]: Unit systemd-journald.service entered failed state.
[ 510.979315] systemd[1]: systemd-journald.service failed.
[ 510.979395] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[ 510.981588] systemd[1]: Stopping Journal Service...
[ 510.985870] systemd[1]: Starting Journal Service...
[ 510.991713] systemd-journald[596]: File /run/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 512.212260] systemd-journald[596]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 512.376395] systemd[1]: Started Journal Service.
[ 512.531016] systemd-journald[596]: Received request to flush runtime journal from PID 1
[ 528.320098] systemd-journald[596]: Deleted empty journal /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system@00051e779bc895e3-df837743793ef8e2.journal~ (8912896 bytes).
[ 528.320204] systemd-journald[596]: Deleted empty journal /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system@00051e77ace49f89-328cc5f91fee7d07.journal~ (8912896 bytes).
[ 531.116932] r8169 0000:03:00.0 enp3s0: link down
[ 531.116976] r8169 0000:03:00.0 enp3s0: link down
[ 531.117053] IPv6: ADDRCONF(NETDEV_UP): enp3s0: link is not ready
[ 531.272952] ieee80211 phy0: rt2x00lib_request_firmware: Info - Loading firmware file 'rt2561s.bin'
[ 531.482224] ieee80211 phy0: rt2x00lib_request_firmware: Info - Firmware detected - version: 0.8
[ 531.500620] IPv6: ADDRCONF(NETDEV_UP): wlp4s6: link is not ready
[ 533.895600] r8169 0000:03:00.0 enp3s0: link up
[ 533.895613] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
[ 534.560379] NET: Registered protocol family 17
[ 1136.224439] fuse init (API version 7.23)
[112274.182588] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[112274.182918] systemd[1]: Unit systemd-journald.service entered failed state.
[112274.182947] systemd[1]: systemd-journald.service failed.
[112274.183641] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[112274.183823] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[112274.184460] systemd[1]: Stopping Journal Service...
[112274.184491] systemd[1]: Starting Journal Service...
[112274.658856] systemd-journald[17402]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[112274.742158] systemd[1]: Started Journal Service.
[158451.247261] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[158451.247452] systemd[1]: Unit systemd-journald.service entered failed state.
[158451.247471] systemd[1]: systemd-journald.service failed.
[158451.247816] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[158451.247869] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[158451.248096] systemd[1]: Stopping Journal Service...
[158451.248111] systemd[1]: Starting Journal Service...
[158451.690339] systemd-journald[2422]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[158451.822651] systemd[1]: Started Journal Service.
[233033.975956] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[233033.976268] systemd[1]: Unit systemd-journald.service entered failed state.
[233033.976297] systemd[1]: systemd-journald.service failed.
[233033.976863] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[233033.976968] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[233033.977382] systemd[1]: Stopping Journal Service...
[233033.977411] systemd[1]: Starting Journal Service...
[233038.512986] systemd-journald[1625]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[233038.774155] systemd[1]: Started Journal Service.
[292247.500385] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[292247.500600] systemd[1]: Unit systemd-journald.service entered failed state.
[292247.500619] systemd[1]: systemd-journald.service failed.
[292247.500954] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[292247.501008] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[292247.501235] systemd[1]: Stopping Journal Service...
[292247.501251] systemd[1]: Starting Journal Service...
[292248.076928] systemd-journald[16434]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[292248.253405] systemd[1]: Started Journal Service.
[349176.790408] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[349176.790605] systemd[1]: Unit systemd-journald.service entered failed state.
[349176.790624] systemd[1]: systemd-journald.service failed.
[349176.790960] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[349176.791017] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[349176.791250] systemd[1]: Stopping Journal Service...
[349176.791266] systemd[1]: Starting Journal Service...
[349177.409266] systemd-journald[14306]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[349177.515251] systemd[1]: Started Journal Service.
[548527.442235] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[548527.442485] systemd[1]: Unit systemd-journald.service entered failed state.
[548527.442507] systemd[1]: systemd-journald.service failed.
[548527.442828] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[548527.443021] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[548527.443336] systemd[1]: Stopping Journal Service...
[548527.443358] systemd[1]: Starting Journal Service...
[548527.559358] systemd-journald[27165]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[548528.256488] systemd[1]: Started Journal Service.
[602054.223847] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[602054.224360] systemd[1]: Unit systemd-journald.service entered failed state.
[602054.224404] systemd[1]: systemd-journald.service failed.
[602054.225092] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[602054.225522] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[602054.226274] systemd[1]: Stopping Journal Service...
[602054.226324] systemd[1]: Starting Journal Service...
[602059.969637] systemd-journald[32584]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[602060.026528] systemd[1]: Started Journal Service.
[638529.495516] ata1: exception Emask 0x40 SAct 0x0 SErr 0xc0802 action 0x7
[638529.495523] ata1: SError: { RecovComm HostInt CommWake 10B8B }
[638529.495528] ata1: hard resetting link
[638529.954956] ata1: softreset failed (device not ready)
[638529.954963] ata1: applying PMP SRST workaround and retrying
[638530.109953] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[638530.113215] ata1.00: configured for UDMA/133
[638530.113320] ata1: EH complete
[638530.154548] IPv6: ADDRCONF(NETDEV_UP): tinc.falsifian: link is not ready
[638530.184193] AMD64 EDAC driver v3.4.0
[638530.184218] EDAC amd64: DRAM ECC disabled.
[638530.184225] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[638530.365021] IPv6: ADDRCONF(NETDEV_CHANGE): tinc.falsifian: link becomes ready
[638532.708691] floppy0: no floppy controllers found
[640237.872611] systemd-journald[20204]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[665755.887777] systemd-journald[22499]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[705440.408077] systemd-journald[22798]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[766454.346446] systemd-journald[20042]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[798900.686555] systemd-journald[568]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[837105.392613] systemd-journald[23946]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[935846.508524] systemd-journald[7925]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[974374.679578] systemd-journald[32370]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1011772.002160] systemd-journald[16655]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1047931.829130] systemd-journald[23534]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1101720.396241] systemd-journald[31569]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1129694.621786] systemd-journald[10564]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1162516.109559] systemd-journald[25415]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1219271.147663] systemd-journald[22864]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1302532.395219] ata1: exception Emask 0x40 SAct 0x0 SErr 0xc0802 action 0x7
[1302532.395226] ata1: SError: { RecovComm HostInt CommWake 10B8B }
[1302532.395230] ata1: hard resetting link
[1302532.855322] ata1: softreset failed (device not ready)
[1302532.855328] ata1: applying PMP SRST workaround and retrying
[1302532.861059] AMD64 EDAC driver v3.4.0
[1302532.861088] EDAC amd64: DRAM ECC disabled.
[1302532.861095] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[1302533.010346] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[1302533.013614] ata1.00: configured for UDMA/133
[1302533.013727] ata1: EH complete
[1302533.138738] IPv6: ADDRCONF(NETDEV_UP): tinc.falsifian: link is not ready
[1302533.503254] IPv6: ADDRCONF(NETDEV_CHANGE): tinc.falsifian: link becomes ready
[1302535.722052] floppy0: no floppy controllers found
[1302538.217344] dhcpcd[8585]: segfault at ec ip 00000000004168e7 sp 00007ffc88782610 error 4 in dhcpcd[400000+3f000]
[1339562.535228] systemd-journald[10896]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1388548.974100] systemd-journald[15322]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1435252.783519] systemd-journald[3341]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1523136.004440] r8169 0000:03:00.0 enp3s0: link down
[1523138.823880] r8169 0000:03:00.0 enp3s0: link up
[1523152.137298] r8169 0000:03:00.0 enp3s0: link down
[1523155.069617] r8169 0000:03:00.0 enp3s0: link up
[1523189.283573] r8169 0000:03:00.0 enp3s0: link down
[1523192.017447] r8169 0000:03:00.0 enp3s0: link up
[1523193.301725] r8169 0000:03:00.0 enp3s0: link down
[1523196.006643] r8169 0000:03:00.0 enp3s0: link up
[1523788.871112] r8169 0000:03:00.0 enp3s0: link down
[1523791.692605] r8169 0000:03:00.0 enp3s0: link up
[1523805.452208] r8169 0000:03:00.0 enp3s0: link down
[1523808.364887] r8169 0000:03:00.0 enp3s0: link up
[1523843.940463] r8169 0000:03:00.0 enp3s0: link down
[1523846.656012] r8169 0000:03:00.0 enp3s0: link up
[1523847.064816] r8169 0000:03:00.0 enp3s0: link down
[1523849.877385] r8169 0000:03:00.0 enp3s0: link up
[1534050.092030] systemd-journald[12778]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1709686.159418] systemd-journald[10125]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1805130.007296] systemd-journald[1813]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1852161.961582] QXcbEventReader[26261]: segfault at 7f6ec509b219 ip 00007f6ec509b219 sp 00007f6ec2e2de60 error 14 in locale-archive[7f6ec5123000+6530000]
[1853062.618241] usb 1-6: new high-speed USB device number 5 using ehci-pci
[1853062.734877] usb 1-6: New USB device found, idVendor=067b, idProduct=2507
[1853062.734889] usb 1-6: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[1853062.734896] usb 1-6: Product: Mass Storage Device
[1853062.734902] usb 1-6: Manufacturer: Prolific Technology Inc.
[1853062.734907] usb 1-6: SerialNumber: 0
[1853062.819732] usb-storage 1-6:1.0: USB Mass Storage device detected
[1853062.819803] usb-storage 1-6:1.0: Quirks match for vid 067b pid 2507: 110
[1853062.819824] scsi host10: usb-storage 1-6:1.0
[1853062.819922] usbcore: registered new interface driver usb-storage
[1853062.828404] usbcore: registered new interface driver uas
[1853063.856662] scsi 10:0:0:0: Direct-Access WDC WD25 00BB-22GUA0 08.0 PQ: 0 ANSI: 0
[1853063.858643] sd 10:0:0:0: [sdc] Adjusting the sector count from its reported value: 488395055
[1853063.858660] sd 10:0:0:0: [sdc] 488395054 512-byte logical blocks: (250 GB/232 GiB)
[1853063.859768] sd 10:0:0:0: [sdc] Write Protect is off
[1853063.859773] sd 10:0:0:0: [sdc] Mode Sense: 03 00 00 00
[1853063.860885] sd 10:0:0:0: [sdc] No Caching mode page found
[1853063.860898] sd 10:0:0:0: [sdc] Assuming drive cache: write through
[1853063.861986] sd 10:0:0:0: [sdc] Adjusting the sector count from its reported value: 488395055
[1853063.885296] sdc: sdc2 sdc3
[1853063.887642] sd 10:0:0:0: [sdc] Adjusting the sector count from its reported value: 488395055
[1853063.889865] sd 10:0:0:0: [sdc] Attached SCSI disk
[1853100.418357] NET: Registered protocol family 38
[1853111.467163] kjournald starting. Commit interval 5 seconds
[1853111.475557] EXT3-fs (dm-5): using internal journal
[1853111.475568] EXT3-fs (dm-5): mounted filesystem with ordered data mode
[1853113.205623] kjournald starting. Commit interval 5 seconds
[1853113.206536] EXT3-fs (dm-3): using internal journal
[1853113.206548] EXT3-fs (dm-3): mounted filesystem with ordered data mode
[1854004.796093] usb 1-6: USB disconnect, device number 5
[1893777.482161] systemd-journald[9755]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[1954173.452791] systemd-journald[1168]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2159674.482332] systemd-journald[26046]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2413243.809186] systemd-journald[16685]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2413244.182847] systemd-journald[16685]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
[2455070.702439] IPv6: ADDRCONF(NETDEV_UP): tinc.falsifian: link is not ready
[2455070.732450] IPv6: ADDRCONF(NETDEV_CHANGE): tinc.falsifian: link becomes ready
[2455070.748594] AMD64 EDAC driver v3.4.0
[2455070.748620] EDAC amd64: DRAM ECC disabled.
[2455070.748627] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[2455074.546520] floppy0: no floppy controllers found
[2455074.587054] ata1: exception Emask 0x40 SAct 0x0 SErr 0xc0802 action 0x7
[2455074.587060] ata1: SError: { RecovComm HostInt CommWake 10B8B }
[2455074.587065] ata1: hard resetting link
[2455075.046461] ata1: softreset failed (device not ready)
[2455075.046468] ata1: applying PMP SRST workaround and retrying
[2455075.201461] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[2455075.204699] ata1.00: configured for UDMA/133
[2455075.204805] ata1: EH complete
[2455200.218465] systemd[1]: systemd-journald.service watchdog timeout (limit 1min)!
[2455223.431564] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[2455223.432205] systemd[1]: Unit systemd-journald.service entered failed state.
[2455223.432275] systemd[1]: systemd-journald.service failed.
[2455223.433561] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[2455223.433719] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[2455223.434506] systemd[1]: Stopping Journal Service...
[2455223.434559] systemd[1]: Starting Journal Service...
[2455245.372012] systemd-journald[28692]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2455247.911994] systemd-journald[28692]: Deleted empty journal /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/user-1000@000520a97d68ba65-78b8fafa5aa71dcb.journal~ (8388608 bytes).
[2455248.909262] systemd[1]: Started Journal Service.
[2455249.017045] systemd[1]: Unit nix-daemon.service entered failed state.
[2455249.027029] systemd[1]: nix-daemon.service failed.
[2455281.209611] systemd[1]: user@104.service start operation timed out. Terminating.
[2455349.753201] systemd[1]: Unit user@104.service entered failed state.
[2455349.763189] systemd[1]: user@104.service failed.
[2457003.334505] usb 1-2: new high-speed USB device number 6 using ehci-pci
[2457003.477908] usb 1-2: New USB device found, idVendor=059f, idProduct=1053
[2457003.477914] usb 1-2: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[2457003.477917] usb 1-2: Product: P'9230 Desktop Drive
[2457003.477920] usb 1-2: Manufacturer: LaCie
[2457003.477922] usb 1-2: SerialNumber: 0000000025bc285c00e2
[2457003.478201] usb-storage 1-2:1.0: USB Mass Storage device detected
[2457003.478279] scsi host11: usb-storage 1-2:1.0
[2457014.033362] scsi 11:0:0:0: Direct-Access ST4000DM 000-1F2168 CC54 PQ: 0 ANSI: 5
[2457014.035875] sd 11:0:0:0: [sdc] 976754646 4096-byte logical blocks: (4.00 TB/3.63 TiB)
[2457014.036859] sd 11:0:0:0: [sdc] Write Protect is off
[2457014.036870] sd 11:0:0:0: [sdc] Mode Sense: 23 00 00 00
[2457014.037847] sd 11:0:0:0: [sdc] No Caching mode page found
[2457014.037856] sd 11:0:0:0: [sdc] Assuming drive cache: write through
[2457014.039594] sd 11:0:0:0: [sdc] 976754646 4096-byte logical blocks: (4.00 TB/3.63 TiB)
[2457014.067371] sdc: sdc1 sdc2
[2457014.069335] sd 11:0:0:0: [sdc] 976754646 4096-byte logical blocks: (4.00 TB/3.63 TiB)
[2457014.071469] sd 11:0:0:0: [sdc] Attached SCSI disk
[2457043.636970] BTRFS: device label pers-btr-sec devid 1 transid 7552 /dev/dm-3
[2457049.675503] BTRFS info (device dm-3): disk space caching is enabled
[2457419.484255] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[2457419.484767] systemd[1]: Unit systemd-journald.service entered failed state.
[2457419.484809] systemd[1]: systemd-journald.service failed.
[2457419.485731] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[2457419.485876] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[2457419.486624] systemd[1]: Stopping Journal Service...
[2457419.486685] systemd[1]: Starting Journal Service...
[2457421.313955] systemd-journald[27942]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2457479.726539] systemd[1]: systemd-journald.service watchdog timeout (limit 1min)!
[2457488.030875] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[2457488.031433] systemd[1]: Failed to start Journal Service.
[2457488.031527] systemd[1]: Dependency failed for Trigger Flushing of Journal to Persistent Storage.
[2457488.031643] systemd[1]: Unit systemd-journald.service entered failed state.
[2457488.031655] systemd[1]: systemd-journald.service failed.
[2457488.032579] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[2457488.032719] systemd[1]: Stopping Journal Service...
[2457488.032788] systemd[1]: Starting Journal Service...
[2457508.195526] systemd-journald[27960]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2457512.342138] systemd-journald[27960]: Deleted empty journal /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system@000520b3cc03b5ba-7946652d0a72418a.journal~ (8912896 bytes).
[2457512.865204] systemd[1]: Started Journal Service.
[2488063.167954] systemd[1]: systemd-journald.service stop-sigabrt timed out. Terminating.
[2488063.169272] systemd[1]: Starting Journal Service...
[2488070.799324] INFO: task bitcoind:18765 blocked for more than 120 seconds.
[2488070.799332] Not tainted 3.18.20 #1-NixOS
[2488070.799335] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2488070.799338] bitcoind D ffff88042fd12c40 0 18765 18674 0x00000000
[2488070.799344] ffff88035b217b48 0000000000000086 ffff88041953f700 0000000000012c40
[2488070.799349] ffff88035b217fd8 0000000000012c40 ffff88041953d500 ffff88041953f700
[2488070.799353] ffff88035b217b58 ffff880418cf4900 ffff8800b31531f0 ffff8800b31531f0
[2488070.799357] Call Trace:
[2488070.799367] [<ffffffff814b2459>] schedule+0x29/0x70
[2488070.799412] [<ffffffffa005c9df>] wait_current_trans.isra.19+0x9f/0x100 [btrfs]
[2488070.799418] [<ffffffff8109f990>] ? wait_woken+0x60/0x60
[2488070.799439] [<ffffffffa005e171>] start_transaction+0x361/0x5a0 [btrfs]
[2488070.799459] [<ffffffffa005e3cb>] btrfs_start_transaction+0x1b/0x20 [btrfs]
[2488070.799479] [<ffffffffa006c32b>] btrfs_create+0x3b/0x220 [btrfs]
[2488070.799485] [<ffffffff811b9fd5>] vfs_create+0xd5/0x140
[2488070.799489] [<ffffffff811be95f>] do_last+0x114f/0x1290
[2488070.799494] [<ffffffff811bb00e>] ? link_path_walk+0x23e/0xe90
[2488070.799498] [<ffffffff811beb58>] path_openat+0xb8/0x660
[2488070.799503] [<ffffffff812b7194>] ? call_rwsem_down_read_failed+0x14/0x30
[2488070.799508] [<ffffffff811c031a>] do_filp_open+0x3a/0xb0
[2488070.799512] [<ffffffff811ccd2e>] ? __alloc_fd+0x7e/0x120
[2488070.799517] [<ffffffff811ad9ec>] do_sys_open+0x12c/0x220
[2488070.799521] [<ffffffff811adafe>] SyS_open+0x1e/0x20
[2488070.799525] [<ffffffff814b6689>] system_call_fastpath+0x12/0x17
[2488070.799536] INFO: task dhcpcd:20319 blocked for more than 120 seconds.
[2488070.799539] Not tainted 3.18.20 #1-NixOS
[2488070.799541] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2488070.799543] dhcpcd D ffff88042fc12c40 0 20319 1 0x00000000
[2488070.799547] ffff880004823af8 0000000000000082 ffff88007466e600 0000000000012c40
[2488070.799551] ffff880004823fd8 0000000000012c40 ffffffff8181b500 ffff88007466e600
[2488070.799555] ffff880004823b08 ffff880418cf4900 ffff880417f859f0 ffff880417f859f0
[2488070.799558] Call Trace:
[2488070.799563] [<ffffffff814b2459>] schedule+0x29/0x70
[2488070.799583] [<ffffffffa005c9df>] wait_current_trans.isra.19+0x9f/0x100 [btrfs]
[2488070.799587] [<ffffffff8109f990>] ? wait_woken+0x60/0x60
[2488070.799606] [<ffffffffa005e171>] start_transaction+0x361/0x5a0 [btrfs]
[2488070.799625] [<ffffffffa005e3cb>] btrfs_start_transaction+0x1b/0x20 [btrfs]
[2488070.799645] [<ffffffffa0068faa>] btrfs_setattr+0x1ba/0x2d0 [btrfs]
[2488070.799650] [<ffffffff811cbb61>] notify_change+0x261/0x370
[2488070.799654] [<ffffffff811ac476>] do_truncate+0x66/0x90
[2488070.799658] [<ffffffff811abde0>] ? do_dentry_open+0x220/0x320
[2488070.799663] [<ffffffff811be5eb>] do_last+0xddb/0x1290
[2488070.799667] [<ffffffff811bae3d>] ? link_path_walk+0x6d/0xe90
[2488070.799672] [<ffffffff811beb58>] path_openat+0xb8/0x660
[2488070.799676] [<ffffffff811ca52c>] ? destroy_inode+0x3c/0x70
[2488070.799680] [<ffffffff811c031a>] do_filp_open+0x3a/0xb0
[2488070.799684] [<ffffffff811ccd2e>] ? __alloc_fd+0x7e/0x120
[2488070.799688] [<ffffffff811ad9ec>] do_sys_open+0x12c/0x220
[2488070.799692] [<ffffffff811adafe>] SyS_open+0x1e/0x20
[2488070.799696] [<ffffffff814b6689>] system_call_fastpath+0x12/0x17
[2488070.799706] INFO: task systemd-journal:27960 blocked for more than 120 seconds.
[2488070.799709] Not tainted 3.18.20 #1-NixOS
[2488070.799710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2488070.799712] systemd-journal D ffff88042fc92c40 0 27960 1 0x00000004
[2488070.799716] ffff88000b90fa78 0000000000000082 ffff8803d6225d80 0000000000012c40
[2488070.799719] ffff88000b90ffd8 0000000000012c40 ffff88041953cc80 ffff8803d6225d80
[2488070.799723] ffff88000b90fa88 ffff880418cf4900 ffff880417f859f0 ffff880417f859f0
[2488070.799726] Call Trace:
[2488070.799730] [<ffffffff814b2459>] schedule+0x29/0x70
[2488070.799750] [<ffffffffa005c9df>] wait_current_trans.isra.19+0x9f/0x100 [btrfs]
[2488070.799754] [<ffffffff8109f990>] ? wait_woken+0x60/0x60
[2488070.799773] [<ffffffffa005e0f8>] start_transaction+0x2e8/0x5a0 [btrfs]
[2488070.799793] [<ffffffffa005e407>] btrfs_join_transaction+0x17/0x20 [btrfs]
[2488070.799812] [<ffffffffa0064063>] btrfs_dirty_inode+0x43/0xf0 [btrfs]
[2488070.799828] [<ffffffffa003d314>] ? block_rsv_add_bytes+0x24/0x60 [btrfs]
[2488070.799847] [<ffffffffa0064191>] btrfs_update_time+0x81/0xc0 [btrfs]
[2488070.799851] [<ffffffff811ca053>] update_time+0x23/0xc0
[2488070.799856] [<ffffffff810bc796>] ? current_fs_time+0x16/0x60
[2488070.799860] [<ffffffff811ca2b3>] file_update_time+0x83/0xd0
[2488070.799881] [<ffffffffa006d37c>] btrfs_page_mkwrite+0x9c/0x370 [btrfs]
[2488070.799886] [<ffffffff8116c6bf>] do_page_mkwrite+0x3f/0x90
[2488070.799891] [<ffffffff8116e725>] do_wp_page+0x675/0x8d0
[2488070.799896] [<ffffffff81170afd>] handle_mm_fault+0x70d/0xec0
[2488070.799900] [<ffffffff8116cafc>] ? tlb_finish_mmu+0x1c/0x50
[2488070.799905] [<ffffffff81058a08>] __do_page_fault+0x1c8/0x570
[2488070.799909] [<ffffffff811c63ea>] ? dput+0x7a/0x1b0
[2488070.799913] [<ffffffff811746c9>] ? vma_rb_erase+0x129/0x250
[2488070.799917] [<ffffffff81176627>] ? do_munmap+0x267/0x440
[2488070.799921] [<ffffffff81058dd2>] do_page_fault+0x22/0x30
[2488070.799925] [<ffffffff814b8538>] page_fault+0x28/0x30
[2488070.799932] INFO: task kworker/u8:7:2294 blocked for more than 120 seconds.
[2488070.799934] Not tainted 3.18.20 #1-NixOS
[2488070.799936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2488070.799938] kworker/u8:7 D ffff88042fc92c40 0 2294 2 0x00000000
[2488070.799945] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[2488070.799948] ffff880106053678 0000000000000046 ffff8803b5712200 0000000000012c40
[2488070.799951] ffff880106053fd8 0000000000012c40 ffff88041953cc80 ffff8803b5712200
[2488070.799954] ffff880106053688 ffff880418cf4900 ffff880417f859f0 ffff880417f859f0
[2488070.799958] Call Trace:
[2488070.799963] [<ffffffff814b2459>] schedule+0x29/0x70
[2488070.799982] [<ffffffffa005c9df>] wait_current_trans.isra.19+0x9f/0x100 [btrfs]
[2488070.799986] [<ffffffff8109f990>] ? wait_woken+0x60/0x60
[2488070.800005] [<ffffffffa005e0f8>] start_transaction+0x2e8/0x5a0 [btrfs]
[2488070.800025] [<ffffffffa005e407>] btrfs_join_transaction+0x17/0x20 [btrfs]
[2488070.800044] [<ffffffffa0064c5d>] run_delalloc_nocow+0xbd/0xb10 [btrfs]
[2488070.800050] [<ffffffff8114d700>] ? account_page_dirtied+0xb0/0x110
[2488070.800071] [<ffffffffa007b6ac>] ? lock_extent_bits+0x7c/0x240 [btrfs]
[2488070.800093] [<ffffffffa007c063>] ? test_range_bit+0x33/0x170 [btrfs]
[2488070.800113] [<ffffffffa006571b>] run_delalloc_range+0x6b/0x3c0 [btrfs]
[2488070.800134] [<ffffffffa007c371>] ? find_lock_delalloc_range.constprop.46+0x1d1/0x200 [btrfs]
[2488070.800155] [<ffffffffa007c48c>] writepage_delalloc.isra.31+0xec/0x160 [btrfs]
[2488070.800176] [<ffffffffa007d4ff>] __extent_writepage+0xbf/0x2f0 [btrfs]
[2488070.800197] [<ffffffffa007da5a>] extent_write_cache_pages.isra.27.constprop.45+0x32a/0x3e0 [btrfs]
[2488070.800219] [<ffffffffa007f45d>] extent_writepages+0x4d/0x70 [btrfs]
[2488070.800239] [<ffffffffa00626a0>] ? btrfs_submit_direct+0x7b0/0x7b0 [btrfs]
[2488070.800258] [<ffffffffa005f748>] btrfs_writepages+0x28/0x30 [btrfs]
[2488070.800263] [<ffffffff8114ef2e>] do_writepages+0x1e/0x30
[2488070.800277] [<ffffffff811d8859>] __writeback_single_inode+0x49/0x240
[2488070.800281] [<ffffffff811d8f27>] writeback_sb_inodes+0x1a7/0x3e0
[2488070.800285] [<ffffffff811d91ff>] __writeback_inodes_wb+0x9f/0xd0
[2488070.800289] [<ffffffff811d9463>] wb_writeback+0x233/0x2f0
[2488070.800293] [<ffffffff811c8fc1>] ? get_nr_inodes+0x51/0x80
[2488070.800297] [<ffffffff811db78a>] bdi_writeback_workfn+0x2ba/0x450
[2488070.800302] [<ffffffff8107e6fc>] process_one_work+0x15c/0x410
[2488070.800306] [<ffffffff8107f093>] worker_thread+0x53/0x470
[2488070.800310] [<ffffffff8107f040>] ? rescuer_thread+0x270/0x270
[2488070.800315] [<ffffffff81083bd9>] kthread+0xc9/0xe0
[2488070.800320] [<ffffffff81010000>] ? xen_smp_intr_init+0x10/0x220
[2488070.800325] [<ffffffff81083b10>] ? kthread_create_on_node+0x180/0x180
[2488070.800328] [<ffffffff814b65d8>] ret_from_fork+0x58/0x90
[2488070.800333] [<ffffffff81083b10>] ? kthread_create_on_node+0x180/0x180
[2488147.297463] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[2488147.298048] systemd[1]: Unit systemd-journald.service entered failed state.
[2488147.298090] systemd[1]: systemd-journald.service failed.
[2488147.299078] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[2488147.299239] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[2488147.301087] systemd[1]: Stopping Journal Service...
[2488147.301148] systemd[1]: Starting Journal Service...
[2488151.783902] systemd-journald[2687]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2488157.710035] systemd[1]: Started Journal Service.
[2488238.953352] systemd[1]: systemd-journald.service: main process exited, code=killed, status=6/ABRT
[2488238.953883] systemd[1]: Unit systemd-journald.service entered failed state.
[2488238.953932] systemd[1]: systemd-journald.service failed.
[2488238.954879] systemd[1]: systemd-journald.service has no holdoff time, scheduling restart.
[2488238.955024] systemd[1]: Stopping Trigger Flushing of Journal to Persistent Storage...
[2488238.955063] systemd[1]: Stopping Journal Service...
[2488238.955109] systemd[1]: Starting Journal Service...
[2488241.180586] systemd-journald[3792]: File /var/log/journal/bbfa3d10a4b7f3931b8b3e3300000008/system.journal corrupted or uncleanly shut down, renaming and replacing.
[2488241.767889] systemd[1]: Started Journal Service.
[2503825.588195] usb 1-2: USB disconnect, device number 6
^ permalink raw reply [flat|nested] 2+ messages in thread* Re: lots of snapshots, forcing defragmentation, and figuring out which files to defrag or nodatacow 2015-09-29 5:51 lots of snapshots, forcing defragmentation, and figuring out which files to defrag or nodatacow James Cook @ 2015-09-30 6:33 ` Duncan 0 siblings, 0 replies; 2+ messages in thread From: Duncan @ 2015-09-30 6:33 UTC (permalink / raw) To: linux-btrfs James Cook posted on Mon, 28 Sep 2015 22:51:05 -0700 as excerpted: > The context of these three questions is that I'm experiencing occasional > hangs for several seconds while btrfs-transacti works, and very long > boot times. General comments welcome. System info at bottom, > end part of dmesg.log attached. > > Q1: > > I keep a lot of read-only snapshots (> 300 total) of all of my > subvolumes and haven't deleted any so far. Is this in itself a problem > or unanticipated use of btrfs? Very large numbers of snapshots do slow things down, but ~300 isn't what I'd call "very large" -- we're talking thousands to tens of thousands. My general recommendation is to keep it to ~250ish (under 300) per snapshotted subvolume, preferably under 2000 (and if possible 1000) total, easy enough to do even with automated frequent snapshotting (on the order of an hour apart, initially), as long as an equally automated snapshot thinning program is also established. At ~250 per subvolume, 1000 is 4 subvolumes worth, 2000 8 subvolumes worth. A bit over 300, assuming they're all of the same subvolume, is getting a bit high, but it shouldn't be causing a lot of trouble yet. It's just time to start thinking about a thinning program. There's one exception, quotas. Quotas continue to be an issue on btrfs; they're on their third rewrite now and while they believe it will work this time, there's still some serious bugs that will take a couple more kernels to work out. And besides not working right, they dramatically increase scalability issues. So my recommendation, unless you're directly working with the devs to test, report problems with, and bug- trace various quota issues, just don't run them on btrfs at this time. If you need quotas, use a filesystem where they're mature and work. If you don't, use btrfs without them. Really. I've seen at least two confirmed cases posted where people running quotas turned them off and their scaling issues disappeared. So if you have them on, that could well be your problem, right there. > Q2: > > I have some files that remain heavily fragmented (according to filefrag) > even after I run btrfs fi defragment. I think this happens because btrfs > doesn't want to unlink parts of the files from their snapshotted copies. > Can I tell btrfs to defragment them anyway, and not worry about wasting > space? And can I make the autodefrag mount option do this? > > For example (not all output shown): > > # filefrag * > ... > system@1973a03e3af1449ba5dd93362953fd5f-0000000000000001-00051f9377f11af6.journal: > 553 extents found ... > > # btrfs fi defragment -rf . > > # filefrag * > ... > system@1973a03e3af1449ba5dd93362953fd5f-0000000000000001-00051f9377f11af6.journal: > 331 extents found ... Several points to note, here: 1) Filefrag doesn't understand btrfs compression. If you don't use btrfs compression, this doesn't apply, but for btrfs compressed files, filefrag reports huge numbers of extents -- generally one per btrfs compression block (128 KiB), so 8 per MiB, 8192 per GiB of (before compression, not like btrfs give you a way to see post- compression file size anyway) file size. But unless you run compress-force you won't see it everywhere, because btrfs only compresses some files. 2) Btrfs defrag isn't snapshot aware, and will only defrag the files it's directly pointed at, using more space as it breaks the reflinks to the snapshotted copy. Around 3.9 snapshot-aware defrag was introduced, but it turned out to have *severe* scalability issues, so that was rolled back and snapshot-aware defrag was turned off again in, IIRC, 3.12 (thus well before what you're running). So worrying about breaking snapshot reflinks while defragging isn't going to be your problem, that, per se, is simply not an issue. 3) What /can/ be an issue is dealt with using defrag's -t parameter. I don't remember what the default target extent size is, but it's somewhat smaller than you might expect, well under a gig. Extent sizes larger than this are considered to be already defragged and aren't touched. (While this does touch on #2 above as well, not unnecessarily breaking reflinks to extents shared with other snapshots, the mechanism is one of extent size, not whether the extent is shared with another snapshot. So even if it's a new file not yet snapshotted, extents over this size won't be touched.) It's worth keeping in mind that btrfs' nominal data chunk size is 1 GiB. As such, that's the nominal largest extent size as well, altho in some cases (data chunks created on nearly empty TiB-scale filesystems) data chunk size can be larger, multiple GiB, in which case extent size can be larger as well. Regardless, extent sizes > 1 GiB really aren't going to be a performance issue anyway, so while using the -t 1G or -t 2G option is a good idea and should reduce fragmentation further for extents between the default size and your -t size, going above that isn't likely to do any good anyway. Which means for multi-gig files ideal minimum fragmentation will be a number of extents equal to their size in GiB, perhaps plus 2, assuming the first extent is in a partially used data chunk and thus under a gig, with the last being under a gig as well. You are extremely unlikely to get better than that, so for file sizes over a GiB, you /will/ see multiple extents. (Again, keeping in mind the exception where a data chunk is multi-GiB sized itself.) So using defrag's -t 1G is likely to get you somewhat better results, but don't expect multi-gig files to ever be a single extent. 4) Recently (3.17?), btrfs got a new feature that allowed it to automatically delete chunk-allocations with zero usage. This solved a serious problem, as before this, btrfs could allocate new chunks but couldn't deallocate them, and over time and many normal file-creation/ deletion cycles, most people ended up with a huge number of empty data chunks, often filling their filesystem until no new metadata chunks could be allocated, resulting in ENOSPC errors despite df reporting tens or occasionally even hundreds of GiB free. (Also possible, but less common, was the reverse case, all otherwise free space allocated to empty metadata chunks, with no space left to allocate new data chunks.) A btrfs balance start -dusage=0 -musage=0 could quickly get rid of them, but before this feature, it had to be done manually. Unfortunately, it appears defrag's previously great strategy of trying to use the space in existing data chunks before allocating new ones hasn't yet been adjusted for this, and it will first try to fill up holes in existing data chunks before allocating new ones. With btrfs automatically deleting empty chunks now, that means defrag will be working with partially full chunks with free space, but where unfortunately that free space is itself fragmented. So for a large multi- gig file, instead of allocating new GiB sized chunks and defragging to whole-chunk GiB sized extents, it's likely to be trying to work with much smaller free-space extents in existing chunks. Obviously this is going to rather negatively affect the results of a defrag and post-defrag filefrag results will still be higher numbers of extent counts than desired. =:^( This is a known bug, a result of the fairly recent automated deleting empty data chunks and defrag's reluctance to allocate new ones, that will no doubt be fixed in time, but meanwhile, we deal with the code we have. Here's an *entirely* *untested* idea for a workaround that I just came up with as I was typing the explanation above. If you'd try it and report back whether it works, we might well have at least some way to work around the issue. Before your defrag, do a btrfs fi df (or usage), and see what the numbers for data are. (If you haven't run a balance recently and you see a big spread tens of GiB, between data size and data used, try something like btrfs balance start -dusage=20, which will only try to balance data chunks under 20% usage. This will go much faster than balancing the whole filesystem, and often gets you quite a bit of space returned to unallocated from partially empty data chunks. If it doesn't have the desired effect, try -dusage=50, or even higher, but be aware that above 50, you're dealing with mostly full chunks already, which will take far longer to balance, with much less return for the time taken. If you do such a balance, check again the btrfs fi df results and see how you did.) Then do a truncate -s 100G some-not-yet-existing-file (adjusting the 100G as appropriate for your level of unallocated space, see either btrfs fi show or btrfs fi usage), and a sync, to ensure it's allocated in the filesystem. Do another fi df and you should see the data size has increased accordingly. Now delete that file, sync again, and do a fi df to double-check that you now have a bunch of empty data space (a spread between data size and used of near the file's size. Since I think btrfs doesn't delete those empty data chunks immediately, this /should/ give defrag a bunch of empty data chunks to use, and post- defrag filefrag results should end up much better (accounting for the first three points above, of course)! =:^) You might have to play with the idea a bit, perhaps using fallocate instead of truncate, etc, but as long as you get something to allocate a bunch of otherwise empty data chunks, delete it to empty them, and do the defrag before btrfs has deleted those empty chunks, I think it should work. =:^) 5) Judging by the filename of your example file, you run systemd, and it's a systemd journal file. These files (along with database and VM image files) have a non-append-only rewrite pattern that is known to be problematic on COW-based filesystems such as btrfs, triggering very high levels of fragmentation. What version of systemd are you running? 221 and later (219 first tried but there were bugs in the initial 219 implementation that remained in 220, that weren't worked out until 221) are more btrfs-friendly than previous versions. For reasons I'll explain in a moment, while I run btrfs and systemd (226 ATM), I don't know how well systemd's efforts at btrfs-friendliness turned out, at least not as pertains to the journal, but AFAIK, they consisted of two parts, which together should definitely help with the problem. However, the one part won't trigger on older systems updated to the newer btrfs-friendly systemd, without some manual intervention. I'll explain... First of all, it can be noted that if you're running an older systemd, or if you're on a newer systemd now but it was updated from an older one, you can take these steps manually, as well. I'll outline them that way, tho newer systemd would take these steps automatically if installed fresh, not updated from an older installation. a) With systemd running in emergency/rescue mode or at least with journaling shut down so it won't interfere... b) Move/rename the existing journal directories (see the notes under (c), you might wish to do this with all of /var, or only with /var/log or /var/log/journal, the latter being what I'd do) to something else. c) Create a new btrfs /subvolume/ in place of the old /directory/ that you just moved, so it can be named the same. The reason for this is that btrf snapshots stop at subvolume boundaries. What we're doing here is taking the journal dir out of future / (or /var) snapshots, by making it a subvolume of its own. It should be noted that systemd's auto-create mechanism is the systemd- tmpfiles-setup service, as configured in the various tmpfiles.d locations, namely /usr/lib/tmpfiles.d/*, specifically the var.conf file, here. If you look at that file, you'll see that systemd (226 at least) actually creates a directory (d) for /var/log, with the subvolume creation (v) as /var itself. Of course, only if they don't already exist (and v/subvolume-creation gracefully degrades to the d/dir-creation behavior it was before 219, on non-btrfs). If I were doing it manually, however, I'd create the subvolume as /var/log/journal, keeping /var either as its own subvolume (as systemd now creates it) or as a directory on /, depending on whether I wanted the rest of /var to be snapshotted separately from / or not. d) Set /var/log/journal (creating it if you just created /var as its own subvolume) nocow. Systemd/tmpfiles.d has this (h line, setting +C) in /usr/lib/tmpfile.d/ journal-nocow.conf. To do it manually, use chattr +C. Do the same for subdirs (the remote subdir and individual machines subdirs), if applicable. What nocow does is turn off copy-on-write for the file(s) in question, making it/them rewrite-in-place instead of cow. As a (btrfs) side effect, it turns off compression (if otherwise on) and checksumming for the file as well. On a directory, nocow does nothing, /except/ that newly created files (and subdirs) in that directory now inherit the nocow attribute. The idea is that our journal files should be nocow, and with the +C set on the containing subdir, /newly/ /created/ journals should now be just that. e) Now copy your old journals back in from the renamed backup, but there's a caveat... On btrfs, for existing files that have nocow set after they already have content (are not zero size), when they actually start behaving as nocow isn't defined. Thus, the files have to be created new in the target directory, in ordered to ensure that the (inherited) nocow takes effect immediately. The easiest way to ensure this is to copy them in from a different filesystem (not subvolume). Assuming your memory is sufficiently large, the easiest way to do /that/ is to copy the files to tmpfs, then copy/ move them to their new location under the nocow dir. Because it's a cross-filesystem copy/move, that will ensure that they are actually created in the new filesystem, not just reflinked or some such. (AFAIK, current cp without the reflink option does the right thing anyway, but there has been quite some discussion about making reflink copies the default, in which case you'd have to set --reflink=no to turn it off. Mv will always take a shortcut and simply create a reflink at the new location, deleting the old one, if it can, which it normally can if both the source and destination are on the same filesystem, so it'd definitely be a problem from the same filesystem. So there's ways to do it without using a separate filesystem as an intermediate, but the easiest way to be sure it's right is to just use the separate filesystem as an intermediate, and not have to worry about whether it's actually a newly created file at the destination, because you /know/ it is due to the cross-filesystem copy/move.) OK, so why go to all this trouble? First of all, nocow means the file is (normally) updated in-place, so fragmentation isn't an issue -- as long as that remains true. The problem is that snapshots depend on cow because they lock in place the existing version. With the existing version locked in place, obviously new writes must be cowed elsewhere, killing the intent of nocow. What btrfs actually does with nocow files when a write comes in after a snapshot, is use what some call cow1. The first write to a (4k) block cows it elsewhere, as it normally would, but the file's nocow attribute remains, and further writes to the now new block write in-place to it... until the next snapshot locks it /too/ in place. So frequent snapshots pretty well disable nocow and trigger fragmentation just as if the file wasn't nocow... depending of course on how relatively frequent and widely spread out those writes are into a file, vs the frequency of snapshots locking the existing version in place. So, what systemd did is ensure that newly created journal files are nocow, by setting the attribute on the existing dirs containing them so they'll inherit the attribute at creation. That's half the solution. Unfortunately, systemd didn't get the other half of the solution, the subvolume side, quite right. They created a subvolume at the /var level. Which means /var will not be snapshotted with /, OK, but what if you want to snapshot other files in /var? Now you'll setup a snapshotting schedule for the /var subvol, but because /var/log/journal is a directory not a subvolume of its own, it'll be included in the /var snapshot, triggering the very same cow1 problems on the nocow journals (tho at a different frequency if /var is snapshotted at a different frequency) as they'd have if they were still part of the main / subvolume! Which is why I recommend setting up /var/log/journal as its own subvolume, to exclude it from the snapshots above (either /var if it's its own subvol or / if not) and not snapshotting the journal subvolume at all, thus avoiding snapshot-triggered cow1 fragmentation with the journal files. Of course not snapshotting the journals is its own tradeoff. If you want to snapshot them, you can do so, but try to do it at as low a frequency as possible, and consider running defrag on the files. Depending on your frequency of snapshotting and how frequently your journal files are cycled based on your journal configuration, if they're actually rotated out within a week or so and you're snapshotting also either every week or perhaps every couple days, then fragmentation shouldn't get bad enough due to the cow1 to be worth worrying about defrag. 6) Meanwhile, one final point to note, as hinted above, my own journal configuration, which bypasses all this. Basically, I did this: a) When I converted to systemd, I kept my old syslog (syslog-ng), only now built with the build-time-optional systemd/journal integration. (FWIW I'm on gentoo, so toggling the systemd integration was as simple as setting the systemd USE flag and rebuilding syslog-ng. However, binary- based distros based on systemd will probably already have such integration enabled in sysloggers they still ship.) b) Because systemd's journal has some very nice current-session features (and because it's difficult/impossible to entirely disable in any case, but the features meant I didn't want to), I configured it to handle current session stuff only, basically by setting things up in journald.conf so it only used the /run/log/journal location, which being on tmpfs by default, is memory-only and thus blown away on reboot or when /run is unmounted. This way I was able to keep the systemctl status last-ten-messages from that service feature, since the journal was still operational, but only to tmpfs. =:^) c) I then configured my normal syslogger (syslog-ng in my case) to pull from systemd/journald, and to continue to log to the normally configured files in /var/log, as it normally did. And logrotate and various scripts continue to rotate those logs as they normally would. ** Critically, because the normal syslogger writes in append-only mode, its files aren't subject to the fragmentation that rewrite-all-over-the- file files get. Thus, journals are subject to the problem, but the only journals I have are in tmpfs which is memory-only so access is fast no matter what it does, while the normal logs in (the btrfs) /var/log are append-only and thus not subject to the problem. So for current-session investigations, I can use the journal with its best features, or the normal syslog files. If I'm investigating something that happened in an earlier session/boot, I don't have the journal to use (but also don't have the problems that writing journals to btrfs brings), but the normal syslog files are still there and usable, just as they always were. IOW, I've lost none of the old log style features, while still being able to use the new journal features in the current session, where they're of most use anyway. (As an additional benefit, despite scouring the journald documentation, I found absolutely no way to filter out "noise" entries so they didn't hit the journal at all -- it's possible to filter them on the journal-read side, but not before they're written to journal. With syslog-ng and presumably with other syslogs as well, filtering out such "noise" entries before they're written to the logs at all, is simple, once the config file format is understood, at least. For noise that happens several times a minute, that's a lot of permanent filesystem writes that I'm avoiding! Of course they're still stored in the tmpfs journals, using memory, but luckily the journal's binary format does compression, so 10K identical noise entries don't actually take up that much more room than just one does.) d) I did find that with journald journaling a whole session to the /run tmpfs, instead of switching to /var/log/journal once it was available, I did have to increase the size of the /run tmpfs some. However, before systemd, my old init system (the sysvinit-based openrc) only used /run for a few PID files, etc, so I had it clamped down rather tight, to something like 20 MiB max. Now it's 512 MiB, half a GiB, reasonable on my 16 GiB memory system. I then had to tweak the journald.conf settings a bit, so it'd use most of that half-GiB for itself (IIRC it uses only 50% by default, for safety, then stops journaling), while still leaving a bit of room for the standard PID files and the like. FWIW, these are my non-default journald.conf settings; others remain at default: ForwardToSyslog=yes RuntimeKeepFree=48M RuntimeMaxFileSize=64M RuntimeMaxUse=448M Storage=volatile TTYPath=/dev/tty12 Storage=volatile is the critical one for confining journals to tmpfs. ForwardToSyslog=yes simply tells journald that I have a syslogger too, and I want it to get the logs. The Runtime* settings determine systemd's usage on that tmpfs, and TTYPath is more or less unrelated. (Gentoo defaults to printing the messages log to tty12 when it's syslog, and here I just decided to let journald do that, giving me an easy way to check on the "noise" that syslog-ng is filtering if I want to, instead of having syslog-ng print to that tty. Obviously there's a corresponding non- default setting in syslog-ng's config turning off printing to the tty, there.) So there you have it, why I don't have to worry about journal behavior on btrfs at all, should you want to do similar. Otherwise, simply follow the recommendations in point #5 (as well as the others), and your problems should go down dramatically, tho obviously I prefer the method I described here in point #6, thus not having to deal with the problem, for journal files at least, at all. > Q3: > > What's the best way to tell which files are causing the hangs? My > current method is to make an educated guess (e.g. think of programs that > store large database files) then use filefrag to see if there's > fragmentation. I'm not confident I've found all the sources of bad > performance. That one I don't really have a good answer for... *except* to remind you that systemd, which you do appear to be using, has reasonably good boottime taken reports available. If the problem's primarily at boot time or it at least exists there as well, as seems to be the case, you should be able to use those to at least get some idea of what services are taking the time, so you can focus on them. One more thing for completeness, tho based on the below it's unlikely to apply to you. People report that when their filesystem has many devices (more than four, so not applying to your two), the mount will sometimes take so long it times out, tho they can mount the filesystem fine at the command prompt. One workaround here is to simply lengthen the timeout for that mount service. IIRC some people had ideas for shortening that mount time, but I don't remember what they were. ... And few more (short) comments below the system info... > System info: > > [root@angel-nixos:~]# uname -a Linux angel-nixos 3.18.20 #1-NixOS SMP > Thu Jan 1 00:00:01 UTC 1970 x86_64 GNU/Linux > > [root@angel-nixos:~]# btrfs --version btrfs-progs v4.2 > > [root@angel-nixos:~]# btrfs fi show Label: 'AngelBtrfs' uuid: > 7f4b4b5d-1ba5-46cc-b782-938e3600a427 > Total devices 2 FS bytes used 1.06TiB > devid 5 size 2.00TiB used 1.07TiB path /dev/mapper/[snip] > devid 6 size 2.00TiB used 1.07TiB path /dev/mapper/[snip] > > btrfs-progs v4.2 > > [root@angel-nixos:~]# btrfs fi df / > Data, RAID1: total=1.01TiB, used=1.01TiB > System, RAID1: total=32.00MiB, used=192.00KiB > Metadata, RAID1: total=60.00GiB, used=58.65GiB > GlobalReserve, single: total=512.00MiB, used=0.00B 3.18 series kernel, second to last LTS series, so you're good there. 4.2 userspace, current AFAIK. Btrfs is raid1 with two devices, and seems a bit more than half full (2 TiB devices, just over 1 TiB data plus 60 GiB metadata on each). The fi df says data is basically full, with metadata close enough as well, but as I implied, nearly a full TiB unallocated, so it appears the empty- chunk removal is functioning fine, and the filesystem is healthy in terms of space available. All around, pretty healthy! =:^) The only thing I'd suggest in general is to setup a schedule of snapshot thinning, before it becomes a problem. But at ~300 snapshots, it shouldn't be anything like a problem yet. And as I said above, disable btrfs quotas if they're turned on (unless you're specifically helping to fix them), as right now they're simply more trouble than they're worth, and we have posts demonstrating that very point. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2015-09-30 6:33 UTC | newest] Thread overview: 2+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-29 5:51 lots of snapshots, forcing defragmentation, and figuring out which files to defrag or nodatacow James Cook 2015-09-30 6:33 ` Duncan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).