Date: Mon, 14 Mar 2016 23:03:52 GMT
Message-Id: <201603142303.u2EN3qo3011695@phoenix.vfire>
From: pete@petezilla.co.uk
To: linux-btrfs@vger.kernel.org
Subject: Re: Snapshots slowing system
In-Reply-To: <pan$b315b$51883804$dab51362$72285105@cox.net>

>pete posted on Sat, 12 Mar 2016 13:01:17 +0000 as excerpted:

>> I hope this message stays within the thread on the list. I had email
>> problems and ended up hacking around with sendmail & grabbing the
>> message id off of the web based group archives.

>Looks like it should have as the reply-to looks right, but at least on
>gmane's news/nntp archive of the list (which is how I read and reply), it
>didn't. But the thread was found easily enough.

Found out what had happened. I think I had a quota-full issue at my
hosting provider; I suspect bounce messages caused majordomo to
unsubscribe me, the very week I asked a question. Thanks for the huge
response, and thanks also to Boris.

>>>>I wondered whether you had eliminated fragmentation, or any other known
>>>>gotchas, as a cause?
>>
>> Subvolumes are mounted with the following options:
>> autodefrag,relatime,compress=lzo,subvol=
>
>That relatime (which is the default), could be an issue. See below.

I've now changed that to noatime. I think I read, or misread, that
relatime was a good compromise sometime in the past.

>> Not sure if there is much else to do about fragmentation apart from
>> running a balance which would probably make the machine v sluggish for
>> a day or so.
>>
>>>>Out of curiosity, what is/was the utilisation of the disk? Were the
>>>>snapshots read-only or read-write?
>>
>> root@phoenix:~# btrfs fi df /
>> Data, single: total=101.03GiB, used=97.91GiB
>> System, single: total=32.00MiB, used=16.00KiB
>> Metadata, single: total=8.00GiB, used=5.29GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> root@phoenix:~# btrfs fi df /home
>> Data, RAID1: total=1.99TiB, used=1.97TiB
>> System, RAID1: total=32.00MiB, used=352.00KiB
>> Metadata, RAID1: total=53.00GiB, used=50.22GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B

>Normally when posting, either btrfs fi df *and* btrfs fi show are
>needed, /or/ (with a new enough btrfs-progs) btrfs fi usage. And of
>course the kernel (4.0.4 in your case) and btrfs-progs (not posted, that
>I saw) versions.

OK, I have usage. For the SSD with the system:

root@phoenix:~# btrfs fi usage /
Overall:
    Device size:                 118.05GiB
    Device allocated:            110.06GiB
    Device unallocated:            7.99GiB
    Used:                        103.46GiB
    Free (estimated):             11.85GiB      (min: 11.85GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:102.03GiB, Used:98.16GiB
   /dev/sda3     102.03GiB

Metadata,single: Size:8.00GiB, Used:5.30GiB
   /dev/sda3       8.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda3      32.00MiB

Unallocated:
   /dev/sda3       7.99GiB

Hmm. A bit tight. I've just ordered a replacement SSD.
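
(In the meantime, a filtered balance might claw back some of the
allocated-but-not-fully-used space without the pain of a full balance.
Just a sketch, with / as the filesystem in question:

  btrfs balance start -dusage=50 /
  btrfs balance start -musage=50 /

The usage filter only rewrites chunks that are at most 50% used, so it
should finish far quicker than rebalancing the lot.)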
Slackware should fit in about 5GB+ of disk space, I've seen on a
website? Hmm. Don't believe that. I'd allow at least 10GB, and more if
I want to add extra packages such as libreoffice. With no snapshots it
seems to get to 45GB with various extra packages installed, and grows
to 100ish with snapshotting, probably owing to updates. Anyway, I took
the lazy, but less hair-tearing, route and ordered a 500GB drive.
Prices have dropped and fortunately a new drive is not a major issue.
Timing is also good with Slack 14.2 imminent. You rarely hear people
complaining about disk-too-empty problems...

For the traditional hard drives with the data:

root@phoenix:~# btrfs fi usage /home
Overall:
    Device size:                   5.46TiB
    Device allocated:              4.09TiB
    Device unallocated:            1.37TiB
    Used:                          4.04TiB
    Free (estimated):            720.58GiB      (min: 720.58GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID1: Size:1.99TiB, Used:1.97TiB
   /dev/sdb        1.99TiB
   /dev/sdc        1.99TiB

Metadata,RAID1: Size:53.00GiB, Used:49.65GiB
   /dev/sdb       53.00GiB
   /dev/sdc       53.00GiB

System,RAID1: Size:32.00MiB, Used:352.00KiB
   /dev/sdb       32.00MiB
   /dev/sdc       32.00MiB

Unallocated:
   /dev/sdb      699.49GiB
   /dev/sdc      699.49GiB

root@phoenix:~#

>> Hmm. The system disk is getting a little tight. cfdisk reports the
>> partition I use for btrfs containing root as 127GB approx. Not sure why
>> it grows so much. Suspect that software updates can't help as snapshots
>> will contain the legacy versions. On the other hand they can be useful.

>With the 127 GiB (I _guess_ it's GiB, 1024, not GB, 1000, multiplier,
>btrfs consistently uses the 1024 multiplier and properly specifies it
>using the XiB notation) for /, however, and the btrfs fi df sizes of 101
>GiB plus data and 8 GiB metadata (with system's 32 MiB a rounding error
>and global reserve actually taken from metadata, so it doesn't add to
>chunk reservation on its own) we can see that as you mention, it's
>starting to get tight, a bit under 110 GiB of 127 GiB, but that 17 GiB
>free isn't horrible, just slightly tight, as you said.

>Tho it'll obviously be tighter if that's 127 GB, 1000 multiplier...

Note that the system btrfs does not get 127GB, it gets /dev/sda3, not
far off, but I've a 209MB partition for /boot and a 1G partition for a
very cut-down system for maintenance purposes (both ext4). On the new
drive I'll keep the 'maintenance' ext4 install, but I could take /boot
from that filesystem using bind mounts, a bit cleaner.

>It's tight enough that particularly with the regular snapshotting, btrfs
>might be having to fragment more than it'd like. Tho kudos for the
>_excellent_ snapshot rotation. We regularly see folks in here with 100K
>or more snapshots per filesystem, and btrfs _does_ have scaling issues in
>that case. But your rotation seems to be keeping it well below the 1-3K
>snapshots per filesystem recommended max, so that's obviously NOT your
>problem, unless of course the snapshot deletion bugged out and they
>aren't being deleted as they should.

Yay, I've done it right at least somewhere... I was assuming that sort
of count was on server hardware, so I thought it best to keep it
tighter on my more modest desktop. They are deleting. The new ones are
also read-only now.

>(Of course, you can check that by listing them, and I would indeed
>double-check, as that _is_ the _usual_ problem we have with snapshots
>slowing things down, simply too many of them, hitting the known scaling
>issues btrfs had with over 10K snapshots per filesystem. But FWIW I
>don't use snapshots here and thus don't deal with snapshot
>command-level detail.)
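
(Counting them is easy enough with the top-level subvolume mounted
somewhere - something along these lines, assuming /mnt/top is wherever
the snapshot script mounts it:

  btrfs subvolume list -s /mnt/top | wc -l

The -s flag limits the listing to snapshot subvolumes, so the count
should match what the rotation is meant to leave behind.)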
Rarely use them except when I either delete the wrong file or do
something very sneaky but dumb like inadvertently setting umask for
root, installing a package and breaking _lots_ of file system
permissions. Easier to recover from a good snapshot than to try to fix
that mess...

>But as I mentioned above, that relatime mount option isn't your best
>choice, in the presence of heavy snapshotting. Unless you KNOW you need
>atimes for something or other, noatime is _strongly_ recommended with
>snapshotting, because relatime, while /relatively/ better than
>strictatime, still updates atimes once a day for files you're accessing
>at least that frequently.

Now noatime.

>And that interacts badly with snapshots, particularly where few of the
>files themselves have changed, because in that case, a large share of the
>changes from one snapshot to another are going to be those atime updates
>themselves. Ensuring that you're always using noatime avoids the atime
>updates entirely (well, unless the file itself changes and thus mtime
>changes as well), which should, in the normal most-files-unchanged
>snapshotting context, make for much smaller snapshot-exclusive sizes.

>And you mention below that the snapshots are read-write, but generally
>used as read-only. Does that include actually mounting them read-only?
>Because if not, and if they too are mounted the default relatime,
>accessing them is obviously going to be updating atimes the relatime-
>default once per day there as well... triggering further divergence of
>snapshots from the subvolumes they are snapshots of and from each other...

Actually they are normally not mounted. I only mount them, or rather
the default subvolume that contains them, on an as-needed basis. The
script that does the snapshotting mounts and then unmounts.

>> Is it likely the SSD? If likely I could get a larger one, now is a good
>> time with a new version of slackware imminent. However, no point in
>> spending money for the sake of it.

>Not directly btrfs related, but when you do buy a new ssd, now or later,
>keep in mind that a lot of authorities recommend that for ssds you buy
>10-33% larger than you plan on actually provisioning, and that you leave
>that extra space entirely unprovisioned -- either leave that extra space
>entirely unpartitioned, or partition it, but don't put filesystems or
>anything else (swap, etc) on it. This leaves those erase-blocks free to
>be used by the FTL for additional wear-leveling block-swap, thus helping
>maintain device speed as it ages, and with good wear-leveling firmware,
>should dramatically increase device usable lifetime, as well.

Well, I went OTT and ordered a 500GB one. So if I put, say, 20GB as my
'maintenance' partition, use the rest minus 100-150GB as btrfs and keep
the remainder unallocated, that should work well?
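
Roughly, I'm picturing something like this on the new drive (sizes
approximate, and /dev/sdX standing in for whatever it turns up as),
with /boot bind-mounted out of the maintenance install as above:

  parted /dev/sdX mklabel gpt
  parted /dev/sdX mkpart maintenance ext4 1MiB 20GB
  parted /dev/sdX mkpart root btrfs 20GB 380GB
  # the last ~120GB deliberately left unpartitioned for the FTL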
>FWIW, I ended up going rather overboard with that here, as I knew I

So have I. The price seems almost linear per gigabyte, perhaps? I
suspected it was better to go larger if I could and put off the day the
new disk fills up. Could put the old disk in the laptop for
experimentation with distros.

>>>>Apropos Nada: quick shout out to Qu to wish him luck for the 4.6 merge.
>>
>> I'm wondering if it is time for an update from 4.0.4?

>The going list recommendation is to choose either current kernel track or
>LTS kernel track. If you choose current kernel, the recommendation is to
>stick within 1-2 kernel cycles of newest current, which with 4.5 about to
>come out, means you would be on 4.3 at the oldest, and be looking at 4.4
>by now, again, on the current kernel track.

4.5 is out. Maybe I ought to await 4.5.1 or .2 for any initial bugs to
shake out.

>If you choose LTS kernels, until recently, the recommendation was again
>the latest two, but here LTS kernel cycles. That would be 4.4 as the
>newest LTS and 4.1 previous to that. However, 3.18, the LTS kernel
>previous to 4.1, has been holding up reasonably well, so while 4.1 would
>be preferred, 3.18 remains reasonably well supported as well.

Can't see the advantage to me of an LTS kernel. In the past I've gone
for the latest and then updated the kernel to the new latest kernel.
Distro maintainers might want LTS kernels, but I'm not going to go from,
say, 4.1.10 to 4.1.19 when I can go to 4.5. OK, googled for a bit:
upgrading within an LTS branch fixes bugs but reduces the chance of
breakage due to new functionality.

>You're on 4.0, which isn't an LTS kernel series and is thus, along with
>4.2, out of upstream's support window. So it's past time to look at
>updating. =:^) Given that you obviously do _not_ follow the last couple

Whilst everything worked fine and there were no security horrors, there
seemed no need to update.

Kind regards,

Pete