To: linux-btrfs@vger.kernel.org
From: Duncan <1i5t5.duncan@cox.net>
Subject: Re: rolling back to ext4 doesn't work if rebalanced
Date: Sat, 31 Jan 2015 06:41:48 +0000 (UTC)

Vytautas D posted on Fri, 30 Jan 2015 16:41:11 +0000 as excerpted:

> Is it possible to automate btrfs-convert so that it would defrag and
> balance the disk without loosing ability to roll back to ext4 partition
> ?
>
> i.e:
> btrfs-convert /dev/sda3
> # I can still rollback using btrfs-convert -r /dev/sda3
>
> btrfs filesystem defrag -r /
> # I can still rollback using btrfs-convert -r /dev/sda3
>
> btrfs balance start /
> # I can no longer rollback to ext4

Mostly no, tho in theory there's also a very narrow and limited yes.

(Semi-)technical background:

The ext* fs layout is quite a bit different than btrfs' native layout, metadata more so than data. What the conversion actually does is leave the ext* data in-place, along with its metadata, and in the free space that still existed on the ext* filesystem, create new btrfs metadata pointing at the still-in-place ext* file data.
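
As a rough illustration of that in-place idea, here is a toy sketch. It is not how btrfs-convert is implemented: the "disk" is just a Python list, and the names `convert`, `disk` and `ext_files` are made up for the example. All it shows is new metadata landing in free blocks and pointing at data that never moves.

```python
# Toy model only: a "disk" is a list of blocks, not a real ext4/btrfs layout.
FREE = None


def convert(disk, ext_files):
    """Write new 'btrfs' metadata into a free block; never touch used blocks.

    ext_files maps a file name to the block indices holding its data.
    Returns the index of the block that received the new metadata.
    """
    free = [i for i, block in enumerate(disk) if block is FREE]
    if not free:
        raise RuntimeError("conversion needs free space for the new metadata")
    # The new metadata simply points at the data blocks where they already are.
    pointers = {name: list(blocks) for name, blocks in ext_files.items()}
    disk[free[0]] = ("btrfs-meta", pointers)
    return free[0]


disk = [("ext-meta", "inode table"), ("data", "hello"), ("data", "world"), FREE, FREE]
ext_files = {"a.txt": [1], "b.txt": [2]}
before = list(disk)

meta_at = convert(disk, ext_files)

# Every block that was in use before the conversion is still exactly as it was.
unchanged = all(
    disk[i] == before[i] for i in range(len(disk)) if before[i] is not FREE
)
print(meta_at, unchanged)
```

Because the old blocks are untouched in this model, throwing away the new metadata block would give back the original layout, which is the intuition behind the rollback.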
Because btrfs is copy-on-write (COW), and the conversion process creates a special btrfs snapshot containing the ext* data and metadata so btrfs won't remove it if it updates files, you can write new data and changes to the existing data on the btrfs side, and btrfs will write that to a different location, leaving the ext* data and metadata in that special snapshot as they were.

It is for that reason that any changes made while the filesystem is btrfs won't be retained if you convert back to ext*. You get the ext* filesystem as it was at the time of the conversion to btrfs.

If you decide to stay with btrfs, the first thing to do is remove that special ext*_saved (IIRC that's the name, something like that) snapshot, then defrag and balance what remains, obliterating the ext* metadata and final-converting all the data to native btrfs format.

The key concept to understand here is that btrfs itself doesn't understand or deal with the ext* metadata at all. The (userspace) conversion process simply creates new btrfs metadata pointing at the existing ext* data, and takes advantage of btrfs' COW nature to create a snapshot that btrfs shouldn't touch, so as to keep the ext* metadata and data intact, and along with it, the ability to rollback to that ext* filesystem as it was at the time of the conversion.

That's the background. Now to answer your question.

What (btrfs) defrag and balance do are two different stages of btrfs-specific optimization. That btrfs-specific optimization simply cannot be done without disrupting the ext* layout, thus the overall no.

Actually, even the btrfs fi defrag would normally destroy the ability to rollback, but for one hopefully temporary issue. Ideally, btrfs defrag is snapshot aware, and defragging files in one snapshot will defrag them, to the extent that they are the same, in other snapshots as well.
However, the original btrfs snapshot-aware defrag was found not to scale at all well, requiring huge amounts of memory and taking days and sometimes weeks to get thru a defrag in the presence of large numbers of snapshots (with btrfs quotas being another complicating factor at the time). Thus, snapshot-aware-defrag was disabled (hopefully) temporarily, while they rewrote both the multi-snapshot handling code and the quota handling code with a view toward *MUCH* better scaling.

At least some of that work has been done and scaling is definitely better now, but apparently not good enough yet that they feel comfortable reenabling snapshot-aware-defrag.

Thus, at least for the time being, btrfs defrag is not snapshot aware, and only rewrites the snapshot-instance of the files it's actually pointed at, using COW so anything that's defragged is actually rewritten, leaving other snapshots of the same files in place, as fragmented as they were before.

This does the defrag for whatever it's pointed at, but obviously, if the snapshotted files were heavily fragmented, it's going to about double the space taken up, because the snapshotted versions will still exist, while the defrag writes new, defragmented copies, for whatever mounted snapshot you pointed the defrag at.

That's why you could btrfs defrag and still successfully roll-back to ext*, because the ext* is its own snapshot, and with snapshot-aware-defrag temporarily disabled, the defrag rewrote any files it defragged, creating a new copy of them, while leaving the existing ext* special snapshot alone.

When snapshot-aware-defrag is reenabled, that won't (normally) work any more, as the files in the snapshot would be btrfs defragged as well, thereby moving them out from under the ext* metadata pointing at the old copies.
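
To make the non-snapshot-aware behaviour above a bit more concrete, here is a small toy model. It is only a sketch under loud assumptions: `ToyFs`, its methods, and the subvolume names `root` and `saved` are invented for illustration and have nothing to do with real btrfs internals or on-disk structures. It only mimics the two effects described: the defragged file gets a fresh copy, and the snapshot keeps pointing at the old extents, so space use roughly doubles.

```python
# Toy model only: extents are numbered slots, snapshots are name -> extent maps.
class ToyFs:
    def __init__(self):
        self.extents = {}   # extent id -> data
        self.next_id = 0
        self.subvols = {}   # subvolume/snapshot name -> {file name: [extent ids]}

    def write(self, subvol, name, chunks):
        """Store each chunk as its own extent, i.e. a fragmented file."""
        ids = []
        for chunk in chunks:
            self.extents[self.next_id] = chunk
            ids.append(self.next_id)
            self.next_id += 1
        self.subvols.setdefault(subvol, {})[name] = ids

    def snapshot(self, src, dst):
        """A snapshot shares extents with its source; no data is copied."""
        self.subvols[dst] = {n: list(ids) for n, ids in self.subvols[src].items()}

    def defrag(self, subvol, name):
        """Not snapshot-aware: COW-rewrite the file as one new extent,
        for this subvolume only. Other snapshots keep the old extents."""
        data = "".join(self.extents[i] for i in self.subvols[subvol][name])
        self.extents[self.next_id] = data
        self.subvols[subvol][name] = [self.next_id]
        self.next_id += 1
        self._drop_unreferenced()

    def _drop_unreferenced(self):
        live = {i for files in self.subvols.values()
                for ids in files.values() for i in ids}
        self.extents = {i: d for i, d in self.extents.items() if i in live}

    def used(self):
        return sum(len(d) for d in self.extents.values())


fs = ToyFs()
fs.write("root", "big.bin", ["aa", "bb", "cc"])  # one file in three fragments
fs.snapshot("root", "saved")                     # stands in for the ext* snapshot
used_before = fs.used()
fs.defrag("root", "big.bin")
used_after = fs.used()
print(used_before, used_after, fs.subvols["saved"]["big.bin"])
```

In this model the `saved` entry still lists the original three extents after the defrag, which is the toy equivalent of the rollback still working. A snapshot-aware defrag would have to repoint `saved` at the new extent too, and that is exactly what would pull the data out from under the ext* metadata.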
All that said, at least in theory, balance and defrag could be taught to special-case the ext* snapshot, leaving it alone, while balancing and defragging the btrfs metadata, along with any files added or changed since the btrfs conversion, which thus exist on the btrfs side. That's the very limited yes I mentioned above.

But that's only theory, and in practice it'd hardly be worth the bother to implement. Why?

Because the entire idea of the conversion and being able to rollback is that users will only keep the rollback snapshot very temporarily, perhaps a few days' worth of normal filesystem writes at the longest, since the ext* side remains as it is and will quickly become outdated, with all changes since then lost if the rollback option /is/ exercised.

Also, remember that immediately after the conversion, all files are still in ext* form, with native btrfs versions of the files only created as the files change. So if btrfs defrag and balance were taught to ignore the ext* saved snapshot, there really wouldn't be a lot left for them to do. The btrfs metadata would be newly written by the conversion and thus very limited optimization could be done to it at least as long as it's still pointing at ext* format data, and other than that, there'd only be any new or changed files, freshly written in native btrfs mode, to optimize.

Given that the rollback snapshot is supposed to be very temporary in the first place, and that it should be relatively quickly either deleted and a defrag and balance done to fully optimize the formerly ext* layout for btrfs, or the rollback option should be exercised and all changes made while the filesystem was btrfs lost, it's hardly worth the trouble to teach btrfs defrag and balance to special-case-ignore the special-case ext* snapshot, since there really shouldn't be much for them to do as long as that snapshot exists.

So as I said at the top, in general, no, it's not possible.
In theory there's a very narrow-case yes, but in practice, it's so narrow-case and should exist for such a short period, that it's hardly worth the trouble to implement.

Meanwhile, a variant of the same question, with a slightly different answer:

Would it be possible to optimize convert such that it did the defrag and balance optimization at the same time as the conversion?

Yes, it /would/ be possible, but again, it's hardly worth it, tho now for a different reason. If the optimization happens at the same time, it's by definition no-rollback since optimizing to btrfs layout loses the ext* layout, so using that would mean no-rollbacks. And what then if things went wrong, the biggest reason to allow rollbacks in the first place? No rollback and failed conversion means lost data. Which would mean you could do that only if you either didn't care about losing the data, or if you had full (tested) backups of everything you cared about.

And if you had backups or didn't care about the data, there's already a *MUCH* more straightforward way to do the same thing -- simply blow away the old ext* with a clean mkfs.btrfs. If you don't care about the data, no loss and it's far faster and more efficient than a conversion. And if you do care about the data, you simply restore from backup onto the freshly mkfs-ed btrfs, STILL faster and more efficient than a conversion.

So the ability to rollback either due to problems with the conversion or because you decide you don't like btrfs, is critical to the whole convert-in-place concept in the first place. Which means you can't optimize the conversion/defrag/balance steps by combining them, because in so doing you lose the ability to rollback, and without that, you might as well simply start with a clean mkfs.btrfs in the first place, restoring from backup after that if desired, and forget about the whole convert-in-place entirely.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman