From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from tty0.vserver.softronics.ch ([91.214.169.36]:49641 "EHLO
 fe1.digint.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S932178AbdIGOdf (ORCPT ); Thu, 7 Sep 2017 10:33:35 -0400
Subject: Re: send | receive: received snapshot is missing recent files
To: Dave, linux-btrfs@vger.kernel.org
Cc: A L
References:
From: Axel Burri
Message-ID:
Date: Thu, 7 Sep 2017 16:33:21 +0200
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Having a received_uuid set on the source volume ("/home" in your case)
is indeed a bad thing when it comes to send/receive. You probably
restored a backup with send/receive, and made it read/write using
"btrfs property set -ts /home ro false". This is an evil thing, as it
leaves received_uuid intact. In order to make a subvolume read-write, I
recommend using "btrfs subvolume snapshot ".

There is a btrbk FAQ entry on how to fix this:

https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set

On 2017-09-07 15:34, Dave wrote:
> I just ran a test. The btrfs send - receive problem I described is
> indeed fully resolved by removing the "problematic" snapshot on the
> target device. I did not make any changes to the source volume. I did
> not make any other changes in my steps (see earlier message for my
> exact steps).
>
> Therefore, the problem I described in my earlier message is not due
> exclusively to having a Received UUID on the source volume (or to any
> other feature of the source volume). It is not related to any feature
> of the directly specified parent volume either. More details are
> included in my earlier email.
>
> Thanks for any further feedback, including answers to my questions and
> comments about whether this is a known issue.
>
>
> On Thu, Sep 7, 2017 at 8:39 AM, Dave wrote:
>>
>> Hello.
>> Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>>
>> How does it happen?
>> How does one remove a Received UUID from the source volume?
>>
>> And how does that explain my results where I showed that the problem
>> is not dependent upon the source volume but is instead dependent upon
>> some existing snapshot on the target volume?
>>
>> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
>>
>> Thank you.
>>
>> On Thu, Sep 7, 2017 at 2:24 AM, A L wrote:
>>> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
>>>
>>> ---- From: Dave -- Sent: 2017-09-07 - 06:43 ----
>>>
>>>> Here is more info and a possible (shocking) explanation. This
>>>> aggregates my prior messages and it provides an almost complete set of
>>>> steps to reproduce this problem.
>>>>
>>>> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
>>>> btrfs-progs v4.12
>>>>
>>>> My steps:
>>>>
>>>> [root@srv]# sync
>>>> [root@srv]# mkdir /home/.snapshots/test1
>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
>>>> [root@srv]# sync
>>>> [root@srv]# mkdir /mnt/x5a/home/test1
>>>> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive
>>>> /mnt/x5a/home/test1/
>>>> At subvol /home/.snapshots/test1/home/
>>>> At subvol home
>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
>>>> NOTE: all recent files are present
>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
>>>> NOTE: all recent files are present
>>>> [root@srv]# mkdir /home/.snapshots/test2
>>>> [root@srv]# mkdir /mnt/x5a/home/test2
>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
>>>> [root@srv]# sync
>>>> [root@srv]# btrfs send -p
>>>> /home/.snapshots/test1/home/
>>>> /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
>>>> At subvol /home/.snapshots/test2/home/
>>>> At snapshot home
>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
>>>> NOTE: all recent files are MISSING
>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
>>>> NOTE: all recent files are MISSING
>>>>
>>>> Below I am including some rsync output to illustrate when a snapshot
>>>> is missing files (or not):
>>>>
>>>> [root@srv]# rsync -aniv /home/.snapshots/test1/home/
>>>> /home/.snapshots/test2/home/
>>>> sending incremental file list
>>>>
>>>> sent 1,143,286 bytes received 1,123 bytes 762,939.33 bytes/sec
>>>> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
>>>>
>>>> This indicates that these two subvolumes contain the same files, which
>>>> they should because test2 is a snapshot of test1 without any changes
>>>> to files, and it was not sent to another physical device.
>>>>
>>>> The problem is when test2 is sent to another device as shown by the
>>>> rsync results below.
>>>>
>>>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
>>>> sending incremental file list
>>>> .d..t...... ./
>>>> .d..t...... user1/
>>>> >f.st...... user1/.bash_history
>>>> >f.st...... user1/.bashrc
>>>> >f+++++++++ user1/test2017-09-06.txt
>>>> ...
>>>> and a long list of other missing files
>>>>
>>>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
>>>> missing all recent files (any files from the month of August or
>>>> September), as my prior visual inspections had indicated. The same
>>>> files are missing every time. There is no randomness to the missing
>>>> data.
>>>>
>>>> The problem does not happen for me if the receive command target is
>>>> located on the same physical device as shown next. (However, I suspect
>>>> there's more to it than that, as explained further below.)
>>>>
>>>> [root@srv]# mkdir /home/.snapshots/test2rec
>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
>>>> /home/.snapshots/test2/home/ | btrfs receive
>>>> /home/.snapshots/test2rec/
>>>> At subvol /home/.snapshots/test2/home/
>>>>
>>>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
>>>> sending incremental file list
>>>>
>>>> sent 1,143,286 bytes received 1,123 bytes 2,288,818.00 bytes/sec
>>>> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
>>>>
>>>> The above (as well as visual inspection of files) indicates that these
>>>> two subvolumes contain the same files, which was not the case when the
>>>> same command had a target located on another physical device. Of
>>>> course, a snapshot which resides on the same physical device is not a
>>>> very good backup. So I do need to send it to another device, but that
>>>> results in missing files when the -p or -c options are used with btrfs
>>>> send. (Non-incremental sending to another physical device does work.)
>>>>
>>>> I can think of a couple of possible explanations.
>>>>
>>>> One is that there is a problem when using the -p or -c options with
>>>> btrfs send when the target is another physical device. I suspect this
>>>> is not the actual explanation, however.
>>>>
>>>> A second possibility is that the presence of prior existing snapshots
>>>> at the target location (even if old and not referenced in any current
>>>> btrfs command) can determine the outcome and final contents of an
>>>> incremental send operation. I believe the info below suggests this to
>>>> be the case.
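
The rsync dry-run comparison used above can be scripted so that a backup
job notices missing files on its own. What follows is only a minimal
sketch under stated assumptions, not something taken from the report:
"count_missing_files" is a hypothetical helper name, and it relies on
rsync's itemized output (-i), where a regular file that does not exist
on the destination is listed with the prefix ">f+++++++++".

```shell
#!/bin/sh
# Hypothetical helper: count the files that an rsync dry run reports as
# absent from the destination. Reads "rsync -ani" output on stdin.
count_missing_files() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps the function's status at 0.
    grep -c '^>f+++++++++ ' || true
}

# Example use (paths taken from the report; run against real snapshots):
#   rsync -ani /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/ \
#       | count_missing_files
```

A result other than 0 right after a send | receive means the received
snapshot lacks files that exist in the source snapshot.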
>>>>
>>>> [root@srv]# btrfs su show /home/.snapshots/test2/home/
>>>> test2/home
>>>> Name: home
>>>> UUID: 292e8bbf-a95f-2a4e-8280-129202d389dc
>>>> Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 15:38:16 -0400
>>>> Subvolume ID: 2000
>>>> Generation: 5020
>>>> Gen at creation: 5020
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
>>>> home/test1/home
>>>> Name: home
>>>> UUID: dc00b13d-f841-cf48-a169-aa61429a5679
>>>> Parent UUID: -
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 15:33:45 -0400
>>>> Subvolume ID: 656
>>>> Generation: 777
>>>> Gen at creation: 773
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
>>>> home/test2/home
>>>> Name: home
>>>> UUID: b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>>> Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 15:39:51 -0400
>>>> Subvolume ID: 660
>>>> Generation: 779
>>>> Gen at creation: 779
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>>>> test2rec/home
>>>> Name: home
>>>> UUID: bde1891d-1474-414f-b6ab-2a34c5af224e
>>>> Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 17:36:19 -0400
>>>> Subvolume ID: 2003
>>>> Generation: 5027
>>>> Gen at creation: 5027
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> Below, we have an old, almost forgotten snapshot (dated 2017-07-21) on
>>>> device /mnt/x5a/home with a Received UUID that matches the Received
>>>> UUID of test snapshots that were newly
>>>> created today. How? Why?
>>>>
>>>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>>>> home/107/snapshot
>>>> Name: snapshot
>>>> UUID: 94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>>> Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-07-21 00:00:25 -0400
>>>> Subvolume ID: 433
>>>> Generation: 222
>>>> Gen at creation: 221
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> If my guess is correct, btrfs has found this old snapshot and
>>>> referenced it without me telling it to do so. The result is that the
>>>> newly executed btrfs commands shown above have a totally unexpected
>>>> result.
>>>>
>>>> Today's new snapshot will not contain any files newer than 2017-07-21.
>>>> Is this a known issue?
>>>>
>>>> Refer back to the commands at the top of this message. I created a new
>>>> snapshot and did a full (non-incremental) send to the target location
>>>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>>>> referenced the prior snapshot created today. Nowhere did I reference
>>>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>>>> this backup location -- it was intended to hold a lot of them.) Yet,
>>>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>>>> resulted in today's backup (and all recent backups) being worthless
>>>> due to them missing all files since 2017-07-21.
>>>>
>>>> These results are totally repeatable, given my set of existing
>>>> backups. But it's bizarre to me. As I understand it, a staff person
>>>> could transfer a btrfs snapshot to a target volume and its mere
>>>> presence there could make all subsequent backups (incremental sends)
>>>> to that target volume invalid and useless. If that is true... wow.
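
The "btrfs su show" listings above all carry the same Received UUID,
including the old 107 snapshot. As far as I understand it, receive
looks up the parent on the target by Received UUID, so several
subvolumes sharing one value make that lookup ambiguous. The sketch
below is a minimal illustration of how to spot this, under stated
assumptions: "duplicate_received_uuids" is a hypothetical helper name,
and it only parses the "Received UUID:" lines in the format shown in
this thread.

```shell
#!/bin/sh
# Hypothetical helper: print every Received UUID that appears more than
# once in concatenated "btrfs subvolume show" output read from stdin.
duplicate_received_uuids() {
    # Keep the value of each "Received UUID:" line, skipping "-" (unset),
    # then report only the values that occur at least twice.
    awk '$1 == "Received" && $2 == "UUID:" && $3 != "-" { print $3 }' \
        | sort | uniq -d
}

# Example use on the backup target from the report (needs root):
#   for s in /mnt/x5a/home/*/*; do btrfs subvolume show "$s"; done \
#       | duplicate_received_uuids
```

Empty output means every received subvolume on the target has a
distinct Received UUID.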
>>>>
>>>> Another interesting observation is that the device that contains the
>>>> source snapshot, /home/.snapshots, also contains many, many prior
>>>> snapshots, going back to when this system was first set up. Why do
>>>> none of them cause a problem? Is it because I had never used
>>>> /home/.snapshots as the target of a receive operation (until I did so
>>>> today in testing the steps above)?
>>>>
>>>> As far as repeating these steps, all this was totally repeatable for
>>>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>>>> receive command (/mnt/x5a/home/). I do not know how to create such a
>>>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>>>> results.
>>>>
>>>> Maybe somebody can explain to me what's really happening. How is it
>>>> possible that an old snapshot created 2017-07-21 could have the same
>>>> Received UUID as snapshots created today? And how could that fact lead
>>>> to the result I'm seeing, which seems very serious? (Unexpected
>>>> missing files from a backup which was completed without errors is
>>>> pretty serious in my book.)
>>>>
>>>> Most important question: how can we rely on automated incremental
>>>> backups with btrfs send | receive given what I'm observing here
>>>> (assuming my observations are roughly correct)?
>>>>
>>>> Here's more info just to confirm that my results are not due to
>>>> filesystem corruption.
>>>>
>>>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>>>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>>>> Checking filesystem on /dev/mapper/x5a_luks
>>>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> checking extents [o]
>>>> checking free space cache [.]
>>>> checking fs roots [o]
>>>> checking csums
>>>> checking root refs
>>>> found 258178555904 bytes used, no error found
>>>> total csum bytes: 250354776
>>>> total tree bytes: 1752088576
>>>> total fs tree bytes: 1308540928
>>>> total extent tree bytes: 175161344
>>>> btree space waste bytes: 215594634
>>>> file data blocks allocated: 258634637312
>>>> referenced 292888985600
>>>>
>>>> [root@srv]# btrfs fi show /mnt/x5a/
>>>> Label: 'x5a_top' uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> Total devices 1 FS bytes used 240.45GiB
>>>> devid 1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>>>
>>>> [root@srv]# btrfs fi df /mnt/x5a/
>>>> Data, single: total=239.01GiB, used=238.82GiB
>>>> System, DUP: total=32.00MiB, used=48.00KiB
>>>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>>>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>>>
>>>> # btrfs scrub status -d /mnt/x5a/
>>>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> scrub device /dev/mapper/x5a_luks (id 1) history
>>>> scrub started at Wed Sep 6 17:09:58 2017 and finished after 01:42:30
>>>> total bytes scrubbed: 242.08GiB with 0 errors
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
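
For the received_uuid problem described in the reply at the top of this
message, a quick check of the source subvolume before starting a send
chain might look like the sketch below. It is only an illustration
under stated assumptions: "has_received_uuid" is a hypothetical helper
name, and it parses the "Received UUID:" line in the format shown in
the "btrfs su show" listings in this thread, where "-" means unset.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if "btrfs subvolume show" output on stdin
# reports a Received UUID, exit 1 if the field is "-" or absent.
has_received_uuid() {
    awk '$1 == "Received" && $2 == "UUID:" { found = ($3 != "-") }
         END { exit !found }'
}

# Example use on the source volume from the report (needs root):
#   if btrfs subvolume show /home | has_received_uuid; then
#       echo "/home has a Received UUID set; see the btrbk FAQ entry" >&2
#   fi
```

A source subvolume that passes this check (exit status 0) is the case
the btrbk FAQ entry is about.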