From: Axel Burri <axel@tty0.ch>
To: Dave <davestechshop@gmail.com>, linux-btrfs@vger.kernel.org
Cc: A L <crimsoncottage@gmail.com>
Subject: Re: send | receive: received snapshot is missing recent files
Date: Thu, 7 Sep 2017 16:33:21 +0200
Message-ID: <e9ec7ac7-781d-34e0-9f48-aeb8566e5787@tty0.ch>
In-Reply-To: <CAH=dxU6S7XLVb4NY0ueZapUJW6A_KM9JezpB3icjYW4nRB8OPA@mail.gmail.com>
Having a received_uuid set on the source volume ("/home" in your case)
is indeed a bad thing when it comes to send/receive. You probably
restored a backup with send/receive, and made it read/write using "btrfs
property set -ts /home ro false". This is an evil thing, as it leaves
received_uuid intact. To make a subvolume read-write, I recommend
using "btrfs subvolume snapshot <ro-subvol> <rw-subvol>" instead.
There is a FAQ entry on btrbk on how to fix this:
https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set
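A rough sketch of how one might check for this and create a writable copy the safe way. The function names and paths here are made up for illustration; only "btrfs subvolume show" and "btrfs subvolume snapshot" are real commands, and I am assuming the "Received UUID:" line looks the same as in the "btrfs su show" output quoted below:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths are placeholders.

# Print the "Received UUID" value from `btrfs subvolume show` output
# read on stdin. A value of "-" means no received_uuid is set.
received_uuid() {
    awk '/Received UUID:/ { print $NF }'
}

# Create a read-write snapshot of a received (read-only) subvolume,
# instead of flipping the "ro" property on the received subvolume itself.
# $1 = read-only source subvolume, $2 = destination for the rw snapshot.
make_writable_copy() {
    btrfs subvolume snapshot "$1" "$2"
}

# Example usage (needs root and a btrfs filesystem):
#   btrfs subvolume show /home | received_uuid
#   make_writable_copy /path/to/received-ro-subvol /path/to/new-rw-subvol
```

If the first command prints anything other than "-" for a subvolume you use as a send source, that is the situation the FAQ entry above deals with.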
On 2017-09-07 15:34, Dave wrote:
> I just ran a test. The btrfs send - receive problem I described is
> indeed fully resolved by removing the "problematic" snapshot on the
> target device. I did not make any changes to the source volume. I did
> not make any other changes in my steps (see earlier message for my
> exact steps).
>
> Therefore, the problem I described in my earlier message is not due
> exclusively to having a Received UUID on the source volume (or to any
> other feature of the source volume). It is not related to any feature
> of the directly specified parent volume either. More details are
> included in my earlier email.
>
> Thanks for any further feedback, including answers to my questions and
> comments about whether this is a known issue.
>
>
> On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
>>
>> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>>
>> How does it happen?
>> How does one remove a Received UUID from the source volume?
>>
>> And how does that explain my results where I showed that the problem
>> is not dependent upon the source volume but is instead dependent upon
>> some existing snapshot on the target volume?
>>
>> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
>>
>> Thank you.
>>
>> On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
>>> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
>>>
>>> ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
>>>
>>>> Here is more info and a possible (shocking) explanation. This
>>>> aggregates my prior messages and it provides an almost complete set of
>>>> steps to reproduce this problem.
>>>>
>>>> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
>>>> btrfs-progs v4.12
>>>>
>>>> My steps:
>>>>
>>>> [root@srv]# sync
>>>> [root@srv]# mkdir /home/.snapshots/test1
>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
>>>> [root@srv]# sync
>>>> [root@srv]# mkdir /mnt/x5a/home/test1
>>>> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive /mnt/x5a/home/test1/
>>>> At subvol /home/.snapshots/test1/home/
>>>> At subvol home
>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
>>>> NOTE: all recent files are present
>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
>>>> NOTE: all recent files are present
>>>> [root@srv]# mkdir /home/.snapshots/test2
>>>> [root@srv]# mkdir /mnt/x5a/home/test2
>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
>>>> [root@srv]# sync
>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
>>>> At subvol /home/.snapshots/test2/home/
>>>> At snapshot home
>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
>>>> NOTE: all recent files are MISSING
>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
>>>> NOTE: all recent files are MISSING
>>>>
>>>> Below I am including some rsync output to illustrate when a snapshot
>>>> is missing files (or not):
>>>>
>>>> [root@srv]# rsync -aniv /home/.snapshots/test1/home/ /home/.snapshots/test2/home/
>>>> sending incremental file list
>>>>
>>>> sent 1,143,286 bytes received 1,123 bytes 762,939.33 bytes/sec
>>>> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
>>>>
>>>> This indicates that these two subvolumes contain the same files, which
>>>> they should because test2 is a snapshot of test1 without any changes
>>>> to files, and it was not sent to another physical device.
>>>>
>>>> The problem is when test2 is sent to another device as shown by the
>>>> rsync results below.
>>>>
>>>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
>>>> sending incremental file list
>>>> .d..t...... ./
>>>> .d..t...... user1/
>>>> >f.st...... user1/.bash_history
>>>> >f.st...... user1/.bashrc
>>>> >f+++++++++ user1/test2017-09-06.txt
>>>> ...
>>>> and a long list of other missing files
>>>>
>>>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
>>>> missing all recent files (any files from the month of August or
>>>> September), as my prior visual inspections had indicated. The same
>>>> files are missing every time. There is no randomness to the missing
>>>> data.
>>>>
>>>> The problem does not happen for me if the receive command target is
>>>> located on the same physical device as shown next. (However, I suspect
>>>> there's more to it than that, as explained further below.)
>>>>
>>>> [root@srv]# mkdir /home/.snapshots/test2rec
>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /home/.snapshots/test2rec/
>>>> At subvol /home/.snapshots/test2/home/
>>>>
>>>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
>>>> sending incremental file list
>>>>
>>>> sent 1,143,286 bytes received 1,123 bytes 2,288,818.00 bytes/sec
>>>> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
>>>>
>>>> The above (as well as visual inspection of files) indicates that these
>>>> two subvolumes contain the same files, which was not the case when the
>>>> same command had a target located on another physical device. Of
>>>> course, a snapshot which resides on the same physical device is not a
>>>> very good backup. So I do need to send it to another device, but that
>>>> results in missing files when the -p or -c options are used with btrfs
>>>> send. (Non-incremental sending to another physical device does work.)
>>>>
>>>> I can think of a couple possible explanations.
>>>>
>>>> One is that there is a problem when using the -p or -c options with
>>>> btrfs send when the target is another physical device. I suspect this
>>>> is not the actual explanation, however.
>>>>
>>>> A second possibility is that the presence of prior existing snapshots
>>>> at the target location (even if old and not referenced in any current
>>>> btrfs command), can determine the outcome and final contents of an
>>>> incremental send operation. I believe the info below suggests this to
>>>> be the case.
>>>>
>>>> [root@srv]# btrfs su show /home/.snapshots/test2/home/
>>>> test2/home
>>>> Name: home
>>>> UUID: 292e8bbf-a95f-2a4e-8280-129202d389dc
>>>> Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 15:38:16 -0400
>>>> Subvolume ID: 2000
>>>> Generation: 5020
>>>> Gen at creation: 5020
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
>>>> home/test1/home
>>>> Name: home
>>>> UUID: dc00b13d-f841-cf48-a169-aa61429a5679
>>>> Parent UUID: -
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 15:33:45 -0400
>>>> Subvolume ID: 656
>>>> Generation: 777
>>>> Gen at creation: 773
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
>>>> home/test2/home
>>>> Name: home
>>>> UUID: b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>>> Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 15:39:51 -0400
>>>> Subvolume ID: 660
>>>> Generation: 779
>>>> Gen at creation: 779
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>>>> test2rec/home
>>>> Name: home
>>>> UUID: bde1891d-1474-414f-b6ab-2a34c5af224e
>>>> Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-09-06 17:36:19 -0400
>>>> Subvolume ID: 2003
>>>> Generation: 5027
>>>> Gen at creation: 5027
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
>>>> device /mnt/x5a/home with a Received UUID that matches the Received
>>>> UUID of test snapshots that were newly created today. How? Why?
>>>>
>>>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>>>> home/107/snapshot
>>>> Name: snapshot
>>>> UUID: 94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>>> Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>> Creation time: 2017-07-21 00:00:25 -0400
>>>> Subvolume ID: 433
>>>> Generation: 222
>>>> Gen at creation: 221
>>>> Parent ID: 257
>>>> Top level ID: 257
>>>> Flags: readonly
>>>> Snapshot(s):
>>>>
>>>> If my guess is correct, btrfs has found this old snapshot and
>>>> referenced it without me telling it to do so. The result is that the
>>>> newly executed btrfs commands shown above have a totally unexpected
>>>> result.
>>>>
>>>> Today's new snapshot will not contain any files newer than 2017-07-21.
>>>> Is this a known issue?
>>>>
>>>> Refer back to the commands at the top of this message. I created a new
>>>> snapshot and did a full (non-incremental) send to the target location
>>>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>>>> referenced the prior snapshot created today. Nowhere did I reference
>>>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>>>> this backup location -- it was intended to hold a lot of them.) Yet,
>>>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>>>> resulted in today's backup (and all recent backups) being worthless
>>>> due to them missing all files since 2017-07-21.
>>>>
>>>> These results are totally repeatable, given my set of existing
>>>> backups. But it's bizarre to me. As I understand it, a staff person
>>>> could transfer a btrfs snapshot to a target volume and its mere
>>>> presence there could make all subsequent backups (incremental sends)
>>>> to that target volume invalid and useless. If that is true... wow.
>>>>
>>>> Another interesting observation is that the device that contains the
>>>> source snapshot, /home/.snapshots, also contains many, many prior
>>>> snapshots, going back to when this system was first set up. Why do
>>>> none of them cause a problem? Is it because I had never used
>>>> /home/.snapshots as the target of a receive operation (until I did so
>>>> today in testing the steps above)?
>>>>
>>>> As far as repeating these steps, all this was totally repeatable for
>>>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>>>> receive command (/mnt/x5a/home/). I do not know how to create such a
>>>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>>>> results.
>>>>
>>>> Maybe somebody can explain to me what's really happening. How is it
>>>> possible that an old snapshot created 2017-07-21 could have the same
>>>> Received UUID as snapshots created today? And how could that fact lead
>>>> to the result I'm seeing, which seems very serious. (Unexpected
>>>> missing files from a backup which was completed without errors is
>>>> pretty serious in my book.)
>>>>
>>>> Most important question: how can we rely on automated incremental
>>>> backups with btrfs send | receive given what I'm observing here
>>>> (assuming my observations are roughly correct)?
>>>>
>>>> Here's more info just to confirm that my results are not due to
>>>> filesystem corruption.
>>>>
>>>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>>>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>>>> Checking filesystem on /dev/mapper/x5a_luks
>>>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> checking extents [o]
>>>> checking free space cache [.]
>>>> checking fs roots [o]
>>>> checking csums
>>>> checking root refs
>>>> found 258178555904 bytes used, no error found
>>>> total csum bytes: 250354776
>>>> total tree bytes: 1752088576
>>>> total fs tree bytes: 1308540928
>>>> total extent tree bytes: 175161344
>>>> btree space waste bytes: 215594634
>>>> file data blocks allocated: 258634637312
>>>> referenced 292888985600
>>>>
>>>> [root@srv]# btrfs fi show /mnt/x5a/
>>>> Label: 'x5a_top' uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> Total devices 1 FS bytes used 240.45GiB
>>>> devid 1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>>>
>>>> [root@srv]# btrfs fi df /mnt/x5a/
>>>> Data, single: total=239.01GiB, used=238.82GiB
>>>> System, DUP: total=32.00MiB, used=48.00KiB
>>>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>>>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>>>
>>>> # btrfs scrub status -d /mnt/x5a/
>>>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> scrub device /dev/mapper/x5a_luks (id 1) history
>>>> scrub started at Wed Sep 6 17:09:58 2017 and finished after 01:42:30
>>>> total bytes scrubbed: 242.08GiB with 0 errors
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>
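To check whether several subvolumes on the backup target carry the same Received UUID, as the "btrfs su show" outputs quoted above suggest, something along these lines might help. This is only a sketch: the two function names are invented for this example, and the glob in the usage comment assumes the directory layout shown in the quoted report.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Emit "<received_uuid> <path>" for each subvolume path given as argument.
# Relies on the "Received UUID:" line of `btrfs subvolume show`.
list_received_uuids() {
    for subvol in "$@"; do
        uuid=$(btrfs subvolume show "$subvol" | awk '/Received UUID:/ { print $NF }')
        printf '%s %s\n' "$uuid" "$subvol"
    done
}

# Read "<received_uuid> <path>" pairs on stdin and print, in input order,
# every pair whose Received UUID occurs more than once.
# Entries with "-" (no received_uuid set) are ignored.
shared_received_uuids() {
    awk '$1 != "-" { count[$1]++; line[NR] = $0; key[NR] = $1 }
         END {
             for (i = 1; i <= NR; i++)
                 if (count[key[i]] > 1) print line[i]
         }'
}

# Example usage (needs root and the target filesystem mounted):
#   list_received_uuids /mnt/x5a/home/*/home /mnt/x5a/home/*/snapshot \
#       | shared_received_uuids
```

Any output means more than one subvolume on the target shares a Received UUID, which is worth looking at before trusting further incremental receives into that location.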