* send | receive: received snapshot is missing recent files
@ 2017-09-06 5:37 Dave
[not found] ` <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>
0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-06 5:37 UTC (permalink / raw)
To: linux-btrfs
I'm running Arch Linux on BTRFS. I use Snapper to take hourly
snapshots and it works without any issues.
I have a bash script that uses send | receive to transfer snapshots to
a couple of external HDDs. The script runs daily on a systemd timer. I
set all this up recently and I first noticed that it runs every day
and that the expected snapshots are received.
At a glance, everything looked correct. However, today was my day to
drill down and really make sure everything was working.
To my surprise, the newest received incremental snapshots are missing
all recent files. These new snapshots reflect the system state from
weeks ago and no files more recent than a certain date are in the
snapshots.
However, the snapshots are newly created and newly received. The work
is being done fresh each day when my script runs, but the results are
anchored back in time at this earlier date. Weird.
I'm not really sure where to start troubleshooting, so I'll start by
sharing part of my script. I'm sure the problem is in my script, and
is not related to BTRFS or snapper functionality. (As I said, the
Snapper snapshots are totally OK before being sent | received.)
These are the key lines of the script I'm using to send | receive a snapshot:
old_num=$(snapper -c "$config" list -t single | awk '/'"$selected_uuid"'/ {print $1}')
old_snap=$SUBVOLUME/.snapshots/$old_num/snapshot
new_num=$(snapper -c "$config" create --print-number)
new_snap=$SUBVOLUME/.snapshots/$new_num/snapshot
btrfs send -c "$old_snap" "$new_snap" | $ssh btrfs receive "$backup_location"
I have to admit that even after reading the following page half a
dozen times, I barely understand the difference between -c and -p.
https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_difference_between_-c_and_-p_in_send.3F
After reading that page again today, I feel like I should switch to -p
(maybe). However, the -c vs -p choice probably isn't my problem.
Any ideas what my problem could be?
^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>]
* Re: send | receive: received snapshot is missing recent files
[not found] ` <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>
@ 2017-09-06 19:46 ` Dave
2017-09-07 4:43 ` Dave
0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-06 19:46 UTC (permalink / raw)
To: linux-btrfs
This is an even better set of steps for reproducing the problem.
[root@srv]# sync
[root@srv]# mkdir /home/.snapshots/test1
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
[root@srv]# sync
[root@srv]# mkdir /mnt/x5a/home/test1
[root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive /mnt/x5a/home/test1/
At subvol /home/.snapshots/test1/home/
At subvol home
[root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
NOTE: all recent files are present
[root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
NOTE: all recent files are present
[root@srv]# mkdir /home/.snapshots/test2
[root@srv]# mkdir /mnt/x5a/home/test2
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
[root@srv]# sync
[root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
At subvol /home/.snapshots/test2/home/
At snapshot home
[root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
NOTE: all recent files are MISSING
[root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
NOTE: all recent files are MISSING
Any ideas what could be causing this problem with incremental backups?
On Wed, Sep 6, 2017 at 3:23 PM, Dave <davestechshop@gmail.com> wrote:
>
> Here is more info on this problem. I can reproduce this without using
> my script. Simple btrfs commands will reproduce the problem every
> time. The same files are missing every time. There is no randomness to
> the missing data.
>
> Here are my steps:
>
> 1. snapper -c home create
>    result is a valid snapshot at /home/.snapshots/1704/snapshot
> 2. btrfs send /home/.snapshots/1704/snapshot | btrfs receive /mnt/x5a/home/1704
> 3. snapper -c home create
>    result is a valid snapshot at /home/.snapshots/1716/snapshot
> 4. btrfs send -c /home/.snapshots/1704/snapshot/ /home/.snapshots/1716/snapshot/ | btrfs receive /mnt/x5a/home/1716/
>
> I expect /mnt/x5a/home/1716/snapshot to be identical to
> /home/.snapshots/1716/snapshot. However, it is not.
> The result is that /mnt/x5a/home/1716/snapshot is missing all recent files.
>
> Next step was to delete snapshot 1716 (in both locations) and repeat
> the send | receive using -p
>
> btrfs su del /mnt/x5a/home/1716/snapshot
> snapper -c home delete 1716
> snapper -c home create
> btrfs send -p /home/.snapshots/1704/snapshot/ /home/.snapshots/1716/snapshot/ | btrfs receive /mnt/x5a/home/1716/
>
> The result is once again that /mnt/x5a/home/1716/snapshot is missing
> all recent files. However, the other snapshots are all valid:
> /home/.snapshots/1704/snapshot is valid & complete
> /mnt/x5a/home/1704/snapshot -- non-incremental send: snapshot is valid & complete
> /home/.snapshots/1716/snapshot is valid & complete
> /mnt/x5a/home/1716/snapshot -- incrementally sent snapshot is missing
> all recent files whether sent with -c or -p
>
> The incrementally sent snapshot is even missing files that are present
> in the reference snapshot /mnt/x5a/home/1704/snapshot.
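The manual "ls -la" inspection in the steps above could be scripted. The following is only a minimal sketch, assuming plain POSIX tools (find, sort, comm) plus mktemp are available and that no file name contains a newline. The function name missing_in_copy is hypothetical and not part of btrfs or snapper. It lists paths that exist in the source snapshot but not in the received copy; it compares names only, not file contents.

```shell
# Hypothetical helper: print paths (relative to the tree root) that exist
# under the source tree $1 but are absent under the received tree $2.
# Assumption: file names do not contain newlines.
missing_in_copy() {
    src_list=$(mktemp) || return 1
    dst_list=$(mktemp) || { rm -f "$src_list"; return 1; }
    # Build sorted lists of relative paths for both trees.
    (cd "$1" && find . | LC_ALL=C sort) > "$src_list"
    (cd "$2" && find . | LC_ALL=C sort) > "$dst_list"
    # comm -23 keeps only lines unique to the first (source) list.
    LC_ALL=C comm -23 "$src_list" "$dst_list"
    rm -f "$src_list" "$dst_list"
}

# Example call with the paths from the steps above (not run here):
# missing_in_copy /home/.snapshots/test2/home /mnt/x5a/home/test2/home
```

An empty result would mean every path in the source snapshot also exists in the received one.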
* Re: send | receive: received snapshot is missing recent files
2017-09-06 19:46 ` Dave
@ 2017-09-07 4:43 ` Dave
2017-09-07 6:24 ` A L
0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-07 4:43 UTC (permalink / raw)
To: linux-btrfs
Here is more info and a possible (shocking) explanation. This
aggregates my prior messages and it provides an almost complete set of
steps to reproduce this problem.
Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
btrfs-progs v4.12
My steps:
[root@srv]# sync
[root@srv]# mkdir /home/.snapshots/test1
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
[root@srv]# sync
[root@srv]# mkdir /mnt/x5a/home/test1
[root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive /mnt/x5a/home/test1/
At subvol /home/.snapshots/test1/home/
At subvol home
[root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
NOTE: all recent files are present
[root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
NOTE: all recent files are present
[root@srv]# mkdir /home/.snapshots/test2
[root@srv]# mkdir /mnt/x5a/home/test2
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
[root@srv]# sync
[root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
At subvol /home/.snapshots/test2/home/
At snapshot home
[root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
NOTE: all recent files are MISSING
[root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
NOTE: all recent files are MISSING
Below I am including some rsync output to illustrate when a snapshot
is missing files (or not):
[root@srv]# rsync -aniv /home/.snapshots/test1/home/ /home/.snapshots/test2/home/
sending incremental file list
sent 1,143,286 bytes received 1,123 bytes 762,939.33 bytes/sec
total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
This indicates that these two subvolumes contain the same files, which
they should because test2 is a snapshot of test1 without any changes
to files, and it was not sent to another physical device.
The problem is when test2 is sent to another device as shown by the
rsync results below.
[root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
sending incremental file list
.d..t...... ./
.d..t...... user1/
>f.st...... user1/.bash_history
>f.st...... user1/.bashrc
>f+++++++++ user1/test2017-09-06.txt
...
and a long list of other missing files
The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
missing all recent files (any files from the month of August or
September), as my prior visual inspections had indicated. The same
files are missing every time. There is no randomness to the missing
data.
The problem does not happen for me if the receive command target is
located on the same physical device as shown next. (However, I suspect
there's more to it than that, as explained further below.)
[root@srv]# mkdir /home/.snapshots/test2rec
[root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /home/.snapshots/test2rec/
At subvol /home/.snapshots/test2/home/
# rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
sending incremental file list
sent 1,143,286 bytes received 1,123 bytes 2,288,818.00 bytes/sec
total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
The above (as well as visual inspection of files) indicates that these
two subvolumes contain the same files, which was not the case when the
same command had a target located on another physical device. Of
course, a snapshot which resides on the same physical device is not a
very good backup. So I do need to send it to another device, but that
results in missing files when the -p or -c options are used with btrfs
send. (Non-incremental sending to another physical device does work.)
I can think of a couple possible explanations.
One is that there is a problem when using the -p or -c options with
btrfs send when the target is another physical device. I suspect this
is the actual explanation, however.
A second possibility is that the presence of prior existing snapshots
at the target location (even if old and not referenced in any current
btrfs command), can determine the outcome and final contents of an
incremental send operation. I believe the info below suggests this to
be the case.
[root@srv]# btrfs su show /home/.snapshots/test2/home/
test2/home
Name: home
UUID: 292e8bbf-a95f-2a4e-8280-129202d389dc
Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
Creation time: 2017-09-06 15:38:16 -0400
Subvolume ID: 2000
Generation: 5020
Gen at creation: 5020
Parent ID: 257
Top level ID: 257
Flags: readonly
Snapshot(s):
[root@srv]# btrfs su show /mnt/x5a/home/test1/home
home/test1/home
Name: home
UUID: dc00b13d-f841-cf48-a169-aa61429a5679
Parent UUID: -
Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
Creation time: 2017-09-06 15:33:45 -0400
Subvolume ID: 656
Generation: 777
Gen at creation: 773
Parent ID: 257
Top level ID: 257
Flags: readonly
Snapshot(s):
[root@srv]# btrfs su show /mnt/x5a/home/test2/home/
home/test2/home
Name: home
UUID: b01ab63f-17a1-f442-b9d4-ed12a0d057ea
Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
Creation time: 2017-09-06 15:39:51 -0400
Subvolume ID: 660
Generation: 779
Gen at creation: 779
Parent ID: 257
Top level ID: 257
Flags: readonly
Snapshot(s):
[root@srv]# btrfs su show /home/.snapshots/test2rec/home/
test2rec/home
Name: home
UUID: bde1891d-1474-414f-b6ab-2a34c5af224e
Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
Creation time: 2017-09-06 17:36:19 -0400
Subvolume ID: 2003
Generation: 5027
Gen at creation: 5027
Parent ID: 257
Top level ID: 257
Flags: readonly
Snapshot(s):
Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
device /mnt/x5a/home with a Received UUID that matches the Received
UUID of test snapshots that were newly created today. How? Why?
[root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
home/107/snapshot
Name: snapshot
UUID: 94d0bc47-dbf2-374e-b1c8-de06d729cde2
Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
Creation time: 2017-07-21 00:00:25 -0400
Subvolume ID: 433
Generation: 222
Gen at creation: 221
Parent ID: 257
Top level ID: 257
Flags: readonly
Snapshot(s):
If my guess is correct, btrfs has found this old snapshot and
referenced it without me telling it to do so. The result is that the
newly executed btrfs commands shown above have a totally unexpected
result.
Today's new snapshot will not contain any files newer than 2017-07-21.
Is this a known issue?
Refer back to the commands at the top of this message. I created a new
snapshot and did a full (non-incremental) send to the target location
(/mnt/x5a/home). Then I created a snapshot and did a send which only
referenced the prior snapshot created today. Nowhere did I reference
the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
this backup location -- it was intended to hold a lot of them.) Yet,
the very presence of /mnt/x5a/home/107/snapshot on the target device
resulted in today's backup (and all recent backups) being worthless
due to them missing all files since 2017-07-21.
These results are totally repeatable, given my set of existing
backups. But it's bizarre to me. As I understand it, a staff person
could transfer a btrfs snapshot to a target volume and its mere
presence there could make all subsequent backups (incremental sends)
to that target volume invalid and useless. If that is true... wow.
Another interesting observation is that the device that contains the
source snapshot, /home/.snapshots, also contains many, many prior
snapshots, going back to when this system was first set up. Why do
none of them cause a problem? Is it because I had never used
/home/.snapshots as the target of a receive operation (until I did so
today in testing the steps above)?
As far as repeating these steps, all this was totally repeatable for
me as long as /mnt/x5a/home/107/snapshot existed on the target of the
receive command (/mnt/x5a/home/). I do not know how to create such a
"rogue" snapshot on purpose, but doing so may be key to reproducing my
results.
Maybe somebody can explain to me what's really happening. How is it
possible that an old snapshot created 2017-07-21 could have the same
Received UUID as snapshots created today? And how could that fact lead
to the result I'm seeing, which seems very serious? (Unexpected
missing files from a backup which was completed without errors is
pretty serious in my book.)
Most important question: how can we rely on automated incremental
backups with btrfs send | receive given what I'm observing here
(assuming my observations are roughly correct)?
Here's more info just to confirm that my results are not due to
filesystem corruption.
Running check on the unmounted volume that contains /mnt/x5a/home/test2/home:
[root@srv]# btrfs check -p /dev/mapper/x5a_luks
Checking filesystem on /dev/mapper/x5a_luks
UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
checking extents [o]
checking free space cache [.]
checking fs roots [o]
checking csums
checking root refs
found 258178555904 bytes used, no error found
total csum bytes: 250354776
total tree bytes: 1752088576
total fs tree bytes: 1308540928
total extent tree bytes: 175161344
btree space waste bytes: 215594634
file data blocks allocated: 258634637312
 referenced 292888985600
[root@srv]# btrfs fi show /mnt/x5a/
Label: 'x5a_top' uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
Total devices 1 FS bytes used 240.45GiB
devid 1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
[root@srv]# btrfs fi df /mnt/x5a/
Data, single: total=239.01GiB, used=238.82GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=2.50GiB, used=1.63GiB
GlobalReserve, single: total=422.73MiB, used=0.00B
# btrfs scrub status -d /mnt/x5a/
scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
scrub device /dev/mapper/x5a_luks (id 1) history
scrub started at Wed Sep 6 17:09:58 2017 and finished after 01:42:30
total bytes scrubbed: 242.08GiB with 0 errors
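The comparison of "btrfs su show" output above was done by eye. A rough sketch of automating it is shown here, assuming only that the output contains a "Received UUID:" line as in the listings in this message. The function names received_uuid and duplicate_received_uuids are hypothetical helpers, not btrfs commands.

```shell
# Hypothetical helper: print the value of the "Received UUID" field from
# `btrfs subvolume show` output supplied on stdin ("-" when unset).
received_uuid() {
    awk '/Received UUID:/ { print $NF; exit }'
}

# Hypothetical helper: read "UUID PATH" pairs on stdin and print every
# pair whose UUID occurs more than once. Entries with "-" are ignored.
duplicate_received_uuids() {
    awk '$1 != "-" { count[$1]++; lines[NR] = $0; key[NR] = $1 }
         END {
             for (i = 1; i <= NR; i++)
                 if (i in key && count[key[i]] > 1) print lines[i]
         }'
}

# Example use on the backup target from this message (not run here):
# for snap in /mnt/x5a/home/*/snapshot /mnt/x5a/home/*/home; do
#     printf '%s %s\n' "$(btrfs subvolume show "$snap" | received_uuid)" "$snap"
# done | duplicate_received_uuids
```

On the listings above, such a check would be expected to flag 107/snapshot, test1/home and test2/home on the target, since all three report the same Received UUID.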
* Re: send | receive: received snapshot is missing recent files
2017-09-07 4:43 ` Dave
@ 2017-09-07 6:24 ` A L
2017-09-07 12:39 ` Dave
0 siblings, 1 reply; 11+ messages in thread
From: A L @ 2017-09-07 6:24 UTC (permalink / raw)
To: Dave, linux-btrfs
The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
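If the diagnosis above applies, one way to look for it would be to test whether a source snapshot reports a Received UUID at all. This is only a sketch under that assumption; has_received_uuid is a hypothetical helper name, and the check relies solely on the "Received UUID:" line that "btrfs subvolume show" prints, where "-" means the field is unset.

```shell
# Hypothetical helper: succeed (exit 0) when `btrfs subvolume show` output
# on stdin reports a Received UUID other than "-", fail (exit 1) otherwise.
has_received_uuid() {
    awk '/Received UUID:/ { found = ($NF != "-") }
         END { exit found ? 0 : 1 }'
}

# Example use on a source snapshot from this thread (not run here):
# if btrfs subvolume show /home/.snapshots/test2/home | has_received_uuid; then
#     echo "warning: source snapshot carries a Received UUID" >&2
# fi
```

In the output quoted earlier in the thread, the source snapshot /home/.snapshots/test2/home does show a Received UUID, so this check would report it.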
* Re: send | receive: received snapshot is missing recent files
2017-09-07 6:24 ` A L
@ 2017-09-07 12:39 ` Dave
2017-09-07 13:34 ` Dave
0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-07 12:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: A L
Hello. Can anyone further explain this issue ("you have a Received
UUID on the source volume")? How does it happen? How does one remove a
Received UUID from the source volume?
And how does that explain my results where I showed that the problem
is not dependent upon the source volume but is instead dependent upon
some existing snapshot on the target volume?
My results do not appear to be fully explained by a Received UUID on
the source volume, as my prior message hopefully shows clearly.
Thank you.
On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
>
> ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
>
>> Here is more info and a possible (shocking) explanation. This
>> aggregates my prior messages and it provides an almost complete set of
>> steps to reproduce this problem.
>> >> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux >> btrfs-progs v4.12 >> >> My steps: >> >> [root@srv]# sync >> [root@srv]# mkdir /home/.snapshots/test1 >> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/ >> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home' >> [root@srv]# sync >> [root@srv]# mkdir /mnt/x5a/home/test1 >> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive >> /mnt/x5a/home/test1/ >> At subvol /home/.snapshots/test1/home/ >> At subvol home >> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/ >> NOTE: all recent files are present >> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/ >> NOTE: all recent files are present >> [root@srv]# mkdir /home/.snapshots/test2 >> [root@srv]# mkdir /mnt/x5a/home/test2 >> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/ >> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home' >> [root@srv]# sync >> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ >> /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/ >> At subvol /home/.snapshots/test2/home/ >> At snapshot home >> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/ >> NOTE: all recent files are MISSING >> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/ >> NOTE: all recent files are MISSING >> >> Below I am including some rsync output to illustrate when a snapshot >> is missing files (or not): >> >> [root@srv]# rsync -aniv /home/.snapshots/test1/home/ >> /home/.snapshots/test2/home/ >> sending incremental file list >> >> sent 1,143,286 bytes received 1,123 bytes 762,939.33 bytes/sec >> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN) >> >> This indicates that these two subvolumes contain the same files, which >> they should because test2 is a snapshot of test1 without any changes >> to files, and it was not sent to another physical device. 
>>
>> The problem is when test2 is sent to another device as shown by the
>> rsync results below.
>>
>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
>> sending incremental file list
>> .d..t...... ./
>> .d..t...... user1/
>> >f.st...... user1/.bash_history
>> >f.st...... user1/.bashrc
>> >f+++++++++ user1/test2017-09-06.txt
>> ...
>> and a long list of other missing files
>>
>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
>> missing all recent files (any files from the month of August or
>> September), as my prior visual inspections had indicated. The same
>> files are missing every time. There is no randomness to the missing
>> data.
>>
>> The problem does not happen for me if the receive command target is
>> located on the same physical device as shown next. (However, I suspect
>> there's more to it than that, as explained further below.)
>>
>> [root@srv]# mkdir /home/.snapshots/test2rec
>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /home/.snapshots/test2rec/
>> At subvol /home/.snapshots/test2/home/
>>
>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
>> sending incremental file list
>>
>> sent 1,143,286 bytes received 1,123 bytes 2,288,818.00 bytes/sec
>> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN)
>>
>> The above (as well as visual inspection of files) indicates that these
>> two subvolumes contain the same files, which was not the case when the
>> same command had a target located on another physical device. Of
>> course, a snapshot which resides on the same physical device is not a
>> very good backup. So I do need to send it to another device, but that
>> results in missing files when the -p or -c options are used with btrfs
>> send. (Non-incremental sending to another physical device does work.)
>>
>> I can think of a couple possible explanations.
>>
>> One is that there is a problem when using the -p or -c options with
>> btrfs send when the target is another physical device. I suspect this
>> is not the actual explanation, however.
>>
>> A second possibility is that the presence of prior existing snapshots
>> at the target location (even if old and not referenced in any current
>> btrfs command) can determine the outcome and final contents of an
>> incremental send operation. I believe the info below suggests this to
>> be the case.
>>
>> [root@srv]# btrfs su show /home/.snapshots/test2/home/
>> test2/home
>>     Name: home
>>     UUID: 292e8bbf-a95f-2a4e-8280-129202d389dc
>>     Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
>>     Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>     Creation time: 2017-09-06 15:38:16 -0400
>>     Subvolume ID: 2000
>>     Generation: 5020
>>     Gen at creation: 5020
>>     Parent ID: 257
>>     Top level ID: 257
>>     Flags: readonly
>>     Snapshot(s):
>>
>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
>> home/test1/home
>>     Name: home
>>     UUID: dc00b13d-f841-cf48-a169-aa61429a5679
>>     Parent UUID: -
>>     Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>     Creation time: 2017-09-06 15:33:45 -0400
>>     Subvolume ID: 656
>>     Generation: 777
>>     Gen at creation: 773
>>     Parent ID: 257
>>     Top level ID: 257
>>     Flags: readonly
>>     Snapshot(s):
>>
>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
>> home/test2/home
>>     Name: home
>>     UUID: b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>     Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>     Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>     Creation time: 2017-09-06 15:39:51 -0400
>>     Subvolume ID: 660
>>     Generation: 779
>>     Gen at creation: 779
>>     Parent ID: 257
>>     Top level ID: 257
>>     Flags: readonly
>>     Snapshot(s):
>>
>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>> test2rec/home
>>     Name: home
>>     UUID: bde1891d-1474-414f-b6ab-2a34c5af224e
>>     Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053
>>     Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>     Creation time: 2017-09-06 17:36:19 -0400
>>     Subvolume ID: 2003
>>     Generation: 5027
>>     Gen at creation: 5027
>>     Parent ID: 257
>>     Top level ID: 257
>>     Flags: readonly
>>     Snapshot(s):
>>
>> Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
>> device /mnt/x5a/home with a Received UUID that matches the Received
>> UUID of test snapshots that were newly created today. How? Why?
>>
>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>> home/107/snapshot
>>     Name: snapshot
>>     UUID: 94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>     Parent UUID: 8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>     Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74
>>     Creation time: 2017-07-21 00:00:25 -0400
>>     Subvolume ID: 433
>>     Generation: 222
>>     Gen at creation: 221
>>     Parent ID: 257
>>     Top level ID: 257
>>     Flags: readonly
>>     Snapshot(s):
>>
>> If my guess is correct, btrfs has found this old snapshot and
>> referenced it without me telling it to do so. The result is that the
>> newly executed btrfs commands shown above have a totally unexpected
>> result.
>>
>> Today's new snapshot will not contain any files newer than 2017-07-21.
>> Is this a known issue?
>>
>> Refer back to the commands at the top of this message. I created a new
>> snapshot and did a full (non-incremental) send to the target location
>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>> referenced the prior snapshot created today. Nowhere did I reference
>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>> this backup location -- it was intended to hold a lot of them.) Yet,
>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>> resulted in today's backup (and all recent backups) being worthless
>> due to them missing all files since 2017-07-21.
>>
>> These results are totally repeatable, given my set of existing
>> backups. But it's bizarre to me.
>> As I understand it, a staff person
>> could transfer a btrfs snapshot to a target volume and its mere
>> presence there could make all subsequent backups (incremental sends)
>> to that target volume invalid and useless. If that is true... wow.
>>
>> Another interesting observation is that the device that contains the
>> source snapshot, /home/.snapshots, also contains many, many prior
>> snapshots, going back to when this system was first set up. Why do
>> none of them cause a problem? Is it because I had never used
>> /home/.snapshots as the target of a receive operation (until I did so
>> today in testing the steps above)?
>>
>> As far as repeating these steps, all this was totally repeatable for
>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>> receive command (/mnt/x5a/home/). I do not know how to create such a
>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>> results.
>>
>> Maybe somebody can explain to me what's really happening. How is it
>> possible that an old snapshot created 2017-07-21 could have the same
>> Received UUID as snapshots created today? And how could that fact lead
>> to the result I'm seeing, which seems very serious? (Unexpected
>> missing files from a backup which was completed without errors is
>> pretty serious in my book.)
>>
>> Most important question: how can we rely on automated incremental
>> backups with btrfs send | receive given what I'm observing here
>> (assuming my observations are roughly correct)?
>>
>> Here's more info just to confirm that my results are not due to
>> filesystem corruption.
>>
>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>> Checking filesystem on /dev/mapper/x5a_luks
>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>> checking extents [o]
>> checking free space cache [.]
>> checking fs roots [o]
>> checking csums
>> checking root refs
>> found 258178555904 bytes used, no error found
>> total csum bytes: 250354776
>> total tree bytes: 1752088576
>> total fs tree bytes: 1308540928
>> total extent tree bytes: 175161344
>> btree space waste bytes: 215594634
>> file data blocks allocated: 258634637312
>> referenced 292888985600
>>
>> [root@srv]# btrfs fi show /mnt/x5a/
>> Label: 'x5a_top' uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>> Total devices 1 FS bytes used 240.45GiB
>> devid 1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>
>> [root@srv]# btrfs fi df /mnt/x5a/
>> Data, single: total=239.01GiB, used=238.82GiB
>> System, DUP: total=32.00MiB, used=48.00KiB
>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>
>> # btrfs scrub status -d /mnt/x5a/
>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>> scrub device /dev/mapper/x5a_luks (id 1) history
>> scrub started at Wed Sep 6 17:09:58 2017 and finished after 01:42:30
>> total bytes scrubbed: 242.08GiB with 0 errors
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
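
The `btrfs su show` listings quoted in this message show several otherwise unrelated snapshots carrying the same Received UUID. A minimal sketch for spotting such collisions on a backup target follows. The function names `received_uuid_of` and `find_shared_received_uuids` are hypothetical helpers, not part of btrfs-progs, and the sketch assumes subvolume paths contain no whitespace.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `btrfs subvolume show <path>` on
# stdin and print the value of its "Received UUID:" field ("-" if unset).
received_uuid_of() {
    awk '/Received UUID:/ { print $NF }'
}

# Hypothetical helper: read "<path> <received_uuid>" pairs on stdin and print
# each Received UUID carried by more than one subvolume, followed by the
# paths that share it. Entries whose UUID is "-" are ignored.
find_shared_received_uuids() {
    awk '
        $2 != "-" {
            count[$2]++
            paths[$2] = paths[$2] " " $1
        }
        END {
            for (u in count)
                if (count[u] > 1)
                    print u ":" paths[u]
        }
    '
}
```

One possible way to feed it, assuming the snapshot paths are known: for each path `p`, print `"$p $(btrfs subvolume show "$p" | received_uuid_of)"` and pipe the resulting lines into `find_shared_received_uuids`. Any line it prints names a group of subvolumes that share a Received UUID and therefore deserve a closer look before the next incremental send.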
* Re: send | receive: received snapshot is missing recent files
2017-09-07 12:39 ` Dave
@ 2017-09-07 13:34 ` Dave
2017-09-07 14:33 ` Axel Burri
0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-07 13:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: A L

I just ran a test. The btrfs send - receive problem I described is
indeed fully resolved by removing the "problematic" snapshot on the
target device. I did not make any changes to the source volume. I did
not make any other changes in my steps (see earlier message for my
exact steps).

Therefore, the problem I described in my earlier message is not due
exclusively to having a Received UUID on the source volume (or to any
other feature of the source volume). It is not related to any feature
of the directly specified parent volume either. More details are
included in my earlier email.

Thanks for any further feedback, including answers to my questions and
comments about whether this is a known issue.

On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
>
> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>
> How does it happen?
> How does one remove a Received UUID from the source volume?
>
> And how does that explain my results where I showed that the problem
> is not dependent upon the source volume but is instead dependent upon
> some existing snapshot on the target volume?
>
> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
>
> Thank you.
>
> On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
> > The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
> >
> > ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
> >
> >> Here is more info and a possible (shocking) explanation.
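
A before/after test like the one described in this message is easier to repeat with a scripted check. The sketch below counts the regular files that an `rsync -ani` dry run would still have to transfer, based on the `>f` prefix seen in the itemized output quoted earlier in the thread. The function name `count_pending_files` is a hypothetical helper, not an rsync feature.

```shell
#!/bin/sh
# Hypothetical helper: read the itemized output of `rsync -ani SRC/ DST/` on
# stdin and print how many regular files rsync would still transfer.
# Such lines start with ">f" (for example ">f+++++++++" for a file that is
# missing on DST). Directory-only timestamp lines (".d..t......") are ignored.
count_pending_files() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status at zero in that case.
    grep -c '^>f' || true
}
```

A possible invocation, using the paths from the earlier message: `rsync -ani /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/ | count_pending_files`. A result of 0 suggests that the received snapshot holds the same files as the source snapshot; anything else means files are missing or differ.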
* Re: send | receive: received snapshot is missing recent files
2017-09-07 13:34 ` Dave
@ 2017-09-07 14:33 ` Axel Burri
2017-09-08 4:44 ` Dave
0 siblings, 1 reply; 11+ messages in thread
From: Axel Burri @ 2017-09-07 14:33 UTC (permalink / raw)
To: Dave, linux-btrfs; +Cc: A L

Having a received_uuid set on the source volume ("/home" in your case)
is indeed a bad thing when it comes to send/receive. You probably
restored a backup with send/receive, and made it read/write using
"btrfs property set -ts /home ro false". This is an evil thing, as it
leaves received_uuid intact. To make a subvolume read-write, I
recommend using "btrfs subvolume snapshot <ro-subvol> <rw-subvol>".

There is an entry in the btrbk FAQ on how to fix this:

https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set

On 2017-09-07 15:34, Dave wrote:
> I just ran a test. The btrfs send - receive problem I described is
> indeed fully resolved by removing the "problematic" snapshot on the
> target device. I did not make any changes to the source volume. I did
> not make any other changes in my steps (see earlier message for my
> exact steps).
>
> Therefore, the problem I described in my earlier message is not due
> exclusively to having a Received UUID on the source volume (or to any
> other feature of the source volume). It is not related to any feature
> of the directly specified parent volume either. More details are
> included in my earlier email.
>
> Thanks for any further feedback, including answers to my questions and
> comments about whether this is a known issue.
>
>
> On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
>>
>> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>>
>> How does it happen?
>> How does one remove a Received UUID from the source volume?
* Re: send | receive: received snapshot is missing recent files
2017-09-07 14:33 ` Axel Burri
@ 2017-09-08 4:44 ` Dave
2017-09-11 17:53 ` Axel Burri
0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-08 4:44 UTC (permalink / raw)
To: linux-btrfs; +Cc: A L, Axel Burri

I'm referring to the link below. Using "btrfs subvolume snapshot -r"
copies the Received UUID from the source into the new snapshot. The
btrbk FAQ entry suggests otherwise. Has something changed?

The only way I see to remove a Received UUID is to create a rw
snapshot (above command without the "-r"), which is not ideal in this
situation when cleaning up readonly source snapshots.

Any suggestions? Thanks

On Thu, Sep 7, 2017 at 10:33 AM, Axel Burri <axel@tty0.ch> wrote:
>
> Having a received_uuid set on the source volume ("/home" in your case)
> is indeed a bad thing when it comes to send/receive. You probably
> restored a backup with send/receive, and made it read/write using "btrfs
> property set -ts /home ro false". This is a an evil thing, as it leaves
> received_uuid intact. In order to make a subvolume read-write, I
> recommend to use "btrfs subvolume snapshot <ro-subvol> <rw-subvol>".
>
> There is a FAQ entry on btrbk on how to fix this:
>
> https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set
>
>
> On 2017-09-07 15:34, Dave wrote:
> > I just ran a test. The btrfs send - receive problem I described is
> > indeed fully resolved by removing the "problematic" snapshot on the
> > target device. I did not make any changes to the source volume. I did
> > not make any other changes in my steps (see earlier message for my
> > exact steps).
> >
> > Therefore, the problem I described in my earlier message is not due
> > exclusively to having a Received UUID on the source volume (or to any
> > other feature of the source volume). It is not related to any feature
> > of the directly specified parent volume either.
> > More details are included in my earlier email.
> >
> > Thanks for any further feedback, including answers to my questions and
> > comments about whether this is a known issue.
> >
> >
> > On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
> >>
> >> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
> >>
> >> How does it happen?
> >> How does one remove a Received UUID from the source volume?
> >>
> >> And how does that explain my results where I showed that the problem
> >> is not dependent upon the source volume but is instead dependent upon
> >> some existing snapshot on the target volume?
> >>
> >> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
> >>
> >> Thank you.
> >>
> >> On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
> >>> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
> >>>
> >>> ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
> >>>
> >>>> Here is more info and a possible (shocking) explanation. This
> >>>> aggregates my prior messages and it provides an almost complete set of
> >>>> steps to reproduce this problem.
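The question in this message turns on which subvolumes carry a Received UUID. As a rough sketch (not something posted in the thread), the helpers below pull that field out of `btrfs subvolume show` output, so a backup script could warn before sending from a source that has one set. The function names `received_uuid` and `has_received_uuid` are invented for illustration, and the parsing assumes the field layout seen in the `btrfs su show` outputs quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of btrfs-progs.

# Read `btrfs subvolume show` output on stdin and print the value of the
# "Received UUID" field. btrfs prints "-" when the field is unset.
received_uuid() {
    awk '/Received UUID:/ { print $NF; exit }'
}

# Succeed (exit status 0) when the given value is an actual UUID,
# i.e. neither empty nor the "-" placeholder.
has_received_uuid() {
    [ -n "$1" ] && [ "$1" != "-" ]
}

# Example use (needs root and a real subvolume, so it is left commented out):
#   uuid=$(btrfs subvolume show /home | received_uuid)
#   if has_received_uuid "$uuid"; then
#       echo "warning: /home has Received UUID $uuid" >&2
#   fi
```

Keeping the parser separate from the `btrfs` call means it can be exercised on saved output, such as the listings quoted earlier in the thread, without touching a live filesystem.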
* Re: send | receive: received snapshot is missing recent files
2017-09-08 4:44 ` Dave
@ 2017-09-11 17:53 ` Axel Burri
2017-09-12 3:19 ` Andrei Borzenkov
0 siblings, 1 reply; 11+ messages in thread
From: Axel Burri @ 2017-09-11 17:53 UTC (permalink / raw)
To: Dave, linux-btrfs; +Cc: A L

On 2017-09-08 06:44, Dave wrote:
> I'm referring to the link below. Using "btrfs subvolume snapshot -r"
> copies the Received UUID from the source into the new snapshot. The
> btrbk FAQ entry suggests otherwise. Has something changed?

I don't think anything has changed; the description for read-only
subvolumes in the btrbk FAQ was simply wrong (fixed now).

> The only way I see to remove a Received UUID is to create a rw
> snapshot (above command without the "-r"), which is not ideal in this
> situation when cleaning up readonly source snapshots.
>
> Any suggestions? Thanks

No suggestions on my part. As far as I know, there is no easy way to
remove or change the received_uuid of a subvolume. As you mentioned,
you can snapshot it twice:

# btrfs subvolume snapshot mysubvol mysubvol.rw
# btrfs subvolume delete mysubvol
# btrfs subvolume snapshot -r mysubvol.rw mysubvol
# btrfs subvolume delete mysubvol.rw

Instead of the second snapshot operation, you could also use the
(evil) command: "btrfs property set -ts mysnapshot ro true"

> On Thu, Sep 7, 2017 at 10:33 AM, Axel Burri <axel@tty0.ch> wrote:
>>
>> Having a received_uuid set on the source volume ("/home" in your case)
>> is indeed a bad thing when it comes to send/receive. You probably
>> restored a backup with send/receive, and made it read/write using "btrfs
>> property set -ts /home ro false". This is a an evil thing, as it leaves
>> received_uuid intact. In order to make a subvolume read-write, I
>> recommend to use "btrfs subvolume snapshot <ro-subvol> <rw-subvol>".
>>
>> There is a FAQ entry on btrbk on how to fix this:
>>
>> https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set
>>
>>
>> On 2017-09-07 15:34, Dave wrote:
>>> I just ran a test. The btrfs send - receive problem I described is
>>> indeed fully resolved by removing the "problematic" snapshot on the
>>> target device. I did not make any changes to the source volume. I did
>>> not make any other changes in my steps (see earlier message for my
>>> exact steps).
>>>
>>> Therefore, the problem I described in my earlier message is not due
>>> exclusively to having a Received UUID on the source volume (or to any
>>> other feature of the source volume). It is not related to any feature
>>> of the directly specified parent volume either. More details are
>>> included in my earlier email.
>>>
>>> Thanks for any further feedback, including answers to my questions and
>>> comments about whether this is a known issue.
>>>
>>>
>>> On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
>>>>
>>>> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>>>>
>>>> How does it happen?
>>>> How does one remove a Received UUID from the source volume?
>>>>
>>>> And how does that explain my results where I showed that the problem
>>>> is not dependent upon the source volume but is instead dependent upon
>>>> some existing snapshot on the target volume?
>>>>
>>>> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
>>>>
>>>> Thank you.
>>>>
>>>> On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
>>>>> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
>>>>>
>>>>> ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
>>>>>
>>>>>> Here is more info and a possible (shocking) explanation.
This >>>>>> aggregates my prior messages and it provides an almost complete set of >>>>>> steps to reproduce this problem. >>>>>> >>>>>> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux >>>>>> btrfs-progs v4.12 >>>>>> >>>>>> My steps: >>>>>> >>>>>> [root@srv]# sync >>>>>> [root@srv]# mkdir /home/.snapshots/test1 >>>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/ >>>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home' >>>>>> [root@srv]# sync >>>>>> [root@srv]# mkdir /mnt/x5a/home/test1 >>>>>> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive >>>>>> /mnt/x5a/home/test1/ >>>>>> At subvol /home/.snapshots/test1/home/ >>>>>> At subvol home >>>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/ >>>>>> NOTE: all recent files are present >>>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/ >>>>>> NOTE: all recent files are present >>>>>> [root@srv]# mkdir /home/.snapshots/test2 >>>>>> [root@srv]# mkdir /mnt/x5a/home/test2 >>>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/ >>>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home' >>>>>> [root@srv]# sync >>>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ >>>>>> /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/ >>>>>> At subvol /home/.snapshots/test2/home/ >>>>>> At snapshot home >>>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/ >>>>>> NOTE: all recent files are MISSING >>>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/ >>>>>> NOTE: all recent files are MISSING >>>>>> >>>>>> Below I am including some rsync output to illustrate when a snapshot >>>>>> is missing files (or not): >>>>>> >>>>>> [root@srv]# rsync -aniv /home/.snapshots/test1/home/ >>>>>> /home/.snapshots/test2/home/ >>>>>> sending incremental file list >>>>>> >>>>>> sent 1,143,286 bytes received 1,123 bytes 762,939.33 bytes/sec >>>>>> total size is 3,642,972,271 speedup is 
3,183.28 (DRY RUN) >>>>>> >>>>>> This indicates that these two subvolumes contain the same files, which >>>>>> they should because test2 is a snapshot of test1 without any changes >>>>>> to files, and it was not sent to another physical device. >>>>>> >>>>>> The problem is when test2 is sent to another device as shown by the >>>>>> rsync results below. >>>>>> >>>>>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/ >>>>>> sending incremental file list >>>>>> .d..t...... ./ >>>>>> .d..t...... user1/ >>>>>>> f.st...... user1/.bash_history >>>>>>> f.st...... user1/.bashrc >>>>>>> f+++++++++ user1/test2017-09-06.txt >>>>>> ... >>>>>> and a long list of other missing files >>>>>> >>>>>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is >>>>>> missing all recent files (any files from the month of August or >>>>>> September), as my prior visual inspections had indicated. The same >>>>>> files are missing every time. There is no randomness to the missing >>>>>> data. >>>>>> >>>>>> The problem does not happen for me if the receive command target is >>>>>> located on the same physical device as shown next. (However, I suspect >>>>>> there's more to it than that, as explained further below.) >>>>>> >>>>>> [root@srv]# mkdir /home/.snapshots/test2rec >>>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ >>>>>> /home/.snapshots/test2/home/ | btrfs receive >>>>>> /home/.snapshots/test2rec/ >>>>>> At subvol /home/.snapshots/test2/home/ >>>>>> >>>>>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/ >>>>>> sending incremental file list >>>>>> >>>>>> sent 1,143,286 bytes received 1,123 bytes 2,288,818.00 bytes/sec >>>>>> total size is 3,642,972,271 speedup is 3,183.28 (DRY RUN) >>>>>> >>>>>> The above (as well as visual inspection of files) indicates that these >>>>>> two subvolumes contain the same files, which was not the case when the >>>>>> same command had a target located on another physical device. 
Of >>>>>> course, a snapshot which resides on the same physical device is not a >>>>>> very good backup. So I do need to send it to another device, but that >>>>>> results in missing files when the -p or -c options are used with btrfs >>>>>> send. (Non-incremental sending to another physical device does work.) >>>>>> >>>>>> I can think of a couple possible explanations. >>>>>> >>>>>> One is that there is a problem when using the -p or -c options with >>>>>> btrfs send when the target is another physical device. I suspect this >>>>>> is the actual explanation, however. >>>>>> >>>>>> A second possibility is that the presence of prior existing snapshots >>>>>> at the target location (even if old and not referenced in any current >>>>>> btrfs command), can determine the outcome and final contents of an >>>>>> incremental send operation. I believe the info below suggests this to >>>>>> be the case. >>>>>> >>>>>> [root@srv]# btrfs su show /home/.snapshots/test2/home/ >>>>>> test2/home >>>>>> Name: home >>>>>> UUID: 292e8bbf-a95f-2a4e-8280-129202d389dc >>>>>> Parent UUID: 62418df6-a1f8-d74a-a152-11f519593053 >>>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74 >>>>>> Creation time: 2017-09-06 15:38:16 -0400 >>>>>> Subvolume ID: 2000 >>>>>> Generation: 5020 >>>>>> Gen at creation: 5020 >>>>>> Parent ID: 257 >>>>>> Top level ID: 257 >>>>>> Flags: readonly >>>>>> Snapshot(s): >>>>>> >>>>>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home >>>>>> home/test1/home >>>>>> Name: home >>>>>> UUID: dc00b13d-f841-cf48-a169-aa61429a5679 >>>>>> Parent UUID: - >>>>>> Received UUID: e00d5318-6efd-824e-ac91-f25efa5c2a74 >>>>>> Creation time: 2017-09-06 15:33:45 -0400 >>>>>> Subvolume ID: 656 >>>>>> Generation: 777 >>>>>> Gen at creation: 773 >>>>>> Parent ID: 257 >>>>>> Top level ID: 257 >>>>>> Flags: readonly >>>>>> Snapshot(s): >>>>>> >>>>>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/ >>>>>> home/test2/home >>>>>> Name: home >>>>>> UUID: 
b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>>>>>     Parent UUID:      8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>>>>     Received UUID:    e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>     Creation time:    2017-09-06 15:39:51 -0400
>>>>>>     Subvolume ID:     660
>>>>>>     Generation:       779
>>>>>>     Gen at creation:  779
>>>>>>     Parent ID:        257
>>>>>>     Top level ID:     257
>>>>>>     Flags:            readonly
>>>>>>     Snapshot(s):
>>>>>>
>>>>>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>>>>>> test2rec/home
>>>>>>     Name:             home
>>>>>>     UUID:             bde1891d-1474-414f-b6ab-2a34c5af224e
>>>>>>     Parent UUID:      62418df6-a1f8-d74a-a152-11f519593053
>>>>>>     Received UUID:    e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>     Creation time:    2017-09-06 17:36:19 -0400
>>>>>>     Subvolume ID:     2003
>>>>>>     Generation:       5027
>>>>>>     Gen at creation:  5027
>>>>>>     Parent ID:        257
>>>>>>     Top level ID:     257
>>>>>>     Flags:            readonly
>>>>>>     Snapshot(s):
>>>>>>
>>>>>> Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
>>>>>> device /mnt/x5a/home with a Received UUID that matches the Received
>>>>>> UUID of test snapshots that were newly created today. How? Why?
>>>>>>
>>>>>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>>>>>> home/107/snapshot
>>>>>>     Name:             snapshot
>>>>>>     UUID:             94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>>>>>     Parent UUID:      8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>>>>     Received UUID:    e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>     Creation time:    2017-07-21 00:00:25 -0400
>>>>>>     Subvolume ID:     433
>>>>>>     Generation:       222
>>>>>>     Gen at creation:  221
>>>>>>     Parent ID:        257
>>>>>>     Top level ID:     257
>>>>>>     Flags:            readonly
>>>>>>     Snapshot(s):
>>>>>>
>>>>>> If my guess is correct, btrfs has found this old snapshot and
>>>>>> referenced it without me telling it to do so. The result is that the
>>>>>> newly executed btrfs commands shown above have a totally unexpected
>>>>>> result.
>>>>>>
>>>>>> Today's new snapshot will not contain any files newer than 2017-07-21.
>>>>>> Is this a known issue?
>>>>>>
>>>>>> Refer back to the commands at the top of this message. I created a new
>>>>>> snapshot and did a full (non-incremental) send to the target location
>>>>>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>>>>>> referenced the prior snapshot created today. Nowhere did I reference
>>>>>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>>>>>> this backup location -- it was intended to hold a lot of them.) Yet,
>>>>>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>>>>>> resulted in today's backup (and all recent backups) being worthless
>>>>>> due to them missing all files since 2017-07-21.
>>>>>>
>>>>>> These results are totally repeatable, given my set of existing
>>>>>> backups. But it's bizarre to me. As I understand it, a staff person
>>>>>> could transfer a btrfs snapshot to a target volume and its mere
>>>>>> presence there could make all subsequent backups (incremental sends)
>>>>>> to that target volume invalid and useless. If that is true... wow.
>>>>>>
>>>>>> Another interesting observation is that the device that contains the
>>>>>> source snapshot, /home/.snapshots, also contains many, many prior
>>>>>> snapshots, going back to when this system was first set up. Why do
>>>>>> none of them cause a problem? Is it because I had never used
>>>>>> /home/.snapshots as the target of a receive operation (until I did so
>>>>>> today in testing the steps above)?
>>>>>>
>>>>>> As far as repeating these steps, all this was totally repeatable for
>>>>>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>>>>>> receive command (/mnt/x5a/home/). I do not know how to create such a
>>>>>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>>>>>> results.
>>>>>>
>>>>>> Maybe somebody can explain to me what's really happening. How is it
>>>>>> possible that an old snapshot created 2017-07-21 could have the same
>>>>>> Received UUID as snapshots created today? And how could that fact lead
>>>>>> to the result I'm seeing, which seems very serious? (Unexpected
>>>>>> missing files from a backup which was completed without errors is
>>>>>> pretty serious in my book.)
>>>>>>
>>>>>> Most important question: how can we rely on automated incremental
>>>>>> backups with btrfs send | receive given what I'm observing here
>>>>>> (assuming my observations are roughly correct)?
>>>>>>
>>>>>> Here's more info just to confirm that my results are not due to
>>>>>> filesystem corruption.
>>>>>>
>>>>>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>>>>>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>>>>>> Checking filesystem on /dev/mapper/x5a_luks
>>>>>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>>> checking extents [o]
>>>>>> checking free space cache [.]
>>>>>> checking fs roots [o]
>>>>>> checking csums
>>>>>> checking root refs
>>>>>> found 258178555904 bytes used, no error found
>>>>>> total csum bytes: 250354776
>>>>>> total tree bytes: 1752088576
>>>>>> total fs tree bytes: 1308540928
>>>>>> total extent tree bytes: 175161344
>>>>>> btree space waste bytes: 215594634
>>>>>> file data blocks allocated: 258634637312
>>>>>>  referenced 292888985600
>>>>>>
>>>>>> [root@srv]# btrfs fi show /mnt/x5a/
>>>>>> Label: 'x5a_top'  uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>>>     Total devices 1 FS bytes used 240.45GiB
>>>>>>     devid 1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>>>>>
>>>>>> [root@srv]# btrfs fi df /mnt/x5a/
>>>>>> Data, single: total=239.01GiB, used=238.82GiB
>>>>>> System, DUP: total=32.00MiB, used=48.00KiB
>>>>>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>>>>>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>>>>>
>>>>>> # btrfs scrub status -d /mnt/x5a/
>>>>>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>>> scrub device /dev/mapper/x5a_luks (id 1) history
>>>>>>     scrub started at Wed Sep 6 17:09:58 2017 and finished after 01:42:30
>>>>>>     total bytes scrubbed: 242.08GiB with 0 errors
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply [flat|nested] 11+ messages in thread
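The quoted message above traces the stale backups to several subvolumes on the target sharing one Received UUID. A minimal sketch of how such collisions might be spotted before running an incremental send is shown here. Both function names are hypothetical helpers, not part of btrfs-progs, and the sketch assumes that `btrfs subvolume show` prints a `Received UUID:` line (with `-` when unset), as in the output quoted above, and that subvolume paths contain no whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print "<path> <received-uuid>" for each subvolume given.
# Assumes the "Received UUID:" line format shown in the quoted output above.
received_uuid_pairs() {
    for subvol in "$@"; do
        uuid=$(btrfs subvolume show "$subvol" | awk '/Received UUID:/ {print $3}')
        # "-" stands for "no Received UUID set" (or lookup failed)
        printf '%s %s\n' "$subvol" "${uuid:--}"
    done
}

# Hypothetical helper: read "<path> <received-uuid>" pairs on stdin and
# print every Received UUID that is carried by more than one subvolume.
find_duplicate_received_uuids() {
    awk '
        $2 != "-" && $2 != "" {
            count[$2]++
            paths[$2] = (paths[$2] == "" ? $1 : paths[$2] " " $1)
        }
        END {
            for (uuid in count)
                if (count[uuid] > 1)
                    print uuid ": " paths[uuid]
        }
    ' | sort
}
```

On a layout like the one in this thread it could be invoked as `received_uuid_pairs /mnt/x5a/home/*/snapshot | find_duplicate_received_uuids`. Any line of output names a Received UUID that more than one subvolume carries, which is the condition reported for /mnt/x5a/home/107/snapshot.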
* Re: send | receive: received snapshot is missing recent files
  2017-09-11 17:53 ` Axel Burri
@ 2017-09-12 3:19 ` Andrei Borzenkov
  2017-09-13 16:52 ` Dave
  0 siblings, 1 reply; 11+ messages in thread
From: Andrei Borzenkov @ 2017-09-12 3:19 UTC (permalink / raw)
To: Axel Burri, Dave, linux-btrfs; +Cc: A L

11.09.2017 20:53, Axel Burri wrote:
> On 2017-09-08 06:44, Dave wrote:
>> I'm referring to the link below. Using "btrfs subvolume snapshot -r"
>> copies the Received UUID from the source into the new snapshot. The
>> btrbk FAQ entry suggests otherwise. Has something changed?
>
> I don't think something has changed, the description for the read-only
> subvolumes on the btrbk FAQ was just wrong (fixed now).
>
>> The only way I see to remove a Received UUID is to create a rw
>> snapshot (above command without the "-r"), which is not ideal in this
>> situation when cleaning up readonly source snapshots.
>>
>> Any suggestions? Thanks
>
> No suggestions from my part, as far as I know there is no way to easily
> remove/change a received_uuid from a subvolume.
>

There is the BTRFS_IOC_SET_RECEIVED_SUBVOL ioctl, which is used by "btrfs
receive". My understanding is that it can also be set to empty (thus
clearing it). You could write a small program to do it.

In general it sounds like a bug - removing the read-only flag from a
subvolume by any means should also clear the Received UUID, as we can no
longer guarantee that the subvolume content is the same.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: send | receive: received snapshot is missing recent files
  2017-09-12 3:19 ` Andrei Borzenkov
@ 2017-09-13 16:52 ` Dave
  0 siblings, 0 replies; 11+ messages in thread
From: Dave @ 2017-09-13 16:52 UTC (permalink / raw)
To: Andrei Borzenkov; +Cc: Axel Burri, linux-btrfs, A L

On Mon, Sep 11, 2017 at 11:19 PM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> 11.09.2017 20:53, Axel Burri wrote:
>> On 2017-09-08 06:44, Dave wrote:
>>> I'm referring to the link below. Using "btrfs subvolume snapshot -r"
>>> copies the Received UUID from the source into the new snapshot. The
>>> btrbk FAQ entry suggests otherwise. Has something changed?
>>
>> I don't think something has changed, the description for the read-only
>> subvolumes on the btrbk FAQ was just wrong (fixed now).
>>
>>> The only way I see to remove a Received UUID is to create a rw
>>> snapshot (above command without the "-r"), which is not ideal in this
>>> situation when cleaning up readonly source snapshots.
>>>
>>> Any suggestions? Thanks
>>
>> No suggestions from my part, as far as I know there is no way to easily
>> remove/change a received_uuid from a subvolume.
>>
>
> There is the BTRFS_IOC_SET_RECEIVED_SUBVOL ioctl, which is used by "btrfs
> receive". My understanding is that it can also be set to empty (thus
> clearing it). You could write a small program to do it.
>
> In general it sounds like a bug - removing the read-only flag from a
> subvolume by any means should also clear the Received UUID, as we can no
> longer guarantee that the subvolume content is the same.

Yes! That makes a great deal of sense.

^ permalink raw reply [flat|nested] 11+ messages in thread
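The workaround mentioned in the quoted text (a read-write snapshot does not carry the Received UUID over) could be scripted roughly as follows. This is only a sketch: `strip_received_uuid` and the `DRY_RUN` variable are hypothetical names introduced here, and the claim that a read-write snapshot comes out without a Received UUID is taken from the thread, not verified independently. The resulting snapshot is a new subvolume with its own UUID, so it is not a drop-in replacement for the original in an existing incremental chain.

```shell
#!/bin/sh
# Hypothetical helper, not part of btrfs-progs.
# Usage: strip_received_uuid <source-snapshot> <new-snapshot>
# Set DRY_RUN=1 to print the btrfs commands instead of executing them.
strip_received_uuid() {
    src=$1
    dst=$2
    if [ "${DRY_RUN:-0}" = 1 ]; then run=echo; else run=; fi

    # Read-write snapshot: per the thread, this does not copy the Received UUID.
    $run btrfs subvolume snapshot "$src" "$dst" || return 1

    # Flip the new snapshot back to read-only so it can be used with btrfs send.
    $run btrfs property set -ts "$dst" ro true
}
```

Running it first with `DRY_RUN=1` shows the two commands that would be issued. Afterwards, `btrfs subvolume show` on the new snapshot can confirm whether the Received UUID field is really empty on a given kernel.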
end of thread, other threads:[~2017-09-13 16:53 UTC | newest]
Thread overview: 11+ messages
2017-09-06 5:37 send | receive: received snapshot is missing recent files Dave
[not found] ` <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>
2017-09-06 19:46 ` Dave
2017-09-07 4:43 ` Dave
2017-09-07 6:24 ` A L
2017-09-07 12:39 ` Dave
2017-09-07 13:34 ` Dave
2017-09-07 14:33 ` Axel Burri
2017-09-08 4:44 ` Dave
2017-09-11 17:53 ` Axel Burri
2017-09-12 3:19 ` Andrei Borzenkov
2017-09-13 16:52 ` Dave