* Incremental send/receive broken after snapshot restore
@ 2018-06-28 20:09 Hannes Schweizer
2018-06-29 17:44 ` Andrei Borzenkov
0 siblings, 1 reply; 9+ messages in thread
From: Hannes Schweizer @ 2018-06-28 20:09 UTC (permalink / raw)
To: linux-btrfs
Hi,
Here's my environment:
Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64
Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux
btrfs-progs v4.17
Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390
Total devices 2 FS bytes used 3.16TiB
devid 1 size 1.82TiB used 1.58TiB path /dev/mapper/online0
devid 2 size 1.82TiB used 1.58TiB path /dev/mapper/online1
Data, RAID0: total=3.16TiB, used=3.15TiB
System, RAID0: total=16.00MiB, used=240.00KiB
Metadata, RAID0: total=7.00GiB, used=4.91GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29
Total devices 2 FS bytes used 3.52TiB
devid 1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0
devid 2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1
Data, RAID1: total=3.52TiB, used=3.52TiB
System, RAID1: total=8.00MiB, used=512.00KiB
Metadata, RAID1: total=6.00GiB, used=5.11GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0
Total devices 1 FS bytes used 3.65TiB
devid 1 size 5.46TiB used 3.66TiB path
/dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc
Data, single: total=3.64TiB, used=3.64TiB
System, DUP: total=32.00MiB, used=448.00KiB
Metadata, DUP: total=11.00GiB, used=9.72GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
The following automatic backup scheme is in place:
hourly:
btrfs sub snap -r online/root online/root.<date>
daily:
btrfs sub snap -r online/root online/root.<new_offline_reference>
btrfs send -c online/root.<old_offline_reference>
online/root.<new_offline_reference> | btrfs receive offline
btrfs sub del -c online/root.<old_offline_reference>
monthly:
btrfs sub snap -r online/root online/root.<new_external_reference>
btrfs send -c online/root.<old_external_reference>
online/root.<new_external_reference> | btrfs receive external
btrfs sub del -c online/root.<old_external_reference>
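For completeness, a simplified sketch of how such a job could be wired together (the date format and variable names here are only illustrative, error handling omitted; the <...> placeholders stand for the generated snapshot names as above):
# monthly job, simplified sketch
new="online/root.$(date +%FT%H-%M-%S)"        # <new_external_reference>
old="online/root.<old_external_reference>"    # reference kept from the previous run
btrfs sub snap -r online/root "$new"
btrfs send -c "$old" "$new" | btrfs receive external
btrfs sub del -c "$old"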
Now here are the commands leading up to my problem:
After the online filesystem suddenly went ro, and btrfs check showed
massive problems, I decided to start the online array from scratch:
1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0
/dev/mapper/online1
As you can see from the backup commands above, the snapshots of
offline and external are not related, so in order to at least keep the
extensive backlog of the external snapshot set (including all
reflinks), I decided to restore the latest snapshot from external.
2: btrfs send external/root.<external_reference> | btrfs receive online
I wanted to ensure I could restart the incremental backup flow from
online to external, so I did this:
3: mv online/root.<external_reference> online/root
4: btrfs sub snap -r online/root online/root.<external_reference>
5: btrfs property set online/root ro false
Now, I naively expected that a simple restart of my automatic backups
for external would work.
However, after running
6: btrfs sub snap -r online/root online/root.<new_external_reference>
7: btrfs send -c online/root.<old_external_reference>
online/root.<new_external_reference> | btrfs receive external
I see the following error:
ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file
or directory
That is unfortunate, but it was actually the second problem that
encouraged me to post this message.
As planned, I had to start the offline array from scratch as well,
because I no longer had any reference snapshot for incremental backups
on other devices:
8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0
/dev/mapper/offline1
However, restarting the automatic daily backup flow bails out with a
similar error, although no potentially problematic previous
incremental snapshots should be involved here!
ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory
I'm a bit lost now. The only thing I could imagine that might be
confusing for btrfs is the residual "Received UUID" of
online/root.<external_reference> after command 2.
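For reference, the residual value is visible with something like:
btrfs sub show online/root.<external_reference>   # prints the "Received UUID:" field
btrfs sub li -qRu online                          # -R lists received_uuid per subvolume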
What's the recommended way to restore snapshots with send/receive
without breaking subsequent incremental backups (including reflinks of
existing backups)?
Any hints appreciated...
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: Incremental send/receive broken after snapshot restore 2018-06-28 20:09 Incremental send/receive broken after snapshot restore Hannes Schweizer @ 2018-06-29 17:44 ` Andrei Borzenkov [not found] ` <CAOfGOYyFcQ5gN7z=4zEaGH0VMVUuFE5qiGwgF+c14FU228Y3iQ@mail.gmail.com> 0 siblings, 1 reply; 9+ messages in thread From: Andrei Borzenkov @ 2018-06-29 17:44 UTC (permalink / raw) To: Hannes Schweizer, linux-btrfs 28.06.2018 23:09, Hannes Schweizer пишет: > Hi, > > Here's my environment: > Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 > Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux > btrfs-progs v4.17 > > Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 > Total devices 2 FS bytes used 3.16TiB > devid 1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 > devid 2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 > Data, RAID0: total=3.16TiB, used=3.15TiB > System, RAID0: total=16.00MiB, used=240.00KiB > Metadata, RAID0: total=7.00GiB, used=4.91GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 > Total devices 2 FS bytes used 3.52TiB > devid 1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 > devid 2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 > Data, RAID1: total=3.52TiB, used=3.52TiB > System, RAID1: total=8.00MiB, used=512.00KiB > Metadata, RAID1: total=6.00GiB, used=5.11GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 > Total devices 1 FS bytes used 3.65TiB > devid 1 size 5.46TiB used 3.66TiB path > /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc > Data, single: total=3.64TiB, used=3.64TiB > System, DUP: total=32.00MiB, used=448.00KiB > Metadata, DUP: total=11.00GiB, used=9.72GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > > The following automatic backup scheme is in place: > hourly: > btrfs sub snap -r online/root online/root.<date> > > daily: > btrfs sub snap -r online/root online/root.<new_offline_reference> > btrfs send -c online/root.<old_offline_reference> > online/root.<new_offline_reference> | btrfs receive offline > btrfs sub del -c online/root.<old_offline_reference> > > monthly: > btrfs sub snap -r online/root online/root.<new_external_reference> > btrfs send -c online/root.<old_external_reference> > online/root.<new_external_reference> | btrfs receive external > btrfs sub del -c online/root.<old_external_reference> > > Now here are the commands leading up to my problem: > After the online filesystem suddenly went ro, and btrfs check showed > massive problems, I decided to start the online array from scratch: > 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 > /dev/mapper/online1 > > As you can see from the backup commands above, the snapshots of > offline and external are not related, so in order to at least keep the > extensive backlog of the external snapshot set (including all > reflinks), I decided to restore the latest snapshot from external. > 2: btrfs send external/root.<external_reference> | btrfs receive online > > I wanted to ensure I can restart the incremental backup flow from > online to external, so I did this > 3: mv online/root.<external_reference> online/root > 4: btrfs sub snap -r online/root online/root.<external_reference> > 5: btrfs property set online/root ro false > > Now, I naively expected a simple restart of my automatic backups for > external should work. 
> However after running > 6: btrfs sub snap -r online/root online/root.<new_external_reference> > 7: btrfs send -c online/root.<old_external_reference> > online/root.<new_external_reference> | btrfs receive external You just recreated your "online" filesystem from scratch. Where "old_external_reference" comes from? You did not show steps used to create it. > I see the following error: > ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file > or directory > > Which is unfortunate, but the second problem actually encouraged me to > post this message. > As planned, I had to start the offline array from scratch as well, > because I no longer had any reference snapshot for incremental backups > on other devices: > 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 > /dev/mapper/offline1 > > However restarting the automatic daily backup flow bails out with a > similar error, although no potentially problematic previous > incremental snapshots should be involved here! > ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory > Again - before you can *re*start incremental-forever sequence you need initial full copy. How exactly did you restart it if no snapshots exist either on source or on destination? > I'm a bit lost now. The only thing I could image which might be > confusing for btrfs, > is the residual "Received UUID" of online/root.<external_reference> > after command 2. > What's the recommended way to restore snapshots with send/receive > without breaking subsequent incremental backups (including reflinks of > existing backups)? > > Any hints appreciated... > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <CAOfGOYyFcQ5gN7z=4zEaGH0VMVUuFE5qiGwgF+c14FU228Y3iQ@mail.gmail.com>]
* Re: Incremental send/receive broken after snapshot restore [not found] ` <CAOfGOYyFcQ5gN7z=4zEaGH0VMVUuFE5qiGwgF+c14FU228Y3iQ@mail.gmail.com> @ 2018-06-30 6:24 ` Andrei Borzenkov 2018-06-30 17:49 ` Hannes Schweizer 0 siblings, 1 reply; 9+ messages in thread From: Andrei Borzenkov @ 2018-06-30 6:24 UTC (permalink / raw) To: Hannes Schweizer, linux-btrfs@vger.kernel.org Do not reply privately to mails on list. 29.06.2018 22:10, Hannes Schweizer пишет: > On Fri, Jun 29, 2018 at 7:44 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote: >> >> 28.06.2018 23:09, Hannes Schweizer пишет: >>> Hi, >>> >>> Here's my environment: >>> Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 >>> Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux >>> btrfs-progs v4.17 >>> >>> Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 >>> Total devices 2 FS bytes used 3.16TiB >>> devid 1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 >>> devid 2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 >>> Data, RAID0: total=3.16TiB, used=3.15TiB >>> System, RAID0: total=16.00MiB, used=240.00KiB >>> Metadata, RAID0: total=7.00GiB, used=4.91GiB >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 >>> Total devices 2 FS bytes used 3.52TiB >>> devid 1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 >>> devid 2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 >>> Data, RAID1: total=3.52TiB, used=3.52TiB >>> System, RAID1: total=8.00MiB, used=512.00KiB >>> Metadata, RAID1: total=6.00GiB, used=5.11GiB >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 >>> Total devices 1 FS bytes used 3.65TiB >>> devid 1 size 5.46TiB used 3.66TiB path >>> /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc >>> Data, single: total=3.64TiB, used=3.64TiB >>> System, DUP: total=32.00MiB, used=448.00KiB >>> Metadata, DUP: total=11.00GiB, used=9.72GiB >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> >>> The following automatic backup scheme is in place: >>> hourly: >>> btrfs sub snap -r online/root online/root.<date> >>> >>> daily: >>> btrfs sub snap -r online/root online/root.<new_offline_reference> >>> btrfs send -c online/root.<old_offline_reference> >>> online/root.<new_offline_reference> | btrfs receive offline >>> btrfs sub del -c online/root.<old_offline_reference> >>> >>> monthly: >>> btrfs sub snap -r online/root online/root.<new_external_reference> >>> btrfs send -c online/root.<old_external_reference> >>> online/root.<new_external_reference> | btrfs receive external >>> btrfs sub del -c online/root.<old_external_reference> >>> >>> Now here are the commands leading up to my problem: >>> After the online filesystem suddenly went ro, and btrfs check showed >>> massive problems, I decided to start the online array from scratch: >>> 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 >>> /dev/mapper/online1 >>> >>> As you can see from the backup commands above, the snapshots of >>> offline and external are not related, so in order to at least keep the >>> extensive backlog of the external snapshot set (including all >>> reflinks), I decided to restore the latest snapshot from external. 
>>> 2: btrfs send external/root.<external_reference> | btrfs receive online >>> >>> I wanted to ensure I can restart the incremental backup flow from >>> online to external, so I did this >>> 3: mv online/root.<external_reference> online/root >>> 4: btrfs sub snap -r online/root online/root.<external_reference> >>> 5: btrfs property set online/root ro false >>> >>> Now, I naively expected a simple restart of my automatic backups for >>> external should work. >>> However after running >>> 6: btrfs sub snap -r online/root online/root.<new_external_reference> >>> 7: btrfs send -c online/root.<old_external_reference> >>> online/root.<new_external_reference> | btrfs receive external >> >> You just recreated your "online" filesystem from scratch. Where >> "old_external_reference" comes from? You did not show steps used to >> create it. >> >>> I see the following error: >>> ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file >>> or directory >>> >>> Which is unfortunate, but the second problem actually encouraged me to >>> post this message. >>> As planned, I had to start the offline array from scratch as well, >>> because I no longer had any reference snapshot for incremental backups >>> on other devices: >>> 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 >>> /dev/mapper/offline1 >>> >>> However restarting the automatic daily backup flow bails out with a >>> similar error, although no potentially problematic previous >>> incremental snapshots should be involved here! >>> ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory >>> >> >> Again - before you can *re*start incremental-forever sequence you need >> initial full copy. How exactly did you restart it if no snapshots exist >> either on source or on destination? > > Thanks for your help regarding this issue! > > Before the online crash, I've used the following online -> external > backup scheme: > btrfs sub snap -r online/root online/root.<new_external_reference> > btrfs send -c online/root.<old_external_reference> > online/root.<new_external_reference> | btrfs receive external > btrfs sub del -c online/root.<old_external_reference> > > By sending the existing snapshot from external to online (basically a > full copy of external/old_external_reference to online/root), it > should have been possible to restart the monthly online -> external > backup scheme, right? > You did not answer any of my questions which makes it impossible to actually try to reproduce or understand it. In particular, it is not even clear whether problem happens immediately or after some time. Educated guess is that the problem is due to stuck received_uuid on source which now propagates into every snapshot and makes receive match wrong subvolume. You should never reset read-only flag, rather create new writable clone leaving original read-only snapshot untouched. Showing output of "btrfs sub li -qRu" on both sides would be helpful. >>> I'm a bit lost now. The only thing I could image which might be >>> confusing for btrfs, >>> is the residual "Received UUID" of online/root.<external_reference> >>> after command 2. >>> What's the recommended way to restore snapshots with send/receive >>> without breaking subsequent incremental backups (including reflinks of >>> existing backups)? >>> >>> Any hints appreciated... 
>>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> ^ permalink raw reply [flat|nested] 9+ messages in thread
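In other words, a restore that keeps the chain intact would look roughly like this (a sketch of the approach described above using the paths from the original report, not a tested recipe):
# full restore of the last snapshot from the external disk
btrfs send external/root.<external_reference> | btrfs receive online
# branch a writable copy for normal use; the received read-only snapshot
# stays untouched and remains the base for future incremental sends
btrfs sub snap online/root.<external_reference> online/root
# note: no "btrfs property set ... ro false" on the received snapshot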
* Re: Incremental send/receive broken after snapshot restore 2018-06-30 6:24 ` Andrei Borzenkov @ 2018-06-30 17:49 ` Hannes Schweizer 2018-06-30 18:49 ` Andrei Borzenkov 0 siblings, 1 reply; 9+ messages in thread From: Hannes Schweizer @ 2018-06-30 17:49 UTC (permalink / raw) To: arvidjaar; +Cc: linux-btrfs On Sat, Jun 30, 2018 at 8:24 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote: > > Do not reply privately to mails on list. > > 29.06.2018 22:10, Hannes Schweizer пишет: > > On Fri, Jun 29, 2018 at 7:44 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote: > >> > >> 28.06.2018 23:09, Hannes Schweizer пишет: > >>> Hi, > >>> > >>> Here's my environment: > >>> Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 > >>> Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux > >>> btrfs-progs v4.17 > >>> > >>> Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 > >>> Total devices 2 FS bytes used 3.16TiB > >>> devid 1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 > >>> devid 2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 > >>> Data, RAID0: total=3.16TiB, used=3.15TiB > >>> System, RAID0: total=16.00MiB, used=240.00KiB > >>> Metadata, RAID0: total=7.00GiB, used=4.91GiB > >>> GlobalReserve, single: total=512.00MiB, used=0.00B > >>> > >>> Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 > >>> Total devices 2 FS bytes used 3.52TiB > >>> devid 1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 > >>> devid 2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 > >>> Data, RAID1: total=3.52TiB, used=3.52TiB > >>> System, RAID1: total=8.00MiB, used=512.00KiB > >>> Metadata, RAID1: total=6.00GiB, used=5.11GiB > >>> GlobalReserve, single: total=512.00MiB, used=0.00B > >>> > >>> Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 > >>> Total devices 1 FS bytes used 3.65TiB > >>> devid 1 size 5.46TiB used 3.66TiB path > >>> /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc > >>> Data, single: total=3.64TiB, used=3.64TiB > >>> System, DUP: total=32.00MiB, used=448.00KiB > >>> Metadata, DUP: total=11.00GiB, used=9.72GiB > >>> GlobalReserve, single: total=512.00MiB, used=0.00B > >>> > >>> > >>> The following automatic backup scheme is in place: > >>> hourly: > >>> btrfs sub snap -r online/root online/root.<date> > >>> > >>> daily: > >>> btrfs sub snap -r online/root online/root.<new_offline_reference> > >>> btrfs send -c online/root.<old_offline_reference> > >>> online/root.<new_offline_reference> | btrfs receive offline > >>> btrfs sub del -c online/root.<old_offline_reference> > >>> > >>> monthly: > >>> btrfs sub snap -r online/root online/root.<new_external_reference> > >>> btrfs send -c online/root.<old_external_reference> > >>> online/root.<new_external_reference> | btrfs receive external > >>> btrfs sub del -c online/root.<old_external_reference> > >>> > >>> Now here are the commands leading up to my problem: > >>> After the online filesystem suddenly went ro, and btrfs check showed > >>> massive problems, I decided to start the online array from scratch: > >>> 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 > >>> /dev/mapper/online1 > >>> > >>> As you can see from the backup commands above, the snapshots of > >>> offline and external are not related, so in order to at least keep the > >>> extensive backlog of the external snapshot set (including all > >>> reflinks), I decided to restore the latest snapshot from external. 
> >>> 2: btrfs send external/root.<external_reference> | btrfs receive online > >>> > >>> I wanted to ensure I can restart the incremental backup flow from > >>> online to external, so I did this > >>> 3: mv online/root.<external_reference> online/root > >>> 4: btrfs sub snap -r online/root online/root.<external_reference> > >>> 5: btrfs property set online/root ro false > >>> > >>> Now, I naively expected a simple restart of my automatic backups for > >>> external should work. > >>> However after running > >>> 6: btrfs sub snap -r online/root online/root.<new_external_reference> > >>> 7: btrfs send -c online/root.<old_external_reference> > >>> online/root.<new_external_reference> | btrfs receive external > >> > >> You just recreated your "online" filesystem from scratch. Where > >> "old_external_reference" comes from? You did not show steps used to > >> create it. > >> > >>> I see the following error: > >>> ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file > >>> or directory > >>> > >>> Which is unfortunate, but the second problem actually encouraged me to > >>> post this message. > >>> As planned, I had to start the offline array from scratch as well, > >>> because I no longer had any reference snapshot for incremental backups > >>> on other devices: > >>> 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 > >>> /dev/mapper/offline1 > >>> > >>> However restarting the automatic daily backup flow bails out with a > >>> similar error, although no potentially problematic previous > >>> incremental snapshots should be involved here! > >>> ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory > >>> > >> > >> Again - before you can *re*start incremental-forever sequence you need > >> initial full copy. How exactly did you restart it if no snapshots exist > >> either on source or on destination? > > > > Thanks for your help regarding this issue! > > > > Before the online crash, I've used the following online -> external > > backup scheme: > > btrfs sub snap -r online/root online/root.<new_external_reference> > > btrfs send -c online/root.<old_external_reference> > > online/root.<new_external_reference> | btrfs receive external > > btrfs sub del -c online/root.<old_external_reference> > > > > By sending the existing snapshot from external to online (basically a > > full copy of external/old_external_reference to online/root), it > > should have been possible to restart the monthly online -> external > > backup scheme, right? > > > > You did not answer any of my questions which makes it impossible to > actually try to reproduce or understand it. In particular, it is not > even clear whether problem happens immediately or after some time. > > Educated guess is that the problem is due to stuck received_uuid on > source which now propagates into every snapshot and makes receive match > wrong subvolume. You should never reset read-only flag, rather create > new writable clone leaving original read-only snapshot untouched. > > Showing output of "btrfs sub li -qRu" on both sides would be helpful. Sry for being too vague... 
I've tested a few restore methods beforehand, and simply creating a writeable clone from the restored snapshot does not work for me, eg: # create some source snapshots btrfs sub create test_root btrfs sub snap -r test_root test_snap1 btrfs sub snap -r test_root test_snap2 # send a full and incremental backup to external disk btrfs send test_snap2 | btrfs receive /run/media/schweizer/external btrfs sub snap -r test_root test_snap3 btrfs send -c test_snap2 test_snap3 | btrfs receive /run/media/schweizer/external # simulate disappearing source btrfs sub del test_* # restore full snapshot from external disk btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive . # create writeable clone btrfs sub snap test_snap3 test_root # try to continue with backup scheme from source to external btrfs sub snap -r test_root test_snap4 # this fails!! btrfs send -c test_snap3 test_snap4 | btrfs receive /run/media/schweizer/external At subvol test_snap4 ERROR: parent determination failed for 2047 ERROR: empty stream is not considered valid I need the following snapshot tree (diablo_external.2018-06-24T19-37-39 has to be a child of diablo): diablo Name: diablo UUID: 46db1185-3c3e-194e-8d19-7456e532b2f3 Parent UUID: - Received UUID: 6c683d90-44f2-ad48-bb84-e9f241800179 Creation time: 2018-06-23 23:37:17 +0200 Subvolume ID: 258 Generation: 13748 Gen at creation: 7 Parent ID: 5 Top level ID: 5 Flags: - Snapshot(s): diablo_external.2018-06-24T19-37-39 diablo.2018-06-30T01-01-02 diablo.2018-06-30T05-01-01 diablo.2018-06-30T09-01-01 diablo.2018-06-30T11-01-01 diablo.2018-06-30T13-01-01 diablo.2018-06-30T14-01-01 diablo.2018-06-30T15-01-01 diablo.2018-06-30T16-01-01 diablo.2018-06-30T17-01-02 diablo.2018-06-30T18-01-01 diablo.2018-06-30T19-01-01 Here's the requested output: btrfs sub li -qRu /mnt/work/backup/online/ ID 258 gen 13742 top level 5 parent_uuid - received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo ID 1896 gen 1089 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid de9421c5-d160-2949-bf09-613949b4611c path diablo_external.2018-06-24T19-37-39 ID 2013 gen 11856 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 09c5d505-eee5-d24c-a05b-b8c79284586e path diablo.2018-06-30T01-01-02 ID 2018 gen 12267 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 8a495ba7-71db-6343-abf2-7859f04e7ba0 path diablo.2018-06-30T05-01-01 ID 2022 gen 12674 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid f115fd06-cef6-db44-928c-cf55fe4b738a path diablo.2018-06-30T09-01-01 ID 2024 gen 12879 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid f4f523af-c9f3-834e-adac-44d24ed95af6 path diablo.2018-06-30T11-01-01 ID 2026 gen 13092 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 6401603c-e71b-9947-b67d-9603a395ebc5 path diablo.2018-06-30T13-01-01 ID 2027 gen 13193 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 027a50b6-0725-f840-b4bd-6674f88ee340 path diablo.2018-06-30T14-01-01 ID 2028 gen 13298 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 
bdb61c68-bc15-2840-b8fd-d7a99fe7723a path diablo.2018-06-30T15-01-01 ID 2029 gen 13405 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid b96b5c6b-e719-8d47-ad03-15901ed2bf3d path diablo.2018-06-30T16-01-01 ID 2030 gen 13511 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 22a68d4b-56ab-5444-9883-69023b6a231b path diablo.2018-06-30T17-01-02 ID 2031 gen 13615 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid e2af192e-9311-6c4d-a3fe-5d23767bc60b path diablo.2018-06-30T18-01-01 ID 2032 gen 13718 top level 5 parent_uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid ddf39ee2-d5b9-5440-8c4e-b72b7e342444 path diablo.2018-06-30T19-01-01 btrfs sub li -qRu /run/media/schweizer/external ID 563 gen 5036 top level 5 parent_uuid - received_uuid - uuid fc03e750-c26b-484b-bd00-9a7b89cfd610 path diablo_external.2017-02-06T20-31-34 ID 3284 gen 1392 top level 5 parent_uuid - received_uuid bc9b22c8-e107-8a48-b14f-06b8a41a22a6 uuid 26a7792f-0157-7c4d-adf7-3d60627af8e5 path diablo_external.2017-09-27T11-39-48 ID 5772 gen 1523 top level 5 parent_uuid 26a7792f-0157-7c4d-adf7-3d60627af8e5 received_uuid 3ae68477-b348-714c-8f87-6b5fd0809aa7 uuid 28cbeff1-5ac2-dc46-98f3-10f576acc2e4 path diablo_external.2017-11-07T15-10-37 ID 5929 gen 6667 top level 5 parent_uuid 052624f5-2434-0941-8f0d-a03aef3e6995 received_uuid 209cfa0c-4436-824e-8ce4-f34ce815ecb1 uuid a053ca7e-1e48-1c4c-baf7-75c4d676695f path diablo_external.2018-01-01T15-37-41 ID 6049 gen 7575 top level 5 parent_uuid a053ca7e-1e48-1c4c-baf7-75c4d676695f received_uuid 5b2eb743-6a40-a448-abc7-027aaa0ba770 uuid 22205837-fce9-d74d-9390-d2de6148dee9 path diablo_external.2018-03-03T14-39-44 ID 6158 gen 8480 top level 5 parent_uuid 22205837-fce9-d74d-9390-d2de6148dee9 received_uuid 85e3d595-3979-7648-9ef7-86d0784d41aa uuid 07fe0dc9-1f8b-024b-8848-77b008cd52e5 path diablo_external.2018-04-06T21-22-06 ID 6231 gen 9700 top level 5 parent_uuid 07fe0dc9-1f8b-024b-8848-77b008cd52e5 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 906e702f-d658-7f48-901a-afb032f504d6 path diablo_external.2018-05-01T15-08-13 ID 6281 gen 9429 top level 5 parent_uuid 906e702f-d658-7f48-901a-afb032f504d6 received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid e41f5938-eb95-9845-b691-a4aeae0ce9d1 path diablo_external.2018-06-24T19-37-39 Is there some way to reset the received_uuid of the following snapshot on online? ID 258 gen 13742 top level 5 parent_uuid - received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo > >>> I'm a bit lost now. The only thing I could image which might be > >>> confusing for btrfs, > >>> is the residual "Received UUID" of online/root.<external_reference> > >>> after command 2. > >>> What's the recommended way to restore snapshots with send/receive > >>> without breaking subsequent incremental backups (including reflinks of > >>> existing backups)? > >>> > >>> Any hints appreciated... > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> > >> > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Incremental send/receive broken after snapshot restore 2018-06-30 17:49 ` Hannes Schweizer @ 2018-06-30 18:49 ` Andrei Borzenkov 2018-06-30 20:02 ` Andrei Borzenkov 0 siblings, 1 reply; 9+ messages in thread From: Andrei Borzenkov @ 2018-06-30 18:49 UTC (permalink / raw) To: Hannes Schweizer; +Cc: linux-btrfs 30.06.2018 20:49, Hannes Schweizer пишет: > On Sat, Jun 30, 2018 at 8:24 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote: >> >> Do not reply privately to mails on list. >> >> 29.06.2018 22:10, Hannes Schweizer пишет: >>> On Fri, Jun 29, 2018 at 7:44 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote: >>>> >>>> 28.06.2018 23:09, Hannes Schweizer пишет: >>>>> Hi, >>>>> >>>>> Here's my environment: >>>>> Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 >>>>> Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux >>>>> btrfs-progs v4.17 >>>>> >>>>> Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 >>>>> Total devices 2 FS bytes used 3.16TiB >>>>> devid 1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 >>>>> devid 2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 >>>>> Data, RAID0: total=3.16TiB, used=3.15TiB >>>>> System, RAID0: total=16.00MiB, used=240.00KiB >>>>> Metadata, RAID0: total=7.00GiB, used=4.91GiB >>>>> GlobalReserve, single: total=512.00MiB, used=0.00B >>>>> >>>>> Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 >>>>> Total devices 2 FS bytes used 3.52TiB >>>>> devid 1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 >>>>> devid 2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 >>>>> Data, RAID1: total=3.52TiB, used=3.52TiB >>>>> System, RAID1: total=8.00MiB, used=512.00KiB >>>>> Metadata, RAID1: total=6.00GiB, used=5.11GiB >>>>> GlobalReserve, single: total=512.00MiB, used=0.00B >>>>> >>>>> Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 >>>>> Total devices 1 FS bytes used 3.65TiB >>>>> devid 1 size 5.46TiB used 3.66TiB path >>>>> /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc >>>>> Data, single: total=3.64TiB, used=3.64TiB >>>>> System, DUP: total=32.00MiB, used=448.00KiB >>>>> Metadata, DUP: total=11.00GiB, used=9.72GiB >>>>> GlobalReserve, single: total=512.00MiB, used=0.00B >>>>> >>>>> >>>>> The following automatic backup scheme is in place: >>>>> hourly: >>>>> btrfs sub snap -r online/root online/root.<date> >>>>> >>>>> daily: >>>>> btrfs sub snap -r online/root online/root.<new_offline_reference> >>>>> btrfs send -c online/root.<old_offline_reference> >>>>> online/root.<new_offline_reference> | btrfs receive offline >>>>> btrfs sub del -c online/root.<old_offline_reference> >>>>> >>>>> monthly: >>>>> btrfs sub snap -r online/root online/root.<new_external_reference> >>>>> btrfs send -c online/root.<old_external_reference> >>>>> online/root.<new_external_reference> | btrfs receive external >>>>> btrfs sub del -c online/root.<old_external_reference> >>>>> >>>>> Now here are the commands leading up to my problem: >>>>> After the online filesystem suddenly went ro, and btrfs check showed >>>>> massive problems, I decided to start the online array from scratch: >>>>> 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 >>>>> /dev/mapper/online1 >>>>> >>>>> As you can see from the backup commands above, the snapshots of >>>>> offline and external are not related, so in order to at least keep the >>>>> extensive backlog of the external snapshot set (including all >>>>> reflinks), I decided to restore the latest snapshot from external. 
>>>>> 2: btrfs send external/root.<external_reference> | btrfs receive online >>>>> >>>>> I wanted to ensure I can restart the incremental backup flow from >>>>> online to external, so I did this >>>>> 3: mv online/root.<external_reference> online/root >>>>> 4: btrfs sub snap -r online/root online/root.<external_reference> >>>>> 5: btrfs property set online/root ro false >>>>> >>>>> Now, I naively expected a simple restart of my automatic backups for >>>>> external should work. >>>>> However after running >>>>> 6: btrfs sub snap -r online/root online/root.<new_external_reference> >>>>> 7: btrfs send -c online/root.<old_external_reference> >>>>> online/root.<new_external_reference> | btrfs receive external >>>> >>>> You just recreated your "online" filesystem from scratch. Where >>>> "old_external_reference" comes from? You did not show steps used to >>>> create it. >>>> >>>>> I see the following error: >>>>> ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file >>>>> or directory >>>>> >>>>> Which is unfortunate, but the second problem actually encouraged me to >>>>> post this message. >>>>> As planned, I had to start the offline array from scratch as well, >>>>> because I no longer had any reference snapshot for incremental backups >>>>> on other devices: >>>>> 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 >>>>> /dev/mapper/offline1 >>>>> >>>>> However restarting the automatic daily backup flow bails out with a >>>>> similar error, although no potentially problematic previous >>>>> incremental snapshots should be involved here! >>>>> ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory >>>>> >>>> >>>> Again - before you can *re*start incremental-forever sequence you need >>>> initial full copy. How exactly did you restart it if no snapshots exist >>>> either on source or on destination? >>> >>> Thanks for your help regarding this issue! >>> >>> Before the online crash, I've used the following online -> external >>> backup scheme: >>> btrfs sub snap -r online/root online/root.<new_external_reference> >>> btrfs send -c online/root.<old_external_reference> >>> online/root.<new_external_reference> | btrfs receive external >>> btrfs sub del -c online/root.<old_external_reference> >>> >>> By sending the existing snapshot from external to online (basically a >>> full copy of external/old_external_reference to online/root), it >>> should have been possible to restart the monthly online -> external >>> backup scheme, right? >>> >> >> You did not answer any of my questions which makes it impossible to >> actually try to reproduce or understand it. In particular, it is not >> even clear whether problem happens immediately or after some time. >> >> Educated guess is that the problem is due to stuck received_uuid on >> source which now propagates into every snapshot and makes receive match >> wrong subvolume. You should never reset read-only flag, rather create >> new writable clone leaving original read-only snapshot untouched. >> >> Showing output of "btrfs sub li -qRu" on both sides would be helpful. > > Sry for being too vague... 
> > I've tested a few restore methods beforehand, and simply creating a > writeable clone from the restored snapshot does not work for me, eg: > # create some source snapshots > btrfs sub create test_root > btrfs sub snap -r test_root test_snap1 > btrfs sub snap -r test_root test_snap2 > > # send a full and incremental backup to external disk > btrfs send test_snap2 | btrfs receive /run/media/schweizer/external > btrfs sub snap -r test_root test_snap3 > btrfs send -c test_snap2 test_snap3 | btrfs receive > /run/media/schweizer/external > > # simulate disappearing source > btrfs sub del test_* > > # restore full snapshot from external disk > btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive . > > # create writeable clone > btrfs sub snap test_snap3 test_root > > # try to continue with backup scheme from source to external > btrfs sub snap -r test_root test_snap4 > > # this fails!! > btrfs send -c test_snap3 test_snap4 | btrfs receive > /run/media/schweizer/external > At subvol test_snap4 > ERROR: parent determination failed for 2047 > ERROR: empty stream is not considered valid > Yes, that's expected. Incremental stream always needs valid parent - this will be cloned on destination and incremental changes applied to it. "-c" option is just additional sugar on top of it which might reduce size of stream, but in this case (i.e. without "-p") it also attempts to guess parent subvolume for test_snap4 and this fails because test_snap3 and test_snap4 do not have common parent so test_snap3 is rejected as valid parent snapshot. You can restart incremental-forever chain by using explicit "-p" instead: btrfs send -p test_snap3 test_snap4 Subsequent snapshots (test_snap5 etc) will all have common parent with immediate predecessor again so "-c" will work. Note that technically "btrfs send" with single "-c" option is entirely equivalent to "btrfs -p". Using "-p" would have avoided this issue. :) Although this implicit check for common parent may be considered a good thing in this case. P.S. looking at the above, it probably needs to be in manual page for btrfs-send. It took me quite some time to actually understand the meaning of "-p" and "-c" and behavior if they are present. > I need the following snapshot tree > (diablo_external.2018-06-24T19-37-39 has to be a child of diablo): > diablo > Name: diablo > UUID: 46db1185-3c3e-194e-8d19-7456e532b2f3 > Parent UUID: - > Received UUID: 6c683d90-44f2-ad48-bb84-e9f241800179 > Creation time: 2018-06-23 23:37:17 +0200 > Subvolume ID: 258 > Generation: 13748 > Gen at creation: 7 > Parent ID: 5 > Top level ID: 5 > Flags: - > Snapshot(s): > diablo_external.2018-06-24T19-37-39 > diablo.2018-06-30T01-01-02 > diablo.2018-06-30T05-01-01 > diablo.2018-06-30T09-01-01 > diablo.2018-06-30T11-01-01 > diablo.2018-06-30T13-01-01 > diablo.2018-06-30T14-01-01 > diablo.2018-06-30T15-01-01 > diablo.2018-06-30T16-01-01 > diablo.2018-06-30T17-01-02 > diablo.2018-06-30T18-01-01 > diablo.2018-06-30T19-01-01 > > Here's the requested output: > btrfs sub li -qRu /mnt/work/backup/online/ > ID 258 gen 13742 top level 5 parent_uuid - > received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid > 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo Yes, as expected. From now on every read-only snapshot created from it will inherit the same received_uuid which will be looked for on destination instead of source uuid, so on destination wrong subvolume will be cloned. 
I.o.w, on source it will compute changes against one subvolume but apply changes on destination to clone of entirely different subvolume. Actually I could reproduce destination corruption easily (in my case destination snapshot had extra content but for the same reason). ... > > Is there some way to reset the received_uuid of the following snapshot > on online? > ID 258 gen 13742 top level 5 parent_uuid - > received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid > 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo > There is no "official" tool but this question came up quite often. Search this list, I believe recently one-liner using python-btrfs was posted. Note that also patch that removes received_uuid when "ro" propery is removed was suggested, hopefully it will be merged at some point. Still I personally consider ability to flip read-only property the very bad thing that should have never been exposed in the first place. ^ permalink raw reply [flat|nested] 9+ messages in thread
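Put together, a corrected version of the test sequence above would be something like this (a sketch only, with the same hypothetical test_* names):
# restore the full snapshot from the external disk
btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive .
# writable clone for continued use; test_snap3 stays read-only
btrfs sub snap test_snap3 test_root
# restart the chain with an explicit parent instead of -c
btrfs sub snap -r test_root test_snap4
btrfs send -p test_snap3 test_snap4 | btrfs receive /run/media/schweizer/external
# from here on the usual -c against the previous snapshot should work again
btrfs sub snap -r test_root test_snap5
btrfs send -c test_snap4 test_snap5 | btrfs receive /run/media/schweizer/external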
* Re: Incremental send/receive broken after snapshot restore 2018-06-30 18:49 ` Andrei Borzenkov @ 2018-06-30 20:02 ` Andrei Borzenkov 2018-06-30 23:03 ` Hannes Schweizer 0 siblings, 1 reply; 9+ messages in thread From: Andrei Borzenkov @ 2018-06-30 20:02 UTC (permalink / raw) To: Hannes Schweizer; +Cc: linux-btrfs 30.06.2018 21:49, Andrei Borzenkov пишет: > 30.06.2018 20:49, Hannes Schweizer пишет: ... >> >> I've tested a few restore methods beforehand, and simply creating a >> writeable clone from the restored snapshot does not work for me, eg: >> # create some source snapshots >> btrfs sub create test_root >> btrfs sub snap -r test_root test_snap1 >> btrfs sub snap -r test_root test_snap2 >> >> # send a full and incremental backup to external disk >> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external >> btrfs sub snap -r test_root test_snap3 >> btrfs send -c test_snap2 test_snap3 | btrfs receive >> /run/media/schweizer/external >> >> # simulate disappearing source >> btrfs sub del test_* >> >> # restore full snapshot from external disk >> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive . >> >> # create writeable clone >> btrfs sub snap test_snap3 test_root >> >> # try to continue with backup scheme from source to external >> btrfs sub snap -r test_root test_snap4 >> >> # this fails!! >> btrfs send -c test_snap3 test_snap4 | btrfs receive >> /run/media/schweizer/external >> At subvol test_snap4 >> ERROR: parent determination failed for 2047 >> ERROR: empty stream is not considered valid >> > > Yes, that's expected. Incremental stream always needs valid parent - > this will be cloned on destination and incremental changes applied to > it. "-c" option is just additional sugar on top of it which might reduce > size of stream, but in this case (i.e. without "-p") it also attempts to > guess parent subvolume for test_snap4 and this fails because test_snap3 > and test_snap4 do not have common parent so test_snap3 is rejected as > valid parent snapshot. You can restart incremental-forever chain by > using explicit "-p" instead: > > btrfs send -p test_snap3 test_snap4 > > Subsequent snapshots (test_snap5 etc) will all have common parent with > immediate predecessor again so "-c" will work. > > Note that technically "btrfs send" with single "-c" option is entirely > equivalent to "btrfs -p". Using "-p" would have avoided this issue. :) > Although this implicit check for common parent may be considered a good > thing in this case. > > P.S. looking at the above, it probably needs to be in manual page for > btrfs-send. It took me quite some time to actually understand the > meaning of "-p" and "-c" and behavior if they are present. > ... >> >> Is there some way to reset the received_uuid of the following snapshot >> on online? >> ID 258 gen 13742 top level 5 parent_uuid - >> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid >> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo >> > > There is no "official" tool but this question came up quite often. > Search this list, I believe recently one-liner using python-btrfs was > posted. Note that also patch that removes received_uuid when "ro" > propery is removed was suggested, hopefully it will be merged at some > point. Still I personally consider ability to flip read-only property > the very bad thing that should have never been exposed in the first place. > Note that if you remove received_uuid (explicitly or - in the future - implicitly) you will not be able to restart incremental send anymore. 
Without received_uuid there will be no way to match source test_snap3 with destination test_snap3. So you *must* preserve it and start with writable clone. received_uuid is misnomer. I wish it would be named "content_uuid" or "snap_uuid" with semantic 1. When read-only snapshot of writable volume is created, content_uuid is initialized 2. Read-only snapshot of read-only snapshot inherits content_uuid 3. destination of "btrfs send" inherits content_uuid 4. writable snapshot of read-only snapshot clears content_uuid 5. clearing read-only property clears content_uuid This would make it more straightforward to cascade and restart replication by having single subvolume property to match against. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Incremental send/receive broken after snapshot restore 2018-06-30 20:02 ` Andrei Borzenkov @ 2018-06-30 23:03 ` Hannes Schweizer 2018-06-30 23:16 ` Marc MERLIN 0 siblings, 1 reply; 9+ messages in thread From: Hannes Schweizer @ 2018-06-30 23:03 UTC (permalink / raw) To: arvidjaar; +Cc: linux-btrfs On Sat, Jun 30, 2018 at 10:02 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote: > > 30.06.2018 21:49, Andrei Borzenkov пишет: > > 30.06.2018 20:49, Hannes Schweizer пишет: > ... > >> > >> I've tested a few restore methods beforehand, and simply creating a > >> writeable clone from the restored snapshot does not work for me, eg: > >> # create some source snapshots > >> btrfs sub create test_root > >> btrfs sub snap -r test_root test_snap1 > >> btrfs sub snap -r test_root test_snap2 > >> > >> # send a full and incremental backup to external disk > >> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external > >> btrfs sub snap -r test_root test_snap3 > >> btrfs send -c test_snap2 test_snap3 | btrfs receive > >> /run/media/schweizer/external > >> > >> # simulate disappearing source > >> btrfs sub del test_* > >> > >> # restore full snapshot from external disk > >> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive . > >> > >> # create writeable clone > >> btrfs sub snap test_snap3 test_root > >> > >> # try to continue with backup scheme from source to external > >> btrfs sub snap -r test_root test_snap4 > >> > >> # this fails!! > >> btrfs send -c test_snap3 test_snap4 | btrfs receive > >> /run/media/schweizer/external > >> At subvol test_snap4 > >> ERROR: parent determination failed for 2047 > >> ERROR: empty stream is not considered valid > >> > > > > Yes, that's expected. Incremental stream always needs valid parent - > > this will be cloned on destination and incremental changes applied to > > it. "-c" option is just additional sugar on top of it which might reduce > > size of stream, but in this case (i.e. without "-p") it also attempts to > > guess parent subvolume for test_snap4 and this fails because test_snap3 > > and test_snap4 do not have common parent so test_snap3 is rejected as > > valid parent snapshot. You can restart incremental-forever chain by > > using explicit "-p" instead: > > > > btrfs send -p test_snap3 test_snap4 > > > > Subsequent snapshots (test_snap5 etc) will all have common parent with > > immediate predecessor again so "-c" will work. > > > > Note that technically "btrfs send" with single "-c" option is entirely > > equivalent to "btrfs -p". Using "-p" would have avoided this issue. :) > > Although this implicit check for common parent may be considered a good > > thing in this case. > > > > P.S. looking at the above, it probably needs to be in manual page for > > btrfs-send. It took me quite some time to actually understand the > > meaning of "-p" and "-c" and behavior if they are present. > > > ... > >> > >> Is there some way to reset the received_uuid of the following snapshot > >> on online? > >> ID 258 gen 13742 top level 5 parent_uuid - > >> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid > >> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo > >> > > > > There is no "official" tool but this question came up quite often. > > Search this list, I believe recently one-liner using python-btrfs was > > posted. Note that also patch that removes received_uuid when "ro" > > propery is removed was suggested, hopefully it will be merged at some > > point. 
Still I personally consider ability to flip read-only property > > the very bad thing that should have never been exposed in the first place. > > > > Note that if you remove received_uuid (explicitly or - in the future - > implicitly) you will not be able to restart incremental send anymore. > Without received_uuid there will be no way to match source test_snap3 > with destination test_snap3. So you *must* preserve it and start with > writable clone. > > received_uuid is misnomer. I wish it would be named "content_uuid" or > "snap_uuid" with semantic > > 1. When read-only snapshot of writable volume is created, content_uuid > is initialized > > 2. Read-only snapshot of read-only snapshot inherits content_uuid > > 3. destination of "btrfs send" inherits content_uuid > > 4. writable snapshot of read-only snapshot clears content_uuid > > 5. clearing read-only property clears content_uuid > > This would make it more straightforward to cascade and restart > replication by having single subvolume property to match against. Indeed, the current terminology is a bit confusing, and the patch removing the received_uuid when manually switching ro to false should definitely be merged. As recommended, I'll simply create a writeable clone of the restored snapshot and use -p instead of -c when restoring again (which kind of snapshot relations are accepted for incremental send/receive needs better documentation) Fortunately, with all your hints regarding received_uuid I was able to successfully restart the incremental-chain WITHOUT restarting from scratch: # replace incorrectly propagated received_uuid on destination with actual uuid of source snapshot btrfs property set /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro false set_received_uuid.py de9421c5-d160-2949-bf09-613949b4611c 1089 0.0 /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 btrfs property set /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro true # remove incorrectly propagated received_uuid on source btrfs property set /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 ro false set_received_uuid.py 00000000-0000-0000-0000-000000000000 8572 0.0 /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 btrfs property set /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 ro true # works now! btrfs sub snap -r /mnt/work/backup/online/diablo /mnt/work/backup/online/diablo_external.2018-07-01T00-19-46 btrfs send -q -p /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 /mnt/work/backup/online/diablo_external.2018-07-01T00-19-46 | btrfs receive /run/media/schweizer/external Time will tell whether the incremental-chain is really consistent, but I suppose all the changes in the heavily used filesystem should've already caused massive unlink/whatever errors otherwise. Thanks a lot! You've really saved me hours by not having to restart the source from scratch. ^ permalink raw reply [flat|nested] 9+ messages in thread
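A quick way to double-check that kind of surgery is to compare the UUID fields on both sides once more, e.g.:
# the destination's received_uuid should now match the uuid of the
# source snapshot it was cloned from (de9421c5-... in this case)
btrfs sub li -qRu /mnt/work/backup/online | grep diablo_external
btrfs sub li -qRu /run/media/schweizer/external | grep diablo_external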
* Re: Incremental send/receive broken after snapshot restore 2018-06-30 23:03 ` Hannes Schweizer @ 2018-06-30 23:16 ` Marc MERLIN 2018-07-01 4:54 ` Andrei Borzenkov 0 siblings, 1 reply; 9+ messages in thread From: Marc MERLIN @ 2018-06-30 23:16 UTC (permalink / raw) To: Hannes Schweizer; +Cc: arvidjaar, linux-btrfs Sorry that I missed the beginning of this discussion, but I think this is what I documented here after hitting hte same problem: http://marc.merlins.org/perso/btrfs/post_2018-03-09_Btrfs-Tips_-Rescuing-A-Btrfs-Send-Receive-Relationship.html Marc On Sun, Jul 01, 2018 at 01:03:37AM +0200, Hannes Schweizer wrote: > On Sat, Jun 30, 2018 at 10:02 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote: > > > > 30.06.2018 21:49, Andrei Borzenkov пишет: > > > 30.06.2018 20:49, Hannes Schweizer пишет: > > ... > > >> > > >> I've tested a few restore methods beforehand, and simply creating a > > >> writeable clone from the restored snapshot does not work for me, eg: > > >> # create some source snapshots > > >> btrfs sub create test_root > > >> btrfs sub snap -r test_root test_snap1 > > >> btrfs sub snap -r test_root test_snap2 > > >> > > >> # send a full and incremental backup to external disk > > >> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external > > >> btrfs sub snap -r test_root test_snap3 > > >> btrfs send -c test_snap2 test_snap3 | btrfs receive > > >> /run/media/schweizer/external > > >> > > >> # simulate disappearing source > > >> btrfs sub del test_* > > >> > > >> # restore full snapshot from external disk > > >> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive . > > >> > > >> # create writeable clone > > >> btrfs sub snap test_snap3 test_root > > >> > > >> # try to continue with backup scheme from source to external > > >> btrfs sub snap -r test_root test_snap4 > > >> > > >> # this fails!! > > >> btrfs send -c test_snap3 test_snap4 | btrfs receive > > >> /run/media/schweizer/external > > >> At subvol test_snap4 > > >> ERROR: parent determination failed for 2047 > > >> ERROR: empty stream is not considered valid > > >> > > > > > > Yes, that's expected. Incremental stream always needs valid parent - > > > this will be cloned on destination and incremental changes applied to > > > it. "-c" option is just additional sugar on top of it which might reduce > > > size of stream, but in this case (i.e. without "-p") it also attempts to > > > guess parent subvolume for test_snap4 and this fails because test_snap3 > > > and test_snap4 do not have common parent so test_snap3 is rejected as > > > valid parent snapshot. You can restart incremental-forever chain by > > > using explicit "-p" instead: > > > > > > btrfs send -p test_snap3 test_snap4 > > > > > > Subsequent snapshots (test_snap5 etc) will all have common parent with > > > immediate predecessor again so "-c" will work. > > > > > > Note that technically "btrfs send" with single "-c" option is entirely > > > equivalent to "btrfs -p". Using "-p" would have avoided this issue. :) > > > Although this implicit check for common parent may be considered a good > > > thing in this case. > > > > > > P.S. looking at the above, it probably needs to be in manual page for > > > btrfs-send. It took me quite some time to actually understand the > > > meaning of "-p" and "-c" and behavior if they are present. > > > > > ... > > >> > > >> Is there some way to reset the received_uuid of the following snapshot > > >> on online? 
> > >> ID 258 gen 13742 top level 5 parent_uuid - > > >> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid > > >> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo > > >> > > > > > > There is no "official" tool but this question came up quite often. > > > Search this list, I believe recently one-liner using python-btrfs was > > > posted. Note that also patch that removes received_uuid when "ro" > > > propery is removed was suggested, hopefully it will be merged at some > > > point. Still I personally consider ability to flip read-only property > > > the very bad thing that should have never been exposed in the first place. > > > > > > > Note that if you remove received_uuid (explicitly or - in the future - > > implicitly) you will not be able to restart incremental send anymore. > > Without received_uuid there will be no way to match source test_snap3 > > with destination test_snap3. So you *must* preserve it and start with > > writable clone. > > > > received_uuid is misnomer. I wish it would be named "content_uuid" or > > "snap_uuid" with semantic > > > > 1. When read-only snapshot of writable volume is created, content_uuid > > is initialized > > > > 2. Read-only snapshot of read-only snapshot inherits content_uuid > > > > 3. destination of "btrfs send" inherits content_uuid > > > > 4. writable snapshot of read-only snapshot clears content_uuid > > > > 5. clearing read-only property clears content_uuid > > > > This would make it more straightforward to cascade and restart > > replication by having single subvolume property to match against. > > Indeed, the current terminology is a bit confusing, and the patch > removing the received_uuid when manually switching ro to false should > definitely be merged. As recommended, I'll simply create a writeable > clone of the restored snapshot and use -p instead of -c when restoring > again (which kind of snapshot relations are accepted for incremental > send/receive needs better documentation) > > Fortunately, with all your hints regarding received_uuid I was able to > successfully restart the incremental-chain WITHOUT restarting from > scratch: > # replace incorrectly propagated received_uuid on destination with > actual uuid of source snapshot > btrfs property set > /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro > false > set_received_uuid.py de9421c5-d160-2949-bf09-613949b4611c 1089 0.0 > /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 > btrfs property set > /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro > true > > # remove incorrectly propagated received_uuid on source > btrfs property set > /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 ro false > set_received_uuid.py 00000000-0000-0000-0000-000000000000 8572 0.0 > /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 > btrfs property set > /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 ro true > > # works now! > btrfs sub snap -r /mnt/work/backup/online/diablo > /mnt/work/backup/online/diablo_external.2018-07-01T00-19-46 > btrfs send -q -p > /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 > /mnt/work/backup/online/diablo_external.2018-07-01T00-19-46 | btrfs > receive /run/media/schweizer/external > > Time will tell whether the incremental-chain is really consistent, but > I suppose all the changes in the heavily used filesystem should've > already caused massive unlink/whatever errors otherwise. > > Thanks a lot! 
> You've really saved me hours by not having to restart the source from scratch.

--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
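Putting the advice quoted above into one sequence may help readers who land
here with the same problem. This is only a minimal sketch, reusing the
illustrative test_snap*/test_root names and the /run/media/schweizer/external
mount point from the quoted test case; it is not a verified transcript from
this thread:

# restore the newest snapshot from the external disk and keep it read-only,
# so it retains its received_uuid
btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive .
# work only on a writable clone; never flip the restored snapshot to ro=false
btrfs sub snap test_snap3 test_root
# take the next read-only snapshot and send it with an explicit parent
btrfs sub snap -r test_root test_snap4
btrfs send -p test_snap3 test_snap4 | btrfs receive /run/media/schweizer/external
# from test_snap5 onwards the predecessor is a valid common parent again,
# so either -p or -c works
btrfs sub snap -r test_root test_snap5
btrfs send -p test_snap4 test_snap5 | btrfs receive /run/media/schweizer/external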
* Re: Incremental send/receive broken after snapshot restore
2018-06-30 23:16 ` Marc MERLIN
@ 2018-07-01 4:54 ` Andrei Borzenkov
0 siblings, 0 replies; 9+ messages in thread
From: Andrei Borzenkov @ 2018-07-01 4:54 UTC (permalink / raw)
To: Marc MERLIN, Hannes Schweizer; +Cc: linux-btrfs

01.07.2018 02:16, Marc MERLIN wrote:
> Sorry that I missed the beginning of this discussion, but I think this is
> what I documented here after hitting the same problem:

This is similar, yes. IIRC you had a different starting point, though. Here
it should have been possible to use only standard, documented tools, without
any need for low-level surgery, if done right.

> http://marc.merlins.org/perso/btrfs/post_2018-03-09_Btrfs-Tips_-Rescuing-A-Btrfs-Send-Receive-Relationship.html
>

M-m-m ... the statement "because the source had a Parent UUID value too, I
was actually supposed to set Received UUID on the destination to it" is
entirely off the mark, nor does it even match the subsequent command. You
probably meant to say "because the source had a *Received* UUID value too,
I was actually supposed to set Received UUID on the destination to it".
That is correct.

And that is what I meant above - received_uuid is a misnomer; it is actually
used as a common data-set identifier. Two subvolumes with the same
received_uuid are presumed to have identical content. Which makes the very
idea of being able to freely manipulate it rather questionable.

P.S. Of course "parent" is also highly ambiguous in the btrfs world. We
really need to come up with acceptable terminology to disambiguate the tree
parent, the snapshot parent and the replication parent. The latter would
probably better be called the "base snapshot" (NetApp calls it the "common
snapshot"); an error message like "Could not find base snapshot matching
UUID xxx" would be far less ambiguous.

> Marc
>
> On Sun, Jul 01, 2018 at 01:03:37AM +0200, Hannes Schweizer wrote:
> ...
"-c" option is just additional sugar on top of it which might reduce >>>> size of stream, but in this case (i.e. without "-p") it also attempts to >>>> guess parent subvolume for test_snap4 and this fails because test_snap3 >>>> and test_snap4 do not have common parent so test_snap3 is rejected as >>>> valid parent snapshot. You can restart incremental-forever chain by >>>> using explicit "-p" instead: >>>> >>>> btrfs send -p test_snap3 test_snap4 >>>> >>>> Subsequent snapshots (test_snap5 etc) will all have common parent with >>>> immediate predecessor again so "-c" will work. >>>> >>>> Note that technically "btrfs send" with single "-c" option is entirely >>>> equivalent to "btrfs -p". Using "-p" would have avoided this issue. :) >>>> Although this implicit check for common parent may be considered a good >>>> thing in this case. >>>> >>>> P.S. looking at the above, it probably needs to be in manual page for >>>> btrfs-send. It took me quite some time to actually understand the >>>> meaning of "-p" and "-c" and behavior if they are present. >>>> >>> ... >>>>> >>>>> Is there some way to reset the received_uuid of the following snapshot >>>>> on online? >>>>> ID 258 gen 13742 top level 5 parent_uuid - >>>>> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid >>>>> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo >>>>> >>>> >>>> There is no "official" tool but this question came up quite often. >>>> Search this list, I believe recently one-liner using python-btrfs was >>>> posted. Note that also patch that removes received_uuid when "ro" >>>> propery is removed was suggested, hopefully it will be merged at some >>>> point. Still I personally consider ability to flip read-only property >>>> the very bad thing that should have never been exposed in the first place. >>>> >>> >>> Note that if you remove received_uuid (explicitly or - in the future - >>> implicitly) you will not be able to restart incremental send anymore. >>> Without received_uuid there will be no way to match source test_snap3 >>> with destination test_snap3. So you *must* preserve it and start with >>> writable clone. >>> >>> received_uuid is misnomer. I wish it would be named "content_uuid" or >>> "snap_uuid" with semantic >>> >>> 1. When read-only snapshot of writable volume is created, content_uuid >>> is initialized >>> >>> 2. Read-only snapshot of read-only snapshot inherits content_uuid >>> >>> 3. destination of "btrfs send" inherits content_uuid >>> >>> 4. writable snapshot of read-only snapshot clears content_uuid >>> >>> 5. clearing read-only property clears content_uuid >>> >>> This would make it more straightforward to cascade and restart >>> replication by having single subvolume property to match against. >> >> Indeed, the current terminology is a bit confusing, and the patch >> removing the received_uuid when manually switching ro to false should >> definitely be merged. 
As recommended, I'll simply create a writeable >> clone of the restored snapshot and use -p instead of -c when restoring >> again (which kind of snapshot relations are accepted for incremental >> send/receive needs better documentation) >> >> Fortunately, with all your hints regarding received_uuid I was able to >> successfully restart the incremental-chain WITHOUT restarting from >> scratch: >> # replace incorrectly propagated received_uuid on destination with >> actual uuid of source snapshot >> btrfs property set >> /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro >> false >> set_received_uuid.py de9421c5-d160-2949-bf09-613949b4611c 1089 0.0 >> /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 >> btrfs property set >> /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro >> true >> >> # remove incorrectly propagated received_uuid on source >> btrfs property set >> /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 ro false >> set_received_uuid.py 00000000-0000-0000-0000-000000000000 8572 0.0 >> /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 >> btrfs property set >> /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 ro true >> >> # works now! >> btrfs sub snap -r /mnt/work/backup/online/diablo >> /mnt/work/backup/online/diablo_external.2018-07-01T00-19-46 >> btrfs send -q -p >> /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39 >> /mnt/work/backup/online/diablo_external.2018-07-01T00-19-46 | btrfs >> receive /run/media/schweizer/external >> >> Time will tell whether the incremental-chain is really consistent, but >> I suppose all the changes in the heavily used filesystem should've >> already caused massive unlink/whatever errors otherwise. >> >> Thanks a lot! >> You've really saved me hours by not having to restart the source from scratch. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > ^ permalink raw reply [flat|nested] 9+ messages in thread
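Because the whole repair hinges on received_uuid, it is worth inspecting the
UUID relationship on both sides before and after any surgery like the
set_received_uuid.py steps quoted above (presumably the python-btrfs-based
helper mentioned earlier in the thread; it is not part of btrfs-progs). A
minimal sketch using the source and destination paths from this thread:

# on the source: uuid, parent uuid and received_uuid of the reference snapshot
btrfs subvolume show /mnt/work/backup/online/diablo_external.2018-06-24T19-37-39
# on the destination: its Received UUID should match the source snapshot's uuid
btrfs subvolume show /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39
# the same information for all subvolumes at once
# (-u: uuid, -q: parent uuid, -R: received uuid)
btrfs subvolume list -qRu /mnt/work/backup/online
btrfs subvolume list -qRu /run/media/schweizer/external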
Thread overview: 9+ messages
2018-06-28 20:09 Incremental send/receive broken after snapshot restore Hannes Schweizer
2018-06-29 17:44 ` Andrei Borzenkov
[not found] ` <CAOfGOYyFcQ5gN7z=4zEaGH0VMVUuFE5qiGwgF+c14FU228Y3iQ@mail.gmail.com>
2018-06-30 6:24 ` Andrei Borzenkov
2018-06-30 17:49 ` Hannes Schweizer
2018-06-30 18:49 ` Andrei Borzenkov
2018-06-30 20:02 ` Andrei Borzenkov
2018-06-30 23:03 ` Hannes Schweizer
2018-06-30 23:16 ` Marc MERLIN
2018-07-01 4:54 ` Andrei Borzenkov