* Incremental send receive of snapshot fails
From: Rene Wolf @ 2016-12-28 11:50 UTC
To: Btrfs BTRFS
Hi all
I have a problem with incremental snapshot send / receive in btrfs. Maybe
my google-fu is weak, but I couldn't find any pointers, so here goes.
A few words about my setup first:
I have multiple clients that back up to a central server. All clients
(and the server) are running a (K)Ubuntu 16.10 64Bit on btrfs. Backing
up works with btrfs send / receive, either full or incremental,
depending on what's available on the server side. All clients have the
usual (Ubuntu) btrfs layout: 2 subvolumes, one for / and one for /home;
explicit entries in fstab; root volume not mounted anywhere. For further
details see the P.s. at the end.
Here's what happens:
In general I stick to the example from
https://btrfs.wiki.kernel.org/index.php/Incremental_Backup . Backing up
is done daily by a script, and it works successfully on all of my
clients except one (called "lab").
I start with the first snapshot on "lab" and do a full send to the
server. This works as expected (sending takes some hours as it is done
over wifi+ssh). After that is done I send an incremental snapshot based
on the previous parent. This also works as expected (no error etc).
Sending deltas then happens once a day, with the script always keeping
the last two snapshots on the client and many more on the server. Also
after each run of the script I do a bit of "house keeping" to prevent
"disk full" etc (see below p.s. for commands).
I can't exactly say when, but after some time (possibly the next day)
snapshot sending fails with an error on the receiving end:
ERROR: unlink some/file failed. No such file or directory
Some searching around led me to this
https://bugzilla.kernel.org/show_bug.cgi?id=60673 . So I checked to make
sure my script doesn't use the wrong parent; and it does not. But to
make really sure I tried a send / receive directly on "lab" without the
server:
# btrfs subvol snap -r / /.back/new_snap
> Create a readonly snapshot of '/' in '/.back/new_snap'
# btrfs subv show /.back/last_snap_by_script
> /.back/last_snap_by_script
> Name: last_snap_by_script
> UUID: b4634a8b-b74b-154a-9f17-1115f6d07524
> Parent UUID: b5f9a301-69f7-0646-8cf1-ba29e0c24fac
> Received UUID: 196a0866-cd05-d24e-bac6-84e8e5eb037a
> Creation time: 2016-12-27 17:55:10 +0100
> Subvolume ID: 486
> Generation: 52036
> Gen at creation: 51524
> Parent ID: 257
> Top level ID: 257
> Flags: readonly
> Snapshot(s):
# btrfs subv show /.back/new_snap
> /.back/new_snap
> Name: new_snap
> UUID: fca51929-8101-db45-8df6-f25935c04f98
> Parent UUID: b5f9a301-69f7-0646-8cf1-ba29e0c24fac
> Received UUID: 196a0866-cd05-d24e-bac6-84e8e5eb037a
> Creation time: 2016-12-28 11:51:43 +0100
> Subvolume ID: 506
> Generation: 52271
> Gen at creation: 52271
> Parent ID: 257
> Top level ID: 257
> Flags: readonly
> Snapshot(s):
# btrfs send -p /.back/last_snap_by_script /.back/new_snap > delta
> At subvol /.back/new_snap
# btrfs subvol del /.back/new_snap
> Delete subvolume (no-commit): '/.back/new_snap'
# cat delta | btrfs receive /.back/
> At snapshot new_snap
> ERROR: unlink some/file failed. No such file or directory
And the receive always fails with some ERROR similar to the above! What
I find a bit odd is the identical "Received UUID", even before new_snap
was sent / received ... but maybe that's normal?
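Since two different snapshots on the sender apparently share the same
"Received UUID", one sanity check would be to list all subvolumes with
their UUID columns and look for duplicates. A sketch (I believe -u, -q
and -R print the UUID, parent UUID and received UUID columns in this
btrfs-progs version, but please double check against the man page):
# btrfs subvolume list -u -q -R /
# btrfs subvolume list -R / | grep -c 196a0866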
If instead of "last_snap_by_script" I also create a new read-only
snapshot and send the delta between these two "new" ones, everything
works as expected. But then there are hardly any differences between the
two new snaps ...
I tried to look for differences between the "lab" client and another one
("navi") where backing up works. So far I couldn't really find anything.
I did create both file systems at different points in time (possibly
with different kernels). All fs were created as btrfs and not
"converted" from ext. "lab" has an SSD, "navi" a spinning disc. Both
systems run on Intel CPUs in 64Bit ...
So now I have a snapshot on "lab" which I cannot use as a parent, but
why? What did I do wrong? The whole procedure does work on my other
clients (with the exact same script), why not on the "lab" client? And
this is a recurring problem: I tried deleting all of the snaps (on
both ends) and start all over again ... it will again end up with a
"broken" snapshot eventually.
Up until now using btrfs has been a great experience and I could always
resolve my troubles quite quickly, but this time I don't know what to do.
Thanks in advance for any suggestions and feel free to ask for other /
missing details :-)
Regards
Rene
P.s.: here's my system info from the failing client "lab"
$ uname -a
Linux lab 4.8.0-32-generic #34-Ubuntu SMP Tue Dec 13 14:30:43 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux
$ btrfs --version
btrfs-progs v4.7.3
# btrfs fi show
Label: 'SSD' uuid: 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
Total devices 1 FS bytes used 37.62GiB
devid 1 size 55.90GiB used 41.03GiB path /dev/sdb1
# btrfs fi df /
Data, single: total=40.00GiB, used=37.09GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=1.00GiB, used=543.08MiB
GlobalReserve, single: total=112.38MiB, used=0.00B
$ mount | grep btrfs
/dev/sdb1 on / type btrfs
(rw,noatime,ssd,space_cache,subvolid=257,subvol=/@)
/dev/sdb1 on /home type btrfs
(rw,noatime,ssd,space_cache,subvolid=286,subvol=/@home)
# btrfs scrub start -B /
scrub done for 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
scrub started at Wed Dec 28 12:05:53 2016 and finished after
00:02:24
total bytes scrubbed: 37.76GiB with 0 errors
"house keeping" mostly based on suggestions from Marc's Blog
(http://marc.merlins.org/perso/btrfs/)
# /bin/btrfs balance start -v -dusage=0 /
# /bin/btrfs balance start -v -dusage=60 -musage=60 -v /
I can add a dmesg output on request, but so far I couldn't observe any
reaction there...
* Incremental send receive of snapshot fails
From: Giuseppe Della Bianca @ 2016-12-29 15:31 UTC
To: igeligel; +Cc: linux-btrfs
Hi.
In such cases, I have run btrfs check (not in repair mode!) on every
file system / partition that is involved in creating, sending and
receiving snapshots.
Regards.
Gdb
> Rene Wolf Wed, 28 Dec 2016 03:51:07 -0800:
> Hi all
> I have a problem with incremental snapshot send / receive in btrfs.
> Maybe my google-fu is weak, but I couldn't find any pointers, so here
> goes.
> A few words about my setup first:
> I have multiple clients that back up to a central server. All clients
> (and the server) are running a (K)Ubuntu 16.10 64Bit on btrfs. Backing
> up works with btrfs send / receive, either full or incremental,
> depending on what's available on the server side. All clients have the
> usual (Ubuntu) btrfs layout: 2 subvolumes, one for / and one for /home;
> explicit entries in fstab; root volume not mounted anywhere. For further
> details see the P.s. at the end.
]zac[
* Re: Incremental send receive of snapshot fails
From: Rene Wolf @ 2016-12-29 19:31 UTC
To: Giuseppe Della Bianca; +Cc: linux-btrfs
Hi
As the fs in question is my root, I ran btrfs check from a Xubuntu 16.10
live USB stick:
> Checking filesystem on /dev/sdb1
> UUID: 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
> checking extents [o]
> checking free space cache [.]
> checking fs roots [o]
> found 40577679360 bytes used err is 0
> total csum bytes: 39027548
> total tree bytes: 571277312
> total fs tree bytes: 453001216
> total extent tree bytes: 71745536
> btree space waste bytes: 116244847
> file data blocks allocated: 46952968192
> referenced 44081487872
"err is 0" ... so I guess that means everything is fine?
Out of curiosity I retried the new_snap+send+receive on that same fs
from the live USB: same result (ERROR: unlink ...)
Though I noticed that the exact file in question (reported by ERROR) is
somewhat random ...
For this test from the live USB, I mounted the root volume directly
instead of the subvolumes via fstab. So that doesn't seem to have been
the problem either.
I did some further meditating on what happens here. From what I read and
understand of send/receive, the stream produced by send is about
replaying the fs events. If I give send a parent, it will just replay
the difference between the two snapshots and only produce a stream that
contains the changes needed to "transform" one (parent) snap into the
other (on the receiving end). Now I'm not sure how the receiving end
figures out what the parent is, and whether it has it, but I guess
that's where all those UUIDs come into play.
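If my btrfs-progs is recent enough to have the --dump option of receive,
the stream could probably be inspected without applying it, to see which
parent it actually announces (a sketch; "delta" is the file from the test
above, and I'm assuming the first "snapshot" line of the dump shows the
parent_uuid / parent_transid the receiver will try to match):
# btrfs receive --dump < delta | head -n 5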
There are three UUIDs; comparing them on the sending ("lab") and
receiving ("server") side, I see:
## sender
# btrfs subv show /.back/last_snap_by_script
> /.back/last_snap_by_script
> UUID: b4634a8b-b74b-154a-9f17-1115f6d07524
> Parent UUID: b5f9a301-69f7-0646-8cf1-ba29e0c24fac
> Received UUID: 196a0866-cd05-d24e-bac6-84e8e5eb037a
## receiver
# btrfs subv show /media/bak/lab/root/last_snap_by_script
> UUID: 89321ec1-2de6-0a4c-8f9f-cdd30fa3a7af
> Parent UUID: -
> Received UUID: 196a0866-cd05-d24e-bac6-84e8e5eb037a
So that does make sense to me, as neither "Parent UUID" nor "UUID" is
what would fit our needs (both are kind of local to one system). Instead
the "Received UUID" seems to be the link identifying snaps on both ends
as "equal". But then why do both snaps on the sending side have the
same "Received UUID" for me:
## from my original post, on sender side, this is the "new" delta snapshot
# btrfs subv show /.back/new_snap
> /.back/new_snap
> Name: new_snap
> UUID: fca51929-8101-db45-8df6-f25935c04f98
> Parent UUID: b5f9a301-69f7-0646-8cf1-ba29e0c24fac
> Received UUID: 196a0866-cd05-d24e-bac6-84e8e5eb037a
It would be great if someone could clear this up ... could this point to
the reason why the "replay" stream is produced on a wrong basis?
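If the "Received UUID" really is the link, then the parent announced in
the stream should match exactly one subvolume on the receiving side,
either by its UUID or by its Received UUID (I'm not sure which of the two
the matching uses). A sketch of how I would check that, using the UUIDs
from above:
## on the server
# btrfs subvolume list -u -q -R /media/bak/lab/root | grep -E 'b4634a8b|196a0866'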
Another thing I tried is the "--max-errors 0" option of receive. That
lets it continue after an error, but it produced an endless flood of more
of the same errors. Is that another indicator that the parent on the
sending or receiving side is identified wrongly or not at all?
In any case, thanks for the tip Giuseppe :-)
Regards
Rene
On 29.12.2016 16:31, Giuseppe Della Bianca wrote:
> Hi.
>
> In such cases, I have run btrfs check (not repair mode !!!) in every
> file system/partition that is involved in creating, sending and
> receiving snapshots.
>
>
> Regards.
>
> Gdb
* Re: Incremental send receive of snapshot fails
From: Giuseppe Della Bianca @ 2016-12-30 9:34 UTC
To: Rene Wolf; +Cc: linux-btrfs
Hi.
If btrfs check does not report any error messages, the filesystem is ok.
I do not have enough knowledge to analyze your data.
But if you are sure that none of the filesystems have problems, the
problem is the parent subvolume in the receiving filesystem.
Regards.
Gdb
Rene Wolf:
> Hi
>
>
> As the fs in question is my root, I tried the following using a live usb
> stick of a xubuntu 16.10:
>
> > Checking filesystem on /dev/sdb1
> > UUID: 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
> > checking extents [o]
> > checking free space cache [.]
> > checking fs roots [o]
> > found 40577679360 bytes used err is 0
> > total csum bytes: 39027548
> > total tree bytes: 571277312
> > total fs tree bytes: 453001216
> > total extent tree bytes: 71745536
> > btree space waste bytes: 116244847
> > file data blocks allocated: 46952968192
> > referenced 44081487872
>
> "err is 0" ... so I guess that means everything is fine?
>
]zac[