linux-btrfs.vger.kernel.org archive mirror
* Incremental send receive of snapshot fails
@ 2016-12-28 11:50 Rene Wolf
From: Rene Wolf @ 2016-12-28 11:50 UTC
  To: Btrfs BTRFS

Hi all


I have a problem with incremental snapshot send / receive in btrfs. Maybe 
my google-fu is weak, but I couldn't find any pointers, so here goes.


A few words about my setup first:

I have multiple clients that back up to a central server. All clients 
(and the server) are running a (K)Ubuntu 16.10 64Bit on btrfs. Backing 
up works with btrfs send / receive, either full or incremental, 
depending on what's available on the server side. All clients have the 
usual (Ubuntu) btrfs layout: 2 subvolumes, one for / and one for /home; 
explicit entries in fstab; root volume not mounted anywhere. For further 
details see the P.s. at the end.


Here's what happens:

In general I stick to the example from 
https://btrfs.wiki.kernel.org/index.php/Incremental_Backup . Backing up 
is done daily by a script, and it works successfully on all of my 
clients except one (called "lab").

I start with the first snapshot on "lab" and do a full send to the 
server. This works as expected (sending takes some hours as it is done 
over wifi+ssh). After that is done I send an incremental snapshot based 
on the previous parent. This also works as expected (no error etc). 
Sending deltas then happens once a day, with the script always keeping 
the last two snapshots on the client and many more on the server. Also 
after each run of the script I do a bit of "house keeping" to prevent 
"disk full" etc (see below p.s. for commands).
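
The daily run described above could be sketched roughly as follows. This 
is not the actual script: the function name send_cmd, the host name and 
the target path are hypothetical, and the function only prints the 
pipeline it would run, so the full-versus-incremental choice can be 
checked without touching a file system.

```shell
#!/bin/sh
# Hypothetical helper: print the send/receive pipeline for one backup run.
#   $1 = new read-only snapshot on the client
#   $2 = parent snapshot present on both sides ("" if there is none)
#   $3 = ssh host of the backup server (assumed name)
#   $4 = target directory on the server (assumed path)
send_cmd() {
    new=$1
    parent=$2
    host=$3
    target=$4
    if [ -n "$parent" ]; then
        # Incremental: only the delta against the shared parent is sent.
        printf 'btrfs send -p %s %s | ssh %s btrfs receive %s\n' \
            "$parent" "$new" "$host" "$target"
    else
        # No usable parent on the server: fall back to a full send.
        printf 'btrfs send %s | ssh %s btrfs receive %s\n' \
            "$new" "$host" "$target"
    fi
}
```

For example, send_cmd /.back/today /.back/yesterday server /backup/lab 
prints the incremental form, and passing "" as the second argument 
prints the full send.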

I can't exactly say when, but after some time (possibly the next day) 
snapshot sending fails with an error on the receiving end:
ERROR: unlink some/file failed. No such file or directory

Some searching around led me to this 
https://bugzilla.kernel.org/show_bug.cgi?id=60673 . So I checked to make 
sure my script doesn't use the wrong parent; and it does not. But to 
make really sure I tried a send / receive directly on "lab" without the 
server:

# btrfs subvol snap -r / /.back/new_snap
> Create a readonly snapshot of '/' in '/.back/new_snap'

# btrfs subv show /.back/last_snap_by_script
> /.back/last_snap_by_script
>         Name:                   last_snap_by_script
>         UUID:                   b4634a8b-b74b-154a-9f17-1115f6d07524
>         Parent UUID:            b5f9a301-69f7-0646-8cf1-ba29e0c24fac
>         Received UUID:          196a0866-cd05-d24e-bac6-84e8e5eb037a
>         Creation time:          2016-12-27 17:55:10 +0100
>         Subvolume ID:           486
>         Generation:             52036
>         Gen at creation:        51524
>         Parent ID:              257
>         Top level ID:           257
>         Flags:                  readonly
>         Snapshot(s):

# btrfs subv show /.back/new_snap
> /.back/new_snap
>         Name:                   new_snap
>         UUID:                   fca51929-8101-db45-8df6-f25935c04f98
>         Parent UUID:            b5f9a301-69f7-0646-8cf1-ba29e0c24fac
>         Received UUID:          196a0866-cd05-d24e-bac6-84e8e5eb037a
>         Creation time:          2016-12-28 11:51:43 +0100
>         Subvolume ID:           506
>         Generation:             52271
>         Gen at creation:        52271
>         Parent ID:              257
>         Top level ID:           257
>         Flags:                  readonly
>         Snapshot(s):

# btrfs send -p /.back/last_snap_by_script /.back/new_snap > delta
> At subvol /.back/new_snap

# btrfs subvol del /.back/new_snap
> Delete subvolume (no-commit): '/.back/new_snap'

# cat delta | btrfs receive /.back/
> At snapshot new_snap
> ERROR: unlink some/file failed. No such file or directory

And the receive always fails with some ERROR similar to the above! What 
I find a bit odd is the identical "Received UUID", even before new_snap 
was sent / received ... but maybe that's normal?
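
A minimal sketch for comparing that field between two snapshots without 
reading the listings by eye. The helper name show_field is made up for 
this example; it only parses "btrfs subvolume show" output given on 
stdin. Whether a shared "Received UUID" is related to the failure is not 
established by this check.

```shell
#!/bin/sh
# Hypothetical helper: print one field from `btrfs subvolume show` output
# read on stdin.
#   $1 = field label exactly as printed, e.g. "Received UUID"
show_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")            # split at the first colon only
            if (pos == 0) next
            label = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", label)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (label == key) { print value; exit }
        }
    '
}

# Usage (paths taken from the transcript above):
#   a=$(btrfs subvolume show /.back/last_snap_by_script | show_field "Received UUID")
#   b=$(btrfs subvolume show /.back/new_snap | show_field "Received UUID")
#   [ "$a" = "$b" ] && echo "same Received UUID: $a"
```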

If, instead of using "last_snap_by_script" as the parent, I create a 
second new read-only snapshot and send the delta between the two new 
ones, everything works as expected. But then there are few differences 
between the two new snaps ...
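
The control experiment from the paragraph above, written out as a sketch. 
The function name fresh_delta_test and the snapshot names ctl_a / ctl_b 
are hypothetical; the BTRFS variable is an assumption added so the 
sequence can be dry-run (BTRFS="echo btrfs") before it is run for real 
as root.

```shell
#!/bin/sh
# Assumption: BTRFS may be overridden, e.g. BTRFS="echo btrfs" for a dry run.
BTRFS=${BTRFS:-btrfs}

# Take two fresh read-only snapshots of a subvolume and replay the delta
# between them locally.
#   $1 = subvolume to snapshot (e.g. /)
#   $2 = snapshot directory (e.g. /.back)
#   $3 = file that receives the send stream
fresh_delta_test() {
    src=$1
    dir=$2
    stream=$3
    $BTRFS subvolume snapshot -r "$src" "$dir/ctl_a" || return 1
    $BTRFS subvolume snapshot -r "$src" "$dir/ctl_b" || return 1
    $BTRFS send -p "$dir/ctl_a" "$dir/ctl_b" > "$stream" || return 1
    # The child has to be removed before its stream is received in place.
    $BTRFS subvolume delete "$dir/ctl_b" || return 1
    $BTRFS receive "$dir/" < "$stream"
}
```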

I tried to look for differences between the "lab" client and another one 
("navi") where backing up works. So far I couldn't really find anything. 
I did create both file systems at different points in time (possibly 
with different kernels). All fs were created as btrfs and not 
"converted" from ext. "lab" has an SSD, "navi" a spinning disc. Both 
systems run on Intel CPUs in 64Bit ...


So now I have a snapshot on "lab" which I cannot use as a parent, but 
why? What did I do wrong? The whole procedure does work on my other 
clients (with the exact same script), why not on the "lab" client? And 
this is a recurring problem: I tried deleting all of the snaps (on 
both ends) and starting all over again ... it will again end up with a 
"broken" snapshot eventually.


Up until now using btrfs has been a great experience and I could always 
resolve my troubles quite quickly, but this time I don't know what to do.
Thanks in advance for any suggestions and feel free to ask for other / 
missing details :-)


Regards
Rene


P.s.: here's my system info from the failing client "lab"

$ uname -a
Linux lab 4.8.0-32-generic #34-Ubuntu SMP Tue Dec 13 14:30:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ btrfs --version
btrfs-progs v4.7.3

# btrfs fi show
Label: 'SSD'  uuid: 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
         Total devices 1 FS bytes used 37.62GiB
         devid    1 size 55.90GiB used 41.03GiB path /dev/sdb1

# btrfs fi df /
Data, single: total=40.00GiB, used=37.09GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=1.00GiB, used=543.08MiB
GlobalReserve, single: total=112.38MiB, used=0.00B

$ mount | grep btrfs
/dev/sdb1 on / type btrfs (rw,noatime,ssd,space_cache,subvolid=257,subvol=/@)
/dev/sdb1 on /home type btrfs (rw,noatime,ssd,space_cache,subvolid=286,subvol=/@home)

# btrfs scrub start -B /
scrub done for 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
         scrub started at Wed Dec 28 12:05:53 2016 and finished after 00:02:24
         total bytes scrubbed: 37.76GiB with 0 errors

"house keeping" mostly based on suggestions from Marc's Blog 
(http://marc.merlins.org/perso/btrfs/)
# /bin/btrfs balance start -v -dusage=0 /
# /bin/btrfs balance start -v -dusage=60 -musage=60 -v /

I can add a dmesg output on request, but so far I couldn't observe any 
reaction there...
