linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] btrfs-progs: receive explicit parent support
@ 2015-04-26 10:12 Lauri Võsandi
  2015-04-26 11:19 ` Hugo Mills
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Lauri Võsandi @ 2015-04-26 10:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Lauri Võsandi

This patch adds command-line flag -p to btrfs receive
which makes it possible to disable automatic parent
search for incremental snapshots and use explicitly
specified path instead.

Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
---
 cmds-receive.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index b7cf3f9..391d281 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -61,6 +61,7 @@ struct btrfs_receive
 	char *root_path;
 	char *dest_dir_path; /* relative to root_path */
 	char *full_subvol_path;
+	char *explicit_parent_path;
 	int dest_dir_chroot;
 
 	struct subvol_info *cur_subvol;
@@ -220,20 +221,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
 		fprintf(stderr, "receiving snapshot %s uuid=%s, "
 				"ctransid=%llu ", path, uuid_str,
 				r->cur_subvol->stransid);
-		uuid_unparse(parent_uuid, uuid_str);
-		fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
-				uuid_str, parent_ctransid);
 	}
 
 	memset(&args_v2, 0, sizeof(args_v2));
 	strncpy_null(args_v2.name, path);
 
-	parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
-			parent_ctransid, NULL, subvol_search_by_received_uuid);
-	if (!parent_subvol) {
+	if (r->explicit_parent_path) {
+		if (g_verbose) {
+			fprintf(stderr, "using explicit parent %s\n",
+					r->explicit_parent_path);
+		}
+		parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
+			0, r->explicit_parent_path, subvol_search_by_path);
+	} else {
+		if (g_verbose) {
+			uuid_unparse(parent_uuid, uuid_str);
+			fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
+					uuid_str, parent_ctransid);
+		}
 		parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
-				parent_ctransid, NULL, subvol_search_by_uuid);
+			parent_ctransid, NULL, subvol_search_by_received_uuid);
+		if (!parent_subvol) {
+			parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
+					parent_ctransid, NULL, subvol_search_by_uuid);
+		}
 	}
+
 	if (!parent_subvol) {
 		ret = -ENOENT;
 		fprintf(stderr, "ERROR: could not find parent subvolume\n");
@@ -962,11 +975,14 @@ int cmd_receive(int argc, char **argv)
 			{ NULL, 0, NULL, 0 }
 		};
 
-		c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
+		c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
 		if (c < 0)
 			break;
 
 		switch (c) {
+		case 'p':
+			r.explicit_parent_path = optarg;
+			break;
 		case 'v':
 			g_verbose++;
 			break;
@@ -1028,6 +1044,8 @@ const char * const cmd_receive_usage[] = {
 	"                 in the data stream. Without this option,",
 	"                 the receiver terminates only if an error",
 	"                 is recognized or on EOF.",
+	"-p <subvol>      Disables the automatic searching for parents",
+	"                 if incremental streams are received.",
 	"-C|--chroot      confine the process to <mount> using chroot",
 	"--max-errors <N> Terminate as soon as N errors happened while",
 	"                 processing commands from the send stream.",
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-26 10:12 [PATCH] btrfs-progs: receive explicit parent support Lauri Võsandi
@ 2015-04-26 11:19 ` Hugo Mills
  2015-04-26 11:37   ` lauri
  2015-05-07 19:06   ` lauri
  2015-04-27  7:28 ` Stefan Behrens
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Hugo Mills @ 2015-04-26 11:19 UTC (permalink / raw)
  To: Lauri Võsandi; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 5302 bytes --]

On Sun, Apr 26, 2015 at 12:12:45PM +0200, Lauri Võsandi wrote:
> This patch adds command-line flag -p to btrfs receive
> which makes it possible to disable automatic parent
> search for incremental snapshots and use explicitly
> specified path instead.

   As I said when we discussed this on IRC, I think this is absolutely
the wrong way to go about solving this problem. This -p option has two
failure modes: the first is that a referenced file simply doesn't
exist in the supplied parent, and the receive fails hard partway
through. The second failure mode is more subtle and not checked for in
any way, which is that the referenced file exists, but doesn't contain
the same data as the original parent subvolume on the send side. In
that case, you will end up with silent corruption of files.

   This is a NAK from me.

   I believe that the correct way of dealing with the issue of
non-transitive sends is twofold:

1) When sending a subvolume which already has a received-uuid field,
   preserve that field in the receiving side rather than overwriting
   it.

2) When specifying the parent to the receiving side, send both the
   uuid and received-uuid fields, and search against both of these in
   the subvolumes on the receiving side to find the parent. This will
   (I think) probably involve a bump of the stream format, and adding
   the received-uuid to the UUID tree.

   Part 1) means that, writing (U, R) for UUID and Received-UUID, we
currently have the following situation:

Subvol A on source side: (A, -)
Send this to A' on target side: (A', A)
Send this back to A'' on source side: (A'', A')

and would change this behaviour to:

Subvol A on source side: (A, -)
Send this to A' on target side: (A', A)
Send this back to A'' on source side: (A'', A) <-- Note the A here, not A'

   More later, when I've had a little time to play with things and
think through the semantics properly.

   Hugo.

> Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
> ---
>  cmds-receive.c | 34 ++++++++++++++++++++++++++--------
>  1 file changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/cmds-receive.c b/cmds-receive.c
> index b7cf3f9..391d281 100644
> --- a/cmds-receive.c
> +++ b/cmds-receive.c
> @@ -61,6 +61,7 @@ struct btrfs_receive
>  	char *root_path;
>  	char *dest_dir_path; /* relative to root_path */
>  	char *full_subvol_path;
> +	char *explicit_parent_path;
>  	int dest_dir_chroot;
>  
>  	struct subvol_info *cur_subvol;
> @@ -220,20 +221,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
>  		fprintf(stderr, "receiving snapshot %s uuid=%s, "
>  				"ctransid=%llu ", path, uuid_str,
>  				r->cur_subvol->stransid);
> -		uuid_unparse(parent_uuid, uuid_str);
> -		fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
> -				uuid_str, parent_ctransid);
>  	}
>  
>  	memset(&args_v2, 0, sizeof(args_v2));
>  	strncpy_null(args_v2.name, path);
>  
> -	parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
> -			parent_ctransid, NULL, subvol_search_by_received_uuid);
> -	if (!parent_subvol) {
> +	if (r->explicit_parent_path) {
> +		if (g_verbose) {
> +			fprintf(stderr, "using explicit parent %s\n",
> +					r->explicit_parent_path);
> +		}
> +		parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
> +			0, r->explicit_parent_path, subvol_search_by_path);
> +	} else {
> +		if (g_verbose) {
> +			uuid_unparse(parent_uuid, uuid_str);
> +			fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
> +					uuid_str, parent_ctransid);
> +		}
>  		parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
> -				parent_ctransid, NULL, subvol_search_by_uuid);
> +			parent_ctransid, NULL, subvol_search_by_received_uuid);
> +		if (!parent_subvol) {
> +			parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
> +					parent_ctransid, NULL, subvol_search_by_uuid);
> +		}
>  	}
> +
>  	if (!parent_subvol) {
>  		ret = -ENOENT;
>  		fprintf(stderr, "ERROR: could not find parent subvolume\n");
> @@ -962,11 +975,14 @@ int cmd_receive(int argc, char **argv)
>  			{ NULL, 0, NULL, 0 }
>  		};
>  
> -		c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
> +		c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
>  		if (c < 0)
>  			break;
>  
>  		switch (c) {
> +		case 'p':
> +			r.explicit_parent_path = optarg;
> +			break;
>  		case 'v':
>  			g_verbose++;
>  			break;
> @@ -1028,6 +1044,8 @@ const char * const cmd_receive_usage[] = {
>  	"                 in the data stream. Without this option,",
>  	"                 the receiver terminates only if an error",
>  	"                 is recognized or on EOF.",
> +	"-p <subvol>      Disables the automatic searching for parents",
> +	"                 if incremental streams are received.",
>  	"-C|--chroot      confine the process to <mount> using chroot",
>  	"--max-errors <N> Terminate as soon as N errors happened while",
>  	"                 processing commands from the send stream.",

-- 
Hugo Mills             | Questions are a burden, and answers a prison for
hugo@... carfax.org.uk | oneself
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                          The Prisoner

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-26 11:19 ` Hugo Mills
@ 2015-04-26 11:37   ` lauri
  2015-05-14 17:35     ` Hugo Mills
  2015-05-07 19:06   ` lauri
  1 sibling, 1 reply; 12+ messages in thread
From: lauri @ 2015-04-26 11:37 UTC (permalink / raw)
  To: Hugo Mills, Lauri Võsandi, linux-btrfs

Hi,

> Subvol A on source side: (A, -)
> Send this to A' on target side: (A', A)
> Send this back to A'' on source side: (A'', A) <-- Note the A here, not A'

I also think your approach is the real solution to the problem, but as
some pointed out on IRC this changes the behaviour of btrfs receive
and will break things for someone. When I asked whose obscure workflow
does it break nobody could come up with any reasonable example whereas
sending snapshots back and forth seems to be the major usecase for
btrfs receive and it's impossible to for example restore snapshots
with current implementation.

I know that -p may fail horribly but the idea was to have *option* to
replace parent lookup logic and instead of relying on UUID-s simply
have it specified by userspace application.

>    More later, when I've had a little time to play with things and
> think through the semantics properly.

You want to give it a try?

-- 
Lauri Võsandi
tel: +372 53329412
e-mail: lauri.vosandi@gmail.com
blog: http://lauri.vosandi.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-26 10:12 [PATCH] btrfs-progs: receive explicit parent support Lauri Võsandi
  2015-04-26 11:19 ` Hugo Mills
@ 2015-04-27  7:28 ` Stefan Behrens
  2015-04-27  9:23   ` lauri
  2015-05-07 17:50 ` [PATCH v2] This patch adds command-line flag -p to btrfs receive which makes it possible to disable automatic parent search for incremental snapshots and use explicitly specified path instead. Works also with multiple incremental snapshots Lauri Võsandi
  2015-05-07 18:53 ` [PATCH v2] btrfs-progs: receive explicit parent support Lauri Võsandi
  3 siblings, 1 reply; 12+ messages in thread
From: Stefan Behrens @ 2015-04-27  7:28 UTC (permalink / raw)
  To: Lauri Võsandi, linux-btrfs

On Sun, 26 Apr 2015 12:12:45 +0200, Lauri Võsandi wrote:
> This patch adds command-line flag -p to btrfs receive
> which makes it possible to disable automatic parent
> search for incremental snapshots and use explicitly
> specified path instead.
> 
> Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
> ---
>  cmds-receive.c | 34 ++++++++++++++++++++++++++--------
>  1 file changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/cmds-receive.c b/cmds-receive.c
> index b7cf3f9..391d281 100644
> --- a/cmds-receive.c
> +++ b/cmds-receive.c
> @@ -61,6 +61,7 @@ struct btrfs_receive
>  	char *root_path;
>  	char *dest_dir_path; /* relative to root_path */
>  	char *full_subvol_path;
> +	char *explicit_parent_path;
>  	int dest_dir_chroot;
>  
>  	struct subvol_info *cur_subvol;
> @@ -220,20 +221,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
>  		fprintf(stderr, "receiving snapshot %s uuid=%s, "
>  				"ctransid=%llu ", path, uuid_str,
>  				r->cur_subvol->stransid);
> -		uuid_unparse(parent_uuid, uuid_str);
> -		fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
> -				uuid_str, parent_ctransid);
>  	}
>  
>  	memset(&args_v2, 0, sizeof(args_v2));
>  	strncpy_null(args_v2.name, path);
>  
> -	parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
> -			parent_ctransid, NULL, subvol_search_by_received_uuid);
> -	if (!parent_subvol) {
> +	if (r->explicit_parent_path) {
> +		if (g_verbose) {
> +			fprintf(stderr, "using explicit parent %s\n",
> +					r->explicit_parent_path);
> +		}
> +		parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
> +			0, r->explicit_parent_path, subvol_search_by_path);

This won't work if you receive more than one snapshot, each one derived
from the previous one ("btrfs send -e snap1 snap2 snap3 snap4"). You
would always use the first one as the parent, not the predecessor.
That's implemented differently for the -p option in
git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git


> +	} else {
> +		if (g_verbose) {
> +			uuid_unparse(parent_uuid, uuid_str);
> +			fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
> +					uuid_str, parent_ctransid);
> +		}
>  		parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
> -				parent_ctransid, NULL, subvol_search_by_uuid);
> +			parent_ctransid, NULL, subvol_search_by_received_uuid);
> +		if (!parent_subvol) {
> +			parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
> +					parent_ctransid, NULL, subvol_search_by_uuid);
> +		}

This used to be a search for the received_uuid only. Why is this code
changed like this in the branch that is executed when -p is not
specified, in a patch that has the goal to add -p with new functionality
if -p is specified?


>  	}
> +
>  	if (!parent_subvol) {
>  		ret = -ENOENT;
>  		fprintf(stderr, "ERROR: could not find parent subvolume\n");
> @@ -962,11 +975,14 @@ int cmd_receive(int argc, char **argv)
>  			{ NULL, 0, NULL, 0 }
>  		};
>  
> -		c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
> +		c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
>  		if (c < 0)
>  			break;
>  
>  		switch (c) {
> +		case 'p':
> +			r.explicit_parent_path = optarg;
> +			break;
>  		case 'v':
>  			g_verbose++;
>  			break;
> @@ -1028,6 +1044,8 @@ const char * const cmd_receive_usage[] = {
>  	"                 in the data stream. Without this option,",
>  	"                 the receiver terminates only if an error",
>  	"                 is recognized or on EOF.",
> +	"-p <subvol>      Disables the automatic searching for parents",
> +	"                 if incremental streams are received.",
>  	"-C|--chroot      confine the process to <mount> using chroot",
>  	"--max-errors <N> Terminate as soon as N errors happened while",
>  	"                 processing commands from the send stream.",
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-27  7:28 ` Stefan Behrens
@ 2015-04-27  9:23   ` lauri
  2015-04-27 19:29     ` Stefan Behrens
  0 siblings, 1 reply; 12+ messages in thread
From: lauri @ 2015-04-27  9:23 UTC (permalink / raw)
  To: Stefan Behrens; +Cc: linux-btrfs

Hi,

> This won't work if you receive more than one snapshot, each one derived
> from the previous one ("btrfs send -e snap1 snap2 snap3 snap4"). You
> would always use the first one as the parent, not the predecessor.
> That's implemented differently for the -p option in
> git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git

I'll take another look on the multi-subvolume stream mode.

> This used to be a search for the received_uuid only. Why is this code
> changed like this in the branch that is executed when -p is not
> specified, in a patch that has the goal to add -p with new functionality
> if -p is specified?

Many things have changed since far-progs, take a look in the original
source I used as basis for my patch:

https://github.com/kdave/btrfs-progs/blob/1b7dd327f43777bcdd217a0500e6bda78128a290/cmds-receive.c#L233

I am pretty sure lookup logic isn't affected if -p is not used.

2015-04-27 9:28 GMT+02:00 Stefan Behrens <sbehrens@giantdisaster.de>:
> On Sun, 26 Apr 2015 12:12:45 +0200, Lauri Võsandi wrote:
>> This patch adds command-line flag -p to btrfs receive
>> which makes it possible to disable automatic parent
>> search for incremental snapshots and use explicitly
>> specified path instead.
>>
>> Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
>> ---
>>  cmds-receive.c | 34 ++++++++++++++++++++++++++--------
>>  1 file changed, 26 insertions(+), 8 deletions(-)
>>
>> diff --git a/cmds-receive.c b/cmds-receive.c
>> index b7cf3f9..391d281 100644
>> --- a/cmds-receive.c
>> +++ b/cmds-receive.c
>> @@ -61,6 +61,7 @@ struct btrfs_receive
>>       char *root_path;
>>       char *dest_dir_path; /* relative to root_path */
>>       char *full_subvol_path;
>> +     char *explicit_parent_path;
>>       int dest_dir_chroot;
>>
>>       struct subvol_info *cur_subvol;
>> @@ -220,20 +221,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
>>               fprintf(stderr, "receiving snapshot %s uuid=%s, "
>>                               "ctransid=%llu ", path, uuid_str,
>>                               r->cur_subvol->stransid);
>> -             uuid_unparse(parent_uuid, uuid_str);
>> -             fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
>> -                             uuid_str, parent_ctransid);
>>       }
>>
>>       memset(&args_v2, 0, sizeof(args_v2));
>>       strncpy_null(args_v2.name, path);
>>
>> -     parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
>> -                     parent_ctransid, NULL, subvol_search_by_received_uuid);
>> -     if (!parent_subvol) {
>> +     if (r->explicit_parent_path) {
>> +             if (g_verbose) {
>> +                     fprintf(stderr, "using explicit parent %s\n",
>> +                                     r->explicit_parent_path);
>> +             }
>> +             parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
>> +                     0, r->explicit_parent_path, subvol_search_by_path);
>
> This won't work if you receive more than one snapshot, each one derived
> from the previous one ("btrfs send -e snap1 snap2 snap3 snap4"). You
> would always use the first one as the parent, not the predecessor.
> That's implemented differently for the -p option in
> git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git
>
>
>> +     } else {
>> +             if (g_verbose) {
>> +                     uuid_unparse(parent_uuid, uuid_str);
>> +                     fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
>> +                                     uuid_str, parent_ctransid);
>> +             }
>>               parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
>> -                             parent_ctransid, NULL, subvol_search_by_uuid);
>> +                     parent_ctransid, NULL, subvol_search_by_received_uuid);
>> +             if (!parent_subvol) {
>> +                     parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
>> +                                     parent_ctransid, NULL, subvol_search_by_uuid);
>> +             }
>
> This used to be a search for the received_uuid only. Why is this code
> changed like this in the branch that is executed when -p is not
> specified, in a patch that has the goal to add -p with new functionality
> if -p is specified?
>
>
>>       }
>> +
>>       if (!parent_subvol) {
>>               ret = -ENOENT;
>>               fprintf(stderr, "ERROR: could not find parent subvolume\n");
>> @@ -962,11 +975,14 @@ int cmd_receive(int argc, char **argv)
>>                       { NULL, 0, NULL, 0 }
>>               };
>>
>> -             c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
>> +             c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
>>               if (c < 0)
>>                       break;
>>
>>               switch (c) {
>> +             case 'p':
>> +                     r.explicit_parent_path = optarg;
>> +                     break;
>>               case 'v':
>>                       g_verbose++;
>>                       break;
>> @@ -1028,6 +1044,8 @@ const char * const cmd_receive_usage[] = {
>>       "                 in the data stream. Without this option,",
>>       "                 the receiver terminates only if an error",
>>       "                 is recognized or on EOF.",
>> +     "-p <subvol>      Disables the automatic searching for parents",
>> +     "                 if incremental streams are received.",
>>       "-C|--chroot      confine the process to <mount> using chroot",
>>       "--max-errors <N> Terminate as soon as N errors happened while",
>>       "                 processing commands from the send stream.",
>>
>



-- 
Lauri Võsandi
tel: +372 53329412
e-mail: lauri.vosandi@gmail.com
blog: http://lauri.vosandi.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-27  9:23   ` lauri
@ 2015-04-27 19:29     ` Stefan Behrens
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Behrens @ 2015-04-27 19:29 UTC (permalink / raw)
  To: lauri; +Cc: linux-btrfs

On 4/27/2015 11:23 AM, lauri wrote:
> Hi,
>
>> This won't work if you receive more than one snapshot, each one derived
>> from the previous one ("btrfs send -e snap1 snap2 snap3 snap4"). You
>> would always use the first one as the parent, not the predecessor.
>> That's implemented differently for the -p option in
>> git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git
>
> I'll take another look on the multi-subvolume stream mode.
>
>> This used to be a search for the received_uuid only. Why is this code
>> changed like this in the branch that is executed when -p is not
>> specified, in a patch that has the goal to add -p with new functionality
>> if -p is specified?
>
> Many things have changed since far-progs, take a look in the original
> source I used as basis for my patch:
>
> https://github.com/kdave/btrfs-progs/blob/1b7dd327f43777bcdd217a0500e6bda78128a290/cmds-receive.c#L233
>
> I am pretty sure lookup logic isn't affected if -p is not used.

You are right! My bad, not looking precisely enough. Before the patch, 
the code did the two lookups as well.

But the first issue is real. One possible fix is to update 
explicit_parent_path after each received snapshot.


>
> 2015-04-27 9:28 GMT+02:00 Stefan Behrens <sbehrens@giantdisaster.de>:
>> On Sun, 26 Apr 2015 12:12:45 +0200, Lauri Võsandi wrote:
>>> This patch adds command-line flag -p to btrfs receive
>>> which makes it possible to disable automatic parent
>>> search for incremental snapshots and use explicitly
>>> specified path instead.
>>>
>>> Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
>>> ---
>>>   cmds-receive.c | 34 ++++++++++++++++++++++++++--------
>>>   1 file changed, 26 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/cmds-receive.c b/cmds-receive.c
>>> index b7cf3f9..391d281 100644
>>> --- a/cmds-receive.c
>>> +++ b/cmds-receive.c
>>> @@ -61,6 +61,7 @@ struct btrfs_receive
>>>        char *root_path;
>>>        char *dest_dir_path; /* relative to root_path */
>>>        char *full_subvol_path;
>>> +     char *explicit_parent_path;
>>>        int dest_dir_chroot;
>>>
>>>        struct subvol_info *cur_subvol;
>>> @@ -220,20 +221,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
>>>                fprintf(stderr, "receiving snapshot %s uuid=%s, "
>>>                                "ctransid=%llu ", path, uuid_str,
>>>                                r->cur_subvol->stransid);
>>> -             uuid_unparse(parent_uuid, uuid_str);
>>> -             fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
>>> -                             uuid_str, parent_ctransid);
>>>        }
>>>
>>>        memset(&args_v2, 0, sizeof(args_v2));
>>>        strncpy_null(args_v2.name, path);
>>>
>>> -     parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
>>> -                     parent_ctransid, NULL, subvol_search_by_received_uuid);
>>> -     if (!parent_subvol) {
>>> +     if (r->explicit_parent_path) {
>>> +             if (g_verbose) {
>>> +                     fprintf(stderr, "using explicit parent %s\n",
>>> +                                     r->explicit_parent_path);
>>> +             }
>>> +             parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
>>> +                     0, r->explicit_parent_path, subvol_search_by_path);
>>
>> This won't work if you receive more than one snapshot, each one derived
>> from the previous one ("btrfs send -e snap1 snap2 snap3 snap4"). You
>> would always use the first one as the parent, not the predecessor.
>> That's implemented differently for the -p option in
>> git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git
>>
>>
>>> +     } else {
>>> +             if (g_verbose) {
>>> +                     uuid_unparse(parent_uuid, uuid_str);
>>> +                     fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
>>> +                                     uuid_str, parent_ctransid);
>>> +             }
>>>                parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
>>> -                             parent_ctransid, NULL, subvol_search_by_uuid);
>>> +                     parent_ctransid, NULL, subvol_search_by_received_uuid);
>>> +             if (!parent_subvol) {
>>> +                     parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
>>> +                                     parent_ctransid, NULL, subvol_search_by_uuid);
>>> +             }
>>
>> This used to be a search for the received_uuid only. Why is this code
>> changed like this in the branch that is executed when -p is not
>> specified, in a patch that has the goal to add -p with new functionality
>> if -p is specified?
>>
>>
>>>        }
>>> +
>>>        if (!parent_subvol) {
>>>                ret = -ENOENT;
>>>                fprintf(stderr, "ERROR: could not find parent subvolume\n");
>>> @@ -962,11 +975,14 @@ int cmd_receive(int argc, char **argv)
>>>                        { NULL, 0, NULL, 0 }
>>>                };
>>>
>>> -             c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
>>> +             c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
>>>                if (c < 0)
>>>                        break;
>>>
>>>                switch (c) {
>>> +             case 'p':
>>> +                     r.explicit_parent_path = optarg;
>>> +                     break;
>>>                case 'v':
>>>                        g_verbose++;
>>>                        break;
>>> @@ -1028,6 +1044,8 @@ const char * const cmd_receive_usage[] = {
>>>        "                 in the data stream. Without this option,",
>>>        "                 the receiver terminates only if an error",
>>>        "                 is recognized or on EOF.",
>>> +     "-p <subvol>      Disables the automatic searching for parents",
>>> +     "                 if incremental streams are received.",
>>>        "-C|--chroot      confine the process to <mount> using chroot",
>>>        "--max-errors <N> Terminate as soon as N errors happened while",
>>>        "                 processing commands from the send stream.",
>>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] This patch adds command-line flag -p to btrfs receive which makes it possible to disable automatic parent search for incremental snapshots and use explicitly specified path instead. Works also with multiple incremental snapshots.
  2015-04-26 10:12 [PATCH] btrfs-progs: receive explicit parent support Lauri Võsandi
  2015-04-26 11:19 ` Hugo Mills
  2015-04-27  7:28 ` Stefan Behrens
@ 2015-05-07 17:50 ` Lauri Võsandi
  2015-05-07 18:53 ` [PATCH v2] btrfs-progs: receive explicit parent support Lauri Võsandi
  3 siblings, 0 replies; 12+ messages in thread
From: Lauri Võsandi @ 2015-05-07 17:50 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Lauri Võsandi

Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
---
 cmds-receive.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index b7cf3f9..21a6f56 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -61,6 +61,7 @@ struct btrfs_receive
 	char *root_path;
 	char *dest_dir_path; /* relative to root_path */
 	char *full_subvol_path;
+	char *explicit_parent_path;
 	int dest_dir_chroot;
 
 	struct subvol_info *cur_subvol;
@@ -152,6 +153,12 @@ static int process_subvol(const char *path, const u8 *uuid, u64 ctransid,
 	if (ret < 0)
 		goto out;
 
+	if (r->explicit_parent_path) {
+		free(r->explicit_parent_path);
+		r->explicit_parent_path = NULL;
+		fprintf(stderr, "WARNING: Receiving full snapshot, no need to specify explicit parent!\n");
+	}
+
 	r->cur_subvol = calloc(1, sizeof(*r->cur_subvol));
 
 	if (strlen(r->dest_dir_path) == 0)
@@ -220,20 +227,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
 		fprintf(stderr, "receiving snapshot %s uuid=%s, "
 				"ctransid=%llu ", path, uuid_str,
 				r->cur_subvol->stransid);
-		uuid_unparse(parent_uuid, uuid_str);
-		fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
-				uuid_str, parent_ctransid);
 	}
 
 	memset(&args_v2, 0, sizeof(args_v2));
 	strncpy_null(args_v2.name, path);
 
-	parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
-			parent_ctransid, NULL, subvol_search_by_received_uuid);
-	if (!parent_subvol) {
+	if (r->explicit_parent_path) {
+		if (g_verbose) {
+			fprintf(stderr, "using explicit parent %s\n",
+					r->explicit_parent_path);
+		}
+		parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
+			0, r->explicit_parent_path, subvol_search_by_path);
+	} else {
+		if (g_verbose) {
+			uuid_unparse(parent_uuid, uuid_str);
+			fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
+					uuid_str, parent_ctransid);
+		}
 		parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
-				parent_ctransid, NULL, subvol_search_by_uuid);
+			parent_ctransid, NULL, subvol_search_by_received_uuid);
+		if (!parent_subvol) {
+			parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
+					parent_ctransid, NULL, subvol_search_by_uuid);
+		}
 	}
+
 	if (!parent_subvol) {
 		ret = -ENOENT;
 		fprintf(stderr, "ERROR: could not find parent subvolume\n");
@@ -276,6 +295,12 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
 		goto out;
 	}
 
+	if (r->explicit_parent_path) {
+		free(r->explicit_parent_path);
+		r->explicit_parent_path = strdup(r->cur_subvol->path);
+		printf("New explicit parent is %s\n", r->explicit_parent_path);
+	}
+
 out:
 	if (parent_subvol) {
 		free(parent_subvol->path);
@@ -922,6 +947,10 @@ out:
 	r->full_subvol_path = NULL;
 	r->dest_dir_path = NULL;
 	free(dest_dir_full_path);
+	if (r->explicit_parent_path) {
+		free(r->explicit_parent_path);
+		r->explicit_parent_path = NULL;
+	}
 	if (r->cur_subvol) {
 		free(r->cur_subvol->path);
 		free(r->cur_subvol);
@@ -962,11 +991,14 @@ int cmd_receive(int argc, char **argv)
 			{ NULL, 0, NULL, 0 }
 		};
 
-		c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
+		c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
 		if (c < 0)
 			break;
 
 		switch (c) {
+		case 'p':
+			r.explicit_parent_path = strdup(optarg);
+			break;
 		case 'v':
 			g_verbose++;
 			break;
@@ -1028,6 +1060,8 @@ const char * const cmd_receive_usage[] = {
 	"                 in the data stream. Without this option,",
 	"                 the receiver terminates only if an error",
 	"                 is recognized or on EOF.",
+	"-p <subvol>      Disables the automatic searching for parents",
+	"                 if incremental streams are received.",
 	"-C|--chroot      confine the process to <mount> using chroot",
 	"--max-errors <N> Terminate as soon as N errors happened while",
 	"                 processing commands from the send stream.",
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2] btrfs-progs: receive explicit parent support
  2015-04-26 10:12 [PATCH] btrfs-progs: receive explicit parent support Lauri Võsandi
                   ` (2 preceding siblings ...)
  2015-05-07 17:50 ` [PATCH v2] This patch adds command-line flag -p to btrfs receive which makes it possible to disable automatic parent search for incremental snapshots and use explicitly specified path instead. Works also with multiple incremental snapshots Lauri Võsandi
@ 2015-05-07 18:53 ` Lauri Võsandi
  2015-05-12 16:17   ` David Sterba
  3 siblings, 1 reply; 12+ messages in thread
From: Lauri Võsandi @ 2015-05-07 18:53 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Lauri Võsandi

This patch adds command-line flag -p to btrfs receive
which makes it possible to disable automatic parent
search for incremental snapshots and use explicitly
specified path instead. Works also with multiple
incremental snapshots.

Signed-off-by: Lauri Võsandi <lauri.vosandi@gmail.com>
---
 cmds-receive.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index b7cf3f9..21a6f56 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -61,6 +61,7 @@ struct btrfs_receive
 	char *root_path;
 	char *dest_dir_path; /* relative to root_path */
 	char *full_subvol_path;
+	char *explicit_parent_path;
 	int dest_dir_chroot;
 
 	struct subvol_info *cur_subvol;
@@ -152,6 +153,12 @@ static int process_subvol(const char *path, const u8 *uuid, u64 ctransid,
 	if (ret < 0)
 		goto out;
 
+	if (r->explicit_parent_path) {
+		free(r->explicit_parent_path);
+		r->explicit_parent_path = NULL;
+		fprintf(stderr, "WARNING: Receiving full snapshot, no need to specify explicit parent!\n");
+	}
+
 	r->cur_subvol = calloc(1, sizeof(*r->cur_subvol));
 
 	if (strlen(r->dest_dir_path) == 0)
@@ -220,20 +227,32 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
 		fprintf(stderr, "receiving snapshot %s uuid=%s, "
 				"ctransid=%llu ", path, uuid_str,
 				r->cur_subvol->stransid);
-		uuid_unparse(parent_uuid, uuid_str);
-		fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
-				uuid_str, parent_ctransid);
 	}
 
 	memset(&args_v2, 0, sizeof(args_v2));
 	strncpy_null(args_v2.name, path);
 
-	parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
-			parent_ctransid, NULL, subvol_search_by_received_uuid);
-	if (!parent_subvol) {
+	if (r->explicit_parent_path) {
+		if (g_verbose) {
+			fprintf(stderr, "using explicit parent %s\n",
+					r->explicit_parent_path);
+		}
+		parent_subvol = subvol_uuid_search(&r->sus, 0, NULL,
+			0, r->explicit_parent_path, subvol_search_by_path);
+	} else {
+		if (g_verbose) {
+			uuid_unparse(parent_uuid, uuid_str);
+			fprintf(stderr, "parent_uuid=%s, parent_ctransid=%llu\n",
+					uuid_str, parent_ctransid);
+		}
 		parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
-				parent_ctransid, NULL, subvol_search_by_uuid);
+			parent_ctransid, NULL, subvol_search_by_received_uuid);
+		if (!parent_subvol) {
+			parent_subvol = subvol_uuid_search(&r->sus, 0, parent_uuid,
+					parent_ctransid, NULL, subvol_search_by_uuid);
+		}
 	}
+
 	if (!parent_subvol) {
 		ret = -ENOENT;
 		fprintf(stderr, "ERROR: could not find parent subvolume\n");
@@ -276,6 +295,12 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
 		goto out;
 	}
 
+	if (r->explicit_parent_path) {
+		free(r->explicit_parent_path);
+		r->explicit_parent_path = strdup(r->cur_subvol->path);
+		printf("New explicit parent is %s\n", r->explicit_parent_path);
+	}
+
 out:
 	if (parent_subvol) {
 		free(parent_subvol->path);
@@ -922,6 +947,10 @@ out:
 	r->full_subvol_path = NULL;
 	r->dest_dir_path = NULL;
 	free(dest_dir_full_path);
+	if (r->explicit_parent_path) {
+		free(r->explicit_parent_path);
+		r->explicit_parent_path = NULL;
+	}
 	if (r->cur_subvol) {
 		free(r->cur_subvol->path);
 		free(r->cur_subvol);
@@ -962,11 +991,14 @@ int cmd_receive(int argc, char **argv)
 			{ NULL, 0, NULL, 0 }
 		};
 
-		c = getopt_long(argc, argv, "Cevf:", long_opts, NULL);
+		c = getopt_long(argc, argv, "Cevf:p:", long_opts, NULL);
 		if (c < 0)
 			break;
 
 		switch (c) {
+		case 'p':
+			r.explicit_parent_path = strdup(optarg);
+			break;
 		case 'v':
 			g_verbose++;
 			break;
@@ -1028,6 +1060,8 @@ const char * const cmd_receive_usage[] = {
 	"                 in the data stream. Without this option,",
 	"                 the receiver terminates only if an error",
 	"                 is recognized or on EOF.",
+	"-p <subvol>      Disables the automatic searching for parents",
+	"                 if incremental streams are received.",
 	"-C|--chroot      confine the process to <mount> using chroot",
 	"--max-errors <N> Terminate as soon as N errors happened while",
 	"                 processing commands from the send stream.",
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-26 11:19 ` Hugo Mills
  2015-04-26 11:37   ` lauri
@ 2015-05-07 19:06   ` lauri
  1 sibling, 0 replies; 12+ messages in thread
From: lauri @ 2015-05-07 19:06 UTC (permalink / raw)
  To: Hugo Mills, Lauri Võsandi, linux-btrfs

Even if the UUID stuff gets fixed there will be plenty of machines
which are unable to adapt the new approach due to conflicting
received_uuid values and having the option to specify -p would help
them to work around the issue to adapt to new received_uuid scheme

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] btrfs-progs: receive explicit parent support
  2015-05-07 18:53 ` [PATCH v2] btrfs-progs: receive explicit parent support Lauri Võsandi
@ 2015-05-12 16:17   ` David Sterba
  0 siblings, 0 replies; 12+ messages in thread
From: David Sterba @ 2015-05-12 16:17 UTC (permalink / raw)
  To: Lauri Võsandi; +Cc: linux-btrfs

On Thu, May 07, 2015 at 09:53:59PM +0300, Lauri Võsandi wrote:
> This patch adds command-line flag -p to btrfs receive
> which makes it possible to disable automatic parent
> search for incremental snapshots and use explicitly
> specified path instead. Works also with multiple
> incremental snapshots.

Can you please describe the usecase(s) in detail? I don't have a clear
picture at the moment and given that there are objections I'd like to be
sure that we can document the expected behaviour possibly with a list of
"do not do this".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-04-26 11:37   ` lauri
@ 2015-05-14 17:35     ` Hugo Mills
  2015-05-21 18:02       ` lauri
  0 siblings, 1 reply; 12+ messages in thread
From: Hugo Mills @ 2015-05-14 17:35 UTC (permalink / raw)
  To: lauri; +Cc: linux-btrfs, David Sterba

[-- Attachment #1: Type: text/plain, Size: 11035 bytes --]

On Sun, Apr 26, 2015 at 02:37:58PM +0300, lauri wrote:
> > Subvol A on source side: (A, -)
> > Send this to A' on target side: (A', A)
> > Send this back to A'' on source side: (A'', A) <-- Note the A here, not A'
> 
> I also think your approach is the real solution to the problem, but as
> some pointed out on IRC this changes the behaviour of btrfs receive
> and will break things for someone. When I asked whose obscure workflow
> does it break nobody could come up with any reasonable example whereas
> sending snapshots back and forth seems to be the major usecase for
> btrfs receive and it's impossible to for example restore snapshots
> with current implementation.
> 
> I know that -p may fail horribly but the idea was to have *option* to
> replace parent lookup logic and instead of relying on UUID-s simply
> have it specified by userspace application.

   My argument is that having this option present will mean that
people will use it. Getting send/receive right is hard enough as it is
without giving someone an option which can completely screw things up
in *most* cases. If you want an analogy, it's a bit like giving
someone a drum of nerve toxin to get rid of the fly in their kitchen,
rather than a fly swatter. (Yes, a bad analogy is like a leaky
screwdriver, and this one's engaging in a bit of hyperbole, but I hope
it gets my point across.)

> >    More later, when I've had a little time to play with things and
> > think through the semantics properly.
> 
> You want to give it a try?

   Finally, I managed to spend some time with it. What follows is long
and detailed, but I hope it's clear, and rigorous. I'd like someone
with a mathematical turn of mind to give this a good kicking, because
there may be some errors in it, but I think it's right.

   A subvolume may be described by the tuple (u, r, p), where u is the
UUID of the subvolume itself, r is the "received UUID" set to indicate
the original send source of the subvolume, and p is the UUID of the
parent subvolume, set when a snapshot is taken. We indicate a
read-only snapshot with a * suffix. The location of a subvolume is
shown with the filesystem prefixed to it: m(u, r, p).

   We use capital letters (A, B, C, ...) to indicate subvolumes, and
S, T to indicate the "source" and "target" filesystems. Note that send
operations will sometimes go from T to S in the examples that follow.

   There are four functions we need to consider: two for snapshots and
two for send operations. The current definitions of the functions are:

Ordinary snapshot:
   Snap(m(u1, #, #))  -> m(u2, -, u1)

Read-only snapshot:
   SnapR(m(u1, #, #))  -> m(u2, -, u1)*

Full send:
   Send(m1(u1, #, #)*, m2) -> m2(u2, u1, -)*

Incremental send:
   SendP(m1(u2, #, p1)*, m1(u1, r1, p1)*, m2) -> m2(u3, u2, p2)*
            precondition: m2(p2, u1, #)* exists for some p2.

where # means "don't care" in all cases.

   This latter function sends m1(u2, #, p1)* to m2, using m1(u1, r1,
p1)* as the parameter to the -p command option (e.g. the reference
point for the incremental stream). The precondition is necessary to
ensure that the reference subvolumes on each side are identical. The
m1(u1, r1, p1)* is the same subvolume data as m2(p2, u1, #)* because
the only way that the second field of a subvolume can be set is via
one of the Send functions, which don't modify the subvolume data (or,
through the read-only property, allow it to be modified).

   Now, there are two use cases we want to support: 1) restore from
backups and continue with incrementals; 2) bidirectional
synchronisation, where incremental sends are used to shift the state
of a subvolume from one machine to another and back again. These are
very similar cases -- the former case simply uses a full send from T
to S, rather than the incremental send used in the latter case.

First use case: restore from backups
------------------------------------

The first use case goes something like this:

1) Create a subvolume, S(A, -, -) as the working subvol.
2) Make a backup to T:
 a  SnapR(S(A, -, -))                   ->  S(B, -, A)*
 b  Send(S(B, -, A)*, T)                ->               T(C, B, -)*
3) Subvol A is modified
4) Make an incremental backup to T
 a  SnapR(S(A, -, -))                   ->  S(D, -, A)*
 b  SendP(S(D, -, A)*, S(B, -, A)*, T)  ->               T(E, D, C)*
 (repeat 3, 4 ad libitum)
5) A meteorite strike destroys S, and it is replaced with a blank machine
6) Restore the backup, making a new working subvol G
 a  Send(T(E, D, C)*, S)                ->  S(F, E, -)*
 b  Snap(S(F, E, -)*)                   ->  S(G, -, F)
7) The new working subvol S(G, -, F) is modifed
8) Make an incremental backup to T
 a  SnapR(S(G, -, F))                   ->  S(H, -, G)*
 b  SendP(S(H, -, G)*, S(F, E, -)*, T)  ->               T(I, H, E)*

Now, with the current implementation, 8b will fail for two reasons:

 - We can't immediately tell that S(H, -, G)* and S(F, E, -)* are
   related, and
 - T(#, F, #)* doesn't exist.

However, it should be allowed to succeed, because there's a chain
of snapshots connecting S(F, E, -)* to S(H, -, G)*, so one is a direct
ancestor of the other. Further, a subvolume equivalent to E exists on
both S and T after step 7 -- E and F respectively. The subvolume S(F,
E, -)* has enough metadata to be able to correctly identify that it's
equivalent to T(E, D, C)*, but the precondition on the SendP function
doesn't support it.

Second use case: bidirectional synchronisation
----------------------------------------------

   Let's leave that there for the moment and look at the second
use-case:

1) Create a subvolume, S(A, -, -) as the working subvol.
2) Make a backup to T:
 a  SnapR(S(A, -, -))                   ->  S(B, -, A)*
 b  Send(S(B, -, A)*, T)                ->               T(C, B, -)*
3) Subvol A is modified
4) Make an incremental backup to T
 a  SnapR(S(A, -, -))                   ->  S(D, -, A)*
 b  SendP(S(D, -, A)*, S(B, -, A)*, T)  ->               T(E, D, C)*
 (repeat 3, 4 ad libitum)
5) Make a working copy on T
    Snap(T(E, D, C))                    ->               T(F, -, E)
6) Subvol F is modified
7) Send the changes back to S
 a  SnapR(T(F, -, E))                   ->               T(G, F, -)*
 b  SendP(T(G, F, -)*, T(E, D, C)*, S)  ->  S(H, G, D)*

Again, with the current implementation, this will fail -- at step 7b
this time. As before, there's two reasons why:

 - We can't immediately tell that T(G, F, -)* and T(E, D, C)* are
   related, and
 - T(#, E, #)* doesn't exist.

As before, it should be allowed, because there's a chain of snapshots
connecting T(E, D, C)* to T(G, F, -)*, so the former is a direct
ancestor of the latter, and because S(D, -, A)* is a duplicate of the
reference subvolume T(E, D, C)*.

   Now, in the first case we've got S(F, E, -)* on the source side,
and T(E, D, C)* on the target side. In the second case, we have T(E,
D, C)* on the source and S(D, -, A)* on the target. In both cases,
we're looking for a subvolume on the target whose UUID matches the
Received UUID of the source side's reference subvolume. This contrasts
with the original algorithm, where we only look for a subvolume on the
target whose Received UUID matches the UUID of the source side's
reference.

What to do about it
-------------------

   So, to support these two use-cases, our SendP function needs to
look like this:

   SendP(m1(u2, #, p1)*, m1(u1, r1, p1)*, m2) -> m2(u3, u2, p2)*
            precondition: m2(p2, u1, #)* exists for some p2,
   or
   SendP(m1(u2, #, p1)*, m1(u1, r1, #)*, m2) -> m2(u3, u2, r1)*
            precondition: m2(r1, #, #)* exists,
	              and m1(p1, -, u1)# exists.

or, in other words, the first form is where the reference subvolume
has been sent to the target machine; the second form is where the
reference subvolume has come *from* the target machine. The other term
in the precondition simply verifies that there has been a chain of
snapshots on the local machine that connects the subject of the send
back to the reference subvolume. So, finally, we end up with the final
form of the function as:

   SendP(m1(u2, #, p1)*, m1(u1, r1, p2)*, m2) -> m2(u3, u2, p3)*
          preconditions:
              (chain rule)     m1(p1, -, u1)# exists or p1 == p2
          and (reference rule) m2(p2, u1, #)* exists or m2(r1, #, #)* exists

The chain rule can (must) be done on the send side. The reference rule
means that we need both r1 and u1 to be available on the receiving
side. I haven't checked, but this will probably involve a change to
the stream format to send r1.

Other things
------------

   As mentioned above, there's a use-case where we do something like
this:

1) S(A, -, -)
2) Send a backup
 a  SnapR(S(A, -, -))               ->  S(B, -, A)*
 b  Send(S(B, -, A)*, T)            ->               T(C, B, -)*
3) Roll back to the state of B
    Snap(S(B, -, A))                ->  S(D, -, B)
4) S(D, -, B) is used for a while
5) Send a backup
 a  SnapR(S(D, -, B))               ->  S(E, -, D)*
 b  SendP(S(E, -, D)*, S(B, -, A)*  ->               T(F, E, C)*

This is supported with the above infrastructure, but it's not going to
work if there's more snapshots in between stages 3 and 4 (because the
"parent" UUID is rotated out each time). We would need a
generalisation of the "chain rule", so that we look for a chain of
snapshots of _any_ length from the reference to the subject. However,
this would either require us to keep a full history of all ancestors
with each snapshot, or to find such a path in the tree of snapshot
history when we run the send (and to require that all snapshots in
that sequence still exist). Neither of these is a particularly
attractive proposition, and I would suggest not trying to support that
use case right now.

   The other thing that we don't support at all is merging changes
from divergent snapshots. i.e. snapshot A to B and C, modify both B
and C, and then later try to merge B and C into D. (Possibly with
additional sends between all of those stages). This gets into issues
of semantic merging. I feel entirely justified in telling anyone who
wants to implement that kind of workflow that it's out of scope for
send/receive, and referring them to a distributed revision control
system like git.

   Finally, there's an interesting use case of trying to rotate the
bidirectional sync through several machines in turn (S, T, Q, R and
back to S). I haven't had time to look at that one in detail, and I
probably won't do so, but at least the analysis of it should be
tractable with the above notation, if anyone does feel moved to look
into it.

   Hugo.

-- 
Hugo Mills             | Once is happenstance; twice is coincidence; three
hugo@... carfax.org.uk | times is enemy action.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs-progs: receive explicit parent support
  2015-05-14 17:35     ` Hugo Mills
@ 2015-05-21 18:02       ` lauri
  0 siblings, 0 replies; 12+ messages in thread
From: lauri @ 2015-05-21 18:02 UTC (permalink / raw)
  To: Hugo Mills, lauri, linux-btrfs, David Sterba

Hi,

> My argument is that having this option present will mean that
> people will use it. Getting send/receive right is hard enough as it is
> without giving someone an option which can completely screw things up
> in *most* cases.

I agree that providing workarounds like this is not nice, however considering
the users who are already stuck with the UUID-s of current architecture
should be provided a way to get back on the track with better implementation :)

> The chain rule can (must) be done on the send side. The reference rule
> means that we need both r1 and u1 to be available on the receiving
> side. I haven't checked, but this will probably involve a change to
> the stream format to send r1.

I've also concluded the real solution involves bundling both subvolume
UUID and received UUID with the btrfs stream. This implies btrfs send
stream format change and if I've understood correctly involves kernel
changes as btrfs stream is generated in kernelspace.

>    The other thing that we don't support at all is merging changes
> from divergent snapshots. i.e. snapshot A to B and C, modify both B
> and C, and then later try to merge B and C into D. (Possibly with
> additional sends between all of those stages). This gets into issues
> of semantic merging. I feel entirely justified in telling anyone who
> wants to implement that kind of workflow that it's out of scope for
> send/receive, and referring them to a distributed revision control
> system like git.

I've never suggested merges, when I mentioned Git-like workflow
I was referring to simply push/pull. Obviously implementing
merge is out of the scope of any filesystem ;)




-- 
Lauri Võsandi
tel: +372 53329412
e-mail: lauri.vosandi@gmail.com
blog: http://lauri.vosandi.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-05-21 18:02 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-26 10:12 [PATCH] btrfs-progs: receive explicit parent support Lauri Võsandi
2015-04-26 11:19 ` Hugo Mills
2015-04-26 11:37   ` lauri
2015-05-14 17:35     ` Hugo Mills
2015-05-21 18:02       ` lauri
2015-05-07 19:06   ` lauri
2015-04-27  7:28 ` Stefan Behrens
2015-04-27  9:23   ` lauri
2015-04-27 19:29     ` Stefan Behrens
2015-05-07 17:50 ` [PATCH v2] This patch adds command-line flag -p to btrfs receive which makes it possible to disable automatic parent search for incremental snapshots and use explicitly specified path instead. Works also with multiple incremental snapshots Lauri Võsandi
2015-05-07 18:53 ` [PATCH v2] btrfs-progs: receive explicit parent support Lauri Võsandi
2015-05-12 16:17   ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).