All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: David Turner <dturner@twopensource.com>, git@vger.kernel.org
Subject: Re: [PATCH 12/16] refs: always handle non-normal refs in files backend
Date: Wed, 23 Dec 2015 09:02:30 +0100	[thread overview]
Message-ID: <567A5516.9070209@alum.mit.edu> (raw)
In-Reply-To: <1449102921-7707-13-git-send-email-dturner@twopensource.com>

On 12/03/2015 01:35 AM, David Turner wrote:
> Always handle non-normal (per-worktree or pseudo) refs in the files
> backend instead of alternate backends.
> 
> Sometimes a ref transaction will update both a per-worktree ref and a
> normal ref.  For instance, an ordinary commit might update
> refs/heads/master and HEAD (or at least HEAD's reflog).
> 
> We handle three cases here:
> 
> 1. updates to normal refs continue to go through the chosen backend
> 
> 2. updates to non-normal refs with REF_NODEREF or to non-symbolic refs
> are moved to a separate files backend transaction.
> 
> 3. updates to symbolic refs are dereferenced to their base ref.  The
> update to the base ref then goes through the ordinary backend, while
> the files backend is directly called to update the symref's reflog.
> 
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
>  refs.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 139 insertions(+), 2 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 808053f..e48e43a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -9,6 +9,11 @@
>  #include "object.h"
>  #include "tag.h"
>  
> +const char split_transaction_fail_warning[] =
> +	"A ref transaction was split across two refs backends.  Part of the "
> +	"transaction succeeded, but then the update to the per-worktree refs "
> +	"failed.  Your repository may be in an inconsistent state.";
> +
>  /*
>   * We always have a files backend and it is the default.
>   */
> @@ -784,6 +789,13 @@ void ref_transaction_free(struct ref_transaction *transaction)
>  	free(transaction);
>  }
>  
> +static void add_update_obj(struct ref_transaction *transaction,
> +			   struct ref_update *update)
> +{
> +	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
> +	transaction->updates[transaction->nr++] = update;
> +}
> +
>  static struct ref_update *add_update(struct ref_transaction *transaction,
>  				     const char *refname)
>  {
> @@ -791,8 +803,7 @@ static struct ref_update *add_update(struct ref_transaction *transaction,
>  	struct ref_update *update = xcalloc(1, sizeof(*update) + len);
>  
>  	memcpy((char *)update->refname, refname, len); /* includes NUL */
> -	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
> -	transaction->updates[transaction->nr++] = update;
> +	add_update_obj(transaction, update);
>  	return update;
>  }
>  
> @@ -1130,11 +1141,87 @@ int refs_init_db(struct strbuf *err, int shared)
>  	return the_refs_backend->init_db(err, shared);
>  }
>  
> +/*
> + * Special case for non-normal refs.  For symbolic-refs when
> + * REF_NODEREF is not turned on, we dereference them here and replace
> + * updates to the symbolic refs with updates to the underlying ref.
> + * Then we do our own reflogging for the symbolic ref.
> + *
> + * We move other non-normal ref updates with into a specially-created
> + * files-backend transaction
> + */

Extra word? s/with//?

> +static int move_abnormal_ref_updates(struct ref_transaction *transaction,
> +				     struct ref_transaction *files_transaction,
> +				     struct string_list *symrefs)
> +{
> +	int i;
> +
> +	for (i = 0; i < transaction->nr; i++) {
> +		struct ref_update *update = transaction->updates[i];
> +		const char *resolved;
> +		int flags = 0;
> +		unsigned char sha1[20];
> +
> +		if (ref_type(update->refname) == REF_TYPE_NORMAL)
> +			continue;
> +
> +		resolved = resolve_ref_unsafe(update->refname, 0, sha1, &flags);
> +
> +		if (update->flags & REF_NODEREF || !(flags & REF_ISSYMREF)) {
> +			int last;
> +
> +			add_update_obj(files_transaction, update);
> +			/*
> +			 * Replace this transaction with the
> +			 * last transaction, removing it from
> +			 * the list of backend transactions
> +			 */
> +			last = --transaction->nr;
> +			transaction->updates[i] = transaction->updates[last];

The "last" temporary variable could be trivially inlined.

> +			continue;
> +		}
> +
> +		if (resolved) {
> +			struct ref_update *new_update;
> +			struct string_list_item *item;
> +
> +			if (ref_type(resolved) != REF_TYPE_NORMAL)
> +				die("Non-normal symbolic ref `%s` points to non-normal ref `%s`", update->refname, resolved);

We don't usually use backticks in error messages. Please use "'" instead.

Also, please store this error message into a "strbuf *err" and report it
via the usual mechanism.

> +			new_update = xmalloc(sizeof(*new_update) +
> +					     strlen(resolved) + 1);
> +			memcpy(new_update, update, sizeof(*update));

Wouldn't it be preferable to replace this messy replacement code
(including the memcpy(), which can't be checked by the type system) with
a call to ref_transaction_update() followed by moving the new update to
this position in the list and possibly tweaking some of its fields?

> +			if (update->flags & REF_HAVE_OLD &&
> +			    hashcmp(sha1, update->old_sha1)) {
> +				/* consistency check failed */
> +				free(new_update);
> +				return -1;

We need an error message to be reported in this case; i.e., via a
"struct strbuf *err" argument.

But actually, I don't understand why this check is needed here at all.
Isn't it redundant with a similar check that will be done later (and
properly, under lock) as part of the main ref_transaction_commit()?

> +			} else {
> +				hashcpy(update->old_sha1, sha1);
> +			}
> +
> +			strcpy((char *)new_update->refname, resolved);
> +			transaction->updates[i] = new_update;
> +
> +			item = string_list_append(symrefs, update->refname);
> +			item->util = new_update;
> +			free(update);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  int ref_transaction_commit(struct ref_transaction *transaction,
>  			   struct strbuf *err)
>  {
>  	int ret = -1;
>  	struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
> +	struct string_list files_affected_refnames = STRING_LIST_INIT_NODUP;
> +	struct string_list symrefs = STRING_LIST_INIT_DUP;
> +	struct string_list_item *item;
> +	struct ref_transaction *files_transaction = NULL;
>  
>  	assert(err);
>  
> @@ -1146,6 +1233,26 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  		return 0;
>  	}
>  
> +	if (the_refs_backend != &refs_be_files) {
> +		files_transaction = ref_transaction_begin(err);
> +		if (!files_transaction)
> +			die("%s", err->buf);

I think dying here is too abrupt. Some callers try to recover from a
failed ref_transaction_commit(). Couldn't you "goto done" and let the
caller deal with err?

> +		ret = move_abnormal_ref_updates(transaction, files_transaction,
> +						&symrefs);
> +		if (ret)
> +			goto done;
> +
> +		/* files backend commit */
> +		if (ref_update_reject_duplicates(files_transaction,
> +						 &files_affected_refnames,
> +						 err)) {
> +			ret = TRANSACTION_GENERIC_ERROR;
> +			goto done;
> +		}

Is it correct to reject_duplicates among "abnormal" references and
"normal" references separately? ***

> +	}
> +
> +	/* main backend commit */
>  	if (ref_update_reject_duplicates(transaction, &affected_refnames, err)) {
>  		ret = TRANSACTION_GENERIC_ERROR;
>  		goto done;
> @@ -1153,8 +1260,35 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  
>  	ret = the_refs_backend->transaction_commit(transaction,
>  						   &affected_refnames, err);
> +	if (ret)
> +		goto done;
> +
> +	if (the_refs_backend != &refs_be_files) {

This conditional would perhaps be more to the point if expressed as "if
(files_transaction)".

> +		ret = refs_be_files.transaction_commit(files_transaction,
> +						       &files_affected_refnames,
> +						       err);
> +		if (ret) {
> +			warning(split_transaction_fail_warning);
> +			goto done;
> +		}
> +
> +		/* reflogging for dereferenced symbolic refs */
> +		for_each_string_list_item(item, &symrefs) {
> +			struct ref_update *update = item->util;
> +			if (files_log_ref_write(item->string, update->old_sha1,
> +						update->new_sha1,
> +						update->msg, update->flags, err))
> +				warning("failed to log ref update for symref %s",
> +					item->string);
> +		}

I think this code is incorrect because it doesn't lock the symbolic
reference before modifying its reflog (though I seem to recall that the
old code was buggy in this respect, too).

I wonder whether it would be simpler overall to leave the ref_update for
the symbolic ref in the files_transaction, but set a special internal
internal flag like REF_LOG_ONLY which tells the usual transaction_commit
code to add a reflog entry for update->old_sha1 to update->new_sha1,
without actually changing the reference.

> +	}
> +
>  done:
>  	string_list_clear(&affected_refnames, 0);
> +	string_list_clear(&files_affected_refnames, 0);
> +	if (files_transaction)
> +		ref_transaction_free(files_transaction);
> +	string_list_clear(&symrefs, 0);
>  	return ret;
>  }
>  
> @@ -1210,6 +1344,9 @@ int peel_ref(const char *refname, unsigned char *sha1)
>  int create_symref(const char *ref_target, const char *refs_heads_master,
>  		  const char *logmsg)
>  {
> +	if (ref_type(ref_target) != REF_TYPE_NORMAL)
> +		return refs_be_files.create_symref(ref_target, refs_heads_master,
> +						   logmsg);
>  	return the_refs_backend->create_symref(ref_target, refs_heads_master,
>  					       logmsg);
>  }
> 

I very much like the idea of introducing special handling for symbolic
reference updates within a transaction. In fact, I think I would go even
farther:

Let's take the example of an update to HEAD, which currently points at
refs/heads/master. I think it would *always* be a good idea (i.e., even
when only the files backend is in use) to split that ref_update into two
ref_updates:

1. One to update refs/heads/master and add a reflog entry for that
   reference.

2. One to add a reflog entry for HEAD (i.e. using the new REF_LOG_ONLY
   flag suggested above).

Why?

* It ensures that both references are locked correctly while their
  reflogs are updated. (I believe the current code gets this wrong.)

* It improves the reject_duplicates coverage, which (I think) currently
  wouldn't detect the conflict between a direct update of
  refs/heads/master with a simultaneous update of the same reference
  via HEAD.

* It could later be generalized to an update that goes through multiple
  layers of symref indirection (though this would be a very low
  priority).

This might benefit the split-backend situation that you are implementing
here. You could first do the symref-splitting step I just described, and
*then* separate the non-normal from the normal refs. I think the net
result would be simpler.

This patch is a lot to digest. I'm not yet confident that I have thought
through all of the ramifications of this patch. I guess a few iterations
will be needed in any case.

By the way, all of the patches preceding this one that I haven't
commented on look OK to me.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

  reply	other threads:[~2015-12-23  8:10 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-03  0:35 [PATCH 00/16] LMDB refs backend atop pre-vtable David Turner
2015-12-03  0:35 ` [PATCH 01/16] refs: add a backend method structure with transaction functions David Turner
2015-12-05  0:07   ` Junio C Hamano
2015-12-03  0:35 ` [PATCH 02/16] refs: add methods for misc ref operations David Turner
2015-12-11 23:33   ` Junio C Hamano
2015-12-11 23:49     ` David Turner
2015-12-11 23:39   ` Junio C Hamano
2015-12-11 23:49     ` David Turner
2015-12-12  0:23       ` Junio C Hamano
2015-12-12  0:48         ` David Turner
2015-12-18  4:06     ` Howard Chu
2015-12-03  0:35 ` [PATCH 03/16] refs: add methods for the ref iterators David Turner
2016-01-03  0:06   ` David Aguilar
2016-01-04 19:01     ` Junio C Hamano
2016-01-05 13:43       ` Michael Haggerty
2016-01-05 18:56         ` Junio C Hamano
2016-01-04 19:12     ` Ronnie Sahlberg
2016-01-04 20:26       ` Junio C Hamano
2016-01-05  1:17         ` Jeff King
2016-01-05  3:29           ` Junio C Hamano
2015-12-03  0:35 ` [PATCH 04/16] refs: add do_for_each_per_worktree_ref David Turner
2015-12-11 23:52   ` Junio C Hamano
2015-12-12  0:01     ` David Turner
2015-12-03  0:35 ` [PATCH 05/16] refs: add methods for reflog David Turner
2015-12-03  0:35 ` [PATCH 06/16] refs: add method for initial ref transaction commit David Turner
2015-12-03  0:35 ` [PATCH 07/16] refs: add method for delete_refs David Turner
2015-12-03  0:35 ` [PATCH 08/16] refs: add methods to init refs backend and db David Turner
2015-12-23  5:33   ` Michael Haggerty
2015-12-23  6:54     ` David Turner
2015-12-03  0:35 ` [PATCH 09/16] refs: add method to rename refs David Turner
2015-12-03  0:35 ` [PATCH 10/16] refs: make lock generic David Turner
2015-12-03  0:35 ` [PATCH 11/16] refs: move duplicate check to common code David Turner
2015-12-23  6:27   ` Michael Haggerty
2016-01-05 16:42     ` David Turner
2015-12-03  0:35 ` [PATCH 12/16] refs: always handle non-normal refs in files backend David Turner
2015-12-23  8:02   ` Michael Haggerty [this message]
2016-01-06  0:13     ` David Turner
2016-01-06 23:41     ` [PATCH/RFC v2 1/3] refs: allow log-only updates David Turner
2016-01-06 23:41       ` [PATCH/RFC v2 2/3] refs: resolve symbolic refs first David Turner
2016-01-06 23:41       ` [PATCH/RFC v2 3/3] refs: always handle non-normal refs in files backend David Turner
2016-01-08 12:52         ` David Turner
2016-01-06 23:42     ` [PATCH 12/16] " David Turner
2015-12-03  0:35 ` [PATCH 13/16] init: allow alternate backends to be set for new repos David Turner
2015-12-05  0:07   ` Junio C Hamano
2015-12-05  6:30   ` Duy Nguyen
2015-12-05  7:44     ` Jeff King
2015-12-08  0:38       ` David Turner
2015-12-23  9:52       ` Michael Haggerty
2015-12-23 20:01         ` Jeff King
2015-12-10 18:02   ` Jeff King
2015-12-10 19:36     ` David Turner
2015-12-23 11:30   ` [PATCH] clone: use child_process for recursive checkouts Michael Haggerty
2016-01-06 23:41     ` David Turner
2015-12-23 13:34   ` [PATCH 13/16] init: allow alternate backends to be set for new repos Michael Haggerty
2016-01-05 17:26     ` David Turner
2016-01-05 18:03       ` Junio C Hamano
2016-01-05 18:24         ` David Turner
2016-01-06 12:02         ` Michael Haggerty
2016-01-06 12:52     ` Duy Nguyen
2016-01-07  3:31       ` Shawn Pearce
2015-12-03  0:35 ` [PATCH 14/16] refs: allow ref backend to be set for clone David Turner
2015-12-23 13:51   ` Michael Haggerty
2015-12-23 20:23     ` Eric Sunshine
2015-12-03  0:35 ` [PATCH 15/16] refs: add LMDB refs backend David Turner
2015-12-05  0:08   ` Junio C Hamano
2015-12-05  0:25     ` David Turner
2015-12-17  1:00   ` Jonathan Nieder
2015-12-17  2:31     ` David Turner
2015-12-17 20:49       ` Jonathan Nieder
2015-12-23 14:32   ` Michael Haggerty
2016-01-08 16:05     ` David Turner
2015-12-03  0:35 ` [PATCH 16/16] refs: tests for lmdb backend David Turner
2015-12-22 23:56 ` [PATCH 00/16] LMDB refs backend atop pre-vtable David Turner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=567A5516.9070209@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.