From: Michael Haggerty <mhagger@alum.mit.edu>
To: David Turner <dturner@twopensource.com>, git@vger.kernel.org
Subject: Re: [PATCH 12/16] refs: always handle non-normal refs in files backend
Date: Wed, 23 Dec 2015 09:02:30 +0100
Message-ID: <567A5516.9070209@alum.mit.edu>
In-Reply-To: <1449102921-7707-13-git-send-email-dturner@twopensource.com>
On 12/03/2015 01:35 AM, David Turner wrote:
> Always handle non-normal (per-worktree or pseudo) refs in the files
> backend instead of alternate backends.
>
> Sometimes a ref transaction will update both a per-worktree ref and a
> normal ref. For instance, an ordinary commit might update
> refs/heads/master and HEAD (or at least HEAD's reflog).
>
> We handle three cases here:
>
> 1. updates to normal refs continue to go through the chosen backend
>
> 2. updates to non-normal refs with REF_NODEREF or to non-symbolic refs
> are moved to a separate files backend transaction.
>
> 3. updates to symbolic refs are dereferenced to their base ref. The
> update to the base ref then goes through the ordinary backend, while
> the files backend is directly called to update the symref's reflog.
>
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
> refs.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 139 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 808053f..e48e43a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -9,6 +9,11 @@
> #include "object.h"
> #include "tag.h"
>
> +const char split_transaction_fail_warning[] =
> + "A ref transaction was split across two refs backends. Part of the "
> + "transaction succeeded, but then the update to the per-worktree refs "
> + "failed. Your repository may be in an inconsistent state.";
> +
> /*
> * We always have a files backend and it is the default.
> */
> @@ -784,6 +789,13 @@ void ref_transaction_free(struct ref_transaction *transaction)
> free(transaction);
> }
>
> +static void add_update_obj(struct ref_transaction *transaction,
> + struct ref_update *update)
> +{
> + ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
> + transaction->updates[transaction->nr++] = update;
> +}
> +
> static struct ref_update *add_update(struct ref_transaction *transaction,
> const char *refname)
> {
> @@ -791,8 +803,7 @@ static struct ref_update *add_update(struct ref_transaction *transaction,
> struct ref_update *update = xcalloc(1, sizeof(*update) + len);
>
> memcpy((char *)update->refname, refname, len); /* includes NUL */
> - ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
> - transaction->updates[transaction->nr++] = update;
> + add_update_obj(transaction, update);
> return update;
> }
>
> @@ -1130,11 +1141,87 @@ int refs_init_db(struct strbuf *err, int shared)
> return the_refs_backend->init_db(err, shared);
> }
>
> +/*
> + * Special case for non-normal refs. For symbolic-refs when
> + * REF_NODEREF is not turned on, we dereference them here and replace
> + * updates to the symbolic refs with updates to the underlying ref.
> + * Then we do our own reflogging for the symbolic ref.
> + *
> + * We move other non-normal ref updates with into a specially-created
> + * files-backend transaction
> + */
Extra word? s/with//?
> +static int move_abnormal_ref_updates(struct ref_transaction *transaction,
> + struct ref_transaction *files_transaction,
> + struct string_list *symrefs)
> +{
> + int i;
> +
> + for (i = 0; i < transaction->nr; i++) {
> + struct ref_update *update = transaction->updates[i];
> + const char *resolved;
> + int flags = 0;
> + unsigned char sha1[20];
> +
> + if (ref_type(update->refname) == REF_TYPE_NORMAL)
> + continue;
> +
> + resolved = resolve_ref_unsafe(update->refname, 0, sha1, &flags);
> +
> + if (update->flags & REF_NODEREF || !(flags & REF_ISSYMREF)) {
> + int last;
> +
> + add_update_obj(files_transaction, update);
> + /*
> + * Replace this transaction with the
> + * last transaction, removing it from
> + * the list of backend transactions
> + */
> + last = --transaction->nr;
> + transaction->updates[i] = transaction->updates[last];
The "last" temporary variable could be trivially inlined.
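A minimal sketch of the inlined form, using a toy array in place of the real transaction->updates list. The struct update_list type and the swap_remove() helper are hypothetical names used only for illustration; they are not part of refs.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the updates array in struct ref_transaction. */
struct update_list {
	int *items;	/* stands in for struct ref_update *updates[] */
	size_t nr;	/* number of valid entries */
};

/*
 * Remove items[i] by overwriting it with the last entry and shrinking
 * the list by one. No temporary is needed: the pre-decrement yields the
 * index of the last entry. Order is not preserved.
 */
static void swap_remove(struct update_list *list, size_t i)
{
	assert(i < list->nr);
	list->items[i] = list->items[--list->nr];
}
```

In this sketch the entry that used to be last now sits at index i, so a caller that removes entries while iterating forwards would need to look at index i again before moving on.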
> + continue;
> + }
> +
> + if (resolved) {
> + struct ref_update *new_update;
> + struct string_list_item *item;
> +
> + if (ref_type(resolved) != REF_TYPE_NORMAL)
> + die("Non-normal symbolic ref `%s` points to non-normal ref `%s`", update->refname, resolved);
We don't usually use backticks in error messages. Please use "'" instead.
Also, please store this error message into a "strbuf *err" and report it
via the usual mechanism.
> + new_update = xmalloc(sizeof(*new_update) +
> + strlen(resolved) + 1);
> + memcpy(new_update, update, sizeof(*update));
Wouldn't it be preferable to replace this messy replacement code
(including the memcpy(), which can't be checked by the type system) with
a call to ref_transaction_update() followed by moving the new update to
this position in the list and possibly tweaking some of its fields?
> + if (update->flags & REF_HAVE_OLD &&
> + hashcmp(sha1, update->old_sha1)) {
> + /* consistency check failed */
> + free(new_update);
> + return -1;
We need an error message to be reported in this case; i.e., via a
"struct strbuf *err" argument.
But actually, I don't understand why this check is needed here at all.
Isn't it redundant with a similar check that will be done later (and
properly, under lock) as part of the main ref_transaction_commit()?
> + } else {
> + hashcpy(update->old_sha1, sha1);
> + }
> +
> + strcpy((char *)new_update->refname, resolved);
> + transaction->updates[i] = new_update;
> +
> + item = string_list_append(symrefs, update->refname);
> + item->util = new_update;
> + free(update);
> + }
> + }
> +
> + return 0;
> +}
> +
> int ref_transaction_commit(struct ref_transaction *transaction,
> struct strbuf *err)
> {
> int ret = -1;
> struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
> + struct string_list files_affected_refnames = STRING_LIST_INIT_NODUP;
> + struct string_list symrefs = STRING_LIST_INIT_DUP;
> + struct string_list_item *item;
> + struct ref_transaction *files_transaction = NULL;
>
> assert(err);
>
> @@ -1146,6 +1233,26 @@ int ref_transaction_commit(struct ref_transaction *transaction,
> return 0;
> }
>
> + if (the_refs_backend != &refs_be_files) {
> + files_transaction = ref_transaction_begin(err);
> + if (!files_transaction)
> + die("%s", err->buf);
I think dying here is too abrupt. Some callers try to recover from a
failed ref_transaction_commit(). Couldn't you "goto done" and let the
caller deal with err?
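A rough sketch of that control flow, assuming a fixed-size buffer in place of struct strbuf. The names err_buf, begin_files_transaction() and commit_sketch() are hypothetical and exist only to show the shape of the error path; the real code would use ref_transaction_begin() and the existing "done" label.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for git's struct strbuf: a fixed-size message buffer. */
struct err_buf {
	char buf[256];
};

/* Hypothetical stand-in for ref_transaction_begin(); fails on request. */
static int begin_files_transaction(int simulate_failure, struct err_buf *err)
{
	if (simulate_failure) {
		snprintf(err->buf, sizeof(err->buf),
			 "unable to begin files transaction");
		return -1;
	}
	return 0;
}

/*
 * On failure, leave the message in err and jump to the common cleanup
 * label instead of dying, so the caller can decide how to recover.
 */
static int commit_sketch(int simulate_failure, struct err_buf *err)
{
	int ret = -1;

	assert(err);

	if (begin_files_transaction(simulate_failure, err))
		goto done;	/* err already describes the failure */

	ret = 0;		/* the real commit work would go here */
done:
	/* common cleanup (string lists, transactions) would go here */
	return ret;
}
```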
> + ret = move_abnormal_ref_updates(transaction, files_transaction,
> + &symrefs);
> + if (ret)
> + goto done;
> +
> + /* files backend commit */
> + if (ref_update_reject_duplicates(files_transaction,
> + &files_affected_refnames,
> + err)) {
> + ret = TRANSACTION_GENERIC_ERROR;
> + goto done;
> + }
Is it correct to reject_duplicates among "abnormal" references and
"normal" references separately? ***
> + }
> +
> + /* main backend commit */
> if (ref_update_reject_duplicates(transaction, &affected_refnames, err)) {
> ret = TRANSACTION_GENERIC_ERROR;
> goto done;
> @@ -1153,8 +1260,35 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>
> ret = the_refs_backend->transaction_commit(transaction,
> &affected_refnames, err);
> + if (ret)
> + goto done;
> +
> + if (the_refs_backend != &refs_be_files) {
This conditional would perhaps be more to the point if expressed as "if
(files_transaction)".
> + ret = refs_be_files.transaction_commit(files_transaction,
> + &files_affected_refnames,
> + err);
> + if (ret) {
> + warning(split_transaction_fail_warning);
> + goto done;
> + }
> +
> + /* reflogging for dereferenced symbolic refs */
> + for_each_string_list_item(item, &symrefs) {
> + struct ref_update *update = item->util;
> + if (files_log_ref_write(item->string, update->old_sha1,
> + update->new_sha1,
> + update->msg, update->flags, err))
> + warning("failed to log ref update for symref %s",
> + item->string);
> + }
I think this code is incorrect because it doesn't lock the symbolic
reference before modifying its reflog (though I seem to recall that the
old code was buggy in this respect, too).
I wonder whether it would be simpler overall to leave the ref_update for
the symbolic ref in the files_transaction, but set a special internal
flag like REF_LOG_ONLY which tells the usual transaction_commit code to
add a reflog entry for update->old_sha1 to update->new_sha1, without
actually changing the reference.
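To make the idea concrete, here is a toy model of how a commit loop might treat such a flag. Everything in it is an assumption for illustration: the flag value, the toy_ref and toy_update types, and apply_update() are hypothetical, an int stands in for the SHA-1, and a counter stands in for the reflog.

```c
#include <assert.h>

#define REF_LOG_ONLY 0x80	/* hypothetical flag value */

/* Toy reference: an int stands in for the SHA-1, a counter for the reflog. */
struct toy_ref {
	int value;
	int log_entries;
};

struct toy_update {
	struct toy_ref *ref;
	int old_value, new_value;
	unsigned flags;
};

/*
 * Apply one update. In real code the reference would be locked around
 * this whole function, which is the point of routing the symref's
 * reflog write through the normal commit path.
 */
static void apply_update(const struct toy_update *u)
{
	assert(u && u->ref);

	if (!(u->flags & REF_LOG_ONLY))
		u->ref->value = u->new_value;	/* normal update */

	u->ref->log_entries++;	/* log old_value -> new_value either way */
}
```

With this shape, a log-only update and a normal update share the same locking and logging code, and only the write of the new value is skipped.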
> + }
> +
> done:
> string_list_clear(&affected_refnames, 0);
> + string_list_clear(&files_affected_refnames, 0);
> + if (files_transaction)
> + ref_transaction_free(files_transaction);
> + string_list_clear(&symrefs, 0);
> return ret;
> }
>
> @@ -1210,6 +1344,9 @@ int peel_ref(const char *refname, unsigned char *sha1)
> int create_symref(const char *ref_target, const char *refs_heads_master,
> const char *logmsg)
> {
> + if (ref_type(ref_target) != REF_TYPE_NORMAL)
> + return refs_be_files.create_symref(ref_target, refs_heads_master,
> + logmsg);
> return the_refs_backend->create_symref(ref_target, refs_heads_master,
> logmsg);
> }
>
I very much like the idea of introducing special handling for symbolic
reference updates within a transaction. In fact, I think I would go even
farther:
Let's take the example of an update to HEAD, which currently points at
refs/heads/master. I think it would *always* be a good idea (i.e., even
when only the files backend is in use) to split that ref_update into two
ref_updates:
1. One to update refs/heads/master and add a reflog entry for that
reference.
2. One to add a reflog entry for HEAD (i.e. using the new REF_LOG_ONLY
flag suggested above).
Why?
* It ensures that both references are locked correctly while their
reflogs are updated. (I believe the current code gets this wrong.)
* It improves the reject_duplicates coverage, which (I think) currently
wouldn't detect the conflict between a direct update of
refs/heads/master and a simultaneous update of the same reference
via HEAD.
* It could later be generalized to an update that goes through multiple
layers of symref indirection (though this would be a very low
priority).
This might benefit the split-backend situation that you are implementing
here. You could first do the symref-splitting step I just described, and
*then* separate the non-normal from the normal refs. I think the net
result would be simpler.
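The splitting step, and its effect on duplicate detection, can be sketched with a toy transaction. This is only an illustration under stated assumptions: the types, the REF_LOG_ONLY value, symref_target() (which hard-codes HEAD pointing at refs/heads/master), split_symref_updates() and has_duplicates() are all hypothetical, and REF_NODEREF handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REF_LOG_ONLY 0x80	/* hypothetical flag value */
#define MAX_UPDATES 16

struct toy_update {
	const char *refname;
	unsigned flags;
};

struct toy_transaction {
	struct toy_update updates[MAX_UPDATES];
	size_t nr;
};

/* Hypothetical resolver: in this toy model only HEAD is a symref. */
static const char *symref_target(const char *refname)
{
	return strcmp(refname, "HEAD") ? NULL : "refs/heads/master";
}

/*
 * Split each update that goes through a symref into two updates:
 * one for the target ref, and a log-only one for the symref itself.
 */
static void split_symref_updates(struct toy_transaction *t)
{
	size_t i, n = t->nr;

	for (i = 0; i < n; i++) {
		const char *target = symref_target(t->updates[i].refname);

		if (!target || (t->updates[i].flags & REF_LOG_ONLY))
			continue;

		assert(t->nr < MAX_UPDATES);
		t->updates[t->nr].refname = target;
		t->updates[t->nr].flags = t->updates[i].flags;
		t->nr++;

		t->updates[i].flags |= REF_LOG_ONLY;
	}
}

/* Quadratic duplicate check; enough for a sketch. */
static int has_duplicates(const struct toy_transaction *t)
{
	size_t i, j;

	for (i = 0; i < t->nr; i++)
		for (j = i + 1; j < t->nr; j++)
			if (!strcmp(t->updates[i].refname, t->updates[j].refname))
				return 1;
	return 0;
}
```

In this model, a transaction that updates both HEAD and refs/heads/master passes the duplicate check before splitting and fails it afterwards, because the split makes the second update to refs/heads/master explicit.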
This patch is a lot to digest. I'm not yet confident that I have thought
through all of the ramifications of this patch. I guess a few iterations
will be needed in any case.
By the way, all of the patches preceding this one that I haven't
commented on look OK to me.
Michael
--
Michael Haggerty
mhagger@alum.mit.edu