git.vger.kernel.org archive mirror
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Ronnie Sahlberg <sahlberg@google.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH v10 25/44] receive-pack.c: use a reference transaction for updating the refs
Date: Fri, 23 May 2014 15:49:45 +0200
Message-ID: <537F51F9.5070600@alum.mit.edu>
In-Reply-To: <CAL=YDWmLgW0b28q5Yqw7R4nobKF5=pcbSpnazC8+EA=QKhkpow@mail.gmail.com>

On 05/19/2014 09:02 PM, Ronnie Sahlberg wrote:
> On Sat, May 17, 2014 at 8:35 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> On 05/16/2014 07:37 PM, Ronnie Sahlberg wrote:
>>> Wrap all the ref updates inside a transaction to make the update atomic.
>>>
>>> Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
>>> ---
>>>  builtin/receive-pack.c | 20 ++++++++++----------
>>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
>>> index c323081..5534138 100644
>>> --- a/builtin/receive-pack.c
>>> +++ b/builtin/receive-pack.c
>>> @@ -46,6 +46,8 @@ static void *head_name_to_free;
>>>  static int sent_capabilities;
>>>  static int shallow_update;
>>>  static const char *alt_shallow_file;
>>> +static struct strbuf err = STRBUF_INIT;
>>> +static struct ref_transaction *transaction;
>>>
>>>  static enum deny_action parse_deny_action(const char *var, const char *value)
>>>  {
>>> @@ -475,7 +477,6 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>>>       const char *namespaced_name;
>>>       unsigned char *old_sha1 = cmd->old_sha1;
>>>       unsigned char *new_sha1 = cmd->new_sha1;
>>> -     struct ref_lock *lock;
>>>
>>>       /* only refs/... are allowed */
>>>       if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) {
>>> @@ -580,15 +581,9 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>>>                   update_shallow_ref(cmd, si))
>>>                       return "shallow error";
>>>
>>> -             lock = lock_any_ref_for_update(namespaced_name, old_sha1,
>>> -                                            0, NULL);
>>> -             if (!lock) {
>>> -                     rp_error("failed to lock %s", name);
>>> -                     return "failed to lock";
>>> -             }
>>> -             if (write_ref_sha1(lock, new_sha1, "push")) {
>>> -                     return "failed to write"; /* error() already called */
>>> -             }
>>> +             if (ref_transaction_update(transaction, namespaced_name,
>>> +                                        new_sha1, old_sha1, 0, 1, &err))
>>> +                     return "failed to update";
>>>               return NULL; /* good */
>>>       }
>>>  }
>>> @@ -812,6 +807,7 @@ static void execute_commands(struct command *commands,
>>>       head_name = head_name_to_free = resolve_refdup("HEAD", sha1, 0, NULL);
>>>
>>>       checked_connectivity = 1;
>>> +     transaction = ref_transaction_begin();
>>>       for (cmd = commands; cmd; cmd = cmd->next) {
>>>               if (cmd->error_string)
>>>                       continue;
>>> @@ -827,6 +823,10 @@ static void execute_commands(struct command *commands,
>>>                       checked_connectivity = 0;
>>>               }
>>>       }
>>> +     if (ref_transaction_commit(transaction, "push", &err))
>>> +             error("%s", err.buf);
>>> +     ref_transaction_free(transaction);
>>> +     strbuf_release(&err);
>>>
>>>       if (shallow_update && !checked_connectivity)
>>>               error("BUG: run 'git fsck' for safety.\n"
>>>
>>
>> This patch is strange, because even if one ref_transaction_update() call
>> fails, subsequent updates are nevertheless also attempted, and the
>> ref_transaction_commit() is also attempted.  Is this an officially
>> sanctioned use of the ref_transactions API?  Should it be?
> 
> I think it should be supported. Because otherwise, unless you have the
> entire transaction localized in a single block you would end up having
> to check and recheck the return value everywhere.
> 
> It makes the API much easier to use if you can continue calling
> transaction functions even after the transaction has failed. If the
> transaction has already failed then _update/_create/_delete will do
> nothing except return an error.

I agree that it is convenient to be able to keep calling functions
blindly without worrying that an earlier function call already failed.
As you point out below, this allows a style of use of the API where you
choose *not* to check intermediate results at all, and only check
whether the final commit succeeds.
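
To make the "keep calling blindly" style concrete, here is a minimal sketch of a transaction with a sticky error state. It is a toy model, not the refs.c code: every name in it (toy_transaction, toy_update, toy_commit, the status values) is hypothetical, and the "must start with refs/" check is only a stand-in for whatever could really make an update fail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; this is not the refs.c implementation. */
enum toy_status { TOY_OPEN, TOY_ERROR, TOY_CLOSED };

struct toy_transaction {
	enum toy_status status;
	int nr_queued;   /* updates accepted so far */
	char err[128];   /* message for the first failure only */
};

static void toy_begin(struct toy_transaction *t)
{
	assert(t);
	t->status = TOY_OPEN;
	t->nr_queued = 0;
	t->err[0] = '\0';
}

/* Queue an update. Returns 0 on success, -1 on failure. */
static int toy_update(struct toy_transaction *t, const char *refname)
{
	assert(t && refname);
	if (t->status != TOY_OPEN)
		return -1; /* already failed or closed: do nothing */
	if (strncmp(refname, "refs/", 5)) {
		/* Stand-in failure condition; marks the transaction failed. */
		t->status = TOY_ERROR;
		snprintf(t->err, sizeof(t->err),
			 "invalid refname '%s'", refname);
		return -1;
	}
	t->nr_queued++;
	return 0;
}

/* Commit fails if any earlier call failed; t->err explains why. */
static int toy_commit(struct toy_transaction *t)
{
	assert(t);
	if (t->status != TOY_OPEN)
		return -1;
	t->status = TOY_CLOSED;
	return 0;
}
```

A caller can ignore every toy_update() return value and check only toy_commit(). The cost is visible in the struct: only the first failure leaves a message behind.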

Meanwhile, remember the awkwardness in your patch that made fetch use a
transaction to update the references.  In that case, the switch to using
a transaction had the big disadvantage that the user would only get an
error message for the first failing reference update.

When I combine these two lines of thought, it suggests to me that we
could do a better job of supporting *both* use cases.  What if the
transaction object contained not an err strbuf but a string_list?  If an
error occurs while building up the transaction, a message would be added
to the string list and the function would return an error status.  The
caller can monitor errors while it is building up the transaction and
abort immediately if it wants, or it can ignore the return values and
let the error messages accumulate in the string list.  When the caller
attempts the commit, it would notice that the transaction failed, and at
that time the caller could emit *all* of the accumulated error messages
by reading them out of the string list; e.g.,

    Error fetching from $REMOTE:   <- this is generated by caller
        $ERR[0]    <- these come from the error string list,
        $ERR[1]       printed with indentation by caller
        $ERR[2]
        $ERR[3]

This style would have another advantage: we might have some back ends
for which transactions have a high overhead.  Such a back end would
probably choose not to do any checks while the transaction is being
built up, e.g., to avoid a round-trip to a database.  When commit() is
called, it would learn about all of the errors at once.  (1) It would
need a way to return all of the errors to the caller.  (2) It would be
nice for the caller to be able to treat such a back end the same as it
treats a back end that is able to report errors immediately.  It seems
to me that having a way to report multiple errors at the same time would
solve both problems nicely.
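
A rough sketch of what I have in mind, under loud assumptions: the names (multi_err_transaction, met_update, met_commit, met_report) are invented for illustration, a fixed-size array stands in for a real string_list, and the refname prefix check is only a placeholder for a real failure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ERRS 16
#define ERR_LEN 128

/* Hypothetical stand-in for a transaction holding a list of errors. */
struct multi_err_transaction {
	int nr_queued;                 /* updates accepted so far */
	int nr_errs;                   /* messages recorded so far */
	char errs[MAX_ERRS][ERR_LEN];  /* plays the role of a string_list */
};

static void met_begin(struct multi_err_transaction *t)
{
	assert(t);
	t->nr_queued = 0;
	t->nr_errs = 0;
}

/*
 * Queue an update. On failure, record a message and return -1, but
 * keep accepting later calls so that their errors are recorded too.
 * Messages beyond MAX_ERRS are dropped in this sketch.
 */
static int met_update(struct multi_err_transaction *t, const char *refname)
{
	assert(t && refname);
	if (strncmp(refname, "refs/", 5)) {
		if (t->nr_errs < MAX_ERRS)
			snprintf(t->errs[t->nr_errs++], ERR_LEN,
				 "invalid refname '%s'", refname);
		return -1;
	}
	t->nr_queued++;
	return 0;
}

/* The commit fails as a whole if any error was recorded. */
static int met_commit(const struct multi_err_transaction *t)
{
	assert(t);
	return t->nr_errs ? -1 : 0;
}

/* The caller supplies the headline and the indentation. */
static void met_report(const struct multi_err_transaction *t,
		       const char *remote, FILE *out)
{
	int i;

	assert(t && remote && out);
	fprintf(out, "Error fetching from %s:\n", remote);
	for (i = 0; i < t->nr_errs; i++)
		fprintf(out, "    %s\n", t->errs[i]);
}
```

A caller that wants to abort early checks the return value of met_update(); a caller that wants the full picture ignores it and calls met_report() if met_commit() fails. A back end that defers all checks until commit could fill the same list at that point.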

> If _commit is called on a failed transaction then the commit will fail
> with an error and do nothing.
> 
> I think it is convenient, and it allows things like :
> 
> struct ref_transaction *transaction;
> void foo()
> {
>    ...
>    ref_transaction_update(transaction, ... , &err);
>    ...
> }
> 
> 
> transaction = ref_transaction_begin(&err);
> ... do stuff and call things that eventually end up calling foo(),
> possibly multiple times ...
> ret = ref_transaction_commit(transaction, &err);
> 
> 
> In foo() we do not check the return value, so we will not see or care
> if it failed. If it fails, however, it will mark the transaction as
> failed and update &err. (Note that this cannot happen yet, since
> _update can never actually fail, but the next series will introduce
> _update failures when we move locking there.)
> 
> Instead we can rely on the fact that if _update failed, then the call
> to _commit will fail too and &err is already updated, so we can defer
> any checking for errors until _commit time.
> 
> This will make the API much more convenient for use cases where you
> begin/commit the transaction in one function but the calls to
> _update/_delete/_create are somewhere else, possibly many function
> calls away.
> It does not mean that a caller must ignore the return value from
> ref_transaction_update, just that the caller can do so and defer
> checking for errors until later when it would be more convenient.
> 
> 
> Please see current:
> https://github.com/rsahlberg/git/tree/ref-transactions
> and patch:
> refs.c: add transaction.status and track OPEN/CLOSED/ERROR
> 
> 
>> It might be
>> a way to give feedback to the user on multiple attempted reference
>> updates at once (i.e., address my comment about the last patch).
>>
>> If this is sanctioned, then it might be appropriate for the transaction
>> to keep track of the fact that one or more reference updates failed, and
>> when *_commit() is called to fail the whole transaction.
> 
> Yes. I updated refs.h to indicate that you can continue using
> _update/_create/_delete even if a previous call has failed but that
> these calls will now just return an error.
> 
> This does mean that when the first update fails for a ref, we fail
> the transaction and make any further _update calls fail
> immediately, so if additional refs would also have failed we
> would not log them. I think this is what we want, since once
> a ref update has failed it would be really hard to determine whether
> the next failure was just a side effect of the first failure or not.

It could be that errors cascade, for example if I update reference R to
value A, then (maybe a few steps later) verify that R has value A.  If
the update fails, then the verify will also fail.  But it would be silly
for our code to generate such a sequence of operations.  And if that
sequence of operations came from the user (e.g., from "git update-ref
--stdin"), it would be pretty churlish of the user to complain that we
report two errors.  So I don't think your "side effect" worry is a
problem in practice.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
