From: Ronnie Sahlberg <sahlberg@google.com>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH v10 25/44] receive-pack.c: use a reference transaction for updating the refs
Date: Fri, 23 May 2014 09:14:25 -0700
Message-ID: <CAL=YDWkDSF1WWhZAt-nW8RUAjm+iBmg+=p8hq6GJAzF-3-WxGg@mail.gmail.com>
In-Reply-To: <537F51F9.5070600@alum.mit.edu>

On Fri, May 23, 2014 at 6:49 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On 05/19/2014 09:02 PM, Ronnie Sahlberg wrote:
>> On Sat, May 17, 2014 at 8:35 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>>> On 05/16/2014 07:37 PM, Ronnie Sahlberg wrote:
>>>> Wrap all the ref updates inside a transaction to make the update atomic.
>>>>
>>>> Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
>>>> ---
>>>>  builtin/receive-pack.c | 20 ++++++++++----------
>>>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
>>>> index c323081..5534138 100644
>>>> --- a/builtin/receive-pack.c
>>>> +++ b/builtin/receive-pack.c
>>>> @@ -46,6 +46,8 @@ static void *head_name_to_free;
>>>>  static int sent_capabilities;
>>>>  static int shallow_update;
>>>>  static const char *alt_shallow_file;
>>>> +static struct strbuf err = STRBUF_INIT;
>>>> +static struct ref_transaction *transaction;
>>>>
>>>>  static enum deny_action parse_deny_action(const char *var, const char *value)
>>>>  {
>>>> @@ -475,7 +477,6 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>>>>       const char *namespaced_name;
>>>>       unsigned char *old_sha1 = cmd->old_sha1;
>>>>       unsigned char *new_sha1 = cmd->new_sha1;
>>>> -     struct ref_lock *lock;
>>>>
>>>>       /* only refs/... are allowed */
>>>>       if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) {
>>>> @@ -580,15 +581,9 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>>>>                   update_shallow_ref(cmd, si))
>>>>                       return "shallow error";
>>>>
>>>> -             lock = lock_any_ref_for_update(namespaced_name, old_sha1,
>>>> -                                            0, NULL);
>>>> -             if (!lock) {
>>>> -                     rp_error("failed to lock %s", name);
>>>> -                     return "failed to lock";
>>>> -             }
>>>> -             if (write_ref_sha1(lock, new_sha1, "push")) {
>>>> -                     return "failed to write"; /* error() already called */
>>>> -             }
>>>> +             if (ref_transaction_update(transaction, namespaced_name,
>>>> +                                        new_sha1, old_sha1, 0, 1, &err))
>>>> +                     return "failed to update";
>>>>               return NULL; /* good */
>>>>       }
>>>>  }
>>>> @@ -812,6 +807,7 @@ static void execute_commands(struct command *commands,
>>>>       head_name = head_name_to_free = resolve_refdup("HEAD", sha1, 0, NULL);
>>>>
>>>>       checked_connectivity = 1;
>>>> +     transaction = ref_transaction_begin();
>>>>       for (cmd = commands; cmd; cmd = cmd->next) {
>>>>               if (cmd->error_string)
>>>>                       continue;
>>>> @@ -827,6 +823,10 @@ static void execute_commands(struct command *commands,
>>>>                       checked_connectivity = 0;
>>>>               }
>>>>       }
>>>> +     if (ref_transaction_commit(transaction, "push", &err))
>>>> +             error("%s", err.buf);
>>>> +     ref_transaction_free(transaction);
>>>> +     strbuf_release(&err);
>>>>
>>>>       if (shallow_update && !checked_connectivity)
>>>>               error("BUG: run 'git fsck' for safety.\n"
>>>>
>>>
>>> This patch is strange, because even if one ref_transaction_update() call
>>> fails, subsequent updates are nevertheless also attempted, and the
>>> ref_transaction_commit() is also attempted.  Is this an officially
>>> sanctioned use of the ref_transactions API?  Should it be?
>>
>> I think it should be supported. Because otherwise, unless you have the
>> entire transaction localized in a single block you would end up having
>> to check and recheck the return value everywhere.
>>
>> It makes the API much easier to use if you can continue calling
>> transaction functions even after the transaction has failed. If the
>> transaction has already failed then _update/_create/_delete will do
>> nothing except return an error.
>
> I agree that it is convenient to be able to keep calling functions
> blindly without worrying that an earlier function call already failed.
> As you point out below, this allows a style of use of the API where you
> choose *not* to check intermediate results at all, and only check
> whether the final commit succeeds.
>
> Meanwhile, remember the awkwardness in your patch that made fetch use a
> transaction to update the references.  In that case, the switch to using
> a transaction had the big disadvantage that the user would only get an
> error message for the first failing reference update.
>
> When I combine these two lines of thought, it suggests to me that we
> could do a better job of supporting *both* use cases.  What if the
> transaction object contained not an err strbuf but a string_list?  If an
> error occurs while building up the transaction, a message would be added
> to the string list and the function would return an error status.  The
> caller can monitor errors while it is building up the transaction and
> abort immediately if it wants, or it can ignore the return values and
> let the error messages accumulate in the string list.  When the caller
> attempts the commit, it would notice that the transaction failed, and at
> that time the caller could emit *all* of the accumulated error messages
> by reading them out of the string list; e.g.,
>
>     Error fetching from $REMOTE:   <- this is generated by caller
>         $ERR[0]    <- these come from the error string list,
>         $ERR[1]       printed with indentation by caller
>         $ERR[2]
>         $ERR[3]
>
> This style would have another advantage: we might have some back ends
> for which transactions have a high overhead.  Such a back end would
> probably choose not to do any checks while the transaction is being
> built up, e.g., to avoid a round-trip to a database.  When commit() is
> called, it would learn about all of the errors at once.  (1) It would
> need a way to return all of the errors to the caller.  (2) It would be
> nice for the caller to be able to treat such a back end the same as it
> treats a back end that is able to report errors immediately.  It seems
> to me that having a way to report multiple errors at the same time would
> solve both problems nicely.

Interesting.
That would mean changing all functions to take a string_list provided
by the caller instead of a strbuf, and then having
_update/_create/_delete do actual work instead of bailing out after
the first error.

Users that want to check for errors and log after each call to
_update/_create/_delete could do so and just use the last entry added
to the string list. Otherwise they could wait until _commit time and,
if it fails, log all the strings.
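
A minimal sketch of that style, to make the idea concrete. This is a toy
model and not the refs.c API: struct toy_transaction, toy_begin(),
toy_update(), toy_commit() and toy_demo() are hypothetical names, the
fixed-size array stands in for a string_list, and the only "check" is
that the refname starts with "refs/". Each failing call adds a message
and returns an error, later calls still do real work, and the commit
fails if anything was recorded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a ref transaction; NOT git's struct ref_transaction. */
#define TOY_MAX_ERRORS 16

struct toy_transaction {
	int nr_updates;                    /* updates queued successfully */
	int nr_errors;                     /* messages accumulated so far */
	char errors[TOY_MAX_ERRORS][128];  /* stands in for a string_list */
};

static void toy_begin(struct toy_transaction *t)
{
	assert(t);
	memset(t, 0, sizeof(*t));
}

/*
 * Queue an update.  On a bad refname, record a message and return -1,
 * but leave the transaction usable so later calls are still checked.
 */
static int toy_update(struct toy_transaction *t, const char *refname)
{
	assert(t && refname);
	if (strncmp(refname, "refs/", 5)) {
		if (t->nr_errors < TOY_MAX_ERRORS)
			snprintf(t->errors[t->nr_errors++], sizeof(t->errors[0]),
				 "invalid refname '%s'", refname);
		return -1;
	}
	t->nr_updates++;
	return 0;
}

/* The commit fails if any earlier call recorded an error. */
static int toy_commit(const struct toy_transaction *t)
{
	return t->nr_errors ? -1 : 0;
}

/* A caller that ignores intermediate results and reports at commit time. */
static int toy_demo(void)
{
	struct toy_transaction t;
	int i;

	toy_begin(&t);
	toy_update(&t, "refs/heads/master");
	toy_update(&t, "bogus/one");
	toy_update(&t, "bogus/two");
	if (toy_commit(&t)) {
		fprintf(stderr, "Error updating refs:\n");  /* from the caller */
		for (i = 0; i < t.nr_errors; i++)
			fprintf(stderr, "    %s\n", t.errors[i]);
	}
	return t.nr_errors;
}
```

A caller that wants to abort early can instead check the return value of
each toy_update() call and read the most recent entry in errors[].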


>
>> If _commit is called on a failed transaction then the commit will fail
>> with an error and do nothing.
>>
>> I think it is convenient, and it allows things like :
>>
>> struct ref_transaction *transaction;
>> void foo()
>> {
>>    ...
>>    ref_transaction_update(transaction, ... , &err);
>>    ...
>> }
>>
>>
>> transaction = ref_transaction_begin(&err);
>> ... doing stuff and call things that eventually ends up calling foo,
>> possible multiple times ...
>> ret = ref_transaction_commit(transaction, &err);
>>
>>
>> In foo() we ignore checking the return value so we will not see/care
>> if it failed. IF it fails however it will mark the transaction as
>> failed and update &err. (Note that this can not yet happen since
>> _update can not really fail, ever, but the next series will introduce
>> _update failures when we move locking there.)
>>
>> Instead we can depend on that IF _update failed, then the call to
>> _commit will fail too and &err is already updated so we can defer any
>> checking for errors until _commit time.
>>
>> This will make the API much more convenient for use cases where you
>> begin/commit the transaction in one function but the calls to
>> _update/_delete/_create are somewhere else, possible many function
>> calls away.
>> It does not mean that a caller must ignore the return value from
>> ref_transaction_update, just that the caller can do so and defer
>> checking for errors until later when it would be more convenient.
>>
>>
>> Please see current:
>> https://github.com/rsahlberg/git/tree/ref-transactions
>> and patch:
>> refs.c: add transaction.status and track OPEN/CLOSED/ERROR
>>
>>
>>   It might be
>>> a way to give feedback to the user on multiple attempted reference
>>> updates at once (i.e., address my comment about the last patch).
>>>
>>> If this is sanctioned, then it might be appropriate for the transaction
>>> to keep track of the fact that one or more reference updates failed, and
>>> when *_commit() is called to fail the whole transaction.
>>
>> Yes. I updated refs.h to indicate that you can continue using
>> _update/_create/_delete even if a previous call has failed but that
>> these calls will now just return an error.
>>
>> This does mean that on the first update that fails for a ref we fail
>> the transaction and abort any further _update calls to fail
>> immediately so if there would be additional refs that would fail we
>> would not log this. I think this is what we want to do since once we
>> have had a ref update fail it would be really hard to determine if the
>> next failure was just a side effect of the first failure or not.
>
> It could be that errors cascade, for example if I update reference R to
> value A, then (maybe a few steps later) verify that R has value A.  If
> the update fails, then the verify will also fail.  But it would be silly
> for our code to generate such a sequence of operations.  And if that
> sequence of operations came from the user (e.g., from "git update-ref
> --stdin"), it would be pretty churlish of the user to complain that we
> report two errors.  So I don't think your "side effect" worry is a
> problem in practice.

If we accept that users could do bad things causing cascading errors
to be logged, and that the user is to blame for the cascading errors
in the logged output, then I am fine with making the changes you
suggest.
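
For illustration, the cascading case (update R to A, then verify that R
has value A) can be modelled as below. This is a hypothetical toy, not
code from the series: struct demo_ref, demo_log(), demo_update() and
demo_verify() are made-up names, and a ref value is reduced to a single
character. When the update fails, the later verify fails as well, so two
messages end up in the log for one root cause.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model: one ref holding a one-character value.  Hypothetical. */
struct demo_ref {
	char value;
	int nr_errors;
	char errors[4][64];
};

/* Append one message to the ref's error log. */
static void demo_log(struct demo_ref *r, const char *msg)
{
	assert(r->nr_errors < 4);
	snprintf(r->errors[r->nr_errors++], sizeof(r->errors[0]), "%s", msg);
}

/* Set the ref to new_value only if it currently holds old_value. */
static int demo_update(struct demo_ref *r, char old_value, char new_value)
{
	if (r->value != old_value) {
		demo_log(r, "update failed: unexpected old value");
		return -1;
	}
	r->value = new_value;
	return 0;
}

/* Check that the ref holds the expected value, without changing it. */
static int demo_verify(struct demo_ref *r, char expected)
{
	if (r->value != expected) {
		demo_log(r, "verify failed: unexpected value");
		return -1;
	}
	return 0;
}
```

With a ref that holds 'X', calling demo_update(&r, 'Y', 'A') and then
demo_verify(&r, 'A') leaves the value untouched and logs both failures.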

I will try this change today and see what it looks like.

