Re: [RFC] push: add documentation on push v2

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Brandon Williams <bmwill@google.com>
To: Stefan Beller <sbeller@google.com>
Cc: git <git@vger.kernel.org>
Subject: Re: [RFC] push: add documentation on push v2
Date: Wed, 18 Jul 2018 11:21:54 -0700	[thread overview]
Message-ID: <20180718182154.GE17137@google.com> (raw)
In-Reply-To: <CAGZ79kZ4HOvo-xaDK=USveJ5zaLdJSddh8XNxsOsFHQuu7KcZQ@mail.gmail.com>

On 07/18, Stefan Beller wrote:
> > > Given the example above for "rebase-on-push" though
> > > it is better to first send the packfile (as that is assumed to
> > > take longer) and then send the ref updates, such that the
> > > rebasing could be faster and has no bottleneck.
> >
> > I don't really follow this logic.  I don't think it would change
> > anything much considering the ref-updates section would usually be
> > much smaller than the packfile itself, course I don't have any data so
> > idk.
> 
> The server would need to serialize all incoming requests and apply
> them in order. The receiving of the packfile and the response to the client
> are not part of the critical section that needs to happen serialized but
> can be spread out to threads. So for that use case it would make
> sense to allow sending the packfile first.

I would think that a server needs to read the whole request before any
actions can be done, but maybe I need to think about this a bit more.

> 
> > > > +    update = txn_id SP action SP refname SP old_oid SP new_oid
> > > > +    force-update = txn_id SP "force" SP action SP refname SP new_oid
> > >
> > > So we insert "force" after the transaction id if we want to force it.
> > > When adding the atomic capability later we could imagine another insert here
> > >
> > >   1 atomic create refs/heads/new-ref <0-hash> <hash>
> > >   1 atomic delete refs/heads/old-ref <hash> <0-hash>
> > >
> > > which would look like a "rename" that we could also add instead.
> > > The transaction numbers are an interesting concept, how do you
> > > envision them to be used? In the example I put them both in the same
> > > transaction to demonstrate the "atomic-ness", but one could also
> > > imagine different transactions numbers per ref (i.e. exactly one
> > > ref per txn_id) to have a better understanding of what the server did
> > > to each individual ref.
> >
> > I believe I outlined their use later.  Basically if you give the server
> > free reign to do what it wants with the updates you send it, then you
> > need a way for the client to be able to map the result back to what it
> > requested.  Since now i could push to "master" but instead of updating
> > master the server creates a refs/changes/1 ref and puts my changes there
> > instead of updating master.  The client needs to know that the ref
> > update it requested to master is what caused the creation of the
> > refs/changes/1 ref.
> 
> understood, the question was more related to how you envision what
> the client/server SHOULD be doing here, (and I think a one txn_id per
> ref is what SHOULD be done is how this is best to implement the
> thoughts above, also the client is ALLOWED to put many refs in one
> txn, or would we just disallow that already at this stage to not confuse
> the server?)

Oh sorry for misunderstanding.  Yeah, as of right now I think having a
one-to-one relationship makes it easier to write/implement but I don't
know if there are other workflows which would benefit from multiple per
txn_id.

> 
> >
> > >
> > > Are new capabilities attached to ref updates or transactions?
> > > Unlike the example above, stating "atomic" on each line, you could just
> > > say "transaction 1 should be atomic" in another line, that would address
> > > all refs in that transaction.
> >
> > I haven't thought through "atomic" so i have no idea what you'd want
> > that to look like.
> 
> Yeah I have not really thought about them either, I just see two ways:
> (A) adding more keywords in each ref-update (like "force") or
> (B) adding new subsections somewhere where we talk about the capabilities
>   instead.
> 
> Depending on why way we want to go this might have impact on the
> design how to write the code.
> 
> > > > +       * Normal ref-updates require that the old value of a ref is supplied so
> > > > +         that the server can verify that the reference that is being updated
> > > > +         hasn't changed while the request was being processed.
> > >
> > > create/delete assume <00..00> for either old or new ? (We could also
> > > omit the second hash for create delete, which is more appealing to me)
> >
> > Well that depends, in the case of a create you want to ensure that no
> > ref with that name exists and would want it to fail if one already
> > existed.  If you want to force it then you don't care if one existed or
> > not, you just want the ref to have a certain value.
> 
> What I was trying to say is to have
> 
>     update = txn_id SP (modifier SP) action
>     modifier = "force" | "atomic"
>     action = (create | delete | update)
>     create = "create" SP <hash>
>     update = "update" SP <hash> SP <hash>
>     delete = "delete" SP <hash>
> 
> i.e. only one hash for the create and delete action.
> (I added the "atomic" modifier to demonstrate (A) from above, not needed here)

I understood what you were asking, I was just pointing out the rationale
for including the zero-id but i guess you're correct in that simply
omitting it would work just as well for what I outlined above.  So I
think I'll go with what you suggested here.

> 
> > >
> > > > +       * Forced ref-updates only include the new value of a ref as we don't
> > > > +         care what the old value was.
> > >
> > > How are you implementing force-with-lease then?
> >
> > Currently force-with-lease/force is implemented 100% on the client side,
> 
> Uh? That would be bad. Reading 631b5ef219c (push --force-with-lease: tie
> it all together, 2013-07-08) I think that send-pack is done server-side?

send-pack.c is the client side of push while receive-pack is the
server side.  There currently doesn't exist a way for a server to
understand the difference between a force/force-with-lease/non-force
push.

> 
> > this proposal extends these two to be implemented on the server as well.
> > non-forced variant are basically the "with-lease" case and "force" now
> > actually forces an update.
> 
> I think we have 3 modes:
> (1) standard push, where both client and server check for a fast-forward
> (2) "force" that blindly overwrites the ref, but as that has a race condition
>     in case multiple people can push to the remove we have
> (3) "with-lease", disables the fast forward check both on client and server
>     but still takes out a lock on the server to ensure no races happen
> 
> Now you propose to have only 2, making (1) and (3) the same, deferring
> the check to have "fast forwards only" to be client only?
> The server surely wants to ensure that, too (maybe you need
> special permission for non-ff; depends on the server implementation).
> 
> I am not sure I like it, as on the protocol level this indeed looks the same
> and the server and client need to care in their implementation how/when
> the ff-check is done. Though it would be nice for the client UX that you
> need to give a flag to check for ff (both client and server side? or can we rely
> on the client alone then?)

IIRC there are no checks done server-side for force, with-lease, or
fast-forward (well fast-forward can be checked for but this is a config
and has nothing to do with the protocol).  Most of the work is done by
the client to ensure these checks are made.

> 
> > > > +             (delim-pkt | flush-pkt)
> > > > +
> > > > +    unpack-error = PKT-LINE("ERR" SP error-msg LF)
> > > > +
> > > > +    ref-update-status = *(update-result | update-error)
> > > > +    update-result = *PKT-LINE(txn_id SP result LF)
> > > > +    result = ("created" | "deleted" | "updated") SP refname SP old_oid SP new_oid
> > > > +    update-error = PKT-LINE(txn_id SP "error" SP error-msg LF)
> > >
> > > Can we unify "ERR" and "error" ?
> >
> > No, these are very different.  You could have one ref update succeed
> > while another doesn't for some reason, unless you want everything to be
> > atomic.
> 
> I did not mean to unify them on the semantic level, but on the
> representation level, i.e. have both of them spelled the same,
> as they can still be differentiated by the leading txn id?

Oh I misunderstood again :) yeas we could standardize on ERR.

> 
> 
> Thanks,
> Stefan
> 
> P.S.: another feature that just came to mind is localisation of error messages.
> But that is also easy to do with capabilities (the client sends a capability
> such as "preferred-i18n=DE" and the server may translate all its errors
> if it can.
> 
> That brings me to another point: do we assume all errors to be read
> by humans? or do we want some markup things in there, too, similar to
> EAGAIN?

This sort of thing could be added as a protocol-level capability where
the client sends LANG=<some language> so that those sorts of msgs could
be translated server side before sending them.

-- 
Brandon Williams

next prev parent reply	other threads:[~2018-07-18 18:22 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-17 21:09 [RFC] push: add documentation on push v2 Brandon Williams
2018-07-17 23:25 ` Stefan Beller
2018-07-18 13:31   ` Derrick Stolee
2018-07-18 16:56     ` Stefan Beller
2018-07-18 17:15       ` Brandon Williams
2018-07-20 13:12         ` Jeff Hostetler
2018-07-24 19:00           ` Brandon Williams
2018-07-18 17:11     ` Brandon Williams
2018-07-18 17:19       ` Duy Nguyen
2018-07-18 17:46         ` Brandon Williams
2018-07-18 17:57           ` Duy Nguyen
2018-07-18 17:08   ` Brandon Williams
2018-07-18 18:07     ` Stefan Beller
2018-07-18 18:17       ` Duy Nguyen
2018-07-18 18:21       ` Brandon Williams [this message]
2018-07-24 19:28 ` Brandon Williams
2018-07-25 15:15   ` Duy Nguyen
2018-07-25 17:46     ` Brandon Williams
2018-08-02 15:17       ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180718182154.GE17137@google.com \
    --to=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.