* [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
@ 2011-06-06 17:26 Alexander Neronskiy
2011-06-06 18:21 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Alexander Neronskiy @ 2011-06-06 17:26 UTC (permalink / raw)
To: git; +Cc: schacon, Johannes.Schindelin
Explain the exchange that occurs between a client and server when
the client is requesting shallow history and/or is already using
a shallow repository.
Signed-off-by: Alex Neronskiy <zakmagnus@google.com>
---
Documentation/technical/pack-protocol.txt | 87 ++++++++++++++++++++++-------
1 files changed, 66 insertions(+), 21 deletions(-)
diff --git a/Documentation/technical/pack-protocol.txt
b/Documentation/technical/pack-protocol.txt
index 369f91d..f576386 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -187,26 +187,28 @@ server determine what the minimal packfile
necessary for transport is.
Once the client has the initial list of references that the server
has, as well as the list of capabilities, it will begin telling the
-server what objects it wants and what objects it has, so the server
-can make a packfile that only contains the objects that the client needs.
-The client will also send a list of the capabilities it wants to be in
-effect, out of what the server said it could do with the first 'want' line.
+server what objects it wants, its shallow objects (if any), and the
+maximum commit depth it wants (if any). The client will also send a
+list of the capabilities it wants to be in effect, out of what the
+server said it could do with the first 'want' line.
----
upload-request = want-list
- have-list
- compute-end
+ *shallow-line
+ *1depth-request
+ flush-pkt
want-list = first-want
*additional-want
- flush-pkt
+
+ shallow-line = PKT_LINE("shallow" SP obj-id)
+
+ depth-request = PKT_LINE("deepen" SP depth)
first-want = PKT-LINE("want" SP obj-id SP capability-list LF)
additional-want = PKT-LINE("want" SP obj-id LF)
- have-list = *have-line
- have-line = PKT-LINE("have" SP obj-id LF)
- compute-end = flush-pkt / PKT-LINE("done")
+ depth = 1*DIGIT
----
Clients MUST send all the obj-ids it wants from the reference
@@ -215,21 +217,64 @@ discovery phase as 'want' lines. Clients MUST
send at least one
obj-id in a 'want' command which did not appear in the response
obtained through ref discovery.
-If client is requesting a shallow clone, it will now send a 'deepen'
-line with the depth it is requesting.
+The client MUST write all obj-ids which it only has shallow copies
+of (meaning that it does not have the parents of a commit) as
+'shallow' lines so that the server is aware of the limitations of
+the client's history. Clients MUST NOT mention an obj-id which
+it does not know exists on the server.
+
+The client now sends the maximum commit history depth it wants for
+this transaction, which is the number of commits it wants from the
+tip of the history, if any, as a 'deepen' line. A depth of 0 is the
+same as not making a depth request. The client does not want to receive
+any commits beyond this depth, nor objects needed only to complete
+those commits. Commits whose parents are not received as a result are
+marked as shallow.
+
+Once all the 'want's and 'shallow's (and optional 'deepen') are
+transferred, clients MUST send a flush-pkt. If the client has all
+the references on the server, and as much of their commit history
+as it is interested in, client flushes and disconnects.
+
+Otherwise, if the client sent a positive depth request, the server
+will determine which commits will and will not be shallow and
+send this information to the client. If the client did not request
+a positive depth, this step is skipped.
-Once all the "want"s (and optional 'deepen') are transferred,
-clients MUST send a flush-pkt. If the client has all the references
-on the server, client flushes and disconnects.
+----
+ shallow-update = *shallow-line
+ *unshallow-line
+ flush-pkt
-TODO: shallow/unshallow response and document the deepen command in the ABNF.
+ shallow-line = PKT-LINE("shallow" SP obj-id)
+
+ unshallow-line = PKT-LINE("unshallow" SP obj-id)
+----
+
+If the client has requested a positive depth, the server will compute
+the set of commits which are no deeper than the desired depth, starting
+at the client's wants. The server writes 'shallow' lines for each
+commit whose parents will not be sent as a result. The server writes
+an 'unshallow' line for each commit which the client has indicated is
+shallow, but is no longer shallow at the currently requested depth
+(that is, its parents will now be sent). The server MUST NOT mark
+as unshallow anything which the client has not indicated was shallow.
Now the client will send a list of the obj-ids it has using 'have'
-lines. In multi_ack mode, the canonical implementation will send up
-to 32 of these at a time, then will send a flush-pkt. The canonical
-implementation will skip ahead and send the next 32 immediately,
-so that there is always a block of 32 "in-flight on the wire" at a
-time.
+lines, so the server can make a packfile that only contains the objects
+that the client needs. In multi_ack mode, the canonical implementation
+will send up to 32 of these at a time, then will send a flush-pkt. The
+canonical implementation will skip ahead and send the next 32 immediately,
+so that there is always a block of 32 "in-flight on the wire" at a time.
+
+----
+ upload-haves = have-list
+ compute-end
+
+ have-list = *have-line
+ have-line = PKT-LINE("have" SP obj-id LF)
+ compute-end = flush-pkt / PKT-LINE("done")
+----
If the server reads 'have' lines, it then will respond by ACKing any
of the obj-ids the client said it had that the server also has. The
--
1.7.3.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-06 17:26 [PATCH] Document the underlying protocol used by shallow repositories and --depth commands Alexander Neronskiy
@ 2011-06-06 18:21 ` Junio C Hamano
2011-06-06 19:56 ` Alex Neronskiy
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-06-06 18:21 UTC (permalink / raw)
To: Alexander Neronskiy; +Cc: git, schacon, Johannes.Schindelin
Alexander Neronskiy <zakmagnus@google.com> writes:
> Explain the exchange that occurs between a client and server when
> the client is requesting shallow history and/or is already using
> a shallow repository.
>
> Signed-off-by: Alex Neronskiy <zakmagnus@google.com>
> ---
> Documentation/technical/pack-protocol.txt | 87 ++++++++++++++++++++++-------
Hmmmm, why is this patch riddled with these s?
> diff --git a/Documentation/technical/pack-protocol.txt
> b/Documentation/technical/pack-protocol.txt
> index 369f91d..f576386 100644
> --- a/Documentation/technical/pack-protocol.txt
> +++ b/Documentation/technical/pack-protocol.txt
> @@ -187,26 +187,28 @@ server determine what the minimal packfile
> necessary for transport is.
Linewrapped, perhaps by your MUA.
> Once the client has the initial list of references that the server
> has, as well as the list of capabilities, it will begin telling the
> -server what objects it wants and what objects it has, so the server
> -can make a packfile that only contains the objects that the client needs.
> -The client will also send a list of the capabilities it wants to be in
> -effect, out of what the server said it could do with the first 'want' line.
> +server what objects it wants, its shallow objects (if any), and the
> +maximum commit depth it wants (if any). The client will also send a
> +list of the capabilities it wants to be in effect, out of what the
> +server said it could do with the first 'want' line.
>
> ----
> upload-request = want-list
> - have-list
> - compute-end
> + *shallow-line
> + *1depth-request
> + flush-pkt
>
> want-list = first-want
> *additional-want
> - flush-pkt
> +
> + shallow-line = PKT_LINE("shallow" SP obj-id)
> +
> + depth-request = PKT_LINE("deepen" SP depth)
>
> first-want = PKT-LINE("want" SP obj-id SP capability-list LF)
> additional-want = PKT-LINE("want" SP obj-id LF)
>
> - have-list = *have-line
> - have-line = PKT-LINE("have" SP obj-id LF)
> - compute-end = flush-pkt / PKT-LINE("done")
> + depth = 1*DIGIT
> ----
This change splits the earlier "upload-request" that consisted want-list,
have-list and compute-end into two phases. The first phase described
above is where the client tells the server what it wants, and what it
doesn't have (by giving the "shallow" boundaries), and possibly limits the
"wants" by depth. The second phase is described much later and consists of
the "have" and "done", which comes after the "shallow-update" phase (whose
description is new).
I think the separation makes sense, as there is a lot to talk about what
happens during this phase.
> Clients MUST send all the obj-ids it wants from the reference
> @@ -215,21 +217,64 @@ discovery phase as 'want' lines. Clients MUST
> send at least one
> obj-id in a 'want' command which did not appear in the response
> obtained through ref discovery.
>
> -If client is requesting a shallow clone, it will now send a 'deepen'
> -line with the depth it is requesting.
> +The client MUST write all obj-ids which it only has shallow copies
> +of (meaning that it does not have the parents of a commit) as
> +'shallow' lines so that the server is aware of the limitations of
> +the client's history. Clients MUST NOT mention an obj-id which
> +it does not know exists on the server.
> +
> +The client now sends the maximum commit history depth it wants for
> +this transaction, which is the number of commits it wants from the
> +tip of the history, if any, as a 'deepen' line. A depth of 0 is the
> +same as not making a depth request. The client does not want to receive
> +any commits beyond this depth, nor objects needed only to complete
> +those commits. Commits whose parents are not received as a result are
> +marked as shallow.
... on the server end and will be sent back in the shallow-update phase
below.
> +Once all the 'want's and 'shallow's (and optional 'deepen') are
> +transferred, clients MUST send a flush-pkt. If the client has all
> +the references on the server, and as much of their commit history
> +as it is interested in, client flushes and disconnects.
Hmmmmm, are you describing "everything-local then flush and all-done" in
do_fetch_pack() with the second sentence? If so, placing the description
here is misleading. In that case, I do not think any of the find-common
exchange starting from the "upload-request" phase happens.
> +Otherwise, if the client sent a positive depth request, the server
> +will determine which commits will and will not be shallow and
> +send this information to the client. If the client did not request
> +a positive depth, this step is skipped.
> -Once all the "want"s (and optional 'deepen') are transferred,
> -clients MUST send a flush-pkt. If the client has all the references
> -on the server, client flushes and disconnects.
> +----
> + shallow-update = *shallow-line
> + *unshallow-line
> + flush-pkt
> ...
This is not a complaint to this patch, but I had to read the above twice
to realize that the paragraph "Otherwise..." is not a continuation of the
detailed discussion of the "upload-request" phase, but is a preamble to
the next "shallow-update" phase. It might make sense to give a subsection
heading to each of the phases, like...
Packfile Negotiation
1. upload-request phase
After reference and capabilities... (preamble for this phase)
----
upload-request = want-list ... ABNF
----
The client MUST send all the ... (detailed description of this
phase)
2. shallow-update phase
When the client sent a positive depth request, the server will
determine ... (preamble for this phase)
----
shallow-update = *shallow-line ... ABNF
----
... detailed description of this phase ...
3. common ancestor discovery phase
Now the client will send a list of ...
... likewise ...
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-06 18:21 ` Junio C Hamano
@ 2011-06-06 19:56 ` Alex Neronskiy
2011-06-07 20:23 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Alex Neronskiy @ 2011-06-06 19:56 UTC (permalink / raw)
To: git
Junio C Hamano <gitster <at> pobox.com> writes:
> > +Once all the 'want's and 'shallow's (and optional 'deepen') are
> > +transferred, clients MUST send a flush-pkt. If the client has all
> > +the references on the server, and as much of their commit history
> > +as it is interested in, client flushes and disconnects.
>
> Hmmmmm, are you describing "everything-local then flush and all-done" in
> do_fetch_pack() with the second sentence? If so, placing the description
> here is misleading. In that case, I do not think any of the find-common
> exchange starting from the "upload-request" phase happens.
No, this refers to the same event which was already described in that document,
which I believe happens from inside find_common. It may just be some confusion
on the meaning of "having a reference" on my part, but the idea was to point out
that the client could flush at this stage even if it doesn't have every commit.
I tried to amend the existing wording but I suppose it was just misleading, so
it's better to write something else entirely.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-06 19:56 ` Alex Neronskiy
@ 2011-06-07 20:23 ` Junio C Hamano
2011-06-07 20:47 ` Alex Neronskiy
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-06-07 20:23 UTC (permalink / raw)
To: Alex Neronskiy; +Cc: git
Alex Neronskiy <zakmagnus@google.com> writes:
> Junio C Hamano <gitster <at> pobox.com> writes:
>
>> > +Once all the 'want's and 'shallow's (and optional 'deepen') are
>> > +transferred, clients MUST send a flush-pkt. If the client has all
>> > +the references on the server, and as much of their commit history
>> > +as it is interested in, client flushes and disconnects.
>>
>> Hmmmmm, are you describing "everything-local then flush and all-done" in
>> do_fetch_pack() with the second sentence? If so, placing the description
>> here is misleading. In that case, I do not think any of the find-common
>> exchange starting from the "upload-request" phase happens.
>
> No, this refers to the same event which was already described in that document,
> which I believe happens from inside find_common.
"The same event which was already described in that document" meaning at
the beginning of "Packfile Negotiation" section? That is primarily about
the "ls-remote" that probed the server for the list of current refs, which
is received in connect.c::get_remote_heads(), but it also covers another
case. When fetching, after connect.c::get_remote_heads() finds the list of
current refs, do_fetch_pack() is called, and then everything_local() in it
checks if we have all the objects we are going to ask. If so, we flush and
jump to all_done to terminate the connection, skipping find_common(),
without doing any of the want/shallow/depth/etc.
I don't seem to be able to find where in find_common() and its callee we
could quit without telling the server anything (unless we crash ;-). Even
if get_rev() loop finds nothing, we would at least say "done".
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-07 20:23 ` Junio C Hamano
@ 2011-06-07 20:47 ` Alex Neronskiy
2011-06-07 22:05 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Alex Neronskiy @ 2011-06-07 20:47 UTC (permalink / raw)
To: git
Junio C Hamano <gitster <at> pobox.com> writes:
> "The same event which was already described in that document" meaning at
> the beginning of "Packfile Negotiation" section? That is primarily about
> the "ls-remote" that probed the server for the list of current refs, which
> is received in connect.c::get_remote_heads(), but it also covers another
> case. When fetching, after connect.c::get_remote_heads() finds the list of
> current refs, do_fetch_pack() is called, and then everything_local() in it
> checks if we have all the objects we are going to ask. If so, we flush and
> jump to all_done to terminate the connection, skipping find_common(),
> without doing any of the want/shallow/depth/etc.
>
> I don't seem to be able to find where in find_common() and its callee we
> could quit without telling the server anything (unless we crash . Even
> if get_rev() loop finds nothing, we would at least say "done".
>
The part of the document I'm referring to starts at line 221 and reads:
Once all the "want"s (and optional 'deepen') are transferred,
clients MUST send a flush-pkt. If the client has all the references
on the server, client flushes and disconnects.
And I believe this refers to the code path beginning at line 308 of fetch-pack.c:
if (!fetching) {
strbuf_release(&req_buf);
packet_flush(fd[1]);
return 1;
}
Am I wrong?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-07 20:47 ` Alex Neronskiy
@ 2011-06-07 22:05 ` Junio C Hamano
2011-06-07 22:31 ` Alex Neronskiy
2011-06-07 23:01 ` Junio C Hamano
0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2011-06-07 22:05 UTC (permalink / raw)
To: Alex Neronskiy; +Cc: git
Alex Neronskiy <zakmagnus@google.com> writes:
> The part of the document I'm referring to starts at line 221 and reads:
>
> Once all the "want"s (and optional 'deepen') are transferred,
> clients MUST send a flush-pkt. If the client has all the references
> on the server, client flushes and disconnects.
>
> And I believe this refers to the code path beginning at line 308 of fetch-pack.c:
>
> if (!fetching) {
> strbuf_release(&req_buf);
> packet_flush(fd[1]);
> return 1;
> }
>
> Am I wrong?
Ah, I overlooked that codepath, but if that if statement triggered, that
would mean fetching is still 0, which in turn means that you never sent
any "want", so "Once all the 'want's' (and optional 'deepen') are
transferred" is not even true, is it?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-07 22:05 ` Junio C Hamano
@ 2011-06-07 22:31 ` Alex Neronskiy
2011-06-07 23:01 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Alex Neronskiy @ 2011-06-07 22:31 UTC (permalink / raw)
To: git
Junio C Hamano <gitster <at> pobox.com> writes:
> Ah, I overlooked that codepath, but if that if statement triggered, that
> would mean fetching is still 0, which in turn means that you never sent
> any "want", so "Once all the 'want's' (and optional 'deepen') are
> transferred" is not even true, is it?
If you want to get pedantic, it IS true that "all" the 'want's are sent; the
correct set of wants to send just happens to be empty. What DOES seem incorrect
is the part about the 'deepen' (as well as the 'shallow's I'm proposing to add);
that part of the code isn't even reached if this termination happens. So, either
I'm mistaken and that is NOT the right codepath, or this is a mistake already
present in the documentation.
Alternatively, the problem there is that it's just deceptively worded. It
implies that a 'deepen' can be sent and that termination can still happen
afterwards; but I don't believe this is possible. If a depth argument is
present, everything_local is not called and COMPLETEness is not set, so it's
impossible to skip any refs except in the corner case where there aren't any to
begin with. The second version of this patch addresses this better.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Document the underlying protocol used by shallow repositories and --depth commands.
2011-06-07 22:05 ` Junio C Hamano
2011-06-07 22:31 ` Alex Neronskiy
@ 2011-06-07 23:01 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2011-06-07 23:01 UTC (permalink / raw)
To: Alex Neronskiy; +Cc: git
Junio C Hamano <gitster@pobox.com> writes:
> Alex Neronskiy <zakmagnus@google.com> writes:
>
>> The part of the document I'm referring to starts at line 221 and reads:
>>
>> Once all the "want"s (and optional 'deepen') are transferred,
>> clients MUST send a flush-pkt. If the client has all the references
>> on the server, client flushes and disconnects.
>>
>> And I believe this refers to the code path beginning at line 308 of fetch-pack.c:
>>
>> if (!fetching) {
>> strbuf_release(&req_buf);
>> packet_flush(fd[1]);
>> return 1;
>> }
>>
>> Am I wrong?
>
> Ah, I overlooked that codepath, but if that if statement triggered, that
> would mean fetching is still 0, which in turn means that you never sent
> any "want", so "Once all the 'want's' (and optional 'deepen') are
> transferred" is not even true, is it?
Also I suspect that this might not even trigger. fetching will stay 0 only
when all the refs->old_sha1 refer to commits already marked as COMPLETE, but
everything_local() would already have covered that case and returned 1.
The code dates back to 2759cbc (git-fetch-pack: avoid unnecessary zero
packing, 2005-10-18) and I tend to trust what Linus wrote, but in that
much simpler version of the code, I think the check is redundant.
In either case, I think what the server sees (in other words, what goes
over the wire at the protocol level) is the same. The client receives
references and capabilities, and sends the flush without giving any "want"
to the server.
And that is exactly what is described in the first paragraph of "Packfile
Negotiation" section.
I am inclined to conclude that the original documentation "If the client
has all the references on the server, client flushes and disconnects",
while technically not incorrect, is redundant (because that particular
behaviour on-the-wire is already spelled out in the first paragraph), and
misleading (because that condition is determined by the client without
sending "want" etc., but the second paragraph "Once the client has ..., it
will begin telling the server..." that comes way before this "client can
quit without requesting anything" gives a false impression that the client
will spend all these effort of sending "want" and can flush and disconnect.
In fact the original does not say the client should not ask for objects it
already has (as COMPLETE) with "want", and as you noticed, the expression
"has all the references" is probably the source of confusion. It should
have said "If the client did not send any "want", it may flush and
disconnect". Both the original and your update will allow an incorrectly
implemented client to send "want"s, realize that it does not want any, and
then flush to disconnect, but then the server will have to try to produce
a pack to send to the client, which is certainly not what we wanted to
specify.
So I would suggest removing the later part of the paragraph, and update
the first paragraph of the section instead.
Perhaps like this...
Documentation/technical/pack-protocol.txt | 21 +++++++++++----------
1 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 369f91d..f860f2a 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -179,14 +179,15 @@ and descriptions.
Packfile Negotiation
--------------------
-After reference and capabilities discovery, the client can decide
-to terminate the connection by sending a flush-pkt, telling the
-server it can now gracefully terminate (as happens with the ls-remote
-command) or it can enter the negotiation phase, where the client and
-server determine what the minimal packfile necessary for transport is.
-
-Once the client has the initial list of references that the server
-has, as well as the list of capabilities, it will begin telling the
+After reference and capabilities discovery, the client can decide to
+terminate the connection by sending a flush-pkt, telling the server it can
+now gracefully terminate, and disconnect, when it does not need any pack
+data. This can happen with the ls-remote command, and also can happen when
+the client already is up-to-date.
+
+Otherwise, it enters the negotiation phase, where the client and
+server determine what the minimal packfile necessary for transport is,
+by telling the
server what objects it wants and what objects it has, so the server
can make a packfile that only contains the objects that the client needs.
The client will also send a list of the capabilities it wants to be in
@@ -219,8 +220,8 @@ If client is requesting a shallow clone, it will now send a 'deepen'
line with the depth it is requesting.
Once all the "want"s (and optional 'deepen') are transferred,
-clients MUST send a flush-pkt. If the client has all the references
-on the server, client flushes and disconnects.
+clients MUST send a flush-pkt, to tell the server side that it is
+done sending the list.
TODO: shallow/unshallow response and document the deepen command in the ABNF.
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-06-07 23:01 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-06 17:26 [PATCH] Document the underlying protocol used by shallow repositories and --depth commands Alexander Neronskiy
2011-06-06 18:21 ` Junio C Hamano
2011-06-06 19:56 ` Alex Neronskiy
2011-06-07 20:23 ` Junio C Hamano
2011-06-07 20:47 ` Alex Neronskiy
2011-06-07 22:05 ` Junio C Hamano
2011-06-07 22:31 ` Alex Neronskiy
2011-06-07 23:01 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).