* [RFC/PATCH 0/2] some transport-helper "option object-format" confusion
@ 2024-03-07 8:47 Jeff King
2024-03-07 8:51 ` [PATCH 1/2] t5801: fix object-format handling in git-remote-testgit Jeff King
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Jeff King @ 2024-03-07 8:47 UTC (permalink / raw)
To: git; +Cc: brian m. carlson
I happened to be looking at the output of t5801 for an unrelated
problem, and I noticed our git-remote-testgit spewing a bunch of shell
errors. It turns out that its expectations do not quite match what the
transport-helper code produces.
This series brings the test and documentation in line with how the
transport-helper code behaves. But I'm not sure if we should be going
the other way (see the comments on patch 2 especially), and bringing the
transport-helper code in line with the others. Hence the RFC.
[1/2]: t5801: fix object-format handling in git-remote-testgit
[2/2]: doc/gitremote-helpers: match object-format option docs to code
Documentation/gitremote-helpers.txt | 7 ++-----
t/t5801/git-remote-testgit | 6 ++++--
2 files changed, 6 insertions(+), 7 deletions(-)
-Peff
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/2] t5801: fix object-format handling in git-remote-testgit
2024-03-07 8:47 [RFC/PATCH 0/2] some transport-helper "option object-format" confusion Jeff King
@ 2024-03-07 8:51 ` Jeff King
2024-03-07 8:56 ` [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code Jeff King
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
2 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2024-03-07 8:51 UTC (permalink / raw)
To: git; +Cc: brian m. carlson
Our fake remote helper tries to handle the object-format capability,
courtesy of 3716d50dd5 (remote-testgit: adapt for object-format,
2020-06-19). But its parsing isn't quite right; it expects to receive
"option object-format true", but the transport-helper code just sends
"option object-format" with no value.
As a result, we never set the $object_format variable to "true". And
worse, because $val is used unquoted, this confuses the shell's "test"
command, which prints something like:
.../git/t/t5801/git-remote-testgit: 150: test: =: unexpected operator
It all turns out to be harmless, though, because we never look at
$object_format after that!
The Git-side behavior comes from 8b85ee4f47 (transport-helper: implement
object-format extensions, 2020-05-25). It is a bit unlike other "option"
variables, which always say "true" or "false". But in this case, there's
not really any need to do so. As I understand it from that commit, the
sequence is something like:
1. the remote helper in its capabilities list says "object-format" to
tell Git that it understands the object-format option.
2. Git then tells the helper "option object-format" to tell it that it
too understands object-formats.
3. when the remote helper lists refs, it sends a special
":object-format" line that tells Git which object format it is
using. But it presumably should only do this if we found out that
the other side supports object-formats in step (2).
So let's improve our remote-testgit helper a bit:
- when we see an object-format line, just set object_format=true;
that's the only useful thing to take away from it
- make sure that object_format is set before sending the special
":object-format" line. Since we're always testing against a version
of Git recent enough to have sent us the object-format option, this
is mostly a noop. But it confirms that the transport-helper code is
correctly sending us the option (if we fail to send the line, then
the test will fail when run with GIT_TEST_DEFAULT_HASH=sha256).
Signed-off-by: Jeff King <peff@peff.net>
---
The only other helper we ship that knows about object-format is
remote-curl. And there it _does_ expect "true" or an algorithm.
Curiously the "true" thing works because the remote-curl code silently
rewrites "option foo" to be the same as "option foo true". And even
though it understands receiving a specific algorithm, I'm not sure it
would do anything useful (whatever the caller says is generally
overwritten by the info/refs response).
So I dunno.
t/t5801/git-remote-testgit | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/t/t5801/git-remote-testgit b/t/t5801/git-remote-testgit
index 1544d6dc6b..b348608847 100755
--- a/t/t5801/git-remote-testgit
+++ b/t/t5801/git-remote-testgit
@@ -25,6 +25,7 @@ GIT_DIR="$url/.git"
export GIT_DIR
force=
+object_format=
mkdir -p "$dir"
@@ -56,7 +57,8 @@ do
echo
;;
list)
- echo ":object-format $(git rev-parse --show-object-format=storage)"
+ test -n "$object_format" &&
+ echo ":object-format $(git rev-parse --show-object-format=storage)"
git for-each-ref --format='? %(refname)' 'refs/heads/' 'refs/tags/'
head=$(git symbolic-ref HEAD)
echo "@$head HEAD"
@@ -142,7 +144,7 @@ do
echo "ok"
;;
object-format)
- test $val = "true" && object_format="true" || object_format=
+ object_format=true
echo "ok"
;;
*)
--
2.44.0.463.g71abcb3a9f
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-07 8:47 [RFC/PATCH 0/2] some transport-helper "option object-format" confusion Jeff King
2024-03-07 8:51 ` [PATCH 1/2] t5801: fix object-format handling in git-remote-testgit Jeff King
@ 2024-03-07 8:56 ` Jeff King
2024-03-07 22:20 ` brian m. carlson
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
2 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2024-03-07 8:56 UTC (permalink / raw)
To: git; +Cc: brian m. carlson
Git's transport-helper code has always sent "option object-format\n",
and never provided the "true" or "algorithm" arguments. While the
"algorithm" request is something we might need or want to eventually
support, it probably makes sense for now to document the actual
behavior, especially as it has been in place for several years, since
8b85ee4f47 (transport-helper: implement object-format extensions,
2020-05-25).
Signed-off-by: Jeff King <peff@peff.net>
---
As I discussed in patch 1, remote-curl does handle the "true" thing
correctly. And that's really the helper that matters in practice (it's
possible some third party helper is looking for the explicit "true", but
presumably they'd have reported their confusion to the list). So we
could probably just start tacking on the "true" in transport-helper.c
and leave that part of the documentation untouched.
I'm less sure of the specific-algorithm thing, just because it seems
like remote-curl would never make use of it anyway (preferring instead
to match whatever algorithm is used by the http remote). But maybe there
are pending interoperability plans that depend on this?
I guess it would not hurt to leave it in place even if transport-helper
never produces it. On the other hand, any helper which advertises the
"object-format" capability is supposed to support it, and without the
transport-helper side being implemented, I don't know how any helper
program can claim that.
Documentation/gitremote-helpers.txt | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 07c8439a6f..12dffbf383 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -542,13 +542,10 @@ set by Git if the remote helper has the 'option' capability.
transaction. If successful, all refs will be updated, or none will. If the
remote side does not support this capability, the push will fail.
-'option object-format' {'true'|algorithm}::
- If 'true', indicate that the caller wants hash algorithm information
+'option object-format'::
+ Indicate that the caller wants hash algorithm information
to be passed back from the remote. This mode is used when fetching
refs.
-+
-If set to an algorithm, indicate that the caller wants to interact with
-the remote side using that algorithm.
SEE ALSO
--------
--
2.44.0.463.g71abcb3a9f
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-07 8:56 ` [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code Jeff King
@ 2024-03-07 22:20 ` brian m. carlson
2024-03-12 7:45 ` Jeff King
0 siblings, 1 reply; 21+ messages in thread
From: brian m. carlson @ 2024-03-07 22:20 UTC (permalink / raw)
To: Jeff King; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 1605 bytes --]
On 2024-03-07 at 08:56:32, Jeff King wrote:
> Git's transport-helper code has always sent "option object-format\n",
> and never provided the "true" or "algorithm" arguments. While the
> "algorithm" request is something we might need or want to eventually
> support, it probably makes sense for now to document the actual
> behavior, especially as it has been in place for several years, since
> 8b85ee4f47 (transport-helper: implement object-format extensions,
> 2020-05-25).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> As I discussed in patch 1, remote-curl does handle the "true" thing
> correctly. And that's really the helper that matters in practice (it's
> possible some third party helper is looking for the explicit "true", but
> presumably they'd have reported their confusion to the list). So we
> could probably just start tacking on the "true" in transport-helper.c
> and leave that part of the documentation untouched.
>
> I'm less sure of the specific-algorithm thing, just because it seems
> like remote-curl would never make use of it anyway (preferring instead
> to match whatever algorithm is used by the http remote). But maybe there
> are pending interoperability plans that depend on this?
It was designed to allow indicating that we know how to support both
SHA-1 and SHA-256 and we want one or the other (so we don't need to do
an expensive conversion). However, if it's not implemented, I agree we
should document what's implemented, and then extend it when interop
comes.
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-07 22:20 ` brian m. carlson
@ 2024-03-12 7:45 ` Jeff King
2024-03-13 21:11 ` brian m. carlson
0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2024-03-12 7:45 UTC (permalink / raw)
To: brian m. carlson; +Cc: git
On Thu, Mar 07, 2024 at 10:20:16PM +0000, brian m. carlson wrote:
> > As I discussed in patch 1, remote-curl does handle the "true" thing
> > correctly. And that's really the helper that matters in practice (it's
> > possible some third party helper is looking for the explicit "true", but
> > presumably they'd have reported their confusion to the list). So we
> > could probably just start tacking on the "true" in transport-helper.c
> > and leave that part of the documentation untouched.
> >
> > I'm less sure of the specific-algorithm thing, just because it seems
> > like remote-curl would never make use of it anyway (preferring instead
> > to match whatever algorithm is used by the http remote). But maybe there
> > are pending interoperability plans that depend on this?
>
> It was designed to allow indicating that we know how to support both
> SHA-1 and SHA-256 and we want one or the other (so we don't need to do
> an expensive conversion). However, if it's not implemented, I agree we
> should document what's implemented, and then extend it when interop
> comes.
I guess my reservation is that when it _does_ come time to extend, we'll
have to introduce a new capability. The capability "object-format" has a
documented meaning now, and what we send is currently a subset of that
(sort of[1]). If we later start sending an explicit algorithm, then in
theory they're supposed to handle that, too, if they implemented against
the docs.
Whereas if we roll back the explicit-algorithm part of the docs, now we
can't assume any helper claiming "object-format" will understand it. And
we'll need them to say "object-format-extended" or something. That's
both more work, and delays adoption for helpers which implemented what
the current docs say.
So I guess my question was more of: are we thinking this explicit
algorithm thing is coming very soon? If so, it might be worth keeping it
in the docs. But if not, and it's just a hypothetical future, it may be
better to clean things up now. And I ask you as the person who mostly
juggles possible future algorithm plans in his head. ;) Of course if the
answer is some combination of "I don't really remember what the plan
was" and "I don't have time to work on it anytime soon" that's OK, too.
-Peff
[1] In the above I'm really just talking about the explicit-algorithm
part. The "sort of" is that we claim to send "object-format true"
but actually just send "object-format". There I'm more inclined to
just align the docs with practice, as the two are equivalent.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-12 7:45 ` Jeff King
@ 2024-03-13 21:11 ` brian m. carlson
2024-03-14 12:47 ` Eric W. Biederman
2024-03-14 15:33 ` Junio C Hamano
0 siblings, 2 replies; 21+ messages in thread
From: brian m. carlson @ 2024-03-13 21:11 UTC (permalink / raw)
To: Jeff King; +Cc: git, Eric W. Biederman
[-- Attachment #1: Type: text/plain, Size: 1129 bytes --]
On 2024-03-12 at 07:45:13, Jeff King wrote:
> So I guess my question was more of: are we thinking this explicit
> algorithm thing is coming very soon? If so, it might be worth keeping it
> in the docs. But if not, and it's just a hypothetical future, it may be
> better to clean things up now. And I ask you as the person who mostly
> juggles possible future algorithm plans in his head. ;) Of course if the
> answer is some combination of "I don't really remember what the plan
> was" and "I don't have time to work on it anytime soon" that's OK, too.
The answer is that I'm not planning on doing the SHA-1/SHA-256 interop
work except as part of my employment, since I'm kinda out of energy in
that area and it's a lot of work, and I don't believe that my employer
is planning to have me do that anytime soon. Thus, if nobody else is
planning on doing it in short order, it probably won't be getting done.
I know Eric was working on some of the interop work, so perhaps he can
speak to whether he's planning on working in this area soonish.
--
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-13 21:11 ` brian m. carlson
@ 2024-03-14 12:47 ` Eric W. Biederman
2024-03-14 21:21 ` brian m. carlson
2024-03-14 15:33 ` Junio C Hamano
1 sibling, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2024-03-14 12:47 UTC (permalink / raw)
To: brian m. carlson; +Cc: Jeff King, git
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> On 2024-03-12 at 07:45:13, Jeff King wrote:
>> So I guess my question was more of: are we thinking this explicit
>> algorithm thing is coming very soon? If so, it might be worth keeping it
>> in the docs. But if not, and it's just a hypothetical future, it may be
>> better to clean things up now. And I ask you as the person who mostly
>> juggles possible future algorithm plans in his head. ;) Of course if the
>> answer is some combination of "I don't really remember what the plan
>> was" and "I don't have time to work on it anytime soon" that's OK, too.
Given the rest of the conversation I thought something about the
object-format option was going to depend upon work that I am doing.
Reading up on object-format this seems to be something that should
be sorted out now.
Fundamentally the object-format code is about a client representing a
SHA256 repository encountering a server representing a SHA1 repository
and detecting and handling that case cleanly. Or the other way around.
This is a current concern as SHA1 and SHA256 repositories are both
currently supported.
The only future concern is what happens when a client for a SHA256
repository encounters a server serving a SHA1 repository and wants to
switch into a compatibility mode, before it starts sending something
that will confuse the server.
That said I think a lot of think we do a lot of that today in practice
by simply detecting the length of the hash.
In general the plan is that all of the multiple hash interop work
happens on the client and the server worries about handling a single
hash efficiently.
That said I haven't worked with the git protocol so I don't know
what is needed in detail for a client to figure out what the server
is speaking and cleanly abort, or quickly switch to the servers
language. Jeff do you have any insight into that?
> The answer is that I'm not planning on doing the SHA-1/SHA-256 interop
> work except as part of my employment, since I'm kinda out of energy in
> that area and it's a lot of work, and I don't believe that my employer
> is planning to have me do that anytime soon. Thus, if nobody else is
> planning on doing it in short order, it probably won't be getting done.
>
> I know Eric was working on some of the interop work, so perhaps he can
> speak to whether he's planning on working in this area soonish.
Soon-ish.
Getting the SHA1/SHA256 interop working is something that I feel pretty
strongly about. So once I can set aside some time I am going to
push forward with it.
I have code doing with pretty much everything else working and tested
except the actual interop working at this point. That is I have code
for bi-hash repositories.
Breaking everything into small enough chunks that people don't feel
daunted looking at the code has been a bit of a challenge. My current
plan is to write some ``unit tests'' (that is tests that test a single
abstraction in the code at a time), so I can feel comfortable feeding
things in much smaller pieces.
Once the core infrastructure is merged for bi-hash repositories then
I plan to work on the actual interop between the repositories. With
the challenging technical problem I have been looking at is quickly
and efficiently writing a pack in the repository hash, while
retaining a translation to it's original hash.
Once the translation is done the rest is fiddly bits that should come
fairly quickly and should be comparatively easy to review. AKA things
like the client detecting the other end is using a different hash
algorithm and using that information to send heads in a format the
server understands.
That said I will be happy to help sort out object-format now.
That is maintenance and it has no dependencies that I am aware
of.
...
That said. Sorting out object-format has no dependencies on anything
else I have been doing. I will be happy to help sort that out right
now.
Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-13 21:11 ` brian m. carlson
2024-03-14 12:47 ` Eric W. Biederman
@ 2024-03-14 15:33 ` Junio C Hamano
2024-03-14 21:54 ` brian m. carlson
1 sibling, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2024-03-14 15:33 UTC (permalink / raw)
To: brian m. carlson; +Cc: Jeff King, git, Eric W. Biederman
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> The answer is that I'm not planning on doing the SHA-1/SHA-256 interop
> work except as part of my employment, since I'm kinda out of energy in
> that area and it's a lot of work, and I don't believe that my employer
> is planning to have me do that anytime soon.
It is sad to hear that it is depriotised, even though it is one of
the larger areas with high importance for the longer term. Thank
you very much for the progress in this area so far..
> Thus, if nobody else is
> planning on doing it in short order, it probably won't be getting done.
>
> I know Eric was working on some of the interop work, so perhaps he can
> speak to whether he's planning on working in this area soonish.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-14 12:47 ` Eric W. Biederman
@ 2024-03-14 21:21 ` brian m. carlson
2024-03-15 15:41 ` Eric W. Biederman
0 siblings, 1 reply; 21+ messages in thread
From: brian m. carlson @ 2024-03-14 21:21 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Jeff King, git
[-- Attachment #1: Type: text/plain, Size: 750 bytes --]
On 2024-03-14 at 12:47:16, Eric W. Biederman wrote:
> That said I think a lot of think we do a lot of that today in practice
> by simply detecting the length of the hash.
That's only true for the dumb HTTP protocol. Everything else should not
do that and we specifically want to avoid doing that, since we may very
well end up with SHA-3-256 or another 256-bit hash instead of SHA-256 if
there are sufficient cryptographic advances.
In fact, if we're going to support reftables via the dumb HTTP protocol,
then we should add some sort of capability advertisement that tells the
remote side what functionality is supported, and simply specify the hash
in that format.
--
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-14 15:33 ` Junio C Hamano
@ 2024-03-14 21:54 ` brian m. carlson
0 siblings, 0 replies; 21+ messages in thread
From: brian m. carlson @ 2024-03-14 21:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git, Eric W. Biederman
[-- Attachment #1: Type: text/plain, Size: 1459 bytes --]
On 2024-03-14 at 15:33:24, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
> > The answer is that I'm not planning on doing the SHA-1/SHA-256 interop
> > work except as part of my employment, since I'm kinda out of energy in
> > that area and it's a lot of work, and I don't believe that my employer
> > is planning to have me do that anytime soon.
>
> It is sad to hear that it is depriotised, even though it is one of
> the larger areas with high importance for the longer term. Thank
> you very much for the progress in this area so far..
I don't want to claim that my employer is not prioritizing SHA-256, it's
just that the focus right now is not having me write the interop code.
Other work is ongoing which has and probably will in the future result
in Git contributions, although not necessarily directly related to the
interop work. Some of our work porting away from from libgit2 to Git to
get better SHA-256 support has resulted in us writing new features which
we upstream.
As far as my personal contributions, I'm focusing on other, smaller
Git-related things right now[0], and I'm just writing less code in C
(and effectively no code in C other than Git). And I'm also doing other
things in my life which leave me less time to work on Git.
[0] Including, hopefully soon, some credential helper improvements.
--
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-14 21:21 ` brian m. carlson
@ 2024-03-15 15:41 ` Eric W. Biederman
2024-03-16 6:04 ` Jeff King
0 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2024-03-15 15:41 UTC (permalink / raw)
To: brian m. carlson; +Cc: Jeff King, git
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> On 2024-03-14 at 12:47:16, Eric W. Biederman wrote:
>> That said I think a lot of think we do a lot of that today in practice
>> by simply detecting the length of the hash.
>
> That's only true for the dumb HTTP protocol. Everything else should not
> do that and we specifically want to avoid doing that, since we may very
> well end up with SHA-3-256 or another 256-bit hash instead of SHA-256 if
> there are sufficient cryptographic advances.
My apologies. I thought Jeff King was reporting that object-format
extension did not work, and that had been masked by a test.
I see you saying and a quick grep through the code supports that the
object-format extension is implemented, and that the primary problem
is that the Documentation varies slightly from what is implemented.
Looking at the code I am left with the question:
Is the object-format extension properly implemented in all cases?
If the object-format extension is properly implemented such that a
client and server mismatch can be detected I am for just Documenting
what is currently implemented and calling it good.
The reason for that is
Documentation/technical/hash-function-transition.txt does not expect
servers to support more than hash function. I don't have a perspective
that differs. So detecting what the client and server support and
failing if they differ should be good enough.
I am concerned that the current code may not report it's hash function
in all of the cases it needs to, to be able to detect a mismatch.
I look at commit 8b85ee4f47aa ("transport-helper: implement
object-format extensions") and I don't see anything that generates
":object-format=" after it has been asked for except the code
in remote-curl.c added in commit 7f60501775b2 ("remote-curl: implement
object-format extensions").
Maybe I am mistaken but a name like remote-curl has me strongly
suspecting that it does not cover all of the cases that git supports
that implement protocol v2.
I think I see some omissions in updating the protocol v2 Documentation.
Can some folks who understand how git protocol v2 is implemented better
that I do, tell me if I am seeing things or if it indeed looks like
there are some omissions in the object-format implementation?
Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-15 15:41 ` Eric W. Biederman
@ 2024-03-16 6:04 ` Jeff King
2024-03-17 20:47 ` Eric W. Biederman
0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2024-03-16 6:04 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: brian m. carlson, git
On Fri, Mar 15, 2024 at 10:41:24AM -0500, Eric W. Biederman wrote:
> I see you saying and a quick grep through the code supports that the
> object-format extension is implemented, and that the primary problem
> is that the Documentation varies slightly from what is implemented.
>
>
> Looking at the code I am left with the question:
> Is the object-format extension properly implemented in all cases?
>
>
> If the object-format extension is properly implemented such that a
> client and server mismatch can be detected I am for just Documenting
> what is currently implemented and calling it good.
>
> The reason for that is
> Documentation/technical/hash-function-transition.txt does not expect
> servers to support more than hash function. I don't have a perspective
> that differs. So detecting what the client and server support and
> failing if they differ should be good enough.
AFAIK the code all works correctly, and there are no cases where we fail
to notice a mismatch. The two code/doc inconsistencies (and bearing in
mind this is for the transport-helper protocol, not the v2 protocol
itself) are:
- the docs say "object-format true", but the code just says
"object-format". They're semantically equivalent, so it's just a
minor syntax issue.
- the docs say that Git may write "object-format sha256" to the
helper, but the code will never do that.
So my big question is for the second case: is that something that we'll
need to be able to do (possibly to support interop, but possibly for
some other case)? If not, we should probably just fix the docs. If so,
then we need to either fix the code, or accept that we'll need to add a
new capability/extension later.
> I am concerned that the current code may not report it's hash function
> in all of the cases it needs to, to be able to detect a mismatch.
>
> I look at commit 8b85ee4f47aa ("transport-helper: implement
> object-format extensions") and I don't see anything that generates
> ":object-format=" after it has been asked for except the code
> in remote-curl.c added in commit 7f60501775b2 ("remote-curl: implement
> object-format extensions").
>
> Maybe I am mistaken but a name like remote-curl has me strongly
> suspecting that it does not cover all of the cases that git supports
> that implement protocol v2.
That all sounds right. We are talking just about the transport-helper
protocol here, where Git speaks to a separate program that actually
contacts the remote server. And the main helper we ship is remote-curl
(which handles https, http, etc). Everything else is linked directly and
does not need to use a separate process (we use a separate process to
avoid linking curl, openssl, etc into the main Git binary).
We do ship remote-fd and remote-ext, but they don't support most options
(and probably don't need to, because they're mostly pass-throughs that
just use the "connect" feature).
The other major helpers people tend to use are adapters to other version
control systems (e.g., remote-hg, cinnabar). We don't ship any of those
ourselves. They'll obviously need to learn about the transport-helper
object-format capability before they're ready to handle sha256 repos,
but I suspect that works has not really started.
> I think I see some omissions in updating the protocol v2 Documentation.
If you mean from the commits listed above, I don't think so; they are
just touching the transport-helper protocol, not the v2 wire protocol.
-Peff
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-16 6:04 ` Jeff King
@ 2024-03-17 20:47 ` Eric W. Biederman
2024-03-18 8:49 ` Jeff King
0 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2024-03-17 20:47 UTC (permalink / raw)
To: Jeff King; +Cc: brian m. carlson, git
Jeff King <peff@peff.net> writes:
> On Fri, Mar 15, 2024 at 10:41:24AM -0500, Eric W. Biederman wrote:
>
>> I see you saying and a quick grep through the code supports that the
>> object-format extension is implemented, and that the primary problem
>> is that the Documentation varies slightly from what is implemented.
>>
>>
>> Looking at the code I am left with the question:
>> Is the object-format extension properly implemented in all cases?
>>
>>
>> If the object-format extension is properly implemented such that a
>> client and server mismatch can be detected I am for just Documenting
>> what is currently implemented and calling it good.
>>
>> The reason for that is
>> Documentation/technical/hash-function-transition.txt does not expect
>> servers to support more than hash function. I don't have a perspective
>> that differs. So detecting what the client and server support and
>> failing if they differ should be good enough.
>
> AFAIK the code all works correctly, and there are no cases where we fail
> to notice a mismatch. The two code/doc inconsistencies (and bearing in
> mind this is for the transport-helper protocol, not the v2 protocol
> itself)
Thank you for the explanation of the transport-helper vs the v2 helper
protocol explanation below.
> are:
>
> - the docs say "object-format true", but the code just says
> "object-format". They're semantically equivalent, so it's just a
> minor syntax issue.
I am a bit confused on this point after having read the code. It
appears that when "object-format" is sent remote-curl
experiences "object-format true".
Assuming remote-curl is the only remote helper that currently implements
the object-format capability. I think we ant to fix transport-helper to
send "object-format true" just to be consistent with all of the other
options.
Among other things that will allow using the set_helper_option helper
function, and it will generally keep the code robust as then the code
doesn't develop a special case for the one option that doesn't take an
option value.
> - the docs say that Git may write "object-format sha256" to the
> helper, but the code will never do that.
It looks like remote_curl will get confused in that case when it
processes "object-format sha256" as well. As it stores that value in
options.hash_algo, which in all other cases is used to store what the
hash algorithm computed from the remote side.
> So my big question is for the second case: is that something that we'll
> need to be able to do (possibly to support interop, but possibly for
> some other case)? If not, we should probably just fix the docs. If so,
> then we need to either fix the code, or accept that we'll need to add a
> new capability/extension later.
Since this is the transport helper understanding this enough
to give a good reply is challenging.
As I read things the happy path for most connections is either going to
turn into git protocol v2, git-fast-export, or git-fast-import.
Unless I am misunderstanding something all of those will bypass
the code paths the remote helper object-format capability affects.
It is only when the remote helper send "fallback" during connect
that the remote helper format capability might be used.
The only practical need I can imagine for this is if the client
is going to send oids before asking the remote side what it's oids
are. The only case I can imagine doing this is the initial push
of a repository.
My sense is that unless we can find a current case that was overlooked
during the initial conversion we should remove "object-format
<hash-function>" support from the code and the documentation.
Any new cases that are not currently implemented will almost
certainly be handled by the "smart" protocols.
Looking at the code in transport-helper.c:push_refs it appears the one
use case I can think of is explicitly not supported. The code says:
> if (!remote_refs) {
> fprintf(stderr,
> _("No refs in common and none specified; doing nothing.\n"
> "Perhaps you should specify a branch.\n"));
> return 0;
> }
>> I think I see some omissions in updating the protocol v2 Documentation.
>
> If you mean from the commits listed above, I don't think so; they are
> just touching the transport-helper protocol, not the v2 wire protocol.
This just proves I haven't dug through these protocol bits enough to
have a good understanding of how they operate yet.
So I think at the end of the day we just want to do something
the diff below.
Mostly it deletes and simplifies code, but I found one case where
a malfunctioning remote helper could confuse us, so I added a check
to ensure :object-format is sent when we expect it to be sent.
Does that jive with how you are reading the situation?
diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index ed8da428c98b..47e5bb2cc925 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -542,7 +542,7 @@ set by Git if the remote helper has the 'option' capability.
transaction. If successful, all refs will be updated, or none will. If the
remote side does not support this capability, the push will fail.
-'option object-format' {'true'|algorithm}::
+'option object-format' {'true'}::
If 'true', indicate that the caller wants hash algorithm information
to be passed back from the remote. This mode is used when fetching
refs.
diff --git a/remote-curl.c b/remote-curl.c
index 1161dc7fed68..6f4cb3467458 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -213,12 +213,8 @@ static int set_option(const char *name, const char *value)
} else if (!strcmp(name, "object-format")) {
int algo;
options.object_format = 1;
- if (strcmp(value, "true")) {
- algo = hash_algo_by_name(value);
- if (algo == GIT_HASH_UNKNOWN)
- die("unknown object format '%s'", value);
- options.hash_algo = &hash_algos[algo];
- }
+ if (strcmp(value, "true"))
+ die("unknown object format '%s'", value);
return 0;
} else {
return 1 /* unsupported */;
diff --git a/transport-helper.c b/transport-helper.c
index b660b7942f9f..e648f136287d 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1206,13 +1206,13 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
struct ref **tail = &ret;
struct ref *posn;
struct strbuf buf = STRBUF_INIT;
+ bool received_object_format = false;
data->get_refs_list_called = 1;
helper = get_helper(transport);
if (data->object_format) {
- write_str_in_full(helper->in, "option object-format\n");
- if (recvline(data, &buf) || strcmp(buf.buf, "ok"))
+ if (set_helper_option(transport, "object-format", "true"))
exit(128);
}
@@ -1236,9 +1236,13 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
die(_("unsupported object format '%s'"),
value);
transport->hash_algo = &hash_algos[algo];
+ received_hash_algo = true;
}
continue;
}
+ else if (data->object_format && !received_object_format) {
+ die(_("missing :object-format"));
+ }
eov = strchr(buf.buf, ' ');
if (!eov)
Eric
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code
2024-03-17 20:47 ` Eric W. Biederman
@ 2024-03-18 8:49 ` Jeff King
0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2024-03-18 8:49 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: brian m. carlson, git
On Sun, Mar 17, 2024 at 03:47:18PM -0500, Eric W. Biederman wrote:
> > - the docs say "object-format true", but the code just says
> > "object-format". They're semantically equivalent, so it's just a
> > minor syntax issue.
>
> I am a bit confused on this point after having read the code. It
> appears that when "object-format" is sent remote-curl
> experiences "object-format true".
Right, this is due to this code in remote-curl.c:
} else if (skip_prefix(buf.buf, "option ", &arg)) {
char *value = strchr(arg, ' ');
int result;
if (value)
*value++ = '\0';
else
value = "true";
which goes way back to the beginning of remote-curl, even though I don't
think Git ever sends a value-less option. Anyway, that's presumably why
nobody noticed that "option object-format" is unusual.
> Assuming remote-curl is the only remote helper that currently implements
> the object-format capability. I think we ant to fix transport-helper to
> send "object-format true" just to be consistent with all of the other
> options.
We could be breaking third-party helpers that we don't know about. Of
course, those helpers would have to have ignored the documentation. And
I suspect they simply don't exist, or somebody would have showed up and
asked about it (coupled with how new and relatively obscure the hash
algorithm work has been so far).
So maybe we can get away with fixing it now. We should definitely break
it out into its own patch so we can decide independently, though.
> > - the docs say that Git may write "object-format sha256" to the
> > helper, but the code will never do that.
>
> It looks like remote_curl will get confused in that case when it
> processes "object-format sha256" as well. As it stores that value in
> options.hash_algo, which in all other cases is used to store what the
> hash algorithm computed from the remote side.
Yeah, it ends up in the same variable. I _suspect_ it would simply be
overwritten by the remote repo's idea of the hash. I'm not sure if
that's a bug (if the specific algorithm given by the main process is
supposed to take precedence) or a feature (if it's just a suggestion,
and then the helper says "tough luck, the remote is using sha1"). It's
hard to tell because Git never sends it. ;)
> As I read things the happy path for most connections is either going to
> turn into git protocol v2, git-fast-export, or git-fast-import.
> Unless I am misunderstanding something all of those will bypass
> the code paths the remote helper object-format capability affects.
> It is only when the remote helper send "fallback" during connect
> that the remote helper format capability might be used.
Yeah, I suspect that is true for remote-curl. It may not be for other
helpers which don't support "connect".
> The only practical need I can imagine for this is if the client
> is going to send oids before asking the remote side what it's oids
> are. The only case I can imagine doing this is the initial push
> of a repository.
Hmm, I _think_ we are OK there in practice. Even if there are no refs on
the remote repo (running git-receive-pack), it will still issue a
capability line with its object-format. And then the helper (say,
remote-curl) will report that back to the caller (git-push) who might
say "hey, wait, there's a mismatch". And indeed, it seems to work in
practice with remote-curl, where the push yields:
fatal: the receiving end does not support this repository's hash algorithm
In theory I suppose Git could directly issue a "push" command to the
helper (which would then specify oids along with refs to push) without
ever issuing "list for-push" (which is what causes the helper to contact
the remote to discover and report back the object format). But it
doesn't do that, and I don't see why it ever would.
This is all neglecting dumb protocols that don't even know how to figure
out the object format of the other side, but I think that's an
orthogonal problem. Either it remains unsolved, or whatever solution we
come up with then gets pushed back over the transport-helper protocol in
the same way.
> My sense is that unless we can find a current case that was overlooked
> during the initial conversion we should remove "object-format
> <hash-function>" support from the code and the documentation.
Yeah, if you don't have any plans to use it for interop work, then I
think we can declare it useless. I'll rework my patch series a bit to
remove the useless sending-side code, and then add a patch on top to
switch the "true" syntax as discussed above.
> Looking at the code in transport-helper.c:push_refs it appears the one
> use case I can think of is explicitly not supported. The code says:
> > if (!remote_refs) {
> > fprintf(stderr,
> > _("No refs in common and none specified; doing nothing.\n"
> > "Perhaps you should specify a branch.\n"));
> > return 0;
> > }
That only triggers if you didn't ask to push anything. You might have a
sha256 ref locally and say "git push origin my-branch", and then we'd
need to communicate the ref/oid combo for my-branch to the helper. But
as above, I think by that point the helper will have discovered and
reported back the object-format to push.
> Mostly it deletes and simplifies code, but I found one case where
> a malfunctioning remote helper could confuse us, so I added a check
> to ensure :object-format is sent when we expect it to be sent.
That's probably a reasonable thing to check. We should update the docs
to indicate that it's required to send back ":object-format" if the
helper negotiated that capability. I'll add a patch to do that.
> Does that jive with how you are reading the situation?
Yep, I think I have a good sense how to proceed. It may be a day or so
before I produce a series. Thanks for the discussion!
-Peff
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 0/3] some transport-helper "option object-format" confusion
2024-03-07 8:47 [RFC/PATCH 0/2] some transport-helper "option object-format" confusion Jeff King
2024-03-07 8:51 ` [PATCH 1/2] t5801: fix object-format handling in git-remote-testgit Jeff King
2024-03-07 8:56 ` [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code Jeff King
@ 2024-03-20 9:32 ` Jeff King
2024-03-20 9:34 ` [PATCH 1/3] transport-helper: use write helpers more consistently Jeff King
` (3 more replies)
2 siblings, 4 replies; 21+ messages in thread
From: Jeff King @ 2024-03-20 9:32 UTC (permalink / raw)
To: git; +Cc: brian m. carlson, Eric W. Biederman
On Thu, Mar 07, 2024 at 03:47:35AM -0500, Jeff King wrote:
> I happened to be looking at the output of t5801 for an unrelated
> problem, and I noticed our git-remote-testgit spewing a bunch of shell
> errors. It turns out that its expectations do not quite match what the
> transport-helper code produces.
>
> This series brings the test and documentation in line with how the
> transport-helper code behaves. But I'm not sure if we should be going
> the other way (see the comments on patch 2 especially), and bringing the
> transport-helper code in line with the others. Hence the RFC.
>
> [1/2]: t5801: fix object-format handling in git-remote-testgit
> [2/2]: doc/gitremote-helpers: match object-format option docs to code
Here's a non-RFC v2 based on the discussion thus far (thanks brian and
Eric).
The big change is that instead of changing the docs to match true-less
"option object-format", the code is changed to match the docs. That
happens in patch 3 (which subsumes the original patch 1). We continue to
drop the documentation for the "option object-format sha256" form. But
now the commit message justifies it better, and we clean up the stale
code in remote-curl.c.
Patch 1 is a small fix for debugging output that I noticed after getting
confused. :-/ It's not strictly related and could be taken separately.
Eric mentioned having Git check that the helpers never say
":object-format" unless it was negotiated. I stopped short of that. One,
it's a bit tricky to test (since Git will always ask for object-format,
you have to teach remote-testgit to optionally send broken output). And
two, I'm not sure that being strict has much value here. It keeps remote
helpers honest, but the real losers are old versions that do not
understand :object-format, which would fail against such a remote. So I
dunno. It isn't any harder to do it on top later if we want to.
[1/3]: transport-helper: use write helpers more consistently
[2/3]: transport-helper: drop "object-format <algo>" option
[3/3]: transport-helper: send "true" value for object-format option
Documentation/gitremote-helpers.txt | 7 ++-----
remote-curl.c | 9 ++-------
t/t5801/git-remote-testgit | 4 +++-
transport-helper.c | 11 ++++-------
4 files changed, 11 insertions(+), 20 deletions(-)
-Peff
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/3] transport-helper: use write helpers more consistently
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
@ 2024-03-20 9:34 ` Jeff King
2024-03-20 9:37 ` [PATCH 2/3] transport-helper: drop "object-format <algo>" option Jeff King
` (2 subsequent siblings)
3 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2024-03-20 9:34 UTC (permalink / raw)
To: git; +Cc: brian m. carlson, Eric W. Biederman
The transport-helper code provides some functions for writing to the
helper process, but there are a few spots that don't use them. We should
do so consistently because:
1. They detect errors on write (though in practice this means the
helper process went away, and we'd see the problem as soon as we
try to read the response).
2. They dump the written bytes to the GIT_TRANSPORT_HELPER_DEBUG
stream. It's doubly confusing to miss some writes but not others,
as you see a partial conversation.
The "list" ones go all the way back to the beginning of the transport
helper code; they were just missed when most writes were converted in
bf3c523c3f (Add remote helper debug mode, 2009-12-09). The nearby
"object-format" write presumably just cargo-culted them, as it's only a
few lines away.
Signed-off-by: Jeff King <peff@peff.net>
---
I also find the output kind of verbose (especially the constant
"waiting" lines), and because it's not GIT_TRACE_TRANSPORT_HELPER, it's
annoying to use with the test scripts (it gets eaten by the test
harness, and you can't even redirect it to an alternative file).
So I was tempted to convert it, but it felt like too deep a rabbit hole
for today.
transport-helper.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/transport-helper.c b/transport-helper.c
index b660b7942f..7f6bbd06bb 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1211,15 +1211,15 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
helper = get_helper(transport);
if (data->object_format) {
- write_str_in_full(helper->in, "option object-format\n");
+ write_constant(helper->in, "option object-format\n");
if (recvline(data, &buf) || strcmp(buf.buf, "ok"))
exit(128);
}
if (data->push && for_push)
- write_str_in_full(helper->in, "list for-push\n");
+ write_constant(helper->in, "list for-push\n");
else
- write_str_in_full(helper->in, "list\n");
+ write_constant(helper->in, "list\n");
while (1) {
char *eov, *eon;
--
2.44.0.650.g4615f65fe0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 2/3] transport-helper: drop "object-format <algo>" option
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
2024-03-20 9:34 ` [PATCH 1/3] transport-helper: use write helpers more consistently Jeff King
@ 2024-03-20 9:37 ` Jeff King
2024-03-20 9:41 ` [PATCH 3/3] transport-helper: send "true" value for object-format option Jeff King
2024-03-20 17:05 ` [PATCH 0/3] some transport-helper "option object-format" confusion Eric W. Biederman
3 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2024-03-20 9:37 UTC (permalink / raw)
To: git; +Cc: brian m. carlson, Eric W. Biederman
The documentation in gitremote-helpers.txt claims that helpers should
accept an object-format option from Git whose value is either:
1. "true", in which case the helper is merely told that Git
understands the special ":object-format" response, and will send it
2. an algorithm name that the helper should use
However, Git has never sent the second form, and it's not clear if it
would ever be useful.
When interacting with a remote Git repository, we generally discover
what _their_ object format is, and then decide what to do with a
mismatch (where that is currently just "bail out", but could eventually
be on-the-fly conversion and interop). And that is true for native
protocols, but also for transport helpers like remote-curl that talk to
remote Git repositories. There we send back an ":object-format" line
telling Git what remote-curl detected on the other side.
And this is true even for pushes (since we get it via receive-pack's
advertisement). And it is even true for dumb-http, as we guess at the
algorithm based on the hash size, due to ac093d0790 (remote-curl: detect
algorithm for dumb HTTP by size, 2020-06-19).
The one case where it _isn't_ true is dumb-http talking to an empty
repository. There we have no clue what the remote hash is, so
remote-curl just sends back its default. If we kept the "object-format
<algo>" form then in theory Git could say "object-format sha256" to
change that default. But it doesn't really accomplish anything. We still
may or may not be mis-matched with the other side. For a fetch that's
OK, since it's by definition a noop. For a push into an empty
repository, it might matter (though the dumb http-push DAV code seems
happy to clobber a remote sha256 info/refs and corrupt the repository).
If we want to pursue making this work, I think we'd be better off
improving detection of the object format of empty repositories over
dumb-http (e.g., an "info/object-format" file).
But what about helpers that _aren't_ talking to another Git repo?
Consider something like git-cinnabar, which is converting on the fly
to/from hg. Most of the heavy lifting is done by fast-import/export, but
some oids may still pass between Git and the helper. Could
"object-format <algo>" be useful to tell the helper what oids we expect
to see?
Possibly, but in practice this isn't necessary. Git-cinnabar for example
already peeks at the local-repo .git/config to check its object-format
(and currently just bails if it is sha256).
So I think the "object-format" extension really is only useful for the
helper telling Git what object-format it found, and not the other way
around.
Note that this patch can't break any remote helpers; we're not changing
the code on the Git side at all, but just bringing the documentation in
line with what Git has always done. It does remove the receiving support
in remote-curl.c, but that code was never actually triggered.
Signed-off-by: Jeff King <peff@peff.net>
---
Documentation/gitremote-helpers.txt | 7 ++-----
remote-curl.c | 9 ++-------
2 files changed, 4 insertions(+), 12 deletions(-)
diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 07c8439a6f..0d3f4f37c2 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -542,13 +542,10 @@ set by Git if the remote helper has the 'option' capability.
transaction. If successful, all refs will be updated, or none will. If the
remote side does not support this capability, the push will fail.
-'option object-format' {'true'|algorithm}::
- If 'true', indicate that the caller wants hash algorithm information
+'option object-format true'::
+ Indicate that the caller wants hash algorithm information
to be passed back from the remote. This mode is used when fetching
refs.
-+
-If set to an algorithm, indicate that the caller wants to interact with
-the remote side using that algorithm.
SEE ALSO
--------
diff --git a/remote-curl.c b/remote-curl.c
index 1161dc7fed..31b02b8840 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -211,14 +211,9 @@ static int set_option(const char *name, const char *value)
options.filter = xstrdup(value);
return 0;
} else if (!strcmp(name, "object-format")) {
- int algo;
options.object_format = 1;
- if (strcmp(value, "true")) {
- algo = hash_algo_by_name(value);
- if (algo == GIT_HASH_UNKNOWN)
- die("unknown object format '%s'", value);
- options.hash_algo = &hash_algos[algo];
- }
+ if (strcmp(value, "true"))
+ die(_("unknown value for object-format: %s"), value);
return 0;
} else {
return 1 /* unsupported */;
--
2.44.0.650.g4615f65fe0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 3/3] transport-helper: send "true" value for object-format option
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
2024-03-20 9:34 ` [PATCH 1/3] transport-helper: use write helpers more consistently Jeff King
2024-03-20 9:37 ` [PATCH 2/3] transport-helper: drop "object-format <algo>" option Jeff King
@ 2024-03-20 9:41 ` Jeff King
2024-03-20 17:23 ` Junio C Hamano
2024-03-20 17:05 ` [PATCH 0/3] some transport-helper "option object-format" confusion Eric W. Biederman
3 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2024-03-20 9:41 UTC (permalink / raw)
To: git; +Cc: brian m. carlson, Eric W. Biederman
The documentation in gitremote-helpers.txt claims that after a helper
has advertised the "object-format" capability, Git may then send "option
object-format true" to indicate that it would like to hear which object
format the helper is using when it returns refs.
However, the code implementing this has always written just "option
object-format", without the extra "true" value. Nobody noticed in
practice or in the tests because the only two helpers we ship are:
- remote-curl, which quietly converts missing values into "true". This
goes all the way back to ef08ef9ea0 (remote-helpers: Support custom
transport options, 2009-10-30), despite the fact that I don't think
any other option has ever made use of it.
- remote-testgit in t5801 does insist on having a "true" value. But
since it sends the ":object-format" response regardless of whether
it thinks the caller asked for it (technically breaking protocol),
everything just works, albeit with an extra shell error:
.../git/t/t5801/git-remote-testgit: 150: test: =: unexpected operator
printed to stderr, which you can see running t5801 with --verbose.
(The problem is that $val is the empty string, and since we don't
double-quote it in "test $val = true", we invoke "test = true"
instead).
When the documentation and code do not match, it is often good to fix
the documentation rather than break compatibility. And in this case, we
have had the mis-match since 8b85ee4f47 (transport-helper: implement
object-format extensions, 2020-05-25). However, the sha256 feature was
listed as experimental until 8e42eb0e9a (doc: sha256 is no longer
experimental, 2023-07-31).
It's possible there are some third party helpers that tried to follow
the documentation, and are broken. Changing the code will fix them. It's
also possible that there are ones that follow the code and will be
broken if we change it. I suspect neither is the case given that no
helper authors have brought this up as an issue (I only noticed it
because I was running t5801 in verbose mode for other reasons and
wondered about the weird shell error). That, coupled with the relative
new-ness of sha256, makes me think nobody has really worked on helpers
for it yet, which gives us an opportunity to correct the code before too
much time passes.
And doing so has some value: it brings "object-format" in line with the
syntax of other options, making the protocol more consistent. It also
lets us use set_helper_option(), which has better error reporting.
Note that we don't really need to allow any other values like "false"
here. The point is for Git to tell the helper that it understands
":object-format" lines coming back as part of the ref listing. There's
no point in future versions saying "no, I don't understand that".
To make sure everything works as expected, we can improve the
remote-testgit helper from t5801 to send the ":object-format" line only
if the other side correctly asked for it (which modern Git will always
do). With that test change and without the matching code fix here, t5801
will fail when run with GIT_TEST_DEFAULT_HASH=sha256.
Signed-off-by: Jeff King <peff@peff.net>
---
t/t5801/git-remote-testgit | 4 +++-
transport-helper.c | 7 ++-----
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/t/t5801/git-remote-testgit b/t/t5801/git-remote-testgit
index bcfb358c51..c5b10f5775 100755
--- a/t/t5801/git-remote-testgit
+++ b/t/t5801/git-remote-testgit
@@ -30,6 +30,7 @@ GIT_DIR="$url/.git"
export GIT_DIR
force=
+object_format=
mkdir -p "$dir"
@@ -61,7 +62,8 @@ do
echo
;;
list)
- echo ":object-format $(git rev-parse --show-object-format=storage)"
+ test -n "$object_format" &&
+ echo ":object-format $(git rev-parse --show-object-format=storage)"
git for-each-ref --format='? %(refname)' 'refs/heads/' 'refs/tags/'
head=$(git symbolic-ref HEAD)
echo "@$head HEAD"
diff --git a/transport-helper.c b/transport-helper.c
index 7f6bbd06bb..8d284b24d5 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1210,11 +1210,8 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
data->get_refs_list_called = 1;
helper = get_helper(transport);
- if (data->object_format) {
- write_constant(helper->in, "option object-format\n");
- if (recvline(data, &buf) || strcmp(buf.buf, "ok"))
- exit(128);
- }
+ if (data->object_format)
+ set_helper_option(transport, "object-format", "true");
if (data->push && for_push)
write_constant(helper->in, "list for-push\n");
--
2.44.0.650.g4615f65fe0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 0/3] some transport-helper "option object-format" confusion
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
` (2 preceding siblings ...)
2024-03-20 9:41 ` [PATCH 3/3] transport-helper: send "true" value for object-format option Jeff King
@ 2024-03-20 17:05 ` Eric W. Biederman
2024-03-27 9:48 ` Jeff King
3 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2024-03-20 17:05 UTC (permalink / raw)
To: Jeff King; +Cc: git, brian m. carlson
Jeff King <peff@peff.net> writes:
> On Thu, Mar 07, 2024 at 03:47:35AM -0500, Jeff King wrote:
>
>> I happened to be looking at the output of t5801 for an unrelated
>> problem, and I noticed our git-remote-testgit spewing a bunch of shell
>> errors. It turns out that its expectations do not quite match what the
>> transport-helper code produces.
>>
>> This series brings the test and documentation in line with how the
>> transport-helper code behaves. But I'm not sure if we should be going
>> the other way (see the comments on patch 2 especially), and bringing the
>> transport-helper code in line with the others. Hence the RFC.
>>
>> [1/2]: t5801: fix object-format handling in git-remote-testgit
>> [2/2]: doc/gitremote-helpers: match object-format option docs to code
>
> Here's a non-RFC v2 based on the discussion thus far (thanks brian and
> Eric).
>
> The big change is that instead of changing the docs to match true-less
> "option object-format", the code is changed to match the docs. That
> happens in patch 3 (which subsumes the original patch 1). We continue to
> drop the documentation for the "option object-format sha256" form. But
> now the commit message justifies it better, and we clean up the stale
> code in remote-curl.c.
>
> Patch 1 is a small fix for debugging output that I noticed after getting
> confused. :-/ It's not strictly related and could be taken separately.
>
> Eric mentioned having Git check that the helpers never say
> ":object-format" unless it was negotiated. I stopped short of that. One,
> it's a bit tricky to test (since Git will always ask for object-format,
> you have to teach remote-testgit to optionally send broken output). And
> two, I'm not sure that being strict has much value here. It keeps remote
> helpers honest, but the real losers are old versions that do not
> understand :object-format, which would fail against such a remote. So I
> dunno. It isn't any harder to do it on top later if we want to.
Your sentence has what I was asking for backwards. It would be healthy
if the code fails when "object-format" has been advertised by the
remote, requested by the transport-helper, and the remote does not send
":object-format".
The check is cheap and should prevent buggy remotes from appearing in
the wild. I am probably biased but I rather want the information
on what hash algorithm the remote is using when I ask for it.
I can totally imagine someone during development forgetting to send
:object-format and either not noticing something was wrong, or spending
a fair amount of time debugging that they forgot to send it.
It is the kind of bug I can imagine someone making when they are called
away from the keyboard at the wrong moment.
The implementation should just be:
diff --git a/transport-helper.c b/transport-helper.c
index b660b7942f9f..e648f136287d 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1206,6 +1206,7 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
struct ref **tail = &ret;
struct ref *posn;
struct strbuf buf = STRBUF_INIT;
+ bool received_object_format = false;
data->get_refs_list_called = 1;
helper = get_helper(transport);
@@ -1236,9 +1236,13 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
die(_("unsupported object format '%s'"),
value);
transport->hash_algo = &hash_algos[algo];
+ received_object_format = true;
}
continue;
}
+ else if (data->object_format && !received_object_format) {
+ die(_("missing :object-format"));
+ }
eov = strchr(buf.buf, ' ');
if (!eov)
Am I missing something that makes a bad implementation?
Hmm. I thought gitremote-helpers.txt said the key value pairs
would precede everything else from a list command.
gitremote-helpers.txt does not mention that. That looks like
a Documentation oversight.
However remote-curl.c in output_refs prints :object-format before
anything else, and transport-helper.c will malfunction if :object-format
is sent after any of the refs. As transport->hash_algop is used by
get_oid_hex_algop is used to parse the oids of the refs.
We can probably fix the Documentation like:
diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index ed8da428c98b..b6ca29a245f3 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -268,6 +268,8 @@ Support for this command is mandatory.
ref. A space-separated list of attributes follows the name;
unrecognized attributes are ignored. The list ends with a
blank line.
+
+ Keywords should precede everything else in the list.
+
See REF LIST ATTRIBUTES for a list of currently defined attributes.
See REF LIST KEYWORDS for a list of currently defined keywords.
I do agree that the sanity check can be added to your series, so if you
would prefer I can do that.
Eric
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 3/3] transport-helper: send "true" value for object-format option
2024-03-20 9:41 ` [PATCH 3/3] transport-helper: send "true" value for object-format option Jeff King
@ 2024-03-20 17:23 ` Junio C Hamano
0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2024-03-20 17:23 UTC (permalink / raw)
To: Jeff King; +Cc: git, brian m. carlson, Eric W. Biederman
Jeff King <peff@peff.net> writes:
> - remote-curl, which quietly converts missing values into "true". This
> goes all the way back to ef08ef9ea0 (remote-helpers: Support custom
> transport options, 2009-10-30), despite the fact that I don't think
> any other option has ever made use of it.
Interesting.
> When the documentation and code do not match, it is often good to fix
> the documentation rather than break compatibility. And in this case, we
> have had the mis-match since 8b85ee4f47 (transport-helper: implement
> object-format extensions, 2020-05-25). However, the sha256 feature was
> listed as experimental until 8e42eb0e9a (doc: sha256 is no longer
> experimental, 2023-07-31).
> ...
> And doing so has some value: it brings "object-format" in line with the
> syntax of other options, making the protocol more consistent. It also
> lets us use set_helper_option(), which has better error reporting.
I suspect that this may have been an attempt to mimick the
value-less true in the configuration syntax, but I agree with the
conclusion of this patch. Boolean "true" in the context of the
transport options may be fairly common, but unlike configuration
files, it is not something we have users write manually, and there
is not much point giving a special short form.
Thanks for a pleasant read. Will queue.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/3] some transport-helper "option object-format" confusion
2024-03-20 17:05 ` [PATCH 0/3] some transport-helper "option object-format" confusion Eric W. Biederman
@ 2024-03-27 9:48 ` Jeff King
0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2024-03-27 9:48 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: git, brian m. carlson
On Wed, Mar 20, 2024 at 12:05:49PM -0500, Eric W. Biederman wrote:
> Your sentence has what I was asking for backwards. It would be healthy
> if the code fails when "object-format" has been advertised by the
> remote, requested by the transport-helper, and the remote does not send
> ":object-format".
Ah, I see. That is probably reasonable, under the assumption that nobody
would have implemented "object-format" so far and _not_ sent it. It
might be worth clarifying the documentation at the same time.
> The implementation should just be:
>
> diff --git a/transport-helper.c b/transport-helper.c
> index b660b7942f9f..e648f136287d 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -1206,6 +1206,7 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
> struct ref **tail = &ret;
> struct ref *posn;
> struct strbuf buf = STRBUF_INIT;
> + bool received_object_format = false;
>
> data->get_refs_list_called = 1;
> helper = get_helper(transport);
> @@ -1236,9 +1236,13 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
> die(_("unsupported object format '%s'"),
> value);
> transport->hash_algo = &hash_algos[algo];
> + received_object_format = true;
> }
> continue;
> }
> + else if (data->object_format && !received_object_format) {
> + die(_("missing :object-format"));
> + }
>
> eov = strchr(buf.buf, ' ');
> if (!eov)
>
> Am I missing something that makes a bad implementation?
No, that seems right to me (modulo that we do not use C99 "bool" in our
code base).
> Hmm. I thought gitremote-helpers.txt said the key value pairs
> would precede everything else from a list command.
> gitremote-helpers.txt does not mention that. That looks like
> a Documentation oversight.
>
> However remote-curl.c in output_refs prints :object-format before
> anything else, and transport-helper.c will malfunction if :object-format
> is sent after any of the refs. As transport->hash_algop is used by
> get_oid_hex_algop is used to parse the oids of the refs.
Yeah, I think it is a natural consequence of "object-format", since it
is necessary for parsing the result. And since there aren't any other
keywords yet, we can surmise that nobody is doing the wrong thing yet.
So now is a good time to clarify the documentation.
I'm also not sure if we ever say explicitly in the documentation that
the keywords start with a colon. But maybe I am just missing it.
> diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
> index ed8da428c98b..b6ca29a245f3 100644
> --- a/Documentation/gitremote-helpers.txt
> +++ b/Documentation/gitremote-helpers.txt
> @@ -268,6 +268,8 @@ Support for this command is mandatory.
> ref. A space-separated list of attributes follows the name;
> unrecognized attributes are ignored. The list ends with a
> blank line.
> +
> + Keywords should precede everything else in the list.
> +
> See REF LIST ATTRIBUTES for a list of currently defined attributes.
> See REF LIST KEYWORDS for a list of currently defined keywords.
>
> I do agree that the sanity check can be added to your series, so if you
> would prefer I can do that.
Yeah, do you want to send some patches that can go on top of mine?
-Peff
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2024-03-27 9:48 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-07 8:47 [RFC/PATCH 0/2] some transport-helper "option object-format" confusion Jeff King
2024-03-07 8:51 ` [PATCH 1/2] t5801: fix object-format handling in git-remote-testgit Jeff King
2024-03-07 8:56 ` [PATCH 2/2] doc/gitremote-helpers: match object-format option docs to code Jeff King
2024-03-07 22:20 ` brian m. carlson
2024-03-12 7:45 ` Jeff King
2024-03-13 21:11 ` brian m. carlson
2024-03-14 12:47 ` Eric W. Biederman
2024-03-14 21:21 ` brian m. carlson
2024-03-15 15:41 ` Eric W. Biederman
2024-03-16 6:04 ` Jeff King
2024-03-17 20:47 ` Eric W. Biederman
2024-03-18 8:49 ` Jeff King
2024-03-14 15:33 ` Junio C Hamano
2024-03-14 21:54 ` brian m. carlson
2024-03-20 9:32 ` [PATCH 0/3] some transport-helper "option object-format" confusion Jeff King
2024-03-20 9:34 ` [PATCH 1/3] transport-helper: use write helpers more consistently Jeff King
2024-03-20 9:37 ` [PATCH 2/3] transport-helper: drop "object-format <algo>" option Jeff King
2024-03-20 9:41 ` [PATCH 3/3] transport-helper: send "true" value for object-format option Jeff King
2024-03-20 17:23 ` Junio C Hamano
2024-03-20 17:05 ` [PATCH 0/3] some transport-helper "option object-format" confusion Eric W. Biederman
2024-03-27 9:48 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).