Git development


* Re: Concurrent fetch commands
From: Junio C Hamano @ 2024-01-03 22:10 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Patrick Steinhardt, Oswald Buddenhagen, Stefan Haller, git
In-Reply-To: <ZZWOBObBmLW9Nid6@nand.local>

Taylor Blau <me@ttaylorr.com> writes:

> ... I suppose the answer is that they expect
> concurrent fetches to be tolerated, but that the contents of FETCH_HEAD
> (and of course the remote references) are consistent at the end of all
> of the fetches.

What does it mean to be "consistent" in this case, though?  For the
controlled form of multiple fetches performed by "git fetch --all",
the answer is probably "as if we fetched sequentially from these
remotes, one by one, and concatenated what these individual fetch
invocations left in FETCH_HEAD".  But for the uncontrolled background
fetches that IDEs and others perform behind the user's back, it is
unclear what it means; for that matter, it is doubtful that there is
a reasonable definition of the word.

Folks who invented "git maintenance" designed their "prefetch" task
to follow the best practice: it avoids interfering with any foreground
fetches by not touching FETCH_HEAD and the remote-tracking branches.

Nobody has brought up the latter so far in this discussion thread, but
mucking with the remote-tracking branches behind the user's back means
completely breaking the end-user expectation that --force-with-lease
would do something useful even when it is not given the commit the
user expects to see at the remote.  Perhaps those third-party tools
that want to run "git fetch" in the background can learn from how
"prefetch" task works to avoid the breakage they are inflicting on
their users?
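
The namespace trick that lets a background fetch stay out of the way can
be sketched as follows. This is a minimal illustration, assuming that the
"prefetch" task works by redirecting every refspec destination from
"refs/..." to "refs/prefetch/...", as the git-maintenance documentation
describes. The helper name `prefetch_destination` is hypothetical and is
not a function in git.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: rewrite a refspec destination such as
 * "refs/remotes/origin/main" to "refs/prefetch/remotes/origin/main".
 * Returns 0 on success, -1 if the destination is not under "refs/"
 * or the result does not fit in `out`.
 */
static int prefetch_destination(const char *dst, char *out, size_t outlen)
{
	int n;

	if (strncmp(dst, "refs/", 5))
		return -1;
	/* Keep everything after "refs/" and graft it under refs/prefetch/. */
	n = snprintf(out, outlen, "refs/prefetch/%s", dst + 5);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

Because the rewritten destination never names a remote-tracking branch,
a fetch done this way leaves refs/remotes/ alone, which is the property
--force-with-lease depends on.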


* Re: [PATCH] Documentation: fix statement about rebase.instructionFormat
From: Junio C Hamano @ 2024-01-03 19:22 UTC (permalink / raw)
  To: Maarten van der Schrieck via GitGitGadget; +Cc: git, Maarten van der Schrieck
In-Reply-To: <pull.1629.git.git.1704305663254.gitgitgadget@gmail.com>

"Maarten van der Schrieck via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Maarten van der Schrieck <maarten@thingsconnected.nl>
>
> Since commit 62db5247790f2612c0b407a15d1901d88789d35a "rebase -i: generate
> the script via rebase--helper" (Jul 14 2017), the short hash is given in
> rebase-todo. Specifying rebase.instructionFormat does not alter this
> behavior, contrary to what the documentation implies.
>
> Signed-off-by: Maarten van der Schrieck <maarten@thingsconnected.nl>
> ---

Looks good.  Will queue.  Thanks.


>     Documentation: fix statement about rebase.instructionFormat
>     
>     Since commit 62db5247790f2612c0b407a15d1901d88789d35a "rebase -i:
>     generate the script via rebase--helper" (Jul 14 2017), the short hash is
>     given in rebase-todo. Specifying rebase.instructionFormat does not alter
>     this behavior, contrary to what the documentation implies.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1629%2Fthingsconnected%2Fpullreq1-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1629/thingsconnected/pullreq1-v1
> Pull-Request: https://github.com/git/git/pull/1629
>
>  Documentation/config/rebase.txt | 2 +-
>  Documentation/git-rebase.txt    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/config/rebase.txt b/Documentation/config/rebase.txt
> index d59576dbb23..c6187ab28b2 100644
> --- a/Documentation/config/rebase.txt
> +++ b/Documentation/config/rebase.txt
> @@ -40,7 +40,7 @@ rebase.missingCommitsCheck::
>  rebase.instructionFormat::
>  	A format string, as specified in linkgit:git-log[1], to be used for the
>  	todo list during an interactive rebase.  The format will
> -	automatically have the long commit hash prepended to the format.
> +	automatically have the commit hash prepended to the format.
>  
>  rebase.abbreviateCommands::
>  	If set to true, `git rebase` will use abbreviated command names in the
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 1dd6555f66b..25516c45d8b 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -523,7 +523,7 @@ See also INCOMPATIBLE OPTIONS below.
>  +
>  The commit list format can be changed by setting the configuration option
>  rebase.instructionFormat.  A customized instruction format will automatically
> -have the long commit hash prepended to the format.
> +have the commit hash prepended to the format.
>  +
>  See also INCOMPATIBLE OPTIONS below.
>  
>
> base-commit: 055bb6e9969085777b7fab83e3fee0017654f134


* Re: [PATCH v2] write-or-die: make GIT_FLUSH a Boolean environment variable
From: Junio C Hamano @ 2024-01-03 19:18 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Taylor Blau, Patrick Steinhardt, Chandra Pratap via GitGitGadget,
	git, Chandra Pratap, Chandra Pratap
In-Reply-To: <20240103184203.GA4334@tb-raspi4>

Torsten Bögershausen <tboegi@web.de> writes:

>> -			cp = getenv("GIT_FLUSH");
>> -			if (cp)
>> -				skip_stdout_flush = (atoi(cp) == 0);
>> -			else if ((fstat(fileno(stdout), &st) == 0) &&
>> -				 S_ISREG(st.st_mode))
>> -				skip_stdout_flush = 1;
>> -			else
>> -				skip_stdout_flush = 0;
>> +			skip_stdout_flush = git_env_bool("GIT_FLUSH", -1);
>> +			if (skip_stdout_flush < 0) {
>> +				struct stat st;
>> +				if (fstat(fileno(f), &st))
>> +					skip_stdout_flush = 0;
>> +				else
>> +					skip_stdout_flush = S_ISREG(st.st_mode);
>> +			}
>>  		}
>>  		if (skip_stdout_flush && !ferror(f))
>>  			return;
>> --- >8 ---
>
> Thanks for a nice reading - I can not imagine a better version.

Yup, the flow of the logic feels very natural with this version by
making it clear that the case that the default "-1" is returned to
us is where we need to figure out an appropriate value ourselves.
An added bonus is that the scope "struct stat" has is limited to the
absolute minimum.  I like it, too.

Thanks.


* Re: [PATCH v2] write-or-die: make GIT_FLUSH a Boolean environment variable
From: Torsten Bögershausen @ 2024-01-03 18:42 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Patrick Steinhardt, Chandra Pratap via GitGitGadget, git,
	Chandra Pratap, Chandra Pratap
In-Reply-To: <ZZWWLkY+ixg+OMM4@nand.local>

On Wed, Jan 03, 2024 at 12:15:26PM -0500, Taylor Blau wrote:
> On Wed, Jan 03, 2024 at 09:22:13AM +0100, Patrick Steinhardt wrote:
> > On Wed, Jan 03, 2024 at 07:58:28AM +0000, Chandra Pratap via GitGitGadget wrote:
> > [snip]
> > > diff --git a/write-or-die.c b/write-or-die.c
> > > index 42a2dc73cd3..a6acabd329f 100644
> > > --- a/write-or-die.c
> > > +++ b/write-or-die.c
> > > @@ -20,15 +20,12 @@ void maybe_flush_or_die(FILE *f, const char *desc)
> > >  {
> > >  	static int skip_stdout_flush = -1;
> > >  	struct stat st;
> > > -	char *cp;
> > >
> > >  	if (f == stdout) {
> > >  		if (skip_stdout_flush < 0) {
> > > -			/* NEEDSWORK: make this a normal Boolean */
> > > -			cp = getenv("GIT_FLUSH");
> > > -			if (cp)
> > > -				skip_stdout_flush = (atoi(cp) == 0);
> > > -			else if ((fstat(fileno(stdout), &st) == 0) &&
> > > +			if (!git_env_bool("GIT_FLUSH", -1))
> > > +				skip_stdout_flush = 1;
> >
> > It's a bit surprising to pass `-1` as default value to `git_env_bool()`
> > here, as this value would hint that the caller wants to explicitly
> > handle the case where the "GIT_FLUSH" envvar is not set at all. We don't
> > though, and essentially fall back to "GIT_FLUSH=1", so passing `1` as
> > the fallback value would be less confusing.
> >
> > Anyway, the resulting behaviour is the same regardless of whether we
> > pass `1` or `-1`, so I'm not sure whether this is worth a reroll.
>
> Hmm. If we pass -1 as the default value in the call to git_env_bool(),
> the only time we'll end up in the else branch is if the environment is
> set to some false-y value.
>
> I don't think that matches the existing behavior, since right now we'll
> infer skip_stdout_flush based on whether or not stdout is a regular file
> or something else.
>
> I think you'd probably want something closer to:
>
> --- 8< ---
> diff --git a/write-or-die.c b/write-or-die.c
> index 42a2dc73cd..f12e111688 100644
> --- a/write-or-die.c
> +++ b/write-or-die.c
> @@ -19,20 +19,17 @@
>  void maybe_flush_or_die(FILE *f, const char *desc)
>  {
>  	static int skip_stdout_flush = -1;
> -	struct stat st;
> -	char *cp;
>
>  	if (f == stdout) {
>  		if (skip_stdout_flush < 0) {
> -			/* NEEDSWORK: make this a normal Boolean */
> -			cp = getenv("GIT_FLUSH");
> -			if (cp)
> -				skip_stdout_flush = (atoi(cp) == 0);
> -			else if ((fstat(fileno(stdout), &st) == 0) &&
> -				 S_ISREG(st.st_mode))
> -				skip_stdout_flush = 1;
> -			else
> -				skip_stdout_flush = 0;
> +			skip_stdout_flush = git_env_bool("GIT_FLUSH", -1);
> +			if (skip_stdout_flush < 0) {
> +				struct stat st;
> +				if (fstat(fileno(f), &st))
> +					skip_stdout_flush = 0;
> +				else
> +					skip_stdout_flush = S_ISREG(st.st_mode);
> +			}
>  		}
>  		if (skip_stdout_flush && !ferror(f))
>  			return;
> --- >8 ---

Thanks for a nice reading - I can not imagine a better version.


* [PATCH] Documentation: fix statement about rebase.instructionFormat
From: Maarten van der Schrieck via GitGitGadget @ 2024-01-03 18:14 UTC (permalink / raw)
  To: git; +Cc: Maarten van der Schrieck, Maarten van der Schrieck

From: Maarten van der Schrieck <maarten@thingsconnected.nl>

Since commit 62db5247790f2612c0b407a15d1901d88789d35a "rebase -i: generate
the script via rebase--helper" (Jul 14 2017), the short hash is given in
rebase-todo. Specifying rebase.instructionFormat does not alter this
behavior, contrary to what the documentation implies.

Signed-off-by: Maarten van der Schrieck <maarten@thingsconnected.nl>
---
    Documentation: fix statement about rebase.instructionFormat
    
    Since commit 62db5247790f2612c0b407a15d1901d88789d35a "rebase -i:
    generate the script via rebase--helper" (Jul 14 2017), the short hash is
    given in rebase-todo. Specifying rebase.instructionFormat does not alter
    this behavior, contrary to what the documentation implies.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1629%2Fthingsconnected%2Fpullreq1-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1629/thingsconnected/pullreq1-v1
Pull-Request: https://github.com/git/git/pull/1629

 Documentation/config/rebase.txt | 2 +-
 Documentation/git-rebase.txt    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/rebase.txt b/Documentation/config/rebase.txt
index d59576dbb23..c6187ab28b2 100644
--- a/Documentation/config/rebase.txt
+++ b/Documentation/config/rebase.txt
@@ -40,7 +40,7 @@ rebase.missingCommitsCheck::
 rebase.instructionFormat::
 	A format string, as specified in linkgit:git-log[1], to be used for the
 	todo list during an interactive rebase.  The format will
-	automatically have the long commit hash prepended to the format.
+	automatically have the commit hash prepended to the format.
 
 rebase.abbreviateCommands::
 	If set to true, `git rebase` will use abbreviated command names in the
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 1dd6555f66b..25516c45d8b 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -523,7 +523,7 @@ See also INCOMPATIBLE OPTIONS below.
 +
 The commit list format can be changed by setting the configuration option
 rebase.instructionFormat.  A customized instruction format will automatically
-have the long commit hash prepended to the format.
+have the commit hash prepended to the format.
 +
 See also INCOMPATIBLE OPTIONS below.
 

base-commit: 055bb6e9969085777b7fab83e3fee0017654f134
-- 
gitgitgadget


* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: Junio C Hamano @ 2024-01-03 18:08 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, git
In-Reply-To: <ZZWOtnP2IHNldcy6@nand.local>

Taylor Blau <me@ttaylorr.com> writes:

>> * tb/path-filter-fix (2023-10-18) 17 commits
>>  - bloom: introduce `deinit_bloom_filters()`
>>  ...
>>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
>>
>>  The Bloom filter used for path limited history traversal was broken
>>  on systems whose "char" is unsigned; update the implementation and
>>  bump the format version to 2.
>>
>>  Expecting a reroll.
>>  cf. <20231023202212.GA5470@szeder.dev>
>>  source: <cover.1697653929.git.me@ttaylorr.com>
>
> I was confused by this one, since I couldn't figure out which tests
> Gábor was referring to here. I responded in [1], but haven't heard back
> since the end of October.
> ...
> [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/

OK, let's ping just once then.

Thanks.


* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Patrick Steinhardt @ 2024-01-03 18:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <xmqqedey9u32.fsf@gitster.g>

On Wed, Jan 03, 2024 at 09:59:13AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> I think we should tighten things up over time.  First by teaching
> >> the ref backend that anything that is not a pseudoref, HEAD or a
> >> proper ref (one item of whose definition is "lives under refs/
> >> hierarchy") should not resolve_ref() successfully.  That should
> >> correctly fail things like
> >> 
> >>     $ git rev-parse worktrees/$name/bisect/bad
> >>     $ git update-ref foo/bar HEAD
> > ...
> > Yeah, agreed, that's something we should do. I do wonder whether this
> > will break existing usecases, but in any case I'd rather consider it an
> > accident that it is possible to write (and read) such refs in the first
> > place.
> 
> Unfortunately, the worktrees/$name/refs/bisect/bad and its friends
> are documented in "git worktree" and the refs.c layer is aware of
> the "main-worktree/" and "worktrees/" hierarchy, so while I still
> think it is a good long-term direction to make it impossible to
> create random refs like "foo/bar" and "resf/heads/master" via the
> commands like "git update-ref", we cannot limit ourselves only to
> "refs/" hierarchy.

Ah, I first wanted to point this out, but then noticed that you didn't
include the "refs/" prefix in "worktrees/$name/bisect/bad" and thought
this was intentional. But yes, per-worktree refs need to stay supported,
weird as they may be.

Patrick


* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Junio C Hamano @ 2024-01-03 17:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <ZZWbMekL2URby0qV@tanuki>

Patrick Steinhardt <ps@pks.im> writes:

>> I think we should tighten things up over time.  First by teaching
>> the ref backend that anything that is not a pseudoref, HEAD or a
>> proper ref (one item of whose definition is "lives under refs/
>> hierarchy") should not resolve_ref() successfully.  That should
>> correctly fail things like
>> 
>>     $ git rev-parse worktrees/$name/bisect/bad
>>     $ git update-ref foo/bar HEAD
> ...
> Yeah, agreed, that's something we should do. I do wonder whether this
> will break existing usecases, but in any case I'd rather consider it an
> accident that it is possible to write (and read) such refs in the first
> place.

Unfortunately, the worktrees/$name/refs/bisect/bad and its friends
are documented in "git worktree" and the refs.c layer is aware of
the "main-worktree/" and "worktrees/" hierarchy, so while I still
think it is a good long-term direction to make it impossible to
create random refs like "foo/bar" and "resf/heads/master" via the
commands like "git update-ref", we cannot limit ourselves only to
"refs/" hierarchy.
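
The constraint described above (reject names like "foo/bar", but keep
per-worktree refs working) can be sketched as a name check. This is only
an illustration under stated assumptions: all three function names are
hypothetical, the "upper-case letters, '-' and '_'" rule for root-level
refs such as HEAD is an approximation, and none of this is git's actual
policy or code.

```c
#include <ctype.h>
#include <string.h>

/* Assumed rule: root-level refs look like "HEAD" or "ORIG_HEAD". */
static int looks_like_root_ref(const char *name)
{
	const char *c;

	if (!*name)
		return 0;
	for (c = name; *c; c++)
		if (!isupper((unsigned char)*c) && *c != '-' && *c != '_')
			return 0;
	return 1;
}

/* Strip a leading "main-worktree/" or "worktrees/<name>/" if present. */
static const char *skip_worktree_prefix(const char *name)
{
	const char *slash;

	if (!strncmp(name, "main-worktree/", 14))
		return name + 14;
	if (!strncmp(name, "worktrees/", 10)) {
		slash = strchr(name + 10, '/');
		if (slash && slash != name + 10)
			return slash + 1;
	}
	return name;
}

/* Return 1 for names a tightened ref backend might still accept. */
static int is_acceptable_refname(const char *name)
{
	const char *rest = skip_worktree_prefix(name);

	return !strncmp(rest, "refs/", 5) || looks_like_root_ref(rest);
}
```

With these rules "worktrees/$name/refs/bisect/bad" passes, while
"worktrees/$name/bisect/bad", "foo/bar" and "resf/heads/master" do not.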


* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Patrick Steinhardt @ 2024-01-03 17:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <xmqqil4a9vue.fsf@gitster.g>

On Wed, Jan 03, 2024 at 09:21:13AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > ... But the problem is that tools like git-update-ref(1) don't
> > enforce this, so something like `git update-ref foo/bar HEAD` happily
> > creates "$GIT_DIR/foo/bar". And I bet there are other ways to write refs
> > at arbitrary paths.
> 
> I think we should tighten things up over time.  First by teaching
> the ref backend that anything that is not a pseudoref, HEAD or a
> proper ref (one item of whose definition is "lives under refs/
> hierarchy") should not resolve_ref() successfully.  That should
> correctly fail things like
> 
>     $ git rev-parse worktrees/$name/bisect/bad
>     $ git update-ref foo/bar HEAD
> 
> I'd hope.
> 
> Thanks.

Yeah, agreed, that's something we should do. I do wonder whether this
will break existing usecases, but in any case I'd rather consider it an
accident that it is possible to write (and read) such refs in the first
place.

Patrick


* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Junio C Hamano @ 2024-01-03 17:21 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <ZZWIlx-9D2r9AfDW@tanuki>

Patrick Steinhardt <ps@pks.im> writes:

> ... But the problem is that tools like git-update-ref(1) don't
> enforce this, so something like `git update-ref foo/bar HEAD` happily
> creates "$GIT_DIR/foo/bar". And I bet there are other ways to write refs
> at arbitrary paths.

I think we should tighten things up over time.  First by teaching
the ref backend that anything that is not a pseudoref, HEAD or a
proper ref (one item of whose definition is "lives under refs/
hierarchy") should not resolve_ref() successfully.  That should
correctly fail things like

    $ git rev-parse worktrees/$name/bisect/bad
    $ git update-ref foo/bar HEAD

I'd hope.

Thanks.


* Re: [PATCH v2 0/2] doc: bisect: change plural paths to singular pathspec
From: Taylor Blau @ 2024-01-03 17:17 UTC (permalink / raw)
  To: Britton Leo Kerin; +Cc: git
In-Reply-To: <6bcbd017-968e-4ac8-a56b-164b163c76d4@smtp-relay.sendinblue.com>

On Tue, Jan 02, 2024 at 07:02:05PM -0900, Britton Leo Kerin wrote:
> Britton Leo Kerin (2):
>   doc: use singular form of repeatable path arg
>   doc: refer to pathspec instead of path
>
>  Documentation/git-bisect.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Range-diff against v1:
> 1:  90c081dcab ! 1:  da40e4736b doc: use singular form of repeatable path arg
>     @@ Commit message
>          later document text mentions 'path' arguments, while it doesn't mention
>          'paths'.
>
>     -    Signed-off-by: Britton L Kerin <britton.kergin@gmail.com>
>     +    Signed-off-by: Britton Leo Kerin <britton.kergin@gmail.com>
>
>       ## Documentation/git-bisect.txt ##
>      @@ Documentation/git-bisect.txt: The command takes various subcommands, and different options depending
> -:  ---------- > 2:  d932b6d501 doc: refer to pathspec instead of path
> --
> 2.43.0

Hmm. The end-state of these two patches looks good to me, but I probably
would have written this change as a single change from "paths" ->
"pathspec", not "paths" -> "path" -> "pathspec".

Thanks,
Taylor


* Re: [PATCH v2] write-or-die: make GIT_FLUSH a Boolean environment variable
From: Taylor Blau @ 2024-01-03 17:15 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Chandra Pratap via GitGitGadget, git, Chandra Pratap,
	Chandra Pratap
In-Reply-To: <ZZUZNQqDTx3bnveJ@tanuki>

On Wed, Jan 03, 2024 at 09:22:13AM +0100, Patrick Steinhardt wrote:
> On Wed, Jan 03, 2024 at 07:58:28AM +0000, Chandra Pratap via GitGitGadget wrote:
> [snip]
> > diff --git a/write-or-die.c b/write-or-die.c
> > index 42a2dc73cd3..a6acabd329f 100644
> > --- a/write-or-die.c
> > +++ b/write-or-die.c
> > @@ -20,15 +20,12 @@ void maybe_flush_or_die(FILE *f, const char *desc)
> >  {
> >  	static int skip_stdout_flush = -1;
> >  	struct stat st;
> > -	char *cp;
> >
> >  	if (f == stdout) {
> >  		if (skip_stdout_flush < 0) {
> > -			/* NEEDSWORK: make this a normal Boolean */
> > -			cp = getenv("GIT_FLUSH");
> > -			if (cp)
> > -				skip_stdout_flush = (atoi(cp) == 0);
> > -			else if ((fstat(fileno(stdout), &st) == 0) &&
> > +			if (!git_env_bool("GIT_FLUSH", -1))
> > +				skip_stdout_flush = 1;
>
> It's a bit surprising to pass `-1` as default value to `git_env_bool()`
> here, as this value would hint that the caller wants to explicitly
> handle the case where the "GIT_FLUSH" envvar is not set at all. We don't
> though, and essentially fall back to "GIT_FLUSH=1", so passing `1` as
> the fallback value would be less confusing.
>
> Anyway, the resulting behaviour is the same regardless of whether we
> pass `1` or `-1`, so I'm not sure whether this is worth a reroll.

Hmm. If we pass -1 as the default value in the call to git_env_bool(),
the only time we'll end up in the else branch is if the environment is
set to some false-y value.

I don't think that matches the existing behavior, since right now we'll
infer skip_stdout_flush based on whether or not stdout is a regular file
or something else.

I think you'd probably want something closer to:

--- 8< ---
diff --git a/write-or-die.c b/write-or-die.c
index 42a2dc73cd..f12e111688 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -19,20 +19,17 @@
 void maybe_flush_or_die(FILE *f, const char *desc)
 {
 	static int skip_stdout_flush = -1;
-	struct stat st;
-	char *cp;

 	if (f == stdout) {
 		if (skip_stdout_flush < 0) {
-			/* NEEDSWORK: make this a normal Boolean */
-			cp = getenv("GIT_FLUSH");
-			if (cp)
-				skip_stdout_flush = (atoi(cp) == 0);
-			else if ((fstat(fileno(stdout), &st) == 0) &&
-				 S_ISREG(st.st_mode))
-				skip_stdout_flush = 1;
-			else
-				skip_stdout_flush = 0;
+			skip_stdout_flush = git_env_bool("GIT_FLUSH", -1);
+			if (skip_stdout_flush < 0) {
+				struct stat st;
+				if (fstat(fileno(f), &st))
+					skip_stdout_flush = 0;
+				else
+					skip_stdout_flush = S_ISREG(st.st_mode);
+			}
 		}
 		if (skip_stdout_flush && !ferror(f))
 			return;
--- >8 ---

You could lose one level of indentation, but it costs an extra fstat()
call in the case where GIT_FLUSH is set to some explicit value. Doing
that would look like this ugly thing instead ;-):

--- 8< ---
diff --git a/write-or-die.c b/write-or-die.c
index 42a2dc73cd..b3275d7577 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -19,20 +19,11 @@
 void maybe_flush_or_die(FILE *f, const char *desc)
 {
 	static int skip_stdout_flush = -1;
-	struct stat st;
-	char *cp;

 	if (f == stdout) {
 		if (skip_stdout_flush < 0) {
-			/* NEEDSWORK: make this a normal Boolean */
-			cp = getenv("GIT_FLUSH");
-			if (cp)
-				skip_stdout_flush = (atoi(cp) == 0);
-			else if ((fstat(fileno(stdout), &st) == 0) &&
-				 S_ISREG(st.st_mode))
-				skip_stdout_flush = 1;
-			else
-				skip_stdout_flush = 0;
+			struct stat st;
+			skip_stdout_flush = git_env_bool("GIT_FLUSH", !fstat(fileno(f), &st) && S_ISREG(st.st_mode));
 		}
 		if (skip_stdout_flush && !ferror(f))
 			return;
--- >8 ---

Thanks,
Taylor


* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: René Scharfe @ 2024-01-03 17:14 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: git
In-Reply-To: <20240103090152.GB1866508@coredump.intra.peff.net>

Am 03.01.24 um 10:01 schrieb Jeff King:
> On Tue, Jan 02, 2024 at 05:02:35PM -0800, Junio C Hamano wrote:
>
>> * jk/t1006-cat-file-objectsize-disk (2023-12-21) 1 commit
>>   (merged to 'next' on 2023-12-28 at d82812e636)
>>  + t1006: add tests for %(objectsize:disk)
>>
>>  Test update.
>>
>>  Will merge to 'master'.
>>  source: <20231221094722.GA570888@coredump.intra.peff.net>
>
> It looks like this is the original version. I posted a v2 that took
> René's suggestion to swap out the awk for shell, but it got overlooked.
> I'm happy enough either way, but if we want to salvage that effort,
> here's a patch which could go on top:
>
> -- >8 --
> From: René Scharfe <l.s.r@web.de>
> Subject: [PATCH] t1006: prefer shell loop to awk for packed object sizes
>
> To compute the expected on-disk size of packed objects, we sort the
> output of show-index by pack offset and then compute the difference
> between adjacent entries using awk. This works but has a few readability
> problems:
>
>   1. Reading the index in pack order means we don't find out the size of an
>      oid's entry until we see the _next_ entry. So we have to save it to
>      print later.
>
>      We can instead iterate in reverse order, so we compute each oid's
>      size as we see it.
>
>   2. Since the awk invocation is inside a test_expect_success block, we can't
>      easily use single-quotes to hold the script. So we use
>      double-quotes, but then have to escape the dollar signs in the awk
>      script.
>
>      We can swap this out for a shell loop instead (which is made much
>      easier by the first change).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I gave René authorship since this was his cleverness, but obviously I
> wrote the commit message. Giving an explicit signoff would be nice,
> though.

Alright, thank you!

Signed-off-by: René Scharfe <l.s.r@web.de>

>
>  t/t1006-cat-file.sh | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index 0c2eafae65..5ea3326128 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -1117,14 +1117,16 @@ test_expect_success 'cat-file %(objectsize:disk) with --batch-all-objects' '
>  		while read idx
>  		do
>  			git show-index <"$idx" >idx.raw &&
> -			sort -n <idx.raw >idx.sorted &&
> +			sort -nr <idx.raw >idx.sorted &&
>  			packsz=$(test_file_size "${idx%.idx}.pack") &&
>  			end=$((packsz - rawsz)) &&
> -			awk -v end="$end" "
> -			  NR > 1 { print oid, \$1 - start }
> -			  { start = \$1; oid = \$2 }
> -			  END { print oid, end - start }
> -			" idx.sorted ||
> +			while read start oid rest
> +			do
> +				size=$((end - start)) &&
> +				end=$start &&
> +				echo "$oid $size" ||
> +				return 1
> +			done <idx.sorted ||
>  			return 1
>  		done
>  	} >expect.raw &&
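
The reverse-order size computation in the shell loop above can be
restated in C to make the arithmetic explicit. This is a sketch, not
code from the test suite: `sizes_from_offsets` is a hypothetical name,
and it assumes the offsets are already sorted in descending order (as
`sort -nr` leaves them) and that `end` is the offset just past the last
object, which the test derives as `packsz - rawsz`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * offsets[] holds pack offsets in descending order; sizes[i] receives
 * the on-disk size of the object that starts at offsets[i].
 */
static void sizes_from_offsets(const uint64_t *offsets, size_t nr,
			       uint64_t end, uint64_t *sizes)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		/* Distance from this object's start to the next boundary. */
		sizes[i] = end - offsets[i];
		/* This start becomes the boundary for the preceding object. */
		end = offsets[i];
	}
}
```

Walking from the highest offset down means each size is known as soon as
its entry is read, so nothing has to be carried over to the next
iteration except the boundary itself.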


* Re: [PATCH v2] write-or-die: make GIT_FLUSH a Boolean environment variable
From: Junio C Hamano @ 2024-01-03 17:13 UTC (permalink / raw)
  To: Chandra Pratap via GitGitGadget; +Cc: git, Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1628.v2.git.1704268708720.gitgitgadget@gmail.com>

"Chandra Pratap via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  Documentation/git.txt | 16 +++++++---------
>  write-or-die.c        |  9 +++------
>  2 files changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index 2535a30194f..83fd60f2d11 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -724,16 +724,14 @@ for further details.
>  	waiting for someone with sufficient permissions to fix it.
>  
>  `GIT_FLUSH`::
> -// NEEDSWORK: make it into a usual Boolean environment variable
> -	If this environment variable is set to "1", then commands such
> +	If this Boolean environment variable is set to true, then commands such
>  	as 'git blame' (in incremental mode), 'git rev-list', 'git log',
> -	'git check-attr' and 'git check-ignore' will
> -	force a flush of the output stream after each record have been
> -	flushed. If this
> -	variable is set to "0", the output of these commands will be done
> -	using completely buffered I/O.   If this environment variable is
> -	not set, Git will choose buffered or record-oriented flushing
> -	based on whether stdout appears to be redirected to a file or not.
> +	'git check-attr' and 'git check-ignore' will force a flush of the output
> +	stream after each record have been flushed. If this variable is set to
> +	false, the output of these commands will be done using completely buffered
> +	I/O. If this environment variable is not set, Git will choose buffered or
> +	record-oriented flushing based on whether stdout appears to be redirected
> +	to a file or not.

It is somewhat irritating to see that we need to change this many
lines to just change "0" to "false" and "1" to "true".  I wonder if
it becomes easier to grok if we changed the description into a sub
enumeration of three possibilities, but that would be outside the
scope of this change [*].

> diff --git a/write-or-die.c b/write-or-die.c
> index 42a2dc73cd3..a6acabd329f 100644
> --- a/write-or-die.c
> +++ b/write-or-die.c
> @@ -20,15 +20,12 @@ void maybe_flush_or_die(FILE *f, const char *desc)
>  {
>  	static int skip_stdout_flush = -1;
>  	struct stat st;
> -	char *cp;
>  
>  	if (f == stdout) {
>  		if (skip_stdout_flush < 0) {
> -			/* NEEDSWORK: make this a normal Boolean */
> -			cp = getenv("GIT_FLUSH");
> -			if (cp)
> -				skip_stdout_flush = (atoi(cp) == 0);
> -			else if ((fstat(fileno(stdout), &st) == 0) &&
> +			if (!git_env_bool("GIT_FLUSH", -1))
> +				skip_stdout_flush = 1;
> +			else if (!fstat(fileno(stdout), &st) &&
>  				 S_ISREG(st.st_mode))
>  				skip_stdout_flush = 1;
>  			else

The above logic does not look correct to me, primarily because the
return value of git_env_bool() is inspected only once to see if it
is zero, and does not differentiate the "unset" case from other
cases.

Since git_env_bool(k, def) returns

    - "def" (-1 in this case) when k is not exported (in which case
      you need to do the "fstat" dance).

    - 0 when k is exported and has a string that is "false" (in
      which case you would want to set skip_stdout_flush to true).

    - 1 when k is exported and has a string that is "true" (in which
      case you would want to set skip_stdout_flush to false).

    - or dies if the string exported in k is bogus.

shouldn't it be more like

                        skip_stdout_flush = 0; /* assume flushing */
                        switch (git_env_bool("GIT_FLUSH", -1)) {
                        case 0: /* told not to flush */
                                skip_stdout_flush = 1;
                                break;
                        case -1: /* unspecified */
                                if (!fstat(...) && S_ISREG())
                                        skip_stdout_flush = 1;
                                break;
                        default: /* told to flush */
                                break;
                        }

perhaps?
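
The three-way decision described above can be written as a small,
self-contained sketch. Everything here is an assumption made for
illustration: `parse_env_bool` is a simplified stand-in for
git_env_bool() that recognizes only a few spellings and returns -2 for a
bogus value instead of dying, and `decide_skip_flush` is a hypothetical
helper that takes the "is stdout a regular file" answer as a parameter
rather than calling fstat() itself.

```c
#include <stddef.h>
#include <strings.h>

/* Simplified stand-in: `def` when unset, 0 or 1 when recognized, -2 if bogus. */
static int parse_env_bool(const char *value, int def)
{
	static const char *const truthy[] = { "true", "yes", "on", "1" };
	static const char *const falsy[] = { "false", "no", "off", "0" };
	size_t i;

	if (!value)
		return def;
	for (i = 0; i < sizeof(truthy) / sizeof(truthy[0]); i++)
		if (!strcasecmp(value, truthy[i]))
			return 1;
	for (i = 0; i < sizeof(falsy) / sizeof(falsy[0]); i++)
		if (!strcasecmp(value, falsy[i]))
			return 0;
	return -2;
}

/* Map the tri-state setting to "should flushing stdout be skipped?" */
static int decide_skip_flush(int flush_setting, int stdout_is_regular_file)
{
	switch (flush_setting) {
	case 0:  /* told not to flush */
		return 1;
	case -1: /* unspecified: skip only when writing to a regular file */
		return stdout_is_regular_file;
	default: /* told to flush */
		return 0;
	}
}
```

Keeping -1 distinct from 0 and 1 is the point: only the "unset" case
falls back to inspecting stdout.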


[Footnote]

 * If I were to do this change, and if I were to also improve the
   style of the documentation before I forget, the way I would do so
   probably is with a two-patch series:

    (1) update "0" and "1" in the documentation with "false" and
        "true", without reflowing the text at all, and update the
        code.

    (2) rewrite the documentation to use 3-possibility
        sub-enumeration for values (imitate the way how other
        variables, like `diff.algorithm`, that can choose from a
        set of handful possible values are described).

   These two changes can be done in either order, but perhaps (1) is
   much less controversial change than the other, so I'd probably do
   so first.


* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Taylor Blau @ 2024-01-03 17:00 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Karthik Nayak, Junio C Hamano, git, christian.couder
In-Reply-To: <ZZWCxIHf9ySEOWEJ@tanuki>

On Wed, Jan 03, 2024 at 04:52:36PM +0100, Patrick Steinhardt wrote:
> On Wed, Jan 03, 2024 at 10:45:49AM -0500, Taylor Blau wrote:
> > On Wed, Jan 03, 2024 at 09:52:33AM +0100, Patrick Steinhardt wrote:
> > > > I tend to agree that the special empty pattern would be a good shorthand
> > > > for listing all references underneath refs/, including any top-level
> > > > pseudo-refs.
> > > >
> > > > But I don't think that I quite follow what Karthik is saying here.
> > > > for-each-ref returns the union of references that match the given
> > > > pattern(s), not their intersection. So if you wanted to list just the
> > > > pseudo-refs ending in '_HEAD', you'd do:
> > > >
> > > >   $ git for-each-ref "*_HEAD"
> > > >
> > > > I think if you wanted to list all pseudo-refs, calling the option
> > > > `--pseudo-refs` seems reasonable. But if you want to list some subset of
> > > > pseudo-refs matching a given pattern, you should specify that pattern
> > > > directly.
> > >
> > > Where I think this proposal falls short is if you have refs outside of
> > > the "refs/" hierarchy. Granted, this is nothing that should usually
> > > happen nowadays. But I think we should safeguard ourselves for the future:
> >
> > Hmm. Maybe I misspoke, but I was thinking that `--pseudo-refs` would
> > imply that we list all references (regardless of whether they appear in
> > the top-level refs/ hierarchy). But perhaps I'm misunderstanding what
> > you're trying to accomplish here.
>
> Ah, okay. I think in that case it's simply a misunderstanding. To me a
> pseudo-ref only includes refs that match `is_pseudoref_syntax()`, so
> things like "HEAD", "ORIG_HEAD" or "MERGE_HEAD". So with that
> understanding, a ref "something/outside/refs" would not be included,
> but I'd very much like to see it listed.

OK, I see: you're trying to add an option that lists all references
(including those outside of the top-level "refs/" hierarchy). But my
proposal to use `--pseudo-refs` was to list *just* those references
outside of the top-level hierarchy.

I wonder if we might want to do something else entirely, which is an
option which controls the top-level "namespace" of references that we
want to see. The behavior would then be to list all references under
"namespace" (which presumably would be "refs/" by default).

If you want to list references like something/outside/refs, your
namespace would then be --namespace="".

I think that this would be a bit more flexible than the current
suggestions, but I am also not as familiar as you are with this particular
problem :-).

Thanks,
Taylor

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: Taylor Blau @ 2024-01-03 16:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq5y0bcjpw.fsf@gitster.g>

On Tue, Jan 02, 2024 at 05:02:35PM -0800, Junio C Hamano wrote:
> * tb/merge-tree-write-pack (2023-10-23) 5 commits
>  - builtin/merge-tree.c: implement support for `--write-pack`
>  - bulk-checkin: introduce `index_tree_bulk_checkin_incore()`
>  - bulk-checkin: introduce `index_blob_bulk_checkin_incore()`
>  - bulk-checkin: generify `stream_blob_to_pack()` for arbitrary types
>  - bulk-checkin: extract abstract `bulk_checkin_source`
>
>  "git merge-tree" learned "--write-pack" to record its result
>  without creating loose objects.
>
>  Broken when an object created during a merge is needed to continue merge
>  cf. <CABPp-BEfy9VOvimP9==ry_rZXu=metOQ8s=_-XiG_Pdx9c06Ww@mail.gmail.com>
>  source: <cover.1698101088.git.me@ttaylorr.com>

Let's drop this one.

> * tb/pair-chunk-expect (2023-11-10) 8 commits
>  - midx: read `OOFF` chunk with `pair_chunk_expect()`
>  - midx: read `OIDL` chunk with `pair_chunk_expect()`
>  - commit-graph: read `BIDX` chunk with `pair_chunk_expect()`
>  - commit-graph: read `GDAT` chunk with `pair_chunk_expect()`
>  - commit-graph: read `CDAT` chunk with `pair_chunk_expect()`
>  - commit-graph: read `OIDL` chunk with `pair_chunk_expect()`
>  - chunk-format: introduce `pair_chunk_expect()` helper
>  - Merge branch 'jk/chunk-bounds-more' into HEAD
>
>  Further code clean-up.
>
>  Needs review.
>  source: <cover.1699569246.git.me@ttaylorr.com>

This one is on my list of things to look at, but probably not something
that I'll get to urgently before I've had a chance to clear my holiday
backlog. If you don't mind keeping it, that's fine, but I won't be upset
if it's easier to drop from 'seen' in the meantime.

> * tb/path-filter-fix (2023-10-18) 17 commits
>  - bloom: introduce `deinit_bloom_filters()`
>  - commit-graph: reuse existing Bloom filters where possible
>  - object.h: fix mis-aligned flag bits table
>  - commit-graph: drop unnecessary `graph_read_bloom_data_context`
>  - commit-graph.c: unconditionally load Bloom filters
>  - bloom: prepare to discard incompatible Bloom filters
>  - bloom: annotate filters with hash version
>  - commit-graph: new filter ver. that fixes murmur3
>  - repo-settings: introduce commitgraph.changedPathsVersion
>  - t4216: test changed path filters with high bit paths
>  - t/helper/test-read-graph: implement `bloom-filters` mode
>  - bloom.h: make `load_bloom_filter_from_graph()` public
>  - t/helper/test-read-graph.c: extract `dump_graph_info()`
>  - gitformat-commit-graph: describe version 2 of BDAT
>  - commit-graph: ensure Bloom filters are read with consistent settings
>  - revision.c: consult Bloom filters for root commits
>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
>
>  The Bloom filter used for path limited history traversal was broken
>  on systems whose "char" is unsigned; update the implementation and
>  bump the format version to 2.
>
>  Expecting a reroll.
>  cf. <20231023202212.GA5470@szeder.dev>
>  source: <cover.1697653929.git.me@ttaylorr.com>

I was confused by this one, since I couldn't figure out which tests
Gábor was referring to here. I responded in [1], but haven't heard back
since the end of October.

I personally think that this is ready to go, and it would be nice to get
it out of the perpetual "cooking" state that it's in. So if Gábor is
around to reply and I'm indeed missing something, that would be great.
But in the meantime, I think that this is ready to go.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/

^ permalink raw reply

* Re: Concurrent fetch commands
From: Taylor Blau @ 2024-01-03 16:40 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Oswald Buddenhagen, Junio C Hamano, Stefan Haller, git
In-Reply-To: <ZZU5s4LKQF1NLgnC@tanuki>

On Wed, Jan 03, 2024 at 11:40:51AM +0100, Patrick Steinhardt wrote:
>   - `--append` should handle concurrency just fine, that is it knows to
>     append to a preexisting lockfile. This is messy though, and the
>     original creator of the lockfile wouldn't know when it can commit it
>     into place.
>
> Both options are kind of ugly, so I'm less sure now whether lockfiles
> are the way to go.

Interesting. Thinking a little bit about what you wrote here, I feel
like `--append[=<FETCH_HEAD>]` would do what you need here. The creator
of the lockfile would commit it into place exactly when all children
have finished writing into the existing lockfile.

It seems like that could work, but I haven't poked around to figure out
whether or not that is the case. Regardless, supposing that it does
work, I wonder what users reasonably expect in the presence of multiple
'git fetch' operations. I suppose the answer is that they expect
concurrent fetches to be tolerated, but that the contents of FETCH_HEAD
(and of course the remote references) are consistent at the end of all
of the fetches.

Thanks,
Taylor

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: Junio C Hamano @ 2024-01-03 16:37 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, git
In-Reply-To: <20240103090152.GB1866508@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> It looks like this is the original version. I posted a v2 that took
> René's suggestion to swap out the awk for shell, but it got overlooked.
> I'm happy enough either way, but if we want to salvage that effort,
> here's a patch which could go on top:

Thanks.  I was happy enough with the old one and placed the updated
one on backburner.

The commit message below, which explains why this incremental update
(i.e., the rewrite from awk to a shell loop) is a good idea, does make
it worthwhile ;-)

> -- >8 --
> From: René Scharfe <l.s.r@web.de>
> Subject: [PATCH] t1006: prefer shell loop to awk for packed object sizes
>
> To compute the expected on-disk size of packed objects, we sort the
> output of show-index by pack offset and then compute the difference
> between adjacent entries using awk. This works but has a few readability
> problems:
>
>   1. Reading the index in pack order means we don't find out the size of an
>      oid's entry until we see the _next_ entry. So we have to save it to
>      print later.
>
>      We can instead iterate in reverse order, so we compute each oid's
>      size as we see it.

If you go forward, you need "the end of the previous round" (which
is "the beginning of the current round") to be subtracted from "the
end of the current round".  If you go backward, you have to have "the
beginning of the previous round" (which is "the end of the current
round") from which you subtract "the beginning of the current round".

So from that point of view, the only difference is that, going
forward, you would not be ready to emit in the first round, and you
would need to emit for the last entry after the loop.  Because we
happen to have the end of the last entry outside the loop, going
backward lets us omit the awkwardness.

OK.  But iterating over a list backwards is a bit awkward ;-).
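
To make the comparison concrete, here is a sketch of both directions
over made-up "offset oid" input.  The function names and the sample
offsets are hypothetical; only the backward variant mirrors the loop in
the patch below.

```shell
# Both helpers read "offset oid" lines on stdin and take the offset
# where the last entry ends as $1 (hypothetical helpers for illustration).

# Backward: each entry's end is already known when we see it.
sizes_backward () {
	end=$1
	sort -nr | while read -r start oid rest
	do
		echo "$oid $((end - start))"
		end=$start
	done
}

# Forward: an entry's size is known only after seeing the next one,
# so the previous entry has to be remembered and the last one is
# emitted after the loop.
sizes_forward () {
	end=$1
	sort -n | {
		prev_oid=
		while read -r start oid rest
		do
			if test -n "$prev_oid"
			then
				echo "$prev_oid $((start - prev_start))"
			fi
			prev_start=$start
			prev_oid=$oid
		done
		if test -n "$prev_oid"
		then
			echo "$prev_oid $((end - prev_start))"
		fi
	}
}
```

Both produce the same sizes, just in opposite order; the forward
variant needs the extra bookkeeping before and after the loop.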

>   2. Since the awk invocation is inside a test_expect block, we can't
>      easily use single-quotes to hold the script. So we use
>      double-quotes, but then have to escape the dollar signs in the awk
>      script.

Yup.  The joy of shell quoting rules ;-)

> I gave René authorship since this was his cleverness, but obviously I
> wrote the commit message. Giving an explicit signoff would be nice,
> though.

Indeed.

>  t/t1006-cat-file.sh | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index 0c2eafae65..5ea3326128 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -1117,14 +1117,16 @@ test_expect_success 'cat-file %(objectsize:disk) with --batch-all-objects' '
>  		while read idx
>  		do
>  			git show-index <"$idx" >idx.raw &&
> -			sort -n <idx.raw >idx.sorted &&
> +			sort -nr <idx.raw >idx.sorted &&
>  			packsz=$(test_file_size "${idx%.idx}.pack") &&
>  			end=$((packsz - rawsz)) &&
> -			awk -v end="$end" "
> -			  NR > 1 { print oid, \$1 - start }
> -			  { start = \$1; oid = \$2 }
> -			  END { print oid, end - start }
> -			" idx.sorted ||
> +			while read start oid rest
> +			do
> +				size=$((end - start)) &&
> +				end=$start &&
> +				echo "$oid $size" ||
> +				return 1
> +			done <idx.sorted ||
>  			return 1
>  		done
>  	} >expect.raw &&

This is a totally unrelated tangent, but the way "show-index" gets
invoked in the above loop makes readers wonder how the caller found
out which $idx file to read.

Of course, the above loop sits downstream of a pipe

    find .git/objects/pack -type f -name \*.idx

which means that any user of "git show-index" must be intimately
familiar with how the object database is structured.  I wonder if we
want an extra layer of abstraction, similar to how the reference
database can have different backend implementations.

Anyway, will queue.  Thanks.

^ permalink raw reply

* Re: [PATCH v6] subtree: fix split processing with multiple subtrees present
From: Christian Couder @ 2024-01-03 16:33 UTC (permalink / raw)
  To: Zach FettersMoore; +Cc: Junio C Hamano, Zach FettersMoore via GitGitGadget, git
In-Reply-To: <CAEWN6q2XeDDLvSM-ik_-HVqpeyYZLWpPwoj2SUyB9L9NyMJPLw@mail.gmail.com>

(Sorry for replying only to Zach instead of everyone previously.)

On Wed, Dec 13, 2023 at 4:20 PM Zach FettersMoore
<zach.fetters@apollographql.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> >>> > $ git subtree split --prefix=apollo-ios-codegen --squash --rejoin
> >>> > Merge made by the 'ort' strategy.
> >>> > e274aed3ba6d0659fb4cc014587cf31c1e8df7f4
> >>>
> >>> Looking into this some it looks like it could be a bash config
> >>> difference? My machine always runs it all the way through vs
> >>> failing for recursion depth. Although that would also be an issue
> >>> which is solved by this fix.
> >>
> >> I use Ubuntu where /bin/sh is dash so my current guess is that dash
> >> might have a smaller recursion limit than bash.
> >
> > That sounds quite bad. Does it have to be recursive (iow, if we can
> > rewrite the logic to be iterative instead, that would be a much better
> > way to fix the issue)?
>
> I don't think an iterative vs recursive approach fixes this
> particular issue, the root of the issue this patch is fixing
> is that lots of commits from the history of subtrees not
> being acted upon are being processed when they don't need to
> be. So the iterative approach would likely resolve the
> recursion limit issue for some shells, but in my instance
> I don't see a recursion limit error, it just takes an
> extraordinary amount of time to run the split command
> because of all the unnecessary processing which needs to be
> avoided which this patch fixes.

Fixing possible recursion might be an improvement on top of your
patch. But without your patch the test case it describes would anyway
take a lot more time than seems necessary. So I agree that your patch
should definitely be merged anyway.

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Patrick Steinhardt @ 2024-01-03 16:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <xmqqsf3ebe1l.fsf@gitster.g>

On Wed, Jan 03, 2024 at 08:02:46AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The downside of an empty prefix is that you wouldn't be able to filter
> > refs outside of the "refs/" hierarchy in case we'd use the empty prefix.
> > A better alternative would be to use "/" as an indicator that you want
> > to list refs outside of "refs/". That'd allow for more flexible queries:
> >
> >   - "/" prints all refs and pseudo refs, even those outside of the
> >     "refs/" hierarchy.
> >
> >   - "/refs" prints your normal refs.
> >
> >   - "/something/else" prints refs in "$GIT_DIR/something/else".
> 
> I do not get this at all, sorry.  What makes your "/" cover "refs/"
> but not "something/"? 

It does cover "something/". But...

> Unless you have some rule that special cases "/" to apply the
> "hierarchy prefix" matching rule unevenly, that is not possible.  So
> you can easily lose the "/" all of your above patterns share, go back
> to what I showed, and apply the morally equivalent special case to an
> empty prefix and you'd be OK.

... I think you're right -- I was arguing under the misassumption that
the typical rev-parse rules kicked in for git-for-each-ref(1) (e.g.
matching "heads/foo" to "refs/heads/foo"). But they don't, so my point
indeed becomes moot and I see what you're getting at now and agree with
you.

> In any case, I do not think supporting anything other than
> pseudorefs and HEAD outside "refs/" is a good idea to begin with
> (the "worktrees/$name/" example), and requiring that all normal
> references live inside "refs/" hierarchy is a good idea, so all of
> the above is moot, I would say.

Yeah, I'm on the same page: anything outside of "refs/" should not be
supported. But the problem is that tools like git-update-ref(1) don't
enforce this, so something like `git update-ref foo/bar HEAD` happily
creates "$GIT_DIR/foo/bar". And I bet there are other ways to write refs
at arbitrary paths.

With the files backend it's easy to see that this was created and can be
rectified. But with the reftable library you wouldn't be able to learn
about the existence of this ref at all if we ignored anything but
pseudo-refs and refs prefixed with "refs/".

So while I agree that we shouldn't endorse such refs, we need to at
least give an escape hatch in case such refs end up in the refdb anyway.

Patrick

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Junio C Hamano @ 2024-01-03 16:02 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <ZZWCXFghtql4i4YE@tanuki>

Patrick Steinhardt <ps@pks.im> writes:

> The downside of an empty prefix is that you wouldn't be able to filter
> refs outside of the "refs/" hierarchy in case we'd use the empty prefix.
> A better alternative would be to use "/" as an indicator that you want
> to list refs outside of "refs/". That'd allow for more flexible queries:
>
>   - "/" prints all refs and pseudo refs, even those outside of the
>     "refs/" hierarchy.
>
>   - "/refs" prints your normal refs.
>
>   - "/something/else" prints refs in "$GIT_DIR/something/else".

I do not get this at all, sorry.  What makes your "/" cover "refs/"
but not "something/"?  Unless you have some rule that special cases 
"/" to apply the "hierarchy prefix" matching rule unevenly, that is
not possible.  So you can easily lose the "/" all of your above
patterns share, go back to what I showed, and apply the morally
equivalent special case to an empty prefix and you'd be OK.

In any case, I do not think supporting anything other than
pseudorefs and HEAD outside "refs/" is a good idea to begin with
(the "worktrees/$name/" example), and requiring that all normal
references live inside "refs/" hierarchy is a good idea, so all of
the above is moot, I would say.

Thanks.

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Patrick Steinhardt @ 2024-01-03 15:52 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Karthik Nayak, Junio C Hamano, git, christian.couder
In-Reply-To: <ZZWBLafB3pIlZqpw@nand.local>

On Wed, Jan 03, 2024 at 10:45:49AM -0500, Taylor Blau wrote:
> On Wed, Jan 03, 2024 at 09:52:33AM +0100, Patrick Steinhardt wrote:
> > > I tend to agree that the special empty pattern would be a good shorthand
> > > for listing all references underneath refs/, including any top-level
> > > pseudo-refs.
> > >
> > > But I don't think that I quite follow what Karthik is saying here.
> > > for-each-ref returns the union of references that match the given
> > > pattern(s), not their intersection. So if you wanted to list just the
> > > pseudo-refs ending in '_HEAD', you'd do:
> > >
> > >   $ git for-each-ref "*_HEAD"
> > >
> > > I think if you wanted to list all pseudo-refs, calling the option
> > > `--pseudo-refs` seems reasonable. But if you want to list some subset of
> > > pseudo-refs matching a given pattern, you should specify that pattern
> > > directly.
> >
> > Where I think this proposal falls short is if you have refs outside of
> > the "refs/" hierarchy. Granted, this is nothing that should usually
> > happen nowadays. But I think we should safeguard ourselves for the future:
> 
> Hmm. Maybe I misspoke, but I was thinking that `--pseudo-refs` would
> imply that we list all references (regardless of whether they appear in
> the top-level refs/ hierarchy). But perhaps I'm misunderstanding what
> you're trying to accomplish here.

Ah, okay. I think in that case it's simply a misunderstanding. To me a
pseudo-ref only includes refs that match `is_pseudoref_syntax()`, so
things like "HEAD", "ORIG_HEAD" or "MERGE_HEAD". So with that
understanding, a ref "something/outside/refs" would not be included,
but I'd very much like to see it listed.

Patrick

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Patrick Steinhardt @ 2024-01-03 15:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Karthik Nayak, Taylor Blau, git, christian.couder
In-Reply-To: <xmqqwmsqbhyt.fsf@gitster.g>

On Wed, Jan 03, 2024 at 06:38:02AM -0800, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
> 
> > The confusion was that I thought Junio was referring to using
> >
> >     $ git for-each-ref ""
> >
> > to print all refs under $GIT_DIR, while he was actually talking about
> > "$GIT_DIR/refs/" directory.
> 
> I do not think you misunderstood me here, though.  
> 
> When you have your master branch (refs/heads/master), your v1.0 tag
> (refs/tags/v1.0), and the usual pseudorefs, giving "refs" to "git
> for-each-ref" would yield refs/heads/master and refs/tags/v1.0 but
> not HEAD and others, simply because the pattern "refs" in
> 
>     $ git for-each-ref "refs"
> 
> works as a hierarchy prefix match.  You give "refs/heads" and you
> get only your master branch but not tags or HEAD in such a
> repository.  As a natural extension to that behaviour, an empty
> string as a hierarchy prefix that matches everything would work
> well: you'd get HEAD, refs/heads/master, and refs/tags/v1.0 as an
> empty prefix would cover all of the hierarchies these three refs (and
> pseudorefs if you had ORIG_HEAD and MERGE_HEAD there) live in.
> 
> In any case, it is not very interesting to define the syntax
> to tell for-each-ref not to limit itself under "refs/".  My point
> was that you do not need a special option for that, as shown above.

I think you're just stating that "it's possible, but not necessarily a
good idea" (let me know if I'm misinterpreting, I'm not quite sure
whether I read this correctly). Anyway, let me add my 2c here, even
though it may ultimately be completely moot.

The downside of an empty prefix is that you wouldn't be able to filter
refs outside of the "refs/" hierarchy in case we'd use the empty prefix.
A better alternative would be to use "/" as an indicator that you want
to list refs outside of "refs/". That'd allow for more flexible queries:

  - "/" prints all refs and pseudo refs, even those outside of the
    "refs/" hierarchy.

  - "/refs" prints your normal refs.

  - "/something/else" prints refs in "$GIT_DIR/something/else".

I'm not sure whether it's a better idea than using a flag and I'd assume
that the implementation would be more complex in that case because the
respective backends would need to special-case leading slashes.

So in the end I'm still in the camp that a flag is likely a better idea.

> What is more interesting is what to do with refs that are specific
> to other worktrees, e.g.
> 
>     $ git rev-parse "worktrees/$name/refs/bisect/bad"
> 
> would currently let you peek into (and worse yet, muck with, if you
> really wanted to, with something like "git update-ref") refs that
> should be only visible in another worktree.  Should for-each-ref and
> friends learn a way to iterate over them?  I have no answer to that
> question.

That's a good question indeed. I could certainly see an argument that
there should be the possibility to list them to get an all-encompassing
view of the repository's refs. It's sure going to get more complex like
that though (which is not a good argument not to do this).

Currently, per-worktree refs are implemented as quasi-separate ref
stores (see `get_worktree_ref_store()`), and the reftable backend will
indeed use completely separate stacks for each worktree. So ultimately
it makes me think that this is higher-level logic that the ref store
backend wouldn't need to be aware of, but that git-for-each-ref(1) or
related commands would need to handle.

So I'm not quite sure whether we should solve all these related problems
at once. If we were to implement these features via a flag then I could
see us using a value-flag with which you can control what exactly should
be included in the listing. So something like:

  - `--with-refs=repository` includes all refs of the current
    repository.

  - `--with-refs=worktrees` includes refs of all worktrees.

I dunno. I feel like I start to overthink this.

Patrick

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Taylor Blau @ 2024-01-03 15:45 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Karthik Nayak, Junio C Hamano, git, christian.couder
In-Reply-To: <ZZUgUUlB8A-rhep5@tanuki>

On Wed, Jan 03, 2024 at 09:52:33AM +0100, Patrick Steinhardt wrote:
> > I tend to agree that the special empty pattern would be a good shorthand
> > for listing all references underneath refs/, including any top-level
> > pseudo-refs.
> >
> > But I don't think that I quite follow what Karthik is saying here.
> > for-each-ref returns the union of references that match the given
> > pattern(s), not their intersection. So if you wanted to list just the
> > pseudo-refs ending in '_HEAD', you'd do:
> >
> >   $ git for-each-ref "*_HEAD"
> >
> > I think if you wanted to list all pseudo-refs, calling the option
> > `--pseudo-refs` seems reasonable. But if you want to list some subset of
> > pseudo-refs matching a given pattern, you should specify that pattern
> > directly.
>
> Where I think this proposal falls short is if you have refs outside of
> the "refs/" hierarchy. Granted, this is nothing that should usually
> happen nowadays. But I think we should safeguard ourselves for the future:

Hmm. Maybe I misspoke, but I was thinking that `--pseudo-refs` would
imply that we list all references (regardless of whether they appear in
the top-level refs/ hierarchy). But perhaps I'm misunderstanding what
you're trying to accomplish here.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Junio C Hamano @ 2024-01-03 14:38 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Patrick Steinhardt, Taylor Blau, git, christian.couder
In-Reply-To: <CAOLa=ZS4OOAmyRvf4HH-c_3GvnVkh6zS2kD3hEhRZ7NZT-rvyA@mail.gmail.com>

Karthik Nayak <karthik.188@gmail.com> writes:

> The confusion was that I thought Junio was referring to using
>
>     $ git for-each-ref ""
>
> to print all refs under $GIT_DIR, while he was actually talking about
> "$GIT_DIR/refs/" directory.

I do not think you misunderstood me here, though.  

When you have your master branch (refs/heads/master), your v1.0 tag
(refs/tags/v1.0), and the usual pseudorefs, giving "refs" to "git
for-each-ref" would yield refs/heads/master and refs/tags/v1.0 but
not HEAD and others, simply because the pattern "refs" in

    $ git for-each-ref "refs"

works as a hierarchy prefix match.  You give "refs/heads" and you
get only your master branch but not tags or HEAD in such a
repository.  As a natural extension to that behaviour, an empty
string as a hierarchy prefix that matches everything would work
well: you'd get HEAD, refs/heads/master, and refs/tags/v1.0 as an
empty prefix would cover all of the hierarchies these three refs (and
pseudorefs if you had ORIG_HEAD and MERGE_HEAD there) live in.

In any case, it is not very interesting to define the syntax
to tell for-each-ref not to limit itself under "refs/".  My point
was that you do not need a special option for that, as shown above.
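
A minimal shell model of that hierarchy prefix rule, assuming an empty
prefix is treated as matching everything as suggested above.
`filter_refs` is a hypothetical helper for illustration; it does not
call git and ignores the fnmatch(3) glob patterns that for-each-ref
also accepts.

```shell
# usage: filter_refs <prefix>   (ref names are read from stdin)
# Prints the refs that equal <prefix> or live underneath "<prefix>/".
# An empty prefix is modelled as matching every ref.
filter_refs () {
	prefix=${1%/}	# "refs/" and "refs" behave the same
	while read -r ref
	do
		case "$ref" in
		"$prefix"|"$prefix"/*)
			echo "$ref" ;;
		*)
			# only the empty prefix matches refs outside a hierarchy
			test -n "$prefix" || echo "$ref" ;;
		esac
	done
}
```

Feeding it HEAD, refs/heads/master and refs/tags/v1.0 with the prefix
"refs/heads" prints only refs/heads/master, while an empty prefix
prints all three.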

What is more interesting is what to do with refs that are specific
to other worktrees, e.g.

    $ git rev-parse "worktrees/$name/refs/bisect/bad"

would currently let you peek into (and worse yet, muck with, if you
really wanted to, with something like "git update-ref") refs that
should be only visible in another worktree.  Should for-each-ref and
friends learn a way to iterate over them?  I have no answer to that
question.

^ permalink raw reply
