Git development

Git development
 help / color / mirror / Atom feed

* [PATCH 2/4] refs/files: use transactions to delete references
From: Patrick Steinhardt @ 2023-11-14  8:58 UTC (permalink / raw)
  To: git; +Cc: David Turner, Han-Wen Nienhuys
In-Reply-To: <cover.1699951815.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 7477 bytes --]

In the `files_delete_refs()` callback function of the files backend we
implement deletion of references. This is done in two steps:

    1. We lock the packed-refs file and delete all references from it in
       a single transaction.

    2. We delete all loose references via separate calls to
       `refs_delete_ref()`.

These steps essentially duplicate the logic around locking and deletion
order that we already have in the transactional interfaces, where we do
know to lock and evict references from the packed-refs file. Despite the
fact that we duplicate the logic, it's also less efficient than if we
used a single generic transaction:

    - The transactional interface knows to skip locking of the packed
      refs in case they don't contain any of the refs which are about to
      be deleted.

    - We end up creating N+1 separate reference transactions, one for
      the packed-refs file and N for the individual loose references.

Refactor the code to instead delete references via a single transaction.
As we don't assert the expected old object ID this is equivalent to the
previous behaviour, and we already do the same in the packed-refs
backend.

Despite the fact that the result is simpler to reason about, this change
also results in improved performance. The following benchmarks have been
executed in linux.git:

```
$ hyperfine -n '{rev}, packed={packed} refcount={refcount}' \
    -L packed true,false -L refcount 1,1000 -L rev master,pks-ref-store-generic-delete-refs \
    --setup 'git -C /home/pks/Development/git switch --detach {rev} && make -C /home/pks/Development/git -j17' \
    --prepare 'printf "create refs/heads/new-branch-%d HEAD\n" $(seq {refcount}) | git -C /home/pks/Reproduction/linux.git update-ref --stdin && if test {packed} = true; then git pack-refs --all; fi' \
    --warmup=10 \
    '/home/pks/Development/git/bin-wrappers/git -C /home/pks/Reproduction/linux.git branch -d new-branch-{1..{refcount}}'

Benchmark 1: master packed=true refcount=1
  Time (mean ± σ):       7.8 ms ±   1.6 ms    [User: 3.4 ms, System: 4.4 ms]
  Range (min … max):     5.5 ms …  11.0 ms    120 runs

Benchmark 2: master packed=false refcount=1
  Time (mean ± σ):       7.0 ms ±   1.1 ms    [User: 3.2 ms, System: 3.8 ms]
  Range (min … max):     5.7 ms …   9.8 ms    180 runs

Benchmark 3: master packed=true refcount=1000
  Time (mean ± σ):     283.8 ms ±   5.2 ms    [User: 45.7 ms, System: 231.5 ms]
  Range (min … max):   276.7 ms … 291.6 ms    10 runs

Benchmark 4: master packed=false refcount=1000
  Time (mean ± σ):     284.4 ms ±   5.3 ms    [User: 44.2 ms, System: 233.6 ms]
  Range (min … max):   277.1 ms … 293.3 ms    10 runs

Benchmark 5: generic-delete-refs packed=true refcount=1
  Time (mean ± σ):       6.2 ms ±   1.8 ms    [User: 2.3 ms, System: 3.9 ms]
  Range (min … max):     4.1 ms …  12.2 ms    142 runs

Benchmark 6: generic-delete-refs packed=false refcount=1
  Time (mean ± σ):       7.1 ms ±   1.4 ms    [User: 2.8 ms, System: 4.3 ms]
  Range (min … max):     4.2 ms …  10.8 ms    157 runs

Benchmark 7: generic-delete-refs packed=true refcount=1000
  Time (mean ± σ):     198.9 ms ±   1.7 ms    [User: 29.5 ms, System: 165.7 ms]
  Range (min … max):   196.1 ms … 201.4 ms    10 runs

Benchmark 8: generic-delete-refs packed=false refcount=1000
  Time (mean ± σ):     199.7 ms ±   7.8 ms    [User: 32.2 ms, System: 163.2 ms]
  Range (min … max):   193.8 ms … 220.7 ms    10 runs

Summary
  generic-delete-refs packed=true refcount=1 ran
    1.14 ± 0.37 times faster than master packed=false refcount=1
    1.15 ± 0.40 times faster than generic-delete-refs packed=false refcount=1
    1.26 ± 0.44 times faster than master packed=true refcount=1
   32.24 ± 9.17 times faster than generic-delete-refs packed=true refcount=1000
   32.36 ± 9.29 times faster than generic-delete-refs packed=false refcount=1000
   46.00 ± 13.10 times faster than master packed=true refcount=1000
   46.10 ± 13.13 times faster than master packed=false refcount=1000
```

Especially in the case where we have many references we can see a clear
performance speedup of nearly 30%.

This is in contrast to the stated objecive in a27dcf89b68 (refs: make
delete_refs() virtual, 2016-09-04), where the virtual `delete_refs()`
function was introduced with the intent to speed things up rather than
making things slower. So it seems like we have outlived the need for a
virtual function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c | 66 +++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 32 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index db5c0c7a72..778d3a96ba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1268,49 +1268,51 @@ static int files_pack_refs(struct ref_store *ref_store,
 static int files_delete_refs(struct ref_store *ref_store, const char *msg,
 			     struct string_list *refnames, unsigned int flags)
 {
-	struct files_ref_store *refs =
-		files_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
+	struct ref_transaction *transaction;
 	struct strbuf err = STRBUF_INIT;
-	int i, result = 0;
+	struct string_list_item *item;
+	int ret = 0, failures = 0;
 
 	if (!refnames->nr)
 		return 0;
 
-	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
-		goto error;
-
-	if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
-		packed_refs_unlock(refs->packed_ref_store);
-		goto error;
+	/*
+	 * Since we don't check the references' old_oids, the
+	 * individual updates can't fail, so we can pack all of the
+	 * updates into a single transaction.
+	 */
+	transaction = ref_store_transaction_begin(ref_store, &err);
+	if (!transaction) {
+		ret = error("%s", err.buf);
+		goto out;
 	}
 
-	packed_refs_unlock(refs->packed_ref_store);
-
-	for (i = 0; i < refnames->nr; i++) {
-		const char *refname = refnames->items[i].string;
-
-		if (refs_delete_ref(&refs->base, msg, refname, NULL, flags))
-			result |= error(_("could not remove reference %s"), refname);
+	for_each_string_list_item(item, refnames) {
+		ret = ref_transaction_delete(transaction, item->string,
+					     NULL, flags, msg, &err);
+		if (ret) {
+			warning(_("could not delete reference %s: %s"),
+				item->string, err.buf);
+			strbuf_reset(&err);
+			failures = 1;
+		}
 	}
 
-	strbuf_release(&err);
-	return result;
-
-error:
-	/*
-	 * If we failed to rewrite the packed-refs file, then it is
-	 * unsafe to try to remove loose refs, because doing so might
-	 * expose an obsolete packed value for a reference that might
-	 * even point at an object that has been garbage collected.
-	 */
-	if (refnames->nr == 1)
-		error(_("could not delete reference %s: %s"),
-		      refnames->items[0].string, err.buf);
-	else
-		error(_("could not delete references: %s"), err.buf);
+	ret = ref_transaction_commit(transaction, &err);
+	if (ret) {
+		if (refnames->nr == 1)
+			error(_("could not delete reference %s: %s"),
+			      refnames->items[0].string, err.buf);
+		else
+			error(_("could not delete references: %s"), err.buf);
+	}
 
+out:
+	if (!ret && failures)
+		ret = -1;
+	ref_transaction_free(transaction);
 	strbuf_release(&err);
-	return -1;
+	return ret;
 }
 
 /*
-- 
2.42.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 1/4] t5510: ensure that the packed-refs file needs locking
From: Patrick Steinhardt @ 2023-11-14  8:58 UTC (permalink / raw)
  To: git; +Cc: David Turner, Han-Wen Nienhuys
In-Reply-To: <cover.1699951815.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1569 bytes --]

One of the tests in t5510 asserts that a `git fetch --prune` detects
failures to prune branches. This is done by locking the packed-refs
file, which would then later lead to a locking issue when Git tries to
rewrite the file to prune the branches from it.

Interestingly though, we do not pack the about-to-be-pruned branch into
the packed-refs file, so it never even contained that branch in the
first place. While this is good enough right now because the pruning
will always lock the file regardless of whether it contains the branch
or not, this is a mere implementation detail. In fact, we're about to
rewrite branch deletions to make use of the ref transaction interface,
which knows to skip rewrites of the packed-refs file in the case where
it does not contain the branches in the first place, and this will break
the test.

Prepare the test for that change by packing the refs before trying to
prune them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t5510-fetch.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index dcadd56d3a..79592a3b0a 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -169,6 +169,7 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' '
 	git clone . prune-fail &&
 	cd prune-fail &&
 	git update-ref refs/remotes/origin/extrabranch main &&
+	git pack-refs --all &&
 	: this will prevent --prune from locking packed-refs for deleting refs, but adding loose refs still succeeds  &&
 	>.git/packed-refs.new &&

-- 
2.42.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 0/4] refs: remove virtual `delete_refs()` function
From: Patrick Steinhardt @ 2023-11-14  8:58 UTC (permalink / raw)
  To: git; +Cc: David Turner, Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 1612 bytes --]

Hi,

this patch series refactors the virtual `delete_refs()` function to
instead be implemented generically via a single reference transaction.

The main intent of this patch series is to reduce complexity that we
have in the reference backends so that it becomes easier to implement
new backends that have the same semantics as others. But at the same
time, benchmarks show that the new generic code is even faster than the
old backend-specific code. This is mostly because we avoid the overhead
of per-reference transactions when deleting many references, but also
because the transactional code in the files backend knows to avoid
rewriting the packed-refs file in case it contains none of the refs that
are to be deleted.

So in the end we have less duplicate code in the files backend, make it
easier to implement new backends, and have faster deletion of many refs
in the files backend.

Patrick

Patrick Steinhardt (4):
  t5510: ensure that the packed-refs file needs locking
  refs/files: use transactions to delete references
  refs: deduplicate code to delete references
  refs: remove `delete_refs` callback from backends

 refs.c                | 48 ++++++++++++++++++++++++++++++++++++++---
 refs/debug.c          | 15 -------------
 refs/files-backend.c  | 49 ------------------------------------------
 refs/packed-backend.c | 50 -------------------------------------------
 refs/refs-internal.h  |  3 ---
 t/t5510-fetch.sh      |  1 +
 6 files changed, 46 insertions(+), 120 deletions(-)

base-commit: e0939bec273052b1a8d69db4a3f7c87aaf83e220
-- 
2.42.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: commit-graph paranoia performance, was Re: [ANNOUNCE] Git v2.43.0-rc1
From: Patrick Steinhardt @ 2023-11-14  8:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git
In-Reply-To: <20231113205538.GA2028092@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 2769 bytes --]

On Mon, Nov 13, 2023 at 03:55:38PM -0500, Jeff King wrote:
> On Thu, Nov 09, 2023 at 02:33:54AM +0900, Junio C Hamano wrote:
> 
> >  * The codepath to traverse the commit-graph learned to notice that a
> >    commit is missing (e.g., corrupt repository lost an object), even
> >    though it knows something about the commit (like its parents) from
> >    what is in commit-graph.
> >    (merge 7a5d604443 ps/do-not-trust-commit-graph-blindly-for-existence later to maint).
> 
> I happened to be timing "rev-list" for an unrelated topic today, and I
> noticed that this change had a rather large effect. The commit message
> for 7a5d604443 claims a 30% performance regression. But that's when
> using "--topo-order", and actually writing out the result.
> 
> Running "rev-list --count" on a copy of linux.git with a fully-built
> commit-graph shows that the run-time doubles:
> 
>   Benchmark 1: git.v2.42.1 rev-list --count HEAD
>     Time (mean ± σ):     658.0 ms ±   5.2 ms    [User: 613.5 ms, System: 44.4 ms]
>     Range (min … max):   650.2 ms … 666.0 ms    10 runs
>   
>   Benchmark 2: git.v2.43.0-rc1 rev-list --count HEAD
>     Time (mean ± σ):      1.333 s ±  0.019 s    [User: 1.263 s, System: 0.069 s]
>     Range (min … max):    1.302 s …  1.361 s    10 runs
>   
>   Summary
>     git.v2.42.1 rev-list --count HEAD ran
>       2.03 ± 0.03 times faster than git.v2.43.0-rc1 rev-list --count HEAD

Ah, indeed. I thought I already benchmarked the worst-case behaviour by
simply doing a full graph walk, but of course the performance hit is
even worse when not outputting the commits at all but only counting
them.

> Now in defense of that patch, this particular command is going to be one
> of the most sensitive in terms of percent change, simply because it
> isn't doing much besides walking the commits. And 650ms isn't _that_ big
> in an absolute sense. But it also doesn't quite feel like nothing, even
> tacked onto a command that might otherwise take 1000ms to run.
> 
> Should we default GIT_COMMIT_GRAPH_PARANOIA to "0"? Yes, some operations
> might miss a breakage, but that is true of so much of Git. For day to
> day commands we generally assume that the repository is not corrupted,
> and avoid looking at any data we can. Other commands (like "commit-graph
> verify", but maybe others) would probably want to be more careful
> (either by checking this case explicitly, or by enabling the paranoia
> flag themselves).
> 
> -Peff

I'd be fine with that as a follow-up change, yes. I agree that in
general we shouldn't see this kind of corruption, and it's good that the
behaviour can be toggled so easily now.

I'm happy to write that patch if you don't plan to.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v2] credential/wincred: store oauth_refresh_token
From: Junio C Hamano @ 2023-11-14  7:40 UTC (permalink / raw)
  To: M Hickford via GitGitGadget
  Cc: git, Jeff King, Taylor Blau, Matthew John Cheetham, M Hickford
In-Reply-To: <pull.1534.v2.git.1699251395093.gitgitgadget@gmail.com>

"M Hickford via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: M Hickford <mirth.hickford@gmail.com>
>
> a5c7656 (credential: new attribute oauth_refresh_token) introduced
> a new confidential credential attribute for helpers to store.
>
> We encode the new attribute in the CredentialBlob, separated by
> newline:
>
>     hunter2
>     oauth_refresh_token=xyzzy
>
> This is extensible and backwards compatible. The credential protocol
> already assumes that attribute values do not contain newlines.
>
> This fixes test "helper (wincred) gets oauth_refresh_token" when
> t0303-credential-external.sh is run with
> GIT_TEST_CREDENTIAL_HELPER=wincred. This test was added in a5c76569e7
> (credential: new attribute oauth_refresh_token, 2023-04-21).
>
> Alternatives considered: store oauth_refresh_token in a wincred
> attribute. This would be insecure because wincred assumes attribute
> values to be non-confidential.

Earlier, a5c76569 (credential: new attribute oauth_refresh_token,
2023-04-21) built the "git" side support for the token, and taught
credential-cache to store the necessary information.  Then 0ce02e2f
(credential/libsecret: store new attributes, 2023-06-16) was written
for libsecret to support the same interface.  

And this one adds corresponding support for wincred.  Do I
understand what is going on around this patch correctly?

I do not do Windows, but some people on this list certainly do and
would be capable of giving the patch a necessary nudge ;-)

Thanks.

>  .../wincred/git-credential-wincred.c          | 46 ++++++++++++++++---
>  1 file changed, 40 insertions(+), 6 deletions(-)
>
> diff --git a/contrib/credential/wincred/git-credential-wincred.c b/contrib/credential/wincred/git-credential-wincred.c
> index 4cd56c42e24..5c6a7d65d4a 100644
> --- a/contrib/credential/wincred/git-credential-wincred.c
> +++ b/contrib/credential/wincred/git-credential-wincred.c
> @@ -35,7 +35,7 @@ static void *xmalloc(size_t size)
>  }
>  
>  static WCHAR *wusername, *password, *protocol, *host, *path, target[1024],
> -	*password_expiry_utc;
> +	*password_expiry_utc, *oauth_refresh_token;
>  
>  static void write_item(const char *what, LPCWSTR wbuf, int wlen)
>  {
> @@ -140,6 +140,11 @@ static void get_credential(void)
>  	DWORD num_creds;
>  	int i;
>  	CREDENTIAL_ATTRIBUTEW *attr;
> +	WCHAR *secret;
> +	WCHAR *line;
> +	WCHAR *remaining_lines;
> +	WCHAR *part;
> +	WCHAR *remaining_parts;
>  
>  	if (!CredEnumerateW(L"git:*", 0, &num_creds, &creds))
>  		return;
> @@ -149,9 +154,23 @@ static void get_credential(void)
>  		if (match_cred(creds[i], 0)) {
>  			write_item("username", creds[i]->UserName,
>  				creds[i]->UserName ? wcslen(creds[i]->UserName) : 0);
> -			write_item("password",
> -				(LPCWSTR)creds[i]->CredentialBlob,
> -				creds[i]->CredentialBlobSize / sizeof(WCHAR));
> +			if (creds[i]->CredentialBlobSize > 0) {
> +				secret = xmalloc(creds[i]->CredentialBlobSize);
> +				wcsncpy_s(secret, creds[i]->CredentialBlobSize, (LPCWSTR)creds[i]->CredentialBlob, creds[i]->CredentialBlobSize / sizeof(WCHAR));
> +				line = wcstok_s(secret, L"\r\n", &remaining_lines);
> +				write_item("password", line, line ? wcslen(line) : 0);
> +				while(line != NULL) {
> +					part = wcstok_s(line, L"=", &remaining_parts);
> +					if (!wcscmp(part, L"oauth_refresh_token")) {
> +						write_item("oauth_refresh_token", remaining_parts, remaining_parts ? wcslen(remaining_parts) : 0);
> +					}
> +					line = wcstok_s(NULL, L"\r\n", &remaining_lines);
> +				}
> +			} else {
> +				write_item("password",
> +						(LPCWSTR)creds[i]->CredentialBlob,
> +						creds[i]->CredentialBlobSize / sizeof(WCHAR));
> +			}
>  			for (int j = 0; j < creds[i]->AttributeCount; j++) {
>  				attr = creds[i]->Attributes + j;
>  				if (!wcscmp(attr->Keyword, L"git_password_expiry_utc")) {
> @@ -160,6 +179,7 @@ static void get_credential(void)
>  					break;
>  				}
>  			}
> +			free(secret);
>  			break;
>  		}
>  
> @@ -170,16 +190,26 @@ static void store_credential(void)
>  {
>  	CREDENTIALW cred;
>  	CREDENTIAL_ATTRIBUTEW expiry_attr;
> +	WCHAR *secret;
> +	int wlen;
>  
>  	if (!wusername || !password)
>  		return;
>  
> +	if (oauth_refresh_token) {
> +		wlen = _scwprintf(L"%s\r\noauth_refresh_token=%s", password, oauth_refresh_token);
> +		secret = xmalloc(sizeof(WCHAR) * wlen);
> +		_snwprintf_s(secret, sizeof(WCHAR) * wlen, wlen, L"%s\r\noauth_refresh_token=%s", password, oauth_refresh_token);
> +	} else {
> +		secret = _wcsdup(password);
> +	}
> +
>  	cred.Flags = 0;
>  	cred.Type = CRED_TYPE_GENERIC;
>  	cred.TargetName = target;
>  	cred.Comment = L"saved by git-credential-wincred";
> -	cred.CredentialBlobSize = (wcslen(password)) * sizeof(WCHAR);
> -	cred.CredentialBlob = (LPVOID)password;
> +	cred.CredentialBlobSize = wcslen(secret) * sizeof(WCHAR);
> +	cred.CredentialBlob = (LPVOID)_wcsdup(secret);
>  	cred.Persist = CRED_PERSIST_LOCAL_MACHINE;
>  	cred.AttributeCount = 0;
>  	cred.Attributes = NULL;
> @@ -194,6 +224,8 @@ static void store_credential(void)
>  	cred.TargetAlias = NULL;
>  	cred.UserName = wusername;
>  
> +	free(secret);
> +
>  	if (!CredWriteW(&cred, 0))
>  		die("CredWrite failed");
>  }
> @@ -265,6 +297,8 @@ static void read_credential(void)
>  			password = utf8_to_utf16_dup(v);
>  		else if (!strcmp(buf, "password_expiry_utc"))
>  			password_expiry_utc = utf8_to_utf16_dup(v);
> +		else if (!strcmp(buf, "oauth_refresh_token"))
> +			oauth_refresh_token = utf8_to_utf16_dup(v);
>  		/*
>  		 * Ignore other lines; we don't know what they mean, but
>  		 * this future-proofs us when later versions of git do
>
> base-commit: bc5204569f7db44d22477485afd52ea410d83743

^ permalink raw reply

* Re: [PATCH v2] glossary: add definitions for dereference & peel
From: Junio C Hamano @ 2023-11-14  4:49 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, ps, Kristoffer Haugsbakk, Victoria Dye
In-Reply-To: <pull.1610.v2.git.1699917471769.gitgitgadget@gmail.com>

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

>      * Removed references to "peeling" a commit; the updated definition
>        discusses "peeling" only in the context of tags.
>      * Added a cross-link from "dereference" to "peel" (one already existed
>        for "peel" to "dereference").
> ...
> +[[def_peel]]peel::
> +	The action of recursively <<def_dereference,dereferencing>> a
> +	<<def_tag_object,tag object>>.
> +

This was a bit surprising to me as I thought we would say "peel the
tag once" vs "peel the tag repeatedly", but upon inspecting the
existing code, documentation, and messages, we seem to mean by "to
peel" to dereference a tag repeatedly until it no longer is a tag,
which the new entry above exactly is (although "until the non-tag
object is revealed" is missing).

Thanks.  Will queue.

^ permalink raw reply

* Re: [PATCH v5 5/5] builtin/merge-tree.c: implement support for `--write-pack`
From: Elijah Newren @ 2023-11-14  3:08 UTC (permalink / raw)
  To: Jeff King
  Cc: Taylor Blau, git, Eric W. Biederman, Junio C Hamano,
	Patrick Steinhardt
In-Reply-To: <20231113220546.GB2065691@coredump.intra.peff.net>

On Mon, Nov 13, 2023 at 2:05 PM Jeff King <peff@peff.net> wrote:
>
> On Fri, Nov 10, 2023 at 08:24:44PM -0500, Taylor Blau wrote:
>
> > > This does mean that for a recursive merge, that you'll get up to 2*N
> > > packfiles, where N is the depth of the recursive merge.
> >
> > We definitely want to avoid that ;-). I think there are a couple of
> > potential directions forward here, but the most promising one I think is
> > due to Johannes who suggests that we write loose objects into a
> > temporary directory with a replace_tmp_objdir() call, and then repack
> > that side directory before migrating a single pack back into the main
> > object store.
>
> I posted an alternative in response to Elijah; the general idea being to
> allow the usual object-lookup code to access the in-progress pack. That
> would keep us limited to a single pack.
>
> It _might_ be a terrible idea. E.g., if you write a non-bulk object that
> references a bulk one, then that non-bulk one is broken from the
> perspective of other processes (until the bulk checkin is flushed). But
> I think we'd always be writing to one or the other here, never
> interleaving?

I think it sounds like a really cool idea, personally.  I also don't
see why any current uses would result in interleaving.

fast-import's created pack has the same kind of restriction...and has
even grown some extensions to work around the need for early access.
In particular, it has a `checkpoint` feature for flushing the current
pack and starting a new one, and also gained the `cat-blob` command
for when folks really wanted to access an object without writing out
the whole packfile.  So, _if_ the need for early access arises, we
have some prior art to look to for ways to provide it.

^ permalink raw reply

* Re: [PATCH v3 2/3] update-index: add --show-index-version
From: Junio C Hamano @ 2023-11-14  2:55 UTC (permalink / raw)
  To: Teng Long; +Cc: git
In-Reply-To: <20231114021839.8275-1-tenglong.tl@alibaba-inc.com>

Teng Long <dyroneteng@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> @@ -1181,15 +1183,20 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>>  
>>  	getline_fn = nul_term_line ? strbuf_getline_nul : strbuf_getline_lf;
>>  	if (preferred_index_format) {
>> -		if (preferred_index_format < INDEX_FORMAT_LB ||
>> -		    INDEX_FORMAT_UB < preferred_index_format)
>> +		if (preferred_index_format < 0) {
>> +			printf(_("%d\n"), the_index.version);
>
> Maybe the "%d\n" shouldn't be translated? :)

Excellent.

^ permalink raw reply

* Re: [PATCH v5 5/5] builtin/merge-tree.c: implement support for `--write-pack`
From: Elijah Newren @ 2023-11-14  2:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Taylor Blau, git, Eric W. Biederman,
	Patrick Steinhardt
In-Reply-To: <xmqqpm0d1591.fsf@gitster.g>

On Mon, Nov 13, 2023 at 5:41 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> > I posted an alternative in response to Elijah; the general idea being to
> > allow the usual object-lookup code to access the in-progress pack. That
> > would keep us limited to a single pack.
>
> If such a mechanism is done in a generic way, would we be able to
> simplify fast-import a lot, I wonder?  IIRC, it had quite a lot of
> code to remember what it has written to its output to work around
> the exact issue your alternative tries to solve.  In fact, maybe we
> could make fast-import a thin wrapper around the bulk checkin
> infrastructure?

fast-import also attempts to delta objects against previous ones as it
writes them to the pack.  That's one thing still lacking from this
solution, but aside from that, it also sounds to me like the bulk
checkin infrastructure is getting closer to becoming a replacement for
much of the fast-import code.

^ permalink raw reply

* Re: [PATCH 1/1] attr: add native file mode values support
From: Junio C Hamano @ 2023-11-14  2:52 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git, tboegi
In-Reply-To: <CAMmZTi-7=L6XOB9-MwZCpD5QaziTi0x6J7m-UTtwQ0S00wnVgQ@mail.gmail.com>

Joanna Wang <jojwang@google.com> writes:

>>> Even if we assume that this code is currently meant to work only
>>> with GIT_ATTR_CHECKIN, I do not think this is what you want.  When
>>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>>> want to exclude the submodules that are already known to this
>>> superproject, but also a repository that _can_ become a submodule of
>>> this superproject when added, no?
> Sorry, I was totally ignorant of ...

Nothing to be sorry about, and if I sounded like I was upset or
something, that wasn't my intention.  Reviewers and more experienced
developers read posted patches to give pieces of advice exactly
because no single human is perfect and knows everything.  We cover
holes in each others' knowledge and attention.

>>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>>> simpler.  You'd see what the path in the index is, among a gitlink,
>>> a regular non-executable file, an executable file, or a symlink.
> I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
> in read_attr(), the indexed .gitattributes file is checked with the
> actual file as fallback or vice versa.
> I would think that we'd only want to use attributes from one state
> (e.g. what's actually in the file)
> or the other (e.g. what's indexed).
> So I guess I'm still not sure what the "direction" concept is.
> For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?

When checking things out of the index, the index should be the
source of the truth.  If something is not in the index, is there a
point in falling back to the workint tree state to decide if the
thing should be checked out of the index?

>>> But stepping back a bit,
>>> such an application is likely marking selected few paths with the
>>> attribute, and paths for which "mode" was "unset" are now given
>>> these natural "mode"; it is inevitable to crash with such uses.
> I'm confused. This does not match what I think is the current behavior
> of my patch.
> If "mode" was unset or removed for a path (meaning '<path> !mode' was
> added to .gitattributes),
> the code in my patch would respect that and not return the native mode.
> It would return 'unspecified' or 'unset'.

But the usual practice is *not* to cover all paths with explicit
attribute definition, isn't it?  If somebody used the "foo"
attribute in their project to decide certain paths are worth giving
special treatments, there are paths with that attribute, perhaps
(totally made up example):

	*.c	foo=yes

Now, if you add a new "built-in" attribute next to 'mode' that
assigns "foo" in attr.c:git_check_attr() the same way to those paths
whose value is still ATTR__UNKNOWN after collect_some_attrs returns,
wouldn't a "hello.c" file get attribute 'foo' with value 'yes',
while a "hello.h" file (not mentioned by .gitattributes) will get
whatever value the built-in logic computed for it?  If that existing
project were using "mode" (instead of "foo"), then doesn't this patch
change the behaviour for them?

>>> Again, this has one hole, I think.  Paths that are not mentioned
>>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>>> and trigger the fallback behaviour, even when other paths are given
>>> end-user specified "mode" attribute values.
> What you are describing sounds correct/what I intended.
> So are you saying that the expected behavior is actually:
> If the user sets 'mode' for 1+ paths in the repo, then the native mode
> fallback should
> NOT be used for all paths in the repo?

That might be workable, but it breaks a well working system suddenly
when somebody uses "mode" in their .gitattibutes and we do not even
warn about the breakage, which is not ideal.

I think, when we see such an attribute is defined while reading from
.gitattributes and friends, we would probably want to warn and
ignore their definition altogether and use *only* the computed
value.  This cannot be done in a backward compatible way, so it
would be better to carve out a namespace (and avoid overly generic
word like "mode") for ourselves to define built-in attributes that
do not overlap with end-user defined attribute names.

Perhaps call this 'builtin-object-mode' or something and document
that any attribute with a name that begins with 'builtin-' will
always get a computed value (possibly "unset"), it is an error
to define such an attribute in .gitattributes system, or something?

^ permalink raw reply

* Re: [PATCH v5 5/5] builtin/merge-tree.c: implement support for `--write-pack`
From: Elijah Newren @ 2023-11-14  2:50 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Jeff King, git, Eric W. Biederman, Junio C Hamano,
	Patrick Steinhardt
In-Reply-To: <ZVKkgpiFaOwwDcdw@nand.local>

On Mon, Nov 13, 2023 at 2:34 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Mon, Nov 13, 2023 at 05:02:54PM -0500, Jeff King wrote:
> > On Fri, Nov 10, 2023 at 03:51:18PM -0800, Elijah Newren wrote:
> >
> > > This is unsafe; the object may need to be read later within the same
> > > merge. [...]
> > >
> > > I think (untested) that you could fix this by creating two packs
> > > instead of just one.  In particular, add a call to
> > > flush_odb_transcation() after the "redo_after_renames" block in
> > > merge_ort_nonrecursive_internal().
> >
> > It might not be too hard to just let in-process callers access the
> > objects we've written. A quick and dirty patch is below, which works
> > with the test you suggested (the test still fails because it finds a
> > conflict, but it gets past the "woah, I can't find that sha1" part).
>
> That's a very slick idea, and I think that this series has some legs to
> stand on now as a result.
>
> There are a few of interesting conclusions we can draw here:
>
>   1. This solves the recursive merge problem that Elijah mentioned
>      earlier where we could generate up to 2^N packs (where N is the
>      maximum depth of the recursive merge).
>
>   2. This also solves the case where the merge-ort code needs to read an
>      object that it wrote earlier on in the same process without having
>      to flush out intermediate packs. So in the best case we need only a
>      single pack (instead of two).
>
>   3. This also solves the 'replay' issue, *and* allows us to avoid the
>      tmp-objdir thing there entirely, since we can simulate object reads
>      in the bulk-checkin pack.
>
> So I think that this is a direction worth pursuing.

Agreed; this looks great.  It's basically bringing fast-import-like
functionality -- writing objects to a single new packfile while making
previous objects accessible to subsequent ones -- to additional
callers.

> In terms of making those lookups faster, I think that what you'd want is
> an oidmap. The overhead is slightly unfortunate, but I think that any
> other solution which requires keeping the written array in sorted order
> would in practice be more expensive as you have to constantly reallocate
> and copy portions of the array around to maintain its invariant.

When comparing the overhead of an oidmap to a bunch of inodes,
however, it seems relatively cheap.  :-)

> > I don't know if that is sufficient, though. Would other spawned
> > processes (hooks, external merge drivers, and so on) need to be able to
> > see these objects, too?
>
> Interesting point. In theory those processes could ask about newly
> created objects, and if they were spawned before the bulk-checkin pack
> was flushed, those lookups would fail.

One of the big design differences that I was pushing really hard with
git-replay was performance and things that came with it -- no
worktree, no per-commit hooks (which are nearly ruled out by no
worktree, but it's still worth calling out separately), etc.  A
post-operation hook could be fine, but would also not get to assume a
worktree.

merge-tree is the same as far as hooks; I'd rather just not have them,
but if we did, they'd be a post-operation hook.

In both cases, that makes hooks not much of a sticking point.

External merge drivers, however...

> But that requires that, e.g. for hooks, that we know a-priori the object
> ID of some newly-written objects. If you wanted to make those lookups
> succeed, I think there are a couple of options:
>
>   - teach child-processes about the bulk-checkin pack, and let them
>     perform the same fake lookup
>
>   - flush (but do not close) the bulk-checkin transaction
>
> In any event, I think that this is a sufficiently rare and niche case
> that we'd be OK to declare that you should not expect the above
> scenarios to work when using `--write-pack`. If someone does ask for
> that feature in the future, we could implement it relatively painlessly
> using one of the above options.

Seems reasonable to me.

^ permalink raw reply

* Re: [PATCH 1/1] attr: add native file mode values support
From: Joanna Wang @ 2023-11-14  2:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Joanna Wang, tboegi

>> Even if we assume that this code is currently meant to work only
>> with GIT_ATTR_CHECKIN, I do not think this is what you want.  When
>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>> want to exclude the submodules that are already known to this
>> superproject, but also a repository that _can_ become a submodule of
>> this superproject when added, no?
Sorry, I was totally ignorant of two key concepts that you mention here.
I somehow missed the concept of git_attr_direction altogether and I also
did not know submodules could be added with git-add (I was only
familiar with how
they are currently added in chromium via git update-index).
This makes sense now. I'll fix in the next patch.

>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>> simpler.  You'd see what the path in the index is, among a gitlink,
>> a regular non-executable file, an executable file, or a symlink.
I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
in read_attr(), the indexed .gitattributes file is checked with the
actual file as fallback or vice versa.
I would think that we'd only want to use attributes from one state
(e.g. what's actually in the file)
or the other (e.g. what's indexed).
So I guess I'm still not sure what the "direction" concept is.
For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?

>> "ls-tree" documentation seems to call it %(objectmode).
I can change it to 'objectmode' in a followup patch, if there are no
objections to this.

>> I think the idea is that "mode" being a too generic word can be used
>> for totally different purposes
My thinking was, no matter how generic or rare a name we choose, there
is always a chance
no matter how tiny, that the name will be in use in someone's .gitattributes.
But if people are ok with choosing something less generic
and have that become a 'reserved' attribute name and not have any
existing values in .gitattributes
take precedence I can do that.
I just don't know how git has historically balanced breaking existing
workflows vs easier/more reasonable
implementation/behavior.

>> But stepping back a bit,
>> such an application is likely marking selected few paths with the
>> attribute, and paths for which "mode" was "unset" are now given
>> these natural "mode"; it is inevitable to crash with such uses.
I'm confused. This does not match what I think is the current behavior
of my patch.
If "mode" was unset or removed for a path (meaning '<path> !mode' was
added to .gitattributes),
the code in my patch would respect that and not return the native mode.
It would return 'unspecified' or 'unset'. I have tests for these in
the patch, but if I've missed some
test cases please let me know.


>> Again, this has one hole, I think.  Paths that are not mentioned
>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>> and trigger the fallback behaviour, even when other paths are given
>> end-user specified "mode" attribute values.
What you are describing sounds correct/what I intended.
So are you saying that the expected behavior is actually:
If the user sets 'mode' for 1+ paths in the repo, then the native mode
fallback should
NOT be used for all paths in the repo?

^ permalink raw reply

* [PATCH v3 2/3] update-index: add --show-index-version
From: Teng Long @ 2023-11-14  2:18 UTC (permalink / raw)
  To: gitster; +Cc: git
In-Reply-To: <20230912193235.776292-3-gitster@pobox.com>

Junio C Hamano <gitster@pobox.com> writes:

> @@ -1181,15 +1183,20 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>  
>  	getline_fn = nul_term_line ? strbuf_getline_nul : strbuf_getline_lf;
>  	if (preferred_index_format) {
> -		if (preferred_index_format < INDEX_FORMAT_LB ||
> -		    INDEX_FORMAT_UB < preferred_index_format)
> +		if (preferred_index_format < 0) {
> +			printf(_("%d\n"), the_index.version);

Maybe the "%d\n" shouldn't be translated? :)

Thanks.

^ permalink raw reply

* Re: [PATCH 02/13] Enable builds for z/OS.
From: Junio C Hamano @ 2023-11-14  1:47 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Haritha D via GitGitGadget, git, Haritha
In-Reply-To: <ZVKrWSv7JguKTSYw@tapette.crustytoothpaste.net>

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I'd generally want to look at the commit message and understand the
> problem the code is trying to solve and then look at the code and think,
> "Oh, yes, this seems like the obvious and logical way to solve this
> problem," or at least think, "Oh, no, I think we should solve this
> problem in a different way," so I can help make a thoughtful review
> comment.  Right now, I lack the information to have an informed opinion
> and so I can't provide any helpful feedback or analysis of the patches.
> ...
> I'd recommend a quick pass over the SubmittingPatches file, which is
> also available at https://git-scm.com/docs/SubmittingPatches.  The
> sections on making separate commits for separate changes and describing
> changes well come to mind as places to focus.
>
> I know this may seem overwhelming and like I'm upset or disappointed;
> I'm definitely not.  I'm very much interested in seeing Git available
> for more platforms, but right now it's too hard for me to reason about
> the changes for z/OS to provide helpful feedback, so I'm hoping you can
> send a fixed v2 that helps me (and everyone else) understand these
> changes better so you can get a helpful review.

All very good pieces of advice.  I suspect we are missing some of
them from our SubmittingPatches or CodingGuidelines documents and
may want to add them there.

Thanks.

^ permalink raw reply

* Re: [PATCH v5 5/5] builtin/merge-tree.c: implement support for `--write-pack`
From: Junio C Hamano @ 2023-11-14  1:40 UTC (permalink / raw)
  To: Jeff King
  Cc: Taylor Blau, Elijah Newren, git, Eric W. Biederman,
	Patrick Steinhardt
In-Reply-To: <20231113220546.GB2065691@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> I posted an alternative in response to Elijah; the general idea being to
> allow the usual object-lookup code to access the in-progress pack. That
> would keep us limited to a single pack.

If such a mechanism is done in a generic way, would we be able to
simplify fast-import a lot, I wonder?  IIRC, it had quite a lot of
code to remember what it has written to its output to work around
the exact issue your alternative tries to solve.  In fact, maybe we
could make fast-import a thin wrapper around the bulk checkin
infrastructure?

^ permalink raw reply

* Re: [PATCH] remote-curl: avoid hang if curl asks for more data after eof
From: Junio C Hamano @ 2023-11-14  1:36 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Jiří Hruška, git, Jeff King
In-Reply-To: <20231113212243.1495815-1-jonathantanmy@google.com>

Jonathan Tan <jonathantanmy@google.com> writes:

> Due to the nature of the bug and the fix, I do agree that it's hard to
> test and I would be OK with including the fix without associated tests.

Is this a bug on our side, or cURL library calling us when it should not?
Even if the latter, we should be prepared and the two liner fix you
suggested would be worth doing, but at the same time we should let
the cURL maintainer know if the latter is the case, and that is why
I am asking.

Thanks, both.  Especially thanks, Jonathan, for excellent review
comments.

^ permalink raw reply

* Re: [PATCH] completion: add and use the __git_get_config_subsection helper function
From: Junio C Hamano @ 2023-11-14  1:08 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Philippe Blain
In-Reply-To: <20231113222528.62771-1-szeder.dev@gmail.com>

SZEDER Gábor <szeder.dev@gmail.com> writes:

> +# Lists all subsections in the given section which contain the given
> +# config variable, with the section and variable names removed.
> +__git_get_config_subsections ()
> +{
> +	local section="$1" var="$2" i IFS=$'\n'
> +	for i in $(__git config --name-only --get-regexp "^$section\..*\.$var$"); do
> +		i=${i#$section.}
> +		i=${i%.$var}

As this script is allowed bash-isms, I wondered if we can use
a single pattern substitution instead of two remove pre/suffix
pattern substitution, but I guess it would not work, and the above
is perfectly readable.

> +		echo "$i"

As the subsection is designed to contain unbounded set of end-user
controlled names, we probably should do

		printf "%s\n" "$i"

instead to protect us from interesting names (e.g. ones that begin
with a dash).

> +	done
> +}

Interesting to see that we do not need to bother deduplicating the
output from here.

> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index a7c3b4eb63..11ed83d0ed 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -2130,6 +2130,19 @@ test_expect_success '__git_get_config_variables' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success '__git_get_config_subsections' '
> +	cat >expect <<-\EOF &&
> +	subsection-1
> +	SubSection-2
> +	sub.section.3
> +	EOF
> +	test_config interesting.subsection-1.name good &&
> +	test_config Interesting.SubSection-2.Name good &&
> +	test_config interesting.sub.section.3.name good &&
> +	__git_get_config_subsections interesting name >actual &&
> +	test_cmp expect actual
> +'

Good to see an uppercase character is used here ;-).

Thanks.

^ permalink raw reply

* Re: [PATCH] checkout: add config variable checkout.autoDetach
From: Junio C Hamano @ 2023-11-14  0:48 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Andy Koppe, git, pclouds
In-Reply-To: <0e37ee23-922c-4bbf-82c3-8f44e9216ab0@gmail.com>

Phillip Wood <phillip.wood123@gmail.com> writes:

>> and good thing to help new users.  I do not know how others react to
>> this kind of proliferation of configuration variables, but I do not
>> mind this particular variable existing.
>
> I'm a bit wary of having a config variable that could break
> scripts relying on the current behavior of "git checkout".  As far
> as "git switch" goes I don't particularly mind this config
> variable though I'm not sure it is that hard to type "--detach"
> (especially with tab completion) and

I do not have much sympathy myself to scripts that are not being
defensive enough to write "--detach" explicitly, but I do understand
and share your concern as the project maintainer.

> ... I do worry that we're making the UI more complex each
> time we add something like this.

Thanks for saying this---this is exactly the kind of reaction I as
expecting to see.

>
> Best Wishes
>
> Phillip

^ permalink raw reply

* Re: [PATCH 02/13] Enable builds for z/OS.
From: Junio C Hamano @ 2023-11-14  0:48 UTC (permalink / raw)
  To: Haritha D via GitGitGadget; +Cc: git, Haritha
In-Reply-To: <098b9ca8ece4fdce45a9b48e576b474ed81dced1.1699871056.git.gitgitgadget@gmail.com>

"Haritha D via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Subject: Re: [PATCH 02/13] Enable builds for z/OS.

Documentation/CodingGuidelines and Documentation/SubmittingPatches
would help here, I think.

>  # Define SHELL_PATH to a POSIX shell if your /bin/sh is broken.
>  #
> +# Define SHELL_PATH_FOR_SCRIPTS to a POSIX shell if your /bin/sh is broken.

The reason to exist for the _FOR_SCRIPTS variants is not justified
anywhere in the proposed log message.

The former should be sufficient, and our policy is to let the
builder specify exactly what binaries the build products depend on,
(instead of random $PATH interfere with the choice by using
"#!/bin/env tool" that also has to assume that everybody's "env" is
installed in "/bin").

This patch has too many #ifdefs in the primary codepaths.  Your
porting strategy may need to be rethought.  Our usual convention is
to encapsulate these platform differences as much as possible in
git-compat-util.h and platform specific files in compat/ directory.

> diff --git a/builtin/hash-object.c b/builtin/hash-object.c
> index 5ffec99dcea..b33b32ff977 100644
> --- a/builtin/hash-object.c
> +++ b/builtin/hash-object.c
> @@ -57,11 +57,39 @@ static void hash_fd(int fd, const char *type, const char *path, unsigned flags,
>  	maybe_flush_or_die(stdout, "hash to stdout");
>  }
>  
> +#ifdef __MVS__
> +#  if (__CHARSET_LIB == 1)
> +#  include <stdio.h>
> +#  include <stdlib.h>
> +
> +   int setbinaryfd(int fd)
> +   {
> +     attrib_t attr;
> +     int rc;

Ahh, OK, I saw [03/13] first and was utterly confused by this thing.
Do not send in such a mess that introduces broken code in an early
step that you need to later say "oops that one was broken and I need
to fix it up with this patch".  "rebase -i" is your friend to clean
up your mess into a logical progression to help readers better
understand what you wrote.

I'll stop here.

^ permalink raw reply

* Re: [PATCH 03/13] spaces and errors fix Handled git pipeline errors
From: Junio C Hamano @ 2023-11-14  0:38 UTC (permalink / raw)
  To: Haritha D via GitGitGadget; +Cc: git, Haritha
In-Reply-To: <e31be0d764f47c21519016729259f8d74a53e21f.1699871056.git.gitgitgadget@gmail.com>

"Haritha D via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Subject: Re: [PATCH 03/13] spaces and errors fix Handled git pipeline errors

-ECANNOTPARSE.  Perhaps Documentation/CodingGuidelines and
 Documentation/SubmittingPatches may help?

> From: Haritha D <harithamma.d@ibm.com>
>
> This PR has fixes to enable build on z/OS

This is way under-explained.  Your proposed log message should be
able to answer when somebody asks "Is anything broken in the
existing codebase to cause your build to fail, or is it your
compiler toolchain that is broken?" but the above does not help
understanding what and why you needed to fix at all.

> diff --git a/builtin/hash-object.c b/builtin/hash-object.c
> index b33b32ff977..9129658a37c 100644
> --- a/builtin/hash-object.c
> +++ b/builtin/hash-object.c
> @@ -62,8 +62,8 @@ static void hash_fd(int fd, const char *type, const char *path, unsigned flags,
>  #  include <stdio.h>
>  #  include <stdlib.h>
>  
> -   int setbinaryfd(int fd)
> -   {
> +int setbinaryfd(int fd)
> +{
>       attrib_t attr;
>       int rc;
>  
> @@ -74,7 +74,7 @@ static void hash_fd(int fd, const char *type, const char *path, unsigned flags,
>  
>       rc = __fchattr(fd, &attr, sizeof(attr));
>       return rc;
> -   }
> +}
>  #  endif
>  #endif

No such function in our codebase.  Are you fixing somebody else's
forked version of Git and we shouldn't even be looking at this
patch, perhaps?

> diff --git a/convert.c b/convert.c
> index 4f14ff6f1ed..17cc849efed 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -1315,15 +1315,28 @@ static struct attr_check *check;
>  
>  static const char* get_platform() {
>  	struct utsname uname_info;
> +	char *result;
> +	if(!uname_info.sysname)
> +	{
> +		result = (char *)malloc(strlen(uname_info.sysname)+1);
> +		int index=0;
> +		while(index <= strlen(uname_info.sysname))
> +		{
> +			*result = uname_info.sysname[index];
> +			++result;
> +			++index;
> +		}
> +	}

No such function in our codebase.  I doubt these patches have much
relevance to this project?

I'll stop here.

^ permalink raw reply

* Re: [PATCH v2] ci: avoid running the test suite _twice_
From: Junio C Hamano @ 2023-11-14  0:24 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Jeff King, Johannes Schindelin
In-Reply-To: <pull.1613.v2.git.1699907108371.gitgitgadget@gmail.com>

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

>     While this tries to fix a bug uncovered by js/doc-unit-tests, to avoid
>     merge conflicts, this is based on ps/ci-gitlab.

Excellent choice of the base, as dropping the state is the right
thing to do regardless of the "unit-tests" thing (as Peff said),
and the refactoring done by ps/ci-gitlab topic is what makes the
problematic state=failed,slow,save spread wider and affects more
jobs.

Will queue.  Thanks.

^ permalink raw reply

* Re: [PATCH] ci: avoid running the test suite _twice_
From: Junio C Hamano @ 2023-11-13 23:55 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin via GitGitGadget, Josh Steadmon, Phillip Wood,
	git, Johannes Schindelin
In-Reply-To: <20231113184909.GB3838361@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> I do have to wonder, though, as somebody who did not follow the
> unit-test topic closely: why are the unit tests totally separate from
> the rest of the suite? I would think we'd want them run from one or more
> t/t*.sh scripts. That would make bugs like this impossible, but also:
>
>   1. They'd be run via "make test", so developers don't have to remember
>      to run them separately.
>
>   2. They can be run in parallel with all of the other tests when using
>      "prove -j", etc.

Very good points.  Josh?

^ permalink raw reply

* Re: commit-graph paranoia performance, was Re: [ANNOUNCE] Git v2.43.0-rc1
From: Junio C Hamano @ 2023-11-13 23:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git, Karthik Nayak
In-Reply-To: <20231113205538.GA2028092@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Should we default GIT_COMMIT_GRAPH_PARANOIA to "0"? Yes, some operations
> might miss a breakage, but that is true of so much of Git. For day to
> day commands we generally assume that the repository is not corrupted,
> and avoid looking at any data we can. Other commands (like "commit-graph
> verify", but maybe others) would probably want to be more careful
> (either by checking this case explicitly, or by enabling the paranoia
> flag themselves).

I am obviously fine with that direction, as that was exactly the
stance I took when we discussed the topic on "rev-list --missing"
;-)

Patrick?  Karthik?

^ permalink raw reply

* Re: [PATCH v4 0/3] t: improve compatibility with NixOS
From: Junio C Hamano @ 2023-11-13 23:42 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King
In-Reply-To: <ZVHM-0d7g45P3Uoj@tanuki>

Patrick Steinhardt <ps@pks.im> writes:

> Fair enough. I assumed that it would ease your workload instead of
> creating more work for you. But I'll keep in mind that it doesn't and
> refrain from doing this in the future.

Thanks.  

With or without conflicts, basing your work on 'next' would not be a
good idea to begin with, as we never merge 'next' down to 'master'
(we only merge individual topics).

The following applies not specifically to you but to all
contributors.

What is helpful is in a tricky case is to start your topic branch
development at an appropriate base (often the tip of 'master', but
for a bugfix the tip of 'maint'), merge in other topics in flight
you depend on that are not in the target base (which limits what you
can depend on if you are writing a bugfix on 'maint'), and then
start building on top of the merge result.  Then communicate how you
built this (artificial) base clearly in your cover letter.

Whether you have "other topics in flight" merged to your base or
not, creating a trial merge of your topic to seen to see how others'
work-in-progress may interact with your work would be a valuable way
to keep aware of what is cooking and how you will be affected.

Thanks.

^ permalink raw reply

* Re: [PATCH v2] fuzz: add new oss-fuzz fuzzer for date.c / date.h
From: Junio C Hamano @ 2023-11-13 23:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Arthur Chan via GitGitGadget, git, Arthur Chan
In-Reply-To: <20231113183509.GA3838361@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Mon, Nov 13, 2023 at 04:22:48PM +0000, Arthur Chan via GitGitGadget wrote:
>
>> +	str = (char *)malloc(size + 1);
>> +	if (!str)
>> +		return 0;
>> +	memcpy(str, data, size);
>> +	str[size] = '\0';
>
> Is it important that we avoid calling die() if the malloc fails here?
>
> The usual way to write this in our code base is just:
>
>   str = xmemdupz(data, size);
>
> It's not entirely a style thing; we sometimes audit the code base
> looking for computations on malloc sizes (for integer overflows) as well
> as sites that should be using xmalloc and are not. Obviously we can
> exclude oss-fuzz/ from such audits, but if there's no reason not to
> prefer our usual style, it's one less thing to worry about.

Good point.  Thanks.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox