Git development
 help / color / mirror / Atom feed
* Re: [PATCH v3] sideband.c: remove redundant 'NEEDSWORK' tag
From: Junio C Hamano @ 2023-12-27 22:22 UTC (permalink / raw)
  To: Chandra Pratap via GitGitGadget
  Cc: git, Torsten Bögershausen, Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1625.v3.git.1703672407895.gitgitgadget@gmail.com>

"Chandra Pratap via GitGitGadget" <gitgitgadget@gmail.com> writes:

>       - * NEEDSWORK: use "size_t n" instead for clarity.
>      ++ * It is fine to use "int n" here instead of "size_t n" as all calls to this
>      ++ * function pass an 'int' parameter.

This does not sound like a sufficient justification, though.

We should also explain why "int" is good enough for these callers.
Otherwise, using size_t throughout the callchain would become
another viable solution.

^ permalink raw reply

* Re: [PATCH 1/1] doc: use singular form of repeatable path arg
From: Eric Sunshine @ 2023-12-27 22:29 UTC (permalink / raw)
  To: Britton Leo Kerin; +Cc: git, Britton L Kerin
In-Reply-To: <3d46bca1-96d4-43ba-a912-1f7c76942287@smtp-relay.sendinblue.com>

On Wed, Dec 27, 2023 at 3:55 PM Britton Leo Kerin
<britton.kerin@gmail.com> wrote:
> This is more correct because the <path>... doc syntax already indicates
> that the arg is "array-type".  It's how other tools do it.  Finally, the
> later document text mentions 'path' arguments, while it doesn't mention
> 'paths'.

Yep, makes sense.

> Signed-off-by: Britton L Kerin <britton.kergin@gmail.com>
> ---
> diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
> @@ -17,7 +17,7 @@ The command takes various subcommands, and different options depending
>   git bisect start [--term-(new|bad)=<term-new> --term-(old|good)=<term-old>]
> -                 [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<paths>...]
> +                 [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<path>...]

Looking good.

In builtin/bisect.c, the "usage" string says "[<pathspec>...]" rather
than "[<path>...]". Perhaps it makes sense to unify these?

Also, there are a few more documentation files that could use the
"<paths>" to "<path>..." fixup (though not always in the synopsis). A
'grep' indicates that git-checkout.txt, git-diff.txt, and
git-rev-list-options.txt also mention "<paths>". Those may be outside
the scope of this patch, although they could easily be included, as
well, or made part of a patch series if you feel inclined.

^ permalink raw reply

* Re: [PATCH] Port helper/test-ctype.c to unit-tests/t-ctype.c
From: Junio C Hamano @ 2023-12-27 23:48 UTC (permalink / raw)
  To: René Scharfe
  Cc: Christian Couder, Achu Luma, git, Christian Couder, Phillip Wood,
	Josh Steadmon
In-Reply-To: <f743b473-40f8-423d-bf5b-d42b92e5aa1b@web.de>

René Scharfe <l.s.r@web.de> writes:

>> Also it might not be a big issue here, but when the new unit test
>> framework was proposed, I commented on the fact that "left" and
>> "right" were perhaps a bit less explicit than "actual" and "expected".
>
> True.
> ...
> The added repetition is a bit grating.  With a bit of setup, loop
> unrolling and stringification you can retain the property of only having
> to mention the class name once.  Demo patch below.

Nice.

This (and your mempool thing) being one of the early efforts to
adopt the unit-test framework outside the initial set of sample
tests, it is understandable that we might find what framework offers
is still lacking.  But at the same time, while the macro tricks
demonstrated here are all amusing to read and admire, it feels a bit
too much to expect that the test writers are willing to invent
something like these every time they want to test.

Being a relatively faithful conversion of the original ctype tests,
with its thorough enumeration of test samples and expected output,
is what makes this test program require these macro tricks, and it
does not have much to do with the features (or lack thereof) of the
framework, I guess.

> +struct ctype {
> +	const char *name;
> +	const char *expect;
> +	int actual[256];
> +};
> +
> +static void test_ctype(const struct ctype *class)
> +{
> +	for (int i = 0; i < 256; i++) {
> +		int expect = is_in(class->expect, i);
> +		int actual = class->actual[i];
> +		int res = test_assert(TEST_LOCATION(), class->name,
> +				      actual == expect);
> +		if (!res)
> +			test_msg("%s classifies char %d (0x%02x) wrongly",
> +				 class->name, i, i);
> +	}
>  }

Somehow, the "test_assert" does not seem to be adding much value
here (i.e. we can do "res = (actual == expect)" there).  Is this
because we want to be able to report success, too?

    ... goes and looks at test_assert() ...

Ah, is it because we want to be able to "skip" (which pretends that
the assert() was satisified).  OK, but then the error reporting from
it is redundant with our own test_msg().  

Everything below this line was a fun read ;-)

Thanks.

> ...
> +#define APPLY16(f, n) \
> +	f(n + 0x0), f(n + 0x1), f(n + 0x2), f(n + 0x3), \
> +	f(n + 0x4), f(n + 0x5), f(n + 0x6), f(n + 0x7), \
> +	f(n + 0x8), f(n + 0x9), f(n + 0xa), f(n + 0xb), \
> +	f(n + 0xc), f(n + 0xd), f(n + 0xe), f(n + 0xf)
> +#define APPLY256(f) \
> +	APPLY16(f, 0x00), APPLY16(f, 0x10), APPLY16(f, 0x20), APPLY16(f, 0x30),\
> +	APPLY16(f, 0x40), APPLY16(f, 0x50), APPLY16(f, 0x60), APPLY16(f, 0x70),\
> +	APPLY16(f, 0x80), APPLY16(f, 0x90), APPLY16(f, 0xa0), APPLY16(f, 0xb0),\
> +	APPLY16(f, 0xc0), APPLY16(f, 0xd0), APPLY16(f, 0xe0), APPLY16(f, 0xf0),\
> +
> +#define CTYPE(name, expect) { #name, expect, { APPLY256(name) }  }
>
>  int cmd_main(int argc, const char **argv) {
> +	struct ctype classes[] = {
> +		CTYPE(isdigit, DIGIT),
> +		CTYPE(isspace, " \n\r\t"),
> +		CTYPE(isalpha, LOWER UPPER),
> +		CTYPE(isalnum, LOWER UPPER DIGIT),
> +		CTYPE(is_glob_special, "*?[\\"),
> +		CTYPE(is_regex_special, "$()*+.?[\\^{|"),
> +		CTYPE(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~"),
> +		CTYPE(isascii, ASCII),
> +		CTYPE(islower, LOWER),
> +		CTYPE(isupper, UPPER),
> +		CTYPE(iscntrl, CNTRL),
> +		CTYPE(ispunct, PUNCT),
> +		CTYPE(isxdigit, DIGIT "abcdefABCDEF"),
> +		CTYPE(isprint, LOWER UPPER DIGIT PUNCT " "),
> +	};
>  	/* Run all character type tests */
> -	TEST(test_ctype_isspace(), "isspace() works as we expect");
> -	TEST(test_ctype_isdigit(), "isdigit() works as we expect");
> -	TEST(test_ctype_isalpha(), "isalpha() works as we expect");
> -	TEST(test_ctype_isalnum(), "isalnum() works as we expect");
> -	TEST(test_ctype_is_glob_special(), "is_glob_special() works as we expect");
> -	TEST(test_ctype_is_regex_special(), "is_regex_special() works as we expect");
> -	TEST(test_ctype_is_pathspec_magic(), "is_pathspec_magic() works as we expect");
> -	TEST(test_ctype_isascii(), "isascii() works as we expect");
> -	TEST(test_ctype_islower(), "islower() works as we expect");
> -	TEST(test_ctype_isupper(), "isupper() works as we expect");
> -	TEST(test_ctype_iscntrl(), "iscntrl() works as we expect");
> -	TEST(test_ctype_ispunct(), "ispunct() works as we expect");
> -	TEST(test_ctype_isxdigit(), "isxdigit() works as we expect");
> -	TEST(test_ctype_isprint(), "isprint() works as we expect");
> +	for (int i = 0; i < ARRAY_SIZE(classes); i++)
> +		TEST(test_ctype(&classes[i]), "%s works", classes[i].name);
>
>  	return test_done();
>  }

^ permalink raw reply

* Re: [PATCH 1/1] doc: use singular form of repeatable path arg
From: Junio C Hamano @ 2023-12-28  0:10 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Britton Leo Kerin, git, Britton L Kerin
In-Reply-To: <CAPig+cShsSSd-jpvSW_sq3-R++zjtHU-m2PmTsz-Nx9YVRStug@mail.gmail.com>

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Wed, Dec 27, 2023 at 3:55 PM Britton Leo Kerin
> <britton.kerin@gmail.com> wrote:
>> This is more correct because the <path>... doc syntax already indicates
>> that the arg is "array-type".  It's how other tools do it.  Finally, the
>> later document text mentions 'path' arguments, while it doesn't mention
>> 'paths'.
>
> Yep, makes sense.
>
>> Signed-off-by: Britton L Kerin <britton.kergin@gmail.com>

Micronit.  This should be identical to how From: line identifies the
author.  s/L/Leo/ should be sufficient, I presume?

>> ---
>> diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
>> @@ -17,7 +17,7 @@ The command takes various subcommands, and different options depending
>>   git bisect start [--term-(new|bad)=<term-new> --term-(old|good)=<term-old>]
>> -                 [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<paths>...]
>> +                 [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<path>...]
>
> Looking good.
>
> In builtin/bisect.c, the "usage" string says "[<pathspec>...]" rather
> than "[<path>...]". Perhaps it makes sense to unify these?

Yup, and that would be a good thing to do in a single patch, and
also it would be a good place to stop.  Further clean-up you
suggested below are very much worth doing, but it probably is good
to leave out of this single focused fix we are reviewing here and
instead be done as separate patch(es).

Thanks.

> Also, there are a few more documentation files that could use the
> "<paths>" to "<path>..." fixup (though not always in the synopsis). A
> 'grep' indicates that git-checkout.txt, git-diff.txt, and
> git-rev-list-options.txt also mention "<paths>". Those may be outside
> the scope of this patch, although they could easily be included, as
> well, or made part of a patch series if you feel inclined.

^ permalink raw reply

* Re: [PATCH 2/3] sparse-checkout: use default patterns for 'set' only !stdin
From: Junio C Hamano @ 2023-12-28  0:18 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git
In-Reply-To: <CABPp-BF3Vqq8Y8+2pQyO7-StUVPR3as+Q_aL+o92ydoJY-8zEw@mail.gmail.com>

Elijah Newren <newren@gmail.com> writes:

>> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
>> index 8f55127202..04ae81fce8 100644
>> --- a/builtin/sparse-checkout.c
>> +++ b/builtin/sparse-checkout.c
>> @@ -837,7 +837,7 @@ static int sparse_checkout_set(int argc, const char **argv, const char *prefix)
>>          * non-cone mode, if nothing is specified, manually select just the
>>          * top-level directory (much as 'init' would do).
>>          */
>> -       if (!core_sparse_checkout_cone && argc == 0) {
>> +       if (!core_sparse_checkout_cone && !set_opts.use_stdin && argc == 0) {
>>                 argv = default_patterns;
>>                 argc = default_patterns_nr;
>>         } else {
>> --
>> 2.43.0-174-g055bb6e996
>
> Thanks for catching this; the fix looks good to me.

Actually I am not so sure.

An obvious alternative would be to collect the patterns, either from
the command line or from the standard input, and then use the
default when we collected nothing.

But I guess those who use the standard input should be able to
specify everything fully, so it would probably be OK.

^ permalink raw reply

* What's cooking in git.git (Dec 2023, #05; Wed, 27)
From: Junio C Hamano @ 2023-12-28  4:25 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking in my tree.  Commits
prefixed with '+' are in 'next' (being in 'next' is a sign that a
topic is stable enough to be used and are candidate to be in a
future release).  Commits prefixed with '-' are only in 'seen', and
aren't considered "accepted" at all and may be annotated with an URL
to a message that raises issues but they are no means exhaustive.  A
topic without enough support may be discarded after a long period of
no activity (of course they can be resubmit when new interests
arise).

The RelNotes symbolic link says we are now working towards Git 2.44.
It may not be a bad idea to reflect on what technical debt and UI
warts we have accumulated so far to see if we have enough of them to
start planning for a breaking Git 3.0 release (or, of course, keep
incrementally improve the system, which is much more preferrable---
continuity and stability is good).  End of year being a relatively
quiet period, it may be a good time to think about your favorite pet
peeve, to be discussed early next year.

Copies of the source code to Git live in many repositories, and the
following is a list of the ones I push into or their mirrors.  Some
repositories have only a subset of branches.

With maint, master, next, seen, todo:

	git://git.kernel.org/pub/scm/git/git.git/
	git://repo.or.cz/alt-git.git/
	https://kernel.googlesource.com/pub/scm/git/git/
	https://github.com/git/git/
	https://gitlab.com/git-vcs/git/

With all the integration branches and topics broken out:

	https://github.com/gitster/git/

Even though the preformatted documentation in HTML and man format
are not sources, they are published in these repositories for
convenience (replace "htmldocs" with "manpages" for the manual
pages):

	git://git.kernel.org/pub/scm/git/git-htmldocs.git/
	https://github.com/gitster/git-htmldocs.git/

Release tarballs are available at:

	https://www.kernel.org/pub/software/scm/git/

--------------------------------------------------
[Graduated to 'master']

* es/add-doc-list-short-form-of-all-in-synopsis (2023-12-15) 1 commit
  (merged to 'next' on 2023-12-18 at a4f20da2bf)
 + git-add.txt: add missing short option -A to synopsis

 Doc update.
 source: <20231215204333.1253-1-ericsunshine@charter.net>


* jc/checkout-B-branch-in-use (2023-12-13) 2 commits
  (merged to 'next' on 2023-12-14 at 0a3998619e)
 + checkout: forbid "-B <branch>" from touching a branch used elsewhere
 + checkout: refactor die_if_checked_out() caller

 "git checkout -B <branch> [<start-point>]" allowed a branch that is
 in use in another worktree to be updated and checked out, which
 might be a bit unexpected.  The rule has been tightened, which is a
 breaking change.  "--ignore-other-worktrees" option is required to
 unbreak you, if you are used to the current behaviour that "-B"
 overrides the safety.
 source: <xmqqjzq9cl70.fsf@gitster.g>


* jc/diff-cached-fsmonitor-fix (2023-09-15) 3 commits
  (merged to 'next' on 2023-12-15 at 4aa7596593)
 + diff-lib: fix check_removed() when fsmonitor is active
 + Merge branch 'jc/fake-lstat' into jc/diff-cached-fsmonitor-fix
 + Merge branch 'js/diff-cached-fsmonitor-fix' into jc/diff-cached-fsmonitor-fix
 (this branch uses jc/fake-lstat.)

 The optimization based on fsmonitor in the "diff --cached"
 codepath is resurrected with the "fake-lstat" introduced earlier.
 cf. <e5295dbe-94d2-3186-5663-2466eba4bdde@jeffhostetler.com>
 source: <xmqqr0n0h0tw.fsf@gitster.g>


* jc/doc-misspelt-refs-fix (2023-12-18) 1 commit
  (merged to 'next' on 2023-12-18 at e7799fd5c9)
 + doc: format.notes specify a ref under refs/notes/ hierarchy

 Doc update.
 source: <xmqqjzpfje33.fsf_-_@gitster.g>


* jc/doc-most-refs-are-not-that-special (2023-12-15) 5 commits
  (merged to 'next' on 2023-12-18 at aead30fcc8)
 + docs: MERGE_AUTOSTASH is not that special
 + docs: AUTO_MERGE is not that special
 + refs.h: HEAD is not that special
 + git-bisect.txt: BISECT_HEAD is not that special
 + git.txt: HEAD is not that special

 Doc updates.
 source: <20231215203245.3622299-1-gitster@pobox.com>


* jc/fake-lstat (2023-09-15) 1 commit
  (merged to 'next' on 2023-12-15 at 48e34cc0b4)
 + cache: add fake_lstat()
 (this branch is used by jc/diff-cached-fsmonitor-fix.)

 A new helper to let us pretend that we called lstat() when we know
 our cache_entry is up-to-date via fsmonitor.
 cf. <e5295dbe-94d2-3186-5663-2466eba4bdde@jeffhostetler.com>
 source: <xmqqcyykig1l.fsf@gitster.g>


* jk/mailinfo-iterative-unquote-comment (2023-12-14) 2 commits
  (merged to 'next' on 2023-12-18 at 92363605fd)
 + mailinfo: avoid recursion when unquoting From headers
 + t5100: make rfc822 comment test more careful
 (this branch uses jk/mailinfo-oob-read-fix.)

 The code to parse the From e-mail header has been updated to avoid
 recursion.
 source: <20231214214444.GB2297853@coredump.intra.peff.net>


* jk/mailinfo-oob-read-fix (2023-12-12) 1 commit
  (merged to 'next' on 2023-12-14 at 0dcfcb0d02)
 + mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
 (this branch is used by jk/mailinfo-iterative-unquote-comment.)

 OOB read fix.
 source: <20231212221243.GA1656116@coredump.intra.peff.net>


* jx/fetch-atomic-error-message-fix (2023-12-18) 2 commits
  (merged to 'next' on 2023-12-18 at a1988b00e5)
 + fetch: no redundant error message for atomic fetch
 + t5574: test porcelain output of atomic fetch

 "git fetch --atomic" issued an unnecessary empty error message,
 which has been corrected.
 cf. <ZX__e7VjyLXIl-uV@tanuki>
 source: <cover.1702821462.git.zhiyou.jx@alibaba-inc.com>


* ps/chainlint-self-check-update (2023-12-15) 1 commit
  (merged to 'next' on 2023-12-18 at 0de2e1807f)
 + tests: adjust whitespace in chainlint expectations

 Test framework update.
 source: <fb312f559de7b99244e4c86a995250599cd9be06.1702622508.git.ps@pks.im>


* ps/clone-into-reftable-repository (2023-12-12) 7 commits
  (merged to 'next' on 2023-12-19 at adf7eb1f84)
 + builtin/clone: create the refdb with the correct object format
 + builtin/clone: skip reading HEAD when retrieving remote
 + builtin/clone: set up sparse checkout later
 + builtin/clone: fix bundle URIs with mismatching object formats
 + remote-curl: rediscover repository when fetching refs
 + setup: allow skipping creation of the refdb
 + setup: extract function to create the refdb
 (this branch is used by ps/refstorage-extension.)

 "git clone" has been prepared to allow cloning a repository with
 non-default hash function into a repository that uses the reftable
 backend.
 source: <cover.1702361370.git.ps@pks.im>


* ps/reftable-fixes (2023-12-11) 11 commits
  (merged to 'next' on 2023-12-15 at ebba966016)
 + reftable/block: reuse buffer to compute record keys
 + reftable/block: introduce macro to initialize `struct block_iter`
 + reftable/merged: reuse buffer to compute record keys
 + reftable/stack: fix use of unseeded randomness
 + reftable/stack: fix stale lock when dying
 + reftable/stack: reuse buffers when reloading stack
 + reftable/stack: perform auto-compaction with transactional interface
 + reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
 + reftable: handle interrupted writes
 + reftable: handle interrupted reads
 + reftable: wrap EXPECT macros in do/while
 (this branch is used by ps/reftable-fixes-and-optims.)

 Bunch of small fix-ups to the reftable code.
 source: <cover.1702285387.git.ps@pks.im>


* rs/c99-stdbool-test-balloon (2023-12-18) 1 commit
  (merged to 'next' on 2023-12-18 at 5a62aaa127)
 + git-compat-util: convert skip_{prefix,suffix}{,_mem} to bool

 Test balloon to use C99 "bool" type from <stdbool.h>.
 source: <2d30dc36-6091-4b47-846f-92d3f4a8b135@web.de>


* rs/show-ref-incompatible-options (2023-12-11) 1 commit
  (merged to 'next' on 2023-12-18 at 5a092285f7)
 + show-ref: use die_for_incompatible_opt3()

 Code clean-up for sanity checking of command line options for "git
 show-ref".
 source: <e5304253-3347-4900-bbf2-d3c6ee3fb976@web.de>


* rs/t6300-compressed-size-fix (2023-12-12) 1 commit
  (merged to 'next' on 2023-12-19 at 37ed09549c)
 + t6300: avoid hard-coding object sizes

 Test fix.
 source: <9feeb6cf-aabf-4002-917f-3f6c27547bc8@web.de>


* sp/test-i18ngrep (2023-12-18) 1 commit
  (merged to 'next' on 2023-12-18 at d54442693a)
 + test-lib-functions.sh: fix test_grep fail message wording

 Error message fix in the test framework.
 source: <20231203171956.771-1-shreyanshpaliwalcmsmn@gmail.com>

--------------------------------------------------
[New Topics]

* ml/doc-merge-updates (2023-12-20) 2 commits
 - Documentation/git-merge.txt: use backticks for command wrapping
 - Documentation/git-merge.txt: fix reference to synopsis

 Doc update.

 Will merge to 'next'.
 source: <20231220195342.17590-1-mi.al.lohmann@gmail.com>


* cp/apply-core-filemode (2023-12-26) 3 commits
 - apply: code simplification
 - apply: correctly reverse patch's pre- and post-image mode bits
 - apply: ignore working tree filemode when !core.filemode

 "git apply" on a filesystem without filemode support have learned
 to take a hint from what is in the index for the path, even when
 not working with the "--index" or "--cached" option, when checking
 the executable bit match what is required by the preimage in the
 patch.

 Needs review.
 source: <20231226233218.472054-1-gitster@pobox.com>


* jc/archive-list-with-extra-args (2023-12-21) 1 commit
 - archive: "--list" does not take further options

 "git archive --list extra garbage" silently ignored excess command
 line parameters, which has been corrected.

 Will merge to 'next'.
 source: <xmqqmsu3mnix.fsf@gitster.g>


* jk/t1006-cat-file-objectsize-disk (2023-12-21) 1 commit
 - t1006: add tests for %(objectsize:disk)

 Test update.

 Will merge to 'next'.
 source: <20231221094722.GA570888@coredump.intra.peff.net>


* js/contributor-docs-updates (2023-12-21) 9 commits
 - SubmittingPatches: hyphenate non-ASCII
 - SubmittingPatches: clarify GitHub artifact format
 - SubmittingPatches: clarify GitHub visual
 - SubmittingPatches: improve extra tags advice
 - SubmittingPatches: update extra tags list
 - SubmittingPatches: discourage new trailers
 - SubmittingPatches: drop ref to "What's in git.git"
 - CodingGuidelines: write punctuation marks
 - CodingGuidelines: move period inside parentheses

 Doc update.

 Expecting a reroll, but basically looking good.
 source: <pull.1623.v2.git.1703176865.gitgitgadget@gmail.com>


* al/unit-test-ctype (2023-12-26) 1 commit
 - unit-tests: rewrite t/helper/test-ctype.c as a unit test.

 Move test-ctype helper to the unit-test framework.

 Expecting a reroll.
 source: <20231221231527.8130-1-ach.lumap@gmail.com>


* bk/bisect-doc-fix (2023-12-27) 1 commit
 - doc: use singular form of repeatable path arg

 Synopsis fix.

 Expecting a reroll.
 source: <3d46bca1-96d4-43ba-a912-1f7c76942287@smtp-relay.sendinblue.com>


* en/sparse-checkout-eoo (2023-12-26) 2 commits
 - sparse-checkout: be consistent with end of options markers
 - Merge branch 'jk/end-of-options' into jc/sparse-checkout-set-add-end-of-options

 "git sparse-checkout (add|set) --[no-]cone --end-of-options" did
 not handle "--end-of-options" correctly after a recent update.

 Will merge to 'next'.
 source: <pull.1625.v2.git.git.1703619562639.gitgitgadget@gmail.com>


* ja/doc-placeholders-fix (2023-12-26) 2 commits
 - doc: enforce placeholders in documentation
 - doc: enforce dashes in placeholders

 Docfix.

 Needs review.
 source: <pull.1626.git.1703539287.gitgitgadget@gmail.com>


* jc/sparse-checkout-set-default-fix (2023-12-26) 1 commit
 - sparse-checkout: use default patterns for 'set' only !stdin

 "git sparse-checkout set" added default patterns even when the
 patterns are being fed from the standard input, which has been
 corrected.

 Will merge to 'next'.
 source: <20231221065925.3234048-3-gitster@pobox.com>


* rs/fast-import-simplify-mempool-allocation (2023-12-26) 1 commit
 - fast-import: use mem_pool_calloc()

 Code simplification.

 Will merge to 'next'.
 source: <50c1f410-ca37-4c1c-a28b-3e9fad49f2b4@web.de>


* rs/mem-pool-improvements (2023-12-26) 2 commits
 - mem-pool: simplify alignment calculation
 - mem-pool: fix big allocations

 MemPool allocator fixes.

 Will merge to 'next'.
 source: <3e15d11a-bd19-49ca-b674-9b50e0ba7fc2@web.de>

--------------------------------------------------
[Cooking]

* jc/retire-cas-opt-name-constant (2023-12-19) 1 commit
  (merged to 'next' on 2023-12-21 at 39ef057c8b)
 + remote.h: retire CAS_OPT_NAME

 Code clean-up.

 Will merge to 'master'.
 source: <xmqq5y0uc7tq.fsf@gitster.g>


* rs/rebase-use-strvec-pushf (2023-12-20) 1 commit
  (merged to 'next' on 2023-12-20 at ecb190973c)
 + rebase: use strvec_pushf() for format-patch revisions

 Code clean-up.

 Will merge to 'master'.
 source: <4ab7431c-6c1b-448c-b4d2-e8b9be0e4eef@web.de>


* ps/refstorage-extension (2023-12-20) 13 commits
 - t9500: write "extensions.refstorage" into config
 - builtin/clone: introduce `--ref-format=` value flag
 - builtin/init: introduce `--ref-format=` value flag
 - builtin/rev-parse: introduce `--show-ref-format` flag
 - t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
 - setup: introduce GIT_DEFAULT_REF_FORMAT envvar
 - setup: introduce "extensions.refStorage" extension
 - setup: set repository's formats on init
 - setup: start tracking ref storage format when
 - refs: refactor logic to look up storage backends
 - worktree: skip reading HEAD when repairing worktrees
 - t: introduce DEFAULT_REPO_FORMAT prereq
 - Merge branch 'ps/clone-into-reftable-repository' into ps/refstorage-extension

 Introduce a new extension "refstorage" so that we can mark a
 repository that uses a non-default ref backend, like reftable.

 Needs review.
 source: <cover.1703067989.git.ps@pks.im>


* ps/reftable-fixes-and-optims (2023-12-20) 9 commits
 - SQUASH??? make "make hdr-check" pass
 - reftable/merged: transfer ownership of records when iterating
 - reftable/merged: really reuse buffers to compute record keys
 - reftable/record: store "val2" hashes as static arrays
 - reftable/record: store "val1" hashes as static arrays
 - reftable/record: constify some parts of the interface
 - reftable/writer: fix index corruption when writing multiple indices
 - reftable/stack: do not overwrite errors when compacting
 - Merge branch 'ps/reftable-fixes' into ps/reftable-fixes-and-optims

 More fixes and optimizations to the reftable backend.

 Needs review.
 source: <cover.1703063544.git.ps@pks.im>


* ps/pseudo-refs (2023-12-14) 4 commits
  (merged to 'next' on 2023-12-21 at 3460e3d667)
 + bisect: consistently write BISECT_EXPECTED_REV via the refdb
 + refs: complete list of special refs
 + refs: propagate errno when reading special refs fails
 + wt-status: read HEAD and ORIG_HEAD via the refdb

 Assorted changes around pseudoref handling.

 Will merge to 'master'.
 source: <cover.1702560829.git.ps@pks.im>


* tb/multi-pack-verbatim-reuse (2023-12-14) 26 commits
 - t/perf: add performance tests for multi-pack reuse
 - pack-bitmap: enable reuse from all bitmapped packs
 - pack-objects: allow setting `pack.allowPackReuse` to "single"
 - t/test-lib-functions.sh: implement `test_trace2_data` helper
 - pack-objects: add tracing for various packfile metrics
 - pack-bitmap: prepare to mark objects from multiple packs for reuse
 - pack-revindex: implement `midx_pair_to_pack_pos()`
 - pack-revindex: factor out `midx_key_to_pack_pos()` helper
 - midx: implement `midx_preferred_pack()`
 - git-compat-util.h: implement checked size_t to uint32_t conversion
 - pack-objects: include number of packs reused in output
 - pack-objects: prepare `write_reused_pack_verbatim()` for multi-pack reuse
 - pack-objects: prepare `write_reused_pack()` for multi-pack reuse
 - pack-objects: pass `bitmapped_pack`'s to pack-reuse functions
 - pack-objects: keep track of `pack_start` for each reuse pack
 - pack-objects: parameterize pack-reuse routines over a single pack
 - pack-bitmap: return multiple packs via `reuse_partial_packfile_from_bitmap()`
 - pack-bitmap: simplify `reuse_partial_packfile_from_bitmap()` signature
 - ewah: implement `bitmap_is_empty()`
 - pack-bitmap: pass `bitmapped_pack` struct to pack-reuse functions
 - midx: implement `midx_locate_pack()`
 - midx: implement `BTMP` chunk
 - midx: factor out `fill_pack_info()`
 - pack-bitmap: plug leak in find_objects()
 - pack-bitmap-write: deep-clear the `bb_commit` slab
 - pack-objects: free packing_data in more places

 Streaming spans of packfile data used to be done only from a
 single, primary, pack in a repository with multiple packfiles.  It
 has been extended to allow reuse from other packfiles, too.

 Will merge to 'next'?
 cf. <ZXurD1NTZ4TAs7WZ@nand.local>
 source: <cover.1702592603.git.me@ttaylorr.com>


* jc/bisect-doc (2023-12-09) 1 commit
 - bisect: document "terms" subcommand more fully

 Doc update.

 Needs review.
 source: <xmqqzfyjmk02.fsf@gitster.g>


* sh/completion-with-reftable (2023-12-19) 2 commits
  (merged to 'next' on 2023-12-20 at 7957d4aa5b)
 + completion: support pseudoref existence checks for reftables
 + completion: refactor existence checks for pseudorefs

 Command line completion script (in contrib/) learned to work better
 with the reftable backend.

 Will merge to 'master'.
 source: <cover.1703022850.git.stanhu@gmail.com>


* en/header-cleanup (2023-12-26) 12 commits
 - treewide: remove unnecessary includes in source files
 - treewide: add direct includes currently only pulled in transitively
 - trace2/tr2_tls.h: remove unnecessary include
 - submodule-config.h: remove unnecessary include
 - pkt-line.h: remove unnecessary include
 - line-log.h: remove unnecessary include
 - http.h: remove unnecessary include
 - fsmonitor--daemon.h: remove unnecessary includes
 - blame.h: remove unnecessary includes
 - archive.h: remove unnecessary include
 - treewide: remove unnecessary includes in source files
 - treewide: remove unnecessary includes from header files

 Remove unused header "#include".

 Will merge to 'next'.
 source: <pull.1617.v2.git.1703351700.gitgitgadget@gmail.com>


* jc/orphan-unborn (2023-11-24) 2 commits
  (merged to 'next' on 2023-12-21 at 030300487a)
 + orphan/unborn: fix use of 'orphan' in end-user facing messages
 + orphan/unborn: add to the glossary and use them consistently

 Doc updates to clarify what an "unborn branch" means.

 Will merge to 'master'.
 source: <xmqq4jhb977x.fsf@gitster.g>


* jw/builtin-objectmode-attr (2023-12-12) 2 commits
 - SQUASH??? - leakfix
 - attr: add builtin objectmode values support

 The builtin_objectmode attribute is populated for each path
 without adding anything in .gitattributes files, which would be
 useful in magic pathspec, e.g., ":(attr:builtin_objectmode=100755)"
 to limit to executables.

 Needs to get leakfix reviewed.
 cf. <xmqq5y0ssknj.fsf@gitster.g>
 source: <20231116054437.2343549-1-jojwang@google.com>


* tb/merge-tree-write-pack (2023-10-23) 5 commits
 - builtin/merge-tree.c: implement support for `--write-pack`
 - bulk-checkin: introduce `index_tree_bulk_checkin_incore()`
 - bulk-checkin: introduce `index_blob_bulk_checkin_incore()`
 - bulk-checkin: generify `stream_blob_to_pack()` for arbitrary types
 - bulk-checkin: extract abstract `bulk_checkin_source`

 "git merge-tree" learned "--write-pack" to record its result
 without creating loose objects.

 Broken when an object created during a merge is needed to continue merge
 cf. <CABPp-BEfy9VOvimP9==ry_rZXu=metOQ8s=_-XiG_Pdx9c06Ww@mail.gmail.com>
 source: <cover.1698101088.git.me@ttaylorr.com>


* tb/pair-chunk-expect (2023-11-10) 8 commits
 - midx: read `OOFF` chunk with `pair_chunk_expect()`
 - midx: read `OIDL` chunk with `pair_chunk_expect()`
 - commit-graph: read `BIDX` chunk with `pair_chunk_expect()`
 - commit-graph: read `GDAT` chunk with `pair_chunk_expect()`
 - commit-graph: read `CDAT` chunk with `pair_chunk_expect()`
 - commit-graph: read `OIDL` chunk with `pair_chunk_expect()`
 - chunk-format: introduce `pair_chunk_expect()` helper
 - Merge branch 'jk/chunk-bounds-more' into HEAD

 Further code clean-up.

 Needs review.
 source: <cover.1699569246.git.me@ttaylorr.com>


* tb/path-filter-fix (2023-10-18) 17 commits
 - bloom: introduce `deinit_bloom_filters()`
 - commit-graph: reuse existing Bloom filters where possible
 - object.h: fix mis-aligned flag bits table
 - commit-graph: drop unnecessary `graph_read_bloom_data_context`
 - commit-graph.c: unconditionally load Bloom filters
 - bloom: prepare to discard incompatible Bloom filters
 - bloom: annotate filters with hash version
 - commit-graph: new filter ver. that fixes murmur3
 - repo-settings: introduce commitgraph.changedPathsVersion
 - t4216: test changed path filters with high bit paths
 - t/helper/test-read-graph: implement `bloom-filters` mode
 - bloom.h: make `load_bloom_filter_from_graph()` public
 - t/helper/test-read-graph.c: extract `dump_graph_info()`
 - gitformat-commit-graph: describe version 2 of BDAT
 - commit-graph: ensure Bloom filters are read with consistent settings
 - revision.c: consult Bloom filters for root commits
 - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`

 The Bloom filter used for path limited history traversal was broken
 on systems whose "char" is unsigned; update the implementation and
 bump the format version to 2.

 Expecting a reroll.
 cf. <20231023202212.GA5470@szeder.dev>
 source: <cover.1697653929.git.me@ttaylorr.com>


* ak/color-decorate-symbols (2023-10-23) 7 commits
 - log: add color.decorate.pseudoref config variable
 - refs: exempt pseudorefs from pattern prefixing
 - refs: add pseudorefs array and iteration functions
 - log: add color.decorate.ref config variable
 - log: add color.decorate.symbol config variable
 - log: use designated inits for decoration_colors
 - config: restructure color.decorate documentation

 A new config for coloring.

 Needs review.
 source: <20231023221143.72489-1-andy.koppe@gmail.com>


* la/trailer-cleanups (2023-12-20) 3 commits
  (merged to 'next' on 2023-12-21 at e26ede5f55)
 + trailer: use offsets for trailer_start/trailer_end
 + trailer: find the end of the log message
 + commit: ignore_non_trailer computes number of bytes to ignore

 Code clean-up.

 Will merge to 'master'.
 source: <pull.1563.v5.git.1697828495.gitgitgadget@gmail.com>


* eb/hash-transition (2023-10-02) 30 commits
 - t1016-compatObjectFormat: add tests to verify the conversion between objects
 - t1006: test oid compatibility with cat-file
 - t1006: rename sha1 to oid
 - test-lib: compute the compatibility hash so tests may use it
 - builtin/ls-tree: let the oid determine the output algorithm
 - object-file: handle compat objects in check_object_signature
 - tree-walk: init_tree_desc take an oid to get the hash algorithm
 - builtin/cat-file: let the oid determine the output algorithm
 - rev-parse: add an --output-object-format parameter
 - repository: implement extensions.compatObjectFormat
 - object-file: update object_info_extended to reencode objects
 - object-file-convert: convert commits that embed signed tags
 - object-file-convert: convert commit objects when writing
 - object-file-convert: don't leak when converting tag objects
 - object-file-convert: convert tag objects when writing
 - object-file-convert: add a function to convert trees between algorithms
 - object: factor out parse_mode out of fast-import and tree-walk into in object.h
 - cache: add a function to read an OID of a specific algorithm
 - tag: sign both hashes
 - commit: export add_header_signature to support handling signatures on tags
 - commit: convert mergetag before computing the signature of a commit
 - commit: write commits for both hashes
 - object-file: add a compat_oid_in parameter to write_object_file_flags
 - object-file: update the loose object map when writing loose objects
 - loose: compatibilty short name support
 - loose: add a mapping between SHA-1 and SHA-256 for loose objects
 - repository: add a compatibility hash algorithm
 - object-names: support input of oids in any supported hash
 - oid-array: teach oid-array to handle multiple kinds of oids
 - object-file-convert: stubs for converting from one object format to another

 Teach a repository to work with both SHA-1 and SHA-256 hash algorithms.

 Needs review.
 source: <878r8l929e.fsf@gmail.froward.int.ebiederm.org>


* jx/remote-archive-over-smart-http (2023-12-14) 4 commits
 - archive: support remote archive from stateless transport
 - transport-helper: call do_take_over() in connect_helper
 - transport-helper: call do_take_over() in process_connect
 - transport-helper: no connection restriction in connect_helper

 "git archive --remote=<remote>" learned to talk over the smart
 http (aka stateless) transport.

 Needs review.
 source: <cover.1702562879.git.zhiyou.jx@alibaba-inc.com>


* jx/sideband-chomp-newline-fix (2023-12-18) 3 commits
 - pkt-line: do not chomp newlines for sideband messages
 - pkt-line: memorize sideband fragment in reader
 - test-pkt-line: add option parser for unpack-sideband

 Sideband demultiplexer fixes.

 Will merge to 'next'?
 source: <cover.1702823801.git.zhiyou.jx@alibaba-inc.com>


* jc/rerere-cleanup (2023-08-25) 4 commits
 - rerere: modernize use of empty strbuf
 - rerere: try_merge() should use LL_MERGE_ERROR when it means an error
 - rerere: fix comment on handle_file() helper
 - rerere: simplify check_one_conflict() helper function

 Code clean-up.

 Not ready to be reviewed yet.
 source: <20230824205456.1231371-1-gitster@pobox.com>


* rj/status-bisect-while-rebase (2023-10-16) 1 commit
  (merged to 'next' on 2023-12-21 at 929a027fb7)
 + status: fix branch shown when not only bisecting

 "git status" is taught to show both the branch being bisected and
 being rebased when both are in effect at the same time.

 Will merge to 'master'.
 cf. <xmqqil76kyov.fsf@gitster.g>
 source: <2e24ca9b-9c5f-f4df-b9f8-6574a714dfb2@gmail.com>


--------------------------------------------------
[Discarded]

* jc/sparse-checkout-eoo (2023-12-21) 5 commits
 . sparse-checkout: tighten add_patterns_from_input()
 . sparse-checkout: use default patterns for 'set' only !stdin
 . SQUASH??? end-of-options test
 . sparse-checkout: take care of "--end-of-options" in set/add/check-rules
 - Merge branch 'jk/end-of-options' into jc/sparse-checkout-set-add-end-of-options

 "git sparse-checkout (add|set) --[no-]cone --end-of-options" did
 not handle "--end-of-options" correctly after a recent update.

 Superseded by the en/sparse-checkout-eoo topic.
 source: <20231221065925.3234048-1-gitster@pobox.com>


^ permalink raw reply

* [PATCH v3 0/9] Minor improvements to CodingGuidelines and SubmittingPatches
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref
In-Reply-To: <pull.1623.v2.git.1703176865.gitgitgadget@gmail.com>

These are a bunch of things I've run into over my past couple of attempts to
contribute to Git.

 * Incremental punctuation/grammatical improvements
 * Update extra tags suggestions based on common usage
 * drop reference to an article that was discontinued over a decade ago
 * update GitHub references
 * harmonize non-ASCII while I'm here

Note that I'm trying to do things "in the neighborhood". It'll be slower
than me replacing things topically, but hopefully easier for others to
digest. My current estimate is a decade or two :).

Josh Soref (9):
  CodingGuidelines: move period inside parentheses
  CodingGuidelines: write punctuation marks
  SubmittingPatches: drop ref to "What's in git.git"
  SubmittingPatches: discourage new trailers
  SubmittingPatches: update extra tags list
  SubmittingPatches: provide tag naming advice
  SubmittingPatches: clarify GitHub visual
  SubmittingPatches: clarify GitHub artifact format
  SubmittingPatches: hyphenate non-ASCII

 Documentation/CodingGuidelines  |  4 ++--
 Documentation/SubmittingPatches | 33 +++++++++++++++++++++++----------
 2 files changed, 25 insertions(+), 12 deletions(-)


base-commit: 624eb90fa8f65a79396615f3c2842ac5a3743350
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1623%2Fjsoref%2Fdocumentation-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1623/jsoref/documentation-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1623

Range-diff vs v2:

  1:  b9a8eb6aa4e =  1:  b9a8eb6aa4e CodingGuidelines: move period inside parentheses
  2:  c0db8336e51 =  2:  c0db8336e51 CodingGuidelines: write punctuation marks
  3:  22d66c5b78a =  3:  22d66c5b78a SubmittingPatches: drop ref to "What's in git.git"
  4:  eac2211332f =  4:  eac2211332f SubmittingPatches: discourage new trailers
  5:  8848572fe2c =  5:  8848572fe2c SubmittingPatches: update extra tags list
  6:  8f16c7caa73 !  6:  f28c1011ba9 SubmittingPatches: improve extra tags advice
     @@ Metadata
      Author: Josh Soref <jsoref@gmail.com>
      
       ## Commit message ##
     -    SubmittingPatches: improve extra tags advice
     +    SubmittingPatches: provide tag naming advice
      
          Current statistics show a strong preference to only capitalize the first
          letter in a hyphenated tag, but that some guidance would be helpful:
     @@ Documentation/SubmittingPatches: While you can also create your own trailer if t
       encourage you to instead use one of the common trailers in this project
       highlighted above.
       
     -+Extra tags should only capitalize the very first letter, i.e. favor
     ++Only capitalize the very first letter of tags, i.e. favor
      +"Signed-off-by" over "Signed-Off-By" and "Acked-by:" over "Acked-By".
      +
       [[git-tools]]
  7:  cdb5fd0957f <  -:  ----------- SubmittingPatches: clarify GitHub visual
  8:  77576327df8 !  7:  49cef6f7c20 SubmittingPatches: clarify GitHub artifact format
     @@ Metadata
      Author: Josh Soref <jsoref@gmail.com>
      
       ## Commit message ##
     -    SubmittingPatches: clarify GitHub artifact format
     +    SubmittingPatches: clarify GitHub visual
      
     -    GitHub wraps artifacts generated by workflows in a .zip file.
     +    GitHub has two general forms for its states, sometimes they're a simple
     +    colored object (e.g. green check or red x), and sometimes there's also a
     +    colored container (e.g. green box or red circle) which contains that
     +    object (e.g. check or x).
      
     -    Internally, workflows can package anything they like in them.
     -
     -    A recently generated failure artifact had the form:
     -
     -    windows-artifacts.zip
     -      Length      Date    Time    Name
     -    ---------  ---------- -----   ----
     -     76001695  12-19-2023 01:35   artifacts.tar.gz
     -     11005650  12-19-2023 01:35   tracked.tar.gz
     -    ---------                     -------
     -     87007345                     2 files
     +    That's a lot of words to try to describe things, but in general, the key
     +    for a failure is that it's recognized as an `x` and that it's associated
     +    with the color red -- the color of course is problematic for people who
     +    are red-green color-blind, but that's why they are paired with distinct
     +    shapes.
      
          Signed-off-by: Josh Soref <jsoref@gmail.com>
      
       ## Documentation/SubmittingPatches ##
     -@@ Documentation/SubmittingPatches: branches here: `https://github.com/<Your GitHub handle>/git/actions/workflows/ma
     - If a branch did not pass all test cases then it is marked with a red
     - +x+. In that case you can click on the failing job and navigate to
     - "ci/run-build-and-tests.sh" and/or "ci/print-test-failures.sh". You
     +@@ Documentation/SubmittingPatches: After the initial setup, CI will run whenever you push new changes
     + to your fork of Git on GitHub.  You can monitor the test state of all your
     + branches here: `https://github.com/<Your GitHub handle>/git/actions/workflows/main.yml`
     + 
     +-If a branch did not pass all test cases then it is marked with a red
     +-cross. In that case you can click on the failing job and navigate to
     +-"ci/run-build-and-tests.sh" and/or "ci/print-test-failures.sh". You
      -can also download "Artifacts" which are tarred (or zipped) archives
     -+can also download "Artifacts" which are zip archives containing
     -+tarred (or zipped) archives
     - with test data relevant for debugging.
     +-with test data relevant for debugging.
     ++If a branch does not pass all test cases then it will be marked with a
     ++red +x+, instead of a green check. In that case, you can click on the
     ++failing job and navigate to "ci/run-build-and-tests.sh" and/or
     ++"ci/print-test-failures.sh". You can also download "Artifacts" which
     ++are tarred (or zipped) archives with test data relevant for debugging.
       
       Then fix the problem and push your fix to your GitHub fork. This will
     + trigger a new CI build to ensure all tests pass.
  -:  ----------- >  8:  deb1bf02f3a SubmittingPatches: clarify GitHub artifact format
  9:  a4878f58fe4 =  9:  b1b75cc6a3e SubmittingPatches: hyphenate non-ASCII

-- 
gitgitgadget

^ permalink raw reply

* [PATCH v3 1/9] CodingGuidelines: move period inside parentheses
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

The contents within parenthesis should be omittable without resulting
in broken text.

Eliding the parenthesis left a period to end a run without any content.

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/CodingGuidelines | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 8ed517a5ca0..af94ed3a75d 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -450,7 +450,7 @@ For C programs:
    one of the approved headers that includes it first for you.  (The
    approved headers currently include "builtin.h",
    "t/helper/test-tool.h", "xdiff/xinclude.h", or
-   "reftable/system.h").  You do not have to include more than one of
+   "reftable/system.h".)  You do not have to include more than one of
    these.
 
  - A C file must directly include the header files that declare the
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 2/9] CodingGuidelines: write punctuation marks
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

- Match style in Release Notes

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/CodingGuidelines | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index af94ed3a75d..578587a4715 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -578,7 +578,7 @@ Externally Visible Names
    . The variable name describes the effect of tweaking this knob.
 
    The section and variable names that consist of multiple words are
-   formed by concatenating the words without punctuations (e.g. `-`),
+   formed by concatenating the words without punctuation marks (e.g. `-`),
    and are broken using bumpyCaps in documentation as a hint to the
    reader.
 
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 3/9] SubmittingPatches: drop ref to "What's in git.git"
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

"What's in git.git" was last seen in 2010:
  https://lore.kernel.org/git/?q=%22what%27s+in+git.git%22
  https://lore.kernel.org/git/7vaavikg72.fsf@alter.siamese.dyndns.org/

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index bce7f97815c..32e90238777 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -570,7 +570,7 @@ their trees themselves.
   master).
 
 * Read the Git mailing list, the maintainer regularly posts messages
-  entitled "What's cooking in git.git" and "What's in git.git" giving
+  entitled "What's cooking in git.git" giving
   the status of various proposed changes.
 
 == GitHub CI[[GHCI]]
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 4/9] SubmittingPatches: discourage new trailers
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

There seems to be consensus amongst the core Git community on a working
set of common trailers, and there are non-trivial costs to people
inventing new trailers (research to discover what they mean/how they
differ from existing trailers) such that inventing new ones is generally
unwarranted and not something to be recommended to new contributors.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 32e90238777..58dfe405049 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -356,8 +356,9 @@ If you like, you can put extra tags at the end:
 . `Tested-by:` is used to indicate that the person applied the patch
   and found it to have the desired effect.
 
-You can also create your own tag or use one that's in common usage
-such as "Thanks-to:", "Based-on-patch-by:", or "Mentored-by:".
+While you can also create your own trailer if the situation warrants it, we
+encourage you to instead use one of the common trailers in this project
+highlighted above.
 
 [[git-tools]]
 === Generate your patch using Git tools out of your commits.
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 5/9] SubmittingPatches: update extra tags list
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

Add items with at least 100 uses in the past three years:
- Co-authored-by
- Helped-by
- Mentored-by
- Suggested-by

git log --since=3.years|
  perl -ne 'next unless /^\s+[A-Z][a-z]+-\S+:/;s/^\s+//;s/:.*/:/;print'|
  sort|uniq -c|sort -n|grep '[0-9][0-9] '
  14 Based-on-patch-by:
  14 Original-patch-by:
  17 Tested-by:
 100 Suggested-by:
 121 Co-authored-by:
 163 Mentored-by:
 274 Reported-by:
 290 Acked-by:
 450 Helped-by:
 602 Reviewed-by:
14111 Signed-off-by:

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 58dfe405049..31878cb70b7 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -355,6 +355,14 @@ If you like, you can put extra tags at the end:
   patch after a detailed analysis.
 . `Tested-by:` is used to indicate that the person applied the patch
   and found it to have the desired effect.
+. `Co-authored-by:` is used to indicate that people exchanged drafts
+   of a patch before submitting it.
+. `Helped-by:` is used to credit someone who suggested ideas for
+  changes without providing the precise changes in patch form.
+. `Mentored-by:` is used to credit someone with helping develop a
+  patch as part of a mentorship program (e.g., GSoC or Outreachy).
+. `Suggested-by:` is used to credit someone with suggesting the idea
+  for a patch.
 
 While you can also create your own trailer if the situation warrants it, we
 encourage you to instead use one of the common trailers in this project
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 6/9] SubmittingPatches: provide tag naming advice
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

Current statistics show a strong preference to only capitalize the first
letter in a hyphenated tag, but that some guidance would be helpful:

git log |
  perl -ne 'next unless /^\s+(?:Signed-[oO]ff|Acked)-[bB]y:/;
    s/^\s+//;s/:.*/:/;print'|
  sort|uniq -c|sort -n
   2 Signed-off-By:
   4 Signed-Off-by:
  22 Acked-By:
  47 Signed-Off-By:
2202 Acked-by:
95315 Signed-off-by:

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 31878cb70b7..94c874ab5e6 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -368,6 +368,9 @@ While you can also create your own trailer if the situation warrants it, we
 encourage you to instead use one of the common trailers in this project
 highlighted above.
 
+Only capitalize the very first letter of tags, i.e. favor
+"Signed-off-by" over "Signed-Off-By" and "Acked-by:" over "Acked-By".
+
 [[git-tools]]
 === Generate your patch using Git tools out of your commits.
 
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 7/9] SubmittingPatches: clarify GitHub visual
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

GitHub has two general forms for its states, sometimes they're a simple
colored object (e.g. green check or red x), and sometimes there's also a
colored container (e.g. green box or red circle) which contains that
object (e.g. check or x).

That's a lot of words to try to describe things, but in general, the key
for a failure is that it's recognized as an `x` and that it's associated
with the color red -- the color of course is problematic for people who
are red-green color-blind, but that's why they are paired with distinct
shapes.

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 94c874ab5e6..0665f89f38c 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -602,11 +602,11 @@ After the initial setup, CI will run whenever you push new changes
 to your fork of Git on GitHub.  You can monitor the test state of all your
 branches here: `https://github.com/<Your GitHub handle>/git/actions/workflows/main.yml`
 
-If a branch did not pass all test cases then it is marked with a red
-cross. In that case you can click on the failing job and navigate to
-"ci/run-build-and-tests.sh" and/or "ci/print-test-failures.sh". You
-can also download "Artifacts" which are tarred (or zipped) archives
-with test data relevant for debugging.
+If a branch does not pass all test cases then it will be marked with a
+red +x+, instead of a green check. In that case, you can click on the
+failing job and navigate to "ci/run-build-and-tests.sh" and/or
+"ci/print-test-failures.sh". You can also download "Artifacts" which
+are tarred (or zipped) archives with test data relevant for debugging.
 
 Then fix the problem and push your fix to your GitHub fork. This will
 trigger a new CI build to ensure all tests pass.
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 8/9] SubmittingPatches: clarify GitHub artifact format
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

GitHub wraps artifacts generated by workflows in a .zip file.

Internally, workflows can package anything they like in them.

A recently generated failure artifact had the form:

windows-artifacts.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
 76001695  12-19-2023 01:35   artifacts.tar.gz
 11005650  12-19-2023 01:35   tracked.tar.gz
---------                     -------
 87007345                     2 files

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 0665f89f38c..5e2e13b5e09 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -606,7 +606,8 @@ If a branch does not pass all test cases then it will be marked with a
 red +x+, instead of a green check. In that case, you can click on the
 failing job and navigate to "ci/run-build-and-tests.sh" and/or
 "ci/print-test-failures.sh". You can also download "Artifacts" which
-are tarred (or zipped) archives with test data relevant for debugging.
+are zip archives containing tarred (or zipped) archives with test data
+relevant for debugging.
 
 Then fix the problem and push your fix to your GitHub fork. This will
 trigger a new CI build to ensure all tests pass.
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 9/9] SubmittingPatches: hyphenate non-ASCII
From: Josh Soref via GitGitGadget @ 2023-12-28  4:55 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, René Scharfe, Phillip Wood, Josh Soref,
	Dragan Simic, Josh Soref, Josh Soref
In-Reply-To: <pull.1623.v3.git.1703739324.gitgitgadget@gmail.com>

From: Josh Soref <jsoref@gmail.com>

Git documentation does this with the exception of ancient release notes.

Signed-off-by: Josh Soref <jsoref@gmail.com>
---
 Documentation/SubmittingPatches | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 5e2e13b5e09..e734a3f0f17 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -699,7 +699,7 @@ message to an external program, and this is a handy way to drive
 `git am`.  However, if the message is MIME encoded, what is
 piped into the program is the representation you see in your
 `*Article*` buffer after unwrapping MIME.  This is often not what
-you would want for two reasons.  It tends to screw up non ASCII
+you would want for two reasons.  It tends to screw up non-ASCII
 characters (most notably in people's names), and also
 whitespaces (fatal in patches).  Running "C-u g" to display the
 message in raw form before using "|" to run the pipe can work
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v2 11/11] reftable/block: reuse buffer to compute record keys
From: Patrick Steinhardt @ 2023-12-28  5:53 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: git, Jonathan Nieder
In-Reply-To: <CAOw_e7bQ+jxO2zhj32mDksq9uBKQfNt=wMNP5K6Oy1DqievCdg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]

On Thu, Dec 21, 2023 at 11:43:27AM +0100, Han-Wen Nienhuys wrote:
> On Fri, Dec 8, 2023 at 3:53 PM Patrick Steinhardt <ps@pks.im> wrote:
> 
> > @@ -84,10 +84,12 @@ struct block_iter {
> >
> >         /* key for last entry we read. */
> >         struct strbuf last_key;
> > +       struct strbuf key;
> >  };
> 
> it's slightly more efficient, but the new field has no essential
> meaning. If I encountered this code with the change you make here, I
> would probably refactor it in the opposite direction to increase code
> clarity.
> 
> I suspect that the gains are too small to be measurable, but if you
> are after small efficiency gains, you can have
> reftable_record_decode() consume the key to avoid copying overhead in
> record.c.

Fair enough. I've done a better job for these kinds of refactorings for
the reftable library in my second patch series [1] by including some
benchmarks via Valgrind. This should result in less handwavy and
actually measurable/significant performance improvements.

Patrick

[1]: https://public-inbox.org/git/xmqqr0jgsn9g.fsf@gitster.g/T/#t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 0/8] reftable: fixes and optimizations (pt.2)
From: Patrick Steinhardt @ 2023-12-28  6:27 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703063544.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 3681 bytes --]

Hi,

this is the second version of my patch series that contains additional
fixes and optimizations for the reftable library. Changes compared to
v1:

  - Added a patch to not auto-compact twice in `reftable_stack_add()`,
    reported by Han-Wen.

  - Fixed an issue I've previously introduced while rebasing the patch
    series.

  - Fixed a missing header, as reported by Junio.

Due to the added patch that avoids double auto-compaction this patch
series now depends on 637e34a783 (Merge branch 'ps/reftable-fixes',
2023-12-27).

Patrick

Patrick Steinhardt (8):
  reftable/stack: do not overwrite errors when compacting
  reftable/stack: do not auto-compact twice in `reftable_stack_add()`
  reftable/writer: fix index corruption when writing multiple indices
  reftable/record: constify some parts of the interface
  reftable/record: store "val1" hashes as static arrays
  reftable/record: store "val2" hashes as static arrays
  reftable/merged: really reuse buffers to compute record keys
  reftable/merged: transfer ownership of records when iterating

 reftable/block_test.c      |   4 +-
 reftable/merged.c          |   8 +--
 reftable/merged_test.c     |  16 +++---
 reftable/readwrite_test.c  | 102 +++++++++++++++++++++++++++++++------
 reftable/record.c          |  17 ++-----
 reftable/record_test.c     |   5 --
 reftable/reftable-record.h |  11 ++--
 reftable/stack.c           |  23 +++------
 reftable/stack_test.c      |   2 -
 reftable/writer.c          |   4 +-
 10 files changed, 117 insertions(+), 75 deletions(-)

Range-diff against v1:
1:  573fb2c4fb = 1:  22a57020c6 reftable/stack: do not overwrite errors when compacting
-:  ---------- > 2:  a08f7efc99 reftable/stack: do not auto-compact twice in `reftable_stack_add()`
2:  86ee79c48d = 3:  c00e08d97f reftable/writer: fix index corruption when writing multiple indices
3:  3ad4a0e5b9 = 4:  3416268087 reftable/record: constify some parts of the interface
4:  06c9eab678 ! 5:  46ca3a37f8 reftable/record: store "val1" hashes as static arrays
    @@ reftable/readwrite_test.c: static void test_write_object_id_length(void)
      	};
      	int err;
      	int i;
    +@@ reftable/readwrite_test.c: static void test_write_multiple_indices(void)
    + 	writer = reftable_new_writer(&strbuf_add_void, &writer_buf, &opts);
    + 	reftable_writer_set_limits(writer, 1, 1);
    + 	for (i = 0; i < 100; i++) {
    +-		unsigned char hash[GIT_SHA1_RAWSZ] = {i};
    + 		struct reftable_ref_record ref = {
    + 			.update_index = 1,
    + 			.value_type = REFTABLE_REF_VAL1,
    +-			.value.val1 = hash,
    ++			.value.val1 = {i},
    + 		};
    + 
    + 		strbuf_reset(&buf);
     
      ## reftable/record.c ##
     @@ reftable/record.c: static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
    @@ reftable/record_test.c: static void test_reftable_ref_record_roundtrip(void)
      		case REFTABLE_REF_VAL2:
     
      ## reftable/reftable-record.h ##
    +@@ reftable/reftable-record.h: license that can be found in the LICENSE file or at
    + #ifndef REFTABLE_RECORD_H
    + #define REFTABLE_RECORD_H
    + 
    ++#include "hash-ll.h"
    + #include <stdint.h>
    + 
    + /*
     @@ reftable/reftable-record.h: struct reftable_ref_record {
      #define REFTABLE_NR_REF_VALUETYPES 4
      	} value_type;
5:  49f13c123f = 6:  c8a36917b1 reftable/record: store "val2" hashes as static arrays
6:  32b7ddee28 = 7:  6313f8affd reftable/merged: really reuse buffers to compute record keys
7:  3dbabea22a = 8:  25a3919e58 reftable/merged: transfer ownership of records when iterating

-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 1/8] reftable/stack: do not overwrite errors when compacting
From: Patrick Steinhardt @ 2023-12-28  6:27 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1854 bytes --]

In order to compact multiple stacks we iterate through the merged ref
and log records. When there is any error either when reading the records
from the old merged table or when writing the records to the new table
then we break out of the respective loops. When breaking out of the loop
for the ref records though the error code will be overwritten, which may
cause us to inadvertently skip over bad ref records. In the worst case,
this can lead to a compacted stack that is missing records.

Fix the code by using `goto done` instead so that any potential error
codes are properly returned to the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 16bab82063..8729508dc3 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -801,18 +801,16 @@ static int stack_write_compact(struct reftable_stack *st,
 			err = 0;
 			break;
 		}
-		if (err < 0) {
-			break;
-		}
+		if (err < 0)
+			goto done;
 
 		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
 			continue;
 		}
 
 		err = reftable_writer_add_ref(wr, &ref);
-		if (err < 0) {
-			break;
-		}
+		if (err < 0)
+			goto done;
 		entries++;
 	}
 	reftable_iterator_destroy(&it);
@@ -827,9 +825,8 @@ static int stack_write_compact(struct reftable_stack *st,
 			err = 0;
 			break;
 		}
-		if (err < 0) {
-			break;
-		}
+		if (err < 0)
+			goto done;
 		if (first == 0 && reftable_log_record_is_deletion(&log)) {
 			continue;
 		}
@@ -845,9 +842,8 @@ static int stack_write_compact(struct reftable_stack *st,
 		}
 
 		err = reftable_writer_add_log(wr, &log);
-		if (err < 0) {
-			break;
-		}
+		if (err < 0)
+			goto done;
 		entries++;
 	}
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 2/8] reftable/stack: do not auto-compact twice in `reftable_stack_add()`
From: Patrick Steinhardt @ 2023-12-28  6:27 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1171 bytes --]

In 5c086453ff (reftable/stack: perform auto-compaction with
transactional interface, 2023-12-11), we fixed a bug where the
transactional interface to add changes to a reftable stack did not
perform auto-compaction by calling `reftable_stack_auto_compact()` in
`reftable_stack_addition_commit()`. While correct, this change may now
cause us to perform auto-compaction twice in the non-transactional
interface `reftable_stack_add()`:

  - It performs auto-compaction by itself.

  - It now transitively performs auto-compaction via the transactional
    interface.

Remove the first instance so that we only end up doing auto-compaction
once.

Reported-by: Han-Wen Nienhuys <hanwenn@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 8729508dc3..7ffeb3ee10 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -425,9 +425,6 @@ int reftable_stack_add(struct reftable_stack *st,
 		return err;
 	}
 
-	if (!st->disable_auto_compact)
-		return reftable_stack_auto_compact(st);
-
 	return 0;
 }
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 3/8] reftable/writer: fix index corruption when writing multiple indices
From: Patrick Steinhardt @ 2023-12-28  6:27 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 5241 bytes --]

Each reftable may contain multiple types of blocks for refs, objects and
reflog records, where each of these may have an index that makes it more
efficient to find the records. It was observed that the index for log
records can become corrupted under certain circumstances, where the
first entry of the index points into the object index instead of to the
log records.

As it turns out, this corruption can occur whenever we write a log index
as well as at least one additional index. Writing records and their index
is basically a two-step process:

  1. We write all blocks for the corresponding record. Each block that
     gets written is added to a list of blocks to index.

  2. Once all blocks were written we finish the section. If at least two
     blocks have been added to the list of blocks to index then we will
     now write the index for those blocks and flush it, as well.

When we have a very large number of blocks then we may decide to write a
multi-level index, which is why we also keep track of the list of the
index blocks in the same way as we previously kept track of the blocks
to index.

Now when we have finished writing all index blocks we clear the index
and flush the last block to disk. This is done in the wrong order though
because flushing the block to disk will re-add it to the list of blocks
to be indexed. The result is that the next section we are about to write
will have an entry in the list of blocks to index that points to the
last block of the preceding section's index, which will corrupt the log
index.

Fix this corruption by clearing the index after having written the last
block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/readwrite_test.c | 80 +++++++++++++++++++++++++++++++++++++++
 reftable/writer.c         |  4 +-
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 278663f22d..9c16e0504e 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -798,6 +798,85 @@ static void test_write_key_order(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_multiple_indices(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 100,
+	};
+	struct strbuf writer_buf = STRBUF_INIT, buf = STRBUF_INIT;
+	struct reftable_block_source source = { 0 };
+	struct reftable_iterator it = { 0 };
+	const struct reftable_stats *stats;
+	struct reftable_writer *writer;
+	struct reftable_reader *reader;
+	int err, i;
+
+	writer = reftable_new_writer(&strbuf_add_void, &writer_buf, &opts);
+	reftable_writer_set_limits(writer, 1, 1);
+	for (i = 0; i < 100; i++) {
+		unsigned char hash[GIT_SHA1_RAWSZ] = {i};
+		struct reftable_ref_record ref = {
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash,
+		};
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "refs/heads/%04d", i);
+		ref.refname = buf.buf,
+
+		err = reftable_writer_add_ref(writer, &ref);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < 100; i++) {
+		unsigned char hash[GIT_SHA1_RAWSZ] = {i};
+		struct reftable_log_record log = {
+			.update_index = 1,
+			.value_type = REFTABLE_LOG_UPDATE,
+			.value.update = {
+				.old_hash = hash,
+				.new_hash = hash,
+			},
+		};
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "refs/heads/%04d", i);
+		log.refname = buf.buf,
+
+		err = reftable_writer_add_log(writer, &log);
+		EXPECT_ERR(err);
+	}
+
+	reftable_writer_close(writer);
+
+	/*
+	 * The written data should be sufficiently large to result in indices
+	 * for each of the block types.
+	 */
+	stats = reftable_writer_stats(writer);
+	EXPECT(stats->ref_stats.index_offset > 0);
+	EXPECT(stats->obj_stats.index_offset > 0);
+	EXPECT(stats->log_stats.index_offset > 0);
+
+	block_source_from_strbuf(&source, &writer_buf);
+	err = reftable_new_reader(&reader, &source, "filename");
+	EXPECT_ERR(err);
+
+	/*
+	 * Seeking the log uses the log index now. In case there is any
+	 * confusion regarding indices we would notice here.
+	 */
+	err = reftable_reader_seek_log(reader, &it, "");
+	EXPECT_ERR(err);
+
+	reftable_iterator_destroy(&it);
+	reftable_writer_free(writer);
+	reftable_reader_free(reader);
+	strbuf_release(&writer_buf);
+	strbuf_release(&buf);
+}
+
 static void test_corrupt_table_empty(void)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -847,5 +926,6 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_log_overflow);
 	RUN_TEST(test_write_object_id_length);
 	RUN_TEST(test_write_object_id_min_length);
+	RUN_TEST(test_write_multiple_indices);
 	return 0;
 }
diff --git a/reftable/writer.c b/reftable/writer.c
index 2e322a5683..ee4590e20f 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -432,12 +432,12 @@ static int writer_finish_section(struct reftable_writer *w)
 		reftable_free(idx);
 	}
 
-	writer_clear_index(w);
-
 	err = writer_flush_block(w);
 	if (err < 0)
 		return err;
 
+	writer_clear_index(w);
+
 	bstats = writer_reftable_block_stats(w, typ);
 	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
 	bstats->index_offset = index_start;
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 4/8] reftable/record: constify some parts of the interface
From: Patrick Steinhardt @ 2023-12-28  6:27 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2714 bytes --]

We're about to convert reftable records to stop storing their object IDs
as allocated hashes. Prepare for this refactoring by constifying some
parts of the interface that will be impacted by this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/record.c          | 8 ++++----
 reftable/reftable-record.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/reftable/record.c b/reftable/record.c
index fbaa1fbef5..5e258c734b 100644
--- a/reftable/record.c
+++ b/reftable/record.c
@@ -76,7 +76,7 @@ int reftable_is_block_type(uint8_t typ)
 	return 0;
 }
 
-uint8_t *reftable_ref_record_val1(const struct reftable_ref_record *rec)
+const unsigned char *reftable_ref_record_val1(const struct reftable_ref_record *rec)
 {
 	switch (rec->value_type) {
 	case REFTABLE_REF_VAL1:
@@ -88,7 +88,7 @@ uint8_t *reftable_ref_record_val1(const struct reftable_ref_record *rec)
 	}
 }
 
-uint8_t *reftable_ref_record_val2(const struct reftable_ref_record *rec)
+const unsigned char *reftable_ref_record_val2(const struct reftable_ref_record *rec)
 {
 	switch (rec->value_type) {
 	case REFTABLE_REF_VAL2:
@@ -242,7 +242,7 @@ static char hexdigit(int c)
 	return 'a' + (c - 10);
 }
 
-static void hex_format(char *dest, uint8_t *src, int hash_size)
+static void hex_format(char *dest, const unsigned char *src, int hash_size)
 {
 	assert(hash_size > 0);
 	if (src) {
@@ -1164,7 +1164,7 @@ int reftable_record_equal(struct reftable_record *a, struct reftable_record *b,
 		reftable_record_data(a), reftable_record_data(b), hash_size);
 }
 
-static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+static int hash_equal(const unsigned char *a, const unsigned char *b, int hash_size)
 {
 	if (a && b)
 		return !memcmp(a, b, hash_size);
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
index 67104f8fbf..f7eb2d6015 100644
--- a/reftable/reftable-record.h
+++ b/reftable/reftable-record.h
@@ -49,11 +49,11 @@ struct reftable_ref_record {
 
 /* Returns the first hash, or NULL if `rec` is not of type
  * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */
-uint8_t *reftable_ref_record_val1(const struct reftable_ref_record *rec);
+const unsigned char *reftable_ref_record_val1(const struct reftable_ref_record *rec);
 
 /* Returns the second hash, or NULL if `rec` is not of type
  * REFTABLE_REF_VAL2. */
-uint8_t *reftable_ref_record_val2(const struct reftable_ref_record *rec);
+const unsigned char *reftable_ref_record_val2(const struct reftable_ref_record *rec);
 
 /* returns whether 'ref' represents a deletion */
 int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 5/8] reftable/record: store "val1" hashes as static arrays
From: Patrick Steinhardt @ 2023-12-28  6:27 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 8585 bytes --]

When reading ref records of type "val1" we store its object ID in an
allocated array. This results in an additional allocation for every
single ref record we read, which is rather inefficient especially when
iterating over refs.

Refactor the code to instead use a static array of `GIT_MAX_RAWSZ`
bytes. While this means that `struct ref_record` is bigger now, we
typically do not store all refs in an array anyway and instead only
handle a limited number of records at the same point in time.

Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:

    HEAP SUMMARY:
        in use at exit: 21,098 bytes in 192 blocks
      total heap usage: 2,116,683 allocs, 2,116,491 frees, 76,098,060 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,098 bytes in 192 blocks
      total heap usage: 1,419,031 allocs, 1,418,839 frees, 62,145,036 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block_test.c      |  4 +---
 reftable/merged_test.c     | 16 ++++++----------
 reftable/readwrite_test.c  | 14 ++++----------
 reftable/record.c          |  3 ---
 reftable/record_test.c     |  1 -
 reftable/reftable-record.h |  3 ++-
 reftable/stack_test.c      |  2 --
 7 files changed, 13 insertions(+), 30 deletions(-)

diff --git a/reftable/block_test.c b/reftable/block_test.c
index c00bbc8aed..dedb05c7d8 100644
--- a/reftable/block_test.c
+++ b/reftable/block_test.c
@@ -49,13 +49,11 @@ static void test_block_read_write(void)
 
 	for (i = 0; i < N; i++) {
 		char name[100];
-		uint8_t hash[GIT_SHA1_RAWSZ];
 		snprintf(name, sizeof(name), "branch%02d", i);
-		memset(hash, i, sizeof(hash));
 
 		rec.u.ref.refname = name;
 		rec.u.ref.value_type = REFTABLE_REF_VAL1;
-		rec.u.ref.value.val1 = hash;
+		memset(rec.u.ref.value.val1, i, GIT_SHA1_RAWSZ);
 
 		names[i] = xstrdup(name);
 		n = block_writer_add(&bw, &rec);
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
index d08c16abef..b3927a5d73 100644
--- a/reftable/merged_test.c
+++ b/reftable/merged_test.c
@@ -123,13 +123,11 @@ static void readers_destroy(struct reftable_reader **readers, size_t n)
 
 static void test_merged_between(void)
 {
-	uint8_t hash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 0 };
-
 	struct reftable_ref_record r1[] = { {
 		.refname = "b",
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
-		.value.val1 = hash1,
+		.value.val1 = { 1, 2, 3, 0 },
 	} };
 	struct reftable_ref_record r2[] = { {
 		.refname = "a",
@@ -165,26 +163,24 @@ static void test_merged_between(void)
 
 static void test_merged(void)
 {
-	uint8_t hash1[GIT_SHA1_RAWSZ] = { 1 };
-	uint8_t hash2[GIT_SHA1_RAWSZ] = { 2 };
 	struct reftable_ref_record r1[] = {
 		{
 			.refname = "a",
 			.update_index = 1,
 			.value_type = REFTABLE_REF_VAL1,
-			.value.val1 = hash1,
+			.value.val1 = { 1 },
 		},
 		{
 			.refname = "b",
 			.update_index = 1,
 			.value_type = REFTABLE_REF_VAL1,
-			.value.val1 = hash1,
+			.value.val1 = { 1 },
 		},
 		{
 			.refname = "c",
 			.update_index = 1,
 			.value_type = REFTABLE_REF_VAL1,
-			.value.val1 = hash1,
+			.value.val1 = { 1 },
 		}
 	};
 	struct reftable_ref_record r2[] = { {
@@ -197,13 +193,13 @@ static void test_merged(void)
 			.refname = "c",
 			.update_index = 3,
 			.value_type = REFTABLE_REF_VAL1,
-			.value.val1 = hash2,
+			.value.val1 = { 2 },
 		},
 		{
 			.refname = "d",
 			.update_index = 3,
 			.value_type = REFTABLE_REF_VAL1,
-			.value.val1 = hash1,
+			.value.val1 = { 1 },
 		},
 	};
 
diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 9c16e0504e..87b238105c 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -60,18 +60,15 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 	*names = reftable_calloc(sizeof(char *) * (N + 1));
 	reftable_writer_set_limits(w, update_index, update_index);
 	for (i = 0; i < N; i++) {
-		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
 		char name[100];
 		int n;
 
-		set_test_hash(hash, i);
-
 		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
 
 		ref.refname = name;
 		ref.update_index = update_index;
 		ref.value_type = REFTABLE_REF_VAL1;
-		ref.value.val1 = hash;
+		set_test_hash(ref.value.val1, i);
 		(*names)[i] = xstrdup(name);
 
 		n = reftable_writer_add_ref(w, &ref);
@@ -675,11 +672,10 @@ static void test_write_object_id_min_length(void)
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
 		reftable_new_writer(&strbuf_add_void, &buf, &opts);
-	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
 	struct reftable_ref_record ref = {
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
-		.value.val1 = hash,
+		.value.val1 = {42},
 	};
 	int err;
 	int i;
@@ -711,11 +707,10 @@ static void test_write_object_id_length(void)
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
 		reftable_new_writer(&strbuf_add_void, &buf, &opts);
-	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
 	struct reftable_ref_record ref = {
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
-		.value.val1 = hash,
+		.value.val1 = {42},
 	};
 	int err;
 	int i;
@@ -814,11 +809,10 @@ static void test_write_multiple_indices(void)
 	writer = reftable_new_writer(&strbuf_add_void, &writer_buf, &opts);
 	reftable_writer_set_limits(writer, 1, 1);
 	for (i = 0; i < 100; i++) {
-		unsigned char hash[GIT_SHA1_RAWSZ] = {i};
 		struct reftable_ref_record ref = {
 			.update_index = 1,
 			.value_type = REFTABLE_REF_VAL1,
-			.value.val1 = hash,
+			.value.val1 = {i},
 		};
 
 		strbuf_reset(&buf);
diff --git a/reftable/record.c b/reftable/record.c
index 5e258c734b..a67a6b4d8a 100644
--- a/reftable/record.c
+++ b/reftable/record.c
@@ -219,7 +219,6 @@ static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
 	case REFTABLE_REF_DELETION:
 		break;
 	case REFTABLE_REF_VAL1:
-		ref->value.val1 = reftable_malloc(hash_size);
 		memcpy(ref->value.val1, src->value.val1, hash_size);
 		break;
 	case REFTABLE_REF_VAL2:
@@ -303,7 +302,6 @@ void reftable_ref_record_release(struct reftable_ref_record *ref)
 		reftable_free(ref->value.val2.value);
 		break;
 	case REFTABLE_REF_VAL1:
-		reftable_free(ref->value.val1);
 		break;
 	case REFTABLE_REF_DELETION:
 		break;
@@ -394,7 +392,6 @@ static int reftable_ref_record_decode(void *rec, struct strbuf key,
 			return -1;
 		}
 
-		r->value.val1 = reftable_malloc(hash_size);
 		memcpy(r->value.val1, in.buf, hash_size);
 		string_view_consume(&in, hash_size);
 		break;
diff --git a/reftable/record_test.c b/reftable/record_test.c
index 70ae78feca..5c94d26e35 100644
--- a/reftable/record_test.c
+++ b/reftable/record_test.c
@@ -119,7 +119,6 @@ static void test_reftable_ref_record_roundtrip(void)
 		case REFTABLE_REF_DELETION:
 			break;
 		case REFTABLE_REF_VAL1:
-			in.u.ref.value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
 			set_hash(in.u.ref.value.val1, 1);
 			break;
 		case REFTABLE_REF_VAL2:
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
index f7eb2d6015..7f3a0df635 100644
--- a/reftable/reftable-record.h
+++ b/reftable/reftable-record.h
@@ -9,6 +9,7 @@ license that can be found in the LICENSE file or at
 #ifndef REFTABLE_RECORD_H
 #define REFTABLE_RECORD_H
 
+#include "hash-ll.h"
 #include <stdint.h>
 
 /*
@@ -38,7 +39,7 @@ struct reftable_ref_record {
 #define REFTABLE_NR_REF_VALUETYPES 4
 	} value_type;
 	union {
-		uint8_t *val1; /* malloced hash. */
+		unsigned char val1[GIT_MAX_RAWSZ];
 		struct {
 			uint8_t *value; /* first value, malloced hash  */
 			uint8_t *target_value; /* second value, malloced hash */
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 14a3fc11ee..feab49d7f7 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -463,7 +463,6 @@ static void test_reftable_stack_add(void)
 		refs[i].refname = xstrdup(buf);
 		refs[i].update_index = i + 1;
 		refs[i].value_type = REFTABLE_REF_VAL1;
-		refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
 		set_test_hash(refs[i].value.val1, i);
 
 		logs[i].refname = xstrdup(buf);
@@ -600,7 +599,6 @@ static void test_reftable_stack_tombstone(void)
 		refs[i].update_index = i + 1;
 		if (i % 2 == 0) {
 			refs[i].value_type = REFTABLE_REF_VAL1;
-			refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
 			set_test_hash(refs[i].value.val1, i);
 		}
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 6/8] reftable/record: store "val2" hashes as static arrays
From: Patrick Steinhardt @ 2023-12-28  6:28 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 5214 bytes --]

Similar to the preceding commit, convert ref records of type "val2" to
store their object IDs in static arrays instead of allocating them for
every single record.

We're using the same benchmark as in the preceding commit, with `git
show-ref --quiet` in a repository with ~350k refs. This time around
though the effects aren't this huge. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,419,040 allocs, 1,418,847 frees, 62,153,868 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated

This is because "val2"-type records are typically only stored for peeled
tags, and the number of annotated tags in the benchmark repository is
rather low. Still, it can be seen that this change leads to a reduction
of allocations overall, even if only a small one.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/readwrite_test.c  | 12 ++++--------
 reftable/record.c          |  6 ------
 reftable/record_test.c     |  4 ----
 reftable/reftable-record.h |  4 ++--
 4 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 87b238105c..178766dfa8 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -547,8 +547,6 @@ static void test_table_refs_for(int indexed)
 		uint8_t hash[GIT_SHA1_RAWSZ];
 		char fill[51] = { 0 };
 		char name[100];
-		uint8_t hash1[GIT_SHA1_RAWSZ];
-		uint8_t hash2[GIT_SHA1_RAWSZ];
 		struct reftable_ref_record ref = { NULL };
 
 		memset(hash, i, sizeof(hash));
@@ -558,11 +556,9 @@ static void test_table_refs_for(int indexed)
 		name[40] = 0;
 		ref.refname = name;
 
-		set_test_hash(hash1, i / 4);
-		set_test_hash(hash2, 3 + i / 4);
 		ref.value_type = REFTABLE_REF_VAL2;
-		ref.value.val2.value = hash1;
-		ref.value.val2.target_value = hash2;
+		set_test_hash(ref.value.val2.value, i / 4);
+		set_test_hash(ref.value.val2.target_value, 3 + i / 4);
 
 		/* 80 bytes / entry, so 3 entries per block. Yields 17
 		 */
@@ -570,8 +566,8 @@ static void test_table_refs_for(int indexed)
 		n = reftable_writer_add_ref(w, &ref);
 		EXPECT(n == 0);
 
-		if (!memcmp(hash1, want_hash, GIT_SHA1_RAWSZ) ||
-		    !memcmp(hash2, want_hash, GIT_SHA1_RAWSZ)) {
+		if (!memcmp(ref.value.val2.value, want_hash, GIT_SHA1_RAWSZ) ||
+		    !memcmp(ref.value.val2.target_value, want_hash, GIT_SHA1_RAWSZ)) {
 			want_names[want_names_len++] = xstrdup(name);
 		}
 	}
diff --git a/reftable/record.c b/reftable/record.c
index a67a6b4d8a..5c3fbb7b2a 100644
--- a/reftable/record.c
+++ b/reftable/record.c
@@ -222,9 +222,7 @@ static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
 		memcpy(ref->value.val1, src->value.val1, hash_size);
 		break;
 	case REFTABLE_REF_VAL2:
-		ref->value.val2.value = reftable_malloc(hash_size);
 		memcpy(ref->value.val2.value, src->value.val2.value, hash_size);
-		ref->value.val2.target_value = reftable_malloc(hash_size);
 		memcpy(ref->value.val2.target_value,
 		       src->value.val2.target_value, hash_size);
 		break;
@@ -298,8 +296,6 @@ void reftable_ref_record_release(struct reftable_ref_record *ref)
 		reftable_free(ref->value.symref);
 		break;
 	case REFTABLE_REF_VAL2:
-		reftable_free(ref->value.val2.target_value);
-		reftable_free(ref->value.val2.value);
 		break;
 	case REFTABLE_REF_VAL1:
 		break;
@@ -401,11 +397,9 @@ static int reftable_ref_record_decode(void *rec, struct strbuf key,
 			return -1;
 		}
 
-		r->value.val2.value = reftable_malloc(hash_size);
 		memcpy(r->value.val2.value, in.buf, hash_size);
 		string_view_consume(&in, hash_size);
 
-		r->value.val2.target_value = reftable_malloc(hash_size);
 		memcpy(r->value.val2.target_value, in.buf, hash_size);
 		string_view_consume(&in, hash_size);
 		break;
diff --git a/reftable/record_test.c b/reftable/record_test.c
index 5c94d26e35..2876db7d27 100644
--- a/reftable/record_test.c
+++ b/reftable/record_test.c
@@ -122,11 +122,7 @@ static void test_reftable_ref_record_roundtrip(void)
 			set_hash(in.u.ref.value.val1, 1);
 			break;
 		case REFTABLE_REF_VAL2:
-			in.u.ref.value.val2.value =
-				reftable_malloc(GIT_SHA1_RAWSZ);
 			set_hash(in.u.ref.value.val2.value, 1);
-			in.u.ref.value.val2.target_value =
-				reftable_malloc(GIT_SHA1_RAWSZ);
 			set_hash(in.u.ref.value.val2.target_value, 2);
 			break;
 		case REFTABLE_REF_SYMREF:
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
index 7f3a0df635..bb6e99acd3 100644
--- a/reftable/reftable-record.h
+++ b/reftable/reftable-record.h
@@ -41,8 +41,8 @@ struct reftable_ref_record {
 	union {
 		unsigned char val1[GIT_MAX_RAWSZ];
 		struct {
-			uint8_t *value; /* first value, malloced hash  */
-			uint8_t *target_value; /* second value, malloced hash */
+			unsigned char value[GIT_MAX_RAWSZ]; /* first hash  */
+			unsigned char target_value[GIT_MAX_RAWSZ]; /* second hash */
 		} val2;
 		char *symref; /* referent, malloced 0-terminated string */
 	} value;
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 7/8] reftable/merged: really reuse buffers to compute record keys
From: Patrick Steinhardt @ 2023-12-28  6:28 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1703743174.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1486 bytes --]

In 829231dc20 (reftable/merged: reuse buffer to compute record keys,
2023-12-11), we have refactored the merged iterator to reuse a set of
buffers that it would otherwise have to reallocate on every single
iteration. Unfortunately, there was a brown-paper-bag-style bug here as
we continue to release these buffers after the iteration, and thus we
have essentially gained nothing.

Fix this performance issue by not releasing those buffers on iteration
anymore, where we instead rely on `merged_iter_close()` to release the
buffers for us.

Using `git show-ref --quiet` in a repository with ~350k refs this leads
to a significant drop in allocations. Before:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated

After:

    HEAP SUMMARY:
        in use at exit: 21,163 bytes in 193 blocks
      total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/merged.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/reftable/merged.c b/reftable/merged.c
index 556bb5c556..a28bb99aaf 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -128,8 +128,6 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 
 done:
 	reftable_record_release(&entry.rec);
-	strbuf_release(&mi->entry_key);
-	strbuf_release(&mi->key);
 	return err;
 }
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox