Git development
 help / color / mirror / Atom feed
* Re: [PATCH 0/7] clone: fix init of refdb with wrong object format
From: brian m. carlson @ 2023-12-11 22:17 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, git
In-Reply-To: <ZXcrpGQhH121AQ7y@tanuki>

[-- Attachment #1: Type: text/plain, Size: 2025 bytes --]

On 2023-12-11 at 15:32:52, Patrick Steinhardt wrote:
> On Mon, Dec 11, 2023 at 06:57:25AM -0800, Junio C Hamano wrote:
> > We have "no way to do so"?  We have "not done so"?
> 
> We have not done so until now, and we have no easy way to change this on
> the server-side as the server is not controlled by us in the first
> place. That leaves two options I can think of:
> 
>   - Try harder on the client-side, e.g. by trying to download the
>     gitconfig as you propose further down. I wonder whether admins would
>     typically block access to the config, but doubt they do.
> 
>   - Change the format of `info/refs` to include the hash format, as this
>     file _is_ controlled by us on the server-side. Doesn't help though
>     in an empty repository, where the file is likely to never have been
>     generated in the first place.
> 
> So it seems like downloading the gitconfig is the only viable option
> that I can think of right now.

I mean, we can add an `info/capabilities` file with capabilities and
assume the repository is SHA-1 without it.  I'm fine with that approach
as well, and it can be implemented as part of `git update-server-info`
pretty easily.

But yes, absent that approach or parsing the config file, we'll have to
just use the default settings.

> > The simplest "fix" might be to leave what happens in this narrow
> > case (i.e. cloning over dumb HTTP from an empty repository)
> > undefined by not testing (or not insisting on one particular
> > outcome), but ...
> 
> I would be fine with that outcome, as well. It's not like the current
> behaviour is correct in all cases either. The only benefit of that
> behaviour is that a user can in fact work around the broken cases by
> setting `GIT_HASH_DEFAULT` to the correct hash, and that benefit would
> be retained by the diff I sent that made remote-curl.c aware of this
> environment variable.

That would also be fine with me.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* Bug in SVN.pm
From: Daniel Ducharme @ 2023-12-11 22:51 UTC (permalink / raw)
  To: git@vger.kernel.org

Good afternoon,

Not sure if this is maintained as part of GIT, but the bug is located in the file: C:\Program Files\Git\mingw64\share\perl5\Git\SVN.pm

sub parse_svn_date requires both the month, day, hour, and minute to be 2 digits long and fails on 2007-3-12T17:46:4.000000Z as an example due to the regex. Suggestion is to make the regex instead /^(\d{4})\-(\d\d?)\-(\d\d?)T (\d\d?)\:(\d\d?)\:(\d\d?)\.\d*Z$/x)

I have found this data to be present in a SVN repository converted off of VSS while trying to take some old VSS repos to git through SVN, not sure if standard SVN would have allowed these date patterns, but they should be valid. The above regex also contains a fix for single digit minute and second as I also ran into that as well.

I have fixed my local copy so I am good for my project, but it took me a couple of hours to find and fix and figured this may help someone else so I wanted to get it reported. I am on git version 2.43.0.windows.1.

Daniel   Ducharme, Ph.D., RICA l  Director of Software Development
Catalis Tax and CAMA, Inc.
O: 781.476.2012
M: 401.743.5853
E: dducharme@catalisgov.com
 catalisgov.com


 


^ permalink raw reply

* Re: [PATCH 5/7] submodule: handle NULL value when parsing submodule.*.branch
From: Jeff King @ 2023-12-12  0:46 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Carlos Andrés Ramírez Cataño
In-Reply-To: <ZXF--AxCOOOjyOMc@tanuki>

On Thu, Dec 07, 2023 at 09:14:48AM +0100, Patrick Steinhardt wrote:

> > diff --git a/submodule-config.c b/submodule-config.c
> > index 6a48fd12f6..f4dd482abc 100644
> > --- a/submodule-config.c
> > +++ b/submodule-config.c
> > @@ -516,7 +516,9 @@ static int parse_config(const char *var, const char *value,
> >  			submodule->recommend_shallow =
> >  				git_config_bool(var, value);
> >  	} else if (!strcmp(item.buf, "branch")) {
> > -		if (!me->overwrite && submodule->branch)
> > +		if (!value)
> > +			ret = config_error_nonbool(var);
> > +		else if (!me->overwrite && submodule->branch)
> >  			warn_multiple_config(me->treeish_name, submodule->name,
> >  					     "branch");
> >  		else {
> 
> I was about to say that I'd rather expect us to handle this gracefully
> so that Git doesn't die when parsing an invalid gitmodules file. But
> there are other cases where we already fail in the same way, so this
> looks good to me.

We're just returning the error here, so it's really up to the caller to
decide what to do. The config API has an "error_action" field in the
options struct. By default for reading from a blob, this will propagate
the error, and I think that's what we use in most of the submodule code.

For the code in fsck which looks at gitmodules, it suppresses the error
text from the config API (in favor of its own fsck-specific message).
Of course it does not suppress the error() from config_error_nonbool,
which writes straight to stderr. So there may be some room for
improvement, but I doubt anybody cares too much in practice if fsck is a
little chatty when it sees breakage.

-Peff

^ permalink raw reply

* Re: [PATCH 0/7] fix segfaults with implicit-bool config
From: Jeff King @ 2023-12-12  0:52 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Carlos Andrés Ramírez Cataño
In-Reply-To: <ZXF-7AMZ_SBltplk@tanuki>

On Thu, Dec 07, 2023 at 09:14:36AM +0100, Patrick Steinhardt wrote:

> Thanks for working on this topic! I've left a couple of comments, most
> of which are about whether we should retain previous behaviour where we
> generate a warning instead of raising an error for unknown values.

Thanks for taking a look. I see what you're saying about the warnings,
but IMHO it's not worth the extra complexity. Returning early means the
existing code can proceed without worrying about NULLs. Though I suppose
we could have a "warn_error_nonbool()" which issues a warning and
returns 0.

Still, the "return config_error_nonbool()" pattern is pretty
well-established and used for most options. I would go so far as to say
the ones that warn for invalid values are the odd ones out, and probably
should be returning errors as well (though doing so now may not be worth
the trouble and risk of annoyance).

And certainly there should be no regressions in this series; every case
is currently a segfault, so returning an error is a strict improvement.
I'd just as soon stay strict there, as it's easier to loosen later if we
choose than to tighten.

-Peff

^ permalink raw reply

* Re: [PATCH 1/7] config: handle NULL value when parsing non-bools
From: Jeff King @ 2023-12-12  0:58 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Carlos Andrés Ramírez Cataño
In-Reply-To: <ZXF-8iNH0qaJSVl9@tanuki>

On Thu, Dec 07, 2023 at 09:14:42AM +0100, Patrick Steinhardt wrote:

> >  	if (!strcmp(k, "uploadpack.blobpackfileuri")) {
> [...]
> This isn't part of the diff and not a new issue, but why don't we
> `return 0` when parsing this config correctly? We fall through to
> `git_default_config()` even if we've successfully parsed the config key,
> which seems like a bug to me.

I don't think it's a functional bug, but merely a pessimization. We can
return early if we know we've handled the option, but the rest of the
code would simply fail to match it. So we are just wasting a few strcmp
calls (and an unknown key already wastes the same number).

So I think it is a good practice to return, but not really a bug if we
don't.

> >  	if (!strcmp(var, "core.checkstat")) {
> > +		if (!value)
> > +			return config_error_nonbool(var);
> >  		if (!strcasecmp(value, "default"))
> >  			check_stat = 1;
> >  		else if (!strcasecmp(value, "minimal"))
> 
> We would ignore `true` here, so should we ignore implicit `true`, as
> well?

IMHO the lack of a final "else" in the strcasecmp if-cascade is a bug
(and I sent a fix as part of the "config fixes on top" series). Even if
we want to leave it for historical reasons, I think it's still worth
returning an error for the NULL case (since we know it would have
segfaulted previously).

(I snipped the rest of your mail, as I think my response to the cover
letter covers the general discussion).

-Peff

^ permalink raw reply

* Re: [PATCH] setup: recognize bare repositories with packed-refs
From: Jeff King @ 2023-12-12  1:22 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Adam Majer, git
In-Reply-To: <ZXOF75NwxI187QDQ@nand.local>

On Fri, Dec 08, 2023 at 04:09:03PM -0500, Taylor Blau wrote:

> > I dunno. I am skeptical that there are millions of these. Who really
> > wants to embed bare git repos except for projects related to Git itself,
> > which want test vectors? Is there a use case I'm missing?
> 
> Just picking on GitHub as an example, my copy has a fair number of
> embedded bare repositories:
> 
>     $ find . -mindepth 2 -type d -name '*.git' | wc -l
>     279
> 
> That might be an unfair example in general, since GitHub probably has a
> greater need to embed bare repositories than most other projects. But I
> think that we shouldn't make our decision here based on volume of
> embedded bare repositories, but rather on the number of projects which
> have >1 embedded bare repository.

Right, I meant "I am skeptical there are a lot of projects that have
embedded repositories". It is useful if your project is related to
working on Git itself and you store your test vectors that way. So
github.git is not alone there (there is libgit2, other forges, and so
on). But I don't think it is representative in general.

> Perhaps I'm over-estimating how difficult this transition would be to
> impose on users. But it does make me very leery to make this kind of a
> change without having a better sense of how many of them exist in the
> wild.

Just to be clear: I am not proposing any transition here. It is already
the case that your "refs/" directory is necessary for Git to recognize
the bare repo, and you risk committing a broken state if you have no
loose refs in it.

There's been a proposal elsewhere to require extra steps to recognize an
embedded bare repo. Which I agree will be a pain for folks who use them,
but may be worth it for the security benefit. But here I was only saying
that _if_ we do that other change, then adding extra steps might not be
too bad on top. :)

(BTW, I think libgit2 already faces this problem, because it wants
non-bare repos; so there is some magic where it stores ".gitted"
directories, and then renames them on the fly).

> Searching just on GitHub for `path:**/*.git/config` [^1], it looks like
> there are ~1,400 results. That provides us an upper-bound on the number
> of projects which have embedded bare repositories, so perhaps I really
> am overestimating the burden we'd be imposing on other projects.

Thanks, that's an interesting number, and matches my intuition.

Of course it's not a true upper bound anyway. It wouldn't count private
projects (though maybe it hits github.git in your case), not to mention
stuff that isn't hosted on GitHub.

-Peff

^ permalink raw reply

* What's cooking in git.git (Dec 2023, #02; Mon, 11)
From: Junio C Hamano @ 2023-12-12  1:23 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking in my tree.  Commits
prefixed with '+' are in 'next' (being in 'next' is a sign that a
topic is stable enough to be used and are candidate to be in a
future release).  Commits prefixed with '-' are only in 'seen', and
aren't considered "accepted" at all and may be annotated with an URL
to a message that raises issues but they are no means exhaustive.  A
topic without enough support may be discarded after a long period of
no activity (of course they can be resubmit when new interests
arise).

The 'maint' branch now points at the maintenance track of Git 2.43,
which was released earlier in the month, and the tip of 'next' has
been rewound and rebuilt on top of Git 2.43.  I am planning to start
ejecting topics that have been in the "stalled" state for too long.

The RelNotes symbolic link says we are now working towards Git 2.44.
It may not be a bad idea to reflect on what technical debt and UI
warts we have accumulated so far to see if we have enough of them to
start planning for a breaking Git 3.0 release (or, of course, keep
incrementally improve the system, which is much more preferrable---
continuity and stability is good).

Copies of the source code to Git live in many repositories, and the
following is a list of the ones I push into or their mirrors.  Some
repositories have only a subset of branches.

With maint, master, next, seen, todo:

	git://git.kernel.org/pub/scm/git/git.git/
	git://repo.or.cz/alt-git.git/
	https://kernel.googlesource.com/pub/scm/git/git/
	https://github.com/git/git/
	https://gitlab.com/git-vcs/git/

With all the integration branches and topics broken out:

	https://github.com/gitster/git/

Even though the preformatted documentation in HTML and man format
are not sources, they are published in these repositories for
convenience (replace "htmldocs" with "manpages" for the manual
pages):

	git://git.kernel.org/pub/scm/git/git-htmldocs.git/
	https://github.com/gitster/git-htmldocs.git/

Release tarballs are available at:

	https://www.kernel.org/pub/software/scm/git/

--------------------------------------------------
[Graduated to 'master']

* ak/rebase-autosquash (2023-11-16) 3 commits
  (merged to 'next' on 2023-11-17 at 3ed6e79445)
 + rebase: rewrite --(no-)autosquash documentation
 + rebase: support --autosquash without -i
 + rebase: fully ignore rebase.autoSquash without -i

 "git rebase --autosquash" is now enabled for non-interactive rebase,
 but it is still incompatible with the apply backend.
 source: <20231114214339.10925-1-andy.koppe@gmail.com>


* jk/chunk-bounds-more (2023-11-09) 9 commits
  (merged to 'next' on 2023-11-13 at 3df4b18bea)
 + commit-graph: mark chunk error messages for translation
 + commit-graph: drop verify_commit_graph_lite()
 + commit-graph: check order while reading fanout chunk
 + commit-graph: use fanout value for graph size
 + commit-graph: abort as soon as we see a bogus chunk
 + commit-graph: clarify missing-chunk error messages
 + commit-graph: drop redundant call to "lite" verification
 + midx: check consistency of fanout table
 + commit-graph: handle overflow in chunk_size checks
 (this branch is used by tb/pair-chunk-expect.)

 Code clean-up for jk/chunk-bounds topic.
 source: <20231109070310.GA2697602@coredump.intra.peff.net>


* js/ci-discard-prove-state (2023-11-14) 1 commit
  (merged to 'next' on 2023-11-14 at fade3ba143)
 + ci: avoid running the test suite _twice_
 (this branch uses ps/ci-gitlab.)

 The way CI testing used "prove" could lead to running the test
 suite twice needlessly, which has been corrected.
 source: <pull.1613.git.1699894837844.gitgitgadget@gmail.com>


* js/doc-unit-tests (2023-11-10) 3 commits
  (merged to 'next' on 2023-11-10 at 7d00ffd06b)
 + ci: run unit tests in CI
 + unit tests: add TAP unit test framework
 + unit tests: add a project plan document
 (this branch is used by js/doc-unit-tests-with-cmake.)

 Process to add some form of low-level unit tests has started.
 source: <cover.1699555664.git.steadmon@google.com>


* js/doc-unit-tests-with-cmake (2023-11-10) 7 commits
  (merged to 'next' on 2023-11-10 at b4503c9c8c)
 + cmake: handle also unit tests
 + cmake: use test names instead of full paths
 + cmake: fix typo in variable name
 + artifacts-tar: when including `.dll` files, don't forget the unit-tests
 + unit-tests: do show relative file paths
 + unit-tests: do not mistake `.pdb` files for being executable
 + cmake: also build unit tests
 (this branch uses js/doc-unit-tests.)

 Update the base topic to work with CMake builds.
 source: <pull.1579.v3.git.1695640836.gitgitgadget@gmail.com>


* jw/git-add-attr-pathspec (2023-11-04) 1 commit
  (merged to 'next' on 2023-11-13 at b61be94e4d)
 + attr: enable attr pathspec magic for git-add and git-stash

 "git add" and "git stash" learned to support the ":(attr:...)"
 magic pathspec.
 source: <20231103163449.1578841-1-jojwang@google.com>


* ps/ban-a-or-o-operator-with-test (2023-11-11) 4 commits
  (merged to 'next' on 2023-11-14 at d84471baab)
 + Makefile: stop using `test -o` when unlinking duplicate executables
 + contrib/subtree: convert subtree type check to use case statement
 + contrib/subtree: stop using `-o` to test for number of args
 + global: convert trivial usages of `test <expr> -a/-o <expr>`

 Test and shell scripts clean-up.
 source: <cover.1699609940.git.ps@pks.im>


* ps/ci-gitlab (2023-11-09) 8 commits
  (merged to 'next' on 2023-11-10 at ea7ed67945)
 + ci: add support for GitLab CI
 + ci: install test dependencies for linux-musl
 + ci: squelch warnings when testing with unusable Git repo
 + ci: unify setup of some environment variables
 + ci: split out logic to set up failed test artifacts
 + ci: group installation of Docker dependencies
 + ci: make grouping setup more generic
 + ci: reorder definitions for grouping functions
 (this branch is used by js/ci-discard-prove-state.)

 Add support for GitLab CI.
 source: <cover.1699514143.git.ps@pks.im>


* ps/httpd-tests-on-nixos (2023-11-11) 3 commits
  (merged to 'next' on 2023-11-13 at 81bd6f5334)
 + t9164: fix inability to find basename(1) in Subversion hooks
 + t/lib-httpd: stop using legacy crypt(3) for authentication
 + t/lib-httpd: dynamically detect httpd and modules path

 Portability tweak.
 source: <cover.1699596457.git.ps@pks.im>


* ps/ref-tests-update (2023-11-03) 10 commits
  (merged to 'next' on 2023-11-13 at dc26e55d6f)
 + t: mark several tests that assume the files backend with REFFILES
 + t7900: assert the absence of refs via git-for-each-ref(1)
 + t7300: assert exact states of repo
 + t4207: delete replace references via git-update-ref(1)
 + t1450: convert tests to remove worktrees via git-worktree(1)
 + t: convert tests to not access reflog via the filesystem
 + t: convert tests to not access symrefs via the filesystem
 + t: convert tests to not write references via the filesystem
 + t: allow skipping expected object ID in `ref-store update-ref`
 + Merge branch 'ps/show-ref' into ps/ref-tests-update

 Update ref-related tests.
 source: <cover.1698914571.git.ps@pks.im>


* ss/format-patch-use-encode-headers-for-cover-letter (2023-11-10) 1 commit
  (merged to 'next' on 2023-11-14 at 1a4bd59e15)
 + format-patch: fix ignored encode_email_headers for cover letter

 "git format-patch --encode-email-headers" ignored the option when
 preparing the cover letter, which has been corrected.
 source: <20231109111950.387219-1-contact@emersion.fr>


* tz/send-email-negatable-options (2023-11-17) 2 commits
  (merged to 'next' on 2023-11-17 at f09e533e43)
 + send-email: avoid duplicate specification warnings
 + perl: bump the required Perl version to 5.8.1 from 5.8.0

 Newer versions of Getopt::Long started giving warnings against our
 (ab)use of it in "git send-email".  Bump the minimum version
 requirement for Perl to 5.8.1 (from September 2002) to allow
 simplifying our implementation.
 source: <20231116193014.470420-1-tmz@pobox.com>


* vd/for-each-ref-unsorted-optimization (2023-11-16) 10 commits
  (merged to 'next' on 2023-11-17 at ff99420bf6)
 + t/perf: add perf tests for for-each-ref
 + ref-filter.c: use peeled tag for '*' format fields
 + for-each-ref: clean up documentation of --format
 + ref-filter.c: filter & format refs in the same callback
 + ref-filter.c: refactor to create common helper functions
 + ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
 + ref-filter.h: add functions for filter/format & format-only
 + ref-filter.h: move contains caches into filter
 + ref-filter.h: add max_count and omit_empty to ref_format
 + ref-filter.c: really don't sort when using --no-sort

 "git for-each-ref --no-sort" still sorted the refs alphabetically
 which paid non-trivial cost.  It has been redefined to show output
 in an unspecified order, to allow certain optimizations to take
 advantage of.
 source: <pull.1609.v2.git.1699991638.gitgitgadget@gmail.com>

--------------------------------------------------
[New Topics]

* jc/bisect-doc (2023-12-09) 1 commit
 - bisect: document "terms" subcommand more fully

 Doc update.

 Needs review.
 source: <xmqqzfyjmk02.fsf@gitster.g>


* rs/show-ref-incompatible-options (2023-12-11) 1 commit
 - show-ref: use die_for_incompatible_opt3()

 source: <e5304253-3347-4900-bbf2-d3c6ee3fb976@web.de>

--------------------------------------------------
[Stalled]

* pw/rebase-sigint (2023-09-07) 1 commit
 - rebase -i: ignore signals when forking subprocesses

 If the commit log editor or other external programs (spawned via
 "exec" insn in the todo list) receive internactive signal during
 "git rebase -i", it caused not just the spawned program but the
 "Git" process that spawned them, which is often not what the end
 user intended.  "git" learned to ignore SIGINT and SIGQUIT while
 waiting for these subprocesses.

 Expecting a reroll.
 cf. <12c956ea-330d-4441-937f-7885ab519e26@gmail.com>
 source: <pull.1581.git.1694080982621.gitgitgadget@gmail.com>


* tk/cherry-pick-sequence-requires-clean-worktree (2023-06-01) 1 commit
 - cherry-pick: refuse cherry-pick sequence if index is dirty

 "git cherry-pick A" that replays a single commit stopped before
 clobbering local modification, but "git cherry-pick A..B" did not,
 which has been corrected.

 Expecting a reroll.
 cf. <999f12b2-38d6-f446-e763-4985116ad37d@gmail.com>
 source: <pull.1535.v2.git.1685264889088.gitgitgadget@gmail.com>


* jc/diff-cached-fsmonitor-fix (2023-09-15) 3 commits
 - diff-lib: fix check_removed() when fsmonitor is active
 - Merge branch 'jc/fake-lstat' into jc/diff-cached-fsmonitor-fix
 - Merge branch 'js/diff-cached-fsmonitor-fix' into jc/diff-cached-fsmonitor-fix
 (this branch uses jc/fake-lstat.)

 The optimization based on fsmonitor in the "diff --cached"
 codepath is resurrected with the "fake-lstat" introduced earlier.

 It is unknown if the optimization is worth resurrecting, but in case...
 source: <xmqqr0n0h0tw.fsf@gitster.g>

--------------------------------------------------
[Cooking]

* jp/use-diff-index-in-pre-commit-sample (2023-12-03) 1 commit
 - hooks--pre-commit: detect non-ASCII when renaming

 The sample pre-commit hook that tries to catch introduction of new
 paths that use potentially non-portable characters did not notice
 an existing path getting renamed to such a problematic path, when
 rename detection was enabled.

 Will merge to 'next'.
 source: <pull.1291.v2.git.git.1701360836307.gitgitgadget@gmail.com>


* mk/doc-gitfile-more (2023-12-03) 1 commit
 - doc: make the gitfile syntax easier to discover

 Doc update.

 Will merge to 'next'.
 source: <20231128065558.1061206-1-mk+copyleft@pimpmybyte.de>


* ps/ref-tests-update-more (2023-12-03) 10 commits
 - t6301: write invalid object ID via `test-tool ref-store`
 - t5551: stop writing packed-refs directly
 - t5401: speed up creation of many branches
 - t4013: simplify magic parsing and drop "failure"
 - t3310: stop checking for reference existence via `test -f`
 - t1417: make `reflog --updateref` tests backend agnostic
 - t1410: use test-tool to create empty reflog
 - t1401: stop treating FETCH_HEAD as real reference
 - t1400: split up generic reflog tests from the reffile-specific ones
 - t0410: mark tests to require the reffiles backend

 Tests update.

 Will merge to 'next'.
 source: <cover.1701242407.git.ps@pks.im>


* sh/completion-with-reftable (2023-12-03) 2 commits
 - completion: stop checking for reference existence via `test -f`
 - completion: refactor existence checks for special refs

 Command line completion script (in contrib/) learned to work better
 with the reftable backend.

 Expecting a reroll.
 source: <20231130202404.89791-1-stanhu@gmail.com>


* en/header-cleanup (2023-12-03) 12 commits
 - treewide: remove unnecessary includes in source files
 - treewide: add direct includes currently only pulled in transitively
 - trace2/tr2_tls.h: remove unnecessary include
 - submodule-config.h: remove unnecessary include
 - pkt-line.h: remove unnecessary include
 - line-log.h: remove unnecessary include
 - http.h: remove unnecessary include
 - fsmonitor--daemon.h: remove unnecessary includes
 - blame.h: remove unnecessary includes
 - archive.h: remove unnecessary include
 - treewide: remove unnecessary includes in source files
 - treewide: remove unnecessary includes from header files

 Remove unused header "#include".

 Has a few interactions with topics in flight.
 source: <pull.1617.git.1701585682.gitgitgadget@gmail.com>


* jc/revision-parse-int (2023-12-09) 1 commit
 - revision: parse integer arguments to --max-count, --skip, etc., more carefully

 The command line parser for the "log" family of commands was too
 loose when parsing certain numbers, e.g., silently ignoring the
 extra 'q' in "git log -n 1q" without complaining, which has been
 tightened up.

 Will merge to 'next'.
 source: <xmqq5y181fx0.fsf_-_@gitster.g>


* jk/bisect-reset-fix (2023-12-09) 1 commit
 - bisect: always clean on reset

 "git bisect reset" has been taught to clean up state files and refs
 even when BISECT_START file is gone.

 Will merge to 'next'.
 source: <20231207065341.GA778781@coredump.intra.peff.net>


* jk/implicit-true (2023-12-09) 7 commits
 - fsck: handle NULL value when parsing message config
 - trailer: handle NULL value when parsing trailer-specific config
 - submodule: handle NULL value when parsing submodule.*.branch
 - help: handle NULL value for alias.* config
 - trace2: handle NULL values in tr2_sysenv config callback
 - setup: handle NULL value when parsing extensions
 - config: handle NULL value when parsing non-bools
 (this branch is used by jk/config-cleanup.)

 Some codepaths did not correctly parse configuration variables
 specified with valueless "true", which has been corrected.

 Will merge to 'next'.
 source: <20231207071030.GA1275835@coredump.intra.peff.net>


* jk/config-cleanup (2023-12-09) 9 commits
 - sequencer: simplify away extra git_config_string() call
 - gpg-interface: drop pointless config_error_nonbool() checks
 - push: drop confusing configset/callback redundancy
 - config: use git_config_string() for core.checkRoundTripEncoding
 - diff: give more detailed messages for bogus diff.* config
 - config: use config_error_nonbool() instead of custom messages
 - imap-send: don't use git_die_config() inside callback
 - git_xmerge_config(): prefer error() to die()
 - config: reject bogus values for core.checkstat
 (this branch uses jk/implicit-true.)

 Code clean-up around use of configuration variables.

 Will merge to 'next'.
 source: <20231207071030.GA1275835@coredump.intra.peff.net>
 source: <20231207072338.GA1277727@coredump.intra.peff.net>


* jk/end-of-options (2023-12-09) 1 commit
 - parse-options: decouple "--end-of-options" and "--"

 "git log --end-of-options --rev -- --path" learned to interpret
 "--rev" as a rev, and "--path" as a path, as expected.

 Will merge to 'next'.
 source: <20231206222145.GA136253@coredump.intra.peff.net>


* ps/clone-into-reftable-repository (2023-12-09) 7 commits
 - builtin/clone: create the refdb with the correct object format
 - builtin/clone: skip reading HEAD when retrieving remote
 - builtin/clone: set up sparse checkout later
 - builtin/clone: fix bundle URIs with mismatching object formats
 - remote-curl: rediscover repository when fetching refs
 - setup: allow skipping creation of the refdb
 - setup: extract function to create the refdb

 "git clone" has been prepared to allow cloning a repository with
 non-default hash function into a repository that uses the reftable
 backend.

 Fails t5550 under SHA-256 mode.
 cf. <xmqq7clmn3w1.fsf@gitster.g>
 source: <cover.1701863960.git.ps@pks.im>


* rs/incompatible-options-messages (2023-12-09) 7 commits
 - worktree: simplify incompatibility message for --orphan and commit-ish
 - worktree: standardize incompatibility messages
 - clean: factorize incompatibility message
 - revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
 - revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
 - repack: use die_for_incompatible_opt3() for -A/-k/--cruft
 - push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror

 Clean-up code that handles combinations of incompatible options.

 Will merge to 'next'.
 source: <20231206115215.94467-1-l.s.r@web.de>


* ad/merge-file-diff-algo (2023-11-22) 1 commit
  (merged to 'next' on 2023-12-11 at ab43a54c43)
 + merge-file: add --diff-algorithm option

 "git merge-file" learned to take the "--diff-algorithm" option to
 use algorithm different from the default "myers" diff.

 Will merge to 'master'.
 source: <pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com>


* ak/p4-initial-empty-commits (2023-11-23) 1 commit
 - git-p4: fix fast import when empty commit is first

 Expecting a reroll.
 source: <pull.1609.git.git.1700639764041.gitgitgadget@gmail.com>


* jc/checkout-B-branch-in-use (2023-12-09) 3 commits
 - fixup! checkout: forbid "-B <branch>" from touching a branch used elsewhere
 - checkout: forbid "-B <branch>" from touching a branch used elsewhere
 - checkout: refactor die_if_checked_out() caller

 "git checkout -B <branch> [<start-point>]" allowed a branch that is
 in use in another worktree to be updated and checked out, which
 might be a bit unexpected.  The rule has been tightened, which is a
 breaking change.  "--ignore-other-worktrees" option is required to
 unbreak you, if you are used to the current behaviour that "-B"
 overrides the safety.

 Needs review.
 source: <xmqqjzq9cl70.fsf@gitster.g>


* jh/trace2-redact-auth (2023-11-23) 4 commits
  (merged to 'next' on 2023-12-11 at 7e679a4c4d)
 + t0212: test URL redacting in EVENT format
 + t0211: test URL redacting in PERF format
 + trace2: redact passwords from https:// URLs by default
 + trace2: fix signature of trace2_def_param() macro

 trace2 streams used to record the URLs that potentially embed
 authentication material, which has been corrected.

 Will merge to 'master'.
 source: <pull.1616.git.1700680717.gitgitgadget@gmail.com>


* ps/commit-graph-less-paranoid (2023-11-26) 1 commit
  (merged to 'next' on 2023-12-11 at 618bd08fa1)
 + commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default

 Earlier we stopped relying on commit-graph that (still) records
 information about commits that are lost from the object store,
 which has negative performance implications.  The default has been
 flipped to disable this pessimization.

 Will merge to 'master'.
 source: <17e08289cd59d20de0de9b4e18f5e6bf77987351.1700823746.git.ps@pks.im>


* ps/reftable-fixes (2023-12-11) 11 commits
 - reftable/block: reuse buffer to compute record keys
 - reftable/block: introduce macro to initialize `struct block_iter`
 - reftable/merged: reuse buffer to compute record keys
 - reftable/stack: fix use of unseeded randomness
 - reftable/stack: fix stale lock when dying
 - reftable/stack: reuse buffers when reloading stack
 - reftable/stack: perform auto-compaction with transactional interface
 - reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
 - reftable: handle interrupted writes
 - reftable: handle interrupted reads
 - reftable: wrap EXPECT macros in do/while

 Bunch of small fix-ups to the reftable code.

 Will merge to 'next'?
 source: <cover.1702285387.git.ps@pks.im>


* en/complete-sparse-checkout (2023-12-03) 4 commits
 - completion: avoid user confusion in non-cone mode
 - completion: avoid misleading completions in cone mode
 - completion: fix logic for determining whether cone mode is active
 - completion: squelch stray errors in sparse-checkout completion

 Command line completion (in contrib/) learned to complete path
 arguments to the "add/set" subcommands of "git sparse-checkout"
 better.

 Will merge to 'next'.
 source: <pull.1349.v3.git.1701583024.gitgitgadget@gmail.com>


* jb/reflog-expire-delete-dry-run-options (2023-11-26) 1 commit
  (merged to 'next' on 2023-12-11 at c7e9846963)
 + builtin/reflog.c: fix dry-run option short name

 Command line parsing fix for "git reflog".

 Will merge to 'master'.
 source: <20231126000514.85509-1-josh@brob.st>


* jc/orphan-unborn (2023-11-24) 2 commits
 - orphan/unborn: fix use of 'orphan' in end-user facing messages
 - orphan/unborn: add to the glossary and use them consistently

 Doc updates to clarify what an "unborn branch" means.

 Comments?
 source: <xmqq4jhb977x.fsf@gitster.g>


* rs/column-leakfix (2023-11-27) 1 commit
  (merged to 'next' on 2023-12-11 at 9ac1707337)
 + column: release strbuf and string_list after use

 Leakfix.

 Will merge to 'master'.
 source: <f087137d-a5aa-487e-a1cb-0ad7117b38ed@web.de>


* rs/i18n-cannot-be-used-together (2023-11-27) 1 commit
  (merged to 'next' on 2023-12-11 at a44e1c84c9)
 + i18n: factorize even more 'incompatible options' messages

 Clean-up code that handles combinations of incompatible options.

 Will merge to 'master'.
 source: <e6eb12e4-bb63-473c-9c2f-965a4d5981ad@web.de>


* ac/fuzz-show-date (2023-11-20) 1 commit
  (merged to 'next' on 2023-12-11 at f36795a896)
 + fuzz: add new oss-fuzz fuzzer for date.c / date.h

 Subject approxidate() and show_date() machinery to OSS-Fuzz.

 Will merge to 'master'.
 source: <pull.1612.v4.git.1700243267653.gitgitgadget@gmail.com>


* js/packfile-h-typofix (2023-11-20) 1 commit
  (merged to 'next' on 2023-12-11 at 328399439a)
 + packfile.c: fix a typo in `each_file_in_pack_dir_fn()`'s declaration

 Typofix.

 Will merge to 'master'.
 source: <pull.1614.git.1700226915859.gitgitgadget@gmail.com>


* jw/builtin-objectmode-attr (2023-12-09) 1 commit
 - attr: add builtin objectmode values support

 The builtin_objectmode attribute is populated for each path
 without adding anything in .gitattributes files, which would be
 useful in magic pathspec, e.g., ":(attr:builtin_objectmode=100755)"
 to limit to executables.

 Will merge to 'next'.
 source: <20231116054437.2343549-1-jojwang@google.com>


* ps/ref-deletion-updates (2023-11-17) 4 commits
  (merged to 'next' on 2023-12-11 at ca551a0c36)
 + refs: remove `delete_refs` callback from backends
 + refs: deduplicate code to delete references
 + refs/files: use transactions to delete references
 + t5510: ensure that the packed-refs file needs locking

 Simplify API implementation to delete references by eliminating
 duplication.

 Will merge to 'master'.
 source: <cover.1699951815.git.ps@pks.im>


* jx/fetch-atomic-error-message-fix (2023-10-19) 2 commits
 - fetch: no redundant error message for atomic fetch
 - t5574: test porcelain output of atomic fetch

 "git fetch --atomic" issued an unnecessary empty error message,
 which has been corrected.

 Expecting an update.
 cf. <ZTjQIrCgSANAT8wR@tanuki>
 source: <ced46baeb1c18b416b4b4cc947f498bea2910b1b.1697725898.git.zhiyou.jx@alibaba-inc.com>


* js/bugreport-in-the-same-minute (2023-10-16) 1 commit
 - bugreport: include +i in outfile suffix as needed

 Instead of auto-generating a filename that is already in use for
 output and fail the command, `git bugreport` learned to fuzz the
 filename to avoid collisions with existing files.

 Expecting a reroll.
 cf. <ZTtZ5CbIGETy1ucV.jacob@initialcommit.io>
 source: <20231016214045.146862-2-jacob@initialcommit.io>


* kh/t7900-cleanup (2023-10-17) 9 commits
 - t7900: fix register dependency
 - t7900: factor out packfile dependency
 - t7900: fix `print-args` dependency
 - t7900: fix `pfx` dependency
 - t7900: factor out common schedule setup
 - t7900: factor out inheritance test dependency
 - t7900: create commit so that branch is born
 - t7900: setup and tear down clones
 - t7900: remove register dependency

 Test clean-up.

 Perhaps discard?
 cf. <655ca147-c214-41be-919d-023c1b27b311@app.fastmail.com>
 source: <cover.1697319294.git.code@khaugsbakk.name>


* tb/merge-tree-write-pack (2023-10-23) 5 commits
 - builtin/merge-tree.c: implement support for `--write-pack`
 - bulk-checkin: introduce `index_tree_bulk_checkin_incore()`
 - bulk-checkin: introduce `index_blob_bulk_checkin_incore()`
 - bulk-checkin: generify `stream_blob_to_pack()` for arbitrary types
 - bulk-checkin: extract abstract `bulk_checkin_source`

 "git merge-tree" learned "--write-pack" to record its result
 without creating loose objects.

 Broken when an object created during a merge is needed to continue merge
 cf. <CABPp-BEfy9VOvimP9==ry_rZXu=metOQ8s=_-XiG_Pdx9c06Ww@mail.gmail.com>
 source: <cover.1698101088.git.me@ttaylorr.com>


* tb/pair-chunk-expect (2023-11-10) 8 commits
 - midx: read `OOFF` chunk with `pair_chunk_expect()`
 - midx: read `OIDL` chunk with `pair_chunk_expect()`
 - commit-graph: read `BIDX` chunk with `pair_chunk_expect()`
 - commit-graph: read `GDAT` chunk with `pair_chunk_expect()`
 - commit-graph: read `CDAT` chunk with `pair_chunk_expect()`
 - commit-graph: read `OIDL` chunk with `pair_chunk_expect()`
 - chunk-format: introduce `pair_chunk_expect()` helper
 - Merge branch 'jk/chunk-bounds-more' into HEAD

 Further code clean-up.

 Needs review.
 source: <cover.1699569246.git.me@ttaylorr.com>


* tb/path-filter-fix (2023-10-18) 17 commits
 - bloom: introduce `deinit_bloom_filters()`
 - commit-graph: reuse existing Bloom filters where possible
 - object.h: fix mis-aligned flag bits table
 - commit-graph: drop unnecessary `graph_read_bloom_data_context`
 - commit-graph.c: unconditionally load Bloom filters
 - bloom: prepare to discard incompatible Bloom filters
 - bloom: annotate filters with hash version
 - commit-graph: new filter ver. that fixes murmur3
 - repo-settings: introduce commitgraph.changedPathsVersion
 - t4216: test changed path filters with high bit paths
 - t/helper/test-read-graph: implement `bloom-filters` mode
 - bloom.h: make `load_bloom_filter_from_graph()` public
 - t/helper/test-read-graph.c: extract `dump_graph_info()`
 - gitformat-commit-graph: describe version 2 of BDAT
 - commit-graph: ensure Bloom filters are read with consistent settings
 - revision.c: consult Bloom filters for root commits
 - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`

 The Bloom filter used for path limited history traversal was broken
 on systems whose "char" is unsigned; update the implementation and
 bump the format version to 2.

 Needs (hopefully final and quick) review.
 source: <cover.1697653929.git.me@ttaylorr.com>


* cc/git-replay (2023-11-26) 14 commits
  (merged to 'next' on 2023-12-11 at 6f7d123578)
 + replay: stop assuming replayed branches do not diverge
 + replay: add --contained to rebase contained branches
 + replay: add --advance or 'cherry-pick' mode
 + replay: use standard revision ranges
 + replay: make it a minimal server side command
 + replay: remove HEAD related sanity check
 + replay: remove progress and info output
 + replay: add an important FIXME comment about gpg signing
 + replay: change rev walking options
 + replay: introduce pick_regular_commit()
 + replay: die() instead of failing assert()
 + replay: start using parse_options API
 + replay: introduce new builtin
 + t6429: remove switching aspects of fast-rebase

 Introduce "git replay", a tool meant on the server side without
 working tree to recreate a history.

 Will merge to 'master'.
 cf. <6bfe1541-54dd-ca6b-e930-94d3038060f1@gmx.de>
 source: <20231124111044.3426007-1-christian.couder@gmail.com>


* ak/color-decorate-symbols (2023-10-23) 7 commits
 - log: add color.decorate.pseudoref config variable
 - refs: exempt pseudorefs from pattern prefixing
 - refs: add pseudorefs array and iteration functions
 - log: add color.decorate.ref config variable
 - log: add color.decorate.symbol config variable
 - log: use designated inits for decoration_colors
 - config: restructure color.decorate documentation

 A new config for coloring.

 Needs review.
 source: <20231023221143.72489-1-andy.koppe@gmail.com>


* js/update-urls-in-doc-and-comment (2023-11-26) 4 commits
  (merged to 'next' on 2023-12-11 at 3cda3f2a03)
 + doc: refer to internet archive
 + doc: update links for andre-simon.de
 + doc: switch links to https
 + doc: update links to current pages

 Stale URLs have been updated to their current counterparts (or
 archive.org) and HTTP links are replaced with working HTTPS links.

 Will merge to 'master'.
 source: <pull.1589.v3.git.1700796916.gitgitgadget@gmail.com>


* la/trailer-cleanups (2023-10-20) 3 commits
 - trailer: use offsets for trailer_start/trailer_end
 - trailer: find the end of the log message
 - commit: ignore_non_trailer computes number of bytes to ignore

 Code clean-up.

 Comments?
 source: <pull.1563.v5.git.1697828495.gitgitgadget@gmail.com>


* eb/hash-transition (2023-10-02) 30 commits
 - t1016-compatObjectFormat: add tests to verify the conversion between objects
 - t1006: test oid compatibility with cat-file
 - t1006: rename sha1 to oid
 - test-lib: compute the compatibility hash so tests may use it
 - builtin/ls-tree: let the oid determine the output algorithm
 - object-file: handle compat objects in check_object_signature
 - tree-walk: init_tree_desc take an oid to get the hash algorithm
 - builtin/cat-file: let the oid determine the output algorithm
 - rev-parse: add an --output-object-format parameter
 - repository: implement extensions.compatObjectFormat
 - object-file: update object_info_extended to reencode objects
 - object-file-convert: convert commits that embed signed tags
 - object-file-convert: convert commit objects when writing
 - object-file-convert: don't leak when converting tag objects
 - object-file-convert: convert tag objects when writing
 - object-file-convert: add a function to convert trees between algorithms
 - object: factor out parse_mode out of fast-import and tree-walk into in object.h
 - cache: add a function to read an OID of a specific algorithm
 - tag: sign both hashes
 - commit: export add_header_signature to support handling signatures on tags
 - commit: convert mergetag before computing the signature of a commit
 - commit: write commits for both hashes
 - object-file: add a compat_oid_in parameter to write_object_file_flags
 - object-file: update the loose object map when writing loose objects
 - loose: compatibilty short name support
 - loose: add a mapping between SHA-1 and SHA-256 for loose objects
 - repository: add a compatibility hash algorithm
 - object-names: support input of oids in any supported hash
 - oid-array: teach oid-array to handle multiple kinds of oids
 - object-file-convert: stubs for converting from one object format to another

 Teach a repository to work with both SHA-1 and SHA-256 hash algorithms.

 Needs review.
 source: <878r8l929e.fsf@gmail.froward.int.ebiederm.org>


* jx/remote-archive-over-smart-http (2023-10-04) 4 commits
 - archive: support remote archive from stateless transport
 - transport-helper: call do_take_over() in connect_helper
 - transport-helper: call do_take_over() in process_connect
 - transport-helper: no connection restriction in connect_helper

 "git archive --remote=<remote>" learned to talk over the smart
 http (aka stateless) transport.

 Needs review.
 source: <cover.1696432593.git.zhiyou.jx@alibaba-inc.com>


* jx/sideband-chomp-newline-fix (2023-10-04) 3 commits
 - pkt-line: do not chomp newlines for sideband messages
 - pkt-line: memorize sideband fragment in reader
 - test-pkt-line: add option parser for unpack-sideband

 Sideband demultiplexer fixes.

 Needs review.
 source: <cover.1696425168.git.zhiyou.jx@alibaba-inc.com>


* js/config-parse (2023-09-21) 5 commits
 - config-parse: split library out of config.[c|h]
 - config.c: accept config_parse_options in git_config_from_stdin
 - config: report config parse errors using cb
 - config: split do_event() into start and flush operations
 - config: split out config_parse_options

 The parsing routines for the configuration files have been split
 into a separate file.

 Needs review.
 source: <cover.1695330852.git.steadmon@google.com>


* jc/fake-lstat (2023-09-15) 1 commit
 - cache: add fake_lstat()
 (this branch is used by jc/diff-cached-fsmonitor-fix.)

 A new helper to let us pretend that we called lstat() when we know
 our cache_entry is up-to-date via fsmonitor.

 Needs review.
 source: <xmqqcyykig1l.fsf@gitster.g>


* jc/rerere-cleanup (2023-08-25) 4 commits
 - rerere: modernize use of empty strbuf
 - rerere: try_merge() should use LL_MERGE_ERROR when it means an error
 - rerere: fix comment on handle_file() helper
 - rerere: simplify check_one_conflict() helper function

 Code clean-up.

 Not ready to be reviewed yet.
 source: <20230824205456.1231371-1-gitster@pobox.com>


* rj/status-bisect-while-rebase (2023-10-16) 1 commit
 - status: fix branch shown when not only bisecting

 "git status" is taught to show both the branch being bisected and
 being rebased when both are in effect at the same time.

 Needs review.
 source: <2e24ca9b-9c5f-f4df-b9f8-6574a714dfb2@gmail.com>

^ permalink raw reply

* Re: [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully
From: Jeff King @ 2023-12-12  1:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Britton Kerin
In-Reply-To: <xmqq5y181fx0.fsf_-_@gitster.g>

On Sat, Dec 09, 2023 at 07:35:23AM +0900, Junio C Hamano wrote:

> The "rev-list" and other commands in the "log" family, being the
> oldest part of the system, use their own custom argument parsers,
> and integer values of some options are parsed with atoi(), which
> allows a non-digit after the number (e.g., "1q") to be silently
> ignored.  As a natural consequence, an argument that does not begin
> with a digit (e.g., "q") silently becomes zero, too.
> 
> Switch to use strtol_i() and parse_timestamp() appropriately to
> catch bogus input.
> 
> Note that one may naïvely expect that --max-count, --skip, etc., to
> only take non-negative values, but we must allow them to also take
> negative values, as an escape hatch to countermand a limit set by an
> earlier option on the command line; the underlying variables are
> initialized to (-1) and "--max-count=-1", for example, is a
> legitimate way to reinitialize the limit.

This all looks pretty reasonable to me.

I couldn't help but think, though, that surely we have some helpers for
this already? But the closest seems to be git_parse_int(), which also
allows unit factors. I'm not sure if allowing "-n 1k" would be a feature
or a bug. ;)

I guess "strtol_i()" maybe is that helper already, though I did not even
know it existed. Looks like it goes back to 2007, and is seldom used. I
wonder if there are more spots that could benefit.

I don't think there is any such helper for timestamps, but the checks in
your parser look good (strtol_i() checks for overflow as we cast to int,
but I don't think we need to do the same here since timestamp_t and
parse_timestamp() should be matched).

-Peff

^ permalink raw reply

* Re: What's cooking in git.git (Dec 2023, #01; Sat, 9)
From: Jeff King @ 2023-12-12  1:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqa5qknnej.fsf@gitster.g>

On Fri, Dec 08, 2023 at 06:02:44PM -0800, Junio C Hamano wrote:

> * jk/end-of-options (2023-12-09) 1 commit
>  - parse-options: decouple "--end-of-options" and "--"
> 
>  "git log --end-of-options --rev -- --path" learned to interpret
>  "--rev" as a rev, and "--path" as a path, as expected.
> 
>  Will merge to 'next'.
>  source: <20231206222145.GA136253@coredump.intra.peff.net>

A minor correction here (since this will eventually go to the release
notes): "log --end-of-options --rev -- --path" always worked. It is "git
reset" that is fixed (along with "checkout" and many other programs).

-Peff

^ permalink raw reply

* Re: [PATCH 3/9] imap-send: don't use git_die_config() inside callback
From: Jeff King @ 2023-12-12  1:37 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Patrick Steinhardt, git
In-Reply-To: <ZXOfrKYsmOjOHGmj@nand.local>

On Fri, Dec 08, 2023 at 05:58:52PM -0500, Taylor Blau wrote:

> On Thu, Dec 07, 2023 at 09:57:55AM +0100, Patrick Steinhardt wrote:
> > On Thu, Dec 07, 2023 at 02:24:58AM -0500, Jeff King wrote:
> > [snip]
> > > diff --git a/imap-send.c b/imap-send.c
> > > index 996651e4f8..5b0fe4f95a 100644
> > > --- a/imap-send.c
> > > +++ b/imap-send.c
> > > @@ -1346,7 +1346,7 @@ static int git_imap_config(const char *var, const char *val,
> > >  		server.port = git_config_int(var, val, ctx->kvi);
> > >  	else if (!strcmp("imap.host", var)) {
> > >  		if (!val) {
> > > -			git_die_config("imap.host", "Missing value for 'imap.host'");
> > > +			return error("Missing value for 'imap.host'");
> >
> > Nit: while at it we might also mark this error for translation. Not
> > worth a reroll on its own though.
> 
> This string goes away entirely in the next patch, so I don't think we
> need to mark it here.

Right. It's a little confusing because it is converted along with some
other spots in the next patch. But in one of those other spots, we
earlier switched it (in patch 2) from die() to error(), and we _did_
mark it for translation as we did so.

I did it there because in patch 2 we touch multiple messages, and the
other ones don't end up as config_error_nonbool(), so we do want them
translated.

I'm not sure if there would have been an easier ordering to the series.
I could have pulled the "mark for translation" bits from patch 2 into
their own patch (after this one makes some of the messages go away), but
then I'd expect somebody would review patch 2 and say "why not mark them
for translation while we're here?". :)

-Peff

^ permalink raw reply

* Re: Bug in SVN.pm
From: Junio C Hamano @ 2023-12-12  1:43 UTC (permalink / raw)
  To: Daniel Ducharme; +Cc: git@vger.kernel.org
In-Reply-To: <DS7PR13MB46861D9FF40E2A340FD2C8D2CF8FA@DS7PR13MB4686.namprd13.prod.outlook.com>

Daniel Ducharme <dducharme@catalisgov.com> writes:

> sub parse_svn_date requires both the month, day, hour, and minute
> to be 2 digits long and fails on 2007-3-12T17:46:4.000000Z as an
> example due to the regex. Suggestion is to make the regex instead
> /^(\d{4})\-(\d\d?)\-(\d\d?)T (\d\d?)\:(\d\d?)\:(\d\d?)\.\d*Z$/x)
>
> I have found this data to be present in a SVN repository converted
> off of VSS while trying to take some old VSS repos to git through
> SVN, not sure if standard SVN would have allowed these date
> patterns, but they should be valid. The above regex also contains
> a fix for single digit minute and second as I also ran into that
> as well.

I do not think it is fair to call this a bug in SVN.pm, provided if
SVN wants to use ISO 8601 datetime format for its timestamps.  ISO
8601, IIUC, is fairly clear that month and day must be two-digit
strings, 0-filled to the left as needed, and I would not be
surprised if standard SVN rejected such bogus dates, and I do not
agree with your "they *SHOULD* be valid" statement.

Having said all that, it is good to be liberal in what we accept,
and I am sympathetic to "it would be nicer if they were accepted"
sentiment.  As long as there is no ambiguity, I would say that it
would be nicer if timestamps like "2007-3-12T17:1:2.0Z" were
accepted, and I would not be fundamentally opposed to a patch that
loosens the regex to do so.

Patches welcome ;-).

^ permalink raw reply

* Re: [PATCH v3 04/11] reftable/stack: verify that `reftable_stack_add()` uses auto-compaction
From: Patrick Steinhardt @ 2023-12-12  3:44 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Han-Wen Nienhuys, Jonathan Nieder, Eric Sunshine
In-Reply-To: <ZXdt17pN68tsmH1H@nand.local>

[-- Attachment #1: Type: text/plain, Size: 2240 bytes --]

On Mon, Dec 11, 2023 at 03:15:19PM -0500, Taylor Blau wrote:
> On Mon, Dec 11, 2023 at 10:07:42AM +0100, Patrick Steinhardt wrote:
> > While we have several tests that check whether we correctly perform
> > auto-compaction when manually calling `reftable_stack_auto_compact()`,
> > we don't have any tests that verify whether `reftable_stack_add()` does
> > call it automatically. Add one.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  reftable/stack_test.c | 49 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 49 insertions(+)
> >
> > diff --git a/reftable/stack_test.c b/reftable/stack_test.c
> > index 0644c8ad2e..52b4dc3b14 100644
> > --- a/reftable/stack_test.c
> > +++ b/reftable/stack_test.c
> > @@ -850,6 +850,54 @@ static void test_reftable_stack_auto_compaction(void)
> >  	clear_dir(dir);
> >  }
> >
> > +static void test_reftable_stack_add_performs_auto_compaction(void)
> > +{
> > +	struct reftable_write_options cfg = { 0 };
> > +	struct reftable_stack *st = NULL;
> > +	struct strbuf refname = STRBUF_INIT;
> > +	char *dir = get_tmp_dir(__LINE__);
> > +	int err, i, n = 20;
> > +
> > +	err = reftable_new_stack(&st, dir, cfg);
> > +	EXPECT_ERR(err);
> > +
> > +	for (i = 0; i <= n; i++) {
> > +		struct reftable_ref_record ref = {
> > +			.update_index = reftable_stack_next_update_index(st),
> > +			.value_type = REFTABLE_REF_SYMREF,
> > +			.value.symref = "master",
> > +		};
> > +
> > +		/*
> > +		 * Disable auto-compaction for all but the last runs. Like this
> > +		 * we can ensure that we indeed honor this setting and have
> > +		 * better control over when exactly auto compaction runs.
> > +		 */
> > +		st->disable_auto_compact = i != n;
> > +
> > +		strbuf_reset(&refname);
> > +		strbuf_addf(&refname, "branch-%04d", i);
> > +		ref.refname = refname.buf;
> 
> Does the reftable backend take ownership of the "refname" field? If so,
> then I think we'd want to use strbuf_detach() here to avoid a
> double-free() when you call strbuf_release() below.

No it doesn't. `reftable_stack_add()` will lock the stack and commit the
new table immediately, so there's no need to transfer ownership of
memory here.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v3 00/11] reftable: small set of fixes
From: Patrick Steinhardt @ 2023-12-12  3:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Han-Wen Nienhuys, Jonathan Nieder, Eric Sunshine
In-Reply-To: <ZXduGvCJIa25eldZ@nand.local>

[-- Attachment #1: Type: text/plain, Size: 961 bytes --]

On Mon, Dec 11, 2023 at 03:16:26PM -0500, Taylor Blau wrote:
> On Mon, Dec 11, 2023 at 10:07:25AM +0100, Patrick Steinhardt wrote:
> >  reftable/block.c          |  23 ++++----
> >  reftable/block.h          |   6 +++
> >  reftable/block_test.c     |   4 +-
> >  reftable/blocksource.c    |   2 +-
> >  reftable/iter.h           |   8 +--
> >  reftable/merged.c         |  31 +++++------
> >  reftable/merged.h         |   2 +
> >  reftable/reader.c         |   7 ++-
> >  reftable/readwrite_test.c |   6 +--
> >  reftable/stack.c          |  73 +++++++++++---------------
> >  reftable/stack_test.c     | 107 +++++++++++++++++++++++++++++++++++++-
> >  reftable/test_framework.h |  58 ++++++++++++---------
> >  12 files changed, 213 insertions(+), 114 deletions(-)
> >
> > Range-diff against v2:
> 
> I had one small question on the new version of the fourth patch, but
> otherwise this version LGTM.

Thanks for your review!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 0/7] fix segfaults with implicit-bool config
From: Patrick Steinhardt @ 2023-12-12  4:10 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Carlos Andrés Ramírez Cataño
In-Reply-To: <20231212005228.GB376323@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 1338 bytes --]

On Mon, Dec 11, 2023 at 07:52:28PM -0500, Jeff King wrote:
> On Thu, Dec 07, 2023 at 09:14:36AM +0100, Patrick Steinhardt wrote:
> 
> > Thanks for working on this topic! I've left a couple of comments, most
> > of which are about whether we should retain previous behaviour where we
> > generate a warning instead of raising an error for unknown values.
> 
> Thanks for taking a look. I see what you're saying about the warnings,
> but IMHO it's not worth the extra complexity. Returning early means the
> existing code can proceed without worrying about NULLs. Though I suppose
> we could have a "warn_error_nonbool()" which issues a warning and
> returns 0.
> 
> Still, the "return config_error_nonbool()" pattern is pretty
> well-established and used for most options. I would go so far as to say
> the ones that warn for invalid values are the odd ones out, and probably
> should be returning errors as well (though doing so now may not be worth
> the trouble and risk of annoyance).
> 
> And certainly there should be no regressions in this series; every case
> is currently a segfault, so returning an error is a strict improvement.
> I'd just as soon stay strict there, as it's easier to loosen later if we
> choose than to tighten.

Fair enough, I'm perfectly fine with this reasoning. Thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 0/7] clone: fix init of refdb with wrong object format
From: Patrick Steinhardt @ 2023-12-12  7:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1701863960.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 8774 bytes --]

Hi,

this is the second version of my patch series to make git-clone(1)
initialize the reference database with the correct hash format. This is
a preparatory step for the reftable backend, which encodes the hash
format on-disk and thus requires us to get the hash format right at the
time of initialization.

Changes compared to v1:

  - Patch 1: Extend the comment explaining why we always create the
    "refs/" directory.

  - Patch 3: Explain better why the patch is required in the first
    place.

  - Patch 4: Fix a typo.

  - Patch 7: Adapt a failing test to now assert the new behaviour.

Thanks for your reviews!

Patrick

Patrick Steinhardt (7):
  setup: extract function to create the refdb
  setup: allow skipping creation of the refdb
  remote-curl: rediscover repository when fetching refs
  builtin/clone: fix bundle URIs with mismatching object formats
  builtin/clone: set up sparse checkout later
  builtin/clone: skip reading HEAD when retrieving remote
  builtin/clone: create the refdb with the correct object format

 builtin/clone.c             |  65 ++++++++++----------
 remote-curl.c               |   7 ++-
 remote.c                    |  26 ++++----
 remote.h                    |   1 +
 setup.c                     | 114 ++++++++++++++++++++++--------------
 setup.h                     |   6 +-
 t/t5550-http-fetch-dumb.sh  |   4 +-
 t/t5558-clone-bundle-uri.sh |  18 ++++++
 8 files changed, 150 insertions(+), 91 deletions(-)

Range-diff against v1:
1:  b69c57d272 ! 1:  2f34daa082 setup: extract function to create the refdb
    @@ Commit message
         calling `init_db()`. Extract the logic into a standalone function so
         that it becomes easier to do this refactoring.
     
    +    While at it, expand the comment that explains why we always create the
    +    "refs/" directory.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## setup.c ##
    @@ setup.c: void initialize_repository_version(int hash_algo, int reinit)
     +	int reinit = is_reinit();
     +
     +	/*
    -+	 * We need to create a "refs" dir in any case so that older
    -+	 * versions of git can tell that this is a repository.
    ++	 * We need to create a "refs" dir in any case so that older versions of
    ++	 * Git can tell that this is a repository. This serves two main purposes:
    ++	 *
    ++	 * - Clients will know to stop walking the parent-directory chain when
    ++	 *   detecting the Git repository. Otherwise they may end up detecting
    ++	 *   a Git repository in a parent directory instead.
    ++	 *
    ++	 * - Instead of failing to detect a repository with unknown reference
    ++	 *   format altogether, old clients will print an error saying that
    ++	 *   they do not understand the reference format extension.
     +	 */
     +	safe_create_dir(git_path("refs"), 1);
     +	adjust_shared_perm(git_path("refs"));
2:  090c52423e = 2:  40005ac1f1 setup: allow skipping creation of the refdb
3:  a1b86a0cbb ! 3:  374a1c514b remote-curl: rediscover repository when fetching refs
    @@ Metadata
      ## Commit message ##
         remote-curl: rediscover repository when fetching refs
     
    -    We're about to change git-clone(1) so that we set up the reference
    -    database at a later point. This change will cause git-remote-curl(1) to
    -    not detect the repository anymore due to "HEAD" not having been created
    -    yet at the time it spawns, and thus causes it to error out once it is
    -    asked to fetch the references.
    +    The reftable format encodes the hash function used by the repository
    +    inside of its tables. The reftable backend thus needs to be initialized
    +    with the correct hash function right from the start, or otherwise we may
    +    end up writing tables with the wrong hash function. But git-clone(1)
    +    initializes the reference database before learning about the hash
    +    function used by the remote repository, which has never been a problem
    +    with the reffiles backend.
    +
    +    To fix this, we'll have to change git-clone(1) to be more careful and
    +    only create the reference backend once it learned about the remote hash
    +    function. This creates a problem for git-remote-curl(1), which will then
    +    be spawned at a time where the repository is not yet fully-initialized.
    +    Consequentially, git-remote-curl(1) will fail to detect the repository,
    +    which eventually causes it to error out once it is asked to fetch remote
    +    objects.
     
         We can address this issue by trying to re-discover the Git repository in
         case none was detected at startup time. With this change, the clone will
    @@ Commit message
              repository
     
           4. git-clone(1) creates the reference database as it has now learned
    -         about the object format.
    +         about the hash function.
     
           5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
              The latter notices that it doesn't have a repository available, but
4:  c7a9d6ef74 ! 4:  3bef564b57 builtin/clone: fix bundle URIs with mismatching object formats
    @@ Commit message
         is indeed not necessarily the case for the hash algorithm right now. So
         in practice it is the right thing to detect the remote's object format
         before downloading bundle URIs anyway, and not doing so causes clones
    -    with bundle URIS to fail when the local default object format does not
    +    with bundle URIs to fail when the local default object format does not
         match the remote repository's format.
     
         Unfortunately though, this creates a new issue: downloading bundles may
5:  703ff77eea = 5:  917f15055f builtin/clone: set up sparse checkout later
6:  6c919fb19c = 6:  f3485a2708 builtin/clone: skip reading HEAD when retrieving remote
7:  eb5530e6a8 ! 7:  f062b11550 builtin/clone: create the refdb with the correct object format
    @@ Commit message
         upcoming reftable backend when cloning repositories with the SHA256
         object format.
     
    +    This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an
    +    empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects
    +    the resulting hash format of the empty cloned repository to match the
    +    default hash, but now we always end up with a sha1 repository. The
    +    problem is that for dumb HTTP fetches, we have no easy way to figure out
    +    the remote's hash function except for deriving it based on the hash
    +    length of refs in `info/refs`. But as the remote repository is empty we
    +    cannot rely on this detection mechanism.
    +
    +    Before the change in this commit we already initialized the repository
    +    with the default hash function and then left it as-is. With this patch
    +    we always use the hash function detected via the remote, where we fall
    +    back to "sha1" in case we cannot detect it.
    +
    +    Neither the old nor the new behaviour are correct as we second-guess the
    +    remote hash function in both cases. But given that this is a rather
    +    unlikely edge case (we use the dumb HTTP protocol, the remote repository
    +    uses SHA256 and the remote repository is empty), let's simply adapt the
    +    test to assert the new behaviour. If we want to properly address this
    +    edge case in the future we will have to extend the dumb HTTP protocol so
    +    that we can properly detect the hash function for empty repositories.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## builtin/clone.c ##
    @@ setup.h: int init_db(const char *git_dir, const char *real_git_dir,
      
      /*
       * NOTE NOTE NOTE!!
    +
    + ## t/t5550-http-fetch-dumb.sh ##
    +@@ t/t5550-http-fetch-dumb.sh: test_expect_success 'create empty remote repository' '
    + 	setup_post_update_server_info_hook "$HTTPD_DOCUMENT_ROOT_PATH/empty.git"
    + '
    + 
    +-test_expect_success 'empty dumb HTTP repository has default hash algorithm' '
    ++test_expect_success 'empty dumb HTTP repository falls back to SHA1' '
    + 	test_when_finished "rm -fr clone-empty" &&
    + 	git clone $HTTPD_URL/dumb/empty.git clone-empty &&
    + 	git -C clone-empty rev-parse --show-object-format >empty-format &&
    +-	test "$(cat empty-format)" = "$(test_oid algo)"
    ++	test "$(cat empty-format)" = sha1
    + '
    + 
    + setup_askpass_helper

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 1/7] setup: extract function to create the refdb
From: Patrick Steinhardt @ 2023-12-12  7:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4810 bytes --]

We're about to let callers skip creation of the reference database when
calling `init_db()`. Extract the logic into a standalone function so
that it becomes easier to do this refactoring.

While at it, expand the comment that explains why we always create the
"refs/" directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c | 103 +++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 65 insertions(+), 38 deletions(-)

diff --git a/setup.c b/setup.c
index fc592dc6dd..865cfe6743 100644
--- a/setup.c
+++ b/setup.c
@@ -1885,6 +1885,68 @@ void initialize_repository_version(int hash_algo, int reinit)
 		git_config_set_gently("extensions.objectformat", NULL);
 }
 
+static int is_reinit(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	char junk[2];
+	int ret;
+
+	git_path_buf(&buf, "HEAD");
+	ret = !access(buf.buf, R_OK) || readlink(buf.buf, junk, sizeof(junk) - 1) != -1;
+	strbuf_release(&buf);
+	return ret;
+}
+
+static void create_reference_database(const char *initial_branch, int quiet)
+{
+	struct strbuf err = STRBUF_INIT;
+	int reinit = is_reinit();
+
+	/*
+	 * We need to create a "refs" dir in any case so that older versions of
+	 * Git can tell that this is a repository. This serves two main purposes:
+	 *
+	 * - Clients will know to stop walking the parent-directory chain when
+	 *   detecting the Git repository. Otherwise they may end up detecting
+	 *   a Git repository in a parent directory instead.
+	 *
+	 * - Instead of failing to detect a repository with unknown reference
+	 *   format altogether, old clients will print an error saying that
+	 *   they do not understand the reference format extension.
+	 */
+	safe_create_dir(git_path("refs"), 1);
+	adjust_shared_perm(git_path("refs"));
+
+	if (refs_init_db(&err))
+		die("failed to set up refs db: %s", err.buf);
+
+	/*
+	 * Point the HEAD symref to the initial branch with if HEAD does
+	 * not yet exist.
+	 */
+	if (!reinit) {
+		char *ref;
+
+		if (!initial_branch)
+			initial_branch = git_default_branch_name(quiet);
+
+		ref = xstrfmt("refs/heads/%s", initial_branch);
+		if (check_refname_format(ref, 0) < 0)
+			die(_("invalid initial branch name: '%s'"),
+			    initial_branch);
+
+		if (create_symref("HEAD", ref, NULL) < 0)
+			exit(1);
+		free(ref);
+	}
+
+	if (reinit && initial_branch)
+		warning(_("re-init: ignored --initial-branch=%s"),
+			initial_branch);
+
+	strbuf_release(&err);
+}
+
 static int create_default_files(const char *template_path,
 				const char *original_git_dir,
 				const char *initial_branch,
@@ -1896,10 +1958,8 @@ static int create_default_files(const char *template_path,
 	struct stat st1;
 	struct strbuf buf = STRBUF_INIT;
 	char *path;
-	char junk[2];
 	int reinit;
 	int filemode;
-	struct strbuf err = STRBUF_INIT;
 	const char *init_template_dir = NULL;
 	const char *work_tree = get_git_work_tree();
 
@@ -1919,6 +1979,8 @@ static int create_default_files(const char *template_path,
 	reset_shared_repository();
 	git_config(git_default_config, NULL);
 
+	reinit = is_reinit();
+
 	/*
 	 * We must make sure command-line options continue to override any
 	 * values we might have just re-read from the config.
@@ -1962,39 +2024,7 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
-	/*
-	 * We need to create a "refs" dir in any case so that older
-	 * versions of git can tell that this is a repository.
-	 */
-	safe_create_dir(git_path("refs"), 1);
-	adjust_shared_perm(git_path("refs"));
-
-	if (refs_init_db(&err))
-		die("failed to set up refs db: %s", err.buf);
-
-	/*
-	 * Point the HEAD symref to the initial branch with if HEAD does
-	 * not yet exist.
-	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
-	if (!reinit) {
-		char *ref;
-
-		if (!initial_branch)
-			initial_branch = git_default_branch_name(quiet);
-
-		ref = xstrfmt("refs/heads/%s", initial_branch);
-		if (check_refname_format(ref, 0) < 0)
-			die(_("invalid initial branch name: '%s'"),
-			    initial_branch);
-
-		if (create_symref("HEAD", ref, NULL) < 0)
-			exit(1);
-		free(ref);
-	}
-
+	create_reference_database(initial_branch, quiet);
 	initialize_repository_version(fmt->hash_algo, 0);
 
 	/* Check filemode trustability */
@@ -2158,9 +2188,6 @@ int init_db(const char *git_dir, const char *real_git_dir,
 				      prev_bare_repository,
 				      init_shared_repository,
 				      flags & INIT_DB_QUIET);
-	if (reinit && initial_branch)
-		warning(_("re-init: ignored --initial-branch=%s"),
-			initial_branch);
 
 	create_object_directory();
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 2/7] setup: allow skipping creation of the refdb
From: Patrick Steinhardt @ 2023-12-12  7:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2755 bytes --]

Allow callers to skip creation of the reference database via a new flag
`INIT_DB_SKIP_REFDB`, which is required for git-clone(1) so that we can
create it at a later point once the object format has been discovered
from the remote repository.

Note that we also uplift the call to `create_reference_database()` into
`init_db()`, which makes it easier to handle the new flag for us. This
changes the order in which we do initialization so that we now set up
the Git configuration before we create the reference database. In
practice this move should not result in any change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c | 13 +++++--------
 setup.h |  5 +++--
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/setup.c b/setup.c
index 865cfe6743..d6a1c59b7b 100644
--- a/setup.c
+++ b/setup.c
@@ -1949,11 +1949,9 @@ static void create_reference_database(const char *initial_branch, int quiet)
 
 static int create_default_files(const char *template_path,
 				const char *original_git_dir,
-				const char *initial_branch,
 				const struct repository_format *fmt,
 				int prev_bare_repository,
-				int init_shared_repository,
-				int quiet)
+				int init_shared_repository)
 {
 	struct stat st1;
 	struct strbuf buf = STRBUF_INIT;
@@ -2024,7 +2022,6 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
-	create_reference_database(initial_branch, quiet);
 	initialize_repository_version(fmt->hash_algo, 0);
 
 	/* Check filemode trustability */
@@ -2184,11 +2181,11 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	validate_hash_algorithm(&repo_fmt, hash);
 
 	reinit = create_default_files(template_dir, original_git_dir,
-				      initial_branch, &repo_fmt,
-				      prev_bare_repository,
-				      init_shared_repository,
-				      flags & INIT_DB_QUIET);
+				      &repo_fmt, prev_bare_repository,
+				      init_shared_repository);
 
+	if (!(flags & INIT_DB_SKIP_REFDB))
+		create_reference_database(initial_branch, flags & INIT_DB_QUIET);
 	create_object_directory();
 
 	if (get_shared_repository()) {
diff --git a/setup.h b/setup.h
index b48cf1c43b..cbf538286b 100644
--- a/setup.h
+++ b/setup.h
@@ -169,8 +169,9 @@ int verify_repository_format(const struct repository_format *format,
  */
 void check_repository_format(struct repository_format *fmt);
 
-#define INIT_DB_QUIET 0x0001
-#define INIT_DB_EXIST_OK 0x0002
+#define INIT_DB_QUIET      (1 << 0)
+#define INIT_DB_EXIST_OK   (1 << 1)
+#define INIT_DB_SKIP_REFDB (1 << 2)
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash_algo,
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 3/7] remote-curl: rediscover repository when fetching refs
From: Patrick Steinhardt @ 2023-12-12  7:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2586 bytes --]

The reftable format encodes the hash function used by the repository
inside of its tables. The reftable backend thus needs to be initialized
with the correct hash function right from the start, or otherwise we may
end up writing tables with the wrong hash function. But git-clone(1)
initializes the reference database before learning about the hash
function used by the remote repository, which has never been a problem
with the reffiles backend.

To fix this, we'll have to change git-clone(1) to be more careful and
only create the reference backend once it learned about the remote hash
function. This creates a problem for git-remote-curl(1), which will then
be spawned at a time where the repository is not yet fully-initialized.
Consequentially, git-remote-curl(1) will fail to detect the repository,
which eventually causes it to error out once it is asked to fetch remote
objects.

We can address this issue by trying to re-discover the Git repository in
case none was detected at startup time. With this change, the clone will
look as following:

  1. git-clone(1) sets up the initial repository, excluding the
     reference database.

  2. git-clone(1) spawns git-remote-curl(1), which will be unable to
     detect the repository due to a missing "HEAD".

  3. git-clone(1) asks git-remote-curl(1) to list remote references.
     This works just fine as this step does not require a local
     repository

  4. git-clone(1) creates the reference database as it has now learned
     about the hash function.

  5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
     The latter notices that it doesn't have a repository available, but
     it now knows to try and re-discover it.

If the re-discovery succeeds in the last step we can continue with the
clone.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 remote-curl.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index ef05752ca5..fc29757b65 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1564,8 +1564,11 @@ int cmd_main(int argc, const char **argv)
 		if (buf.len == 0)
 			break;
 		if (starts_with(buf.buf, "fetch ")) {
-			if (nongit)
-				die(_("remote-curl: fetch attempted without a local repo"));
+			if (nongit) {
+				setup_git_directory_gently(&nongit);
+				if (nongit)
+					die(_("remote-curl: fetch attempted without a local repo"));
+			}
 			parse_fetch(&buf);
 
 		} else if (!strcmp(buf.buf, "list") || starts_with(buf.buf, "list ")) {
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 4/7] builtin/clone: fix bundle URIs with mismatching object formats
From: Patrick Steinhardt @ 2023-12-12  7:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 6668 bytes --]

We create the reference database in git-clone(1) quite early before
connecting to the remote repository. Given that we do not yet know about
the object format that the remote repository uses at that point in time
the consequence is that the refdb may be initialized with the wrong
object format.

This is not a problem in the context of the files backend as we do not
encode the object format anywhere, and furthermore the only reference
that we write between initializing the refdb and learning about the
object format is the "HEAD" symref. It will become a problem though once
we land the reftable backend, which indeed does require to know about
the proper object format at the time of creation. We thus need to
rearrange the logic in git-clone(1) so that we only initialize the refdb
once we have learned about the actual object format.

As a first step, move listing of remote references to happen earlier,
which also allow us to set up the hash algorithm of the repository
earlier now. While we aim to execute this logic as late as possible
until after most of the setup has happened already, detection of the
object format and thus later the setup of the reference database must
happen before any other logic that may spawn Git commands or otherwise
these Git commands may not recognize the repository as such.

The first Git step where we expect the repository to be fully initalized
is when we fetch bundles via bundle URIs. Funny enough, the comments
there also state that "the_repository must match the cloned repo", which
is indeed not necessarily the case for the hash algorithm right now. So
in practice it is the right thing to detect the remote's object format
before downloading bundle URIs anyway, and not doing so causes clones
with bundle URIs to fail when the local default object format does not
match the remote repository's format.

Unfortunately though, this creates a new issue: downloading bundles may
take a long time, so if we list refs beforehand they might've grown
stale meanwhile. It is not clear how to solve this issue except for a
second reference listing though after we have downloaded the bundles,
which may be an expensive thing to do.

Arguably though, it's preferable to have a staleness issue compared to
being unable to clone a repository altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/clone.c             | 48 ++++++++++++++++++-------------------
 t/t5558-clone-bundle-uri.sh | 18 ++++++++++++++
 2 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index c6357af949..d188650881 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1266,6 +1266,26 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (transport->smart_options && !deepen && !filter_options.choice)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
+	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
+	refspec_ref_prefixes(&remote->fetch,
+			     &transport_ls_refs_options.ref_prefixes);
+	if (option_branch)
+		expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
+				  option_branch);
+	if (!option_no_tags)
+		strvec_push(&transport_ls_refs_options.ref_prefixes,
+			    "refs/tags/");
+
+	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
+
+	/*
+	 * Now that we know what algorithm the remote side is using, let's set
+	 * ours to the same thing.
+	 */
+	hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
+	initialize_repository_version(hash_algo, 1);
+	repo_set_hash_algo(the_repository, hash_algo);
+
 	/*
 	 * Before fetching from the remote, download and install bundle
 	 * data from the --bundle-uri option.
@@ -1281,24 +1301,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 				bundle_uri);
 		else if (has_heuristic)
 			git_config_set_gently("fetch.bundleuri", bundle_uri);
-	}
-
-	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
-	refspec_ref_prefixes(&remote->fetch,
-			     &transport_ls_refs_options.ref_prefixes);
-	if (option_branch)
-		expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
-				  option_branch);
-	if (!option_no_tags)
-		strvec_push(&transport_ls_refs_options.ref_prefixes,
-			    "refs/tags/");
-
-	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
-
-	if (refs)
-		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
-
-	if (!bundle_uri) {
+	} else {
 		/*
 		* Populate transport->got_remote_bundle_uri and
 		* transport->bundle_uri. We might get nothing.
@@ -1319,13 +1322,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-		/*
-		 * Now that we know what algorithm the remote side is using,
-		 * let's set ours to the same thing.
-		 */
-	hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
-	initialize_repository_version(hash_algo, 1);
-	repo_set_hash_algo(the_repository, hash_algo);
+	if (refs)
+		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
 	if (mapped_refs) {
 		/*
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
index 996a08e90c..1ca5f745e7 100755
--- a/t/t5558-clone-bundle-uri.sh
+++ b/t/t5558-clone-bundle-uri.sh
@@ -33,6 +33,15 @@ test_expect_success 'clone with path bundle' '
 	test_cmp expect actual
 '
 
+test_expect_success 'clone with path bundle and non-default hash' '
+	test_when_finished "rm -rf clone-path-non-default-hash" &&
+	GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \
+		clone-from clone-path-non-default-hash &&
+	git -C clone-path-non-default-hash rev-parse refs/bundles/topic >actual &&
+	git -C clone-from rev-parse topic >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'clone with file:// bundle' '
 	git clone --bundle-uri="file://$(pwd)/clone-from/B.bundle" \
 		clone-from clone-file &&
@@ -284,6 +293,15 @@ test_expect_success 'clone HTTP bundle' '
 	test_config -C clone-http log.excludedecoration refs/bundle/
 '
 
+test_expect_success 'clone HTTP bundle with non-default hash' '
+	test_when_finished "rm -rf clone-http-non-default-hash" &&
+	GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="$HTTPD_URL/B.bundle" \
+		"$HTTPD_URL/smart/fetch.git" clone-http-non-default-hash &&
+	git -C clone-http-non-default-hash rev-parse refs/bundles/topic >actual &&
+	git -C clone-from rev-parse topic >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'clone bundle list (HTTP, no heuristic)' '
 	test_when_finished rm -f trace*.txt &&
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 5/7] builtin/clone: set up sparse checkout later
From: Patrick Steinhardt @ 2023-12-12  7:00 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]

When asked to do a sparse checkout, then git-clone(1) will spawn
`git sparse-checkout set` to set up the configuration accordingly. This
requires a proper Git repository or otherwise the command will fail. But
as we are about to move creation of the reference database to a later
point, this prerequisite will not hold anymore.

Move the logic to a later point in time where we know to have created
the reference database already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/clone.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index d188650881..9c60923f31 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1185,9 +1185,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	if (option_sparse_checkout && git_sparse_checkout_init(dir))
-		return 1;
-
 	remote = remote_get(remote_name);
 
 	refspec_appendf(&remote->fetch, "+%s*:%s*", src_ref_prefix,
@@ -1426,6 +1423,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		dissociate_from_references();
 	}
 
+	if (option_sparse_checkout && git_sparse_checkout_init(dir))
+		return 1;
+
 	junk_mode = JUNK_LEAVE_REPO;
 	err = checkout(submodule_progress, filter_submodules);
 
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 6/7] builtin/clone: skip reading HEAD when retrieving remote
From: Patrick Steinhardt @ 2023-12-12  7:01 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 5199 bytes --]

After we have set up the remote configuration in git-clone(1) we'll call
`remote_get()` to read the remote from the on-disk configuration. But
next to reading the on-disk configuration, `remote_get()` will also
cause us to try and read the repository's HEAD reference so that we can
figure out the current branch. Besides being pointless in git-clone(1)
because we're operating in an empty repository anyway, this will also
break once we move creation of the reference database to a later point
in time.

Refactor the code to introduce a new `remote_get_early()` function that
will skip reading the HEAD reference to address this issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/clone.c |  2 +-
 remote.c        | 26 ++++++++++++++++----------
 remote.h        |  1 +
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 9c60923f31..06966c5d4c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1185,7 +1185,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	remote = remote_get(remote_name);
+	remote = remote_get_early(remote_name);
 
 	refspec_appendf(&remote->fetch, "+%s*:%s*", src_ref_prefix,
 			branch_top.buf);
diff --git a/remote.c b/remote.c
index abb24822be..051d0a64a0 100644
--- a/remote.c
+++ b/remote.c
@@ -509,7 +509,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 	}
 }
 
-static void read_config(struct repository *repo)
+static void read_config(struct repository *repo, int early)
 {
 	int flag;
 
@@ -518,7 +518,7 @@ static void read_config(struct repository *repo)
 	repo->remote_state->initialized = 1;
 
 	repo->remote_state->current_branch = NULL;
-	if (startup_info->have_repository) {
+	if (startup_info->have_repository && !early) {
 		const char *head_ref = refs_resolve_ref_unsafe(
 			get_main_ref_store(repo), "HEAD", 0, NULL, &flag);
 		if (head_ref && (flag & REF_ISSYMREF) &&
@@ -561,7 +561,7 @@ static const char *remotes_remote_for_branch(struct remote_state *remote_state,
 
 const char *remote_for_branch(struct branch *branch, int *explicit)
 {
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	die_on_missing_branch(the_repository, branch);
 
 	return remotes_remote_for_branch(the_repository->remote_state, branch,
@@ -587,7 +587,7 @@ remotes_pushremote_for_branch(struct remote_state *remote_state,
 
 const char *pushremote_for_branch(struct branch *branch, int *explicit)
 {
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	die_on_missing_branch(the_repository, branch);
 
 	return remotes_pushremote_for_branch(the_repository->remote_state,
@@ -599,7 +599,7 @@ static struct remote *remotes_remote_get(struct remote_state *remote_state,
 
 const char *remote_ref_for_branch(struct branch *branch, int for_push)
 {
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	die_on_missing_branch(the_repository, branch);
 
 	if (branch) {
@@ -709,7 +709,13 @@ remotes_remote_get(struct remote_state *remote_state, const char *name)
 
 struct remote *remote_get(const char *name)
 {
-	read_config(the_repository);
+	read_config(the_repository, 0);
+	return remotes_remote_get(the_repository->remote_state, name);
+}
+
+struct remote *remote_get_early(const char *name)
+{
+	read_config(the_repository, 1);
 	return remotes_remote_get(the_repository->remote_state, name);
 }
 
@@ -722,7 +728,7 @@ remotes_pushremote_get(struct remote_state *remote_state, const char *name)
 
 struct remote *pushremote_get(const char *name)
 {
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	return remotes_pushremote_get(the_repository->remote_state, name);
 }
 
@@ -738,7 +744,7 @@ int remote_is_configured(struct remote *remote, int in_repo)
 int for_each_remote(each_remote_fn fn, void *priv)
 {
 	int i, result = 0;
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	for (i = 0; i < the_repository->remote_state->remotes_nr && !result;
 	     i++) {
 		struct remote *remote =
@@ -1831,7 +1837,7 @@ struct branch *branch_get(const char *name)
 {
 	struct branch *ret;
 
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	if (!name || !*name || !strcmp(name, "HEAD"))
 		ret = the_repository->remote_state->current_branch;
 	else
@@ -1973,7 +1979,7 @@ static const char *branch_get_push_1(struct remote_state *remote_state,
 
 const char *branch_get_push(struct branch *branch, struct strbuf *err)
 {
-	read_config(the_repository);
+	read_config(the_repository, 0);
 	die_on_missing_branch(the_repository, branch);
 
 	if (!branch)
diff --git a/remote.h b/remote.h
index cdc8b1db42..79353ba226 100644
--- a/remote.h
+++ b/remote.h
@@ -118,6 +118,7 @@ struct remote {
  * and configuration.
  */
 struct remote *remote_get(const char *name);
+struct remote *remote_get_early(const char *name);
 
 struct remote *pushremote_get(const char *name);
 int remote_is_configured(struct remote *remote, int in_repo);
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH v2 7/7] builtin/clone: create the refdb with the correct object format
From: Patrick Steinhardt @ 2023-12-12  7:01 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1702361370.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 5283 bytes --]

We're currently creating the reference database with a potentially
incorrect object format when the remote repository's object format is
different from the local default object format. This works just fine for
now because the files backend never records the object format anywhere.
But this logic will fail with any new reference backend that encodes
this information in some form either on-disk or in-memory.

The preceding commits have reshuffled code in git-clone(1) so that there
is no code path that will access the reference database before we have
detected the remote's object format. With these refactorings we can now
defer initialization of the reference database until after we have
learned the remote's object format and thus initialize it with the
correct format from the get-go.

These refactorings are required to make git-clone(1) work with the
upcoming reftable backend when cloning repositories with the SHA256
object format.

This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an
empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects
the resulting hash format of the empty cloned repository to match the
default hash, but now we always end up with a sha1 repository. The
problem is that for dumb HTTP fetches, we have no easy way to figure out
the remote's hash function except for deriving it based on the hash
length of refs in `info/refs`. But as the remote repository is empty we
cannot rely on this detection mechanism.

Before the change in this commit we already initialized the repository
with the default hash function and then left it as-is. With this patch
we always use the hash function detected via the remote, where we fall
back to "sha1" in case we cannot detect it.

Neither the old nor the new behaviour are correct as we second-guess the
remote hash function in both cases. But given that this is a rather
unlikely edge case (we use the dumb HTTP protocol, the remote repository
uses SHA256 and the remote repository is empty), let's simply adapt the
test to assert the new behaviour. If we want to properly address this
edge case in the future we will have to extend the dumb HTTP protocol so
that we can properly detect the hash function for empty repositories.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/clone.c            | 9 ++++++++-
 setup.c                    | 2 +-
 setup.h                    | 1 +
 t/t5550-http-fetch-dumb.sh | 4 ++--
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 06966c5d4c..fd052b8b54 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1097,8 +1097,14 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	/*
+	 * Initialize the repository, but skip initializing the reference
+	 * database. We do not yet know about the object format of the
+	 * repository, and reference backends may persist that information into
+	 * their on-disk data structures.
+	 */
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		do_not_override_repo_unix_permissions, INIT_DB_QUIET);
+		do_not_override_repo_unix_permissions, INIT_DB_QUIET | INIT_DB_SKIP_REFDB);
 
 	if (real_git_dir) {
 		free((char *)git_dir);
@@ -1282,6 +1288,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 	initialize_repository_version(hash_algo, 1);
 	repo_set_hash_algo(the_repository, hash_algo);
+	create_reference_database(NULL, 1);
 
 	/*
 	 * Before fetching from the remote, download and install bundle
diff --git a/setup.c b/setup.c
index d6a1c59b7b..155fe13f70 100644
--- a/setup.c
+++ b/setup.c
@@ -1897,7 +1897,7 @@ static int is_reinit(void)
 	return ret;
 }
 
-static void create_reference_database(const char *initial_branch, int quiet)
+void create_reference_database(const char *initial_branch, int quiet)
 {
 	struct strbuf err = STRBUF_INIT;
 	int reinit = is_reinit();
diff --git a/setup.h b/setup.h
index cbf538286b..3f0f17c351 100644
--- a/setup.h
+++ b/setup.h
@@ -178,6 +178,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *initial_branch, int init_shared_repository,
 	    unsigned int flags);
 void initialize_repository_version(int hash_algo, int reinit);
+void create_reference_database(const char *initial_branch, int quiet);
 
 /*
  * NOTE NOTE NOTE!!
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index e444b30bf6..4c3b32785d 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -66,11 +66,11 @@ test_expect_success 'create empty remote repository' '
 	setup_post_update_server_info_hook "$HTTPD_DOCUMENT_ROOT_PATH/empty.git"
 '
 
-test_expect_success 'empty dumb HTTP repository has default hash algorithm' '
+test_expect_success 'empty dumb HTTP repository falls back to SHA1' '
 	test_when_finished "rm -fr clone-empty" &&
 	git clone $HTTPD_URL/dumb/empty.git clone-empty &&
 	git -C clone-empty rev-parse --show-object-format >empty-format &&
-	test "$(cat empty-format)" = "$(test_oid algo)"
+	test "$(cat empty-format)" = sha1
 '
 
 setup_askpass_helper
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* Re: [PATCH 02/24] pack-bitmap-write: deep-clear the `bb_commit` slab
From: Jeff King @ 2023-12-12  7:04 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Patrick Steinhardt, Junio C Hamano
In-Reply-To: <6f5ff96998946f3f49da56fd05c096b949521339.1701198172.git.me@ttaylorr.com>

On Tue, Nov 28, 2023 at 02:07:59PM -0500, Taylor Blau wrote:

> +static void clear_bb_commit(struct bb_commit *commit)
> +{
> +	free(commit->reverse_edges);
> +	bitmap_free(commit->commit_mask);
> +	bitmap_free(commit->bitmap);
> +}

I think these bitmaps may sometimes be NULL. But double-checking
bitmap_free(), it sensibly is noop when passed NULL. So this look good
to me.

-Peff

^ permalink raw reply

* Re: [PATCH 03/24] pack-bitmap: plug leak in find_objects()
From: Jeff King @ 2023-12-12  7:04 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Patrick Steinhardt, Junio C Hamano
In-Reply-To: <bc38fba65547f5716a2ff9904dd615e755ea84bb.1701198172.git.me@ttaylorr.com>

On Tue, Nov 28, 2023 at 02:08:02PM -0500, Taylor Blau wrote:

> The `find_objects()` function creates an object_list for any tips of the
> reachability query which do not have corresponding bitmaps.
> 
> The object_list is not used outside of `find_objects()`, but we never
> free it with `object_list_free()`, resulting in a leak. Let's plug that
> leak by calling `object_list_free()`, which results in t6113 becoming
> leak-free.

Makes sense.

> @@ -1280,6 +1280,8 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
>  		base = fill_in_bitmap(bitmap_git, revs, base, seen);
>  	}
>  
> +	object_list_free(&not_mapped);
> +
>  	return base;
>  }

There's an extra return earlier in the function, but it triggers only
when not_mapped is NULL. So this covers all cases. Good.

> +++ b/t/t6113-rev-list-bitmap-filters.sh
> [..]
> +TEST_PASSES_SANITIZE_LEAK=true

Yay. :)

-Peff

^ permalink raw reply

* [PATCH v2 0/4] refs: improve handling of special refs
From: Patrick Steinhardt @ 2023-12-12  7:18 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Phillip Wood
In-Reply-To: <cover.1701243201.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4823 bytes --]

Hi,

this is the second version of my patch series that aims to improve our
handling of special refs. This patch series is in preparation for the
upcoming reftable backend.

Changes compared to v1:

  - Patch 1: Stop needlessly resetting `errno` before calling
    `strbuf_read_file()`. Also, clean up state we create in our test
    repos properly.

  - Patch 3: The discussion with Phillip made me reevaluate my claim
    that "rebase-apply/" and "rebase-merge/" contain special refs.
    Indeed they don't, all files in there seem to be exclusively
    read and written via the filesystem without ever going via the
    refdb. I've thus dropped them from the set of special refs.

  - Patch 4: The array of static refs is now static.

Thanks for your reviews!

Patrick

Patrick Steinhardt (4):
  wt-status: read HEAD and ORIG_HEAD via the refdb
  refs: propagate errno when reading special refs fails
  refs: complete list of special refs
  bisect: consistently write BISECT_EXPECTED_REV via the refdb

 bisect.c                    | 25 +++-------------
 builtin/bisect.c            |  8 ++---
 refs.c                      | 59 +++++++++++++++++++++++++++++++++++--
 t/t1403-show-ref.sh         | 10 +++++++
 t/t6030-bisect-porcelain.sh |  2 +-
 wt-status.c                 | 17 ++++++-----
 6 files changed, 82 insertions(+), 39 deletions(-)

Range-diff against v1:
1:  35b74eb972 = 1:  1db3eb3945 wt-status: read HEAD and ORIG_HEAD via the refdb
2:  691552a17e ! 2:  24032a62e5 refs: propagate errno when reading special refs fails
    @@ refs.c: static int refs_read_special_head(struct ref_store *ref_store,
      	strbuf_addf(&full_path, "%s/%s", ref_store->gitdir, refname);
      
     -	if (strbuf_read_file(&content, full_path.buf, 0) < 0)
    -+	errno = 0;
    -+
     +	if (strbuf_read_file(&content, full_path.buf, 0) < 0) {
     +		*failure_errno = errno;
      		goto done;
    @@ t/t1403-show-ref.sh: test_expect_success '--exists with directory fails with gen
     +'
     +
     +test_expect_success '--exists with existing special ref' '
    ++	test_when_finished "rm .git/FETCH_HEAD" &&
     +	git rev-parse HEAD >.git/FETCH_HEAD &&
     +	git show-ref --exists FETCH_HEAD
     +'
3:  0e38103114 ! 3:  3dd9089fd5 refs: complete list of special refs
    @@ refs.c: static int refs_read_special_head(struct ref_store *ref_store,
     +	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
     +	 *   heads.
     +	 *
    -+	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
    -+	 *   rebases, where keeping it closely together feels sensible.
    -+	 *
     +	 * There are some exceptions that you might expect to see on this list
     +	 * but which are handled exclusively via the reference backend:
     +	 *
     +	 * - CHERRY_PICK_HEAD
    ++	 *
     +	 * - HEAD
    ++	 *
     +	 * - ORIG_HEAD
     +	 *
    ++	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
    ++	 *   rebases, including some reference-like files. These are
    ++	 *   exclusively read and written via the filesystem and never go
    ++	 *   through the refdb.
    ++	 *
     +	 * Writing or deleting references must consistently go either through
     +	 * the filesystem (special refs) or through the reference backend
     +	 * (normal ones).
     +	 */
    -+	const char * const special_refs[] = {
    ++	static const char * const special_refs[] = {
     +		"AUTO_MERGE",
     +		"BISECT_EXPECTED_REV",
     +		"FETCH_HEAD",
     +		"MERGE_AUTOSTASH",
     +		"MERGE_HEAD",
     +	};
    -+	int i;
    ++	size_t i;
     +
     +	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
     +		if (!strcmp(refname, special_refs[i]))
     +			return 1;
     +
    -+	/*
    -+	 * git-rebase(1) stores its state in `rebase-apply/` or
    -+	 * `rebase-merge/`, including various reference-like bits.
    -+	 */
    -+	if (starts_with(refname, "rebase-apply/") ||
    -+	    starts_with(refname, "rebase-merge/"))
    -+		return 1;
    -+
     +	return 0;
     +}
     +
4:  c7ab26fb7e ! 4:  4a4447a3f5 bisect: consistently write BISECT_EXPECTED_REV via the refdb
    @@ refs.c: static int is_special_ref(const char *refname)
      	 * but which are handled exclusively via the reference backend:
      	 *
     +	 * - BISECT_EXPECTED_REV
    ++	 *
      	 * - CHERRY_PICK_HEAD
    + 	 *
      	 * - HEAD
    - 	 * - ORIG_HEAD
     @@ refs.c: static int is_special_ref(const char *refname)
      	 */
    - 	const char * const special_refs[] = {
    + 	static const char * const special_refs[] = {
      		"AUTO_MERGE",
     -		"BISECT_EXPECTED_REV",
      		"FETCH_HEAD",

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox