* [PATCH v12] checkout: extend --track with a "fetch" mode to refresh start-point
From: Harald Nordgren via GitGitGadget @ 2026-05-21 10:20 UTC
  To: git
  Cc: Ramsay Jones, D. Ben Knoble, Kristoffer Haugsbakk, Marc Branchaud,
	Phillip Wood, Harald Nordgren, Harald Nordgren
In-Reply-To: <pull.2281.v11.git.git.1779177508772.gitgitgadget@gmail.com>

From: Harald Nordgren <haraldnordgren@gmail.com>

Add a "fetch" mode to the "--track" option of "git checkout" / "git
switch" that refreshes <start-point> before checking it out:

    git checkout -b new_branch --track=fetch origin/some-branch

is shorthand for

    git fetch origin some-branch
    git checkout -b new_branch --track origin/some-branch

Identify the remote whose configured fetch refspec maps to
<start-point>, then run "git fetch <remote> <src-ref>" for just that
ref so other remote-tracking branches are left untouched. When
<start-point> is a bare <remote> (e.g. "origin"), follow
refs/remotes/<remote>/HEAD to learn which branch to refresh. If
"git fetch" fails but the remote-tracking ref already exists locally,
warn and proceed from the existing tip; otherwise abort.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
---
    checkout: --track=fetch
    
     * Find the right remote by matching fetch refspecs. Instead of assuming
       the start-point begins with a remote's name, ask each configured
       remote whether its fetch refspec maps to refs/remotes/<start-point>.
     * Die clearly when no remote matches. Previously, if we couldn't figure
       out the remote we'd silently skip the fetch and fall through to a
       confusing later error.
     * Die clearly when more than one remote matches. If two remotes both
       map their fetches into the same refs/remotes/<ns>/* namespace, there
       is no unambiguous choice.
     * Bare-namespace form (--track=fetch origin) requires <ns>/HEAD. When
       the user passes just a namespace, we now follow
       refs/remotes/<ns>/HEAD to learn which branch to refresh. If that
       symref is missing, die with a hint to run git remote set-head <ns>
       --auto instead of guessing or fetching everything. If <ns>/HEAD
       points outside the namespace, reject it.
     * Validate the refname before doing anything. Reject obviously invalid
       start-points like foo..bar up front, so we don't run a fetch we know
       cannot succeed.
     * Forward --quiet to the underlying fetch. checkout -q --track=fetch
       ... now suppresses the fetch progress output, matching the user's
       intent.
     * More test coverage.
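
For readers who want to see the matching rules from the list above in isolation, here is a minimal standalone sketch. It is not the code in this patch: the patch relies on git's own remote_find_tracking(), while the helpers below (src_for_dst, count_matches, head_target_in_namespace) are hypothetical names invented for illustration, and the sketch assumes each remote has a single fetch refspec with at most one '*' on each side.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not git's remote_find_tracking(): given one fetch
 * refspec such as "+refs/heads/*:refs/remotes/origin/*" and a destination
 * ref, return a malloc'd source ref, or NULL if the refspec does not map
 * to that destination. The caller frees the result.
 */
static char *src_for_dst(const char *refspec, const char *dst)
{
	const char *colon, *src_star, *dst_pat, *dst_star;
	size_t src_len, src_pre, src_suf, dst_pre, dst_suf, dst_len, mid;
	char *out;

	assert(refspec && dst);
	if (*refspec == '+')
		refspec++;	/* the force marker does not affect matching */
	colon = strchr(refspec, ':');
	if (!colon)
		return NULL;
	src_len = (size_t)(colon - refspec);
	dst_pat = colon + 1;
	src_star = memchr(refspec, '*', src_len);
	dst_star = strchr(dst_pat, '*');

	if (!src_star || !dst_star) {
		/* No wildcard: both sides must be literal and dst must match. */
		if (src_star || dst_star || strcmp(dst_pat, dst))
			return NULL;
		out = malloc(src_len + 1);
		if (!out)
			return NULL;
		memcpy(out, refspec, src_len);
		out[src_len] = '\0';
		return out;
	}

	dst_pre = (size_t)(dst_star - dst_pat);
	dst_suf = strlen(dst_star + 1);
	dst_len = strlen(dst);
	if (dst_len < dst_pre + dst_suf ||
	    strncmp(dst, dst_pat, dst_pre) ||
	    strcmp(dst + dst_len - dst_suf, dst_star + 1))
		return NULL;

	/* Substitute the part matched by '*' into the source pattern. */
	mid = dst_len - dst_pre - dst_suf;
	src_pre = (size_t)(src_star - refspec);
	src_suf = src_len - src_pre - 1;
	out = malloc(src_pre + mid + src_suf + 1);
	if (!out)
		return NULL;
	memcpy(out, refspec, src_pre);
	memcpy(out + src_pre, dst + dst_pre, mid);
	memcpy(out + src_pre + mid, src_star + 1, src_suf);
	out[src_pre + mid + src_suf] = '\0';
	return out;
}

/* Count how many refspecs (one per remote, by assumption) map to dst. */
static size_t count_matches(const char *const *refspecs, size_t n,
			    const char *dst)
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++) {
		char *src = src_for_dst(refspecs[i], dst);
		if (src)
			nr++;
		free(src);
	}
	return nr;
}

/* True only if head_target lies strictly below ns_ref, e.g. "<ns>/main". */
static int head_target_in_namespace(const char *ns_ref,
				    const char *head_target)
{
	size_t len = strlen(ns_ref);

	return !strncmp(head_target, ns_ref, len) && head_target[len] == '/';
}
```

In this sketch, a count of zero corresponds to the "no remote matches" error, a count above one to the "more than one remote matches" error, and a failed namespace check to the rejection of an <ns>/HEAD that points elsewhere. The trailing '/' comparison is what keeps a namespace "foo" from accepting a target under "foobar/".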

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2281%2FHaraldNordgren%2Fcheckout-fetch-start-point-v12
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2281/HaraldNordgren/checkout-fetch-start-point-v12
Pull-Request: https://github.com/git/git/pull/2281

Range-diff vs v11:

 1:  d0c9e3e879 ! 1:  bcd034dbed checkout: extend --track with a "fetch" mode to refresh start-point
     @@ Metadata
       ## Commit message ##
          checkout: extend --track with a "fetch" mode to refresh start-point
      
     -    If you want to fork your topic branch from the very latest of the
     -    tip of a branch your remote has, you would do:
     -
     -        git fetch origin some-branch
     -        git checkout -b new_branch --track origin/some-branch
     -
     -    Extend the "--track" option of "git checkout" and allow users to
     -    write
     +    Add a "fetch" mode to the "--track" option of "git checkout" / "git
     +    switch" that refreshes <start-point> before checking it out:
      
              git checkout -b new_branch --track=fetch origin/some-branch
      
     -    to (1) fetch 'some-branch' from the remote 'origin', updating the
     -    remote-tracking branch 'origin/some-branch', (2) arrange subsequent
     -    'git pull' on 'new_branch' to interact with 'origin/some-branch' and
     -    (3) fork 'new_branch' from it.
     -
     -    In the value of the '--track' option, 'fetch' can be combined with
     -    the existing 'direct' (default) and 'inherit' modes via a
     -    comma-separated list. Examples:
     +    is shorthand for
      
     -        git checkout -b new_branch --track=fetch,inherit some_local_branch
     -        git switch -c new_branch --track=fetch origin/some-branch
     +        git fetch origin some-branch
     +        git checkout -b new_branch --track origin/some-branch
      
     -    When "fetch" is requested and <start-point> is in <remote>/<branch>
     -    form, run "git fetch <remote> <branch>" before resolving the ref, so
     -    that other remote-tracking branches are left untouched. If
     -    <start-point> is a bare remote name like "origin" (which resolves to
     -    that remote's default branch), "git fetch <remote>" is run instead,
     -    since the target branch is not known up front. Abort the checkout if
     -    the fetch fails.
     +    Identify the remote whose configured fetch refspec maps to
     +    <start-point>, then run "git fetch <remote> <src-ref>" for just that
     +    ref so other remote-tracking branches are left untouched. When
     +    <start-point> is a bare <remote> (e.g. "origin"), follow
     +    refs/remotes/<remote>/HEAD to learn which branch to refresh. If
     +    "git fetch" fails but the remote-tracking ref already exists locally,
     +    warn and proceed from the existing tip; otherwise abort.
      
          Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
      
     @@ Documentation/git-checkout.adoc: of it").
      +`fetch` requests that the remote be fetched before _<start-point>_ is
      +resolved, so the new branch starts from a fresh tip: when
      +_<start-point>_ is in _<remote>/<branch>_ form, only that branch is
     -+updated; when _<start-point>_ is a bare remote name (e.g. `origin`),
     -+only the remote's default branch is updated. If the fetch fails and the
     ++updated; when _<start-point>_ is a bare _<remote>_ (e.g. `origin`), the
     ++branch named by _<remote>/HEAD_ is updated, and the checkout fails
     ++with a hint to configure that symref if it is not set. The checkout
     ++also fails if no configured remote's fetch refspec maps to
     ++_<start-point>_, or if more than one does (in which case the `fetch`
     ++cannot be unambiguously routed). If the fetch itself fails and the
      +corresponding remote-tracking ref already exists, a warning is printed
      +and the checkout proceeds from the existing tip; otherwise the checkout
      +is aborted.
     @@ Documentation/git-switch.adoc: variable.
      +`--track[=(direct|inherit|fetch)[,...]]`::
       	When creating a new branch, set up "upstream" configuration.
       	`-c` is implied. See `--track` in linkgit:git-branch[1] for
     - 	details.
     +-	details.
     ++	details, and `--track` in linkgit:git-checkout[1] for the
     ++	`fetch` mode.
       +
     -+The argument is a comma-separated list. `direct` (the default) and
     -+`inherit` select the tracking mode and are mutually exclusive. Adding
     -+`fetch` requests that the remote be fetched before _<start-point>_ is
     -+resolved, so the new branch starts from a fresh tip: when
     -+_<start-point>_ is in _<remote>/<branch>_ form, only that branch is
     -+updated; when _<start-point>_ is a bare remote name (e.g. `origin`),
     -+only the remote's default branch is updated. If the fetch fails and the
     -+corresponding remote-tracking ref already exists, a warning is printed
     -+and the switch proceeds from the existing tip; otherwise the switch is
     -+aborted.
     -++
       If no `-c` option is given, the name of the new branch will be derived
       from the remote-tracking branch, by looking at the local part of the
     - refspec configured for the corresponding remote, and then stripping
      
       ## builtin/checkout.c ##
      @@
     @@ builtin/checkout.c: struct branch_info {
       };
       
      +struct fetch_target_cb {
     -+	struct refspec_item query;
     -+	const char *remote_name;
     -+	int matches;
     ++	char *dst;
     ++	struct string_list matches;
      +};
      +
      +static int match_fetch_target(struct remote *remote, void *priv)
      +{
      +	struct fetch_target_cb *cb = priv;
     -+	struct refspec_item q = { .dst = cb->query.dst };
     -+
     -+	if (!remote_find_tracking(remote, &q) && q.src) {
     -+		if (++cb->matches == 1) {
     -+			cb->remote_name = remote->name;
     -+			free(cb->query.src);
     -+			cb->query.src = q.src;
     -+		} else {
     -+			free(q.src);
     -+		}
     -+	}
     ++	struct refspec_item q = { .dst = cb->dst };
     ++
     ++	if (!remote_find_tracking(remote, &q) && q.src)
     ++		string_list_append(&cb->matches, remote->name)->util = q.src;
      +	return 0;
      +}
      +
     -+static int resolve_fetch_target(const char *arg, char **remote_out,
     -+				char **src_ref_out, char **existing_ref_out)
     ++static void fetch_remote_for_start_point(const char *arg, int quiet)
      +{
      +	struct strbuf dst = STRBUF_INIT;
     -+	struct strbuf head_path = STRBUF_INIT;
     -+	struct fetch_target_cb cb = { 0 };
     ++	struct fetch_target_cb cb = { .matches = STRING_LIST_INIT_NODUP };
     ++	struct child_process cmd = CHILD_PROCESS_INIT;
      +	struct object_id oid;
     -+	const char *head_target;
     -+
     -+	*remote_out = NULL;
     -+	*src_ref_out = NULL;
     -+	*existing_ref_out = NULL;
     -+
     -+	if (!arg || !*arg)
     -+		return -1;
     -+
     -+	strbuf_addf(&head_path, "refs/remotes/%s/HEAD", arg);
     -+	head_target = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
     -+					      head_path.buf,
     -+					      RESOLVE_REF_READING |
     -+					      RESOLVE_REF_NO_RECURSE,
     -+					      &oid, NULL);
     -+	if (head_target)
     -+		strbuf_addstr(&dst, head_target);
     -+	else
     -+		strbuf_addf(&dst, "refs/remotes/%s", arg);
     -+
     -+	cb.query.dst = dst.buf;
     -+	for_each_remote(match_fetch_target, &cb);
     -+
     -+	if (cb.matches != 1) {
     -+		free(cb.query.src);
     -+		strbuf_release(&dst);
     -+		strbuf_release(&head_path);
     -+		return -1;
     ++	struct remote *named_remote;
     ++	int bare_ns;
     ++	size_t i;
     ++
     ++	strbuf_addf(&dst, "refs/remotes/%s", arg);
     ++	if (check_refname_format(dst.buf, 0))
     ++		die(_("cannot fetch start-point '%s': not a valid "
     ++		      "remote-tracking name"), arg);
     ++
     ++	named_remote = remote_get(arg);
     ++	bare_ns = !strchr(arg, '/') ||
     ++		(named_remote && remote_is_configured(named_remote, 1));
     ++	if (bare_ns) {
     ++		char *head_path = xstrfmt("refs/remotes/%s/HEAD", arg);
     ++		const char *head_target =
     ++			refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
     ++						head_path,
     ++						RESOLVE_REF_READING |
     ++						RESOLVE_REF_NO_RECURSE,
     ++						&oid, NULL);
     ++		if (head_target &&
     ++		    starts_with(head_target, dst.buf) &&
     ++		    head_target[dst.len] == '/' &&
     ++		    !check_refname_format(head_target, 0)) {
     ++			strbuf_reset(&dst);
     ++			strbuf_addstr(&dst, head_target);
     ++			bare_ns = 0;
     ++		}
     ++		free(head_path);
      +	}
      +
     -+	*remote_out = xstrdup(cb.remote_name);
     -+	*src_ref_out = cb.query.src;
     -+	if (head_target)
     -+		*existing_ref_out = strbuf_detach(&head_path, NULL);
     -+	else if (!refs_read_ref(get_main_ref_store(the_repository),
     -+				dst.buf, &oid))
     -+		*existing_ref_out = strbuf_detach(&dst, NULL);
     -+
     -+	strbuf_release(&dst);
     -+	strbuf_release(&head_path);
     -+	return 0;
     -+}
     ++	cb.dst = dst.buf;
     ++	for_each_remote(match_fetch_target, &cb);
      +
     -+static void fetch_remote_for_start_point(const char *arg)
     -+{
     -+	char *remote_name = NULL;
     -+	char *src_ref = NULL;
     -+	char *existing_ref = NULL;
     -+	struct child_process cmd = CHILD_PROCESS_INIT;
     ++	if (cb.matches.nr > 1) {
     ++		struct strbuf msg = STRBUF_INIT;
     ++
     ++		strbuf_addf(&msg,
     ++			    _("cannot fetch start-point '%s': fetch refspecs "
     ++			      "of multiple remotes map to the same destination:"),
     ++			    arg);
     ++		for (i = 0; i < cb.matches.nr; i++)
     ++			strbuf_addf(&msg, "\n  %s", cb.matches.items[i].string);
     ++		strbuf_addstr(&msg,
     ++			      _("\nadjust 'remote.<name>.fetch' so only one "
     ++				"remote maps there, or omit '=fetch'"));
     ++		die("%s", msg.buf);
     ++	}
      +
     -+	if (resolve_fetch_target(arg, &remote_name, &src_ref, &existing_ref))
     -+		return;
     ++	if (!cb.matches.nr) {
     ++		if (bare_ns && named_remote &&
     ++		    remote_is_configured(named_remote, 1))
     ++			die(_("cannot fetch start-point '%s': "
     ++			      "'refs/remotes/%s/HEAD' is not set; run "
     ++			      "'git remote set-head %s --auto' to set it"),
     ++			    arg, arg, arg);
     ++		die(_("cannot fetch start-point '%s': no configured remote's "
     ++		      "fetch refspec matches it"), arg);
     ++	}
      +
     -+	strvec_pushl(&cmd.args, "fetch", remote_name, NULL);
     -+	if (src_ref)
     -+		strvec_push(&cmd.args, src_ref);
     ++	strvec_push(&cmd.args, "fetch");
     ++	if (quiet)
     ++		strvec_push(&cmd.args, "--quiet");
     ++	strvec_pushl(&cmd.args, cb.matches.items[0].string,
     ++		     (char *)cb.matches.items[0].util, NULL);
      +	cmd.git_cmd = 1;
      +	if (run_command(&cmd)) {
     -+		if (existing_ref)
     ++		if (!refs_read_ref(get_main_ref_store(the_repository),
     ++				   dst.buf, &oid))
      +			warning(_("failed to fetch start-point '%s'; "
     -+				  "using existing '%s'"),
     -+				arg, existing_ref);
     ++				  "using existing '%s'"), arg, dst.buf);
      +		else
      +			die(_("failed to fetch start-point '%s'"), arg);
      +	}
      +
     -+	free(remote_name);
     -+	free(src_ref);
     -+	free(existing_ref);
     ++	for (i = 0; i < cb.matches.nr; i++)
     ++		free(cb.matches.items[i].util);
     ++	string_list_clear(&cb.matches, 0);
     ++	strbuf_release(&dst);
      +}
      +
      +static int parse_opt_checkout_track(const struct option *opt,
     @@ builtin/checkout.c: struct branch_info {
      +	struct checkout_opts *opts = opt->value;
      +	struct string_list tokens = STRING_LIST_INIT_DUP;
      +	struct string_list_item *item;
     -+	int saw_direct = 0, saw_inherit = 0;
     ++	int saw_direct = 0;
      +	int ret = 0;
      +
      +	opts->fetch = 0;
     -+
      +	if (unset) {
      +		opts->track = BRANCH_TRACK_NEVER;
      +		return 0;
      +	}
     -+
      +	opts->track = BRANCH_TRACK_EXPLICIT;
      +	if (!arg)
      +		return 0;
      +
      +	string_list_split(&tokens, arg, ",", -1);
      +	for_each_string_list_item(item, &tokens) {
     -+		if (!strcmp(item->string, "fetch")) {
     ++		if (!strcmp(item->string, "fetch"))
      +			opts->fetch = 1;
     -+		} else if (!strcmp(item->string, "direct")) {
     ++		else if (!strcmp(item->string, "direct"))
      +			saw_direct = 1;
     -+			opts->track = BRANCH_TRACK_EXPLICIT;
     -+		} else if (!strcmp(item->string, "inherit")) {
     -+			saw_inherit = 1;
     ++		else if (!strcmp(item->string, "inherit"))
      +			opts->track = BRANCH_TRACK_INHERIT;
     -+		} else {
     ++		else {
      +			ret = error(_("option `%s' expects \"%s\", \"%s\", "
      +				      "or \"%s\""),
      +				    "--track", "direct", "inherit", "fetch");
      +			goto out;
      +		}
      +	}
     -+
     -+	if (saw_direct && saw_inherit)
     ++	if (saw_direct && opts->track == BRANCH_TRACK_INHERIT)
      +		ret = error(_("option `%s' cannot combine \"%s\" and \"%s\""),
      +			    "--track", "direct", "inherit");
     -+
      +out:
      +	string_list_clear(&tokens, 0);
      +	return ret;
     @@ builtin/checkout.c: static int checkout_main(int argc, const char **argv, const
      +		int n;
      +
      +		if (opts->fetch)
     -+			fetch_remote_for_start_point(argv[0]);
     ++			fetch_remote_for_start_point(argv[0], opts->quiet);
      +
      +		n = parse_branchname_arg(argc, argv, dwim_ok, which_command,
      +					 &new_branch_info, opts, &rev);
     @@ t/t7201-co.sh: test_expect_success 'tracking info copied with autoSetupMerge=inh
      +	git checkout main &&
      +	git remote add fetch_ns ./fetch_upstream &&
      +	test_when_finished "git remote remove fetch_ns" &&
     ++	test_when_finished "git update-ref -d refs/remotes/ns_alias/HEAD" &&
      +	git config --replace-all remote.fetch_ns.fetch \
      +		"+refs/heads/*:refs/remotes/ns_alias/*" &&
      +	git fetch fetch_ns &&
     @@ t/t7201-co.sh: test_expect_success 'tracking info copied with autoSetupMerge=inh
      +	test_cmp_config refs/heads/main branch.local_ns.merge
      +'
      +
     ++test_expect_success '--track=fetch on bare hierarchical remote name follows <ns>/HEAD' '
     ++	git checkout main &&
     ++	git remote add nested/bare ./fetch_upstream &&
     ++	test_when_finished "git remote remove nested/bare" &&
     ++	test_when_finished "git update-ref -d refs/remotes/nested/bare/HEAD" &&
     ++	git fetch nested/bare &&
     ++	git symbolic-ref refs/remotes/nested/bare/HEAD \
     ++		refs/remotes/nested/bare/main &&
     ++	git -C fetch_upstream checkout main &&
     ++	test_commit -C fetch_upstream u_nested_bare_post &&
     ++	git checkout --track=fetch -b local_nested_bare nested/bare &&
     ++	test_cmp_rev refs/remotes/nested/bare/main HEAD
     ++'
     ++
      +test_expect_success 'checkout --track=fetch handles hierarchical remote name' '
      +	git checkout main &&
     -+	git -C fetch_upstream checkout -b fetch_hier &&
     -+	test_commit -C fetch_upstream u_hier &&
      +	git remote add nested/remote ./fetch_upstream &&
      +	test_when_finished "git remote remove nested/remote" &&
     -+	git fetch nested/remote fetch_hier &&
     -+	test_commit -C fetch_upstream u_hier_post &&
     ++	git -C fetch_upstream checkout -b fetch_hier &&
     ++	test_commit -C fetch_upstream u_hier &&
     ++	test_must_fail git rev-parse --verify refs/remotes/nested/remote/fetch_hier &&
      +	git checkout --track=fetch -b local_hier nested/remote/fetch_hier &&
      +	test_cmp_rev refs/remotes/nested/remote/fetch_hier HEAD
      +'
      +
     ++test_expect_success 'checkout --track=fetch dies on bare remote name with no <ns>/HEAD' '
     ++	git checkout main &&
     ++	git remote add fetch_nohead ./fetch_upstream &&
     ++	test_when_finished "git remote remove fetch_nohead" &&
     ++	test_might_fail git symbolic-ref -d refs/remotes/fetch_nohead/HEAD &&
     ++	test_must_fail git checkout --track=fetch -b local_nohead fetch_nohead 2>err &&
     ++	test_grep "refs/remotes/fetch_nohead/HEAD" err &&
     ++	test_grep "git remote set-head fetch_nohead --auto" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/local_nohead
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch on bare unknown name does not suggest set-head' '
     ++	git checkout main &&
     ++	test_must_fail git rev-parse --verify refs/remotes/no_such_ns/HEAD &&
     ++	test_must_fail git config --get remote.no_such_ns.url &&
     ++	test_must_fail git checkout --track=fetch -b local_unknown no_such_ns 2>err &&
     ++	test_grep "no configured remote" err &&
     ++	test_grep ! "set-head" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/local_unknown
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch rejects <ns>/HEAD pointing outside namespace' '
     ++	git checkout main &&
     ++	git remote add fetch_crossns ./fetch_upstream &&
     ++	test_when_finished "git remote remove fetch_crossns" &&
     ++	test_when_finished "git update-ref -d refs/remotes/fetch_crossns/HEAD" &&
     ++	git fetch fetch_crossns &&
     ++	git symbolic-ref refs/remotes/fetch_crossns/HEAD \
     ++		refs/remotes/fetch_upstream/u_main &&
     ++	test_must_fail git checkout --track=fetch -b local_crossns fetch_crossns 2>err &&
     ++	test_grep "refs/remotes/fetch_crossns/HEAD" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/local_crossns
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch dies on ambiguous fetch refspec match' '
     ++	git checkout main &&
     ++	git remote add fetch_ambig_a ./fetch_upstream &&
     ++	git remote add fetch_ambig_b ./fetch_upstream &&
     ++	test_when_finished "git remote remove fetch_ambig_a" &&
     ++	test_when_finished "git remote remove fetch_ambig_b" &&
     ++	git config --replace-all remote.fetch_ambig_a.fetch \
     ++		"+refs/heads/*:refs/remotes/ambig_ns/*" &&
     ++	git config --replace-all remote.fetch_ambig_b.fetch \
     ++		"+refs/heads/*:refs/remotes/ambig_ns/*" &&
     ++	git -C fetch_upstream checkout -b fetch_ambig &&
     ++	test_commit -C fetch_upstream u_ambig &&
     ++	test_must_fail git checkout --track=fetch -b local_ambig ambig_ns/fetch_ambig 2>err &&
     ++	test_grep "fetch_ambig_a" err &&
     ++	test_grep "fetch_ambig_b" err &&
     ++	test_grep "remote.<name>.fetch" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/local_ambig
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch rejects invalid refname components' '
     ++	git checkout main &&
     ++	test_must_fail git checkout --track=fetch -b local_invalid "foo..bar" 2>err &&
     ++	test_grep "valid" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/local_invalid
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch,inherit rejects invalid refname components' '
     ++	git checkout main &&
     ++	test_must_fail git checkout --track=fetch,inherit -b local_invalid \
     ++		"foo..bar" 2>err &&
     ++	test_grep "valid" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/local_invalid
     ++'
     ++
      +test_expect_success 'checkout --track=inherit,direct is rejected' '
      +	test_must_fail git checkout --track=inherit,direct -b bad fetch_upstream/fetch_new 2>err &&
      +	test_grep "cannot combine" err
      +'
      +
     ++test_expect_success 'checkout --track=direct,inherit is rejected' '
     ++	test_must_fail git checkout --track=direct,inherit -b bad fetch_upstream/fetch_new 2>err &&
     ++	test_grep "cannot combine" err
     ++'
     ++
      +test_expect_success 'checkout --track=fetch then --track=direct drops fetch (last-one-wins)' '
      +	git checkout main &&
      +	git -C fetch_upstream checkout -b fetch_lastwin &&
     @@ t/t7201-co.sh: test_expect_success 'tracking info copied with autoSetupMerge=inh
      +	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_lastwin
      +'
      +
     -+test_expect_success 'checkout --track=fetch,inherit fetches and inherits' '
     ++test_expect_success 'checkout --track=fetch then --no-track drops fetch' '
     ++	git checkout main &&
     ++	git -C fetch_upstream checkout -b fetch_notrack &&
     ++	test_commit -C fetch_upstream u_notrack &&
     ++	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_notrack &&
     ++	test_must_fail git checkout --track=fetch --no-track \
     ++		-b local_notrack fetch_upstream/fetch_notrack &&
     ++	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_notrack
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch,inherit fetches remote-tracking start-point' '
      +	git checkout main &&
      +	git -C fetch_upstream checkout -b fetch_inherit &&
      +	test_commit -C fetch_upstream u_inherit &&
     -+	git fetch fetch_upstream fetch_inherit &&
     -+	git checkout -b base_inherit fetch_upstream/fetch_inherit &&
     -+	test_commit -C fetch_upstream u_inherit2 &&
     ++	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_inherit &&
     ++	git checkout --track=fetch,inherit -b local_inherit \
     ++		fetch_upstream/fetch_inherit &&
     ++	test_cmp_rev refs/remotes/fetch_upstream/fetch_inherit HEAD
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch,inherit errors when start-point does not map to a remote' '
      +	git checkout main &&
     -+	git checkout --track=fetch,inherit -b local_inherit base_inherit &&
     -+	test_cmp_rev refs/remotes/fetch_upstream/fetch_inherit HEAD &&
     -+	test_cmp_config fetch_upstream branch.local_inherit.remote &&
     -+	test_cmp_config refs/heads/fetch_inherit branch.local_inherit.merge
     ++	test_must_fail git checkout --track=fetch,inherit -b bad main 2>err &&
     ++	test_grep "no configured remote" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/bad
     ++'
     ++
     ++test_expect_success 'checkout --track=fetch on local start-point errors' '
     ++	git checkout main &&
     ++	test_must_fail git checkout --track=fetch -b bad main 2>err &&
     ++	test_grep "no configured remote" err &&
     ++	test_must_fail git rev-parse --verify refs/heads/bad
      +'
      +
      +test_expect_success 'checkout --track=bogus reports an error' '
     @@ t/t7201-co.sh: test_expect_success 'tracking info copied with autoSetupMerge=inh
      +	test_grep "expects" err
      +'
      +
     ++test_expect_success 'checkout -q --track=fetch silences the fetch output' '
     ++	git checkout main &&
     ++	git -C fetch_upstream checkout -b fetch_quiet &&
     ++	test_commit -C fetch_upstream u_quiet &&
     ++	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_quiet &&
     ++	git checkout -q --track=fetch -b local_quiet \
     ++		fetch_upstream/fetch_quiet 2>err &&
     ++	test_grep ! "-> fetch_upstream/fetch_quiet" err &&
     ++	test_cmp_rev refs/remotes/fetch_upstream/fetch_quiet HEAD
     ++'
     ++
      +test_expect_success 'switch --track=fetch -c picks up branch created upstream after clone' '
      +	git checkout main &&
      +	git -C fetch_upstream checkout -b fetch_switch &&


 Documentation/git-checkout.adoc |  17 +-
 Documentation/git-switch.adoc   |   5 +-
 builtin/checkout.c              | 159 +++++++++++++++++-
 t/t7201-co.sh                   | 276 ++++++++++++++++++++++++++++++++
 4 files changed, 450 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-checkout.adoc b/Documentation/git-checkout.adoc
index a8b3b8c2e2..20b6cae60e 100644
--- a/Documentation/git-checkout.adoc
+++ b/Documentation/git-checkout.adoc
@@ -158,11 +158,26 @@ of it").
 	resets _<branch>_ to the start point instead of failing.
 
 `-t`::
-`--track[=(direct|inherit)]`::
+`--track[=(direct|inherit|fetch)[,...]]`::
 	When creating a new branch, set up "upstream" configuration. See
 	`--track` in linkgit:git-branch[1] for details. As a convenience,
 	--track without -b implies branch creation.
 +
+The argument is a comma-separated list. `direct` (the default) and
+`inherit` select the tracking mode and are mutually exclusive. Adding
+`fetch` requests that the remote be fetched before _<start-point>_ is
+resolved, so the new branch starts from a fresh tip: when
+_<start-point>_ is in _<remote>/<branch>_ form, only that branch is
+updated; when _<start-point>_ is a bare _<remote>_ (e.g. `origin`), the
+branch named by _<remote>/HEAD_ is updated, and the checkout fails
+with a hint to configure that symref if it is not set. The checkout
+also fails if no configured remote's fetch refspec maps to
+_<start-point>_, or if more than one does (in which case the `fetch`
+cannot be unambiguously routed). If the fetch itself fails and the
+corresponding remote-tracking ref already exists, a warning is printed
+and the checkout proceeds from the existing tip; otherwise the checkout
+is aborted.
++
 If no `-b` option is given, the name of the new branch will be
 derived from the remote-tracking branch, by looking at the local part of
 the refspec configured for the corresponding remote, and then stripping
diff --git a/Documentation/git-switch.adoc b/Documentation/git-switch.adoc
index d6c4f229a5..a8730b1da8 100644
--- a/Documentation/git-switch.adoc
+++ b/Documentation/git-switch.adoc
@@ -155,10 +155,11 @@ variable.
 	attached to a terminal, regardless of `--quiet`.
 
 `-t`::
-`--track[ (direct|inherit)]`::
+`--track[=(direct|inherit|fetch)[,...]]`::
 	When creating a new branch, set up "upstream" configuration.
 	`-c` is implied. See `--track` in linkgit:git-branch[1] for
-	details.
+	details, and `--track` in linkgit:git-checkout[1] for the
+	`fetch` mode.
 +
 If no `-c` option is given, the name of the new branch will be derived
 from the remote-tracking branch, by looking at the local part of the
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 1345e8574a..9c5c4f1c2e 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -25,10 +25,12 @@
 #include "preload-index.h"
 #include "read-cache.h"
 #include "refs.h"
+#include "refspec.h"
 #include "remote.h"
 #include "repo-settings.h"
 #include "resolve-undo.h"
 #include "revision.h"
+#include "run-command.h"
 #include "sequencer.h"
 #include "setup.h"
 #include "strvec.h"
@@ -62,6 +64,7 @@ struct checkout_opts {
 	int count_checkout_paths;
 	int overlay_mode;
 	int dwim_new_local_branch;
+	int fetch;
 	int discard_changes;
 	int accept_ref;
 	int accept_pathspec;
@@ -115,6 +118,149 @@ struct branch_info {
 	char *checkout;
 };
 
+struct fetch_target_cb {
+	char *dst;
+	struct string_list matches;
+};
+
+static int match_fetch_target(struct remote *remote, void *priv)
+{
+	struct fetch_target_cb *cb = priv;
+	struct refspec_item q = { .dst = cb->dst };
+
+	if (!remote_find_tracking(remote, &q) && q.src)
+		string_list_append(&cb->matches, remote->name)->util = q.src;
+	return 0;
+}
+
+static void fetch_remote_for_start_point(const char *arg, int quiet)
+{
+	struct strbuf dst = STRBUF_INIT;
+	struct fetch_target_cb cb = { .matches = STRING_LIST_INIT_NODUP };
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	struct object_id oid;
+	struct remote *named_remote;
+	int bare_ns;
+	size_t i;
+
+	strbuf_addf(&dst, "refs/remotes/%s", arg);
+	if (check_refname_format(dst.buf, 0))
+		die(_("cannot fetch start-point '%s': not a valid "
+		      "remote-tracking name"), arg);
+
+	named_remote = remote_get(arg);
+	bare_ns = !strchr(arg, '/') ||
+		(named_remote && remote_is_configured(named_remote, 1));
+	if (bare_ns) {
+		char *head_path = xstrfmt("refs/remotes/%s/HEAD", arg);
+		const char *head_target =
+			refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
+						head_path,
+						RESOLVE_REF_READING |
+						RESOLVE_REF_NO_RECURSE,
+						&oid, NULL);
+		if (head_target &&
+		    starts_with(head_target, dst.buf) &&
+		    head_target[dst.len] == '/' &&
+		    !check_refname_format(head_target, 0)) {
+			strbuf_reset(&dst);
+			strbuf_addstr(&dst, head_target);
+			bare_ns = 0;
+		}
+		free(head_path);
+	}
+
+	cb.dst = dst.buf;
+	for_each_remote(match_fetch_target, &cb);
+
+	if (cb.matches.nr > 1) {
+		struct strbuf msg = STRBUF_INIT;
+
+		strbuf_addf(&msg,
+			    _("cannot fetch start-point '%s': fetch refspecs "
+			      "of multiple remotes map to the same destination:"),
+			    arg);
+		for (i = 0; i < cb.matches.nr; i++)
+			strbuf_addf(&msg, "\n  %s", cb.matches.items[i].string);
+		strbuf_addstr(&msg,
+			      _("\nadjust 'remote.<name>.fetch' so only one "
+				"remote maps there, or omit '=fetch'"));
+		die("%s", msg.buf);
+	}
+
+	if (!cb.matches.nr) {
+		if (bare_ns && named_remote &&
+		    remote_is_configured(named_remote, 1))
+			die(_("cannot fetch start-point '%s': "
+			      "'refs/remotes/%s/HEAD' is not set; run "
+			      "'git remote set-head %s --auto' to set it"),
+			    arg, arg, arg);
+		die(_("cannot fetch start-point '%s': no configured remote's "
+		      "fetch refspec matches it"), arg);
+	}
+
+	strvec_push(&cmd.args, "fetch");
+	if (quiet)
+		strvec_push(&cmd.args, "--quiet");
+	strvec_pushl(&cmd.args, cb.matches.items[0].string,
+		     (char *)cb.matches.items[0].util, NULL);
+	cmd.git_cmd = 1;
+	if (run_command(&cmd)) {
+		if (!refs_read_ref(get_main_ref_store(the_repository),
+				   dst.buf, &oid))
+			warning(_("failed to fetch start-point '%s'; "
+				  "using existing '%s'"), arg, dst.buf);
+		else
+			die(_("failed to fetch start-point '%s'"), arg);
+	}
+
+	for (i = 0; i < cb.matches.nr; i++)
+		free(cb.matches.items[i].util);
+	string_list_clear(&cb.matches, 0);
+	strbuf_release(&dst);
+}
+
+static int parse_opt_checkout_track(const struct option *opt,
+				    const char *arg, int unset)
+{
+	struct checkout_opts *opts = opt->value;
+	struct string_list tokens = STRING_LIST_INIT_DUP;
+	struct string_list_item *item;
+	int saw_direct = 0;
+	int ret = 0;
+
+	opts->fetch = 0;
+	if (unset) {
+		opts->track = BRANCH_TRACK_NEVER;
+		return 0;
+	}
+	opts->track = BRANCH_TRACK_EXPLICIT;
+	if (!arg)
+		return 0;
+
+	string_list_split(&tokens, arg, ",", -1);
+	for_each_string_list_item(item, &tokens) {
+		if (!strcmp(item->string, "fetch"))
+			opts->fetch = 1;
+		else if (!strcmp(item->string, "direct"))
+			saw_direct = 1;
+		else if (!strcmp(item->string, "inherit"))
+			opts->track = BRANCH_TRACK_INHERIT;
+		else {
+			ret = error(_("option `%s' expects \"%s\", \"%s\", "
+				      "or \"%s\""),
+				    "--track", "direct", "inherit", "fetch");
+			goto out;
+		}
+	}
+	if (saw_direct && opts->track == BRANCH_TRACK_INHERIT)
+		ret = error(_("option `%s' cannot combine \"%s\" and \"%s\""),
+			    "--track", "direct", "inherit");
+out:
+	string_list_clear(&tokens, 0);
+	return ret;
+}
+
 static void branch_info_release(struct branch_info *info)
 {
 	free(info->name);
@@ -1733,10 +1879,10 @@ static struct option *add_common_switch_branch_options(
 {
 	struct option options[] = {
 		OPT_BOOL('d', "detach", &opts->force_detach, N_("detach HEAD at named commit")),
-		OPT_CALLBACK_F('t', "track",  &opts->track, "(direct|inherit)",
+		OPT_CALLBACK_F('t', "track",  opts, "(direct|inherit|fetch)[,...]",
 			N_("set branch tracking configuration"),
 			PARSE_OPT_OPTARG,
-			parse_opt_tracking_mode),
+			parse_opt_checkout_track),
 		OPT__FORCE(&opts->force, N_("force checkout (throw away local modifications)"),
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_STRING(0, "orphan", &opts->new_orphan_branch, N_("new-branch"), N_("new unborn branch")),
@@ -1941,8 +2087,13 @@ static int checkout_main(int argc, const char **argv, const char *prefix,
 			opts->dwim_new_local_branch &&
 			opts->track == BRANCH_TRACK_UNSPECIFIED &&
 			!opts->new_branch;
-		int n = parse_branchname_arg(argc, argv, dwim_ok, which_command,
-					     &new_branch_info, opts, &rev);
+		int n;
+
+		if (opts->fetch)
+			fetch_remote_for_start_point(argv[0], opts->quiet);
+
+		n = parse_branchname_arg(argc, argv, dwim_ok, which_command,
+					 &new_branch_info, opts, &rev);
 		argv += n;
 		argc -= n;
 	} else if (!opts->accept_ref && opts->from_treeish) {
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index 7613b1d2a4..c4a165cb1d 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -870,4 +870,280 @@ test_expect_success 'tracking info copied with autoSetupMerge=inherit' '
 	test_cmp_config "" --default "" branch.main2.merge
 '
 
+test_expect_success 'setup upstream for --track=fetch tests' '
+	git checkout main &&
+	git init fetch_upstream &&
+	test_commit -C fetch_upstream u_main &&
+	git remote add fetch_upstream fetch_upstream &&
+	git fetch fetch_upstream &&
+	git -C fetch_upstream checkout -b fetch_new &&
+	test_commit -C fetch_upstream u_new
+'
+
+test_expect_success 'checkout --track=fetch -b picks up branch created upstream after clone' '
+	git checkout main &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_new &&
+	git checkout --track=fetch -b local_new fetch_upstream/fetch_new &&
+	test_cmp_rev refs/remotes/fetch_upstream/fetch_new HEAD &&
+	test_cmp_config fetch_upstream branch.local_new.remote &&
+	test_cmp_config refs/heads/fetch_new branch.local_new.merge
+'
+
+test_expect_success 'checkout --track=fetch <remote>/<branch> leaves other tracking branches untouched' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_target &&
+	test_commit -C fetch_upstream u_target_pre &&
+	git -C fetch_upstream checkout -b fetch_other &&
+	test_commit -C fetch_upstream u_other_pre &&
+	git fetch fetch_upstream &&
+	other_before=$(git rev-parse refs/remotes/fetch_upstream/fetch_other) &&
+	git -C fetch_upstream checkout fetch_target &&
+	test_commit -C fetch_upstream u_target_post &&
+	git -C fetch_upstream checkout fetch_other &&
+	test_commit -C fetch_upstream u_other_post &&
+	git checkout --track=fetch -b local_target fetch_upstream/fetch_target &&
+	test_cmp_rev refs/remotes/fetch_upstream/fetch_target HEAD &&
+	test "$(git rev-parse refs/remotes/fetch_upstream/fetch_other)" = "$other_before"
+'
+
+test_expect_success 'checkout --track=fetch with bare remote name fetches only <remote>/HEAD target' '
+	git checkout main &&
+	git -C fetch_upstream checkout main &&
+	git remote set-head fetch_upstream main &&
+	git -C fetch_upstream checkout -b fetch_unrelated &&
+	test_commit -C fetch_upstream u_unrelated_pre &&
+	git fetch fetch_upstream fetch_unrelated &&
+	unrelated_before=$(git rev-parse refs/remotes/fetch_upstream/fetch_unrelated) &&
+	git -C fetch_upstream checkout main &&
+	test_commit -C fetch_upstream u_main_post &&
+	git -C fetch_upstream checkout fetch_unrelated &&
+	test_commit -C fetch_upstream u_unrelated_post &&
+	git checkout --track=fetch -b local_from_remote fetch_upstream &&
+	test_cmp_rev refs/remotes/fetch_upstream/main HEAD &&
+	test "$(git rev-parse refs/remotes/fetch_upstream/fetch_unrelated)" = "$unrelated_before"
+'
+
+test_expect_success 'checkout --track=fetch aborts and does not create branch when no existing ref' '
+	git checkout main &&
+	test_might_fail git branch -D bogus &&
+	test_must_fail git checkout --track=fetch -b bogus fetch_upstream/does_not_exist &&
+	test_must_fail git rev-parse --verify refs/heads/bogus
+'
+
+test_expect_success 'checkout --track=fetch warns and proceeds when fetch fails but ref exists' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_offline &&
+	test_commit -C fetch_upstream u_offline &&
+	git fetch fetch_upstream fetch_offline &&
+	saved_url=$(git config remote.fetch_upstream.url) &&
+	test_when_finished "git config remote.fetch_upstream.url \"$saved_url\"" &&
+	git config remote.fetch_upstream.url ./does-not-exist &&
+	git checkout --track=fetch -b local_offline fetch_upstream/fetch_offline 2>err &&
+	test_grep "failed to fetch" err &&
+	test_cmp_rev refs/remotes/fetch_upstream/fetch_offline HEAD
+'
+
+test_expect_success 'checkout --track=fetch resolves through configured fetch refspec' '
+	git checkout main &&
+	git remote add fetch_custom ./fetch_upstream &&
+	test_when_finished "git remote remove fetch_custom" &&
+	git config --replace-all remote.fetch_custom.fetch \
+		"+refs/heads/*:refs/remotes/custom-ns/*" &&
+	git -C fetch_upstream checkout -b fetch_refspec &&
+	test_commit -C fetch_upstream u_refspec &&
+	test_must_fail git rev-parse --verify refs/remotes/custom-ns/fetch_refspec &&
+	git checkout --track=fetch -b local_refspec custom-ns/fetch_refspec &&
+	test_cmp_rev refs/remotes/custom-ns/fetch_refspec HEAD
+'
+
+test_expect_success 'checkout --track=fetch on namespace bare name follows <ns>/HEAD' '
+	git checkout main &&
+	git remote add fetch_ns ./fetch_upstream &&
+	test_when_finished "git remote remove fetch_ns" &&
+	test_when_finished "git update-ref -d refs/remotes/ns_alias/HEAD" &&
+	git config --replace-all remote.fetch_ns.fetch \
+		"+refs/heads/*:refs/remotes/ns_alias/*" &&
+	git fetch fetch_ns &&
+	git symbolic-ref refs/remotes/ns_alias/HEAD refs/remotes/ns_alias/main &&
+	git -C fetch_upstream checkout main &&
+	test_commit -C fetch_upstream u_ns_post &&
+	git checkout --track=fetch -b local_ns ns_alias &&
+	test_cmp_rev refs/remotes/ns_alias/main HEAD &&
+	test_cmp_config fetch_ns branch.local_ns.remote &&
+	test_cmp_config refs/heads/main branch.local_ns.merge
+'
+
+test_expect_success '--track=fetch on bare hierarchical remote name follows <ns>/HEAD' '
+	git checkout main &&
+	git remote add nested/bare ./fetch_upstream &&
+	test_when_finished "git remote remove nested/bare" &&
+	test_when_finished "git update-ref -d refs/remotes/nested/bare/HEAD" &&
+	git fetch nested/bare &&
+	git symbolic-ref refs/remotes/nested/bare/HEAD \
+		refs/remotes/nested/bare/main &&
+	git -C fetch_upstream checkout main &&
+	test_commit -C fetch_upstream u_nested_bare_post &&
+	git checkout --track=fetch -b local_nested_bare nested/bare &&
+	test_cmp_rev refs/remotes/nested/bare/main HEAD
+'
+
+test_expect_success 'checkout --track=fetch handles hierarchical remote name' '
+	git checkout main &&
+	git remote add nested/remote ./fetch_upstream &&
+	test_when_finished "git remote remove nested/remote" &&
+	git -C fetch_upstream checkout -b fetch_hier &&
+	test_commit -C fetch_upstream u_hier &&
+	test_must_fail git rev-parse --verify refs/remotes/nested/remote/fetch_hier &&
+	git checkout --track=fetch -b local_hier nested/remote/fetch_hier &&
+	test_cmp_rev refs/remotes/nested/remote/fetch_hier HEAD
+'
+
+test_expect_success 'checkout --track=fetch dies on bare remote name with no <ns>/HEAD' '
+	git checkout main &&
+	git remote add fetch_nohead ./fetch_upstream &&
+	test_when_finished "git remote remove fetch_nohead" &&
+	test_might_fail git symbolic-ref -d refs/remotes/fetch_nohead/HEAD &&
+	test_must_fail git checkout --track=fetch -b local_nohead fetch_nohead 2>err &&
+	test_grep "refs/remotes/fetch_nohead/HEAD" err &&
+	test_grep "git remote set-head fetch_nohead --auto" err &&
+	test_must_fail git rev-parse --verify refs/heads/local_nohead
+'
+
+test_expect_success 'checkout --track=fetch on bare unknown name does not suggest set-head' '
+	git checkout main &&
+	test_must_fail git rev-parse --verify refs/remotes/no_such_ns/HEAD &&
+	test_must_fail git config --get remote.no_such_ns.url &&
+	test_must_fail git checkout --track=fetch -b local_unknown no_such_ns 2>err &&
+	test_grep "no configured remote" err &&
+	test_grep ! "set-head" err &&
+	test_must_fail git rev-parse --verify refs/heads/local_unknown
+'
+
+test_expect_success 'checkout --track=fetch rejects <ns>/HEAD pointing outside namespace' '
+	git checkout main &&
+	git remote add fetch_crossns ./fetch_upstream &&
+	test_when_finished "git remote remove fetch_crossns" &&
+	test_when_finished "git update-ref -d refs/remotes/fetch_crossns/HEAD" &&
+	git fetch fetch_crossns &&
+	git symbolic-ref refs/remotes/fetch_crossns/HEAD \
+		refs/remotes/fetch_upstream/u_main &&
+	test_must_fail git checkout --track=fetch -b local_crossns fetch_crossns 2>err &&
+	test_grep "refs/remotes/fetch_crossns/HEAD" err &&
+	test_must_fail git rev-parse --verify refs/heads/local_crossns
+'
+
+test_expect_success 'checkout --track=fetch dies on ambiguous fetch refspec match' '
+	git checkout main &&
+	git remote add fetch_ambig_a ./fetch_upstream &&
+	git remote add fetch_ambig_b ./fetch_upstream &&
+	test_when_finished "git remote remove fetch_ambig_a" &&
+	test_when_finished "git remote remove fetch_ambig_b" &&
+	git config --replace-all remote.fetch_ambig_a.fetch \
+		"+refs/heads/*:refs/remotes/ambig_ns/*" &&
+	git config --replace-all remote.fetch_ambig_b.fetch \
+		"+refs/heads/*:refs/remotes/ambig_ns/*" &&
+	git -C fetch_upstream checkout -b fetch_ambig &&
+	test_commit -C fetch_upstream u_ambig &&
+	test_must_fail git checkout --track=fetch -b local_ambig ambig_ns/fetch_ambig 2>err &&
+	test_grep "fetch_ambig_a" err &&
+	test_grep "fetch_ambig_b" err &&
+	test_grep "remote.<name>.fetch" err &&
+	test_must_fail git rev-parse --verify refs/heads/local_ambig
+'
+
+test_expect_success 'checkout --track=fetch rejects invalid refname components' '
+	git checkout main &&
+	test_must_fail git checkout --track=fetch -b local_invalid "foo..bar" 2>err &&
+	test_grep "valid" err &&
+	test_must_fail git rev-parse --verify refs/heads/local_invalid
+'
+
+test_expect_success 'checkout --track=fetch,inherit rejects invalid refname components' '
+	git checkout main &&
+	test_must_fail git checkout --track=fetch,inherit -b local_invalid \
+		"foo..bar" 2>err &&
+	test_grep "valid" err &&
+	test_must_fail git rev-parse --verify refs/heads/local_invalid
+'
+
+test_expect_success 'checkout --track=inherit,direct is rejected' '
+	test_must_fail git checkout --track=inherit,direct -b bad fetch_upstream/fetch_new 2>err &&
+	test_grep "cannot combine" err
+'
+
+test_expect_success 'checkout --track=direct,inherit is rejected' '
+	test_must_fail git checkout --track=direct,inherit -b bad fetch_upstream/fetch_new 2>err &&
+	test_grep "cannot combine" err
+'
+
+test_expect_success 'checkout --track=fetch then --track=direct drops fetch (last-one-wins)' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_lastwin &&
+	test_commit -C fetch_upstream u_lastwin &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_lastwin &&
+	test_must_fail git checkout --track=fetch --track=direct \
+		-b local_lastwin fetch_upstream/fetch_lastwin &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_lastwin
+'
+
+test_expect_success 'checkout --track=fetch then --no-track drops fetch' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_notrack &&
+	test_commit -C fetch_upstream u_notrack &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_notrack &&
+	test_must_fail git checkout --track=fetch --no-track \
+		-b local_notrack fetch_upstream/fetch_notrack &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_notrack
+'
+
+test_expect_success 'checkout --track=fetch,inherit fetches remote-tracking start-point' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_inherit &&
+	test_commit -C fetch_upstream u_inherit &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_inherit &&
+	git checkout --track=fetch,inherit -b local_inherit \
+		fetch_upstream/fetch_inherit &&
+	test_cmp_rev refs/remotes/fetch_upstream/fetch_inherit HEAD
+'
+
+test_expect_success 'checkout --track=fetch,inherit errors when start-point does not map to a remote' '
+	git checkout main &&
+	test_must_fail git checkout --track=fetch,inherit -b bad main 2>err &&
+	test_grep "no configured remote" err &&
+	test_must_fail git rev-parse --verify refs/heads/bad
+'
+
+test_expect_success 'checkout --track=fetch on local start-point errors' '
+	git checkout main &&
+	test_must_fail git checkout --track=fetch -b bad main 2>err &&
+	test_grep "no configured remote" err &&
+	test_must_fail git rev-parse --verify refs/heads/bad
+'
+
+test_expect_success 'checkout --track=bogus reports an error' '
+	git checkout main &&
+	test_must_fail git checkout --track=bogus -b bogus_branch fetch_upstream/fetch_new 2>err &&
+	test_grep "expects" err
+'
+
+test_expect_success 'checkout -q --track=fetch silences the fetch output' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_quiet &&
+	test_commit -C fetch_upstream u_quiet &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_quiet &&
+	git checkout -q --track=fetch -b local_quiet \
+		fetch_upstream/fetch_quiet 2>err &&
+	test_grep ! -e "-> fetch_upstream/fetch_quiet" err &&
+	test_cmp_rev refs/remotes/fetch_upstream/fetch_quiet HEAD
+'
+
+test_expect_success 'switch --track=fetch -c picks up branch created upstream after clone' '
+	git checkout main &&
+	git -C fetch_upstream checkout -b fetch_switch &&
+	test_commit -C fetch_upstream u_switch &&
+	test_must_fail git rev-parse --verify refs/remotes/fetch_upstream/fetch_switch &&
+	git switch --track=fetch -c local_switch fetch_upstream/fetch_switch &&
+	test_cmp_rev refs/remotes/fetch_upstream/fetch_switch HEAD
+'
+
 test_done

base-commit: aec3f587505a472db67e9462d0702e7d463a449d
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH 4/9] run-command: add support for timeout in command finisher
From: Siddh Raman Pant @ 2026-05-21  9:59 UTC (permalink / raw)
  To: j6t@kdbg.org
  Cc: git@vger.kernel.org, gitster@pobox.com, newren@gmail.com,
	ps@pks.im, oswald.buddenhagen@gmx.de, code@khaugsbakk.name
In-Reply-To: <b69605a6-e841-47b9-a899-a57e184d3c8b@kdbg.org>

On Thu, May 21 2026 at 12:51:51 +0530, Johannes Sixt wrote:
> This is extremely suspicious. A communication protocol with a child
> program that requires to kill the child looks like a design error. A
> band-aid like this timeout should not be necessary for a well-behaved
> child process.

I do not think this is a protocol design error. The normal protocol does
not require killing the helper: git sends one object id, the helper
sends one bounded response, and the helper exits when git closes its
pipes.

The timeout is for the failure path, where the external helper has
already stopped following that protocol or is blocked on something
outside git's control. Since git starts the helper and puts it on the
log/grep path, git also needs a bounded way to recover when that helper
does not make progress. Otherwise an optional note source can prevent
the main git command from completing.

> If the (your?) problem is that the child process is actually not
> well-behaved, then I suggest to use a middle-man as child process that
> behaves well from the point of view of the git process, but can punish
> the ill-behaved downstream process when needed.

A middle-man would need the same timeout/termination/reaping logic, and
git would still need to handle the middle-man itself hanging / failing.
So I don't think it removes the problem; it just makes each user or
deployment carry that process-supervision logic outside git.

External notes are additive. If the helper misbehaves, the intended
behavior is to warn once, disable that source for the rest of the
process, and let git continue without those notes. That seems
preferable to leaving git stuck in finish_command().
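
The failure path described above could be sketched roughly as follows.
This is only an illustration under POSIX assumptions, not the code from
the patch: wait_with_timeout() and spawn_sleeper() are hypothetical
names, the 10ms polling interval is arbitrary, and a real implementation
might try SIGTERM before resorting to SIGKILL.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not git's run-command API): wait roughly
 * timeout_ms for pid to exit.  Returns the exit code, 128+signal if
 * the child died from a signal, -1 if the child had to be killed
 * because it did not exit in time, or -2 on a waitpid() error.
 */
static int wait_with_timeout(pid_t pid, int timeout_ms)
{
	const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10ms */
	int waited_ms = 0;
	int status;

	assert(pid > 0 && timeout_ms >= 0);
	for (;;) {
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			break; /* child exited on its own */
		if (r < 0 && errno != EINTR)
			return -2;
		if (waited_ms >= timeout_ms) {
			kill(pid, SIGKILL);
			/* Reap the child so that no zombie is left behind. */
			while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
				;
			return -1;
		}
		nanosleep(&tick, NULL);
		waited_ms += 10;
	}
	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	if (WIFSIGNALED(status))
		return 128 + WTERMSIG(status);
	return -2;
}

/* Hypothetical stand-in for a helper: sleep, then exit with "code". */
static pid_t spawn_sleeper(unsigned int seconds, int code)
{
	pid_t pid = fork();

	if (pid == 0) {
		sleep(seconds);
		_exit(code);
	}
	return pid; /* -1 if fork() failed */
}
```

A caller would treat a return value of -1 as the cue to warn once,
disable that note source, and carry on without it.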

Thanks,
Siddh

^ permalink raw reply

* Re: [PATCH v11] checkout: extend --track with a "fetch" mode to refresh start-point
From: Phillip Wood @ 2026-05-21  9:49 UTC (permalink / raw)
  To: Junio C Hamano, Harald Nordgren via GitGitGadget
  Cc: git, Ramsay Jones, D. Ben Knoble, Kristoffer Haugsbakk,
	Marc Branchaud, Harald Nordgren
In-Reply-To: <xmqq1pf77kml.fsf@gitster.g>

On 19/05/2026 11:34, Junio C Hamano wrote:
> "Harald Nordgren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>      checkout: --track=fetch
>>      
>>       * Find the right remote by checking which remote's fetch refspec maps
>>         to the user's start-point, instead of assuming the start-point begins
>>         with the remote's name. This fixes cases where the user has a custom
>>         refspec mapping into a namespace whose name differs from the remote
>>         (e.g. fetching from origin into refs/remotes/upstream/*).
> 
> This comment is even before looking at the patch text.  After
> getting one issue pointed out, I'd expect you to think about related
> issues before sending a new round out.
> 
> One.  Have you considered the case where the remote-tracking refs
> are overlapping, e.g., where "origin" and "upstream" point at
> different URLs but they both store in "refs/remotes/upstream/*"?
> Perhaps their URLs may textually be different but are pointing
> logically at the same place (e.g., one ssh:// the other https:// for
> example).
> 
> What should happen?  What does happen after you apply this patch?

It would be worth looking at what "git checkout --track" does in that 
case and seeing if we can share the code.

Thanks

Phillip

> 
>>       * For a bare namespace name, follow <namespace>/HEAD first to figure
>>         out which branch to fetch.
> 
> What should happen if HEAD does not exist?  What does happen after
> you apply this patch?
> 
> Thanks.
> 


^ permalink raw reply

* Re: [PATCH v9 3/5] branch: add --prune-merged <remote>
From: Phillip Wood @ 2026-05-21  9:46 UTC (permalink / raw)
  To: Harald Nordgren via GitGitGadget, git
  Cc: Kristoffer Haugsbakk, Johannes Sixt, Harald Nordgren
In-Reply-To: <6501a3d5-a5ec-421b-8526-ee7d4ae5ea98@gmail.com>

Hi Harald

A couple more thoughts ...

On 18/05/2026 16:27, Phillip Wood wrote:
> On 13/05/2026 20:34, Harald Nordgren via GitGitGadget wrote:
>> From: Harald Nordgren <haraldnordgren@gmail.com>
>>
>> Delete the local branches that --forked <remote> would list, but
>> only those whose tip is reachable from their configured upstream
>> remote-tracking branch (branch.<name>.merge): the work has already
>> landed on the upstream it tracks, so the local copy is no longer
>> needed.

While we want to clean up topic branches, we want to avoid cleaning up 
branches like "master" which follow an upstream branch and therefore 
look like they've been merged straight after they've been pulled. So I 
think as well as checking that the local branch is merged into its 
upstream branch, we want to check that the local branch is not pushed to 
the upstream branch i.e. that branch@{upstream} != branch@{push}. That 
should also avoid deleting newly created topic branches that match their 
upstream (I think that's probably less likely to happen in practice as 
I'd expect the branch to be checked out and therefore protected against 
deletion).

Also as this is a destructive operation (there is no way to restore a 
deleted branch and its reflog) it would be good to have a --dry-run option.
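
As a rough sketch of the rule suggested above, the decision could look
something like this. The struct and function names are hypothetical
and are not taken from the series; treating an unset push destination
as "differs from upstream" is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of what is known about one local branch. */
struct branch_state {
	const char *upstream_ref; /* branch@{upstream}, NULL if unset */
	const char *push_ref;     /* branch@{push}, NULL if unset */
	bool merged_into_upstream;
	bool checked_out;
};

/*
 * Return true only for branches that look like finished topics:
 * merged into their upstream, not checked out, and not pushed back
 * to the very branch they track (which is what "master" does).
 */
static bool should_prune(const struct branch_state *b)
{
	if (!b->upstream_ref || !b->merged_into_upstream || b->checked_out)
		return false;
	/* Assumption: an unset push_ref cannot equal the upstream. */
	if (b->push_ref && !strcmp(b->upstream_ref, b->push_ref))
		return false;
	return true;
}
```

A --dry-run mode would then only report the branches for which
should_prune() returns true instead of deleting them.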

Thanks

Phillip


^ permalink raw reply

* Re: [PATCH v3] generate-configlist: collapse depfile for older Ninja
From: Patrick Steinhardt @ 2026-05-21  9:31 UTC (permalink / raw)
  To: Toon Claes; +Cc: git, D. Ben Knoble
In-Reply-To: <20260515-toon-fix-almalinux8-v3-1-b545a0647f0f@iotcl.com>

On Fri, May 15, 2026 at 10:42:26AM +0200, Toon Claes wrote:
> diff --git a/tools/generate-configlist.sh b/tools/generate-configlist.sh
> index e28054f9e0..d1d2ba4bb7 100755
> --- a/tools/generate-configlist.sh
> +++ b/tools/generate-configlist.sh
> @@ -42,9 +42,12 @@ if test -n "$DEPFILE"
>  then
>  	QUOTED_OUTPUT="$(printf '%s\n' "$OUTPUT" | sed 's,[&/\],\\&,g')"
>  	{
> +		printf '%s' "$QUOTED_OUTPUT: "
>  		printf '%s\n' "$SOURCE_DIR"/Documentation/*config.adoc \
>  			"$SOURCE_DIR"/Documentation/config/*.adoc |
> -			sed -e 's/[# ]/\\&/g' -e "s/^/$QUOTED_OUTPUT: /"
> +			sed -e 's/[# ]/\\&/g' |

The `-e` switch is now arguably not necessary anymore, but that's not a
huge concern.

> +			tr '\n' ' '
> +		printf '\n'
>  		printf '%s:\n' "$SOURCE_DIR"/Documentation/*config.adoc \
>  			"$SOURCE_DIR"/Documentation/config/*.adoc |
>  			sed -e 's/[# ]/\\&/g'

The extra printf could've been rolled into the second printf call via
`printf '\n%s:\n'`, but that's not a huge concern, either.

Other than that this looks good to me, thanks!

Patrick

^ permalink raw reply

* [PATCH 2/2] gitlab-ci: update macOS image
From: Patrick Steinhardt @ 2026-05-21  8:59 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-gitlab-ci-updates-v1-0-53bb46ed33e0@pks.im>

The GitLab CI jobs for macOS are all using the macOS 15 images. While
these images are not deprecated yet, there is a new image for macOS 26
generally available by now [1].

Switch two of our jobs to use the new image. The third job keeps using
the old image so that we retain broader test coverage until that image
gets deprecated.

[1]: https://docs.gitlab.com/ci/runners/hosted_runners/macos/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 1c6777acf3..e0b9a0d82b 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -104,10 +104,10 @@ test:osx:
         image: macos-15-xcode-16
         CC: clang
       - jobname: osx-reftable
-        image: macos-15-xcode-16
+        image: macos-26-xcode-26
         CC: clang
       - jobname: osx-meson
-        image: macos-15-xcode-16
+        image: macos-26-xcode-26
         CC: clang
   artifacts:
     paths:

-- 
2.54.0.926.g75ba10bac6.dirty


^ permalink raw reply related

* [PATCH 1/2] gitlab-ci: upgrade macOS runners
From: Patrick Steinhardt @ 2026-05-21  8:59 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-gitlab-ci-updates-v1-0-53bb46ed33e0@pks.im>

We're currently using M1-based runners for our macOS jobs. GitLab has
since introduced a new M2 Pro-based runner type that is available for
all GitLab tiers [1], which upgrades from 4 to 6 cores and from 8 to 16
GB RAM.

Upgrade to this new runner type, which results in some nice speedups:

  - osx-clang goes from 26 minutes to 16 minutes.

  - osx-meson goes from 19 minutes to 13 minutes.

  - osx-reftable goes from 23 minutes to 14 minutes.

[1]: https://docs.gitlab.com/ci/runners/hosted_runners/macos/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 83ec786c5a..1c6777acf3 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -79,7 +79,7 @@ test:osx:
   stage: test
   needs: [ ]
   tags:
-    - saas-macos-medium-m1
+    - saas-macos-large-m2pro
   variables:
     TEST_OUTPUT_DIRECTORY: "/Volumes/RAMDisk"
   before_script:

-- 
2.54.0.926.g75ba10bac6.dirty


^ permalink raw reply related

* [PATCH 0/2] gitlab-ci: some smallish updates for macOS jobs
From: Patrick Steinhardt @ 2026-05-21  8:59 UTC (permalink / raw)
  To: git

Hi,

this patch series does some smallish updates for GitLab's CI jobs that
exercise macOS. A test run of this can be found at [1].

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/576

---
Patrick Steinhardt (2):
      gitlab-ci: upgrade macOS runners
      gitlab-ci: update macOS image

 .gitlab-ci.yml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


---
base-commit: aec3f587505a472db67e9462d0702e7d463a449d
change-id: 20260521-b4-pks-gitlab-ci-updates-3b4a959f8e0c


^ permalink raw reply

* Re: [PATCH 4/9] run-command: add support for timeout in command finisher
From: Oswald Buddenhagen @ 2026-05-21  8:39 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Siddh Raman Pant, Calvin Wan, Patrick Steinhardt, Elijah Newren,
	Kristoffer Haugsbakk, Junio C Hamano, git
In-Reply-To: <b69605a6-e841-47b9-a899-a57e184d3c8b@kdbg.org>

On Thu, May 21, 2026 at 09:21:51AM +0200, Johannes Sixt wrote:
>Please, do not add this infrastructure to core Git, and instead fix the
>communication protocol.
>
there is nothing to fix here. proper error handling including timeout 
handling should just be part of every protocol handler, and in the case 
of child processes, forcible termination is part of that.
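
for illustration, a bounded read on the protocol pipe might look
roughly like the sketch below. read_with_timeout() is a hypothetical
name, not an existing git function, and the return-value convention
is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from fd, but give up if
 * no data arrives within timeout_ms.  Returns the number of bytes
 * read, 0 on EOF, -1 on error, or -2 on timeout.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len,
				 int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int r;

	do {
		r = poll(&pfd, 1, timeout_ms);
	} while (r < 0 && errno == EINTR);

	if (r < 0)
		return -1;
	if (r == 0)
		return -2; /* the peer made no progress in time */
	/* Data, EOF or an error is pending, so read() will not block. */
	return read(fd, buf, len);
}
```

on a -2 return the caller can report the stalled helper and then
terminate and reap it, rather than blocking indefinitely.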

one can ignore the issue, in which case termination is left to the user 
by ctrl-c'ing the whole process group. this isn't very user-friendly, 
because it doesn't report the problem, and it may leave hung processes 
behind. it is also extremely bad if keeping the parent process alive is 
a lot more important than the child process, but this doesn't appear to 
apply to the particular use case.

adding a proxy doesn't fix the problem, it just adds another point of 
failure.

^ permalink raw reply

* [PATCH 18/18] odb/source-loose: drop pointer to the "files" source
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Now that all callbacks of the loose source operate on `struct
odb_source_loose` directly we no longer have to reach into the "files"
source at all.

Drop this field and update `odb_source_loose_new()` to instead accept
all parameters required to initialize itself. This ensures that the
"loose" backend is a fully standalone source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-files.c | 2 +-
 odb/source-loose.c | 8 ++++----
 odb/source-loose.h | 7 ++++---
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/odb/source-files.c b/odb/source-files.c
index 83f8066c67..5bdd042922 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -268,7 +268,7 @@ struct odb_source_files *odb_source_files_new(struct object_database *odb,
 
 	CALLOC_ARRAY(files, 1);
 	odb_source_init(&files->base, odb, ODB_SOURCE_FILES, path, local);
-	files->loose = odb_source_loose_new(files);
+	files->loose = odb_source_loose_new(odb, path, local);
 	files->packed = packfile_store_new(&files->base);
 
 	files->base.free = odb_source_files_free;
diff --git a/odb/source-loose.c b/odb/source-loose.c
index e174941318..7d7ea2fb84 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -705,14 +705,14 @@ static void odb_source_loose_free(struct odb_source *source)
 	free(loose);
 }
 
-struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
+struct odb_source_loose *odb_source_loose_new(struct object_database *odb,
+					      const char *path,
+					      bool local)
 {
 	struct odb_source_loose *loose;
 
 	CALLOC_ARRAY(loose, 1);
-	odb_source_init(&loose->base, files->base.odb, ODB_SOURCE_LOOSE,
-			files->base.path, files->base.local);
-	loose->files = files;
+	odb_source_init(&loose->base, odb, ODB_SOURCE_LOOSE, path, local);
 
 	loose->base.free = odb_source_loose_free;
 	loose->base.close = odb_source_loose_close;
diff --git a/odb/source-loose.h b/odb/source-loose.h
index 825e703072..fb75e3bbff 100644
--- a/odb/source-loose.h
+++ b/odb/source-loose.h
@@ -9,11 +9,10 @@ struct oidtree;
 
 /*
  * An object database source that stores its objects in loose format, one
- * file per object. This source is part of the files source.
+ * file per object.
  */
 struct odb_source_loose {
 	struct odb_source base;
-	struct odb_source_files *files;
 
 	/*
 	 * Used to store the results of readdir(3) calls when we are OK
@@ -31,7 +30,9 @@ struct odb_source_loose {
 	struct loose_object_map *map;
 };
 
-struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files);
+struct odb_source_loose *odb_source_loose_new(struct object_database *odb,
+					      const char *path,
+					      bool local);
 
 /*
  * Cast the given object database source to the loose backend. This will cause

-- 
2.54.0.926.g75ba10bac6.dirty


^ permalink raw reply related

* [PATCH 17/18] odb/source-loose: stub out remaining callbacks
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Stub out remaining callback functions for the "loose" backend.

Note that we also stub out transactions for loose objects. In fact, we
already have the infrastructure in place for those, and we could in
theory implement those, as well. But there are separate efforts ongoing
to polish up transactional interfaces, and doing so now would likely
result in some messiness. This omission will thus be worked on in a
subsequent patch series, once the dust has settled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 odb/source-loose.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/odb/source-loose.c b/odb/source-loose.c
index e52fc289a2..e174941318 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -645,6 +645,25 @@ static int odb_source_loose_write_object_stream(struct odb_source *source,
 	return odb_source_loose_write_stream(loose, in_stream, len, oid);
 }
 
+static int odb_source_loose_begin_transaction(struct odb_source *source UNUSED,
+					      struct odb_transaction **out UNUSED)
+{
+	/* TODO: this is a known omission that we'll want to address eventually. */
+	return error("loose source does not support transactions");
+}
+
+static int odb_source_loose_read_alternates(struct odb_source *source UNUSED,
+					    struct strvec *out UNUSED)
+{
+	return 0;
+}
+
+static int odb_source_loose_write_alternate(struct odb_source *source UNUSED,
+					    const char *alternate UNUSED)
+{
+	return error("loose source does not support alternates");
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -706,6 +725,9 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.freshen_object = odb_source_loose_freshen_object;
 	loose->base.write_object = odb_source_loose_write_object;
 	loose->base.write_object_stream = odb_source_loose_write_object_stream;
+	loose->base.begin_transaction = odb_source_loose_begin_transaction;
+	loose->base.read_alternates = odb_source_loose_read_alternates;
+	loose->base.write_alternate = odb_source_loose_write_alternate;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 16/18] odb/source-loose: wire up `write_object_stream()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Wire up the `write_object_stream()` callback.

Note that we don't move the implementation into "odb/source-loose.c".
This is because most of the logic to write loose objects is still
contained in "object-file.c", and untangling that requires some
refactoring, as explained in the preceding commit. So for now, the
implementation of writing an object stream is still located in
"object-file.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
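Reader's note, not part of the commit: the "downcast, then forward"
shape of the new callback can be sketched as below. This is a toy
model under stated assumptions, not Git's code. All `demo_*` names
are hypothetical, the stream is reduced to a byte buffer, and the type
check in the downcast helper is an assumption about how such helpers
typically guard the cast.

```c
#include <assert.h>
#include <stddef.h>

enum demo_source_type { DEMO_SOURCE_FILES, DEMO_SOURCE_LOOSE };

/* Generic source: a type tag plus the callback a backend fills in. */
struct demo_source {
	enum demo_source_type type;
	int (*write_object_stream)(struct demo_source *source,
				   const char *data, size_t len);
};

/* Backend struct; `base` must come first so the cast below is valid. */
struct demo_source_loose {
	struct demo_source base;
	size_t bytes_written;
};

static struct demo_source_loose *demo_loose_downcast(struct demo_source *source)
{
	assert(source->type == DEMO_SOURCE_LOOSE);
	return (struct demo_source_loose *)source;
}

/* Stand-in for the implementation that still lives elsewhere. */
static int demo_loose_write_stream(struct demo_source_loose *loose,
				   const char *data, size_t len)
{
	(void)data;
	loose->bytes_written += len;
	return 0;
}

/* The callback itself: downcast the generic source, then forward. */
static int demo_loose_write_object_stream(struct demo_source *source,
					  const char *data, size_t len)
{
	struct demo_source_loose *loose = demo_loose_downcast(source);
	return demo_loose_write_stream(loose, data, len);
}

/* Generic dispatcher: callers only ever see `struct demo_source`. */
int demo_source_write_object_stream(struct demo_source *source,
				    const char *data, size_t len)
{
	return source->write_object_stream(source, data, len);
}

void demo_loose_init(struct demo_source_loose *loose)
{
	loose->base.type = DEMO_SOURCE_LOOSE;
	loose->base.write_object_stream = demo_loose_write_object_stream;
	loose->bytes_written = 0;
}
```

With the callback wired up, a wrapping backend can call the generic
dispatcher on `&loose->base` instead of reaching for the
backend-specific function.
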
 object-file.h      | 12 +++++++++++-
 odb/source-files.c |  3 ++-
 odb/source-loose.c | 14 ++++++++++++++
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/object-file.h b/object-file.h
index d30f1b10b2..b864351372 100644
--- a/object-file.h
+++ b/object-file.h
@@ -23,7 +23,17 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
 struct object_info;
 struct odb_source;
 
-int odb_source_loose_write_stream(struct odb_source_loose *loose,
+/*
+ * Write the given stream into the loose object source. The only difference to
+ * the generic implementation of this function is that we don't perform an
+ * object existence check here.
+ *
+ * TODO: We should stop exposing this function altogether and move it into
+ * "odb/source-loose.c". This requires a couple of refactorings though to make
+ * `force_object_loose()` generic and is thus postponed to a later point in
+ * time.
+ */
+int odb_source_loose_write_stream(struct odb_source_loose *source,
 				  struct odb_write_stream *stream, size_t len,
 				  struct object_id *oid);
 
diff --git a/odb/source-files.c b/odb/source-files.c
index 2ba1def776..83f8066c67 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -7,6 +7,7 @@
 #include "odb.h"
 #include "odb/source.h"
 #include "odb/source-files.h"
+#include "odb/source-loose.h"
 #include "packfile.h"
 #include "strbuf.h"
 #include "write-or-die.h"
@@ -175,7 +176,7 @@ static int odb_source_files_write_object_stream(struct odb_source *source,
 						struct object_id *oid)
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
-	return odb_source_loose_write_stream(files->loose, stream, len, oid);
+	return odb_source_write_object_stream(&files->loose->base, stream, len, oid);
 }
 
 static int odb_source_files_begin_transaction(struct odb_source *source,
diff --git a/odb/source-loose.c b/odb/source-loose.c
index da8a60dba1..e52fc289a2 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -632,6 +632,19 @@ static int odb_source_loose_write_object(struct odb_source *source,
 	return 0;
 }
 
+static int odb_source_loose_write_object_stream(struct odb_source *source,
+						struct odb_write_stream *in_stream,
+						size_t len,
+						struct object_id *oid)
+{
+	/*
+	 * TODO: the implementation should be moved here, see the comment on
+	 * the called function in "object-file.h".
+	 */
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	return odb_source_loose_write_stream(loose, in_stream, len, oid);
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -692,6 +705,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.count_objects = odb_source_loose_count_objects;
 	loose->base.freshen_object = odb_source_loose_freshen_object;
 	loose->base.write_object = odb_source_loose_write_object;
+	loose->base.write_object_stream = odb_source_loose_write_object_stream;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 15/18] object-file: refactor writing objects to use loose source
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

The "object-file" subsystem still hosts the majority of logic used to
write loose objects. Eventually, we'll want to move this logic into
"odb/source-loose.c", but this isn't yet easily possible because a lot
of the writing logic is still being shared with `force_object_loose()`.

We will eventually untangle this logic so that all of it can move into
the "loose" source. In the meantime, refactor the code to operate on a
`struct odb_source_loose` directly, which already makes the dependency
explicit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
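Reader's note, not part of the commit: a small sketch of a path helper
that takes the loose source directly and reads the directory through
its embedded base, in the spirit of the refactored `odb_loose_path()`.
This is an illustration under stated assumptions, not Git's code. The
`demo_*` names are hypothetical, the object ID is passed as a hex
string, and a fixed buffer replaces `struct strbuf`. The layout
itself, two hex digits as the fan-out directory followed by the
remaining digits as the file name, matches how loose objects are
stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins: a generic source and the loose backend. */
struct demo_source {
	const char *path;	/* object directory, e.g. ".git/objects" */
};

struct demo_source_loose {
	struct demo_source base;
};

/*
 * Write "<path>/<first two hex digits>/<remaining hex digits>" into
 * `buf`. Returns 0 on success, or -1 if `hex` is too short or the
 * result does not fit into `buflen` bytes.
 */
int demo_loose_path(const struct demo_source_loose *loose, const char *hex,
		    char *buf, size_t buflen)
{
	int n;

	assert(loose != NULL && loose->base.path != NULL);
	if (strlen(hex) < 3)
		return -1;

	n = snprintf(buf, buflen, "%s/%.2s/%s", loose->base.path, hex, hex + 2);
	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return 0;
}
```

Because the parameter type is the loose backend itself, a caller that
only holds a generic source has to downcast explicitly, which is the
dependency the commit message wants to make visible.
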
 http-walker.c      |  3 ++-
 http.c             |  6 +++--
 object-file.c      | 75 +++++++++++++++++++++++++++---------------------------
 object-file.h      |  6 ++---
 odb/source-files.c |  3 ++-
 odb/source-loose.c |  9 ++++---
 6 files changed, 53 insertions(+), 49 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index 1b6d496548..435a726540 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -539,8 +539,9 @@ static int fetch_object(struct walker *walker, const struct object_id *oid)
 	} else if (!oideq(&obj_req->oid, &req->real_oid)) {
 		ret = error("File %s has bad hash", hex);
 	} else if (req->rename < 0) {
+		struct odb_source_files *files = odb_source_files_downcast(the_repository->objects->sources);
 		struct strbuf buf = STRBUF_INIT;
-		odb_loose_path(the_repository->objects->sources, &buf, &req->oid);
+		odb_loose_path(files->loose, &buf, &req->oid);
 		ret = error("unable to write sha1 filename %s", buf.buf);
 		strbuf_release(&buf);
 	}
diff --git a/http.c b/http.c
index ea9b16861b..3fcc012233 100644
--- a/http.c
+++ b/http.c
@@ -2826,6 +2826,7 @@ static size_t fwrite_sha1_file(char *ptr, size_t eltsize, size_t nmemb,
 struct http_object_request *new_http_object_request(const char *base_url,
 						    const struct object_id *oid)
 {
+	struct odb_source_files *files = odb_source_files_downcast(the_repository->objects->sources);
 	char *hex = oid_to_hex(oid);
 	struct strbuf filename = STRBUF_INIT;
 	struct strbuf prevfile = STRBUF_INIT;
@@ -2840,7 +2841,7 @@ struct http_object_request *new_http_object_request(const char *base_url,
 	oidcpy(&freq->oid, oid);
 	freq->localfile = -1;
 
-	odb_loose_path(the_repository->objects->sources, &filename, oid);
+	odb_loose_path(files->loose, &filename, oid);
 	strbuf_addf(&freq->tmpfile, "%s.temp", filename.buf);
 
 	strbuf_addf(&prevfile, "%s.prev", filename.buf);
@@ -2966,6 +2967,7 @@ void process_http_object_request(struct http_object_request *freq)
 
 int finish_http_object_request(struct http_object_request *freq)
 {
+	struct odb_source_files *files = odb_source_files_downcast(the_repository->objects->sources);
 	struct stat st;
 	struct strbuf filename = STRBUF_INIT;
 
@@ -2992,7 +2994,7 @@ int finish_http_object_request(struct http_object_request *freq)
 		unlink_or_warn(freq->tmpfile.buf);
 		return -1;
 	}
-	odb_loose_path(the_repository->objects->sources, &filename, &freq->oid);
+	odb_loose_path(files->loose, &filename, &freq->oid);
 	freq->rename = finalize_object_file(the_repository, freq->tmpfile.buf, filename.buf);
 	strbuf_release(&filename);
 
diff --git a/object-file.c b/object-file.c
index 7bb5b31bca..bce941874e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -54,14 +54,14 @@ static void fill_loose_path(struct strbuf *buf,
 	}
 }
 
-const char *odb_loose_path(struct odb_source *source,
+const char *odb_loose_path(struct odb_source_loose *loose,
 			   struct strbuf *buf,
 			   const struct object_id *oid)
 {
 	strbuf_reset(buf);
-	strbuf_addstr(buf, source->path);
+	strbuf_addstr(buf, loose->base.path);
 	strbuf_addch(buf, '/');
-	fill_loose_path(buf, oid, source->odb->repo->hash_algo);
+	fill_loose_path(buf, oid, loose->base.odb->repo->hash_algo);
 	return buf->buf;
 }
 
@@ -575,14 +575,14 @@ static void flush_loose_object_transaction(struct odb_transaction_files *transac
 }
 
 /* Finalize a file on disk, and close it. */
-static void close_loose_object(struct odb_source *source,
+static void close_loose_object(struct odb_source_loose *loose,
 			       int fd, const char *filename)
 {
-	if (source->will_destroy)
+	if (loose->base.will_destroy)
 		goto out;
 
 	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		fsync_loose_object_transaction(source->odb->transaction, fd, filename);
+		fsync_loose_object_transaction(loose->base.odb->transaction, fd, filename);
 	else if (fsync_object_files > 0)
 		fsync_or_die(fd, filename);
 	else
@@ -651,7 +651,7 @@ static int create_tmpfile(struct repository *repo,
  * Returns a "fd", which should later be provided to
  * end_loose_object_common().
  */
-static int start_loose_object_common(struct odb_source *source,
+static int start_loose_object_common(struct odb_source_loose *loose,
 				     struct strbuf *tmp_file,
 				     const char *filename, unsigned flags,
 				     git_zstream *stream,
@@ -659,18 +659,18 @@ static int start_loose_object_common(struct odb_source *source,
 				     struct git_hash_ctx *c, struct git_hash_ctx *compat_c,
 				     char *hdr, int hdrlen)
 {
-	const struct git_hash_algo *algo = source->odb->repo->hash_algo;
-	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
+	const struct git_hash_algo *algo = loose->base.odb->repo->hash_algo;
+	const struct git_hash_algo *compat = loose->base.odb->repo->compat_hash_algo;
 	int fd;
 
-	fd = create_tmpfile(source->odb->repo, tmp_file, filename);
+	fd = create_tmpfile(loose->base.odb->repo, tmp_file, filename);
 	if (fd < 0) {
 		if (flags & ODB_WRITE_OBJECT_SILENT)
 			return -1;
 		else if (errno == EACCES)
 			return error(_("insufficient permission for adding "
 				       "an object to repository database %s"),
-				     source->path);
+				     loose->base.path);
 		else
 			return error_errno(
 				_("unable to create temporary file"));
@@ -700,14 +700,14 @@ static int start_loose_object_common(struct odb_source *source,
  * Common steps for the inner git_deflate() loop for writing loose
  * objects. Returns what git_deflate() returns.
  */
-static int write_loose_object_common(struct odb_source *source,
+static int write_loose_object_common(struct odb_source_loose *loose,
 				     struct git_hash_ctx *c, struct git_hash_ctx *compat_c,
 				     git_zstream *stream, const int flush,
 				     unsigned char *in0, const int fd,
 				     unsigned char *compressed,
 				     const size_t compressed_len)
 {
-	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
+	const struct git_hash_algo *compat = loose->base.odb->repo->compat_hash_algo;
 	int ret;
 
 	ret = git_deflate(stream, flush ? Z_FINISH : 0);
@@ -728,12 +728,12 @@ static int write_loose_object_common(struct odb_source *source,
  * - End the compression of zlib stream.
  * - Get the calculated oid to "oid".
  */
-static int end_loose_object_common(struct odb_source *source,
+static int end_loose_object_common(struct odb_source_loose *loose,
 				   struct git_hash_ctx *c, struct git_hash_ctx *compat_c,
 				   git_zstream *stream, struct object_id *oid,
 				   struct object_id *compat_oid)
 {
-	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
+	const struct git_hash_algo *compat = loose->base.odb->repo->compat_hash_algo;
 	int ret;
 
 	ret = git_deflate_end_gently(stream);
@@ -746,7 +746,7 @@ static int end_loose_object_common(struct odb_source *source,
 	return Z_OK;
 }
 
-int write_loose_object(struct odb_source *source,
+int write_loose_object(struct odb_source_loose *loose,
 		       const struct object_id *oid, char *hdr,
 		       int hdrlen, const void *buf, unsigned long len,
 		       time_t mtime, unsigned flags)
@@ -760,11 +760,11 @@ int write_loose_object(struct odb_source *source,
 	static struct strbuf filename = STRBUF_INIT;
 
 	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		prepare_loose_object_transaction(source->odb->transaction);
+		prepare_loose_object_transaction(loose->base.odb->transaction);
 
-	odb_loose_path(source, &filename, oid);
+	odb_loose_path(loose, &filename, oid);
 
-	fd = start_loose_object_common(source, &tmp_file, filename.buf, flags,
+	fd = start_loose_object_common(loose, &tmp_file, filename.buf, flags,
 				       &stream, compressed, sizeof(compressed),
 				       &c, NULL, hdr, hdrlen);
 	if (fd < 0)
@@ -776,14 +776,14 @@ int write_loose_object(struct odb_source *source,
 	do {
 		unsigned char *in0 = stream.next_in;
 
-		ret = write_loose_object_common(source, &c, NULL, &stream, 1, in0, fd,
+		ret = write_loose_object_common(loose, &c, NULL, &stream, 1, in0, fd,
 						compressed, sizeof(compressed));
 	} while (ret == Z_OK);
 
 	if (ret != Z_STREAM_END)
 		die(_("unable to deflate new object %s (%d)"), oid_to_hex(oid),
 		    ret);
-	ret = end_loose_object_common(source, &c, NULL, &stream, &parano_oid, NULL);
+	ret = end_loose_object_common(loose, &c, NULL, &stream, &parano_oid, NULL);
 	if (ret != Z_OK)
 		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
 		    ret);
@@ -791,7 +791,7 @@ int write_loose_object(struct odb_source *source,
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(source, fd, tmp_file.buf);
+	close_loose_object(loose, fd, tmp_file.buf);
 
 	if (mtime) {
 		struct utimbuf utb;
@@ -802,16 +802,15 @@ int write_loose_object(struct odb_source *source,
 			warning_errno(_("failed utime() on %s"), tmp_file.buf);
 	}
 
-	return finalize_object_file_flags(source->odb->repo, tmp_file.buf, filename.buf,
+	return finalize_object_file_flags(loose->base.odb->repo, tmp_file.buf, filename.buf,
 					  FOF_SKIP_COLLISION_CHECK);
 }
 
-int odb_source_loose_write_stream(struct odb_source *source,
+int odb_source_loose_write_stream(struct odb_source_loose *loose,
 				  struct odb_write_stream *in_stream, size_t len,
 				  struct object_id *oid)
 {
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
+	const struct git_hash_algo *compat = loose->base.odb->repo->compat_hash_algo;
 	struct object_id compat_oid;
 	int fd, ret, err = 0, flush = 0;
 	unsigned char compressed[4096];
@@ -825,10 +824,10 @@ int odb_source_loose_write_stream(struct odb_source *source,
 	int hdrlen;
 
 	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		prepare_loose_object_transaction(source->odb->transaction);
+		prepare_loose_object_transaction(loose->base.odb->transaction);
 
 	/* Since oid is not determined, save tmp file to odb path. */
-	strbuf_addf(&filename, "%s/", source->path);
+	strbuf_addf(&filename, "%s/", loose->base.path);
 	hdrlen = format_object_header(hdr, sizeof(hdr), OBJ_BLOB, len);
 
 	/*
@@ -839,7 +838,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
 	 *  - Setup zlib stream for compression.
 	 *  - Start to feed header to zlib stream.
 	 */
-	fd = start_loose_object_common(source, &tmp_file, filename.buf, 0,
+	fd = start_loose_object_common(loose, &tmp_file, filename.buf, 0,
 				       &stream, compressed, sizeof(compressed),
 				       &c, &compat_c, hdr, hdrlen);
 	if (fd < 0) {
@@ -867,7 +866,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
 			if (in_stream->is_finished)
 				flush = 1;
 		}
-		ret = write_loose_object_common(source, &c, &compat_c, &stream, flush, in0, fd,
+		ret = write_loose_object_common(loose, &c, &compat_c, &stream, flush, in0, fd,
 						compressed, sizeof(compressed));
 		/*
 		 * Unlike write_loose_object(), we do not have the entire
@@ -890,16 +889,16 @@ int odb_source_loose_write_stream(struct odb_source *source,
 	 */
 	if (ret != Z_STREAM_END)
 		die(_("unable to stream deflate new object (%d)"), ret);
-	ret = end_loose_object_common(source, &c, &compat_c, &stream, oid, &compat_oid);
+	ret = end_loose_object_common(loose, &c, &compat_c, &stream, oid, &compat_oid);
 	if (ret != Z_OK)
 		die(_("deflateEnd on stream object failed (%d)"), ret);
-	close_loose_object(source, fd, tmp_file.buf);
+	close_loose_object(loose, fd, tmp_file.buf);
 
-	if (odb_freshen_object(source->odb, oid)) {
+	if (odb_freshen_object(loose->base.odb, oid)) {
 		unlink_or_warn(tmp_file.buf);
 		goto cleanup;
 	}
-	odb_loose_path(source, &filename, oid);
+	odb_loose_path(loose, &filename, oid);
 
 	/* We finally know the object path, and create the missing dir. */
 	dirlen = directory_size(filename.buf);
@@ -907,7 +906,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
 		struct strbuf dir = STRBUF_INIT;
 		strbuf_add(&dir, filename.buf, dirlen);
 
-		if (safe_create_dir_in_gitdir(source->odb->repo, dir.buf) &&
+		if (safe_create_dir_in_gitdir(loose->base.odb->repo, dir.buf) &&
 		    errno != EEXIST) {
 			err = error_errno(_("unable to create directory %s"), dir.buf);
 			strbuf_release(&dir);
@@ -916,10 +915,10 @@ int odb_source_loose_write_stream(struct odb_source *source,
 		strbuf_release(&dir);
 	}
 
-	err = finalize_object_file_flags(source->odb->repo, tmp_file.buf, filename.buf,
+	err = finalize_object_file_flags(loose->base.odb->repo, tmp_file.buf, filename.buf,
 					 FOF_SKIP_COLLISION_CHECK);
 	if (!err && compat)
-		err = repo_add_loose_object_map(files->loose, oid, &compat_oid);
+		err = repo_add_loose_object_map(loose, oid, &compat_oid);
 cleanup:
 	strbuf_release(&tmp_file);
 	strbuf_release(&filename);
@@ -957,7 +956,7 @@ int force_object_loose(struct odb_source *source,
 				     oid_to_hex(oid), compat->name);
 	}
 	hdrlen = format_object_header(hdr, sizeof(hdr), type, len);
-	ret = write_loose_object(source, oid, hdr, hdrlen, buf, len, mtime, 0);
+	ret = write_loose_object(files->loose, oid, hdr, hdrlen, buf, len, mtime, 0);
 	if (!ret && compat)
 		ret = repo_add_loose_object_map(files->loose, oid, &compat_oid);
 	free(buf);
diff --git a/object-file.h b/object-file.h
index 2b32592de1..d30f1b10b2 100644
--- a/object-file.h
+++ b/object-file.h
@@ -23,7 +23,7 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
 struct object_info;
 struct odb_source;
 
-int odb_source_loose_write_stream(struct odb_source *source,
+int odb_source_loose_write_stream(struct odb_source_loose *loose,
 				  struct odb_write_stream *stream, size_t len,
 				  struct object_id *oid);
 
@@ -31,7 +31,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
  * Put in `buf` the name of the file in the local object database that
  * would be used to store a loose object with the specified oid.
  */
-const char *odb_loose_path(struct odb_source *source,
+const char *odb_loose_path(struct odb_source_loose *source,
 			   struct strbuf *buf,
 			   const struct object_id *oid);
 
@@ -127,7 +127,7 @@ void write_object_file_prepare(const struct git_hash_algo *algo,
 			       const void *buf, unsigned long len,
 			       enum object_type type, struct object_id *oid,
 			       char *hdr, int *hdrlen);
-int write_loose_object(struct odb_source *source,
+int write_loose_object(struct odb_source_loose *loose,
 		       const struct object_id *oid, char *hdr,
 		       int hdrlen, const void *buf, unsigned long len,
 		       time_t mtime, unsigned flags);
diff --git a/odb/source-files.c b/odb/source-files.c
index 52ba04237a..2ba1def776 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -174,7 +174,8 @@ static int odb_source_files_write_object_stream(struct odb_source *source,
 						size_t len,
 						struct object_id *oid)
 {
-	return odb_source_loose_write_stream(source, stream, len, oid);
+	struct odb_source_files *files = odb_source_files_downcast(source);
+	return odb_source_loose_write_stream(files->loose, stream, len, oid);
 }
 
 static int odb_source_files_begin_transaction(struct odb_source *source,
diff --git a/odb/source-loose.c b/odb/source-loose.c
index c91018109e..da8a60dba1 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -220,7 +220,7 @@ static int odb_source_loose_read_object_info(struct odb_source *source,
 	if (flags & OBJECT_INFO_SECOND_READ)
 		return -1;
 
-	odb_loose_path(source, &buf, oid);
+	odb_loose_path(loose, &buf, oid);
 	return read_object_info_from_path(loose, buf.buf, oid, oi, flags);
 }
 
@@ -238,7 +238,7 @@ static int open_loose_object(struct odb_source_loose *loose,
 	static struct strbuf buf = STRBUF_INIT;
 	int fd;
 
-	*path = odb_loose_path(&loose->base, &buf, oid);
+	*path = odb_loose_path(loose, &buf, oid);
 	fd = git_open(*path);
 	if (fd >= 0)
 		return fd;
@@ -584,8 +584,9 @@ static int odb_source_loose_count_objects(struct odb_source *source,
 static int odb_source_loose_freshen_object(struct odb_source *source,
 					   const struct object_id *oid)
 {
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
 	static struct strbuf path = STRBUF_INIT;
-	odb_loose_path(source, &path, oid);
+	odb_loose_path(loose, &path, oid);
 	return !!check_and_freshen_file(path.buf, 1);
 }
 
@@ -624,7 +625,7 @@ static int odb_source_loose_write_object(struct odb_source *source,
 	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
 	if (odb_freshen_object(source->odb, oid))
 		return 0;
-	if (write_loose_object(source, oid, hdr, hdrlen, buf, len, 0, flags))
+	if (write_loose_object(loose, oid, hdr, hdrlen, buf, len, 0, flags))
 		return -1;
 	if (compat)
 		return repo_add_loose_object_map(loose, oid, &compat_oid);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 14/18] odb/source-loose: wire up `write_object()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_write_object()` from "object-file.c" into
"odb/source-loose.c" and wire it up as the `write_object()` callback of
the loose source.

As in preceding commits, this requires us to expose a couple of generic
functions from "object-file.c" as they are used in both subsystems now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
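Reader's note, not part of the commit: the moved function first tries
to freshen an existing object and only writes when that fails. That
control flow can be sketched with a toy in-memory store as below. This
is an illustration under stated assumptions, not Git's code. The
`demo_*` names, the string IDs, and the integer timestamps are all
hypothetical.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_OBJECTS 8

/* Toy object store: IDs with a modification time, plus a write counter. */
struct demo_store {
	const char *ids[DEMO_MAX_OBJECTS];
	unsigned mtimes[DEMO_MAX_OBJECTS];
	size_t nr;
	unsigned writes;
};

/* Return 1 and bump the mtime if the object exists, 0 otherwise. */
static int demo_freshen(struct demo_store *store, const char *id, unsigned now)
{
	size_t i;

	for (i = 0; i < store->nr; i++) {
		if (!strcmp(store->ids[i], id)) {
			store->mtimes[i] = now;
			return 1;
		}
	}
	return 0;
}

/* Write an object unless it is already present. Returns 0 or -1. */
int demo_write_object(struct demo_store *store, const char *id, unsigned now)
{
	assert(store != NULL && id != NULL);

	if (demo_freshen(store, id, now))
		return 0;	/* already stored: freshened, nothing to write */
	if (store->nr == DEMO_MAX_OBJECTS)
		return -1;	/* toy store is full */

	store->ids[store->nr] = id;
	store->mtimes[store->nr] = now;
	store->nr++;
	store->writes++;
	return 0;
}
```

Writing the same ID twice succeeds both times, but only the first call
performs a write. The second call just updates the timestamp.
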
 object-file.c      | 58 ++++++++----------------------------------------------
 object-file.h      | 14 +++++++------
 odb/source-files.c |  5 +++--
 odb/source-loose.c | 44 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 63 insertions(+), 58 deletions(-)

diff --git a/object-file.c b/object-file.c
index fe24f00d1b..7bb5b31bca 100644
--- a/object-file.c
+++ b/object-file.c
@@ -326,10 +326,10 @@ static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
 	git_hash_final_oid(oid, c);
 }
 
-static void write_object_file_prepare(const struct git_hash_algo *algo,
-				      const void *buf, unsigned long len,
-				      enum object_type type, struct object_id *oid,
-				      char *hdr, int *hdrlen)
+void write_object_file_prepare(const struct git_hash_algo *algo,
+			       const void *buf, unsigned long len,
+			       enum object_type type, struct object_id *oid,
+			       char *hdr, int *hdrlen)
 {
 	struct git_hash_ctx c;
 
@@ -746,10 +746,10 @@ static int end_loose_object_common(struct odb_source *source,
 	return Z_OK;
 }
 
-static int write_loose_object(struct odb_source *source,
-			      const struct object_id *oid, char *hdr,
-			      int hdrlen, const void *buf, unsigned long len,
-			      time_t mtime, unsigned flags)
+int write_loose_object(struct odb_source *source,
+		       const struct object_id *oid, char *hdr,
+		       int hdrlen, const void *buf, unsigned long len,
+		       time_t mtime, unsigned flags)
 {
 	int fd, ret;
 	unsigned char compressed[4096];
@@ -926,48 +926,6 @@ int odb_source_loose_write_stream(struct odb_source *source,
 	return err;
 }
 
-int odb_source_loose_write_object(struct odb_source *source,
-				  const void *buf, unsigned long len,
-				  enum object_type type, struct object_id *oid,
-				  struct object_id *compat_oid_in,
-				  enum odb_write_object_flags flags)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	const struct git_hash_algo *algo = source->odb->repo->hash_algo;
-	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
-	struct object_id compat_oid;
-	char hdr[MAX_HEADER_LEN];
-	int hdrlen = sizeof(hdr);
-
-	/* Generate compat_oid */
-	if (compat) {
-		if (compat_oid_in)
-			oidcpy(&compat_oid, compat_oid_in);
-		else if (type == OBJ_BLOB)
-			hash_object_file(compat, buf, len, type, &compat_oid);
-		else {
-			struct strbuf converted = STRBUF_INIT;
-			convert_object_file(source->odb->repo, &converted, algo, compat,
-					    buf, len, type, 0);
-			hash_object_file(compat, converted.buf, converted.len,
-					 type, &compat_oid);
-			strbuf_release(&converted);
-		}
-	}
-
-	/* Normally if we have it in the pack then we do not bother writing
-	 * it out into .git/objects/??/?{38} file.
-	 */
-	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
-	if (odb_freshen_object(source->odb, oid))
-		return 0;
-	if (write_loose_object(source, oid, hdr, hdrlen, buf, len, 0, flags))
-		return -1;
-	if (compat)
-		return repo_add_loose_object_map(files->loose, oid, &compat_oid);
-	return 0;
-}
-
 int force_object_loose(struct odb_source *source,
 		       const struct object_id *oid, time_t mtime)
 {
diff --git a/object-file.h b/object-file.h
index 1d90df9d98..2b32592de1 100644
--- a/object-file.h
+++ b/object-file.h
@@ -23,12 +23,6 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
 struct object_info;
 struct odb_source;
 
-int odb_source_loose_write_object(struct odb_source *source,
-				  const void *buf, unsigned long len,
-				  enum object_type type, struct object_id *oid,
-				  struct object_id *compat_oid_in,
-				  enum odb_write_object_flags flags);
-
 int odb_source_loose_write_stream(struct odb_source *source,
 				  struct odb_write_stream *stream, size_t len,
 				  struct object_id *oid);
@@ -129,6 +123,14 @@ int finalize_object_file_flags(struct repository *repo,
 void hash_object_file(const struct git_hash_algo *algo, const void *buf,
 		      unsigned long len, enum object_type type,
 		      struct object_id *oid);
+void write_object_file_prepare(const struct git_hash_algo *algo,
+			       const void *buf, unsigned long len,
+			       enum object_type type, struct object_id *oid,
+			       char *hdr, int *hdrlen);
+int write_loose_object(struct odb_source *source,
+		       const struct object_id *oid, char *hdr,
+		       int hdrlen, const void *buf, unsigned long len,
+		       time_t mtime, unsigned flags);
 
 /* Helper to check and "touch" a file */
 int check_and_freshen_file(const char *fn, int freshen);
diff --git a/odb/source-files.c b/odb/source-files.c
index ef548e6fe6..52ba04237a 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -164,8 +164,9 @@ static int odb_source_files_write_object(struct odb_source *source,
 					 struct object_id *compat_oid,
 					 enum odb_write_object_flags flags)
 {
-	return odb_source_loose_write_object(source, buf, len, type,
-					     oid, compat_oid, flags);
+	struct odb_source_files *files = odb_source_files_downcast(source);
+	return odb_source_write_object(&files->loose->base, buf, len, type,
+				       oid, compat_oid, flags);
 }
 
 static int odb_source_files_write_object_stream(struct odb_source *source,
diff --git a/odb/source-loose.c b/odb/source-loose.c
index e519365d23..c91018109e 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -5,6 +5,7 @@
 #include "hex.h"
 #include "loose.h"
 #include "object-file.h"
+#include "object-file-convert.h"
 #include "odb.h"
 #include "odb/source-files.h"
 #include "odb/source-loose.h"
@@ -588,6 +589,48 @@ static int odb_source_loose_freshen_object(struct odb_source *source,
 	return !!check_and_freshen_file(path.buf, 1);
 }
 
+static int odb_source_loose_write_object(struct odb_source *source,
+					 const void *buf, unsigned long len,
+					 enum object_type type, struct object_id *oid,
+					 struct object_id *compat_oid_in,
+					 enum odb_write_object_flags flags)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	const struct git_hash_algo *algo = source->odb->repo->hash_algo;
+	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
+	struct object_id compat_oid;
+	char hdr[MAX_HEADER_LEN];
+	int hdrlen = sizeof(hdr);
+
+	/* Generate compat_oid */
+	if (compat) {
+		if (compat_oid_in)
+			oidcpy(&compat_oid, compat_oid_in);
+		else if (type == OBJ_BLOB)
+			hash_object_file(compat, buf, len, type, &compat_oid);
+		else {
+			struct strbuf converted = STRBUF_INIT;
+			convert_object_file(source->odb->repo, &converted, algo, compat,
+					    buf, len, type, 0);
+			hash_object_file(compat, converted.buf, converted.len,
+					 type, &compat_oid);
+			strbuf_release(&converted);
+		}
+	}
+
+	/* Normally if we have it in the pack then we do not bother writing
+	 * it out into .git/objects/??/?{38} file.
+	 */
+	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
+	if (odb_freshen_object(source->odb, oid))
+		return 0;
+	if (write_loose_object(source, oid, hdr, hdrlen, buf, len, 0, flags))
+		return -1;
+	if (compat)
+		return repo_add_loose_object_map(loose, oid, &compat_oid);
+	return 0;
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -647,6 +690,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.find_abbrev_len = odb_source_loose_find_abbrev_len;
 	loose->base.count_objects = odb_source_loose_count_objects;
 	loose->base.freshen_object = odb_source_loose_freshen_object;
+	loose->base.write_object = odb_source_loose_write_object;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 13/18] loose: refactor object map to operate on `struct odb_source_loose`
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

While the loose object map functions in "loose.c" accept a generic
`struct odb_source *`, they always expect this to be the "files"
backend. Furthermore, the subsystem doesn't even care about the "files"
backend, but only uses it as a stepping stone to get to the "loose"
backend.

This assumption is implicit and thus not immediately obvious. Refactor
the interfaces to operate on a `struct odb_source_loose` instead, which
eliminates the implicit dependency and the unnecessary detour via the
"files" source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
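Reader's note, not part of the commit: the two-way mapping that
`insert_loose_map()` maintains can be sketched as below, with the
function taking the loose source directly as the refactoring intends.
This is a toy model under stated assumptions, not Git's code. The
`demo_*` names are hypothetical, object IDs are plain strings, small
fixed arrays replace the hash maps, and the cache update is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAP_SIZE 8

struct demo_pair {
	const char *key;
	const char *value;
};

struct demo_map {
	struct demo_pair pairs[DEMO_MAP_SIZE];
	size_t nr;
};

/* Hypothetical loose source holding both directions of the mapping. */
struct demo_loose {
	struct demo_map to_compat;	/* storage ID -> compat ID */
	struct demo_map to_storage;	/* compat ID -> storage ID */
};

const char *demo_lookup(const struct demo_map *map, const char *key)
{
	size_t i;

	for (i = 0; i < map->nr; i++)
		if (!strcmp(map->pairs[i].key, key))
			return map->pairs[i].value;
	return NULL;
}

/* Return 1 if the pair was newly inserted, 0 if the key already existed. */
static int demo_insert_pair(struct demo_map *map, const char *key,
			    const char *value)
{
	if (demo_lookup(map, key))
		return 0;
	assert(map->nr < DEMO_MAP_SIZE);
	map->pairs[map->nr].key = key;
	map->pairs[map->nr].value = value;
	map->nr++;
	return 1;
}

/* Operates on the loose source directly: no downcast from a wrapper. */
int demo_insert_loose_map(struct demo_loose *loose, const char *oid,
			  const char *compat_oid)
{
	int inserted = 0;

	inserted |= demo_insert_pair(&loose->to_compat, oid, compat_oid);
	inserted |= demo_insert_pair(&loose->to_storage, compat_oid, oid);
	return inserted;
}
```

Since the function receives the struct that owns both maps, nothing in
it depends on what kind of source wraps the loose backend.
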
 loose.c       | 45 ++++++++++++++++++++++-----------------------
 loose.h       |  4 ++--
 object-file.c |  9 ++++++---
 3 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/loose.c b/loose.c
index f7a3dd1a72..0b626c1b85 100644
--- a/loose.c
+++ b/loose.c
@@ -46,38 +46,36 @@ static int insert_oid_pair(kh_oid_map_t *map, const struct object_id *key, const
 	return 1;
 }
 
-static int insert_loose_map(struct odb_source *source,
+static int insert_loose_map(struct odb_source_loose *loose,
 			    const struct object_id *oid,
 			    const struct object_id *compat_oid)
 {
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	struct loose_object_map *map = files->loose->map;
+	struct loose_object_map *map = loose->map;
 	int inserted = 0;
 
 	inserted |= insert_oid_pair(map->to_compat, oid, compat_oid);
 	inserted |= insert_oid_pair(map->to_storage, compat_oid, oid);
 	if (inserted)
-		oidtree_insert(files->loose->cache, compat_oid, NULL);
+		oidtree_insert(loose->cache, compat_oid, NULL);
 
 	return inserted;
 }
 
-static int load_one_loose_object_map(struct repository *repo, struct odb_source *source)
+static int load_one_loose_object_map(struct repository *repo, struct odb_source_loose *loose)
 {
-	struct odb_source_files *files = odb_source_files_downcast(source);
 	struct strbuf buf = STRBUF_INIT, path = STRBUF_INIT;
 	FILE *fp;
 
-	if (!files->loose->map)
-		loose_object_map_init(&files->loose->map);
-	if (!files->loose->cache) {
-		ALLOC_ARRAY(files->loose->cache, 1);
-		oidtree_init(files->loose->cache);
+	if (!loose->map)
+		loose_object_map_init(&loose->map);
+	if (!loose->cache) {
+		ALLOC_ARRAY(loose->cache, 1);
+		oidtree_init(loose->cache);
 	}
 
-	insert_loose_map(source, repo->hash_algo->empty_tree, repo->compat_hash_algo->empty_tree);
-	insert_loose_map(source, repo->hash_algo->empty_blob, repo->compat_hash_algo->empty_blob);
-	insert_loose_map(source, repo->hash_algo->null_oid, repo->compat_hash_algo->null_oid);
+	insert_loose_map(loose, repo->hash_algo->empty_tree, repo->compat_hash_algo->empty_tree);
+	insert_loose_map(loose, repo->hash_algo->empty_blob, repo->compat_hash_algo->empty_blob);
+	insert_loose_map(loose, repo->hash_algo->null_oid, repo->compat_hash_algo->null_oid);
 
 	repo_common_path_replace(repo, &path, "objects/loose-object-idx");
 	fp = fopen(path.buf, "rb");
@@ -97,7 +95,7 @@ static int load_one_loose_object_map(struct repository *repo, struct odb_source
 		    parse_oid_hex_algop(p, &compat_oid, &p, repo->compat_hash_algo) ||
 		    p != buf.buf + buf.len)
 			goto err;
-		insert_loose_map(source, &oid, &compat_oid);
+		insert_loose_map(loose, &oid, &compat_oid);
 	}
 
 	strbuf_release(&buf);
@@ -119,7 +117,8 @@ int repo_read_loose_object_map(struct repository *repo)
 	odb_prepare_alternates(repo->objects);
 
 	for (source = repo->objects->sources; source; source = source->next) {
-		if (load_one_loose_object_map(repo, source) < 0) {
+		struct odb_source_files *files = odb_source_files_downcast(source);
+		if (load_one_loose_object_map(repo, files->loose) < 0) {
 			return -1;
 		}
 	}
@@ -171,7 +170,7 @@ int repo_write_loose_object_map(struct repository *repo)
 	return -1;
 }
 
-static int write_one_object(struct odb_source *source,
+static int write_one_object(struct odb_source_loose *loose,
 			    const struct object_id *oid,
 			    const struct object_id *compat_oid)
 {
@@ -180,7 +179,7 @@ static int write_one_object(struct odb_source *source,
 	struct stat st;
 	struct strbuf buf = STRBUF_INIT, path = STRBUF_INIT;
 
-	strbuf_addf(&path, "%s/loose-object-idx", source->path);
+	strbuf_addf(&path, "%s/loose-object-idx", loose->base.path);
 	hold_lock_file_for_update_timeout(&lock, path.buf, LOCK_DIE_ON_ERROR, -1);
 
 	fd = open(path.buf, O_WRONLY | O_CREAT | O_APPEND, 0666);
@@ -196,7 +195,7 @@ static int write_one_object(struct odb_source *source,
 		goto errout;
 	if (close(fd))
 		goto errout;
-	adjust_shared_perm(source->odb->repo, path.buf);
+	adjust_shared_perm(loose->base.odb->repo, path.buf);
 	rollback_lock_file(&lock);
 	strbuf_release(&buf);
 	strbuf_release(&path);
@@ -210,18 +209,18 @@ static int write_one_object(struct odb_source *source,
 	return -1;
 }
 
-int repo_add_loose_object_map(struct odb_source *source,
+int repo_add_loose_object_map(struct odb_source_loose *loose,
 			      const struct object_id *oid,
 			      const struct object_id *compat_oid)
 {
 	int inserted = 0;
 
-	if (!should_use_loose_object_map(source->odb->repo))
+	if (!should_use_loose_object_map(loose->base.odb->repo))
 		return 0;
 
-	inserted = insert_loose_map(source, oid, compat_oid);
+	inserted = insert_loose_map(loose, oid, compat_oid);
 	if (inserted)
-		return write_one_object(source, oid, compat_oid);
+		return write_one_object(loose, oid, compat_oid);
 	return 0;
 }
 
diff --git a/loose.h b/loose.h
index 6af1702973..6c9b3f4571 100644
--- a/loose.h
+++ b/loose.h
@@ -4,7 +4,7 @@
 #include "khash.h"
 
 struct repository;
-struct odb_source;
+struct odb_source_loose;
 
 struct loose_object_map {
 	kh_oid_map_t *to_compat;
@@ -17,7 +17,7 @@ int repo_loose_object_map_oid(struct repository *repo,
 			      const struct object_id *src,
 			      const struct git_hash_algo *dest_algo,
 			      struct object_id *dest);
-int repo_add_loose_object_map(struct odb_source *source,
+int repo_add_loose_object_map(struct odb_source_loose *loose,
 			      const struct object_id *oid,
 			      const struct object_id *compat_oid);
 int repo_read_loose_object_map(struct repository *repo);
diff --git a/object-file.c b/object-file.c
index 0689a4e67b..fe24f00d1b 100644
--- a/object-file.c
+++ b/object-file.c
@@ -810,6 +810,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
 				  struct odb_write_stream *in_stream, size_t len,
 				  struct object_id *oid)
 {
+	struct odb_source_files *files = odb_source_files_downcast(source);
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
 	struct object_id compat_oid;
 	int fd, ret, err = 0, flush = 0;
@@ -918,7 +919,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
 	err = finalize_object_file_flags(source->odb->repo, tmp_file.buf, filename.buf,
 					 FOF_SKIP_COLLISION_CHECK);
 	if (!err && compat)
-		err = repo_add_loose_object_map(source, oid, &compat_oid);
+		err = repo_add_loose_object_map(files->loose, oid, &compat_oid);
 cleanup:
 	strbuf_release(&tmp_file);
 	strbuf_release(&filename);
@@ -931,6 +932,7 @@ int odb_source_loose_write_object(struct odb_source *source,
 				  struct object_id *compat_oid_in,
 				  enum odb_write_object_flags flags)
 {
+	struct odb_source_files *files = odb_source_files_downcast(source);
 	const struct git_hash_algo *algo = source->odb->repo->hash_algo;
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
 	struct object_id compat_oid;
@@ -962,13 +964,14 @@ int odb_source_loose_write_object(struct odb_source *source,
 	if (write_loose_object(source, oid, hdr, hdrlen, buf, len, 0, flags))
 		return -1;
 	if (compat)
-		return repo_add_loose_object_map(source, oid, &compat_oid);
+		return repo_add_loose_object_map(files->loose, oid, &compat_oid);
 	return 0;
 }
 
 int force_object_loose(struct odb_source *source,
 		       const struct object_id *oid, time_t mtime)
 {
+	struct odb_source_files *files = odb_source_files_downcast(source);
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
 	void *buf;
 	unsigned long len;
@@ -998,7 +1001,7 @@ int force_object_loose(struct odb_source *source,
 	hdrlen = format_object_header(hdr, sizeof(hdr), type, len);
 	ret = write_loose_object(source, oid, hdr, hdrlen, buf, len, mtime, 0);
 	if (!ret && compat)
-		ret = repo_add_loose_object_map(source, oid, &compat_oid);
+		ret = repo_add_loose_object_map(files->loose, oid, &compat_oid);
 	free(buf);
 
 	return ret;

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 12/18] odb/source-loose: wire up `freshen_object()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_freshen_object()` from "object-file.c" into
"odb/source-loose.c" and wire it up as the `freshen_object()` callback
of the loose source.

As part of the move, `check_and_freshen_source()` is inlined into the
callback function, as it no longer has any other callers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
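For readers less familiar with the callback pattern used here, the
following is a minimal sketch of how a generic entry point dispatches
through a function pointer installed by the backend. The names
(`struct source`, `loose_source_init()`, `source_freshen_object()`) and
the toy "contains exactly one object" logic are assumptions made for
illustration; they do not mirror the real implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct source;
typedef int (*freshen_fn)(struct source *source, const char *oid_hex);

/* Hypothetical generic source with a single callback slot. */
struct source {
	freshen_fn freshen_object;
	const char *known_oid;	/* the one object this toy source "contains" */
	int freshen_count;	/* how often an object was freshened */
};

/* Backend-specific callback: returns 1 if the object was freshened. */
static int loose_freshen_object(struct source *source, const char *oid_hex)
{
	if (strcmp(source->known_oid, oid_hex))
		return 0;
	source->freshen_count++;
	return 1;
}

/* Constructor wires the callback up, as the patch does for the loose source. */
static void loose_source_init(struct source *source, const char *known_oid)
{
	source->freshen_object = loose_freshen_object;
	source->known_oid = known_oid;
	source->freshen_count = 0;
}

/* Generic entry point: callers never name the backend function directly. */
static int source_freshen_object(struct source *source, const char *oid_hex)
{
	assert(source->freshen_object);
	return source->freshen_object(source, oid_hex);
}
```

Because callers only go through the generic entry point, the
backend-specific function can become `static`, which is what allows its
declaration to be dropped from the header.
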
 object-file.c      | 15 ---------------
 object-file.h      |  3 ---
 odb/source-files.c |  2 +-
 odb/source-loose.c |  9 +++++++++
 4 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/object-file.c b/object-file.c
index c83136cf70..0689a4e67b 100644
--- a/object-file.c
+++ b/object-file.c
@@ -87,15 +87,6 @@ int check_and_freshen_file(const char *fn, int freshen)
 	return 1;
 }
 
-static int check_and_freshen_source(struct odb_source *source,
-				    const struct object_id *oid,
-				    int freshen)
-{
-	static struct strbuf path = STRBUF_INIT;
-	odb_loose_path(source, &path, oid);
-	return check_and_freshen_file(path.buf, freshen);
-}
-
 int format_object_header(char *str, size_t size, enum object_type type,
 			 size_t objsize)
 {
@@ -815,12 +806,6 @@ static int write_loose_object(struct odb_source *source,
 					  FOF_SKIP_COLLISION_CHECK);
 }
 
-int odb_source_loose_freshen_object(struct odb_source *source,
-				    const struct object_id *oid)
-{
-	return !!check_and_freshen_source(source, oid, 1);
-}
-
 int odb_source_loose_write_stream(struct odb_source *source,
 				  struct odb_write_stream *in_stream, size_t len,
 				  struct object_id *oid)
diff --git a/object-file.h b/object-file.h
index 506ca6be40..1d90df9d98 100644
--- a/object-file.h
+++ b/object-file.h
@@ -23,9 +23,6 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
 struct object_info;
 struct odb_source;
 
-int odb_source_loose_freshen_object(struct odb_source *source,
-				    const struct object_id *oid);
-
 int odb_source_loose_write_object(struct odb_source *source,
 				  const void *buf, unsigned long len,
 				  enum object_type type, struct object_id *oid,
diff --git a/odb/source-files.c b/odb/source-files.c
index d5454e170d..ef548e6fe6 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -152,7 +152,7 @@ static int odb_source_files_freshen_object(struct odb_source *source,
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
 	if (packfile_store_freshen_object(files->packed, oid) ||
-	    odb_source_loose_freshen_object(source, oid))
+	    odb_source_freshen_object(&files->loose->base, oid))
 		return 1;
 	return 0;
 }
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 27be066327..e519365d23 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -580,6 +580,14 @@ static int odb_source_loose_count_objects(struct odb_source *source,
 	return ret;
 }
 
+static int odb_source_loose_freshen_object(struct odb_source *source,
+					   const struct object_id *oid)
+{
+	static struct strbuf path = STRBUF_INIT;
+	odb_loose_path(source, &path, oid);
+	return !!check_and_freshen_file(path.buf, 1);
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -638,6 +646,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.for_each_object = odb_source_loose_for_each_object;
 	loose->base.find_abbrev_len = odb_source_loose_find_abbrev_len;
 	loose->base.count_objects = odb_source_loose_count_objects;
+	loose->base.freshen_object = odb_source_loose_freshen_object;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 11/18] odb/source-loose: drop `odb_source_loose_has_object()`
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

The function `odb_source_loose_has_object()` checks whether a specific
object exists as a loose object on disk by using lstat(3p). This
interface is somewhat redundant, as we typically check for object
existence in a generic way via `odb_source_read_object_info()`.

In fact, the two calls do the same work when the latter is called
without an object info request and without the `OBJECT_INFO_QUICK`
flag: in that case we end up doing the same call to lstat(3p) in
`read_object_info_from_path()`.

Drop the function and adapt callers to use the generic interface
instead so that the calling conventions align with those of other
sources.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
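Note that the two interfaces use opposite return conventions, which is
why the converted callers negate the result. The sketch below shows the
difference with plain lstat(3p); `has_object_at()`, `read_info_at()`
and `struct info` are hypothetical simplifications, not the functions
touched by this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical, reduced "object info" request. */
struct info {
	off_t *disk_sizep;
};

/* Predicate style: nonzero when something exists at `path`. */
static int has_object_at(const char *path)
{
	struct stat st;
	assert(path);
	return lstat(path, &st) == 0;
}

/*
 * Info style: 0 on success, -1 when nothing exists at `path`. A NULL
 * request degenerates into a pure existence check.
 */
static int read_info_at(const char *path, struct info *oi)
{
	struct stat st;
	assert(path);
	if (lstat(path, &st) < 0)
		return -1;
	if (oi && oi->disk_sizep)
		*oi->disk_sizep = st.st_size;
	return 0;
}
```

In this sketch `has_object_at(p)` is equivalent to
`!read_info_at(p, NULL)`, which is the shape of the rewritten call
sites.
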
 builtin/pack-objects.c | 12 ++++++++----
 object-file.c          | 12 ++++--------
 object-file.h          |  8 --------
 3 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 480cc0bd8c..a6be3d659f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1750,9 +1750,11 @@ static int want_object_in_pack_mtime(const struct object_id *oid,
 		 * skip the local object source.
 		 */
 		struct odb_source *source = the_repository->objects->sources->next;
-		for (; source; source = source->next)
-			if (odb_source_loose_has_object(source, oid))
+		for (; source; source = source->next) {
+			struct odb_source_files *files = odb_source_files_downcast(source);
+			if (!odb_source_read_object_info(&files->loose->base, oid, NULL, 0))
 				return 0;
+		}
 	}
 
 	/*
@@ -4135,9 +4137,11 @@ static void add_cruft_object_entry(const struct object_id *oid, enum object_type
 			struct odb_source *source = the_repository->objects->sources;
 			int found = 0;
 
-			for (; !found && source; source = source->next)
-				if (odb_source_loose_has_object(source, oid))
+			for (; !found && source; source = source->next) {
+				struct odb_source_files *files = odb_source_files_downcast(source);
+				if (!odb_source_read_object_info(&files->loose->base, oid, NULL, 0))
 					found = 1;
+			}
 
 			/*
 			 * If a traversed tree has a missing blob then we want
diff --git a/object-file.c b/object-file.c
index 9b2044de37..c83136cf70 100644
--- a/object-file.c
+++ b/object-file.c
@@ -96,12 +96,6 @@ static int check_and_freshen_source(struct odb_source *source,
 	return check_and_freshen_file(path.buf, freshen);
 }
 
-int odb_source_loose_has_object(struct odb_source *source,
-				const struct object_id *oid)
-{
-	return check_and_freshen_source(source, oid, 0);
-}
-
 int format_object_header(char *str, size_t size, enum object_type type,
 			 size_t objsize)
 {
@@ -1000,9 +994,11 @@ int force_object_loose(struct odb_source *source,
 	int hdrlen;
 	int ret;
 
-	for (struct odb_source *s = source->odb->sources; s; s = s->next)
-		if (odb_source_loose_has_object(s, oid))
+	for (struct odb_source *s = source->odb->sources; s; s = s->next) {
+		struct odb_source_files *files = odb_source_files_downcast(s);
+		if (!odb_source_read_object_info(&files->loose->base, oid, NULL, 0))
 			return 0;
+	}
 
 	oi.typep = &type;
 	oi.sizep = &len;
diff --git a/object-file.h b/object-file.h
index bc72d89f54..506ca6be40 100644
--- a/object-file.h
+++ b/object-file.h
@@ -23,14 +23,6 @@ int index_path(struct index_state *istate, struct object_id *oid, const char *pa
 struct object_info;
 struct odb_source;
 
-/*
- * Return true iff an object database source has a loose object
- * with the specified name.  This function does not respect replace
- * references.
- */
-int odb_source_loose_has_object(struct odb_source *source,
-				const struct object_id *oid);
-
 int odb_source_loose_freshen_object(struct odb_source *source,
 				    const struct object_id *oid);
 

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 10/18] odb/source-loose: wire up `count_objects()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_count_objects()` and its associated helpers from
"object-file.c" into "odb/source-loose.c" and wire it up as the
`count_objects()` callback of the loose source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
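As background for the approximate mode of the moved function: it scans
a single shard directory and extrapolates by 256. The sketch below
reproduces only the filtering and extrapolation step on an in-memory
list of names so that it can be run without a repository. The helper
names and the idea of passing the directory entries as an array are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * A shard entry counts as a loose object if it consists of exactly
 * `hexsz` lowercase hex digits (the hash length minus the two digits
 * that name the shard directory).
 */
static int is_loose_object_name(const char *name, size_t hexsz)
{
	return strspn(name, "0123456789abcdef") == hexsz &&
	       name[hexsz] == '\0';
}

/* Count matching entries of one shard and extrapolate to all 256 shards. */
static unsigned long estimate_loose_count(const char *const *names,
					  size_t nr, size_t hexsz)
{
	unsigned long count = 0;

	assert(names || !nr);
	for (size_t i = 0; i < nr; i++)
		if (is_loose_object_name(names[i], hexsz))
			count++;
	return count * 256;
}
```

The estimate relies on object IDs being evenly distributed over the
shards, so a single directory is treated as representative of all of
them.
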
 builtin/gc.c       |  6 +++---
 object-file.c      | 60 -----------------------------------------------------
 object-file.h      | 14 -------------
 odb/source-files.c |  2 +-
 odb/source-loose.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 65 insertions(+), 78 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 84a66d3240..c26c93ee0f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -466,6 +466,7 @@ static int rerere_gc_condition(struct gc_config *cfg UNUSED)
 
 static int too_many_loose_objects(int limit)
 {
+	struct odb_source_files *files = odb_source_files_downcast(the_repository->objects->sources);
 	/*
 	 * This is weird, but stems from legacy behaviour: the GC auto
 	 * threshold was always essentially interpreted as if it was rounded up
@@ -474,9 +475,8 @@ static int too_many_loose_objects(int limit)
 	int auto_threshold = DIV_ROUND_UP(limit, 256) * 256;
 	unsigned long loose_count;
 
-	if (odb_source_loose_count_objects(the_repository->objects->sources,
-					   ODB_COUNT_OBJECTS_APPROXIMATE,
-					   &loose_count) < 0)
+	if (odb_source_count_objects(&files->loose->base, ODB_COUNT_OBJECTS_APPROXIMATE,
+				     &loose_count) < 0)
 		return 0;
 
 	return loose_count > auto_threshold;
diff --git a/object-file.c b/object-file.c
index 11957aa44f..9b2044de37 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1602,66 +1602,6 @@ int for_each_loose_file_in_source(struct odb_source *source,
 	return r;
 }
 
-static int count_loose_object(const struct object_id *oid UNUSED,
-			      struct object_info *oi UNUSED,
-			      void *payload)
-{
-	unsigned long *count = payload;
-	(*count)++;
-	return 0;
-}
-
-int odb_source_loose_count_objects(struct odb_source *source,
-				   enum odb_count_objects_flags flags,
-				   unsigned long *out)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	const unsigned hexsz = source->odb->repo->hash_algo->hexsz - 2;
-	char *path = NULL;
-	DIR *dir = NULL;
-	int ret;
-
-	if (flags & ODB_COUNT_OBJECTS_APPROXIMATE) {
-		unsigned long count = 0;
-		struct dirent *ent;
-
-		path = xstrfmt("%s/17", source->path);
-
-		dir = opendir(path);
-		if (!dir) {
-			if (errno == ENOENT) {
-				*out = 0;
-				ret = 0;
-				goto out;
-			}
-
-			ret = error_errno("cannot open object shard '%s'", path);
-			goto out;
-		}
-
-		while ((ent = readdir(dir)) != NULL) {
-			if (strspn(ent->d_name, "0123456789abcdef") != hexsz ||
-			    ent->d_name[hexsz] != '\0')
-				continue;
-			count++;
-		}
-
-		*out = count * 256;
-		ret = 0;
-	} else {
-		struct odb_for_each_object_options opts = { 0 };
-		*out = 0;
-		ret = odb_source_for_each_object(&files->loose->base, NULL, count_loose_object,
-						 out, &opts);
-	}
-
-out:
-	if (dir)
-		closedir(dir);
-	free(path);
-	return ret;
-}
-
 static int check_stream_oid(git_zstream *stream,
 			    const char *hdr,
 			    unsigned long size,
diff --git a/object-file.h b/object-file.h
index 96760db0e1..bc72d89f54 100644
--- a/object-file.h
+++ b/object-file.h
@@ -96,20 +96,6 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 				each_loose_subdir_fn subdir_cb,
 				void *data);
 
-/*
- * Count the number of loose objects in this source.
- *
- * The object count is approximated by opening a single sharding directory for
- * loose objects and scanning its contents. The result is then extrapolated by
- * 256. This should generally work as a reasonable estimate given that the
- * object hash is supposed to be indistinguishable from random.
- *
- * Returns 0 on success, a negative error code otherwise.
- */
-int odb_source_loose_count_objects(struct odb_source *source,
-				   enum odb_count_objects_flags flags,
-				   unsigned long *out);
-
 /**
  * format_object_header() is a thin wrapper around s xsnprintf() that
  * writes the initial "<type> <obj-len>" part of the loose object
diff --git a/odb/source-files.c b/odb/source-files.c
index 4a54b10e4a..d5454e170d 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -109,7 +109,7 @@ static int odb_source_files_count_objects(struct odb_source *source,
 	if (!(flags & ODB_COUNT_OBJECTS_APPROXIMATE)) {
 		unsigned long loose_count;
 
-		ret = odb_source_loose_count_objects(source, flags, &loose_count);
+		ret = odb_source_count_objects(&files->loose->base, flags, &loose_count);
 		if (ret < 0)
 			goto out;
 
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 4b8d10bc87..27be066327 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -520,6 +520,66 @@ static int odb_source_loose_find_abbrev_len(struct odb_source *source,
 	return ret;
 }
 
+static int count_loose_object(const struct object_id *oid UNUSED,
+			      struct object_info *oi UNUSED,
+			      void *payload)
+{
+	unsigned long *count = payload;
+	(*count)++;
+	return 0;
+}
+
+static int odb_source_loose_count_objects(struct odb_source *source,
+					  enum odb_count_objects_flags flags,
+					  unsigned long *out)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	const unsigned hexsz = source->odb->repo->hash_algo->hexsz - 2;
+	char *path = NULL;
+	DIR *dir = NULL;
+	int ret;
+
+	if (flags & ODB_COUNT_OBJECTS_APPROXIMATE) {
+		unsigned long count = 0;
+		struct dirent *ent;
+
+		path = xstrfmt("%s/17", source->path);
+
+		dir = opendir(path);
+		if (!dir) {
+			if (errno == ENOENT) {
+				*out = 0;
+				ret = 0;
+				goto out;
+			}
+
+			ret = error_errno("cannot open object shard '%s'", path);
+			goto out;
+		}
+
+		while ((ent = readdir(dir)) != NULL) {
+			if (strspn(ent->d_name, "0123456789abcdef") != hexsz ||
+			    ent->d_name[hexsz] != '\0')
+				continue;
+			count++;
+		}
+
+		*out = count * 256;
+		ret = 0;
+	} else {
+		struct odb_for_each_object_options opts = { 0 };
+		*out = 0;
+		ret = odb_source_for_each_object(&loose->base, NULL, count_loose_object,
+						 out, &opts);
+	}
+
+out:
+	if (dir)
+		closedir(dir);
+	free(path);
+	return ret;
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -577,6 +637,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.read_object_stream = odb_source_loose_read_object_stream;
 	loose->base.for_each_object = odb_source_loose_for_each_object;
 	loose->base.find_abbrev_len = odb_source_loose_find_abbrev_len;
+	loose->base.count_objects = odb_source_loose_count_objects;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 09/18] odb/source-loose: wire up `find_abbrev_len()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_find_abbrev_len()` and its associated helpers
from "object-file.c" into "odb/source-loose.c" and wire it up as the
`find_abbrev_len()` callback of the loose source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
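As a reminder of what the moved callback computes, here is a minimal
sketch of the "shortest unique prefix" logic on plain hex strings. It
follows the comparison in `find_abbrev_len_cb()`, but the string-based
helpers and the linear scan over all candidates are simplifications
assumed for illustration; the real code iterates over objects sharing
the requested prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex digits that `a` and `b` have in common. */
static size_t common_prefix_hexlen(const char *a, const char *b)
{
	size_t i = 0;
	while (a[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Return the shortest prefix length of `oid`, at least `min_len`, that
 * no other entry in `others` shares. An entry equal to `oid` itself is
 * ignored, as in the callback.
 */
static size_t find_abbrev_len(const char *oid, const char *const *others,
			      size_t nr, size_t min_len)
{
	size_t hexsz = strlen(oid);
	size_t len = min_len;

	assert(min_len <= hexsz);
	for (size_t i = 0; i < nr; i++) {
		size_t common = common_prefix_hexlen(oid, others[i]);
		if (common != hexsz && common >= len)
			len = common + 1;
	}
	return len;
}
```

An object that shares `common` digits with `oid` forces the
abbreviation to be at least one digit longer than that, which is where
the `+ 1` comes from.
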
 object-file.c      | 39 ---------------------------------------
 object-file.h      | 12 ------------
 odb/source-files.c |  2 +-
 odb/source-loose.c | 40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 41 insertions(+), 52 deletions(-)

diff --git a/object-file.c b/object-file.c
index 157ecad3ea..11957aa44f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1662,45 +1662,6 @@ int odb_source_loose_count_objects(struct odb_source *source,
 	return ret;
 }
 
-struct find_abbrev_len_data {
-	const struct object_id *oid;
-	unsigned len;
-};
-
-static int find_abbrev_len_cb(const struct object_id *oid,
-			      struct object_info *oi UNUSED,
-			      void *cb_data)
-{
-	struct find_abbrev_len_data *data = cb_data;
-	unsigned len = oid_common_prefix_hexlen(oid, data->oid);
-	if (len != hash_algos[oid->algo].hexsz && len >= data->len)
-		data->len = len + 1;
-	return 0;
-}
-
-int odb_source_loose_find_abbrev_len(struct odb_source *source,
-				     const struct object_id *oid,
-				     unsigned min_len,
-				     unsigned *out)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	struct odb_for_each_object_options opts = {
-		.prefix = oid,
-		.prefix_hex_len = min_len,
-	};
-	struct find_abbrev_len_data data = {
-		.oid = oid,
-		.len = min_len,
-	};
-	int ret;
-
-	ret = odb_source_for_each_object(&files->loose->base, NULL, find_abbrev_len_cb,
-					 &data, &opts);
-	*out = data.len;
-
-	return ret;
-}
-
 static int check_stream_oid(git_zstream *stream,
 			    const char *hdr,
 			    unsigned long size,
diff --git a/object-file.h b/object-file.h
index 9ee5649220..96760db0e1 100644
--- a/object-file.h
+++ b/object-file.h
@@ -110,18 +110,6 @@ int odb_source_loose_count_objects(struct odb_source *source,
 				   enum odb_count_objects_flags flags,
 				   unsigned long *out);
 
-/*
- * Find the shortest unique prefix for the given object ID, where `min_len` is
- * the minimum length that the prefix should have.
- *
- * Returns 0 on success, in which case the computed length will be written to
- * `out`. Otherwise, a negative error code is returned.
- */
-int odb_source_loose_find_abbrev_len(struct odb_source *source,
-				     const struct object_id *oid,
-				     unsigned min_len,
-				     unsigned *out);
-
 /**
  * format_object_header() is a thin wrapper around s xsnprintf() that
  * writes the initial "<type> <obj-len>" part of the loose object
diff --git a/odb/source-files.c b/odb/source-files.c
index 676a641739..4a54b10e4a 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -136,7 +136,7 @@ static int odb_source_files_find_abbrev_len(struct odb_source *source,
 	if (ret < 0)
 		goto out;
 
-	ret = odb_source_loose_find_abbrev_len(source, oid, len, &len);
+	ret = odb_source_find_abbrev_len(&files->loose->base, oid, len, &len);
 	if (ret < 0)
 		goto out;
 
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 4e8b923498..4b8d10bc87 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -481,6 +481,45 @@ static int odb_source_loose_for_each_object(struct odb_source *source,
 					     NULL, NULL, &data);
 }
 
+struct find_abbrev_len_data {
+	const struct object_id *oid;
+	unsigned len;
+};
+
+static int find_abbrev_len_cb(const struct object_id *oid,
+			      struct object_info *oi UNUSED,
+			      void *cb_data)
+{
+	struct find_abbrev_len_data *data = cb_data;
+	unsigned len = oid_common_prefix_hexlen(oid, data->oid);
+	if (len != hash_algos[oid->algo].hexsz && len >= data->len)
+		data->len = len + 1;
+	return 0;
+}
+
+static int odb_source_loose_find_abbrev_len(struct odb_source *source,
+					    const struct object_id *oid,
+					    unsigned min_len,
+					    unsigned *out)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	struct odb_for_each_object_options opts = {
+		.prefix = oid,
+		.prefix_hex_len = min_len,
+	};
+	struct find_abbrev_len_data data = {
+		.oid = oid,
+		.len = min_len,
+	};
+	int ret;
+
+	ret = odb_source_for_each_object(&loose->base, NULL, find_abbrev_len_cb,
+					 &data, &opts);
+	*out = data.len;
+
+	return ret;
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -537,6 +576,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.read_object_info = odb_source_loose_read_object_info;
 	loose->base.read_object_stream = odb_source_loose_read_object_stream;
 	loose->base.for_each_object = odb_source_loose_for_each_object;
+	loose->base.find_abbrev_len = odb_source_loose_find_abbrev_len;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 08/18] odb/source-loose: wire up `for_each_object()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_for_each_object()` and its associated helpers
from "object-file.c" into "odb/source-loose.c" and wire it up as the
`for_each_object()` callback of the loose source.

Again, as in the preceding commit, we are forced to expose a couple of
functions from "object-file.c" that are now used by both subsystems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/cat-file.c |   5 +-
 object-file.c      | 299 +++--------------------------------------------------
 object-file.h      |  32 +++---
 odb/source-files.c |   2 +-
 odb/source-loose.c | 264 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 297 insertions(+), 305 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index d9fbad5358..2958fc5357 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -862,8 +862,9 @@ static void batch_each_object(struct batch_options *opt,
 	 */
 	odb_prepare_alternates(the_repository->objects);
 	for (source = the_repository->objects->sources; source; source = source->next) {
-		int ret = odb_source_loose_for_each_object(source, NULL, batch_one_object_oi,
-							   &payload, &opts);
+		struct odb_source_files *files = odb_source_files_downcast(source);
+		int ret = odb_source_for_each_object(&files->loose->base, NULL, batch_one_object_oi,
+						     &payload, &opts);
 		if (ret)
 			break;
 	}
diff --git a/object-file.c b/object-file.c
index adfb672493..157ecad3ea 100644
--- a/object-file.c
+++ b/object-file.c
@@ -22,7 +22,6 @@
 #include "odb.h"
 #include "odb/streaming.h"
 #include "odb/transaction.h"
-#include "oidtree.h"
 #include "pack.h"
 #include "packfile.h"
 #include "path.h"
@@ -31,12 +30,6 @@
 #include "tempfile.h"
 #include "tmp-objdir.h"
 
-/* The maximum size for an object header. */
-#define MAX_HEADER_LEN 32
-
-static struct oidtree *odb_source_loose_cache(struct odb_source *source,
-					      const struct object_id *oid);
-
 static int get_conv_flags(unsigned flags)
 {
 	if (flags & INDEX_RENORMALIZE)
@@ -164,12 +157,6 @@ int stream_object_signature(struct repository *r,
 	return !oideq(oid, &real_oid) ? -1 : 0;
 }
 
-static int quick_has_loose(struct odb_source_loose *loose,
-			   const struct object_id *oid)
-{
-	return !!oidtree_contains(odb_source_loose_cache(&loose->files->base, oid), oid);
-}
-
 /*
  * Map and close the given loose object fd. The path argument is used for
  * error reporting.
@@ -227,9 +214,9 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 	return ULHR_TOO_LONG;
 }
 
-static void *unpack_loose_rest(git_zstream *stream,
-			       void *buffer, unsigned long size,
-			       const struct object_id *oid)
+void *unpack_loose_rest(git_zstream *stream,
+			void *buffer, unsigned long size,
+			const struct object_id *oid)
 {
 	size_t bytes = strlen(buffer) + 1, n;
 	unsigned char *buf = xmallocz(size);
@@ -343,149 +330,6 @@ int parse_loose_header(const char *hdr, struct object_info *oi)
 	return 0;
 }
 
-int read_object_info_from_path(struct odb_source_loose *loose,
-			       const char *path,
-			       const struct object_id *oid,
-			       struct object_info *oi,
-			       enum object_info_flags flags)
-{
-	int ret;
-	int fd;
-	unsigned long mapsize;
-	void *map = NULL;
-	git_zstream stream, *stream_to_end = NULL;
-	char hdr[MAX_HEADER_LEN];
-	unsigned long size_scratch;
-	enum object_type type_scratch;
-	struct stat st;
-
-	/*
-	 * If we don't care about type or size, then we don't
-	 * need to look inside the object at all. Note that we
-	 * do not optimize out the stat call, even if the
-	 * caller doesn't care about the disk-size, since our
-	 * return value implicitly indicates whether the
-	 * object even exists.
-	 */
-	if (!oi || (!oi->typep && !oi->sizep && !oi->contentp)) {
-		struct stat st;
-
-		if ((!oi || (!oi->disk_sizep && !oi->mtimep)) && (flags & OBJECT_INFO_QUICK)) {
-			ret = quick_has_loose(loose, oid) ? 0 : -1;
-			goto out;
-		}
-
-		if (lstat(path, &st) < 0) {
-			ret = -1;
-			goto out;
-		}
-
-		if (oi) {
-			if (oi->disk_sizep)
-				*oi->disk_sizep = st.st_size;
-			if (oi->mtimep)
-				*oi->mtimep = st.st_mtime;
-		}
-
-		ret = 0;
-		goto out;
-	}
-
-	fd = git_open(path);
-	if (fd < 0) {
-		if (errno != ENOENT)
-			error_errno(_("unable to open loose object %s"), oid_to_hex(oid));
-		ret = -1;
-		goto out;
-	}
-
-	if (fstat(fd, &st)) {
-		close(fd);
-		ret = -1;
-		goto out;
-	}
-
-	mapsize = xsize_t(st.st_size);
-	if (!mapsize) {
-		close(fd);
-		ret = error(_("object file %s is empty"), path);
-		goto out;
-	}
-
-	map = xmmap(NULL, mapsize, PROT_READ, MAP_PRIVATE, fd, 0);
-	close(fd);
-	if (!map) {
-		ret = -1;
-		goto out;
-	}
-
-	if (oi->disk_sizep)
-		*oi->disk_sizep = mapsize;
-	if (oi->mtimep)
-		*oi->mtimep = st.st_mtime;
-
-	stream_to_end = &stream;
-
-	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr))) {
-	case ULHR_OK:
-		if (!oi->sizep)
-			oi->sizep = &size_scratch;
-		if (!oi->typep)
-			oi->typep = &type_scratch;
-
-		if (parse_loose_header(hdr, oi) < 0) {
-			ret = error(_("unable to parse %s header"), oid_to_hex(oid));
-			goto corrupt;
-		}
-
-		if (*oi->typep < 0)
-			die(_("invalid object type"));
-
-		if (oi->contentp) {
-			*oi->contentp = unpack_loose_rest(&stream, hdr, *oi->sizep, oid);
-			if (!*oi->contentp) {
-				ret = -1;
-				goto corrupt;
-			}
-		}
-
-		break;
-	case ULHR_BAD:
-		ret = error(_("unable to unpack %s header"),
-			    oid_to_hex(oid));
-		goto corrupt;
-	case ULHR_TOO_LONG:
-		ret = error(_("header for %s too long, exceeds %d bytes"),
-			    oid_to_hex(oid), MAX_HEADER_LEN);
-		goto corrupt;
-	}
-
-	ret = 0;
-
-corrupt:
-	if (ret && (flags & OBJECT_INFO_DIE_IF_CORRUPT))
-		die(_("loose object %s (stored in %s) is corrupt"),
-		    oid_to_hex(oid), path);
-
-out:
-	if (stream_to_end)
-		git_inflate_end(stream_to_end);
-	if (map)
-		munmap(map, mapsize);
-	if (oi) {
-		if (oi->sizep == &size_scratch)
-			oi->sizep = NULL;
-		if (oi->typep == &type_scratch)
-			oi->typep = NULL;
-		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, loose->base.odb->repo->hash_algo);
-		if (!ret)
-			oi->whence = OI_LOOSE;
-	}
-
-	return ret;
-}
-
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
 			     const void *buf, unsigned long len,
 			     struct object_id *oid,
@@ -1667,13 +1511,13 @@ int read_pack_header(int fd, struct pack_header *header)
 	return 0;
 }
 
-static int for_each_file_in_obj_subdir(unsigned int subdir_nr,
-				       struct strbuf *path,
-				       const struct git_hash_algo *algop,
-				       each_loose_object_fn obj_cb,
-				       each_loose_cruft_fn cruft_cb,
-				       each_loose_subdir_fn subdir_cb,
-				       void *data)
+int for_each_file_in_obj_subdir(unsigned int subdir_nr,
+				struct strbuf *path,
+				const struct git_hash_algo *algop,
+				each_loose_object_fn obj_cb,
+				each_loose_cruft_fn cruft_cb,
+				each_loose_subdir_fn subdir_cb,
+				void *data)
 {
 	size_t origlen, baselen;
 	DIR *dir;
@@ -1758,78 +1602,6 @@ int for_each_loose_file_in_source(struct odb_source *source,
 	return r;
 }
 
-struct for_each_object_wrapper_data {
-	struct odb_source_loose *loose;
-	const struct object_info *request;
-	odb_for_each_object_cb cb;
-	void *cb_data;
-};
-
-static int for_each_object_wrapper_cb(const struct object_id *oid,
-				      const char *path,
-				      void *cb_data)
-{
-	struct for_each_object_wrapper_data *data = cb_data;
-
-	if (data->request) {
-		struct object_info oi = *data->request;
-
-		if (read_object_info_from_path(data->loose, path, oid, &oi, 0) < 0)
-			return -1;
-
-		return data->cb(oid, &oi, data->cb_data);
-	} else {
-		return data->cb(oid, NULL, data->cb_data);
-	}
-}
-
-static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid,
-					       void *node_data UNUSED,
-					       void *cb_data)
-{
-	struct for_each_object_wrapper_data *data = cb_data;
-	if (data->request) {
-		struct object_info oi = *data->request;
-
-		if (odb_source_read_object_info(&data->loose->base,
-						oid, &oi, 0) < 0)
-			return -1;
-
-		return data->cb(oid, &oi, data->cb_data);
-	} else {
-		return data->cb(oid, NULL, data->cb_data);
-	}
-}
-
-int odb_source_loose_for_each_object(struct odb_source *source,
-				     const struct object_info *request,
-				     odb_for_each_object_cb cb,
-				     void *cb_data,
-				     const struct odb_for_each_object_options *opts)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	struct for_each_object_wrapper_data data = {
-		.loose = files->loose,
-		.request = request,
-		.cb = cb,
-		.cb_data = cb_data,
-	};
-
-	/* There are no loose promisor objects, so we can return immediately. */
-	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY))
-		return 0;
-	if ((opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY) && !source->local)
-		return 0;
-
-	if (opts->prefix)
-		return oidtree_each(odb_source_loose_cache(source, opts->prefix),
-				    opts->prefix, opts->prefix_hex_len,
-				    for_each_prefixed_object_wrapper_cb, &data);
-
-	return for_each_loose_file_in_source(source, for_each_object_wrapper_cb,
-					     NULL, NULL, &data);
-}
-
 static int count_loose_object(const struct object_id *oid UNUSED,
 			      struct object_info *oi UNUSED,
 			      void *payload)
@@ -1843,6 +1615,7 @@ int odb_source_loose_count_objects(struct odb_source *source,
 				   enum odb_count_objects_flags flags,
 				   unsigned long *out)
 {
+	struct odb_source_files *files = odb_source_files_downcast(source);
 	const unsigned hexsz = source->odb->repo->hash_algo->hexsz - 2;
 	char *path = NULL;
 	DIR *dir = NULL;
@@ -1878,8 +1651,8 @@ int odb_source_loose_count_objects(struct odb_source *source,
 	} else {
 		struct odb_for_each_object_options opts = { 0 };
 		*out = 0;
-		ret = odb_source_loose_for_each_object(source, NULL, count_loose_object,
-						       out, &opts);
+		ret = odb_source_for_each_object(&files->loose->base, NULL, count_loose_object,
+						 out, &opts);
 	}
 
 out:
@@ -1910,6 +1683,7 @@ int odb_source_loose_find_abbrev_len(struct odb_source *source,
 				     unsigned min_len,
 				     unsigned *out)
 {
+	struct odb_source_files *files = odb_source_files_downcast(source);
 	struct odb_for_each_object_options opts = {
 		.prefix = oid,
 		.prefix_hex_len = min_len,
@@ -1920,54 +1694,13 @@ int odb_source_loose_find_abbrev_len(struct odb_source *source,
 	};
 	int ret;
 
-	ret = odb_source_loose_for_each_object(source, NULL, find_abbrev_len_cb,
-					       &data, &opts);
+	ret = odb_source_for_each_object(&files->loose->base, NULL, find_abbrev_len_cb,
+					 &data, &opts);
 	*out = data.len;
 
 	return ret;
 }
 
-static int append_loose_object(const struct object_id *oid,
-			       const char *path UNUSED,
-			       void *data)
-{
-	oidtree_insert(data, oid, NULL);
-	return 0;
-}
-
-static struct oidtree *odb_source_loose_cache(struct odb_source *source,
-					      const struct object_id *oid)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	int subdir_nr = oid->hash[0];
-	struct strbuf buf = STRBUF_INIT;
-	size_t word_bits = bitsizeof(files->loose->subdir_seen[0]);
-	size_t word_index = subdir_nr / word_bits;
-	size_t mask = (size_t)1u << (subdir_nr % word_bits);
-	uint32_t *bitmap;
-
-	if (subdir_nr < 0 ||
-	    (size_t) subdir_nr >= bitsizeof(files->loose->subdir_seen))
-		BUG("subdir_nr out of range");
-
-	bitmap = &files->loose->subdir_seen[word_index];
-	if (*bitmap & mask)
-		return files->loose->cache;
-	if (!files->loose->cache) {
-		ALLOC_ARRAY(files->loose->cache, 1);
-		oidtree_init(files->loose->cache);
-	}
-	strbuf_addstr(&buf, source->path);
-	for_each_file_in_obj_subdir(subdir_nr, &buf,
-				    source->odb->repo->hash_algo,
-				    append_loose_object,
-				    NULL, NULL,
-				    files->loose->cache);
-	*bitmap |= mask;
-	strbuf_release(&buf);
-	return files->loose->cache;
-}
-
 static int check_stream_oid(git_zstream *stream,
 			    const char *hdr,
 			    unsigned long size,
diff --git a/object-file.h b/object-file.h
index d93b7ffad7..9ee5649220 100644
--- a/object-file.h
+++ b/object-file.h
@@ -6,6 +6,9 @@
 #include "odb.h"
 #include "odb/source-loose.h"
 
+/* The maximum size for an object header. */
+#define MAX_HEADER_LEN 32
+
 struct index_state;
 
 enum {
@@ -85,19 +88,13 @@ int for_each_loose_file_in_source(struct odb_source *source,
 				  each_loose_cruft_fn cruft_cb,
 				  each_loose_subdir_fn subdir_cb,
 				  void *data);
-
-/*
- * Iterate through all loose objects in the given object database source and
- * invoke the callback function for each of them. If an object info request is
- * given, then the object info will be read for every individual object and
- * passed to the callback as if `odb_source_loose_read_object_info()` was
- * called for the object.
- */
-int odb_source_loose_for_each_object(struct odb_source *source,
-				     const struct object_info *request,
-				     odb_for_each_object_cb cb,
-				     void *cb_data,
-				     const struct odb_for_each_object_options *opts);
+int for_each_file_in_obj_subdir(unsigned int subdir_nr,
+				struct strbuf *path,
+				const struct git_hash_algo *algop,
+				each_loose_object_fn obj_cb,
+				each_loose_cruft_fn cruft_cb,
+				each_loose_subdir_fn subdir_cb,
+				void *data);
 
 /*
  * Count the number of loose objects in this source.
@@ -188,12 +185,6 @@ int read_loose_object(struct repository *repo,
 		      void **contents,
 		      struct object_info *oi);
 
-int read_object_info_from_path(struct odb_source_loose *loose,
-			       const char *path,
-			       const struct object_id *oid,
-			       struct object_info *oi,
-			       enum object_info_flags flags);
-
 enum unpack_loose_header_result {
 	ULHR_OK,
 	ULHR_BAD,
@@ -217,6 +208,9 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 						    unsigned long mapsize,
 						    void *buffer,
 						    unsigned long bufsiz);
+void *unpack_loose_rest(git_zstream *stream,
+			void *buffer, unsigned long size,
+			const struct object_id *oid);
 
 int parse_loose_header(const char *hdr, struct object_info *oi);
 
diff --git a/odb/source-files.c b/odb/source-files.c
index 90806ddf86..676a641739 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -82,7 +82,7 @@ static int odb_source_files_for_each_object(struct odb_source *source,
 	int ret;
 
 	if (!(opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY)) {
-		ret = odb_source_loose_for_each_object(source, request, cb, cb_data, opts);
+		ret = odb_source_for_each_object(&files->loose->base, request, cb, cb_data, opts);
 		if (ret)
 			return ret;
 	}
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 4b82c6f316..4e8b923498 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -2,6 +2,7 @@
 #include "abspath.h"
 #include "chdir-notify.h"
 #include "gettext.h"
+#include "hex.h"
 #include "loose.h"
 #include "object-file.h"
 #include "odb.h"
@@ -9,8 +10,198 @@
 #include "odb/source-loose.h"
 #include "odb/streaming.h"
 #include "oidtree.h"
+#include "repository.h"
 #include "strbuf.h"
 
+static int append_loose_object(const struct object_id *oid,
+			       const char *path UNUSED,
+			       void *data)
+{
+	oidtree_insert(data, oid, NULL);
+	return 0;
+}
+
+static struct oidtree *odb_source_loose_cache(struct odb_source_loose *loose,
+					      const struct object_id *oid)
+{
+	int subdir_nr = oid->hash[0];
+	struct strbuf buf = STRBUF_INIT;
+	size_t word_bits = bitsizeof(loose->subdir_seen[0]);
+	size_t word_index = subdir_nr / word_bits;
+	size_t mask = (size_t)1u << (subdir_nr % word_bits);
+	uint32_t *bitmap;
+
+	if (subdir_nr < 0 ||
+	    (size_t) subdir_nr >= bitsizeof(loose->subdir_seen))
+		BUG("subdir_nr out of range");
+
+	bitmap = &loose->subdir_seen[word_index];
+	if (*bitmap & mask)
+		return loose->cache;
+	if (!loose->cache) {
+		ALLOC_ARRAY(loose->cache, 1);
+		oidtree_init(loose->cache);
+	}
+	strbuf_addstr(&buf, loose->base.path);
+	for_each_file_in_obj_subdir(subdir_nr, &buf,
+				    loose->base.odb->repo->hash_algo,
+				    append_loose_object,
+				    NULL, NULL,
+				    loose->cache);
+	*bitmap |= mask;
+	strbuf_release(&buf);
+	return loose->cache;
+}
+
+static int quick_has_loose(struct odb_source_loose *loose,
+			   const struct object_id *oid)
+{
+	return !!oidtree_contains(odb_source_loose_cache(loose, oid), oid);
+}
+
+static int read_object_info_from_path(struct odb_source_loose *loose,
+				      const char *path,
+				      const struct object_id *oid,
+				      struct object_info *oi,
+				      enum object_info_flags flags)
+{
+	int ret;
+	int fd;
+	unsigned long mapsize;
+	void *map = NULL;
+	git_zstream stream, *stream_to_end = NULL;
+	char hdr[MAX_HEADER_LEN];
+	unsigned long size_scratch;
+	enum object_type type_scratch;
+	struct stat st;
+
+	/*
+	 * If we don't care about type or size, then we don't
+	 * need to look inside the object at all. Note that we
+	 * do not optimize out the stat call, even if the
+	 * caller doesn't care about the disk-size, since our
+	 * return value implicitly indicates whether the
+	 * object even exists.
+	 */
+	if (!oi || (!oi->typep && !oi->sizep && !oi->contentp)) {
+		struct stat st;
+
+		if ((!oi || (!oi->disk_sizep && !oi->mtimep)) && (flags & OBJECT_INFO_QUICK)) {
+			ret = quick_has_loose(loose, oid) ? 0 : -1;
+			goto out;
+		}
+
+		if (lstat(path, &st) < 0) {
+			ret = -1;
+			goto out;
+		}
+
+		if (oi) {
+			if (oi->disk_sizep)
+				*oi->disk_sizep = st.st_size;
+			if (oi->mtimep)
+				*oi->mtimep = st.st_mtime;
+		}
+
+		ret = 0;
+		goto out;
+	}
+
+	fd = git_open(path);
+	if (fd < 0) {
+		if (errno != ENOENT)
+			error_errno(_("unable to open loose object %s"), oid_to_hex(oid));
+		ret = -1;
+		goto out;
+	}
+
+	if (fstat(fd, &st)) {
+		close(fd);
+		ret = -1;
+		goto out;
+	}
+
+	mapsize = xsize_t(st.st_size);
+	if (!mapsize) {
+		close(fd);
+		ret = error(_("object file %s is empty"), path);
+		goto out;
+	}
+
+	map = xmmap(NULL, mapsize, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+	if (!map) {
+		ret = -1;
+		goto out;
+	}
+
+	if (oi->disk_sizep)
+		*oi->disk_sizep = mapsize;
+	if (oi->mtimep)
+		*oi->mtimep = st.st_mtime;
+
+	stream_to_end = &stream;
+
+	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr))) {
+	case ULHR_OK:
+		if (!oi->sizep)
+			oi->sizep = &size_scratch;
+		if (!oi->typep)
+			oi->typep = &type_scratch;
+
+		if (parse_loose_header(hdr, oi) < 0) {
+			ret = error(_("unable to parse %s header"), oid_to_hex(oid));
+			goto corrupt;
+		}
+
+		if (*oi->typep < 0)
+			die(_("invalid object type"));
+
+		if (oi->contentp) {
+			*oi->contentp = unpack_loose_rest(&stream, hdr, *oi->sizep, oid);
+			if (!*oi->contentp) {
+				ret = -1;
+				goto corrupt;
+			}
+		}
+
+		break;
+	case ULHR_BAD:
+		ret = error(_("unable to unpack %s header"),
+			    oid_to_hex(oid));
+		goto corrupt;
+	case ULHR_TOO_LONG:
+		ret = error(_("header for %s too long, exceeds %d bytes"),
+			    oid_to_hex(oid), MAX_HEADER_LEN);
+		goto corrupt;
+	}
+
+	ret = 0;
+
+corrupt:
+	if (ret && (flags & OBJECT_INFO_DIE_IF_CORRUPT))
+		die(_("loose object %s (stored in %s) is corrupt"),
+		    oid_to_hex(oid), path);
+
+out:
+	if (stream_to_end)
+		git_inflate_end(stream_to_end);
+	if (map)
+		munmap(map, mapsize);
+	if (oi) {
+		if (oi->sizep == &size_scratch)
+			oi->sizep = NULL;
+		if (oi->typep == &type_scratch)
+			oi->typep = NULL;
+		if (oi->delta_base_oid)
+			oidclr(oi->delta_base_oid, loose->base.odb->repo->hash_algo);
+		if (!ret)
+			oi->whence = OI_LOOSE;
+	}
+
+	return ret;
+}
+
 static int odb_source_loose_read_object_info(struct odb_source *source,
 					     const struct object_id *oid,
 					     struct object_info *oi,
@@ -218,6 +409,78 @@ static int odb_source_loose_read_object_stream(struct odb_read_stream **out,
 	return -1;
 }
 
+struct for_each_object_wrapper_data {
+	struct odb_source_loose *loose;
+	const struct object_info *request;
+	odb_for_each_object_cb cb;
+	void *cb_data;
+};
+
+static int for_each_object_wrapper_cb(const struct object_id *oid,
+				      const char *path,
+				      void *cb_data)
+{
+	struct for_each_object_wrapper_data *data = cb_data;
+
+	if (data->request) {
+		struct object_info oi = *data->request;
+
+		if (read_object_info_from_path(data->loose, path, oid, &oi, 0) < 0)
+			return -1;
+
+		return data->cb(oid, &oi, data->cb_data);
+	} else {
+		return data->cb(oid, NULL, data->cb_data);
+	}
+}
+
+static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid,
+					       void *node_data UNUSED,
+					       void *cb_data)
+{
+	struct for_each_object_wrapper_data *data = cb_data;
+	if (data->request) {
+		struct object_info oi = *data->request;
+
+		if (odb_source_read_object_info(&data->loose->base,
+						oid, &oi, 0) < 0)
+			return -1;
+
+		return data->cb(oid, &oi, data->cb_data);
+	} else {
+		return data->cb(oid, NULL, data->cb_data);
+	}
+}
+
+static int odb_source_loose_for_each_object(struct odb_source *source,
+					    const struct object_info *request,
+					    odb_for_each_object_cb cb,
+					    void *cb_data,
+					    const struct odb_for_each_object_options *opts)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	struct for_each_object_wrapper_data data = {
+		.loose = loose,
+		.request = request,
+		.cb = cb,
+		.cb_data = cb_data,
+	};
+
+	/* There are no loose promisor objects, so we can return immediately. */
+	if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY))
+		return 0;
+	if ((opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY) && !source->local)
+		return 0;
+
+	if (opts->prefix)
+		return oidtree_each(odb_source_loose_cache(loose, opts->prefix),
+				    opts->prefix, opts->prefix_hex_len,
+				    for_each_prefixed_object_wrapper_cb, &data);
+
+	return for_each_loose_file_in_source(source, for_each_object_wrapper_cb,
+					     NULL, NULL, &data);
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -273,6 +536,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.reprepare = odb_source_loose_reprepare;
 	loose->base.read_object_info = odb_source_loose_read_object_info;
 	loose->base.read_object_stream = odb_source_loose_read_object_stream;
+	loose->base.for_each_object = odb_source_loose_for_each_object;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty


* [PATCH 07/18] odb/source-loose: wire up `read_object_stream()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_read_object_stream()` and its associated helpers
from "object-file.c" into "odb/source-loose.c" and wire it up as the
`read_object_stream()` callback of the loose source.

The move also forces us to expose `unpack_loose_header()` and
`parse_loose_header()` via "object-file.h". Both parse loose object
headers in a somewhat generic way and are now used by both subsystems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
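For readers unfamiliar with the pattern, here is a minimal, standalone
sketch of what "wiring up a callback" means in this context. It is not
code from this series: every `demo_*` name is hypothetical, the object
ID is reduced to a string, and the "stream" is just a pointer to stored
bytes. The shape mirrors the patch: the backend function becomes static,
is installed into a function pointer on the embedded base struct, and
callers reach it through a generic dispatcher.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the generic source struct. */
struct demo_source {
	/* Callback slot, analogous to `read_object_stream` in the patch. */
	int (*read_object_stream)(const char **out,
				  struct demo_source *source,
				  const char *oid);
};

/* A "loose" backend embeds the generic base as a member. */
struct demo_source_loose {
	struct demo_source base;
	const char *stored_oid;  /* the single object this demo holds */
	const char *stored_data; /* its contents */
};

/* Recover the backend from a pointer to its embedded base. */
static struct demo_source_loose *demo_loose_downcast(struct demo_source *source)
{
	return (struct demo_source_loose *)
		((char *)source - offsetof(struct demo_source_loose, base));
}

/* Backend implementation; static because callers use the callback. */
static int demo_loose_read_object_stream(const char **out,
					 struct demo_source *source,
					 const char *oid)
{
	struct demo_source_loose *loose = demo_loose_downcast(source);

	if (strcmp(oid, loose->stored_oid))
		return -1; /* object not found in this source */
	*out = loose->stored_data;
	return 0;
}

/* Generic entry point: dispatches to whichever backend was wired up. */
static int demo_source_read_object_stream(const char **out,
					  struct demo_source *source,
					  const char *oid)
{
	assert(source->read_object_stream);
	return source->read_object_stream(out, source, oid);
}

/* Constructor-like helper that installs the callback. */
static void demo_source_loose_init(struct demo_source_loose *loose,
				   const char *oid, const char *data)
{
	loose->base.read_object_stream = demo_loose_read_object_stream;
	loose->stored_oid = oid;
	loose->stored_data = data;
}
```

After `demo_source_loose_init()`, a caller that only holds
`&loose.base` can call `demo_source_read_object_stream()` without
knowing which backend sits behind it. That is the property that lets
"odb/source-files.c" stop calling the loose-specific function directly.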
 object-file.c      | 200 ++---------------------------------------------------
 object-file.h      |  31 +++++++--
 odb/source-files.c |   2 +-
 odb/source-loose.c | 189 ++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 222 insertions(+), 200 deletions(-)

diff --git a/object-file.c b/object-file.c
index fa174512a4..adfb672493 100644
--- a/object-file.c
+++ b/object-file.c
@@ -164,28 +164,6 @@ int stream_object_signature(struct repository *r,
 	return !oideq(oid, &real_oid) ? -1 : 0;
 }
 
-/*
- * Find "oid" as a loose object in given source, open the object and return its
- * file descriptor. Returns the file descriptor on success, negative on failure.
- *
- * The "path" out-parameter will give the path of the object we found (if any).
- * Note that it may point to static storage and is only valid until another
- * call to stat_loose_object().
- */
-static int open_loose_object(struct odb_source_loose *loose,
-			     const struct object_id *oid, const char **path)
-{
-	static struct strbuf buf = STRBUF_INIT;
-	int fd;
-
-	*path = odb_loose_path(&loose->files->base, &buf, oid);
-	fd = git_open(*path);
-	if (fd >= 0)
-		return fd;
-
-	return -1;
-}
-
 static int quick_has_loose(struct odb_source_loose *loose,
 			   const struct object_id *oid)
 {
@@ -215,42 +193,11 @@ static void *map_fd(int fd, const char *path, unsigned long *size)
 	return map;
 }
 
-static void *odb_source_loose_map_object(struct odb_source *source,
-					 const struct object_id *oid,
-					 unsigned long *size)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	const char *p;
-	int fd = open_loose_object(files->loose, oid, &p);
-
-	if (fd < 0)
-		return NULL;
-	return map_fd(fd, p, size);
-}
-
-enum unpack_loose_header_result {
-	ULHR_OK,
-	ULHR_BAD,
-	ULHR_TOO_LONG,
-};
-
-/**
- * unpack_loose_header() initializes the data stream needed to unpack
- * a loose object header.
- *
- * Returns:
- *
- * - ULHR_OK on success
- * - ULHR_BAD on error
- * - ULHR_TOO_LONG if the header was too long
- *
- * It will only parse up to MAX_HEADER_LEN bytes.
- */
-static enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
-							   unsigned char *map,
-							   unsigned long mapsize,
-							   void *buffer,
-							   unsigned long bufsiz)
+enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
+						    unsigned char *map,
+						    unsigned long mapsize,
+						    void *buffer,
+						    unsigned long bufsiz)
 {
 	int status;
 
@@ -340,7 +287,7 @@ static void *unpack_loose_rest(git_zstream *stream,
  * too permissive for what we want to check. So do an anal
  * object header parse by hand.
  */
-static int parse_loose_header(const char *hdr, struct object_info *oi)
+int parse_loose_header(const char *hdr, struct object_info *oi)
 {
 	const char *type_buf = hdr;
 	size_t size;
@@ -2170,138 +2117,3 @@ struct odb_transaction *odb_transaction_files_begin(struct odb_source *source)
 
 	return &transaction->base;
 }
-
-struct odb_loose_read_stream {
-	struct odb_read_stream base;
-	git_zstream z;
-	enum {
-		ODB_LOOSE_READ_STREAM_INUSE,
-		ODB_LOOSE_READ_STREAM_DONE,
-		ODB_LOOSE_READ_STREAM_ERROR,
-	} z_state;
-	void *mapped;
-	unsigned long mapsize;
-	char hdr[32];
-	int hdr_avail;
-	int hdr_used;
-};
-
-static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t sz)
-{
-	struct odb_loose_read_stream *st =
-		container_of(_st, struct odb_loose_read_stream, base);
-	size_t total_read = 0;
-
-	switch (st->z_state) {
-	case ODB_LOOSE_READ_STREAM_DONE:
-		return 0;
-	case ODB_LOOSE_READ_STREAM_ERROR:
-		return -1;
-	default:
-		break;
-	}
-
-	if (st->hdr_used < st->hdr_avail) {
-		size_t to_copy = st->hdr_avail - st->hdr_used;
-		if (sz < to_copy)
-			to_copy = sz;
-		memcpy(buf, st->hdr + st->hdr_used, to_copy);
-		st->hdr_used += to_copy;
-		total_read += to_copy;
-	}
-
-	while (total_read < sz) {
-		int status;
-
-		st->z.next_out = (unsigned char *)buf + total_read;
-		st->z.avail_out = sz - total_read;
-		status = git_inflate(&st->z, Z_FINISH);
-
-		total_read = st->z.next_out - (unsigned char *)buf;
-
-		if (status == Z_STREAM_END) {
-			git_inflate_end(&st->z);
-			st->z_state = ODB_LOOSE_READ_STREAM_DONE;
-			break;
-		}
-		if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) {
-			git_inflate_end(&st->z);
-			st->z_state = ODB_LOOSE_READ_STREAM_ERROR;
-			return -1;
-		}
-	}
-	return total_read;
-}
-
-static int close_istream_loose(struct odb_read_stream *_st)
-{
-	struct odb_loose_read_stream *st =
-		container_of(_st, struct odb_loose_read_stream, base);
-
-	if (st->z_state == ODB_LOOSE_READ_STREAM_INUSE)
-		git_inflate_end(&st->z);
-	munmap(st->mapped, st->mapsize);
-	return 0;
-}
-
-int odb_source_loose_read_object_stream(struct odb_read_stream **out,
-					struct odb_source *source,
-					const struct object_id *oid)
-{
-	struct object_info oi = OBJECT_INFO_INIT;
-	struct odb_loose_read_stream *st;
-	unsigned long mapsize;
-	unsigned long size_ul;
-	void *mapped;
-
-	mapped = odb_source_loose_map_object(source, oid, &mapsize);
-	if (!mapped)
-		return -1;
-
-	/*
-	 * Note: we must allocate this structure early even though we may still
-	 * fail. This is because we need to initialize the zlib stream, and it
-	 * is not possible to copy the stream around after the fact because it
-	 * has self-referencing pointers.
-	 */
-	CALLOC_ARRAY(st, 1);
-
-	switch (unpack_loose_header(&st->z, mapped, mapsize, st->hdr,
-				    sizeof(st->hdr))) {
-	case ULHR_OK:
-		break;
-	case ULHR_BAD:
-	case ULHR_TOO_LONG:
-		goto error;
-	}
-
-	/*
-	 * object_info.sizep is unsigned long* (32-bit on Windows), but
-	 * st->base.size is size_t (64-bit). Use temporary variable.
-	 * Note: loose objects >4GB would still truncate here, but such
-	 * large loose objects are uncommon (they'd normally be packed).
-	 */
-	oi.sizep = &size_ul;
-	oi.typep = &st->base.type;
-
-	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
-		goto error;
-	st->base.size = size_ul;
-
-	st->mapped = mapped;
-	st->mapsize = mapsize;
-	st->hdr_used = strlen(st->hdr) + 1;
-	st->hdr_avail = st->z.total_out;
-	st->z_state = ODB_LOOSE_READ_STREAM_INUSE;
-	st->base.close = close_istream_loose;
-	st->base.read = read_istream_loose;
-
-	*out = &st->base;
-
-	return 0;
-error:
-	git_inflate_end(&st->z);
-	munmap(mapped, mapsize);
-	free(st);
-	return -1;
-}
diff --git a/object-file.h b/object-file.h
index 8ac2832dac..d93b7ffad7 100644
--- a/object-file.h
+++ b/object-file.h
@@ -18,13 +18,8 @@ int index_fd(struct index_state *istate, struct object_id *oid, int fd, struct s
 int index_path(struct index_state *istate, struct object_id *oid, const char *path, struct stat *st, unsigned flags);
 
 struct object_info;
-struct odb_read_stream;
 struct odb_source;
 
-int odb_source_loose_read_object_stream(struct odb_read_stream **out,
-					struct odb_source *source,
-					const struct object_id *oid);
-
 /*
  * Return true iff an object database source has a loose object
  * with the specified name.  This function does not respect replace
@@ -199,6 +194,32 @@ int read_object_info_from_path(struct odb_source_loose *loose,
 			       struct object_info *oi,
 			       enum object_info_flags flags);
 
+enum unpack_loose_header_result {
+	ULHR_OK,
+	ULHR_BAD,
+	ULHR_TOO_LONG,
+};
+
+/**
+ * unpack_loose_header() initializes the data stream needed to unpack
+ * a loose object header.
+ *
+ * Returns:
+ *
+ * - ULHR_OK on success
+ * - ULHR_BAD on error
+ * - ULHR_TOO_LONG if the header was too long
+ *
+ * It will only parse up to MAX_HEADER_LEN bytes.
+ */
+enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
+						    unsigned char *map,
+						    unsigned long mapsize,
+						    void *buffer,
+						    unsigned long bufsiz);
+
+int parse_loose_header(const char *hdr, struct object_info *oi);
+
 struct odb_transaction;
 
 /*
diff --git a/odb/source-files.c b/odb/source-files.c
index 8d6924755f..90806ddf86 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -67,7 +67,7 @@ static int odb_source_files_read_object_stream(struct odb_read_stream **out,
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
 	if (!packfile_store_read_object_stream(out, files->packed, oid) ||
-	    !odb_source_loose_read_object_stream(out, source, oid))
+	    !odb_source_read_object_stream(out, &files->loose->base, oid))
 		return 0;
 	return -1;
 }
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 50f387ecf3..4b82c6f316 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -1,11 +1,13 @@
 #include "git-compat-util.h"
 #include "abspath.h"
 #include "chdir-notify.h"
+#include "gettext.h"
 #include "loose.h"
 #include "object-file.h"
 #include "odb.h"
 #include "odb/source-files.h"
 #include "odb/source-loose.h"
+#include "odb/streaming.h"
 #include "oidtree.h"
 #include "strbuf.h"
 
@@ -30,6 +32,192 @@ static int odb_source_loose_read_object_info(struct odb_source *source,
 	return read_object_info_from_path(loose, buf.buf, oid, oi, flags);
 }
 
+/*
+ * Find "oid" as a loose object in given source, open the object and return its
+ * file descriptor. Returns the file descriptor on success, negative on failure.
+ *
+ * The "path" out-parameter will give the path of the object we found (if any).
+ * Note that it may point to static storage and is only valid until another
+ * call to open_loose_object().
+ */
+static int open_loose_object(struct odb_source_loose *loose,
+			     const struct object_id *oid, const char **path)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	int fd;
+
+	*path = odb_loose_path(&loose->base, &buf, oid);
+	fd = git_open(*path);
+	if (fd >= 0)
+		return fd;
+
+	return -1;
+}
+
+static void *odb_source_loose_map_object(struct odb_source_loose *loose,
+					 const struct object_id *oid,
+					 unsigned long *size)
+{
+	const char *p;
+	int fd = open_loose_object(loose, oid, &p);
+	void *map = NULL;
+	struct stat st;
+
+	if (fd < 0)
+		return NULL;
+
+	if (!fstat(fd, &st)) {
+		*size = xsize_t(st.st_size);
+		if (!*size) {
+			/* mmap() is forbidden on empty files */
+			error(_("object file %s is empty"), p);
+			goto out;
+		}
+
+		map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
+	}
+
+out:
+	close(fd);
+	return map;
+}
+
+struct odb_loose_read_stream {
+	struct odb_read_stream base;
+	git_zstream z;
+	enum {
+		ODB_LOOSE_READ_STREAM_INUSE,
+		ODB_LOOSE_READ_STREAM_DONE,
+		ODB_LOOSE_READ_STREAM_ERROR,
+	} z_state;
+	void *mapped;
+	unsigned long mapsize;
+	char hdr[32];
+	int hdr_avail;
+	int hdr_used;
+};
+
+static ssize_t read_istream_loose(struct odb_read_stream *_st, char *buf, size_t sz)
+{
+	struct odb_loose_read_stream *st =
+		container_of(_st, struct odb_loose_read_stream, base);
+	size_t total_read = 0;
+
+	switch (st->z_state) {
+	case ODB_LOOSE_READ_STREAM_DONE:
+		return 0;
+	case ODB_LOOSE_READ_STREAM_ERROR:
+		return -1;
+	default:
+		break;
+	}
+
+	if (st->hdr_used < st->hdr_avail) {
+		size_t to_copy = st->hdr_avail - st->hdr_used;
+		if (sz < to_copy)
+			to_copy = sz;
+		memcpy(buf, st->hdr + st->hdr_used, to_copy);
+		st->hdr_used += to_copy;
+		total_read += to_copy;
+	}
+
+	while (total_read < sz) {
+		int status;
+
+		st->z.next_out = (unsigned char *)buf + total_read;
+		st->z.avail_out = sz - total_read;
+		status = git_inflate(&st->z, Z_FINISH);
+
+		total_read = st->z.next_out - (unsigned char *)buf;
+
+		if (status == Z_STREAM_END) {
+			git_inflate_end(&st->z);
+			st->z_state = ODB_LOOSE_READ_STREAM_DONE;
+			break;
+		}
+		if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) {
+			git_inflate_end(&st->z);
+			st->z_state = ODB_LOOSE_READ_STREAM_ERROR;
+			return -1;
+		}
+	}
+	return total_read;
+}
+
+static int close_istream_loose(struct odb_read_stream *_st)
+{
+	struct odb_loose_read_stream *st =
+		container_of(_st, struct odb_loose_read_stream, base);
+
+	if (st->z_state == ODB_LOOSE_READ_STREAM_INUSE)
+		git_inflate_end(&st->z);
+	munmap(st->mapped, st->mapsize);
+	return 0;
+}
+
+static int odb_source_loose_read_object_stream(struct odb_read_stream **out,
+					       struct odb_source *source,
+					       const struct object_id *oid)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	struct object_info oi = OBJECT_INFO_INIT;
+	struct odb_loose_read_stream *st;
+	unsigned long mapsize;
+	unsigned long size_ul;
+	void *mapped;
+
+	mapped = odb_source_loose_map_object(loose, oid, &mapsize);
+	if (!mapped)
+		return -1;
+
+	/*
+	 * Note: we must allocate this structure early even though we may still
+	 * fail. This is because we need to initialize the zlib stream, and it
+	 * is not possible to copy the stream around after the fact because it
+	 * has self-referencing pointers.
+	 */
+	CALLOC_ARRAY(st, 1);
+
+	switch (unpack_loose_header(&st->z, mapped, mapsize, st->hdr,
+				    sizeof(st->hdr))) {
+	case ULHR_OK:
+		break;
+	case ULHR_BAD:
+	case ULHR_TOO_LONG:
+		goto error;
+	}
+
+	/*
+	 * object_info.sizep is unsigned long* (32-bit on Windows), but
+	 * st->base.size is size_t (64-bit). Use temporary variable.
+	 * Note: loose objects >4GB would still truncate here, but such
+	 * large loose objects are uncommon (they'd normally be packed).
+	 */
+	oi.sizep = &size_ul;
+	oi.typep = &st->base.type;
+
+	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
+		goto error;
+	st->base.size = size_ul;
+
+	st->mapped = mapped;
+	st->mapsize = mapsize;
+	st->hdr_used = strlen(st->hdr) + 1;
+	st->hdr_avail = st->z.total_out;
+	st->z_state = ODB_LOOSE_READ_STREAM_INUSE;
+	st->base.close = close_istream_loose;
+	st->base.read = read_istream_loose;
+
+	*out = &st->base;
+
+	return 0;
+error:
+	git_inflate_end(&st->z);
+	munmap(mapped, mapsize);
+	free(st);
+	return -1;
+}
+
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
@@ -84,6 +272,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.close = odb_source_loose_close;
 	loose->base.reprepare = odb_source_loose_reprepare;
 	loose->base.read_object_info = odb_source_loose_read_object_info;
+	loose->base.read_object_stream = odb_source_loose_read_object_stream;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 06/18] odb/source-loose: wire up `read_object_info()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_read_object_info()` from "object-file.c" into
"odb/source-loose.c" and wire it up as the `read_object_info()` callback
of the loose source. Callers that previously invoked it directly now go
through the generic `odb_source_read_object_info()` interface instead.

The function `read_object_info_from_path()` cannot be moved along with
it because it is still called by `for_each_object_wrapper_cb()`. It is
therefore kept in place, but adjusted to take a loose source to clarify
that it's always operating on this structure.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
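Reviewer note: a minimal sketch of the lookup order that results once the
loose source is reached through the generic interface. All names here
(`struct source`, `INFO_SECOND_READ`, the toy object IDs "aaaa"/"bbbb") are
hypothetical stand-ins, not the real Git API. The sketch only assumes what
the diff shows: the "files" source tries packed objects first, then
dispatches to the loose source's callback, which gives up early on a
second read.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag mirroring OBJECT_INFO_SECOND_READ. */
enum info_flags { INFO_SECOND_READ = 1 << 0 };

struct object_info { const char *whence; };

/* Toy base struct with a single callback. */
struct source {
	int (*read_object_info)(struct source *source, const char *oid,
				struct object_info *oi, unsigned flags);
};

/* Toy packed store: knows only the object "aaaa". */
static int packed_read_object_info(const char *oid, struct object_info *oi)
{
	if (strcmp(oid, "aaaa"))
		return -1;
	oi->whence = "packed";
	return 0;
}

/* Toy loose source: knows only "bbbb" and skips second reads. */
static int loose_read_object_info(struct source *source, const char *oid,
				  struct object_info *oi, unsigned flags)
{
	(void)source;
	if (flags & INFO_SECOND_READ)
		return -1;
	if (strcmp(oid, "bbbb"))
		return -1;
	oi->whence = "loose";
	return 0;
}

/* Generic entry point: dispatch through the callback. */
static int source_read_object_info(struct source *source, const char *oid,
				   struct object_info *oi, unsigned flags)
{
	return source->read_object_info(source, oid, oi, flags);
}

static struct source loose_source = {
	.read_object_info = loose_read_object_info,
};

/* Composite lookup: packed first, then loose; 0 on the first hit. */
static int files_read_object_info(const char *oid, struct object_info *oi,
				  unsigned flags)
{
	if (!packed_read_object_info(oid, oi) ||
	    !source_read_object_info(&loose_source, oid, oi, flags))
		return 0;
	return -1;
}
```

Because of the short-circuiting `||`, the loose callback is never consulted
when the packed lookup already succeeded.
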
 object-file.c      | 46 +++++++++++++---------------------------------
 object-file.h      | 11 ++++++-----
 odb/source-files.c |  2 +-
 odb/source-loose.c | 24 ++++++++++++++++++++++++
 4 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/object-file.c b/object-file.c
index 0f4f1e7bdc..fa174512a4 100644
--- a/object-file.c
+++ b/object-file.c
@@ -396,13 +396,12 @@ static int parse_loose_header(const char *hdr, struct object_info *oi)
 	return 0;
 }
 
-static int read_object_info_from_path(struct odb_source *source,
-				      const char *path,
-				      const struct object_id *oid,
-				      struct object_info *oi,
-				      enum object_info_flags flags)
+int read_object_info_from_path(struct odb_source_loose *loose,
+			       const char *path,
+			       const struct object_id *oid,
+			       struct object_info *oi,
+			       enum object_info_flags flags)
 {
-	struct odb_source_files *files = odb_source_files_downcast(source);
 	int ret;
 	int fd;
 	unsigned long mapsize;
@@ -425,7 +424,7 @@ static int read_object_info_from_path(struct odb_source *source,
 		struct stat st;
 
 		if ((!oi || (!oi->disk_sizep && !oi->mtimep)) && (flags & OBJECT_INFO_QUICK)) {
-			ret = quick_has_loose(files->loose, oid) ? 0 : -1;
+			ret = quick_has_loose(loose, oid) ? 0 : -1;
 			goto out;
 		}
 
@@ -532,7 +531,7 @@ static int read_object_info_from_path(struct odb_source *source,
 		if (oi->typep == &type_scratch)
 			oi->typep = NULL;
 		if (oi->delta_base_oid)
-			oidclr(oi->delta_base_oid, source->odb->repo->hash_algo);
+			oidclr(oi->delta_base_oid, loose->base.odb->repo->hash_algo);
 		if (!ret)
 			oi->whence = OI_LOOSE;
 	}
@@ -540,26 +539,6 @@ static int read_object_info_from_path(struct odb_source *source,
 	return ret;
 }
 
-int odb_source_loose_read_object_info(struct odb_source *source,
-				      const struct object_id *oid,
-				      struct object_info *oi,
-				      enum object_info_flags flags)
-{
-	static struct strbuf buf = STRBUF_INIT;
-
-	/*
-	 * The second read shouldn't cause new loose objects to show up, unless
-	 * there was a race condition with a secondary process. We don't care
-	 * about this case though, so we simply skip reading loose objects a
-	 * second time.
-	 */
-	if (flags & OBJECT_INFO_SECOND_READ)
-		return -1;
-
-	odb_loose_path(source, &buf, oid);
-	return read_object_info_from_path(source, buf.buf, oid, oi, flags);
-}
-
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
 			     const void *buf, unsigned long len,
 			     struct object_id *oid,
@@ -1833,7 +1812,7 @@ int for_each_loose_file_in_source(struct odb_source *source,
 }
 
 struct for_each_object_wrapper_data {
-	struct odb_source *source;
+	struct odb_source_loose *loose;
 	const struct object_info *request;
 	odb_for_each_object_cb cb;
 	void *cb_data;
@@ -1848,7 +1827,7 @@ static int for_each_object_wrapper_cb(const struct object_id *oid,
 	if (data->request) {
 		struct object_info oi = *data->request;
 
-		if (read_object_info_from_path(data->source, path, oid, &oi, 0) < 0)
+		if (read_object_info_from_path(data->loose, path, oid, &oi, 0) < 0)
 			return -1;
 
 		return data->cb(oid, &oi, data->cb_data);
@@ -1865,8 +1844,8 @@ static int for_each_prefixed_object_wrapper_cb(const struct object_id *oid,
 	if (data->request) {
 		struct object_info oi = *data->request;
 
-		if (odb_source_loose_read_object_info(data->source,
-						      oid, &oi, 0) < 0)
+		if (odb_source_read_object_info(&data->loose->base,
+						oid, &oi, 0) < 0)
 			return -1;
 
 		return data->cb(oid, &oi, data->cb_data);
@@ -1881,8 +1860,9 @@ int odb_source_loose_for_each_object(struct odb_source *source,
 				     void *cb_data,
 				     const struct odb_for_each_object_options *opts)
 {
+	struct odb_source_files *files = odb_source_files_downcast(source);
 	struct for_each_object_wrapper_data data = {
-		.source = source,
+		.loose = files->loose,
 		.request = request,
 		.cb = cb,
 		.cb_data = cb_data,
diff --git a/object-file.h b/object-file.h
index 420a0fff2e..8ac2832dac 100644
--- a/object-file.h
+++ b/object-file.h
@@ -21,11 +21,6 @@ struct object_info;
 struct odb_read_stream;
 struct odb_source;
 
-int odb_source_loose_read_object_info(struct odb_source *source,
-				      const struct object_id *oid,
-				      struct object_info *oi,
-				      enum object_info_flags flags);
-
 int odb_source_loose_read_object_stream(struct odb_read_stream **out,
 					struct odb_source *source,
 					const struct object_id *oid);
@@ -198,6 +193,12 @@ int read_loose_object(struct repository *repo,
 		      void **contents,
 		      struct object_info *oi);
 
+int read_object_info_from_path(struct odb_source_loose *loose,
+			       const char *path,
+			       const struct object_id *oid,
+			       struct object_info *oi,
+			       enum object_info_flags flags);
+
 struct odb_transaction;
 
 /*
diff --git a/odb/source-files.c b/odb/source-files.c
index 59e3a70d80..8d6924755f 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -55,7 +55,7 @@ static int odb_source_files_read_object_info(struct odb_source *source,
 	struct odb_source_files *files = odb_source_files_downcast(source);
 
 	if (!packfile_store_read_object_info(files->packed, oid, oi, flags) ||
-	    !odb_source_loose_read_object_info(source, oid, oi, flags))
+	    !odb_source_read_object_info(&files->loose->base, oid, oi, flags))
 		return 0;
 
 	return -1;
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 65c1076659..50f387ecf3 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -2,10 +2,33 @@
 #include "abspath.h"
 #include "chdir-notify.h"
 #include "loose.h"
+#include "object-file.h"
 #include "odb.h"
 #include "odb/source-files.h"
 #include "odb/source-loose.h"
 #include "oidtree.h"
+#include "strbuf.h"
+
+static int odb_source_loose_read_object_info(struct odb_source *source,
+					     const struct object_id *oid,
+					     struct object_info *oi,
+					     enum object_info_flags flags)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	static struct strbuf buf = STRBUF_INIT;
+
+	/*
+	 * The second read shouldn't cause new loose objects to show up, unless
+	 * there was a race condition with a secondary process. We don't care
+	 * about this case though, so we simply skip reading loose objects a
+	 * second time.
+	 */
+	if (flags & OBJECT_INFO_SECOND_READ)
+		return -1;
+
+	odb_loose_path(source, &buf, oid);
+	return read_object_info_from_path(loose, buf.buf, oid, oi, flags);
+}
 
 static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
@@ -60,6 +83,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->base.free = odb_source_loose_free;
 	loose->base.close = odb_source_loose_close;
 	loose->base.reprepare = odb_source_loose_reprepare;
+	loose->base.read_object_info = odb_source_loose_read_object_info;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 05/18] odb/source-loose: wire up `close()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Wire up a new `close()` callback for the loose source and call it from
the "files" source via the generic `odb_source_close()` interface. The
callback itself is a no-op as the loose source has no resources that
need to be released on close.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
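Reviewer note: a minimal sketch of how a composite source can forward
`close()` to a child through a generic wrapper, even when the child's
callback is a no-op. The types and the `close_calls` counter are
hypothetical and exist only for illustration; the assumption that the
generic wrapper simply invokes the function pointer is mine, not taken from
this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy base struct with a single callback. */
struct source {
	void (*close)(struct source *source);
};

/* Instrumentation for this sketch only. */
static int close_calls;

/* Loose-style callback: no resources to release. */
static void loose_close(struct source *source)
{
	(void)source;
	close_calls++;
}

/* Composite source that owns a loose child and some pack state. */
struct files_source {
	struct source base;	/* must stay the first member */
	struct source *loose;
	int pack_fds_open;
};

/* Generic entry point: dispatch through the callback. */
static void source_close(struct source *source)
{
	source->close(source);
}

static void files_close(struct source *source)
{
	/* Valid because `base` is the first member of files_source. */
	struct files_source *files = (struct files_source *)source;

	source_close(files->loose);
	files->pack_fds_open = 0;
}
```

Wiring up the no-op keeps the caller free of special cases: the "files"
source can treat the loose child like any other source.
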
 odb/source-files.c | 1 +
 odb/source-loose.c | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/odb/source-files.c b/odb/source-files.c
index 10832e81e4..59e3a70d80 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -36,6 +36,7 @@ static void odb_source_files_free(struct odb_source *source)
 static void odb_source_files_close(struct odb_source *source)
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
+	odb_source_close(&files->loose->base);
 	packfile_store_close(files->packed);
 }
 
diff --git a/odb/source-loose.c b/odb/source-loose.c
index e0fe0d513d..65c1076659 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -21,6 +21,11 @@ static void odb_source_loose_reprepare(struct odb_source *source)
 	odb_source_loose_clear_cache(loose);
 }
 
+static void odb_source_loose_close(struct odb_source *source UNUSED)
+{
+	/* Nothing to do. */
+}
+
 static void odb_source_loose_reparent(const char *name UNUSED,
 				      const char *old_cwd,
 				      const char *new_cwd,
@@ -53,6 +58,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->files = files;
 
 	loose->base.free = odb_source_loose_free;
+	loose->base.close = odb_source_loose_close;
 	loose->base.reprepare = odb_source_loose_reprepare;
 
 	if (!is_absolute_path(loose->base.path))

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 04/18] odb/source-loose: wire up `reprepare()` callback
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Move `odb_source_loose_reprepare()` from "object-file.c" into
"odb/source-loose.c" and wire it up as the `reprepare()` callback of the
loose source.

While at it, make `odb_source_loose_clear_cache()` static, as it is no
longer needed outside of its file.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
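Reviewer note: a minimal sketch of why "reprepare" only needs to empty the
cache. It assumes a lazily populated cache with per-subdirectory "seen"
markers; `struct loose_cache`, the `scans` counter and the byte-per-subdir
layout are hypothetical simplifications, not the real data structure.

```c
#include <assert.h>
#include <string.h>

#define NUM_SUBDIRS 256

struct loose_cache {
	int scans;				/* directory scans performed */
	unsigned char subdir_seen[NUM_SUBDIRS];	/* 1 once a subdir was scanned */
};

/* Populate the cache for one subdirectory, at most once. */
static void cache_load_subdir(struct loose_cache *cache, unsigned char subdir)
{
	if (cache->subdir_seen[subdir])
		return;
	cache->scans++;		/* stands in for the actual directory read */
	cache->subdir_seen[subdir] = 1;
}

/* Reprepare: forget what was seen so the next lookup rescans. */
static void cache_reprepare(struct loose_cache *cache)
{
	memset(cache->subdir_seen, 0, sizeof(cache->subdir_seen));
}
```

After `cache_reprepare()`, the next lookup in a given subdirectory triggers
a fresh scan, which is how objects written in the meantime become visible.
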
 object-file.c      | 6 ------
 object-file.h      | 3 ---
 odb/source-files.c | 2 +-
 odb/source-loose.c | 9 ++++++++-
 odb/source-loose.h | 2 --
 5 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/object-file.c b/object-file.c
index 977d959d33..0f4f1e7bdc 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2041,12 +2041,6 @@ static struct oidtree *odb_source_loose_cache(struct odb_source *source,
 	return files->loose->cache;
 }
 
-void odb_source_loose_reprepare(struct odb_source *source)
-{
-	struct odb_source_files *files = odb_source_files_downcast(source);
-	odb_source_loose_clear_cache(files->loose);
-}
-
 static int check_stream_oid(git_zstream *stream,
 			    const char *hdr,
 			    unsigned long size,
diff --git a/object-file.h b/object-file.h
index 02c9680980..420a0fff2e 100644
--- a/object-file.h
+++ b/object-file.h
@@ -21,9 +21,6 @@ struct object_info;
 struct odb_read_stream;
 struct odb_source;
 
-/* Reprepare the loose source by emptying the loose object cache. */
-void odb_source_loose_reprepare(struct odb_source *source);
-
 int odb_source_loose_read_object_info(struct odb_source *source,
 				      const struct object_id *oid,
 				      struct object_info *oi,
diff --git a/odb/source-files.c b/odb/source-files.c
index ccc637311b..10832e81e4 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -42,7 +42,7 @@ static void odb_source_files_close(struct odb_source *source)
 static void odb_source_files_reprepare(struct odb_source *source)
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
-	odb_source_loose_reprepare(&files->base);
+	odb_source_reprepare(&files->loose->base);
 	packfile_store_reprepare(files->packed);
 }
 
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 92e18f5adb..e0fe0d513d 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -7,7 +7,7 @@
 #include "odb/source-loose.h"
 #include "oidtree.h"
 
-void odb_source_loose_clear_cache(struct odb_source_loose *loose)
+static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 {
 	oidtree_clear(loose->cache);
 	FREE_AND_NULL(loose->cache);
@@ -15,6 +15,12 @@ void odb_source_loose_clear_cache(struct odb_source_loose *loose)
 	       sizeof(loose->subdir_seen));
 }
 
+static void odb_source_loose_reprepare(struct odb_source *source)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	odb_source_loose_clear_cache(loose);
+}
+
 static void odb_source_loose_reparent(const char *name UNUSED,
 				      const char *old_cwd,
 				      const char *new_cwd,
@@ -47,6 +53,7 @@ struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 	loose->files = files;
 
 	loose->base.free = odb_source_loose_free;
+	loose->base.reprepare = odb_source_loose_reprepare;
 
 	if (!is_absolute_path(loose->base.path))
 		chdir_notify_register(NULL, odb_source_loose_reparent, loose);
diff --git a/odb/source-loose.h b/odb/source-loose.h
index 441da9e418..825e703072 100644
--- a/odb/source-loose.h
+++ b/odb/source-loose.h
@@ -44,6 +44,4 @@ static inline struct odb_source_loose *odb_source_loose_downcast(struct odb_sour
 	return container_of(source, struct odb_source_loose, base);
 }
 
-void odb_source_loose_clear_cache(struct odb_source_loose *loose);
-
 #endif

-- 
2.54.0.926.g75ba10bac6.dirty



* [PATCH 03/18] odb/source-loose: start converting to a proper `struct odb_source`
From: Patrick Steinhardt @ 2026-05-21  8:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20260521-b4-pks-odb-source-loose-v1-0-6553b399be2d@pks.im>

Start converting `struct odb_source_loose` into a proper pluggable
`struct odb_source` by embedding the base struct and assigning it the
new `ODB_SOURCE_LOOSE` type. Furthermore, wire up lifecycle management
of this source by implementing the `free` callback and taking ownership
of the chdir notifications.

Note that the loose source is not yet functional as a standalone `struct
odb_source`, as it's missing all of the callback implementations. These
will be wired up in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
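Reviewer note: a minimal sketch of the pattern this patch introduces, namely
embedding the base struct, tagging it with a type, and recovering the
backend struct with a checked downcast. The `source*` names, the
`freed_count` counter and the local `container_of` definition are
hypothetical stand-ins so that the example is self-contained; the real
definitions are the ones in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real enum and base struct. */
enum source_type { SOURCE_FILES, SOURCE_LOOSE };

struct source {
	enum source_type type;
	void (*free)(struct source *source);
};

struct source_loose {
	struct source base;	/* embedded base struct */
	int cached_objects;	/* backend-specific state */
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Checked downcast: refuse sources that belong to another backend. */
static struct source_loose *source_loose_downcast(struct source *source)
{
	if (source->type != SOURCE_LOOSE) {
		fprintf(stderr, "BUG: source of type '%d' is not loose\n",
			source->type);
		abort();
	}
	return container_of(source, struct source_loose, base);
}

/* Instrumentation for this sketch only. */
static int freed_count;

static void source_loose_free(struct source *source)
{
	struct source_loose *loose = source_loose_downcast(source);
	free(loose);
	freed_count++;
}

static struct source_loose *source_loose_new(void)
{
	struct source_loose *loose = calloc(1, sizeof(*loose));
	if (!loose)
		abort();
	loose->base.type = SOURCE_LOOSE;
	loose->base.free = source_loose_free;
	return loose;
}

/* Generic entry point: callers only ever see the base struct. */
static void source_free(struct source *source)
{
	if (source)
		source->free(source);
}
```

The owner can then release the child with `source_free(&loose->base)`
without knowing which backend sits behind the pointer.
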
 object-file.c      | 17 -----------------
 object-file.h      |  2 --
 odb/source-files.c |  2 +-
 odb/source-loose.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 odb/source-loose.h | 14 ++++++++++++++
 odb/source.h       |  3 +++
 6 files changed, 63 insertions(+), 20 deletions(-)

diff --git a/object-file.c b/object-file.c
index 7a1908bfc0..977d959d33 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2041,14 +2041,6 @@ static struct oidtree *odb_source_loose_cache(struct odb_source *source,
 	return files->loose->cache;
 }
 
-static void odb_source_loose_clear_cache(struct odb_source_loose *loose)
-{
-	oidtree_clear(loose->cache);
-	FREE_AND_NULL(loose->cache);
-	memset(&loose->subdir_seen, 0,
-	       sizeof(loose->subdir_seen));
-}
-
 void odb_source_loose_reprepare(struct odb_source *source)
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
@@ -2205,15 +2197,6 @@ struct odb_transaction *odb_transaction_files_begin(struct odb_source *source)
 	return &transaction->base;
 }
 
-void odb_source_loose_free(struct odb_source_loose *loose)
-{
-	if (!loose)
-		return;
-	odb_source_loose_clear_cache(loose);
-	loose_object_map_clear(&loose->map);
-	free(loose);
-}
-
 struct odb_loose_read_stream {
 	struct odb_read_stream base;
 	git_zstream z;
diff --git a/object-file.h b/object-file.h
index 1d8312cf7f..02c9680980 100644
--- a/object-file.h
+++ b/object-file.h
@@ -21,8 +21,6 @@ struct object_info;
 struct odb_read_stream;
 struct odb_source;
 
-void odb_source_loose_free(struct odb_source_loose *loose);
-
 /* Reprepare the loose source by emptying the loose object cache. */
 void odb_source_loose_reprepare(struct odb_source *source);
 
diff --git a/odb/source-files.c b/odb/source-files.c
index 185cc6903e..ccc637311b 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -27,7 +27,7 @@ static void odb_source_files_free(struct odb_source *source)
 {
 	struct odb_source_files *files = odb_source_files_downcast(source);
 	chdir_notify_unregister(NULL, odb_source_files_reparent, files);
-	odb_source_loose_free(files->loose);
+	odb_source_free(&files->loose->base);
 	packfile_store_free(files->packed);
 	odb_source_release(&files->base);
 	free(files);
diff --git a/odb/source-loose.c b/odb/source-loose.c
index c9e7414814..92e18f5adb 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -1,10 +1,55 @@
 #include "git-compat-util.h"
+#include "abspath.h"
+#include "chdir-notify.h"
+#include "loose.h"
+#include "odb.h"
+#include "odb/source-files.h"
 #include "odb/source-loose.h"
+#include "oidtree.h"
+
+void odb_source_loose_clear_cache(struct odb_source_loose *loose)
+{
+	oidtree_clear(loose->cache);
+	FREE_AND_NULL(loose->cache);
+	memset(&loose->subdir_seen, 0,
+	       sizeof(loose->subdir_seen));
+}
+
+static void odb_source_loose_reparent(const char *name UNUSED,
+				      const char *old_cwd,
+				      const char *new_cwd,
+				      void *cb_data)
+{
+	struct odb_source_loose *loose = cb_data;
+	char *path = reparent_relative_path(old_cwd, new_cwd,
+					    loose->base.path);
+	free(loose->base.path);
+	loose->base.path = path;
+}
+
+static void odb_source_loose_free(struct odb_source *source)
+{
+	struct odb_source_loose *loose = odb_source_loose_downcast(source);
+	odb_source_loose_clear_cache(loose);
+	loose_object_map_clear(&loose->map);
+	chdir_notify_unregister(NULL, odb_source_loose_reparent, loose);
+	odb_source_release(&loose->base);
+	free(loose);
+}
 
 struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files)
 {
 	struct odb_source_loose *loose;
+
 	CALLOC_ARRAY(loose, 1);
+	odb_source_init(&loose->base, files->base.odb, ODB_SOURCE_LOOSE,
+			files->base.path, files->base.local);
 	loose->files = files;
+
+	loose->base.free = odb_source_loose_free;
+
+	if (!is_absolute_path(loose->base.path))
+		chdir_notify_register(NULL, odb_source_loose_reparent, loose);
+
 	return loose;
 }
diff --git a/odb/source-loose.h b/odb/source-loose.h
index bf61e767c8..441da9e418 100644
--- a/odb/source-loose.h
+++ b/odb/source-loose.h
@@ -12,6 +12,7 @@ struct oidtree;
  * file per object. This source is part of the files source.
  */
 struct odb_source_loose {
+	struct odb_source base;
 	struct odb_source_files *files;
 
 	/*
@@ -32,4 +33,17 @@ struct odb_source_loose {
 
 struct odb_source_loose *odb_source_loose_new(struct odb_source_files *files);
 
+/*
+ * Cast the given object database source to the loose backend. This will cause
+ * a BUG in case the source doesn't use this backend.
+ */
+static inline struct odb_source_loose *odb_source_loose_downcast(struct odb_source *source)
+{
+	if (source->type != ODB_SOURCE_LOOSE)
+		BUG("trying to downcast source of type '%d' to loose", source->type);
+	return container_of(source, struct odb_source_loose, base);
+}
+
+void odb_source_loose_clear_cache(struct odb_source_loose *loose);
+
 #endif
diff --git a/odb/source.h b/odb/source.h
index 0a440884e4..8bcb67787e 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -14,6 +14,9 @@ enum odb_source_type {
 	/* The "files" backend that uses loose objects and packfiles. */
 	ODB_SOURCE_FILES,
 
+	/* The "loose" backend that uses loose objects, only. */
+	ODB_SOURCE_LOOSE,
+
 	/* The "in-memory" backend that stores objects in memory. */
 	ODB_SOURCE_INMEMORY,
 };

-- 
2.54.0.926.g75ba10bac6.dirty



