git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Karthik Nayak <karthik.188@gmail.com>
Cc: git@vger.kernel.org, jltobler@gmail.com, toon@iotcl.com,
	sunshine@sunshineco.com, "Jean-Noël Avila" <jn.avila@free.fr>
Subject: Re: [PATCH v2 2/2] refs: add GIT_REF_URI to specify reference backend and directory
Date: Wed, 26 Nov 2025 08:17:50 -0800	[thread overview]
Message-ID: <xmqq7bvcpy35.fsf@gitster.g> (raw)
In-Reply-To: <20251126-kn-alternate-ref-dir-v2-2-8b9f6f18f635@gmail.com> (Karthik Nayak's message of "Wed, 26 Nov 2025 12:12:01 +0100")

Karthik Nayak <karthik.188@gmail.com> writes:

> +`GIT_REF_URI`::
> +    Specify which reference backend to be used along with its URI. Reference
> +    backends like the files, reftable backend use the $GIT_DIR as their URI.
> ++
> +Expects the format `<ref_backend>://<URI-for-resource>`, where the
> +_<ref_backend>_ specifies the reference backend and the _<URI-for-resource>_
> +specifies the URI used by the backend.

It is more like "<directory>" that specifies the local directory the
backend is told to use to store its data.  It feels way too broad
for what the initial implementation achieves and what the design can
potentially include, to say "URI-for-resource", I would think.

> diff --git a/environment.h b/environment.h
> index 51898c99cd..9bc380bba4 100644
> --- a/environment.h
> +++ b/environment.h
> @@ -42,6 +42,7 @@
>  #define GIT_OPTIONAL_LOCKS_ENVIRONMENT "GIT_OPTIONAL_LOCKS"
>  #define GIT_TEXT_DOMAIN_DIR_ENVIRONMENT "GIT_TEXTDOMAINDIR"
>  #define GIT_ATTR_SOURCE_ENVIRONMENT "GIT_ATTR_SOURCE"
> +#define GIT_REF_URI_ENVIRONMENT "GIT_REF_URI"
>  
>  /*
>   * Environment variable used to propagate the --no-advice global option to the
> diff --git a/refs.c b/refs.c
> index 23f46867f2..a7af228799 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2186,15 +2186,73 @@ static struct ref_store *get_ref_store_for_dir(struct repository *r,
>  	return maybe_debug_wrap_ref_store(dir, ref_store);
>  }
>  
> +static struct ref_store *get_ref_store_from_uri(struct repository *repo,
> +						const char *uri)
> +{
> +	struct string_list ref_backend_info = STRING_LIST_INIT_DUP;
> +	enum ref_storage_format format;
> +	struct ref_store *store = NULL;
> +	char *format_string;
> +	char *dir;
> +
> +	if (!uri || !uri[0]) {
> +		error("reference backend uri is empty");
> +		goto cleanup;
> +	}

Equating !uri and !uri[0] and giving the same message would not help
diagnosing an error, and not _("localizing") the message is of dubious
value (after all, the message is not being given to somebody coming
over the network, but meant to be given to the local user, right?).

If we remove the !uri[0] from the check, shouldn't the later check
catch it as "invalid format" anyway, and print '%s' it to show that
what was given was empty clearly enough?

> +	if (string_list_split(&ref_backend_info, uri, ":", 2) != 2) {
> +		error("invalid reference backend uri format '%s'", uri);
> +		goto cleanup;
> +	}
> +
> +	format_string = ref_backend_info.items[0].string;
> +	if (!starts_with(ref_backend_info.items[1].string, "//")) {
> +		error("invalid reference backend uri format '%s'", uri);
> +		goto cleanup;
> +	}
> +	dir = ref_backend_info.items[1].string + 2;

Two questions.  (1) do we still want the double-slash after the
colon?  (2) if so, would it make it simpler to string-list-split
using "://" as the separator?

> +	format_string = ref_backend_info.items[0].string;
> +	dir = ref_backend_info.items[1].string + 2;

These two lines are fishy.  Perhaps leftover from an earlier draft
that did not have an error checking before the previous 5 lines were
added?

> +	if (!dir || !dir[0]) {
> +		error("invalid path in uri '%s'", uri);
> +		goto cleanup;
> +	}

At this point it is very unlikely for "dir" to be NULL, no?  Even if
the .string member after splitting were NULL, adding 2 to it would
not leave it NULL.

Being defensive and checking for NULL is good, but then exactly the
same question on "NULL vs an empty string" applies here.

>  struct ref_store *get_main_ref_store(struct repository *r)
>  {
> +	char *ref_uri;
> +
>  	if (r->refs_private)
>  		return r->refs_private;
>  
>  	if (!r->gitdir)
>  		BUG("attempting to get main_ref_store outside of repository");
>  
> -	r->refs_private = get_ref_store_for_dir(r, r->gitdir, r->ref_storage_format);
> +	ref_uri = getenv(GIT_REF_URI_ENVIRONMENT);
> +	if (ref_uri) {
> +		r->refs_private = get_ref_store_from_uri(r, ref_uri);
> +		if (!r->refs_private)
> +			die("failed to initialize ref store from URI: %s", ref_uri);
> +
> +	} else {
> +		r->refs_private = get_ref_store_for_dir(r, r->gitdir,
> +							r->ref_storage_format);
> +	}
>  	return r->refs_private;
>  }

If this mechanism is for consumption by "git refs migrate", is it
possible to reduce the blast radius by giving the command a command
line option to do an equivalent of this?  I really am not happy with
this environment variable that can change the behaviour of such a
low level layer from unsuspecting programs that are not ready.

Instead of tweaking the behaviour of this function via environment
that can affect any programs, can't we give these callers like "git
refs migrate" with specific needs set_main_ref_store() function that
takes a ref_store and a repository.  Then they can use to call into
get_ref_store_for_dir() to obtain a ref they need.  "git refs migrate"
already takes "--ref-format" variable, so all it needs is another
"--ref-directory" command line option, right?

If the ability to set the ref backend location for arbitrary program
proves to be useful, we _could_ give the same --ref-format and
--ref-direcctory command line options to "git" itself (like "git -C
there" runs any subcommand in the named directory), which does the
the get_ref_store_for_dir() plus set_main_ref_store() dance,
modelled after how "git refs migrate" does them.

Hmm?

  reply	other threads:[~2025-11-26 16:17 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-19 21:48 [PATCH 0/2] refs: allow setting the reference directory Karthik Nayak
2025-11-19 21:48 ` [PATCH 1/2] refs: support obtaining ref_store for given dir Karthik Nayak
2025-11-20 19:05   ` Justin Tobler
2025-11-21 11:18     ` Karthik Nayak
2025-11-19 21:48 ` [PATCH 2/2] refs: add GIT_REF_URI to specify reference backend and directory Karthik Nayak
2025-11-19 22:13   ` Eric Sunshine
2025-11-19 23:01     ` Karthik Nayak
2025-11-20 10:00   ` Jean-Noël Avila
2025-11-21 11:21     ` Karthik Nayak
2025-11-20 19:38   ` Justin Tobler
2025-11-24 13:23     ` Karthik Nayak
2025-11-21 13:42   ` Toon Claes
2025-11-21 16:07     ` Junio C Hamano
2025-11-24 13:25       ` Karthik Nayak
2025-11-26 13:11         ` Toon Claes
2025-11-24 13:26     ` Karthik Nayak
2025-12-01 13:28   ` Patrick Steinhardt
2025-12-02 22:21     ` Karthik Nayak
2025-11-23  4:29 ` [PATCH 0/2] refs: allow setting the reference directory Junio C Hamano
2025-12-01 13:19   ` Patrick Steinhardt
2025-12-02 10:25     ` Junio C Hamano
2025-12-02 15:29     ` Karthik Nayak
2025-11-26 11:11 ` [PATCH v2 " Karthik Nayak
2025-11-26 11:12   ` [PATCH v2 1/2] refs: support obtaining ref_store for given dir Karthik Nayak
2025-11-26 15:16     ` Junio C Hamano
2025-11-26 11:12   ` [PATCH v2 2/2] refs: add GIT_REF_URI to specify reference backend and directory Karthik Nayak
2025-11-26 16:17     ` Junio C Hamano [this message]
2025-11-27 14:52       ` Karthik Nayak
2025-11-27 20:02         ` Junio C Hamano
2025-11-27 21:45           ` Karthik Nayak
2025-12-01 11:24 ` [PATCH v3 0/2] refs: allow setting the reference directory Karthik Nayak
2025-12-01 11:24   ` [PATCH v3 1/2] refs: support obtaining ref_store for given dir Karthik Nayak
2025-12-01 11:24   ` [PATCH v3 2/2] refs: add GIT_REF_URI to specify reference backend and directory Karthik Nayak
  -- strict thread matches above, loose matches on Subject: below --
2025-11-28  5:21 [PATCH v2 " Natee Korn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq7bvcpy35.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jltobler@gmail.com \
    --cc=jn.avila@free.fr \
    --cc=karthik.188@gmail.com \
    --cc=sunshine@sunshineco.com \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).