* [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined
@ 2024-10-07 16:54 Kousik Sanagavarapu
2024-10-08 12:23 ` Patrick Steinhardt
0 siblings, 1 reply; 5+ messages in thread
From: Kousik Sanagavarapu @ 2024-10-07 16:54 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt
Hi,
I have two questions but a bit of a background first -
102de880d2 (path.c: migrate global git_path_* to take a repository
argument, 2018-05-17) made global git_path_* functions take a repo
argument. The commit msg mentions that this migration doesn't change
the local path functions in various builtins - which were defined using
GIT_PATH_FUNC. This was also the commit which introduced the macro
REPO_GIT_PATH_FUNC.
Skip to 7ac16649ec (path: hide functions using `the_repository` by
default, 2024-08-13), GIT_PATH_FUNC is hidden under
USE_THE_REPOSITORY_VARIABLE and the REPO_GIT_PATH_FUNC is made its
arbitrary repo equivalent - which can be inferred from the following
portion of the diff
@@ -165,19 +130,10 @@ void report_linked_checkout_garbage(struct repository *r);
/*
* You can define a static memoized git path like:
*
- * static GIT_PATH_FUNC(git_path_foo, "FOO")
+ * static REPO_GIT_PATH_FUNC(git_path_foo, "FOO")
*
* or use one of the global ones below.
*/
-#define GIT_PATH_FUNC(func, filename) \
- const char *func(void) \
- { \
- static char *ret; \
- if (!ret) \
- ret = git_pathdup(filename); \
- return ret; \
- }
-
#define REPO_GIT_PATH_FUNC(var, filename) \
const char *git_path_##var(struct repository *r) \
{ \
(the GIT_PATH_FUNC macro is moved to be under USE_THE_REPOSITORY_VARIABLE)
Looking at the expansion of REPO_GIT_PATH_FUNC ...
#define REPO_GIT_PATH_FUNC(var, filename) \
const char *git_path_##var(struct repository *r) \
{ \
if (!r->cached_paths.var) \
r->cached_paths.var = repo_git_path(r, filename); \
return r->cached_paths.var; \
}
It seems that REPO_GIT_PATH_FUNC isn't an exact equivalent of
GIT_PATH_FUNC. That is, REPO_GIT_PATH_FUNC expects even a local path to be
a field of the "struct repo_path_cache". An example of a local path is
EDIT_DESCRIPTION from "git branch --edit-description" (which inturn gets
used by "git format-patch").
So my question is - do we want, in the future in which we are free from
the dependency on "the_repository", for all the local paths to be a part
of "struct repo_path_cache"? Which in my gut feels wrong - one alternative
then is that we will have to refactor REPO_GIT_PATH_FUNC - or am I missing
something here?
I got into this when I was trying to refactor builtin/branch.c to be
independent of "the_repository". It was a very naive approach of just
manual conversion of all the git_* calls to repo_* calls and similar
changes but the compiler started to complain since I overlooked
GIT_PATH_FUNC and some variables in environment.h which are also hidden
under USE_THE_REPOSITORY_VARIABLE.
Which raises another question - why are variables such as
"comment_line_str" and "default_abbrev" hidden under
USE_THE_REPOSITORY_VARIABLE?[1] They don't seem to be dependent on
"the_repository"? Again, I might be missing something here but am not
sure what.
By the way I don't expect this "naive approach" to be the right method
of doing this - I was just tinkering to get to know
USE_THE_REPOSITORY_VARIABLE better - since builtin/branch.c also calls
into ref-filter which heavily relies on "the_repository" so changes
there also would be appropriate for the complete picture.
[1] See
- f2d70847bd (environment: reorder header to split out `the_repository`-free
section, 2024-09-12)
- 673af418d0 (environment: guard state depending on a repository, 2024-09-12)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined
2024-10-07 16:54 [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined Kousik Sanagavarapu
@ 2024-10-08 12:23 ` Patrick Steinhardt
2024-10-08 17:59 ` Kousik Sanagavarapu
0 siblings, 1 reply; 5+ messages in thread
From: Patrick Steinhardt @ 2024-10-08 12:23 UTC (permalink / raw)
To: Kousik Sanagavarapu; +Cc: git
On Mon, Oct 07, 2024 at 10:24:49PM +0530, Kousik Sanagavarapu wrote:
> Hi,
>
> I have two questions but a bit of a background first -
>
> 102de880d2 (path.c: migrate global git_path_* to take a repository
> argument, 2018-05-17) made global git_path_* functions take a repo
> argument. The commit msg mentions that this migration doesn't change
> the local path functions in various builtins - which were defined using
> GIT_PATH_FUNC. This was also the commit which introduced the macro
> REPO_GIT_PATH_FUNC.
>
> Skip to 7ac16649ec (path: hide functions using `the_repository` by
> default, 2024-08-13), GIT_PATH_FUNC is hidden under
> USE_THE_REPOSITORY_VARIABLE and the REPO_GIT_PATH_FUNC is made its
> arbitrary repo equivalent - which can be inferred from the following
> portion of the diff
>
> @@ -165,19 +130,10 @@ void report_linked_checkout_garbage(struct repository *r);
> /*
> * You can define a static memoized git path like:
> *
> - * static GIT_PATH_FUNC(git_path_foo, "FOO")
> + * static REPO_GIT_PATH_FUNC(git_path_foo, "FOO")
> *
> * or use one of the global ones below.
> */
> -#define GIT_PATH_FUNC(func, filename) \
> - const char *func(void) \
> - { \
> - static char *ret; \
> - if (!ret) \
> - ret = git_pathdup(filename); \
> - return ret; \
> - }
> -
> #define REPO_GIT_PATH_FUNC(var, filename) \
> const char *git_path_##var(struct repository *r) \
> { \
>
> (the GIT_PATH_FUNC macro is moved to be under USE_THE_REPOSITORY_VARIABLE)
>
> Looking at the expansion of REPO_GIT_PATH_FUNC ...
>
> #define REPO_GIT_PATH_FUNC(var, filename) \
> const char *git_path_##var(struct repository *r) \
> { \
> if (!r->cached_paths.var) \
> r->cached_paths.var = repo_git_path(r, filename); \
> return r->cached_paths.var; \
> }
>
> It seems that REPO_GIT_PATH_FUNC isn't an exact equivalent of
> GIT_PATH_FUNC. That is, REPO_GIT_PATH_FUNC expects even a local path to be
> a field of the "struct repo_path_cache". An example of a local path is
> EDIT_DESCRIPTION from "git branch --edit-description" (which inturn gets
> used by "git format-patch").
>
> So my question is - do we want, in the future in which we are free from
> the dependency on "the_repository", for all the local paths to be a part
> of "struct repo_path_cache"? Which in my gut feels wrong - one alternative
> then is that we will have to refactor REPO_GIT_PATH_FUNC - or am I missing
> something here?
What I don't quite understand: what is the problem with making it part
of the `struct repo_path_cache`? Does this cause an actual issue, or is
it merely that you feel it is unnecessary complexity?
> I got into this when I was trying to refactor builtin/branch.c to be
> independent of "the_repository". It was a very naive approach of just
> manual conversion of all the git_* calls to repo_* calls and similar
> changes but the compiler started to complain since I overlooked
> GIT_PATH_FUNC and some variables in environment.h which are also hidden
> under USE_THE_REPOSITORY_VARIABLE.
>
> Which raises another question - why are variables such as
> "comment_line_str" and "default_abbrev" hidden under
> USE_THE_REPOSITORY_VARIABLE?[1] They don't seem to be dependent on
> "the_repository"? Again, I might be missing something here but am not
> sure what.
They do depend on `the_repository`, but implicitly only. The problem is
that those variables are populated via the config, and that may include
repository-local configuration. As such they contain values that have
been derived via `the_repository`, and those values may not be the
correct value when you handle multiple repositories in a single process,
because those may have a different value for e.g. "core.commentChar".
> By the way I don't expect this "naive approach" to be the right method
> of doing this - I was just tinkering to get to know
> USE_THE_REPOSITORY_VARIABLE better - since builtin/branch.c also calls
> into ref-filter which heavily relies on "the_repository" so changes
> there also would be appropriate for the complete picture.
Yeah, in the ideal case you'd first adapt any underlying code that you
happen to spot that relies on `the_repository`. That doesn't always
work as it is easy to miss that something implicitly depends on the
variable. But in case such a dependency is missed it will get to light
eventually as we continue with our quest to remove `the_repository`.
Patrick
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined
2024-10-08 12:23 ` Patrick Steinhardt
@ 2024-10-08 17:59 ` Kousik Sanagavarapu
2024-10-09 3:42 ` Patrick Steinhardt
0 siblings, 1 reply; 5+ messages in thread
From: Kousik Sanagavarapu @ 2024-10-08 17:59 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Tue, Oct 08, 2024 at 02:23:54PM +0200, Patrick Steinhardt wrote:
> On Mon, Oct 07, 2024 at 10:24:49PM +0530, Kousik Sanagavarapu wrote:
> > Hi,
> >
> > I have two questions but a bit of a background first -
> >
> > [...]
> >
> > So my question is - do we want, in the future in which we are free from
> > the dependency on "the_repository", for all the local paths to be a part
> > of "struct repo_path_cache"? Which in my gut feels wrong - one alternative
> > then is that we will have to refactor REPO_GIT_PATH_FUNC - or am I missing
> > something here?
>
> What I don't quite understand: what is the problem with making it part
> of the `struct repo_path_cache`? Does this cause an actual issue, or is
> it merely that you feel it is unnecessary complexity?
I feel it is unnecessary complexity.
$ git grep -E "(static GIT_PATH_FUNC|^GIT_PATH_FUNC)" | wc -l
65
Meaning each of these would have to have an entry in
"struct repo_path_cache" in the world where we don't rely on
"the_repository". Some of these are also not direct ".git/some-file" but
".git/dir/files" where ".git/dir" is also given by a seperate path func,
like ".git/rebase-merges" and ".git/rebase-merges/head-name".
So why hold pointers to such filenames instead of just calling
repo_git_path() manually - all these filenames are "local" anyways - unlike
say files such as "SQUASH_MSG"?
> > I got into this when I was trying to refactor builtin/branch.c to be
> > independent of "the_repository". It was a very naive approach of just
> > manual conversion of all the git_* calls to repo_* calls and similar
> > changes but the compiler started to complain since I overlooked
> > GIT_PATH_FUNC and some variables in environment.h which are also hidden
> > under USE_THE_REPOSITORY_VARIABLE.
> >
> > Which raises another question - why are variables such as
> > "comment_line_str" and "default_abbrev" hidden under
> > USE_THE_REPOSITORY_VARIABLE?[1] They don't seem to be dependent on
> > "the_repository"? Again, I might be missing something here but am not
> > sure what.
>
> They do depend on `the_repository`, but implicitly only. The problem is
> that those variables are populated via the config, and that may include
> repository-local configuration. As such they contain values that have
> been derived via `the_repository`, and those values may not be the
> correct value when you handle multiple repositories in a single process,
> because those may have a different value for e.g. "core.commentChar".
I see. Guess I didn't do my research right - didn't know about
"core.commentChar".
> > By the way I don't expect this "naive approach" to be the right method
> > of doing this - I was just tinkering to get to know
> > USE_THE_REPOSITORY_VARIABLE better - since builtin/branch.c also calls
> > into ref-filter which heavily relies on "the_repository" so changes
> > there also would be appropriate for the complete picture.
>
> Yeah, in the ideal case you'd first adapt any underlying code that you
> happen to spot that relies on `the_repository`. That doesn't always
> work as it is easy to miss that something implicitly depends on the
> variable. But in case such a dependency is missed it will get to light
> eventually as we continue with our quest to remove `the_repository`.
Thanks for such a nice explanation.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined
2024-10-08 17:59 ` Kousik Sanagavarapu
@ 2024-10-09 3:42 ` Patrick Steinhardt
2024-10-09 6:20 ` Kousik Sanagavarapu
0 siblings, 1 reply; 5+ messages in thread
From: Patrick Steinhardt @ 2024-10-09 3:42 UTC (permalink / raw)
To: Kousik Sanagavarapu; +Cc: git
On Tue, Oct 08, 2024 at 11:29:51PM +0530, Kousik Sanagavarapu wrote:
> On Tue, Oct 08, 2024 at 02:23:54PM +0200, Patrick Steinhardt wrote:
> > On Mon, Oct 07, 2024 at 10:24:49PM +0530, Kousik Sanagavarapu wrote:
> > > Hi,
> > >
> > > I have two questions but a bit of a background first -
> > >
> > > [...]
> > >
> > > So my question is - do we want, in the future in which we are free from
> > > the dependency on "the_repository", for all the local paths to be a part
> > > of "struct repo_path_cache"? Which in my gut feels wrong - one alternative
> > > then is that we will have to refactor REPO_GIT_PATH_FUNC - or am I missing
> > > something here?
> >
> > What I don't quite understand: what is the problem with making it part
> > of the `struct repo_path_cache`? Does this cause an actual issue, or is
> > it merely that you feel it is unnecessary complexity?
>
> I feel it is unnecessary complexity.
>
> $ git grep -E "(static GIT_PATH_FUNC|^GIT_PATH_FUNC)" | wc -l
> 65
>
> Meaning each of these would have to have an entry in
> "struct repo_path_cache" in the world where we don't rely on
> "the_repository". Some of these are also not direct ".git/some-file" but
> ".git/dir/files" where ".git/dir" is also given by a seperate path func,
> like ".git/rebase-merges" and ".git/rebase-merges/head-name".
>
> So why hold pointers to such filenames instead of just calling
> repo_git_path() manually - all these filenames are "local" anyways - unlike
> say files such as "SQUASH_MSG"?
It does make the interface easier to use at times because you don't have
to worry about freeing returned strings. In other situations it likely
is unnecessary.
In any case, not all cases must strictly be converted to REPO_PATH_FUNC.
A refactoring may also decide that using e.g. `repo_git_path()` or
`repo_common_pathv()` might be better alternatives.
Patrick
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined
2024-10-09 3:42 ` Patrick Steinhardt
@ 2024-10-09 6:20 ` Kousik Sanagavarapu
0 siblings, 0 replies; 5+ messages in thread
From: Kousik Sanagavarapu @ 2024-10-09 6:20 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Wed, Oct 09, 2024 at 05:42:01AM +0200, Patrick Steinhardt wrote:
> On Tue, Oct 08, 2024 at 11:29:51PM +0530, Kousik Sanagavarapu wrote:
> > On Tue, Oct 08, 2024 at 02:23:54PM +0200, Patrick Steinhardt wrote:
> > > On Mon, Oct 07, 2024 at 10:24:49PM +0530, Kousik Sanagavarapu wrote:
> > > > Hi,
> > > >
> > > > I have two questions but a bit of a background first -
> > > >
> > > > [...]
> > > >
> > > > So my question is - do we want, in the future in which we are free from
> > > > the dependency on "the_repository", for all the local paths to be a part
> > > > of "struct repo_path_cache"? Which in my gut feels wrong - one alternative
> > > > then is that we will have to refactor REPO_GIT_PATH_FUNC - or am I missing
> > > > something here?
> > >
> > > What I don't quite understand: what is the problem with making it part
> > > of the `struct repo_path_cache`? Does this cause an actual issue, or is
> > > it merely that you feel it is unnecessary complexity?
> >
> > I feel it is unnecessary complexity.
> >
> > $ git grep -E "(static GIT_PATH_FUNC|^GIT_PATH_FUNC)" | wc -l
> > 65
> >
> > Meaning each of these would have to have an entry in
> > "struct repo_path_cache" in the world where we don't rely on
> > "the_repository". Some of these are also not direct ".git/some-file" but
> > ".git/dir/files" where ".git/dir" is also given by a seperate path func,
> > like ".git/rebase-merges" and ".git/rebase-merges/head-name".
> >
> > So why hold pointers to such filenames instead of just calling
> > repo_git_path() manually - all these filenames are "local" anyways - unlike
> > say files such as "SQUASH_MSG"?
>
> It does make the interface easier to use at times because you don't have
> to worry about freeing returned strings. In other situations it likely
> is unnecessary.
>
> In any case, not all cases must strictly be converted to REPO_PATH_FUNC.
> A refactoring may also decide that using e.g. `repo_git_path()` or
> `repo_common_pathv()` might be better alternatives.
Got it. Thanks again for the nice explanations.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-10-09 6:20 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-07 16:54 [Question] local paths when USE_THE_REPOSITORY_VARIABLE is not defined Kousik Sanagavarapu
2024-10-08 12:23 ` Patrick Steinhardt
2024-10-08 17:59 ` Kousik Sanagavarapu
2024-10-09 3:42 ` Patrick Steinhardt
2024-10-09 6:20 ` Kousik Sanagavarapu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).