* [PATCH v1 1/2] t0050: ls-files path fails if path of workdir is NFD
[not found] <20240430032717281.IXLP.121462.mail.biglobe.ne.jp@biglobe.ne.jp>
@ 2024-05-07 8:44 ` tboegi
2024-05-07 17:30 ` Junio C Hamano
2024-05-07 8:44 ` [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed() tboegi
` (3 subsequent siblings)
4 siblings, 1 reply; 24+ messages in thread
From: tboegi @ 2024-05-07 8:44 UTC (permalink / raw)
To: tboegi, git, takimoto-j
From: Torsten Bögershausen <tboegi@web.de>
Add a test case for this bug report, slightly edited and shortened:
ls-files path' fails if absolute path of workdir contains NFD (macOS)
On macOS, 'git ls-files path' does not work (gives an error)
if the absolute 'path' contains characters in NFD (decomposed).
I guess this is a (minor) bug of git.
$ cd /somewhere # some safe place, /tmp or ~/tmp etc.
$ mkdir $'u\xcc\x88' # ü in NFD
$ cd ü # or cd $'u\xcc\x88' or cd $'\xc3\xbc'
$ git init
$ git ls-files $'/somewhere/u\xcc\x88' # NFD
fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü'
$ git ls-files $'/somewhere/\xc3\xbc' # NFC
(the same error as above)
In the 'fatal:' error message, there are three ü;
the 1st and 2nd are in NFC, the 3rd is in NFD.
The added test case here follows the error description,
with the exception that the 'ü' is replaced by an 'ä',
which we already have as NFD and NFC in t0050.
A fix will be done in the next commit.
Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
t/t0050-filesystem.sh | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 325eb1c3cd..bb85ec38cb 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -156,4 +156,16 @@ test_expect_success CASE_INSENSITIVE_FS 'checkout with no pathspec and a case in
)
'
+test_expect_success 'git ls-files under NFD' '
+ (
+ mkdir somewhere &&
+ mkdir somewhere/$aumlcdiar &&
+ mypwd=$PWD &&
+ cd somewhere/$aumlcdiar &&
+ git init &&
+ git ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
+ >expected &&
+ test_cmp expected err
+ )
+'
test_done
--
2.41.0.394.ge43f4fd0bd
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v1 1/2] t0050: ls-files path fails if path of workdir is NFD
2024-05-07 8:44 ` [PATCH v1 1/2] t0050: ls-files path fails if path of workdir is NFD tboegi
@ 2024-05-07 17:30 ` Junio C Hamano
0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2024-05-07 17:30 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
tboegi@web.de writes:
> From: Torsten Bögershausen <tboegi@web.de>
>
> Add a test case for this bug report, slightly edited and shortened:
>
> ls-files path' fails if absolute path of workdir contains NFD (macOS)
> On macOS, 'git ls-files path' does not work (gives an error)
> if the absolute 'path' contains characters in NFD (decomposed).
> I guess this is a (minor) bug of git.
>
> $ cd /somewhere # some safe place, /tmp or ~/tmp etc.
> $ mkdir $'u\xcc\x88' # ü in NFD
> $ cd ü # or cd $'u\xcc\x88' or cd $'\xc3\xbc'
> $ git init
> $ git ls-files $'/somewhere/u\xcc\x88' # NFD
> fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü'
> $ git ls-files $'/somewhere/\xc3\xbc' # NFC
> (the same error as above)
>
> In the 'fatal:' error message, there are three ü;
> the 1st and 2nd are in NFC, the 3rd is in NFD.
>
> The added test case here follows the error description,
> with the exception that the 'ü' is replaced by an 'ä',
> which we already have as NFD and NFC in t0050.
> A fix will be done in the next commit.
That will break bisection. I think combining the two commits into
one would make sense for a small change like this, consisting a
focused and straight-forward fix plus a clean and concise test.
> Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> t/t0050-filesystem.sh | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
> index 325eb1c3cd..bb85ec38cb 100755
> --- a/t/t0050-filesystem.sh
> +++ b/t/t0050-filesystem.sh
> @@ -156,4 +156,16 @@ test_expect_success CASE_INSENSITIVE_FS 'checkout with no pathspec and a case in
> )
> '
>
> +test_expect_success 'git ls-files under NFD' '
> + (
> + mkdir somewhere &&
> + mkdir somewhere/$aumlcdiar &&
Would a single "mkdir -p" suffice?
mkdir -p "somewhere/$aumlcdiar" &&
> + mypwd=$PWD &&
> + cd somewhere/$aumlcdiar &&
> + git init &&
> + git ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
We do not control what is in "$mypwd". Can it have funny characters
that can confuse Git? Quoting the path with a pair of double quotes
protects the shell from getting confused with $IFS whitespaces, but
we may want to protect the pathspec handing in Git with something
like
git --literal-pathspecs ls-files "..."
here.
> + >expected &&
> + test_cmp expected err
> + )
> +'
> test_done
> --
> 2.41.0.394.ge43f4fd0bd
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed()
[not found] <20240430032717281.IXLP.121462.mail.biglobe.ne.jp@biglobe.ne.jp>
2024-05-07 8:44 ` [PATCH v1 1/2] t0050: ls-files path fails if path of workdir is NFD tboegi
@ 2024-05-07 8:44 ` tboegi
2024-05-07 17:22 ` Junio C Hamano
2024-05-07 17:47 ` Junio C Hamano
2024-05-09 16:11 ` [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD tboegi
` (2 subsequent siblings)
4 siblings, 2 replies; 24+ messages in thread
From: tboegi @ 2024-05-07 8:44 UTC (permalink / raw)
To: tboegi, git, takimoto-j
From: Torsten Bögershausen <tboegi@web.de>
When running under macOs a call to strbuf_getcwd() may return
decomposed unicode.
This could make `git ls-files` fail, see previous commit of t0050
The solution is to precompose the result of getcwd() if needed.
One possible implementation would be to re-define getcwd() similar
to opendir(), readdir() and closedir().
Since there is already a strbuf wrapper around getcwd(), and only this
wrapper is used inside the whole codebase, equip strbuf_getcwd() with
a call to the newly created function precompse_strbuf_if_needed().
Note that precompse_strbuf_if_needed() is a function under macOs,
and is a "nop" on all other systems.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
compat/precompose_utf8.c | 11 +++++++++++
compat/precompose_utf8.h | 1 +
git-compat-util.h | 2 ++
strbuf.c | 1 +
4 files changed, 15 insertions(+)
diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index 0bd5c24250..82ec2a1c5b 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -94,6 +94,17 @@ const char *precompose_string_if_needed(const char *in)
return in;
}
+void precompse_strbuf_if_needed(struct strbuf *sb)
+{
+ char *buf_prec = (char *)precompose_string_if_needed(sb->buf);
+ if (buf_prec != sb->buf) {
+ size_t buf_prec_len = strlen(buf_prec);
+ free(strbuf_detach(sb, NULL));
+ strbuf_attach(sb, buf_prec, buf_prec_len, buf_prec_len + 1);
+ }
+
+}
+
const char *precompose_argv_prefix(int argc, const char **argv, const char *prefix)
{
int i = 0;
diff --git a/compat/precompose_utf8.h b/compat/precompose_utf8.h
index fea06cf28a..fb17b1bd4a 100644
--- a/compat/precompose_utf8.h
+++ b/compat/precompose_utf8.h
@@ -30,6 +30,7 @@ typedef struct {
const char *precompose_argv_prefix(int argc, const char **argv, const char *prefix);
const char *precompose_string_if_needed(const char *in);
+void precompse_strbuf_if_needed(struct strbuf *sb);
void probe_utf8_pathname_composition(void);
PREC_DIR *precompose_utf8_opendir(const char *dirname);
diff --git a/git-compat-util.h b/git-compat-util.h
index ca7678a379..80ae463410 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -344,6 +344,8 @@ static inline const char *precompose_string_if_needed(const char *in)
return in;
}
+#define precompse_strbuf_if_needed(a)
+
#define probe_utf8_pathname_composition()
#endif
diff --git a/strbuf.c b/strbuf.c
index 0d929e4e19..cefea6b75f 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -591,6 +591,7 @@ int strbuf_getcwd(struct strbuf *sb)
for (;; guessed_len *= 2) {
strbuf_grow(sb, guessed_len);
if (getcwd(sb->buf, sb->alloc)) {
+ precompse_strbuf_if_needed(sb);
strbuf_setlen(sb, strlen(sb->buf));
return 0;
}
--
2.41.0.394.ge43f4fd0bd
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed()
2024-05-07 8:44 ` [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed() tboegi
@ 2024-05-07 17:22 ` Junio C Hamano
2024-05-09 15:24 ` Junio C Hamano
2024-05-07 17:47 ` Junio C Hamano
1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-05-07 17:22 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
tboegi@web.de writes:
> +void precompse_strbuf_if_needed(struct strbuf *sb)
> +{
> + char *buf_prec = (char *)precompose_string_if_needed(sb->buf);
> + if (buf_prec != sb->buf) {
Cute. This matches with the !PRECOMPSE_UNICODE case in git-compat-util.h
where we do
static inline const char *precompose_string_if_needed(const char *in)
{
return in;
}
to make it a no-op. I was wondering how you are avoiding an
inevitable crash from trying to free an unfreeable piece of memory,
but this should do just fine.
You'd want to fix the typo in the name of the new function, I
presume? "precompse" -> "precompose"
> + size_t buf_prec_len = strlen(buf_prec);
> + free(strbuf_detach(sb, NULL));
> + strbuf_attach(sb, buf_prec, buf_prec_len, buf_prec_len + 1);
> + }
> +
> +}
> diff --git a/strbuf.c b/strbuf.c
> index 0d929e4e19..cefea6b75f 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -591,6 +591,7 @@ int strbuf_getcwd(struct strbuf *sb)
> for (;; guessed_len *= 2) {
> strbuf_grow(sb, guessed_len);
> if (getcwd(sb->buf, sb->alloc)) {
> + precompse_strbuf_if_needed(sb);
> strbuf_setlen(sb, strlen(sb->buf));
The need for strbuf_setlen() stems from the use of getcwd() that may
and will place a string that is much shorter than sb->alloc, so they
logically belong together. It will make more sense to call the
precompose _after_ arranging the members of strbuf in a consistent
state with the call to strbuf_setlen().
> return 0;
> }
> --
> 2.41.0.394.ge43f4fd0bd
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed()
2024-05-07 17:22 ` Junio C Hamano
@ 2024-05-09 15:24 ` Junio C Hamano
2024-05-09 15:29 ` Torsten Bögershausen
0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-05-09 15:24 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
Junio C Hamano <gitster@pobox.com> writes:
>> diff --git a/strbuf.c b/strbuf.c
>> index 0d929e4e19..cefea6b75f 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -591,6 +591,7 @@ int strbuf_getcwd(struct strbuf *sb)
>> for (;; guessed_len *= 2) {
>> strbuf_grow(sb, guessed_len);
>> if (getcwd(sb->buf, sb->alloc)) {
>> + precompse_strbuf_if_needed(sb);
>> strbuf_setlen(sb, strlen(sb->buf));
>
> The need for strbuf_setlen() stems from the use of getcwd() that may
> and will place a string that is much shorter than sb->alloc, so they
> logically belong together. It will make more sense to call the
> precompose _after_ arranging the members of strbuf in a consistent
> state with the call to strbuf_setlen().
Of course, we need to make sure precompose_string_if_needed() will
leave the strbuf in an consistent state. I think the implementation
of that helper function in this patch already does so, so
strbuf_grow(sb, guessed_len);
if (getcwd(sb->buf, sb->alloc)) {
strbuf_setlen(sb, strlen(sb->buf));
precompse_strbuf_if_needed(sb);
would be what we would want.
Thanks.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed()
2024-05-09 15:24 ` Junio C Hamano
@ 2024-05-09 15:29 ` Torsten Bögershausen
0 siblings, 0 replies; 24+ messages in thread
From: Torsten Bögershausen @ 2024-05-09 15:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, takimoto-j
On Thu, May 09, 2024 at 08:24:05AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> Of course, we need to make sure precompose_string_if_needed() will
> leave the strbuf in an consistent state. I think the implementation
> of that helper function in this patch already does so, so
>
> strbuf_grow(sb, guessed_len);
> if (getcwd(sb->buf, sb->alloc)) {
> strbuf_setlen(sb, strlen(sb->buf));
> precompse_strbuf_if_needed(sb);
>
I think that is what I have in mind as well.
Thanks for the review, a V2 should come the next days.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed()
2024-05-07 8:44 ` [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed() tboegi
2024-05-07 17:22 ` Junio C Hamano
@ 2024-05-07 17:47 ` Junio C Hamano
2024-05-08 0:32 ` brian m. carlson
1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-05-07 17:47 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
tboegi@web.de writes:
> From: Torsten Bögershausen <tboegi@web.de>
>
> When running under macOs a call to strbuf_getcwd() may return
You spelled it as "macOS" in [1/2]. The hits from
$ git grep -i 'mac *os' \*.[ch]
tell me that we seem to say "macOS", "MacOS X", "Mac OSX" and "Mac
OS X" pretty much interchangeably. We may want to eventually
consolidate them to whatever the official name Apple uses, but in
the meantime let's make sure we do not add even more.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed()
2024-05-07 17:47 ` Junio C Hamano
@ 2024-05-08 0:32 ` brian m. carlson
0 siblings, 0 replies; 24+ messages in thread
From: brian m. carlson @ 2024-05-08 0:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: tboegi, git, takimoto-j
[-- Attachment #1: Type: text/plain, Size: 1126 bytes --]
On 2024-05-07 at 17:47:41, Junio C Hamano wrote:
> tboegi@web.de writes:
>
> > From: Torsten Bögershausen <tboegi@web.de>
> >
> > When running under macOs a call to strbuf_getcwd() may return
>
> You spelled it as "macOS" in [1/2]. The hits from
>
> $ git grep -i 'mac *os' \*.[ch]
>
> tell me that we seem to say "macOS", "MacOS X", "Mac OSX" and "Mac
> OS X" pretty much interchangeably. We may want to eventually
> consolidate them to whatever the official name Apple uses, but in
> the meantime let's make sure we do not add even more.
I believe the current preferred form is "macOS". That's what I see on
Apple's website, and that's consistent with my understanding as well.
It was previously "Mac OS X", which is why we probably have that in our
code base, but it's my understanding Apple has moved away from that.
Wikipedia's opening sentence states, "macOS, originally Mac OS X,
previously shortened as OS X, is an operating system developed and
marketed by Apple since 2001," so I think Wikipedia agrees, too.
--
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD
[not found] <20240430032717281.IXLP.121462.mail.biglobe.ne.jp@biglobe.ne.jp>
2024-05-07 8:44 ` [PATCH v1 1/2] t0050: ls-files path fails if path of workdir is NFD tboegi
2024-05-07 8:44 ` [PATCH v1 2/2] strbuf_getcwd() needs precompse_strbuf_if_needed() tboegi
@ 2024-05-09 16:11 ` tboegi
2024-05-09 16:37 ` Junio C Hamano
2024-05-19 7:03 ` Jun. T
2024-05-21 14:14 ` [PATCH v3 " tboegi
2024-05-31 19:31 ` [PATCH v4 " tboegi
4 siblings, 2 replies; 24+ messages in thread
From: tboegi @ 2024-05-09 16:11 UTC (permalink / raw)
To: tboegi, git, takimoto-j
From: Torsten Bögershausen <tboegi@web.de>
Under macOS, `git ls-files path` does not work (gives an error)
if the absolute 'path' contains characters in NFD (decomposed).
This happens when core.precomposeunicode is true, which is the
most common case. The bug report says:
$ cd somewhere # some safe place, /tmp or ~/tmp etc.
$ mkdir $'u\xcc\x88' # ü in NFD
$ cd ü # or cd $'u\xcc\x88' or cd $'\xc3\xbc'
$ git init
$ git ls-files $'/somewhere/u\xcc\x88' # NFD
fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü'
$ git ls-files $'/somewhere/\xc3\xbc' # NFC
(the same error as above)
In the 'fatal:' error message, there are three ü;
the 1st and 2nd are in NFC, the 3rd is in NFD.
This commit adds a test case that follows the bug report,
with the simplification that the 'ü' is replaced by an 'ä',
which is already used as NFD and NFC in t0050.
The solution is to precompose the result of getcwd(), if needed.
One possible implementation would be to re-define getcwd() similar
to opendir(), readdir() and closedir().
Since there is already a strbuf wrapper around getcwd(), and only this
wrapper is used inside the whole codebase, equip strbuf_getcwd() with
a call to the newly created function precompose_strbuf_if_needed().
Note that precompose_strbuf_if_needed() is a function under macOS,
and is a "no-op" on all other systems.
Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
compat/precompose_utf8.c | 10 ++++++++++
compat/precompose_utf8.h | 1 +
git-compat-util.h | 1 +
strbuf.c | 1 +
t/t0050-filesystem.sh | 11 +++++++++++
5 files changed, 24 insertions(+)
Thanks everybody for the review, which makes V2 much better.
diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index 0bd5c24250..5a7c90c90d 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -94,6 +94,16 @@ const char *precompose_string_if_needed(const char *in)
return in;
}
+void precompose_strbuf_if_needed(struct strbuf *sb)
+{
+ char *buf_prec = (char *)precompose_string_if_needed(sb->buf);
+ if (buf_prec != sb->buf) {
+ size_t buf_prec_len = strlen(buf_prec);
+ free(strbuf_detach(sb, NULL));
+ strbuf_attach(sb, buf_prec, buf_prec_len, buf_prec_len + 1);
+ }
+}
+
const char *precompose_argv_prefix(int argc, const char **argv, const char *prefix)
{
int i = 0;
diff --git a/compat/precompose_utf8.h b/compat/precompose_utf8.h
index fea06cf28a..7c3cfcadb0 100644
--- a/compat/precompose_utf8.h
+++ b/compat/precompose_utf8.h
@@ -30,6 +30,7 @@ typedef struct {
const char *precompose_argv_prefix(int argc, const char **argv, const char *prefix);
const char *precompose_string_if_needed(const char *in);
+void precompose_strbuf_if_needed(struct strbuf *sb);
void probe_utf8_pathname_composition(void);
PREC_DIR *precompose_utf8_opendir(const char *dirname);
diff --git a/git-compat-util.h b/git-compat-util.h
index ca7678a379..892e1f9067 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -344,6 +344,7 @@ static inline const char *precompose_string_if_needed(const char *in)
return in;
}
+#define precompose_strbuf_if_needed(a)
#define probe_utf8_pathname_composition()
#endif
diff --git a/strbuf.c b/strbuf.c
index 0d929e4e19..d5b4b3903a 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -592,6 +592,7 @@ int strbuf_getcwd(struct strbuf *sb)
strbuf_grow(sb, guessed_len);
if (getcwd(sb->buf, sb->alloc)) {
strbuf_setlen(sb, strlen(sb->buf));
+ precompose_strbuf_if_needed(sb);
return 0;
}
diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 325eb1c3cd..a24ec866d1 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -156,4 +156,15 @@ test_expect_success CASE_INSENSITIVE_FS 'checkout with no pathspec and a case in
)
'
+test_expect_success 'git ls-files under NFD' '
+ (
+ mkdir -p somewhere/$aumlcdiar &&
+ mypwd=$PWD &&
+ cd somewhere/$aumlcdiar &&
+ git init &&
+ git --literal-pathspecs ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
+ >expected &&
+ test_cmp expected err
+ )
+'
test_done
--
2.41.0.394.ge43f4fd0bd
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-09 16:11 ` [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD tboegi
@ 2024-05-09 16:37 ` Junio C Hamano
2024-05-19 7:03 ` Jun. T
1 sibling, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2024-05-09 16:37 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
tboegi@web.de writes:
> From: Torsten Bögershausen <tboegi@web.de>
>
> Under macOS, `git ls-files path` does not work (gives an error)
> if the absolute 'path' contains characters in NFD (decomposed).
> ...
>
> Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
Looks good. I've queued with a slight rewording to the proposed log
message, and a bit of extra quoting in the test. Any string that
contains "$aumlcdiar" are enclosed in a pair of double-quotes in the
script, so not just the one given to ls-files, other two references
to it are also quoted now.
Thanks.
1: a00cec23cf ! 1: ee6ba4053d macOS: ls-files path fails if path of workdir is NFD
@@ Commit message
In the 'fatal:' error message, there are three ü;
the 1st and 2nd are in NFC, the 3rd is in NFD.
- This commit adds a test case that follows the bug report,
- with the simplification that the 'ü' is replaced by an 'ä',
- which is already used as NFD and NFC in t0050.
+ Add a test case that follows the bug report, with the simplification
+ that the 'ü' is replaced by an 'ä', which is already used as NFD and
+ NFC in t0050.
- The solution is to precompose the result of getcwd(), if needed.
+ Precompose the result of getcwd(), if needed, just like all other
+ paths we use internally. That way, paths comparisons are all done
+ in NFC and we would correctly notice that the early part of the
+ path given as an absolute path matches the current directory.
One possible implementation would be to re-define getcwd() similar
- to opendir(), readdir() and closedir().
- Since there is already a strbuf wrapper around getcwd(), and only this
- wrapper is used inside the whole codebase, equip strbuf_getcwd() with
- a call to the newly created function precompose_strbuf_if_needed().
+ to opendir(), readdir() and closedir(), but since there is already a
+ strbuf wrapper around getcwd(), and only this wrapper is used inside
+ the whole codebase, equip strbuf_getcwd() with a call to the newly
+ created function precompose_strbuf_if_needed().
+
Note that precompose_strbuf_if_needed() is a function under macOS,
and is a "no-op" on all other systems.
@@ t/t0050-filesystem.sh: test_expect_success CASE_INSENSITIVE_FS 'checkout with no
+test_expect_success 'git ls-files under NFD' '
+ (
-+ mkdir -p somewhere/$aumlcdiar &&
++ mkdir -p "somewhere/$aumlcdiar" &&
+ mypwd=$PWD &&
-+ cd somewhere/$aumlcdiar &&
++ cd "somewhere/$aumlcdiar" &&
+ git init &&
-+ git --literal-pathspecs ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
++ git --literal-pathspecs ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
+ >expected &&
+ test_cmp expected err
+ )
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-09 16:11 ` [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD tboegi
2024-05-09 16:37 ` Junio C Hamano
@ 2024-05-19 7:03 ` Jun. T
2024-05-20 16:06 ` Torsten Bögershausen
1 sibling, 1 reply; 24+ messages in thread
From: Jun. T @ 2024-05-19 7:03 UTC (permalink / raw)
To: tboegi; +Cc: git
Sorry for not responding quickly.
Thank you for the patch, but it seems the problem still remains.
Although
% git ls-files NFD
(apparently) works,
% git ls-files NFC
still gives the error
(if core.precomposeunicode is not set in global config).
The following is some info I got (hope it is correct and useful),
but I have no idea how to fix the problem.
precompose_string_if_needed() works only if:
precomposed_unicode is already set to 1, or
git_config_get_bool("core.precomposeunicode") sets it to 1.
But git_config_get_bool() reads the file .git/config only if:
the_repository->commondir is already set to ".git".
Back trace when the strbuf_getcwd() is called for the
3rd time is (frame #4 is set_git_work_tree()):
* frame #0: git`strbuf_getcwd(sb=0x00007ff7bfeff0a8) at strbuf.c:588:20
frame #1: git`strbuf_realpath_1(resolved=0x00007ff7bfeff0a8, path=".", flags=2) at abspath.c:101:7
frame #2: git`strbuf_realpath(resolved=0x00007ff7bfeff0a8, path=".", die_on_error=1) at abspath.c:219:9
frame #3: git`real_pathdup(path=".", die_on_error=1) at abspath.c:240:6
frame #4: git`repo_set_worktree(repo=0x000000010044eb98, path=".") at repository.c:145:19
frame #5: git`set_git_work_tree(new_work_tree=".") at environment.c:278:2
frame #6: git`setup_discovered_git_dir(gitdir=".git", cwd=0x0000000100435238, offset=16, repo_fmt=0x00007ff7bfeff1d8, nongit_ok=0x0000000000000000) at setup.c:1119:2
frame #7: git`setup_git_directory_gently(nongit_ok=0x0000000000000000) at setup.c:1606:12
frame #8: git`setup_git_directory at setup.c:1815:9
frame #9: git`run_builtin(p=0x0000000100424d58, argc=2, argv=0x00007ff7bfeff6d8) at git.c:448:12
frame #10: git`handle_builtin(argc=2, argv=0x00007ff7bfeff6d8) at git.c:729:3
frame #11: git`run_argv(argcp=0x00007ff7bfeff54c, argv=0x00007ff7bfeff540) at git.c:793:4
frame #12: git`cmd_main(argc=2, argv=0x00007ff7bfeff6d8) at git.c:928:19
frame #13: git`main(argc=3, argv=0x00007ff7bfeff6d0) at common-main.c:62:11
At this point, precomposed_unicode is still -1 and
the_repository->commondir is still NULL.
This means strbuf_getcwd() retuns NFD, and the_repository->worktree is set to NFD.
Moreover, precompose_string_if_needed() calls
git_config_get_bool("core.precomposeunicode"), and
this function indirecly sets
the_repository->config->hash_initialized = 1
Later setup_git_directory_gently() (frame #7) calls
setup_git_env() --> repo_set_gitdir() --> repo_set_commondir()
and the_repository->commondir is now set to ".git".
Then run_builtin() (frame #10) calls precompose_argv_prefix()
--> precompose_string_if_needed(). Here we have
precomposed_unicode = -1
the_repository->config->hash_initialized = 1
This means git_config_check_init() does not read
.git/config (does not call repo_read_config()) even if
the_repository->commondir is set to ".git",
and precomposed_unicode is not set to 1.
So the NFD in argv is not converted to NFC,
and
% git ls-files NFD
apparently works.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-19 7:03 ` Jun. T
@ 2024-05-20 16:06 ` Torsten Bögershausen
2024-05-20 18:08 ` Junio C Hamano
0 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2024-05-20 16:06 UTC (permalink / raw)
To: Jun. T; +Cc: git
On Sun, May 19, 2024 at 04:03:03PM +0900, Jun. T wrote:
> Sorry for not responding quickly.
>
> Thank you for the patch, but it seems the problem still remains.
>
> Although
> % git ls-files NFD
> (apparently) works,
> % git ls-files NFC
> still gives the error
> (if core.precomposeunicode is not set in global config).
>
> The following is some info I got (hope it is correct and useful),
> but I have no idea how to fix the problem.
>
> precompose_string_if_needed() works only if:
> precomposed_unicode is already set to 1, or
> git_config_get_bool("core.precomposeunicode") sets it to 1.
>
> But git_config_get_bool() reads the file .git/config only if:
> the_repository->commondir is already set to ".git".
>
> Back trace when the strbuf_getcwd() is called for the
> 3rd time is (frame #4 is set_git_work_tree()):
>
> * frame #0: git`strbuf_getcwd(sb=0x00007ff7bfeff0a8) at strbuf.c:588:20
> frame #1: git`strbuf_realpath_1(resolved=0x00007ff7bfeff0a8, path=".", flags=2) at abspath.c:101:7
> frame #2: git`strbuf_realpath(resolved=0x00007ff7bfeff0a8, path=".", die_on_error=1) at abspath.c:219:9
> frame #3: git`real_pathdup(path=".", die_on_error=1) at abspath.c:240:6
> frame #4: git`repo_set_worktree(repo=0x000000010044eb98, path=".") at repository.c:145:19
> frame #5: git`set_git_work_tree(new_work_tree=".") at environment.c:278:2
> frame #6: git`setup_discovered_git_dir(gitdir=".git", cwd=0x0000000100435238, offset=16, repo_fmt=0x00007ff7bfeff1d8, nongit_ok=0x0000000000000000) at setup.c:1119:2
> frame #7: git`setup_git_directory_gently(nongit_ok=0x0000000000000000) at setup.c:1606:12
> frame #8: git`setup_git_directory at setup.c:1815:9
> frame #9: git`run_builtin(p=0x0000000100424d58, argc=2, argv=0x00007ff7bfeff6d8) at git.c:448:12
> frame #10: git`handle_builtin(argc=2, argv=0x00007ff7bfeff6d8) at git.c:729:3
> frame #11: git`run_argv(argcp=0x00007ff7bfeff54c, argv=0x00007ff7bfeff540) at git.c:793:4
> frame #12: git`cmd_main(argc=2, argv=0x00007ff7bfeff6d8) at git.c:928:19
> frame #13: git`main(argc=3, argv=0x00007ff7bfeff6d0) at common-main.c:62:11
>
> At this point, precomposed_unicode is still -1 and
> the_repository->commondir is still NULL.
> This means strbuf_getcwd() retuns NFD, and the_repository->worktree is set to NFD.
>
> Moreover, precompose_string_if_needed() calls
> git_config_get_bool("core.precomposeunicode"), and
> this function indirecly sets
> the_repository->config->hash_initialized = 1
>
> Later setup_git_directory_gently() (frame #7) calls
> setup_git_env() --> repo_set_gitdir() --> repo_set_commondir()
> and the_repository->commondir is now set to ".git".
>
> Then run_builtin() (frame #10) calls precompose_argv_prefix()
> --> precompose_string_if_needed(). Here we have
> precomposed_unicode = -1
> the_repository->config->hash_initialized = 1
> This means git_config_check_init() does not read
> .git/config (does not call repo_read_config()) even if
> the_repository->commondir is set to ".git",
> and precomposed_unicode is not set to 1.
> So the NFD in argv is not converted to NFC,
> and
> % git ls-files NFD
> apparently works.
>
Thanks so much for the detailed analysis, that is appreciated.
To be honest, I have set core.precomposeunicode true globally,
core.quotepath=false, together with settings for
pull.rebase and init.defaultbranch
Because of that, the new testcase passed, and the patch was in
improvement.
However, it would be nice to have the case fixed as well,
where core.precomposeunicode is not set at a global level.
I am happy to provide a patch (a new testcase is already there),
but for a change in the codebase I would need some help from an expert,
to get the config-reading right both for hash_initialized
(that is may be not about the hash-algorithn at all ?)
and precompose.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-20 16:06 ` Torsten Bögershausen
@ 2024-05-20 18:08 ` Junio C Hamano
2024-05-20 19:21 ` Torsten Bögershausen
0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-05-20 18:08 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: Jun. T, git
Torsten Bögershausen <tboegi@web.de> writes:
> Thanks so much for the detailed analysis, that is appreciated.
> To be honest, I have set core.precomposeunicode true globally,
> ...
> ...
> I am happy to provide a patch (a new testcase is already there),
> but for a change in the codebase I would need some help from an expert,
> to get the config-reading right both for hash_initialized
> (that is may be not about the hash-algorithn at all ?)
> and precompose.
It does not sound like an issue with the hash algorithm.
Why isn't the local config (presumably set with auto-probing when
the repository was initialized) being read? Are we reading the
core.precomposeunicode in some funny ways? Is per-worktree config
involved that is trying to read from one but the auto-probing code
is setting it to another, or something silly like that?
Thanks for working well together, both of you.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-20 18:08 ` Junio C Hamano
@ 2024-05-20 19:21 ` Torsten Bögershausen
0 siblings, 0 replies; 24+ messages in thread
From: Torsten Bögershausen @ 2024-05-20 19:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jun. T, git
On Mon, May 20, 2024 at 11:08:43AM -0700, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
> > Thanks so much for the detailed analysis, that is appreciated.
> > To be honest, I have set core.precomposeunicode true globally,
> > ...
> > ...
> > I am happy to provide a patch (a new testcase is already there),
> > but for a change in the codebase I would need some help from an expert,
> > to get the config-reading right both for hash_initialized
> > (that is may be not about the hash-algorithn at all ?)
> > and precompose.
>
> It does not sound like an issue with the hash algorithm.
>
> Why isn't the local config (presumably set with auto-probing when
> the repository was initialized) being read?
I have the same question, kind of. The callstack provided does give some
hints, but I was lost...
> Are we reading the core.precomposeunicode in some funny ways?
Not what I am aware of. But the order of initialization, when a git command
is executed, some need a worktree or .git directory, some not, is somewhat
beyond my yet expertise.
>Is per-worktree config involved that is trying to read from one but the auto-probing code
> is setting it to another, or something silly like that?
No, not to my understanding.
We have a "local" config (per repo), that is there and has the right value.
However, for some reason we miss to read the repo-config here,
when argv[] needs to be precomposed.
Reading the global config does work, but if core.precomposeunicode is
not set here, but set in the repo-config, we miss that.
>
> Thanks for working well together, both of you.
Yes, please let someone join the force, reading the callstack(s) and
try to find what is wrong here. I will try to do the same.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v3 1/1] macOS: ls-files path fails if path of workdir is NFD
[not found] <20240430032717281.IXLP.121462.mail.biglobe.ne.jp@biglobe.ne.jp>
` (2 preceding siblings ...)
2024-05-09 16:11 ` [PATCH v2 1/1] macOS: ls-files path fails if path of workdir is NFD tboegi
@ 2024-05-21 14:14 ` tboegi
2024-05-21 17:50 ` Junio C Hamano
2024-05-31 19:31 ` [PATCH v4 " tboegi
4 siblings, 1 reply; 24+ messages in thread
From: tboegi @ 2024-05-21 14:14 UTC (permalink / raw)
To: tboegi, git, takimoto-j
From: Torsten Bögershausen <tboegi@web.de>
Under macOS, `git ls-files path` does not work (gives an error)
if the absolute 'path' contains characters in NFD (decomposed).
This happens when core.precomposeunicode is true, which is the
most common case. The bug report says:
$ cd somewhere # some safe place, /tmp or ~/tmp etc.
$ mkdir $'u\xcc\x88' # ü in NFD
$ cd ü # or cd $'u\xcc\x88' or cd $'\xc3\xbc'
$ git init
$ git ls-files $'/somewhere/u\xcc\x88' # NFD
fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü'
$ git ls-files $'/somewhere/\xc3\xbc' # NFC
(the same error as above)
In the 'fatal:' error message, there are three ü;
the 1st and 2nd are in NFC, the 3rd is in NFD.
Add a test case that follows the bug report, with the simplification
that the 'ü' is replaced by an 'ä', which is already used as NFD and
NFC in t0050.
Precompose the result of getcwd(), if needed, just like all other
paths we use internally. That way, paths comparisons are all done
in NFC and we would correctly notice that the early part of the
path given as an absolute path matches the current directory.
One possible implementation would be to re-define getcwd() similar
to opendir(), readdir() and closedir(), but since there is already a
strbuf wrapper around getcwd(), and only this wrapper is used inside
the whole codebase, equip strbuf_getcwd() with a call to the newly
created function precompose_strbuf_if_needed().
Note that precompose_strbuf_if_needed() is a function under macOS,
and is a "no-op" on all other systems.
Add a missing call to precompose_string_if_needed() to this code
in setup.c :
`work_tree = precompose_string_if_needed(get_git_work_tree());`
Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
compat/precompose_utf8.c | 10 ++++++++++
compat/precompose_utf8.h | 1 +
git-compat-util.h | 1 +
setup.c | 2 +-
strbuf.c | 1 +
t/t0050-filesystem.sh | 26 ++++++++++++++++++++++++++
6 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index 0bd5c24250..5a7c90c90d 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -94,6 +94,16 @@ const char *precompose_string_if_needed(const char *in)
return in;
}
+void precompose_strbuf_if_needed(struct strbuf *sb)
+{
+ char *buf_prec = (char *)precompose_string_if_needed(sb->buf);
+ if (buf_prec != sb->buf) {
+ size_t buf_prec_len = strlen(buf_prec);
+ free(strbuf_detach(sb, NULL));
+ strbuf_attach(sb, buf_prec, buf_prec_len, buf_prec_len + 1);
+ }
+}
+
const char *precompose_argv_prefix(int argc, const char **argv, const char *prefix)
{
int i = 0;
diff --git a/compat/precompose_utf8.h b/compat/precompose_utf8.h
index fea06cf28a..7c3cfcadb0 100644
--- a/compat/precompose_utf8.h
+++ b/compat/precompose_utf8.h
@@ -30,6 +30,7 @@ typedef struct {
const char *precompose_argv_prefix(int argc, const char **argv, const char *prefix);
const char *precompose_string_if_needed(const char *in);
+void precompose_strbuf_if_needed(struct strbuf *sb);
void probe_utf8_pathname_composition(void);
PREC_DIR *precompose_utf8_opendir(const char *dirname);
diff --git a/git-compat-util.h b/git-compat-util.h
index 3e7a59b5ff..8b63108f16 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -331,6 +331,7 @@ static inline const char *precompose_string_if_needed(const char *in)
return in;
}
+#define precompose_strbuf_if_needed(a)
#define probe_utf8_pathname_composition()
#endif
diff --git a/setup.c b/setup.c
index 2e607632db..61f61496ec 100644
--- a/setup.c
+++ b/setup.c
@@ -48,7 +48,7 @@ static int abspath_part_inside_repo(char *path)
size_t wtlen;
char *path0;
int off;
- const char *work_tree = get_git_work_tree();
+ const char *work_tree = precompose_string_if_needed(get_git_work_tree());
struct strbuf realpath = STRBUF_INIT;
if (!work_tree)
diff --git a/strbuf.c b/strbuf.c
index 4c9ac6dc5e..b05581d8e7 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -569,6 +569,7 @@ int strbuf_getcwd(struct strbuf *sb)
strbuf_grow(sb, guessed_len);
if (getcwd(sb->buf, sb->alloc)) {
strbuf_setlen(sb, strlen(sb->buf));
+ precompose_strbuf_if_needed(sb);
return 0;
}
diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 325eb1c3cd..5a9ee5be92 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -156,4 +156,30 @@ test_expect_success CASE_INSENSITIVE_FS 'checkout with no pathspec and a case in
)
'
+test_expect_success 'git ls-files under NFD' '
+ (
+ mkdir -p "somewhere/$aumlcdiar" &&
+ mypwd=$PWD &&
+ cd "somewhere/$aumlcdiar" &&
+ git init &&
+ git --literal-pathspecs ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
+ >expected &&
+ test_cmp expected err
+ )
+'
+
+# Re-do the same test. Note: global core.precomposeunicode is changed
+test_expect_success 'git ls-files under NFD. global precompose false' '
+ test_when_finished "git config --global --unset core.precomposeunicode" &&
+ (
+ mypwd=$PWD &&
+ cd "somewhere/$aumlcdiar" &&
+ git config --global core.precomposeunicode false &&
+ git config core.precomposeunicode true &&
+ git --literal-pathspecs ls-files "$mypwd/somewhere/$aumlcdiar" 2>err &&
+ >expected &&
+ test_cmp expected err
+ )
+'
+
test_done
--
2.41.0.394.ge43f4fd0bd
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-21 14:14 ` [PATCH v3 " tboegi
@ 2024-05-21 17:50 ` Junio C Hamano
2024-05-21 20:57 ` Torsten Bögershausen
0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-05-21 17:50 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
tboegi@web.de writes:
> Add a missing call to precompose_string_if_needed() to this code
> in setup.c :
> `work_tree = precompose_string_if_needed(get_git_work_tree());`
This is new in this iteration, I presume? The old one did the
precompose only in strbuf_getcwd(). We now precompose also the
result of get_git_work_tree().
Two questions.
* It is unclear to me why this makes a difference only when the
precompuse configuration is set only in the local configuration.
* As the leading part of the value placed in get_git_work_tree()
comes from strbuf_getcwd() called by abspath.c:real_pathdup()
that is called by repository.c:repo_set_worktree(), doesn't this
potentially call precompse twice on the already precomposed early
parth of the get_git_work_tree() result?
I suspect that with the arrangement in your test, the argument given
to set_git_work_tree() from setup.c:setup_discovered_git_dir() is
always ".", and that dot is passed to repository.c:repo_set_worktree()
which calls abspath.c:real_pathdup() to turn it into an absolute,
where it has a call to strbuf_getcwd().
So with the provided test, I suspect there is no difference between
the previous and this iteration in behaviour, as what is fed to
precompose should be identical?
What this iteration does differently is that inside real_pathdup(),
if the string given to repo_set_worktree() is more than the trivial
".", it is appended to the result of strbuf_getcwd(), and the new
code precomposes after such appending in real_pathdup() happens. It
will convert the leading part twice [*] and more importantly the
appended part is now converted, unlike the previous one?
Side note: [*] hopefully precompose is idempotent? Relying
on that property somewhat feels yucky, though.
Puzzled...
Will replace and queue, but I couldn't figure out what is going on
with the help by the proposed log message, so...
Thanks.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-21 17:50 ` Junio C Hamano
@ 2024-05-21 20:57 ` Torsten Bögershausen
2024-05-21 22:15 ` Junio C Hamano
0 siblings, 1 reply; 24+ messages in thread
From: Torsten Bögershausen @ 2024-05-21 20:57 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, takimoto-j
On Tue, May 21, 2024 at 10:50:25AM -0700, Junio C Hamano wrote:
> tboegi@web.de writes:
>
> > Add a missing call to precompose_string_if_needed() to this code
> > in setup.c :
> > `work_tree = precompose_string_if_needed(get_git_work_tree());`
>
> This is new in this iteration, I presume? The old one did the
> precompose only in strbuf_getcwd(). We now precompose also the
> result of get_git_work_tree().
>
> Two questions.
>
> * It is unclear to me why this makes a difference only when the
> precompuse configuration is set only in the local configuration.
>
> * As the leading part of the value placed in get_git_work_tree()
> comes from strbuf_getcwd() called by abspath.c:real_pathdup()
> that is called by repository.c:repo_set_worktree(), doesn't this
> potentially call precompse twice on the already precomposed early
> parth of the get_git_work_tree() result?
>
> I suspect that with the arrangement in your test, the argument given
> to set_git_work_tree() from setup.c:setup_discovered_git_dir() is
> always ".", and that dot is passed to repository.c:repo_set_worktree()
> which calls abspath.c:real_pathdup() to turn it into an absolute,
> where it has a call to strbuf_getcwd().
>
> So with the provided test, I suspect there is no difference between
> the previous and this iteration in behaviour, as what is fed to
> precompose should be identical?
>
> What this iteration does differently is that inside real_pathdup(),
> if the string given to repo_set_worktree() is more than the trivial
> ".", it is appended to the result of strbuf_getcwd(), and the new
> code precomposes after such appending in real_pathdup() happens. It
> will convert the leading part twice [*] and more importantly the
> appended part is now converted, unlike the previous one?
>
> Side note: [*] hopefully precompose is idempotent? Relying
> on that property somewhat feels yucky, though.
>
> Puzzled...
>
> Will replace and queue, but I couldn't figure out what is going on
> with the help by the proposed log message, so...
Acknowledge.
The commit message deserves an update, for sure.
My suggestion would be too keep it in seen, until I have managed
to write a better commit message.
At the same time, I would ask Jun-ichi Takimoto to do a re-test
of the new version.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-21 20:57 ` Torsten Bögershausen
@ 2024-05-21 22:15 ` Junio C Hamano
2024-05-23 15:33 ` Jun. T
0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-05-21 22:15 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git, takimoto-j
Torsten Bögershausen <tboegi@web.de> writes:
> The commit message deserves an update, for sure.
> My suggestion would be too keep it in seen, until I have managed
> to write a better commit message.
> At the same time, I would ask Jun-ichi Takimoto to do a re-test
> of the new version.
Yup, that sounds extremely sensible. Thanks for working on this.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-21 22:15 ` Junio C Hamano
@ 2024-05-23 15:33 ` Jun. T
2024-05-25 20:01 ` Torsten Bögershausen
0 siblings, 1 reply; 24+ messages in thread
From: Jun. T @ 2024-05-23 15:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Torsten Bögershausen, git
Unfortunately v3 still doesn't work.
'git ls-files NFD' works but 'git ls-files NFC' does not.
I think it better to test both "ls-config NFD" and "ls-config NFC".
The reason of the failure seems to be the same as v2, but
I describe it here in more detail (or too detailed).
(one of?) The problem is the_repository->config->hash_initialized
is set to 1 before the_repository->commondir is set to ".git".
Due to this, .git/config is never read, and precomposed_unicode
is never set to 1 (remains -1).
run_builtin() {
setup_git_directory() {
strbuf_getcwd() { # setup.c:1542
precompose_{strbuf,string}_if_needed() {
# precomposed_unicode is still -1
git_congig_get_bool("core.precomposeunicode") {
git_config_check_init() {
repo_read_config() {
git_config_init() {
# !!!
the_repository->config->hash_initialized=1
# !!!
}
# does not read .git/config since
# the_repository->commondir is still NULL
}
}
}
returns without converting to NFC
}
returns cwd in NFD
}
setup_discovered_git_dir() {
set_git_work_tree(".") {
repo_set_worktree() {
# this function indirectly calls strbuf_getcwd()
# --> precompose_{strbuf,string}_if_needed() -->
# {git,repo}_config_get_bool("core.precomposeunicode"),
# but does not try to read .git/config since
# the_repository->config->hash_initialized
# is already set to 1 above. And it will not read
# .git/config even if hash_initialized is 0
# since the_repository->commondir is still NULL.
the_repository->worktree = NFD
}
}
}
setup_git_env() {
repo_setup_gitdir() {
repo_set_commondir() {
# finally commondir is set here
the_repository->commondir = ".git"
}
}
}
} // END setup_git_directory
precompose_argv_prefix() {
# since the_repository->config->hash_initialized is still 1
# .git/config is not read and precomposed_unicode remains -1,
# and argv (if in NFD) is not converted to NFC
}
cmd_ls_files() {
parse_pathspec(.., argv /* may be in NFD, see above */) {
init_path_spec_item() {
prefix_path_gently() {
abspath_part_inside_repo() {
work_tree = precomose_string_if_needed(
get_git_work_gree())
# get_git_work_tree() returns NFD, and
# precompose_string_if_needed() does not
# convert it to NFC since
# the_repository->config->hash_initialized is 1
worktree = NFD
returns 0 for "ls-files NFD" since both argv
and work_tree are in NFD, but returns -1 for
"ls-files NFC" since argv is in NFC.
}
returns NULL for "ls-files NFC"
}
die() at pathspec.c:499 for "ls-files NFC"
}
}
}
} END run_builtin
I don't know how to fix the problem, but I think it better to avoid
calling precompose_{strbuf,string}_if_needed() before commondir
is set to ".git" and .git/config is successfully read.
Or reset the_repository->config->hash_initialized at some point?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-23 15:33 ` Jun. T
@ 2024-05-25 20:01 ` Torsten Bögershausen
0 siblings, 0 replies; 24+ messages in thread
From: Torsten Bögershausen @ 2024-05-25 20:01 UTC (permalink / raw)
To: Jun. T; +Cc: Junio C Hamano, git
On Fri, May 24, 2024 at 12:33:08AM +0900, Jun. T wrote:
>
> Unfortunately v3 still doesn't work.
> 'git ls-files NFD' works but 'git ls-files NFC' does not.
>
> I think it better to test both "ls-config NFD" and "ls-config NFC".
>
> The reason of the failure seems to be the same as v2, but
> I describe it here in more detail (or too detailed).
Thanks for testing - I was fully convinced that the new test case
did cover all problems - but proved to be wrong.
[snip the nice analyses]
> I don't know how to fix the problem, but I think it better to avoid
> calling precompose_{strbuf,string}_if_needed() before commondir
> is set to ".git" and .git/config is successfully read.
>
> Or reset the_repository->config->hash_initialized at some point?
>
I think that I may be able to offer a v4 patch, which has better test cases.
From my understanding, the reading of the local/repo .git/config does
not work as we need it, since the .git/ directroy is determined after
the config has been read. And so it is never read.
Having said that, a patch relying on the global .gitconfig may still
be an improvement on it's own.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v4 1/1] macOS: ls-files path fails if path of workdir is NFD
[not found] <20240430032717281.IXLP.121462.mail.biglobe.ne.jp@biglobe.ne.jp>
` (3 preceding siblings ...)
2024-05-21 14:14 ` [PATCH v3 " tboegi
@ 2024-05-31 19:31 ` tboegi
2024-06-01 15:55 ` Junio C Hamano
2024-06-04 0:56 ` Jun T
4 siblings, 2 replies; 24+ messages in thread
From: tboegi @ 2024-05-31 19:31 UTC (permalink / raw)
To: tboegi, git, takimoto-j
From: Torsten Bögershausen <tboegi@web.de>
Under macOS, `git ls-files path` does not work (gives an error)
if the absolute 'path' contains characters in NFD (decomposed).
This happens when core.precomposeunicode is true, which is the
most common case. The bug report says:
$ cd somewhere # some safe place, /tmp or ~/tmp etc.
$ mkdir $'u\xcc\x88' # ü in NFD
$ cd ü # or cd $'u\xcc\x88' or cd $'\xc3\xbc'
$ git init
$ git ls-files $'/somewhere/u\xcc\x88' # NFD
fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü'
$ git ls-files $'/somewhere/\xc3\xbc' # NFC
(the same error as above)
In the 'fatal:' error message, there are three ü;
the 1st and 2nd are in NFC, the 3rd is in NFD.
Add test cases that follows the bug report, with the simplification
that the 'ü' is replaced by an 'ä', which is already used as NFD and
NFC in t3910.
The solution is to add a call to precompose_string_if_needed()
to this code in setup.c :
`work_tree = precompose_string_if_needed(get_git_work_tree());`
There is, however, a limitation with this very usage of Git:
The (repo) local .gitconfig file is not used, only the global
"core.precomposeunicode" is taken into account, if it is set (or not).
To set it to true is a good recommendation anyway, and here is the
analyzes from Jun T :
The problem is the_repository->config->hash_initialized
is set to 1 before the_repository->commondir is set to ".git".
Due to this, .git/config is never read, and precomposed_unicode
is never set to 1 (remains -1).
run_builtin() {
setup_git_directory() {
strbuf_getcwd() { # setup.c:1542
precompose_{strbuf,string}_if_needed() {
# precomposed_unicode is still -1
git_congig_get_bool("core.precomposeunicode") {
git_config_check_init() {
repo_read_config() {
git_config_init() {
# !!!
the_repository->config->hash_initialized=1
# !!!
}
# does not read .git/config since
# the_repository->commondir is still NULL
}
}
}
returns without converting to NFC
}
returns cwd in NFD
}
setup_discovered_git_dir() {
set_git_work_tree(".") {
repo_set_worktree() {
# this function indirectly calls strbuf_getcwd()
# --> precompose_{strbuf,string}_if_needed() -->
# {git,repo}_config_get_bool("core.precomposeunicode"),
# but does not try to read .git/config since
# the_repository->config->hash_initialized
# is already set to 1 above. And it will not read
# .git/config even if hash_initialized is 0
# since the_repository->commondir is still NULL.
the_repository->worktree = NFD
}
}
}
setup_git_env() {
repo_setup_gitdir() {
repo_set_commondir() {
# finally commondir is set here
the_repository->commondir = ".git"
}
}
}
} // END setup_git_directory
Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
setup.c | 2 +-
t/t3910-mac-os-precompose.sh | 39 +++++++++++++++++++++++++++++++++++-
2 files changed, 39 insertions(+), 2 deletions(-)
diff --git a/setup.c b/setup.c
index 2e607632db..61f61496ec 100644
--- a/setup.c
+++ b/setup.c
@@ -48,7 +48,7 @@ static int abspath_part_inside_repo(char *path)
size_t wtlen;
char *path0;
int off;
- const char *work_tree = get_git_work_tree();
+ const char *work_tree = precompose_string_if_needed(get_git_work_tree());
struct strbuf realpath = STRBUF_INIT;
if (!work_tree)
diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh
index 898267a6bd..6d5918c8fe 100755
--- a/t/t3910-mac-os-precompose.sh
+++ b/t/t3910-mac-os-precompose.sh
@@ -37,6 +37,27 @@ Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #50 Byte
Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #250 Byte
Alongc=$Alongc$AEligatu$AEligatu #254 Byte
+
+ls_files_nfc_nfd () {
+ test_when_finished "git config --global --unset core.precomposeunicode" &&
+ prglbl=$1
+ prlocl=$2
+ aumlcreat=$3
+ aumllist=$4
+ git config --global core.precomposeunicode $prglbl &&
+ (
+ rm -rf .git &&
+ mkdir -p "somewhere/$prglbl/$prlocl/$aumlcreat" &&
+ mypwd=$PWD &&
+ cd "somewhere/$prglbl/$prlocl/$aumlcreat" &&
+ git init &&
+ git config core.precomposeunicode $prlocl &&
+ git --literal-pathspecs ls-files "$mypwd/somewhere/$prglbl/$prlocl/$aumllist" 2>err &&
+ >expected &&
+ test_cmp expected err
+ )
+}
+
test_expect_success "detect if nfd needed" '
precomposeunicode=$(git config core.precomposeunicode) &&
test "$precomposeunicode" = true &&
@@ -211,8 +232,8 @@ test_expect_success "unicode decomposed: git restore -p . " '
'
# Test if the global core.precomposeunicode stops autosensing
-# Must be the last test case
test_expect_success "respect git config --global core.precomposeunicode" '
+ test_when_finished "git config --global --unset core.precomposeunicode" &&
git config --global core.precomposeunicode true &&
rm -rf .git &&
git init &&
@@ -220,4 +241,20 @@ test_expect_success "respect git config --global core.precomposeunicode" '
test "$precomposeunicode" = "true"
'
+test_expect_success "ls-files false false nfd nfd" '
+ ls_files_nfc_nfd false false $Adiarnfd $Adiarnfd
+'
+
+test_expect_success "ls-files false true nfd nfd" '
+ ls_files_nfc_nfd false true $Adiarnfd $Adiarnfd
+'
+
+test_expect_success "ls-files true false nfd nfd" '
+ ls_files_nfc_nfd true false $Adiarnfd $Adiarnfd
+'
+
+test_expect_success "ls-files true true nfd nfd" '
+ ls_files_nfc_nfd true true $Adiarnfd $Adiarnfd
+'
+
test_done
--
2.41.0.394.ge43f4fd0bd
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v4 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-31 19:31 ` [PATCH v4 " tboegi
@ 2024-06-01 15:55 ` Junio C Hamano
2024-06-02 19:40 ` Torsten Bögershausen
2024-06-04 0:56 ` Jun T
1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2024-06-01 15:55 UTC (permalink / raw)
To: tboegi; +Cc: git, takimoto-j
tboegi@web.de writes:
> The problem is the_repository->config->hash_initialized
> is set to 1 before the_repository->commondir is set to ".git".
> Due to this, .git/config is never read, and precomposed_unicode
> is never set to 1 (remains -1).
The "is never read" part is a bit confusing and misleading. If it
were
At the point of code flow where we would want to learn the value
of precompose configuration, the local configuration has not
been read.
then I would understand it, though.
Is the analysis telling us that we need to rethink the order of
things setup_git_directory() does? Or is this inherently unsolvable
because we need to discover the .git/ directory and path to it
before we can read configuration to learn from it, but we need the
value of precompose setting to compute the "path to it"?
Presumably chdir() done in setup_discovered_git_dir() can be done
with either NFC or NFD if the filesystem is squashing the
differences between them, so perhaps doing the repo_set_worktree()
done in setup_discovered_git_dir() is wrong, and we could delay
populating the .worktree member until much later? For reading the
local config, it does matter we know where the git-dir and
common-dir are, but the location of worktree is immaterial.
Anyway, thanks for a patch. Will queue.
> run_builtin() {
> setup_git_directory() {
> strbuf_getcwd() { # setup.c:1542
> precompose_{strbuf,string}_if_needed() {
> # precomposed_unicode is still -1
> git_congig_get_bool("core.precomposeunicode") {
> git_config_check_init() {
> repo_read_config() {
> git_config_init() {
> # !!!
> the_repository->config->hash_initialized=1
> # !!!
> }
> # does not read .git/config since
> # the_repository->commondir is still NULL
> }
> }
> }
> returns without converting to NFC
> }
> returns cwd in NFD
> }
>
> setup_discovered_git_dir() {
> set_git_work_tree(".") {
> repo_set_worktree() {
> # this function indirectly calls strbuf_getcwd()
> # --> precompose_{strbuf,string}_if_needed() -->
> # {git,repo}_config_get_bool("core.precomposeunicode"),
> # but does not try to read .git/config since
> # the_repository->config->hash_initialized
> # is already set to 1 above. And it will not read
> # .git/config even if hash_initialized is 0
> # since the_repository->commondir is still NULL.
>
> the_repository->worktree = NFD
> }
> }
> }
>
> setup_git_env() {
> repo_setup_gitdir() {
> repo_set_commondir() {
> # finally commondir is set here
> the_repository->commondir = ".git"
> }
> }
> }
>
> } // END setup_git_directory
>
> Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> setup.c | 2 +-
> t/t3910-mac-os-precompose.sh | 39 +++++++++++++++++++++++++++++++++++-
> 2 files changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/setup.c b/setup.c
> index 2e607632db..61f61496ec 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -48,7 +48,7 @@ static int abspath_part_inside_repo(char *path)
> size_t wtlen;
> char *path0;
> int off;
> - const char *work_tree = get_git_work_tree();
> + const char *work_tree = precompose_string_if_needed(get_git_work_tree());
> struct strbuf realpath = STRBUF_INIT;
>
> if (!work_tree)
> diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh
> index 898267a6bd..6d5918c8fe 100755
> --- a/t/t3910-mac-os-precompose.sh
> +++ b/t/t3910-mac-os-precompose.sh
> @@ -37,6 +37,27 @@ Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #50 Byte
> Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #250 Byte
> Alongc=$Alongc$AEligatu$AEligatu #254 Byte
>
> +
> +ls_files_nfc_nfd () {
> + test_when_finished "git config --global --unset core.precomposeunicode" &&
> + prglbl=$1
> + prlocl=$2
> + aumlcreat=$3
> + aumllist=$4
> + git config --global core.precomposeunicode $prglbl &&
> + (
> + rm -rf .git &&
> + mkdir -p "somewhere/$prglbl/$prlocl/$aumlcreat" &&
> + mypwd=$PWD &&
> + cd "somewhere/$prglbl/$prlocl/$aumlcreat" &&
> + git init &&
> + git config core.precomposeunicode $prlocl &&
> + git --literal-pathspecs ls-files "$mypwd/somewhere/$prglbl/$prlocl/$aumllist" 2>err &&
> + >expected &&
> + test_cmp expected err
> + )
> +}
> +
> test_expect_success "detect if nfd needed" '
> precomposeunicode=$(git config core.precomposeunicode) &&
> test "$precomposeunicode" = true &&
> @@ -211,8 +232,8 @@ test_expect_success "unicode decomposed: git restore -p . " '
> '
>
> # Test if the global core.precomposeunicode stops autosensing
> -# Must be the last test case
> test_expect_success "respect git config --global core.precomposeunicode" '
> + test_when_finished "git config --global --unset core.precomposeunicode" &&
> git config --global core.precomposeunicode true &&
> rm -rf .git &&
> git init &&
> @@ -220,4 +241,20 @@ test_expect_success "respect git config --global core.precomposeunicode" '
> test "$precomposeunicode" = "true"
> '
>
> +test_expect_success "ls-files false false nfd nfd" '
> + ls_files_nfc_nfd false false $Adiarnfd $Adiarnfd
> +'
> +
> +test_expect_success "ls-files false true nfd nfd" '
> + ls_files_nfc_nfd false true $Adiarnfd $Adiarnfd
> +'
> +
> +test_expect_success "ls-files true false nfd nfd" '
> + ls_files_nfc_nfd true false $Adiarnfd $Adiarnfd
> +'
> +
> +test_expect_success "ls-files true true nfd nfd" '
> + ls_files_nfc_nfd true true $Adiarnfd $Adiarnfd
> +'
> +
> test_done
> --
> 2.41.0.394.ge43f4fd0bd
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v4 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-06-01 15:55 ` Junio C Hamano
@ 2024-06-02 19:40 ` Torsten Bögershausen
0 siblings, 0 replies; 24+ messages in thread
From: Torsten Bögershausen @ 2024-06-02 19:40 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, takimoto-j
On Sat, Jun 01, 2024 at 08:55:03AM -0700, Junio C Hamano wrote:
> tboegi@web.de writes:
>
> > The problem is the_repository->config->hash_initialized
> > is set to 1 before the_repository->commondir is set to ".git".
> > Due to this, .git/config is never read, and precomposed_unicode
> > is never set to 1 (remains -1).
>
> The "is never read" part is a bit confusing and misleading. If it
> were
>
> At the point of code flow where we would want to learn the value
> of precompose configuration, the local configuration has not
> been read.
>
> then I would understand it, though.
Yes, the thing is that it has not been read, and will never been read,
in this very very usage of Git.
>
> Is the analysis telling us that we need to rethink the order of
> things setup_git_directory() does? Or is this inherently unsolvable
> because we need to discover the .git/ directory and path to it
> before we can read configuration to learn from it, but we need the
> value of precompose setting to compute the "path to it"?
I think it is solvable, see below.
>
> Presumably chdir() done in setup_discovered_git_dir() can be done
> with either NFC or NFD if the filesystem is squashing the
> differences between them, so perhaps doing the repo_set_worktree()
> done in setup_discovered_git_dir() is wrong, and we could delay
> populating the .worktree member until much later? For reading the
> local config, it does matter we know where the git-dir and
> common-dir are, but the location of worktree is immaterial.
>
> Anyway, thanks for a patch. Will queue.
My understanding is that detecting the .git/ dir and then reading
the config file from here does not work here.
That may be better documented in this very commit message,
please feel free to ammend it.
The root cause may be fixed in a different commit later.
[snip]
The best info that I have at the moment is the call stack analysis
done by Jun. T, where the global core.precomposeunicode is not set,
and reading the local one "fails".
I hope this is a useful repetition:
The following is some info I got (hope it is correct and useful),
but I have no idea how to fix the problem.
precompose_string_if_needed() works only if:
precomposed_unicode is already set to 1, or
git_config_get_bool("core.precomposeunicode") sets it to 1.
But git_config_get_bool() reads the file .git/config only if:
the_repository->commondir is already set to ".git".
Back trace when the strbuf_getcwd() is called for the
3rd time is (frame #4 is set_git_work_tree()):
* frame #0: git`strbuf_getcwd(sb=0x00007ff7bfeff0a8) at strbuf.c:588:20
frame #1: git`strbuf_realpath_1(resolved=0x00007ff7bfeff0a8, path=".", flags=2) at abspath.c:101:7
frame #2: git`strbuf_realpath(resolved=0x00007ff7bfeff0a8, path=".", die_on_error=1) at abspath.c:219:9
frame #3: git`real_pathdup(path=".", die_on_error=1) at abspath.c:240:6
frame #4: git`repo_set_worktree(repo=0x000000010044eb98, path=".") at repository.c:145:19
frame #5: git`set_git_work_tree(new_work_tree=".") at environment.c:278:2
frame #6: git`setup_discovered_git_dir(gitdir=".git", cwd=0x0000000100435238, offset=16, repo_fmt=0x00007ff7bfeff1d8, nongit_ok=0x0000000000000000) at setup.c:1119:2
frame #7: git`setup_git_directory_gently(nongit_ok=0x0000000000000000) at setup.c:1606:12
frame #8: git`setup_git_directory at setup.c:1815:9
frame #9: git`run_builtin(p=0x0000000100424d58, argc=2, argv=0x00007ff7bfeff6d8) at git.c:448:12
frame #10: git`handle_builtin(argc=2, argv=0x00007ff7bfeff6d8) at git.c:729:3
frame #11: git`run_argv(argcp=0x00007ff7bfeff54c, argv=0x00007ff7bfeff540) at git.c:793:4
frame #12: git`cmd_main(argc=2, argv=0x00007ff7bfeff6d8) at git.c:928:19
frame #13: git`main(argc=3, argv=0x00007ff7bfeff6d0) at common-main.c:62:11
At this point, precomposed_unicode is still -1 and
the_repository->commondir is still NULL.
This means strbuf_getcwd() retuns NFD, and the_repository->worktree is set to NFD.
Moreover, precompose_string_if_needed() calls
git_config_get_bool("core.precomposeunicode"), and
this function indirecly sets
the_repository->config->hash_initialized = 1
Later setup_git_directory_gently() (frame #7) calls
setup_git_env() --> repo_set_gitdir() --> repo_set_commondir()
and the_repository->commondir is now set to ".git".
Then run_builtin() (frame #10) calls precompose_argv_prefix()
--> precompose_string_if_needed(). Here we have
precomposed_unicode = -1
the_repository->config->hash_initialized = 1
This means git_config_check_init() does not read
.git/config (does not call repo_read_config()) even if
the_repository->commondir is set to ".git",
and precomposed_unicode is not set to 1.
[snip]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v4 1/1] macOS: ls-files path fails if path of workdir is NFD
2024-05-31 19:31 ` [PATCH v4 " tboegi
2024-06-01 15:55 ` Junio C Hamano
@ 2024-06-04 0:56 ` Jun T
1 sibling, 0 replies; 24+ messages in thread
From: Jun T @ 2024-06-04 0:56 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
> 2024/06/01 4:3, tboegi@web.de <mailto:tboegi@web.de> wrote:
>
> The solution is to add a call to precompose_string_if_needed()
> to this code in setup.c :
> `work_tree = precompose_string_if_needed(get_git_work_tree());`
This simple patch works for both 'ls-files NFD' and 'ls-files NFC'.
>
> There is, however, a limitation with this very usage of Git:
> The (repo) local .gitconfig file is not used,
core.precomposeunicode in .git/config is read, in function
precompose_argv_prefix(), and NFD in argv is converted to NFC.
But, as you know, the variable the_repository->worktree or
the return value of get_git_work_tree() is in NFD.
> 2024/06/03 4:40, Torsten Bögershausen <tboegi@web.de> wrote:
>
> The root cause may be fixed in a different commit later.
Of course fixing the 'root cause' is better, but I don't
know whether it's easy or not.
Anyway, thank you for woking on this problem.
--
Jun
^ permalink raw reply [flat|nested] 24+ messages in thread