* git diff does not precompose unicode file paths (OS X)
@ 2016-03-04 9:07 Alexander Rinass
2016-03-04 12:16 ` Torsten Bögershausen
0 siblings, 1 reply; 11+ messages in thread
From: Alexander Rinass @ 2016-03-04 9:07 UTC (permalink / raw)
To: git
Hallo,
It appears that the git diff command does not precompose file path arguments, even if the option core.precomposeunicode is set to true (which is the default on OS X).
Passing the decomposed form of a file path to the git diff command will yield no diff for a modified file.
In my case, the decomposed form of the file path is sent by the OS X Cocoa framework's NSTask, wich I am using in an application. It can be simulated on OS X by using $(iconv -f utf-8 -t utf-8-mac <<< FILE_PATH) as file path argument on the shell.
Git commands like add, log, ls-tree, ls-files, mv, ... accept both file path forms, git diff does not.
It can be tested with the following setup on OS X (as iconv's utf-8-mac encoding is only available on OS X):
git init .
git config core.quotepath true
git config core.precomposeunicode true # (default on OS X)
touch .gitignore && git add .gitignore && git commit -m "Initial commit"
echo "." >> Ä
git add Ä
git commit -m "Create commit with unicode file path"
echo "." >> Ä
This gives the following status, showing the precomposed form of "Ä":
git status --short
M "\303\204"
Running git add with both forms does work as expected:
git add Ä
git status --short
M "\303\204"
git reset HEAD -- Ä
git add $(iconv -f utf-8 -t utf-8-mac <<< Ä)
git status --short
M "\303\204"
git reset HEAD -- Ä
However, running git diff only works with the precomposed form:
git status --short
M "\303\204"
git --no-pager diff -- Ä
[...shows diff...]
git --no-pager diff -- $(iconv -f utf-8 -t utf-8-mac <<< Ä)
[...shows NO diff...]
I took a look at the Git source code, and the builtin/diff*.c do not contain the parse_options call (which does the precompose_argv call) that the other builtins use.
But I am not really familiar with either C or the Git project structure, so this may not mean anything.
Best regards,
Alexander Rinass
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-04 9:07 git diff does not precompose unicode file paths (OS X) Alexander Rinass
@ 2016-03-04 12:16 ` Torsten Bögershausen
2016-03-04 14:37 ` Alexander Rinass
0 siblings, 1 reply; 11+ messages in thread
From: Torsten Bögershausen @ 2016-03-04 12:16 UTC (permalink / raw)
To: Alexander Rinass, git
On 03/04/2016 10:07 AM, Alexander Rinass wrote:
> Hallo,
>
> It appears that the git diff command does not precompose file path arguments, even if the option core.precomposeunicode is set to true (which is the default on OS X).
>
> Passing the decomposed form of a file path to the git diff command will yield no diff for a modified file.
>
> In my case, the decomposed form of the file path is sent by the OS X Cocoa framework's NSTask, wich I am using in an application. It can be simulated on OS X by using $(iconv -f utf-8 -t utf-8-mac <<< FILE_PATH) as file path argument on the shell.
>
> Git commands like add, log, ls-tree, ls-files, mv, ... accept both file path forms, git diff does not.
>
> It can be tested with the following setup on OS X (as iconv's utf-8-mac encoding is only available on OS X):
>
> git init .
> git config core.quotepath true
> git config core.precomposeunicode true # (default on OS X)
> touch .gitignore && git add .gitignore && git commit -m "Initial commit"
>
> echo "." >> Ä
> git add Ä
> git commit -m "Create commit with unicode file path"
>
> echo "." >> Ä
>
> This gives the following status, showing the precomposed form of "Ä":
>
> git status --short
> M "\303\204"
>
> Running git add with both forms does work as expected:
>
> git add Ä
> git status --short
> M "\303\204"
>
> git reset HEAD -- Ä
>
> git add $(iconv -f utf-8 -t utf-8-mac <<< Ä)
> git status --short
> M "\303\204"
>
> git reset HEAD -- Ä
>
> However, running git diff only works with the precomposed form:
>
> git status --short
> M "\303\204"
>
> git --no-pager diff -- Ä
> [...shows diff...]
>
> git --no-pager diff -- $(iconv -f utf-8 -t utf-8-mac <<< Ä)
> [...shows NO diff...]
>
> I took a look at the Git source code, and the builtin/diff*.c do not contain the parse_options call (which does the precompose_argv call) that the other builtins use.
>
> But I am not really familiar with either C or the Git project structure, so this may not mean anything.
>
> Best regards,
> Alexander Rinass
>
Good analyzes, and thanks for the report.
It should be possible to stick in a
precompose_arrgv(argc, argv)
into builtin/diff.c
Do you you can test that ?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-04 12:16 ` Torsten Bögershausen
@ 2016-03-04 14:37 ` Alexander Rinass
2016-03-04 18:36 ` Junio C Hamano
2016-03-04 18:49 ` Ramsay Jones
0 siblings, 2 replies; 11+ messages in thread
From: Alexander Rinass @ 2016-03-04 14:37 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
> On 04 Mar 2016, at 13:16, Torsten Bögershausen <tboegi@web.de> wrote:
>
> On 03/04/2016 10:07 AM, Alexander Rinass wrote:
>> Hallo,
>>
>> It appears that the git diff command does not precompose file path arguments, even if the option core.precomposeunicode is set to true (which is the default on OS X).
>>
>> Passing the decomposed form of a file path to the git diff command will yield no diff for a modified file.
>>
>> In my case, the decomposed form of the file path is sent by the OS X Cocoa framework's NSTask, wich I am using in an application. It can be simulated on OS X by using $(iconv -f utf-8 -t utf-8-mac <<< FILE_PATH) as file path argument on the shell.
>>
>> Git commands like add, log, ls-tree, ls-files, mv, ... accept both file path forms, git diff does not.
>>
>> It can be tested with the following setup on OS X (as iconv's utf-8-mac encoding is only available on OS X):
>>
>> git init .
>> git config core.quotepath true
>> git config core.precomposeunicode true # (default on OS X)
>> touch .gitignore && git add .gitignore && git commit -m "Initial commit"
>> echo "." >> Ä
>> git add Ä
>> git commit -m "Create commit with unicode file path"
>> echo "." >> Ä
>> This gives the following status, showing the precomposed form of "Ä":
>>
>> git status --short
>> M "\303\204"
>> Running git add with both forms does work as expected:
>>
>> git add Ä
>> git status --short
>> M "\303\204"
>> git reset HEAD -- Ä
>> git add $(iconv -f utf-8 -t utf-8-mac <<< Ä)
>> git status --short
>> M "\303\204"
>> git reset HEAD -- Ä
>> However, running git diff only works with the precomposed form:
>>
>> git status --short
>> M "\303\204"
>> git --no-pager diff -- Ä
>> [...shows diff...]
>> git --no-pager diff -- $(iconv -f utf-8 -t utf-8-mac <<< Ä)
>> [...shows NO diff...]
>>
>> I took a look at the Git source code, and the builtin/diff*.c do not contain the parse_options call (which does the precompose_argv call) that the other builtins use.
>>
>> But I am not really familiar with either C or the Git project structure, so this may not mean anything.
>>
>> Best regards,
>> Alexander Rinass
>>
> Good analyzes, and thanks for the report.
> It should be possible to stick in a
>
> precompose_arrgv(argc, argv)
>
> into builtin/diff.c
>
> Do you you can test that ?
>
Sticking a precompose_argv(argc, argv) into diff.c’s cmd_diff function fixes the issue.
But I had to disable the check (precomposed_unicode != 1) in precompose_argv to make it work. That’s probably because precompose_argv is usually called from parse_options and is missing some other call before it?
I think it is clear that diff.c and friends are definitely missing the precomposing step. I am not sure about the right way to fix though (should parse_options be used in the end?) and my C skills are basic at best, otherwise I would create a patch.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-04 14:37 ` Alexander Rinass
@ 2016-03-04 18:36 ` Junio C Hamano
2016-03-04 18:49 ` Ramsay Jones
1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2016-03-04 18:36 UTC (permalink / raw)
To: Alexander Rinass; +Cc: Torsten Bögershausen, git
Alexander Rinass <alex@fournova.com> writes:
> Sticking a precompose_argv(argc, argv) into diff.c’s cmd_diff
> function fixes the issue.
>
> But I had to disable the check (precomposed_unicode != 1) in
> precompose_argv to make it work. That’s probably because
> precompose_argv is usually called from parse_options and is
> missing some other call before it?
The "precomposed_unicode" bit comes from your configuration file,
so it won't be usable before you call into git_default_core_config
and that happens via a call to "git_config(git_diff_ui_config, NULL)".
So perhaps you'd want to add the argv munging immediately before
init_revisions() call there?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-04 14:37 ` Alexander Rinass
2016-03-04 18:36 ` Junio C Hamano
@ 2016-03-04 18:49 ` Ramsay Jones
2016-03-07 7:47 ` Alexander Rinass
1 sibling, 1 reply; 11+ messages in thread
From: Ramsay Jones @ 2016-03-04 18:49 UTC (permalink / raw)
To: Alexander Rinass, Torsten Bögershausen; +Cc: git
On 04/03/16 14:37, Alexander Rinass wrote:
>
>> On 04 Mar 2016, at 13:16, Torsten Bögershausen <tboegi@web.de> wrote:
>>
>> On 03/04/2016 10:07 AM, Alexander Rinass wrote:
[snip]
>
> Sticking a precompose_argv(argc, argv) into diff.c’s cmd_diff function fixes the issue.
>
> But I had to disable the check (precomposed_unicode != 1) in precompose_argv to make it work. That’s probably because precompose_argv is usually called from parse_options and is missing some other call before it?
>
Yes, you need to place it after the configuration is read, but before
calls to diff_no_index() or setup_revisions(). Directly after the call
to git_config() should be fine. [But this begs the question about other
commands, including plumbing, which don't call parse_options().]
Maybe this will work for you (I can't test it, since I don't have any
access to a Mac):
diff --git a/builtin/diff.c b/builtin/diff.c
index 343c6b8..b7a9405 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -320,6 +320,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
gitmodules_config();
init_diff_ui_defaults();
git_config(git_diff_ui_config, NULL);
+ precompose_argv(argc, argv);
init_revisions(&rev, prefix);
ATB,
Ramsay Jones
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-04 18:49 ` Ramsay Jones
@ 2016-03-07 7:47 ` Alexander Rinass
2016-03-07 8:54 ` Torsten Bögershausen
0 siblings, 1 reply; 11+ messages in thread
From: Alexander Rinass @ 2016-03-07 7:47 UTC (permalink / raw)
To: Ramsay Jones; +Cc: Torsten Bögershausen, git
> On 04 Mar 2016, at 19:49, Ramsay Jones <ramsay@ramsayjones.plus.com> wrote:
>
>
>
> On 04/03/16 14:37, Alexander Rinass wrote:
>>
>>> On 04 Mar 2016, at 13:16, Torsten Bögershausen <tboegi@web.de> wrote:
>>>
>>> On 03/04/2016 10:07 AM, Alexander Rinass wrote:
> [snip]
>
>>
>> Sticking a precompose_argv(argc, argv) into diff.c’s cmd_diff function fixes the issue.
>>
>> But I had to disable the check (precomposed_unicode != 1) in precompose_argv to make it work. That’s probably because precompose_argv is usually called from parse_options and is missing some other call before it?
>>
>
> Yes, you need to place it after the configuration is read, but before
> calls to diff_no_index() or setup_revisions(). Directly after the call
> to git_config() should be fine. [But this begs the question about other
> commands, including plumbing, which don't call parse_options().]
>
> Maybe this will work for you (I can't test it, since I don't have any
> access to a Mac):
>
> diff --git a/builtin/diff.c b/builtin/diff.c
> index 343c6b8..b7a9405 100644
> --- a/builtin/diff.c
> +++ b/builtin/diff.c
> @@ -320,6 +320,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> gitmodules_config();
> init_diff_ui_defaults();
> git_config(git_diff_ui_config, NULL);
> + precompose_argv(argc, argv);
>
> init_revisions(&rev, prefix);
Your patch fixes it for the diff command without further modifications.
I have also modified diff-tree, diff-index and diff-files by adding the precompose_argv call and successfully verified it.
I have attached the full patch. If there is anything else I can test, let me know.
diff --git a/builtin/diff-files.c b/builtin/diff-files.c
index 8ed2eb8..15c61fd 100644
--- a/builtin/diff-files.c
+++ b/builtin/diff-files.c
@@ -24,6 +24,7 @@ int cmd_diff_files(int argc, const char **argv, const char *prefix)
gitmodules_config();
git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
rev.abbrev = 0;
+ precompose_argv(argc, argv);
argc = setup_revisions(argc, argv, &rev, NULL);
while (1 < argc && argv[1][0] == '-') {
diff --git a/builtin/diff-index.c b/builtin/diff-index.c
index d979824..1af373d 100644
--- a/builtin/diff-index.c
+++ b/builtin/diff-index.c
@@ -21,6 +21,7 @@ int cmd_diff_index(int argc, const char **argv, const char *prefix)
gitmodules_config();
git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
rev.abbrev = 0;
+ precompose_argv(argc, argv);
argc = setup_revisions(argc, argv, &rev, NULL);
for (i = 1; i < argc; i++) {
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index 2a12b81..806dd7a 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -114,6 +114,8 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
opt->disable_stdin = 1;
memset(&s_r_opt, 0, sizeof(s_r_opt));
s_r_opt.tweak = diff_tree_tweak_rev;
+
+ precompose_argv(argc, argv);
argc = setup_revisions(argc, argv, opt, &s_r_opt);
while (--argc > 0) {
diff --git a/builtin/diff.c b/builtin/diff.c
index 52c98a9..d6b8f98 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -319,6 +319,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
if (!no_index)
gitmodules_config();
git_config(git_diff_ui_config, NULL);
+ precompose_argv(argc, argv);
init_revisions(&rev, prefix);
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-07 7:47 ` Alexander Rinass
@ 2016-03-07 8:54 ` Torsten Bögershausen
[not found] ` <5C6A30EF-ED0A-4D64-B971-CF873C64B46E@fournova.com>
0 siblings, 1 reply; 11+ messages in thread
From: Torsten Bögershausen @ 2016-03-07 8:54 UTC (permalink / raw)
To: Alexander Rinass, Ramsay Jones; +Cc: Torsten Bögershausen, git
On 03/07/2016 08:47 AM, Alexander Rinass wrote:
>> On 04 Mar 2016, at 19:49, Ramsay Jones <ramsay@ramsayjones.plus.com> wrote:
>>
>>
>>
>> On 04/03/16 14:37, Alexander Rinass wrote:
>>>> On 04 Mar 2016, at 13:16, Torsten Bögershausen <tboegi@web.de> wrote:
>>>>
>>>> On 03/04/2016 10:07 AM, Alexander Rinass wrote:
>> [snip]
>>
>>> Sticking a precompose_argv(argc, argv) into diff.c’s cmd_diff function fixes the issue.
>>>
>>> But I had to disable the check (precomposed_unicode != 1) in precompose_argv to make it work. That’s probably because precompose_argv is usually called from parse_options and is missing some other call before it?
>>>
>> Yes, you need to place it after the configuration is read, but before
>> calls to diff_no_index() or setup_revisions(). Directly after the call
>> to git_config() should be fine. [But this begs the question about other
>> commands, including plumbing, which don't call parse_options().]
>>
>> Maybe this will work for you (I can't test it, since I don't have any
>> access to a Mac):
>>
>> diff --git a/builtin/diff.c b/builtin/diff.c
>> index 343c6b8..b7a9405 100644
>> --- a/builtin/diff.c
>> +++ b/builtin/diff.c
>> @@ -320,6 +320,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>> gitmodules_config();
>> init_diff_ui_defaults();
>> git_config(git_diff_ui_config, NULL);
>> + precompose_argv(argc, argv);
>>
>> init_revisions(&rev, prefix);
> Your patch fixes it for the diff command without further modifications.
>
> I have also modified diff-tree, diff-index and diff-files by adding the precompose_argv call and successfully verified it.
>
> I have attached the full patch. If there is anything else I can test, let me know.
Thanks for reporting, fixing, testing -
Do you think you can send an official patch to the list ?
If not, push your patches to some public repo ?
And if not, I can put it on my TODO-stack.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
[not found] ` <5C6A30EF-ED0A-4D64-B971-CF873C64B46E@fournova.com>
@ 2016-03-08 12:30 ` Torsten Bögershausen
2016-03-14 21:45 ` Alexander Rinass
0 siblings, 1 reply; 11+ messages in thread
From: Torsten Bögershausen @ 2016-03-08 12:30 UTC (permalink / raw)
To: Alexander Rinass; +Cc: Ramsay Jones, git
>> And if not, I can put it on my TODO-stack.
> I have read through the official contribution guidelines and I think I can
> send an official patch.
>
> In this case, would you prefer to have a single commit since the change
> is related? Or would you prefer keeping it in separate commits, since
> they are different commands and I can use commit subjects like “diff:”
> and “diff-index:”, etc.?
>
Thanks for the work.
The same issue fixed at different places:
I personally would prefer a single commit.
Another thing is, if we want to add TC in t3910,
to avoid future regressions.
(Otherwise I can help with those)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-08 12:30 ` Torsten Bögershausen
@ 2016-03-14 21:45 ` Alexander Rinass
2016-03-15 5:45 ` Torsten Bögershausen
0 siblings, 1 reply; 11+ messages in thread
From: Alexander Rinass @ 2016-03-14 21:45 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: Ramsay Jones, git
> On 08 Mar 2016, at 13:30, Torsten Bögershausen <tboegi@web.de> wrote:
>
>>> And if not, I can put it on my TODO-stack.
>> I have read through the official contribution guidelines and I think I can
>> send an official patch.
>>
>> In this case, would you prefer to have a single commit since the change
>> is related? Or would you prefer keeping it in separate commits, since
>> they are different commands and I can use commit subjects like “diff:”
>> and “diff-index:”, etc.?
>>
> Thanks for the work.
> The same issue fixed at different places:
> I personally would prefer a single commit.
>
> Another thing is, if we want to add TC in t3910,
> to avoid future regressions.
> (Otherwise I can help with those)
I created a test case but git diff exits with 0 if it does not recognize the file
path so the test case always succeeds. Can you give me a hint or one
example test case?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-14 21:45 ` Alexander Rinass
@ 2016-03-15 5:45 ` Torsten Bögershausen
2016-04-04 20:43 ` Alexander Rinass
0 siblings, 1 reply; 11+ messages in thread
From: Torsten Bögershausen @ 2016-03-15 5:45 UTC (permalink / raw)
To: Alexander Rinass, Torsten Bögershausen; +Cc: Ramsay Jones, git
>I created a test case but git diff exits with 0 if it does not recognize the
file >path so the test case always succeeds. Can you give me a hint or one
>example test case?
The most clean (?) is to compare "git diff" NFC and git diff NFD, they should
give the same result:
for "git diff" something like this would do:
+
+# This will test git diff
+test_expect_success "git diff f.Adiar" '
+ echo "Modified" >f.$Adiarnfd &&
+ git diff f.$Adiarnfd >expect &&
+ git diff f.$Adiarnfc >actual &&
+ git checkout f.$Adiarnfd &&
+ test_cmp expect actual
+'
HTH
/Torsten
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git diff does not precompose unicode file paths (OS X)
2016-03-15 5:45 ` Torsten Bögershausen
@ 2016-04-04 20:43 ` Alexander Rinass
0 siblings, 0 replies; 11+ messages in thread
From: Alexander Rinass @ 2016-04-04 20:43 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: Ramsay Jones, git
> On 15 Mar 2016, at 06:45, Torsten Bögershausen <tboegi@web.de> wrote:
>
> >I created a test case but git diff exits with 0 if it does not recognize the file >path so the test case always succeeds. Can you give me a hint or one >example test case?
>
> The most clean (?) is to compare "git diff" NFC and git diff NFD, they should give the same result:
> for "git diff" something like this would do:
> +
> +# This will test git diff
> +test_expect_success "git diff f.Adiar" '
> + echo "Modified" >f.$Adiarnfd &&
> + git diff f.$Adiarnfd >expect &&
> + git diff f.$Adiarnfc >actual &&
> + git checkout f.$Adiarnfd &&
> + test_cmp expect actual
> +’
Thank you!
I had to tweak it a little but it now reproduces the issue and confirms the fix
for diff, diff-index, diff-files and diff-tree.
I have just sent in the full patch.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-04-04 20:43 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-04 9:07 git diff does not precompose unicode file paths (OS X) Alexander Rinass
2016-03-04 12:16 ` Torsten Bögershausen
2016-03-04 14:37 ` Alexander Rinass
2016-03-04 18:36 ` Junio C Hamano
2016-03-04 18:49 ` Ramsay Jones
2016-03-07 7:47 ` Alexander Rinass
2016-03-07 8:54 ` Torsten Bögershausen
[not found] ` <5C6A30EF-ED0A-4D64-B971-CF873C64B46E@fournova.com>
2016-03-08 12:30 ` Torsten Bögershausen
2016-03-14 21:45 ` Alexander Rinass
2016-03-15 5:45 ` Torsten Bögershausen
2016-04-04 20:43 ` Alexander Rinass
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).