* [TESTCASE] Failing 'git am' when core.autocrlf=true
@ 2007-08-23 14:07 Marius Storm-Olsen
2007-08-23 17:18 ` Linus Torvalds
0 siblings, 1 reply; 8+ messages in thread
From: Marius Storm-Olsen @ 2007-08-23 14:07 UTC (permalink / raw)
To: Git Mailing List
[-- Attachment #1.1: Type: text/plain, Size: 950 bytes --]
Hi,
I have an issue with git-rebase failing on a repository using
core.autocrlf=true
I've tracked it down to git-am failing with core.autocrlf=true and
passing with core.autocrlf=false. I've tried digging deeper into the
code, but for some reason ce_match_stat_basic() (read-cache.c:~187)
reports the size of the file in the index to be 0 (when
core.autocrlf=true), which is why git-am bails out on the patch.
(ce->ce_size == 0, while st->st_size == the correct size on disk)
If I force the bailout in check_patch() (builtin-apply.c:~2101) to
_not_ happen, the patch applies without problems.
Can anyone please enlighten me on why this may happen?
I've attached a SH script which will reproduce the problem. Simply run
./git_am_crlf_testcase.sh
to reproduce the problem, and run
./git_am_crlf_testcase.sh --no-crlf
to show it working when core.autocrlf=false.
Thanks for your help!
--
.marius
[-- Attachment #1.2: git_am_crlf_testcase.sh --]
[-- Type: text/plain, Size: 3002 bytes --]
#!/bin/sh
#
# Testcase for failing 'git am' on a repository with core.autocrlf = true
# ./git_am_crlf_testcase.sh fails for Git 1.5.3.rc4
# ./git_am_crlf_testcase.sh --no-crlf passes
#
# --------------------------------------------------------------------------------------------------
TESTCASE_REPO=git_am_crlf_testcase
GIT_AUTOCRLF=true
# parse the arguments
while [ "$#" -gt 0 ]; do
case "$1" in
--no-crlf) # Turn off autocrlf for the repository
GIT_AUTOCRLF=false
;;
--crlf) # Turn on autocrlf for the repository
GIT_AUTOCRLF=true
;;
esac
shift
done
# Functions ----------------------------------------------------------------------------------------
die() {
echo >&2 "$@"
exit 1
}
print_patch() {
cat << __END_OF_PATCH__
From 38be10072e45dd6b08ce40851e3fca60a31a340b Mon Sep 17 00:00:00 2001
From: Marius Storm-Olsen <x@y.com>
Date: Thu, 23 Aug 2007 13:00:00 +0200
Subject: test1
---
foo | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 foo
diff --git a/foo b/foo
new file mode 100644
index 0000000000000000000000000000000000000000..5716ca5987cbf97d6bb54920bea6adde242d87e6
--- /dev/null
+++ b/foo
@@ -0,0 +1 @@
+bar
--
1.5.3.rc4.mingw.2.3.g3318a-dirty
From 3bb3855cf028fc90010e2f51aceb6d3a90039c5c Mon Sep 17 00:00:00 2001
From: Marius Storm-Olsen <x@y.com>
Date: Thu, 23 Aug 2007 13:00:01 +0200
Subject: test2
---
foo | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/foo b/foo
index 5716ca5987cbf97d6bb54920bea6adde242d87e6..e2994c5ae2f7c41a306d9b41b64af4d6b7242793 100644
--- a/foo
+++ b/foo
@@ -1 +1,2 @@
bar
+baz
--
1.5.3.rc4.mingw.2.3.g3318a-dirty
From 38212fd002da735b3626c6f1d801c8ca37ce04e6 Mon Sep 17 00:00:00 2001
From: Marius Storm-Olsen <x@y.com>
Date: Thu, 23 Aug 2007 13:00:02 +0200
Subject: test3
---
foo | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/foo b/foo
index e2994c5ae2f7c41a306d9b41b64af4d6b7242793..ad94f9e431a24bbe67e9e5935e2a8cf20a73e723 100644
--- a/foo
+++ b/foo
@@ -1,2 +1,3 @@
bar
baz
+blah
--
1.5.3.rc4.mingw.2.3.g3318a-dirty
__END_OF_PATCH__
}
# 'main' -------------------------------------------------------------------------------------------
# Kill old testcase repository
test -d "$TESTCASE_REPO" && rm -rf $TESTCASE_REPO
# Create new testcase repository
mkdir $TESTCASE_REPO &&
cd $TESTCASE_REPO &&
git init || die "Couldn\'t create testcase repository \'$TESTCASE_REPO\'!"
# Set autocrlf explicitly
git config core.autocrlf $GIT_AUTOCRLF || die "Couldn\'t set core.autocrlf to \'$GIT_AUTOCRLF\'!"
# Create initial commit
(echo foo > bar &&
git add bar &&
git commit -m "initial commit") || die "Couldn\'t create initial file!"
# Apply failing patch set
print_patch | git am --binary -3 || die "Applying patch set to autocrlf=true repo failed!"
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 187 bytes --]
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 14:07 [TESTCASE] Failing 'git am' when core.autocrlf=true Marius Storm-Olsen
@ 2007-08-23 17:18 ` Linus Torvalds
2007-08-23 17:45 ` Marius Storm-Olsen
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Linus Torvalds @ 2007-08-23 17:18 UTC (permalink / raw)
To: Marius Storm-Olsen, Junio C Hamano; +Cc: Git Mailing List
On Thu, 23 Aug 2007, Marius Storm-Olsen wrote:
>
> I have an issue with git-rebase failing on a repository using
> core.autocrlf=true
>
> I've tracked it down to git-am failing with core.autocrlf=true and passing
> with core.autocrlf=false. I've tried digging deeper into the code, but for
> some reason ce_match_stat_basic() (read-cache.c:~187) reports the size of the
> file in the index to be 0 (when core.autocrlf=true), which is why git-am bails
> out on the patch. (ce->ce_size == 0, while st->st_size == the correct size on
> disk)
Very interesting.
Adding some instrumentation to "git-am.sh" (namely a lot of
git diff --quiet || exit
lines to figure out exactly *where* the index gets out of sync with the
working tree), I get this trace:
trace: built-in: git 'mailsplit' '-d4' '-o.dotest' '-b' '--'
trace: built-in: git 'diff-index' '--cached' '--name-only' 'HEAD'
trace: built-in: git 'diff' '--quiet'
trace: built-in: git 'mailinfo' '-u' '.dotest/msg' '.dotest/patch'
trace: built-in: git 'stripspace'
trace: built-in: git 'diff' '--quiet'
trace: built-in: git 'apply' '--allow-binary-replacement' '--index' '.dotest/patch'
trace: built-in: git 'diff' '--quiet'
trace: built-in: git 'diff' '--quiet'
trace: built-in: git 'write-tree'
trace: built-in: git 'diff' '--quiet'
ie everything was fine after the "apply" phase, but the index and the
working tree went out-of-kilter after "git write-tree".
The reason? "git write-tree" doesn't read the config file, so it never
even reads the "core.autocrlf=true" variable. As a result, it seems to
screw up the index matching when it does the cache_tree_fully_valid()
(which will fail due to "git apply --index" having invalidated the tree
SHA1's) followed by cache_tree_update().
> Can anyone please enlighten me on why this may happen?
This patch should fix it.
Junio - it fixes the test for me, but quite frankly, I don't see why
write-tree would *ever* change any non-tree index entries. But it does. I
think there's another bug somewhere, or I'm missing something.
Linus
---
builtin-write-tree.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/builtin-write-tree.c b/builtin-write-tree.c
index 88f34ba..b89d02e 100644
--- a/builtin-write-tree.c
+++ b/builtin-write-tree.c
@@ -72,6 +72,7 @@ int cmd_write_tree(int argc, const char **argv, const char *unused_prefix)
const char *prefix = NULL;
unsigned char sha1[20];
+ git_config(git_default_config);
while (1 < argc) {
const char *arg = argv[1];
if (!strcmp(arg, "--missing-ok"))
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 17:18 ` Linus Torvalds
@ 2007-08-23 17:45 ` Marius Storm-Olsen
2007-08-23 18:41 ` Junio C Hamano
2007-08-23 19:23 ` Linus Torvalds
2 siblings, 0 replies; 8+ messages in thread
From: Marius Storm-Olsen @ 2007-08-23 17:45 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 582 bytes --]
Linus Torvalds said the following on 23.08.2007 19:18:
> On Thu, 23 Aug 2007, Marius Storm-Olsen wrote:
>> I have an issue with git-rebase failing on a repository using
>> core.autocrlf=true
>
> Very interesting.
>
> Adding some instrumentation to "git-am.sh" (namely a lot of
>
> git diff --quiet || exit
Aah! Very neat way to narrow it down. I'll keep that in mind for later
debugging. Thanks!
> This patch should fix it.
Yep, it did. I just rebased 80 commits with core.autocrlf=true. Thanks
again, Linus, for the quick reply!
--
.marius
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 187 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 17:18 ` Linus Torvalds
2007-08-23 17:45 ` Marius Storm-Olsen
@ 2007-08-23 18:41 ` Junio C Hamano
2007-08-23 19:25 ` Linus Torvalds
2007-08-23 19:23 ` Linus Torvalds
2 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2007-08-23 18:41 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Marius Storm-Olsen, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, 23 Aug 2007, Marius Storm-Olsen wrote:
>>
>> I have an issue with git-rebase failing on a repository using
>> core.autocrlf=true
>>
>> I've tracked it down to git-am failing with core.autocrlf=true and passing
>> with core.autocrlf=false. I've tried digging deeper into the code, but for
>> some reason ce_match_stat_basic() (read-cache.c:~187) reports the size of the
>> file in the index to be 0 (when core.autocrlf=true), which is why git-am bails
>> out on the patch. (ce->ce_size == 0, while st->st_size == the correct size on
>> disk)
>
> Very interesting.
> ...
> trace: built-in: git 'apply' '--allow-binary-replacement' '--index' '.dotest/patch'
> trace: built-in: git 'diff' '--quiet'
> trace: built-in: git 'diff' '--quiet'
> trace: built-in: git 'write-tree'
> trace: built-in: git 'diff' '--quiet'
>
> ie everything was fine after the "apply" phase, but the index and the
> working tree went out-of-kilter after "git write-tree".
>
> The reason? "git write-tree" doesn't read the config file, so it never
> even reads the "core.autocrlf=true" variable. As a result, it seems to
> screw up the index matching when it does the cache_tree_fully_valid()
> (which will fail due to "git apply --index" having invalidated the tree
> SHA1's) followed by cache_tree_update().
>
>> Can anyone please enlighten me on why this may happen?
>
> This patch should fix it.
>
> Junio - it fixes the test for me, but quite frankly, I don't see why
> write-tree would *ever* change any non-tree index entries. But it does. I
> think there's another bug somewhere, or I'm missing something.
As you said, there is something else going on. write-tree is
about reading the index entries and writing them out as a set of
trees, and at that point it should not even matter if you have
garbage in the work tree or if you do not even have a work tree.
All the crlf conversions have been done when the object hit the
index, so its reading or not reading core.autocrlf should not
change its behaviour.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 17:18 ` Linus Torvalds
2007-08-23 17:45 ` Marius Storm-Olsen
2007-08-23 18:41 ` Junio C Hamano
@ 2007-08-23 19:23 ` Linus Torvalds
2007-08-23 19:44 ` Junio C Hamano
2007-08-23 20:11 ` Marius Storm-Olsen
2 siblings, 2 replies; 8+ messages in thread
From: Linus Torvalds @ 2007-08-23 19:23 UTC (permalink / raw)
To: Marius Storm-Olsen, Junio C Hamano; +Cc: Git Mailing List
On Thu, 23 Aug 2007, Linus Torvalds wrote:
>
> Junio - it fixes the test for me, but quite frankly, I don't see why
> write-tree would *ever* change any non-tree index entries. But it does. I
> think there's another bug somewhere, or I'm missing something.
Looking more carefully at the index file before and after, the _only_
difference really is the ce_size field for "foo".
In the fixed one, ce_size remains at 5 (which is correct: the file
contents are "bar\n\r"), in the broken case it gets zeroed by
git-write-tree for some reason.
And the reason is really interesting: it only happens if the index file
has the same date as the entry in the index, in which case we end up doign
the "careful" check in ce_smudge_racily_clean_entry() and there the
"ce_modified_check_fs()" will end up re-reading the file, and if we don't
have the right CRLF behaviour, we will now return DATA_CHANGED.
So the call-chain for this is:
cmd_write_tree ->
write_tree ->
write_index ->
ce_smudge_racily_clean_entry ->
ce_modified_check_fs ->
ce_compare_data ->
index_fd ->
convert_to_git ->
** wrong answer unless auto_crlf is set **
and now "ce_smudge_racily_clean_entry()" will do
ce->ce_size = htonl(0);
and the one-liner fix I sent out is actually the right fix.
This was harder to find than it should have been, because it actually
depends on the datestamp of the index file matching the datestamp of the
file in question!
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 18:41 ` Junio C Hamano
@ 2007-08-23 19:25 ` Linus Torvalds
0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2007-08-23 19:25 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Marius Storm-Olsen, Git Mailing List
On Thu, 23 Aug 2007, Junio C Hamano wrote:
>
> As you said, there is something else going on. write-tree is
> about reading the index entries and writing them out as a set of
> trees, and at that point it should not even matter if you have
> garbage in the work tree or if you do not even have a work tree.
> All the crlf conversions have been done when the object hit the
> index, so its reading or not reading core.autocrlf should not
> change its behaviour.
Ok, I just sent out the analysis of this, maybe you should add that as the
changelog for the one-liner, and sign-off from me.
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 19:23 ` Linus Torvalds
@ 2007-08-23 19:44 ` Junio C Hamano
2007-08-23 20:11 ` Marius Storm-Olsen
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2007-08-23 19:44 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Marius Storm-Olsen, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> And the reason is really interesting: it only happens if the index file
> has the same date as the entry in the index, in which case we end up doign
> the "careful" check in ce_smudge_racily_clean_entry() and there the
> "ce_modified_check_fs()" will end up re-reading the file, and if we don't
> have the right CRLF behaviour, we will now return DATA_CHANGED.
>
> So the call-chain for this is:
>
> cmd_write_tree ->
> write_tree ->
> write_index ->
> ce_smudge_racily_clean_entry ->
> ce_modified_check_fs ->
> ce_compare_data ->
> index_fd ->
> convert_to_git ->
> ** wrong answer unless auto_crlf is set **
>
> and now "ce_smudge_racily_clean_entry()" will do
> ce->ce_size = htonl(0);
>
> and the one-liner fix I sent out is actually the right fix.
>
> This was harder to find than it should have been, because it actually
> depends on the datestamp of the index file matching the datestamp of the
> file in question!
Ahhhh, I forgot that we made write-tree to write back into index
because at that point we have a fully populated cache-tree.
Your analysis makes sense to me. Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [TESTCASE] Failing 'git am' when core.autocrlf=true
2007-08-23 19:23 ` Linus Torvalds
2007-08-23 19:44 ` Junio C Hamano
@ 2007-08-23 20:11 ` Marius Storm-Olsen
1 sibling, 0 replies; 8+ messages in thread
From: Marius Storm-Olsen @ 2007-08-23 20:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1957 bytes --]
Linus Torvalds wrote:
> So the call-chain for this is:
>
> cmd_write_tree ->
> write_tree ->
> write_index ->
> ce_smudge_racily_clean_entry ->
> ce_modified_check_fs ->
> ce_compare_data ->
> index_fd ->
> convert_to_git ->
> ** wrong answer unless auto_crlf is set **
>
> and now "ce_smudge_racily_clean_entry()" will do
> ce->ce_size = htonl(0);
>
> and the one-liner fix I sent out is actually the right fix.
>
> This was harder to find than it should have been, because it
> actually depends on the datestamp of the index file matching the
> datestamp of the file in question!
Thanks for the detailed analysis of the chain of events. It
really helps understand the inner-workings of Git.
I also tried to track this down after your initial patch, but
couldn't find it. The fact that the index is smudged at the time of
the second commit, so the third failed, made me look in the wrong
place.
How about adding a trace for the smudged case, so it's easier to
find similar issues in the future?
--
.marius
From b858610ff4cd42f57a05c815a3e3b43428d67f99 Mon Sep 17 00:00:00 2001
From: Marius Storm-Olsen <mstormo_git@storm-olsen.com>
Date: Thu, 23 Aug 2007 22:03:35 +0200
Subject: [PATCH] Add a trace to more easily show that the index has been smudged.
Signed-off-by: Marius Storm-Olsen <mstormo_git@storm-olsen.com>
---
read-cache.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/read-cache.c b/read-cache.c
index 849e8d6..000451c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1088,6 +1088,7 @@ static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
* for "frotz" stays 6 which does not match the filesystem.
*/
ce->ce_size = htonl(0);
+ trace_printf("trace: index: Index object for '%s' smudged due to being racily clean\n", ce->name);
}
}
--
1.5.3.rc4.mingw.2.3.g3318a-dirty
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 187 bytes --]
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-08-23 20:10 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-23 14:07 [TESTCASE] Failing 'git am' when core.autocrlf=true Marius Storm-Olsen
2007-08-23 17:18 ` Linus Torvalds
2007-08-23 17:45 ` Marius Storm-Olsen
2007-08-23 18:41 ` Junio C Hamano
2007-08-23 19:25 ` Linus Torvalds
2007-08-23 19:23 ` Linus Torvalds
2007-08-23 19:44 ` Junio C Hamano
2007-08-23 20:11 ` Marius Storm-Olsen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).