From: "Torsten Bögershausen" <tboegi@web.de>
To: "Junio C Hamano" <gitster@pobox.com>,
"Torsten Bögershausen" <tboegi@web.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v8 10/10] ce_compare_data() did not respect conversion
Date: Tue, 3 May 2016 18:02:07 +0200 [thread overview]
Message-ID: <55d60bb1-8626-4c97-630c-1a9f114c46b4@web.de> (raw)
In-Reply-To: <xmqqpot4s1ap.fsf@gitster.mtv.corp.google.com>
On 2016-05-02 21.33, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> Let's step back a bit and make sure we are on the same page. I
> think this "series" conflates a bit too many things into a single
> topic.
>
> * The comparison between the index and the working tree, i.e. "git
> diff", should compare result of convert_to_git() with what is in
> the index, and the world around it should be made consistent with
> that. Your separate "git blame" fix to add missing knowledge
> that convert_to_git() would not do s/CRLF/LF/ for a path whose
> index entry already is contaminated with CR falls into this
> category and was a very good thing to do.
>
> * A convert_to_git() and convert_to_working_tree() pair that do not
> roundtrip would (by definition) leave contents in the working
> tree, that, when passed through convert_to_git(), will be
> different from the index, upon completion of "reset --hard". We
> _should_ fix it so that "git diff" _reports_ differences.
> Currently, lstat(2) based optimization hides this in a racy way
> (when racy Git kicks in to reinspect the index and the working
> tree file actually matches, it finds out that they do not match),
> it is a bug that needs to be fixed, not 10/10 where it tries to
> hide the differences consistently and spreads the bug. I haven't
> studied 8/10 carefully yet, but it seems to attempt the same.
>
> * I think the "text=auto eol=THIS" that did not mean "I do not care
> to specify which ones are text files. Please detect the file
> type, and for those automatically detected, please make sure that
> the contents follwo THIS eol convention." was a bug, and what
> 07/10 tries to do is a good thing.
>
> By the way, lack of the cover letter of this series made it more
> painful to write a reply like this than necessary. A cover letter
> for a trivial 3-patch series might be overkill, but for anything
> with substance that spans more than 4-5 patches, a cover letter to
> describe the overall direction would really help.
The 10/10 needs to be replaced with something different, and I start to
get a better picture, why.
read_cache.c/ce_compare_data() checks what "git add" will change in the index.
sha1_file.c/index_fd() will read the file content from the working tree,
run convert_to_git() and calculate the sha1, which read_cache.c feeds into
hashcmp().
When convert_to_git() is run, 3 steps are taken:
- apply the filter, if any
- run crlf_to_git(), if attributes and core.autocrlf say so
- run ident, if specified.
Now the crlf_to_git() uses has_cr_in_index(const char *path) to find
out if we should keep the CRLF or not.
Side note: the suggested patches will use
get_convert_stats_sha1(get_sha1_from_cache(path),)
This works pretty well under normal "git add", but fails in t6038,
when we do a merge.
The new function get_sha1_from_index() can not find the sha1,
"read-cache.c/get_sha1_from_index:2392 path=file pos=-3"
and falls into the code path for
/*
* We might be in the middle of a merge, in which
* case we would read stage #2 (ours).
*/
read-cache.c/get_sha1_from_index:2408 path=file pos=3
read-cache.c/get_sha1_from_index:2416 path=file sha1=ad55e2
(The line number are with debug code)
The problem is, that ad55e2 is _not_ the sha1, which
read_cache/ce_compare_data() had been asked to look at.
Instead we should check the blob with 99b633.
The result is that convert/get_convert_stats_sha1() is called on the
wrong blob (the one with CRLF instead the one without CRLF), and this
makes t6038 from 9/10 fail.
10/10 rescues the situation, by using the correct blob :-)
In short, ce_compare_data() needs to forward ce->sha1 the whole way into
convert.c/crlf_to_git() and get_convert_stats_sha1().
While at it, I realized that we call a convert_attrs(&ca, path) a couple
of times (e.g. in would_convert_to_git(), to find out that we really don't have
any attributes set.
It could be nice to do that only once.
The next step will be to add the improvements/fixes for the ce_compare_data()
chain as described above, and then put 7/10..9/10 on top of that.
This will probably take some time, so that's why I asked if 1/10..4/10 could
proceed as is ?
(and the next version with cover letter, sorry for that)
next prev parent reply other threads:[~2016-05-03 16:02 UTC|newest]
Thread overview: 126+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <Message-Id=xmqqio26nqk8.fsf@gitster.mtv.corp.google.com>
2016-02-11 16:16 ` [PATCH 1/3] git reset --hard gives clean working tree tboegi
2016-02-11 18:49 ` Junio C Hamano
2016-03-05 7:23 ` Torsten Bögershausen
2016-03-05 8:05 ` Junio C Hamano
2016-03-05 8:27 ` Torsten Bögershausen
2016-03-05 21:18 ` Junio C Hamano
2016-03-07 8:14 ` Junio C Hamano
2016-03-07 8:51 ` Junio C Hamano
2016-03-07 8:58 ` Torsten Bögershausen
2016-03-07 22:34 ` Junio C Hamano
2016-03-29 13:25 ` [PATCH v1 1/7] Make it possible to get sha1 for a path from the index tboegi
2016-03-29 13:28 ` Duy Nguyen
2016-03-29 13:31 ` Duy Nguyen
2016-03-29 15:05 ` Torsten Bögershausen
2016-03-29 19:32 ` Eric Sunshine
2016-03-29 13:25 ` [PATCH v1 2/7] convert.c: stream and early out tboegi
2016-03-29 13:25 ` [PATCH v1 3/7] Allow core.autocrlf=input and core.eol=crlf tboegi
2016-03-29 13:25 ` [PATCH v1 4/7] t0027: TC for combined attributes tboegi
2016-03-29 13:25 ` [PATCH v1 5/7] CRLF: unify the "auto" handling tboegi
2016-03-29 19:42 ` Eric Sunshine
2016-03-29 13:25 ` [PATCH v1 6/7] correct blame for files commited with CRLF tboegi
2016-03-29 17:21 ` Junio C Hamano
2016-03-29 19:51 ` Torsten Bögershausen
2016-03-29 19:58 ` Junio C Hamano
2016-03-29 20:25 ` Junio C Hamano
2016-03-29 20:32 ` Junio C Hamano
2016-03-29 20:50 ` Junio C Hamano
2016-03-30 17:48 ` Torsten Bögershausen
2016-03-29 13:25 ` [PATCH v1 7/7] convert.c: more safer crlf handling with text attribute tboegi
2016-03-29 18:37 ` Junio C Hamano
2016-04-01 16:08 ` [PATCH v2 1/7] Make it possible to get sha1 for a path from the index tboegi
2016-04-01 16:08 ` [PATCH v2 2/7] convert.c: stream and early out tboegi
2016-04-01 16:08 ` [PATCH v2 3/7] Allow core.autocrlf=input and core.eol=crlf tboegi
2016-04-01 22:20 ` Junio C Hamano
2016-04-01 16:08 ` [PATCH v2 4/7] t0027: TC for combined attributes tboegi
2016-04-01 22:22 ` Junio C Hamano
2016-04-01 16:08 ` [PATCH v2 5/7] CRLF: unify the "auto" handling tboegi
2016-04-01 22:25 ` Junio C Hamano
2016-04-01 16:08 ` [PATCH v2 6/7] correct blame for files commited with CRLF tboegi
2016-04-01 22:29 ` Junio C Hamano
2016-04-03 9:29 ` Torsten Bögershausen
2016-04-01 16:08 ` [PATCH v2 7/7] convert.c: more safer crlf handling with text attribute tboegi
2016-04-05 19:23 ` [PATCH v1] correct blame for files commited with CRLF tboegi
2016-04-05 20:57 ` Junio C Hamano
2016-04-05 21:12 ` Junio C Hamano
2016-04-06 4:17 ` Torsten Bögershausen
2016-04-19 13:24 ` [PATCH v5 1/4] t0027: Make more reliable tboegi
2016-04-19 13:26 ` [PATCH v5 2/4] convert: allow core.autocrlf=input and core.eol=crlf tboegi
2016-04-19 13:26 ` [PATCH v5 3/4] t0027: test cases for combined attributes tboegi
2016-04-19 21:32 ` Junio C Hamano
2016-04-20 15:52 ` Torsten Bögershausen
2016-04-19 13:26 ` [PATCH v5 4/4] convert.c: ident + core.autocrlf didn't work tboegi
2016-04-20 22:27 ` Junio C Hamano
2016-04-22 14:38 ` [PATCH v6 01/10] t0027: Make more reliable tboegi
2016-04-22 22:03 ` Junio C Hamano
2016-04-24 3:45 ` Torsten Bögershausen
2016-04-22 14:53 ` [PATCH v6 02/10] convert: allow core.autocrlf=input and core.eol=crlf tboegi
2016-04-22 14:53 ` [PATCH v6 03/10] t0027: test cases for combined attributes tboegi
2016-04-22 14:53 ` [PATCH v6 04/10] convert.c: ident + core.autocrlf didn't work tboegi
2016-04-22 14:53 ` [PATCH v6 05/10] read-cache: factor out get_sha1_from_index() helper tboegi
2016-04-22 14:53 ` [PATCH v6 06/10] convert.c: stream and early out tboegi
2016-04-22 14:53 ` [PATCH v6 07/10] convert: unify the "auto" handling of CRLF tboegi
2016-04-22 14:53 ` [PATCH v6 08/10] convert.c: more safer crlf handling with text attribute tboegi
2016-04-22 14:53 ` [PATCH v6 09/10] t6038; use crlf on all platforms tboegi
2016-04-22 14:53 ` [PATCH v6 10/10] ce_compare_data() did not respect conversion tboegi
2016-04-24 15:10 ` [PATCH v6b 01/10] t0027: Make commit_chk_wrnNNO() reliable tboegi
2016-04-24 15:11 ` [PATCH v6b 02/10] convert: allow core.autocrlf=input and core.eol=crlf tboegi
2016-04-24 15:11 ` [PATCH v6b 03/10] t0027: test cases for combined attributes tboegi
2016-04-24 15:11 ` [PATCH v6b 04/10] convert.c: ident + core.autocrlf didn't work tboegi
2016-04-24 15:11 ` [PATCH v6b 05/10] read-cache: factor out get_sha1_from_index() helper tboegi
2016-04-24 15:11 ` [PATCH v6b 06/10] convert.c: stream and early out tboegi
2016-04-24 15:11 ` [PATCH v6b 07/10] convert: unify the "auto" handling of CRLF tboegi
2016-04-24 15:11 ` [PATCH v6b 08/10] convert.c: more safer crlf handling with text attribute tboegi
2016-04-24 15:11 ` [PATCH v6b 09/10] t6038; use crlf on all platforms tboegi
2016-04-24 15:11 ` [PATCH v6b 10/10] ce_compare_data() did not respect conversion tboegi
2016-04-25 16:56 ` [PATCH v7 01/10] t0027: Make commit_chk_wrnNNO() reliable tboegi
2016-04-25 19:15 ` Junio C Hamano
2016-04-25 16:56 ` [PATCH v7 02/10] convert: allow core.autocrlf=input and core.eol=crlf tboegi
2016-04-25 16:56 ` [PATCH v7 03/10] t0027: test cases for combined attributes tboegi
2016-04-25 16:56 ` [PATCH v7 04/10] convert.c: ident + core.autocrlf didn't work tboegi
2016-04-25 16:56 ` [PATCH v7 05/10] read-cache: factor out get_sha1_from_index() helper tboegi
2016-04-25 16:56 ` [PATCH v7 06/10] convert.c: stream and early out tboegi
2016-04-25 16:56 ` [PATCH v7 07/10] convert: unify the "auto" handling of CRLF tboegi
2016-04-25 19:37 ` Junio C Hamano
2016-04-26 16:33 ` Torsten Bögershausen
2016-04-26 17:42 ` Junio C Hamano
2016-04-25 16:56 ` [PATCH v7 08/10] convert.c: more safer crlf handling with text attribute tboegi
2016-04-25 16:56 ` [PATCH v7 09/10] t6038; use crlf on all platforms tboegi
2016-04-25 16:56 ` [PATCH v7 10/10] ce_compare_data() did not respect conversion tboegi
2016-04-29 15:01 ` [PATCH v8 01/10] t0027: make commit_chk_wrnNNO() reliable tboegi
2016-04-29 15:01 ` [PATCH v8 02/10] convert: allow core.autocrlf=input and core.eol=crlf tboegi
2016-04-29 15:01 ` [PATCH v8 03/10] t0027: test cases for combined attributes tboegi
2016-04-29 15:01 ` [PATCH v8 04/10] convert.c: ident + core.autocrlf didn't work tboegi
2016-04-29 15:02 ` [PATCH v8 05/10] read-cache: factor out get_sha1_from_index() helper tboegi
2016-04-29 15:02 ` [PATCH v8 06/10] convert.c: stream and early out tboegi
2016-04-29 15:02 ` [PATCH v8 07/10] convert: unify the "auto" handling of CRLF tboegi
2016-11-25 15:48 ` Torsten Bögershausen
2016-11-27 16:22 ` [PATCH/RFC v1 1/1] New way to normalize the line endings tboegi
2016-11-29 19:15 ` Junio C Hamano
2017-04-12 11:48 ` [PATCH v2 1/1] Document how " tboegi
2016-04-29 15:02 ` [PATCH v8 08/10] convert.c: more safer crlf handling with text attribute tboegi
2016-04-29 15:02 ` [PATCH v8 09/10] t6038; use crlf on all platforms tboegi
2016-04-29 15:02 ` [PATCH v8 10/10] ce_compare_data() did not respect conversion tboegi
2016-04-29 18:20 ` Junio C Hamano
2016-04-29 21:09 ` Junio C Hamano
2016-05-01 16:27 ` Torsten Bögershausen
2016-05-02 18:16 ` Junio C Hamano
2016-05-02 19:33 ` Junio C Hamano
2016-05-03 16:02 ` Torsten Bögershausen [this message]
2016-05-03 18:31 ` Junio C Hamano
2016-05-04 4:07 ` Torsten Bögershausen
2016-05-04 7:23 ` Junio C Hamano
2016-05-06 8:54 ` Torsten Bögershausen
2016-05-06 17:11 ` Junio C Hamano
2016-05-07 6:10 ` [PATCH v9 0/6] convert-eol-autocrlf, old 5..10 now 1..6 tboegi
2016-05-07 6:10 ` [PATCH v9 1/6] read-cache: factor out get_sha1_from_index() helper tboegi
2016-05-09 19:54 ` Junio C Hamano
2016-05-07 6:11 ` [PATCH v9 2/6] convert.c: stream and early out tboegi
2016-05-09 20:29 ` Junio C Hamano
2016-05-11 4:30 ` Torsten Bögershausen
2016-05-07 6:11 ` [PATCH v9 3/6] convert: unify the "auto" handling of CRLF tboegi
2016-05-07 6:11 ` [PATCH v9 4/6] convert.c: more safer crlf handling with text attribute tboegi
2016-05-07 6:11 ` [PATCH v9 5/6] t6038; use crlf on all platforms tboegi
2016-05-07 6:11 ` [PATCH v9 6/6] convert: ce_compare_data() checks for a sha1 of a path tboegi
2016-02-11 16:16 ` [PATCH 2/3] Factor out convert_cmp_checkout() into convert.c tboegi
2016-02-11 16:16 ` [PATCH 3/3] convert.c: Optimize convert_cmp_checkout() for changed file len tboegi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55d60bb1-8626-4c97-630c-1a9f114c46b4@web.de \
--to=tboegi@web.de \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).