From: Junio C Hamano <gitster@pobox.com>
To: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH] diffcore-delta: avoid ignoring final 'line' of file
Date: Thu, 11 Jan 2024 15:00:23 -0800
Message-ID: <xmqqedenearc.fsf@gitster.g>
In-Reply-To: <pull.1637.git.1705006074626.gitgitgadget@gmail.com> (Elijah Newren via GitGitGadget's message of "Thu, 11 Jan 2024 20:47:54 +0000")
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> diff --git a/diffcore-delta.c b/diffcore-delta.c
> index c30b56e983b..7136c3dd203 100644
> --- a/diffcore-delta.c
> +++ b/diffcore-delta.c
> @@ -159,6 +159,10 @@ static struct spanhash_top *hash_chars(struct repository *r,
> n = 0;
> accum1 = accum2 = 0;
> }
> + if (n > 0) {
> + hashval = (accum1 + accum2 * 0x61) % HASHBASE;
> + hash = add_spanhash(hash, hashval, n);
> + }
OK, so we were ignoring the final short bit that is not terminated
with LF due to the "continue". Nicely found.
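
To make the effect concrete, here is a minimal sketch, not the actual
hash_chars() code. It assumes only the chunking rule suggested by the
quoted hunk: a chunk ends at LF or after 64 bytes. The helper name
bytes_in_chunks() and the flush_tail flag are hypothetical, and
summing chunk lengths merely stands in for the add_spanhash() call.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return how many input bytes end up in an
 * emitted chunk. A chunk is emitted at LF or once it reaches 64
 * bytes (an assumption modelled on the loop in the quoted hunk).
 */
static size_t bytes_in_chunks(const char *buf, size_t sz, int flush_tail)
{
	size_t n = 0, covered = 0;

	while (sz--) {
		char c = *buf++;

		if (++n < 64 && c != '\n')
			continue;	/* chunk not complete yet */
		covered += n;		/* stands in for add_spanhash() */
		n = 0;
	}
	/* The fix: also account for a trailing chunk that lacks LF. */
	if (flush_tail && n > 0)
		covered += n;
	return covered;
}
```

For the 7-byte input "a\nb\nabc", calling this with flush_tail set to
0 accounts for only the first 4 bytes, while flush_tail set to 1
accounts for all 7.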
> diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
> index 85be1367de6..29299acbce7 100755
> --- a/t/t4001-diff-rename.sh
> +++ b/t/t4001-diff-rename.sh
> @@ -286,4 +286,23 @@ test_expect_success 'basename similarity vs best similarity' '
> test_cmp expected actual
> '
>
> +test_expect_success 'last line matters too' '
> + test_write_lines a 0 1 2 3 4 5 6 7 8 9 >nonewline &&
> + printf "git ignores final up to 63 characters if not newline terminated" >>nonewline &&
> + git add nonewline &&
> + git commit -m "original version of file with no final newline" &&
I found it misleading that the file whose name is nonewline has
a bunch of LFs, including on its last line ;-).
> + # Change ONLY the first character of the whole file
> + test_write_lines b 0 1 2 3 4 5 6 7 8 9 >nonewline &&
Same here, but it is too much to bother rewriting it ...
    {
        test_write_lines ...
        printf ...
    } >incomplete
... like so ("incomplete" stands for "file ending with an incomplete line"),
so I'll let it pass.
> + printf "git ignores final up to 63 characters if not newline terminated" >>nonewline &&
> + git add nonewline &&
> + git mv nonewline still-no-newline &&
> + git commit -a -m "rename nonewline -> still-no-newline" &&
> + git diff-tree -r -M01 --name-status HEAD^ HEAD >actual &&
> + cat >expected <<-\EOF &&
> + R097 nonewline still-no-newline
I am not very happy with the hardcoded 97. You are already using
the non-standard 1% threshold (-M01). If the delta detection that
forgets about the last line is as broken as your proposed log
message notes, shouldn't you be able to construct a sample pair of
preimage and postimage for which the broken version gives similarity
so low that the pair is judged not worth treating as a rename, while
the fixed version gives similarity high enough to make it a rename
under the default threshold? That way, the test only needs to see if
we got a rename (with any similarity) or a delete and an add.