git.vger.kernel.org archive mirror
* `git diff --break-rewrites` does not work (otherwise it should break rewrite into delete and create, for `--find-renames` to work)
From: Han Jiang @ 2024-09-10 11:07 UTC
  To: Git Mailing List

Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)

cd '/'; cd '/'; rm --force --recursive -- './test_git2'; mkdir "$_"; cd "$_";
mkdir --parents -- './repo';
git init './repo'
echo -e 'a\nb\nc\nd\ne\nf\ng\nh\ni\nj' >'./repo/file1'
echo -e '0\n1\n2\n3\n4\n5\n6\n7\n8\n9' >'./repo/file2'
git -C './repo' add './file1' './file2'
mv './repo/file2' './repo/file3'
mv './repo/file1' './repo/file2'
git -C './repo' add --intent-to-add './file3'
git -C './repo' diff --break-rewrites='50%/50%' --find-renames='50%'

What did you expect to happen? (Expected behavior)

`git diff` outputs: file1 rename to file2, file2 rename to file3

What happened instead? (Actual behavior)

`git diff` outputs: file1 remove all content, file2 complete rewrite,
file3 add all content

What's different between what you expected and what actually happened?

Anything else you want to add:

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.


[System Info]
git version:
git version 2.46.0.windows.1
cpu: x86_64
built from commit: 2e6a859ffc0471f60f79c1256f766042b0d5d17d
sizeof-long: 4
sizeof-size_t: 8
shell-path: D:/git-sdk-64-build-installers/usr/bin/sh
feature: fsmonitor--daemon
libcurl: 8.9.0
OpenSSL: OpenSSL 3.2.2 4 Jun 2024
zlib: 1.3.1
uname: Windows 10.0 22631
compiler info: gnuc: 14.1
libc info: no libc information available
$SHELL (typically, interactive shell): C:\Program Files\Git\usr\bin\bash.exe


[Enabled Hooks]
not run from a git repository - no hooks to show


* Re: `git diff --break-rewrites` does not work (otherwise it should break rewrite into delete and create, for `--find-renames` to work)
From: Jeff King @ 2024-09-11  6:48 UTC
  To: Han Jiang; +Cc: Git Mailing List

On Tue, Sep 10, 2024 at 11:07:19PM +1200, Han Jiang wrote:

> Thank you for filling out a Git bug report!
> Please answer the following questions to help us understand your issue.
> 
> What did you do before the bug happened? (Steps to reproduce your issue)
> 
> cd '/'; cd '/'; rm --force --recursive -- './test_git2'; mkdir "$_"; cd "$_";
> mkdir --parents -- './repo';
> git init './repo'
> echo -e 'a\nb\nc\nd\ne\nf\ng\nh\ni\nj' >'./repo/file1'
> echo -e '0\n1\n2\n3\n4\n5\n6\n7\n8\n9' >'./repo/file2'
> git -C './repo' add './file1' './file2'
> mv './repo/file2' './repo/file3'
> mv './repo/file1' './repo/file2'
> git -C './repo' add --intent-to-add './file3'
> git -C './repo' diff --break-rewrites='50%/50%' --find-renames='50%'
> 
> What did you expect to happen? (Expected behavior)
> 
> `git diff` outputs: file1 rename to file2, file2 rename to file3
> 
> What happened instead? (Actual behavior)
> 
> `git diff` outputs: file1 remove all content, file2 complete rewrite,
> file3 add all content

It's because your toy example is too small. Try:

  seq 400 >repo/file2

instead of the 0-9 input, which will then do what you expect.
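
A minimal sketch of the reporter's steps with that one substitution, for anyone who wants to re-run it. The temporary directory, the `out` variable, and the `--name-status` flag (used only to keep the output short) are choices made for this illustration and are not part of the original report.

```shell
#!/bin/sh
# Sketch only: same steps as the report, but file2 is large enough
# (seq 400 produces 1492 bytes) to pass the minimum-size check.
set -e
tmp=$(mktemp -d)                 # scratch location, assumed for this example
git init --quiet "$tmp/repo"
cd "$tmp/repo"

printf '%s\n' a b c d e f g h i j >file1   # 20 bytes, as in the report
seq 400 >file2                             # replaces the 0-9 input
git add file1 file2

# Shift the contents along: file2 -> file3, file1 -> file2.
mv file2 file3
mv file1 file2
git add --intent-to-add file3

# Same options as the report; --name-status just shortens the output.
out=$(git diff --break-rewrites='50%/50%' --find-renames='50%' --name-status)
printf '%s\n' "$out"
```

With the larger file, the status lines should report renames rather than a delete, a rewrite, and an add.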

There is a hard-coded MINIMUM_BREAK_SIZE limit which requires that one
of the files must be at least 400 bytes, presumably to avoid awkward
corner cases in the heuristics for very small files. That comes from
eeaa460314 ([PATCH] diff: Update -B heuristics., 2005-06-03), so quite
long ago. But I'm not sure if any science went into determining it.
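
To make the size rule concrete, here is a rough shell sketch of the check as described above. It is based only on the description in this message, not on Git's C source; `big_enough_to_break` is a hypothetical helper name, and the similarity scoring that Git applies after this gate is not modeled.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git: models only the size gate
# described above (one of the two files must be at least 400 bytes).
MINIMUM_BREAK_SIZE=400   # value quoted in this thread

# Usage: big_enough_to_break OLD_FILE NEW_FILE
# Exits 0 when the larger of the two files reaches the minimum size.
big_enough_to_break() {
    # $(( ... )) normalizes wc output that may carry leading spaces.
    old_size=$(($(wc -c <"$1")))
    new_size=$(($(wc -c <"$2")))
    if [ "$old_size" -gt "$new_size" ]; then
        max_size=$old_size
    else
        max_size=$new_size
    fi
    [ "$max_size" -ge "$MINIMUM_BREAK_SIZE" ]
}
```

For the two 20-byte files in the original report this check fails, so the pair is never considered for breaking; with `seq 400` output (1492 bytes) as one side, it passes.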

Do you have a real (non-toy) case where it should be triggering but
isn't? I wonder if we should consider making that hard-coded limit
configurable somehow.

-Peff


* Re: `git diff --break-rewrites` does not work (otherwise it should break rewrite into delete and create, for `--find-renames` to work)
From: Han Jiang @ 2024-09-11 10:30 UTC
  To: Jeff King; +Cc: Git Mailing List

Thank you for the explanation!
The bug report comes from an attempt to write examples that demonstrate
what those command-line options do and how they interact, so currently
there is no real-world case where the limit needs to be loosened.
It would be good to make the limit configurable if many people want
it. For my part, I just hope the documentation and the actual behavior
can be made consistent.

