git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Make XDF_NEED_MINIMAL default in blame.
@ 2014-03-16 10:43 Michael Andreen
  2014-03-16 12:12 ` Thomas Rast
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Andreen @ 2014-03-16 10:43 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2794 bytes --]

Currently git blame has a big problem finding copies and moves when you
split up a big file into smaller ones. One example in the git repository
is 2cf565c, which split the documentation into smaller files.

In 582aa00 XDF_NEED_MINIMAL was removed as the default for performance
reasons, mainly for diff and rebase, but blame was also changed.

In 059a500 the problem with blame was noticed and the flag --minimal was
introduced. However this flag is not documented and it is not possible
to set when using "git gui blame".

Setting XDF_NEED_MINIMAL as default has a small performance impact when
you run on a file with few modifications. However, if you run it on a
file with a bigger number of modifications, the performance impact is
small enough to not be noticable.

((2cf565c...))$ time PAGER=cat git blame -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.003s
user    0m0.002s
sys 0m0.000s

((2cf565c...))$ time PAGER=cat git blame --minimal -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.010s
user    0m0.009s
sys 0m0.000s

((2cf565c...))$ time PAGER=cat git blame -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.010s
user    0m0.010s
sys 0m0.000s

((2cf565c...))$ time PAGER=cat git blame --minimal -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.028s
user    0m0.027s
sys 0m0.000s

(master)$ time PAGER=cat git blame -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m2.338s
user    0m2.283s
sys 0m0.056s

(master)$ time PAGER=cat git blame --minimal -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m2.355s
user    0m2.285s
sys 0m0.069s

(master)$ time PAGER=cat git blame -C -M cache.h > /dev/null

real    0m1.755s
user    0m1.730s
sys 0m0.024s

(master)$ time PAGER=cat git blame --minimal -C -M cache.h > /dev/null

real    0m1.785s
user    0m1.770s
sys 0m0.014s

(master)$ time PAGER=cat git blame -C -C -C -M cache.h > /dev/null

real    0m31.515s
user    0m30.810s
sys 0m0.684s

(master)$ time PAGER=cat git blame --minimal -C -C -C -M cache.h >
/dev/null

real    0m31.504s
user    0m30.885s
sys 0m0.598s

Signed-off-by: Michael Andreen <harv@ruin.nu>
---
Additional measurements attached, the variation is fairly small.

The --minimal flag is still there, but didn't want to break scripts
depending on it.

 builtin/blame.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index e5b5d71..0e7ebd0 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -42,7 +42,7 @@ static int show_root;
 static int reverse;
 static int blank_boundary;
 static int incremental;
-static int xdl_opts;
+static int xdl_opts = XDF_NEED_MINIMAL;
 static int abbrev = -1;
 static int no_whole_file_rename;
 
-- 
1.8.3.2


[-- Attachment #2: blame-time.txt --]
[-- Type: text/plain, Size: 4750 bytes --]

(master)$ time PAGER=cat git blame -C -M cache.h > /dev/null

real	0m1.767s
user	0m1.747s
sys	0m0.018s
(master)$ time PAGER=cat git blame -C -M cache.h > /dev/null

real	0m1.755s
user	0m1.730s
sys	0m0.024s
(master)$ time PAGER=cat git blame -C -M cache.h > /dev/null

real	0m1.784s
user	0m1.757s
sys	0m0.025s
(master)$ time PAGER=cat git blame --minimal -C -M cache.h > /dev/null

real	0m1.813s
user	0m1.797s
sys	0m0.014s
(master)$ time PAGER=cat git blame --minimal -C -M cache.h > /dev/null

real	0m1.790s
user	0m1.770s
sys	0m0.018s
(master)$ time PAGER=cat git blame --minimal -C -M cache.h > /dev/null

real	0m1.785s
user	0m1.770s
sys	0m0.014s
(master)$ time PAGER=cat git blame --minimal -C -M cache.h > /dev/null

real	0m1.794s
user	0m1.770s
sys	0m0.022s
(master)$ time PAGER=cat git blame -C -C -C -M cache.h > /dev/null

real	0m31.515s
user	0m30.810s
sys	0m0.684s
(master)$ time PAGER=cat git blame -C -C -C -M cache.h > /dev/null

real	0m31.594s
user	0m30.879s
sys	0m0.695s
(master)$ time PAGER=cat git blame --minimal -C -C -C -M cache.h > /dev/null

real	0m31.666s
user	0m31.054s
sys	0m0.591s
(master)$ time PAGER=cat git blame --minimal -C -C -C -M cache.h > /dev/null

real	0m31.504s
user	0m30.885s
sys	0m0.598s

(master)$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.355s
user	0m2.319s
sys	0m0.035s
(master)$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.352s
user	0m2.292s
sys	0m0.059s
(master)$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.354s
user	0m2.312s
sys	0m0.040s
(master)$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.338s
user	0m2.283s
sys	0m0.056s
(master)$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.376s
user	0m2.302s
sys	0m0.071s
(master)$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.362s
user	0m2.312s
sys	0m0.049s
(master)$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.360s
user	0m2.301s
sys	0m0.057s
(master)$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m2.355s
user	0m2.285s
sys	0m0.069s


---------------------------------------------------

((2cf565c...))$ time PAGER=cat git blame -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.003s
user	0m0.002s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.003s
user	0m0.003s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.004s
user	0m0.003s
sys	0m0.001s
((2cf565c...))$ time PAGER=cat git blame -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.004s
user	0m0.003s
sys	0m0.001s
((2cf565c...))$ time PAGER=cat git blame -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.003s
user	0m0.002s
sys	0m0.001s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.010s
user	0m0.009s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.011s
user	0m0.010s
sys	0m0.001s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.012s
user	0m0.012s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.011s
user	0m0.011s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.008s
user	0m0.007s
sys	0m0.002s
((2cf565c...))$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.011s
user	0m0.011s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.010s
user	0m0.010s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.010s
user	0m0.009s
sys	0m0.000s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.028s
user	0m0.024s
sys	0m0.003s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.032s
user	0m0.029s
sys	0m0.002s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.027s
user	0m0.025s
sys	0m0.001s
((2cf565c...))$ time PAGER=cat git blame --minimal -C -C -C -M Documentation/git-ls-files.txt > /dev/null

real	0m0.028s
user	0m0.027s
sys	0m0.000s


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Make XDF_NEED_MINIMAL default in blame.
  2014-03-16 10:43 Michael Andreen
@ 2014-03-16 12:12 ` Thomas Rast
  2014-03-16 12:32   ` Michael Andreen
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Rast @ 2014-03-16 12:12 UTC (permalink / raw)
  To: Michael Andreen; +Cc: git

Michael Andreen <harv@ruin.nu> writes:

> The --minimal flag is still there, but didn't want to break scripts
> depending on it.

If I specify --no-minimal, does that turn it off again?

-- 
Thomas Rast
tr@thomasrast.ch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Make XDF_NEED_MINIMAL default in blame.
  2014-03-16 12:12 ` Thomas Rast
@ 2014-03-16 12:32   ` Michael Andreen
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Andreen @ 2014-03-16 12:32 UTC (permalink / raw)
  To: git; +Cc: Thomas Rast

On Sunday, March 16, 2014 01:12:01 PM Thomas Rast wrote:
> Michael Andreen <harv@ruin.nu> writes:
> 
> > The --minimal flag is still there, but didn't want to break scripts
> > depending on it.
> 
> If I specify --no-minimal, does that turn it off again?
> 

Yes, that works.

/Michael

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Make XDF_NEED_MINIMAL default in blame.
@ 2014-03-20 20:18 Michael Andreen
  2014-03-20 20:45 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Andreen @ 2014-03-20 20:18 UTC (permalink / raw)
  To: gitster; +Cc: git

Currently git blame has a big problem finding copies and moves when you
split up a big file into smaller ones. One example in the git repository
is 2cf565c, which split the documentation into smaller files.

In 582aa00 XDF_NEED_MINIMAL was removed as the default for performance
reasons, mainly for diff and rebase, but blame was also changed.

In 059a500 the problem with blame was noticed and the flag --minimal was
introduced. However this flag is not documented and it is not possible
to set when using "git gui blame".

Setting XDF_NEED_MINIMAL as default has a small performance impact when
you run on a file with few modifications. However, if you run it on a
file with a bigger number of modifications, the performance impact is
small enough to not be noticable.

The previous behavior can still be activated with --no-minimal.

((2cf565c...))$ time PAGER=cat git blame -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.003s
user    0m0.002s
sys 0m0.000s

((2cf565c...))$ time PAGER=cat git blame --minimal -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.010s
user    0m0.009s
sys 0m0.000s

((2cf565c...))$ time PAGER=cat git blame -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.010s
user    0m0.010s
sys 0m0.000s

((2cf565c...))$ time PAGER=cat git blame --minimal -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m0.028s
user    0m0.027s
sys 0m0.000s

(master)$ time PAGER=cat git blame -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m2.338s
user    0m2.283s
sys 0m0.056s

(master)$ time PAGER=cat git blame --minimal -C -C -C -M
    Documentation/git-ls-files.txt > /dev/null

real    0m2.355s
user    0m2.285s
sys 0m0.069s

(master)$ time PAGER=cat git blame -C -M cache.h > /dev/null

real    0m1.755s
user    0m1.730s
sys 0m0.024s

(master)$ time PAGER=cat git blame --minimal -C -M cache.h > /dev/null

real    0m1.785s
user    0m1.770s
sys 0m0.014s

(master)$ time PAGER=cat git blame -C -C -C -M cache.h > /dev/null

real    0m31.515s
user    0m30.810s
sys 0m0.684s

(master)$ time PAGER=cat git blame --minimal -C -C -C -M cache.h >
/dev/null

real    0m31.504s
user    0m30.885s
sys 0m0.598s

Signed-off-by: Michael Andreen <harv@ruin.nu>
---
There hasn't been any arguments against this patch. Just updated the message 
with a note about --no-minimal.

Applies cleanly on both master and maint.

 builtin/blame.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index e5b5d71..0e7ebd0 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -42,7 +42,7 @@ static int show_root;
 static int reverse;
 static int blank_boundary;
 static int incremental;
-static int xdl_opts;
+static int xdl_opts = XDF_NEED_MINIMAL;
 static int abbrev = -1;
 static int no_whole_file_rename;
 
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Make XDF_NEED_MINIMAL default in blame.
  2014-03-20 20:18 [PATCH] Make XDF_NEED_MINIMAL default in blame Michael Andreen
@ 2014-03-20 20:45 ` Junio C Hamano
  2014-03-20 21:21   ` Michael Andreen
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2014-03-20 20:45 UTC (permalink / raw)
  To: Michael Andreen; +Cc: git

Michael Andreen <harv@ruin.nu> writes:

> There hasn't been any arguments against this patch. Just updated the message 
> with a note about --no-minimal.

There hasn't been any argument for this patch, either.

It is not like we are still in year 2007; timing result in a small
project like Git itself is not a good enough argument to change a
well established default at this late in the game, especially when
there are ways like command line options for users to specify their
preferred settings.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Make XDF_NEED_MINIMAL default in blame.
  2014-03-20 20:45 ` Junio C Hamano
@ 2014-03-20 21:21   ` Michael Andreen
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Andreen @ 2014-03-20 21:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

On Thursday, March 20, 2014 01:45:21 PM Junio C Hamano wrote:
> There hasn't been any argument for this patch, either.
> 
> It is not like we are still in year 2007; timing result in a small
> project like Git itself is not a good enough argument to change a
> well established default at this late in the game, especially when
> there are ways like command line options for users to specify their
> preferred settings.

The main reason why I submitted it is because the current behavior is very 
unintuitive when you split big files into smaller ones. Like the one that was 
done in 2cf565c.

If you do:

 $ git checkout 2cf565c
 $ git blame -C -C -C -M Documentation/git-ls-files.txt

then blame will not be able to detect any moves, even though big chunks have 
been moved without whitespace changes. Worse is that if you do

 $ git gui blame Documentation/git-ls-files.txt

then that won't see any movies either, and there is no way to make git gui 
blame use --minimal.

I spent a lot of time at $work trying to code-review a big split like this, 
wondering why "git gui blame" and "git blame -C -C -C -M" couldn't detect 
moves when complete functions had been moved with no changes (not even 
whitespace). So by making it default it will hopefully help someone else from 
losing time investigating this.

It wasn't until I did a bisect on git, finding 582aa00 and later googling for 
XDF_NEED_MINIMAL to find 059a500, that I found out about --minimal. Which is 
not documented anywhere.

I guess it can be argued for just documenting --minimal and either patching 
"git gui blame" to use it, or make it configurable. However I think the 
current default is really confusing, and the reason for changing it back in 
2007 was for performance, so that was what I tried to focus on, hopefully 
showing that there aren't any noticeable differences.

/Michael

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-03-20 21:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-20 20:18 [PATCH] Make XDF_NEED_MINIMAL default in blame Michael Andreen
2014-03-20 20:45 ` Junio C Hamano
2014-03-20 21:21   ` Michael Andreen
  -- strict thread matches above, loose matches on Subject: below --
2014-03-16 10:43 Michael Andreen
2014-03-16 12:12 ` Thomas Rast
2014-03-16 12:32   ` Michael Andreen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).