git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Performance issue of 'git branch'
@ 2009-07-22 23:59 Carlos R. Mafra
  2009-07-23  0:21 ` Linus Torvalds
  2009-07-23  0:23 ` SZEDER Gábor
  0 siblings, 2 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-22 23:59 UTC (permalink / raw)
  To: git

Hi,

When I run 'git branch' in the linux-2.6 repo I think it takes
too long to finish (with cold cache):

[mafra@Pilar:linux-2.6]$ time git branch
  27-stable
  28-stable
  29-stable
  30-stable
  dev-private
* master
  option
  sparse
  stern
0.00user 0.05system 0:05.73elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (209major+1380minor)pagefaults 0swaps

This is with git 1.6.4.rc1.10.g2a67 and the kernel is 2.6.31-rc3+. The
machine is a 64bit Vaio laptop which is 1+ year old (so it is not "slow").

Repeating the command a second time takes basically zero seconds, but
this is more or less what I would expect in the first time too.

I use git to track linux-2.6 for 2 years now, and I remember that
'git branch' is slow for quite some time, so it is not a regression
or something. It is just now that I took the courage to report this
small issue.

I did a 'strace' and this is where it spent most of the time:

1248301060.654911 open(".git/refs/heads/sparse", O_RDONLY) = 6
1248301060.654985 read(6, "60afdf6a4065a170ad829b4d79a86ec0"..., 255) = 41
1248301060.655056 read(6, "", 214)      = 0
1248301060.655116 close(6)              = 0
1248301060.680754 lstat(".git/refs/heads/stern", 0x7fff80bfa8d0) = -1 ENOENT (No such file or directory)
1248301064.018491 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
1248301064.018641 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f409ffa7000
1248301064.018722 write(1, "  27-stable\33[m\n", 15) = 15

I don't know why .git/refs/heads/stern does not exist and why it takes
so long with it. That branch is functional ('git checkout stern' succeeds),
as well as all the others. But strangely .git/refs/heads/ contains only

[mafra@Pilar:linux-2.6]$ ls .git/refs/heads/
dev-private  master  sparse

which, apart from "master", are the last branches that I created.

I occasionally run 'git gc --aggressive --prune" to optimize the repo,
but other than that I don't do anything fancy, just 'pull' almost
every day and 'bisect' (which is becoming a rare event now :-)

So I would like to ask what should I do to recover the missing files
in .git/refs/heads/ (which apparently is the cause for my issue) and
how I can avoid losing them in the first place.

Also, is there a way to "fix" the 4-secs pause in that lstat() in
case the files in .git/refs/heads/ get lost again?

Thanks in advance,
Carlos

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-22 23:59 Performance issue of 'git branch' Carlos R. Mafra
@ 2009-07-23  0:21 ` Linus Torvalds
  2009-07-23  0:51   ` Linus Torvalds
  2009-07-23  1:22   ` Carlos R. Mafra
  2009-07-23  0:23 ` SZEDER Gábor
  1 sibling, 2 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  0:21 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: git



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> 
> When I run 'git branch' in the linux-2.6 repo I think it takes
> too long to finish (with cold cache):
> 
> [mafra@Pilar:linux-2.6]$ time git branch
>   27-stable
>   28-stable
>   29-stable
>   30-stable
>   dev-private
> * master
>   option
>   sparse
>   stern
> 0.00user 0.05system 0:05.73elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (209major+1380minor)pagefaults 0swaps
> 
> This is with git 1.6.4.rc1.10.g2a67 and the kernel is 2.6.31-rc3+. The
> machine is a 64bit Vaio laptop which is 1+ year old (so it is not "slow").

When have you last repacked the repository?

What you're descibing is basically IO overhead, and if you don't have 
packed references, it's going to read a lot of small files.

> I use git to track linux-2.6 for 2 years now, and I remember that
> 'git branch' is slow for quite some time, so it is not a regression
> or something. It is just now that I took the courage to report this
> small issue.
> 
> I did a 'strace' and this is where it spent most of the time:
> 
> 1248301060.654911 open(".git/refs/heads/sparse", O_RDONLY) = 6
> 1248301060.654985 read(6, "60afdf6a4065a170ad829b4d79a86ec0"..., 255) = 41
> 1248301060.655056 read(6, "", 214)      = 0
> 1248301060.655116 close(6)              = 0
> 1248301060.680754 lstat(".git/refs/heads/stern", 0x7fff80bfa8d0) = -1 ENOENT (No such file or directory)
> 1248301064.018491 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
> 1248301064.018641 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f409ffa7000
> 1248301064.018722 write(1, "  27-stable\33[m\n", 15) = 15
> 
> I don't know why .git/refs/heads/stern does not exist and why it takes
> so long with it. That branch is functional ('git checkout stern' succeeds),
> as well as all the others. But strangely .git/refs/heads/ contains only
> 
> [mafra@Pilar:linux-2.6]$ ls .git/refs/heads/
> dev-private  master  sparse
> 
> which, apart from "master", are the last branches that I created.

Ok, this actually means that you _have_ repacked the repo, and the rest of 
the branches are all nicely packed in .git/packed-refs.

But that four _second_ lstat() is really disgusting.

Let me guess: if you do a "ls -ld .git/refs/heads" you get a very big 
directory, despite it only having three entries in it. And your filesystem 
doesn't have name hashing enabled, so searching for a non-existent file 
involves looking through _all_ of the empty slots.

Try this:

	git pack-refs --all

	rmdir .git/refs/heads
	rmdir .git/refs/tags

	mkdir .git/refs/heads
	mkdir .git/refs/tags

and see if it magically speeds up.

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-22 23:59 Performance issue of 'git branch' Carlos R. Mafra
  2009-07-23  0:21 ` Linus Torvalds
@ 2009-07-23  0:23 ` SZEDER Gábor
  2009-07-23  2:25   ` Carlos R. Mafra
  1 sibling, 1 reply; 74+ messages in thread
From: SZEDER Gábor @ 2009-07-23  0:23 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: git

Hi,


On Thu, Jul 23, 2009 at 01:59:14AM +0200, Carlos R. Mafra wrote:

> I don't know why .git/refs/heads/stern does not exist and why it takes
> so long with it. That branch is functional ('git checkout stern' succeeds),
> as well as all the others. But strangely .git/refs/heads/ contains only
> 
> [mafra@Pilar:linux-2.6]$ ls .git/refs/heads/
> dev-private  master  sparse
> 
> which, apart from "master", are the last branches that I created.
> 
> I occasionally run 'git gc --aggressive --prune" to optimize the repo,
> but other than that I don't do anything fancy, just 'pull' almost
> every day and 'bisect' (which is becoming a rare event now :-)
> 
> So I would like to ask what should I do to recover the missing files
> in .git/refs/heads/ (which apparently is the cause for my issue) and
> how I can avoid losing them in the first place.

have a look at .git/packed-refs and 'git pack-refs'.


Best,
Gábor

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  0:21 ` Linus Torvalds
@ 2009-07-23  0:51   ` Linus Torvalds
  2009-07-23  0:55     ` Linus Torvalds
  2009-07-23  1:22   ` Carlos R. Mafra
  1 sibling, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  0:51 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: git



On Wed, 22 Jul 2009, Linus Torvalds wrote:
> 
> Try this:
> 
> 	git pack-refs --all
> 
> 	rmdir .git/refs/heads
> 	rmdir .git/refs/tags
> 
> 	mkdir .git/refs/heads
> 	mkdir .git/refs/tags
> 
> and see if it magically speeds up.

In fact, you could also just try

	mv .git/refs .git/temp-refs &&
	cp -a .git/temp-refs .git/refs &&
	rm -rf .git/temp-refs

which will re-create other subdirectories too (like .git/refs/remotes 
etc).

Of course, depending on your particular filesystem, a better fix might be 
to enable filename hashing, which gets rid of the whole "look through all 
the old empty stale directory entries to see if there's a filename there" 
issue. That won't fix 'readdir()' performance, but it should fix your 
insane 4-second lstat() thing.

If you have ext3, you'd do something like

	tune2fs -O dir_index /dev/<node-of-your-filesystem-goes-here>

but as mentioned, even with directory indexing it can actually make sense 
to recreate directories that at some point _used_ to be large, but got 
shrunk down to something much smaller. It's a generic directory problem 
(not just ext3, not just unix, it's a common issue across filesystems. 
It's not _universal_ - some smarter filesystems really do shrink their 
directories - but it's certainly not unusual).

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  0:51   ` Linus Torvalds
@ 2009-07-23  0:55     ` Linus Torvalds
  2009-07-23  2:02       ` Carlos R. Mafra
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  0:55 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: git



On Wed, 22 Jul 2009, Linus Torvalds wrote:
> 
> If you have ext3, you'd do something like
> 
> 	tune2fs -O dir_index /dev/<node-of-your-filesystem-goes-here>

One last email note on this subject. Really. Promise.

If you do that "tune2fs -O dir_index" thing, it will only take effect for 
_newly_ created directories. So you'll still need to do that whole 
"mv+cp+rm" dance, just to make sure that the refs directories are all new.

I think you can also force all directories to be indexed by using fsck, 
but I forget the details. I'm sure man-pages will have it. Or google.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  0:21 ` Linus Torvalds
  2009-07-23  0:51   ` Linus Torvalds
@ 2009-07-23  1:22   ` Carlos R. Mafra
  2009-07-23  2:20     ` Linus Torvalds
  1 sibling, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  1:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed 22.Jul'09 at 17:21:48 -0700, Linus Torvalds wrote:
> 
> When have you last repacked the repository?

Last week or so, with 'git repack -d -a'


> > [mafra@Pilar:linux-2.6]$ ls .git/refs/heads/
> > dev-private  master  sparse
> > 
> > which, apart from "master", are the last branches that I created.
> 
> Ok, this actually means that you _have_ repacked the repo, and the rest of 
> the branches are all nicely packed in .git/packed-refs.

Yes, now I saw the other branches inside packed-refs.

> But that four _second_ lstat() is really disgusting.
> 
> Let me guess: if you do a "ls -ld .git/refs/heads" you get a very big 
> directory, despite it only having three entries in it. 

[mafra@Pilar:linux-2.6]$ ls -ld .git/refs/heads
drwxr-xr-x 2 mafra mafra 4096 2009-07-22 23:01 .git/refs/heads/

> And your filesystem 
> doesn't have name hashing enabled, so searching for a non-existent file 
> involves looking through _all_ of the empty slots.

I use ext3 without changing any defaults that I know of (I simply compile
and boot the kernel of the day), and I have no idea if name hashing
is enabled here.

> Try this:
> 
> 	git pack-refs --all
> 
> 	rmdir .git/refs/heads
> 	rmdir .git/refs/tags
> 
> 	mkdir .git/refs/heads
> 	mkdir .git/refs/tags
> 
> and see if it magically speeds up.

It didn't change things, unfortunately.

After 'echo 3 > /proc/sys/vm/drop_caches' it still takes too long,

1248310449.693085 munmap(0x7f50bcd11000, 164) = 0
1248310449.693187 lstat(".git/refs/heads/sparse", 0x7fff618c0960) = -1 ENOENT (No such file or directory)
1248310449.719112 lstat(".git/refs/heads/stern", 0x7fff618c0960) = -1 ENOENT (No such file or directory)
1248310453.014041 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
1248310453.014183 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f50bcd11000

Perhaps I should delete the "stern" branch, but I would like to learn why
it is slowing things, because it also happened before (in fact it is always
like this, afaicr)

Do you have another theory? (now .git/refs/heads is empty)

Thanks,
Carlos

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  0:55     ` Linus Torvalds
@ 2009-07-23  2:02       ` Carlos R. Mafra
  2009-07-23  2:28         ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  2:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed 22.Jul'09 at 17:55:51 -0700, Linus Torvalds wrote:
> On Wed, 22 Jul 2009, Linus Torvalds wrote:
> > 
> > If you have ext3, you'd do something like
> > 
> > 	tune2fs -O dir_index /dev/<node-of-your-filesystem-goes-here>
> 
> One last email note on this subject. Really. Promise.
> 
> If you do that "tune2fs -O dir_index" thing, it will only take effect for 
> _newly_ created directories. So you'll still need to do that whole 
> "mv+cp+rm" dance, just to make sure that the refs directories are all new.

Ok, now I also did the "dir_index" thing followed by the mv+cp+rm instructions.
It doesn't change the 3.5 secs delay in that single line,

1248313742.355195 lstat(".git/refs/heads/sparse", 0x7fff0c663ab0) = -1 ENOENT (No such file or directory)
1248313742.381178 lstat(".git/refs/heads/stern", 0x7fff0c663ab0) = -1 ENOENT (No such file or directory)
1248313745.804637 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0

Just to double check,

[root@Pilar linux-2.6]# tune2fs -l /dev/sda5 |grep dir_index
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file

(and I did the mv+cp+rm after setting "dir_index")

Is there another way to check what is going on with that anomalous lstat()?
[ perhaps I will try 'perf' after I read how to use it ]

Thanks,
Carlos

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  1:22   ` Carlos R. Mafra
@ 2009-07-23  2:20     ` Linus Torvalds
  2009-07-23  2:23       ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  2:20 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Git Mailing List, Junio C Hamano



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> > Let me guess: if you do a "ls -ld .git/refs/heads" you get a very big 
> > directory, despite it only having three entries in it. 
> 
> [mafra@Pilar:linux-2.6]$ ls -ld .git/refs/heads
> drwxr-xr-x 2 mafra mafra 4096 2009-07-22 23:01 .git/refs/heads/

Hmm. That's just a single block. 

Then I really don't see why the lstat takes so long.

> After 'echo 3 > /proc/sys/vm/drop_caches' it still takes too long,
> 
> 1248310449.693085 munmap(0x7f50bcd11000, 164) = 0
> 1248310449.693187 lstat(".git/refs/heads/sparse", 0x7fff618c0960) = -1 ENOENT (No such file or directory)
> 1248310449.719112 lstat(".git/refs/heads/stern", 0x7fff618c0960) = -1 ENOENT (No such file or directory)
> 1248310453.014041 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
> 1248310453.014183 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f50bcd11000

Use 'strace -T', which shows how long the actual system calls take, rather 
than '-tt' which just shows when they started.

Maybe the four seconds is something else than the lstat - page faults on 
the pack-file in between the lstat and the fstat, for example.

> Perhaps I should delete the "stern" branch, but I would like to learn why
> it is slowing things, because it also happened before (in fact it is always
> like this, afaicr)

Absolutely. Don't delete it until we figure out what takes so long there.

> Do you have another theory? (now .git/refs/heads is empty)

Clearly it's IO, but if that 'lstat()' was just a red herring, then I 
suspect it's IO on the pack-file. If so, I'd further guess that your VAIO 
has some pitiful 4200rpm harddisk that is slow as hell and has horrible 
seek latencies, and the CPU is way overpowered compared to the cruddy 
disk.

It probably does the object lookup. You can see some debug output if you 
do

	GIT_DEBUG_LOOKUP=1 git branch

and that will show you the patterns. It won't be very pretty, especially 
if you have several pack-files, but maybe we can figure out what's up.

Hmm. I wonder.. I suspect 'git branch' looks up _all_ refs, and then 
afterwards it filters them. So even though it only prints out a few 
branches, maybe it will look at all the tags etc of the whole repository.

Ooh yes. That would do it. It's going to peel and look up every single ref 
it finds, so it's going to look up _hundreds_ of objects (all the tags, 
all the commits they point to, etc etc). Even if it then only shows a 
couple of branches.

Junio, any ideas?

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:20     ` Linus Torvalds
@ 2009-07-23  2:23       ` Linus Torvalds
  2009-07-23  3:08         ` Linus Torvalds
                           ` (3 more replies)
  0 siblings, 4 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  2:23 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Git Mailing List, Junio C Hamano



On Wed, 22 Jul 2009, Linus Torvalds wrote:
> 
> Ooh yes. That would do it. It's going to peel and look up every single ref 
> it finds, so it's going to look up _hundreds_ of objects (all the tags, 
> all the commits they point to, etc etc). Even if it then only shows a 
> couple of branches.
> 
> Junio, any ideas?

I had one of my own.

Does this fix it?

It uses the "raw" version of 'for_each_ref()' (which doesn't verify that 
the ref is valid), and then does the "type verification" before it starts 
doing any gentle commit lookup.

That should hopefully mean that it no longer does tons of object lookups 
on refs that it's not actually interested in. 

		Linus

---
 builtin-branch.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin-branch.c b/builtin-branch.c
index 5687d60..54a89ff 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -240,6 +240,10 @@ static int append_ref(const char *refname, const unsigned char *sha1, int flags,
 	if (ARRAY_SIZE(ref_kind) <= i)
 		return 0;
 
+	/* Don't add types the caller doesn't want */
+	if ((kind & ref_list->kinds) == 0)
+		return 0;
+
 	commit = lookup_commit_reference_gently(sha1, 1);
 	if (!commit)
 		return error("branch '%s' does not point at a commit", refname);
@@ -248,10 +252,6 @@ static int append_ref(const char *refname, const unsigned char *sha1, int flags,
 	if (!is_descendant_of(commit, ref_list->with_commit))
 		return 0;
 
-	/* Don't add types the caller doesn't want */
-	if ((kind & ref_list->kinds) == 0)
-		return 0;
-
 	if (merge_filter != NO_FILTER)
 		add_pending_object(&ref_list->revs,
 				   (struct object *)commit, refname);
@@ -426,7 +426,7 @@ static void print_ref_list(int kinds, int detached, int verbose, int abbrev, str
 	ref_list.with_commit = with_commit;
 	if (merge_filter != NO_FILTER)
 		init_revisions(&ref_list.revs, NULL);
-	for_each_ref(append_ref, &ref_list);
+	for_each_rawref(append_ref, &ref_list);
 	if (merge_filter != NO_FILTER) {
 		struct commit *filter;
 		filter = lookup_commit_reference_gently(merge_filter_ref, 0);

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  0:23 ` SZEDER Gábor
@ 2009-07-23  2:25   ` Carlos R. Mafra
  0 siblings, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  2:25 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

Hi,

On Wed 22.Jul'09 at 19:23:23 -0500, SZEDER Gábor wrote:
> > So I would like to ask what should I do to recover the missing files
> > in .git/refs/heads/ (which apparently is the cause for my issue) and
> > how I can avoid losing them in the first place.
> 
> have a look at .git/packed-refs and 'git pack-refs'.

Yes, now I learned that the files were not really missing
as in "there is something wrong".

I will also start to use 'git pack-refs --prune' from time to time
now, in adition to 'git gc --prune' and 'git repack -d -a'.

But the takes-too-long 'git branch' issue is apparently caused
by something else.

Thanks Gábor,
Carlos

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:02       ` Carlos R. Mafra
@ 2009-07-23  2:28         ` Linus Torvalds
  2009-07-23 12:42           ` Jakub Narebski
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  2:28 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: git



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> 
> Is there another way to check what is going on with that anomalous lstat()?

I really don't think it's the lstat any more. Your directories look small 
and simple, and clearly the indexing made no difference.

See earlier email about using "strace -T" instead of "-tt". Also, I sent 
you a patch to try out just a minute ago, I think that may be it.

> [ perhaps I will try 'perf' after I read how to use it ]

I really like 'perf' (it does what oprofile did for me, but without the 
headaches), but it doesn't help with IO profiling.

I've actually often wanted to have a 'strace' that shows page faults as 
special system calls, but it's sadly nontrivial ;(

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:23       ` Linus Torvalds
@ 2009-07-23  3:08         ` Linus Torvalds
  2009-07-23  3:21           ` Linus Torvalds
  2009-07-23  3:18         ` Carlos R. Mafra
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  3:08 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Git Mailing List, Junio C Hamano



On Wed, 22 Jul 2009, Linus Torvalds wrote:
> 
> It uses the "raw" version of 'for_each_ref()' (which doesn't verify that 
> the ref is valid), and then does the "type verification" before it starts 
> doing any gentle commit lookup.
> 
> That should hopefully mean that it no longer does tons of object lookups 
> on refs that it's not actually interested in. 

Hmm. On my kernel repo, doing

	GIT_DEBUG_LOOKUP=1 git branch | wc -l

I get
 - before: 2121
 - after: 39

(where two of the lines are the actual 'git branch' output). So yeah, this 
should make a big difference. It now looks up just two objects (one of 
them duplicated because it checks "HEAD" - but the duplicate lookup won't 
result in any extra IO, so it's only two _uncached_ accesses).

The GIT_DEBUG_LOOKUP debug output probably does match the number of 
cold-cache IO's fairly well for something like this (at least to a first 
approximation), so I really hope my patch will fix your problem.

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:23       ` Linus Torvalds
  2009-07-23  3:08         ` Linus Torvalds
@ 2009-07-23  3:18         ` Carlos R. Mafra
  2009-07-23  3:27           ` Carlos R. Mafra
                             ` (2 more replies)
  2009-07-23  4:40         ` Junio C Hamano
  2009-07-23 16:48         ` Anders Kaseorg
  3 siblings, 3 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  3:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

First of all:
      * yes, my VAIO has a slow 4200 rpm disc :-(
      * strace -T indeed showed that lstat() was not guilty
      * GIT_DEBUG_LOOKUP=1 git branch produced ugly 2200+ lines

Now to the patch,

On Wed 22.Jul'09 at 19:23:39 -0700, Linus Torvalds wrote:
> > Ooh yes. That would do it. It's going to peel and look up every single ref 
> > it finds, so it's going to look up _hundreds_ of objects (all the tags, 
> > all the commits they point to, etc etc). Even if it then only shows a 
> > couple of branches.
> > 
> > Junio, any ideas?
> 
> I had one of my own.
> 
> Does this fix it?

Yes!

[mafra@Pilar:linux-2.6]$ time git branch
  27-stable
  28-stable
  29-stable
  30-stable
  dev-private
* master
  option
  sparse
  stern
0.00user 0.01system 0:01.50elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (42major+757minor)pagefaults 0swaps

01.50 is not that good, but it doesn't "feel" terrible as 4 seconds.
[ It is incredible how 4 secs feels really bad while 2 is acceptable... ]

So thank you very much, Linus! A 50% improvement here!

And I am happy to have finally reported it, after quietly suffering for so long 
thinking that "git is as fast as possible, so it is probably my fault".

PS: Out of curiosity, how many femtoseconds does it take in your 
state-of-the-art machine? :-)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  3:08         ` Linus Torvalds
@ 2009-07-23  3:21           ` Linus Torvalds
  2009-07-23 17:47             ` Tony Finch
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  3:21 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Git Mailing List, Junio C Hamano



On Wed, 22 Jul 2009, Linus Torvalds wrote:
> 
> The GIT_DEBUG_LOOKUP debug output probably does match the number of 
> cold-cache IO's fairly well for something like this (at least to a first 
> approximation), so I really hope my patch will fix your problem.

Side note: the object lookup binary search we do is simple and reasonably 
efficient, but it is _not_ very cache-friendly (where "cache-friendly" 
also in this case means IO caches).

There are more cache-friendly ways of searching, although the really 
clever ones would require us to switch the format of the pack-file index 
around. Which would be a fairly big pain (in addition to making the lookup 
a lot more complex).

The _simpler_ cache-friendly alternative is likely to try the "guess 
location by assuming the SHA1's are evenly spread out" thing doesn't jump 
back-and-forth like a binary search does.

We tried it a few years ago, but didn't do cold-cache numbers. And 
repositories were smaller too.

With something like the kernel repo, with 1.2+ million objects, a binary 
search needs about 21 comparisons for each object we look up. The index 
has a first-level fan-out of 256, so that takes away 8 of them, but we're 
still talking about 13 comparisons. With bad locality except for the very 
last ones.

Assuming a 4kB page-size, and about 170 index entries per page (~7 binary 
search levels), that's 6 pages we have to page-fault in for each search. 
And we probably won't start seeing lots of cache reuse until we hit 
hundreds or thousands of objects searched for.

With soemthing like "three iterations of newton-raphson + linear search", 
we might end up with more index entries looked at, but we'd quite possibly 
get much better locality.

I suspect the old newton-raphson patches we had (Discussions and patches 
back in April 2007 on this list) could be resurrected pretty easily.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  3:18         ` Carlos R. Mafra
@ 2009-07-23  3:27           ` Carlos R. Mafra
  2009-07-23  3:40           ` Carlos R. Mafra
  2009-07-23  3:47           ` Linus Torvalds
  2 siblings, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  3:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

On Thu 23.Jul'09 at  5:18:44 +0200, Carlos R. Mafra wrote:

>       * GIT_DEBUG_LOOKUP=1 git branch produced ugly 2200+ lines

With your patch applied it went down to 132 lines.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  3:18         ` Carlos R. Mafra
  2009-07-23  3:27           ` Carlos R. Mafra
@ 2009-07-23  3:40           ` Carlos R. Mafra
  2009-07-23  3:47           ` Linus Torvalds
  2 siblings, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  3:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

On Thu 23.Jul'09 at  5:18:44 +0200, Carlos R. Mafra wrote:

> 0.00user 0.01system 0:01.50elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (42major+757minor)pagefaults 0swaps
> 
> 01.50 is not that good, but it doesn't "feel" terrible as 4 seconds.
> [ It is incredible how 4 secs feels really bad while 2 is acceptable... ]

I need to sleep, as the number 4 seconds got stuck in my head. In my original
report it was much worse

0.00user 0.05system 0:05.73elapsed

So now it was a 75% improvement!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  3:18         ` Carlos R. Mafra
  2009-07-23  3:27           ` Carlos R. Mafra
  2009-07-23  3:40           ` Carlos R. Mafra
@ 2009-07-23  3:47           ` Linus Torvalds
  2009-07-23  4:10             ` Linus Torvalds
  2 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  3:47 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Git Mailing List, Junio C Hamano



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> 
> PS: Out of curiosity, how many femtoseconds does it take in your 
> state-of-the-art machine? :-)

Cold cache? 0.15s before the patch. 0.03s after.

So we're not talking femto-seconds, but I've got Intel SSD's that do 
random reads in well under a millisecond. Your pitiful 4200rpm drive 
probably takes 20ms for each seek. You don't really need that many IO's 
for it to take a second or two. Or four.

The kernel will do IO in bigger chunks than a single page, and there is 
_some_ locality to it all, so you won't see IO for each lookup. But with 
2000+ lines of GIT_DEBUG_LOOKUP, you probably do end up having a 
noticeable fraction of them being IO-causing, and another fraction causing 
seeks.

But I'll see if I can dig up my non-binary-search patch and see if I can 
make it go faster. My machine is fast, but not so fast that I can't 
measure it ;)

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  3:47           ` Linus Torvalds
@ 2009-07-23  4:10             ` Linus Torvalds
  2009-07-23  5:13               ` Junio C Hamano
  2009-07-23  5:17               ` Carlos R. Mafra
  0 siblings, 2 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  4:10 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Git Mailing List, Junio C Hamano



On Wed, 22 Jul 2009, Linus Torvalds wrote:
> 
> But I'll see if I can dig up my non-binary-search patch and see if I can 
> make it go faster. My machine is fast, but not so fast that I can't 
> measure it ;)

Oh. We actually merged a fixed version of it. I'd completely forgotten.
 
Enabled with 'GIT_USE_LOOKUP'. But it seems to give worse performance, 
despite giving me fewer searches: I get 2121 probes with binary searching, 
but only 1325 with the newton-raphson method (for the non-fixed 'git 
branch' case).

Using GIT_USE_LOOKUP actually results in fewer pagefaults (1391 vs 1473), 
but it's still slower. Interesting. Carlos, try it on your machine (just 
do

	export GIT_USE_LOOKUP=1
	time git branch

to try it, and 'unset GIT_USE_LOOKUP' to disable it.

(And note that the "=1" part isn't important - the only thing that matters 
is whether the environment variable is set or not - setting it to '0' will 
_not_ disable it, you need to 'unset' it).

With my fix to 'git branch', it doesn't matter. I get the same 
performance, and same number of page faults (676) regardless. So my patch 
makes the GIT_USE_LOOKUP=1 thing irrelevant.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:23       ` Linus Torvalds
  2009-07-23  3:08         ` Linus Torvalds
  2009-07-23  3:18         ` Carlos R. Mafra
@ 2009-07-23  4:40         ` Junio C Hamano
  2009-07-23  5:36           ` Linus Torvalds
  2009-07-23 16:07           ` Carlos R. Mafra
  2009-07-23 16:48         ` Anders Kaseorg
  3 siblings, 2 replies; 74+ messages in thread
From: Junio C Hamano @ 2009-07-23  4:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Carlos R. Mafra, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 22 Jul 2009, Linus Torvalds wrote:
>> 
>> Ooh yes. That would do it. It's going to peel and look up every single ref 
>> it finds, so it's going to look up _hundreds_ of objects (all the tags, 
>> all the commits they point to, etc etc). Even if it then only shows a 
>> couple of branches.
>> 
>> Junio, any ideas?
>
> I had one of my own.

It seems that I missed all the fun while going out to dinner.

> It uses the "raw" version of 'for_each_ref()' (which doesn't verify that 
> the ref is valid), and then does the "type verification" before it starts 
> doing any gentle commit lookup.

Hmm, we now have to remember what this patch did, if we ever wanted to
introduce negative refs later (see ef06b91 do_for_each_ref: perform the
same sanity check for leftovers., 2006-11-18).  Not exactly nice to spread
the codepaths that need to be updated.  Is the cold cache performance of
"git branch" to list your local branches that important?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  4:10             ` Linus Torvalds
@ 2009-07-23  5:13               ` Junio C Hamano
  2009-07-23  5:17               ` Carlos R. Mafra
  1 sibling, 0 replies; 74+ messages in thread
From: Junio C Hamano @ 2009-07-23  5:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Carlos R. Mafra, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 22 Jul 2009, Linus Torvalds wrote:
>> 
>> But I'll see if I can dig up my non-binary-search patch and see if I can 
>> make it go faster. My machine is fast, but not so fast that I can't 
>> measure it ;)
>
> Oh. We actually merged a fixed version of it. I'd completely forgotten.

As the commit message of 628522e (sha1-lookup: more memory efficient
search in sorted list of SHA-1, 2007-12-29) shows, it didn't get any great
performance improvements, even though it did make the probing quite a lot
less memory intensive.

Perhaps you can spot obvious inefficiency in the code that I failed to
see, just like you recently did for "show --cc" codepath?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  4:10             ` Linus Torvalds
  2009-07-23  5:13               ` Junio C Hamano
@ 2009-07-23  5:17               ` Carlos R. Mafra
  1 sibling, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23  5:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

On Wed 22.Jul'09 at 21:10:49 -0700, Linus Torvalds wrote:
> Enabled with 'GIT_USE_LOOKUP'. But it seems to give worse performance, 
> despite giving me fewer searches: I get 2121 probes with binary searching, 
> but only 1325 with the newton-raphson method (for the non-fixed 'git 
> branch' case).
> 
> Using GIT_USE_LOOKUP actually results in fewer pagefaults (1391 vs 1473), 
> but it's still slower. Interesting. Carlos, try it on your machine (just 
> do
> 
> 	export GIT_USE_LOOKUP=1
> 	time git branch
> 
> to try it, and 'unset GIT_USE_LOOKUP' to disable it.


GIT_USE_LOOKUP=1 makes is a bit slower overall. 

Without your patch, I get fewer pagefaults (1254 vs 1404) when
it is set, but it takes ~0.5s longer (it varies a bit).

> With my fix to 'git branch', it doesn't matter. I get the same 
> performance, and same number of page faults (676) regardless. So my patch 
> makes the GIT_USE_LOOKUP=1 thing irrelevant.

With your patch and GIT_USE_LOOKUP=1 I get 751 pagefaults, versus 775
if GIT_USE_LOOKUP is unset, but it is faster when unset.

So your patch without GIT_USE_LOOKUP=1 is the fastest option.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  4:40         ` Junio C Hamano
@ 2009-07-23  5:36           ` Linus Torvalds
  2009-07-23  5:52             ` Junio C Hamano
  2009-07-23 16:07           ` Carlos R. Mafra
  1 sibling, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23  5:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carlos R. Mafra, Git Mailing List



On Wed, 22 Jul 2009, Junio C Hamano wrote:
> 
> Hmm, we now have to remember what this patch did, if we ever wanted to
> introduce negative refs later (see ef06b91 do_for_each_ref: perform the
> same sanity check for leftovers., 2006-11-18).  Not exactly nice to spread
> the codepaths that need to be updated.  Is the cold cache performance of
> "git branch" to list your local branches that important?

Hmm. I do think that 7.5s is _way_ too long to wait for something as 
simple as "what branches do I have?".

And yes, it's also an operation that I'd expect to be quite possibly the 
first one you do when moving to a new repo, so cold-cache is realistic.

And the 'rawref' thing is exactly the same as the 'ref' version, except it 
doesn't do the null_sha1 check and the 'has_sha1-file()' check.

And since git branch will do something _better_ than the 'has_sha1_file()' 
check (by virtue of actually looking up the commit), I don't think that 
part is an issue. So the only issue is the is_null_sha1() thing.

And quite frankly, while the null-sha1 check may make sense, the way the 
flag is named right now (DO_FOR_EACH_INCLUDE_BROKEN), I think we might be 
better off re-thinking things later if we ever end up caring. That 
'is_null_sha1()' check should possibly be under a separate flag.

That said, while I think my patch was the simplest and most 
the problem could certainly have been fixed differently.

For example, instead of using 'for_each_ref()' and then splitting them by 
kind with that "detect kind" loop, it could instead have done two loops, 
ie

	if (kinds & REF_LOCAL_BRANCH)
		for_each_ref_in("refs/heads/", append_local, &ref_list);
	if (kinds & REF_REMOTE_BRANCH)
		for_each_ref_in("refs/remotes/", append_remote, &ref_list);

and avoided the other refs we aren't interested in _that_ way instead.

But it would be a bigger and involved patch. It gets really messy too (I 
tried), because when you use 'for_each_ref_in()' it removes the prefix as 
it goes along, but then the code in builtin-branch.c wants the prefix 
after all.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  5:36           ` Linus Torvalds
@ 2009-07-23  5:52             ` Junio C Hamano
  2009-07-23  6:04               ` Junio C Hamano
  0 siblings, 1 reply; 74+ messages in thread
From: Junio C Hamano @ 2009-07-23  5:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Carlos R. Mafra, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 22 Jul 2009, Junio C Hamano wrote:
>> 
>> Hmm, we now have to remember what this patch did, if we ever wanted to
>> introduce negative refs later (see ef06b91 do_for_each_ref: perform the
>> same sanity check for leftovers., 2006-11-18).  Not exactly nice to spread
>> the codepaths that need to be updated.
> ...
> And since git branch will do something _better_ than the 'has_sha1_file()' 
> check (by virtue of actually looking up the commit), I don't think that 
> part is an issue. So the only issue is the is_null_sha1() thing.

Exactly.

That is_null_sha1() thing was a remnant of your idea to represent deleted
ref that has a packed counterpart by storing 0{40} in a loose ref, so that
we can implement deletion efficiently.

Since we currently implement deletion by repacking packed refs if the ref
has a packed (possibly stale) one, we do not use such a "negative ref",
and skipping 0{40} done by the normal (i.e. non-raw) for_each_ref() family
is not necessary.

I was inclined to say that, because I never saw anybody complained that
deleting refs was too slow, we declare that we would forever stick to the
current implementation of ref deletion, and remove the is_null_sha1()
check from the do_one_ref() function, even for include-broken case.

But after thinking about it again, I'd say "if null, then skip" should be
outside the DO_FOR_EACH_INCLUDE_BROKEN anyway, because the null check is
not about brokenness of the ref, but is about a possible future expansion
to represent deleted ref with such a "negative ref" entry.

If we remove is_null_sha1() from do_one_ref(), or if we move it out of the
"include broken" thing, my "Not exactly nice" comment can be rescinded, as
doing the former (i.e. removal of is_null_sha1() check) is a promise that
we will never have to worry about negative refs, and doing the latter will
still protect callers of do_for_each_rawref() from negative refs if we
ever introduce them in some future.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  5:52             ` Junio C Hamano
@ 2009-07-23  6:04               ` Junio C Hamano
  2009-07-23 17:19                 ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Junio C Hamano @ 2009-07-23  6:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Carlos R. Mafra, Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

> Exactly.
>
> That is_null_sha1() thing was a remnant of your idea to represent deleted
> ref that has a packed counterpart by storing 0{40} in a loose ref, so that
> we can implement deletion efficiently.
>
> Since we currently implement deletion by repacking packed refs if the ref
> has a packed (possibly stale) one, we do not use such a "negative ref",
> and skipping 0{40} done by the normal (i.e. non-raw) for_each_ref() family
> is not necessary.
>
> I was inclined to say that, because I never saw anybody complained that
> deleting refs was too slow, we declare that we would forever stick to the
> current implementation of ref deletion, and remove the is_null_sha1()
> check from the do_one_ref() function, even for include-broken case.
>
> But after thinking about it again, I'd say "if null, then skip" should be
> outside the DO_FOR_EACH_INCLUDE_BROKEN anyway, because the null check is
> not about brokenness of the ref, but is about a possible future expansion
> to represent deleted ref with such a "negative ref" entry.
>
> If we remove is_null_sha1() from do_one_ref(), or if we move it out of the
> "include broken" thing, my "Not exactly nice" comment can be rescinded, as
> doing the former (i.e. removal of is_null_sha1() check) is a promise that
> we will never have to worry about negative refs, and doing the latter will
> still protect callers of do_for_each_rawref() from negative refs if we
> ever introduce them in some future.

That is, a patch like this (this should go to 'maint'), and my worries
will go away.

-- >8 --
Subject: do_one_ref(): null_sha1 check is not about broken ref

f8948e2 (remote prune: warn dangling symrefs, 2009-02-08) introduced a
more dangerous variant of for_each_ref() family that skips the check for
dangling refs, but it also made another unrelated check optional by
mistake.

The check to see if a ref points at 0{40} is not about brokenness, but is
about a possible future plan to represent a deleted ref by writing 40 "0"
in a loose ref when there is a stale version of the same ref already in
.git/packed-refs, so that we can implement deletion of a ref without
having to rewrite the packed refs file excluding the ref being deleted.
This check has to be outside of the conditional.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 refs.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index bb0762e..3da3c8c 100644
--- a/refs.c
+++ b/refs.c
@@ -531,9 +531,10 @@ static int do_one_ref(const char *base, each_ref_fn fn, int trim,
 {
 	if (strncmp(base, entry->name, trim))
 		return 0;
+	/* Is this a "negative ref" that represents a deleted ref? */
+	if (is_null_sha1(entry->sha1))
+		return 0;
 	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN)) {
-		if (is_null_sha1(entry->sha1))
-			return 0;
 		if (!has_sha1_file(entry->sha1)) {
 			error("%s does not point to a valid object!", entry->name);
 			return 0;

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:28         ` Linus Torvalds
@ 2009-07-23 12:42           ` Jakub Narebski
  2009-07-23 14:45             ` Carlos R. Mafra
  2009-07-23 16:25             ` Linus Torvalds
  0 siblings, 2 replies; 74+ messages in thread
From: Jakub Narebski @ 2009-07-23 12:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Carlos R. Mafra, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> > 
> > Is there another way to check what is going on with that anomalous lstat()?
> 
> I really don't think it's the lstat any more. Your directories look small 
> and simple, and clearly the indexing made no difference.
> 
> See earlier email about using "strace -T" instead of "-tt". Also, I sent 
> you a patch to try out just a minute ago, I think that may be it.
> 
> > [ perhaps I will try 'perf' after I read how to use it ]
> 
> I really like 'perf' (it does what oprofile did for me, but without the 
> headaches), but it doesn't help with IO profiling.
> 
> I've actually often wanted to have a 'strace' that shows page faults as 
> special system calls, but it's sadly nontrivial ;(

BTW. Would SystemTap help there?  Among contributed scripts there is
iotimes, so perhaps it would be possible to have iotrace...

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 12:42           ` Jakub Narebski
@ 2009-07-23 14:45             ` Carlos R. Mafra
  2009-07-23 16:25             ` Linus Torvalds
  1 sibling, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23 14:45 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, git

On Thu 23.Jul'09 at  5:42:03 -0700, Jakub Narebski wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> > > 
> > > Is there another way to check what is going on with that anomalous lstat()?
> > 
> > I really don't think it's the lstat any more. Your directories look small 
> > and simple, and clearly the indexing made no difference.
> > 
> > See earlier email about using "strace -T" instead of "-tt". Also, I sent 
> > you a patch to try out just a minute ago, I think that may be it.
> > 
> > > [ perhaps I will try 'perf' after I read how to use it ]
> > 
> > I really like 'perf' (it does what oprofile did for me, but without the 
> > headaches), but it doesn't help with IO profiling.
> > 
> > I've actually often wanted to have a 'strace' that shows page faults as 
> > special system calls, but it's sadly nontrivial ;(
> 
> BTW. Would SystemTap help there?  Among contributed scripts there is
> iotimes, so perhaps it would be possible to have iotrace...


I played a bit with 'blktrace' and 'btrace' and had two terminals
open side by side, one with 'strace git branch' and the other with
'blktrace'.

It was pretty obvious that exactly at the point where 'git branch'
was stalling (without Linus' patch) -- which I thought had to do
with lstat() -- there was a flurry of activity going on in 'btrace' 
output.

It would be nice if 'btrace' could be somehow unified with 'strace',
if that makes any sense.

Here are some numbers from my tests with blktrace (blkparse and btrace):

[root@Pilar mafra]# grep git blkparse-patch.txt |wc -l
811
[root@Pilar mafra]# grep git blkparse-nopatch.txt |wc -l
3479

where those lines with 'git' are something like

8,5    0      677     1.787350654 18591  I   R 204488479 + 40 [git]
8,0    0      678     1.787370489 18591  A   R 204488783 + 96 <- (8,5) 137529800
8,5    0      679     1.787371886 18591  Q   R 204488783 + 96 [git]
8,5    0      680     1.787375378 18591  G   R 204488783 + 96 [git]
8,5    0      681     1.787377613 18591  I   R 204488783 + 96 [git]

And the summary lines also indicate that the non-patched git makes
the disc work much harder:

*************** Without Linus' patch ******************************************

Total (8,5):
 Reads Queued:         764,   20,008KiB  Writes Queued:           0,        0KiB
 Read Dispatches:      764,   20,008KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:      764,   20,008KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:            0,        0KiB
 IO unplugs:           299               Timer unplugs:           2

Throughput (R/W): 4,003KiB/s / 0KiB/s
Events (8,5): 5,266 entries
Skips: 0 forward (0 -   0.0%)

************** With Linus' patch **********************************************

Total (sda5):
 Reads Queued:         171,    3,128KiB	 Writes Queued:           6,       24KiB
 Read Dispatches:      171,    3,128KiB	 Write Dispatches:        2,       24KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:      171,    3,128KiB	 Writes Completed:        2,       24KiB
 Read Merges:            0,        0KiB	 Write Merges:            4,       16KiB
 IO unplugs:            80        	 Timer unplugs:           0

Throughput (R/W): 1,632KiB/s / 12KiB/s
Events (sda5): 1,226 entries
Skips: 0 forward (0 -   0.0%)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  4:40         ` Junio C Hamano
  2009-07-23  5:36           ` Linus Torvalds
@ 2009-07-23 16:07           ` Carlos R. Mafra
  2009-07-23 16:19             ` Linus Torvalds
  1 sibling, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23 16:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Wed 22.Jul'09 at 21:40:36 -0700, Junio C Hamano wrote:
> Is the cold cache performance of "git branch" to list your 
> local branches that important?

I simply felt like something not optimal was going on, and in
some sense I still feel it even with Linus' patch applied...

Don't get me wrong, I am super happy that Linus fixed it
so quickly and I am grateful for that, but I am surely missing
some git internal reason why 'git branch' is not instantaneous
as I _naively_ expected.

Having learned about .git/packed-refs last night, today I tried
this (with cold cache),

[mafra@Pilar:linux-2.6]$ time awk '{print $2}' .git/packed-refs |grep heads| awk -F "/" '{print $3}'
0.00user 0.00system 0:00.12elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (3major+311minor)pagefaults 0swaps
27-stable
28-stable
29-stable
30-stable
dev-private
master
option
sparse
stern

and notice how that makes my pitiful harddisc look like Linus' SSD! And the
result is the same. 

[ If some branches are not inside .git/packed-refs but are listed in .git/refs/heads 
(like some of them were last night), it would require some modification to the
script, but it would still be faster ]

However, I know that I am missing something here and I would be happy to 
learn what.

Thanks in advance,
Carlos

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 16:07           ` Carlos R. Mafra
@ 2009-07-23 16:19             ` Linus Torvalds
  2009-07-23 16:53               ` Carlos R. Mafra
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23 16:19 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
> 
> Having learned about .git/packed-refs last night, today I tried
> this (with cold cache),
> 
> [mafra@Pilar:linux-2.6]$ time awk '{print $2}' .git/packed-refs |grep heads| awk -F "/" '{print $3}'
> 0.00user 0.00system 0:00.12elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (3major+311minor)pagefaults 0swaps
> 27-stable
> 28-stable
> 29-stable
> 30-stable
> dev-private
> master
> option
> sparse
> stern
> 
> and notice how that makes my pitiful harddisc look like Linus' SSD! And the
> result is the same. 

The result is the same, yes, but it doesn't do error checking.

What "git branch" does over and beyond just looking at the heads is to 
also look at the commits those heads point to. And the reason it sucks for 
you is that the commits are pretty spread out (particularly in the index 
file, but also in the pack-file) on disk. So each "verify this head" will 
likely involve at least one seek, and possibly four or five. 

And on your disk, five seeks is a tenth of a second. You can run hdparm, 
and it will probably say that you get 30MB/s off that laptop drive - but 
when doing small random reads you'll probably get performance in the order 
of a few tens of kilobytes, not megabytes. (With read-ahead and 
read-around it's probably going to be mostly ~64kB IO's and you'll 
probably get hundreds of kB per second, but you're going to care about 
just a few kB total of those).

So we _could_ make 'git branch' not actually read and verify the commits. 
It doesn't strictly _need_ to, unless you use 'git branch -v' or 
something. That would speed it up further, but the verification is nice, 
and as long as performance isn't _horrible_ I think we're better off doing 
it.

After all, you'll see the problem only once.

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 12:42           ` Jakub Narebski
  2009-07-23 14:45             ` Carlos R. Mafra
@ 2009-07-23 16:25             ` Linus Torvalds
  1 sibling, 0 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23 16:25 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Carlos R. Mafra, git



On Thu, 23 Jul 2009, Jakub Narebski wrote:
> 
> BTW. Would SystemTap help there?  Among contributed scripts there is
> iotimes, so perhaps it would be possible to have iotrace...

The problem I've had with all iotracers is that it's easy enough to get an 
IO trace, but it's basically almost impossible to integrate it with what 
actually _caused_ the IO.

Using 'strace -T' shows very clearly what operations are taking a long 
time. It's very useful for seeing what you should not do for good 
performance - including IO - and where it comes from. It's just that page 
faults are invisible to it.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  2:23       ` Linus Torvalds
                           ` (2 preceding siblings ...)
  2009-07-23  4:40         ` Junio C Hamano
@ 2009-07-23 16:48         ` Anders Kaseorg
  2009-07-23 19:03           ` Carlos R. Mafra
  3 siblings, 1 reply; 74+ messages in thread
From: Anders Kaseorg @ 2009-07-23 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Carlos R. Mafra, Git Mailing List, Junio C Hamano

On Wed, 22 Jul 2009, Linus Torvalds wrote:
> It uses the "raw" version of 'for_each_ref()' (which doesn't verify that 
> the ref is valid), and then does the "type verification" before it starts 
> doing any gentle commit lookup.

I submitted essentially the same patch in May:
  http://article.gmane.org/gmane.comp.version-control.git/120097
with the additional optimization that we don’t need to lookup commits at 
all unless we’re using -v, --merged, --no-merged, or --contains.  In my 
tests, it makes `git branch` 5 times faster on an uncached linux-2.6 
repository.

Anders

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 16:19             ` Linus Torvalds
@ 2009-07-23 16:53               ` Carlos R. Mafra
  2009-07-23 19:05                 ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23 16:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Thu 23.Jul'09 at  9:19:21 -0700, Linus Torvalds wrote:
> > 
> > and notice how that makes my pitiful harddisc look like Linus' SSD! And the
> > result is the same. 
> 
> The result is the same, yes, but it doesn't do error checking.

Oh, I see.

> So we _could_ make 'git branch' not actually read and verify the commits. 
> It doesn't strictly _need_ to, unless you use 'git branch -v' or 
> something. That would speed it up further, but the verification is nice, 
> and as long as performance isn't _horrible_ I think we're better off doing 
> it.

Right, but I would definitely like having some option like --dont-check to 
'git branch', and I think I would use it as default (unless experience
tells that errors happen often).

> After all, you'll see the problem only once.

True, but paradoxically that is also the reason why I notice it and
makes it feel bad.

Everytime I did the first 'git branch' those 5 seconds really hurt, because
I wondered why it couldn't be done in 0s like subsequent commands.

But sure, this was definitely not a pressing issue and your patch made
it even less. I am happy that it takes 1s now, and I really appreciated
your patch! 

Thanks,
Carlos

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  6:04               ` Junio C Hamano
@ 2009-07-23 17:19                 ` Linus Torvalds
  0 siblings, 0 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23 17:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carlos R. Mafra, Git Mailing List



On Wed, 22 Jul 2009, Junio C Hamano wrote:
>
> Subject: do_one_ref(): null_sha1 check is not about broken ref

Ack. If we want to make it conditional at some point, we'd want to use a 
different flag. 

I do wonder if we should simply remove the code entirely?

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23  3:21           ` Linus Torvalds
@ 2009-07-23 17:47             ` Tony Finch
  2009-07-23 18:57               ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Tony Finch @ 2009-07-23 17:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed, 22 Jul 2009, Linus Torvalds wrote:
>
> I suspect the old newton-raphson patches we had (Discussions and patches
> back in April 2007 on this list) could be resurrected pretty easily.

That sounds interesting, but I can't find the thread you are referring to.
Do you have a URL or a subject I can feed to Google?

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 17:47             ` Tony Finch
@ 2009-07-23 18:57               ` Linus Torvalds
  2009-07-23 22:48                 ` Newton-Raphson, was " Tony Finch
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23 18:57 UTC (permalink / raw)
  To: Tony Finch; +Cc: git



On Thu, 23 Jul 2009, Tony Finch wrote:

> On Wed, 22 Jul 2009, Linus Torvalds wrote:
> >
> > I suspect the old newton-raphson patches we had (Discussions and patches
> > back in April 2007 on this list) could be resurrected pretty easily.
> 
> That sounds interesting, but I can't find the thread you are referring to.
> Do you have a URL or a subject I can feed to Google?

Some googling found this:

	http://marc.info/?l=git&m=117537594112450&w=2

but what got merged (half a year later) was a much fancier thing by Junio. 
See sha1-lookup.c.

That original "single iteration of newton-raphson" patch was buggy, but 
it's perhaps interesting as a concept patch.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 16:48         ` Anders Kaseorg
@ 2009-07-23 19:03           ` Carlos R. Mafra
  0 siblings, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23 19:03 UTC (permalink / raw)
  To: Anders Kaseorg; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

On Thu 23.Jul'09 at 12:48:20 -0400, Anders Kaseorg wrote:
> 
> I submitted essentially the same patch in May:
>   http://article.gmane.org/gmane.comp.version-control.git/120097
> with the additional optimization that we don't need to lookup commits at
> all unless we're using -v, --merged, --no-merged, or --contains.  In my 
> tests, it makes `git branch` 5 times faster on an uncached linux-2.6 
> repository.

I also tested your patch even if you said that it was "essentially the same". 

But after repeating the tests 6 times for both your and Linus' patch
(taking care to let the system rest a bit after clearing the cache), your
patch is faster,

0.62 +/- 0.24 (Anders)
1.35 +/- 0.23 (Linus)

And this is the raw data for your patch,

0.00user 0.01system 0:00.54elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (7major+727minor)pagefaults 0swaps

0.00user 0.00system 0:00.18elapsed 5%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+733minor)pagefaults 0swaps

0.00user 0.00system 0:00.66elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (9major+723minor)pagefaults 0swaps

0.00user 0.01system 0:00.74elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (14major+720minor)pagefaults 0swaps

0.00user 0.00system 0:00.80elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (16major+718minor)pagefaults 0swaps

0.00user 0.00system 0:00.83elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (16major+718minor)pagefaults 0swaps


and for Linus'

0.00user 0.01system 0:01.56elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (43major+755minor)pagefaults 0swaps

0.00user 0.01system 0:01.09elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (24major+775minor)pagefaults 0swaps

0.00user 0.01system 0:01.33elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (32major+767minor)pagefaults 0swaps

0.00user 0.00system 0:01.53elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (39major+760minor)pagefaults 0swaps

0.00user 0.01system 0:01.06elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (24major+775minor)pagefaults 0swaps

0.00user 0.00system 0:01.54elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (39major+760minor)pagefaults 0swaps

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 16:53               ` Carlos R. Mafra
@ 2009-07-23 19:05                 ` Linus Torvalds
  2009-07-23 19:13                   ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23 19:05 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
>
> Everytime I did the first 'git branch' those 5 seconds really hurt, because
> I wondered why it couldn't be done in 0s like subsequent commands.
> 
> But sure, this was definitely not a pressing issue and your patch made
> it even less. I am happy that it takes 1s now, and I really appreciated
> your patch! 

You could try something like this (on _top_ of the previous patch). 

Not very exhaustively tested, but it's pretty simple.

It will still do _some_ object lookups. In particular, it will do the HEAD 
lookup in 'print_ref_list()', even if it's not strictly necessary. But it 
should cut down the noise further.

		Linus

---
 builtin-branch.c |   24 ++++++++++++++----------
 1 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/builtin-branch.c b/builtin-branch.c
index 54a89ff..82c2cf0 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -191,7 +191,7 @@ struct ref_item {
 
 struct ref_list {
 	struct rev_info revs;
-	int index, alloc, maxwidth;
+	int index, alloc, maxwidth, verbose;
 	struct ref_item *list;
 	struct commit_list *with_commit;
 	int kinds;
@@ -244,17 +244,20 @@ static int append_ref(const char *refname, const unsigned char *sha1, int flags,
 	if ((kind & ref_list->kinds) == 0)
 		return 0;
 
-	commit = lookup_commit_reference_gently(sha1, 1);
-	if (!commit)
-		return error("branch '%s' does not point at a commit", refname);
+	commit = NULL;
+	if (ref_list->verbose || ref_list->with_commit || merge_filter != NO_FILTER) {
+		commit = lookup_commit_reference_gently(sha1, 1);
+		if (!commit)
+			return error("branch '%s' does not point at a commit", refname);
 
-	/* Filter with with_commit if specified */
-	if (!is_descendant_of(commit, ref_list->with_commit))
-		return 0;
+		/* Filter with with_commit if specified */
+		if (!is_descendant_of(commit, ref_list->with_commit))
+			return 0;
 
-	if (merge_filter != NO_FILTER)
-		add_pending_object(&ref_list->revs,
-				   (struct object *)commit, refname);
+		if (merge_filter != NO_FILTER)
+			add_pending_object(&ref_list->revs,
+					   (struct object *)commit, refname);
+	}
 
 	/* Resize buffer */
 	if (ref_list->index >= ref_list->alloc) {
@@ -423,6 +426,7 @@ static void print_ref_list(int kinds, int detached, int verbose, int abbrev, str
 
 	memset(&ref_list, 0, sizeof(ref_list));
 	ref_list.kinds = kinds;
+	ref_list.verbose = verbose;
 	ref_list.with_commit = with_commit;
 	if (merge_filter != NO_FILTER)
 		init_revisions(&ref_list.revs, NULL);

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 19:05                 ` Linus Torvalds
@ 2009-07-23 19:13                   ` Linus Torvalds
  2009-07-23 19:55                     ` Carlos R. Mafra
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-23 19:13 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Thu, 23 Jul 2009, Linus Torvalds wrote:
> 
> You could try something like this (on _top_ of the previous patch). 
> 
> Not very exhaustively tested, but it's pretty simple.
> 
> It will still do _some_ object lookups. In particular, it will do the HEAD 
> lookup in 'print_ref_list()', even if it's not strictly necessary. But it 
> should cut down the noise further.

And this (on top of them all) will basically avoid even that one.

In fact, I think this is a cleanup. I think I'll resubmit the whole series 
with proper commit messages etc.

		Linus

---
 builtin-branch.c |   38 +++++++++++++++++++++++---------------
 1 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/builtin-branch.c b/builtin-branch.c
index 82c2cf0..1a03d5f 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -191,7 +191,7 @@ struct ref_item {
 
 struct ref_list {
 	struct rev_info revs;
-	int index, alloc, maxwidth, verbose;
+	int index, alloc, maxwidth, verbose, abbrev;
 	struct ref_item *list;
 	struct commit_list *with_commit;
 	int kinds;
@@ -418,15 +418,34 @@ static int calc_maxwidth(struct ref_list *refs)
 	return w;
 }
 
+
+static void show_detached(struct ref_list *ref_list)
+{
+	struct commit *head_commit = lookup_commit_reference_gently(head_sha1, 1);
+
+	if (head_commit && is_descendant_of(head_commit, ref_list->with_commit)) {
+		struct ref_item item;
+		item.name = xstrdup("(no branch)");
+		item.len = strlen(item.name);
+		item.kind = REF_LOCAL_BRANCH;
+		item.dest = NULL;
+		item.commit = head_commit;
+		if (item.len > ref_list->maxwidth)
+			ref_list->maxwidth = item.len;
+		print_ref_item(&item, ref_list->maxwidth, ref_list->verbose, ref_list->abbrev, 1, "");
+		free(item.name);
+	}
+}
+
 static void print_ref_list(int kinds, int detached, int verbose, int abbrev, struct commit_list *with_commit)
 {
 	int i;
 	struct ref_list ref_list;
-	struct commit *head_commit = lookup_commit_reference_gently(head_sha1, 1);
 
 	memset(&ref_list, 0, sizeof(ref_list));
 	ref_list.kinds = kinds;
 	ref_list.verbose = verbose;
+	ref_list.abbrev = abbrev;
 	ref_list.with_commit = with_commit;
 	if (merge_filter != NO_FILTER)
 		init_revisions(&ref_list.revs, NULL);
@@ -446,19 +465,8 @@ static void print_ref_list(int kinds, int detached, int verbose, int abbrev, str
 	qsort(ref_list.list, ref_list.index, sizeof(struct ref_item), ref_cmp);
 
 	detached = (detached && (kinds & REF_LOCAL_BRANCH));
-	if (detached && head_commit &&
-	    is_descendant_of(head_commit, with_commit)) {
-		struct ref_item item;
-		item.name = xstrdup("(no branch)");
-		item.len = strlen(item.name);
-		item.kind = REF_LOCAL_BRANCH;
-		item.dest = NULL;
-		item.commit = head_commit;
-		if (item.len > ref_list.maxwidth)
-			ref_list.maxwidth = item.len;
-		print_ref_item(&item, ref_list.maxwidth, verbose, abbrev, 1, "");
-		free(item.name);
-	}
+	if (detached)
+		show_detached(&ref_list);
 
 	for (i = 0; i < ref_list.index; i++) {
 		int current = !detached &&

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 19:13                   ` Linus Torvalds
@ 2009-07-23 19:55                     ` Carlos R. Mafra
  2009-07-24 20:36                       ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-23 19:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Thu 23.Jul'09 at 12:13:41 -0700, Linus Torvalds wrote:
> > It will still do _some_ object lookups. In particular, it will do the HEAD 
> > lookup in 'print_ref_list()', even if it's not strictly necessary. But it 
> > should cut down the noise further.
> 
> And this (on top of them all) will basically avoid even that one.

Ok, I applied (both) on top of the first one.

After 7 tests I got these, 

time:

      0.61 +/- 0.08

GIT_DEBUG_LOOKUP=1 git branch |wc -l
    
      9
      
which are in fact only the branches list.

Compared to yesterday, that is a huge improvement (0.6s vs 5.7s)
and (9 vs 2200+). At least for me 0.6s is "instantaneous", so
the issue is really gone.

Thanks a lot to everyone!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Newton-Raphson, was Re: Performance issue of 'git branch'
  2009-07-23 18:57               ` Linus Torvalds
@ 2009-07-23 22:48                 ` Tony Finch
  2009-07-23 23:24                   ` Johannes Schindelin
  0 siblings, 1 reply; 74+ messages in thread
From: Tony Finch @ 2009-07-23 22:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1724 bytes --]

On Thu, 23 Jul 2009, Linus Torvalds wrote:
>
> Some googling found this:
> 	http://marc.info/?l=git&m=117537594112450&w=2
> but what got merged (half a year later) was a much fancier thing by Junio.
> See sha1-lookup.c.

Thanks. Edésio Costa e Silva also gave me a useful pointer.

> That original "single iteration of newton-raphson" patch was buggy, but
> it's perhaps interesting as a concept patch.

I think Newton-Raphson is a brilliant but misleading idea. (As Junio said,
"egg of Columbus" - it certainly blew my mind!) However, Newton's method
works with smooth curves, but a pack index is a straight line plus
stochastic deviations. If you try to apply Newton's method then the more
you zoom in the more the random variations will send you away from the
place you want to be. So I think your first N-R patch was closer to being
right than its successors.

What you should do is ONE linear interpolation on the entire index. (i.e.
If you have N objects in the pack and you want to find one with SHA-1 id
S, take the top four bytes of S and multiply by N/2^32.) Note that if you
do a level-1 256-way fan-out lookup first then the random variations will
make you LESS likely to land near the right place.

After doing the first-order linear interpolation, it's probably sensible
to do a page-wise linear search (in case you don't land directly on
the page containing the target SHA-1) then a binary search within the
final page for efficiency with a hot cache.

This should give you O(1) seeks in the index per object lookup.

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Newton-Raphson, was Re: Performance issue of 'git branch'
  2009-07-23 22:48                 ` Newton-Raphson, was " Tony Finch
@ 2009-07-23 23:24                   ` Johannes Schindelin
  2009-07-23 23:50                     ` Tony Finch
  0 siblings, 1 reply; 74+ messages in thread
From: Johannes Schindelin @ 2009-07-23 23:24 UTC (permalink / raw)
  To: Tony Finch; +Cc: Linus Torvalds, git

Hi,

On Thu, 23 Jul 2009, Tony Finch wrote:

> I think Newton-Raphson is a brilliant but misleading idea. (As Junio 
> said, "egg of Columbus" - it certainly blew my mind!) However, Newton's 
> method works with smooth curves, but a pack index is a straight line 
> plus stochastic deviations. If you try to apply Newton's method then the 
> more you zoom in the more the random variations will send you away from 
> the place you want to be.

No.

Think about it, absent any further information than "it is a hash, i.e. 
distributed pretty equally in _any_ byte", even subsets of a sorted list 
will me more or less linear.  And assuming that they are linear is _still_ 
your best bet.

Assuming that subsets of said sorted list will _still_ minimize the 
average number of steps to take until you find the correct entry.

Unless you have more information about the nature of the hashes, of 
course.

> This should give you O(1) seeks in the index per object lookup.

There is no way to achieve that, best thing you can hope for is _expected_ 
O(1) (e.g. with a hashmap, with exponential worst case).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Newton-Raphson, was Re: Performance issue of 'git branch'
  2009-07-23 23:24                   ` Johannes Schindelin
@ 2009-07-23 23:50                     ` Tony Finch
  2009-07-24  0:43                       ` Johannes Schindelin
  0 siblings, 1 reply; 74+ messages in thread
From: Tony Finch @ 2009-07-23 23:50 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, git

On Fri, 24 Jul 2009, Johannes Schindelin wrote:
>
> Think about it, absent any further information than "it is a hash, i.e.
> distributed pretty equally in _any_ byte", even subsets of a sorted list
> will me more or less linear.  And assuming that they are linear is _still_
> your best bet.

The even distribution of the lower-order bytes is irrelevant. We're
looking at the top 20-ish bits for a pack with a million-ish objects. The
more you zoom in the less linear a sorted list of hashes will be, so
assuming linearity at all scales is wrong. It's a bit like fractal
mountains.

> There is no way to achieve [O(1) seeks], best thing you can hope for is
> _expected_ O(1) (e.g. with a hashmap, with exponential worst case).

Of course it's expected. However the worst case is nowhere near
exponential: it's linear because the second-order search is a linear
pagewise scan. But I think in practice, the larger the pack the more that
the randomization of the hash function will smooth out performance
oddities. (Sorry, I don't know enough statistics to be able to say what
the expected error of the linear interpolation is, though I expect it's a
fairly simple formula.) For small packs the number of seeks is 1 anyway.

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Newton-Raphson, was Re: Performance issue of 'git branch'
  2009-07-23 23:50                     ` Tony Finch
@ 2009-07-24  0:43                       ` Johannes Schindelin
  0 siblings, 0 replies; 74+ messages in thread
From: Johannes Schindelin @ 2009-07-24  0:43 UTC (permalink / raw)
  To: Tony Finch; +Cc: Linus Torvalds, git

Hi,

On Fri, 24 Jul 2009, Tony Finch wrote:

> On Fri, 24 Jul 2009, Johannes Schindelin wrote:
> >
> > Think about it, absent any further information than "it is a hash, i.e.
> > distributed pretty equally in _any_ byte", even subsets of a sorted list
> > will me more or less linear.  And assuming that they are linear is _still_
> > your best bet.
> 
> The even distribution of the lower-order bytes is irrelevant.

I was not talking about lower-order bytes.  All bytes are pretty much 
evenly distributed.  That's why SHA-1 is a good hash.

> We're looking at the top 20-ish bits for a pack with a million-ish 
> objects. The more you zoom in the less linear a sorted list of hashes 
> will be, so assuming linearity at all scales is wrong. It's a bit like 
> fractal mountains.

If you really find irregularities like that, then SHA-1 is really a lousy 
hash.  Irregularities like this are typically exploitable.

If you know of such an irregularity, you might want to write a paper that 
SHA-1 is broken and get famous.

> > There is no way to achieve [O(1) seeks], best thing you can hope for 
> > is _expected_ O(1) (e.g. with a hashmap, with exponential worst case).
> 
> Of course it's expected. However the worst case is nowhere near
> exponential: it's linear because the second-order search is a linear
> pagewise scan. But I think in practice, the larger the pack the more that
> the randomization of the hash function will smooth out performance
> oddities. (Sorry, I don't know enough statistics to be able to say what
> the expected error of the linear interpolation is, though I expect it's a
> fairly simple formula.) For small packs the number of seeks is 1 anyway.

I will believe it when I see it.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-23 19:55                     ` Carlos R. Mafra
@ 2009-07-24 20:36                       ` Linus Torvalds
  2009-07-24 20:47                         ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-24 20:36 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Thu, 23 Jul 2009, Carlos R. Mafra wrote:
>
> After 7 tests I got these, 
> 
> time:
> 
>       0.61 +/- 0.08

Btw, I think 0.61s is still too much. Can you send me the output of 
'strace -Ttt' on your machine?

It's entirely possible that it's all the actual binary (and shared 
library) loading, of course. You do have a slow harddisk. But it takes 
0.035s for me, and I'm wondering if there is something else than just CPU 
speed and IO speed accounting for the 20x performance difference.

(That said, maybe 20x is right - my SSD latency almost certainly is 20x 
better).

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 20:36                       ` Linus Torvalds
@ 2009-07-24 20:47                         ` Linus Torvalds
  2009-07-24 21:21                           ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-24 20:47 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Fri, 24 Jul 2009, Linus Torvalds wrote:
> 
> Btw, I think 0.61s is still too much. Can you send me the output of 
> 'strace -Ttt' on your machine?

Never mind. I'm seeing even worse behavior on a laptop I just dug up 
(another 4200 rpm harddisk).

I'll dig some more.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 20:47                         ` Linus Torvalds
@ 2009-07-24 21:21                           ` Linus Torvalds
  2009-07-24 22:13                             ` Linus Torvalds
                                               ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-24 21:21 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Fri, 24 Jul 2009, Linus Torvalds wrote:
> 
> Never mind. I'm seeing even worse behavior on a laptop I just dug up 
> (another 4200 rpm harddisk).
> 
> I'll dig some more.

Yeah, it seems to be the loading overhead. I'm seeing a 'time git branch' 
take 1.2s in the cold-cache case, in a directory that isn't even a git 
directory.

And 80% of it comes before we even get to 'main()'. Shared library 
loading, SELinux crud etc. A lot of it seems to be 'libfreebl3' and 
'libselinux', which is some crazy sh*t.

It seems to be all from 'curl' support.

That seems _really_ sad. Lookie here:

   [torvalds@nehalem git]$ ldd git
	linux-vdso.so.1 =>  (0x00007fff61da7000)
	libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x00007f2f1a498000)
	libz.so.1 => /lib64/libz.so.1 (0x0000003cdb800000)
	libcrypto.so.8 => /usr/lib64/libcrypto.so.8 (0x0000003ba7a00000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cdb400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003cda800000)
	libidn.so.11 => /lib64/libidn.so.11 (0x0000003ceaa00000)
	libssh2.so.1 => /usr/lib64/libssh2.so.1 (0x0000003ba8e00000)
	libldap-2.4.so.2 => /usr/lib64/libldap-2.4.so.2 (0x00007f2f1a250000)
	librt.so.1 => /lib64/librt.so.1 (0x0000003cdbc00000)
	libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003ce6e00000)
	libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003ce7e00000)
	libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003ce7200000)
	libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003ce6a00000)
	libssl3.so => /lib64/libssl3.so (0x0000003490200000)
	libsmime3.so => /lib64/libsmime3.so (0x000000348fe00000)
	libnss3.so => /lib64/libnss3.so (0x000000348f600000)
	libplds4.so => /lib64/libplds4.so (0x0000003cbc800000)
	libplc4.so => /lib64/libplc4.so (0x0000003cbdc00000)
	libnspr4.so => /lib64/libnspr4.so (0x0000003cbd800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003cdb000000)
	/lib64/ld-linux-x86-64.so.2 (0x0000003cda400000)
	libssl.so.8 => /usr/lib64/libssl.so.8 (0x0000003ba7e00000)
	liblber-2.4.so.2 => /usr/lib64/liblber-2.4.so.2 (0x0000003ceee00000)
	libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003ce5600000)
	libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x00007f2f1a030000)
	libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003ce7a00000)
	libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003ce7600000)
	libnssutil3.so => /lib64/libnssutil3.so (0x000000348fa00000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f2f19df8000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003cdc400000)
	libfreebl3.so => /lib64/libfreebl3.so (0x00007f2f19b99000)
   [torvalds@nehalem git]$ make -j16 NO_CURL=1
   [torvalds@nehalem git]$ ldd git
	linux-vdso.so.1 =>  (0x00007fff2f960000)
	libz.so.1 => /lib64/libz.so.1 (0x0000003cdb800000)
	libcrypto.so.8 => /usr/lib64/libcrypto.so.8 (0x0000003ba7a00000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cdb400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003cda800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003cdb000000)
	/lib64/ld-linux-x86-64.so.2 (0x0000003cda400000)

What a huge difference!

And the NO_CURL version really does load a lot faster in cold-cache. We're 
not talking small differences:

 - compiled with NO_CURL, five runs of "echo 3 > /proc/sys/vm/drop_caches" 
   followed by "time git branch":

	real	0m0.654s
	real	0m0.562s
	real	0m0.519s
	real	0m0.534s
	real	0m0.734s

   Total number of system calls: 194

 - compiled with curl, same thing:

	real	0m1.503s
	real	0m1.455s
	real	0m1.267s
	real	0m1.819s
	real	0m0.985s

   Total number of system calls: 407!

ie we're talking a _huge_ hit in startup times for that curl support. 
That's really really sad - especially considering how all the curl support 
is for very random occasional stuff. I never use it myself, for example, 
since I don't use http at all. And even for people who do, they only need 
it for non-local operations.

I wonder if there is some way to only load the crazy curl stuff when we 
actually want open a http: connection.

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 21:21                           ` Linus Torvalds
@ 2009-07-24 22:13                             ` Linus Torvalds
  2009-07-24 22:18                               ` david
  2009-08-07  4:21                               ` Jeff King
  2009-07-24 22:54                             ` Theodore Tso
  2009-07-24 23:46                             ` Carlos R. Mafra
  2 siblings, 2 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-24 22:13 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
  Cc: Carlos R. Mafra, Daniel Barkalow, Johannes Schindelin


On Fri, 24 Jul 2009, Linus Torvalds wrote:
> 
> ie we're talking a _huge_ hit in startup times for that curl support. 
> That's really really sad - especially considering how all the curl support 
> is for very random occasional stuff. I never use it myself, for example, 
> since I don't use http at all. And even for people who do, they only need 
> it for non-local operations.
> 
> I wonder if there is some way to only load the crazy curl stuff when we 
> actually want open a http: connection.

Here's the simple step#1: make 'git-http-fetch' be an external program 
rather than a built-in.

Sadly, I have no idea hot to turn the transport.c code into an external 
walker sanely (turn the ref/object walkers into an exec of an external 
program). So we still end up linking with curl. But maybe somebody 
(Daniel? Dscho?) who knows the transport code could try to make it an 
external process?

The performance angle of http fetching is non-existent, we really should 
try very hard to make the curl-dependent parts be in a binary of their 
own.

		Linus

---
>From 3cfc50d497266dc73a414ed1460b36b712ad10de Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri, 24 Jul 2009 14:54:55 -0700
Subject: [PATCH] git-http-fetch: not a builtin

We should really try to avoid having a dependency on the curl libraries
for the core 'git' executable. It adds huge overheads, for no advantage.

This splits up git-http-fetch so that it isn't built-in.  We still do
end up linking with curl for the git binary due to the transport.c http
walker, but that's at least partially an independent issue.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 Makefile                             |    8 +++++++-
 git.c                                |    3 ---
 builtin-http-fetch.c => http-fetch.c |    5 ++++-
 3 files changed, 11 insertions(+), 5 deletions(-)
 rename builtin-http-fetch.c => http-fetch.c (95%)

diff --git a/Makefile b/Makefile
index bde27ed..8cbd863 100644
--- a/Makefile
+++ b/Makefile
@@ -978,9 +978,12 @@ else
 	else
 		CURL_LIBCURL = -lcurl
 	endif
-	BUILTIN_OBJS += builtin-http-fetch.o
+	PROGRAMS += git-http-fetch$X
+
+	# FIXME! Sadly 'transport.c' still needs these for the builtin case
 	EXTLIBS += $(CURL_LIBCURL)
 	LIB_OBJS += http.o http-walker.o
+
 	curl_check := $(shell (echo 070908; curl-config --vernum) | sort -r | sed -ne 2p)
 	ifeq "$(curl_check)" "070908"
 		ifndef NO_EXPAT
@@ -1485,6 +1488,9 @@ git-imap-send$X: imap-send.o $(GITLIBS)
 
 http.o http-walker.o http-push.o transport.o: http.h
 
+git-http-fetch$X: revision.o http.o http-push.o $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
 git-http-push$X: revision.o http.o http-push.o $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
diff --git a/git.c b/git.c
index 807d875..c1e8f05 100644
--- a/git.c
+++ b/git.c
@@ -309,9 +309,6 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "get-tar-commit-id", cmd_get_tar_commit_id },
 		{ "grep", cmd_grep, RUN_SETUP | USE_PAGER },
 		{ "help", cmd_help },
-#ifndef NO_CURL
-		{ "http-fetch", cmd_http_fetch, RUN_SETUP },
-#endif
 		{ "init", cmd_init_db },
 		{ "init-db", cmd_init_db },
 		{ "log", cmd_log, RUN_SETUP | USE_PAGER },
diff --git a/builtin-http-fetch.c b/http-fetch.c
similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c
index f3e63d7..e8f44ba 100644
--- a/builtin-http-fetch.c
+++ b/http-fetch.c
@@ -1,8 +1,9 @@
 #include "cache.h"
 #include "walker.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+	const char *prefix;
 	struct walker *walker;
 	int commits_on_stdin = 0;
 	int commits;
@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, const char *prefix)
 	int get_verbosely = 0;
 	int get_recover = 0;
 
+	prefix = setup_git_directory();
+
 	git_config(git_default_config, NULL);
 
 	while (arg < argc && argv[arg][0] == '-') {
-- 
1.6.4.rc1.5.gb84f

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:13                             ` Linus Torvalds
@ 2009-07-24 22:18                               ` david
  2009-07-24 22:42                                 ` Linus Torvalds
  2009-08-07  4:21                               ` Jeff King
  1 sibling, 1 reply; 74+ messages in thread
From: david @ 2009-07-24 22:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Git Mailing List, Carlos R. Mafra,
	Daniel Barkalow, Johannes Schindelin

On Fri, 24 Jul 2009, Linus Torvalds wrote:

> On Fri, 24 Jul 2009, Linus Torvalds wrote:
>>
>> ie we're talking a _huge_ hit in startup times for that curl support.
>> That's really really sad - especially considering how all the curl support
>> is for very random occasional stuff. I never use it myself, for example,
>> since I don't use http at all. And even for people who do, they only need
>> it for non-local operations.
>>
>> I wonder if there is some way to only load the crazy curl stuff when we
>> actually want open a http: connection.
>
> Here's the simple step#1: make 'git-http-fetch' be an external program
> rather than a built-in.
>
> Sadly, I have no idea hot to turn the transport.c code into an external
> walker sanely (turn the ref/object walkers into an exec of an external
> program). So we still end up linking with curl. But maybe somebody
> (Daniel? Dscho?) who knows the transport code could try to make it an
> external process?
>
> The performance angle of http fetching is non-existent, we really should
> try very hard to make the curl-dependent parts be in a binary of their
> own.

what does the performance look like if you just do a static compile 
instead?

David Lang

> 		Linus
>
> ---
>> From 3cfc50d497266dc73a414ed1460b36b712ad10de Mon Sep 17 00:00:00 2001
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Fri, 24 Jul 2009 14:54:55 -0700
> Subject: [PATCH] git-http-fetch: not a builtin
>
> We should really try to avoid having a dependency on the curl libraries
> for the core 'git' executable. It adds huge overheads, for no advantage.
>
> This splits up git-http-fetch so that it isn't built-in.  We still do
> end up linking with curl for the git binary due to the transport.c http
> walker, but that's at least partially an independent issue.
>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> ---
> Makefile                             |    8 +++++++-
> git.c                                |    3 ---
> builtin-http-fetch.c => http-fetch.c |    5 ++++-
> 3 files changed, 11 insertions(+), 5 deletions(-)
> rename builtin-http-fetch.c => http-fetch.c (95%)
>
> diff --git a/Makefile b/Makefile
> index bde27ed..8cbd863 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -978,9 +978,12 @@ else
> 	else
> 		CURL_LIBCURL = -lcurl
> 	endif
> -	BUILTIN_OBJS += builtin-http-fetch.o
> +	PROGRAMS += git-http-fetch$X
> +
> +	# FIXME! Sadly 'transport.c' still needs these for the builtin case
> 	EXTLIBS += $(CURL_LIBCURL)
> 	LIB_OBJS += http.o http-walker.o
> +
> 	curl_check := $(shell (echo 070908; curl-config --vernum) | sort -r | sed -ne 2p)
> 	ifeq "$(curl_check)" "070908"
> 		ifndef NO_EXPAT
> @@ -1485,6 +1488,9 @@ git-imap-send$X: imap-send.o $(GITLIBS)
>
> http.o http-walker.o http-push.o transport.o: http.h
>
> +git-http-fetch$X: revision.o http.o http-push.o $(GITLIBS)
> +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> +		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> git-http-push$X: revision.o http.o http-push.o $(GITLIBS)
> 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> 		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> diff --git a/git.c b/git.c
> index 807d875..c1e8f05 100644
> --- a/git.c
> +++ b/git.c
> @@ -309,9 +309,6 @@ static void handle_internal_command(int argc, const char **argv)
> 		{ "get-tar-commit-id", cmd_get_tar_commit_id },
> 		{ "grep", cmd_grep, RUN_SETUP | USE_PAGER },
> 		{ "help", cmd_help },
> -#ifndef NO_CURL
> -		{ "http-fetch", cmd_http_fetch, RUN_SETUP },
> -#endif
> 		{ "init", cmd_init_db },
> 		{ "init-db", cmd_init_db },
> 		{ "log", cmd_log, RUN_SETUP | USE_PAGER },
> diff --git a/builtin-http-fetch.c b/http-fetch.c
> similarity index 95%
> rename from builtin-http-fetch.c
> rename to http-fetch.c
> index f3e63d7..e8f44ba 100644
> --- a/builtin-http-fetch.c
> +++ b/http-fetch.c
> @@ -1,8 +1,9 @@
> #include "cache.h"
> #include "walker.h"
>
> -int cmd_http_fetch(int argc, const char **argv, const char *prefix)
> +int main(int argc, const char **argv)
> {
> +	const char *prefix;
> 	struct walker *walker;
> 	int commits_on_stdin = 0;
> 	int commits;
> @@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, const char *prefix)
> 	int get_verbosely = 0;
> 	int get_recover = 0;
>
> +	prefix = setup_git_directory();
> +
> 	git_config(git_default_config, NULL);
>
> 	while (arg < argc && argv[arg][0] == '-') {
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:18                               ` david
@ 2009-07-24 22:42                                 ` Linus Torvalds
  2009-07-24 22:46                                   ` david
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-24 22:42 UTC (permalink / raw)
  To: david
  Cc: Junio C Hamano, Git Mailing List, Carlos R. Mafra,
	Daniel Barkalow, Johannes Schindelin



On Fri, 24 Jul 2009, david@lang.hm wrote:
> 
> what does the performance look like if you just do a static compile instead?

I don't even know - I don't have a static version of curl. I could install 
one, of course, but since I don't think that's the solution anyway, I'm 
not going to bother.

The real solution really is to not have curl support in the main binary.

One option might be to make _all_ the transport code be outside of the 
core binary, or course.  That's a fairly simple but somewhat sad solution 
(ie make all of push/pull/fetch/clone/ls-remote/etc be external binaries)

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:42                                 ` Linus Torvalds
@ 2009-07-24 22:46                                   ` david
  2009-07-25  2:39                                     ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: david @ 2009-07-24 22:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Git Mailing List, Carlos R. Mafra,
	Daniel Barkalow, Johannes Schindelin

On Fri, 24 Jul 2009, Linus Torvalds wrote:

> On Fri, 24 Jul 2009, david@lang.hm wrote:
>>
>> what does the performance look like if you just do a static compile instead?
>
> I don't even know - I don't have a static version of curl. I could install
> one, of course, but since I don't think that's the solution anyway, I'm
> not going to bother.

I wasn't thinking a static version of curl, I was thinking a static 
version of the git binaries. see how fast things could be if no startup 
linking was nessasary.

David Lang

> The real solution really is to not have curl support in the main binary.
>
> One option might be to make _all_ the transport code be outside of the
> core binary, or course.  That's a fairly simple but somewhat sad solution
> (ie make all of push/pull/fetch/clone/ls-remote/etc be external binaries)
>
> 		Linus
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 21:21                           ` Linus Torvalds
  2009-07-24 22:13                             ` Linus Torvalds
@ 2009-07-24 22:54                             ` Theodore Tso
  2009-07-24 22:59                               ` Shawn O. Pearce
  2009-07-24 23:46                             ` Carlos R. Mafra
  2 siblings, 1 reply; 74+ messages in thread
From: Theodore Tso @ 2009-07-24 22:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Carlos R. Mafra, Junio C Hamano, Git Mailing List

On Fri, Jul 24, 2009 at 02:21:20PM -0700, Linus Torvalds wrote:
> 
> I wonder if there is some way to only load the crazy curl stuff when we 
> actually want open a http: connection.

Well, we could use dlopen(), but I'm not sure that qualifies as a
_sane_ solution --- especially given that there are approximately 15
interfaces used by git, that we'd have to resolve using dlsym().

	   	   	     	       - Ted

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:54                             ` Theodore Tso
@ 2009-07-24 22:59                               ` Shawn O. Pearce
  2009-07-24 23:28                                 ` Junio C Hamano
  2009-07-26 17:07                                 ` Avi Kivity
  0 siblings, 2 replies; 74+ messages in thread
From: Shawn O. Pearce @ 2009-07-24 22:59 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linus Torvalds, Carlos R. Mafra, Junio C Hamano, Git Mailing List

Theodore Tso <tytso@mit.edu> wrote:
> On Fri, Jul 24, 2009 at 02:21:20PM -0700, Linus Torvalds wrote:
> > 
> > I wonder if there is some way to only load the crazy curl stuff when we 
> > actually want open a http: connection.
> 
> Well, we could use dlopen(), but I'm not sure that qualifies as a
> _sane_ solution --- especially given that there are approximately 15
> interfaces used by git, that we'd have to resolve using dlsym().

Yea, that's not sane.

Probably the better approach is to have git fetch and git push be a
different binary from main git, so we only pay the libcurl loading
overheads when we hit transport.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:59                               ` Shawn O. Pearce
@ 2009-07-24 23:28                                 ` Junio C Hamano
  2009-07-26 17:07                                 ` Avi Kivity
  1 sibling, 0 replies; 74+ messages in thread
From: Junio C Hamano @ 2009-07-24 23:28 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Theodore Tso, Linus Torvalds, Carlos R. Mafra, Junio C Hamano,
	Git Mailing List

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Theodore Tso <tytso@mit.edu> wrote:
>> On Fri, Jul 24, 2009 at 02:21:20PM -0700, Linus Torvalds wrote:
>> > 
>> > I wonder if there is some way to only load the crazy curl stuff when we 
>> > actually want open a http: connection.
>> 
>> Well, we could use dlopen(), but I'm not sure that qualifies as a
>> _sane_ solution --- especially given that there are approximately 15
>> interfaces used by git, that we'd have to resolve using dlsym().
>
> Yea, that's not sane.
>
> Probably the better approach is to have git fetch and git push be a
> different binary from main git, so we only pay the libcurl loading
> overheads when we hit transport.

Even though that still will hurt people who do not use http, I think it
would be a right approach (in the sense that it should not be too painful
and with a reasonable gain for local-only operations).

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 21:21                           ` Linus Torvalds
  2009-07-24 22:13                             ` Linus Torvalds
  2009-07-24 22:54                             ` Theodore Tso
@ 2009-07-24 23:46                             ` Carlos R. Mafra
  2009-07-25  0:41                               ` Carlos R. Mafra
  2 siblings, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-24 23:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Sorry for the delay and missing the "strace -ttT" request,
but today was a "Physics" day and took me longer to 
notice your email.

On Fri 24.Jul'09 at 14:21:20 -0700, Linus Torvalds wrote:
> 
> What a huge difference!
> 
> And the NO_CURL version really does load a lot faster in cold-cache. We're 
> not talking small differences:

With NO_CURL=1 the strace log contained 242 lines (vs 404), but
the time difference was not as great as you got. But it was
better:

0.55 +- 0.06 (for 8 runs)

So I repeated the tests with curl enabled and this time
I got:

0.77 +- 0.03 (for 6 runs)

(yesterday I got 0.61 +- 0.08, so there is lot of noise)

So it is better, but not by the same factor as you saw.
But I may have an explanation for this.

After I clear the cache I wait a few seconds to stabilize,
and I do the 'time git branch' test when I see that
there is no activity in the disk by looking at
the 'btrace' output in another xterm. 

I noticed that after dropping the cache and before
I do the test there is lot of activity of something
called 'preload', with lines which look like these:

8,0  0  42881   495.067655112 17777  Q   R 51244367 + 552 [preload]
8,0  0  42882   495.067659931 17777  G   R 51244367 + 552 [preload]
8,0  0  42883   495.067664401 17777  I   R 51244367 + 552 [preload]

I hadn't noticed this before and now I checked that,

"preload is an adaptive readahead daemon that prefetches files mapped by
applications from the disk to reduce application startup time."

So I guess that my tests here for your NO_CURL=1 idea is inconclusive,
as I am not sure what preload is prefetching.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 23:46                             ` Carlos R. Mafra
@ 2009-07-25  0:41                               ` Carlos R. Mafra
  2009-07-25 18:04                                 ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-25  0:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Sat 25.Jul'09 at  1:46:48 +0200, Carlos R. Mafra wrote:
> 
> So I guess that my tests here for your NO_CURL=1 idea is inconclusive,
> as I am not sure what preload is prefetching.

Ok, so I killed /usr/sbin/preload and did the tests again. The 
results were much more stable, with average 0.40 vs 0.79
(NO_CURL=1 being faster). The pagefaults were pretty stable too,
(40major+654minor vs 12major+401minor). 

I will use NO_CURL=1 from now on!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:46                                   ` david
@ 2009-07-25  2:39                                     ` Linus Torvalds
  2009-07-25  2:53                                       ` Daniel Barkalow
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-25  2:39 UTC (permalink / raw)
  To: david
  Cc: Junio C Hamano, Git Mailing List, Carlos R. Mafra,
	Daniel Barkalow, Johannes Schindelin



On Fri, 24 Jul 2009, david@lang.hm wrote:

> On Fri, 24 Jul 2009, Linus Torvalds wrote:
> 
> > On Fri, 24 Jul 2009, david@lang.hm wrote:
> > > 
> > > what does the performance look like if you just do a static compile
> > > instead?
> > 
> > I don't even know - I don't have a static version of curl. I could install
> > one, of course, but since I don't think that's the solution anyway, I'm
> > not going to bother.
> 
> I wasn't thinking a static version of curl, I was thinking a static version of
> the git binaries. see how fast things could be if no startup linking was
> nessasary.

Well, that's what I meant. If I add '-static' to the link flags, I get

	/usr/bin/ld: cannot find -lcurl
	collect2: ld returned 1 exit status

because I simply don't have a static library version of curl (and if I do 
NO_CURL, I fail the link due to not having a static version of zlib).

That's what I meant by "I could install a static version of curl" - I 
could install the debug libraries, but it just isn't a normal thing to do 
on any modern distribution. The right thing to do really would be to not 
have -lcurl for the main git binary at all.

Preferably done by having http walking handled by an external process (the 
way we already do rsync), but it's probably easier to just make all the 
clone/fetch/ls-remote things be a separate binary.

Of course, I'd personally solve the problem with NO_CURL=1, but that's 
probably not acceptable in general.

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25  2:39                                     ` Linus Torvalds
@ 2009-07-25  2:53                                       ` Daniel Barkalow
  0 siblings, 0 replies; 74+ messages in thread
From: Daniel Barkalow @ 2009-07-25  2:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: david, Junio C Hamano, Git Mailing List, Carlos R. Mafra,
	Johannes Schindelin

On Fri, 24 Jul 2009, Linus Torvalds wrote:

> On Fri, 24 Jul 2009, david@lang.hm wrote:
> 
> > On Fri, 24 Jul 2009, Linus Torvalds wrote:
> > 
> > > On Fri, 24 Jul 2009, david@lang.hm wrote:
> > > > 
> > > > what does the performance look like if you just do a static compile
> > > > instead?
> > > 
> > > I don't even know - I don't have a static version of curl. I could install
> > > one, of course, but since I don't think that's the solution anyway, I'm
> > > not going to bother.
> > 
> > I wasn't thinking a static version of curl, I was thinking a static version of
> > the git binaries. see how fast things could be if no startup linking was
> > nessasary.
> 
> Well, that's what I meant. If I add '-static' to the link flags, I get
> 
> 	/usr/bin/ld: cannot find -lcurl
> 	collect2: ld returned 1 exit status
> 
> because I simply don't have a static library version of curl (and if I do 
> NO_CURL, I fail the link due to not having a static version of zlib).
> 
> That's what I meant by "I could install a static version of curl" - I 
> could install the debug libraries, but it just isn't a normal thing to do 
> on any modern distribution. The right thing to do really would be to not 
> have -lcurl for the main git binary at all.
> 
> Preferably done by having http walking handled by an external process (the 
> way we already do rsync), but it's probably easier to just make all the 
> clone/fetch/ls-remote things be a separate binary.

I think it's actually easy enough to have a separate binary to handle the 
http walking, particularly since I've got code lying around to handle 
importing from a foreign VCS with a separate binary that I can just remove 
some of the features from.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25  0:41                               ` Carlos R. Mafra
@ 2009-07-25 18:04                                 ` Linus Torvalds
  2009-07-25 18:57                                   ` Timo Hirvonen
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-25 18:04 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: Junio C Hamano, Git Mailing List



On Sat, 25 Jul 2009, Carlos R. Mafra wrote:
> 
> Ok, so I killed /usr/sbin/preload and did the tests again. The 
> results were much more stable, with average 0.40 vs 0.79
> (NO_CURL=1 being faster). The pagefaults were pretty stable too,
> (40major+654minor vs 12major+401minor). 
> 
> I will use NO_CURL=1 from now on!

I actually find it interesting that this whole NO_CURL issue is actually a 
lot more noticeable for me in the hot-cache case than all the other 'git 
branch' issues were.

I went back to a version a few days ago (before all the optimizations), 
and on my machine with a hot cache I get (for my kernel repo - I don't 
use branches there, but I have an old 'akpm' branch for taking a emailed 
patch series from Andrew):

	[torvalds@nehalem linux]$ time ~/git/git branch
	  akpm
	* master

	real	0m0.005s
	user	0m0.004s
	sys	0m0.000s

so it's five milliseconds. Big deal, fast enough, right?

Ok, so fast-forward to today, with the optimizations to builtin-branch.c:

	[torvalds@nehalem linux]$ time ~/git/git branch
	  akpm
	* master

	real	0m0.004s
	user	0m0.000s
	sys	0m0.004s

Woot! I shaved a millisecond off it by avoiding all those page faults and 
object lookups. Good, but hey, all that unnecessary lookup was just a 25% 
cost.

So let's build it with NO_CURL:

	[torvalds@nehalem linux]$ time ~/git/git branch
	  akpm
	* master

	real	0m0.002s
	user	0m0.000s
	sys	0m0.000s

Heh. The whole NO_CURL=1 thing is actually a _bigger_ optimization than 
anything else I did to git-branch. Cost of curl: 100%.

The difference in number of system calls and page faults is really quite 
staggering. System calls: 397->184, page faults: 619->293. Just from not 
doing that curl loading. No wonder performance actually doubles.

Now, I admit that 5ms vs 2ms probably doesn't really matter much, but 
dang, performance was a primary goal in git, so I'm a bit upset at how bad 
curl screwed us. Plus those things do add up when scripting things, and 
those 300+ page faults are basically true for _all_ git programs.

So it's not just 'git branch': doing 'git show' shows the exact same 
thing: 6ms -> 4ms, 448->235 system calls, and 1549->1176 page faults.

So curl really must die. It may not matter for the expensive operations, 
but a lot of scripting is about running all those "cheap" things that just 
add up over time.

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 18:04                                 ` Linus Torvalds
@ 2009-07-25 18:57                                   ` Timo Hirvonen
  2009-07-25 19:06                                     ` Reece Dunn
                                                       ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: Timo Hirvonen @ 2009-07-25 18:57 UTC (permalink / raw)
  To: git; +Cc: Carlos R. Mafra, Junio C Hamano, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> So curl really must die. It may not matter for the expensive operations, 
> but a lot of scripting is about running all those "cheap" things that just 
> add up over time.

SELinux is the problem, not curl.

On my Arch Linux machine:

   $ ldd bin/git
	linux-vdso.so.1 =>  (0x00007fff42306000)
	libcurl.so.4 => /usr/lib/libcurl.so.4 (0x00007f8714532000)
	libz.so.1 => /usr/lib/libz.so.1 (0x00007f871431d000)
	libcrypto.so.0.9.8 => /usr/lib/libcrypto.so.0.9.8 (0x00007f8713f8f000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x00007f8713d74000)
	libc.so.6 => /lib/libc.so.6 (0x00007f8713a21000)
	librt.so.1 => /lib/librt.so.1 (0x00007f8713819000)
	libssl.so.0.9.8 => /usr/lib/libssl.so.0.9.8 (0x00007f87135ca000)
	libdl.so.2 => /lib/libdl.so.2 (0x00007f87133c6000)
	/lib/ld-linux-x86-64.so.2 (0x00007f8714778000)

Your:

   [torvalds@nehalem git]$ ldd git
	linux-vdso.so.1 =>  (0x00007fff61da7000)
	libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x00007f2f1a498000)
	libz.so.1 => /lib64/libz.so.1 (0x0000003cdb800000)
	libcrypto.so.8 => /usr/lib64/libcrypto.so.8 (0x0000003ba7a00000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cdb400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003cda800000)
	libidn.so.11 => /lib64/libidn.so.11 (0x0000003ceaa00000)
	libssh2.so.1 => /usr/lib64/libssh2.so.1 (0x0000003ba8e00000)
	libldap-2.4.so.2 => /usr/lib64/libldap-2.4.so.2 (0x00007f2f1a250000)
	librt.so.1 => /lib64/librt.so.1 (0x0000003cdbc00000)
	libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003ce6e00000)
	libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003ce7e00000)
	libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003ce7200000)
	libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003ce6a00000)
	libssl3.so => /lib64/libssl3.so (0x0000003490200000)
	libsmime3.so => /lib64/libsmime3.so (0x000000348fe00000)
	libnss3.so => /lib64/libnss3.so (0x000000348f600000)
	libplds4.so => /lib64/libplds4.so (0x0000003cbc800000)
	libplc4.so => /lib64/libplc4.so (0x0000003cbdc00000)
	libnspr4.so => /lib64/libnspr4.so (0x0000003cbd800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003cdb000000)
	/lib64/ld-linux-x86-64.so.2 (0x0000003cda400000)
	libssl.so.8 => /usr/lib64/libssl.so.8 (0x0000003ba7e00000)
	liblber-2.4.so.2 => /usr/lib64/liblber-2.4.so.2 (0x0000003ceee00000)
	libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003ce5600000)
	libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x00007f2f1a030000)
	libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003ce7a00000)
	libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003ce7600000)
	libnssutil3.so => /lib64/libnssutil3.so (0x000000348fa00000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f2f19df8000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003cdc400000)
	libfreebl3.so => /lib64/libfreebl3.so (0x00007f2f19b99000)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 18:57                                   ` Timo Hirvonen
@ 2009-07-25 19:06                                     ` Reece Dunn
  2009-07-25 20:31                                     ` Mike Hommey
  2009-07-25 21:04                                     ` Carlos R. Mafra
  2 siblings, 0 replies; 74+ messages in thread
From: Reece Dunn @ 2009-07-25 19:06 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git, Carlos R. Mafra, Junio C Hamano

2009/7/25 Timo Hirvonen <tihirvon@gmail.com>:
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> So curl really must die. It may not matter for the expensive operations,
>> but a lot of scripting is about running all those "cheap" things that just
>> add up over time.
>
> SELinux is the problem, not curl.
>
> On my Arch Linux machine:
>
>   $ ldd bin/git
>        linux-vdso.so.1 =>  (0x00007fff42306000)
>        libcurl.so.4 => /usr/lib/libcurl.so.4 (0x00007f8714532000)
>        libz.so.1 => /usr/lib/libz.so.1 (0x00007f871431d000)
>        libcrypto.so.0.9.8 => /usr/lib/libcrypto.so.0.9.8 (0x00007f8713f8f000)
>        libpthread.so.0 => /lib/libpthread.so.0 (0x00007f8713d74000)
>        libc.so.6 => /lib/libc.so.6 (0x00007f8713a21000)
>        librt.so.1 => /lib/librt.so.1 (0x00007f8713819000)
>        libssl.so.0.9.8 => /usr/lib/libssl.so.0.9.8 (0x00007f87135ca000)
>        libdl.so.2 => /lib/libdl.so.2 (0x00007f87133c6000)
>        /lib/ld-linux-x86-64.so.2 (0x00007f8714778000)

It will depend on the dependencies of curl that are applied. BLFS
(http://www.linuxfromscratch.org/blfs/view/stable/basicnet/curl.html)
list the following dependencies:

    pkg-config-0.22
    OpenSSL-0.9.8g  or GnuTLS-1.6.3
    OpenLDAP-2.3.39
    libidn-0.6.14
    MIT Kerberos V5-1.6 or Heimdal-1.1
    krb4
    SPNEGO
    c-ares

and the dependencies of those packages and so forth.

On Ubuntu 9.04, I get:

$ ldd /usr/bin/git
	linux-gate.so.1 =>  (0xb80ae000)
	libcurl-gnutls.so.4 => /usr/lib/libcurl-gnutls.so.4 (0xb805b000)
	libz.so.1 => /lib/libz.so.1 (0xb8045000)
	libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb802b000)
	libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7ec8000)
	libidn.so.11 => /usr/lib/libidn.so.11 (0xb7e95000)
	liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0xb7e87000)
	libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0xb7e43000)
	librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7e39000)
	libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0xb7e0e000)
	libgnutls.so.26 => /usr/lib/libgnutls.so.26 (0xb7d71000)
	libtasn1.so.3 => /usr/lib/libtasn1.so.3 (0xb7d5f000)
	libgcrypt.so.11 => /lib/libgcrypt.so.11 (0xb7cf6000)
	/lib/ld-linux.so.2 (0xb80af000)
	libresolv.so.2 => /lib/tls/i686/cmov/libresolv.so.2 (0xb7ce0000)
	libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0xb7cc7000)
	libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7cc3000)
	libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0xb7c31000)
	libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0xb7c0d000)
	libcom_err.so.2 => /lib/libcom_err.so.2 (0xb7c09000)
	libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0xb7bff000)
	libkeyutils.so.1 => /lib/libkeyutils.so.1 (0xb7bfb000)
	libgpg-error.so.0 => /lib/libgpg-error.so.0 (0xb7bf7000)

- Reece

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 18:57                                   ` Timo Hirvonen
  2009-07-25 19:06                                     ` Reece Dunn
@ 2009-07-25 20:31                                     ` Mike Hommey
  2009-07-25 21:02                                       ` Linus Torvalds
  2009-07-25 21:04                                     ` Carlos R. Mafra
  2 siblings, 1 reply; 74+ messages in thread
From: Mike Hommey @ 2009-07-25 20:31 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git, Carlos R. Mafra, Junio C Hamano

On Sat, Jul 25, 2009 at 09:57:39PM +0300, Timo Hirvonen wrote:
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > So curl really must die. It may not matter for the expensive operations, 
> > but a lot of scripting is about running all those "cheap" things that just 
> > add up over time.
> 
> SELinux is the problem, not curl.

I think it's NSS, the problem, not SELinux. Linus's libcurl is built
against NSS, which is the default on Fedora.

Mike

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 20:31                                     ` Mike Hommey
@ 2009-07-25 21:02                                       ` Linus Torvalds
  2009-07-25 21:13                                         ` Linus Torvalds
  2009-07-26  7:54                                         ` Mike Hommey
  0 siblings, 2 replies; 74+ messages in thread
From: Linus Torvalds @ 2009-07-25 21:02 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Timo Hirvonen, git, Carlos R. Mafra, Junio C Hamano



On Sat, 25 Jul 2009, Mike Hommey wrote:

> On Sat, Jul 25, 2009 at 09:57:39PM +0300, Timo Hirvonen wrote:
> > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > 
> > > So curl really must die. It may not matter for the expensive operations, 
> > > but a lot of scripting is about running all those "cheap" things that just 
> > > add up over time.
> > 
> > SELinux is the problem, not curl.
> 
> I think it's NSS, the problem, not SELinux. Linus's libcurl is built
> against NSS, which is the default on Fedora.

Well, it kind of doesn't matter. The fact is, libcurl is a bloated 
monster, and adds zero to 99% of what git people do.

The fact that apparently sometimes it's less bloated than other times 
doesn't really change anything fundamental, does it?

			Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 18:57                                   ` Timo Hirvonen
  2009-07-25 19:06                                     ` Reece Dunn
  2009-07-25 20:31                                     ` Mike Hommey
@ 2009-07-25 21:04                                     ` Carlos R. Mafra
  2 siblings, 0 replies; 74+ messages in thread
From: Carlos R. Mafra @ 2009-07-25 21:04 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git, Junio C Hamano

On Sat 25.Jul'09 at 21:57:39 +0300, Timo Hirvonen wrote:
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > So curl really must die. It may not matter for the expensive operations, 
> > but a lot of scripting is about running all those "cheap" things that just 
> > add up over time.
> 
> SELinux is the problem, not curl.

I don't have SELinux, and without curl it takes ~50% less time (on
top of Linus' previous optimizations!).

The time to open() all the libs really sums up to a considerable 
fraction (when the total time is low, not when compared to the 
huge 6 secs of before)

Without curl:
[mafra@Pilar:linux-2.6]$ grep open strace-nocurl.log |grep lib \
> | awk -F "<" '{print $2}' | sed s/\>// | awk '{s += $1} END {print s}'
0.070104

With curl:
[mafra@Pilar:linux-2.6]$ grep open strace-curl.log |grep lib \
> | awk -F "<" '{print $2}' | sed s/\>// | awk '{s += $1} END {print s}'
0.249764

PS: It is interesting that in my laptop the time required
to open libcurl alone is 20x the total time of 'git branch' for Linus'
in his supercomputer:
open("/usr/lib64/libcurl.so.4", O_RDONLY) = 3 <0.066239>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 21:02                                       ` Linus Torvalds
@ 2009-07-25 21:13                                         ` Linus Torvalds
  2009-07-25 23:23                                           ` Johannes Schindelin
  2009-07-26  7:54                                         ` Mike Hommey
  1 sibling, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-25 21:13 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Timo Hirvonen, git, Carlos R. Mafra, Junio C Hamano



On Sat, 25 Jul 2009, Linus Torvalds wrote:
>
> The fact that apparently sometimes it's less bloated than other times 
> doesn't really change anything fundamental, does it?

Btw, does anybody know how/why libdl seems to get linked in too?

We're not doing -ldl, and I'm not seeing any need for it, but it's 
definitely there on fedora, at least.

It seems to come from libcrypto. I can get rid of it with NO_OPENSSL, and 
that cuts down on the number of system calls in my startup by 16 (getting 
rid of both libcrypto and libdl). I wonder if there is some way to get the 
optimized openssl sha1 routines _without_ that silly ldl thing.

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 21:13                                         ` Linus Torvalds
@ 2009-07-25 23:23                                           ` Johannes Schindelin
  2009-07-26  4:49                                             ` Linus Torvalds
  0 siblings, 1 reply; 74+ messages in thread
From: Johannes Schindelin @ 2009-07-25 23:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mike Hommey, Timo Hirvonen, git, Carlos R. Mafra, Junio C Hamano

Hi,

On Sat, 25 Jul 2009, Linus Torvalds wrote:

> On Sat, 25 Jul 2009, Linus Torvalds wrote:
> >
> > The fact that apparently sometimes it's less bloated than other times 
> > doesn't really change anything fundamental, does it?
> 
> Btw, does anybody know how/why libdl seems to get linked in too?
> 
> We're not doing -ldl, and I'm not seeing any need for it, but it's 
> definitely there on fedora, at least.
> 
> It seems to come from libcrypto. I can get rid of it with NO_OPENSSL, and 
> that cuts down on the number of system calls in my startup by 16 (getting 
> rid of both libcrypto and libdl). I wonder if there is some way to get the 
> optimized openssl sha1 routines _without_ that silly ldl thing.

OpenSSL allows for so-called engines implementing certain algorithms.  
These engines are dynamic libraries, loaded via dlopen().

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 23:23                                           ` Johannes Schindelin
@ 2009-07-26  4:49                                             ` Linus Torvalds
  2009-07-26 16:29                                               ` Theodore Tso
  0 siblings, 1 reply; 74+ messages in thread
From: Linus Torvalds @ 2009-07-26  4:49 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Mike Hommey, Timo Hirvonen, git, Carlos R. Mafra, Junio C Hamano



On Sun, 26 Jul 2009, Johannes Schindelin wrote:
> > 
> > It seems to come from libcrypto. I can get rid of it with NO_OPENSSL, and 
> > that cuts down on the number of system calls in my startup by 16 (getting 
> > rid of both libcrypto and libdl). I wonder if there is some way to get the 
> > optimized openssl sha1 routines _without_ that silly ldl thing.
> 
> OpenSSL allows for so-called engines implementing certain algorithms.  
> These engines are dynamic libraries, loaded via dlopen().

Ah. Ok, that explains it.

It's a bit sad, since the _only_ thing we load all of libcrypto for is the 
(fairly trivial) SHA1 code. 

But at the same time, last time I benchmarked the different SHA1 
libraries, the openssl one was the fastest. I think it has tuned assembly 
language for most architectures. Our regular mozilla-based C code is 
perfectly fine, but it doesn't hold a candle to assembler tuning.

Oh well. 

		Linus

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-25 21:02                                       ` Linus Torvalds
  2009-07-25 21:13                                         ` Linus Torvalds
@ 2009-07-26  7:54                                         ` Mike Hommey
  2009-07-26 10:16                                           ` Johannes Schindelin
  1 sibling, 1 reply; 74+ messages in thread
From: Mike Hommey @ 2009-07-26  7:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Timo Hirvonen, git, Carlos R. Mafra, Junio C Hamano

On Sat, Jul 25, 2009 at 02:02:19PM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 25 Jul 2009, Mike Hommey wrote:
> 
> > On Sat, Jul 25, 2009 at 09:57:39PM +0300, Timo Hirvonen wrote:
> > > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > > 
> > > > So curl really must die. It may not matter for the expensive operations, 
> > > > but a lot of scripting is about running all those "cheap" things that just 
> > > > add up over time.
> > > 
> > > SELinux is the problem, not curl.
> > 
> > I think it's NSS, the problem, not SELinux. Linus's libcurl is built
> > against NSS, which is the default on Fedora.
> 
> Well, it kind of doesn't matter. The fact is, libcurl is a bloated 
> monster, and adds zero to 99% of what git people do.

Especially consideting the http transport fails to be useful in various
scenarios.

Mike

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-26  7:54                                         ` Mike Hommey
@ 2009-07-26 10:16                                           ` Johannes Schindelin
  2009-07-26 10:23                                             ` demerphq
  0 siblings, 1 reply; 74+ messages in thread
From: Johannes Schindelin @ 2009-07-26 10:16 UTC (permalink / raw)
  To: Mike Hommey
  Cc: Linus Torvalds, Timo Hirvonen, git, Carlos R. Mafra,
	Junio C Hamano

Hi,

On Sun, 26 Jul 2009, Mike Hommey wrote:

> On Sat, Jul 25, 2009 at 02:02:19PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Sat, 25 Jul 2009, Mike Hommey wrote:
> > 
> > > On Sat, Jul 25, 2009 at 09:57:39PM +0300, Timo Hirvonen wrote:
> > > > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > > > 
> > > > > So curl really must die. It may not matter for the expensive operations, 
> > > > > but a lot of scripting is about running all those "cheap" things that just 
> > > > > add up over time.
> > > > 
> > > > SELinux is the problem, not curl.
> > > 
> > > I think it's NSS, the problem, not SELinux. Linus's libcurl is built
> > > against NSS, which is the default on Fedora.
> > 
> > Well, it kind of doesn't matter. The fact is, libcurl is a bloated 
> > monster, and adds zero to 99% of what git people do.
> 
> Especially consideting the http transport fails to be useful in various
> scenarios.

I beg your pardon?  Maybe "s/useful/desirable/"?

In many scenarios, http transport is the _last resort_ against overzealous 
administrators.  The fact that you might be lucky enough not to need that 
resort is a blessing, and does not give you the right to ridicule those 
who are unfortunate enough not to share your good luck.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-26 10:16                                           ` Johannes Schindelin
@ 2009-07-26 10:23                                             ` demerphq
  2009-07-26 10:27                                               ` demerphq
  0 siblings, 1 reply; 74+ messages in thread
From: demerphq @ 2009-07-26 10:23 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Mike Hommey, Linus Torvalds, Timo Hirvonen, git, Carlos R. Mafra,
	Junio C Hamano

2009/7/26 Johannes Schindelin <Johannes.Schindelin@gmx.de>:
> Hi,
>
> On Sun, 26 Jul 2009, Mike Hommey wrote:
>
>> On Sat, Jul 25, 2009 at 02:02:19PM -0700, Linus Torvalds wrote:
>> >
>> >
>> > On Sat, 25 Jul 2009, Mike Hommey wrote:
>> >
>> > > On Sat, Jul 25, 2009 at 09:57:39PM +0300, Timo Hirvonen wrote:
>> > > > Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> > > >
>> > > > > So curl really must die. It may not matter for the expensive operations,
>> > > > > but a lot of scripting is about running all those "cheap" things that just
>> > > > > add up over time.
>> > > >
>> > > > SELinux is the problem, not curl.
>> > >
>> > > I think it's NSS, the problem, not SELinux. Linus's libcurl is built
>> > > against NSS, which is the default on Fedora.
>> >
>> > Well, it kind of doesn't matter. The fact is, libcurl is a bloated
>> > monster, and adds zero to 99% of what git people do.
>>
>> Especially consideting the http transport fails to be useful in various
>> scenarios.
>
> I beg your pardon?  Maybe "s/useful/desirable/"?
>
> In many scenarios, http transport is the _last resort_ against overzealous
> administrators.  The fact that you might be lucky enough not to need that
> resort is a blessing, and does not give you the right to ridicule those
> who are unfortunate enough not to share your good luck.

I think he meant that it is buggy and does not work correctly in
various scenarios.

Eg: Last I checked it couldn't handle repos where the main branch
wasn''t called master, and I've seen other messages that make me think
it doesn't work correctly on edge cases.

cheers,
Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-26 10:23                                             ` demerphq
@ 2009-07-26 10:27                                               ` demerphq
  0 siblings, 0 replies; 74+ messages in thread
From: demerphq @ 2009-07-26 10:27 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Mike Hommey, Linus Torvalds, Timo Hirvonen, git, Carlos R. Mafra,
	Junio C Hamano

2009/7/26 demerphq <demerphq@gmail.com>:
> 2009/7/26 Johannes Schindelin <Johannes.Schindelin@gmx.de>:
>> Hi,
>>
>> On Sun, 26 Jul 2009, Mike Hommey wrote:
>>
>>> On Sat, Jul 25, 2009 at 02:02:19PM -0700, Linus Torvalds wrote:
>>> >
>>> >
>>> > On Sat, 25 Jul 2009, Mike Hommey wrote:
>>> >
>>> > > On Sat, Jul 25, 2009 at 09:57:39PM +0300, Timo Hirvonen wrote:
>>> > > > Linus Torvalds <torvalds@linux-foundation.org> wrote:
>>> > > >
>>> > > > > So curl really must die. It may not matter for the expensive operations,
>>> > > > > but a lot of scripting is about running all those "cheap" things that just
>>> > > > > add up over time.
>>> > > >
>>> > > > SELinux is the problem, not curl.
>>> > >
>>> > > I think it's NSS, the problem, not SELinux. Linus's libcurl is built
>>> > > against NSS, which is the default on Fedora.
>>> >
>>> > Well, it kind of doesn't matter. The fact is, libcurl is a bloated
>>> > monster, and adds zero to 99% of what git people do.
>>>
>>> Especially consideting the http transport fails to be useful in various
>>> scenarios.
>>
>> I beg your pardon?  Maybe "s/useful/desirable/"?
>>
>> In many scenarios, http transport is the _last resort_ against overzealous
>> administrators.  The fact that you might be lucky enough not to need that
>> resort is a blessing, and does not give you the right to ridicule those
>> who are unfortunate enough not to share your good luck.
>
> I think he meant that it is buggy and does not work correctly in
> various scenarios.
>
> Eg: Last I checked it couldn't handle repos where the main branch
> wasn''t called master, and I've seen other messages that make me think
> it doesn't work correctly on edge cases.

Er, I meant that to go to Johannes directly, not to spam the list or
the cc's with my hazy recollection, and I should have added: "but
perhaps im confusing http and rsync".

Sorry for the noise.

Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-26  4:49                                             ` Linus Torvalds
@ 2009-07-26 16:29                                               ` Theodore Tso
  0 siblings, 0 replies; 74+ messages in thread
From: Theodore Tso @ 2009-07-26 16:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Mike Hommey, Timo Hirvonen, git,
	Carlos R. Mafra, Junio C Hamano

On Sat, Jul 25, 2009 at 09:49:41PM -0700, Linus Torvalds wrote:
> 
> But at the same time, last time I benchmarked the different SHA1 
> libraries, the openssl one was the fastest. I think it has tuned assembly 
> language for most architectures. Our regular mozilla-based C code is 
> perfectly fine, but it doesn't hold a candle to assembler tuning.

So maybe git should import the SHA1 code into its own source base?
It's not like the SHA1 code changes often, or is likely to have
security issues (at least, not buffer overruns; if SHA1 gets thorouhly
broken we might have to change algorithms, but that's a different
kettle of fish :-).

					- Ted

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:59                               ` Shawn O. Pearce
  2009-07-24 23:28                                 ` Junio C Hamano
@ 2009-07-26 17:07                                 ` Avi Kivity
  2009-07-26 17:16                                   ` Johannes Schindelin
  1 sibling, 1 reply; 74+ messages in thread
From: Avi Kivity @ 2009-07-26 17:07 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Theodore Tso, Linus Torvalds, Carlos R. Mafra, Junio C Hamano,
	Git Mailing List

On 07/25/2009 01:59 AM, Shawn O. Pearce wrote:
> Theodore Tso<tytso@mit.edu>  wrote:
>    
>> On Fri, Jul 24, 2009 at 02:21:20PM -0700, Linus Torvalds wrote:
>>      
>>> I wonder if there is some way to only load the crazy curl stuff when we
>>> actually want open a http: connection.
>>>        
>> Well, we could use dlopen(), but I'm not sure that qualifies as a
>> _sane_ solution --- especially given that there are approximately 15
>> interfaces used by git, that we'd have to resolve using dlsym().
>>      
>
> Yea, that's not sane.
>
> Probably the better approach is to have git fetch and git push be a
> different binary from main git, so we only pay the libcurl loading
> overheads when we hit transport.
>    

Or make the transports shared libraries, and use dlopen() to open the 
transport and dlsym() to resolve the struct transport object exported by 
the library.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-26 17:07                                 ` Avi Kivity
@ 2009-07-26 17:16                                   ` Johannes Schindelin
  0 siblings, 0 replies; 74+ messages in thread
From: Johannes Schindelin @ 2009-07-26 17:16 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Shawn O. Pearce, Theodore Tso, Linus Torvalds, Carlos R. Mafra,
	Junio C Hamano, Git Mailing List

Hi,

On Sun, 26 Jul 2009, Avi Kivity wrote:

> On 07/25/2009 01:59 AM, Shawn O. Pearce wrote:
> > Theodore Tso<tytso@mit.edu>  wrote:
> >    
> > > On Fri, Jul 24, 2009 at 02:21:20PM -0700, Linus Torvalds wrote:
> > >      
> > > > I wonder if there is some way to only load the crazy curl stuff when we
> > > > actually want open a http: connection.
> > > >        
> > > Well, we could use dlopen(), but I'm not sure that qualifies as a
> > > _sane_ solution --- especially given that there are approximately 15
> > > interfaces used by git, that we'd have to resolve using dlsym().
> > >      
> >
> > Yea, that's not sane.
> >
> > Probably the better approach is to have git fetch and git push be a
> > different binary from main git, so we only pay the libcurl loading
> > overheads when we hit transport.
> >    
> 
> Or make the transports shared libraries, and use dlopen() to open the
> transport and dlsym() to resolve the struct transport object exported by the
> library.

... and introduce all kinds of braindamage to the Makefile so we can 
properly compile .dll files on Windows?

Umm, thanks, but no.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
@ 2009-07-26 23:21 George Spelvin
  0 siblings, 0 replies; 74+ messages in thread
From: George Spelvin @ 2009-07-26 23:21 UTC (permalink / raw)
  To: git, torvalds; +Cc: linux

> It's a bit sad, since the _only_ thing we load all of libcrypto for is the 
> (fairly trivial) SHA1 code. 
>
> But at the same time, last time I benchmarked the different SHA1 
> libraries, the openssl one was the fastest. I think it has tuned assembly 
> language for most architectures. Our regular mozilla-based C code is 
> perfectly fine, but it doesn't hold a candle to assembler tuning.

Actually, openssl only has assembly for x86, x86_64, and ia64.
Truthfully, once you have 32 registers, SHA1 is awfully easy to
compile near-optimally.

Git currently includes some hand-tuned assembly that isn't in OpenSSL:
- ARM (only 16 registers, and the rotate+op support can be used nicely)
- PPC (3-way superscalar *without* OO execution benefits from careful
  scheduling)

Further, all of the core hand-tuned SHA1 assembly code in OpenSSL is by
Andy Polyakov and is dual-licensed GPL/3-clause BSD *in addition to*
the OpenSSL license.  So we can just import it:

See http://www.openssl.org/~appro/cryptogams/
and http://www.openssl.org/~appro/cryptogams/cryptogams-0.tar.gz

(Ooh, look, he has PPC code in there, too.  Does anyone with a PPC machine
want to compare it with Git's?)

It'll take some massaging because that's just the core SHA1_Transform
function and not the wrappers, but it's hardly a heroic effort.

I'm pretty deep in the weeds at $DAY_JOB and can't get to it for a while,
but would a patch be appreciated?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Performance issue of 'git branch'
  2009-07-24 22:13                             ` Linus Torvalds
  2009-07-24 22:18                               ` david
@ 2009-08-07  4:21                               ` Jeff King
  1 sibling, 0 replies; 74+ messages in thread
From: Jeff King @ 2009-08-07  4:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Git Mailing List, Carlos R. Mafra,
	Daniel Barkalow, Johannes Schindelin

On Fri, Jul 24, 2009 at 03:13:07PM -0700, Linus Torvalds wrote:

> Subject: [PATCH] git-http-fetch: not a builtin
> 
> We should really try to avoid having a dependency on the curl libraries
> for the core 'git' executable. It adds huge overheads, for no advantage.
> 
> This splits up git-http-fetch so that it isn't built-in.  We still do
> end up linking with curl for the git binary due to the transport.c http
> walker, but that's at least partially an independent issue.
>
> [...]
>
> +git-http-fetch$X: revision.o http.o http-push.o $(GITLIBS)
> +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> +		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)

Err, this seems to horribly break git-http-fetch (see if you can spot
the logic error in dependencies). Patch is below.

Nobody noticed, I expect, because nothing in git _uses_ http-fetch
anymore, now that git-clone is no longer a shell script. I only noticed
because it tried to build http-push on one of my NO_EXPAT machines.

It might be an interesting exercise to dust off the old shell scripts
once in a while and see if they still pass their original tests while
running on top of a more modern git. It would test that we haven't
broken the plumbing interfaces.

-- >8 --
Subject: [PATCH] Makefile: build http-fetch against http-fetch.o

As opposed to http-push.o. We can also drop EXPAT_LIBEXPAT,
since fetch does not need it.

This appears to be a bad cut-and-paste in commit 1088261f.

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 97d904b..d6362d3 100644
--- a/Makefile
+++ b/Makefile
@@ -1502,9 +1502,9 @@ http.o http-walker.o http-push.o: http.h
 
 http.o http-walker.o: $(LIB_H)
 
-git-http-fetch$X: revision.o http.o http-push.o $(GITLIBS)
+git-http-fetch$X: revision.o http.o http-fetch.o http-walker.o $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
-		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+		$(LIBS) $(CURL_LIBCURL)
 git-http-push$X: revision.o http.o http-push.o $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
-- 
1.6.4.117.g6056d.dirty

^ permalink raw reply related	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2009-08-07  4:21 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-22 23:59 Performance issue of 'git branch' Carlos R. Mafra
2009-07-23  0:21 ` Linus Torvalds
2009-07-23  0:51   ` Linus Torvalds
2009-07-23  0:55     ` Linus Torvalds
2009-07-23  2:02       ` Carlos R. Mafra
2009-07-23  2:28         ` Linus Torvalds
2009-07-23 12:42           ` Jakub Narebski
2009-07-23 14:45             ` Carlos R. Mafra
2009-07-23 16:25             ` Linus Torvalds
2009-07-23  1:22   ` Carlos R. Mafra
2009-07-23  2:20     ` Linus Torvalds
2009-07-23  2:23       ` Linus Torvalds
2009-07-23  3:08         ` Linus Torvalds
2009-07-23  3:21           ` Linus Torvalds
2009-07-23 17:47             ` Tony Finch
2009-07-23 18:57               ` Linus Torvalds
2009-07-23 22:48                 ` Newton-Raphson, was " Tony Finch
2009-07-23 23:24                   ` Johannes Schindelin
2009-07-23 23:50                     ` Tony Finch
2009-07-24  0:43                       ` Johannes Schindelin
2009-07-23  3:18         ` Carlos R. Mafra
2009-07-23  3:27           ` Carlos R. Mafra
2009-07-23  3:40           ` Carlos R. Mafra
2009-07-23  3:47           ` Linus Torvalds
2009-07-23  4:10             ` Linus Torvalds
2009-07-23  5:13               ` Junio C Hamano
2009-07-23  5:17               ` Carlos R. Mafra
2009-07-23  4:40         ` Junio C Hamano
2009-07-23  5:36           ` Linus Torvalds
2009-07-23  5:52             ` Junio C Hamano
2009-07-23  6:04               ` Junio C Hamano
2009-07-23 17:19                 ` Linus Torvalds
2009-07-23 16:07           ` Carlos R. Mafra
2009-07-23 16:19             ` Linus Torvalds
2009-07-23 16:53               ` Carlos R. Mafra
2009-07-23 19:05                 ` Linus Torvalds
2009-07-23 19:13                   ` Linus Torvalds
2009-07-23 19:55                     ` Carlos R. Mafra
2009-07-24 20:36                       ` Linus Torvalds
2009-07-24 20:47                         ` Linus Torvalds
2009-07-24 21:21                           ` Linus Torvalds
2009-07-24 22:13                             ` Linus Torvalds
2009-07-24 22:18                               ` david
2009-07-24 22:42                                 ` Linus Torvalds
2009-07-24 22:46                                   ` david
2009-07-25  2:39                                     ` Linus Torvalds
2009-07-25  2:53                                       ` Daniel Barkalow
2009-08-07  4:21                               ` Jeff King
2009-07-24 22:54                             ` Theodore Tso
2009-07-24 22:59                               ` Shawn O. Pearce
2009-07-24 23:28                                 ` Junio C Hamano
2009-07-26 17:07                                 ` Avi Kivity
2009-07-26 17:16                                   ` Johannes Schindelin
2009-07-24 23:46                             ` Carlos R. Mafra
2009-07-25  0:41                               ` Carlos R. Mafra
2009-07-25 18:04                                 ` Linus Torvalds
2009-07-25 18:57                                   ` Timo Hirvonen
2009-07-25 19:06                                     ` Reece Dunn
2009-07-25 20:31                                     ` Mike Hommey
2009-07-25 21:02                                       ` Linus Torvalds
2009-07-25 21:13                                         ` Linus Torvalds
2009-07-25 23:23                                           ` Johannes Schindelin
2009-07-26  4:49                                             ` Linus Torvalds
2009-07-26 16:29                                               ` Theodore Tso
2009-07-26  7:54                                         ` Mike Hommey
2009-07-26 10:16                                           ` Johannes Schindelin
2009-07-26 10:23                                             ` demerphq
2009-07-26 10:27                                               ` demerphq
2009-07-25 21:04                                     ` Carlos R. Mafra
2009-07-23 16:48         ` Anders Kaseorg
2009-07-23 19:03           ` Carlos R. Mafra
2009-07-23  0:23 ` SZEDER Gábor
2009-07-23  2:25   ` Carlos R. Mafra
  -- strict thread matches above, loose matches on Subject: below --
2009-07-26 23:21 George Spelvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).