* git and larger trees, not so fast? @ 2007-08-09 16:30 moe 2007-08-09 17:11 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: moe @ 2007-08-09 16:30 UTC (permalink / raw) To: git hi guys, earlier today i imported one of my larger trees (~70k files) into git and was quite disappointed by the performance. i made some tests on latest master branch (1.5.3.rc4.29.g74276) and it seems like git hits a wall somewhere above ~50k files. i'm seeing 'commit' timings of 30s and up as well as 'status' timings in the 10s ballpark. here's a test-case (should be safe to copy/paste on linux, bash): # # first create a tree of roughly 100k files # mkdir bummer cd bummer for ((i=0;i<100;i++)); do mkdir $i && pushd $i; for ((j=0;j<1000;j++)); do echo "$j" >$j; done; popd; done # # init and add this to git # time git init git config user.email "no@thx" git config user.name "nothx" time git add . time git commit -m 'buurrrrn' -a # # git-status, tunes in at around ~10s for me # time git-status time git-status time git-status # # git-commit, takes a whopping 52s for me # date >50/500 time git commit -m 'expose the turtle' 50/500 regards, moe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 16:30 git and larger trees, not so fast? moe @ 2007-08-09 17:11 ` Linus Torvalds 2007-08-09 17:38 ` Linus Torvalds ` (2 more replies) 2007-08-10 19:39 ` Linus Torvalds 2007-08-11 18:47 ` Linus Torvalds 2 siblings, 3 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-09 17:11 UTC (permalink / raw) To: moe; +Cc: git On Thu, 9 Aug 2007, moe wrote: > > i made some tests on latest master branch > (1.5.3.rc4.29.g74276) and it seems like git > hits a wall somewhere above ~50k files. Good catch. Definitely not acceptable performance. We seem to spend a lot of our time in memcpy: samples % image name app name symbol name 200527 25.4551 libc-2.6.so libc-2.6.so _wordcopy_bwd_aligned 104505 13.2660 libc-2.6.so libc-2.6.so _wordcopy_fwd_aligned 99185 12.5907 libz.so.1.2.3 libz.so.1.2.3 (no symbols) 83452 10.5935 libc-2.5.so libc-2.5.so (no symbols) 54203 6.8806 git git assign_blame 46153 5.8587 git git read_directory_recursive 27665 3.5118 git git handle_split 21385 2.7146 vmlinux vmlinux blk_complete_sgv4_hdr_rq 20745 2.6334 git git read_packed_refs 12709 1.6133 git git builtin_diffstat 7829 0.9938 git git show_patch_diff ... but the silly thing is, this is only true if you give the filenames explicitly! Lookie here: [torvalds@woody bummer]$ date >50/500 [torvalds@woody bummer]$ time git commit -a -m 'expose the turtle' Created commit 25ca22d: expose the turtle 1 files changed, 1 insertions(+), 1 deletions(-) real 0m4.612s user 0m4.224s sys 0m0.412s [torvalds@woody bummer]$ date >50/500 [torvalds@woody bummer]$ time git commit -m 'expose the turtle' 50/500 Created commit 009f6b5: expose the turtle 1 files changed, 1 insertions(+), 1 deletions(-) real 0m12.464s user 0m12.129s sys 0m0.336s ie we take almost three times longer with explicitly naming the file, than when just using "git commit -a". Oops. That said, even the 4.6 seconds is really not acceptable: this is on a good 2.6GHz Core 2 Duo too, so on weaker hardware it would be quite painful. I haven't looked at *why* it's that slow, but it's not anything really fundamental, the basic operations are fast: [torvalds@woody bummer]$ time git add 50/500 real 0m0.064s user 0m0.048s sys 0m0.016s [torvalds@woody bummer]$ time git write-tree 7480230419e510c93082a4a19e23d928a426973a real 0m0.069s user 0m0.048s sys 0m0.024s [torvalds@woody bummer]$ time git diff real 0m0.127s user 0m0.000s sys 0m0.000s so it's not the "lstat()" that we do on all files, or the write-tree (which are all O(n) in files, with a rather small constant), but some O(n**2) behaviour elsewhere. And all the expense seems to be in not the commit itself, but in [torvalds@woody bummer]$ time git 'runstatus' '--nocolor' real 0m4.208s user 0m4.068s sys 0m0.140s and that thing seems to suck really really hard. Doing an ltrace on it shows tons and tons of: ... strlen("35") strlen("349") calloc(1, 72) memcpy(0x73034e, "10/", 3) memcpy(0x730351, "349", 4) memmove(0x2ab637f41e80, 0x2ab637f41e78, 781768) ... but I haven't looked at where they come from yet. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 17:11 ` Linus Torvalds @ 2007-08-09 17:38 ` Linus Torvalds 2007-08-09 18:00 ` Junio C Hamano 2007-08-09 18:06 ` Linus Torvalds 2007-08-09 17:54 ` Linus Torvalds 2007-08-09 18:06 ` David Kastrup 2 siblings, 2 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-09 17:38 UTC (permalink / raw) To: moe; +Cc: git On Thu, 9 Aug 2007, Linus Torvalds wrote: > > Doing an ltrace on it shows tons and tons of: > > ... > strlen("35") > strlen("349") > calloc(1, 72) > memcpy(0x73034e, "10/", 3) > memcpy(0x730351, "349", 4) > memmove(0x2ab637f41e80, 0x2ab637f41e78, 781768) > ... > > but I haven't looked at where they come from yet. Ouch. It's the diffing between HEAD and the index, and it's all from "add_index_entry()", which sorts the index array using an insertion sort. So when the index array gets large, that sort spends all its time in huge memmove() calls. The silly thing, of course, is that we don't even "need" to do that: both the index and the trees are really sorted already, so we could just interleave them. But since we read them separately, the thing just sucks. We've fixed other similar cases of this we had (diffing trees against each other) by walking the trees together, but the "index vs tree" diff (and merge) is the one remaining place where we still use the original stupid algorithm. So you'll see this performance problem for - diff tree against index ("git diff HEAD" - merge tree into index ("git read-tree -m HEAD") which both do the stupid index/tree filling. So this is all O(n**2), which is why we haven't reacted very much - it doesn't show up nearly as much with the kernel. Also, with a smaller set of files, it would tends to fit in the L2 cache of most competent CPU's. So not only is it n**2, you get the cache trashing behaviour too, and that, I think, is what really causes it to fall off the cliff edge! Gaah. This shouldn't be *that* hard to fix, but I'm not entirely sure I'll have time today. Diffing the index against the tree *should* be instantaneous. It should be no more costly than reading the tree itself (which is 0.191 seconds for me: test "git read-tree -m HEAD" vs "git read-tree HEAD") and reading the index (which is almost instantaneous - the only way I can test it is by doing something like "git update-index --refresh", and that's 0.131 seconds, but that includes all the 100,000 "lstat()" calls). So basically, we're spending several seconds just doing stupid make-believe work and moving the index array around. Ouch. Anyway, the good news is that this is by no means fundamental. It's a small and stupid detail. The only thing that makes it at all painful is that this is in some low-level crud that we haven't touched in *ages*, so I've long since swapped out all my recollection of how we do it. (We basically do: read_cache(); followed by unpack_trees(); and each of those *on*its*own* is pretty cheap, but when we unpack trees into an already populated index, the end result is ugly. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 17:38 ` Linus Torvalds @ 2007-08-09 18:00 ` Junio C Hamano 2007-08-09 18:06 ` Linus Torvalds 1 sibling, 0 replies; 38+ messages in thread From: Junio C Hamano @ 2007-08-09 18:00 UTC (permalink / raw) To: Linus Torvalds; +Cc: moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > So this is all O(n**2), which is why we haven't reacted very much - it > doesn't show up nearly as much with the kernel. Also, with a smaller set > of files, it would tends to fit in the L2 cache of most competent CPU's. > So not only is it n**2, you get the cache trashing behaviour too, and > that, I think, is what really causes it to fall off the cliff edge! > > Gaah. This shouldn't be *that* hard to fix, but I'm not entirely sure I'll > have time today. One thing to keep in mind is that in your earlier test of "git write-tree" (or "git commit") followed by "git add a/file" followed by "git write-tree" is extremely fast because the last operation optimizes otherwise O(n) behaviour of write-tree from index extreamely cheap, thanks to cache-tree in the index. > Diffing the index against the tree *should* be instantaneous. Right now we do not cull the subdirectory that we _know_ are unchanged in "git diff-index --cached" using cache-tree, but diffing the index against the tree could be instantaneous. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 17:38 ` Linus Torvalds 2007-08-09 18:00 ` Junio C Hamano @ 2007-08-09 18:06 ` Linus Torvalds 2007-08-09 18:11 ` Junio C Hamano 1 sibling, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-09 18:06 UTC (permalink / raw) To: moe; +Cc: git On Thu, 9 Aug 2007, Linus Torvalds wrote: > > Gaah. This shouldn't be *that* hard to fix, but I'm not entirely sure I'll > have time today. In fact, I'm almost sure I will *not* have time today. Anyway, the really trivial (and ugly) fix is to handle the cases of adding _independent_ stages to the index (which is the case for both "git diff-index" and "git read-tree -m") differently: instead of using the standard "add_index_entry()", which does all the complex sorting and checks that there aren't duplicates, we could do a much simpler one that just unconditionally appends to the end of the index. This works, because when the stages are independent, there can be no index clashes (by definition). Then, after adding all the stages, we could just do a "qsort()" on the result, and rather than having an expensive O(n**2) thing, we'd have a much nicer and well-behaved (with a smaller constant too) O(n*logn) thing. I bet it's just ~50 lines of code, it really shouldn't be that hard to do. I just won't be able to do it and test it until late tonight or tomorrow, I suspect. Sadly, this is an area that is almost exclusively mine and Junio's. I'd love for somebody else to get their feet wet, but doing a gitk read-cache.c shows that few enough people have done anythign really fundamental in this file.. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 18:06 ` Linus Torvalds @ 2007-08-09 18:11 ` Junio C Hamano 2007-08-09 20:42 ` Junio C Hamano 0 siblings, 1 reply; 38+ messages in thread From: Junio C Hamano @ 2007-08-09 18:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Thu, 9 Aug 2007, Linus Torvalds wrote: >> >> Gaah. This shouldn't be *that* hard to fix, but I'm not entirely sure I'll >> have time today. > > In fact, I'm almost sure I will *not* have time today. > > Anyway, the really trivial (and ugly) fix is to handle the cases of adding > _independent_ stages to the index (which is the case for both "git > diff-index" and "git read-tree -m") differently... > ... > Sadly, this is an area that is almost exclusively mine and Junio's. I'd > love for somebody else to get their feet wet,... I hopefully have some time this evening to look into this, if not earlier. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 18:11 ` Junio C Hamano @ 2007-08-09 20:42 ` Junio C Hamano 2007-08-09 20:52 ` Sean 0 siblings, 1 reply; 38+ messages in thread From: Junio C Hamano @ 2007-08-09 20:42 UTC (permalink / raw) To: Linus Torvalds; +Cc: moe, git Junio C Hamano <gitster@pobox.com> writes: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> In fact, I'm almost sure I will *not* have time today. >> >> Anyway, the really trivial (and ugly) fix is to handle the cases of adding >> _independent_ stages to the index (which is the case for both "git >> diff-index" and "git read-tree -m") differently... >> ... >> Sadly, this is an area that is almost exclusively mine and Junio's. I'd >> love for somebody else to get their feet wet,... > > I hopefully have some time this evening to look into this, if > not earlier. So here is what I did during the lunch break. wt-status has two calls to run_diff_index() to do the equivalent of "git diff --cached". This codepath is the sole caller of read_tree(), and before calling read_tree(), vacates stage #1 entries, and reads tree contents to the index at stage #1, without any funky "merge" magic. This changes read_tree() to first make sure that there is not any existing cache entries at specified stage (from the above description, you can see this is not strictly needed if we are only interested in supporting existing callers), and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). add_cache_entry() has been taught to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and appends the given cache entry straight at the end of the active cache. The appending and sorting at the end destroys cache-tree optimization, but we are not writing the resulting index out anyway, so that is not a problem. I do not know if this "fixes" the performance problem or not (I do not have that much time during the day), so I would not call this a "fix" yet, but at least the _change_ looks trivially correct, and passes all the existing tests. Interested parties may want to try it and see if it shifts the bottleneck. --- cache.h | 1 + read-cache.c | 20 +++++++++++++++- tree.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 85 insertions(+), 5 deletions(-) diff --git a/cache.h b/cache.h index e97af18..e5276e6 100644 --- a/cache.h +++ b/cache.h @@ -258,6 +258,7 @@ extern int index_name_pos(struct index_state *, const char *name, int namelen); #define ADD_CACHE_OK_TO_ADD 1 /* Ok to add */ #define ADD_CACHE_OK_TO_REPLACE 2 /* Ok to replace file/directory */ #define ADD_CACHE_SKIP_DFCHECK 4 /* Ok to skip DF conflict checks */ +#define ADD_CACHE_JUST_APPEND 8 /* Append only; tree.c::read_tree() */ extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option); extern struct cache_entry *refresh_cache_entry(struct cache_entry *ce, int really); extern int remove_index_entry_at(struct index_state *, int pos); diff --git a/read-cache.c b/read-cache.c index e060392..865369d 100644 --- a/read-cache.c +++ b/read-cache.c @@ -665,7 +665,7 @@ static int check_file_directory_conflict(struct index_state *istate, return retval + has_dir_name(istate, ce, pos, ok_to_replace); } -int add_index_entry(struct index_state *istate, struct cache_entry *ce, int option) +static int add_index_entry_with_check(struct index_state *istate, struct cache_entry *ce, int option) { int pos; int ok_to_add = option & ADD_CACHE_OK_TO_ADD; @@ -707,6 +707,22 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti pos = index_name_pos(istate, ce->name, ntohs(ce->ce_flags)); pos = -pos-1; } + return pos + 1; +} + +int add_index_entry(struct index_state *istate, struct cache_entry *ce, int option) +{ + int pos; + + if (option & ADD_CACHE_JUST_APPEND) + pos = istate->cache_nr; + else { + int ret; + ret = add_index_entry_with_check(istate, ce, option); + if (ret <= 0) + return ret; + pos = ret - 1; + } /* Make sure the array is big enough .. */ if (istate->cache_nr == istate->cache_alloc) { @@ -717,7 +733,7 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti /* Add it in.. */ istate->cache_nr++; - if (istate->cache_nr > pos) + if (istate->cache_nr > pos + 1) memmove(istate->cache + pos + 1, istate->cache + pos, (istate->cache_nr - pos - 1) * sizeof(ce)); diff --git a/tree.c b/tree.c index 04fe653..8c0819f 100644 --- a/tree.c +++ b/tree.c @@ -1,4 +1,5 @@ #include "cache.h" +#include "cache-tree.h" #include "tree.h" #include "blob.h" #include "commit.h" @@ -7,7 +8,7 @@ const char *tree_type = "tree"; -static int read_one_entry(const unsigned char *sha1, const char *base, int baselen, const char *pathname, unsigned mode, int stage) +static int read_one_entry_opt(const unsigned char *sha1, const char *base, int baselen, const char *pathname, unsigned mode, int stage, int opt) { int len; unsigned int size; @@ -25,7 +26,23 @@ static int read_one_entry(const unsigned char *sha1, const char *base, int basel memcpy(ce->name, base, baselen); memcpy(ce->name + baselen, pathname, len+1); hashcpy(ce->sha1, sha1); - return add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_SKIP_DFCHECK); + return add_cache_entry(ce, opt); +} + +static int read_one_entry(const unsigned char *sha1, const char *base, int baselen, const char *pathname, unsigned mode, int stage) +{ + return read_one_entry_opt(sha1, base, baselen, pathname, mode, stage, + ADD_CACHE_OK_TO_ADD|ADD_CACHE_SKIP_DFCHECK); +} + +/* + * This is used when the caller knows there is no existing entries at + * the stage that will conflict with the entry being added. + */ +static int read_one_entry_quick(const unsigned char *sha1, const char *base, int baselen, const char *pathname, unsigned mode, int stage) +{ + return read_one_entry_opt(sha1, base, baselen, pathname, mode, stage, + ADD_CACHE_JUST_APPEND); } static int match_tree_entry(const char *base, int baselen, const char *path, unsigned int mode, const char **paths) @@ -119,9 +136,55 @@ int read_tree_recursive(struct tree *tree, return 0; } +static int cmp_cache_name_compare(const void *a_, const void *b_) +{ + const struct cache_entry *ce1, *ce2; + + ce1 = *((const struct cache_entry **)a_); + ce2 = *((const struct cache_entry **)b_); + return cache_name_compare(ce1->name, ntohs(ce1->ce_flags), + ce2->name, ntohs(ce2->ce_flags)); +} + int read_tree(struct tree *tree, int stage, const char **match) { - return read_tree_recursive(tree, "", 0, stage, match, read_one_entry); + read_tree_fn_t fn = NULL; + int i, err; + + /* + * Currently the only existing callers of this function all + * call it with stage=1 and after making sure there is nothing + * at that stage; we could always use read_one_entry_quick(). + * + * But when we decide to straighten out git-read-tree not to + * use unpack_trees() in some cases, this will probably start + * to matter. + */ + + /* + * See if we have cache entry at the stage. If so, + * do it the original slow way, otherwise, append and then + * sort at the end. + */ + for (i = 0; !fn && i < active_nr; i++) { + struct cache_entry *ce = active_cache[i]; + if (ce_stage(ce) == stage) + fn = read_one_entry; + } + + if (!fn) + fn = read_one_entry_quick; + err = read_tree_recursive(tree, "", 0, stage, match, fn); + if (fn == read_one_entry || err) + return err; + + /* + * Sort the cache entry -- we need to nuke the cache tree, though. + */ + cache_tree_free(&active_cache_tree); + qsort(active_cache, active_nr, sizeof(active_cache[0]), + cmp_cache_name_compare); + return 0; } struct tree *lookup_tree(const unsigned char *sha1) ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 20:42 ` Junio C Hamano @ 2007-08-09 20:52 ` Sean 2007-08-09 21:37 ` Junio C Hamano 2007-08-09 21:41 ` Linus Torvalds 0 siblings, 2 replies; 38+ messages in thread From: Sean @ 2007-08-09 20:52 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, moe, git On Thu, 09 Aug 2007 13:42:50 -0700 Junio C Hamano <gitster@pobox.com> wrote: > I do not know if this "fixes" the performance problem or not (I > do not have that much time during the day), so I would not call > this a "fix" yet, but at least the _change_ looks trivially > correct, and passes all the existing tests. > > Interested parties may want to try it and see if it shifts the > bottleneck. Junio, This makes things _much_ better, however the final commit in the test script still shows a lot of user time: ## time git init real 0m0.005s user 0m0.001s sys 0m0.004s ## time git add . real 0m3.501s user 0m1.268s sys 0m2.159s ## time git commit -q -m 'buurrrrn' -a real 0m2.299s user 0m1.065s sys 0m1.317s ## time git status real 0m1.107s user 0m0.548s sys 0m0.557s ## time git status real 0m1.122s user 0m0.545s sys 0m0.557s ## time git status real 0m1.142s user 0m0.545s sys 0m0.576s ## time git commit -q -m 'hurry' 50/500 real 0m16.944s user 0m15.466s sys 0m1.133s Cheers, Sean ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 20:52 ` Sean @ 2007-08-09 21:37 ` Junio C Hamano 2007-08-09 21:41 ` Linus Torvalds 1 sibling, 0 replies; 38+ messages in thread From: Junio C Hamano @ 2007-08-09 21:37 UTC (permalink / raw) To: Sean; +Cc: Linus Torvalds, moe, git Sean <seanlkml@sympatico.ca> writes: > This makes things _much_ better, however the final commit in the > test script still shows a lot of user time: I really do not have time to look into this for now until late in the evening, but it is not surprising at all that per-path partial commit is _always_ slower than the whole tree commit. It simply needs to do _extra_ work, such as (1) re-initializing a temporary index from the tree, (2) add the named entries to that temporary index, and (3) add the same named entries to the real index. After that it writes a tree out of the temporary index but the cost for that is the same as writing out of the real index that is done for the normal commit. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 20:52 ` Sean 2007-08-09 21:37 ` Junio C Hamano @ 2007-08-09 21:41 ` Linus Torvalds 2007-08-09 21:46 ` Linus Torvalds 1 sibling, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-09 21:41 UTC (permalink / raw) To: Sean; +Cc: Junio C Hamano, moe, git On Thu, 9 Aug 2007, Sean wrote: > > This makes things _much_ better, however the final commit in the > test script still shows a lot of user time: > > ## time git commit -q -m 'buurrrrn' -a > real 0m2.299s > user 0m1.065s > sys 0m1.317s > > ## time git commit -q -m 'hurry' 50/500 > real 0m16.944s > user 0m15.466s > sys 0m1.133s In the case where we do a partial commit (never mind that it's all the changes: when you give a path limiter, it's "partial"), "git commit" one goes through another path, and triggers the same issue with git read-tree --index-output=tmp-index -i -m HEAD which has the same O(n**2) issue (except it uses "unpack_trees()" rather than "read_tree()", so Junio's patch does nothing for it). So "builtin-read-tree.c" (or rather unpack-trees.c) would need the same kind of logic. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 21:41 ` Linus Torvalds @ 2007-08-09 21:46 ` Linus Torvalds 2007-08-09 22:02 ` Junio C Hamano 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-09 21:46 UTC (permalink / raw) To: Sean; +Cc: Junio C Hamano, moe, git On Thu, 9 Aug 2007, Linus Torvalds wrote: > > So "builtin-read-tree.c" (or rather unpack-trees.c) would need the same > kind of logic. The path seems to be: cmd_read_tree -> unpack_trees -> unpack_trees_rec -> [ recursive .. unpack_trees_rec ] -> oneway_merge -> keep_entry -> add_index_entry() and here again we end up having the same insertion sort issue. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 21:46 ` Linus Torvalds @ 2007-08-09 22:02 ` Junio C Hamano 2007-08-09 23:38 ` Junio C Hamano 2007-08-10 1:42 ` git and larger trees, not so fast? Daniel Barkalow 0 siblings, 2 replies; 38+ messages in thread From: Junio C Hamano @ 2007-08-09 22:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Thu, 9 Aug 2007, Linus Torvalds wrote: >> >> So "builtin-read-tree.c" (or rather unpack-trees.c) would need the same >> kind of logic. > > The path seems to be: > > cmd_read_tree -> > unpack_trees -> > unpack_trees_rec -> > [ recursive .. unpack_trees_rec ] -> > oneway_merge -> > keep_entry -> > add_index_entry() > > and here again we end up having the same insertion sort issue. Quite honestly, I was this (shows the "thumb and index finger almost touching" gesture) close to declare that unpack-trees is unsalvageable, and was planning to redo the one-tree (and perhaps two-tree) read-tree without using that mess after 1.5.3. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 22:02 ` Junio C Hamano @ 2007-08-09 23:38 ` Junio C Hamano 2007-08-10 0:04 ` Junio C Hamano 2007-08-10 0:44 ` Linus Torvalds 2007-08-10 1:42 ` git and larger trees, not so fast? Daniel Barkalow 1 sibling, 2 replies; 38+ messages in thread From: Junio C Hamano @ 2007-08-09 23:38 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Junio C Hamano <gitster@pobox.com> writes: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> On Thu, 9 Aug 2007, Linus Torvalds wrote: >>> >>> So "builtin-read-tree.c" (or rather unpack-trees.c) would need the same >>> kind of logic. >> >> The path seems to be: >> >> cmd_read_tree -> >> unpack_trees -> >> unpack_trees_rec -> >> [ recursive .. unpack_trees_rec ] -> >> oneway_merge -> >> keep_entry -> >> add_index_entry() >> >> and here again we end up having the same insertion sort issue. > > Quite honestly, I was this (shows the "thumb and index finger > almost touching" gesture) close to declare that unpack-trees is > unsalvageable, and was planning to redo the one-tree (and > perhaps two-tree) read-tree without using that mess after 1.5.3. While I do not think the previous one was hacky at all, this one IS a hack, not meant for inclusion. It makes the partial commit codepath to use vanilla "git read-tree" without any single tree merge semantics, and rewrite that codepath to use read_tree() which was changed with my previous patch. --- builtin-read-tree.c | 5 ++++- git-commit.sh | 2 +- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/builtin-read-tree.c b/builtin-read-tree.c index a3b17a3..61ea15c 100644 --- a/builtin-read-tree.c +++ b/builtin-read-tree.c @@ -258,7 +258,10 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix) opts.head_idx = 1; } - unpack_trees(trees, &opts); + if (!opts.merge && !opts.prefix && trees && !trees->next) + read_tree((struct tree*) trees->item, 0, NULL); + else + unpack_trees(trees, &opts); /* * When reading only one tree (either the most basic form, diff --git a/git-commit.sh b/git-commit.sh index d7e7028..b50468f 100755 --- a/git-commit.sh +++ b/git-commit.sh @@ -386,7 +386,7 @@ t,) if test -z "$initial_commit" then GIT_INDEX_FILE="$THIS_INDEX" \ - git read-tree --index-output="$TMP_INDEX" -i -m HEAD + git read-tree --index-output="$TMP_INDEX" HEAD else rm -f "$TMP_INDEX" fi || exit ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 23:38 ` Junio C Hamano @ 2007-08-10 0:04 ` Junio C Hamano 2007-08-10 0:44 ` Linus Torvalds 1 sibling, 0 replies; 38+ messages in thread From: Junio C Hamano @ 2007-08-10 0:04 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Junio C Hamano <gitster@pobox.com> writes: > While I do not think the previous one was hacky at all, this one > IS a hack, not meant for inclusion. It makes the partial commit > codepath to use vanilla "git read-tree" without any single tree > merge semantics, and rewrite that codepath to use read_tree() > which was changed with my previous patch. Just in case anybody is wondering, this also passed all the existing tests. Maybe it is going in the right direction of slowly moving away from unpack_trees() where we can. I dunno. I also do not know if it "fixes" the performance issue on that testcase. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 23:38 ` Junio C Hamano 2007-08-10 0:04 ` Junio C Hamano @ 2007-08-10 0:44 ` Linus Torvalds 2007-08-10 0:51 ` Junio C Hamano 1 sibling, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 0:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git On Thu, 9 Aug 2007, Junio C Hamano wrote: > > While I do not think the previous one was hacky at all, this one > IS a hack, not meant for inclusion. Ugh. I had some time, so I tried to find out *why* that thing is so slow. The fact is, "git read-tree -m HEAD" should be really really fast, because it should never actually insert multiple entries into the same index entry: it should just _replace_ the entry. But why is it slow? It doesn't actually replace the entry with "add_cache_entry()" at all. What it does is to *remove* the entry entirely at unpack-trees.c, line 154, unpack_trees_rec(), which does a "remove_cache_entry_at(o->pos);". That causes us to have to condense the index array, and is one big memcpy() for a large index. It then ADDS THE NEW ENTRY BACK! Which causes *another* expensive index array memmove(), as it now needs to make room (at the same location that it just compacted). Sadly, that removal is required for some of the other cases, so it's not like we can remove the remove. But we could *possibly* make things ridiculously much faster by making the remove a lazy thing, and if the next index operation just adds it back in, we wouldn't move things around. A bit too subtle for my taste. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-10 0:44 ` Linus Torvalds @ 2007-08-10 0:51 ` Junio C Hamano 2007-08-10 0:57 ` Linus Torvalds 0 siblings, 1 reply; 38+ messages in thread From: Junio C Hamano @ 2007-08-10 0:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Thu, 9 Aug 2007, Junio C Hamano wrote: >> >> While I do not think the previous one was hacky at all, this one >> IS a hack, not meant for inclusion. > ... > Sadly, that removal is required for some of the other cases, so it's not > like we can remove the remove. But we could *possibly* make things > ridiculously much faster by making the remove a lazy thing, and if the > next index operation just adds it back in, we wouldn't move things around. Heh, that makes the two of us. I have been wanting to revamp or kill off unpack-trees for quite some time, and after all the patch you are responding to might be a small first step in the right direction ;-). ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-10 0:51 ` Junio C Hamano @ 2007-08-10 0:57 ` Linus Torvalds 2007-08-10 3:48 ` Junio C Hamano 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 0:57 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git On Thu, 9 Aug 2007, Junio C Hamano wrote: > > Heh, that makes the two of us. I have been wanting to revamp or > kill off unpack-trees for quite some time, and after all the > patch you are responding to might be a small first step in the > right direction ;-). Well, the patch you sent out was not only hacky, it was kind of pointless. Without the "-m" handling, it means that most users of git-read-tree won't ever see the improvement. The merge with the old index is absolutely critical for things like switching branches, for example. And I think you probably screwed up performance with your change to git-commit to simply not use it, because now the resulting index will be totally stale, and while the *commit* may be fast as a result, the subsequent operations might be horrible. (I didn't test it, though, maybe I missed something). So I do think that things like "git checkout <otherbranch>" need to have a fully working (and fast) unpack-trees, and your hack doesn't really help except for a pretty uninteresting special case. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-10 0:57 ` Linus Torvalds @ 2007-08-10 3:48 ` Junio C Hamano 2007-08-10 5:55 ` Junio C Hamano 2007-08-10 16:07 ` Linus Torvalds 0 siblings, 2 replies; 38+ messages in thread From: Junio C Hamano @ 2007-08-10 3:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > (I didn't test it, though, maybe I missed something). I do not think the change affects the normal codepath. The one-liner patch to git-commit.sh touches the codepath that updates the index used only to write out the partial commit, and losing the cached stat info from that index does not matter, as that index is removed immediately after writing the tree out and is never compared with working tree as far as I can tell. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-10 3:48 ` Junio C Hamano @ 2007-08-10 5:55 ` Junio C Hamano 2007-08-10 15:49 ` Linus Torvalds 2007-08-10 16:07 ` Linus Torvalds 1 sibling, 1 reply; 38+ messages in thread From: Junio C Hamano @ 2007-08-10 5:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Junio C Hamano <gitster@pobox.com> writes: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> (I didn't test it, though, maybe I missed something). > > I do not think the change affects the normal codepath. The > one-liner patch to git-commit.sh touches the codepath that > updates the index used only to write out the partial commit, and > losing the cached stat info from that index does not matter, as > that index is removed immediately after writing the tree out and > is never compared with working tree as far as I can tell. FWIW, moe's script with and without two patches gives these numbers for me. with patches without patches ---------------------------------------------------------------- # git init | # git init Initialized empty Git repository in | Initialized empty Git repository in | real 0m0.029s | real 0m0.004s user 0m0.000s | user 0m0.000s sys 0m0.008s | sys 0m0.004s # git add . | # git add . | real 0m2.947s | real 0m3.038s user 0m0.808s | user 0m0.876s sys 0m2.120s | sys 0m1.996s # git commit -a | # git commit -a | real 0m4.537s | real 0m4.674s user 0m1.980s | user 0m1.888s sys 0m2.356s | sys 0m2.332s # git status (three times) | # git status (three times) # On branch master | # On branch master nothing to commit (working directory| nothing to commit (working directory | real 0m0.718s | real 0m17.323s user 0m0.300s | user 0m16.913s sys 0m0.416s | sys 0m0.396s # On branch master | # On branch master nothing to commit (working directory| nothing to commit (working directory | real 0m0.707s | real 0m16.994s user 0m0.312s | user 0m16.573s sys 0m0.400s | sys 0m0.416s # On branch master | # On branch master nothing to commit (working directory| nothing to commit (working directory | real 0m0.720s | real 0m18.042s user 0m0.344s | user 0m17.633s sys 0m0.376s | sys 0m0.408s # git commit (one path) | # git commit (one path) Created commit 3483df5: expose the t| Created commit ced664f: expose the t 1 files changed, 1 insertions(+), 1| 1 files changed, 1 insertions(+), 1 | real 0m3.057s | real 0m38.130s user 0m0.888s | user 0m37.458s sys 0m0.712s | sys 0m0.616s ---------------------------------------------------------------- ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-10 5:55 ` Junio C Hamano @ 2007-08-10 15:49 ` Linus Torvalds 0 siblings, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 15:49 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git On Thu, 9 Aug 2007, Junio C Hamano wrote: > > FWIW, moe's script with and without two patches gives these > numbers for me. Btw, I really think it's worth doing even just the hacky patches at this stage, even though it's late in the game for 1.5.3. That performance problem is serious enough that I'd call it a major bug. Performance has always been one of the goals of git, and when you have a difference between 17s and 0.7s for "git status", that's a *huge* usability thing. It would be sad to release 1.5.3 with a known bug. [ Some people don't think performance issues are "real bugs", and I think such people shouldn't be allowed to program. ] Side note: your first patch is actually quite noticeable on even just the kernel. Not nearly as much, but without it, I get about 0.5s, and with it, I get consistently under 0.3s. So it's about a 40% improvement even for smaller projects (and it's probably much more if you have a CPU with a smaller cache: my Core 2 Duo has 4MB of L2 cache, and a lot of the index will even fit in the L1 - a slower CPU with less cache will see a bigger impact, and with smaller repositories, from the unnecessary memory moving). While 0.5s -> 0.3s may not sound like much, on a slower machine where it might otherwise be 2.5s -> 1.5s, that's likely to be quite noticeable. In fact, I can tell even on my machine: 0.3s is visible as a "I'm clearly thinking about it" delay (quite frankly, it would be better at 0.1s, which is "immediate"), but 0.5s is already approaching the point where you actually wait for the answer (rather than just notice that it wasn't quite immediate). Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-10 3:48 ` Junio C Hamano 2007-08-10 5:55 ` Junio C Hamano @ 2007-08-10 16:07 ` Linus Torvalds 2007-08-10 16:51 ` Fix "git commit directory/" performance anomaly Linus Torvalds 1 sibling, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 16:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git On Thu, 9 Aug 2007, Junio C Hamano wrote: > > > > (I didn't test it, though, maybe I missed something). > > I do not think the change affects the normal codepath. The > one-liner patch to git-commit.sh touches the codepath that > updates the index used only to write out the partial commit, and > losing the cached stat info from that index does not matter, as > that index is removed immediately after writing the tree out and > is never compared with working tree as far as I can tell. You are, of course, mostly right. Using "-m" there is largely pointless, since it's a throw-away index, and we'll only ever use the exact paths that were given to us. However, it does actually matter for one case: the case where you give a directory name or other name pattern, resulting in a *lot* of filenames. In that case, the commit will end up piping that (potentially very large) list to "git update-index --add --remove --stdin", and that will now mean that they *all* get their SHA1's recomputed. Of course, that was the other performance bug that we already knew about (except we were thining "git add .", and fixed that case). So we're already slow at it - but we *shouldn't* be. Try this on the kernel archive (use a clean one, so these things *should* all be no-ops): time sh -c "git add . ; git commit" which is nice and fast and takes just over a second for me, but then try time git commit . which *should* be nice and fast, but it takes forever, because we now re-compute all the SHA1's for *every* file. Of course, if it's all in the cache, it's still just 4s for me, but I tried with a cold cache, and it was over half a minute! (I don't actually ever do something like "git commit .", but I could see people doing it. What I *do* do is that if I have multiple independent changes, I may actually do "git commit fs" to commit just part of them, and rather than list all the files, I literally just say "commit that sub-tree". So this really is another valid performance issue). Sad. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Fix "git commit directory/" performance anomaly 2007-08-10 16:07 ` Linus Torvalds @ 2007-08-10 16:51 ` Linus Torvalds 2007-08-10 17:14 ` Linus Torvalds 2007-08-10 18:31 ` Junio C Hamano 0 siblings, 2 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 16:51 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git This trivial patch avoids re-hashing files that are already clean in the index. This mirrors what commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe did for "git add .", only for "git commit ." instead. This improves the cold-cache case immensely, since we don't need to bring in all the file contents, just the index and any files dirty in the index. Before: [torvalds@woody linux]$ time git commit . real 1m49.537s user 0m3.892s sys 0m2.432s After: [torvalds@woody linux]$ time git commit . real 0m14.273s user 0m1.312s sys 0m0.516s (both after doing a "echo 3 > /proc/sys/vm/drop_caches" to get cold-cache behaviour - even with the index optimization git still has to "lstat()" all the files, so with a truly cold cache, bringing all the inodes in will take some time). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --- On Fri, 10 Aug 2007, Linus Torvalds wrote: > > Try this on the kernel archive (use a clean one, so these things *should* > all be no-ops): > > time sh -c "git add . ; git commit" > > which is nice and fast and takes just over a second for me, but then try > > time git commit . > > which *should* be nice and fast, but it takes forever, because we now > re-compute all the SHA1's for *every* file. Of course, if it's all in the > cache, it's still just 4s for me, but I tried with a cold cache, and it > was over half a minute! > > (I don't actually ever do something like "git commit .", but I could see > people doing it. What I *do* do is that if I have multiple independent > changes, I may actually do "git commit fs" to commit just part of them, > and rather than list all the files, I literally just say "commit that > sub-tree". So this really is another valid performance issue). > > Sad. > > Linus > --- builtin-update-index.c | 10 ++++++++-- 1 files changed, 8 insertions(+), 2 deletions(-) diff --git a/builtin-update-index.c b/builtin-update-index.c index 509369e..8d22dfa 100644 --- a/builtin-update-index.c +++ b/builtin-update-index.c @@ -86,9 +86,15 @@ static int process_lstat_error(const char *path, int err) static int add_one_path(struct cache_entry *old, const char *path, int len, struct stat *st) { - int option, size = cache_entry_size(len); - struct cache_entry *ce = xcalloc(1, size); + int option, size; + struct cache_entry *ce; + + /* Was the old index entry already up-to-date? */ + if (old && !ce_stage(old) && !ce_match_stat(old, st, 0)) + return; + size = cache_entry_size(len); + ce = xcalloc(1, size); memcpy(ce->name, path, len); ce->ce_flags = htons(len); fill_stat_cache_info(ce, st); ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: Fix "git commit directory/" performance anomaly 2007-08-10 16:51 ` Fix "git commit directory/" performance anomaly Linus Torvalds @ 2007-08-10 17:14 ` Linus Torvalds 2007-08-10 18:31 ` Junio C Hamano 1 sibling, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 17:14 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git On Fri, 10 Aug 2007, Linus Torvalds wrote: > > This improves the cold-cache case immensely, since we don't need to bring > in all the file contents, just the index and any files dirty in the index. The extreme - but not necessarily unusual - case of this is when the index and stat information is hot-cache (which is quite normal, in case you've done a "git diff" or something like that), but the whole source tree is otherwise not. Then the numbers look like this: Before: [torvalds@woody linux]$ time git commit . real 1m33.751s user 0m3.864s sys 0m2.160s After: [torvalds@woody linux]$ time git commit . real 0m1.415s user 0m1.176s sys 0m0.260s The all-hot-cache numbers are better too, of course, just not nearly as noticeable: Before: [torvalds@woody linux]$ time git commit . real 0m3.960s user 0m3.304s sys 0m0.672s (and the after case is that 1.415s above, of course, so it's still more than twice as fast - it's just no longer a 60x performance difference!). (Honesty in advertising: the "after" numbers also contain the "runstatus" speedup, which is the 0.5s -> 0.3s improvement) Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Fix "git commit directory/" performance anomaly 2007-08-10 16:51 ` Fix "git commit directory/" performance anomaly Linus Torvalds 2007-08-10 17:14 ` Linus Torvalds @ 2007-08-10 18:31 ` Junio C Hamano 2007-08-10 18:56 ` Linus Torvalds 1 sibling, 1 reply; 38+ messages in thread From: Junio C Hamano @ 2007-08-10 18:31 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sean, moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > This trivial patch avoids re-hashing files that are already clean in the > index. This mirrors what commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe > did for "git add .", only for "git commit ." instead. Makes sense. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Fix "git commit directory/" performance anomaly 2007-08-10 18:31 ` Junio C Hamano @ 2007-08-10 18:56 ` Linus Torvalds 0 siblings, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 18:56 UTC (permalink / raw) To: Junio C Hamano; +Cc: Sean, moe, git On Fri, 10 Aug 2007, Junio C Hamano wrote: > Linus Torvalds <torvalds@linux-foundation.org> writes: > > > This trivial patch avoids re-hashing files that are already clean in the > > index. This mirrors what commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe > > did for "git add .", only for "git commit ." instead. > > Makes sense. Thanks. Please don't apply that patch without this trivial fix. I don't know why I didn't notice. It passed all the tests, but it really shouldn't have, and the compiler warned. Linus --- diff --git a/builtin-update-index.c b/builtin-update-index.c index 8d22dfa..a7a4574 100644 --- a/builtin-update-index.c +++ b/builtin-update-index.c @@ -91,7 +91,7 @@ static int add_one_path(struct cache_entry *old, const char *path, int len, stru /* Was the old index entry already up-to-date? */ if (old && !ce_stage(old) && !ce_match_stat(old, st, 0)) - return; + return 0; size = cache_entry_size(len); ce = xcalloc(1, size); ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 22:02 ` Junio C Hamano 2007-08-09 23:38 ` Junio C Hamano @ 2007-08-10 1:42 ` Daniel Barkalow 1 sibling, 0 replies; 38+ messages in thread From: Daniel Barkalow @ 2007-08-10 1:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Sean, moe, git On Thu, 9 Aug 2007, Junio C Hamano wrote: > Linus Torvalds <torvalds@linux-foundation.org> writes: > > > On Thu, 9 Aug 2007, Linus Torvalds wrote: > >> > >> So "builtin-read-tree.c" (or rather unpack-trees.c) would need the same > >> kind of logic. > > > > The path seems to be: > > > > cmd_read_tree -> > > unpack_trees -> > > unpack_trees_rec -> > > [ recursive .. unpack_trees_rec ] -> > > oneway_merge -> > > keep_entry -> > > add_index_entry() > > > > and here again we end up having the same insertion sort issue. > > Quite honestly, I was this (shows the "thumb and index finger > almost touching" gesture) close to declare that unpack-trees is > unsalvageable, and was planning to redo the one-tree (and > perhaps two-tree) read-tree without using that mess after 1.5.3. Yeah, that's probably the right thing to do; I wrote it with the idea that we'd be doing many-parent merges with it, but merge-recursive turned out to be a better idea, so I designed it to be comprehensible for a complicated case we never actually do. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 17:11 ` Linus Torvalds 2007-08-09 17:38 ` Linus Torvalds @ 2007-08-09 17:54 ` Linus Torvalds 2007-08-09 18:06 ` David Kastrup 2 siblings, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-09 17:54 UTC (permalink / raw) To: moe; +Cc: git On Thu, 9 Aug 2007, Linus Torvalds wrote: > > We seem to spend a lot of our time in memcpy: > > samples % image name app name symbol name > 200527 25.4551 libc-2.6.so libc-2.6.so _wordcopy_bwd_aligned > 104505 13.2660 libc-2.6.so libc-2.6.so _wordcopy_fwd_aligned Sorry, that was a bogus trace with some old stuff in it. The real profile was this one. 102343 73.1377 libc-2.6.so libc-2.6.so _wordcopy_bwd_aligned 3573 2.5534 git git cache_name_compare 2328 1.6637 git git index_name_pos ... which matches the rest of my emails.. (the "73%" is actually really supposed to be about 95%, but I had X running and doing stuff at the same time, so it was only 73% of all the other CPU activity that was going on over the time I profiled). Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 17:11 ` Linus Torvalds 2007-08-09 17:38 ` Linus Torvalds 2007-08-09 17:54 ` Linus Torvalds @ 2007-08-09 18:06 ` David Kastrup 2 siblings, 0 replies; 38+ messages in thread From: David Kastrup @ 2007-08-09 18:06 UTC (permalink / raw) To: git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Thu, 9 Aug 2007, moe wrote: >> >> i made some tests on latest master branch >> (1.5.3.rc4.29.g74276) and it seems like git >> hits a wall somewhere above ~50k files. > > Good catch. Definitely not acceptable performance. > > We seem to spend a lot of our time in memcpy: [...] > Doing an ltrace on it shows tons and tons of: > > ... > strlen("35") > strlen("349") > calloc(1, 72) > memcpy(0x73034e, "10/", 3) > memcpy(0x730351, "349", 4) > memmove(0x2ab637f41e80, 0x2ab637f41e78, 781768) > ... > > but I haven't looked at where they come from yet. Ok, preaching to the pope here, but: moving memory is bad. Make sure data can stay where it starts. In particular: don't use realloc. And if you do, grow the size _exponentially_ (like a factor of 1.5). If you grow the size exponentially, at least the movery hits the algorithm with O(n lg n). If data stays put, even in badly scattered linear lists, we have O(n). If you grow realloc _linearly_ (constant size increment), then the algorithm is hit with O(n^2). Technically, basically _all_ operations of git can be done by doing list merges of presorted lists. The index can be kept sorted. All requests into the index can be collected in readdir order into one linear list, this gets sorted once using a merge sort with O(n lg n), then it gets merged with the index O(n+N). As long as the whole index is read in, it can't be done faster. It is not necessary to organize the read data into trees or more complicate structures: a single linear list is sufficient. One can use the hierarchical structure of a directory to shave off some part of the sorting cost, though, and it considerably will lessen memory impact (and copying costs) if a file/directory/tree entry can contain a relative file name and a "pointer to prefix" where the rest of the file path is to be found. Anyway, so much for some theory. Now let's look at bad points in the code, judging from your benchmarks. A grep for realloc is appaling. Let's see what is actually involved here. attr.c: struct git_attr *git_attr(const char *name, int len) { a->attr_nr = attr_nr++; git_attr_hash[pos] = a; check_all_attr = xrealloc(check_all_attr, sizeof(*check_all_attr) * attr_nr); [...] Full O(n^2) behavior of the worst kind (increment 1!). builtin-commit-tree.c: static void add_buffer(char **bufp, unsigned int *sizep, const char *fmt, ...) { size = *sizep; newsize = size + len + 1; alloc = (size + 32767) & ~32767; [size rounded to next 32k (inconsistent! needs to be size+1 rounded up)] buf = *bufp; if (newsize > alloc) { alloc = (newsize + 32767) & ~32767; [newsize rounded to next 32k] buf = xrealloc(buf, alloc); [O(n^2): constant increment. Important? No idea.] *bufp = buf; } *sizep = newsize - 1; memcpy(buf + size, one_line, len); } [...] int register_commit_graft(struct commit_graft *graft, int ignore_dups) { [...] if (commit_graft_alloc <= ++commit_graft_nr) { commit_graft_alloc = alloc_nr(commit_graft_alloc); commit_graft = xrealloc(commit_graft, sizeof(*commit_graft) * commit_graft_alloc); } if (pos < commit_graft_nr) memmove(commit_graft + pos + 1, commit_graft + pos, (commit_graft_nr - pos - 1) * sizeof(*commit_graft)); commit_graft[pos] = graft; return 0; } Eeek. Start with a linear list, not an array. objects.c: void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode) { unsigned nr = array->nr; unsigned alloc = array->alloc; struct object_array_entry *objects = array->objects; if (nr >= alloc) { alloc = (alloc + 32) * 2; objects = xrealloc(objects, alloc * sizeof(*objects)); array->alloc = alloc; Constant increment, O(n^2). pathlist.c: static int add_entry(struct path_list *list, const char *path) { int exact_match; int index = get_entry_index(list, path, &exact_match); if (exact_match) return -1 - index; if (list->nr + 1 >= list->alloc) { list->alloc += 32; list->items = xrealloc(list->items, list->alloc * sizeof(struct path_list_item)); } Constant increment, O(n^2). That's just a cursory examination. In my opinion, pretty much every realloc should be replaced by some sort of list structure. That would be the nicest thing. I have sped up some awk scripts that built up argument lists by a factor of 100 by replacing a[index] = a[index] " " thenewstuff with a[index,nr[index]++] = thenewstuff and then never concatenating the strings, but just outputting them in a loop. Anyway, short of that, don't realloc by fixed increments, always use alloc_nr as soon as multiple reallocs are to be expected. And they certainly are in some of the above cited code passages. -- David Kastrup ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 16:30 git and larger trees, not so fast? moe 2007-08-09 17:11 ` Linus Torvalds @ 2007-08-10 19:39 ` Linus Torvalds 2007-08-11 18:47 ` Linus Torvalds 2 siblings, 0 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-10 19:39 UTC (permalink / raw) To: moe; +Cc: git On Thu, 9 Aug 2007, moe wrote: > > earlier today i imported one of my larger trees > (~70k files) into git and was quite disappointed > by the performance. Ok, I said I wouldn't have time to fix it yesterday, but today it's all done. With the first fix from Junio yesterday (the one that fixed "git status"), and the fixes I've sent out today, your cases should not all be basically instantaneous (ie we're talking low seconds, even on not-the-fastest- possible-machines). So with the following patches that were posted over the last 24 hours, you should be ok: Junio: Fix performance problem in "git status" Me: Start moving unpack-trees to "struct tree_desc" Fix "git commit directory/" performance anomaly (+ one-liner fix) Move old index entry removal from "unpack_trees()" into the individual functions Optimize the common cases of git-read-tree Optimize the two-way merge of git-read-tree too (that patch from Junio was sent in an email in this thread, with the subject line "Re: git and larger trees, not so fast?" and a message ID of "<7v7io4xwvp.fsf@assigned-by-dhcp.cox.net>": the patches from me should all have the appropriate Subject lines and be findable that way). If you can test with your real load to make sure, that would be good. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-09 16:30 git and larger trees, not so fast? moe 2007-08-09 17:11 ` Linus Torvalds 2007-08-10 19:39 ` Linus Torvalds @ 2007-08-11 18:47 ` Linus Torvalds 2007-08-11 19:02 ` Fernando J. Pereda 2007-08-11 20:06 ` moe 2 siblings, 2 replies; 38+ messages in thread From: Linus Torvalds @ 2007-08-11 18:47 UTC (permalink / raw) To: moe; +Cc: git On Thu, 9 Aug 2007, moe wrote: > > here's a test-case (should be safe to > copy/paste on linux, bash): moe: with current git (and thus the 1.5.3 release), the "git status" commands now take half a second for me, and the git commit takes just under a second. The *initial* commit that adds everything still takes almost 5 seconds, but that was due to generating the diffstat summary - with a "-q" on the commit line that too drops down to just under a second. In fact, the only thing that took more than a second for me with the current git is that initial "git add .", which took 1.791s for me. Considering that it had to hash all the 100,000 objects, I'm not surprised. Anyway, it would be good if you re-did your real work tree with current commit, just to verify. You have slower hardware than I do, but hopefully it is now just about as fast as it can be. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 18:47 ` Linus Torvalds @ 2007-08-11 19:02 ` Fernando J. Pereda 2007-08-11 20:38 ` Linus Torvalds 2007-08-11 20:06 ` moe 1 sibling, 1 reply; 38+ messages in thread From: Fernando J. Pereda @ 2007-08-11 19:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: moe, git [-- Attachment #1: Type: text/plain, Size: 566 bytes --] On Sat, Aug 11, 2007 at 11:47:42AM -0700, Linus Torvalds wrote: > Anyway, it would be good if you re-did your real work tree with current > commit, just to verify. You have slower hardware than I do, but hopefully > it is now just about as fast as it can be. Just for the record, I tried those patches on a real tree of ~120k files and ~25k directories, and Git is now _usable_. (The repository was created from an old checkout of the gentoo-x86 tree). - ferdy -- Fernando J. Pereda Garcimartín 20BB BDC3 761A 4781 E6ED ED0B 0A48 5B0C 60BD 28D4 [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 19:02 ` Fernando J. Pereda @ 2007-08-11 20:38 ` Linus Torvalds 2007-08-11 20:51 ` Fernando J. Pereda 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-11 20:38 UTC (permalink / raw) To: Fernando J. Pereda; +Cc: moe, git On Sat, 11 Aug 2007, Fernando J. Pereda wrote: > > Just for the record, I tried those patches on a real tree of ~120k files > and ~25k directories, and Git is now _usable_. (The repository was > created from an old checkout of the gentoo-x86 tree). What does "usable" mean? Is it still slow ("barely usable") or is it actually fast enough to be truly _nice_ to use? Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 20:38 ` Linus Torvalds @ 2007-08-11 20:51 ` Fernando J. Pereda 2007-08-11 22:27 ` Linus Torvalds 0 siblings, 1 reply; 38+ messages in thread From: Fernando J. Pereda @ 2007-08-11 20:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: moe, git [-- Attachment #1: Type: text/plain, Size: 910 bytes --] On Sat, Aug 11, 2007 at 01:38:34PM -0700, Linus Torvalds wrote: > > > On Sat, 11 Aug 2007, Fernando J. Pereda wrote: > > > > Just for the record, I tried those patches on a real tree of ~120k files > > and ~25k directories, and Git is now _usable_. (The repository was > > created from an old checkout of the gentoo-x86 tree). > > What does "usable" mean? Is it still slow ("barely usable") or is it > actually fast enough to be truly _nice_ to use? Very nice to use considering my hardware is rather old. git status used to take >1m and it now takes ~3s and git commit takes ~7s while it used to take >1m too. So it makes things nice to use and I guess things are MUCH better on faster hardware. (This is running on a 'AMD Athlon(TM) XP 2000+' and the repo is in a USB Hard drive) - ferdy -- Fernando J. Pereda Garcimartín 20BB BDC3 761A 4781 E6ED ED0B 0A48 5B0C 60BD 28D4 [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 20:51 ` Fernando J. Pereda @ 2007-08-11 22:27 ` Linus Torvalds 2007-08-11 23:26 ` David Kastrup 0 siblings, 1 reply; 38+ messages in thread From: Linus Torvalds @ 2007-08-11 22:27 UTC (permalink / raw) To: Fernando J. Pereda; +Cc: moe, git On Sat, 11 Aug 2007, Fernando J. Pereda wrote: > > > > What does "usable" mean? Is it still slow ("barely usable") or is it > > actually fast enough to be truly _nice_ to use? > > Very nice to use considering my hardware is rather old. git status used > to take >1m and it now takes ~3s and git commit takes ~7s while it used > to take >1m too. So it makes things nice to use and I guess things are > MUCH better on faster hardware. Oh, ok. Having a 7s commit sounds fine - certainly not instantaneous, but it doesn't sound too painful. Certainly not compared to what people live with normally in some other environments, at least. Thanks go to moe for just giving a trivial script to reproduce the performance anomaly. It wasn't that hard to fix once there was a trivial and unambiguous test case. Linus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 22:27 ` Linus Torvalds @ 2007-08-11 23:26 ` David Kastrup 0 siblings, 0 replies; 38+ messages in thread From: David Kastrup @ 2007-08-11 23:26 UTC (permalink / raw) To: Linus Torvalds; +Cc: Fernando J. Pereda, moe, git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sat, 11 Aug 2007, Fernando J. Pereda wrote: >> > >> > What does "usable" mean? Is it still slow ("barely usable") or is it >> > actually fast enough to be truly _nice_ to use? >> >> Very nice to use considering my hardware is rather old. git status used >> to take >1m and it now takes ~3s and git commit takes ~7s while it used >> to take >1m too. So it makes things nice to use and I guess things are >> MUCH better on faster hardware. > > Oh, ok. Having a 7s commit sounds fine - certainly not instantaneous, but > it doesn't sound too painful. Certainly not compared to what people live > with normally in some other environments, at least. > > Thanks go to moe for just giving a trivial script to reproduce the > performance anomaly. It wasn't that hard to fix once there was a trivial > and unambiguous test case. Since one of the problems is git sorting stuff inefficiently, it might make for an interesting test case to create the test files in reverse alphabetic order so that readdir is more likely to deliver them in reverse order, too. Or in random order, by something like dc -e '2 32^sb69069sc100000sd1se[rlc*le+lb%prle+dld!=a]sa42 0lax'| while read i;do echo $i > $i;done (change the 100000 to get a different number of files, change the 42 to get a different seed). -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 18:47 ` Linus Torvalds 2007-08-11 19:02 ` Fernando J. Pereda @ 2007-08-11 20:06 ` moe 2007-08-23 0:30 ` moe 1 sibling, 1 reply; 38+ messages in thread From: moe @ 2007-08-11 20:06 UTC (permalink / raw) To: Linus Torvalds; +Cc: git On Sat, Aug 11, 2007 at 11:47:42AM -0700, Linus Torvalds wrote: > > > On Thu, 9 Aug 2007, moe wrote: > > > > here's a test-case (should be safe to > > copy/paste on linux, bash): > > moe: with current git (and thus the 1.5.3 release), the "git status" > commands now take half a second for me, and the git commit takes just > under a second. > > The *initial* commit that adds everything still takes almost 5 seconds, > but that was due to generating the diffstat summary - with a "-q" on the > commit line that too drops down to just under a second. > > In fact, the only thing that took more than a second for me with the > current git is that initial "git add .", which took 1.791s for me. > Considering that it had to hash all the 100,000 objects, I'm not > surprised. > > Anyway, it would be good if you re-did your real work tree with current > commit, just to verify. You have slower hardware than I do, but hopefully > it is now just about as fast as it can be. hi linus, thx for your efforts, the figures look very promising. i'm out of town right now but will test when i get stationary internet again (sometime tomorrow evening i think). regards, moe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: git and larger trees, not so fast? 2007-08-11 20:06 ` moe @ 2007-08-23 0:30 ` moe 0 siblings, 0 replies; 38+ messages in thread From: moe @ 2007-08-23 0:30 UTC (permalink / raw) To: Linus Torvalds; +Cc: git On Sat, Aug 11, 2007 at 10:06:30PM +0200, moe wrote: > On Sat, Aug 11, 2007 at 11:47:42AM -0700, Linus Torvalds wrote: > > > > > > On Thu, 9 Aug 2007, moe wrote: > > > > > > here's a test-case (should be safe to > > > copy/paste on linux, bash): > > > > moe: with current git (and thus the 1.5.3 release), the "git status" > > commands now take half a second for me, and the git commit takes just > > under a second. > > > > The *initial* commit that adds everything still takes almost 5 seconds, > > but that was due to generating the diffstat summary - with a "-q" on the > > commit line that too drops down to just under a second. > > > > In fact, the only thing that took more than a second for me with the > > current git is that initial "git add .", which took 1.791s for me. > > Considering that it had to hash all the 100,000 objects, I'm not > > surprised. > > > > Anyway, it would be good if you re-did your real work tree with current > > commit, just to verify. You have slower hardware than I do, but hopefully > > it is now just about as fast as it can be. > > hi linus, > > thx for your efforts, the figures look very promising. > i'm out of town right now but will test when i get > stationary internet again (sometime tomorrow evening > i think). sorry for late followup, i was blocked on paid work for longer than i expected. i can happily confirm what others have already reported; with the patches applied git works well even for my bigger repo. git status : 0m1.036s git commit (single file): 0m1.846s http://www.gosimpsons.com/ProdImages/krustysealkeychain.jpg regards, moe ^ permalink raw reply [flat|nested] 38+ messages in thread
* git and larger trees, not so fast? @ 2007-08-09 16:06 moe 0 siblings, 0 replies; 38+ messages in thread From: moe @ 2007-08-09 16:06 UTC (permalink / raw) To: git hi guys, earlier today i imported one of my larger trees (~70k files) into git and was quite disappointed by the performance. i made some tests on latest master branch (1.5.3.rc4.29.g74276) and it seems like git hits a wall somewhere above ~50k files. i'm seeing 'commit' timings of 30s and up as well as 'status' timings in the 10s ballpark. here's a test-case (should be safe to copy/paste on linux, bash): # # first create a tree of roughly 100k files # mkdir bummer cd bummer for ((i=0;i<100;i++)); do mkdir $i && pushd $i; for ((j=0;j<1000;j++)); do echo "$j" >$j; done; popd; done # # init and add this to git # time git init git config user.email "no@thx" git config user.name "nothx" time git add . time git commit -m 'buurrrrn' -a # # git-status, tunes in at around ~10s for me # time git-status time git-status time git-status # # git-commit, takes a whopping 52s for me # date >50/500 time git commit -m 'expose the turtle' 50/500 regards, moe ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2007-08-23 0:56 UTC | newest] Thread overview: 38+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-08-09 16:30 git and larger trees, not so fast? moe 2007-08-09 17:11 ` Linus Torvalds 2007-08-09 17:38 ` Linus Torvalds 2007-08-09 18:00 ` Junio C Hamano 2007-08-09 18:06 ` Linus Torvalds 2007-08-09 18:11 ` Junio C Hamano 2007-08-09 20:42 ` Junio C Hamano 2007-08-09 20:52 ` Sean 2007-08-09 21:37 ` Junio C Hamano 2007-08-09 21:41 ` Linus Torvalds 2007-08-09 21:46 ` Linus Torvalds 2007-08-09 22:02 ` Junio C Hamano 2007-08-09 23:38 ` Junio C Hamano 2007-08-10 0:04 ` Junio C Hamano 2007-08-10 0:44 ` Linus Torvalds 2007-08-10 0:51 ` Junio C Hamano 2007-08-10 0:57 ` Linus Torvalds 2007-08-10 3:48 ` Junio C Hamano 2007-08-10 5:55 ` Junio C Hamano 2007-08-10 15:49 ` Linus Torvalds 2007-08-10 16:07 ` Linus Torvalds 2007-08-10 16:51 ` Fix "git commit directory/" performance anomaly Linus Torvalds 2007-08-10 17:14 ` Linus Torvalds 2007-08-10 18:31 ` Junio C Hamano 2007-08-10 18:56 ` Linus Torvalds 2007-08-10 1:42 ` git and larger trees, not so fast? Daniel Barkalow 2007-08-09 17:54 ` Linus Torvalds 2007-08-09 18:06 ` David Kastrup 2007-08-10 19:39 ` Linus Torvalds 2007-08-11 18:47 ` Linus Torvalds 2007-08-11 19:02 ` Fernando J. Pereda 2007-08-11 20:38 ` Linus Torvalds 2007-08-11 20:51 ` Fernando J. Pereda 2007-08-11 22:27 ` Linus Torvalds 2007-08-11 23:26 ` David Kastrup 2007-08-11 20:06 ` moe 2007-08-23 0:30 ` moe -- strict thread matches above, loose matches on Subject: below -- 2007-08-09 16:06 moe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).