* [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new @ 2011-03-31 1:37 Dan McGee 2011-03-31 1:37 ` [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix Dan McGee ` (6 more replies) 0 siblings, 7 replies; 30+ messages in thread From: Dan McGee @ 2011-03-31 1:37 UTC (permalink / raw) To: git; +Cc: Dan McGee This was seen to happen in some invocations of git-log with a filtered path. Only do it if we are not recursively descending, as otherwise we mess with copy and rename detection in full tree moves. We need to exclude times when FIND_COPIES_HARDER is enabled as the diff machinery requires traversal of all files in this case to find copies from files that did not change at all in the given revision. Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- These next few patches are all derived from trying to make operations on this repository a bit faster: http://projects.archlinux.org/svntogit/packages.git/ As far as a different shape of repository, this one qualifies. Some stats from after running `git gc --prune` and looking at things from count-objects among others: arch-packages linux-2.6 in-pack: 496718 1979274 size-pack: 112582 513977 ls | wc -l: 2401 34 ls-tree -r | wc -l 14039 36706 ls-tree -r -t | grep ' tree ' | wc -l 11211 2256 time git log -- zzzzz_not_exist >/dev/null 35.558s 0.976s I didn't find some golden switch to turn that made things instantly fast, but these patches did at least give some speedups. Suggestions/feedback/etc. welcome. Most profiling was done with `valgrind --tool=callgrind` and then using kcachegrind to chase down slow spots. tree-diff.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/tree-diff.c b/tree-diff.c index 76f83fc..ab90f1a 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -286,6 +286,9 @@ int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const cha unsigned long size1, size2; int retval; + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER) && !hashcmp(old, new)) + return 0; + tree1 = read_object_with_reference(old, tree_type, &size1, NULL); if (!tree1) die("unable to read source tree (%s)", sha1_to_hex(old)); -- 1.7.4.2 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee @ 2011-03-31 1:37 ` Dan McGee 2011-08-30 18:55 ` Dan McGee 2011-03-31 1:37 ` [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting Dan McGee ` (5 subsequent siblings) 6 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-03-31 1:37 UTC (permalink / raw) To: git; +Cc: Dan McGee Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- tree-walk.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index 322becc..9be8007 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -522,7 +522,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, return 0; } -static int match_dir_prefix(const char *base, int baselen, +static int match_dir_prefix(const char *base, const char *match, int matchlen) { if (strncmp(base, match, matchlen)) @@ -579,7 +579,7 @@ int tree_entry_interesting(const struct name_entry *entry, if (baselen >= matchlen) { /* If it doesn't match, move along... */ - if (!match_dir_prefix(base_str, baselen, match, matchlen)) + if (!match_dir_prefix(base_str, match, matchlen)) goto match_wildcards; if (!ps->recursive || ps->max_depth == -1) -- 1.7.4.2 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix 2011-03-31 1:37 ` [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix Dan McGee @ 2011-08-30 18:55 ` Dan McGee 0 siblings, 0 replies; 30+ messages in thread From: Dan McGee @ 2011-08-30 18:55 UTC (permalink / raw) To: git; +Cc: Dan McGee On Wed, Mar 30, 2011 at 8:37 PM, Dan McGee <dpmcgee@gmail.com> wrote: > Signed-off-by: Dan McGee <dpmcgee@gmail.com> > --- This still seems to apply to the current code, ping? > tree-walk.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tree-walk.c b/tree-walk.c > index 322becc..9be8007 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -522,7 +522,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, > return 0; > } > > -static int match_dir_prefix(const char *base, int baselen, > +static int match_dir_prefix(const char *base, > const char *match, int matchlen) > { > if (strncmp(base, match, matchlen)) > @@ -579,7 +579,7 @@ int tree_entry_interesting(const struct name_entry *entry, > > if (baselen >= matchlen) { > /* If it doesn't match, move along... */ > - if (!match_dir_prefix(base_str, baselen, match, matchlen)) > + if (!match_dir_prefix(base_str, match, matchlen)) > goto match_wildcards; > > if (!ps->recursive || ps->max_depth == -1) > -- > 1.7.4.2 ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee 2011-03-31 1:37 ` [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix Dan McGee @ 2011-03-31 1:37 ` Dan McGee 2011-04-03 4:01 ` Nguyen Thai Ngoc Duy ` (2 more replies) 2011-03-31 1:38 ` [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known Dan McGee ` (4 subsequent siblings) 6 siblings, 3 replies; 30+ messages in thread From: Dan McGee @ 2011-03-31 1:37 UTC (permalink / raw) To: git; +Cc: Dan McGee In the case of a wide breadth top-level tree (~2400 entries, all trees in this case), we can see a noticeable cost in the profiler calling strncmp() here. Most of the time we are at the base level of the repository, so base is "" and baselen == 0, which means we will always test true. Break out this one tiny case so we can short circuit the strncmp() call. This resulted in an ~11% improvement (43 to 38 secs) for a reasonable log operation on the Arch Linux Packages SVN clone repository, which contained 117220 commits and the aforementioned 2400 top-level objects: git log -- autogen/trunk pacman/trunk/ wget/trunk/ Negligible slowdown was noted with other repositories (e.g. linux-2.6). Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- tree-walk.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index 9be8007..f386151 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -591,8 +591,8 @@ int tree_entry_interesting(const struct name_entry *entry, ps->max_depth); } - /* Does the base match? */ - if (!strncmp(base_str, match, baselen)) { + /* Either there must be no base, or the base must match. */ + if (baselen == 0 || !strncmp(base_str, match, baselen)) { if (match_entry(entry, pathlen, match + baselen, matchlen - baselen, &never_interesting)) -- 1.7.4.2 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting 2011-03-31 1:37 ` [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting Dan McGee @ 2011-04-03 4:01 ` Nguyen Thai Ngoc Duy 2011-04-03 18:55 ` Junio C Hamano 2011-04-04 14:46 ` [PATCH] tree_entry_interesting: inline strncmp() Nguyễn Thái Ngọc Duy 2 siblings, 0 replies; 30+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-04-03 4:01 UTC (permalink / raw) To: Dan McGee; +Cc: git On Wed, Mar 30, 2011 at 08:37:59PM -0500, Dan McGee wrote: > In the case of a wide breadth top-level tree (~2400 entries, all trees > in this case), we can see a noticeable cost in the profiler calling > strncmp() here. Most of the time we are at the base level of the > repository, so base is "" and baselen == 0, which means we will always > test true. Break out this one tiny case so we can short circuit the > strncmp() call. > > This resulted in an ~11% improvement (43 to 38 secs) for a reasonable > log operation on the Arch Linux Packages SVN clone repository, which > contained 117220 commits and the aforementioned 2400 top-level objects: > git log -- autogen/trunk pacman/trunk/ wget/trunk/ > > Negligible slowdown was noted with other repositories (e.g. linux-2.6). > > Signed-off-by: Dan McGee <dpmcgee@gmail.com> Ack. On my (slower) laptop, the same command with vanilla git -O3 takes 95 secs. Your patch cuts it down to 82 secs. I tried inlining strncmp with a version from glibc. Without your patch it takes 88 secs. With your patch, it takes 86 secs (still worse than your patch alone). A better strncmp version must be used on my system, I suspect. --8<-- diff --git a/tree-walk.c b/tree-walk.c index 322becc..0859b5c 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -457,6 +457,55 @@ int get_tree_entry(const unsigned char *tree_sha1, const char *name, unsigned ch return retval; } +static inline int inline_strncmp(const char *s1, const char *s2, size_t n) +{ + unsigned char c1 = '\0'; + unsigned char c2 = '\0'; + + if (n == 0) + return 0; + + if (n >= 4) { + size_t n4 = n >> 2; + do { + c1 = (unsigned char) *s1++; + c2 = (unsigned char) *s2++; + if (c1 == '\0' || c1 != c2) + return c1 - c2; + c1 = (unsigned char) *s1++; + c2 = (unsigned char) *s2++; + if (c1 == '\0' || c1 != c2) + return c1 - c2; + c1 = (unsigned char) *s1++; + c2 = (unsigned char) *s2++; + if (c1 == '\0' || c1 != c2) + return c1 - c2; + c1 = (unsigned char) *s1++; + c2 = (unsigned char) *s2++; + if (c1 == '\0' || c1 != c2) + return c1 - c2; + } while (--n4 > 0); + n &= 3; + } + + while (n > 0) { + c1 = (unsigned char) *s1++; + c2 = (unsigned char) *s2++; + if (c1 == '\0' || c1 != c2) + return c1 - c2; + n--; + } + + return c1 - c2; +} + +#if 1 +#ifdef strncmp +#undef strncmp +#endif +#define strncmp inline_strncmp +#endif + static int match_entry(const struct name_entry *entry, int pathlen, const char *match, int matchlen, int *never_interesting) --8<-- -- Duy ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting 2011-03-31 1:37 ` [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting Dan McGee 2011-04-03 4:01 ` Nguyen Thai Ngoc Duy @ 2011-04-03 18:55 ` Junio C Hamano 2011-04-05 0:22 ` Dan McGee 2011-04-04 14:46 ` [PATCH] tree_entry_interesting: inline strncmp() Nguyễn Thái Ngọc Duy 2 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2011-04-03 18:55 UTC (permalink / raw) To: Dan McGee; +Cc: git Dan McGee <dpmcgee@gmail.com> writes: > In the case of a wide breadth top-level tree (~2400 entries, all trees > in this case), we can see a noticeable cost in the profiler calling > strncmp() here. Most of the time we are at the base level of the > repository, so base is "" and baselen == 0, which means we will always > test true. Break out this one tiny case so we can short circuit the > strncmp() call. This sounds as if the patch helps only when you have a superfat tree at the "top-level" of the project, but wouldn't this benefit any superfat tree at _any_ level while we recursively descend into it? > This resulted in an ~11% improvement (43 to 38 secs) for a reasonable > log operation on the Arch Linux Packages SVN clone repository, which > contained 117220 commits and the aforementioned 2400 top-level objects: > git log -- autogen/trunk pacman/trunk/ wget/trunk/ > > Negligible slowdown was noted with other repositories (e.g. linux-2.6). It would have been easier to swallow if the last sentence were "This could lead to a slowdown in repositories without directories that are too wide, but in practice it was not even measurable." "Negligible" sounds as if it had still measurable downside, and as if you decided that the slowdown can be ignored---but obviously you are not an unbiased judge. There is nothing wrong in the patch per-se, but I really wish we didn't have to do this; it feels like the compiler should be helping us in this case. > Signed-off-by: Dan McGee <dpmcgee@gmail.com> > --- > tree-walk.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tree-walk.c b/tree-walk.c > index 9be8007..f386151 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -591,8 +591,8 @@ int tree_entry_interesting(const struct name_entry *entry, > ps->max_depth); > } > > - /* Does the base match? */ > - if (!strncmp(base_str, match, baselen)) { > + /* Either there must be no base, or the base must match. */ > + if (baselen == 0 || !strncmp(base_str, match, baselen)) { > if (match_entry(entry, pathlen, > match + baselen, matchlen - baselen, > &never_interesting)) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting 2011-04-03 18:55 ` Junio C Hamano @ 2011-04-05 0:22 ` Dan McGee [not found] ` <CAEik5nOKrpFycZYVnSu4_5LYWxn0JS_hVXyiQH-80Bu-C4k8VQ@mail.gmail.com> 0 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-04-05 0:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Nguyen Thai Ngoc Duy On Sun, Apr 3, 2011 at 1:55 PM, Junio C Hamano <gitster@pobox.com> wrote: > Dan McGee <dpmcgee@gmail.com> writes: > >> In the case of a wide breadth top-level tree (~2400 entries, all trees >> in this case), we can see a noticeable cost in the profiler calling >> strncmp() here. Most of the time we are at the base level of the >> repository, so base is "" and baselen == 0, which means we will always >> test true. Break out this one tiny case so we can short circuit the >> strncmp() call. > > This sounds as if the patch helps only when you have a superfat tree at > the "top-level" of the project, but wouldn't this benefit any superfat > tree at _any_ level while we recursively descend into it? Correct. I looked at the fact that more often than not, we wouldn't have to descend into subtrees unless searching for a path underneath it, so that is why I phrased it that way. So the "in the case of" was quite literally the case I was testing, but didn't mean to exclude other potential test cases. >> This resulted in an ~11% improvement (43 to 38 secs) for a reasonable >> log operation on the Arch Linux Packages SVN clone repository, which >> contained 117220 commits and the aforementioned 2400 top-level objects: >> git log -- autogen/trunk pacman/trunk/ wget/trunk/ >> >> Negligible slowdown was noted with other repositories (e.g. linux-2.6). > > It would have been easier to swallow if the last sentence were "This could > lead to a slowdown in repositories without directories that are too wide, > but in practice it was not even measurable." "Negligible" sounds as if it > had still measurable downside, and as if you decided that the slowdown can > be ignored---but obviously you are not an unbiased judge. Perhaps I was too cautious with my words- but I was also trying to not be biased. Considering this same operation takes < 1 second in linux-2.6, I only wanted to mention it could have a slight effect. In reality I saw nothing more than an extra 0.01s or so, and definitely nothing significant. Let me know if you see otherwise. dmcgee@galway ~/projects/linux-2.6 (master) $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m0.945s user 0m0.857s sys 0m0.083s > There is nothing wrong in the patch per-se, but I really wish we didn't > have to do this; it feels like the compiler should be helping us in this > case. > >> Signed-off-by: Dan McGee <dpmcgee@gmail.com> >> --- >> tree-walk.c | 4 ++-- >> 1 files changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/tree-walk.c b/tree-walk.c >> index 9be8007..f386151 100644 >> --- a/tree-walk.c >> +++ b/tree-walk.c >> @@ -591,8 +591,8 @@ int tree_entry_interesting(const struct name_entry *entry, >> ps->max_depth); >> } >> >> - /* Does the base match? */ >> - if (!strncmp(base_str, match, baselen)) { >> + /* Either there must be no base, or the base must match. */ >> + if (baselen == 0 || !strncmp(base_str, match, baselen)) { >> if (match_entry(entry, pathlen, >> match + baselen, matchlen - baselen, >> &never_interesting)) > ^ permalink raw reply [flat|nested] 30+ messages in thread
[parent not found: <CAEik5nOKrpFycZYVnSu4_5LYWxn0JS_hVXyiQH-80Bu-C4k8VQ@mail.gmail.com>]
* Re: [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting [not found] ` <CAEik5nOKrpFycZYVnSu4_5LYWxn0JS_hVXyiQH-80Bu-C4k8VQ@mail.gmail.com> @ 2011-08-30 19:51 ` Dan McGee 2011-08-30 20:40 ` Junio C Hamano 0 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-08-30 19:51 UTC (permalink / raw) To: git On Mon, Apr 4, 2011 at 7:22 PM, Dan McGee <dpmcgee@gmail.com> wrote: > On Sun, Apr 3, 2011 at 1:55 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Dan McGee <dpmcgee@gmail.com> writes: >> >>> In the case of a wide breadth top-level tree (~2400 entries, all trees >>> in this case), we can see a noticeable cost in the profiler calling >>> strncmp() here. Most of the time we are at the base level of the >>> repository, so base is "" and baselen == 0, which means we will always >>> test true. Break out this one tiny case so we can short circuit the >>> strncmp() call. >> >> This sounds as if the patch helps only when you have a superfat tree at >> the "top-level" of the project, but wouldn't this benefit any superfat >> tree at _any_ level while we recursively descend into it? > > Correct. I looked at the fact that more often than not, we wouldn't > have to descend into subtrees unless searching for a path underneath > it, so that is why I phrased it that way. So the "in the case of" was > quite literally the case I was testing, but didn't mean to exclude > other potential test cases. > >>> This resulted in an ~11% improvement (43 to 38 secs) for a reasonable >>> log operation on the Arch Linux Packages SVN clone repository, which >>> contained 117220 commits and the aforementioned 2400 top-level objects: >>> git log -- autogen/trunk pacman/trunk/ wget/trunk/ >>> >>> Negligible slowdown was noted with other repositories (e.g. linux-2.6). >> >> It would have been easier to swallow if the last sentence were "This could >> lead to a slowdown in repositories without directories that are too wide, >> but in practice it was not even measurable." "Negligible" sounds as if it >> had still measurable downside, and as if you decided that the slowdown can >> be ignored---but obviously you are not an unbiased judge. > > Perhaps I was too cautious with my words- but I was also trying to not > be biased. Considering this same operation takes < 1 second in > linux-2.6, I only wanted to mention it could have a slight effect. In > reality I saw nothing more than an extra 0.01s or so, and definitely > nothing significant. Let me know if you see otherwise. > > dmcgee@galway ~/projects/linux-2.6 (master) > $ time ../git/git-log -- zzzzz_not_exist > /dev/null > > real 0m0.945s > user 0m0.857s > sys 0m0.083s > >> There is nothing wrong in the patch per-se, but I really wish we didn't >> have to do this; it feels like the compiler should be helping us in this >> case. If I resurrect this with an updated commit message reflecting concerns raised, can it be merged? Given that it is a noticeable performance boost on real-life repositories and I can show it has little (<1%) to no impact on most repos, it is a definite win. >>> Signed-off-by: Dan McGee <dpmcgee@gmail.com> >>> --- >>> tree-walk.c | 4 ++-- >>> 1 files changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/tree-walk.c b/tree-walk.c >>> index 9be8007..f386151 100644 >>> --- a/tree-walk.c >>> +++ b/tree-walk.c >>> @@ -591,8 +591,8 @@ int tree_entry_interesting(const struct name_entry *entry, >>> ps->max_depth); >>> } >>> >>> - /* Does the base match? */ >>> - if (!strncmp(base_str, match, baselen)) { >>> + /* Either there must be no base, or the base must match. */ >>> + if (baselen == 0 || !strncmp(base_str, match, baselen)) { >>> if (match_entry(entry, pathlen, >>> match + baselen, matchlen - baselen, >>> &never_interesting)) >> > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting 2011-08-30 19:51 ` Dan McGee @ 2011-08-30 20:40 ` Junio C Hamano 2011-09-09 2:02 ` [PATCH 1/2] tree-walk: drop unused parameter from match_dir_prefix Dan McGee 0 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2011-08-30 20:40 UTC (permalink / raw) To: Dan McGee; +Cc: git Dan McGee <dpmcgee@gmail.com> writes: >> dmcgee@galway ~/projects/linux-2.6 (master) >> $ time ../git/git-log -- zzzzz_not_exist > /dev/null >> >> real 0m0.945s >> user 0m0.857s >> sys 0m0.083s >> >>> There is nothing wrong in the patch per-se, but I really wish we didn't >>> have to do this; it feels like the compiler should be helping us in this >>> case. > > If I resurrect this with an updated commit message reflecting concerns > raised, can it be merged? Given that it is a noticeable performance > boost on real-life repositories and I can show it has little (<1%) to > no impact on most repos, it is a definite win. I do not see anything wrong in this particular patch per-se, but I really wish we didn't have to do this. Please include a few lines of benchmarking result in the updated commit log message as well if you are rerolling this patch, perhaps like I did in: http://thread.gmane.org/gmane.comp.version-control.git/179926/focus=180361 i.e., before and after comparison. Thanks. ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 1/2] tree-walk: drop unused parameter from match_dir_prefix 2011-08-30 20:40 ` Junio C Hamano @ 2011-09-09 2:02 ` Dan McGee 2011-09-09 2:02 ` [PATCH 2/2] tree-walk: micro-optimization in tree_entry_interesting Dan McGee 0 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-09-09 2:02 UTC (permalink / raw) To: git Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- tree-walk.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index 33f749e..dbcd94a 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -522,7 +522,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, return 0; } -static int match_dir_prefix(const char *base, int baselen, +static int match_dir_prefix(const char *base, const char *match, int matchlen) { if (strncmp(base, match, matchlen)) @@ -579,7 +579,7 @@ int tree_entry_interesting(const struct name_entry *entry, if (baselen >= matchlen) { /* If it doesn't match, move along... */ - if (!match_dir_prefix(base_str, baselen, match, matchlen)) + if (!match_dir_prefix(base_str, match, matchlen)) goto match_wildcards; if (!ps->recursive || ps->max_depth == -1) -- 1.7.6.1 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 2/2] tree-walk: micro-optimization in tree_entry_interesting 2011-09-09 2:02 ` [PATCH 1/2] tree-walk: drop unused parameter from match_dir_prefix Dan McGee @ 2011-09-09 2:02 ` Dan McGee 0 siblings, 0 replies; 30+ messages in thread From: Dan McGee @ 2011-09-09 2:02 UTC (permalink / raw) To: git In the case of a wide breadth top-level tree (~2400 entries, all trees in this case), we can see a noticeable cost in the profiler calling strncmp() here. Most of the time we are at the base level of the repository, so base is "" and baselen == 0, which means we will always test true. Break out this one tiny case so we can short circuit the strncmp() call. Test cases are as follows. packages.git is the Arch Linux git-svn clone of the packages repository which has the characteristics above. Commands: [1] packages.git, /usr/bin/time git log >/dev/null [2] packages.git, /usr/bin/time git log -- autogen/trunk pacman/trunk wget/trunk >/dev/null [3] linux.git, /usr/bin/time git log >/dev/null [4] linux.git, /usr/bin/time git log -- drivers/ata drivers/uio tools >/dev/null Results: before after %faster [1] 2.56 2.55 0.4% [2] 51.82 48.66 6.5% [3] 5.58 5.61 -0.5% [4] 1.55 1.51 0.2% The takeaway here is this doesn't matter in many operations, but it does for a certain style of repository and operation where it nets a 6.5% measured improvement. The other changes are likely not significant by reasonable statistics methods. Note: the measured improvement when originally submitted was ~11% (43 to 38 secs) for operation [2]. At the time, the repository had 117220 commits; it now has 137537 commits. Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- tree-walk.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index dbcd94a..e401f07 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -591,8 +591,8 @@ int tree_entry_interesting(const struct name_entry *entry, ps->max_depth); } - /* Does the base match? */ - if (!strncmp(base_str, match, baselen)) { + /* Either there must be no base, or the base must match. */ + if (baselen == 0 || !strncmp(base_str, match, baselen)) { if (match_entry(entry, pathlen, match + baselen, matchlen - baselen, &never_interesting)) -- 1.7.6.1 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH] tree_entry_interesting: inline strncmp() 2011-03-31 1:37 ` [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting Dan McGee 2011-04-03 4:01 ` Nguyen Thai Ngoc Duy 2011-04-03 18:55 ` Junio C Hamano @ 2011-04-04 14:46 ` Nguyễn Thái Ngọc Duy 2 siblings, 0 replies; 30+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2011-04-04 14:46 UTC (permalink / raw) To: git, Junio C Hamano, dpmcgee; +Cc: Nguyễn Thái Ngọc Duy strncmp() is the function that takes most of the time inside tree_entry_interesting(). Inline it so we can shave some seconds out of function call time. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- Turns out simplicity is the best. My straight copy of strncmp from glibc performed worse. With this I get a slightly better performance than Dan's 3/5: 81.07-82.27 secs versus 82.02-82.92 (no other patches are applied). But I'm happy even if it gives the same or slightly worse performance because this applies to more cases than flat top tree case. Dan, match_dir_prefix() can also use some reordering to avoid strncmp(). But I suppose it won't give much gain on packages.git tree-walk.c | 28 ++++++++++++++++++++++++---- 1 files changed, 24 insertions(+), 4 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index 322becc..80bfc3a 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -457,6 +457,26 @@ int get_tree_entry(const unsigned char *tree_sha1, const char *name, unsigned ch return retval; } +/* Static version of strncmp to reduce function call cost */ +static inline int strncmp_1(const char *s1, const char *s2, size_t n) +{ + unsigned char c1 = '\0'; + unsigned char c2 = '\0'; + + if (!n) + return 0; + + while (n > 0) { + c1 = (unsigned char) *s1++; + c2 = (unsigned char) *s2++; + if (c1 == '\0' || c1 != c2) + return c1 - c2; + n--; + } + + return c1 - c2; +} + static int match_entry(const struct name_entry *entry, int pathlen, const char *match, int matchlen, int *never_interesting) @@ -473,7 +493,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, * Does match sort strictly earlier than path * with their common parts? */ - m = strncmp(match, entry->path, + m = strncmp_1(match, entry->path, (matchlen < pathlen) ? matchlen : pathlen); if (m < 0) return 0; @@ -509,7 +529,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, * we cheated and did not do strncmp(), so we do * that here. */ - m = strncmp(match, entry->path, pathlen); + m = strncmp_1(match, entry->path, pathlen); /* * If common part matched earlier then it is a hit, @@ -525,7 +545,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, static int match_dir_prefix(const char *base, int baselen, const char *match, int matchlen) { - if (strncmp(base, match, matchlen)) + if (strncmp_1(base, match, matchlen)) return 0; /* @@ -592,7 +612,7 @@ int tree_entry_interesting(const struct name_entry *entry, } /* Does the base match? */ - if (!strncmp(base_str, match, baselen)) { + if (!strncmp_1(base_str, match, baselen)) { if (match_entry(entry, pathlen, match + baselen, matchlen - baselen, &never_interesting)) -- 1.7.4.74.g639db ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee 2011-03-31 1:37 ` [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix Dan McGee 2011-03-31 1:37 ` [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting Dan McGee @ 2011-03-31 1:38 ` Dan McGee 2011-04-02 9:28 ` Nguyen Thai Ngoc Duy ` (2 more replies) 2011-03-31 1:38 ` [PATCH 5/5] tree-walk: match_entry microoptimization Dan McGee ` (3 subsequent siblings) 6 siblings, 3 replies; 30+ messages in thread From: Dan McGee @ 2011-03-31 1:38 UTC (permalink / raw) To: git; +Cc: Dan McGee We know our mode entry in our tree objects should be 5 or 6 characters long. This change both enforces this fact and also unrolls the parsing of the information giving the compiler more room for optimization of the operations. Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- tree-walk.c | 41 ++++++++++++++++++++++++++++++++++------- 1 files changed, 34 insertions(+), 7 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index f386151..41383b0 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -9,16 +9,43 @@ static const char *get_mode(const char *str, unsigned int *modep) unsigned char c; unsigned int mode = 0; - if (*str == ' ') - return NULL; - - while ((c = *str++) != ' ') { - if (c < '0' || c > '7') - return NULL; + /* + * Unroll what looks like a loop since the bounds are + * well-known. There should be at least 5 and at most 6 + * characters available in any valid mode, as '40000' is the + * shortest while '160000' (S_IFGITLINK) is the longest. + */ + /* char 1 */ + c = *str++; + if (c < '0' || c > '7') return NULL; + mode = (mode << 3) + (c - '0'); + /* char 2 */ + c = *str++; + if (c < '0' || c > '7') return NULL; + mode = (mode << 3) + (c - '0'); + /* char 3 */ + c = *str++; + if (c < '0' || c > '7') return NULL; + mode = (mode << 3) + (c - '0'); + /* char 4 */ + c = *str++; + if (c < '0' || c > '7') return NULL; + mode = (mode << 3) + (c - '0'); + /* char 5 */ + c = *str++; + if (c < '0' || c > '7') return NULL; + mode = (mode << 3) + (c - '0'); + /* char 6, optional */ + if (*str != ' ') { + c = *str++; + if (c < '0' || c > '7') return NULL; mode = (mode << 3) + (c - '0'); } + + if (*str != ' ') return NULL; + *modep = mode; - return str; + return str + 1; } static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size) -- 1.7.4.2 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-03-31 1:38 ` [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known Dan McGee @ 2011-04-02 9:28 ` Nguyen Thai Ngoc Duy 2011-04-02 17:28 ` Dan McGee 2011-04-04 10:29 ` Erik Faye-Lund 2011-04-04 16:55 ` Junio C Hamano 2 siblings, 1 reply; 30+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-04-02 9:28 UTC (permalink / raw) To: Dan McGee; +Cc: git On Thu, Mar 31, 2011 at 8:38 AM, Dan McGee <dpmcgee@gmail.com> wrote: > We know our mode entry in our tree objects should be 5 or 6 characters > long. This change both enforces this fact and also unrolls the parsing > of the information giving the compiler more room for optimization of the > operations. I'm skeptical. Did you measure signficant gain after this patch? I looked at asm output with -O3 and failed to see the compiler doing anything fancy. Perhaps it's because I'm on x86 with quite small register set. -- Duy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-04-02 9:28 ` Nguyen Thai Ngoc Duy @ 2011-04-02 17:28 ` Dan McGee 2011-04-03 4:07 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-04-02 17:28 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git On Sat, Apr 2, 2011 at 4:28 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote: > On Thu, Mar 31, 2011 at 8:38 AM, Dan McGee <dpmcgee@gmail.com> wrote: >> We know our mode entry in our tree objects should be 5 or 6 characters >> long. This change both enforces this fact and also unrolls the parsing >> of the information giving the compiler more room for optimization of the >> operations. > > I'm skeptical. Did you measure signficant gain after this patch? I > looked at asm output with -O3 and failed to see the compiler doing > anything fancy. Perhaps it's because I'm on x86 with quite small > register set. I'm on x86_64 and was just using -O2; -O3 produces the same output actually. You can see it below. I had taken a look at this before I submitted, and noticed a few things: 1. We do use multiple registers now since we aren't constrained to a loop. 2. movzbl (for the string parts) and cmb instructions tend to get clustered first. 3. mozbl (for the mode shifting) and leal instructions tend to get clustered later. 4. The normal case now involves no conditional jumps until the ' ' (space) comparison. Call these "trivial", but on my worst case operation times went from (shown below) 27.41 secs to 26.49 secs. Considering this operation is called 530,588,868 times (that is not a typo) during this operation, every saved instruction or non-missed branch prediction does seem to make a difference. -Dan Repo: http://projects.archlinux.org/svntogit/packages.git/ Old: $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m27.409s user 0m27.172s sys 0m0.230s .LVL3: .LBB58: .LBB59: .loc 1 12 0 is_stmt 1 movzbl (%rsi), %eax cmpb $32, %al je .L5 .LVL4: .loc 1 16 0 leal -48(%rax), %edx .LVL5: .loc 1 15 0 leaq 1(%rsi), %rdi .LVL6: .loc 1 16 0 cmpb $7, %dl ja .L5 xorl %edx, %edx jmp .L6 .LVL7: .p2align 4,,10 .p2align 3 .L7: leal -48(%rax), %ecx cmpb $7, %cl ja .L5 .LVL8: .L6: .loc 1 18 0 movzbl %al, %eax leal -48(%rax,%rdx,8), %edx .LVL9: .loc 1 15 0 movzbl (%rdi), %eax .LVL10: addq $1, %rdi .LVL11: cmpb $32, %al jne .L7 New: $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m26.490s user 0m26.282s sys 0m0.200s .LVL3: .LBB58: .LBB59: .loc 1 19 0 is_stmt 1 movzbl (%rsi), %eax .LVL4: .loc 1 20 0 leal -48(%rax), %edx .LVL5: cmpb $7, %dl ja .L5 .LVL6: .loc 1 23 0 movzbl 1(%rsi), %edx .LVL7: .loc 1 24 0 leal -48(%rdx), %ecx cmpb $7, %cl ja .L5 .LVL8: .loc 1 27 0 movzbl 2(%rsi), %ecx .LVL9: .loc 1 28 0 leal -48(%rcx), %edi cmpb $7, %dil ja .L5 .LVL10: .loc 1 31 0 movzbl 3(%rsi), %edi .LVL11: .loc 1 32 0 leal -48(%rdi), %r8d cmpb $7, %r8b ja .L5 .LVL12: .loc 1 35 0 movzbl 4(%rsi), %r8d .LVL13: .loc 1 36 0 leal -48(%r8), %r9d cmpb $7, %r9b ja .L5 .loc 1 21 0 movzbl %al, %eax .loc 1 25 0 movzbl %dl, %edx .loc 1 29 0 movzbl %cl, %ecx .loc 1 25 0 leal -432(%rdx,%rax,8), %edx .loc 1 33 0 movzbl %dil, %edi .loc 1 37 0 movzbl %r8b, %r8d .loc 1 35 0 leaq 5(%rsi), %rax .loc 1 29 0 leal -48(%rcx,%rdx,8), %edx .loc 1 39 0 movzbl 5(%rsi), %ecx .loc 1 33 0 leal -48(%rdi,%rdx,8), %edx .loc 1 39 0 cmpb $32, %cl .loc 1 37 0 leal -48(%r8,%rdx,8), %edx .LVL14: .loc 1 39 0 je .L7 .LVL15: .loc 1 41 0 leal -48(%rcx), %eax cmpb $7, %al ja .L5 .loc 1 45 0 cmpb $32, 6(%rsi) .loc 1 42 0 movzbl %cl, %ecx .loc 1 40 0 leaq 6(%rsi), %rax .loc 1 42 0 leal -48(%rcx,%rdx,8), %edx .LVL16: .loc 1 45 0 jne .L5 diff --git a/tree-walk.c b/tree-walk.c index 63901f8..dd7bd45 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -4,11 +4,22 @@ #include "dir.h" #include "tree.h" +static unsigned long hit_ctr = 0; + +static void print_hit_ctr(void) +{ + fprintf(stderr, "hit_ctr: %lu\n", hit_ctr); +} + static const char *get_mode(const char *str, unsigned int *modep) { unsigned char c; unsigned int mode = 0; + if(hit_ctr == 0) { + atexit(print_hit_ctr); + } + hit_ctr++; /* * Unroll what looks like a loop since the bounds are * well-known. There should be at least 5 and at most 6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-04-02 17:28 ` Dan McGee @ 2011-04-03 4:07 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 30+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-04-03 4:07 UTC (permalink / raw) To: Dan McGee; +Cc: git On Sun, Apr 3, 2011 at 12:28 AM, Dan McGee <dpmcgee@gmail.com> wrote: > On Sat, Apr 2, 2011 at 4:28 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote: >> On Thu, Mar 31, 2011 at 8:38 AM, Dan McGee <dpmcgee@gmail.com> wrote: >>> We know our mode entry in our tree objects should be 5 or 6 characters >>> long. This change both enforces this fact and also unrolls the parsing >>> of the information giving the compiler more room for optimization of the >>> operations. >> >> I'm skeptical. Did you measure signficant gain after this patch? I >> looked at asm output with -O3 and failed to see the compiler doing >> anything fancy. Perhaps it's because I'm on x86 with quite small >> register set. > > I'm on x86_64 and was just using -O2; -O3 produces the same output > actually. You can see it below. I had taken a look at this before I > submitted, and noticed a few things: > 1. We do use multiple registers now since we aren't constrained to a loop. > 2. movzbl (for the string parts) and cmb instructions tend to get > clustered first. > 3. mozbl (for the mode shifting) and leal instructions tend to get > clustered later. > 4. The normal case now involves no conditional jumps until the ' ' > (space) comparison. > > Call these "trivial", but on my worst case operation times went from > (shown below) 27.41 secs to 26.49 secs. Considering this operation is > called 530,588,868 times (that is not a typo) during this operation, > every saved instruction or non-missed branch prediction does seem to > make a difference. If it makes it better for you, I'm good. -- Duy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-03-31 1:38 ` [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known Dan McGee 2011-04-02 9:28 ` Nguyen Thai Ngoc Duy @ 2011-04-04 10:29 ` Erik Faye-Lund 2011-04-04 12:30 ` Andreas Ericsson 2011-04-04 16:55 ` Junio C Hamano 2 siblings, 1 reply; 30+ messages in thread From: Erik Faye-Lund @ 2011-04-04 10:29 UTC (permalink / raw) To: Dan McGee; +Cc: git On Thu, Mar 31, 2011 at 3:38 AM, Dan McGee <dpmcgee@gmail.com> wrote: > We know our mode entry in our tree objects should be 5 or 6 characters > long. This change both enforces this fact and also unrolls the parsing > of the information giving the compiler more room for optimization of the > operations. > > Signed-off-by: Dan McGee <dpmcgee@gmail.com> > --- > tree-walk.c | 41 ++++++++++++++++++++++++++++++++++------- > 1 files changed, 34 insertions(+), 7 deletions(-) > > diff --git a/tree-walk.c b/tree-walk.c > index f386151..41383b0 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -9,16 +9,43 @@ static const char *get_mode(const char *str, unsigned int *modep) > unsigned char c; > unsigned int mode = 0; > > - if (*str == ' ') > - return NULL; > - > - while ((c = *str++) != ' ') { > - if (c < '0' || c > '7') > - return NULL; > + /* > + * Unroll what looks like a loop since the bounds are > + * well-known. There should be at least 5 and at most 6 > + * characters available in any valid mode, as '40000' is the > + * shortest while '160000' (S_IFGITLINK) is the longest. > + */ > + /* char 1 */ > + c = *str++; > + if (c < '0' || c > '7') return NULL; We perfer this style: if (c < '0' || c > '7') return NULL; i.e a line-break and a tab between the if-statement and the conditional code. > + mode = (mode << 3) + (c - '0'); > + /* char 2 */ > + c = *str++; > + if (c < '0' || c > '7') return NULL; > + mode = (mode << 3) + (c - '0'); > + /* char 3 */ > + c = *str++; > + if (c < '0' || c > '7') return NULL; > + mode = (mode << 3) + (c - '0'); > + /* char 4 */ > + c = *str++; > + if (c < '0' || c > '7') return NULL; > + mode = (mode << 3) + (c - '0'); > + /* char 5 */ > + c = *str++; > + if (c < '0' || c > '7') return NULL; > + mode = (mode << 3) + (c - '0'); Wouldn't this part be cleaner as a constant-length loop? Any optimizing compiler should end up unrolling this, and we don't get as much code-duplication... Oddly enough, this change gave me a drastic (> 20%) performance increase in my test (isolating get_mode in a separate compilation unit, and calling it in a loop): diff --git a/tree-walk.c b/tree-walk.c index b8d504b..114ad63 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -7,42 +7,30 @@ static const char *get_mode(const char *str, unsigned int *modep) { unsigned char c; - unsigned int mode = 0; + unsigned int mode = 0, i; /* - * Unroll what looks like a loop since the bounds are + * Allow the compiler to unroll the loop since the bounds are * well-known. There should be at least 5 and at most 6 * characters available in any valid mode, as '40000' is the * shortest while '160000' (S_IFGITLINK) is the longest. */ - /* char 1 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 2 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 3 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 4 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 5 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); + for (i = 0; i < 5; ++i) { + c = *str++; + if (c < '0' || c > '7') + return NULL; + mode = (mode << 3) + (c - '0'); + } /* char 6, optional */ if (*str != ' ') { c = *str++; - if (c < '0' || c > '7') return NULL; + if (c < '0' || c > '7') + return NULL; mode = (mode << 3) + (c - '0'); } - if (*str != ' ') return NULL; + if (*str != ' ') + return NULL; *modep = mode; return str + 1; ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-04-04 10:29 ` Erik Faye-Lund @ 2011-04-04 12:30 ` Andreas Ericsson 0 siblings, 0 replies; 30+ messages in thread From: Andreas Ericsson @ 2011-04-04 12:30 UTC (permalink / raw) To: kusmabite; +Cc: Dan McGee, git On 04/04/2011 12:29 PM, Erik Faye-Lund wrote: > > Wouldn't this part be cleaner as a constant-length loop? Any > optimizing compiler should end up unrolling this, and we don't get as > much code-duplication... > It would. > Oddly enough, this change gave me a drastic (> 20%) performance > increase in my test (isolating get_mode in a separate compilation > unit, and calling it in a loop): > > diff --git a/tree-walk.c b/tree-walk.c > index b8d504b..114ad63 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -7,42 +7,30 @@ > static const char *get_mode(const char *str, unsigned int *modep) > { > unsigned char c; > - unsigned int mode = 0; > + unsigned int mode = 0, i; > > /* > - * Unroll what looks like a loop since the bounds are > + * Allow the compiler to unroll the loop since the bounds are > * well-known. There should be at least 5 and at most 6 > * characters available in any valid mode, as '40000' is the > * shortest while '160000' (S_IFGITLINK) is the longest. > */ > - /* char 1 */ > - c = *str++; > - if (c< '0' || c> '7') return NULL; > - mode = (mode<< 3) + (c - '0'); > - /* char 2 */ > - c = *str++; > - if (c< '0' || c> '7') return NULL; > - mode = (mode<< 3) + (c - '0'); > - /* char 3 */ > - c = *str++; > - if (c< '0' || c> '7') return NULL; > - mode = (mode<< 3) + (c - '0'); > - /* char 4 */ > - c = *str++; > - if (c< '0' || c> '7') return NULL; > - mode = (mode<< 3) + (c - '0'); > - /* char 5 */ > - c = *str++; > - if (c< '0' || c> '7') return NULL; > - mode = (mode<< 3) + (c - '0'); > + for (i = 0; i< 5; ++i) { s/i< 5/i < 5/ (nitpicking, yes) > + c = *str++; > + if (c< '0' || c> '7') > + return NULL; > + mode = (mode<< 3) + (c - '0'); This could be (micro-)optimized further as: --%<--%<-- for (i = 0; i < 5; i++) { int c = *str++ - '0'; if (c & ~7) { /* 5 chars is ok, so long as *str is now a space */ if (i == 5 && c == ' ') { *modep = mode; return str; } return NULL; } mode = (mode << 3) + c; } /* this should be a space, or we've got a malformed entry */ if (*str != ' ') return NULL; return str + 1; --%<--%<-- This should be slightly faster since there's now only one comparison inside the loop and since one branch contains a second branch fork-point and an inevitable early return the branch prediction machinery in gcc will favor the most common case properly, but pre-computations on 'mode' have to be undone in one case of hitting the early return, it might be faster to play pickup after a loop to 5 is done. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-03-31 1:38 ` [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known Dan McGee 2011-04-02 9:28 ` Nguyen Thai Ngoc Duy 2011-04-04 10:29 ` Erik Faye-Lund @ 2011-04-04 16:55 ` Junio C Hamano 2011-04-05 5:33 ` Dan McGee 2 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2011-04-04 16:55 UTC (permalink / raw) To: Dan McGee; +Cc: git Dan McGee <dpmcgee@gmail.com> writes: > We know our mode entry in our tree objects should be 5 or 6 characters > long. This change both enforces this fact... I find the implementation later shown in the thread is cleaner, but I'd comment on the word "enforces" here. It is more like "versions of git we know how to read from writes mode bits with 5 or 6 characters---if we see something else, either the tree object is corrupt, or we are too old to read the new type of entry in the tree object". So returning NULL is fine and it tells the caller that we do not understand the tree object. The caller says "corrupt tree file" when we do so here, but this change needs to rephrase it. If we stopped because we saw something other than ' ' after the run of octal digits, then we know the tree is corrupt. If we saw a three octal digits 644 in the mode field, terminated with ' ', maybe we are seeing a new kind of tree entry generated from later versions of git. Ideally, I'd rather see error checking done even higher layer than decode_tree_entry() for the "we are too old" case, though. We should return mode 0644 in such a case, and let the caller suggest that the version of git we are running might be too old, or the tree may have been written by a broken system that tried to mimic git in its error message. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-04-04 16:55 ` Junio C Hamano @ 2011-04-05 5:33 ` Dan McGee 2011-04-05 23:55 ` Antriksh Pany 0 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-04-05 5:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Erik Faye-Lund On Mon, Apr 4, 2011 at 5:29 AM, Erik Faye-Lund <kusmabite@gmail.com> wrote: > > We perfer this style: > > if (c < '0' || c > '7') > return NULL; > > i.e a line-break and a tab between the if-statement and the conditional code. I am well aware, I just wanted to make the code a bit less lengthy due to the repetition. On Mon, Apr 4, 2011 at 11:55 AM, Junio C Hamano <gitster@pobox.com> wrote: > Dan McGee <dpmcgee@gmail.com> writes: > >> We know our mode entry in our tree objects should be 5 or 6 characters >> long. This change both enforces this fact... > > I find the implementation later shown in the thread is cleaner, Totally agree; I should have tried to do it this way in the first place. However, compiling the fixed-length 0 to 5 loop does not produce fully-unrolled assembly for me with CFLAGS="-march=native -mtune=native -O2 -pipe -g" on x86_64. I see two copies of the loop only, and even worse is the (lack of) performance (each is the mode of 3 runs). Compilers are stupid apparently. Hand-unrolled: $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m28.616s user 0m28.428s sys 0m0.177s Static counter loop: $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m26.393s user 0m26.185s sys 0m0.203s Diff between the two down at the bottom. > but I'd > comment on the word "enforces" here. > > It is more like "versions of git we know how to read from writes mode bits > with 5 or 6 characters---if we see something else, either the tree object > is corrupt, or we are too old to read the new type of entry in the tree > object". OK, sounds like we have opposite thinking on this, interesting. get_mode() is not get_permissions()- it is the same as the stat() st_mode field, which means it has to (as of now) include at least one of the S_IFREG, S_IFDIR, or S_IFGITLINK flags in its value. All of S_ISREG, S_ISDIR, and S_ISGITLINK are used fsck_walk_tree(). So I'm not seeing a change in the tree storage format happening anytime soon as there is little way to preserve both forward and backward compatibility. Currently git will happily parse both a "1" in the tree stream, or a "165324132132464677321513252351"- it doesn't care. The problem is in the former case, a mode of 01 doesn't have any significance later on. In the latter case, we will be left-shifting our mode to hell and it becomes worthless. Do we disagree on clamping the lower limit at 5, the upper limit at 6, or both? > So returning NULL is fine and it tells the caller that we do not > understand the tree object. The caller says "corrupt tree file" when we > do so here, but this change needs to rephrase it. If we stopped because > we saw something other than ' ' after the run of octal digits, then we > know the tree is corrupt. If we saw a three octal digits 644 in the mode > field, terminated with ' ', maybe we are seeing a new kind of tree entry > generated from later versions of git. > > Ideally, I'd rather see error checking done even higher layer than > decode_tree_entry() for the "we are too old" case, though. We should > return mode 0644 in such a case, and let the caller suggest that the > version of git we are running might be too old, or the tree may have been > written by a broken system that tried to mimic git in its error message. > diff --git a/tree-walk.c b/tree-walk.c index 63901f8..63ec130 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -7,34 +7,21 @@ static const char *get_mode(const char *str, unsigned int *modep) { unsigned char c; - unsigned int mode = 0; + unsigned int mode = 0, i; /* - * Unroll what looks like a loop since the bounds are + * Allow the compiler to unroll the loop since the bounds are * well-known. There should be at least 5 and at most 6 * characters available in any valid mode, as '40000' is the * shortest while '160000' (S_IFGITLINK) is the longest. */ - /* char 1 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 2 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 3 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 4 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 5 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); + /* chars 1-5, optional */ + for (i = 0; i < 5; i++) { + c = *str++; + if (c < '0' || c > '7') + return NULL; + mode = (mode << 3) + (c - '0'); + } /* char 6, optional */ if (*str != ' ') { c = *str++; ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-04-05 5:33 ` Dan McGee @ 2011-04-05 23:55 ` Antriksh Pany 2011-04-06 20:45 ` Dan McGee 0 siblings, 1 reply; 30+ messages in thread From: Antriksh Pany @ 2011-04-05 23:55 UTC (permalink / raw) To: Dan McGee; +Cc: Junio C Hamano, git, Erik Faye-Lund On Mon, Apr 4, 2011 at 10:33 PM, Dan McGee <dpmcgee@gmail.com> wrote: > .... > Totally agree; I should have tried to do it this way in the first > place. However, compiling the fixed-length 0 to 5 loop does not > produce fully-unrolled assembly for me with CFLAGS="-march=native > -mtune=native -O2 -pipe -g" on x86_64. I see two copies of the loop > only, and even worse is the (lack of) performance (each is the mode of > 3 runs). Compilers are stupid apparently. > .... Can you try -O3? Or an explicit '-funroll-loops'? gcc I think does not do aggressive speed optimizations at the cost of space when at O2. Cheers Antriksh ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known 2011-04-05 23:55 ` Antriksh Pany @ 2011-04-06 20:45 ` Dan McGee 0 siblings, 0 replies; 30+ messages in thread From: Dan McGee @ 2011-04-06 20:45 UTC (permalink / raw) To: Antriksh Pany; +Cc: Junio C Hamano, git, Erik Faye-Lund On Tue, Apr 5, 2011 at 6:55 PM, Antriksh Pany <antriksh.pany@gmail.com> wrote: > On Mon, Apr 4, 2011 at 10:33 PM, Dan McGee <dpmcgee@gmail.com> wrote: >> .... >> Totally agree; I should have tried to do it this way in the first >> place. However, compiling the fixed-length 0 to 5 loop does not >> produce fully-unrolled assembly for me with CFLAGS="-march=native >> -mtune=native -O2 -pipe -g" on x86_64. I see two copies of the loop >> only, and even worse is the (lack of) performance (each is the mode of >> 3 runs). Compilers are stupid apparently. >> .... > > Can you try -O3? Or an explicit '-funroll-loops'? > gcc I think does not do aggressive speed optimizations at the cost of > space when at O2. Sure- both of these options show the loop being unrolled for all 5 iterations. However, that doesn't help me and the other 95% of people using distro packages, git-scm.com binaries, or anyone compiling with the default CFLAGS optimization level which is unfortunate. -Dan ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 5/5] tree-walk: match_entry microoptimization 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee ` (2 preceding siblings ...) 2011-03-31 1:38 ` [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known Dan McGee @ 2011-03-31 1:38 ` Dan McGee 2011-04-02 9:06 ` Nguyen Thai Ngoc Duy 2011-03-31 12:58 ` [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Nguyen Thai Ngoc Duy ` (2 subsequent siblings) 6 siblings, 1 reply; 30+ messages in thread From: Dan McGee @ 2011-03-31 1:38 UTC (permalink / raw) To: git; +Cc: Dan McGee Before calling strncmp(), see if we can get away with checking only the first character of the passed path components instead. Signed-off-by: Dan McGee <dpmcgee@gmail.com> --- tree-walk.c | 26 ++++++++++++++++++-------- 1 files changed, 18 insertions(+), 8 deletions(-) diff --git a/tree-walk.c b/tree-walk.c index 41383b0..083b951 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -488,9 +488,10 @@ static int match_entry(const struct name_entry *entry, int pathlen, const char *match, int matchlen, int *never_interesting) { - int m = -1; /* signals that we haven't called strncmp() */ + int m = -1; /* signals that we haven't compared strings */ if (*never_interesting) { + const int maxlen = (matchlen < pathlen) ? matchlen : pathlen; /* * We have not seen any match that sorts later * than the current path. @@ -500,10 +501,13 @@ static int match_entry(const struct name_entry *entry, int pathlen, * Does match sort strictly earlier than path * with their common parts? */ - m = strncmp(match, entry->path, - (matchlen < pathlen) ? matchlen : pathlen); - if (m < 0) - return 0; + if (maxlen && match[0] > entry->path[0]) { + /* no good for the shortcut here, match must be <= */ + } else { + m = strncmp(match, entry->path, maxlen); + if(m < 0) + return 0; + } /* * If we come here even once, that means there is at @@ -531,12 +535,17 @@ static int match_entry(const struct name_entry *entry, int pathlen, return 0; } - if (m == -1) + if (m == -1) { /* - * we cheated and did not do strncmp(), so we do + * we cheated and did compare strings, so we do * that here. */ - m = strncmp(match, entry->path, pathlen); + if (pathlen && match[0] == entry->path[0]) + /* invariant: matchlen == pathlen */ + m = strncmp(match, entry->path, pathlen); + else + m = 1; + } /* * If common part matched earlier then it is a hit, @@ -552,6 +561,7 @@ static int match_entry(const struct name_entry *entry, int pathlen, static int match_dir_prefix(const char *base, const char *match, int matchlen) { + /* invariant: baselen >= matchlen */ if (strncmp(base, match, matchlen)) return 0; -- 1.7.4.2 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] tree-walk: match_entry microoptimization 2011-03-31 1:38 ` [PATCH 5/5] tree-walk: match_entry microoptimization Dan McGee @ 2011-04-02 9:06 ` Nguyen Thai Ngoc Duy 2011-04-02 17:54 ` Dan McGee 0 siblings, 1 reply; 30+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-04-02 9:06 UTC (permalink / raw) To: Dan McGee; +Cc: git On Thu, Mar 31, 2011 at 8:38 AM, Dan McGee <dpmcgee@gmail.com> wrote: > Before calling strncmp(), see if we can get away with checking only the > first character of the passed path components instead. > > Signed-off-by: Dan McGee <dpmcgee@gmail.com> I wonder if inlining strcmp() would be better (at least cleaner code). It seems like you try to avoid the function call cost here. -- Duy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] tree-walk: match_entry microoptimization 2011-04-02 9:06 ` Nguyen Thai Ngoc Duy @ 2011-04-02 17:54 ` Dan McGee 0 siblings, 0 replies; 30+ messages in thread From: Dan McGee @ 2011-04-02 17:54 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git On Sat, Apr 2, 2011 at 4:06 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote: > On Thu, Mar 31, 2011 at 8:38 AM, Dan McGee <dpmcgee@gmail.com> wrote: >> Before calling strncmp(), see if we can get away with checking only the >> first character of the passed path components instead. >> >> Signed-off-by: Dan McGee <dpmcgee@gmail.com> > > I wonder if inlining strcmp() would be better (at least cleaner code). > It seems like you try to avoid the function call cost here. 100% agree with you, but I have no idea how to do that. Realize also you can use memcmp() here; I tried that and got no noticeable gain (likely due to how short these strings are). However, it is what we use in most other functions in this file. I assume the compiler would inline versions with a static length, but if you have suggestions as to how to inline a non-trivial version here I'm all ears. -Dan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee ` (3 preceding siblings ...) 2011-03-31 1:38 ` [PATCH 5/5] tree-walk: match_entry microoptimization Dan McGee @ 2011-03-31 12:58 ` Nguyen Thai Ngoc Duy 2011-03-31 13:56 ` Dan McGee 2011-04-01 22:28 ` Junio C Hamano 2011-05-03 7:34 ` Nguyen Thai Ngoc Duy 6 siblings, 1 reply; 30+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-31 12:58 UTC (permalink / raw) To: Dan McGee; +Cc: git On Thu, Mar 31, 2011 at 8:37 AM, Dan McGee <dpmcgee@gmail.com> wrote: > These next few patches are all derived from trying to make operations on this > repository a bit faster: > http://projects.archlinux.org/svntogit/packages.git/ > > ... > > time git log -- zzzzz_not_exist >/dev/null > 35.558s 0.976s Do you have numbers before and after applying this series? It'd be interesting to see how much we gain from the series. -- Duy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new 2011-03-31 12:58 ` [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Nguyen Thai Ngoc Duy @ 2011-03-31 13:56 ` Dan McGee 0 siblings, 0 replies; 30+ messages in thread From: Dan McGee @ 2011-03-31 13:56 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git On Thu, Mar 31, 2011 at 7:58 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote: > On Thu, Mar 31, 2011 at 8:37 AM, Dan McGee <dpmcgee@gmail.com> wrote: >> These next few patches are all derived from trying to make operations on this >> repository a bit faster: >> http://projects.archlinux.org/svntogit/packages.git/ >> >> ... >> >> time git log -- zzzzz_not_exist >/dev/null >> 35.558s 0.976s > > Do you have numbers before and after applying this series? It'd be > interesting to see how much we gain from the series. Whipped some up for you: Repo Operation master after delta arch-packages ../git/git-log 2.26 2.27 -0.01 arch-packages ../git/git-log -- zzzzz_not_exist 34.69 26.75 7.93 arch-packages ../git/git-log -- aaaaa_first 6.02 5.76 0.26 linux-2.6 ../git/git-log 5.51 5.50 0.02 linux-2.6 ../git/git-log -- zzzzz_not_exist 0.95 0.92 0.03 linux-2.6 ../git/git-log -- aaaaa_first 0.89 0.86 0.03 These are all the "real" value from time with the full set of patches applied. As you can see, one either gains or isn't really affected by these; anything under 0.05s delta is noise. Bigger gains than this don't seem too plausible unless we move away from linear parsing and traversal of tree objects. -Dan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee ` (4 preceding siblings ...) 2011-03-31 12:58 ` [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Nguyen Thai Ngoc Duy @ 2011-04-01 22:28 ` Junio C Hamano [not found] ` <AANLkTinPSqDPdGi5nA3sH1D2wMSW1SQc+5gRqdLy++y0@mail.gmail.com> 2011-05-03 7:34 ` Nguyen Thai Ngoc Duy 6 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2011-04-01 22:28 UTC (permalink / raw) To: Dan McGee; +Cc: git Dan McGee <dpmcgee@gmail.com> writes: > This was seen to happen in some invocations of git-log with a filtered > path. Only do it if we are not recursively descending, as otherwise we > mess with copy and rename detection in full tree moves. There is no code that corresponds to your "Only do it..." description in your patch, though. The existing code already takes care of that part with or without your patch, no? > diff --git a/tree-diff.c b/tree-diff.c > index 76f83fc..ab90f1a 100644 > --- a/tree-diff.c > +++ b/tree-diff.c > @@ -286,6 +286,9 @@ int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const cha > unsigned long size1, size2; > int retval; > > + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER) && !hashcmp(old, new)) > + return 0; > + I am very curious why this patch makes a difference; doesn't an existing test in compare_tree_entry() oalready cull extra recursion? There is: if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER) && !hashcmp(sha1, sha2) && mode1 == mode2) return 0; before a recursive call to diff_tree_sha1() to dig deeper. ^ permalink raw reply [flat|nested] 30+ messages in thread
[parent not found: <AANLkTinPSqDPdGi5nA3sH1D2wMSW1SQc+5gRqdLy++y0@mail.gmail.com>]
* Fwd: [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new [not found] ` <AANLkTinPSqDPdGi5nA3sH1D2wMSW1SQc+5gRqdLy++y0@mail.gmail.com> @ 2011-04-02 18:38 ` Dan McGee 0 siblings, 0 replies; 30+ messages in thread From: Dan McGee @ 2011-04-02 18:38 UTC (permalink / raw) To: git Forgot to forward this to the list as well, I apologize. On Fri, Apr 1, 2011 at 5:28 PM, Junio C Hamano <gitster@pobox.com> wrote: > Dan McGee <dpmcgee@gmail.com> writes: > >> This was seen to happen in some invocations of git-log with a filtered >> path. Only do it if we are not recursively descending, as otherwise we >> mess with copy and rename detection in full tree moves. > > There is no code that corresponds to your "Only do it..." description in > your patch, though. The existing code already takes care of that part > with or without your patch, no? Damn, I forgot to update the message- see below. >> diff --git a/tree-diff.c b/tree-diff.c >> index 76f83fc..ab90f1a 100644 >> --- a/tree-diff.c >> +++ b/tree-diff.c >> @@ -286,6 +286,9 @@ int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const cha >> unsigned long size1, size2; >> int retval; >> >> + if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER) && !hashcmp(old, new)) >> + return 0; >> + > > I am very curious why this patch makes a difference; doesn't an existing > test in compare_tree_entry() oalready cull extra recursion? There is: This was originally testing RECURSIVE; however I discovered that was not the culprit to my failed tests. t9300-fastimport.sh was failing on "copy then modify subdirectory" due to the full info not being loaded for the before sha1 in that test- instead of showing the fcf778cda ... C100 part (this is just the first line of expected, all were the same), it was 000000 ... A. once I added the above fallthrough to not shortcut if this option was enabled, things worked fine and all tests passed. > if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER) && !hashcmp(sha1, sha2) && > mode1 == mode2) > return 0; > > before a recursive call to diff_tree_sha1() to dig deeper. > I'm not totally sure why this check wasn't working, but without the above exception my patch definitely broke tests. -Dan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee ` (5 preceding siblings ...) 2011-04-01 22:28 ` Junio C Hamano @ 2011-05-03 7:34 ` Nguyen Thai Ngoc Duy 6 siblings, 0 replies; 30+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-05-03 7:34 UTC (permalink / raw) To: Dan McGee; +Cc: git On Thu, Mar 31, 2011 at 8:37 AM, Dan McGee <dpmcgee@gmail.com> wrote: > These next few patches are all derived from trying to make operations on this > repository a bit faster: > http://projects.archlinux.org/svntogit/packages.git/ > > As far as a different shape of repository, this one qualifies. Some stats from > after running `git gc --prune` and looking at things from count-objects among > others: Does this series need any work to get (at least part of) it in? -- Duy ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2011-09-09 2:03 UTC | newest] Thread overview: 30+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-03-31 1:37 [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Dan McGee 2011-03-31 1:37 ` [PATCH 2/5] tree-walk: drop unused parameter from match_dir_prefix Dan McGee 2011-08-30 18:55 ` Dan McGee 2011-03-31 1:37 ` [PATCH 3/5] tree-walk: micro-optimization in tree_entry_interesting Dan McGee 2011-04-03 4:01 ` Nguyen Thai Ngoc Duy 2011-04-03 18:55 ` Junio C Hamano 2011-04-05 0:22 ` Dan McGee [not found] ` <CAEik5nOKrpFycZYVnSu4_5LYWxn0JS_hVXyiQH-80Bu-C4k8VQ@mail.gmail.com> 2011-08-30 19:51 ` Dan McGee 2011-08-30 20:40 ` Junio C Hamano 2011-09-09 2:02 ` [PATCH 1/2] tree-walk: drop unused parameter from match_dir_prefix Dan McGee 2011-09-09 2:02 ` [PATCH 2/2] tree-walk: micro-optimization in tree_entry_interesting Dan McGee 2011-04-04 14:46 ` [PATCH] tree_entry_interesting: inline strncmp() Nguyễn Thái Ngọc Duy 2011-03-31 1:38 ` [PATCH 4/5] tree-walk: unroll get_mode since loop boundaries are well-known Dan McGee 2011-04-02 9:28 ` Nguyen Thai Ngoc Duy 2011-04-02 17:28 ` Dan McGee 2011-04-03 4:07 ` Nguyen Thai Ngoc Duy 2011-04-04 10:29 ` Erik Faye-Lund 2011-04-04 12:30 ` Andreas Ericsson 2011-04-04 16:55 ` Junio C Hamano 2011-04-05 5:33 ` Dan McGee 2011-04-05 23:55 ` Antriksh Pany 2011-04-06 20:45 ` Dan McGee 2011-03-31 1:38 ` [PATCH 5/5] tree-walk: match_entry microoptimization Dan McGee 2011-04-02 9:06 ` Nguyen Thai Ngoc Duy 2011-04-02 17:54 ` Dan McGee 2011-03-31 12:58 ` [PATCH 1/5] diff_tree_sha1: skip diff_tree if old == new Nguyen Thai Ngoc Duy 2011-03-31 13:56 ` Dan McGee 2011-04-01 22:28 ` Junio C Hamano [not found] ` <AANLkTinPSqDPdGi5nA3sH1D2wMSW1SQc+5gRqdLy++y0@mail.gmail.com> 2011-04-02 18:38 ` Fwd: " Dan McGee 2011-05-03 7:34 ` Nguyen Thai Ngoc Duy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).