* [PATCH 1/4] pack-objects: mark add_to_write_order() as inline
@ 2011-10-18  5:21 Dan McGee
  2011-10-18  5:21 ` [PATCH 2/4] pack-objects: use unsigned int for counter and offset values Dan McGee
  ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This function is a whole 26 bytes when compiled on x86_64, but is
currently invoked over 1.037 billion times when running pack-objects on
the Linux kernel git repository. This is hitting the point where
micro-optimizations do make a difference, and inlining it only increases
the object file size by 38 bytes.

As reported by perf, this dropped task-clock from 84183 to 83373 ms, and
total cycles from 223.5 billion to 221.6 billion. Not astronomical, but
worth getting for adding one word.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2b18de5..0ab3a3b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -454,7 +454,7 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 	return 0;
 }
 
-static void add_to_write_order(struct object_entry **wo,
+static inline void add_to_write_order(struct object_entry **wo,
 			       int *endp,
 			       struct object_entry *e)
 {
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* [PATCH 2/4] pack-objects: use unsigned int for counter and offset values
  2011-10-18  5:21 [PATCH 1/4] pack-objects: mark add_to_write_order() as inline Dan McGee
@ 2011-10-18  5:21 ` Dan McGee
  2011-10-18  5:21 ` [PATCH 3/4] pack-objects: don't traverse objects unnecessarily Dan McGee
  2011-10-18  5:21 ` [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively Dan McGee
  2 siblings, 0 replies; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This is done in some of the new pack layout code introduced in commit
1b4bb16b9ec331c. This more closely matches the nr_objects global, which
is unsigned, and which these variables are based off of and bounded by.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0ab3a3b..0de10d2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -455,7 +455,7 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 }
 
 static inline void add_to_write_order(struct object_entry **wo,
-			       int *endp,
+			       unsigned int *endp,
 			       struct object_entry *e)
 {
 	if (e->filled)
@@ -465,7 +465,7 @@ static inline void add_to_write_order(struct object_entry **wo,
 }
 
 static void add_descendants_to_write_order(struct object_entry **wo,
-					   int *endp,
+					   unsigned int *endp,
 					   struct object_entry *e)
 {
 	struct object_entry *child;
@@ -477,7 +477,7 @@ static void add_descendants_to_write_order(struct object_entry **wo,
 }
 
 static void add_family_to_write_order(struct object_entry **wo,
-				      int *endp,
+				      unsigned int *endp,
 				      struct object_entry *e)
 {
 	struct object_entry *root;
@@ -490,7 +490,7 @@ static void add_family_to_write_order(struct object_entry **wo,
 
 static struct object_entry **compute_write_order(void)
 {
-	int i, wo_end;
+	unsigned int i, wo_end;
 	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
 
@@ -506,8 +506,8 @@ static struct object_entry **compute_write_order(void)
 	 * Make sure delta_sibling is sorted in the original
 	 * recency order.
 	 */
-	for (i = nr_objects - 1; 0 <= i; i--) {
-		struct object_entry *e = &objects[i];
+	for (i = nr_objects; i > 0;) {
+		struct object_entry *e = &objects[--i];
 		if (!e->delta)
 			continue;
 		/* Mark me as the first child */
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-10-18  5:21 [PATCH 1/4] pack-objects: mark add_to_write_order() as inline Dan McGee
  2011-10-18  5:21 ` [PATCH 2/4] pack-objects: use unsigned int for counter and offset values Dan McGee
@ 2011-10-18  5:21 ` Dan McGee
  2011-10-27 22:26   ` Junio C Hamano
  2011-10-18  5:21 ` [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively Dan McGee
  2 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This brings back some of the performance lost in optimizing recency
order inside pack objects. We were doing extreme amounts of object
re-traversal: for the 2.14 million objects in the Linux kernel
repository, we were calling add_to_write_order() over 1.03 billion times
(a 0.2% hit rate, making 99.8% of these calls extraneous).

Two optimizations take place here: we can start our objects array
iteration from a known point where we left off before we started trying
to find our tags, and we don't need to do the deep dives required by
add_family_to_write_order() if the object has already been marked as
filled.

These two optimizations bring some pretty spectacular results via
`perf stat`:

  task-clock:   83373 ms         --> 43800 ms          (50% faster)
  cycles:       221,633,461,676  --> 116,307,209,986   (47% fewer)
  instructions: 149,299,179,939  --> 122,998,800,184   (18% fewer)

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0de10d2..d9fb202 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -490,7 +490,7 @@ static void add_family_to_write_order(struct object_entry **wo,
 
 static struct object_entry **compute_write_order(void)
 {
-	unsigned int i, wo_end;
+	unsigned int i, wo_end, last_untagged;
 	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
 
@@ -521,7 +521,7 @@ static struct object_entry **compute_write_order(void)
 	for_each_tag_ref(mark_tagged, NULL);
 
 	/*
-	 * Give the commits in the original recency order until
+	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
 	for (i = wo_end = 0; i < nr_objects; i++) {
@@ -529,6 +529,7 @@ static struct object_entry **compute_write_order(void)
 			break;
 		add_to_write_order(wo, &wo_end, &objects[i]);
 	}
+	last_untagged = i;
 
 	/*
 	 * Then fill all the tagged tips.
@@ -541,7 +542,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all remaining commits and tags.
 	 */
-	for (i = 0; i < nr_objects; i++) {
+	for (i = last_untagged; i < nr_objects; i++) {
 		if (objects[i].type != OBJ_COMMIT &&
 		    objects[i].type != OBJ_TAG)
 			continue;
@@ -551,7 +552,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all the trees.
 	 */
-	for (i = 0; i < nr_objects; i++) {
+	for (i = last_untagged; i < nr_objects; i++) {
 		if (objects[i].type != OBJ_TREE)
 			continue;
 		add_to_write_order(wo, &wo_end, &objects[i]);
@@ -560,8 +561,13 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * Finally all the rest in really tight order
 	 */
-	for (i = 0; i < nr_objects; i++)
-		add_family_to_write_order(wo, &wo_end, &objects[i]);
+	for (i = last_untagged; i < nr_objects; i++) {
+		if (!objects[i].filled)
+			add_family_to_write_order(wo, &wo_end, &objects[i]);
+	}
+
+	if(wo_end != nr_objects)
+		die("ordered %u objects, expected %u", wo_end, nr_objects);
 
 	return wo;
 }
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-10-18  5:21 ` [PATCH 3/4] pack-objects: don't traverse objects unnecessarily Dan McGee
@ 2011-10-27 22:26   ` Junio C Hamano
  2011-11-09  4:31     ` Dan McGee
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2011-10-27 22:26 UTC (permalink / raw)
To: Dan McGee; +Cc: git

Dan McGee <dpmcgee@gmail.com> writes:

> Two optimizations take place here- we can start our objects array
> iteration from a known point where we left off before we started trying
> to find our tags,

This I would understand (but I am somewhat curious how much last_untagged
would advance relative to nr_objects for this half of the optimization to
be worth it), but...

> and we don't need to do the deep dives required by
> add_family_to_write_order() if the object has already been marked as
> filled.

I am not sure if this produces the identical result that was benchmarked
in the original series.

For example, if you have a tagged object that is not a commit (say a
blob), you would have written that blob in the second phase (write tagged
objects together), so the family of blobs that share the same delta parent
as that blob will not be written in this "Finally all the rest" in the
right place in the original list, no?

I do not think this change would forget to fill an object that needs to be
filled, but it would affect the resulting ordering of the list, so...

> @@ -560,8 +561,13 @@ static struct object_entry **compute_write_order(void)
>  	/*
>  	 * Finally all the rest in really tight order
>  	 */
> -	for (i = 0; i < nr_objects; i++)
> -		add_family_to_write_order(wo, &wo_end, &objects[i]);
> +	for (i = last_untagged; i < nr_objects; i++) {
> +		if (!objects[i].filled)
> +			add_family_to_write_order(wo, &wo_end, &objects[i]);
> +	}
> +
> +	if(wo_end != nr_objects)
> +		die("ordered %u objects, expected %u", wo_end, nr_objects);

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-10-27 22:26 ` Junio C Hamano
@ 2011-11-09  4:31   ` Dan McGee
  2011-11-12  6:55     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-11-09  4:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git

On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Dan McGee <dpmcgee@gmail.com> writes:
>
>> Two optimizations take place here- we can start our objects array
>> iteration from a known point where we left off before we started trying
>> to find our tags,
>
> This I would understand (but I am somewhat curious how much last_untagged
> would advance relative to nr_objects for this half of the optimization to
> be worth it), but...

First off, sorry I wasn't able to respond to you until now; I've been a
bit swamped with other projects. I saw this made it into the 1.7.7.3
maint release today, but wanted to at least try to respond to the points
you raised (if you didn't investigate them by yourself already).

If I remember right, the last_untagged optimization was pretty minor-
around 10% advancement of the pointer, never that much more. However, it
was a very easy change to make so I figured it was worth the slight
additional code (~5 lines changed for that part on its own).

>> and we don't need to do the deep dives required by
>> add_family_to_write_order() if the object has already been marked as
>> filled.
>
> I am not sure if this produces the identical result that was benchmarked
> in the original series.

I was not either when I wrote the patch, and I had hoped to confirm the
results you showed in the message of 1b4bb16b9ec. However, I was unable
to figure out how you generated those numbers so I wasn't able to do so
(and had planned to get back to you to find out how you made those
tables). Were you able to verify the ordering did not regress?

> For example, if you have a tagged object that is not a commit (say a
> blob), you would have written that blob in the second phase (write tagged
> objects together), so the family of blobs that share same delta parent as
> that blob will not be written in this "Finally all the rest" in the right
> place in the original list, no?

True, I think. They would not be written in the same place, but is that
necessarily the right place? The delta parent of an already-filled tag
object would eventually come up in the array as not filled, and then we
would do a full deep dive at that point, so the majority of the objects
would still be in close proximity.

Note that either way, we still have a gap between the original tagged
object and its "family"- potentially all the other tagged tips, tags,
commits, and trees before the blobs are finally hit and laid out in
family groups. So there is at least one potentially big seek that can't
be avoided.

> I do not think this change would forget to fill an object that needs to be
> filled, but it would affect the resulting ordering of the list, so...

This was the purpose of the added "if (wo_end != nr_objects) die()"
line; it confirms we have traversed and hit every object we originally
found.

-Dan

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-11-09  4:31 ` Dan McGee
@ 2011-11-12  6:55   ` Junio C Hamano
  2011-11-13 22:34     ` Dan McGee
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2011-11-12  6:55 UTC (permalink / raw)
To: Dan McGee; +Cc: git

Dan McGee <dpmcgee@gmail.com> writes:

> On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> I am not sure if this produces the identical result that was benchmarked
>> in the original series.
> I was not either when I wrote the patch, and I had hoped to confirm
> the results you showed in the message of 1b4bb16b9ec.

I actually am reasonably sure the result will not be identical, but I also
do not think it matters. The differences would appear only for entries
that have been filled earlier, which should be a minority.

> unable to figure out how you generated those numbers so I wasn't able
> to do so (and had planned to get back to you to find out how you made
> those tables). Were you able to verify the ordering did not regress?

No; I was hoping you would redo the benchmark using 5f44324 (core: log
offset pack data accesses happened, 2011-07-06).

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-11-12  6:55 ` Junio C Hamano
@ 2011-11-13 22:34   ` Dan McGee
  2011-11-14  5:40     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-11-13 22:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: GIT Mailing-list

On Sat, Nov 12, 2011 at 12:55 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Dan McGee <dpmcgee@gmail.com> writes:
>
>> On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> I am not sure if this produces the identical result that was benchmarked
>>> in the original series.
>> I was not either when I wrote the patch, and I had hoped to confirm
>> the results you showed in the message of 1b4bb16b9ec.
>
> I actually am reasonably sure the result will not be identical, but I also
> do not think it matters. The differences would appear only for entries
> that have been filled earlier, which should be a minority.
>
>> unable to figure out how you generated those numbers so I wasn't able
>> to do so (and had planned to get back to you to find out how you made
>> those tables). Were you able to verify the ordering did not regress?
>
> No; I was hoping you would redo the benchmark using 5f44324 (core: log
> offset pack data accesses happened, 2011-07-06).

I'm still not sure what you used to parse these results, so I had to
spend a good amount of time writing up scripts to parse that output.

Anyway, I ran tests that nearly correspond to the ones you quoted on
four different pack-object versions as noted below. The first is one
revision before your original changes, the next two are
self-explanatory, and the final version is with the short diff included
below.

Observations:
* Perhaps I did something wrong, but the v1.7.7.2 numbers don't seem to
  agree with the figures you came up with- not sure why that is,
  however, and I've already spent a lot of time on this today and don't
  really have more time to sink into this.
* The diff included below seems to make a significant difference in
  some of the seek values.

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *git_log.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:            32             32             32             32
       10.0:           256            256            256            256
       20.0:           293            293            293            293
       30.0:           327            327            327            327
       40.0:           366            366            366            366
       50.0:           421            421            421            421
       60.0:           526            526            526            526
       70.0:           894            894            894            895
       80.0:         11612          11625          11622          11625
       90.0:         97405          97487          97487          97510
       95.0:        280123         280391         280396         280391
       99.0:       1251812        1254871        1252919        1253318
       99.5:       1850181        1853291        1853100        1853106
       99.9:       4008778        4013897        3988759        4013897
   accesses:        280450         280464         280457         280461
 <2MiB seek:         99.61          99.61          99.61          99.61

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *git_log_drivers_net.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:             0              0              0              0
       10.0:           144             46             46             46
       20.0:           233             48             48             48
       30.0:           317             98             97             97
       40.0:           512           1396           1367            990
       50.0:          2921         773060         786210         399996
       60.0:        774452       11594348       10156053        4532415
       70.0:     333258049      424530065      428113049      101687854
       80.0:     411869214      438733385      438510929      106316734
       90.0:     426972983      444510034      443993824      112362757
       95.0:     432061866      447253337      446466814      116078451
       99.0:     434915514      453076229      451430514      118597896
       99.5:     435032830      454359692      452394830      119008652
       99.9:     435123054      456005056      454017949      119605604
   accesses:        601405         600780         601732         601219
 <2MiB seek:         61.68          53.06          53.53          56.21

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *blame*.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:             0              0              0              0
       10.0:           137             46             46             46
       20.0:           192             48             48             48
       30.0:           309             97             97             97
       40.0:           774           5246           5244           4034
       50.0:         32585        2643798        2585078        1518706
       60.0:     376168864      434624162      434479253      102257691
       70.0:     415045893      438464313      438213313      104681256
       80.0:     425668210      441472744      441222839      107541644
       90.0:     430603653      445070643      444450126      112494824
       95.0:     433514511      447215535      446450183      116456428
       99.0:     435037297      453431874      451707047      118722564
       99.5:     435065325      454459123      452652312      119165598
       99.9:     435149193      456205377      454197863      119710538
   accesses:        199249         194708         194738         194875
 <2MiB seek:         54.31          49.25          49.43          50.94

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *index*.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:             9              9              9              9
       10.0:           137             45             45             45
       20.0:           224             47             47             47
       30.0:           315             71             71             71
       40.0:           449             96             96             96
       50.0:           808            164            165            165
       60.0:          2693            289            290            287
       70.0:         46913            456            458            448
       80.0:       1456359            966            975            905
       90.0:      12555961           3423           3486           2836
       95.0:      44134452          10211          10616           7075
       99.0:     269753078      304302120      314414250          35401
       99.5:     353346206      386205936      388585783          59897
       99.9:     408908212      414260557      415445310         244643
   accesses:       3050155        3045012        3045025        3045141
 <2MiB seek:         81.56          98.71          98.65          99.98

The scripts I used; there's obviously some hardcoded magic here, but you
should be able to use them if you want:

$ cat test_pack.sh
#!/bin/bash -e

versions=('1b4bb16b9ec~1' 'v1.7.7.2' 'v1.7.7.3' 'add-whole-family')
commands=('git log' 'git log drivers/net' 'git blame drivers/net/netconsole.c' 'git index-pack -v .git/objects/pack/*.pack')

gitdir=/home/dmcgee/projects/git
linuxdir=/home/dmcgee/projects/linux

export GIT_EXEC_DIR=$gitdir

for version in "${versions[@]}"; do
    echo $version
    cd $gitdir
    git checkout $version
    make -j6 CFLAGS="-march=native -mtune=native -O2 -pipe -g" PYTHON_PATH=/usr/bin/python2
    cd $linuxdir
    git config core.logpackaccess "/tmp/$version-repack.txt"
    $gitdir/git repack -a -d
    for command in "${commands[@]}"; do
        clean_cmd=${command//\//_}
        clean_cmd=${clean_cmd// /_}
        git config core.logpackaccess "/tmp/$version-$clean_cmd.txt"
        echo $command
        $gitdir/$command >/dev/null
    done
    git config --unset core.logpackaccess
done

$ cat parse_pack_access.py
#!/usr/bin/python2
from collections import defaultdict
import sys

def read_file(filename):
    packs = defaultdict(list)
    with open(filename, 'r') as data:
        for line in data.readlines():
            pack, position = line.strip().split(' ')
            packs[pack].append(int(position))
    return packs

def calculate_seeks(positions):
    seeks = []
    prev = positions[0]
    for position in positions[1:]:
        seeks.append(abs(position - prev))
        prev = position
    return sorted(seeks)

def bucket_seeks(seeks):
    percents = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0,
                80.0, 90.0, 95.0, 99.0, 99.5, 99.9]
    results = []
    for percent in percents:
        index = int(percent/100.0 * len(seeks))
        offset = seeks[index]
        results.append((percent, offset))
    return results

def print_result_line(label, results):
    print '%12s%s' % (label, ''.join('%18s' % result for result in results))

def main(filenames):
    known_versions = ['1b4bb16b9ec~1', 'v1.7.7.2', 'v1.7.7.3', 'add-whole-family']
    pack_accesses = {}
    for filename in filenames:
        pack_accesses[filename] = read_file(filename)

    bucket_table = defaultdict(list)
    accesses = []
    under_twomb = []
    for version in known_versions:
        filename = [fn for fn in pack_accesses.keys()
                    if fn.startswith(version)][0]
        access = pack_accesses[filename]
        for pack, positions in access.items():
            seeks = calculate_seeks(positions)
            results = bucket_seeks(seeks)
            for percent, offset in results:
                bucket_table[percent].append(offset)
            accesses.append(len(positions))
            under_twomb.append(sum(1 for s in seeks if s < 2 * 1024 * 1024)
                               * 100.0 / len(seeks))

    print_result_line('', known_versions)
    for k in sorted(bucket_table.keys()):
        print_result_line('%.1f:' % k, bucket_table[k])
    print_result_line('accesses:', accesses)
    print_result_line('<2MiB seek:', ('%.2f' % under for under in under_twomb))

if __name__ == '__main__':
    main(sys.argv[1:])

>From dfee7999b95442a5de2a7a0232c75262d13a28d6 Mon Sep 17 00:00:00 2001
From: Dan McGee <dpmcgee@gmail.com>
Date: Sun, 13 Nov 2011 15:07:13 -0600
Subject: [PATCH] Add whole family when packing objects

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 80ab6c3..0a9e761 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -566,7 +566,7 @@ static struct object_entry **compute_write_order(void)
 	 */
 	for (; i < nr_objects; i++) {
 		if (objects[i].tagged)
-			add_to_write_order(wo, &wo_end, &objects[i]);
+			add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
 	/*
@@ -576,7 +576,7 @@ static struct object_entry **compute_write_order(void)
 		if (objects[i].type != OBJ_COMMIT &&
 		    objects[i].type != OBJ_TAG)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
 	/*
@@ -585,7 +585,7 @@ static struct object_entry **compute_write_order(void)
 	for (i = last_untagged; i < nr_objects; i++) {
 		if (objects[i].type != OBJ_TREE)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
 	/*
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-11-13 22:34 ` Dan McGee
@ 2011-11-14  5:40   ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2011-11-14  5:40 UTC (permalink / raw)
To: Dan McGee; +Cc: GIT Mailing-list

Dan McGee <dpmcgee@gmail.com> writes:

>>> unable to figure out how you generated those numbers so I wasn't able
>>> to do so (and had planned to get back to you to find out how you made
>>> those tables). Were you able to verify the ordering did not regress?
>>
>> No; I was hoping you would redo the benchmark using 5f44324 (core: log
>> offset pack data accesses happened, 2011-07-06).
>
> I'm still not sure what you used to parse these results,...

Ah, in the kernel repository, after running "repack -a -d -f" with
versions of git and copying the resulting packfiles in PACK-OLD/ and
PACK-NEW/, I used these scripts to examine the access pattern.

-- >8 -- DOIT.sh -- >8 --
#!/bin/sh
tmp=/var/tmp/ll$
trap 'rm -f "$tmp.*"' 0

ln -f PACK-OLD/* .git/objects/pack/. || exit
log="$tmp.old"
eval '/usr/bin/time rungit test -c core.logpackaccess="$log" '"$*"

ln -f PACK-NEW/* .git/objects/pack/. || exit
log="$tmp.new"
eval '/usr/bin/time rungit test -c core.logpackaccess="$log" '"$*"

perl OFS.perl "$tmp.old" "$tmp.new"
-- 8< -- DOIT.sh -- 8< --

-- >8 -- OFS.perl -- >8 --
#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my $verbose;
exit(1) if (!GetOptions("verbose" => \$verbose));

sub take_one {
	my ($filename) = @_;
	my (%lofs, $num);
	my @diff;

	open my $in, '<', $filename;
	$num = 0;
	while (<$in>) {
		my ($file, $ofs) = split(' ');
		if (!exists $lofs{$file}) {
			$lofs{$file} = [$num++, 0];
		}
		my $diff = $ofs - $lofs{$file}[1];
		$lofs{$file}[1] = $ofs;
		push @diff, abs($diff);
		print "$lofs{$file}[0] $diff $ofs\n" if $verbose;
	}
	return \@diff;
}

sub bsearch {
	my ($list, $target) = @_;
	my ($hi, $lo) = ((scalar @$list), 0);
	while ($lo < $hi) {
		my $mi = int(($lo + $hi) / 2);
		if ($list->[$mi] == $target) {
			return $mi;
		} elsif ($list->[$mi] < $target) {
			$lo = $mi + 1;
		} else {
			$hi = $mi;
		}
	}
	return $hi;
}

my @percentile = ();
for (my $i = 0; $i < 100; $i += 10) {
	push @percentile, $i;
}
push @percentile, 95, 99, 99.9, 99.99;

sub thcomma {
	my ($intval) = @_;
	my $result = "";
	while ($intval > 1000) {
		my $rem = $intval % 1000;
		if ($result ne "") {
			$result = sprintf "%03d,%s", $rem, $result;
		} else {
			$result = sprintf "%03d", $rem;
		}
		$intval -= $rem;
		$intval /= 1000;
	}
	if ($intval) {
		if ($result ne "") {
			$result = sprintf "%d,%s", $intval, $result;
		} else {
			$result = sprintf "%d", $intval;
		}
	}
	$result =~ s/^[0,]*//;
	$result = "0" if ($result eq "");
	return $result;
}

sub show_stat {
	my ($diff1, $diff2) = @_;
	my ($i, $ix);

	if ($diff2) {
		@$diff2 = sort { $a <=> $b } @$diff2;
	}
	@$diff1 = sort { $a <=> $b } @$diff1;

	printf "\nTotal number of access : %12s", thcomma(scalar(@$diff1));
	printf "%12s", thcomma(scalar(@$diff2)) if ($diff2);

	for $i (@percentile) {
		$ix = scalar(@$diff1) * $i / 100;
		printf "\n %5.2f%% percentile : %12s", $i, thcomma($diff1->[$ix]);
		if ($diff2) {
			$ix = scalar(@$diff2) * $i / 100;
			printf "%12s", thcomma($diff2->[$ix]);
		}
	}

	$ix = bsearch($diff1, 2 * 1024 * 1024);
	printf "\n Less than 2MiB seek : %5.2f%%", ($ix * 100.0 / @$diff1);
	if ($diff2) {
		$ix = bsearch($diff2, 2 * 1024 * 1024);
		printf " %5.2f%%", ($ix * 100.0 / @$diff2);
	}
	print "\n";
}

my ($diff1, $diff2);
$diff1 = take_one($ARGV[0]);
$diff2 = take_one($ARGV[1]) if ($ARGV[1]);
show_stat($diff1, $diff2);
-- 8< -- OFS.perl -- 8< --

^ permalink raw reply	[flat|nested] 10+ messages in thread
* [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively
  2011-10-18  5:21 [PATCH 1/4] pack-objects: mark add_to_write_order() as inline Dan McGee
  2011-10-18  5:21 ` [PATCH 2/4] pack-objects: use unsigned int for counter and offset values Dan McGee
  2011-10-18  5:21 ` [PATCH 3/4] pack-objects: don't traverse objects unnecessarily Dan McGee
@ 2011-10-18  5:21 ` Dan McGee
  2011-10-27 22:13   ` Junio C Hamano
  2 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This removes the need to call this function recursively, shrinking the
code size slightly and netting a small performance increase.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |   44 +++++++++++++++++++++++++++++++++++++-------
 1 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d9fb202..9efd1a7 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -468,12 +468,43 @@ static void add_descendants_to_write_order(struct object_entry **wo,
 					   unsigned int *endp,
 					   struct object_entry *e)
 {
-	struct object_entry *child;
-
-	for (child = e->delta_child; child; child = child->delta_sibling)
-		add_to_write_order(wo, endp, child);
-	for (child = e->delta_child; child; child = child->delta_sibling)
-		add_descendants_to_write_order(wo, endp, child);
+	int add_to_order = 1;
+	while (e) {
+		if (add_to_order) {
+			struct object_entry *s;
+			/* add this node... */
+			add_to_write_order(wo, endp, e);
+			/* all its siblings... */
+			for (s = e->delta_sibling; s; s = s->delta_sibling) {
+				add_to_write_order(wo, endp, s);
+			}
+		}
+		/* drop down a level to add left subtree nodes if possible */
+		if (e->delta_child) {
+			add_to_order = 1;
+			e = e->delta_child;
+		} else {
+			add_to_order = 0;
+			/* our sibling might have some children, it is next */
+			if (e->delta_sibling) {
+				e = e->delta_sibling;
+				continue;
+			}
+			/* go back to our parent node */
+			e = e->delta;
+			while (e && !e->delta_sibling) {
+				/* we're on the right side of a subtree, keep
+				 * going up until we can go right again */
+				e = e->delta;
+			}
+			if (!e) {
+				/* done- we hit our original root node */
+				return;
+			}
+			/* pass it off to sibling at this level */
+			e = e->delta_sibling;
+		}
+	};
 }
 
 static void add_family_to_write_order(struct object_entry **wo,
@@ -484,7 +515,6 @@ static void add_family_to_write_order(struct object_entry **wo,
 	struct object_entry *root;
 
 	for (root = e; root->delta; root = root->delta)
 		; /* nothing */
-	add_to_write_order(wo, endp, root);
 	add_descendants_to_write_order(wo, endp, root);
 }
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively
  2011-10-18  5:21 ` [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively Dan McGee
@ 2011-10-27 22:13   ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2011-10-27 22:13 UTC (permalink / raw)
To: Dan McGee; +Cc: git

Dan McGee <dpmcgee@gmail.com> writes:

> This removes the need to call this function recursively, shrinking the
> code size slightly and netting a small performance increase.
>
> Signed-off-by: Dan McGee <dpmcgee@gmail.com>

Tricky. As long as this is done after compute_write_order() populates
delta_sibling vs delta link in a consistent way, the new logic should
produce exactly the same result as the original code.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread