* [PATCH 1/4] pack-objects: mark add_to_write_order() as inline
@ 2011-10-18  5:21 Dan McGee
  2011-10-18  5:21 ` [PATCH 2/4] pack-objects: use unsigned int for counter and offset values Dan McGee
  ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This function is a whole 26 bytes when compiled on x86_64, but is
currently invoked over 1.037 billion times when running pack-objects on
the Linux kernel git repository. This is hitting the point where
micro-optimizations do make a difference, and inlining it only increases
the object file size by 38 bytes.

As reported by perf, this dropped task-clock from 84183 to 83373 ms, and
total cycles from 223.5 billion to 221.6 billion. Not astronomical, but
worth getting for adding one word.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2b18de5..0ab3a3b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -454,7 +454,7 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 	return 0;
 }
 
-static void add_to_write_order(struct object_entry **wo,
+static inline void add_to_write_order(struct object_entry **wo,
 			       int *endp,
 			       struct object_entry *e)
 {
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* [PATCH 2/4] pack-objects: use unsigned int for counter and offset values
  2011-10-18  5:21 [PATCH 1/4] pack-objects: mark add_to_write_order() as inline Dan McGee
@ 2011-10-18  5:21 ` Dan McGee
  2011-10-18  5:21 ` [PATCH 3/4] pack-objects: don't traverse objects unnecessarily Dan McGee
  2011-10-18  5:21 ` [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively Dan McGee
  2 siblings, 0 replies; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This is done in some of the new pack layout code introduced in commit
1b4bb16b9ec331c. This more closely matches the nr_objects global, which
is unsigned, and which these variables are based off of and bounded by.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0ab3a3b..0de10d2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -455,7 +455,7 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 }
 
 static inline void add_to_write_order(struct object_entry **wo,
-			       int *endp,
+			       unsigned int *endp,
 			       struct object_entry *e)
 {
 	if (e->filled)
@@ -465,7 +465,7 @@ static inline void add_to_write_order(struct object_entry **wo,
 }
 
 static void add_descendants_to_write_order(struct object_entry **wo,
-					   int *endp,
+					   unsigned int *endp,
 					   struct object_entry *e)
 {
 	struct object_entry *child;
@@ -477,7 +477,7 @@ static void add_descendants_to_write_order(struct object_entry **wo,
 }
 
 static void add_family_to_write_order(struct object_entry **wo,
-				      int *endp,
+				      unsigned int *endp,
 				      struct object_entry *e)
 {
 	struct object_entry *root;
@@ -490,7 +490,7 @@ static void add_family_to_write_order(struct object_entry **wo,
 
 static struct object_entry **compute_write_order(void)
 {
-	int i, wo_end;
+	unsigned int i, wo_end;
 	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
 
@@ -506,8 +506,8 @@ static struct object_entry **compute_write_order(void)
 	 * Make sure delta_sibling is sorted in the original
 	 * recency order.
 	 */
-	for (i = nr_objects - 1; 0 <= i; i--) {
-		struct object_entry *e = &objects[i];
+	for (i = nr_objects; i > 0;) {
+		struct object_entry *e = &objects[--i];
 		if (!e->delta)
 			continue;
 		/* Mark me as the first child */
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-10-18  5:21 [PATCH 1/4] pack-objects: mark add_to_write_order() as inline Dan McGee
  2011-10-18  5:21 ` [PATCH 2/4] pack-objects: use unsigned int for counter and offset values Dan McGee
@ 2011-10-18  5:21 ` Dan McGee
  2011-10-27 22:26   ` Junio C Hamano
  2011-10-18  5:21 ` [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively Dan McGee
  2 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This brings back some of the performance lost in optimizing recency
order inside pack objects. We were doing extreme amounts of object
re-traversal: for the 2.14 million objects in the Linux kernel
repository, we were calling add_to_write_order() over 1.03 billion times
(a 0.2% hit rate, making 99.8% of these calls extraneous).

Two optimizations take place here: we can start our objects array
iteration from a known point where we left off before we started trying
to find our tags, and we don't need to do the deep dives required by
add_family_to_write_order() if the object has already been marked as
filled.

These two optimizations bring some pretty spectacular results via
`perf stat`:

  task-clock:   83373 ms         --> 43800 ms          (50% faster)
  cycles:       221,633,461,676  --> 116,307,209,986   (47% fewer)
  instructions: 149,299,179,939  --> 122,998,800,184   (18% fewer)

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0de10d2..d9fb202 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -490,7 +490,7 @@ static void add_family_to_write_order(struct object_entry **wo,
 
 static struct object_entry **compute_write_order(void)
 {
-	unsigned int i, wo_end;
+	unsigned int i, wo_end, last_untagged;
 	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
 
@@ -521,7 +521,7 @@ static struct object_entry **compute_write_order(void)
 	for_each_tag_ref(mark_tagged, NULL);
 
 	/*
-	 * Give the commits in the original recency order until
+	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
 	for (i = wo_end = 0; i < nr_objects; i++) {
@@ -529,6 +529,7 @@ static struct object_entry **compute_write_order(void)
 			break;
 		add_to_write_order(wo, &wo_end, &objects[i]);
 	}
+	last_untagged = i;
 
 	/*
 	 * Then fill all the tagged tips.
@@ -541,7 +542,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all remaining commits and tags.
 	 */
-	for (i = 0; i < nr_objects; i++) {
+	for (i = last_untagged; i < nr_objects; i++) {
 		if (objects[i].type != OBJ_COMMIT &&
 		    objects[i].type != OBJ_TAG)
 			continue;
@@ -551,7 +552,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all the trees.
 	 */
-	for (i = 0; i < nr_objects; i++) {
+	for (i = last_untagged; i < nr_objects; i++) {
 		if (objects[i].type != OBJ_TREE)
 			continue;
 		add_to_write_order(wo, &wo_end, &objects[i]);
@@ -560,8 +561,13 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * Finally all the rest in really tight order
 	 */
-	for (i = 0; i < nr_objects; i++)
-		add_family_to_write_order(wo, &wo_end, &objects[i]);
+	for (i = last_untagged; i < nr_objects; i++) {
+		if (!objects[i].filled)
+			add_family_to_write_order(wo, &wo_end, &objects[i]);
+	}
+
+	if(wo_end != nr_objects)
+		die("ordered %u objects, expected %u", wo_end, nr_objects);
 
 	return wo;
 }
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-10-18  5:21 ` [PATCH 3/4] pack-objects: don't traverse objects unnecessarily Dan McGee
@ 2011-10-27 22:26   ` Junio C Hamano
  2011-11-09  4:31     ` Dan McGee
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2011-10-27 22:26 UTC (permalink / raw)
To: Dan McGee; +Cc: git

Dan McGee <dpmcgee@gmail.com> writes:

> Two optimizations take place here- we can start our objects array
> iteration from a known point where we left off before we started trying
> to find our tags,

This I would understand (but I am somewhat curious how much last_untagged
would advance relative to nr_objects for this half of the optimization to
be worth it), but...

> and we don't need to do the deep dives required by
> add_family_to_write_order() if the object has already been marked as
> filled.

I am not sure if this produces the identical result that was benchmarked
in the original series.

For example, if you have a tagged object that is not a commit (say a
blob), you would have written that blob in the second phase (write tagged
objects together), so the family of blobs that share the same delta parent
as that blob will not be written in this "Finally all the rest" in the
right place in the original list, no?

I do not think this change would forget to fill an object that needs to be
filled, but it would affect the resulting ordering of the list, so...

> @@ -560,8 +561,13 @@ static struct object_entry **compute_write_order(void)
>  	/*
>  	 * Finally all the rest in really tight order
>  	 */
> -	for (i = 0; i < nr_objects; i++)
> -		add_family_to_write_order(wo, &wo_end, &objects[i]);
> +	for (i = last_untagged; i < nr_objects; i++) {
> +		if (!objects[i].filled)
> +			add_family_to_write_order(wo, &wo_end, &objects[i]);
> +	}
> +
> +	if(wo_end != nr_objects)
> +		die("ordered %u objects, expected %u", wo_end, nr_objects);

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-10-27 22:26 ` Junio C Hamano
@ 2011-11-09  4:31   ` Dan McGee
  2011-11-12  6:55     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-11-09  4:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git

On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Dan McGee <dpmcgee@gmail.com> writes:
>
>> Two optimizations take place here- we can start our objects array
>> iteration from a known point where we left off before we started trying
>> to find our tags,
>
> This I would understand (but I am somewhat curious how much last_untagged
> would advance relative to nr_objects for this half of the optimization to
> be worth it), but...

First off, sorry I wasn't able to respond to you until now; I've been a
bit swamped with other projects. I saw this made it into the 1.7.7.3
maint release today, but wanted to at least try to respond to the points
you raised (if you didn't investigate them by yourself already).

If I remember right, the last_untagged optimization was pretty minor-
around 10% advancement of the pointer, never that much more. However, it
was a very easy change to make so I figured it was worth the slight
additional code (~5 lines changed for that part on its own).

>> and we don't need to do the deep dives required by
>> add_family_to_write_order() if the object has already been marked as
>> filled.
>
> I am not sure if this produces the identical result that was benchmarked
> in the original series.

I was not either when I wrote the patch, and I had hoped to confirm the
results you showed in the message of 1b4bb16b9ec. However, I was unable
to figure out how you generated those numbers so I wasn't able to do so
(and had planned to get back to you to find out how you made those
tables). Were you able to verify the ordering did not regress?

> For example, if you have a tagged object that is not a commit (say a
> blob), you would have written that blob in the second phase (write tagged
> objects together), so the family of blobs that share same delta parent as
> that blob will not be written in this "Finally all the rest" in the right
> place in the original list, no?

True, I think. They would not be written in the same place, but is that
necessarily the right place? The delta parent of an already-filled tag
object would eventually come up in the array as not filled, and then we
would do a full deep dive at that point, so the majority of the objects
would still be in close proximity.

Note that either way, we still have a gap between the original tagged
object and its "family"- potentially all the other tagged tips, tags,
commits, and trees before the blobs are finally hit and laid out in
family groups. So there is at least one potentially big seek that can't
be avoided.

> I do not think this change would forget to fill an object that needs to be
> filled, but it would affect the resulting ordering of the list, so...

This was the purpose of the added "if (wo_end != nr_objects) die()"
line; it confirms we have traversed and hit every object we originally
found.

-Dan

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-11-09  4:31 ` Dan McGee
@ 2011-11-12  6:55   ` Junio C Hamano
  2011-11-13 22:34     ` Dan McGee
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2011-11-12  6:55 UTC (permalink / raw)
To: Dan McGee; +Cc: git

Dan McGee <dpmcgee@gmail.com> writes:

> On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> I am not sure if this produces the identical result that was benchmarked
>> in the original series.
> I was not either when I wrote the patch, and I had hoped to confirm
> the results you showed in the message of 1b4bb16b9ec.

I actually am reasonably sure the result will not be identical, but I also
do not think it matters. The differences would appear only for entries
that have been filled earlier, which should be a minority.

> unable to figure out how you generated those numbers so I wasn't able
> to do so (and had planned to get back to you to find out how you made
> those tables). Were you able to verify the ordering did not regress?

No; I was hoping you would redo the benchmark using 5f44324 (core: log
offset pack data accesses happened, 2011-07-06).

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-11-12  6:55 ` Junio C Hamano
@ 2011-11-13 22:34   ` Dan McGee
  2011-11-14  5:40     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-11-13 22:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: GIT Mailing-list

On Sat, Nov 12, 2011 at 12:55 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Dan McGee <dpmcgee@gmail.com> writes:
>
>> On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> I am not sure if this produces the identical result that was benchmarked
>>> in the original series.
>> I was not either when I wrote the patch, and I had hoped to confirm
>> the results you showed in the message of 1b4bb16b9ec.
>
> I actually am reasonably sure the result will not be identical, but I also
> do not think it matters. The differences would appear only for entries
> that have been filled earlier, which should be a minority.
>
>> unable to figure out how you generated those numbers so I wasn't able
>> to do so (and had planned to get back to you to find out how you made
>> those tables). Were you able to verify the ordering did not regress?
>
> No; I was hoping you would redo the benchmark using 5f44324 (core: log
> offset pack data accesses happened, 2011-07-06).

I'm still not sure what you used to parse these results, so I had to
spend a good amount of time writing up scripts to parse that output.

Anyway, I ran tests that nearly correspond to the ones you quoted on
four different pack-object versions as noted below. The first is one
revision before your original changes, the next two are
self-explanatory, and the final version is with the short diff included
below.

Observations:
* Perhaps I did something wrong, but the v1.7.7.2 numbers don't seem to
  agree with the figures you came up with- not sure why that is,
  however, and I've already spent a lot of time on this today and don't
  really have more time to sink into this.
* The diff included below seems to make a significant difference in
  some of the seek values.

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *git_log.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:            32             32             32             32
       10.0:           256            256            256            256
       20.0:           293            293            293            293
       30.0:           327            327            327            327
       40.0:           366            366            366            366
       50.0:           421            421            421            421
       60.0:           526            526            526            526
       70.0:           894            894            894            895
       80.0:         11612          11625          11622          11625
       90.0:         97405          97487          97487          97510
       95.0:        280123         280391         280396         280391
       99.0:       1251812        1254871        1252919        1253318
       99.5:       1850181        1853291        1853100        1853106
       99.9:       4008778        4013897        3988759        4013897
   accesses:        280450         280464         280457         280461
 <2MiB seek:         99.61          99.61          99.61          99.61

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *git_log_drivers_net.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:             0              0              0              0
       10.0:           144             46             46             46
       20.0:           233             48             48             48
       30.0:           317             98             97             97
       40.0:           512           1396           1367            990
       50.0:          2921         773060         786210         399996
       60.0:        774452       11594348       10156053        4532415
       70.0:     333258049      424530065      428113049      101687854
       80.0:     411869214      438733385      438510929      106316734
       90.0:     426972983      444510034      443993824      112362757
       95.0:     432061866      447253337      446466814      116078451
       99.0:     434915514      453076229      451430514      118597896
       99.5:     435032830      454359692      452394830      119008652
       99.9:     435123054      456005056      454017949      119605604
   accesses:        601405         600780         601732         601219
 <2MiB seek:         61.68          53.06          53.53          56.21

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *blame*.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:             0              0              0              0
       10.0:           137             46             46             46
       20.0:           192             48             48             48
       30.0:           309             97             97             97
       40.0:           774           5246           5244           4034
       50.0:         32585        2643798        2585078        1518706
       60.0:     376168864      434624162      434479253      102257691
       70.0:     415045893      438464313      438213313      104681256
       80.0:     425668210      441472744      441222839      107541644
       90.0:     430603653      445070643      444450126      112494824
       95.0:     433514511      447215535      446450183      116456428
       99.0:     435037297      453431874      451707047      118722564
       99.5:     435065325      454459123      452652312      119165598
       99.9:     435149193      456205377      454197863      119710538
   accesses:        199249         194708         194738         194875
 <2MiB seek:         54.31          49.25          49.43          50.94

dmcgee@galway ~/logs-git $ ~/projects/git/parse_pack_access.py *index*.txt
             1b4bb16b9ec~1       v1.7.7.2       v1.7.7.3  add-whole-family
        0.0:             9              9              9              9
       10.0:           137             45             45             45
       20.0:           224             47             47             47
       30.0:           315             71             71             71
       40.0:           449             96             96             96
       50.0:           808            164            165            165
       60.0:          2693            289            290            287
       70.0:         46913            456            458            448
       80.0:       1456359            966            975            905
       90.0:      12555961           3423           3486           2836
       95.0:      44134452          10211          10616           7075
       99.0:     269753078      304302120      314414250          35401
       99.5:     353346206      386205936      388585783          59897
       99.9:     408908212      414260557      415445310         244643
   accesses:       3050155        3045012        3045025        3045141
 <2MiB seek:         81.56          98.71          98.65          99.98

The scripts I used; there's obviously some hardcoded magic here, but you
should be able to use them if you want:

$ cat test_pack.sh
#!/bin/bash -e

versions=('1b4bb16b9ec~1' 'v1.7.7.2' 'v1.7.7.3' 'add-whole-family')
commands=('git log' 'git log drivers/net' 'git blame drivers/net/netconsole.c' 'git index-pack -v .git/objects/pack/*.pack')

gitdir=/home/dmcgee/projects/git
linuxdir=/home/dmcgee/projects/linux

export GIT_EXEC_DIR=$gitdir

for version in "${versions[@]}"; do
    echo $version
    cd $gitdir
    git checkout $version
    make -j6 CFLAGS="-march=native -mtune=native -O2 -pipe -g" PYTHON_PATH=/usr/bin/python2
    cd $linuxdir
    git config core.logpackaccess "/tmp/$version-repack.txt"
    $gitdir/git repack -a -d
    for command in "${commands[@]}"; do
        clean_cmd=${command//\//_}
        clean_cmd=${clean_cmd// /_}
        git config core.logpackaccess "/tmp/$version-$clean_cmd.txt"
        echo $command
        $gitdir/$command >/dev/null
    done
    git config --unset core.logpackaccess
done

$ cat parse_pack_access.py
#!/usr/bin/python2
from collections import defaultdict
import sys

def read_file(filename):
    packs = defaultdict(list)
    with open(filename, 'r') as data:
        for line in data.readlines():
            pack, position = line.strip().split(' ')
            packs[pack].append(int(position))
    return packs

def calculate_seeks(positions):
    seeks = []
    prev = positions[0]
    for position in positions[1:]:
        seeks.append(abs(position - prev))
        prev = position
    return sorted(seeks)

def bucket_seeks(seeks):
    percents = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0,
                80.0, 90.0, 95.0, 99.0, 99.5, 99.9]
    results = []
    for percent in percents:
        index = int(percent/100.0 * len(seeks))
        offset = seeks[index]
        results.append((percent, offset))
    return results

def print_result_line(label, results):
    print '%12s%s' % (label, ''.join('%18s' % result for result in results))

def main(filenames):
    known_versions = ['1b4bb16b9ec~1', 'v1.7.7.2', 'v1.7.7.3', 'add-whole-family']
    pack_accesses = {}
    for filename in filenames:
        pack_accesses[filename] = read_file(filename)

    bucket_table = defaultdict(list)
    accesses = []
    under_twomb = []
    for version in known_versions:
        filename = [fn for fn in pack_accesses.keys()
                    if fn.startswith(version)][0]
        access = pack_accesses[filename]
        for pack, positions in access.items():
            seeks = calculate_seeks(positions)
            results = bucket_seeks(seeks)
            for percent, offset in results:
                bucket_table[percent].append(offset)
            accesses.append(len(positions))
            under_twomb.append(sum(1 for s in seeks if s < 2 * 1024 * 1024)
                               * 100.0 / len(seeks))

    print_result_line('', known_versions)
    for k in sorted(bucket_table.keys()):
        print_result_line('%.1f:' % k, bucket_table[k])
    print_result_line('accesses:', accesses)
    print_result_line('<2MiB seek:', ('%.2f' % under for under in under_twomb))

if __name__ == '__main__':
    main(sys.argv[1:])

>From dfee7999b95442a5de2a7a0232c75262d13a28d6 Mon Sep 17 00:00:00 2001
From: Dan McGee <dpmcgee@gmail.com>
Date: Sun, 13 Nov 2011 15:07:13 -0600
Subject: [PATCH] Add whole family when packing objects

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 80ab6c3..0a9e761 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -566,7 +566,7 @@ static struct object_entry **compute_write_order(void)
 	 */
 	for (; i < nr_objects; i++) {
 		if (objects[i].tagged)
-			add_to_write_order(wo, &wo_end, &objects[i]);
+			add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
 	/*
@@ -576,7 +576,7 @@ static struct object_entry **compute_write_order(void)
 		if (objects[i].type != OBJ_COMMIT &&
 		    objects[i].type != OBJ_TAG)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
 	/*
@@ -585,7 +585,7 @@ static struct object_entry **compute_write_order(void)
 	for (i = last_untagged; i < nr_objects; i++) {
 		if (objects[i].type != OBJ_TREE)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
 	/*
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] pack-objects: don't traverse objects unnecessarily
  2011-11-13 22:34 ` Dan McGee
@ 2011-11-14  5:40   ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2011-11-14  5:40 UTC (permalink / raw)
To: Dan McGee; +Cc: GIT Mailing-list

Dan McGee <dpmcgee@gmail.com> writes:

>>> unable to figure out how you generated those numbers so I wasn't able
>>> to do so (and had planned to get back to you to find out how you made
>>> those tables). Were you able to verify the ordering did not regress?
>>
>> No; I was hoping you would redo the benchmark using 5f44324 (core: log
>> offset pack data accesses happened, 2011-07-06).
>
> I'm still not sure what you used to parse these results,...

Ah, in the kernel repository, after running "repack -a -d -f" with
versions of git and copying the resulting packfiles in PACK-OLD/ and
PACK-NEW/, I used these scripts to examine the access pattern.

-- >8 -- DOIT.sh -- >8 --
#!/bin/sh
tmp=/var/tmp/ll$
trap 'rm -f "$tmp.*"' 0

ln -f PACK-OLD/* .git/objects/pack/. || exit
log="$tmp.old"
eval '/usr/bin/time rungit test -c core.logpackaccess="$log" '"$*"

ln -f PACK-NEW/* .git/objects/pack/. || exit
log="$tmp.new"
eval '/usr/bin/time rungit test -c core.logpackaccess="$log" '"$*"

perl OFS.perl "$tmp.old" "$tmp.new"
-- 8< -- DOIT.sh -- 8< --

-- >8 -- OFS.perl -- >8 --
#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my $verbose;
exit(1) if (!GetOptions("verbose" => \$verbose));

sub take_one {
	my ($filename) = @_;
	my (%lofs, $num);
	my @diff;

	open my $in, '<', $filename;
	$num = 0;
	while (<$in>) {
		my ($file, $ofs) = split(' ');
		if (!exists $lofs{$file}) {
			$lofs{$file} = [$num++, 0];
		}
		my $diff = $ofs - $lofs{$file}[1];
		$lofs{$file}[1] = $ofs;
		push @diff, abs($diff);
		print "$lofs{$file}[0] $diff $ofs\n" if $verbose;
	}
	return \@diff;
}

sub bsearch {
	my ($list, $target) = @_;
	my ($hi, $lo) = ((scalar @$list), 0);
	while ($lo < $hi) {
		my $mi = int(($lo + $hi) / 2);
		if ($list->[$mi] == $target) {
			return $mi;
		} elsif ($list->[$mi] < $target) {
			$lo = $mi + 1;
		} else {
			$hi = $mi;
		}
	}
	return $hi;
}

my @percentile = ();
for (my $i = 0; $i < 100; $i += 10) {
	push @percentile, $i;
}
push @percentile, 95, 99, 99.9, 99.99;

sub thcomma {
	my ($intval) = @_;
	my $result = "";
	while ($intval > 1000) {
		my $rem = $intval % 1000;
		if ($result ne "") {
			$result = sprintf "%03d,%s", $rem, $result;
		} else {
			$result = sprintf "%03d", $rem;
		}
		$intval -= $rem;
		$intval /= 1000;
	}
	if ($intval) {
		if ($result ne "") {
			$result = sprintf "%d,%s", $intval, $result;
		} else {
			$result = sprintf "%d", $intval;
		}
	}
	$result =~ s/^[0,]*//;
	$result = "0" if ($result eq "");
	return $result;
}

sub show_stat {
	my ($diff1, $diff2) = @_;
	my ($i, $ix);

	if ($diff2) {
		@$diff2 = sort { $a <=> $b } @$diff2;
	}
	@$diff1 = sort { $a <=> $b } @$diff1;

	printf "\nTotal number of access : %12s", thcomma(scalar(@$diff1));
	printf "%12s", thcomma(scalar(@$diff2)) if ($diff2);

	for $i (@percentile) {
		$ix = scalar(@$diff1) * $i / 100;
		printf "\n %5.2f%% percentile : %12s", $i, thcomma($diff1->[$ix]);
		if ($diff2) {
			$ix = scalar(@$diff2) * $i / 100;
			printf "%12s", thcomma($diff2->[$ix]);
		}
	}

	$ix = bsearch($diff1, 2 * 1024 * 1024);
	printf "\n Less than 2MiB seek : %5.2f%%", ($ix * 100.0 / @$diff1);
	if ($diff2) {
		$ix = bsearch($diff2, 2 * 1024 * 1024);
		printf " %5.2f%%", ($ix * 100.0 / @$diff2);
	}
	print "\n";
}

my ($diff1, $diff2);
$diff1 = take_one($ARGV[0]);
$diff2 = take_one($ARGV[1]) if ($ARGV[1]);
show_stat($diff1, $diff2);
-- 8< -- OFS.perl -- 8< --

^ permalink raw reply	[flat|nested] 10+ messages in thread
* [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively
  2011-10-18  5:21 [PATCH 1/4] pack-objects: mark add_to_write_order() as inline Dan McGee
  2011-10-18  5:21 ` [PATCH 2/4] pack-objects: use unsigned int for counter and offset values Dan McGee
  2011-10-18  5:21 ` [PATCH 3/4] pack-objects: don't traverse objects unnecessarily Dan McGee
@ 2011-10-18  5:21 ` Dan McGee
  2011-10-27 22:13   ` Junio C Hamano
  2 siblings, 1 reply; 10+ messages in thread
From: Dan McGee @ 2011-10-18  5:21 UTC (permalink / raw)
To: git

This removes the need to call this function recursively, shrinking the
code size slightly and netting a small performance increase.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
---
 builtin/pack-objects.c |   44 +++++++++++++++++++++++++++++++++++++-------
 1 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d9fb202..9efd1a7 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -468,12 +468,43 @@ static void add_descendants_to_write_order(struct object_entry **wo,
 					   unsigned int *endp,
 					   struct object_entry *e)
 {
-	struct object_entry *child;
-
-	for (child = e->delta_child; child; child = child->delta_sibling)
-		add_to_write_order(wo, endp, child);
-	for (child = e->delta_child; child; child = child->delta_sibling)
-		add_descendants_to_write_order(wo, endp, child);
+	int add_to_order = 1;
+	while (e) {
+		if (add_to_order) {
+			struct object_entry *s;
+			/* add this node... */
+			add_to_write_order(wo, endp, e);
+			/* all its siblings... */
+			for (s = e->delta_sibling; s; s = s->delta_sibling) {
+				add_to_write_order(wo, endp, s);
+			}
+		}
+		/* drop down a level to add left subtree nodes if possible */
+		if (e->delta_child) {
+			add_to_order = 1;
+			e = e->delta_child;
+		} else {
+			add_to_order = 0;
+			/* our sibling might have some children, it is next */
+			if (e->delta_sibling) {
+				e = e->delta_sibling;
+				continue;
+			}
+			/* go back to our parent node */
+			e = e->delta;
+			while (e && !e->delta_sibling) {
+				/* we're on the right side of a subtree, keep
+				 * going up until we can go right again */
+				e = e->delta;
+			}
+			if (!e) {
+				/* done- we hit our original root node */
+				return;
+			}
+			/* pass it off to sibling at this level */
+			e = e->delta_sibling;
+		}
+	};
 }
 
 static void add_family_to_write_order(struct object_entry **wo,
@@ -484,7 +515,6 @@ static void add_family_to_write_order(struct object_entry **wo,
 	struct object_entry *root;
 
 	for (root = e; root->delta; root = root->delta)
 		; /* nothing */
-	add_to_write_order(wo, endp, root);
 	add_descendants_to_write_order(wo, endp, root);
 }
-- 
1.7.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively
  2011-10-18  5:21 ` [PATCH 4/4] pack-objects: rewrite add_descendants_to_write_order() iteratively Dan McGee
@ 2011-10-27 22:13   ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2011-10-27 22:13 UTC (permalink / raw)
To: Dan McGee; +Cc: git

Dan McGee <dpmcgee@gmail.com> writes:

> This removes the need to call this function recursively, shrinking the
> code size slightly and netting a small performance increase.
>
> Signed-off-by: Dan McGee <dpmcgee@gmail.com>

Tricky. As long as this is done after compute_write_order() populates
delta_sibling vs delta link in a consistent way, the new logic should
produce exactly the same result as the original code.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread