* [PATCH] speedup allocation in pack-redundant.c
@ 2005-11-22 14:56 Alex Riesen
  2005-11-22 20:41 ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Riesen @ 2005-11-22 14:56 UTC (permalink / raw)
  To: Lukas Sandström; +Cc: git, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 33 bytes --]

Reuse discarded nodes of llists

[-- Attachment #2: 0001-speedup-allocation-in-pack-redundant.c.txt --]
[-- Type: text/plain, Size: 2117 bytes --]

Subject: [PATCH] speedup allocation in pack-redundant.c

Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>


---

 pack-redundant.c |   32 ++++++++++++++++++++++++++------
 1 files changed, 26 insertions(+), 6 deletions(-)

applies-to: 0a8441bc998f995dd35380472314802f53c6e1f3
738ce6cef594ae09b89372859d28f37b1bef9aa1
diff --git a/pack-redundant.c b/pack-redundant.c
index 1519385..3681170 100644
--- a/pack-redundant.c
+++ b/pack-redundant.c
@@ -36,11 +36,31 @@ struct pll {
 	size_t pl_size;
 };
 
-static inline void llist_free(struct llist *list)
+static struct llist_item *free_nodes = NULL;
+
+static inline struct llist_item *llist_item_get()
+{
+	struct llist_item *new;
+	if ( free_nodes ) {
+		new = free_nodes;
+		free_nodes = free_nodes->next;
+	} else
+		new = xmalloc(sizeof(struct llist_item));
+
+	return new;
+}
+
+static inline void llist_item_put(struct llist_item *item)
+{
+	item->next = free_nodes;
+	free_nodes = item;
+}
+
+static void llist_free(struct llist *list)
 {
 	while((list->back = list->front)) {
 		list->front = list->front->next;
-		free(list->back);
+		llist_item_put(list->back);
 	}
 	free(list);
 }
@@ -62,13 +82,13 @@ static struct llist * llist_copy(struct 
 	if ((ret->size = list->size) == 0)
 		return ret;
 
-	new = ret->front = xmalloc(sizeof(struct llist_item));
+	new = ret->front = llist_item_get();
 	new->sha1 = list->front->sha1;
 
 	old = list->front->next;
 	while (old) {
 		prev = new;
-		new = xmalloc(sizeof(struct llist_item));
+		new = llist_item_get();
 		prev->next = new;
 		new->sha1 = old->sha1;
 		old = old->next;
@@ -82,7 +102,7 @@ static struct llist * llist_copy(struct 
 static inline struct llist_item * llist_insert(struct llist *list,
 					struct llist_item *after, char *sha1)
 {
-	struct llist_item *new = xmalloc(sizeof(struct llist_item));
+	struct llist_item *new = llist_item_get();
 	new->sha1 = sha1;
 	new->next = NULL;
 
@@ -153,7 +173,7 @@ redo_from_start:
 				prev->next = l->next;
 			if (l == list->back)
 				list->back = prev;
-			free(l);
+			llist_item_put(l);
 			list->size--;
 			return prev;
 		}
---
0.99.9.GIT
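
For readers following along outside the git tree, the node-reuse idea in the patch above can be sketched as a standalone fragment. This is a minimal sketch, not the code in pack-redundant.c: `struct node`, `node_get`, `node_put` and the `malloc_calls` counter are hypothetical names chosen for illustration, and plain `malloc` stands in for git's `xmalloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct llist_item in pack-redundant.c. */
struct node {
	struct node *next;
	const char *sha1;
};

static struct node *free_nodes = NULL;	/* head of the list of recycled nodes */
static size_t malloc_calls = 0;		/* instrumentation for the sketch only */

/* Pop a recycled node if one is available, otherwise fall back to malloc. */
static struct node *node_get(void)
{
	struct node *n;
	if (free_nodes) {
		n = free_nodes;
		free_nodes = free_nodes->next;
	} else {
		n = malloc(sizeof(*n));
		assert(n != NULL);	/* assumption: xmalloc would die here instead */
		malloc_calls++;
	}
	return n;
}

/* Push a discarded node onto the free list instead of calling free(). */
static void node_put(struct node *n)
{
	n->next = free_nodes;
	free_nodes = n;
}
```

Nodes handed to `node_put` come back from `node_get` in last-in, first-out order, so a workload that repeatedly builds and discards lists stops calling `malloc` once the free list is warm. The nodes are never returned to the system, which is acceptable for a short-lived process.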

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 14:56 [PATCH] speedup allocation in pack-redundant.c Alex Riesen
@ 2005-11-22 20:41 ` Junio C Hamano
  2005-11-22 22:48   ` Lukas Sandström
  2005-11-22 23:00   ` Alex Riesen
  0 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2005-11-22 20:41 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Lukas Sandström, git

Alex Riesen <raa.lkml@gmail.com> writes:

> Subject: [PATCH] speedup allocation in pack-redundant.c
>
> Reuse discarded nodes of llists
>
> Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>

I think making allocation/deallocation to the central place is a
good cleanup, but I am not sure about the free-nodes reusing.
Does this make difference in real life?  If so, it might be
worth doing the slab-like allocation, since free-nodes are very
small structure and malloc overhead is not ignorable there.

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 20:41 ` Junio C Hamano
@ 2005-11-22 22:48   ` Lukas Sandström
  2005-11-22 23:08     ` Junio C Hamano
  2005-11-22 23:46     ` Alex Riesen
  2005-11-22 23:00   ` Alex Riesen
  1 sibling, 2 replies; 10+ messages in thread
From: Lukas Sandström @ 2005-11-22 22:48 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alex Riesen, Lukas Sandström

Junio C Hamano wrote:
> Alex Riesen <raa.lkml@gmail.com> writes:
> 
> 
>>Subject: [PATCH] speedup allocation in pack-redundant.c
>>
>>Reuse discarded nodes of llists
>>
>>Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>
> 
> 
> I think making allocation/deallocation to the central place is a
> good cleanup, but I am not sure about the free-nodes reusing.
> Does this make difference in real life?  If so, it might be
> worth doing the slab-like allocation, since free-nodes are very
> small structure and malloc overhead is not ignorable there.
> 
> 
I have done some tests, and unfortunately I saw approx. zero
improvement with Alex's patch. (less than 10ms difference when
total runtime is 1.850s, tested on http://home.arcor.de/fork0/download/idx.tar.gz)

Did someone else notice an improvement?

It's a nice idea though. I'll look into doing slab-allocation
for the fun of it, but I'm not really sure that malloc is the
bottleneck.

/Lukas

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 20:41 ` Junio C Hamano
  2005-11-22 22:48   ` Lukas Sandström
@ 2005-11-22 23:00   ` Alex Riesen
  2005-11-22 23:14     ` Lukas Sandström
  1 sibling, 1 reply; 10+ messages in thread
From: Alex Riesen @ 2005-11-22 23:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lukas Sandström, git

Junio C Hamano, Tue, Nov 22, 2005 21:41:56 +0100:
> > Reuse discarded nodes of llists
> >
> > Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>
> 
> I think making allocation/deallocation to the central place is a
> good cleanup, but I am not sure about the free-nodes reusing.
> Does this make difference in real life?

It definitely does, though not very much. I have no real numbers at
hand (being home now), but I remember it was 1 min with against 3 min
without the patch on cygwin+fat32, which is already bad enough all by
itself. Very big repository with no redundant packs in it.

> If so, it might be worth doing the slab-like allocation, since
> free-nodes are very small structure and malloc overhead is not
> ignorable there.

Like this?

    if ( free_nodes ) { ... }
    else {
	int i;
	struct llist_item *slab = xmalloc(sizeof(*slab) * BLKCNT);
	for ( i = 0; i < BLKCNT; ++i ) {
	    slab->next = free_nodes;
	    free_nodes = slab++;
	}
    }
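
Spelled out in full, the slab idea discussed in this message could look roughly like the fragment below. It is a hedged sketch rather than anything from the git tree: `struct slab_item`, `slab_get`, `slab_put`, `SLAB_COUNT` and the `slab_mallocs` counter are hypothetical names, and `malloc` plus `assert` stand in for `xmalloc`.

```c
#include <assert.h>
#include <stdlib.h>

#define SLAB_COUNT 1024	/* hypothetical block size, playing the role of BLKCNT */

/* Hypothetical stand-in for struct llist_item. */
struct slab_item {
	struct slab_item *next;
	const char *sha1;
};

static struct slab_item *slab_free = NULL;	/* free list shared by all slabs */
static size_t slab_mallocs = 0;			/* instrumentation for the sketch only */

/* Return one item, carving a new slab into the free list when it is empty. */
static struct slab_item *slab_get(void)
{
	struct slab_item *item;
	if (!slab_free) {
		size_t i;
		struct slab_item *slab = malloc(sizeof(*slab) * SLAB_COUNT);
		assert(slab != NULL);	/* assumption: xmalloc would die here instead */
		slab_mallocs++;
		for (i = 0; i < SLAB_COUNT; i++) {
			slab[i].next = slab_free;
			slab_free = &slab[i];
		}
	}
	item = slab_free;
	slab_free = slab_free->next;
	return item;
}

/* Give an item back; it is recycled, never passed to free(). */
static void slab_put(struct slab_item *item)
{
	item->next = slab_free;
	slab_free = item;
}
```

One `malloc` call now serves `SLAB_COUNT` nodes, so the per-allocation bookkeeping overhead is paid once per slab rather than once per node. The trade-off is that individual items must not be passed to `free()`, since they point into the middle of a larger block.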

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 22:48   ` Lukas Sandström
@ 2005-11-22 23:08     ` Junio C Hamano
  2005-11-22 23:46     ` Alex Riesen
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2005-11-22 23:08 UTC (permalink / raw)
  To: Lukas Sandström; +Cc: git

Lukas Sandström <lukass@etek.chalmers.se> writes:

> Did someone else notice an improvement?

Not me.  I merged it only for its clean-up value, not immediate
performance reasons.

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 23:00   ` Alex Riesen
@ 2005-11-22 23:14     ` Lukas Sandström
  2005-11-22 23:38       ` Alex Riesen
  0 siblings, 1 reply; 10+ messages in thread
From: Lukas Sandström @ 2005-11-22 23:14 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Junio C Hamano

Alex Riesen wrote:
> Junio C Hamano, Tue, Nov 22, 2005 21:41:56 +0100:
>>I think making allocation/deallocation to the central place is a
>>good cleanup, but I am not sure about the free-nodes reusing.
>>Does this make difference in real life?
> 
> 
> It definitely does, though not very much. I have no real numbers at
> hand (being home now), but I remember it was 1 min with against 3 min
> without the patch on cygwin+fat32, which is already bad enough all by
> itself. Very big repository with no redundant packs in it.
> 

Would you mind sharing the .idx files?

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 23:14     ` Lukas Sandström
@ 2005-11-22 23:38       ` Alex Riesen
  2005-11-22 23:55         ` Lukas Sandström
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Riesen @ 2005-11-22 23:38 UTC (permalink / raw)
  To: Lukas Sandström; +Cc: git, Junio C Hamano

Lukas Sandström, Wed, Nov 23, 2005 00:14:53 +0100:
> >>I think making allocation/deallocation to the central place is a
> >>good cleanup, but I am not sure about the free-nodes reusing.
> >>Does this make difference in real life?
> > 
> > It definitely does, though not very much. I have no real numbers at
> > hand (being home now), but I remember it was 1 min with against 3 min
> > without the patch on cygwin+fat32, which is already bad enough all by
> > itself. Very big repository with no redundant packs in it.
> 
> Would you mind sharing the .idx files?

this time I probably would (they're not here)... But for performance
testing any big repository will do, linux kernel, for example.

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 22:48   ` Lukas Sandström
  2005-11-22 23:08     ` Junio C Hamano
@ 2005-11-22 23:46     ` Alex Riesen
  1 sibling, 0 replies; 10+ messages in thread
From: Alex Riesen @ 2005-11-22 23:46 UTC (permalink / raw)
  To: Lukas Sandström; +Cc: git, Junio C Hamano

Lukas Sandström, Tue, Nov 22, 2005 23:48:51 +0100:
> >>Subject: [PATCH] speedup allocation in pack-redundant.c
> >>Reuse discarded nodes of llists
> >>Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>
> > 
> > I think making allocation/deallocation to the central place is a
> > good cleanup, but I am not sure about the free-nodes reusing.
> > Does this make difference in real life?  If so, it might be
> > worth doing the slab-like allocation, since free-nodes are very
> > small structure and malloc overhead is not ignorable there.
> > 
> I have done some tests, and unfortunately I saw approx. zero
> improvement with Alex's patch. (less than 10ms difference when
> total runtime is 1.850s, tested on http://home.arcor.de/fork0/download/idx.tar.gz)

Can I suggest you try it in a really really weird environment? Like
Cygwin. And switch some virus scanner on.

> Did someone else notice an improvement?

My test case had over 100k files in it (just don't ask why. Weird
environments, weird projects, ...)

> It's a nice idea though. I'll look into doing slab-allocation
> for the fun of it, but I'm not really sure that malloc is the
> bottleneck.

Yes, it usually is not a bottleneck. I think this is just another
exception.

* Re: [PATCH] speedup allocation in pack-redundant.c
  2005-11-22 23:38       ` Alex Riesen
@ 2005-11-22 23:55         ` Lukas Sandström
  2005-11-23  7:31           ` Alex Riesen
  0 siblings, 1 reply; 10+ messages in thread
From: Lukas Sandström @ 2005-11-22 23:55 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Junio C Hamano

Alex Riesen wrote:
> Lukas Sandström, Wed, Nov 23, 2005 00:14:53 +0100:
> 
>>>>I think making allocation/deallocation to the central place is a
>>>>good cleanup, but I am not sure about the free-nodes reusing.
>>>>Does this make difference in real life?
>>>
>>>It definitely does, though not very much. I have no real numbers at
>>>hand (being home now), but I remember it was 1 min with against 3 min
>>>without the patch on cygwin+fat32, which is already bad enough all by
>>>itself. Very big repository with no redundant packs in it.
>>
>>Would you mind sharing the .idx files?
> 
> 
> this time I probably would (they're not here)... But for performance
> testing any big repository will do, linux kernel, for example.
> 
The problem is that the large repository I have contains lots of
redundant packs, which makes it quite fast to find a complete set
and end the search. If you don't have any redundant packs, the
complete set search really is 2**n (n = the number of packs).
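
To make the 2**n growth concrete, here is a minimal brute-force sketch. It is not how pack-redundant.c organises its search; the bitmask encoding of pack contents and the names `smallest_cover` and `popcount_u` are assumptions made purely for illustration.

```c
#include <assert.h>

/* Count the set bits in a subset mask. */
static int popcount_u(unsigned v)
{
	int c = 0;
	for (; v; v &= v - 1)
		c++;
	return c;
}

/*
 * Try every one of the 2**n subsets of n packs (n < 32 assumed).
 * packs[i] is a bitmask of the objects held by pack i (hypothetical
 * encoding). Returns the size of the smallest subset whose union
 * equals `all`, or -1 if no subset covers it. *examined receives
 * the number of subsets that were tried.
 */
static int smallest_cover(const unsigned *packs, int n, unsigned all,
			  unsigned long *examined)
{
	unsigned subset;
	int best = -1;

	*examined = 0;
	for (subset = 0; subset < (1u << n); subset++) {
		unsigned have = 0;
		int i;
		for (i = 0; i < n; i++)
			if (subset & (1u << i))
				have |= packs[i];
		(*examined)++;
		if (have == all) {
			int size = popcount_u(subset);
			if (best < 0 || size < best)
				best = size;
		}
	}
	return best;
}
```

With no early exit, every additional pack doubles the number of subsets examined, which is why a repository without redundant packs is the worst case for this kind of search.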

I did some quick experiments with slab allocation and got a 4.4%
improvement on the redundant repo, so that might be worth pursuing.
(Concept patch below)

diff --git a/pack-redundant.c b/pack-redundant.c
index b38baa9..05294f8 100644
--- a/pack-redundant.c
+++ b/pack-redundant.c
@@ -8,6 +8,8 @@
 
 #include "cache.h"
 
+#define BLKSIZE 1024
+
 static const char pack_redundant_usage[] =
 "git-pack-redundant [ --verbose ] [ --alt-odb ] < --all | <.pack filename> ...>";
 
@@ -38,24 +40,28 @@ struct pll {
 
 static struct llist_item *free_nodes = NULL;
 
+static inline void llist_item_put(struct llist_item *item)
+{
+	item->next = free_nodes;
+	free_nodes = item;
+}
+
 static inline struct llist_item *llist_item_get()
 {
 	struct llist_item *new;
 	if ( free_nodes ) {
 		new = free_nodes;
 		free_nodes = free_nodes->next;
-	} else
-		new = xmalloc(sizeof(struct llist_item));
-
+	} else {
+		int i = 1;
+		new = xmalloc(sizeof(struct llist_item) * BLKSIZE);
+		for(;i < BLKSIZE; i++) {
+			llist_item_put(&new[i]);
+		}
+	}
 	return new;
 }
 
-static inline void llist_item_put(struct llist_item *item)
-{
-	item->next = free_nodes;
-	free_nodes = item;
-}
-
 static void llist_free(struct llist *list)
 {
 	while((list->back = list->front)) {

* Re: speedup allocation in pack-redundant.c
  2005-11-22 23:55         ` Lukas Sandström
@ 2005-11-23  7:31           ` Alex Riesen
  0 siblings, 0 replies; 10+ messages in thread
From: Alex Riesen @ 2005-11-23  7:31 UTC (permalink / raw)
  To: Lukas Sandström; +Cc: git, Junio C Hamano

On 11/23/05, Lukas Sandström <lukass@etek.chalmers.se> wrote:
> >>>>I think making allocation/deallocation to the central place is a
> >>>>good cleanup, but I am not sure about the free-nodes reusing.
> >>>>Does this make difference in real life?
> >>>
> >>>It definitely does, though not very much. I have no real numbers at
> >>>hand (being home now), but I remember it was 1 min with against 3 min
> >>>without the patch on cygwin+fat32, which is already bad enough all by
> >>>itself. Very big repository with no redundant packs in it.
> >>
> >>Would you mind sharing the .idx files?
> >
> > this time I probably would (they're not here)... But for performance
> > testing any big repository will do, linux kernel, for example.
> >
> The problem is that the large repository I have contains lots of
> redundant packs, which makes it quite fast to find a complete set
> and end the search. If you don't have any redundant packs, the
> complete set search really is 2**n (n = the number of packs).
>
> I did some quick experiments with slab allocation and got a 4.4%
> improvement on the redundant repo, so that might be worth pursuing.
> (Concept patch below)
>

I don't have the old packs anymore, but I benchmarked all three
allocation types anyway:

malloc/free:

$ time git-pack-redundant --all --alt-odb
real    0m0.092s
user    0m0.108s
sys     0m0.015s

simple node reuse (the patch in official tree):

$ time git-pack-redundant --all --alt-odb
real    0m0.074s
user    0m0.093s
sys     0m0.015s

slab node allocation (your concept patch):

$ time git-pack-redundant --all --alt-odb
real    0m0.031s
user    0m0.046s
sys     0m0.015s

This repository has one pack and 17758 files.
