* Re: Performance problem, long run of identical hashes
2007-12-10 15:45 ` Nicolas Pitre
@ 2007-12-10 16:14 ` David Kastrup
2007-12-10 16:20 ` Jon Smirl
2007-12-10 19:39 ` David Kastrup
2 siblings, 0 replies; 7+ messages in thread
From: David Kastrup @ 2007-12-10 16:14 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jon Smirl, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
>> Running oprofile during my gcc repack shows this loop as the hottest
>> place in the code by far.
>
> Well, that is kind of expected.
>
>> I added some debug printfs which show that I
>> have a 100,000+ run of identical hash entries. Processing the 100,000
>> entries also causes RAM consumption to explode.
>
> That is impossible. If you look at the code where those hash entries
> are created in create_delta_index(), you'll notice a hard limit of
> HASH_LIMIT (currently 64) is imposed on the number of identical hash
> entries.
Well, impossible is a strong word to use with respect to code: bugs are
possible. However, we have the assertion
assert(packed_entry - (struct index_entry *)mem == entries);
in create_delta_index and that makes a pretty strong guarantee that the
culling of hash entries should be effective. So at least the overall
_number_ of entries should be consistent. If there is a bug, it might
be that they get garbled or something.
It is also not clear how this could cause an explosion of RAM
consumption in the loop.
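(For illustration, here is a small stand-alone sketch of the kind of
culling create_delta_index() performs -- simplified and with made-up
names, not the actual diff-delta.c code: a hash bucket that collects
more than HASH_LIMIT identical hashes is thinned down to about 64
evenly spaced entries.)

#include <stdio.h>
#include <stddef.h>

#define HASH_LIMIT 64

/* keep at most HASH_LIMIT entries out of n, evenly spaced; store their
 * indices in kept[] (room for HASH_LIMIT needed) and return the count */
static size_t cull_bucket(size_t n, size_t *kept)
{
	size_t i, step, count = 0;

	if (n <= HASH_LIMIT) {
		for (i = 0; i < n; i++)
			kept[count++] = i;
		return count;
	}
	step = n / HASH_LIMIT;	/* uniform stride through the bucket */
	for (i = 0; i < n && count < HASH_LIMIT; i += step)
		kept[count++] = i;
	return count;
}

int main(void)
{
	size_t kept[HASH_LIMIT];

	/* even a pathological bucket of 103817 identical hashes only
	 * contributes 64 index entries after culling */
	printf("%zu entries kept\n", cull_bucket(103817, kept));
	return 0;
}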
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 15:45 ` Nicolas Pitre
2007-12-10 16:14 ` David Kastrup
@ 2007-12-10 16:20 ` Jon Smirl
2007-12-10 16:30 ` Nicolas Pitre
2007-12-10 19:39 ` David Kastrup
2 siblings, 1 reply; 7+ messages in thread
From: Jon Smirl @ 2007-12-10 16:20 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Git Mailing List
On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
> > Running oprofile during my gcc repack shows this loop as the hottest
> > place in the code by far.
>
> Well, that is kind of expected.
>
> > I added some debug printfs which show that I
> > have a 100,000+ run of identical hash entries. Processing the 100,000
> > entries also causes RAM consumption to explode.
>
> That is impossible. If you look at the code where those hash entries
> are created in create_delta_index(), you'll notice a hard limit of
> HASH_LIMIT (currently 64) is imposed on the number of identical hash
> entries.
On 12/10/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> On 12/9/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > > + if (victim) {
> > > + sub_size = victim->remaining / 2;
> > > + list = victim->list + victim->list_size - sub_size;
> > > + while (sub_size && list[0]->hash &&
> > > + list[0]->hash == list[-1]->hash) {
> > > + list++;
> >
> > I think you needed to copy sub_size to another variable for this loop
>
> Copying sub_size was wrong. I believe you are checking for deltas on
> the same file. It's probably that chain of 103,817 deltas that can't
> be broken up.
At the end of a multi-threaded repack, one thread ends up with 45 minutes
of work after all the other threads have exited. That's because it
hits this loop and can't split the list any more.
If the lists can't hold more than 64 identical entries, why do I get caught
in this loop for 50,000+ iterations? If I remove this loop, the threads
are balanced right to the end.
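(To make that concrete, here is a small stand-alone simulation of the
split loop quoted above -- simplified types and made-up names, not the
actual threaded pack-objects code. With a run of equal hashes that
straddles the proposed split point and runs to the end of the list,
the split point walks all the way forward and nothing is left to hand
off to the idle thread:)

#include <stdio.h>
#include <stdlib.h>

struct entry { unsigned int hash; };

int main(void)
{
	unsigned int n = 200000, i, list_size, sub_size, moved = 0;
	struct entry *objs = calloc(n, sizeof(*objs));
	struct entry **list = malloc(n * sizeof(*list));
	struct entry **split;

	for (i = 0; i < n; i++) {
		/* distinct hashes up front, then one long run of equal
		 * hashes, like the 100,000+ run I am seeing */
		objs[i].hash = (i < 50000) ? i + 1 : 42;
		list[i] = &objs[i];
	}

	list_size = n;
	sub_size = list_size / 2;		/* half meant for another thread */
	split = list + list_size - sub_size;	/* proposed split point */
	while (sub_size && split[0]->hash &&
	       split[0]->hash == split[-1]->hash) {
		split++;			/* keep equal hashes together */
		sub_size--;
		moved++;
	}
	printf("split point moved by %u, %u entries left to hand off\n",
	       moved, sub_size);

	free(list);
	free(objs);
	return 0;
}

This prints a split point moved by 100000 with 0 entries left to hand
off, which is exactly the starvation I am seeing.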
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 16:20 ` Jon Smirl
@ 2007-12-10 16:30 ` Nicolas Pitre
0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Pitre @ 2007-12-10 16:30 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
On Mon, 10 Dec 2007, Jon Smirl wrote:
> On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> > > Running oprofile during my gcc repack shows this loop as the hottest
> > > place in the code by far.
> >
> > Well, that is kind of expected.
> >
> > > I added some debug printfs which show that I
> > > have a 100,000+ run of identical hash entries. Processing the 100,000
> > > entries also causes RAM consumption to explode.
> >
> > That is impossible. If you look at the code where those hash entries
> > are created in create_delta_index(), you'll notice a hard limit of
> > HASH_LIMIT (currently 64) is imposed on the number of identical hash
> > entries.
>
> On 12/10/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > On 12/9/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > > > + if (victim) {
> > > > + sub_size = victim->remaining / 2;
> > > > + list = victim->list + victim->list_size - sub_size;
> > > > + while (sub_size && list[0]->hash &&
> > > > + list[0]->hash == list[-1]->hash) {
> > > > + list++;
> > >
> > > I think you needed to copy sub_size to another variable for this loop
> >
> > Copying sub_size was wrong. I believe you are checking for deltas on
> > the same file. It's probably that chain of 103,817 deltas that can't
> > be broken up.
>
> At the end of a multi-threaded repack, one thread ends up with 45 minutes
> of work after all the other threads have exited. That's because it
> hits this loop and can't split the list any more.
>
> If the lists can't hold more than 64 identical entries, why do I get caught
> in this loop for 50,000+ iterations? If I remove this loop, the threads
> are balanced right to the end.
Completely different issue.
Please read my other answers.
Nicolas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 15:45 ` Nicolas Pitre
2007-12-10 16:14 ` David Kastrup
2007-12-10 16:20 ` Jon Smirl
@ 2007-12-10 19:39 ` David Kastrup
2007-12-10 20:11 ` Nicolas Pitre
2 siblings, 1 reply; 7+ messages in thread
From: David Kastrup @ 2007-12-10 19:39 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jon Smirl, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
>> Running oprofile during my gcc repack shows this loop as the hottest
>> place in the code by far.
>
> Well, that is kind of expected.
>
>> I added some debug printfs which show that I
>> have a 100,000+ run of identical hash entries. Processing the 100,000
>> entries also causes RAM consumption to explode.
>
> That is impossible. If you look at the code where those hash entries
> are created in create_delta_index(), you'll notice a hard limit of
> HASH_LIMIT (currently 64) is imposed on the number of identical hash
> entries.
On the other hand, if we have, say, laaaaaarge streaks of zeros, what
happens is that we have 64 hashes seeing them. Now about 4096 bytes are
compared, and then the comparison stops. Then it scans backwards to
seek for more zeros (and finds about 64k of them before it stops) and
folds them into the current compacted form. Each of these backward
scans (of which we have 64 in the worst case) is in a different memory
area. So since we scan/compare areas of 64k for each advance of 4k, we
have an overscanning factor of 16 (for a worst case scenario).
Not sure whether this is what we are seeing here. It would still not
explain the exploding memory usage, I think.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 19:39 ` David Kastrup
@ 2007-12-10 20:11 ` Nicolas Pitre
0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Pitre @ 2007-12-10 20:11 UTC (permalink / raw)
To: David Kastrup; +Cc: Jon Smirl, Git Mailing List
On Mon, 10 Dec 2007, David Kastrup wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> >> I added some debug printfs which show that I
> >> have a 100,000+ run of identical hash entries. Processing the 100,000
> >> entries also causes RAM consumption to explode.
> >
> > That is impossible. If you look at the code where those hash entries
> > are created in create_delta_index(), you'll notice a hard limit of
> > HASH_LIMIT (currently 64) is imposed on the number of identical hash
> > entries.
>
> On the other hand, if we have, say, laaaaaarge streaks of zeros, what
> happens is that we have 64 hashes seeing them. Now about 4096 bytes are
> compared, and then the comparison stops.
No.
What would happen in that case is that the first hash entry pointing
somewhere in the beginning of the zero chunk will match _at least_ 4096
bytes, probably much more, up to the end of the buffer if that is
possible.
> Then it scans backwards to
> seek for more zeros (and finds about 64k of them before it stops)
The first hash entry for those zeroes is going to point at an offset not
greater than 15 bytes from the start of the zero area. So, unless both
buffers are going to match even before that zero area, the backward scan
will stop quickly.
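(A tiny sketch of that arithmetic, under the simplifying assumption
that indexed window positions fall on a fixed RABIN_WINDOW stride of
16 bytes, as create_delta_index() does; the helper name is made up:)

#include <stdio.h>

#define RABIN_WINDOW 16

/* first indexed window position at or after offset z */
static unsigned long first_entry_at_or_after(unsigned long z)
{
	return (z + RABIN_WINDOW - 1) / RABIN_WINDOW * RABIN_WINDOW;
}

int main(void)
{
	unsigned long z = 1000003;	/* arbitrary start of a long zero run */
	unsigned long e = first_entry_at_or_after(z);

	/* the entry is never more than 15 bytes into the run */
	printf("run starts at %lu, first entry at %lu (%lu bytes in)\n",
	       z, e, e - z);
	return 0;
}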
> folds them into the current compacted form. Each of these backward
> scans (of which we have 64 in the worst case) is in a different memory
> area. So since we scan/compare areas of 64k for each advance of 4k, we
> have an overscanning factor of 16 (for a worst case scenario).
Actually, if we match 100MB of zeroes, we'll just encode that as a
series of 64KB copy operations, and continue scanning the buffer after
that 100MB.
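(Roughly like this -- a hypothetical sketch of the capping, with
made-up names, not the actual create_delta() encoder:)

#include <stdio.h>

#define MAX_COPY 0x10000UL	/* 64KB covered by a single copy operation */

/* count how many copy operations one long match turns into */
static unsigned long copy_ops_for_match(unsigned long len)
{
	unsigned long nops = 0;

	while (len) {
		unsigned long chunk = len < MAX_COPY ? len : MAX_COPY;
		/* a real encoder would emit a copy opcode for chunk bytes */
		len -= chunk;
		nops++;
	}
	return nops;
}

int main(void)
{
	/* 100MB matched in one go: 1600 copies of 64KB, then scanning
	 * simply resumes after the matched region */
	printf("%lu copy ops\n", copy_ops_for_match(100UL * 1024 * 1024));
	return 0;
}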
So I don't see where your scan/compare of 64k areas for each advance of
4k comes from.
> Not sure whether this is what we are seeing here. It would still not
> explain the exploding memory usage, I think.
Right.
Nicolas
^ permalink raw reply [flat|nested] 7+ messages in thread