* Re: Performance problem, long run of identical hashes
2007-12-10 15:45 ` Nicolas Pitre
@ 2007-12-10 16:14 ` David Kastrup
2007-12-10 16:20 ` Jon Smirl
2007-12-10 19:39 ` David Kastrup
2 siblings, 0 replies; 7+ messages in thread
From: David Kastrup @ 2007-12-10 16:14 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jon Smirl, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
>> Running oprofile during my gcc repack shows this loop as the hottest
>> place in the code by far.
>
> Well, that is kind of expected.
>
>> I added some debug printfs which show that I
>> have a 100,000+ run of identical hash entries. Processing the 100,000
>> entries also causes RAM consumption to explode.
>
> That is impossible. If you look at the code where those hash entries
> are created in create_delta_index(), you'll notice a hard limit of
> HASH_LIMIT (currently 64) is imposed on the number of identical hash
> entries.
Well, impossible is a strong word to use with respect to code: bugs are
possible. However, we have the assertion
assert(packed_entry - (struct index_entry *)mem == entries);
in create_delta_index and that makes a pretty strong guarantee that the
culling of hash entries should be effective. So at least the overall
_number_ of entries should be consistent. If there is a bug, it might
be that they get garbled or something.
It is also not clear how this could cause an explosion of RAM
consumption in the loop.
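(For illustration, here is a small stand-alone sketch of the kind of
culling create_delta_index() performs -- simplified and with made-up
names, not the actual diff-delta.c code: a hash bucket that collects
more than HASH_LIMIT identical hashes is thinned down to about 64
evenly spaced entries.)

#include <stdio.h>
#include <stddef.h>

#define HASH_LIMIT 64

/* keep at most HASH_LIMIT entries out of n, evenly spaced; store their
 * indices in kept[] (room for HASH_LIMIT needed) and return the count */
static size_t cull_bucket(size_t n, size_t *kept)
{
	size_t i, step, count = 0;

	if (n <= HASH_LIMIT) {
		for (i = 0; i < n; i++)
			kept[count++] = i;
		return count;
	}
	step = n / HASH_LIMIT;	/* uniform stride through the bucket */
	for (i = 0; i < n && count < HASH_LIMIT; i += step)
		kept[count++] = i;
	return count;
}

int main(void)
{
	size_t kept[HASH_LIMIT];

	/* even a pathological bucket of 103817 identical hashes only
	 * contributes 64 index entries after culling */
	printf("%zu entries kept\n", cull_bucket(103817, kept));
	return 0;
}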
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 15:45 ` Nicolas Pitre
2007-12-10 16:14 ` David Kastrup
@ 2007-12-10 16:20 ` Jon Smirl
2007-12-10 16:30 ` Nicolas Pitre
2007-12-10 19:39 ` David Kastrup
2 siblings, 1 reply; 7+ messages in thread
From: Jon Smirl @ 2007-12-10 16:20 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Git Mailing List
On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
> > Running oprofile during my gcc repack shows this loop as the hottest
> > place in the code by far.
>
> Well, that is kind of expected.
>
> > I added some debug printfs which show that I
> > have a 100,000+ run of identical hash entries. Processing the 100,000
> > entries also causes RAM consumption to explode.
>
> That is impossible. If you look at the code where those hash entries
> are created in create_delta_index(), you'll notice a hard limit of
> HASH_LIMIT (currently 64) is imposed on the number of identical hash
> entries.
On 12/10/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> On 12/9/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > > + if (victim) {
> > > + sub_size = victim->remaining / 2;
> > > + list = victim->list + victim->list_size - sub_size;
> > > + while (sub_size && list[0]->hash &&
> > > + list[0]->hash == list[-1]->hash) {
> > > + list++;
> >
> > I think you needed to copy sub_size to another variable for this loop
>
> Copying sub_size was wrong. I believe you are checking for deltas on
> the same file. It's probably that chain of 103,817 deltas that can't
> be broken up.
At the end of a multi-threaded repack, one thread ends up with 45 minutes
of work after all the other threads have exited. That's because it
hits this loop and can't split the list any more.
If the lists can't hold more than 64 identical entries, why do I get caught
in this loop for 50,000+ iterations? If I remove this loop, the threads
are balanced right to the end.
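(To make that concrete, here is a small stand-alone simulation of the
split loop quoted above -- simplified types and made-up names, not the
actual threaded pack-objects code. With a run of equal hashes that
straddles the proposed split point and runs to the end of the list,
the split point walks all the way forward and nothing is left to hand
off to the idle thread:)

#include <stdio.h>
#include <stdlib.h>

struct entry { unsigned int hash; };

int main(void)
{
	unsigned int n = 200000, i, list_size, sub_size, moved = 0;
	struct entry *objs = calloc(n, sizeof(*objs));
	struct entry **list = malloc(n * sizeof(*list));
	struct entry **split;

	for (i = 0; i < n; i++) {
		/* distinct hashes up front, then one long run of equal
		 * hashes, like the 100,000+ run I am seeing */
		objs[i].hash = (i < 50000) ? i + 1 : 42;
		list[i] = &objs[i];
	}

	list_size = n;
	sub_size = list_size / 2;		/* half meant for another thread */
	split = list + list_size - sub_size;	/* proposed split point */
	while (sub_size && split[0]->hash &&
	       split[0]->hash == split[-1]->hash) {
		split++;			/* keep equal hashes together */
		sub_size--;
		moved++;
	}
	printf("split point moved by %u, %u entries left to hand off\n",
	       moved, sub_size);

	free(list);
	free(objs);
	return 0;
}

This prints a split point moved by 100000 with 0 entries left to hand
off, which is exactly the starvation I am seeing.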
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 16:20 ` Jon Smirl
@ 2007-12-10 16:30 ` Nicolas Pitre
0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Pitre @ 2007-12-10 16:30 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
On Mon, 10 Dec 2007, Jon Smirl wrote:
> On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> > > Running oprofile during my gcc repack shows this loop as the hottest
> > > place in the code by far.
> >
> > Well, that is kind of expected.
> >
> > > I added some debug printfs which show that I
> > > have a 100,000+ run of identical hash entries. Processing the 100,000
> > > entries also causes RAM consumption to explode.
> >
> > That is impossible. If you look at the code where those hash entries
> > are created in create_delta_index(), you'll notice a hard limit of
> > HASH_LIMIT (currently 64) is imposed on the number of identical hash
> > entries.
>
> On 12/10/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > On 12/9/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > > > + if (victim) {
> > > > + sub_size = victim->remaining / 2;
> > > > + list = victim->list + victim->list_size - sub_size;
> > > > + while (sub_size && list[0]->hash &&
> > > > + list[0]->hash == list[-1]->hash) {
> > > > + list++;
> > >
> > > I think you needed to copy sub_size to another variable for this loop
> >
> > Copying sub_size was wrong. I believe you are checking for deltas on
> > the same file. It's probably that chain of 103,817 deltas that can't
> > be broken up.
>
> At the end of a multi-threaded repack, one thread ends up with 45 minutes
> of work after all the other threads have exited. That's because it
> hits this loop and can't split the list any more.
>
> If the lists can't hold more than 64 identical entries, why do I get caught
> in this loop for 50,000+ iterations? If I remove this loop, the threads
> are balanced right to the end.
Completely different issue.
Please read my other answers.
Nicolas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 15:45 ` Nicolas Pitre
2007-12-10 16:14 ` David Kastrup
2007-12-10 16:20 ` Jon Smirl
@ 2007-12-10 19:39 ` David Kastrup
2007-12-10 20:11 ` Nicolas Pitre
2 siblings, 1 reply; 7+ messages in thread
From: David Kastrup @ 2007-12-10 19:39 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jon Smirl, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
>> Running oprofile during my gcc repack shows this loop as the hottest
>> place in the code by far.
>
> Well, that is kind of expected.
>
>> I added some debug printfs which show that I
>> have a 100,000+ run of identical hash entries. Processing the 100,000
>> entries also causes RAM consumption to explode.
>
> That is impossible. If you look at the code where those hash entries
> are created in create_delta_index(), you'll notice a hard limit of
> HASH_LIMIT (currently 64) is imposed on the number of identical hash
> entries.
On the other hand, if we have, say, laaaaaarge streaks of zeros, what
happens is that we have 64 hashes seeing them. Now about 4096 bytes are
compared, and then the comparison stops. Then it scans backwards to
seek for more zeros (and finds about 64k of them before it stops) and
folds them into the current compacted form. Each of these backward
scans (of which we have 64 in the worst case) is in a different memory
area. So since we scan/compare areas of 64k for each advance of 4k, we
have an overscanning factor of 16 (for a worst case scenario).
Not sure whether this is what we are seeing here. It would still not
explain the exploding memory usage, I think.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Performance problem, long run of identical hashes
2007-12-10 19:39 ` David Kastrup
@ 2007-12-10 20:11 ` Nicolas Pitre
0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Pitre @ 2007-12-10 20:11 UTC (permalink / raw)
To: David Kastrup; +Cc: Jon Smirl, Git Mailing List
On Mon, 10 Dec 2007, David Kastrup wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> >> I added some debug printfs which show that I
> >> have a 100,000+ run of identical hash entries. Processing the 100,000
> >> entries also causes RAM consumption to explode.
> >
> > That is impossible. If you look at the code where those hash entries
> > are created in create_delta_index(), you'll notice a hard limit of
> > HASH_LIMIT (currently 64) is imposed on the number of identical hash
> > entries.
>
> On the other hand, if we have, say, laaaaaarge streaks of zeros, what
> happens is that we have 64 hashes seeing them. Now about 4096 bytes are
> compared, and then the comparison stops.
No.
What would happen in that case is that the first hash entry pointing
somewhere in the beginning of the zero chunk will match _at least_ 4096
bytes, probably much more, up to the end of the buffer if that is
possible.
> Then it scans backwards to
> seek for more zeros (and finds about 64k of them before it stops)
The first hash entry for those zeroes is going to point at an offset not
greater than 15 bytes from the start of the zero area. So, unless both
buffers are going to match even before that zero area, the backward scan
will stop quickly.
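(A tiny sketch of that arithmetic, under the simplifying assumption
that indexed window positions fall on a fixed RABIN_WINDOW stride of
16 bytes, as create_delta_index() does; the helper name is made up:)

#include <stdio.h>

#define RABIN_WINDOW 16

/* first indexed window position at or after offset z */
static unsigned long first_entry_at_or_after(unsigned long z)
{
	return (z + RABIN_WINDOW - 1) / RABIN_WINDOW * RABIN_WINDOW;
}

int main(void)
{
	unsigned long z = 1000003;	/* arbitrary start of a long zero run */
	unsigned long e = first_entry_at_or_after(z);

	/* the entry is never more than 15 bytes into the run */
	printf("run starts at %lu, first entry at %lu (%lu bytes in)\n",
	       z, e, e - z);
	return 0;
}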
> folds them into the current compacted form. Each of these backward
> scans (of which we have 64 in the worst case) is in a different memory
> area. So since we scan/compare areas of 64k for each advance of 4k, we
> have an overscanning factor of 16 (for a worst case scenario).
Actually, if we match 100MB of zeroes, we'll just encode that as a
series of 64KB copy operations, and continue scanning the buffer after
that 100MB.
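(Roughly like this -- a hypothetical sketch of the capping, with
made-up names, not the actual create_delta() encoder:)

#include <stdio.h>

#define MAX_COPY 0x10000UL	/* 64KB covered by a single copy operation */

/* count how many copy operations one long match turns into */
static unsigned long copy_ops_for_match(unsigned long len)
{
	unsigned long nops = 0;

	while (len) {
		unsigned long chunk = len < MAX_COPY ? len : MAX_COPY;
		/* a real encoder would emit a copy opcode for chunk bytes */
		len -= chunk;
		nops++;
	}
	return nops;
}

int main(void)
{
	/* 100MB matched in one go: 1600 copies of 64KB, then scanning
	 * simply resumes after the matched region */
	printf("%lu copy ops\n", copy_ops_for_match(100UL * 1024 * 1024));
	return 0;
}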
So I don't see where your scan/compare of 64k areas for each advance of
4k comes from.
> Not sure whether this is what we are seeing here. It would still not
> explain the exploding memory usage, I think.
Right.
Nicolas
^ permalink raw reply [flat|nested] 7+ messages in thread