* [PATCH] avoid possible overflow in delta size filtering computation
@ 2009-03-24 19:56 Nicolas Pitre
2009-03-24 20:20 ` Brandon Casey
2009-03-25 12:15 ` Kjetil Barvik
0 siblings, 2 replies; 10+ messages in thread
From: Nicolas Pitre @ 2009-03-24 19:56 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Kjetil Barvik, git
On a 32-bit system, the maximum possible size for an object is less than
4GB, while 64-bit systems may cope with larger objects. Due to this
limitation, variables holding object sizes are using an unsigned long
type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
When large objects are encountered, and/or people play with large delta
depth values, it is possible for the maximum allowed delta size
computation to overflow, especially on a 32-bit system. When this
occurs, surviving result bits may represent a value much smaller than
what it is supposed to be, or even zero. This prevents some objects
from being deltified although they do get deltified when a smaller depth
limit is used. Fix this by always performing a 64-bit multiplication.
Signed-off-by: Nicolas Pitre <nico@cam.org>
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 3a4bdbb..9fc3b35 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1293,7 +1293,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
max_size = trg_entry->delta_size;
ref_depth = trg->depth;
}
- max_size = max_size * (max_depth - src->depth) /
+ max_size = (uint64_t)max_size * (max_depth - src->depth) /
(max_depth - ref_depth + 1);
if (max_size == 0)
return 0;
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-24 19:56 [PATCH] avoid possible overflow in delta size filtering computation Nicolas Pitre
@ 2009-03-24 20:20 ` Brandon Casey
2009-03-24 20:52 ` Nicolas Pitre
2009-03-25 0:39 ` Nicolas Pitre
2009-03-25 12:15 ` Kjetil Barvik
1 sibling, 2 replies; 10+ messages in thread
From: Brandon Casey @ 2009-03-24 20:20 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, Kjetil Barvik, git
Nicolas Pitre wrote:
> On a 32-bit system, the maximum possible size for an object is less than
> 4GB, while 64-bit systems may cope with larger objects. Due to this
> limitation, variables holding object sizes are using an unsigned long
> type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
FYI: except on windows 64-bit where long is still 32 bits AFAIK
-brandon
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-24 20:20 ` Brandon Casey
@ 2009-03-24 20:52 ` Nicolas Pitre
2009-03-25 0:39 ` Nicolas Pitre
1 sibling, 0 replies; 10+ messages in thread
From: Nicolas Pitre @ 2009-03-24 20:52 UTC (permalink / raw)
To: Brandon Casey; +Cc: Junio C Hamano, Kjetil Barvik, git
On Tue, 24 Mar 2009, Brandon Casey wrote:
> Nicolas Pitre wrote:
> > On a 32-bit system, the maximum possible size for an object is less than
> > 4GB, while 64-bit systems may cope with larger objects. Due to this
> > limitation, variables holding object sizes are using an unsigned long
> > type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
>
> FYI: except on windows 64-bit where long is still 32 bits AFAIK
Whatever. Same issue.
Nicolas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-24 20:20 ` Brandon Casey
2009-03-24 20:52 ` Nicolas Pitre
@ 2009-03-25 0:39 ` Nicolas Pitre
1 sibling, 0 replies; 10+ messages in thread
From: Nicolas Pitre @ 2009-03-25 0:39 UTC (permalink / raw)
To: Brandon Casey; +Cc: Junio C Hamano, Kjetil Barvik, git
On Tue, 24 Mar 2009, Brandon Casey wrote:
> Nicolas Pitre wrote:
> > On a 32-bit system, the maximum possible size for an object is less than
> > 4GB, while 64-bit systems may cope with larger objects. Due to this
> > limitation, variables holding object sizes are using an unsigned long
> > type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
>
> FYI: except on windows 64-bit where long is still 32 bits AFAIK
Whatever. Same issue.
Nicolas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-24 19:56 [PATCH] avoid possible overflow in delta size filtering computation Nicolas Pitre
2009-03-24 20:20 ` Brandon Casey
@ 2009-03-25 12:15 ` Kjetil Barvik
2009-03-25 16:18 ` Nicolas Pitre
1 sibling, 1 reply; 10+ messages in thread
From: Kjetil Barvik @ 2009-03-25 12:15 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, git
Nicolas Pitre <nico@cam.org> writes:
> On a 32-bit system, the maximum possible size for an object is less than
> 4GB, while 64-bit systems may cope with larger objects. Due to this
> limitation, variables holding object sizes are using an unsigned long
> type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
>
> When large objects are encountered, and/or people play with large delta
> depth values, it is possible for the maximum allowed delta size
> computation to overflow, especially on a 32-bit system. When this
> occurs, surviving result bits may represent a value much smaller than
> what it is supposed to be, or even zero. This prevents some objects
> from being deltified although they do get deltified when a smaller depth
> limit is used. Fix this by always performing a 64-bit multiplication.
>
> Signed-off-by: Nicolas Pitre <nico@cam.org>
I added this patch and rerun the 2 test cases form the table where
--depth is 20000 and 95000, and got the following result:
--depth=20000 => file size: 19126077 delta: 73814
--depth=95000 => file size: 19126087 delta: 73814
So, it seems that this patch almost fixed the issue. But notice that
the pack file was 10 bytes larger for the --depth=95000 case.
I made a small perl script to compare the output from 'git verify-pack
-v' of the 2 idx/pack files, and found the following difference(1)
(first line from --depth=20000 case, second from --depth=95000):
fe0a6f3e971373590714dbafd087b235ea60ac00 tree 9 19 18921247 731 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d
fe0a6f3e971373590714dbafd087b235ea60ac00 tree 20 29 18921247 730 12e560f7fb28558b15e3a2008fba860f9a4b2222
'git show fe0a6f3e971373590714dbafd087b235ea60ac00' =>
tree fe0a6f3e971373590714dbafd087b235ea60ac00
Makefile
t0000-basic.sh
test-lib.sh
'git show 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d' =>
tree 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d
Makefile
t0000-basic.sh
t0100-environment-names.sh
t0200-update-cache.sh
t0400-ls-files.sh
t0500-ls-files.sh
t1000-checkout-cache.sh
t1001-checkout-cache.sh
test-lib.sh
'git show 12e560f7fb28558b15e3a2008fba860f9a4b2222' =>
tree 12e560f7fb28558b15e3a2008fba860f9a4b2222
Makefile
t0000-basic.sh
t0100-environment-names.sh
t0200-update-cache.sh
t0400-ls-files.sh
t0500-ls-files.sh
t1000-checkout-cache.sh
t1001-checkout-cache.sh
test-lib.sh
-- kjetil
1) there was lots of lines with different offsets, all of which was 10
larger in the --depth=95000 case.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-25 12:15 ` Kjetil Barvik
@ 2009-03-25 16:18 ` Nicolas Pitre
2009-03-25 16:34 ` Kjetil Barvik
0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Pitre @ 2009-03-25 16:18 UTC (permalink / raw)
To: Kjetil Barvik; +Cc: Junio C Hamano, git
On Wed, 25 Mar 2009, Kjetil Barvik wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On a 32-bit system, the maximum possible size for an object is less than
> > 4GB, while 64-bit systems may cope with larger objects. Due to this
> > limitation, variables holding object sizes are using an unsigned long
> > type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
> >
> > When large objects are encountered, and/or people play with large delta
> > depth values, it is possible for the maximum allowed delta size
> > computation to overflow, especially on a 32-bit system. When this
> > occurs, surviving result bits may represent a value much smaller than
> > what it is supposed to be, or even zero. This prevents some objects
> > from being deltified although they do get deltified when a smaller depth
> > limit is used. Fix this by always performing a 64-bit multiplication.
> >
> > Signed-off-by: Nicolas Pitre <nico@cam.org>
>
> I added this patch and rerun the 2 test cases form the table where
> --depth is 20000 and 95000, and got the following result:
>
> --depth=20000 => file size: 19126077 delta: 73814
> --depth=95000 => file size: 19126087 delta: 73814
>
> So, it seems that this patch almost fixed the issue. But notice that
> the pack file was 10 bytes larger for the --depth=95000 case.
>
> I made a small perl script to compare the output from 'git verify-pack
> -v' of the 2 idx/pack files, and found the following difference(1)
> (first line from --depth=20000 case, second from --depth=95000):
>
> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 9 19 18921247 731 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d
> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 20 29 18921247 730 12e560f7fb28558b15e3a2008fba860f9a4b2222
OK. Apparently, a different base object for that one delta was chosen
between those two runs.
Is your machine SMP?
Nicolas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-25 16:18 ` Nicolas Pitre
@ 2009-03-25 16:34 ` Kjetil Barvik
2009-03-25 19:17 ` Nicolas Pitre
0 siblings, 1 reply; 10+ messages in thread
From: Kjetil Barvik @ 2009-03-25 16:34 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, git
Nicolas Pitre <nico@cam.org> writes:
> On Wed, 25 Mar 2009, Kjetil Barvik wrote:
>
>> Nicolas Pitre <nico@cam.org> writes:
>>
>> > On a 32-bit system, the maximum possible size for an object is less than
>> > 4GB, while 64-bit systems may cope with larger objects. Due to this
>> > limitation, variables holding object sizes are using an unsigned long
>> > type (32 bits on 32-bit systems, or 64 bits on 64-bit systems).
>> >
>> > When large objects are encountered, and/or people play with large delta
>> > depth values, it is possible for the maximum allowed delta size
>> > computation to overflow, especially on a 32-bit system. When this
>> > occurs, surviving result bits may represent a value much smaller than
>> > what it is supposed to be, or even zero. This prevents some objects
>> > from being deltified although they do get deltified when a smaller depth
>> > limit is used. Fix this by always performing a 64-bit multiplication.
>> >
>> > Signed-off-by: Nicolas Pitre <nico@cam.org>
>>
>> I added this patch and rerun the 2 test cases form the table where
>> --depth is 20000 and 95000, and got the following result:
>>
>> --depth=20000 => file size: 19126077 delta: 73814
>> --depth=95000 => file size: 19126087 delta: 73814
>>
>> So, it seems that this patch almost fixed the issue. But notice that
>> the pack file was 10 bytes larger for the --depth=95000 case.
>>
>> I made a small perl script to compare the output from 'git verify-pack
>> -v' of the 2 idx/pack files, and found the following difference(1)
>> (first line from --depth=20000 case, second from --depth=95000):
>>
>> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 9 19 18921247 731 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d
>> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 20 29 18921247 730 12e560f7fb28558b15e3a2008fba860f9a4b2222
>
> OK. Apparently, a different base object for that one delta was chosen
> between those two runs.
>
> Is your machine SMP?
kjetil ~$ uname -a
Linux localhost 2.6.28.4 #26 SMP PREEMPT Tue Feb 10 17:07:14 CET 2009
i686 Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz GenuineIntel GNU/Linux
-- kjetil
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-25 16:34 ` Kjetil Barvik
@ 2009-03-25 19:17 ` Nicolas Pitre
2009-03-26 7:18 ` Kjetil Barvik
0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Pitre @ 2009-03-25 19:17 UTC (permalink / raw)
To: Kjetil Barvik; +Cc: Junio C Hamano, git
On Wed, 25 Mar 2009, Kjetil Barvik wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On Wed, 25 Mar 2009, Kjetil Barvik wrote:
> >
> >> So, it seems that this patch almost fixed the issue. But notice that
> >> the pack file was 10 bytes larger for the --depth=95000 case.
> >>
> >> I made a small perl script to compare the output from 'git verify-pack
> >> -v' of the 2 idx/pack files, and found the following difference(1)
> >> (first line from --depth=20000 case, second from --depth=95000):
> >>
> >> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 9 19 18921247 731 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d
> >> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 20 29 18921247 730 12e560f7fb28558b15e3a2008fba860f9a4b2222
> >
> > OK. Apparently, a different base object for that one delta was chosen
> > between those two runs.
> >
> > Is your machine SMP?
>
> kjetil ~$ uname -a
> Linux localhost 2.6.28.4 #26 SMP PREEMPT Tue Feb 10 17:07:14 CET 2009
> i686 Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz GenuineIntel GNU/Linux
Here you go. If you want a perfectly deterministic repacking, you'll
have to force the pack.threads config option to 1.
Nicolas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-25 19:17 ` Nicolas Pitre
@ 2009-03-26 7:18 ` Kjetil Barvik
2009-03-27 2:23 ` Nicolas Pitre
0 siblings, 1 reply; 10+ messages in thread
From: Kjetil Barvik @ 2009-03-26 7:18 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, git
Nicolas Pitre <nico@cam.org> writes:
> On Wed, 25 Mar 2009, Kjetil Barvik wrote:
>
>> Nicolas Pitre <nico@cam.org> writes:
>>
>> > On Wed, 25 Mar 2009, Kjetil Barvik wrote:
>> >
>> >> So, it seems that this patch almost fixed the issue. But notice that
>> >> the pack file was 10 bytes larger for the --depth=95000 case.
>> >>
>> >> I made a small perl script to compare the output from 'git verify-pack
>> >> -v' of the 2 idx/pack files, and found the following difference(1)
>> >> (first line from --depth=20000 case, second from --depth=95000):
>> >>
>> >> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 9 19 18921247 731 96a3ec5789504e6d0f90c99fb1937af1ebd58e2d
>> >> fe0a6f3e971373590714dbafd087b235ea60ac00 tree 20 29 18921247 730 12e560f7fb28558b15e3a2008fba860f9a4b2222
>> >
>> > OK. Apparently, a different base object for that one delta was chosen
>> > between those two runs.
>> >
>> > Is your machine SMP?
>>
>> kjetil ~$ uname -a
>> Linux localhost 2.6.28.4 #26 SMP PREEMPT Tue Feb 10 17:07:14 CET 2009
>> i686 Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz GenuineIntel GNU/Linux
>
> Here you go. If you want a perfectly deterministic repacking, you'll
> have to force the pack.threads config option to 1.
OK, have rerun the test again with pack.config set to 1, and got the
exact same result(1) as above this time also! :-)
And I think I can explain the reason for the same result: When the
--window and/or --depth value(s) is larger than (half?) the value of
possible number of objects (98438 in this time), I think that the
thread logic finds out that it can only run one thread (there is not
room/objects enough for 2 or more threads).
This also explains what I see when I run the repack command (without
the pack.config option), only 1 git process is running on 1 of the
CPU's from the start, and the other is idle.
~~
Give me a hint if you want some debug info from the 2 pack/idx files.
-- kjetil
1) The output from the 2 'git repack' commands did not show the
"running delta with 2 threads" (or something similar) this time.
I guess this is the sign that "pack.threads = 1" is working.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] avoid possible overflow in delta size filtering computation
2009-03-26 7:18 ` Kjetil Barvik
@ 2009-03-27 2:23 ` Nicolas Pitre
0 siblings, 0 replies; 10+ messages in thread
From: Nicolas Pitre @ 2009-03-27 2:23 UTC (permalink / raw)
To: Kjetil Barvik; +Cc: Junio C Hamano, git
On Thu, 26 Mar 2009, Kjetil Barvik wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > Here you go. If you want a perfectly deterministic repacking, you'll
> > have to force the pack.threads config option to 1.
>
> OK, have rerun the test again with pack.config set to 1, and got the
> exact same result(1) as above this time also! :-)
Good.
> Give me a hint if you want some debug info from the 2 pack/idx files.
I think there is nothing else to debug. Having some differences between
successive threaded repacks is expected.
Nicolas
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-03-27 2:25 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-24 19:56 [PATCH] avoid possible overflow in delta size filtering computation Nicolas Pitre
2009-03-24 20:20 ` Brandon Casey
2009-03-24 20:52 ` Nicolas Pitre
2009-03-25 0:39 ` Nicolas Pitre
2009-03-25 12:15 ` Kjetil Barvik
2009-03-25 16:18 ` Nicolas Pitre
2009-03-25 16:34 ` Kjetil Barvik
2009-03-25 19:17 ` Nicolas Pitre
2009-03-26 7:18 ` Kjetil Barvik
2009-03-27 2:23 ` Nicolas Pitre
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).