* [PATCH] Set hard limit on delta chain depth
@ 2011-12-05 7:04 Nguyễn Thái Ngọc Duy
2011-12-06 12:17 ` Erik Faye-Lund
2011-12-06 15:06 ` Junio C Hamano
0 siblings, 2 replies; 19+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-12-05 7:04 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
Too deep delta chains can cause stack overflow in get_base_data(). Set
a hard limit so that index-pack does not run out of stack. Also stop
people from producing such a long delta chains using "pack-object
--depth=<too large>"
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
I used to make very long delta chains and triggered this in index-pack.
I did not care reporting because it's my fault anyway. Think again,
index-pack is called at server side and a malicious client can
trigger this. This patch does not improve the situation much, but at
least we won't get sigsegv at server side.
builtin/index-pack.c | 12 ++++++++++--
builtin/pack-objects.c | 3 +++
pack.h | 2 ++
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 0945adb..cfb8cb2 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -504,13 +504,16 @@ static int is_delta_type(enum object_type type)
return (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA);
}
-static void *get_base_data(struct base_data *c)
+static void *get_base_data_1(struct base_data *c, int depth)
{
+ if (depth > MAX_DELTA_DEPTH)
+ die("index-pack: too long delta chain");
+
if (!c->data) {
struct object_entry *obj = c->obj;
if (is_delta_type(obj->type)) {
- void *base = get_base_data(c->base);
+ void *base = get_base_data_1(c->base, depth + 1);
void *raw = get_data_from_pack(obj);
c->data = patch_delta(
base, c->base->size,
@@ -530,6 +533,11 @@ static void *get_base_data(struct base_data *c)
return c->data;
}
+static void *get_base_data(struct base_data *c)
+{
+ return get_base_data_1(c, 0);
+}
+
static void resolve_delta(struct object_entry *delta_obj,
struct base_data *base, struct base_data *result)
{
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 824ecee..85ee04b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2508,6 +2508,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
if (keep_unreachable && unpack_unreachable)
die("--keep-unreachable and --unpack-unreachable are incompatible.");
+ if (depth > MAX_DELTA_DEPTH)
+ die("--depth exceeds %d", MAX_DELTA_DEPTH);
+
if (progress && all_progress_implied)
progress = 2;
diff --git a/pack.h b/pack.h
index 722a54e..b8f60c3 100644
--- a/pack.h
+++ b/pack.h
@@ -3,6 +3,8 @@
#include "object.h"
+#define MAX_DELTA_DEPTH 128
+
/*
* Packed object header
*/
--
1.7.8.36.g69ee2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-05 7:04 [PATCH] Set hard limit on delta chain depth Nguyễn Thái Ngọc Duy
@ 2011-12-06 12:17 ` Erik Faye-Lund
2011-12-06 12:32 ` Nguyen Thai Ngoc Duy
2011-12-06 14:54 ` Michael Haggerty
2011-12-06 15:06 ` Junio C Hamano
1 sibling, 2 replies; 19+ messages in thread
From: Erik Faye-Lund @ 2011-12-06 12:17 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano
2011/12/5 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
> Too deep delta chains can cause stack overflow in get_base_data(). Set
> a hard limit so that index-pack does not run out of stack. Also stop
> people from producing such a long delta chains using "pack-object
> --depth=<too large>"
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> I used to make very long delta chains and triggered this in index-pack.
> I did not care reporting because it's my fault anyway. Think again,
> index-pack is called at server side and a malicious client can
> trigger this. This patch does not improve the situation much, but at
> least we won't get sigsegv at server side.
>
Wouldn't it make more sense to make the limit a config option rather
than a hard-coded value of 128 (which seems arbitrary to me)? After
all, different platforms have different stack-limitations...
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 12:17 ` Erik Faye-Lund
@ 2011-12-06 12:32 ` Nguyen Thai Ngoc Duy
2011-12-06 12:41 ` Erik Faye-Lund
2011-12-06 14:54 ` Michael Haggerty
1 sibling, 1 reply; 19+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-12-06 12:32 UTC (permalink / raw)
To: kusmabite; +Cc: git, Junio C Hamano
2011/12/6 Erik Faye-Lund <kusmabite@gmail.com>:
> 2011/12/5 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>> Too deep delta chains can cause stack overflow in get_base_data(). Set
>> a hard limit so that index-pack does not run out of stack. Also stop
>> people from producing such a long delta chains using "pack-object
>> --depth=<too large>"
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>> I used to make very long delta chains and triggered this in index-pack.
>> I did not care reporting because it's my fault anyway. Think again,
>> index-pack is called at server side and a malicious client can
>> trigger this. This patch does not improve the situation much, but at
>> least we won't get sigsegv at server side.
>>
>
> Wouldn't it make more sense to make the limit a config option rather
> than a hard-coded value of 128 (which seems arbitrary to me)? After
> all, different platforms have different stack-limitations...
Then it'd make more sense to make a compile time config based on
platform. We could have a config option that can override the default,
but I really don't see the point of making long delta chains.
--
Duy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 12:32 ` Nguyen Thai Ngoc Duy
@ 2011-12-06 12:41 ` Erik Faye-Lund
2011-12-06 12:48 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 19+ messages in thread
From: Erik Faye-Lund @ 2011-12-06 12:41 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git, Junio C Hamano
On Tue, Dec 6, 2011 at 1:32 PM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> 2011/12/6 Erik Faye-Lund <kusmabite@gmail.com>:
>> 2011/12/5 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>>> Too deep delta chains can cause stack overflow in get_base_data(). Set
>>> a hard limit so that index-pack does not run out of stack. Also stop
>>> people from producing such a long delta chains using "pack-object
>>> --depth=<too large>"
>>>
>>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>> ---
>>> I used to make very long delta chains and triggered this in index-pack.
>>> I did not care reporting because it's my fault anyway. Think again,
>>> index-pack is called at server side and a malicious client can
>>> trigger this. This patch does not improve the situation much, but at
>>> least we won't get sigsegv at server side.
>>>
>>
>> Wouldn't it make more sense to make the limit a config option rather
>> than a hard-coded value of 128 (which seems arbitrary to me)? After
>> all, different platforms have different stack-limitations...
>
> Then it'd make more sense to make a compile time config based on
> platform.
Can how much stack each recursion use be calculated at compile-time?
If so, I agree with you.
> We could have a config option that can override the default,
> but I really don't see the point of making long delta chains.
Aha, I figured you _did_ see a point in this, because 128 seemed
excessive to me already. I was thinking more that some platforms can
have a much smaller stack than (I would expect to) fit in 128
recursions (I've worked relatively recently with platforms with as
small as a static 2k stack per process), so you might not be fixing
the issue for such platforms. But that's not really your
responsibility either ;)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 12:41 ` Erik Faye-Lund
@ 2011-12-06 12:48 ` Nguyen Thai Ngoc Duy
0 siblings, 0 replies; 19+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-12-06 12:48 UTC (permalink / raw)
To: kusmabite; +Cc: git, Junio C Hamano
On Tue, Dec 6, 2011 at 7:41 PM, Erik Faye-Lund <kusmabite@gmail.com> wrote:
>>> Wouldn't it make more sense to make the limit a config option rather
>>> than a hard-coded value of 128 (which seems arbitrary to me)? After
>>> all, different platforms have different stack-limitations...
>>
>> Then it'd make more sense to make a compile time config based on
>> platform.
>
> Can how much stack each recursion use be calculated at compile-time?
> If so, I agree with you.
No, but at least we know default stack size of each platform and can
make pretty good limit based on that.
>> We could have a config option that can override the default,
>> but I really don't see the point of making long delta chains.
>
> Aha, I figured you _did_ see a point in this, because 128 seemed
> excessive to me already. I was thinking more that some platforms can
> have a much smaller stack than (I would expect to) fit in 128
> recursions (I've worked relatively recently with platforms with as
> small as a static 2k stack per process), so you might not be fixing
> the issue for such platforms. But that's not really your
> responsibility either ;)
Ah, I was thinking of an option that extends the limit, not shortens
it. Yes it makes sense in this case.
--
Duy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 12:17 ` Erik Faye-Lund
2011-12-06 12:32 ` Nguyen Thai Ngoc Duy
@ 2011-12-06 14:54 ` Michael Haggerty
2011-12-06 15:30 ` Nguyen Thai Ngoc Duy
1 sibling, 1 reply; 19+ messages in thread
From: Michael Haggerty @ 2011-12-06 14:54 UTC (permalink / raw)
To: kusmabite; +Cc: Nguyễn Thái Ngọc Duy, git, Junio C Hamano
On 12/06/2011 01:17 PM, Erik Faye-Lund wrote:
> 2011/12/5 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>> Too deep delta chains can cause stack overflow in get_base_data(). Set
>> a hard limit so that index-pack does not run out of stack. Also stop
>> people from producing such a long delta chains using "pack-object
>> --depth=<too large>"
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>> I used to make very long delta chains and triggered this in index-pack.
>> I did not care reporting because it's my fault anyway. Think again,
>> index-pack is called at server side and a malicious client can
>> trigger this. This patch does not improve the situation much, but at
>> least we won't get sigsegv at server side.
>
> Wouldn't it make more sense to make the limit a config option rather
> than a hard-coded value of 128 (which seems arbitrary to me)? After
> all, different platforms have different stack-limitations...
I'm confused: is the data only ever read by the same host that generated
it? Because if not, then the "creator" had better never be configured
to use a chain depth that the "reader" cannot handle. This in turn
imply that there should be a common limit that is supported by all git
clients and is a documented part of the protocol. (Or the code has to
be rewritten to use an explicit stack instead of recursion.)
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-05 7:04 [PATCH] Set hard limit on delta chain depth Nguyễn Thái Ngọc Duy
2011-12-06 12:17 ` Erik Faye-Lund
@ 2011-12-06 15:06 ` Junio C Hamano
2011-12-06 15:45 ` Nguyen Thai Ngoc Duy
2011-12-07 17:50 ` [PATCH] index-pack: eliminate unlimited recursion in get_delta_base() Nguyễn Thái Ngọc Duy
1 sibling, 2 replies; 19+ messages in thread
From: Junio C Hamano @ 2011-12-06 15:06 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> Too deep delta chains can cause stack overflow in get_base_data(). Set
> a hard limit so that index-pack does not run out of stack. Also stop
> people from producing such a long delta chains using "pack-object
> --depth=<too large>"
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> I used to make very long delta chains and triggered this in index-pack.
> I did not care reporting because it's my fault anyway. Think again,
> index-pack is called at server side and a malicious client can
> trigger this. This patch does not improve the situation much, but at
> least we won't get sigsegv at server side.
Why should we treat this condition any differently from the case where the
sender of a pack used beefier machine than you have and stuffed a huge
object that the index-pack running on your box cannot hold in core,
causing xmalloc() to die on your machine?
I do not think this is the right way to handle the issue. Your other patch
to flatten the recursion to iteration looked a lot saner approach.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 14:54 ` Michael Haggerty
@ 2011-12-06 15:30 ` Nguyen Thai Ngoc Duy
2011-12-06 18:12 ` Shawn Pearce
0 siblings, 1 reply; 19+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-12-06 15:30 UTC (permalink / raw)
To: Michael Haggerty; +Cc: kusmabite, git, Junio C Hamano
2011/12/6 Michael Haggerty <mhagger@alum.mit.edu>:
> On 12/06/2011 01:17 PM, Erik Faye-Lund wrote:
>> 2011/12/5 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>>> Too deep delta chains can cause stack overflow in get_base_data(). Set
>>> a hard limit so that index-pack does not run out of stack. Also stop
>>> people from producing such a long delta chains using "pack-object
>>> --depth=<too large>"
>>>
>>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>> ---
>>> I used to make very long delta chains and triggered this in index-pack.
>>> I did not care reporting because it's my fault anyway. Think again,
>>> index-pack is called at server side and a malicious client can
>>> trigger this. This patch does not improve the situation much, but at
>>> least we won't get sigsegv at server side.
>>
>> Wouldn't it make more sense to make the limit a config option rather
>> than a hard-coded value of 128 (which seems arbitrary to me)? After
>> all, different platforms have different stack-limitations...
>
> I'm confused: is the data only ever read by the same host that generated
> it?
index-pack is called at server side as part of a push. So in theory
the sending side can generate very long delta chains and bring down
the server side.
> Because if not, then the "creator" had better never be configured
> to use a chain depth that the "reader" cannot handle.
Normal creators (i.e. C Git) use default depth 50 so we should be safe.
> This in turn
> imply that there should be a common limit that is supported by all git
> clients and is a documented part of the protocol. (Or the code has to
> be rewritten to use an explicit stack instead of recursion.)
It's the implementation limitation, not the protocol. If the server
propagates error messages from index-pack back to client (I'm not
sure), then users can adjust --depth to be accepted by server. We
could negotiate the limit over the protocol but not sure we would want
to go that route.
The troubled code could be rewritten to avoid recursion. However long
delta chains may degrade performance, I'd rather have support to split
long chains (which can be done with --depth at client already) instead
of just recursion elimination. This patch is simpler, so it would be
easier to back port it if someone wants to do so.
--
Duy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 15:06 ` Junio C Hamano
@ 2011-12-06 15:45 ` Nguyen Thai Ngoc Duy
2011-12-10 0:02 ` Junio C Hamano
2011-12-07 17:50 ` [PATCH] index-pack: eliminate unlimited recursion in get_delta_base() Nguyễn Thái Ngọc Duy
1 sibling, 1 reply; 19+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-12-06 15:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
2011/12/6 Junio C Hamano <gitster@pobox.com>:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> Too deep delta chains can cause stack overflow in get_base_data(). Set
>> a hard limit so that index-pack does not run out of stack. Also stop
>> people from producing such a long delta chains using "pack-object
>> --depth=<too large>"
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>> I used to make very long delta chains and triggered this in index-pack.
>> I did not care reporting because it's my fault anyway. Think again,
>> index-pack is called at server side and a malicious client can
>> trigger this. This patch does not improve the situation much, but at
>> least we won't get sigsegv at server side.
>
> Why should we treat this condition any differently from the case where the
> sender of a pack used beefier machine than you have and stuffed a huge
> object that the index-pack running on your box cannot hold in core,
> causing xmalloc() to die on your machine?
That's interesting. First of all xmalloc() is controlled by us while
index-pack code might lead to stack overflow exploit (never done it,
not sure if it's really pratical to do in this case).
But can I really use up all memory at server side by sending a huge pack?
> I do not think this is the right way to handle the issue. Your other patch
> to flatten the recursion to iteration looked a lot saner approach.
It may take me some time as I'm not really familar with this code.
Anybody is welcome to step up and flatten the function.
--
Duy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 15:30 ` Nguyen Thai Ngoc Duy
@ 2011-12-06 18:12 ` Shawn Pearce
2011-12-06 18:56 ` Jeff King
0 siblings, 1 reply; 19+ messages in thread
From: Shawn Pearce @ 2011-12-06 18:12 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: Michael Haggerty, kusmabite, git, Junio C Hamano
On Tue, Dec 6, 2011 at 07:30, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> index-pack is called at server side as part of a push. So in theory
> the sending side can generate very long delta chains and bring down
> the server side.
It is also called at client side during fetch. So in theory the server
can produce very long delta chains and take down a client.
>> Because if not, then the "creator" had better never be configured
>> to use a chain depth that the "reader" cannot handle.
>
> Normal creators (i.e. C Git) use default depth 50 so we should be safe.
JGit is also a "normal creator", and it sometimes produces chains
deeper than 50. Junio identified a 255 deep chain a week or two ago.
Some people have repacked their repositories very aggressively with
deeper chains when they are trying to optimize for space and don't
access historical revisions very often. I doubt anyone has packed
deeper than 120ish intentionally... but we shouldn't assume that in
the code.
>> This in turn
>> imply that there should be a common limit that is supported by all git
>> clients and is a documented part of the protocol. (Or the code has to
>> be rewritten to use an explicit stack instead of recursion.)
>
> It's the implementation limitation, not the protocol. If the server
> propagates error messages from index-pack back to client (I'm not
> sure), then users can adjust --depth to be accepted by server. We
> could negotiate the limit over the protocol but not sure we would want
> to go that route.
What about clients fetching from a server? The client can't change the
depth the server sends it.
> The troubled code could be rewritten to avoid recursion. However long
> delta chains may degrade performance, I'd rather have support to split
> long chains (which can be done with --depth at client already) instead
> of just recursion elimination. This patch is simpler, so it would be
> easier to back port it if someone wants to do so.
JGit long ago changed its IndexPack routine to use a manually managed
heap based stack instead of recursion, making it immune from
overrunning the stack due to a long delta chain. That is probably the
cleaner route. But you also have to fix sha1_file.c and its recursion
based unpacking of a delta chain... again which JGit fixed a long time
ago.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 18:12 ` Shawn Pearce
@ 2011-12-06 18:56 ` Jeff King
0 siblings, 0 replies; 19+ messages in thread
From: Jeff King @ 2011-12-06 18:56 UTC (permalink / raw)
To: Shawn Pearce
Cc: Nguyen Thai Ngoc Duy, Michael Haggerty, kusmabite, git,
Junio C Hamano
On Tue, Dec 06, 2011 at 10:12:54AM -0800, Shawn O. Pearce wrote:
> > Normal creators (i.e. C Git) use default depth 50 so we should be safe.
>
> JGit is also a "normal creator", and it sometimes produces chains
> deeper than 50. Junio identified a 255 deep chain a week or two ago.
> Some people have repacked their repositories very aggressively with
> deeper chains when they are trying to optimize for space and don't
> access historical revisions very often. I doubt anyone has packed
> deeper than 120ish intentionally... but we shouldn't assume that in
> the code.
"git gc --aggressive" will set the depth to 250.
-Peff
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] index-pack: eliminate unlimited recursion in get_delta_base()
2011-12-06 15:06 ` Junio C Hamano
2011-12-06 15:45 ` Nguyen Thai Ngoc Duy
@ 2011-12-07 17:50 ` Nguyễn Thái Ngọc Duy
2011-12-08 3:02 ` Shawn Pearce
1 sibling, 1 reply; 19+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-12-07 17:50 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
Revert the order of delta applying so that by the time a delta is
applied, its base is either non-delta or already inflated.
get_delta_base() is still recursive, but because base's data is always
ready, the inner get_delta_base() call never has any chance to call
itself again.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
builtin/index-pack.c | 30 +++++++++++++++++++++---------
1 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 0945adb..103e19c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -508,10 +508,25 @@ static void *get_base_data(struct base_data *c)
{
if (!c->data) {
struct object_entry *obj = c->obj;
+ struct base_data **delta = NULL;
+ int delta_nr = 0, delta_alloc = 0;
- if (is_delta_type(obj->type)) {
- void *base = get_base_data(c->base);
- void *raw = get_data_from_pack(obj);
+ for (; is_delta_type(c->obj->type); c = c->base) {
+ ALLOC_GROW(delta, delta_nr + 1, delta_alloc);
+ delta[delta_nr++] = c;
+ }
+ if (!delta_nr) {
+ c->data = get_data_from_pack(obj);
+ c->size = obj->size;
+ base_cache_used += c->size;
+ prune_base_data(c);
+ }
+ for (; delta_nr > 0; delta_nr--) {
+ void *base, *raw;
+ c = delta[delta_nr - 1];
+ obj = c->obj;
+ base = get_base_data(c->base);
+ raw = get_data_from_pack(obj);
c->data = patch_delta(
base, c->base->size,
raw, obj->size,
@@ -519,13 +534,10 @@ static void *get_base_data(struct base_data *c)
free(raw);
if (!c->data)
bad_object(obj->idx.offset, "failed to apply delta");
- } else {
- c->data = get_data_from_pack(obj);
- c->size = obj->size;
+ base_cache_used += c->size;
+ prune_base_data(c);
}
-
- base_cache_used += c->size;
- prune_base_data(c);
+ free(delta);
}
return c->data;
}
--
1.7.8.36.g69ee2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] index-pack: eliminate unlimited recursion in get_delta_base()
2011-12-07 17:50 ` [PATCH] index-pack: eliminate unlimited recursion in get_delta_base() Nguyễn Thái Ngọc Duy
@ 2011-12-08 3:02 ` Shawn Pearce
2011-12-08 11:06 ` Nguyen Thai Ngoc Duy
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Shawn Pearce @ 2011-12-08 3:02 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano
2011/12/7 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
> Revert the order of delta applying so that by the time a delta is
> applied, its base is either non-delta or already inflated.
> get_delta_base() is still recursive, but because base's data is always
> ready, the inner get_delta_base() call never has any chance to call
> itself again.
...
> @@ -508,10 +508,25 @@ static void *get_base_data(struct base_data *c)
> {
> if (!c->data) {
> struct object_entry *obj = c->obj;
> + struct base_data **delta = NULL;
> + int delta_nr = 0, delta_alloc = 0;
>
> - if (is_delta_type(obj->type)) {
> - void *base = get_base_data(c->base);
> - void *raw = get_data_from_pack(obj);
I think you missed the critical recursion. The real work is the
recursion within find_unresolved_deltas(). This little helper
get_base_data() shouldn't be tripping over these cases unless we have
run out of delta_base_cache_limit and released objects near the base
end of the delta chain, in which case this will restore them.
Maybe this is useful on its own, but in my opinion its not an
interesting patch to consider without first fixing
find_unresolved_deltas's recursion.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] index-pack: eliminate unlimited recursion in get_delta_base()
2011-12-08 3:02 ` Shawn Pearce
@ 2011-12-08 11:06 ` Nguyen Thai Ngoc Duy
2011-12-08 13:40 ` [PATCH 1/2] index_pack: indent find_unresolved_detals one level pclouds
[not found] ` <1323351638-4790-1-git-send-email-y>
2 siblings, 0 replies; 19+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-12-08 11:06 UTC (permalink / raw)
To: Shawn Pearce; +Cc: git, Junio C Hamano
2011/12/8 Shawn Pearce <spearce@spearce.org>:
> I think you missed the critical recursion. The real work is the
> recursion within find_unresolved_deltas(). This little helper
> get_base_data() shouldn't be tripping over these cases unless we have
> run out of delta_base_cache_limit and released objects near the base
> end of the delta chain, in which case this will restore them.
>
> Maybe this is useful on its own, but in my opinion its not an
> interesting patch to consider without first fixing
> find_unresolved_deltas's recursion.
Thanks. I missed that function. Will try to fix it.
--
Duy
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/2] index_pack: indent find_unresolved_detals one level
2011-12-08 3:02 ` Shawn Pearce
2011-12-08 11:06 ` Nguyen Thai Ngoc Duy
@ 2011-12-08 13:40 ` pclouds
2011-12-09 21:27 ` Junio C Hamano
[not found] ` <1323351638-4790-1-git-send-email-y>
2 siblings, 1 reply; 19+ messages in thread
From: pclouds @ 2011-12-08 13:40 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Shawn Pearce,
Nguyễn Thái Ngọc Duy
From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
The next patch puts most of the code in one level deeper. By indenting
separately, it'd be easier to see the actual changes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
builtin/index-pack.c | 47 +++++++++++++++++++++++------------------------
1 files changed, 23 insertions(+), 24 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 103e19c..9855ada 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -585,37 +585,36 @@ static void find_unresolved_deltas(struct base_data *base,
base_spec.offset = base->obj->idx.offset;
find_delta_children(&base_spec,
&ofs_first, &ofs_last, OBJ_OFS_DELTA);
- }
- if (ref_last == -1 && ofs_last == -1) {
- free(base->data);
- return;
- }
+ if (ref_last == -1 && ofs_last == -1) {
+ free(base->data);
+ return;
+ }
- link_base_data(prev_base, base);
+ link_base_data(prev_base, base);
- for (i = ref_first; i <= ref_last; i++) {
- struct object_entry *child = objects + deltas[i].obj_no;
- struct base_data result;
+ for (i = ref_first; i <= ref_last; i++) {
+ struct object_entry *child = objects + deltas[i].obj_no;
+ struct base_data result;
- assert(child->real_type == OBJ_REF_DELTA);
- resolve_delta(child, base, &result);
- if (i == ref_last && ofs_last == -1)
- free_base_data(base);
- find_unresolved_deltas(&result, base);
- }
+ assert(child->real_type == OBJ_REF_DELTA);
+ resolve_delta(child, base, &result);
+ if (i == ref_last && ofs_last == -1)
+ free_base_data(base);
+ find_unresolved_deltas(&result, base);
+ }
- for (i = ofs_first; i <= ofs_last; i++) {
- struct object_entry *child = objects + deltas[i].obj_no;
- struct base_data result;
+ for (i = ofs_first; i <= ofs_last; i++) {
+ struct object_entry *child = objects + deltas[i].obj_no;
+ struct base_data result;
- assert(child->real_type == OBJ_OFS_DELTA);
- resolve_delta(child, base, &result);
- if (i == ofs_last)
- free_base_data(base);
- find_unresolved_deltas(&result, base);
+ assert(child->real_type == OBJ_OFS_DELTA);
+ resolve_delta(child, base, &result);
+ if (i == ofs_last)
+ free_base_data(base);
+ find_unresolved_deltas(&result, base);
+ }
}
-
unlink_base_data(base);
}
--
1.7.8.36.g69ee2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/2] index-pack: a naive attempt to flatten find_unresolved_deltas
[not found] ` <1323351638-4790-1-git-send-email-y>
@ 2011-12-08 13:40 ` pclouds
2011-12-08 16:42 ` Shawn Pearce
0 siblings, 1 reply; 19+ messages in thread
From: pclouds @ 2011-12-08 13:40 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Shawn Pearce,
Nguyễn Thái Ngọc Duy
From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
This seems to work for me. But I think this approach uses (much?) more
memory because it turns a tree of calls into a flat list of calls and
keep the list until the end, while the recursive version only has to
keep one call chain at a time.
Any ideas how to improve it?
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
builtin/index-pack.c | 57 +++++++++++++++++++++++++++++++++++++------------
1 files changed, 43 insertions(+), 14 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 9855ada..d4c182f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -565,18 +565,31 @@ static void resolve_delta(struct object_entry *delta_obj,
nr_resolved_deltas++;
}
+struct fud_stack {
+ struct base_data base_;
+ struct base_data *base;
+ struct base_data *prev_base;
+};
+
static void find_unresolved_deltas(struct base_data *base,
struct base_data *prev_base)
{
+ struct fud_stack **stack = NULL;
+ int stk, stack_nr = 0, stack_alloc = 0;
int i, ref_first, ref_last, ofs_first, ofs_last;
- /*
- * This is a recursive function. Those brackets should help reducing
- * stack usage by limiting the scope of the delta_base union.
- */
- {
+ ALLOC_GROW(stack, stack_nr + 1, stack_alloc);
+ stack[stack_nr] = xmalloc(sizeof(struct fud_stack));
+ stack[stack_nr]->base = base;
+ stack[stack_nr]->prev_base = prev_base;
+ stack_nr++;
+
+ for (stk = 0; stk < stack_nr; stk++) {
union delta_base base_spec;
+ base = stack[stk]->base;
+ prev_base = stack[stk]->prev_base;
+
hashcpy(base_spec.sha1, base->obj->idx.sha1);
find_delta_children(&base_spec,
&ref_first, &ref_last, OBJ_REF_DELTA);
@@ -588,34 +601,50 @@ static void find_unresolved_deltas(struct base_data *base,
if (ref_last == -1 && ofs_last == -1) {
free(base->data);
- return;
+ free(stack[stk]);
+ stack[stk] = NULL;
+ continue;
}
link_base_data(prev_base, base);
+ ALLOC_GROW(stack, stack_nr +
+ (ref_last - ref_first + 1) +
+ (ofs_last - ofs_first + 1),
+ stack_alloc);
+
for (i = ref_first; i <= ref_last; i++) {
struct object_entry *child = objects + deltas[i].obj_no;
- struct base_data result;
assert(child->real_type == OBJ_REF_DELTA);
- resolve_delta(child, base, &result);
+ stack[stack_nr] = xmalloc(sizeof(struct fud_stack));
+ stack[stack_nr]->base = &stack[stack_nr]->base_;
+ resolve_delta(child, base, stack[stack_nr]->base);
if (i == ref_last && ofs_last == -1)
free_base_data(base);
- find_unresolved_deltas(&result, base);
+ stack[stack_nr]->prev_base = base;
+ stack_nr++;
}
for (i = ofs_first; i <= ofs_last; i++) {
struct object_entry *child = objects + deltas[i].obj_no;
- struct base_data result;
-
assert(child->real_type == OBJ_OFS_DELTA);
- resolve_delta(child, base, &result);
+ stack[stack_nr] = xmalloc(sizeof(struct fud_stack));
+ stack[stack_nr]->base = &stack[stack_nr]->base_;
+ resolve_delta(child, base, stack[stack_nr]->base);
if (i == ofs_last)
free_base_data(base);
- find_unresolved_deltas(&result, base);
+ stack[stack_nr]->prev_base = base;
+ stack_nr++;
}
}
- unlink_base_data(base);
+
+ for (stk = stack_nr - 1; stk >= 0; stk--)
+ if (stack[stk]) {
+ unlink_base_data(stack[stk]->base);
+ free(stack[stk]);
+ }
+ free(stack);
}
static int compare_delta_entry(const void *a, const void *b)
--
1.7.8.36.g69ee2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] index-pack: a naive attempt to flatten find_unresolved_deltas
2011-12-08 13:40 ` [PATCH 2/2] index-pack: a naive attempt to flatten find_unresolved_deltas pclouds
@ 2011-12-08 16:42 ` Shawn Pearce
0 siblings, 0 replies; 19+ messages in thread
From: Shawn Pearce @ 2011-12-08 16:42 UTC (permalink / raw)
To: pclouds; +Cc: git, Junio C Hamano
2011/12/8 <pclouds@gmail.com>:
> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
> This seems to work for me. But I think this approach uses (much?) more
> memory because it turns a tree of calls into a flat list of calls and
> keep the list until the end, while the recursive version only has to
> keep one call chain at a time.
>
> Any ideas how to improve it?
Well, fortunately JGit is under a very liberal BSD license and has a
pretty efficient version of a heap based algorithm for delta
resolution. You could look at [1] for ideas. I am told the BSD license
is very compatible with the GPLv2. (Unlike the reverse.)
[1] http://egit.eclipse.org/w/?p=jgit.git;a=blob;f=org.eclipse.jgit/src/org/eclipse/jgit/transport/PackParser.java;hb=HEAD#l556
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] index_pack: indent find_unresolved_detals one level
2011-12-08 13:40 ` [PATCH 1/2] index_pack: indent find_unresolved_detals one level pclouds
@ 2011-12-09 21:27 ` Junio C Hamano
0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2011-12-09 21:27 UTC (permalink / raw)
To: pclouds; +Cc: git, Shawn Pearce
pclouds@gmail.com writes:
> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
> The next patch puts most of the code in one level deeper. By indenting
> separately, it'd be easier to see the actual changes.
Yuck.
Isn't it a sign that "the next patch" should perhaps be helped by a small
helper function that does whatever the part you are indenting here?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] Set hard limit on delta chain depth
2011-12-06 15:45 ` Nguyen Thai Ngoc Duy
@ 2011-12-10 0:02 ` Junio C Hamano
0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2011-12-10 0:02 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git
Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> That's interesting. First of all xmalloc() is controlled by us while
> index-pack code might lead to stack overflow exploit (never done it,
> not sure if it's really pratical to do in this case).
What do you exactly mean by "stack overflow exploit"?
If your callee has prepares a stackframe that is not sufficiently big but
carelessly tries to store more than it has space for, such a write can
overflow the stack (without hardware traps) and overwrite return address,
and instead of coming back to you, the control can be transferred to
random places.
But I do not think that is what we are talking about here.
You attempt to write parameters and return address to the area of memory
pointed by your stack pointer, advance the stack pointer to create a stack
frame and the callee attempts to write to its local variables in the newly
allocated stack frame. These memory accesses eventually attempt to touch
memory beyond the address range the kernel gave you page table entries to
be used as your stack space, and hardware traps. If you haven't run out of
the stack, a new page is lazily added to the page table and your attempted
access will succeed. If you are recursing too deeply, you won't be given a
new page and you will be killed by the kernel. That is a rather controlled
death of the process, unlike smashing the contents of the stack to jump to
a randomly chosen place, isn't it?
Of course, some platforms do not have an unwritable gap between the stack
segment that grow downwards and the heap that grow upwards, and also your
stackframe could be larger than such a gap (in this particular callchain I
do not think that is the case), so the above discussion does not apply
universally, though.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2011-12-10 0:02 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-05 7:04 [PATCH] Set hard limit on delta chain depth Nguyễn Thái Ngọc Duy
2011-12-06 12:17 ` Erik Faye-Lund
2011-12-06 12:32 ` Nguyen Thai Ngoc Duy
2011-12-06 12:41 ` Erik Faye-Lund
2011-12-06 12:48 ` Nguyen Thai Ngoc Duy
2011-12-06 14:54 ` Michael Haggerty
2011-12-06 15:30 ` Nguyen Thai Ngoc Duy
2011-12-06 18:12 ` Shawn Pearce
2011-12-06 18:56 ` Jeff King
2011-12-06 15:06 ` Junio C Hamano
2011-12-06 15:45 ` Nguyen Thai Ngoc Duy
2011-12-10 0:02 ` Junio C Hamano
2011-12-07 17:50 ` [PATCH] index-pack: eliminate unlimited recursion in get_delta_base() Nguyễn Thái Ngọc Duy
2011-12-08 3:02 ` Shawn Pearce
2011-12-08 11:06 ` Nguyen Thai Ngoc Duy
2011-12-08 13:40 ` [PATCH 1/2] index_pack: indent find_unresolved_detals one level pclouds
2011-12-09 21:27 ` Junio C Hamano
[not found] ` <1323351638-4790-1-git-send-email-y>
2011-12-08 13:40 ` [PATCH 2/2] index-pack: a naive attempt to flatten find_unresolved_deltas pclouds
2011-12-08 16:42 ` Shawn Pearce
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).