* [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth
@ 2007-11-14 4:48 Shawn O. Pearce
2007-11-14 5:45 ` Brian Downing
2007-11-14 5:46 ` Junio C Hamano
0 siblings, 2 replies; 4+ messages in thread
From: Shawn O. Pearce @ 2007-11-14 4:48 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Brian Downing noticed fast-import can produce tree depths of up
to 6,035 objects and even deeper. Long delta chains can create
very small packfiles but cause problems during repacking as git
needs to unpack each tree to count the reachable blobs.
What's happening here is the active branch cache isn't big enough.
We're swapping out the branch and thus recycling the tree information
(struct tree_content) back into the free pool. When we later reload
the tree we set the delta_depth to 0 but we kept the tree we just
reloaded as a delta base.
So if the tree we reloaded was already at the maximum depth we
wouldn't know it and make the new tree a delta. Multiply the
number of times the branch cache has to swap out the tree times
max_depth (10) and you get the maximum delta depth of a tree created
by fast-import. In Brian's case above the active branch cache had
to swap the branch out 603/604 times during this import to produce
a tree with a delta depth of 6035.
Acked-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
Junio, this patch is against maint. It will apply cleanly to maint
but is also crafted to ensure it should apply to next with git-am -3.
Its a real bug that's lasted a long time in fast-import. I think
it is maint material.
My fast-import tree is quite behind yours so I would appreciate
it if you could apply this to your own tree instead of trying to
fetch things from me. Thanks.
fast-import.c | 17 ++++++++++++-----
1 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/fast-import.c b/fast-import.c
index c07e3d8..7544949 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -154,13 +154,16 @@ Format of STDIN stream:
#define PACK_ID_BITS 16
#define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
+#define DEPTH_BITS 13
+#define MAX_DEPTH ((1<<DEPTH_BITS)-1)
struct object_entry
{
struct object_entry *next;
uint32_t offset;
- unsigned type : TYPE_BITS;
- unsigned pack_id : PACK_ID_BITS;
+ uint32_t type : TYPE_BITS,
+ pack_id : PACK_ID_BITS,
+ depth : DEPTH_BITS;
unsigned char sha1[20];
};
@@ -1105,7 +1108,7 @@ static int store_object(
unsigned pos = sizeof(hdr) - 1;
delta_count_by_type[type]++;
- last->depth++;
+ e->depth = ++last->depth++;
hdrlen = encode_header(OBJ_OFS_DELTA, deltalen, hdr);
write_or_die(pack_data->pack_fd, hdr, hdrlen);
@@ -1117,6 +1120,7 @@ static int store_object(
write_or_die(pack_data->pack_fd, hdr + pos, sizeof(hdr) - pos);
pack_size += sizeof(hdr) - pos;
} else {
+ e->depth = 0;
if (last)
last->depth = 0;
hdrlen = encode_header(type, datlen, hdr);
@@ -1181,7 +1185,7 @@ static void load_tree(struct tree_entry *root)
if (myoe && myoe->pack_id != MAX_PACK_ID) {
if (myoe->type != OBJ_TREE)
die("Not a tree: %s", sha1_to_hex(sha1));
- t->delta_depth = 0;
+ t->delta_depth = myoe->depth;
buf = gfi_unpack_entry(myoe, &size);
} else {
enum object_type type;
@@ -2347,8 +2351,11 @@ int main(int argc, const char **argv)
}
else if (!prefixcmp(a, "--max-pack-size="))
max_packsize = strtoumax(a + 16, NULL, 0) * 1024 * 1024;
- else if (!prefixcmp(a, "--depth="))
+ else if (!prefixcmp(a, "--depth=")) {
max_depth = strtoul(a + 8, NULL, 0);
+ if (max_depth > MAX_DEPTH)
+ die("--depth cannot exceed %u", MAX_DEPTH);
+ }
else if (!prefixcmp(a, "--active-branches="))
max_active_branches = strtoul(a + 18, NULL, 0);
else if (!prefixcmp(a, "--import-marks="))
--
1.5.3.5.1728.g34b3e
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth
2007-11-14 4:48 [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth Shawn O. Pearce
@ 2007-11-14 5:45 ` Brian Downing
2007-11-14 5:46 ` Junio C Hamano
1 sibling, 0 replies; 4+ messages in thread
From: Brian Downing @ 2007-11-14 5:45 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Junio C Hamano, git
On Tue, Nov 13, 2007 at 11:48:42PM -0500, Shawn O. Pearce wrote:
> @@ -1105,7 +1108,7 @@ static int store_object(
> unsigned pos = sizeof(hdr) - 1;
>
> delta_count_by_type[type]++;
> - last->depth++;
> + e->depth = ++last->depth++;
>
> hdrlen = encode_header(OBJ_OFS_DELTA, deltalen, hdr);
> write_or_die(pack_data->pack_fd, hdr, hdrlen);
I'm not sure, but I think that's too many ++'s.
The earlier patch had:
@@ -1084,6 +1087,7 @@ static int store_object(
delta_count_by_type[type]++;
last->depth++;
+ e->depth = last->depth;
hdrlen = encode_header(OBJ_OFS_DELTA, deltalen, hdr);
write_or_die(pack_data->pack_fd, hdr, hdrlen);
Which of course would be the equivalent of:
> + e->depth = ++last->depth;
Maybe there's some cleverness here I'm missing, though.
-bcd
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth
2007-11-14 4:48 [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth Shawn O. Pearce
2007-11-14 5:45 ` Brian Downing
@ 2007-11-14 5:46 ` Junio C Hamano
2007-11-14 5:54 ` Shawn O. Pearce
1 sibling, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2007-11-14 5:46 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
"Shawn O. Pearce" <spearce@spearce.org> writes:
> Junio, this patch is against maint. It will apply cleanly to maint
> but is also crafted to ensure it should apply to next with git-am -3.
> Its a real bug that's lasted a long time in fast-import. I think
> it is maint material.
Thanks.
> diff --git a/fast-import.c b/fast-import.c
> index c07e3d8..7544949 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -154,13 +154,16 @@ Format of STDIN stream:
>
> #define PACK_ID_BITS 16
> #define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
> +#define DEPTH_BITS 13
> +#define MAX_DEPTH ((1<<DEPTH_BITS)-1)
>
> struct object_entry
> {
> struct object_entry *next;
> uint32_t offset;
> - unsigned type : TYPE_BITS;
> - unsigned pack_id : PACK_ID_BITS;
> + uint32_t type : TYPE_BITS,
> + pack_id : PACK_ID_BITS,
> + depth : DEPTH_BITS;
> unsigned char sha1[20];
> };
uint32_t with bit-width specifiers look somewhat funny here...
>
> @@ -1105,7 +1108,7 @@ static int store_object(
> unsigned pos = sizeof(hdr) - 1;
>
> delta_count_by_type[type]++;
> - last->depth++;
> + e->depth = ++last->depth++;
"lvalue required as increment operand"?
Wouldn't it be easier to read like this?
diff --git a/fast-import.c b/fast-import.c
index 7544949..d32c412 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -161,7 +161,7 @@ struct object_entry
{
struct object_entry *next;
uint32_t offset;
- uint32_t type : TYPE_BITS,
+ unsigned type : TYPE_BITS,
pack_id : PACK_ID_BITS,
depth : DEPTH_BITS;
unsigned char sha1[20];
@@ -1108,7 +1108,7 @@ static int store_object(
unsigned pos = sizeof(hdr) - 1;
delta_count_by_type[type]++;
- e->depth = ++last->depth++;
+ e->depth = last->depth + 1;
hdrlen = encode_header(OBJ_OFS_DELTA, deltalen, hdr);
write_or_die(pack_data->pack_fd, hdr, hdrlen);
@@ -1121,8 +1121,6 @@ static int store_object(
pack_size += sizeof(hdr) - pos;
} else {
e->depth = 0;
- if (last)
- last->depth = 0;
hdrlen = encode_header(type, datlen, hdr);
write_or_die(pack_data->pack_fd, hdr, hdrlen);
pack_size += hdrlen;
@@ -1138,6 +1136,7 @@ static int store_object(
free(last->data);
last->data = dat;
last->offset = e->offset;
+ last->depth = e->depth;
last->len = datlen;
}
return 0;
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth
2007-11-14 5:46 ` Junio C Hamano
@ 2007-11-14 5:54 ` Shawn O. Pearce
0 siblings, 0 replies; 4+ messages in thread
From: Shawn O. Pearce @ 2007-11-14 5:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano <gitster@pobox.com> wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
> > + uint32_t type : TYPE_BITS,
> > + pack_id : PACK_ID_BITS,
> > + depth : DEPTH_BITS;
>
> uint32_t with bit-width specifiers look somewhat funny here...
Sure. But the prior item is also uint32_t and before that a pointer
so it aligns out nice. And we're using exactly 32 bits in this field.
So it just sort of made sense to me to declare it was a uint32_t.
> > @@ -1105,7 +1108,7 @@ static int store_object(
> > unsigned pos = sizeof(hdr) - 1;
> >
> > delta_count_by_type[type]++;
> > - last->depth++;
> > + e->depth = ++last->depth++;
>
> "lvalue required as increment operand"?
>
> Wouldn't it be easier to read like this?
Yes. Good call. :-)
> diff --git a/fast-import.c b/fast-import.c
> index 7544949..d32c412 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -161,7 +161,7 @@ struct object_entry
> {
> struct object_entry *next;
> uint32_t offset;
> - uint32_t type : TYPE_BITS,
> + unsigned type : TYPE_BITS,
> pack_id : PACK_ID_BITS,
> depth : DEPTH_BITS;
> unsigned char sha1[20];
> @@ -1108,7 +1108,7 @@ static int store_object(
> unsigned pos = sizeof(hdr) - 1;
>
> delta_count_by_type[type]++;
> - e->depth = ++last->depth++;
> + e->depth = last->depth + 1;
>
> hdrlen = encode_header(OBJ_OFS_DELTA, deltalen, hdr);
> write_or_die(pack_data->pack_fd, hdr, hdrlen);
> @@ -1121,8 +1121,6 @@ static int store_object(
> pack_size += sizeof(hdr) - pos;
> } else {
> e->depth = 0;
> - if (last)
> - last->depth = 0;
> hdrlen = encode_header(type, datlen, hdr);
> write_or_die(pack_data->pack_fd, hdr, hdrlen);
> pack_size += hdrlen;
> @@ -1138,6 +1136,7 @@ static int store_object(
> free(last->data);
> last->data = dat;
> last->offset = e->offset;
> + last->depth = e->depth;
> last->len = datlen;
> }
> return 0;
--
Shawn.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2007-11-14 5:55 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-14 4:48 [PATCH] Don't allow fast-import tree delta chains to exceed maximum depth Shawn O. Pearce
2007-11-14 5:45 ` Brian Downing
2007-11-14 5:46 ` Junio C Hamano
2007-11-14 5:54 ` Shawn O. Pearce
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).