* [PATCH 0/2] nd/slim-index-pack-memory-usage update
@ 2015-04-18 10:47 Nguyễn Thái Ngọc Duy
2015-04-18 10:47 ` [PATCH 1/2] index-pack: reduce object_entry size to save memory Nguyễn Thái Ngọc Duy
2015-04-18 10:47 ` [PATCH 2/2] index-pack: kill union delta_base " Nguyễn Thái Ngọc Duy
0 siblings, 2 replies; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-04-18 10:47 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
The only change is in the commit message of 2/2 [1]. Sorry it took too
long, I completely forgot about it.
[1] http://article.gmane.org/gmane.comp.version-control.git/264544
Nguyễn Thái Ngọc Duy (2):
index-pack: reduce object_entry size to save memory
index-pack: kill union delta_base to save memory
builtin/index-pack.c | 290 +++++++++++++++++++++++++++++++--------------------
1 file changed, 179 insertions(+), 111 deletions(-)
--
2.3.0.rc1.137.g477eb31
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] index-pack: reduce object_entry size to save memory
2015-04-18 10:47 [PATCH 0/2] nd/slim-index-pack-memory-usage update Nguyễn Thái Ngọc Duy
@ 2015-04-18 10:47 ` Nguyễn Thái Ngọc Duy
2015-04-18 10:47 ` [PATCH 2/2] index-pack: kill union delta_base " Nguyễn Thái Ngọc Duy
1 sibling, 0 replies; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-04-18 10:47 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy, Junio C Hamano
For each object in the input pack, we need one struct object_entry. On
x86-64, this struct is 64 bytes long. Although:
- The 8 bytes for delta_depth and base_object_no are only useful when
show_stat is set. And it's never set unless someone is debugging.
- The three fields hdr_size, type and real_type take 4 bytes each
even though they never use more than 4 bits.
By moving delta_depth and base_object_no out of struct object_entry
and make the other 3 fields one byte long instead of 4, we shrink 25%
of this struct.
On a 3.4M object repo (*) that's about 53MB. The saving is less
impressive compared to index-pack memory use for basic bookkeeping (**),
about 16%.
(*) linux-2.6.git already has 4M objects as of v3.19-rc7 so this is
not an unrealistic number of objects that we have to deal with.
(**) 3.4M * (sizeof(object_entry) + sizeof(delta_entry)) = 311MB
Brought-up-by: Matthew Sporleder <msporleder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
builtin/index-pack.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4632117..9d854fb 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -18,9 +18,12 @@ static const char index_pack_usage[] =
struct object_entry {
struct pack_idx_entry idx;
unsigned long size;
- unsigned int hdr_size;
- enum object_type type;
- enum object_type real_type;
+ unsigned char hdr_size;
+ signed char type;
+ signed char real_type;
+};
+
+struct object_stat {
unsigned delta_depth;
int base_object_no;
};
@@ -64,6 +67,7 @@ struct delta_entry {
};
static struct object_entry *objects;
+static struct object_stat *obj_stat;
static struct delta_entry *deltas;
static struct thread_local nothread_data;
static int nr_objects;
@@ -873,13 +877,15 @@ static void resolve_delta(struct object_entry *delta_obj,
void *base_data, *delta_data;
if (show_stat) {
- delta_obj->delta_depth = base->obj->delta_depth + 1;
+ int i = delta_obj - objects;
+ int j = base->obj - objects;
+ obj_stat[i].delta_depth = obj_stat[j].delta_depth + 1;
deepest_delta_lock();
- if (deepest_delta < delta_obj->delta_depth)
- deepest_delta = delta_obj->delta_depth;
+ if (deepest_delta < obj_stat[i].delta_depth)
+ deepest_delta = obj_stat[i].delta_depth;
deepest_delta_unlock();
+ obj_stat[i].base_object_no = j;
}
- delta_obj->base_object_no = base->obj - objects;
delta_data = get_data_from_pack(delta_obj);
base_data = get_base_data(base);
result->obj = delta_obj;
@@ -902,7 +908,7 @@ static void resolve_delta(struct object_entry *delta_obj,
* "want"; if so, swap in "set" and return true. Otherwise, leave it untouched
* and return false.
*/
-static int compare_and_swap_type(enum object_type *type,
+static int compare_and_swap_type(signed char *type,
enum object_type want,
enum object_type set)
{
@@ -1499,7 +1505,7 @@ static void show_pack_info(int stat_only)
struct object_entry *obj = &objects[i];
if (is_delta_type(obj->type))
- chain_histogram[obj->delta_depth - 1]++;
+ chain_histogram[obj_stat[i].delta_depth - 1]++;
if (stat_only)
continue;
printf("%s %-6s %lu %lu %"PRIuMAX,
@@ -1508,8 +1514,8 @@ static void show_pack_info(int stat_only)
(unsigned long)(obj[1].idx.offset - obj->idx.offset),
(uintmax_t)obj->idx.offset);
if (is_delta_type(obj->type)) {
- struct object_entry *bobj = &objects[obj->base_object_no];
- printf(" %u %s", obj->delta_depth, sha1_to_hex(bobj->idx.sha1));
+ struct object_entry *bobj = &objects[obj_stat[i].base_object_no];
+ printf(" %u %s", obj_stat[i].delta_depth, sha1_to_hex(bobj->idx.sha1));
}
putchar('\n');
}
@@ -1672,6 +1678,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
curr_pack = open_pack_file(pack_name);
parse_pack_header();
objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+ if (show_stat)
+ obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
parse_pack_objects(pack_sha1);
resolve_deltas();
--
2.3.0.rc1.137.g477eb31
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/2] index-pack: kill union delta_base to save memory
2015-04-18 10:47 [PATCH 0/2] nd/slim-index-pack-memory-usage update Nguyễn Thái Ngọc Duy
2015-04-18 10:47 ` [PATCH 1/2] index-pack: reduce object_entry size to save memory Nguyễn Thái Ngọc Duy
@ 2015-04-18 10:47 ` Nguyễn Thái Ngọc Duy
2015-07-03 16:51 ` Junio C Hamano
1 sibling, 1 reply; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-04-18 10:47 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
Once we know the number of objects in the input pack, we allocate an
array of nr_objects of struct delta_entry. On x86-64, this struct is
32 bytes long. The union delta_base, which is part of struct
delta_entry, provides enough space to store either ofs-delta (8 bytes)
or ref-delta (20 bytes).
Because ofs-delta encoding is more efficient space-wise and more
performant at runtime than ref-delta encoding, Git packers try to use
ofs-delta whenever possible, and it is expected that objects encoded
as ref-delta are minority.
In the best clone case where no ref-delta object is present, we waste
(20-8) * nr_objects bytes because of this union. That's about 38MB out
of 100MB for deltas[] with 3.4M objects, or 38%. deltas[] would be
around 62MB without the waste.
This patch attempts to eliminate that. deltas[] array is split into
two: one for ofs-delta and one for ref-delta. Many functions are also
duplicated because of this split. With this patch, ofs_deltas[] array
takes 51MB. ref_deltas[] should remain unallocated in clone case (0
bytes). This array grows as we see ref-delta. We save about half in
this case, or 25% of total bookkeeping.
The saving is more than the calculation above because some padding in
the old delta_entry struct is removed. ofs_delta_entry is 16 bytes,
including the 4 bytes padding. That's 13MB for padding, but packing
the struct could break platforms that do not support unaligned
access. If someone on 32-bit is really low on memory and only deals
with packs smaller than 2G, using 32-bit off_t would eliminate the
padding and save 27MB on top.
A note about ofs_deltas allocation. We could use ref_deltas memory
allocation strategy for ofs_deltas. But that probably just adds more
overhead on top. ofs-deltas are generally the majority (1/2 to 2/3) in
any pack. Incremental realloc may lead to too many memcpy. And if we
preallocate, say 1/2 or 2/3 of nr_objects initially, the growth rate
of ALLOC_GROW() could make this array larger than nr_objects, wasting
more memory.
Brought-up-by: Matthew Sporleder <msporleder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
builtin/index-pack.c | 260 +++++++++++++++++++++++++++++++--------------------
1 file changed, 160 insertions(+), 100 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 9d854fb..3ed53e3 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -28,11 +28,6 @@ struct object_stat {
int base_object_no;
};
-union delta_base {
- unsigned char sha1[20];
- off_t offset;
-};
-
struct base_data {
struct base_data *base;
struct base_data *child;
@@ -52,26 +47,28 @@ struct thread_local {
int pack_fd;
};
-/*
- * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
- * to memcmp() only the first 20 bytes.
- */
-#define UNION_BASE_SZ 20
-
#define FLAG_LINK (1u<<20)
#define FLAG_CHECKED (1u<<21)
-struct delta_entry {
- union delta_base base;
+struct ofs_delta_entry {
+ off_t offset;
+ int obj_no;
+};
+
+struct ref_delta_entry {
+ unsigned char sha1[20];
int obj_no;
};
static struct object_entry *objects;
static struct object_stat *obj_stat;
-static struct delta_entry *deltas;
+static struct ofs_delta_entry *ofs_deltas;
+static struct ref_delta_entry *ref_deltas;
static struct thread_local nothread_data;
static int nr_objects;
-static int nr_deltas;
+static int nr_ofs_deltas;
+static int nr_ref_deltas;
+static int ref_deltas_alloc;
static int nr_resolved_deltas;
static int nr_threads;
@@ -480,7 +477,8 @@ static void *unpack_entry_data(unsigned long offset, unsigned long size,
}
static void *unpack_raw_entry(struct object_entry *obj,
- union delta_base *delta_base,
+ off_t *ofs_offset,
+ unsigned char *ref_sha1,
unsigned char *sha1)
{
unsigned char *p;
@@ -509,11 +507,10 @@ static void *unpack_raw_entry(struct object_entry *obj,
switch (obj->type) {
case OBJ_REF_DELTA:
- hashcpy(delta_base->sha1, fill(20));
+ hashcpy(ref_sha1, fill(20));
use(20);
break;
case OBJ_OFS_DELTA:
- memset(delta_base, 0, sizeof(*delta_base));
p = fill(1);
c = *p;
use(1);
@@ -527,8 +524,8 @@ static void *unpack_raw_entry(struct object_entry *obj,
use(1);
base_offset = (base_offset << 7) + (c & 127);
}
- delta_base->offset = obj->idx.offset - base_offset;
- if (delta_base->offset <= 0 || delta_base->offset >= obj->idx.offset)
+ *ofs_offset = obj->idx.offset - base_offset;
+ if (*ofs_offset <= 0 || *ofs_offset >= obj->idx.offset)
bad_object(obj->idx.offset, _("delta base offset is out of bound"));
break;
case OBJ_COMMIT:
@@ -612,55 +609,108 @@ static void *get_data_from_pack(struct object_entry *obj)
return unpack_data(obj, NULL, NULL);
}
-static int compare_delta_bases(const union delta_base *base1,
- const union delta_base *base2,
- enum object_type type1,
- enum object_type type2)
+static int compare_ofs_delta_bases(off_t offset1, off_t offset2,
+ enum object_type type1,
+ enum object_type type2)
{
int cmp = type1 - type2;
if (cmp)
return cmp;
- return memcmp(base1, base2, UNION_BASE_SZ);
+ return offset1 - offset2;
}
-static int find_delta(const union delta_base *base, enum object_type type)
+static int find_ofs_delta(const off_t offset, enum object_type type)
{
- int first = 0, last = nr_deltas;
-
- while (first < last) {
- int next = (first + last) / 2;
- struct delta_entry *delta = &deltas[next];
- int cmp;
-
- cmp = compare_delta_bases(base, &delta->base,
- type, objects[delta->obj_no].type);
- if (!cmp)
- return next;
- if (cmp < 0) {
- last = next;
- continue;
- }
- first = next+1;
- }
- return -first-1;
+ int first = 0, last = nr_ofs_deltas;
+
+ while (first < last) {
+ int next = (first + last) / 2;
+ struct ofs_delta_entry *delta = &ofs_deltas[next];
+ int cmp;
+
+ cmp = compare_ofs_delta_bases(offset, delta->offset,
+ type, objects[delta->obj_no].type);
+ if (!cmp)
+ return next;
+ if (cmp < 0) {
+ last = next;
+ continue;
+ }
+ first = next+1;
+ }
+ return -first-1;
}
-static void find_delta_children(const union delta_base *base,
- int *first_index, int *last_index,
- enum object_type type)
+static void find_ofs_delta_children(off_t offset,
+ int *first_index, int *last_index,
+ enum object_type type)
{
- int first = find_delta(base, type);
+ int first = find_ofs_delta(offset, type);
int last = first;
- int end = nr_deltas - 1;
+ int end = nr_ofs_deltas - 1;
if (first < 0) {
*first_index = 0;
*last_index = -1;
return;
}
- while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
+ while (first > 0 && ofs_deltas[first - 1].offset == offset)
--first;
- while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
+ while (last < end && ofs_deltas[last + 1].offset == offset)
+ ++last;
+ *first_index = first;
+ *last_index = last;
+}
+
+static int compare_ref_delta_bases(const unsigned char *sha1,
+ const unsigned char *sha2,
+ enum object_type type1,
+ enum object_type type2)
+{
+ int cmp = type1 - type2;
+ if (cmp)
+ return cmp;
+ return hashcmp(sha1, sha2);
+}
+
+static int find_ref_delta(const unsigned char *sha1, enum object_type type)
+{
+ int first = 0, last = nr_ref_deltas;
+
+ while (first < last) {
+ int next = (first + last) / 2;
+ struct ref_delta_entry *delta = &ref_deltas[next];
+ int cmp;
+
+ cmp = compare_ref_delta_bases(sha1, delta->sha1,
+ type, objects[delta->obj_no].type);
+ if (!cmp)
+ return next;
+ if (cmp < 0) {
+ last = next;
+ continue;
+ }
+ first = next+1;
+ }
+ return -first-1;
+}
+
+static void find_ref_delta_children(const unsigned char *sha1,
+ int *first_index, int *last_index,
+ enum object_type type)
+{
+ int first = find_ref_delta(sha1, type);
+ int last = first;
+ int end = nr_ref_deltas - 1;
+
+ if (first < 0) {
+ *first_index = 0;
+ *last_index = -1;
+ return;
+ }
+ while (first > 0 && !hashcmp(ref_deltas[first - 1].sha1, sha1))
+ --first;
+ while (last < end && !hashcmp(ref_deltas[last + 1].sha1, sha1))
++last;
*first_index = first;
*last_index = last;
@@ -927,16 +977,13 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
struct base_data *prev_base)
{
if (base->ref_last == -1 && base->ofs_last == -1) {
- union delta_base base_spec;
-
- hashcpy(base_spec.sha1, base->obj->idx.sha1);
- find_delta_children(&base_spec,
- &base->ref_first, &base->ref_last, OBJ_REF_DELTA);
+ find_ref_delta_children(base->obj->idx.sha1,
+ &base->ref_first, &base->ref_last,
+ OBJ_REF_DELTA);
- memset(&base_spec, 0, sizeof(base_spec));
- base_spec.offset = base->obj->idx.offset;
- find_delta_children(&base_spec,
- &base->ofs_first, &base->ofs_last, OBJ_OFS_DELTA);
+ find_ofs_delta_children(base->obj->idx.offset,
+ &base->ofs_first, &base->ofs_last,
+ OBJ_OFS_DELTA);
if (base->ref_last == -1 && base->ofs_last == -1) {
free(base->data);
@@ -947,7 +994,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
}
if (base->ref_first <= base->ref_last) {
- struct object_entry *child = objects + deltas[base->ref_first].obj_no;
+ struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no;
struct base_data *result = alloc_base_data();
if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA,
@@ -963,7 +1010,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
}
if (base->ofs_first <= base->ofs_last) {
- struct object_entry *child = objects + deltas[base->ofs_first].obj_no;
+ struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no;
struct base_data *result = alloc_base_data();
assert(child->real_type == OBJ_OFS_DELTA);
@@ -999,15 +1046,20 @@ static void find_unresolved_deltas(struct base_data *base)
}
}
-static int compare_delta_entry(const void *a, const void *b)
+static int compare_ofs_delta_entry(const void *a, const void *b)
+{
+ const struct ofs_delta_entry *delta_a = a;
+ const struct ofs_delta_entry *delta_b = b;
+
+ return delta_a->offset - delta_b->offset;
+}
+
+static int compare_ref_delta_entry(const void *a, const void *b)
{
- const struct delta_entry *delta_a = a;
- const struct delta_entry *delta_b = b;
+ const struct ref_delta_entry *delta_a = a;
+ const struct ref_delta_entry *delta_b = b;
- /* group by type (ref vs ofs) and then by value (sha-1 or offset) */
- return compare_delta_bases(&delta_a->base, &delta_b->base,
- objects[delta_a->obj_no].type,
- objects[delta_b->obj_no].type);
+ return hashcmp(delta_a->sha1, delta_b->sha1);
}
static void resolve_base(struct object_entry *obj)
@@ -1053,7 +1105,8 @@ static void *threaded_second_pass(void *data)
static void parse_pack_objects(unsigned char *sha1)
{
int i, nr_delays = 0;
- struct delta_entry *delta = deltas;
+ struct ofs_delta_entry *ofs_delta = ofs_deltas;
+ unsigned char ref_delta_sha1[20];
struct stat st;
if (verbose)
@@ -1062,12 +1115,18 @@ static void parse_pack_objects(unsigned char *sha1)
nr_objects);
for (i = 0; i < nr_objects; i++) {
struct object_entry *obj = &objects[i];
- void *data = unpack_raw_entry(obj, &delta->base, obj->idx.sha1);
+ void *data = unpack_raw_entry(obj, &ofs_delta->offset,
+ ref_delta_sha1, obj->idx.sha1);
obj->real_type = obj->type;
- if (is_delta_type(obj->type)) {
- nr_deltas++;
- delta->obj_no = i;
- delta++;
+ if (obj->type == OBJ_OFS_DELTA) {
+ nr_ofs_deltas++;
+ ofs_delta->obj_no = i;
+ ofs_delta++;
+ } else if (obj->type == OBJ_REF_DELTA) {
+ ALLOC_GROW(ref_deltas, nr_ref_deltas + 1, ref_deltas_alloc);
+ hashcpy(ref_deltas[nr_ref_deltas].sha1, ref_delta_sha1);
+ ref_deltas[nr_ref_deltas].obj_no = i;
+ nr_ref_deltas++;
} else if (!data) {
/* large blobs, check later */
obj->real_type = OBJ_BAD;
@@ -1118,15 +1177,18 @@ static void resolve_deltas(void)
{
int i;
- if (!nr_deltas)
+ if (!nr_ofs_deltas && !nr_ref_deltas)
return;
/* Sort deltas by base SHA1/offset for fast searching */
- qsort(deltas, nr_deltas, sizeof(struct delta_entry),
- compare_delta_entry);
+ qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry),
+ compare_ofs_delta_entry);
+ qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry),
+ compare_ref_delta_entry);
if (verbose)
- progress = start_progress(_("Resolving deltas"), nr_deltas);
+ progress = start_progress(_("Resolving deltas"),
+ nr_ref_deltas + nr_ofs_deltas);
#ifndef NO_PTHREADS
nr_dispatched = 0;
@@ -1164,7 +1226,7 @@ static void resolve_deltas(void)
static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved);
static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned char *pack_sha1)
{
- if (nr_deltas == nr_resolved_deltas) {
+ if (nr_ref_deltas + nr_ofs_deltas == nr_resolved_deltas) {
stop_progress(&progress);
/* Flush remaining pack final 20-byte SHA1. */
flush();
@@ -1175,7 +1237,7 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
struct sha1file *f;
unsigned char read_sha1[20], tail_sha1[20];
struct strbuf msg = STRBUF_INIT;
- int nr_unresolved = nr_deltas - nr_resolved_deltas;
+ int nr_unresolved = nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas;
int nr_objects_initial = nr_objects;
if (nr_unresolved <= 0)
die(_("confusion beyond insanity"));
@@ -1197,11 +1259,11 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
die(_("Unexpected tail checksum for %s "
"(disk corruption?)"), curr_pack);
}
- if (nr_deltas != nr_resolved_deltas)
+ if (nr_ofs_deltas + nr_ref_deltas != nr_resolved_deltas)
die(Q_("pack has %d unresolved delta",
"pack has %d unresolved deltas",
- nr_deltas - nr_resolved_deltas),
- nr_deltas - nr_resolved_deltas);
+ nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas),
+ nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas);
}
static int write_compressed(struct sha1file *f, void *in, unsigned int size)
@@ -1261,14 +1323,14 @@ static struct object_entry *append_obj_to_pack(struct sha1file *f,
static int delta_pos_compare(const void *_a, const void *_b)
{
- struct delta_entry *a = *(struct delta_entry **)_a;
- struct delta_entry *b = *(struct delta_entry **)_b;
+ struct ref_delta_entry *a = *(struct ref_delta_entry **)_a;
+ struct ref_delta_entry *b = *(struct ref_delta_entry **)_b;
return a->obj_no - b->obj_no;
}
static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
{
- struct delta_entry **sorted_by_pos;
+ struct ref_delta_entry **sorted_by_pos;
int i, n = 0;
/*
@@ -1282,28 +1344,25 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
* resolving deltas in the same order as their position in the pack.
*/
sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
- for (i = 0; i < nr_deltas; i++) {
- if (objects[deltas[i].obj_no].real_type != OBJ_REF_DELTA)
- continue;
- sorted_by_pos[n++] = &deltas[i];
- }
+ for (i = 0; i < nr_ref_deltas; i++)
+ sorted_by_pos[n++] = &ref_deltas[i];
qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
for (i = 0; i < n; i++) {
- struct delta_entry *d = sorted_by_pos[i];
+ struct ref_delta_entry *d = sorted_by_pos[i];
enum object_type type;
struct base_data *base_obj = alloc_base_data();
if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
continue;
- base_obj->data = read_sha1_file(d->base.sha1, &type, &base_obj->size);
+ base_obj->data = read_sha1_file(d->sha1, &type, &base_obj->size);
if (!base_obj->data)
continue;
- if (check_sha1_signature(d->base.sha1, base_obj->data,
+ if (check_sha1_signature(d->sha1, base_obj->data,
base_obj->size, typename(type)))
- die(_("local object %s is corrupt"), sha1_to_hex(d->base.sha1));
- base_obj->obj = append_obj_to_pack(f, d->base.sha1,
+ die(_("local object %s is corrupt"), sha1_to_hex(d->sha1));
+ base_obj->obj = append_obj_to_pack(f, d->sha1,
base_obj->data, base_obj->size, type);
find_unresolved_deltas(base_obj);
display_progress(progress, nr_resolved_deltas);
@@ -1495,7 +1554,7 @@ static void read_idx_option(struct pack_idx_option *opts, const char *pack_name)
static void show_pack_info(int stat_only)
{
- int i, baseobjects = nr_objects - nr_deltas;
+ int i, baseobjects = nr_objects - nr_ref_deltas - nr_ofs_deltas;
unsigned long *chain_histogram = NULL;
if (deepest_delta)
@@ -1680,11 +1739,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
if (show_stat)
obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
- deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
+ ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
parse_pack_objects(pack_sha1);
resolve_deltas();
conclude_pack(fix_thin_pack, curr_pack, pack_sha1);
- free(deltas);
+ free(ofs_deltas);
+ free(ref_deltas);
if (strict)
foreign_nr = check_objects();
--
2.3.0.rc1.137.g477eb31
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] index-pack: kill union delta_base to save memory
2015-04-18 10:47 ` [PATCH 2/2] index-pack: kill union delta_base " Nguyễn Thái Ngọc Duy
@ 2015-07-03 16:51 ` Junio C Hamano
2015-07-03 18:29 ` Duy Nguyen
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-07-03 16:51 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> Once we know the number of objects in the input pack, we allocate an
> array of nr_objects of struct delta_entry. On x86-64, this struct is
> 32 bytes long. The union delta_base, which is part of struct
> delta_entry, provides enough space to store either ofs-delta (8 bytes)
> or ref-delta (20 bytes).
Sorry for responding to a patch 7000+ messages ago, but some kind
folks at Google were puzzled by this code, and I think they found a
small bug.
> static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
> {
> - struct delta_entry **sorted_by_pos;
> + struct ref_delta_entry **sorted_by_pos;
> int i, n = 0;
>
> /*
> @@ -1282,28 +1344,25 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
> * resolving deltas in the same order as their position in the pack.
> */
> sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
> - for (i = 0; i < nr_deltas; i++) {
> - if (objects[deltas[i].obj_no].real_type != OBJ_REF_DELTA)
> - continue;
> - sorted_by_pos[n++] = &deltas[i];
> - }
> + for (i = 0; i < nr_ref_deltas; i++)
> + sorted_by_pos[n++] = &ref_deltas[i];
> qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
You allocate an array of nr_unresolved (which is the sum of
nr_ref_deltas and nr_ofs_deltas in the new world order with patch)
entries, fill only the first nr_ref_deltas entries of it, and then
sort that first n (= nr_ref_deltas) entries. The qsort and the
subsequent loop only looks at the first n entries, so nothing is
filling or reading these nr_ofs_deltas entres at the end.
I do not see any wrong behaviour other than temporary wastage of
nr_ofs_deltas pointers that would be caused by this, but this
allocation is misleading.
Also, the old code before this change had to use 'i' and 'n' because
some of the things we see in the (old) deltas[] array we scanned
with 'i' would not make it into the sorted_by_pos[] array in the old
world order, but now because you have only ref delta in a separate
ref_deltas[] array, they increment lock&step. That also puzzled me
while re-reading this code.
Perhaps something like this is needed?
builtin/index-pack.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 48fa472..d6c48cd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1334,7 +1334,7 @@ static int delta_pos_compare(const void *_a, const void *_b)
static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
{
struct ref_delta_entry **sorted_by_pos;
- int i, n = 0;
+ int i;
/*
* Since many unresolved deltas may well be themselves base objects
@@ -1346,12 +1346,12 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
* before deltas depending on them, a good heuristic is to start
* resolving deltas in the same order as their position in the pack.
*/
- sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
+ sorted_by_pos = xmalloc(nr_ref_deltas * sizeof(*sorted_by_pos));
for (i = 0; i < nr_ref_deltas; i++)
- sorted_by_pos[n++] = &ref_deltas[i];
- qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
+ sorted_by_pos[i] = &ref_deltas[i];
+ qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
- for (i = 0; i < n; i++) {
+ for (i = 0; i < nr_ref_deltas; i++) {
struct ref_delta_entry *d = sorted_by_pos[i];
enum object_type type;
struct base_data *base_obj = alloc_base_data();
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] index-pack: kill union delta_base to save memory
2015-07-03 16:51 ` Junio C Hamano
@ 2015-07-03 18:29 ` Duy Nguyen
2015-07-04 1:21 ` [PATCH] index-pack: fix overallocation of sorted_by_pos array Junio C Hamano
2015-07-04 22:24 ` [PATCH 2/2] index-pack: kill union delta_base to save memory Junio C Hamano
0 siblings, 2 replies; 17+ messages in thread
From: Duy Nguyen @ 2015-07-03 18:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
On Fri, Jul 3, 2015 at 11:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> Once we know the number of objects in the input pack, we allocate an
>> array of nr_objects of struct delta_entry. On x86-64, this struct is
>> 32 bytes long. The union delta_base, which is part of struct
>> delta_entry, provides enough space to store either ofs-delta (8 bytes)
>> or ref-delta (20 bytes).
>
> Sorry for responding to a patch 7000+ messages ago, but some kind
> folks at Google were puzzled by this code, and I think they found a
> small bug.
>
>> static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
>> {
>> - struct delta_entry **sorted_by_pos;
>> + struct ref_delta_entry **sorted_by_pos;
>> int i, n = 0;
>>
>> /*
>> @@ -1282,28 +1344,25 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
>> * resolving deltas in the same order as their position in the pack.
>> */
>> sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
>> - for (i = 0; i < nr_deltas; i++) {
>> - if (objects[deltas[i].obj_no].real_type != OBJ_REF_DELTA)
>> - continue;
>> - sorted_by_pos[n++] = &deltas[i];
>> - }
>> + for (i = 0; i < nr_ref_deltas; i++)
>> + sorted_by_pos[n++] = &ref_deltas[i];
>> qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
>
> You allocate an array of nr_unresolved (which is the sum of
> nr_ref_deltas and nr_ofs_deltas in the new world order with patch)
> entries, fill only the first nr_ref_deltas entries of it, and then
> sort that first n (= nr_ref_deltas) entries. The qsort and the
> subsequent loop only looks at the first n entries, so nothing is
> filling or reading these nr_ofs_deltas entres at the end.
>
> I do not see any wrong behaviour other than temporary wastage of
> nr_ofs_deltas pointers that would be caused by this, but this
> allocation is misleading.
>
> Also, the old code before this change had to use 'i' and 'n' because
> some of the things we see in the (old) deltas[] array we scanned
> with 'i' would not make it into the sorted_by_pos[] array in the old
> world order, but now because you have only ref delta in a separate
> ref_deltas[] array, they increment lock&step. That also puzzled me
> while re-reading this code.
>
> Perhaps something like this is needed?
Yeah I can see the confusion when reading the code without checking
its history. You probably want to kill the argument nr_unresolved too
because it's not needed anymore after your patch.
So what's the bug you mentioned? All I got from the above was wastage
and confusion, no bug.
--
Duy
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] index-pack: fix overallocation of sorted_by_pos array
2015-07-03 18:29 ` Duy Nguyen
@ 2015-07-04 1:21 ` Junio C Hamano
2015-07-04 22:30 ` [PATCH] index-pack: fix allocation " Junio C Hamano
2015-07-04 22:24 ` [PATCH 2/2] index-pack: kill union delta_base to save memory Junio C Hamano
1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-07-04 1:21 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Git Mailing List
When c6458e60 (index-pack: kill union delta_base to save memory,
2015-04-18) attempted to reduce the memory footprint of index-pack,
one of the key thing it did was to keep track of ref-deltas and
ofs-deltas separately.
In fix_unresolved_deltas(), however it forgot that it now wants to
look only at ref deltas in one place. The code allocated an array
sufficient to hold both ref- and ofs-deltas, stuffed only ref-deltas
in the array, sorted it (with the right count) and iterated over the
array (with the right count).
There is no externally visible ill effect, other than a small
wastage of memory, but the code was misleading.
Also, the old code before this change had to use 'i' and 'n' because
some of the things we see in the (old) deltas[] array we scanned
with 'i' would not make it into the sorted_by_pos[] array in the old
world order, but now because you have only ref delta in a separate
ref_deltas[] array, they increment lock&step. We no longer need
separate variables (nor the nr_unresolved parameter).
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* This time with a formal log message. Again, this is not really a
"bug"-fix; just returning the memory that we allocated without
using to the system, and making the code easier to follow.
builtin/index-pack.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3ed53e3..fa13e20 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1223,7 +1223,7 @@ static void resolve_deltas(void)
* - append objects to convert thin pack to full pack if required
* - write the final 20-byte SHA-1
*/
-static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved);
+static void fix_unresolved_deltas(struct sha1file *f);
static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned char *pack_sha1)
{
if (nr_ref_deltas + nr_ofs_deltas == nr_resolved_deltas) {
@@ -1245,7 +1245,7 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
memset(objects + nr_objects + 1, 0,
nr_unresolved * sizeof(*objects));
f = sha1fd(output_fd, curr_pack);
- fix_unresolved_deltas(f, nr_unresolved);
+ fix_unresolved_deltas(f);
strbuf_addf(&msg, _("completed with %d local objects"),
nr_objects - nr_objects_initial);
stop_progress_msg(&progress, msg.buf);
@@ -1328,10 +1328,10 @@ static int delta_pos_compare(const void *_a, const void *_b)
return a->obj_no - b->obj_no;
}
-static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
+static void fix_unresolved_deltas(struct sha1file *f)
{
struct ref_delta_entry **sorted_by_pos;
- int i, n = 0;
+ int i;
/*
* Since many unresolved deltas may well be themselves base objects
@@ -1343,12 +1343,12 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
* before deltas depending on them, a good heuristic is to start
* resolving deltas in the same order as their position in the pack.
*/
- sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
+ sorted_by_pos = xmalloc(nr_ref_deltas * sizeof(*sorted_by_pos));
for (i = 0; i < nr_ref_deltas; i++)
- sorted_by_pos[n++] = &ref_deltas[i];
- qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
+ sorted_by_pos[i] = &ref_deltas[i];
+ qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
- for (i = 0; i < n; i++) {
+ for (i = 0; i < nr_ref_deltas; i++) {
struct ref_delta_entry *d = sorted_by_pos[i];
enum object_type type;
struct base_data *base_obj = alloc_base_data();
--
2.5.0-rc1-213-g8b7a1c9
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] index-pack: kill union delta_base to save memory
2015-07-03 18:29 ` Duy Nguyen
2015-07-04 1:21 ` [PATCH] index-pack: fix overallocation of sorted_by_pos array Junio C Hamano
@ 2015-07-04 22:24 ` Junio C Hamano
1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-07-04 22:24 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Git Mailing List
Duy Nguyen <pclouds@gmail.com> writes:
>>> @@ -1282,28 +1344,25 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
>>> * resolving deltas in the same order as their position in the pack.
>>> */
>>> sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
>>> - for (i = 0; i < nr_deltas; i++) {
>>> - if (objects[deltas[i].obj_no].real_type != OBJ_REF_DELTA)
>>> - continue;
>>> - sorted_by_pos[n++] = &deltas[i];
>>> - }
>>> + for (i = 0; i < nr_ref_deltas; i++)
>>> + sorted_by_pos[n++] = &ref_deltas[i];
>>> qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
>>
>> You allocate an array of nr_unresolved (which is the sum of
>> nr_ref_deltas and nr_ofs_deltas in the new world order with patch)
>> entries, fill only the first nr_ref_deltas entries of it, and then
>> sort that first n (= nr_ref_deltas) entries. The qsort and the
>> subsequent loop only looks at the first n entries, so nothing is
>> filling or reading these nr_ofs_deltas entres at the end.
>>
>> I do not see any wrong behaviour other than temporary wastage of
>> nr_ofs_deltas pointers that would be caused by this, but this
>> allocation is misleading.
>>
>> Also, the old code before this change had to use 'i' and 'n' because
>> some of the things we see in the (old) deltas[] array we scanned
>> with 'i' would not make it into the sorted_by_pos[] array in the old
>> world order, but now because you have only ref delta in a separate
>> ref_deltas[] array, they increment lock&step. That also puzzled me
>> while re-reading this code.
>>
>> Perhaps something like this is needed?
>
> Yeah I can see the confusion when reading the code without checking
> its history. You probably want to kill the argument nr_unresolved too
> because it's not needed anymore after your patch.
>
> So what's the bug you mentioned? All I got from the above was wastage
> and confusion, no bug.
Actually, the above is not analyzed correctly. I thought
nr_unresolved was ref + ofs, but look at the caller in the
fix_thin_pack codepath:
int nr_unresolved = nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas;
int nr_objects_initial = nr_objects;
if (nr_unresolved <= 0)
die(_("confusion beyond insanity"));
REALLOC_ARRAY(objects, nr_objects + nr_unresolved + 1);
memset(objects + nr_objects + 1, 0,
nr_unresolved * sizeof(*objects));
f = sha1fd(output_fd, curr_pack);
fix_unresolved_deltas(f, nr_unresolved);
If there were tons of nr_resolved_deltas and only small number of
nr_ofs_deltas, then the allocation in question may actually be
under-allocating.
So...
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] index-pack: fix allocation of sorted_by_pos array
2015-07-04 1:21 ` [PATCH] index-pack: fix overallocation of sorted_by_pos array Junio C Hamano
@ 2015-07-04 22:30 ` Junio C Hamano
2015-07-06 23:23 ` Junio C Hamano
2015-07-07 0:36 ` Duy Nguyen
0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-07-04 22:30 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Git Mailing List
When c6458e60 (index-pack: kill union delta_base to save memory,
2015-04-18) attempted to reduce the memory footprint of index-pack,
one of the key thing it did was to keep track of ref-deltas and
ofs-deltas separately.
In fix_unresolved_deltas(), however it forgot that it now wants to
look only at ref deltas in one place. The code allocated an array
for nr_unresolved, which is sum of number of ref- and ofs-deltas
minus nr_resolved, which may be larger or smaller than the number
ref-deltas. Depending on nr_resolved, this was either under or over
allocating.
Also, the old code before this change had to use 'i' and 'n' because
some of the things we see in the (old) deltas[] array we scanned
with 'i' would not make it into the sorted_by_pos[] array in the old
world order, but now because you have only ref delta in a separate
ref_deltas[] array, they increment lock&step. We no longer need
separate variables. And most importantly, we shouldn't pass the
nr_unresolved parameter, as this number does not play a role in the
working of this helper function.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* This time, correcting the analysis; the previous one claimed that
this was not a bug but just an overallocation. It indeed is a bug
that uses an unrelated value that may or may not be sufficiently
large to hold the whole thing, I think.
builtin/index-pack.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3ed53e3..fa13e20 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1223,7 +1223,7 @@ static void resolve_deltas(void)
* - append objects to convert thin pack to full pack if required
* - write the final 20-byte SHA-1
*/
-static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved);
+static void fix_unresolved_deltas(struct sha1file *f);
static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned char *pack_sha1)
{
if (nr_ref_deltas + nr_ofs_deltas == nr_resolved_deltas) {
@@ -1245,7 +1245,7 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
memset(objects + nr_objects + 1, 0,
nr_unresolved * sizeof(*objects));
f = sha1fd(output_fd, curr_pack);
- fix_unresolved_deltas(f, nr_unresolved);
+ fix_unresolved_deltas(f);
strbuf_addf(&msg, _("completed with %d local objects"),
nr_objects - nr_objects_initial);
stop_progress_msg(&progress, msg.buf);
@@ -1328,10 +1328,10 @@ static int delta_pos_compare(const void *_a, const void *_b)
return a->obj_no - b->obj_no;
}
-static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
+static void fix_unresolved_deltas(struct sha1file *f)
{
struct ref_delta_entry **sorted_by_pos;
- int i, n = 0;
+ int i;
/*
* Since many unresolved deltas may well be themselves base objects
@@ -1343,12 +1343,12 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
* before deltas depending on them, a good heuristic is to start
* resolving deltas in the same order as their position in the pack.
*/
- sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
+ sorted_by_pos = xmalloc(nr_ref_deltas * sizeof(*sorted_by_pos));
for (i = 0; i < nr_ref_deltas; i++)
- sorted_by_pos[n++] = &ref_deltas[i];
- qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
+ sorted_by_pos[i] = &ref_deltas[i];
+ qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
- for (i = 0; i < n; i++) {
+ for (i = 0; i < nr_ref_deltas; i++) {
struct ref_delta_entry *d = sorted_by_pos[i];
enum object_type type;
struct base_data *base_obj = alloc_base_data();
--
2.5.0-rc1-220-g8f9c9ba
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] index-pack: fix allocation of sorted_by_pos array
2015-07-04 22:30 ` [PATCH] index-pack: fix allocation " Junio C Hamano
@ 2015-07-06 23:23 ` Junio C Hamano
2015-07-07 0:36 ` Duy Nguyen
1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-07-06 23:23 UTC (permalink / raw)
To: Git Mailing List; +Cc: Duy Nguyen
Junio C Hamano <gitster@pobox.com> writes:
> When c6458e60 (index-pack: kill union delta_base to save memory,
> 2015-04-18) attempted to reduce the memory footprint of index-pack,
> one of the key thing it did was to keep track of ref-deltas and
> ofs-deltas separately.
>
> In fix_unresolved_deltas(), however it forgot that it now wants to
> look only at ref deltas in one place. The code allocated an array
> for nr_unresolved, which is sum of number of ref- and ofs-deltas
> minus nr_resolved, which may be larger or smaller than the number
> ref-deltas. Depending on nr_resolved, this was either under or over
> allocating.
>
> Also, the old code before this change had to use 'i' and 'n' because
> some of the things we see in the (old) deltas[] array we scanned
> with 'i' would not make it into the sorted_by_pos[] array in the old
> world order, but now because you have only ref delta in a separate
> ref_deltas[] array, they increment lock&step. We no longer need
> separate variables. And most importantly, we shouldn't pass the
> nr_unresolved parameter, as this number does not play a role in the
> working of this helper function.
>
> Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
> * This time, correcting the analysis; the previous one claimed that
> this was not a bug but just an overallocation. It indeed is a bug
> that uses an unrelated value that may or may not be sufficiently
> large to hold the whole thing, I think.
I think this one as a real "crash hotfix" must be in the upcoming
release (the obvious alternative is to revert the series with
c6458e60 which I really want to avoid).
As Eric's "worktree add" would need some more time to solidify,
let's delay the -rc2 and later by a few weeks.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] index-pack: fix allocation of sorted_by_pos array
2015-07-04 22:30 ` [PATCH] index-pack: fix allocation " Junio C Hamano
2015-07-06 23:23 ` Junio C Hamano
@ 2015-07-07 0:36 ` Duy Nguyen
2015-07-07 15:49 ` Junio C Hamano
1 sibling, 1 reply; 17+ messages in thread
From: Duy Nguyen @ 2015-07-07 0:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
On Sun, Jul 5, 2015 at 5:30 AM, Junio C Hamano <gitster@pobox.com> wrote:
> When c6458e60 (index-pack: kill union delta_base to save memory,
> 2015-04-18) attempted to reduce the memory footprint of index-pack,
> one of the key thing it did was to keep track of ref-deltas and
> ofs-deltas separately.
>
> In fix_unresolved_deltas(), however it forgot that it now wants to
> look only at ref deltas in one place. The code allocated an array
> for nr_unresolved, which is sum of number of ref- and ofs-deltas
> minus nr_resolved, which may be larger or smaller than the number
> ref-deltas. Depending on nr_resolved, this was either under or over
> allocating.
It's either that or we could put back "if (real_type != OBJ_REF_DELTA)
continue;" in the sorted_by_pos population loop. Resolved deltas can't
have real_type == OBJ_REF_DELTA, so if we allocate nr_unresolved, it's
guaranteed over-allocation, never under-allocation. But I guess your
approach would make the code easier to read.
I keep tripping over this "real_type vs type" in this code. What do
you think about renaming "type" field to "in_pack_type" and
"real_type" to "canon_type" (or "final_type")? "Real" does not really
say anything in this context..
--
Duy
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] index-pack: fix allocation of sorted_by_pos array
2015-07-07 0:36 ` Duy Nguyen
@ 2015-07-07 15:49 ` Junio C Hamano
2015-07-07 16:06 ` Jeff King
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-07-07 15:49 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Git Mailing List
Duy Nguyen <pclouds@gmail.com> writes:
> I keep tripping over this "real_type vs type" in this code. What do
> you think about renaming "type" field to "in_pack_type" and
> "real_type" to "canon_type" (or "final_type")? "Real" does not really
> say anything in this context..
An unqualified name "type" does bother me for the word to express
what representation the piece of data uses (i.e. is it a delta, or
is it a base object of "tree" type, or what). I think I tried to
unconfuse myself by saying "representation type" in in-code
comments, reviews and log messages when it is not clear which kind
between "in-pack representation" or "Git object type of that stored
data" a sentence is talking about, and I agree "in_pack_type" would
be a vast improvement over just "type".
To me personally real- and final- mean about the same thing
(i.e. what is the real type of the object that is stored?) in the
context of this codepath.
Especially, if the other one is renamed with "in_pack_" prefix,
"real_type" is not just clear enough but is probably better because
it explains what it is from its "meaning" (i.e. it is the type of
the Git object, not how it is represented in the pack-stream) than
"final_type" that is named after "how" it is computed (i.e. it makes
sense to you only if you know that an in-pack type "this is delta"
does not have the full information and you have to traverse the
delta chain and you will finally find out what it is when you hit
the base representation).
Thanks.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] index-pack: fix allocation of sorted_by_pos array
2015-07-07 15:49 ` Junio C Hamano
@ 2015-07-07 16:06 ` Jeff King
2015-07-08 11:56 ` [PATCH 1/2] index-pack: rename the field "type" to "in_pack_type" Nguyễn Thái Ngọc Duy
0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2015-07-07 16:06 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Duy Nguyen, Git Mailing List
On Tue, Jul 07, 2015 at 08:49:19AM -0700, Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
> > I keep tripping over this "real_type vs type" in this code. What do
> > you think about renaming "type" field to "in_pack_type" and
> > "real_type" to "canon_type" (or "final_type")? "Real" does not really
> > say anything in this context..
>
> An unqualified name "type" does bother me for the word to express
> what representation the piece of data uses (i.e. is it a delta, or
> is it a base object of "tree" type, or what). I think I tried to
> unconfuse myself by saying "representation type" in in-code
> comments, reviews and log messages when it is not clear which kind
> between "in-pack representation" or "Git object type of that stored
> data" a sentence is talking about, and I agree "in_pack_type" would
> be a vast improvement over just "type".
I think this is doubly confusing because pack-objects _does_ use
in_pack_type. And its "type" is therefore the "real" object type. Which
is the opposite of index-pack, which uses "type" for the in-pack type.
So at the very least, we should harmonize these two uses.
> Especially, if the other one is renamed with "in_pack_" prefix,
> "real_type" is not just clear enough but is probably better because
> it explains what it is from its "meaning" (i.e. it is the type of
> the Git object, not how it is represented in the pack-stream) than
> "final_type" that is named after "how" it is computed (i.e. it makes
> sense to you only if you know that an in-pack type "this is delta"
> does not have the full information and you have to traverse the
> delta chain and you will finally find out what it is when you hit
> the base representation).
Yeah, I agree real_type is fine when paired with in_pack_type. We might
consider modifying pack-objects.h to match (on top of just moving
index-pack to in_pack_type).
-Peff
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] index-pack: rename the field "type" to "in_pack_type"
2015-07-07 16:06 ` Jeff King
@ 2015-07-08 11:56 ` Nguyễn Thái Ngọc Duy
2015-07-08 11:56 ` [PATCH 2/2] pack-objects: rename the field "type" to "real_type" Nguyễn Thái Ngọc Duy
0 siblings, 1 reply; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-07-08 11:56 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jeff King, Nguyễn Thái Ngọc Duy
We have two types in this code: in-pack and canonical. "in_pack_type"
makes it clearer than plain "type".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
builtin/index-pack.c | 42 +++++++++++++++++++++---------------------
1 file changed, 21 insertions(+), 21 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 48fa472..797e571 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -19,7 +19,7 @@ struct object_entry {
struct pack_idx_entry idx;
unsigned long size;
unsigned char hdr_size;
- signed char type;
+ signed char in_pack_type;
signed char real_type;
};
@@ -493,7 +493,7 @@ static void *unpack_raw_entry(struct object_entry *obj,
p = fill(1);
c = *p;
use(1);
- obj->type = (c >> 4) & 7;
+ obj->in_pack_type = (c >> 4) & 7;
size = (c & 15);
shift = 4;
while (c & 0x80) {
@@ -505,7 +505,7 @@ static void *unpack_raw_entry(struct object_entry *obj,
}
obj->size = size;
- switch (obj->type) {
+ switch (obj->in_pack_type) {
case OBJ_REF_DELTA:
hashcpy(ref_sha1, fill(20));
use(20);
@@ -534,11 +534,11 @@ static void *unpack_raw_entry(struct object_entry *obj,
case OBJ_TAG:
break;
default:
- bad_object(obj->idx.offset, _("unknown object type %d"), obj->type);
+ bad_object(obj->idx.offset, _("unknown object type %d"), obj->in_pack_type);
}
obj->hdr_size = consumed_bytes - obj->idx.offset;
- data = unpack_entry_data(obj->idx.offset, obj->size, obj->type, sha1);
+ data = unpack_entry_data(obj->idx.offset, obj->size, obj->in_pack_type, sha1);
obj->idx.crc32 = input_crc32;
return data;
}
@@ -631,7 +631,7 @@ static int find_ofs_delta(const off_t offset, enum object_type type)
int cmp;
cmp = compare_ofs_delta_bases(offset, delta->offset,
- type, objects[delta->obj_no].type);
+ type, objects[delta->obj_no].in_pack_type);
if (!cmp)
return next;
if (cmp < 0) {
@@ -685,7 +685,7 @@ static int find_ref_delta(const unsigned char *sha1, enum object_type type)
int cmp;
cmp = compare_ref_delta_bases(sha1, delta->sha1,
- type, objects[delta->obj_no].type);
+ type, objects[delta->obj_no].in_pack_type);
if (!cmp)
return next;
if (cmp < 0) {
@@ -759,7 +759,7 @@ static int check_collison(struct object_entry *entry)
enum object_type type;
unsigned long size;
- if (entry->size <= big_file_threshold || entry->type != OBJ_BLOB)
+ if (entry->size <= big_file_threshold || entry->in_pack_type != OBJ_BLOB)
return -1;
memset(&data, 0, sizeof(data));
@@ -767,7 +767,7 @@ static int check_collison(struct object_entry *entry)
data.st = open_istream(entry->idx.sha1, &type, &size, NULL);
if (!data.st)
return -1;
- if (size != entry->size || type != entry->type)
+ if (size != entry->size || type != entry->in_pack_type)
die(_("SHA1 COLLISION FOUND WITH %s !"),
sha1_to_hex(entry->idx.sha1));
unpack_data(entry, compare_objects, &data);
@@ -891,7 +891,7 @@ static void *get_base_data(struct base_data *c)
struct base_data **delta = NULL;
int delta_nr = 0, delta_alloc = 0;
- while (is_delta_type(c->obj->type) && !c->data) {
+ while (is_delta_type(c->obj->in_pack_type) && !c->data) {
ALLOC_GROW(delta, delta_nr + 1, delta_alloc);
delta[delta_nr++] = c;
c = c->base;
@@ -1085,7 +1085,7 @@ static void *threaded_second_pass(void *data)
counter_unlock();
work_lock();
while (nr_dispatched < nr_objects &&
- is_delta_type(objects[nr_dispatched].type))
+ is_delta_type(objects[nr_dispatched].in_pack_type))
nr_dispatched++;
if (nr_dispatched >= nr_objects) {
work_unlock();
@@ -1121,12 +1121,12 @@ static void parse_pack_objects(unsigned char *sha1)
struct object_entry *obj = &objects[i];
void *data = unpack_raw_entry(obj, &ofs_delta->offset,
ref_delta_sha1, obj->idx.sha1);
- obj->real_type = obj->type;
- if (obj->type == OBJ_OFS_DELTA) {
+ obj->real_type = obj->in_pack_type;
+ if (obj->in_pack_type == OBJ_OFS_DELTA) {
nr_ofs_deltas++;
ofs_delta->obj_no = i;
ofs_delta++;
- } else if (obj->type == OBJ_REF_DELTA) {
+ } else if (obj->in_pack_type == OBJ_REF_DELTA) {
ALLOC_GROW(ref_deltas, nr_ref_deltas + 1, ref_deltas_alloc);
hashcpy(ref_deltas[nr_ref_deltas].sha1, ref_delta_sha1);
ref_deltas[nr_ref_deltas].obj_no = i;
@@ -1136,7 +1136,7 @@ static void parse_pack_objects(unsigned char *sha1)
obj->real_type = OBJ_BAD;
nr_delays++;
} else
- sha1_object(data, NULL, obj->size, obj->type, obj->idx.sha1);
+ sha1_object(data, NULL, obj->size, obj->in_pack_type, obj->idx.sha1);
free(data);
display_progress(progress, i+1);
}
@@ -1161,8 +1161,8 @@ static void parse_pack_objects(unsigned char *sha1)
struct object_entry *obj = &objects[i];
if (obj->real_type != OBJ_BAD)
continue;
- obj->real_type = obj->type;
- sha1_object(NULL, obj, obj->size, obj->type, obj->idx.sha1);
+ obj->real_type = obj->in_pack_type;
+ sha1_object(NULL, obj, obj->size, obj->in_pack_type, obj->idx.sha1);
nr_delays--;
}
if (nr_delays)
@@ -1215,7 +1215,7 @@ static void resolve_deltas(void)
for (i = 0; i < nr_objects; i++) {
struct object_entry *obj = &objects[i];
- if (is_delta_type(obj->type))
+ if (is_delta_type(obj->in_pack_type))
continue;
resolve_base(obj);
display_progress(progress, nr_resolved_deltas);
@@ -1314,7 +1314,7 @@ static struct object_entry *append_obj_to_pack(struct sha1file *f,
sha1write(f, header, n);
obj[0].size = size;
obj[0].hdr_size = n;
- obj[0].type = type;
+ obj[0].in_pack_type = type;
obj[0].real_type = type;
obj[1].idx.offset = obj[0].idx.offset + n;
obj[1].idx.offset += write_compressed(f, buf, size);
@@ -1566,7 +1566,7 @@ static void show_pack_info(int stat_only)
for (i = 0; i < nr_objects; i++) {
struct object_entry *obj = &objects[i];
- if (is_delta_type(obj->type))
+ if (is_delta_type(obj->in_pack_type))
chain_histogram[obj_stat[i].delta_depth - 1]++;
if (stat_only)
continue;
@@ -1575,7 +1575,7 @@ static void show_pack_info(int stat_only)
typename(obj->real_type), obj->size,
(unsigned long)(obj[1].idx.offset - obj->idx.offset),
(uintmax_t)obj->idx.offset);
- if (is_delta_type(obj->type)) {
+ if (is_delta_type(obj->in_pack_type)) {
struct object_entry *bobj = &objects[obj_stat[i].base_object_no];
printf(" %u %s", obj_stat[i].delta_depth, sha1_to_hex(bobj->idx.sha1));
}
--
2.3.0.rc1.137.g477eb31
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/2] pack-objects: rename the field "type" to "real_type"
2015-07-08 11:56 ` [PATCH 1/2] index-pack: rename the field "type" to "in_pack_type" Nguyễn Thái Ngọc Duy
@ 2015-07-08 11:56 ` Nguyễn Thái Ngọc Duy
2015-07-08 13:47 ` Jeff King
0 siblings, 1 reply; 17+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-07-08 11:56 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jeff King, Nguyễn Thái Ngọc Duy
This is to avoid the too generic name "type" and harmonize with the
naming in index-pack. There's a subtle difference though: real_type in
index-pack is what the upper level see, no delta types (after delta
resolution). But real_type in pack-objects is the type to be written in
the pack, delta types are fine (it's actually markers for reused deltas)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
builtin/pack-objects.c | 36 ++++++++++++++++++------------------
pack-bitmap-write.c | 6 +++---
pack-objects.h | 2 +-
3 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 80fe8c7..e03bf3e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -244,7 +244,7 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
struct git_istream *st = NULL;
if (!usable_delta) {
- if (entry->type == OBJ_BLOB &&
+ if (entry->real_type == OBJ_BLOB &&
entry->size > big_file_threshold &&
(st = open_istream(entry->idx.sha1, &type, &size, NULL)) != NULL)
buf = NULL;
@@ -348,7 +348,7 @@ static unsigned long write_reuse_object(struct sha1file *f, struct object_entry
struct pack_window *w_curs = NULL;
struct revindex_entry *revidx;
off_t offset;
- enum object_type type = entry->type;
+ enum object_type type = entry->real_type;
unsigned long datalen;
unsigned char header[10], dheader[10];
unsigned hdrlen;
@@ -452,11 +452,11 @@ static unsigned long write_object(struct sha1file *f,
to_reuse = 0; /* explicit */
else if (!entry->in_pack)
to_reuse = 0; /* can't reuse what we don't have */
- else if (entry->type == OBJ_REF_DELTA || entry->type == OBJ_OFS_DELTA)
+ else if (entry->real_type == OBJ_REF_DELTA || entry->real_type == OBJ_OFS_DELTA)
/* check_object() decided it for us ... */
to_reuse = usable_delta;
/* ... but pack split may override that */
- else if (entry->type != entry->in_pack_type)
+ else if (entry->real_type != entry->in_pack_type)
to_reuse = 0; /* pack has delta which is unusable */
else if (entry->delta)
to_reuse = 0; /* we want to pack afresh */
@@ -676,8 +676,8 @@ static struct object_entry **compute_write_order(void)
* And then all remaining commits and tags.
*/
for (i = last_untagged; i < to_pack.nr_objects; i++) {
- if (objects[i].type != OBJ_COMMIT &&
- objects[i].type != OBJ_TAG)
+ if (objects[i].real_type != OBJ_COMMIT &&
+ objects[i].real_type != OBJ_TAG)
continue;
add_to_write_order(wo, &wo_end, &objects[i]);
}
@@ -686,7 +686,7 @@ static struct object_entry **compute_write_order(void)
* And then all the trees.
*/
for (i = last_untagged; i < to_pack.nr_objects; i++) {
- if (objects[i].type != OBJ_TREE)
+ if (objects[i].real_type != OBJ_TREE)
continue;
add_to_write_order(wo, &wo_end, &objects[i]);
}
@@ -994,7 +994,7 @@ static void create_object_entry(const unsigned char *sha1,
entry = packlist_alloc(&to_pack, sha1, index_pos);
entry->hash = hash;
if (type)
- entry->type = type;
+ entry->real_type = type;
if (exclude)
entry->preferred_base = 1;
else
@@ -1355,9 +1355,9 @@ static void check_object(struct object_entry *entry)
switch (entry->in_pack_type) {
default:
/* Not a delta hence we've already got all we need. */
- entry->type = entry->in_pack_type;
+ entry->real_type = entry->in_pack_type;
entry->in_pack_header_size = used;
- if (entry->type < OBJ_COMMIT || entry->type > OBJ_BLOB)
+ if (entry->real_type < OBJ_COMMIT || entry->real_type > OBJ_BLOB)
goto give_up;
unuse_pack(&w_curs);
return;
@@ -1411,7 +1411,7 @@ static void check_object(struct object_entry *entry)
* deltify other objects against, in order to avoid
* circular deltas.
*/
- entry->type = entry->in_pack_type;
+ entry->real_type = entry->in_pack_type;
entry->delta = base_entry;
entry->delta_size = entry->size;
entry->delta_sibling = base_entry->delta_child;
@@ -1420,7 +1420,7 @@ static void check_object(struct object_entry *entry)
return;
}
- if (entry->type) {
+ if (entry->real_type) {
/*
* This must be a delta and we already know what the
* final object type is. Let's extract the actual
@@ -1443,7 +1443,7 @@ static void check_object(struct object_entry *entry)
unuse_pack(&w_curs);
}
- entry->type = sha1_object_info(entry->idx.sha1, &entry->size);
+ entry->real_type = sha1_object_info(entry->idx.sha1, &entry->size);
/*
* The error condition is checked in prepare_pack(). This is
* to permit a missing preferred base object to be ignored
@@ -1503,9 +1503,9 @@ static int type_size_sort(const void *_a, const void *_b)
const struct object_entry *a = *(struct object_entry **)_a;
const struct object_entry *b = *(struct object_entry **)_b;
- if (a->type > b->type)
+ if (a->real_type > b->real_type)
return -1;
- if (a->type < b->type)
+ if (a->real_type < b->real_type)
return 1;
if (a->hash > b->hash)
return -1;
@@ -1581,7 +1581,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
void *delta_buf;
/* Don't bother doing diffs between different types */
- if (trg_entry->type != src_entry->type)
+ if (trg_entry->real_type != src_entry->real_type)
return -1;
/*
@@ -2149,11 +2149,11 @@ static void prepare_pack(int window, int depth)
if (!entry->preferred_base) {
nr_deltas++;
- if (entry->type < 0)
+ if (entry->real_type < 0)
die("unable to get type of object %s",
sha1_to_hex(entry->idx.sha1));
} else {
- if (entry->type < 0) {
+ if (entry->real_type < 0) {
/*
* This object is not found, but we
* don't have to include it anyway.
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index c05d138..572f4d6 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -64,12 +64,12 @@ void bitmap_writer_build_type_index(struct pack_idx_entry **index,
entry->in_pack_pos = i;
- switch (entry->type) {
+ switch (entry->real_type) {
case OBJ_COMMIT:
case OBJ_TREE:
case OBJ_BLOB:
case OBJ_TAG:
- real_type = entry->type;
+ real_type = entry->real_type;
break;
default:
@@ -96,7 +96,7 @@ void bitmap_writer_build_type_index(struct pack_idx_entry **index,
default:
die("Missing type information for %s (%d/%d)",
- sha1_to_hex(entry->idx.sha1), real_type, entry->type);
+ sha1_to_hex(entry->idx.sha1), real_type, entry->real_type);
}
}
}
diff --git a/pack-objects.h b/pack-objects.h
index d1b98b3..33cde59 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -14,7 +14,7 @@ struct object_entry {
void *delta_data; /* cached delta (uncompressed) */
unsigned long delta_size; /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
- enum object_type type;
+ enum object_type real_type;
enum object_type in_pack_type; /* could be delta */
uint32_t hash; /* name hint hash */
unsigned int in_pack_pos;
--
2.3.0.rc1.137.g477eb31
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] pack-objects: rename the field "type" to "real_type"
2015-07-08 11:56 ` [PATCH 2/2] pack-objects: rename the field "type" to "real_type" Nguyễn Thái Ngọc Duy
@ 2015-07-08 13:47 ` Jeff King
2015-07-08 13:57 ` Duy Nguyen
0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2015-07-08 13:47 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano
On Wed, Jul 08, 2015 at 06:56:31PM +0700, Nguyễn Thái Ngọc Duy wrote:
> This is to avoid the too generic name "type" and harmonize with the
> naming in index-pack. There's a subtle difference though: real_type in
> index-pack is what the upper level see, no delta types (after delta
> resolution). But real_type in pack-objects is the type to be written in
> the pack, delta types are fine (it's actually markers for reused deltas)
Hrm, now I'm confused about whether this change is a good idea.
The definition of in_pack_type says:
> enum object_type in_pack_type; /* could be delta */
so now I am confused about what exactly "type" (and now "real_type")
means.
-Peff
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] pack-objects: rename the field "type" to "real_type"
2015-07-08 13:47 ` Jeff King
@ 2015-07-08 13:57 ` Duy Nguyen
2015-07-08 14:11 ` Jeff King
0 siblings, 1 reply; 17+ messages in thread
From: Duy Nguyen @ 2015-07-08 13:57 UTC (permalink / raw)
To: Jeff King; +Cc: Git Mailing List, Junio C Hamano
On Wed, Jul 8, 2015 at 8:47 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Jul 08, 2015 at 06:56:31PM +0700, Nguyễn Thái Ngọc Duy wrote:
>
>> This is to avoid the too generic name "type" and harmonize with the
>> naming in index-pack. There's a subtle difference though: real_type in
>> index-pack is what the upper level see, no delta types (after delta
>> resolution). But real_type in pack-objects is the type to be written in
>> the pack, delta types are fine (it's actually markers for reused deltas)
>
> Hrm, now I'm confused about whether this change is a good idea.
Oh good :) I found it not-so-good too after seeing the check "if
(real_type == OBJ_REF...)"
> The definition of in_pack_type says:
>
>> enum object_type in_pack_type; /* could be delta */
>
> so now I am confused about what exactly "type" (and now "real_type")
> means.
I think we just overload "type" with "this delta is detected reusable
already" in write_object(). It only means 'real type in the output
pack' for canonical types. For generated deltas, we already know if
it's ref-delta or ofs-delta, we don't rely on real_type
--
Duy
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] pack-objects: rename the field "type" to "real_type"
2015-07-08 13:57 ` Duy Nguyen
@ 2015-07-08 14:11 ` Jeff King
0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2015-07-08 14:11 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Git Mailing List, Junio C Hamano
On Wed, Jul 08, 2015 at 08:57:35PM +0700, Duy Nguyen wrote:
> > The definition of in_pack_type says:
> >
> >> enum object_type in_pack_type; /* could be delta */
> >
> > so now I am confused about what exactly "type" (and now "real_type")
> > means.
>
> I think we just overload "type" with "this delta is detected reusable
> already" in write_object(). It only means 'real type in the output
> pack' for canonical types. For generated deltas, we already know if
> it's ref-delta or ofs-delta, we don't rely on real_type
Ah. I think I'd be inclined to just leave it as "type" then.
-Peff
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2015-07-08 14:11 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-18 10:47 [PATCH 0/2] nd/slim-index-pack-memory-usage update Nguyễn Thái Ngọc Duy
2015-04-18 10:47 ` [PATCH 1/2] index-pack: reduce object_entry size to save memory Nguyễn Thái Ngọc Duy
2015-04-18 10:47 ` [PATCH 2/2] index-pack: kill union delta_base " Nguyễn Thái Ngọc Duy
2015-07-03 16:51 ` Junio C Hamano
2015-07-03 18:29 ` Duy Nguyen
2015-07-04 1:21 ` [PATCH] index-pack: fix overallocation of sorted_by_pos array Junio C Hamano
2015-07-04 22:30 ` [PATCH] index-pack: fix allocation " Junio C Hamano
2015-07-06 23:23 ` Junio C Hamano
2015-07-07 0:36 ` Duy Nguyen
2015-07-07 15:49 ` Junio C Hamano
2015-07-07 16:06 ` Jeff King
2015-07-08 11:56 ` [PATCH 1/2] index-pack: rename the field "type" to "in_pack_type" Nguyễn Thái Ngọc Duy
2015-07-08 11:56 ` [PATCH 2/2] pack-objects: rename the field "type" to "real_type" Nguyễn Thái Ngọc Duy
2015-07-08 13:47 ` Jeff King
2015-07-08 13:57 ` Duy Nguyen
2015-07-08 14:11 ` Jeff King
2015-07-04 22:24 ` [PATCH 2/2] index-pack: kill union delta_base to save memory Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).