* [PATCH 1/2] pack-objects: reverse the delta search sort list
@ 2007-12-08 5:00 Nicolas Pitre
2007-12-08 8:18 ` [OT] perhaps we want to support copied-context diff output Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Nicolas Pitre @ 2007-12-08 5:00 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jon Smirl
It is currently sorted and then walked backward. Not only this doesn't
feel natural for my poor brain, but it would make the next patch less
obvious as well.
So reverse the sort order, and reverse the list walking direction,
which effectively produce the exact same end result as before.
Also bring the relevant comment nearer the actual code and adjust it
accordingly, with minor additional clarifications.
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
builtin-pack-objects.c | 41 +++++++++++++++++++++--------------------
1 files changed, 21 insertions(+), 20 deletions(-)
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 4f44658..5002cc6 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1245,28 +1245,37 @@ static void get_object_details(void)
free(sorted_by_offset);
}
+/*
+ * We search for deltas in a list sorted by type, by filename hash, and then
+ * by size, so that we see progressively smaller and smaller files.
+ * That's because we prefer deltas to be from the bigger file
+ * to the smaller -- deletes are potentially cheaper, but perhaps
+ * more importantly, the bigger file is likely the more recent
+ * one. The deepest deltas are therefore the oldest objects which are
+ * less susceptible to be accessed often.
+ */
static int type_size_sort(const void *_a, const void *_b)
{
const struct object_entry *a = *(struct object_entry **)_a;
const struct object_entry *b = *(struct object_entry **)_b;
- if (a->type < b->type)
- return -1;
if (a->type > b->type)
- return 1;
- if (a->hash < b->hash)
return -1;
- if (a->hash > b->hash)
+ if (a->type < b->type)
return 1;
- if (a->preferred_base < b->preferred_base)
+ if (a->hash > b->hash)
return -1;
- if (a->preferred_base > b->preferred_base)
+ if (a->hash < b->hash)
return 1;
- if (a->size < b->size)
+ if (a->preferred_base > b->preferred_base)
return -1;
+ if (a->preferred_base < b->preferred_base)
+ return 1;
if (a->size > b->size)
+ return -1;
+ if (a->size < b->size)
return 1;
- return a > b ? -1 : (a < b); /* newest last */
+ return a < b ? -1 : (a > b); /* newest first */
}
struct unpacked {
@@ -1317,14 +1326,6 @@ static pthread_mutex_t progress_mutex = PTHREAD_MUTEX_INITIALIZER;
#endif
-/*
- * We search for deltas _backwards_ in a list sorted by type and
- * by size, so that we see progressively smaller and smaller files.
- * That's because we prefer deltas to be from the bigger file
- * to the smaller - deletes are potentially cheaper, but perhaps
- * more importantly, the bigger file is likely the more recent
- * one.
- */
static int try_delta(struct unpacked *trg, struct unpacked *src,
unsigned max_depth, unsigned long *mem_usage)
{
@@ -1481,7 +1482,7 @@ static unsigned long free_unpacked(struct unpacked *n)
static void find_deltas(struct object_entry **list, unsigned list_size,
int window, int depth, unsigned *processed)
{
- uint32_t i = list_size, idx = 0, count = 0;
+ uint32_t i = 0, idx = 0, count = 0;
unsigned int array_size = window * sizeof(struct unpacked);
struct unpacked *array;
unsigned long mem_usage = 0;
@@ -1490,7 +1491,7 @@ static void find_deltas(struct object_entry **list, unsigned list_size,
memset(array, 0, array_size);
do {
- struct object_entry *entry = list[--i];
+ struct object_entry *entry = list[i++];
struct unpacked *n = array + idx;
int j, max_depth, best_base = -1;
@@ -1575,7 +1576,7 @@ static void find_deltas(struct object_entry **list, unsigned list_size,
count++;
if (idx >= window)
idx = 0;
- } while (i > 0);
+ } while (i < list_size);
for (i = 0; i < window; ++i) {
free_delta_index(array[i].index);
--
1.5.3.7.2184.ge321d-dirty
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [OT] perhaps we want to support copied-context diff output
2007-12-08 5:00 [PATCH 1/2] pack-objects: reverse the delta search sort list Nicolas Pitre
@ 2007-12-08 8:18 ` Junio C Hamano
2007-12-08 8:53 ` Mike Hommey
0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2007-12-08 8:18 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: git, Jon Smirl
The patch looks correct, but I have an offtopic comment that does not
have anything to do with the problem being discussed right now.
Nicolas Pitre <nico@cam.org> writes:
> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 4f44658..5002cc6 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> @@ -1245,28 +1245,37 @@ static void get_object_details(void)
> ...
> static int type_size_sort(const void *_a, const void *_b)
> {
> const struct object_entry *a = *(struct object_entry **)_a;
> const struct object_entry *b = *(struct object_entry **)_b;
>
> - if (a->type < b->type)
> - return -1;
> if (a->type > b->type)
> - return 1;
> - if (a->hash < b->hash)
> return -1;
> - if (a->hash > b->hash)
> + if (a->type < b->type)
> return 1;
> - if (a->preferred_base < b->preferred_base)
> + if (a->hash > b->hash)
> return -1;
> - if (a->preferred_base > b->preferred_base)
> + if (a->hash < b->hash)
> return 1;
> - if (a->size < b->size)
> + if (a->preferred_base > b->preferred_base)
> return -1;
> + if (a->preferred_base < b->preferred_base)
> + return 1;
> if (a->size > b->size)
> + return -1;
> + if (a->size < b->size)
> return 1;
> - return a > b ? -1 : (a < b); /* newest last */
> + return a < b ? -1 : (a > b); /* newest first */
> }
Before being able to understand what was going on, I had to shuffle the
above patch by duplicating the context lines, prefix them with '-' and
then '+', and grouping preimage lines and postimage lines together, to
come up with this patch:
static int type_size_sort(const void *_a, const void *_b)
{
const struct object_entry *a = *(struct object_entry **)_a;
const struct object_entry *b = *(struct object_entry **)_b;
- if (a->type < b->type)
- return -1;
- if (a->type > b->type)
- return 1;
- if (a->hash < b->hash)
- return -1;
- if (a->hash > b->hash)
- return 1;
- if (a->preferred_base < b->preferred_base)
- return -1;
- if (a->preferred_base > b->preferred_base)
- return 1;
- if (a->size < b->size)
- return -1;
- if (a->size > b->size)
- return 1;
- return a > b ? -1 : (a < b); /* newest last */
+ if (a->type > b->type)
+ return -1;
+ if (a->type < b->type)
+ return 1;
+ if (a->hash > b->hash)
+ return -1;
+ if (a->hash < b->hash)
+ return 1;
+ if (a->preferred_base > b->preferred_base)
+ return -1;
+ if (a->preferred_base < b->preferred_base)
+ return 1;
+ if (a->size > b->size)
+ return -1;
+ if (a->size < b->size)
+ return 1;
+ return a < b ? -1 : (a > b); /* newest first */
}
Perhaps we may want to add "diff -c" (copied context) output format as
an option, which may be easier to read.
A possible alternative with much less impact to our toolset would be to
stay with unified context format but employ some heuristics (e.g. "a
hunk has many small context lines between preimage and postimage pairs")
and rewrite the diff output automatically like what I did by hand above.
The problematic region has 26 lines, among which 9 lines are deleted
material and 9 lines are added material, and it contains 8 isolated
groups of unchanged material, one line each. A heuristics to notice
such excessively large number of groups of unchanged lines compared to
the size of the hunk itself would be reasonably easy to implement.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [OT] perhaps we want to support copied-context diff output
2007-12-08 8:18 ` [OT] perhaps we want to support copied-context diff output Junio C Hamano
@ 2007-12-08 8:53 ` Mike Hommey
2007-12-08 11:10 ` Peter Baumann
0 siblings, 1 reply; 4+ messages in thread
From: Mike Hommey @ 2007-12-08 8:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git, Jon Smirl
> Perhaps we may want to add "diff -c" (copied context) output format as
> an option, which may be easier to read.
Or maybe use the patience diff.
On a testcase I had a few months ago, patience diff would give:
@@ -42,11 +42,10 @@
include $(DEPTH)/config/autoconf.mk
+EXTRA_COMPONENTS = nsKillAll.js
+
include $(topsrcdir)/config/rules.mk
-libs::
- $(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components
-
clean::
rm -f $(DIST)/bin/components/nsKillAll.js
Where "normal" diff would give:
@@ -42,10 +42,9 @@
include $(DEPTH)/config/autoconf.mk
-include $(topsrcdir)/config/rules.mk
+EXTRA_COMPONENTS = nsKillAll.js
-libs::
- $(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components
+include $(topsrcdir)/config/rules.mk
clean::
rm -f $(DIST)/bin/components/nsKillAll.js
git outputs the same as the normal diff, bzr and svn seem to use the
patience diff. Mercurial outputs almost the same as bzr and svn:
@@ -42,10 +42,9 @@
include $(DEPTH)/config/autoconf.mk
+EXTRA_COMPONENTS = nsKillAll.js
+
include $(topsrcdir)/config/rules.mk
-
-libs::
- $(INSTALL) $(srcdir)/nsKillAll.js $(DIST)/bin/components
clean::
rm -f $(DIST)/bin/components/nsKillAll.js
Mike
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [OT] perhaps we want to support copied-context diff output
2007-12-08 8:53 ` Mike Hommey
@ 2007-12-08 11:10 ` Peter Baumann
0 siblings, 0 replies; 4+ messages in thread
From: Peter Baumann @ 2007-12-08 11:10 UTC (permalink / raw)
To: Mike Hommey; +Cc: Junio C Hamano, Nicolas Pitre, git, Jon Smirl
On Sat, Dec 08, 2007 at 09:53:02AM +0100, Mike Hommey wrote:
> > Perhaps we may want to add "diff -c" (copied context) output format as
> > an option, which may be easier to read.
>
> Or maybe use the patience diff.
>
[... skip testcase showing a much nicer diff for human consumption ...]
AFAIR bzr uses the patience diff already. I don't actually use bzr
(obviously, because git is so much better:-) but it produces much nicer
diffs than git if you have many small changes nearby.
-Peter
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2007-12-08 11:10 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-08 5:00 [PATCH 1/2] pack-objects: reverse the delta search sort list Nicolas Pitre
2007-12-08 8:18 ` [OT] perhaps we want to support copied-context diff output Junio C Hamano
2007-12-08 8:53 ` Mike Hommey
2007-12-08 11:10 ` Peter Baumann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).