* git log anomalities @ 2013-07-22 9:08 Uwe Kleine-König 2013-07-22 10:40 ` Thomas Rast 2013-07-22 21:22 ` [PATCH] log: use true parents for diff even when rewriting Thomas Rast 0 siblings, 2 replies; 9+ messages in thread From: Uwe Kleine-König @ 2013-07-22 9:08 UTC (permalink / raw) To: git; +Cc: kernel Hello, today I looked at the changes to drivers/net/ethernet/freescale/fec.c in the kernel since v3.8 using git log --stat v3.8.. --full-diff -- drivers/net/ethernet/freescale/fec.c which looks as expected. But when I added --graph the diffstats change. E.g. for 793fc0964be1921f15a44be58b066f22b925d90b it changes from: drivers/net/ethernet/freescale/Makefile | 3 +- drivers/net/ethernet/freescale/fec.c | 1966 ----------------------------- drivers/net/ethernet/freescale/fec_main.c | 1966 +++++++++++++++++++++++++++++ drivers/net/ethernet/freescale/fec_ptp.c | 3 - 4 files changed, 1968 insertions(+), 1970 deletions(-) to | Documentation/devicetree/bindings/net/dsa/dsa.txt | 91 + | .../bindings/net/marvell-orion-mdio.txt | 3 + | Documentation/networking/ip-sysctl.txt | 35 +- | MAINTAINERS | 2 +- | arch/arm/net/bpf_jit_32.c | 5 +- | arch/arm/plat-orion/common.c | 54 +- | arch/powerpc/net/bpf_jit_comp.c | 12 +- | ... | 404 files changed, 15373 insertions(+), 8563 deletions(-) Is that a bug, or a feature I don't understand? (I'm using Debian's 1:1.8.3.2-1) Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: git log anomalities 2013-07-22 9:08 git log anomalities Uwe Kleine-König @ 2013-07-22 10:40 ` Thomas Rast 2013-07-22 21:22 ` [PATCH] log: use true parents for diff even when rewriting Thomas Rast 1 sibling, 0 replies; 9+ messages in thread From: Thomas Rast @ 2013-07-22 10:40 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: git, kernel Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes: > Hello, > > today I looked at the changes to drivers/net/ethernet/freescale/fec.c in > the kernel since v3.8 using > > git log --stat v3.8.. --full-diff -- drivers/net/ethernet/freescale/fec.c > > which looks as expected. But when I added --graph the diffstats change. > E.g. for 793fc0964be1921f15a44be58b066f22b925d90b it changes from: > > drivers/net/ethernet/freescale/Makefile | 3 +- > drivers/net/ethernet/freescale/fec.c | 1966 ----------------------------- > drivers/net/ethernet/freescale/fec_main.c | 1966 +++++++++++++++++++++++++++++ > drivers/net/ethernet/freescale/fec_ptp.c | 3 - > 4 files changed, 1968 insertions(+), 1970 deletions(-) > > to > > | Documentation/devicetree/bindings/net/dsa/dsa.txt | 91 + > | .../bindings/net/marvell-orion-mdio.txt | 3 + > | Documentation/networking/ip-sysctl.txt | 35 +- [...] > | 404 files changed, 15373 insertions(+), 8563 deletions(-) > > Is that a bug, or a feature I don't understand? Nice catch. It's a bad interaction between --full-diff, --stat and --parents (which --graph implies in an internal-workings kind of way). The parent rewriting gets to mess with the history *before* we generate the diffs. Normally this wouldn't matter, because the pathspec filter would then again exclude all but the "real" change from the diff. But with --full-diff, you see all the changes between 793fc096 and the next commit that touches the given pathspec (47a5247f), which is -- at this stage, after rewriting -- the "parent". I suspect to fix this we'll need to separate the "real" from the "rewritten" parents, which might take a bit of work. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] log: use true parents for diff even when rewriting 2013-07-22 9:08 git log anomalities Uwe Kleine-König 2013-07-22 10:40 ` Thomas Rast @ 2013-07-22 21:22 ` Thomas Rast 2013-07-22 21:48 ` Junio C Hamano ` (2 more replies) 1 sibling, 3 replies; 9+ messages in thread From: Thomas Rast @ 2013-07-22 21:22 UTC (permalink / raw) To: git; +Cc: Uwe Kleine-König, Junio C Hamano When using pathspec filtering in combination with diff-based log output, parent simplification happens before the diff is computed. The diff is therefore against the *simplified* parents. This works okay, arguably by accident, in the normal case: the pruned commits did not affect the paths being filtered, so the diff against the prune-result is the same as against the diff against the true parents. However, --full-diff breaks this guarantee, and indeed gives pretty spectacular results when comparing the output of git log --graph --stat ... git log --graph --full-diff --stat ... (--graph internally kicks in parent simplification, much like --parents). Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thomas Rast <trast@inf.ethz.ch> --- Perhaps like this. It's getting a bit late, so I'm not sure if I'm missing another user of the "true" parent list, but it does fix the issue you reported. combine-diff.c | 3 ++- commit.c | 16 ++++++++++++++++ commit.h | 3 +++ log-tree.c | 2 +- revision.c | 30 +++++++++++++++++++++++++++++- revision.h | 15 +++++++++++++++ t/t6012-rev-list-simplify.sh | 6 ++++++ 7 files changed, 72 insertions(+), 3 deletions(-) diff --git a/combine-diff.c b/combine-diff.c index 6dc0609..abd2397 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -10,6 +10,7 @@ #include "refs.h" #include "userdiff.h" #include "sha1-array.h" +#include "revision.h" static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent) { @@ -1383,7 +1384,7 @@ void diff_tree_combined(const unsigned char *sha1, void diff_tree_combined_merge(const struct commit *commit, int dense, struct rev_info *rev) { - struct commit_list *parent = commit->parents; + struct commit_list *parent = get_saved_parents(commit); struct sha1_array parents = SHA1_ARRAY_INIT; while (parent) { diff --git a/commit.c b/commit.c index e5862f6..5ecdb38 100644 --- a/commit.c +++ b/commit.c @@ -377,6 +377,22 @@ unsigned commit_list_count(const struct commit_list *l) return c; } +struct commit_list *copy_commit_list(struct commit_list *list) +{ + struct commit_list *head = NULL; + struct commit_list **pp = &head; + while (list) { + struct commit_list *new; + new = xmalloc(sizeof(struct commit_list)); + new->item = list->item; + new->next = NULL; + *pp = new; + pp = &new->next; + list = list->next; + } + return head; +} + void free_commit_list(struct commit_list *list) { while (list) { diff --git a/commit.h b/commit.h index 35cc4e2..ca766d2 100644 --- a/commit.h +++ b/commit.h @@ -62,6 +62,9 @@ struct commit_list *commit_list_insert_by_date(struct commit *item, struct commit_list **list); void commit_list_sort_by_date(struct commit_list **list); +/* Shallow copy of the input list */ +struct commit_list *copy_commit_list(struct commit_list *list); + void free_commit_list(struct commit_list *list); /* Commit formats */ diff --git a/log-tree.c b/log-tree.c index a49d8e8..b1d5747 100644 --- a/log-tree.c +++ b/log-tree.c @@ -738,7 +738,7 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log sha1 = commit->tree->object.sha1; /* Root commit? */ - parents = commit->parents; + parents = get_saved_parents(commit); if (!parents) { if (opt->show_root_diff) { diff_root_tree_sha1(sha1, "", &opt->diffopt); diff --git a/revision.c b/revision.c index 84ccc05..3fdd0ae 100644 --- a/revision.c +++ b/revision.c @@ -15,6 +15,7 @@ #include "string-list.h" #include "line-log.h" #include "mailmap.h" +#include "commit-slab.h" volatile show_early_output_fn_t show_early_output; @@ -2763,7 +2764,7 @@ static int commit_match(struct commit *commit, struct rev_info *opt) return retval; } -static inline int want_ancestry(struct rev_info *revs) +static inline int want_ancestry(const struct rev_info *revs) { return (revs->rewrite_parents || revs->children.name); } @@ -2820,6 +2821,7 @@ enum commit_action simplify_commit(struct rev_info *revs, struct commit *commit) if (action == commit_show && !revs->show_all && revs->prune && revs->dense && want_ancestry(revs)) { + save_parents(commit); if (rewrite_parents(revs, commit, rewrite_one) < 0) return commit_error; } @@ -3069,3 +3071,29 @@ void put_revision_mark(const struct rev_info *revs, const struct commit *commit) fputs(mark, stdout); putchar(' '); } + +define_commit_slab(saved_parents, struct commit_list *); +struct saved_parents saved_parents_slab; +static int saved_parents_initialized; + +void save_parents(struct commit *commit) +{ + struct commit_list **pp; + + if (!saved_parents_initialized) { + init_saved_parents(&saved_parents_slab); + saved_parents_initialized = 1; + } + + pp = saved_parents_at(&saved_parents_slab, commit); + assert(*pp == NULL); + *pp = copy_commit_list(commit->parents); +} + +struct commit_list *get_saved_parents(struct commit *commit) +{ + if (!saved_parents_initialized) + return commit->parents; + + return *saved_parents_at(&saved_parents_slab, commit); +} diff --git a/revision.h b/revision.h index 95859ba..0717518 100644 --- a/revision.h +++ b/revision.h @@ -273,4 +273,19 @@ enum rewrite_result { extern int rewrite_parents(struct rev_info *revs, struct commit *commit, rewrite_parent_fn_t rewrite_parent); + +/* + * Save a copy of the parent list, and return the saved copy. This is + * used by the log machinery to retrieve the original parents when + * commit->parents has been modified by history simpification. + * + * You may only call save_parents() once per commit (this is checked + * for non-root commits). + * + * get_original_parents() will transparently return commit->parents if + * history simplification is off. + */ +extern void save_parents(struct commit *commit); +extern struct commit_list *get_original_parents(struct commit *commit); + #endif diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh index 57ce239..fde5e71 100755 --- a/t/t6012-rev-list-simplify.sh +++ b/t/t6012-rev-list-simplify.sh @@ -127,4 +127,10 @@ test_expect_success 'full history simplification without parent' ' } ' +test_expect_success '--full-diff is not affected by --parents' ' + git log -p --pretty="%H" --full-diff -- file >expected && + git log -p --pretty="%H" --full-diff --parents -- file >actual && + test_cmp expected actual +' + test_done -- 1.8.3.3.1180.gbf33142 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] log: use true parents for diff even when rewriting 2013-07-22 21:22 ` [PATCH] log: use true parents for diff even when rewriting Thomas Rast @ 2013-07-22 21:48 ` Junio C Hamano 2013-07-23 7:27 ` Thomas Rast 2013-07-23 19:55 ` Junio C Hamano 2013-07-31 20:13 ` [PATCH v2] " Thomas Rast 2 siblings, 1 reply; 9+ messages in thread From: Junio C Hamano @ 2013-07-22 21:48 UTC (permalink / raw) To: Thomas Rast; +Cc: git, Uwe Kleine-König Thomas Rast <trast@inf.ethz.ch> writes: > When using pathspec filtering in combination with diff-based log > output, parent simplification happens before the diff is computed. > The diff is therefore against the *simplified* parents. > > This works okay, arguably by accident, in the normal case: the pruned > commits did not affect the paths being filtered, so the diff against > the prune-result is the same as against the diff against the true > parents. > > However, --full-diff breaks this guarantee, and indeed gives pretty > spectacular results when comparing the output of > > git log --graph --stat ... > git log --graph --full-diff --stat ... > > (--graph internally kicks in parent simplification, much like > --parents). > > Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > Signed-off-by: Thomas Rast <trast@inf.ethz.ch> > --- > > Perhaps like this. It's getting a bit late, so I'm not sure if I'm > missing another user of the "true" parent list, but it does fix the > issue you reported. Conceptually I can see how this will change the history simplification in the vertical direction (skipping the ancestry chain and jumping directly to the closest grandparent that touched the specified path), but I am not sure how well this interacts with history simplification in the horizontal direciton (culling irrelevant side branches from the merge). > @@ -2820,6 +2821,7 @@ enum commit_action simplify_commit(struct rev_info *revs, struct commit *commit) > if (action == commit_show && > !revs->show_all && > revs->prune && revs->dense && want_ancestry(revs)) { > + save_parents(commit); > if (rewrite_parents(revs, commit, rewrite_one) < 0) > return commit_error; > } After this, rewritten parent list may have shrunk by dropping irrelevant side branches, but saved parents would have the full parents list. When we decide how to show diff depending on the number of remaining parents, we would still use the rewritten parents (which may have been reduced to a single strand of pearls) and your change will use the original parents (which may be multiple). I also have to wonder if we always want to incur this save-parents overhead, or we are better off limiting it to only when --full-diff is in effect. Thanks for getting the ball rolling, anyway. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] log: use true parents for diff even when rewriting 2013-07-22 21:48 ` Junio C Hamano @ 2013-07-23 7:27 ` Thomas Rast 2013-07-23 7:49 ` Uwe Kleine-König 0 siblings, 1 reply; 9+ messages in thread From: Thomas Rast @ 2013-07-23 7:27 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Uwe Kleine-König Junio C Hamano <gitster@pobox.com> writes: > Thomas Rast <trast@inf.ethz.ch> writes: > >> When using pathspec filtering in combination with diff-based log >> output, parent simplification happens before the diff is computed. >> The diff is therefore against the *simplified* parents. >> >> This works okay, arguably by accident, in the normal case: the pruned >> commits did not affect the paths being filtered, so the diff against >> the prune-result is the same as against the diff against the true >> parents. >> >> However, --full-diff breaks this guarantee, and indeed gives pretty >> spectacular results when comparing the output of >> >> git log --graph --stat ... >> git log --graph --full-diff --stat ... >> >> (--graph internally kicks in parent simplification, much like >> --parents). Hmm, I stopped writing the message midway through. There should be another two paragraphs here about storing the original parent list on the side for later use when showing the diff. >> Perhaps like this. It's getting a bit late, so I'm not sure if I'm >> missing another user of the "true" parent list, but it does fix the >> issue you reported. > > Conceptually I can see how this will change the history > simplification in the vertical direction (skipping the ancestry > chain and jumping directly to the closest grandparent that touched > the specified path), but I am not sure how well this interacts with > history simplification in the horizontal direciton (culling > irrelevant side branches from the merge). But isn't that similarly confusing for the user as Uwe's original problem? Suddenly we'd be showing a merge commit as an ordinary one, simply because the merged history did not affect the filtered pathspecs. Thus we would show everything that has been merged on the *other* files as a big diff. Would that be useful? It would certainly be a big difference in how the commit is shown. > I also have to wonder if we always want to incur this save-parents > overhead, or we are better off limiting it to only when --full-diff > is in effect. I haven't quite convinced myself that it is 100% safe to use the rewritten parents when --full-diff is not in effect... -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] log: use true parents for diff even when rewriting 2013-07-23 7:27 ` Thomas Rast @ 2013-07-23 7:49 ` Uwe Kleine-König 0 siblings, 0 replies; 9+ messages in thread From: Uwe Kleine-König @ 2013-07-23 7:49 UTC (permalink / raw) To: Thomas Rast; +Cc: Junio C Hamano, git Hello Thomas, On Tue, Jul 23, 2013 at 09:27:06AM +0200, Thomas Rast wrote: > Junio C Hamano <gitster@pobox.com> writes: > > Conceptually I can see how this will change the history > > simplification in the vertical direction (skipping the ancestry > > chain and jumping directly to the closest grandparent that touched > > the specified path), but I am not sure how well this interacts with > > history simplification in the horizontal direciton (culling > > irrelevant side branches from the merge). > > But isn't that similarly confusing for the user as Uwe's original > problem? Suddenly we'd be showing a merge commit as an ordinary one, > simply because the merged history did not affect the filtered > pathspecs. Thus we would show everything that has been merged on the > *other* files as a big diff. Would that be useful? It would certainly > be a big difference in how the commit is shown. the merge is only included in the output if on both parent paths the file is touched. So this is a non-issue, isn't it? (Well, only if it has more than 2 parents and not all ancestor paths touch the file, the number of parents shown is changed.) Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] log: use true parents for diff even when rewriting 2013-07-22 21:22 ` [PATCH] log: use true parents for diff even when rewriting Thomas Rast 2013-07-22 21:48 ` Junio C Hamano @ 2013-07-23 19:55 ` Junio C Hamano 2013-07-31 20:13 ` [PATCH v2] " Thomas Rast 2 siblings, 0 replies; 9+ messages in thread From: Junio C Hamano @ 2013-07-23 19:55 UTC (permalink / raw) To: Thomas Rast; +Cc: git, Uwe Kleine-König Thomas Rast <trast@inf.ethz.ch> writes: > +define_commit_slab(saved_parents, struct commit_list *); > +struct saved_parents saved_parents_slab; > +static int saved_parents_initialized; > + > +void save_parents(struct commit *commit) > +{ > + struct commit_list **pp; > + > + if (!saved_parents_initialized) { > + init_saved_parents(&saved_parents_slab); > + saved_parents_initialized = 1; > + } > + > + pp = saved_parents_at(&saved_parents_slab, commit); > + assert(*pp == NULL); > + *pp = copy_commit_list(commit->parents); > +} > + > +struct commit_list *get_saved_parents(struct commit *commit) Use "const struct commit *" here, as combine-diff.c has a const pointer. > +{ > + if (!saved_parents_initialized) > + return commit->parents; > + > + return *saved_parents_at(&saved_parents_slab, commit); > +} clear_commit_slab() is not used, failing -Wunused -Werror compilation. > diff --git a/revision.h b/revision.h > index 95859ba..0717518 100644 > --- a/revision.h > +++ b/revision.h > @@ -273,4 +273,19 @@ enum rewrite_result { > > extern int rewrite_parents(struct rev_info *revs, struct commit *commit, > rewrite_parent_fn_t rewrite_parent); > + > +/* > + * Save a copy of the parent list, and return the saved copy. This is > + * used by the log machinery to retrieve the original parents when > + * commit->parents has been modified by history simpification. > + * > + * You may only call save_parents() once per commit (this is checked > + * for non-root commits). > + * > + * get_original_parents() will transparently return commit->parents if > + * history simplification is off. > + */ > +extern void save_parents(struct commit *commit); > +extern struct commit_list *get_original_parents(struct commit *commit); > + s/_original/_saved/ here, and "const struct commit *". By the way, when the only single parameter is a named type, you can safely omit the name of the parameter from the declaration without losing clarity. > #endif > diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh > index 57ce239..fde5e71 100755 > --- a/t/t6012-rev-list-simplify.sh > +++ b/t/t6012-rev-list-simplify.sh > @@ -127,4 +127,10 @@ test_expect_success 'full history simplification without parent' ' > } > ' > > +test_expect_success '--full-diff is not affected by --parents' ' > + git log -p --pretty="%H" --full-diff -- file >expected && > + git log -p --pretty="%H" --full-diff --parents -- file >actual && > + test_cmp expected actual > +' > + > test_done ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] log: use true parents for diff even when rewriting 2013-07-22 21:22 ` [PATCH] log: use true parents for diff even when rewriting Thomas Rast 2013-07-22 21:48 ` Junio C Hamano 2013-07-23 19:55 ` Junio C Hamano @ 2013-07-31 20:13 ` Thomas Rast 2013-07-31 22:55 ` Jeff King 2 siblings, 1 reply; 9+ messages in thread From: Thomas Rast @ 2013-07-31 20:13 UTC (permalink / raw) To: git; +Cc: Uwe Kleine-König, Ramsay Jones, Junio C Hamano When using pathspec filtering in combination with diff-based log output, parent simplification happens before the diff is computed. The diff is therefore against the *simplified* parents. This works okay, arguably by accident, in the normal case: simplification reduces to one parent as long as the commit is TREESAME to it. So the simplified parent of any given commit must have the same tree contents on the filtered paths as its true (unfiltered) parent. However, --full-diff breaks this guarantee, and indeed gives pretty spectacular results when comparing the output of git log --graph --stat ... git log --graph --full-diff --stat ... (--graph internally kicks in parent simplification, much like --parents). To fix it, store a copy of the parent list before simplification (in a slab) whenever --full-diff is in effect. Then use the stored parents instead of the simplified ones in the commit display code paths. The latter do not actually check for --full-diff to avoid duplicated code; they just grab the original parents if save_parents() has not been called for this revision walk. For ordinary commits it should be obvious that this is the right thing to do. Merge commits are a bit subtle. Observe that with default simplification, merge simplification is an all-or-nothing decision: either the merge is TREESAME to one parent and disappears, or it is different from all parents and the parent list remains intact. Redundant parents are not pruned, so the existing code also shows them as a merge. So if we do show a merge commit, the parent list just consists of the rewrite result on each parent. Running, e.g., --cc on this in --full-diff mode is not very useful: if any commits were skipped, some hunks will disagree with all sides of the merge (with one side, because commits were skipped; with the others, because they didn't have those changes in the first place). This triggers --cc showing these hunks spuriously. Therefore I believe that even for merge commits it is better to show the diffs wrt. the original parents. Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Thomas Rast <trast@inf.ethz.ch> --- New in this version: * Junio's and Ramsay's suggestions squashed * Moved the slab variable from file-static to within the struct rev_info. In practice there is no difference, because you cannot run two walks in parallel anyway (they would stomp over each others' parent lists!). But it was easy, feels slightly cleaner, and avoids having yet another global variable. * Wrote an actual commit message. I hope the logic wrt. merge commits is correct. combine-diff.c | 3 ++- commit.c | 16 ++++++++++++++++ commit.h | 3 +++ log-tree.c | 2 +- revision.c | 43 ++++++++++++++++++++++++++++++++++++++++++- revision.h | 20 ++++++++++++++++++++ t/t6012-rev-list-simplify.sh | 6 ++++++ 7 files changed, 90 insertions(+), 3 deletions(-) diff --git a/combine-diff.c b/combine-diff.c index 6dc0609..3d2aaf3 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -10,6 +10,7 @@ #include "refs.h" #include "userdiff.h" #include "sha1-array.h" +#include "revision.h" static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent) { @@ -1383,7 +1384,7 @@ void diff_tree_combined(const unsigned char *sha1, void diff_tree_combined_merge(const struct commit *commit, int dense, struct rev_info *rev) { - struct commit_list *parent = commit->parents; + struct commit_list *parent = get_saved_parents(rev, commit); struct sha1_array parents = SHA1_ARRAY_INIT; while (parent) { diff --git a/commit.c b/commit.c index e5862f6..5ecdb38 100644 --- a/commit.c +++ b/commit.c @@ -377,6 +377,22 @@ unsigned commit_list_count(const struct commit_list *l) return c; } +struct commit_list *copy_commit_list(struct commit_list *list) +{ + struct commit_list *head = NULL; + struct commit_list **pp = &head; + while (list) { + struct commit_list *new; + new = xmalloc(sizeof(struct commit_list)); + new->item = list->item; + new->next = NULL; + *pp = new; + pp = &new->next; + list = list->next; + } + return head; +} + void free_commit_list(struct commit_list *list) { while (list) { diff --git a/commit.h b/commit.h index d912a9d..f9504f7 100644 --- a/commit.h +++ b/commit.h @@ -62,6 +62,9 @@ struct commit_list *commit_list_insert_by_date(struct commit *item, struct commit_list **list); void commit_list_sort_by_date(struct commit_list **list); +/* Shallow copy of the input list */ +struct commit_list *copy_commit_list(struct commit_list *list); + void free_commit_list(struct commit_list *list); /* Commit formats */ diff --git a/log-tree.c b/log-tree.c index a49d8e8..8534d91 100644 --- a/log-tree.c +++ b/log-tree.c @@ -738,7 +738,7 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log sha1 = commit->tree->object.sha1; /* Root commit? */ - parents = commit->parents; + parents = get_saved_parents(opt, commit); if (!parents) { if (opt->show_root_diff) { diff_root_tree_sha1(sha1, "", &opt->diffopt); diff --git a/revision.c b/revision.c index 84ccc05..e3ca936 100644 --- a/revision.c +++ b/revision.c @@ -15,6 +15,7 @@ #include "string-list.h" #include "line-log.h" #include "mailmap.h" +#include "commit-slab.h" volatile show_early_output_fn_t show_early_output; @@ -2763,7 +2764,7 @@ static int commit_match(struct commit *commit, struct rev_info *opt) return retval; } -static inline int want_ancestry(struct rev_info *revs) +static inline int want_ancestry(const struct rev_info *revs) { return (revs->rewrite_parents || revs->children.name); } @@ -2820,6 +2821,14 @@ enum commit_action simplify_commit(struct rev_info *revs, struct commit *commit) if (action == commit_show && !revs->show_all && revs->prune && revs->dense && want_ancestry(revs)) { + /* + * --full-diff on simplified parents is no good: it + * will show spurious changes from the commits that + * were elided. So we save the parents on the side + * when --full-diff is in effect. + */ + if (revs->full_diff) + save_parents(revs, commit); if (rewrite_parents(revs, commit, rewrite_one) < 0) return commit_error; } @@ -3038,6 +3047,8 @@ struct commit *get_revision(struct rev_info *revs) c = get_revision_internal(revs); if (c && revs->graph) graph_update(revs->graph, c); + if (!c) + free_saved_parents(revs); return c; } @@ -3069,3 +3080,33 @@ void put_revision_mark(const struct rev_info *revs, const struct commit *commit) fputs(mark, stdout); putchar(' '); } + +define_commit_slab(saved_parents, struct commit_list *); + +void save_parents(struct rev_info *revs, struct commit *commit) +{ + struct commit_list **pp; + + if (!revs->saved_parents_slab) { + revs->saved_parents_slab = xmalloc(sizeof(struct saved_parents)); + init_saved_parents(revs->saved_parents_slab); + } + + pp = saved_parents_at(revs->saved_parents_slab, commit); + assert(*pp == NULL); + *pp = copy_commit_list(commit->parents); +} + +struct commit_list *get_saved_parents(struct rev_info *revs, const struct commit *commit) +{ + if (!revs->saved_parents_slab) + return commit->parents; + + return *saved_parents_at(revs->saved_parents_slab, commit); +} + +void free_saved_parents(struct rev_info *revs) +{ + if (revs->saved_parents_slab) + clear_saved_parents(revs->saved_parents_slab); +} diff --git a/revision.h b/revision.h index 95859ba..e7f1d21 100644 --- a/revision.h +++ b/revision.h @@ -25,6 +25,7 @@ struct rev_info; struct log_info; struct string_list; +struct saved_parents; struct rev_cmdline_info { unsigned int nr; @@ -187,6 +188,9 @@ struct rev_info { /* line level range that we are chasing */ struct decoration line_log_data; + + /* copies of the parent lists, for --full-diff display */ + struct saved_parents *saved_parents_slab; }; #define REV_TREE_SAME 0 @@ -273,4 +277,20 @@ enum rewrite_result { extern int rewrite_parents(struct rev_info *revs, struct commit *commit, rewrite_parent_fn_t rewrite_parent); + +/* + * Save a copy of the parent list, and return the saved copy. This is + * used by the log machinery to retrieve the original parents when + * commit->parents has been modified by history simpification. + * + * You may only call save_parents() once per commit (this is checked + * for non-root commits). + * + * get_saved_parents() will transparently return commit->parents if + * history simplification is off. + */ +extern void save_parents(struct rev_info *revs, struct commit *commit); +extern struct commit_list *get_saved_parents(struct rev_info *revs, const struct commit *commit); +extern void free_saved_parents(struct rev_info *revs); + #endif diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh index 57ce239..fde5e71 100755 --- a/t/t6012-rev-list-simplify.sh +++ b/t/t6012-rev-list-simplify.sh @@ -127,4 +127,10 @@ test_expect_success 'full history simplification without parent' ' } ' +test_expect_success '--full-diff is not affected by --parents' ' + git log -p --pretty="%H" --full-diff -- file >expected && + git log -p --pretty="%H" --full-diff --parents -- file >actual && + test_cmp expected actual +' + test_done -- 1.8.4.rc0.408.gad6868d ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2] log: use true parents for diff even when rewriting 2013-07-31 20:13 ` [PATCH v2] " Thomas Rast @ 2013-07-31 22:55 ` Jeff King 0 siblings, 0 replies; 9+ messages in thread From: Jeff King @ 2013-07-31 22:55 UTC (permalink / raw) To: Thomas Rast; +Cc: git, Uwe Kleine-König, Ramsay Jones, Junio C Hamano On Wed, Jul 31, 2013 at 10:13:20PM +0200, Thomas Rast wrote: > When using pathspec filtering in combination with diff-based log > output, parent simplification happens before the diff is computed. > The diff is therefore against the *simplified* parents. > > This works okay, arguably by accident, in the normal case: > simplification reduces to one parent as long as the commit is TREESAME > to it. So the simplified parent of any given commit must have the > same tree contents on the filtered paths as its true (unfiltered) > parent. > > However, --full-diff breaks this guarantee, and indeed gives pretty > spectacular results when comparing the output of > > git log --graph --stat ... > git log --graph --full-diff --stat ... > > (--graph internally kicks in parent simplification, much like > --parents). Your description (and solution) make a lot of sense to me. Another code path that has a similar problem is the "-g" reflog walker. It rewrites the parents based on the reflog, and the diffs it produces are mostly useless (e.g., try "git stash list -p"). Should we be applying the same technique there? I guess it might bother people who really want to see the diff between two lines in the reflog (e.g., comparing the results stored by "rebase" with the previous version). I think that is rare enough that "git diff" would be a better tool in that case, though. And arguably, "log -g" should not be rewriting the parents at all, because it makes "--graph" pointless (and indeed, we disallow the combination). So potentially the solution is to stop the rewriting entirely, not mask it for the diffs. But doing so might be a good interim solution until somebody feels like rewriting the reflog walker. -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-07-31 22:55 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-07-22 9:08 git log anomalities Uwe Kleine-König 2013-07-22 10:40 ` Thomas Rast 2013-07-22 21:22 ` [PATCH] log: use true parents for diff even when rewriting Thomas Rast 2013-07-22 21:48 ` Junio C Hamano 2013-07-23 7:27 ` Thomas Rast 2013-07-23 7:49 ` Uwe Kleine-König 2013-07-23 19:55 ` Junio C Hamano 2013-07-31 20:13 ` [PATCH v2] " Thomas Rast 2013-07-31 22:55 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).