* git log --numstat disagrees with git apply --numstat @ 2008-12-11 23:53 Shawn O. Pearce 2008-12-12 1:52 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Shawn O. Pearce @ 2008-12-11 23:53 UTC (permalink / raw) To: git I've found a case where git apply --numstat and git log --numstat produce different results for the same commit. In egit (git://repo.or.cz/egit.git/): $ git log --numstat -1 --pretty=o 9bda5ece6806cd797416eaa47c 9bda5ece6806cd797416eaa47c7b927cc6e9c3b2 Teach RevWalk about ... 8 0 org.spearce.jgit/src/org/spearce/jgit/revwalk/DateRevQueue.java 69 13 org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java $ git log -p -1 --pretty=o 9bda5ece6806cd797416eaa47c | git apply --numstat 8 0 org.spearce.jgit/src/org/spearce/jgit/revwalk/DateRevQueue.java 68 12 org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java I found this because I was writing a unit test for JGit that ran through the JGit project history and compared the output of git log --numstat against the output of JGit's "git apply --numstat" implementation, after scraping the "git log -p" output. I can't quite figure out why log --numstat is coming up with a +1 difference here for both added and removed, but it is. I haven't dug into the Git code yet to figure out why. FWIW, JGit produces the same result as "git apply --numstat" (the 68/12). At this point JGit was able to successfully read and match 715 of 1211 commits before it found this difference, so its also somewhat rare to occur I think... -- Shawn. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: git log --numstat disagrees with git apply --numstat 2008-12-11 23:53 git log --numstat disagrees with git apply --numstat Shawn O. Pearce @ 2008-12-12 1:52 ` Jeff King 2008-12-12 2:08 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2008-12-12 1:52 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: git On Thu, Dec 11, 2008 at 03:53:37PM -0800, Shawn O. Pearce wrote: > I've found a case where git apply --numstat and git log --numstat > produce different results for the same commit. "git apply" doesn't actually do the diff; it just calculates based on what it sees in the patch. So the real issue is "git log -p" and "git log --numstat" produce different patches. And you can see it by instrumenting like this: diff --git a/diff.c b/diff.c index af822c1..fce93db 100644 --- a/diff.c +++ b/diff.c @@ -726,6 +726,8 @@ static void diffstat_consume(void *priv, char *line, unsigned long len) struct diffstat_t *diffstat = priv; struct diffstat_file *x = diffstat->files[diffstat->nr - 1]; + fwrite(line, 1, len, stderr); + if (line[0] == '+') x->added++; else if (line[0] == '-') and then comparing what diffstat_consume gets versus the patch, something like: what="9bda5ece org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java" git log -p -1 --pretty=format: $what | grep '^[-+@]' >a git log --numstat -1 --pretty=format: $what 2>b >/dev/null diff -u a b It looks like it is just a place where two different valid diffs can be constructed: +- +- for (final RevCommit p : c.parents) { +- if ((p.flags & SEEN) != 0) + for (;;) { + final RevCommit c = pending.pop(); + if (c == null) @@ -67,19 +68,20 @@ + p.flags |= SEEN; + pending.add(p); + } -- for (final RevCommit p : c.parents) { -- if ((p.flags & SEEN) != 0) ++ which is probably just due to different xdi settings being used between the two codepaths. I haven't looked closely to see which different options we are feeding to xdiff. -Peff ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: git log --numstat disagrees with git apply --numstat 2008-12-12 1:52 ` Jeff King @ 2008-12-12 2:08 ` Jeff King 2008-12-12 2:21 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2008-12-12 2:08 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: git On Thu, Dec 11, 2008 at 08:52:55PM -0500, Jeff King wrote: > It looks like it is just a place where two different valid diffs can be > constructed: > [...] > which is probably just due to different xdi settings being used between > the two codepaths. I haven't looked closely to see which different > options we are feeding to xdiff. Ah. Doing this gives me the 68/12 answer for "git log --numstat": diff --git a/diff.c b/diff.c index af822c1..5f314ce 100644 --- a/diff.c +++ b/diff.c @@ -1539,6 +1539,7 @@ static void builtin_diffstat(const char *name_a, const char *name_b, memset(&xpp, 0, sizeof(xpp)); memset(&xecfg, 0, sizeof(xecfg)); + xecfg.ctxlen = o->context; xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts; xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat, &xpp, &xecfg, &ecb); I guess it is slightly less efficient (since we just throw away the context lines anyway), but it is nice to count the exact same patch that "git log -p" would produce. -Peff ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: git log --numstat disagrees with git apply --numstat 2008-12-12 2:08 ` Jeff King @ 2008-12-12 2:21 ` Jeff King 2008-12-15 9:57 ` Junio C Hamano 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2008-12-12 2:21 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: git On Thu, Dec 11, 2008 at 09:08:58PM -0500, Jeff King wrote: > > which is probably just due to different xdi settings being used between > > the two codepaths. I haven't looked closely to see which different > > options we are feeding to xdiff. > > Ah. Doing this gives me the 68/12 answer for "git log --numstat": BTW, I got a little confused looking at the parameters to xdi_diff_outf, since ecb gets passed in full of random garbage. I don't know if this cleanup is worth applying: -- >8 -- remove xecb parameter to xdi_diff_outf It is pointless to pass this parameter in instead of just declaring it locally inside the function because: 1. We overwrite every member of the struct inside the function anyway, so we ignore anything passed in. 2. The contents after we return point to a local variable that has gone out of scope, so it is wrong to look at them. So it is just making the interface more complex to have it there, and it looks like a potential error in the callers to be passing a completely uninitialized variable. Signed-off-by: Jeff King <peff@peff.net> --- diff.c | 15 +++++---------- xdiff-interface.c | 9 +++++---- xdiff-interface.h | 2 +- 3 files changed, 11 insertions(+), 15 deletions(-) diff --git a/diff.c b/diff.c index af822c1..1801aef 100644 --- a/diff.c +++ b/diff.c @@ -403,7 +403,6 @@ static void diff_words_show(struct diff_words_data *diff_words) { xpparam_t xpp; xdemitconf_t xecfg; - xdemitcb_t ecb; mmfile_t minus, plus; int i; @@ -428,7 +427,7 @@ static void diff_words_show(struct diff_words_data *diff_words) xpp.flags = XDF_NEED_MINIMAL; xecfg.ctxlen = diff_words->minus.alloc + diff_words->plus.alloc; xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, diff_words, - &xpp, &xecfg, &ecb); + &xpp, &xecfg); free(minus.ptr); free(plus.ptr); diff_words->minus.text.size = diff_words->plus.text.size = 0; @@ -1436,7 +1435,6 @@ static void builtin_diff(const char *name_a, const char *diffopts = getenv("GIT_DIFF_OPTS"); xpparam_t xpp; xdemitconf_t xecfg; - xdemitcb_t ecb; struct emit_callback ecbdata; const struct userdiff_funcname *pe; @@ -1484,7 +1482,7 @@ static void builtin_diff(const char *name_a, ecbdata.diff_words->file = o->file; } xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata, - &xpp, &xecfg, &ecb); + &xpp, &xecfg); if (DIFF_OPT_TST(o, COLOR_DIFF_WORDS)) free_diff_words_data(&ecbdata); if (textconv_one) @@ -1535,13 +1533,12 @@ static void builtin_diffstat(const char *name_a, const char *name_b, /* Crazy xdl interfaces.. */ xpparam_t xpp; xdemitconf_t xecfg; - xdemitcb_t ecb; memset(&xpp, 0, sizeof(xpp)); memset(&xecfg, 0, sizeof(xecfg)); xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts; xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat, - &xpp, &xecfg, &ecb); + &xpp, &xecfg); } free_and_return: @@ -1582,14 +1579,13 @@ static void builtin_checkdiff(const char *name_a, const char *name_b, /* Crazy xdl interfaces.. */ xpparam_t xpp; xdemitconf_t xecfg; - xdemitcb_t ecb; memset(&xpp, 0, sizeof(xpp)); memset(&xecfg, 0, sizeof(xecfg)); xecfg.ctxlen = 1; /* at least one context line */ xpp.flags = XDF_NEED_MINIMAL; xdi_diff_outf(&mf1, &mf2, checkdiff_consume, &data, - &xpp, &xecfg, &ecb); + &xpp, &xecfg); if ((data.ws_rule & WS_TRAILING_SPACE) && data.trailing_blanks_start) { @@ -3009,7 +3005,6 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1) for (i = 0; i < q->nr; i++) { xpparam_t xpp; xdemitconf_t xecfg; - xdemitcb_t ecb; mmfile_t mf1, mf2; struct diff_filepair *p = q->queue[i]; int len1, len2; @@ -3071,7 +3066,7 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1) xecfg.ctxlen = 3; xecfg.flags = XDL_EMIT_FUNCNAMES; xdi_diff_outf(&mf1, &mf2, patch_id_consume, &data, - &xpp, &xecfg, &ecb); + &xpp, &xecfg); } git_SHA1_Final(sha1, &ctx); diff --git a/xdiff-interface.c b/xdiff-interface.c index d782f06..d0d60fa 100644 --- a/xdiff-interface.c +++ b/xdiff-interface.c @@ -140,18 +140,19 @@ int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t co int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2, xdiff_emit_consume_fn fn, void *consume_callback_data, xpparam_t const *xpp, - xdemitconf_t const *xecfg, xdemitcb_t *xecb) + xdemitconf_t const *xecfg) { int ret; struct xdiff_emit_state state; + xdemitcb_t xecb; memset(&state, 0, sizeof(state)); state.consume = fn; state.consume_callback_data = consume_callback_data; - xecb->outf = xdiff_outf; - xecb->priv = &state; + xecb.outf = xdiff_outf; + xecb.priv = &state; strbuf_init(&state.remainder, 0); - ret = xdi_diff(mf1, mf2, xpp, xecfg, xecb); + ret = xdi_diff(mf1, mf2, xpp, xecfg, &xecb); strbuf_release(&state.remainder); return ret; } diff --git a/xdiff-interface.h b/xdiff-interface.h index 7352b9a..491037d 100644 --- a/xdiff-interface.h +++ b/xdiff-interface.h @@ -10,7 +10,7 @@ int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t co int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2, xdiff_emit_consume_fn fn, void *consume_callback_data, xpparam_t const *xpp, - xdemitconf_t const *xecfg, xdemitcb_t *xecb); + xdemitconf_t const *xecfg); int xdi_diff_hunks(mmfile_t *mf1, mmfile_t *mf2, xdiff_emit_hunk_consume_fn fn, void *consume_callback_data, xpparam_t const *xpp, xdemitconf_t *xecfg); -- 1.6.1.rc2.307.gb39e0.dirty ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: git log --numstat disagrees with git apply --numstat 2008-12-12 2:21 ` Jeff King @ 2008-12-15 9:57 ` Junio C Hamano 2008-12-15 20:32 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Junio C Hamano @ 2008-12-15 9:57 UTC (permalink / raw) To: Jeff King; +Cc: Shawn O. Pearce, git Jeff King <peff@peff.net> writes: > BTW, I got a little confused looking at the parameters to xdi_diff_outf, > since ecb gets passed in full of random garbage. I don't know if this > cleanup is worth applying: I think this makes sense, but let's do this after 1.6.1 final. It does not fix anything, and I'd rather avoid distraction. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: git log --numstat disagrees with git apply --numstat 2008-12-15 9:57 ` Junio C Hamano @ 2008-12-15 20:32 ` Jeff King 0 siblings, 0 replies; 6+ messages in thread From: Jeff King @ 2008-12-15 20:32 UTC (permalink / raw) To: Junio C Hamano; +Cc: Shawn O. Pearce, git On Mon, Dec 15, 2008 at 01:57:07AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > BTW, I got a little confused looking at the parameters to xdi_diff_outf, > > since ecb gets passed in full of random garbage. I don't know if this > > cleanup is worth applying: > > I think this makes sense, but let's do this after 1.6.1 final. It does > not fix anything, and I'd rather avoid distraction. Agreed. I will re-post both patches after 1.6.1 is released. -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-12-15 20:33 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-12-11 23:53 git log --numstat disagrees with git apply --numstat Shawn O. Pearce 2008-12-12 1:52 ` Jeff King 2008-12-12 2:08 ` Jeff King 2008-12-12 2:21 ` Jeff King 2008-12-15 9:57 ` Junio C Hamano 2008-12-15 20:32 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).