From: Junio C Hamano <gitster@pobox.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <gitster@pobox.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: Wrong damage counting in diffcore_count_changes?
Date: Fri, 04 Dec 2009 12:47:16 -0800 [thread overview]
Message-ID: <7vljhio4a3.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <alpine.LFD.2.00.0912041200120.24579@localhost.localdomain> (Linus Torvalds's message of "Fri\, 4 Dec 2009 12\:07\:47 -0800 \(PST\)")
Linus Torvalds <torvalds@linux-foundation.org> writes:
> This changes it so that
>
> - whenever it bypasses a destination hash (because it doesn't match a
> source), it counts the bytes associated with that as "literal added"
>
> - at the end (once we have used up all the source hashes), we do the same
> thing with the remaining destination hashes.
>
> - when hashes do match, and we use the difference in counts as a value,
> we also use up that destination hash entry (the 'd++'
>
> This _seems_ to make --dirstat output more sensible, and I'd hope that
> that in turn should mean that file rename detection should also be more
> sensible. But I haven't actually verified it in any way. Maybe I just
> screwed up file rename detection entirely.
>
> Did I miss something?
Well, the current loop structure largely comes from your eb4d0e3 (optimize
diffcore-delta by sorting hash entries., 2007-10-02) so you would be the
best judge of the change ;-), even though it seems that the current code
inherited the "skipping of dst when src does not exist" bug from c06c796
(diffcore-rename: somewhat optimized., 2006-03-12).
Earlier code, e.g. as of ba23bbc (diffcore-delta: make change counter to
byte oriented again., 2006-03-04), used to be very simple minded and
inefficient and iterated over src_count[] and dst_count[] arrays for the
entire hash value namespace and there was no such "missing, skipped,
happened to have the next value" issue.
I think you understand the original intention of the function correctly
and from a cursory look of the patch I think it fixes the bug in the
current code, and any changes to renames/breaks should be improvements.
But my lunchbreak is over, and my evening is booked, so I unfortunately
cannot spend more time thinking about any possible fallouts from this
change until tomorrow.
Sorry, and thanks.
> Linus
> ---
> diffcore-delta.c | 11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/diffcore-delta.c b/diffcore-delta.c
> index e670f85..7cf431d 100644
> --- a/diffcore-delta.c
> +++ b/diffcore-delta.c
> @@ -201,10 +201,15 @@ int diffcore_count_changes(struct diff_filespec *src,
> while (d->cnt) {
> if (d->hashval >= s->hashval)
> break;
> + la += d->cnt;
> d++;
> }
> src_cnt = s->cnt;
> - dst_cnt = d->hashval == s->hashval ? d->cnt : 0;
> + dst_cnt = 0;
> + if (d->cnt && d->hashval == s->hashval) {
> + dst_cnt = d->cnt;
> + d++;
> + }
> if (src_cnt < dst_cnt) {
> la += dst_cnt - src_cnt;
> sc += src_cnt;
> @@ -213,6 +218,10 @@ int diffcore_count_changes(struct diff_filespec *src,
> sc += dst_cnt;
> s++;
> }
> + while (d->cnt) {
> + la += d->cnt;
> + d++;
> + }
>
> if (!src_count_p)
> free(src_count);
next prev parent reply other threads:[~2009-12-04 20:47 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-04 20:07 Wrong damage counting in diffcore_count_changes? Linus Torvalds
2009-12-04 20:47 ` Junio C Hamano [this message]
2009-12-04 22:48 ` Linus Torvalds
2009-12-04 23:20 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7vljhio4a3.fsf@alter.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox