git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [WIP PATCH] Manual rename correction
Date: Thu, 2 Aug 2012 18:41:55 -0400	[thread overview]
Message-ID: <20120802224155.GB28217@sigill.intra.peff.net> (raw)
In-Reply-To: <CACsJy8CndopS7fg4mevFD5T0KJ85ba6jjhamrKDdKvKsWa_fQw@mail.gmail.com>

On Thu, Aug 02, 2012 at 07:08:25PM +0700, Nguyen Thai Ngoc Duy wrote:

> > I implemented (1a). Implementing (1b) would be easy, but for a full-on
> > cache (especially for "-C"), I think the resulting size might be
> > prohibitive.
> 
> (1a) is good regardless rename overrides. Why don't you polish and
> submit it? We can set some criteria to limit the cache size while
> keeping computation reasonably low. Caching rename scores for file
> pairs that has file size larger than a limit is one. Rename matrix
> size could also be a candidate. We could even cache just rename scores
> for recent commits (i.e. close to heads) only with the assumption that
> people diff/apply recent commits more often.

I'll polish and share it. I'm still not 100% sure it's a good idea,
because introducing an on-disk cache means we need to _manage_ that
cache. How big will it be? Who will prune it when it gets too big? By
what criteria? And so on.

But if it's all hidden behind a config option, then it won't hurt people
who don't use it. And people who do use it can gather data on how the
caches grow.

> > All solutions under (2) suffer from the same problem: they are accurate
> > only for a single diff. For other diffs, you would either have to not
> > use the feature, or you would be stuck traversing the history and
> > assigning a temporary file identity (e.g., given commits A->B->C, and in
> > A->B we rename "foo" to "bar", the diff between A and C could discover
> > that A's "foo" corresponds to C's "bar").
> 
> Yeah. If we go with manual overrides, I expect users to deal with
> these manually too. IOW they'll need to create a mapping for A->C
> themselves. We can help detect that there are manual overrides in some
> cases, like merge, and let users know that manual overrides are
> ignored. For merge, I think we can just check for all commits while
> traversing looking for bases.

Yeah, merges are a special case, in that we know the diff we perform
will always have a direct-ancestor relationship (since it is always
between a tip and the merge base).

> > But there is not much point in making it machine-readable, since the
> > interesting machine-readable things we do with renames are:
> >
> >   1. Show the diff against the rename src, which can often be easier to
> >      read. Except that if rename detection did not find it, it is
> >      probably _not_ going to be easier to read.
> 
> Probably. Still it helps "git log --follow" to follow the correct
> track in the 1% case that rename detection does go wrong.

Thanks. I didn't think of --follow, but that is a good counterpoint to
my argument.

> >   2. Applying content to the destination of a merge. But you're almost
> >      never doing the diff between a commit and its parent, so the
> >      information would be useless.
> 
> Having a way to interfere rename detection, even manually, could be
> good in this case if it reduces conflicts. We could feed rename
> overrides using command line.

Yeah. I think I'd start with letting you feed pairs to diff_options,
give it a command-line option to see how useful it is, and then later on
consider a mechanism for extracting those pairs automatically from
commits or notes.

-Peff

  reply	other threads:[~2012-08-02 22:42 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-31 14:15 [WIP PATCH] Manual rename correction Nguyen Thai Ngoc Duy
2012-07-31 16:32 ` Junio C Hamano
2012-07-31 19:23   ` Jeff King
2012-07-31 20:20     ` Junio C Hamano
2012-08-01  0:42       ` Jeff King
2012-08-01  6:01         ` Junio C Hamano
2012-08-01 21:54           ` Jeff King
2012-08-01 22:10             ` Junio C Hamano
2012-08-02 22:37               ` Jeff King
2012-08-02 22:51                 ` Junio C Hamano
2012-08-02 22:58                   ` Jeff King
2012-08-02  5:33             ` Junio C Hamano
2012-08-01  1:10     ` Nguyen Thai Ngoc Duy
2012-08-01  2:01       ` Jeff King
2012-08-01  4:36         ` Nguyen Thai Ngoc Duy
2012-08-01  6:09           ` Junio C Hamano
2012-08-01  6:34             ` Nguyen Thai Ngoc Duy
2012-08-01 21:32               ` Jeff King
2012-08-01 21:27           ` Jeff King
2012-08-02 12:08             ` Nguyen Thai Ngoc Duy
2012-08-02 22:41               ` Jeff King [this message]
2012-08-04 17:09                 ` [PATCH 0/8] caching rename results Jeff King
2012-08-04 17:10                   ` [PATCH 1/8] implement generic key/value map Jeff King
2012-08-04 22:58                     ` Junio C Hamano
2012-08-06 20:35                       ` Jeff King
2012-08-04 17:10                   ` [PATCH 2/8] map: add helper functions for objects as keys Jeff King
2012-08-04 17:11                   ` [PATCH 3/8] fast-export: use object to uint32 map instead of "decorate" Jeff King
2012-08-04 17:11                   ` [PATCH 4/8] decorate: use "map" for the underlying implementation Jeff King
2012-08-04 17:11                   ` [PATCH 5/8] map: implement persistent maps Jeff King
2012-08-04 17:11                   ` [PATCH 6/8] implement metadata cache subsystem Jeff King
2012-08-04 22:49                     ` Junio C Hamano
2012-08-06 20:31                       ` Jeff King
2012-08-06 20:38                     ` Jeff King
2012-08-04 17:12                   ` [PATCH 7/8] implement rename cache Jeff King
2012-08-04 17:14                   ` [PATCH 8/8] diff: optionally use " Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120802224155.GB28217@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).