git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jonathan Tan <jonathantanmy@google.com>, git@vger.kernel.org
Subject: Re: [RFC PATCH] diff: only prefetch for certain output formats
Date: Sat, 1 Feb 2020 06:29:20 -0500	[thread overview]
Message-ID: <20200201112920.GD1864948@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqeevfbisb.fsf@gitster-ct.c.googlers.com>

On Fri, Jan 31, 2020 at 10:08:36AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > fetches anything at all). I.e., for "diff -M -p", you'd want:
> >
> >   1. diffcore_std() sees "-p" and fetches everything
> >
> >   2. diffcore_rename() sees there's nothing we don't already have
> >
> > rather than:
> >
> >   1. diffcore_rename() fetches a few blobs to do rename detection
> >
> >   2. diffcore_std() fetches a few more blobs that weren't rename
> >      candidates, but we need for "-p"
> 
> Hmph, a pure rename only change will cause no blobs transferred with
> the latter (because there is no content change for "-p" to report,
> and the rename detection for R100 paths would be done at the object
> name level), but all blobs in filepairs (before rename matches A/D
> up) with the former, no?

Hmm, good point that an exact rename won't show the blobs. I don't think
there's a way that covers all cases in a single fetch at the granularity
of the functions I listed above. But we could if we break it down a bit:

  1. look for exact renames

  2. queue for fetching any inexact rename candidates leftover

  3. if we're going to show a diff that requires contents, then:

     3a. if marked as an exact rename/copy, don't queue (I guess this is
         really checking p->one->oid versus p->two->oid)

     3b. likewise, deletions with --irreversible-delete don't need queued

     3c. otherwise, queue for fetch

  4. fetch whatever was queued

  5. proceed with inexact rename detection, then showing the diffs

So that logic has to go in the middle of diffcore_rename(). And if we're
not doing renames, then the logic from (3) needs to get triggered by
diffcore_std(). So it probably needs to be hoisted out into a helper,
and made idempotent (it already should be, since after the first
prefetch we'd have all of the objects, but we might want a flag to avoid
unnecessarily checking again).

-Peff

      reply	other threads:[~2020-02-01 11:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-28 21:35 [RFC PATCH] diff: only prefetch for certain output formats Jonathan Tan
2020-01-29  5:09 ` Jeff King
2020-01-30  1:39   ` Jonathan Tan
2020-01-30  5:51     ` Jeff King
2020-01-30 23:20       ` Jonathan Tan
2020-01-31  0:14         ` Jeff King
2020-01-31 18:08           ` Junio C Hamano
2020-02-01 11:29             ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200201112920.GD1864948@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).