git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Simon Ruderich <simon@ruderich.org>,
	Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>,
	git <git@vger.kernel.org>
Subject: Re: [PATCH v2 1/3] diffcore-pickaxe: remove unnecessary call to get_textconv()
Date: Thu, 4 Apr 2013 17:11:10 -0400	[thread overview]
Message-ID: <20130404211110.GB25811@sigill.intra.peff.net> (raw)
In-Reply-To: <7vvc82ggy3.fsf@alter.siamese.dyndns.org>

On Thu, Apr 04, 2013 at 01:48:52PM -0700, Junio C Hamano wrote:

> If I am reading the code correctly, it is has_changes(), which is
> used for "log -S" (not "log -G" that uses diff_grep()), that does
> the unnecessary get_textconv() unconditionally.  The way diff_grep() 
> divides the work to make fill_one() responsible for filling the
> textconv as necessary is internally consistent, and there is no
> unnecessary call.
> 
> Perhaps...
> 
> 	The fill_one() function is responsible for finding and
> 	filling the textconv filter as necessary, and is called by
> 	diff_grep() function that implements "git log -G<pattern>".
> 
> 	The has_changes() function calls get_textconv() for two
> 	sides being compared, before it checks to see if it was
> 	asked to perform the pickaxe limiting with the -S option.
> 	Move the code around to avoid this wastage.  After that,
> 	fill_one() is called to use the textconv.
> 
> 	By adding get_textconv() to diff_grep() and relieving
> 	fill_one() of responsibility to find the textconv filter, we
> 	can avoid calling get_textconv() twice.
> 
> Explained that way, it makes me wonder why we cannot fix it the
> other way around, that is, not fetching textconv in has_changes()
> and instead letting fill_one() to find textconv as needed.
> 
> The answer is because has_changes() itself looks at the textconv.
> 
> But we have to wonder why it is so.  diff_grep() short-circuits when
> the two sides are identical and has_changes() has a similar but
> different logic to check if the identical two sides are processed
> with the same textconv filter before saying this filepair is
> uninteresting.
> 
> Shouldn't that logic be unified as well?

I think it would be OK to perform the same optimization in diff_grep; I
do not recall offhand why I didn't do so when touching this code last
time, so it may have just been because I didn't think of it.

But I do not think fill_one is the right interface for it. The reason
has_changes calls get_textconv separately is that we do not want to fill
the buffer (which may be expensive) if we can avoid it. So the correct
sequence is:

  1. Find out textconv_one.

  2. Find out textconv_two.

  3. Check "!hashcmp(one->sha1, two->sha1) && textconv_one == textconv_two";
     if true, then we know the content we are about to compare will be
     identical, and we can return early.

  4. Otherwise, retrieve the content for one (respecting textconv_one).

  5. Retrieve the content for two (respecting textconv_two).

You cannot implement the optimization looking only at one side
(obviously), but you also need to split the textconv lookup from the
"fill" procedure if you want the optimization to come at the right
place.

If you turned fill_one into "fill_both_sides" then you could share
the code between diff_grep/pickaxe and do it in the right order in the
helper.

-Peff

  parent reply	other threads:[~2013-04-04 21:11 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-04  8:34 [BUG] git log -S not respecting --no-textconv Matthieu Moy
2013-04-04 16:03 ` [PATCH 1/2] diffcore-pickaxe: respect --no-textconv Simon Ruderich
2013-04-04 17:03   ` Matthieu Moy
2013-04-04 17:43   ` Jeff King
2013-04-04 17:45   ` Junio C Hamano
2013-04-04 17:51     ` Jeff King
2013-04-04 18:10       ` Junio C Hamano
2013-04-04 20:20         ` [PATCH v2 1/3] diffcore-pickaxe: remove unnecessary call to get_textconv() Simon Ruderich
2013-04-04 20:48           ` Junio C Hamano
2013-04-04 21:10             ` Junio C Hamano
2013-04-04 21:11             ` Jeff King [this message]
2013-04-04 22:51               ` Junio C Hamano
2013-04-05 13:20             ` Simon Ruderich
2013-04-04 20:21         ` [PATCH v2 2/3] diffcore-pickaxe: remove fill_one() Simon Ruderich
2013-04-05  0:08           ` Jeff King
2013-04-05  4:43             ` Junio C Hamano
2013-04-05  4:45               ` [PATCH 1/3] diffcore-pickaxe: port optimization from has_changes() to diff_grep() Junio C Hamano
2013-04-05  4:45                 ` [PATCH 2/3] diffcore-pickaxe: fix leaks in "log -S<block>" and "log -G<pattern>" Junio C Hamano
2013-04-05  4:45                 ` [PATCH 3/3] diffcore-pickaxe: unify setup and teardown code between log -S/-G Junio C Hamano
2013-04-05  5:28                   ` Jeff King
2013-04-05  5:43                     ` Junio C Hamano
2013-04-05  5:45                       ` Jeff King
2013-04-05 16:44                         ` Junio C Hamano
2013-04-04 20:21         ` [PATCH v2 3/3] diffcore-pickaxe: respect --no-textconv Simon Ruderich
2013-04-05  7:40           ` Matthieu Moy
2013-04-05 13:16             ` [PATCH v3 " Simon Ruderich
2013-04-05 17:31               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130404211110.GB25811@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=Matthieu.Moy@grenoble-inp.fr \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=simon@ruderich.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).