From: Kirill Smelkov <kirr@landau.phys.spbu.ru>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
git@vger.kernel.org, Axel Bonnet <axel.bonnet@ensimag.imag.fr>,
Cl??ment Poulain <clement.poulain@ensimag.imag.fr>,
Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Subject: Re: [PATCH 2/2] fill_textconv(): Don't get/put cache if sha1 is not valid
Date: Sat, 18 Dec 2010 23:55:15 +0300 [thread overview]
Message-ID: <20101218205514.GA21249@landau.phys.spbu.ru> (raw)
In-Reply-To: <20101218161337.GB18643@sigill.intra.peff.net>
On Sat, Dec 18, 2010 at 11:13:37AM -0500, Jeff King wrote:
> On Sat, Dec 18, 2010 at 05:54:12PM +0300, Kirill Smelkov wrote:
>
> > It turned out, under blame there are requests to fill_textconv() with
> > sha1=0000000000000000000000000000000000000000 and sha1_valid=0.
> >
> > As the code did not analyzed sha1 validity, we ended up putting 000000
> > into textconv cache which was fooling later blames to discover lots of
> > lines in 'Not Yet Committed' state.
> > [...]
> > - if (driver->textconv_cache) {
> > + if (driver->textconv_cache && df->sha1_valid) {
> > *outbuf = notes_cache_get(driver->textconv_cache, df->sha1,
> > &size);
>
> In short:
>
> Acked-by: Jeff King <peff@peff.net>
>
> But it took some thinking to convince myself, so the long answer is
> below if anyone cares.
>
> I was dubious at first that this could be the right solution. We still
> end up putting the filespec through run_textconv, which didn't seem
> right if it is not valid.
>
> But reading into it more, there are two levels of invalidity:
>
> 1. !DIFF_FILE_VALID(df) - we are not a valid file at all. I.e., we are
> /dev/null.
>
> 2. !df->sha1_valid - we are pointing to a working tree file whose sha1
> we don't know
>
> I think level (2) never happens at all in the regular diff code, which
> is why this case was completely unhandled. But it is OK in that case
> (required, even) to put the contents through run_textconv.
>
> In theory we could actually calculate the sha1 in case (2) and cache
> under that, but I don't know how much it would buy us in practice. It
> saves us running the textconv filter at the expense of computing the
> sha1. Which is probably a win for most filters, but on the other hand,
> it is the wrong place to compute such a sha1. If it is a working tree
> file, we should ideally update our stat info in the index so that the
> info can be reused.
Jeff,
Thanks for your ACK and for the explanation.
My last patches to git were blame related so semi-intuitively I knew
that invalid sha1 are coming from files in worktree. Your description
makes things much more clear and I'd put it into patch log as well.
What is the best practice for this? For me to re-roll, or for Junio to
merge texts?
Also experimenting shows that, as you say, regular diff does not have this
problem, and also that diff calculates sha1 for files in worktree and so
their textconv results are cached. So clearly, there are some behaviour
differences between diff and blame.
Thanks again,
Kirill
next prev parent reply other threads:[~2010-12-18 20:55 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-12-18 14:54 [PATCH 1/2] t/t8006: Demonstrate blame is broken when cachetextconv is on Kirill Smelkov
2010-12-18 14:54 ` [PATCH 2/2] fill_textconv(): Don't get/put cache if sha1 is not valid Kirill Smelkov
2010-12-18 16:13 ` Jeff King
2010-12-18 20:55 ` Kirill Smelkov [this message]
2010-12-19 3:23 ` Junio C Hamano
2010-12-19 12:10 ` Kirill Smelkov
2010-12-20 2:41 ` Junio C Hamano
2010-12-20 4:46 ` Jeff King
2010-12-20 19:28 ` Kirill Smelkov
2010-12-20 2:26 ` Junio C Hamano
2010-12-20 4:42 ` Jeff King
2010-12-20 8:42 ` Junio C Hamano
2010-12-20 2:32 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20101218205514.GA21249@landau.phys.spbu.ru \
--to=kirr@landau.phys.spbu.ru \
--cc=axel.bonnet@ensimag.imag.fr \
--cc=clement.poulain@ensimag.imag.fr \
--cc=diane.gasselin@ensimag.imag.fr \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).