From: David Kastrup <dak@gnu.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] blame.c: don't drop origin blobs as eagerly
Date: Wed, 03 Apr 2019 13:08:30 +0200
Message-ID: <87ftqz5osx.fsf@fencepost.gnu.org>
In-Reply-To: <xmqqv9zvsfay.fsf@gitster-ct.c.googlers.com> (Junio C. Hamano's message of "Wed, 03 Apr 2019 16:45:09 +0900")

Junio C Hamano <gitster@pobox.com> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> When a parent blob already has chunks queued up for blaming, dropping
>> the blob at the end of one blame step will cause it to get reloaded
>> right away, doubling the amount of I/O and unpacking when processing a
>> linear history.
>>
>> Keeping such parent blobs in memory seems like a reasonable optimization
>> that should incur additional memory pressure mostly when processing the
>> merges from old branches.
>
> Thanks for finding an age-old one that dates back to 7c3c7962
> ("blame: drop blob data after passing blame to the parent",
> 2007-12-11).
>
> Interestingly, the said commit claims:
>
>     When passing blame from a parent to its parent (i.e. the
>     grandparent), the blob data for the parent may need to be read
>     again, but it should be relatively cheap, thanks to delta-base
>     cache.
>
> but perhaps you found a case where the delta-base cache is not all
> that effective in the benchmark?

The most relevant contribution is in a linear history, where the diff
between commit and parent is followed by the diff between parent and
grandparent.  It seems wasteful to recreate the blobs in this case.  Of
course, this is also the case where any nearby cache layers are more
likely to still be warm, so the savings may be less apparent.  They are
likely larger for deep delta chains in long histories, where the
delta-chain cache is more thoroughly exercised.
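
To make the linear-history argument concrete, here is a toy model that
only counts blob loads.  It is not taken from blame.c: the function and
parameter names are made up for illustration, it assumes every parent
still has chunks queued after each step, and it ignores the delta-base
cache entirely (which in practice makes a reload cheaper than a first
load).

```python
def count_blob_loads(history_len, keep_queued_parent):
    """Count blob loads while passing blame down a linear history.

    Commit i has commit i + 1 as its only parent.  Each blame step
    diffs a commit against its parent, so both blobs must be in memory.
    Hypothetical model: every parent is assumed to still have chunks
    queued for blaming when the step ends.
    """
    in_memory = set()
    loads = 0
    for commit in range(history_len - 1):
        parent = commit + 1
        for blob in (commit, parent):
            if blob not in in_memory:
                in_memory.add(blob)
                loads += 1
        # The commit itself is finished after this step.
        in_memory.discard(commit)
        # Eager policy: drop the parent blob too, although the next
        # step needs it again as the "commit" side of the diff.
        if not keep_queued_parent:
            in_memory.discard(parent)
    return loads


eager = count_blob_loads(5, keep_queued_parent=False)
kept = count_blob_loads(5, keep_queued_parent=True)
print(eager, kept)
```

Under these assumptions a history of n commits costs 2 * (n - 1) loads
with eager dropping and n loads when the queued parent is kept, which
is where the "doubling" for long linear histories comes from.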
--
David Kastrup