From: Thomas Rast <trast@inf.ethz.ch>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] pack-objects: no crc check when the cached version is used
Date: Fri, 13 Sep 2013 23:26:21 +0200
Message-ID: <87k3ikct1e.fsf@inf.ethz.ch>
In-Reply-To: <xmqq7gekk24q.fsf@gitster.dls.corp.google.com> (Junio C. Hamano's message of "Fri, 13 Sep 2013 11:28:05 -0700")

Junio C Hamano <gitster@pobox.com> writes:

> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> Current code makes pack-objects always do check_pack_crc() in
>> unpack_entry() even if right after that we find out there's a cached
>> version and pack access is not needed. Swap two code blocks, search
>> for cached version first, then check crc.
[...]
>
> Interesting.
>
> This is only triggered inside pack-objects, which would read a lot
> of data from existing packs, and the overhead for looking up the
> entry from the revindex, faulting in the actual packdata, and
> computing and comparing the crc would not be trivial, especially as
> the cost is incurred over many objects we need to untangle in the
> delta chain. If you have interesting numbers to show how much this
> improves the performance, I am curious to see it.

I can't see anything wrong with the patch, but then I haven't stared too
hard.  (It seems that my conversion around abe601b (sha1_file: remove
recursion in unpack_entry, 2013-03-27) was faithful on this point; the
problem has existed for longer than that.)

I tried the perf script below, but at least for the git repo the only
thing I can see is noise.

--- 8< --- t/perf/p5300-pack-object.sh --- 8< ---
#!/bin/sh

test_description="Tests object packing performance"
. ./perf-lib.sh

test_perf_default_repo

test_perf 'pack-objects on commits in HEAD' '
	git rev-list HEAD |
	git pack-objects --stdout >/dev/null
'

test_perf 'pack-objects on all of HEAD' '
	git rev-list --objects HEAD |
	git pack-objects --stdout >/dev/null
'

test_done