From: Thomas Rast <trast@inf.ethz.ch>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] pack-objects: no crc check when the cached version is used
Date: Fri, 13 Sep 2013 23:26:21 +0200
Message-ID: <87k3ikct1e.fsf@inf.ethz.ch>
In-Reply-To: <xmqq7gekk24q.fsf@gitster.dls.corp.google.com> (Junio C. Hamano's message of "Fri, 13 Sep 2013 11:28:05 -0700")

Junio C Hamano <gitster@pobox.com> writes:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> Current code makes pack-objects always do check_pack_crc() in
>> unpack_entry() even if right after that we find out there's a cached
>> version and pack access is not needed. Swap two code blocks, search
>> for cached version first, then check crc.
[...]
>
> Interesting.
>
> This is only triggered inside pack-objects, which would read a lot
> of data from existing packs, and the overhead for looking up the
> entry from the revindex, faulting in the actual packdata, and
> computing and comparing the crc would not be trivial, especially as
> the cost is incurred over many objects we need to untangle in the
> delta chain. If you have interesting numbers to show how much this
> improves the performance, I am curious to see it.

I can't see anything wrong with the patch, but then I haven't stared too
hard.  (It seems that my conversion around abe601b (sha1_file: remove
recursion in unpack_entry, 2013-03-27) was faithful on this point, so the
problem has existed for longer than that.)

I tried the perf script below, but at least for the git repo the only
thing I can see is noise.
--- 8< --- t/perf/p5300-pack-object.sh --- 8< ---
#!/bin/sh

test_description="Tests object packing performance"

. ./perf-lib.sh

test_perf_default_repo

test_perf 'pack-objects on commits in HEAD' '
	git rev-list HEAD |
	git pack-objects --stdout >/dev/null
'

test_perf 'pack-objects on all of HEAD' '
	git rev-list --objects HEAD |
	git pack-objects --stdout >/dev/null
'

test_done