From: Thomas Rast <trast@inf.ethz.ch>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] pack-objects: no crc check when the cached version is used
Date: Fri, 13 Sep 2013 23:26:21 +0200
Message-ID: <87k3ikct1e.fsf@inf.ethz.ch>
In-Reply-To: <xmqq7gekk24q.fsf@gitster.dls.corp.google.com> (Junio C. Hamano's message of "Fri, 13 Sep 2013 11:28:05 -0700")

Junio C Hamano <gitster@pobox.com> writes:

> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> Current code makes pack-objects always do check_pack_crc() in
>> unpack_entry() even if right after that we find out there's a cached
>> version and pack access is not needed. Swap two code blocks, search
>> for cached version first, then check crc.
[...]
>
> Interesting.
>
> This is only triggered inside pack-objects, which would read a lot
> of data from existing packs, and the overhead for looking up the
> entry from the revindex, faulting in the actual packdata, and
> computing and comparing the crc would not be trivial, especially as
> the cost is incurred over many objects we need to untangle in the
> delta chain.  If you have interesting numbers to show how much this
> improves the performance, I am curious to see it.

I can't see anything wrong with the patch, but then I haven't stared too
hard.  (It seems that my conversion around abe601b (sha1_file: remove
recursion in unpack_entry, 2013-03-27) was faithful on this point; the
problem has existed for longer than that.)
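
For readers who want the reordering spelled out, here is a rough sketch of
the idea in the patch description: look in the cache first, and only pay
for the CRC check when the pack data really has to be read.  This is not
the code in sha1_file.c.  Every name below (toy_entry, toy_cache,
toy_check_crc, toy_unpack_entry) is made up for illustration, and the
"CRC" is a trivial stand-in rather than a real checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from git's source. */
#define TOY_CACHE_SLOTS 4

struct toy_entry {
	uint32_t offset;     /* position of the object in the pack */
	uint32_t stored_crc; /* checksum recorded for the entry */
	uint32_t payload;    /* stand-in for the object data */
};

struct toy_cache_slot {
	int valid;
	uint32_t offset;
	uint32_t payload;
};

struct toy_cache_slot toy_cache[TOY_CACHE_SLOTS];
unsigned toy_crc_checks; /* counts how often the costly path runs */

/* Stand-in for the expensive step: touch the data and verify it. */
int toy_check_crc(const struct toy_entry *e)
{
	toy_crc_checks++;
	return (e->payload ^ 0xffffffffu) == e->stored_crc;
}

/* Return the slot holding this offset, or NULL on a cache miss. */
const struct toy_cache_slot *toy_cache_lookup(uint32_t offset)
{
	const struct toy_cache_slot *slot = &toy_cache[offset % TOY_CACHE_SLOTS];

	return (slot->valid && slot->offset == offset) ? slot : NULL;
}

/* Remember an unpacked object, evicting whatever shared its slot. */
void toy_cache_add(uint32_t offset, uint32_t payload)
{
	struct toy_cache_slot *slot = &toy_cache[offset % TOY_CACHE_SLOTS];

	slot->valid = 1;
	slot->offset = offset;
	slot->payload = payload;
}

/*
 * Cache lookup first, CRC check second.
 * Returns 0 on success, -1 if the data fails the check.
 */
int toy_unpack_entry(const struct toy_entry *e, uint32_t *out)
{
	const struct toy_cache_slot *hit;

	assert(e != NULL && out != NULL);

	hit = toy_cache_lookup(e->offset);
	if (hit) {
		*out = hit->payload; /* no pack access, so no CRC check */
		return 0;
	}

	if (!toy_check_crc(e))
		return -1;

	toy_cache_add(e->offset, e->payload);
	*out = e->payload;
	return 0;
}
```

Calling toy_unpack_entry() twice on the same entry runs toy_check_crc()
only on the first call; the second is served from the cache.  With the
two blocks in the opposite order, every call would pay for the check
even when the result is then thrown away.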

I tried the perf script below, but at least for the git repo the only
thing I can see is noise.

--- 8< --- t/perf/p5300-pack-object.sh --- 8< ---
#!/bin/sh

test_description="Tests object packing performance"

. ./perf-lib.sh

test_perf_default_repo

test_perf 'pack-objects on commits in HEAD' '
	git rev-list HEAD |
	git pack-objects --stdout >/dev/null
'

test_perf 'pack-objects on all of HEAD' '
	git rev-list --objects HEAD |
	git pack-objects --stdout >/dev/null
'

test_done
