From: Junio C Hamano <gitster@pobox.com>
To: Phillip Wood <phillip.wood123@gmail.com>
Cc: Jeff King <peff@peff.net>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org, vdye@github.com, me@ttaylorr.com,
mjcheetham@outlook.com, Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 2/2] for-each-ref: add --count-matches option
Date: Wed, 28 Jun 2023 10:08:21 -0700
Message-ID: <xmqqzg4jcy62.fsf@gitster.g>
In-Reply-To: <776c3682-d2eb-d2d7-3ea8-4a7db8cd7842@gmail.com> (Phillip Wood's message of "Wed, 28 Jun 2023 14:12:38 +0100")
Phillip Wood <phillip.wood123@gmail.com> writes:
> So it seems most of the slowdown I was seeing yesterday was due to it
> looking up a loose object. I'm surprised repacking makes such a
> difference in a repository that only contains two objects.
If we compare what is done in packfile.c:packed_object_info() and
object-file.c:loose_object_info() when we are only interested in
finding out the object type, there aren't that many differences
in the set of system calls each codepath needs to make.
* The packfile codepath needs to open and mmap *.pack and *.idx,
binary search in the .idx for the object location, then read a
few bytes from .pack, before being able to decode the header to
find out the type.
* The loose object codepath needs to open and mmap the loose object
file, read a few bytes from there, before being able to decode
the header to find out the type. After that, it needs to munmap.
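To make "decode the header" concrete for the packfile case, here is a
minimal sketch of parsing the variable-length header that starts each
entry in a .pack file: the first byte carries a continuation bit, a
3-bit type and the low 4 bits of the size, and each following byte adds
7 more size bits. The function and enum names are made up for
illustration and are not git's own; for the two delta types, finding
the real object type additionally requires looking at the delta base,
which this sketch does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type codes stored in a pack entry header (5 is reserved). */
enum pack_obj_type {
	PACK_OBJ_COMMIT = 1,
	PACK_OBJ_TREE = 2,
	PACK_OBJ_BLOB = 3,
	PACK_OBJ_TAG = 4,
	PACK_OBJ_OFS_DELTA = 6,
	PACK_OBJ_REF_DELTA = 7
};

/*
 * Hypothetical helper: decode the entry header found at buf.
 * Stores the type code and the inflated size, and returns the number
 * of header bytes consumed, or 0 if the buffer is truncated or the
 * size does not fit in 64 bits.
 */
size_t decode_pack_entry_header(const unsigned char *buf, size_t len,
				int *type, uint64_t *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned char c;

	assert(buf && type && size);
	if (len == 0)
		return 0;

	c = buf[used++];
	*type = (c >> 4) & 7;	/* bits 6..4: object type */
	*size = c & 0x0f;	/* bits 3..0: low bits of the size */

	while (c & 0x80) {	/* bit 7: another size byte follows */
		if (used >= len || shift > 57)
			return 0;
		c = buf[used++];
		*size |= (uint64_t)(c & 0x7f) << shift;
		shift += 7;
	}
	return used;
}
```

Only these first one or more bytes of the entry are needed to learn the
type of a non-delta object, which is why the per-object work after the
.idx lookup is so small.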
The cost of open/mmap for the packfile codepath amortises very well
over the number of objects (hence the number of refs). If there are
many refs that point at the same object, the object cache layer will
kick in to avoid disk access for the second and subsequent accesses to
the same object, but it helps both codepaths equally, so there should
not be much difference either way.
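The amortisation argument can be put into a rough back-of-the-envelope
model. The sketch below only tallies the open/mmap/munmap calls named
above; it assumes a single pack, ignores the per-object reads and the
.idx binary search, and uses hypothetical names throughout. It is not
measured from git and not derived from its code.

```c
#include <assert.h>

/* Hypothetical tally of setup/teardown calls, for illustration only. */
struct call_tally {
	unsigned long opens;
	unsigned long mmaps;
	unsigned long munmaps;
};

/*
 * Packed case: *.pack and *.idx are opened and mapped once, and the
 * mappings are reused for every object looked up afterwards.
 */
struct call_tally packed_tally(unsigned long n_distinct_objects)
{
	struct call_tally t = { 0, 0, 0 };

	if (n_distinct_objects) {
		t.opens = 2;
		t.mmaps = 2;
	}
	return t;
}

/*
 * Loose case: every distinct object has its own file, which is
 * opened, mapped and unmapped again.
 */
struct call_tally loose_tally(unsigned long n_distinct_objects)
{
	struct call_tally t = {
		n_distinct_objects,
		n_distinct_objects,
		n_distinct_objects
	};
	return t;
}

/* Sum of all calls in a tally. */
unsigned long total_calls(const struct call_tally *t)
{
	assert(t);
	return t->opens + t->mmaps + t->munmaps;
}
```

Under these assumptions, 1000 distinct objects cost 4 such calls when
packed and 3000 when loose, while with only one or two objects the two
totals are of the same small order.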
Thanks for an interesting piece of food for thought.