From: David Kastrup <dak@gnu.org>
To: Jeff King <peff@peff.net>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 4/4] gc --aggressive: three phase repacking
Date: Tue, 18 Mar 2014 07:19:43 +0100
Message-ID: <87lhw8vxj4.fsf@fencepost.gnu.org>
In-Reply-To: <20140318050727.GA14769@sigill.intra.peff.net> (Jeff King's message of "Tue, 18 Mar 2014 01:07:27 -0400")

Jeff King <peff@peff.net> writes:

> On Tue, Mar 18, 2014 at 12:50:50AM -0400, Jeff King wrote:
>
>> On Sun, Mar 16, 2014 at 08:35:04PM +0700, Nguyễn Thái Ngọc Duy wrote:
>>
>> > As explained in the previous commit, current aggressive settings
>> > --depth=250 --window=250 could slow down repository access
>> > significantly. Notice that people usually work on recent history only,
>> > we could keep recent history more loosely packed, so that repo access
>> > is fast most of the time while the pack file remains small.
>>
>> One thing I have not seen is real-world timings showing the slowdown
>> based on --depth. Did I miss them, or are we just making assumptions
>> based on one old case from 2009 (that, AFAIK does not have real numbers,
>> just speculation)? Has anyone measured the effect of bumping the delta
>> cache size (and its hash implementation)?
>
> Just as a very quick, rough data point, here are before-and-after
> timings for the patch below doing "git rev-list --objects --all" on my
> linux.git, which is a mix of "--aggressive" and normal packing (I didn't
> do a "repack -f", but it's partially what I've downloaded from k.org and
> what I've repacked in various experiments over the past few months).
>
> [before]
> real 0m28.824s
> user 0m28.620s
> sys 0m0.232s
>
> [after]
> real 0m21.694s
> user 0m21.544s
> sys 0m0.172s
>
> The numbers below are completely pulled out of a hat, so we can perhaps
> do even better. But I think it shows that there is room for improvement
> in the delta base cache.
>
> ---
> diff --git a/environment.c b/environment.c
> index c3c8606..73ed670 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -37,7 +37,7 @@ int core_compression_seen;
> int fsync_object_files;
> size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
> size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
> -size_t delta_base_cache_limit = 16 * 1024 * 1024;
> +size_t delta_base_cache_limit = 128 * 1024 * 1024;

You need to change a file in Documentation as well. Can offer a patch.

> unsigned long big_file_threshold = 512 * 1024 * 1024;
> const char *pager_program;
> int pager_use_color = 1;
> diff --git a/sha1_file.c b/sha1_file.c
> index b37c6f6..a9ab8e3 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1944,7 +1944,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
> return buffer;
> }
>
> -#define MAX_DELTA_CACHE (256)
> +#define MAX_DELTA_CACHE (1024)

This one really needs experimentation. I found that increases here lead
to performance degradation rather soon, probably because of decreased
memory locality without significant reduction in cache collisions. Not
sure whether it's worth touching at all.

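
To make the collision side of that trade-off concrete, here is a minimal
sketch of a direct-mapped table in which every key has exactly one slot.
It is not the code in sha1_file.c: struct slot, slot_for() and
count_evictions() are hypothetical names, and the hash mixing is an
assumption chosen only for illustration. It counts collisions only; the
memory-locality cost of a larger table still has to be timed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One cache slot; a real cache would also hold the unpacked object data. */
struct slot {
	int used;
	unsigned long offset;
};

/*
 * Hypothetical slot hash: mix a pack identifier with an object offset,
 * then reduce modulo the number of slots. The mixing steps are an
 * assumption for illustration only.
 */
static size_t slot_for(unsigned long pack_id, unsigned long offset, size_t nslots)
{
	unsigned long h = pack_id + offset;

	assert(nslots > 0);
	h += (h >> 8) + (h >> 16);
	return (size_t)(h % nslots);
}

/*
 * Replay a sequence of lookups through a direct-mapped table and count
 * how often an occupied slot is overwritten by a different key.
 * Returns (size_t)-1 if the table cannot be allocated.
 */
static size_t count_evictions(const unsigned long *offsets, size_t n, size_t nslots)
{
	struct slot *table;
	size_t i, evictions = 0;

	assert(nslots > 0);
	table = calloc(nslots, sizeof(*table));
	if (!table)
		return (size_t)-1;
	for (i = 0; i < n; i++) {
		struct slot *s = &table[slot_for(0, offsets[i], nslots)];

		if (s->used && s->offset == offsets[i])
			continue; /* hit: the key is already cached */
		if (s->used)
			evictions++; /* collision: a different key is thrown out */
		s->used = 1;
		s->offset = offsets[i];
	}
	free(table);
	return evictions;
}
```

Feeding the same recorded offset trace to count_evictions() with 256 and
then with 1024 slots would show how many collisions the larger table
actually saves. Whether that saving outweighs the bigger, less
cache-friendly table is the part that needs real timings.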
--
David Kastrup