git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: David Kastrup <dak@gnu.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2] Bump core.deltaBaseCacheLimit to 128MiB
Date: Wed, 19 Mar 2014 15:11:22 -0700
Message-ID: <xmqqlhw5260l.fsf@gitster.dls.corp.google.com>
In-Reply-To: <87ob11g9st.fsf@fencepost.gnu.org> (David Kastrup's message of "Wed, 19 Mar 2014 22:25:54 +0100")

David Kastrup <dak@gnu.org> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> David Kastrup <dak@gnu.org> writes:
>>
>>> The default of 16MiB causes serious thrashing for large delta chains
>>> combined with large files.
>>>
>>> Signed-off-by: David Kastrup <dak@gnu.org>
>>> ---
>>
>> Is that a good argument?  Wouldn't the default of 128MiB burden
>> smaller machines with bloated processes?
>
> The default file size before Git forgets about delta compression is
> 512MiB.  Unpacking 500MiB files with 16MiB of delta storage is going to
> be uglier.
>
> ...
>
> Documentation/config.txt states:
>
>     core.deltaBaseCacheLimit::
>             Maximum number of bytes to reserve for caching base objects
>             that may be referenced by multiple deltified objects.  By storing the
>             entire decompressed base objects in a cache Git is able
>             to avoid unpacking and decompressing frequently used base
>             objects multiple times.
>     +
>     Default is 16 MiB on all platforms.  This should be reasonable
>     for all users/operating systems, except on the largest projects.
>     You probably do not need to adjust this value.
>
> I've seen this seriously screwing performance in several projects of
> mine that don't really count as "largest projects".
>
> So the description in combination with the current setting is clearly wrong.

That is good material for the proposed log message, and I think you
are onto something here.

I know that the 512MiB default for bigFileThreshold (aka
"forget about delta compression") came out of thin air.  It was just
"1GB is always too huge for anybody, so let's cut it in half and
declare that value the initial version of a sane threshold",
nothing more.

So it might be that the problem is that 512MiB is still too big
relative to the 16MiB of delta base cache, and the former may be what
needs to be tweaked.  If a blob close to but below 512MiB is a problem
for a 16MiB delta base cache, it would still be big enough to cause
the same problem for a 128MiB delta base cache---it would evict all
the other objects and then end up not being able to fit in the limit
itself, busting the limit immediately, no?
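
The eviction argument above can be illustrated with a toy model.  This
is only a sketch under stated assumptions, not Git's actual code: the
class ToyDeltaBaseCache, the helper simulate(), the eight 1MiB "small"
entries and the 500MiB blob size are all hypothetical, and the cache is
assumed to be a plain byte-limited LRU that keeps a new entry even when
that entry alone exceeds the limit.

```python
from collections import OrderedDict

MiB = 1024 * 1024


class ToyDeltaBaseCache:
    """Byte-limited LRU cache; a toy model, not Git's implementation."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self.entries = OrderedDict()  # name -> size, oldest first

    def add(self, name, size):
        # Account for the new entry first, then evict the least recently
        # used entries until the total fits or nothing else is left.
        self.used += size
        while self.used > self.limit and self.entries:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        # Assumption: the new entry is kept even if it alone is over the limit.
        self.entries[name] = size


def simulate(limit, blob_size):
    """Cache eight 1MiB bases, then one big blob; report what survives."""
    cache = ToyDeltaBaseCache(limit)
    for i in range(8):
        cache.add(f"small-{i}", 1 * MiB)
    cache.add("big-blob", blob_size)
    return list(cache.entries), cache.used > cache.limit


for limit in (16 * MiB, 128 * MiB):
    survivors, over_limit = simulate(limit, 500 * MiB)
    print(limit // MiB, survivors, over_limit)
```

In this model both limits end up in the same state: only the big blob
remains cached and the limit is exceeded.  A blob of, say, 100MiB would
behave differently under the two limits, which is the case where the
larger cache helps.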

I would understand if the change were to update the definition of
deltaBaseCacheLimit and link it to the value of bigFileThreshold,
for example.  With the presented discussion, I am still not sure if
we can say that bumping deltaBaseCacheLimit is the right solution to
the "description with the current setting is clearly wrong" (which
is a real issue).

Thread overview: 14+ messages
2014-03-19 12:38 [PATCH v2] Bump core.deltaBaseCacheLimit to 128MiB David Kastrup
2014-03-19 21:09 ` Junio C Hamano
2014-03-19 21:25   ` David Kastrup
2014-03-19 22:11     ` Junio C Hamano [this message]
2014-03-20  1:38       ` Duy Nguyen
2014-03-20 17:02         ` Junio C Hamano
2014-03-20 17:08           ` David Kastrup
2014-03-20 22:35             ` Junio C Hamano
2014-03-21  6:04               ` David Kastrup
2014-03-21  7:59                 ` Duy Nguyen
2014-03-21  8:02       ` David Kastrup
2014-03-20 23:48 ` Jeff King
2014-03-21  6:12   ` David Kastrup
2014-03-21  8:11   ` David Kastrup
