From: Nicolas Pitre <nico@cam.org>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <gitster@pobox.com>,
Junio C Hamano <gitster@pobox.com>,
git@vger.kernel.org, Stephan Hennig <mailing_list@arcor.de>,
Andreas Ericsson <ae@op5.se>
Subject: Re: [PATCH 4/4] index-pack: Honor core.deltaBaseCacheLimit when resolving deltas
Date: Mon, 14 Jul 2008 23:05:24 -0400 (EDT) [thread overview]
Message-ID: <alpine.LFD.1.10.0807142255240.12484@xanadu.home> (raw)
In-Reply-To: <1216001267-33235-5-git-send-email-spearce@spearce.org>
On Sun, 13 Jul 2008, Shawn O. Pearce wrote:
> If we are trying to resolve deltas for a long delta chain composed
> of multi-megabyte objects we can easily run into requiring 500M+
> of memory to hold each object in the chain on the call stack while
> we recurse into the dependent objects and resolve them.
>
> We now use a simple delta cache that discards objects near the
> bottom of the call stack first, as they are the most least recently
> used objects in this current delta chain. If we recurse out of a
> chain we may find the base object is no longer available, as it was
> free'd to keep memory under the deltaBaseCacheLimit. In such cases
> we must unpack the base object again, which will require recursing
> back to the root of the top of the delta chain as we released that
> root first.
>
> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
OK now I see what the 'base' pointer I previously dismissed is really
needed for.
But this patch is suboptimal as it actually recreate the same memory
pressure, to a lesser degree, this series is meant to solve. If you do:
> + struct object_entry *obj = c->obj;
> + void *raw = get_data_from_pack(obj);
> + if (obj->type == OBJ_REF_DELTA || obj->type == OBJ_OFS_DELTA) {
> + c->data = patch_delta(
> + get_base_data(c->base), c->base->size,
> + raw, obj->size,
> + &c->size);
> + free(raw);
What you actually do is to read the delta data in memory, then recurse
down to read more delta data, then recurse down to read the base which
might be yet more delta data in memory, etc. etc. Only when you reach
the bottom of the stack will you start resolving all those deltas in
memory. Instead, the check for a delta object should be done first, and
if so then recursion for the base object be performed _before_ reading
the currently wanted object data. This way you won't have more than one
delta buffer at any time in memory.
Nicolas
next prev parent reply other threads:[~2008-07-15 3:06 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-10 14:40 git pull is slow Stephan Hennig
2008-07-10 15:13 ` Martin Langhoff
2008-07-10 15:28 ` Petr Baudis
2008-07-10 15:30 ` Johannes Sixt
2008-07-10 15:45 ` Stephan Hennig
2008-07-10 15:50 ` Petr Baudis
2008-07-10 17:44 ` Stephan Hennig
2008-07-11 12:25 ` Stephan Hennig
2008-07-11 13:34 ` Andreas Ericsson
2008-07-11 14:04 ` Johannes Schindelin
2008-07-12 12:32 ` Stephan Hennig
2008-07-12 17:05 ` Johannes Schindelin
2008-07-13 1:15 ` Shawn O. Pearce
2008-07-13 13:59 ` Johannes Schindelin
2008-07-13 22:11 ` Shawn O. Pearce
2008-07-14 2:07 ` [PATCH 0/4] Honor core.deltaBaseCacheLimit during index-pack Shawn O. Pearce
2008-07-14 2:27 ` Nicolas Pitre
2008-07-14 3:12 ` Shawn O. Pearce
2008-07-14 11:44 ` Johannes Schindelin
2008-07-14 11:54 ` Jakub Narebski
2008-07-14 12:10 ` Johannes Schindelin
2008-07-14 12:16 ` Andreas Ericsson
2008-07-14 12:25 ` Johannes Schindelin
2008-07-14 12:51 ` Andreas Ericsson
2008-07-14 12:58 ` Johannes Schindelin
2008-07-15 2:21 ` Nicolas Pitre
2008-07-15 2:47 ` Shawn O. Pearce
2008-07-15 3:06 ` Nicolas Pitre
2008-07-17 16:06 ` Stephan Hennig
2008-07-17 16:25 ` Nicolas Pitre
2008-07-17 21:35 ` Shawn O. Pearce
2008-07-17 22:02 ` [RFC PATCH] index-pack: Issue a warning if deltaBaseCacheLimit is too small Shawn O. Pearce
2008-07-17 23:45 ` Nicolas Pitre
2008-07-15 4:19 ` [PATCH 0/4] Honor core.deltaBaseCacheLimit during index-pack Shawn O. Pearce
2008-07-14 2:07 ` [PATCH 1/4] index-pack: Refactor base arguments of resolve_delta into a struct Shawn O. Pearce
2008-07-15 2:40 ` Nicolas Pitre
2008-07-14 2:07 ` [PATCH 2/4] index-pack: Chain the struct base_data on the stack for traversal Shawn O. Pearce
2008-07-15 2:48 ` Nicolas Pitre
2008-07-14 2:07 ` [PATCH 3/4] index-pack: Track the object_entry that creates each base_data Shawn O. Pearce
2008-07-14 10:15 ` Johannes Schindelin
2008-07-15 2:50 ` Nicolas Pitre
2008-07-15 3:20 ` Shawn O. Pearce
2008-07-15 3:42 ` Nicolas Pitre
2008-07-14 2:07 ` [PATCH 4/4] index-pack: Honor core.deltaBaseCacheLimit when resolving deltas Shawn O. Pearce
2008-07-15 3:05 ` Nicolas Pitre [this message]
2008-07-15 3:18 ` Shawn O. Pearce
2008-07-15 4:45 ` [PATCH v2] " Shawn O. Pearce
2008-07-15 5:05 ` Nicolas Pitre
2008-07-15 18:48 ` Junio C Hamano
2008-07-13 9:01 ` git pull is slow Stephan Hennig
2008-07-11 12:55 ` Stephan Hennig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.1.10.0807142255240.12484@xanadu.home \
--to=nico@cam.org \
--cc=ae@op5.se \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mailing_list@arcor.de \
--cc=spearce@spearce.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).