From: Nicolas Pitre <nico@cam.org>
To: Dana How <danahow@gmail.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [PATCH 09/13] drop objects larger than --blob-limit if specified
Date: Thu, 05 Apr 2007 23:20:44 -0400 (EDT)
Message-ID: <alpine.LFD.0.98.0704052257200.28181@xanadu.home>
In-Reply-To: <56b7f5510704051919v7daac590m6ac52c4fcabd5321@mail.gmail.com>
On Thu, 5 Apr 2007, Dana How wrote:
> On 4/5/07, Nicolas Pitre <nico@cam.org> wrote:
> > I still consider this feature to make no sense.
>
> Well, suppose I'm packing my 55GB of data into 2GB
> packfiles. There seemed to be some agreement that
> limiting packfile size was useful. 700MB is another example.
>
> Now, suppose there is an object whose packing would
> result in a packfile larger than the limit. What should we do?
You error out.
> (1) Refuse to run. This solution means I can't pack my repository.
Exactly. If you want packs not to be larger than 10MB and you have a
100MB blob then you are screwed. Just lift your pack size limit in such
a case.
> (2) Pack the object anyway and let the packfile size exceed
> my specification. Ignoring a clear preference from the user
> doesn't seem good.
Indeed, it is not.
> (3) Pack the object by itself in its own pack. This is better than the
> previous since I haven't wrapped up any small object in a pack
> whose size I don't want to deal with. The resulting pack is too big,
> but the original object was also too big so at least I haven't made
> the problem worse. But why bother wrapping the object so?
> I just made the list of packs to look through longer for every access,
> instead of leaving the big object in the objects/xx directories which
> are already used to handle exceptions (usually meaning more recent).
> In my 55GB example, I have 9 jumbo objects, and this solution
> would more than double the number of packs to step through.
> Having them randomly placed in 256 subdirectories seems better.
You forget about the case where those jumbo blobs could delta well
against each other. That means that one pack could possibly contain
those 9 objects because 8 of them are tiny deltas against the first big
one.
> (4) Just leave the jumbo object by itself, unpacked.
Hmmmmm.
> What do you think?
Let's say I wouldn't mind much if it was implemented differently. The
objects array is probably the biggest cost in terms of memory usage for
pack-objects. When you have 4 million objects like in the kde repo that
means each new field you add will cost between 4 and 16 MB of memory. I
think this is too big a cost for filtering out a couple big objects once
in a while.
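
To make the arithmetic above concrete, here is a minimal sketch. It
assumes the 4 to 16 MB range corresponds to a new per-object field of 1 to
4 bytes, and it ignores struct padding; extra_table_bytes() is a
hypothetical helper for illustration, not something found in git.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Extra memory needed when every entry of an in-memory object table
 * grows by field_bytes. Hypothetical helper, not git code; alignment
 * padding is ignored, so real growth may be larger.
 */
static uint64_t extra_table_bytes(uint64_t nr_objects, size_t field_bytes)
{
	return nr_objects * (uint64_t)field_bytes;
}

/* Convert a byte count to decimal megabytes (1 MB = 1,000,000 bytes). */
static double bytes_to_mb(uint64_t bytes)
{
	return (double)bytes / 1e6;
}
```

With 4,000,000 objects, a 1-byte field gives 4,000,000 extra bytes and a
4-byte field gives 16,000,000, which matches the range quoted above.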
Instead, I think you should apply the filtering in add_object_entry()
directly and simply skip adding the unwanted object to the list
altogether.
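
A rough sketch of that idea, under stated assumptions: the struct names,
the size_limit field, and add_entry_filtered() below are all hypothetical
and do not reflect the real signature of add_object_entry() or the layout
of the objects array in pack-objects. The point is only that the size check
happens before the entry is stored, so no per-object field is needed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a per-object entry. */
struct entry {
	unsigned long size;
};

/* Hypothetical growable list of entries with an optional size limit. */
struct entry_list {
	struct entry *items;
	size_t nr, alloc;
	unsigned long size_limit;	/* 0 means "no limit" */
};

/*
 * Add an object of the given size unless it exceeds the limit.
 * Returns 1 if added, 0 if skipped by the filter, -1 on allocation
 * failure. Skipped objects never occupy a slot in the list.
 */
static int add_entry_filtered(struct entry_list *list, unsigned long size)
{
	if (list->size_limit && size > list->size_limit)
		return 0;	/* filtered out: nothing is recorded */

	if (list->nr == list->alloc) {
		size_t new_alloc = list->alloc ? list->alloc * 2 : 16;
		struct entry *p = realloc(list->items, new_alloc * sizeof(*p));
		if (!p)
			return -1;
		list->items = p;
		list->alloc = new_alloc;
	}
	list->items[list->nr++].size = size;
	return 1;
}
```

A caller would zero-initialize a struct entry_list, set size_limit, and
call add_entry_filtered() once per candidate object; oversized objects are
simply never seen by the later packing stages.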
Nicolas