From: "Dana How" <danahow@gmail.com>
To: "Linus Torvalds" <torvalds@linux-foundation.org>
Cc: "Nicolas Pitre" <nico@cam.org>, "Junio C Hamano" <junkio@cox.net>,
git@vger.kernel.org, danahow@gmail.com
Subject: Re: [PATCH 09/13] drop objects larger than --blob-limit if specified
Date: Fri, 6 Apr 2007 11:09:16 -0700 [thread overview]
Message-ID: <56b7f5510704061109n2878a221p391b7c3edba89c63@mail.gmail.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0704060845120.6730@woody.linux-foundation.org>
On 4/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Thu, 5 Apr 2007, Nicolas Pitre wrote:
> > > (2) Pack the object any way and let the packfile size exceed
> > > my specification. Ignoring a clear preference from the user
> > > doesn't seem good.
> > It is not indeed.
>
> Well, I think there is an easy solution.
>
> Just go back and say that when the user limits the size, it limits the
> offset at which objects can *start*.
Yes, this is what my original, unsplit patch did in order to support
the --pack-limit && --stdout combination, since lseek doesn't
always work on stdout.
> Not only is that the only thing that the index file itself cares about, it
> also means that
>
> - you don't ever need to undo anything at all (because you know the
> starting offset) when you begin packing a new object.
Yes, that was true.
> This should simplify your patch a lot.
It removes the calls to, and the additions of, sha1mark() and sha1undo()
in csum-file.[ch]. The changes to builtin-pack-objects.c are almost
the same -- an "if" moves from after the write_object() to before.
> - the object size issue just goes away. Sure, the pack-file limit looks a
> bit strange (it's no longer a hard limit on the *size* of the
> pack-file, just on the object offsets), but let's face it, we really
> really don't care.
>
> And in practice, by setting the pack-file limit to 2GB (or even 1GB), you
> never ever have to worry about the 32-bit filesystem limits any more,
> unless your repository is fundamentally so screwed that you simply
> *cannot* reporesent it well on something like FATFS (ie any object that is
> 2GB in size will probably have a blob that is even bigger, and FATFS
> already has problems with it).
>
> So in practice, just limiting the index offsets is what you really want
> anyway.
I agree with your arguments, but packs have different uses
and to me there are several differing (non-)reasons for --pack-limit.
* To split packs into smaller chunks when the .pack format is used as a
communications protocol. But the discussion has converged to
"we're not going to split packs at the transmitter". I agree with this
conclusion, since it would make the total transmission larger (loss
of deltas).
Since no .idx file is sent there is no requirement for --pack-limit here.
* To avoid offsets larger than 31 bits in .idx files. Your proposal,
and what I was doing for --pack-limit && --stdout, is sufficient to
address this.
* Avoiding (e.g.) 2GB+ files when none already exist in the repository --
either the filesystem doesn't support anything beyond the limit,
or we don't want to use a >31b off_t with mmap. (Perhaps
the latter case is completely avoided by some glibc 64b trickery,
but is that always true?) Only the write rollback approach can address this.
* Creating .pack files that fit on e.g. CDROM etc.
The 2nd and 3rd cases are what I'm thinking about,
which is why the first version of my patch did both.
Anyway, now that I understand Nicolas's issues with the patchset,
and realize I agree with his concerns, I'm going to let this percolate
a little bit until I've got the least number of added behaviors and
command line options. At the moment I'm leaning towards killing my
--blob-limit idea, keeping --pack-limit based on rollback but with the
additional twist that the first object written to a packfile is never
"rolled back" [meaning (a) everything is packed to make Nicolas happy,
(b) any illegally-sized pack has only one object and could be removed
to make me happy, and (c) my patchset has one less bug], and
adding --index-limit which is what normal people might use on a large
repository [and could be made the default later]. So I think what you
discuss is the most common use for limiting.
Thanks,
--
Dana L. How danahow@gmail.com +1 650 804 5991 cell
next prev parent reply other threads:[~2007-04-06 18:09 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-04-05 22:36 [PATCH 09/13] drop objects larger than --blob-limit if specified Dana How
[not found] ` <al pine.LFD.0.98.0704052103410.28181@xanadu.home>
[not found] ` <56b7f5510704051919v7daac590m 6ac52c4fcabd5321@mail.gmail.com>
2007-04-06 1:04 ` Nicolas Pitre
2007-04-06 2:19 ` Dana How
2007-04-06 3:20 ` Nicolas Pitre
2007-04-06 15:49 ` Linus Torvalds
[not found] ` <56b7f55 10704061109n2878a221p391b7c3edba89c63@mail.gmail.com>
2007-04-06 18:09 ` Dana How [this message]
2007-04-06 19:21 ` Nicolas Pitre
2007-04-06 19:24 ` Linus Torvalds
2007-04-06 22:33 ` Junio C Hamano
2007-04-08 1:53 ` David Lang
2007-04-06 20:12 ` Nicolas Pitre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56b7f5510704061109n2878a221p391b7c3edba89c63@mail.gmail.com \
--to=danahow@gmail.com \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
--cc=nico@cam.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).