From: "Shawn O. Pearce" <spearce@spearce.org>
To: Geert Bosch <bosch@adacore.com>
Cc: Nicolas Pitre <nico@cam.org>,
Troy Telford <ttelford.groups@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH] Support 64-bit indexes for pack files.
Date: Tue, 27 Feb 2007 11:11:22 -0500
Message-ID: <20070227161122.GE3230@spearce.org>
In-Reply-To: <5FE0C988-0DA8-4BFB-8F0C-42F97808E6F8@adacore.com>
Geert Bosch <bosch@adacore.com> wrote:
> When I import a large code-base (such as a *.tar.gz), I don't know
> beforehand how many objects I'm going to create. Ideally, I'd like
> to stream them directly into a new pack without ever having to write
> the expanded source to the filesystem.
See git-fast-import. If you are coming from a tar, also see
contrib/fast-import/import-tars.perl. :-)
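As a rough illustration of the idea (not a port of import-tars.perl), the sketch below reads a tar stream member by member and emits a git-fast-import command stream, so nothing is unpacked onto the filesystem. The function name, the default ref, the committer identity, and the commit message are all made up for the example. It assumes plain paths that need no quoting and ignores symlinks and directories.

```python
import io
import tarfile


def tar_to_fast_import(tar_stream, out, ref="refs/heads/import"):
    """Translate a tar stream into fast-import commands written to `out`.

    `tar_stream` and `out` are binary file objects. In real use `out`
    would typically be the stdin pipe of a `git fast-import` process.
    """
    entries = []  # (mode, mark, path) for the final commit
    next_mark = 0
    # "r|*" reads the archive strictly sequentially, so a pipe works too.
    with tarfile.open(fileobj=tar_stream, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # assumption: only regular files are imported
            data = tar.extractfile(member).read()
            next_mark += 1
            # One blob command per file; the mark lets the commit refer to it.
            out.write(b"blob\nmark :%d\ndata %d\n" % (next_mark, len(data)))
            out.write(data + b"\n")
            mode = b"100755" if member.mode & 0o111 else b"100644"
            entries.append((mode, next_mark, member.name.encode("utf-8")))

    message = b"Import tarball\n"  # hypothetical commit message
    out.write(b"commit %s\n" % ref.encode("utf-8"))
    # Hypothetical identity; the timestamp is "<seconds> <tz>" in raw format.
    out.write(b"committer Importer <importer@example.com> 0 +0000\n")
    out.write(b"data %d\n" % len(message))
    out.write(message)
    for mode, mark, path in entries:
        out.write(b"M %s :%d %s\n" % (mode, mark, path))
    out.write(b"\n")
```

Note that the importer never has to say up front how many objects it will send; fast-import deals with the pack header itself.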
> So for creating a large pack from a stream of data, you have to do
> the following:
> 1. write out a temporary pack file to disk without correct count
> 2. fix-up the count
> 3. read the entire temporary pack file to compute the final SHA-1
> 4. fix-up the SHA1 at the end of the file
> 5. construct and write out the index
Yes, this is exactly what git-fast-import does. Yes, it sort of
sucks. But it's not as bad as you think.
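To make steps 1 through 4 concrete, here is a minimal sketch (not fast-import's actual code) that streams blobs of unknown number into a version 2 pack, then goes back to patch the count and append the trailing SHA-1. The function names are invented for the example, only whole (non-delta) blobs are written, and step 5, the index, is left out.

```python
import hashlib
import io
import struct
import zlib

OBJ_BLOB = 3  # pack object type code for a blob


def encode_object_header(obj_type, size):
    """Encode the variable-length type/size header that precedes each object."""
    byte = (obj_type << 4) | (size & 0x0F)  # type in bits 6-4, low 4 size bits
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)  # high bit set: more size bytes follow
        byte = size & 0x7F
        size >>= 7
    out.append(byte)
    return bytes(out)


def write_pack_streaming(blobs, f):
    """Write blobs from an iterator of unknown length to seekable file `f`.

    `f` must be opened for binary reading and writing (e.g. "w+b").
    Returns (object_count, hex_sha1_of_pack).
    """
    # Step 1: write the pack with a placeholder count of 0.
    f.write(b"PACK" + struct.pack(">II", 2, 0))
    count = 0
    for data in blobs:
        f.write(encode_object_header(OBJ_BLOB, len(data)))
        f.write(zlib.compress(data))
        count += 1
    end = f.tell()

    # Step 2: fix up the count at byte offset 8.
    f.seek(8)
    f.write(struct.pack(">I", count))

    # Step 3: read the whole file back to compute the final SHA-1.
    f.seek(0)
    sha = hashlib.sha1()
    remaining = end
    while remaining:
        chunk = f.read(min(65536, remaining))
        if not chunk:
            break
        sha.update(chunk)
        remaining -= len(chunk)

    # Step 4: append the SHA-1 trailer.
    f.seek(end)
    f.write(sha.digest())
    return count, sha.hexdigest()
```

The only extra cost over a writer that knows the count in advance is the read-back in step 3, which is one sequential pass over data that was just written.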
> There are a few ways to fix this:
> - Have a count of 0xffffffff mean: look in the index for the count.
> Pulling/pushing would still use regular counted pack files.
> - Have the pack file checksum be the SHA1 of (the count followed
> by the SHA1 of the compressed data of each object). This would
> allow step 3 to be done without reading back all data.
I don't think it is worth it. Aside from git-fast-import we
always know the object count before we start writing any data.
But despite that, fast-import runs quite well.
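For reference, the second proposal quoted above could be sketched as follows. This is only an illustration of the suggested scheme, not anything the pack format does today. The class name is invented, and encoding the count as a 4-byte big-endian integer is an assumption borrowed from the pack header.

```python
import hashlib
import struct

# Proposed sentinel from the first suggestion: "look in the index for the count".
COUNT_IN_INDEX = 0xFFFFFFFF


class ProposedPackChecksum:
    """Checksum = SHA-1(count || SHA-1(obj 1 compressed) || SHA-1(obj 2 compressed) ...)."""

    def __init__(self):
        self.count = 0
        self._digests = []

    def add_object(self, compressed):
        """Record one object's compressed bytes as they are written to the pack."""
        self._digests.append(hashlib.sha1(compressed).digest())
        self.count += 1

    def finish(self):
        """Return the 20-byte checksum without re-reading the pack file."""
        outer = hashlib.sha1()
        # The count comes first but is only known at the end, so the
        # per-object digests are buffered (20 bytes each) until now.
        outer.update(struct.pack(">I", self.count))
        for digest in self._digests:
            outer.update(digest)
        return outer.digest()
```

This trades the read-back pass for 20 bytes of memory per object, and every pack reader would need to learn the new checksum rule.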
--
Shawn.