From: Shawn Pearce <spearce@spearce.org>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: Martin Langhoff <martin.langhoff@gmail.com>,
	Linus Torvalds <torvalds@osdl.org>, git <git@vger.kernel.org>
Subject: Re: Creating objects manually and repack
Date: Sat, 5 Aug 2006 01:52:03 -0400
Message-ID: <20060805055203.GC18679@spearce.org>
In-Reply-To: <9e4733910608042240u581dd23q3859ebcfe4268ce2@mail.gmail.com>

Jon Smirl <jonsmirl@gmail.com> wrote:
> How about adding a flag to repack that simply says delete the objects
> when done with them? I'd still create all of the objects on disk.
> Repack would assume that they have at least been sorted by filename.
> So repack could read in object names until it sees a change in the
> file name, sort them by size, deltafy, write out the pack and then
> delete the objects from that batch. Then repeat this process for the
> next file name on stdin.
> 
> I'm making two assumptions: first, that blocks from a deleted file
> don't get written to disk, and second, that after a file is deleted
> the file system will reuse the same blocks over and over. If those
> assumptions are close to being true then the cache shouldn't thrash.
> They don't have to be totally true; close is good enough.
> 
> Of course eliminating the files altogether will be even faster.

See the email I just sent you.  The only file being written is the
pack file that's being generated.  No temporary files, no temporary
inodes, no temporary blocks.  Only two passes over the data: one to
write it out and a second to generate the SHA1.  I do two passes
rather than keeping it all in memory so that the program doesn't
blow up on extremely large inputs.
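
A minimal sketch of that two-pass idea, in Python for brevity.  It is
not the git-fast-import code: the function name write_then_hash, the
block_size parameter and the flat "chunks" input are all assumptions
made for illustration.  Pass one streams the data to the output file
without buffering it; pass two re-reads the file in fixed-size blocks
to compute the SHA-1, which is then appended as a 20-byte trailer the
way a pack file ends with a checksum of its own contents.

```python
import hashlib


def write_then_hash(chunks, path, block_size=64 * 1024):
    """Stream chunks to `path`, then hash the file in a second pass.

    Hypothetical helper: only one chunk (pass 1) or one block (pass 2)
    is held in memory at a time, regardless of total input size.
    """
    # Pass 1: write each chunk as it arrives; nothing is accumulated.
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)

    # Pass 2: re-read what was written and feed it to SHA-1 block by block.
    digest = hashlib.sha1()
    with open(path, "rb") as src:
        while True:
            block = src.read(block_size)
            if not block:
                break
            digest.update(block)

    # Append the raw 20-byte digest as a trailer (assumed layout).
    with open(path, "ab") as out:
        out.write(digest.digest())

    return digest.hexdigest()
```

Memory use is bounded by block_size, at the cost of reading the
output back once.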

It may be possible to tweak git-pack-objects to get what you propose
above, but to be honest I think the git-fast-import I just sent
was easier, especially as it avoids the temporary loose object stage.
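
For comparison, the batching step of the quoted proposal could be
sketched roughly as follows.  This is only an illustration, not
anything git-pack-objects does today: the (file_name, object_name,
size) tuple layout, the function name batches_by_file and the
largest-first ordering are assumptions.  The deltify, write and
delete steps are left to the caller.

```python
from itertools import groupby
from operator import itemgetter


def batches_by_file(entries):
    """Yield (file_name, objects) for each run of equal file names.

    `entries` is an iterable of hypothetical (file_name, object_name,
    size) tuples that is already sorted by file name, as the proposal
    assumes.  Only one batch is materialised at a time.
    """
    for file_name, run in groupby(entries, key=itemgetter(0)):
        # Sort the batch by size, largest first (an assumed ordering).
        yield file_name, sorted(run, key=itemgetter(2), reverse=True)
```

A caller would loop over the batches, deltify and write each one to
the pack, and then delete that batch's loose objects before moving on
to the next file name.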

-- 
Shawn.
