From: "Alex Riesen" <raa.lkml@gmail.com>
To: "Shawn Pearce" <spearce@spearce.org>
Cc: "Eran Tromer" <git2eran@tromer.org>,
	"Nicolas Pitre" <nico@cam.org>, "Junio C Hamano" <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: fetching packs and storing them as packs
Date: Fri, 27 Oct 2006 09:42:47 +0200
Message-ID: <81b0412b0610270042w29279b90t7c94d8590d701519@mail.gmail.com>
In-Reply-To: <20061027044233.GA29057@spearce.org>

> So the receive-pack process becomes:
>
>  a. Create temporary pack file in $GIT_DIR/objects/pack_XXXXX.
>  b. Create temporary index file in $GIT_DIR/objects/index_XXXXX.

Why not $GIT_DIR/objects/tmp/pack... and ignore it everywhere?
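
A rough sketch of that idea, in Python for brevity. Nothing here is
existing git code: the function name, the "pack_" temp prefix and the
assumption that every reader already skips objects/tmp are mine, and
objects/tmp is assumed to sit on the same filesystem as objects/pack.

```python
import os
import tempfile


def store_incoming_pack(git_dir, pack_bytes, idx_bytes, name):
    """Stage a pack and its index under objects/tmp, then publish them.

    `name` is the final base name (for example "pack-<hash>").  The
    objects/tmp staging directory is the proposal above, not a path
    that git is known to ignore today.
    """
    tmp_dir = os.path.join(git_dir, "objects", "tmp")
    pack_dir = os.path.join(git_dir, "objects", "pack")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(pack_dir, exist_ok=True)

    staged = []
    for suffix, data in ((".pack", pack_bytes), (".idx", idx_bytes)):
        # mkstemp picks a unique name, much like the pack_XXXXX pattern.
        fd, path = tempfile.mkstemp(prefix="pack_", dir=tmp_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        staged.append((path, os.path.join(pack_dir, name + suffix)))

    # Publish the pack first and the index second (steps d and e in the
    # quoted list), so an index never shows up without its pack.
    for src, dst in staged:
        os.rename(src, dst)
    return [dst for _, dst in staged]
```

Until the two renames happen, nothing under objects/pack changes, so
a concurrent repack -a -d has nothing half-written to trip over.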

On 10/27/06, Shawn Pearce <spearce@spearce.org> wrote:
> Eran Tromer <git2eran@tromer.org> wrote:
> > On 2006-10-27 05:00, Shawn Pearce wrote:
> > >> Change git-repack to follow references under $GIT_DIR/tmp/refs/ too.
> > >> To receive or fetch a pack:
> > >> 1. Add references to the new heads in
> > >>    `mktemp $GIT_DIR/tmp/refs/XXXXXX`.
> > >> 2. Put the new .pack under $GIT_DIR/objects/pack/.
> > >> 3. Put the new .idx under $GIT_DIR/objects/pack/.
> > >> 4. Update the relevant heads under $GIT_DIR/refs/.
> > >> 5. Delete the references from step 1.
> >
> > > That was actually my (and also Sean's) solution.  Except I would
> > > put the temporary refs as "$GIT_DIR/refs/ref_XXXXXX" as this is
> > > less code to change and it's consistent with how temporary loose
> > > objects are created.
> >
> > If you do that, other programs (e.g., anyone who uses rev-list --all)
> > may try to walk those heads or consider them available before the pack
> > is really there. The point about $GIT_DIR/tmp/refs is that only programs
> > meddling with physical packs (git-fetch, git-receive-pack, git-repack)
> > will know about it.
>
> Doh.  Yes, of course, that makes much sense.
>
> Hmm... Looking at git-repack we have two things currently pending
> to rework in there:
>
>   - Historical vs. active packs.
>   - Don't delete a possibly still incoming pack during -d.
>
> These have a lot of the same implementation issues.  We need to
> be able to identify a set of packs which should be allowed for
> repack with -a, and allowed for removal with -d if -a was also used.
> A newly uploaded pack cannot be in that list unless its contents are
> referenced by one or more refs (which implies that the receive-pack
> process has completed).
>
> I'm thinking that the ref thing might be unnecessary.  We just
> need to fix repack so it builds a list of "active packs" whose
> objects should be copied into the new pack, and then packs only
> loose objects and those objects contained in an active pack.
>
> So the receive-pack process becomes:
>
>   a. Create temporary pack file in $GIT_DIR/objects/pack_XXXXX.
>   b. Create temporary index file in $GIT_DIR/objects/index_XXXXX.
>   c. Write pack and index.
>   d. Move pack to $GIT_DIR/objects/pack/...
>   e. Move index to $GIT_DIR/objects/pack/...
>   f. Update refs.
>   g. Arrange for new pack and index to be considered active.
>
> And the repack -a -d process becomes:
>
>   1. List all active packs and store in memory.
>   2. Repack only loose objects and objects contained in active packs.
>   3. Move new pack and idx into $GIT_DIR/objects/pack/...
>   4. Arrange for new pack and idx to be considered active.
>   5. Delete active packs found by step #1.
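
The five steps quoted above could look roughly like this (Python, as a
sketch only). It assumes active packs are the ones matching
"pack-*.pack" in the pack directory, which is one of the options
discussed further down; build_pack is a hypothetical callback standing
in for the real repacking work of steps 2 to 4.

```python
import glob
import os


def list_active_packs(pack_dir):
    # Step 1: snapshot the packs that exist right now.
    return sorted(glob.glob(os.path.join(pack_dir, "pack-*.pack")))


def repack_all_and_delete(pack_dir, build_pack):
    """Run the quoted repack -a -d sequence.

    build_pack(active) is a placeholder: it must write the new pack and
    idx into pack_dir and return the path of the new pack.
    """
    active = list_active_packs(pack_dir)      # step 1
    new_pack = build_pack(active)             # steps 2-4
    new_name = os.path.basename(new_pack)
    for pack in active:                       # step 5
        if os.path.basename(pack) == new_name:
            continue  # the new pack reused an old name; keep it
        for path in (pack, pack[:-len(".pack")] + ".idx"):
            if os.path.exists(path):
                os.remove(path)
    return new_pack
```

The point of taking the list once, up front, is that a pack which
arrives while build_pack is running is not in the list and so is not
deleted in step 5.
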
>
> Junio was originally considering making historical packs
> historical by placing their names into an information file (such as
> `$GIT_DIR/objects/info/historical-packs`) and then considering all
> other packs as active.  Thus step #1 lists all packs and removes those
> whose names appear in historical-packs, while step #4 is unnecessary.
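
For the information-file variant, step #1 might be sketched like this.
The file format (one pack file name per line) and the function name are
assumptions; the quoted text only names the file.

```python
import os


def active_packs_from_info(git_dir):
    """List pack files not named in objects/info/historical-packs."""
    pack_dir = os.path.join(git_dir, "objects", "pack")
    info = os.path.join(git_dir, "objects", "info", "historical-packs")

    historical = set()
    if os.path.exists(info):
        with open(info) as f:
            # Assumed format: one pack file name per line.
            historical = {line.strip() for line in f if line.strip()}

    return sorted(
        name for name in os.listdir(pack_dir)
        if name.endswith(".pack") and name not in historical
    )
```
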
>
> I was thinking about just changing the "pack-" prefix to "hist-" for
> the historical packs and assuming all "pack-*.pack" to be active.
> Thus step #1 is a simple glob on the pack directory and step #4
> is unnecessary.
>
> In the latter case it's easy to mark an existing pack as historical
> (just hardlink hist- names for pack, then idx, then unlink previous
> names) and it's also easy to mark new incoming packs as non-active
> by using a different prefix (e.g. "incm-") during step #d/#e and
> then relinking them as "pack-" during step #g.  It's also very safe
> on systems that support hardlinks.
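
The hardlink dance described above, sketched in Python. The function
name is made up, and the order in which the old names are unlinked is
my choice, since the quoted text does not specify it.

```python
import os


def mark_historical(pack_dir, base):
    """Rename pack-<id>.{pack,idx} to hist-<id>.{pack,idx} via hardlinks.

    `base` is the current base name, for example "pack-1234".
    """
    if not base.startswith("pack-"):
        raise ValueError("expected a pack- base name: %r" % base)
    hist = "hist-" + base[len("pack-"):]

    # Create the hist- names first: pack, then idx.  The data stays
    # reachable under the old names the whole time.
    for suffix in (".pack", ".idx"):
        os.link(os.path.join(pack_dir, base + suffix),
                os.path.join(pack_dir, hist + suffix))

    # Only then drop the previous names (idx first here, by choice).
    for suffix in (".idx", ".pack"):
        os.unlink(os.path.join(pack_dir, base + suffix))
    return hist
```

At no point is there a window in which the pack data has no name at
all, which is what makes this safe where hardlinks are available.
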
>
> We shouldn't ever need to worry about race conditions with repacking
> historical packs.  For starters historical packs will tend to be
> several years' worth of object accumulation and will be so large
> that repacking them might take 45 minutes or more.  Thus they
> probably will never get repacked.  An active pack will simply move
> into historical status after it gets so large that it's no longer
> worthwhile to keep repacking it.  They also will tend to have objects
> that are so old that at least one ref in the repository will point
> at their entire DAG and thus everything would carry over on a repack.
>
> So this would be cleaner than messing around with temporary refs and
> gets us the historical pack feature we've been looking to implement.
>
> --
> Shawn.
