From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: Bringing a bit more sanity to $GIT_DIR/objects/info/alternates?
Date: Sun, 05 Aug 2012 11:38:12 +0200
Message-ID: <501E3F04.4050902@alum.mit.edu>
In-Reply-To: <7vmx2a3pif.fsf@alter.siamese.dyndns.org>

On 08/05/2012 06:56 AM, Junio C Hamano wrote:
> The "alternates" mechanism [...]
> The UI for this mechanism however has some room for improvement, and
> we may want to start improving it for the next release after the
> upcoming Git 1.7.12 (or even Git 2.0 if the change is a large one
> that may be backward incompatible but gives us a vast improvement).
>
> Here are some random thoughts as a discussion starter. [...]
[...]
>     - Make the distinction between a regular repository and an object
>       store that is meant to be used for object sharing stronger.
>
>       Perhaps a configuration item "core.objectstore = readonly" can
>       be introduced, and we forbid "clone -s" from pointing at a
>       repository without such a configuration.  We also forbid object
>       pruning operations such as "gc" and "repack" from being run in
>       a repository marked as such.

Must the repository necessarily be "readonly"?  It seems that it would 
be permissible to push new objects to such a repository; just not to 
delete existing objects.  Thus maybe another term would be better to 
describe such a repository, like "appendonly" or "noprune" or even 
something more abstract like "donor".
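
To make the "appendonly"/"noprune" distinction concrete, here is a
minimal sketch in Python. It is a toy model, not Git code: the class
and method names (AppendOnlyStore, add, has, prune) are hypothetical
and exist only for illustration. It shows a store that accepts new
objects but refuses every deletion.

```python
class PruneForbidden(Exception):
    """Raised when a deletion is attempted on an append-only store."""


class AppendOnlyStore:
    """Toy object store (hypothetical): objects may be added, never removed."""

    def __init__(self):
        self._objects = {}  # object id -> content bytes

    def add(self, oid, content):
        # Adding is always allowed; re-adding an existing id is a no-op.
        self._objects.setdefault(oid, content)

    def has(self, oid):
        return oid in self._objects

    def prune(self, oid):
        # A "noprune"/"donor" store refuses every deletion, because
        # borrowers may depend on any object it holds.
        raise PruneForbidden(f"refusing to delete {oid}")
```

Under this model a push (add) succeeds while anything gc-like (prune)
fails, which is the behavior that "readonly" would misdescribe.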

I have some other crazy ideas for making the concept even more powerful:

* Support remote alternate repositories.  The local repository would
obtain missing objects from the remote as needed.  This would probably
be insanely inefficient without also supporting...

* Lazy copying of "borrowed" objects to the local repository.  Any 
object fetched from the alternate object store is copied to the local 
object store.

Together, I think that these two features would give fully functional
shallow clones.

Such alternates could even be chained together: for example, keep a
single local lazy clone of the upstream repository somewhere on your
site or on your computer, and use that as a read-through cache for
other clones.
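
The chained, lazily copying lookup described above can be sketched in
Python as follows. This is only an illustration under simplifying
assumptions: stores are in-memory dictionaries, and the names LazyStore,
get, upstream, site_cache, and clone are hypothetical, not anything Git
provides.

```python
class LazyStore:
    """Toy object store that borrows from an optional alternate."""

    def __init__(self, name, alternate=None):
        self.name = name
        self.objects = {}           # object id -> content bytes
        self.alternate = alternate  # next store in the chain, or None

    def get(self, oid):
        # 1. Serve from the local store when possible.
        if oid in self.objects:
            return self.objects[oid]
        # 2. Otherwise ask the alternate, which may ask its own alternate.
        if self.alternate is None:
            raise KeyError(oid)
        content = self.alternate.get(oid)
        # 3. Lazy copy: keep what was borrowed so the next lookup is local.
        self.objects[oid] = content
        return content


# upstream <- site-wide cache <- individual clone
upstream = LazyStore("upstream")
upstream.objects["a1"] = b"blob contents"
site_cache = LazyStore("site-cache", alternate=upstream)
clone = LazyStore("clone", alternate=site_cache)

clone.get("a1")  # fetched through the chain, then stored at each hop
```

After the single lookup, both site_cache and clone hold their own copy
of the object, so a second clone pointing at site_cache would never
need to contact upstream for it.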

* To help manage local disk space, allow intelligent curation of the
objects kept in the local store when they are also available in the
alternate.  The criteria for what to keep could be things like
"revisions with depth <= 20 on branches X, Y/*, and Z"; "objects that
have been accessed within the last 3 months"; "all tag objects
matching refs/tags/release-*".  It should be possible to cull objects
not meeting the criteria with or without actively fetching all objects
meeting the criteria.  Probably the criteria would be stored in the
configuration to be reused (and perhaps run as part of "git gc").
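
One way to picture such keep criteria is as a list of predicates that
a cull pass consults. The Python below is a rough sketch under stated
assumptions: per-object metadata (a type and a last-access timestamp)
is assumed to be available, which Git does not track in this form, and
the names cull, accessed_within, and is_type are hypothetical.

```python
import time

DAY = 24 * 60 * 60  # seconds


def accessed_within(days):
    """Keep rule: object was accessed within the last `days` days."""
    return lambda meta, now: now - meta["last_access"] <= days * DAY


def is_type(kind):
    """Keep rule: object is of the given type, e.g. "tag"."""
    return lambda meta, now: meta["type"] == kind


def cull(local, alternate_ids, keep, now=None):
    """Drop local objects that the alternate also has and no rule keeps.

    `local` maps object id -> metadata dict and is modified in place.
    Returns the sorted list of removed object ids.
    """
    now = time.time() if now is None else now
    removed = []
    for oid in sorted(local):
        meta = local[oid]
        if oid not in alternate_ids:
            continue  # the only copy lives here; never drop it
        if any(rule(meta, now) for rule in keep):
            continue  # at least one criterion wants it kept
        del local[oid]
        removed.append(oid)
    return removed
```

A stored policy would then be a list such as
[accessed_within(90), is_type("tag")], and culling never touches an
object that the alternate cannot supply again.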

This would cure a lot of "storing big, non-deltaable files" pain because 
big blobs could be stored on a central server without multiplying the 
size of every clone.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
