* Git and Media repositories....
From: Tim Ansell @ 2008-11-02 19:50 UTC
To: git
Hey guys,
Last week at the GitTogether I led some discussions about how we could
make Git better support large media repositories (which is one area
where Subversion still makes sense). It was suggested that I post to this
list to get a discussion going.
The general idea is that we always clone the complete meta-data (tags,
commits and trees) and then only clone blobs when they are needed (using
something like alternates). This allows us to support shallow, narrow
and sparse checkouts while still being able to perform operations such
as committing and merging.
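A minimal sketch of the lazy-blob idea, for illustration only. It is not the proof of concept mentioned in this message, and every name in it (`LazyBlobStore`, `remote`, `blob_id`) is hypothetical. A plain dict stands in for the remote alternate; commits, trees and tags are assumed to be local already, and a blob is fetched and verified only the first time it is read.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Git-style blob id: SHA-1 over a 'blob <size>\\0' header plus the content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class LazyBlobStore:
    """Toy object store that pulls blobs from a fallback source on first use.

    `remote` is a hypothetical stand-in for a remote alternate: any mapping
    from object id to raw blob content.
    """

    def __init__(self, remote):
        self.remote = remote
        self.local = {}    # blobs already present in the local store
        self.fetches = 0   # number of round trips to the remote

    def read(self, oid: str) -> bytes:
        if oid not in self.local:
            data = self.remote[oid]       # raises KeyError if the remote lacks it
            if blob_id(data) != oid:      # content addressing lets us verify the fetch
                raise ValueError("remote returned corrupt data for " + oid)
            self.local[oid] = data
            self.fetches += 1
        return self.local[oid]
```

Because objects are addressed by their hash, the client can check whatever the remote hands back, so the remote does not need to be more trusted than any other source of objects.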
You can find a copy of the summary presentation at
http://www.thousandparsec.net/~tim/media+git.pdf
I have started working on adapting git to check a remote http alternate
to provide a proof of concept.
I appreciate any help or suggestions.
Tim 'mithro' Ansell
* Re: Git and Media repositories....
From: Johannes Schindelin @ 2008-11-03 6:56 UTC
To: Tim Ansell; +Cc: git
Hi,
On Sun, 2 Nov 2008, Tim Ansell wrote:
> Last week at the GitTogether I led some discussions about how we could
> make Git better support large media repositories (which is one area
> where Subversion still makes sense). It was suggested that I post to this
> list to get a discussion going.
>
> The general idea is that we always clone the complete meta-data (tags,
> commits and trees) and then only clone blobs when they are needed (using
> something like alternates). This allows us to support shallow, narrow
> and sparse checkouts while still being able to perform operations such
> as committing and merging.
>
> You can find a copy of the summary presentation at
> http://www.thousandparsec.net/~tim/media+git.pdf
>
> I have started working on adapting git to check a remote http alternate
> to provide a proof of concept.
>
> I appreciate any help or suggestions.
You might find this message (and others from the same time frame and
author) pretty interesting:
http://article.gmane.org/gmane.comp.version-control.git/48485
Ciao,
Dscho
* Re: Git and Media repositories....
From: Jakub Narebski @ 2008-11-03 9:40 UTC
To: Tim Ansell; +Cc: git, Dana How
Tim Ansell <mithro@mithis.com> writes:
> Last week at the GitTogether I led some discussions about how we could
> make Git better support large media repositories (which is one area
> where Subversion still makes sense). It was suggested that I post to this
> list to get a discussion going.
>
> The general idea is that we always clone the complete meta-data (tags,
> commits and trees) and then only clone blobs when they are needed (using
> something like alternates). This allows us to support shallow, narrow
> and sparse checkouts while still being able to perform operations such
> as committing and merging.
>
> You can find a copy of the summary presentation at
> http://www.thousandparsec.net/~tim/media+git.pdf
>
> I have started working on adapting git to check a remote http alternate
> to provide a proof of concept.
>
> I appreciate any help or suggestions.
Dana How (CC-ed) worked on better support for large files, but in a
corporate setting. The solution that came out of all the discussion
and all the patches (not all accepted) was to create a kept packfile for
those large files, and share those packfiles (perhaps via alternates)
using a network filesystem, instead of keeping separate copies and
transferring them on fetch / push.
From what I remember there was one serious attempt (by serious I mean
here with patches) to add 'lazy clone' / 'sparse clone' / 'remote
alternates', using some kind of "stub" objects and transferring objects
lazily. This patch was fairly intrusive, and didn't get accepted.
I think you can find it in the archives. Unfortunately I haven't bookmarked
this thread...
The problem with lazy clone is that git assumes in many places that if
it has some object, it has all its dependencies. Lazy clone
(on-demand object loading) breaks this assumption... although in your
case (only blobs of large size can be asked to be loaded lazily) it is
mitigated somewhat.
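To make that assumption concrete, here is a small sketch of a reachability check over a toy object graph. It is only an illustration: the dict-based graph and the names `find_missing` and `promised` are hypothetical and are not git's actual data structures or API. With no promised objects, any absent object counts as a broken repository; allowing a set of large blobs to be absent relaxes the check only at the leaves, which is why restricting laziness to blobs limits the damage.

```python
def find_missing(objects, roots, promised=frozenset()):
    """Return ids reachable from `roots` that are absent and not promised.

    `objects` maps an object id to the list of ids it references
    (commit -> tree, tree -> subtrees and blobs, blob -> nothing).
    `promised` is the set of blob ids we accept as "available on demand".
    """
    missing = []
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        if oid not in objects:
            # Absent object: tolerated only if it is a promised lazy blob.
            if oid not in promised:
                missing.append(oid)
            continue
        stack.extend(objects[oid])
    return sorted(missing)


# Toy repository: one commit, one tree, a small blob present, a big blob absent.
repo = {
    "commit1": ["tree1"],
    "tree1": ["blob_small", "blob_big"],
    "blob_small": [],
}
strict = find_missing(repo, ["commit1"])
relaxed = find_missing(repo, ["commit1"], promised={"blob_big"})
```

Note that the walk never needs the content of a promised blob, only its id, since blobs reference nothing further. A missing tree or commit would hide everything beneath it, which is one reason to always clone the complete meta-data.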
I also think that you would have to have 'sparse checkout' support.
If you don't have a blob in the object repository (and don't want to have it
there), you cannot check it out. Fortunately this feature is quite
alive, and is being worked on by Duy (pclouds), see "What's cooking..."
(nd/narrow branch in 'pu').
HTH
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: Git and Media repositories....
From: Jakub Narebski @ 2008-11-07 13:00 UTC
To: Tim Ansell; +Cc: git
Tim Ansell <mithro@mithis.com> writes:
> Last week at the GitTogether I led some discussions about how we could
> make Git better support large media repositories (which is one area
> where Subversion still makes sense). It was suggested that I post to this
> list to get a discussion going.
>
> The general idea is that we always clone the complete meta-data (tags,
> commits and trees) and then only clone blobs when they are needed (using
> something like alternates). This allows us to support shallow, narrow
> and sparse checkouts while still being able to perform operations such
> as committing and merging.
[...]
Well, the *workaround* you could currently use is to put large media
files in a separate subdirectory, and make this subdirectory into a
submodule. This uses the fact that you can selectively clone
submodules, or leave them as stubs...
...and this is also the code you might want to look at when
implementing stubs for 'remote' blob objects.
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: Git and Media repositories....
From: Santi Béjar @ 2008-11-07 13:19 UTC
To: Tim Ansell; +Cc: git
On Sun, Nov 2, 2008 at 8:50 PM, Tim Ansell <mithro@mithis.com> wrote:
> Hey guys,
>
[...]
>
> The general idea is that we always clone the complete meta-data (tags,
> commits and trees) and then only clone blobs when they are needed (using
> something like alternates). This allows us to support shallow, narrow
> and sparse checkouts while still being able to perform operations such
> as committing and merging.
>
A related use case could be to remove a blob from a repo while still being
able to work normally with it, similar to:
http://wiki.freebsd.org/VCSFeatureObliterate
Santi
* Re: Git and Media repositories....
From: Nguyen Thai Ngoc Duy @ 2008-11-09 4:58 UTC
To: Santi Béjar; +Cc: Tim Ansell, git
On 11/7/08, Santi Béjar <santi@agolina.net> wrote:
> On Sun, Nov 2, 2008 at 8:50 PM, Tim Ansell <mithro@mithis.com> wrote:
> > Hey guys,
> >
> [...]
> >
> > The general idea is that we always clone the complete meta-data (tags,
> > commits and trees) and then only clone blobs when they are needed (using
> > something like alternates). This allows us to support shallow, narrow
> > and sparse checkouts while still being able to perform operations such
> > as committing and merging.
> >
> A related use case could be to remove a blob from a repo while still being
> able to work normally with it, similar to:
>
> http://wiki.freebsd.org/VCSFeatureObliterate
Maybe another use case: encrypted blobs (those are generally
unavailable until the correct password is given, so they are "holes" in
checkout/clone). It could be used to store sensitive content (in $HOME
for example).
--
Duy