* Tracking a repository for content instead of history
From: Andy Parkins @ 2006-12-12 12:35 UTC
To: git

Hello,

For interest's sake I'd like to track the kernel.org linux repository.
However, I'm not that bothered about tracking the history - it's more that I
like to have the latest kernel release lying around.

Is there a way that I could just pull individual commits from a git
repository?  In particular - could I make a repository (obviously not a
clone, because it wouldn't have all the history) that contained only the
tagged commits from an upstream repository?

Is it even sensible to want that?  It strikes me that it's possible that there
isn't that much space/bandwidth saving to be made.  Should I just clone the
repository and shut up? :-)

Andy

--
Dr Andy Parkins, M Eng (hons), MIEE
* Re: Tracking a repository for content instead of history
From: Jakub Narebski @ 2006-12-12 13:04 UTC
To: git

Andy Parkins wrote:

> For interest's sake I'd like to track the kernel.org linux repository.
> However, I'm not that bothered about tracking the history - it's more that I
> like to have the latest kernel release lying around.
>
> Is there a way that I could just pull individual commits from a git
> repository?  In particular - could I make a repository (obviously not a
> clone, because it wouldn't have all the history) that contained only the
> tagged commits from an upstream repository?

As of beta (in 'next') you can do a 'shallow clone', i.e. clone/fetch
history only N commits deep.

> Is it even sensible to want that?  It strikes me that it's possible that there
> isn't that much space/bandwidth saving to be made.  Should I just clone the
> repository and shut up? :-)

I've had a similar idea: search for the "sparse clone" keyword.  But no code.

--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Tracking a repository for content instead of history
From: Andy Parkins @ 2006-12-12 13:26 UTC
To: git

On Tuesday 2006 December 12 13:04, Jakub Narebski wrote:

> > Is it even sensible to want that?  It strikes me that it's possible that
> > there isn't that much space/bandwidth saving to be made.  Should I just
> > clone the repository and shut up? :-)
>
> I've had similar idea: search for "sparse clone" keyword. But no code.

While the functionality might not be built into git in terms of clone, would
there be a way to pull a particular commit from another repository?

The way I would do it given nothing else is to simply extract snapshots into a
working directory; and create a repository from scratch.  I was just wondering
if a method existed that could reduce the size of the download.

I think the best way is going to be to use the patches published at
kernel.org and apply them one at a time with git-apply.

Andy

--
Dr Andy Parkins, M Eng (hons), MIEE
* Re: Tracking a repository for content instead of history
From: Johannes Schindelin @ 2006-12-12 14:28 UTC
To: Andy Parkins
Cc: git

Hi,

On Tue, 12 Dec 2006, Andy Parkins wrote:

> The way I would do it given nothing else is to simply extract snapshots
> into a working directory; and create a repository from scratch.  I was
> just wondering if a method existed that could reduce the size of the
> download.

You are not by any chance talking about the --remote option to
git-archive?

If you want to reduce the number of objects to be downloaded, by telling
the other side what you have, you literally end up with something like
shallow clone: the other side _has_ to support it.

Ciao,
Dscho
* Re: Tracking a repository for content instead of history
From: Andy Parkins @ 2006-12-12 15:38 UTC
To: git

On Tuesday 2006 December 12 14:28, Johannes Schindelin wrote:

> You are not by any chance talking about the --remote option to
> git-archive?

I wasn't; but that's certainly a helpful switch.  It's certainly a huge help.

> If you want to reduce the number of objects to be downloaded, by telling
> the other side what you have, you literally end up with something like
> shallow clone: the other side _has_ to support it.

I suppose so; but I was thinking more an automated way of getting the data
that is supplied for the kernel anyway.  So:

  base-v1.0.0.tar.gz
  patch-v1.0.1.gz
  patch-v1.0.2.gz
  etc

Each patch is obviously smaller than "base".  Git could easily make the
patches, and each of those patches could be fed by hand into a repository
with git-apply.  It doesn't seem like something that would require support on
the other side, because it isn't so much a shallow clone (which /would/
preserve history, making it available if wanted); it is pulling just, say,
tagged commits out of an existing repository.

Given a list of tags it is almost:

  git-archive <get me base>
  ssh remote git-diff v1.0.0..v1.0.1 | git-apply; git commit
  ssh remote git-diff v1.0.1..v1.0.2 | git-apply; git commit

If that makes sense?  Obviously though it would be possible to use git rather
than ssh to do this.

However... please don't waste any more time thinking about this; it's not a
problem I have that needs a solution - it was more a "because I'm curious"
sort of question.

Andy

--
Dr Andy Parkins, M Eng (hons), MIEE
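The tag-by-tag procedure in the message above can be sketched as a small planner that only prints the commands instead of running them. This is an illustration under stated assumptions, not a tested import tool: the function name `snapshot_plan`, the host `remote`, the path `/srv/linux.git` and the commit messages are all hypothetical, tag names are assumed to be free of shell metacharacters, and `git archive --remote` only works if the server allows it. The modern spellings `git archive`, `git diff --binary` and `git apply --index` stand in for the dashed commands of the period.

```python
def snapshot_plan(host, repo_path, tags):
    """Return shell commands for a tags-only import, oldest tag first.

    host, repo_path and the commit messages are hypothetical placeholders;
    tag names are assumed to contain no shell metacharacters.
    """
    if not tags:
        raise ValueError("need at least one tag")
    base = tags[0]
    plan = [
        "git init",
        # Fetch the base snapshot as a tar stream and unpack it locally.
        f"git archive --remote={host}:{repo_path} {base} | tar -x",
        "git add -A",
        f'git commit -m "Import {base}"',
    ]
    # One diff per consecutive tag pair, applied to index and work tree.
    for old, new in zip(tags, tags[1:]):
        plan.append(
            f"ssh {host} git -C {repo_path} diff --binary {old}..{new}"
            " | git apply --index"
        )
        plan.append(f'git commit -m "Update to {new}"')
    return plan


if __name__ == "__main__":
    for command in snapshot_plan("remote", "/srv/linux.git",
                                 ["v1.0.0", "v1.0.1", "v1.0.2"]):
        print(command)
```

The commits this would create are new local commits with their own ids; they share content with the upstream tags but not history, which is the trade-off the thread is about.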
* Re: Tracking a repository for content instead of history
From: Johannes Schindelin @ 2006-12-12 16:24 UTC
To: Andy Parkins
Cc: git

Hi,

On Tue, 12 Dec 2006, Andy Parkins wrote:

> On Tuesday 2006 December 12 14:28, Johannes Schindelin wrote:
>
> > You are not by any chance talking about the --remote option to
> > git-archive?
>
> I wasn't; but that's certainly a helpful switch.  It's certainly a huge
> help.
>
> > If you want to reduce the number of objects to be downloaded, by telling
> > the other side what you have, you literally end up with something like
> > shallow clone: the other side _has_ to support it.
>
> I suppose so; but I was thinking more an automated way of getting the data
> that is supplied for the kernel anyway.  So:
>
>   base-v1.0.0.tar.gz
>   patch-v1.0.1.gz
>   patch-v1.0.2.gz
>   etc
>
> Each patch is obviously smaller than "base".  Git could easily make the
> patches, and each of those patches could be fed by hand into a repository
> with git-apply.

If it weren't for the recent discussion of kernel.org being overloaded
with gitweb processes, I'd just write down a hint like

http://repo.or.cz/w/git/jnareb-git.git?a=commitdiff_plain;h=next;hp=master

But since kernel.org is overloaded, I will not do that.

Ciao,
* Re: Tracking a repository for content instead of history
From: Johannes Schindelin @ 2006-12-12 16:35 UTC
To: Andy Parkins
Cc: git

Hi,

On Tue, 12 Dec 2006, Johannes Schindelin wrote:

> If it weren't for the recent discussion of kernel.org being overloaded
> with gitweb processes, I'd just write down a hint like
>
> [URL edited out]
>
> But since kernel.org is overloaded, I will not do that.

Side note: it would probably not help you.  The diff is uncompressed, and
thus likely _substantially larger_ than getting the snapshot via gitweb,
which _is_ compressed.

Ciao,
Dscho
* Re: Tracking a repository for content instead of history
From: Nguyen Thai Ngoc Duy @ 2006-12-12 21:46 UTC
To: Andy Parkins
Cc: git

On 12/12/06, Andy Parkins <andyparkins@gmail.com> wrote:

> I suppose so; but I was thinking more an automated way of getting the data
> that is supplied for the kernel anyway.  So:
>
>   base-v1.0.0.tar.gz
>   patch-v1.0.1.gz
>   patch-v1.0.2.gz
>   etc
>
> Each patch is obviously smaller than "base".  Git could easily make the
> patches, and each of those patches could be fed by hand into a repository
> with git-apply.  It doesn't seem like something that would require support on
> the other side, because it isn't so much a shallow clone (which /would/
> preserve history, making it available if wanted); it is pulling just, say,
> tagged commits out of an existing repository.
>
> Given a list of tags it is almost:
>
>   git-archive <get me base>
>   ssh remote git-diff v1.0.0..v1.0.1 | git-apply; git commit
>   ssh remote git-diff v1.0.1..v1.0.2 | git-apply; git commit
>
> If that makes sense?  Obviously though it would be possible to use git rather
> than ssh to do this.

Hm.. I'm no git:// expert.  But is it possible to do the following?

1. git-archive <base>
2. reconstruct commit, blobs and trees from the archive
3. tell git server that you have one commit, you need another commit
   (maybe heads only, I'm not sure here)
4. get the pack from git server, create new commit and a diff
* Re: Tracking a repository for content instead of history
From: Nguyen Thai Ngoc Duy @ 2006-12-12 21:48 UTC
To: git

On 12/13/06, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:

> Hm.. I'm no git:// expert. But is it possible doing as follow?
> 1. git-archive <base>
> 2. reconstruct commit, blobs and trees from the archive
> 3. tell git server that you have one commit, you need another commit
> (maybe heads only, i'm not sure here)
> 4. get the pack from git server, create new commit and a diff

Ok.  Stupid idea.  The pack may be based on objects that I don't have.
* Re: Tracking a repository for content instead of history
From: Johannes Schindelin @ 2006-12-12 22:25 UTC
To: Nguyen Thai Ngoc Duy
Cc: git

Hi,

On Wed, 13 Dec 2006, Nguyen Thai Ngoc Duy wrote:

> On 12/13/06, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> > Hm.. I'm no git:// expert. But is it possible doing as follow?
> > 1. git-archive <base>
> > 2. reconstruct commit, blobs and trees from the archive
> > 3. tell git server that you have one commit, you need another commit
> > (maybe heads only, i'm not sure here)
> > 4. get the pack from git server, create new commit and a diff
>
> Ok. Stupid idea. The pack may base on objects that I don't have.

The only not-so-brilliant idea is to reconstruct the commit from the
archive.  This is not possible: not only is some author and committer
metadata not reconstructable, but worse, the parents' hashes are not
either.  And since all of these are hashed to get the commit hash, you
lose.

However, it could work like this:

- reconstruct tree commit
- ask for a diff between a certain commit, with respect to your tree

It might even be easy to convince git-upload-pack to construct a thin pack
containing deltas _only_ against objects which are reachable from your
tree.

Note: this is feasible, but not necessarily sensible:

- it puts more strain on the server, which otherwise could probably reuse
  a lot of deltas, and
- it contradicts the idea of _distributed_ development (for example, you
  could not tell which HEAD commit is newer when you fetched from two
  repos).

Probably, you could add a third argument: merges are not necessarily
_possible_ with that setup.  Note that this argument applies to shallow
clones, too!

Ciao,
Dscho
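Dscho's point that a commit cannot be rebuilt from an archive can be illustrated by computing a commit id by hand. The sketch below assumes git's SHA-1 object format (the id is the SHA-1 of a `<type> <size>\0` header followed by the object body); the helper names `object_id` and `commit_id`, the identity string and the all-zero parent id are hypothetical illustration values, not anything taken from the thread.

```python
import hashlib

# Well-known id of the empty tree in a SHA-1 repository.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def object_id(kind, body):
    """Hash an object the way git does: '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parents, author, committer, message):
    """Build the text of a commit object and return its id."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return object_id("commit", "\n".join(lines).encode("utf-8"))


if __name__ == "__main__":
    who = "A U Thor <author@example.com> 1165926900 +0000"  # hypothetical
    # Same tree, same message: only the parent line differs.
    root = commit_id(EMPTY_TREE, [], who, who, "snapshot\n")
    child = commit_id(EMPTY_TREE, ["0" * 40], who, who, "snapshot\n")
    print(root)
    print(child)
    print("ids differ:", root != child)
```

An archive carries only the tree contents, so the tree id can be recomputed from it, but the parent, author and committer lines that feed the commit id cannot.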