* clone breaks replace @ 2011-01-06 21:00 Phillip Susi 2011-01-06 21:33 ` Jonathan Nieder 0 siblings, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-06 21:00 UTC (permalink / raw) To: git I've been experimenting with git replace to remove ancient history, and I have found that cloning a repository breaks replace. I read about this process at http://progit.org/2010/03/17/replace.html. I managed to correctly add a replace commit that truncates the history and contains instructions where you can find it, and running git log only goes back to the replacement commit, unless you add --no-replace-objects, which causes it to show the original full history. The problem is that when I clone the repository, I expect the clone to contain only history up to the replacement record, and not the old history before that. Instead, the clone contains only the full original history, and the replacement ref is not imported at all. A git replace in the new clone shows nothing. Shouldn't clone copy .git/refs/replace? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-06 21:00 clone breaks replace Phillip Susi @ 2011-01-06 21:33 ` Jonathan Nieder 2011-01-06 21:59 ` Junio C Hamano 2011-01-07 19:43 ` Phillip Susi 0 siblings, 2 replies; 34+ messages in thread From: Jonathan Nieder @ 2011-01-06 21:33 UTC (permalink / raw) To: Phillip Susi; +Cc: git, Christian Couder Phillip Susi wrote: > I managed to > correctly add a replace commit that truncates the history and contains > instructions where you can find it, and running git log only goes back > to the replacement commit, unless you add --no-replace-objects, which > causes it to show the original full history. Before I get to your real question: this seems a bit backwards. Let me say a few words about why. In the days before replacement refs (and today, too), each commit name described not only the state of a tree at a moment but the history that led up to it. In fact you can see this somewhat directly: given two distinct commits A and B if you try $ git cat-file commit A >a.commit $ git cat-file commit B >b.commit $ diff -u a.commit b.commit then you will see precisely what can make them different: - the author's name and email and the date of authorship - the committer's name and email and the date committed - the names of the parent commits, describing the history - the name of a tree, describing the content - the log message, including its encoding The commit name is a hash of that information (see git-hash-object(1)) and an invariant maintained is "if a repository has access to commit A, it has access to its parents, their parents, and so on". This invariant is maintained during object transfer and garbage collection and relied on by object transfer and revision traversal. The beauty of replacement refs is that they can be easily added or removed without breaking this invariant. 
And a replacement ref is an actual reference into history, so garbage collection does not remove those commits and the repository keeps enough information to traverse both the modified and unmodified history. Therefore if you want clients to be able to choose between a minimal history and a larger one to save bandwidth, it has to work like this - to get the minimal history, fetch _without_ any replacement refs - to get the full history, fetch the replacement refs on top of that. because an additional reference can only increase the number of objects to be downloaded. > The problem is that when I clone the repository, I expect the clone to > contain only history up to the replacement record, and not the old > history before that. Instead, the clone contains only the full original > history, and the replacement ref is not imported at all. A git replace > in the new clone shows nothing. > > Shouldn't clone copy .git/refs/replace? With that in mind, I suspect the best way to achieve what you are looking for is the following: 1. Make a big, ugly history (branch "big"). Presumably this part's already done. 2. Find the part you want to get rid of and make appropriate replacement refs so "gitk big" shows what you want it to. 3. Use "git filter-branch" to make that history a reality (branch "simpler"). Remove the replacement refs. 4. Use "git replace" to graft back on the pieces you cauterized. Publish the result. 5. Perhaps also run and publish "git replace big simpler", so contributors of branches based against the old 'big' can merge your latest changes from 'simpler'. Encourage contributors to use 'git rebase' or 'git filter-branch' to rebase their contributions against the new, simpler history. Does that make sense? Jonathan ^ permalink raw reply [flat|nested] 34+ messages in thread
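The point about commit names above can be checked directly. The following is a minimal sketch, assuming a throwaway repository created with mktemp and placeholder identity values (the variable names and commit messages are illustrative, not from the thread). It re-hashes the raw text of a commit to recover its name, then hashes the same text with the parent line removed to show that the name changes.

```shell
set -eu
# Placeholder identity so commit-tree works without any user config (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Build two commits on the empty tree using plumbing only.
tree=$(git mktree </dev/null)
parent=$(git commit-tree "$tree" -m "first")
child=$(git commit-tree "$tree" -p "$parent" -m "second")

# Hashing the raw commit text reproduces the commit's own name.
rehashed=$(git cat-file commit "$child" | git hash-object -t commit --stdin)

# The same text without its "parent" line hashes to a different name.
orphaned=$(git cat-file commit "$child" | sed '/^parent /d' |
	git hash-object -t commit --stdin)

echo "child:    $child"
echo "rehashed: $rehashed"
echo "orphaned: $orphaned"
```

This is why changing a commit's ancestry in place is not possible: a commit with a different parent list is, by construction, a different object.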
* Re: clone breaks replace 2011-01-06 21:33 ` Jonathan Nieder @ 2011-01-06 21:59 ` Junio C Hamano 2011-01-07 19:43 ` Phillip Susi 1 sibling, 0 replies; 34+ messages in thread From: Junio C Hamano @ 2011-01-06 21:59 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Phillip Susi, git, Christian Couder Jonathan Nieder <jrnieder@gmail.com> writes: > Therefore if you want clients to be able to choose between a minimal > history and a larger one to save bandwidth, it has to work like this > > - to get the minimal history, fetch _without_ any replacement refs > - to get the full history, fetch the replacement refs on top of that. > > because an additional reference can only increase the number of > objects to be downloaded. Very nicely and clearly put. Can we have this somewhere in the docs? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-06 21:33 ` Jonathan Nieder 2011-01-06 21:59 ` Junio C Hamano @ 2011-01-07 19:43 ` Phillip Susi 2011-01-07 20:51 ` Jonathan Nieder 1 sibling, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-07 19:43 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git, Christian Couder On 1/6/2011 4:33 PM, Jonathan Nieder wrote: > Therefore if you want clients to be able to choose between a minimal > history and a larger one to save bandwidth, it has to work like this > > - to get the minimal history, fetch _without_ any replacement refs > - to get the full history, fetch the replacement refs on top of that. > > because an additional reference can only increase the number of > objects to be downloaded. This seems backwards. The original commit links to its parent and therefore, the full history trail going back. The reason you add the replacement record is to get rid of that parent link, thus truncating the history. Therefore, if you fetch the original record that still has the reference to its parent, and not the replacement record, you end up with the full history. Ergo, to get only the truncated history, you must fetch the replacement record, and pay attention to it to stop fetching commits older than the truncation point. > 3. Use "git filter-branch" to make that history a reality (branch > "simpler"). Remove the replacement refs. Isn't the whole purpose of using replace to avoid having to use filter-branch, which throws out all of the existing commit records, and creates an entirely new commit chain that is slightly modified? > 4. Use "git replace" to graft back on the pieces you cauterized. > Publish the result. If you are going to use filter-branch, then what do you need to replace? And publishing the result of a replace seems to have no effect, since other people do not get the replace ref when they clone. > 5. 
Perhaps also run and publish "git replace big simpler", so > contributors of branches based against the old 'big' can merge > your latest changes from 'simpler'. Encourage contributors to > use 'git rebase' or 'git filter-branch' to rebase their > contributions against the new, simpler history. Again, the entire point of replace seems to be to AVOID having to go through the hassle of having to rebase or filter-branch. Isn't that exactly how you would accomplish this before replace was added? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace
From: Jonathan Nieder @ 2011-01-07 20:51 UTC
To: Phillip Susi; +Cc: git, Christian Couder

Phillip Susi wrote:

> Isn't the whole purpose of using replace to avoid having to use
> filter-branch, which throws out all of the existing commit records, and
> creates an entirely new commit chain that is slightly modified?

No.  What documentation suggested that?  Maybe it can be fixed.

The original purpose of grafts (the ideological ancestor of replacement
refs) was to serve a very particular use case.  Sit down by the fire,
if you will, and...

Git had just come into existence and pack files did not exist yet.  A
full import of the Linux kernel history was possible but the result
was enormous and not something ready to be imposed on all Linux
contributors.  So what can one do?

 $ git show -s v2.6.12-rc2^0
 commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
 Author: Linus Torvalds <torvalds@ppc970.osdl.org>
 Date:   Sat Apr 16 15:20:36 2005 -0700

     Linux-2.6.12-rc2

     Initial git repository build. I'm not bothering with the full history,
     even though we have it. We can create a separate "historical" git
     archive of that later if we want to, and in the meantime it's about
     3.2GB when imported into git - space that would just make the early
     git days unnecessarily complicated, when we don't have a lot of good
     infrastructure for it.

     Let it rip!

Fast forward three months, and there is discussion[1] about what to do
with the historical git archive.  A clever idea: teach git to _pretend_
that the historical archive is the parent to v2.6.12-rc2, so "git log
--grep", "gitk", and so on work as they ought to.

So grafts were born.  One of the nicest advantages of grafts is that
they make it easy to do complex history surgery: make some grafts ---
cut here, paste there --- and then run "git filter-branch" to make it
permanent.

But grafts have a serious problem.  Transport machinery needs to
ignore grafts --- otherwise, the two ends of a connection could have
different ideas of the history preceding a commit, resulting in
confusion and breakage.  A fix to that was finally grafted on a few
years later (see also [2]).

 $ GIT_NOTES_REF=refs/remotes/charon/notes/full \
   git log --grep=graft --grep=repack --all-match --no-merges
 [...]
     git repack: keep commits hidden by a graft
 [...]
     Archived-At: <http://thread.gmane.org/gmane.comp.version-control.git/123874>

There is also the problem that grafts are too "raw": it is very easy
to make a graft pointing to a nonexistent object, say.  And meanwhile
git has no native support for transferring grafts over the wire.

In that context there emerged the nicer (imho) refs/replace mechanism:

 - reachability checking and transport machinery can treat them
   like all other references --- no need for low-level tools to
   pay attention to the artificial history;

 - easy to script around with "git replace" and "git for-each-ref"

 - can choose to fetch or not fetch with the usual
   "git fetch repo refs/replace/*:refs/replace/*" syntax

Common applications:

 - locally staging history changes that will later be made
   permanent with "git filter-branch";

 - grafting on additional (historical) history;

 - replacing ancient broken commits with fixed ones, for use by
   "git bisect".

Hope that helps,
Jonathan

[1] http://thread.gmane.org/gmane.comp.version-control.git/6470/focus=6484
found with "git log --grep=graft --reverse"
[2] http://thread.gmane.org/gmane.comp.version-control.git/37744/focus=37908
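To make the "fetch or not fetch" point concrete, here is a minimal sketch, assuming a scratch upstream built under mktemp with three empty-tree commits and a branch called "main" (all names and messages are placeholders, not from the thread). Upstream hides its oldest commit behind a parentless replacement; a plain clone still receives the full history and no replace refs, and an explicit refspec then brings over upstream's view.

```shell
set -eu
# Placeholder identity so commit-tree works without any user config (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

tmp=$(mktemp -d)
git init -q "$tmp/up"
cd "$tmp/up"

# History: ancient <- cut <- recent, all on the empty tree.
tree=$(git mktree </dev/null)
ancient=$(git commit-tree "$tree" -m "ancient")
cut=$(git commit-tree "$tree" -p "$ancient" -m "cut here")
recent=$(git commit-tree "$tree" -p "$cut" -m "recent")
git update-ref refs/heads/main "$recent"
git symbolic-ref HEAD refs/heads/main

# A parentless stand-in for "cut", installed as a replacement.
cut_root=$(git commit-tree "$tree" -m "cut here (older history elsewhere)")
git replace "$cut" "$cut_root"

up_view=$(git rev-list --count HEAD)                      # with the replacement
up_real=$(git --no-replace-objects rev-list --count HEAD) # without it

# A plain clone transfers the real history and no replace refs.
git clone -q "file://$tmp/up" "$tmp/down"
cd "$tmp/down"
clone_replacements=$(git replace -l | wc -l)
clone_count=$(git rev-list --count HEAD)

# Opt in to upstream's view with an explicit refspec.
git fetch -q origin 'refs/replace/*:refs/replace/*'
fetched_count=$(git rev-list --count HEAD)

echo "upstream view=$up_view real=$up_real"
echo "clone: replacements=$clone_replacements commits=$clone_count"
echo "after fetching refs/replace: commits=$fetched_count"
```

Note that fetching the replace ref adds an object to the clone rather than removing any, which matches the earlier observation that an additional reference can only increase what is downloaded.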
* Re: clone breaks replace 2011-01-07 20:51 ` Jonathan Nieder @ 2011-01-07 21:15 ` Stephen Bash 2011-01-07 21:34 ` Jonathan Nieder 2011-01-07 21:44 ` Phillip Susi 2 siblings, 0 replies; 34+ messages in thread From: Stephen Bash @ 2011-01-07 21:15 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git, Christian Couder, Phillip Susi ----- Original Message ----- > From: "Jonathan Nieder" <jrnieder@gmail.com> > To: "Phillip Susi" <psusi@cfl.rr.com> > Cc: git@vger.kernel.org, "Christian Couder" <chriscool@tuxfamily.org> > Sent: Friday, January 7, 2011 3:51:03 PM > Subject: Re: clone breaks replace > Phillip Susi wrote: > > > Isn't the whole purpose of using replace to avoid having to use > > filter-branch, which throws out all of the existing commit records, > > and creates an entirely new commit chain that is slightly modified? > > No. What documentation suggested that? Maybe it can be fixed. I'll chime in here as another person who read the ProGit blog entry on git-replace [1] and came to the same conclusion Phillip (and I'm guessing others) did. OTOH when I attempted to read the actual git-replace manpage, I got completely lost, so I retained my (apparently incorrect) understanding from ProGit. Thanks, Stephen [1] http://progit.org/2010/03/17/replace.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-07 20:51 ` Jonathan Nieder 2011-01-07 21:15 ` Stephen Bash @ 2011-01-07 21:34 ` Jonathan Nieder 2011-01-07 21:44 ` Phillip Susi 2 siblings, 0 replies; 34+ messages in thread From: Jonathan Nieder @ 2011-01-07 21:34 UTC (permalink / raw) To: Phillip Susi; +Cc: git, Christian Couder, Stephen Bash Jonathan Nieder wrote: > Transport machinery needs to ignore grafts --- otherwise, the two ends > of a connection could have different ideas of the history preceding a > commit, resulting in confusion and breakage. A fix to that was > finally grafted on a few years later (see also [2]). Sorry, I walked away mid-paragraph and left out a crucial piece when I returned. Because transport machinery ignores grafts, garbage collection must make sure not to remove pieces of the non-artificial history. It is the garbage collection that Dscho fixed with v1.6.4-rc3~7^2. Sorry for the nonsense. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-07 20:51 ` Jonathan Nieder 2011-01-07 21:15 ` Stephen Bash 2011-01-07 21:34 ` Jonathan Nieder @ 2011-01-07 21:44 ` Phillip Susi 2011-01-07 21:49 ` Jonathan Nieder 2 siblings, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-07 21:44 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git, Christian Couder On 1/7/2011 3:51 PM, Jonathan Nieder wrote: > Phillip Susi wrote: > >> Isn't the whole purpose of using replace to avoid having to use >> filter-branch, which throws out all of the existing commit records, and >> creates an entirely new commit chain that is slightly modified? > > No. What documentation suggested that? Maybe it can be fixed. It's just what made sense to me. If you can modify the history with filter-branch, then you don't need replace refs. The downside to filter-branch is that it breaks people tracking your repository, since the history they had been tracking is thrown out and replaced with a completely new commit chain that looks similar, but as far as git is concerned, is unrelated to the original. Replace refs seem to have been created to allow you to accomplish the goal of modifying an old commit record, but without having to rewrite that and all subsequent commits, causing breakage. > - can choose to fetch or not fetch with the usual > "git fetch repo refs/replace/*:refs/replace/*" syntax It seems like this should be the default behavior. Or perhaps refs/replace should be forked into one meant to be private, and one meant to be public, and fetched by default. Or maybe it should be fetched by default, but not pushed, so you have to explicitly push replacements to the public mirror that you intend for public consumption. Having the replace only apply locally and still needing to filter-branch to make the change visible to the public seems to render the replace somewhat pointless. 
Take the kernel history as an example, only imagine that Linus did not originally make that first commit leaving out the prior history, but wants to go back and fix it now. He can do it with a replace, but then if he runs filter-branch as you suggest to make the change 'real', then everyone tracking his tree will fail the next time they try to pull. You could get the same result without replace, so why bother? If the replace was fetched by default, the people already tracking would get it the next time they pull and would not have a problem. If they wanted to see the old history, then they would already have it in the repository and just need to add --no-replace-objects to see it, or run git log on the original commit id that the replace record should refer you to ( in the comments ). Those cloning the repository for the first time would get it, and avoid fetching all of the old history since they would be using the replace record in place of the original commit. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-07 21:44 ` Phillip Susi @ 2011-01-07 21:49 ` Jonathan Nieder 2011-01-07 22:09 ` Phillip Susi 2011-01-07 22:09 ` Jeff King 0 siblings, 2 replies; 34+ messages in thread From: Jonathan Nieder @ 2011-01-07 21:49 UTC (permalink / raw) To: Phillip Susi; +Cc: git, Christian Couder, Stephen Bash Phillip Susi wrote: > Take the kernel history as an example, only imagine that Linus did not > originally make that first commit leaving out the prior history, but > wants to go back and fix it now. He can do it with a replace, but then > if he runs filter-branch as you suggest to make the change 'real', then > everyone tracking his tree will fail the next time they try to pull. > You could get the same result without replace, so why bother? > > If the replace was fetched by default, the people already tracking would > get it the next time they pull and would not have a problem. Interesting. I hadn't thought about this detail before. > Those cloning the repository for the first > time would get it, and avoid fetching all of the old history since they > would be using the replace record in place of the original commit. No, it doesn't work that way. Imagine for a moment that each commit object actually contains all of its ancestors. That isn't precisely right but in a way it is close. To change the ancestry of a commit, you really do need to change its name. If you disagree, feel free to try it and I'd be glad to help where I can with the coding if the design is sane. Deal? Maybe it would be nice if git replace worked that way, but that would be fundamentally a _different_ feature. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-07 21:49 ` Jonathan Nieder @ 2011-01-07 22:09 ` Phillip Susi 2011-01-07 22:09 ` Jeff King 1 sibling, 0 replies; 34+ messages in thread From: Phillip Susi @ 2011-01-07 22:09 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git, Christian Couder, Stephen Bash On 1/7/2011 4:49 PM, Jonathan Nieder wrote: > No, it doesn't work that way. Imagine for a moment that each commit > object actually contains all of its ancestors. That isn't precisely > right but in a way it is close. > > To change the ancestry of a commit, you really do need to change its > name. If you disagree, feel free to try it and I'd be glad to help > where I can with the coding if the design is sane. Deal? That's why a replace record seems to be the perfect solution. The original record still references the old history, but you ignore it in favor of the replacement, which does not. Thus you have a choice; you ignore the replacement and use the original with the full history attached, or you respect the replacement and the history is truncated. As long as git-upload-pack respects the replacement, then new checkouts will ignore the old history. You could then create a new historical branch that points to the parent commit of the replaced one, and tell people to fetch that branch to get the old history, or pass --no-replace-objects over the wire to git-upload-pack. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace
From: Jeff King @ 2011-01-07 22:09 UTC
To: Jonathan Nieder; +Cc: Phillip Susi, git, Christian Couder, Stephen Bash

On Fri, Jan 07, 2011 at 03:49:07PM -0600, Jonathan Nieder wrote:

> Phillip Susi wrote:
>
> > Take the kernel history as an example, only imagine that Linus did not
> > originally make that first commit leaving out the prior history, but
> > wants to go back and fix it now.  He can do it with a replace, but then
> > if he runs filter-branch as you suggest to make the change 'real', then
> > everyone tracking his tree will fail the next time they try to pull.
> > You could get the same result without replace, so why bother?
> >
> > If the replace was fetched by default, the people already tracking would
> > get it the next time they pull and would not have a problem.
>
> Interesting.  I hadn't thought about this detail before.

I think there are two separate issues here:

  1. Should transport protocols respect replacements (i.e., if you
     truncate history with a replacement object and I fetch from you,
     should I get the full history or the truncated one)?

  2. Should clone fetch refs from refs/replace (either by default, or
     with an option)?

Based on previous discussions, I think the answer to the first is no.
The resulting repo violates a fundamental assumption of git. Yes,
because of the replacement object, many things will still work. But many
parts of git intentionally do not respect replacement, and they will be
broken.

Instead, I think of replacements as a specific view into history, not a
fundamental history-changing operation itself. Which means you can never
save bandwidth or space by truncating history with replacements. You can
only give somebody the full history, and share with them your view. If
you want to truncate, you must rewrite history[1].

Which leads to the second question. It is basically a matter of saying
"do you want to fetch the view that upstream has"? I can definitely see
that being useful, and meriting an option. However, it may or may not be
worth turning on by default, as upstream's view may be confusing.

-Peff

[1] Actually, what we are talking about is basically shallow clone.
Which does do exactly this truncation, but does not use the replace
mechanism. So it _is_ possible, but lots of things need to be tweaked to
understand the shallow-ness. Perhaps in the long run making git
understand replacement-truncated repos with missing objects would be a
good thing, and shallow clones can be implemented simply as a special
case of that. It would probably make the code a bit cleaner.
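For comparison, the shallow clone mentioned in the footnote can be tried as follows. This is a minimal sketch, assuming a scratch upstream under mktemp with three empty-tree commits and a branch called "main" (placeholder names, not from the thread). A file:// URL is used because --depth is ignored for plain local-path clones.

```shell
set -eu
# Placeholder identity so commit-tree works without any user config (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

tmp=$(mktemp -d)
git init -q "$tmp/up"
cd "$tmp/up"

# Three commits in a line, all on the empty tree.
tree=$(git mktree </dev/null)
c1=$(git commit-tree "$tree" -m "one")
c2=$(git commit-tree "$tree" -p "$c1" -m "two")
c3=$(git commit-tree "$tree" -p "$c2" -m "three")
git update-ref refs/heads/main "$c3"
git symbolic-ref HEAD refs/heads/main

# Truncate at transfer time: only the tip commit is downloaded.
git clone -q --depth 1 "file://$tmp/up" "$tmp/shallow"
cd "$tmp/shallow"
shallow_count=$(git rev-list --count HEAD)
# The truncation point is recorded in .git/shallow, not in refs/replace.
shallow_file=no
if [ -f .git/shallow ]; then shallow_file=yes; fi

# Ask for the rest of the history later.
git fetch -q --unshallow
full_count=$(git rev-list --count HEAD)

echo "shallow: commits=$shallow_count, .git/shallow present: $shallow_file"
echo "after --unshallow: commits=$full_count"
```

Unlike a replacement, the shallow boundary is local bookkeeping in the clone and does save bandwidth on the initial transfer.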
* Re: clone breaks replace 2011-01-07 22:09 ` Jeff King @ 2011-01-07 22:58 ` Junio C Hamano 2011-01-11 5:36 ` Jeff King 2011-01-08 0:43 ` Phillip Susi 1 sibling, 1 reply; 34+ messages in thread From: Junio C Hamano @ 2011-01-07 22:58 UTC (permalink / raw) To: Jeff King Cc: Jonathan Nieder, Phillip Susi, git, Christian Couder, Stephen Bash Jeff King <peff@peff.net> writes: > 2. Should clone fetch refs from refs/replace (either by default, or > with an option)? > ... > Which leads to the second question. It is basically a matter of saying > "do you want to fetch the view that upstream has"? I can definitely see > that being useful, and meriting an option. However, it may or may not be > worth turning on by default, as upstream's view may be confusing. I think that should be stated a bit differently. "Do you want to fetch the view that the upstream offers as an option, and if you want, which ones (meaning: there could be more than one replacement grafts to give different views)?" And as an optional view, I would say it is perfectly Ok to fetch whichever view you want as a separate step after the initial clone. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-07 22:58 ` Junio C Hamano @ 2011-01-11 5:36 ` Jeff King 2011-01-11 17:40 ` Junio C Hamano 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2011-01-11 5:36 UTC (permalink / raw) To: Junio C Hamano Cc: Jonathan Nieder, Phillip Susi, git, Christian Couder, Stephen Bash On Fri, Jan 07, 2011 at 02:58:34PM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > 2. Should clone fetch refs from refs/replace (either by default, or > > with an option)? > > ... > > Which leads to the second question. It is basically a matter of saying > > "do you want to fetch the view that upstream has"? I can definitely see > > that being useful, and meriting an option. However, it may or may not be > > worth turning on by default, as upstream's view may be confusing. > > I think that should be stated a bit differently. "Do you want to fetch > the view that the upstream offers as an option, and if you want, which > ones (meaning: there could be more than one replacement grafts to give > different views)?" Sure, I think that is a sane way for the user to think about it, but do we actually support multiple views? I thought replacement objects were all or nothing. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace
From: Junio C Hamano @ 2011-01-11 17:40 UTC
To: Jeff King
Cc: Jonathan Nieder, Phillip Susi, git, Christian Couder, Stephen Bash

Jeff King <peff@peff.net> writes:

> Sure, I think that is a sane way for the user to think about it, but do
> we actually support multiple views? I thought replacement objects were
> all or nothing.

It is not implausible for a long running large project to restart their
history from a physical root commit every year, stitching the year-long
segments together at their ends with replacements, to make a default clone
to get a year's worth of the most recent history while allowing people to
get more by asking, no?

Of course, if you trust shallow-clones, you do not have to do that kind of
history surgery ;-).
* Re: clone breaks replace
From: Jeff King @ 2011-01-11 17:50 UTC
To: Junio C Hamano
Cc: Jonathan Nieder, Phillip Susi, git, Christian Couder, Stephen Bash

On Tue, Jan 11, 2011 at 09:40:17AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
>
> > Sure, I think that is a sane way for the user to think about it, but do
> > we actually support multiple views? I thought replacement objects were
> > all or nothing.
>
> It is not implausible for a long running large project to restart their
> history from a physical root commit every year, stitching the year-long
> segments together at their ends with replacements, to make a default clone
> to get a year's worth of the most recent history while allowing people to
> get more by asking, no?

Oh, absolutely I think it is reasonable. I just meant that we do not
have a convenient way of saying "fetch these replace objects, but only
use this particular subset". I think you are stuck with something manual
like:

  # grab "view" from upstream and name it; let's imagine it links 2010
  # history into 2009
  git fetch origin refs/replace/$sha1:refs/views/2009/$sha1

  # now we feel like using them
  git for-each-ref --format='%(refname)' refs/views/2009 |
  while read -r ref; do
    git update-ref "refs/replace/${ref#refs/views/2009/}" "$ref"
  done

Which is a little overkill for the simple example you gave, but would
also handle something as complex as a view like "pretend the foo/
subtree never existed" or even "pretend the foo/ subtree existed all
along".

Not that I'm sure such things are actually sane to do, performance-wise.
The replace system is fast, but it was designed for a handful of
objects, not hundreds or thousands.

Anyway. My point is that we don't have the porcelain to do something
like managing views or enabling/disabling them in a sane manner.

-Peff
* Re: clone breaks replace
From: Jonathan Nieder @ 2011-01-11 17:56 UTC
To: Jeff King
Cc: Junio C Hamano, Phillip Susi, git, Christian Couder, Stephen Bash

Jeff King wrote:

> I think you are stuck with something manual
> like:
>
>   # grab "view" from upstream and name it; let's imagine it links 2010
>   # history into 2009
>   git fetch origin refs/replace/$sha1:refs/views/2009/$sha1
>
>   # now we feel like using them
>   git for-each-ref --format='%(refname)' refs/views/2009 |
>   while read -r ref; do
>     git update-ref "refs/replace/${ref#refs/views/2009/}" "$ref"
>   done
>
> Which is a little overkill for the simple example you gave, but would
> also handle something as complex as a view like "pretend the foo/
> subtree never existed" or even "pretend the foo/ subtree existed all
> along".
>
> Not that I'm sure such things are actually sane to do, performance-wise.
> The replace system is fast, but it was designed for a handful of
> objects, not hundreds or thousands.
>
> Anyway. My point is that we don't have the porcelain to do something
> like managing views or enabling/disabling them in a sane manner.

Maybe something like

	git fetch origin refs/views/2009/*:refs/replace/*

except that that does not provide a nice way to remove the replace
refs when done.

A potential usability enhancement might be to allow additional
replacement hierarchies to be requested on a per command basis, like

	GIT_REPLACE_REFS=refs/remotes/origin/views/2009 gitk --all

along the lines of GIT_NOTES_REF.
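The "views" idea can be tried end to end with existing commands. The following is a minimal sketch, assuming a scratch upstream under mktemp whose published branch starts at a physical root and whose older history is offered only through a hypothetical refs/views/2009/ namespace (that namespace, the branch name "main", and the commit messages are illustrative). It fetches the view straight into refs/replace/ and then removes it again by hand, which is the step that has no dedicated porcelain.

```shell
set -eu
# Placeholder identity so commit-tree works without any user config (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

tmp=$(mktemp -d)
git init -q "$tmp/up"
cd "$tmp/up"

tree=$(git mktree </dev/null)
old=$(git commit-tree "$tree" -m "2009 work")            # not on any branch
root=$(git commit-tree "$tree" -m "2010 root")           # physical root
tip=$(git commit-tree "$tree" -p "$root" -m "2010 work")
git update-ref refs/heads/main "$tip"
git symbolic-ref HEAD refs/heads/main

# The optional view: a copy of the 2010 root with the 2009 history as parent.
stitched=$(git commit-tree "$tree" -p "$old" -m "2010 root")
git update-ref "refs/views/2009/$root" "$stitched"

# Default clone: one year of history.
git clone -q "file://$tmp/up" "$tmp/down"
cd "$tmp/down"
default_count=$(git rev-list --count HEAD)

# Enable the view by fetching it into refs/replace/.
git fetch -q origin 'refs/views/2009/*:refs/replace/*'
view_count=$(git rev-list --count HEAD)

# Disable the view again: delete every replace ref by hand.
git for-each-ref --format='%(refname)' refs/replace |
while read -r ref; do
	git update-ref -d "$ref"
done
disabled_count=$(git rev-list --count HEAD)

echo "default=$default_count with-view=$view_count disabled=$disabled_count"
```

The cleanup loop deletes all replace refs, not only those that came from the view, so it is only safe in a repository with no other replacements.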
* Re: clone breaks replace 2011-01-11 17:56 ` Jonathan Nieder @ 2011-01-11 18:03 ` Jeff King 2011-01-11 19:32 ` Christian Couder 1 sibling, 0 replies; 34+ messages in thread From: Jeff King @ 2011-01-11 18:03 UTC (permalink / raw) To: Jonathan Nieder Cc: Junio C Hamano, Phillip Susi, git, Christian Couder, Stephen Bash On Tue, Jan 11, 2011 at 11:56:21AM -0600, Jonathan Nieder wrote: > Maybe something like > > git fetch origin refs/views/2009/*:refs/replace/* Heh, yeah, that is much simpler than what I did. :) > A potential usability enhancement might be to allow additional > replacement hierarchies to be requested on a per command basis, like > > GIT_REPLACE_REFS=refs/remotes/origin/views/2009 gitk --all > > along the lines of GIT_NOTES_REF. Yes, that is a much better solution, IMHO. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace
From: Christian Couder @ 2011-01-11 19:32 UTC
To: Jonathan Nieder
Cc: Jeff King, Junio C Hamano, Phillip Susi, git, Stephen Bash

Hi,

On Tuesday 11 January 2011 18:56:21 Jonathan Nieder wrote:
>
> A potential usability enhancement might be to allow additional
> replacement hierarchies to be requested on a per command basis, like
>
> 	GIT_REPLACE_REFS=refs/remotes/origin/views/2009 gitk --all
>
> along the lines of GIT_NOTES_REF.

Yes, it should not be much work to implement GIT_REPLACE_REFS like the
above, but I think it should accept a list of ref directories, for
example:

GIT_REPLACE_REFS=".:bisect:refs/remotes/origin/views/2009"

Best regards,
Christian.
* Re: clone breaks replace 2011-01-07 22:09 ` Jeff King 2011-01-07 22:58 ` Junio C Hamano @ 2011-01-08 0:43 ` Phillip Susi 2011-01-11 5:47 ` Jeff King 1 sibling, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-08 0:43 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Nieder, git, Christian Couder, Stephen Bash On 01/07/2011 05:09 PM, Jeff King wrote: > I think there are two separate issues here: > > 1. Should transport protocols respect replacements (i.e., if you > truncate history with a replacement object and I fetch from you, > should you get the full history or the truncated one)? > > 2. Should clone fetch refs from refs/replace (either by default, or > with an option)? > > Based on previous discussions, I think the answer to the first is no. > The resulting repo violates a fundamental assumption of git. Yes, > because of the replacement object, many things will still work. But many > parts of git intentionally do not respect replacement, and they will be > broken. What parts do not respect replacement? More importantly, what parts will be broken? The man page seems to indicate that about the only thing that does not by default is reachability testing, which to me means fsck and prune. It seems to be the purpose of replace to /prevent/ breakage and be respected by default, unless doing so would cause harm, which is why fsck and prune do not. > Instead, I think of replacements as a specific view into history, not a > fundamental history-changing operation itself. Which means you can never > save bandwidth or space by truncating history with replacements. You can > only give somebody the full history, and share with them your view. If > you want to truncate, you must rewrite history[1]. Right, but if you only care about that view, then there is no need to waste bandwidth fetching the original one. It goes without saying that people pulling from the repository mainly care about the view upstream chooses to publish. 
Upstream can choose to rewrite, which will cause breakage and is a sort of sneaky way to hide the original history, or they can use replace, which avoids the breakage and gives the client the choice of which view to use. ^ permalink raw reply [flat|nested] 34+ messages in thread
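The "view" behaviour debated in this message can be tried locally. The following is a minimal sketch, not taken from the thread: the repository layout (four commits A, B, C, D in a temporary directory), the demo identity, and the variable names are all assumptions made for illustration. It builds a parentless stand-in for B with git commit-tree, registers it with git replace, and compares what revision traversal sees with and without replacements.

```shell
#!/bin/sh
# Hypothetical demo repository: commits A, B, C, D on a single branch.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
for name in A B C D; do
    echo "$name" >file
    git add file
    git commit -q -m "$name"
done

# Remember the real ids of A and B before any replacement exists.
a=$(git rev-parse HEAD~3)
b=$(git rev-parse HEAD~2)

# Build B': same tree as B but no parent, so history appears to stop there.
b_prime=$(echo "B' (older history truncated)" | git commit-tree "$b^{tree}")
git replace "$b" "$b_prime"

# Revision traversal follows the replacement unless told not to.
view_count=$(git rev-list --count HEAD)
real_count=$(git --no-replace-objects rev-list --count HEAD)
echo "commits seen with replacement:    $view_count"
echo "commits seen without replacement: $real_count"

# The original objects are untouched: A is still in the object database.
a_type=$(git cat-file -t "$a")
echo "object A is still present as a $a_type"
```

Nothing is removed from the object database here; only the traversal changes, which is the sense in which Jeff calls a replacement a view rather than a history-changing operation.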
* Re: clone breaks replace 2011-01-08 0:43 ` Phillip Susi @ 2011-01-11 5:47 ` Jeff King 2011-01-11 6:52 ` Jonathan Nieder 2011-01-11 15:24 ` Phillip Susi 0 siblings, 2 replies; 34+ messages in thread From: Jeff King @ 2011-01-11 5:47 UTC (permalink / raw) To: Phillip Susi; +Cc: Jonathan Nieder, git, Christian Couder, Stephen Bash On Fri, Jan 07, 2011 at 07:43:40PM -0500, Phillip Susi wrote: > >Based on previous discussions, I think the answer to the first is no. > >The resulting repo violates a fundamental assumption of git. Yes, > >because of the replacement object, many things will still work. But many > >parts of git intentionally do not respect replacement, and they will be > >broken. > > What parts do not respect replacement? More importantly, what parts > will be broken? The man page seems to indicate that about the only > thing that does not by default is reachability testing, which to me > means fsck and prune. It seems to be the purpose of replace to > /prevent/ breakage and be respected by default, unless doing so would > cause harm, which is why fsck and prune do not. Off the top of my head, I don't know. I suspect it would take somebody writing a patch to create such an incomplete repository (or making one manually) and seeing how badly things broke. Maybe nothing would, and I am being overly conservative. It just makes me nervous to start violating what has always been a fundamental assumption about the object database (though as I pointed out, we did start violating it with shallow clones, so maybe it is not so bad). > >Instead, I think of replacements as a specific view into history, not a > >fundamental history-changing operation itself. Which means you can never > >save bandwidth or space by truncating history with replacements. You can > >only give somebody the full history, and share with them your view. If > >you want to truncate, you must rewrite history[1]. 
> > Right, but if you only care about that view, then there is no need to > waste bandwidth fetching the original one. It goes without saying > that people pulling from the repository mainly care about the view > upstream chooses to publish. Upstream can choose to rewrite, which > will cause breakage and is a sort of sneaky way to hide the original > history, or they can use replace, which avoids the breakage and gives > the client the choice of which view to use. Once you have fetched with that view, how locked into that view are you? Certainly you can never push to or be the fetch remote for another repository that does not want to respect that view, because you simply don't have the objects to complete the history for them. But what about deepening your own repo? In your proposal, I contact the server and ask for the replacement refs along with the branch refs. For the history of the branches, it gives me the truncated version with the replacement objects, right? Now how do I go back later and say "I'm interested in getting the rest of history, give me the real one"? I guess you can get the parent pointer from the real, "non-replaced" object and ask for it. But you can't ask for a specific commit, so for every such truncation, the parent needs to publish an extra ref (but _not_ make it one of the ones fetched by default, or it would nullify your original shallow fetch), and we need to contact them and find that ref. So I guess it's do-able, but there are a few interesting corners. I think somebody would need to whip up a proof of concept patch to explore those corners. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 5:47 ` Jeff King @ 2011-01-11 6:52 ` Jonathan Nieder 2011-01-11 15:37 ` Phillip Susi 2011-01-11 15:24 ` Phillip Susi 1 sibling, 1 reply; 34+ messages in thread From: Jonathan Nieder @ 2011-01-11 6:52 UTC (permalink / raw) To: Jeff King; +Cc: Phillip Susi, git, Christian Couder, Stephen Bash Jeff King wrote: > On Fri, Jan 07, 2011 at 07:43:40PM -0500, Phillip Susi wrote: >> What parts do not respect replacement? More importantly, what parts >> will be broken? [...] > Off the top of my head, I don't know. I suspect it would take somebody > writing a patch to create such an incomplete repository (or making one > manually) and seeing how badly things broke. I have two worries: - first, how easily can the replacement be undone? (as you mention below) - second, what happens if the two ends of transport have different replacements? That second worry is the more major in my opinion. Shallow clones are a different story --- they do not fundamentally change the history and they have special support in git protocol. It is possible to punt on both by saying that (1) replacements _cannot_ be undone --- a second replacement is needed --- and (2) the receiving end of a connection is not allowed to have any replacements for objects in common that the sending end does not have, but then does that buy you anything significant over a filter-branch? ^ permalink raw reply [flat|nested] 34+ messages in thread
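For comparison, the special support that shallow clones already have can be observed directly. This is a minimal sketch under assumptions of my own (a throwaway four-commit upstream, a depth of 2, hypothetical directory names): the client records its cut point in .git/shallow, and that record is what later lets it ask for the rest of the history.

```shell
#!/bin/sh
# Hypothetical upstream with commits A, B, C, D, cloned with --depth 2.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q upstream
cd upstream
for name in A B C D; do
    echo "$name" >file
    git add file
    git commit -q -m "$name"
done
c=$(git rev-parse HEAD~1)   # C will become the shallow boundary

# A file:// URL is used so that --depth is honoured for a local path.
cd "$work"
git clone -q --depth 2 "file://$work/upstream" shallow
cd shallow

shallow_count=$(git rev-list --count HEAD)
boundary=$(cat .git/shallow)   # commit whose parents were not transferred
echo "shallow clone has $shallow_count commits, cut at $boundary"

# Deepening later is possible because the boundary is recorded.
git fetch -q --unshallow
full_count=$(git rev-list --count HEAD)
echo "after --unshallow: $full_count commits"
```

A clone truncated by a replacement ref has no equivalent record, which is one way to read the worry raised in this message about the two ends disagreeing on which objects exist.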
* Re: clone breaks replace 2011-01-11 6:52 ` Jonathan Nieder @ 2011-01-11 15:37 ` Phillip Susi 2011-01-11 18:22 ` Jonathan Nieder 0 siblings, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-11 15:37 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jeff King, git, Christian Couder, Stephen Bash On 1/11/2011 1:52 AM, Jonathan Nieder wrote: > I have two worries: > > - first, how easily can the replacement be undone? (as you mention > below) git replace -d id, or git --no-replace-objects. It also might be nice to add a new switch to git replace to disable a replace without deleting it, so that it can later be enabled again. > - second, what happens if the two ends of transport have different > replacements? Then you have a conflict, just like if the two ends have different tags with the same name. > That second worry is the more major in my opinion. Shallow clones are > a different story --- they do not fundamentally change the history and > they have special support in git protocol. It is possible to punt on > both by saying that (1) replacements _cannot_ be undone --- a second > replacement is needed --- and (2) the receiving end of a connection is > not allowed to have any replacements for objects in common that the > sending end does not have, but then does that buy you anything > significant over a filter-branch? One of the major advantages of replacements is that they can easily be undone, so defeating that would be silly. Just like with conflicting tags, if the receiving end has conflicting replacements, they will be kept instead of the remote version and a warning issued. If you want the remote version, delete your local one and fetch again. What it buys you over filter-branch is: 1) Those tracking your repo don't have breakage when they next fetch because the chain of commits they were tracking has been destroyed and replaced by a completely different one. 2) It is obvious when a replace has been done, and the original is still available. 
This is good for auditing and traceability. Paper trails are good. 3) Inserting a replace record takes a lot less cpu and IO than filter-branch rewriting the entire chain. ^ permalink raw reply [flat|nested] 34+ messages in thread
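The two ways of undoing a replacement mentioned at the top of this message can be sketched as follows. The repository, commit names, and variables are hypothetical; the GIT_NO_REPLACE_OBJECTS environment variable is used here as the per-command equivalent of the --no-replace-objects option. The switch for disabling a replacement without deleting it is only a proposal in this thread and is not shown.

```shell
#!/bin/sh
# Hypothetical demo: truncate history at B, bypass the replacement, delete it.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
for name in A B C D; do
    echo "$name" >file
    git add file
    git commit -q -m "$name"
done

b=$(git rev-parse HEAD~2)
b_prime=$(echo "B' (older history truncated)" | git commit-tree "$b^{tree}")
git replace "$b" "$b_prime"
truncated=$(git rev-list --count HEAD)

# Temporary bypass: the replacement ref stays in place but is ignored.
bypassed=$(GIT_NO_REPLACE_OBJECTS=1 git rev-list --count HEAD)

# Permanent undo: delete the replacement ref for B.
git replace -d "$b" >/dev/null
restored=$(git rev-list --count HEAD)
remaining=$(git replace -l | wc -l | tr -d ' ')

echo "truncated=$truncated bypassed=$bypassed restored=$restored"
echo "replacement refs left: $remaining"
```

Both forms only work this easily because the original commits are still present locally, which is the assumption the rest of the thread goes on to question.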
* Re: clone breaks replace 2011-01-11 15:37 ` Phillip Susi @ 2011-01-11 18:22 ` Jonathan Nieder 2011-01-11 18:42 ` Phillip Susi 0 siblings, 1 reply; 34+ messages in thread From: Jonathan Nieder @ 2011-01-11 18:22 UTC (permalink / raw) To: Phillip Susi; +Cc: Jeff King, git, Christian Couder, Stephen Bash Hi, Thoughts on use cases. Jeff already explained the main protocol problem to be solved very well (thanks!). Phillip Susi wrote: > 1) Those tracking your repo don't have breakage when they next fetch > because the chain of commits they were tracking has been destroyed and > replaced by a completely different one. This does not require transport respecting replacements. Just start a new line of history and teach "git pull" to pull replacement refs first when requested in the refspec. It could work like this: alice$ git branch historical alice$ git checkout --orphan newline alice$ git branch newroot alice$ ... hack hack hack ... alice$ git replace newroot historical alice$ git push world refs/replace/* +HEAD:master bob$ git remote show origin URL: git://git.alice.example.com/project.git Ref specifier: refs/replace/*:refs/replace/* refs/heads/*:refs/remotes/origin/* HEAD branch: master Remote branch: master tracked Local branch configured for 'git pull': master merges with remote master bob$ git pull remote: Counting objects: 18, done. remote: Compressing objects: 100% (11/11), done. remote: Total 11 (delta 8), reused 0 (delta 0) Unpacking objects: 100% (11/11), done. From git://git.alice.example.com/project.git * [new replacement] 87a8c7yc65c87c98c87c6a87c8a -> replace/87a8c7yc65c87c98c87c6a87c8a a78c9df..8c98df9 master -> origin/master > 2) It is obvious when a replace has been done, and the original is > still available. This is good for auditing and traceability. Paper > trails are good. With the method you are suggesting, others do _not_ always have the original still available. 
After I fetch from you with --respect-hard-replacements, then while I am on an airplane I will have this hard replacement ref staring at me that I cannot remove. If the original goes missing or gets corrupted on the few machines that had it, the hard replacement ref is permanent. > 3) Inserting a replace record takes a lot less cpu and IO than > filter-branch rewriting the entire chain. If the modified history is much shorter than the original (as in the use case you described), would building it really take so much CPU and I/O? Moreover, is the extra CPU time to keep checking all the replacements on the client side worth saving that one-time CPU time expenditure on the server? If (and only if) so then I see how that could be an advantage. Sorry for the longwinded message. Hope that helps, Jonathan ^ permalink raw reply [flat|nested] 34+ messages in thread
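The refspec shown in the hypothetical "git remote show" output above can already be passed to git fetch by hand. A minimal sketch, assuming a throwaway upstream with commits A, B, C, D and a replacement truncating history at B (all names are illustrative): a plain clone receives the full history and no replacement refs, and fetching refs/replace/* afterwards brings over the publisher's view without removing any objects.

```shell
#!/bin/sh
# Hypothetical upstream that publishes a truncating replacement ref.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q upstream
cd upstream
for name in A B C D; do
    echo "$name" >file
    git add file
    git commit -q -m "$name"
done
b=$(git rev-parse HEAD~2)
b_prime=$(echo "B' (older history truncated)" | git commit-tree "$b^{tree}")
git replace "$b" "$b_prime"

# A plain clone copies branches and tags, but not refs/replace.
cd "$work"
git clone -q upstream clone
cd clone
refs_after_clone=$(git replace -l | wc -l | tr -d ' ')
count_after_clone=$(git rev-list --count HEAD)

# Explicitly ask for the replacement refs.
git fetch -q origin 'refs/replace/*:refs/replace/*'
refs_after_fetch=$(git replace -l | wc -l | tr -d ' ')
count_after_fetch=$(git rev-list --count HEAD)
full_count=$(git --no-replace-objects rev-list --count HEAD)

echo "after clone: $refs_after_clone replace refs, $count_after_clone commits"
echo "after fetch: $refs_after_fetch replace refs, $count_after_fetch commits"
echo "original history still available: $full_count commits"
```

Adding the same refspec as an extra fetch line in the remote's configuration would make later fetches pick up replacement refs too; whether "git pull" should do that by default is the open question in this message.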
* Re: clone breaks replace 2011-01-11 18:22 ` Jonathan Nieder @ 2011-01-11 18:42 ` Phillip Susi 0 siblings, 0 replies; 34+ messages in thread From: Phillip Susi @ 2011-01-11 18:42 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Jeff King, git, Christian Couder, Stephen Bash On 1/11/2011 1:22 PM, Jonathan Nieder wrote: >> 1) Those tracking your repo don't have breakage when they next fetch >> because the chain of commits they were tracking has been destroyed and >> replaced by a completely different one. > > This does not require transport respecting replacements. Just start > a new line of history and teach "git pull" to pull replacement refs > first when requested in the refspec. That's what I've been saying. My statement that you quote above is stating why git replace is better than git filter-branch. >> 2) It is obvious when a replace has been done, and the original is >> still available. This is good for auditing and traceability. Paper >> trails are good. > > With the method you are suggesting, others do _not_ always have the > original still available. After I fetch from you with > --respect-hard-replacements, then while I am on an airplane I will > have this hard replacement ref staring at me that I cannot remove. They may not have it in their local repository, but it is clear that there IS an original history, and the replace record comment should tell them from where they can fetch it, and those tracking the repository before the replace was added already have it. Using filter-branch on the other hand, is a sort of dirty hack that violates the integrity constraints normally in place, and can leave you with a history that has no indication that there ever was more. > If the original goes missing or gets corrupted on the few machines > that had it, the hard replacement ref is permanent. I think it goes without saying that if you lose part of the repository, and there are no other copies, then you have lost part of the repository. 
> If the modified history is much shorter than the original (as in the > use case you described), would building it really take so much CPU and > I/O? Moreover, is the extra CPU time to keep checking all the > replacements on the client side worth saving that one-time CPU time > expenditure on the server? It would take more than just inserting the replace record. I'm not sure what you mean by "keep checking all the replacements on the client side". ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 5:47 ` Jeff King 2011-01-11 6:52 ` Jonathan Nieder @ 2011-01-11 15:24 ` Phillip Susi 2011-01-11 17:39 ` Jeff King 1 sibling, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-11 15:24 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Nieder, git, Christian Couder, Stephen Bash On 1/11/2011 12:47 AM, Jeff King wrote: > Once you have fetched with that view, how locked into that view are you? > Certainly you can never push to or be the fetch remote for another > repository that does not want to respect that view, because you simply > don't have the objects to complete the history for them. If you want to fetch the original history, then it is as simple as git --no-replace-objects fetch. Unless of course, the upstream repository actually removed the original history ( or you are pulling from someone else who only pulled the truncated history ), possibly transplanting it to a historical repository that they should refer you to in the message of the replace commit. Then you just fetch from there instead, and voilà! You have the complete original history. > I guess you can get the parent pointer from the real, "non-replaced" > object and ask for it. But you can't ask for a specific commit, so for > every such truncation, the parent needs to publish an extra ref (but > _not_ make it one of the ones fetched by default, or it would nullify > your original shallow fetch), and we need to contact them and find that > ref. Yes, either a new branch or separate historical repository could be published to pull the original history from, or git would need to pass the --no-replace-objects flag to git-upload-pack on the server, causing it to ignore the replace and send the original history. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 15:24 ` Phillip Susi @ 2011-01-11 17:39 ` Jeff King 2011-01-11 19:48 ` Johannes Sixt 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2011-01-11 17:39 UTC (permalink / raw) To: Phillip Susi; +Cc: Jonathan Nieder, git, Christian Couder, Stephen Bash On Tue, Jan 11, 2011 at 10:24:01AM -0500, Phillip Susi wrote: > Yes, either a new branch or separate historical repository could be > published to pull the original history from, or git would need to pass > the --no-replace-objects flag to git-upload-pack on the server, causing > it to ignore the replace and send the original history. AFAIK, git can't pass --no-replace-objects to the server over git:// (or smart http). You would need a protocol extension. And here's another corner case I thought of: Suppose you have some server S1 with this history: A--B--C--D and a replace object truncating history to look like: B'--C--D You clone from S1 and have only commits B', C, and D (or maybe even B, depending on the implementation). But definitely not A, nor its associated tree and blobs. Now you want to fetch from another server S2, which built some commits on the original history: A--B--C--D--E--F You and S2 negotiate that you both have D, which implies that you have all of the ancestors of D. S2 therefore sends you a thin pack containing E and F, which may contain deltas against objects found in D or its ancestors. Some of which may be only in A, which means you do not have them. Aside from fetching the entire real history, the only solution is that you somehow have to communicate to S2 exactly which objects you have, presumably by telling them which replacements you have used to arrive at the object set you have. Which in the general case would mean actually shipping them your replacement refs and objects (simply handling the special case of commit truncation isn't sufficient; you could have replaced any object with any other one). 
-Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 17:39 ` Jeff King @ 2011-01-11 19:48 ` Johannes Sixt 2011-01-11 19:51 ` Jeff King 0 siblings, 1 reply; 34+ messages in thread From: Johannes Sixt @ 2011-01-11 19:48 UTC (permalink / raw) To: Jeff King Cc: Phillip Susi, Jonathan Nieder, git, Christian Couder, Stephen Bash On Dienstag, 11. Januar 2011, Jeff King wrote: > On Tue, Jan 11, 2011 at 10:24:01AM -0500, Phillip Susi wrote: > > Yes, either a new branch or separate historical repository could be > > published to pull the original history from, or git would need to pass > > the --no-replace-objects flag to git-upload-pack on the server, causing > > it to ignore the replace and send the original history. > > AFAIK, git can't pass --no-replace-objects to the server over git:// (or > smart http). You would need a protocol extension. Why would you have to? git-upload-pack never looks at replacement objects. > And here's another corner case I thought of: > > Suppose you have some server S1 with this history: > > A--B--C--D > > and a replace object truncating history to look like: > > B'--C--D > > You clone from S1 and have only commits B', C, and D (or maybe even B, > depending on the implementation). But definitely not A, nor its > associated tree and blobs. Why so? Cloning transfers the database using git-upload-pack, git-pack-objects, git-index-pack, and git-unpack-objects. All of them have object replacements disabled. (And AFAICS, there is no possibility to *enable* it.) Therefore, after cloning you get A--B--C--D and perhaps also the replacement object B'. Hint: git grep read_replace_refs -- Hannes ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 19:48 ` Johannes Sixt @ 2011-01-11 19:51 ` Jeff King 2011-01-11 20:00 ` Johannes Sixt 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2011-01-11 19:51 UTC (permalink / raw) To: Johannes Sixt Cc: Phillip Susi, Jonathan Nieder, git, Christian Couder, Stephen Bash On Tue, Jan 11, 2011 at 08:48:57PM +0100, Johannes Sixt wrote: > On Dienstag, 11. Januar 2011, Jeff King wrote: > > On Tue, Jan 11, 2011 at 10:24:01AM -0500, Phillip Susi wrote: > > > Yes, either a new branch or separate historical repository could be > > > published to pull the original history from, or git would need to pass > > > the --no-replace-objects flag to git-upload-pack on the server, causing > > > it to ignore the replace and send the original history. > > > > AFAIK, git can't pass --no-replace-objects to the server over git:// (or > > smart http). You would need a protocol extension. > > Why would you have to? git-upload-pack never looks at replacement objects. I think you missed the first part of this discussion. Phillip is proposing that it should, and I am arguing against it. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 19:51 ` Jeff King @ 2011-01-11 20:00 ` Johannes Sixt 2011-01-11 20:22 ` Phillip Susi 0 siblings, 1 reply; 34+ messages in thread From: Johannes Sixt @ 2011-01-11 20:00 UTC (permalink / raw) To: Jeff King Cc: Phillip Susi, Jonathan Nieder, git, Christian Couder, Stephen Bash On Dienstag, 11. Januar 2011, Jeff King wrote: > I think you missed the first part of this discussion. Phillip is > proposing that it should, and I am arguing against it. You're right, sorry for the noise. Now I understand this three-word-subject. -- Hannes ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 20:00 ` Johannes Sixt @ 2011-01-11 20:22 ` Phillip Susi 2011-01-11 20:50 ` Jonathan Nieder 0 siblings, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-11 20:22 UTC (permalink / raw) To: Johannes Sixt Cc: Jeff King, Jonathan Nieder, git, Christian Couder, Stephen Bash On 1/11/2011 3:00 PM, Johannes Sixt wrote: > On Dienstag, 11. Januar 2011, Jeff King wrote: >> I think you missed the first part of this discussion. Phillip is >> proposing that it should, and I am arguing against it. > > You're right, sorry for the noise. Now I understand this three-word-subject. What it really comes down to is that you can use replace locally to modify your history and it works great. As soon as someone clones from you though, they don't get the replace and so they end up with a different history than you see. I suggested that git-upload-pack should respect replace records by default, so that people cloning your repository will get the same replaced history instead of the original. It seems that the recommended use of replace is to locally append history back on, after it has been removed upstream with git filter-branch. Using filter-branch is bad, so it makes more sense to me to do the remove with git replace, and then if you want to add it back, you just have to disable the replace ( and maybe fetch additional objects ). The one problem that has come up is that when you fetch and tell the server you have a commit after the replace, it assumes that you also have the commits prior to the replace and may delta against objects you do not have. Fixing that would require informing the server of any replacements you have, and it being able to use that information to avoid deltas against objects hidden by the replace. Does that sound like a pretty good summary to everyone? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 20:22 ` Phillip Susi @ 2011-01-11 20:50 ` Jonathan Nieder 2011-01-12 0:59 ` Phillip Susi 0 siblings, 1 reply; 34+ messages in thread From: Jonathan Nieder @ 2011-01-11 20:50 UTC (permalink / raw) To: Phillip Susi Cc: Johannes Sixt, Jeff King, git, Christian Couder, Stephen Bash Phillip Susi wrote: > It seems that the recommended use of replace is to locally append > history back on, after it has been removed upstream with git > filter-branch. Using filter-branch is bad, so it makes more sense to me > to do the remove with git replace, and then if you want to add it back, > you just have to disable the replace ( and maybe fetch additional objects ). > > The one problem that has come up is that when you fetch and tell the > server you have a commit after the replace, it assumes that you also > have the commits prior to the replace and may delta against objects you > do not have. Fixing that would require informing the server of any > replacements you have, and it being able to use that information to > avoid deltas against objects hidden by the replace. > > Does that sound like a pretty good summary to everyone? Yes, except for "Using filter-branch is bad". Using filter-branch is not bad. Also there are many recommended uses of replace: for example, to swap out a commit that builds for one that doesn't when using "git bisect", or to stage history changes before making them permanent with filter-branch. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: clone breaks replace 2011-01-11 20:50 ` Jonathan Nieder @ 2011-01-12 0:59 ` Phillip Susi 2011-01-14 20:53 ` small downloads and immutable history (Re: clone breaks replace) Jonathan Nieder 0 siblings, 1 reply; 34+ messages in thread From: Phillip Susi @ 2011-01-12 0:59 UTC (permalink / raw) To: Jonathan Nieder Cc: Johannes Sixt, Jeff King, git, Christian Couder, Stephen Bash On 01/11/2011 03:50 PM, Jonathan Nieder wrote: > Yes, except for "Using filter-branch is bad". Using filter-branch is > not bad. It is bad because it breaks people tracking your branch, and violates the immutability of history. ^ permalink raw reply [flat|nested] 34+ messages in thread
* small downloads and immutable history (Re: clone breaks replace) 2011-01-12 0:59 ` Phillip Susi @ 2011-01-14 20:53 ` Jonathan Nieder 2011-01-15 5:27 ` Phillip Susi 0 siblings, 1 reply; 34+ messages in thread From: Jonathan Nieder @ 2011-01-14 20:53 UTC (permalink / raw) To: Phillip Susi Cc: Johannes Sixt, Jeff King, git, Christian Couder, Stephen Bash Phillip Susi wrote: > On 01/11/2011 03:50 PM, Jonathan Nieder wrote: >> Yes, except for "Using filter-branch is bad". Using filter-branch is >> not bad. > > It is bad because it breaks people tracking your branch, and > violates the immutability of history. Ah, I forgot the use case. If you are using this to at long last get past the limitations (e.g., inability to push) of "fetch --depth", then yes, rewriting existing history is bad. So what's left is some way to make the "have" part of transport negotiation make sense in this context. I'll be happy if it happens. Thanks for clarifying. Jonathan [note: if you occasionally use git commit; # new commit git tag tmp git checkout --orphan newroot git replace newroot tmp git tag -d tmp so the history without replacement refs is short, no rewriting of history has to take place. Some testing and tweaking might be required to make "git pull" continue to fast-forward.] ^ permalink raw reply [flat|nested] 34+ messages in thread
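The recipe in the bracketed note above can be sketched end to end. This is a hedged reading of it, not a tested workflow from the thread: the four-commit starting history and the variable names are hypothetical, and the explicit commit on the orphan branch is an addition of mine, since git replace needs an existing commit named newroot before it can replace it.

```shell
#!/bin/sh
# Hypothetical demo: start a short new line of history and graft the
# old history behind it with a replacement ref.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
for name in A B C D; do
    echo "$name" >file
    git add file
    git commit -q -m "$name"
done

git tag tmp                        # temporary name for the old tip
git checkout -q --orphan newroot   # new branch with no history yet
# Assumed extra step: record the current tree as a parentless commit.
git commit -q -m "new root (older history available via replace ref)"
git replace newroot tmp            # readers of the replace ref see the old tip
git tag -d tmp >/dev/null

short_count=$(git --no-replace-objects rev-list --count newroot)
long_count=$(git rev-list --count newroot)
echo "real history of newroot:      $short_count commit(s)"
echo "history seen via replacement: $long_count commit(s)"
```

In this arrangement the real history is the short one, so a clone that ignores refs/replace stays small, and the long history is the optional view kept alive by the replacement ref.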
* Re: small downloads and immutable history (Re: clone breaks replace) 2011-01-14 20:53 ` small downloads and immutable history (Re: clone breaks replace) Jonathan Nieder @ 2011-01-15 5:27 ` Phillip Susi 0 siblings, 0 replies; 34+ messages in thread From: Phillip Susi @ 2011-01-15 5:27 UTC (permalink / raw) To: Jonathan Nieder Cc: Johannes Sixt, Jeff King, git, Christian Couder, Stephen Bash On 01/14/2011 03:53 PM, Jonathan Nieder wrote: > Ah, I forgot the use case. If you are using this to at long last get > past the limitations (e.g., inability to push) of "fetch --depth", > then yes, rewriting existing history is bad. I'm not really talking about using --depth, but more of the project deciding to truncate the history in the central repository. > So what's left is some way to make the "have" part of transport > negotiation make sense in this context. I'll be happy if it happens. Good point. Whether local history is short because of --depth or replace records, the same problem arises; the negotiation needs to be able to exclude older objects that are not present locally, rather than assuming that the client has the entire history if it has any at all. It seems like this should just require sending the server an end point in addition to a start point. In other words, not just send ID of the most recent commit, but also the oldest that it has on hand, so that the server can be sure that it does not deltify against objects prior to that commit. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2011-01-15 5:27 UTC | newest] Thread overview: 34+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-01-06 21:00 clone breaks replace Phillip Susi 2011-01-06 21:33 ` Jonathan Nieder 2011-01-06 21:59 ` Junio C Hamano 2011-01-07 19:43 ` Phillip Susi 2011-01-07 20:51 ` Jonathan Nieder 2011-01-07 21:15 ` Stephen Bash 2011-01-07 21:34 ` Jonathan Nieder 2011-01-07 21:44 ` Phillip Susi 2011-01-07 21:49 ` Jonathan Nieder 2011-01-07 22:09 ` Phillip Susi 2011-01-07 22:09 ` Jeff King 2011-01-07 22:58 ` Junio C Hamano 2011-01-11 5:36 ` Jeff King 2011-01-11 17:40 ` Junio C Hamano 2011-01-11 17:50 ` Jeff King 2011-01-11 17:56 ` Jonathan Nieder 2011-01-11 18:03 ` Jeff King 2011-01-11 19:32 ` Christian Couder 2011-01-08 0:43 ` Phillip Susi 2011-01-11 5:47 ` Jeff King 2011-01-11 6:52 ` Jonathan Nieder 2011-01-11 15:37 ` Phillip Susi 2011-01-11 18:22 ` Jonathan Nieder 2011-01-11 18:42 ` Phillip Susi 2011-01-11 15:24 ` Phillip Susi 2011-01-11 17:39 ` Jeff King 2011-01-11 19:48 ` Johannes Sixt 2011-01-11 19:51 ` Jeff King 2011-01-11 20:00 ` Johannes Sixt 2011-01-11 20:22 ` Phillip Susi 2011-01-11 20:50 ` Jonathan Nieder 2011-01-12 0:59 ` Phillip Susi 2011-01-14 20:53 ` small downloads and immutable history (Re: clone breaks replace) Jonathan Nieder 2011-01-15 5:27 ` Phillip Susi