From: Johan Herland <johan@herland.net>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
"Stephen R. van den Berg" <srb@cuci.nl>,
Denis Bueno <dbueno@gmail.com>
Subject: Re: To graft or not to graft... (Re: Recovering from repository corruption)
Date: Thu, 12 Jun 2008 12:21:02 +0200
Message-ID: <200806121221.02287.johan@herland.net>
In-Reply-To: <20080612074752.GA507@sigill.intra.peff.net>
On Thursday 12 June 2008, Jeff King wrote:
> On Thu, Jun 12, 2008 at 09:14:21AM +0200, Johan Herland wrote:
> > > The grafts file isn't part of the object stream and refs, and
> > > clones (and fetches) very much just copy the object database.
> >
> > AFAICS, there's already a perfectly fine way to distribute grafted
> > history:
> > 1. Add a grafts file
> > 2. Run git-filter-branch
> > 3. Remove grafts file
> > 4. Distribute repo
> > 5. Profit!
> >
> > Since git-filter-branch turns grafted parentage into _real_
> > parentage, there's no point in ever having a grafts file at all
> > (except transiently for telling git-filter-branch what to do).
>
> But then you have rewritten all of the later commits, so you can no
> longer talk to other people about them.
Correct. My point is that if you want to talk to people about revisions,
you'd better do it from a repo where people agree on the entire
history. On the other hand, if you want to do archaeology with grafts,
you should be aware that you are subverting one of the core guarantees
provided by Git (i.e. a commit id verifies full ancestry of a commit),
and therefore shouldn't communicate with other repos _at_ _all_, as
other repos can easily be confused (see [1]).
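To make the "commit id verifies full ancestry" point concrete, here is a minimal sketch. It assumes a Git version that still reads .git/info/grafts; the temporary repo, the identity settings and the variable names are made up for illustration. It shows that a graft changes the parent Git reports when walking history, while the commit id and the parent stored in the commit object stay the same:

```shell
set -e
# Throwaway identity so the commits below do not depend on user config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q

# Build a linear history A---B---C---D.
for msg in A B C D; do
    echo "$msg" >> foo
    git add foo
    git commit -q -m "$msg"
done
b=$(git rev-parse HEAD~2)
c=$(git rev-parse HEAD~1)
d=$(git rev-parse HEAD)

# Graft: pretend that D's parent is B, hiding C.
mkdir -p .git/info
echo "$d $b" > .git/info/grafts

# The parent recorded inside the commit object (what the commit id covers).
stored_parent=$(git cat-file -p "$d" | sed -n 's/^parent //p')
# The parent reported by a history walk (which honours the graft).
apparent_parent=$(git rev-list --parents -n 1 "$d" | cut -d' ' -f2)
id_after=$(git rev-parse HEAD)

echo "commit id:       $id_after"
echo "stored parent:   $stored_parent"
echo "apparent parent: $apparent_parent"
```

The id of D is the same with and without the graft, so another repo that sees only the id has no way of knowing which ancestry you mean.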
> The kernel repo is split into "historical" and active repos. You can
> graft the historical repo and get more far-reaching answers to things
> like "git log" and "git blame". But if you run filter-branch, you
> can't share development on that repo via push / pull to people who
> _don't_ use the graft, since they don't share your history (and they
> probably don't want to, because of the extra resources required to
> pull in the historical chunk).
Yes, by forcing git-filter-branch, you can no longer push/pull to/from
such a historical repo. But as this thread has already demonstrated,
with grafts you can't clone from such a repo today (nor pull in certain
circumstances, see [1]); so the way I see it, communication with this
repo is _already_ limited. By disallowing grafts and forcing a rewrite
of the entire repo, we force these communication problems to be more
explicit/visible.
> That being said, I don't know how common such a setup is. And you did
> mention a "follow-grafts" config option for such people.
Indeed. :)
AFAICS, there are two use cases for grafts:
1. As a preparation for rewriting the history with git-filter-branch.
2. For providing historical repos (like you mention above).
My suggestion only makes life harder for people in the second use case.
If there are many people in the second use case, and they deem
the "follow-grafts" config option unacceptable, I expect them to flame
my suggestion to a crisp, and we'll have to think of something else...
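For reference, use case 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: a throwaway repo, made-up identity settings, a Git version that still reads .git/info/grafts, and FILTER_BRANCH_SQUELCH_WARNING set so that newer versions of git-filter-branch skip their warning delay. The graft is transient; after the rewrite the new parentage is stored in the commit objects themselves:

```shell
set -e
# Throwaway identity so the commits below do not depend on user config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
# Newer Git versions print a warning and pause before filtering; skip that.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Build a linear history A---B---C---D.
for msg in A B C D; do
    echo "$msg" >> foo
    git add foo
    git commit -q -m "$msg"
done
b=$(git rev-parse HEAD~2)
d=$(git rev-parse HEAD)

# 1. Add a grafts file: D's parent becomes B.
mkdir -p .git/info
echo "$d $b" > .git/info/grafts

# 2. Run git-filter-branch with no filters; it rewrites D with B as its
#    real parent, which gives the rewritten commit a new id.
git filter-branch -- --all >/dev/null 2>&1

# 3. Remove the grafts file; the history no longer depends on it.
rm .git/info/grafts

real_parent=$(git rev-list --parents -n 1 HEAD | cut -d' ' -f2)
echo "old D: $d"
echo "new D: $(git rev-parse HEAD) (parent $real_parent)"
```

The old D stays reachable through the refs/original/ backup that git-filter-branch leaves behind, and since the new D has a different id, anyone holding the old history will notice the rewrite immediately.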
Have fun! :)
...Johan
[1]: Consider the following:
### Create a repo with one commit, A
$ mkdir foo
$ cd foo
$ git init
Initialized empty Git repository in /path/to/foo/.git/
$ echo foo > foo
$ git add foo
$ git commit -mA
Created initial commit fe2ec02: A
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 foo
### Clone the repo
$ cd ..
$ git clone /path/to/foo bar
Initialize bar/.git
Initialized empty Git repository in /path/to/bar/.git/
### Create 3 more commits in the original repo: A---B---C---D
$ cd foo
$ echo bar >> foo && git commit -a -mB
Created commit ad10f00: B
1 files changed, 1 insertions(+), 0 deletions(-)
$ echo baz >> foo && git commit -a -mC
Created commit be96559: C
1 files changed, 1 insertions(+), 0 deletions(-)
$ echo xyzzy >> foo && git commit -a -mD
Created commit f2bafe5: D
1 files changed, 1 insertions(+), 0 deletions(-)
### Create a graft removing C from the history: A---B---D
$ echo "f2bafe58175e132077285e7fbbcec30859101d2e \
ad10f005205f61429dccda95e1442dabe31fbfbe" > .git/info/grafts
### Pull the recent changes into the clone
$ cd ../bar
$ git pull
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (2/2), done.
Unpacking objects: 100% (6/6), done.
remote: Total 6 (delta 0), reused 0 (delta 0)
error: Could not read be965599d99192f624b8d8bbf3cab412872586fc
From /path/to/foo/
+ fe2ec02...f2bafe5 master -> origin/master (forced update)
error: Could not read be965599d99192f624b8d8bbf3cab412872586fc
error: Could not read be965599d99192f624b8d8bbf3cab412872586fc
Auto-merged foo
CONFLICT (add/add): Merge conflict in foo
Automatic merge failed; fix conflicts and then commit the result.
AFAICS, git-pull can easily become just as confused by grafts as
git-clone. I wouldn't be surprised by a similar example for git-push.
I can only draw the conclusion that with current versions of Git, repos
with grafts should _never_ be made public.
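Until Git itself guards against this, a check along the following lines could be run before publishing a repo. It is a minimal sketch: has_grafts is a hypothetical helper, not part of Git, and it simply looks for any line in info/grafts that is neither blank nor a '#' comment:

```shell
# has_grafts GIT_DIR: succeed if GIT_DIR/info/grafts holds at least one
# graft line (blank lines and '#' comment lines do not count).
has_grafts() {
    grafts_file="$1/info/grafts"
    [ -f "$grafts_file" ] || return 1
    grep -qv -e '^#' -e '^[[:space:]]*$' "$grafts_file"
}

# Possible use from inside a work tree, before pushing or serving the repo:
#   if has_grafts "$(git rev-parse --git-dir)"; then
#       echo "grafts present; not publishing" >&2
#       exit 1
#   fi
```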
--
Johan Herland, <johan@herland.net>
www.herland.net