From: Johan Herland <johan@herland.net>
To: git@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
"Stephen R. van den Berg" <srb@cuci.nl>,
Denis Bueno <dbueno@gmail.com>
Subject: Re: To graft or not to graft... (Re: Recovering from repository corruption)
Date: Thu, 12 Jun 2008 09:14:21 +0200 [thread overview]
Message-ID: <200806120914.22083.johan@herland.net> (raw)
In-Reply-To: <alpine.LFD.1.10.0806111635200.3101@woody.linux-foundation.org>
On Thursday 12 June 2008, Linus Torvalds wrote:
> On Thu, 12 Jun 2008, Stephen R. van den Berg wrote:
> > As I understood it from the few shreds of documentation that actually
> > mention the grafts file, the grafts file is *not* being cloned.
> > Therefore, my assumption was that cloning a repository that has a
> > grafts file gives an identical result to cloning the same repository
> > *without* the grafts file present.
>
> That would probably be the right behaviour, but no - all our commit
> walkers honor the grafts file.
>
> Including the ones used for creating pack-files and thus a clone.
>
> > As I understand it now, the cloning process actually peeks at the
> > grafts file while cloning, and then doesn't copy it. This results in a
> > rather confusingly corrupt clone.
>
> Yes. The grafts-file was a mistake, but it's just barely useful to some
> people that it's stayed alive. Sadly, those "some people" don't tend to
> care enough about the problems it can cause.
>
> > I suggest two things:
> > a. That during the cloning process, the grafts file is completely
> > disregarded in any case at first.
>
> Yes.
>
> And (a'): git-fsck and repacking should just consider it to be an
> _additional_ source of parenthood rather than a _replacement_ source.
>
> > b. Preferably the grafts file is copied as well (after cloning). I
> > never really understood why the file is not being copied in the
> > first place (anyone care to explain that?).
>
> The grafts file isn't part of the object stream and refs, and clones (and
> fetches) very much just copy the object database.
AFAICS, there's already a perfectly fine way to distribute grafted history:
1. Add a grafts file
2. Run git-filter-branch
3. Remove grafts file
4. Distribute repo
5. Profit!
Since git-filter-branch turns grafted parentage into _real_ parentage,
there's no point in ever having a grafts file at all (except transiently
for telling git-filter-branch what to do).
I suggest we make commit walkers NOT obey the grafts file by default, but
instead require a --follow-grafts option to restore the current behaviour.
Then, we teach git-filter-branch to obey the grafts file (probably by
employing said --follow-grafts option).
For those who want to hang on to the current behaviour, they can create
some config option that is equivalent to always running with
--follow-grafts.
The following is ugly, untested, undocumented, and obviously unfit for
inclusion:
diff --git a/commit.c b/commit.c
index 94d5b3d..3e9ebf7 100644
--- a/commit.c
+++ b/commit.c
@@ -7,6 +7,7 @@
#include "revision.h"
int save_commit_buffer = 1;
+int use_grafts = 0;
const char *commit_type = "commit";
@@ -242,7 +243,7 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
char *bufptr = buffer;
unsigned char parent[20];
struct commit_list **pptr;
- struct commit_graft *graft;
+ struct commit_graft *graft = NULL;
unsigned n_refs = 0;
if (item->object.parsed)
@@ -260,7 +261,8 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
bufptr += 46; /* "tree " + "hex sha1" + "\n" */
pptr = &item->parents;
- graft = lookup_commit_graft(item->object.sha1);
+ if (use_grafts)
+ graft = lookup_commit_graft(item->object.sha1);
while (bufptr + 48 < tail && !memcmp(bufptr, "parent ", 7)) {
struct commit *new_parent;
diff --git a/commit.h b/commit.h
index 2d94d41..3e30aa0 100644
--- a/commit.h
+++ b/commit.h
@@ -22,6 +22,7 @@ struct commit {
};
extern int save_commit_buffer;
+extern int use_grafts;
extern const char *commit_type;
/* While we can decorate any object with a name, it's only used for commits.. */
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index d04c346..5ebe7cd 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -230,11 +230,11 @@ mkdir ../map || die "Could not create map/ directory"
case "$filter_subdir" in
"")
git rev-list --reverse --topo-order --default HEAD \
- --parents "$@"
+ --follow-grafts --parents "$@"
;;
*)
git rev-list --reverse --topo-order --default HEAD \
- --parents "$@" -- "$filter_subdir"
+ --follow-grafts --parents "$@" -- "$filter_subdir"
esac > ../revs || die "Could not get the commits"
commits=$(wc -l <../revs | tr -d " ")
diff --git a/revision.c b/revision.c
index 5a1a948..ca98815 100644
--- a/revision.c
+++ b/revision.c
@@ -1059,6 +1059,10 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, const ch
revs->first_parent_only = 1;
continue;
}
+ if (!strcmp(arg, "--follow-grafts")) {
+ use_grafts = 1;
+ continue;
+ }
if (!strcmp(arg, "--reflog")) {
handle_reflog(revs, flags);
continue;
--
1.5.6.rc2.128.gf64ae
Have fun! :)
...Johan
--
Johan Herland, <johan@herland.net>
www.herland.net
next prev parent reply other threads:[~2008-06-12 7:16 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-10 17:26 Recovering from repository corruption Denis Bueno
2008-06-10 17:55 ` Jakub Narebski
2008-06-10 19:38 ` Denis Bueno
2008-06-10 19:59 ` Jakub Narebski
2008-06-10 20:03 ` Denis Bueno
2008-06-10 20:14 ` Jakub Narebski
2008-06-10 20:35 ` Denis Bueno
2008-06-10 20:23 ` Linus Torvalds
2008-06-10 20:28 ` Denis Bueno
2008-06-10 21:09 ` Linus Torvalds
2008-06-10 21:22 ` Denis Bueno
2008-06-10 21:48 ` Linus Torvalds
2008-06-10 22:09 ` Denis Bueno
2008-06-10 22:25 ` Tarmigan
2008-06-10 22:41 ` Denis Bueno
2008-06-10 22:45 ` Linus Torvalds
2008-06-10 23:00 ` Linus Torvalds
2008-06-11 0:43 ` Nicolas Pitre
2008-06-11 1:39 ` Linus Torvalds
2008-06-11 1:47 ` Nicolas Pitre
2008-06-10 21:27 ` Denis Bueno
2008-06-10 22:52 ` Junio C Hamano
2008-06-11 23:21 ` To graft or not to graft... (Re: Recovering from repository corruption) Stephen R. van den Berg
2008-06-11 23:34 ` Jakub Narebski
2008-06-11 23:39 ` Linus Torvalds
2008-06-12 7:14 ` Johan Herland [this message]
2008-06-12 7:47 ` Jeff King
2008-06-12 10:21 ` Johan Herland
2008-06-12 12:20 ` Stephen R. van den Berg
2008-06-10 19:40 ` Recovering from repository corruption Nicolas Pitre
2008-06-10 19:42 ` Denis Bueno
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200806120914.22083.johan@herland.net \
--to=johan@herland.net \
--cc=dbueno@gmail.com \
--cc=git@vger.kernel.org \
--cc=srb@cuci.nl \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.