* [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
@ 2007-07-02 12:56 Andy Parkins
2007-07-02 23:11 ` Junio C Hamano
2007-08-17 8:39 ` Andy Parkins
0 siblings, 2 replies; 8+ messages in thread
From: Andy Parkins @ 2007-07-02 12:56 UTC (permalink / raw)
To: git
I ran git-prune on a repository and got this:
$ git-prune
error: Object 228f8065b930120e35fc0c154c237487ab02d64a is a blob, not a commit
Segmentation fault (core dumped)
This repository was a strange one in that it was being used to provide
its own submodule. That is, the repository was cloned into a
subdirectory, an independent branch checked out in that subdirectory,
and then it was marked as a submodule. git-prune then failed in the
above manner.
The problem was that git-prune was not submodule aware in two areas.
Linus said:
> So what happens is that something traverses a tree object, looks at each
> entry, sees that it's not a tree, and tries to look it up as a blob. But
> subprojects are commits, not blobs, and then when you look at the object
> more closely, you get the above kind of object type confusion.
and included a patch to add an S_ISGITLINK() test to reachable.c's
process_tree() function. That fixed the first git-prune error, and
stopped it from trying to process the gitlink entries in trees as if
they were pointers to other trees (and of course failing, because
gitlinks _aren't_ trees). That part of this patch is his.
The second area is add_cache_refs(). This is called before starting the
reachability analysis, and was calling lookup_blob() on every object
hash found in the index. However, it is no longer true that every hash
in the index is a pointer to a blob, some of them are gitlinks, and are
not backed by any object at all, they are commits in another repository.
Normally this bug was not causing any problems, but in the case of the
self-referencing repository described above, it meant that the gitlink
hash was being marked as being of type OBJ_BLOB by add_cache_refs() call
to lookup_blob(). Then later, because that hash was also pointed to by
a ref, add_one_ref() would treat it as a commit; lookup_commit() would
return a NULL because that object was already noted as being an
OBJ_BLOB, not an OBJ_COMMIT; and parse_commit_buffer() would SEGFAULT on
that NULL pointer.
The fix made by this patch is to not blindly call lookup_blob() in
reachable.c's add_cache_refs(), and instead skip any index entries that
are S_ISGITLINK().
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
The two parts go together logically so I've put them in this single patch
from me, but half of this patch should be credited to Linus. (I don't know
how one deals with multiple authorship on a single commit in git though)
I suspect that Linus has enough credit to keep him happy and won't mind
that I've edited him out in this case :-)
reachable.c | 20 ++++++++++++++++++++
1 files changed, 20 insertions(+), 0 deletions(-)
diff --git a/reachable.c b/reachable.c
index ff3dd34..6383401 100644
--- a/reachable.c
+++ b/reachable.c
@@ -21,6 +21,14 @@ static void process_blob(struct blob *blob,
/* Nothing to do, really .. The blob lookup was the important part */
}
+static void process_gitlink(const unsigned char *sha1,
+ struct object_array *p,
+ struct name_path *path,
+ const char *name)
+{
+ /* I don't think we want to recurse into this, really. */
+}
+
static void process_tree(struct tree *tree,
struct object_array *p,
struct name_path *path,
@@ -47,6 +55,8 @@ static void process_tree(struct tree *tree,
while (tree_entry(&desc, &entry)) {
if (S_ISDIR(entry.mode))
process_tree(lookup_tree(entry.sha1), p, &me, entry.path);
+ else if (S_ISGITLINK(entry.mode))
+ process_gitlink(entry.sha1, p, &me, entry.path);
else
process_blob(lookup_blob(entry.sha1), p, &me, entry.path);
}
@@ -159,6 +169,16 @@ static void add_cache_refs(struct rev_info *revs)
read_cache();
for (i = 0; i < active_nr; i++) {
+ /*
+ * The index can contain blobs and GITLINKs, GITLINKs are hashes
+ * that don't actually point to objects in the repository, it's
+ * almost guaranteed that they are NOT blobs, so we don't call
+ * lookup_blob() on them, to avoid populating the hash table
+ * with invalid information
+ */
+ if (S_ISGITLINK(ntohl(active_cache[i]->ce_mode)))
+ continue;
+
lookup_blob(active_cache[i]->sha1);
/*
* We could add the blobs to the pending list, but quite
--
1.5.2.2.253.g2d8b
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
2007-07-02 12:56 [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process) Andy Parkins
@ 2007-07-02 23:11 ` Junio C Hamano
2007-08-17 8:39 ` Andy Parkins
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2007-07-02 23:11 UTC (permalink / raw)
To: Andy Parkins; +Cc: git, Linus Torvalds
Andy Parkins <andyparkins@gmail.com> writes:
> ...
> The fix made by this patch is to not blindly call lookup_blob() in
> reachable.c's add_cache_refs(), and instead skip any index entries that
> are S_ISGITLINK().
Thanks, both of you. Will go to 'maint' hopefully tonight (if I
can shake the day job off early enough today, that is).
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
2007-07-02 12:56 [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process) Andy Parkins
2007-07-02 23:11 ` Junio C Hamano
@ 2007-08-17 8:39 ` Andy Parkins
2007-08-17 9:14 ` Johannes Sixt
2007-08-17 16:56 ` Linus Torvalds
1 sibling, 2 replies; 8+ messages in thread
From: Andy Parkins @ 2007-08-17 8:39 UTC (permalink / raw)
To: git
On Monday 2007, July 02, Andy Parkins wrote:
> This repository was a strange one in that it was being used to provide
> its own submodule. That is, the repository was cloned into a
> subdirectory, an independent branch checked out in that subdirectory,
> and then it was marked as a submodule. git-prune then failed in the
> above manner.
I think I've stumbled on another place where this is happening.
git-upload-pack is crashing for me with a similar "Object is commit not
blob" error just before.
I'm happy to try and track it down, but I'm having difficulty because I
think the crash is happening in the process on the remote system, so I'm
not getting a core dump I can use.
Could any of the guru's give me a guide to upload-pack.c? I assume the
problem is going to be the same as it was for git-prune, the hash for the
gitlink object in the tree is being assumed to be an object in the ODB;
which isn't the case with gitlink entries. Where would that be happening
in git-upload-pack? The fix is going to be..
if( S_ISGITLINK(mode))
continue;
But I've got no idea where to put it :-)
Andy
--
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
2007-08-17 8:39 ` Andy Parkins
@ 2007-08-17 9:14 ` Johannes Sixt
2007-08-17 16:56 ` Linus Torvalds
1 sibling, 0 replies; 8+ messages in thread
From: Johannes Sixt @ 2007-08-17 9:14 UTC (permalink / raw)
To: git
Andy Parkins wrote:
> Could any of the guru's give me a guide to upload-pack.c? I assume the
> problem is going to be the same as it was for git-prune, the hash for the
> gitlink object in the tree is being assumed to be an object in the ODB;
> which isn't the case with gitlink entries. Where would that be happening
> in git-upload-pack? The fix is going to be..
>
> if( S_ISGITLINK(mode))
> continue;
>
> But I've got no idea where to put it :-)
Most likely in list-objects.c:traverse_commit_list(), which is called
from somewhere in upload-pack.c.
-- Hannes
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
2007-08-17 8:39 ` Andy Parkins
2007-08-17 9:14 ` Johannes Sixt
@ 2007-08-17 16:56 ` Linus Torvalds
2007-08-17 23:48 ` Junio C Hamano
2007-08-18 9:54 ` [PATCH] Make thin-pack generation subproject aware Junio C Hamano
1 sibling, 2 replies; 8+ messages in thread
From: Linus Torvalds @ 2007-08-17 16:56 UTC (permalink / raw)
To: Andy Parkins; +Cc: git
On Fri, 17 Aug 2007, Andy Parkins wrote:
>
> Could any of the guru's give me a guide to upload-pack.c? I assume the
> problem is going to be the same as it was for git-prune, the hash for the
> gitlink object in the tree is being assumed to be an object in the ODB;
> which isn't the case with gitlink entries. Where would that be happening
> in git-upload-pack? The fix is going to be..
>
> if( S_ISGITLINK(mode))
> continue;
>
> But I've got no idea where to put it :-)
Maybe this one?
Linus
---
builtin-pack-objects.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 24926db..77481df 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -979,6 +979,8 @@ static void add_pbase_object(struct tree_desc *tree,
int cmp;
while (tree_entry(tree,&entry)) {
+ if (S_ISGITLINK(entry.mode))
+ continue;
cmp = tree_entry_len(entry.path, entry.sha1) != cmplen ? 1 :
memcmp(name, entry.path, cmplen);
if (cmp > 0)
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
2007-08-17 16:56 ` Linus Torvalds
@ 2007-08-17 23:48 ` Junio C Hamano
2007-08-20 8:39 ` Andy Parkins
2007-08-18 9:54 ` [PATCH] Make thin-pack generation subproject aware Junio C Hamano
1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2007-08-17 23:48 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andy Parkins, git
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Fri, 17 Aug 2007, Andy Parkins wrote:
>>
>> Could any of the guru's give me a guide to upload-pack.c? I assume the
>> problem is going to be the same as it was for git-prune, the hash for the
>> gitlink object in the tree is being assumed to be an object in the ODB;
>> which isn't the case with gitlink entries. Where would that be happening
>> in git-upload-pack? The fix is going to be..
>>
>> if( S_ISGITLINK(mode))
>> continue;
>>
>> But I've got no idea where to put it :-)
>
> Maybe this one?
>
> ---
> builtin-pack-objects.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 24926db..77481df 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> @@ -979,6 +979,8 @@ static void add_pbase_object(struct tree_desc *tree,
> int cmp;
>
> while (tree_entry(tree,&entry)) {
> + if (S_ISGITLINK(entry.mode))
> + continue;
> cmp = tree_entry_len(entry.path, entry.sha1) != cmplen ? 1 :
> memcmp(name, entry.path, cmplen);
> if (cmp > 0)
This sounds very plausible.
Andy, in the repository your fetch fails, if a fetch-pack
without "--thin" before Linus's patch does not barf, that
strongly suggests that the breakage you are seeing is related to
this codepath. And with Linus's patch, "fetch-pack --thin"
would also be fixed.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Make thin-pack generation subproject aware.
2007-08-17 16:56 ` Linus Torvalds
2007-08-17 23:48 ` Junio C Hamano
@ 2007-08-18 9:54 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2007-08-18 9:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andy Parkins, git
When a thin pack wants to send a tree object at "sub/dir", and
the commit that is common between the sender and the receiver
that is used as the base object has a subproject at that path,
we should not try to use the data at "sub/dir" of the base tree
as a tree object. It is not a tree to begin with, and more
importantly, the commit object there does not have to even
exist.
---
This turned out to be trickier to trigger than I thought. One
case to trigger is to have a subproject in the past at sub/dir
and then turn it into a directory.
builtin-pack-objects.c | 2 +
t/t3050-subprojects-fetch.sh | 52 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+), 0 deletions(-)
create mode 100755 t/t3050-subprojects-fetch.sh
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 24926db..77481df 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -979,6 +979,8 @@ static void add_pbase_object(struct tree_desc *tree,
int cmp;
while (tree_entry(tree,&entry)) {
+ if (S_ISGITLINK(entry.mode))
+ continue;
cmp = tree_entry_len(entry.path, entry.sha1) != cmplen ? 1 :
memcmp(name, entry.path, cmplen);
if (cmp > 0)
diff --git a/t/t3050-subprojects-fetch.sh b/t/t3050-subprojects-fetch.sh
new file mode 100755
index 0000000..34f26a8
--- /dev/null
+++ b/t/t3050-subprojects-fetch.sh
@@ -0,0 +1,52 @@
+#!/bin/sh
+
+test_description='fetching and pushing project with subproject'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ test_tick &&
+ mkdir -p sub && (
+ cd sub &&
+ git init &&
+ >subfile &&
+ git add subfile
+ git commit -m "subproject commit #1"
+ ) &&
+ >mainfile
+ git add sub mainfile &&
+ test_tick &&
+ git commit -m "superproject commit #1"
+'
+
+test_expect_success clone '
+ git clone file://`pwd`/.git cloned &&
+ (git rev-parse HEAD; git ls-files -s) >expected &&
+ (
+ cd cloned &&
+ (git rev-parse HEAD; git ls-files -s) >../actual
+ ) &&
+ diff -u expected actual
+'
+
+test_expect_success advance '
+ echo more >mainfile &&
+ git update-index --force-remove sub &&
+ mv sub/.git sub/.git-disabled &&
+ git add sub/subfile mainfile &&
+ mv sub/.git-disabled sub/.git &&
+ test_tick &&
+ git commit -m "superproject commit #2"
+'
+
+test_expect_success fetch '
+ (git rev-parse HEAD; git ls-files -s) >expected &&
+ (
+ cd cloned &&
+ git pull &&
+ (git rev-parse HEAD; git ls-files -s) >../actual
+ ) &&
+ diff -u expected actual
+'
+
+test_done
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process)
2007-08-17 23:48 ` Junio C Hamano
@ 2007-08-20 8:39 ` Andy Parkins
0 siblings, 0 replies; 8+ messages in thread
From: Andy Parkins @ 2007-08-20 8:39 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds
On Saturday 2007 August 18, Junio C Hamano wrote:
> Andy, in the repository your fetch fails, if a fetch-pack
> without "--thin" before Linus's patch does not barf, that
> strongly suggests that the breakage you are seeing is related to
> this codepath. And with Linus's patch, "fetch-pack --thin"
> would also be fixed.
I'm really sorry, somehow during my attempts to find the fault, the fault went
away. I think it's because I managed to get the fetch to work in some way,
and from then on fetch completed perfectly.
The upshot of this is that I have no way to test this patch, until I manage to
get myself in a similar state. I'll wait until it happens again though and
then try this patch.
Andy
--
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-08-20 8:40 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-02 12:56 [PATCH] Make git-prune submodule aware (and fix a SEGFAULT in the process) Andy Parkins
2007-07-02 23:11 ` Junio C Hamano
2007-08-17 8:39 ` Andy Parkins
2007-08-17 9:14 ` Johannes Sixt
2007-08-17 16:56 ` Linus Torvalds
2007-08-17 23:48 ` Junio C Hamano
2007-08-20 8:39 ` Andy Parkins
2007-08-18 9:54 ` [PATCH] Make thin-pack generation subproject aware Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).