* [PATCH] pull: do not segfault when HEAD refers to missing object file
@ 2017-03-05 23:42 André Laszlo
2017-03-05 23:52 ` brian m. carlson
2017-03-06 3:46 ` Jeff King
0 siblings, 2 replies; 6+ messages in thread
From: André Laszlo @ 2017-03-05 23:42 UTC (permalink / raw)
To: git
Cc: Nguyễn Thái Ngọc Duy, brian m . carlson,
René Scharfe, Johannes Schindelin, Jeff King,
André Laszlo
git pull --rebase on a corrupted HEAD used to segfault; it has been
corrected to error out with a message. A test has also been added to
verify this fix.
Signed-off-by: André Laszlo <andre@laszlo.nu>
---
Notes:
When add_head_to_pending fails to add a pending object, git pull
--rebase segfaults. This can happen if HEAD is referring to a corrupt
or missing object.
I discovered this segfault on my machine after pulling a repo from
GitHub, but I have been unable to reproduce the sequence of events
that lead to the corrupted HEAD (I think it may have been caused by a
lost network connection in my case).
The following commands use add_head_to_pending:
format-patch setup_revisions before add_head_to_pending
diff checks rev.pending.nr
shortlog checks rev.pending.nr
log uses resolve_ref_unsafe
All of the above return an error code of 128 and print "fatal: bad
object HEAD" instead of segfaulting, which I think is correct
behavior. The check and error message have been added to
has_uncommitted_changes, where they were missing, as well as to
diff-lib.c (without the error message).
diff-lib.c | 2 +-
t/t5520-pull.sh | 12 ++++++++++++
wt-status.c | 5 +++++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/diff-lib.c b/diff-lib.c
index 52447466b..9d26b18c3 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -512,7 +512,7 @@ int run_diff_index(struct rev_info *revs, int cached)
struct object_array_entry *ent;
ent = revs->pending.objects;
- if (diff_cache(revs, ent->item->oid.hash, ent->name, cached))
+ if (!ent || diff_cache(revs, ent->item->oid.hash, ent->name, cached))
exit(128);
diff_set_mnemonic_prefix(&revs->diffopt, "c/", cached ? "i/" : "w/");
diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
index 17f4d0fe4..1edb6a97a 100755
--- a/t/t5520-pull.sh
+++ b/t/t5520-pull.sh
@@ -664,4 +664,16 @@ test_expect_success 'git pull --rebase against local branch' '
test file = "$(cat file2)"
'
+test_expect_success 'git pull --rebase with corrupt HEAD does not segfault' '
+ mkdir corrupted &&
+ (cd corrupted &&
+ git init &&
+ echo one >file && git add file &&
+ git commit -m one &&
+ REV=$(git rev-parse HEAD) &&
+ rm -f .git/objects/${REV:0:2}/${REV:2} &&
+ test_expect_code 128 git pull --rebase > /dev/null
+ )
+'
+
test_done
diff --git a/wt-status.c b/wt-status.c
index d47012048..3d60eaff5 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -2252,6 +2252,11 @@ int has_uncommitted_changes(int ignore_submodules)
DIFF_OPT_SET(&rev_info.diffopt, IGNORE_SUBMODULES);
DIFF_OPT_SET(&rev_info.diffopt, QUICK);
add_head_to_pending(&rev_info);
+
+ /* The add_head_to_pending call might not have added anything. */
+ if (!rev_info.pending.nr)
+ die("bad object %s", "HEAD");
+
diff_setup_done(&rev_info.diffopt);
result = run_diff_index(&rev_info, 1);
return diff_result_code(&rev_info.diffopt, result);
--
2.12.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] pull: do not segfault when HEAD refers to missing object file 2017-03-05 23:42 [PATCH] pull: do not segfault when HEAD refers to missing object file André Laszlo @ 2017-03-05 23:52 ` brian m. carlson 2017-03-06 3:51 ` Jeff King 2017-03-06 3:46 ` Jeff King 1 sibling, 1 reply; 6+ messages in thread From: brian m. carlson @ 2017-03-05 23:52 UTC (permalink / raw) To: André Laszlo Cc: git, Nguyễn Thái Ngọc Duy, René Scharfe, Johannes Schindelin, Jeff King [-- Attachment #1: Type: text/plain, Size: 780 bytes --] On Mon, Mar 06, 2017 at 12:42:22AM +0100, André Laszlo wrote: > +test_expect_success 'git pull --rebase with corrupt HEAD does not segfault' ' > + mkdir corrupted && > + (cd corrupted && > + git init && > + echo one >file && git add file && > + git commit -m one && > + REV=$(git rev-parse HEAD) && > + rm -f .git/objects/${REV:0:2}/${REV:2} && I think this is a bashism. On dash, I get the following: genre ok % dash -c 'foo=abcdefg; echo ${foo:0:2}; echo ${foo:2}' dash: 1: Bad substitution > + test_expect_code 128 git pull --rebase > /dev/null > + ) > +' > + > test_done -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 868 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] pull: do not segfault when HEAD refers to missing object file 2017-03-05 23:52 ` brian m. carlson @ 2017-03-06 3:51 ` Jeff King 2017-03-06 6:52 ` Johannes Sixt 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2017-03-06 3:51 UTC (permalink / raw) To: brian m. carlson, André Laszlo, git, Nguyễn Thái Ngọc Duy, René Scharfe, Johannes Schindelin On Sun, Mar 05, 2017 at 11:52:22PM +0000, brian m. carlson wrote: > On Mon, Mar 06, 2017 at 12:42:22AM +0100, André Laszlo wrote: > > +test_expect_success 'git pull --rebase with corrupt HEAD does not segfault' ' > > + mkdir corrupted && > > + (cd corrupted && > > + git init && > > + echo one >file && git add file && > > + git commit -m one && > > + REV=$(git rev-parse HEAD) && > > + rm -f .git/objects/${REV:0:2}/${REV:2} && > > I think this is a bashism. On dash, I get the following: > > genre ok % dash -c 'foo=abcdefg; echo ${foo:0:2}; echo ${foo:2}' > dash: 1: Bad substitution Yeah, it is. You can do it easily with 'sed', of course, but if you want to avoid the extra process and do it in pure shell, it's more like: last38=${REV#??} first2=${REV%$last38} rm -f .git/objects/$first2/$last38 -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] pull: do not segfault when HEAD refers to missing object file 2017-03-06 3:51 ` Jeff King @ 2017-03-06 6:52 ` Johannes Sixt 2017-03-06 7:33 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Johannes Sixt @ 2017-03-06 6:52 UTC (permalink / raw) To: Jeff King, André Laszlo Cc: brian m. carlson, git, Nguyễn Thái Ngọc Duy, René Scharfe, Johannes Schindelin Am 06.03.2017 um 04:51 schrieb Jeff King: > On Sun, Mar 05, 2017 at 11:52:22PM +0000, brian m. carlson wrote: > >> On Mon, Mar 06, 2017 at 12:42:22AM +0100, André Laszlo wrote: >>> +test_expect_success 'git pull --rebase with corrupt HEAD does not segfault' ' >>> + mkdir corrupted && >>> + (cd corrupted && We usally indent this like so: ( cd corrupted && echo one >file && git add file && ... ) && >>> + git init && >>> + echo one >file && git add file && >>> + git commit -m one && >>> + REV=$(git rev-parse HEAD) && >>> + rm -f .git/objects/${REV:0:2}/${REV:2} && >> >> I think this is a bashism. On dash, I get the following: >> >> genre ok % dash -c 'foo=abcdefg; echo ${foo:0:2}; echo ${foo:2}' >> dash: 1: Bad substitution > > Yeah, it is. You can do it easily with 'sed', of course, but if you want > to avoid the extra process and do it in pure shell, it's more like: > > last38=${REV#??} > first2=${REV%$last38} > rm -f .git/objects/$first2/$last38 Is it "HEAD points to non-existent object" or ".git/HEAD contains junk"? In both cases there are simpler solutions than to remove an object. For example, `echo "$_x40" >.git/HEAD` or `echo "this is junk" >.git/HEAD`? -- Hannes ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] pull: do not segfault when HEAD refers to missing object file 2017-03-06 6:52 ` Johannes Sixt @ 2017-03-06 7:33 ` Jeff King 0 siblings, 0 replies; 6+ messages in thread From: Jeff King @ 2017-03-06 7:33 UTC (permalink / raw) To: Johannes Sixt Cc: André Laszlo, brian m. carlson, git, Nguyễn Thái Ngọc Duy, René Scharfe, Johannes Schindelin On Mon, Mar 06, 2017 at 07:52:10AM +0100, Johannes Sixt wrote: > > Yeah, it is. You can do it easily with 'sed', of course, but if you want > > to avoid the extra process and do it in pure shell, it's more like: > > > > last38=${REV#??} > > first2=${REV%$last38} > > rm -f .git/objects/$first2/$last38 > > Is it "HEAD points to non-existent object" or ".git/HEAD contains junk"? In > both cases there are simpler solutions than to remove an object. For > example, `echo "$_x40" >.git/HEAD` or `echo "this is junk" >.git/HEAD`? Good point. Unfortunately, it's a little tricky. You can't use "this is junk" because then git does not recognize it as a git repo at all. You can't use $_x40 because the null sha1 is used internally to signal an unborn branch, so we mix it up with that case. Something like "111..." for 40 characters would work, though at that point I think just simulating an actual corruption might be less convoluted. -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] pull: do not segfault when HEAD refers to missing object file 2017-03-05 23:42 [PATCH] pull: do not segfault when HEAD refers to missing object file André Laszlo 2017-03-05 23:52 ` brian m. carlson @ 2017-03-06 3:46 ` Jeff King 1 sibling, 0 replies; 6+ messages in thread From: Jeff King @ 2017-03-06 3:46 UTC (permalink / raw) To: André Laszlo Cc: git, Nguyễn Thái Ngọc Duy, brian m . carlson, René Scharfe, Johannes Schindelin On Mon, Mar 06, 2017 at 12:42:22AM +0100, André Laszlo wrote: > git pull --rebase on a corrupted HEAD used to segfault;it has been > corrected to error out with a message. A test has also been added to > verify this fix. Your commit message mostly tells us the "what" that we can see from the diff. But why are we segfaulting? What assumption is being violated, and why is the fix the right thing? You've stuck some of the details in your notes, and they should probably make it into the commit message. > When add_head_to_pending fails to add a pending object, git pull > --rebase segfaults. This can happen if HEAD is referring to a corrupt > or missing object. The other obvious time when HEAD is not valid is when you are on an unborn branch. So we should also consider how such a case interacts with the callsites you are touching. I think it is primarily this hunk: > --- a/wt-status.c > +++ b/wt-status.c > @@ -2252,6 +2252,11 @@ int has_uncommitted_changes(int ignore_submodules) > DIFF_OPT_SET(&rev_info.diffopt, IGNORE_SUBMODULES); > DIFF_OPT_SET(&rev_info.diffopt, QUICK); > add_head_to_pending(&rev_info); > + > + /* The add_head_to_pending call might not have added anything. */ > + if (!rev_info.pending.nr) > + die("bad object %s", "HEAD"); > + > diff_setup_done(&rev_info.diffopt); > result = run_diff_index(&rev_info, 1); > return diff_result_code(&rev_info.diffopt, result); that is the fix. We assume that add_head_to_pending() put something into rev_info.pending, but it might not. In the case of corruption, "bad object HEAD" is a reasonable outcome. In the case of an unborn branch, we'd probably want to compare against the empty tree. This trick is used elsewhere in wt-status.c, as in wt_status_collect_changes_index(). I'm not sure if we need to deal with that here or not. I wasn't able to trigger this code with an unborn branch. If you have no index, then the is_cache_unborn() check triggers earlier in the function. If you do have an index, then we have an earlier check: if (is_null_sha1(orig_head) && !is_cache_unborn()) die(_("Updating an unborn branch with changes added to the index.")); that covers this case. I am not 100% sure that check is correct, though. If I do: git init echo content >foo git add foo git rm -f foo then I have an index which is not "unborn", but also does not contain any changes. I think the conditional above should just be checking "active_nr". If we were to fix that, then I think has_uncommitted_changes() would need to be adjusted as well. I admit that this is probably an absurd corner case. So I think I am OK with your fix for now, and we can revisit the logic later if anybody starts to care. > I discovered this segfault on my machine after pulling a repo from > GitHub, but I have been unable to reproduce the sequence of events > that lead to the corrupted HEAD (I think it may have been caused by a > lost network connection in my case). Yikes. That should never lead to a corrupted HEAD (unless you are on a networked filesystem). I can't think of another read add_head_to_pending would fail, though, aside from a missing or corrupted object. Arguably add_head_to_pending() should die loudly if it can't parse the object pointed to by HEAD, rather than quietly returning without adding anything. > diff --git a/diff-lib.c b/diff-lib.c > index 52447466b..9d26b18c3 100644 > --- a/diff-lib.c > +++ b/diff-lib.c > @@ -512,7 +512,7 @@ int run_diff_index(struct rev_info *revs, int cached) > struct object_array_entry *ent; > > ent = revs->pending.objects; > - if (diff_cache(revs, ent->item->oid.hash, ent->name, cached)) > + if (!ent || diff_cache(revs, ent->item->oid.hash, ent->name, cached)) > exit(128); So if I understand correctly, this hunk should not trigger anymore, because we would never call run_diff_index() without something in pending. I think it would be a programming error elsewhere to do so, and we should treat this as an assertion by complaining loudly (for this and any other confusing cases): if (revs->pending.nr != 1) die("BUG: run_diff_index requires exactly one rev (got %d)", revs->pending.nr); Lastly, I think this is your first patch. So welcome, and thank you. :) I know there was a lot of discussion and critique above, but I think overall your patch is going in the right direction. -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-03-06 7:33 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-03-05 23:42 [PATCH] pull: do not segfault when HEAD refers to missing object file André Laszlo 2017-03-05 23:52 ` brian m. carlson 2017-03-06 3:51 ` Jeff King 2017-03-06 6:52 ` Johannes Sixt 2017-03-06 7:33 ` Jeff King 2017-03-06 3:46 ` Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.