* [BUG] commit walk machinery is dangerous !
@ 2008-07-14 20:54 Nicolas Pitre
2008-07-14 21:55 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Pitre @ 2008-07-14 20:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Good! I have all your attention now.
Yes, I'm kinda fscking upset about my hardware at this moment. I
apparently have git packs corrupting themselves from time to time which
prompted me to make git more robust against some kind of corruptions
recently.
However this time a corruption turned up and exposed what I think is a
major flaw in git's error checking. To demonstrate it, I created the
following test case. Turning the error() into a die() on line 772 of
commit.c makes this test pass but I don't know if this is the
appropriate fix (e.g. some attempt to parse non existing commits could
be valid usage, etc.). Note this is critical only for git versions
later than commit 8eca0b47ff15.
So here's the test. The catastrophic consequences that this can have on
one's repository is left as an exercise to the reader.
diff --git a/t/t6011-rev-list-with-bad-commit.sh b/t/t6011-rev-list-with-bad-commit.sh
new file mode 100755
index 0000000..a5fe190
--- /dev/null
+++ b/t/t6011-rev-list-with-bad-commit.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+
+test_description='git rev-list should notice bad commits'
+
+. ./test-lib.sh
+
+# Note:
+# - compression level is set to zero to make "corruptions" easier to perform
+# - reflog is disabled to avoid extra references which would twart the test
+
+test_expect_success 'setup' \
+ '
+ git init &&
+ git config core.compression 0 &&
+ git config core.logallrefupdates false &&
+ echo "foo" > foo &&
+ git add foo &&
+ git commit -m "first commit" &&
+ echo "bar" > bar &&
+ git add bar &&
+ git commit -m "second commit" &&
+ echo "baz" > baz &&
+ git add baz &&
+ git commit -m "third commit" &&
+ echo "foo again" >> foo &&
+ git add foo &&
+ git commit -m "fourth commit" &&
+ git repack -a -f -d
+ '
+
+test_expect_success 'verify number of revisions' \
+ '
+ revs=$(git rev-list --all | wc -l) &&
+ test $revs -eq 4 &&
+ first_commit=$(git rev-parse HEAD~3)
+ '
+
+test_expect_success 'corrupt second commit object' \
+ '
+ perl -i.bak -pe "s/second commit/socond commit/" .git/objects/pack/*.pack &&
+ test_must_fail git fsck --all
+ '
+
+test_expect_success 'rev-list should fail' \
+ '
+ test_must_fail git rev-list --all > /dev/null
+ '
+
+test_expect_success 'git repack _MUST_ fail' \
+ '
+ test_must_fail git repack -a -f -d
+ '
+
+test_expect_success 'first commit is still available' \
+ '
+ git log $first_commit
+ '
+
+test_done
+
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [BUG] commit walk machinery is dangerous !
2008-07-14 20:54 [BUG] commit walk machinery is dangerous ! Nicolas Pitre
@ 2008-07-14 21:55 ` Junio C Hamano
2008-07-14 23:08 ` Nicolas Pitre
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2008-07-14 21:55 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: git
Nicolas Pitre <nico@cam.org> writes:
> However this time a corruption turned up and exposed what I think is a
> major flaw in git's error checking. To demonstrate it, I created the
> following test case. Turning the error() into a die() on line 772 of
> commit.c makes this test pass but I don't know if this is the
> appropriate fix (e.g. some attempt to parse non existing commits could
> be valid usage, etc.). Note this is critical only for git versions
> later than commit 8eca0b47ff15.
Which probably means we should revert that commit as faulty? IIRC, before
that commit we did check and error out correctly but you loosened the
check to introduce "a major flaw" with that commit.
$ for b in maint master next pu
do
echo -n $b; git cat-file blob $b:commit.c | wc -l
done
maint 672
master 672
next 779
pu 789
Hmph...
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] commit walk machinery is dangerous !
2008-07-14 21:55 ` Junio C Hamano
@ 2008-07-14 23:08 ` Nicolas Pitre
2008-07-14 23:32 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Pitre @ 2008-07-14 23:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, 14 Jul 2008, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > However this time a corruption turned up and exposed what I think is a
> > major flaw in git's error checking. To demonstrate it, I created the
> > following test case. Turning the error() into a die() on line 772 of
> > commit.c makes this test pass but I don't know if this is the
> > appropriate fix (e.g. some attempt to parse non existing commits could
> > be valid usage, etc.). Note this is critical only for git versions
> > later than commit 8eca0b47ff15.
>
> Which probably means we should revert that commit as faulty? IIRC, before
> that commit we did check and error out correctly but you loosened the
> check to introduce "a major flaw" with that commit.
>
> $ for b in maint master next pu
> do
> echo -n $b; git cat-file blob $b:commit.c | wc -l
> done
> maint 672
> master 672
> next 779
> pu 789
>
> Hmph...
Well, most of them aren't that critical. If anything they will only
cause a segfault if ever the return value is not checked.
It is those with semantic meaning (e.g. object doesn't exist) which
should be audited, especially if used in the context of repository
modification, which pretty much limits it to the test case I produced.
Nicolas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] commit walk machinery is dangerous !
2008-07-14 23:08 ` Nicolas Pitre
@ 2008-07-14 23:32 ` Junio C Hamano
2008-07-14 23:39 ` Nicolas Pitre
2008-07-15 5:10 ` unpack_entry (was: [BUG] commit walk machinery is dangerous !) Shawn O. Pearce
0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2008-07-14 23:32 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: git
Nicolas Pitre <nico@cam.org> writes:
> It is those with semantic meaning (e.g. object doesn't exist) which
> should be audited, especially if used in the context of repository
> modification, which pretty much limits it to the test case I produced.
I've been wondering if we should make the change 8eca0b4 (implement some
resilience against pack corruptions, 2008-06-23) less aggressive.
It makes loose objects and data from other packs to be used as fallback
where we used to just punt, which is a genuine improvement for "salvaging"
mode of operation, but at the same time, it now forbids the callers to
expect that the objects they learned to exist from has_sha1_file() or
nth_packed_object_sha1() should never result NULL return value from
read_sha1_file().
It may make it safe again to fail if you cannot salvage using fallback
method after all. Something like the attached.
This is unrelated to the issue at hand, but I also notice that there are
few callsites outside sha1_file.c that bypasses cache_or_unpack_entry()
and call unpack_entry() directly. I wonder if they should be using the
cached version, making unpack_entry() static...
sha1_file.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/sha1_file.c b/sha1_file.c
index 2df78b5..55aa361 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1649,7 +1649,7 @@ static void *unpack_delta_entry(struct packed_git *p,
mark_bad_packed_object(p, base_sha1);
base = read_sha1_file(base_sha1, type, &base_size);
if (!base)
- return NULL;
+ exit(129);
}
delta_data = unpack_compressed_entry(p, w_curs, curpos, delta_size);
@@ -1946,6 +1946,8 @@ static void *read_packed_sha1(const unsigned char *sha1,
sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name);
mark_bad_packed_object(e.p, sha1);
data = read_sha1_file(sha1, type, size);
+ if (!data)
+ exit(129);
}
return data;
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [BUG] commit walk machinery is dangerous !
2008-07-14 23:32 ` Junio C Hamano
@ 2008-07-14 23:39 ` Nicolas Pitre
2008-07-15 1:46 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
2008-07-15 5:10 ` unpack_entry (was: [BUG] commit walk machinery is dangerous !) Shawn O. Pearce
1 sibling, 1 reply; 9+ messages in thread
From: Nicolas Pitre @ 2008-07-14 23:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, 14 Jul 2008, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > It is those with semantic meaning (e.g. object doesn't exist) which
> > should be audited, especially if used in the context of repository
> > modification, which pretty much limits it to the test case I produced.
>
> I've been wondering if we should make the change 8eca0b4 (implement some
> resilience against pack corruptions, 2008-06-23) less aggressive.
>
> It makes loose objects and data from other packs to be used as fallback
> where we used to just punt, which is a genuine improvement for "salvaging"
> mode of operation, but at the same time, it now forbids the callers to
> expect that the objects they learned to exist from has_sha1_file() or
> nth_packed_object_sha1() should never result NULL return value from
> read_sha1_file().
>
> It may make it safe again to fail if you cannot salvage using fallback
> method after all. Something like the attached.
Well, I have a different solution which should restore the original
"behavior" in the presence of existing but non-readable objects. Patch
will follow later.
Nicolas
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] restore legacy behavior for read_sha1_file()
2008-07-14 23:39 ` Nicolas Pitre
@ 2008-07-15 1:46 ` Nicolas Pitre
2008-07-15 1:50 ` [PATCH 2/2] test case for previous commit Nicolas Pitre
2008-07-15 5:12 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
0 siblings, 2 replies; 9+ messages in thread
From: Nicolas Pitre @ 2008-07-15 1:46 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Since commit 8eca0b47ff1598a6d163df9358c0e0c9bd92d4c8, it is possible
for read_sha1_file() to return NULL even with existing objects when they
are corrupted. Previously a corrupted object would have terminated the
program immediately, effectively making read_sha1_file() return NULL
only when specified object is not found.
Let's restore this behavior for all users of read_sha1_file() and
provide a separate function with the ability to not terminate when
bad objects are encountered.
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
On Mon, 14 Jul 2008, Nicolas Pitre wrote:
> Well, I have a different solution which should restore the original
> "behavior" in the presence of existing but non-readable objects. Patch
> will follow later.
So here it is.
diff --git a/sha1_file.c b/sha1_file.c
index 2df78b5..e281c14 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1006,6 +1006,18 @@ static void mark_bad_packed_object(struct packed_git *p,
p->num_bad_objects++;
}
+static int has_packed_and_bad(const unsigned char *sha1)
+{
+ struct packed_git *p;
+ unsigned i;
+
+ for (p = packed_git; p; p = p->next)
+ for (i = 0; i < p->num_bad_objects; i++)
+ if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+ return 1;
+ return 0;
+}
+
int check_sha1_signature(const unsigned char *sha1, void *map, unsigned long size, const char *type)
{
unsigned char real_sha1[20];
@@ -1647,7 +1659,7 @@ static void *unpack_delta_entry(struct packed_git *p,
sha1_to_hex(base_sha1), (uintmax_t)base_offset,
p->pack_name);
mark_bad_packed_object(p, base_sha1);
- base = read_sha1_file(base_sha1, type, &base_size);
+ base = read_object(base_sha1, type, &base_size);
if (!base)
return NULL;
}
@@ -1945,7 +1957,7 @@ static void *read_packed_sha1(const unsigned char *sha1,
error("failed to read object %s at offset %"PRIuMAX" from %s",
sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name);
mark_bad_packed_object(e.p, sha1);
- data = read_sha1_file(sha1, type, size);
+ data = read_object(sha1, type, size);
}
return data;
}
@@ -2010,8 +2022,8 @@ int pretend_sha1_file(void *buf, unsigned long len, enum object_type type,
return 0;
}
-void *read_sha1_file(const unsigned char *sha1, enum object_type *type,
- unsigned long *size)
+void *read_object(const unsigned char *sha1, enum object_type *type,
+ unsigned long *size)
{
unsigned long mapsize;
void *map, *buf;
@@ -2037,6 +2049,16 @@ void *read_sha1_file(const unsigned char *sha1, enum object_type *type,
return read_packed_sha1(sha1, type, size);
}
+void *read_sha1_file(const unsigned char *sha1, enum object_type *type,
+ unsigned long *size)
+{
+ void *data = read_object(sha1, type, size);
+ /* legacy behavior is to die on corrupted objects */
+ if (!data && (has_loose_object(sha1) || has_packed_and_bad(sha1)))
+ die("object %s is corrupted", sha1_to_hex(sha1));
+ return data;
+}
+
void *read_object_with_reference(const unsigned char *sha1,
const char *required_type_name,
unsigned long *size,
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/2] test case for previous commit
2008-07-15 1:46 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
@ 2008-07-15 1:50 ` Nicolas Pitre
2008-07-15 5:12 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
1 sibling, 0 replies; 9+ messages in thread
From: Nicolas Pitre @ 2008-07-15 1:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Here's a test demonstrating a case of what was broken before
(and fixed by) the previous patch.
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
diff --git a/t/t6011-rev-list-with-bad-commit.sh b/t/t6011-rev-list-with-bad-commit.sh
new file mode 100755
index 0000000..e51eb41
--- /dev/null
+++ b/t/t6011-rev-list-with-bad-commit.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+
+test_description='git rev-list should notice bad commits'
+
+. ./test-lib.sh
+
+# Note:
+# - compression level is set to zero to make "corruptions" easier to perform
+# - reflog is disabled to avoid extra references which would twart the test
+
+test_expect_success 'setup' \
+ '
+ git init &&
+ git config core.compression 0 &&
+ git config core.logallrefupdates false &&
+ echo "foo" > foo &&
+ git add foo &&
+ git commit -m "first commit" &&
+ echo "bar" > bar &&
+ git add bar &&
+ git commit -m "second commit" &&
+ echo "baz" > baz &&
+ git add baz &&
+ git commit -m "third commit" &&
+ echo "foo again" >> foo &&
+ git add foo &&
+ git commit -m "fourth commit" &&
+ git repack -a -f -d
+ '
+
+test_expect_success 'verify number of revisions' \
+ '
+ revs=$(git rev-list --all | wc -l) &&
+ test $revs -eq 4 &&
+ first_commit=$(git rev-parse HEAD~3)
+ '
+
+test_expect_success 'corrupt second commit object' \
+ '
+ perl -i.bak -pe "s/second commit/socond commit/" .git/objects/pack/*.pack &&
+ test_must_fail git fsck --full
+ '
+
+test_expect_success 'rev-list should fail' \
+ '
+ test_must_fail git rev-list --all > /dev/null
+ '
+
+test_expect_success 'git repack _MUST_ fail' \
+ '
+ test_must_fail git repack -a -f -d
+ '
+
+test_expect_success 'first commit is still available' \
+ '
+ git log $first_commit
+ '
+
+test_done
+
^ permalink raw reply related [flat|nested] 9+ messages in thread
* unpack_entry (was: [BUG] commit walk machinery is dangerous !)
2008-07-14 23:32 ` Junio C Hamano
2008-07-14 23:39 ` Nicolas Pitre
@ 2008-07-15 5:10 ` Shawn O. Pearce
1 sibling, 0 replies; 9+ messages in thread
From: Shawn O. Pearce @ 2008-07-15 5:10 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git
Junio C Hamano <gitster@pobox.com> wrote:
> This is unrelated to the issue at hand, but I also notice that there are
> few callsites outside sha1_file.c that bypasses cache_or_unpack_entry()
> and call unpack_entry() directly. I wonder if they should be using the
> cached version, making unpack_entry() static...
There are two such callsites that I found:
$ git grep -n --cached unpack_entry
fast-import.c:1201: return unpack_entry(p, oe->offset, &type, sizep);
pack-check.c:114: data = unpack_entry(p, entries[i].offset, &type,
Now the one in fast-import.c could be using the cached version.
This is simply fast-import trying to reuse sha1_file.c for reading
a previously written object to the pack.
Stuffing the item into the cache is perhaps pointless here as gfi
does its own sort of caching and only goes through this code path
when that caching has tossed the least-recently-used data and its
suddenly needed again.
So yea, it could be using the caching form, but it would maybe be
doing more work (and using more memory) than it really needs.
In pack-check.c we are looping through the objects in the order they
appear in the index so we can unpack them and verify each object's
SHA-1 signature. Please note this is perhaps the slowest way to
enumerate through the pack and is why you can clone a repository
over git:// faster than you can run `fsck --full`.
If you really want to verify every single object's SHA-1, run the
damn pack through index-pack and compare the new index to the old
index (hint they should be identical, bit-for-bit).
So again, in this case the caching may only cause us to do more
work (and use more memory) than we need as we are slamming the
delta base cache with all sorts of unrelated base objects. It
probably has a low hit ratio anyway during this loop. :-|
--
Shawn.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] restore legacy behavior for read_sha1_file()
2008-07-15 1:46 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
2008-07-15 1:50 ` [PATCH 2/2] test case for previous commit Nicolas Pitre
@ 2008-07-15 5:12 ` Nicolas Pitre
1 sibling, 0 replies; 9+ messages in thread
From: Nicolas Pitre @ 2008-07-15 5:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, 14 Jul 2008, Nicolas Pitre wrote:
> Since commit 8eca0b47ff1598a6d163df9358c0e0c9bd92d4c8, it is possible
> for read_sha1_file() to return NULL even with existing objects when they
> are corrupted. Previously a corrupted object would have terminated the
> program immediately, effectively making read_sha1_file() return NULL
> only when specified object is not found.
>
> Let's restore this behavior for all users of read_sha1_file() and
> provide a separate function with the ability to not terminate when
> bad objects are encountered.
>
> Signed-off-by: Nicolas Pitre <nico@cam.org>
Grrrr. Forgot to 'git add' one file. So this goes with the same
commit:
diff --git a/cache.h b/cache.h
index 4a8b125..bc52af6 100644
--- a/cache.h
+++ b/cache.h
@@ -540,6 +540,9 @@ extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsig
extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
extern int force_object_loose(const unsigned char *sha1, time_t mtime);
+/* just like read_sha1_file(), but non fatal in presence of bad objects */
+extern void *read_object(const unsigned char *sha1, enum object_type *type, unsigned long *size);
+
extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
extern int move_temp_to_file(const char *tmpfile, const char *filename);
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-07-15 5:13 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-14 20:54 [BUG] commit walk machinery is dangerous ! Nicolas Pitre
2008-07-14 21:55 ` Junio C Hamano
2008-07-14 23:08 ` Nicolas Pitre
2008-07-14 23:32 ` Junio C Hamano
2008-07-14 23:39 ` Nicolas Pitre
2008-07-15 1:46 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
2008-07-15 1:50 ` [PATCH 2/2] test case for previous commit Nicolas Pitre
2008-07-15 5:12 ` [PATCH 1/2] restore legacy behavior for read_sha1_file() Nicolas Pitre
2008-07-15 5:10 ` unpack_entry (was: [BUG] commit walk machinery is dangerous !) Shawn O. Pearce
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).