* [PATCH] make pack-objects a bit more resilient to repo corruption @ 2010-10-22 4:53 Nicolas Pitre 2010-10-22 14:46 ` Jeff King 0 siblings, 1 reply; 11+ messages in thread From: Nicolas Pitre @ 2010-10-22 4:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Right now, packing valid objects could fail when creating a thin pack simply because a pack edge object used as a preferred base is corrupted. Since preferred base objects are not strictly needed to produce a valid pack, let's not consider the inability to read them as a fatal error. Delta compression may well be attempted against other objects in the search window. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> --- I wrote this patch while helping Uwe with his repo corruption. While his problem turned out to be different from the one this patch is addressing (see previous patch I posted) I think that since I did the patch already then this wouldn't hurt to have this one merged too. diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f8eba53..674247e 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1298,9 +1298,19 @@ static int try_delta(struct unpacked *trg, struct unpacked *src, read_lock(); src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); read_unlock(); - if (!src->data) + if (!src->data) { + if (src_entry->preferred_base) { + /* + * Those objects are not included in the + * resulting pack. Be resilient and ignore + * them if they can't be read, in case the + * pack could be created nevertheless. + */ + return 0; + } die("object %s cannot be read", sha1_to_hex(src_entry->idx.sha1)); + } if (sz != src_size) die("object %s inconsistent object length (%lu vs %lu)", sha1_to_hex(src_entry->idx.sha1), sz, src_size); ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] make pack-objects a bit more resilient to repo corruption 2010-10-22 4:53 [PATCH] make pack-objects a bit more resilient to repo corruption Nicolas Pitre @ 2010-10-22 14:46 ` Jeff King 2010-10-22 15:24 ` Drew Northup 2010-10-22 18:42 ` Nicolas Pitre 0 siblings, 2 replies; 11+ messages in thread From: Jeff King @ 2010-10-22 14:46 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git On Fri, Oct 22, 2010 at 12:53:32AM -0400, Nicolas Pitre wrote: > - if (!src->data) > + if (!src->data) { > + if (src_entry->preferred_base) { > + /* > + * Those objects are not included in the > + * resulting pack. Be resilient and ignore > + * them if they can't be read, in case the > + * pack could be created nevertheless. > + */ > + return 0; > + } > die("object %s cannot be read", > sha1_to_hex(src_entry->idx.sha1)); > + } By converting this die() into a silent return, are we losing a place where git might previously have alerted a user to corruption? In this case, we can continue the operation without the object, but if we have detected corruption, letting the user know as soon as possible is probably a good idea. In other words, should this instead be: warning("unable to read preferred base object: %s", ...); return 0; Or will some other part of the code already complained to stderr? -Peff ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] make pack-objects a bit more resilient to repo corruption 2010-10-22 14:46 ` Jeff King @ 2010-10-22 15:24 ` Drew Northup 2010-10-22 18:54 ` Nicolas Pitre 2010-10-22 18:42 ` Nicolas Pitre 1 sibling, 1 reply; 11+ messages in thread From: Drew Northup @ 2010-10-22 15:24 UTC (permalink / raw) To: Jeff King; +Cc: Nicolas Pitre, Junio C Hamano, git On Fri, 2010-10-22 at 10:46 -0400, Jeff King wrote: > On Fri, Oct 22, 2010 at 12:53:32AM -0400, Nicolas Pitre wrote: > > > - if (!src->data) > > + if (!src->data) { > > + if (src_entry->preferred_base) { > > + /* > > + * Those objects are not included in the > > + * resulting pack. Be resilient and ignore > > + * them if they can't be read, in case the > > + * pack could be created nevertheless. > > + */ > > + return 0; > > + } > > die("object %s cannot be read", > > sha1_to_hex(src_entry->idx.sha1)); > > + } > > By converting this die() into a silent return, are we losing a place > where git might previously have alerted a user to corruption? In this > case, we can continue the operation without the object, but if we have > detected corruption, letting the user know as soon as possible is > probably a good idea. > > In other words, should this instead be: > > warning("unable to read preferred base object: %s", ...); > return 0; > > Or will some other part of the code already complained to stderr? > > -Peff Agreed. If it broke we should probably tell the user--even if we can't do much useful about it other than attempt to recover by continuing. -- -Drew Northup N1XIM AKA RvnPhnx on OPN ________________________________________________ "As opposed to vegetable or mineral error?" -John Pescatore, SANS NewsBites Vol. 12 Num. 59 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] make pack-objects a bit more resilient to repo corruption 2010-10-22 15:24 ` Drew Northup @ 2010-10-22 18:54 ` Nicolas Pitre 0 siblings, 0 replies; 11+ messages in thread From: Nicolas Pitre @ 2010-10-22 18:54 UTC (permalink / raw) To: Drew Northup; +Cc: Jeff King, Junio C Hamano, git On Fri, 22 Oct 2010, Drew Northup wrote: > > On Fri, 2010-10-22 at 10:46 -0400, Jeff King wrote: > > On Fri, Oct 22, 2010 at 12:53:32AM -0400, Nicolas Pitre wrote: > > > > > - if (!src->data) > > > + if (!src->data) { > > > + if (src_entry->preferred_base) { > > > + /* > > > + * Those objects are not included in the > > > + * resulting pack. Be resilient and ignore > > > + * them if they can't be read, in case the > > > + * pack could be created nevertheless. > > > + */ > > > + return 0; > > > + } > > > die("object %s cannot be read", > > > sha1_to_hex(src_entry->idx.sha1)); > > > + } > > > > By converting this die() into a silent return, are we losing a place > > where git might previously have alerted a user to corruption? In this > > case, we can continue the operation without the object, but if we have > > detected corruption, letting the user know as soon as possible is > > probably a good idea. > > > > In other words, should this instead be: > > > > warning("unable to read preferred base object: %s", ...); > > return 0; > > > > Or will some other part of the code already complained to stderr? > > > > -Peff > > Agreed. If it broke we should probably tell the user--even if we can't > do much useful about it other than attempt to recover by continuing. Please don't misinterpret this case. As far as this change is concerned, nothing is actually "broken". The operation _will_ still succeed. The repository may be broken, but in this case we can do without the broken object. In those cases where the object is really needed the original die() is still in place. Nicolas ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] make pack-objects a bit more resilient to repo corruption 2010-10-22 14:46 ` Jeff King 2010-10-22 15:24 ` Drew Northup @ 2010-10-22 18:42 ` Nicolas Pitre 2010-10-22 20:26 ` [PATCH v2] " Nicolas Pitre 1 sibling, 1 reply; 11+ messages in thread From: Nicolas Pitre @ 2010-10-22 18:42 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git On Fri, 22 Oct 2010, Jeff King wrote: > On Fri, Oct 22, 2010 at 12:53:32AM -0400, Nicolas Pitre wrote: > > > - if (!src->data) > > + if (!src->data) { > > + if (src_entry->preferred_base) { > > + /* > > + * Those objects are not included in the > > + * resulting pack. Be resilient and ignore > > + * them if they can't be read, in case the > > + * pack could be created nevertheless. > > + */ > > + return 0; > > + } > > die("object %s cannot be read", > > sha1_to_hex(src_entry->idx.sha1)); > > + } > > By converting this die() into a silent return, are we losing a place > where git might previously have alerted a user to corruption? In this > case, we can continue the operation without the object, but if we have > detected corruption, letting the user know as soon as possible is > probably a good idea. > > In other words, should this instead be: > > warning("unable to read preferred base object: %s", ...); > return 0; Well, this get called repeatedly, being within the inner part of the delta search loop. So you might get that warning as many times as the delta window which is not that nice. If anything a static flag to display the warning only once would be needed. But you're pretty likely to have met that warning/error already from other operations, which is why I didn't bother. > Or will some other part of the code already complained to stderr? Some other part is likely to already have complained, through check_object() -> sha1_object_info(). But not necessarily in all cases. Nicolas ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2] make pack-objects a bit more resilient to repo corruption 2010-10-22 18:42 ` Nicolas Pitre @ 2010-10-22 20:26 ` Nicolas Pitre 2010-10-22 20:50 ` Sverre Rabbelier 0 siblings, 1 reply; 11+ messages in thread From: Nicolas Pitre @ 2010-10-22 20:26 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git Right now, packing valid objects could fail when creating a thin pack simply because a pack edge object used as a preferred base is corrupted. Since preferred base objects are not strictly needed to produce a valid pack, let's not consider the inability to read them as a fatal error. Delta compression may well be attempted against other objects in the search window. To avoid warning storms (we are in the inner loop of the delta search window) a warning is emitted only on the first occurrence. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> --- On Fri, 22 Oct 2010, Nicolas Pitre wrote: > On Fri, 22 Oct 2010, Jeff King wrote: > > > By converting this die() into a silent return, are we losing a place > > where git might previously have alerted a user to corruption? In this > > case, we can continue the operation without the object, but if we have > > detected corruption, letting the user know as soon as possible is > > probably a good idea. > > > > In other words, should this instead be: > > > > warning("unable to read preferred base object: %s", ...); > > return 0; > > Well, this get called repeatedly, being within the inner part of the > delta search loop. So you might get that warning as many times as the > delta window which is not that nice. If anything a static flag to > display the warning only once would be needed. But you're pretty likely > to have met that warning/error already from other operations, which is > why I didn't bother. > > > Or will some other part of the code already complained to stderr? > > Some other part is likely to already have complained, through > check_object() -> sha1_object_info(). But not necessarily in all cases. OK... After further analysis, it seems that the cases when problem objects are already warned about through sha1_object_info(), those objects will never end up in the delta search window, as their type ends up being a negative error code. We already support that possibility on purpose even. So let's add a warning for when check_object() was able to bypass the more expensive sha1_object_info() call and therefore object corruptions remain undetected until that point. diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f8eba53..81155b4 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1298,9 +1298,23 @@ static int try_delta(struct unpacked *trg, struct unpacked *src, read_lock(); src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); read_unlock(); - if (!src->data) + if (!src->data) { + if (src_entry->preferred_base) { + static int warned = 0; + if (!warned++) + warning("object %s cannot be read", + sha1_to_hex(src_entry->idx.sha1)); + /* + * Those objects are not included in the + * resulting pack. Be resilient and ignore + * them if they can't be read, in case the + * pack could be created nevertheless. + */ + return 0; + } die("object %s cannot be read", sha1_to_hex(src_entry->idx.sha1)); + } if (sz != src_size) die("object %s inconsistent object length (%lu vs %lu)", sha1_to_hex(src_entry->idx.sha1), sz, src_size); ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] make pack-objects a bit more resilient to repo corruption 2010-10-22 20:26 ` [PATCH v2] " Nicolas Pitre @ 2010-10-22 20:50 ` Sverre Rabbelier 2010-10-22 21:19 ` Nicolas Pitre 0 siblings, 1 reply; 11+ messages in thread From: Sverre Rabbelier @ 2010-10-22 20:50 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Jeff King, Junio C Hamano, git Heya, On Fri, Oct 22, 2010 at 13:26, Nicolas Pitre <nico@fluxnic.net> wrote: > + static int warned = 0; > + if (!warned++) > + warning("object %s cannot be read", > + sha1_to_hex(src_entry->idx.sha1)); How does this handle multiple missing objects? Will it only warn for the first one? -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] make pack-objects a bit more resilient to repo corruption 2010-10-22 20:50 ` Sverre Rabbelier @ 2010-10-22 21:19 ` Nicolas Pitre 2010-10-22 21:59 ` Junio C Hamano 2010-10-24 2:26 ` Geert Bosch 0 siblings, 2 replies; 11+ messages in thread From: Nicolas Pitre @ 2010-10-22 21:19 UTC (permalink / raw) To: Sverre Rabbelier; +Cc: Jeff King, Junio C Hamano, git [-- Attachment #1: Type: TEXT/PLAIN, Size: 943 bytes --] On Fri, 22 Oct 2010, Sverre Rabbelier wrote: > Heya, > > On Fri, Oct 22, 2010 at 13:26, Nicolas Pitre <nico@fluxnic.net> wrote: > > + static int warned = 0; > > + if (!warned++) > > + warning("object %s cannot be read", > > + sha1_to_hex(src_entry->idx.sha1)); > > How does this handle multiple missing objects? Will it only warn for > the first one? Yes, only the first one, so you have a bone to chase if that ever happens to you. And that's good enough IMHO. Trying to warn for every missing object would require extra storage per object to remember if any particular object was warned for already, which is I think overkill for an extremely unlikely event. Comprehensive reporting is the job of fsck. Nicolas ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] make pack-objects a bit more resilient to repo corruption 2010-10-22 21:19 ` Nicolas Pitre @ 2010-10-22 21:59 ` Junio C Hamano 2010-10-24 2:26 ` Geert Bosch 1 sibling, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2010-10-22 21:59 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Sverre Rabbelier, Jeff King, git Nicolas Pitre <nico@fluxnic.net> writes: > Yes, only the first one, ... > ... Comprehensive reporting is the job of > fsck. I had the same knee-jerk reaction, and about to suggest using one bit from flags, but I agree the design balance you struck here is a good one. Thanks. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] make pack-objects a bit more resilient to repo corruption 2010-10-22 21:19 ` Nicolas Pitre 2010-10-22 21:59 ` Junio C Hamano @ 2010-10-24 2:26 ` Geert Bosch 2010-10-24 2:47 ` Nicolas Pitre 1 sibling, 1 reply; 11+ messages in thread From: Geert Bosch @ 2010-10-24 2:26 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Sverre Rabbelier, Jeff King, Junio C Hamano, git On Oct 22, 2010, at 17:19, Nicolas Pitre wrote: >> On Fri, Oct 22, 2010 at 13:26, Nicolas Pitre <nico@fluxnic.net> wrote: >>> + static int warned = 0; >>> + if (!warned++) >>> + warning("object %s cannot be read", >>> + sha1_to_hex(src_entry->idx.sha1)); >> >> How does this handle multiple missing objects? Will it only warn for >> the first one? > > Yes, only the first one, so you have a bone to chase if that ever > happens to you. And that's good enough IMHO. Trying to warn for every > missing object would require extra storage per object to remember if any > particular object was warned for already, which is I think overkill for > an extremely unlikely event. Comprehensive reporting is the job of > fsck. Maybe add a ", run git fsck" to the message. Will still comfortably fit a line. -Geert ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] make pack-objects a bit more resilient to repo corruption 2010-10-24 2:26 ` Geert Bosch @ 2010-10-24 2:47 ` Nicolas Pitre 0 siblings, 0 replies; 11+ messages in thread From: Nicolas Pitre @ 2010-10-24 2:47 UTC (permalink / raw) To: Geert Bosch; +Cc: Sverre Rabbelier, Jeff King, Junio C Hamano, git On Sat, 23 Oct 2010, Geert Bosch wrote: > > On Oct 22, 2010, at 17:19, Nicolas Pitre wrote: > > >> On Fri, Oct 22, 2010 at 13:26, Nicolas Pitre <nico@fluxnic.net> wrote: > >>> + static int warned = 0; > >>> + if (!warned++) > >>> + warning("object %s cannot be read", > >>> + sha1_to_hex(src_entry->idx.sha1)); > >> > >> How does this handle multiple missing objects? Will it only warn for > >> the first one? > > > > Yes, only the first one, so you have a bone to chase if that ever > > happens to you. And that's good enough IMHO. Trying to warn for every > > missing object would require extra storage per object to remember if any > > particular object was warned for already, which is I think overkill for > > an extremely unlikely event. Comprehensive reporting is the job of > > fsck. > > Maybe add a ", run git fsck" to the message. Will still comfortably fit a line. Maybe if this message ever gets printed often enough. Let's see if someone will even report it before 2012, and be clueless about it. And to be consistent, you'd have to do the same throughout the code where this could be relevant. Furthermore, if someone really need the additional clue, then I'm afraid that the current fsck output won't help at all except to confuse that person even more. Better for people to ask for help on this list when things break due to corruptions if they can't figure it out on their own. Nicolas ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-10-24 3:03 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-10-22 4:53 [PATCH] make pack-objects a bit more resilient to repo corruption Nicolas Pitre 2010-10-22 14:46 ` Jeff King 2010-10-22 15:24 ` Drew Northup 2010-10-22 18:54 ` Nicolas Pitre 2010-10-22 18:42 ` Nicolas Pitre 2010-10-22 20:26 ` [PATCH v2] " Nicolas Pitre 2010-10-22 20:50 ` Sverre Rabbelier 2010-10-22 21:19 ` Nicolas Pitre 2010-10-22 21:59 ` Junio C Hamano 2010-10-24 2:26 ` Geert Bosch 2010-10-24 2:47 ` Nicolas Pitre
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).