* corrupt repos does not return error with `git fsck`
@ 2015-05-20 16:17 Faheem Mitha
  2015-05-20 17:19 ` [PUB]corrupt " Matthieu Moy
  0 siblings, 1 reply; 19+ messages in thread
From: Faheem Mitha @ 2015-05-20 16:17 UTC (permalink / raw)
  To: git


Hi,

Clone the repos https://github.com/fmitha/SICL.

Then

     git show 280c12ab49223c64c6f914944287a7d049cf4dd0

gives

     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0

But

     git fsck

gives

     Checking object directories: 100% (256/256), done.
     Checking objects: 100% (49356/49356), done.

So `git fsck` does not return an error, though the repos is corrupt. This 
may be of interest to the developers.

Please CC me on any reply, I'm not subscribed to the list. Thanks.

                                             Regards, Faheem Mitha


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 16:17 corrupt repos does not return error with `git fsck` Faheem Mitha
@ 2015-05-20 17:19 ` Matthieu Moy
  2015-05-20 17:40   ` Johannes Schindelin
  2015-05-20 17:58   ` Faheem Mitha
  0 siblings, 2 replies; 19+ messages in thread
From: Matthieu Moy @ 2015-05-20 17:19 UTC (permalink / raw)
  To: Faheem Mitha; +Cc: git

Faheem Mitha <faheem@faheem.info> writes:

> Hi,
>
> Clone the repos https://github.com/fmitha/SICL.
>
> Then
>
>     git show 280c12ab49223c64c6f914944287a7d049cf4dd0
>
> gives
>
>     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0

It seems 280c12ab49223c64c6f914944287a7d049cf4dd0 is not an object in
your repository. The good news is: I don't think you have a corrupt
repository. What makes you think you have an object with identifier
280c12ab49223c64c6f914944287a7d049cf4dd0?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 17:19 ` [PUB]corrupt " Matthieu Moy
@ 2015-05-20 17:40   ` Johannes Schindelin
  2015-05-20 18:02     ` Stefan Beller
  2015-05-20 17:58   ` Faheem Mitha
  1 sibling, 1 reply; 19+ messages in thread
From: Johannes Schindelin @ 2015-05-20 17:40 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Faheem Mitha, git

Hi,

On 2015-05-20 19:19, Matthieu Moy wrote:
> Faheem Mitha <faheem@faheem.info> writes:
> 
>> Clone the repos https://github.com/fmitha/SICL.
>>
>> Then
>>
>>     git show 280c12ab49223c64c6f914944287a7d049cf4dd0
>>
>> gives
>>
>>     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> 
> It seems 280c12ab49223c64c6f914944287a7d049cf4dd0 is not an object in
> your repository. The good news is: I don't think you have a corrupt
> repository. What makes you think you have an object with identifier
> 280c12ab49223c64c6f914944287a7d049cf4dd0?

I had a similar problem some time ago and tracked it down to a graft that was active while pushing to the public repository. Maybe it's the same problem here?

Ciao,
Johannes


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 17:19 ` [PUB]corrupt " Matthieu Moy
  2015-05-20 17:40   ` Johannes Schindelin
@ 2015-05-20 17:58   ` Faheem Mitha
  2015-05-21  8:09     ` Matthieu Moy
  1 sibling, 1 reply; 19+ messages in thread
From: Faheem Mitha @ 2015-05-20 17:58 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git


On Wed, 20 May 2015, Matthieu Moy wrote:

> Faheem Mitha <faheem@faheem.info> writes:

>> Hi,

>> Clone the repos https://github.com/fmitha/SICL.

>> Then

>>     git show 280c12ab49223c64c6f914944287a7d049cf4dd0

>> gives

>>     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0

> It seems 280c12ab49223c64c6f914944287a7d049cf4dd0 is not an object in
> your repository. The good news is: I don't think you have a corrupt
> repository. What makes you think you have an object with identifier
> 280c12ab49223c64c6f914944287a7d049cf4dd0?

I was going by the answer (by CodeWizard) in 
http://stackoverflow.com/q/30348615/350713

The question there also gives the context of this question.

The repos I referenced in my post to the git mailing list just now, is 
just a clone of https://github.com/drmeister/SICL.

If I just give a random hash to `git show` in that repos, I get

     fatal: ambiguous argument '...': unknown revision or path not in the working tree.

It seemed reasonable to assume (based on what little knowledge I had) 
that the 280c12ab49223c64c6f914944287a7d049cf4dd0 commit was the 
problem.

However, this repos is a fork of another repos, namely 
https://github.com/robert-strandh/SICL

That repos contains more recent commits than the fork does.

If I take any of the more recent commits from that repos, and try the hash 
with `git show`, i.e.

     git show <hash>

in the fork, I get the same error, which makes me think something else 
must be going on.

Chris (drmeister) has modified the path the submodule is obtained from, so 
the instructions in the SO question won't work as a reproduction recipe 
any more, but if you want to take a look I could clone his repos and set 
it up the same way it was. Let me know.

                                                     Regards, Faheem Mitha


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 17:40   ` Johannes Schindelin
@ 2015-05-20 18:02     ` Stefan Beller
  2015-05-20 18:19       ` John Keeping
                         ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Stefan Beller @ 2015-05-20 18:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Matthieu Moy, Faheem Mitha, git@vger.kernel.org

$ git clone https://github.com/fmitha/SICL
$ cd SICL
$ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
$ git show 12323213123 # just to be sure to have a different error message for non-existing objects
fatal: ambiguous argument '12323213123': unknown revision or path not in the working tree.

$ mv .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack .
$ rm .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.idx
$ git unpack-objects < pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack
$ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
$ ls .git/objects/28/0*
.git/objects/28/01fef08b1dccf9725dde919a7373748a046cb7
.git/objects/28/03d8c1cb3275979ff2d8408450844f6a78a70d
.git/objects/28/0663a93d702a7fcb0dd36f461397f6b50ba01e
.git/objects/28/068e2656dd4bac61050e870712578032af9144
.git/objects/28/074e890d6ff2bb61eb7796bc500b6d8e344ad2
.git/objects/28/08596ac465cf8a819a9b13ad2f855e9a8a3235
.git/objects/28/098184d1ba97453227c18628cdf13087b6bce2
.git/objects/28/0ba19c68b26ee7c799ef8ca09d540a5ad7a5b2
.git/objects/28/0d66213173f0ae7aaae8684f3efcb1f8790792
.git/objects/28/0da35374c32303cbd726bef9847f18d7428d5e

There is no file 28/0c... however.


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:02     ` Stefan Beller
@ 2015-05-20 18:19       ` John Keeping
  2015-05-20 18:22       ` Jeff King
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: John Keeping @ 2015-05-20 18:19 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

On Wed, May 20, 2015 at 11:02:14AM -0700, Stefan Beller wrote:
> $ git clone https://github.com/fmitha/SICL
> cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error
> message for non existing objects.
> fatal: ambiguous argument '12323213123': unknown revision or path not
> in the working tree.

I think 40 hex characters is special cased.  Using CGit as a repository
with a submodule so I can easily get an unrelated SHA1 and short name:

cgit $ git show $(git -C git rev-parse @)
fatal: bad object bb8577532add843833ebf8b5324f94f84cb71ca0
cgit $ git show $(git -C git rev-parse --short @)
fatal: ambiguous argument 'bb85775': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:02     ` Stefan Beller
  2015-05-20 18:19       ` John Keeping
@ 2015-05-20 18:22       ` Jeff King
  2015-05-20 18:31         ` Jeff King
  2015-05-20 20:16         ` Junio C Hamano
  2015-05-20 18:24       ` Faheem Mitha
  2015-05-20 21:03       ` Matthieu Moy
  3 siblings, 2 replies; 19+ messages in thread
From: Jeff King @ 2015-05-20 18:22 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

On Wed, May 20, 2015 at 11:02:14AM -0700, Stefan Beller wrote:

> $ git clone https://github.com/fmitha/SICL
> cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error
> message for non existing objects.
> fatal: ambiguous argument '12323213123': unknown revision or path not
> in the working tree.

Yeah, this is well-known. If you give a partial hash, the error comes
from get_sha1(), which says "hey, this doesn't look like anything I know
about". If you feed a whole hash, we skip all that and say "well, you
_definitely_ meant this sha1", and then later code complains when it
cannot be read.
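
The difference between resolving a name and reading an object can be seen with a small sketch. It assumes a SHA-1 repository created in a throwaway mktemp directory; the variable names (repo, missing, resolved, exists) are made up for illustration and the 40-hex value is just an arbitrary name that no object has.

```shell
#!/bin/sh
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# A well-formed 40-hex name for which no object exists.
missing=1234abcdef1234abcdef1234abcdef1234abcdef

# Name resolution alone: a full 40-hex string is accepted as-is.
if git rev-parse --verify -q "$missing" >/dev/null; then
  resolved=yes
else
  resolved=no
fi

# Existence check: fails because nothing is stored under that name.
if git cat-file -e "$missing" 2>/dev/null; then
  exists=yes
else
  exists=no
fi

echo "resolved=$resolved exists=$exists"
```

In other words, the name resolves without complaint and only the later attempt to look the object up fails, which is the point at which "git show" reports "bad object".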

We could add a has_sha1_file() check in get_sha1 for this case. I can't
think offhand of any reason it would need to be called with a
non-existent object, but there may be some lurking corner case (e.g.,
"cat-file -e" or something).

-Peff


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:02     ` Stefan Beller
  2015-05-20 18:19       ` John Keeping
  2015-05-20 18:22       ` Jeff King
@ 2015-05-20 18:24       ` Faheem Mitha
  2015-05-20 18:54         ` Stefan Beller
  2015-05-20 21:03       ` Matthieu Moy
  3 siblings, 1 reply; 19+ messages in thread
From: Faheem Mitha @ 2015-05-20 18:24 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes Schindelin, Matthieu Moy, git@vger.kernel.org


Hi Stefan,

Thank you for the reply, but I don't follow what conclusion you are 
drawing, if any.

On Wed, 20 May 2015, Stefan Beller wrote:

> $ git clone https://github.com/fmitha/SICL
> cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error
> message for non existing objects.
> fatal: ambiguous argument '12323213123': unknown revision or path not
> in the working tree.
>
> $ mv .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack .
> $ rm .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.idx
> $ git unpack-objects < pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ ls .git/objects/28/0*
> .git/objects/28/01fef08b1dccf9725dde919a7373748a046cb7
> .git/objects/28/03d8c1cb3275979ff2d8408450844f6a78a70d
> .git/objects/28/0663a93d702a7fcb0dd36f461397f6b50ba01e
> .git/objects/28/068e2656dd4bac61050e870712578032af9144
> .git/objects/28/074e890d6ff2bb61eb7796bc500b6d8e344ad2
> .git/objects/28/08596ac465cf8a819a9b13ad2f855e9a8a3235
> .git/objects/28/098184d1ba97453227c18628cdf13087b6bce2
> .git/objects/28/0ba19c68b26ee7c799ef8ca09d540a5ad7a5b2
> .git/objects/28/0d66213173f0ae7aaae8684f3efcb1f8790792
> .git/objects/28/0da35374c32303cbd726bef9847f18d7428d5e
>
> There is no file 28/0c... however.

So, is the repos corrupt or not? Also, I don't understand why you say

     There is no file 28/0c... however.

Why would you expect there to be? I don't see it mentioned in that list.

I apologise for my ignorance. I don't really know anything about git. I 
just happened to encounter this error.

                                                   Regards, Faheem Mitha


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:22       ` Jeff King
@ 2015-05-20 18:31         ` Jeff King
  2015-05-20 20:39           ` Junio C Hamano
  2015-05-20 20:16         ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2015-05-20 18:31 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

On Wed, May 20, 2015 at 02:22:19PM -0400, Jeff King wrote:

> On Wed, May 20, 2015 at 11:02:14AM -0700, Stefan Beller wrote:
> 
> > $ git clone https://github.com/fmitha/SICL
> > cd SICL
> > $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> > fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> > $ git show 12323213123 # just to be sure to have a different error
> > message for non existing objects.
> > fatal: ambiguous argument '12323213123': unknown revision or path not
> > in the working tree.
> 
> Yeah, this is well-known. If you give a partial hash, the error comes
> from get_sha1(), which says "hey, this doesn't look like anything I know
> about". If you feed a whole hash, we skip all that and say "well, you
> _definitely_ meant this sha1", and then later code complains when it
> cannot be read.
> 
> We could add a has_sha1_file() check in get_sha1 for this case. I can't
> think offhand of any reason it would need to be called with a
> non-existent object, but there may be some lurking corner case (e.g.,
> "cat-file -e" or something).

I should have looked before replying. It would indeed break "cat-file
-e" horribly. So the right answer may be to just improve the "bad
object" message (probably by checking has_sha1_file there and diagnosing
it either as missing or corrupted).

-Peff


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:24       ` Faheem Mitha
@ 2015-05-20 18:54         ` Stefan Beller
  2015-05-20 19:13           ` Faheem Mitha
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Beller @ 2015-05-20 18:54 UTC (permalink / raw)
  To: Faheem Mitha; +Cc: Johannes Schindelin, Matthieu Moy, git@vger.kernel.org

On Wed, May 20, 2015 at 11:24 AM, Faheem Mitha <faheem@faheem.info> wrote:

> So, is the repos corrupt or not? Also, I don't understand why you say
>
>     There is no file 28/0c... however.
>
> Why would you expect there to be? I don't see it mentioned in that list.
>

Each object is stored at .git/objects/<xz>/<tail> with <xz> being the first
2 characters of the sha1 and the tail the remaining 38 characters of the sha1.
I did not draw a conclusion yet, as I needed to run for a meeting.

So the object you're looking for is not there (stating this as a fact).
But why would you expect it to be there? At the time of sending the previous
email I tried to do a reverse search "Give me all objects, which
reference objectX"
but did not succeed yet.
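
As a rough illustration of that layout, the following sketch maps a full SHA-1 to the path a loose object would have. The helper name loose_path is hypothetical, and the mapping only covers loose objects: an object that lives in a pack file has no such path, so "git cat-file -e" remains the more reliable existence check.

```shell
#!/bin/sh
# Hypothetical helper: print the loose-object path for a full 40-hex SHA-1.
# The first 2 characters name the directory, the remaining 38 the file.
loose_path() {
  dir=$(printf '%s' "$1" | cut -c1-2)
  file=$(printf '%s' "$1" | cut -c3-)
  printf '.git/objects/%s/%s\n' "$dir" "$file"
}

loose_path 280c12ab49223c64c6f914944287a7d049cf4dd0
# prints .git/objects/28/0c12ab49223c64c6f914944287a7d049cf4dd0
```

This is why the listing of .git/objects/28/0* is relevant: a loose object with that name would have to appear there as a file starting with 0c.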

Stefan


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:54         ` Stefan Beller
@ 2015-05-20 19:13           ` Faheem Mitha
  0 siblings, 0 replies; 19+ messages in thread
From: Faheem Mitha @ 2015-05-20 19:13 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes Schindelin, Matthieu Moy, git@vger.kernel.org


On Wed, 20 May 2015, Stefan Beller wrote:

> On Wed, May 20, 2015 at 11:24 AM, Faheem Mitha <faheem@faheem.info> wrote:

>> So, is the repos corrupt or not? Also, I don't understand why you say

>>     There is no file 28/0c... however.

>> Why would you expect there to be? I don't see it mentioned in that list.

> Each object is stored at .git/objects/<xz>/<tail> with <xz> being the 
> first 2 characters of the sha1 and the tail the remaining 38 characters 
> of the sha1. I did not draw a conclusion yet, as I needed to run for a 
> meeting.

> So the object you're looking for is not there (stating this as a fact). 
> But why would you expect it to be there? At the time of sending the 
> previous email I tried to do a reverse search "Give me all objects, 
> which reference objectX" but did not succeed yet.

Ok. See my reply to Matthieu Moy for context. I may have been taking too 
much for granted before posting to this list. Maybe I should have asked 
here first.

As I wrote to him, I can reconstruct the original setup if anyone thinks 
it is worthwhile trying to investigate further.

                                                    Regards, Faheem Mitha


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:22       ` Jeff King
  2015-05-20 18:31         ` Jeff King
@ 2015-05-20 20:16         ` Junio C Hamano
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2015-05-20 20:16 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> We could add a has_sha1_file() check in get_sha1 for this case.

Please don't.  get_sha1() is merely "I have this string, which may
be a 40-hex or an extended SHA-1 expression.  Turn it into a 20-byte
binary" and does not require you to have any such object.


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:31         ` Jeff King
@ 2015-05-20 20:39           ` Junio C Hamano
  2015-05-20 20:57             ` Stefan Beller
  2015-05-20 21:55             ` Jeff King
  0 siblings, 2 replies; 19+ messages in thread
From: Junio C Hamano @ 2015-05-20 20:39 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> I should have looked before replying. It would indeed break "cat-file
> -e" horribly. So the right answer may be to just improve the "bad
> object" message (probably by checking has_sha1_file there and diagnosing
> it either as missing or corrupted).

I should have looked before replying, too ;-)

Yeah, "bad object" sounds as if we tried to parse something that
exists and it was corrupt.  So classifying "a file or a pack index
entry exists where a valid object with that name should reside in"
as "bad object" and "there is no such file or a pack index entry
that would house the named object" as "missing object" _might_ make
things better.

But let's think about it a bit more.  Would it have prevented the
original confusion if we said "missing object"?  I have a feeling
that it wouldn't have.  Faheem was so convinced that the object
named with the 40-hex *must* exist in the cloned repository, and if
we told "missing object" to such a person, it will just reinforce the
(mis)conception that the repository is somehow corrupt, when it is
not.

So...


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 20:39           ` Junio C Hamano
@ 2015-05-20 20:57             ` Stefan Beller
  2015-05-20 21:06               ` Junio C Hamano
  2015-05-20 21:55             ` Jeff King
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Beller @ 2015-05-20 20:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

On Wed, May 20, 2015 at 1:39 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> So...

maybe we need a command:

Given this SHA1, tell me anything you know about it,
Is it a {blob,tree,commit,tag}?
Is it referenced from anywhere else in this repository and if so, which type?
And if it is not referenced, nor an object, tell me so explicitly.


This would have helped a lot for this confusion:

    $ git frotz 280c12...
     No object exists with such a substring (either as prefix, postfix or in between)
     No other object is referencing any object containing this substring as pre/post-fix

and this issue would have been resolved in a heartbeat.

Especially the verbose output contradicts the terse Unix style, though,
and this command is tailored to this issue, so I don't know whether it is
useful outside this specific problem.


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 18:02     ` Stefan Beller
                         ` (2 preceding siblings ...)
  2015-05-20 18:24       ` Faheem Mitha
@ 2015-05-20 21:03       ` Matthieu Moy
  3 siblings, 0 replies; 19+ messages in thread
From: Matthieu Moy @ 2015-05-20 21:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes Schindelin, Matthieu Moy, Faheem Mitha, git

sbeller@google.com writes:
> $ git clone https://github.com/fmitha/SICL
> cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error message for non existing objects.

I did the same, but the error message is different if you provide an abbreviated sha1 or a full 40-character sha1.

Any full sha1 I tried gave the same error message.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 20:57             ` Stefan Beller
@ 2015-05-20 21:06               ` Junio C Hamano
  2015-05-20 21:13                 ` Stefan Beller
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2015-05-20 21:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Jeff King, Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

Stefan Beller <sbeller@google.com> writes:

> On Wed, May 20, 2015 at 1:39 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> So...
>
> maybe we need a command:
>
> Given this SHA1, tell me anything you know about it,
> Is it a {blob,tree,commit,tag}?
> Is it referenced from anywhere else in this repository and if so, which type?
> And if it is not referenced, nor an object, tell me so explicitly.

Let me add another to that list ;-)

  I have this prefix; please enumerate all known objects that share it.

I do not know the value of the first two in your list.  If it is a
known object, then you throw it at "git show", "git cat-file -t" and
dig from there.  If it is not known, there is nothing more to do.

I do not know if "need" is the right word, but I hope that you
realize the last two among the four you listed need the equivalent
of "git fsck".  It is an expensive operation.
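
For the prefix question, one possible sketch is below. It assumes a Git new enough to support "cat-file --batch-all-objects" (this option is newer than the Git releases current at the time of this thread), and it walks every object in the repository, so it is not cheap either. The throwaway repository and the variable names (repo, blob, prefix, matches) are only for illustration.

```shell
#!/bin/sh
# Throwaway repository containing a single blob.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
blob=$(echo content | git hash-object -w --stdin)

# Take the first four hex digits of that blob's name as the prefix to search for.
prefix=$(printf '%s' "$blob" | cut -c1-4)

# List every object known to the repository with its type,
# then keep only the names that start with the prefix.
matches=$(git cat-file --batch-all-objects \
            --batch-check='%(objectname) %(objecttype)' | grep "^$prefix")

echo "$matches"
```

With only one object in the repository, the single line printed is that blob's full name followed by its type. An empty result for a given prefix means no object shares it.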


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 21:06               ` Junio C Hamano
@ 2015-05-20 21:13                 ` Stefan Beller
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2015-05-20 21:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

On Wed, May 20, 2015 at 2:06 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Wed, May 20, 2015 at 1:39 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> So...
>>
>> maybe we need a command:
>>
>> Given this SHA1, tell me anything you know about it,
>> Is it a {blob,tree,commit,tag}?
>> Is it referenced from anywhere else in this repository and if so, which type?
>> And if it is not referenced, nor an object, tell me so explicitly.
>
> Let me add another to that list ;-)
>
>   I have this prefix; please enumerate all known objects that share it.
>
> I do not know the value of the first two in your list.  If it is a
> known object, then you throw it at "git show", "git cat-file -t" and
> dig from there.  If it is not known, there is nothing more to do.

Right, I just tried to think of all the questions which are relevant to answer
in such a case, so probably this can be outside of

>
> I do not know if "need" is the right word, but I hope that you
> realize the last two among the four you listed need the equivalent
> of "git fsck".  It is an expensive operation.

Yes, I do realize that. The way I interpreted Faheem's original message was:

    git fsck tells me everything is alright, but I don't trust fsck. So now
    I want to find a way to ask Git about everything it knows about this
    $SHA1 and print it for me so I can manually look at each entry and
    verify by hand.

So that's why I included the parts easily done with cat-file and show.


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 20:39           ` Junio C Hamano
  2015-05-20 20:57             ` Stefan Beller
@ 2015-05-20 21:55             ` Jeff King
  1 sibling, 0 replies; 19+ messages in thread
From: Jeff King @ 2015-05-20 21:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Johannes Schindelin, Matthieu Moy, Faheem Mitha,
	git@vger.kernel.org

On Wed, May 20, 2015 at 01:39:36PM -0700, Junio C Hamano wrote:

> Yeah, "bad object" sounds as if we tried to parse something that
> exists and it was corrupt.  So classifying "a file or a pack index
> entry exists where a valid object with that name should reside in"
> as "bad object" and "there is no such file or a pack index entry
> that would house the named object" as "missing object" _might_ make
> things better.
> 
> But let's think about it a bit more.  Would it have prevented the
> original confusion if we said "missing object"?  I have a feeling
> that it wouldn't have.  Faheem was so convinced that the object
> named with the 40-hex *must* exist in the cloned repository, and if
> we told "missing object" to such a person, it will just reinforce the
> (mis)conception that the repository is somehow corrupt, when it is
> not.
> 
> So...

I dunno. If it were phrased not as "missing object" but as "there is no
such object in the repository", then it seems more clear to me that the
error is in the request, not in the repository (and hopefully the user
would examine their assumption that it should be).

But "bad object" is just a horrible error message. It actively implies
corruption. And I think if we do have corruption, then parse_object()
already reports it. For example:

  # helpers
  objfile() {
    printf '.git/objects/%s' $(echo $1 | sed 's,..,&/,')
  }
  blob=$(echo content | git hash-object -w --stdin)

  # object with a sha1 mismatch
  mismatch=1234567890123456789012345678901234567890
  mkdir .git/objects/12
  cp $(objfile $blob) $(objfile $mismatch)

  # plain old missing object
  missing=1234abcdef1234abcdef1234abcdef1234abcdef

  # object with data corruption
  corrupt=$blob
  chmod +w $(objfile $corrupt)
  dd if=/dev/zero of=$(objfile $corrupt) bs=1 count=1 conv=notrunc seek=10

  # now show each
  for bad in corrupt mismatch missing; do
    echo "==> $bad"
    git --no-pager show $(eval "echo \$$bad")
  done

produces:

  ==> corrupt
  error: inflate: data stream error (invalid distance too far back)
  error: unable to unpack d95f3ad14dee633a758d2e331151e950dd13e4ed header
  error: inflate: data stream error (invalid distance too far back)
  fatal: loose object d95f3ad14dee633a758d2e331151e950dd13e4ed (stored in .git/objects/d9/5f3ad14dee633a758d2e331151e950dd13e4ed) is corrupt
  ==> mismatch
  error: sha1 mismatch 1234567890123456789012345678901234567890
  fatal: bad object 1234567890123456789012345678901234567890
  ==> missing
  fatal: bad object 1234abcdef1234abcdef1234abcdef1234abcdef

Note that the "missing" case is the only one that _doesn't_ give further
clarification, and it is likely to be the most common (however just
changing "bad object" to "no such object" would be a bad idea, as it
makes the second case harder to understand).

-Peff


* Re: [PUB]corrupt repos does not return error with `git fsck`
  2015-05-20 17:58   ` Faheem Mitha
@ 2015-05-21  8:09     ` Matthieu Moy
  0 siblings, 0 replies; 19+ messages in thread
From: Matthieu Moy @ 2015-05-21  8:09 UTC (permalink / raw)
  To: Faheem Mitha; +Cc: git

Faheem Mitha <faheem@faheem.info> writes:

> I was going by the answer (by CodeWizard) in
> http://stackoverflow.com/q/30348615/350713

OK, so the hash you got comes from a superproject which references it.
My guess is that the superproject did a private commit in a submodule,
added this submodule to the superproject, and forgot to push the
submodule.

If so, it's a user error (that could arguably have been avoided with a
better command-line interface, so Git is partly guilty), but not a
repository corruption.
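
A simplified sketch of that situation, under stated assumptions: two throwaway repositories stand in for the superproject and for the repository the submodule is cloned from, the path name SICL is only a placeholder, and the gitlink is recorded directly in the index rather than through a real "git submodule add" and commit. The variable names (super, sub, pinned, recorded, status) are made up for illustration.

```shell
#!/bin/sh
# Throwaway stand-ins for the superproject and the submodule's repository.
super=$(mktemp -d)
sub=$(mktemp -d)
git init -q "$super"
git init -q "$sub"

# The commit name the superproject pins; it was never pushed, so the
# submodule's repository does not contain it.
pinned=280c12ab49223c64c6f914944287a7d049cf4dd0

# Record a gitlink entry (mode 160000) for the placeholder path in the
# superproject's index. Git does not require the commit to exist here.
git -C "$super" update-index --add --cacheinfo 160000,"$pinned",SICL

# Read back which commit the superproject expects the submodule to be at.
recorded=$(git -C "$super" ls-files -s SICL | awk '{print $2}')

# Ask the submodule's repository whether it actually has that commit.
if git -C "$sub" cat-file -e "$recorded" 2>/dev/null; then
  status=present
else
  status=absent
fi

echo "superproject records $recorded; submodule repository has it: $status"
```

Both repositories are intact in this sketch; the superproject simply refers to a commit that the other repository never received.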

> If I just give a random hash to `git show` in that repos, I get
>
>     fatal: ambiguous argument '...': unknown revision or path not in the working tree.

Not "a random hash", but a random abbreviated hash. Look:

Changing the last digit:

$ git show 280c12ab49223c64c6f914944287a7d049cf4d23
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d23
$ git show 280c12ab49223c64c6f914944287a7d049cf4d24
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d24
$ git show 280c12ab49223c64c6f914944287a7d049cf4d25
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d25
$ git show 280c12ab49223c64c6f914944287a7d049cf4d26
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d26

Removing the last digit:

$ git show 280c12ab49223c64c6f914944287a7d049cf4d2 
fatal: ambiguous argument '280c12ab49223c64c6f914944287a7d049cf4d2': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
