* [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
@ 2007-07-22 20:20 Johannes Schindelin
2007-07-22 21:42 ` Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2007-07-22 20:20 UTC
To: git, gitster
When looking for a lost blob, it is much nicer to be able to grep
through .git/lost-found/other/* than to write an inefficient loop
over the file names. So write the contents of the dangling blobs,
not their object names.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
While working on filter-branch, I inadvertently said
"git reset --hard" without having committed first. That's
when I was almost happy to have "git fsck --lost-found".
But when grepping through the "other" found objects, nothing
turned up... because there were only SHA-1s.
 Documentation/git-fsck.txt |    6 ++++--
 builtin-fsck.c             |   12 +++++++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 1a432f2..45c0bee 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -65,8 +65,10 @@ index file and all SHA1 references in .git/refs/* as heads.
 	Be chatty.
 
 --lost-found::
-	Write dangling refs into .git/lost-found/commit/ or
-	.git/lost-found/other/, depending on type.
+	Write dangling objects into .git/lost-found/commit/ or
+	.git/lost-found/other/, depending on type.  If the object is
+	a blob, the contents are written into the file, rather than
+	its object name.
 
 It tests SHA1 and general object sanity, and it does full tracking of
 the resulting reachability and everything else. It prints out any
diff --git a/builtin-fsck.c b/builtin-fsck.c
index 350ec5e..8d12287 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -152,7 +152,17 @@ static void check_unreachable_object(struct object *obj)
 			}
 			if (!(f = fopen(filename, "w")))
 				die("Could not open %s", filename);
-			fprintf(f, "%s\n", sha1_to_hex(obj->sha1));
+			if (obj->type == OBJ_BLOB) {
+				enum object_type type;
+				unsigned long size;
+				char *buf = read_sha1_file(obj->sha1,
+						&type, &size);
+				if (buf) {
+					fwrite(buf, size, 1, f);
+					free(buf);
+				}
+			} else
+				fprintf(f, "%s\n", sha1_to_hex(obj->sha1));
 			fclose(f);
 		}
 		return;
--
1.5.3.rc2.32.g35c5b
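The branch added by the hunk above can be illustrated outside of git with a minimal, self-contained sketch. Everything named here (`enum sketch_type`, `write_lost_found_entry`, and its parameters) is hypothetical and stands in for git's own types and for the buffer returned by `read_sha1_file()`. Unlike the patch, the sketch also checks the result of the write, and it counts bytes rather than items so that an empty blob is not reported as a failure; that is a choice made for this example, not something the patch does.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for git's object types; only the blob versus
 * non-blob distinction matters for this sketch. */
enum sketch_type { SKETCH_BLOB, SKETCH_OTHER };

/*
 * Write one lost-found entry to f: the raw contents for a blob, the
 * hex object name followed by a newline for anything else.
 * Returns 0 on success, -1 if the write failed or was short.
 */
static int write_lost_found_entry(FILE *f, enum sketch_type type,
				  const void *buf, size_t size,
				  const char *hex)
{
	if (type == SKETCH_BLOB) {
		/* Count bytes, not items, so size == 0 is not an error. */
		if (size && fwrite(buf, 1, size, f) != size)
			return -1;
		return 0;
	}
	return fprintf(f, "%s\n", hex) < 0 ? -1 : 0;
}
```

A caller would open the file under lost-found/other/ or lost-found/commit/, pass the object's contents and hex name, and treat a -1 return as a reason to report the entry as incomplete.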
* Re: [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
From: Junio C Hamano @ 2007-07-22 21:42 UTC
To: Johannes Schindelin; +Cc: git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> When looking for a lost blob, it is much nicer to be able to grep
> through .git/lost-found/other/* than to write an inefficient loop
> over the file names. So write the contents of the dangling blobs,
> not their object names.
I think this is an idea to solve a good problem, but if we go
this route, the need for us to worry about expiring lost-found
entries would become more urgent, I suspect.
And when you think about expiring lost-found entries, another
possible solution emerges. If we teach git-prune to remove the
corresponding entry from lost-found/other when it removes a
loose blob from the object store, then we can easily and safely
do this instead:
	$ cat .git/lost-found/other/* |
	  xargs -r git grep 'the word to look for'
* Re: [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
From: Johannes Schindelin @ 2007-07-22 21:52 UTC
To: Junio C Hamano; +Cc: git
Hi,
On Sun, 22 Jul 2007, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > When looking for a lost blob, it is much nicer to be able to grep
> > through .git/lost-found/other/* than to write an inefficient loop
> > over the file names. So write the contents of the dangling blobs,
> > not their object names.
>
> I think this is an idea to solve a good problem, but if we go
> this route, the need for us to worry about expiring lost-found
> entries would become more urgent, I suspect.
Why? AFAICT lost+found/ has to be cleaned by people. So if you look for
something, you say "git fsck --lost-found", and once you found it, it's
time for "rm -rf .git/lost-found".
> And when you think about expiring lost-found entries, another
> possible solution emerges. If we teach git-prune to remove the
> corresponding entry from lost-found/other when it removes a
> loose blob from the object store, then we can easily and safely
> do this instead:
>
> $ cat .git/lost-found/other/* |
> xargs -r git grep 'the word to look for'
Well, it is not only for grepping. In my case, I could get away with this:

	$ ls -lrt $(grep -l filter_subdir .git/lost-found/other/* |
		sed "s/^.*other\/\(..\)/.git\/objects\/\1\//")

IOW I found the loose dangling objects which matched a keyword, and sorted
them by time.
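The sed expression above rewrites each lost-found file name into the path of the matching loose object: the first two hex characters of the object name become a directory under .git/objects/, and the rest becomes the file name. A minimal sketch of the same mapping in C follows; `loose_object_path` is a hypothetical helper written for this illustration, not a function from git, and it assumes a 40-character SHA-1 hex name and a repository rooted at `.git`.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the loose-object path for a 40-character hex object name:
 * the first two characters name the directory, the remaining 38 the
 * file. Returns 0 on success, -1 if the name has the wrong length or
 * the output buffer is too small.
 */
static int loose_object_path(const char *hex, char *out, size_t outlen)
{
	int n;

	if (strlen(hex) != 40)
		return -1;
	n = snprintf(out, outlen, ".git/objects/%.2s/%s", hex, hex + 2);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With the resulting path, stat() would give the size and modification time that `ls -lrt` displays in the pipeline above.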
In other cases, though, I wanted to see the size.
But what the whole thing boils down to: After finding dangling objects,
you are much more likely using git tools on non-blobs than on blobs, and
vice versa.
Ciao,
Dscho
P.S.: I fully forgot to mention that happily, I did "git add -u" sometime
before "git reset --hard", otherwise I would have been lost.
* Re: [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
From: Junio C Hamano @ 2007-07-22 23:00 UTC
To: Johannes Schindelin; +Cc: git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> But what the whole thing boils down to: After finding dangling objects,
> you are much more likely using git tools on non-blobs than on blobs, and
> vice versa.
Ok, color me converted.