* [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
@ 2007-07-22 20:20 Johannes Schindelin
  2007-07-22 21:42 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2007-07-22 20:20 UTC
  To: git, gitster


When looking for a lost blob, it is much nicer to be able to grep
through .git/lost-found/other/* than to write an inefficient loop
over the file names.  So write the contents of the dangling blobs,
not their object names.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	While working on filter-branch, I inadvertently said
	"git reset --hard" without having committed first.  That's
	when I was almost happy to have "git fsck --lost-found".
	But when grepping through the "other" found objects, nothing
	turned up... because there were only SHA-1s.

 Documentation/git-fsck.txt |    6 ++++--
 builtin-fsck.c             |   12 +++++++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 1a432f2..45c0bee 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -65,8 +65,10 @@ index file and all SHA1 references in .git/refs/* as heads.
 	Be chatty.
 
 --lost-found::
-	Write dangling refs into .git/lost-found/commit/ or
-	.git/lost-found/other/, depending on type.
+	Write dangling objects into .git/lost-found/commit/ or
+	.git/lost-found/other/, depending on type.  If the object is
+	a blob, the contents are written into the file, rather than
+	its object name.
 
 It tests SHA1 and general object sanity, and it does full tracking of
 the resulting reachability and everything else. It prints out any
diff --git a/builtin-fsck.c b/builtin-fsck.c
index 350ec5e..8d12287 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -152,7 +152,17 @@ static void check_unreachable_object(struct object *obj)
 			}
 			if (!(f = fopen(filename, "w")))
 				die("Could not open %s", filename);
-			fprintf(f, "%s\n", sha1_to_hex(obj->sha1));
+			if (obj->type == OBJ_BLOB) {
+				enum object_type type;
+				unsigned long size;
+				char *buf = read_sha1_file(obj->sha1,
+						&type, &size);
+				if (buf) {
+					fwrite(buf, size, 1, f);
+					free(buf);
+				}
+			} else
+				fprintf(f, "%s\n", sha1_to_hex(obj->sha1));
 			fclose(f);
 		}
 		return;
-- 
1.5.3.rc2.32.g35c5b
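
With the patch applied, the blob files under .git/lost-found/other/ hold the blobs' contents, so the search described in the commit message reduces to a plain grep. A minimal sketch, assuming the default lost-found location; the helper name `grep_lost_blobs` is hypothetical and not part of git:

```shell
# Hypothetical helper: list recovered files whose contents match a pattern.
#   $1 = pattern (basic regular expression)
#   $2 = directory to search (defaults to .git/lost-found/other)
grep_lost_blobs() {
    pattern=$1
    dir=${2:-.git/lost-found/other}
    # -l prints only the names of matching files;
    # -- keeps a pattern that starts with '-' from being read as an option.
    grep -l -- "$pattern" "$dir"/* 2>/dev/null
}
```

Typical use would be to run `git fsck --lost-found` first and then call `grep_lost_blobs filter_subdir`. Files for non-blob objects in that directory still contain only an object name, so they will match only if the pattern happens to match the hex string.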

* Re: [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
  2007-07-22 20:20 [PATCH] fsck --lost-found: write blob's contents, not their SHA-1 Johannes Schindelin
@ 2007-07-22 21:42 ` Junio C Hamano
  2007-07-22 21:52   ` Johannes Schindelin
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2007-07-22 21:42 UTC
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> When looking for a lost blob, it is much nicer to be able to grep
> through .git/lost-found/other/* than to write an inefficient loop
> over the file names.  So write the contents of the dangling blobs,
> not their object names.

I think this is an idea to solve a good problem, but if we go
this route, the need for us to worry about expiring lost-found
entries would become more urgent, I suspect.

And when you think about expiring lost-found entries, another
possible solution emerges.  If we teach git-prune to remove the
corresponding entry from lost-found/other when it removes a
loose blob from the object store, then we can easily and safely
do this instead:

	$ cat .git/lost-found/other/* |
	  xargs -r git grep 'the word to look for'

* Re: [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
  2007-07-22 21:42 ` Junio C Hamano
@ 2007-07-22 21:52   ` Johannes Schindelin
  2007-07-22 23:00     ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2007-07-22 21:52 UTC
  To: Junio C Hamano; +Cc: git

Hi,

On Sun, 22 Jul 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > When looking for a lost blob, it is much nicer to be able to grep
> > through .git/lost-found/other/* than to write an inefficient loop
> > over the file names.  So write the contents of the dangling blobs,
> > not their object names.
> 
> I think this is an idea to solve a good problem, but if we go
> this route, the need for us to worry about expiring lost-found
> entries would become more urgent, I suspect.

Why?  AFAICT lost+found/ has to be cleaned by people.  So if you look for 
something, you say "git fsck --lost-found", and once you found it, it's 
time for "rm -rf .git/lost-found".

> And when you think about expiring lost-found entries, another
> possible solution emerges.  If we teach git-prune to remove the
> corresponding entry from lost-found/other when it removes a
> loose blob from the object store, then we can easily and safely
> do this instead:
> 
> 	$ cat .git/lost-found/other/* |
> 	  xargs -r git grep 'the word to look for'

Well, it is not only for grepping.  In my case, I could get away with this:

$ ls -lrt $(grep -l filter_subdir .git/lost-found/other/* |
	sed "s/^.*other\/\(..\)/.git\/objects\/\1\//") 

IOW I found the loose dangling objects which matched a keyword, and sorted 
them by time.

In other cases, though, I wanted to see the size.

But what the whole thing boils down to: After finding dangling objects, 
you are much more likely to use git tools on non-blobs than on blobs, and 
vice versa.

Ciao,
Dscho

P.S.: I fully forgot to mention that happily, I did "git add -u" sometime 
before "git reset --hard", otherwise I would have been lost.
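
The sed expression in the message above turns a lost-found file name back into the path of the corresponding loose object, which is what lets `ls -lrt` sort the matches by modification time. A minimal sketch of that mapping as a reusable function, assuming the object is still loose (not packed) and stored under the usual .git/objects/<first two hex digits>/<remaining digits> layout; the function name is hypothetical:

```shell
# Hypothetical helper: map a lost-found file name to its loose-object path.
#   $1 = path such as .git/lost-found/other/<40 hex digits>
lost_found_to_loose_path() {
    # Capture the first two hex digits after "other/" and use them as the
    # fan-out directory; the remaining digits become the file name.
    printf '%s\n' "$1" | sed 's|^.*/other/\(..\)|.git/objects/\1/|'
}
```

The function only rewrites the string; it does not check that the resulting file exists.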

* Re: [PATCH] fsck --lost-found: write blob's contents, not their SHA-1
  2007-07-22 21:52   ` Johannes Schindelin
@ 2007-07-22 23:00     ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2007-07-22 23:00 UTC
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> But what the whole thing boils down to: After finding dangling objects, 
> you are much more likely to use git tools on non-blobs than on blobs, and 
> vice versa.

Ok, color me converted.
