git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] fsck improvements
@ 2011-11-04 15:47 Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 1/4] fsck: return error code when verify_pack() goes wrong Nguyễn Thái Ngọc Duy
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-11-04 15:47 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

So I looked at fsck to see if it's easy to support multithread and
found out I could reduce fsck time by half. So here it is. I think I
haven't made any mistakes but fsck is not my domain.

The last patch is reposted because it conflicts with the series. It
also prints progress when checking connnectivity.

Nguyễn Thái Ngọc Duy (4):
  fsck: return error code when verify_pack() goes wrong
  Stop verify_packfile() as soon as an error occurs
  fsck: avoid reading every object twice
  fsck: print progress

 Documentation/git-fsck.txt |   12 ++++++-
 builtin/fsck.c             |   78 +++++++++++++++++++++++++++++++++++---------
 pack-check.c               |   27 +++++++++++++--
 pack.h                     |    6 +++-
 4 files changed, 101 insertions(+), 22 deletions(-)

-- 
1.7.4.74.g639db

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] fsck: return error code when verify_pack() goes wrong
  2011-11-04 15:47 [PATCH 0/4] fsck improvements Nguyễn Thái Ngọc Duy
@ 2011-11-04 15:47 ` Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 2/4] Stop verify_packfile() as soon as an error occurs Nguyễn Thái Ngọc Duy
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-11-04 15:47 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/fsck.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index df1a88b..4ead98d 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -29,6 +29,7 @@ static int write_lost_and_found;
 static int verbose;
 #define ERROR_OBJECT 01
 #define ERROR_REACHABLE 02
+#define ERROR_PACK 04
 
 #ifdef NO_D_INO_IN_DIRENT
 #define SORT_DIRENT 0
@@ -626,7 +627,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		prepare_packed_git();
 		for (p = packed_git; p; p = p->next)
 			/* verify gives error messages itself */
-			verify_pack(p);
+			if (verify_pack(p))
+				errors_found |= ERROR_PACK;
 
 		for (p = packed_git; p; p = p->next) {
 			uint32_t j, num;
-- 
1.7.4.74.g639db

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] Stop verify_packfile() as soon as an error occurs
  2011-11-04 15:47 [PATCH 0/4] fsck improvements Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 1/4] fsck: return error code when verify_pack() goes wrong Nguyễn Thái Ngọc Duy
@ 2011-11-04 15:47 ` Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 3/4] fsck: avoid reading every object twice Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-11-04 15:47 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 pack-check.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/pack-check.c b/pack-check.c
index 0c19b6e..e33ea79 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -79,6 +79,8 @@ static int verify_packfile(struct packed_git *p,
 		err = error("%s SHA1 does not match its index",
 			    p->pack_name);
 	unuse_pack(w_curs);
+	if (err)
+		return err;
 
 	/* Make sure everything reachable from idx is valid.  Since we
 	 * have verified that nr_objects matches between idx and pack,
@@ -106,11 +108,13 @@ static int verify_packfile(struct packed_git *p,
 			off_t offset = entries[i].offset;
 			off_t len = entries[i+1].offset - offset;
 			unsigned int nr = entries[i].nr;
-			if (check_pack_crc(p, w_curs, offset, len, nr))
+			if (check_pack_crc(p, w_curs, offset, len, nr)) {
 				err = error("index CRC mismatch for object %s "
 					    "from %s at offset %"PRIuMAX"",
 					    sha1_to_hex(entries[i].sha1),
 					    p->pack_name, (uintmax_t)offset);
+				break;
+			}
 		}
 		data = unpack_entry(p, entries[i].offset, &type, &size);
 		if (!data) {
-- 
1.7.4.74.g639db

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] fsck: avoid reading every object twice
  2011-11-04 15:47 [PATCH 0/4] fsck improvements Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 1/4] fsck: return error code when verify_pack() goes wrong Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 2/4] Stop verify_packfile() as soon as an error occurs Nguyễn Thái Ngọc Duy
@ 2011-11-04 15:47 ` Nguyễn Thái Ngọc Duy
  2011-11-04 15:47 ` [PATCH 4/4] fsck: print progress Nguyễn Thái Ngọc Duy
  2011-11-04 18:43 ` [PATCH 0/4] fsck improvements Junio C Hamano
  4 siblings, 0 replies; 10+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-11-04 15:47 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

During verify_pack() all objects are read for SHA-1 check. Then
fsck_sha1() is called on every object, which read the object again
(fsck_sha1 -> parse_object -> read_sha1_file).

Avoid reading an object twice, do fsck_sha1 while we have an object
uncompressed data in verify_pack.

On git.git, with this patch I got:

$ /usr/bin/time ./git fsck >/dev/null
98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k
0inputs+0outputs (0major+194186minor)pagefaults 0swaps

Without it:

$ /usr/bin/time ./git fsck >/dev/null
231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k
0inputs+0outputs (0major+461629minor)pagefaults 0swaps

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/fsck.c |   42 +++++++++++++++++++++++++-----------------
 pack-check.c   |   13 ++++++++++---
 pack.h         |    5 ++++-
 3 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 4ead98d..0603f64 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -282,14 +282,8 @@ static void check_connectivity(void)
 	}
 }
 
-static int fsck_sha1(const unsigned char *sha1)
+static int fsck_obj(struct object *obj)
 {
-	struct object *obj = parse_object(sha1);
-	if (!obj) {
-		errors_found |= ERROR_OBJECT;
-		return error("%s: object corrupt or missing",
-			     sha1_to_hex(sha1));
-	}
 	if (obj->flags & SEEN)
 		return 0;
 	obj->flags |= SEEN;
@@ -332,6 +326,29 @@ static int fsck_sha1(const unsigned char *sha1)
 	return 0;
 }
 
+static int fsck_sha1(const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	if (!obj) {
+		errors_found |= ERROR_OBJECT;
+		return error("%s: object corrupt or missing",
+			     sha1_to_hex(sha1));
+	}
+	return fsck_obj(obj);
+}
+
+static int fsck_obj_buffer(const unsigned char *sha1, enum object_type type,
+			   unsigned long size, void *buffer, int *eaten)
+{
+	struct object *obj;
+	obj = parse_object_buffer(sha1, type, size, buffer, eaten);
+	if (!obj) {
+		errors_found |= ERROR_OBJECT;
+		return error("%s: object corrupt or missing", sha1_to_hex(sha1));
+	}
+	return fsck_obj(obj);
+}
+
 /*
  * This is the sorting chunk size: make it reasonably
  * big so that we can sort well..
@@ -627,17 +644,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		prepare_packed_git();
 		for (p = packed_git; p; p = p->next)
 			/* verify gives error messages itself */
-			if (verify_pack(p))
+			if (verify_pack(p, fsck_obj_buffer))
 				errors_found |= ERROR_PACK;
-
-		for (p = packed_git; p; p = p->next) {
-			uint32_t j, num;
-			if (open_pack_index(p))
-				continue;
-			num = p->num_objects;
-			for (j = 0; j < num; j++)
-				fsck_sha1(nth_packed_object_sha1(p, j));
-		}
 	}
 
 	heads = 0;
diff --git a/pack-check.c b/pack-check.c
index e33ea79..372d6b2 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -42,7 +42,8 @@ int check_pack_crc(struct packed_git *p, struct pack_window **w_curs,
 }
 
 static int verify_packfile(struct packed_git *p,
-		struct pack_window **w_curs)
+			   struct pack_window **w_curs,
+			   verify_fn fn)
 {
 	off_t index_size = p->index_size;
 	const unsigned char *index_base = p->index_data;
@@ -129,6 +130,12 @@ static int verify_packfile(struct packed_git *p,
 			free(data);
 			break;
 		}
+		if (fn) {
+			int eaten = 0;
+			fn(entries[i].sha1, type, size, data, &eaten);
+			if (eaten)
+				data = NULL;
+		}
 		free(data);
 	}
 	free(entries);
@@ -159,7 +166,7 @@ int verify_pack_index(struct packed_git *p)
 	return err;
 }
 
-int verify_pack(struct packed_git *p)
+int verify_pack(struct packed_git *p, verify_fn fn)
 {
 	int err = 0;
 	struct pack_window *w_curs = NULL;
@@ -168,7 +175,7 @@ int verify_pack(struct packed_git *p)
 	if (!p->index_data)
 		return -1;
 
-	err |= verify_packfile(p, &w_curs);
+	err |= verify_packfile(p, &w_curs, fn);
 	unuse_pack(&w_curs);
 
 	return err;
diff --git a/pack.h b/pack.h
index 722a54e..70f3c29 100644
--- a/pack.h
+++ b/pack.h
@@ -70,10 +70,13 @@ struct pack_idx_entry {
 	off_t offset;
 };
 
+
+typedef int (*verify_fn)(const unsigned char*, enum object_type, unsigned long, void*, int*);
+
 extern const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, unsigned char *sha1);
 extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 extern int verify_pack_index(struct packed_git *);
-extern int verify_pack(struct packed_git *);
+extern int verify_pack(struct packed_git *, verify_fn fn);
 extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
 extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned char *);
-- 
1.7.4.74.g639db

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] fsck: print progress
  2011-11-04 15:47 [PATCH 0/4] fsck improvements Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2011-11-04 15:47 ` [PATCH 3/4] fsck: avoid reading every object twice Nguyễn Thái Ngọc Duy
@ 2011-11-04 15:47 ` Nguyễn Thái Ngọc Duy
  2011-11-04 20:14   ` Jeff King
  2011-11-04 18:43 ` [PATCH 0/4] fsck improvements Junio C Hamano
  4 siblings, 1 reply; 10+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-11-04 15:47 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

fsck is usually a long process and it would be nice if it prints
progress from time to time.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-fsck.txt |   12 +++++++++++-
 builtin/fsck.c             |   40 ++++++++++++++++++++++++++++++++++++++--
 pack-check.c               |   14 +++++++++++---
 pack.h                     |    3 ++-
 4 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index a2a508d..5245101 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found] [<object>*]
+	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]progress] [<object>*]
 
 DESCRIPTION
 -----------
@@ -72,6 +73,15 @@ index file, all SHA1 references in .git/refs/*, and all reflogs (unless
 	a blob, the contents are written into the file, rather than
 	its object name.
 
+--progress::
+--no-progress::
+	When fsck is run in a terminal, it will show the progress.
+	These options can force progress to be shown or not
+	regardless terminal check.
++
+Progress is not shown when --verbose is used. --progress is ignored
+in this case.
+
 It tests SHA1 and general object sanity, and it does full tracking of
 the resulting reachability and everything else. It prints out any
 corruption it finds (missing or bad objects), and if you use the
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0603f64..c4b1ca6 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -11,6 +11,7 @@
 #include "fsck.h"
 #include "parse-options.h"
 #include "dir.h"
+#include "progress.h"
 
 #define REACHABLE 0x0001
 #define SEEN      0x0002
@@ -27,6 +28,7 @@ static const char *head_points_at;
 static int errors_found;
 static int write_lost_and_found;
 static int verbose;
+static int show_progress = -1;
 #define ERROR_OBJECT 01
 #define ERROR_REACHABLE 02
 #define ERROR_PACK 04
@@ -138,7 +140,11 @@ static int traverse_one_object(struct object *obj)
 
 static int traverse_reachable(void)
 {
+	struct progress *progress = NULL;
+	unsigned int nr = 0;
 	int result = 0;
+	if (show_progress)
+		progress = start_progress_delay("Checking connectivity", 0, 0, 2);
 	while (pending.nr) {
 		struct object_array_entry *entry;
 		struct object *obj;
@@ -146,7 +152,9 @@ static int traverse_reachable(void)
 		entry = pending.objects + --pending.nr;
 		obj = entry->item;
 		result |= traverse_one_object(obj);
+		display_progress(progress, ++nr);
 	}
+	stop_progress(&progress);
 	return !!result;
 }
 
@@ -530,15 +538,20 @@ static void get_default_heads(void)
 static void fsck_object_dir(const char *path)
 {
 	int i;
+	struct progress *progress = NULL;
 
 	if (verbose)
 		fprintf(stderr, "Checking object directory\n");
 
+	if (show_progress)
+		progress = start_progress("Checking object directories", 256);
 	for (i = 0; i < 256; i++) {
 		static char dir[4096];
 		sprintf(dir, "%s/%02x", path, i);
 		fsck_dir(i, dir);
+		display_progress(progress, i+1);
 	}
+	stop_progress(&progress);
 	fsck_sha1_list();
 }
 
@@ -609,6 +622,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOLEAN(0, "strict", &check_strict, "enable more strict checking"),
 	OPT_BOOLEAN(0, "lost-found", &write_lost_and_found,
 				"write dangling objects in .git/lost-found"),
+	OPT_BOOL   (0, "progress", &show_progress, "show progress"),
 	OPT_END(),
 };
 
@@ -621,6 +635,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	read_replace_refs = 0;
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
+
+	if (show_progress == -1)
+		show_progress = isatty(2);
+	if (verbose)
+		show_progress = 0;
+
 	if (write_lost_and_found) {
 		check_full = 1;
 		include_reflogs = 0;
@@ -640,12 +660,28 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	if (check_full) {
 		struct packed_git *p;
+		uint32_t total = 0, count = 0;
+		struct progress *progress = NULL;
 
 		prepare_packed_git();
-		for (p = packed_git; p; p = p->next)
+
+		if (show_progress) {
+			for (p = packed_git; p; p = p->next) {
+				if (open_pack_index(p))
+					continue;
+				total += p->num_objects;
+			}
+
+			progress = start_progress("Checking objects", total);
+		}
+		for (p = packed_git; p; p = p->next) {
 			/* verify gives error messages itself */
-			if (verify_pack(p, fsck_obj_buffer))
+			if (verify_pack(p, fsck_obj_buffer,
+					progress, count))
 				errors_found |= ERROR_PACK;
+			count += p->num_objects;
+		}
+		stop_progress(&progress);
 	}
 
 	heads = 0;
diff --git a/pack-check.c b/pack-check.c
index 372d6b2..a3262af 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack.h"
 #include "pack-revindex.h"
+#include "progress.h"
 
 struct idx_entry {
 	off_t                offset;
@@ -43,7 +44,9 @@ int check_pack_crc(struct packed_git *p, struct pack_window **w_curs,
 
 static int verify_packfile(struct packed_git *p,
 			   struct pack_window **w_curs,
-			   verify_fn fn)
+			   verify_fn fn,
+			   struct progress *progress, uint32_t base_count)
+
 {
 	off_t index_size = p->index_size;
 	const unsigned char *index_base = p->index_data;
@@ -136,8 +139,12 @@ static int verify_packfile(struct packed_git *p,
 			if (eaten)
 				data = NULL;
 		}
+		if (((base_count + i) & 1023) == 0)
+			display_progress(progress, base_count + i);
 		free(data);
+
 	}
+	display_progress(progress, base_count + i);
 	free(entries);
 
 	return err;
@@ -166,7 +173,8 @@ int verify_pack_index(struct packed_git *p)
 	return err;
 }
 
-int verify_pack(struct packed_git *p, verify_fn fn)
+int verify_pack(struct packed_git *p, verify_fn fn,
+		struct progress *progress, uint32_t base_count)
 {
 	int err = 0;
 	struct pack_window *w_curs = NULL;
@@ -175,7 +183,7 @@ int verify_pack(struct packed_git *p, verify_fn fn)
 	if (!p->index_data)
 		return -1;
 
-	err |= verify_packfile(p, &w_curs, fn);
+	err |= verify_packfile(p, &w_curs, fn, progress, base_count);
 	unuse_pack(&w_curs);
 
 	return err;
diff --git a/pack.h b/pack.h
index 70f3c29..324a1d7 100644
--- a/pack.h
+++ b/pack.h
@@ -71,12 +71,13 @@ struct pack_idx_entry {
 };
 
 
+struct progress;
 typedef int (*verify_fn)(const unsigned char*, enum object_type, unsigned long, void*, int*);
 
 extern const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, unsigned char *sha1);
 extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 extern int verify_pack_index(struct packed_git *);
-extern int verify_pack(struct packed_git *, verify_fn fn);
+extern int verify_pack(struct packed_git *, verify_fn fn, struct progress *, uint32_t);
 extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
 extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned char *);
-- 
1.7.4.74.g639db

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] fsck improvements
  2011-11-04 15:47 [PATCH 0/4] fsck improvements Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2011-11-04 15:47 ` [PATCH 4/4] fsck: print progress Nguyễn Thái Ngọc Duy
@ 2011-11-04 18:43 ` Junio C Hamano
  2011-11-05  3:18   ` Nguyen Thai Ngoc Duy
  4 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2011-11-04 18:43 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

I think patches 1, 3 and 4 all are good ideas from a cursory look.

I am not sure what purpose patch 2 serves, though. When we find a checksum
mismatch for an object in a packstream due to a single-bit error, we would
still be able to salvage other objects in other parts of the pack as long
as we have a good .idx file, and in such a case, wouldn't it be better if
we attempted to find as many corrupt objects that we know we cannot
recover as possible and tell the user about them, so that they can be
skipped during the recovery process?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] fsck: print progress
  2011-11-04 15:47 ` [PATCH 4/4] fsck: print progress Nguyễn Thái Ngọc Duy
@ 2011-11-04 20:14   ` Jeff King
  2011-11-05  3:26     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2011-11-04 20:14 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

On Fri, Nov 04, 2011 at 10:47:50PM +0700, Nguyen Thai Ngoc Duy wrote:

>  static int traverse_reachable(void)
>  {
> +	struct progress *progress = NULL;
> +	unsigned int nr = 0;
>  	int result = 0;
> +	if (show_progress)
> +		progress = start_progress_delay("Checking connectivity", 0, 0, 2);
>  	while (pending.nr) {
>  		struct object_array_entry *entry;
>  		struct object *obj;
> @@ -146,7 +152,9 @@ static int traverse_reachable(void)
>  		entry = pending.objects + --pending.nr;
>  		obj = entry->item;
>  		result |= traverse_one_object(obj);
> +		display_progress(progress, ++nr);
>  	}
> +	stop_progress(&progress);
>  	return !!result;
>  }

Thanks, this is a good place to put a progress meter, too. If you're
feeling like pushing this topic further, "git prune" might be a good
place for a progress meter, too. It does a similar traversal[1] for
reachability, and makes "git gc" appear to hang at the end (we have nice
progress meters for packing, but it takes something like 25 seconds to
run "git prune" at the end, during which we are silent).

-Peff

[1] I wonder why fsck doesn't use mark_reachable from reachable.c.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] fsck improvements
  2011-11-04 18:43 ` [PATCH 0/4] fsck improvements Junio C Hamano
@ 2011-11-05  3:18   ` Nguyen Thai Ngoc Duy
  2011-11-05  5:26     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-11-05  3:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

2011/11/5 Junio C Hamano <gitster@pobox.com>:
> I am not sure what purpose patch 2 serves, though. When we find a checksum
> mismatch for an object in a packstream due to a single-bit error, we would
> still be able to salvage other objects in other parts of the pack as long
> as we have a good .idx file, and in such a case, wouldn't it be better if
> we attempted to find as many corrupt objects that we know we cannot
> recover as possible and tell the user about them, so that they can be
> skipped during the recovery process?

It's the inconsistency in that for(;;) loop. If we are going to
salvage as many objects as we could, should we do "continue;" instead
of "break;" when unpack_entry() or check_sha1_signature() fails?
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] fsck: print progress
  2011-11-04 20:14   ` Jeff King
@ 2011-11-05  3:26     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-11-05  3:26 UTC (permalink / raw)
  To: Jeff King; +Cc: git

2011/11/5 Jeff King <peff@peff.net>:
> Thanks, this is a good place to put a progress meter, too. If you're
> feeling like pushing this topic further, "git prune" might be a good
> place for a progress meter, too. It does a similar traversal[1] for
> reachability, and makes "git gc" appear to hang at the end (we have nice
> progress meters for packing, but it takes something like 25 seconds to
> run "git prune" at the end, during which we are silent).
>
> -Peff
>
> [1] I wonder why fsck doesn't use mark_reachable from reachable.c.

Thanks. I'll have a look.
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] fsck improvements
  2011-11-05  3:18   ` Nguyen Thai Ngoc Duy
@ 2011-11-05  5:26     ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2011-11-05  5:26 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> 2011/11/5 Junio C Hamano <gitster@pobox.com>:
>> I am not sure what purpose patch 2 serves, though. When we find a checksum
>> mismatch for an object in a packstream due to a single-bit error, we would
>> still be able to salvage other objects in other parts of the pack as long
>> as we have a good .idx file, and in such a case, wouldn't it be better if
>> we attempted to find as many corrupt objects that we know we cannot
>> recover as possible and tell the user about them, so that they can be
>> skipped during the recovery process?
>
> It's the inconsistency in that for(;;) loop. If we are going to
> salvage as many objects as we could, should we do "continue;" instead
> of "break;" when unpack_entry() or check_sha1_signature() fails?

I do not think making things worse for the sake of "consistency" is a good
sell ;-). How hard would it be to make it also continue when these report
a corruption?

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-11-05  5:26 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-04 15:47 [PATCH 0/4] fsck improvements Nguyễn Thái Ngọc Duy
2011-11-04 15:47 ` [PATCH 1/4] fsck: return error code when verify_pack() goes wrong Nguyễn Thái Ngọc Duy
2011-11-04 15:47 ` [PATCH 2/4] Stop verify_packfile() as soon as an error occurs Nguyễn Thái Ngọc Duy
2011-11-04 15:47 ` [PATCH 3/4] fsck: avoid reading every object twice Nguyễn Thái Ngọc Duy
2011-11-04 15:47 ` [PATCH 4/4] fsck: print progress Nguyễn Thái Ngọc Duy
2011-11-04 20:14   ` Jeff King
2011-11-05  3:26     ` Nguyen Thai Ngoc Duy
2011-11-04 18:43 ` [PATCH 0/4] fsck improvements Junio C Hamano
2011-11-05  3:18   ` Nguyen Thai Ngoc Duy
2011-11-05  5:26     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).