Git development
* Re: bug: git-repack -a -d produces broken pack on NFS
From: Linus Torvalds @ 2006-04-27 22:18 UTC (permalink / raw)
  To: Alex Riesen, Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0604271500500.3701@g5.osdl.org>



On Thu, 27 Apr 2006, Linus Torvalds wrote:
> 
> I wonder if the _pack-file_ itself might be ok, and the problem is an 
> index file corruption.

Hmm. verify_pack() actually checks that the index file matches its own 
SHA1 earlier, so the index file will have passed (my suggested patch is 
still correct: the same way we check the index file internal integrity 
first, we should also check the pack-file internal integrity before we 
bother to cross-check them with each other).

Anyway, the index file SHA1 check means that it's unlikely that the index 
file was corrupt. But it would be interesting to hear if the pack-file was 
internally consistent or not.. (Something that git-pack-check didn't check 
in your case, because it checked the pack-file against the index file data 
first).

		Linus
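
As a rough illustration of the ordering argued for above (this is not the
pack-check.c code): assuming the caller has already computed the SHA-1 of
the pack contents minus the 20-byte trailer, the pack can be checked against
its own trailer first and against the copy stored in the idx second, so the
result says which file is suspect. The name check_pack_trailers and the enum
are hypothetical; the offsets (last 20 bytes of the pack, and the 20 bytes
starting 40 bytes before the end of the idx) are the ones verify_packfile()
uses.

```c
#include <string.h>

#define HASH_LEN 20 /* size of a SHA-1 digest in bytes */

/* Hypothetical result codes, not part of git. */
enum pack_status {
	PACK_OK = 0,
	PACK_TOO_SHORT,     /* a file is too small to hold its trailer */
	PACK_SELF_MISMATCH, /* pack data disagrees with the pack's own trailer */
	PACK_IDX_MISMATCH   /* pack is self-consistent, idx copy disagrees */
};

/*
 * "computed" is the SHA-1 of the first pack_size - 20 bytes of the pack,
 * calculated by the caller.  The pack is judged on its own first; the
 * idx is only consulted once the pack has passed.
 */
enum pack_status check_pack_trailers(const unsigned char *computed,
				     const unsigned char *pack_base,
				     unsigned long pack_size,
				     const unsigned char *index_base,
				     unsigned long index_size)
{
	if (pack_size < HASH_LEN || index_size < 2 * HASH_LEN)
		return PACK_TOO_SHORT;

	/* 1. internal integrity: the pack against its own trailing SHA-1 */
	if (memcmp(computed, pack_base + pack_size - HASH_LEN, HASH_LEN))
		return PACK_SELF_MISMATCH;

	/* 2. cross-check: the pack SHA-1 recorded near the end of the idx */
	if (memcmp(computed, index_base + index_size - 2 * HASH_LEN, HASH_LEN))
		return PACK_IDX_MISMATCH;

	return PACK_OK;
}
```

With this ordering, PACK_IDX_MISMATCH means the pack data hashed correctly
and only the idx disagrees, which is the case where the data may still be
recoverable.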


* Re: bug: git-repack -a -d produces broken pack on NFS
From: Junio C Hamano @ 2006-04-27 22:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604271500500.3701@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> That said, the pack-file should all be written with the "sha1write()" 
> interface, which is very careful indeed.
>
> I wonder if the _pack-file_ itself might be ok, and the problem is an 
> index file corruption. For some reason we check the index file first, 
> which is insane. We should check that the pack-file matches its _own_ SHA1 
> first, and check the index file second.

We need to check both, so I fail to see why the order matters.

> If it's just the index file that is corrupt, you may even have a chance to 
> recover the data.
>
> The index file is also written with sha1write(), though, so I really don't 
> see where it would break. Unless you just simply literally have data 
> corruption on the server for some strange reason.

I haven't seen this, and was wondering why.

Independently, and probably unrelated: another person
reported a failure while cloning, but the log suggested it had
trouble spawning the git-index-pack executable for some reason.


* Re: bug: git-repack -a -d produces broken pack on NFS
From: Linus Torvalds @ 2006-04-27 22:11 UTC (permalink / raw)
  To: Alex Riesen, Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <20060427213207.GA6709@steel.home>



On Thu, 27 Apr 2006, Alex Riesen wrote:
>
> NFS server: 2.6.15
> Client: 2.6.17-rc2
> mount options: tigra:/home /net/home nfs rw,nosuid,nodev,noatime,vers=3,rsize=8192,wsize=32768,hard,intr,proto=udp,timeo=7,retrans=3,addr=tigra 0 0

It's repeatable? Can you check if it goes away if you remove "intr"?

That said, the pack-file should all be written with the "sha1write()" 
interface, which is very careful indeed.

I wonder if the _pack-file_ itself might be ok, and the problem is an 
index file corruption. For some reason we check the index file first, 
which is insane. We should check that the pack-file matches its _own_ SHA1 
first, and check the index file second.

So the appended patch would be sensible: before we bother to look at the 
index file at all, we should check the pack-file itself.

If it's just the index file that is corrupt, you may even have a chance to 
recover the data.

The index file is also written with sha1write(), though, so I really don't 
see where it would break. Unless you just simply literally have data 
corruption on the server for some strange reason.

		Linus
---
diff --git a/pack-check.c b/pack-check.c
index 84ed90d..e575879 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -29,12 +29,12 @@ static int verify_packfile(struct packed
 	pack_base = p->pack_base;
 	SHA1_Update(&ctx, pack_base, pack_size - 20);
 	SHA1_Final(sha1, &ctx);
-	if (memcmp(sha1, index_base + index_size - 40, 20))
-		return error("Packfile %s SHA1 mismatch with idx",
-			     p->pack_name);
 	if (memcmp(sha1, pack_base + pack_size - 20, 20))
 		return error("Packfile %s SHA1 mismatch with itself",
 			     p->pack_name);
+	if (memcmp(sha1, index_base + index_size - 40, 20))
+		return error("Packfile %s SHA1 mismatch with idx",
+			     p->pack_name);
 
 	/* Make sure everything reachable from idx is valid.  Since we
 	 * have verified that nr_objects matches between idx and pack,


* Re: [PATCH] C version of git-count-objects
From: Junio C Hamano @ 2006-04-27 22:07 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: git
In-Reply-To: <20060427205155.GA26856@brainysmurf.cs.umu.se>

Peter Hagervall <hager@cs.umu.se> writes:

> Thanks, I'll make a third stab at it tomorrow, if anyone is interested
> that is?

How about something like this instead?

-- >8 --

 Makefile        |    2 -
 builtin-count.c |  124 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 builtin.h       |    1 
 git.c           |    1 
 4 files changed, 127 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 8ce27a6..14193aa 100644
--- a/Makefile
+++ b/Makefile
@@ -214,7 +214,7 @@ LIB_OBJS = \
 	$(DIFF_OBJS)
 
 BUILTIN_OBJS = \
-	builtin-log.o builtin-help.o
+	builtin-log.o builtin-help.o builtin-count.o
 
 GITLIBS = $(LIB_FILE) $(XDIFF_LIB)
 LIBS = $(GITLIBS) -lz
diff --git a/builtin-count.c b/builtin-count.c
new file mode 100644
index 0000000..cbbb7dd
--- /dev/null
+++ b/builtin-count.c
@@ -0,0 +1,124 @@
+/*
+ * Builtin "git count-objects".
+ *
+ * Copyright (c) 2006 Junio C Hamano
+ */
+
+#include "cache.h"
+
+static const char count_objects_usage[] = "git-count-objects [-v]";
+
+static void count_objects(DIR *d, char *path, int len, int verbose, 
+			  unsigned long *loose,
+			  unsigned long *loose_size,
+			  unsigned long *packed_loose,
+			  unsigned long *garbage)
+{
+	struct dirent *ent;
+	while ((ent = readdir(d)) != NULL) {
+		char hex[41];
+		unsigned char sha1[20];
+		const char *cp;
+		int bad = 0;
+
+		if ((ent->d_name[0] == '.') &&
+		    (ent->d_name[1] == 0 ||
+		     ((ent->d_name[1] == '.') && (ent->d_name[2] == 0))))
+			continue;
+		for (cp = ent->d_name; *cp; cp++) {
+			int ch = *cp;
+			if (('0' <= ch && ch <= '9') ||
+			    ('a' <= ch && ch <= 'f'))
+				continue;
+			bad = 1;
+			break;
+		}
+		if (cp - ent->d_name != 38)
+			bad = 1;
+		else {
+			struct stat st;
+			memcpy(path + len + 3, ent->d_name, 38);
+			path[len + 2] = '/';
+			path[len + 41] = 0;
+			if (stat(path, &st)) {
+				die("OOPS <%s>", path);
+				bad = 1;
+			}
+			else
+				(*loose_size) += st.st_blocks;
+		}
+		if (bad) {
+			if (verbose) {
+				error("garbage found: %.*s/%s",
+				      len + 2, path, ent->d_name);
+				(*garbage)++;
+			}
+			continue;
+		}
+		(*loose)++;
+		if (!verbose)
+			continue;
+		memcpy(hex, path+len, 2);
+		memcpy(hex+2, ent->d_name, 38);
+		hex[40] = 0;
+		if (get_sha1_hex(hex, sha1))
+			die("internal error");
+		if (has_sha1_pack(sha1))
+			(*packed_loose)++;
+	}
+}
+
+int cmd_count_objects(int ac, const char **av, char *ep)
+{
+	int i;
+	int verbose = 0;
+	const char *objdir = get_object_directory();
+	int len = strlen(objdir);
+	char *path = xmalloc(len + 50);
+	unsigned long loose = 0, packed = 0, packed_loose = 0, garbage = 0;
+	unsigned long loose_size = 0;
+
+	for (i = 1; i < ac; i++) {
+		const char *arg = av[i];
+		if (*arg != '-')
+			break;
+		else if (!strcmp(arg, "-v"))
+			verbose = 1;
+		else
+			usage(count_objects_usage);
+	}
+
+	/* we do not take arguments other than flags for now */
+	if (i < ac)
+		usage(count_objects_usage);
+	memcpy(path, objdir, len);
+	if (len && objdir[len-1] != '/')
+		path[len++] = '/';
+	for (i = 0; i < 256; i++) {
+		DIR *d;
+		sprintf(path + len, "%02x", i);
+		d = opendir(path);
+		if (!d)
+			continue;
+		count_objects(d, path, len, verbose,
+			      &loose, &loose_size, &packed_loose, &garbage);
+		closedir(d);
+	}
+	if (verbose) {
+		struct packed_git *p;
+		for (p = packed_git; p; p = p->next) {
+			if (!p->pack_local)
+				continue;
+			packed += num_packed_objects(p);
+		}
+		printf("count: %lu\n", loose);
+		printf("size: %lu\n", loose_size / 2);
+		printf("in-pack: %lu\n", packed);
+		printf("prune-packable: %lu\n", packed_loose);
+		printf("garbage: %lu\n", garbage);
+	}
+	else
+		printf("%lu objects, %lu kilobytes\n",
+		       loose, loose_size / 2);
+	return 0;
+}
diff --git a/builtin.h b/builtin.h
index 47408a0..76169e3 100644
--- a/builtin.h
+++ b/builtin.h
@@ -19,5 +19,6 @@ extern int cmd_version(int argc, const c
 extern int cmd_whatchanged(int argc, const char **argv, char **envp);
 extern int cmd_show(int argc, const char **argv, char **envp);
 extern int cmd_log(int argc, const char **argv, char **envp);
+extern int cmd_count_objects(int argc, const char **argv, char **envp);
 
 #endif
diff --git a/git.c b/git.c
index 01b7e28..00fb399 100644
--- a/git.c
+++ b/git.c
@@ -46,6 +46,7 @@ static void handle_internal_command(int 
 		{ "log", cmd_log },
 		{ "whatchanged", cmd_whatchanged },
 		{ "show", cmd_show },
+		{ "count-objects", cmd_count_objects },
 	};
 	int i;
 


* bug: git-repack -a -d produces broken pack on NFS
From: Alex Riesen @ 2006-04-27 21:32 UTC (permalink / raw)
  To: git

NFS server: 2.6.15
Client: 2.6.17-rc2
mount options: tigra:/home /net/home nfs rw,nosuid,nodev,noatime,vers=3,rsize=8192,wsize=32768,hard,intr,proto=udp,timeo=7,retrans=3,addr=tigra 0 0

Repack transcript ($SRC is /net/home/src):

$SRC/linux.git$ git repack -a -d
Generating pack...
Done counting 235947 objects.
Deltifying 235947 objects.
 100% (235947/235947) done
Writing 235947 objects.
 100% (235947/235947) done
Total 235947, written 235947 (delta 182131), reused 235466 (delta 181650)
Pack pack-6dcda5a7782864d57ec44bd30ebec13b07df2c87 created.
$SRC/linux.git$ git fsck-objects --full
git-fsck-objects: error: Packfile .git/objects/pack/pack-6dcda5a7782864d57ec44bd30ebec13b07df2c87.pack SHA1 mismatch with idx
git-fsck-objects: fatal: delta data unpack failed
$SRC/linux.git$ git fsck-objects --full
git-fsck-objects: error: Packfile .git/objects/pack/pack-6dcda5a7782864d57ec44bd30ebec13b07df2c87.pack SHA1 mismatch with idx
git-fsck-objects: fatal: delta data unpack failed
$SRC/linux.git$ du -sh .
124M    .
$SRC/linux.git$ cp . -a $BIG/linux.git
$SRC/linux.git$ cd $BIG/linux.git
$BIG/linux.git$ git fsck-objects --full
git-fsck-objects: error: Packfile .git/objects/pack/pack-6dcda5a7782864d57ec44bd30ebec13b07df2c87.pack SHA1 mismatch with idx
git-fsck-objects: fatal: delta data unpack failed
$BIG/linux.git$ git clone -n . ../tmp
Generating pack...
Done counting 235947 objects.
Deltifying 235947 objects.
 100% (235947/235947) done
Total 235947, written 235947 (delta 182131), reused 235947 (delta 182131)

git-index-pack: fatal: packfile '/mnt/large/tmp/raa/tmp/.git/objects/pack/tmp-wcRvk5': bad object at offset 102601801: inflate returned -3
git-fetch-pack: error: git-fetch-pack: unable to read from git-index-pack
git-fetch-pack: error: git-index-pack died with error code 128
fetch-pack from '/mnt/large/tmp/raa/linux.git/.git' failed.

So the repository is now _really_ broken.
I didn't notice when it started to happen, sorry.


* Re: [PATCH] C version of git-count-objects
From: Peter Hagervall @ 2006-04-27 20:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0604271257010.3701@g5.osdl.org>

On Thu, Apr 27, 2006 at 01:07:27PM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 27 Apr 2006, Peter Hagervall wrote:
> > > 
> > > To avoid appending the filename to the path before each lstat() I'd 
> > > guess.
> > 
> > Yes, that's pretty much the reason.
> 
> It's a bad reason, though.
> 
> For one thing, it just doesn't work. You'll have to chdir() back, and you 
> can't use ".." in case the user has set up some symlink thing. So you end 
> up doing other really strange things.
> 
> You can do this much more efficiently with something like this:
> 
> 	const char *obj = get_object_directory();
> 	int len = strlen(obj);

<snip>

> 				continue;
> 			strcpy(prefix + len, de->d_name);
> 			fd = open(prefix, O_RDONLY);
> 			.. check if it's ok, perhaps.. ?
> 			if (ok)
> 				nr++;
> 			close(fd);
> 		}
> 		return nr;
> 	}
> 
> and you're done. Efficient, and it's easy to add the ending to the 
> pathname, because you're passing in a buffer that is big enough, and 
> you're telling people where they should put their suffixes..

Thanks, I'll make a third stab at it tomorrow, if anyone is interested
that is?

	Peter


* Re: [PATCH] C version of git-count-objects
From: Linus Torvalds @ 2006-04-27 20:07 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: Nicolas Pitre, Junio C Hamano, git
In-Reply-To: <20060427194559.GA26386@brainysmurf.cs.umu.se>



On Thu, 27 Apr 2006, Peter Hagervall wrote:
> > 
> > To avoid appending the filename to the path before each lstat() I'd 
> > guess.
> 
> Yes, that's pretty much the reason.

It's a bad reason, though.

For one thing, it just doesn't work. You'll have to chdir() back, and you 
can't use ".." in case the user has set up some symlink thing. So you end 
up doing other really strange things.

You can do this much more efficiently with something like this:

	const char *obj = get_object_directory();
	int len = strlen(obj);
	char *dir = malloc(len + 300);

	memcpy(dir, obj, len);
	if (len && obj[len-1] != '/')
		dir[len++] = '/';
	dir[len+2] = 0;
	for (i = 0; i < 16; i++) {
		dir[len] = hexdigit[i];
		for (j = 0; j < 16; j++) {
			dir[len+1] = hexdigit[j];
			dir[len+2] = 0;
			DIR *d = opendir(dir);
			if (!d)
				continue;
			nr += count(d, dir, len+2);
			closedir(d);
		}
	}

where the "count()" function just ends up doing something like

	int count(DIR *d, char *prefix, int len)
	{
		int nr = 0;
		struct dirent *de;

		prefix[len++] = '/';
		while ((de = readdir(d)) != NULL) {
			int fd;
			if (de->d_name[0] == '.')
				continue;
			strcpy(prefix + len, de->d_name);
			fd = open(prefix, O_RDONLY);
			.. check if it's ok, perhaps.. ?
			if (ok)
				nr++;
			close(fd);
		}
		return nr;
	}

and you're done. Efficient, and it's easy to add the ending to the 
pathname, because you're passing in a buffer that is big enough, and 
you're telling people where they should put their suffixes..

And no, the above has never been compiled or tested, and I wrote it with 
one eye closed, while drinking heavily and experimenting with some funky 
'shrooms. So caveat emptor.

		Linus


* Re: [PATCH] C version of git-count-objects
From: Peter Hagervall @ 2006-04-27 19:46 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0604271535460.18816@localhost.localdomain>

On Thu, Apr 27, 2006 at 03:39:14PM -0400, Nicolas Pitre wrote:
> On Thu, 27 Apr 2006, Junio C Hamano wrote:
> 
> > Nicolas Pitre <nico@cam.org> writes:
> > 
> > > On Thu, 27 Apr 2006, Peter Hagervall wrote:
> > >
> > >> Answering the call Linus made[1], sort of, but for a completely
> > >> different program.
> > >> 
> > >> Anyway, it ought to be at least as portable as the shell script, and a
> > >> whole lot faster, however much that matters.
> > >> 
> > > [...]
> > >> +	for (i = 0; i < 16; i++) {
> > >> +		subdir[0] = hex_digits[i];
> > >> +		for (j = 0; j < 16; j++) {
> > >> +			subdir[1] = hex_digits[j];
> > >> +			if (access(subdir, R_OK | X_OK))
> > >> +				continue;
> > >> +			chdir(subdir);
> > >> +			if (!(dp = opendir("."))) {
> > >> +				error("can't open subdir %s", subdir);
> > >> +				continue;
> > >> +			}
> > >
> > > Looks like you're missing a chdir(".."); there.
> > 
> > Why would you even _need_ to chdir() anywhere, anyway?
> 
> To avoid appending the filename to the path before each lstat() I'd 
> guess.

Yes, that's pretty much the reason.

	Peter


* Re: [PATCH] C version of git-count-objects
From: Nicolas Pitre @ 2006-04-27 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Peter Hagervall, git
In-Reply-To: <7vhd4ekfu1.fsf@assigned-by-dhcp.cox.net>

On Thu, 27 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Thu, 27 Apr 2006, Peter Hagervall wrote:
> >
> >> Answering the call Linus made[1], sort of, but for a completely
> >> different program.
> >> 
> >> Anyway, it ought to be at least as portable as the shell script, and a
> >> whole lot faster, however much that matters.
> >> 
> > [...]
> >> +	for (i = 0; i < 16; i++) {
> >> +		subdir[0] = hex_digits[i];
> >> +		for (j = 0; j < 16; j++) {
> >> +			subdir[1] = hex_digits[j];
> >> +			if (access(subdir, R_OK | X_OK))
> >> +				continue;
> >> +			chdir(subdir);
> >> +			if (!(dp = opendir("."))) {
> >> +				error("can't open subdir %s", subdir);
> >> +				continue;
> >> +			}
> >
> > Looks like you're missing a chdir(".."); there.
> 
> Why would you even _need_ to chdir() anywhere, anyway?

To avoid appending the filename to the path before each lstat() I'd 
guess.


Nicolas


* Re: [PATCH] C version of git-count-objects
From: Junio C Hamano @ 2006-04-27 18:56 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: git, Nicolas Pitre
In-Reply-To: <Pine.LNX.4.64.0604270914570.18816@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> On Thu, 27 Apr 2006, Peter Hagervall wrote:
>
>> Answering the call Linus made[1], sort of, but for a completely
>> different program.
>> 
>> Anyway, it ought to be at least as portable as the shell script, and a
>> whole lot faster, however much that matters.
>> 
> [...]
>> +	for (i = 0; i < 16; i++) {
>> +		subdir[0] = hex_digits[i];
>> +		for (j = 0; j < 16; j++) {
>> +			subdir[1] = hex_digits[j];
>> +			if (access(subdir, R_OK | X_OK))
>> +				continue;
>> +			chdir(subdir);
>> +			if (!(dp = opendir("."))) {
>> +				error("can't open subdir %s", subdir);
>> +				continue;
>> +			}
>
> Looks like you're missing a chdir(".."); there.

Why would you even _need_ to chdir() anywhere, anyway?


* Re: how to trace the patch?
From: Aubrey @ 2006-04-27 16:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: sean, git
In-Reply-To: <Pine.LNX.4.64.0604270843190.3701@g5.osdl.org>

Thanks a lot.
I'll enjoy it.

Regards,
-Aubrey

On 4/27/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Thu, 27 Apr 2006, sean wrote:
> >
> > $ git log -- <filename>
> >
> > To see a list of commits that affected the file you're interested in.
> >
> > $ git log -p -- <filename>
> >
> > Will include a diff after each commit showing you how the file was
> > changed.  And if you want to see what other changes happened in each
> > commit that modified your file, add "--full-diff" to the command above.
>
> Side note: the "git log -p" thing only works with git 1.3.0+, and even
> without the "-p", old versions will be very slow.
>
> So if you have anything older than 1.3.0, you're likely better off using
> "git whatchanged [-p] -- <filename>".
>
> Also, regardless of which one you use, it's worth pointing out that:
>
>  - for tracking multiple files, just use more than one filename, and you
>   can also use a directory name.
>
>  - you can combine this with all the normal revision limiting rules, which
>   is often useful when you know you're not interested in stuff you've
>   already seen.
>
> For example, if you have just done a "git pull" and you noticed that a
> file (or set of files) you cared about changed - or you just wonder _if_
> it changed - you can do something like
>
>        gitk ORIG_HEAD.. -- drivers/scsi/ include/scsi/
>
> to see what changed due to the pull within those files. Useful whether
> you're tracking certain subsystems, individual drivers, architectures,
> whatever.. It can be useful also just to split the logs up (ie maybe
> you're not interested in anything in particular, but you do a "git log"
> and see something that strikes your fancy, you can decide to see what
> _else_ changed in that area).
>
> And instead of "gitk", use "git log -p" or "git whatchanged" or whatever.
> It's all the same thing, just different ways of looking at it.
>
>                Linus
>


* Fix "git help -a" terminal autosizing
From: Linus Torvalds @ 2006-04-27 16:02 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


When I split out the builtin commands into their own files, I left the 
include of <sys/ioctl.h> in git.c rather than moving it to the file that 
needed it (builtin-help.c).

Nobody seems to have noticed, because everything still worked, but because 
the TIOCGWINSZ macro was now no longer defined when compiling the 
"term_columns()" function, it would no longer automatically notice the 
terminal size unless your system used the ancient "COLUMNS" environment 
variable approach.

Trivially fixed by just moving the header include to the file that 
actually needs it.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
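For readers wondering how a missing header can fail silently: below is a
minimal sketch of a column-detection helper, not the builtin-help.c source.
It assumes the ioctl() path is wrapped in "#ifdef TIOCGWINSZ", which would
match the symptom described above: without <sys/ioctl.h> the macro is
undefined, the block compiles away, and only the COLUMNS and default paths
remain. The name term_columns_sketch, the order of the checks, and the
80-column default are assumptions.

```c
#include <stdlib.h>
#include <sys/ioctl.h> /* TIOCGWINSZ and struct winsize, where available */
#include <unistd.h>

/* Hypothetical helper: best-effort width of the terminal on stdout. */
int term_columns_sketch(void)
{
	const char *col = getenv("COLUMNS");
	int n = col ? atoi(col) : 0;

	/* An explicit, positive COLUMNS setting is used as-is. */
	if (n > 0)
		return n;

#ifdef TIOCGWINSZ
	{
		struct winsize ws;

		/* Ask the terminal driver; this block vanishes if the
		 * header that defines TIOCGWINSZ was not included. */
		if (!ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) && ws.ws_col)
			return ws.ws_col;
	}
#endif
	return 80; /* assumed fallback width */
}
```
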
diff --git a/builtin-help.c b/builtin-help.c
index 10a59cc..7470faa 100644
--- a/builtin-help.c
+++ b/builtin-help.c
@@ -3,6 +3,7 @@
  *
  * Builtin help-related commands (help, usage, version)
  */
+#include <sys/ioctl.h>
 #include "cache.h"
 #include "builtin.h"
 #include "exec_cmd.h"
diff --git a/git.c b/git.c
index aa2b814..01b7e28 100644
--- a/git.c
+++ b/git.c
@@ -8,7 +8,6 @@ #include <string.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdarg.h>
-#include <sys/ioctl.h>
 #include "git-compat-util.h"
 #include "exec_cmd.h"
 


* Re: how to trace the patch?
From: Linus Torvalds @ 2006-04-27 15:55 UTC (permalink / raw)
  To: sean; +Cc: Aubrey, git
In-Reply-To: <BAYC1-PASMTP029B6CB13A6C0BA3956E17AEBD0@CEZ.ICE>



On Thu, 27 Apr 2006, sean wrote:
> 
> $ git log -- <filename>
> 
> To see a list of commits that affected the file you're interested in.
> 
> $ git log -p -- <filename>
> 
> Will include a diff after each commit showing you how the file was
> changed.  And if you want to see what other changes happened in each
> commit that modified your file, add "--full-diff" to the command above.

Side note: the "git log -p" thing only works with git 1.3.0+, and even 
without the "-p", old versions will be very slow.

So if you have anything older than 1.3.0, you're likely better off using 
"git whatchanged [-p] -- <filename>".

Also, regardless of which one you use, it's worth pointing out that:

 - for tracking multiple files, just use more than one filename, and you 
   can also use a directory name. 

 - you can combine this with all the normal revision limiting rules, which 
   is often useful when you know you're not interested in stuff you've 
   already seen.

For example, if you have just done a "git pull" and you noticed that a 
file (or set of files) you cared about changed - or you just wonder _if_ 
it changed - you can do something like

	gitk ORIG_HEAD.. -- drivers/scsi/ include/scsi/

to see what changed due to the pull within those files. Useful whether 
you're tracking certain subsystems, individual drivers, architectures, 
whatever.. It can be useful also just to split the logs up (ie maybe 
you're not interested in anything in particular, but you do a "git log" 
and see something that strikes your fancy, you can decide to see what 
_else_ changed in that area).

And instead of "gitk", use "git log -p" or "git whatchanged" or whatever. 
It's all the same thing, just different ways of looking at it.

		Linus


* [PATCH] C version of git-count-objects, second try
From: Peter Hagervall @ 2006-04-27 14:07 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, junkio, mwelinder
In-Reply-To: <Pine.LNX.4.64.0604270914570.18816@localhost.localdomain>

On Thu, Apr 27, 2006 at 09:23:47AM -0400, Nicolas Pitre wrote:
> On Thu, 27 Apr 2006, Peter Hagervall wrote:
> 
> > Answering the call Linus made[1], sort of, but for a completely
> > different program.
> > 
> > Anyway, it ought to be at least as portable as the shell script, and a
> > whole lot faster, however much that matters.
> > 
> [...]
> > +	for (i = 0; i < 16; i++) {
> > +		subdir[0] = hex_digits[i];
> > +		for (j = 0; j < 16; j++) {
> > +			subdir[1] = hex_digits[j];
> > +			if (access(subdir, R_OK | X_OK))
> > +				continue;
> > +			chdir(subdir);
> > +			if (!(dp = opendir("."))) {
> > +				error("can't open subdir %s", subdir);
> > +				continue;
> > +			}
> 
> Looks like you're missing a chdir(".."); there.
> 

Thanks, I overlooked that one (and the race condition pointed out by
Morten). Anyway, fixed those now, and removed an unused array.

Signed-off-by: Peter Hagervall <hager@cs.umu.se>

---

 Makefile             |    5 ++-
 count-objects.c      |   55 +++++++++++++++++++++++++++++++++++++
 git-count-objects.sh |   31 --------------------
 3 files changed, 58 insertions(+), 33 deletions(-)


diff --git a/Makefile b/Makefile
index 8ce27a6..53e7591 100644
--- a/Makefile
+++ b/Makefile
@@ -115,7 +115,7 @@ ### --- END CONFIGURATION SECTION ---
 SCRIPT_SH = \
 	git-add.sh git-bisect.sh git-branch.sh git-checkout.sh \
 	git-cherry.sh git-clean.sh git-clone.sh git-commit.sh \
-	git-count-objects.sh git-diff.sh git-fetch.sh \
+	git-diff.sh git-fetch.sh \
 	git-format-patch.sh git-ls-remote.sh \
 	git-merge-one-file.sh git-parse-remote.sh \
 	git-prune.sh git-pull.sh git-push.sh git-rebase.sh \
@@ -165,7 +165,8 @@ PROGRAMS = \
 	git-upload-pack$X git-verify-pack$X git-write-tree$X \
 	git-update-ref$X git-symbolic-ref$X git-check-ref-format$X \
 	git-name-rev$X git-pack-redundant$X git-repo-config$X git-var$X \
-	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X
+	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X \
+	git-count-objects$X
 
 BUILT_INS = git-log$X
 
diff --git a/count-objects.c b/count-objects.c
new file mode 100644
index 0000000..beaa4d9
--- /dev/null
+++ b/count-objects.c
@@ -0,0 +1,55 @@
+#include "cache.h"
+#include "git-compat-util.h"
+
+static int numobjects, numblocks;
+static const char hex_digits[] = "0123456789abcdef";
+
+void count_objects(void)
+{
+	char subdir[3];
+	int i, j;
+	struct stat statbuf;
+	struct dirent *dirp;
+	DIR *dp;
+	subdir[2] = '\0';
+	for (i = 0; i < 16; i++) {
+		subdir[0] = hex_digits[i];
+		for (j = 0; j < 16; j++) {
+			subdir[1] = hex_digits[j];
+			if (chdir(subdir))
+				continue;
+			if (!(dp = opendir("."))) {
+				error("can't open subdir %s", subdir);
+				chdir("..");
+				continue;
+			}
+			while ((dirp = readdir(dp))) {
+				if (!strcmp(dirp->d_name, ".") ||
+					!strcmp(dirp->d_name, ".."))
+					continue;
+				if (lstat(dirp->d_name, &statbuf)) {
+					error("can't stat file %s", dirp->d_name);
+					continue;
+				}
+				numblocks += statbuf.st_blocks;
+				numobjects++;
+			}
+			closedir(dp);
+			chdir("..");
+		}
+	}
+}
+
+int main(int argc, char **argv)
+{
+	setup_git_directory();
+
+	if (chdir(".git/objects"))
+		die("%s", strerror(errno));
+
+	count_objects();
+
+	printf("%d objects, %d kilobytes\n", numobjects, numblocks / 2);
+
+	return 0;
+}
diff --git a/git-count-objects.sh b/git-count-objects.sh
deleted file mode 100755
index 40c58ef..0000000
--- a/git-count-objects.sh
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-
-GIT_DIR=`git-rev-parse --git-dir` || exit $?
-
-dc </dev/null 2>/dev/null || {
-	# This is not a real DC at all -- it just knows how
-	# this script feeds DC and does the computation itself.
-	dc () {
-		while read a b
-		do
-			case $a,$b in
-			0,)	acc=0 ;;
-			*,+)	acc=$(($acc + $a)) ;;
-			p,)	echo "$acc" ;;
-			esac
-		done
-	}
-}
-
-echo $(find "$GIT_DIR/objects"/?? -type f -print 2>/dev/null | wc -l) objects, \
-$({
-    echo 0
-    # "no-such" is to help Darwin folks by not using xargs -r.
-    find "$GIT_DIR/objects"/?? -type f -print 2>/dev/null |
-    xargs du -k "$GIT_DIR/objects/no-such" 2>/dev/null |
-    sed -e 's/[ 	].*/ +/'
-    echo p
-} | dc) kilobytes


* Re: Two gitweb feature requests
From: David Woodhouse @ 2006-04-27 13:47 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git
In-Reply-To: <E1FZ6eM-0000qC-HH@moooo.ath.cx>

On Thu, 2006-04-27 at 15:35 +0200, Matthias Lederhofer wrote:
> An easy way to do this is to put the git repository on the webserver
> and tell the webserver to redirect to gitweb if the directory is
> accessed directly, not a file in the git directory.

That's true, but isn't it much better to use git:// instead of the 'dumb'
http:// method?

-- 
dwmw2


* [PATCH] change ent to tree in git-diff documentation
From: Matthias Lederhofer @ 2006-04-27 13:38 UTC (permalink / raw)
  To: Git Mailing List

---
This is quite confusing for someone new to git who is not familiar
with the vocabulary used with git. I don't think a man page is the
right place for riddles :)
Additionally, I changed 'is' to 'are' in two places; I hope this is correct.

 Documentation/git-diff.txt |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

83ace740fdf1e064168f8438a6ad863986cf4832
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 890931c..41d85cf 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -8,24 +8,24 @@ git-diff - Show changes between commits,
 
 SYNOPSIS
 --------
-'git-diff' [ --diff-options ] <ent>{0,2} [<path>...]
+'git-diff' [ --diff-options ] <tree-ish>{0,2} [<path>...]
 
 DESCRIPTION
 -----------
-Show changes between two ents, an ent and the working tree, an
-ent and the index file, or the index file and the working tree.
+Show changes between two trees, a tree and the working tree, a
+tree and the index file, or the index file and the working tree.
 The combination of what is compared with what is determined by
-the number of ents given to the command.
+the number of trees given to the command.
 
-* When no <ent> is given, the working tree and the index
-  file is compared, using `git-diff-files`.
+* When no <tree-ish> is given, the working tree and the index
+  file are compared, using `git-diff-files`.
 
-* When one <ent> is given, the working tree and the named
-  tree is compared, using `git-diff-index`.  The option
+* When one <tree-ish> is given, the working tree and the named
+  tree are compared, using `git-diff-index`.  The option
   `--cached` can be given to compare the index file and
   the named tree.
 
-* When two <ent>s are given, these two trees are compared
+* When two <tree-ish>s are given, these two trees are compared
   using `git-diff-tree`.
 
 OPTIONS
-- 
1.3.1.g9af0


* Re: Two gitweb feature requests
From: Matthias Lederhofer @ 2006-04-27 13:35 UTC (permalink / raw)
  To: David Woodhouse; +Cc: git
In-Reply-To: <1146144425.11909.450.camel@pmac.infradead.org>

> First... When publishing trees, I currently give both the git:// URL for
> people who want to pull the tree, and the http:// URL to gitweb for
> those who just want to browse.
> 
> It would be useful if I could get away with giving just one URL --
> probably the http:// one to gitweb. If gitweb were to have a mode in
> which it gave a referral to the git:// URL, and if the git tools would
> use that, then that would work well.

An easy way to do this is to put the git repository on the webserver
and tell the webserver to redirect to gitweb if the directory is
accessed directly, not a file in the git directory.


* Two gitweb feature requests
From: David Woodhouse @ 2006-04-27 13:27 UTC (permalink / raw)
  To: Kay Sievers; +Cc: git

First... When publishing trees, I currently give both the git:// URL for
people who want to pull the tree, and the http:// URL to gitweb for
those who just want to browse.

It would be useful if I could get away with giving just one URL --
probably the http:// one to gitweb. If gitweb were to have a mode in
which it gave a referral to the git:// URL, and if the git tools would
use that, then that would work well.

Secondly, it would be useful if gitweb would list the branches in a
repository and allow each of them to be viewed in the same way as it
does the master branch.

-- 
dwmw2

^ permalink raw reply

* Re: [PATCH] C version of git-count-objects
From: Nicolas Pitre @ 2006-04-27 13:23 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: git, junkio
In-Reply-To: <20060427101254.GA22769@peppar.cs.umu.se>

On Thu, 27 Apr 2006, Peter Hagervall wrote:

> Answering the call Linus made[1], sort of, but for a completely
> different program.
> 
> Anyway, it ought to be at least as portable as the shell script, and a
> whole lot faster, however much that matters.
> 
[...]
> +	for (i = 0; i < 16; i++) {
> +		subdir[0] = hex_digits[i];
> +		for (j = 0; j < 16; j++) {
> +			subdir[1] = hex_digits[j];
> +			if (access(subdir, R_OK | X_OK))
> +				continue;
> +			chdir(subdir);
> +			if (!(dp = opendir("."))) {
> +				error("can't open subdir %s", subdir);
> +				continue;
> +			}

Looks like you're missing a chdir(".."); there.


Nicolas
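
A minimal sketch of the point above, not taken from the patch: once
chdir(subdir) has succeeded, every return path has to step back up,
including the one where opendir() fails.  The helper name count_subdir()
is hypothetical, and the sketch assumes subdir is a real directory
directly below the current one, so that chdir("..") undoes the chdir().

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch: count the entries of one
 * fan-out directory (e.g. "ab") that lives directly below the current
 * directory.  Returns the number of entries, or -1 on failure.  Every
 * return path after a successful chdir() steps back up with chdir("..").
 */
int count_subdir(const char *subdir)
{
	struct dirent *de;
	struct stat st;
	DIR *dp;
	int n = 0;

	assert(subdir != NULL);

	if (chdir(subdir))
		return -1;	/* never entered, nothing to undo */

	dp = opendir(".");
	if (!dp) {
		fprintf(stderr, "can't open subdir %s\n", subdir);
		if (chdir(".."))	/* the step the error path was missing */
			perror("chdir");
		return -1;
	}
	while ((de = readdir(dp)) != NULL) {
		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		if (lstat(de->d_name, &st)) {
			fprintf(stderr, "can't stat file %s\n", de->d_name);
			continue;
		}
		n++;
	}
	closedir(dp);
	if (chdir("..")) {
		perror("chdir");
		return -1;
	}
	return n;
}
```

With the loop body factored out like this, the caller's "continue" can
no longer leave the process stranded inside a fan-out directory.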

^ permalink raw reply

* Re: [PATCH] C version of git-count-objects
From: Morten Welinder @ 2006-04-27 13:16 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: git, junkio
In-Reply-To: <20060427101254.GA22769@peppar.cs.umu.se>

> +                       if (access(subdir, R_OK | X_OK))
> +                               continue;
> +                       chdir(subdir);

You've got yourself a needless race condition right there.  Just
do the chdir and check the return value.  (And besides, access
checks with the wrong set of permissions, should this ever end
up in set[ug]id context.)

Morten
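
A rough sketch of what "just do the chdir and check the return value"
could look like.  The helper name enter_subdir() and the choice to stay
silent on ENOENT are assumptions made for illustration; neither comes
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to enter a fan-out directory.
 * Returns 0 on success, -1 if the directory could not be entered.
 * There is no access() call: chdir() performs the permission check
 * itself (with the effective IDs) at the moment the directory is
 * entered, so nothing can change between "check" and "use".
 */
int enter_subdir(const char *subdir)
{
	assert(subdir != NULL);

	if (chdir(subdir) == 0)
		return 0;
	/* Assumption: a missing fan-out directory is normal; stay quiet. */
	if (errno != ENOENT)
		fprintf(stderr, "can't enter %s: %s\n", subdir, strerror(errno));
	return -1;
}
```

access() checks against the real user and group IDs, while chdir() uses
the effective ones, which is why the two can disagree in a set[ug]id
context.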

^ permalink raw reply

* Re: how to trace the patch?
From: sean @ 2006-04-27 10:57 UTC (permalink / raw)
  To: Aubrey; +Cc: git
In-Reply-To: <6d6a94c50604270306j44c280bdo283591f2f595f74e@mail.gmail.com>

On Thu, 27 Apr 2006 18:06:09 +0800
Aubrey <aubreylee@gmail.com> wrote:

> When I update the kernel git tree after a few days, there can be a
> lot of new patches. If I then find that one file has changed, how can
> I know which patch the modification belongs to?
> How can I find the patch?

Hi Aubrey,

$ git log -- <filename>

shows the list of commits that affected the file you're interested in.

$ git log -p -- <filename>

also includes a diff after each commit, showing how the file was
changed.  If you want to see what other changes happened in each
commit that modified your file, add "--full-diff" to the command above.

Note that you can also replace the <filename> with a <directory> 
to see a list of commits that affected any file below that directory.

HTH,
Sean

^ permalink raw reply

* cg-clone not fetching all tags?
From: Wolfgang Denk @ 2006-04-27 10:52 UTC (permalink / raw)
  To: Git Mailing List

Hi,

it seems that "cg-clone" does not fetch all tags any more; only the
most recent ones (modified in the last N days?) seem to be fetched.
[Possibly the "N days" corresponds to "changing tools to version X",
but I have no way to find out.]

This happens only when using HTTP; using ssh or rsync works fine.
Also, if we follow the "cg-clone" with a "git-fetch -t" command, this
will load the missing tags.

Is this intentional, or am I doing anything wrong?

[For testing, try "cg-clone http://www.denx.de/git/u-boot.git"]

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
In theory, there is no difference between  theory  and  practice.  In
practice, however, there is.

^ permalink raw reply

* [PATCH] C version of git-count-objects
From: Peter Hagervall @ 2006-04-27 10:12 UTC (permalink / raw)
  To: git; +Cc: junkio

Answering the call Linus made[1], sort of, but for a completely
different program.

Anyway, it ought to be at least as portable as the shell script, and a
whole lot faster, however much that matters.

Signed-off-by: Peter Hagervall <hager@cs.umu.se>

[1] http://article.gmane.org/gmane.comp.version-control.git/19073

---

 Makefile             |    5 +--
 count-objects.c      |   56 +++++++++++++++++++++++++++++++++++++
 git-count-objects.sh |   31 --------------------
 3 files changed, 59 insertions(+), 33 deletions(-)


diff --git a/Makefile b/Makefile
index 8ce27a6..53e7591 100644
--- a/Makefile
+++ b/Makefile
@@ -115,7 +115,7 @@ ### --- END CONFIGURATION SECTION ---
 SCRIPT_SH = \
 	git-add.sh git-bisect.sh git-branch.sh git-checkout.sh \
 	git-cherry.sh git-clean.sh git-clone.sh git-commit.sh \
-	git-count-objects.sh git-diff.sh git-fetch.sh \
+	git-diff.sh git-fetch.sh \
 	git-format-patch.sh git-ls-remote.sh \
 	git-merge-one-file.sh git-parse-remote.sh \
 	git-prune.sh git-pull.sh git-push.sh git-rebase.sh \
@@ -165,7 +165,8 @@ PROGRAMS = \
 	git-upload-pack$X git-verify-pack$X git-write-tree$X \
 	git-update-ref$X git-symbolic-ref$X git-check-ref-format$X \
 	git-name-rev$X git-pack-redundant$X git-repo-config$X git-var$X \
-	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X
+	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X \
+	git-count-objects$X
 
 BUILT_INS = git-log$X
 
diff --git a/count-objects.c b/count-objects.c
new file mode 100644
index 0000000..67ab6f0
--- /dev/null
+++ b/count-objects.c
@@ -0,0 +1,56 @@
+#include "cache.h"
+#include "git-compat-util.h"
+
+static char pathname[PATH_MAX + 1];
+static int numobjects, numblocks;
+static const char hex_digits[] = "0123456789abcdef";
+
+void count_objects(void)
+{
+	char subdir[3];
+	int i, j;
+	struct stat statbuf;
+	struct dirent *dirp;
+	DIR *dp;
+	subdir[2] = '\0';
+	for (i = 0; i < 16; i++) {
+		subdir[0] = hex_digits[i];
+		for (j = 0; j < 16; j++) {
+			subdir[1] = hex_digits[j];
+			if (access(subdir, R_OK | X_OK))
+				continue;
+			chdir(subdir);
+			if (!(dp = opendir("."))) {
+				error("can't open subdir %s", subdir);
+				continue;
+			}
+			while ((dirp = readdir(dp))) {
+				if (!strcmp(dirp->d_name, ".") ||
+					!strcmp(dirp->d_name, ".."))
+					continue;
+				if (lstat(dirp->d_name, &statbuf)) {
+					error("can't stat file %s", dirp->d_name);
+					continue;
+				}
+				numblocks += statbuf.st_blocks;
+				numobjects++;
+			}
+			closedir(dp);
+			chdir("..");
+		}
+	}
+}
+
+int main(int argc, char **argv)
+{
+	setup_git_directory();
+
+	if (chdir(".git/objects"))
+		die("%s", strerror(errno));
+
+	count_objects();
+
+	printf("%d objects, %d kilobytes\n", numobjects, numblocks / 2);
+
+	return 0;
+}
diff --git a/git-count-objects.sh b/git-count-objects.sh
deleted file mode 100755
index 40c58ef..0000000
--- a/git-count-objects.sh
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-
-GIT_DIR=`git-rev-parse --git-dir` || exit $?
-
-dc </dev/null 2>/dev/null || {
-	# This is not a real DC at all -- it just knows how
-	# this script feeds DC and does the computation itself.
-	dc () {
-		while read a b
-		do
-			case $a,$b in
-			0,)	acc=0 ;;
-			*,+)	acc=$(($acc + $a)) ;;
-			p,)	echo "$acc" ;;
-			esac
-		done
-	}
-}
-
-echo $(find "$GIT_DIR/objects"/?? -type f -print 2>/dev/null | wc -l) objects, \
-$({
-    echo 0
-    # "no-such" is to help Darwin folks by not using xargs -r.
-    find "$GIT_DIR/objects"/?? -type f -print 2>/dev/null |
-    xargs du -k "$GIT_DIR/objects/no-such" 2>/dev/null |
-    sed -e 's/[ 	].*/ +/'
-    echo p
-} | dc) kilobytes

^ permalink raw reply related

* how to trace the patch?
From: Aubrey @ 2006-04-27 10:06 UTC (permalink / raw)
  To: git

Hi all,

When I update the kernel git tree after a few days, there can be a
lot of new patches. If I then find that one file has changed, how can
I know which patch the modification belongs to?
How can I find the patch?
Many thanks,

Regards,
-Aubrey

^ permalink raw reply

* Re: [PATCH] Add --continue and --abort options to git-rebase.
From: sean @ 2006-04-27  5:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3bg0nlvb.fsf@assigned-by-dhcp.cox.net>

On Wed, 26 Apr 2006 13:05:28 -0700
Junio C Hamano <junkio@cox.net> wrote:

> sean <seanlkml@sympatico.ca> writes:
> 
> >   git rebase [--onto <newbase>] <upstream> [<branch>]
> >   git rebase --continue
> >   git rebase --abort
> >
> > ---
> >
> > Take 2.  Much simpler patch which doesn't try to
> > rejigger the command line too much.
> 
> This second round seems to make more sense.  Sign-off?
> 

Sure,

    Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>

Sean

^ permalink raw reply

