Git development
* Re: [PATCH] make file merging respect permissions
From: James Bottomley @ 2005-04-23 23:21 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <20050423230238.GD13222@pasky.ji.cz>

On Sun, 2005-04-24 at 01:02 +0200, Petr Baudis wrote:
> *cough*

OK, dirty file in the local tree, sorry.

This is the actual diff

---

1) permissions aren't respected in the merge script (primarily because
they're never passed in to it in the first place).  Fix that and also
check for permission conflicts in the merge

2) the delete of a file in both branches may indeed be just that, but it
could also be the indicator of a rename conflict (file moved to
different locations in both branches), so error out and ask the
committer for guidance.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

--- a/git-merge-one-file-script
+++ b/git-merge-one-file-script
@@ -20,23 +20,45 @@ mkdir -p "$dir"
 
 case "${1:-.}${2:-.}${3:-.}" in
 #
-# deleted in both, or deleted in one and unchanged in the other
+# deleted in both
+#
+"$1..")
+	echo "ERROR: $4 is removed in both branches"
+	echo "ERROR: This is a potential rename conflict"
+	exit 1;;
+#
+# deleted in one and unchanged in the other
 #
 "$1.." | "$1.$1" | "$1$1.")
 	rm -f -- "$4"
+	echo "Removing $4"
 	update-cache --remove -- "$4"
 	exit 0
 	;;
 
 #
-# added in one, or added identically in both
+# added in one
 #
-".$2." | "..$3" | ".$2$2")
-	mv $(unpack-file "${2:-$3}") $4
+".$2." | "..$3" )
+	echo "Adding $4 with perm $6$7"
+	mv $(unpack-file "$2$3") $4
+	chmod "$6$7" $4
 	update-cache --add -- $4
 	exit 0
 	;;
-
+#
+# Added in both (check for same permissions)
+#
+".$2$2")
+	if [ "$6" != "$7" ]; then
+		echo "ERROR: File $4 added in both branches, permissions conflict $6->$7"
+		exit 1
+	fi
+	echo "Adding $4 with perm $6"
+	mv $(unpack-file "$2") $4
+	chmod "$6" $4
+	update-cache --add -- $4
+	exit 0;;
 #
 # Modified in both, but differently ;(
 #
@@ -46,12 +68,21 @@ case "${1:-.}${2:-.}${3:-.}" in
 	src1=$(unpack-file $2)
 	src2=$(unpack-file $3)
 	merge "$src2" "$orig" "$src1"
-	if [ $? -ne 0 ]; then
-		echo Leaving conflict merge in $src2
+	ret=$?
+	if [ "$6" != "$7" ]; then
+		echo "ERROR: Permissions $5->$6->$7 don't match merging $src2"
+		if [ $ret -ne 0 ]; then
+			echo "ERROR: Leaving conflict merge in $src2"
+		fi
+		exit 1
+	fi
+	chmod -- "$6" "$src2"
+	if [ $ret -ne 0 ]; then
+		echo "ERROR: Leaving conflict merge in $src2"
 		exit 1
 	fi
-	cp "$src2" "$4" && update-cache --add -- "$4" && exit 0
+	cp -- "$src2" "$4" && chmod -- "$6" "$4" &&  update-cache --add -- "$4" && exit 0
 	;;
 
 *)
--- a/merge-cache.c
+++ b/merge-cache.c
@@ -4,7 +4,7 @@
 #include "cache.h"
 
 static const char *pgm = NULL;
-static const char *arguments[5];
+static const char *arguments[8];
 
 static void run_program(void)
 {
@@ -18,6 +18,9 @@ static void run_program(void)
 			    arguments[2],
 			    arguments[3],
 			    arguments[4],
+			    arguments[5],
+			    arguments[6],
+			    arguments[7],
 			    NULL);
 		die("unable to execute '%s'", pgm);
 	}
@@ -36,9 +39,13 @@ static int merge_entry(int pos, const ch
 	arguments[2] = "";
 	arguments[3] = "";
 	arguments[4] = path;
+	arguments[5] = "";
+	arguments[6] = "";
+	arguments[7] = "";
 	found = 0;
 	do {
 		static char hexbuf[4][60];
+		static char ownbuf[4][60];
 		struct cache_entry *ce = active_cache[pos];
 		int stage = ce_stage(ce);
 
@@ -46,7 +53,9 @@ static int merge_entry(int pos, const ch
 			break;
 		found++;
 		strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
+		sprintf(ownbuf[stage], "%o", ntohl(ce->ce_mode) & (~S_IFMT));
 		arguments[stage] = hexbuf[stage];
+		arguments[stage + 4] = ownbuf[stage];
 	} while (++pos < active_nr);
 	if (!found)
 		die("merge-cache: %s not in the cache", path);




* Re: Git-commits mailing list feed.
From: Linus Torvalds @ 2005-04-23 23:29 UTC (permalink / raw)
  To: Jan Harkes
  Cc: David Woodhouse, Jan Dittmer, Greg KH, Kernel Mailing List,
	Git Mailing List
In-Reply-To: <20050423204957.GA16751@delft.aura.cs.cmu.edu>



On Sat, 23 Apr 2005, Jan Harkes wrote:
> 
> I respectfully disagree,
> 
> rsync works fine for now, but people are already looking at implementing
> smarter (more efficient) ways to synchronize git repositories by
> grabbing missing commits, and from there fetching any missing tree and
> file blobs.

But this is a _feature_.

Other people normally shouldn't be interested in your tags. I think it's a 
mistake to make everybody care.

So you normally would fetch only tags you _know_ about. For example, one 
of the reasons we've been _avoiding_ personal tags in the BK trees is that 
it just gets really ugly really quickly because they get percolated up to 
everybody else. That means that in a BK tree, you can't sanely use tags 
for "private" stuff, like telling somebody else "please sync with this 
tag".

So having the tag in the object database means that fsck etc will notice 
these things, and can build up a list of tags you know about. It also 
means that you can have tag-aware synchronization tools, ie exactly the 
kind of tools that only grab missing commits can also then be used to 
select missing tags according to some _private_ understanding of what tags 
you might want to find..

		Linus


* Re: [patch] fixup GECOS handling
From: Petr Baudis @ 2005-04-23 23:38 UTC (permalink / raw)
  To: Martin Schlemmer; +Cc: kyle, GIT Mailing Lists
In-Reply-To: <1114196803.29271.52.camel@nosferatu.lan>

Dear diary, on Fri, Apr 22, 2005 at 09:06:43PM CEST, I got a letter
where Martin Schlemmer <azarah@nosferatu.za.org> told me that...
> @@ -311,6 +296,17 @@
>         if (!pw)
>                 die("You don't exist. Go away!");
>         realgecos = pw->pw_gecos;
> +       /*
> +        * The GECOS fields are seperated via ',' on Linux, FreeBSD, etc,
> +        * and ';' on AIX.
> +        */
> +#if defined(__aix__)
> +       if (strchr(realgecos, ';'))
> +               *strchr(realgecos, ';') = 0;
> +#else
> +       if (strchr(realgecos, ','))
> +               *strchr(realgecos, ',') = 0;
> +#endif
>         len = strlen(pw->pw_name);
>         memcpy(realemail, pw->pw_name, len);
>         realemail[len] = '@';

I'm confused, what does this have to do with AIX? Do we even have / can
expect to have any major AIX users?

I'm not too happy with this, I have to say. It seems it won't always do
the right thing anyway. I would still favour the approach where you cut
off everything after ';', and everything after ',' if no ';' is found.
Seems simplest, safest, etc.

Tell me about anyone who has a semicolon in his realname.
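
The fallback described above (cut at ';', or at ',' when there is no ';') could be sketched roughly like this. `trim_gecos` is a hypothetical name, not a function from the patch under discussion.

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only: truncate a GECOS string
 * in place at the first ';', or at the first ',' if no ';' is present.
 */
static void trim_gecos(char *gecos)
{
	char *sep = strchr(gecos, ';');

	if (!sep)
		sep = strchr(gecos, ',');
	if (sep)
		*sep = '\0';
}
```

The string is modified in place, so the caller would need to pass a writable buffer rather than a string literal.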

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH] Add help details to git help command. (This time with Perl)
From: Petr Baudis @ 2005-04-23 23:41 UTC (permalink / raw)
  To: David Greaves; +Cc: Steven Cole, git
In-Reply-To: <42677284.1010005@dgreaves.com>

Please ignore the reply marker part. ;-)

Sorry,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* [PATCH] Fix broken diff-cache output on added files
From: Petr Baudis @ 2005-04-23 23:43 UTC (permalink / raw)
  To: torvalds; +Cc: git

Added files were erroneously reported with the - prefix by diff-cache,
obviously leading to great confusion.

Signed-off-by: Petr Baudis <pasky@ucw.cz>

Index: diff-cache.c
===================================================================
--- 099679c62a98433d9d9b38581f39563c9574478e/diff-cache.c  (mode:100644 sha1:b407d753e520fa0b1523d770d98b3015af197275)
+++ 3df862ae5cc66733dab3d8bd5c4ea359b2ca1884/diff-cache.c  (mode:100644 sha1:2ec6c29ab6b79a10277a2ff9021a2032d656abf0)
@@ -57,7 +57,7 @@
 		}
 		/* No matching 1-stage (tree) entry? Show the current one as added */
 		if (entries == 1 || !same_name(ce, ac[1])) {
-			show_file("-", ce);
+			show_file("+", ce);
 			ac++;
 			entries--;
 			continue;


-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [patch] fixup GECOS handling
From: Martin Schlemmer @ 2005-04-23 23:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: kyle, GIT Mailing Lists
In-Reply-To: <20050423233821.GN13222@pasky.ji.cz>

On Sun, 2005-04-24 at 01:38 +0200, Petr Baudis wrote:
> Dear diary, on Fri, Apr 22, 2005 at 09:06:43PM CEST, I got a letter
> where Martin Schlemmer <azarah@nosferatu.za.org> told me that...
> > @@ -311,6 +296,17 @@
> >         if (!pw)
> >                 die("You don't exist. Go away!");
> >         realgecos = pw->pw_gecos;
> > +       /*
> > +        * The GECOS fields are seperated via ',' on Linux, FreeBSD, etc,
> > +        * and ';' on AIX.
> > +        */
> > +#if defined(__aix__)
> > +       if (strchr(realgecos, ';'))
> > +               *strchr(realgecos, ';') = 0;
> > +#else
> > +       if (strchr(realgecos, ','))
> > +               *strchr(realgecos, ',') = 0;
> > +#endif
> >         len = strlen(pw->pw_name);
> >         memcpy(realemail, pw->pw_name, len);
> >         realemail[len] = '@';
> 
> I'm confused, what does this has to do with AIX? Do we even have / can
> expect to have any major AIX users?
> 

Given.

> I'm not too happy with this, I have to say. It seems it won't do always
> the right thing anyway. I would still favour the approach when you cut
> off everything after ';', and everything after ',' if no ';' is found.
> Seems simplest, safest, etc.
> 
> Tell me about anyone who has a semicolon in his realname.
> 

Point I guess is still that the only valid delimiter on linux is ',',
and the only reason for the ';' was because of some aix/whatever user
saying that is a delimiter as well. But like I said: cat $this >
/dev/null ... This is basically the same type of discussion as the
hash collision one, and I'm sure we all have better things to do.


Thanks,

-- 
Martin Schlemmer




* Re: Hash collision count
From: Petr Baudis @ 2005-04-23 23:46 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Ray Heasman, Git Mailing List, Linus Torvalds
In-Reply-To: <426AD835.5070404@pobox.com>

Dear diary, on Sun, Apr 24, 2005 at 01:20:21AM CEST, I got a letter
where Jeff Garzik <jgarzik@pobox.com> told me that...
> Second, in your scenario, it's highly unlikely you would get 4 billion 
> sha1 hash collisions, even if you had the disk space to store such a git 
> database.

It's highly unlikely you would get a _single_ collision.

> First, the hash is NOT unique.
> 
> Second, you lose data if you pretend it is unique.  I don't like losing 
> data.

*sigh*

We've been through this before, haven't we?

> Third, a data check only occurs in the highly unlikely case that a hash 
> already exists -- a collision.  Rather than "trillions of times", more 
> like "one in a trillion chance."

No, a collision is a pretty common thing, actually. It's the main power of
git - when you do read-tree, modify it and do write-tree
(typically when doing commit), everything you didn't modify (99% of
stuff, most likely) is basically a collision - but it's ok since it
just stays the same.
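
Here "collision" means identical content hashing to a name that is already in the object database, so nothing new gets written. A toy sketch of that behaviour follows. It is purely illustrative: `toy_hash` (64-bit FNV-1a) stands in for SHA-1, and `store_put` with its fixed-size table is a hypothetical stand-in for the real object store.

```c
#include <stddef.h>
#include <stdint.h>

#define STORE_MAX 64

/* Toy stand-in for SHA-1 (64-bit FNV-1a); an assumption for illustration. */
static uint64_t toy_hash(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t h = 14695981039346656037ULL;

	while (len--) {
		h ^= *p++;
		h *= 1099511628211ULL;
	}
	return h;
}

static uint64_t store_keys[STORE_MAX];
static size_t store_count;

/*
 * Returns 1 if the object was newly stored, 0 if an object with the
 * same name already existed, -1 if the toy store is full.
 */
static int store_put(const void *data, size_t len)
{
	uint64_t key = toy_hash(data, len);
	size_t i;

	for (i = 0; i < store_count; i++)
		if (store_keys[i] == key)
			return 0;	/* same name: nothing to write */
	if (store_count == STORE_MAX)
		return -1;
	store_keys[store_count++] = key;
	return 1;
}
```

Storing the same bytes twice returns 1 and then 0; the second call is the harmless kind of "collision" described above.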

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Humble request of 'git' developers + [PATCH] Slight enhancement of GIT wrapper
From: Pavel Pisa @ 2005-04-23 23:47 UTC (permalink / raw)
  To: Jeff Garzik, git, Petr Baudis

Hello All,

I am resending the patch from 13 April which I sent to Petr Baudis.
He said that he would combine the best ideas from several similar
patches, but he has probably not found time for that. It is updated
according to his and Johannes Schindelin's remarks.

It makes it possible to move the growing number of git scripts and
programs to any directory outside the search path. Only the "git"
script needs to be symbolically linked into some directory on the
search path. This solution has worked for me with many versions of
"git-pasky".

I would vote for this solution and for a single publicly exposed
multiplexer script. This is how big, portable, location-independent
projects keep clutter out of the search path.

Best wishes

                Pavel Pisa
        e-mail: pisa@cmp.felk.cvut.cz
        www:    http://cmp.felk.cvut.cz/~pisa
        work:   http://www.pikron.com




Slight enhancement of GIT wrapper

The git multiplexer and scripts can reside in non-PATH directories;
linking the git multiplexer into some directory on the search path does
the trick.

Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>

--- git.orig	2005-04-24 00:12:56.000000000 +0200
+++ git	2005-04-24 01:28:01.000000000 +0200
@@ -17,6 +17,28 @@
 	exit 1
 }
 
+if [ -z "$GIT_TOOLS_DIR" ] ; then
+	GIT_MUXBINARY_DIR="$(dirname "$0")"
+	if [ -h "$0" ]; then
+		#GIT_TOOLS_DIR="$(ls -l "$0" | sed 's/^.*-> *\(.*\) *$/\1/')"
+		GIT_TOOLS_DIR="$(ls -L "$0")"
+	        GIT_TOOLS_DIR="$(dirname "$GIT_TOOLS_DIR")"
+		GIT_TOOLS_DIR="$(cd "$GIT_MUXBINARY_DIR" ; cd "$GIT_TOOLS_DIR" ; pwd )"
+	else
+		GIT_TOOLS_DIR="$(cd "$GIT_MUXBINARY_DIR" ; pwd )"
+	fi
+
+	if [ -z "$(echo :$PATH: | sed -n -e 's#:'"$GIT_TOOLS_DIR"':#yes#p' )" ] ; then
+		export PATH="$GIT_TOOLS_DIR:$PATH"
+	fi
+
+	export GIT_TOOLS_DIR
+fi
+
+#echo GIT_TOOLS_DIR=$GIT_TOOLS_DIR
+#echo PATH=$PATH
+#echo CWD=$(pwd)
+#exit 
 
 help () {
 	cat <<__END__


* [PATCH] Simplify building of programs
From: Jonas Fonseca @ 2005-04-23 23:59 UTC (permalink / raw)
  To: torvalds; +Cc: git

Don't list libgit.a twice when compiling programs and make them depend
on the .c files so .o files are not left behind.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>

--- 66b3fa5bde838935121a2eb7cf4b67587c32de13/Makefile  (mode:100644 sha1:e26b7c3695bf7ee88a75dcb6fd1953ce8b33c748)
+++ uncommitted/Makefile  (mode:100644)
@@ -49,8 +49,7 @@
 LIB_H=cache.h object.h
 
 
-LIBS = $(LIB_FILE)
-LIBS += -lz
+LIBS = -lz
 
 ifdef MOZILLA_SHA1
 	SHA1_HEADER="mozilla-sha1/sha1.h"
@@ -70,7 +69,7 @@
 
 all: $(PROG) $(GEN_SCRIPT)
 
-$(PROG):%: %.o $(LIB_FILE)
+$(PROG):%: %.c $(LIB_FILE)
 	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
 
 $(LIB_FILE): $(LIB_OBJS)

-- 
Jonas Fonseca


* [PATCH 0/5] Better merge-base, alternative transport programs
From: Daniel Barkalow @ 2005-04-24  0:03 UTC (permalink / raw)
  To: git

This series contains three patches to add functionality to the library
routines necessary for the rest of the series, a patch to change the
merge-base implementation such that it always returns one of its arguments
when possible (by way of using the date-based algorithm), and a patch to
support fetching what is needed from a repository by HTTP, and both
pushing and pulling by ssh.

 1: Add some functions for commit lists
 2: Parse tree objects completely
 3: Add some functions related to files
 4: Replace merge-base
 5: Add push and pull programs

	-Daniel
*This .sig left intentionally blank*



* [PATCH 1/5] Add some functions for commit lists
From: Daniel Barkalow @ 2005-04-24  0:07 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0504231953490.30848-100000@iabervon.org>

This adds a function for inserting an item in a commit list, a function
for sorting a commit list by date, and a function for progressively
scanning a commit history from most recent to least recent.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: commit.c
===================================================================
--- 329aca984ad6d06eb6d2dffae3933f00ccb8df5a/commit.c  (mode:100644 sha1:9f0668eb68cec56a738a58fe930ae0ae2960e2b2)
+++ e09a6d73a7c6c7a8bfb7e7003a34a507ed97a3b6/commit.c  (mode:100644 sha1:911f6435a74b93f6d25c6852d1814fa8dbaf626e)
@@ -63,12 +63,9 @@
 	bufptr += 46; /* "tree " + "hex sha1" + "\n" */
 	while (!memcmp(bufptr, "parent ", 7) &&
 	       !get_sha1_hex(bufptr + 7, parent)) {
-		struct commit_list *new_parent = 
-			malloc(sizeof(struct commit_list));
-		new_parent->next = item->parents;
-		new_parent->item = lookup_commit(parent);
-		add_ref(&item->object, &new_parent->item->object);
-		item->parents = new_parent;
+		struct commit *new_parent = lookup_commit(parent);
+		commit_list_insert(new_parent, &item->parents);
+		add_ref(&item->object, &new_parent->object);
 		bufptr += 48;
 	}
 	item->date = parse_commit_date(bufptr);
@@ -76,6 +73,14 @@
 	return 0;
 }
 
+void commit_list_insert(struct commit *item, struct commit_list **list_p)
+{
+	struct commit_list *new_list = malloc(sizeof(struct commit_list));
+	new_list->item = item;
+	new_list->next = *list_p;
+	*list_p = new_list;
+}
+
 void free_commit_list(struct commit_list *list)
 {
 	while (list) {
@@ -84,3 +89,44 @@
 		free(temp);
 	}
 }
+
+static void insert_by_date(struct commit_list **list, struct commit *item)
+{
+	struct commit_list **pp = list;
+	struct commit_list *p;
+	while ((p = *pp) != NULL) {
+		if (p->item->date < item->date) {
+			break;
+		}
+		pp = &p->next;
+	}
+	commit_list_insert(item, pp);
+}
+
+	
+void sort_by_date(struct commit_list **list)
+{
+	struct commit_list *ret = NULL;
+	while (*list) {
+		insert_by_date(&ret, (*list)->item);
+		*list = (*list)->next;
+	}
+	*list = ret;
+}
+
+struct commit *pop_most_recent_commit(struct commit_list **list)
+{
+	struct commit *ret = (*list)->item;
+	struct commit_list *parents = ret->parents;
+	struct commit_list *old = *list;
+
+	*list = (*list)->next;
+	free(old);
+
+	while (parents) {
+		parse_commit(parents->item);
+		insert_by_date(list, parents->item);
+		parents = parents->next;
+	}
+	return ret;
+}
Index: commit.h
===================================================================
--- 329aca984ad6d06eb6d2dffae3933f00ccb8df5a/commit.h  (mode:100644 sha1:4afd27b1095cf9f9203c96db2b9f2b0bba5063d8)
+++ e09a6d73a7c6c7a8bfb7e7003a34a507ed97a3b6/commit.h  (mode:100644 sha1:c8684d1cd07d7c9ed0af06a3f3d9e7b49fbed0a2)
@@ -22,6 +22,15 @@
 
 int parse_commit(struct commit *item);
 
+void commit_list_insert(struct commit *item, struct commit_list **list_p);
+
 void free_commit_list(struct commit_list *list);
 
+void sort_by_date(struct commit_list **list);
+
+/** Removes the first commit from a list sorted by date, and adds all
+ * of its parents.
+ **/
+struct commit *pop_most_recent_commit(struct commit_list **list);
+
 #endif /* COMMIT_H */
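
The date-ordered insertion at the heart of this patch can be tried in isolation with the sketch below. The structures are simplified stand-ins (an assumption) for the ones in commit.h, and `list_insert` only mirrors the role of commit_list_insert().

```c
#include <stdlib.h>

/* Simplified stand-ins for the structures in commit.h (assumption). */
struct commit {
	unsigned long date;
};

struct commit_list {
	struct commit *item;
	struct commit_list *next;
};

/* Link a new node holding `item` in front of *list_p. */
static void list_insert(struct commit *item, struct commit_list **list_p)
{
	struct commit_list *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->item = item;
	n->next = *list_p;
	*list_p = n;
}

/* Keep the list ordered from most recent to least recent. */
static void insert_by_date(struct commit_list **list, struct commit *item)
{
	struct commit_list **pp = list;

	while (*pp && (*pp)->item->date >= item->date)
		pp = &(*pp)->next;
	list_insert(item, pp);
}
```

Walking a pointer-to-pointer means insertion at the head and in the middle go through the same code path, with no special case for an empty list.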



* [PATCH 2/5] Parse tree objects completely
From: Daniel Barkalow @ 2005-04-24  0:10 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0504231953490.30848-100000@iabervon.org>

This adds the contents of trees to struct tree.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
commit af03ca2bdc01fdc2565c2914285d9c3ccb1205d3
tree 144a13fb75a39538ec4578792d2c374c6ef50f46
parent fda07b139124925a8000207fb1d91feec1fe675d
author Daniel Barkalow <barkalow@iabervon.org> 1114296377 -0400
committer Daniel Barkalow <barkalow@silva-tulga.(none)> 1114296377 -0400

    Parse tree objects completely

Index: tree.c
===================================================================
--- e09a6d73a7c6c7a8bfb7e7003a34a507ed97a3b6/tree.c  (mode:100644 sha1:e988aed6a85d15568dcb93b69035b97a24e30cc9)
+++ 144a13fb75a39538ec4578792d2c374c6ef50f46/tree.c  (mode:100644 sha1:79b9625855c017ce0298f62cc398ed4d16964cb1)
@@ -92,6 +92,7 @@
 	char type[20];
 	void *buffer, *bufptr;
 	unsigned long size;
+	struct tree_entry_list **list_p;
 	if (item->object.parsed)
 		return 0;
 	item->object.parsed = 1;
@@ -103,8 +104,10 @@
 	if (strcmp(type, tree_type))
 		return error("Object %s not a tree",
 			     sha1_to_hex(item->object.sha1));
+	list_p = &item->entries;
 	while (size) {
 		struct object *obj;
+		struct tree_entry_list *entry;
 		int len = 1+strlen(bufptr);
 		unsigned char *file_sha1 = bufptr + len;
 		char *path = strchr(bufptr, ' ');
@@ -113,6 +116,12 @@
 		    sscanf(bufptr, "%o", &mode) != 1)
 			return -1;
 
+		entry = malloc(sizeof(struct tree_entry_list));
+		entry->name = strdup(path + 1);
+		entry->directory = S_ISDIR(mode);
+		entry->executable = mode & S_IXUSR;
+		entry->next = NULL;
+
 		/* Warn about trees that don't do the recursive thing.. */
 		if (strchr(path, '/')) {
 			item->has_full_path = 1;
@@ -121,12 +130,17 @@
 		bufptr += len + 20;
 		size -= len + 20;
 
-		if (S_ISDIR(mode)) {
-			obj = &lookup_tree(file_sha1)->object;
+		if (entry->directory) {
+			entry->item.tree = lookup_tree(file_sha1);
+			obj = &entry->item.tree->object;
 		} else {
-			obj = &lookup_blob(file_sha1)->object;
+			entry->item.blob = lookup_blob(file_sha1);
+			obj = &entry->item.blob->object;
 		}
 		add_ref(&item->object, obj);
+
+		*list_p = entry;
+		list_p = &entry->next;
 	}
 	return 0;
 }
Index: tree.h
===================================================================
--- e09a6d73a7c6c7a8bfb7e7003a34a507ed97a3b6/tree.h  (mode:100644 sha1:4d5496de307999f5ada8412259e0e86d2c8092de)
+++ 144a13fb75a39538ec4578792d2c374c6ef50f46/tree.h  (mode:100644 sha1:19b190565957a7a03c34f7efa68a7fe0c6783d04)
@@ -5,9 +5,21 @@
 
 extern const char *tree_type;
 
+struct tree_entry_list {
+	struct tree_entry_list *next;
+	unsigned directory : 1;
+	unsigned executable : 1;
+	char *name;
+	union {
+		struct tree *tree;
+		struct blob *blob;
+	} item;
+};
+
 struct tree {
 	struct object object;
 	unsigned has_full_path : 1;
+	struct tree_entry_list *entries;
 };
 
 struct tree *lookup_tree(unsigned char *sha1);
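
The tree.c hunk keeps entries in file order by carrying a pointer to the last `next` slot (list_p). A stripped-down sketch of that tail-pointer append follows; `struct entry` and `append_entry` are hypothetical names used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced version of struct tree_entry_list. */
struct entry {
	char *name;
	struct entry *next;
};

/*
 * Append a copy of `name` at *tail_p and return the address of the new
 * node's `next` field, which is the slot for the following append.
 */
static struct entry **append_entry(struct entry **tail_p, const char *name)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->name = malloc(strlen(name) + 1);
	if (!e->name)
		abort();
	strcpy(e->name, name);
	e->next = NULL;
	*tail_p = e;
	return &e->next;
}
```

Starting with `tail = &head`, each call of the form `tail = append_entry(tail, name)` adds to the end in constant time.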



* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Paul Jackson @ 2005-04-24  0:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: dan, torvalds, greg, git
In-Reply-To: <20050421064931.GA31910@pasky.ji.cz>

> A little off-topic, anyone knows how to turn off that damn alternate
> screen thing on the xterm level? 

Do you mean the 'feature' where it clears the screen of the
last page you were viewing on exit from 'less'?

The following stops that clearing:

    export LESS=-X

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Paul Jackson @ 2005-04-24  0:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: pasky, greg, git
In-Reply-To: <Pine.LNX.4.58.0504201809170.2344@ppc970.osdl.org>

Linus wrote:
+				echo; sed 's/^/  /'

One can avoid adding useless trailing spaces on empty lines with:

+				echo; sed 's/./  &/'

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401


* [PATCH 3/5] Additional functions for the objects database
From: Daniel Barkalow @ 2005-04-24  0:15 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0504231953490.30848-100000@iabervon.org>

This adds two functions: one to check if an object is present in the local
database, and one to add an object to the local database by reading it
from a file descriptor and checking its hash.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: cache.h
===================================================================
--- 144a13fb75a39538ec4578792d2c374c6ef50f46/cache.h  (mode:100644 sha1:bf30ac4741d2eeeb483079f566182505898082f3)
+++ cae140a16189361d8c9f1f7e68ef519956fd26d9/cache.h  (mode:100644 sha1:794d676a5cf5c9a03309c4b368840f8707cfcf46)
@@ -122,11 +122,16 @@
 extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size);
 extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size);
 extern int write_sha1_file(char *buf, unsigned len, unsigned char *return_sha1);
+
 extern int check_sha1_signature(unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
 /* Read a tree into the cache */
 extern int read_tree(void *buffer, unsigned long size, int stage);
 
+extern int write_sha1_from_fd(const unsigned char *sha1, int fd);
+
+extern int has_sha1_file(const unsigned char *sha1);
+
 /* Convert to/from hex/sha1 representation */
 extern int get_sha1_hex(const char *hex, unsigned char *sha1);
 extern char *sha1_to_hex(const unsigned char *sha1);	/* static buffer result! */
Index: sha1_file.c
===================================================================
--- 144a13fb75a39538ec4578792d2c374c6ef50f46/sha1_file.c  (mode:100644 sha1:66308ede85c2dad6b184fb74a7215b06a173d8f7)
+++ cae140a16189361d8c9f1f7e68ef519956fd26d9/sha1_file.c  (mode:100644 sha1:97a515a073fec5870dfaaa279868ce9330853d3d)
@@ -328,3 +328,75 @@
 	close(fd);
 	return 0;
 }
+
+int write_sha1_from_fd(const unsigned char *sha1, int fd)
+{
+	char *filename = sha1_file_name(sha1);
+
+	int local;
+	z_stream stream;
+	unsigned char real_sha1[20];
+	char buf[4096];
+	char discard[4096];
+	int ret;
+	SHA_CTX c;
+
+	local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+
+	if (local < 0)
+		return error("Couldn't open %s\n", filename);
+
+	memset(&stream, 0, sizeof(stream));
+
+	inflateInit(&stream);
+
+	SHA1_Init(&c);
+
+	do {
+		ssize_t size;
+		size = read(fd, buf, 4096);
+		if (size <= 0) {
+			close(local);
+			unlink(filename);
+			if (!size)
+				return error("Connection closed?");
+			perror("Reading from connection");
+			return -1;
+		}
+		write(local, buf, size);
+		stream.avail_in = size;
+		stream.next_in = buf;
+		do {
+			stream.next_out = discard;
+			stream.avail_out = sizeof(discard);
+			ret = inflate(&stream, Z_SYNC_FLUSH);
+			SHA1_Update(&c, discard, sizeof(discard) -
+				    stream.avail_out);
+		} while (stream.avail_in && ret == Z_OK);
+		
+	} while (ret == Z_OK);
+	inflateEnd(&stream);
+
+	close(local);
+	SHA1_Final(real_sha1, &c);
+	if (ret != Z_STREAM_END) {
+		unlink(filename);
+		return error("File %s corrupted", sha1_to_hex(sha1));
+	}
+	if (memcmp(sha1, real_sha1, 20)) {
+		unlink(filename);
+		return error("File %s has bad hash\n", sha1_to_hex(sha1));
+	}
+	
+	return 0;
+}
+
+int has_sha1_file(const unsigned char *sha1)
+{
+	char *filename = sha1_file_name(sha1);
+	struct stat st;
+
+	if (!stat(filename, &st))
+		return 1;
+	return 0;
+}
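
The presence check in has_sha1_file() comes down to a stat() call on the object's path. A standalone sketch of the same test follows; `file_exists` is a hypothetical name and takes an ordinary path, since sha1_file_name() is not available outside the git sources.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `path` can be stat()ed, 0 otherwise. */
static int file_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0;
}
```

Like the original, this reports "absent" for any stat() failure, including permission errors, not only for a missing file.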



* [PATCH 4/5] Replace merge-base implementation
From: Daniel Barkalow @ 2005-04-24  0:18 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0504231953490.30848-100000@iabervon.org>

The old implementation was a nice algorithm, but, unfortunately, it could
be confused in some cases and would not necessarily do the obvious thing
if one argument was descended from the other. This version fixes that by
changing the criterion to the most recent common ancestor.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: merge-base.c
===================================================================
--- cae140a16189361d8c9f1f7e68ef519956fd26d9/merge-base.c  (mode:100644 sha1:ac1153bc5646cb2d515ff206b759f4a79e90273a)
+++ 9b75904eab1300d83264a1840d396160482fee88/merge-base.c  (mode:100644 sha1:0e4c58ede915aca5719bbd12ecd1945f2f300590)
@@ -5,67 +5,63 @@
 static struct commit *process_list(struct commit_list **list_p, int this_mark,
 				   int other_mark)
 {
-	struct commit_list *parent, *temp;
-	struct commit_list *posn = *list_p;
-	*list_p = NULL;
-	while (posn) {
-		parse_commit(posn->item);
-		if (posn->item->object.flags & this_mark) {
-			/*
-			  printf("%d already seen %s %x\n",
-			  this_mark
-			  sha1_to_hex(posn->parent->sha1),
-			  posn->parent->flags);
-			*/
-			/* do nothing; this indicates that this side
-			 * split and reformed, and we only need to
-			 * mark it once.
-			 */
-		} else if (posn->item->object.flags & other_mark) {
-			return posn->item;
-		} else {
-			/*
-			  printf("%d based on %s\n",
-			  this_mark,
-			  sha1_to_hex(posn->parent->sha1));
-			*/
-			posn->item->object.flags |= this_mark;
-			
-			parent = posn->item->parents;
-			while (parent) {
-				temp = malloc(sizeof(struct commit_list));
-				temp->next = *list_p;
-				temp->item = parent->item;
-				*list_p = temp;
-				parent = parent->next;
-			}
-		}
-		posn = posn->next;
+	struct commit *item = (*list_p)->item;
+	
+	if (item->object.flags & this_mark) {
+		/*
+		  printf("%d already seen %s %x\n",
+		  this_mark
+		  sha1_to_hex(posn->parent->sha1),
+		  posn->parent->flags);
+		*/
+		/* do nothing; this indicates that this side
+		 * split and reformed, and we only need to
+		 * mark it once.
+		 */
+		*list_p = (*list_p)->next;
+	} else if (item->object.flags & other_mark) {
+		return item;
+	} else {
+		/*
+		  printf("%d based on %s\n",
+		  this_mark,
+		  sha1_to_hex(posn->parent->sha1));
+		*/
+		pop_most_recent_commit(list_p);
+		item->object.flags |= this_mark;
 	}
 	return NULL;
 }
 
 struct commit *common_ancestor(struct commit *rev1, struct commit *rev2)
 {
-	struct commit_list *rev1list = malloc(sizeof(struct commit_list));
-	struct commit_list *rev2list = malloc(sizeof(struct commit_list));
+	struct commit_list *rev1list = NULL;
+	struct commit_list *rev2list = NULL;
 
-	rev1list->item = rev1;
-	rev1list->next = NULL;
+	commit_list_insert(rev1, &rev1list);
+	commit_list_insert(rev2, &rev2list);
 
-	rev2list->item = rev2;
-	rev2list->next = NULL;
+	parse_commit(rev1);
+	parse_commit(rev2);
 
 	while (rev1list || rev2list) {
 		struct commit *ret;
-		ret = process_list(&rev1list, 0x1, 0x2);
-		if (ret) {
-			/* XXXX free lists */
-			return ret;
+		if (!rev1list) {
+			// process 2
+			ret = process_list(&rev2list, 0x2, 0x1);
+		} else if (!rev2list) {
+			// process 1
+			ret = process_list(&rev1list, 0x1, 0x2);
+		} else if (rev1list->item->date < rev2list->item->date) {
+			// process 2
+			ret = process_list(&rev2list, 0x2, 0x1);
+		} else {
+			// process 1
+			ret = process_list(&rev1list, 0x1, 0x2);
 		}
-		ret = process_list(&rev2list, 0x2, 0x1);
 		if (ret) {
-			/* XXXX free lists */
+			free_commit_list(rev1list);
+			free_commit_list(rev2list);
 			return ret;
 		}
 	}
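
The idea of the new merge-base can be sketched on its own: keep one date-sorted frontier per side, always expand the more recent head, and stop at the first commit already marked by the other side. The code below is a simplified illustration, not the patch itself. The reduced `struct commit`, the fixed `MAX_PARENTS`, and the helper names are all assumptions.

```c
#include <stdlib.h>

#define MAX_PARENTS 2

/* Reduced commit, for illustration only (not git's struct commit). */
struct commit {
	unsigned long date;
	unsigned flags;
	struct commit *parents[MAX_PARENTS];
};

struct node {
	struct commit *item;
	struct node *next;
};

/* Insert `c` so the list stays ordered most recent first. */
static void insert_by_date(struct node **pp, struct commit *c)
{
	struct node *n;

	while (*pp && (*pp)->item->date >= c->date)
		pp = &(*pp)->next;
	n = malloc(sizeof(*n));
	if (!n)
		abort();
	n->item = c;
	n->next = *pp;
	*pp = n;
}

/* Remove and return the head of a non-empty list. */
static struct commit *pop(struct node **list)
{
	struct node *old = *list;
	struct commit *c = old->item;

	*list = old->next;
	free(old);
	return c;
}

static struct commit *common_ancestor(struct commit *a, struct commit *b)
{
	struct node *la = NULL, *lb = NULL;
	struct commit *found = NULL;

	insert_by_date(&la, a);
	insert_by_date(&lb, b);
	while (!found && (la || lb)) {
		/* Work on whichever side has the more recent head. */
		int use_b = !la || (lb && la->item->date < lb->item->date);
		struct node **list = use_b ? &lb : &la;
		unsigned mine = use_b ? 0x2 : 0x1;
		unsigned other = use_b ? 0x1 : 0x2;
		struct commit *c = pop(list);
		int i;

		if (c->flags & other) {
			found = c;	/* reached from both sides */
		} else if (!(c->flags & mine)) {
			c->flags |= mine;
			for (i = 0; i < MAX_PARENTS; i++)
				if (c->parents[i])
					insert_by_date(list, c->parents[i]);
		}
	}
	while (la)
		pop(&la);
	while (lb)
		pop(&lb);
	return found;
}
```

When one argument is an ancestor of the other and commit dates are consistent with the history, the walk returns that argument. The flags are left set on the visited commits, so a second query would need them cleared first.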



* Re: [PATCH] Simplify building of programs
From: Jonas Fonseca @ 2005-04-24  0:23 UTC (permalink / raw)
  To: torvalds, git
In-Reply-To: <20050423235956.GA7437@diku.dk>

[ Sorry, newbie in action. The last patch was for cogito; here is the
one for git, in case it matters. ]

Do not first build .o files when building programs.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>

--- 30290c79f4914d7575c87d1c06f441d8a3bc5115/Makefile  (mode:100644 sha1:57e70239503466fb3a77f1f2618ee64377e8e04b)
+++ uncommitted/Makefile  (mode:100644)
@@ -50,7 +50,7 @@
 
 init-db: init-db.o
 
-%: %.o $(LIB_FILE)
+%: %.c $(LIB_FILE)
 	$(CC) $(CFLAGS) -o $@ $< $(LIBS)
 
 blob.o: $(LIB_H)

-- 
Jonas Fonseca


* [PATCH 5/5] Various transport programs
From: Daniel Barkalow @ 2005-04-24  0:24 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0504231953490.30848-100000@iabervon.org>

This patch adds three similar and related programs. http-pull downloads
objects from an HTTP server; rpull downloads objects by using ssh and
rpush on the other side; and rpush uploads objects by using ssh and rpull
on the other side.

The algorithm should be sufficient to make the network throughput required
depend only on how much content is new, not at all on how much content the
repository contains.

The combination should enable people to have remote repositories by way of
ssh login for authenticated users and HTTP for anonymous access.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Index: Makefile
===================================================================
--- 9b75904eab1300d83264a1840d396160482fee88/Makefile  (mode:100644 sha1:57e70239503466fb3a77f1f2618ee64377e8e04b)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/Makefile  (mode:100644 sha1:b60d8eb691f4edd56d5b310b0dd670e98c852228)
@@ -16,7 +16,7 @@
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
 	check-files ls-tree merge-base merge-cache unpack-file git-export \
-	diff-cache convert-cache
+	diff-cache convert-cache http-pull rpush rpull
 
 all: $(PROG)
 
@@ -51,7 +51,13 @@
 init-db: init-db.o
 
 %: %.o $(LIB_FILE)
-	$(CC) $(CFLAGS) -o $@ $< $(LIBS)
+	$(CC) $(CFLAGS) -o $@ $(filter %.o,$^) $(LIBS)
+
+rpush: rsh.o
+
+rpull: rsh.o
+
+http-pull: LIBS += -lcurl
 
 blob.o: $(LIB_H)
 cat-file.o: $(LIB_H)
@@ -80,6 +86,9 @@
 usage.o: $(LIB_H)
 unpack-file.o: $(LIB_H)
 write-tree.o: $(LIB_H)
+http-pull.o: $(LIB_H)
+rpull.o: $(LIB_H)
+rpush.o: $(LIB_H)
 
 clean:
 	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)
Index: http-pull.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/http-pull.c  (mode:100644 sha1:a17225719c53508a37905618c624ad8c4d0372ec)
@@ -0,0 +1,204 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int tree = 0;
+static int commits = 0;
+static int all = 0;
+
+static SHA_CTX c;
+static z_stream stream;
+
+static int local;
+static int zret;
+
+static size_t fwrite_sha1_file(void *ptr, size_t eltsize, size_t nmemb, 
+			       void *data) {
+	char expn[4096];
+	size_t size = eltsize * nmemb;
+	int posn = 0;
+	do {
+		ssize_t retval = write(local, ptr + posn, size - posn);
+		if (retval < 0)
+			return posn;
+		posn += retval;
+	} while (posn < size);
+
+	stream.avail_in = size;
+	stream.next_in = ptr;
+	do {
+		stream.next_out = expn;
+		stream.avail_out = sizeof(expn);
+		zret = inflate(&stream, Z_SYNC_FLUSH);
+		SHA1_Update(&c, expn, sizeof(expn) - stream.avail_out);
+	} while (stream.avail_in && zret == Z_OK);
+	return size;
+}
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+	char real_sha1[20];
+	char *url;
+	char *posn;
+
+	if (has_sha1_file(sha1)) {
+		return 0;
+	}
+
+	local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+
+	if (local < 0)
+		return error("Couldn't open %s\n", filename);
+
+	memset(&stream, 0, sizeof(stream));
+
+	inflateInit(&stream);
+
+	SHA1_Init(&c);
+
+	curl_easy_setopt(curl, CURLOPT_FILE, NULL);
+	curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	/*printf("Getting %s\n", hex);*/
+
+	if (curl_easy_perform(curl))
+		return error("Couldn't get %s for %s\n", url, hex);
+
+	close(local);
+	inflateEnd(&stream);
+	SHA1_Final(real_sha1, &c);
+	if (zret != Z_STREAM_END) {
+		unlink(filename);
+		return error("File %s (%s) corrupt\n", hex, url);
+	}
+	if (memcmp(sha1, real_sha1, 20)) {
+		unlink(filename);
+		return error("File %s has bad hash\n", hex);
+	}
+	
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		if (fetch(entries->item.tree->object.sha1))
+			return -1;
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1))
+		return -1;
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!all)
+			tree = 0;
+	}
+	if (commits) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has_sha1_file(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind. 
+				 */
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	char *url;
+	int arg = 1;
+	unsigned char sha1[20];
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't') {
+			tree = 1;
+		} else if (argv[arg][1] == 'c') {
+			commits = 1;
+		} else if (argv[arg][1] == 'a') {
+			all = 1;
+			tree = 1;
+			commits = 1;
+		}
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("http-pull [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	if (fetch(sha1))
+		return 1;
+	if (process_commit(sha1))
+		return 1;
+
+	curl_global_cleanup();
+	return 0;
+}
Index: rpull.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rpull.c  (mode:100644 sha1:c27af2c2464de28732b8ad1fff3ed8a0804250d6)
@@ -0,0 +1,128 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+#include "rsh.h"
+
+static int tree = 0;
+static int commits = 0;
+static int all = 0;
+
+static int fd_in;
+static int fd_out;
+
+static int fetch(unsigned char *sha1)
+{
+	if (has_sha1_file(sha1))
+		return 0;
+	write(fd_out, sha1, 20);
+	return write_sha1_from_fd(sha1, fd_in);
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		/*
+		  fprintf(stderr, "Tree %s ", sha1_to_hex(sha1));
+		  fprintf(stderr, "needs %s\n", 
+		  sha1_to_hex(entries->item.tree->object.sha1));
+		*/
+		if (fetch(entries->item.tree->object.sha1)) {
+			return error("Missing item %s",
+				     sha1_to_hex(entries->item.tree->object.sha1));
+		}
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1)) {
+		return error("Fetching %s", sha1_to_hex(sha1));
+	}
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!all)
+			tree = 0;
+	}
+	if (commits) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has_sha1_file(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind. 
+				 */
+				error("Missing parent %s; continuing",
+				      sha1_to_hex(parents->item->object.sha1));
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	char *url;
+	int arg = 1;
+	unsigned char sha1[20];
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't') {
+			tree = 1;
+		} else if (argv[arg][1] == 'c') {
+			commits = 1;
+		} else if (argv[arg][1] == 'a') {
+			all = 1;
+			tree = 1;
+			commits = 1;
+		}
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("rpull [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+
+	if (setup_connection(&fd_in, &fd_out, "rpush", url, arg, argv + 1))
+		return 1;
+
+	get_sha1_hex(commit_id, sha1);
+
+	if (fetch(sha1))
+		return 1;
+	if (process_commit(sha1))
+		return 1;
+
+	return 0;
+}
Index: rpush.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rpush.c  (mode:100644 sha1:0293a1a46311d7e20b13177143741ab9d6d0d201)
@@ -0,0 +1,69 @@
+#include "cache.h"
+#include "rsh.h"
+#include <sys/socket.h>
+#include <errno.h>
+
+void service(int fd_in, int fd_out) {
+	ssize_t size;
+	int posn;
+	char sha1[20];
+	unsigned long objsize;
+	void *buf;
+	do {
+		posn = 0;
+		do {
+			size = read(fd_in, sha1 + posn, 20 - posn);
+			if (size < 0) {
+				perror("rpush: read ");
+				return;
+			}
+			if (!size)
+				return;
+			posn += size;
+		} while (posn < 20);
+
+		/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+
+		buf = map_sha1_file(sha1, &objsize);
+		if (!buf) {
+			fprintf(stderr, "rpush: could not find %s\n", 
+				sha1_to_hex(sha1));
+			return;
+		}
+		posn = 0;
+		do {
+			size = write(fd_out, buf + posn, objsize - posn);
+			if (size <= 0) {
+				if (!size) {
+					fprintf(stderr, "rpush: write closed");
+				} else {
+					perror("rpush: write ");
+				}
+				return;
+			}
+			posn += size;
+		} while (posn < objsize);
+	} while (1);
+}
+
+int main(int argc, char **argv)
+{
+	int arg = 1;
+	char *commit_id;
+	char *url;
+	int fd_in, fd_out;
+	while (arg < argc && argv[arg][0] == '-') {
+		arg++;
+	}
+	if (argc < arg + 2) {
+		usage("rpush [-c] [-t] [-a] commit-id url");
+		return 1;
+	}
+	commit_id = argv[arg];
+	url = argv[arg + 1];
+	if (setup_connection(&fd_in, &fd_out, "rpull", url, arg, argv + 1))
+		return 1;
+
+	service(fd_in, fd_out);
+	return 0;
+}
Index: rsh.c
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rsh.c  (mode:100644 sha1:4d6a90bf6c1b290975fb2ac22f25979be56cb476)
@@ -0,0 +1,63 @@
+#include "rsh.h"
+
+#include <string.h>
+#include <sys/socket.h>
+
+#include "cache.h"
+
+#define COMMAND_SIZE 4096
+
+int setup_connection(int *fd_in, int *fd_out, char *remote_prog, 
+		     char *url, int rmt_argc, char **rmt_argv)
+{
+	char *host;
+	char *path;
+	int sv[2];
+	char command[COMMAND_SIZE];
+	char *posn;
+	int i;
+
+	if (!strcmp(url, "-")) {
+		*fd_in = 0;
+		*fd_out = 1;
+		return 0;
+	}
+
+	host = strstr(url, "//");
+	if (!host) {
+		return error("Bad URL: %s", url);
+	}
+	host += 2;
+	path = strchr(host, '/');
+	if (!path) {
+		return error("Bad URL: %s", url);
+	}
+	*(path++) = '\0';
+	/* ssh <host> 'cd /<path>; SHA1_FILE_DIRECTORY=objects <remote_prog> <arg...> <commit-id> -' */
+	snprintf(command, COMMAND_SIZE, 
+		 "cd /%s; SHA1_FILE_DIRECTORY=objects %s",
+		 path, remote_prog);
+	posn = command + strlen(command);
+	for (i = 0; i < rmt_argc; i++) {
+		*(posn++) = ' ';
+		strncpy(posn, rmt_argv[i], COMMAND_SIZE - (posn - command));
+		posn += strlen(rmt_argv[i]);
+		if (posn - command + 4 >= COMMAND_SIZE) {
+			return error("Command line too long");
+		}
+	}
+	strcpy(posn, " -");
+	if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sv)) {
+		return error("Couldn't create socket");
+	}
+	if (!fork()) {
+		close(sv[1]);
+		dup2(sv[0], 0);
+		dup2(sv[0], 1);
+		execlp("ssh", "ssh", host, command, NULL);
+	}
+	close(sv[0]);
+	*fd_in = sv[1];
+	*fd_out = sv[1];
+	return 0;
+}
Index: rsh.h
===================================================================
--- /dev/null  (tree:9b75904eab1300d83264a1840d396160482fee88)
+++ a56d8adaecc49ce7f26536f9f5d54ec813072e4f/rsh.h  (mode:100644 sha1:97e4f20b2b80662269827d77f3104025143087e7)
@@ -0,0 +1,7 @@
+#ifndef RSH_H
+#define RSH_H
+
+int setup_connection(int *fd_in, int *fd_out, char *remote_prog, 
+		     char *url, int rmt_argc, char **rmt_argv);
+
+#endif



* Re: Hash collision count
From: Jeff Garzik @ 2005-04-24  0:35 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Ray Heasman, Git Mailing List, Linus Torvalds
In-Reply-To: <20050423234637.GS13222@pasky.ji.cz>

Petr Baudis wrote:
> Dear diary, on Sun, Apr 24, 2005 at 01:20:21AM CEST, I got a letter
> where Jeff Garzik <jgarzik@pobox.com> told me that...
> 
>>Second, in your scenario, it's highly unlikely you would get 4 billion 
>>sha1 hash collisions, even if you had the disk space to store such a git 
>>database.
> 
> 
> It's highly unlikely you would get a _single_ collision.

Agreed.


>>First, the hash is NOT unique.
>>
>>Second, you lose data if you pretend it is unique.  I don't like losing 
>>data.
> 
> 
> *sigh*
> 
> We've been through this before, haven't we?

<shrug>

In messing around with archive servers, people get nervous using 
(hash,value) based storage if there isn't even a simple test for collisions.

Someone just told me that one implementation of the Venti archive 
server[1] simply fails the write, if a data item exists with a duplicate 
hash value.  As long as git fails or does something -predictable- in the 
face of the hash collision, I'm satisfied.
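
The predictable behaviour asked for here amounts to comparing contents
whenever the key already exists. A minimal sketch, assuming the store can
hand back the bytes already filed under a hash. The names are hypothetical,
and this is not code from git or Venti.

```c
#include <stddef.h>
#include <string.h>

enum write_result {
	WRITE_STORED,     /* key was free: store the new data */
	WRITE_DUPLICATE,  /* same key, same bytes: nothing to do */
	WRITE_COLLISION   /* same key, different bytes: refuse the write */
};

/*
 * Hypothetical check run before writing `incoming` under a hash key.
 * `existing` is the data already stored under that key, or NULL if the
 * key is unused.
 */
static enum write_result check_write(const void *existing, size_t existing_len,
				     const void *incoming, size_t incoming_len)
{
	if (!existing)
		return WRITE_STORED;
	if (existing_len == incoming_len &&
	    memcmp(existing, incoming, incoming_len) == 0)
		return WRITE_DUPLICATE;
	return WRITE_COLLISION;
}
```

The comparison only runs when the key is already present, so the extra cost
is paid on duplicate writes and never on new objects.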

	Jeff


[1] http://www.cs.bell-labs.com/sys/doc/venti/venti.html


* Re: git pull issues...
From: Morten Welinder @ 2005-04-24  0:39 UTC (permalink / raw)
  To: Petr Baudis; +Cc: GIT Mailing List
In-Reply-To: <20050423220049.GC13222@pasky.ji.cz>

> > 1. Multiple rsync call might connect to different servers (with
> > round-robin DNS).  The effect
> >    will be interesting.  One call, if possible, would be better.
> 
> If you can do it without overwriting HEAD, please go ahead and send me
> the patch. :-)

I'll have a go at it later, but something like this ought to work:

[ -d .rsync-git ] && die "previous pull failed -- cleanup"
mkdir .rsync-git || die "cannot create .rsync-git"
ln -s ../.git/objects .rsync-git/objects || die "cannot create symlink"
rsync ... --keep-dirlinks ...
...

Morten


* Re: Hash collision count
From: Petr Baudis @ 2005-04-24  0:40 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Ray Heasman, Git Mailing List, Linus Torvalds
In-Reply-To: <426AE9ED.4060005@pobox.com>

Dear diary, on Sun, Apr 24, 2005 at 02:35:57AM CEST, I got a letter
where Jeff Garzik <jgarzik@pobox.com> told me that...
> Someone just told me that one implementation of the Venti archive 
> server[1] simply fails the write, if a data item exists with a duplicate 
> hash value.  As long as git fails or does something -predictable- in the 
> face of the hash collision, I'm satisfied.

-DCOLLISION_CHECK

See the top of Makefile.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Hash collision count
From: Jeff Garzik @ 2005-04-24  0:43 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Ray Heasman, Git Mailing List, Linus Torvalds
In-Reply-To: <20050424004039.GU13222@pasky.ji.cz>

Petr Baudis wrote:
> -DCOLLISION_CHECK

Cool.  I am happy, then :)

Make sure that's enabled by default...

Thanks,

	Jeff




* [ANNOUNCE] git-pasky-0.7
From: Petr Baudis @ 2005-04-24  0:59 UTC (permalink / raw)
  To: git

  Hello,

  this is the last release of git-pasky, my SCMish layer over Linus' git
tree history storage tool. The next releases will be called 'cogito' and
will feature a significantly reworked user interface (finally). Get
git-pasky-0.7 at

	http://www.kernel.org/pub/software/scm/cogito

or

	ftp://ftp.kernel.org/pub/software/scm/cogito

  You can also pull, but you may not want to unless you know how to
recover from possible inconsistencies (with no local changes, read-tree
$(tree-id) && checkout-cache -f -a && update-cache --refresh should do).
The pulling/merging tools in older versions contain bugs which _might_
affect this pull.

  The biggest change is in the way the directory cache is used (this is
an internal thing, nothing user-visible except fewer bugs). Now that we have
diff-cache, git-pasky uses that instead of show-diff, and drops the
add/rm queues. This also makes the diffs coming from git diff more
consistent-looking.

  To pick randomly from the other changes - older zlib compatibility,
always use bash, git patch output changes/fixes, git log timezone fix,
plenty of bugfixes and of course merges with Linus. Thanks to all the
contributors!

  Have fun,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH] make file merging respect permissions
From: Linus Torvalds @ 2005-04-24  1:01 UTC (permalink / raw)
  To: James Bottomley; +Cc: Petr Baudis, Git Mailing List
In-Reply-To: <1114298490.5264.10.camel@mulgrave>



On Sat, 23 Apr 2005, James Bottomley wrote:
> 
> This is the actual diff

This is _still_ corrupted. 

Are you editing your diffs by hand without understanding how the diffs 
work?

The second chunk of the "git-merge-one-file" diff _still_ claims to change 
twelve lines, and that diff _still_ only changes eleven lines. My "patch" 
isn't happy, and I can count the lines in the diff myself and verify that 
it's not patch that is wrong, it's your diff.
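
The count in question can be checked mechanically. In a unified diff, a hunk
header "@@ -a,b +c,d @@" promises b lines on the old side (context plus "-"
lines) and d lines on the new side (context plus "+" lines). A minimal
sketch of that check, assuming the hunk is passed as a header string and a
newline-separated body. hunk_counts_match is a hypothetical name, and the
short header form without counts ("@@ -a +c @@") is not handled.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical checker: returns 1 if the body of a unified-diff hunk has
 * the line counts its header claims, 0 if not, and -1 if the header does
 * not parse. Context lines start with ' ', removals with '-', additions
 * with '+'. Anything else (such as "\ No newline at end of file") is
 * ignored, so context lines whose leading space was stripped in transit
 * will show up as a mismatch.
 */
static int hunk_counts_match(const char *header, const char *body)
{
	int old_start, old_count, new_start, new_count;
	int old_seen = 0, new_seen = 0;
	const char *line = body;

	if (sscanf(header, "@@ -%d,%d +%d,%d @@",
		   &old_start, &old_count, &new_start, &new_count) != 4)
		return -1;

	while (*line) {
		const char *end = strchr(line, '\n');

		switch (*line) {
		case ' ':	/* context: present on both sides */
			old_seen++;
			new_seen++;
			break;
		case '-':	/* only on the old side */
			old_seen++;
			break;
		case '+':	/* only on the new side */
			new_seen++;
			break;
		default:
			break;
		}
		if (!end)
			break;
		line = end + 1;
	}
	return old_seen == old_count && new_seen == new_count;
}
```

A hand-edited hunk that gains or loses a line without a matching header
update makes this return 0, which is the condition patch complains about.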

Please please _please_ don't edit diffs by hand if you don't know what 
you're doing. Generate the diff from a clean source instead. Or ask me to 
fix it up, I'm so used to editing diffs that I can do it in my sleep.

		Linus


* Re: Hash collision count
From: Ray Heasman @ 2005-04-24  1:01 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Git Mailing List
In-Reply-To: <426AD835.5070404@pobox.com>

On Sat, 2005-04-23 at 19:20 -0400, Jeff Garzik wrote:
> Ray Heasman wrote:
> > On Sat, 2005-04-23 at 16:27 -0400, Jeff Garzik wrote:
> > 
> >>Ideally a hash + collision-count pair would make the best key, rather 
> >>than just hash alone.
> >>
> >>A collision -will- occur eventually, and it is trivial to avoid this 
> >>problem:
> >>
> >>	$n = 0
> >>	attempt to store as $hash-$n
> >>	if $hash-$n exists (unlikely)
> >>		$n++
> >>		goto restart
> >>	key = $hash-$n
> >>
> > 
> > 
> > Great. So what have you done here? Suppose you have 32 bits of counter
> > for n. Whoopee, you just added 32 bits to your hash, using a two stage
> > algorithm. So, you have a 192 bit hash assuming you started with the 160
> > bit SHA. And, one day your 32 bit counter won't be enough. Then what?
> 
> First, there is no 32-bit limit.  git stores keys (aka hashes) as 
> strings.  As it should.

Oh great, now we have variable length id strings too. And we'll have to
pretend the OS can store infinite length file names.

> Second, in your scenario, it's highly unlikely you would get 4 billion 
> sha1 hash collisions, even if you had the disk space to store such a git 
> database.

Er. So your so-unlikely-the-sun-will-burn-out-first scenario beats my
so-unlikely-the-sun-will-burn-out-first scenario? Why am I not worried?

> > You aren't solving anything. You're just putting it off, and doing it in
> > a way that breaks all the wonderful semantics possible by just assuming
> > that the hash is unique. All of a sudden we are doing checks of data
> > that we never did before, and we have to do the check trillions of times
> > before the CPU time spent pays off.
> 
> First, the hash is NOT unique.

Nooooo. Really?

Why not just use an 8192 bit hash for each 1KiB of data? We could store a
zero length file and store all the data in the filename. Guaranteed, no
hash collisions that way.

We make an assumption that we know is right most of the time, and we use
it because we know our computer will crash from random quantum
fluctuations before we have a chance of bumping into the problem. You do
know that metastability means that every logic gate in your computer
hardware is guaranteed to fail every "x" operations, where x is defined
by process size, voltage and temperature? Sure, any failures in git
would be data dependent rather than random, but that just means we don't
get to store carefully crafted blocks invented by hypothetical
cryptographers that have completely broken SHA.

> Second, you lose data if you pretend it is unique.  I don't like losing 
> data.

You lose data either way. It's just that we get to burn out a few extra
suns before yours dies, and I can burn out whole galaxies before mine dies
by using a 256 bit hash, anyway.

> Third, a data check only occurs in the highly unlikely case that a hash 
> already exists -- a collision.  Rather than "trillions of times", more 
> like "one in a trillion chance."

Heh. I calculate it has a 50% probability of happening after you have
seen 10^24 input blocks. So, you are off by a factor of a trillion or
so.

Assuming we store 1 KiB blocks with a 160-bit hash, we would be able to
store 1000 Trillion Terabytes before the chance of hitting a collision
goes to 50%. To use marketing units, that is around 10 Trillion
Libraries of Congress. Every 2 bits we add to the hash doubles the
amount of data we can store before we hit a 50% probability of
collision.
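
These figures follow from the birthday bound: for a b-bit hash, a collision
becomes 50% likely after roughly sqrt(2 ln 2) * 2^(b/2) distinct inputs,
which is about 1.18 * 2^80, or 1.4 * 10^24, for a 160-bit hash. A minimal
sketch of the arithmetic, assuming an ideal hash with uniformly distributed
output. blocks_for_half_collision is a hypothetical name, and only even bit
widths are handled.

```c
#include <assert.h>

/* sqrt(2 * ln 2): factor in the birthday-bound approximation for p = 0.5 */
#define HALF_COLLISION_FACTOR 1.1774100225154747

/*
 * Approximate number of distinct inputs after which an ideal hash of
 * `bits` bits has a 50% chance of at least one collision:
 *     n ~= sqrt(2 ln 2) * 2^(bits / 2)
 * Only even bit widths are supported, to keep the power of two exact.
 */
static double blocks_for_half_collision(int bits)
{
	double n = HALF_COLLISION_FACTOR;
	int i;

	assert(bits >= 0 && bits % 2 == 0);
	for (i = 0; i < bits / 2; i++)
		n *= 2.0;
	return n;
}
```

Because the exponent is bits / 2, going from 160 to 162 bits multiplies the
result by exactly two, which is the "every 2 bits" rule above. At 1 KiB per
block, 10^24 blocks come to about 10^27 bytes, or 10^15 terabytes.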

I'm not sure how I could convince you that we're arguing about the
number of angels that could dance on a pin.

Ciao,
Ray

