Git development


* Re: [PATCH] checkout-cache fix
From: Petr Baudis @ 2005-05-12 19:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7voebhnwey.fsf_-_@assigned-by-dhcp.cox.net>

Dear diary, on Thu, May 12, 2005 at 02:02:45AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> Commit    cc01b05f0a3dfdf5ed114e429a7bec1ad549ab1c
> Author    Junio C Hamano <junkio@cox.net>, Wed May 11 17:00:16 2005 -0700
> Committer Junio C Hamano <junkio@cox.net>, Wed May 11 17:00:16 2005 -0700
> 
> Fix checkout-cache when existing work tree interferes with the checkout.

Thanks, applied. A nit about the commit message, though - I'd prefer you
to put this metadata stuff below the --- separator, since it really
does not belong in the log message. I've already seen something like this
in one commit merged from git-jc (IIRC some of Ingo Molnar's leak
fixes), and it was a bit of a PITA there since the first line was some
Date: header, but we tend to use the first line as the commit's caption
in some places.

Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Sean @ 2005-05-12 19:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Junio C Hamano, tglx, H. Peter Anvin, git
In-Reply-To: <7vy8akfdss.fsf@assigned-by-dhcp.cox.net>

On Thu, May 12, 2005 3:24 pm, Junio C Hamano said:

> That would not work if (1) you are using SHA1_FILE_DIRECTORY
> mechanism to share object pool for multiple trees, or (2) you
> git-*-pull'ed but did not merge for some time.  The file
> timestamps are the time of download but we want the time of

Surely you mean "GIT_OBJECT_DIRECTORY" <g> and you're right, if the local
object directory is shared amongst several trees you'd have to store the
timestamp separately.   However, as for your second case, the merge process
could set the timestamp on the file so that one really isn't a problem.  I,
for one, would like the option to use this method when it's appropriate,
although I agree you'd need a timestamp database for other situations.

> merge for this application.  Also, that approach captures only
> half the information necessary.  The other half you missed is
> "which ones are foreign commits from this tree's point of view",
> and as you described that is something you cannot tell just by
> looking at the order of parents in commit objects.

Right, but we're not talking about identifying foreign commits anymore! 
The point is just to list multiple parents in the correct "local" order. 
The timestamp information _is_ enough to identify the proper order for
local viewing.   And this has the very nice feature that it works for
branches made in the same repository, where the repoid proposal would
fail.

> S> So it seems, that rather than a repository identifier, we
> S> need each repository to record the time of each local commit.
> S> Either in a separate file or just using the object file
> S> timestamps directly.
>
> I think we are in agreement here, except that object file
> timestamps is not something you can use.

You can use it, just not in every situation.

Sean


* Re: [PATCH] Test suite
From: Petr Baudis @ 2005-05-12 19:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vu0l9nwgx.fsf_-_@assigned-by-dhcp.cox.net>

Dear diary, on Thu, May 12, 2005 at 02:01:34AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> Commit    1da683e1247046796a094c4917bc0c4591530272
> Author    Junio C Hamano <junkio@cox.net>, Wed May 11 16:59:35 2005 -0700
> Committer Junio C Hamano <junkio@cox.net>, Wed May 11 16:59:35 2005 -0700
> 
> Test suite: infrastructure and examples.
> 
> This adds the test suite infrastructure with two example tests.
> The current git-checkout-cache the example tests would fail this
> test and will be corrected in a separate patch.
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>

Admittedly, I'm not happy with this. From the design standpoint it looks
mostly fine now, but the code is rather crude. I wanted to go over it
first and fix the obvious stuff, but it turned out to be quite broken
overall, so I decided to return it to you for another iteration. I don't
mind if you just fix the broken code for now; we can fix the semantics
and design stuff later - what is in the patch is already good enough for
a start.

I'll drop the testcases from your other patches for now so that we don't
get long stalls.

> --- /dev/null
> +++ b/t/test-lib.sh
> +# For repeatability, reset the environment to known value.
> +export LANG C
> +export TZ UTC

Dunno about *your* shell but this just exports variables $LANG, $C, $TZ
and $UTC here. You probably wanted assignments there?
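
(A minimal sketch of the assignment form, assuming a plain POSIX sh. The
two-step assign-then-export spelling is my choice here, made only because
some older Bourne shells do not accept "export NAME=value"; it is not
taken from the patch.)

```shell
# Assign the values first, then mark both variables for export.
LANG=C
TZ=UTC
export LANG TZ
```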

> +# Each test should start with something like this, after copyright notices:
> +#
> +# . ./testlib.sh

test-lib.sh

> +# test_description "$@" 'Description of this test...
> +# This test checks if command xyzzy does the right thing...
> +# '

I think this usage is pretty weird. Why not just

test_description='Description, blah blah.'
. ./test-lib.sh

I think that would be much less confusing than a test_description
function, which effectively does something different anyway.

> +
> +test_description () {
> +	while case "$#" in 0) break;; esac

Duh. This looks mysterious - why not a simple test?

> +	do
> +		case "$1" in
> +		-d|--d|--de|--deb|--debu|--debug)
> +			debug=t; shift ;;
> +		-h|--h|--he|--hel|--help)
> +			eval echo '"$'$#'"'
> +			exit 0
> +			;;
> +		*)
> +			break ;;

This branch makes no sense, I think.

> +		esac
> +	done
> +	test_failure=0
> +}
> +
> +say () {
> +	echo "* $*"
> +}
> +
> +test_debug () {
> +	case "$debug" in '') ;; ?*) eval "$*" ;; esac

Again, why not a simple test?

[ "$debug" ] && eval "$*"

(Actually, eval will do the wrong thing here - it just concatenates the
arguments. Just "$@" would do, I guess.)
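
(To illustrate the difference, here is a sketch of test_debug using "$@".
It assumes callers pass a command and its arguments as separate words
rather than one string to be re-parsed. The "[ -z ... ] ||" spelling is my
own variation, so that the function returns success when debugging is off.)

```shell
debug=

test_debug () {
	# Run the given command only in debug mode.  "$@" keeps every
	# argument as its own word instead of gluing them together and
	# parsing the result again, which is what eval "$*" would do.
	[ -z "$debug" ] || "$@"
}

debug=t
# Each quoted argument arrives at printf intact, spaces included.
test_debug printf '%s\n' 'two words' 'three more words'
```

The trade-off is that a caller can no longer hand over a whole pipeline as
a single string; that would need an explicit sh -c or eval at the call site.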

> +}
> +
> +test_ok () {
> +	echo "* $*";
> +}
> +
> +test_failure () {
> +	echo "* $*";
> +	test_failure=1;
> +}
> +
> +test_expect_failure () {
> +	say "expecting failure: $1"
> +	eval "$1"
> +	case $? in
> +	0)	test_failure "did not fail as expected" ;;
> +	*) 	test_ok "failed as expected" ;;
> +	esac
> +}
> +
> +test_expect_success () {
> +	say "expecting success: $1"
> +	eval "$1"
> +	case $? in
> +	0) 	test_ok "succeeded as expected" ;;
> +	*)	test_failure "did not succeed as expected" ;;
> +	esac
> +}
> +
> +test_done () {
> +	case "$test_failure" in
> +	0)	exit 0 ;;

Please clean up after yourself in this case.

> +	'')	echo "*** test script did not start with test_description";
> +		exit 2 ;;
> +	*)	exit 1 ;;
> +	esac
> +}
> +
> +# Test the binaries we have just built.  The tests are kept in
> +# t/ subdirectory and are run in test-repo subdirectory.
> +PATH=$(pwd)/..:$PATH
> +
> +# Test repository
> +test=test-repo
> +rm -fr "$test"
> +mkdir "$test"
> +cd "$test"
> +git-init-db 2>/dev/null || error "cannot run git-init-db"

But there's no 'error' thing.
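
(Something as small as the following would probably do. This is a
hypothetical helper, not anything from the patch; the message prefix and
the exit code are arbitrary choices.)

```shell
# Hypothetical helper: report a fatal setup problem on stderr and stop
# the test script with a non-zero status.
error () {
	echo "* error: $*" >&2
	exit 1
}

# Intended use in the setup part of a script, for example:
#   git-init-db 2>/dev/null || error "cannot run git-init-db"
```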

> --- /dev/null
> +++ b/t/t1000-checkout-cache.sh
> +git-update-cache --add path0 path1/file1

You should make sure even those preparation calls actually succeed.

> --- /dev/null
> +++ b/t/t1001-checkout-cache.sh
> +git-update-cache --add path0/file0
> +tree1=$(git-write-tree)

Here too.

The testcases currently utterly fail, which is not a good sign - either
they are ahead of the current code, or they are broken. This is the main
hurdle keeping me from accepting it yet - it does not work for me. If you
fix this and the nits above, it can go in, I think.

* expecting failure: git-checkout-cache -a
checkout-cache: path0 already exists
error: checkout-cache: unable to create file path1/file1 (Not a directory)
* did not fail as expected
* expecting success: git-checkout-cache -f -a
error: checkout-cache: unable to create file path0 (Is a directory)
* succeeded as expected
* checkout failed

I consider this output... well, totally confusing. checkout-cache fails
but testcase thinks it does not, then it fails again but testcase thinks
it succeeded as expected but dies anyway.

I think it's too messy. I would much more appreciate output like this:

* checkout-cache test [1/3]: git-checkout-cache -a... passed
* checkout-cache test [2/3]: git-checkout-cache -f -a... NOT PASSED
Expected success, but the command failed. Output:
error: checkout-cache: unable to create file path0 (Is a directory)
* checkout-cache test [3/3]: git-checkout-cache foobar... NOT PASSED
Expected failure, but the command succeeded.

Much less cluttered, it is clear what went wrong etc.
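
(A rough sketch of a helper that would print this format. The names
test_expect, test_name, test_total and test_count are made up for the
illustration and are not in the patch; the command runs in a subshell so
its output can be captured, which means it cannot change the caller's
directory or variables.)

```shell
# Hypothetical bookkeeping; a real test-lib would set these up itself.
test_name='checkout-cache test'
test_total=3
test_count=0
test_failure=0

# Usage: test_expect success|failure 'command string'
test_expect () {
	expect=$1
	cmd=$2
	test_count=$((test_count + 1))
	printf '* %s [%d/%d]: %s... ' \
		"$test_name" "$test_count" "$test_total" "$cmd"
	# Capture stdout and stderr so they are shown only on a mismatch.
	if output=$(eval "$cmd" 2>&1)
	then
		actual=success
	else
		actual=failure
	fi
	if [ "$actual" = "$expect" ]
	then
		echo passed
		return 0
	fi
	echo 'NOT PASSED'
	case "$expect" in
	success)
		echo 'Expected success, but the command failed. Output:'
		printf '%s\n' "$output" ;;
	*)
		echo 'Expected failure, but the command succeeded.' ;;
	esac
	test_failure=1
}
```

A script would then call, say, test_expect success 'git-checkout-cache -f -a'
and let test_done look at $test_failure as before.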

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Junio C Hamano @ 2005-05-12 19:24 UTC (permalink / raw)
  To: Sean; +Cc: Junio C Hamano, tglx, H. Peter Anvin, git
In-Reply-To: <1234.10.10.10.24.1115921886.squirrel@linux1>

>>>>> "S" == Sean  <seanlkml@sympatico.ca> writes:

S> On Thu, May 12, 2005 1:35 pm, Junio C Hamano said:
>> If that is not needed, then you can record in an auxiliary file
>> that is local to each tree the timestamp of when merge happened
>> in that tree along with set of foreign commit objects, and teach
>> rev-tree or rev-list to read from that auxiliary file and use
>> that timestamp for foreign commit objects instead of commit time
>> recorded in them when sorting by time is needed.

S> The time is already recorded.  Ie. the commit object is a
S> separate file with a modification time which can be used as a
S> "local commit timestamp".  If you want to protect those time
S> stamps by also recording them in a separate file, that's a
S> bonus I guess but shouldn't really be needed.

That would not work if (1) you are using SHA1_FILE_DIRECTORY
mechanism to share object pool for multiple trees, or (2) you
git-*-pull'ed but did not merge for some time.  The file
timestamps are the time of download but we want the time of
merge for this application.  Also, that approach captures only
half the information necessary.  The other half you missed is
"which ones are foreign commits from this tree's point of view",
and as you described that is something you cannot tell just by
looking at the order of parents in commit objects.

S> So it seems, that rather than a repository identifier, we
S> need each repository to record the time of each local commit.
S> Either in a separate file or just using the object file
S> timestamps directly.

I think we are in agreement here, except that object file
timestamps is not something you can use.


* [PATCH] Add git-ls-files -k (take 2).
From: Junio C Hamano @ 2005-05-12 19:20 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <7vk6m4iyqa.fsf@assigned-by-dhcp.cox.net>

When checkout-cache attempts to check out a non-directory where
a directory exists on the work tree, or to check out a file
under directory D when path D is a non-directory on the work
tree, the attempt fails.  Before running checkout-cache, the
user can run git-ls-files with the -k (killed) option to get a
list of such paths.  The tagged output format uses "K" to denote
such paths.

This also includes the test for this functionality and
documentation of the new option.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** This is a reworked patch that fixes performance problem in
*** the one I sent out earlier.  Discard the earlier one and
*** apply this on top of "git-ls-files --others symlink fix".

Documentation/git-ls-files.txt |   10 +++-
ls-files.c                     |  101 ++++++++++++++++++++++++++++++++++-------
t/t0500-ls-files.sh            |   55 ++++++++++++++++++++++
3 files changed, 147 insertions(+), 19 deletions(-)

--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -10,8 +10,8 @@ git-ls-files - Information about files i
 SYNOPSIS
 --------
 'git-ls-files' [-z] [-t]
-		(--[cached|deleted|others|ignored|stage|unmerged])\*
-		(-[c|d|o|i|s|u])\*
+		(--[cached|deleted|others|ignored|stage|unmerged|killed])\*
+		(-[c|d|o|i|s|u|k])\*
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 
@@ -45,6 +45,11 @@ OPTIONS
 -u|--unmerged::
 	Show unmerged files in the output (forces --stage)
 
+-k|--killed::
+	Show files on the filesystem that need to be removed due
+	to file/directory conflicts for checkout-cache to
+	succeed.
+
 -z::
 	\0 line termination on output
 
@@ -65,6 +70,7 @@ OPTIONS
 	H	cached
 	M	unmerged
 	R	removed/deleted
+	K	to be killed
 	?	other
 
 Output
--- a/ls-files.c
+++ b/ls-files.c
@@ -16,12 +16,14 @@ static int show_others = 0;
 static int show_ignored = 0;
 static int show_stage = 0;
 static int show_unmerged = 0;
+static int show_killed = 0;
 static int line_terminator = '\n';
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
 static const char *tag_removed = "";
 static const char *tag_other = "";
+static const char *tag_killed = "";
 
 static int nr_excludes;
 static const char **excludes;
@@ -87,24 +89,30 @@ static int excluded(const char *pathname
 	return 0;
 }
 
-static const char **dir;
+struct nond_on_fs {
+	int len;
+	char name[0];
+};
+
+static struct nond_on_fs **dir;
 static int nr_dir;
 static int dir_alloc;
 
 static void add_name(const char *pathname, int len)
 {
-	char *name;
+	struct nond_on_fs *ent;
 
 	if (cache_name_pos(pathname, len) >= 0)
 		return;
 
 	if (nr_dir == dir_alloc) {
 		dir_alloc = alloc_nr(dir_alloc);
-		dir = xrealloc(dir, dir_alloc*sizeof(char *));
+		dir = xrealloc(dir, dir_alloc*sizeof(ent));
 	}
-	name = xmalloc(len + 1);
-	memcpy(name, pathname, len + 1);
-	dir[nr_dir++] = name;
+	ent = xmalloc(sizeof(*ent) + len + 1);
+	ent->len = len;
+	memcpy(ent->name, pathname, len);
+	dir[nr_dir++] = ent;
 }
 
 /*
@@ -164,11 +172,62 @@ static void read_directory(const char *p
 
 static int cmp_name(const void *p1, const void *p2)
 {
-	const char *n1 = *(const char **)p1;
-	const char *n2 = *(const char **)p2;
-	int l1 = strlen(n1), l2 = strlen(n2);
+	const struct nond_on_fs *e1 = *(const struct nond_on_fs **)p1;
+	const struct nond_on_fs *e2 = *(const struct nond_on_fs **)p2;
+
+	return cache_name_compare(e1->name, e1->len,
+				  e2->name, e2->len);
+}
 
-	return cache_name_compare(n1, l1, n2, l2);
+static void show_killed_files()
+{
+	int i;
+	for (i = 0; i < nr_dir; i++) {
+		struct nond_on_fs *ent = dir[i];
+		char *cp, *sp;
+		int pos, len, killed = 0;
+
+		for (cp = ent->name; cp - ent->name < ent->len; cp = sp + 1) {
+			sp = strchr(cp, '/');
+			if (!sp) {
+				/* If ent->name is prefix of an entry in the
+				 * cache, it will be killed.
+				 */
+				pos = cache_name_pos(ent->name, ent->len);
+				if (0 <= pos)
+					die("bug in show-killed-files");
+				pos = -pos - 1;
+				while (pos < active_nr &&
+				       ce_stage(active_cache[pos]))
+					pos++; /* skip unmerged */
+				if (active_nr <= pos)
+					break;
+				/* pos points at a name immediately after
+				 * ent->name in the cache.  Does it expect
+				 * ent->name to be a directory?
+				 */
+				len = ce_namelen(active_cache[pos]);
+				if ((ent->len < len) &&
+				    !strncmp(active_cache[pos]->name,
+					     ent->name, ent->len) &&
+				    active_cache[pos]->name[ent->len] == '/')
+					killed = 1;
+				break;
+			}
+			if (0 <= cache_name_pos(ent->name, sp - ent->name)) {
+				/* If any of the leading directories in
+				 * ent->name is registered in the cache,
+				 * ent->name will be killed.
+				 */
+				killed = 1;
+				break;
+			}
+		}
+		if (killed)
+			printf("%s%.*s%c", tag_killed,
+			       dir[i]->len, dir[i]->name,
+			       line_terminator);
+	}
 }
 
 static void show_files(void)
@@ -176,11 +235,16 @@ static void show_files(void)
 	int i;
 
 	/* For cached/deleted files we don't need to even do the readdir */
-	if (show_others) {
+	if (show_others || show_killed) {
 		read_directory(".", "", 0);
-		qsort(dir, nr_dir, sizeof(char *), cmp_name);
-		for (i = 0; i < nr_dir; i++)
-			printf("%s%s%c", tag_other, dir[i], line_terminator);
+		qsort(dir, nr_dir, sizeof(struct nond_on_fs *), cmp_name);
+		if (show_others)
+			for (i = 0; i < nr_dir; i++)
+				printf("%s%.*s%c", tag_other,
+				       dir[i]->len, dir[i]->name,
+				       line_terminator);
+		if (show_killed)
+			show_killed_files();
 	}
 	if (show_cached | show_stage) {
 		for (i = 0; i < active_nr; i++) {
@@ -219,8 +283,8 @@ static void show_files(void)
 }
 
 static const char *ls_files_usage =
-	"ls-files [-z] [-t] (--[cached|deleted|others|stage|unmerged])* "
-	"[ --ignored [--exclude=<pattern>] [--exclude-from=<file>) ]";
+"ls-files [-z] [-t] (--[cached|deleted|others|stage|unmerged|killed])* "
+"[ --ignored [--exclude=<pattern>] [--exclude-from=<file>) ]";
 
 int main(int argc, char **argv)
 {
@@ -236,6 +300,7 @@ int main(int argc, char **argv)
 			tag_unmerged = "M ";
 			tag_removed = "R ";
 			tag_other = "? ";
+			tag_killed = "K ";
 		} else if (!strcmp(arg, "-c") || !strcmp(arg, "--cached")) {
 			show_cached = 1;
 		} else if (!strcmp(arg, "-d") || !strcmp(arg, "--deleted")) {
@@ -246,6 +311,8 @@ int main(int argc, char **argv)
 			show_ignored = 1;
 		} else if (!strcmp(arg, "-s") || !strcmp(arg, "--stage")) {
 			show_stage = 1;
+		} else if (!strcmp(arg, "-k") || !strcmp(arg, "--killed")) {
+			show_killed = 1;
 		} else if (!strcmp(arg, "-u") || !strcmp(arg, "--unmerged")) {
 			/* There's no point in showing unmerged unless
 			 * you also show the stage information.
@@ -271,7 +338,7 @@ int main(int argc, char **argv)
 	}
 
 	/* With no flags, we default to showing the cached files */
-	if (!(show_stage | show_deleted | show_others | show_unmerged))
+	if (!(show_stage | show_deleted | show_others | show_unmerged | show_killed))
 		show_cached = 1;
 
 	read_cache();
Created: t/t0500-ls-files.sh (mode:100755)
--- /dev/null
+++ b/t/t0500-ls-files.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+. ./test-lib.sh
+test_description "$@" 'git-ls-files -k flag test.
+
+This test prepares the following in the cache:
+
+    path0       - a file
+    path1       - a symlink
+    path2/file2 - a file in a directory
+    path3/file3 - a file in a directory
+
+and the following on the filesystem:
+
+    path0/file0 - a file in a directory
+    path1/file1 - a file in a directory
+    path2       - a file
+    path3       - a symlink
+    path4	- a file
+    path5	- a symlink
+    path6/file6 - a file in a directory
+
+git-ls-files -k should report that existing filesystem
+objects except path4, path5 and path6/file6 to be killed.
+'
+
+date >path0
+ln -s xyzzy path1
+mkdir path2 path3
+date >path2/file2
+date >path3/file3
+git-update-cache --add -- path0 path1 path?/file?
+
+rm -fr path?
+date >path2
+ln -s frotz path3
+ln -s nitfol path5
+mkdir path0 path1 path6
+date >path0/file0
+date >path1/file1
+date >path6/file6
+
+git-ls-files -k >.output
+cat >.expected <<EOF
+path0/file0
+path1/file1
+path2
+path3
+EOF
+
+test_expect_success 'diff .output .expected'
+test_done



* Re: Adapting scripts to work in current (not top) directory
From: H. Peter Anvin @ 2005-05-12 19:15 UTC (permalink / raw)
  To: Alexey Nezhdanov; +Cc: GIT Mailing List
In-Reply-To: <200505121758.10971.snake@penza-gsm.ru>

Alexey Nezhdanov wrote:
> All git and cogito scripts want a .git subdirectory. If I'm in a subdirectory
> that has no .git directory in it, I'm out of luck.
> I have written an example script that determines the lowest possible .git
> directory position and changes to it to satisfy the user's request.
> 
> Problems with script:
> 1) Maybe I misunderstood the git ideology and it does not need this at all.
> 

Linus has explicitly said he doesn't want that.

	-hpa


* Re: [RFC] Support projects including other projects
From: Daniel Barkalow @ 2005-05-12 19:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Linus Torvalds
In-Reply-To: <7vll6kgu21.fsf@assigned-by-dhcp.cox.net>

On Thu, 12 May 2005, Junio C Hamano wrote:

> I have to think about this a bit but let me understand the
> problem first.  Let's say it is a couple of weeks ago when there
> was no cg-status.  You write cg-status, by adding -t flag to
> ls-files.c.  You commit the addition of -t flag to git-pb
> repository and note the commit id.  You then commit addition of
> cg-status to cogito repository and when you do so you want the
> party that pulls the latter commit to know it needs the former
> commit in the git-pb tree.  Is it what you are solving here?

Right; and I'm not Petr, so the place that has the -t flag in ls-files
isn't his git-pb repository, and I'm not going to remember to tell him
about two places to pull from or two heads to pull.

Probably my biggest concern here is that it has to not make anything more
difficult for Cogito hackers (or people working on similarly arranged
projects) to have the other project demarcated as separate, or they'd tend
to be lazy and the upstream core will suffer. I believe that this is why
people in practice tend not to bother making projects clean and modular
with current tools. Having it streamlined and automatic would mean that
people in the position that Petr was in when he started would do it by
default.

	-Daniel
*This .sig left intentionally blank*


* Re: [ANNOUNCE] git tracker online
From: Jan-Benedict Glaw @ 2005-05-12 19:04 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: git
In-Reply-To: <1115794878.22180.27.camel@tglx>


On Wed, 2005-05-11 07:01:18 +0000, Thomas Gleixner <tglx@linutronix.de> wrote:
> git tracker is online in a beta version:
> 
> http://www.tglx.de/gittracker

I already said I like it, here are two suggestions:

	- Browsing the Cogito repository doesn't work. Could you fix
	  that?

	- When the {repository,diff against} drop-down box is changed,
	  it would be nice to fire off a onchange="submit()" so that (if
	  your browser is wacked with JavaScript) you don't need to
	  press the submit button.

Thanks, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));



* Re: [PATCH Cogito] cg-init breaks if . contains sub-dir
From: Petr Baudis @ 2005-05-12 18:53 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: Brian Gerst, Matthias Urlichs, git
In-Reply-To: <20050510075227.GA8176@lug-owl.de>

Dear diary, on Tue, May 10, 2005 at 09:52:27AM CEST, I got a letter
where Jan-Benedict Glaw <jbglaw@lug-owl.de> told me that...
> On Tue, 2005-05-10 01:17:31 -0400, Brian Gerst <bgerst@didntduck.org> wrote:
> > But it can handle symlinks:
> > 
> > 	find * -type f -o -type l -print0 | xargs -0r cg-add
> 
> This won't work because the explicit OR (-o) has lower precedence than
> the implicit AND between "-type l" and "-print0", thus this find
> command will do print0 IFF the matched entry is a symlink. Use something
> like this instead:
> 
> 	find * \( -type f -o -type l \) -print0 | ...

Thanks to all the four co-authors, applied.
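
(For the curious, the precedence issue is easy to reproduce in a scratch
directory. The sketch below assumes mktemp -d is available; it uses -print
instead of -print0 and "." instead of "*" only to keep the output readable,
and the file names are made up.)

```shell
# Throwaway directory holding one regular file and one symlink.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
: >regular
ln -s regular link

# -o binds more loosely than the implicit AND, so -print is attached
# to "-type l" alone and only the symlink is listed.
find . -type f -o -type l -print

# With explicit grouping, -print applies to both alternatives.
find . \( -type f -o -type l \) -print
```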

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [RFC] Support projects including other projects
From: Junio C Hamano @ 2005-05-12 18:47 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Petr Baudis, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505121218280.30848-100000@iabervon.org>

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> Do you have some solution to the problem of having the
DB> porcelain layer (or the end user) find the version of git
DB> that a version of cogito needs, in some way such that if I'm
DB> working on the project and make a change to cogito and a
DB> matching change to git, Petr can get them.

I have to think about this a bit but let me understand the
problem first.  Let's say it is a couple of weeks ago when there
was no cg-status.  You write cg-status, by adding -t flag to
ls-files.c.  You commit the addition of -t flag to git-pb
repository and note the commit id.  You then commit addition of
cg-status to cogito repository and when you do so you want the
party that pulls the latter commit to know it needs the former
commit in the git-pb tree.  Is it what you are solving here?


* Re: Mercurial 0.4e vs git network pull
From: Petr Baudis @ 2005-05-12 18:23 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <20050512094406.GZ5914@waste.org>

Dear diary, on Thu, May 12, 2005 at 11:44:06AM CEST, I got a letter
where Matt Mackall <mpm@selenic.com> told me that...
> Mercurial is more than 10 times as bandwidth efficient and
> considerably more I/O efficient. On the server side, rsync uses about
> twice as much CPU time as the Mercurial server and has about 10 times
> the I/O and pagecache footprint as well.
> 
> Mercurial is also much smarter than rsync at determining what
> outstanding changesets exist. Here's an empty pull as a demonstration:
> 
>  $ time hg merge hg://selenic.com/linux-hg/
>  retrieving changegroup
> 
>  real    0m0.363s
>  user    0m0.083s
>  sys     0m0.007s
> 
> That's a single http request and a one line response.

So, what about comparing it with something comparable, say git pull over
HTTP? :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Sean @ 2005-05-12 18:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: tglx, H. Peter Anvin, git
In-Reply-To: <7vvf5ogxdu.fsf@assigned-by-dhcp.cox.net>

On Thu, May 12, 2005 1:35 pm, Junio C Hamano said:

> If that is not needed, then you can record in an auxiliary file
> that is local to each tree the timestamp of when merge happened
> in that tree along with set of foreign commit objects, and teach
> rev-tree or rev-list to read from that auxiliary file and use
> that timestamp for foreign commit objects instead of commit time
> recorded in them when sorting by time is needed.

The time is already recorded.  I.e. the commit object is a separate file
with a modification time which can be used as a "local commit timestamp".
If you want to protect those time stamps by also recording them in a
separate file, that's a bonus I guess but shouldn't really be needed.

You can descend the history tree based on the parent position as described
by Jon Seymour.  That is, Cogito lists the "local" parent first, so you
descend that branch marking off visited nodes, then descend the other
branches reporting unvisited nodes only.  Afterward return and list any
unreported nodes in the first branch.

Of course, the problem with that is a fast forward node, where you can't
just blindly pick the first parent listed because it may belong to another
repository.   So the answer is to do away with fast forward nodes, or give
up on using the ordering of the parents to mean anything.   In which case
you have to pick the parent with the oldest local commit time as the first
node to descend.

So it seems, that rather than a repository identifier, we need each
repository to record the time of each local commit.   Either in a separate
file or just using the object file timestamps directly.

Sean


* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Junio C Hamano @ 2005-05-12 17:35 UTC (permalink / raw)
  To: tglx; +Cc: H. Peter Anvin, git
In-Reply-To: <1115884637.22180.277.camel@tglx>

>>>>> "TG" == Thomas Gleixner <tglx@linutronix.de> writes:

TG> Rn   o
TG>      | \
TG> Rn-1 o  |
TG>      |  o Mn
TG>      |  o Mn-1
TG> Rn-2 o /
TG> Rn-3 o

TG> The correct display looking at R is

TG> Rn
TG>  Mn
TG>  Mn-1
TG> Rn-1
TG> Rn-2
TG> Rn-3

TG> Looking from M it is

TG> Rn
TG>  Rn-1
TG>  Rn-2
TG> Mn
TG> Mn-1
TG> Rn-3

Thanks for a very clear explanation.  The situation is
intriguing in that both R and M after converging end up with
exactly the same HEAD with the same set of objects but still
would want to see history leading to the HEAD differently.

I wonder what happens to a third person S, who pulls from both R
and M.  What does S see?  

Does the commit order observed by S depend on which one S pulls
from first?  That is, if S pulls from R then at that point Mn-1
and Mn come after Rn-1 in S's history?  And after that what
happens if S pulls from M (which is obviously a no-op except that
it would update .git/refs/heads/M)?  Does the history for S
change?

IIRC, Cogito lets you "track" upstream branches.  When S starts
tracking R, does it see R's history and when S starts tracking M
its history view changes to that of M?

Let's further say R and M are both based on another upstream L,
and R and M have converged at this point.  S has been tracking L
and it merged from R and M.  If S did not have any local
modifications since L, then that is just two fast forward
merges.  What does the history look like to S?  Which comes
first---Mn or Rn-1?

The answer to the above could be "the merge order history is per
tree and not something to be exported or given away to other
trees", in which case it may make sense from S's point of view
that Mn and Rn-1 are compared solely based on their commit
timestamps.  You will get consistent history and switching which
tree is being tracked would not change the history.  Is the goal
here to give the merge order history from R and M to S?

If that is not needed, then you can record in an auxiliary file
that is local to each tree the timestamp of when merge happened
in that tree along with set of foreign commit objects, and teach
rev-tree or rev-list to read from that auxiliary file and use
that timestamp for foreign commit objects instead of commit time
recorded in them when sorting by time is needed.


* Re: [RFC] Support projects including other projects
From: David Lang @ 2005-05-12 17:24 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, git, Petr Baudis, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505121218280.30848-100000@iabervon.org>

I was thinking about this recently while reading an article on bittorrent
and how it works and it occurred to me that perhaps the network access
model of git should be reexamined.

git produces a large pool of objects, there are two ways that people want 
to access these objects.

1. pull the current version of a project (either a straight 'checkout'
type pull or a 'merge' to a local project)

2. pull the objects necessary for past versions of a project (either all
the way back to the beginning of time or back to some point, that point
being one of a number of possibilities (date, version, things you don't
have, etc))

in either case the key thing is the indexes related to a
particular project, the objects themselves could all be in one huge pool
for all projects that ever existed (this doesn't make sense if you use
rsync to copy repositories as Linux originally did, but if you have a
more git-aware transport it can make sense)

I believe that there are going to be quite a number of cases where the
same object is used for multiple projects (either because the project is a
fork of another project or because some functions (or include files) are
so trivial that they are basically boilerplate and get reused or recreated).
if you think about a major mirror server distributing a dozen linux
distros via git you will realize that in many cases the source files,
scripts, and often even the binaries are really going to be
identical objects for all the distros so an ftp/http server that used a git
filesystem could result in a pretty significant savings in disk space.

In addition, when you are doing a pull you can accept data from 
non-authoritative sources since each object (and its index info) includes 
enough info to validate that the object hasn't been tampered with (at least 
until such time as the hashes are sufficiently broken, but that's another 
debate, and we had that one :-). So a bittorrent-like peer sharing system 
to fetch objects identified by the index files would open the potential 
for saving significant bandwidth on the master servers while not 
compromising the trees at all.
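
To illustrate why an untrusted peer is acceptable, here is a rough Python
sketch of the check a receiver could run. The function names are made up
for this example, and the "type size NUL" header in front of the payload
is my understanding of how git derives object names; treat both as
assumptions rather than a description of any existing transport.

```python
import hashlib


def object_name(kind, payload):
    """Return the hex SHA-1 name for an object of the given kind.

    Assumes the name is the SHA-1 of b"<kind> <size>\\0" + payload.
    """
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def accept_from_peer(expected_name, kind, payload):
    """Accept data from any peer only if it hashes to the name we asked for."""
    return object_name(kind, payload) == expected_name
```

Since the expected name comes from an index or commit we already trust,
a peer that alters even one byte produces a mismatch and the object is
simply dropped and fetched from somewhere else.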

Going back (somewhat) to the subject at hand, with something like this you 
should be able to combine as many projects as you want in one repository, 
and the only issue would be the work necessary to go through that 
repository and all the index files that point at it when you want to prune 
old data out of the object pool to save disk space.

thoughts? unfortunately I don't have the time to even consider coding 
something like this up, but hopefully it will spark interest for someone 
who does.

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare

^ permalink raw reply

* Re: [PATCH] improved delta support for git
From: Junio C Hamano @ 2005-05-12 17:16 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: jon, Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0505121110490.5426@localhost.localdomain>

>>>>> "NP" == Nicolas Pitre <nico@cam.org> writes:

>> On 5/13/05, Chris Mason <mason@suse.com> wrote:
>> > On Thursday 12 May 2005 00:36, Junio C Hamano wrote:
>> > > It appears to me that changes to the make_sure_we_have_it() ...
>> >
>> > If we fetch the named object and it is a delta, the delta will either depend
>> > on an object we already have or an object that we don't have.  If we don't
>> > have it, the pull should find it while pulling other commits we don't have.

NP> 1) If you happen to already have the referenced object in your local 
NP>    repository then you're done.

Yes.

NP> 2) If not you pull the referenced object from the remote repository, 
NP>    repeat with #1 if it happens to be another delta object.

Yes, that is the outline of what my (untested) patch does.

Unless I am grossly mistaken, what Chris says is true only when
we are pulling with -a flag to the git-*-pull family.  If we are
pulling "partially near the tip", we do not necessarily pull
"other commits we don't have", hence detecting delta's
requirement at per-object level and pulling the dependent
becomes necessary, which is essentially what you wrote in (2)
above.
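
A small Python sketch of steps (1) and (2), only to make the loop
explicit. This is not the patch itself: the local and remote repositories
are modelled as plain dicts, and each object is a dict whose "base" entry
names the object a delta depends on (None for a non-delta object). Those
names and that layout are invented for the example.

```python
def pull_with_delta_bases(name, local, remote):
    """Fetch `name` and any delta bases that are missing locally.

    local, remote: dict of object name -> {"base": name or None}.
    Returns the list of object names fetched, in fetch order.
    """
    fetched = []
    # Follow the chain of bases until we reach an object we already have
    # (case 1) or an object that is not a delta at all.
    while name is not None and name not in local:
        obj = remote[name]          # case 2: pull the referenced object
        local[name] = obj
        fetched.append(name)
        name = obj.get("base")      # repeat if it is another delta
    return fetched
```

The check is per object, so it does not depend on the pull also walking
"other commits we don't have".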

^ permalink raw reply

* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Jon Seymour @ 2005-05-12 17:12 UTC (permalink / raw)
  To: Git Mailing List
In-Reply-To: <2cfc403205051210093e1a396d@mail.gmail.com>

| added "local" to clarify what I meant the first-pass should do

> My previous algorithm was incorrect, but I suspect it could probably
> be fixed with a 2-pass algorithm that marked any nodes in the "local" path
> between the merge base and the merge head as local and then ensured
> that nodes marked that way are sorted after any nodes reached via
> "foreign" paths.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

^ permalink raw reply

* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Jon Seymour @ 2005-05-12 17:09 UTC (permalink / raw)
  To: Git Mailing List
In-Reply-To: <20050512162023.GA14010@delft.aura.cs.cmu.edu>

On 5/13/05, Jan Harkes <jaharkes@cs.cmu.edu> wrote:
> >
> > Ln
> > |     \
> > Ln-1  Fn
> > |         |
> > Ln-2  Fn-1
> > |       /
> > Ln-3
> 
> It breaks when Fn was a pull from Ln-1, and Ln was a fast-forward to Fn.
> Now the first parent is going to be Fn-1 and the history of the local
> repository after the fast forward warps to
> 
>     Fn (== Ln)
>     Ln-1
>     Ln-2
>     Fn-1
>     Ln-3
> 

Yep, you are right.

> Which I believe is exactly what Thomas wants to see in this case. I
> don't see how repoid's can be useful for this. It is a porcelain thing
> where you need to track what you have seen before. Anything else doesn't
> matter because most permutations of the history are perfectly valid
> since the Fn and Ln changes in reality occurred in parallel and as a
> result can be arbitrarily interleaved.
> 

I may be wrong, but I don't think Thomas is interested in his own
repository. I think he is interested in the history of commits found
in any public repository. Therefore, he needs an algorithm that
doesn't rely on locally cached information.

In other words, at each point in the commit graph, what did the
committer consider as "foreign" changes that needed to be merged into
the "local" repository to progress the repository forward. He wants to
derive that order only from the information in the repository itself -
everyone given the same commit graph should reach the same conclusion
as to what the committer saw as local and foreign at the time of the
commit.

My previous algorithm was incorrect, but I suspect it could probably
be fixed with a 2-pass algorithm that marked any nodes in the path
between the merge base and the merge head as local and then ensured
that nodes marked that way are sorted after any nodes reached via
"foreign" paths.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

^ permalink raw reply

* Re: [RFC] Support projects including other projects
From: Daniel Barkalow @ 2005-05-12 16:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Linus Torvalds
In-Reply-To: <7v8y2lj6u9.fsf@assigned-by-dhcp.cox.net>

On Wed, 11 May 2005, Junio C Hamano wrote:

> >>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> DB> My reasons for having it in the core are as follows:
> 
> DB>  - All of the porcelain layers have to, at least, agree as
> DB>  to how this is represented in order for repositories to be
> DB>  portable; since the representation is common, it might as
> DB>  well be core.
> 
> That is weak.  .git/refs/heads/master is not core, but something
> Porcelain need to agree on [*1*].

I think it is a defect of the current core that it fails to completely
specify a portable repository format. Obviously, it is not necessary to
have things in the core for this reason, but it's also not necessary to
have anything at all in the core. We could eliminate commits entirely in
favor of putting the information in special files in trees, and it would
still be as complete as it is, although it would also be unmaintainable.

> DB>  - There are currently no special files which are tracked for cogito (et 
> DB>    al) to put the information in.
> 
> I am somewhat sympathetic to this, but then there are probably
> a lot of other things that are more relevant than this "required
> version" thing.  One thing that immediately comes to mind is the
> dontdiff list.

The dontdiff list isn't expected to change with every commit, however.

> Also, if you consider Cogito and GIT independent projects as you said,
> you would probably need to have "require {project-name} {commit-id}",
> not "include {commit-id}".

I *don't* consider Cogito and GIT to be independent projects. GIT is
independent of Cogito, but Cogito includes GIT as part of it.

If you don't like the structure of Cogito, I have a set of projects at
work, where I have a bunch of microcontroller programs and a library of
common code. Traditionally, there are two possible arrangements: either
they are all separate projects, in which case the user has to figure out
what versions match, or they are the same project, in which case everybody
has to get everything. What I would like is to have the library consider
itself a separate project, but each program consider itself, in some
sense, the same project as the library (but not as other programs).

> Things start smelling much more like the traditional package version 
> matching issue which is outside of SCM (let alone core GIT).

Once the core portion matures to the point where it gets used without
program-specific patches, it can be done outside of SCM. But it doesn't
make sense to have an SCM require that the projects are really mature in
order to work well, since active development is supposed to be what an SCM
is for.

> DB>  - Ideally, the dependency would only be per-commit, not
> DB>  per-tree; if Petr releases a new cogito which only merges a
> DB>  new mainline with the git-pb, the cogito tree object should
> DB>  be the same (since the cogito content didn't change). This
> DB>  means that it can't be anywhere other than the commit.
> 
> As I already said, I consider the current "overlayed" directory
> structure broken and not worth considering the toolset support

You missed my point here entirely. I think that the cogito tree including
any non-source files in it (if there are such) should be the same. So the
dependency can't be tracked in the tree.

> DB>  - If the solution to the issue of finding the necessary
> DB>  git-pb is to store it with cogito, then the programs that
> DB>  pull from this repository need to know that they need to
> DB>  pull the git-pb portion, and fsck-cache needs to know that
> DB>  the cogito references the git-pb.
> 
> I do not think this is necessary for the same reason as I
> dismissed the third point above.

Do you have some solution to the problem of having the porcelain
layer (or the end user) find the version of git that a version of cogito
needs, in some way such that if I'm working on the project and make a
change to cogito and a matching change to git, Petr can get them?

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: [PATCH] cg-init should only process files
From: David Greaves @ 2005-05-12 16:45 UTC (permalink / raw)
  To: Morten Welinder; +Cc: Petr Baudis, GIT Mailing Lists
In-Reply-To: <118833cc050512093750db7d55@mail.gmail.com>

Morten Welinder wrote:

>>-       find * | xargs cg-add
>>+       find * -type f | xargs cg-add
>>    
>>
>
>I think we went through that a day or two ago:
>
Doh!
I even commented.

Sorry for the noise.

David


^ permalink raw reply

* Re: [PATCH] cg-init should only process files
From: Morten Welinder @ 2005-05-12 16:37 UTC (permalink / raw)
  To: David Greaves; +Cc: Petr Baudis, GIT Mailing Lists
In-Reply-To: <E1DWG9b-0001VV-V0@ash.dgreaves.com>

> -       find * | xargs cg-add
> +       find * -type f | xargs cg-add

I think we went through that a day or two ago:  Problems:

1. You forgot symlinks.
2. It does not work if nothing matches "*".  (Nor does the original.)
3. Either it should not match dotfiles in subdirectories, or else it
   should match them in the top directory too.
4. It needs some -print0 and "--" protection.
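
For illustration, the same selection written as a rough Python sketch
rather than a shell pipeline (this is not cg-init code, and files_to_add
is a made-up name). It keeps regular files and symlinks, skips dotfiles
and dot-directories at every level, returns an empty list when nothing
matches, and hands back a list of names so whitespace in paths never gets
split.

```python
import os


def files_to_add(top="."):
    """Return relative paths of regular files and symlinks under `top`."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Treat dotfiles the same way at every level: skip them.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        # os.walk lists symlinks to directories in dirnames without
        # descending into them, so pick those up as entries to add.
        candidates = [d for d in dirnames
                      if os.path.islink(os.path.join(dirpath, d))]
        candidates += [f for f in filenames if not f.startswith(".")]
        for name in sorted(candidates):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or os.path.isfile(path):
                found.append(os.path.relpath(path, top))
    return found
```

A caller would pass the result as separate arguments after a "--" so that
a file name starting with "-" is not taken for an option.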

Morten

^ permalink raw reply

* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Jan Harkes @ 2005-05-12 16:20 UTC (permalink / raw)
  To: jon; +Cc: Git Mailing List
In-Reply-To: <2cfc403205051208506249c9aa@mail.gmail.com>

On Fri, May 13, 2005 at 01:50:50AM +1000, Jon Seymour wrote:
> On 5/12/05, Jan Harkes <jaharkes@cs.cmu.edu> wrote:
> > On Thu, May 12, 2005 at 01:43:50PM +0200, Thomas Gleixner wrote:
> > ....
> > Your examples break if you consider additional merges where M syncs up a
> > couple of times (f.i. at Rn-2) before M is merged back into R.
...
> If committers always follow the convention that their previous local
> commit is nominated as the first (local) parent in the commit and
> commits from foreign repositories are listed after the first parent,
> can the chain of "local" parents be an effective proxy for repoid?
> 
> Consider first a graph where there are no more than 2 parents in a merge
> 
> Ln
> |     \
> Ln-1  Fn
> |         |
> Ln-2  Fn-1
> |       /
> Ln-3

It breaks when Fn was a pull from Ln-1, and Ln was a fast-forward to Fn.
Now the first parent is going to be Fn-1 and the history of the local
repository after the fast forward warps to

    Fn (== Ln)
    Ln-1
    Ln-2
    Fn-1
    Ln-3

And adding repoids doesn't help a bit. However if the local repo kept a
history of what the user has seen previously, it can be linearized
consistently. The history file would contain Ln-3...Ln-1 before the
fast-forward and would add Fn-1,Fn. We would end up with a history that
looks like,

    Fn (== Ln)
    Fn-1
    Ln-1
    Ln-2
    Ln-3

Which I believe is exactly what Thomas wants to see in this case. I
don't see how repoid's can be useful for this. It is a porcelain thing
where you need to track what you have seen before. Anything else doesn't
matter because most permutations of the history are perfectly valid
since the Fn and Ln changes in reality occurred in parallel and as a
result can be arbitrarily interleaved.
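
As a sketch of what such a porcelain-side history file could do (in
Python, with made-up names, and assuming the history is kept as a simple
list, oldest first): after every pull or merge, append whatever commits
are reachable from the new head but have not been seen before, parents
before children.

```python
def record_new_commits(seen, parents, head):
    """Append commits reachable from `head` that are not yet in `seen`.

    seen:    list of commit ids already shown to the user, oldest first.
    parents: dict of commit id -> list of parent ids.
    """
    known = set(seen)

    def visit(commit):
        if commit in known:
            return
        known.add(commit)
        for parent in parents.get(commit, []):
            visit(parent)
        seen.append(commit)   # parents are recorded before the child

    visit(head)
    return seen


# The fast-forward example: Ln-3..Ln-1 were seen before the pull.
parents = {"Ln-3": [], "Ln-2": ["Ln-3"], "Ln-1": ["Ln-2"],
           "Fn-1": ["Ln-3"], "Fn": ["Fn-1", "Ln-1"]}
history = record_new_commits([], parents, "Ln-1")
history = record_new_commits(history, parents, "Fn")
newest_first = list(reversed(history))
```

Here newest_first comes out as Fn, Fn-1, Ln-1, Ln-2, Ln-3, independent
of the order in which Fn lists its parents.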

In fact anyone else who branched at Ln-3 and merges again at Ln doesn't
really care in what order changes in the F and L branches occurred, only
that all modifications are included.

Jan

^ permalink raw reply

* [PATCH] cg-init should only process files
From: David Greaves @ 2005-05-12 16:03 UTC (permalink / raw)
  To: Petr Baudis; +Cc: GIT Mailing Lists

cg-init tries to add directories

Signed-off-by: David Greaves <david@dgreaves.com>

---
commit c6ecba40932efa0b28cd15d00fdab3b2607ec069
tree 39f7bebbadf6ebae67367b629d8cec298f7dcc90
parent f7d4b2adfc6a29036e2a8abe5b742e57b64e50d7
author David Greaves <david@dgreaves.com> Thu, 12 May 2005 13:05:03 +0100
committer David Greaves <david@ash.(none)> Thu, 12 May 2005 13:05:03 +0100

 cg-init |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: cg-init
===================================================================
--- 85d8d081e2012da8dd1af35b62ae82f79f89ebd0/cg-init  (mode:100755)
+++ 39f7bebbadf6ebae67367b629d8cec298f7dcc90/cg-init  (mode:100755)
@@ -31,7 +31,7 @@
 	echo "Cloned (origin $uri available as branch \"origin\")"
 else
 	git-read-tree # Seed the dircache
-	find * | xargs cg-add
+	find * -type f | xargs cg-add
 	cg-commit -C -m"Initial commit" -e
 fi
 exit 0

^ permalink raw reply

* Overwriting files in Makefile
From: Horst von Brand @ 2005-05-12 15:56 UTC (permalink / raw)
  To: git

The current setup disturbs me. The Makefile copies the scripts into the
destination, and then edits them in place. Why not just generate them from
.in files before installing, i.e. by a rule something like

%: %.in
   sed -e 's;@LIBDIR@;$(sedlibdir);g' $^ > $@

with '@LIBDIR@' in the .in file wherever the substitution should take
place. This way you also avoid the possible loss of the permission bits
when fooling around (any SUID/SGID would get lost; not that it matters
here).

In any case, the '\/'s (LTS, "Leaning Toothpick Syndrome") when futzing
around with file paths can be avoided by using something other than '/' as
delimiter for sed(1)'s substitute command, like ';' here.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply

* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Jon Seymour @ 2005-05-12 15:50 UTC (permalink / raw)
  To: Git Mailing List
In-Reply-To: <2cfc403205051208483132921@mail.gmail.com>

|| oops - fix to algorithm, sorry guys
| small clarification to algorithm, removed editing work area

On 5/12/05, Jan Harkes <jaharkes@cs.cmu.edu> wrote:
> On Thu, May 12, 2005 at 01:43:50PM +0200, Thomas Gleixner wrote:
> ....
> Your examples break if you consider additional merges where M syncs up a
> couple of times (f.i. at Rn-2) before M is merged back into R.
>
> What you seem to want won't be fixed by adding a repoid, you need to
> keep a list of all the commits you have already seen and append any new
> ones whenever you look at the history. If you look whenever you pull or
> merge the list will be in the total ordering that you seem to expect for
> your repository. But that is a porcelain thing.
>
> Jan

If committers always follow the convention that their previous local
commit is nominated as the first (local) parent in the commit and
commits from foreign repositories are listed after the first parent,
can the chain of "local" parents be an effective proxy for repoid?

Consider first a graph where there are no more than 2 parents in a merge

Ln
|     \
Ln-1  Fn
|         |
Ln-2  Fn-1
|       /
Ln-3

Thomas would like to sort this as:

Ln
Fn
Fn-1
Ln-1
Ln-2
Ln-3

So, use this algorithm:

1. Merge result comes first.
2. For each foreign parent:
    - sort the graph between the foreign parent and the merge base
      (not including merge base) according to this algorithm. Append the
      result into the list.
3. Sort the graph between the local parent and the merge base
   (including merge base) according to this algorithm. Append the result
   into the list.
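
The three steps can be sketched in Python roughly as follows. This is one
reading of the algorithm, not tested against real repositories: "the
graph between a foreign parent and the merge base" is approximated as
everything reachable from the foreign parent that is not reachable from
the local (first) parent, and the function names are invented for the
example.

```python
def ancestors(parents, start):
    """Return the set of commits reachable from `start`, including it."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def local_first_order(parents, head):
    """Order commits: merge result, then foreign sides, then local side.

    parents: dict of commit id -> list of parents, local parent first.
    """
    order, emitted = [], set()

    def visit(commit, stop):
        if commit in emitted or commit in stop:
            return
        emitted.add(commit)
        order.append(commit)                 # step 1: the result itself
        ps = parents.get(commit, [])
        if not ps:
            return
        local, foreign = ps[0], ps[1:]
        local_side = ancestors(parents, local)
        for f in foreign:                    # step 2: foreign parents,
            visit(f, stop | local_side)      # stopping at the merge base
        visit(local, stop)                   # step 3: local parent chain

    visit(head, frozenset())
    return order
```

For the two-parent graph above (Ln with parents Ln-1 and Fn), this yields
Ln, Fn, Fn-1, Ln-1, Ln-2, Ln-3. The recursion is fine for small examples
but would need to be made iterative for long histories.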

Admittedly the order for foreign parent for N-way merges is somewhat
arbitrary but a committer could probably make a choice that "works" in
most cases by specifying the foreign parents in a "sensible" order.

Of course, this relies on a committer always nominating the local
parent first, but that wouldn't be hard to enforce in the porcelain
layer.

jon.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

^ permalink raw reply

* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Jon Seymour @ 2005-05-12 15:48 UTC (permalink / raw)
  To: Git Mailing List
In-Reply-To: <2cfc4032050512084426ea3d4d@mail.gmail.com>

| small clarification to algorithm, removed editing work area 

On 5/12/05, Jan Harkes <jaharkes@cs.cmu.edu> wrote:
> On Thu, May 12, 2005 at 01:43:50PM +0200, Thomas Gleixner wrote:
> ....
> Your examples break if you consider additional merges where M syncs up a
> couple of times (f.i. at Rn-2) before M is merged back into R.
>
> What you seem to want won't be fixed by adding a repoid, you need to
> keep a list of all the commits you have already seen and append any new
> ones whenever you look at the history. If you look whenever you pull or
> merge the list will be in the total ordering that you seem to expect for
> your repository. But that is a porcelain thing.
>
> Jan

If committers always follow the convention that their previous local
commit is nominated as the first (local) parent in the commit and
commits from foreign repositories are listed after the first parent,
can the chain of "local" parents be an effective proxy for repoid?

Consider first a graph where there are no more than 2 parents in a merge

Ln
|     \
Ln-1  Fn
|         |
Ln-2  Fn-1
|       /
Ln-3

Thomas would like to sort this as:

Ln
Fn
Fn-1
Ln-1
Ln-2
Ln-3

So, use this algorithm:

1. Merge result comes first.
2. For each foreign parent:
    - sort the graph between the foreign parent and the merge base
      (not including merge base) according to this algorithm, using the
      foreign parent as the starting point of the algorithm. Append the
      result into the list.
3. Append the merge base to the list.

Admittedly the order for foreign parent for N-way merges is somewhat
arbitrary but a committer could probably make a choice that "works" in
most cases by specifying the foreign parents in a "sensible" order.

Of course, this relies on a committer always nominating the local
parent first, but that wouldn't be hard to enforce in the porcelain
layer.

jon.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

^ permalink raw reply
