Git development


* Re: [PATCH 3/3] daemon: Support a --user-path option.
From: Mark Wooding @ 2006-02-04 10:02 UTC
  To: git
In-Reply-To: <7vr76kcggx.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:

> I am probably slow as usual but I do not see how this is useful.

I don't want the git-daemon roaming all over the file system.  Partly,
as a systems administrator, it makes me nervous about security (not for
any particularly good reason, I admit), but mainly because I don't want
to be exposing my local filesystem structure in my git://... namespace
-- it just seems like a bad idea.  This is what --base-path is all about.

I do still want users to be able to publish their repositories.  But I
also don't want git-daemon wandering all over their home directories --
restriction to sensible places is what --base-path is for, after all.

> Wouldn't loosening the "request must be absolute if you use
> --base-path" check in the area your first patch in the series
> touches to also allow paths that start with a '~' be enough?
> That way ~alice/foo would remain to be /home/alice/foo (with
> /home/alice being alice's $HOME) and ~becky/bar would be
> /home2/becky/bar (with /home2/becky being becky's $HOME).

That would still expose the structure of everyone's home directories in
git://~user URLs, which is rather unfortunate.  It's better than
nothing, though.

> I suppose you are doing something similar to ~/public_html, but
> I think that is an independent feature.

This is what I'm after, yes.  The above can be achieved
straightforwardly with --user-path=. if that's what you actually wanted.
(Indeed, --user-path= works too, but this is harder to explain.)

I think I'd probably either run with --user-path=public-git or
--user-path=public_html/git -- I've not made my mind up.
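
To make the mapping concrete, here is a minimal sketch of how a "~user/repo"
request might be turned into an on-disk path under --user-path.  It is not
code from the patch: map_user_request() and its parameters are hypothetical
names, the home directory is passed in rather than looked up (a real daemon
would use something like getpwnam()), and the blanket rejection of ".." is an
assumption about what a cautious implementation would do.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from daemon.c.
 *
 *   home      - the user's home directory, e.g. "/home/alice"
 *   user_path - the value given to --user-path, e.g. "public-git"
 *               (an empty string means "directly under $HOME")
 *   repo      - the part of the request after "~user/"
 *
 * Writes the resulting path into out.  Returns 0 on success, or -1 if
 * the request is refused or the result does not fit in outlen bytes.
 */
int map_user_request(const char *home, const char *user_path,
		     const char *repo, char *out, size_t outlen)
{
	int n;

	/* Conservatively refuse anything containing "..", so a request
	 * cannot climb out of the published directory. */
	if (strstr(repo, ".."))
		return -1;

	if (user_path && *user_path)
		n = snprintf(out, outlen, "%s/%s/%s", home, user_path, repo);
	else
		n = snprintf(out, outlen, "%s/%s", home, repo);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With user_path set to "public-git", a request for ~alice/foo would resolve to
/home/alice/public-git/foo, so nothing else in alice's home directory shows
up in the git:// namespace.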

-- [mdw]


* Re: Two ideas for improving git's user interface
From: Alan Chandler @ 2006-02-04  9:30 UTC
  To: git; +Cc: Junio C Hamano
In-Reply-To: <7virrv4jk1.fsf@assigned-by-dhcp.cox.net>

On Saturday 04 February 2006 08:25, Junio C Hamano wrote:
> Alan Chandler <alan@chandlerfamily.org.uk> writes:
> > Wow - light comes on.
>
> That's good.
>
> > -this tutorial the first time, I'd suggest to skip to "Publishing
> > -your work" section and come back here later.
> > +this tutorial the first time, I'd suggest to skip to "Resolving Merge
> > +Problems" section and come back here later.
>
> The changes before this look very good to me, but these two
> lines do not make any sense. If you are going to talk about
> "Resolving Merge Problems", you _need_ to know about index, so
> you cannot skip the material.

Maybe - since the light has just come on, I need to understand a lot more 
about this area before I can really comment.  The tutorial invited one to 
skip it before, so I was just doing so again.

Even today when I tried to read this section again my eyes glazed over.  The 
long sha outputs from git-ls-files scream off the page "don't bother, this is 
detailed technical stuff" :-(




>
> I think having a section on manual merge resolution between the
> Index File section and Publishing section makes sense.  What
> kind of merges did you have trouble figuring out when you were
> still git novice?  That would be a good starting point.
>

I STILL come out in a cold sweat (actually that is a bit over the top:-) ) as 
soon as a merge fails for whatever reason.  The problem is that I am not 
doing development full time, nor in a team, so I probably hit one about once 
every 2 months.  This means that I don't remember what to do, and need to go 
and look it up.  But where - there is nothing in my main reference places 
(Everyday Git - or before that the tutorial). 

So I normally attempt to do what I think is sensible: manually search for 
files that haven't merged, and edit the lines with the
    <<<<<<<
    =======
    >>>>>>>
markers in them until I think the resultant file is what it should be, and then 
try to commit again (probably cg-commit rather than git commit).  But what 
happens next is hit or miss - sometimes it just works, sometimes it 
doesn't, and I am in that place where there was a long thread a couple of months 
ago entitled something like "and what do I do now?"

I must admit it normally works OK - but I have come across situations a couple 
of weeks later where a file is in an unexpected state: it seems to have come 
from the wrong branch, or to be missing a commit I thought I had made.
-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.


* Re: [PATCH 3/3] daemon: Support a --user-path option.
From: Junio C Hamano @ 2006-02-04  8:50 UTC
  To: Mark Wooding; +Cc: git
In-Reply-To: <7vr76kcggx.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> Wouldn't loosening the "request must be absolute if you use
> --base-path" check in the area your first patch in the series
> touches to also allow paths that start with a '~' be enough?

That is, something like this is what I mean.

Tested, of course ;-).

diff --git a/daemon.c b/daemon.c
index 532bb0c..324bb04 100644
--- a/daemon.c
+++ b/daemon.c
@@ -145,13 +145,17 @@ static char *path_ok(char *dir)
 
 	if (base_path) {
 		static char rpath[PATH_MAX];
-		if (*dir != '/') {
-			/* Forbid possible base-path evasion using ~paths. */
+		if (!strict_paths && *dir == '~')
+			; /* allow user relative paths */
+		else if (*dir != '/') {
+			/* otherwise allow only absolute */
 			logerror("'%s': Non-absolute path denied (base-path active)", dir);
 			return NULL;
 		}
-		snprintf(rpath, PATH_MAX, "%s%s", base_path, dir);
-		dir = rpath;
+		else {
+			snprintf(rpath, PATH_MAX, "%s%s", base_path, dir);
+			dir = rpath;
+		}
 	}
 
 	path = enter_repo(dir, strict_paths);


* Re: [PATCH 2/3] daemon: Set SO_REUSEADDR on listening sockets.
From: Junio C Hamano @ 2006-02-04  8:49 UTC
  To: Mark Wooding; +Cc: git
In-Reply-To: <20060203202704.1895.18383.stgit@metalzone.distorted.org.uk>

Mark Wooding <mdw@distorted.org.uk> writes:

> From: Mark Wooding <mdw@distorted.org.uk>
>
> Without this, you can silently lose the ability to receive IPv4
> connections if you stop and restart the daemon.
>
> Signed-off-by: Mark Wooding <mdw@distorted.org.uk>

But with that, you expose yourself to the confusion TIME_WAIT
was designed to protect you from, so how about making it
optional like this?

Tested, of course ;-).
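
For readers who have not used the option before, the following is a small
standalone sketch of the idea, separate from the patch below: a listening
socket that only sets SO_REUSEADDR when the caller asks for it.  The function
name open_listener() and the choice to bind to the loopback address are
assumptions made for illustration; the daemon itself binds differently.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Illustrative only: open a TCP listening socket on 127.0.0.1:port.
 * If reuseaddr is non-zero, set SO_REUSEADDR before bind() so the
 * address can be rebound while old connections sit in TIME_WAIT.
 * Returns the socket descriptor, or -1 on any failure.
 */
int open_listener(unsigned short port, int reuseaddr)
{
	struct sockaddr_in sin;
	int on = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	/* The option must be set before bind() to have any effect. */
	if (reuseaddr &&
	    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
		close(fd);
		return -1;
	}

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(fd, 5) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Leaving the option off by default keeps the protection TIME_WAIT gives;
an administrator who restarts the daemon often can turn it on knowingly.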

-- >8 --
From nobody Mon Sep 17 00:00:00 2001
From: Mark Wooding <mdw@distorted.org.uk>
Date: Fri Feb 3 20:27:04 2006 +0000
Subject: [PATCH] daemon: Set SO_REUSEADDR on listening sockets.

Without this, you can silently lose the ability to receive IPv4
connections if you stop and restart the daemon.

[jc: tweaked code organization a bit and made this controllable
 from a command line option.]

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 daemon.c |   27 ++++++++++++++++++++++++++-
 1 files changed, 26 insertions(+), 1 deletions(-)

bb1527c884bbb9bf6a5d06c1dd409ea6c2045a91
diff --git a/daemon.c b/daemon.c
index 324bb04..dab8c2c 100644
--- a/daemon.c
+++ b/daemon.c
@@ -13,11 +13,12 @@
 
 static int log_syslog;
 static int verbose;
+static int reuseaddr;
 
 static const char daemon_usage[] =
 "git-daemon [--verbose] [--syslog] [--inetd | --port=n] [--export-all]\n"
 "           [--timeout=n] [--init-timeout=n] [--strict-paths]\n"
-"           [--base-path=path] [directory...]";
+"           [--base-path=path] [--reuseaddr] [directory...]";
 
 /* List of acceptable pathname prefixes */
 static char **ok_paths = NULL;
@@ -451,6 +452,16 @@ static void child_handler(int signo)
 	}
 }
 
+static int set_reuse_addr(int sockfd)
+{
+	int on = 1;
+
+	if (!reuseaddr)
+		return 0;
+	return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR,
+			  &on, sizeof(on));
+}
+
 #ifndef NO_IPV6
 
 static int socksetup(int port, int **socklist_p)
@@ -495,6 +506,11 @@ static int socksetup(int port, int **soc
 		}
 #endif
 
+		if (set_reuse_addr(sockfd)) {
+			close(sockfd);
+			continue;	/* not fatal */
+		}
+
 		if (bind(sockfd, ai->ai_addr, ai->ai_addrlen) < 0) {
 			close(sockfd);
 			continue;	/* not fatal */
@@ -537,6 +553,11 @@ static int socksetup(int port, int **soc
 	sin.sin_addr.s_addr = htonl(INADDR_ANY);
 	sin.sin_port = htons(port);
 
+	if (set_reuse_addr(sockfd)) {
+		close(sockfd);
+		return 0;
+	}
+
 	if ( bind(sockfd, (struct sockaddr *)&sin, sizeof sin) < 0 ) {
 		close(sockfd);
 		return 0;
@@ -663,6 +684,10 @@ int main(int argc, char **argv)
 			base_path = arg+12;
 			continue;
 		}
+		if (!strcmp(arg, "--reuseaddr")) {
+			reuseaddr = 1;
+			continue;
+		}
 		if (!strcmp(arg, "--")) {
 			ok_paths = &argv[i+1];
 			break;
-- 
1.1.6.gf7ef


* Re: Two ideas for improving git's user interface
From: Junio C Hamano @ 2006-02-04  8:25 UTC
  To: Alan Chandler; +Cc: git
In-Reply-To: <200602040803.45617.alan@chandlerfamily.org.uk>

Alan Chandler <alan@chandlerfamily.org.uk> writes:

> Wow - light comes on.

That's good.

> -this tutorial the first time, I'd suggest to skip to "Publishing
> -your work" section and come back here later.
> +this tutorial the first time, I'd suggest to skip to "Resolving Merge
> +Problems" section and come back here later.

The changes before this look very good to me, but these two
lines do not make any sense. If you are going to talk about
"Resolving Merge Problems", you _need_ to know about index, so
you cannot skip the material.

I think having a section on manual merge resolution between the
Index File section and Publishing section makes sense.  What
kind of merges did you have trouble figuring out when you were
still git novice?  That would be a good starting point.


* Re: Two ideas for improving git's user interface
From: Alan Chandler @ 2006-02-04  8:03 UTC
  To: git
In-Reply-To: <Pine.LNX.4.64.0602011732560.21884@g5.osdl.org>

On Thursday 02 February 2006 01:44, Linus Torvalds wrote:
> On Wed, 1 Feb 2006, Linus Torvalds wrote:
> > And notice how I commit the _merge_ without actually committing my dirty
> > state in the tree - and whether the files involved in my standard dirty
> > changes ("Makefile") are part of the state that the merge changed or not
> > is _totally_ irrelevant.
>
> If you get the feeling that merging is special, then to some degree, yes,
> you'd be right.
>
> Merging (especially with conflicts) is the _one_ operation where you
> absolutely have to know about the index. If you don't know about how the
> index works, you can get the conflict resolution right kind of by
> accident, simply because the default workflow of
>
> 	.. edit conflict to look ok ..
> 	git commit file/with/conflict
>
> actually happens to do exactly the right thing (very much on purpose,
> btw), but the fact is, to actually figure out more complicated conflicts
> and to _understand_ what happens, you absolutely need to be aware of the
> index. Not being aware of it just isn't an option for any serious git
> user.
>
> (Btw, I think this is where cogito falls down. Cogito tries to hide the
> index file, but I don't think you really _can_ hide the index file and
> also do merges well at the same time. Anybody who has non-trivial merges
> should use raw git - not just because the "recursive" strategy just works
> better, but exactly because of the index file issue).

Wow - light comes on.

I have been using git (or rather to be exact git with cg-add, cg-rm and 
cg-commit) for about 6 months (bearing in mind I am only a part time 
programmer in the evenings for fun - even though I work in the computer 
industry the last time I was paid to write code was in 1979 - so I don't 
really need to be a power user).  Although I knew about the index file since 
the beginning, I never really grokked what it was about before.

Of course I knew of its existence, and I even knew that it could be used as a 
staging area, but up to now I had always thought of it as a necessary 
inconvenience to enable git to run as blazingly fast as it does - not as an 
essential part of the workflow in complex situations.

I think the problem is with three crucial bits of documentation. Firstly, the 
documentation is full of "git doesn't do porcelain" statements - pushing towards 
cogito, which then hides the existence of the index file.  Git not doing 
porcelain was true at the very beginning, but I don't think that it is true 
any longer.

Secondly, the tutorial.  The examples given start by using commands to 
explicitly update the index, and then they move on to show how you don't need 
to do that by using the more advanced commands git-add and git-commit.  So 
as I was trying to learn how to use git, I followed through this and thought 
that you should just try to avoid using the index directly.  What's more, 
viewed in this light, git-commit seemed to be a rather poor implementation of 
cogito's superior cg-commit command.

[Incidentally, there is a use case that doesn't seem to have been discussed in 
this thread which I use cg-commit for all the time, and I will now have to see 
if there is an index-file equivalent for it.  I am developing a web 
application, and in the running version the database framework (iBatis) uses 
Tomcat's connection pooling.  My JUnit test harness runs without Tomcat, so I 
need a different version of the iBatis configuration file that uses its own 
database connection.  So I have created a test branch and edited the 
configuration file in that branch, and I update both code and tests in an 
edit/compile/test/fix loop until I have written or changed both code and 
tests.  I then do a cg-commit, which lists the files I have changed.  I commit 
ONLY those in the test harness - by deleting the others from cogito's list of 
files to commit - and then repeat the commit, committing the rest.  I then 
switch back to my master branch and cherry-pick the commit that contains the 
code changes, not the test harness.]

Thirdly, the discussion of the index file at the bottom end of the git man page 
(The "index" aka "Current Directory Cache") really concentrates on what it is 
and what operations you can perform with it in the normal situation.

I looked through the core tutorial for a way of bringing this to the 
attention of someone new to git, and produced the following (partial) patch 
to the core-tutorial.  (It needs a whole set of examples on resolving merge 
problems, which I have no idea at the moment how to write - this has been the 
real area which I never understood, basically because the tutorial itself 
says to skip that part.)

--- a/Documentation/core-tutorial.txt
+++ b/Documentation/core-tutorial.txt
@@ -212,15 +212,22 @@ was just to show that `git-update-index`
 actually saved away the contents of your files into the git object
 database.

+The Index File
+--------------
+
 Updating the index did something else too: it created a `.git/index`
 file. This is the index that describes your current working tree, and
-something you should be very aware of. Again, you normally never worry
-about the index file itself, but you should be aware of the fact that
-you have not actually really "checked in" your files into git so far,
-you've only *told* git about them.
+something you should be very aware of.  It is a staging area between your
+working tree and the object store described above.
+
+In normal circumstances you do not worry about the index file itself, but you
+should be aware of the fact that you have not actually really "checked in"
+your files into git so far, you've only *told* git about them.  Later you
+will see how you can exploit the fact that there is this separate index
+file to undertake more complex operations.

-However, since git knows about them, you can now start using some of the
-most basic git commands to manipulate the files or look at their status.
+However, since git knows about these files, you can now start using some of
+the most basic git commands to manipulate them or look at their status.

 In particular, let's not even check in the two files into git yet, we'll
 start off by adding another line to `hello` first:
@@ -1188,8 +1195,8 @@ How does the merge work?
 We said this tutorial shows what plumbing does to help you cope
 with the porcelain that isn't flushing, but we so far did not
 talk about how the merge really works.  If you are following
-this tutorial the first time, I'd suggest to skip to "Publishing
-your work" section and come back here later.
+this tutorial the first time, I'd suggest to skip to "Resolving Merge
+Problems" section and come back here later.

 OK, still with me?  To give us an example to look at, let's go
 back to the earlier repository with "hello" and "example" file,
@@ -1332,6 +1339,10 @@ merge for you to resolve.  Notice that t
 unmerged, and what you see with `git diff` at this point is
 differences since stage 2 (i.e. your version).

+Resolving Merge Problems
+------------------------
+
+NOT SURE WHAT GOES HERE

 Publishing your work
 --------------------

-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.


* [PATCH] read-tree --aggressive
From: Junio C Hamano @ 2006-02-04  7:31 UTC
  To: Peter Eriksen; +Cc: Git Mailing List, Linus Torvalds, Fredrik Kuivinen
In-Reply-To: <20060131213314.GA32131@ebar091.ebar.dtu.dk>

"Peter Eriksen" <s022018@student.dtu.dk> writes:

> In connection with Ian Molton's question about merge have I played a
> little with 'git merge' on the kernel sources.  What I find is that a
> merge can take quite some time, but I'm not sure where that time exactly
> goes to.  Here are the times I got:
>
> Recursive (default):  4m22.282s
> Resolve (-s resolve): 3m23.548s

In your sample script, you do not disable the post-merge diff,
which is typically one of the most expensive parts of the whole
merge, and I am wondering how fast a machine you are using to
get 4 minutes.  The post-merge diff is generated by piping the
output of 'diff-tree -M' to 'apply --stat --summary', and that
step alone takes about 12 minutes wallclock time on my box X-<.

Since my box is not as fast as yours, I've eliminated the
post-merge diff step and tried your final merge step like this:

	$ time git merge --no-summary -s resolve \
            'Merging happily' HEAD v2.6.15 >/dev/null

and got this:

        real	2m15.737s
        user	1m43.320s
        sys	0m26.690s

With the attached patch, the most expensive part, which is the
repeated invocation of git-merge-one-file to remove many deleted
paths, is eliminated.  The result is this.

        real	0m20.311s
        user	0m15.780s
        sys	0m4.150s

This patch would not help the recursive strategy, though.  Calling
read-tree with the --aggressive flag there would essentially disable
the benefit we expect to get from recursive: rename detection.
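
As a rough illustration of which cases the new flag settles inside
read-tree, here is a simplified, standalone sketch.  It is not the code from
the patch: it assumes a single common ancestor, ignores directory/file
conflicts, and represents each side as a blob name string (NULL meaning the
path is absent); the enum and function names are made up for this example.

```c
#include <stddef.h>
#include <string.h>

enum aggressive_result {
	RESOLVED_DELETED,	/* path is dropped from the result */
	RESOLVED_TAKE_HEAD,	/* both sides added the same blob */
	NOT_COVERED		/* left to the other merge rules */
};

/* Two sides match only if both exist and name the same blob. */
static int same_blob(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}

/*
 * base, head and remote name the blob at one path in the common
 * ancestor, our tree and their tree; NULL means "not present".
 */
enum aggressive_result aggressive_resolve(const char *base,
					  const char *head,
					  const char *remote)
{
	/* Deleted in both. */
	if (!head && !remote)
		return RESOLVED_DELETED;

	/* Deleted in one and unchanged in the other. */
	if (!head && same_blob(remote, base))
		return RESOLVED_DELETED;
	if (!remote && same_blob(head, base))
		return RESOLVED_DELETED;

	/* Added in both, identically. */
	if (!base && same_blob(head, remote))
		return RESOLVED_TAKE_HEAD;

	return NOT_COVERED;
}
```

A path deleted on one side but modified on the other falls through to
NOT_COVERED, which is the kind of entry a rename-aware strategy would still
want to look at.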

-- >8 --
A new flag --aggressive resolves what we traditionally resolved
with external git-merge-one-file inside index while read-tree
3-way merge works.

git-merge-octopus and git-merge-resolve use this flag before
running git-merge-index with git-merge-one-file.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 git-merge-octopus.sh |    2 +-
 git-merge-resolve.sh |    2 +-
 read-tree.c          |   32 ++++++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

2a4bb6bc618bdad6529d9ffe361bc8b7dd28a56c
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
index eb74f96..eb3f473 100755
--- a/git-merge-octopus.sh
+++ b/git-merge-octopus.sh
@@ -90,7 +90,7 @@ do
 	NON_FF_MERGE=1
 
 	echo "Trying simple merge with $SHA1"
-	git-read-tree -u -m $common $MRT $SHA1 || exit 2
+	git-read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
 	next=$(git-write-tree 2>/dev/null)
 	if test $? -ne 0
 	then
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
index 966e81f..0a8ef21 100755
--- a/git-merge-resolve.sh
+++ b/git-merge-resolve.sh
@@ -38,7 +38,7 @@ then
 fi
 
 git-update-index --refresh 2>/dev/null
-git-read-tree -u -m $bases $head $remotes || exit 2
+git-read-tree -u -m --aggressive $bases $head $remotes || exit 2
 echo "Trying simple merge."
 if result_tree=$(git-write-tree  2>/dev/null)
 then
diff --git a/read-tree.c b/read-tree.c
index a46c6fe..5580f15 100644
--- a/read-tree.c
+++ b/read-tree.c
@@ -15,6 +15,7 @@ static int update = 0;
 static int index_only = 0;
 static int nontrivial_merge = 0;
 static int trivial_merges_only = 0;
+static int aggressive = 0;
 
 static int head_idx = -1;
 static int merge_size = 0;
@@ -424,11 +425,14 @@ static int threeway_merge(struct cache_e
 	int df_conflict_remote = 0;
 
 	int any_anc_missing = 0;
+	int no_anc_exists = 1;
 	int i;
 
 	for (i = 1; i < head_idx; i++) {
 		if (!stages[i])
 			any_anc_missing = 1;
+		else
+			no_anc_exists = 0;
 	}
 
 	index = stages[0];
@@ -489,6 +493,29 @@ static int threeway_merge(struct cache_e
 	if (!head && !remote && any_anc_missing)
 		return 0;
 
+	/* Under the new "aggressive" rule, we resolve mostly trivial
+	 * cases that we historically had git-merge-one-file resolve.
+	 */
+	if (aggressive) {
+		int head_deleted = !head && !df_conflict_head;
+		int remote_deleted = !remote && !df_conflict_remote;
+		/*
+		 * Deleted in both.
+		 * Deleted in one and unchanged in the other.
+		 */
+		if ((head_deleted && remote_deleted) ||
+		    (head_deleted && remote && remote_match) ||
+		    (remote_deleted && head && head_match))
+			return 0;
+
+		/*
+		 * Added in both, identically.
+		 */
+		if (no_anc_exists && head && remote && same(head, remote))
+			return merged_entry(head, index);
+
+	}
+
 	/* Below are "no merge" cases, which require that the index be
 	 * up-to-date to avoid the files getting overwritten with
 	 * conflict resolution files. 
@@ -677,6 +704,11 @@ int main(int argc, char **argv)
 			continue;
 		}
 
+		if (!strcmp(arg, "--aggressive")) {
+			aggressive = 1;
+			continue;
+		}
+
 		/* "-m" stands for "merge", meaning we start in stage 1 */
 		if (!strcmp(arg, "-m")) {
 			if (stage || merge)
-- 
1.1.6.ge2129


* [PATCH] fmt-merge-msg: show summary of what is merged.
From: Junio C Hamano @ 2006-02-04  7:17 UTC
  To: Linus Torvalds
  Cc: Brown, Len, Git Mailing List, Paul Mackerras, Marco Costalba,
	Aneesh Kumar, Dave Jones
In-Reply-To: <Pine.LNX.4.64.0602031841320.3969@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> Yeah, it doesn't show the branch names _and_ it shows the commit that you 
> merged into too, so it looks like
>
>   Parent: 3ee68.. ([SPARC64]: Use compat_sys_futimesat in 32-bit syscall table.)
>   Parent: 876c1.. ([ACPI] Disable C2/C3 for _all_ IBM R40e Laptops)
>   Parent: 729b4.. ([ACPI] fix reboot upon suspend-to-disk)
>   Parent: cf824.. ([ACPI] handle BIOS with implicit C1 in _CST)
>
> but it's actually pretty readable there.

Fair enough.  I myself do not use gitk as often as I use
'git log'.  Something like this patch is what I've been thinking
of doing (it actually works rather nicely if you try to recreate
Len's merge).

-- >8 --
This was prompted by Len's 12-way octopus.  In addition to
the branch names, populate the log message with one-line
description from actual commits that are being merged.

This is experimental.  You need to have 'merge.summary'
in the configuration file to enable it:

	$ git repo-config merge.summary yes

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 git-fmt-merge-msg.perl |   79 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 77 insertions(+), 2 deletions(-)

b145e0d7a5fc728c00925b55c8a2c2a97788536b
diff --git a/git-fmt-merge-msg.perl b/git-fmt-merge-msg.perl
index 778388e..9ac3c87 100755
--- a/git-fmt-merge-msg.perl
+++ b/git-fmt-merge-msg.perl
@@ -27,10 +27,47 @@ sub andjoin {
 	return ($m);
 }
 
+sub repoconfig {
+	my $fh;
+	my $val;
+	eval {
+		open $fh, '-|', 'git-repo-config', '--get', 'merge.summary'
+		    or die "$!";
+		($val) = <$fh>;
+		close $fh;
+	};
+	return $val;
+}
+
+sub mergebase {
+	my ($other) = @_;
+	my $fh;
+	open $fh, '-|', 'git-merge-base', '--all', 'HEAD', $other or die "$!";
+	my (@mb) = map { chomp; $_ } <$fh>;
+	close $fh or die "$!";
+	return @mb;
+}
+
+sub shortlog {
+	my ($tip, $limit, @base) = @_;
+	my ($fh, @result);
+	open $fh, '-|', ('git-log', "--max-count=$limit", '--topo-order',
+			 '--pretty=oneline', $tip, map { "^$_" } @base)
+	    or die "$!";
+	while (<$fh>) {
+		s/^[0-9a-f]{40}\s+//;
+		push @result, $_;
+	}
+	close $fh or die "$!";
+	return @result;
+}
+
+my @origin = ();
 while (<>) {
-	my ($bname, $tname, $gname, $src);
+	my ($bname, $tname, $gname, $src, $sha1, $origin);
 	chomp;
-	s/^[0-9a-f]*	//;
+	s/^([0-9a-f]*)	//;
+	$sha1 = $1;
 	next if (/^not-for-merge/);
 	s/^	//;
 	if (s/ of (.*)$//) {
@@ -52,19 +89,30 @@ while (<>) {
 		};
 	}
 	if (/^branch (.*)$/) {
+		$origin = $1;
 		push @{$src{$src}{BRANCH}}, $1;
 		$src{$src}{HEAD_STATUS} |= 2;
 	}
 	elsif (/^tag (.*)$/) {
+		$origin = $_;
 		push @{$src{$src}{TAG}}, $1;
 		$src{$src}{HEAD_STATUS} |= 2;
 	}
 	elsif (/^HEAD$/) {
+		$origin = $src;
 		$src{$src}{HEAD_STATUS} |= 1;
 	}
 	else {
 		push @{$src{$src}{GENERIC}}, $_;
 		$src{$src}{HEAD_STATUS} |= 2;
+		$origin = $src;
+	}
+	if ($src eq '.' || $src eq $origin) {
+		$origin =~ s/^'(.*)'$/$1/;
+		push @origin, [$sha1, "$origin"];
+	}
+	else {
+		push @origin, [$sha1, "$origin of $src"];
 	}
 }
 
@@ -93,3 +141,30 @@ for my $src (@src) {
 	push @msg, $this;
 }
 print "Merge ", join("; ", @msg), "\n";
+
+if (!repoconfig) {
+	exit(0);
+}
+
+# We limit the merge message to the latest 20 or so per each branch.
+my $limit = 20;
+
+for (@origin) {
+	my ($sha1, $name) = @$_;
+	my @mb = mergebase($sha1);
+	my @log = shortlog($sha1, $limit, @mb);
+	if ($limit + 1 <= @log) {
+		print "\n* $name: (" . scalar(@log) . " commits)\n";
+	}
+	else {
+		print "\n* $name:\n";
+	}
+	my $cnt = 0;
+	for my $log (@log) {
+		if ($limit < ++$cnt) {
+			print "  ...\n";
+			last;
+		}
+		print "  $log";
+	}
+}
-- 
1.1.6.ge2129


* [PATCH] Use sha1_file.c's mkdir-like routine in apply.c.
From: Jason Riedy @ 2006-02-04  6:50 UTC
  To: git

As far as I can see, create_subdirectories() in apply.c just
duplicates the functionality of safe_create_leading_directories() from
sha1_file.c.  The former has a warm, fuzzy const parameter, but that's
not important.

The potential problem with EEXIST and creating directories should
never occur here, but will be removed by future
safe_create_leading_directories() changes.  Other uses of EEXIST in
apply.c should be fine barring intentionally malicious behavior.

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>

---

 apply.c |   25 ++++---------------------
 1 files changed, 4 insertions(+), 21 deletions(-)

9f6bc2c90544688c2aad0ad0da4b2674f5aa6fab
diff --git a/apply.c b/apply.c
index 79e23a7..2ad47fb 100644
--- a/apply.c
+++ b/apply.c
@@ -1564,24 +1564,6 @@ static void add_index_file(const char *p
 		die("unable to add cache entry for %s", path);
 }
 
-static void create_subdirectories(const char *path)
-{
-	int len = strlen(path);
-	char *buf = xmalloc(len + 1);
-	const char *slash = path;
-
-	while ((slash = strchr(slash+1, '/')) != NULL) {
-		len = slash - path;
-		memcpy(buf, path, len);
-		buf[len] = 0;
-		if (mkdir(buf, 0777) < 0) {
-			if (errno != EEXIST)
-				break;
-		}
-	}
-	free(buf);
-}
-
 static int try_create_file(const char *path, unsigned int mode, const char *buf, unsigned long size)
 {
 	int fd;
@@ -1610,13 +1592,14 @@ static int try_create_file(const char *p
  * which is true 99% of the time anyway. If they don't,
  * we create them and try again.
  */
-static void create_one_file(const char *path, unsigned mode, const char *buf, unsigned long size)
+static void create_one_file(char *path, unsigned mode, const char *buf, unsigned long size)
 {
 	if (!try_create_file(path, mode, buf, size))
 		return;
 
 	if (errno == ENOENT) {
-		create_subdirectories(path);
+		if (safe_create_leading_directories(path))
+			return;
 		if (!try_create_file(path, mode, buf, size))
 			return;
 	}
@@ -1643,7 +1626,7 @@ static void create_one_file(const char *
 
 static void create_file(struct patch *patch)
 {
-	const char *path = patch->new_name;
+	char *path = patch->new_name;
 	unsigned mode = patch->new_mode;
 	unsigned long size = patch->resultsize;
 	char *buf = patch->result;
-- 
1.1.5


* Re: [RFD] diff-tree -c (not --cc) in diff-raw format?
From: Junio C Hamano @ 2006-02-04  6:39 UTC
  To: Marco Costalba; +Cc: git
In-Reply-To: <e5bfff550602032218s72f91151tb8f6fcda373c2a28@mail.gmail.com>

Marco Costalba <mcostalba@gmail.com> writes:

> As you see -c _could_ imply -m

In case it was not clear, it already implies -m and does not
skip merges even if you do not give -m.  Otherwise you would not
see anything without saying "diff-tree -c -m" to begin with.


* Re: [RFD] diff-tree -c (not --cc) in diff-raw format?
From: Marco Costalba @ 2006-02-04  6:18 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfyn0asdd.fsf@assigned-by-dhcp.cox.net>

On 2/4/06, Junio C Hamano <junkio@cox.net> wrote:
> The second paragraph from this log caught my attention:
>
>     commit cb80775b530a8a340a7f9e4fecc8feaaac13777c
>     Author: Marco Costalba <mcostalba@gmail.com>
>     Date:   Sun Jan 29 12:27:34 2006 +0100
>
>         Use git-diff-tree -c (combined) option to retrieve merge's
>         file list
>
>         Change un-interesting files pruning algorithm to use native
>         git-diff-tree -c option when showing merge files in files
>         list box.
>
> The current diff-tree -c is rather expensive way to do this.  I
> made both -c and --cc to always produce patches, but this
> suggests at least qgit would benefit if I allow -c in diff-raw
> format.  Essentially, you are interested in paths that the
> results do not match _any_ of the parents.
>

Yes. that would be great!

If you look at the patch you see I actually run something like this

         git-diff-tree -c <sha> | grep "diff --combined"

to get the 'interesting' file list.
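
The selection rule being discussed (a path is interesting when the merge
result matches none of the parents) can be sketched as a small predicate.
This is only an illustration with made-up names, comparing blob names as
strings; it is not how diff-tree or qgit implements it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative predicate: return 1 if the blob recorded for a path in
 * the merge result differs from the blob in every parent, 0 if at
 * least one parent already had exactly that content.
 */
int differs_from_all_parents(const char *result,
			     const char *const *parents, size_t nparents)
{
	size_t i;

	for (i = 0; i < nparents; i++)
		if (!strcmp(result, parents[i]))
			return 0;	/* taken unchanged from this parent */
	return 1;
}
```

Using the abbreviated names quoted below, Makefile would qualify: the result
e9bf860 matches neither 92cfee4 nor 50392ff.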

> So instead of 70 lines output from
>
>         $ git-diff-tree -m -r --abbrev v1.0.0
>
> I could give you:
>
>         $ git-diff-tree -c -m -r --abbrev v1.0.0
>         c2f3bf071ee90b01f2d629921bb04c4f798f02fa
>         :100644 100644 92cfee4... e9bf860... M  Makefile
>         :100644 100644 d36904c... 4fa6c16... M  debian/changelog
>         c2f3bf071ee90b01f2d629921bb04c4f798f02fa
>         :100644 100644 50392ff... e9bf860... M  Makefile
>         :100644 100644 376f0fa... 4fa6c16... M  debian/changelog
>
> or even:
>
>         $ git-diff-tree -c -m -r --name-only v1.0.0
>         c2f3bf071ee90b01f2d629921bb04c4f798f02fa
>         Makefile
>         debian/changelog
>
> I am not so sure which one is more useful, though.  What do you
> think?
>

The first one without --abbrev option

 git-diff-tree -c -m -r c2f3bf071ee90b01f2d629921bb04c4f798f02fa

That way I can reuse my common diff-tree parsing code. Perhaps better, if
possible, I would suggest treating the -c option like all the other options,
so that it is possible to use something like

git-diff-tree -c -r c2f3bf071ee90b01f2d629921bb04c4f798f02fa
git-diff-tree -c -r --name-only v1.0.0
git-diff-tree -c -r --abbrev v1.0.0
git-diff-tree -c -r c2f3bf071ee90b01f2d629921bb04c4f798f02fa  foo.c

and so on....

As you see -c _could_ imply -m

The current -c behaviour _could_ be obtained with

git-diff-tree -c -r -p <sha>

I didn't check the combined code to see if what I suggest is easy/possible.

> I should keep track of qgit repository more often but I haven't
> been doing so because the site used to be almost unpullable over
> http (it seems to work just fine these days so it is not an
> excuse for me anymore) and I do not read C++ too well (bad
> excuse perhaps but still an excuse for me).
>
>

No problem, I do not read C++ too well either. ;-)

Marco


* Re: [PATCH] combine-diff: add safety check to --cc.
From: Junio C Hamano @ 2006-02-04  6:12 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17380.15836.61062.401906@cargo.ozlabs.ibm.com>

Paul Mackerras <paulus@samba.org> writes:

> Linus Torvalds writes:
>
>> In fact, git-diff-tree now gets the subtle cases right for things that 
>> "gitk" for some reason gets wrong. I haven't figured out what's wrong with 
>> gitk, but I don't think it's even worth it: it would be better to just 
>> teach gitk to use git-diff-tree --cc.
>
> Working on it now.  That will let me cut out about 500 lines of pretty
> hairy Tcl code from gitk, which is nice.
>
> Paul.

Excited to hear that.  Please be sure to base your work on the
latest updated format, described in my earlier message

    Subject: [Attn - repository browser authors] diff-tree combined format.


* Re: Tracking and committing back to Subversion?
From: Eric Wong @ 2006-02-04  5:40 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Git Mailing List
In-Reply-To: <1138834301.21899.40.camel@wilber.wgtn.cat-it.co.nz>

Sam Vilain <sam@vilain.net> wrote:
> Hi all,
> 
> Has anyone done any work on bidirectional access to SVN repositories?
> ie, tracking and committing.

AFAIK, not yet.  But it's something that's been on the back of my mind
for a while.  I've attempted similar things with Arch <-> SVN in the
past but didn't get anything extremely robust going from Arch -> SVN
although I'm pretty satisfied with my SVN -> Arch product
(svn-arch-mirror).

> That would be porcelain that behaves like SVK (http://svk.elixus.org)
> 
> Ideally it would probably need to link against the Subversion RA (remote
> access) library, neon.

I prefer to use git-svnimport for pulling from svn since it's pretty
good at what it does.  That already depends on SVN::Core and SVN::Ra.
SVK is a really nice tool, just a bit slow after I've been using git for a
while.

> I can foresee two potential issues:
> 
>   1. file properties - such as mime type, ignores and custom properties.
>      Linus, when I asked you about this in Dunedin, you mentioned that
>      there is a place at the end of the directory entry where this could
>      fit without breaking backwards compatibility.  Perhaps this could
>      be an optional pointer to another directory node;

Mostly ignore-able, imho.  svn:executable is the one that makes the most
sense (and is easiest) to support.

svn:ignore <-> .gitignore mappings are pretty close (identical, save for
recursive properties that svn gets very wrong).  I don't think svn:ignore
is affected by line order, whereas .gitignore is a plain file and line
order does matter.

svn:external is almost definitely better off ignored entirely when
interfacing with other RCSes.

svn:keywords: I don't think there's a way to disable this like there
is with CVS, is there?  keywords are evil imho.

I don't use or know much about the other properties...

>   2. "forensic" file movement history - as opposed to the uncached,
>      (and unversioned), automatic "analytical" file movement history.
> 
>      It would be easy for a tool to provide 100% interface compatibility
>      with SVN client/SVK using properties, but properties that hang off
>      the head rather than the file itself (so that they don't stuff up
>      the ability to merge identical trees reached via independent
>      paths).  SVN calls these "revision properties".  If a good
>      convention is adopted for this, it could be used as a nice way to
>      supplement git's automatic analysis of the revision history.

Just parsing the output of diff-tree -C and marking them in SVN as
copies/renames should be sufficient for letting SVN do its thing.
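
A minimal sketch of that parsing step, in Python. It assumes the raw diff-tree format, where a rename or copy line ends in a status such as R086 or C100 followed by tab-separated source and destination paths. The function name is made up for illustration, and actually recording the result on the SVN side is not shown.

```python
def renames_and_copies(raw_output):
    """Extract (kind, source, destination) tuples from raw diff-tree -C output."""
    results = []
    for line in raw_output.splitlines():
        if not line.startswith(":"):
            continue  # commit-id lines and blanks carry no path information
        meta, *paths = line.split("\t")
        status = meta.split()[-1]  # e.g. "M", "R086", "C100"
        if status[0] in "RC" and len(paths) == 2:
            kind = "rename" if status[0] == "R" else "copy"
            results.append((kind, paths[0], paths[1]))
    return results
```

Plain modifications, additions and deletions fall through untouched, so they can be committed to SVN as ordinary changes.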

Doing this kind of file movement history on the git side sounds like
overkill to me.  I was a _huge_ fan of logical file-identities in GNU
Arch in the past, but the complexity destroyed it from both a UI and
performance perspective.

-- 
Eric Wong


* Re: [PATCH] combine-diff: add safety check to --cc.
From: Paul Mackerras @ 2006-02-04  5:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Git Mailing List, Marco Costalba, Aneesh Kumar,
	Len Brown
In-Reply-To: <Pine.LNX.4.64.0602021454060.21884@g5.osdl.org>

Linus Torvalds writes:

> In fact, git-diff-tree now gets the subtle cases right for things that 
> "gitk" for some reason gets wrong. I haven't figured out what's wrong with 
> gitk, but I don't think it's even worth it: it would be better to just 
> teach gitk to use git-diff-tree --cc.

Working on it now.  That will let me cut out about 500 lines of pretty
hairy Tcl code from gitk, which is nice.

Paul.


* Re: git-svnimport
From: Martin Langhoff @ 2006-02-04  3:16 UTC (permalink / raw)
  To: Jason Riedy; +Cc: Jason Harrison, git
In-Reply-To: <16255.1139021263@lotus.CS.Berkeley.EDU>

On 2/4/06, Jason Riedy <ejr@eecs.berkeley.edu> wrote:
> Looking through the git-svnimport source makes it appear
> difficult to snarf just one directory (maybe /) out of a
> svn repository.  Actually, it makes everything svn-related
> appear difficult.  Why should I worry about memory
> management by default in Perl?!?

I think that the SVN internals are so yucky that we're lucky to get
away with "just" that bit of memory-handling crud.

>  It'd be nice if
> there were a git-svnimport-trivial that just snarfed a
> single URL without tags or branches.

We definitely need some updates to the doco for svnimport to
make it easier. Maybe some examples of real-life scenarios with
different repo layouts.

> I have to deal with a few repos with bizarre directory
> structures...

looks like you're the man for the job ;-)

cheers,


martin


* Re: git-svnimport
From: Martin Langhoff @ 2006-02-04  3:12 UTC (permalink / raw)
  To: Jason Harrison; +Cc: git
In-Reply-To: <200602031429.07894.jharrison@linuxbs.org>

On 2/4/06, Jason Harrison <jharrison@linuxbs.org> wrote:
> I am trying to import from an svn repository into a git repository using
> git-svnimport.  So far my attempts have failed.  Here is what I have done so
> far.

You always want to be passing the branches and tags parameters
explicitly. SVN repos have so many weird layouts that svnimport does a
bad job without them. Trial and error has worked for me ;-)

See for instance:

  http://www.gelato.unsw.edu.au/archives/git/0512/14069.html

regards,


martin


* Re: git-svnimport
From: Jason Riedy @ 2006-02-04  2:47 UTC (permalink / raw)
  To: Jason Harrison; +Cc: git
In-Reply-To: <200602031429.07894.jharrison@linuxbs.org>

And Jason Harrison writes:
 - I am trying to import from an svn repository into a git repository using 
 - git-svnimport.  So far my attempts have failed.  Here is what I have done so 
 - far.
 - 
 - git-svnimport svn://svn.debian.org/demi/
[...]
 - git-svnimport -T demi svn://svn.debian.org/demi/
[...]

It seems git-svnimport requires one level of directory 
structure.  Running 
  git-svnimport -T bin svn://svn.debian.org/demi/
gives me the contents of the bin directory while (noisily)
ignoring everything else.  I think svn repositories
usually have a structure with branches in one directory,
tags in another, and its version of HEAD in yet another.

Looking through the git-svnimport source makes it appear
difficult to snarf just one directory (maybe /) out of a
svn repository.  Actually, it makes everything svn-related
appear difficult.  Why should I worry about memory
management by default in Perl?!?  It'd be nice if there
were a git-svnimport-trivial that just snarfed a single
URL without tags or branches.

I have to deal with a few repos with bizarre directory 
structures but _no_ branches or tags, and I found a work-
around that may be good enough.  Mirroring the repos using
svm (or svn-mirror, from SVN::Mirror) then importing the
mirrors works, but it gives an extraneous first commit.
Example derived from the svn-mirror/svm man page:
  env SVMREPOS=${HOME}/svm svn-mirror init mirror/demi \
        svn://svn.debian.org/demi
  env SVMREPOS=${HOME}/svm svn-mirror sync mirror/demi
  git-svnimport -v -T mirror/demi file://${HOME}/svm

That may spew messages about unrecognized paths if you 
mirror a few repos, but it should work.  Syncing and re-
running git-svnimport appears to keep things up-to-date, 
but I haven't had much opportunity to test that.

SVK (same author as SVN::Mirror) might work for creating
a mirror, possibly without the extraneous first commit, 
but I couldn't figure out the right -T and URL options 
for git-svnimport.

Jason


* Re: The merge from hell...
From: Linus Torvalds @ 2006-02-04  2:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brown, Len, Git Mailing List, Paul Mackerras, Marco Costalba,
	Aneesh Kumar, Dave Jones
In-Reply-To: <7vy80r97h6.fsf@assigned-by-dhcp.cox.net>

On Fri, 3 Feb 2006, Junio C Hamano wrote:
> 
> It might make sense if we had tool support to pre-format the
> merge messages like this, given a set of branch names:
> 
>     [ACPI] Merge 3549, 4320, 4485, 4588, 4980, 5483, 5651, acpica, asus, fops and pnpacpi branches into release
> 
>     3549: [ACPI] Disable C2/C3 for _all_ IBM R40e Laptops
>     4320: [ACPI] fix reboot upon suspend-to-disk
>     4485: [ACPI] handle BIOS with implicit C1 in _CST
..

Well, this is actually not all that different from what gitk will show you 
(since I added the commit "explanation" names with my incredible 
copy-paste skills to it).

Just look in the details window in gitk on that merge, and that's pretty 
much exactly what you'll see, except you'll also have the nice clickable 
hyperlink features ;)

Yeah, it doesn't show the branch names _and_ it shows the commit that you 
merged into too, so it looks like

  Parent: 3ee68.. ([SPARC64]: Use compat_sys_futimesat in 32-bit syscall table.)
  Parent: 876c1.. ([ACPI] Disable C2/C3 for _all_ IBM R40e Laptops)
  Parent: 729b4.. ([ACPI] fix reboot upon suspend-to-disk)
  Parent: cf824.. ([ACPI] handle BIOS with implicit C1 in _CST)

but it's actually pretty readable there.

		Linus


* Re: The merge from hell...
From: Junio C Hamano @ 2006-02-04  2:35 UTC (permalink / raw)
  To: Brown, Len
  Cc: Linus Torvalds, Git Mailing List, Paul Mackerras, Marco Costalba,
	Aneesh Kumar, Dave Jones
In-Reply-To: <F7DC2337C7631D4386A2DF6E8FB22B3005F34393@hdsmsx401.amr.corp.intel.com>

"Brown, Len" <len.brown@intel.com> writes:

> Naming the branch is just eye-candy for the merge comment.
> My topic branch labels in refs/my-branch never get to kernel.org, so you're
> not going to see the pretty green tags on topic branches that I see.
>
> I include the full-URL of the bug report in the original commit comments
> for those who are interested.  I think this is the important place to put it,
> and in practice I've found it to be extremely useful.

Both excellent points.

If I may digress,...

It appears most of the topic branches in that merge have only
one commit since they forked from trunk (the development track
that led to the first parent of that merge), but some seem to
have more than one commit.

It might make sense if we had tool support to pre-format the
merge messages like this, given a set of branch names:

    [ACPI] Merge 3549, 4320, 4485, 4588, 4980, 5483, 5651, acpica, asus, fops and pnpacpi branches into release

    3549: [ACPI] Disable C2/C3 for _all_ IBM R40e Laptops
    4320: [ACPI] fix reboot upon suspend-to-disk
    4485: [ACPI] handle BIOS with implicit C1 in _CST
    4588: [ACPI] fix acpi_os_wait_sempahore() finite timeout case (AE_TIME warning)
    4980 (5 commits): [ACPI] build EC driver on IA64
    5483 (3 commits): [ACPI] fix acpi_cpufreq.c build warrning
    5651: [ACPI] SMP S3 resume: evaluate _WAK after INIT
    acpica (23 commits): [ACPI] ACPICA 20060113
    asus (3 commits): [ACPI_ASUS] fix asus module param description
    fops (2 commits): [ACPI] make two processor functions static
    pnpacpi (3 commits): [PNPACPI] clean excluded_id_list[]

Here I am counting the number of commits on each topic since it
last diverged from trunk (`merge-base release branch`), and
showing the latest commit of the topic.
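
To make the idea concrete, here is a small Python sketch of such a formatter. It is only an illustration under stated assumptions: the function names are hypothetical, and the per-branch commit subjects (newest first, since the merge base) are assumed to have been collected already, which is not shown.

```python
def join_names(names):
    """Join branch names as 'a, b and c'."""
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]


def merge_summary(target, topics, prefix=""):
    """Format a merge message from (branch, subjects) pairs.

    Each subjects list holds the commit subjects on that topic since it
    diverged from the target branch, newest first.
    """
    names = [name for name, _ in topics]
    noun = "branch" if len(names) == 1 else "branches"
    lines = [f"{prefix}Merge {join_names(names)} {noun} into {target}", ""]
    for name, subjects in topics:
        count = len(subjects)
        # Only mention the count when a topic has more than one commit.
        label = name if count == 1 else f"{name} ({count} commits)"
        lines.append(f"{label}: {subjects[0]}")
    return "\n".join(lines)
```

Called with the topics from the example above, this produces a title line followed by one line per branch showing its latest commit.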

We could even go further and have "per branch annotation" that
lets you do something like this:

        $ git checkout -b 3549 \
              --description 'http://bugzilla.kernel.org/show_bug.cgi?id=3549'
        $ work work
        $ git commit
        ... work on other topics in similar way ...
        ... later, on the 'release' branch ...
        $ git pull . 3549 4320 4485...

With that, we could give a default merge message formatted like this:

    Merge 3549, 4320, 4485, 4588, 4980, 5483, 5651, acpica, asus, fops and pnpacpi branches into release

    3549: http://bugzilla.kernel.org/show_bug.cgi?id=3549
     [ACPI] Disable C2/C3 for _all_ IBM R40e Laptops

    4320: http://bugzilla.kernel.org/show_bug.cgi?id=4320
     [ACPI] fix reboot upon suspend-to-disk

    5483: http://bugzilla.kernel.org/show_bug.cgi?id=5483
     [ACPI] fix acpi_cpufreq.c build warrning
     [ACPI] IA64 ZX1 buildfix for _PDC patch
     [ACPI] Avoid BIOS inflicted crashes by evaluating _PDC only once

    pnpacpi: work on PNP-ACPI issues
     [PNPACPI] clean excluded_id_list[]
     [PNPACPI] Ignore devices that have no resources
     [ACPI] enable PNPACPI support for resource types used by HP serial ports

This last digression might be too much, though.  It may be
something that is better computed by 'git log' while reviewing
history, except that the description of each branch cannot be
given that way.


* Re: Two ideas for improving git's user interface
From: Linus Torvalds @ 2006-02-04  2:08 UTC (permalink / raw)
  To: Carl Worth; +Cc: Junio C Hamano, Nicolas Pitre, git
In-Reply-To: <87lkwsvusp.wl%cworth@cworth.org>

On Fri, 3 Feb 2006, Carl Worth wrote:
> 
> > And in that case, I will actually re-apply my manual Makefile change, even 
> > if that file was part of the merge changes (in which case I had had to 
> > first un-apply the change in order to do the merge).
> 
> Are the un-apply and re-apply operations here primarily manual? or
> does git help you much with those (beyond alerting you that the merge
> cannot take place before you un-apply things)?

They're purely manual. If the changes are more extensive, I just create a 
temporary branch for them, which is easy enough:

	git checkout -b temp
	git commit
	git checkout master

before I do the real merge, but the fact is, most of the changes in my 
tree tend to be pretty un-interesting. Most of the time it's literally 
_just_ the Makefile change, sometimes it's a trial patch that I'm not 
ready to commit and had just sent out to somebody for testing or similar.

> I believe that the staging operations you perform are quite desirable,
> but I wonder if existing primitives in git might not provide a more
> powerful basis for the kinds of operation you're performing.

No. The point is that they are trivial to do, and that they don't _need_ 
"powerful basis".

What they need is _usability_.

And the git index _is_ that usability. It is incredibly powerful, and 
incredibly easy to use.

When you argue against exposing the index, you argue against it from the 
"let's not give them rope" angle. You argue against power and flexibility. 

You argue for the clippy, the helper app that says

	Are you sure you want to do this?
		[Yes] [No] [Cancel]

while I'm trying to explain that it's actually part of the _power_ of git.

The fact that I can keep dirty state in my tree and continue to work with 
it _without_ having to worry about it is a huge relief to me. 

> If so, could your not-ready changes be implemented as some branch that
> is automatically unmerged prior to commit and then re-merged
> afterwards? Or something like that?

Sure. They could. You could make things more complicated, and they would 
WORK. 

They'd be inconvenient and not offer any actual improvement.

The "index" file in git really is very important.

Staging into the index is _the_ most fundamental operation. You can't 
actually see it very well in the history of git (because the first commit 
exists only after git actually worked pretty fully), but the birth of git 
is really in the index file. That actually came _before_ the object store, 
as the way to quickly and efficiently track the notion of "changes".

So git itself started out very much with the index file being the staging 
area for tracking the state of a working tree efficiently.

No git operation actually ever lets the working tree interact directly 
with the object store. The notion of "diff this <tree> object against the 
current working tree" comes closest, but even that actually really goes 
through the index file: it's properly a "diff this <tree> object against 
the index file, and check at the same time the index entry against the 
working tree".

If you deny the index file, you really deny git itself.

Think of it this way: when you start a new process, in UNIX you do that in 
two stages: first you fork() to create a copy, then you do exec() to 
populate the copy with the new process. 

Your argument is akin to saying "That's horribly wasteful: wouldn't it be 
much more intuitive to just do 'spawn()' to do it all, and avoid the 
unnecessary middle step".

But that "unnecessary" middle step - whether it's "fork()" or the git 
"index" file - is actually the source of the flexibility. It's what allows 
you to do the "fixups" in the middle when you switch file descriptors 
around, or when you fix up merge conflicts.

And then occasionally, you do fork() _without_ doing an execve() at all. 
The same way that sometimes you do operations on the index without 
actually committing them to a tree.
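
For readers who have not used the two-step form, here is a small Python sketch of the fork()/exec() pattern with a fixup in the middle. It is an illustration only: the function name is made up, and it assumes a POSIX system. The child redirects its standard output to a file before exec, which is exactly the kind of step a single spawn() call would have to grow options for.

```python
import os


def spawn_with_stdout(argv, out_path):
    """fork(), redirect stdout in the child, then exec argv; return exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: this is the "fixup" window between fork() and exec().
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            if fd != 1:
                os.dup2(fd, 1)  # make the file the child's stdout
                os.close(fd)
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)  # only reached if the setup or exec failed
    # Parent: wait for the child and report how it exited.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

The parent's own file descriptors are never touched; all the rearranging happens in the copy, in the same way that staging happens in the index rather than in the commit itself.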

That's flexibility. Revel in it, instead of trying to push it under the 
rug. 

			Linus


* Re: [PATCH] config: Rummage through ~/.gitrc as well as the repository's config.
From: Junio C Hamano @ 2006-02-04  1:15 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git
In-Reply-To: <20060203203332.2718.13451.stgit@metalzone.distorted.org.uk>

Mark Wooding <mdw@distorted.org.uk> writes:

> I'm fed up of setting user.email in every repository I own.

That is what --template-dir to init-db is for (it works in an
already initialized git repository, but it does not overwrite
files).

> I want to put this somewhere central, and I shouldn't have to
> log in again to make it take effect.

I do not understand about logging in again.  If you are talking
about environment variables, ". ~/.env" would work nicely.


* Re: Two ideas for improving git's user interface
From: Carl Worth @ 2006-02-04  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Nicolas Pitre, git
In-Reply-To: <Pine.LNX.4.64.0602011656130.21884@g5.osdl.org>


On Wed, 1 Feb 2006 17:23:38 -0800 (PST), Linus Torvalds wrote:
>
> I tend to have a certain fairly constant set of changes in my working 
> tree, namely every time a release is getting closer, I always tend to have 
> the "Makefile" already updated for the new version (but not checked in: I 
> do that just before I actually tag it, so that the tag will match the 
> commit that actually changes the version).

OK. That use case I understand just fine.

> However, if the question was an even stricter "do you ever commit 
> _changes_ to a particular file where the last HEAD, the index _and_ the 
> working tree are all different", then the answer is actually "Yes" to that 
> too.

Yes, this is the question I was trying to ask. Thanks for pretending
that I had actually asked it, and then answering it as well.

> What has happened is that I have had merges that have content conflicts 
> that I fix up by hand, but exactly _because_ I fix them up by hand, I 
> actually want to re-compile the kernel and test my fixups.

OK. I hadn't anticipated this use case, but I am interested in
exploring it more fully.

> And in that case, I will actually re-apply my manual Makefile change, even 
> if that file was part of the merge changes (in which case I had had to 
> first un-apply the change in order to do the merge).

Are the un-apply and re-apply operations here primarily manual? or
does git help you much with those (beyond alerting you that the merge
cannot take place before you un-apply things)?

> The thing is, once you get used to the git "index" as a staging place, 
> it's really really powerful.

I believe that the staging operations you perform are quite desirable,
but I wonder if existing primitives in git might not provide a more
powerful basis for the kinds of operation you're performing.

For example, in the case of the not-quite-ready-to-be-committed
changes that you want to carry along, couldn't you get additional
benefits if those changes could live on their own branch? I suppose
there may be a missing operator needed to allow you to easily merge
*and* unmerge that branch if needed. Would that seem at all feasible?

If so, could your not-ready changes be implemented as some branch that
is automatically unmerged prior to commit and then re-merged
afterwards? Or something like that?

I guess the feeling I get is that staging into the index feels
conceptually similar to a commit to a branch, but it's a uniquely weak
branch (only one revision per file). And this uniqueness also
introduces complexity (the various diff operations), as well as
possibilities of confusion when committing. Meanwhile the response to
the commit confusion seems to be to add yet more complexity to commit
which doesn't seem like an improvement to me.

[I'm maybe too far out on a limb at this point, since you've
definitely identified a use case for staging in the index, and all
I've offered as an alternative is hand-waving about "branches should
be able to do that". But if nothing else, I'm floating some ideas out
loud, and next I'll try experimenting more with possibilities for
non-index staging.]

I'm already having a lot of fun with git. It's a very impressive tool,
with a surprisingly simple/powerful core.

> Actually, we do exactly that. Right now we expressly limit the "preview" 
> to just the filenames, but we literally do run
> 
> 	git-diff-index -M --cached --name-status --diff-filter=MDTCRA HEAD
>
> as part of "git status", and the eventual end result is what we will 
> populate the commit message file with for your editing pleasure.

Yes, that's a good thing to do. In my personal workflow, a
pre-populated commit message is a bit late, since I want to review and
convince myself I like things before I type the magic word "commit".

And I'm not claiming that a preview patch is impossible to generate,
I'm just saying that it's currently rather hard to figure out the correct
correspondence between arguments to diff and arguments to commit (see
more on this point in another branch of this thread).

-Carl


* [RFD] diff-tree -c (not --cc) in diff-raw format?
From: Junio C Hamano @ 2006-02-04  0:18 UTC (permalink / raw)
  To: Marco Costalba; +Cc: git

The second paragraph from this log caught my attention:

    commit cb80775b530a8a340a7f9e4fecc8feaaac13777c
    Author: Marco Costalba <mcostalba@gmail.com>
    Date:   Sun Jan 29 12:27:34 2006 +0100

        Use git-diff-tree -c (combined) option to retrieve merge's
        file list

        Change un-interesting files pruning algorithm to use native
        git-diff-tree -c option when showing merge files in files
        list box.

The current diff-tree -c is a rather expensive way to do this.  I
made both -c and --cc always produce patches, but this
suggests at least qgit would benefit if I allowed -c in diff-raw
format.  Essentially, you are interested in paths where the
result does not match _any_ of the parents.

So instead of 70 lines output from

	$ git-diff-tree -m -r --abbrev v1.0.0

I could give you:

        $ git-diff-tree -c -m -r --abbrev v1.0.0
        c2f3bf071ee90b01f2d629921bb04c4f798f02fa
        :100644 100644 92cfee4... e9bf860... M	Makefile
        :100644 100644 d36904c... 4fa6c16... M	debian/changelog
        c2f3bf071ee90b01f2d629921bb04c4f798f02fa
        :100644 100644 50392ff... e9bf860... M	Makefile
        :100644 100644 376f0fa... 4fa6c16... M	debian/changelog

or even:

        $ git-diff-tree -c -m -r --name-only v1.0.0
        c2f3bf071ee90b01f2d629921bb04c4f798f02fa
        Makefile
        debian/changelog

I am not so sure which one is more useful, though.  What do you
think?
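
The selection rule itself can be sketched in a few lines of Python. This is an illustration only, with hypothetical function names: it assumes the existing "-m -r" raw output, where each parent's section starts with a line that does not begin with ":", and it keeps only the paths that show up as changed against every parent.

```python
def group_raw_by_parent(raw_output):
    """Split raw 'diff-tree -m -r' output into one path list per parent."""
    groups = []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(":"):
            if not groups:
                groups.append([])  # tolerate output without a leading id line
            # The path follows the first tab on a raw diff line.
            groups[-1].append(line.split("\t", 1)[1])
        else:
            groups.append([])  # a commit-id line starts the next parent's section
    return groups


def paths_changed_vs_all_parents(per_parent_paths):
    """Keep paths that differ from every parent, in first-parent order."""
    if not per_parent_paths:
        return []
    common = set(per_parent_paths[0])
    for paths in per_parent_paths[1:]:
        common &= set(paths)
    return [p for p in per_parent_paths[0] if p in common]
```

A path that matches any one parent drops out of the intersection, which is what shrinks the 70 lines down to the two files shown above.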

On the other hand, git-diff-tree --cc needs to look at the diff
between result and parents of the merge in order to do its job
hunk-per-hunk, so producing diff-raw output fundamentally does
not make sense for that option.

I should keep track of qgit repository more often but I haven't
been doing so because the site used to be almost unpullable over
http (it seems to work just fine these days so it is not an
excuse for me anymore) and I do not read C++ too well (bad
excuse perhaps but still an excuse for me).


* Re: Two ideas for improving git's user interface
From: Carl Worth @ 2006-02-03 23:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vek2mzec5.fsf@assigned-by-dhcp.cox.net>


[I'm still hesitant to be jumping into this discussion with both feet
like this, so please imagine lots of disclaimers of ignorance before
any claims I make---I would not be surprised or offended to learn I'm
wildly wrong about how I think some things work.]

On Wed, 01 Feb 2006 18:25:46 -0800, Junio C Hamano wrote:
> Carl Worth <cworth@cworth.org> writes:
> 
> > If not, we should be able to simplify things since a lot of the
> > UI complexity being discussed (-a vs. no -a, path names vs. no path
> > names), hinges on the handling of skewed files.
> 
> I am in agreement with you that "skewed files" might lead to
> confusion, but I do not see how that relates to "-a vs no -a" nor
> "path names vs no path names" issues.

In the case of skewed files, "-a" commits the current file content,
while "no -a" commits the skewed content. Similarly, "path names"
commits the current contents while "no path names" commits the skewed
content.

> Let's say we try to detect and forbid committing skewed files.  How
> would we do that?

I wasn't imagining adding extra checks (== more complexity). Instead I
was imagining something like a command that would mark a path to be
committed. I don't yet have a good suggestion for a short name for the
operation, but I'll call it "mark" for the sake of discussion. This mark
operation would be used similarly to update-index but instead of
storing into the index an object created from the current contents of
the specified path, it would simply mark the path in the index as
to-be-committed. When committing such a path later, the object would
be created based on the contents of the path at that time.
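
To show the difference in behaviour, here is a toy Python model. It is purely hypothetical: "mark" does not exist in git, the class and method names are invented, and file contents are plain strings rather than objects in a real index.

```python
class ToyRepo:
    """Toy model contrasting update-index with the hypothetical 'mark'."""

    def __init__(self):
        self.worktree = {}   # path -> current file contents
        self.staged = {}     # path -> contents snapshotted by update_index
        self.marked = set()  # paths flagged by the hypothetical "mark"
        self.head = {}       # path -> contents in the last commit

    def update_index(self, path):
        # Snapshot the contents now; later edits make the file "skewed".
        self.staged[path] = self.worktree[path]

    def mark(self, path):
        # Record only the name; contents are read at commit time.
        self.marked.add(path)

    def commit(self):
        self.head.update(self.staged)
        for path in self.marked:
            self.head[path] = self.worktree[path]
        self.staged.clear()
        self.marked.clear()
        return dict(self.head)
```

If a file is edited after update_index, the commit records the older snapshot; if it is edited after mark, the commit records the newer contents, so no skew can arise.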

So I imagined eliminating skewed files first by providing operations
based around "mark" rather than update-index, (since "mark" avoids all
of the confusing oops-I-committed-stale-file-contents scenarios), and
second by making all commands that update the index from the object DB
also update the working directory, (effectively making git-read-tree
always act according to its current -u).

But as a prerequisite, this kind of plan would require the user to
never actually _want_ to stash skewed contents in the index. On a
separate branch of the current thread, Linus has said he likes to do
that, so I'll continue to discuss that there, and before the outcome
of that discussion, this idea need not even be considered further.

> 1. "git commit" is the traditional one; it commits the current index.
> 
> 2. "git commit --also fileA..." updates fileA... on top of the current
> 
> 3. "git commit fileA..." initializes a temporary index from the
> 
> 4. "git commit -a" by definition would not have skewed files and there
>    is nothing to check.

The one comment I have about this proposal is a certain lack of
orthogonality. Namely the base "commit" performs one operation,
(committing the contents of the index), and "commit --also" performs
that same operation plus something more (that much is good so
far). The problem starts with "commit file" which does not perform the
base operation at all, but just does something different. Similarly,
"commit -a" is also doing something different, (its behavior can be
described as an additional step performed _before_ the base "commit"
but could also be described as an operation independent of the
original state of the index, if I'm not mistaken).

Before "-a" existed, there was better orthogonality, but apparently
there wasn't a good fit with what some users wanted to do, (hence the
addition of "-a" and the recent proposal of yet more variations on
"commit").

>         $ git diff --cached
>         $ git commit
...
>         $ git diff HEAD
>         $ git commit -a
...
> For that you may need to do something like:
> 
> 	git-diff-index --cached HEAD ;# already in index but do not look at A
>         git-diff-index HEAD -- A ;# and path A is taken from working tree
> 
> which is a bit cumbersome.
> 
> Without --also (the new semantics), the check would be
> straightforward:
..
>  +++++  $ git diff HEAD -- A
> 	$ git commit A

Thanks for the examples. If nothing else, I hope the above makes clear
that it's not always obvious how to achieve a preview diff of a
commit. I would love to see the number of fundamental variations of
"commit" shrink rather than grow, but especially if it does grow, I
think it will always be important for users to be able to easily view
"status" and "diff" previews of commits, (preferably by providing the
same arguments to some 'preview' commands as will be passed to
commit).

-Carl


* RE: git bisect and the merge from hell
From: Linus Torvalds @ 2006-02-03 23:44 UTC (permalink / raw)
  To: Luck, Tony; +Cc: git
In-Reply-To: <B8E391BBE9FE384DAA4C5C003888BE6F059F4AF6@scsmsx401.amr.corp.intel.com>

On Fri, 3 Feb 2006, Luck, Tony wrote:
> 
> So Len's mega-octopus merge wasn't a problem at all, but this is still
> all his fault :-)  I'll go beat on him.

Note that this _can_ be a problem with huge octopus merges.

If some bug only appears as a result of the interaction of two branches, 
doing a 12-way merge will make it harder to debug. Doing a "git bisect" 
will (correctly) pinpoint the merge as being the problem, but after that 
you're on your own as to how to debug it.

So _if_ it had been a merge error, there are two issues with that:

 - debugging merges is usually a bit less straightforward than debugging a 
   single well-defined changeset anyway.

 - especially an octopus-merge will cause "git-bisect" to be less 
   efficient, since it cannot be bisected, so if the bug is in the merge 
   itself, it will ask you to test _every_ _single_ top-of-branch before 
   the merge.

(Normally, testing 12 kernels would zoom in on a bug from 10,000 feet, and 
you'd have bisected a massive four-thousand commits. So having to test 12 
branch heads just to pinpoint a _single_ commit is "unusually expensive" 
by any standard for git bisection).
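
The arithmetic behind that comparison, as a small Python sketch. The function names are made up, and the model is idealized: it treats bisection as a plain binary search over the candidate commits and ignores the shape of the real history.

```python
def bisect_tests_needed(num_commits):
    """Worst-case tests to isolate one bad commit by halving the candidates."""
    # Equals ceil(log2(num_commits)) for num_commits >= 1, in integer math.
    return max(num_commits - 1, 0).bit_length()


def octopus_tests_needed(num_parents):
    """An N-way merge cannot be halved: every merged branch tip gets tested."""
    return num_parents
```

With these, twelve tests cover 4096 candidate commits in the normal case (bisect_tests_needed(4096) is 12), while the same twelve tests only clear a single 12-way merge.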

Anyway, had it been a merge bug, you should then have done:

 - check if it's simply a mis-merge. Do "git-diff-tree --cc" to see if 
   there were any conflicts, and check them out more closely to see 
   if maybe they were incorrectly fixed up.

   Normally, an octopus merge will never have any actual _manual_ 
   conflicts (the standard git tools shouldn't allow it), but there can 
   still be several branches that touch the same area and that could have 
   merged strangely.

If that doesn't get you anywhere, you'll literally have to go to the next 
step:

 - re-do the merges one by one, until the bug appears, or, if it's not 
   there once you've re-done them all, check what the differences are 
   (there _should_ be none, but see above on doing mis-merging) with the 
   final octopus one.

Anyway, for "normal" bugs (like this one apparently is), git-bisect 
shouldn't ever pinpoint a merge, since the bug hopefully was introduced 
somewhere _during_ the branch development, and not when it was merged 
back. Hopefully.

Anyway. The message you should take home from this is that "git bisect" 
handles merges perfectly well, and that at worst it might be less 
efficient and harder to debug - especially for octopus merges - but that 
both of those problems are likely (a) rare and (b) not insurmountable.

		Linus
