Git development

* Re: [PATCH] CMIT_FMT_EMAIL: Q-encode Subject: and display-name part of From: fields.
From: Rocco Rutte @ 2006-05-16 10:49 UTC (permalink / raw)
  To: git
In-Reply-To: <7vmzdi9ssv.fsf@assigned-by-dhcp.cox.net>

Hi,

* Junio C Hamano [06-05-16 03:18:24 -0700] wrote:

[...]

> Thoughts?

> If we decide to do the header formatting here, there are two
> further enhancements that need to be done:

> (1) The charset must be configurable for projects that use
>     encoding different from UTF-8, perhaps with the .git/config
>     [i18n] commitEncoding.  It is only a convention, not a hard
>     rule, to use UTF-8 for the metainformation.

Writing an encoder that fully conforms to RFC 2047 is a mess. Not so
much because the algorithms are difficult, but because there are many
things to take care of if you want to do it right.

For example, an encoded word may be at most 75 characters long. For
names this is probably not an issue, but for subjects it is. I didn't
really check whether your patch produces only the minimum encoding
(i.e. encodes only those words that need it, and not just all words
with '_' or '=20' in between them), but if not, 75 isn't that much
after all. And you may need to think about header folding (and
unfolding for reading it back in).

Also, supporting any character set (via iconv()) blows up the
implementation. There are character sets for which other RFCs define the
encoding method, so using only quoted-printable is not fully correct in
all possible cases.

And, combined with the first point, supporting several character sets
can really become a mess: you need to produce several encoded words
because the input would otherwise exceed the RFC limits, and for
multi-byte character sets you mustn't break within a multi-byte
character sequence, only at character boundaries. So you need a generic
way to detect the byte size of a character in any supported character
set.

With just the UTF-8 encoding all of this is pretty simple though.
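To illustrate why the UTF-8-only case is simple, here is a minimal Python sketch (not part of the patch; the name split_utf8 and the byte limit are made up for this example). It relies only on the fact that UTF-8 continuation bytes have the bit pattern 10xxxxxx, so a character boundary can be found by stepping back over them.

```python
def split_utf8(raw, limit):
    """Split UTF-8 bytes into chunks of at most `limit` bytes,
    cutting only at character boundaries."""
    chunks = []
    start = 0
    while start < len(raw):
        end = min(start + limit, len(raw))
        # Step back while the byte at `end` is a continuation byte
        # (10xxxxxx), i.e. while the cut would land inside a character.
        while start < end < len(raw) and (raw[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            raise ValueError("limit is smaller than one encoded character")
        chunks.append(raw[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    for chunk in split_utf8("Sandström åäö".encode("utf-8"), 4):
        print(chunk, chunk.decode("utf-8"))
```

Other multi-byte character sets have no such self-synchronizing property, which is the generic detection problem described above.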

I would rather try to find a way to implement this in a scripting
language that already has standard modules for this or makes it easy to
write one. In C this gets quite lengthy...
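As a rough idea of what the scripting-language route could look like, here is a sketch using Python's standard email.header module. The helper names encode_header_value and decode_header_value are invented for this example, and the sample subject is made up; nothing here is taken from the patch under discussion.

```python
from email.header import Header, decode_header, make_header


def encode_header_value(text, header_name="Subject"):
    """Return `text` as RFC 2047 encoded words, folded at 76 columns."""
    header = Header(text, charset="utf-8",
                    header_name=header_name, maxlinelen=76)
    return header.encode()


def decode_header_value(value):
    """Unfold and decode a header value produced by encode_header_value."""
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    name = encode_header_value("Lukas Sandström", header_name="From")
    print(name)
    print(decode_header_value(name))

    # A long subject is split into several encoded words, one per line.
    subject = encode_header_value("[PATCH] " + "räksmörgås " * 12)
    print(subject)
```

The module takes care of folding and of not splitting inside a character; whether its choice of Q or base64 encoding is what we want for patches is a separate question.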

   bye, Rocco
-- 
:wq!

^ permalink raw reply

* Re: [PATCH] CMIT_FMT_EMAIL: Q-encode Subject: and display-name part of From: fields.
From: Jakub Narebski @ 2006-05-16 10:38 UTC (permalink / raw)
  To: git
In-Reply-To: <7vmzdi9ssv.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> By convention, the commit message and the author/committer names
> in the commit objects are UTF-8 encoded.  When formatting for
> e-mails, Q-encode them according to RFC 2047.
> 
> While we are at it, generate the content-type and
> content-transfer-encoding headers as well.
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>
> 
> ---
> 
>  With this patch, the output formatted with
> 
> git show --pretty=email --patch-with-stat 9d7f73d4
> 
>  would start like this:
> 
>    From 9d7f73d43fa49d0d2f5a8cfcce9d659e8ad2d265  Thu Apr 7 15:13:13 2005
>    From: =?utf-8?q?Lukas_Sandstr=C3=B6m?= <lukass@etek.chalmers.se>
>    Date: Sat, 25 Feb 2006 12:20:13 +0100
>    Subject: [PATCH] git-fetch: print the new and old ref when fast-forwarding 
>    Content-Type: text/plain; charset=UTF-8 
>    Content-Transfer-Encoding: 8bit

I guess that we also need

     MIME-Version: 1.0

(from what I remember of the trouble with Outlook Express not sending all
the required headers, and tin not working properly).

If I remember correctly, encoding headers using quoted-printable is needed
only because the headers come before the charset is declared. IIRC there was
a proposal to use UTF-8 for headers regardless of the charset used for the
body of the message.
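For comparison, here is what a stock MIME library emits for a UTF-8, 8bit text body. This is only a sketch using Python's standard email.message module, with a made-up message body; it says nothing about what format-patch itself should do.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = ("[PATCH] git-fetch: print the new and old ref "
                  "when fast-forwarding")
# Ask for a UTF-8 body sent as-is (8bit) rather than base64 or
# quoted-printable.
msg.set_content("Signed-off-by: Lukas Sandström\n",
                charset="utf-8", cte="8bit")

# The library adds MIME-Version next to the two content headers.
for name in ("MIME-Version", "Content-Type", "Content-Transfer-Encoding"):
    print("%s: %s" % (name, msg[name]))
```

The point is only that MIME-Version: 1.0 comes along with Content-Type and Content-Transfer-Encoding there.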

P.S. Should we set User-Agent header as well?
-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* [PATCH] CMIT_FMT_EMAIL: Q-encode Subject: and display-name part of From: fields.
From: Junio C Hamano @ 2006-05-16 10:18 UTC (permalink / raw)
  To: git

By convention, the commit message and the author/committer names
in the commit objects are UTF-8 encoded.  When formatting for
e-mails, Q-encode them according to RFC 2047.

While we are at it, generate the content-type and
content-transfer-encoding headers as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 With this patch, the output formatted with

	git show --pretty=email --patch-with-stat 9d7f73d4

 would start like this:

   From 9d7f73d43fa49d0d2f5a8cfcce9d659e8ad2d265  Thu Apr 7 15:13:13 2005
   From: =?utf-8?q?Lukas_Sandstr=C3=B6m?= <lukass@etek.chalmers.se>
   Date: Sat, 25 Feb 2006 12:20:13 +0100
   Subject: [PATCH] git-fetch: print the new and old ref when fast-forwarding
   Content-Type: text/plain; charset=UTF-8
   Content-Transfer-Encoding: 8bit

 This is marked RFC because I am not convinced if this kind of
 header formatting should be done by format-patch; we might be
 better off leaving the proper massaging to whatever downstream
 program that reads its output (e.g. send-email or imap-send).
 We produce the mbox format (and that is a requirement -- its
 output should be consumable by git-am), so the downstream needs
 to strip off the initial UNIX-From line at least anyway.

 Thoughts?

 If we decide to do the header formatting here, there are two
 further enhancements that need to be done:

 (1) The charset must be configurable for projects that use
     encoding different from UTF-8, perhaps with the .git/config
     [i18n] commitEncoding.  It is only a convention, not a hard
     rule, to use UTF-8 for the metainformation.

 (2) Some projects, notably Wine, seem to prefer patches to be
     sent as attachments, and we have support for that in the
     script version of format-patch.  We would want to have the
     same here.  This needs to be an option; define a new
     format, CMIT_FMT_MIME, and invoke it with --pretty=mime.

     Ideally we would want to say, in the body part header for
     the attachment, that the type of the payload is a raw 8bit
     text/patch without any specific charset (if the upstream
     project has a UTF-8 encoded file, you should not send in a
     patch in iso-8859-1 and expect somebody to automagically
     transcode your patch -- the patch is applied as is and MTA
     should not molest it).

 The RFC2047 q-encoding code definitely needs to be audited by
 an RFC lawyer.  I used to be one myself but I lost my edge and
 patience these days.
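 For readers who want to poke at the encoding rule without building
 git, the following Python sketch mirrors what add_rfc2047() in the
 patch below does. The function name q_encode_utf8 is made up for this
 example, and like the C code it does no line-length limiting or
 folding.

```python
def q_encode_utf8(text):
    """Q-encode `text` the way add_rfc2047() does: only if needed,
    and then as one single encoded word."""
    raw = text.encode("utf-8")
    # Quote only if there is a non-ASCII byte, or a literal "=?" that
    # could be mistaken for the start of an encoded word.
    if not (any(b & 0x80 for b in raw) or b"=?" in raw):
        return text
    out = []
    for b in raw:
        if b & 0x80 or b in b"=?_":
            out.append("=%02X" % b)   # special byte -> =XX
        elif b == 0x20:
            out.append("_")           # space -> underscore
        else:
            out.append(chr(b))
    return "=?utf-8?q?" + "".join(out) + "?="


if __name__ == "__main__":
    print(q_encode_utf8("Lukas Sandström"))
    print(q_encode_utf8("Junio C Hamano"))
```

 The first call reproduces the display name in the From: line of the
 sample output above; the second returns its argument unchanged.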

diff --git a/commit.c b/commit.c
index 93b3903..dee5756 100644
--- a/commit.c
+++ b/commit.c
@@ -413,6 +413,46 @@ static int get_one_line(const char *msg,
 	return ret;
 }
 
+static int is_rfc2047_special(char ch)
+{
+	return ((ch & 0x80) || (ch == '=') || (ch == '?') || (ch == '_'));
+}
+
+static int add_rfc2047(char *buf, const char *line, int len)
+{
+	char *bp = buf;
+	int i, needquote;
+	static const char q_utf8[] = "=?utf-8?q?";
+
+	for (i = needquote = 0; !needquote && i < len; i++) {
+		unsigned ch = line[i];
+		if (ch & 0x80)
+			needquote++;
+		if ((i + 1 < len) &&
+		    (ch == '=' && line[i+1] == '?'))
+			needquote++;
+	}
+	if (!needquote)
+		return sprintf(buf, "%.*s", len, line);
+
+	memcpy(bp, q_utf8, sizeof(q_utf8)-1);
+	bp += sizeof(q_utf8)-1;
+	for (i = 0; i < len; i++) {
+		unsigned ch = line[i] & 0xFF;
+		if (is_rfc2047_special(ch)) {
+			sprintf(bp, "=%02X", ch);
+			bp += 3;
+		}
+		else if (ch == ' ')
+			*bp++ = '_';
+		else
+			*bp++ = ch;
+	}
+	memcpy(bp, "?=", 2);
+	bp += 2;
+	return bp - buf;
+}
+
 static int add_user_info(const char *what, enum cmit_fmt fmt, char *buf, const char *line)
 {
 	char *date;
@@ -431,12 +471,26 @@ static int add_user_info(const char *wha
 	tz = strtol(date, NULL, 10);
 
 	if (fmt == CMIT_FMT_EMAIL) {
-		what = "From";
+		char *name_tail = strchr(line, '<');
+		int display_name_length;
+		if (!name_tail)
+			return 0;
+		while (line < name_tail && isspace(name_tail[-1]))
+			name_tail--;
+		display_name_length = name_tail - line;
 		filler = "";
+		strcpy(buf, "From: ");
+		ret = strlen(buf);
+		ret += add_rfc2047(buf + ret, line, display_name_length);
+		memcpy(buf + ret, name_tail, namelen - display_name_length);
+		ret += namelen - display_name_length;
+		buf[ret++] = '\n';
+	}
+	else {
+		ret = sprintf(buf, "%s: %.*s%.*s\n", what,
+			      (fmt == CMIT_FMT_FULLER) ? 4 : 0,
+			      filler, namelen, line);
 	}
-	ret = sprintf(buf, "%s: %.*s%.*s\n", what,
-		      (fmt == CMIT_FMT_FULLER) ? 4 : 0,
-		      filler, namelen, line);
 	switch (fmt) {
 	case CMIT_FMT_MEDIUM:
 		ret += sprintf(buf + ret, "Date:   %s\n", show_date(time, tz));
@@ -575,14 +629,24 @@ unsigned long pretty_print_commit(enum c
 			int slen = strlen(subject);
 			memcpy(buf + offset, subject, slen);
 			offset += slen;
+			offset += add_rfc2047(buf + offset, line, linelen);
+		}
+		else {
+			memset(buf + offset, ' ', indent);
+			memcpy(buf + offset + indent, line, linelen);
+			offset += linelen + indent;
 		}
-		memset(buf + offset, ' ', indent);
-		memcpy(buf + offset + indent, line, linelen);
-		offset += linelen + indent;
 		buf[offset++] = '\n';
 		if (fmt == CMIT_FMT_ONELINE)
 			break;
-		subject = NULL;
+		if (subject) {
+			static const char header[] =
+				"Content-Type: text/plain; charset=UTF-8\n"
+				"Content-Transfer-Encoding: 8bit\n";
+			memcpy(buf + offset, header, sizeof(header)-1);
+			offset += sizeof(header)-1;
+			subject = NULL;
+		}
 	}
 	while (offset && isspace(buf[offset-1]))
 		offset--;

^ permalink raw reply related

* Re: [PATCH] Update the documentation for git-merge-base
From: Junio C Hamano @ 2006-05-16  7:51 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: git
In-Reply-To: <20060516065452.GA5540@c165.ib.student.liu.se>

Fredrik Kuivinen <freku045@student.liu.se> writes:

>> See the big illustration at the top of the source for how you
>> can construct pathological case to defeat an attempt to
>> guarantee such.  --all guarantees that the output contains all
>> interesting ones, but does not guarantee the output has no
>> suboptimal merge bases.
>
> There are two examples at the top of the source. In the first one a
> least common ancestor is returned. As I interpret the second one, it
> is an example of how the old algorithm without the postprocessing step
> produced a common ancestor which is not least.

Ah, yes, I remember now.

The drawing was done while we were working on the solution to
that pathological case; mark_reachable_commits() solves that
horizon effect.

    http://article.gmane.org/gmane.comp.version-control.git/11410
    http://article.gmane.org/gmane.comp.version-control.git/11429
    http://article.gmane.org/gmane.comp.version-control.git/11552
    http://article.gmane.org/gmane.comp.version-control.git/11613

However, our inability to come up with a failing case does not
prove that no such case exists, so mathematically minded
people _might_ want to prove the algorithm is optimal.

Not me, though.

^ permalink raw reply

* Re: [PATCH 1/2] Handle branch names with slashes
From: Karl Hasselström @ 2006-05-16  7:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Catalin Marinas
In-Reply-To: <7v64k6ea8r.fsf@assigned-by-dhcp.cox.net>

On 2006-05-15 23:48:04 -0700, Junio C Hamano wrote:

> Karl Hasselström <kha@treskal.com> writes:
>
> > I had to change the patch@branch/top command-line syntax to
> > patch@branch%top, in order to get sane parsing. The /top variant
> > is still available for repositories that have no slashy branches;
> > it is disabled as soon as there exists at least one subdirectory
> > of refs/heads. Preferably, this compatibility hack can be killed
> > some time in the future.
>
> I wonder if using double-slashes is an easier alternative to type
> than '%', like "patch@branch//top". That way, you do not have to
> forbid per-cent sign in branch names.

Good argument. And // does look slightly better than %, too. But I'll
wait a few days this time, or else someone will surely come along with
an even better idea. :-)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply

* Re: newbie question
From: Matthias Kestenholz @ 2006-05-16  7:09 UTC (permalink / raw)
  To: Li Yang-r58472; +Cc: git
In-Reply-To: <9FCDBA58F226D911B202000BDBAD467308146E@zch01exm40.ap.freescale.net>

Hello,

* Li Yang-r58472 (LeoLi@freescale.com) wrote:
> I just started using git recently.  I have set up a public repository,
> and pushed a cloned open source repository to it.  As most documents
> suggest, I need to run a repack on the public repository.  Normally
> git-repack is run in the source directory (the parent directory of .git).
> For the public repository there is no source directory, and the *.git
> directory is the top-level directory.  Where am I supposed to run the
> git-repack command?

Do it like this:

$ ls
project.git
$ GIT_DIR=project.git git-repack -a -d

^ permalink raw reply

* newbie question
From: Li Yang-r58472 @ 2006-05-16  7:03 UTC (permalink / raw)
  To: git

I just started using git recently.  I have set up a public repository,
and pushed a cloned open source repository to it.  As most documents
suggest, I need to run a repack on the public repository.  Normally
git-repack is run in the source directory (the parent directory of .git).
For the public repository there is no source directory, and the *.git
directory is the top-level directory.  Where am I supposed to run the
git-repack command?

^ permalink raw reply

* Re: [PATCH] Update the documentation for git-merge-base
From: Fredrik Kuivinen @ 2006-05-16  6:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Kuivinen, git
In-Reply-To: <7vhd3qebuv.fsf@assigned-by-dhcp.cox.net>

On Mon, May 15, 2006 at 11:13:12PM -0700, Junio C Hamano wrote:
> Fredrik Kuivinen <freku045@student.liu.se> writes:
> 
> > Is the code guaranteed to return a least common ancestor? If that is
> > the case we should probably mention it in the documentation.
> 
> Unfortunately, no, if you mean by "least common" closest to the
> tips.
>

By "least" I mean the following:

C is a least common ancestor of A and B if:

* C is a common ancestor of A and B, and
* for every other common ancestor D (different from C) of A and B, C
  is not reachable from D.
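
To pin the definition down, here it is as a brute-force Python sketch. The commit graph is a plain dict from commit name to a list of parents, and the function names are made up for this example. It is only meant to state the definition executably; it is not how merge-base works.

```python
def ancestors(parents, commit):
    """Return every commit reachable from `commit`, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def least_common_ancestors(parents, a, b):
    """Common ancestors of a and b that are not reachable from any
    other common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(d != c and c in ancestors(parents, d)
                       for d in common)}


if __name__ == "__main__":
    # Criss-cross history: A and B each merge both P and Q.
    graph = {"A": ["P", "Q"], "B": ["P", "Q"],
             "P": ["R"], "Q": ["R"], "R": []}
    print(sorted(least_common_ancestors(graph, "A", "B")))
```

In the criss-cross graph, both P and Q are least common ancestors, while R is excluded because it is reachable from P (and from Q).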

> See the big illustration at the top of the source for how you
> can construct pathological case to defeat an attempt to
> guarantee such.  --all guarantees that the output contains all
> interesting ones, but does not guarantee the output has no
> suboptimal merge bases.

There are two examples at the top of the source. In the first one a
least common ancestor is returned. As I interpret the second one, it
is an example of how the old algorithm without the postprocessing step
produced a common ancestor which is not least.

Am I wrong? Do we have any cases where the current merge-base
algorithm gives us common ancestors which are not least?

- Fredrik

^ permalink raw reply

* Re: [PATCH 1/2] Handle branch names with slashes
From: Junio C Hamano @ 2006-05-16  6:48 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: git
In-Reply-To: <20060516063541.GA11218@backpacker.hemma.treskal.com>

Karl Hasselström <kha@treskal.com> writes:

> Teach stgit to handle branch names with slashes in them; that is,
> branches living in a subdirectory of .git/refs/heads.
>
> I had to change the patch@branch/top command-line syntax to
> patch@branch%top, in order to get sane parsing. The /top variant is
> still available for repositories that have no slashy branches; it is
> disabled as soon as there exists at least one subdirectory of
> refs/heads. Preferably, this compatibility hack can be killed some
> time in the future.

I wonder if using double-slashes is an easier alternative to
type than '%', like "patch@branch//top".  That way, you do not
have to forbid per-cent sign in branch names.
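
To show what I mean, here is a rough Python sketch of parsing the double-slash form. The function name parse_rev_double_slash is made up, it is not stgit code, and it assumes branch names never contain "//" themselves.

```python
def parse_rev_double_slash(rev):
    """Split 'patch@branch//patch_id' into (patch, branch, patch_id).
    Parts that are absent come back as None."""
    head, sep, patch_id = rev.rpartition("//")
    if not sep:
        # No "//" at all: the whole string is patch[@branch].
        head, patch_id = rev, None
    patch, at, branch = head.partition("@")
    return (patch or None, branch if at else None, patch_id)


if __name__ == "__main__":
    for rev in ("foo", "foo@master", "foo@x/y/z//top", "//bottom"):
        print(rev, "->", parse_rev_double_slash(rev))
```

Single slashes inside the branch name pass through untouched, so no character has to be reserved besides '@'.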

^ permalink raw reply

* [PATCH 2/2] Tests for branch names with slashes
From: Karl Hasselström @ 2006-05-16  6:37 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Wartan Hachaturow, git
In-Reply-To: <20060515105810.GA27077@diana.vm.bytemark.co.uk>

Test a number of operations on a repository that has branch names
containing slashes (that is, branches living in a subdirectory of
.git/refs/heads).

Signed-off-by: Karl Hasselström <kha@treskal.com>


---

The test also had to be changed to use % instead of #.

 t/t0001-subdir-branches.sh |   59 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 59 insertions(+), 0 deletions(-)
 create mode 100644 t/t0001-subdir-branches.sh

2278d3988ae3fee7624aac6db6bd92677173749f
diff --git a/t/t0001-subdir-branches.sh b/t/t0001-subdir-branches.sh
new file mode 100644
index 0000000..64f583c
--- /dev/null
+++ b/t/t0001-subdir-branches.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# Copyright (c) 2006 Karl Hasselström
+#
+
+test_description='Branch names containing slashes
+
+Test a number of operations on a repository that has branch names
+containing slashes (that is, branches living in a subdirectory of
+.git/refs/heads).'
+
+. ./test-lib.sh
+
+test_expect_success 'Create a patch' \
+  'stg init &&
+   echo "foo" > foo.txt &&
+   stg add foo.txt &&
+   stg new foo -m "Add foo.txt" &&
+   stg refresh'
+
+test_expect_success 'Old and new id with non-slashy branch' \
+  'stg id foo &&
+   stg id foo% &&
+   stg id foo/ &&
+   stg id foo%top &&
+   stg id foo/top &&
+   stg id foo@master &&
+   stg id foo@master%top &&
+   stg id foo@master/top'
+
+test_expect_success 'Clone branch to slashier name' \
+  'stg branch --clone x/y/z'
+
+test_expect_success 'Try new form of id with slashy branch' \
+  'stg id foo &&
+   stg id foo% &&
+   stg id foo%top &&
+   stg id foo@x/y/z &&
+   stg id foo@x/y/z%top'
+
+test_expect_failure 'Try old id with slashy branch' \
+  'stg id foo/ ||
+   stg id foo/top ||
+   stg id foo@x/y/z/top'
+
+test_expect_success 'Create patch in slashy branch' \
+  'echo "bar" >> foo.txt &&
+   stg new bar -m "Add another line" &&
+   stg refresh'
+
+test_expect_success 'Rename branches' \
+  'stg branch --rename master goo/gaa &&
+   test ! -e .git/refs/heads/master &&
+   stg branch --rename goo/gaa x1/x2/x3/x4 &&
+   test ! -e .git/refs/heads/goo &&
+   stg branch --rename x1/x2/x3/x4 servant &&
+   test ! -e .git/refs/heads/x1'
+
+test_done
-- 
1.3.2.g639c


-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply related

* [PATCH 1/2] Handle branch names with slashes
From: Karl Hasselström @ 2006-05-16  6:35 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Wartan Hachaturow, git
In-Reply-To: <20060515105810.GA27077@diana.vm.bytemark.co.uk>

Teach stgit to handle branch names with slashes in them; that is,
branches living in a subdirectory of .git/refs/heads.

I had to change the patch@branch/top command-line syntax to
patch@branch%top, in order to get sane parsing. The /top variant is
still available for repositories that have no slashy branches; it is
disabled as soon as there exists at least one subdirectory of
refs/heads. Preferably, this compatibility hack can be killed some
time in the future.
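
For anyone who wants to try the parsing rules in isolation, here is a standalone sketch of the regular expressions used by parse_rev() in the patch below, written for a current Python. The have_slashy_branches argument is an assumption of this sketch and stands in for the check for subdirectories of refs/heads; the names parse_rev_percent and RevParseError are likewise made up here.

```python
import re


class RevParseError(ValueError):
    """Raised when a revision spec cannot be parsed."""


def parse_rev_percent(rev, have_slashy_branches):
    """Parse patch[@branch][%patch_id]; also accept '/' in place of '%'
    when no branch name contains a slash."""
    if have_slashy_branches:
        branch_chars, mark = r"[^@%]", r"%"
    else:
        branch_chars, mark = r"[^@%/]", r"[/%]"
    patch_re = r"(?P<patch>[^@/%]+)"
    branch_re = r"@(?P<branch>" + branch_chars + r"+)"
    patch_id_re = mark + r"(?P<patch_id>[a-z.]*)"

    # A bare %patch_id refers to the current patch.
    m = re.match("^" + patch_id_re + "$", rev)
    if m:
        return None, None, m.group("patch_id")
    m = re.match("^" + patch_re + "(" + branch_re + ")?("
                 + patch_id_re + ")?$", rev)
    if m:
        return m.group("patch"), m.group("branch"), m.group("patch_id")
    raise RevParseError(rev)


if __name__ == "__main__":
    print(parse_rev_percent("foo@x/y/z%top", True))
    print(parse_rev_percent("foo@master/top", False))
```

With slashy branches present, "foo/top" is rejected, while "foo@x/y/z/top" is read as patch foo on a branch named x/y/z/top, which is the ambiguity that made the '%' separator necessary.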

Signed-off-by: Karl Hasselström <kha@treskal.com>


---

This is the same patch as before, but with # replaced with %.

 stgit/commands/branch.py |    5 ++
 stgit/commands/common.py |  103 ++++++++++++++++++++++++++--------------------
 stgit/commands/diff.py   |   12 +++--
 stgit/commands/files.py  |    4 +-
 stgit/commands/id.py     |    2 -
 stgit/commands/mail.py   |    8 ++--
 stgit/git.py             |   42 +++++++++----------
 stgit/stack.py           |   21 ++++++---
 stgit/utils.py           |   88 +++++++++++++++++++++++++++++++++++++--
 9 files changed, 193 insertions(+), 92 deletions(-)

76545c189be3a091ab62b112f1a841473600d35c
diff --git a/stgit/commands/branch.py b/stgit/commands/branch.py
index 2218bbb..d348409 100644
--- a/stgit/commands/branch.py
+++ b/stgit/commands/branch.py
@@ -172,7 +172,10 @@ def func(parser, options, args):
         if len(args) != 0:
             parser.error('incorrect number of arguments')
 
-        branches = os.listdir(os.path.join(basedir.get(), 'refs', 'heads'))
+        branches = []
+        basepath = os.path.join(basedir.get(), 'refs', 'heads')
+        for path, files, dirs in walk_tree(basepath):
+            branches += [os.path.join(path, f) for f in files]
         branches.sort()
 
         if branches:
diff --git a/stgit/commands/common.py b/stgit/commands/common.py
index c6ca514..a428dd9 100644
--- a/stgit/commands/common.py
+++ b/stgit/commands/common.py
@@ -18,7 +18,7 @@ along with this program; if not, write t
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 """
 
-import sys, os, re
+import sys, os, os.path, re
 from optparse import OptionParser, make_option
 
 from stgit.utils import *
@@ -34,54 +34,69 @@ class CmdException(Exception):
 
 
 # Utility functions
+class RevParseException(Exception):
+    """Revision spec parse error."""
+    pass
+
+def parse_rev(rev):
+    """Parse a revision specification into its
+    patchname@branchname%patch_id parts. If no branch name has a slash
+    in it, also accept / instead of %."""
+    files, dirs = list_files_and_dirs(os.path.join(basedir.get(),
+                                                   'refs', 'heads'))
+    if len(dirs) != 0:
+        # We have branch names with / in them.
+        branch_chars = '[^@%]'
+        patch_id_mark = '%'
+    else:
+        # No / in branch names.
+        branch_chars = '[^@%/]'
+        patch_id_mark = '[/%]'
+    patch_re = r'(?P<patch>[^@/%]+)'
+    branch_re = r'@(?P<branch>%s+)' % branch_chars
+    patch_id_re = r'%s(?P<patch_id>[a-z.]*)' % patch_id_mark
+
+    # Try %patch_id.
+    m = re.match(r'^%s$' % patch_id_re, rev)
+    if m:
+        return None, None, m.group('patch_id')
+
+    # Try patch[@branch][%patch_id].
+    m = re.match(r'^%s(%s)?(%s)?$' % (patch_re, branch_re, patch_id_re), rev)
+    if m:
+        return m.group('patch'), m.group('branch'), m.group('patch_id')
+
+    # No, we can't parse that.
+    raise RevParseException
+
 def git_id(rev):
     """Return the GIT id
     """
     if not rev:
         return None
-    
-    rev_list = rev.split('/')
-    if len(rev_list) == 2:
-        patch_id = rev_list[1]
-        if not patch_id:
-            patch_id = 'top'
-    elif len(rev_list) == 1:
-        patch_id = 'top'
-    else:
-        patch_id = None
-
-    patch_branch = rev_list[0].split('@')
-    if len(patch_branch) == 1:
-        series = crt_series
-    elif len(patch_branch) == 2:
-        series = stack.Series(patch_branch[1])
-    else:
-        raise CmdException, 'Unknown id: %s' % rev
-
-    patch_name = patch_branch[0]
-    if not patch_name:
-        patch_name = series.get_current()
-        if not patch_name:
-            raise CmdException, 'No patches applied'
-
-    # patch
-    if patch_name in series.get_applied() \
-           or patch_name in series.get_unapplied():
-        if patch_id == 'top':
-            return series.get_patch(patch_name).get_top()
-        elif patch_id == 'bottom':
-            return series.get_patch(patch_name).get_bottom()
-        # Note we can return None here.
-        elif patch_id == 'top.old':
-            return series.get_patch(patch_name).get_old_top()
-        elif patch_id == 'bottom.old':
-            return series.get_patch(patch_name).get_old_bottom()
-
-    # base
-    if patch_name == 'base' and len(rev_list) == 1:
-        return read_string(series.get_base_file())
-
-    # anything else failed
+    try:
+        patch, branch, patch_id = parse_rev(rev)
+        if branch == None:
+            series = crt_series
+        else:
+            series = stack.Series(branch)
+        if patch == None:
+            patch = series.get_current()
+            if not patch:
+                raise CmdException, 'No patches applied'
+        if patch in series.get_applied() or patch in series.get_unapplied():
+            if patch_id in ['top', '', None]:
+                return series.get_patch(patch).get_top()
+            elif patch_id == 'bottom':
+                return series.get_patch(patch).get_bottom()
+            elif patch_id == 'top.old':
+                return series.get_patch(patch).get_old_top()
+            elif patch_id == 'bottom.old':
+                return series.get_patch(patch).get_old_bottom()
+        if patch == 'base' and patch_id == None:
+            return read_string(series.get_base_file())
+    except RevParseException:
+        pass
     return git.rev_parse(rev + '^{commit}')
 
 def check_local_changes():
diff --git a/stgit/commands/diff.py b/stgit/commands/diff.py
index 7dc6c5d..e465e7a 100644
--- a/stgit/commands/diff.py
+++ b/stgit/commands/diff.py
@@ -33,12 +33,12 @@ or a tree-ish object and another tree-is
 be given to restrict the diff output. The tree-ish object can be a
 standard git commit, tag or tree. In addition to these, the command
 also supports 'base', representing the bottom of the current stack,
-and '[patch]/[bottom | top]' for the patch boundaries (defaulting to
+and '[patch][%[bottom | top]]' for the patch boundaries (defaulting to
 the current one):
 
-rev = '([patch]/[bottom | top]) | <tree-ish> | base'
+rev = '([patch][%[bottom | top]]) | <tree-ish> | base'
 
-If neither bottom or top are given but a '/' is present, the command
+If neither bottom or top are given but a '%' is present, the command
 shows the specified patch (defaulting to the current one)."""
 
 options = [make_option('-r', metavar = 'rev1[:[rev2]]', dest = 'revs',
@@ -55,10 +55,10 @@ def func(parser, options, args):
         rev_list = options.revs.split(':')
         rev_list_len = len(rev_list)
         if rev_list_len == 1:
-            if rev_list[0][-1] == '/':
+            if rev_list[0][-1] in ['/', '%']:
                 # the whole patch
-                rev1 = rev_list[0] + 'bottom'
-                rev2 = rev_list[0] + 'top'
+                rev1 = rev_list[0][:-1] + '%bottom'
+                rev2 = rev_list[0][:-1] + '%top'
             else:
                 rev1 = rev_list[0]
                 rev2 = None
diff --git a/stgit/commands/files.py b/stgit/commands/files.py
index 0694d83..a20ce96 100644
--- a/stgit/commands/files.py
+++ b/stgit/commands/files.py
@@ -53,8 +53,8 @@ def func(parser, options, args):
     else:
         parser.error('incorrect number of arguments')
 
-    rev1 = git_id('%s/bottom' % patch)
-    rev2 = git_id('%s/top' % patch)
+    rev1 = git_id('%s%%bottom' % patch)
+    rev2 = git_id('%s%%top' % patch)
 
     if options.stat:
         print git.diffstat(rev1 = rev1, rev2 = rev2)
diff --git a/stgit/commands/id.py b/stgit/commands/id.py
index 1cf6ea6..1a5938b 100644
--- a/stgit/commands/id.py
+++ b/stgit/commands/id.py
@@ -28,7 +28,7 @@ usage = """%prog [options] [id]
 
 Print the hash value of a GIT id (defaulting to HEAD). In addition to
 the standard GIT id's like heads and tags, this command also accepts
-'base[@<branch>]' and '[<patch>[@<branch>]][/(bottom | top)]'. If no
+'base[@<branch>]' and '[<patch>[@<branch>]][%[bottom | top]]'. If no
 'top' or 'bottom' are passed and <patch> is a valid patch name, 'top'
 will be used by default."""
 
diff --git a/stgit/commands/mail.py b/stgit/commands/mail.py
index 5e01ea1..0d2c260 100644
--- a/stgit/commands/mail.py
+++ b/stgit/commands/mail.py
@@ -324,10 +324,10 @@ def __build_message(tmpl, patch, patch_n
                  'shortdescr':   short_descr,
                  'longdescr':    long_descr,
                  'endofheaders': headers_end,
-                 'diff':         git.diff(rev1 = git_id('%s/bottom' % patch),
-                                          rev2 = git_id('%s/top' % patch)),
-                 'diffstat':     git.diffstat(rev1 = git_id('%s/bottom'%patch),
-                                              rev2 = git_id('%s/top' % patch)),
+                 'diff':         git.diff(rev1 = git_id('%s%%bottom' % patch),
+                                          rev2 = git_id('%s%%top' % patch)),
+                 'diffstat':     git.diffstat(rev1 = git_id('%s%%bottom'%patch),
+                                              rev2 = git_id('%s%%top' % patch)),
                  'date':         email.Utils.formatdate(localtime = True),
                  'version':      version_str,
                  'patchnr':      patch_nr_str,
diff --git a/stgit/git.py b/stgit/git.py
index 2884f36..716609c 100644
--- a/stgit/git.py
+++ b/stgit/git.py
@@ -225,7 +225,8 @@ def get_head():
 def get_head_file():
     """Returns the name of the file pointed to by the HEAD link
     """
-    return os.path.basename(_output_one_line('git-symbolic-ref HEAD'))
+    return strip_prefix('refs/heads/',
+                        _output_one_line('git-symbolic-ref HEAD'))
 
 def set_head_file(ref):
     """Resets HEAD to point to a new ref
@@ -233,7 +234,8 @@ def set_head_file(ref):
     # head cache flushing is needed since we might have a different value
     # in the new head
     __clear_head_cache()
-    if __run('git-symbolic-ref HEAD', [ref]) != 0:
+    if __run('git-symbolic-ref HEAD',
+             [os.path.join('refs', 'heads', ref)]) != 0:
         raise GitException, 'Could not set head to "%s"' % ref
 
 def __set_head(val):
@@ -272,6 +274,7 @@ def rev_parse(git_id):
 def branch_exists(branch):
     """Existence check for the named branch
     """
+    branch = os.path.join('refs', 'heads', branch)
     for line in _output_lines('git-rev-parse --symbolic --all 2>&1'):
         if line.strip() == branch:
             return True
@@ -282,12 +285,11 @@ def branch_exists(branch):
 def create_branch(new_branch, tree_id = None):
     """Create a new branch in the git repository
     """
-    new_head = os.path.join('refs', 'heads', new_branch)
-    if branch_exists(new_head):
+    if branch_exists(new_branch):
         raise GitException, 'Branch "%s" already exists' % new_branch
 
     current_head = get_head()
-    set_head_file(new_head)
+    set_head_file(new_branch)
     __set_head(current_head)
 
     # a checkout isn't needed if new branch points to the current head
@@ -297,22 +299,22 @@ def create_branch(new_branch, tree_id = 
     if os.path.isfile(os.path.join(basedir.get(), 'MERGE_HEAD')):
         os.remove(os.path.join(basedir.get(), 'MERGE_HEAD'))
 
-def switch_branch(name):
+def switch_branch(new_branch):
     """Switch to a git branch
     """
     global __head
 
-    new_head = os.path.join('refs', 'heads', name)
-    if not branch_exists(new_head):
-        raise GitException, 'Branch "%s" does not exist' % name
+    if not branch_exists(new_branch):
+        raise GitException, 'Branch "%s" does not exist' % new_branch
 
-    tree_id = rev_parse(new_head + '^{commit}')
+    tree_id = rev_parse(os.path.join('refs', 'heads', new_branch)
+                        + '^{commit}')
     if tree_id != get_head():
         refresh_index()
         if __run('git-read-tree -u -m', [get_head(), tree_id]) != 0:
             raise GitException, 'git-read-tree failed (local changes maybe?)'
         __head = tree_id
-    set_head_file(new_head)
+    set_head_file(new_branch)
 
     if os.path.isfile(os.path.join(basedir.get(), 'MERGE_HEAD')):
         os.remove(os.path.join(basedir.get(), 'MERGE_HEAD'))
@@ -320,25 +322,23 @@ def switch_branch(name):
 def delete_branch(name):
     """Delete a git branch
     """
-    branch_head = os.path.join('refs', 'heads', name)
-    if not branch_exists(branch_head):
+    if not branch_exists(name):
         raise GitException, 'Branch "%s" does not exist' % name
-    os.remove(os.path.join(basedir.get(), branch_head))
+    remove_file_and_dirs(os.path.join(basedir.get(), 'refs', 'heads'),
+                         name)
 
 def rename_branch(from_name, to_name):
     """Rename a git branch
     """
-    from_head = os.path.join('refs', 'heads', from_name)
-    if not branch_exists(from_head):
+    if not branch_exists(from_name):
         raise GitException, 'Branch "%s" does not exist' % from_name
-    to_head = os.path.join('refs', 'heads', to_name)
-    if branch_exists(to_head):
+    if branch_exists(to_name):
         raise GitException, 'Branch "%s" already exists' % to_name
 
     if get_head_file() == from_name:
-        set_head_file(to_head)
-    os.rename(os.path.join(basedir.get(), from_head), \
-              os.path.join(basedir.get(), to_head))
+        set_head_file(to_name)
+    rename(os.path.join(basedir.get(), 'refs', 'heads'),
+           from_name, to_name)
 
 def add(names):
     """Add the files or recursively add the directory contents
diff --git a/stgit/stack.py b/stgit/stack.py
index f83161b..49b50e7 100644
--- a/stgit/stack.py
+++ b/stgit/stack.py
@@ -443,8 +443,7 @@ class Series:
 
         os.makedirs(self.__patch_dir)
 
-        if not os.path.isdir(bases_dir):
-            os.makedirs(bases_dir)
+        create_dirs(bases_dir)
 
         create_empty_file(self.__applied_file)
         create_empty_file(self.__unapplied_file)
@@ -502,11 +501,14 @@ class Series:
         git.rename_branch(self.__name, to_name)
 
         if os.path.isdir(self.__series_dir):
-            os.rename(self.__series_dir, to_stack.__series_dir)
+            rename(os.path.join(self.__base_dir, 'patches'),
+                   self.__name, to_stack.__name)
         if os.path.exists(self.__base_file):
-            os.rename(self.__base_file, to_stack.__base_file)
+            rename(os.path.join(self.__base_dir, 'refs', 'bases'),
+                   self.__name, to_stack.__name)
         if os.path.exists(self.__refs_dir):
-            os.rename(self.__refs_dir, to_stack.__refs_dir)
+            rename(os.path.join(self.__base_dir, 'refs', 'patches'),
+                   self.__name, to_stack.__name)
 
         self.__init__(to_name)
 
@@ -560,16 +562,19 @@ class Series:
             else:
                 print 'Patch directory %s is not empty.' % self.__name
             if not os.listdir(self.__series_dir):
-                os.rmdir(self.__series_dir)
+                remove_dirs(os.path.join(self.__base_dir, 'patches'),
+                            self.__name)
             else:
                 print 'Series directory %s is not empty.' % self.__name
             if not os.listdir(self.__refs_dir):
-                os.rmdir(self.__refs_dir)
+                remove_dirs(os.path.join(self.__base_dir, 'refs', 'patches'),
+                            self.__name)
             else:
                 print 'Refs directory %s is not empty.' % self.__refs_dir
 
         if os.path.exists(self.__base_file):
-            os.remove(self.__base_file)
+            remove_file_and_dirs(
+                os.path.join(self.__base_dir, 'refs', 'bases'), self.__name)
 
     def refresh_patch(self, files = None, message = None, edit = False,
                       show_patch = False,
diff --git a/stgit/utils.py b/stgit/utils.py
index 5749b3b..68b8f58 100644
--- a/stgit/utils.py
+++ b/stgit/utils.py
@@ -1,6 +1,8 @@
 """Common utility functions
 """
 
+import errno, os, os.path
+
 __copyright__ = """
 Copyright (C) 2005, Catalin Marinas <catalin.marinas@gmail.com>
 
@@ -18,6 +20,12 @@ along with this program; if not, write t
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 """
 
+def mkdir_file(filename, mode):
+    """Opens filename with the given mode, creating the directory it's
+    in if it doesn't already exist."""
+    create_dirs(os.path.dirname(filename))
+    return file(filename, mode)
+
 def read_string(filename, multiline = False):
     """Reads the first line from a file
     """
@@ -32,7 +40,7 @@ def read_string(filename, multiline = Fa
 def write_string(filename, line, multiline = False):
     """Writes 'line' to file and truncates it
     """
-    f = file(filename, 'w+')
+    f = mkdir_file(filename, 'w+')
     if multiline:
         f.write(line)
     else:
@@ -42,7 +50,7 @@ def write_string(filename, line, multili
 def append_strings(filename, lines):
     """Appends 'lines' sequence to file
     """
-    f = file(filename, 'a+')
+    f = mkdir_file(filename, 'a+')
     for line in lines:
         print >> f, line
     f.close()
@@ -50,14 +58,14 @@ def append_strings(filename, lines):
 def append_string(filename, line):
     """Appends 'line' to file
     """
-    f = file(filename, 'a+')
+    f = mkdir_file(filename, 'a+')
     print >> f, line
     f.close()
 
 def insert_string(filename, line):
     """Inserts 'line' at the beginning of the file
     """
-    f = file(filename, 'r+')
+    f = mkdir_file(filename, 'r+')
     lines = f.readlines()
     f.seek(0); f.truncate()
     print >> f, line
@@ -67,4 +75,74 @@ def insert_string(filename, line):
 def create_empty_file(name):
     """Creates an empty file
     """
-    file(name, 'w+').close()
+    mkdir_file(name, 'w+').close()
+
+def list_files_and_dirs(path):
+    """Return the sets of filenames and directory names in a
+    directory."""
+    files, dirs = [], []
+    for fd in os.listdir(path):
+        full_fd = os.path.join(path, fd)
+        if os.path.isfile(full_fd):
+            files.append(fd)
+        elif os.path.isdir(full_fd):
+            dirs.append(fd)
+    return files, dirs
+
+def walk_tree(basedir):
+    """Starting in the given directory, iterate through all its
+    subdirectories. For each subdirectory, yield the name of the
+    subdirectory (relative to the base directory), the list of
+    filenames in the subdirectory, and the list of directory names in
+    the subdirectory."""
+    subdirs = ['']
+    while subdirs:
+        subdir = subdirs.pop()
+        files, dirs = list_files_and_dirs(os.path.join(basedir, subdir))
+        for d in dirs:
+            subdirs.append(os.path.join(subdir, d))
+        yield subdir, files, dirs
+
+def strip_prefix(prefix, string):
+    """Return string, without the prefix. Blow up if string doesn't
+    start with prefix."""
+    assert string.startswith(prefix)
+    return string[len(prefix):]
+
+def remove_dirs(basedir, dirs):
+    """Starting at join(basedir, dirs), remove the directory if empty,
+    and try the same with its parent, until we find a nonempty
+    directory or reach basedir."""
+    path = dirs
+    while path:
+        try:
+            os.rmdir(os.path.join(basedir, path))
+        except OSError:
+            return # can't remove nonempty directory
+        path = os.path.dirname(path)
+
+def remove_file_and_dirs(basedir, file):
+    """Remove join(basedir, file), and then remove the directory it
+    was in if empty, and try the same with its parent, until we find a
+    nonempty directory or reach basedir."""
+    os.remove(os.path.join(basedir, file))
+    remove_dirs(basedir, os.path.dirname(file))
+
+def create_dirs(directory):
+    """Create the given directory, if the path doesn't already exist."""
+    if directory:
+        create_dirs(os.path.dirname(directory))
+        try:
+            os.mkdir(directory)
+        except OSError, e:
+            if e.errno != errno.EEXIST:
+                raise e
+
+def rename(basedir, file1, file2):
+    """Rename join(basedir, file1) to join(basedir, file2), not
+    leaving any empty directories behind and creating any directories
+    necessary."""
+    full_file2 = os.path.join(basedir, file2)
+    create_dirs(os.path.dirname(full_file2))
+    os.rename(os.path.join(basedir, file1), full_file2)
+    remove_dirs(basedir, os.path.dirname(file1))
-- 
1.3.2.g639c


-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle
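
For readers who want to try the directory handling outside StGIT, here is a minimal Python 3 sketch of the rename() and remove_dirs() helpers from the patch above. It is an illustration, not the patch itself: os.makedirs(..., exist_ok=True) stands in for the patch's recursive create_dirs(), and the scratch directory and the ref names "topic/a" and "other/b" are made up for the example.

```python
import os
import tempfile

def remove_dirs(basedir, dirs):
    """Remove basedir/dirs if empty, then its parents, stopping at basedir."""
    path = dirs
    while path:
        try:
            os.rmdir(os.path.join(basedir, path))
        except OSError:
            return  # directory is not empty (or cannot be removed)
        path = os.path.dirname(path)

def rename(basedir, file1, file2):
    """Rename basedir/file1 to basedir/file2, creating the target's
    directories and pruning any source directories left empty."""
    full_file2 = os.path.join(basedir, file2)
    os.makedirs(os.path.dirname(full_file2), exist_ok=True)
    os.rename(os.path.join(basedir, file1), full_file2)
    remove_dirs(basedir, os.path.dirname(file1))

# Usage: a scratch directory plays the role of refs/heads.
heads = tempfile.mkdtemp()
os.makedirs(os.path.join(heads, "topic"))
with open(os.path.join(heads, "topic", "a"), "w") as f:
    f.write("0" * 40 + "\n")

rename(heads, os.path.join("topic", "a"), os.path.join("other", "b"))
print(sorted(os.listdir(heads)))
```

After the rename, the now-empty "topic" directory is gone and only "other" remains, which is the behaviour the patch relies on for branch names that contain slashes.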

^ permalink raw reply related

* Re: [PATCH] Update the documentation for git-merge-base
From: Junio C Hamano @ 2006-05-16  6:13 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: git
In-Reply-To: <20060516055815.GA4572@c165.ib.student.liu.se>

Fredrik Kuivinen <freku045@student.liu.se> writes:

> Is the code guaranteed to return a least common ancestor? If that is
> the case we should probably mention it in the documentation.

Unfortunately, no, if you mean by "least common" closest to the
tips.

See the big illustration at the top of the source for how you
can construct a pathological case to defeat an attempt to
guarantee such.  --all guarantees that the output contains all
interesting ones, but does not guarantee the output has no
suboptimal merge bases.
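
To make the distinction concrete, here is a small Python sketch of what "common ancestor" and "no suboptimal merge bases" mean on a toy history. It is a brute-force illustration only, not the algorithm git-merge-base uses; the parents mapping and the commit names are hypothetical.

```python
def ancestors(parents, commit):
    """Return the set of commits reachable from commit, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def common_ancestors(parents, a, b):
    """Commits reachable from both a and b through the parent relationship."""
    return ancestors(parents, a) & ancestors(parents, b)

def best_common_ancestors(parents, a, b):
    """Common ancestors that are not reachable from another common ancestor."""
    common = common_ancestors(parents, a, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}

# Hypothetical history: root <- x, and both A and B are children of x.
parents = {"A": ["x"], "B": ["x"], "x": ["root"], "root": []}
print(sorted(common_ancestors(parents, "A", "B")))
print(sorted(best_common_ancestors(parents, "A", "B")))
```

Here both "root" and "x" are common ancestors, but "root" is suboptimal because it is reachable from "x". The exhaustive filter above removes it at the cost of repeated graph walks.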

^ permalink raw reply

* [PATCH] Update the documentation for git-merge-base
From: Fredrik Kuivinen @ 2006-05-16  5:58 UTC (permalink / raw)
  To: junkio; +Cc: git


Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se>

---

Is the code guaranteed to return a least common ancestor? If that is
the case we should probably mention it in the documentation.


 Documentation/git-merge-base.txt |   18 ++++++++++++++----
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-merge-base.txt b/Documentation/git-merge-base.txt
index d1d56f1..6099be2 100644
--- a/Documentation/git-merge-base.txt
+++ b/Documentation/git-merge-base.txt
@@ -8,16 +8,26 @@ git-merge-base - Finds as good a common 
 
 SYNOPSIS
 --------
-'git-merge-base' <commit> <commit>
+'git-merge-base' [--all] <commit> <commit>
 
 DESCRIPTION
 -----------
-"git-merge-base" finds as good a common ancestor as possible. Given a
-selection of equally good common ancestors it should not be relied on
-to decide in any particular way.
+
+"git-merge-base" finds as good a common ancestor as possible between
+the two commits. That is, given two commits A and B 'git-merge-base A
+B' will output a commit which is reachable from both A and B through
+the parent relationship.
+
+Given a selection of equally good common ancestors it should not be
+relied on to decide in any particular way.
 
 The "git-merge-base" algorithm is still in flux - use the source...
 
+OPTIONS
+-------
+--all::
+	Output all common ancestors for the two commits instead of
+	just one.
 
 Author
 ------

^ permalink raw reply related

* What's in git.git
From: Junio C Hamano @ 2006-05-16  5:30 UTC (permalink / raw)
  To: git

* The 'maint' branch produced the v1.3.3 release I announced
  just a minute ago.

* The 'master' branch has these since the last announcement.

 - 64-bit (especially BE) fix for pack-objects (Dennis Stosberg)

 - Porting issues (Dennis Stosberg, Ben Clifford)

 - send-email updates (Eric Wong)

 - built-in "grep" (Linus and me)

 - "reset --hard" simplification (Linus and me)

 - configuration file syntax updates (Linus)

 - cvsserver updates (Martin Langhoff, Martyn Smith)

 - delta generation fix (Nicolas Pitre)

 - "git commit" novice usability fix (Sean Estabrooks)

* The 'next' branch, in addition, has these.

 - "diff revA:path1 revB:path2" fix

   When two blobs are given, it produced diff in reverse by
   mistake ("setup_revisions()" left the parsed objects in
   reverse order, and the caller forgot to reverse it).

   This is trivial and ready -- I just haven't got around to
   merging it up.

 - "rebase" help text updates (Sean Estabrooks)

   Ready.

 - "diff --summary" (Sean Estabrooks)

   I haven't really read the code yet, but it is low impact and
   should be ready.

 - strip leading tags/ from "git tag -l" output (Sean Estabrooks) 

   "git tag -l" as I wrote it originally stupidly left leading tags/
   in its output for all tags.  This removes it, and I think it
   is a sensible thing to do.

 - move remotes/ to config (Johannes)

   Now that the configuration syntax discussion is settled, thanks to
   Linus, we can start discussing per-branch attribute
   semantics.  This series is about the other half of the story.

   I think it is ready as it is, if we are not going to change
   the semantics of "remote"; except some people seem to want to
   reorganize the way per-branch property and remotes interact
   with each other.

 - Further optimization of pack-object (Nicolas Pitre)

   Testing.

 - "apply --cached"

   This allows "git apply" to apply a patch to the index without
   touching the working tree.  It is handy to prepare a tree to
   use in 3-way fallback, and updated "git am" takes advantage
   of it.  I am planning to use it for stash/unstash.

 - built-in format-patch (Johannes) 

   I think this is almost ready to supersede the script version,
   except that this does not do attachments.

   We need to do RFC2047 for headers as well.  I'd rather do it
   in this version than fix the script version.

 - cache-tree with read-tree/write-tree --prefix

   I haven't made any progress on this one, but haven't been
   bitten by it either, so it is a good sign.

^ permalink raw reply

* [ANNOUNCE] GIT 1.3.3
From: Junio C Hamano @ 2006-05-16  4:49 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

The latest maintenance release GIT 1.3.3 is available at the
usual places:

	http://www.kernel.org/pub/software/scm/git/

	git-1.3.3.tar.{gz,bz2}			(tarball)
	RPMS/$arch/git-*-1.3.3-1.$arch.rpm	(RPM)

This contains two notable non-fixes:

 (1) Future-proofing configuration file syntax by Linus.
     Nothing in the 1.3.X series takes advantage of it, but it is
     there so that 1.3.3 will not barf in a repository whose
     configuration file you previously manipulated with a later
     version of git.

 (2) core.prefersymlinkrefs configuration can be set in the
     configuration file while bisecting a project that wants to
     use .git/HEAD symbolic link in its historical version
     (notably Linux kernel around January this year).

----------------------------------------------------------------

Changes since v1.3.2 are as follows:

Ben Clifford:
      include header to define uint32_t, necessary on Mac OS X

Dennis Stosberg:
      Fix git-pack-objects for 64-bit platforms
      Fix compilation on newer NetBSD systems

Dmitry V. Levin:
      Separate object name errors from usage errors

Eric Wong:
      apply: fix infinite loop with multiple patches with --index
      Install git-send-email by default

Johannes Schindelin:
      repo-config: trim white-space before comment

Junio C Hamano:
      core.prefersymlinkrefs: use symlinks for .git/HEAD
      repo-config: document what value_regexp does a bit more clearly.
      Fix repo-config set-multivar error return path.
      Documentation: {caret} fixes (git-rev-list.txt)
      checkout: use --aggressive when running a 3-way merge (-m).
      Fix pack-index issue on 64-bit platforms a bit more portably.

Linus Torvalds:
      Fix "git diff --stat" with long filenames
      revert/cherry-pick: use aggressive merge.
      git config syntax updates

Martin Waitz:
      clone: keep --reference even with -l -s
      repack: honor -d even when no new pack was created

Matthias Lederhofer:
      core-tutorial.txt: escape asterisk

Pavel Roskin:
      Release config lock if the regex is invalid

Sean Estabrooks:
      Fix for config file section parsing.
      Another config file parsing fix.
      Ensure author & committer before asking for commit message.

Yakov Lerner:
      read-cache.c: use xcalloc() not calloc()

^ permalink raw reply

* Pickaxe usage question -- only matching on added string
From: Martin Langhoff @ 2006-05-16  3:37 UTC (permalink / raw)
  To: git

Documentation for diffcore-pickaxe (in Documentation/diffcore.txt) says:

When diffcore-pickaxe is in use, it checks if there are
filepairs whose "original" side has the specified string and
whose "result" side does not.  Such a filepair represents "the
string appeared in this changeset".  It also checks for the
opposite case that loses the specified string.

Now, is there a way to get diffcore to match only on 'added' (or on
'removed', for that matter)? I am trying to identify commits that added
patches to a project, and I seem to be getting matches that add and
remove when I do:

   git-whatchanged -p -C -S"\t" master

cheers,

martin
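
One way to picture the distinction being asked about is the Python sketch below. It classifies a filepair by whether the string is present on each side, so a caller could keep only the "added" cases. This is a toy model of the presence check described in the quoted documentation, not diffcore-pickaxe itself, and the function name and sample data are hypothetical.

```python
def pickaxe_kind(original, result, needle):
    """Classify a filepair by presence of needle on each side.

    Returns "added" if only the result side has it, "removed" if only
    the original side has it, and None if both or neither side has it.
    """
    had = needle in original
    has = needle in result
    if has and not had:
        return "added"
    if had and not has:
        return "removed"
    return None

# Hypothetical filepairs: (original contents, result contents).
filepairs = [
    ("int a;\n", "int a;\n\tint b;\n"),   # tab introduced
    ("\tint a;\n", "int a;\n"),           # tab lost
    ("\tint a;\n", "\tint a;\n\tb;\n"),   # tab on both sides
]
kinds = [pickaxe_kind(old, new, "\t") for old, new in filepairs]
print(kinds)
```

Filtering on kind == "added" would give the one-directional match, which the pickaxe as documented above does not offer on its own.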

^ permalink raw reply

* Re: Fix silly typo in new builtin grep
From: Linus Torvalds @ 2006-05-16  3:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vu07qfyj0.fsf@assigned-by-dhcp.cox.net>

On Mon, 15 May 2006, Junio C Hamano wrote:
> >
> > (In fact, I would say that doing the above command in just 4 seconds is 
> > damn impressive - it's a large code-base, and v2.6.13 is several months, 
> > and over 20 _thousand_ revisions ago).
> 
> That is a BS praise and you know it ;-).  You do not have delta
> chains that are 20k long, so grepping from the tree 10 revs ago
> and from the tree 20k revs ago would not make a difference.

Oh, I agree. I meant in a "general version-control sense". I doubt a lot 
of other version-control systems could do it. Git can, exactly because 
it's whole-file based, and our deltas are limited.

So it's not that "builtin-grep" is wonderful. It's that _git_ is
wonderful, and the builtin-grep just shows one of the end results.

That's why we have killer features. To show off.

(That said, git will slow down a tad too - the pack-file access won't be 
as optimized for an old version tree, and so you'll seek around some more 
for the cold-cache case).

		Linus

^ permalink raw reply

* Re: Fix silly typo in new builtin grep
From: Linus Torvalds @ 2006-05-16  3:27 UTC (permalink / raw)
  To: Morten Welinder; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <118833cc0605151910s7619ddf0x8f014adba2a1eba5@mail.gmail.com>

On Mon, 15 May 2006, Morten Welinder wrote:
>
> If I read the code right, it calls regexec for every single character
> on every single line.  No wonder that takes a while!  Just call it
> once and it'll search for its match quite nicely.

No, it calls it once per pattern per line.

But yes, it calls it once per line, instead of calling it on some bigger 
boundary. Partly because of the line-based output, partly probably because 
regexec() is not actually amenable to a "<buffer,size>" kind of usage, but 
is based on NUL-terminated strings.

> 1. If the pattern contains no regexp characters  (and that is very
>    common), do a strstr.
> 
> 2. If the pattern must start with a specific character, search for that
>    by itself.

Yeah, we could do some simple stuff, and see if it helps..

		Linus

^ permalink raw reply

* Re: Fix silly typo in new builtin grep
From: Junio C Hamano @ 2006-05-16  3:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605151743360.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> The "-F" flag apparently got mis-translated due to some over-eager 
> copy-paste work into a duplicate "-H" when using the external grep.

Thanks.  I've pushed it out to "master", along with some other
stuff.

> Me likee the new built-in grep. The ability to say
>
> 	git grep __make_request v2.6.13 -- '*.c'
>
> to grep for it in a specific version is well worth the fact that it 
> obviously ends up being slower than grepping in the currently checked-out 
> tree. It's doing a hell of a lot more, but despite that it's not at all 
> that slow.
>
> (In fact, I would say that doing the above command in just 4 seconds is 
> damn impressive - it's a large code-base, and v2.6.13 is several months, 
> and over 20 _thousand_ revisions ago).

That is a BS praise and you know it ;-).  You do not have delta
chains that are 20k long, so grepping from the tree 10 revs ago
and from the tree 20k revs ago would not make a difference.

It _would_ be impressive to CVS folks, but even there each path
would not have 20k revisions.  The kernel patches tend to touch
3 paths per patch on average, so 60k changes over 18k files
distributed unevenly -- my guess (I could count but haven't) is
probably 200 revisions at most for the most frequently touched file.
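
The back-of-the-envelope numbers above work out as follows. This is only the arithmetic from the paragraph, using its rounded figures; the 200-revision figure for the busiest file is a guess in the message and is not derived here.

```python
revisions = 20_000        # "over 20 thousand revisions"
paths_per_patch = 3       # average paths touched per patch
files = 18_000            # files in the tree

total_changes = revisions * paths_per_patch
average_per_file = total_changes / files
print(total_changes, round(average_per_file, 1))
```

So the average file has seen only a handful of changes, which is why per-file history depth is far smaller than the project-wide revision count.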

^ permalink raw reply

* Re: [PATCH] simple euristic for further free packing improvements
From: Nicolas Pitre @ 2006-05-16  2:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4pzqhh3t.fsf@assigned-by-dhcp.cox.net>

On Mon, 15 May 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > @@ -1038,8 +1038,8 @@ static int try_delta(struct unpacked *tr
> >  
> >  	/* Now some size filtering euristics. */
> >  	size = trg_entry->size;
> > -	max_size = size / 2 - 20;
> > -	if (trg_entry->delta)
> > +	max_size = (size/2 - 20) / (src_entry->depth + 1);
> > +	if (trg_entry->delta && trg_entry->delta_size <= max_size)
> >  		max_size = trg_entry->delta_size-1;
> >  	src_size = src_entry->size;
> >  	sizediff = src_size < size ? size - src_size : 0;
> 
> At first glance, this seems rather too aggressive.  It makes
> me wonder if it is a good balance to penalize the second
> generation base by requiring it to produce a small delta that is
> at most half the size we would normally allow (and the third
> generation a third), or maybe the penalty should kick in more
> gradually, like e.g.
> (max_depth * 2 - src_entry->depth) / (max_depth * 2).
> 
> Having said that, judging from your past patches, I learned to
> trust that you have tried tweaking this part and settled on this
> simplicity and elegance, so I'll take the patch as is -- if
> somebody wants to play with it that can always be done to
> further improve things.

Actually I didn't play with that part that much.  The only thing I tried 
besides this version was (size - 20) / (src_entry->depth + 1) but it 
produced larger packs than the current version.

So I thought it was better to provide a simple initial rule and leave 
possible improvements for later.


Nicolas

^ permalink raw reply

* Re: Fix silly typo in new builtin grep
From: Morten Welinder @ 2006-05-16  2:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0605151801100.3866@g5.osdl.org>

If I read the code right, it calls regexec for every single character
on every single line.  No wonder that takes a while!  Just call it
once and it'll search for its match quite nicely.

If that's not enough, the two obvious optimizations are...

1. If the pattern contains no regexp characters  (and that is very
    common), do a strstr.

2. If the pattern must start with a specific character, search for that
    by itself.
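
The first optimization can be sketched in Python as below: a literal pattern takes a plain substring search, and anything else falls back to the regex engine, still one call per line. This is a hedged illustration, not the builtin grep code. Python's re module stands in for regexec(), and the REGEX_META set is an assumption about which characters count as "regexp characters".

```python
import re

# Assumed set of characters that make a pattern non-literal for Python's re.
REGEX_META = set(".^$*+?{}[]\\|()")

def make_line_matcher(pattern):
    """Return a predicate on lines, using substring search when possible."""
    if not any(ch in REGEX_META for ch in pattern):
        return lambda line: pattern in line      # strstr-style fast path
    compiled = re.compile(pattern)
    return lambda line: compiled.search(line) is not None

def grep_lines(pattern, text):
    """Return (line number, line) pairs for matching lines, one test per line."""
    match = make_line_matcher(pattern)
    return [(n, line) for n, line in enumerate(text.splitlines(), 1)
            if match(line)]

print(grep_lines("__make_request", "int a;\n__make_request(q);\n"))
print(grep_lines("a.c", "abc\naxc\nac\n"))
```

The second optimization (scanning for a required first character) would need to inspect the compiled pattern, so it is left out of this sketch.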

M.

^ permalink raw reply

* Re: [PATCH] simple euristic for further free packing improvements
From: Junio C Hamano @ 2006-05-16  1:51 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605151129540.18071@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> Given that the early eviction of objects with maximum delta depth 
> may exhibit bad packing on its own, why not consider a bias against 
> deep base objects in try_delta() to mitigate that bad behavior?

This is really good stuff.  Thanks.  Oh, and thanks for
noticing my puzzlement expressed with "#if 0" ;-).

> @@ -1038,8 +1038,8 @@ static int try_delta(struct unpacked *tr
>  
>  	/* Now some size filtering euristics. */
>  	size = trg_entry->size;
> -	max_size = size / 2 - 20;
> -	if (trg_entry->delta)
> +	max_size = (size/2 - 20) / (src_entry->depth + 1);
> +	if (trg_entry->delta && trg_entry->delta_size <= max_size)
>  		max_size = trg_entry->delta_size-1;
>  	src_size = src_entry->size;
>  	sizediff = src_size < size ? size - src_size : 0;

At first glance, this seems rather too aggressive.  It makes
me wonder if it is a good balance to penalize the second
generation base by requiring it to produce a small delta that is
at most half the size we would normally allow (and the third
generation a third), or maybe the penalty should kick in more
gradually, like e.g.
(max_depth * 2 - src_entry->depth) / (max_depth * 2).

Having said that, judging from your past patches, I learned to
trust that you have tried tweaking this part and settled on this
simplicity and elegance, so I'll take the patch as is -- if
somebody wants to play with it that can always be done to
further improve things.
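
To compare the two penalties numerically, here is a small Python sketch. max_size_patch() follows the arithmetic in the hunk quoted above, and max_size_gradual() applies the more gradual factor suggested in this message. Both are illustrations rather than the pack-objects code: the function names are made up, integer floor division stands in for C integer division, and max_depth=10 is an assumed value chosen for the example.

```python
def max_size_patch(size, src_depth, trg_delta_size=None):
    """Delta size limit as in the quoted hunk: divide by (depth + 1)."""
    limit = (size // 2 - 20) // (src_depth + 1)
    # An existing delta only tightens the limit if it is already within it.
    if trg_delta_size is not None and trg_delta_size <= limit:
        limit = trg_delta_size - 1
    return limit

def max_size_gradual(size, src_depth, max_depth=10):
    """Hypothetical gradual variant: scale by (2*max - depth) / (2*max)."""
    return (size // 2 - 20) * (max_depth * 2 - src_depth) // (max_depth * 2)

# Limits for a 1000-byte target object at source depths 0, 1 and 2.
for depth in range(3):
    print(depth, max_size_patch(1000, depth), max_size_gradual(1000, depth))
```

With these inputs the patch's rule drops the limit from 480 to 240 to 160, while the gradual variant only goes from 480 to 456 to 432, which shows how much more sharply the reciprocal penalty bites at shallow depths.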

^ permalink raw reply
