Git development
* [PATCH 2/2] QP-encode email body
From: Karl Hasselström @ 2006-10-22 12:49 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git
In-Reply-To: <20061022124551.14051.25145.stgit@localhost>

From: Karl Hasselström <kha@treskal.com>

Some mail servers dislike the 8bit transfer encoding, so use
quoted-printable instead.

Signed-off-by: Karl Hasselström <kha@treskal.com>
---

 stgit/commands/mail.py |   16 +++++++++-------
 1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/stgit/commands/mail.py b/stgit/commands/mail.py
index b661308..885d5e9 100644
--- a/stgit/commands/mail.py
+++ b/stgit/commands/mail.py
@@ -15,7 +15,7 @@ along with this program; if not, write t
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 """
 
-import sys, os, re, time, datetime, smtplib, email.Header, email.Utils
+import sys, os, re, time, datetime, quopri, smtplib, email.Header, email.Utils
 from optparse import OptionParser, make_option
 
 from stgit.commands.common import *
@@ -253,7 +253,7 @@ def __build_extra_headers():
     """Build extra headers like content-type etc.
     """
     headers  = 'Content-Type: text/plain; charset=utf-8; format=fixed\n'
-    headers += 'Content-Transfer-Encoding: 8bit\n'
+    headers += 'Content-Transfer-Encoding: quoted-printable\n'
     headers += 'User-Agent: StGIT/%s\n' % version.version
 
     return headers
@@ -425,9 +425,9 @@ def encode_header(s, enc):
     else:
         return s
 
-def encode_headers(msg, enc):
-    """rfc2047-encode the headers of msg, assuming it is encoded in
-    enc."""
+def encode_message(msg, enc):
+    """rfc2047-encode the headers of msg, and quoted-printable-encode
+    the body. msg is assumed to be encoded in enc."""
     in_header = True
     lines = []
     for line in msg.splitlines(True):
@@ -436,6 +436,8 @@ def encode_headers(msg, enc):
                 line = encode_header(line, enc)
             else:
                 in_header = False
+        else:
+            line = quopri.encodestring(line)
         lines.append(line)
     return ''.join(lines)
 
@@ -497,7 +499,7 @@ def func(parser, options, args):
                 raise CmdException, 'No cover message template file found'
 
         msg_id = email.Utils.make_msgid('stgit')
-        msg = encode_headers(__build_cover(tmpl, total_nr, msg_id, options),
+        msg = encode_message(__build_cover(tmpl, total_nr, msg_id, options),
                              'UTF-8')
         from_addr, to_addr_list = __parse_addresses(msg)
 
@@ -524,7 +526,7 @@ def func(parser, options, args):
 
     for (p, patch_nr) in zip(patches, range(1, len(patches) + 1)):
         msg_id = email.Utils.make_msgid('stgit')
-        msg = encode_headers(__build_message(tmpl, p, patch_nr, total_nr,
+        msg = encode_message(__build_message(tmpl, p, patch_nr, total_nr,
                                              msg_id, ref_id, options), 'UTF-8')
         from_addr, to_addr_list = __parse_addresses(msg)
 

^ permalink raw reply related
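
For readers who want to see what the body encoding in this patch amounts to, here is a minimal sketch. It assumes Python 3 (the patch itself targets Python 2) and uses a made-up one-line body. Only `quopri.encodestring` comes from the patch; the variable names are illustrative.

```python
import quopri

# Hypothetical message body; the line below is sample data only.
body = "Signed-off-by: Karl Hasselström\n"

raw = body.encode("utf-8")          # the 8bit form that some servers dislike
encoded = quopri.encodestring(raw)  # quoted-printable, 7-bit safe

# Every byte of the encoded form is plain ASCII, and decoding is lossless.
assert all(b < 128 for b in encoded)
assert quopri.decodestring(encoded) == raw

print(encoded.decode("ascii"), end="")
```

The non-ASCII character is replaced by `=XX` escapes for each of its UTF-8 bytes, so the result can be sent with `Content-Transfer-Encoding: quoted-printable`.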

* [PATCH 1/2] RFC2047-encode email headers
From: Karl Hasselström @ 2006-10-22 12:49 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git
In-Reply-To: <20061022124551.14051.25145.stgit@localhost>

From: Karl Hasselström <kha@treskal.com>

Having non-ascii characters in email headers is illegal, but StGIT
currently does not care. I'm often bitten by this, since my name
doesn't fit in ascii.

This patch implements an encoding pass just before the email is sent
over the wire -- in particular, it comes after any interactive editing
and templates and such, so the user should never have to see the
rfc2047 encoding.

NOTE: The rfc2047 encoder needs to know the encoding of the input
string. This patch hard-codes this to utf8, since that should be by
far the most common non-ascii encoding, and since utf8 is already the
hardcoded character set for the email body. In the long run, we
probably want to get this from the locale, or from a command line
switch, or both.

Signed-off-by: Karl Hasselström <kha@treskal.com>
---

 stgit/commands/mail.py |   45 +++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/stgit/commands/mail.py b/stgit/commands/mail.py
index 34504e6..b661308 100644
--- a/stgit/commands/mail.py
+++ b/stgit/commands/mail.py
@@ -15,7 +15,7 @@ along with this program; if not, write t
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 """
 
-import sys, os, re, time, datetime, smtplib, email.Utils
+import sys, os, re, time, datetime, smtplib, email.Header, email.Utils
 from optparse import OptionParser, make_option
 
 from stgit.commands.common import *
@@ -403,6 +403,42 @@ def __build_message(tmpl, patch, patch_n
 
     return msg.strip('\n')
 
+def encode_header(s, enc):
+    """Take an entire e-mail header line, encoded in enc, and
+    rfc2047-encode it."""
+    def trans(s):
+        return str(email.Header.Header(unicode(s, enc)))
+    words = s.split(' ')
+    first_encode = len(words)
+    last_encode = -1
+    for i in xrange(len(words)):
+        ew = trans(words[i])
+        if ew != words[i]:
+            first_encode = min(first_encode, i)
+            last_encode = max(last_encode, i)
+    if first_encode <= last_encode:
+        return ' '.join(filter(
+            None,
+            [' '.join(words[:first_encode]),
+             trans(' '.join(words[first_encode:last_encode+1])),
+             ' '.join(words[last_encode+1:])]))
+    else:
+        return s
+
+def encode_headers(msg, enc):
+    """rfc2047-encode the headers of msg, assuming it is encoded in
+    enc."""
+    in_header = True
+    lines = []
+    for line in msg.splitlines(True):
+        if in_header:
+            if line.strip():
+                line = encode_header(line, enc)
+            else:
+                in_header = False
+        lines.append(line)
+    return ''.join(lines)
+
 def func(parser, options, args):
     """Send the patches by e-mail using the patchmail.tmpl file as
     a template
@@ -461,7 +497,8 @@ def func(parser, options, args):
                 raise CmdException, 'No cover message template file found'
 
         msg_id = email.Utils.make_msgid('stgit')
-        msg = __build_cover(tmpl, total_nr, msg_id, options)
+        msg = encode_headers(__build_cover(tmpl, total_nr, msg_id, options),
+                             'UTF-8')
         from_addr, to_addr_list = __parse_addresses(msg)
 
         # subsequent e-mails are seen as replies to the first one
@@ -487,8 +524,8 @@ def func(parser, options, args):
 
     for (p, patch_nr) in zip(patches, range(1, len(patches) + 1)):
         msg_id = email.Utils.make_msgid('stgit')
-        msg = __build_message(tmpl, p, patch_nr, total_nr, msg_id, ref_id,
-                              options)
+        msg = encode_headers(__build_message(tmpl, p, patch_nr, total_nr,
+                                             msg_id, ref_id, options), 'UTF-8')
         from_addr, to_addr_list = __parse_addresses(msg)
 
         # subsequent e-mails are seen as replies to the first one

^ permalink raw reply related
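
As a rough illustration of the rfc2047 encoding discussed in this patch, the following sketch encodes a single header value and decodes it again. It assumes Python 3, where the module is spelled `email.header` rather than the `email.Header` used in the patch. The sample value and variable names are illustrative, and the word-by-word splitting done by `encode_header` in the patch is not reproduced here.

```python
from email.header import Header, decode_header

# Sample non-ASCII header value (illustrative only).
name = "Karl Hasselström"

# Encode the value as an rfc2047 "encoded word", declaring it as UTF-8.
encoded = Header(name, "utf-8").encode()

# The encoded form is pure ASCII and therefore legal in a mail header.
assert all(ord(c) < 128 for c in encoded)

# Decoding yields the original bytes and the declared charset.
decoded_bytes, charset = decode_header(encoded)[0]
assert decoded_bytes.decode(charset) == name

print(encoded)
```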

* [PATCH 0/2] Make "stg mail" behave better with non-ascii characters
From: Karl Hasselström @ 2006-10-22 12:45 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git
In-Reply-To: <20061022121240.GA21084@diana.vm.bytemark.co.uk>

These two patches teach "stg mail" to rfc2047-escape non-ascii
characters in the mail headers (not doing so is illegal), and to
QP-encode the body (leaving it as 8bit is not well received by some
mail servers, notably vger).

The first patch is a resend, this time hopefully with my name intact.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply

* Re: VCS comparison table
From: Matthew D. Fuller @ 2006-10-22 12:46 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git
In-Reply-To: <87ac3p1jn7.wl%cworth@cworth.org>

[ Time to trim up CC's a bit ]

On Sat, Oct 21, 2006 at 01:47:08PM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> > I think we're getting into scratched-record-mode on this.
> 
> I apologize if I've come across as beating a dead horse on this.

Oh, I don't mean the whole topic in general.  It's just that there are
only so many ways one can say "revnos are only valid in certain
situations", and I really think we must have hit them all by now.  We
all agree on that; we just disagree (probably highly based on
differing workflows) on the commonness and extent of those situations.


> > B: Revnos are handier tools for [situation] and [situation] for
> >    [reason] and [reason].
> 
> I'm missing something:
> 
> I still haven't seen strong examples for this last claim. When are
> they handier?

This ties in a bit with what you say below, so I'll address it there.


> There's no doubt that there has been semantic confusion over the
> term branch that has been confounding communication on both sides.
  [...]
> Let me know if I botched any of that.

This seems correct; at least, it's correct enough to work from until
we find a detail wrong.


> But dropping a merged branch in bzr means throwing away the ability to
> reference any of its commits by its custom, branch-specific revision
> numbers.

True (though see below).


> And there is no simple way to correlate the numbers between
> branches.

Rather, unless you can one way or another access the branch the number
was for, there's NO way.


> Maybe you can argue that there isn't any centralization bias in bzr.
> But anyone that claims that the revnos. are stable really is talking
> from a standpoint that favors centralization.

I think it's using that 'c' word there that's causing contention here;
we're ascribing different meanings to it.

Revnos only apply to a specific "branch" (in this usage, I'm talking
about branch abstractly and somewhat specifically; more in a moment),
and so except by wild coincidence are only useful in talking about
that branch.  One of the two cases (the second discussed later) where
that's useful is when you have long-lived branches.  In git,
apparently, you don't have long-lived "branches" in this particular
meaning of the word, but the way people use bzr they do.  Perhaps this
is what you mean by 'centralization'.

That long-lived branch doesn't have to be any sort of "trunk", though
it usually is; it could as easily be something totally peripheral.


Now, details of that use of "branch".  In mathematical terms, a branch
may be defined purely by its head rev (and the graph built up by
recursing through all the parents), but in [bzr] UI and mental model
terms, a "branch" is that plus its mainline[0]; the left-most or first
line of descent, which colloquially is the difference between 'things
I commit' and 'things I merge'.

Let me try flexing my git-expression muscles here.  Given a branch at
a specific point in time, you point at the head rev, and there's a
subset we call 'mainline' of the whole set of parents, which is
expressed by following the 'first' parent pointers back to a single
origin (there can be 50 origins in the whole graph, of course, but
only one of them is on the 'mainline').  At some later time, more
revisions have been added to the graph, and the head rev is now
something "later".  If, at that later time, all the nodes which were
previously on that 'mainline' are still on it tracing back from the
new head, then in the sense I'm using "branch", it's still the same
"branch".  All the revnos referring to its earlier incarnation are
still valid for this one (though there are new ones tacked onto the
end; that doesn't affect the pre-existing ones).

[I THINK we all understand that, but just making sure]
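
The definition above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the revision graph is a plain dict, and the names `parents`, `mainline` and `same_branch` are hypothetical, not anything taken from bzr or git.

```python
# Each revision maps to its list of parents. The first parent is the
# "things I commit" line; any further parents are "things I merge".
parents = {
    "a": [],
    "b": ["a"],
    "x": ["a"],        # committed on some other branch
    "c": ["b", "x"],   # merge of x into this branch
    "d": ["c"],
}

def mainline(head, parents):
    """Return the first-parent chain, from the origin up to head."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    chain.reverse()
    return chain

def same_branch(old_head, new_head, parents):
    """True if the old mainline is a prefix of the new mainline."""
    old = mainline(old_head, parents)
    new = mainline(new_head, parents)
    return new[:len(old)] == old

# Revnos are just 1-based positions on the mainline.
revnos = {rev: i + 1 for i, rev in enumerate(mainline("d", parents))}

print(mainline("d", parents))          # first-parent path only; x is absent
print(same_branch("c", "d", parents))  # old revnos remain valid
print(same_branch("x", "d", parents))  # x was merged, so it is not on the mainline
```

In this model, moving the head from "c" to "d" keeps every existing revno, whereas a head whose first-parent chain no longer contains the old mainline would count as a different "branch" in the sense used here.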


[0] This probably causes some confusion too, since I know I'm guilty
    of using the word 'mainline' both in the sense of a 'trunk'
    branch, and this particular path through one branch.  _I_ think
    it's usually clear from context, but I guess it probably isn't for
    those with a different mental modeling of "branch".


> To illustrate, yesterday I gave an example where performing a bzr
> branch from a dotted-decimal revision would rewrite the numbers from
> the originating branch (1.2.2, 1.2.1, and 1) to unrelated numbers in
> the new branch (3, 2, 1).

One thing to note here is that that 1.2.1 and 1.2.2 came into your
first branch here by merging from another branch (call that branch
'b').  When you created your new branch here that now has (3,2,1),
those numbers are the same as the numbers that existed locally in 'b'
at the time 1.2.2 was its 'head'.  In a sense, then, you've just
recreated [a copy of] "branch" 'b' at that time.  So, in a way, by
taking a copy of the current bzr.dev branch, you can recreate the
entire state of any branches that were merged into it as of the time
they were merged (excluding cases of cherrypicking, or when merging
prior to the head of those branches of course).


> But then I realized why bzr is doing this. It's because, bzr users
> don't just use the revision numbers for external communication, but
> they also use them for lots of direct interaction with the tool. The
> rewriting makes it easy to write something like "bzr diff -r1..3".

This is an instance of the second case (first above) where the revnos,
applying just to one branch, become useful.  And, it's probably the
case I'm most attached to.

The great majority (I'd say easily 80%) of my references to revisions
are transient.  Most of 'em have probably exhausted their usefulness
in an hour; many of them (as in interaction with the tool you
mentioned) in just a couple seconds.  Virtually all my branches live
longer than that, so the limited lifespan of the numbers in the grand
scheme doesn't matter a whit.

So, from above, some of the places they're handier:

- Typing.  I know, copy and paste copies and pastes one string just as
  well as another, and long strings just as well as short.  But I
  don't want to copy&paste; I want to ^Z out of log and run a quick
  diff, between two revisions only one of which is on my screen at the
  time.  I can just remember the offscreen revno I'm comparing
  against, and it's very easy to quickly type the numbers,
  particularly since 95% of the time I'm comparing mainline revs so I
  don't even have to think about dotted forms.


- Some forms of communicating.  I can yell numbers across the room
  without concern about whether they'll be interpreted right.  Even 6
  digits of an SHA-1 hash are a lot harder to do that with.  I can
  hold revnos in my head while I walk down the hall to talk to
  somebody about them, or pick up a phone, or go to a meeting.  I can
  scrawl them on notepads or whiteboards.  In all these cases, the
  only reason for which I'm communicating that revno will be exhausted
  very shortly, so it's completely irrelevant whether it's meaningful
  in 5 years, or next week.


- Visual comparing (this is one that's useful on the long-lived
  branches, as well as transient stuff) and information gathering.  I
  can hold in my head "Yeah, I looked at 1350 of Joe's branch", and if
  I see an email from him "Oh, I fixed a bug in 1358" or "in 1293", I
  can know just from that whether I saw the fix or not.

  If somebody says "I introduced a bug in revision 3841, and fixed it
  in 3843", I know the window where that bug is in play is probably
  pretty small, whereas "introduced in 3841, fixed in 5337" tells me
  it was alive a looong time.

  bzr.dev is currently on revno 2091.  I didn't know that, I had to
  look it up.  But I knew it was a little past 2000, just from loosely
  watching it.  If somebody talks about something that happened in
  revno 1800, I know automatically "That was fairly recent", compared
  to talking about revno 75, where I know "Wow, that was a long time
  ago".

  This property is true of bzr revids as well.  If I see talk about
  revision "mbp@sourcefrog.net-20050520021228-bc46a17f07eff7f9", I
  know right away Martin committed it, and it was a year and a half
  ago.  If I see talk about an oops in revision "af38cc3", that just
  tells me that somebody screwed up, and it gets mentally filed away
  or goes in one eye and out the other.  But if I see talk about an
  oops in revision "fullermd@over-yonder.net-[...]", that rings bright
  blue bells that tell me that *I* screwed up and I need to jump on
  that right now.  In sufficiently small projects with sufficiently
  discrete task division, I may even be able to guess offhand based on
  the person and date what bit of functionality the commit references,
  though that's a much lower probability.

  It can also be useful in looking at cases where you don't
  necessarily have the tool.  Compare putting CVS's rcsid tags in
  strings in the source.  static const char *rcsid = "$Id"; and the
  like.  Then you can use 'ident' on the compiled binaries to see the
  revs of files in them.  If somebody says "foo.c has a bug in 1.34,
  fixed in 1.37", I can without any VCS interaction just look at the
  compiled binary and tell whether I'm prior to the bug, have the bug,
  or after the fix.  If the binary is known to be compiled from a
  particular branch, a tree-wide revno tells me that too.  A revid
  (even one containing a date) won't tell me that; I'll have to find
  the tool and a copy of the tree and find out if my rev contains that
  other rev.

  Now, on any given revision reference, I probably don't care about
  most of those bits of info.  I may not care about any of them, but I
  often care about at least one or two.  And we all probably have
  wildly varying appraisals of the commonness of various of the
  situations described.  And yes, a lot of them are just mental
  heuristics.  Sure, with a completely opaque id, I could pull up the
  tool to look up any of those (and a lot more information besides);
  the gain is I don't HAVE to.  Just knowing some bit of that info can
  often tell me if I don't care to investigate whatever the revision
  is being referenced for at all, or that I need to put doing so at
  the top of my priority list.



> And it turns out that git also allows branch specific naming for the
> exact same reason. In place of 3, 2, 1 in the same situation git
> would allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same
> three revisions. So the easy diff command would be "git diff HEAD~2
> HEAD".

In bzr, that would be "bzr diff -r-2..-1" (or just "-r-2.." since
open-ended revspecs pretty much work like you'd expect them to).  IME,
that only works well maybe 4 or 5 revs back; past that, you spend too
much time counting, and it's easier to just whack in the number from
log.

bzr _doesn't_, OTOH, have anything like HEAD^2, for selecting
alternate parent paths.  That's probably use-pattern bias; we hardly
ever do something like that, so it's never occurred to us to add the
ability to.


> Maybe some of the people that dislike git's "ugly" names so much is
> that they imagine that to compare two revisions a user of git must
> inspect the logs, fish out the sha1sum for each, and then
> cut-and-paste to create the command needed.

I do imagine that.  And I think I'd hit it, since I often look around
revs that aren't right near the tip; trying to figure out
"HEAD~293..HEAD~38" is even worse than excavating the sha1sum's.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply

* Re: [PATCH] RFC2047-encode email headers
From: Karl Hasselström @ 2006-10-22 12:12 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git
In-Reply-To: <20061022120217.7650.23715.stgit@localhost>

On 2006-10-22 14:02:17 +0200, Karl Hasselström wrote:

> From: Karl Hasselström <kha@treskal.com>

OK, this patch did what it was supposed to do -- which was to encode
the mail headers properly -- but StGIT still generates an 8-bit
encoded body, and vger doesn't seem to like that (see the X-Warning:
headers it added to the patch mail). That's another fix for another
day.

Catalin, if you take this patch, I'd appreciate it if you made double
sure that my name doesn't have garbage in it. (It may very well be
that the copy of the patch sent to you personally is unharmed; it all
works fine when I send patches to myself. vger is the only mail server
I have seen that has this problem.)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply

* [PATCH] RFC2047-encode email headers
From: Karl Hasselström @ 2006-10-22 12:02 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git

From: Karl Hasselström <kha@treskal.com>

Having non-ascii characters in email headers is illegal, but StGIT
currently does not care. I'm often bitten by this, since my name
doesn't fit in ascii.

This patch implements an encoding pass just before the email is sent
over the wire -- in particular, it comes after any interactive editing
and templates and such, so the user should never have to see the
rfc2047 encoding.

NOTE: The rfc2047 encoder needs to know the encoding of the input
string. This patch hard-codes this to utf8, since that should be by
far the most common non-ascii encoding, and since utf8 is already the
hardcoded character set for the email body. In the long run, we
probably want to get this from the locale, or from a command line
switch, or both.

Signed-off-by: Karl Hasselström <kha@treskal.com>
---

 stgit/commands/mail.py |   45 +++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/stgit/commands/mail.py b/stgit/commands/mail.py
index 34504e6..b661308 100644
--- a/stgit/commands/mail.py
+++ b/stgit/commands/mail.py
@@ -15,7 +15,7 @@ along with this program; if not, write t
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 """
 
-import sys, os, re, time, datetime, smtplib, email.Utils
+import sys, os, re, time, datetime, smtplib, email.Header, email.Utils
 from optparse import OptionParser, make_option
 
 from stgit.commands.common import *
@@ -403,6 +403,42 @@ def __build_message(tmpl, patch, patch_n
 
     return msg.strip('\n')
 
+def encode_header(s, enc):
+    """Take an entire e-mail header line, encoded in enc, and
+    rfc2047-encode it."""
+    def trans(s):
+        return str(email.Header.Header(unicode(s, enc)))
+    words = s.split(' ')
+    first_encode = len(words)
+    last_encode = -1
+    for i in xrange(len(words)):
+        ew = trans(words[i])
+        if ew != words[i]:
+            first_encode = min(first_encode, i)
+            last_encode = max(last_encode, i)
+    if first_encode <= last_encode:
+        return ' '.join(filter(
+            None,
+            [' '.join(words[:first_encode]),
+             trans(' '.join(words[first_encode:last_encode+1])),
+             ' '.join(words[last_encode+1:])]))
+    else:
+        return s
+
+def encode_headers(msg, enc):
+    """rfc2047-encode the headers of msg, assuming it is encoded in
+    enc."""
+    in_header = True
+    lines = []
+    for line in msg.splitlines(True):
+        if in_header:
+            if line.strip():
+                line = encode_header(line, enc)
+            else:
+                in_header = False
+        lines.append(line)
+    return ''.join(lines)
+
 def func(parser, options, args):
     """Send the patches by e-mail using the patchmail.tmpl file as
     a template
@@ -461,7 +497,8 @@ def func(parser, options, args):
                 raise CmdException, 'No cover message template file found'
 
         msg_id = email.Utils.make_msgid('stgit')
-        msg = __build_cover(tmpl, total_nr, msg_id, options)
+        msg = encode_headers(__build_cover(tmpl, total_nr, msg_id, options),
+                             'UTF-8')
         from_addr, to_addr_list = __parse_addresses(msg)
 
         # subsequent e-mails are seen as replies to the first one
@@ -487,8 +524,8 @@ def func(parser, options, args):
 
     for (p, patch_nr) in zip(patches, range(1, len(patches) + 1)):
         msg_id = email.Utils.make_msgid('stgit')
-        msg = __build_message(tmpl, p, patch_nr, total_nr, msg_id, ref_id,
-                              options)
+        msg = encode_headers(__build_message(tmpl, p, patch_nr, total_nr,
+                                             msg_id, ref_id, options), 'UTF-8')
         from_addr, to_addr_list = __parse_addresses(msg)
 
         # subsequent e-mails are seen as replies to the first one

^ permalink raw reply related

* Re: VCS comparison table
From: Sean @ 2006-10-22 11:44 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git
In-Reply-To: <20061022100028.GQ75501@over-yonder.net>

On Sun, 22 Oct 2006 05:00:28 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> I think Jeff's actually meaning the other way around.  We're confident
> through experience of the utility of the single revnos.  We're NOT (at
> least, I'm not) so convinced of the utility and usability of the
> dotted ones; they haven't gone through the crucible of experience yet.
> 

Yes, that's the way I took what he said as well.

Bzr revnos (dotted or otherwise) can not be guaranteed to be stable
in a truly distributed system.   Now it's clear that you folks
just don't really care about that and you're happy enough that they
work out fine for your uses.  That's a fair enough decision to make;
there's no law that says you have to care about the situations where
there will be clashes and/or the numbers will change.  Git makes
a different choice, and for my money it's a better choice.

Cheers,
Sean

^ permalink raw reply

* [PATCH 3/3] Teach git-branch -v and -w options
From: Lars Hjemli @ 2006-10-22 11:30 UTC (permalink / raw)
  To: git
In-Reply-To: <5245bfe3982f5c23841229af9f548f982b9c60c3.1161516129.git.hjemli@gmail.com>

This makes git-branch display sha1 and first line of commit
message for each branch.

Additionally, the -w option may be used to specify columnwidth
for branchname (default is 20 characters)

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 Documentation/git-branch.txt |    8 ++++++-
 git-branch.sh                |   47 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index d43ef1d..efbab61 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -8,7 +8,7 @@ git-branch - List, create, or delete bra
 SYNOPSIS
 --------
 [verse]
-'git-branch' [-r]
+'git-branch' [-r] [-v [-w width]]
 'git-branch' [-l] [-f] <branchname> [<start-point>]
 'git-branch' (-d | -D) <branchname>...
 
@@ -47,6 +47,12 @@ OPTIONS
 -r::
 	List only the "remote" branches.
 
+-v::
+	Show sha1 and first line of commit message for each branch
+
+-w <width>::
+	Set columnwidth for branchname display
+
 <branchname>::
 	The name of the branch to create or delete.
 	The new branch name must pass all checks defined by
diff --git a/git-branch.sh b/git-branch.sh
index 1f628a4..73839b4 100755
--- a/git-branch.sh
+++ b/git-branch.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-USAGE=' [-l] [-f] <branchname> [<start-point>] | (-d | -D) <branchname> | [-r]'
+USAGE=' [-l] [-f] <branchname> [<start-point>] | (-d | -D) <branchname> | [-r] [-v [-w width]]'
 LONG_USAGE='If no arguments, show available branches and mark current branch with a star.
 If one argument, create a new branch <branchname> based off of current HEAD.
 If two arguments, create a new branch <branchname> based off of <start-point>.'
@@ -49,12 +49,26 @@ If you are sure you want to delete it, r
 }
 
 ls_remote_branches () {
+	verbose="$1"
+	width="$2"
     git-rev-parse --symbolic --all |
     sed -ne 's|^refs/\(remotes/\)|\1|p' |
-    sort
+    sort |
+    while read ref
+    do
+		if test "$verbose" = "yes"
+		then
+			log=$(git-log --pretty=oneline --max-count=1 "$ref")
+			printf "%-*s %s\n" "$width" "$ref" "$log"
+		else
+			echo "$ref"
+		fi
+	done
 }
 
 ls_local_branches () {
+	verbose="$1"
+	width="$2"
 	git-rev-parse --symbolic --branches |
 	sort |
 	while read ref
@@ -65,12 +79,22 @@ ls_local_branches () {
 		else
 			pfx=' '
 		fi
-		echo "$pfx $ref"
+		if test "$verbose" = "yes"
+		then
+			log=$(git-log --pretty=oneline --max-count=1 "$ref")
+			printf "%s %-*s %s\n" "$pfx" "$width" "$ref" "$log"
+		else
+			echo "$pfx $ref"
+		fi
 	done
 }
 
 force=
 create_log=
+remotes=
+verbose=
+width=20
+
 while case "$#,$1" in 0,*) break ;; *,-*) ;; *) break ;; esac
 do
 	case "$1" in
@@ -79,8 +103,14 @@ do
 		exit
 		;;
 	-r)
-		ls_remote_branches
-		exit
+		remotes="yes"
+		;;
+	-v)
+		verbose="yes"
+		;;
+	-w)
+		shift
+		width="$1"
 		;;
 	-f)
 		force="$1"
@@ -101,7 +131,12 @@ done
 
 case "$#" in
 0)
-	ls_local_branches
+	if test "$remotes" = "yes"
+	then
+		ls_remote_branches "$verbose" "$width"
+	else
+		ls_local_branches "$verbose" "$width"
+	fi
 	exit 0 ;;
 1)
 	head=HEAD ;;
-- 
1.4.3.1.g1688

^ permalink raw reply related
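
The column layout produced by the `printf "%s %-*s %s\n"` call in this patch can be illustrated with a small Python sketch. The function name and sample values are hypothetical. The point is only that `%-*s` left-justifies the branch name in a field whose width is passed as an argument.

```python
def format_branch_line(pfx, ref, log, width=20):
    """Mirror the shell printf: pad ref on the right to `width` columns."""
    # "%-*s" takes the field width from the argument list, as in printf(1).
    return "%s %-*s %s" % (pfx, width, ref, log)

# Sample data (illustrative only).
print(format_branch_line("*", "master", "1688abc Fix usage string", 10))
print(format_branch_line(" ", "a-much-longer-branch-name", "73839b4 Refactor", 10))
```

A branch name longer than the requested width is not truncated; it simply pushes the rest of the line to the right, which is also how the shell version behaves.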

* [PATCH 2/3] Refactor git-branch
From: Lars Hjemli @ 2006-10-22 11:30 UTC (permalink / raw)
  To: git
In-Reply-To: <5245bfe3982f5c23841229af9f548f982b9c60c3.1161516129.git.hjemli@gmail.com>

This moves the code used to display local branches into a
separate function.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 git-branch.sh |   28 ++++++++++++++++------------
 1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/git-branch.sh b/git-branch.sh
index b80bcda..1f628a4 100755
--- a/git-branch.sh
+++ b/git-branch.sh
@@ -54,6 +54,21 @@ ls_remote_branches () {
     sort
 }
 
+ls_local_branches () {
+	git-rev-parse --symbolic --branches |
+	sort |
+	while read ref
+	do
+		if test "$headref" = "$ref"
+		then
+			pfx='*'
+		else
+			pfx=' '
+		fi
+		echo "$pfx $ref"
+	done
+}
+
 force=
 create_log=
 while case "$#,$1" in 0,*) break ;; *,-*) ;; *) break ;; esac
@@ -86,18 +101,7 @@ done
 
 case "$#" in
 0)
-	git-rev-parse --symbolic --branches |
-	sort |
-	while read ref
-	do
-		if test "$headref" = "$ref"
-		then
-			pfx='*'
-		else
-			pfx=' '
-		fi
-		echo "$pfx $ref"
-	done
+	ls_local_branches
 	exit 0 ;;
 1)
 	head=HEAD ;;
-- 
1.4.3.1.g1688

^ permalink raw reply related

* [PATCH 1/3] Fix usagestring for git-branch
From: Lars Hjemli @ 2006-10-22 11:30 UTC (permalink / raw)
  To: git
In-Reply-To: <1161516626749-git-send-email-hjemli@gmail.com>

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 git-branch.sh |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/git-branch.sh b/git-branch.sh
index 4379a07..b80bcda 100755
--- a/git-branch.sh
+++ b/git-branch.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-USAGE='[-l] [(-d | -D) <branchname>] | [[-f] <branchname> [<start-point>]] | -r'
+USAGE=' [-l] [-f] <branchname> [<start-point>] | (-d | -D) <branchname> | [-r]'
 LONG_USAGE='If no arguments, show available branches and mark current branch with a star.
 If one argument, create a new branch <branchname> based off of current HEAD.
 If two arguments, create a new branch <branchname> based off of <start-point>.'
-- 
1.4.3.1.g1688

^ permalink raw reply related

* [PATCH 0/3] Add -v and -w options to git-branch
From: Lars Hjemli @ 2006-10-22 11:30 UTC (permalink / raw)
  To: git

This patch-series teaches git-branch to show sha1 and first line
of commit message for each branch (and is a replacement for my
previous patches)

Diffstat:
 Documentation/git-branch.txt |    8 ++++-
 git-branch.sh                |   71 ++++++++++++++++++++++++++++++++---------
 2 files changed, 62 insertions(+), 17 deletions(-)

^ permalink raw reply

* [PATCH] Build in shortlog
From: Johannes Schindelin @ 2006-10-22 11:23 UTC (permalink / raw)
  To: git, junkio


Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
 Documentation/git-shortlog.txt |    1 +
 Makefile                       |    5 +-
 builtin-shortlog.c             |  302 ++++++++++++++++++++++++++++++++++++++++
 builtin.h                      |    1 +
 git-shortlog.perl              |  234 -------------------------------
 git.c                          |    1 +
 path-list.c                    |    2 +-
 7 files changed, 309 insertions(+), 237 deletions(-)

diff --git a/Documentation/git-shortlog.txt b/Documentation/git-shortlog.txt
index d54fc3e..95fa901 100644
--- a/Documentation/git-shortlog.txt
+++ b/Documentation/git-shortlog.txt
@@ -8,6 +8,7 @@ git-shortlog - Summarize 'git log' outpu
 SYNOPSIS
 --------
 git-log --pretty=short | 'git-shortlog' [-h] [-n] [-s]
+git-shortlog [-n|--numbered] [-s|--summary] [<committish>...]
 
 DESCRIPTION
 -----------
diff --git a/Makefile b/Makefile
index 018dad2..0beda57 100644
--- a/Makefile
+++ b/Makefile
@@ -106,7 +106,7 @@ uname_P := $(shell sh -c 'uname -p 2>/de
 
 # CFLAGS and LDFLAGS are for the users to override from the command line.
 
-CFLAGS = -g -O2 -Wall
+CFLAGS = -g -O0 -Wall
 LDFLAGS =
 ALL_CFLAGS = $(CFLAGS)
 ALL_LDFLAGS = $(LDFLAGS)
@@ -178,7 +178,7 @@ SCRIPT_SH = \
 
 SCRIPT_PERL = \
 	git-archimport.perl git-cvsimport.perl git-relink.perl \
-	git-shortlog.perl git-rerere.perl \
+	git-rerere.perl \
 	git-cvsserver.perl \
 	git-svnimport.perl git-cvsexportcommit.perl \
 	git-send-email.perl git-svn.perl
@@ -306,6 +306,7 @@ BUILTIN_OBJS = \
 	builtin-rev-parse.o \
 	builtin-rm.o \
 	builtin-runstatus.o \
+	builtin-shortlog.o \
 	builtin-show-branch.o \
 	builtin-stripspace.o \
 	builtin-symbolic-ref.o \
diff --git a/builtin-shortlog.c b/builtin-shortlog.c
new file mode 100644
index 0000000..df60bd2
--- /dev/null
+++ b/builtin-shortlog.c
@@ -0,0 +1,302 @@
+#include "builtin.h"
+#include "cache.h"
+#include "commit.h"
+#include "diff.h"
+#include "path-list.h"
+#include "revision.h"
+#include <string.h>
+
+static const char shortlog_usage[] =
+"git-shortlog [-n] [-s] [<commit-id>... ]\n";
+
+static int compare_by_number(const void *a1, const void *a2)
+{
+	const struct path_list_item *i1 = a1, *i2 = a2;
+	const struct path_list *l1 = i1->util, *l2 = i2->util;
+
+	if (l1->nr < l2->nr)
+		return -1;
+	else if (l1->nr == l2->nr)
+		return 0;
+	else
+		return +1;
+}
+
+static struct path_list_item mailmap_list[] = {
+	{ "R.Marek@sh.cvut.cz", (void*)"Rudolf Marek" },
+	{ "Ralf.Wildenhues@gmx.de", (void*)"Ralf Wildenhues" },
+	{ "aherrman@de.ibm.com", (void*)"Andreas Herrmann" },
+	{ "akpm@osdl.org", (void*)"Andrew Morton" },
+	{ "andrew.vasquez@qlogic.com", (void*)"Andrew Vasquez" },
+	{ "aquynh@gmail.com", (void*)"Nguyen Anh Quynh" },
+	{ "axboe@suse.de", (void*)"Jens Axboe" },
+	{ "blaisorblade@yahoo.it", (void*)"Paolo 'Blaisorblade' Giarrusso" },
+	{ "bunk@stusta.de", (void*)"Adrian Bunk" },
+	{ "domen@coderock.org", (void*)"Domen Puncer" },
+	{ "dougg@torque.net", (void*)"Douglas Gilbert" },
+	{ "dwmw2@shinybook.infradead.org", (void*)"David Woodhouse" },
+	{ "ecashin@coraid.com", (void*)"Ed L Cashin" },
+	{ "felix@derklecks.de", (void*)"Felix Moeller" },
+	{ "fzago@systemfabricworks.com", (void*)"Frank Zago" },
+	{ "gregkh@suse.de", (void*)"Greg Kroah-Hartman" },
+	{ "hch@lst.de", (void*)"Christoph Hellwig" },
+	{ "htejun@gmail.com", (void*)"Tejun Heo" },
+	{ "jejb@mulgrave.(none)", (void*)"James Bottomley" },
+	{ "jejb@titanic.il.steeleye.com", (void*)"James Bottomley" },
+	{ "jgarzik@pretzel.yyz.us", (void*)"Jeff Garzik" },
+	{ "johnpol@2ka.mipt.ru", (void*)"Evgeniy Polyakov" },
+	{ "kay.sievers@vrfy.org", (void*)"Kay Sievers" },
+	{ "minyard@acm.org", (void*)"Corey Minyard" },
+	{ "mshah@teja.com", (void*)"Mitesh shah" },
+	{ "pj@ludd.ltu.se", (void*)"Peter A Jonsson" },
+	{ "rmps@joel.ist.utl.pt", (void*)"Rui Saraiva" },
+	{ "santtu.hyrkko@gmail.com", (void*)"Santtu Hyrkk^[,Av^[(B" },
+	{ "simon@thekelleys.org.uk", (void*)"Simon Kelley" },
+	{ "ssant@in.ibm.com", (void*)"Sachin P Sant" },
+	{ "terra@gnome.org", (void*)"Morten Welinder" },
+	{ "tony.luck@intel.com", (void*)"Tony Luck" },
+	{ "welinder@anemone.rentec.com", (void*)"Morten Welinder" },
+	{ "welinder@darter.rentec.com", (void*)"Morten Welinder" },
+	{ "welinder@troll.com", (void*)"Morten Welinder" }
+};
+
+static struct path_list mailmap = {
+	mailmap_list,
+	sizeof(mailmap_list) / sizeof(struct path_list_item), 0, 0
+};
+
+static int map_email(char *email, char *name, int maxlen)
+{
+	char *p;
+	struct path_list_item *item;
+
+	/* autocomplete common developers */
+	p = strchr(email, '>');
+	if (!p)
+		return 0;
+
+	*p = '\0';
+	item = path_list_lookup(email, &mailmap);
+	if (item != NULL) {
+		const char *realname = (const char *)item->util;
+		strncpy(name, realname, maxlen);
+		return 1;
+	}
+	return 0;
+}
+
+static void insert_author_oneline(struct path_list *list,
+		const char *author, int authorlen,
+		const char *oneline, int onelinelen)
+{
+	const char *dot3 = "/pub/scm/linux/kernel/git/";
+	char *buffer, *p;
+	struct path_list_item *item;
+	struct path_list *onelines;
+
+	while (authorlen > 0 && isspace(author[authorlen - 1]))
+		authorlen--;
+
+	buffer = xmalloc(authorlen + 1);
+	memcpy(buffer, author, authorlen);
+	buffer[authorlen] = '\0';
+
+	item = path_list_insert(buffer, list);
+	if (item->util == NULL)
+		item->util = xcalloc(1, sizeof(struct path_list));
+	else
+		free(buffer);
+
+	if (!strncmp(oneline, "[PATCH", 6)) {
+		char *eob = strchr(oneline, ']');
+
+		while (isspace(eob[1]) && eob[1] != '\n')
+			eob++;
+		if (eob - oneline < onelinelen) {
+			onelinelen -= eob - oneline;
+			oneline = eob;
+		}
+	}
+
+	while (onelinelen > 0 && isspace(oneline[0])) {
+		oneline++;
+		onelinelen--;
+	}
+
+	while (onelinelen > 0 && isspace(oneline[onelinelen - 1]))
+		onelinelen--;
+
+	buffer = xmalloc(onelinelen + 1);
+	memcpy(buffer, oneline, onelinelen);
+	buffer[onelinelen] = '\0';
+
+	while ((p = strstr(buffer, dot3)) != NULL) {
+		memcpy(p, "...", 3);
+		memmove(p + 3, p + strlen(dot3), strlen(p + strlen(dot3)) + 1);
+	}
+
+
+	onelines = item->util;
+	if (onelines->nr >= onelines->alloc) {
+		onelines->alloc = alloc_nr(onelines->nr);
+		onelines->items = xrealloc(onelines->items,
+				onelines->alloc
+				* sizeof(struct path_list_item));
+	}
+
+	onelines->items[onelines->nr].util = NULL;
+	onelines->items[onelines->nr++].path = buffer;
+}
+
+static void read_from_stdin(struct path_list *list)
+{
+	char buffer[1024];
+
+	while (fgets(buffer, sizeof(buffer), stdin) != NULL) {
+		char *bob;
+		if ((buffer[0] == 'A' || buffer[0] == 'a') &&
+				!strncmp(buffer + 1, "uthor: ", 7) &&
+				(bob = strchr(buffer + 7, '<')) != NULL) {
+			char buffer2[1024], offset = 0;
+
+			if (map_email(bob + 1, buffer, sizeof(buffer)))
+				bob = buffer + strlen(buffer);
+			else {
+				offset = 8;
+				while (isspace(bob[-1]))
+					bob--;
+			}
+
+			while (fgets(buffer2, sizeof(buffer2), stdin) &&
+					buffer2[0] != '\n')
+				; /* chomp input */
+			if (fgets(buffer2, sizeof(buffer2), stdin))
+				insert_author_oneline(list,
+						buffer + offset,
+						bob - buffer - offset,
+						buffer2, strlen(buffer2));
+		}
+	}
+}
+
+static void get_from_rev(struct rev_info *rev, struct path_list *list)
+{
+	char scratch[1024];
+	struct commit *commit;
+
+	prepare_revision_walk(rev);
+	while ((commit = get_revision(rev)) != NULL) {
+		char *author = NULL, *oneline, *buffer;
+		int authorlen, onelinelen;
+
+		/* get author and oneline */
+		for (buffer = commit->buffer; buffer && *buffer != '\0' &&
+				*buffer != '\n'; ) {
+			char *eol = strchr(buffer, '\n');
+
+			if (eol == NULL)
+				eol = buffer + strlen(buffer);
+			else
+				eol++;
+
+			if (!strncmp(buffer, "author ", 7)) {
+				char *bracket = strchr(buffer, '<');
+
+				if (bracket == NULL || bracket > eol)
+					die("Invalid commit buffer: %s",
+					    sha1_to_hex(commit->object.sha1));
+
+				if (map_email(bracket + 1, scratch,
+							sizeof(scratch))) {
+					author = scratch;
+					authorlen = strlen(scratch);
+				} else {
+					while (bracket[-1] == ' ')
+						bracket--;
+
+					author = buffer + 7;
+					authorlen = bracket - buffer - 7;
+				}
+			}
+			buffer = eol;
+		}
+
+		if (author == NULL)
+			die ("Missing author: %s",
+					sha1_to_hex(commit->object.sha1));
+
+		if (buffer == NULL || *buffer == '\0') {
+			oneline = "<none>";
+			onelinelen = strlen(oneline);
+		} else {
+			char *eol;
+
+			oneline = buffer + 1;
+			eol = strchr(oneline, '\n');
+			if (eol == NULL)
+				onelinelen = strlen(oneline);
+			else
+				onelinelen = eol - oneline;
+		}
+
+		insert_author_oneline(list,
+				author, authorlen, oneline, onelinelen);
+	}
+
+}
+
+int cmd_shortlog(int argc, const char **argv, const char *prefix)
+{
+	struct rev_info rev;
+	struct path_list list = { NULL, 0, 0, 1 };
+	int i, j, sort_by_number = 0, summary = 0;
+
+	init_revisions(&rev, prefix);
+	argc = setup_revisions(argc, argv, &rev, NULL);
+	while (argc > 1) {
+		if (!strcmp(argv[1], "-n") || !strcmp(argv[1], "--numbered"))
+			sort_by_number = 1;
+		else if (!strcmp(argv[1], "-s") ||
+				!strcmp(argv[1], "--summary"))
+			summary = 1;
+		else if (!strcmp(argv[1], "-h") || !strcmp(argv[1], "--help"))
+			usage(shortlog_usage);
+		else
+			die ("unrecognized argument: %s", argv[1]);
+		argv++;
+		argc--;
+	}
+
+	if (rev.pending.nr == 1)
+		die ("Need a range!");
+	else if (rev.pending.nr == 0)
+		read_from_stdin(&list);
+	else
+		get_from_rev(&rev, &list);
+
+	if (sort_by_number)
+		qsort(list.items, list.nr, sizeof(struct path_list_item),
+			compare_by_number);
+
+	for (i = 0; i < list.nr; i++) {
+		struct path_list *onelines = list.items[i].util;
+
+		printf("%s (%d):\n", list.items[i].path, onelines->nr);
+		if (!summary) {
+			for (j = onelines->nr - 1; j >= 0; j--)
+				printf("      %s\n", onelines->items[j].path);
+			printf("\n");
+		}
+
+		onelines->strdup_paths = 1;
+		path_list_clear(onelines, 1);
+		free(onelines);
+		list.items[i].util = NULL;
+	}
+
+	list.strdup_paths = 1;
+	path_list_clear(&list, 1);
+
+	return 0;
+}
+
diff --git a/builtin.h b/builtin.h
index 9683a7c..0ce8f8b 100644
--- a/builtin.h
+++ b/builtin.h
@@ -51,6 +51,7 @@ extern int cmd_rev_list(int argc, const 
 extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_runstatus(int argc, const char **argv, const char *prefix);
+extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
 extern int cmd_stripspace(int argc, const char **argv, const char *prefix);
diff --git a/git-shortlog.perl b/git-shortlog.perl
deleted file mode 100755
index 334fec7..0000000
--- a/git-shortlog.perl
+++ /dev/null
@@ -1,234 +0,0 @@
-#!/usr/bin/perl -w
-
-use strict;
-use Getopt::Std;
-use File::Basename qw(basename dirname);
-
-our ($opt_h, $opt_n, $opt_s);
-getopts('hns');
-
-$opt_h && usage();
-
-sub usage {
-	print STDERR "Usage: ${\basename $0} [-h] [-n] [-s] < <log_data>\n";
-        exit(1);
-}
-
-my (%mailmap);
-my (%email);
-my (%map);
-my $pstate = 1;
-my $n_records = 0;
-my $n_output = 0;
-
-sub shortlog_entry($$) {
-	my ($name, $desc) = @_;
-	my $key = $name;
-
-	$desc =~ s#/pub/scm/linux/kernel/git/#/.../#g;
-	$desc =~ s#\[PATCH\] ##g;
-
-	# store description in array, in email->{desc list} map
-	if (exists $map{$key}) {
-		# grab ref
-		my $obj = $map{$key};
-
-		# add desc to array
-		push(@$obj, $desc);
-	} else {
-		# create new array, containing 1 item
-		my @arr = ($desc);
-
-		# store ref to array
-		$map{$key} = \@arr;
-	}
-}
-
-# sort comparison function
-sub by_name($$) {
-	my ($a, $b) = @_;
-
-	uc($a) cmp uc($b);
-}
-sub by_nbentries($$) {
-	my ($a, $b) = @_;
-	my $a_entries = $map{$a};
-	my $b_entries = $map{$b};
-
-	@$b_entries - @$a_entries || by_name $a, $b;
-}
-
-my $sort_method = $opt_n ? \&by_nbentries : \&by_name;
-
-sub summary_output {
-	my ($obj, $num, $key);
-
-	foreach $key (sort $sort_method keys %map) {
-		$obj = $map{$key};
-		$num = @$obj;
-		printf "%s: %u\n", $key, $num;
-		$n_output += $num;
-	}
-}
-
-sub shortlog_output {
-	my ($obj, $num, $key, $desc);
-
-	foreach $key (sort $sort_method keys %map) {
-		$obj = $map{$key};
-		$num = @$obj;
-
-		# output author
-		printf "%s (%u):\n", $key, $num;
-
-		# output author's 1-line summaries
-		foreach $desc (reverse @$obj) {
-			print "  $desc\n";
-			$n_output++;
-		}
-
-		# blank line separating author from next author
-		print "\n";
-	}
-}
-
-sub changelog_input {
-	my ($author, $desc);
-
-	while (<>) {
-		# get author and email
-		if ($pstate == 1) {
-			my ($email);
-
-			next unless /^[Aa]uthor:?\s*(.*?)\s*<(.*)>/;
-
-			$n_records++;
-
-			$author = $1;
-			$email = $2;
-			$desc = undef;
-
-			# cset author fixups
-			if (exists $mailmap{$email}) {
-				$author = $mailmap{$email};
-			} elsif (exists $mailmap{$author}) {
-				$author = $mailmap{$author};
-			} elsif (!$author) {
-				$author = $email;
-			}
-			$email{$author}{$email}++;
-			$pstate++;
-		}
-
-		# skip to blank line
-		elsif ($pstate == 2) {
-			next unless /^\s*$/;
-			$pstate++;
-		}
-
-		# skip to non-blank line
-		elsif ($pstate == 3) {
-			next unless /^\s*?(.*)/;
-
-			# skip lines that are obviously not
-			# a 1-line cset description
-			next if /^\s*From: /;
-
-			chomp;
-			$desc = $1;
-
-			&shortlog_entry($author, $desc);
-
-			$pstate = 1;
-		}
-	
-		else {
-			die "invalid parse state $pstate";
-		}
-	}
-}
-
-sub read_mailmap {
-	my ($fh, $mailmap) = @_;
-	while (<$fh>) {
-		chomp;
-		if (/^([^#].*?)\s*<(.*)>/) {
-			$mailmap->{$2} = $1;
-		}
-	}
-}
-
-sub setup_mailmap {
-	read_mailmap(\*DATA, \%mailmap);
-	if (-f '.mailmap') {
-		my $fh = undef;
-		open $fh, '<', '.mailmap';
-		read_mailmap($fh, \%mailmap);
-		close $fh;
-	}
-}
-
-sub finalize {
-	#print "\n$n_records records parsed.\n";
-
-	if ($n_records != $n_output) {
-		die "parse error: input records != output records\n";
-	}
-	if (0) {
-		for my $author (sort keys %email) {
-			my $e = $email{$author};
-			for my $email (sort keys %$e) {
-				print STDERR "$author <$email>\n";
-			}
-		}
-	}
-}
-
-&setup_mailmap;
-&changelog_input;
-$opt_s ? &summary_output : &shortlog_output;
-&finalize;
-exit(0);
-
-
-__DATA__
-#
-# Even with git, we don't always have name translations.
-# So have an email->real name table to translate the
-# (hopefully few) missing names
-#
-Adrian Bunk <bunk@stusta.de>
-Andreas Herrmann <aherrman@de.ibm.com>
-Andrew Morton <akpm@osdl.org>
-Andrew Vasquez <andrew.vasquez@qlogic.com>
-Christoph Hellwig <hch@lst.de>
-Corey Minyard <minyard@acm.org>
-David Woodhouse <dwmw2@shinybook.infradead.org>
-Domen Puncer <domen@coderock.org>
-Douglas Gilbert <dougg@torque.net>
-Ed L Cashin <ecashin@coraid.com>
-Evgeniy Polyakov <johnpol@2ka.mipt.ru>
-Felix Moeller <felix@derklecks.de>
-Frank Zago <fzago@systemfabricworks.com>
-Greg Kroah-Hartman <gregkh@suse.de>
-James Bottomley <jejb@mulgrave.(none)>
-James Bottomley <jejb@titanic.il.steeleye.com>
-Jeff Garzik <jgarzik@pretzel.yyz.us>
-Jens Axboe <axboe@suse.de>
-Kay Sievers <kay.sievers@vrfy.org>
-Mitesh shah <mshah@teja.com>
-Morten Welinder <terra@gnome.org>
-Morten Welinder <welinder@anemone.rentec.com>
-Morten Welinder <welinder@darter.rentec.com>
-Morten Welinder <welinder@troll.com>
-Nguyen Anh Quynh <aquynh@gmail.com>
-Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
-Peter A Jonsson <pj@ludd.ltu.se>
-Ralf Wildenhues <Ralf.Wildenhues@gmx.de>
-Rudolf Marek <R.Marek@sh.cvut.cz>
-Rui Saraiva <rmps@joel.ist.utl.pt>
-Sachin P Sant <ssant@in.ibm.com>
-Santtu Hyrkk^[,Av^[(B <santtu.hyrkko@gmail.com>
-Simon Kelley <simon@thekelleys.org.uk>
-Tejun Heo <htejun@gmail.com>
-Tony Luck <tony.luck@intel.com>
diff --git a/git.c b/git.c
index 8044667..771e8ee 100644
--- a/git.c
+++ b/git.c
@@ -262,6 +262,7 @@ static void handle_internal_command(int 
 		{ "rev-parse", cmd_rev_parse, RUN_SETUP },
 		{ "rm", cmd_rm, RUN_SETUP },
 		{ "runstatus", cmd_runstatus, RUN_SETUP },
+		{ "shortlog", cmd_shortlog, RUN_SETUP },
 		{ "show-branch", cmd_show_branch, RUN_SETUP },
 		{ "show", cmd_show, RUN_SETUP | USE_PAGER },
 		{ "stripspace", cmd_stripspace },
diff --git a/path-list.c b/path-list.c
index 0c332dc..f8800f8 100644
--- a/path-list.c
+++ b/path-list.c
@@ -57,7 +57,7 @@ struct path_list_item *path_list_insert(
 	int index = add_entry(list, path);
 
 	if (index < 0)
-		index = 1 - index;
+		index = -1 - index;
 
 	return list->items + index;
 }
-- 
1.4.3.1.ge8ca-dirty
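
The path-list.c hunk above is easiest to check with concrete numbers. The sketch below assumes, as the `index < 0` test suggests, that `add_entry` reports an already-present path by returning `-1 - position`. The Python functions are illustrative stand-ins for the C code, not git's API.

```python
import bisect

def add_entry(items, path):
    """Insert path into the sorted list `items`.

    Returns the new index if the path was inserted, or -1 - index if it
    was already present (assumed convention, mirroring the C code).
    """
    pos = bisect.bisect_left(items, path)
    if pos < len(items) and items[pos] == path:
        return -1 - pos
    items.insert(pos, path)
    return pos

def decode_fixed(index):
    """Decoding as in the patched path_list_insert()."""
    return -1 - index if index < 0 else index

def decode_buggy(index):
    """Decoding before the patch: off by two for existing entries."""
    return 1 - index if index < 0 else index

items = ["a", "b", "c", "d"]
encoded = add_entry(items, "c")  # "c" already sits at position 2
print(encoded, decode_fixed(encoded), decode_buggy(encoded))
```

For an entry found at position 2 the encoded value is -3. The patched expression maps it back to 2, while `1 - index` yields 4, which in this four-element list is past the end.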

^ permalink raw reply related

* Re: VCS comparison table
From: Erik Bågfors @ 2006-10-22  9:56 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth,
	Jan Hudec, git
In-Reply-To: <200610221105.26421.jnareb@gmail.com>

> For example git encourages using many short and longer-lived feature
> branches; I don't see bzr encouraging this workflow.

Why not? I think it really does.  And due to the fact that merges are
merges and will show up as such, I think it's very suitable for
feature branches.

In fact, in the development of bzr itself, all commits are done
in feature branches and then merged into bzr.dev (the main "trunk" of
bzr) when they are considered stable.

Consider the following:
bzr branch mainline featureA
cd featureA
hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m f2; etc
Now I want to merge in mainline again
bzr merge ../mainline; bzr commit -m merge
hack hack; bzr commit -m f3; hack hack; bzr commit -m f4; etc

Right now, I would have something like this in the branch log:
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f4
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f3
----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   merge
      -----------------------------------------------------------------
      committer: Foo Bar <foo@bar.com>
      branch nick: mainline
      message:
         something done in mainline
      -----------------------------------------------------------------
      committer: Foo Bar <foo@bar.com>
      branch nick: mainline
      message:
         something else done in mainline
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f2
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f1

In this view, I can easily see what was part of this feature branch,
because the commits that belong to the feature branch are not
indented, and they have a "branch nick" of "featureA".  I can also
easily see what comes from other branches.

I can also run bzr log with --line or --short, which shows only the
commits made in this branch and not the ones that are merged in.  So
with --line I would get something like
Erik Bågfors 2006-10-19 f4
Erik Bågfors 2006-10-19 f3
Erik Bågfors 2006-10-19 merge
Erik Bågfors 2006-10-19 f2
Erik Bågfors 2006-10-19 f1

Which will give me a good view of what has been done in this feature
branch only.
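
One way to picture the --line view is as a walk that follows only the first parent of each commit. The sketch below is an illustration under that assumption, not bzr's implementation. The commit graph, the `base` commit and the mainline names `m1` and `m2` are made up to match the featureA example.

```python
# Hypothetical commit graph: each commit maps to its list of parents,
# with the branch's own predecessor listed first.
PARENTS = {
    "f4": ["f3"],
    "f3": ["merge"],
    "merge": ["f2", "m2"],  # second parent is the merged mainline tip
    "m2": ["m1"],
    "m1": ["base"],
    "f2": ["f1"],
    "f1": ["base"],
    "base": [],
}

def branch_only_log(tip, parents, stop):
    """Follow first parents from tip, newest first, stopping before `stop`."""
    log = []
    commit = tip
    while commit is not None and commit != stop:
        log.append(commit)
        # Take the first parent only; merged-in history is never visited.
        commit = parents[commit][0] if parents[commit] else None
    return log

print(branch_only_log("f4", PARENTS, "base"))
```

The mainline commits `m1` and `m2` are reachable only through the second parent of `merge`, so they never appear in the result.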

If I understand it correctly, in git, you don't really know what has
been committed as part of this branch/repo, and what has been
committed in another branch/repo (this is my understanding from
reading this thread, I might be wrong, feel free to correct me again
:) )

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply

* Re: VCS comparison table
From: Matthew D. Fuller @ 2006-10-22 10:00 UTC (permalink / raw)
  To: Sean; +Cc: Jeff Licquia, bazaar-ng, git
In-Reply-To: <20061021233014.d4525a1d.seanlkml@sympatico.ca>

On Sat, Oct 21, 2006 at 11:30:14PM -0400 I heard the voice of
Sean, and lo! it spake thus:
> On Sat, 21 Oct 2006 23:23:37 -0400
> Jeff Licquia <jeff@licquia.org> wrote:
> > 
> > OK.  So you are conflating the two.  Could someone who isn't
> > comment?
> 
> No, actually i'm not.  Single revno's or your dotted revno's _both_
> have the same property.

I think Jeff's actually meaning the other way around.  We're confident
through experience of the utility of the single revnos.  We're NOT (at
least, I'm not) so convinced of the utility and usability of the
dotted ones; they haven't gone through the crucible of experience yet.

During the dotted-decimal discussion, I favored numbering from the
merge point (rather than the ancestral point) for a lot of the same
reasons brought up here.  e.g., the log-ish output would look
something like:

200
199
 199.3
 199.2
 199.1
198
[...]

See <https://lists.ubuntu.com/archives/bazaar-ng/2006q3/017773.html>
for instance.
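
The merge-point numbering in the listing above can be sketched as follows. This assumes that merged revisions are numbered 1..k, oldest first, under the mainline revno that merged them. The function and its input format are hypothetical and not part of bzr.

```python
def log_from_merge_point(mainline):
    """Build a newest-first log listing.

    mainline: list of (revno, merged_count) pairs, oldest first.
    Merged revisions are indented under the mainline revision that
    merged them and numbered revno.1 .. revno.k.
    """
    lines = []
    for revno, merged_count in reversed(mainline):
        lines.append(str(revno))
        for k in range(merged_count, 0, -1):
            lines.append(f" {revno}.{k}")
    return lines

print("\n".join(log_from_merge_point([(198, 0), (199, 3), (200, 0)])))
```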

Of course, now we have them, and they number from ancestors.  So
after that's in a couple releases, we'll get to see how it works in
practice.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply

* The bad patches to git-branch
From: Lars Hjemli @ 2006-10-22  9:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hello

Just in case you considered applying my patches to git-branch, please
don't, as they're too much of a WIP.

I'll redo the patches (unless you object to the whole idea of
git-branch -v :-)

-- 
larsh

^ permalink raw reply

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-22  9:05 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git
In-Reply-To: <20061022074513.GF29927@artax.karlin.mff.cuni.cz>

Jan Hudec wrote:
> On Sat, Oct 21, 2006 at 04:05:18PM -0400, Aaron Bentley wrote:
>> Carl Worth wrote:
>>> On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:

>>>> Bazaar encourages you to stick lots and lots of branches in your
>>>> repository.  They don't even have to be related.  For example, my repo
>>>> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
>>> 
>>> Git allows this just fine. And lots of branches belonging to a single
>>> project is definitely the common usage. It is not common (nor
>>> encouraged) for unrelated projects to share a repository, since a git
>>> clone will fetch every branch in the repository.
>> 
>> Right.  This is a difference between Bazaar and Git that's I'd
>> characterize as being "branch-oriented" vs "repository-oriented".  We'll
>> see more of this below.
> 
> This is one of things I on the other hand like better on bzr than git.
> Because it is really branches and not repositories that I usually care
> about.

That's probably because you are used to Bazaar-NG, and it is your habits
speaking. Think of a git clone of a repository as the equivalent of a
bzr "branch".

For example git encourages using many short and longer-lived feature
branches; I don't see bzr encouraging this workflow.
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: VCS comparison table
From: Tim Webster @ 2006-10-22  7:49 UTC (permalink / raw)
  To: git; +Cc: Aaron Bentley, bazaar-ng, Jakub Narebski
In-Reply-To: <Pine.LNX.4.64.0610211007320.3962@g5.osdl.org>

On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Aaron Bentley wrote:
> >
> > Any SCM worth its salt should support that.  AIUI, that's not what Tim
> > wants.  He wants to intermix files from different repos in the same
> > directory.
> >
> > i.e.
> >
> > project/file-1
> > project/file-2
> > project/.git-1
> > project/.git-2
>
> Ok, that's just insane.
[snip]
> Anyway. Git certainly allows you to do some really insane things. The
> above is just the beginning - it's not even talking about alternate object
> directories where you can share databases _partially_ between two
> otherwise totally independent repositories etc.


Perhaps this is insane, but it does not make sense to track all config
files in /etc as though they belong in a single repo. Each
application/pkg has a set of associated config files. Actually, in some
cases it is easy to track which files belong in each application/pkg
repo. For example, dpkg lists conffiles per pkg. Additional config files
not in the application/pkg maintainer repo branch are easily added to
the application/pkg local repo branch.

My question is where should file metadata be stored in git? With hook
scripts, the file metadata can be captured and applied appropriately.
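
A minimal sketch of the capture/apply idea, assuming the metadata lives in a tracked manifest file next to the config files. The manifest name `metadata.json` and both function names are hypothetical. Which hooks would call them, and when, is left open here.

```python
import json
import os
import stat

def capture_metadata(root, manifest="metadata.json"):
    """Record mode, uid and gid of every regular file under root."""
    manifest_path = os.path.join(root, manifest)
    records = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the repository's own data.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if path == manifest_path or os.path.islink(path):
                continue
            st = os.lstat(path)
            records[os.path.relpath(path, root)] = {
                "mode": stat.S_IMODE(st.st_mode),
                "uid": st.st_uid,
                "gid": st.st_gid,
            }
    with open(manifest_path, "w") as fh:
        json.dump(records, fh, indent=1, sort_keys=True)
    return records

def apply_metadata(root, manifest="metadata.json"):
    """Restore recorded permission bits.

    Ownership is recorded but not restored here, since changing it
    normally needs privileges.
    """
    with open(os.path.join(root, manifest)) as fh:
        records = json.load(fh)
    for rel, meta in records.items():
        path = os.path.join(root, rel)
        if os.path.exists(path):
            os.chmod(path, meta["mode"])
```

A hook script could call `capture_metadata` before a commit, so the manifest is versioned with the files, and `apply_metadata` after the working tree has been updated.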

If a similar thing can be done with bzr as Linus described for git, I
am all ears.

^ permalink raw reply

* Re: VCS comparison table
From: Jan Hudec @ 2006-10-22  7:45 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski
In-Reply-To: <453A7D7E.8060105@utoronto.ca>

On Sat, Oct 21, 2006 at 04:05:18PM -0400, Aaron Bentley wrote:
> Carl Worth wrote:
> > On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
> [...]
> > But it really is fundamental and unavoidable that sequential numbers
> > don't work as names in a distributed version control system.
> 
> Right.  You need something guaranteed to be unique.  It's the revno +
> url combo that is unique.  That may not be permanent, but anyone can
> create one of those names, so it is decentralized.

But it is *not* *distributed*. The definition of a distributed system
requires, among other things, that resource identifiers be independent
of the location of the resources. So only using the revision-ids is
really distributed.
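
The distinction can be made concrete with a small sketch. A hash stands in here for any globally unique revision-id; the thread does not show bzr's actual revision-id format, and the two branches and all names are made up.

```python
import hashlib

def revision_id(content):
    """Location-independent name: derived from the revision itself."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()

def revnos(branch):
    """Location-dependent names: position within one branch's history."""
    return {content: n for n, content in enumerate(branch, start=1)}

# Two hypothetical branches holding the same revisions in a different order.
alice = ["base", "fix-a", "fix-b"]
bob = ["base", "fix-b", "fix-a"]

print(revnos(alice)["fix-a"], revnos(bob)["fix-a"])
print({revision_id(c) for c in alice} == {revision_id(c) for c in bob})
```

The same revision gets revno 2 in one branch and 3 in the other, so a revno only means something together with the branch location, while the derived id is the same wherever the revision is stored.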

> >> I meant that the active branch and a mirror of the abandoned branch
> >> could be stored in the same repository, for ease of access.
> > 
> > Granted, everything can be stored in one repository. But that still
> > doesn't change what I was trying to say with my example. One of the
> > repositories would "win" (the names it published during the fork would
> > still be valid). And the other repository would "lose" (the names it
> > published would be not valid anymore). Right?
> 
> No.  It would be silly for the losing side to publish a mirror of the
> winning branch at the same location where they had previously published
> their own branch.  So the old number + URL combination would remain valid.

I regularly use bzr and have never used git, but I would not hesitate a
second to pull --overwrite over the old location, because to me the URL
means "the base I develop against" and I would want to preserve that
meaning.

> If the losing faction decided to maintain their own branch after the
> merge, they'd have two options
> 
> 1. continue to develop against the losing "branch", without updating its
> numbers from the "winning" branch.  It would be hard to tell who had won
> or lost in this case.
> 
> 2. create a new mirror of the "winning" branch and develop against that.
>  I'm not sure what this point of this would be.
> 
> I think the most realistic thing in this scenario is that they leave the
> "losing" branch exactly where it was, and develop against the "winning"
> branch.
> 
> >> Bazaar encourages you to stick lots and lots of branches in your
> >> repository.  They don't even have to be related.  For example, my repo
> >> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
> > 
> > Git allows this just fine. And lots of branches belonging to a single
> > project is definitely the common usage. It is not common (nor
> > encouraged) for unrelated projects to share a repository, since a git
> > clone will fetch every branch in the repository.
> 
> Right.  This is a difference between Bazaar and Git that's I'd
> characterize as being "branch-oriented" vs "repository-oriented".  We'll
> see more of this below.

This, on the other hand, is one of the things I like better in bzr than
in git, because it is really branches and not repositories that I
usually care about.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply

* Re: prune/prune-packed
From: Junio C Hamano @ 2006-10-22  4:59 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: git
In-Reply-To: <20061022035919.GA4420@fieldses.org>

"J. Bruce Fields" <bfields@fieldses.org> writes:

> Both "man prune" and everyday.txt say that git-prune also runs
> git-prune-packed.  But that doesn't seem to be true.  Is the bug in the
> documentation?

I think it is a regression when prune was rewritten as a
built-in.

^ permalink raw reply

* prune/prune-packed
From: J. Bruce Fields @ 2006-10-22  3:59 UTC (permalink / raw)
  To: git

Both "man prune" and everyday.txt say that git-prune also runs
git-prune-packed.  But that doesn't seem to be true.  Is the bug in the
documentation?

--b.

^ permalink raw reply

* Re: VCS comparison table
From: Sean @ 2006-10-22  3:30 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git
In-Reply-To: <1161487417.9241.220.camel@localhost.localdomain>

On Sat, 21 Oct 2006 23:23:37 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> > The archives have all the posts of people claiming that there were no
> > issues with revno's and fully distributed models.  
> 
> "revno's"?  Which "revno's"? ...
> 
> OK.  So you are conflating the two.  Could someone who isn't comment?

No, actually i'm not.  Single revno's or your dotted revno's _both_ have the
same property.  They can only be local data and can not guarantee stability
in a fully distributed environment.

Sean

^ permalink raw reply

* Re: VCS comparison table
From: Jeff Licquia @ 2006-10-22  3:23 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git
In-Reply-To: <20061021212645.2f9ba751.seanlkml@sympatico.ca>

On Sat, 2006-10-21 at 21:26 -0400, Sean wrote:
> On Sat, 21 Oct 2006 20:46:45 -0400
> Jeff Licquia <jeff@licquia.org> wrote:
> > I suspect you're conflating the two, and interpreting certainty for the
> > former as certainty for the latter.  Though I don't mind being
> > corrected.
> 
> The archives have all the posts of people claiming that there were no
> issues with revno's and fully distributed models.  

"revno's"?  Which "revno's"? ...

OK.  So you are conflating the two.  Could someone who isn't comment?

^ permalink raw reply

* renames in StGIT
From: Karl Hasselström @ 2006-10-22  1:39 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: git

It doesn't seem like StGIT uses any of git's rename tracking stuff.
Specifically, pushing patches doesn't seem to use rename-aware
merging, and there is no way to tell diff to detect renames and
copies.

Should this perhaps be an item in the TODO list?

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply

