git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Adeodato Simó" <dato@net.com.org.es>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Pierre Habouzit <madcoder@debian.org>,
	davidel@xmailserver.org, Francis Galiegue <fg@one2team.net>,
	Git ML <git@vger.kernel.org>
Subject: Re: [PATCH 0/3] Teach Git about the patience diff algorithm
Date: Thu, 1 Jan 2009 21:46:52 +0100	[thread overview]
Message-ID: <20090101204652.GA26128@chistera.yi.org> (raw)
In-Reply-To: <alpine.LFD.2.00.0901011134210.5086@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 4390 bytes --]

* Linus Torvalds [Thu, 01 Jan 2009 11:45:21 -0800]:

> On Thu, 1 Jan 2009, Johannes Schindelin wrote:

> > Nothing fancy, really, just a straight-forward implementation of the
> > heavily under-documented and under-analyzed paience diff algorithm.

> Exactly because the patience diff is so under-documented, could you 
> perhaps give a few examples of how it differs in the result, and why it's 
> so wonderful? Yes, yes, I can google, and no, no, nothing useful shows up 
> except for *totally* content-free fanboisms. 

> So could we have some actual real data on it?

For me, the cases where I find patience output to be of substantial
higher readability are those involving a rewrite of several consecutive
paragraphs (i.e., lines of code separated by blank lines). Compare:

-8<- git -8<-
@@ -51,29 +51,30 @@ def mbox_update(bug):
             f.close()
     else:
         # make a list of Message-Id we have
-        fp1 = file(path, 'ab+')
-        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]
+        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }
 
-        # get remote mbox again
-        fp2 = tempfile.TemporaryFile()
-        retrieve_mbox(bug, fp2)
+        with tempfile.NamedTemporaryFile() as tmpfd:
+            # retrieve the remote mbox again
+            retrieve_mbox(bug, tmpfd)
 
-        # parse its messages
-        fp2.seek(0)
-        parser = email.Parser.Parser()
-        msgs2 = dict((x['Message-Id'], x)
-                    for x in mailbox.UnixMailbox(fp2, parser.parse))
+            # parse its messages
+            parser = email.parser.Parser()
+            new_msgids = { x['Message-Id']: x
+                            for x in mailbox.mbox(tmpfd.name, parser.parse) }
 
-        # now append the new ones
-        for msgid in set(msgs2.keys()) - set(ids1):
-            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))
+        with open(path, 'a+') as fd:
+            # now append the new messages
+            for msgid in new_msgids.keys() - msgids:
+                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))
 
     return path
->8- git ->8-

with:

-8<- bzr patience -8<-
@@ -51,29 +51,30 @@
             f.close()
     else:
         # make a list of Message-Id we have
-        fp1 = file(path, 'ab+')
-        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]
-
-        # get remote mbox again
-        fp2 = tempfile.TemporaryFile()
-        retrieve_mbox(bug, fp2)
-
-        # parse its messages
-        fp2.seek(0)
-        parser = email.Parser.Parser()
-        msgs2 = dict((x['Message-Id'], x)
-                    for x in mailbox.UnixMailbox(fp2, parser.parse))
-
-        # now append the new ones
-        for msgid in set(msgs2.keys()) - set(ids1):
-            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))
+        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }
+
+        with tempfile.NamedTemporaryFile() as tmpfd:
+            # retrieve the remote mbox again
+            retrieve_mbox(bug, tmpfd)
+
+            # parse its messages
+            parser = email.parser.Parser()
+            new_msgids = { x['Message-Id']: x
+                            for x in mailbox.mbox(tmpfd.name, parser.parse) }
+
+        with open(path, 'a+') as fd:
+            # now append the new messages
+            for msgid in new_msgids.keys() - msgids:
+                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))
 
     return path
->8- bzr patience ->8-

I don't know about you, but I find the latter much easier to read,
because the whole context of each version is always available.

As you see, in (at least) this case is just a matter of considering the
blank lines worthy of presented as common, or not.

I'll note that in this particular case, `git diff` yielded the very same
results with or without --patience. I don't know why that is, Johannes?
I'll also note that /usr/bin/diff produces (in this case) something
closer to patience than to git.

I'm attaching both versions of the file in case they are useful to
anybody.

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
I promise you. Once I enter into an exclusive relationship, I sleep with
very few people.
                -- Denny Crane

[-- Attachment #2: bdo0 --]
[-- Type: text/plain, Size: 2358 bytes --]

#! /usr/bin/python
## vim: fileencoding=utf-8

"""Open Debian BTS mboxes with Mutt, à la /usr/bin/bts show --mbox.

A cache of mboxes is kept, and changed mboxes will be merged with existing
files instead of replacing them, so that e.g. read-status is preserved for each
message.
"""

import os
import re
import sys
import urllib
import mailbox
import tempfile
import email.Parser

MBOX_DIR = os.path.expanduser('~/.mail/y.bug-cache')

##

def main():
    if len(sys.argv) != 2:
        print >>sys.stderr, 'Usage: %s <bugnumber>' % (sys.argv[0],)
        sys.exit(1)

    bug = re.sub(r'[^0-9]', '', sys.argv[1])
    if not re.match(r'\d{4,}$', bug):
        print >>sys.stderr, \
            'E: %s does not seem a valid number' % (sys.argv[1],)
        sys.exit(1)

    path = mbox_update(bug)
    invoke_mailer(path)

##

def mbox_update(bug):
    """Return a path with an up-to-date copy of the mbox for bug."""
    path = os.path.join(MBOX_DIR, bug + '.mbox')

    if not os.path.exists(path):
        f = file(path, 'wb')
        try:
            retrieve_mbox(bug, f)
        except:
            os.unlink(path)
            raise
        else:
            f.close()
    else:
        # make a list of Message-Id we have
        fp1 = file(path, 'ab+')
        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]

        # get remote mbox again
        fp2 = tempfile.TemporaryFile()
        retrieve_mbox(bug, fp2)

        # parse its messages
        fp2.seek(0)
        parser = email.Parser.Parser()
        msgs2 = dict((x['Message-Id'], x)
                    for x in mailbox.UnixMailbox(fp2, parser.parse))

        # now append the new ones
        for msgid in set(msgs2.keys()) - set(ids1):
            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))

    return path

def retrieve_mbox(bug, fileobj):
    """Retrieve mbox for bug from bugs.debian.org, writing it to fileobj."""
    for line in urllib.urlopen(
            'http://bugs.debian.org/cgi-bin/bugreport.cgi?mboxstatus=yes;mboxmaint=yes;mbox=yes;bug=%s' % (bug,)):
        fileobj.write(line)

def invoke_mailer(path):
    """Exec mutt, opening path."""
    os.execlp('mutt', 'mutt', '-f', path)

##

if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        print >>sys.stderr, '\nCancelled.'
        sys.exit(1)

[-- Attachment #3: bdo1 --]
[-- Type: text/plain, Size: 2501 bytes --]

#! /usr/bin/python3

"""Open Debian BTS mboxes with Mutt, à la /usr/bin/bts show --mbox.

A cache of mboxes is kept, and changed mboxes will be merged with existing
files instead of replacing them, so that e.g. read-status is preserved for each
message.
"""

import os
import re
import sys
import mailbox
import tempfile
import email.parser
import urllib.request

MBOX_DIR = os.path.expanduser('~/.mail/y.bug-cache')

##

def main():
    if len(sys.argv) != 2:
        print('Usage: {0} <bugnumber>'.format(sys.argv[0]), file=sys.stderr)
        return 1
    else:
        bug = re.sub(r'[^0-9]', '', sys.argv[1])

    if not re.search(r'^\d{4,}$', bug):
        print('E: {0} does not seem a valid number'.format(sys.argv[1]),
              file=sys.stderr)
        return 1

    path = mbox_update(bug)
    invoke_mailer(path)

##

def mbox_update(bug):
    """Return a path with an up-to-date copy of the mbox for bug."""
    path = os.path.join(MBOX_DIR, bug + '.mbox')

    if not os.path.exists(path):
        f = open(path, 'wb')
        try:
            retrieve_mbox(bug, f)
        except:
            os.unlink(path)
            raise
        else:
            f.close()
    else:
        # make a list of Message-Id we have
        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }

        with tempfile.NamedTemporaryFile() as tmpfd:
            # retrieve the remote mbox again
            retrieve_mbox(bug, tmpfd)

            # parse its messages
            parser = email.parser.Parser()
            new_msgids = { x['Message-Id']: x
                            for x in mailbox.mbox(tmpfd.name, parser.parse) }

        with open(path, 'a+') as fd:
            # now append the new messages
            for msgid in new_msgids.keys() - msgids:
                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))

    return path

def retrieve_mbox(bug, fileobj):
    """Retrieve mbox for bug from bugs.debian.org, writing it to fileobj."""
    url = urllib.request.urlopen(
            'http://bugs.debian.org/cgi-bin/bugreport.cgi?'
            'mboxstatus=yes;mboxmaint=yes;mbox=yes;bug={0}'.format(bug))
    for line in url.fp: # http://bugs.python.org/issue4608
        fileobj.write(line)

def invoke_mailer(path):
    """Exec mutt, opening path."""
    os.execlp('mutt', 'mutt', '-f', path)

##

if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        print('\nCancelled.', file=sys.stderr)
        sys.exit(1)

  parent reply	other threads:[~2009-01-01 20:48 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-04  0:40 libxdiff and patience diff Pierre Habouzit
2008-11-04  3:17 ` Davide Libenzi
2008-11-04  8:33   ` Pierre Habouzit
2008-11-04  5:39 ` Johannes Schindelin
2008-11-04  8:30   ` Pierre Habouzit
2008-11-04 14:34     ` Johannes Schindelin
2008-11-04 15:23       ` Pierre Habouzit
2008-11-04 15:57         ` Johannes Schindelin
2008-11-04 16:15           ` Pierre Habouzit
2009-01-01 16:38         ` [PATCH 0/3] Teach Git about the patience diff algorithm Johannes Schindelin
2009-01-01 16:38           ` [PATCH 1/3] Implement " Johannes Schindelin
2009-01-01 16:39           ` [PATCH 2/3] Introduce the diff option '--patience' Johannes Schindelin
2009-01-01 16:39           ` [PATCH 3/3] bash completions: Add the --patience option Johannes Schindelin
2009-01-01 19:45           ` [PATCH 0/3] Teach Git about the patience diff algorithm Linus Torvalds
2009-01-01 20:00             ` Linus Torvalds
2009-01-02 18:17               ` Johannes Schindelin
2009-01-02 18:49                 ` Linus Torvalds
2009-01-02 19:07                   ` Johannes Schindelin
2009-01-02 18:51                 ` Jeff King
2009-01-02 21:59               ` [PATCH 1/3 v2] Implement " Johannes Schindelin
2009-01-02 21:59                 ` Johannes Schindelin
2009-01-01 20:46             ` Adeodato Simó [this message]
2009-01-02  1:56               ` [PATCH 0/3] Teach Git about " Linus Torvalds
2009-01-02 10:55                 ` Clemens Buchacher
2009-01-02 10:58                   ` Clemens Buchacher
2009-01-02 16:42                     ` Linus Torvalds
2009-01-02 18:46                       ` Johannes Schindelin
2009-01-02 19:03                         ` Linus Torvalds
2009-01-02 19:22                           ` Johannes Schindelin
2009-01-02 19:39                           ` Jeff King
2009-01-02 19:50                             ` Jeff King
2009-01-02 20:52                               ` Jeff King
2009-01-02 23:05                                 ` Linus Torvalds
2009-01-03 16:24                             ` Bazaar's patience diff as GIT_EXTERNAL_DIFF Adeodato Simó
2009-01-02 21:59                       ` [PATCH 0/3] Teach Git about the patience diff algorithm Johannes Schindelin
2009-01-08 19:55                       ` Adeodato Simó
2009-01-08 20:06                         ` Adeodato Simó
2009-01-09  6:54                         ` Junio C Hamano
2009-01-09 13:07                           ` Johannes Schindelin
2009-01-09 15:59                             ` Adeodato Simó
2009-01-09 18:09                             ` Linus Torvalds
2009-01-09 18:13                               ` Linus Torvalds
2009-01-09 20:53                             ` Junio C Hamano
2009-01-10 11:36                               ` Johannes Schindelin
2009-01-02 11:03                   ` Junio C Hamano
2009-01-02 18:50                 ` Adeodato Simó
2009-01-06 11:17     ` Pierre Habouzit
2009-01-06 11:39       ` Pierre Habouzit
2009-01-06 19:40       ` Johannes Schindelin
2009-01-07 14:39         ` Pierre Habouzit
2009-01-07 17:01           ` Johannes Schindelin
2009-01-07 17:04             ` [PATCH v3 1/3] Implement " Johannes Schindelin
2009-01-07 18:10               ` Davide Libenzi
2009-01-07 18:32                 ` Johannes Schindelin
2009-01-07 20:09                   ` Davide Libenzi
2009-01-07 20:19                     ` Johannes Schindelin
2009-01-07 18:59                 ` Linus Torvalds
2009-01-07 20:00                   ` Johannes Schindelin
2009-01-07 20:11                   ` Davide Libenzi
2009-01-07 20:15         ` [PATCH 0/3] Teach Git about " Sam Vilain
2009-01-07 20:25           ` Linus Torvalds
2009-01-08  2:31             ` Sam Vilain
2009-01-07 20:38           ` Johannes Schindelin
2009-01-07 20:48             ` Junio C Hamano
2009-01-07 22:00               ` Johannes Schindelin
2009-01-07 22:45                 ` Pierre Habouzit
2009-01-07 23:03                   ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090101204652.GA26128@chistera.yi.org \
    --to=dato@net.com.org.es \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=davidel@xmailserver.org \
    --cc=fg@one2team.net \
    --cc=git@vger.kernel.org \
    --cc=madcoder@debian.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).