All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Adeodato Simó" <dato@net.com.org.es>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Pierre Habouzit <madcoder@debian.org>,
	davidel@xmailserver.org, Francis Galiegue <fg@one2team.net>,
	Git ML <git@vger.kernel.org>
Subject: Re: [PATCH 0/3] Teach Git about the patience diff algorithm
Date: Thu, 1 Jan 2009 21:46:52 +0100	[thread overview]
Message-ID: <20090101204652.GA26128@chistera.yi.org> (raw)
In-Reply-To: <alpine.LFD.2.00.0901011134210.5086@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 4390 bytes --]

* Linus Torvalds [Thu, 01 Jan 2009 11:45:21 -0800]:

> On Thu, 1 Jan 2009, Johannes Schindelin wrote:

> > Nothing fancy, really, just a straight-forward implementation of the
> > heavily under-documented and under-analyzed paience diff algorithm.

> Exactly because the patience diff is so under-documented, could you 
> perhaps give a few examples of how it differs in the result, and why it's 
> so wonderful? Yes, yes, I can google, and no, no, nothing useful shows up 
> except for *totally* content-free fanboisms. 

> So could we have some actual real data on it?

For me, the cases where I find patience output to be of substantial
higher readability are those involving a rewrite of several consecutive
paragraphs (i.e., lines of code separated by blank lines). Compare:

-8<- git -8<-
@@ -51,29 +51,30 @@ def mbox_update(bug):
             f.close()
     else:
         # make a list of Message-Id we have
-        fp1 = file(path, 'ab+')
-        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]
+        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }
 
-        # get remote mbox again
-        fp2 = tempfile.TemporaryFile()
-        retrieve_mbox(bug, fp2)
+        with tempfile.NamedTemporaryFile() as tmpfd:
+            # retrieve the remote mbox again
+            retrieve_mbox(bug, tmpfd)
 
-        # parse its messages
-        fp2.seek(0)
-        parser = email.Parser.Parser()
-        msgs2 = dict((x['Message-Id'], x)
-                    for x in mailbox.UnixMailbox(fp2, parser.parse))
+            # parse its messages
+            parser = email.parser.Parser()
+            new_msgids = { x['Message-Id']: x
+                            for x in mailbox.mbox(tmpfd.name, parser.parse) }
 
-        # now append the new ones
-        for msgid in set(msgs2.keys()) - set(ids1):
-            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))
+        with open(path, 'a+') as fd:
+            # now append the new messages
+            for msgid in new_msgids.keys() - msgids:
+                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))
 
     return path
->8- git ->8-

with:

-8<- bzr patience -8<-
@@ -51,29 +51,30 @@
             f.close()
     else:
         # make a list of Message-Id we have
-        fp1 = file(path, 'ab+')
-        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]
-
-        # get remote mbox again
-        fp2 = tempfile.TemporaryFile()
-        retrieve_mbox(bug, fp2)
-
-        # parse its messages
-        fp2.seek(0)
-        parser = email.Parser.Parser()
-        msgs2 = dict((x['Message-Id'], x)
-                    for x in mailbox.UnixMailbox(fp2, parser.parse))
-
-        # now append the new ones
-        for msgid in set(msgs2.keys()) - set(ids1):
-            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))
+        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }
+
+        with tempfile.NamedTemporaryFile() as tmpfd:
+            # retrieve the remote mbox again
+            retrieve_mbox(bug, tmpfd)
+
+            # parse its messages
+            parser = email.parser.Parser()
+            new_msgids = { x['Message-Id']: x
+                            for x in mailbox.mbox(tmpfd.name, parser.parse) }
+
+        with open(path, 'a+') as fd:
+            # now append the new messages
+            for msgid in new_msgids.keys() - msgids:
+                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))
 
     return path
->8- bzr patience ->8-

I don't know about you, but I find the latter much easier to read,
because the whole context of each version is always available.

As you see, in (at least) this case is just a matter of considering the
blank lines worthy of presented as common, or not.

I'll note that in this particular case, `git diff` yielded the very same
results with or without --patience. I don't know why that is, Johannes?
I'll also note that /usr/bin/diff produces (in this case) something
closer to patience than to git.

I'm attaching both versions of the file in case they are useful to
anybody.

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
I promise you. Once I enter into an exclusive relationship, I sleep with
very few people.
                -- Denny Crane

[-- Attachment #2: bdo0 --]
[-- Type: text/plain, Size: 2358 bytes --]

#! /usr/bin/python
## vim: fileencoding=utf-8

"""Open Debian BTS mboxes with Mutt, à la /usr/bin/bts show --mbox.

A cache of mboxes is kept, and changed mboxes will be merged with existing
files instead of replacing them, so that e.g. read-status is preserved for each
message.
"""

import os
import re
import sys
import urllib
import mailbox
import tempfile
import email.Parser

MBOX_DIR = os.path.expanduser('~/.mail/y.bug-cache')

##

def main():
    if len(sys.argv) != 2:
        print >>sys.stderr, 'Usage: %s <bugnumber>' % (sys.argv[0],)
        sys.exit(1)

    bug = re.sub(r'[^0-9]', '', sys.argv[1])
    if not re.match(r'\d{4,}$', bug):
        print >>sys.stderr, \
            'E: %s does not seem a valid number' % (sys.argv[1],)
        sys.exit(1)

    path = mbox_update(bug)
    invoke_mailer(path)

##

def mbox_update(bug):
    """Return a path with an up-to-date copy of the mbox for bug."""
    path = os.path.join(MBOX_DIR, bug + '.mbox')

    if not os.path.exists(path):
        f = file(path, 'wb')
        try:
            retrieve_mbox(bug, f)
        except:
            os.unlink(path)
            raise
        else:
            f.close()
    else:
        # make a list of Message-Id we have
        fp1 = file(path, 'ab+')
        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]

        # get remote mbox again
        fp2 = tempfile.TemporaryFile()
        retrieve_mbox(bug, fp2)

        # parse its messages
        fp2.seek(0)
        parser = email.Parser.Parser()
        msgs2 = dict((x['Message-Id'], x)
                    for x in mailbox.UnixMailbox(fp2, parser.parse))

        # now append the new ones
        for msgid in set(msgs2.keys()) - set(ids1):
            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))

    return path

def retrieve_mbox(bug, fileobj):
    """Retrieve mbox for bug from bugs.debian.org, writing it to fileobj."""
    for line in urllib.urlopen(
            'http://bugs.debian.org/cgi-bin/bugreport.cgi?mboxstatus=yes;mboxmaint=yes;mbox=yes;bug=%s' % (bug,)):
        fileobj.write(line)

def invoke_mailer(path):
    """Exec mutt, opening path."""
    os.execlp('mutt', 'mutt', '-f', path)

##

if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        print >>sys.stderr, '\nCancelled.'
        sys.exit(1)

[-- Attachment #3: bdo1 --]
[-- Type: text/plain, Size: 2501 bytes --]

#! /usr/bin/python3

"""Open Debian BTS mboxes with Mutt, à la /usr/bin/bts show --mbox.

A cache of mboxes is kept, and changed mboxes will be merged with existing
files instead of replacing them, so that e.g. read-status is preserved for each
message.
"""

import os
import re
import sys
import mailbox
import tempfile
import email.parser
import urllib.request

MBOX_DIR = os.path.expanduser('~/.mail/y.bug-cache')

##

def main():
    if len(sys.argv) != 2:
        print('Usage: {0} <bugnumber>'.format(sys.argv[0]), file=sys.stderr)
        return 1
    else:
        bug = re.sub(r'[^0-9]', '', sys.argv[1])

    if not re.search(r'^\d{4,}$', bug):
        print('E: {0} does not seem a valid number'.format(sys.argv[1]),
              file=sys.stderr)
        return 1

    path = mbox_update(bug)
    invoke_mailer(path)

##

def mbox_update(bug):
    """Return a path with an up-to-date copy of the mbox for bug."""
    path = os.path.join(MBOX_DIR, bug + '.mbox')

    if not os.path.exists(path):
        f = open(path, 'wb')
        try:
            retrieve_mbox(bug, f)
        except:
            os.unlink(path)
            raise
        else:
            f.close()
    else:
        # make a list of Message-Id we have
        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }

        with tempfile.NamedTemporaryFile() as tmpfd:
            # retrieve the remote mbox again
            retrieve_mbox(bug, tmpfd)

            # parse its messages
            parser = email.parser.Parser()
            new_msgids = { x['Message-Id']: x
                            for x in mailbox.mbox(tmpfd.name, parser.parse) }

        with open(path, 'a+') as fd:
            # now append the new messages
            for msgid in new_msgids.keys() - msgids:
                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))

    return path

def retrieve_mbox(bug, fileobj):
    """Retrieve mbox for bug from bugs.debian.org, writing it to fileobj."""
    url = urllib.request.urlopen(
            'http://bugs.debian.org/cgi-bin/bugreport.cgi?'
            'mboxstatus=yes;mboxmaint=yes;mbox=yes;bug={0}'.format(bug))
    for line in url.fp: # http://bugs.python.org/issue4608
        fileobj.write(line)

def invoke_mailer(path):
    """Exec mutt, opening path."""
    os.execlp('mutt', 'mutt', '-f', path)

##

if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        print('\nCancelled.', file=sys.stderr)
        sys.exit(1)

  parent reply	other threads:[~2009-01-01 20:48 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-04  0:40 libxdiff and patience diff Pierre Habouzit
2008-11-04  3:17 ` Davide Libenzi
2008-11-04  8:33   ` Pierre Habouzit
2008-11-04  5:39 ` Johannes Schindelin
2008-11-04  8:30   ` Pierre Habouzit
2008-11-04 14:34     ` Johannes Schindelin
2008-11-04 15:23       ` Pierre Habouzit
2008-11-04 15:57         ` Johannes Schindelin
2008-11-04 16:15           ` Pierre Habouzit
2009-01-01 16:38         ` [PATCH 0/3] Teach Git about the patience diff algorithm Johannes Schindelin
2009-01-01 16:38           ` [PATCH 1/3] Implement " Johannes Schindelin
2009-01-01 16:39           ` [PATCH 2/3] Introduce the diff option '--patience' Johannes Schindelin
2009-01-01 16:39           ` [PATCH 3/3] bash completions: Add the --patience option Johannes Schindelin
2009-01-01 19:45           ` [PATCH 0/3] Teach Git about the patience diff algorithm Linus Torvalds
2009-01-01 20:00             ` Linus Torvalds
2009-01-02 18:17               ` Johannes Schindelin
2009-01-02 18:49                 ` Linus Torvalds
2009-01-02 19:07                   ` Johannes Schindelin
2009-01-02 18:51                 ` Jeff King
2009-01-02 21:59               ` [PATCH 1/3 v2] Implement " Johannes Schindelin
2009-01-02 21:59                 ` Johannes Schindelin
2009-01-01 20:46             ` Adeodato Simó [this message]
2009-01-02  1:56               ` [PATCH 0/3] Teach Git about " Linus Torvalds
2009-01-02 10:55                 ` Clemens Buchacher
2009-01-02 10:58                   ` Clemens Buchacher
2009-01-02 16:42                     ` Linus Torvalds
2009-01-02 18:46                       ` Johannes Schindelin
2009-01-02 19:03                         ` Linus Torvalds
2009-01-02 19:22                           ` Johannes Schindelin
2009-01-02 19:39                           ` Jeff King
2009-01-02 19:50                             ` Jeff King
2009-01-02 20:52                               ` Jeff King
2009-01-02 23:05                                 ` Linus Torvalds
2009-01-03 16:24                             ` Bazaar's patience diff as GIT_EXTERNAL_DIFF Adeodato Simó
2009-01-02 21:59                       ` [PATCH 0/3] Teach Git about the patience diff algorithm Johannes Schindelin
2009-01-08 19:55                       ` Adeodato Simó
2009-01-08 20:06                         ` Adeodato Simó
2009-01-09  6:54                         ` Junio C Hamano
2009-01-09 13:07                           ` Johannes Schindelin
2009-01-09 15:59                             ` Adeodato Simó
2009-01-09 18:09                             ` Linus Torvalds
2009-01-09 18:13                               ` Linus Torvalds
2009-01-09 20:53                             ` Junio C Hamano
2009-01-10 11:36                               ` Johannes Schindelin
2009-01-02 11:03                   ` Junio C Hamano
2009-01-02 18:50                 ` Adeodato Simó
2009-01-06 11:17           ` Pierre Habouzit
2009-01-06 11:39             ` Pierre Habouzit
2009-01-06 19:40             ` Johannes Schindelin
2009-01-07 14:39               ` Pierre Habouzit
2009-01-07 17:01                 ` Johannes Schindelin
2009-01-07 17:04                   ` [PATCH v3 1/3] Implement " Johannes Schindelin
2009-01-07 18:10                     ` Davide Libenzi
2009-01-07 18:32                       ` Johannes Schindelin
2009-01-07 20:09                         ` Davide Libenzi
2009-01-07 20:19                           ` Johannes Schindelin
2009-01-07 18:59                       ` Linus Torvalds
2009-01-07 20:00                         ` Johannes Schindelin
2009-01-07 20:11                         ` Davide Libenzi
2009-01-07 20:15               ` [PATCH 0/3] Teach Git about " Sam Vilain
2009-01-07 20:25                 ` Linus Torvalds
2009-01-08  2:31                   ` Sam Vilain
2009-01-07 20:38                 ` Johannes Schindelin
2009-01-07 20:48                   ` Junio C Hamano
2009-01-07 22:00                     ` Johannes Schindelin
2009-01-07 22:45                       ` Pierre Habouzit
2009-01-07 23:03                         ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090101204652.GA26128@chistera.yi.org \
    --to=dato@net.com.org.es \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=davidel@xmailserver.org \
    --cc=fg@one2team.net \
    --cc=git@vger.kernel.org \
    --cc=madcoder@debian.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.