From: "Adeodato Simó" <dato@net.com.org.es>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Pierre Habouzit <madcoder@debian.org>,
davidel@xmailserver.org, Francis Galiegue <fg@one2team.net>,
Git ML <git@vger.kernel.org>
Subject: Re: [PATCH 0/3] Teach Git about the patience diff algorithm
Date: Thu, 1 Jan 2009 21:46:52 +0100
Message-ID: <20090101204652.GA26128@chistera.yi.org>
In-Reply-To: <alpine.LFD.2.00.0901011134210.5086@localhost.localdomain>
[-- Attachment #1: Type: text/plain, Size: 4390 bytes --]
* Linus Torvalds [Thu, 01 Jan 2009 11:45:21 -0800]:
> On Thu, 1 Jan 2009, Johannes Schindelin wrote:
> > Nothing fancy, really, just a straight-forward implementation of the
> > heavily under-documented and under-analyzed patience diff algorithm.
> Exactly because the patience diff is so under-documented, could you
> perhaps give a few examples of how it differs in the result, and why it's
> so wonderful? Yes, yes, I can google, and no, no, nothing useful shows up
> except for *totally* content-free fanboisms.
> So could we have some actual real data on it?
For me, the cases where I find patience output substantially more
readable are those involving a rewrite of several consecutive
paragraphs (i.e., groups of lines of code separated by blank lines). Compare:
-8<- git -8<-
@@ -51,29 +51,30 @@ def mbox_update(bug):
             f.close()
     else:
         # make a list of Message-Id we have
-        fp1 = file(path, 'ab+')
-        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]
+        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }

-        # get remote mbox again
-        fp2 = tempfile.TemporaryFile()
-        retrieve_mbox(bug, fp2)
+        with tempfile.NamedTemporaryFile() as tmpfd:
+            # retrieve the remote mbox again
+            retrieve_mbox(bug, tmpfd)

-        # parse its messages
-        fp2.seek(0)
-        parser = email.Parser.Parser()
-        msgs2 = dict((x['Message-Id'], x)
-                     for x in mailbox.UnixMailbox(fp2, parser.parse))
+            # parse its messages
+            parser = email.parser.Parser()
+            new_msgids = { x['Message-Id']: x
+                           for x in mailbox.mbox(tmpfd.name, parser.parse) }

-        # now append the new ones
-        for msgid in set(msgs2.keys()) - set(ids1):
-            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))
+        with open(path, 'a+') as fd:
+            # now append the new messages
+            for msgid in new_msgids.keys() - msgids:
+                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))

     return path
->8- git ->8-
with:
-8<- bzr patience -8<-
@@ -51,29 +51,30 @@
             f.close()
     else:
         # make a list of Message-Id we have
-        fp1 = file(path, 'ab+')
-        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]
-
-        # get remote mbox again
-        fp2 = tempfile.TemporaryFile()
-        retrieve_mbox(bug, fp2)
-
-        # parse its messages
-        fp2.seek(0)
-        parser = email.Parser.Parser()
-        msgs2 = dict((x['Message-Id'], x)
-                     for x in mailbox.UnixMailbox(fp2, parser.parse))
-
-        # now append the new ones
-        for msgid in set(msgs2.keys()) - set(ids1):
-            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))
+        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }
+
+        with tempfile.NamedTemporaryFile() as tmpfd:
+            # retrieve the remote mbox again
+            retrieve_mbox(bug, tmpfd)
+
+            # parse its messages
+            parser = email.parser.Parser()
+            new_msgids = { x['Message-Id']: x
+                           for x in mailbox.mbox(tmpfd.name, parser.parse) }
+
+        with open(path, 'a+') as fd:
+            # now append the new messages
+            for msgid in new_msgids.keys() - msgids:
+                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))

     return path
->8- bzr patience ->8-
I don't know about you, but I find the latter much easier to read,
because the whole context of each version is always available.
As you can see, in this case (at least) it is just a matter of whether
the blank lines are considered worthy of being presented as common or not.
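To make the point about blank lines concrete: as patience diff is usually
described, it first matches lines that occur exactly once in both files and
keeps the longest run of those matches that appears in the same order on
both sides. Blank lines repeat, so they never become such anchors. Below is
a minimal Python sketch of that anchor selection only. It is not bzr's or
Git's actual implementation, it leaves out the recursion between anchors,
and the names `unique_common_lines` and `patience_anchors` are made up for
illustration.

```python
from bisect import bisect_left
from collections import Counter


def unique_common_lines(a, b):
    """Pairs (index in a, index in b) for lines occurring exactly once in each."""
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    return [(i, pos_b[line]) for i, line in enumerate(a)
            if count_a[line] == 1 and line in pos_b]


def patience_anchors(a, b):
    """Longest run of unique common lines that keeps the same order in a and b."""
    pairs = unique_common_lines(a, b)   # already ordered by index in a
    tails = []      # tails[k]: index into pairs ending the best run of length k+1
    tail_js = []    # the b-indices of those entries, kept sorted for bisect
    prev = [None] * len(pairs)
    for n, (_, j) in enumerate(pairs):
        k = bisect_left(tail_js, j)
        prev[n] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(n)
            tail_js.append(j)
        else:
            tails[k] = n
            tail_js[k] = j
    # Walk the back-pointers from the end of the longest run.
    run = []
    n = tails[-1] if tails else None
    while n is not None:
        run.append(pairs[n])
        n = prev[n]
    return run[::-1]


# Shortened stand-ins for the two versions of the function (not the real files).
old = ["# make a list of Message-Id we have",
       "fp1 = file(path, 'ab+')",
       "",
       "# get remote mbox again",
       "fp2 = tempfile.TemporaryFile()",
       "",
       "return path"]
new = ["# make a list of Message-Id we have",
       "msgids = set()",
       "",
       "with tempfile.NamedTemporaryFile() as tmpfd:",
       "",
       "return path"]

anchors = patience_anchors(old, new)
print(anchors)  # [(0, 0), (6, 5)]
```

Only the first comment and `return path` are picked as anchors here; the
blank lines at old indices 2 and 5 are skipped because they are not unique.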
I'll note that in this particular case, `git diff` yielded the very same
results with or without --patience. I don't know why that is. Johannes?
I'll also note that /usr/bin/diff produces (in this case) something
closer to patience than to git.
I'm attaching both versions of the file in case they are useful to
anybody.
--
Adeodato Simó dato at net.com.org.es
Debian Developer adeodato at debian.org
I promise you. Once I enter into an exclusive relationship, I sleep with
very few people.
-- Denny Crane
[-- Attachment #2: bdo0 --]
[-- Type: text/plain, Size: 2358 bytes --]
#! /usr/bin/python
## vim: fileencoding=utf-8

"""Open Debian BTS mboxes with Mutt, à la /usr/bin/bts show --mbox.

A cache of mboxes is kept, and changed mboxes will be merged with existing
files instead of replacing them, so that e.g. read-status is preserved for each
message.
"""

import os
import re
import sys
import urllib
import mailbox
import tempfile
import email.Parser

MBOX_DIR = os.path.expanduser('~/.mail/y.bug-cache')

##

def main():
    if len(sys.argv) != 2:
        print >>sys.stderr, 'Usage: %s <bugnumber>' % (sys.argv[0],)
        sys.exit(1)

    bug = re.sub(r'[^0-9]', '', sys.argv[1])

    if not re.match(r'\d{4,}$', bug):
        print >>sys.stderr, \
                'E: %s does not seem a valid number' % (sys.argv[1],)
        sys.exit(1)

    path = mbox_update(bug)
    invoke_mailer(path)

##

def mbox_update(bug):
    """Return a path with an up-to-date copy of the mbox for bug."""
    path = os.path.join(MBOX_DIR, bug + '.mbox')

    if not os.path.exists(path):
        f = file(path, 'wb')
        try:
            retrieve_mbox(bug, f)
        except:
            os.unlink(path)
            raise
        else:
            f.close()
    else:
        # make a list of Message-Id we have
        fp1 = file(path, 'ab+')
        ids1 = [ x.get('Message-Id') for x in mailbox.UnixMailbox(fp1) ]

        # get remote mbox again
        fp2 = tempfile.TemporaryFile()
        retrieve_mbox(bug, fp2)

        # parse its messages
        fp2.seek(0)
        parser = email.Parser.Parser()
        msgs2 = dict((x['Message-Id'], x)
                     for x in mailbox.UnixMailbox(fp2, parser.parse))

        # now append the new ones
        for msgid in set(msgs2.keys()) - set(ids1):
            fp1.write('\n' + msgs2[msgid].as_string(unixfrom=True))

    return path

def retrieve_mbox(bug, fileobj):
    """Retrieve mbox for bug from bugs.debian.org, writing it to fileobj."""
    for line in urllib.urlopen(
            'http://bugs.debian.org/cgi-bin/bugreport.cgi?mboxstatus=yes;mboxmaint=yes;mbox=yes;bug=%s' % (bug,)):
        fileobj.write(line)

def invoke_mailer(path):
    """Exec mutt, opening path."""
    os.execlp('mutt', 'mutt', '-f', path)

##

if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        print >>sys.stderr, '\nCancelled.'
        sys.exit(1)
[-- Attachment #3: bdo1 --]
[-- Type: text/plain, Size: 2501 bytes --]
#! /usr/bin/python3

"""Open Debian BTS mboxes with Mutt, à la /usr/bin/bts show --mbox.

A cache of mboxes is kept, and changed mboxes will be merged with existing
files instead of replacing them, so that e.g. read-status is preserved for each
message.
"""

import os
import re
import sys
import mailbox
import tempfile
import email.parser
import urllib.request

MBOX_DIR = os.path.expanduser('~/.mail/y.bug-cache')

##

def main():
    if len(sys.argv) != 2:
        print('Usage: {0} <bugnumber>'.format(sys.argv[0]), file=sys.stderr)
        return 1
    else:
        bug = re.sub(r'[^0-9]', '', sys.argv[1])

        if not re.search(r'^\d{4,}$', bug):
            print('E: {0} does not seem a valid number'.format(sys.argv[1]),
                  file=sys.stderr)
            return 1

    path = mbox_update(bug)
    invoke_mailer(path)

##

def mbox_update(bug):
    """Return a path with an up-to-date copy of the mbox for bug."""
    path = os.path.join(MBOX_DIR, bug + '.mbox')

    if not os.path.exists(path):
        f = open(path, 'wb')
        try:
            retrieve_mbox(bug, f)
        except:
            os.unlink(path)
            raise
        else:
            f.close()
    else:
        # make a list of Message-Id we have
        msgids = { x.get('Message-Id') for x in mailbox.mbox(path) }

        with tempfile.NamedTemporaryFile() as tmpfd:
            # retrieve the remote mbox again
            retrieve_mbox(bug, tmpfd)

            # parse its messages
            parser = email.parser.Parser()
            new_msgids = { x['Message-Id']: x
                           for x in mailbox.mbox(tmpfd.name, parser.parse) }

        with open(path, 'a+') as fd:
            # now append the new messages
            for msgid in new_msgids.keys() - msgids:
                fd.write('\n' + new_msgids[msgid].as_string(unixfrom=True))

    return path

def retrieve_mbox(bug, fileobj):
    """Retrieve mbox for bug from bugs.debian.org, writing it to fileobj."""
    url = urllib.request.urlopen(
            'http://bugs.debian.org/cgi-bin/bugreport.cgi?'
            'mboxstatus=yes;mboxmaint=yes;mbox=yes;bug={0}'.format(bug))
    for line in url.fp: # http://bugs.python.org/issue4608
        fileobj.write(line)

def invoke_mailer(path):
    """Exec mutt, opening path."""
    os.execlp('mutt', 'mutt', '-f', path)

##

if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        print('\nCancelled.', file=sys.stderr)
        sys.exit(1)