git.vger.kernel.org archive mirror
From: "H. Peter Anvin" <hpa@zytor.com>
To: Carl Worth <cworth@cworth.org>
Cc: git <git@vger.kernel.org>, Junio C Hamano <junkio@cox.net>
Subject: Re: Make "git am" properly unescape lines matching ">>*From "
Date: Tue, 08 Jun 2010 13:50:08 -0700
Message-ID: <4C0EAD00.8000706@zytor.com>
In-Reply-To: <87hbldjo0s.fsf@yoom.home.cworth.org>

On 06/08/2010 12:57 PM, Carl Worth wrote:
> I'm adding support to notmuch[1] to more easily pipe a thread full of
> patches to "git am". So I added support for notmuch to format a thread
> (or any search) as an mbox.
> 
> When I did that, I was careful to escape lines from the bodies of email
> messages that begin with zero or more '>' characters followed
> immediately by "From " (From_ lines) by adding an initial '>'. [2]
> 
> But I noticed that "git am" wasn't removing any of these added '>'
> characters, so I was getting corrupted commit messages.
> 
> I'll follow up this message with a patch that fixes that by making
> git-mailsplit un-escape these lines. It's careful to do this only when
> processing an actual mbox, using the existing detection of a bare email
> message and not doing any un-escaping in that case.
> 
> I'll also follow up with a new test for both cases, (using "git am" with
> both an mbox with escaped From_ lines and an email message without
> escaped From_ lines).
> 

The problem with that is that it is not universally applied.  From what
I've seen, some mbox-based programs simply rely on there being a
Content-Length: header and don't need From lines to be escaped at all
(and don't do anything useful if they are); others do the leading > trick
(usually not reversibly at all).
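
To make the reversibility point concrete, here is a minimal Python sketch
of the quoting scheme described in the quoted message: a writer adds one
'>' to every body line matching ^>*From , and a reader strips exactly one
'>' from every line matching ^>+From . The function names are hypothetical
and this is not the git-mailsplit code; it only illustrates the rule.

```python
import re

# A body line that needs escaping: zero or more '>' followed by "From ".
_NEEDS_ESCAPE = re.compile(r"^>*From ")
# An escaped line: at least one '>' followed by "From "; group 1 is what remains
# after one '>' is removed.
_ESCAPED = re.compile(r"^>(>*From )")


def escape_line(line: str) -> str:
    """Writer side: prepend one '>' if the line matches ^>*From ."""
    return ">" + line if _NEEDS_ESCAPE.match(line) else line


def unescape_line(line: str) -> str:
    """Reader side: strip exactly one leading '>' if the line matches ^>+From ."""
    return _ESCAPED.sub(r"\1", line)


if __name__ == "__main__":
    for original in ["From the top", ">From a quote", ">>From deeper", "Fromage"]:
        stored = escape_line(original)
        assert unescape_line(stored) == original
        print(repr(original), "->", repr(stored))
```

A writer that escapes only lines beginning with "From " (and leaves an
existing ">From " alone) cannot be undone this way: after writing, an
original ">From " line and an escaped "From " line are byte-identical.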

As far as I can tell, the Content-Length: is the most reliably handled
format and probably is what we should use.  This is the "mboxcl2" format
in your list.[*]  Unfortunately "mboxcl2" and "mboxrd" cannot be
distinguished from each other by inspection, which is a major defect of
both formats.
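
As a rough illustration of the Content-Length approach, the following
Python sketch splits an mbox by trusting each message's Content-Length:
header instead of scanning the body for From_ lines. It assumes that every
message starts with a From_ line, that Content-Length counts the body bytes
after the blank line that ends the headers, and that messages are separated
by blank lines. split_mboxcl2 is a hypothetical name, not an existing tool.

```python
def split_mboxcl2(data: bytes) -> list[bytes]:
    """Split an mbox into messages using Content-Length, not From_ scanning."""
    messages = []
    pos = 0
    while pos < len(data):
        if not data.startswith(b"From ", pos):
            raise ValueError(f"expected From_ line at offset {pos}")
        eol = data.index(b"\n", pos)          # end of the From_ line
        start = eol + 1                       # first header byte
        hdr_end = data.index(b"\n\n", eol)    # blank line ending the headers
        length = None
        for line in data[start:hdr_end + 1].split(b"\n"):
            name, sep, value = line.partition(b":")
            if sep and name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("message has no Content-Length header")
        body_start = hdr_end + 2
        body_end = body_start + length
        messages.append(data[start:body_end])
        pos = body_end
        # Skip the blank separator line(s) before the next From_ line.
        while data[pos:pos + 1] == b"\n":
            pos += 1
    return messages
```

A body line that begins with "From " is never examined here, so it needs
no escaping; the cost is that a wrong Content-Length value derails the
parse of everything after it.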

The statement that "the entire "mbox" family of mailbox formats is
gradually becoming irrelevant, and of only historical interest" is also
pretty silly -- mbox is still the preferred format for moving groups of
email from MUA to MUA, even if it is no longer used for active live
spool storage.  But, of course, you knew that already.

	-hpa

[*] There are apparently some MTA/MUAs which simply bypass the entire
problem by base64-encoding any email that contains /^From /, just as if
it contained NUL bytes.  It's a heavyweight but thoroughly unambiguous
way of dealing with the problem.
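
The base64 workaround in the footnote can be sketched with Python's
standard email package. This is only an illustration of the idea, not how
any particular MTA or MUA does it; build_message is a hypothetical helper.

```python
import re
from email.message import EmailMessage

# Any body line that starts with "From " would collide with the mbox separator.
FROM_LINE = re.compile(r"^From ", re.MULTILINE)


def build_message(subject: str, body: str) -> EmailMessage:
    """Build a text message, base64-encoding the body if it contains /^From /."""
    msg = EmailMessage()
    msg["Subject"] = subject
    if FROM_LINE.search(body):
        # The base64 alphabet has no spaces, so no encoded line can start with "From ".
        msg.set_content(body, cte="base64")
    else:
        msg.set_content(body)
    return msg
```

The receiving side needs no mbox-specific unescaping: ordinary MIME
decoding restores the original body.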

Thread overview: 8+ messages
     [not found] <87hbldjo0s.fsf@yoom.home.cworth.org>
2010-06-08 20:02 ` [PATCH 1/2] mailsplit: Remove any '>' characters used to escape From_ lines in mbox Carl Worth
2010-06-08 20:02   ` [PATCH 2/2] Add test from From_-line escaping Carl Worth
2010-06-08 20:47 ` Make "git am" properly unescape lines matching ">>*From " Carl Worth
2010-06-08 20:54   ` H. Peter Anvin
2010-06-08 21:30     ` Carl Worth
2010-06-08 20:50 ` H. Peter Anvin [this message]
2010-06-08 21:52   ` Carl Worth
2010-06-08 22:10     ` H. Peter Anvin
