* [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
@ 2010-06-09 1:01 Carl Worth
2010-06-09 3:50 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Carl Worth @ 2010-06-09 1:01 UTC (permalink / raw)
To: git, Junio C Hamano, H. Peter Anvin; +Cc: Carl Worth
This uses the mboxrd style of quoting as documented here:
http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/mail-mbox-formats.html
This matches the mboxrd-style un-escaping recently added to "git am".
Test 4152 is now extended to verify that format-patch does the proper
escaping. It also now relies directly on the escaped output from
format-patch to verify that "git am" does the proper unescaping,
(where previously, it faked the escaped output with sed).
Signed-off-by: Carl Worth <cworth@cworth.org>
---
This patch is on top of the three patches I sent earlier. With this patch,
the series to make git use mbox in a robust fashion (and to avoid using mbox
where possible) is complete.
The entire test suite passes, and new tests are added for all new
functionality.
All of the features and caveats I mentioned earlier are taken care of. The
only potentially missing piece is that git-send-email doesn't have code
to un-escape From_ lines in an mbox. But this is irrelevant since send-
email doesn't even know how to handle an mbox anyway, (it will treat it
as one large email message instead).
Without this patch series, there's no documented way that an external
tool can use to reliably construct an mbox that will be correctly handled
by "git am". The best one could do is to peek inside the git implementation
and notice that it wants unescaped "From " lines, that it will ignore any
"From " line that doesn't end with something very much like asctime format,
and then somehow ensure that no messages in the mbox have lines that begin
with "From " and end with something like asctime format, (which won't be
possible in all cases without corrupting the message).
With this patch series, one can instead document that "git am" accepts an
mbox in "mboxrd" format as documented at the URL above, but with the caveat
that no additional characters are allowed after the asctime portion of the
"From " line. This requirement allows git to continue to accept mbox files
created by old versions of git, (with a very minor chance of corruption).
Mbox files create and consumed by versions of git after this patch series
should have no corruption by design.
builtin/log.c | 2 +-
commit.h | 5 ++-
log-tree.c | 1 +
pretty.c | 14 +++++++++++-
....sh => t4152-format-patch-am-From_-escaping.sh} | 21 ++++++++++++-------
5 files changed, 30 insertions(+), 13 deletions(-)
rename t/{t4152-am-From_.sh => t4152-format-patch-am-From_-escaping.sh} (78%)
mode change 100755 => 100644
diff --git a/builtin/log.c b/builtin/log.c
index adbec9f..36b2f5a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -763,7 +763,7 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
encoding);
pp_title_line(CMIT_FMT_EMAIL, &msg, &sb, subject_start, extra_headers,
encoding, need_8bit_cte);
- pp_remainder(CMIT_FMT_EMAIL, &msg, &sb, 0);
+ pp_remainder(CMIT_FMT_EMAIL, &msg, &sb, 0, 0);
printf("%s\n", sb.buf);
strbuf_release(&sb);
diff --git a/commit.h b/commit.h
index 6ef88dc..18e7197 100644
--- a/commit.h
+++ b/commit.h
@@ -70,6 +70,7 @@ struct pretty_print_context
const char *after_subject;
enum date_mode date_mode;
int need_8bit_cte;
+ int need_from_escaping;
int show_notes;
struct reflog_walk_info *reflog_info;
};
@@ -103,8 +104,8 @@ void pp_title_line(enum cmit_fmt fmt,
void pp_remainder(enum cmit_fmt fmt,
const char **msg_p,
struct strbuf *sb,
- int indent);
-
+ int indent,
+ int need_from_escaping);
/** Removes the first commit from a list sorted by date, and adds all
* of its parents.
diff --git a/log-tree.c b/log-tree.c
index 6aab273..1a179dc 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -415,6 +415,7 @@ void show_log(struct rev_info *opt)
ctx.abbrev = opt->diffopt.abbrev;
ctx.after_subject = extra_headers;
ctx.reflog_info = opt->reflog_info;
+ ctx.need_from_escaping = opt->format_mbox;
pretty_print_commit(opt->commit_format, commit, &msgbuf, &ctx);
if (opt->add_signoff)
diff --git a/pretty.c b/pretty.c
index 74cda1b..62b376b 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1011,7 +1011,8 @@ void pp_title_line(enum cmit_fmt fmt,
void pp_remainder(enum cmit_fmt fmt,
const char **msg_p,
struct strbuf *sb,
- int indent)
+ int indent,
+ int need_from_escaping)
{
int first = 1;
for (;;) {
@@ -1030,6 +1031,15 @@ void pp_remainder(enum cmit_fmt fmt,
}
first = 0;
+ if (need_from_escaping && (*line == '>' || *line == 'F'))
+ {
+ const char *s = line;
+ while (*s == '>')
+ s++;
+ if (strncmp (s, "From ", 5) == 0)
+ strbuf_addch(sb, '>');
+ }
+
strbuf_grow(sb, linelen + indent + 20);
if (indent) {
memset(sb->buf + sb->len, ' ', indent);
@@ -1117,7 +1127,7 @@ void pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit,
beginning_of_body = sb->len;
if (fmt != CMIT_FMT_ONELINE)
- pp_remainder(fmt, &msg, sb, indent);
+ pp_remainder(fmt, &msg, sb, indent, context->need_from_escaping);
strbuf_rtrim(sb);
/* Make sure there is an EOLN for the non-oneline case */
diff --git a/t/t4152-am-From_.sh b/t/t4152-format-patch-am-From_-escaping.sh
old mode 100755
new mode 100644
similarity index 78%
rename from t/t4152-am-From_.sh
rename to t/t4152-format-patch-am-From_-escaping.sh
index 02821ee..dc013bb
--- a/t/t4152-am-From_.sh
+++ b/t/t4152-format-patch-am-From_-escaping.sh
@@ -34,13 +34,18 @@ test_expect_success setup '
test_tick &&
git commit -s -F msg &&
git tag second &&
- git format-patch --stdout first | sed -e "1{p;d};s/^\(>*From \)/>\1/" > From_ &&
- {
- echo "X-Fake-Field: Line One" &&
- echo "X-Fake-Field: Line Two" &&
- echo "X-Fake-Field: Line Three" &&
- git format-patch --stdout first | sed -e "1d"
- } > From_.eml
+ git format-patch --stdout first > From_ &&
+ git format-patch first
+'
+
+test_expect_success 'format-patch escapes From_ lines in mbox' '
+ head -1 From_ | grep "^From " &&
+ test "$(grep "^From " From_ | wc -l)" = "1"
+'
+
+test_expect_success 'format-patch does not escapes From_ lines in email' '
+ head -1 0001-From_-lines.patch | grep -v "^From " >/dev/null &&
+ test "$(grep "^From " 0001-From_-lines.patch | wc -l)" = "1"
'
test_expect_success 'am unescapes From_ lines from mbox' '
@@ -54,7 +59,7 @@ test_expect_success 'am unescapes From_ lines from mbox' '
test_expect_success 'am does not unescape From_ lines from email' '
git checkout first &&
- git am From_.eml &&
+ git am 0001-From_-lines.patch &&
! test -d .git/rebase-apply &&
test -z "$(git diff second)" &&
test "$(git rev-parse second)" = "$(git rev-parse HEAD)" &&
--
1.7.0.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-09 1:01 [PATCH] format-patch: Properly escape From_ lines when creating an mbox Carl Worth
@ 2010-06-09 3:50 ` Junio C Hamano
2010-06-09 5:14 ` Carl Worth
2010-06-09 5:48 ` H. Peter Anvin
0 siblings, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2010-06-09 3:50 UTC (permalink / raw)
To: Carl Worth; +Cc: git, H. Peter Anvin
Carl Worth <cworth@cworth.org> writes:
> Without this patch series, there's no documented way that an external
> tool can use to reliably construct an mbox that will be correctly handled
> by "git am". The best one could do is to peek inside the git implementation
> and notice that it wants unescaped "From " lines, that it will ignore any
> "From " line that doesn't end with something very much like asctime format,
> and then somehow ensure that no messages in the mbox have lines that begin
> with "From " and end with something like asctime format, (which won't be
> possible in all cases without corrupting the message).
I have this small suspicion that mboxrd may be a suboptimal choice, when
you consider how robustly we can notice a failure (and to a lessor extent,
recover from it) when using output from "format-patch --stdout" to
sneakernet between existing and updated versions of git. Especially
because your implementation quotes lines that begin with "From "
unconditionally (even when the tail end of the line would never be a
valid-looking timestamp). Such an output will confuse existing mailsplit,
but the worst part of the story is that somebody who is applying a series
of patches will _not_ notice the breakage. The payload of the second and
subsequent messages will likely be concatenated as if it were part of the
first message, ignoring cruft between patches, but the resulting tree
would likely to be the same as what the sending end intended.
Compared to that, I think a failure to split a message in the middle (iow,
commit message happened to have a line that begins with "From " and ends
with a timestamp-looking string) is much easier to notice (because the
first part of the message that was incorrectly split at such a line will
not have any patch, so "git am" will stop). IOW, failure to split is
easier to notice than splitting too eagerly.
Perhaps perfect is an enemy of good?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-09 3:50 ` Junio C Hamano
@ 2010-06-09 5:14 ` Carl Worth
2010-06-09 16:56 ` Carl Worth
2010-06-10 14:49 ` Junio C Hamano
2010-06-09 5:48 ` H. Peter Anvin
1 sibling, 2 replies; 11+ messages in thread
From: Carl Worth @ 2010-06-09 5:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 2509 bytes --]
On Tue, 08 Jun 2010 20:50:01 -0700, Junio C Hamano <gitster@pobox.com> wrote:
> Carl Worth <cworth@cworth.org> writes:
> Especially because your implementation quotes lines that begin with "From "
> unconditionally (even when the tail end of the line would never be a
> valid-looking timestamp). Such an output will confuse existing mailsplit,
> but the worst part of the story is that somebody who is applying a series
> of patches will _not_ notice the breakage. The payload of the second and
> subsequent messages will likely be concatenated as if it were part of the
> first message, ignoring cruft between patches, but the resulting tree
> would likely to be the same as what the sending end intended.
I agree that anything that results in multiple patches being (silently!)
concatenated would be catastrophic and I do not recommend accepting any
patches that could result in failures like that.
Could you describe in more detail how the implementation could lead to a
case like that? I'm not seeing it myself. But if you can show me, I'll
be happy to attempt a fix.
In particular, I don't see how any of the new quoting will confuse
existing mailsplit. The splitting itself shouldn't be changed. And at
worst, using new "git format-patch" with old mailsplit could result in a
">From " getting into a commit message where a "From " should be.
We could reduce the occurrence of that problem by being less aggressive
with "From " quoting, (for example, examining whether the tail of the
line looks like a timestamp before quoting). The cost there would be
fairly minor. It would increase the occurrence of a failure to pass a
">From " correctly from a new "git am" to a new "git mailsplit". [*]
I don't see a way to eliminate both problems other than specifying that
git's mbox format is a non-standard mbox format that looks specifically
for From_ lines ending in timestamps and is not capable of containing an
arbitrary message, (namely messages with lines that begin with "From "
and end with timestamps).
That would be a particularly unsatisfying solution for me, since I'm
trying to implement an mbox-export option in a mail client as a general
feature (that happens to work with git) rather than implementing a
git-specific export option.
-Carl
[*] It would seem a strange strategy to make new git compatible with old
git while not being perfectly compatible with itself going forward, but
that is a possibility.
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-09 5:14 ` Carl Worth
@ 2010-06-09 16:56 ` Carl Worth
2010-06-10 14:49 ` Junio C Hamano
1 sibling, 0 replies; 11+ messages in thread
From: Carl Worth @ 2010-06-09 16:56 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 3866 bytes --]
On Tue, 08 Jun 2010 22:14:23 -0700, Carl Worth <cworth@cworth.org> wrote:
> On Tue, 08 Jun 2010 20:50:01 -0700, Junio C Hamano <gitster@pobox.com> wrote:
> > Carl Worth <cworth@cworth.org> writes:
> > Especially because your implementation quotes lines that begin with "From "
> > unconditionally (even when the tail end of the line would never be a
> > valid-looking timestamp). Such an output will confuse existing mailsplit,
> > but the worst part of the story is that somebody who is applying a series
> > of patches will _not_ notice the breakage. The payload of the second and
> > subsequent messages will likely be concatenated as if it were part of the
> > first message, ignoring cruft between patches, but the resulting tree
> > would likely to be the same as what the sending end intended.
...
> Could you describe in more detail how the implementation could lead to a
> case like that? I'm not seeing it myself. But if you can show me, I'll
> be happy to attempt a fix.
Oh, perhaps I understand what you were getting at here.
If a commit is created (by whatever means) with a commit message that
has a line of the form:
"From ... <timestamp>"
then with the existing code, there will be a failure if someone does a
format-patch and a git-am of that commit. And that might raise attention
that perhaps something went wrong.
But with my patch series, that commit will transfer through the
format-patch and git-am just fine.
I would contend that preserving this commit is the right (and "robust")
thing to do. For example, looking at the log recent of git.git master I
see 5 commits that have a "From ... <timestamp>" line in the commit
message.
34122b57eca747022336f5a3dc1aa80377d1ce56
48027a918d89bad6735897a2c3da77c0451a038c
19a8721ef8f82153fee93c62bd050659cf718d6d
3dc1383290f9db3371a13ae8009ce4fcd5ffc93a
1dfcfbce2d643b7c7b56dc828f36ced9de2bf9f2
They all look to me like mistakes, some worse than others. But now that
they are part of the history of the project, it would be better and more
robust of git to actually be able to replay these successfully.
Git has various tools for rewriting history, which are useful for
various reasons. But these tools will get tripped up on a commit like
one of the above. For example, taking the most recent commit from above,
"git rebase" is unable to replay it successfully:
$ git checkout -b tmp 34122b57eca747022336f5a3dc1aa80377d1ce56
Switched to a new branch 'tmp'
$ git rebase --onto HEAD~2 HEAD~1
First, rewinding head to replay your work on top of it...
Patch is empty. Was it split wrong?
After my patch series this rebase works:
$ git checkout -b tmp 34122b57eca747022336f5a3dc1aa80377d1ce56
Switched to a new branch 'tmp'
0:~/src/git:(tmp)$ git rebase --onto HEAD~2 HEAD~1
First, rewinding head to replay your work on top of it...
Applying: gitweb: Always use three argument form of open
That's git being demonstrably more robust. And an operation like that
would make a good test for git's test suite.
Now, it's likely git could also use some help to avoid whatever mistakes
caused these commits to be created in the first place, but that's an
orthogonal issue.
Also, there is one commit that is more particularly broken than any of
the others. Even my patch series is not sufficient to successfully
replay the following commit:
1dfcfbce2d643b7c7b56dc828f36ced9de2bf9f2
That's because in addition to the From_ line in the commit message, this
commit also has an entire additional patch within the commit
message. And git's "patch as email" format has an additional quoting
problem with the "---" delimiter to separate the commit message from the
patch. And again, that's orthogonal from the mbox quoting I'm currently
trying to solve.
-Carl
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-09 5:14 ` Carl Worth
2010-06-09 16:56 ` Carl Worth
@ 2010-06-10 14:49 ` Junio C Hamano
2010-06-10 15:31 ` Carl Worth
1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2010-06-10 14:49 UTC (permalink / raw)
To: Carl Worth; +Cc: git, H. Peter Anvin
Carl Worth <cworth@cworth.org> writes:
> On Tue, 08 Jun 2010 20:50:01 -0700, Junio C Hamano <gitster@pobox.com> wrote:
>> Carl Worth <cworth@cworth.org> writes:
>> Especially because your implementation quotes lines that begin with "From "
>> unconditionally (even when the tail end of the line would never be a
>> valid-looking timestamp). Such an output will confuse existing mailsplit,
>> but the worst part of the story is that somebody who is applying a series
>> of patches will _not_ notice the breakage. The payload of the second and
>> subsequent messages will likely be concatenated as if it were part of the
>> first message, ignoring cruft between patches, but the resulting tree
>> would likely to be the same as what the sending end intended.
Please disregard the above; I wasn't thinking straight.
If your format-patch quotes ">*From " in the log message, and you unquote
it somewhere in mailsplit to mailinfo pipeline, then the only time any
funny interaction between the current git and your git would happen is
when your git formats a commit with a line in its log that begins with
"From " that cannot be a mistaken as a UNIX-From line and you use "am"
from the current git on the output; the resulting commit would get an
extra ">" left in the message, but that is a small price to pay. There is
no other downside I can see (and the upside is that the output from your
format-patch won't be split incorrectly, of course).
I find that the change to format-patch not to emit the UNIX-From line when
generating one file per commit is somewhat iffy. An upside is that the
existing mailsplit-mailinfo pipeline already knows not to split such an
input, so the change makes
git format-patch <revspec> |
while read path
do
git am $path || break
done
(which is essentially what an old rebase did) do the right thing, even in
the presense of a confusing "From " in the log message.
That change however is not without downsides. You may potentially be
breaking people's existing scripts in various ways. They may be relying
on the presense of the line by:
- using it to pick up the original commit object name from, just like
"rebase" does;
- using it as the "magic" number to protect them from being fed a bad
input;
- stripping the first line unconditionally, assuming it is that UNIX-From
line they shouldn't cut and paste into the MUA.
It is nice at the conceptual level, though. By declaring individual file
RFC2822 message (not mbox), it makes it very clear that it is MUA's
responsibility to quote "From " in the payload when the output is used by
MUA to compose and send a message. IOW, we shouldn't be doing the quote
in our output when operating in that mode.
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-10 14:49 ` Junio C Hamano
@ 2010-06-10 15:31 ` Carl Worth
2010-06-10 15:52 ` Carl Worth
0 siblings, 1 reply; 11+ messages in thread
From: Carl Worth @ 2010-06-10 15:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 2995 bytes --]
On Thu, 10 Jun 2010 07:49:34 -0700, Junio C Hamano <gitster@pobox.com> wrote:
> Carl Worth <cworth@cworth.org> writes:
> Please disregard the above; I wasn't thinking straight.
No worries.
> If your format-patch quotes ">*From " in the log message, and you unquote
> it somewhere in mailsplit to mailinfo pipeline, then the only time any
> funny interaction between the current git and your git would happen is
> when your git formats a commit with a line in its log that begins with
> "From " that cannot be a mistaken as a UNIX-From line and you use "am"
> from the current git on the output; the resulting commit would get an
> extra ">" left in the message, but that is a small price to pay.
Correct. That's the only downside I see. And it's clearly not a huge
price. Checking the entire git.git history, I found only 12 messages
that have a line beginning with "From " in the commit message, (ignoring
the 5 I referred to earlier where the mbox line actually ended up in the
commit message). Of these, three already have a ">From" in them.
So clearly this kind of thing is happening already. People are sometimes
using systems that quote these lines and git isn't un-quoting. But it's
really not that objectionable in the end.
> There is
> no other downside I can see (and the upside is that the output from your
> format-patch won't be split incorrectly, of course).
Good. And this series does fix actual bugs like the failure to rebase
across a message with a "From ... <timestamp>" line as mentioned before.
> I find that the change to format-patch not to emit the UNIX-From line when
> generating one file per commit is somewhat iffy. An upside is that the
> existing mailsplit-mailinfo pipeline already knows not to split such an
> input, so the change makes
This patch was written to reduce the likelihood of the downside
mentioned above. But I understand the potential costs you outlined.
I'll let you make the call on whether to include it. The rest of the
patch series will work fine without this one. But leaving it out will
lead to more occurrences of undesired ">From " in commit messages, (even
with new "git am" as send-email or some other MUA will see the quoted
">From " and treat it as the intended payload, not an escaped "From ").
> It is nice at the conceptual level, though. By declaring individual file
> RFC2822 message (not mbox), it makes it very clear that it is MUA's
> responsibility to quote "From " in the payload when the output is used by
> MUA to compose and send a message. IOW, we shouldn't be doing the quote
> in our output when operating in that mode.
I suppose we could maintain compatibility with any scripts, etc. by
still emitting the initial "From " line, but declaring these files as
messages (not mbox) and avoiding doing any quoting for them.
I think that gets us all the upsides with no downsides. I'll send one
last patch for that.
-Carl
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-10 15:31 ` Carl Worth
@ 2010-06-10 15:52 ` Carl Worth
2010-06-10 16:12 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Carl Worth @ 2010-06-10 15:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 1469 bytes --]
On Thu, 10 Jun 2010 08:31:05 -0700, Carl Worth <cworth@cworth.org> wrote:
> I suppose we could maintain compatibility with any scripts, etc. by
> still emitting the initial "From " line, but declaring these files as
> messages (not mbox) and avoiding doing any quoting for them.
>
> I think that gets us all the upsides with no downsides. I'll send one
> last patch for that.
Thinking about implementing and testing this, I realized that a file
that looks like an mbox but isn't an mbox will confuse "git am"
slightly. It will think that it should unquote any ">From " lines, but
that would end up being the technically wrong thing to do since the
lines aren't quoted.
I'm not sure what to do here that would cause the least undesirable
breakage. Ignore this problem? Emit a line that still contains anything
that scripts might be looking for but that "git am" could key off of as
"not actually an mbox"?
I suppose we could put a magic timestamp there, but that feels pretty
creepy and fragile.
Another option would be to just emit RFC2822 messages unless the user
passes an explicit option to format-patch (such as --mbox, which would
be implied by --stdout). Then git would generate legitimate (unqoted)
messages and legitimate (quoted) mbox files.
I'd leave it to you to decide whether the --mbox option should be on by
default or phased in with a warning or whatever.
What do you think?
-Carl
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-10 15:52 ` Carl Worth
@ 2010-06-10 16:12 ` Junio C Hamano
2010-06-10 16:30 ` Carl Worth
0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2010-06-10 16:12 UTC (permalink / raw)
To: Carl Worth; +Cc: git, H. Peter Anvin
Carl Worth <cworth@cworth.org> writes:
> Another option would be to just emit RFC2822 messages unless the user
> passes an explicit option to format-patch (such as --mbox, which would
> be implied by --stdout). Then git would generate legitimate (unqoted)
> messages and legitimate (quoted) mbox files.
>
> I'd leave it to you to decide whether the --mbox option should be on by
> default or phased in with a warning or whatever.
>
> What do you think?
It sounds like that one good way to transition is to phase in --mbox as an
optional feature and at a revision bump later make that the default (which
would mean that we might need a --not-a-mbox option). I haven't thought
things through though...
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-10 16:12 ` Junio C Hamano
@ 2010-06-10 16:30 ` Carl Worth
0 siblings, 0 replies; 11+ messages in thread
From: Carl Worth @ 2010-06-10 16:30 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 863 bytes --]
On Thu, 10 Jun 2010 09:12:27 -0700, Junio C Hamano <gitster@pobox.com> wrote:
> It sounds like that one good way to transition is to phase in --mbox as an
> optional feature and at a revision bump later make that the default (which
> would mean that we might need a --not-a-mbox option). I haven't thought
> things through though...
OK. I'll put together a new, complete patch series implementing my best
attempt at all of this.
I'll make each commit address an actual bug in git as much as
possible.
For example, I suspect that "git am" is still passing a bare email to
mailsplit which can cause broken splitting for a message with a "From
... timestamp" line it. My current series doesn't test that, (nor try to
fix it), but that's all in the same family of bugs here.
Thanks for all the feedback,
-Carl
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-09 3:50 ` Junio C Hamano
2010-06-09 5:14 ` Carl Worth
@ 2010-06-09 5:48 ` H. Peter Anvin
2010-06-09 7:04 ` Carl Worth
1 sibling, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2010-06-09 5:48 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Carl Worth, git
On 06/08/2010 08:50 PM, Junio C Hamano wrote:
> Carl Worth <cworth@cworth.org> writes:
>
>> Without this patch series, there's no documented way that an external
>> tool can use to reliably construct an mbox that will be correctly handled
>> by "git am". The best one could do is to peek inside the git implementation
>> and notice that it wants unescaped "From " lines, that it will ignore any
>> "From " line that doesn't end with something very much like asctime format,
>> and then somehow ensure that no messages in the mbox have lines that begin
>> with "From " and end with something like asctime format, (which won't be
>> possible in all cases without corrupting the message).
>
> I have this small suspicion that mboxrd may be a suboptimal choice, when
> you consider how robustly we can notice a failure (and to a lessor extent,
> recover from it) when using output from "format-patch --stdout" to
> sneakernet between existing and updated versions of git. Especially
> because your implementation quotes lines that begin with "From "
> unconditionally (even when the tail end of the line would never be a
> valid-looking timestamp). Such an output will confuse existing mailsplit,
> but the worst part of the story is that somebody who is applying a series
> of patches will _not_ notice the breakage. The payload of the second and
> subsequent messages will likely be concatenated as if it were part of the
> first message, ignoring cruft between patches, but the resulting tree
> would likely to be the same as what the sending end intended.
>
> Compared to that, I think a failure to split a message in the middle (iow,
> commit message happened to have a line that begins with "From " and ends
> with a timestamp-looking string) is much easier to notice (because the
> first part of the message that was incorrectly split at such a line will
> not have any patch, so "git am" will stop). IOW, failure to split is
> easier to notice than splitting too eagerly.
>
> Perhaps perfect is an enemy of good?
For production perhaps we should do the MIME-escape thing?
For consumption, it's not so clear...
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] format-patch: Properly escape From_ lines when creating an mbox.
2010-06-09 5:48 ` H. Peter Anvin
@ 2010-06-09 7:04 ` Carl Worth
0 siblings, 0 replies; 11+ messages in thread
From: Carl Worth @ 2010-06-09 7:04 UTC (permalink / raw)
To: H. Peter Anvin, Junio C Hamano; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 1570 bytes --]
On Tue, 08 Jun 2010 22:48:44 -0700, "H. Peter Anvin" <hpa@zytor.com> wrote:
> > Perhaps perfect is an enemy of good?
>
> For production perhaps we should do the MIME-escape thing?
>
> For consumption, it's not so clear...
I suggest as a first step accepting the following:
format-patch: Emit bare email rather than mbox for single messages.
<id:1276040615-26008-1-git-send-email-cworth@cworth.org>
That patch should be entirely uncontroversial since it doesn't introduce
any new escaping, neither on the production nor on the consumption side.
It has the tremendous benefit of removing the mbox format entirely from
the "git send-email" workflow, (which will just use bare messages
instead).
With that patch in place, the only place that git will still generate
mbox files is "format-patch --stdout". And the most common use of that
is within git-rebase. For git-rebase, it doesn't matter what kind of
mbox is used as long as it's consistent, since it's practically
guaranteed that git-rebase will be using consistent versions of both
"git format-patch" and "git am".
At that point, I think discussion of confusion from new format-patch and
old am becomes almost meaningless as such interaction will most likely
be happening through bare messages rather than mbox files. When an mbox
file *is* involved I think it will be even more likely to happen through
some external program, (such as an MUA collecting a thread of
git-send-email messages and presenting them to "git am" as an mbox).
-Carl
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-06-10 16:30 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-09 1:01 [PATCH] format-patch: Properly escape From_ lines when creating an mbox Carl Worth
2010-06-09 3:50 ` Junio C Hamano
2010-06-09 5:14 ` Carl Worth
2010-06-09 16:56 ` Carl Worth
2010-06-10 14:49 ` Junio C Hamano
2010-06-10 15:31 ` Carl Worth
2010-06-10 15:52 ` Carl Worth
2010-06-10 16:12 ` Junio C Hamano
2010-06-10 16:30 ` Carl Worth
2010-06-09 5:48 ` H. Peter Anvin
2010-06-09 7:04 ` Carl Worth
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).