git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Johannes Sixt <j6t@kdbg.org>, Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: What's cooking in git.git (Jan 2017, #02; Sun, 15)
Date: Mon, 16 Jan 2017 11:04:56 -0500	[thread overview]
Message-ID: <20170116160456.ltbb7ofe47xos7xo@sigill.intra.peff.net> (raw)
In-Reply-To: <alpine.DEB.2.20.1701161251100.3469@virtualbox>

On Mon, Jan 16, 2017 at 01:52:39PM +0100, Johannes Schindelin wrote:

> > >  An error message with an ASCII control character like '\r' in it
> > >  can alter the message to hide its early part, which is problematic
> > >  when a remote side gives such an error message that the local side
> > >  will relay with a "remote: " prefix.
> > >
> > >  Will merge to 'next'.
> > 
> > Please be not too hasty with advancing this topic to master. I could imagine
> > that there is some unwanted fallout on systems where the end-of-line marker
> > CRLF is common. Though, I haven't tested the topic myself, yet, nor do I
> > expect regressions in *my* typical workflow.
> 
> Thank you for being so attentive.
> 
> This topic branch would indeed have caused problems. Worse: it would have
> caused problems that are not covered by our test suite, as Git for
> Windows' own utilities do not generate CR/LF line endings. So this
> regression would have bit our users. Nasty.

Hmm.  I am not sure to what degree CRLFs are actually a problem here.
Keep in mind these are error messages generated via error(), and so not
processing arbitrary data. I can imagine that CRs might come from:

  1. A parameter like a filename or revision name. If I ask for a
     rev-parse of "foo\r\n", then it's probably useful to mention the
     "\r" in the error message, rather than ignoring it (or converting
     it to a line-feed).

     And I think that would apply to any input parameter we show via
     error(), etc, if it is connected to a newline (ideally we would
     show newlines as "?", too, but we cannot tell the difference
     between ones from parameters, and ones that are part of the error
     message).

  2. The printf-fmt strings themselves. But these come from C code,
     which just uses "\n". My impression is that it is fprintf() which
     is responsible for converting that to "\r\n". And we are doing our
     sanitizing here between an snprintf(), and an fprintf() of the
     result. So it should see only the raw "\n" fields.

I am certainly open to loosening the sanitizing for CR to make things
work seamlessly on Windows. But I am having trouble imagining a case
that is actually negatively impacted.

> -- snipsnap --
> Subject: [PATCH] fixup! vreport: sanitize ASCII control chars

Given the subtlety here, I'd much rather have a patch on top.

> The original patch is incorrect, as it turns carriage returns into
> question marks.
> 
> However, carriage returns should be left alone when preceding a line feed,
> and simply turned into line feeds otherwise.

The question of whether to leave CRLFs alone is addressed above. But I
do not understand why you'd want a lone CR to be converted to a line
feed. Running:

  git rev-parse --abbrev-ref "$(printf "foo\r")"

with my patch gets you:

  fatal: ambiguous argument 'foo?': unknown revision or path not in the working tree.

But with yours:

  fatal: ambiguous argument 'foo
  ': unknown revision or path not in the working tree.

Obviously the "?" thing is a lossy transformation, but I do not see how
a newline is any less confusing (but see below for some thoughts).

> To make the end result more readable, the logic is inverted so that the
> common case (no substitution) is handled first.
> 
> While at it, let's lose the unnecessary curly braces.

Please don't. Obviously C treats the "if/else" as a single unit, but
IMHO it's less error-prone to include the braces any time there are
multiple visual lines. E.g., something like:

  while (foo)
	if (bar)
		one();
	else
		two();
	three();

is much easier to spot as wrong when you would require braces either
way (and not relevant here, but I'd say that even an inner block with a
comment deserves braces for the same reason).

> We also add a regression test that verifies that carriage returns are
> handled properly. And as we expect CR/LF to be handled properly also on
> platforms other than Windows, this test case is not guarded by the MINGW
> prerequisite.

I am not sure what "properly" means here. In your test:

> +# This test verifies that the error reporting functions handle CR correctly.
> +# --abbrev-ref is simply used out of convenience, as it reports an error including
> +# a user-specified argument.
> +test_expect_success 'error messages treat CR/LF correctly' '
> +	test_must_fail git rev-parse --abbrev-ref "$(printf "CR/LF\\r\\n")" 2>err &&
> +	grep "^fatal: .*CR/LF$" err
> +'

The "\n" is eaten by the shell, and git sees only "CR/LF\r". So we are
not testing the CRLF case in vreportf() at all.

We do end up with "CR/LF\n" in vreportf(), which is presumably converted
by fprintf() to "CR/LF\r\n" on Windows. And so perhaps that is why you
are doing the "convert \r to \n" thing above.

But I still think it's not doing the right thing. Git _didn't_ see CRLF,
it saw CR.

-Peff

  reply	other threads:[~2017-01-16 16:05 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-16  1:51 What's cooking in git.git (Jan 2017, #02; Sun, 15) Junio C Hamano
2017-01-16  6:56 ` Johannes Sixt
2017-01-16  7:48   ` Junio C Hamano
2017-01-16 12:52   ` Johannes Schindelin
2017-01-16 16:04     ` Jeff King [this message]
2017-01-16 17:06       ` Johannes Schindelin
2017-01-16 20:33         ` Johannes Sixt
2017-01-16 21:44           ` Jeff King
2017-01-17 19:17             ` Junio C Hamano
2017-01-18  6:50               ` Johannes Sixt
2017-01-18 19:12                 ` Junio C Hamano
2017-01-19 16:12                   ` Johannes Schindelin
2017-01-16 22:00         ` Jeff King
2017-01-16 22:08           ` [PATCH] CodingGuidelines: clarify multi-line brace style Jeff King
2017-01-17 19:39             ` Junio C Hamano
2017-01-17 20:05               ` Jeff King
2017-01-17 21:41                 ` Junio C Hamano
2017-01-17  1:33           ` What's cooking in git.git (Jan 2017, #02; Sun, 15) Jacob Keller
2017-01-17  7:52             ` Jeff King
2017-01-17 19:20           ` Junio C Hamano
2017-01-17 19:36             ` Jeff King
2017-01-17 20:32               ` Junio C Hamano
2017-01-17 20:51                 ` Jeff King
2017-01-16 11:28 ` Johannes Schindelin
2017-01-17 19:45   ` Junio C Hamano
2017-01-18 13:54     ` Johannes Schindelin
2017-01-16 11:40 ` Johannes Schindelin
2017-01-18  1:05 ` [PATCHv3 4/4] unpack-trees: support super-prefix option Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170116160456.ltbb7ofe47xos7xo@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).