From: Jeff King <peff@peff.net>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH resend] Do not create commits whose message contains NUL
Date: Tue, 13 Dec 2011 12:59:32 -0500 [thread overview]
Message-ID: <20111213175932.GA1663@sigill.intra.peff.net> (raw)
In-Reply-To: <1323777368-19697-1-git-send-email-pclouds@gmail.com>
On Tue, Dec 13, 2011 at 06:56:08PM +0700, Nguyen Thai Ngoc Duy wrote:
> We assume that the commit log messages are uninterpreted sequences of
> non-NUL bytes (see Documentation/i18n.txt). However the assumption
> does not really stand out and it's quite easy to set an editor to save
> in a NUL-included encoding. Currently we silently cut at the first NUL
> we see.
>
> Make it more obvious that NUL is not welcome by refusing to create
> such commits. Those who deliberately want to create them can still do
> with hash-object.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> This is from UTF-16 in commit message discussion [1] a few months
> ago. I don't want to resurrect the discussion again. However I think
> it's a good idea to stop users from shooting themselves in this case,
> especially at porcelain level.
>
> There were no comments on this patch previously. So, any comments
> this time ? Should I drop it?
I think this is a sane thing to do. Having thought about and
experimented a little with utf-16 in the past few months, I really don't
see how you could be disrupting anybody's workflow. utf-16 messages get
butchered so badly already; we are much better off letting the user know
of the problem as soon as possible.
It looks like we already have a check for is_utf8, and this is not
failing that check. I guess because is_utf8 takes a NUL-terminated
buffer, so it simply sees the truncated result (i.e., depending on
endianness, "foo" in utf16 is something like "f\0o\0o\0", so we check
only "f"). We could make is_utf8 take a length parameter to be more
accurate, and then it would catch this.
However, I think that's not quite what we want. We only check is_utf8 if
the encoding field is not set. And really, we want to reject NULs no
matter _which_ encoding they've set, because git simply doesn't handle
them properly.
> diff --git a/commit.c b/commit.c
> index d67b8c7..0775eec 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -855,6 +855,9 @@ int commit_tree(const char *msg, size_t msg_len, unsigned char *tree,
Hmm. My version of commit_tree does not have a "msg_len" parameter, nor
do I have d67b8c7. Is there some refactoring patch this is based on that
I missed?
> + if (strlen(msg) < msg_len)
> + die(_("cannot commit with NUL in commit message"));
> +
Two nits:
1. For some reason, checking strlen(msg) seems a subtle way of looking
for NULs in a buffer. I would have found:
if (memchr(msg, '\0', msglen))
much more obvious. But perhaps it is just me. Certainly not a big
deal either way.
2. The error message could be a little friendlier. The likely reason
for NULs is a bogus encoding setting in the user's editor. We
already have a nice "your message isn't utf-8" message. Though it
does suggest setting i18n.commitencoding, which probably _isn't_
the solution here (since their encoding clearly isn't supported).
But maybe it would be nicer to say something like:
error: your commit message contains NUL characters.
hint: This is often caused by using multibyte encodings such as
hint: UTF-16. Please check your editor settings.
We could even go further and detect some common NUL-containing
encodings, but I don't think it's worth the effort.
> diff --git a/t/t3900/UTF-16.txt b/t/t3900/UTF-16.txt
> new file mode 100644
> index 0000000000000000000000000000000000000000..53296be684253f40964c0604be7fa7ff12e200cb
> GIT binary patch
> literal 32
> mcmezOpWz6@X@-jo=NYasZ~@^#h9rjP3@HpR7}6Nh8Mpw;r3yp<
I was disappointed not to find a secret message. :)
-Peff
next prev parent reply other threads:[~2011-12-13 17:59 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-13 11:56 [PATCH resend] Do not create commits whose message contains NUL Nguyễn Thái Ngọc Duy
2011-12-13 17:59 ` Jeff King [this message]
2011-12-14 5:23 ` Miles Bader
2011-12-14 7:17 ` Jeff King
2012-01-01 16:27 ` Drew Northup
2012-01-03 20:03 ` Jeff King
2011-12-14 14:08 ` [PATCH 0/3] git-commit rejects messages with NULs Nguyễn Thái Ngọc Duy
2011-12-14 14:08 ` [PATCH 1/3] Make commit_tree() take message length in addition to the commit message Nguyễn Thái Ngọc Duy
2011-12-14 18:12 ` Junio C Hamano
2011-12-15 13:47 ` [PATCH v2 1/3] merge: abort if fails to commit Nguyễn Thái Ngọc Duy
2011-12-15 13:47 ` [PATCH v2 2/3] Convert commit_tree() to take strbuf as message Nguyễn Thái Ngọc Duy
2011-12-15 13:47 ` [PATCH v2 3/3] commit: refuse commit messages that contain NULs Nguyễn Thái Ngọc Duy
2011-12-15 18:23 ` [PATCH v2 1/3] merge: abort if fails to commit Junio C Hamano
2011-12-14 14:08 ` [PATCH 2/3] " Nguyễn Thái Ngọc Duy
2011-12-14 18:13 ` Junio C Hamano
2011-12-14 14:08 ` [PATCH 3/3] Do not create commits whose message contains NUL Nguyễn Thái Ngọc Duy
2011-12-14 18:19 ` Junio C Hamano
2011-12-14 18:29 ` Jeff King
2011-12-15 18:46 ` Junio C Hamano
2011-12-15 19:35 ` Junio C Hamano
2011-12-15 1:04 ` Miles Bader
2011-12-15 1:18 ` Jeff King
2011-12-15 3:09 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111213175932.GA1663@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).