From: "Martin Ågren" <martin.agren@gmail.com>
To: Danh Doan <congdanhqx@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH] mailinfo.c::convert_to_utf8: reuse strlen info
Date: Sun, 19 Apr 2020 06:34:10 +0200 [thread overview]
Message-ID: <CAN0heSoj9fwpx5WbME+Oew-7NLfBfQCfAKissix+wnFk0UW5uw@mail.gmail.com> (raw)
In-Reply-To: <20200419024805.GC9169@danh.dev>
On Sun, 19 Apr 2020 at 04:48, Danh Doan <congdanhqx@gmail.com> wrote:
>
> On 2020-04-18 16:12:06-0700, Junio C Hamano <gitster@pobox.com> wrote:
> > Martin Ågren <martin.agren@gmail.com> writes:
> >
> > > This is equivalent as long as `line->len` is equal to
> > > `strlen(line->buf)`, which it will be (should be) because it's a
> > > strbuf. Ok.
> >
> > For the guarantee to hold true, line->buf[0..line->len] should not
> > have any '\0' byte in it.
Yes. I sort of assumed it shouldn't ("because strbuf"), but it's a good
question. Especially considering this is about various encodings..
This assumption that "strlen is length so the conversion is a no-op"
could potentially be broken both for input (line->len) and output
(out_len).
> > This helper has two callers, but in either case, it needs to be
> > prepared to work on output of decode_[bq]_segment(). Is there code
> > anywhere that guarntees that the decoding won't stuff '\0' in the
> > line?
>
> the current code allows NUL character in utf-8 [bq]-encoded string
> in this function (early return) and its caller,
> and report an error later:
>
> error: a NUL byte in commit log message not allowed.
>
> meanwhile, if the email was sent in other encoding, the current code
> discards everything after NUL in that line,
> thus silently accept broken commit message.
My knee-jerk reaction to Junio's question was along the same line:
surely if we could have a NUL in there, the current `strlen()` would use
it as an excuse to silently truncate the string, either before
processing or afterwards. Thanks for looking into that more.
> Attached is the faulty patch in ISO-8859-1, which was used to
> demonstrate my words.
> The current code will make a commit with only "Áb" in the body,
> while the new code rightly claims we have a faulty email.
Good find.
Martin
next prev parent reply other threads:[~2020-04-19 4:34 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-18 3:54 [PATCH] mailinfo.c::convert_to_utf8: reuse strlen info Đoàn Trần Công Danh
2020-04-18 19:56 ` Martin Ågren
2020-04-18 20:18 ` [PATCH 0/6] strbuf: simplify `strbuf_attach()` usage Martin Ågren
2020-04-18 20:18 ` [PATCH 1/6] am: use `strbuf_attach()` correctly Martin Ågren
2020-04-18 20:18 ` [PATCH 2/6] strbuf_attach: correctly pass in `strlen() + 1` for `alloc` Martin Ågren
2020-04-18 20:18 ` [PATCH 3/6] strbuf: use `strbuf_attach()` correctly Martin Ågren
2020-04-18 20:18 ` [PATCH 4/6] fast-import: avoid awkward use of `strbuf_attach()` Martin Ågren
2020-04-18 20:18 ` [PATCH 5/6] rerere: " Martin Ågren
2020-04-18 20:18 ` [PATCH 6/6] strbuf: simplify `strbuf_attach()` usage Martin Ågren
2020-04-19 4:44 ` [PATCH 0/6] " Martin Ågren
2020-04-19 12:32 ` [PATCH 0/4] strbuf: fix doc for `strbuf_attach()` and avoid it Martin Ågren
2020-04-19 12:32 ` [PATCH 1/4] strbuf: fix doc for `strbuf_attach()` Martin Ågren
2020-04-20 17:30 ` Junio C Hamano
2020-04-21 18:44 ` Martin Ågren
2020-04-19 12:32 ` [PATCH 2/4] strbuf: introduce `strbuf_attachstr_len()` Martin Ågren
2020-04-19 12:32 ` [PATCH 3/4] strbuf: introduce `strbuf_attachstr()` Martin Ågren
2020-04-20 19:39 ` Junio C Hamano
2020-04-21 18:47 ` Martin Ågren
2020-04-19 12:32 ` [PATCH 4/4] strbuf_attach: prefer `strbuf_attachstr_len()` Martin Ågren
2020-04-18 23:12 ` [PATCH] mailinfo.c::convert_to_utf8: reuse strlen info Junio C Hamano
2020-04-19 2:48 ` Danh Doan
2020-04-19 4:34 ` Martin Ågren [this message]
2020-04-19 5:32 ` Junio C Hamano
2020-04-19 11:00 ` [PATCH v2 0/3] mailinfo: disallow and complains about NUL character Đoàn Trần Công Danh
2020-04-19 11:00 ` [PATCH v2 1/3] t4254: merge 2 steps of a single test Đoàn Trần Công Danh
2020-04-19 12:25 ` Martin Ågren
2020-04-19 14:17 ` Danh Doan
2020-04-19 11:00 ` [PATCH v2 2/3] mailinfo.c::convert_to_utf8: reuse strlen info Đoàn Trần Công Danh
2020-04-19 12:29 ` Martin Ågren
2020-04-19 14:16 ` Danh Doan
2020-04-20 19:59 ` Junio C Hamano
2020-04-20 23:53 ` Danh Doan
2020-04-19 11:00 ` [PATCH v2 3/3] mailinfo: disallow NUL character in mail's header Đoàn Trần Công Danh
2020-04-19 12:30 ` Martin Ågren
2020-04-19 14:24 ` Danh Doan
2020-04-20 23:54 ` [PATCH v3 0/3] Disallow NUL character in mailinfo Đoàn Trần Công Danh
2020-04-20 23:54 ` [PATCH v3 1/3] t4254: merge 2 steps of a single test Đoàn Trần Công Danh
2020-04-20 23:54 ` [PATCH v3 2/3] mailinfo.c: avoid strlen on strings that can contains NUL Đoàn Trần Công Danh
2020-04-20 23:54 ` [PATCH v3 3/3] mailinfo: disallow NUL character in mail's header Đoàn Trần Công Danh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAN0heSoj9fwpx5WbME+Oew-7NLfBfQCfAKissix+wnFk0UW5uw@mail.gmail.com \
--to=martin.agren@gmail.com \
--cc=congdanhqx@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).