From: Herbert Xu <herbert@gondor.apana.org.au>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
DASH Mailing List <dash@vger.kernel.org>
Subject: Re: [PATCH] parser: Fix multi-byte output in here-doc with quoted delimiter
Date: Thu, 7 May 2026 15:37:28 +0800
Message-ID: <afxBOHhvXC7VxG3G@gondor.apana.org.au>
In-Reply-To: <afwpyiK9mh23c-JV@gondor.apana.org.au>
On Thu, May 07, 2026 at 01:57:30PM +0800, Herbert Xu wrote:
> On Thu, Apr 02, 2026 at 08:51:18AM +0200, Patrick Steinhardt wrote:
> > When executing our test suite with Dash v0.5.13.2 one can observe
> > several test failures that all have the same symptoms: we have a quoted
> > heredoc that contains multibyte characters, but the final data does not
> > match what we actually wanted to write. One such example is in t0300,
> > where we see diffs like the following:
> >
> > --- expect-stdout 2026-04-01 07:25:45.249919440 +0000
> > +++ stdout 2026-04-01 07:25:45.254919509 +0000
> > @@ -1,5 +1,5 @@
> > protocol=https
> > host=example.com
> > -path=perú.git
> > +path=perú.git
> > username=foo
> > password=bar
>
> Thanks for the report.
>
> This patch should fix the problem. Please let me know if there are
> any more outstanding issues.
Oops, I forgot to cc the mailing list. Sorry for the resend.
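
For anyone who wants to check the symptom from the quoted report outside
of the git test suite, here is a minimal sketch of a reproducer. It is
not taken from t0300; the function name heredoc_bytes is made up for
illustration. It assumes the script is saved as UTF-8 and run in a UTF-8
locale. It compares the bytes produced by a here-document with a quoted
delimiter against the same bytes built from octal escapes, so the
expected value never passes a multi-byte character through the parser.

```shell
#!/bin/sh
# Hypothetical reproducer, not part of the patch or of git's tests.

# Dump the bytes of a here-document whose delimiter is quoted, so the
# body must be passed through literally with no expansion or escaping.
heredoc_bytes() {
	od -An -tx1 <<'EOF'
path=perú.git
EOF
}

# Expected bytes: "ú" (U+00FA) is 0xC3 0xBA in UTF-8, written here as
# octal escapes for printf.
expected=$(printf 'path=per\303\272.git\n' | od -An -tx1 | tr -d ' \n')
actual=$(heredoc_bytes | tr -d ' \n')

if [ "$actual" = "$expected" ]; then
	echo "ok"
else
	echo "mismatch"
	echo "expected: $expected"
	echo "actual:   $actual"
fi
```

A shell that writes the here-document body out as is should print "ok".
The script can be run with the shell under test, for example by passing
the script file to the dash binary being checked.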
---8<---
For a here-document with a quoted delimiter, multi-byte characters
should be written out as is with no escaping. Fix this by checking
for syntax == SQSYNTAX (the only time readtoken1 gets called with
SQSYNTAX is for such a here-document) before calling getmbc in
readtoken1.

Reported-by: Patrick Steinhardt <ps@pks.im>
Fixes: b12f136cc704 ("builtin: Process multi-byte characters in read(1)")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/src/parser.c b/src/parser.c
index bea4148..412e876 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -998,9 +998,13 @@ static char *dollarsq_escape(char *out)
 STATIC int
 readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 {
-	struct synstack synbase = { .syntax = syntax };
+	struct synstack synbase = {
+		.dblquote = syntax == DQSYNTAX,
+		.syntax = syntax,
+	};
 	int chkeofmark = checkkwd & CHKEOFMARK;
 	struct synstack *synstack = &synbase;
+	bool sqheredoc = syntax == SQSYNTAX;
 	struct nodelist *bqlist = NULL;
 	int dollarsq = 0;
 	int c = firstc;
@@ -1009,9 +1013,6 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 	size_t len;
 	char *out;
 
-	if (syntax == DQSYNTAX)
-		synstack->dblquote = 1;
-
 	STARTSTACKSTR(out);
 	loop: {	/* for each line, until end of word */
 #if ATTY
@@ -1035,7 +1036,8 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
out);
fieldsplitting = synstack->syntax == BASESYNTAX &&
!synstack->varnest ? 4 : 0;
- ml = getmbc(c, out, fieldsplitting);
+ ml = getmbc(c, out, fieldsplitting |
+ (sqheredoc ? 2 : 0));
if (ml == 1) {
if (out == stackblock())
return TBLANK;
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt