From: Herbert Xu <herbert@gondor.apana.org.au>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
	DASH Mailing List <dash@vger.kernel.org>
Subject: Re: [PATCH] parser: Fix multi-byte output in here-doc with quoted delimiter
Date: Thu, 7 May 2026 15:37:28 +0800
Message-ID: <afxBOHhvXC7VxG3G@gondor.apana.org.au>
In-Reply-To: <afwpyiK9mh23c-JV@gondor.apana.org.au>

On Thu, May 07, 2026 at 01:57:30PM +0800, Herbert Xu wrote:
> On Thu, Apr 02, 2026 at 08:51:18AM +0200, Patrick Steinhardt wrote:
> > When executing our test suite with Dash v0.5.13.2 one can observe
> > several test failures that all have the same symptoms: we have a quoted
> > heredoc that contains multibyte characters, but the final data does not
> > match what we actually wanted to write. One such example is in t0300,
> > where we see the diffs like the following:
> > 
> >   --- expect-stdout	2026-04-01 07:25:45.249919440 +0000
> >   +++ stdout	2026-04-01 07:25:45.254919509 +0000
> >   @@ -1,5 +1,5 @@
> >    protocol=https
> >    host=example.com
> >   -path=perú.git
> >   +path=perú.git
> >    username=foo
> >    password=bar
> 
> Thanks for the report.
> 
> This patch should fix the problem.  Please let me know if there are
> any more outstanding issues.

Oops, I forgot to cc the mailing list.  Sorry for the resend.

---8<---
For a here-document with a quoted delimiter, multi-byte characters
should be written out as is with no escaping.  Fix this by checking
for syntax == SQSYNTAX (the only time readtoken1 gets called with
SQSYNTAX is for such a here-document) before calling getmbc in
readtoken1.
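
The behaviour described above can be checked with a small reproduction
along these lines.  This is only a sketch: the function name and the
sample text are illustrative, not taken from the Git test suite, and
the script is assumed to be saved as UTF-8.  On a shell without the
bug it should report that the bytes were preserved.

```shell
# Hypothetical reproduction; emit_quoted_heredoc is an illustrative name.
emit_quoted_heredoc() {
	# The quoted delimiter ('EOF') disables all expansion, so the body
	# should reach cat byte for byte, including the two-byte UTF-8 "ú".
	cat <<'EOF'
path=perú.git
EOF
}

# Build the expected bytes explicitly: "ú" is 0xC3 0xBA (octal 303 272).
expected=$(printf 'path=per\303\272.git')
actual=$(emit_quoted_heredoc)

if [ "$actual" = "$expected" ]; then
	echo "ok: multi-byte bytes preserved"
else
	echo "mismatch, got:"
	printf '%s' "$actual" | od -c
fi
```

Running the script under the shell being tested (for example by passing
the file name to the dash binary) and inspecting the od output on a
mismatch shows which extra bytes were emitted.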

Reported-by: Patrick Steinhardt <ps@pks.im>
Fixes: b12f136cc704 ("builtin: Process multi-byte characters in read(1)")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/src/parser.c b/src/parser.c
index bea4148..412e876 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -998,9 +998,13 @@ static char *dollarsq_escape(char *out)
 STATIC int
 readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 {
-	struct synstack synbase = { .syntax = syntax };
+	struct synstack synbase = {
+		.dblquote = syntax == DQSYNTAX,
+		.syntax = syntax,
+	};
 	int chkeofmark = checkkwd & CHKEOFMARK;
 	struct synstack *synstack = &synbase;
+	bool sqheredoc = syntax == SQSYNTAX;
 	struct nodelist *bqlist = NULL;
 	int dollarsq = 0;
 	int c = firstc;
@@ -1009,9 +1013,6 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 	size_t len;
 	char *out;
 
-	if (syntax == DQSYNTAX)
-		synstack->dblquote = 1;
-
 	STARTSTACKSTR(out);
 	loop: {	/* for each line, until end of word */
 #if ATTY
@@ -1035,7 +1036,8 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 				      out);
 			fieldsplitting = synstack->syntax == BASESYNTAX &&
 					 !synstack->varnest ? 4 : 0;
-			ml = getmbc(c, out, fieldsplitting);
+			ml = getmbc(c, out, fieldsplitting |
+					    (sqheredoc ? 2 : 0));
 			if (ml == 1) {
 				if (out == stackblock())
 					return TBLANK;
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
