From: Herbert Xu <herbert@gondor.apana.org.au>
To: DASH Mailing List <dash@vger.kernel.org>
Subject: [PATCH] parser: Fix multi-byte output in here-doc with quoted delimiter
Date: Thu, 7 May 2026 15:45:55 +0800
Message-ID: <afxDM2AmON2ASbfT@gondor.apana.org.au>
In-Reply-To: <afwpyiK9mh23c-JV@gondor.apana.org.au>

On Thu, May 07, 2026 at 01:57:30PM +0800, Herbert Xu wrote:
> On Thu, Apr 02, 2026 at 08:51:18AM +0200, Patrick Steinhardt wrote:
> > When executing our test suite with Dash v0.5.13.2 one can observe
> > several test failures that all have the same symptoms: we have a quoted
> > heredoc that contains multibyte characters, but the final data does not
> > match what we actually wanted to write. One such example is in t0300,
> > where we see the diffs like the following:
> > 
> >   --- expect-stdout	2026-04-01 07:25:45.249919440 +0000
> >   +++ stdout	2026-04-01 07:25:45.254919509 +0000
> >   @@ -1,5 +1,5 @@
> >    protocol=https
> >    host=example.com
> >   -path=perú.git
> >   +path=perú.git
> >    username=foo
> >    password=bar
> 
> Thanks for the report.
> 
> This patch should fix the problem.  Please let me know if there are
> any more outstanding issues.

Resending to the dash mailing list with a fixed Subject line.
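
For anyone who wants to check the symptom from the report locally, a
minimal reproducer could look like the sketch below. It is not taken
from the git test suite; the temporary file names and the "result"
variable are made up for illustration. It writes one line through a
here-document with a quoted delimiter and compares it byte for byte
against the same line built from explicit octal escapes.

```shell
#!/bin/sh
# Reproducer sketch (illustrative, not part of the patch or of t0300).
# A here-document with a quoted delimiter should pass multi-byte
# characters through unchanged.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

# Quoted delimiter ('EOF'): no expansion, the body is taken literally.
cat >"$tmpdir/actual" <<'EOF'
path=perú.git
EOF

# The same line spelled out in bytes: U+00FA is \303\272 in UTF-8.
printf 'path=per\303\272.git\n' >"$tmpdir/expect"

# cmp -s is silent and only reports through its exit status.
if cmp -s "$tmpdir/expect" "$tmpdir/actual"
then
	result=ok
else
	result=mismatch
fi
echo "$result"
```

Saved as UTF-8 and run with a shell that handles quoted here-documents
correctly, this should print "ok". Based on the report above, I would
expect an affected dash to print "mismatch" instead.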

---8<---
For a here-document with a quoted delimiter, multi-byte characters
should be written out as is with no escaping.  Fix this by checking
for syntax == SQSYNTAX (the only time readtoken1 gets called with
SQSYNTAX is for such a here-document) before calling getmbc in
readtoken1.

Reported-by: Patrick Steinhardt <ps@pks.im>
Fixes: b12f136cc704 ("builtin: Process multi-byte characters in read(1)")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/src/parser.c b/src/parser.c
index bea4148..412e876 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -998,9 +998,13 @@ static char *dollarsq_escape(char *out)
 STATIC int
 readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 {
-	struct synstack synbase = { .syntax = syntax };
+	struct synstack synbase = {
+		.dblquote = syntax == DQSYNTAX,
+		.syntax = syntax,
+	};
 	int chkeofmark = checkkwd & CHKEOFMARK;
 	struct synstack *synstack = &synbase;
+	bool sqheredoc = syntax == SQSYNTAX;
 	struct nodelist *bqlist = NULL;
 	int dollarsq = 0;
 	int c = firstc;
@@ -1009,9 +1013,6 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 	size_t len;
 	char *out;
 
-	if (syntax == DQSYNTAX)
-		synstack->dblquote = 1;
-
 	STARTSTACKSTR(out);
 	loop: {	/* for each line, until end of word */
 #if ATTY
@@ -1035,7 +1036,8 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 				      out);
 			fieldsplitting = synstack->syntax == BASESYNTAX &&
 					 !synstack->varnest ? 4 : 0;
-			ml = getmbc(c, out, fieldsplitting);
+			ml = getmbc(c, out, fieldsplitting |
+					    (sqheredoc ? 2 : 0));
 			if (ml == 1) {
 				if (out == stackblock())
 					return TBLANK;
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
