* The read builtin erroneously consumes null bytes @ 2026-06-19 5:04 Kerin Millar 2026-06-20 1:52 ` Alexis 2026-06-20 5:26 ` Herbert Xu 0 siblings, 2 replies; 7+ messages in thread From: Kerin Millar @ 2026-06-19 5:04 UTC (permalink / raw) To: dash; +Cc: Sam James Hi, Consider the following test case. It demonstrates NUL being consumed by the read builtin, rather than by cat(1) as it should have been. $ printf 'a\n\0bc' | dash -c 'read x; cat' | od -An -t x1 -c 62 63 b c This defect may be attributable to https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/src/input.c?h=v0.5.13.4#n411. Incidentally, it results in tests #5510, #5558 and #6020 failing for git v2.54.0. -- Kerin Millar ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: The read builtin erroneously consumes null bytes 2026-06-19 5:04 The read builtin erroneously consumes null bytes Kerin Millar @ 2026-06-20 1:52 ` Alexis 2026-06-20 2:51 ` Harald van Dijk 2026-06-20 5:26 ` Herbert Xu 1 sibling, 1 reply; 7+ messages in thread From: Alexis @ 2026-06-20 1:52 UTC (permalink / raw) To: dash; +Cc: Sam James "Kerin Millar" <kfm@plushkava.net> writes: > Consider the following test case. It demonstrates NUL being > consumed by the read builtin, rather than by cat(1) as it should > have been. > > $ printf 'a\n\0bc' | dash -c 'read x; cat' | od -An -t x1 -c > 62 63 > b c POSIX doesn't require a `read` _builtin_, but for the `read` _utility_, it says: > STDIN > > If the -d delim option is not specified, or if it is specified > and > delim is not the null string, the standard input shall contain > zero or > more bytes (which need not form valid characters) and shall not > contain any null bytes. > > If the -d delim option is specified and delim is the null > string, the > standard input shall contain zero or more bytes (which need not > form > valid characters). -- https://pubs.opengroup.org/onlinepubs/9799919799/utilities/read.html#tag_20_100 dash's `read` builtin doesn't have an `-d delim` option, and it seems to me that in such a context, it's reasonable to regard `-d delim` as "not specified", such that stdin isn't allowed to contain any null bytes. Alexis. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: The read builtin erroneously consumes null bytes 2026-06-20 1:52 ` Alexis @ 2026-06-20 2:51 ` Harald van Dijk 0 siblings, 0 replies; 7+ messages in thread From: Harald van Dijk @ 2026-06-20 2:51 UTC (permalink / raw) To: Alexis, dash; +Cc: Sam James On 6/20/26 02:52, Alexis wrote: > "Kerin Millar" <kfm@plushkava.net> writes: > >> Consider the following test case. It demonstrates NUL being consumed >> by the read builtin, rather than by cat(1) as it should have been. >> >> $ printf 'a\n\0bc' | dash -c 'read x; cat' | od -An -t x1 -c >> 62 63 >> b c > > POSIX doesn't require a `read` _builtin_, It is not possible to implement the read utility as an external utility because the read utility is required to modify shell variables, so effectively it is pretty much required to be a builtin. > but for the `read` _utility_, > it says: > >> STDIN >> >> If the -d delim option is not specified, or if it is specified and >> delim is not the null string, the standard input shall contain zero or >> more bytes (which need not form valid characters) and shall not >> contain any null bytes. The input to 'read' does not contain any null bytes. The input to 'read' is just 'a' followed by a newline and 'read' is supposed to stop reading after that. Cheers, Harald van Dijk ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: The read builtin erroneously consumes null bytes 2026-06-19 5:04 The read builtin erroneously consumes null bytes Kerin Millar 2026-06-20 1:52 ` Alexis @ 2026-06-20 5:26 ` Herbert Xu 2026-06-20 5:50 ` Kerin Millar 1 sibling, 1 reply; 7+ messages in thread From: Herbert Xu @ 2026-06-20 5:26 UTC (permalink / raw) To: Kerin Millar; +Cc: dash, sam Kerin Millar <kfm@plushkava.net> wrote: > Hi, > > Consider the following test case. It demonstrates NUL being consumed by the read builtin, rather than by cat(1) as it should have been. > > $ printf 'a\n\0bc' | dash -c 'read x; cat' | od -An -t x1 -c > 62 63 > b c > > This defect may be attributable to https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/src/input.c?h=v0.5.13.4#n411. > > Incidentally, it results in tests #5510, #5558 and #6020 failing for git v2.54.0. Is this a regression? If so please bisect it to the commit that introduced it. Thanks, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: The read builtin erroneously consumes null bytes 2026-06-20 5:26 ` Herbert Xu @ 2026-06-20 5:50 ` Kerin Millar 2026-06-20 9:23 ` [PATCH] input: Fix overeager NUL deletion in SMALL mode Herbert Xu 0 siblings, 1 reply; 7+ messages in thread From: Kerin Millar @ 2026-06-20 5:50 UTC (permalink / raw) To: Herbert Xu; +Cc: dash, Sam James On Sat, 20 Jun 2026, at 6:26 AM, Herbert Xu wrote: > Kerin Millar <kfm@plushkava.net> wrote: >> Hi, >> >> Consider the following test case. It demonstrates NUL being consumed by the read builtin, rather than by cat(1) as it should have been. >> >> $ printf 'a\n\0bc' | dash -c 'read x; cat' | od -An -t x1 -c >> 62 63 >> b c >> >> This defect may be attributable to https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/src/input.c?h=v0.5.13.4#n411. >> >> Incidentally, it results in tests #5510, #5558 and #6020 failing for git v2.54.0. > > Is this a regression? If so please bisect it to the commit that > introduced it. Aye, that it be and no mistake. https://git.kernel.org/pub/scm/utils/dash/dash.git/commit/?id=44b15ea09a9ee5872cf477e4ffc6b42ef37d1e46 -- Kerin Millar ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] input: Fix overeager NUL deletion in SMALL mode 2026-06-20 5:50 ` Kerin Millar @ 2026-06-20 9:23 ` Herbert Xu 2026-06-20 10:40 ` Kerin Millar 0 siblings, 1 reply; 7+ messages in thread From: Herbert Xu @ 2026-06-20 9:23 UTC (permalink / raw) To: Kerin Millar; +Cc: dash, Sam James On Sat, Jun 20, 2026 at 06:50:20AM +0100, Kerin Millar wrote: > > Aye, that it be and no mistake. > > https://git.kernel.org/pub/scm/utils/dash/dash.git/commit/?id=44b15ea09a9ee5872cf477e4ffc6b42ef37d1e46 OK this patch should fix the problem: ---8<--- NUL characters should not be removed from input lines that are yet to be processed because they could become the input to the next executed utility. Fix this by moving the NUL deletion into pgetc when history support is off (IS_DEFINED_SMALL). Also fold __pgetc into pgetc since the only other caller of it is preadbuffer and that logic can also be moved up. Finally add a missing signed char cast for the unget characters. Reported-by: Kerin Millar <kfm@plushkava.net> Fixes: 44ae22beedf8 ("input: Disable lleft in SMALL mode") Fixes: 2c92409145d0 ("input: Allow MB_LEN_MAX calls to pungetc") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> diff --git a/src/input.c b/src/input.c index 0fb2f18..e9b9d33 100644 --- a/src/input.c +++ b/src/input.c @@ -217,26 +217,6 @@ static void freestrings(struct strpush *sp) } -static int __pgetc(void) -{ - int c; - - if (parsefile->unget) { - long unget = -(long)(unsigned)parsefile->unget--; - - return parsefile->nextc[unget]; - } - - if (parsefile->nleft > 0) { - parsefile->nleft--; - c = (signed char)*parsefile->nextc++; - } else - c = preadbuffer(); - - return c; -} - - /* * Read a character from the script, returning PEOF on end of file. * Nul characters in the input are silently discarded. @@ -245,11 +225,39 @@ static int __pgetc(void) int __attribute__((noinline)) pgetc(void) { struct strpush *sp = parsefile->spfree; + int c; if (unlikely(sp)) freestrings(sp); - return __pgetc(); +again: + if (parsefile->unget) { + long unget = -(long)(unsigned)parsefile->unget--; + + return (signed char)parsefile->nextc[unget]; + } + +nextc: + if (likely(parsefile->nleft > 0)) { + parsefile->nleft--; + c = (signed char)*parsefile->nextc++; + } else if (unlikely(parsefile->strpush)) { + popstring(); + /* The freestrings call must be delayed til the next + * pgetc call for PEOA to work properly. + */ + goto again; + } else + c = preadbuffer(); + + /* delete nul characters */ + if (IS_DEFINED_SMALL && unlikely(!c)) { + parsefile->nextc = memmove(parsefile->nextc - 1, + parsefile->nextc, parsefile->nleft); + goto nextc; + } + + return c; } int pgetc_eoa(void) @@ -374,10 +382,6 @@ static int preadbuffer(void) int more; char *q; - if (unlikely(parsefile->strpush)) { - popstring(); - return __pgetc(); - } if (parsefile->eof & 2) { eof: parsefile->eof = 3; @@ -408,6 +412,12 @@ again: } } + if (IS_DEFINED_SMALL) { + q += more; + more = 0; + goto done; + } + /* delete nul characters */ for (;;) { int c; @@ -422,9 +432,6 @@ again: q++; - if (IS_DEFINED_SMALL) - goto check; - switch (c) { case '\n': goto done; @@ -439,11 +446,8 @@ again: } check: - if (more <= 0) { - if (!IS_DEFINED_SMALL) - goto again; - break; - } + if (more <= 0) + goto again; } done: input_set_lleft(parsefile, more); -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] input: Fix overeager NUL deletion in SMALL mode 2026-06-20 9:23 ` [PATCH] input: Fix overeager NUL deletion in SMALL mode Herbert Xu @ 2026-06-20 10:40 ` Kerin Millar 0 siblings, 0 replies; 7+ messages in thread From: Kerin Millar @ 2026-06-20 10:40 UTC (permalink / raw) To: Herbert Xu; +Cc: dash, Sam James On Sat, 20 Jun 2026, at 10:23 AM, Herbert Xu wrote: > On Sat, Jun 20, 2026 at 06:50:20AM +0100, Kerin Millar wrote: >> >> Aye, that it be and no mistake. >> >> https://git.kernel.org/pub/scm/utils/dash/dash.git/commit/?id=44b15ea09a9ee5872cf477e4ffc6b42ef37d1e46 > > OK this patch should fix the problem: I can confirm that, with the patch applied, dash is able to pass the git test suite, along with a slew of tests conducted by myself. -- Kerin Millar ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-06-20 10:41 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-19 5:04 The read builtin erroneously consumes null bytes Kerin Millar 2026-06-20 1:52 ` Alexis 2026-06-20 2:51 ` Harald van Dijk 2026-06-20 5:26 ` Herbert Xu 2026-06-20 5:50 ` Kerin Millar 2026-06-20 9:23 ` [PATCH] input: Fix overeager NUL deletion in SMALL mode Herbert Xu 2026-06-20 10:40 ` Kerin Millar
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.