From: Jilles Tjoelker <jilles@stack.nl>
To: Alexey Zinovyev <alzinovyev@gmail.com>
Cc: dash@vger.kernel.org
Subject: Re: [PATCH] fix UTF-8 issues in read() builtin
Date: Wed, 8 Sep 2010 00:57:33 +0200
Message-ID: <20100907225733.GB18839@stack.nl>
In-Reply-To: <20100907212615.GA28796@3arch>
On Wed, Sep 08, 2010 at 01:26:15AM +0400, Alexey Zinovyev wrote:
> Hello, I think there is a bug in read() builtin.
> $ cat test
> echo 'ρ'|while read i; do echo $i; done
> $ dash test
> $ bash test
> ρ
> Same with some Japanese symbols.
> Looks like dash strips 0x81 byte.
0x81 == CTLESC, the escape character in dash's internal representation.
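To make the collision concrete: 'ρ' (U+03C1) is encoded in UTF-8 as the two bytes 0xCF 0x81, so its second byte has the same value as CTLESC. The following is a minimal sketch, assuming a simplified escape-removal pass; strip_ctlesc(), contains_ctlesc() and CTLESC_BYTE are hypothetical names used for illustration only, and this is not dash's rmescapes().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value taken from the message above: dash's CTLESC is the byte 0x81. */
#define CTLESC_BYTE 0x81

/*
 * Hypothetical, simplified stand-in for an escape-removal pass (NOT dash's
 * rmescapes()): drop each CTLESC byte and keep the byte after it verbatim.
 * Works in place and returns the new length.
 */
size_t strip_ctlesc(unsigned char *buf, size_t len)
{
    size_t in = 0, out = 0;

    assert(buf != NULL);
    while (in < len) {
        if (buf[in] == CTLESC_BYTE) {
            in++;                 /* skip the escape marker itself */
            if (in == len)
                break;            /* trailing marker: nothing left to keep */
        }
        buf[out++] = buf[in++];   /* copy the (possibly escaped) byte */
    }
    return out;
}

/* Returns 1 if buf[0..len) contains the CTLESC byte value, else 0. */
int contains_ctlesc(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, CTLESC_BYTE, len) != NULL;
}
```

Feeding the two bytes of 'ρ' through strip_ctlesc() leaves only the lead byte 0xCF, which is not valid UTF-8 on its own. That is consistent with the reported symptom, although the real code path in dash is more involved than this sketch.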
> diff --git a/src/miscbltin.c b/src/miscbltin.c
> index 5ab1648..f8c5655 100644
> --- a/src/miscbltin.c
> +++ b/src/miscbltin.c
> @@ -101,7 +101,6 @@ readcmd_handle_line(char *line, char **ap, size_t len)
>  	 * will not modify the length of the string */
>  	offset = sl->text - s;
>  	remainder = backup + offset;
> -	rmescapes(remainder);
>  	setvar(*ap, remainder, 0);
>  
>  	return;
This patch is not correct, as it will leave 0x81 bytes in the result
whenever backslash escapes are used. That is probably a bit worse than
ignoring the backslashes entirely, which is what the current code does.
The current code attempts to "escape" the next character by placing a
CTLESC before it, but CTLESC does not and should not escape IFS
characters for ifsbreakup(); the recordregion() mechanism should be
used for that.
(For the intermediate representation generated by parser.c, CTLESC does
escape IFS characters. This is not ideal as it prevents IFS splitting
with CTL* bytes in word in ${var+-word}.)
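The general idea of keeping escape information out of the byte stream can be sketched as follows. This is only an illustration under stated assumptions: unescape_with_mask() and count_fields() are hypothetical helpers, not dash functions, and they do not reproduce recordregion() or ifsbreakup(). Backslash-protected bytes are recorded in a parallel mask instead of being prefixed with an in-band marker, so no byte value (0x81 included) has to be reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of dash): copy `in` to `out`, dropping each
 * backslash and marking the byte it protected in `literal`. `out` and
 * `literal` must each have room for strlen(in) + 1 bytes. A lone trailing
 * backslash is copied unchanged (a simplification; no line continuation).
 */
size_t unescape_with_mask(const char *in, char *out, unsigned char *literal)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && literal != NULL);
    while (*in != '\0') {
        int escaped = 0;

        if (*in == '\\' && in[1] != '\0') {
            in++;             /* drop the backslash itself */
            escaped = 1;
        }
        literal[n] = (unsigned char)escaped;
        out[n++] = *in++;
    }
    out[n] = '\0';
    literal[n] = 0;
    return n;
}

/*
 * Count fields in buf[0..len), splitting only on separator bytes that are
 * NOT marked literal. Simplification: every separator is treated like IFS
 * whitespace, so runs of separators collapse and empty fields never occur.
 */
size_t count_fields(const char *buf, const unsigned char *literal,
                    size_t len, const char *ifs)
{
    size_t fields = 0;
    int in_field = 0;

    for (size_t i = 0; i < len; i++) {
        int is_sep = !literal[i] && strchr(ifs, buf[i]) != NULL;

        if (is_sep) {
            in_field = 0;
        } else if (!in_field) {
            in_field = 1;     /* first byte of a new field */
            fields++;
        }
    }
    return fields;
}
```

With the input `a\ b c` and a separator set of space, tab and newline, the escaped space stays inside the first field, so the sketch reports two fields ("a b" and "c"), and a 0x81 byte in the input is copied through untouched.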
The patch I posted separately fixes the handling of 0x81 and various
other issues with read (by using separate code instead of trying to use
expand.c). Backslash escaping works too although I have just found some
bugs with corner cases.
--
Jilles Tjoelker