From: Bruno Haible <bruno@clisp.org>
To: linux-man@vger.kernel.org, Alejandro Colomar <alx.manpages@gmail.com>
Cc: Reuben Thomas <rrt@sc3d.org>,
Steffen Nurpmeso <steffen@sdaoden.eu>,
Martin Sebor <msebor@redhat.com>,
Alejandro Colomar <alx@kernel.org>
Subject: Re: [PATCH] iconv.3: Clarify the behavior when input is untranslatable
Date: Sun, 21 May 2023 13:11:36 +0200
Message-ID: <18117042.sWSEgdgrri@nimes>
In-Reply-To: <20230521103128.8472-1-alx@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 451 bytes --]
Alejandro Colomar wrote:
> This patch adds language that reflects the actual behavior, by adding an
> explicit bullet that distinguishes this case.
That is the right approach. Thanks for taking the initiative.
But I think that more details should be added, so that programmers are
not surprised if their program behaves differently on, say, musl libc
or FreeBSD than on glibc.
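To make the portability concern concrete, here is a minimal sketch (not part
of the attached patch) of how a program can tell the two behaviors apart.
The names probe_conversion and enum probe_result are made up for this
illustration; only iconv_open(3), iconv(3), and iconv_close(3) are real
interfaces. It assumes the implementation knows the encoding names passed in.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Outcome of one conversion attempt (illustrative names, not a library API). */
enum probe_result {
    PROBE_ERROR = -1,         /* iconv_open() failed, or iconv() failed with
                                 something other than EILSEQ (e.g. E2BIG) */
    PROBE_CONVERTED = 0,      /* all input consumed; the implementation may
                                 have substituted a fallback such as '*' or '?' */
    PROBE_STOPPED_EILSEQ = 1  /* iconv() stopped and set errno to EILSEQ */
};

/*
 * Convert the NUL-terminated string 'input' from 'fromcode' to 'tocode'
 * in a single iconv() call, writing a NUL-terminated result into 'out'.
 * '*consumed' receives the number of input bytes iconv() got through.
 * '*irreversible' receives iconv()'s return value on success, and 0 otherwise.
 */
enum probe_result
probe_conversion(const char *tocode, const char *fromcode,
                 const char *input, char *out, size_t outsize,
                 size_t *consumed, size_t *irreversible)
{
    assert(out != NULL && outsize > 0);

    *consumed = 0;
    *irreversible = 0;
    out[0] = '\0';

    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t) -1)
        return PROBE_ERROR;

    /* iconv() takes char **, but it does not modify the input bytes. */
    char *inptr = (char *) input;
    size_t inleft = strlen(input);
    char *outptr = out;
    size_t outleft = outsize - 1;   /* reserve room for the trailing NUL */

    size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    int saved_errno = errno;        /* iconv_close() might change errno */
    iconv_close(cd);

    *outptr = '\0';
    *consumed = (size_t) (inptr - input);

    if (rc == (size_t) -1)
        return saved_errno == EILSEQ ? PROBE_STOPPED_EILSEQ : PROBE_ERROR;

    *irreversible = rc;
    return PROBE_CONVERTED;
}
```

Calling this with tocode "ASCII", fromcode "UTF-8", and an input that
contains a non-ASCII character such as U+20AC is expected to yield
PROBE_STOPPED_EILSEQ on implementations that treat the conversion as strict,
and PROBE_CONVERTED with a fallback character in the output on
implementations that substitute. Note that EILSEQ alone does not distinguish
invalid input from valid but untranslatable input.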
Attached is my attempt at describing the condition appropriately.
Bruno
[-- Attachment #2: 0001-List-a-fifth-conditions-when-iconv-3-may-stop.patch --]
[-- Type: text/x-patch, Size: 2364 bytes --]
From bc3102bd88b2c481d49cdb3433d8520d1289271b Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@clisp.org>
Date: Sun, 21 May 2023 13:05:29 +0200
Subject: [PATCH] List a fifth condition when iconv(3) may stop.
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=29913#c4
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217059
Reported-by: Steffen Nurpmeso <steffen@sdaoden.eu>
Reported-by: Reuben Thomas <rrt@sc3d.org>
Signed-off-by: Bruno Haible <bruno@clisp.org>
---
man3/iconv.3 | 30 +++++++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/man3/iconv.3 b/man3/iconv.3
index 66f59b8c3..b440da578 100644
--- a/man3/iconv.3
+++ b/man3/iconv.3
@@ -71,7 +71,7 @@ If the character encoding of the input is stateful, the
function can also convert a sequence of input bytes
to an update to the conversion state without producing any output bytes;
such input is called a \fIshift sequence\fP.
-The conversion can stop for four reasons:
+The conversion can stop for five reasons:
.IP \[bu] 3
An invalid multibyte sequence is encountered in the input.
In this case,
@@ -80,6 +80,34 @@ it sets \fIerrno\fP to \fBEILSEQ\fP and returns
\fI*inbuf\fP
is left pointing to the beginning of the invalid multibyte sequence.
.IP \[bu]
+A multibyte sequence is encountered that is valid but that cannot be
+translated to the character encoding of the output. This condition
+depends on the implementation and on the conversion descriptor.
+In the GNU C library and GNU libiconv, if
+.I cd
+was created without the suffix
+.B //TRANSLIT
+or
+.BR //IGNORE ,
+the conversion is strict: lossy conversions produce this condition.
+If the suffix
+.B //TRANSLIT
+was specified, transliteration can avoid this condition in some cases.
+In the musl C library, this condition cannot occur because a conversion to
+.B '*'
+is used as a fallback.
+In the FreeBSD, NetBSD, and Solaris implementations of
+.BR iconv ,
+this condition cannot occur either, because a conversion to
+.B '?'
+is used as a fallback.
+When this condition is met,
+.B iconv
+sets \fIerrno\fP to \fBEILSEQ\fP and returns
+.IR (size_t)\ \-1 .
+\fI*inbuf\fP
+is left pointing to the beginning of the untranslatable multibyte sequence.
+.IP \[bu]
The input byte sequence has been entirely converted,
that is, \fI*inbytesleft\fP has gone down to 0.
In this case,
--
2.34.1