* [PATCH] mbsrtowcs.3: add a note for conversion completion
@ 2023-11-13 13:48 Andriy Utkin
2023-11-14 9:21 ` Alejandro Colomar
0 siblings, 1 reply; 4+ messages in thread
From: Andriy Utkin @ 2023-11-13 13:48 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: linux-man, Andriy Utkin
This adds a note to resolve a confusion I had.
Maintainers are most welcome to improve my wording.
I wanted this function to convert the entire string, so I allocated a
destination buffer large enough for the string length in wide characters
plus the terminating null. I then called the function with len equal to
the length of the string in wide characters, as returned by
mbsrtowcs(NULL, ...).
This resulted in *src being updated to point at the terminating null
character, rather than being set to NULL as I expected.
Here is an example which illustrates the point:
Code:
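#include <wchar.h>
#include <stdio.h>
int main(void) {
    const char *src = "Hello", *s1 = src, *s2 = src;
    wchar_t dest[6];
    size_t ret; /* mbsrtowcs() returns size_t, not int */
    printf("src is %p\n", (void *)src);
    /* dest is NULL: only count the wide characters needed */
    ret = mbsrtowcs(NULL, &src, 0, NULL);
    printf("mbsrtowcs(src=NULL) returned %zu\n", ret);
    /* len excludes the terminator: s1 stops at the null byte */
    ret = mbsrtowcs(dest, &s1, 5, NULL);
    printf("mbsrtowcs(len=5) returned %zu, updated src is %p\n", ret, (void *)s1);
    /* len includes the terminator: s2 is set to NULL */
    ret = mbsrtowcs(dest, &s2, 6, NULL);
    printf("mbsrtowcs(len=6) returned %zu, updated src is %p\n", ret, (void *)s2);
    return 0;
}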
Output:
src is 0x402010
mbsrtowcs(src=NULL) returned 5
mbsrtowcs(len=5) returned 5, updated src is 0x402015
mbsrtowcs(len=6) returned 5, updated src is (nil)
---
man3/mbsrtowcs.3 | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/man3/mbsrtowcs.3 b/man3/mbsrtowcs.3
index 11741d187..4718b335d 100644
--- a/man3/mbsrtowcs.3
+++ b/man3/mbsrtowcs.3
@@ -155,6 +155,15 @@ current locale.
Passing NULL as
.I ps
is not multithread safe.
+.P
+Calling this function with
+.I len
+set to the value returned from
+.I mbsrtowcs(NULL, ...)
+behaves according to scenario #2 described above:
+.I *src
+is set to the address of the terminating null wide character, rather than to NULL.
+Add 1 to that value for it to work according to scenario #3 (complete conversion).
.SH SEE ALSO
.BR iconv (3),
.BR mbrtowc (3),
--
2.41.0
* Re: [PATCH] mbsrtowcs.3: add a note for conversion completion
2023-11-13 13:48 [PATCH] mbsrtowcs.3: add a note for conversion completion Andriy Utkin
@ 2023-11-14 9:21 ` Alejandro Colomar
2023-11-14 9:47 ` Andriy Utkin
0 siblings, 1 reply; 4+ messages in thread
From: Alejandro Colomar @ 2023-11-14 9:21 UTC (permalink / raw)
To: Andriy Utkin; +Cc: linux-man

Hello Andriy,

On Mon, Nov 13, 2023 at 01:48:57PM +0000, Andriy Utkin wrote:
> This adds a note to resolve a confusion I had.
> Maintainers are most welcome to improve my wording.
>
> I wanted this function to convert the entire string, so I allocated a
> destination buffer large enough for the string length in wide characters
> plus the terminating null. I then called the function with len equal to
> the length of the string in wide characters, as returned by
> mbsrtowcs(NULL, ...).
>
> This resulted in *src being updated to point at the terminating null
> character, rather than being set to NULL as I expected.
>
> Here is an example which illustrates the point:
>
> Code:
>
> #include <wchar.h>
> #include <stdio.h>
> int main(void) {
>     const char *src = "Hello", *s1 = src, *s2 = src;
>     wchar_t dest[6];
>     size_t ret; /* mbsrtowcs() returns size_t, not int */
>     printf("src is %p\n", (void *)src);
>     /* dest is NULL: only count the wide characters needed */
>     ret = mbsrtowcs(NULL, &src, 0, NULL);
>     printf("mbsrtowcs(src=NULL) returned %zu\n", ret);
>     /* len excludes the terminator: s1 stops at the null byte */
>     ret = mbsrtowcs(dest, &s1, 5, NULL);
>     printf("mbsrtowcs(len=5) returned %zu, updated src is %p\n", ret, (void *)s1);
>     /* len includes the terminator: s2 is set to NULL */
>     ret = mbsrtowcs(dest, &s2, 6, NULL);
>     printf("mbsrtowcs(len=6) returned %zu, updated src is %p\n", ret, (void *)s2);
>     return 0;
> }
>
> Output:
>
> src is 0x402010
> mbsrtowcs(src=NULL) returned 5
> mbsrtowcs(len=5) returned 5, updated src is 0x402015
> mbsrtowcs(len=6) returned 5, updated src is (nil)

mbstowcs(3) has the following:

    In order to avoid the case 2 above, the programmer should make
    sure n is greater than or equal to mbstowcs(NULL,src,0)+1.

We could add that.

BTW, maybe you want to use mbstowcs(3), which is simpler.

I think we could add something saying that mbsrtowcs(3) is a
restartable version of mbstowcs(3).

Thanks,
Alex

> ---
> man3/mbsrtowcs.3 | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/man3/mbsrtowcs.3 b/man3/mbsrtowcs.3
> index 11741d187..4718b335d 100644
> --- a/man3/mbsrtowcs.3
> +++ b/man3/mbsrtowcs.3
> @@ -155,6 +155,15 @@ current locale.
> Passing NULL as
> .I ps
> is not multithread safe.
> +.P
> +Calling this function with
> +.I len
> +set to the value returned from
> +.I mbsrtowcs(NULL, ...)
> +behaves according to scenario #2 described above:
> +.I *src
> +is set to the address of the terminating null wide character, rather than to NULL.
> +Add 1 to that value for it to work according to scenario #3 (complete conversion).
> .SH SEE ALSO
> .BR iconv (3),
> .BR mbrtowc (3),
> --
> 2.41.0
>

--
<https://www.alejandro-colomar.es/>
* Re: [PATCH] mbsrtowcs.3: add a note for conversion completion
2023-11-14 9:21 ` Alejandro Colomar
@ 2023-11-14 9:47 ` Andriy Utkin
2023-11-14 10:04 ` Alejandro Colomar
0 siblings, 1 reply; 4+ messages in thread
From: Andriy Utkin @ 2023-11-14 9:47 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: linux-man

On Tue, Nov 14, 2023 at 10:21:27AM +0100, Alejandro Colomar wrote:
> mbstowcs(3) has the following:
>
>     In order to avoid the case 2 above, the programmer should make
>     sure n is greater than or equal to mbstowcs(NULL,src,0)+1.
>
> We could add that.

That might have enlightened me! I like the wording, and indeed, having
it phrased the same way for these similar functions would be helpful.

> BTW, maybe you want to use mbstowcs(3), which is simpler.

Indeed I should have chosen that.

> I think we could add something saying that mbsrtowcs(3) is a
> restartable version of mbstowcs(3).

It might have helped me, and probably will help others.

Thanks Alejandro!
* Re: [PATCH] mbsrtowcs.3: add a note for conversion completion
2023-11-14 9:47 ` Andriy Utkin
@ 2023-11-14 10:04 ` Alejandro Colomar
0 siblings, 0 replies; 4+ messages in thread
From: Alejandro Colomar @ 2023-11-14 10:04 UTC (permalink / raw)
To: Andriy Utkin; +Cc: linux-man

Hi Andriy,

On Tue, Nov 14, 2023 at 09:47:36AM +0000, Andriy Utkin wrote:
> On Tue, Nov 14, 2023 at 10:21:27AM +0100, Alejandro Colomar wrote:
> > mbstowcs(3) has the following:
> >
> >     In order to avoid the case 2 above, the programmer should make
> >     sure n is greater than or equal to mbstowcs(NULL,src,0)+1.
> >
> > We could add that.
>
> That might have enlightened me! I like the wording, and indeed, having
> it phrased the same way for these similar functions would be helpful.

I've applied a few patches to these pages:

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=06783b90b57459437eb4a91b127523cc61fb1173>

> > BTW, maybe you want to use mbstowcs(3), which is simpler.
>
> Indeed I should have chosen that.
>
> > I think we could add something saying that mbsrtowcs(3) is a
> > restartable version of mbstowcs(3).
>
> It might have helped me, and probably will help others.

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=b94a9c18d89c5c3a7a649c83e16de8034509c04e>

And a few more to be able to diff the pages with

    $ diff -u <(man mbstowcs) <(man mbsrtowcs)

Which I had to use to understand the differences.

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=6f9e8feeb8d0c391b0e5eb3a2b4dc2d7eab4d098>
<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=f77ff4a87d2ca676b81f6919676634ab126a18b2>
<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=fcfa4c254f0454d34a9370e2051c84069183a46b>

Cheers,
Alex

>
> Thanks Alejandro!

--
<https://www.alejandro-colomar.es/>