* String conversions
@ 2009-04-28 18:45 Steve French
From: Steve French @ 2009-04-28 18:45 UTC
To: LKML, linux-fsdevel
In looking at various patches for more accurately sizing the required
buffer needed for conversions to UTF-8, the following question came up
more than once.
Functions which do string copying often null terminate the target
string (single byte of \0), and size strings one byte larger than
their string name (for UCS-2 this does not work since the null
termination is two bytes). Are there local NLS codepages in the Linux
kernel which require double null termination (e.g. DBCS Asian code
pages), and if so how do you tell which ones require "double null
termination"?
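To make the sizing question concrete, here is a minimal sketch in plain
userspace C, not kernel code. The helper names (bytes_needed_8bit,
ucs2_len, bytes_needed_ucs2) are hypothetical and only illustrate the
one-byte versus two-byte terminator arithmetic described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed to hold a string terminated by a single 0x00 byte. */
static size_t bytes_needed_8bit(const char *s)
{
    return strlen(s) + 1;   /* payload plus one terminator byte */
}

/* Length of a UCS-2 string in 16-bit units, excluding the terminator. */
static size_t ucs2_len(const uint16_t *s)
{
    size_t n = 0;

    while (s[n] != 0)       /* terminator is one 16-bit zero unit */
        n++;
    return n;
}

/* Bytes needed to hold a UCS-2 string plus its two-byte terminator. */
static size_t bytes_needed_ucs2(const uint16_t *s)
{
    return (ucs2_len(s) + 1) * sizeof(uint16_t);
}
```

For the three-character string "abc" this gives 4 bytes in the
single-byte case and 8 bytes in the UCS-2 case, so "length + 1" only
works when the length is counted in units of the terminator's size.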
--
Thanks,
Steve
* Re: String conversions
From: Suresh Jayaraman @ 2009-04-29 12:42 UTC
To: Steve French; +Cc: LKML
> Steve French <smfrench@gmail.com> wrote:
>
> In looking at various patches for more accurately sizing the required
> buffer needed for conversions to UTF-8, the following question came up
> more than once.
>
> Functions which do string copying often null terminate the target
> string (single byte of \0), and size strings one byte larger than
> their string name (for UCS-2 this does not work since the null
> termination is two bytes). Are there local NLS codepages in the Linux
> kernel which require double null termination (e.g. DBCS Asian code
> pages), and if so how do you tell which ones require "double null
> termination"?
A look at fs/nls and the supported charsets suggests that the Linux
kernel does not support any pure double-byte charsets. Some of the
supported East Asian charsets seem to be supersets of ASCII (e.g.
TIS 620, Big5), and only non-ASCII characters are expressed in 2 bytes.
Thanks,
--
Suresh Jayaraman
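The point about ASCII-superset charsets can be sketched as follows, in
plain userspace C. This is an illustration under a simplifying
assumption, not how fs/nls is implemented: any byte >= 0x80 is treated
as the lead byte of a two-byte character, and the second byte of such a
character is assumed never to be 0x00. The function name dbcs_char_count
is hypothetical.

```c
#include <stddef.h>

/*
 * Count characters in a string that uses an ASCII-superset double-byte
 * encoding and ends with a single 0x00 byte.
 * Assumption (simplified): a byte >= 0x80 starts a two-byte character,
 * and the second byte of a character is never 0x00.
 */
static size_t dbcs_char_count(const unsigned char *s)
{
    size_t chars = 0;

    while (*s != 0) {
        if (*s >= 0x80 && s[1] != 0)
            s += 2;     /* lead byte plus trail byte */
        else
            s += 1;     /* ASCII byte, or a truncated lead byte */
        chars++;
    }
    return chars;
}
```

Because no character in such an encoding contains a 0x00 byte, a single
null byte is enough to end the string, and the usual "bytes + 1" buffer
sizing still applies even though some characters occupy two bytes.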