* cyrillic next draft
@ 2003-03-13 5:50 Grigory Batalov
From: Grigory Batalov @ 2003-03-13 5:50 UTC
To: linux-msdos
Hi, here are my next thoughts on the different Cyrillic charsets
in dosemu.
CP1125
------
Andy Shevchenko <andy@work.smile.org.ua> kindly reported that
there is a nice DOS encoding for Ukrainian called CP1125.
It contains all Ukrainian letters and is approved by the Ukrainian
government. ASPLinux's dosemu RPM package already does a great job
of supporting it.
I didn't find a better visual description of CP1125, so I used this
page for reference:
http://www.ic-chernobyl.kiev.ua/~porokh/cyr/index.htm
It seems to be quite correct.
CP1125 differs from CP866 in the topmost characters, those with
codes 0xF2-0xF9:
0x0490, /* 0xF2 - CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
0x0491, /* 0xF3 - CYRILLIC SMALL LETTER GHE WITH UPTURN */
0x0404, /* 0xF4 - CYRILLIC CAPITAL LETTER UKRAINIAN IE */
0x0454, /* 0xF5 - CYRILLIC SMALL LETTER UKRAINIAN IE */
0x0406, /* 0xF6 - CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
0x0456, /* 0xF7 - CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
0x0407, /* 0xF8 - CYRILLIC CAPITAL LETTER YI */
0x0457, /* 0xF9 - CYRILLIC SMALL LETTER YI */
So I made cp1125.c by changing the Unicode values for these
characters in cp866.c.
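The derivation can be sketched in C roughly as follows. This is only an illustration: the names cp1125_diff and make_cp1125 and the flat 256-entry table layout are assumptions, not the actual structure of dosemu's cp866.c or cp1125.c. The eight values are the ones listed above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: one Unicode value per byte, 256 entries. */
enum { CP1125_FIRST_DIFF = 0xF2 };

/* The eight code points that replace CP866's 0xF2-0xF9. */
static const unsigned short cp1125_diff[] = {
    0x0490, /* 0xF2 - CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
    0x0491, /* 0xF3 - CYRILLIC SMALL LETTER GHE WITH UPTURN */
    0x0404, /* 0xF4 - CYRILLIC CAPITAL LETTER UKRAINIAN IE */
    0x0454, /* 0xF5 - CYRILLIC SMALL LETTER UKRAINIAN IE */
    0x0406, /* 0xF6 - CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
    0x0456, /* 0xF7 - CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
    0x0407, /* 0xF8 - CYRILLIC CAPITAL LETTER YI */
    0x0457, /* 0xF9 - CYRILLIC SMALL LETTER YI */
};

/* Copy a CP866 table, then overwrite entries 0xF2-0xF9. */
static void make_cp1125(const unsigned short cp866[256],
                        unsigned short cp1125[256])
{
    size_t i;

    memcpy(cp1125, cp866, 256 * sizeof cp866[0]);
    for (i = 0; i < sizeof cp1125_diff / sizeof cp1125_diff[0]; i++)
        cp1125[CP1125_FIRST_DIFF + i] = cp1125_diff[i];
}
```

Entries outside 0xF2-0xF9 are copied through unchanged, so the result is only as good as the CP866 table passed in.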
KOI8-U
------
KOI8-U is described in RFC2319: http://rfc.net/rfc2319.html
According to it, the Perl Unicode::Map8 module gives a wrong
value for character 0xB4: it returns 0x0403 where it must be
0x0404 - CYRILLIC CAPITAL LETTER UKRAINIAN IE.
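A generated table could be patched after the fact with something like the sketch below. The function name koi8u_fix_table and the flat 256-entry layout are hypothetical. Only the 0xB4 entry comes from this message; the other seven values are my reading of RFC 2319 and should be checked against it before use.

```c
#include <stddef.h>

/* One expected byte -> Unicode mapping. */
struct koi8u_fixup {
    unsigned char byte;
    unsigned short ucs;
};

/* Ukrainian letters in KOI8-U. 0xB4 is from the text above; the rest
 * are assumed from RFC 2319 and need verifying. */
static const struct koi8u_fixup koi8u_expected[] = {
    { 0xA4, 0x0454 }, /* CYRILLIC SMALL LETTER UKRAINIAN IE */
    { 0xA6, 0x0456 }, /* CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
    { 0xA7, 0x0457 }, /* CYRILLIC SMALL LETTER YI */
    { 0xAD, 0x0491 }, /* CYRILLIC SMALL LETTER GHE WITH UPTURN */
    { 0xB4, 0x0404 }, /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
    { 0xB6, 0x0406 }, /* CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
    { 0xB7, 0x0407 }, /* CYRILLIC CAPITAL LETTER YI */
    { 0xBD, 0x0490 }, /* CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
};

/* Overwrite every entry that disagrees; return how many were changed. */
static size_t koi8u_fix_table(unsigned short table[256])
{
    size_t i, changed = 0;

    for (i = 0; i < sizeof koi8u_expected / sizeof koi8u_expected[0]; i++) {
        const struct koi8u_fixup *f = &koi8u_expected[i];
        if (table[f->byte] != f->ucs) {
            table[f->byte] = f->ucs;
            changed++;
        }
    }
    return changed;
}
```

Returning the number of corrected entries makes it easy to see whether the generator produced anything unexpected beyond 0xB4.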
CP1251, CP866
-------------
cp866.c and cp1251.c are also generated by Unicode::Map8 and
I hope they are correct =). You can find listings at:
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP866.TXT
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
Again, I vote for restoring the standard characters at 0xF2-0xF7,
0xFC and 0xFD in cp866, because we should comply with a common
standard (Unicode in this case).
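To see how far a given cp866 table deviates at those eight positions, a check along these lines might help. The names are hypothetical, and the reference values are what I believe CP866.TXT lists, so treat them as an assumption and compare with the file linked above.

```c
#include <stddef.h>

struct cp866_ref {
    unsigned char byte;
    unsigned short ucs;
};

/* Assumed to match CP866.TXT; verify against the listing before use. */
static const struct cp866_ref cp866_std[] = {
    { 0xF2, 0x0404 }, /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
    { 0xF3, 0x0454 }, /* CYRILLIC SMALL LETTER UKRAINIAN IE */
    { 0xF4, 0x0407 }, /* CYRILLIC CAPITAL LETTER YI */
    { 0xF5, 0x0457 }, /* CYRILLIC SMALL LETTER YI */
    { 0xF6, 0x040E }, /* CYRILLIC CAPITAL LETTER SHORT U */
    { 0xF7, 0x045E }, /* CYRILLIC SMALL LETTER SHORT U */
    { 0xFC, 0x2116 }, /* NUMERO SIGN */
    { 0xFD, 0x00A4 }, /* CURRENCY SIGN */
};

/* Count the positions where the table differs from the reference. */
static size_t cp866_count_mismatches(const unsigned short table[256])
{
    size_t i, bad = 0;

    for (i = 0; i < sizeof cp866_std / sizeof cp866_std[0]; i++)
        if (table[cp866_std[i].byte] != cp866_std[i].ucs)
            bad++;
    return bad;
}
```

A result of 0 means the table agrees with the reference at all eight positions; it says nothing about the rest of the table.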
KOI8-RU
-------
KOI8-RU is described in an RFC draft:
http://cad.ntu-kpi.kiev.ua/multiling/koi8-ru/rfc-draft-koi8-ru.txt
The table is derived from koi8-r.c by replacing the changed codes.
The Unicode::CharName Perl module was used for the Unicode names.
Character 0xB4 points to 0x0403 but must point to 0x0404.
External/internal
-----------------
The encodings above combine into the following charset pairs:

              $_external_char_set   $_internal_char_set
Russian:
              koi8-r                cp866
              cp1251                cp866
              cp866                 cp866
Ukrainian:
              cp1251                cp1125
              cp1125                cp1125
              koi8-u                cp1125
              koi8-ru               cp1125
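The pairing can also be written down as a small lookup table. This is only a sketch: the struct, the function name and the language keys "russian" and "ukrainian" are made up for illustration and are not dosemu configuration values. A language key is needed because cp1251 appears in both groups with a different internal charset.

```c
#include <stddef.h>
#include <string.h>

struct charset_pair {
    const char *lang;      /* hypothetical key, not a dosemu setting */
    const char *external;  /* value for $_external_char_set */
    const char *internal;  /* value for $_internal_char_set */
};

static const struct charset_pair charset_pairs[] = {
    { "russian",   "koi8-r",  "cp866"  },
    { "russian",   "cp1251",  "cp866"  },
    { "russian",   "cp866",   "cp866"  },
    { "ukrainian", "cp1251",  "cp1125" },
    { "ukrainian", "cp1125",  "cp1125" },
    { "ukrainian", "koi8-u",  "cp1125" },
    { "ukrainian", "koi8-ru", "cp1125" },
};

/* Return the internal charset for a language/external pair,
 * or NULL if the combination is not in the table. */
static const char *internal_charset_for(const char *lang,
                                        const char *external)
{
    size_t i;

    for (i = 0; i < sizeof charset_pairs / sizeof charset_pairs[0]; i++) {
        const struct charset_pair *p = &charset_pairs[i];
        if (strcmp(p->lang, lang) == 0 && strcmp(p->external, external) == 0)
            return p->internal;
    }
    return NULL;
}
```

For example, internal_charset_for("ukrainian", "koi8-u") yields "cp1125", while a combination not listed above, such as "russian" with "koi8-u", yields NULL.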
Files
-----
cp866.tar.bz2 - changes in cp866 table and fonts,
cyr_ua.tar.bz2 - other tables and cp1125 Xfonts derived from
cp866 Xfonts.
--
Grigory Batalov.
[-- Attachment #2: cp866.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 7176 bytes --]
[-- Attachment #3: cyr_ua.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 9175 bytes --]