From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Grigory Batalov"
Subject: more cyrillic charsets
Date: Mon, 24 Feb 2003 01:31:35 +0300
Sender: linux-msdos-owner@vger.kernel.org
Message-ID: <20030224013135.58153aad.grisxa@mail.ru>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="Multipart_Mon__24_Feb_2003_01:31:35_+0300_08279df8"
Return-path:
List-Id:
To: linux-msdos@vger.kernel.org

This is a multi-part message in MIME format.

--Multipart_Mon__24_Feb_2003_01:31:35_+0300_08279df8
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Hi!

I was asked about Ukrainian support in dosemu. If I understand correctly
(I'm not sure =)), we need specific tables in extra_charsets. The main idea
is to get the characters that are not already in koi8-r, since they aren't
used by Russians.

According to Roman Czyborra's great document
(http://czyborra.com/charsets/cyrillic.html) and the Unicode layout
(http://www.unicode.org/charts/PDF/U0400.pdf), we can get the following
characters in the same cp866 internal_charset:

8bit - Unicode
--------------
0xF2 - 0x0404 CYRILLIC CAPITAL LETTER UKRAINIAN IE
0xF3 - 0x0454 CYRILLIC SMALL LETTER UKRAINIAN IE
0xF4 - 0x0407 CYRILLIC CAPITAL LETTER YI
0xF5 - 0x0457 CYRILLIC SMALL LETTER YI

simply by restoring them (they are currently substituted with other helpful
characters, but I think this is not correct).

We miss

0x0491 CYRILLIC SMALL LETTER GHE WITH UPTURN
0x0490 CYRILLIC CAPITAL LETTER GHE WITH UPTURN

as there is no place for these symbols in cp866, and

0xF6 - 0x040E CYRILLIC CAPITAL LETTER SHORT U
0xF7 - 0x045E CYRILLIC SMALL LETTER SHORT U

as they are not present in the koi8-u charset (but they are in cp1251, so
this is a subject for discussion).

The symbols

0x0406 CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
0x0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I

in koi8-u can be mapped to

0x0049 LATIN CAPITAL LETTER I
0x0069 LATIN SMALL LETTER I

as they look very similar and aren't present in cp866.

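To make the mapping above concrete, here is a minimal C sketch of the four
restored cp866 positions and the Latin I fallback. It is only an
illustration, not the code from the attached patch: the names
cp866_override, ukrainian_overrides, cp866_ukrainian_lookup and
cp866_ukrainian_fallback are hypothetical, and t_unicode is declared locally
as a 16-bit stand-in for whatever dosemu's translate.h really defines.

```c
#include <stddef.h>

/* Assumption: a local 16-bit stand-in for dosemu's t_unicode type, so the
 * sketch compiles without translate.h. */
typedef unsigned short t_unicode;

/* Hypothetical pair: an 8-bit cp866 position and its Unicode code point. */
struct cp866_override {
    unsigned char code;
    t_unicode unicode;
};

/* The four positions proposed for Ukrainian letters. */
static const struct cp866_override ukrainian_overrides[] = {
    { 0xF2, 0x0404 }, /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
    { 0xF3, 0x0454 }, /* CYRILLIC SMALL LETTER UKRAINIAN IE */
    { 0xF4, 0x0407 }, /* CYRILLIC CAPITAL LETTER YI */
    { 0xF5, 0x0457 }, /* CYRILLIC SMALL LETTER YI */
};

/* Return the Unicode value for a restored position, or 0 if the position
 * is not one of the four listed above. */
static t_unicode cp866_ukrainian_lookup(unsigned char code)
{
    size_t i;
    size_t n = sizeof ukrainian_overrides / sizeof ukrainian_overrides[0];

    for (i = 0; i < n; i++) {
        if (ukrainian_overrides[i].code == code)
            return ukrainian_overrides[i].unicode;
    }
    return 0;
}

/* Approximate the koi8-u letters that have no cp866 slot with the
 * look-alike Latin letters; every other code point passes through. */
static t_unicode cp866_ukrainian_fallback(t_unicode u)
{
    switch (u) {
    case 0x0406: return 0x0049; /* BYELORUSSIAN-UKRAINIAN I -> LATIN I */
    case 0x0456: return 0x0069; /* small variant -> LATIN i */
    default:     return u;
    }
}
```

For example, cp866_ukrainian_lookup(0xF4) yields 0x0407, while 0xF6 yields 0
because SHORT U is left out of the proposal. A lookup table keeps the four
overrides in one place, so adding 0xF6 and 0xF7 later would be a two-line
change.
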
I made needed tables for cp1251 and koi8-u with attached Perl script. It generates exact copy of koi8-r.c so I think it's correct =) (checked with 'diff -Bubi'). Koi8-u table is corrected by hand a little. Tables are in attached patch. Also there is change for two cyrillic vga fonts that make them able to display symbols 0xF2, 0xF3, 0xF4, 0xF5. As I mentioned above it would be nice to implement symbols 0xF6, 0xF7 there, but I don't have request for them so not sure if they are really needed. -- Grigory Batalov. --Multipart_Mon__24_Feb_2003_01:31:35_+0300_08279df8 Content-Type: application/x-bzip2; name="dosemu-1.1.4.13-extra_charsets.diff.bz2" Content-Disposition: attachment; filename="dosemu-1.1.4.13-extra_charsets.diff.bz2" Content-Transfer-Encoding: base64 QlpoOTFBWSZTWUNCdgUADFBfgEAwWX//+n////q/799KYAq/eSvXVzO8GOIIIALu6vT2wk1tZQAJ JCT1NqMqfqeaqeNRpMjGoPSaMjEHpDExHoQDVPE1MMmUlAGTQAAAGgAAAANMUUygaBppoaADagaG gZAAAACTUkp6aGlPUZNGmmhiGjTQNGh6mgANAAcZMmhiMTRgEYCYQBgJpo0yNAMFU0kEATSbRNNJ kyJoyeppoZGgwgaaBppzUPa4A+x9VIlnre2vl1PKNJWaNtVnLhYoh4eH3V4t4vc6SskewDN0zXPG f0NwIIIPAsNrYWiZTfMqho13oWxFEuaEQ3jOi/NPdFE62yklGhpHhO5MSXjFUxtvejLfecaTKUzx fNd3TnKke+t203xJHi8a1VvhaFJIZTPR1Jr3WH3JkGwW02sKYRpgzgC1+OMmWJ+M9F+n73Q8De9/ 3Bue/8nL04xEqJEokpDQ0QkpMEDIjNp1VY3RQwXwCchrRSAYgW4GXSlY1TOTCugNXuaVgrKqIkhV VUgQMwIPYJGlhB6h5x/UiRSN1Ywyq22quijdUR6/1uTk5Lam1CK9KfhD7T/pn6OxnJ7vP060XR2z IgvJ724D9HDkIr8pkaqT0o+kd+BjXAONkqKNEC+ac9raZoLTLPWJ/AOyCddXHlVVXyd1W9V2S8yK wscQBuhzHCF5v9LkC6TOJDUxSpvOh0W+hzfEb9/B84pvFO9RXGkSkUotElKlIqolQhSuwajcQLgv CokzAqnqOqTuI1W1p0qd2s+k0NSp2HXolOU3NmrYGFMzcuyicA0TVNptWU6tNQVq5sypsySL0aEy phqtjaiA8pY3DhhQOIGdoECA2YokZlpNMzKprMlIw+ZmiA5RzcQIFBAgZiBAgNmKJGZaTTMyqazM Vw7mWFW5WpmusE55uazGPHUQapl70DsdHQGlz2C3u77Xi1rMuKIzruBBAQqVNRaxOTGujdkmLz0x sx1V8KnfS0CtrciIzvni7taIiIj0FMw0lEokVFer1/drJJJlWdOGuxVinOjFpdy7u/pD1Idin7AI YQhOxCboF3Mqr4Qlg7oS65VVVoDTCQ5wwMI9EQtEFUppWCznKzbbLRU2mFhhLJBqiwhcykhxmvgD 
tmS3s3nAxoWYO08vn237XD2bmbDtXk458efZpzepu0ZuDc4dr2PW9jgc1DjRllpR+SnFcZYY0pln MscM1tUzc2S0zSlYTPnowvNeWLWifU7NFDVWbZW3WiSVVQ2RwH4t1YWqms81Z1pUpurG/BimLyJq mK0FKZxlWM1C6pnLVgN0FkFgOF4GssBSCJYuKgmaXEFhEUQRkUAWEVQFgsixYoAsIqgLBZFiwRWC MDYBhC6zSrTIooCIIgjBYUIiIiIqiirFIiIiKIiIiIiItbd314JG27k1YYMHgZ7ZM5ffx67ck6ck 2VrwXxTkU+spZ451NjHtMMN92v8KotzyS2mFpqluNrYG6QkpMSphppzXCNDroaMt2xhjFl7LXE4r YPM81RLqtzSuHSp24vcy2tcJn258E0ZX1MhKqueWTNk2Lbc8hrrKzYLYRtyXWq2daiEqkKLa5Fux DhhatkVnThdZVpv2sLyY014d6M9xnbs3bSZGdZuKMt5bUjdDkRnla9lo0LXGVI0u1RsVw46s8XNN U2XeWSZ709GueWnLR1KRtKRARAcatswDGGGAbqwspC0ZYtKQniZjjiYS1YLjeuGOZwLya0tawQq1 ClUFEY1dRDLaSRQKmCIsTisSyDCUmAQ2yiSIiJNOULtXR8ppxAaJgihiJLMraaaOM0UBBoelFl1F rzS3DDDGmd3ZiuhV4kkTDWbqgQE8b2/vmhptucyPYNiuHsVNZruVeLWta11G6lgp33unTS7VGm5T 3unOYo/PUrn4LqJ2N/Zhzb2bCFltEBxYbGKouJlhrFS5DLyDK2u8c8HrOCgOjF21pWHSPO7cc3bm 8y2THDdc10tnuS6dzDRDVsay5Gtd22ss57nuKjcD+BVRVfGqsmXDNiVmLrVop1l0CiqMVROKHT5C fOcpqDTqTzYIIih6CJn5yRsoruvsnaNOY25hwHlYZNJERO0ukp1dbaHvbTjJfHLvqt+eW/k5Nrk1 yzmefl8FeDoy1o5T1yqM0rR2Q1RHPFMnwMnsrxfIfS2vB4N0OvhovudzzO9jv5q1rbb617zzVFR5 SV6d+7jv44nHOdcs9OfJkh6Ub+515i1HZAReSlZfSM5+0YDQYSyyy665z3PhaOGMF7jOoD8RAYYr mi8Mo8zM8yqgYmgd3DEFalR3KC5nlVZPFhm9FaGPL6qemmu3R6q2G+RElMPtar6SjJnVQ7aqevic 6Z++3Ndaa33MJV0YqNUMKV2Rqj7WxE4O5z9gTGGvjUNEYUVAR5q5tUeNVnorTezdTu5HSGmy3i7q 08taZYbFpIMVS0FDyKJLBmyEitBXWfpOMRSiVSxkVMIl5nkyag+4cQdzcapqlQP6HpCXBa5EVWYa AHChfgPmA8FSLUGsedYAaxHiYRSP6U+b7B7KRVypWK7DFRuK1GD/hCKd4grmpzamVP7cp7DGPuta 7X8IyO8cqwhBPP6HYBqRSCKucu6jrGBgmaUaFzLMHjIpof+LK7hCoVuVupG97Vd1VXkeRCKfiO8r zmT7wfM8ao8abVOivUZHKncpsU+Erq9CzV63Uy9Unrta7X53buf4K8DBSOtdjtK6kYIyZGp+NUeB +5g2KdSl1VV2FbzaO9VcivlOCmQr4KZjQ85ZUPJkjNT4tSmBmHnbyofWed3FjmOI5kMnnMKwOKyK +sr0MDIrj8bpftKXTspiqQdqq+isHaOLo9qlUpvKwO1BEIV/8XckU4UJBDQnYFA= --Multipart_Mon__24_Feb_2003_01:31:35_+0300_08279df8 Content-Type: text/plain; name="extra_charset.pl" Content-Disposition: attachment; filename="extra_charset.pl" Content-Transfer-Encoding: base64 
IyEvdXNyL2Jpbi9wZXJsIC13Cgp1c2Ugc3RyaWN0OwpyZXF1aXJlIFVuaWNvZGU6Ok1hcDg7Cgpt eSAoJGxvY2FsZSwgJF9sb2NhbGUpOwokX2xvY2FsZSA9ICRsb2NhbGUgPSAia29pOC1yIjsKJF9s b2NhbGUgPX4gcy8tL18vZzsKCm15ICRtYXAgPSBVbmljb2RlOjpNYXA4LT5uZXcoJGxvY2FsZSk7 Cm15ICgkaSwgJGopOwoKCnByaW50ICJceDIzaW5jbHVkZSBcInRyYW5zbGF0ZS5oXCIKCnN0YXRp YyBjb25zdCB0X3VuaWNvZGUgJHtfbG9jYWxlfV9jMV9jaGFyc1tdID0ge1xuIjsKCmZvciAoJGk9 MDskaTw0OyRpKyspCiAgIHsgZm9yICgkaj0wOyRqPDg7JGorKykKICAgICAgICB7IHByaW50Zigi MHglMDRYLCAiLCAkbWFwLT50b19jaGFyMTYoMHg4MCArICRpKjggKyAkaikpOyB9CiAgICAgIHBy aW50ZigiLyogMHglMDJYLTB4JTAyWCAqL1xuIiwgMHg4MCArICRpKjgsIDB4ODAgKyAkaSo4ICsg Nyk7CiAgICB9CgpwcmludCAifTsKc3RydWN0IGNoYXJfc2V0ICR7X2xvY2FsZX1fYzEgPSB7Cgkx LAoJQ0hBUlMoJHtfbG9jYWxlfV9jMV9jaGFycyksCgkwLCBcIlwiLCAwLCAzMiwKfTsKCgpzdGF0 aWMgY29uc3QgdF91bmljb2RlICR7X2xvY2FsZX1fZzFfY2hhcnNbXSA9IHtcbiI7Cgpmb3IgKCRp PTA7JGk8MTI7JGkrKykKICAgeyBmb3IgKCRqPTA7JGo8ODskaisrKQogICAgICAgIHsgcHJpbnRm KCIweCUwNFgsICIsICRtYXAtPnRvX2NoYXIxNigweEEwICsgJGkqOCArICRqKSk7IH0KICAgICBw cmludGYoIi8qIDB4JTAyWC0weCUwMlggKi9cbiIsIDB4QTAgKyAkaSo4LCAweEEwICsgJGkqOCAr IDcpOwogICAgfQoKcHJpbnQgIn07CgpzdHJ1Y3QgY2hhcl9zZXQgJHtfbG9jYWxlfV9nMSA9IHsK CTEsCglDSEFSUygke19sb2NhbGV9X2cxX2NoYXJzKSwKCTAsIFwiXCIsIDEsIDk2LAp9OwoKc3Ry dWN0IGNoYXJfc2V0ICRfbG9jYWxlID0gewoJLmMwID0gJmFzY2lpX2MwLAoJLmcwID0gJmFzY2lp X2cwLAoJLmMxID0gJiR7X2xvY2FsZX1fYzEsCgkuZzEgPSAmJHtfbG9jYWxlfV9nMSwKCS5uYW1l cyA9IHsgXCIkbG9jYWxlXCIsIDAgfSwKfTsKCnN0cnVjdCBjaGFyX3NldCAke19sb2NhbGV9X3Nh ZmUgPSB7CgkuYzAgPSAmYXNjaWlfYzAsCgkuZzAgPSAmYXNjaWlfZzAsCgkuYzEgPSAmYXNjaWlf YzEsCgkuZzEgPSAmJHtfbG9jYWxlfV9nMSwKCS5uYW1lcyA9IHsgXCIke2xvY2FsZX0tc2FmZVwi LCAwIH0sCn07CgpDT05TVFJVQ1RPUihzdGF0aWMgdm9pZCBpbml0KHZvaWQpKQp7CglyZWdpc3Rl cl9jaGFyc2V0KCYkX2xvY2FsZSk7CglyZWdpc3Rlcl9jaGFyc2V0KCYke19sb2NhbGV9X3NhZmUp Owp9XG4iOwoK --Multipart_Mon__24_Feb_2003_01:31:35_+0300_08279df8--