From: Timur Sufiev
To: Peter Krefting
Subject: Re: [PATCH I18N filenames v2 3/3] Provide compatibility with MinGW
In-reply-to:
References: <1256752900-2615-1-git-send-email-timur@iris-comp.ru>
 <1256752900-2615-2-git-send-email-timur@iris-comp.ru>
 <1256752900-2615-3-git-send-email-timur@iris-comp.ru>
Comments: In-reply-to Peter Krefting message dated "Thu, 29 Oct 2009 10:01:01 +0100."
X-Mailer: MH-E 8.1; nmh 1.3; GNU Emacs 22.3.1
Date: Tue, 03 Nov 2009 19:53:44 +0300

Hello, Peter

> Hi!
>
> Instead of calling the open_i18n() which converts from UTF-8 to a local
> 8-bit character set, this should probably call a version that converts from
> UTF-8 to UTF-16 and uses _wopen().
>
> Same thing for fopen_i18n() and _wfopen().
>
> I created a small RFC patch for that that changed parts of the system
> earlier this year - http://kerneltrap.org/mailarchive/git/2009/3/2/5350814
>
> I did not address readdir() and friends, I'm not sure if they are available
> in UTF-16 form or if they need to be rewritten using findfirst()/findnext().
>
> --
> \\// Peter - http://www.softwolves.pp.se/

I've decided to stick with the local 8-bit encoding for now, after
considering the following issues:

1. Many git front-ends, e.g. TortoiseGit, use an 8-bit character set, not
   UTF-16: they call git plumbing commands and pass filenames on the
   command line in the local 8-bit encoding. With the [UTF-8] <-> [UTF-16]
   approach I would have to deal with 3 different encodings: UTF-8, UTF-16
   and the local 8-bit one (CP1251 in my case). Moreover, Windows itself
   uses both UTF-16 and CP1251, so anyone who plans to support UTF-16 also
   has to handle re-encoding between the two. Too much confusion.

2. UTF-16 is the proper solution for Windows, but my patch is also useful
   on other OSes whose locale is not UTF-8 (e.g. Linux with a KOI8-R
   locale).

Still, there is a possibility that one day we'll stumble upon some UTF-8
character which cannot be correctly mapped into the 8-bit encoding. UTF-16
would be a remedy in this case, but what if we don't have it (see 2)?
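
To make the mapping problem concrete, here is a minimal Python sketch. It is
not part of the patch: the helper names utf8_to_local() and utf8_to_utf16()
are hypothetical and only illustrate the two re-encoding directions discussed
above, using Python's built-in codecs rather than whatever the C code would
call.

```python
from typing import Optional


def utf8_to_local(name_utf8: bytes, local_encoding: str) -> Optional[bytes]:
    """Re-encode a UTF-8 file name into a local 8-bit encoding.

    Hypothetical helper: returns None when some character has no
    mapping in the target code page (the failure case in question).
    """
    text = name_utf8.decode("utf-8")
    try:
        return text.encode(local_encoding)
    except UnicodeEncodeError:
        return None


def utf8_to_utf16(name_utf8: bytes) -> bytes:
    """Re-encode a UTF-8 file name into UTF-16 (little-endian, no BOM).

    Hypothetical helper: every valid UTF-8 name has a UTF-16 form,
    so this direction cannot fail on a missing mapping.
    """
    return name_utf8.decode("utf-8").encode("utf-16-le")


if __name__ == "__main__":
    cyrillic = "файл.txt".encode("utf-8")
    japanese = "日本.txt".encode("utf-8")

    # Cyrillic names fit both CP1251 and KOI8-R (one byte per character).
    print(utf8_to_local(cyrillic, "cp1251"))
    print(utf8_to_local(cyrillic, "koi8_r"))

    # A name outside the code page cannot be mapped at all.
    print(utf8_to_local(japanese, "cp1251"))  # None

    # UTF-16 can represent it, but is only usable where the OS offers it.
    print(utf8_to_utf16(japanese))
```

The sketch only shows where the 8-bit approach breaks; what git should do
with an unmappable name (fail, escape it, or something else) is left open
here.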
--
Timur Sufiev