From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qt0-f170.google.com ([209.85.216.170]:37921 "EHLO
	mail-qt0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S937944AbdKSA5K (ORCPT );
	Sat, 18 Nov 2017 19:57:10 -0500
Received: by mail-qt0-f170.google.com with SMTP id f8so11088333qta.5
	for ; Sat, 18 Nov 2017 16:57:10 -0800 (PST)
Date: Sat, 18 Nov 2017 21:57:05 -0300
From: Ernesto =?utf-8?Q?A=2E_Fern=C3=A1ndez?=
To: Ting-Chang Hou
Cc: linux-fsdevel@vger.kernel.org,
	Ernesto =?utf-8?Q?A=2E_Fern=C3=A1ndez?=
Subject: Re: [PATCH] hfsplus: fix the bug that cannot recognize files with
	hangul file name
Message-ID: <20171119005704.GA3495@debian.home>
References: <1510906805-2142-1-git-send-email-tchou@synology.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1510906805-2142-1-git-send-email-tchou@synology.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Fri, Nov 17, 2017 at 04:20:05PM +0800, Ting-Chang Hou wrote:
> The unicode of hangul from macOS is decomposed. There has a bug that
> mistake decomposed unicode for composed when change unicode to ascii,
> so it cannot recognize the hangul correctly.
>
> Signed-off-by: Ting-Chang Hou
> ---
>  fs/hfsplus/unicode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/hfsplus/unicode.c b/fs/hfsplus/unicode.c
> index dfa90c2..2daf7b0 100644
> --- a/fs/hfsplus/unicode.c
> +++ b/fs/hfsplus/unicode.c
> @@ -135,7 +135,7 @@ int hfsplus_uni2asc(struct super_block *sb,
>  	ustrlen = be16_to_cpu(ustr->length);
>  	len = *len_p;
>  	ce1 = NULL;
> -	compose = !test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);
> +	compose = test_bit(HFSPLUS_SB_NODECOMPOSE, &HFSPLUS_SB(sb)->flags);

I'm not sure this is a mistake. The developers probably wanted the
filenames to be recomposed before being presented in utf8.

With your patch, if you try the following (with the default mount
options):

  touch Á
  ls | hexdump -C

the utf8 output filename will be using the combining accent (CC 81)
instead of the Á character (C3 81). This is a bit annoying because it
won't print correctly in my terminal anymore.

What is it exactly that you are trying to fix? You mention an issue with
hangul characters, but I failed to trigger it. Could you expand on that?

>  	while (ustrlen > 0) {
>  		c0 = be16_to_cpu(*ip++);
> --
> 2.7.4
>
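
For reference, the two byte sequences mentioned above can be checked by
hand. The following is only a minimal userspace sketch, not code from
fs/hfsplus: utf8_encode_bmp() is a hypothetical helper that encodes a
single BMP code point as UTF-8, assuming the input is not a surrogate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of fs/hfsplus/unicode.c.
 * Encode one BMP code point (assumed not to be a surrogate) as UTF-8.
 * Writes 1 to 3 bytes into out and returns the number of bytes written.
 */
size_t utf8_encode_bmp(uint16_t cp, unsigned char out[3])
{
	assert(out != NULL);

	if (cp < 0x80) {
		/* 0xxxxxxx */
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		/* 110xxxxx 10xxxxxx */
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	/* 1110xxxx 10xxxxxx 10xxxxxx */
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
	return 3;
}
```

Encoding the precomposed U+00C1 gives C3 81, while the decomposed form
is U+0041 followed by U+0301, which gives 41 CC 81. That is the
difference that shows up in the hexdump output.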