From: "Pali Rohár" <pali@kernel.org>
To: linux-fsdevel@vger.kernel.org,
	linux-ntfs-dev@lists.sourceforge.net, linux-cifs@vger.kernel.org,
	jfs-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	Anton Altaparmakov <anton@tuxera.com>,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	Luis de Bethencourt <luisbg@kernel.org>,
	Salah Triki <salah.triki@gmail.com>,
	Steve French <sfrench@samba.org>, Paulo Alcantara <pc@cjr.nz>,
	Ronnie Sahlberg <lsahlber@redhat.com>,
	Shyam Prasad N <sprasad@microsoft.com>,
	Tom Talpey <tom@talpey.com>, Dave Kleikamp <shaggy@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Pavel Machek <pavel@ucw.cz>,
	Christoph Hellwig <hch@infradead.org>,
	Kari Argillander <kari.argillander@gmail.com>,
	Viacheslav Dubeyko <slava@dubeyko.com>
Subject: [RFC PATCH v2 09/18] hfs: Explicitly set hsb->nls_disk when hsb->nls_io is set
Date: Mon, 26 Dec 2022 15:21:41 +0100
Message-ID: <20221226142150.13324-10-pali@kernel.org>
In-Reply-To: <20221226142150.13324-1-pali@kernel.org>

It does not make any sense to set hsb->nls_io (NLS iocharset used between
VFS and hfs driver) when hsb->nls_disk (NLS codepage used between hfs
driver and disk) is not set.

Reverse engineering the driver code showed what it does in this special case:

    When codepage was not defined but iocharset was, the
    hfs driver copied each 8bit character from disk directly
    into the 16bit unicode wchar_t type. This amounts to a
    conversion from Latin1 (ISO-8859-1) to Unicode, because the
    first 256 Unicode code points match the 8bit ISO-8859-1
    codepage table. So when iocharset was specified and codepage
    was not, codepage implicitly had the value "iso8859-1".
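
The equivalence described above can be illustrated with a small user-space
C sketch. It is not the driver code: widen_latin1() and narrow_to_latin1()
are hypothetical names, the bytes are assumed to be read as unsigned values,
and uint16_t stands in for the 16bit wchar_t type. The narrowing helper
mirrors the fallback to '?' for characters above 0xff that the removed code
path used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy each 8bit character into a 16bit unit.
 * Because ISO-8859-1 byte values equal Unicode code points U+0000..U+00FF,
 * this plain copy is a Latin1 -> Unicode conversion.
 * Assumption: bytes are treated as unsigned (no sign extension).
 */
static size_t widen_latin1(const unsigned char *src, size_t srclen,
			   uint16_t *dst)
{
	size_t i;

	for (i = 0; i < srclen; i++)
		dst[i] = src[i];
	return i;
}

/*
 * Hypothetical helper for the reverse direction: code points that do not
 * fit into one Latin1 byte are replaced by '?'.
 */
static unsigned char narrow_to_latin1(uint16_t ch)
{
	return ch > 0xff ? '?' : (unsigned char)ch;
}
```

For example, the byte 0xE9 widens to U+00E9 (LATIN SMALL LETTER E WITH ACUTE),
while U+20AC (EURO SIGN) has no Latin1 byte and narrows to '?'.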

So when hsb->nls_disk is not set and hsb->nls_io is, explicitly set
hsb->nls_disk to "iso8859-1".

Such a setup is obviously incompatible with Mac OS systems, as they do not
support the iso8859-1 encoding for hfs. So print a warning to dmesg about
this fact.
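
The resulting option handling can be sketched in user-space C as follows.
This is only a model under stated assumptions: struct charset_opts and
resolve_charsets() are hypothetical, encodings are represented by their
names instead of loaded NLS tables, and the default iocharset is passed in
by the caller rather than obtained from load_nls_default().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two mount options; NULL means "not given". */
struct charset_opts {
	const char *codepage;	/* encoding between driver and disk */
	const char *iocharset;	/* encoding between VFS and driver */
};

/*
 * Fill in the missing option. Returns 1 when the iso8859-1 fallback was
 * applied (the case in which the patch prints a warning), 0 otherwise.
 */
static int resolve_charsets(struct charset_opts *o, const char *default_io)
{
	int warn = 0;

	if (o->iocharset && !o->codepage) {
		/* Keep the old implicit behavior, but make it explicit. */
		o->codepage = "iso8859-1";
		warn = 1;
	}
	if (o->codepage && !o->iocharset)
		o->iocharset = default_io;
	return warn;
}
```

In this model, either both options end up set or both stay unset, which is
what allows the conversion code to drop its "no codepage" branch.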

After this change hsb->nls_disk is always set whenever hsb->nls_io is, so
remove the code paths for the case when hsb->nls_disk was not set, as they
are no longer needed.

Signed-off-by: Pali Rohár <pali@kernel.org>
---
 fs/hfs/super.c | 31 +++++++++++++++++++++++++++++++
 fs/hfs/trans.c | 38 ++++++++++++++------------------------
 2 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/fs/hfs/super.c b/fs/hfs/super.c
index 6764afa98a6f..cea19ed06bce 100644
--- a/fs/hfs/super.c
+++ b/fs/hfs/super.c
@@ -351,6 +351,37 @@ static int parse_options(char *options, struct hfs_sb_info *hsb)
 		}
 	}
 
+	if (hsb->nls_io && !hsb->nls_disk) {
+		/*
+		 * Previous version of hfs driver did something unexpected:
+		 * When codepage was not defined but iocharset was then
+		 * hfs driver copied 8bit character from disk directly to
+		 * 16bit unicode wchar_t type. Which means it did conversion
+		 * from Latin1 (ISO-8859-1) to Unicode because first 256
+		 * Unicode code points match 8bit ISO-8859-1 codepage table.
+		 * So when iocharset was specified and codepage not, then
+		 * codepage used implicit value "iso8859-1".
+		 *
+		 * To not change this previous default behavior as some users
+		 * may depend on it, we load iso8859-1 NLS table explicitly
+		 * to simplify code and make it more readable what happens.
+		 *
+		 * In context of hfs driver it is really strange to use
+		 * ISO-8859-1 codepage table for storing data to disk, but
+		 * nothing forbids it. Just it is highly incompatible with
+		 * Mac OS systems. So via pr_warn() inform the user that this
+		 * is probably not what they want.
+		 */
+		pr_warn("iocharset was specified but codepage not, "
+			"using default codepage=iso8859-1\n");
+		pr_warn("this default codepage=iso8859-1 is incompatible with "
+			"Mac OS systems and may be changed in the future\n");
+		hsb->nls_disk = load_nls("iso8859-1");
+		if (!hsb->nls_disk) {
+			pr_err("unable to load iso8859-1 codepage\n");
+			return 0;
+		}
+	}
 	if (hsb->nls_disk && !hsb->nls_io) {
 		hsb->nls_io = load_nls_default();
 		if (!hsb->nls_io) {
diff --git a/fs/hfs/trans.c b/fs/hfs/trans.c
index fdb0edb8a607..dbf535d52d37 100644
--- a/fs/hfs/trans.c
+++ b/fs/hfs/trans.c
@@ -48,18 +48,13 @@ int hfs_mac2asc(struct super_block *sb, char *out, const struct hfs_name *in)
 		wchar_t ch;
 
 		while (srclen > 0) {
-			if (nls_disk) {
-				size = nls_disk->char2uni(src, srclen, &ch);
-				if (size <= 0) {
-					ch = '?';
-					size = 1;
-				}
-				src += size;
-				srclen -= size;
-			} else {
-				ch = *src++;
-				srclen--;
+			size = nls_disk->char2uni(src, srclen, &ch);
+			if (size <= 0) {
+				ch = '?';
+				size = 1;
 			}
+			src += size;
+			srclen -= size;
 			if (ch == '/')
 				ch = ':';
 			size = nls_io->uni2char(ch, dst, dstlen);
@@ -119,20 +114,15 @@ void hfs_asc2mac(struct super_block *sb, struct hfs_name *out, const struct qstr
 			srclen -= size;
 			if (ch == ':')
 				ch = '/';
-			if (nls_disk) {
-				size = nls_disk->uni2char(ch, dst, dstlen);
-				if (size < 0) {
-					if (size == -ENAMETOOLONG)
-						goto out;
-					*dst = '?';
-					size = 1;
-				}
-				dst += size;
-				dstlen -= size;
-			} else {
-				*dst++ = ch > 0xff ? '?' : ch;
-				dstlen--;
+			size = nls_disk->uni2char(ch, dst, dstlen);
+			if (size < 0) {
+				if (size == -ENAMETOOLONG)
+					goto out;
+				*dst = '?';
+				size = 1;
 			}
+			dst += size;
+			dstlen -= size;
 		}
 	} else {
 		char ch;
-- 
2.20.1

