From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: kernel@collabora.com, linux-ext4@vger.kernel.org,
Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Subject: Re: [PATCH v3 08/12] ext2fs: nls: Support UTF-8 11.0 with NFKD normalization
Date: Fri, 30 Nov 2018 11:12:51 -0500
Message-ID: <20181130161251.GA3512@thunk.org>
In-Reply-To: <20181126221949.12172-9-krisman@collabora.com>
On Mon, Nov 26, 2018 at 05:19:45PM -0500, Gabriel Krisman Bertazi wrote:
> +static int utf8_casefold(const struct nls_table *table,
> + const unsigned char *str, size_t len,
> + unsigned char *dest, size_t dlen)
> +{
> + const struct utf8data *data = utf8nfkdicf(UNICODE_AGE(10,0,0));
> + struct utf8cursor cur;
> + size_t nlen = 0;
> +
> + if (utf8ncursor(&cur, data, str, len) < 0)
> + goto invalid_seq;
> +
> + for (nlen = 0; nlen < dlen; nlen++) {
> + dest[nlen] = utf8byte(&cur);
> + if (!dest[nlen])
> + return nlen;
> + if (dest[nlen] == -1)
> + break;
> + }
> +invalid_seq:
> + /* Treat the sequence as a binary blob. */
> + memcpy(dest, str, len);
> + return len;
> +
> +}
So it looks like the interface is that if the destination buffer is too
small OR the string is not a valid UTF-8 string, we treat it as a
binary blob. I wonder if we would be better off if this function
actually signalled that there is a problem (buffer too small,
invalid UTF-8 string)?
It's fine to treat it as a binary blob, and copy it out to the
destination buffer, but I can imagine use cases where knowing this
would be useful. *Especially* the destination-buffer-too-small case;
I'm actually a little nervous about having it silently ignore that
error condition and just copy the binary blob.
Also, there *really* needs to be a check before dlen is assumed to be
>= len in the memcpy after the invalid_seq label.
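A rough, untested-in-tree sketch of what such an interface could look
like follows. Everything in it is an assumption for illustration:
fold_next() is a made-up ASCII-only stand-in for utf8byte(), the names
casefold_checked() and casefold_or_blob() are hypothetical, and the
error codes (-EINVAL, -ERANGE) are only one possible choice. The byte
is also held in an int before being stored, since comparing an
unsigned char against -1 can never be true.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the utf8cursor/utf8byte() pair in the patch. */
struct fold_cursor {
	const unsigned char *s;
	size_t len;
	size_t pos;
};

/* Returns the next folded byte, 0 at end of input, -1 on an invalid byte.
 * Assumption: ASCII only; any byte >= 0x80 is reported as invalid. */
static int fold_next(struct fold_cursor *cur)
{
	unsigned char c;

	if (cur->pos >= cur->len)
		return 0;
	c = cur->s[cur->pos++];
	if (c >= 0x80)
		return -1;
	if (c >= 'A' && c <= 'Z')
		c += 'a' - 'A';
	return c;
}

/* Returns the folded length, -EINVAL for an invalid sequence, or
 * -ERANGE if dest is too small.  Never writes more than dlen bytes. */
static int casefold_checked(const unsigned char *str, size_t len,
			    unsigned char *dest, size_t dlen)
{
	struct fold_cursor cur = { str, len, 0 };
	size_t nlen;
	int c;

	for (nlen = 0; nlen < dlen; nlen++) {
		c = fold_next(&cur);	/* keep as int so -1 is detectable */
		if (c < 0)
			return -EINVAL;
		if (c == 0)
			return (int)nlen;
		dest[nlen] = (unsigned char)c;
	}
	/* dest is full; that is only OK if the input is exhausted too. */
	c = fold_next(&cur);
	if (c < 0)
		return -EINVAL;
	return c == 0 ? (int)nlen : -ERANGE;
}

/* Caller-side fallback: copy the raw bytes only for invalid input,
 * and only after checking that they fit in dest. */
static int casefold_or_blob(const unsigned char *str, size_t len,
			    unsigned char *dest, size_t dlen)
{
	int ret = casefold_checked(str, len, dest, dlen);

	if (ret == -EINVAL && len <= dlen) {
		memcpy(dest, str, len);
		return (int)len;
	}
	return ret;
}
```

With this split, a caller that wants the binary-blob behaviour uses
casefold_or_blob(), while one that needs to know about the failure
calls casefold_checked() directly; in both cases a too-small buffer is
reported rather than silently overrun.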
- Ted