linux-ext4.vger.kernel.org archive mirror
From: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
To: tytso@mit.edu
Cc: linux-ext4@vger.kernel.org,
	Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Subject: [PATCH v3 23/23] docs: ext4.rst: Document encoding and case-insensitive
Date: Wed, 17 Oct 2018 16:55:24 -0400
Message-ID: <20181017205524.23360-24-krisman@collabora.co.uk>
In-Reply-To: <20181017205524.23360-1-krisman@collabora.co.uk>

Introduce the encoding-awareness and case-insensitive lookup features of
ext4 to system administrators.  Explain the minimum of design decisions
that matter to sysadmins enabling these features.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
---
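Illustration only, not part of the patch: the lookup semantics that the
documentation text in the diff describes can be sketched in userspace.
This is a minimal sketch under stated assumptions, not the kernel code.
The helper names (decode_or_none, fold, names_match) are hypothetical,
UTF-8 is assumed as the filesystem encoding, and NFD plus Python's
str.casefold() stand in for whatever normalization and casefold the
series actually applies.

```python
import unicodedata


def decode_or_none(name: bytes):
    """Return the name as text if it is valid UTF-8, otherwise None."""
    try:
        return name.decode("utf-8")
    except UnicodeDecodeError:
        return None


def fold(text: str) -> str:
    # Assumption: NFD + Unicode casefold approximates the casefold
    # operation defined by the encoding; the kernel tables may differ.
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


def names_match(a: bytes, b: bytes, casefold: bool) -> bool:
    """Compare two file names the way an encoding-aware lookup might."""
    text_a, text_b = decode_or_none(a), decode_or_none(b)
    if text_a is None or text_b is None:
        # Invalid sequence: fall back to an opaque byte comparison, so
        # neither equivalence nor case-insensitivity applies.
        return a == b
    if casefold:
        return fold(text_a) == fold(text_b)
    # Encoding-aware but case-sensitive: equivalent sequences still match.
    return unicodedata.normalize("NFD", text_a) == unicodedata.normalize("NFD", text_b)
```

With these assumptions, a precomposed and a decomposed spelling of the
same name match in any encoding-aware directory, "README" and "readme"
match only when casefold is requested, and a name containing an invalid
byte such as 0xff matches only its exact byte sequence.
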
 Documentation/admin-guide/ext4.rst | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
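
Illustration only, not part of the patch: the documentation text in the
diff says the case-insensitive feature is enabled by flipping a file
attribute on an empty directory.  A hedged userspace sketch of that step
follows.  Everything numeric here is an assumption: the ioctl request
numbers are the FS_IOC_GETFLAGS/FS_IOC_SETFLAGS values for x86-64 Linux,
and the flag bit is the one mainline Linux exposes as FS_CASEFOLD_FL,
which may not match the value this series uses for EXT4_CASEFOLD_FL.
The helper names are hypothetical.

```python
import fcntl
import os
import struct

# Assumed constants (x86-64 Linux); verify against your kernel headers.
FS_IOC_GETFLAGS = 0x80086601
FS_IOC_SETFLAGS = 0x40086602
CASEFOLD_FL = 0x40000000  # assumption: mainline FS_CASEFOLD_FL bit


def is_empty_dir(path: str) -> bool:
    """True if the directory has no entries besides '.' and '..'."""
    with os.scandir(path) as entries:
        return next(entries, None) is None


def with_casefold(flags: int) -> int:
    """Return the inode flags with the casefold bit set."""
    return flags | CASEFOLD_FL


def enable_casefold(path: str) -> None:
    """Set the casefold attribute on an empty directory."""
    if not is_empty_dir(path):
        raise ValueError("casefold can only be enabled on an empty directory")
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Read the current flags, then write them back with the bit set.
        buf = fcntl.ioctl(fd, FS_IOC_GETFLAGS, struct.pack("L", 0))
        (flags,) = struct.unpack("L", buf)
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, struct.pack("L", with_casefold(flags)))
    finally:
        os.close(fd)
```

Calling enable_casefold() on a directory of a filesystem without
encoding enabled is expected to fail with an OSError from the second
ioctl, since the feature depends on encoding awareness.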

diff --git a/Documentation/admin-guide/ext4.rst b/Documentation/admin-guide/ext4.rst
index e506d3dae510..f42c682acecc 100644
--- a/Documentation/admin-guide/ext4.rst
+++ b/Documentation/admin-guide/ext4.rst
@@ -91,10 +91,39 @@ Currently Available
 * large block (up to pagesize) support
 * efficient new ordered mode in JBD2 and ext4 (avoid using buffer head to force
   the ordering)
+* Encoding-aware file names
+* Case-insensitive file name lookups
 
 [1] Filesystems with a block size of 1k may see a limit imposed by the
 directory hash tree having a maximum depth of two.
 
+Encoding-aware file names and case-insensitive lookups
+======================================================
+
+Ext4 optionally supports filesystem-wide charset knowledge when handling
+file names, which allows the user to perform file system lookups using
+charset-equivalent versions of the same file name and, optionally, to
+ensure that no invalid names are held by the filesystem.  Charset
+encoding awareness is also essential for performing case-insensitive
+lookups, because it is what defines the casefold operation.
+
+The case-insensitive file name lookup feature is supported at a finer
+granularity, on a per-directory basis, allowing the user to mix
+case-insensitive and case-sensitive directories in the same filesystem.
+It is enabled by flipping a file attribute on an empty directory.  For
+the reason stated above, the filesystem must have encoding enabled to
+use this feature.
+
+When we change from treating file names as opaque byte sequences to
+treating them as encoded strings, we need to address what happens when a
+program tries to create a file with an invalid name.  The Native Language
+Support (NLS) subsystem within the kernel leaves the decision of what to
+do in this case to the filesystem, which selects its preferred behavior
+by enabling or disabling the strict mode in NLS.  When Ext4 encounters
+one of those strings, it falls back to treating the entire string as an
+opaque byte sequence, which still allows the user to operate on that
+file, but case-insensitive and equivalent-sequence lookups won't work.
+
 Options
 =======
 
-- 
2.19.1

Thread overview: 24+ messages
2018-10-17 20:55 [PATCH v3 00/23] Ext4 Encoding and Case-insensitive support Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 01/23] nls: Wrap uni2char/char2uni callers Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 02/23] nls: Wrap charset field access Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 03/23] nls: Wrap charset hooks in ops structure Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 04/23] nls: Split default charset from NLS core Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 05/23] nls: Split struct nls_charset from struct nls_table Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 06/23] nls: Add support for multiple versions of an encoding Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 07/23] nls: Implement NLS_STRICT_MODE flag Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 08/23] nls: Let charsets define the behavior of tolower/toupper Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 09/23] nls: Add new interface for string comparisons Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 10/23] nls: Add optional normalization and casefold hooks Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 11/23] nls: ascii: Support validation and normalization operations Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 12/23] nls: utf8n: Add unicode character database files Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 13/23] scripts: add trie generator for UTF-8 Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 14/23] nls: utf8: Move nls-utf8{,-core}.c Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 15/23] nls: utf8: Introduce code for UTF-8 normalization Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 16/23] nls: utf8n: reduce the size of utf8data[] Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 17/23] nls: utf8: Integrate utf8 normalization code with utf8 charset Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 18/23] nls: utf8: Introduce test module for normalized utf8 implementation Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 19/23] ext4: Reserve superblock fields for encoding information Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 20/23] ext4: Include encoding information in the superblock Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 21/23] ext4: Support encoding-aware file name lookups Gabriel Krisman Bertazi
2018-10-17 20:55 ` [PATCH v3 22/23] ext4: Implement EXT4_CASEFOLD_FL flag Gabriel Krisman Bertazi
2018-10-17 20:55 ` Gabriel Krisman Bertazi [this message]
