From: Florian Weimer <fweimer@redhat.com>
To: "Thaddeus H. Black" <thb@debian.org>
Cc: linux-man@vger.kernel.org,
	Alejandro Colomar <alx.manpages@gmail.com>,
	"G. Branden Robinson" <g.branden.robinson@gmail.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [PATCH v3] filename.7: new manual page
Date: Tue, 19 Oct 2021 10:54:11 +0200
Message-ID: <87fssxgzt8.fsf@oldenburg.str.redhat.com>
In-Reply-To: <YW2hzL5vDfVZIAXY@b-tk.org> (Thaddeus H. Black's message of "Mon, 18 Oct 2021 16:33:16 +0000")

* Thaddeus H. Black:

> +.TH FILENAME 7 2021-10-18 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +filename \- requirements and conventions for the naming of files
> +.SH DESCRIPTION
> +This manual page sets forth requirements for
> +and delineates conventions regarding filenames
> +on a Linux system,
> +where a
> +.I filename
> +is either (as the word suggests) the name of a regular file
> +or the name of another object held by the system's filesystem
> +such as a directory, symbolic link, named pipe or device.

Maybe add: “A pathname contains zero or more filenames.”

> +.SS Legal filenames
> +A filename on a Linux system can consist
> +of almost any sequence of UTF-8 characters
> +or, indeed, almost any sequence of bytes.
> +The exceptions are as follows.
> +.TP
> +.B Reserved characters
> +.RS
> +The following characters are reserved.
> +.TP
> +.B /
> +The solidus is reserved to separate pathname components
> +as for example in
> +.IR /usr/share/doc ,
> +each component being itself a filename.
> +For this reason, no filename may include a solidus.
> +More precisely,
> +no filename may include the byte that,
> +in ASCII and UTF-8,
> +exclusively represents the solidus.

What does this mean?  I think only byte 0x2f is reserved.  The UTF-8
comment is misleading.  A historic/overlong encoding of / in multiple
UTF-8 bytes is *not* reserved.
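
To illustrate the point, here is a minimal sketch in Python (not kernel
code) that checks a name purely at the byte level.  The helper name
reserved_bytes_in and the sample names are hypothetical, chosen only for
this illustration.  It assumes, as stated above, that bytes 0x2f and 0x00
are the only ones that matter.

```python
SOLIDUS = 0x2F  # byte value of "/" in ASCII and UTF-8
NUL = 0x00      # byte value of the null character

def reserved_bytes_in(name: bytes) -> list:
    """Return the reserved byte values (solidus first, then NUL) found in name."""
    return [b for b in (SOLIDUS, NUL) if b in name]

# Hypothetical sample names, for illustration only.
overlong_solidus = b"\xc0\xaf"          # overlong (invalid) two-byte UTF-8 form of U+002F
plain_solidus = "a/b".encode("utf-8")   # contains the real 0x2f byte

print(reserved_bytes_in(overlong_solidus))  # []
print(reserved_bytes_in(plain_solidus))     # [47]
```

The overlong sequence 0xc0 0xaf contains no 0x2f byte, so a byte-level
check does not treat it as a separator.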

> +.B \e0
> +The null character is reserved for the filesystem to append
> +to terminate a filename's representation in memory.
> +For this reason, no filename may include a null character.
> +More precisely,
> +no filename may include the byte that,
> +in ASCII and UTF-8,
> +exclusively represents the null character.

See above; the same applies here, to byte 0x00.

> +.B Reserved names
> +.RS
> +The following names are reserved.
> +.TP
> +.B .
> +The filename consisting of a single full stop
> +is reserved to represent the current directory.
> +.TP
> +.B ..
> +The filename consisting of two full stops
> +is reserved to represent the parent directory.
> +.TP
> +(empty)
> +The empty filename,
> +consisting of no bytes at all
> +(except a terminating null byte),
> +is not allowed.

I think this conflicts with the presentation of / as a separator in
pathnames: if every / separates two filenames, then the pathname "/usr/"
contains two empty filenames, one before the first / and one after the
last.
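
A small sketch of that reading, in Python, which treats / strictly as a
separator.  The helper name components is hypothetical, and this is not
how the kernel resolves pathnames; it only shows what a literal
"separator" definition implies.

```python
def components(pathname: bytes) -> list:
    """Split a pathname on every 0x2f byte, keeping empty components."""
    return pathname.split(b"/")

parts = components(b"/usr/")
print(parts)                 # [b'', b'usr', b'']
print(parts.count(b""))      # 2

print(components(b"/usr/share/doc"))  # [b'', b'usr', b'share', b'doc']
```

Under this reading, a leading or trailing solidus always yields an empty
component, so the text needs to say how such components are treated.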

> +.TP
> +.B Long names
> +.RS
> +No filename may exceed\~255 bytes in length,
> +or\~256 bytes after counting the terminating null byte.

This is not correct for Linux.  Despite the definition of NAME_MAX,
filenames can be longer than 255 bytes.  NTFS and CIFS have a limit of
255 UTF-16 code units, which translates to as many as 765 bytes (three
bytes per code unit) in the UTF-8 encoding used by Linux.
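
As a rough check of the arithmetic, here is a Python sketch that counts
UTF-16 code units and UTF-8 bytes for the same name.  The helper names
and the sample name are hypothetical.  The sketch only measures
encodings and does not query any filesystem for its actual limit.

```python
def utf16_units(name: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding of name."""
    return len(name.encode("utf-16-le")) // 2

def utf8_bytes(name: str) -> int:
    """Number of bytes in the UTF-8 encoding of name."""
    return len(name.encode("utf-8"))

# Hypothetical name: 255 euro signs (U+20AC), each one UTF-16 code unit
# but three UTF-8 bytes.
name = "\u20ac" * 255

print(utf16_units(name))  # 255
print(utf8_bytes(name))   # 765
```

A name of 255 ASCII characters, by contrast, is 255 code units and 255
bytes, so the byte length depends on the characters used.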

Thanks,
Florian

