From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Joel Becker <jlbec@evilplan.org>,
	Chris Mason <chris.mason@oracle.com>,
	David Miller <davem@davemloft.net>
Subject: Re: [RFC] killing boilerplate checks in ->link/->mkdir/->rename
Date: Fri, 3 Feb 2012 01:16:12 +0000
Message-ID: <20120203011612.GS23916@ZenIV.linux.org.uk>
In-Reply-To: <CA+55aFzHSv2eHKenVhxnSFMJMXtJCnxD2xu6QjMiMLEGLCZ2uQ@mail.gmail.com>
On Thu, Feb 02, 2012 at 03:46:06PM -0800, Linus Torvalds wrote:
> On Thu, Feb 2, 2012 at 1:24 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Comments? Boilerplate removal follows (22 files changed, 45 insertions(+),
> > 120 deletions(-)), but it's *not* for immediate merge; it's really completely
> > untested.
>
> Looks ok to me. Historically, the more things we can check at the VFS
> layer, the better.
After looking a bit more: nlink_t is a f*cking mess. Almost any code
using that type kernel-side is broken. Crap galore:
* sometimes it's 32 bits, sometimes 16, sometimes 64. Essentially
at random.
* almost all have it unsigned, except for sparc32, where it's
signed short [inherited from v7 via SunOS? BTW, in v6 it used to be even
funnier - char, which is where ridiculous LINK_MAX == 127 comes from]
IOW, nlink_t is an attractive nuisance - it's nearly impossible to use in
a portable way and we are lucky that almost nobody tries to. Exceptions:
ocfs2_rename() does
nlink_t old_dir_nlink = old_dir->i_nlink;
...
followed later by comparison with old_dir->i_nlink. And no, it's not to
handle truncation - it's "what if i_nlink changed while ocfs2_rename()
had been grabbing the cluster lock" kind of thing. OCFS2 can have up
to 2^32 links to a file, so truncation is really possible... AFAICS,
that one is a genuine bug - this nlink_t should be u32...
Another one is proc_dir_entry ->nlink and it would cause Bad Things(tm)
on an architecture with 16-bit nlink_t if we could end up with 65534
subdirectories in some procfs dir. Might be possible, might be not -
doing that under /proc/sys is definitely possible, but that won't be
enough; it needs to be a proc_dir_entry-backed directory. Again, the
solution is to use explicit u32 anyway.
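To make the truncation concern concrete, here is a minimal userspace sketch of the "save the link count, compare it later" pattern described for ocfs2_rename(). It is not kernel code: narrow_nlink_t, saved_matches_narrow() and saved_matches_u32() are hypothetical names, and the 16-bit typedef merely stands in for an architecture where nlink_t is that narrow.

```c
#include <stdint.h>

/* Hypothetical stand-in for an architecture with a 16-bit nlink_t. */
typedef uint16_t narrow_nlink_t;

/*
 * Save the link count in the narrow type, then compare the saved copy
 * with the live 32-bit counter. Returns 1 if they still match, 0 if not.
 * With more than 65535 links the saved copy is truncated, so the
 * comparison fails even though nothing changed in between.
 */
static int saved_matches_narrow(uint32_t live_nlink)
{
	narrow_nlink_t saved = (narrow_nlink_t)live_nlink;	/* may truncate */

	return saved == live_nlink;
}

/* Same pattern with an explicit 32-bit copy: nothing is lost. */
static int saved_matches_u32(uint32_t live_nlink)
{
	uint32_t saved = live_nlink;

	return saved == live_nlink;
}
```

Under these assumptions saved_matches_narrow(70000) reports a mismatch for an unchanged counter, while saved_matches_u32(70000) does not, which is the argument for an explicit u32.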
* compat_nlink_t is even funnier - it's signed in *two* cases; sparc
and ppc. No, nlink_t on ppc32 is unsigned. Not that anyone cared, really,
since the _only_ use of that type is in struct compat_stat. For exactly
one field. Only used as left-hand side of assignment, which is actually
broken since unlike cp_new_stat(), cp_compat_stat() does *not* check if the
value fits into st_nlink. Bug, needs to be fixed. Incidentally, just what
should we do on sparc32 if we run into a file with 4G-10 links? -EOVERFLOW
or silently put 65536-10 in st_nlink and be done with that? Note that
filesystems allowing that many links *do* exist...
* when does jfs dtInsert() return -EMLINK? Can it ever get triggered?
* WTF is XFS doing with these checks? Note that we have them
done _twice_ on all paths - explicitly from xfs_create(), xfs_link(),
xfs_rename() and then from xfs_bumplink() called by exactly the same
set of functions.
* what's up with btrfs_insert_inode_ref()? I've tried to trace
the codepaths around there, but... Incidentally, when could fixup_low_keys()
return non-zero? I don't see any candidates for that in there... Chris?
* ubifs, hfsplus, jffs2 - definitely broken if you create enough
links. i_nlink wraps around to zero, which confuses the inode eviction logic.
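The cp_compat_stat() point above boils down to "assign into the narrow field, then check that the value survived". A minimal userspace sketch of that check follows, taking the -EOVERFLOW option rather than silent truncation. The names demo_compat_stat and demo_fill_nlink() are hypothetical, and an unsigned 16-bit field is assumed for simplicity (the signed sparc case would need the same kind of check).

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a compat stat structure with a 16-bit link count. */
struct demo_compat_stat {
	uint16_t st_nlink;
};

/*
 * Copy a 32-bit link count into the narrow field and verify that it fits.
 * Returns 0 on success, -EOVERFLOW if the value was truncated.
 */
static int demo_fill_nlink(struct demo_compat_stat *out, uint32_t nlink)
{
	out->st_nlink = (uint16_t)nlink;	/* keeps only the low 16 bits */
	if (out->st_nlink != nlink)
		return -EOVERFLOW;		/* value did not fit */
	return 0;
}
```

For a file with 4G-10 links (4294967286) the field would hold 65536-10 = 65526, and the sketch reports -EOVERFLOW instead of handing that number to userland.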