From: Daniel Lezcano <dlezcano@fr.ibm.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Benjamin Thery <benjamin.thery@bull.net>,
Greg KH <greg@kroah.com>,
linux-kernel@vger.kernel.org,
"Serge E. Hallyn" <serue@us.ibm.com>,
Al Viro <viro@ftp.linux.org.uk>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: sysfs: tagged directories not merged completely yet
Date: Tue, 07 Oct 2008 11:01:32 +0200
Message-ID: <48EB256C.4020003@fr.ibm.com>
In-Reply-To: <20081003101331.GH28946@ZenIV.linux.org.uk>
Al Viro wrote:
> On Tue, Sep 23, 2008 at 11:23:57AM -0700, Eric W. Biederman wrote:
>> Benjamin Thery <benjamin.thery@bull.net> writes:
>>> Oh.
>>> It's a pity Al couldn't re-review them before. We've already lost a lot
>>> of time with this patchset and it's blocking easier testing of network
>>> namespaces (right now, with a mainline kernel, we have to disable sysfs
>>> to build network namespaces).
>> I am confident that we have a good base with these patches and the rest of
>> the work can be done incrementally on top of them if any issues turn up.
>>
>> Al's recent rework of sysctl has a very similar structure.
>
> No, it does not. My apologies for delay, but here are more printable parts
> of review.
>
> First of all, this stuff breaks just about every damn integrity constraint VFS
> knows of. It tries to tiptoe through the resulting minefield - without
> success. So the first group of comments will be of "you *really* don't
> do $FOO" variety. I'm very far from being convinced that we want to
> special-case in VFS every kind of weirdness sysfs happens to do; in effect,
> that would require adding a FS_IS_SYSFS_SO_BEND_OVER fs type flag and making
> a lot of locking conditional on that.
>
> a) You do *not* share struct inode between the superblocks, for fsck sake!
> b) You do *not* grab i_mutex on ancestors after having grabbed it on
> file, as sysfs_chmod_file() does.
> c) You do *not* change dentry tree topology without s_vfs_rename_mutex on
> affected superblock. That, BTW, is broken in mainline sysfs as well.
> d) You REALLY, REALLY do not unhash busy dentries of directories.
>
> In addition to that, there are interesting internal problems:
> * inumbers are released by final sysfs_put(); that can happen before the
> final iput() on corresponding inode. Guess what happens if new entry is
> created in the meanwhile, reuses the same inumber and lookup gets to
> sysfs_get_inode() on it?
> * may I politely suggest that
> again:
>         mutex_lock(&A);
>         if (!mutex_trylock(&B)) {
>                 mutex_unlock(&A);
>                 goto again;
>         }
> is somewhat, er, deficient way to deal with buggered locking hierarchy?
> Not to mention anything else, that's obviously FUBAR on UP box - if we
> have B contended, we've just killed the box dead since we'll never give
> the CPU back to whoever happens to hold B. See sysfs_mv_dir() for a lovely
> example.
> * sysfs_count_nlink() is called from sysfs_fill_super() without sysfs_mutex;
> now this sucker can get called at any moment.
> * just what is protecting ->s_tag?
> * __sysfs_remove_dir() appears to assume that subdirectories are possible;
> AFAICS, if we *do* get them, we get very screwed after remove_dir().
> * everything else aside, the internal locking is extremely heavy. For
> fsck sake, guys, a single system-wide mutex that can be grabbed for the
> duration of readdir on any directory and block just about anything
> in the filesystem? Just mmap() something over NFS on a slow link and
> do getdents() to such buffer. Watch a *lot* of stuff getting buggered.
> Hell, you can't even do ifconfig up while that sucker is held...
>
> Seriously, people, it's getting worse than devfs had ever been ;-/
Thank you Al for reviewing the patchset.
-- Daniel