From: Tejun Heo <tj@kernel.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
Benjamin Thery <benjamin.thery@bull.net>,
Greg KH <greg@kroah.com>,
linux-kernel@vger.kernel.org,
"Serge E. Hallyn" <serue@us.ibm.com>,
Al Viro <viro@ftp.linux.org.uk>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: sysfs: tagged directories not merged completely yet
Date: Tue, 07 Oct 2008 21:19:04 +0900
Message-ID: <48EB53B8.5020309@kernel.org>
In-Reply-To: <m1zllgzki0.fsf@frodo.ebiederm.org>

Eric W. Biederman wrote:
>> If the filler is a real concern, I think it's better to just decouple
>> it rather than making sysfs locking fine-grained. sysfs metadata
>> might as well be protected by a single spinlock if it can be decoupled
>> from vfs locking and stuff. It's just an in-memory tree which isn't
>> used too often.
>
> I think with a little care we can make the sysfs read side rcu
> protected which would remove any real locking from lookup
> and readdir.

IIRC, the original readdir implementation put a cursor entry to walk
through the children list. The implementation was horribly broken in
a number of different ways (ISTR problems with locking and with
multiple, different types of walkers), so I just gutted all the
complexity and made it simple, as getting it correct was far more
important and there seemed to be little need for optimization.

Yeah, using RCU sounds like a plan.
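
For readers who want a picture of a walk without a cursor entry, here is a toy Python model (not kernel code). It keeps the children sorted by inode number and resumes each readdir call from a saved numeric position, so nothing has to be linked into the list between calls. The names `ToyDir`, `add`, `remove` and `readdir`, and the choice of inode number as the sort key, are assumptions made for illustration; the message above does not say how the simplified walk was actually implemented.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ToyDir:
    """Toy directory whose children stay sorted by inode number (assumed key)."""
    children: list = field(default_factory=list)  # list of (ino, name) tuples

    def add(self, ino, name):
        # Insert at the position that keeps the list ordered by inode number.
        inos = [i for i, _ in self.children]
        self.children.insert(bisect_right(inos, ino), (ino, name))

    def remove(self, name):
        # Drop the entry; no cursor node exists that could be left dangling.
        self.children = [c for c in self.children if c[1] != name]

    def readdir(self, pos, count):
        """Return up to `count` names with ino > pos, plus the next position."""
        inos = [i for i, _ in self.children]
        start = bisect_right(inos, pos)
        batch = self.children[start:start + count]
        new_pos = batch[-1][0] if batch else pos
        return [name for _, name in batch], new_pos


d = ToyDir()
for ino, name in [(12, "c"), (10, "a"), (11, "b")]:
    d.add(ino, name)

names, pos = d.readdir(0, 2)    # first batch: ['a', 'b'], pos becomes 11
d.remove("b")                   # directory changes between calls
d.add(13, "d")
names, pos = d.readdir(pos, 2)  # walk resumes after ino 11: ['c', 'd']
print(names, pos)
```

Because the saved position is just a number, entries added or removed between calls cannot corrupt the walk; the cost is a search for the resume point on every call, which matters little for a small in-memory tree.
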
>> Generally, the VFS layer isn't too easy for sysfs, which is a bit like
>> a distributed filesystem but has a stricter here-and-now rule (all
>> changes should be visible instantaneously). At the beginning, sysfs
>> didn't have much metadata itself; it just used the VFS data structures,
>> but that was too large, so sysfs_dirent got introduced and it tried to
>> update VFS data structures as necessary, and (this is when I started
>> working on it) the current code and Eric's patchset evolved from
>> there.
>>
>> Maybe it can be done better by taking a more traditional distributed
>> filesystem approach - re/invalidation on access. I don't know whether
>> it will fit sysfs's needs but if it can be done, sysfs would be able
>> to ride along with other distributed filesystems and become much more
>> conventional in its interfacing with VFS.
>
> The revalidate on access model doesn't appear to have a way to track
> remote renames. Something sysfs supports.

Yeap, IIRC, one of the reasons why sysfs wasn't converted over to
that model was because sysfs guarantees that the inode doesn't change
over a rename or move, so that notifications keep working over renames.
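
The rename guarantee can be pictured with another toy Python model (again, not kernel code). Each node carries its own inode number, and a rename moves the same node object to a new name, so anything keyed on the inode number still finds it, while a lookup by the old path simply fails and gives no hint of where the node went. `ToyNode`, `ToyTree` and their methods are hypothetical names introduced only for this sketch.

```python
import itertools

_next_ino = itertools.count(1)  # toy inode number allocator


class ToyNode:
    def __init__(self, name):
        self.name = name
        self.ino = next(_next_ino)  # assigned once, never changed
        self.children = {}


class ToyTree:
    def __init__(self):
        self.root = ToyNode("")

    def lookup(self, path):
        # Walk the path component by component; None means "not found".
        node = self.root
        for part in path.strip("/").split("/"):
            if not part:
                continue
            node = node.children.get(part)
            if node is None:
                return None
        return node

    def create(self, parent_path, name):
        node = ToyNode(name)
        self.lookup(parent_path).children[name] = node
        return node

    def rename(self, old_path, new_parent_path, new_name):
        # Move the existing node object; its inode number is untouched.
        old_parent_path, _, old_name = old_path.rstrip("/").rpartition("/")
        node = self.lookup(old_parent_path or "/").children.pop(old_name)
        node.name = new_name
        self.lookup(new_parent_path).children[new_name] = node


tree = ToyTree()
tree.create("/", "net")
watched_ino = tree.create("/net", "eth0").ino  # a watcher remembers the inode

tree.rename("/net/eth0", "/net", "wan0")

print(tree.lookup("/net/eth0"))                     # None: old path is gone
print(tree.lookup("/net/wan0").ino == watched_ino)  # True: same inode
```

A cache that only revalidates by path sees the old name disappear and a new one appear; it is the stable inode that ties the two together for the watcher.
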
> I have just spent a little bit of time thinking it through. I had
> previously thought that we could take advantage of the fact that
> sysfs only allows VFS reads and fix our backwards lock ordering
> by optimizing the read side with rcu. Unfortunately the VFS still
> takes locks on rename and similar paths despite the fact that sysfs
> does not implement those paths' functions. Therefore whatever we do
> has to handle all VFS operations even if we don't support them.
> Weird, but true.
>
> We may need to delay dentry unhashing until revalidate. I think I see
> some issues if we don't do that.

Ah... okay. It shouldn't be difficult, right?

Thanks.

--
tejun