From: Miklos Szeredi <miklos@szeredi.hu>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Karel Zak <kzak@redhat.com>, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2] proc/mounts: add cursor
Date: Thu, 9 Apr 2020 21:36:35 +0200
Message-ID: <CAJfpegtZ3T+1bN-pg-vmVvWZs-7chDWxBr0T+j4x_Lt4x0T8MQ@mail.gmail.com>
In-Reply-To: <20200409183008.GG23230@ZenIV.linux.org.uk>
On Thu, Apr 9, 2020 at 8:30 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Apr 09, 2020 at 05:54:46PM +0100, Al Viro wrote:
> > On Thu, Apr 09, 2020 at 05:50:48PM +0100, Al Viro wrote:
> > > On Thu, Apr 09, 2020 at 04:16:19PM +0200, Miklos Szeredi wrote:
> > > > Solve this by adding a cursor entry for each open instance. Taking the
> > > > global namespace_sem for write seems excessive, since we are only dealing
> > > > with a per-namespace list. Instead add a per-namespace spinlock and use
> > > > that together with namespace_sem taken for read to protect against
> > > > concurrent modification of the mount list. This may reduce parallelism of
> > > > is_local_mountpoint(), but it's hardly a big contention point. We could
> > > > also use RCU freeing of cursors to make traversal not need additional
> > > > locks, if that turns out to be necessary.
> > >
> > > Umm... That can do more than reduction of parallelism - longer lists take
> > > longer to scan and moving cursors dirties cachelines in a bunch of struct
> > > mount instances. And I'm not convinced that your locking in m_next() is
> > > correct.
> > >
> > > What's to stop umount_tree() from removing the next entry from the list
> > > just as your m_next() tries to move the cursor? I don't see any common
> > > locks for those two...
> >
> > Ah, you still have namespace_sem taken (shared) by m_start(). Nevermind
> > that one, then... Let me get through mnt_list users and see if I can
> > catch anything.
>
> OK... Locking is safe, but it's not obvious. And your changes do make it
> scarier. There are several kinds of lists that can be threaded through
> ->mnt_list and your code depends upon never having those suckers appear
> in e.g. anon namespace ->list. It is true (AFAICS), but...
See analysis below.
> Another fun question is ns->mounts rules - it used to be "the number of
> entries in ns->list", now it's "the number of non-cursor entries there".
> Incidentally, we might have a problem with that logic wrt count_mount().
Nope, count_mount() iterates through the mount tree, not through mnt_ns->list.
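To illustrate the distinction, here is a minimal userspace sketch, not kernel code. The struct and function names (mnt_sketch, count_tree, count_list, demo_counts) are made up for this example, and the tree/list linkage is heavily simplified compared to struct mount. It only shows why a count that walks the tree is unaffected by cursor entries that exist solely on the flat list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct mount: a node is linked
 * into a tree (first_child/next_sibling) and into a flat list. */
struct mnt_sketch {
    struct mnt_sketch *first_child;
    struct mnt_sketch *next_sibling;
    struct mnt_sketch *next_in_list;   /* flat per-namespace list */
    int is_cursor;                     /* cursors exist only on the list */
};

/* Tree walk: cursors are never reachable, so nothing needs skipping. */
static int count_tree(const struct mnt_sketch *m)
{
    const struct mnt_sketch *c;
    int n = 1;

    for (c = m->first_child; c; c = c->next_sibling)
        n += count_tree(c);
    return n;
}

/* List walk: counts every entry, or only real mounts if skip_cursors. */
static int count_list(const struct mnt_sketch *e, int skip_cursors)
{
    int n = 0;

    for (; e; e = e->next_in_list)
        if (!skip_cursors || !e->is_cursor)
            n++;
    return n;
}

/* Three mounts (root with children a and b) plus two cursors on the list.
 * Returns 0 if the tree count and the cursor-skipping list count agree. */
int demo_counts(void)
{
    struct mnt_sketch root = {0}, a = {0}, b = {0}, c1 = {0}, c2 = {0};

    root.first_child = &a;
    a.next_sibling = &b;
    c1.is_cursor = c2.is_cursor = 1;
    root.next_in_list = &c1;           /* list: root, c1, a, c2, b */
    c1.next_in_list = &a;
    a.next_in_list = &c2;
    c2.next_in_list = &b;

    assert(count_list(&root, 0) == 5); /* raw length includes cursors */
    return (count_tree(&root) == 3 && count_list(&root, 1) == 3) ? 0 : 1;
}
```

In this toy setup only the raw list length changes when cursors are added; the tree walk and the cursor-skipping list walk both report three mounts.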
> Sigh... The damn thing has grown much too convoluted over years ;-/
>
> I'm still not happy with that patch; at the very least it needs a lot more
> detailed analysis to go along with it.
Functions touching mnt_list:

In pnode.c:

umount_one:
umount_list:
propagate_umount: both of the above are indirectly called from this.
  The only caller is umount_tree(), which has lots of different call
  paths, but each one has namespace_sem taken for write:

  do_move_mount
    attach_recursive_mnt
      umount_tree
  do_loopback
    graft_tree
      attach_recursive_mnt
        umount_tree
  do_new_mount_fc
    do_add_mount
      graft_tree
        attach_recursive_mnt
          umount_tree
  finish_automount
    do_add_mount
      graft_tree
        attach_recursive_mnt
          umount_tree
  do_umount
    shrink_submounts
      umount_tree

In namespace.c:

__is_local_mountpoint: takes namespace_sem for read
commit_tree: has namespace_sem for write (only caller being
  attach_recursive_mnt, see above for call paths).
m_start:
m_next:
m_show: all have namespace_sem for read
umount_tree: all callers have namespace_sem for write (see above for call paths)
do_umount: has namespace_sem for write
copy_tree: all members are newly allocated
iterate_mounts: operates on private copy built by collect_mounts()
open_detached_copy: takes namespace_sem for write
copy_mnt_ns: takes namespace_sem for write
mount_subtree: adds onto a newly allocated mnt_namespace
sys_fsmount: ditto
init_mount_tree: ditto
mnt_already_visible: takes namespace_sem for read

The patch adds ns_lock locking to all places that only have namespace_sem
for read. So everyone is still excluded: those taking namespace_sem
for write exclude everyone else, obviously, and those taking
namespace_sem for read exclude each other because they also take ns_lock.
Thanks,
Miklos