From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Nikolay Borisov <kernel@kyup.com>,
	viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	containers@lists.linux-foundation.org,
	serge.hallyn@canonical.com
Subject: Re: [RFC PATCH] locks: Show only file_locks created in the same pidns as current process
Date: Tue, 2 Aug 2016 16:34:06 -0400
Message-ID: <20160802203406.GE15324@fieldses.org>
In-Reply-To: <1470168082.15226.14.camel@poochiereds.net>

On Tue, Aug 02, 2016 at 04:01:22PM -0400, Jeff Layton wrote:
> On Tue, 2016-08-02 at 15:44 -0400, J. Bruce Fields wrote:
> > On Tue, Aug 02, 2016 at 02:09:22PM -0500, Eric W. Biederman wrote:
> > > "J. Bruce Fields" <bfields@fieldses.org> writes:
> > > 
> > > > On Tue, Aug 02, 2016 at 11:00:39AM -0500, Eric W. Biederman wrote:
> > > > > Nikolay Borisov <kernel@kyup.com> writes:
> > > > > 
> > > > > > Currently when /proc/locks is read it will show all the file locks
> > > > > > which are currently created on the machine. On containers, hosted
> > > > > > on busy servers this means that doing lsof can be very slow. I
> > > > > > observed up to 5 seconds stalls reading 50k locks, while the container
> > > > > > itself had only a small number of relevant entries. Fix it by
> > > > > > filtering the locks listed by the pidns of the current process
> > > > > > and the process which created the lock.
> > > > > 
> > > > > The locks always confuse me so I am not 100% sure connecting locks
> > > > > to a pid namespace is appropriate.
> > > > > 
> > > > > That said if you are going to filter by pid namespace please use the pid
> > > > > namespace of proc, not the pid namespace of the process reading the
> > > > > file.
> > > > 
> > > > Oh, that makes sense, thanks.
> > > > 
> > > > What does /proc/mounts use, out of curiosity?  The mount namespace that
> > > > /proc was originally mounted in?
> > > 
> > > /proc/mounts -> /proc/self/mounts
> > 
> > D'oh, I knew that.
> > 
> > > 
> > > /proc/[pid]/mounts lists mounts from the mount namespace of the
> > > appropriate process.
> > > 
> > > That is another way to go, but it is a tread-carefully thing: changing
> > > things that way, it is easy to surprise apparmor or selinux rules and be
> > > surprised you broke someone's userspace in a way that prevents booting.
> > > Although I suspect /proc/locks isn't too bad.
> > 
> > OK, thanks.
> > 
> > /proc/[pid]/locks might be confusing.  I'd expect it to be "all the
> > locks owned by this task", rather than "all the locks owned by pids in
> > the same pid namespace", or whatever criterion we choose.
> > 
> > Uh, I'm still trying to think of the Obviously Right solution here, and
> > it's not coming.
> > 
> > --b.
> 
> 
> I'm a little leery of changing how this works. It has always been
> maintained as a legacy interface, so do we run the risk of breaking
> something if we turn it into a per-namespace thing?

The namespace work is all about making interfaces per-namespace.  I
guess it works as long as it contributes to the illusion that each
container is its own machine.

Thinking about it, I might be sold on the per-pidns approach (with
Eric's modification to use the pidns of /proc not the reader).

My complaint about not being able to see conflicting locks would apply
just as well to conflicts from nfs locks held by other clients.  A disk
filesystem shared across multiple containers is a little like an nfs
filesystem shared between nfs clients.

That'd solve this immediate problem without requiring an lsof upgrade as
well.

> This also doesn't
> solve the problem of slow traversal in the init_pid_ns -- only in a
> container.
> 
> I also can't help but feel that /proc/locks is just showing its age. It
> was fine in the late 90's, but its limitations are just becoming more
> apparent as things get more complex. It was never designed for
> performance as you end up thrashing several spinlocks when reading it.
> 
> Maybe it's time to think about presenting this info in another way? A
> global view of all locks on the system is interesting but maybe it
> would be better to present it more granularly somehow?

But, yes, that might be a good idea.

--b.

> 
> I guess I should go look at what lsof actually does with this info...
> 
> -- 
> Jeff Layton <jlayton@poochiereds.net>
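
For illustration only, the effect of per-pidns filtering can be approximated
in userspace. The sketch below is not the patch under discussion, which would
filter inside the kernel. It assumes the usual /proc/locks line layout (id,
optional "->" marker for a blocked waiter, lock type, mode, access, pid,
device:inode, start, end). The names LockEntry, parse_lock_line, filter_locks
and the visible_pids set are hypothetical, and the sample text stands in for
a real read of /proc/locks.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class LockEntry(NamedTuple):
    lock_id: str
    blocked: bool      # True when the line carries the "->" waiter marker
    kind: str          # e.g. POSIX, FLOCK
    mode: str          # e.g. ADVISORY
    access: str        # e.g. READ, WRITE
    pid: int
    device_inode: str  # "major:minor:inode"
    start: str
    end: str           # byte offset or "EOF"


def parse_lock_line(line: str) -> Optional[LockEntry]:
    """Parse one /proc/locks-style line; return None if it does not fit."""
    fields = line.split()
    if len(fields) < 8:
        return None
    lock_id = fields[0].rstrip(":")
    rest = fields[1:]
    blocked = rest[0] == "->"
    if blocked:
        rest = rest[1:]
    if len(rest) != 7:
        return None
    kind, mode, access, pid_text, device_inode, start, end = rest
    try:
        pid = int(pid_text)
    except ValueError:
        return None
    return LockEntry(lock_id, blocked, kind, mode, access, pid,
                     device_inode, start, end)


def filter_locks(lines: Iterable[str], visible_pids: Set[int]) -> List[LockEntry]:
    """Keep only entries whose owner pid is in visible_pids (an assumption:
    the caller decides which pids count as belonging to the container)."""
    entries = []
    for line in lines:
        entry = parse_lock_line(line)
        if entry is not None and entry.pid in visible_pids:
            entries.append(entry)
    return entries


# Made-up sample data in the assumed format, not output from a real system.
SAMPLE = """\
1: POSIX  ADVISORY  WRITE 3568 fd:00:2531452 0 EOF
1: -> POSIX  ADVISORY  WRITE 4000 fd:00:2531452 0 EOF
2: FLOCK  ADVISORY  WRITE 3517 fd:00:2531448 0 EOF
"""

if __name__ == "__main__":
    for entry in filter_locks(SAMPLE.splitlines(), {3568}):
        print(entry)
```

Filtering like this in userspace still pays the full cost of reading every
lock, which is the slowness reported at the start of the thread; only a
kernel-side filter avoids generating the unwanted lines in the first place.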

Thread overview: 22+ messages
2016-08-02 14:42 [RFC PATCH] locks: Show only file_locks created in the same pidns as current process Nikolay Borisov
2016-08-02 14:45 ` Nikolay Borisov
2016-08-02 15:05 ` J. Bruce Fields
2016-08-02 15:20   ` Nikolay Borisov
2016-08-02 15:43     ` J. Bruce Fields
2016-08-02 16:00 ` Eric W. Biederman
2016-08-02 17:40   ` J. Bruce Fields
2016-08-02 19:09     ` Eric W. Biederman
2016-08-02 19:44       ` J. Bruce Fields
2016-08-02 20:01         ` Jeff Layton
2016-08-02 20:11           ` Nikolay Borisov
2016-08-02 20:34           ` J. Bruce Fields [this message]
2016-08-03  7:35 ` [PATCH v2] locks: Filter /proc/locks output on proc pid ns Nikolay Borisov
2016-08-03 13:46   ` Jeff Layton
2016-08-03 14:17     ` Nikolay Borisov
2016-08-03 14:28       ` J. Bruce Fields
2016-08-03 14:33         ` Nikolay Borisov
2016-08-03 14:54       ` Pavel Emelyanov
2016-08-03 15:00         ` Nikolay Borisov
2016-08-03 15:06           ` J. Bruce Fields
2016-08-03 15:10             ` Nikolay Borisov
2016-08-03 17:35               ` Eric W. Biederman