From: Jeff Layton <jlayton@redhat.com>
To: Benjamin Coddington <bcodding@redhat.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	kernel test robot <xiaolong.ye@intel.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
	lkp@01.org, Christoph Hellwig <hch@infradead.org>
Subject: Re: [lkp-robot] [fs/locks]  9d21d181d0: will-it-scale.per_process_ops -14.1% regression
Date: Tue, 06 Jun 2017 09:15:24 -0400
Message-ID: <1496754924.2807.5.camel@redhat.com>
In-Reply-To: <3924EE88-DC6E-4D95-9A84-50032930A65C@redhat.com>

On Tue, 2017-06-06 at 09:00 -0400, Benjamin Coddington wrote:
> 
> On 5 Jun 2017, at 18:02, Jeff Layton wrote:
> 
> > On Mon, 2017-06-05 at 14:34 -0400, Benjamin Coddington wrote:
> > > On 1 Jun 2017, at 11:48, Jeff Layton wrote:
> > > 
> > > > On Thu, 2017-06-01 at 11:14 -0400, J. Bruce Fields wrote:
> > > > > On Thu, Jun 01, 2017 at 08:59:21AM -0400, Jeff Layton wrote:
> > > > > > > I'm not so sure. That would only be the case if the thing were
> > > > > > > marked for mandatory locking (a really rare thing).
> > > > > > > 
> > > > > > > The test is really simple and I don't think any read/write
> > > > > > > activity is involved:
> > > > > > 
> > > > > >     https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
> > > > > 
> > > > > > So it's just F_WRLCK/F_UNLCK in a loop spread across multiple
> > > > > > cores? I'd think real workloads do some work while holding the
> > > > > > lock, and a 15% regression on just the pure lock/unlock loop
> > > > > > might not matter?  But best to be careful, I guess.
> > > > > 
> > > > > --b.
> > > > > 
> > > > 
> > > > Yeah, that's my take.
> > > > 
> > > > > I was assuming that getting a pid reference would be essentially
> > > > > free, but it doesn't seem to be.
> > > > > 
> > > > > So, I think we probably want to avoid taking it for a file_lock
> > > > > that we use to request a lock, but do take it for a file_lock that
> > > > > is used to record a lock. How best to code that up, I'm not quite
> > > > > sure...
> > > 
> > > > Maybe as simple as only setting fl_nspid in locks_insert_lock_ctx(),
> > > > but that seems to just take us back to the problem of getting the pid
> > > > wrong if the lock is inserted later by a different worker than the
> > > > one that created the request.
> > > > 
> > > > I have a mind now to just drop fl_nspid off the struct file_lock
> > > > completely, and instead just carry fl_pid, and when we do F_GETLK, we
> > > > can do:
> > > 
> > > > task = find_task_by_pid_ns(fl_pid, &init_pid_ns);
> > > > fl_nspid = task_pid_nr_ns(task, task_active_pid_ns(current));
> > > 
> > > > That moves all the work off into the F_GETLK case, which I think is
> > > > not used so much.
> > > 
> > 
> > Actually I think what might work best is to:
> > 
> > - have locks_copy_conflock also copy the fl_nspid and take a reference
> > to it (as your patch #2 does)
> > 
> > - only set fl_nspid and take a reference there in
> > locks_insert_lock_ctx if it's not already set
> > 
> > - allow ->lock operations (like nfs) to set fl_nspid before they call
> > locks_lock_inode_wait to set the local lock. Might need to take a
> > nspid reference before dispatching an RPC so that you get the right
> > thread context.
> 
> It would, but I think fl_nspid is completely unnecessary.  The reason we
> have it is so that we can translate the pid number into other namespaces,
> the most common case being that F_GETLK and views of /proc/locks within a
> namespace represent the same pid numbers as the processes in that
> namespace that are holding the locks.
> 
> It is much simpler to just keep using fl_pid as the pid number in the
> init namespace, but move the translation of that pid number to lookup
> time, rather than creation time.
> 

I think that would also work and I like the idea of getting rid of a
field in file_lock.

So, to be clear:

fl_pid would then store the pid of the process in the init_pid_ns, and
you'd just translate it as appropriate to the requestor's namespace?

If we want to go that route, then you'll probably still need a flag of
some sort to indicate that the fl_pid is to be expressed "as is", for
remote filesystems.
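
To make the lookup-time idea concrete, here is a rough userspace toy model
of it. It is not kernel code: the toy_* types, the pid_as_is flag and the
table-based namespace are all made up for illustration, and returning 0
for an owner that has no pid in the requestor's namespace is just the
choice this sketch makes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Toy stand-ins for illustration only, NOT kernel structures. */
struct toy_ns_entry {
	pid_t init_pid;		/* pid as seen in the initial namespace */
	pid_t ns_pid;		/* same task's pid as seen in this namespace */
};

struct toy_pid_ns {
	const struct toy_ns_entry *map;
	size_t len;
};

struct toy_file_lock {
	pid_t fl_pid;		/* init-namespace pid, or an opaque remote value */
	bool pid_as_is;		/* hypothetical flag: report fl_pid untranslated */
};

/*
 * Translate the owner pid at lookup time (e.g. when answering F_GETLK)
 * instead of when the lock is created. Returns 0 when the owner has no
 * pid in the requestor's namespace (an assumption of this sketch).
 */
static pid_t toy_translate_pid(const struct toy_file_lock *fl,
			       const struct toy_pid_ns *ns)
{
	assert(fl != NULL && ns != NULL);

	/* Remote filesystems: the value is not a local pid, pass it through. */
	if (fl->pid_as_is)
		return fl->fl_pid;

	for (size_t i = 0; i < ns->len; i++)
		if (ns->map[i].init_pid == fl->fl_pid)
			return ns->map[i].ns_pid;

	return 0;
}
```

With a namespace that maps init pid 4242 to pid 7, a local lock owned by
4242 would be reported as 7, while the same value on a lock flagged
pid_as_is would be reported unchanged.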

OTOH, if the lock is held remotely, I wonder if we'd be better off
simply reporting the pid as '-1', like we do with OFD locks. Hardly
anything pays attention to l_pid anyway and it's more or less
meaningless once the filesystem extends beyond the machine you're on.
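
For reference, the existing OFD behaviour can be seen from userspace with
something like the sketch below. It assumes Linux 3.15 or later and a
filesystem that supports OFD locks; the helper name ofd_conflict_pid and
the -2 error value are invented for the example.

```c
#define _GNU_SOURCE		/* for F_OFD_SETLK / F_OFD_GETLK */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Take an OFD write lock through one open file description, then query
 * it through a second one. Returns the l_pid reported for the
 * conflicting lock, or -2 on error or if no conflict is reported
 * (-2 is only this sketch's own error value).
 */
static pid_t ofd_conflict_pid(const char *path)
{
	struct flock fl;
	pid_t result = -2;
	int fd1, fd2;

	assert(path != NULL);

	fd1 = open(path, O_RDWR | O_CREAT, 0600);
	if (fd1 < 0)
		return -2;
	fd2 = open(path, O_RDWR);
	if (fd2 < 0) {
		close(fd1);
		return -2;
	}

	memset(&fl, 0, sizeof(fl));	/* l_pid must be 0 for F_OFD_* commands */
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;		/* l_start = 0, l_len = 0: whole file */
	if (fcntl(fd1, F_OFD_SETLK, &fl) == 0) {
		memset(&fl, 0, sizeof(fl));
		fl.l_type = F_WRLCK;
		fl.l_whence = SEEK_SET;
		/* fd2 is a different open file description, so fd1's lock conflicts. */
		if (fcntl(fd2, F_OFD_GETLK, &fl) == 0 && fl.l_type != F_UNLCK)
			result = fl.l_pid;
	}

	close(fd2);
	close(fd1);
	return result;
}
```

Calling ofd_conflict_pid() on a scratch file should come back with -1,
since an OFD lock belongs to the open file description rather than to a
process.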

That said, I'd be inclined to do that in a separate set so we could
revert it if it caused problems somewhere.
-- 
Jeff Layton <jlayton@redhat.com>
