From: Jeff Layton <jlayton@kernel.org>
To: lkp@lists.01.org
Subject: Re: [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression
Date: Wed, 11 Mar 2020 07:52:23 -0500 [thread overview]
Message-ID: <9ff6eee403d293dd069935ca6979f72131fe5217.camel@kernel.org> (raw)
In-Reply-To: <f9db707f-74ef-9439-1aec-e1ce8234888e@huawei.com>
[-- Attachment #1: Type: text/plain, Size: 2692 bytes --]
On Wed, 2020-03-11 at 09:57 +0800, yangerkun wrote:
[snip]
>
> On 2020/3/11 5:01, NeilBrown wrote:
> >
> > I think this patch contains an assumption which is not justified. It
> > assumes that if a wait_event completes without error, then the wake_up()
> > must have happened. I don't think that is correct.
> >
> > In the patch that caused the recent regression, the race described
> > involved a signal arriving just as __locks_wake_up_blocks() was being
> > called on another thread.
> > So the waiting process was woken by a signal *after* ->fl_blocker was set
> > to NULL, and *before* the wake_up(). If wait_event_interruptible()
> > finds that the condition is true, it will report success whether there
> > was a signal or not.
> Neil and Jeff, Hi,
>
> But after this, like in flock_lock_inode_wait, we will go another
> flock_lock_inode. And the flock_lock_inode it may return
> -ENOMEM/-ENOENT/-EAGAIN/0.
>
> - 0: If there is a try lock, it means that we have call
> locks_move_blocks, and fl->fl_blocked_requests will be NULL, no need to
> wake up at all. If there is a unlock, no one call wait for me, no need
> to wake up too.
>
> - ENOENT: means we are doing unlock, no one will wait for me, no need to
> wake up.
>
> - ENOMEM: since last time we go through flock_lock_inode someone may
> wait for me, so for this error, we need to wake up them.
>
> - EAGAIN: since we has go through flock_lock_inode before, these may
> never happen because FL_SLEEP will not lose.
>
> So the assumption may be ok and for some error case we need to wake up
> someone may wait for me before(the reason for the patch "cifs: call
> locks_delete_block for all error case in cifs_posix_lock_set"). If I am
> wrong, please point out!
>
>
That's the basic dilemma. We need to know whether we'll need to delete
the block before taking the blocked_lock_lock.
Your most recent patch used the return code from the wait to determine
this, but that's not 100% reliable (as Neil pointed out). Could we try
to do this by doing the delete only when we get certain error codes?
Maybe, but that's a bit fragile-sounding.
Neil's most recent patch used presence on the fl_blocked_requests list
to determine whether to take the lock, but that relied on some very
subtle memory ordering. We could of course do that, but that's a bit
brittle too.
That's the main reason I'm leaning toward the patch Neil sent
originally and that uses the fl_wait.lock. The existing alternate lock
managers (nfsd and lockd) don't use fl_wait at all, so I don't think
doing that will cause any issues.
--
Jeff Layton <jlayton@kernel.org>
WARNING: multiple messages have this Message-ID (diff)
From: Jeff Layton <jlayton@kernel.org>
To: yangerkun <yangerkun@huawei.com>, NeilBrown <neilb@suse.de>,
Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel test robot <rong.a.chen@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
lkp@lists.01.org, Bruce Fields <bfields@fieldses.org>,
Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression
Date: Wed, 11 Mar 2020 07:52:23 -0500 [thread overview]
Message-ID: <9ff6eee403d293dd069935ca6979f72131fe5217.camel@kernel.org> (raw)
In-Reply-To: <f9db707f-74ef-9439-1aec-e1ce8234888e@huawei.com>
On Wed, 2020-03-11 at 09:57 +0800, yangerkun wrote:
[snip]
>
> On 2020/3/11 5:01, NeilBrown wrote:
> >
> > I think this patch contains an assumption which is not justified. It
> > assumes that if a wait_event completes without error, then the wake_up()
> > must have happened. I don't think that is correct.
> >
> > In the patch that caused the recent regression, the race described
> > involved a signal arriving just as __locks_wake_up_blocks() was being
> > called on another thread.
> > So the waiting process was woken by a signal *after* ->fl_blocker was set
> > to NULL, and *before* the wake_up(). If wait_event_interruptible()
> > finds that the condition is true, it will report success whether there
> > was a signal or not.
> Neil and Jeff, Hi,
>
> But after this, like in flock_lock_inode_wait, we will go another
> flock_lock_inode. And the flock_lock_inode it may return
> -ENOMEM/-ENOENT/-EAGAIN/0.
>
> - 0: If there is a try lock, it means that we have call
> locks_move_blocks, and fl->fl_blocked_requests will be NULL, no need to
> wake up at all. If there is a unlock, no one call wait for me, no need
> to wake up too.
>
> - ENOENT: means we are doing unlock, no one will wait for me, no need to
> wake up.
>
> - ENOMEM: since last time we go through flock_lock_inode someone may
> wait for me, so for this error, we need to wake up them.
>
> - EAGAIN: since we has go through flock_lock_inode before, these may
> never happen because FL_SLEEP will not lose.
>
> So the assumption may be ok and for some error case we need to wake up
> someone may wait for me before(the reason for the patch "cifs: call
> locks_delete_block for all error case in cifs_posix_lock_set"). If I am
> wrong, please point out!
>
>
That's the basic dilemma. We need to know whether we'll need to delete
the block before taking the blocked_lock_lock.
Your most recent patch used the return code from the wait to determine
this, but that's not 100% reliable (as Neil pointed out). Could we try
to do this by doing the delete only when we get certain error codes?
Maybe, but that's a bit fragile-sounding.
Neil's most recent patch used presence on the fl_blocked_requests list
to determine whether to take the lock, but that relied on some very
subtle memory ordering. We could of course do that, but that's a bit
brittle too.
That's the main reason I'm leaning toward the patch Neil sent
originally and that uses the fl_wait.lock. The existing alternate lock
managers (nfsd and lockd) don't use fl_wait at all, so I don't think
doing that will cause any issues.
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2020-03-11 12:52 UTC|newest]
Thread overview: 110+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-08 14:03 [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression kernel test robot
2020-03-08 14:03 ` kernel test robot
2020-03-09 14:36 ` Jeff Layton
2020-03-09 14:36 ` Jeff Layton
2020-03-09 15:52 ` Linus Torvalds
2020-03-09 15:52 ` Linus Torvalds
2020-03-09 17:22 ` Jeff Layton
2020-03-09 17:22 ` Jeff Layton
2020-03-09 19:09 ` Jeff Layton
2020-03-09 19:09 ` Jeff Layton
2020-03-09 19:53 ` Jeff Layton
2020-03-09 19:53 ` Jeff Layton
2020-03-09 21:42 ` NeilBrown
2020-03-09 21:42 ` NeilBrown
2020-03-09 21:58 ` Jeff Layton
2020-03-09 21:58 ` Jeff Layton
2020-03-10 7:52 ` kernel test robot
2020-03-10 7:52 ` kernel test robot
2020-03-09 22:11 ` Jeff Layton
2020-03-09 22:11 ` Jeff Layton
2020-03-10 3:24 ` yangerkun
2020-03-10 3:24 ` yangerkun
2020-03-10 7:54 ` kernel test robot
2020-03-10 7:54 ` kernel test robot
2020-03-10 12:52 ` Jeff Layton
2020-03-10 12:52 ` Jeff Layton
2020-03-10 14:18 ` yangerkun
2020-03-10 14:18 ` yangerkun
2020-03-10 15:06 ` Jeff Layton
2020-03-10 15:06 ` Jeff Layton
2020-03-10 17:27 ` Jeff Layton
2020-03-10 17:27 ` Jeff Layton
2020-03-10 21:01 ` NeilBrown
2020-03-10 21:01 ` NeilBrown
2020-03-10 21:14 ` Jeff Layton
2020-03-10 21:14 ` Jeff Layton
2020-03-10 21:21 ` NeilBrown
2020-03-10 21:21 ` NeilBrown
2020-03-10 21:47 ` Linus Torvalds
2020-03-10 21:47 ` Linus Torvalds
2020-03-10 22:07 ` Jeff Layton
2020-03-10 22:07 ` Jeff Layton
2020-03-10 22:31 ` Linus Torvalds
2020-03-10 22:31 ` Linus Torvalds
2020-03-11 22:22 ` NeilBrown
2020-03-11 22:22 ` NeilBrown
2020-03-12 0:38 ` Linus Torvalds
2020-03-12 0:38 ` Linus Torvalds
2020-03-12 4:42 ` NeilBrown
2020-03-12 4:42 ` NeilBrown
2020-03-12 12:31 ` Jeff Layton
2020-03-12 12:31 ` Jeff Layton
2020-03-12 22:19 ` NeilBrown
2020-03-12 22:19 ` NeilBrown
2020-03-14 1:11 ` Jeff Layton
2020-03-14 1:11 ` Jeff Layton
2020-03-12 16:07 ` Linus Torvalds
2020-03-12 16:07 ` Linus Torvalds
2020-03-14 1:31 ` Jeff Layton
2020-03-14 1:31 ` Jeff Layton
2020-03-14 2:31 ` NeilBrown
2020-03-14 2:31 ` NeilBrown
2020-03-14 15:58 ` Linus Torvalds
2020-03-14 15:58 ` Linus Torvalds
2020-03-15 13:54 ` Jeff Layton
2020-03-15 13:54 ` Jeff Layton
2020-03-16 5:06 ` NeilBrown
2020-03-16 5:06 ` NeilBrown
2020-03-16 11:07 ` Jeff Layton
2020-03-16 11:07 ` Jeff Layton
2020-03-16 17:26 ` Linus Torvalds
2020-03-16 17:26 ` Linus Torvalds
2020-03-17 1:41 ` yangerkun
2020-03-17 1:41 ` yangerkun
2020-03-17 14:05 ` yangerkun
2020-03-17 14:05 ` yangerkun
2020-03-17 16:07 ` Jeff Layton
2020-03-17 16:07 ` Jeff Layton
2020-03-18 1:09 ` yangerkun
2020-03-18 1:09 ` yangerkun
2020-03-19 17:51 ` Jeff Layton
2020-03-19 17:51 ` Jeff Layton
2020-03-19 19:23 ` Linus Torvalds
2020-03-19 19:23 ` Linus Torvalds
2020-03-19 19:24 ` Jeff Layton
2020-03-19 19:24 ` Jeff Layton
2020-03-19 19:35 ` Linus Torvalds
2020-03-19 19:35 ` Linus Torvalds
2020-03-19 20:10 ` Jeff Layton
2020-03-19 20:10 ` Jeff Layton
2020-03-16 22:45 ` NeilBrown
2020-03-16 22:45 ` NeilBrown
2020-03-17 15:59 ` Jeff Layton
2020-03-17 15:59 ` Jeff Layton
2020-03-17 21:27 ` NeilBrown
2020-03-17 21:27 ` NeilBrown
2020-03-18 5:12 ` kernel test robot
2020-03-18 5:12 ` kernel test robot
2020-03-16 4:26 ` NeilBrown
2020-03-16 4:26 ` NeilBrown
2020-03-11 1:57 ` yangerkun
2020-03-11 1:57 ` yangerkun
2020-03-11 12:52 ` Jeff Layton [this message]
2020-03-11 12:52 ` Jeff Layton
2020-03-11 13:26 ` yangerkun
2020-03-11 13:26 ` yangerkun
2020-03-11 22:15 ` NeilBrown
2020-03-11 22:15 ` NeilBrown
2020-03-10 7:50 ` kernel test robot
2020-03-10 7:50 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9ff6eee403d293dd069935ca6979f72131fe5217.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=lkp@lists.01.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.