public inbox for linux-nfs@vger.kernel.org
* Re: [LKP] [fs/locks]  83b381078b:  will-it-scale.per_thread_ops -62.5% regression
       [not found] <20181127060102.GF6163@shao2-debian>
@ 2018-11-27 17:43 ` J. Bruce Fields
  2018-11-27 23:20   ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: J. Bruce Fields @ 2018-11-27 17:43 UTC (permalink / raw)
  To: kernel test robot
  Cc: NeilBrown, Jeff Layton, LKML, Jeff Layton, lkp, linux-nfs

Thanks for the report!

On Tue, Nov 27, 2018 at 02:01:02PM +0800, kernel test robot wrote:
> FYI, we noticed a -62.5% regression of will-it-scale.per_thread_ops due to commit:
> 
> 
> commit: 83b381078b5ecab098ebf6bc9548bb32af1dbf31 ("fs/locks: always delete_block after waiting.")
> https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git locks-next
> 
> in testcase: will-it-scale
> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
> with following parameters:
> 
> 	nr_task: 16
> 	mode: thread
> 	test: lock1

So I guess it's doing this, uncontended file lock/unlock?:

	https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c

Each thread is repeatedly locking and unlocking a file that is only used
by that thread.

By the way, what's the X-axis on these graphs?  (Or the y-axis, for that
matter?)

--b.

>                             will-it-scale.per_thread_ops                        
>                                                                                 
>   450000 +-+----------------------------------------------------------------+   
>          |                                                                  |   
>   400000 +-+   +..+..  .+..+..  .+..+..+...+..+..+..   +..        .+..    ..|   
>   350000 +-+ ..      +.       +.                     ..         +.    +..+  |   
>          |  +                                       +     +     :           |   
>   300000 +-+                                              :    :            |   
>   250000 +-+                                               :   :            |   
>          |                                                 :   :            |   
>   200000 +-+                                               :   :            |   
>   150000 +-+                                                : :             |   
>          O  O  O  O  O  O  O  O  O  O  O   O  O  O  O  O  O :O: O  O  O  O  O   
>   100000 +-+                                                : :             |   
>    50000 +-+                                                : :             |   
>          |                                                   :              |   
>        0 +-+----------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                                will-it-scale.workload                           
>                                                                                 
>   7e+06 +-+-----------------------------------------------------------------+   
>         |              +...+..           .+..+..+     +            +..      |   
>   6e+06 +-+   +..+.. ..         .+..+..+.        +   + +         ..       ..|   
>         |   ..      +         +.                  + +   +       +     +..+  |   
>   5e+06 +-++                                       +     +      :           |   
>         |                                                :     :            |   
>   4e+06 +-+                                               :    :            |   
>         |                                                 :    :            |   
>   3e+06 +-+                                                :   :            |   
>         |                                       O     O    :  : O  O        |   
>   2e+06 O-+O  O  O  O  O   O  O  O  O  O  O  O     O     O : O:       O  O  O   
>         |                                                   : :             |   
>   1e+06 +-+                                                 : :             |   
>         |                                                    :              |   
>       0 +-+-----------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                            will-it-scale.time.user_time                         
>                                                                                 
>   250 +-+-------------------------------------------------------------------+   
>       |                 .+..               .+..                   +..       |   
>       |..   +...+..  .+.      .+...+..+..+.    +..    +..       ..   .    ..|   
>   200 +-+ ..       +.       +.                    . ..         +      +..+  |   
>       |  +                                         +     +     :            |   
>       |                                                  :    :             |   
>   150 +-+                                                 :   :             |   
>       |                                                   :   :             |   
>   100 +-+                                                 :   :             |   
>       |  O      O                                          : :              |   
>       O     O      O  O  O  O  O   O  O  O  O  O   O  O  O :O: O  O   O  O  O   
>    50 +-+                                                  : :              |   
>       |                                                    : :              |   
>       |                                                     :               |   
>     0 +-+-------------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                           will-it-scale.time.system_time                        
>                                                                                 
>   5000 +-+------------------------------------------------------------------+   
>   4500 O-+O..O..O...O..O..O..O..O..O..O...O..O..O..O..O..O  O  O...O..O..O..O   
>        |                                                 :     :            |   
>   4000 +-+                                               :     :            |   
>   3500 +-+                                                :   :             |   
>        |                                                  :   :             |   
>   3000 +-+                                                :   :             |   
>   2500 +-+                                                :   :             |   
>   2000 +-+                                                 : :              |   
>        |                                                   : :              |   
>   1500 +-+                                                 : :              |   
>   1000 +-+                                                 : :              |   
>        |                                                    :               |   
>    500 +-+                                                  :               |   
>      0 +-+------------------------------------------------------------------+   
>                                                                                 
>                                                                                 
> [*] bisect-good sample
> [O] bisect-bad  sample

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [LKP] [fs/locks]  83b381078b:  will-it-scale.per_thread_ops -62.5% regression
  2018-11-27 17:43 ` [LKP] [fs/locks] 83b381078b: will-it-scale.per_thread_ops -62.5% regression J. Bruce Fields
@ 2018-11-27 23:20   ` NeilBrown
  2018-11-28  0:53     ` [PATCH] locks: fix performance regressions NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2018-11-27 23:20 UTC (permalink / raw)
  To: J. Bruce Fields, kernel test robot
  Cc: Jeff Layton, LKML, Jeff Layton, lkp, linux-nfs


On Tue, Nov 27 2018, J. Bruce Fields wrote:

> Thanks for the report!

Yes, thanks.  I thought I had replied to the previous report of a similar
problem, but I didn't actually send that email - oops.
Though the test is the same and the regression similar, this is a
different patch.  The previous report identified
  fs/locks: allow a lock request to block other requests
while this one identifies
  fs/locks: always delete_block after waiting.

Both cause blocked_lock_lock to be taken more often.

In one case it is due to locks_move_blocks().  That can probably be
optimised to skip the lock if list_empty(&fl->fl_blocked_requests).
I'd need to double-check, but I think that is safe to check without
locking.

This one causes locks_delete_block() to be called more often.  We now
call it even if no waiting happened at all.  I suspect we can test for
that and avoid it.  I'll have a look.

>
> On Tue, Nov 27, 2018 at 02:01:02PM +0800, kernel test robot wrote:
>> FYI, we noticed a -62.5% regression of will-it-scale.per_thread_ops due to commit:
>> 
>> 
>> commit: 83b381078b5ecab098ebf6bc9548bb32af1dbf31 ("fs/locks: always delete_block after waiting.")
>> https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git locks-next
>> 
>> in testcase: will-it-scale
>> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
>> with following parameters:
>> 
>> 	nr_task: 16
>> 	mode: thread
>> 	test: lock1
>
> So I guess it's doing this, uncontended file lock/unlock?:
>
> 	https://github.com/antonblanchard/will-it-scale/blob/master/tests/lock1.c
>
> Each thread is repeatedly locking and unlocking a file that is only used
> by that thread.

Thanks for identifying that Bruce.
This would certainly be a case where locks_delete_block() is now being
called when it wasn't before.


>
> By the way, what's the X-axis on these graphs?  (Or the y-axis, for that
> matter?)

A key would help.  I think the X-axis is number-of-threads.  The
y-axis might be operations per second?

Thanks,
NeilBrown


>
> --b.
>
>> [will-it-scale graphs snipped - identical to those in the report above]


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] locks: fix performance regressions.
  2018-11-27 23:20   ` NeilBrown
@ 2018-11-28  0:53     ` NeilBrown
  2018-11-28  9:17       ` kernel test robot
  2018-11-28 11:37       ` Jeff Layton
  0 siblings, 2 replies; 5+ messages in thread
From: NeilBrown @ 2018-11-28  0:53 UTC (permalink / raw)
  To: J. Bruce Fields, kernel test robot, Jeff Layton
  Cc: LKML, Jeff Layton, lkp, linux-nfs



The kernel test robot reported two performance regressions
caused by recent patches.
Both appear to be related to the global spinlock blocked_lock_lock
being taken more often.

This patch avoids taking that lock in the cases tested.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
---

Hi Jeff,
 you might like to merge these back into the patches that introduced
 the problem.
 Or you might like me to re-send the series with these merged in,
 in which case, please ask.

 And a BIG thank-you to the kernel-test-robot team!!

Thanks,
NeilBrown

 fs/locks.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index f456cd3d9d50..67519a43e27a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -444,6 +444,13 @@ static void locks_move_blocks(struct file_lock *new, struct file_lock *fl)
 {
 	struct file_lock *f;
 
+	/*
+	 * As ctx->flc_lock is held, new requests cannot be added to
+	 * ->fl_blocked_requests, so we don't need a lock to check if it
+	 * is empty.
+	 */
+	if (list_empty(&fl->fl_blocked_requests))
+		return;
 	spin_lock(&blocked_lock_lock);
 	list_splice_init(&fl->fl_blocked_requests, &new->fl_blocked_requests);
 	list_for_each_entry(f, &fl->fl_blocked_requests, fl_blocked_member)
@@ -749,6 +756,20 @@ int locks_delete_block(struct file_lock *waiter)
 {
 	int status = -ENOENT;
 
+	/*
+	 * If fl_blocker is NULL, it won't be set again as this thread
+	 * "owns" the lock and is the only one that might try to claim
+	 * the lock.  So it is safe to test fl_blocker locklessly.
+	 * Also if fl_blocker is NULL, this waiter is not listed on
+	 * fl_blocked_requests for some lock, so no other request can
+	 * be added to the list of fl_blocked_requests for this
+	 * request.  So if fl_blocker is NULL, it is safe to
+	 * locklessly check if fl_blocked_requests is empty.  If both
+	 * of these checks succeed, there is no need to take the lock.
+	 */
+	if (waiter->fl_blocker == NULL &&
+	    list_empty(&waiter->fl_blocked_requests))
+		return status;
 	spin_lock(&blocked_lock_lock);
 	if (waiter->fl_blocker)
 		status = 0;
-- 
2.14.0.rc0.dirty



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] locks: fix performance regressions.
  2018-11-28  0:53     ` [PATCH] locks: fix performance regressions NeilBrown
@ 2018-11-28  9:17       ` kernel test robot
  2018-11-28 11:37       ` Jeff Layton
  1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2018-11-28  9:17 UTC (permalink / raw)
  To: NeilBrown; +Cc: J. Bruce Fields, Jeff Layton, LKML, Jeff Layton, lkp, linux-nfs

Hi,

On Wed, Nov 28, 2018 at 11:53:48AM +1100, NeilBrown wrote:
> 
> The kernel test robot reported two performance regressions
> caused by recent patches.
> Both appear to be related to the global spinlock blocked_lock_lock
> being taken more often.
> 
> This patch avoids taking that lock in the cases tested.
> 
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
> 
> Hi Jeff,
>  you might like to merge these back into the patches that introduced
>  the problem.
>  Or you might like me to re-send the series with these merged in,
>  in which case, please ask.
> 
>  And a BIG thank-you to the kernel-test-robot team!!
> 
> Thanks,
> NeilBrown
> 
>  fs/locks.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index f456cd3d9d50..67519a43e27a 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -444,6 +444,13 @@ static void locks_move_blocks(struct file_lock *new, struct file_lock *fl)
>  {
>  	struct file_lock *f;
>  
> +	/*
> +	 * As ctx->flc_lock is held, new requests cannot be added to
> +	 * ->fl_blocked_requests, so we don't need a lock to check if it
> +	 * is empty.
> +	 */
> +	if (list_empty(&fl->fl_blocked_requests))
> +		return;
>  	spin_lock(&blocked_lock_lock);
>  	list_splice_init(&fl->fl_blocked_requests, &new->fl_blocked_requests);
>  	list_for_each_entry(f, &fl->fl_blocked_requests, fl_blocked_member)
> @@ -749,6 +756,20 @@ int locks_delete_block(struct file_lock *waiter)
>  {
>  	int status = -ENOENT;
>  
> +	/*
> +	 * If fl_blocker is NULL, it won't be set again as this thread
> +	 * "owns" the lock and is the only one that might try to claim
> +	 * the lock.  So it is safe to test fl_blocker locklessly.
> +	 * Also if fl_blocker is NULL, this waiter is not listed on
> +	 * fl_blocked_requests for some lock, so no other request can
> +	 * be added to the list of fl_blocked_requests for this
> +	 * request.  So if fl_blocker is NULL, it is safe to
> +	 * locklessly check if fl_blocked_requests is empty.  If both
> +	 * of these checks succeed, there is no need to take the lock.
> +	 */
> +	if (waiter->fl_blocker == NULL &&
> +	    list_empty(&waiter->fl_blocked_requests))
> +		return status;
>  	spin_lock(&blocked_lock_lock);
>  	if (waiter->fl_blocker)
>  		status = 0;
> -- 
> 2.14.0.rc0.dirty
> 

FYI, the performance has recovered; we didn't find any regression between the two commits.

commit: 
  48a7a13ff3 ("locks: use properly initialized file_lock when unlocking.")
  8f64e497be ("locks: fix performance regressions.")

48a7a13ff31f0728  8f64e497be9929a2d5904c39c4  
----------------  --------------------------  
         %stddev      change         %stddev
             \          |                \  
     33.56 ±  3%         5%      35.30        boot-time.boot
     10497 ±  3%        12%      11733 ±  4%  proc-vmstat.nr_shmem
     67392                       68449        proc-vmstat.nr_zone_active_anon
     67392                       68449        proc-vmstat.nr_active_anon
     16303                       16206        proc-vmstat.nr_slab_reclaimable
     30602                       29921        proc-vmstat.nr_slab_unreclaimable
         0            9e+03       9009 ± 80%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
         0            6e+03       5837 ±139%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
       149 ± 17%      5e+03       5457 ±137%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
       175 ± 29%      4e+03       3807 ±136%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     52868 ±110%     -4e+04      17482 ±  4%  latency_stats.avg.max
     45055 ±141%     -5e+04          0        latency_stats.avg.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
       227 ± 10%      1e+04       9907 ±136%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
         0            9e+03       9367 ± 78%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
         0            6e+03       5837 ±139%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
       175 ± 29%      4e+03       3807 ±136%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     98043 ±124%     -8e+04      20999 ± 27%  latency_stats.max.max
     90059 ±141%     -9e+04          0        latency_stats.max.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1251 ± 23%      5e+04      49005 ±137%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
         0            1e+04      12061 ± 70%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
         0            6e+03       5837 ±139%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
       175 ± 29%      4e+03       3807 ±136%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup
     90111 ±141%     -9e+04          0        latency_stats.sum.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe


Best Regards,
Rong Chen

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] locks: fix performance regressions.
  2018-11-28  0:53     ` [PATCH] locks: fix performance regressions NeilBrown
  2018-11-28  9:17       ` kernel test robot
@ 2018-11-28 11:37       ` Jeff Layton
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2018-11-28 11:37 UTC (permalink / raw)
  To: NeilBrown, J. Bruce Fields, kernel test robot; +Cc: LKML, lkp, linux-nfs

On Wed, 2018-11-28 at 11:53 +1100, NeilBrown wrote:
> The kernel test robot reported two performance regressions
> caused by recent patches.
> Both appear to be related to the global spinlock blocked_lock_lock
> being taken more often.
> 
> This patch avoids taking that lock in the cases tested.
> 
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
> 
> Hi Jeff,
>  you might like to merge these back into the patches that introduced
>  the problem.
>  Or you might like me to re-send the series with these merged in,
>  in which case, please ask.
> 

Thanks Neil,

This looks great. I'll go ahead and toss this patch on top of the pile
in linux-next for now.

Would you mind resending the series with this patch merged in? I took a
quick stab at squashing it into the earlier patch, but there is some
churn in this area.

Maybe you can also turn that Reported-by: into a Tested-by: in the
changelog afterward?

> And a BIG thank-you to the kernel-test-robot team!!
> 

Absolutely! We love you guys!

> Thanks,
> NeilBrown
> 
>  fs/locks.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index f456cd3d9d50..67519a43e27a 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -444,6 +444,13 @@ static void locks_move_blocks(struct file_lock *new, struct file_lock *fl)
>  {
>  	struct file_lock *f;
>  
> +	/*
> +	 * As ctx->flc_lock is held, new requests cannot be added to
> +	 * ->fl_blocked_requests, so we don't need a lock to check if it
> +	 * is empty.
> +	 */
> +	if (list_empty(&fl->fl_blocked_requests))
> +		return;
>  	spin_lock(&blocked_lock_lock);
>  	list_splice_init(&fl->fl_blocked_requests, &new->fl_blocked_requests);
>  	list_for_each_entry(f, &fl->fl_blocked_requests, fl_blocked_member)
> @@ -749,6 +756,20 @@ int locks_delete_block(struct file_lock *waiter)
>  {
>  	int status = -ENOENT;
>  
> +	/*
> +	 * If fl_blocker is NULL, it won't be set again as this thread
> +	 * "owns" the lock and is the only one that might try to claim
> +	 * the lock.  So it is safe to test fl_blocker locklessly.
> +	 * Also if fl_blocker is NULL, this waiter is not listed on
> +	 * fl_blocked_requests for some lock, so no other request can
> +	 * be added to the list of fl_blocked_requests for this
> +	 * request.  So if fl_blocker is NULL, it is safe to
> +	 * locklessly check if fl_blocked_requests is empty.  If both
> +	 * of these checks succeed, there is no need to take the lock.
> +	 */
> +	if (waiter->fl_blocker == NULL &&
> +	    list_empty(&waiter->fl_blocked_requests))
> +		return status;
>  	spin_lock(&blocked_lock_lock);
>  	if (waiter->fl_blocker)
>  		status = 0;

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-11-28 11:37 UTC | newest]

Thread overview: 5+ messages
     [not found] <20181127060102.GF6163@shao2-debian>
2018-11-27 17:43 ` [LKP] [fs/locks] 83b381078b: will-it-scale.per_thread_ops -62.5% regression J. Bruce Fields
2018-11-27 23:20   ` NeilBrown
2018-11-28  0:53     ` [PATCH] locks: fix performance regressions NeilBrown
2018-11-28  9:17       ` kernel test robot
2018-11-28 11:37       ` Jeff Layton
