public inbox for linux-kernel@vger.kernel.org
* [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock
@ 2022-09-01 10:28 Mukesh Ojha
  2022-09-02 20:55 ` Waiman Long
  2022-09-08 14:32 ` Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Mukesh Ojha @ 2022-09-01 10:28 UTC (permalink / raw)
  To: peterz, mingo, will, longman, boqun.feng
  Cc: linux-kernel, Gokul krishna Krishnakumar, Mukesh Ojha

From: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>

Make the region inside rwsem_write_trylock() non-preemptible.

We observe an RT task hogging the CPU while trying to acquire an rwsem
lock that was acquired by a kworker task, before the rwsem owner was set.

Here is the scenario:
1. CFS task (affined to a particular CPU) takes the rwsem lock.

2. CFS task gets preempted by an RT task before setting the owner.

3. RT task (FIFO) tries to acquire the lock, but keeps spinning until
RT throttling kicks in, since the lock is held by the CFS task.

This patch attempts to fix the above issue by disabling preemption
until the owner is set for the lock. While at it, also fix this issue
at the other places where the owner is being set/cleared.

Signed-off-by: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
 kernel/locking/rwsem.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 65f0262..3b4b32e 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -251,13 +251,16 @@ static inline bool rwsem_read_trylock(struct rw_semaphore *sem, long *cntp)
 static inline bool rwsem_write_trylock(struct rw_semaphore *sem)
 {
 	long tmp = RWSEM_UNLOCKED_VALUE;
+	bool ret = false;
 
+	preempt_disable();
 	if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, RWSEM_WRITER_LOCKED)) {
 		rwsem_set_owner(sem);
-		return true;
+		ret = true;
 	}
 
-	return false;
+	preempt_enable();
+	return ret;
 }
 
 /*
@@ -686,16 +689,21 @@ enum owner_state {
 static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 {
 	long count = atomic_long_read(&sem->count);
+	bool ret = false;
 
+	preempt_disable();
 	while (!(count & (RWSEM_LOCK_MASK|RWSEM_FLAG_HANDOFF))) {
 		if (atomic_long_try_cmpxchg_acquire(&sem->count, &count,
 					count | RWSEM_WRITER_LOCKED)) {
 			rwsem_set_owner(sem);
 			lockevent_inc(rwsem_opt_lock);
-			return true;
+			ret = true;
+			break;
 		}
 	}
-	return false;
+
+	preempt_enable();
+	return ret;
 }
 
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
@@ -1352,8 +1360,10 @@ static inline void __up_write(struct rw_semaphore *sem)
 	DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) &&
 			    !rwsem_test_oflags(sem, RWSEM_NONSPINNABLE), sem);
 
+	preempt_disable();
 	rwsem_clear_owner(sem);
 	tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
+	preempt_enable();
 	if (unlikely(tmp & RWSEM_FLAG_WAITERS))
 		rwsem_wake(sem);
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock
  2022-09-01 10:28 [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock Mukesh Ojha
@ 2022-09-02 20:55 ` Waiman Long
  2022-09-06 12:43   ` Mukesh Ojha
  2022-09-08 14:32 ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Waiman Long @ 2022-09-02 20:55 UTC (permalink / raw)
  To: Mukesh Ojha, peterz, mingo, will, boqun.feng
  Cc: linux-kernel, Gokul krishna Krishnakumar


On 9/1/22 06:28, Mukesh Ojha wrote:
> From: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
>
> Make the region inside the rwsem_write_trylock non preemptible.
>
> We observe RT task is hogging CPU when trying to acquire rwsem lock
> which was acquired by a kworker task but before the rwsem owner was set.
>
> Here is the scenario:
> 1. CFS task (affined to a particular CPU) takes rwsem lock.
>
> 2. CFS task gets preempted by a RT task before setting owner.
>
> 3. RT task (FIFO) is trying to acquire the lock, but spinning until
> RT throttling happens for the lock as the lock was taken by CFS task.

Note that the spinning is likely caused by the following code in 
rwsem_down_write_slowpath():

		/*
		 * After setting the handoff bit and failing to acquire
		 * the lock, attempt to spin on owner to accelerate lock
		 * transfer. If the previous owner is a on-cpu writer and it
		 * has just released the lock, OWNER_NULL will be returned.
		 * In this case, we attempt to acquire the lock again
		 * without sleeping.
		 */
		if (waiter.handoff_set) {
			enum owner_state owner_state;

			preempt_disable();
			owner_state = rwsem_spin_on_owner(sem);
			preempt_enable();

			if (owner_state == OWNER_NULL)
				goto trylock_again;
		}

rwsem_optimistic_spin() limits an RT task to one additional attempt if 
OWNER_NULL is returned. There is no such limitation in this loop. So an 
alternative would be to put a limit on the number of times an OWNER_NULL 
return value is allowed to continue spinning without sleeping. That puts 
the burden on the slowpath instead of the fastpath.

Other than the slight overhead in the fastpath, the patch should work too.

Acked-by: Waiman Long <longman@redhat.com>

Cheers,
Longman


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock
  2022-09-02 20:55 ` Waiman Long
@ 2022-09-06 12:43   ` Mukesh Ojha
  0 siblings, 0 replies; 6+ messages in thread
From: Mukesh Ojha @ 2022-09-06 12:43 UTC (permalink / raw)
  To: Waiman Long, peterz, mingo, will, boqun.feng
  Cc: linux-kernel, Gokul krishna Krishnakumar

Hi,

On 9/3/2022 2:25 AM, Waiman Long wrote:
> 
> On 9/1/22 06:28, Mukesh Ojha wrote:
>> From: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
>>
>> Make the region inside the rwsem_write_trylock non preemptible.
>>
>> We observe RT task is hogging CPU when trying to acquire rwsem lock
>> which was acquired by a kworker task but before the rwsem owner was set.
>>
>> Here is the scenario:
>> 1. CFS task (affined to a particular CPU) takes rwsem lock.
>>
>> 2. CFS task gets preempted by a RT task before setting owner.
>>
>> 3. RT task (FIFO) is trying to acquire the lock, but spinning until
>> RT throttling happens for the lock as the lock was taken by CFS task.
> 
> Note that the spinning is likely caused by the following code in 
> rwsem_down_write_slowpath():
> 
> 		/*
> 		 * After setting the handoff bit and failing to acquire
> 		 * the lock, attempt to spin on owner to accelerate lock
> 		 * transfer. If the previous owner is a on-cpu writer and it
> 		 * has just released the lock, OWNER_NULL will be returned.
> 		 * In this case, we attempt to acquire the lock again
> 		 * without sleeping.
> 		 */
> 		if (waiter.handoff_set) {
> 			enum owner_state owner_state;
> 
> 			preempt_disable();
> 			owner_state = rwsem_spin_on_owner(sem);
> 			preempt_enable();
> 
> 			if (owner_state == OWNER_NULL)
> 				goto trylock_again;
> 		}
> 
> rwsem_optimistic_spin() limits RT task one additional attempt if 
> OWNER_NULL is returned. There is no such limitation in this loop. So an 
> alternative will be to put a limit on the number of times an OWNER_NULL 
> return values will be allowed to continue spinning without sleeping. 
> That put the burden on the slowpath instead of in the fastpath.
> 
> Other than the slight overhead in the fastpath, the patch should work too.
> 
> Acked-by: Waiman Long <longman@redhat.com>

Thanks Waiman for your time and suggestion.
Would like to take others opinion as well.

-Mukesh

> 
> Cheers,
> Longman
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock
  2022-09-01 10:28 [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock Mukesh Ojha
  2022-09-02 20:55 ` Waiman Long
@ 2022-09-08 14:32 ` Peter Zijlstra
  2022-09-08 15:48   ` Mukesh Ojha
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2022-09-08 14:32 UTC (permalink / raw)
  To: Mukesh Ojha
  Cc: mingo, will, longman, boqun.feng, linux-kernel,
	Gokul krishna Krishnakumar

On Thu, Sep 01, 2022 at 03:58:10PM +0530, Mukesh Ojha wrote:
> From: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
> 
> Make the region inside the rwsem_write_trylock non preemptible.
> 
> We observe RT task is hogging CPU when trying to acquire rwsem lock
> which was acquired by a kworker task but before the rwsem owner was set.
> 
> Here is the scenario:
> 1. CFS task (affined to a particular CPU) takes rwsem lock.
> 
> 2. CFS task gets preempted by a RT task before setting owner.
> 
> 3. RT task (FIFO) is trying to acquire the lock, but spinning until
> RT throttling happens for the lock as the lock was taken by CFS task.
> 
> This patch attempts to fix the above issue by disabling preemption
> until owner is set for the lock. while at it also fix this issue
> at the place where owner being set/cleared.
> 
> Signed-off-by: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>

This is not a valid SoB chain.

> ---
>  kernel/locking/rwsem.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 65f0262..3b4b32e 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -251,13 +251,16 @@ static inline bool rwsem_read_trylock(struct rw_semaphore *sem, long *cntp)
>  static inline bool rwsem_write_trylock(struct rw_semaphore *sem)
>  {
>  	long tmp = RWSEM_UNLOCKED_VALUE;
> +	bool ret = false;
>  
> +	preempt_disable();
>  	if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, RWSEM_WRITER_LOCKED)) {
>  		rwsem_set_owner(sem);
> -		return true;
> +		ret = true;
>  	}
>  
> -	return false;
> +	preempt_enable();
> +	return ret;
>  }
>  
>  /*

Yes, this part looks ok.

> @@ -686,16 +689,21 @@ enum owner_state {
>  static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
>  {
>  	long count = atomic_long_read(&sem->count);
> +	bool ret = false;
>  
> +	preempt_disable();
>  	while (!(count & (RWSEM_LOCK_MASK|RWSEM_FLAG_HANDOFF))) {
>  		if (atomic_long_try_cmpxchg_acquire(&sem->count, &count,
>  					count | RWSEM_WRITER_LOCKED)) {
>  			rwsem_set_owner(sem);
>  			lockevent_inc(rwsem_opt_lock);
> -			return true;
> +			ret = true;
> +			break;
>  		}
>  	}
> -	return false;
> +
> +	preempt_enable();
> +	return ret;
>  }
>  

This one I can't follow; afaict this is only called with preemption
already disabled.

>  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> @@ -1352,8 +1360,10 @@ static inline void __up_write(struct rw_semaphore *sem)
>  	DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) &&
>  			    !rwsem_test_oflags(sem, RWSEM_NONSPINNABLE), sem);
>  
> +	preempt_disable();
>  	rwsem_clear_owner(sem);
>  	tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
> +	preempt_enable();
>  	if (unlikely(tmp & RWSEM_FLAG_WAITERS))
>  		rwsem_wake(sem);
>  }

Yep, that looks good again.

Perhaps the thing to do would be to add:

  lockdep_assert_preemption_disabled()

to rwsem_{set,clear}_owner() and expand the comment there to explain
that these functions should be in the same preempt-disable section as
the atomic op that changes sem->count.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock
  2022-09-08 14:32 ` Peter Zijlstra
@ 2022-09-08 15:48   ` Mukesh Ojha
  2022-09-08 17:03     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Mukesh Ojha @ 2022-09-08 15:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, will, longman, boqun.feng, linux-kernel,
	Gokul krishna Krishnakumar

Hi Peter,

Thanks for your time in reviewing this patch.

On 9/8/2022 8:02 PM, Peter Zijlstra wrote:
> On Thu, Sep 01, 2022 at 03:58:10PM +0530, Mukesh Ojha wrote:
>> From: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
>>
>> Make the region inside the rwsem_write_trylock non preemptible.
>>
>> We observe RT task is hogging CPU when trying to acquire rwsem lock
>> which was acquired by a kworker task but before the rwsem owner was set.
>>
>> Here is the scenario:
>> 1. CFS task (affined to a particular CPU) takes rwsem lock.
>>
>> 2. CFS task gets preempted by a RT task before setting owner.
>>
>> 3. RT task (FIFO) is trying to acquire the lock, but spinning until
>> RT throttling happens for the lock as the lock was taken by CFS task.
>>
>> This patch attempts to fix the above issue by disabling preemption
>> until owner is set for the lock. while at it also fix this issue
>> at the place where owner being set/cleared.
>>
>> Signed-off-by: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
> 
> This is not a valid SoB chain.

Since the patch adding preempt_disable() in rwsem_write_trylock()
originated from Gokul, would it be fine to credit him with

Original-patch-by: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>

and convert myself to the author/SoB?

Please suggest.

> 
>> ---
>>   kernel/locking/rwsem.c | 18 ++++++++++++++----
>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 65f0262..3b4b32e 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -251,13 +251,16 @@ static inline bool rwsem_read_trylock(struct rw_semaphore *sem, long *cntp)
>>   static inline bool rwsem_write_trylock(struct rw_semaphore *sem)
>>   {
>>   	long tmp = RWSEM_UNLOCKED_VALUE;
>> +	bool ret = false;
>>   
>> +	preempt_disable();
>>   	if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, RWSEM_WRITER_LOCKED)) {
>>   		rwsem_set_owner(sem);
>> -		return true;
>> +		ret = true;
>>   	}
>>   
>> -	return false;
>> +	preempt_enable();
>> +	return ret;
>>   }
>>   
>>   /*
> 
> Yes, this part looks ok.
> 
>> @@ -686,16 +689,21 @@ enum owner_state {
>>   static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
>>   {
>>   	long count = atomic_long_read(&sem->count);
>> +	bool ret = false;
>>   
>> +	preempt_disable();
>>   	while (!(count & (RWSEM_LOCK_MASK|RWSEM_FLAG_HANDOFF))) {
>>   		if (atomic_long_try_cmpxchg_acquire(&sem->count, &count,
>>   					count | RWSEM_WRITER_LOCKED)) {
>>   			rwsem_set_owner(sem);
>>   			lockevent_inc(rwsem_opt_lock);
>> -			return true;
>> +			ret = true;
>> +			break;
>>   		}
>>   	}
>> -	return false;
>> +
>> +	preempt_enable();
>> +	return ret;
>>   }
>>   
> 
> This one I can't follow; afaict this is only called with preemption
> already disabled.

Agreed. Will remove it in v2.

> 
>>   static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
>> @@ -1352,8 +1360,10 @@ static inline void __up_write(struct rw_semaphore *sem)
>>   	DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) &&
>>   			    !rwsem_test_oflags(sem, RWSEM_NONSPINNABLE), sem);
>>   
>> +	preempt_disable();
>>   	rwsem_clear_owner(sem);
>>   	tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
>> +	preempt_enable();
>>   	if (unlikely(tmp & RWSEM_FLAG_WAITERS))
>>   		rwsem_wake(sem);
>>   }
> 
> Yep, that looks good again.
> 
> Perhaps the thing to do would be to add:
> 
>    lockdep_assert_preemption_disabled()
> 
> to rwsem_{set,clear}_owner() and expand the comment there to explain
> that these functions should be in the same preempt-disable section as
> the atomic op that changes sem->count.

Thanks for the suggestion, will add it in v2.

-Mukesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] locking/rwsem: Disable preemption while trying for rwsem lock
  2022-09-08 15:48   ` Mukesh Ojha
@ 2022-09-08 17:03     ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2022-09-08 17:03 UTC (permalink / raw)
  To: Mukesh Ojha
  Cc: mingo, will, longman, boqun.feng, linux-kernel,
	Gokul krishna Krishnakumar

On Thu, Sep 08, 2022 at 09:18:29PM +0530, Mukesh Ojha wrote:
> Hi Peter,
> 
> Thanks for your time in reviewing this patch.
> 
> On 9/8/2022 8:02 PM, Peter Zijlstra wrote:
> > On Thu, Sep 01, 2022 at 03:58:10PM +0530, Mukesh Ojha wrote:
> > > From: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
> > > 
> > > Make the region inside the rwsem_write_trylock non preemptible.
> > > 
> > > We observe RT task is hogging CPU when trying to acquire rwsem lock
> > > which was acquired by a kworker task but before the rwsem owner was set.
> > > 
> > > Here is the scenario:
> > > 1. CFS task (affined to a particular CPU) takes rwsem lock.
> > > 
> > > 2. CFS task gets preempted by a RT task before setting owner.
> > > 
> > > 3. RT task (FIFO) is trying to acquire the lock, but spinning until
> > > RT throttling happens for the lock as the lock was taken by CFS task.
> > > 
> > > This patch attempts to fix the above issue by disabling preemption
> > > until owner is set for the lock. while at it also fix this issue
> > > at the place where owner being set/cleared.
> > > 
> > > Signed-off-by: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
> > > Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
> > 
> > This is not a valid SoB chain.
> 
> Since this patch of adding preempt disable() at rwsem_write_trylock() is
> originated from Gokul.
> 
> Would be adding him in
> Original-patch-by: Gokul krishna Krishnakumar <quic_gokukris@quicinc.com>
> 
> Convert myself to the author/SoB.
> 

Oh, my bad, I missed that the From: is actually Gokul. So yeah, all good,
ignore that.

^ permalink raw reply	[flat|nested] 6+ messages in thread
