public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
@ 2021-10-20 11:06 Paolo Bonzini
  2021-10-20 11:17 ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2021-10-20 11:06 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Davidlohr Bueso, Oleg Nesterov, Ingo Molnar, Paul E . McKenney,
	Peter Zijlstra, Wanpeng Li

In some cases, rcuwait_wake_up can be called even if the actual likelihood
of a wakeup is very low.  If CONFIG_PREEMPT_RCU is active, the resulting
rcu_read_lock/rcu_read_unlock pair can be relatively expensive, and in
fact it is unnecessary when there is no w->task to keep alive: the
memory barrier before the read is what matters in order to avoid missed
wakeups.

Therefore, do an early check of w->task right after the barrier, and skip
rcu_read_lock/rcu_read_unlock unless there is someone waiting for a wakeup.

Running kvm-unit-test/vmexit.flat with APICv disabled, most interrupt
injection tests (tscdeadline*, self_ipi*, x2apic_self_ipi*) improve
by around 600 cpu cycles.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kernel/exit.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 91a43e57a32e..a38a08dbf85e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -234,8 +234,6 @@ int rcuwait_wake_up(struct rcuwait *w)
 	int ret = 0;
 	struct task_struct *task;
 
-	rcu_read_lock();
-
 	/*
 	 * Order condition vs @task, such that everything prior to the load
 	 * of @task is visible. This is the condition as to why the user called
@@ -245,10 +243,22 @@ int rcuwait_wake_up(struct rcuwait *w)
 	 *    WAIT                WAKE
 	 *    [S] tsk = current	  [S] cond = true
 	 *        MB (A)	      MB (B)
-	 *    [L] cond		  [L] tsk
+	 *    [L] cond		  [L] rcuwait_active(w)
+	 *                            task = rcu_dereference(w->task)
 	 */
 	smp_mb(); /* (B) */
 
+#ifdef CONFIG_PREEMPT_RCU
+	/*
+	 * The cost of rcu_read_lock() dominates for preemptible RCU,
+	 * avoid it if possible.
+	 */
+	if (!rcuwait_active(w))
+		return ret;
+#endif
+
+	rcu_read_lock();
+
 	task = rcu_dereference(w->task);
 	if (task)
 		ret = wake_up_process(task);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
  2021-10-20 11:06 [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed Paolo Bonzini
@ 2021-10-20 11:17 ` Peter Zijlstra
  2021-10-20 11:37   ` Paolo Bonzini
  2021-10-20 11:52   ` Paolo Bonzini
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2021-10-20 11:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, Davidlohr Bueso, Oleg Nesterov, Ingo Molnar,
	Paul E . McKenney, Wanpeng Li

On Wed, Oct 20, 2021 at 07:06:38AM -0400, Paolo Bonzini wrote:
> In some cases, rcuwait_wake_up can be called even if the actual likelihood
> of a wakeup is very low.  If CONFIG_PREEMPT_RCU is active, the resulting
> rcu_read_lock/rcu_read_unlock pair can be relatively expensive, and in
> fact it is unnecessary when there is no w->task to keep alive: the
> memory barrier before the read is what matters in order to avoid missed
> wakeups.
> 
> Therefore, do an early check of w->task right after the barrier, and skip
> rcu_read_lock/rcu_read_unlock unless there is someone waiting for a wakeup.
> 
> Running kvm-unit-test/vmexit.flat with APICv disabled, most interrupt
> injection tests (tscdeadline*, self_ipi*, x2apic_self_ipi*) improve
> by around 600 cpu cycles.

*how* ?!?

AFAICT, rcu_read_lock() for PREEMPT_RCU is:

  WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
  barrier();

Paul?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
  2021-10-20 11:17 ` Peter Zijlstra
@ 2021-10-20 11:37   ` Paolo Bonzini
  2021-10-22  4:11     ` Lai Jiangshan
  2021-10-20 11:52   ` Paolo Bonzini
  1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2021-10-20 11:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, kvm, Davidlohr Bueso, Oleg Nesterov, Ingo Molnar,
	Paul E . McKenney, Wanpeng Li

On 20/10/21 13:17, Peter Zijlstra wrote:
> AFAICT, rcu_read_lock() for PREEMPT_RCU is:
> 
>    WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
>    barrier();

rcu_read_unlock() is the expensive one if you need to go down 
rcu_read_unlock_special().

Paolo

> Paul?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
  2021-10-20 11:17 ` Peter Zijlstra
  2021-10-20 11:37   ` Paolo Bonzini
@ 2021-10-20 11:52   ` Paolo Bonzini
       [not found]     ` <CANRm+CyX+ZZh+LbLPBXEfMoExkx78qHpP-=yFCop9gX+LQeWDQ@mail.gmail.com>
  1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2021-10-20 11:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, kvm, Davidlohr Bueso, Oleg Nesterov, Ingo Molnar,
	Paul E . McKenney, Wanpeng Li

On 20/10/21 13:17, Peter Zijlstra wrote:
> On Wed, Oct 20, 2021 at 07:06:38AM -0400, Paolo Bonzini wrote:
>> In some cases, rcuwait_wake_up can be called even if the actual likelihood
>> of a wakeup is very low.  If CONFIG_PREEMPT_RCU is active, the resulting
>> rcu_read_lock/rcu_read_unlock pair can be relatively expensive, and in
>> fact it is unnecessary when there is no w->task to keep alive: the
>> memory barrier before the read is what matters in order to avoid missed
>> wakeups.
>>
>> Therefore, do an early check of w->task right after the barrier, and skip
>> rcu_read_lock/rcu_read_unlock unless there is someone waiting for a wakeup.
>>
>> Running kvm-unit-test/vmexit.flat with APICv disabled, most interrupt
>> injection tests (tscdeadline*, self_ipi*, x2apic_self_ipi*) improve
>> by around 600 cpu cycles.
> 
> *how* ?!?
> 
> AFAICT, rcu_read_lock() for PREEMPT_RCU is:
> 
>    WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
>    barrier();
> 
> Paul?

Wanpeng, can you share your full .config?

Paolo


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
       [not found]     ` <CANRm+CyX+ZZh+LbLPBXEfMoExkx78qHpP-=yFCop9gX+LQeWDQ@mail.gmail.com>
@ 2021-10-20 12:04       ` Paolo Bonzini
       [not found]         ` <CANRm+Cxpav5FqCBZoQv5BLnC160_RA3YDJc_CFabW7oMBFQ_5g@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2021-10-20 12:04 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Peter Zijlstra, LKML, kvm, Davidlohr Bueso, Oleg Nesterov,
	Ingo Molnar, Paul E . McKenney, Wanpeng Li

On 20/10/21 14:01, Wanpeng Li wrote:
> Yes, in the attachment.
> 
>      Wanpeng

This one does not have CONFIG_PREEMPT=y, let alone CONFIG_PREEMPT_RCU. 
It's completely impossible that this patch has an effect without those 
options.

Paolo


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
  2021-10-20 11:37   ` Paolo Bonzini
@ 2021-10-22  4:11     ` Lai Jiangshan
  0 siblings, 0 replies; 7+ messages in thread
From: Lai Jiangshan @ 2021-10-22  4:11 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, LKML, kvm, Davidlohr Bueso, Oleg Nesterov,
	Ingo Molnar, Paul E . McKenney, Wanpeng Li

On Wed, Oct 20, 2021 at 7:39 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 20/10/21 13:17, Peter Zijlstra wrote:
> > AFAICT, rcu_read_lock() for PREEMPT_RCU is:
> >
> >    WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
> >    barrier();
>
> rcu_read_unlock() is the expensive one if you need to go down
> rcu_read_unlock_special().
>

If "actual likelihood of a wakeup is very low." as stated in the changelog,
the likelihood of rcu_read_unlock_special() is also very low.

rcu_read_lock() for PREEMPT_RCU is a function call, is it relevant?

(It is possible to remove the function call if the include-hell can
be resolved or remove the function call via LTO or just remove the
function call in X86 via percpu.)

Thanks
Lai

> Paolo
>
> > Paul?
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed
       [not found]         ` <CANRm+Cxpav5FqCBZoQv5BLnC160_RA3YDJc_CFabW7oMBFQ_5g@mail.gmail.com>
@ 2021-10-22 13:59           ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2021-10-22 13:59 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Paolo Bonzini, LKML, kvm, Davidlohr Bueso, Oleg Nesterov,
	Ingo Molnar, Paul E . McKenney, Wanpeng Li

On Wed, Oct 20, 2021 at 08:08:41PM +0800, Wanpeng Li wrote:
> On Wed, 20 Oct 2021 at 20:04, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 20/10/21 14:01, Wanpeng Li wrote:
> > > Yes, in the attachment.
> > >
> > >      Wanpeng
> >
> > This one does not have CONFIG_PREEMPT=y, let alone CONFIG_PREEMPT_RCU.
> > It's completely impossible that this patch has an effect without those
> > options.
> 
> Sorry, should be this one in the attachment.

Uhhmmm.. you have lockdep enabled. You know you shouldn't be doing
performance measurements with lockdep on, right?

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-22 13:59 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-20 11:06 [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed Paolo Bonzini
2021-10-20 11:17 ` Peter Zijlstra
2021-10-20 11:37   ` Paolo Bonzini
2021-10-22  4:11     ` Lai Jiangshan
2021-10-20 11:52   ` Paolo Bonzini
     [not found]     ` <CANRm+CyX+ZZh+LbLPBXEfMoExkx78qHpP-=yFCop9gX+LQeWDQ@mail.gmail.com>
2021-10-20 12:04       ` Paolo Bonzini
     [not found]         ` <CANRm+Cxpav5FqCBZoQv5BLnC160_RA3YDJc_CFabW7oMBFQ_5g@mail.gmail.com>
2021-10-22 13:59           ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox