kvm.vger.kernel.org archive mirror
* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
       [not found]                   ` <20180711164952.GA29994@linux.vnet.ibm.com>
@ 2018-07-11 17:03                     ` David Woodhouse
  2018-07-11 17:48                       ` Paul E. McKenney
                                         ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: David Woodhouse @ 2018-07-11 17:03 UTC (permalink / raw)
  To: paulmck; +Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm

On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
> And here is an updated v4.15 patch with Marius's Reported-by and David's
> fix to my lost exclamation point.

Thanks. Are you sending the original version of that to Linus? It'd be
useful to have the commit ID so that we can watch for it landing, and
chase this one up to Greg.

As discussed on IRC, this patch reduces synchronize_sched() latency for
us from ~4600s to ~160ms, which is nice.

However, it isn't going to be sufficient in the NO_HZ_FULL case. For
that you want a patch like the one below, which happily reduces the
latency in our (!NO_HZ_FULL) case still further to ~40ms.

Adding kvm list for better review...

From: David Woodhouse <dwmw@amazon.co.uk>
Subject: [PATCH] kvm/x86: Inform RCU of quiescent state when entering guest mode

RCU can spend long periods of time waiting for a CPU which is actually in
KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
modes, and don't wait for it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kvm/x86.c      |  2 ++
 include/linux/rcutree.h |  2 ++
 kernel/rcu/tree.c       | 16 ++++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..b0c82f70afa7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
 	}
 
+	rcu_kvm_enter();
 	kvm_x86_ops->run(vcpu);
+	rcu_kvm_exit();
 
 	/*
 	 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 914655848ef6..6d07af5a50fc 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -82,6 +82,8 @@ void cond_synchronize_sched(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
+void rcu_kvm_enter(void);
+void rcu_kvm_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index aa7cade1b9f3..df7893273939 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1019,6 +1019,22 @@ void rcu_irq_enter_irqson(void)
 	local_irq_restore(flags);
 }
 
+/*
+ * These are currently identical to the _idle_ versions but let's
+ * explicitly have separate copies to keep Paul honest in future.
+ */
+void rcu_kvm_enter(void)
+{
+	rcu_idle_enter();
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_enter);
+
+void rcu_kvm_exit(void)
+{
+	rcu_idle_exit();
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_exit);
+
 /**
  * rcu_is_watching - see if RCU thinks that the current CPU is idle
  *
-- 
2.17.1


-- 
dwmw2

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 17:03                     ` [RFC] Make need_resched() return true when rcu_urgent_qs requested David Woodhouse
@ 2018-07-11 17:48                       ` Paul E. McKenney
  2018-07-11 18:01                         ` [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode David Woodhouse
  2018-07-11 18:31                       ` [RFC] Make need_resched() return true when rcu_urgent_qs requested Christian Borntraeger
  2018-07-19  0:32                       ` Frederic Weisbecker
  2 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 17:48 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 06:03:42PM +0100, David Woodhouse wrote:
> On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
> > And here is an updated v4.15 patch with Marius's Reported-by and David's
> > fix to my lost exclamation point.
> 
> Thanks. Are you sending the original version of that to Linus? It'd be
> useful to have the commit ID so that we can watch for it landing, and
> chase this one up to Greg.

That would be great!  The commit ID is currently 6d1b6b684e1f ("rcu: Make
need_resched() respond to urgent RCU-QS needs"), but is subject to change
given the likely need to rebase in order to fix bugs in commits preceding
that one.  Given your Reported-by, you will be CCed when it reaches -tip,
won't you?  At that point, the commit ID would be set in stone.

Either way, I would very much welcome any help with -stable.  I would
be happy to send you an email when its commit ID becomes set in stone,
for example, if that would help.

> As discussed on IRC, this patch reduces synchronize_sched() latency for
> us from ~4600s to ~160ms, which is nice.

Woo-hoo!!!  ;-)

> However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> that you want a patch like the one below, which happily reduces the
> latency in our (!NO_HZ_FULL) case still further to ~40ms.

Even better, good stuff, thank you!

> Adding kvm list for better review...

And a comment below.

							Thanx, Paul

> From: David Woodhouse <dwmw@amazon.co.uk>
> Subject: [PATCH] kvm/x86: Inform RCU of quiescent state when entering guest mode
> 
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/kvm/x86.c      |  2 ++
>  include/linux/rcutree.h |  2 ++
>  kernel/rcu/tree.c       | 16 ++++++++++++++++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>  	}
> 
> +	rcu_kvm_enter();
>  	kvm_x86_ops->run(vcpu);
> +	rcu_kvm_exit();
> 
>  	/*
>  	 * Do this here before restoring debug registers on the host.  And
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 914655848ef6..6d07af5a50fc 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -82,6 +82,8 @@ void cond_synchronize_sched(unsigned long oldstate);
> 
>  void rcu_idle_enter(void);
>  void rcu_idle_exit(void);
> +void rcu_kvm_enter(void);
> +void rcu_kvm_exit(void);
>  void rcu_irq_enter(void);
>  void rcu_irq_exit(void);
>  void rcu_irq_enter_irqson(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index aa7cade1b9f3..df7893273939 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1019,6 +1019,22 @@ void rcu_irq_enter_irqson(void)
>  	local_irq_restore(flags);
>  }
> 
> +/*
> + * These are currently identical to the _idle_ versions but let's
> + * explicitly have separate copies to keep Paul honest in future.
> + */
> +void rcu_kvm_enter(void)
> +{
> +	rcu_idle_enter();
> +}
> +EXPORT_SYMBOL_GPL(rcu_kvm_enter);
> +
> +void rcu_kvm_exit(void)
> +{
> +	rcu_idle_exit();
> +}
> +EXPORT_SYMBOL_GPL(rcu_kvm_exit);

These look good, but we also need these in include/linux/rcutiny.h:

	static inline void rcu_kvm_enter(void) { }
	static inline void rcu_kvm_exit(void) { }

Unless KVM is excluded on !SMP systems or some such.

Alternatively, you could just have a single pair of static inlines in
include/linux/rcupdate.h (after the #include of rcutree.h and rcutiny.h)
that mapped the _kvm_ functions to the _idle_ functions.  Your choice,
I am fine either way.
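For illustration, Paul's second alternative can be mocked up in userspace as follows. This is a sketch, not the real kernel headers: the rcu_idle_*() bodies here are invented stand-ins, and the point is only the shape of the mapping, where the _kvm_ entry points forward to the _idle_ ones so that both Tiny and Tree RCU builds pick up definitions from a single place.

```c
/* Userspace mock-up of the rcupdate.h alternative Paul suggests: one
 * pair of static inlines mapping the _kvm_ entry points onto the
 * _idle_ ones, so both Tiny and Tree RCU builds get definitions from
 * a single place. The rcu_idle_*() bodies are stand-ins, not the
 * kernel's. */
#include <stdbool.h>

static bool cpu_in_eqs; /* stand-in for the CPU's extended-QS state */

static void rcu_idle_enter(void) { cpu_in_eqs = true; }
static void rcu_idle_exit(void)  { cpu_in_eqs = false; }

/* The suggested rcupdate.h mapping: it relies only on the _idle_ API,
 * which both RCU flavours already provide, so no per-flavour stubs
 * are needed. */
static inline void rcu_kvm_enter(void) { rcu_idle_enter(); }
static inline void rcu_kvm_exit(void)  { rcu_idle_exit(); }
```

The cost of this variant is that the _kvm_ and _idle_ paths can never diverge without moving the functions again, which is exactly the trade-off discussed above.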

> +
>  /**
>   * rcu_is_watching - see if RCU thinks that the current CPU is idle
>   *
> -- 
> 2.17.1
> 
> 
> -- 
> dwmw2
> 


* [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 17:48                       ` Paul E. McKenney
@ 2018-07-11 18:01                         ` David Woodhouse
  2018-07-11 18:20                           ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-11 18:01 UTC (permalink / raw)
  To: peterz, mhillenb, linux-kernel, kvm, paulmck

From: David Woodhouse <dwmw@amazon.co.uk>

RCU can spend long periods of time waiting for a CPU which is actually in
KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
modes, and don't wait for it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kvm/x86.c       | 2 ++
 include/linux/rcupdate.h | 7 +++++++
 kernel/rcu/tree.c        | 2 ++
 3 files changed, 11 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..b0c82f70afa7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
 	}
 
+	rcu_kvm_enter();
 	kvm_x86_ops->run(vcpu);
+	rcu_kvm_exit();
 
 	/*
 	 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 65163aa0bb04..1325f9d9ce00 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -212,6 +212,13 @@ do { \
 #error "Unknown RCU implementation specified to kernel configuration"
 #endif
 
+/*
+ * These are currently identical to the _idle_ versions but let's
+ * explicitly have separate copies to keep Paul honest in future.
+ */
+static inline void rcu_kvm_enter(void) { rcu_idle_enter(); }
+static inline void rcu_kvm_exit(void) { rcu_idle_exit(); }
+
 /*
  * The init_rcu_head_on_stack() and destroy_rcu_head_on_stack() calls
  * are needed for dynamic initialization and destruction of rcu_head
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index aa7cade1b9f3..d38c381bf4e3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -760,6 +760,7 @@ void rcu_idle_enter(void)
 	lockdep_assert_irqs_disabled();
 	rcu_eqs_enter(false);
 }
+EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
@@ -913,6 +914,7 @@ void rcu_idle_exit(void)
 	rcu_eqs_exit(false);
 	local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(rcu_idle_exit);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
-- 
2.17.1


* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 18:01                         ` [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode David Woodhouse
@ 2018-07-11 18:20                           ` Paul E. McKenney
  2018-07-11 18:36                             ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 18:20 UTC (permalink / raw)
  To: David Woodhouse; +Cc: peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>

And idiot here forgot about some of the debugging code in RCU's dyntick-idle
code.  I will reply with a fixed patch.

The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
so should be OK for testing, just not for mainline.
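The sort of mismatch CONFIG_RCU_EQS_DEBUG catches can be sketched with an invented, much-simplified balance check. This is not the kernel's code: the real checks live in rcu_eqs_enter()/rcu_eqs_exit(), track nesting rather than a flag, and WARN rather than abort.

```c
/* Invented, much-simplified model of CONFIG_RCU_EQS_DEBUG-style
 * checking: extended-quiescent-state enters and exits must balance,
 * and a debug build complains the moment they do not. */
#include <assert.h>

static int eqs_nesting = 1; /* 1 = running normally, 0 = in an EQS */

static void debug_eqs_enter(void)
{
	assert(eqs_nesting == 1); /* would splat: already in an EQS */
	eqs_nesting = 0;
}

static void debug_eqs_exit(void)
{
	assert(eqs_nesting == 0); /* would splat: no EQS to exit */
	eqs_nesting = 1;
}
```

A call path that entered an EQS twice in a row would pass silently without the debug config but trip the check with it, which is why the earlier version was flagged as fine for testing but not for mainline.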

								Thanx, Paul

> ---
>  arch/x86/kvm/x86.c       | 2 ++
>  include/linux/rcupdate.h | 7 +++++++
>  kernel/rcu/tree.c        | 2 ++
>  3 files changed, 11 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>  	}
> 
> +	rcu_kvm_enter();
>  	kvm_x86_ops->run(vcpu);
> +	rcu_kvm_exit();
> 
>  	/*
>  	 * Do this here before restoring debug registers on the host.  And
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 65163aa0bb04..1325f9d9ce00 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -212,6 +212,13 @@ do { \
>  #error "Unknown RCU implementation specified to kernel configuration"
>  #endif
> 
> +/*
> + * These are currently identical to the _idle_ versions but let's
> + * explicitly have separate copies to keep Paul honest in future.
> + */
> +static inline void rcu_kvm_enter(void) { rcu_idle_enter(); }
> +static inline void rcu_kvm_exit(void) { rcu_idle_exit(); }
> +
>  /*
>   * The init_rcu_head_on_stack() and destroy_rcu_head_on_stack() calls
>   * are needed for dynamic initialization and destruction of rcu_head
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index aa7cade1b9f3..d38c381bf4e3 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -760,6 +760,7 @@ void rcu_idle_enter(void)
>  	lockdep_assert_irqs_disabled();
>  	rcu_eqs_enter(false);
>  }
> +EXPORT_SYMBOL_GPL(rcu_idle_enter);
> 
>  #ifdef CONFIG_NO_HZ_FULL
>  /**
> @@ -913,6 +914,7 @@ void rcu_idle_exit(void)
>  	rcu_eqs_exit(false);
>  	local_irq_restore(flags);
>  }
> +EXPORT_SYMBOL_GPL(rcu_idle_exit);
> 
>  #ifdef CONFIG_NO_HZ_FULL
>  /**
> -- 
> 2.17.1
> 


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 17:03                     ` [RFC] Make need_resched() return true when rcu_urgent_qs requested David Woodhouse
  2018-07-11 17:48                       ` Paul E. McKenney
@ 2018-07-11 18:31                       ` Christian Borntraeger
  2018-07-11 20:17                         ` Paul E. McKenney
  2018-07-19  0:32                       ` Frederic Weisbecker
  2 siblings, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-11 18:31 UTC (permalink / raw)
  To: David Woodhouse, paulmck; +Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm

So why is the rcu_virt_note_context_switch(smp_processor_id()) call
in guest_enter_irqoff() not good enough?

This was actually supposed to tell RCU that being in the guest
is an extended quiescent period (like userspace).

What has changed?



On 07/11/2018 07:03 PM, David Woodhouse wrote:
> On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
>> And here is an updated v4.15 patch with Marius's Reported-by and David's
>> fix to my lost exclamation point.
> 
> Thanks. Are you sending the original version of that to Linus? It'd be
> useful to have the commit ID so that we can watch for it landing, and
> chase this one up to Greg.
> 
> As discussed on IRC, this patch reduces synchronize_sched() latency for
> us from ~4600s to ~160ms, which is nice.
> 
> However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> that you want a patch like the one below, which happily reduces the
> latency in our (!NO_HZ_FULL) case still further to ~40ms.
> 
> Adding kvm list for better review...
> 
> From: David Woodhouse <dwmw@amazon.co.uk>
> Subject: [PATCH] kvm/x86: Inform RCU of quiescent state when entering guest mode
> 
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/kvm/x86.c      |  2 ++
>  include/linux/rcutree.h |  2 ++
>  kernel/rcu/tree.c       | 16 ++++++++++++++++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>  	}
> 
> +	rcu_kvm_enter();
>  	kvm_x86_ops->run(vcpu);
> +	rcu_kvm_exit();
> 
>  	/*
>  	 * Do this here before restoring debug registers on the host.  And
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 914655848ef6..6d07af5a50fc 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -82,6 +82,8 @@ void cond_synchronize_sched(unsigned long oldstate);
> 
>  void rcu_idle_enter(void);
>  void rcu_idle_exit(void);
> +void rcu_kvm_enter(void);
> +void rcu_kvm_exit(void);
>  void rcu_irq_enter(void);
>  void rcu_irq_exit(void);
>  void rcu_irq_enter_irqson(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index aa7cade1b9f3..df7893273939 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1019,6 +1019,22 @@ void rcu_irq_enter_irqson(void)
>  	local_irq_restore(flags);
>  }
> 
> +/*
> + * These are currently identical to the _idle_ versions but let's
> + * explicitly have separate copies to keep Paul honest in future.
> + */
> +void rcu_kvm_enter(void)
> +{
> +	rcu_idle_enter();
> +}
> +EXPORT_SYMBOL_GPL(rcu_kvm_enter);
> +
> +void rcu_kvm_exit(void)
> +{
> +	rcu_idle_exit();
> +}
> +EXPORT_SYMBOL_GPL(rcu_kvm_exit);
> +
>  /**
>   * rcu_is_watching - see if RCU thinks that the current CPU is idle
>   *
> 


* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 18:20                           ` Paul E. McKenney
@ 2018-07-11 18:36                             ` Paul E. McKenney
  2018-07-11 18:39                               ` Christian Borntraeger
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 18:36 UTC (permalink / raw)
  To: David Woodhouse; +Cc: peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > RCU can spend long periods of time waiting for a CPU which is actually in
> > KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > modes, and don't wait for it.
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> 
> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> code.  I will reply with a fixed patch.
> 
> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
> so should be OK for testing, just not for mainline.

And here is the updated code that allegedly avoids splatting when run with
CONFIG_RCU_EQS_DEBUG.

Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
Author: David Woodhouse <dwmw@amazon.co.uk>
Date:   Wed Jul 11 19:01:01 2018 +0100

    kvm/x86: Inform RCU of quiescent state when entering guest mode
    
    RCU can spend long periods of time waiting for a CPU which is actually in
    KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
    modes, and don't wait for it.
    
    Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..b0c82f70afa7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
 	}
 
+	rcu_kvm_enter();
 	kvm_x86_ops->run(vcpu);
+	rcu_kvm_exit();
 
 	/*
 	 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7fa4fb9e899e..4b2d691e453f 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -85,6 +85,8 @@ static inline void rcu_virt_note_context_switch(int cpu) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle_exit(void) { }
+static inline void rcu_kvm_enter(void) { }
+static inline void rcu_kvm_exit(void) { }
 static inline void rcu_irq_enter(void) { }
 static inline void rcu_irq_exit_irqson(void) { }
 static inline void rcu_irq_enter_irqson(void) { }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 7f83179177d1..48ce58b53ece 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -55,6 +55,8 @@ void cond_synchronize_rcu(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
+void rcu_kvm_enter(void);
+void rcu_kvm_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 765c81dd675e..0c0672faa6d1 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -583,6 +583,24 @@ void rcu_idle_enter(void)
 	rcu_eqs_enter(false);
 }
 
+/**
+ * rcu_kvm_enter - inform RCU that current CPU is entering a guest OS
+ *
+ * Enter guest-OS mode, in other words, -leave- the mode in which RCU
+ * read-side critical sections can occur.  (Though RCU read-side critical
+ * sections can occur in irq handlers from guest OSes, a possibility
+ * handled by irq_enter() and irq_exit().)
+ *
+ * If you add or remove a call to rcu_kvm_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_kvm_enter(void)
+{
+	lockdep_assert_irqs_disabled();
+	rcu_eqs_enter(true);
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_enter);
+
 #ifdef CONFIG_NO_HZ_FULL
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
@@ -747,6 +765,22 @@ void rcu_idle_exit(void)
 	local_irq_restore(flags);
 }
 
+/**
+ * rcu_kvm_exit - inform RCU that current CPU is leaving a guest OS
+ *
+ * Exit guest-OS mode, in other words, -enter- the mode in which RCU
+ * read-side critical sections can occur.
+ *
+ * If you add or remove a call to rcu_kvm_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_kvm_exit(void)
+{
+	lockdep_assert_irqs_disabled();
+	rcu_eqs_exit(true);
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_exit);
+
 #ifdef CONFIG_NO_HZ_FULL
 /**
  * rcu_user_exit - inform RCU that we are exiting userspace.


* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 18:36                             ` Paul E. McKenney
@ 2018-07-11 18:39                               ` Christian Borntraeger
  2018-07-11 20:27                                 ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-11 18:39 UTC (permalink / raw)
  To: paulmck, David Woodhouse; +Cc: peterz, mhillenb, linux-kernel, kvm



On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>
>>> RCU can spend long periods of time waiting for a CPU which is actually in
>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>> modes, and don't wait for it.
>>>
>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>
>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
>> code.  I will reply with a fixed patch.
>>
>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
>> so should be OK for testing, just not for mainline.
> 
> And here is the updated code that allegedly avoids splatting when run with
> CONFIG_RCU_EQS_DEBUG.
> 
> Thoughts?
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> Author: David Woodhouse <dwmw@amazon.co.uk>
> Date:   Wed Jul 11 19:01:01 2018 +0100
> 
>     kvm/x86: Inform RCU of quiescent state when entering guest mode
>     
>     RCU can spend long periods of time waiting for a CPU which is actually in
>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>     modes, and don't wait for it.
>     
>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>  	}
> 
> +	rcu_kvm_enter();
>  	kvm_x86_ops->run(vcpu);
> +	rcu_kvm_exit();

As indicated in my other mail, this is supposed to be handled in the guest_enter|exit_ calls around
the run function, which would also cover other architectures. So if the guest_enter_irqoff code is
not good enough, we should rather fix that instead of adding another rcu hint.


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 18:31                       ` [RFC] Make need_resched() return true when rcu_urgent_qs requested Christian Borntraeger
@ 2018-07-11 20:17                         ` Paul E. McKenney
  2018-07-11 20:19                           ` David Woodhouse
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 20:17 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, Peter Zijlstra, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 08:31:55PM +0200, Christian Borntraeger wrote:
> So why is the rcu_virt_note_context_switch(smp_processor_id()) call
> in guest_enter_irqoff() not good enough?
> 
> This was actually supposed to tell RCU that being in the guest
> is an extended quiescent period (like userspace).
> 
> What has changed?

As I understand it, they would like to have their guest run uninterrupted
for extended times.  Because rcu_virt_note_context_switch() is a
point-in-time quiescent state, it cannot tell RCU about the extended
quiescent state.

Should we replace the current calls to rcu_virt_note_context_switch()
with rcu_kvm_enter() and rcu_kvm_exit()?  Would that be better
than the below architecture-by-architecture approach?
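The distinction can be seen in a toy userspace model. The encoding below is invented for illustration, and the kernel's rcu_dynticks bookkeeping is considerably more subtle: a grace-period poller snapshots a per-CPU count, and later considers the CPU done if it either sits in an extended QS right now or has passed a point-in-time QS since the snapshot.

```c
/* Toy model of point-in-time vs. extended quiescent states. Invented
 * encoding for illustration only; the kernel's rcu_dynticks counters
 * work differently. */
#include <stdbool.h>

struct toy_cpu {
	unsigned long qs_count; /* bumped at each point-in-time QS */
	bool in_eqs;            /* set across an extended QS (idle/user/guest) */
};

/* rcu_virt_note_context_switch()-style: a single point-in-time QS. */
static void toy_note_qs(struct toy_cpu *c) { c->qs_count++; }

/* rcu_kvm_enter()/rcu_kvm_exit()-style: an extended QS. */
static void toy_eqs_enter(struct toy_cpu *c) { c->in_eqs = true; }
static void toy_eqs_exit(struct toy_cpu *c)  { c->in_eqs = false; }

/* Grace-period poller: snapshot, then later check whether the CPU
 * either is in an EQS right now or passed a QS since the snapshot. */
static unsigned long toy_snap(const struct toy_cpu *c)
{
	return c->qs_count;
}

static bool toy_cpu_quiesced(const struct toy_cpu *c, unsigned long snap)
{
	return c->in_eqs || c->qs_count != snap;
}
```

In this model, a vcpu that noted a point-in-time QS before the grace period started and then ran in guest mode for seconds never satisfies the poller, whereas a vcpu inside an eqs_enter()/eqs_exit() pair satisfies it immediately and can simply be ignored for the whole stay.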

							Thanx, Paul

> On 07/11/2018 07:03 PM, David Woodhouse wrote:
> > On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
> >> And here is an updated v4.15 patch with Marius's Reported-by and David's
> >> fix to my lost exclamation point.
> > 
> > Thanks. Are you sending the original version of that to Linus? It'd be
> > useful to have the commit ID so that we can watch for it landing, and
> > chase this one up to Greg.
> > 
> > As discussed on IRC, this patch reduces synchronize_sched() latency for
> > us from ~4600s to ~160ms, which is nice.
> > 
> > However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> > that you want a patch like the one below, which happily reduces the
> > latency in our (!NO_HZ_FULL) case still further to ~40ms.
> > 
> > Adding kvm list for better review...
> > 
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > Subject: [PATCH] kvm/x86: Inform RCU of quiescent state when entering guest mode
> > 
> > RCU can spend long periods of time waiting for a CPU which is actually in
> > KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > modes, and don't wait for it.
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> >  arch/x86/kvm/x86.c      |  2 ++
> >  include/linux/rcutree.h |  2 ++
> >  kernel/rcu/tree.c       | 16 ++++++++++++++++
> >  3 files changed, 20 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 0046aa70205a..b0c82f70afa7 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> >  	}
> > 
> > +	rcu_kvm_enter();
> >  	kvm_x86_ops->run(vcpu);
> > +	rcu_kvm_exit();
> > 
> >  	/*
> >  	 * Do this here before restoring debug registers on the host.  And
> > diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> > index 914655848ef6..6d07af5a50fc 100644
> > --- a/include/linux/rcutree.h
> > +++ b/include/linux/rcutree.h
> > @@ -82,6 +82,8 @@ void cond_synchronize_sched(unsigned long oldstate);
> > 
> >  void rcu_idle_enter(void);
> >  void rcu_idle_exit(void);
> > +void rcu_kvm_enter(void);
> > +void rcu_kvm_exit(void);
> >  void rcu_irq_enter(void);
> >  void rcu_irq_exit(void);
> >  void rcu_irq_enter_irqson(void);
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index aa7cade1b9f3..df7893273939 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1019,6 +1019,22 @@ void rcu_irq_enter_irqson(void)
> >  	local_irq_restore(flags);
> >  }
> > 
> > +/*
> > + * These are currently identical to the _idle_ versions but let's
> > + * explicitly have separate copies to keep Paul honest in future.
> > + */
> > +void rcu_kvm_enter(void)
> > +{
> > +	rcu_idle_enter();
> > +}
> > +EXPORT_SYMBOL_GPL(rcu_kvm_enter);
> > +
> > +void rcu_kvm_exit(void)
> > +{
> > +	rcu_idle_exit();
> > +}
> > +EXPORT_SYMBOL_GPL(rcu_kvm_exit);
> > +
> >  /**
> >   * rcu_is_watching - see if RCU thinks that the current CPU is idle
> >   *
> > 


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 20:17                         ` Paul E. McKenney
@ 2018-07-11 20:19                           ` David Woodhouse
  2018-07-11 21:08                             ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-11 20:19 UTC (permalink / raw)
  To: paulmck, Christian Borntraeger
  Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm


On Wed, 2018-07-11 at 13:17 -0700, Paul E. McKenney wrote:
> As I understand it, they would like to have their guest run uninterrupted
> for extended times.  Because rcu_virt_note_context_switch() is a
> point-in-time quiescent state, it cannot tell RCU about the extended
> quiescent state.
> 
> Should we replace the current calls to rcu_virt_note_context_switch()
> with rcu_kvm_enter() and rcu_kvm_exit()?  Would that be better
> than the below architecture-by-architecture approach?

Yes it would. I was already starting to mutter about needing the same
for ARM and POWER. I'll do a v3 (incorporating your fixes too) in the
morning.

Thanks.

Also... why in $DEITY's name was the existing
rcu_virt_note_context_switch() not actually sufficient? If we had that
there, why did we need an additional explicit calls to rcu_all_qs() in
the KVM loop, or the more complex fixes to need_resched() which
ultimately had the same effect, to avoid ten-second latencies?



* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 18:39                               ` Christian Borntraeger
@ 2018-07-11 20:27                                 ` Paul E. McKenney
  2018-07-11 20:54                                   ` David Woodhouse
  2018-07-11 21:11                                   ` Christian Borntraeger
  0 siblings, 2 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 20:27 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> >> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> >>> From: David Woodhouse <dwmw@amazon.co.uk>
> >>>
> >>> RCU can spend long periods of time waiting for a CPU which is actually in
> >>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >>> modes, and don't wait for it.
> >>>
> >>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> >>
> >> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> >> code.  I will reply with a fixed patch.
> >>
> >> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
> >> so should be OK for testing, just not for mainline.
> > 
> > And here is the updated code that allegedly avoids splatting when run with
> > CONFIG_RCU_EQS_DEBUG.
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > Author: David Woodhouse <dwmw@amazon.co.uk>
> > Date:   Wed Jul 11 19:01:01 2018 +0100
> > 
> >     kvm/x86: Inform RCU of quiescent state when entering guest mode
> >     
> >     RCU can spend long periods of time waiting for a CPU which is actually in
> >     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >     modes, and don't wait for it.
> >     
> >     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 0046aa70205a..b0c82f70afa7 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> >  	}
> > 
> > +	rcu_kvm_enter();
> >  	kvm_x86_ops->run(vcpu);
> > +	rcu_kvm_exit();
> 
> As indicated in my other mail, this is supposed to be handled in the guest_enter|exit_ calls around
> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
> not good enough, we should rather fix that instead of adding another RCU hint.

Something like this, on top of the earlier patch?  I am not at all
confident of this patch because there might be other entry/exit
paths I am missing.  Plus there might be RCU uses on the arch-specific
path to and from the guest OS.

Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b0c82f70afa7..0046aa70205a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,9 +7458,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
 	}
 
-	rcu_kvm_enter();
 	kvm_x86_ops->run(vcpu);
-	rcu_kvm_exit();
 
 	/*
 	 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index d05609ad329d..8d2a9d3073ad 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
 	 * one time slice). Lets treat guest mode as quiescent state, just like
 	 * we do with user-mode execution.
 	 */
-	if (!context_tracking_cpu_is_enabled())
-		rcu_virt_note_context_switch(smp_processor_id());
+	rcu_kvm_enter();
 }
 
 static inline void guest_exit_irqoff(void)
 {
+	rcu_kvm_exit();
 	if (context_tracking_is_enabled())
 		__context_tracking_exit(CONTEXT_GUEST);
 
@@ -143,12 +143,13 @@ static inline void guest_enter_irqoff(void)
 	 */
 	vtime_account_system(current);
 	current->flags |= PF_VCPU;
-	rcu_virt_note_context_switch(smp_processor_id());
+	rcu_kvm_enter();
 }
 
 static inline void guest_exit_irqoff(void)
 {
 	/* Flush the guest cputime we spent on the guest */
+	rcu_kvm_exit();
 	vtime_account_system(current);
 	current->flags &= ~PF_VCPU;
 }
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 4b2d691e453f..a7aa5b3cfb81 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -81,7 +81,6 @@ static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
  */
-static inline void rcu_virt_note_context_switch(int cpu) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle_exit(void) { }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 48ce58b53ece..62b61e579bb4 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -34,17 +34,6 @@ void rcu_softirq_qs(void);
 void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(u64 basem, u64 *nextevt);
 void rcu_cpu_stall_reset(void);
-
-/*
- * Note a virtualization-based context switch.  This is simply a
- * wrapper around rcu_note_context_switch(), which allows TINY_RCU
- * to save a few bytes. The caller must have disabled interrupts.
- */
-static inline void rcu_virt_note_context_switch(int cpu)
-{
-	rcu_note_context_switch(false);
-}
-
 void synchronize_rcu_expedited(void);
 void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
 

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 20:27                                 ` Paul E. McKenney
@ 2018-07-11 20:54                                   ` David Woodhouse
  2018-07-11 21:09                                     ` Paul E. McKenney
  2018-07-11 21:11                                   ` Christian Borntraeger
  1 sibling, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-11 20:54 UTC (permalink / raw)
  To: paulmck, Christian Borntraeger; +Cc: peterz, mhillenb, linux-kernel, kvm

[-- Attachment #1: Type: text/plain, Size: 463 bytes --]

On Wed, 2018-07-11 at 13:27 -0700, Paul E. McKenney wrote:
> 
> Something like this, on top of the earlier patch?  I am not at all
> confident of this patch because there might be other entry/exit
> paths I am missing.  Plus there might be RCU uses on the arch-
> specific path to and from the guest OS.

That was going to be my starting point, yes. With some additional
manual review and testing to cover the "not at all confident" part. :)

Thanks.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5213 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 20:19                           ` David Woodhouse
@ 2018-07-11 21:08                             ` Paul E. McKenney
  2018-07-12 12:00                               ` David Woodhouse
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 21:08 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Wed, Jul 11, 2018 at 09:19:44PM +0100, David Woodhouse wrote:
> On Wed, 2018-07-11 at 13:17 -0700, Paul E. McKenney wrote:
> > As I understand it, they would like to have their guest run uninterrupted
> > for extended times.  Because rcu_virt_note_context_switch() is a
> > point-in-time quiescent state, it cannot tell RCU about the extended
> > quiescent state.
> > 
> > Should we replace the current calls to rcu_virt_note_context_switch()
> > with rcu_kvm_enter() and rcu_kvm_exit()?  Would that be better
> > than the below architecture-by-architecture approach?
> 
> Yes it would. I was already starting to mutter about needing the same
> for ARM and POWER. I'll do a v3 (incorporating your fixes too) in the
> morning.
> 
> Thanks.
> 
> Also... why in $DEITY's name was the existing
> rcu_virt_note_context_switch() not actually sufficient? If we had that
> there, why did we need additional explicit calls to rcu_all_qs() in
> the KVM loop, or the more complex fixes to need_resched() which
> ultimately had the same effect, to avoid ten-second latencies?

My guess is that this was because control passed through
rcu_virt_note_context_switch() only once, and then subsequent
scheduling-clock interrupts bypassed it.  But that is just a guess.
I need to defer to someone who understands the KVM code better than I do.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 20:54                                   ` David Woodhouse
@ 2018-07-11 21:09                                     ` Paul E. McKenney
  0 siblings, 0 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 21:09 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 09:54:52PM +0100, David Woodhouse wrote:
> On Wed, 2018-07-11 at 13:27 -0700, Paul E. McKenney wrote:
> > 
> > Something like this, on top of the earlier patch?  I am not at all
> > confident of this patch because there might be other entry/exit
> > paths I am missing.  Plus there might be RCU uses on the arch-
> > specific path to and from the guest OS.
> 
> That was going to be my starting point, yes. With some additional
> manual review and testing to cover the "not at all confident" part. :)

Please feel free to change as needed, up to and including throwing
my patch away and starting over.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 20:27                                 ` Paul E. McKenney
  2018-07-11 20:54                                   ` David Woodhouse
@ 2018-07-11 21:11                                   ` Christian Borntraeger
  2018-07-11 21:32                                     ` Paul E. McKenney
  1 sibling, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-11 21:11 UTC (permalink / raw)
  To: paulmck; +Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm



On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
>>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
>>>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>>>
>>>>> RCU can spend long periods of time waiting for a CPU which is actually in
>>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>>>> modes, and don't wait for it.
>>>>>
>>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>>>
>>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
>>>> code.  I will reply with a fixed patch.
>>>>
>>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
>>>> so should be OK for testing, just not for mainline.
>>>
>>> And here is the updated code that allegedly avoids splatting when run with
>>> CONFIG_RCU_EQS_DEBUG.
>>>
>>> Thoughts?
>>>
>>> 							Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
>>> Author: David Woodhouse <dwmw@amazon.co.uk>
>>> Date:   Wed Jul 11 19:01:01 2018 +0100
>>>
>>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
>>>     
>>>     RCU can spend long periods of time waiting for a CPU which is actually in
>>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>>     modes, and don't wait for it.
>>>     
>>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 0046aa70205a..b0c82f70afa7 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>>>  	}
>>>
>>> +	rcu_kvm_enter();
>>>  	kvm_x86_ops->run(vcpu);
>>> +	rcu_kvm_exit();
>>
>> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
>> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
>> not good enough, we should rather fix that instead of adding another rcu hint.
> 
> Something like this, on top of the earlier patch?  I am not at all
> confident of this patch because there might be other entry/exit
> paths I am missing.  Plus there might be RCU uses on the arch-specific
> path to and from the guest OS.
> 
> Thoughts?
> 

If you instrument guest_enter/exit, you should cover all cases and all
architectures as far as I can tell. FWIW, we did this rcu_note thing back then
precisely to handle this particular case of long-running guests blocking RCU
for many seconds. And I am pretty sure that it did help back then.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 21:11                                   ` Christian Borntraeger
@ 2018-07-11 21:32                                     ` Paul E. McKenney
  2018-07-11 21:39                                       ` Christian Borntraeger
  2018-07-11 23:37                                       ` Paul E. McKenney
  0 siblings, 2 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 21:32 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> >>
> >>
> >> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> >>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> >>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> >>>>> From: David Woodhouse <dwmw@amazon.co.uk>
> >>>>>
> >>>>> RCU can spend long periods of time waiting for a CPU which is actually in
> >>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >>>>> modes, and don't wait for it.
> >>>>>
> >>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> >>>>
> >>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> >>>> code.  I will reply with a fixed patch.
> >>>>
> >>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
> >>>> so should be OK for testing, just not for mainline.
> >>>
> >>> And here is the updated code that allegedly avoids splatting when run with
> >>> CONFIG_RCU_EQS_DEBUG.
> >>>
> >>> Thoughts?
> >>>
> >>> 							Thanx, Paul
> >>>
> >>> ------------------------------------------------------------------------
> >>>
> >>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> >>> Author: David Woodhouse <dwmw@amazon.co.uk>
> >>> Date:   Wed Jul 11 19:01:01 2018 +0100
> >>>
> >>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
> >>>     
> >>>     RCU can spend long periods of time waiting for a CPU which is actually in
> >>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >>>     modes, and don't wait for it.
> >>>     
> >>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> >>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> >>>
> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>> index 0046aa70205a..b0c82f70afa7 100644
> >>> --- a/arch/x86/kvm/x86.c
> >>> +++ b/arch/x86/kvm/x86.c
> >>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> >>>  	}
> >>>
> >>> +	rcu_kvm_enter();
> >>>  	kvm_x86_ops->run(vcpu);
> >>> +	rcu_kvm_exit();
> >>
> >> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
> >> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
> >> not good enough, we should rather fix that instead of adding another rcu hint.
> > 
> > Something like this, on top of the earlier patch?  I am not at all
> > confident of this patch because there might be other entry/exit
> > paths I am missing.  Plus there might be RCU uses on the arch-specific
> > path to and from the guest OS.
> > 
> > Thoughts?
> > 
> 
> > If you instrument guest_enter/exit, you should cover all cases and all architectures as far
> as I can tell. FWIW, we did this rcu_note thing back then actually handling this particular
> case of long running guests blocking rcu for many seconds. And I am pretty sure that
> this did help back then.

And my second patch in the email you replied to replaced the only call
to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
but yes, there might well be things I missed.  Let's see what David
comes up with.

What changed was RCU's reactions to longish grace periods.  It used to
be very aggressive about forcing the scheduler to do otherwise-unneeded
context switches, which became a problem somewhere between v4.9 and v4.15.
I therefore reduced the number of such context switches, which in turn
caused KVM to tell RCU about quiescent states way too infrequently.

The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
it tells RCU of an extended duration in the guest, which means that
RCU can ignore the corresponding CPU, which in turn allows the guest
to proceed without any RCU-induced interruptions.

Does that make sense, or am I missing something?  I freely admit to
much ignorance of both kvm and s390!  ;-)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 21:32                                     ` Paul E. McKenney
@ 2018-07-11 21:39                                       ` Christian Borntraeger
  2018-07-11 23:47                                         ` Paul E. McKenney
  2018-07-11 23:37                                       ` Paul E. McKenney
  1 sibling, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-11 21:39 UTC (permalink / raw)
  To: paulmck; +Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm



On 07/11/2018 11:32 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
>>> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
>>>>
>>>>
>>>> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
>>>>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>>>>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
>>>>>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>>>>>
>>>>>>> RCU can spend long periods of time waiting for a CPU which is actually in
>>>>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>>>>>> modes, and don't wait for it.
>>>>>>>
>>>>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>>>>>
>>>>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
>>>>>> code.  I will reply with a fixed patch.
>>>>>>
>>>>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
>>>>>> so should be OK for testing, just not for mainline.
>>>>>
>>>>> And here is the updated code that allegedly avoids splatting when run with
>>>>> CONFIG_RCU_EQS_DEBUG.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> 							Thanx, Paul
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
>>>>> Author: David Woodhouse <dwmw@amazon.co.uk>
>>>>> Date:   Wed Jul 11 19:01:01 2018 +0100
>>>>>
>>>>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
>>>>>     
>>>>>     RCU can spend long periods of time waiting for a CPU which is actually in
>>>>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>>>>     modes, and don't wait for it.
>>>>>     
>>>>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>>>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>>>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
>>>>>
>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>> index 0046aa70205a..b0c82f70afa7 100644
>>>>> --- a/arch/x86/kvm/x86.c
>>>>> +++ b/arch/x86/kvm/x86.c
>>>>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>>>>>  	}
>>>>>
>>>>> +	rcu_kvm_enter();
>>>>>  	kvm_x86_ops->run(vcpu);
>>>>> +	rcu_kvm_exit();
>>>>
>>>> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
>>>> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
>>>> not good enough, we should rather fix that instead of adding another rcu hint.
>>>
>>> Something like this, on top of the earlier patch?  I am not at all
>>> confident of this patch because there might be other entry/exit
>>> paths I am missing.  Plus there might be RCU uses on the arch-specific
> >>> path to and from the guest OS.
>>>
>>> Thoughts?
>>>
>>
> >> If you instrument guest_enter/exit, you should cover all cases and all architectures as far
>> as I can tell. FWIW, we did this rcu_note thing back then actually handling this particular
>> case of long running guests blocking rcu for many seconds. And I am pretty sure that
>> this did help back then.
> 
> And my second patch on the email you replied to replaced the only call
> to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> but yes, there might well be things I missed.  Let's see what David
> comes up with.
> 
> What changed was RCU's reactions to longish grace periods.  It used to
> be very aggressive about forcing the scheduler to do otherwise-unneeded
> context switches, which became a problem somewhere between v4.9 and v4.15.
> I therefore reduced the number of such context switches, which in turn
> caused KVM to tell RCU about quiescent states way too infrequently.

You are talking about
commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
    rcu: Make non-preemptive schedule be Tasks RCU quiescent state

correct? In that case, whatever (properly sent) patch comes up should carry a
Fixes: tag.



> 
> The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> it tells RCU of an extended duration in the guest, which means that
> RCU can ignore the corresponding CPU, which in turn allows the guest
> to proceed without any RCU-induced interruptions.
> 
> Does that make sense, or am I missing something?  I freely admit to
> much ignorance of both kvm and s390!  ;-)

With that explanation it makes perfect sense to replace
rcu_virt_note_context_switch() with rcu_kvm_enter()/rcu_kvm_exit() from an RCU
performance perspective. I assume that rcu_kvm_enter() is not much slower than
rcu_virt_note_context_switch()? We do call it on every guest entry/exit, and a
ping-pong I/O workload can produce plenty of those.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 21:32                                     ` Paul E. McKenney
  2018-07-11 21:39                                       ` Christian Borntraeger
@ 2018-07-11 23:37                                       ` Paul E. McKenney
  2018-07-12  2:15                                         ` Paul E. McKenney
  2018-07-12  6:21                                         ` Christian Borntraeger
  1 sibling, 2 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 23:37 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 02:32:59PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> > 
> > 
> > On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> > > On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> > >>
> > >>
> > >> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > >>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> > >>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > >>>>> From: David Woodhouse <dwmw@amazon.co.uk>
> > >>>>>
> > >>>>> RCU can spend long periods of time waiting for a CPU which is actually in
> > >>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > >>>>> modes, and don't wait for it.
> > >>>>>
> > >>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > >>>>
> > >>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> > >>>> code.  I will reply with a fixed patch.
> > >>>>
> > >>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
> > >>>> so should be OK for testing, just not for mainline.
> > >>>
> > >>> And here is the updated code that allegedly avoids splatting when run with
> > >>> CONFIG_RCU_EQS_DEBUG.
> > >>>
> > >>> Thoughts?
> > >>>
> > >>> 							Thanx, Paul
> > >>>
> > >>> ------------------------------------------------------------------------
> > >>>
> > >>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > >>> Author: David Woodhouse <dwmw@amazon.co.uk>
> > >>> Date:   Wed Jul 11 19:01:01 2018 +0100
> > >>>
> > >>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
> > >>>     
> > >>>     RCU can spend long periods of time waiting for a CPU which is actually in
> > >>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > >>>     modes, and don't wait for it.
> > >>>     
> > >>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > >>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > >>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> > >>>
> > >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > >>> index 0046aa70205a..b0c82f70afa7 100644
> > >>> --- a/arch/x86/kvm/x86.c
> > >>> +++ b/arch/x86/kvm/x86.c
> > >>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > >>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> > >>>  	}
> > >>>
> > >>> +	rcu_kvm_enter();
> > >>>  	kvm_x86_ops->run(vcpu);
> > >>> +	rcu_kvm_exit();
> > >>
> > >> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
> > >> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
> > >> not good enough, we should rather fix that instead of adding another rcu hint.
> > > 
> > > Something like this, on top of the earlier patch?  I am not at all
> > > confident of this patch because there might be other entry/exit
> > > paths I am missing.  Plus there might be RCU uses on the arch-specific
> > > path to and from the guest OS.
> > > 
> > > Thoughts?
> > > 
> > 
> > If you instrument guest_enter/exit, you should cover all cases and all architectures as far
> > as I can tell. FWIW, we did this rcu_note thing back then actually handling this particular
> > case of long running guests blocking rcu for many seconds. And I am pretty sure that
> > this did help back then.
> 
> And my second patch on the email you replied to replaced the only call
> to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> but yes, there might well be things I missed.  Let's see what David
> comes up with.
> 
> What changed was RCU's reactions to longish grace periods.  It used to
> be very aggressive about forcing the scheduler to do otherwise-unneeded
> context switches, which became a problem somewhere between v4.9 and v4.15.
> I therefore reduced the number of such context switches, which in turn
> caused KVM to tell RCU about quiescent states way too infrequently.
> 
> The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> it tells RCU of an extended duration in the guest, which means that
> RCU can ignore the corresponding CPU, which in turn allows the guest
> to proceed without any RCU-induced interruptions.
> 
> Does that make sense, or am I missing something?  I freely admit to
> much ignorance of both kvm and s390!  ;-)

But I am getting some rcutorture near misses on the commit that
introduces rcu_kvm_enter() and rcu_kvm_exit() to the x86 arch-specific
vcpu_enter_guest() function.  These near misses occur when running
rcutorture scenarios TREE01 and TREE03, and in my -rcu tree rather
than the v4.15 version of this patch.

Given that I am making pervasive changes to the way that RCU works,
it might well be that this commit is an innocent bystander.  I will
run tests overnight and let you know what comes up.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 21:39                                       ` Christian Borntraeger
@ 2018-07-11 23:47                                         ` Paul E. McKenney
  2018-07-12  8:31                                           ` David Woodhouse
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-11 23:47 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 11:39:10PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/11/2018 11:32 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> >>
> >>
> >> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> >>> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> >>>>
> >>>>
> >>>> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> >>>>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> >>>>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> >>>>>>> From: David Woodhouse <dwmw@amazon.co.uk>
> >>>>>>>
> >>>>>>> RCU can spend long periods of time waiting for a CPU which is actually in
> >>>>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >>>>>>> modes, and don't wait for it.
> >>>>>>>
> >>>>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> >>>>>>
> >>>>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> >>>>>> code.  I will reply with a fixed patch.
> >>>>>>
> >>>>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
> >>>>>> so should be OK for testing, just not for mainline.
> >>>>>
> >>>>> And here is the updated code that allegedly avoids splatting when run with
> >>>>> CONFIG_RCU_EQS_DEBUG.
> >>>>>
> >>>>> Thoughts?
> >>>>>
> >>>>> 							Thanx, Paul
> >>>>>
> >>>>> ------------------------------------------------------------------------
> >>>>>
> >>>>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> >>>>> Author: David Woodhouse <dwmw@amazon.co.uk>
> >>>>> Date:   Wed Jul 11 19:01:01 2018 +0100
> >>>>>
> >>>>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
> >>>>>     
> >>>>>     RCU can spend long periods of time waiting for a CPU which is actually in
> >>>>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >>>>>     modes, and don't wait for it.
> >>>>>     
> >>>>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> >>>>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >>>>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> >>>>>
> >>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>>>> index 0046aa70205a..b0c82f70afa7 100644
> >>>>> --- a/arch/x86/kvm/x86.c
> >>>>> +++ b/arch/x86/kvm/x86.c
> >>>>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> >>>>>  	}
> >>>>>
> >>>>> +	rcu_kvm_enter();
> >>>>>  	kvm_x86_ops->run(vcpu);
> >>>>> +	rcu_kvm_exit();
> >>>>
> >>>> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
> >>>> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
> >>>> not good enough, we should rather fix that instead of adding another rcu hint.
> >>>
> >>> Something like this, on top of the earlier patch?  I am not at all
> >>> confident of this patch because there might be other entry/exit
> >>> paths I am missing.  Plus there might be RCU uses on the arch-specific
> >>> patch to and from the guest OS.
> >>>
> >>> Thoughts?
> >>>
> >>
> >> If you instrument guest_enter/exit, you should cover all cases and all architectures as far
> >> as I can tell. FWIW, we added this rcu_note call back then precisely to handle this particular
> >> case of long-running guests blocking RCU for many seconds. And I am pretty sure that
> >> it did help back then.
> > 
> > And my second patch on the email you replied to replaced the only call
> > to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> > but yes, there might well be things I missed.  Let's see what David
> > comes up with.
> > 
> > What changed was RCU's reactions to longish grace periods.  It used to
> > be very aggressive about forcing the scheduler to do otherwise-unneeded
> > context switches, which became a problem somewhere between v4.9 and v4.15.
> > I therefore reduced the number of such context switches, which in turn
> > caused KVM to tell RCU about quiescent states way too infrequently.
> 
> You talk about 
> commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
>     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
> 
> correct? In fact, then whatever (properly sent) patch comes up should contain
> a fixes tag.

Not that one, but this one is at least part of the "team":

28053bc72c0e5 ("rcu: Add long-term CPU kicking").  I might need to use
"git bisect" to find the most relevant commit...  :-/

> > The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> > it tells RCU of an extended duration in the guest, which means that
> > RCU can ignore the corresponding CPU, which in turn allows the guest
> > to proceed without any RCU-induced interruptions.
> > 
> > Does that make sense, or am I missing something?  I freely admit to
> > much ignorance of both kvm and s390!  ;-)
> 
> With that explanation it makes perfect sense to replace
> rcu_virt_note_context_switch with rcu_kvm_enter/exit from an RCU performance
> perspective. I assume that rcu_kvm_enter is not much slower than
> rcu_virt_note_context_switch? Because we call it on every guest entry/exit,
> of which we may have plenty under a ping-pong I/O workload.

But is there any way for a guest OS to sneak back out to the hypervisor
without executing one of the rcu_kvm_exit() calls?  If there is, RCU
is broken.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 23:37                                       ` Paul E. McKenney
@ 2018-07-12  2:15                                         ` Paul E. McKenney
  2018-07-12  6:21                                         ` Christian Borntraeger
  1 sibling, 0 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-12  2:15 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 04:37:27PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 02:32:59PM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> > > 
> > > 
> > > On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> > > > On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> > > >>
> > > >>
> > > >> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > > >>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> > > >>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > > >>>>> From: David Woodhouse <dwmw@amazon.co.uk>
> > > >>>>>
> > > >>>>> RCU can spend long periods of time waiting for a CPU which is actually in
> > > >>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > > >>>>> modes, and don't wait for it.
> > > >>>>>
> > > >>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > > >>>>
> > > >>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> > > >>>> code.  I will reply with a fixed patch.
> > > >>>>
> > > >>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
> > > >>>> so should be OK for testing, just not for mainline.
> > > >>>
> > > >>> And here is the updated code that allegedly avoids splatting when run with
> > > >>> CONFIG_RCU_EQS_DEBUG.
> > > >>>
> > > >>> Thoughts?
> > > >>>
> > > >>> 							Thanx, Paul
> > > >>>
> > > >>> ------------------------------------------------------------------------
> > > >>>
> > > >>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > > >>> Author: David Woodhouse <dwmw@amazon.co.uk>
> > > >>> Date:   Wed Jul 11 19:01:01 2018 +0100
> > > >>>
> > > >>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
> > > >>>     
> > > >>>     RCU can spend long periods of time waiting for a CPU which is actually in
> > > >>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > > >>>     modes, and don't wait for it.
> > > >>>     
> > > >>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > > >>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > >>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> > > >>>
> > > >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > >>> index 0046aa70205a..b0c82f70afa7 100644
> > > >>> --- a/arch/x86/kvm/x86.c
> > > >>> +++ b/arch/x86/kvm/x86.c
> > > >>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > > >>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> > > >>>  	}
> > > >>>
> > > >>> +	rcu_kvm_enter();
> > > >>>  	kvm_x86_ops->run(vcpu);
> > > >>> +	rcu_kvm_exit();
> > > >>
> > > >> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
> > > >> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
> > > >> not good enough, we should rather fix that instead of adding another rcu hint.
> > > > 
> > > > Something like this, on top of the earlier patch?  I am not at all
> > > > confident of this patch because there might be other entry/exit
> > > > paths I am missing.  Plus there might be RCU uses on the arch-specific
> > > > patch to and from the guest OS.
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > If you instrument guest_enter/exit, you should cover all cases and all architectures as far
> > > as I can tell. FWIW, we added this rcu_note call back then precisely to handle this particular
> > > case of long-running guests blocking RCU for many seconds. And I am pretty sure that
> > > it did help back then.
> > 
> > And my second patch on the email you replied to replaced the only call
> > to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> > but yes, there might well be things I missed.  Let's see what David
> > comes up with.
> > 
> > What changed was RCU's reactions to longish grace periods.  It used to
> > be very aggressive about forcing the scheduler to do otherwise-unneeded
> > context switches, which became a problem somewhere between v4.9 and v4.15.
> > I therefore reduced the number of such context switches, which in turn
> > caused KVM to tell RCU about quiescent states way too infrequently.
> > 
> > The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> > it tells RCU of an extended duration in the guest, which means that
> > RCU can ignore the corresponding CPU, which in turn allows the guest
> > to proceed without any RCU-induced interruptions.
> > 
> > Does that make sense, or am I missing something?  I freely admit to
> > much ignorance of both kvm and s390!  ;-)
> 
> But I am getting some rcutorture near misses on the commit that
> introduces rcu_kvm_enter() and rcu_kvm_exit() to the x86 arch-specific
> vcpu_enter_guest() function.  These near misses occur when running
> rcutorture scenarios TREE01 and TREE03, and in my -rcu tree rather
> than the v4.15 version of this patch.
> 
> Given that I am making pervasive changes to the way that RCU works,
> it might well be that this commit is an innocent bystander.  I will
> run tests overnight and let you know what comes up.

And thus far, the verdict is "intermittent".  I am starting a new
series with CONFIG_RCU_EQS_DEBUG=y, which should detect any mismatched
rcu_kvm_enter()/rcu_kvm_exit() pairs.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 23:37                                       ` Paul E. McKenney
  2018-07-12  2:15                                         ` Paul E. McKenney
@ 2018-07-12  6:21                                         ` Christian Borntraeger
  2018-07-12  9:52                                           ` David Woodhouse
  1 sibling, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-12  6:21 UTC (permalink / raw)
  To: paulmck; +Cc: David Woodhouse, peterz, mhillenb, linux-kernel, kvm



On 07/12/2018 01:37 AM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 02:32:59PM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
>>>
>>>
>>> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
>>>> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
>>>>>
>>>>>
>>>>> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
>>>>>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>>>>>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
>>>>>>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>>>>>>
>>>>>>>> RCU can spend long periods of time waiting for a CPU which is actually in
>>>>>>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>>>>>>> modes, and don't wait for it.
>>>>>>>>
>>>>>>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>>>>>>
>>>>>>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
>>>>>>> code.  I will reply with a fixed patch.
>>>>>>>
>>>>>>> The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
>>>>>>> so should be OK for testing, just not for mainline.
>>>>>>
>>>>>> And here is the updated code that allegedly avoids splatting when run with
>>>>>> CONFIG_RCU_EQS_DEBUG.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> 							Thanx, Paul
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
>>>>>> Author: David Woodhouse <dwmw@amazon.co.uk>
>>>>>> Date:   Wed Jul 11 19:01:01 2018 +0100
>>>>>>
>>>>>>     kvm/x86: Inform RCU of quiescent state when entering guest mode
>>>>>>     
>>>>>>     RCU can spend long periods of time waiting for a CPU which is actually in
>>>>>>     KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>>>>>     modes, and don't wait for it.
>>>>>>     
>>>>>>     Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>>>>>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>>>>>     [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>>> index 0046aa70205a..b0c82f70afa7 100644
>>>>>> --- a/arch/x86/kvm/x86.c
>>>>>> +++ b/arch/x86/kvm/x86.c
>>>>>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>>>>>>  	}
>>>>>>
>>>>>> +	rcu_kvm_enter();
>>>>>>  	kvm_x86_ops->run(vcpu);
>>>>>> +	rcu_kvm_exit();
>>>>>
>>>>> As indicated in my other mail. This is supposed to be handled in the guest_enter|exit_ calls around
>>>>> the run function. This would also handle other architectures. So if the guest_enter_irqoff code is
>>>>> not good enough, we should rather fix that instead of adding another rcu hint.
>>>>
>>>> Something like this, on top of the earlier patch?  I am not at all
>>>> confident of this patch because there might be other entry/exit
>>>> paths I am missing.  Plus there might be RCU uses on the arch-specific
>>>> patch to and from the guest OS.
>>>>
>>>> Thoughts?
>>>>
>>>
>>> If you instrument guest_enter/exit, you should cover all cases and all architectures as far
>>> as I can tell. FWIW, we added this rcu_note call back then precisely to handle this particular
>>> case of long-running guests blocking RCU for many seconds. And I am pretty sure that
>>> it did help back then.
>>
>> And my second patch on the email you replied to replaced the only call
>> to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
>> but yes, there might well be things I missed.  Let's see what David
>> comes up with.
>>
>> What changed was RCU's reactions to longish grace periods.  It used to
>> be very aggressive about forcing the scheduler to do otherwise-unneeded
>> context switches, which became a problem somewhere between v4.9 and v4.15.
>> I therefore reduced the number of such context switches, which in turn
>> caused KVM to tell RCU about quiescent states way too infrequently.
>>
>> The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
>> it tells RCU of an extended duration in the guest, which means that
>> RCU can ignore the corresponding CPU, which in turn allows the guest
>> to proceed without any RCU-induced interruptions.
>>
>> Does that make sense, or am I missing something?  I freely admit to
>> much ignorance of both kvm and s390!  ;-)
> 
> But I am getting some rcutorture near misses on the commit that
> introduces rcu_kvm_enter() and rcu_kvm_exit() to the x86 arch-specific
> vcpu_enter_guest() function.  These near misses occur when running
> rcutorture scenarios TREE01 and TREE03, and in my -rcu tree rather
> than the v4.15 version of this patch.
> 
> Given that I am making pervasive changes to the way that RCU works,
> it might well be that this commit is an innocent bystander.  I will
> run tests overnight and let you know what comes up.

Is there a single patch that I can test or do I have to combine all the pieces
that are sprinkled in this mail thread?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-11 23:47                                         ` Paul E. McKenney
@ 2018-07-12  8:31                                           ` David Woodhouse
  2018-07-12 11:00                                             ` Christian Borntraeger
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-12  8:31 UTC (permalink / raw)
  To: paulmck, Christian Borntraeger; +Cc: peterz, mhillenb, linux-kernel, kvm




On Wed, 2018-07-11 at 16:47 -0700, Paul E. McKenney wrote:
> 
> > > What changed was RCU's reactions to longish grace periods.  It used to
> > > be very aggressive about forcing the scheduler to do otherwise-unneeded
> > > context switches, which became a problem somewhere between v4.9 and v4.15.
> > > I therefore reduced the number of such context switches, which in turn
> > > caused KVM to tell RCU about quiescent states way too infrequently.
> > 
> > You talk about 
> > commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
> >     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
> > 
> > correct? In fact, then whatever (properly sent) patch comes up should contain
> > a fixes tag.
> 
> Not that one, but this one is at least part of the "team":
> 
> 28053bc72c0e5 ("rcu: Add long-term CPU kicking").  I might need to use
> "git bisect" to find the most relevant commit...  :-/

Whichever commit we blame, I suspect the Fixes: tag wants to go on
Paul's earlier 'Make need_resched() respond to urgent RCU-QS needs'
patch.

The rcu_kvm_enter() thing we're talking about now is actually
addressing a problem which has existed for much longer — that with
NO_HZ_FULL, a CPU might end up in guest mode for an indefinite period
of time, with RCU waiting for it to return.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-12  6:21                                         ` Christian Borntraeger
@ 2018-07-12  9:52                                           ` David Woodhouse
  0 siblings, 0 replies; 52+ messages in thread
From: David Woodhouse @ 2018-07-12  9:52 UTC (permalink / raw)
  To: Christian Borntraeger, paulmck; +Cc: peterz, mhillenb, linux-kernel, kvm


On Thu, 2018-07-12 at 08:21 +0200, Christian Borntraeger wrote:
> 
> Is there a single patch that I can test or do I have to combine
> all the pieces that are sprinkled in this mail thread?

I believe it should be just these two commits, and the first is
probably optional as the second makes it mostly redundant for the KVM
case. Fengguang's 0day bot is watching this tree, so I expect to get
success/failure results fairly shortly for various build tests at
least.

http://git.infradead.org/users/dwmw2/random-2.6.git/shortlog/refs/heads/kvm-rcu

As noted, the first (need_resched) patch took our observed
RCU latencies down from 5-10 seconds to 160ms, while the second one
that we're working on now took it down to 40ms.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-12  8:31                                           ` David Woodhouse
@ 2018-07-12 11:00                                             ` Christian Borntraeger
  2018-07-12 11:10                                               ` David Woodhouse
  0 siblings, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-12 11:00 UTC (permalink / raw)
  To: David Woodhouse, paulmck; +Cc: peterz, mhillenb, linux-kernel, kvm



On 07/12/2018 10:31 AM, David Woodhouse wrote:
> 
> 
> On Wed, 2018-07-11 at 16:47 -0700, Paul E. McKenney wrote:
>>
>>>> What changed was RCU's reactions to longish grace periods.  It used to
>>>> be very aggressive about forcing the scheduler to do otherwise-unneeded
>>>> context switches, which became a problem somewhere between v4.9 and v4.15.
>>>> I therefore reduced the number of such context switches, which in turn
>>>> caused KVM to tell RCU about quiescent states way too infrequently.
>>>  
>>> You talk about 
>>> commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
>>>      rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>>>  
>>> correct? In fact, then whatever (properly sent) patch comes up should contain
>>> a fixes tag.
>>
>> Not that one, but this one is at least part of the "team":
>>
>> 28053bc72c0e5 ("rcu: Add long-term CPU kicking").  I might need to use
>> "git bisect" to find the most relevant commit...  :-/
> 
> Whichever commit we blame, I suspect the Fixes: tag wants to go on
> Paul's earlier 'Make need_resched() respond to urgent RCU-QS needs'
> patch.
> 
> The rcu_kvm_enter() thing we're talking about now is actually
> addressing a problem which has existed for much longer — that with
> NO_HZ_FULL, a CPU might end up in guest mode for an indefinite period
> of time, with RCU waiting for it to return.

On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
visible in perf samples. The older interface was cheaper.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-12 11:00                                             ` Christian Borntraeger
@ 2018-07-12 11:10                                               ` David Woodhouse
  2018-07-12 11:58                                                 ` Christian Borntraeger
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-12 11:10 UTC (permalink / raw)
  To: Christian Borntraeger, paulmck; +Cc: peterz, mhillenb, linux-kernel, kvm


On Thu, 2018-07-12 at 13:00 +0200, Christian Borntraeger wrote:
> 
> On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
> microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
> visible in perf samples. The older interface was cheaper.

Well, the older interface wasn't actually working, which made it
moderately suboptimal :)

But that is fixed by the first patch of the two, so perhaps the second
ought to be conditional on CONFIG_NO_HZ_FULL since it's only really
fixing a real bug there?

Unless we can contrive some way to do the rcu_eqs_enter/exit only on
the paths to/from userspace, and *not* when we loop in the kernel to
handle interrupts (etc.) and immediately re-enter the guest without
returning? That's somewhat non-trivial though...


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-12 11:10                                               ` David Woodhouse
@ 2018-07-12 11:58                                                 ` Christian Borntraeger
  2018-07-12 12:04                                                   ` Christian Borntraeger
  0 siblings, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-12 11:58 UTC (permalink / raw)
  To: David Woodhouse, paulmck; +Cc: peterz, mhillenb, linux-kernel, kvm



On 07/12/2018 01:10 PM, David Woodhouse wrote:
> On Thu, 2018-07-12 at 13:00 +0200, Christian Borntraeger wrote:
>>
>> On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
>> microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
>> visible in perf samples. The older interface was cheaper.
> 
> Well, the older interface wasn't actually working, which made it
> moderately suboptimal :)
> 
> But that is fixed by the first patch of the two, so perhaps the second
> ought to be conditional on CONFIG_NO_HZ_FULL since it's only really
> fixing a real bug there?
> 
> Unless we can contrive some way to do the rcu_eqs_enter/exit only on
> the paths to/from userspace, and *not* when we loop in the kernel to
> handle interrupts (etc.) and immediately re-enter the guest without
> returning? That's somewhat non-trivial though...

Assuming that patch1 really fixes the issue (and all KVM hot loops are
supposed to have something like

        if (need_resched())
                schedule();

in them), this could work out.

So you are thinking of something like

diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index d8ee9f949ce3..cac5de8a887c 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -72,8 +72,17 @@ void cond_synchronize_sched(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
+#ifdef CONFIG_NO_HZ_FULL
 void rcu_kvm_enter(void);
 void rcu_kvm_exit(void);
+#else
+static inline void rcu_kvm_enter(void)
+{
+}
+static inline void rcu_kvm_exit(void)
+{
+}
+#endif
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);

?

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 21:08                             ` Paul E. McKenney
@ 2018-07-12 12:00                               ` David Woodhouse
  2018-07-12 12:53                                 ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-12 12:00 UTC (permalink / raw)
  To: paulmck; +Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm




On Wed, 2018-07-11 at 14:08 -0700, Paul E. McKenney wrote:
> 
> > Also... why in $DEITY's name was the existing
> > rcu_virt_note_context_switch() not actually sufficient? If we had that
> > there, why did we need an additional explicit calls to rcu_all_qs() in
> > the KVM loop, or the more complex fixes to need_resched() which
> > ultimately had the same effect, to avoid ten-second latencies?
> 
> My guess is that this was because control passed through the
> rcu_virt_note_context_switch() only once, and then subsequent
> scheduling-clock interrupts bypassed this code.  But that is just a guess.
> I need to defer to someone who understands the KVM code better than I do.

I think it's more likely that it just never happened at all. It's
conditional. From the latest patch iteration (see it being removed):

@@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
         * one time slice). Lets treat guest mode as quiescent state, just like
         * we do with user-mode execution.
         */
-       if (!context_tracking_cpu_is_enabled())
-               rcu_virt_note_context_switch(smp_processor_id());
+       rcu_kvm_enter();
 }


Given the vmexit overhead, I don't think we can do the currently-
proposed rcu_kvm_enter() thing except for CONFIG_NO_HZ_FULL where it's
really necessary. I'll make that conditional, but probably on the RCU
side.

Without CONFIG_NO_HZ_FULL, rcu_kvm_exit() can do nothing, and
rcu_kvm_enter() can do rcu_virt_note_context_switch().

OK?


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode
  2018-07-12 11:58                                                 ` Christian Borntraeger
@ 2018-07-12 12:04                                                   ` Christian Borntraeger
  0 siblings, 0 replies; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-12 12:04 UTC (permalink / raw)
  To: David Woodhouse, paulmck; +Cc: peterz, mhillenb, linux-kernel, kvm



On 07/12/2018 01:58 PM, Christian Borntraeger wrote:
> 
> 
> On 07/12/2018 01:10 PM, David Woodhouse wrote:
>> On Thu, 2018-07-12 at 13:00 +0200, Christian Borntraeger wrote:
>>>
>>> On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
>>> microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
>>> visible in perf samples. The older interface was cheaper.
>>
>> Well, the older interface wasn't actually working, which made it
>> moderately suboptimal :)
>>
>> But that is fixed by the first patch of the two, so perhaps the second
>> ought to be conditional on CONFIG_NO_HZ_FULL since it's only really
>> fixing a real bug there?
>>
>> Unless we can contrive some way to do the rcu_eqs_enter/exit only on
>> the paths to/from userspace, and *not* when we loop in the kernel to
>> handle interrupts (etc.) and immediately re-enter the guest without
>> returning? That's somewhat non-trivial though...
> 
> Assuming that patch1 really fixes the issue (and all KVM hot loops are
> supposed to have something like
>
>         if (need_resched())
>                 schedule();
>
> in them), this could work out.
> 
> So you are thinking of something like
> 
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index d8ee9f949ce3..cac5de8a887c 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -72,8 +72,17 @@ void cond_synchronize_sched(unsigned long oldstate);
>  
>  void rcu_idle_enter(void);
>  void rcu_idle_exit(void);
> +#ifdef CONFIG_NO_HZ_FULL
>  void rcu_kvm_enter(void);
>  void rcu_kvm_exit(void);
> +#else
> +static inline void rcu_kvm_enter(void)
> +{
> +}
> +static inline void rcu_kvm_exit(void)
> +{
> +}
> +#endif
>  void rcu_irq_enter(void);
>  void rcu_irq_exit(void);
>  void rcu_irq_enter_irqson(void);
> 
> ?
> 

Of course we also need to shift the ifdefs in rcu.c, and add the
rcu_virt_note_context_switch() call on enter, as you said.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-12 12:00                               ` David Woodhouse
@ 2018-07-12 12:53                                 ` Paul E. McKenney
  2018-07-12 16:17                                   ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-12 12:53 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Thu, Jul 12, 2018 at 01:00:42PM +0100, David Woodhouse wrote:
> 
> 
> On Wed, 2018-07-11 at 14:08 -0700, Paul E. McKenney wrote:
> > 
> > > Also... why in $DEITY's name was the existing
> > > rcu_virt_note_context_switch() not actually sufficient? If we had that
> > > there, why did we need an additional explicit calls to rcu_all_qs() in
> > > the KVM loop, or the more complex fixes to need_resched() which
> > > ultimately had the same effect, to avoid ten-second latencies?
> > 
> > My guess is that this was because control passed through the
> > rcu_virt_note_context_switch() only once, and then subsequent
> > scheduling-clock interrupts bypassed this code.

Gah!  My guess was instead that the code did a rcu_kvm_enter() going in,
but somehow managed to miss the rcu_kvm_exit() going out.  But that makes
absolutely no sense -- had that happened, rcutorture would likely have
screamed bloody murder, loudly and often.  No mere near misses!

And besides, thus far, -ENOREPRODUCE.  :-/

Which indicates that I have an opportunity to improve rcutorture and
that this patch was with high probability an innocent bystander.

> >                                                  But that is just a guess.
> > I need to defer to someone who understands the KVM code better than I do.
> 
> > I think it's more likely that it just never happened at all. It's
> > conditional. From the latest patch iteration (see it being removed):
> 
> @@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
>          * one time slice). Lets treat guest mode as quiescent state, just like
>          * we do with user-mode execution.
>          */
> -       if (!context_tracking_cpu_is_enabled())
> -               rcu_virt_note_context_switch(smp_processor_id());
> +       rcu_kvm_enter();
>  }
> 
> 
> Given the vmexit overhead, I don't think we can do the currently-
> proposed rcu_kvm_enter() thing except for CONFIG_NO_HZ_FULL where it's
> really necessary. I'll make that conditional, but probably on the RCU
> side.
> 
> Without CONFIG_NO_HZ_FULL, rcu_kvm_exit() can do nothing, and
> rcu_kvm_enter() can do rcu_virt_note_context_switch().
> 
> OK?

Makes sense to me!  And a big "thank you!" to Christian for testing
and analyzing this in a timely fashion!!!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-12 12:53                                 ` Paul E. McKenney
@ 2018-07-12 16:17                                   ` Paul E. McKenney
  2018-07-16 15:40                                     ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-12 16:17 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Thu, Jul 12, 2018 at 05:53:51AM -0700, Paul E. McKenney wrote:
> On Thu, Jul 12, 2018 at 01:00:42PM +0100, David Woodhouse wrote:
> > 
> > 
> > On Wed, 2018-07-11 at 14:08 -0700, Paul E. McKenney wrote:
> > > 
> > > > Also... why in $DEITY's name was the existing
> > > > rcu_virt_note_context_switch() not actually sufficient? If we had that
> > > > there, why did we need additional explicit calls to rcu_all_qs() in
> > > > the KVM loop, or the more complex fixes to need_resched() which
> > > > ultimately had the same effect, to avoid ten-second latencies?
> > > 
> > > My guess is that this was because control passed through the
> > > rcu_virt_note_context_switch() only once, and then subsequent
> > > scheduling-clock interrupts bypassed this code.
> 
> Gah!  My guess was instead that the code did a rcu_kvm_enter() going in,
> but somehow managed to miss the rcu_kvm_exit() going out.  But that makes
> absolutely no sense -- had that happened, rcutorture would likely have
> screamed bloody murder, loudly and often.  No mere near misses!
> 
> And besides, thus far, -ENOREPRODUCE.  :-/

OK, one close call in 63 hours of rcutorture, this one on scenario TREE03
(yesterday hit TREE01 and TREE03).  Time for probabilistic long-runtime
bisection.  Plus thought about how to get more information out of the near
misses.  Fun!  ;-)

							Thanx, Paul

> Which indicates that I have an opportunity to improve rcutorture and
> that this patch was with high probability an innocent bystander.
> 
> > >                                                  But that is just a guess.
> > > I need to defer to someone who understands the KVM code better than I do.
> > 
> > I think it's more likely that it just never happened at all. It's
> > conditional. From the latest patch iteration (see it being removed):
> > 
> > @@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
> >          * one time slice). Lets treat guest mode as quiescent state, just like
> >          * we do with user-mode execution.
> >          */
> > -       if (!context_tracking_cpu_is_enabled())
> > -               rcu_virt_note_context_switch(smp_processor_id());
> > +       rcu_kvm_enter();
> >  }
> > 
> > 
> > Given the vmexit overhead, I don't think we can do the currently-
> > proposed rcu_kvm_enter() thing except for CONFIG_NO_HZ_FULL where it's
> > really necessary. I'll make that conditional, but probably on the RCU
> > side.
> > 
> > Without CONFIG_NO_HZ_FULL, rcu_kvm_exit() can do nothing, and
> > rcu_kvm_enter() can do rcu_virt_note_context_switch().
> > 
> > OK?
> 
> Makes sense to me!  And a big "thank you!" to Christian for testing
> and analyzing this in a timely fashion!!!
> 
> 							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-12 16:17                                   ` Paul E. McKenney
@ 2018-07-16 15:40                                     ` Paul E. McKenney
  2018-07-17  8:19                                       ` David Woodhouse
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-16 15:40 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Thu, Jul 12, 2018 at 09:17:04AM -0700, Paul E. McKenney wrote:
> On Thu, Jul 12, 2018 at 05:53:51AM -0700, Paul E. McKenney wrote:
> > On Thu, Jul 12, 2018 at 01:00:42PM +0100, David Woodhouse wrote:
> > > 
> > > 
> > > On Wed, 2018-07-11 at 14:08 -0700, Paul E. McKenney wrote:
> > > > 
> > > > > Also... why in $DEITY's name was the existing
> > > > > rcu_virt_note_context_switch() not actually sufficient? If we had that
> > > > > there, why did we need additional explicit calls to rcu_all_qs() in
> > > > > the KVM loop, or the more complex fixes to need_resched() which
> > > > > ultimately had the same effect, to avoid ten-second latencies?
> > > > 
> > > > My guess is that this was because control passed through the
> > > > rcu_virt_note_context_switch() only once, and then subsequent
> > > > scheduling-clock interrupts bypassed this code.
> > 
> > Gah!  My guess was instead that the code did a rcu_kvm_enter() going in,
> > but somehow managed to miss the rcu_kvm_exit() going out.  But that makes
> > absolutely no sense -- had that happened, rcutorture would likely have
> > screamed bloody murder, loudly and often.  No mere near misses!
> > 
> > And besides, thus far, -ENOREPRODUCE.  :-/
> 
> OK, one close call in 63 hours of rcutorture, this one on scenario TREE03
> (yesterday hit TREE01 and TREE03).  Time for probabilistic long-runtime
> bisection.  Plus thought about how to get more information out of the near
> misses.  Fun!  ;-)

Most of the weekend was devoted to testing today's upcoming pull request,
but I did get a bit more testing done on this.

I was able to make this happen more often by tweaking rcutorture a
bit, but I still do not yet have statistically significant results.
Nevertheless, I have thus far only seen failures with David's patch or
with both David's and my patch.  And I actually got a full-up rcutorture
failure (a too-short grace period) in addition to the aforementioned
close calls.

Over this coming week I expect to devote significant testing time to
the commit just prior to David's in my stack.  If I don't see failures
on that commit, we will need to spend some quality time with the KVM
folks on whether or not kvm_x86_ops->run() and friends have the option of
failing to return, but instead causing control to pop up somewhere else.
Or someone could tell me how I am being blind to some obvious bug in
the two commits that allow RCU to treat KVM guest-OS execution as an
extended quiescent state.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-16 15:40                                     ` Paul E. McKenney
@ 2018-07-17  8:19                                       ` David Woodhouse
  2018-07-17 12:56                                         ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-17  8:19 UTC (permalink / raw)
  To: paulmck; +Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm


On Mon, 2018-07-16 at 08:40 -0700, Paul E. McKenney wrote:
> Most of the weekend was devoted to testing today's upcoming pull request,
> but I did get a bit more testing done on this.
> 
> I was able to make this happen more often by tweaking rcutorture a
> bit, but I still do not yet have statistically significant results.
> Nevertheless, I have thus far only seen failures with David's patch or
> with both David's and my patch.  And I actually got a full-up rcutorture
> failure (a too-short grace period) in addition to the aforementioned
> close calls.
> 
> Over this coming week I expect to devote significant testing time to
> the commit just prior to David's in my stack.  If I don't see failures
> on that commit, we will need to spend some quality time with the KVM
> folks on whether or not kvm_x86_ops->run() and friends have the option of
> failing to return, but instead causing control to pop up somewhere else.
> Or someone could tell me how I am being blind to some obvious bug in
> the two commits that allow RCU to treat KVM guest-OS execution as an
> extended quiescent state.  ;-)

One thing we can try, if my patch is implicated, is moving the calls to
rcu_kvm_{enter,exit} closer to the actual VM entry. Let's try putting
them around the large asm block in arch/x86/kvm/vmx.c::vmx_vcpu_run()
for example. If that fixes it, then we know we've missed something else
interesting that's happening in the middle.

Testing on Skylake shows a guest CPUID goes from ~3000 cycles to ~3500
with this patch, so in the next iteration it definitely needs to be
ifdef CONFIG_NO_HZ_FULL anyway, because it's actually required there
(AFAICT) and it's too expensive otherwise as Christian pointed out.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-17  8:19                                       ` David Woodhouse
@ 2018-07-17 12:56                                         ` Paul E. McKenney
  2018-07-18 15:36                                           ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-17 12:56 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Tue, Jul 17, 2018 at 10:19:08AM +0200, David Woodhouse wrote:
> On Mon, 2018-07-16 at 08:40 -0700, Paul E. McKenney wrote:
> > Most of the weekend was devoted to testing today's upcoming pull request,
> > but I did get a bit more testing done on this.
> > 
> > I was able to make this happen more often by tweaking rcutorture a
> > bit, but I still do not yet have statistically significant results.
> > Nevertheless, I have thus far only seen failures with David's patch or
> > with both David's and my patch.  And I actually got a full-up rcutorture
> > failure (a too-short grace period) in addition to the aforementioned
> > close calls.
> > 
> > Over this coming week I expect to devote significant testing time to
> > the commit just prior to David's in my stack.  If I don't see failures
> > on that commit, we will need to spend some quality time with the KVM
> > folks on whether or not kvm_x86_ops->run() and friends have the option of
> > failing to return, but instead causing control to pop up somewhere else.
> > Or someone could tell me how I am being blind to some obvious bug in
> > the two commits that allow RCU to treat KVM guest-OS execution as an
> > extended quiescent state.  ;-)
> 
> One thing we can try, if my patch is implicated, is moving the calls to
> rcu_kvm_{enter,exit} closer to the actual VM entry. Let's try putting
> them around the large asm block in arch/x86/kvm/vmx.c::vmx_vcpu_run()
> for example. If that fixes it, then we know we've missed something else
> interesting that's happening in the middle.

I don't have enough data to say anything with too much certainty, but
my patch has rcu_kvm_{enter,exit}() quite a bit farther apart than yours
does, and I am not seeing massive increases in error rate in my patch
compared to yours.  Which again might or might not mean anything.

Plus I haven't proven that your patch isn't an innocent bystander yet.
If it isn't just an innocent bystander, that will take most of this
week to demonstrate given current failure rates.

I am also working on improving rcutorture diagnostics which should help
me work out how to change rcutorture so as to find this more quickly.

> Testing on Skylake shows a guest CPUID goes from ~3000 cycles to ~3500
> with this patch, so in the next iteration it definitely needs to be
> ifdef CONFIG_NO_HZ_FULL anyway, because it's actually required there
> (AFAICT) and it's too expensive otherwise as Christian pointed out.

Makes sense!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-17 12:56                                         ` Paul E. McKenney
@ 2018-07-18 15:36                                           ` Paul E. McKenney
  2018-07-18 16:01                                             ` David Woodhouse
  0 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-18 15:36 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Tue, Jul 17, 2018 at 05:56:53AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 17, 2018 at 10:19:08AM +0200, David Woodhouse wrote:
> > On Mon, 2018-07-16 at 08:40 -0700, Paul E. McKenney wrote:
> > > Most of the weekend was devoted to testing today's upcoming pull request,
> > > but I did get a bit more testing done on this.
> > > 
> > > I was able to make this happen more often by tweaking rcutorture a
> > > bit, but I still do not yet have statistically significant results.
> > > Nevertheless, I have thus far only seen failures with David's patch or
> > > with both David's and my patch.  And I actually got a full-up rcutorture
> > > failure (a too-short grace period) in addition to the aforementioned
> > > close calls.
> > > 
> > > Over this coming week I expect to devote significant testing time to
> > > the commit just prior to David's in my stack.  If I don't see failures
> > > on that commit, we will need to spend some quality time with the KVM
> > > folks on whether or not kvm_x86_ops->run() and friends have the option of
> > > failing to return, but instead causing control to pop up somewhere else.
> > > Or someone could tell me how I am being blind to some obvious bug in
> > > the two commits that allow RCU to treat KVM guest-OS execution as an
> > > extended quiescent state.  ;-)
> > 
> > One thing we can try, if my patch is implicated, is moving the calls to
> > rcu_kvm_{enter,exit} closer to the actual VM entry. Let's try putting
> > them around the large asm block in arch/x86/kvm/vmx.c::vmx_vcpu_run()
> > for example. If that fixes it, then we know we've missed something else
> > interesting that's happening in the middle.
> 
> I don't have enough data to say anything with too much certainty, but
> my patch has rcu_kvm_{enter,exit}() quite a bit farther apart than yours
> does, and I am not seeing massive increases in error rate in my patch
> compared to yours.  Which again might or might not mean anything.
> 
> Plus I haven't proven that your patch isn't an innocent bystander yet.
> If it isn't just an innocent bystander, that will take most of this
> week to demonstrate given current failure rates.

And I finally did get some near misses from an earlier commit, so we
should consider your patch to be officially off the hook.

							Thanx, Paul

> I am also working on improving rcutorture diagnostics which should help
> me work out how to change rcutorture so as to find this more quickly.
> 
> > Testing on Skylake shows a guest CPUID goes from ~3000 cycles to ~3500
> > with this patch, so in the next iteration it definitely needs to be
> > ifdef CONFIG_NO_HZ_FULL anyway, because it's actually required there
> > (AFAICT) and it's too expensive otherwise as Christian pointed out.
> 
> Makes sense!
> 
> 							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 15:36                                           ` Paul E. McKenney
@ 2018-07-18 16:01                                             ` David Woodhouse
  2018-07-18 16:37                                               ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-18 16:01 UTC (permalink / raw)
  To: paulmck; +Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm


On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
> And I finally did get some near misses from an earlier commit, so we
> should consider your patch to be officially off the hook.

Yay, I like it when it's not my fault. I'll redo it with the ifdef
CONFIG_NO_HZ_FULL.

What should it do for the !CONFIG_NO_HZ_FULL case? The existing call in
guest_enter_irqoff() clearly wasn't actually doing the right thing
anyway, hence the need for the need_resched() patch in $SUBJECT... so
should I just leave it doing nothing in guest_enter_irqoff()? 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 16:01                                             ` David Woodhouse
@ 2018-07-18 16:37                                               ` Paul E. McKenney
  2018-07-18 19:41                                                 ` David Woodhouse
  2018-07-19 17:09                                                 ` Paul E. McKenney
  0 siblings, 2 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-18 16:37 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Wed, Jul 18, 2018 at 06:01:51PM +0200, David Woodhouse wrote:
> On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
> > And I finally did get some near misses from an earlier commit, so we
> > should consider your patch to be officially off the hook.
> 
> Yay, I like it when it's not my fault. I'll redo it with the ifdef
> CONFIG_NO_HZ_FULL.

Hey, I didn't say it wasn't your fault, only that it -officially- wasn't
your fault.  ;-)

> What should it do for the !CONFIG_NO_HZ_FULL case? The existing call in
> guest_enter_irqoff() clearly wasn't actually doing the right thing
> anyway, hence the need for the need_resched() patch in $SUBJECT... so
> should I just leave it doing nothing in guest_enter_irqoff()? 

One starting point would be the combination of your patch and my
patch, with -rcu commit IDs and diff below.  But yes, it needs to be
!CONFIG_NO_HZ_FULL.  And no, I am not at all confident that I actually
found all the places needing change in the core code, so this needs some
serious review both by the KVM guys and the NO_HZ_FULL guys.

And some serious testing.  But you knew that already.  ;-)

							Thanx, Paul

------------------------------------------------------------------------

57e3b96d012a kvm/x86: Inform RCU of quiescent state when entering guest mode
f437e330a720 kvm: Inform RCU of quiescent state when entering guest mode

------------------------------------------------------------------------

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index d05609ad329d..8d2a9d3073ad 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
 	 * one time slice). Lets treat guest mode as quiescent state, just like
 	 * we do with user-mode execution.
 	 */
-	if (!context_tracking_cpu_is_enabled())
-		rcu_virt_note_context_switch(smp_processor_id());
+	rcu_kvm_enter();
 }
 
 static inline void guest_exit_irqoff(void)
 {
+	rcu_kvm_exit();
 	if (context_tracking_is_enabled())
 		__context_tracking_exit(CONTEXT_GUEST);
 
@@ -143,12 +143,13 @@ static inline void guest_enter_irqoff(void)
 	 */
 	vtime_account_system(current);
 	current->flags |= PF_VCPU;
-	rcu_virt_note_context_switch(smp_processor_id());
+	rcu_kvm_enter();
 }
 
 static inline void guest_exit_irqoff(void)
 {
 	/* Flush the guest cputime we spent on the guest */
+	rcu_kvm_exit();
 	vtime_account_system(current);
 	current->flags &= ~PF_VCPU;
 }
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7fa4fb9e899e..a7aa5b3cfb81 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -81,10 +81,11 @@ static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
  */
-static inline void rcu_virt_note_context_switch(int cpu) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle_exit(void) { }
+static inline void rcu_kvm_enter(void) { }
+static inline void rcu_kvm_exit(void) { }
 static inline void rcu_irq_enter(void) { }
 static inline void rcu_irq_exit_irqson(void) { }
 static inline void rcu_irq_enter_irqson(void) { }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 7f83179177d1..62b61e579bb4 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -34,17 +34,6 @@ void rcu_softirq_qs(void);
 void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(u64 basem, u64 *nextevt);
 void rcu_cpu_stall_reset(void);
-
-/*
- * Note a virtualization-based context switch.  This is simply a
- * wrapper around rcu_note_context_switch(), which allows TINY_RCU
- * to save a few bytes. The caller must have disabled interrupts.
- */
-static inline void rcu_virt_note_context_switch(int cpu)
-{
-	rcu_note_context_switch(false);
-}
-
 void synchronize_rcu_expedited(void);
 void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
 
@@ -55,6 +44,8 @@ void cond_synchronize_rcu(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
+void rcu_kvm_enter(void);
+void rcu_kvm_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8674ef151d50..cb182b7b0d9a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -583,6 +583,24 @@ void rcu_idle_enter(void)
 	rcu_eqs_enter(false);
 }
 
+/**
+ * rcu_kvm_enter - inform RCU that current CPU is entering a guest OS
+ *
+ * Enter guest-OS mode, in other words, -leave- the mode in which RCU
+ * read-side critical sections can occur.  (Though RCU read-side critical
+ * sections can occur in irq handlers from guest OSes, a possibility
+ * handled by irq_enter() and irq_exit().)
+ *
+ * If you add or remove a call to rcu_kvm_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_kvm_enter(void)
+{
+	lockdep_assert_irqs_disabled();
+	rcu_eqs_enter(true);
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_enter);
+
 #ifdef CONFIG_NO_HZ_FULL
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
@@ -747,6 +765,22 @@ void rcu_idle_exit(void)
 	local_irq_restore(flags);
 }
 
+/**
+ * rcu_kvm_exit - inform RCU that current CPU is leaving a guest OS
+ *
+ * Exit guest-OS mode, in other words, -enter- the mode in which RCU
+ * read-side critical sections can occur.
+ *
+ * If you add or remove a call to rcu_kvm_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_kvm_exit(void)
+{
+	lockdep_assert_irqs_disabled();
+	rcu_eqs_exit(true);
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_exit);
+
 #ifdef CONFIG_NO_HZ_FULL
 /**
  * rcu_user_exit - inform RCU that we are exiting userspace.

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 16:37                                               ` Paul E. McKenney
@ 2018-07-18 19:41                                                 ` David Woodhouse
  2018-07-18 20:17                                                   ` Paul E. McKenney
  2018-07-19 17:09                                                 ` Paul E. McKenney
  1 sibling, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-18 19:41 UTC (permalink / raw)
  To: paulmck; +Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm




On Wed, 2018-07-18 at 09:37 -0700, Paul E. McKenney wrote:
> On Wed, Jul 18, 2018 at 06:01:51PM +0200, David Woodhouse wrote:
> > 
> > On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
> > > 
> > > And I finally did get some near misses from an earlier commit, so we
> > > should consider your patch to be officially off the hook.
> >
> > Yay, I like it when it's not my fault. I'll redo it with the ifdef
> > CONFIG_NO_HZ_FULL.
>
> Hey, I didn't say it wasn't your fault, only that it -officially- wasn't
> your fault.  ;-)

I can live with being innocent until proven guilty.

> > 
> > What should it do for the !CONFIG_NO_HZ_FULL case? The existing call in
> > guest_enter_irqoff() clearly wasn't actually doing the right thing
> > anyway, hence the need for the need_resched() patch in $SUBJECT... so
> > should I just leave it doing nothing in guest_enter_irqoff()? 
>
> One starting point would be the combination of your patch and my
> patch, with -rcu commit IDs and diff below.  But yes, it needs to be
> !CONFIG_NO_HZ_FULL.  And no, I am not at all confident that I actually
> found all the places needing change in the core code, so this needs some
> serious review both by the KVM guys and the NO_HZ_FULL guys.

Right, that looks fairly much like the version I'd ended up with. So my
question was...

> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
>  	 * one time slice). Lets treat guest mode as quiescent state, just like
>  	 * we do with user-mode execution.
>  	 */

...if we change this to something like...

#ifdef CONFIG_NO_HZ_FULL
> +	rcu_kvm_enter();
#else
> 	if (!context_tracking_cpu_is_enabled())
> 		rcu_virt_note_context_switch(smp_processor_id());
#endif

... do you actually want me to keep the #else case there? It blatantly
wasn't working anyway for us, perhaps because the condition was false?
That's why I started fixing need_resched() in the first place, and that
fix ought to cover whatever this call to rcu_virt_note_context_switch()
was supposed to be doing?


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 19:41                                                 ` David Woodhouse
@ 2018-07-18 20:17                                                   ` Paul E. McKenney
  2018-07-19  0:26                                                     ` Frederic Weisbecker
  2018-07-19  6:45                                                     ` Christian Borntraeger
  0 siblings, 2 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-18 20:17 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm, Frederic Weisbecker

On Wed, Jul 18, 2018 at 09:41:05PM +0200, David Woodhouse wrote:
> 
> 
> On Wed, 2018-07-18 at 09:37 -0700, Paul E. McKenney wrote:
> > On Wed, Jul 18, 2018 at 06:01:51PM +0200, David Woodhouse wrote:
> > > 
> > > On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
> > > > 
> > > > And I finally did get some near misses from an earlier commit, so we
> > > > should consider your patch to be officially off the hook.
> > >
> > > Yay, I like it when it's not my fault. I'll redo it with the ifdef
> > > CONFIG_NO_HZ_FULL.
> >
> > Hey, I didn't say it wasn't your fault, only that it -officially- wasn't
> > your fault.  ;-)
> 
> I can live with being innocent until proven guilty.
> 
> > > 
> > > What should it do for the !CONFIG_NO_HZ_FULL case? The existing call in
> > > guest_enter_irqoff() clearly wasn't actually doing the right thing
> > > anyway, hence the need for the need_resched() patch in $SUBJECT... so
> > > should I just leave it doing nothing in guest_enter_irqoff()? 
> >
> > One starting point would be the combination of your patch and my
> > patch, with -rcu commit IDs and diff below.  But yes, it needs to be
> > !CONFIG_NO_HZ_FULL.  And no, I am not at all confident that I actually
> > found all the places needing change in the core code, so this needs some
> > serious review both by the KVM guys and the NO_HZ_FULL guys.
> 
> Right, that looks fairly much like the version I'd ended up with. So my
> question was...
> 
> > --- a/include/linux/context_tracking.h
> > +++ b/include/linux/context_tracking.h
> > @@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
> >  	 * one time slice). Lets treat guest mode as quiescent state, just like
> >  	 * we do with user-mode execution.
> >  	 */
> 
> ...if we change this to something like...
> 
> #ifdef CONFIG_NO_HZ_FULL
> > +	rcu_kvm_enter();
> #else
> > 	if (!context_tracking_cpu_is_enabled())
> > 		rcu_virt_note_context_switch(smp_processor_id());
> #endif
> 
> ... do you actually want me to keep the #else case there? It blatantly
> wasn't working anyway for us, perhaps because the condition was false?
> That's why I started fixing need_resched() in the first place, and that
> fix ought to cover whatever this call to rcu_virt_note_context_switch()
> was supposed to be doing?

My thought would be something like this:

	if (context_tracking_cpu_is_enabled())
		rcu_kvm_enter();
	else
		rcu_virt_note_context_switch(smp_processor_id());

The reason I believe that this is the right approach is that even when
you have CONFIG_NO_HZ_FULL=y, some CPUs will still be nohz_full=n CPUs,
so you don't want to take the extra overhead on those CPUs.

But I could easily be confused here, so I am adding Frederic for his
thoughts.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 20:17                                                   ` Paul E. McKenney
@ 2018-07-19  0:26                                                     ` Frederic Weisbecker
  2018-07-19  6:45                                                     ` Christian Borntraeger
  1 sibling, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2018-07-19  0:26 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Woodhouse, Christian Borntraeger, Peter Zijlstra, mhillenb,
	linux-kernel, kvm

On Wed, Jul 18, 2018 at 01:17:00PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 18, 2018 at 09:41:05PM +0200, David Woodhouse wrote:
> > 
> > 
> > On Wed, 2018-07-18 at 09:37 -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 18, 2018 at 06:01:51PM +0200, David Woodhouse wrote:
> > > > 
> > > > On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
> > > > > 
> > > > > And I finally did get some near misses from an earlier commit, so we
> > > > > should consider your patch to be officially off the hook.
> > > >
> > > > Yay, I like it when it's not my fault. I'll redo it with the ifdef
> > > > CONFIG_NO_HZ_FULL.
> > >
> > > Hey, I didn't say it wasn't your fault, only that it -officially- wasn't
> > > your fault.  ;-)
> > 
> > I can live with being innocent until proven guilty.
> > 
> > > > 
> > > > What should it do for the !CONFIG_NO_HZ_FULL case? The existing call in
> > > > guest_enter_irqoff() clearly wasn't actually doing the right thing
> > > > anyway, hence the need for the need_resched() patch in $SUBJECT... so
> > > > should I just leave it doing nothing in guest_enter_irqoff()? 
> > >
> > > One starting point would be the combination of your patch and my
> > > patch, with -rcu commit IDs and diff below.  But yes, it needs to be
> > > !CONFIG_NO_HZ_FULL.  And no, I am not at all confident that I actually
> > > found all the places needing change in the core code, so this needs some
> > > serious review both by the KVM guys and the NO_HZ_FULL guys.
> > 
> > Right, that looks fairly much like the version I'd ended up with. So my
> > question was...
> > 
> > > --- a/include/linux/context_tracking.h
> > > +++ b/include/linux/context_tracking.h
> > > @@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
> > >  	 * one time slice). Lets treat guest mode as quiescent state, just like
> > >  	 * we do with user-mode execution.
> > >  	 */
> > 
> > ...if we change this to something like...
> > 
> > #ifdef CONFIG_NO_HZ_FULL
> > > +	rcu_kvm_enter();
> > #else
> > > 	if (!context_tracking_cpu_is_enabled())
> > > 		rcu_virt_note_context_switch(smp_processor_id());
> > #endif
> > 
> > ... do you actually want me to keep the #else case there? It blatantly
> > wasn't working anyway for us, perhaps because the condition was false?
> > That's why I started fixing need_resched() in the first place, and that
> > fix ought to cover whatever this call to rcu_virt_note_context_switch()
> > was supposed to be doing?
> 
> My thought would be something like this:
> 
> 	if (context_tracking_cpu_is_enabled())
> 		rcu_kvm_enter();
> 	else
> 		rcu_virt_note_context_switch(smp_processor_id());
> 
> The reason I believe that this is the right approach is that even when
> you have CONFIG_NO_HZ_FULL=y, some CPUs will still be nohz_full=n CPUs,
> so you don't want to take the extra overhead on those CPUs.
> 
> But I could easily be confused here, so I am adding Frederic for his
> thoughts.

Hmm, actually rcu_user_enter()/rcu_user_exit() are already called upon guest
entry/exit :-)

Now I must confess the code leading there in __context_tracking_enter/exit is
not very obvious, but that part applies to both the CONTEXT_USER and CONTEXT_GUEST cases.
I should probably add a few comments to clarify.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-11 17:03                     ` [RFC] Make need_resched() return true when rcu_urgent_qs requested David Woodhouse
  2018-07-11 17:48                       ` Paul E. McKenney
  2018-07-11 18:31                       ` [RFC] Make need_resched() return true when rcu_urgent_qs requested Christian Borntraeger
@ 2018-07-19  0:32                       ` Frederic Weisbecker
  2018-07-19  3:11                         ` Paul E. McKenney
  2 siblings, 1 reply; 52+ messages in thread
From: Frederic Weisbecker @ 2018-07-19  0:32 UTC (permalink / raw)
  To: David Woodhouse; +Cc: paulmck, Peter Zijlstra, mhillenb, linux-kernel, kvm

On Wed, Jul 11, 2018 at 06:03:42PM +0100, David Woodhouse wrote:
> On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
> > And here is an updated v4.15 patch with Marius's Reported-by and David's
> > fix to my lost exclamation point.
> 
> Thanks. Are you sending the original version of that to Linus? It'd be
> useful to have the commit ID so that we can watch for it landing, and
> chase this one up to Greg.
> 
> As discussed on IRC, this patch reduces synchronize_sched() latency for
> us from ~4600s to ~160ms, which is nice.
> 
> However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> that you want a patch like the one below, which happily reduces the
> latency in our (!NO_HZ_FULL) case still further to ~40ms.

That is interesting. As I replied to Paul, we are already calling
rcu_user_enter/exit() on guest_enter/exit_irqsoff(). So I'm wondering why
you're seeing such an optimization by repeating those calls.

Perhaps the rcu_user_* somehow aren't actually called from
__context_tracking_enter()...? Some bug in context tracking?
Otherwise it's a curious side effect.

Thanks.


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  0:32                       ` Frederic Weisbecker
@ 2018-07-19  3:11                         ` Paul E. McKenney
  2018-07-19  6:16                           ` David Woodhouse
  2018-07-19 13:15                           ` Frederic Weisbecker
  0 siblings, 2 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-19  3:11 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Woodhouse, Peter Zijlstra, mhillenb, linux-kernel, kvm

On Thu, Jul 19, 2018 at 02:32:06AM +0200, Frederic Weisbecker wrote:
> On Wed, Jul 11, 2018 at 06:03:42PM +0100, David Woodhouse wrote:
> > On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
> > > And here is an updated v4.15 patch with Marius's Reported-by and David's
> > > fix to my lost exclamation point.
> > 
> > Thanks. Are you sending the original version of that to Linus? It'd be
> > useful to have the commit ID so that we can watch for it landing, and
> > chase this one up to Greg.
> > 
> > As discussed on IRC, this patch reduces synchronize_sched() latency for
> > us from ~4600s to ~160ms, which is nice.
> > 
> > However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> > that you want a patch like the one below, which happily reduces the
> > latency in our (!NO_HZ_FULL) case still further to ~40ms.
> 
> That is interesting. As I replied to Paul, we are already calling
> rcu_user_enter/exit() on guest_enter/exit_irqsoff(). So I'm wondering why
> you're seeing such an optimization by repeating those calls.
> 
> Perhaps the rcu_user_* somehow aren't actually called from
> __context_tracking_enter()...? Some bug in context tracking?
> Otherwise it's a curious side effect.

David is working with v4.15.  Is this maybe something that has changed
since then?

							Thanx, Paul


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  3:11                         ` Paul E. McKenney
@ 2018-07-19  6:16                           ` David Woodhouse
  2018-07-19 13:17                             ` Frederic Weisbecker
  2018-07-19 13:15                           ` Frederic Weisbecker
  1 sibling, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-19  6:16 UTC (permalink / raw)
  To: paulmck, Frederic Weisbecker; +Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm


On Wed, 2018-07-18 at 20:11 -0700, Paul E. McKenney wrote:
> 
> > That is interesting. As I replied to Paul, we are already calling
> > rcu_user_enter/exit() on guest_enter/exit_irqsoff(). So I'm wondering why
> > you're seeing such an optimization by repeating those calls.
> > 
> > Perhaps the rcu_user_* somehow aren't actually called from
> > __context_tracking_enter()...? Some bug in context tracking?
> > Otherwise it's a curious side effect.
> 
> David is working with v4.15.  Is this maybe something that has changed
> since then?

To clarify: in 4.15 without CONFIG_PREEMPT and without NO_HZ_FULL I was
seeing RCU stalls because a thread in vcpu_run() was *never* seen to go
through a quiescent state. Hence the change to need_resched() in the
first patch in this thread, which fixed the problem at hand and seemed
to address the general case.

It then seemed by *inspection* that the NO_HZ_FULL case was probably
broken, because we'd failed to spot the rcu_user_* calls. But
rcu_user_enter() does nothing in the !NO_HZ_FULL case, so wouldn't have
helped in the testing that we were doing anyway.




* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 20:17                                                   ` Paul E. McKenney
  2018-07-19  0:26                                                     ` Frederic Weisbecker
@ 2018-07-19  6:45                                                     ` Christian Borntraeger
  2018-07-19  7:20                                                       ` David Woodhouse
  1 sibling, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-19  6:45 UTC (permalink / raw)
  To: paulmck, David Woodhouse
  Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm, Frederic Weisbecker



On 07/18/2018 10:17 PM, Paul E. McKenney wrote:
> On Wed, Jul 18, 2018 at 09:41:05PM +0200, David Woodhouse wrote:
>>
>>
>> On Wed, 2018-07-18 at 09:37 -0700, Paul E. McKenney wrote:
>>> On Wed, Jul 18, 2018 at 06:01:51PM +0200, David Woodhouse wrote:
>>>>
>>>> On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
>>>>>
>>>>> And I finally did get some near misses from an earlier commit, so we
>>>>> should consider your patch to be officially off the hook.
>>>>
>>>> Yay, I like it when it's not my fault. I'll redo it with the ifdef
>>>> CONFIG_NO_HZ_FULL.
>>>
>>> Hey, I didn't say it wasn't your fault, only that it -officially- wasn't
>>> your fault.  ;-)
>>
>> I can live with being innocent until proven guilty.
>>
>>>>
>>>> What should it do for the !CONFIG_NO_HZ_FULL case? The existing call in
>>>> guest_enter_irqoff() clearly wasn't actually doing the right thing
>>>> anyway, hence the need for the need_resched() patch in $SUBJECT... so
>>>> should I just leave it doing nothing in guest_enter_irqoff()? 
>>>
>>> One starting point would be the combination of your patch and my
>>> patch, with -rcu commit IDs and diff below.  But yes, it needs to be
>>> !CONFIG_NO_HZ_FULL.  And no, I am not at all confident that I actually
>>> found all the places needing change in the core code, so this needs some
>>> serious review both by the KVM guys and the NO_HZ_FULL guys.
>>
>> Right, that looks fairly much like the version I'd ended up with. So my
>> question was...
>>
>>> --- a/include/linux/context_tracking.h
>>> +++ b/include/linux/context_tracking.h
>>> @@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
>>>  	 * one time slice). Lets treat guest mode as quiescent state, just like
>>>  	 * we do with user-mode execution.
>>>  	 */
>>
>> ...if we change this to something like...
>>
>> #ifdef CONFIG_NO_HZ_FULL
>>> +	rcu_kvm_enter();
>> #else
>>> 	if (!context_tracking_cpu_is_enabled())
>>> 		rcu_virt_note_context_switch(smp_processor_id());
>> #endif
>>
>> ... do you actually want me to keep the #else case there? It blatantly
>> wasn't working anyway for us, perhaps because the condition was false?
>> That's why I started fixing need_resched() in the first place, and that
>> fix ought to cover whatever this call to rcu_virt_note_context_switch()
>> was supposed to be doing?
> 
> My thought would be something like this:
> 
> 	if (context_tracking_cpu_is_enabled())
> 		rcu_kvm_enter();
> 	else
> 		rcu_virt_note_context_switch(smp_processor_id());

In the past we needed that (when we introduced that). At least with every
host interrupt we called this making an rcu event at least every HZ.
Will the changes in need_resched make this part unnecessary?


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  6:45                                                     ` Christian Borntraeger
@ 2018-07-19  7:20                                                       ` David Woodhouse
  2018-07-19 10:23                                                         ` Christian Borntraeger
  2018-07-19 13:14                                                         ` Frederic Weisbecker
  0 siblings, 2 replies; 52+ messages in thread
From: David Woodhouse @ 2018-07-19  7:20 UTC (permalink / raw)
  To: Christian Borntraeger, paulmck
  Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm, Frederic Weisbecker

On Thu, 2018-07-19 at 08:45 +0200, Christian Borntraeger wrote:
> 
> > My thought would be something like this:
> > 
> >       if (context_tracking_cpu_is_enabled())
> >               rcu_kvm_enter();
> >       else
> >               rcu_virt_note_context_switch(smp_processor_id());
> 
> In the past we needed that (when we introduced that). At least with every
> host interrupt we called this making an rcu event at least every HZ.
> Will the changes in need_resched make this part unnecessary?

Yes, the change in need_resched() should make this part unnecessary.
Unless your architecture's version of the vcpu_run() loop just loops
forever even when TIF_NEED_RESCHED is set? :)

I'm not sure about the context tracking condition in the code snippet
cited above, though. I think that's what caused my problem in the first
place — I have CONTEXT_TRACKING_FORCE && !NO_HZ_FULL. So in 4.15, that
means rcu_user_enter() did nothing and rcu_virt_note_context_switch()
wasn't called. Hence the observed stalls.

Should rcu_user_enter() itself be conditional on CONTEXT_TRACKING not
NO_HZ_FULL? 



* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  7:20                                                       ` David Woodhouse
@ 2018-07-19 10:23                                                         ` Christian Borntraeger
  2018-07-19 12:55                                                           ` Paul E. McKenney
  2018-07-19 13:14                                                         ` Frederic Weisbecker
  1 sibling, 1 reply; 52+ messages in thread
From: Christian Borntraeger @ 2018-07-19 10:23 UTC (permalink / raw)
  To: David Woodhouse, paulmck
  Cc: Peter Zijlstra, mhillenb, linux-kernel, kvm, Frederic Weisbecker



On 07/19/2018 09:20 AM, David Woodhouse wrote:
> On Thu, 2018-07-19 at 08:45 +0200, Christian Borntraeger wrote:
>>
>>> My thought would be something like this:
>>>  
>>>        if (context_tracking_cpu_is_enabled())
>>>                rcu_kvm_enter();
>>>        else
>>>                rcu_virt_note_context_switch(smp_processor_id());
>>
>> In the past we needed that (when we introduced that). At least with every
>> host interrupt we called this making an rcu event at least every HZ.
>> Will the changes in need_resched make this part unnecessary?
> 
> Yes, the change in need_resched() should make this part unnecessary.
> Unless your architecture's version of the vcpu_run() loop just loops
> forever even when TIF_NEED_RESCHED is set? :)

Very early versions did that. The SIE instruction is interruptible
so you can continue to run the guest by simply returning from a host
interrupt. All sane versions of KVM on s390 now make sure to make a
short trip into C after a host interrupt. There we check for
need_resched, signals, and machine checks, so we are good.

> 
> I'm not sure about the context tracking condition in the code snippet
> cited above, though. I think that's what caused my problem in the first
> place — I have CONTEXT_TRACKING_FORCE && !NO_HZ_FULL. So in 4.15, that
> means rcu_user_enter() did nothing and rcu_virt_note_context_switch()
> wasn't called. Hence the observed stalls.
> 
> Should rcu_user_enter() itself be conditional on CONTEXT_TRACKING not
> NO_HZ_FULL? 
> 


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19 10:23                                                         ` Christian Borntraeger
@ 2018-07-19 12:55                                                           ` Paul E. McKenney
  0 siblings, 0 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-19 12:55 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: David Woodhouse, Peter Zijlstra, mhillenb, linux-kernel, kvm,
	Frederic Weisbecker

On Thu, Jul 19, 2018 at 12:23:34PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/19/2018 09:20 AM, David Woodhouse wrote:
> > On Thu, 2018-07-19 at 08:45 +0200, Christian Borntraeger wrote:
> >>
> >>> My thought would be something like this:
> >>>  
> >>>        if (context_tracking_cpu_is_enabled())
> >>>                rcu_kvm_enter();
> >>>        else
> >>>                rcu_virt_note_context_switch(smp_processor_id());
> >>
> >> In the past we needed that (when we introduced that). At least with every
> >> host interrupt we called this making an rcu event at least every HZ.
> >> Will the changes in need_resched make this part unnecessary?
> > 
> > Yes, the change in need_resched() should make this part unnecessary.
> > Unless your architecture's version of the vcpu_run() loop just loops
> > forever even when TIF_NEED_RESCHED is set? :)
> 
> Very early versions did that. The SIE instruction is interruptible
> so you can continue to run the guest by simply returning from a host
> interrupt. All sane versions of KVM on s390 now make sure to make a
> short trip into C after a host interrupt. There we check for
> need_resched, signals, and machine checks, so we are good.

OK, thank you all!  I will drop the two patches that add the
rcu_kvm_enter() and rcu_kvm_exit() calls.  Two less things to
worry about!  ;-)

							Thanx, Paul

> > I'm not sure about the context tracking condition in the code snippet
> > cited above, though. I think that's what caused my problem in the first
> > place — I have CONTEXT_TRACKING_FORCE && !NO_HZ_FULL. So in 4.15, that
> > means rcu_user_enter() did nothing and rcu_virt_note_context_switch()
> > wasn't called. Hence the observed stalls.
> > 
> > Should rcu_user_enter() itself be conditional on CONTEXT_TRACKING not
> > NO_HZ_FULL? 
> > 


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  7:20                                                       ` David Woodhouse
  2018-07-19 10:23                                                         ` Christian Borntraeger
@ 2018-07-19 13:14                                                         ` Frederic Weisbecker
  2018-07-19 13:36                                                           ` David Woodhouse
  1 sibling, 1 reply; 52+ messages in thread
From: Frederic Weisbecker @ 2018-07-19 13:14 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, paulmck, Peter Zijlstra, mhillenb,
	linux-kernel, kvm

On Thu, Jul 19, 2018 at 09:20:33AM +0200, David Woodhouse wrote:
> On Thu, 2018-07-19 at 08:45 +0200, Christian Borntraeger wrote:
> > 
> > > My thought would be something like this:
> > > 
> > >       if (context_tracking_cpu_is_enabled())
> > >               rcu_kvm_enter();
> > >       else
> > >               rcu_virt_note_context_switch(smp_processor_id());
> > 
> > In the past we needed that (when we introduced that). At least with every
> > host interrupt we called this making an rcu event at least every HZ.
> > Will the changes in need_resched make this part unnecessary?
> 
> Yes, the change in need_resched() should make this part unnecessary.
> Unless your architecture's version of the vcpu_run() loop just loops
> forever even when TIF_NEED_RESCHED is set? :)
> 
> I'm not sure about the context tracking condition in the code snippet
> cited above, though. I think that's what caused my problem in the first
> place — I have CONTEXT_TRACKING_FORCE && !NO_HZ_FULL. So in 4.15, that
> means rcu_user_enter() did nothing and rcu_virt_note_context_switch()
> wasn't called. Hence the observed stalls.
> 
> Should rcu_user_enter() itself be conditional on CONTEXT_TRACKING not
> NO_HZ_FULL? 

Ah, CONTEXT_TRACKING_FORCE is only for testing purposes; you should not select
it, as it's going to introduce overhead. Actually I should remove that, although
since we have removed CONFIG_NOHZ_FULL_ALL it's the last way we have to test
NOHZ_FULL from config alone.


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  3:11                         ` Paul E. McKenney
  2018-07-19  6:16                           ` David Woodhouse
@ 2018-07-19 13:15                           ` Frederic Weisbecker
  1 sibling, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2018-07-19 13:15 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: David Woodhouse, Peter Zijlstra, mhillenb, linux-kernel, kvm

On Wed, Jul 18, 2018 at 08:11:52PM -0700, Paul E. McKenney wrote:
> On Thu, Jul 19, 2018 at 02:32:06AM +0200, Frederic Weisbecker wrote:
> > On Wed, Jul 11, 2018 at 06:03:42PM +0100, David Woodhouse wrote:
> > > On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
> > > > And here is an updated v4.15 patch with Marius's Reported-by and David's
> > > > fix to my lost exclamation point.
> > > 
> > > Thanks. Are you sending the original version of that to Linus? It'd be
> > > useful to have the commit ID so that we can watch for it landing, and
> > > chase this one up to Greg.
> > > 
> > > As discussed on IRC, this patch reduces synchronize_sched() latency for
> > > us from ~4600s to ~160ms, which is nice.
> > > 
> > > However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> > > that you want a patch like the one below, which happily reduces the
> > > latency in our (!NO_HZ_FULL) case still further to ~40ms.
> > 
> > That is interesting. As I replied to Paul, we are already calling
> > rcu_user_enter/exit() on guest_enter/exit_irqsoff(). So I'm wondering why
> > you're seeing such an optimization by repeating those calls.
> > 
> > Perhaps the rcu_user_* somehow aren't actually called from
> > __context_tracking_enter()...? Some bug in context tracking?
> > Otherwise it's a curious side effect.
> 
> David is working with v4.15.  Is this maybe something that has changed
> since then?

Hmm, nope I think the context tracking code hasn't changed for a while.


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19  6:16                           ` David Woodhouse
@ 2018-07-19 13:17                             ` Frederic Weisbecker
  0 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2018-07-19 13:17 UTC (permalink / raw)
  To: David Woodhouse; +Cc: paulmck, Peter Zijlstra, mhillenb, linux-kernel, kvm

On Thu, Jul 19, 2018 at 08:16:47AM +0200, David Woodhouse wrote:
> 
> 
> On Wed, 2018-07-18 at 20:11 -0700, Paul E. McKenney wrote:
> > 
> > > That is interesting. As I replied to Paul, we are already calling
> > > rcu_user_enter/exit() on guest_enter/exit_irqsoff(). So I'm wondering why
> > > you're seeing such an optimization by repeating those calls.
> > > 
> > > Perhaps the rcu_user_* somehow aren't actually called from
> > > __context_tracking_enter()...? Some bug in context tracking?
> > > Otherwise it's a curious side effect.
> > 
> > David is working with v4.15.  Is this maybe something that has changed
> > since then?
> 
> To clarify: in 4.15 without CONFIG_PREEMPT and without NO_HZ_FULL I was
> seeing RCU stalls because a thread in vcpu_run() was *never* seen to go
> through a quiescent state. Hence the change to need_resched() in the
> first patch in this thread, which fixed the problem at hand and seemed
> to address the general case.
> 
> It then seemed by *inspection* that the NO_HZ_FULL case was probably
> broken, because we'd failed to spot the rcu_user_* calls. But
> rcu_user_enter() does nothing in the !NO_HZ_FULL case, so wouldn't have
> helped in the testing that we were doing anyway.

Oh ok, so the optimization you saw is likely unrelated to the rcu_user* things.


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19 13:14                                                         ` Frederic Weisbecker
@ 2018-07-19 13:36                                                           ` David Woodhouse
  0 siblings, 0 replies; 52+ messages in thread
From: David Woodhouse @ 2018-07-19 13:36 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Christian Borntraeger, paulmck, Peter Zijlstra, mhillenb,
	linux-kernel, kvm


On Thu, 2018-07-19 at 15:14 +0200, Frederic Weisbecker wrote:
> > I'm not sure about the context tracking condition in the code snippet
> > cited above, though. I think that's what caused my problem in the first
> > place — I have CONTEXT_TRACKING_FORCE && !NO_HZ_FULL. So in 4.15, that
> > means rcu_user_enter() did nothing and rcu_virt_note_context_switch()
> > wasn't called. Hence the observed stalls.
> > 
> > Should rcu_user_enter() itself be conditional on CONTEXT_TRACKING not
> > NO_HZ_FULL? 
> 
> Ah, CONTEXT_TRACKING_FORCE is only for testing purposes; you should not select
> it, as it's going to introduce overhead. Actually I should remove that, although
> since we have removed CONFIG_NOHZ_FULL_ALL it's the last way we have to test
> NOHZ_FULL from config alone.

I didn't select it. It defaults to y if !NO_HZ_FULL.
I'm turning it off now...




* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-18 16:37                                               ` Paul E. McKenney
  2018-07-18 19:41                                                 ` David Woodhouse
@ 2018-07-19 17:09                                                 ` Paul E. McKenney
  2018-07-23  8:08                                                   ` David Woodhouse
  1 sibling, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-19 17:09 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Wed, Jul 18, 2018 at 09:37:12AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 18, 2018 at 06:01:51PM +0200, David Woodhouse wrote:
> > On Wed, 2018-07-18 at 08:36 -0700, Paul E. McKenney wrote:
> > > And I finally did get some near misses from an earlier commit, so we
> > > should consider your patch to be officially off the hook.
> > 
> > Yay, I like it when it's not my fault. I'll redo it with the ifdef
> > CONFIG_NO_HZ_FULL.
> 
> Hey, I didn't say it wasn't your fault, only that it -officially- wasn't
> your fault.  ;-)

And I believe that I found my bug in this commit (lots more testing still
required, but the preponderance of evidence and all that):

2cc0c7f07143 ("rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure")

So it really isn't your fault.

Of course, the real reason for the lack of fault on your part will not
be because I believe I found the bug elsewhere, but instead because I will
be dropping your patch (and mine as well) on Frederic's advice.  ;-)

							Thanx, Paul


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-19 17:09                                                 ` Paul E. McKenney
@ 2018-07-23  8:08                                                   ` David Woodhouse
  2018-07-23 12:22                                                     ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: David Woodhouse @ 2018-07-23  8:08 UTC (permalink / raw)
  To: paulmck; +Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Thu, 2018-07-19 at 10:09 -0700, Paul E. McKenney wrote:
> 
> Of course, the real reason for the lack of fault on your part will not
> be because I believe I found the bug elsewhere, but instead because I will
> be dropping your patch (and mine as well) on Frederic's advice.  ;-)

You're keeping the need_resched() one though?

And we are still left with the fact that CONTEXT_TRACKING_FORCE is
making the existing code in guest_enter_irqoff() do the wrong thing for
!NO_HZ_FULL. But in fact the rcu_virt_note_context_switch() there is
completely redundant now we fixed need_resched(), so can be dropped,
leaving only the rcu_user_enter/exit calls for the NO_HZ_FULL case?

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5213 bytes --]


* Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested
  2018-07-23  8:08                                                   ` David Woodhouse
@ 2018-07-23 12:22                                                     ` Paul E. McKenney
  0 siblings, 0 replies; 52+ messages in thread
From: Paul E. McKenney @ 2018-07-23 12:22 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Christian Borntraeger, Peter Zijlstra, mhillenb, linux-kernel,
	kvm

On Mon, Jul 23, 2018 at 10:08:59AM +0200, David Woodhouse wrote:
> On Thu, 2018-07-19 at 10:09 -0700, Paul E. McKenney wrote:
> > 
> > Of course, the real reason for the lack of fault on your part will not
> > be because I believe I found the bug elsewhere, but instead because I will
> > be dropping your patch (and mine as well) on Frederic's advice.  ;-)
> 
> You're keeping the need_resched() one though?

Yes.  This the current commit in -rcu (which will change when I rebase
onto v4.19-rc1, if not earlier):

fcf0407e6e63 ("rcu: Make need_resched() respond to urgent RCU-QS needs")

> And we are still left with the fact that CONTEXT_TRACKING_FORCE is
> making the existing code in guest_enter_irqoff() do the wrong thing for
> !NO_HZ_FULL. But in fact the rcu_virt_note_context_switch() there is
> completely redundant now we fixed need_resched(), so can be dropped,
> leaving only the rcu_user_enter/exit calls for the NO_HZ_FULL case?

I am not yet convinced that we know exactly the right thing to be
doing for guest OSes for either value of NO_HZ_FULL, much less that
we are actually doing it.  ;-)

But what does your testing say?

							Thanx, Paul


end of thread, other threads:[~2018-07-23 12:22 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20180709163432.GV3593@linux.vnet.ibm.com>
     [not found] ` <1531162254.26547.3.camel@infradead.org>
     [not found]   ` <20180709203441.GE3593@linux.vnet.ibm.com>
     [not found]     ` <1531168538.26547.5.camel@infradead.org>
     [not found]       ` <20180709204248.GF3593@linux.vnet.ibm.com>
     [not found]         ` <1531169145.26547.8.camel@infradead.org>
     [not found]           ` <20180709210532.GH3593@linux.vnet.ibm.com>
     [not found]             ` <20180709220823.GA18045@linux.vnet.ibm.com>
     [not found]               ` <1531319025.8759.57.camel@infradead.org>
     [not found]                 ` <20180711144303.GQ3593@linux.vnet.ibm.com>
     [not found]                   ` <20180711164952.GA29994@linux.vnet.ibm.com>
2018-07-11 17:03                     ` [RFC] Make need_resched() return true when rcu_urgent_qs requested David Woodhouse
2018-07-11 17:48                       ` Paul E. McKenney
2018-07-11 18:01                         ` [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode David Woodhouse
2018-07-11 18:20                           ` Paul E. McKenney
2018-07-11 18:36                             ` Paul E. McKenney
2018-07-11 18:39                               ` Christian Borntraeger
2018-07-11 20:27                                 ` Paul E. McKenney
2018-07-11 20:54                                   ` David Woodhouse
2018-07-11 21:09                                     ` Paul E. McKenney
2018-07-11 21:11                                   ` Christian Borntraeger
2018-07-11 21:32                                     ` Paul E. McKenney
2018-07-11 21:39                                       ` Christian Borntraeger
2018-07-11 23:47                                         ` Paul E. McKenney
2018-07-12  8:31                                           ` David Woodhouse
2018-07-12 11:00                                             ` Christian Borntraeger
2018-07-12 11:10                                               ` David Woodhouse
2018-07-12 11:58                                                 ` Christian Borntraeger
2018-07-12 12:04                                                   ` Christian Borntraeger
2018-07-11 23:37                                       ` Paul E. McKenney
2018-07-12  2:15                                         ` Paul E. McKenney
2018-07-12  6:21                                         ` Christian Borntraeger
2018-07-12  9:52                                           ` David Woodhouse
2018-07-11 18:31                       ` [RFC] Make need_resched() return true when rcu_urgent_qs requested Christian Borntraeger
2018-07-11 20:17                         ` Paul E. McKenney
2018-07-11 20:19                           ` David Woodhouse
2018-07-11 21:08                             ` Paul E. McKenney
2018-07-12 12:00                               ` David Woodhouse
2018-07-12 12:53                                 ` Paul E. McKenney
2018-07-12 16:17                                   ` Paul E. McKenney
2018-07-16 15:40                                     ` Paul E. McKenney
2018-07-17  8:19                                       ` David Woodhouse
2018-07-17 12:56                                         ` Paul E. McKenney
2018-07-18 15:36                                           ` Paul E. McKenney
2018-07-18 16:01                                             ` David Woodhouse
2018-07-18 16:37                                               ` Paul E. McKenney
2018-07-18 19:41                                                 ` David Woodhouse
2018-07-18 20:17                                                   ` Paul E. McKenney
2018-07-19  0:26                                                     ` Frederic Weisbecker
2018-07-19  6:45                                                     ` Christian Borntraeger
2018-07-19  7:20                                                       ` David Woodhouse
2018-07-19 10:23                                                         ` Christian Borntraeger
2018-07-19 12:55                                                           ` Paul E. McKenney
2018-07-19 13:14                                                         ` Frederic Weisbecker
2018-07-19 13:36                                                           ` David Woodhouse
2018-07-19 17:09                                                 ` Paul E. McKenney
2018-07-23  8:08                                                   ` David Woodhouse
2018-07-23 12:22                                                     ` Paul E. McKenney
2018-07-19  0:32                       ` Frederic Weisbecker
2018-07-19  3:11                         ` Paul E. McKenney
2018-07-19  6:16                           ` David Woodhouse
2018-07-19 13:17                             ` Frederic Weisbecker
2018-07-19 13:15                           ` Frederic Weisbecker