[PATCH v2 0/4] Simple wait queue support

linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/4] Simple wait queue support
@ 2015-10-14  7:43 Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 1/4] wait.[ch]: Introduce the simple waitqueue (swait) implementation Daniel Wagner
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Daniel Wagner @ 2015-10-14  7:43 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Daniel Wagner, Paul E. McKenney, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Marcelo Tosatti, Mathieu Desnoyers, Paolo Bonzini,
	Paul Gortmaker, Peter Zijlstra, Sebastian Andrzej Siewior,
	Steven Rostedt, Thomas Gleixner

Hi,

As Peter pointed out the conversion to swait in completion is not so
simple. For the time being I drop it and look at them later again.
The main reason is that complete_all() can be called from hard-irq/atomic
context and we can't use swake_up_all() because that function wants
to run with IRQs enabled. If we can ensure that all callers of
complete_all() are running under normal context for -rt.

In this version I tried to weed out all obvious typos for the
different platforms which I missed in v1. 

These patches are against

  tip/master e6f195fd5c80214c5a4f139ee28798e66e20aa8f

also available as git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/wagi/linux.git tip-swait

cheers,
daniel

[I got the numbering wrong in v1, so instead 'PATCH v1' you find it
 as 'PATCH v0' series]

changes since v1 (PATCH v0)
 - rebased and fixed some typos found by cross building
   for S390, ARM and powerpc. For some unknown reason didn't catch
   them last time.
 - dropped completion patches because it is not clear yet
   how to handle complete_all() calls hard-irq/atomic contexts
   and swake_up_all.

changes since v0 (RFC v0)
 - promoted the series to PATCH state instead of RFC
 - fixed a few fallouts with build all and some cross compilers
   such ARM, PowerPC, S390.
 - Added the simple waitqueue transformation for KVM from -rt
   including some numbers requested by Paolo.
 - Added a commit message to PeterZ's patch. Hope he likes it.

v1: http://lwn.net/Articles/656942/
v0: http://lwn.net/Articles/653586/

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org

Daniel Wagner (1):
  rcu: Do not call swake_up_all with rnp->lock holding

Marcelo Tosatti (1):
  KVM: use simple waitqueue for vcpu->wq

Paul Gortmaker (1):
  rcu: use simple wait queues where possible in rcutree

Peter Zijlstra (Intel) (1):
  wait.[ch]: Introduce the simple waitqueue (swait) implementation

 arch/arm/kvm/arm.c                  |   4 +-
 arch/arm/kvm/psci.c                 |   4 +-
 arch/powerpc/include/asm/kvm_host.h |   4 +-
 arch/powerpc/kvm/book3s_hv.c        |  23 +++--
 arch/s390/include/asm/kvm_host.h    |   2 +-
 arch/s390/kvm/interrupt.c           |   8 +-
 arch/x86/kvm/lapic.c                |   6 +-
 include/linux/kvm_host.h            |   5 +-
 include/linux/swait.h               | 172 ++++++++++++++++++++++++++++++++++++
 kernel/rcu/tree.c                   |  17 ++--
 kernel/rcu/tree.h                   |  10 ++-
 kernel/rcu/tree_plugin.h            |  32 ++++---
 kernel/sched/Makefile               |   2 +-
 kernel/sched/swait.c                | 122 +++++++++++++++++++++++++
 virt/kvm/async_pf.c                 |   4 +-
 virt/kvm/kvm_main.c                 |  17 ++--
 16 files changed, 370 insertions(+), 62 deletions(-)
 create mode 100644 include/linux/swait.h
 create mode 100644 kernel/sched/swait.c

-- 
2.4.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/4] wait.[ch]: Introduce the simple waitqueue (swait) implementation
  2015-10-14  7:43 [PATCH v2 0/4] Simple wait queue support Daniel Wagner
@ 2015-10-14  7:43 ` Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 2/4] KVM: use simple waitqueue for vcpu->wq Daniel Wagner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Daniel Wagner @ 2015-10-14  7:43 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Peter Zijlstra (Intel), Daniel Wagner, Paul Gortmaker,
	Ingo Molnar, Marcelo Tosatti, Paolo Bonzini

From: "Peter Zijlstra (Intel)" <peterz@infradead.org>

The existing wait queue support has support for custom wake up call
backs, wake flags, wake key (passed to call back) and exclusive
flags that allow wakers to be tagged as exclusive, for limiting
the number of wakers.

In a lot of cases, none of these features are used, and hence we
can benefit from a slimmed down version that lowers memory overhead
and reduces runtime overhead.

The concept originated from -rt, where waitqueues are a constant
source of trouble, as we can't convert the head lock to a raw
spinlock due to fancy and long lasting callbacks.

With the removal of custom callbacks, we can use a raw lock for
queue list manipulations, hence allowing the simple wait support
to be used in -rt.

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Mostly-Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org

[Patch is from Peter which is based on Thomas version.
 Commit message is written by Paul.
 And some compile issues fixed by Daniel.]
---
 include/linux/swait.h | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/Makefile |   2 +-
 kernel/sched/swait.c  | 122 +++++++++++++++++++++++++++++++++++
 3 files changed, 295 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/swait.h
 create mode 100644 kernel/sched/swait.c

diff --git a/include/linux/swait.h b/include/linux/swait.h
new file mode 100644
index 0000000..c1f9c62
--- /dev/null
+++ b/include/linux/swait.h
@@ -0,0 +1,172 @@
+#ifndef _LINUX_SWAIT_H
+#define _LINUX_SWAIT_H
+
+#include <linux/list.h>
+#include <linux/stddef.h>
+#include <linux/spinlock.h>
+#include <asm/current.h>
+
+/*
+ * Simple wait queues
+ *
+ * While these are very similar to the other/complex wait queues (wait.h) the
+ * most important difference is that the simple waitqueue allows for
+ * deterministic behaviour -- IOW it has strictly bounded IRQ and lock hold
+ * times.
+ *
+ * In order to make this so, we had to drop a fair number of features of the
+ * other waitqueue code; notably:
+ *
+ *  - mixing INTERRUPTIBLE and UNINTERRUPTIBLE sleeps on the same waitqueue;
+ *    all wakeups are TASK_NORMAL in order to avoid O(n) lookups for the right
+ *    sleeper state.
+ *
+ *  - the exclusive mode; because this requires preserving the list order
+ *    and this is hard.
+ *
+ *  - custom wake functions; because you cannot give any guarantees about
+ *    random code.
+ *
+ * As a side effect of this; the data structures are slimmer.
+ *
+ * One would recommend using this wait queue where possible.
+ */
+
+struct task_struct;
+
+struct swait_queue_head {
+	raw_spinlock_t		lock;
+	struct list_head	task_list;
+};
+
+struct swait_queue {
+	struct task_struct	*task;
+	struct list_head	task_list;
+};
+
+#define __SWAITQUEUE_INITIALIZER(name) {				\
+	.task		= current,					\
+	.task_list	= LIST_HEAD_INIT((name).task_list),		\
+}
+
+#define DECLARE_SWAITQUEUE(name)					\
+	struct swait_queue name = __SWAITQUEUE_INITIALIZER(name)
+
+#define __SWAIT_QUEUE_HEAD_INITIALIZER(name) {				\
+	.lock		= __RAW_SPIN_LOCK_UNLOCKED(name.lock),		\
+	.task_list	= LIST_HEAD_INIT((name).task_list),		\
+}
+
+#define DECLARE_SWAIT_QUEUE_HEAD(name)					\
+	struct swait_queue_head name = __SWAIT_QUEUE_HEAD_INITIALIZER(name)
+
+extern void __init_swait_queue_head(struct swait_queue_head *q, const char *name,
+				    struct lock_class_key *key);
+
+#define init_swait_queue_head(q)				\
+	do {							\
+		static struct lock_class_key __key;		\
+		__init_swait_queue_head((q), #q, &__key);	\
+	} while (0)
+
+#ifdef CONFIG_LOCKDEP
+# define __SWAIT_QUEUE_HEAD_INIT_ONSTACK(name)			\
+	({ init_swait_queue_head(&name); name; })
+# define DECLARE_SWAIT_QUEUE_HEAD_ONSTACK(name)			\
+	struct swait_queue_head name = __SWAIT_QUEUE_HEAD_INIT_ONSTACK(name)
+#else
+# define DECLARE_SWAIT_QUEUE_HEAD_ONSTACK(name)			\
+	DECLARE_SWAIT_QUEUE_HEAD(name)
+#endif
+
+static inline int swait_active(struct swait_queue_head *q)
+{
+	return !list_empty(&q->task_list);
+}
+
+extern void swake_up(struct swait_queue_head *q);
+extern void swake_up_all(struct swait_queue_head *q);
+extern void swake_up_locked(struct swait_queue_head *q);
+
+extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
+extern void prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait, int state);
+extern long prepare_to_swait_event(struct swait_queue_head *q, struct swait_queue *wait, int state);
+
+extern void __finish_swait(struct swait_queue_head *q, struct swait_queue *wait);
+extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait);
+
+/* as per ___wait_event() but for swait, therefore "exclusive == 0" */
+#define ___swait_event(wq, condition, state, ret, cmd)			\
+({									\
+	struct swait_queue __wait;					\
+	long __ret = ret;						\
+									\
+	INIT_LIST_HEAD(&__wait.task_list);				\
+	for (;;) {							\
+		long __int = prepare_to_swait_event(&wq, &__wait, state);\
+									\
+		if (condition)						\
+			break;						\
+									\
+		if (___wait_is_interruptible(state) && __int) {		\
+			__ret = __int;					\
+			break;						\
+		}							\
+									\
+		cmd;							\
+	}								\
+	finish_swait(&wq, &__wait);					\
+	__ret;								\
+})
+
+#define __swait_event(wq, condition)					\
+	(void)___swait_event(wq, condition, TASK_UNINTERRUPTIBLE, 0,	\
+			    schedule())
+
+#define swait_event(wq, condition)					\
+do {									\
+	if (condition)							\
+		break;							\
+	__swait_event(wq, condition);					\
+} while (0)
+
+#define __swait_event_timeout(wq, condition, timeout)			\
+	___swait_event(wq, ___wait_cond_timeout(condition),		\
+		      TASK_UNINTERRUPTIBLE, timeout,			\
+		      __ret = schedule_timeout(__ret))
+
+#define swait_event_timeout(wq, condition, timeout)			\
+({									\
+	long __ret = timeout;						\
+	if (!___wait_cond_timeout(condition))				\
+		__ret = __swait_event_timeout(wq, condition, timeout);	\
+	__ret;								\
+})
+
+#define __swait_event_interruptible(wq, condition)			\
+	___swait_event(wq, condition, TASK_INTERRUPTIBLE, 0,		\
+		      schedule())
+
+#define swait_event_interruptible(wq, condition)			\
+({									\
+	int __ret = 0;							\
+	if (!(condition))						\
+		__ret = __swait_event_interruptible(wq, condition);	\
+	__ret;								\
+})
+
+#define __swait_event_interruptible_timeout(wq, condition, timeout)	\
+	___swait_event(wq, ___wait_cond_timeout(condition),		\
+		      TASK_INTERRUPTIBLE, timeout,			\
+		      __ret = schedule_timeout(__ret))
+
+#define swait_event_interruptible_timeout(wq, condition, timeout)	\
+({									\
+	long __ret = timeout;						\
+	if (!___wait_cond_timeout(condition))				\
+		__ret = __swait_event_interruptible_timeout(wq,		\
+						condition, timeout);	\
+	__ret;								\
+})
+
+#endif /* _LINUX_SWAIT_H */
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 6768797..7d4cba2 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -13,7 +13,7 @@ endif
 
 obj-y += core.o loadavg.o clock.o cputime.o
 obj-y += idle_task.o fair.o rt.o deadline.o stop_task.o
-obj-y += wait.o completion.o idle.o
+obj-y += wait.o swait.o completion.o idle.o
 obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o
 obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
 obj-$(CONFIG_SCHEDSTATS) += stats.o
diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
new file mode 100644
index 0000000..533710e
--- /dev/null
+++ b/kernel/sched/swait.c
@@ -0,0 +1,122 @@
+#include <linux/sched.h>
+#include <linux/swait.h>
+
+void __init_swait_queue_head(struct swait_queue_head *q, const char *name,
+			     struct lock_class_key *key)
+{
+	raw_spin_lock_init(&q->lock);
+	lockdep_set_class_and_name(&q->lock, key, name);
+	INIT_LIST_HEAD(&q->task_list);
+}
+EXPORT_SYMBOL(__init_swait_queue_head);
+
+/*
+ * The thing about the wake_up_state() return value; I think we can ignore it.
+ *
+ * If for some reason it would return 0, that means the previously waiting
+ * task is already running, so it will observe condition true (or has already).
+ */
+void swake_up_locked(struct swait_queue_head *q)
+{
+	struct swait_queue *curr;
+
+	list_for_each_entry(curr, &q->task_list, task_list) {
+		wake_up_process(curr->task);
+		list_del_init(&curr->task_list);
+		break;
+	}
+}
+EXPORT_SYMBOL(swake_up_locked);
+
+void swake_up(struct swait_queue_head *q)
+{
+	unsigned long flags;
+
+	if (!swait_active(q))
+		return;
+
+	raw_spin_lock_irqsave(&q->lock, flags);
+	swake_up_locked(q);
+	raw_spin_unlock_irqrestore(&q->lock, flags);
+}
+EXPORT_SYMBOL(swake_up);
+
+/*
+ * Does not allow usage from IRQ disabled, since we must be able to
+ * release IRQs to guarantee bounded hold time.
+ */
+void swake_up_all(struct swait_queue_head *q)
+{
+	struct swait_queue *curr;
+	LIST_HEAD(tmp);
+
+	if (!swait_active(q))
+		return;
+
+	raw_spin_lock_irq(&q->lock);
+	list_splice_init(&q->task_list, &tmp);
+	while (!list_empty(&tmp)) {
+		curr = list_first_entry(&tmp, typeof(*curr), task_list);
+
+		wake_up_state(curr->task, TASK_NORMAL);
+		list_del_init(&curr->task_list);
+
+		if (list_empty(&tmp))
+			break;
+
+		raw_spin_unlock_irq(&q->lock);
+		raw_spin_lock_irq(&q->lock);
+	}
+	raw_spin_unlock_irq(&q->lock);
+}
+EXPORT_SYMBOL(swake_up_all);
+
+void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait)
+{
+	wait->task = current;
+	if (list_empty(&wait->task_list))
+		list_add(&wait->task_list, &q->task_list);
+}
+
+void prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait, int state)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&q->lock, flags);
+	__prepare_to_swait(q, wait);
+	set_current_state(state);
+	raw_spin_unlock_irqrestore(&q->lock, flags);
+}
+EXPORT_SYMBOL(prepare_to_swait);
+
+long prepare_to_swait_event(struct swait_queue_head *q, struct swait_queue *wait, int state)
+{
+	if (signal_pending_state(state, current))
+		return -ERESTARTSYS;
+
+	prepare_to_swait(q, wait, state);
+
+	return 0;
+}
+EXPORT_SYMBOL(prepare_to_swait_event);
+
+void __finish_swait(struct swait_queue_head *q, struct swait_queue *wait)
+{
+	__set_current_state(TASK_RUNNING);
+	if (!list_empty(&wait->task_list))
+		list_del_init(&wait->task_list);
+}
+
+void finish_swait(struct swait_queue_head *q, struct swait_queue *wait)
+{
+	unsigned long flags;
+
+	__set_current_state(TASK_RUNNING);
+
+	if (!list_empty_careful(&wait->task_list)) {
+		raw_spin_lock_irqsave(&q->lock, flags);
+		list_del_init(&wait->task_list);
+		raw_spin_unlock_irqrestore(&q->lock, flags);
+	}
+}
+EXPORT_SYMBOL(finish_swait);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/4] KVM: use simple waitqueue for vcpu->wq
  2015-10-14  7:43 [PATCH v2 0/4] Simple wait queue support Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 1/4] wait.[ch]: Introduce the simple waitqueue (swait) implementation Daniel Wagner
@ 2015-10-14  7:43 ` Daniel Wagner
  2015-10-14 14:54   ` Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 4/4] rcu: Do not call swake_up_all with rnp->lock holding Daniel Wagner
  3 siblings, 1 reply; 8+ messages in thread
From: Daniel Wagner @ 2015-10-14  7:43 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Marcelo Tosatti, Sebastian Andrzej Siewior, Daniel Wagner,
	Paolo Bonzini

From: Marcelo Tosatti <mtosatti@redhat.com>

The problem:

On -rt, an emulated LAPIC timer instances has the following path:

1) hard interrupt
2) ksoftirqd is scheduled
3) ksoftirqd wakes up vcpu thread
4) vcpu thread is scheduled

This extra context switch introduces unnecessary latency in the
LAPIC path for a KVM guest.

The solution:

Allow waking up vcpu thread from hardirq context,
thus avoiding the need for ksoftirqd to be scheduled.

Normal waitqueues make use of spinlocks, which on -RT
are sleepable locks. Therefore, waking up a waitqueue
waiter involves locking a sleeping lock, which
is not allowed from hard interrupt context.

cyclictest command line:

This patch reduces the average latency in my tests from 14us to 11us.

Daniel writes:
Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency
benchmark on mainline. The test was run 382, respectively 300 times:

  ./x86-run x86/tscdeadline_latency.flat -cpu host

with idle=poll.

The test seems not to deliver really stable numbers though most of
them are smaller which I consider a good sign:

Before:

	min              max          mean            std
count   382.000000       382.000000    382.000000     382.000000
mean   6068.552356    269502.528796   8056.016198    3912.128273
std     707.404966    848866.474783   1062.472704    9835.891707
min    2335.000000     29828.000000   7337.426000     445.738750
25%    6004.500000     44237.500000   7471.094250    1078.834837
50%    6372.000000     64175.000000   7663.133700    1783.172446
75%    6465.500000    150384.500000   8210.771900    2759.734524
max    6886.000000  10188451.000000  15466.434000  120469.205668

After
	min             max          mean           std
count   300.000000      300.000000    300.000000    300.000000
mean   5618.380000   217464.786667   7745.545114   3258.483272
std     824.719741   516371.888369    847.391685   5632.943904
min    3494.000000    31410.000000   7083.574800    438.445477
25%    4937.000000    45446.000000   7214.102850   1045.536261
50%    6118.000000    67023.000000   7417.330800   1699.574075
75%    6224.000000   134191.500000   7871.625600   2809.536185
max    6654.000000  4570896.000000  13528.788600  52206.226799

[Patch was originaly based on the swait implementation found in the -rt
 tree. Daniel ported it to mainline's version and gathered the
 benchmark numbers for tscdeadline_latency test.]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org
---
 arch/arm/kvm/arm.c                  |  4 ++--
 arch/arm/kvm/psci.c                 |  4 ++--
 arch/powerpc/include/asm/kvm_host.h |  4 ++--
 arch/powerpc/kvm/book3s_hv.c        | 23 +++++++++++------------
 arch/s390/include/asm/kvm_host.h    |  2 +-
 arch/s390/kvm/interrupt.c           |  8 ++++----
 arch/x86/kvm/lapic.c                |  6 +++---
 include/linux/kvm_host.h            |  5 +++--
 virt/kvm/async_pf.c                 |  4 ++--
 virt/kvm/kvm_main.c                 | 17 ++++++++---------
 10 files changed, 38 insertions(+), 39 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index dc017ad..97e8336 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -470,9 +470,9 @@ bool kvm_arch_intc_initialized(struct kvm *kvm)
 
 static void vcpu_pause(struct kvm_vcpu *vcpu)
 {
-	wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu);
+	struct swait_queue_head *wq = kvm_arch_vcpu_wq(vcpu);
 
-	wait_event_interruptible(*wq, !vcpu->arch.pause);
+	swait_event_interruptible(*wq, !vcpu->arch.pause);
 }
 
 static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index ad6f642..2b93577 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -70,7 +70,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct kvm *kvm = source_vcpu->kvm;
 	struct kvm_vcpu *vcpu = NULL;
-	wait_queue_head_t *wq;
+	struct swait_queue_head *wq;
 	unsigned long cpu_id;
 	unsigned long context_id;
 	phys_addr_t target_pc;
@@ -119,7 +119,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	smp_mb();		/* Make sure the above is visible */
 
 	wq = kvm_arch_vcpu_wq(vcpu);
-	wake_up_interruptible(wq);
+	swake_up(wq);
 
 	return PSCI_RET_SUCCESS;
 }
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 827a38d..12e9835 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -286,7 +286,7 @@ struct kvmppc_vcore {
 	struct list_head runnable_threads;
 	struct list_head preempt_list;
 	spinlock_t lock;
-	wait_queue_head_t wq;
+	struct swait_queue_head wq;
 	spinlock_t stoltb_lock;	/* protects stolen_tb and preempt_tb */
 	u64 stolen_tb;
 	u64 preempt_tb;
@@ -628,7 +628,7 @@ struct kvm_vcpu_arch {
 	u8 prodded;
 	u32 last_inst;
 
-	wait_queue_head_t *wqp;
+	struct swait_queue_head *wqp;
 	struct kvmppc_vcore *vcore;
 	int ret;
 	int trap;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2280497..f534e15 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -121,11 +121,11 @@ static bool kvmppc_ipi_thread(int cpu)
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
 	int cpu;
-	wait_queue_head_t *wqp;
+	struct swait_queue_head *wqp;
 
 	wqp = kvm_arch_vcpu_wq(vcpu);
-	if (waitqueue_active(wqp)) {
-		wake_up_interruptible(wqp);
+	if (swait_active(wqp)) {
+		swake_up(wqp);
 		++vcpu->stat.halt_wakeup;
 	}
 
@@ -708,8 +708,8 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 		tvcpu->arch.prodded = 1;
 		smp_mb();
 		if (vcpu->arch.ceded) {
-			if (waitqueue_active(&vcpu->wq)) {
-				wake_up_interruptible(&vcpu->wq);
+			if (swait_active(&vcpu->wq)) {
+				swake_up(&vcpu->wq);
 				vcpu->stat.halt_wakeup++;
 			}
 		}
@@ -1448,7 +1448,7 @@ static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
 	INIT_LIST_HEAD(&vcore->runnable_threads);
 	spin_lock_init(&vcore->lock);
 	spin_lock_init(&vcore->stoltb_lock);
-	init_waitqueue_head(&vcore->wq);
+	init_swait_queue_head(&vcore->wq);
 	vcore->preempt_tb = TB_NIL;
 	vcore->lpcr = kvm->arch.lpcr;
 	vcore->first_vcpuid = core * threads_per_subcore;
@@ -2560,10 +2560,9 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 {
 	struct kvm_vcpu *vcpu;
 	int do_sleep = 1;
+	DECLARE_SWAITQUEUE(wait);
 
-	DEFINE_WAIT(wait);
-
-	prepare_to_wait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
+	prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
 
 	/*
 	 * Check one last time for pending exceptions and ceded state after
@@ -2577,7 +2576,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	}
 
 	if (!do_sleep) {
-		finish_wait(&vc->wq, &wait);
+		finish_swait(&vc->wq, &wait);
 		return;
 	}
 
@@ -2585,7 +2584,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	trace_kvmppc_vcore_blocked(vc, 0);
 	spin_unlock(&vc->lock);
 	schedule();
-	finish_wait(&vc->wq, &wait);
+	finish_swait(&vc->wq, &wait);
 	spin_lock(&vc->lock);
 	vc->vcore_state = VCORE_INACTIVE;
 	trace_kvmppc_vcore_blocked(vc, 1);
@@ -2641,7 +2640,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 			kvmppc_start_thread(vcpu, vc);
 			trace_kvm_guest_enter(vcpu);
 		} else if (vc->vcore_state == VCORE_SLEEPING) {
-			wake_up(&vc->wq);
+			swake_up(&vc->wq);
 		}
 
 	}
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 8ced426..a044ddb 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -427,7 +427,7 @@ struct kvm_s390_irq_payload {
 struct kvm_s390_local_interrupt {
 	spinlock_t lock;
 	struct kvm_s390_float_interrupt *float_int;
-	wait_queue_head_t *wq;
+	struct swait_queue_head *wq;
 	atomic_t *cpuflags;
 	DECLARE_BITMAP(sigp_emerg_pending, KVM_MAX_VCPUS);
 	struct kvm_s390_irq_payload irq;
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 5c2c169..78625fa 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -884,13 +884,13 @@ no_timer:
 
 void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu)
 {
-	if (waitqueue_active(&vcpu->wq)) {
+	if (swait_active(&vcpu->wq)) {
 		/*
 		 * The vcpu gave up the cpu voluntarily, mark it as a good
 		 * yield-candidate.
 		 */
 		vcpu->preempted = true;
-		wake_up_interruptible(&vcpu->wq);
+		swake_up(&vcpu->wq);
 		vcpu->stat.halt_wakeup++;
 	}
 }
@@ -994,7 +994,7 @@ int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code)
 	spin_lock(&li->lock);
 	irq.u.pgm.code = code;
 	__inject_prog(vcpu, &irq);
-	BUG_ON(waitqueue_active(li->wq));
+	BUG_ON(swait_active(li->wq));
 	spin_unlock(&li->lock);
 	return 0;
 }
@@ -1009,7 +1009,7 @@ int kvm_s390_inject_prog_irq(struct kvm_vcpu *vcpu,
 	spin_lock(&li->lock);
 	irq.u.pgm = *pgm_info;
 	rc = __inject_prog(vcpu, &irq);
-	BUG_ON(waitqueue_active(li->wq));
+	BUG_ON(swait_active(li->wq));
 	spin_unlock(&li->lock);
 	return rc;
 }
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8d9013c..a59aead 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1117,7 +1117,7 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
 static void apic_timer_expired(struct kvm_lapic *apic)
 {
 	struct kvm_vcpu *vcpu = apic->vcpu;
-	wait_queue_head_t *q = &vcpu->wq;
+	struct swait_queue_head *q = &vcpu->wq;
 	struct kvm_timer *ktimer = &apic->lapic_timer;
 
 	if (atomic_read(&apic->lapic_timer.pending))
@@ -1126,8 +1126,8 @@ static void apic_timer_expired(struct kvm_lapic *apic)
 	atomic_inc(&apic->lapic_timer.pending);
 	kvm_set_pending_timer(vcpu);
 
-	if (waitqueue_active(q))
-		wake_up_interruptible(q);
+	if (swait_active(q))
+		swake_up(q);
 
 	if (apic_lvtt_tscdeadline(apic))
 		ktimer->expired_tscdeadline = ktimer->tscdeadline;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1bef9e2..7b6231e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -24,6 +24,7 @@
 #include <linux/err.h>
 #include <linux/irqflags.h>
 #include <linux/context_tracking.h>
+#include <linux/swait.h>
 #include <asm/signal.h>
 
 #include <linux/kvm.h>
@@ -237,7 +238,7 @@ struct kvm_vcpu {
 	int fpu_active;
 	int guest_fpu_loaded, guest_xcr0_loaded;
 	unsigned char fpu_counter;
-	wait_queue_head_t wq;
+	struct swait_queue_head wq;
 	struct pid *pid;
 	int sigset_active;
 	sigset_t sigset;
@@ -759,7 +760,7 @@ static inline bool kvm_arch_has_assigned_device(struct kvm *kvm)
 }
 #endif
 
-static inline wait_queue_head_t *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
+static inline struct swait_queue_head *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
 {
 #ifdef __KVM_HAVE_ARCH_WQP
 	return vcpu->arch.wqp;
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 44660ae..ff4891c 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -94,8 +94,8 @@ static void async_pf_execute(struct work_struct *work)
 
 	trace_kvm_async_pf_completed(addr, gva);
 
-	if (waitqueue_active(&vcpu->wq))
-		wake_up_interruptible(&vcpu->wq);
+	if (swait_active(&vcpu->wq))
+		swake_up(&vcpu->wq);
 
 	mmput(mm);
 	kvm_put_kvm(vcpu->kvm);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8db1d93..45ab55f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -226,8 +226,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	vcpu->kvm = kvm;
 	vcpu->vcpu_id = id;
 	vcpu->pid = NULL;
-	vcpu->halt_poll_ns = 0;
-	init_waitqueue_head(&vcpu->wq);
+	init_swait_queue_head(&vcpu->wq);
 	kvm_async_pf_vcpu_init(vcpu);
 
 	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
@@ -1996,7 +1995,7 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
 	ktime_t start, cur;
-	DEFINE_WAIT(wait);
+	DECLARE_SWAITQUEUE(wait);
 	bool waited = false;
 	u64 block_ns;
 
@@ -2019,7 +2018,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	}
 
 	for (;;) {
-		prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+		prepare_to_swait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
 
 		if (kvm_vcpu_check_block(vcpu) < 0)
 			break;
@@ -2028,7 +2027,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		schedule();
 	}
 
-	finish_wait(&vcpu->wq, &wait);
+	finish_swait(&vcpu->wq, &wait);
 	cur = ktime_get();
 
 out:
@@ -2059,11 +2058,11 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
 	int me;
 	int cpu = vcpu->cpu;
-	wait_queue_head_t *wqp;
+	struct swait_queue_head *wqp;
 
 	wqp = kvm_arch_vcpu_wq(vcpu);
-	if (waitqueue_active(wqp)) {
-		wake_up_interruptible(wqp);
+	if (swait_active(wqp)) {
+		swake_up(wqp);
 		++vcpu->stat.halt_wakeup;
 	}
 
@@ -2164,7 +2163,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
 				continue;
 			if (vcpu == me)
 				continue;
-			if (waitqueue_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
+			if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
 				continue;
 			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
 				continue;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree
  2015-10-14  7:43 [PATCH v2 0/4] Simple wait queue support Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 1/4] wait.[ch]: Introduce the simple waitqueue (swait) implementation Daniel Wagner
  2015-10-14  7:43 ` [PATCH v2 2/4] KVM: use simple waitqueue for vcpu->wq Daniel Wagner
@ 2015-10-14  7:43 ` Daniel Wagner
  2015-10-19 23:31   ` Paul E. McKenney
  2015-10-14  7:43 ` [PATCH v2 4/4] rcu: Do not call swake_up_all with rnp->lock holding Daniel Wagner
  3 siblings, 1 reply; 8+ messages in thread
From: Daniel Wagner @ 2015-10-14  7:43 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Paul Gortmaker, Daniel Wagner, Thomas Gleixner,
	Sebastian Andrzej Siewior, Paul E. McKenney, Steven Rostedt

From: Paul Gortmaker <paul.gortmaker@windriver.com>

As of commit dae6e64d2bcfd4b06304ab864c7e3a4f6b5fedf4 ("rcu: Introduce
proper blocking to no-CBs kthreads GP waits") the RCU subsystem started
making use of wait queues.

Here we convert all additions of RCU wait queues to use simple wait queues,
since they don't need the extra overhead of the full wait queue features.

Originally this was done for RT kernels[1], since we would get things like...

  BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
  in_atomic(): 1, irqs_disabled(): 1, pid: 8, name: rcu_preempt
  Pid: 8, comm: rcu_preempt Not tainted
  Call Trace:
   [<ffffffff8106c8d0>] __might_sleep+0xd0/0xf0
   [<ffffffff817d77b4>] rt_spin_lock+0x24/0x50
   [<ffffffff8106fcf6>] __wake_up+0x36/0x70
   [<ffffffff810c4542>] rcu_gp_kthread+0x4d2/0x680
   [<ffffffff8105f910>] ? __init_waitqueue_head+0x50/0x50
   [<ffffffff810c4070>] ? rcu_gp_fqs+0x80/0x80
   [<ffffffff8105eabb>] kthread+0xdb/0xe0
   [<ffffffff8106b912>] ? finish_task_switch+0x52/0x100
   [<ffffffff817e0754>] kernel_thread_helper+0x4/0x10
   [<ffffffff8105e9e0>] ? __init_kthread_worker+0x60/0x60
   [<ffffffff817e0750>] ? gs_change+0xb/0xb

...and hence simple wait queues were deployed on RT out of necessity
(as simple wait uses a raw lock), but mainline might as well take
advantage of the more streamline support as well.

[1] This is a carry forward of work from v3.10-rt; the original conversion
was by Thomas on an earlier -rt version, and Sebastian extended it to
additional post-3.10 added RCU waiters; here I've added a commit log and
unified the RCU changes into one, and uprev'd it to match mainline RCU.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
---
 kernel/rcu/tree.c        | 13 +++++++------
 kernel/rcu/tree.h        |  7 ++++---
 kernel/rcu/tree_plugin.h | 18 +++++++++---------
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 775d36c..f1c80aa 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1589,7 +1589,7 @@ static void rcu_gp_kthread_wake(struct rcu_state *rsp)
 	    !READ_ONCE(rsp->gp_flags) ||
 	    !rsp->gp_kthread)
 		return;
-	wake_up(&rsp->gp_wq);
+	swake_up(&rsp->gp_wq);
 }
 
 /*
@@ -2057,7 +2057,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       READ_ONCE(rsp->gpnum),
 					       TPS("reqwait"));
 			rsp->gp_state = RCU_GP_WAIT_GPS;
-			wait_event_interruptible(rsp->gp_wq,
+			swait_event_interruptible(rsp->gp_wq,
 						 READ_ONCE(rsp->gp_flags) &
 						 RCU_GP_FLAG_INIT);
 			rsp->gp_state = RCU_GP_DONE_GPS;
@@ -2087,7 +2087,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       READ_ONCE(rsp->gpnum),
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
-			ret = wait_event_interruptible_timeout(rsp->gp_wq,
+			ret = swait_event_interruptible_timeout(rsp->gp_wq,
 					rcu_gp_fqs_check_wake(rsp, &gf), j);
 			rsp->gp_state = RCU_GP_DOING_FQS;
 			/* Locking provides needed memory barriers. */
@@ -2210,7 +2210,7 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
 	WARN_ON_ONCE(!rcu_gp_in_progress(rsp));
 	WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS);
 	raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
-	rcu_gp_kthread_wake(rsp);
+	swake_up(&rsp->gp_wq);  /* Memory barrier implied by swake_up() path. */
 }
 
 /*
@@ -2871,7 +2871,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
 	}
 	WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS);
 	raw_spin_unlock_irqrestore(&rnp_old->lock, flags);
-	rcu_gp_kthread_wake(rsp);
+	swake_up(&rsp->gp_wq); /* Memory barrier implied by swake_up() path. */
 }
 
 /*
@@ -4178,7 +4178,8 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 		}
 	}
 
-	init_waitqueue_head(&rsp->gp_wq);
+	rsp->rda = rda;
+	init_swait_queue_head(&rsp->gp_wq);
 	rnp = rsp->level[rcu_num_lvls - 1];
 	for_each_possible_cpu(i) {
 		while (i > rnp->grphi)
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 2e991f8..6aa3776 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -27,6 +27,7 @@
 #include <linux/threads.h>
 #include <linux/cpumask.h>
 #include <linux/seqlock.h>
+#include <linux/swait.h>
 #include <linux/stop_machine.h>
 
 /*
@@ -244,7 +245,7 @@ struct rcu_node {
 				/* Refused to boost: not sure why, though. */
 				/*  This can happen due to race conditions. */
 #ifdef CONFIG_RCU_NOCB_CPU
-	wait_queue_head_t nocb_gp_wq[2];
+	struct swait_queue_head nocb_gp_wq[2];
 				/* Place for rcu_nocb_kthread() to wait GP. */
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
 	int need_future_gp[2];
@@ -388,7 +389,7 @@ struct rcu_data {
 	atomic_long_t nocb_q_count_lazy; /*  invocation (all stages). */
 	struct rcu_head *nocb_follower_head; /* CBs ready to invoke. */
 	struct rcu_head **nocb_follower_tail;
-	wait_queue_head_t nocb_wq;	/* For nocb kthreads to sleep on. */
+	struct swait_queue_head nocb_wq; /* For nocb kthreads to sleep on. */
 	struct task_struct *nocb_kthread;
 	int nocb_defer_wakeup;		/* Defer wakeup of nocb_kthread. */
 
@@ -475,7 +476,7 @@ struct rcu_state {
 	unsigned long gpnum;			/* Current gp number. */
 	unsigned long completed;		/* # of last completed gp. */
 	struct task_struct *gp_kthread;		/* Task for grace periods. */
-	wait_queue_head_t gp_wq;		/* Where GP task waits. */
+	struct swait_queue_head gp_wq;		/* Where GP task waits. */
 	short gp_flags;				/* Commands for GP task. */
 	short gp_state;				/* GP kthread sleep state. */
 
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index b2bf396..545acdf 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1779,7 +1779,7 @@ early_param("rcu_nocb_poll", parse_rcu_nocb_poll);
  */
 static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
 {
-	wake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
+	swake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
 }
 
 /*
@@ -1797,8 +1797,8 @@ static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq)
 
 static void rcu_init_one_nocb(struct rcu_node *rnp)
 {
-	init_waitqueue_head(&rnp->nocb_gp_wq[0]);
-	init_waitqueue_head(&rnp->nocb_gp_wq[1]);
+	init_swait_queue_head(&rnp->nocb_gp_wq[0]);
+	init_swait_queue_head(&rnp->nocb_gp_wq[1]);
 }
 
 #ifndef CONFIG_RCU_NOCB_CPU_ALL
@@ -1823,7 +1823,7 @@ static void wake_nocb_leader(struct rcu_data *rdp, bool force)
 	if (READ_ONCE(rdp_leader->nocb_leader_sleep) || force) {
 		/* Prior smp_mb__after_atomic() orders against prior enqueue. */
 		WRITE_ONCE(rdp_leader->nocb_leader_sleep, false);
-		wake_up(&rdp_leader->nocb_wq);
+		swake_up(&rdp_leader->nocb_wq);
 	}
 }
 
@@ -2036,7 +2036,7 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
 	 */
 	trace_rcu_future_gp(rnp, rdp, c, TPS("StartWait"));
 	for (;;) {
-		wait_event_interruptible(
+		swait_event_interruptible(
 			rnp->nocb_gp_wq[c & 0x1],
 			(d = ULONG_CMP_GE(READ_ONCE(rnp->completed), c)));
 		if (likely(d))
@@ -2064,7 +2064,7 @@ wait_again:
 	/* Wait for callbacks to appear. */
 	if (!rcu_nocb_poll) {
 		trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, "Sleep");
-		wait_event_interruptible(my_rdp->nocb_wq,
+		swait_event_interruptible(my_rdp->nocb_wq,
 				!READ_ONCE(my_rdp->nocb_leader_sleep));
 		/* Memory barrier handled by smp_mb() calls below and repoll. */
 	} else if (firsttime) {
@@ -2139,7 +2139,7 @@ wait_again:
 			 * List was empty, wake up the follower.
 			 * Memory barriers supplied by atomic_long_add().
 			 */
-			wake_up(&rdp->nocb_wq);
+			swake_up(&rdp->nocb_wq);
 		}
 	}
 
@@ -2160,7 +2160,7 @@ static void nocb_follower_wait(struct rcu_data *rdp)
 		if (!rcu_nocb_poll) {
 			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
 					    "FollowerSleep");
-			wait_event_interruptible(rdp->nocb_wq,
+			swait_event_interruptible(rdp->nocb_wq,
 						 READ_ONCE(rdp->nocb_follower_head));
 		} else if (firsttime) {
 			/* Don't drown trace log with "Poll"! */
@@ -2319,7 +2319,7 @@ void __init rcu_init_nohz(void)
 static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
 {
 	rdp->nocb_tail = &rdp->nocb_head;
-	init_waitqueue_head(&rdp->nocb_wq);
+	init_swait_queue_head(&rdp->nocb_wq);
 	rdp->nocb_follower_tail = &rdp->nocb_follower_head;
 }
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 4/4] rcu: Do not call swake_up_all with rnp->lock holding
  2015-10-14  7:43 [PATCH v2 0/4] Simple wait queue support Daniel Wagner
                   ` (2 preceding siblings ...)
  2015-10-14  7:43 ` [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree Daniel Wagner
@ 2015-10-14  7:43 ` Daniel Wagner
  3 siblings, 0 replies; 8+ messages in thread
From: Daniel Wagner @ 2015-10-14  7:43 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Daniel Wagner, Paul E. McKenney, Peter Zijlstra, Thomas Gleixner,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan

By moving the rcu_nocb_gp_cleanup() call out of the rnp->lock
protected region we avoid a deadlock as lockdep reported.

swake_up_all() is toggling IRQ enable/disable. That means we might
start processing soft IRQs. __do_softirq() calls
rcu_process_callbacks() which wants to grab nrp->lock.

 =================================
 [ INFO: inconsistent lock state ]
 4.2.0-rc5-00025-g9a73ba0 #136 Not tainted
 ---------------------------------
 inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
 rcu_preempt/8 [HC0[0]:SC0[0]:HE1:SE1] takes:
  (rcu_node_1){+.?...}, at: [<ffffffff811387c7>] rcu_gp_kthread+0xb97/0xeb0
 {IN-SOFTIRQ-W} state was registered at:
   [<ffffffff81109b9f>] __lock_acquire+0xd5f/0x21e0
   [<ffffffff8110be0f>] lock_acquire+0xdf/0x2b0
   [<ffffffff81841cc9>] _raw_spin_lock_irqsave+0x59/0xa0
   [<ffffffff81136991>] rcu_process_callbacks+0x141/0x3c0
   [<ffffffff810b1a9d>] __do_softirq+0x14d/0x670
   [<ffffffff810b2214>] irq_exit+0x104/0x110
   [<ffffffff81844e96>] smp_apic_timer_interrupt+0x46/0x60
   [<ffffffff81842e70>] apic_timer_interrupt+0x70/0x80
   [<ffffffff810dba66>] rq_attach_root+0xa6/0x100
   [<ffffffff810dbc2d>] cpu_attach_domain+0x16d/0x650
   [<ffffffff810e4b42>] build_sched_domains+0x942/0xb00
   [<ffffffff821777c2>] sched_init_smp+0x509/0x5c1
   [<ffffffff821551e3>] kernel_init_freeable+0x172/0x28f
   [<ffffffff8182cdce>] kernel_init+0xe/0xe0
   [<ffffffff8184231f>] ret_from_fork+0x3f/0x70
 irq event stamp: 76
 hardirqs last  enabled at (75): [<ffffffff81841330>] _raw_spin_unlock_irq+0x30/0x60
 hardirqs last disabled at (76): [<ffffffff8184116f>] _raw_spin_lock_irq+0x1f/0x90
 softirqs last  enabled at (0): [<ffffffff810a8df2>] copy_process.part.26+0x602/0x1cf0
 softirqs last disabled at (0): [<          (null)>]           (null)
 other info that might help us debug this:
  Possible unsafe locking scenario:
        CPU0
        ----
   lock(rcu_node_1);
   <Interrupt>
     lock(rcu_node_1);
  *** DEADLOCK ***
 1 lock held by rcu_preempt/8:
  #0:  (rcu_node_1){+.?...}, at: [<ffffffff811387c7>] rcu_gp_kthread+0xb97/0xeb0
 stack backtrace:
 CPU: 0 PID: 8 Comm: rcu_preempt Not tainted 4.2.0-rc5-00025-g9a73ba0 #136
 Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
  0000000000000000 000000006d7e67d8 ffff881fb081fbd8 ffffffff818379e0
  0000000000000000 ffff881fb0812a00 ffff881fb081fc38 ffffffff8110813b
  0000000000000000 0000000000000001 ffff881f00000001 ffffffff8102fa4f
 Call Trace:
  [<ffffffff818379e0>] dump_stack+0x4f/0x7b
  [<ffffffff8110813b>] print_usage_bug+0x1db/0x1e0
  [<ffffffff8102fa4f>] ? save_stack_trace+0x2f/0x50
  [<ffffffff811087ad>] mark_lock+0x66d/0x6e0
  [<ffffffff81107790>] ? check_usage_forwards+0x150/0x150
  [<ffffffff81108898>] mark_held_locks+0x78/0xa0
  [<ffffffff81841330>] ? _raw_spin_unlock_irq+0x30/0x60
  [<ffffffff81108a28>] trace_hardirqs_on_caller+0x168/0x220
  [<ffffffff81108aed>] trace_hardirqs_on+0xd/0x10
  [<ffffffff81841330>] _raw_spin_unlock_irq+0x30/0x60
  [<ffffffff810fd1c7>] swake_up_all+0xb7/0xe0
  [<ffffffff811386e1>] rcu_gp_kthread+0xab1/0xeb0
  [<ffffffff811089bf>] ? trace_hardirqs_on_caller+0xff/0x220
  [<ffffffff81841341>] ? _raw_spin_unlock_irq+0x41/0x60
  [<ffffffff81137c30>] ? rcu_barrier+0x20/0x20
  [<ffffffff810d2014>] kthread+0x104/0x120
  [<ffffffff81841330>] ? _raw_spin_unlock_irq+0x30/0x60
  [<ffffffff810d1f10>] ? kthread_create_on_node+0x260/0x260
  [<ffffffff8184231f>] ret_from_fork+0x3f/0x70
  [<ffffffff810d1f10>] ? kthread_create_on_node+0x260/0x260

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: linux-kernel@vger.kernel.org
---
 kernel/rcu/tree.c        |  4 +++-
 kernel/rcu/tree.h        |  3 ++-
 kernel/rcu/tree_plugin.h | 16 +++++++++++++---
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f1c80aa..6388f8a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1568,7 +1568,6 @@ static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
 	int needmore;
 	struct rcu_data *rdp = this_cpu_ptr(rsp->rda);
 
-	rcu_nocb_gp_cleanup(rsp, rnp);
 	rnp->need_future_gp[c & 0x1] = 0;
 	needmore = rnp->need_future_gp[(c + 1) & 0x1];
 	trace_rcu_future_gp(rnp, rdp, c,
@@ -1972,6 +1971,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	int nocb = 0;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
+	struct swait_queue_head *sq;
 
 	WRITE_ONCE(rsp->gp_activity, jiffies);
 	raw_spin_lock_irq(&rnp->lock);
@@ -2010,7 +2010,9 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 			needgp = __note_gp_changes(rsp, rnp, rdp) || needgp;
 		/* smp_mb() provided by prior unlock-lock pair. */
 		nocb += rcu_future_gp_cleanup(rsp, rnp);
+		sq = rcu_nocb_gp_get(rnp);
 		raw_spin_unlock_irq(&rnp->lock);
+		rcu_nocb_gp_cleanup(sq);
 		cond_resched_rcu_qs();
 		WRITE_ONCE(rsp->gp_activity, jiffies);
 		rcu_gp_slow(rsp, gp_cleanup_delay);
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 6aa3776..30cf567 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -609,7 +609,8 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static void increment_cpu_stall_ticks(void);
 static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu);
 static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq);
-static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp);
+static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
+static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
 static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
 			    bool lazy, unsigned long flags);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 545acdf..0c69868 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1777,9 +1777,9 @@ early_param("rcu_nocb_poll", parse_rcu_nocb_poll);
  * Wake up any no-CBs CPUs' kthreads that were waiting on the just-ended
  * grace period.
  */
-static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
+static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
 {
-	swake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
+	swake_up_all(sq);
 }
 
 /*
@@ -1795,6 +1795,11 @@ static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq)
 	rnp->need_future_gp[(rnp->completed + 1) & 0x1] += nrq;
 }
 
+static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
+{
+	return &rnp->nocb_gp_wq[rnp->completed & 0x1];
+}
+
 static void rcu_init_one_nocb(struct rcu_node *rnp)
 {
 	init_swait_queue_head(&rnp->nocb_gp_wq[0]);
@@ -2469,7 +2474,7 @@ static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu)
 	return false;
 }
 
-static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
+static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
 {
 }
 
@@ -2477,6 +2482,11 @@ static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq)
 {
 }
 
+static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
+{
+	return NULL;
+}
+
 static void rcu_init_one_nocb(struct rcu_node *rnp)
 {
 }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/4] KVM: use simple waitqueue for vcpu->wq
  2015-10-14  7:43 ` [PATCH v2 2/4] KVM: use simple waitqueue for vcpu->wq Daniel Wagner
@ 2015-10-14 14:54   ` Daniel Wagner
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel Wagner @ 2015-10-14 14:54 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Marcelo Tosatti, Sebastian Andrzej Siewior, Paolo Bonzini

On 10/14/2015 09:43 AM, Daniel Wagner wrote:
>  arch/arm/kvm/arm.c                  |  4 ++--
>  arch/arm/kvm/psci.c                 |  4 ++--
>  arch/powerpc/include/asm/kvm_host.h |  4 ++--
>  arch/powerpc/kvm/book3s_hv.c        | 23 +++++++++++------------
>  arch/s390/include/asm/kvm_host.h    |  2 +-
>  arch/s390/kvm/interrupt.c           |  8 ++++----
>  arch/x86/kvm/lapic.c                |  6 +++---
>  include/linux/kvm_host.h            |  5 +++--
>  virt/kvm/async_pf.c                 |  4 ++--
>  virt/kvm/kvm_main.c                 | 17 ++++++++---------
>  10 files changed, 38 insertions(+), 39 deletions(-)

As reported by the 0day build robot, at least the mips arch part needs
also to be updated.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree
  2015-10-14  7:43 ` [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree Daniel Wagner
@ 2015-10-19 23:31   ` Paul E. McKenney
  2015-10-20  7:14     ` Daniel Wagner
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2015-10-19 23:31 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-kernel, linux-rt-users, Paul Gortmaker, Thomas Gleixner,
	Sebastian Andrzej Siewior, Steven Rostedt

On Wed, Oct 14, 2015 at 09:43:21AM +0200, Daniel Wagner wrote:
> From: Paul Gortmaker <paul.gortmaker@windriver.com>
> 
> As of commit dae6e64d2bcfd4b06304ab864c7e3a4f6b5fedf4 ("rcu: Introduce
> proper blocking to no-CBs kthreads GP waits") the RCU subsystem started
> making use of wait queues.
> 
> Here we convert all additions of RCU wait queues to use simple wait queues,
> since they don't need the extra overhead of the full wait queue features.
> 
> Originally this was done for RT kernels[1], since we would get things like...
> 
>   BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
>   in_atomic(): 1, irqs_disabled(): 1, pid: 8, name: rcu_preempt
>   Pid: 8, comm: rcu_preempt Not tainted
>   Call Trace:
>    [<ffffffff8106c8d0>] __might_sleep+0xd0/0xf0
>    [<ffffffff817d77b4>] rt_spin_lock+0x24/0x50
>    [<ffffffff8106fcf6>] __wake_up+0x36/0x70
>    [<ffffffff810c4542>] rcu_gp_kthread+0x4d2/0x680
>    [<ffffffff8105f910>] ? __init_waitqueue_head+0x50/0x50
>    [<ffffffff810c4070>] ? rcu_gp_fqs+0x80/0x80
>    [<ffffffff8105eabb>] kthread+0xdb/0xe0
>    [<ffffffff8106b912>] ? finish_task_switch+0x52/0x100
>    [<ffffffff817e0754>] kernel_thread_helper+0x4/0x10
>    [<ffffffff8105e9e0>] ? __init_kthread_worker+0x60/0x60
>    [<ffffffff817e0750>] ? gs_change+0xb/0xb
> 
> ...and hence simple wait queues were deployed on RT out of necessity
> (as simple wait uses a raw lock), but mainline might as well take
> advantage of the more streamline support as well.
> 
> [1] This is a carry forward of work from v3.10-rt; the original conversion
> was by Thomas on an earlier -rt version, and Sebastian extended it to
> additional post-3.10 added RCU waiters; here I've added a commit log and
> unified the RCU changes into one, and uprev'd it to match mainline RCU.
> 
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: linux-kernel@vger.kernel.org

One comment below on an unneeded fix.  I will try queueing these for
testing.

							Thanx, Paul

> ---
>  kernel/rcu/tree.c        | 13 +++++++------
>  kernel/rcu/tree.h        |  7 ++++---
>  kernel/rcu/tree_plugin.h | 18 +++++++++---------
>  3 files changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 775d36c..f1c80aa 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1589,7 +1589,7 @@ static void rcu_gp_kthread_wake(struct rcu_state *rsp)
>  	    !READ_ONCE(rsp->gp_flags) ||
>  	    !rsp->gp_kthread)
>  		return;
> -	wake_up(&rsp->gp_wq);
> +	swake_up(&rsp->gp_wq);
>  }
> 
>  /*
> @@ -2057,7 +2057,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  					       READ_ONCE(rsp->gpnum),
>  					       TPS("reqwait"));
>  			rsp->gp_state = RCU_GP_WAIT_GPS;
> -			wait_event_interruptible(rsp->gp_wq,
> +			swait_event_interruptible(rsp->gp_wq,
>  						 READ_ONCE(rsp->gp_flags) &
>  						 RCU_GP_FLAG_INIT);
>  			rsp->gp_state = RCU_GP_DONE_GPS;
> @@ -2087,7 +2087,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  					       READ_ONCE(rsp->gpnum),
>  					       TPS("fqswait"));
>  			rsp->gp_state = RCU_GP_WAIT_FQS;
> -			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> +			ret = swait_event_interruptible_timeout(rsp->gp_wq,
>  					rcu_gp_fqs_check_wake(rsp, &gf), j);
>  			rsp->gp_state = RCU_GP_DOING_FQS;
>  			/* Locking provides needed memory barriers. */
> @@ -2210,7 +2210,7 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
>  	WARN_ON_ONCE(!rcu_gp_in_progress(rsp));
>  	WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS);
>  	raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
> -	rcu_gp_kthread_wake(rsp);
> +	swake_up(&rsp->gp_wq);  /* Memory barrier implied by swake_up() path. */
>  }
> 
>  /*
> @@ -2871,7 +2871,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
>  	}
>  	WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS);
>  	raw_spin_unlock_irqrestore(&rnp_old->lock, flags);
> -	rcu_gp_kthread_wake(rsp);
> +	swake_up(&rsp->gp_wq); /* Memory barrier implied by swake_up() path. */
>  }
> 
>  /*
> @@ -4178,7 +4178,8 @@ static void __init rcu_init_one(struct rcu_state *rsp,
>  		}
>  	}
> 
> -	init_waitqueue_head(&rsp->gp_wq);
> +	rsp->rda = rda;

This is initialized at compile time in current mainline, so the above
statement is not needed.

But now that you mention it, the second parameter to rcu_init_one() is
a bit pointless these days.  I have queued a patch eliminating it,
which should not conflict with our patch.

> +	init_swait_queue_head(&rsp->gp_wq);
>  	rnp = rsp->level[rcu_num_lvls - 1];
>  	for_each_possible_cpu(i) {
>  		while (i > rnp->grphi)
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 2e991f8..6aa3776 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -27,6 +27,7 @@
>  #include <linux/threads.h>
>  #include <linux/cpumask.h>
>  #include <linux/seqlock.h>
> +#include <linux/swait.h>
>  #include <linux/stop_machine.h>
> 
>  /*
> @@ -244,7 +245,7 @@ struct rcu_node {
>  				/* Refused to boost: not sure why, though. */
>  				/*  This can happen due to race conditions. */
>  #ifdef CONFIG_RCU_NOCB_CPU
> -	wait_queue_head_t nocb_gp_wq[2];
> +	struct swait_queue_head nocb_gp_wq[2];
>  				/* Place for rcu_nocb_kthread() to wait GP. */
>  #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
>  	int need_future_gp[2];
> @@ -388,7 +389,7 @@ struct rcu_data {
>  	atomic_long_t nocb_q_count_lazy; /*  invocation (all stages). */
>  	struct rcu_head *nocb_follower_head; /* CBs ready to invoke. */
>  	struct rcu_head **nocb_follower_tail;
> -	wait_queue_head_t nocb_wq;	/* For nocb kthreads to sleep on. */
> +	struct swait_queue_head nocb_wq; /* For nocb kthreads to sleep on. */
>  	struct task_struct *nocb_kthread;
>  	int nocb_defer_wakeup;		/* Defer wakeup of nocb_kthread. */
> 
> @@ -475,7 +476,7 @@ struct rcu_state {
>  	unsigned long gpnum;			/* Current gp number. */
>  	unsigned long completed;		/* # of last completed gp. */
>  	struct task_struct *gp_kthread;		/* Task for grace periods. */
> -	wait_queue_head_t gp_wq;		/* Where GP task waits. */
> +	struct swait_queue_head gp_wq;		/* Where GP task waits. */
>  	short gp_flags;				/* Commands for GP task. */
>  	short gp_state;				/* GP kthread sleep state. */
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index b2bf396..545acdf 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1779,7 +1779,7 @@ early_param("rcu_nocb_poll", parse_rcu_nocb_poll);
>   */
>  static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
>  {
> -	wake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
> +	swake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
>  }
> 
>  /*
> @@ -1797,8 +1797,8 @@ static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq)
> 
>  static void rcu_init_one_nocb(struct rcu_node *rnp)
>  {
> -	init_waitqueue_head(&rnp->nocb_gp_wq[0]);
> -	init_waitqueue_head(&rnp->nocb_gp_wq[1]);
> +	init_swait_queue_head(&rnp->nocb_gp_wq[0]);
> +	init_swait_queue_head(&rnp->nocb_gp_wq[1]);
>  }
> 
>  #ifndef CONFIG_RCU_NOCB_CPU_ALL
> @@ -1823,7 +1823,7 @@ static void wake_nocb_leader(struct rcu_data *rdp, bool force)
>  	if (READ_ONCE(rdp_leader->nocb_leader_sleep) || force) {
>  		/* Prior smp_mb__after_atomic() orders against prior enqueue. */
>  		WRITE_ONCE(rdp_leader->nocb_leader_sleep, false);
> -		wake_up(&rdp_leader->nocb_wq);
> +		swake_up(&rdp_leader->nocb_wq);
>  	}
>  }
> 
> @@ -2036,7 +2036,7 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
>  	 */
>  	trace_rcu_future_gp(rnp, rdp, c, TPS("StartWait"));
>  	for (;;) {
> -		wait_event_interruptible(
> +		swait_event_interruptible(
>  			rnp->nocb_gp_wq[c & 0x1],
>  			(d = ULONG_CMP_GE(READ_ONCE(rnp->completed), c)));
>  		if (likely(d))
> @@ -2064,7 +2064,7 @@ wait_again:
>  	/* Wait for callbacks to appear. */
>  	if (!rcu_nocb_poll) {
>  		trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, "Sleep");
> -		wait_event_interruptible(my_rdp->nocb_wq,
> +		swait_event_interruptible(my_rdp->nocb_wq,
>  				!READ_ONCE(my_rdp->nocb_leader_sleep));
>  		/* Memory barrier handled by smp_mb() calls below and repoll. */
>  	} else if (firsttime) {
> @@ -2139,7 +2139,7 @@ wait_again:
>  			 * List was empty, wake up the follower.
>  			 * Memory barriers supplied by atomic_long_add().
>  			 */
> -			wake_up(&rdp->nocb_wq);
> +			swake_up(&rdp->nocb_wq);
>  		}
>  	}
> 
> @@ -2160,7 +2160,7 @@ static void nocb_follower_wait(struct rcu_data *rdp)
>  		if (!rcu_nocb_poll) {
>  			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
>  					    "FollowerSleep");
> -			wait_event_interruptible(rdp->nocb_wq,
> +			swait_event_interruptible(rdp->nocb_wq,
>  						 READ_ONCE(rdp->nocb_follower_head));
>  		} else if (firsttime) {
>  			/* Don't drown trace log with "Poll"! */
> @@ -2319,7 +2319,7 @@ void __init rcu_init_nohz(void)
>  static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
>  {
>  	rdp->nocb_tail = &rdp->nocb_head;
> -	init_waitqueue_head(&rdp->nocb_wq);
> +	init_swait_queue_head(&rdp->nocb_wq);
>  	rdp->nocb_follower_tail = &rdp->nocb_follower_head;
>  }
> 
> -- 
> 2.4.3
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree
  2015-10-19 23:31   ` Paul E. McKenney
@ 2015-10-20  7:14     ` Daniel Wagner
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel Wagner @ 2015-10-20  7:14 UTC (permalink / raw)
  To: paulmck, Daniel Wagner
  Cc: linux-kernel, linux-rt-users, Paul Gortmaker, Thomas Gleixner,
	Sebastian Andrzej Siewior, Steven Rostedt

Hi Paul,

On 10/20/2015 01:31 AM, Paul E. McKenney wrote:
> On Wed, Oct 14, 2015 at 09:43:21AM +0200, Daniel Wagner wrote:
>> From: Paul Gortmaker <paul.gortmaker@windriver.com>
>> @@ -4178,7 +4178,8 @@ static void __init rcu_init_one(struct rcu_state *rsp,
>>  		}
>>  	}
>>
>> -	init_waitqueue_head(&rsp->gp_wq);
>> +	rsp->rda = rda;
> 
> This is initialized at compile time in current mainline, so the above
> statement is not needed.

I accidentally forward ported this from 3.10.

> But now that you mention it, the second parameter to rcu_init_one() is
> a bit pointless these days.  I have queued a patch eliminating it,
> which should not conflict with our patch.

Glad I can help :)

I am about to send an updated version of this series containing which
swaps this patch with #4 to avoid lockdep warning while doing bissect.
Obviously, the above statement will be gone as well.

cheers,
daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-10-20  7:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-14  7:43 [PATCH v2 0/4] Simple wait queue support Daniel Wagner
2015-10-14  7:43 ` [PATCH v2 1/4] wait.[ch]: Introduce the simple waitqueue (swait) implementation Daniel Wagner
2015-10-14  7:43 ` [PATCH v2 2/4] KVM: use simple waitqueue for vcpu->wq Daniel Wagner
2015-10-14 14:54   ` Daniel Wagner
2015-10-14  7:43 ` [PATCH v2 3/4] rcu: use simple wait queues where possible in rcutree Daniel Wagner
2015-10-19 23:31   ` Paul E. McKenney
2015-10-20  7:14     ` Daniel Wagner
2015-10-14  7:43 ` [PATCH v2 4/4] rcu: Do not call swake_up_all with rnp->lock holding Daniel Wagner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).