From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
	paulmck@linux.vnet.ibm.com, tj@kernel.org, mingo@redhat.com,
	der.herr@hofr.at, dave@stgolabs.net, riel@redhat.com,
	viro@ZenIV.linux.org.uk, torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Subject: [RFC PATCH 6/6] stop_machine: optimize stop_work_alloc()
Date: Fri, 26 Jun 2015 04:15:26 +0200
Message-ID: <20150626021526.GA5717@redhat.com>
In-Reply-To: <20150626021455.GA5675@redhat.com>

The wait_event()/wake_up_all() logic in stop_work_alloc()/stop_work_free()
is very suboptimal because the wakeups are non-exclusive: every waiter is
woken on every free. So we add the wait_queue_func_t alloc_wake() helper,
which wakes a waiter up only
a) if it actually waits for a stop_work in the "freed" cpumask, and
b) after we have already set ->stop_owner = waiter.

So if two stop_machine() callers race with each other, the loser will
likely call schedule() only once and we will have a single wakeup.

TODO: we can optimize (and simplify!) this code more. We can remove
stop_work_alloc_one() and fold it into stop_work_alloc(), so that
prepare_to_wait() will be the outer loop. Let's do this later.

TODO: the init_waitqueue_func_entry() code in stop_work_alloc_one()
is really annoying; we need a trivial new __init_wait(wait, func)
helper.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
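
Note: the filtering rule described in the changelog above can be modelled
in userspace. What follows is only a rough, single-threaded sketch, not
kernel code. It assumes that stop_work_free_one() clears ->stop_owner
(that helper comes from an earlier patch and is not shown here), it uses a
plain bitmask in place of a cpumask, and the names struct task, struct
waiter and race_demo() are hypothetical. It shows only which waiters get
woken; it does not model blocking or real concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for task_struct; only identity and a counter matter. */
struct task {
	const char *name;
	int wakeups;		/* how many times this task was woken */
};

/* Models cpu_stopper->stop_owner, one slot per CPU (NULL means free). */
static _Atomic(struct task *) stop_owner[NR_CPUS];

/* Models struct alloc_wait: a queued waiter and the CPU it wants. */
struct waiter {
	struct task *task;
	int cpu;
	bool queued;
};

/* Models alloc_wake(): wake the waiter only if it now owns its CPU. */
static bool alloc_wake(struct waiter *w, unsigned long freed_mask)
{
	struct task *expected = NULL;

	assert(w->cpu >= 0 && w->cpu < NR_CPUS);
	if (!(freed_mask & (1UL << w->cpu)))
		return false;	/* a) its CPU is not in the freed mask */
	if (!atomic_compare_exchange_strong(&stop_owner[w->cpu], &expected, w->task))
		return false;	/* b) someone else already took the slot */

	w->queued = false;	/* stands in for autoremove_wake_function() */
	w->task->wakeups++;
	return true;
}

/* Models stop_work_free(): release the CPUs, then run the filtered wakeup. */
static int stop_work_free(unsigned long mask, struct waiter *q, size_t n)
{
	int woken = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1UL << cpu))
			atomic_store(&stop_owner[cpu], (struct task *)NULL);

	for (size_t i = 0; i < n; i++)
		if (q[i].queued && alloc_wake(&q[i], mask))
			woken++;
	return woken;
}

static struct task holder = { "holder", 0 };
static struct task task_a = { "A", 0 }, task_b = { "B", 0 }, task_c = { "C", 0 };
static struct waiter queue[] = {
	{ &task_a, 0, true },	/* A waits for CPU 0 */
	{ &task_b, 0, true },	/* B also waits for CPU 0 */
	{ &task_c, 2, true },	/* C waits for CPU 2, which is not freed */
};

/* The holder owns CPUs 0-2 and then frees CPUs 0 and 1. */
static int race_demo(void)
{
	for (int cpu = 0; cpu < 3; cpu++)
		atomic_store(&stop_owner[cpu], &holder);
	return stop_work_free(0x3UL, queue, sizeof(queue) / sizeof(queue[0]));
}
```

Under these assumptions race_demo() returns 1: only A is woken, and it
already owns CPU 0 by the time it wakes. B loses the cmpxchg and C waits
for a CPU outside the freed mask, so both stay queued. In the same
scenario wake_up_all() would have woken all three.
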
 kernel/stop_machine.c |   47 +++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 572abc9..bbfc670 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -63,21 +63,60 @@ static void stop_work_free(const struct cpumask *cpumask)
 
 	for_each_cpu(cpu, cpumask)
 		stop_work_free_one(cpu);
-	wake_up_all(&stop_work_wq);
+	__wake_up(&stop_work_wq, TASK_ALL, 0, (void *)cpumask);
+}
+
+struct alloc_wait {
+	wait_queue_t	wait;
+	int		cpu;
+};
+
+static int alloc_wake(wait_queue_t *wait, unsigned int mode, int sync, void *key)
+{
+	struct alloc_wait *aw = container_of(wait, struct alloc_wait, wait);
+	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, aw->cpu);
+	const struct cpumask *cpumask = key;
+
+	if (!cpumask_test_cpu(aw->cpu, cpumask))
+		return 0;
+	if (cmpxchg(&stopper->stop_owner, NULL, aw->wait.private) != NULL)
+		return 0;
+
+	return autoremove_wake_function(wait, mode, sync, key);
 }
 
 static struct cpu_stop_work *stop_work_alloc_one(int cpu, bool wait)
 {
 	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
+	struct task_struct *me = current;
+	struct alloc_wait aw;
 
-	if (cmpxchg(&stopper->stop_owner, NULL, current) == NULL)
+	if (cmpxchg(&stopper->stop_owner, NULL, me) == NULL)
 		goto done;
 
 	if (!wait)
 		return NULL;
 
-	__wait_event(stop_work_wq,
-		cmpxchg(&stopper->stop_owner, NULL, current) == NULL);
+	/* TODO: add __init_wait(wait, func) helper! */
+	INIT_LIST_HEAD(&aw.wait.task_list);
+	init_waitqueue_func_entry(&aw.wait, alloc_wake);
+	aw.wait.private = me;
+	aw.cpu = cpu;
+	for (;;) {
+		prepare_to_wait(&stop_work_wq, &aw.wait, TASK_UNINTERRUPTIBLE);
+		/*
+		 * This can "falsely" fail if we race with alloc_wake() and
+		 * stopper->stop_owner is already me, in this case schedule()
+		 * won't block and the check below will notice this change.
+		 */
+		if (cmpxchg(&stopper->stop_owner, NULL, me) == NULL)
+			break;
+
+		schedule();
+		if (likely(stopper->stop_owner == me))
+			break;
+	}
+	finish_wait(&stop_work_wq, &aw.wait);
 done:
 	return &stopper->stop_work;
 }
-- 
1.5.5.1

