* [PATCH] async: fix __lowest_in_progress()
@ 2009-01-12 10:16 Arjan van de Ven
  0 siblings, 0 replies; 4+ messages in thread
From: Arjan van de Ven @ 2009-01-12 10:16 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel

From 24bafdeb8a42e2e43a1928ca72159e6ff90cfe85 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Sun, 11 Jan 2009 15:35:01 +0000
Subject: [PATCH] async: fix __lowest_in_progress()

At 37000 feet somewhere near Greenland I woke up from a half-sleep with the 
realisation that __lowest_in_progress() is buggy. After landing I checked 
and there were indeed two problems with it; this patch fixes both:
* The order of the list checks was wrong.
* The locking was not correct.
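
To see why the check order matters: workers always take the head of
async_pending and move it to the running list, so every running cookie is
lower than every pending cookie.  A sketch of a state the old code got wrong
(illustrative cookie values, not taken from the patch):

    /*
     * async_pending: [3]          running: [1, 2]
     *
     * Checking async_pending first returns 3, so a waiter doing
     * async_synchronize_cookie(3), which must wait for cookies 1 and 2
     * to finish, sees 3 >= 3 and returns while both are still
     * executing.  Checking the running list first returns 1.
     */

The second problem: wait_event() evaluated __lowest_in_progress(), which
walks lists protected by async_lock, without holding the lock; the
lowest_in_progress() wrapper added below takes the lock around the walk.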

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 kernel/async.c |   21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/async.c b/kernel/async.c
index f286e9f..608b32b 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -90,12 +90,12 @@ extern int initcall_debug;
 static async_cookie_t  __lowest_in_progress(struct list_head *running)
 {
 	struct async_entry *entry;
-	if (!list_empty(&async_pending)) {
-		entry = list_first_entry(&async_pending,
+	if (!list_empty(running)) {
+		entry = list_first_entry(running,
 			struct async_entry, list);
 		return entry->cookie;
-	} else if (!list_empty(running)) {
-		entry = list_first_entry(running,
+	} else if (!list_empty(&async_pending)) {
+		entry = list_first_entry(&async_pending,
 			struct async_entry, list);
 		return entry->cookie;
 	} else {
@@ -104,6 +104,17 @@ static async_cookie_t  __lowest_in_progress(struct list_head *running)
 	}
 
 }
+
+static async_cookie_t  lowest_in_progress(struct list_head *running)
+{
+	unsigned long flags;
+	async_cookie_t ret;
+
+	spin_lock_irqsave(&async_lock, flags);
+	ret = __lowest_in_progress(running);
+	spin_unlock_irqrestore(&async_lock, flags);
+	return ret;
+}
 /*
  * pick the first pending entry and run it
  */
@@ -229,7 +240,7 @@ void async_synchronize_cookie_special(async_cookie_t cookie, struct list_head *r
 		starttime = ktime_get();
 	}
 
-	wait_event(async_done, __lowest_in_progress(running) >= cookie);
+	wait_event(async_done, lowest_in_progress(running) >= cookie);
 
 	if (initcall_debug && system_state == SYSTEM_BOOTING) {
 		endtime = ktime_get();
-- 
1.6.0.6



-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


* [PATCH] async: fix __lowest_in_progress()
  2013-01-15 18:32                 ` Tejun Heo
@ 2013-01-16 17:19                   ` Tejun Heo
  2013-01-17 18:16                     ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Tejun Heo @ 2013-01-16 17:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Alex Riesen, Alan Stern, Jens Axboe, USB list,
	Linux Kernel Mailing List, Arjan van de Ven

083b804c4d3e1e3d0eace56bdbc0f674946d2847 ("async: use workqueue for
worker pool") made it possible for async jobs to be moved from pending
to running out of order.  While pending async jobs are still queued and
dispatched for execution in order, nothing guarantees they will enter
the "1) move self to the running queue" step of async_run_entry_fn() in
that order.

This broke __lowest_in_progress(): running->domain may no longer be
properly sorted, and, even when non-empty, is not guaranteed to contain
lower cookies than the pending list.  Fix it by sort-inserting entries
into the running list and by always looking at both the pending and
running lists when determining the lowest cookie.
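
Concretely (illustrative cookies; both jobs in the same domain):

    cookies 1 and 2 are queued and dispatched in order, but the worker
    executing 2 takes async_lock first and moves its entry:

        running->domain: [2]        async_pending: [1]

    The old __lowest_in_progress() saw the running list non-empty and
    returned 2 without consulting the pending list, so a waiter on
    cookie 2 could proceed while job 1 was still in flight.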

Over time, the async synchronization implementation has become quite
messy.  We would do better to restructure it so that each async_entry
is linked to two lists - one global and one per-domain - and is never
moved when execution starts.  There is no reason to distinguish pending
from running; they behave the same for synchronization purposes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: stable@vger.kernel.org
---
And here's the fix for the breakage I mentioned earlier.  It wouldn't
trigger often in the wild, and the effect when it does isn't critical
for modern distros, but it's still somewhat surprising that nobody
noticed it.

We definitely need to rewrite async synchronization.  It was already
messy, this makes it worse, and there's no reason for it to stay that
way.

Thanks.

 kernel/async.c |   27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

--- a/kernel/async.c
+++ b/kernel/async.c
@@ -86,18 +86,27 @@ static atomic_t entry_count;
  */
 static async_cookie_t  __lowest_in_progress(struct async_domain *running)
 {
+	async_cookie_t first_running = next_cookie;	/* infinity value */
+	async_cookie_t first_pending = next_cookie;	/* ditto */
 	struct async_entry *entry;
 
+	/*
+	 * Both running and pending lists are sorted but not disjoint.
+	 * Take the first cookies from both and return the min.
+	 */
 	if (!list_empty(&running->domain)) {
 		entry = list_first_entry(&running->domain, typeof(*entry), list);
-		return entry->cookie;
+		first_running = entry->cookie;
 	}
 
-	list_for_each_entry(entry, &async_pending, list)
-		if (entry->running == running)
-			return entry->cookie;
+	list_for_each_entry(entry, &async_pending, list) {
+		if (entry->running == running) {
+			first_pending = entry->cookie;
+			break;
+		}
+	}
 
-	return next_cookie;	/* "infinity" value */
+	return min(first_running, first_pending);
 }
 
 static async_cookie_t  lowest_in_progress(struct async_domain *running)
@@ -118,13 +127,17 @@ static void async_run_entry_fn(struct wo
 {
 	struct async_entry *entry =
 		container_of(work, struct async_entry, work);
+	struct async_entry *pos;
 	unsigned long flags;
 	ktime_t uninitialized_var(calltime), delta, rettime;
 	struct async_domain *running = entry->running;
 
-	/* 1) move self to the running queue */
+	/* 1) move self to the running queue, make sure it stays sorted */
 	spin_lock_irqsave(&async_lock, flags);
-	list_move_tail(&entry->list, &running->domain);
+	list_for_each_entry_reverse(pos, &running->domain, list)
+		if (entry->cookie < pos->cookie)
+			break;
+	list_move_tail(&entry->list, &pos->list);
 	spin_unlock_irqrestore(&async_lock, flags);
 
 	/* 2) run (and print duration) */


* Re: [PATCH] async: fix __lowest_in_progress()
  2013-01-16 17:19                   ` [PATCH] async: fix __lowest_in_progress() Tejun Heo
@ 2013-01-17 18:16                     ` Linus Torvalds
  2013-01-17 18:50                       ` Tejun Heo
  0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2013-01-17 18:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ming Lei, Alex Riesen, Alan Stern, Jens Axboe, USB list,
	Linux Kernel Mailing List, Arjan van de Ven

Tejun, mind explaining this one a bit more to me?

If ordering matters, and we have a running queue and a pending queue,
how could the pending queue *ever* be lower than the running one?

That implies that something was taken off the pending queue and put on
the running queue out of order, right?

And that in turn implies that there isn't much of a "lowest" ordering
at all, so how could anybody even care about what lowest is? It seems
to be a meaningless measure.

So with that in mind, I don't see what semantics the first part of the
patch fixes. Can you explain more?

               Linus


* Re: [PATCH] async: fix __lowest_in_progress()
  2013-01-17 18:16                     ` Linus Torvalds
@ 2013-01-17 18:50                       ` Tejun Heo
  0 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2013-01-17 18:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Alex Riesen, Alan Stern, Jens Axboe, USB list,
	Linux Kernel Mailing List, Arjan van de Ven

Hello,

On Thu, Jan 17, 2013 at 10:16:50AM -0800, Linus Torvalds wrote:
> Tejun, mind explaining this one a bit more to me?
> 
> If ordering matters, and we have a running queue and a pending queue,
> how could the pending queue *ever* be lower than the running one?

So, before being converted to workqueue, async spooled up its own
workers, and each worker would take the lock and move the first pending
item to the executing list, so everything stayed in order.

The conversion to workqueue was done by adding a work_struct to each
async_entry; async now just schedules the work item.  The queueing and
dispatching of these work items are still in order, but each worker
thread is now associated with a specific async_entry and moves that
specific async_entry to the executing list.  So, depending on which
worker reaches that point first - which is completely non-deterministic
- we may end up moving an async_entry with a larger cookie before one
with a smaller cookie.
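
Roughly, and heavily simplified (neither snippet is the exact code of
its era):

    /* before 083b804c4d3e: a worker took whatever was first, in order */
    entry = list_first_entry(&async_pending, struct async_entry, list);
    list_move_tail(&entry->list, &running->domain);  /* stays sorted */

    /* after: each work item moves its own entry once it gets the lock */
    struct async_entry *entry =
        container_of(work, struct async_entry, work);
    list_move_tail(&entry->list, &entry->running->domain);  /* arrival order */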

> That implies that something was taken off the pending queue and put on
> the running queue out of order, right?
> 
> And that in turn implies that there isn't much of a "lowest" ordering
> at all, so how could anybody even care about what lowest is? It seems
> to be a meaningless measure.

Execution still starts lowest-first, as the workqueue dispatches work
items in queued order.  It's just that they can reach the
synchronization point at differing paces.

> So with that in mind, I don't see what semantics the first part of the
> patch fixes. Can you explain more?

The problem with the code is that it keeps a global pending list but
domain-specific running lists.  I don't know why it developed like
this, but even before the workqueue conversion the code was weird.

* It has a sorted per-domain running list, so looking at the first
  item is easy.

* It has a sorted global pending list, so finding the first item in a
  given domain involves scanning it.

Global syncing ends up scanning all per-domain running lists, and
domain syncing ends up scanning the global pending list, when all we
need is for each async item to be queued on two lists - a global and a
per-domain in-flight list - and to stay there until done.
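
A sketch of that direction, with hypothetical field names (nothing here
is from the posted patch):

    struct async_entry {
        struct list_head    global_list;  /* queued here until done */
        struct list_head    domain_list;  /* ditto, per domain */
        struct work_struct  work;
        async_cookie_t      cookie;
        async_func_ptr      *func;
        void                *data;
        struct async_domain *domain;
    };

    /*
     * Both the global and the per-domain "lowest in progress" queries
     * then reduce to list_first_entry() on the respective sorted list:
     * no scanning, and no moving entries while they execute.
     */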

The posted patch is a minimal fix that keeps the basic operation the
same so that it doesn't disturb -stable too much.  I'll prep a patch
redoing the synchronization for 3.9.

Thanks.

-- 
tejun

