All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix bad percpu counter state during suspend
@ 2014-04-05  3:22 Jens Axboe
  0 siblings, 0 replies; only message in thread
From: Jens Axboe @ 2014-04-05  3:22 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel

Hi,

I got a bug report yesterday from Laszlo Ersek <lersek@redhat.com>, in
which he states that his kvm instance fails to suspend. He Laszlo
bisected it down to this commit:

commit 1cf7e9c68fe84248174e998922b39e508375e7c1
  Author: Jens Axboe <axboe@kernel.dk>
  Date:   Fri Nov 1 10:52:52 2013 -0600

      virtio_blk: blk-mq support

where virtio-blk is converted to use the blk-mq infrastructure. After
digging a bit, it became clear that the issue was with the queue drain.
blk-mq tracks queue usage in a percpu counter, which is incremented on
request alloc and decremented when the request is freed. The initial
hunt was for an inconsistency in blk-mq, but everything seemed fine. In
fact, the counter only returned crazy values when suspend was in
progress. When a CPU is unplugged, the percpu counters merges that CPU
state with the general state. blk-mq takes care to register a hotcpu
notifier with the appropriate priority, so we know it runs after the
percpu counter notifier. However, the percpu counter notifier only
merges the state when the CPU is fully gone. This leaves a state
transition where the CPU going away is no longer in the online mask, yet
it still holds private values. This means that in this state,
percpu_counter_sum() returns invalid results, and the suspend then hangs
waiting for abs(dead-cpu-value) requests to complete which of course
will never happen.

Fix this by clearing the state earlier, so we never have a case where
the CPU isn't in online mask but still holds private state. This bug has
been there since forever, I guess we don't have a lot of users where
percpu counters needs to be reliable during the suspend cycle.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 8280a5dd1727..7dd33577b905 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -169,7 +169,7 @@ static int percpu_counter_hotcpu_callback(struct notifier_block *nb,
 	struct percpu_counter *fbc;
 
 	compute_batch_value();
-	if (action != CPU_DEAD)
+	if (action != CPU_DEAD && action != CPU_DEAD_FROZEN)
 		return NOTIFY_OK;
 
 	cpu = (unsigned long)hcpu;

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2014-04-05  3:22 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-05  3:22 [PATCH] Fix bad percpu counter state during suspend Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.