linux-bcache.vger.kernel.org archive mirror
* [PATCH] bcache: fix writeback thread to sleep less intrusively
@ 2014-04-11  0:26 Darrick J. Wong
From: Darrick J. Wong @ 2014-04-11  0:26 UTC
  To: Kent Overstreet, Sam Fulcomer, Francis Moreau, Daniel J Blueman
  Cc: linux-bcache

Hi all,

The attached patch fixes two problems for me: the "writeback blocked for
XXX seconds" complaints from the kernel, and the oddly high load averages
on idle systems.  Can you give it a try to see if it fixes your problem too?

--D
---
Currently, the writeback thread performs uninterruptible sleep while
it waits for enough dirty data to accumulate to start writeback.
Unfortunately, uninterruptible sleep counts towards load average,
which artificially inflates it.  Since the wb thread is a kernel
thread and kthreads don't receive signals, we can use the
interruptible sleep call, which eliminates the high load average
symptom.
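
The load-average effect described above can be illustrated with a small
userspace model.  This is only a sketch: the enum and function names are
hypothetical stand-ins, not kernel identifiers, and the model only captures
the rule that runnable and uninterruptible tasks count as active while
interruptible sleepers do not.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for kernel task states. */
enum task_state {
	STATE_RUNNING,		/* runnable, shown as "R" by ps */
	STATE_INTERRUPTIBLE,	/* interruptible sleep, "S" */
	STATE_UNINTERRUPTIBLE	/* uninterruptible sleep, "D" */
};

/* Runnable and uninterruptible tasks are counted as active. */
static int counts_toward_load(enum task_state s)
{
	return s == STATE_RUNNING || s == STATE_UNINTERRUPTIBLE;
}

/* Number of tasks that would feed into the load average sample. */
static unsigned int active_tasks(const enum task_state *tasks, size_t n)
{
	unsigned int active = 0;
	size_t i;

	for (i = 0; i < n; i++)
		active += counts_toward_load(tasks[i]);
	return active;
}
```

In this model, an idle system whose only non-interruptible sleeper is the
writeback thread reports one active task before the change and none after
the thread switches to interruptible sleep.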

A second symptom is that if we mount a non-writeback cache, the
writeback thread will be woken up.  If the cache later accumulates
dirty data and writeback_running=1 (this seems to be the default) then
the writeback thread will enter uninterruptible sleep waiting for
dirty data.  This is unnecessary and (I think) results in the
"bcache_writebac:155 blocked for more than XXX seconds" complaints
that people have been talking about.  The fix for this is simple -- if
we're not in writeback mode, just go to (interruptible) sleep for a
long time.  Alternatively, we could use wait_event until the cache mode
changes.

Finally, change bch_cached_dev_attach() to always wake up the
writeback thread, because the newly created wb thread remains in
uninterruptible sleep state until something explicitly wakes it up.
This wakeup allows the thread to call bch_writeback_thread(),
whereupon it will most likely end up in interruptible sleep.  In
theory we could just let the first write take care of this, but
there's really no reason not to do the transition quickly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 drivers/md/bcache/super.c     |    2 +-
 drivers/md/bcache/writeback.c |   16 ++++++++++++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 24a3a15..3ffe970 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1048,8 +1048,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c)
 		bch_sectors_dirty_init(dc);
 		atomic_set(&dc->has_dirty, 1);
 		atomic_inc(&dc->count);
-		bch_writeback_queue(dc);
 	}
+	bch_writeback_queue(dc);
 
 	bch_cached_dev_run(dc);
 	bcache_device_link(&dc->disk, c, "bdev");
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index f4300e4..f49e6b1 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -239,7 +239,7 @@ static void read_dirty(struct cached_dev *dc)
 		if (KEY_START(&w->key) != dc->last_read ||
 		    jiffies_to_msecs(delay) > 50)
 			while (!kthread_should_stop() && delay)
-				delay = schedule_timeout_uninterruptible(delay);
+				delay = schedule_timeout_interruptible(delay);
 
 		dc->last_read	= KEY_OFFSET(&w->key);
 
@@ -401,6 +401,18 @@ static int bch_writeback_thread(void *arg)
 
 	while (!kthread_should_stop()) {
 		down_write(&dc->writeback_lock);
+		if (BDEV_CACHE_MODE(&dc->sb) != CACHE_MODE_WRITEBACK) {
+			up_write(&dc->writeback_lock);
+			set_current_state(TASK_INTERRUPTIBLE);
+
+			if (kthread_should_stop())
+				return 0;
+
+			try_to_freeze();
+			schedule_timeout_interruptible(10 * HZ);
+			continue;
+		}
+
 		if (!atomic_read(&dc->has_dirty) ||
 		    (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
 		     !dc->writeback_running)) {
@@ -436,7 +448,7 @@ static int bch_writeback_thread(void *arg)
 			while (delay &&
 			       !kthread_should_stop() &&
 			       !test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags))
-				delay = schedule_timeout_uninterruptible(delay);
+				delay = schedule_timeout_interruptible(delay);
 		}
 	}
 

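For readers following the bch_writeback_thread() hunk, the per-iteration
decision it introduces can be modelled in plain userspace C.  This is a
sketch under assumptions, not the kernel code: struct wb_state, the enums
and wb_next_action() are hypothetical names, and the "idle" and "write
dirty data" outcomes stand in for code that lies outside the hunk shown
above.

```c
#include <stdbool.h>

/* Hypothetical mirror of the bcache cache modes. */
enum cache_mode {
	MODE_WRITETHROUGH,
	MODE_WRITEBACK,
	MODE_WRITEAROUND,
	MODE_NONE
};

/* What the modelled thread does on one pass through its loop. */
enum wb_action {
	WB_NAP_10S,	/* not in writeback mode: interruptible 10 s sleep */
	WB_IDLE,	/* writeback mode, but nothing to do right now */
	WB_WRITE_DIRTY	/* assumed outcome of the remaining case */
};

/* Hypothetical snapshot of the fields the hunk tests. */
struct wb_state {
	enum cache_mode mode;
	bool has_dirty;
	bool detaching;
	bool writeback_running;
};

static enum wb_action wb_next_action(const struct wb_state *s)
{
	/* New check from the patch: skip everything unless in writeback mode. */
	if (s->mode != MODE_WRITEBACK)
		return WB_NAP_10S;

	/* Existing check: no dirty data, or writeback paused and not detaching. */
	if (!s->has_dirty || (!s->detaching && !s->writeback_running))
		return WB_IDLE;

	return WB_WRITE_DIRTY;
}
```

The ordering matters in this model: the mode test comes first, so a
non-writeback device never reaches the dirty-data check at all.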