public inbox for linux-bcache@vger.kernel.org
 help / color / mirror / Atom feed
From: Michael Lyle <mlyle@lyle.org>
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: axboe@fb.com, Michael Lyle <mlyle@lyle.org>
Subject: [416 PATCH 06/13] bcache: writeback: properly order backing device IO
Date: Mon,  8 Jan 2018 12:21:23 -0800	[thread overview]
Message-ID: <20180108202130.31303-7-mlyle@lyle.org> (raw)
In-Reply-To: <20180108202130.31303-1-mlyle@lyle.org>

Writeback keys are presently iterated and dispatched for writeback in
order of the logical block address on the backing device.  Multiple may
be, in parallel, read from the cache device and then written back
(especially when there are contiguous I/O).

However-- there was no guarantee with the existing code that the writes
would be issued in LBA order, as the reads from the cache device are
often re-ordered.  In turn, when writing back quickly, the backing disk
often has to seek backwards-- this slows writeback and increases
utilization.

This patch introduces an ordering mechanism that guarantees that the
original order of issue is maintained for the write portion of the I/O.
Performance for writeback is significantly improved when there are
multiple contiguous keys or high writeback rates.

Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
Tested-by: Tang Junhui <tang.junhui@zte.com.cn>
---
 drivers/md/bcache/bcache.h    |  8 ++++++++
 drivers/md/bcache/writeback.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 1784e50eb857..3be0fcc19b1f 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -330,6 +330,14 @@ struct cached_dev {
 
 	struct keybuf		writeback_keys;
 
+	/*
+	 * Order the write-half of writeback operations strongly in dispatch
+	 * order.  (Maintain LBA order; don't allow reads completing out of
+	 * order to re-order the writes...)
+	 */
+	struct closure_waitlist writeback_ordering_wait;
+	atomic_t		writeback_sequence_next;
+
 	/* For tracking sequential IO */
 #define RECENT_IO_BITS	7
 #define RECENT_IO	(1 << RECENT_IO_BITS)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 479095987f22..6e1d2fde43df 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -116,6 +116,7 @@ static unsigned writeback_delay(struct cached_dev *dc, unsigned sectors)
 struct dirty_io {
 	struct closure		cl;
 	struct cached_dev	*dc;
+	uint16_t		sequence;
 	struct bio		bio;
 };
 
@@ -194,6 +195,27 @@ static void write_dirty(struct closure *cl)
 {
 	struct dirty_io *io = container_of(cl, struct dirty_io, cl);
 	struct keybuf_key *w = io->bio.bi_private;
+	struct cached_dev *dc = io->dc;
+
+	uint16_t next_sequence;
+
+	if (atomic_read(&dc->writeback_sequence_next) != io->sequence) {
+		/* Not our turn to write; wait for a write to complete */
+		closure_wait(&dc->writeback_ordering_wait, cl);
+
+		if (atomic_read(&dc->writeback_sequence_next) == io->sequence) {
+			/*
+			 * Edge case-- it happened in indeterminate order
+			 * relative to when we were added to wait list..
+			 */
+			closure_wake_up(&dc->writeback_ordering_wait);
+		}
+
+		continue_at(cl, write_dirty, io->dc->writeback_write_wq);
+		return;
+	}
+
+	next_sequence = io->sequence + 1;
 
 	/*
 	 * IO errors are signalled using the dirty bit on the key.
@@ -211,6 +233,9 @@ static void write_dirty(struct closure *cl)
 		closure_bio_submit(&io->bio, cl);
 	}
 
+	atomic_set(&dc->writeback_sequence_next, next_sequence);
+	closure_wake_up(&dc->writeback_ordering_wait);
+
 	continue_at(cl, write_dirty_finish, io->dc->writeback_write_wq);
 }
 
@@ -242,7 +267,10 @@ static void read_dirty(struct cached_dev *dc)
 	int nk, i;
 	struct dirty_io *io;
 	struct closure cl;
+	uint16_t sequence = 0;
 
+	BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
+	atomic_set(&dc->writeback_sequence_next, sequence);
 	closure_init_stack(&cl);
 
 	/*
@@ -303,6 +331,7 @@ static void read_dirty(struct cached_dev *dc)
 
 			w->private	= io;
 			io->dc		= dc;
+			io->sequence    = sequence++;
 
 			dirty_init(w);
 			bio_set_op_attrs(&io->bio, REQ_OP_READ, 0);
-- 
2.14.1

  parent reply	other threads:[~2018-01-08 20:21 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-08 20:21 [416 PATCH 00/13] Bcache changes for 4.16 Michael Lyle
2018-01-08 20:21 ` [416 PATCH 01/13] bcache: ret IOERR when read meets metadata error Michael Lyle
2018-01-08 20:21 ` [416 PATCH 02/13] bcache: stop writeback thread after detaching Michael Lyle
2018-01-08 20:21 ` [416 PATCH 03/13] bcache: Use PTR_ERR_OR_ZERO() Michael Lyle
2018-01-08 20:21 ` [416 PATCH 04/13] bcache: segregate flash only volume write streams Michael Lyle
2018-01-08 20:21 ` [416 PATCH 05/13] bcache: fix wrong return value in bch_debug_init() Michael Lyle
2018-01-08 20:21 ` Michael Lyle [this message]
2018-01-08 20:21 ` [416 PATCH 07/13] bcache: allow quick writeback when backing idle Michael Lyle
2018-01-08 20:21 ` [416 PATCH 08/13] bcache: Fix, improve efficiency of closure_sync() Michael Lyle
2018-01-08 20:21 ` [416 PATCH 09/13] bcache: mark closure_sync() __sched Michael Lyle
2018-01-08 20:21 ` [416 PATCH 10/13] bcache: fix unmatched generic_end_io_acct() & generic_start_io_acct() Michael Lyle
2018-01-08 20:21 ` [416 PATCH 11/13] bcache: reduce cache_set devices iteration by devices_max_used Michael Lyle
2018-01-08 20:21 ` [416 PATCH 12/13] bcache: fix misleading error message in bch_count_io_errors() Michael Lyle
2018-01-08 20:21 ` [416 PATCH 13/13] bcache: fix writeback target calc on large devices Michael Lyle
2018-01-08 20:42 ` [416 PATCH 00/13] Bcache changes for 4.16 Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180108202130.31303-7-mlyle@lyle.org \
    --to=mlyle@lyle.org \
    --cc=axboe@fb.com \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox