From: Vivek Goyal <vgoyal@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Ted Ts'o <tytso@mit.edu>, Tejun Heo <tj@kernel.org>,
Jan Kara <jack@suse.cz>,
jaxboe@fusionio.com, James.Bottomley@suse.de,
linux-fsdevel@vger.kernel.org, linux-scsi@vger.kernel.org,
chris.mason@oracle.com, swhiteho@redhat.com,
konishi.ryusuke@lab.ntt.co.jp
Subject: Re: [RFC] relaxed barrier semantics
Date: Thu, 29 Jul 2010 23:17:21 -0400
Message-ID: <20100730031721.GA31762@redhat.com>
In-Reply-To: <20100729200655.GB17767@lst.de>
On Thu, Jul 29, 2010 at 10:06:55PM +0200, Christoph Hellwig wrote:
> On Thu, Jul 29, 2010 at 04:02:17PM -0400, Vivek Goyal wrote:
> > Looks like you still want to go with option 2, where you will scan the file
> > system code for any requirement of DRAIN semantics and, if everything is fine,
> > then for devices not supporting volatile caches you will mark the request
> > queue as NONE.
>
> The filesystem can't simply change the request queue settings. A request
> queue is often shared by multiple filesystems that can have very
> different requirements.
>
> > This solves the problem on devices with WCE=0 but what about devices with
> > WCE=1. If file systems anyway don't require DRAIN semantics, then we
> > should not require it on devices with WCE=1 also?
>
> Yes.
>
> > If yes, then why not go with another variant of barriers which don't
> > perform DRAIN and just do PREFLUSH + FUA (or post flush for devices not
> > supporting FUA).
>
> I've been trying to prototype it, but it's in fact rather hard to
> get this right. Tejun has done a really good job at the current
> barrier implementation and coming up with something just half as
> clever for the relaxed barriers has been driving me mad.
>
> > And then file systems can slowly move to using this non
> > draining barrier usage wherever appropriate.
>
> Actually supporting different kinds of barriers at the same time
> is even harder. We'll need two different state machines for them,
> including the actual state in the request_queue. And then we need to
> make sure that different filesystems on the same queue using different
> types work well together. If at all possible, switching the semantics
> on a flag day would make life a lot simpler.
Hi Christoph,
I was looking at the barrier code and trying to work out how hard it would
be to support a new barrier type which does not implement DRAIN but only
does PREFLUSH + FUA for devices with WCE=1.
To me it looks as if everything is already there and it is just a matter
of skipping elevator draining and request queue draining.
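
To make the idea concrete, here is a minimal user-space sketch of which
steps such a barrier would go through. This is not kernel code: the STEP_*
bits and barrier_steps() are hypothetical names made up for illustration,
only loosely modelled on the ordered-sequence steps in blk-barrier.c, and
the hard barrier case assumes a queue ordered by draining.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical step bits for one barrier sequence (illustration only). */
enum seq_step {
	STEP_DRAIN     = 1 << 0, /* wait for in-flight requests to finish */
	STEP_PREFLUSH  = 1 << 1, /* flush the volatile write cache first */
	STEP_BAR       = 1 << 2, /* write the barrier request itself */
	STEP_POSTFLUSH = 1 << 3, /* flush again when FUA is not available */
};

/*
 * Return the set of steps for one barrier request.
 *
 * wce:        device has a volatile write cache enabled (WCE=1)
 * fua:        device supports forced unit access
 * flush_only: true for the proposed flush-only barrier, false for a
 *             hard barrier on a queue ordered by draining
 */
static unsigned int barrier_steps(bool wce, bool fua, bool flush_only)
{
	unsigned int steps = STEP_BAR;

	/* The flush-only variant is exactly the one that skips draining. */
	if (!flush_only)
		steps |= STEP_DRAIN;

	if (wce) {
		steps |= STEP_PREFLUSH;
		/* With FUA the barrier write is itself stable; else post flush. */
		if (!fua)
			steps |= STEP_POSTFLUSH;
	}

	/* The barrier request always has to reach the disk. */
	assert(steps & STEP_BAR);
	return steps;
}
```

With WCE=1 and FUA this gives PREFLUSH + barrier write, without FUA it adds
a post flush, and with WCE=0 the flush-only barrier degenerates into a
normal write.
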
Can you please have a look at the attached patch? This is not a complete
patch, just the part needed if we were to implement another barrier type, say
FLUSHBARRIER. Do you think this will work, or am I blissfully unaware of the
complexity here and oversimplifying things?
Thanks
Vivek
---
block/blk-barrier.c | 14 +++++++++++++-
block/elevator.c | 3 ++-
include/linux/blkdev.h | 5 ++++-
3 files changed, 19 insertions(+), 3 deletions(-)
Index: linux-2.6/include/linux/blkdev.h
===================================================================
--- linux-2.6.orig/include/linux/blkdev.h 2010-06-19 09:54:32.000000000 -0400
+++ linux-2.6/include/linux/blkdev.h 2010-07-29 22:36:52.000000000 -0400
@@ -97,6 +97,7 @@ enum rq_flag_bits {
__REQ_SORTED, /* elevator knows about this request */
__REQ_SOFTBARRIER, /* may not be passed by ioscheduler */
__REQ_HARDBARRIER, /* may not be passed by drive either */
+ __REQ_FLUSHBARRIER, /* only flush barrier. no drains required */
__REQ_FUA, /* forced unit access */
__REQ_NOMERGE, /* don't touch this for merging */
__REQ_STARTED, /* drive already may have started this one */
@@ -126,6 +127,7 @@ enum rq_flag_bits {
#define REQ_SORTED (1 << __REQ_SORTED)
#define REQ_SOFTBARRIER (1 << __REQ_SOFTBARRIER)
#define REQ_HARDBARRIER (1 << __REQ_HARDBARRIER)
+#define REQ_FLUSHBARRIER (1 << __REQ_FLUSHBARRIER)
#define REQ_FUA (1 << __REQ_FUA)
#define REQ_NOMERGE (1 << __REQ_NOMERGE)
#define REQ_STARTED (1 << __REQ_STARTED)
@@ -626,6 +628,7 @@ enum {
#define blk_rq_cpu_valid(rq) ((rq)->cpu != -1)
#define blk_sorted_rq(rq) ((rq)->cmd_flags & REQ_SORTED)
#define blk_barrier_rq(rq) ((rq)->cmd_flags & REQ_HARDBARRIER)
+#define blk_flush_barrier_rq(rq) ((rq)->cmd_flags & REQ_FLUSHBARRIER)
#define blk_fua_rq(rq) ((rq)->cmd_flags & REQ_FUA)
#define blk_discard_rq(rq) ((rq)->cmd_flags & REQ_DISCARD)
#define blk_bidi_rq(rq) ((rq)->next_rq != NULL)
@@ -681,7 +684,7 @@ static inline void blk_clear_queue_full(
* it already be started by driver.
*/
#define RQ_NOMERGE_FLAGS \
- (REQ_NOMERGE | REQ_STARTED | REQ_HARDBARRIER | REQ_SOFTBARRIER)
+ (REQ_NOMERGE | REQ_STARTED | REQ_HARDBARRIER | REQ_SOFTBARRIER | REQ_FLUSHBARRIER)
#define rq_mergeable(rq) \
(!((rq)->cmd_flags & RQ_NOMERGE_FLAGS) && \
(blk_discard_rq(rq) || blk_fs_request((rq))))
Index: linux-2.6/block/blk-barrier.c
===================================================================
--- linux-2.6.orig/block/blk-barrier.c 2010-06-19 09:54:29.000000000 -0400
+++ linux-2.6/block/blk-barrier.c 2010-07-29 23:02:05.000000000 -0400
@@ -219,7 +219,8 @@ static inline bool start_ordered(struct
} else
skip |= QUEUE_ORDSEQ_PREFLUSH;
- if ((q->ordered & QUEUE_ORDERED_BY_DRAIN) && queue_in_flight(q))
+ if ((q->ordered & QUEUE_ORDERED_BY_DRAIN) && queue_in_flight(q)
+ && !blk_flush_barrier_rq(rq))
rq = NULL;
else
skip |= QUEUE_ORDSEQ_DRAIN;
@@ -241,6 +242,17 @@ bool blk_do_ordered(struct request_queue
if (!q->ordseq) {
if (!is_barrier)
return true;
+ /*
+ * For flush-only barriers, nothing has to be done if there is
+ * no caching happening on the device. The barrier request
+ * still has to be written to disk, but it can be written as a
+ * normal rq.
+ */
+
+ if (blk_flush_barrier_rq(rq)
+ && (q->ordered & QUEUE_ORDERED_BY_DRAIN
+ || q->ordered & QUEUE_ORDERED_BY_TAG))
+ return true;
if (q->next_ordered != QUEUE_ORDERED_NONE)
return start_ordered(q, rqp);
Index: linux-2.6/block/elevator.c
===================================================================
--- linux-2.6.orig/block/elevator.c 2010-06-19 09:54:29.000000000 -0400
+++ linux-2.6/block/elevator.c 2010-07-29 23:06:21.000000000 -0400
@@ -628,7 +628,8 @@ void elv_insert(struct request_queue *q,
case ELEVATOR_INSERT_BACK:
rq->cmd_flags |= REQ_SOFTBARRIER;
- elv_drain_elevator(q);
+ if (!blk_flush_barrier_rq(rq))
+ elv_drain_elevator(q);
list_add_tail(&rq->queuelist, &q->queue_head);
/*
* We kick the queue here for the following reasons.