All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gerd Hoffmann <kraxel@suse.de>
To: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: xen-devel <xen-devel@lists.xensource.com>
Subject: Re: [patch] barrier support for blk{front,back}
Date: Mon, 11 Sep 2006 15:22:07 -0700	[thread overview]
Message-ID: <4505E18F.1090608@suse.de> (raw)
In-Reply-To: <3AAA99889D105740BE010EB6D5A5A3B20118A8@paddington.ad.cl.cam.ac.uk>

[-- Attachment #1: Type: text/plain, Size: 2051 bytes --]

Ian Pratt wrote:
>> This patch adds support for barriers to blk{back,front} drivers.
> 
> It's good to see barrier supported added.
> 
> Out of interest, what was your motivation for adding it?

Trying to fix some problems of loop-file backed virtual block devices.
For SLES10 we have a patch which adds a syncronous mode to the loop
driver (by opening the file with O_SYNC).  It solves the problem of loop
doing too much buffering and screw up journaling filesystems, but is
dead slow.  When using barriers instead the performance should become
better without the risc to kill the filesystem by ignoring write
ordering.  There is also a patch in the queue (for mainline) which adds
barrier support to loop devices, attached below for reference.

> Which file systems use it, and do you see a worthwhile performance
> gain from the extra disk scheduling flexibility?

All journaling filesystems should be able to use them.  ext3 and
reiserfs do for sure, although they are not enabled by default, you need
the barrier=1 (ext3) and barrier=flush (reiser) mount options.  Don't
know what xfs and jfs are doing by default.

No benchmarks yet, sorry.  I finished the patch just the day before the
summit on my notebook, which is way to slow for serious performance
tests.  Beside that I simply had no time yet.  I can run some next week.

> We are going to have to think through what the impact of this would
> be in the live relocation block safety optimizations Andy Warfield 
> described at the summit. The simple thing is just to revert to
> stalling until the backend gives the all clear if there's a barrier
> in the queue.

Hmm, yes, the frontend driver better should take care that there isn't
an barrier request in flight.  Doing that should also reduce the risc to
corrupt the filesystem in the (already unlikely) case that the writes on
the host the machine is migrated from are ending up on disk after the
ones resubmitted from the host the machine is migrated to.

jetlagged greetings from europe,

  Gerd

-- 
Gerd Hoffmann <kraxel@suse.de>

[-- Attachment #2: loop-barrier --]
[-- Type: text/plain, Size: 1451 bytes --]

--- linux-2.6.16/drivers/block/loop.c~	2006-06-29 13:22:37.000000000 +0200
+++ linux-2.6.16/drivers/block/loop.c	2006-06-29 13:28:17.000000000 +0200
@@ -467,16 +467,58 @@
 	return ret;
 }
 
+/*
+ * This is best effort. We really wouldn't know what to do with a returned
+ * error. This code is taken from the implementation of fsync.
+ */
+static int sync_file(struct file * file)  
+{
+	struct address_space *mapping;
+	int ret;
+
+	if (!file->f_op || !file->f_op->fsync)
+		return -EOPNOTSUPP;
+
+	mapping = file->f_mapping;
+
+	ret = filemap_fdatawrite(mapping);
+	if (!ret) {
+		/*
+		 * We need to protect against concurrent writers,
+		 * which could cause livelocks in fsync_buffers_list
+		 */
+		mutex_lock(&mapping->host->i_mutex);
+		ret = file->f_op->fsync(file, file->f_dentry, 1);
+		mutex_unlock(&mapping->host->i_mutex);
+
+		filemap_fdatawait(mapping);
+	}
+
+	return ret;
+}
+
 static int do_bio_filebacked(struct loop_device *lo, struct bio *bio)
 {
 	loff_t pos;
 	int ret;
+	int sync = bio_sync(bio);
+	int barrier = bio_barrier(bio);
+
+	if (barrier) {
+		ret = sync_file(lo->lo_backing_file);
+		if (unlikely(ret))
+			return ret;
+	}
 
 	pos = ((loff_t) bio->bi_sector << 9) + lo->lo_offset;
 	if (bio_rw(bio) == WRITE)
 		ret = lo_send(lo, bio, lo->lo_blocksize, pos);
 	else
 		ret = lo_receive(lo, bio, lo->lo_blocksize, pos);
+
+	if ((barrier || sync) && !ret)
+		ret = sync_file(lo->lo_backing_file);
+
 	return ret;
 }
 

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

      reply	other threads:[~2006-09-11 22:22 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-09-08 17:34 [patch] barrier support for blk{front,back} Gerd Hoffmann
2006-09-09 19:59 ` Ian Pratt
2006-09-11 22:22   ` Gerd Hoffmann [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4505E18F.1090608@suse.de \
    --to=kraxel@suse.de \
    --cc=m+Ian.Pratt@cl.cam.ac.uk \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.