From: Maxim Patlasov <mpatlasov@parallels.com>
To: miklos@szeredi.hu
Cc: dev@parallels.com, fuse-devel@lists.sourceforge.net,
linux-kernel@vger.kernel.org, jbottomley@parallels.com,
viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
xemul@openvz.org
Subject: [PATCH 12/14] fuse: Fix O_DIRECT operations vs cached writeback misorder
Date: Fri, 16 Nov 2012 21:10:20 +0400
Message-ID: <20121116171012.3196.35933.stgit@maximpc.sw.ru>
In-Reply-To: <20121116170123.3196.93431.stgit@maximpc.sw.ru>
The problem is:

1. write cached data to a file
2. read directly from the same file (via another fd)

The 2nd operation may read stale data, i.e. the data that was in the file
before the 1st op. The problem is in how fuse manages writeback.

When a direct op occurs, the core kernel code calls filemap_write_and_wait
to flush all the cached ops in flight. But fuse acks the writeback right
after the ->writepages callback exits, without waiting for the real write
to happen. Thus the subsequent direct op proceeds while the real writeback
is still in flight. This is a problem for backends that reorder operations.

Fix this by making the fuse direct IO callback explicitly wait for the
in-flight writeback to finish.

Original patch by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
---
fs/fuse/file.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 files changed, 40 insertions(+), 0 deletions(-)
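
For illustration only, the two steps from the description could be exercised
from userspace roughly as follows. This is a sketch, not part of the patch:
the helper name direct_read_sees_cached_write(), the 4096-byte block size and
the 'O'/'N' fill bytes are all assumptions made for the example. It would
only be expected to show stale data on a fuse mount with writeback caching
and a backend that reorders operations; on other filesystems it should
simply report fresh data.

```c
#define _GNU_SOURCE		/* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK 4096		/* assumption: satisfies O_DIRECT alignment */

/*
 * Hypothetical helper: returns 1 if the direct read saw the new data,
 * 0 if it saw stale data, -1 on any setup error (e.g. O_DIRECT is not
 * supported by the filesystem holding 'path').
 */
static int direct_read_sees_cached_write(const char *path)
{
	void *mem = NULL;
	char *buf;
	int cfd, dfd = -1, ret = -1;

	cfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (cfd < 0)
		return -1;
	if (posix_memalign(&mem, BLK, BLK) != 0) {
		mem = NULL;
		goto out;
	}
	buf = mem;

	/* Establish the "old" contents and push them to the backend. */
	memset(buf, 'O', BLK);
	if (pwrite(cfd, buf, BLK, 0) != BLK || fsync(cfd) != 0)
		goto out;

	/* Step 1: cached write of new data, deliberately without fsync. */
	memset(buf, 'N', BLK);
	if (pwrite(cfd, buf, BLK, 0) != BLK)
		goto out;

	/* Step 2: direct read of the same range via another fd. */
	dfd = open(path, O_RDONLY | O_DIRECT);
	if (dfd < 0)
		goto out;
	memset(buf, 0, BLK);
	if (pread(dfd, buf, BLK, 0) != BLK)
		goto out;

	ret = (buf[0] == 'N' && buf[BLK - 1] == 'N');
out:
	if (dfd >= 0)
		close(dfd);
	close(cfd);
	unlink(path);
	free(mem);
	return ret;
}
```

A return value of 0 would correspond to the misorder described above: the
direct read reached the backend before the cached write did.
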
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b73fe2a..741e9b4 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -348,6 +348,31 @@ u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id)
 	return (u64) v0 + ((u64) v1 << 32);
 }
 
+static bool fuse_range_is_writeback(struct inode *inode, pgoff_t idx_from,
+				    pgoff_t idx_to)
+{
+	struct fuse_conn *fc = get_fuse_conn(inode);
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct fuse_req *req;
+	bool found = false;
+
+	spin_lock(&fc->lock);
+	list_for_each_entry(req, &fi->writepages, writepages_entry) {
+		pgoff_t curr_index;
+
+		BUG_ON(req->inode != inode);
+		curr_index = req->misc.write.in.offset >> PAGE_CACHE_SHIFT;
+		if (!(idx_from >= curr_index + req->num_pages ||
+		      idx_to < curr_index)) {
+			found = true;
+			break;
+		}
+	}
+	spin_unlock(&fc->lock);
+
+	return found;
+}
+
 /*
  * Check if page is under writeback
  *
@@ -392,6 +417,19 @@ static int fuse_wait_on_page_writeback(struct inode *inode, pgoff_t index)
 	return 0;
 }
 
+static void fuse_wait_on_writeback(struct inode *inode, pgoff_t start,
+				   size_t bytes)
+{
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	pgoff_t idx_from, idx_to;
+
+	idx_from = start >> PAGE_CACHE_SHIFT;
+	idx_to = (start + bytes - 1) >> PAGE_CACHE_SHIFT;
+
+	wait_event(fi->page_waitq,
+		   !fuse_range_is_writeback(inode, idx_from, idx_to));
+}
+
 static int fuse_flush(struct file *file, fl_owner_t id)
 {
 	struct inode *inode = file->f_path.dentry->d_inode;
@@ -1178,6 +1216,8 @@ ssize_t fuse_direct_io(struct file *file, const char __user *buf,
 			break;
 		}
 
+		fuse_wait_on_writeback(file->f_mapping->host, pos, nbytes);
+
 		if (write)
 			nres = fuse_send_write(req, file, pos, nbytes, owner);
 		else
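
As a reading aid, the byte-range to page-range conversion and the overlap
test used by the two new helpers can be sketched in standalone C as below.
This is an illustration under assumptions, not kernel code: struct wb_req,
SKETCH_PAGE_SHIFT (4 KiB pages standing in for PAGE_CACHE_SHIFT) and the
array of requests are hypothetical stand-ins for struct fuse_req and the
fi->writepages list, and locking and waiting are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, standing in for PAGE_CACHE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for one in-flight writeback request. */
struct wb_req {
	uint64_t first_page;	/* index of the first page it covers */
	unsigned int num_pages;	/* number of contiguous pages */
};

/*
 * True if the inclusive page range [idx_from, idx_to] intersects the
 * pages covered by req. Same predicate shape as in the patch: the
 * ranges are disjoint only if one lies entirely after the other.
 */
static bool req_overlaps(const struct wb_req *req,
			 uint64_t idx_from, uint64_t idx_to)
{
	return !(idx_from >= req->first_page + req->num_pages ||
		 idx_to < req->first_page);
}

/*
 * Convert a byte range to an inclusive page range and scan the
 * requests. The caller must pass bytes > 0, otherwise
 * pos + bytes - 1 would name the byte before the range.
 */
static bool range_is_writeback(const struct wb_req *reqs, size_t nreqs,
			       uint64_t pos, size_t bytes)
{
	uint64_t idx_from = pos >> SKETCH_PAGE_SHIFT;
	uint64_t idx_to = (pos + bytes - 1) >> SKETCH_PAGE_SHIFT;
	size_t i;

	for (i = 0; i < nreqs; i++)
		if (req_overlaps(&reqs[i], idx_from, idx_to))
			return true;
	return false;
}
```

With a single request covering pages 2..4, a direct op on bytes 0..8191
(pages 0..1) would not have to wait, while one on bytes 0..8192 touches
page 2 and would.
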