public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] fuse: fixes for writepages
@ 2013-09-19 16:11 Miklos Szeredi
  2013-09-19 16:11 ` [PATCH 1/3] fuse: writepages: fix aggregation Miklos Szeredi
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Miklos Szeredi @ 2013-09-19 16:11 UTC (permalink / raw)
  To: mpatlasov; +Cc: linux-fsdevel, linux-kernel

Maxim,

Please review and test these.  I've appended them to writepages.v2 and for-next.

Here's a test patch for the rewrites stuff.  It simply disables the waiting in
fuse_page_mkwrite() and is quite effective in generating rewrites in my
testcase.

Thanks,
Miklos

---
 fs/fuse/file.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1498,6 +1498,7 @@ static void fuse_writepage_end(struct fu
 
 	mapping_set_error(inode->i_mapping, req->out.h.error);
 	spin_lock(&fc->lock);
+	int count = 0;
 	while (req->misc.write.next) {
 		struct fuse_req *next = req->misc.write.next;
 		req->misc.write.next = next->misc.write.next;
@@ -1505,7 +1506,10 @@ static void fuse_writepage_end(struct fu
 		list_add(&next->writepages_entry, &fi->writepages);
 		list_add_tail(&next->list, &fi->queued_writes);
 		fuse_flush_writepages(inode);
+		count++;
 	}
+	if (count)
+		printk("rewrite: %i\n", count);
 	fi->writectr--;
 	fuse_writepage_finish(fc, req);
 	spin_unlock(&fc->lock);
@@ -1884,7 +1888,7 @@ static int fuse_page_mkwrite(struct vm_a
 		return VM_FAULT_NOPAGE;
 	}
 
-	fuse_wait_on_page_writeback(inode, page->index);
+	//fuse_wait_on_page_writeback(inode, page->index);
 	return VM_FAULT_LOCKED;
 }
 



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] fuse: writepages: fix aggregation
  2013-09-19 16:11 [PATCH 0/3] fuse: fixes for writepages Miklos Szeredi
@ 2013-09-19 16:11 ` Miklos Szeredi
  2013-09-19 16:11 ` [PATCH 2/3] fuse: writepages: handle same page rewrites Miklos Szeredi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Miklos Szeredi @ 2013-09-19 16:11 UTC (permalink / raw)
  To: mpatlasov; +Cc: linux-fsdevel, linux-kernel, Miklos Szeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Checking against tmp-page indexes is not very useful, and results in one
(or rarely two) page requests.  Which is not much of an improvement...

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/fuse/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index cc3a6c4..bf765cf 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1633,7 +1633,7 @@ static int fuse_writepages_fill(struct page *page,
 		BUG_ON(!req->num_pages);
 		if (req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
 		    (req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_write ||
-		    req->pages[req->num_pages - 1]->index + 1 != page->index) {
+		    data->orig_pages[req->num_pages - 1]->index + 1 != page->index) {
 
 			fuse_writepages_send(data);
 			data->req = NULL;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] fuse: writepages: handle same page rewrites
  2013-09-19 16:11 [PATCH 0/3] fuse: fixes for writepages Miklos Szeredi
  2013-09-19 16:11 ` [PATCH 1/3] fuse: writepages: fix aggregation Miklos Szeredi
@ 2013-09-19 16:11 ` Miklos Szeredi
  2013-09-19 16:11 ` [PATCH 3/3] fuse: writepage: skip already in flight Miklos Szeredi
  2013-09-20  8:06 ` [PATCH 0/3] fuse: fixes for writepages Maxim Patlasov
  3 siblings, 0 replies; 5+ messages in thread
From: Miklos Szeredi @ 2013-09-19 16:11 UTC (permalink / raw)
  To: mpatlasov; +Cc: linux-fsdevel, linux-kernel, Miklos Szeredi

From: Miklos Szeredi <mszeredi@suse.cz>

As Maxim Patlasov pointed out, it's possible to get a dirty page while it's
copy is still under writeback, despite fuse_page_mkwrite() doing its thing
(direct IO).

This could result in two concurrent write request for the same offset, with
data corruption if they get mixed up.

To prevent this, fuse needs to check and delay such writes.  This
implementation does this by:

 1. check if page is still under writeout, if so create a new, single page
    secondary request for it

 2. chain this secondary request onto the in-flight request

 2/a. if a seconday request for the same offset was already chained to the
    in-flight request, then just copy the contents of the page and discard
    the new secondary request.  This makes sure that for each page will
    have at most two requests associated with it

 3. when the in-flight request finished, send off all secondary requests
    chained onto it

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/fuse/file.c   | 100 +++++++++++++++++++++++++++++++++++++++++++++++++------
 fs/fuse/fuse_i.h |   1 +
 2 files changed, 91 insertions(+), 10 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index bf765cf..f8ff019 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1414,7 +1414,9 @@ static void fuse_writepage_free(struct fuse_conn *fc, struct fuse_req *req)
 
 	for (i = 0; i < req->num_pages; i++)
 		__free_page(req->pages[i]);
-	fuse_file_put(req->ff, false);
+
+	if (req->ff)
+		fuse_file_put(req->ff, false);
 }
 
 static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
@@ -1496,6 +1498,14 @@ static void fuse_writepage_end(struct fuse_conn *fc, struct fuse_req *req)
 
 	mapping_set_error(inode->i_mapping, req->out.h.error);
 	spin_lock(&fc->lock);
+	while (req->misc.write.next) {
+		struct fuse_req *next = req->misc.write.next;
+		req->misc.write.next = next->misc.write.next;
+		next->misc.write.next = NULL;
+		list_add(&next->writepages_entry, &fi->writepages);
+		list_add_tail(&next->list, &fi->queued_writes);
+		fuse_flush_writepages(inode);
+	}
 	fi->writectr--;
 	fuse_writepage_finish(fc, req);
 	spin_unlock(&fc->lock);
@@ -1548,6 +1558,7 @@ static int fuse_writepage_locked(struct page *page)
 
 	copy_highpage(tmp_page, page);
 	req->misc.write.in.write_flags |= FUSE_WRITE_CACHE;
+	req->misc.write.next = NULL;
 	req->in.argpages = 1;
 	req->num_pages = 1;
 	req->pages[0] = tmp_page;
@@ -1612,6 +1623,62 @@ static void fuse_writepages_send(struct fuse_fill_wb_data *data)
 		end_page_writeback(data->orig_pages[i]);
 }
 
+static bool fuse_writepage_in_flight(struct fuse_req *new_req,
+				     struct page *page)
+{
+	struct fuse_conn *fc = get_fuse_conn(new_req->inode);
+	struct fuse_inode *fi = get_fuse_inode(new_req->inode);
+	struct fuse_req *tmp;
+	struct fuse_req *old_req;
+	bool found = false;
+	pgoff_t curr_index;
+
+	BUG_ON(new_req->num_pages != 0);
+
+	spin_lock(&fc->lock);
+	list_del(&new_req->writepages_entry);
+	new_req->num_pages = 1;
+	list_for_each_entry(old_req, &fi->writepages, writepages_entry) {
+		BUG_ON(old_req->inode != new_req->inode);
+		curr_index = old_req->misc.write.in.offset >> PAGE_CACHE_SHIFT;
+		if (curr_index <= page->index &&
+		    page->index < curr_index + old_req->num_pages) {
+			found = true;
+			break;
+		}
+	}
+	if (!found)
+		goto out_unlock;
+
+	for (tmp = old_req; tmp != NULL; tmp = tmp->misc.write.next) {
+		BUG_ON(tmp->inode != new_req->inode);
+		curr_index = tmp->misc.write.in.offset >> PAGE_CACHE_SHIFT;
+		if (tmp->num_pages == 1 &&
+		    curr_index == page->index) {
+			old_req = tmp;
+		}
+	}
+
+	if (old_req->num_pages == 1 && (old_req->state == FUSE_REQ_INIT ||
+					old_req->state == FUSE_REQ_PENDING)) {
+		copy_highpage(old_req->pages[0], page);
+		spin_unlock(&fc->lock);
+
+		dec_bdi_stat(page->mapping->backing_dev_info, BDI_WRITEBACK);
+		dec_zone_page_state(page, NR_WRITEBACK_TEMP);
+		fuse_writepage_free(fc, new_req);
+		fuse_request_free(new_req);
+		goto out;
+	} else {
+		new_req->misc.write.next = old_req->misc.write.next;
+		old_req->misc.write.next = new_req;
+	}
+out_unlock:
+	spin_unlock(&fc->lock);
+out:
+	return found;
+}
+
 static int fuse_writepages_fill(struct page *page,
 		struct writeback_control *wbc, void *_data)
 {
@@ -1620,6 +1687,7 @@ static int fuse_writepages_fill(struct page *page,
 	struct inode *inode = data->inode;
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct page *tmp_page;
+	bool is_writeback;
 	int err;
 
 	if (!data->ff) {
@@ -1629,15 +1697,20 @@ static int fuse_writepages_fill(struct page *page,
 			goto out_unlock;
 	}
 
-	if (req) {
-		BUG_ON(!req->num_pages);
-		if (req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
-		    (req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_write ||
-		    data->orig_pages[req->num_pages - 1]->index + 1 != page->index) {
+	/*
+	 * Being under writeback is unlikely but possible.  For example direct
+	 * read to an mmaped fuse file will set the page dirty twice; once when
+	 * the pages are faulted with get_user_pages(), and then after the read
+	 * completed.
+	 */
+	is_writeback = fuse_page_is_writeback(inode, page->index);
 
-			fuse_writepages_send(data);
-			data->req = NULL;
-		}
+	if (req && req->num_pages &&
+	    (is_writeback || req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
+	     (req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_write ||
+	     data->orig_pages[req->num_pages - 1]->index + 1 != page->index)) {
+		fuse_writepages_send(data);
+		data->req = NULL;
 	}
 	err = -ENOMEM;
 	tmp_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
@@ -1669,6 +1742,7 @@ static int fuse_writepages_fill(struct page *page,
 
 		fuse_write_fill(req, data->ff, page_offset(page), 0);
 		req->misc.write.in.write_flags |= FUSE_WRITE_CACHE;
+		req->misc.write.next = NULL;
 		req->in.argpages = 1;
 		req->background = 1;
 		req->num_pages = 0;
@@ -1690,6 +1764,13 @@ static int fuse_writepages_fill(struct page *page,
 
 	inc_bdi_stat(page->mapping->backing_dev_info, BDI_WRITEBACK);
 	inc_zone_page_state(tmp_page, NR_WRITEBACK_TEMP);
+
+	err = 0;
+	if (is_writeback && fuse_writepage_in_flight(req, page)) {
+		end_page_writeback(page);
+		data->req = NULL;
+		goto out_unlock;
+	}
 	data->orig_pages[req->num_pages] = page;
 
 	/*
@@ -1700,7 +1781,6 @@ static int fuse_writepages_fill(struct page *page,
 	req->num_pages++;
 	spin_unlock(&fc->lock);
 
-	err = 0;
 out_unlock:
 	unlock_page(page);
 
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 5ced199..a08b355 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -319,6 +319,7 @@ struct fuse_req {
 		struct {
 			struct fuse_write_in in;
 			struct fuse_write_out out;
+			struct fuse_req *next;
 		} write;
 		struct fuse_notify_retrieve_in retrieve_in;
 		struct fuse_lk_in lk_in;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] fuse: writepage: skip already in flight
  2013-09-19 16:11 [PATCH 0/3] fuse: fixes for writepages Miklos Szeredi
  2013-09-19 16:11 ` [PATCH 1/3] fuse: writepages: fix aggregation Miklos Szeredi
  2013-09-19 16:11 ` [PATCH 2/3] fuse: writepages: handle same page rewrites Miklos Szeredi
@ 2013-09-19 16:11 ` Miklos Szeredi
  2013-09-20  8:06 ` [PATCH 0/3] fuse: fixes for writepages Maxim Patlasov
  3 siblings, 0 replies; 5+ messages in thread
From: Miklos Szeredi @ 2013-09-19 16:11 UTC (permalink / raw)
  To: mpatlasov; +Cc: linux-fsdevel, linux-kernel, Miklos Szeredi

From: Miklos Szeredi <mszeredi@suse.cz>

If ->writepage() tries to write back a page whose copy is still in flight,
then just skip by calling redirty_page_for_writepage().

This is OK, since now ->writepage() should never be called for data
integrity sync.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/fuse/file.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index f8ff019..135360e 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1591,6 +1591,18 @@ static int fuse_writepage(struct page *page, struct writeback_control *wbc)
 {
 	int err;
 
+	if (fuse_page_is_writeback(page->mapping->host, page->index)) {
+		/*
+		 * ->writepages() should be called for sync() and friends.  We
+		 * should only get here on direct reclaim and then we are
+		 * allowed to skip a page which is already in flight
+		 */
+		WARN_ON(wbc->sync_mode == WB_SYNC_ALL);
+
+		redirty_page_for_writepage(wbc, page);
+		return 0;
+	}
+
 	err = fuse_writepage_locked(page);
 	unlock_page(page);
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] fuse: fixes for writepages
  2013-09-19 16:11 [PATCH 0/3] fuse: fixes for writepages Miklos Szeredi
                   ` (2 preceding siblings ...)
  2013-09-19 16:11 ` [PATCH 3/3] fuse: writepage: skip already in flight Miklos Szeredi
@ 2013-09-20  8:06 ` Maxim Patlasov
  3 siblings, 0 replies; 5+ messages in thread
From: Maxim Patlasov @ 2013-09-20  8:06 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, linux-kernel, fuse-devel@lists.sourceforge.net

Miklos,

On 09/19/2013 08:11 PM, Miklos Szeredi wrote:
> Maxim,
>
> Please review and test these.  I've appended them to writepages.v2 and for-next.

Thanks a lot for efforts. I'll start to work on it now and send you 
feedback as soon as I get a progress.

Btw, adding fuse-devel to cc.

Thanks,
Maxim

>
> Here's a test patch for the rewrites stuff.  It simply disables the waiting in
> fuse_page_mkwrite() and is quite effective in generating rewrites in my
> testcase.
>
> Thanks,
> Miklos
>
> ---
>   fs/fuse/file.c |    6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1498,6 +1498,7 @@ static void fuse_writepage_end(struct fu
>   
>   	mapping_set_error(inode->i_mapping, req->out.h.error);
>   	spin_lock(&fc->lock);
> +	int count = 0;
>   	while (req->misc.write.next) {
>   		struct fuse_req *next = req->misc.write.next;
>   		req->misc.write.next = next->misc.write.next;
> @@ -1505,7 +1506,10 @@ static void fuse_writepage_end(struct fu
>   		list_add(&next->writepages_entry, &fi->writepages);
>   		list_add_tail(&next->list, &fi->queued_writes);
>   		fuse_flush_writepages(inode);
> +		count++;
>   	}
> +	if (count)
> +		printk("rewrite: %i\n", count);
>   	fi->writectr--;
>   	fuse_writepage_finish(fc, req);
>   	spin_unlock(&fc->lock);
> @@ -1884,7 +1888,7 @@ static int fuse_page_mkwrite(struct vm_a
>   		return VM_FAULT_NOPAGE;
>   	}
>   
> -	fuse_wait_on_page_writeback(inode, page->index);
> +	//fuse_wait_on_page_writeback(inode, page->index);
>   	return VM_FAULT_LOCKED;
>   }
>   
>
>
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-09-20  8:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-19 16:11 [PATCH 0/3] fuse: fixes for writepages Miklos Szeredi
2013-09-19 16:11 ` [PATCH 1/3] fuse: writepages: fix aggregation Miklos Szeredi
2013-09-19 16:11 ` [PATCH 2/3] fuse: writepages: handle same page rewrites Miklos Szeredi
2013-09-19 16:11 ` [PATCH 3/3] fuse: writepage: skip already in flight Miklos Szeredi
2013-09-20  8:06 ` [PATCH 0/3] fuse: fixes for writepages Maxim Patlasov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox