* [PATCH] fuse: fix race in fuse_writepages()
@ 2013-08-16 11:51 Maxim Patlasov
  2013-08-29 11:46 ` Miklos Szeredi
  0 siblings, 1 reply; 4+ messages in thread
From: Maxim Patlasov @ 2013-08-16 11:51 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel, devel, xemul

The patch is for

 git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git writepages.v2

The patch fixes a race between ftruncate(2), mmap-ed write and write(2):

1) A user makes a page dirty via an mmap-ed write.
2) The user performs a shrinking truncate(2) intended to purge the page.
3) Before fuse_do_setattr calls truncate_pagecache, the page goes to
   writeback. fuse_writepages_fill attaches a new page to the FUSE_WRITE
   request, then releases the original page by calling end_page_writeback
   and unlocking it.
4) fuse_do_setattr completes and returns successfully. From this point on,
   i_mutex is free.
5) An ordinary write(2) extends i_size back to cover the page. Note that
   fuse_send_write_pages does wait for fuse writeback, but for a different
   page->index.
6) fuse_writepages_fill attaches more pages to the request (if any), then
   fuse_writepages_send is eventually called. It is supposed to crop
   inarg->size of the request, but it doesn't because i_size has already been
   extended back.

Moving end_page_writeback after fuse_writepages_send guarantees that
__fuse_release_nowrite (called from fuse_do_setattr) will crop inarg->size
of the request before write(2) gets the chance to extend i_size.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 568e859..0ebcc79 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1583,6 +1583,7 @@ struct fuse_fill_wb_data {
 	struct fuse_req *req;
 	struct fuse_file *ff;
 	struct inode *inode;
+	struct page **orig_pages;
 };
 
 static void fuse_writepages_send(struct fuse_fill_wb_data *data)
@@ -1591,12 +1592,17 @@ static void fuse_writepages_send(struct fuse_fill_wb_data *data)
 	struct inode *inode = data->inode;
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct fuse_inode *fi = get_fuse_inode(inode);
+	int num_pages = req->num_pages;
+	int i;
 
 	req->ff = fuse_file_get(data->ff);
 	spin_lock(&fc->lock);
 	list_add_tail(&req->list, &fi->queued_writes);
 	fuse_flush_writepages(inode);
 	spin_unlock(&fc->lock);
+
+	for (i = 0; i < num_pages; i++)
+		end_page_writeback(data->orig_pages[i]);
 }
 
 static int fuse_writepages_fill(struct page *page,
@@ -1677,7 +1683,7 @@ static int fuse_writepages_fill(struct page *page,
 
 	inc_bdi_stat(page->mapping->backing_dev_info, BDI_WRITEBACK);
 	inc_zone_page_state(tmp_page, NR_WRITEBACK_TEMP);
-	end_page_writeback(page);
+	data->orig_pages[req->num_pages] = page;
 
 	/*
 	 * Protected by fc->lock against concurrent access by
@@ -1709,6 +1715,13 @@ static int fuse_writepages(struct address_space *mapping,
 	data.req = NULL;
 	data.ff = NULL;
 
+	err = -ENOMEM;
+	data.orig_pages = kzalloc(sizeof(struct page *) *
+				  FUSE_MAX_PAGES_PER_REQ,
+				  GFP_NOFS);
+	if (!data.orig_pages)
+		goto out;
+
 	err = write_cache_pages(mapping, wbc, fuse_writepages_fill, &data);
 	if (data.req) {
 		/* Ignore errors if we can write at least one page */
@@ -1718,6 +1731,8 @@ static int fuse_writepages(struct address_space *mapping,
 	}
 	if (data.ff)
 		fuse_file_put(data.ff, false);
+
+	kfree(data.orig_pages);
 out:
 	return err;
 }
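
The ordering problem in steps 1) to 6) of the commit message can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions: model_crop_size() imitates how a FUSE_WRITE request's
size is clamped against i_size when it is sent, and model_bytes_sent()
replays the two possible orderings for a single page at offset 0. All
names prefixed with model_ and the MODEL_PAGE_SIZE constant are
hypothetical and do not exist in fs/fuse/file.c.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* hypothetical page size for this model */

/*
 * Simplified stand-in for the cropping of inarg->size: return how many
 * bytes of a request covering [offset, offset + data_size) would be sent
 * for a file whose current size is i_size.
 */
uint64_t model_crop_size(uint64_t offset, uint64_t data_size, uint64_t i_size)
{
	if (offset + data_size <= i_size)
		return data_size;		/* request lies fully inside the file */
	if (offset < i_size)
		return i_size - offset;		/* request straddles the end of file */
	return 0;				/* request was truncated off completely */
}

/*
 * Replay the scenario for one dirty page at offset 0 that a truncate to
 * size 0 was meant to purge. If crop_before_extend is nonzero, the crop
 * happens while i_size still reflects the truncate (the ordering the
 * patch aims for); otherwise write(2) extends i_size first (the race).
 */
uint64_t model_bytes_sent(int crop_before_extend)
{
	const uint64_t offset = 0;
	uint64_t i_size = 0;			/* state after steps 1) to 4) */
	uint64_t sent = 0;

	if (crop_before_extend)
		sent = model_crop_size(offset, MODEL_PAGE_SIZE, i_size);

	i_size = 2 * MODEL_PAGE_SIZE;		/* step 5): write(2) to the next page */
	assert(i_size > offset);		/* the old page is covered again */

	if (!crop_before_extend)
		sent = model_crop_size(offset, MODEL_PAGE_SIZE, i_size);

	return sent;
}
```

In this model, model_bytes_sent(0) reports a full page of stale data
reaching the server, while model_bytes_sent(1) reports nothing being
sent for the truncated page.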



* Re: [PATCH] fuse: fix race in fuse_writepages()
  2013-08-16 11:51 [PATCH] fuse: fix race in fuse_writepages() Maxim Patlasov
@ 2013-08-29 11:46 ` Miklos Szeredi
  2013-08-29 12:38   ` Maxim Patlasov
  0 siblings, 1 reply; 4+ messages in thread
From: Miklos Szeredi @ 2013-08-29 11:46 UTC (permalink / raw)
  To: Maxim Patlasov; +Cc: fuse-devel, linux-kernel, devel, xemul

On Fri, Aug 16, 2013 at 03:51:41PM +0400, Maxim Patlasov wrote:
> The patch is for
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git writepages.v2
> 
> The patch fixes a race between ftruncate(2), mmap-ed write and write(2):
> 
> 1) A user makes a page dirty via an mmap-ed write.
> 2) The user performs a shrinking truncate(2) intended to purge the page.
> 3) Before fuse_do_setattr calls truncate_pagecache, the page goes to
>    writeback. fuse_writepages_fill attaches a new page to the FUSE_WRITE
>    request, then releases the original page by calling end_page_writeback
>    and unlocking it.
> 4) fuse_do_setattr completes and returns successfully. From this point on,
>    i_mutex is free.
> 5) An ordinary write(2) extends i_size back to cover the page. Note that
>    fuse_send_write_pages does wait for fuse writeback, but for a different
>    page->index.
> 6) fuse_writepages_fill attaches more pages to the request (if any), then
>    fuse_writepages_send is eventually called. It is supposed to crop
>    inarg->size of the request, but it doesn't because i_size has already been
>    extended back.
> 
> Moving end_page_writeback after fuse_writepages_send guarantees that
> __fuse_release_nowrite (called from fuse_do_setattr) will crop inarg->size
> of the request before write(2) gets the chance to extend i_size.

Thanks for the report.  Your analysis looks correct.

Just one nit, why orig_pages? req->pages is already there, so why duplicate it?

Note: you can do __fuse_get_request()/fuse_put_request() to prevent the req from
going away after it's been sent.

Thanks,
Miklos




* Re: [PATCH] fuse: fix race in fuse_writepages()
  2013-08-29 11:46 ` Miklos Szeredi
@ 2013-08-29 12:38   ` Maxim Patlasov
  2013-08-29 16:21     ` Miklos Szeredi
  0 siblings, 1 reply; 4+ messages in thread
From: Maxim Patlasov @ 2013-08-29 12:38 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-kernel, devel, xemul

Hi,

On 08/29/2013 03:46 PM, Miklos Szeredi wrote:
> Thanks for the report.  Your analysis looks correct.
>
> Just one nit, why orig_pages? req->pages is already there, so why duplicate it?

req->pages is there, but it is already occupied by the new pages (allocated
by fuse_writepages_fill). We can't reuse req->pages for the original pages
because as soon as we put the request on bg_queue (in fuse_writepages_send)
and release fc->lock, req->pages may be accessed without any delay. So we
have two sets of "struct page" pointers to stash somewhere: the original
pages and the new ones. req->pages is for the new pages, orig_pages[] is for
the original ones.

> Note: you can do __fuse_get_request()/fuse_put_request() to prevent the req from
> going away after it's been sent.

Yes, I experimented with this technique before adding orig_pages[]. I was
very reluctant to duplicate that page array and was looking for any
opportunity to avoid it. Pinning the original pages to the new ones using
page->private looked promising, but unfortunately it didn't work:
__fuse_get_request() only protects the request itself from disappearing; it
does not prevent the pages that req->pages[] points to from being released.
And obviously, as soon as a page is released, it's not correct to rely on
the content of its 'private' field.
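
A tiny userspace model of that failure mode, assuming a request and its
temporary page are reference-counted independently. The model_ names
and the private_data field are hypothetical illustrations, not the real
struct fuse_req or struct page.

```c
#include <assert.h>
#include <stddef.h>

struct model_tmp_page {
	int refcount;
	void *private_data;	/* where the original page pointer was stashed */
};

struct model_request {
	int refcount;
	struct model_tmp_page *page;
};

/* Take an extra reference on the request only. */
void model_get_request(struct model_request *req)
{
	assert(req->refcount > 0);
	req->refcount++;
}

/*
 * Complete the request: the temporary page is released no matter how
 * many references the request itself still has.
 */
void model_request_end(struct model_request *req)
{
	struct model_tmp_page *tmp = req->page;

	if (--tmp->refcount == 0)
		tmp->private_data = NULL;	/* page is gone, stash is lost */
	req->page = NULL;
	req->refcount--;
}
```

After model_get_request() followed by model_request_end(), the request
is still alive (its count is nonzero) but the stashed pointer is no
longer usable, which is the situation described above.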

Thanks,
Maxim


* Re: [PATCH] fuse: fix race in fuse_writepages()
  2013-08-29 12:38   ` Maxim Patlasov
@ 2013-08-29 16:21     ` Miklos Szeredi
  0 siblings, 0 replies; 4+ messages in thread
From: Miklos Szeredi @ 2013-08-29 16:21 UTC (permalink / raw)
  To: Maxim Patlasov; +Cc: fuse-devel, Kernel Mailing List, devel, Pavel Emelianov

On Thu, Aug 29, 2013 at 2:38 PM, Maxim Patlasov <mpatlasov@parallels.com> wrote:

>> Just one nit, why orig_pages? req->pages is already there, so why
>> duplicate it?
>
>
> req->pages is there, but it is already occupied by the new pages (allocated
> by fuse_writepages_fill). We can't reuse req->pages for the original pages
> because as soon as we put the request on bg_queue (in fuse_writepages_send)
> and release fc->lock, req->pages may be accessed without any delay. So we
> have two sets of "struct page" pointers to stash somewhere: the original
> pages and the new ones. req->pages is for the new pages, orig_pages[] is for
> the original ones.

Yeah.  Applied the original patch.

Thanks,
Miklos

