* Re: [RFC][PATCH] vfs: check inode size on no_cached_page
[not found] ` <20090414171114.04a47932.akpm@linux-foundation.org>
@ 2009-04-15 1:36 ` Wu Fengguang
2009-04-15 2:39 ` Wu Fengguang
0 siblings, 1 reply; 2+ messages in thread
From: Wu Fengguang @ 2009-04-15 1:36 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
xcf-pcTkQGq1WEvM1kAEIRd3EQ@public.gmane.org, linux-nfs,
Trond Myklebust
On Wed, Apr 15, 2009 at 08:11:14AM +0800, Andrew Morton wrote:
> On Sun, 12 Apr 2009 15:16:05 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > [This patch may not necessarily be merged, but at least we should
> > be aware of the problem.]
> >
> > When user space requests past-EOF data, do_generic_file_read() will
> > issue a bonus readpage call, which may be unfavorable.
> >
> > do_generic_file_read:
> > -> find_page:
> > -> find_get_page() = NULL
> > -> page_cache_sync_readahead()
> > -> find_get_page() = NULL
> > -> no_cached_page:
> > -> readpage:
> > -> nfs_readpage() = error
> > -> readpage_error:
Sorry nfs_readpage() will actually return 0 now. See below.
> >
> > Reported-by: Xu Chenfeng <xcf-pcTkQGq1WEvM1kAEIRd3EQ@public.gmane.org>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> > mm/filemap.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > --- mm.orig/mm/filemap.c
> > +++ mm/mm/filemap.c
> > @@ -1269,6 +1269,11 @@ readpage_error:
> > goto out;
> >
> > no_cached_page:
> > + isize = i_size_read(inode);
> > + end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
> > + if (unlikely(!isize || index > end_index))
> > + goto out;
> > +
> > /*
> > * Ok, it wasn't cached, so we need to create a new
> > * page..
>
> Is this a problem which needs to be solved? userspace does something
> silly and the kernel behaves a bit suboptimally?
>
> If thats the only problem here then it's not worth adding fastpath
> cycles to fix it?
Yeah just some inefficiency in theory, so no fixing is necessary.
The underlying fs code shall be able to do the right thing - just
as if a concurrent truncate happened.
The NFS case goes like this:
nfs_readpage()
{
# some bonus accountings:
nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
nfs_add_stats(inode, NFSIOS_READPAGES, 1);
nfs_readpage_async(page)
nfs_return_empty_page(page)
zero_user(page) # will zero the page
return 0;
}
After it returns 0, do_generic_file_read() will goto page_ok and check
i_size there, and free the past-EOF page.
I wonder if NFS could be improved to:
- move the NFSIOS_READPAGES accounting _after_ a successful read
- return AOP_TRUNCATED_PAGE instead of zeroing the past-EOF page
The following untested patch demonstrates the ideas.
Thanks,
Fengguang
---
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 96c4ebf..6688b46 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -76,15 +76,6 @@ void nfs_readdata_release(void *data)
nfs_readdata_free(rdata);
}
-static
-int nfs_return_empty_page(struct page *page)
-{
- zero_user(page, 0, PAGE_CACHE_SIZE);
- SetPageUptodate(page);
- unlock_page(page);
- return 0;
-}
-
static void nfs_readpage_truncate_uninitialised_page(struct nfs_read_data *data)
{
unsigned int remainder = data->args.count - data->res.count;
@@ -123,7 +114,8 @@ int nfs_readpage_async(struct nfs_open_context *ctx, struct inode *inode,
len = nfs_page_length(page);
if (len == 0)
- return nfs_return_empty_page(page);
+ return AOP_TRUNCATED_PAGE;
+
new = nfs_create_request(ctx, inode, page, 0, len);
if (IS_ERR(new)) {
unlock_page(page);
@@ -516,7 +508,6 @@ int nfs_readpage(struct file *file, struct page *page)
dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
page, PAGE_CACHE_SIZE, page->index);
nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
- nfs_add_stats(inode, NFSIOS_READPAGES, 1);
/*
* Try to flush any pending writes to the file..
@@ -550,6 +541,8 @@ int nfs_readpage(struct file *file, struct page *page)
}
error = nfs_readpage_async(ctx, inode, page);
+ if (!error)
+ nfs_add_stats(inode, NFSIOS_READPAGES, 1);
out:
put_nfs_open_context(ctx);
@@ -575,7 +568,7 @@ readpage_async_filler(void *data, struct page *page)
len = nfs_page_length(page);
if (len == 0)
- return nfs_return_empty_page(page);
+ return AOP_TRUNCATED_PAGE;
new = nfs_create_request(desc->ctx, inode, page, 0, len);
if (IS_ERR(new))
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [RFC][PATCH] vfs: check inode size on no_cached_page
2009-04-15 1:36 ` [RFC][PATCH] vfs: check inode size on no_cached_page Wu Fengguang
@ 2009-04-15 2:39 ` Wu Fengguang
0 siblings, 0 replies; 2+ messages in thread
From: Wu Fengguang @ 2009-04-15 2:39 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
xcf@ustc.edu.cn, linux-nfs, Trond Myklebust
On Wed, Apr 15, 2009 at 09:36:34AM +0800, Wu Fengguang wrote:
> On Wed, Apr 15, 2009 at 08:11:14AM +0800, Andrew Morton wrote:
> > On Sun, 12 Apr 2009 15:16:05 +0800
> > Wu Fengguang <fengguang.wu@intel.com> wrote:
> >
> > > [This patch may not necessarily be merged, but at least we should
> > > be aware of the problem.]
> > >
> > > When user space requests past-EOF data, do_generic_file_read() will
> > > issue a bonus readpage call, which may be unfavorable.
> > >
> > > do_generic_file_read:
> > > -> find_page:
> > > -> find_get_page() = NULL
> > > -> page_cache_sync_readahead()
> > > -> find_get_page() = NULL
> > > -> no_cached_page:
> > > -> readpage:
>
> > > -> nfs_readpage() = error
> > > -> readpage_error:
>
> Sorry nfs_readpage() will actually return 0 now. See below.
>
> > >
> > > Reported-by: Xu Chenfeng <xcf@ustc.edu.cn>
> > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > ---
> > > mm/filemap.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > --- mm.orig/mm/filemap.c
> > > +++ mm/mm/filemap.c
> > > @@ -1269,6 +1269,11 @@ readpage_error:
> > > goto out;
> > >
> > > no_cached_page:
> > > + isize = i_size_read(inode);
> > > + end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
> > > + if (unlikely(!isize || index > end_index))
> > > + goto out;
> > > +
> > > /*
> > > * Ok, it wasn't cached, so we need to create a new
> > > * page..
> >
> > Is this a problem which needs to be solved? userspace does something
> > silly and the kernel behaves a bit suboptimally?
> >
> > If thats the only problem here then it's not worth adding fastpath
> > cycles to fix it?
>
> Yeah just some inefficiency in theory, so no fixing is necessary.
>
> The underlying fs code shall be able to do the right thing - just
> as if a concurrent truncate happened.
>
> The NFS case goes like this:
>
> nfs_readpage()
> {
> # some bonus accountings:
> nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
> nfs_add_stats(inode, NFSIOS_READPAGES, 1);
>
> nfs_readpage_async(page)
> nfs_return_empty_page(page)
> zero_user(page) # will zero the page
>
> return 0;
> }
>
> After it returns 0, do_generic_file_read() will goto page_ok and check
> i_size there, and free the past-EOF page.
>
> I wonder if NFS could be improved to:
> - move the NFSIOS_READPAGES accounting _after_ a successful read
> - return AOP_TRUNCATED_PAGE instead of zeroing the past-EOF page
Ah AOP_TRUNCATED_PAGE actually indicates to retry the read_page()
call. Returning AOP_TRUNCATED_PAGE in nfs_read_page() in this case
will create an infinite loop...
Thanks,
Fengguang
> The following untested patch demonstrates the ideas.
>
> Thanks,
> Fengguang
> ---
>
> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> index 96c4ebf..6688b46 100644
> --- a/fs/nfs/read.c
> +++ b/fs/nfs/read.c
> @@ -76,15 +76,6 @@ void nfs_readdata_release(void *data)
> nfs_readdata_free(rdata);
> }
>
> -static
> -int nfs_return_empty_page(struct page *page)
> -{
> - zero_user(page, 0, PAGE_CACHE_SIZE);
> - SetPageUptodate(page);
> - unlock_page(page);
> - return 0;
> -}
> -
> static void nfs_readpage_truncate_uninitialised_page(struct nfs_read_data *data)
> {
> unsigned int remainder = data->args.count - data->res.count;
> @@ -123,7 +114,8 @@ int nfs_readpage_async(struct nfs_open_context *ctx, struct inode *inode,
>
> len = nfs_page_length(page);
> if (len == 0)
> - return nfs_return_empty_page(page);
> + return AOP_TRUNCATED_PAGE;
> +
> new = nfs_create_request(ctx, inode, page, 0, len);
> if (IS_ERR(new)) {
> unlock_page(page);
> @@ -516,7 +508,6 @@ int nfs_readpage(struct file *file, struct page *page)
> dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
> page, PAGE_CACHE_SIZE, page->index);
> nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
> - nfs_add_stats(inode, NFSIOS_READPAGES, 1);
>
> /*
> * Try to flush any pending writes to the file..
> @@ -550,6 +541,8 @@ int nfs_readpage(struct file *file, struct page *page)
> }
>
> error = nfs_readpage_async(ctx, inode, page);
> + if (!error)
> + nfs_add_stats(inode, NFSIOS_READPAGES, 1);
>
> out:
> put_nfs_open_context(ctx);
> @@ -575,7 +568,7 @@ readpage_async_filler(void *data, struct page *page)
>
> len = nfs_page_length(page);
> if (len == 0)
> - return nfs_return_empty_page(page);
> + return AOP_TRUNCATED_PAGE;
>
> new = nfs_create_request(desc->ctx, inode, page, 0, len);
> if (IS_ERR(new))
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2009-04-15 2:39 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20090412071605.GA14058@localhost>
[not found] ` <20090414171114.04a47932.akpm@linux-foundation.org>
2009-04-15 1:36 ` [RFC][PATCH] vfs: check inode size on no_cached_page Wu Fengguang
2009-04-15 2:39 ` Wu Fengguang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).