All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"xcf@ustc.edu.cn" <xcf@ustc.edu.cn>,
	linux-nfs@vger.kernel.org,
	Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: Re: [RFC][PATCH] vfs: check inode size on no_cached_page
Date: Wed, 15 Apr 2009 10:39:49 +0800	[thread overview]
Message-ID: <20090415023949.GA9017@localhost> (raw)
In-Reply-To: <20090415013634.GB6143@localhost>

On Wed, Apr 15, 2009 at 09:36:34AM +0800, Wu Fengguang wrote:
> On Wed, Apr 15, 2009 at 08:11:14AM +0800, Andrew Morton wrote:
> > On Sun, 12 Apr 2009 15:16:05 +0800
> > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > 
> > > [This patch may not necessarily be merged, but at least we should
> > >  be aware of the problem.]
> > > 
> > > When user space requests past-EOF data, do_generic_file_read() will
> > > issue a bonus readpage call, which may be unfavorable.
> > > 
> > > do_generic_file_read:
> > >         -> find_page:
> > >         -> find_get_page()             = NULL
> > >         -> page_cache_sync_readahead()
> > >         -> find_get_page()             = NULL
> > >         -> no_cached_page:
> > >         -> readpage:
> 
> > >                 -> nfs_readpage()      = error
> > >         -> readpage_error:
> 
> Sorry nfs_readpage() will actually return 0 now. See below.
> 
> > > 
> > > Reported-by: Xu Chenfeng <xcf@ustc.edu.cn>
> > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > ---
> > >  mm/filemap.c |    5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > --- mm.orig/mm/filemap.c
> > > +++ mm/mm/filemap.c
> > > @@ -1269,6 +1269,11 @@ readpage_error:
> > >  		goto out;
> > >  
> > >  no_cached_page:
> > > +		isize = i_size_read(inode);
> > > +		end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
> > > +		if (unlikely(!isize || index > end_index))
> > > +			goto out;
> > > +
> > >  		/*
> > >  		 * Ok, it wasn't cached, so we need to create a new
> > >  		 * page..
> > 
> > Is this a problem which needs to be solved?  userspace does something
> > silly and the kernel behaves a bit suboptimally?
> >
> > If thats the only problem here then it's not worth adding fastpath
> > cycles to fix it?
> 
> Yeah just some inefficiency in theory, so no fixing is necessary.
> 
> The underlying fs code shall be able to do the right thing - just
> as if a concurrent truncate happened.
> 
> The NFS case goes like this:
> 
>         nfs_readpage()
>         {
>                 # some bonus accountings:
>                 nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
>                 nfs_add_stats(inode, NFSIOS_READPAGES, 1);
> 
>                 nfs_readpage_async(page)
>                   nfs_return_empty_page(page)
>                     zero_user(page) # will zero the page
> 
>                 return 0;
>         }
> 
> After it returns 0, do_generic_file_read() will goto page_ok and check
> i_size there, and free the past-EOF page.
> 
> I wonder if NFS could be improved to:
> - move the NFSIOS_READPAGES accounting _after_ a successful read
> - return AOP_TRUNCATED_PAGE instead of zeroing the past-EOF page

Ah AOP_TRUNCATED_PAGE actually indicates to retry the read_page()
call.  Returning AOP_TRUNCATED_PAGE in nfs_read_page() in this case
will create an infinite loop...

Thanks,
Fengguang

> The following untested patch demonstrates the ideas.
> 
> Thanks,
> Fengguang
> ---
> 
> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> index 96c4ebf..6688b46 100644
> --- a/fs/nfs/read.c
> +++ b/fs/nfs/read.c
> @@ -76,15 +76,6 @@ void nfs_readdata_release(void *data)
>  	nfs_readdata_free(rdata);
>  }
>  
> -static
> -int nfs_return_empty_page(struct page *page)
> -{
> -	zero_user(page, 0, PAGE_CACHE_SIZE);
> -	SetPageUptodate(page);
> -	unlock_page(page);
> -	return 0;
> -}
> -
>  static void nfs_readpage_truncate_uninitialised_page(struct nfs_read_data *data)
>  {
>  	unsigned int remainder = data->args.count - data->res.count;
> @@ -123,7 +114,8 @@ int nfs_readpage_async(struct nfs_open_context *ctx, struct inode *inode,
>  
>  	len = nfs_page_length(page);
>  	if (len == 0)
> -		return nfs_return_empty_page(page);
> +		return AOP_TRUNCATED_PAGE;
> +
>  	new = nfs_create_request(ctx, inode, page, 0, len);
>  	if (IS_ERR(new)) {
>  		unlock_page(page);
> @@ -516,7 +508,6 @@ int nfs_readpage(struct file *file, struct page *page)
>  	dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
>  		page, PAGE_CACHE_SIZE, page->index);
>  	nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
> -	nfs_add_stats(inode, NFSIOS_READPAGES, 1);
>  
>  	/*
>  	 * Try to flush any pending writes to the file..
> @@ -550,6 +541,8 @@ int nfs_readpage(struct file *file, struct page *page)
>  	}
>  
>  	error = nfs_readpage_async(ctx, inode, page);
> +	if (!error)
> +		nfs_add_stats(inode, NFSIOS_READPAGES, 1);
>  
>  out:
>  	put_nfs_open_context(ctx);
> @@ -575,7 +568,7 @@ readpage_async_filler(void *data, struct page *page)
>  
>  	len = nfs_page_length(page);
>  	if (len == 0)
> -		return nfs_return_empty_page(page);
> +		return AOP_TRUNCATED_PAGE;
>  
>  	new = nfs_create_request(desc->ctx, inode, page, 0, len);
>  	if (IS_ERR(new))

      reply	other threads:[~2009-04-15  2:39 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-12  7:16 [RFC][PATCH] vfs: check inode size on no_cached_page Wu Fengguang
2009-04-15  0:11 ` Andrew Morton
2009-04-15  1:22   ` Chenfeng Xu
2009-04-15  1:22     ` Chenfeng Xu
2009-04-15  1:36   ` Wu Fengguang
2009-04-15  1:36     ` Wu Fengguang
2009-04-15  1:36     ` Wu Fengguang
2009-04-15  2:39     ` Wu Fengguang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090415023949.GA9017@localhost \
    --to=fengguang.wu@intel.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=xcf@ustc.edu.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.