All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>,
	Steve Rago <sar@nec-labs.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jens.axboe" <jens.axboe@oracle.com>,
	Peter Staubach <staubach@redhat.com>, Jan Kara <jack@suse.cz>,
	Arjan van de Ven <arjan@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] improve the performance of large sequential write NFS workloads
Date: Wed, 23 Dec 2009 19:05:51 +0100	[thread overview]
Message-ID: <20091223180551.GD3159@quack.suse.cz> (raw)
In-Reply-To: <1261578107.2606.11.camel@localhost>

On Wed 23-12-09 15:21:47, Trond Myklebust wrote:
> On Tue, 2009-12-22 at 09:59 +0800, Wu Fengguang wrote: 
> > 1) normal fs sets I_DIRTY_DATASYNC when extending i_size, however NFS
> >    will set the flag for any pages written -- why this trick? To
> >    guarantee the call of nfs_commit_inode()? Which unfortunately turns
> >    almost every server side NFS write into sync writes..
> > 
> >  writeback_single_inode:
> >     do_writepages
> >       nfs_writepages
> >         nfs_writepage ----[short time later]---> nfs_writeback_release*
> >                                                    nfs_mark_request_commit
> >                                                      __mark_inode_dirty(I_DIRTY_DATASYNC);
> >                                     
> >     if (I_DIRTY_SYNC || I_DIRTY_DATASYNC)  <---- so this will be true for most time
> >       write_inode
> >         nfs_write_inode
> >           nfs_commit_inode
> 
> 
> I have been working on a fix for this. We basically do want to ensure
> that NFS calls commit (otherwise we're not finished cleaning the dirty
> pages), but we want to do it _after_ we've waited for all the writes to
> complete. See below...
> 
> Trond
> 
> ------------------------------------------------------------------------------------------------------ 
> VFS: Add a new inode state: I_UNSTABLE_PAGES
> 
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Add a new inode state to enable the vfs to commit the nfs unstable pages to
> stable storage once the write back of dirty pages is done.
  Hmm, does your patch really help?

> @@ -474,6 +482,18 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  	}
>  
>  	spin_lock(&inode_lock);
> +	/*
> +	 * Special state for cleaning NFS unstable pages
> +	 */
> +	if (inode->i_state & I_UNSTABLE_PAGES) {
> +		int err;
> +		inode->i_state &= ~I_UNSTABLE_PAGES;
> +		spin_unlock(&inode_lock);
> +		err = commit_unstable_pages(inode, wait);
> +		if (ret == 0)
> +			ret = err;
> +		spin_lock(&inode_lock);
> +	}
  I don't quite understand this chunk: We've called writeback_single_inode
because it had some dirty pages. Thus it has I_DIRTY_DATASYNC set and a few
lines above your chunk, we've called nfs_write_inode which sent commit to
the server. Now here you sometimes send the commit again? What's the
purpose?

> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index faa0918..4f129b3 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -99,17 +99,14 @@ u64 nfs_compat_user_ino64(u64 fileid)
>  
>  int nfs_write_inode(struct inode *inode, int sync)
>  {
> +	int flags = 0;
>  	int ret;
>  
> -	if (sync) {
> -		ret = filemap_fdatawait(inode->i_mapping);
> -		if (ret == 0)
> -			ret = nfs_commit_inode(inode, FLUSH_SYNC);
> -	} else
> -		ret = nfs_commit_inode(inode, 0);
> -	if (ret >= 0)
> +	if (sync)
> +		flags = FLUSH_SYNC;
> +	ret = nfs_commit_inode(inode, flags);
> +	if (ret > 0)
>  		return 0;
> -	__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
>  	return ret;
>  }
>  

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2009-12-23 18:05 UTC|newest]

Thread overview: 175+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-17  2:03 [PATCH] improve the performance of large sequential write NFS workloads Steve Rago
2009-12-17  8:17 ` Peter Zijlstra
2009-12-18 19:33   ` Steve Rago
2009-12-18 19:41     ` Ingo Molnar
2009-12-18 19:41       ` Ingo Molnar
2009-12-18 21:20       ` Steve Rago
2009-12-18 21:20         ` Steve Rago
2009-12-18 22:07         ` Ingo Molnar
2009-12-18 22:07           ` Ingo Molnar
2009-12-18 22:46           ` Steve Rago
2009-12-18 22:46             ` Steve Rago
2009-12-19  8:08         ` Arjan van de Ven
2009-12-19  8:08           ` Arjan van de Ven
2009-12-19 13:37           ` Steve Rago
2009-12-19 13:37             ` Steve Rago
2009-12-18 19:44     ` Peter Zijlstra
2009-12-18 19:44       ` Peter Zijlstra
2009-12-19 12:20   ` Wu Fengguang
2009-12-19 12:20     ` Wu Fengguang
2009-12-19 14:25     ` Steve Rago
2009-12-22  1:59       ` Wu Fengguang
2009-12-22 12:35         ` Jan Kara
2009-12-22 12:35           ` Jan Kara
2009-12-22 12:35           ` Jan Kara
     [not found]           ` <20091222123538.GB604-jyMamyUUXNJG4ohzP4jBZS1Fcj925eT/@public.gmane.org>
2009-12-23  8:43             ` Christoph Hellwig
2009-12-23  8:43               ` Christoph Hellwig
2009-12-23  8:43               ` Christoph Hellwig
2009-12-23 13:32               ` Jan Kara
2009-12-23 13:32                 ` Jan Kara
2009-12-23 13:32                 ` Jan Kara
     [not found]                 ` <20091223133244.GB3159-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2009-12-24  5:25                   ` Wu Fengguang
2009-12-24  5:25                     ` Wu Fengguang
2009-12-24  5:25                     ` Wu Fengguang
2009-12-24  1:26           ` Wu Fengguang
2009-12-22 13:01         ` Martin Knoblauch
2009-12-22 13:01           ` Martin Knoblauch
     [not found]           ` <787373.9318.qm-rpBZDh8Qtqs5A34FEqDeB/u2YVrzzGjVVpNB7YpNyf8@public.gmane.org>
2009-12-24  1:46             ` Wu Fengguang
2009-12-24  1:46               ` Wu Fengguang
2009-12-22 16:41         ` Steve Rago
2009-12-22 16:41           ` Steve Rago
2009-12-24  1:21           ` Wu Fengguang
2009-12-24 14:49             ` Steve Rago
2009-12-24 14:49               ` Steve Rago
2009-12-24 14:49               ` Steve Rago
2009-12-25  7:37               ` Wu Fengguang
2009-12-23 14:21         ` Trond Myklebust
2009-12-23 14:21           ` Trond Myklebust
2009-12-23 14:21           ` Trond Myklebust
2009-12-23 18:05           ` Jan Kara [this message]
     [not found]             ` <20091223180551.GD3159-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2009-12-23 19:12               ` Trond Myklebust
2009-12-23 19:12                 ` Trond Myklebust
2009-12-23 19:12                 ` Trond Myklebust
2009-12-24  2:52                 ` Wu Fengguang
2009-12-24  2:52                   ` Wu Fengguang
2009-12-24  2:52                   ` Wu Fengguang
2009-12-24 12:04                   ` Trond Myklebust
2009-12-24 12:04                     ` Trond Myklebust
2009-12-24 12:04                     ` Trond Myklebust
2009-12-25  5:56                     ` Wu Fengguang
2009-12-25  5:56                       ` Wu Fengguang
2009-12-25  5:56                       ` Wu Fengguang
2009-12-30 16:22                       ` Trond Myklebust
2009-12-31  5:04                         ` Wu Fengguang
2009-12-31  5:04                           ` Wu Fengguang
2009-12-31  5:04                           ` Wu Fengguang
2009-12-31 19:13                           ` Trond Myklebust
2010-01-06  3:03                             ` Wu Fengguang
2010-01-06 16:56                               ` Trond Myklebust
2010-01-06 16:56                                 ` Trond Myklebust
2010-01-06 16:56                                 ` Trond Myklebust
2010-01-06 18:26                                 ` Trond Myklebust
2010-01-06 18:26                                   ` Trond Myklebust
2010-01-06 18:37                                   ` Peter Zijlstra
2010-01-06 18:37                                     ` Peter Zijlstra
2010-01-06 18:52                                     ` Trond Myklebust
2010-01-06 18:52                                       ` Trond Myklebust
2010-01-06 19:07                                       ` Peter Zijlstra
2010-01-06 19:21                                         ` Trond Myklebust
2010-01-06 19:21                                           ` Trond Myklebust
2010-01-06 19:21                                           ` Trond Myklebust
2010-01-06 19:53                                           ` Trond Myklebust
2010-01-06 19:53                                             ` Trond Myklebust
2010-01-06 20:09                                             ` Jan Kara
2010-01-06 20:09                                               ` Jan Kara
2010-01-06 20:51                                               ` [PATCH 0/6] " Trond Myklebust
2010-01-06 20:51                                                 ` Trond Myklebust
     [not found]                                                 ` <20100106205110.22547.85345.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-06 20:51                                                   ` [PATCH 5/6] VM: Use per-bdi unstable accounting to improve use of wbc->force_commit Trond Myklebust
2010-01-06 20:51                                                     ` Trond Myklebust
     [not found]                                                     ` <20100106205110.22547.32584.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-07  2:34                                                       ` Wu Fengguang
2010-01-07  2:34                                                         ` Wu Fengguang
2010-01-06 20:51                                                   ` [PATCH 1/6] VFS: Ensure that writeback_single_inode() commits unstable writes Trond Myklebust
2010-01-06 20:51                                                     ` Trond Myklebust
     [not found]                                                     ` <20100106205110.22547.17971.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-06 21:38                                                       ` Jan Kara
2010-01-06 21:38                                                         ` Jan Kara
     [not found]                                                         ` <20100106213843.GD22781-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2010-01-06 21:48                                                           ` Trond Myklebust
2010-01-06 21:48                                                             ` Trond Myklebust
2010-01-07  2:18                                                     ` Wu Fengguang
2010-01-07  2:18                                                       ` Wu Fengguang
     [not found]                                                       ` <1262839082.2185.15.camel@localhost>
2010-01-07  4:48                                                         ` Wu Fengguang
2010-01-07  4:48                                                           ` Wu Fengguang
2010-01-07  4:53                                                           ` [PATCH 0/5] Re: [PATCH] improve the performance of large sequential write NFS workloads Trond Myklebust
2010-01-07  4:53                                                             ` Trond Myklebust
     [not found]                                                             ` <20100107045330.5986.55090.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-07  4:53                                                               ` [PATCH 5/5] NFS: Run COMMIT as an asynchronous RPC call when wbc->for_background is set Trond Myklebust
2010-01-07  4:53                                                                 ` Trond Myklebust
2010-01-07  4:53                                                               ` [PATCH 4/5] VM/NFS: The VM must tell the filesystem when to free reclaimable pages Trond Myklebust
2010-01-07  4:53                                                                 ` Trond Myklebust
2010-01-07  4:53                                                               ` [PATCH 1/5] VFS: Ensure that writeback_single_inode() commits unstable writes Trond Myklebust
2010-01-07  4:53                                                                 ` Trond Myklebust
2010-01-07  4:53                                                               ` [PATCH 2/5] VM: Split out the accounting of unstable writes from BDI_RECLAIMABLE Trond Myklebust
2010-01-07  4:53                                                                 ` Trond Myklebust
2010-01-07  4:53                                                               ` [PATCH 3/5] VM: Don't call bdi_stat(BDI_UNSTABLE) on non-nfs backing-devices Trond Myklebust
2010-01-07  4:53                                                                 ` Trond Myklebust
2010-01-07 14:56                                                         ` [PATCH 1/6] VFS: Ensure that writeback_single_inode() commits unstable writes Wu Fengguang
2010-01-07 14:56                                                           ` Wu Fengguang
2010-01-07 15:10                                                           ` Trond Myklebust
2010-01-07 15:10                                                             ` Trond Myklebust
2010-01-08  1:17                                                             ` Wu Fengguang
2010-01-08  1:17                                                               ` Wu Fengguang
2010-01-08  1:37                                                               ` Trond Myklebust
2010-01-08  1:37                                                                 ` Trond Myklebust
2010-01-08  1:53                                                                 ` Wu Fengguang
2010-01-08  1:53                                                                   ` Wu Fengguang
2010-01-08  9:25                                                             ` Christoph Hellwig
2010-01-08  9:25                                                               ` Christoph Hellwig
2010-01-08 13:46                                                               ` Trond Myklebust
2010-01-08 13:54                                                                 ` Christoph Hellwig
2010-01-08 14:15                                                                   ` Trond Myklebust
2010-01-06 20:51                                                   ` [PATCH 6/6] NFS: Run COMMIT as an asynchronous RPC call when wbc->for_background is set Trond Myklebust
2010-01-06 20:51                                                     ` Trond Myklebust
     [not found]                                                     ` <20100106205110.22547.31434.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-07  2:32                                                       ` Wu Fengguang
2010-01-07  2:32                                                         ` Wu Fengguang
2010-01-06 20:51                                                   ` [PATCH 3/6] VM: Split out the accounting of unstable writes from BDI_RECLAIMABLE Trond Myklebust
2010-01-06 20:51                                                     ` Trond Myklebust
     [not found]                                                     ` <20100106205110.22547.93554.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-07  1:48                                                       ` Wu Fengguang
2010-01-07  1:48                                                         ` Wu Fengguang
2010-01-06 20:51                                                   ` [PATCH 4/6] VM: Don't call bdi_stat(BDI_UNSTABLE) on non-nfs backing-devices Trond Myklebust
2010-01-06 20:51                                                     ` Trond Myklebust
2010-01-07  1:56                                                     ` Wu Fengguang
2010-01-06 20:51                                                   ` [PATCH 2/6] VM/NFS: The VM must tell the filesystem when to free reclaimable pages Trond Myklebust
2010-01-06 20:51                                                     ` Trond Myklebust
2010-01-07  2:29                                                     ` Wu Fengguang
2010-01-07  4:49                                                       ` Trond Myklebust
2010-01-07  4:49                                                         ` Trond Myklebust
2010-01-07  5:03                                                         ` Wu Fengguang
2010-01-07  5:03                                                           ` Wu Fengguang
2010-01-07  5:30                                                           ` Trond Myklebust
2010-01-07  5:30                                                             ` Trond Myklebust
2010-01-07 14:37                                                             ` Wu Fengguang
2010-01-07 14:37                                                               ` Wu Fengguang
2010-01-07 14:41                                                               ` [PATCH 0/5] Re: [PATCH] improve the performance of large sequential write NFS workloads Trond Myklebust
     [not found]                                                                 ` <20100107144137.17158.53673.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-07 14:41                                                                   ` [PATCH 4/5] VM/NFS: The VM must tell the filesystem when to free reclaimable pages Trond Myklebust
2010-01-07 14:41                                                                   ` [PATCH 1/5] VFS: Ensure that writeback_single_inode() commits unstable writes Trond Myklebust
2010-01-07 14:41                                                                   ` [PATCH 2/5] VM: Split out the accounting of unstable writes from BDI_RECLAIMABLE Trond Myklebust
2010-01-07 14:41                                                                   ` [PATCH 3/5] VM: Don't call bdi_stat(BDI_UNSTABLE) on non-nfs backing-devices Trond Myklebust
2010-01-07 14:41                                                                   ` [PATCH 5/5] NFS: Run COMMIT as an asynchronous RPC call when wbc->for_background is set Trond Myklebust
2010-01-06 21:44                                                   ` [PATCH 0/6] Re: [PATCH] improve the performance of large sequential write NFS workloads Jan Kara
2010-01-06 21:44                                                     ` Jan Kara
2010-01-06 22:03                                                     ` Trond Myklebust
2010-01-07  8:16                                                 ` Peter Zijlstra
2009-12-22 12:25       ` Jan Kara
     [not found]         ` <20091222122557.GA604-jyMamyUUXNJG4ohzP4jBZS1Fcj925eT/@public.gmane.org>
2009-12-22 12:38           ` Peter Zijlstra
2009-12-22 12:38             ` Peter Zijlstra
2009-12-22 12:55             ` Jan Kara
2009-12-22 12:55               ` Jan Kara
2009-12-22 16:20         ` Steve Rago
2009-12-23 18:39           ` Jan Kara
2009-12-23 18:39             ` Jan Kara
     [not found]             ` <20091223183912.GE3159-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2009-12-23 20:16               ` Steve Rago
2009-12-23 20:16                 ` Steve Rago
2009-12-23 21:49                 ` Trond Myklebust
2009-12-23 21:49                   ` Trond Myklebust
2009-12-23 23:13                   ` Steve Rago
2009-12-23 23:44                     ` Trond Myklebust
2009-12-23 23:44                       ` Trond Myklebust
2009-12-24  4:30                       ` Steve Rago

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091223180551.GD3159@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=Trond.Myklebust@netapp.com \
    --cc=arjan@infradead.org \
    --cc=fengguang.wu@intel.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=sar@nec-labs.com \
    --cc=staubach@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.