From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jan Kara <jack@suse.cz>, Steve Rago <sar@nec-labs.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jens.axboe" <jens.axboe@oracle.com>,
	Peter Staubach <staubach@redhat.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH] improve the performance of large sequential write NFS workloads
Date: Thu, 24 Dec 2009 13:04:41 +0100
Message-ID: <1261656281.3596.1.camel@localhost>
In-Reply-To: <20091224025228.GA12477@localhost>

On Thu, 2009-12-24 at 10:52 +0800, Wu Fengguang wrote: 
> Trond,
> 
> On Thu, Dec 24, 2009 at 03:12:54AM +0800, Trond Myklebust wrote:
> > On Wed, 2009-12-23 at 19:05 +0100, Jan Kara wrote: 
> > > On Wed 23-12-09 15:21:47, Trond Myklebust wrote:
> > > > @@ -474,6 +482,18 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
> > > >  	}
> > > >  
> > > >  	spin_lock(&inode_lock);
> > > > +	/*
> > > > +	 * Special state for cleaning NFS unstable pages
> > > > +	 */
> > > > +	if (inode->i_state & I_UNSTABLE_PAGES) {
> > > > +		int err;
> > > > +		inode->i_state &= ~I_UNSTABLE_PAGES;
> > > > +		spin_unlock(&inode_lock);
> > > > +		err = commit_unstable_pages(inode, wait);
> > > > +		if (ret == 0)
> > > > +			ret = err;
> > > > +		spin_lock(&inode_lock);
> > > > +	}
> > >   I don't quite understand this chunk: We've called writeback_single_inode
> > > because it had some dirty pages. Thus it has I_DIRTY_DATASYNC set and a few
> > > lines above your chunk, we've called nfs_write_inode which sent commit to
> > > the server. Now here you sometimes send the commit again? What's the
> > > purpose?
> > 
> > We no longer set I_DIRTY_DATASYNC. We only set I_DIRTY_PAGES (and later
> > I_UNSTABLE_PAGES).
> > 
> > The point is that we now do the commit only _after_ we've sent all the
> > dirty pages, and waited for writeback to complete, whereas previously we
> > did it in the wrong order.
> 
> Sorry I still don't get it. The timing used to be:
> 
> write 4MB   ==> WRITE block 0 (ie. first 512KB)
>                 WRITE block 1
>                 WRITE block 2
>                 WRITE block 3         ack from server for WRITE block 0 => mark 0 as unstable (inode marked need-commit)
>                 WRITE block 4         ack from server for WRITE block 1 => mark 1 as unstable
>                 WRITE block 5         ack from server for WRITE block 2 => mark 2 as unstable
>                 WRITE block 6         ack from server for WRITE block 3 => mark 3 as unstable
>                 WRITE block 7         ack from server for WRITE block 4 => mark 4 as unstable
>                                       ack from server for WRITE block 5 => mark 5 as unstable
> write_inode ==> COMMIT blocks 0-5
>                                       ack from server for WRITE block 6 => mark 6 as unstable (inode marked need-commit)
>                                       ack from server for WRITE block 7 => mark 7 as unstable 
> 
>                                       ack from server for COMMIT blocks 0-5 => mark 0-5 as clean
> 
> write_inode ==> COMMIT blocks 6-7
> 
>                                       ack from server for COMMIT blocks 6-7 => mark 6-7 as clean
> 
> Note that the first COMMIT is submitted before receiving all ACKs for
> the previous writes, hence the second COMMIT is necessary. It seems
> that your patch does not improve the timing at all.

That would indicate that we're cycling through writeback_single_inode()
more than once. Why?

Trond
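
The two orderings argued over in this exchange can be sketched as a toy
model. This is not kernel code: the names block_state, commit_unstable()
and commits_needed() are hypothetical and exist only for illustration. The
model assumes WRITE acks arrive in order and that a COMMIT covers only the
blocks already acked (unstable) when it is issued. Under those assumptions,
a COMMIT sent before all WRITE acks have arrived forces a second COMMIT,
while one sent after all acks covers everything in a single RPC.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64

/* Hypothetical per-block states, mirroring the timeline in the quoted mail. */
enum block_state { BLK_DIRTY, BLK_UNSTABLE, BLK_CLEAN };

/*
 * Model one COMMIT: every block that is currently unstable becomes clean.
 * Returns the number of blocks the COMMIT covered.
 */
static size_t commit_unstable(enum block_state *blk, size_t n)
{
	size_t covered = 0;

	for (size_t i = 0; i < n; i++) {
		if (blk[i] == BLK_UNSTABLE) {
			blk[i] = BLK_CLEAN;
			covered++;
		}
	}
	return covered;
}

/*
 * Simulate nblocks WRITEs whose acks arrive in order. A first COMMIT is
 * attempted once acks_before_commit acks have arrived; a second attempt
 * follows after the remaining acks. An attempt that finds no unstable
 * block sends nothing. Returns the number of COMMIT RPCs actually sent.
 */
size_t commits_needed(size_t nblocks, size_t acks_before_commit)
{
	enum block_state blk[MAX_BLOCKS];
	size_t commits = 0;

	assert(nblocks <= MAX_BLOCKS);
	assert(acks_before_commit <= nblocks);

	for (size_t i = 0; i < nblocks; i++)
		blk[i] = BLK_DIRTY;

	/* Acks that arrive before the first COMMIT attempt. */
	for (size_t i = 0; i < acks_before_commit; i++)
		blk[i] = BLK_UNSTABLE;
	if (commit_unstable(blk, nblocks) > 0)
		commits++;

	/* Acks that arrive afterwards, then the follow-up attempt. */
	for (size_t i = acks_before_commit; i < nblocks; i++)
		blk[i] = BLK_UNSTABLE;
	if (commit_unstable(blk, nblocks) > 0)
		commits++;

	return commits;
}
```

With the numbers from the quoted timeline, commits_needed(8, 6) models the
early COMMIT of blocks 0-5 followed by a second one for blocks 6-7, and
commits_needed(8, 8) models committing only after all eight acks are in.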


