From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jan Kara <jack@suse.cz>, Steve Rago <sar@nec-labs.com>,
Peter Zijlstra <peterz@infradead.org>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"jens.axboe" <jens.axboe@oracle.com>,
Peter Staubach <staubach@redhat.com>,
Arjan van de Ven <arjan@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH] improve the performance of large sequential write NFS workloads
Date: Thu, 24 Dec 2009 13:04:41 +0100 [thread overview]
Message-ID: <1261656281.3596.1.camel@localhost> (raw)
In-Reply-To: <20091224025228.GA12477@localhost>
On Thu, 2009-12-24 at 10:52 +0800, Wu Fengguang wrote:
> Trond,
>
> On Thu, Dec 24, 2009 at 03:12:54AM +0800, Trond Myklebust wrote:
> > On Wed, 2009-12-23 at 19:05 +0100, Jan Kara wrote:
> > > On Wed 23-12-09 15:21:47, Trond Myklebust wrote:
> > > > @@ -474,6 +482,18 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
> > > > }
> > > >
> > > > spin_lock(&inode_lock);
> > > > + /*
> > > > + * Special state for cleaning NFS unstable pages
> > > > + */
> > > > + if (inode->i_state & I_UNSTABLE_PAGES) {
> > > > + int err;
> > > > + inode->i_state &= ~I_UNSTABLE_PAGES;
> > > > + spin_unlock(&inode_lock);
> > > > + err = commit_unstable_pages(inode, wait);
> > > > + if (ret == 0)
> > > > + ret = err;
> > > > + spin_lock(&inode_lock);
> > > > + }
> > > I don't quite understand this chunk: We've called writeback_single_inode
> > > because it had some dirty pages. Thus it has I_DIRTY_DATASYNC set and a few
> > > lines above your chunk, we've called nfs_write_inode which sent commit to
> > > the server. Now here you sometimes send the commit again? What's the
> > > purpose?
> >
> > We no longer set I_DIRTY_DATASYNC. We only set I_DIRTY_PAGES (and later
> > I_UNSTABLE_PAGES).
> >
> > The point is that we now do the commit only _after_ we've sent all the
> > dirty pages, and waited for writeback to complete, whereas previously we
> > did it in the wrong order.
>
> Sorry I still don't get it. The timing used to be:
>
> write 4MB ==> WRITE block 0 (ie. first 512KB)
> WRITE block 1
> WRITE block 2
> WRITE block 3 ack from server for WRITE block 0 => mark 0 as unstable (inode marked need-commit)
> WRITE block 4 ack from server for WRITE block 1 => mark 1 as unstable
> WRITE block 5 ack from server for WRITE block 2 => mark 2 as unstable
> WRITE block 6 ack from server for WRITE block 3 => mark 3 as unstable
> WRITE block 7 ack from server for WRITE block 4 => mark 4 as unstable
> ack from server for WRITE block 5 => mark 5 as unstable
> write_inode ==> COMMIT blocks 0-5
> ack from server for WRITE block 6 => mark 6 as unstable (inode marked need-commit)
> ack from server for WRITE block 7 => mark 7 as unstable
>
> ack from server for COMMIT blocks 0-5 => mark 0-5 as clean
>
> write_inode ==> COMMIT blocks 6-7
>
> ack from server for COMMIT blocks 6-7 => mark 6-7 as clean
>
> Note that the first COMMIT is submitted before receiving all ACKs for
> the previous writes, hence the second COMMIT is necessary. It seems
> that your patch does not improve the timing at all.
That would indicate that we're cycling through writeback_single_inode()
more than once. Why?
Trond
next prev parent reply other threads:[~2009-12-24 12:05 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-17 2:03 [PATCH] improve the performance of large sequential write NFS workloads Steve Rago
2009-12-17 8:17 ` Peter Zijlstra
2009-12-18 19:33 ` Steve Rago
2009-12-18 19:41 ` Ingo Molnar
2009-12-18 21:20 ` Steve Rago
2009-12-18 22:07 ` Ingo Molnar
2009-12-18 22:46 ` Steve Rago
2009-12-19 8:08 ` Arjan van de Ven
2009-12-19 13:37 ` Steve Rago
2009-12-18 19:44 ` Peter Zijlstra
2009-12-19 12:20 ` Wu Fengguang
2009-12-19 14:25 ` Steve Rago
2009-12-22 1:59 ` Wu Fengguang
2009-12-22 12:35 ` Jan Kara
2009-12-23 8:43 ` Christoph Hellwig
2009-12-23 13:32 ` Jan Kara
2009-12-24 5:25 ` Wu Fengguang
2009-12-24 1:26 ` Wu Fengguang
2009-12-22 13:01 ` Martin Knoblauch
2009-12-24 1:46 ` Wu Fengguang
2009-12-22 16:41 ` Steve Rago
2009-12-24 1:21 ` Wu Fengguang
2009-12-24 14:49 ` Steve Rago
2009-12-25 7:37 ` Wu Fengguang
2009-12-23 14:21 ` Trond Myklebust
2009-12-23 18:05 ` Jan Kara
2009-12-23 19:12 ` Trond Myklebust
2009-12-24 2:52 ` Wu Fengguang
2009-12-24 12:04 ` Trond Myklebust [this message]
2009-12-25 5:56 ` Wu Fengguang
2009-12-30 16:22 ` Trond Myklebust
2009-12-31 5:04 ` Wu Fengguang
2009-12-31 19:13 ` Trond Myklebust
2010-01-06 3:03 ` Wu Fengguang
2010-01-06 16:56 ` Trond Myklebust
2010-01-06 18:26 ` Trond Myklebust
2010-01-06 18:37 ` Peter Zijlstra
2010-01-06 18:52 ` Trond Myklebust
2010-01-06 19:07 ` Peter Zijlstra
2010-01-06 19:21 ` Trond Myklebust
2010-01-06 19:53 ` Trond Myklebust
2010-01-06 20:09 ` Jan Kara
2009-12-22 12:25 ` Jan Kara
2009-12-22 12:38 ` Peter Zijlstra
2009-12-22 12:55 ` Jan Kara
2009-12-22 16:20 ` Steve Rago
2009-12-23 18:39 ` Jan Kara
2009-12-23 20:16 ` Steve Rago
2009-12-23 21:49 ` Trond Myklebust
2009-12-23 23:13 ` Steve Rago
2009-12-23 23:44 ` Trond Myklebust
2009-12-24 4:30 ` Steve Rago
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1261656281.3596.1.camel@localhost \
--to=trond.myklebust@netapp.com \
--cc=arjan@infradead.org \
--cc=fengguang.wu@intel.com \
--cc=jack@suse.cz \
--cc=jens.axboe@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=sar@nec-labs.com \
--cc=staubach@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox