All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: NFS page states & writeback
Date: Fri, 25 Mar 2011 15:00:54 +0800	[thread overview]
Message-ID: <20110325070054.GA5970@localhost> (raw)
In-Reply-To: <20110325012803.GA25052@quack.suse.cz>

Hi Jan,

On Fri, Mar 25, 2011 at 09:28:03AM +0800, Jan Kara wrote:
>   Hi,
> 
>   while working on changes to balance_dirty_pages() I was investigating why
> NFS writeback is *so* bumpy when I do not call writeback_inodes_wb() from
> balance_dirty_pages(). Take a single dd writing to NFS. What I can
> see is that we quickly accumulate dirty pages upto limit - ~700 MB on that
> machine. So flusher thread starts working and in an instant all these ~700
> MB transition from Dirty state to Writeback state. Then, as server acks

That can be fixed by the following patch:

        [PATCH 09/27] nfs: writeback pages wait queue
        https://lkml.org/lkml/2011/3/3/79

> writes, Writeback pages slowly change to Unstable pages (at 100 MB/s rate
> let's say) and then at one moment (commit to server happens) all pages
> transition from Unstable to Clean state - the cycle begins from the start.
> 
> The reason for this behavior seems to be a flaw in the logic in
> over_bground_thresh() which checks:
> global_page_state(NR_FILE_DIRTY) +
>       global_page_state(NR_UNSTABLE_NFS) > background_thresh
> So at the moment all pages are turned Writeback, flusher thread goes to
> sleep and doesn't do any background writeback, until we have accumulated
> enough Stable pages to get over background_thresh. But NFS needs to have
> ->write_inode() called so that it can sent commit requests to the server.
> So effectively we end up sending commit only when background_thresh Unstable
> pages have accumulated which creates the bumpyness. Previously this wasn't
> a problem because balance_dirty_pages() ended up calling ->write_inode()
> often enough for NFS to send commit requests reasonably often.
> 
> Now I wouldn't write so long email about this if I knew how to cleanly fix
> the check ;-). One way to "fix" the check would be to add there Writeback
> pages:
> NR_FILE_DIRTY + NR_WRITEBACK + NR_UNSTABLE_NFS > background_thresh
> 
> This would work in the sense that it would keep flusher thread working but
> a) for normal filesystems it would be working even if there's potentially
> nothing to do (or it is not necessary to do anything)
> b) NFS is picky when it sends commit requests (inode has to have more
> Stable pages than Writeback pages if I'm reading the code in
> nfs_commit_unstable_pages() right) so flusher thread may be working but
> nothing really happens until enough stable pages accumulate.
> 
> A check which kind of works but looks a bit hacky and is not perfect when
> there are multiple files is:
> NR_FILE_DIRTY + NR_UNSTABLE_NFS > background_thresh ||
> NR_UNSTABLE_NFS > NR_WRITEBACK (to match what NFS does)

There is another patch in the series "[PATCH 12/27] nfs: lower
writeback threshold proportionally to dirty threshold" that tries to
limit the NFS write queue size. For the system default 10%/20%
background/dirty ratios, it has the nice effect of keeping

        nr_writeback < 5%

So when the system is dirty exceeded, the background flusher won't
quit because

        nr_dirty + nr_unstable > 10%

Thanks,
Fengguang

  parent reply	other threads:[~2011-03-25  7:02 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-25  1:28 NFS page states & writeback Jan Kara
2011-03-25  4:47 ` Dave Chinner
2011-03-25  7:11   ` Wu Fengguang
2011-03-25  7:11     ` Wu Fengguang
2011-03-25 22:24   ` Jan Kara
2011-03-25 22:24     ` Jan Kara
2011-03-25 23:04     ` Dave Chinner
2011-03-25 23:04       ` Dave Chinner
2011-03-25  7:00 ` Wu Fengguang [this message]
2011-03-25  9:39   ` Dave Chinner
2011-03-25  9:39     ` Dave Chinner
2011-03-25 14:22     ` Wu Fengguang
2011-03-25 14:22       ` Wu Fengguang
2011-03-25 14:32       ` Wu Fengguang
2011-03-25 18:26       ` Jan Kara
2011-03-25 18:26         ` Jan Kara
2011-03-25 22:55       ` Dave Chinner
2011-03-25 23:24         ` Jan Kara
2011-03-26  1:18           ` Dave Chinner
2011-03-27 15:26             ` Trond Myklebust
2011-03-28  0:23               ` Dave Chinner
2011-03-28  0:23                 ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110325070054.GA5970@localhost \
    --to=fengguang.wu@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.