From: Wu Fengguang <fengguang.wu@intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Chris Mason <chris.mason@oracle.com>,
	Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
	Jens Axboe <axboe@kernel.dk>, Mel Gorman <mel@csn.ul.ie>,
	Rik van Riel <riel@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Christoph Hellwig <hch@lst.de>, linux-mm <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Tang, Feng" <feng.tang@intel.com>,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH 01/13] writeback: IO-less balance_dirty_pages()
Date: Mon, 6 Dec 2010 00:14:35 +0800
Message-ID: <20101205161435.GA1421@localhost>
In-Reply-To: <20101201133818.GA13377@localhost>

On Wed, Dec 01, 2010 at 09:38:18PM +0800, Wu Fengguang wrote:
> [restore CC list for new findings]
> 
> On Wed, Dec 01, 2010 at 06:39:25AM +0800, Peter Zijlstra wrote:
> > On Tue, 2010-11-30 at 23:35 +0100, Peter Zijlstra wrote:
> > > On Tue, 2010-11-30 at 12:37 +0800, Wu Fengguang wrote:
> > > > On Tue, Nov 30, 2010 at 04:53:33AM +0800, Peter Zijlstra wrote:
> > > > > On Mon, 2010-11-29 at 23:17 +0800, Wu Fengguang wrote:
> > > > > > Hi Peter,
> > > > > >
> > > > > > I'm drawing funny graphs to track the writeback dynamics :)
> > > > > >
> > > > > > In the attached graphs, I find abnormals in dirty-pages-3000.png and
> > > > > > dirty-pages-200.png.  The task limit is what's returned by
> > > > > > task_dirty_limit(), which should be very stable. However from the
> > > > > > graph it seems the task weight (numerator/denominator) will suddenly
> > > > > > drop to near 0 on every 9-10 seconds.  Do you have immediate insight
> > > > > > on what's going on? If not, I'm going to do some tracing to track down
> > > > > > how the numbers change over time.
> > > > >
> > > > > No immediate thoughts there.. I need to look through the math again, but
> > > > > I'm kinda swamped atm. (and my primary dev machine had its disk die this
> > > > > morning). I'll try and get around to it soon..
> > > >
> > > > Peter, I did a simple debug patch (attached) and collected these
> > > > numbers. I noticed that at the "task_weight=27%" and "task_weight=14%"
> > > > lines, "period" increases, "num" is decreased while "den" is still
> > > > increasing.
> > > >
> > > > num=db2e den=e8c0 period=3f8000 shift=10
> > > > num=e04c den=ede0 period=3f8000 shift=10
> > > > num=e56a den=f300 period=3f8000 shift=10
> > >
> > > > num=3e78 den=e400 period=408000 shift=10
> > >
> > > > num=1341 den=8900 period=418000 shift=10
> > > > num=185f den=8e20 period=418000 shift=10
> > > > num=1d7d den=9340 period=418000 shift=10
> > > > num=229b den=9860 period=418000 shift=10
> > > > num=27b9 den=9da0 period=418000 shift=10
> > > > num=2cd7 den=a2c0 period=418000 shift=10
> > >
> > >
> > > This looks sane.. the period indicates someone else was dirtying lots of
> > > pages. Every time the period increases (its shifted right by shift) we
> > > divide the events (num) by 2.
> >
> > Its actually shifted left by shift-1.. see prop_norm_single(), which
> > would make the below:
> >
> > > So the increment from 3f8000 to 408000 is 4064 to 4128, or 64, that
> > > should reset events to 0, seeing that it didn't means it got incremented
> > > as well.
> > >
> > > Funny enough, the second jump is again exactly 64..
> > >
> > > Anyway, as you can see, den increases as long as period stays constant,
> > > it takes a dip when period increments.
> >
> > two steps of 128, which is terribly large.
> >
> > then again, a period of 512 pages is very very small.
> 
> Peter, I also collected prop_norm_single() traces, hope it helps.
> 
> Again, you can find time points when the task limit suddenly skip high
> in graphs "dirty-pages*.png", and then find the corresponding data
> point in file "trace". Sorry I compute something wrong: the "ratio"
> field in the trace data is always 0, please just ignore them.
> 
> I noticed that jbd2/sda8-8-2811 dirtied lots of pages, perhaps by
> ext4_bio_write_page(). This should happen only on -ENOMEM.  I also

Ah, I seem to have found the root cause. See the attached graphs. Ext4
is most likely calling redirty_page_for_writepage() to redirty ~300MB
of pages every ~10s. The redirties happen in big bursts, so not
surprisingly the dd task's dirty weight suddenly drops to 0.
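
To make the drop concrete, here is a minimal sketch that recomputes the
task weight (num/den) from a few of the debug samples quoted above. The
helper name task_weight_percent() is made up for illustration and is
not a kernel function; only the hex values come from the trace.

```python
# (num, den) pairs copied from the quoted debug output, in hex.
SAMPLES = [
    (0xdb2e, 0xe8c0),  # period=3f8000, before the burst
    (0xe56a, 0xf300),  # period=3f8000, before the burst
    (0x3e78, 0xe400),  # period=408000, first drop
    (0x1341, 0x8900),  # period=418000, second drop
    (0x2cd7, 0xa2c0),  # period=418000, slowly recovering
]


def task_weight_percent(num, den):
    """Hypothetical helper: num/den as an integer percentage, rounded down."""
    return num * 100 // den


weights = [task_weight_percent(num, den) for num, den in SAMPLES]

for (num, den), weight in zip(SAMPLES, weights):
    print(f"num={num:x} den={den:x} task_weight={weight}%")
```

The third and fourth samples reproduce the "task_weight=27%" and
"task_weight=14%" lines mentioned in the quoted mail.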

It is probably the same ext4 issue discussed here:

        http://www.spinics.net/lists/linux-fsdevel/msg39555.html
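
The period arithmetic from the quoted discussion can be rechecked with
a few lines of Python. This is only a sketch under the assumptions
stated in the thread (shift=10, and the local period being shifted by
shift-1 as Peter pointed out); period_steps() is a made-up helper, not
the prop_norm_single() code itself.

```python
SHIFT = 10  # "shift=10" from the quoted debug output

# Successive "period" values seen in the quoted debug output.
PERIODS = [0x3f8000, 0x408000, 0x418000]


def period_steps(old, new, shift):
    """Hypothetical helper: number of periods elapsed between two values."""
    return (new - old) >> shift


pairs = list(zip(PERIODS, PERIODS[1:]))

# First reading in the thread: shift right by `shift`.
steps_by_shift = [period_steps(a, b, SHIFT) for a, b in pairs]

# Corrected reading: shift right by `shift - 1`.
steps_by_shift_minus_1 = [period_steps(a, b, SHIFT - 1) for a, b in pairs]

# Period length in pages under the corrected reading.
period_pages = 1 << (SHIFT - 1)

# Halving a counter once per elapsed period: after 128 halvings any
# 16-bit event count such as num=e56a is reduced to zero.
events_after = 0xe56a >> steps_by_shift_minus_1[0]

print(steps_by_shift, steps_by_shift_minus_1, period_pages, events_after)
```

Both jumps come out as 64 steps under the first reading and 128 under
the corrected one, with a period of 512 pages, matching the numbers in
the quoted mails.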

Thanks,
Fengguang

[-- Attachment #2: vmstat-written-300.png --]
[-- Type: image/png, Size: 44152 bytes --]

[-- Attachment #3: vmstat-written.png --]
[-- Type: image/png, Size: 40715 bytes --]
