linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Chris Mason <chris.mason@oracle.com>,
	Artem Bityutskiy <dedekind1@gmail.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	david@fromorbit.com, hch@infradead.org,
	akpm@linux-foundation.org, jack@suse.cz,
	Theodore Ts'o <tytso@mit.edu>,
	Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: [PATCH 8/8] vm: Add an tuning knob for vm.max_writeback_mb
Date: Wed, 9 Sep 2009 16:23:15 +0200	[thread overview]
Message-ID: <20090909142315.GA7949@duck.suse.cz> (raw)
In-Reply-To: <1252434746.7035.7.camel@laptop>

On Tue 08-09-09 20:32:26, Peter Zijlstra wrote:
> On Tue, 2009-09-08 at 19:55 +0200, Peter Zijlstra wrote:
> > 
> > I think I'm somewhat confused here though..
> > 
> > There's kernel threads doing writeout, and there's apps getting stuck in
> > balance_dirty_pages().
> > 
> > If we want all writeout to be done by kernel threads (bdi/pd-flush like
> > things) then we still need to manage the actual apps and delay them.
> > 
> > As things stand now, we kick pdflush into action when dirty levels are
> > above the background level, and start writing out from the app task when
> > we hit the full dirty level.
> > 
> > Moving all writeout to a kernel thread sounds good from writing linear
> > stuff pov, but what do we make apps wait on then?
> 
> OK, so like said in the previous email, we could have these app tasks
> simply sleep on a waitqueue which gets periodic wakeups from
> __bdi_writeback_inc() every time the dirty threshold drops.
> 
> The woken tasks would then check their bdi dirty limit (its task
> dependent) against the current values and either go back to sleep or
> back to work.
  Well, what I imagined we could do is:
Have a per-bdi variable 'pages_written' - that would reflect the amount of
pages written to the bdi since boot (OK, we'd have to handle overflows but
that's doable).

There will be a per-bdi variable 'pages_waited'. When a thread should sleep
in balance_dirty_pages() because we are over limits, it kicks writeback thread
and does:
  to_wait =  max(pages_waited, pages_written) + sync_dirty_pages() (or
whatever number we decide)
  pages_waited = to_wait
  sleep until pages_written reaches to_wait or we drop below dirty limits.

That will make sure each thread will sleep until writeback threads have done
their duty for the writing thread.

If we make sure sleeping threads are properly ordered on the wait queue,
we could always wakeup just the first one and thus avoid the herding
effect. When we drop below dirty limits, we would just wakeup the whole
waitqueue.

Does this sound reasonable?

> The only problem would be the mass wakeups when lots of tasks are
> blocked on dirty, but I'm guessing there's no way around that anyway,
> and its better to have a limited number of writers than have everybody
> write something, which would result in massive write fragmentation.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2009-09-09 14:23 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-08  9:23 [PATCH 0/8] Per-bdi writeback flusher threads v19 Jens Axboe
2009-09-08  9:23 ` [PATCH 1/8] writeback: get rid of generic_sync_sb_inodes() export Jens Axboe
2009-09-08 10:27   ` Artem Bityutskiy
2009-09-08 10:41     ` Jens Axboe
2009-09-08 10:52       ` Artem Bityutskiy
2009-09-08 10:57         ` Jens Axboe
2009-09-08 11:01           ` Artem Bityutskiy
2009-09-08 11:05             ` Jens Axboe
2009-09-08 11:31               ` Artem Bityutskiy
2009-09-08  9:23 ` [PATCH 2/8] writeback: move dirty inodes from super_block to backing_dev_info Jens Axboe
2009-09-08  9:23 ` [PATCH 3/8] writeback: switch to per-bdi threads for flushing data Jens Axboe
2009-09-08 13:46   ` Daniel Walker
2009-09-08 14:21     ` Jens Axboe
2009-09-08  9:23 ` [PATCH 4/8] writeback: get rid of pdflush completely Jens Axboe
2009-09-08  9:23 ` [PATCH 5/8] writeback: add some debug inode list counters to bdi stats Jens Axboe
2009-09-08  9:23 ` [PATCH 6/8] writeback: add name to backing_dev_info Jens Axboe
2009-09-08  9:23 ` [PATCH 7/8] writeback: check for registered bdi in flusher add and inode dirty Jens Axboe
2009-09-08  9:23 ` [PATCH 8/8] vm: Add an tuning knob for vm.max_writeback_mb Jens Axboe
2009-09-08 10:37   ` Artem Bityutskiy
2009-09-08 16:06     ` Peter Zijlstra
2009-09-08 16:29       ` Chris Mason
2009-09-08 16:56         ` Peter Zijlstra
2009-09-08 17:28           ` Chris Mason
2009-09-08 17:46             ` Peter Zijlstra
2009-09-08 17:55               ` Peter Zijlstra
2009-09-08 18:32                 ` Peter Zijlstra
2009-09-09 14:23                   ` Jan Kara [this message]
2009-09-09 14:37                     ` Wu Fengguang
2009-09-10 15:49                     ` Peter Zijlstra
2009-09-14 11:17                       ` Jan Kara
2009-09-24  8:33                         ` Wu Fengguang
2009-09-24 15:38                           ` Peter Zijlstra
2009-09-25  1:33                             ` Wu Fengguang
2009-09-29 17:35                           ` Jan Kara
2009-09-30  1:24                             ` Wu Fengguang
2009-09-30 11:55                               ` Jan Kara
2009-09-30 12:10                                 ` Jens Axboe
2009-10-01 15:17                                   ` Wu Fengguang
2009-10-01 13:36                                 ` Wu Fengguang
2009-10-01 14:22                                   ` Jan Kara
2009-10-01 14:54                                     ` Wu Fengguang
2009-10-01 21:35                                       ` Jan Kara
2009-10-02  2:25                                         ` Wu Fengguang
2009-10-02  9:54                                           ` Jan Kara
2009-10-02 10:34                                             ` Wu Fengguang
2009-09-08 18:35                 ` Chris Mason
2009-09-08 17:57               ` Chris Mason
2009-09-08 18:28                 ` Peter Zijlstra
2009-09-09  1:53           ` Dave Chinner
2009-09-09  3:52             ` Wu Fengguang
2009-09-08 18:06         ` Theodore Tso
     [not found]           ` <20090908181937.GA11545@infradead.org>
2009-09-08 19:34             ` Theodore Tso
2009-09-09  9:29         ` Wu Fengguang
2009-09-09 12:28           ` Christoph Hellwig
2009-09-09 12:32             ` Wu Fengguang
2009-09-09 12:36               ` Artem Bityutskiy
2009-09-09 12:37               ` Jens Axboe
2009-09-09 12:43                 ` Christoph Hellwig
2009-09-09 12:44                   ` Jens Axboe
2009-09-09 12:51                     ` Christoph Hellwig
2009-09-09 12:57                 ` Wu Fengguang
  -- strict thread matches above, loose matches on Subject: below --
2009-09-04  7:46 [PATCH 0/8] Per-bdi writeback flusher threads v18 Jens Axboe
2009-09-04  7:46 ` [PATCH 8/8] vm: Add an tuning knob for vm.max_writeback_mb Jens Axboe
2009-09-04 15:28   ` Richard Kennedy
2009-09-05 13:26     ` Jamie Lokier
2009-09-05 16:18       ` Richard Kennedy
2009-09-05 16:46     ` Theodore Tso
2009-09-07 19:09   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090909142315.GA7949@duck.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=chris.mason@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dedekind1@gmail.com \
    --cc=fengguang.wu@intel.com \
    --cc=hch@infradead.org \
    --cc=jens.axboe@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).