From: Fengguang Wu <fengguang.wu@intel.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Jan Kara <jack@suse.cz>,
	Jens Axboe <axboe@kernel.dk>,
	linux-mm@kvack.org, sjayaraman@suse.com, andrea@betterlinux.com,
	jmoyer@redhat.com, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
	lizefan@huawei.com, containers@lists.linux-foundation.org,
	cgroups@vger.kernel.org, ctalbott@google.com, rni@google.com,
	lsf@lists.linux-foundation.org
Subject: Re: [RFC] writeback and cgroup
Date: Wed, 4 Apr 2012 14:42:28 -0700
Message-ID: <20120404214228.GA6471@localhost>
In-Reply-To: <20120404183528.GJ12676@redhat.com>

On Wed, Apr 04, 2012 at 02:35:29PM -0400, Vivek Goyal wrote:
> On Wed, Apr 04, 2012 at 10:51:24AM -0700, Fengguang Wu wrote:
> 
> [..]
> > The sweet split point would be for balance_dirty_pages() to do cgroup
> > aware buffered write throttling and leave other IOs to the current
> > blkcg. For this to work well as a total solution for end users, I hope
> > we can cooperate and figure out ways for the two throttling entities
> > to work well with each other.
> 
> Throttling read + direct IO, higher up has few issues too. Users will

Yeah, I am a bit worried about high-layer throttling, too.
Anyway, here are the ideas.

> not like that a task got blocked as it tried to submit a read from a
> throttled group.

That's not the same issue I was worried about :) Throttling is about
inserting small sleeps/waits at selected points. For reads, the ideal
sleep point is immediately after the readahead IO is submitted, at the
end of __do_page_cache_readahead(). The same should apply to direct IO.
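
To make the "small sleep" concrete, here is a rough user-space sketch
in Python (not kernel code). It assumes the pause is simply the bytes
just submitted divided by the task's rate limit, capped at some upper
bound. The function name, the parameters and the 200 ms cap are all
made up for illustration.

```python
def throttle_pause(bytes_submitted, task_ratelimit, max_pause=0.2):
    """Return how long (in seconds) a task should sleep after submitting IO.

    Hypothetical helper: `task_ratelimit` is in bytes per second, and
    `max_pause` is an assumed cap so that one large submission never
    stalls the task for too long.
    """
    if task_ratelimit <= 0:
        # No usable estimate yet: fall back to the longest allowed pause.
        return max_pause
    # Sleep long enough that the average rate matches the limit.
    return min(bytes_submitted / task_ratelimit, max_pause)
```

For example, a 128 KiB readahead window under a 1 MiB/s limit gives a
pause of 0.125 seconds, inserted once the readahead has been issued.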

> Current async behavior works well where we queue up the
> bio from the task in throttled group and let task do other things. Same
> is true for AIO where we would not like to block in bio submission.

For AIO, we'll need to delay the IO completion notification or status
update, which may involve computing some delay time and delaying the
calls to io_complete() with the help of a delayed work queue. There
may be more issues to deal with, as I haven't looked into aio.c
carefully.

The thing that worries me is that in the proportional throttling case,
the high-level throttling works on the *estimated* task_ratelimit =
disk_bandwidth / N, where N is the number of read IO tasks. When N
suddenly changes from 2 to 1, it may take 1 second for the estimated
task_ratelimit to adapt from disk_bandwidth/2 up to disk_bandwidth,
during which time the disk won't be 100% utilized because of the
temporary over-throttling of the remaining IO task.
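
The cost of that lag can be sketched with a toy model in Python. The
assumption here (mine, for illustration only) is that the estimate
ramps linearly from the old value to the new one over the adaptation
time; the real estimator may well converge along a different curve.
All names below are hypothetical.

```python
def simulate_ratelimit_lag(disk_bandwidth, n_before, n_after,
                           adapt_time=1.0, step=0.1):
    """Return (utilization samples, bandwidth-seconds lost) during the lag.

    Assumption: the estimated task_ratelimit ramps linearly from
    disk_bandwidth / n_before to disk_bandwidth / n_after over
    `adapt_time` seconds, sampled every `step` seconds.
    """
    start = disk_bandwidth / n_before
    target = disk_bandwidth / n_after
    steps = round(adapt_time / step)
    samples = []
    lost = 0.0
    for i in range(steps):
        # Estimated per-task limit at the start of this interval.
        ratelimit = start + (target - start) * i / steps
        # The remaining tasks together cannot exceed the disk bandwidth.
        used = min(disk_bandwidth, n_after * ratelimit)
        samples.append(used / disk_bandwidth)
        lost += (disk_bandwidth - used) * step
    return samples, lost
```

With N going from 2 to 1, utilization starts at 50% and only climbs
back towards 100% as the estimate catches up, so under this model
roughly a quarter of one second's worth of disk bandwidth goes unused.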

This is not a problem when throttling at the block/cfq layer, since it
has full information about the pending requests and does not depend on
such estimates.

The workaround I can think of is to put the throttled task on a wait
queue and let the block layer wake up the waiters when the IO queue
runs empty. This should avoid most of the disk idle time.
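
A toy user-space model of that idea, in Python: tasks are parked with
their normal throttle deadline, and a hook that stands in for "the IO
queue ran empty" wakes them all early. The class and method names are
hypothetical and do not correspond to any existing kernel interface.

```python
import heapq


class ThrottleWaitQueue:
    """Toy model of a throttle wait queue (hypothetical, user-space only)."""

    def __init__(self):
        self._heap = []  # entries: (deadline, sequence, task_name)
        self._seq = 0    # tie-breaker so equal deadlines stay FIFO

    def sleep(self, task, now, pause):
        """Park `task` until now + pause, its normal throttle deadline."""
        heapq.heappush(self._heap, (now + pause, self._seq, task))
        self._seq += 1

    def wake_expired(self, now):
        """Wake the tasks whose throttle pause has fully elapsed."""
        woken = []
        while self._heap and self._heap[0][0] <= now:
            woken.append(heapq.heappop(self._heap)[2])
        return woken

    def on_io_queue_empty(self):
        """Stand-in for the block layer hook: wake every waiter early."""
        woken = [task for _, _, task in sorted(self._heap)]
        self._heap.clear()
        return woken
```

A reader parked at time 0 with a 1 second pause would normally stay
asleep until time 1; if the IO queue drains at time 0.4, calling
on_io_queue_empty() releases it right away so the disk does not sit
idle for the remaining 0.6 seconds.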

Thanks,
Fengguang
