From mboxrd@z Thu Jan  1 00:00:00 1970
From: Suresh Jayaraman <sjayaraman@suse.com>
Subject: Re: [LSF/MM TOPIC] [ATTEND] Throttling I/O
Date: Mon, 28 Jan 2013 16:46:43 +0530
Message-ID: <51065E1B.30209@suse.com>
References: <51028666.1080109@suse.com> <20130125163408.GE6197@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>,
	Fengguang Wu <fengguang.wu@intel.com>,
	Andrea Righi <andrea@betterlinux.com>, Jan Kara <jack@suse.cz>,
	Moyer Jeff Moyer <jmoyer@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from cantor2.suse.de ([195.135.220.15]:55035 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753990Ab3A1LRB (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Mon, 28 Jan 2013 06:17:01 -0500
In-Reply-To: <20130125163408.GE6197@redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On 01/25/2013 10:04 PM, Vivek Goyal wrote:
> On Fri, Jan 25, 2013 at 06:49:34PM +0530, Suresh Jayaraman wrote:
>> Hello,
>>
>> I'd like to discuss again[1] the problem of throttling buffered writes
>> and a throttle mechanism that works for all kinds of I/O.
>>
>> Some background information.
>>
>> During last year's LSF/MM, Fengguang discussed his proportional I/O
>> controller patches as part of the writeback session. The limitations
>> seen in his approach were that it a) does not handle bursty IO
>> submission in the flusher thread, b) shares config variables among
>> different policies, and c) violates layering and lacks a long-term
>> design. Tejun proposed a back-pressure approach to the problem,
>> i.e. apply pressure where the problem is (block layer) and propagate
>> it upwards.
>>
>> The general opinion at that time was that we needed more input and
>> consensus on a natural, flexible, extensible "interface". The
>> discussion thread that Vivek started[2] to collect input on the
>> "interface" gathered a good set of responses, but I am not sure
>> whether it represents input from all the interested parties.
>>
>> At Kernel Summit last year, I learned from LWN[3] that the topic was
>> discussed again. Tejun apparently proposed a solution that splits up
>> the global async CFQ queue by cgroup, so that the CFQ scheduler can
>> easily schedule the per-cgroup sync/async queues according to the
>> per-cgroup I/O weights. Fengguang proposed a solution by supporting the
>> per-cgroup buffered write weights in balance_dirty_pages() and running a
>> user-space daemon that updates the CFQ/BDP weights every second. There
>> doesn't seem to be consensus towards either of the proposed approaches.
>>
> 
> Moving async queues into their respective cgroups is the easy part.
> Also, for throttling, you don't need CFQ. So CFQ and IO throttling are
> somewhat orthogonal. (I am assuming by throttling you mean upper
> limiting IO).

I meant being able to limit I/O. I didn't mean to distinguish strictly
between "upper limit" and "proportional I/O control" in this context,
because my understanding is that limiting a customer's I/O according to
how much they are paying could also be achieved with proportional
control (by doing a little math with the weights). Perhaps the topic
name suggested that the discussion is only about "upper limit" and not
so much about "proportional I/O control". My intent, though, is to
arrive at an acceptable mechanism that allows limiting I/O.
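
To make the "little math with the weights" concrete, here is a rough
sketch in Python. Everything in it is an assumption for illustration:
the tier names, the payment figures, the 140 MB/s device bandwidth and
the helper names are made up, and the 10..1000 weight range is only
assumed to resemble the range CFQ accepts for blkio.weight.

```python
# Sketch only: names and numbers are illustrative, not a kernel API.

def weights_from_payments(payments, max_weight=1000, min_weight=10):
    """Scale per-customer payments to integer weights.

    The highest-paying customer gets max_weight; the others are scaled
    linearly and clamped so nobody falls below min_weight.
    """
    top = max(payments.values())
    return {
        name: max(min_weight, round(max_weight * paid / top))
        for name, paid in payments.items()
    }


def shares_under_contention(weights, device_bw):
    """Bandwidth each group gets when all groups are busy at once."""
    total = sum(weights.values())
    return {name: device_bw * w / total for name, w in weights.items()}


payments = {"gold": 400, "silver": 200, "bronze": 100}  # hypothetical tiers
weights = weights_from_payments(payments)
shares = shares_under_contention(weights, device_bw=140.0)  # MB/s, assumed

for name in payments:
    print(f"{name}: weight={weights[name]} share={shares[name]:.1f} MB/s")
```

With these made-up numbers the weights come out as 1000/500/250 and the
shares as 80/40/20 MB/s. Note that such shares only hold while all
groups are competing for the device; a proportional scheduler is
work-conserving, so a group can exceed its share when the others are
idle. That is the main way this differs from a hard upper limit.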

> And I think Tejun wanted to implement throttling at the block layer and
> wanted the VM to adjust/respond to per-group IO backlog when it comes
> to writing out dirty data/inodes.
> 
> Once we have taken care of the writeback problem, then comes the issue
> of being able to associate a dirty inode/page with a cgroup. Not sure
> if something has happened on that front or not. In the past it was
> thought to be simple: one inode belongs to one IO cgroup.

Yes, this was discussed last year, but not much has happened since, AFAIK.


Thanks

-- 
Suresh Jayaraman