Re: [PATCH 0/4] io-controller: Add new interfaces to trace backlogged group status

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
To: Divyesh Shah <dpshah@google.com>
Cc: Vivek Goyal <vgoyal@redhat.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux kernel mailing list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/4] io-controller: Add new interfaces to trace backlogged group status
Date: Fri, 11 Jun 2010 14:31:08 +0800	[thread overview]
Message-ID: <4C11D82C.50902@cn.fujitsu.com> (raw)
In-Reply-To: <AANLkTikZSgnSqK_JMVqy9ncFse0EbZm8j4WTTcJ00hwL@mail.gmail.com>

Divyesh Shah wrote:
> On Tue, May 25, 2010 at 6:25 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> On Tue, May 25, 2010 at 11:00:54AM +0800, Gui Jianfeng wrote:
>>> Vivek Goyal wrote:
>>>> On Tue, May 25, 2010 at 09:37:31AM +0800, Gui Jianfeng wrote:
>>>>> Vivek Goyal wrote:
>>>>>> On Mon, May 24, 2010 at 09:12:05AM +0800, Gui Jianfeng wrote:
>>>>>>> Vivek Goyal wrote:
>>>>>>>> On Fri, May 21, 2010 at 04:40:50PM +0800, Gui Jianfeng wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> This series implements three new interfaces to keep track of tranferred bytes,
>>>>>>>>> elapsing time and io rate since group getting backlogged. If the group dequeues
>>>>>>>>> from service tree, these three interfaces will reset and shows zero.
>>>>>>>> Hi Gui,
>>>>>>>>
>>>>>>>> Can you give some details regarding how this functionality is useful? Why
>>>>>>>> would somebody be interested in only in stats of till group was
>>>>>>>> backlogged and not in total stats?
>>>>>>>>
>>>>>>>> Groups can come and go so fast and these stats will reset so many times
>>>>>>>> that I am not able to visualize how these stats will be useful.
>>>>>>> Hi Vivek,
>>>>>>>
>>>>>>> Currently, we assign weight to a group, but user still doesn't know how fast the
>>>>>>> group runs. With io rate interface, users can check the rate of a group at any
>>>>>>> moment, or to determine whether the weight assigned to a group is enough.
>>>>>>> bytes and time interface is just for debug purpose.
>>>>>> Gui,
>>>>>>
>>>>>> I still don't understand that why blkio.sectors or blkio.io_service_bytes
>>>>>> or blkio.io_serviced interfaces are not good enough to determine at what
>>>>>> rate a group is doing IO.
>>>>>>
>>>>>> I think we can very well write something in userspace like "iostat" to
>>>>>> display the per group rate. Utility can read the any of the above files
>>>>>> say at the interfval of 1s, calculate the diff between the values and
>>>>>> display that as group effective rate.
>>>>> Hi Vivek,
>>>>>
>>>>> blkio.io_active_rate reflects the rate since group get backlogged, so the rate is a smooth
>>>>> value. This value represents the actual rate a group runs. IMO, io rate calculated from
>>>>> user space is not accurate in following two scenarios:
>>>>>
>>>>> 1 Userspace app chooses the interval of 1s, if 0.5s is backlogged and 0.5s is not, the
>>>>>   rate calculated in this interval doesn't make sense.
>>>>>
>>>> If you are not servicing groups for long time, anyway it is very bad for
>>>> latency. So that's why soft limit of 300ms of CFQ makes sense and
>>>> practically I am not sure you will be blocking groups for .5s.
>>>>
>>>> Even if you do, then user just needs to choose a bigger interval and you
>>>> will see more smooth rates. Reduce the interval and you might see little
>>>> bursty rate.
>>> Vivek,
>>>
>>> IIUC, the most big problem for user app is the user app doesn't know how long
>>> the group has been dequeued during the interval. For example, user choose
>>> 10s interval, 8s of which is not backlogged, but when user app calculates
>>> io rate, this 8s still include. So this rate isn't what we want. Am i missing
>>> something?
>> Gui,
>>
>> If user application is not doing enough IO and group is getting deleted
>> fast, io_active_rate is not going to give you any meaningful data as it
>> will be lost the moment group gets deleted.
>>
>> Hence one needs to monitor the IO rate when a workload is running and is
>> keeping disk busy more or less all the time.
>>
>> Even in your example, if you monitored IO rate over 10 second interval and
>> group is not doing any IO, you just can't do anything about it. Just that
>> your measurement e method is wrong. Even io_active_rate will not help you
>> here as by the time you read the file, group is gone and there is no data.
>>
>> The very reason you want to monitor rate is that you want to make sure
>> group is getting enough BW. If group is not doing IO then one can look at
>> blkio.dequeue file and see if group is getting deleted too frequently. If
>> yes, that means group is not doing enough IO to keep the disk busy. One
>> can also try increasing the weight of the group but that will not help
>> much if group does not remain backlogged for significant amount of time.
>>
>>> "io_active_rate" will never take un-backlogged time into account when calculating
>>> io rate.
>>>
>> Theoritically blkio.sectors/blkio.time gives the rate excluding the time
>> when group was not backlogged?
> 
> I agree with Vivek here. We use blkio.time as a source for io rate
> count for each cgroup, knowing that it is not entirely accurate but a
> good enough approximation.
> 
> Gui, if you want to find out whether the cgroup has enough weight or
> not, I'd recommend looking at the wait_time stat in addition to
> blkio.time. It has been very useful in identifying jobs that are not
> getting enough IO done due to less weight assigned to them.

Ok, see. :)

Thanks,
Gui

> 
>> But I will not recommend using blkio.time as it is very approximate.
>>
>> I really am not able to see what this interface is really buying you.
>>
>> Thanks
>> Vivek
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>
>

     prev parent reply	other threads:[~2010-06-11  6:33 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-21  8:40 [PATCH 0/4] io-controller: Add new interfaces to trace backlogged group status Gui Jianfeng
2010-05-21  8:43 ` [PATCH 1/4] io-controller: a new interface to keep track of bytes during group is backlogged Gui Jianfeng
2010-05-21  8:44 ` [PATCH 2/4] io-controller: a new interface to keep track of the time since group bacame backlogged Gui Jianfeng
2010-05-21  8:45 ` [PATCH 3/4] io-controller: a new interface to keep track of io rate when group is backlogged Gui Jianfeng
2010-05-21  8:46 ` [PATCH 4/4] io-controller: Document for active bytes, time and rate Gui Jianfeng
2010-05-26 18:57   ` Randy Dunlap
2010-05-21 13:17 ` [PATCH 0/4] io-controller: Add new interfaces to trace backlogged group status Vivek Goyal
2010-05-24  1:12   ` Gui Jianfeng
2010-05-24 21:22     ` Vivek Goyal
2010-05-25  1:37       ` Gui Jianfeng
2010-05-25  2:03         ` Vivek Goyal
2010-05-25  3:00           ` Gui Jianfeng
2010-05-25 13:25             ` Vivek Goyal
2010-06-11  5:10               ` Divyesh Shah
2010-06-11  6:31                 ` Gui Jianfeng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C11D82C.50902@cn.fujitsu.com \
    --to=guijianfeng@cn.fujitsu.com \
    --cc=dpshah@google.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox