Re: [PATCH] cgroup: limit block I/O bandwidth

From: Jens Axboe <jens.axboe@oracle.com>
To: Andrea Righi <righiandr@users.sourceforge.net>
Cc: Naveen Gupta <ngupta@google.com>, Paul Menage <menage@google.com>,
	Dhaval Giani <dhaval@linux.vnet.ibm.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cgroup: limit block I/O bandwidth
Date: Sun, 20 Jan 2008 15:32:40 +0100
Message-ID: <20080120143239.GS6258@kernel.dk>
In-Reply-To: <4793507B.6040706@users.sourceforge.net>

On Sun, Jan 20 2008, Andrea Righi wrote:
> Andrea Righi wrote:
> > Naveen Gupta wrote:
> >>> Paul Menage wrote:
> >>>> On Jan 18, 2008 7:36 AM, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote:
> >>>>> On Fri, Jan 18, 2008 at 12:41:03PM +0100, Andrea Righi wrote:
> >>>>>> Allow to limit the block I/O bandwidth for specific process containers
> >>>>>> (cgroups) imposing additional delays on I/O requests for those processes
> >>>>>> that exceed the limits defined in the control group filesystem.
> >>>>>>
> >>>>>>  Example:
> >>>>>>   # mkdir /dev/cgroup
> >>>>>>   # mount -t cgroup -oio-throttle io-throttle /dev/cgroup
> >>>>> Just a minor nit, can't we name it as io, keeping in mind that other
> >>>>> controllers are known as cpu and memory?
> >>>> Or maybe "blockio"?
> >>> Agree, blockio seems better. Not all I/O is performed on block devices
> >>> and in this case we're considering block devices only.
> >> Here we want to rate limit in block layer, I would think I/O scheduler
> >> is the place where we are in much better position to do this kind of
> >> limiting.
> >>
> >> Also we are changing the behavior of application by adding sleeps to
> >> it during request submission. Moreover, we will prevent requests from
> >> being merged since we won't allow them to be submitted in this case.
> >>
> >> Since bulk of submission for writes is done in background kernel
> >> threads and we throttle based on limits on current, we will end up
> >> throttling these threads and not the actual processes submitting i/o.
> > 
> > Yep, that's true! This works for read operations only... at the very
> > least, if I've understood well, we could throttle I/O reads in the
> > submit_bio() path and write operations in __set_page_dirty(). But this
> > would change the application's behavior, so probably the best approach
> > could be to just get I/O statistics from TASK_IO_ACCOUNTING stuff and
> > implement task delays at the I/O scheduler layer...
> 
> OK, to better figure the concept I tried to put the I/O throttling
> mechanism inside the simplest scheduler: noop. But this raises a new
> problem: under certain conditions (typically with write requests) a
> delay imposed on an I/O request of a process could impact all the
> other I/O requests of other processes, causing the whole system to hang
> until the first request is completed, so it seems that we have to deal
> with a classic priority inversion problem.
> 
> Obviously the problem doesn't occur if the limited cgroup performs read
> operations only.
> 
> I'm posting the modified patch below only for discussion purposes, you
> could test it if you want, but you've been warned that the whole system
> could hang for certain amounts of time.

Your approach is totally flawed, imho. For instance, you don't want a
process to be able to dirty memory at foo MB/sec but only actually
write it out at bar MB/sec.
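
To put rough numbers on that mismatch, here is a minimal sketch. It is
not taken from the patch; the function name and the rates are
hypothetical, and it ignores the kernel's dirty-page limits, which would
eventually block the writer. It only shows that the unwritten backlog
grows linearly whenever the dirty rate exceeds the write-out rate.

```python
def dirty_backlog_mb(dirty_rate, writeout_rate, seconds):
    """Return the dirty data (in MB) still waiting for write-out.

    dirty_rate and writeout_rate are in MB/sec. Both values are
    illustrative assumptions, not measurements from any real system.
    """
    # The backlog grows by the difference between the two rates and
    # can never be negative.
    return max(0.0, (dirty_rate - writeout_rate) * seconds)


# Hypothetical case: a task dirties 50 MB/sec while its throttled
# write-out is limited to 10 MB/sec.
for elapsed in (1, 10, 60):
    print(elapsed, "sec:", dirty_backlog_mb(50, 10, elapsed), "MB dirty")
```

With these assumed rates the backlog grows by 40 MB every second, so
limiting only the write-out side leaves the memory pressure in place.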

The noop-iosched changes are also very buggy. The queue back pointer
breaks reference counting and the task pointer storage assumes the task
will also always be around. That's of course not the case.

IOW, you are doing this at the wrong level.

What problem are you trying to solve?

-- 
Jens Axboe
