From: Vivek Goyal <vgoyal@redhat.com>
To: linux-kernel@vger.kernel.org, axboe@kernel.dk
Cc: nauman@google.com, dpshah@google.com, guijianfeng@cn.fujitsu.com,
	vgoyal@redhat.com
Subject: [RFC PATCH] Block device bio throttling support [V3]
Date: Wed, 15 Sep 2010 17:06:31 -0400
Message-ID: <1284584798-10264-1-git-send-email-vgoyal@redhat.com>


Hi,

This is V3 of the bio throttling patches. The changes since V2 are as follows.

- Added support for throttling in terms of IOPS (READ/WRITE). If one
  specifies both bandwidth and IOPS rules on a device, then IO is
  subject to both rules.

- Fixed a few bugs.

- Did some cleanups in blk-cgroup code.

Previous versions of the patches are available here.

[V2] http://lkml.org/lkml/2010/9/7/386
[V1] http://lkml.org/lkml/2010/9/1/251

Overview
========
Currently CFQ provides weight-based proportional division of bandwidth.
People have also been looking at extending the block IO controller to
provide throttling/max bandwidth control.

I have started to write support for throttling in the block layer, on
the request queue, so that it can be used both for higher level logical
devices and for leaf nodes. This patch is still a work in progress, but
I wanted to post it for early feedback.

Currently I have hooked into the __generic_make_request() function to
check which cgroup a bio belongs to and whether that cgroup is exceeding
its specified BW rate. If not, the thread can continue to dispatch the
bio as is; otherwise the bio is queued internally and dispatched later
with the help of a worker thread.

One can do bio throttling in terms of bandwidth (bytes per second), in
terms of IO per second, or both. Both BW and IOPS rules can be put on
either the READ or the WRITE flow.
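
To illustrate how the two kinds of rules combine, here is a rough model
of the per-bio decision written as a shell function. It is only a sketch
and not the code in blk-throttle.c: the function name, its arguments and
the "budget = rate * elapsed time" accounting are assumptions made for
illustration, and a limit of 0 is taken to mean "no rule configured".

```shell
# Simplified, hypothetical model of the throttle check; NOT kernel code.
#
# within_limits BYTES_DONE IOS_DONE ELAPSED_MS BIO_BYTES BPS_LIMIT IOPS_LIMIT
# Returns 0 if the bio fits under every configured rule (dispatch now),
# 1 if it exceeds any of them (queue it for later dispatch).
within_limits() {
    bytes_done=$1 ios_done=$2 elapsed_ms=$3 bio_bytes=$4 bps=$5 iops=$6

    # Bandwidth rule: bytes allowed so far is rate * elapsed time.
    if [ "$bps" -gt 0 ]; then
        byte_budget=$(( bps * elapsed_ms / 1000 ))
        [ $(( bytes_done + bio_bytes )) -le "$byte_budget" ] || return 1
    fi

    # IOPS rule: same idea, counting bios instead of bytes.
    if [ "$iops" -gt 0 ]; then
        io_budget=$(( iops * elapsed_ms / 1000 ))
        [ $(( ios_done + 1 )) -le "$io_budget" ] || return 1
    fi

    return 0
}
```

In this model a 4K bio that is well within a 1MB/s rule is still queued
if the group has already used up its IOPS budget, which is what "subject
to both rules" means here.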

The throttling logic is independent of the IO scheduler and hence can be
used with any IO scheduler. It can also be activated and used on any block
device/request queue in the stack.

HOWTO
=====
- Make sure CONFIG_BLK_CGROUP=y and CONFIG_BLK_DEV_THROTTLING=y.

- Mount blkio controller
        mount -t cgroup -o blkio none /cgroup/blkio

- Specify a bandwidth rate on a particular device for the root group. The
  format for the policy is "<major>:<minor>  <bytes_per_second>".

        echo "8:16  1048576" > /cgroup/blkio/blkio.throttle.read_bps_device

  The above puts a limit of 1MB/second on reads for the root group on
  the device with major/minor number 8:16.

- Run dd to read a file and see whether the rate is throttled to 1MB/s.

        # dd if=/mnt/common/zerofile of=/dev/null bs=4K count=1024 iflag=direct
        1024+0 records in
        1024+0 records out
        4194304 bytes (4.2 MB) copied, 4.0001 s, 1.0 MB/s
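
As a quick sanity check on the dd numbers, the expected run time is just
the transfer size divided by the configured limit. The helper below is
hypothetical (not part of the patches) and uses integer division, so it
reports whole seconds rounded down.

```shell
# expected_seconds BYTES BYTES_PER_SECOND
# Hypothetical helper: whole seconds a transfer should take at a given limit.
expected_seconds() {
    echo $(( $1 / $2 ))
}

# The dd run in the HOWTO: bs=4K count=1024 against a 1048576 bytes/s limit.
total_bytes=$(( 4096 * 1024 ))          # 4194304 bytes
expected_seconds "$total_bytes" 1048576 # prints 4
```

That matches the roughly 4 seconds dd reported for 4194304 bytes.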

Note:
-----
- Limits for writes can be set using the blkio.throttle.write_bps_device file.
- Limits for IOPS rules can be set using the following files.

	blkio.throttle.read_iops_device
	blkio.throttle.write_iops_device

  For more info, refer to Documentation/cgroups/blkio-controller.txt
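
For example, the other rule files can be written the same way as the read
bandwidth rule in the HOWTO. The set_throttle helper below is hypothetical
and only wraps the echo. It assumes the IOPS files take the same
"<major>:<minor> <value>" format, with the value in IO per second; the
limit values themselves are arbitrary.

```shell
# Hypothetical helper, not part of the patches: write one throttle rule.
# set_throttle CGROUP_DIR FILE_SUFFIX MAJOR:MINOR VALUE
set_throttle() {
    echo "$3 $4" > "$1/blkio.throttle.$2"
}

# Example (as root, with blkio mounted at /cgroup/blkio as in the HOWTO):
#   set_throttle /cgroup/blkio write_bps_device  8:16 1048576  # 1MB/s writes
#   set_throttle /cgroup/blkio read_iops_device  8:16 100      # 100 read IO/s
#   set_throttle /cgroup/blkio write_iops_device 8:16 100      # 100 write IO/s
```

Taking the cgroup directory as an argument means the same helper can be
pointed at a child group's directory instead of the root group.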

Open Issues
===========
- Do we need to provide additional queue congestion semantics? We are
  throttling and queuing bios at the request queue, and we probably don't
  want a user space application to consume all the memory by allocating
  bios and bombarding the request queue with them.

TODO
====
- Testing, bug fixes.

Any feedback is welcome.

Overall diffstat.

 Documentation/cgroups/blkio-controller.txt |  106 +++-
 block/Kconfig                              |   12 +
 block/Makefile                             |    1 +
 block/blk-cgroup.c                         |  786 ++++++++++++++++++-----
 block/blk-cgroup.h                         |   79 +++-
 block/blk-core.c                           |   24 +
 block/blk-throttle.c                       |  999 ++++++++++++++++++++++++++++
 block/cfq-iosched.c                        |    1 +
 block/cfq.h                                |    2 +-
 include/linux/blk_types.h                  |    3 +
 include/linux/blkdev.h                     |   24 +
 init/Kconfig                               |    9 +-
 12 files changed, 1881 insertions(+), 165 deletions(-)

Thanks
Vivek

Thread overview: 11+ messages
2010-09-15 21:06 Vivek Goyal [this message]
2010-09-15 21:06 ` [PATCH 1/7] blk-cgroup: Kill the header printed at the start of blkio.weight_device file Vivek Goyal
2010-09-15 21:06 ` [PATCH 2/7] blk-cgroup: Prepare the base for supporting more than one IO control policies Vivek Goyal
2010-09-15 21:06 ` [PATCH 3/7] blk-cgroup: Introduce cgroup changes for throttling policy Vivek Goyal
2010-09-15 21:06 ` [PATCH 4/7] blkio: Core implementation of throttle policy Vivek Goyal
2010-09-15 21:06 ` [PATCH 5/7] blk-cgroup: cgroup changes for IOPS limit support Vivek Goyal
2010-09-15 21:06 ` [PATCH 6/7] blkio: Implementation of IOPS limit logic Vivek Goyal
2010-09-15 21:06 ` [PATCH 7/7] blkio: Documentation Update Vivek Goyal
2010-09-16  6:48 ` [RFC PATCH] Block device bio throttling support [V3] Jens Axboe
2010-09-16 15:39   ` Vivek Goyal
2010-09-16  7:10 ` Divyesh Shah
