From: "Satoshi UCHIDA" <s-uchida@ap.jp.nec.com>
To: "'Paul Menage'" <menage@google.com>,
	<linux-kernel@vger.kernel.org>,
	<containers@lists.linux-foundation.org>
Cc: <axboe@kernel.dk>, <tom-sugawara@ap.jp.nec.com>,
	<m-takahashi@ex.jp.nec.com>
Subject: [RFC][v2][patch 0/12][CFQ-cgroup]Yet another I/O bandwidth controlling subsystem for CGroups based on CFQ
Date: Thu, 3 Apr 2008 16:09:12 +0900
Message-ID: <005d01c89559$9e538200$dafa8600$@jp.nec.com>
In-Reply-To: <6599ad830804021541s3c1e3197y77d87f63bf47e4b3@mail.gmail.com>

This version of the patchset renames the subsystem (from "cfq_cgroup" to "cfq")
and changes a check in the create function.


This patchset introduces "Yet Another" I/O bandwidth controlling
subsystem for cgroups based on CFQ (called 2 layer CFQ).

The idea of 2 layer CFQ is to build per-group fairness control on top of
the existing CFQ control.
We add a new data structure, called CFQ meta-data, on top of
cfqd in order to control I/O bandwidth for cgroups.
The CFQ meta-data controls the cfq_datas by a service tree (rb-tree) and
by the CFQ algorithm when the I/O is synchronous.
An active cfqd controls its cfq queues by its own service tree.
In other words, the CFQ meta-data controls the traditional CFQ data,
and the CFQ data itself runs conventionally.

           cfqmd     cfqmd     (cfqmd = cfq meta-data)
            |          |
  cfqc  -- cfqd ----- cfqd     (cfqd = cfq data,
            |          |        cfqc = cfq cgroup data)
  cfqc  --[cfqd]----- cfqd
            ↑
     conventional control.
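
As a rough illustration of the two layers in the diagram above, here is a toy
model in shell. It is not taken from the patches: the input format, the
function name pick_next and the numeric keys are all hypothetical, and a
plain sort stands in for the rb-tree service trees. The upper layer picks the
group with the smallest key, then the conventional layer picks the queue with
the smallest key inside that group.

```shell
#!/bin/sh
# Toy model only (hypothetical names and input format, not code from the patches).
# Each input line on stdin: "<group> <group_key> <queue> <queue_key>".
# A lower key means "served first", loosely like the leftmost node of a service tree.
# Assumes every line of a given group carries the same group_key.
pick_next() {
    # Order by group key (ties broken by group name), then by queue key,
    # keep the first line, and print "<group> <queue>".
    sort -k2,2n -k1,1 -k4,4n | head -n 1 | awk '{ print $1, $3 }'
}
```

For example, feeding it two groups with two queues each returns the best
queue of the best group, regardless of how good the queues of the other group
look on their own.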


This patchset is against 2.6.25-rc2-mm1.


Last week, we found a patchset from Vasily Tarasov (OpenVZ) that
was posted to LKML.
   [RFC][PATCH 0/9] cgroups: block: cfq: I/O bandwidth controlling subsystem for CGroups based on CFQ
  http://lwn.net/Articles/274652/

Our subsystem and Vasily's are similar in that both modify
the CFQ subsystem, but they differ in the layer at which they are
implemented. Vasily's subsystem adds a new layer for cgroups between
cfqd and cfqq, whereas our subsystem adds a new layer for cgroups on top
of cfqd.

The differences between our implementation and OpenVZ's are:
   * the top layer algorithm is also based on a service tree, and
   * the top layer code is stored in a separate file (block/cfq-cgroup.c).

We hope to discuss here not which implementation is better, but what the
best way to implement I/O bandwidth control based on CFQ is.

Please give us your comments, questions and suggestions.



Finally, we describe how to use our implementation.

* Preparation for using 2 layer CFQ

 1. Apply this patchset to kernel 2.6.25-rc2-mm1.

 2. Build the kernel with the CFQ-CGROUP option.

 3. Reboot into the new kernel.

 4. Mount the cfq cgroup subsystem on a directory.
    ex.
      mkdir /dev/cgroup
      mount -t cgroup -o cfq cfq /dev/cgroup
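
To check that the mount in step 4 took effect, a small helper like the one
below could be used. It is only a sketch: the function name
cfq_cgroup_mounts is made up here, and it assumes that the subsystem name
"cfq" shows up in the options column of /proc/mounts for the mounted cgroup
hierarchy.

```shell
#!/bin/sh
# Sketch only: cfq_cgroup_mounts is a hypothetical helper name.
# Prints the mount point of every cgroup hierarchy whose option list contains "cfq".
# $1: a file in /proc/mounts format (defaults to /proc/mounts).
# Assumption: the subsystem name appears as a mount option in that file.
cfq_cgroup_mounts() {
    awk '$3 == "cgroup" && $4 ~ /(^|,)cfq(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
```

Running cfq_cgroup_mounts with no argument after step 4 would be expected
to print /dev/cgroup if that assumption holds.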


* Usage of grouping control.
 - Create New group
      Make a new directory under /dev/cgroup.
      For example, the following command creates a 'test1' group.
          mkdir /dev/cgroup/test1

 - Insert task to group
      Write the process ID (pid) to the "tasks" entry of the corresponding group.
      For example, the following command puts the task with pid 1100 into the test1 group.
         echo 1100 > /dev/cgroup/test1/tasks
      Child tasks of this task are also put into the test1 group.

 - Change I/O priority of group
     Write the priority to the "cfq.ioprio" entry of the corresponding group.
     For example, the following command sets the priority of the 'test1' group to 2.
         echo 2 > /dev/cgroup/test1/cfq.ioprio
     The I/O priority of a cgroup takes a value from 0 to 7, the same as
     the existing per-task CFQ priority.

     
 - Change I/O priority of task
     Use existing "ionice" command.
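
The group operations above can be wrapped in a few shell helpers. This is
only a convenience sketch: the function names and the CGROUP_ROOT variable
are made up for illustration, while the "tasks" and "cfq.ioprio" entries and
the 0 to 7 range are the ones described above.

```shell
#!/bin/sh
# Hypothetical helpers; CGROUP_ROOT and the function names are illustrative only.
CGROUP_ROOT="${CGROUP_ROOT:-/dev/cgroup}"

# Create a group: cg_create <group>
cg_create() {
    mkdir -p "$CGROUP_ROOT/$1"
}

# Put a task into a group: cg_add_task <group> <pid>
cg_add_task() {
    echo "$2" > "$CGROUP_ROOT/$1/tasks"
}

# Set the I/O priority of a group: cg_set_ioprio <group> <0..7>
# Values outside 0..7 are rejected before anything is written.
cg_set_ioprio() {
    case "$2" in
        [0-7]) echo "$2" > "$CGROUP_ROOT/$1/cfq.ioprio" ;;
        *) echo "ioprio must be 0..7, got: $2" >&2; return 1 ;;
    esac
}
```

With these, the examples above become "cg_create test1",
"cg_add_task test1 1100" and "cg_set_ioprio test1 2".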


* Example
 Two I/O loads (dd commands) are run under several conditions.
  
 - When they are in the same group and have the same priority,

   program
     #!/bin/sh
     echo $$ > /dev/cgroup/tasks
     echo $$ > /dev/cgroup/test/tasks
     ionice -c 2 -n 3 dd if=/internal/data1 of=/dev/null bs=1M count=1K &
     ionice -c 2 -n 3 dd if=/internal/data2 of=/dev/null bs=1M count=1K &
     echo $$ > /dev/cgroup/test2/tasks
     echo $$ > /dev/cgroup/tasks
    
   result
     1024+0 records in
     1024+0 records out
     1073741824 bytes (1.1 GB) copied, 27.7676 s, 38.7 MB/s
     1024+0 records in
     1024+0 records out
     1073741824 bytes (1.1 GB) copied, 28.8482 s, 37.2 MB/s

    The tasks were treated fairly, so they finished at about the same time.


 - When they are in the same group and have different priorities (0 and 7),

    program
      #!/bin/sh
      echo $$ > /dev/cgroup/tasks
      echo $$ > /dev/cgroup/test/tasks
      ionice -c 2 -n 0 dd if=/internal/data1 of=/dev/null bs=1M count=1K &
      ionice -c 2 -n 7 dd if=/internal/data2 of=/dev/null bs=1M count=1K &
      echo $$ > /dev/cgroup/test2/tasks
      echo $$ > /dev/cgroup/tasks

    result
      1024+0 records in
      1024+0 records out
      1073741824 bytes (1.1 GB) copied, 18.8373 s, 57.0 MB/s
      1024+0 records in
      1024+0 records out
      1073741824 bytes (1.1 GB) copied, 28.108 s, 38.2 MB/s


     The first task (copying data1) had the higher priority, so it finished sooner.
 
 - When they are in different groups and have different priorities (0 and 7),

    program
      #!/bin/sh
      echo $$ > /dev/cgroup/tasks
      echo $$ > /dev/cgroup/test/tasks
      ionice -c 2 -n 0 dd if=/internal/data1 of=/dev/null bs=1M count=1K &
      echo $$ > /dev/cgroup/test2/tasks
      ionice -c 2 -n 7 dd if=/internal/data2 of=/dev/null bs=1M count=1K &
      echo $$ > /dev/cgroup/tasks

    result
      1024+0 records in
      1024+0 records out
      1073741824 bytes (1.1 GB) copied, 28.1661 s, 38.1 MB/s
      1024+0 records in
      1024+0 records out
      1073741824 bytes (1.1 GB) copied, 28.8486 s, 37.2 MB/s

     The first task (copying data1) had the higher priority, but both finished at about the same time,
     because their groups had the same priority.

 - When they are in different groups with different group priorities (7 and 0)
   and have different task priorities (0 and 7),

    program
      #!/bin/sh
      echo $$ > /dev/cgroup/tasks
      echo 7 > /dev/cgroup/test/cfq.ioprio
      echo $$ > /dev/cgroup/test/tasks
      ionice -c 2 -n 0 dd if=/internal/data1 of=/dev/null bs=1M count=1K >& test1.log &
      echo 0 > /dev/cgroup/test2/cfq.ioprio
      echo $$ > /dev/cgroup/test2/tasks
      ionice -c 2 -n 7 dd if=/internal/data2 of=/dev/null bs=1M count=1K >& test2.log &
      echo $$ > /dev/cgroup/tasks

    result
      === test1.log ===
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB) copied, 27.3971 s, 39.2 MB/s
      === test2.log ===
        1024+0 records in
        1024+0 records out
        1073741824 bytes (1.1 GB) copied, 17.3837 s, 61.8 MB/s

     The first task (copying data1) had the higher task priority, but it finished later,
     because its group had the lower priority.
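
To compare such results without reading the logs by eye, the elapsed times
can be pulled out of the dd summary lines. The helpers below are only a
sketch with made-up names (dd_seconds, dd_ratio); they assume the dd summary
format shown in the results above ("... copied, <seconds> s, <rate>") and a
locale that prints a decimal point.

```shell
#!/bin/sh
# Sketch only: dd_seconds and dd_ratio are hypothetical helper names.

# Print the elapsed seconds from a dd summary read on stdin.
# Assumes the "... copied, <seconds> s, <rate>" line format.
dd_seconds() {
    awk -F', ' '/copied/ { split($2, a, " "); print a[1] }'
}

# Print the ratio of two elapsed times (first / second) with two decimals.
dd_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}
```

For the last example this would be used as
dd_ratio "$(dd_seconds < test1.log)" "$(dd_seconds < test2.log)".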


=====
 Satoshi UCHIDA
   NEC Corporation.


