From: Fengguang Wu <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>, Tejun Heo <tj@kernel.org>,
Jens Axboe <axboe@kernel.dk>,
linux-mm@kvack.org, sjayaraman@suse.com, andrea@betterlinux.com,
jmoyer@redhat.com, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
lizefan@huawei.com, containers@lists.linux-foundation.org,
cgroups@vger.kernel.org, ctalbott@google.com, rni@google.com,
lsf@lists.linux-foundation.org
Subject: Re: [RFC] writeback and cgroup
Date: Wed, 25 Apr 2012 20:05:02 +0800
Message-ID: <20120425120502.GA18819@localhost>
In-Reply-To: <20120425090156.GB12568@quack.suse.cz>
[-- Attachment #1: Type: text/plain, Size: 4319 bytes --]
> > So the cfq behavior is pretty non-deterministic. I more or less realized
> > this from the experiments. For example, when starting 2+ "dd oflag=direct"
> > tasks in a single cgroup, they _sometimes_ progress at different rates.
> > See the attached graphs for two such examples on XFS. ext4 is fine.
> >
> > The 2-dd test case is:
> >
> > mkdir /cgroup/dd
> > echo $$ > /cgroup/dd/tasks
> >
> > dd if=/dev/zero of=/fs/zero1 bs=1M oflag=direct &
> > dd if=/dev/zero of=/fs/zero2 bs=1M oflag=direct &
> >
> > The 6-dd test case is similar.
> Hum, interesting. I would not expect that. Maybe it's because the files are
> allocated in different areas of the disk. But even then the difference
> should not be *that* big.
Agreed.
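One way to check that hypothesis would be to dump the extent layout of the two
files after such a run (just a sketch; the paths are the ones from the 2-dd
test case above):

# physical block ranges (and allocation groups) on XFS
xfs_bmap -v /fs/zero1 /fs/zero2

# generic extent dump, also works on ext4
filefrag -v /fs/zero1 /fs/zero2

If the two files end up in very distant regions of the disk, the extra seeking
could account for part of the gap, though probably not all of it.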
> > > > Look at this graph: the 4 dd tasks are granted the same weight (2 of
> > > > them are buffered writes). I guess the 2 buffered dd tasks managed to
> > > > progress much faster than the 2 direct dd tasks just because the async
> > > > IOs are much more efficient than the bs=64k direct IOs.
> > > Likely because 64k is too small to get good bandwidth with direct IO. If
> > > it were 4M, I believe you would get similar throughput for buffered and
> > > direct IO. So essentially you are right: small IOs benefit from caching
> > > effects, since they allow larger requests to be submitted to the device,
> > > which is more efficient.
> >
> > I didn't directly compare the effects; however, here is an example of
> > doing 1M, 64k and 4k direct writes in parallel. It _seems_ bs=1M has only
> > marginal benefits over 64k, assuming cfq is behaving well.
> >
> > https://github.com/fengguang/io-controller-tests/raw/master/log/snb/ext4/direct-write-1M-64k-4k.2012-04-19-10-50/balance_dirty_pages-task-bw.png
> >
> > The test case is:
> >
> > # cgroup 1
> > echo 500 > /cgroup/cp/blkio.weight
> >
> > dd if=/dev/zero of=/fs/zero-1M bs=1M oflag=direct &
> >
> > # cgroup 2
> > echo 1000 > /cgroup/dd/blkio.weight
> >
> > dd if=/dev/zero of=/fs/zero-64k bs=64k oflag=direct &
> > dd if=/dev/zero of=/fs/zero-4k bs=4k oflag=direct &
> Um, I'm not completely sure what you were trying to test in the above test case.
Yeah, it's not a good test case. I've changed it to run the 3 dd tasks in
3 cgroups with equal weights; a sketch of the revised commands is below.
The new results are attached (they look much the same as the original ones).
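Roughly, the revised test case is (the cgroup names and the equal weight value
of 500 are only illustrative; each dd inherits the cgroup of the shell at fork
time):

mkdir /cgroup/dd-1M /cgroup/dd-64k /cgroup/dd-4k

echo 500 > /cgroup/dd-1M/blkio.weight
echo 500 > /cgroup/dd-64k/blkio.weight
echo 500 > /cgroup/dd-4k/blkio.weight

# move the shell into each cgroup right before forking the corresponding dd
echo $$ > /cgroup/dd-1M/tasks
dd if=/dev/zero of=/fs/zero-1M bs=1M oflag=direct &

echo $$ > /cgroup/dd-64k/tasks
dd if=/dev/zero of=/fs/zero-64k bs=64k oflag=direct &

echo $$ > /cgroup/dd-4k/tasks
dd if=/dev/zero of=/fs/zero-4k bs=4k oflag=direct &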
> What I wanted to point out is that direct IO is not necessarily less
> efficient than buffered IO. Look:
> xen-node0:~ # uname -a
> Linux xen-node0 3.3.0-rc4-xen+ #6 SMP PREEMPT Tue Apr 17 06:48:08 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=1M count=1024 conv=fsync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.5304 s, 102 MB/s
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=1M count=1024 oflag=direct conv=fsync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.3678 s, 104 MB/s
>
> So both direct and buffered IO are about the same. Note that I used the
> conv=fsync flag to remove the advantage that part of a buffered write is
> still sitting in the cache when dd finishes, which would be unfair to the
> direct writer...
OK, I also find direct writes to be a bit faster than buffered writes:
root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=1M count=1024 conv=fsync
1073741824 bytes (1.1 GB) copied, 10.4039 s, 103 MB/s
1073741824 bytes (1.1 GB) copied, 10.4143 s, 103 MB/s
root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=1M count=1024 oflag=direct conv=fsync
1073741824 bytes (1.1 GB) copied, 9.9006 s, 108 MB/s
1073741824 bytes (1.1 GB) copied, 9.55173 s, 112 MB/s
root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=64k count=16384 oflag=direct conv=fsync
1073741824 bytes (1.1 GB) copied, 9.83902 s, 109 MB/s
1073741824 bytes (1.1 GB) copied, 9.61725 s, 112 MB/s
> And actually 64k vs 1M makes a big difference on my machine:
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=64k count=16384 oflag=direct conv=fsync
> 16384+0 records in
> 16384+0 records out
> 1073741824 bytes (1.1 GB) copied, 19.3176 s, 55.6 MB/s
Interestingly, my 64k direct writes are about as fast as the 1M direct writes...
and 4k writes run at roughly 1/4 of that speed:
root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=4k count=$((256<<10)) oflag=direct conv=fsync
1073741824 bytes (1.1 GB) copied, 42.0726 s, 25.5 MB/s
Thanks,
Fengguang
[-- Attachment #2: balance_dirty_pages-task-bw.png --]
[-- Type: image/png, Size: 61279 bytes --]