From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ben Clay"
Subject: FW: cgroup blkio.weight working, but not for KVM guests
Date: Mon, 22 Oct 2012 07:36:34 -0600
Message-ID: <014201cdb05a$414e7b20$c3eb7160$@ncsu.edu>
References: <022401cdac8d$32565fa0$97031ee0$@ncsu.edu>
In-Reply-To: <022401cdac8d$32565fa0$97031ee0$@ncsu.edu>
Sender: kvm-owner@vger.kernel.org

Forwarding this to the KVM general list. I doubt you folks can help me with libvirt, but I was wondering if there's some way to verify if the cache=none parameter is being respected for my KVM guest's disk image, or if there are any other configuration/debug steps appropriate for KVM + virtio + cgroup.

Thanks.

Ben Clay
rbclay@ncsu.edu

From: Ben Clay [mailto:rbclay@ncsu.edu]
Sent: Wednesday, October 17, 2012 11:31 AM
To: libvirt-users@redhat.com
Subject: cgroup blkio.weight working, but not for KVM guests

I'm running libvirt 0.10.2 and qemu-kvm-1.2.0, both compiled from source, on CentOS 6. I've got a working blkio cgroup hierarchy which I'm attaching guests to using the following XML guest configs:

[XML element tags were stripped from the archived message; only the values remain]

VM1 (foreground):
    2048
    1000

VM2 (background):
    2
    100

I've tested write throughput on the host using cgexec and dd, demonstrating that libvirt has correctly set up the cgroups:

cgexec -g blkio:libvirt/qemu/foreground time dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096 &
cgexec -g blkio:libvirt/qemu/background time dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096 &

Snap from iotop, showing an 8:1 ratio (should be 10:1, but 8:1 is acceptable):

Total DISK READ: 0.00 B/s | Total DISK WRITE: 91.52 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 9602 be/4 root        0.00 B/s   10.71 M/s  0.00 % 98.54 % dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096
 9601 be/4 root        0.00 B/s   80.81 M/s  0.00 % 97.76 % dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096

Further, checking the task list inside each cgroup shows the guest's main PID, plus those of the virtio kernel threads. It's hard to tell if all the virtio kernel threads are listed, but all the ones I've hunted down appear to be there.
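
The task-list check described above can be partly automated. The following is a minimal sketch, not a complete answer: it only covers the threads that belong to the QEMU process itself (those listed under /proc/PID/task), not separate kernel threads that have their own PIDs. The function name missing_tasks is made up for illustration, and it takes the /proc directory and the cgroup tasks file as arguments so that no mount point is hard-coded.

```shell
#!/bin/sh
# missing_tasks PROC_PID_DIR TASKS_FILE
# Print every thread ID found under PROC_PID_DIR/task that is NOT listed
# in TASKS_FILE (a cgroup "tasks" file, one ID per line).
# Hypothetical helper for illustration; prints nothing if all threads are present.
missing_tasks() {
    proc_pid_dir=$1
    tasks_file=$2
    for t in "$proc_pid_dir"/task/*; do
        [ -e "$t" ] || continue          # glob matched nothing
        tid=${t##*/}                     # strip the directory part
        grep -qx "$tid" "$tasks_file" || echo "$tid"
    done
}
```

It might be called as: missing_tasks /proc/$QEMU_PID /cgroup/blkio/libvirt/qemu/background/tasks, where QEMU_PID is the guest's main PID and /cgroup/blkio is an assumed mount point for the blkio hierarchy (adjust it to wherever the controller is actually mounted). Empty output would mean every QEMU thread is in the cgroup.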

However, when running the same dd commands inside the guests, I get roughly equal performance, nowhere near the ~8:1 relative bandwidth enforcement I get from the host (background ctrl-c'd right after foreground finishes; both started within 1s of each other):

[ben@foreground ~]$ dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 104.645 s, 41.0 MB/s

[ben@background ~]$ dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096
^C4052+0 records in
4052+0 records out
4248829952 bytes (4.2 GB) copied, 106.318 s, 40.0 MB/s

I thought, based on this statement:

"Currently, the Block I/O subsystem does not work for buffered write operations. It is primarily targeted at direct I/O, although it works for buffered read operations."

from this page:

https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Subsystems_and_Tunable_Parameters.html

that this problem might be due to host-side buffering, but I have that explicitly disabled in my guest configs:

[guest XML; element tags were stripped from the archived message, only the emulator path survives]
    /usr/bin/qemu-kvm

Here is the qemu line from ps, showing that cache=none is clearly being passed through from the guest XML config:

root      5110 20.8  4.3 4491352 349312 ?      Sl   11:58   0:38 /usr/bin/qemu-kvm -name background -S -M pc-1.2 -enable-kvm -m 2048
    -smp 2,sockets=2,cores=1,threads=1 -uuid ea632741-c7be-36ab-bd69-da3cbe505b38
    -no-user-config -nodefaults
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/background.monitor,server,nowait
    -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
    -drive file=/path/to/disk.img,if=none,id=drive-virtio-disk0,format=raw,cache=none
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
    -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=22
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:11:22:33:44:55,bus=pci.0,addr=0x3
    -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
    -device usb-tablet,id=input0 -vnc 127.0.0.1:1 -vga cirrus
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

For fun I tried a few different cache options to try to force a bypass of the host buffer cache, including writethrough and directsync, but the number of virtio kernel threads appeared to explode (especially for directsync) and the throughput dropped quite low: ~50% of "none" for writethrough and ~5% for directsync.
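
When switching between cache modes like this, it can help to read the mode back from the running process rather than trusting the XML. The sketch below is one possible way to do that: it scans a QEMU argument list (one argument per line) and prints the cache= value of every -drive option. The function name drive_cache is invented for this example, and "unset" is just a marker printed when a -drive has no cache= key; it says nothing about which default QEMU then applies.

```shell
#!/bin/sh
# drive_cache: read a QEMU argument list on stdin, ONE ARGUMENT PER LINE,
# and print the cache= value of each -drive option, one per line.
# Prints "unset" for a -drive that carries no cache= key.
# Hypothetical helper for illustration only.
drive_cache() {
    awk '
        prev == "-drive" {
            mode = "unset"
            n = split($0, kv, ",")           # -drive value is key=value,key=value,...
            for (i = 1; i <= n; i++)
                if (kv[i] ~ /^cache=/)
                    mode = substr(kv[i], 7)  # text after "cache="
            print mode
        }
        { prev = $0 }
    '
}

# Example against a live process (arguments in /proc/PID/cmdline are NUL-separated):
#   tr '\0' '\n' < /proc/5110/cmdline | drive_cache
```

For the ps line above this should report none for the single virtio disk. That only confirms what QEMU was asked to do, not whether the host kernel is actually doing direct I/O on the image.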

With cache=none, when I generate write loads inside the VMs, I do see growth in the host's buffer cache. Further, if I use non-direct I/O inside the VMs, and inflate the balloon (forcing the guest's buffer cache to flush), I don't see a corresponding drop in background throughput. Is it possible that the cache="none" directive is not being respected?

Since cgroups is working for host-side processes I think my blkio subsystem is correctly set up (using cfq, group_isolation=1 etc). Maybe I miscompiled qemu, without some needed direct I/O support? Has anyone seen this before?

Ben Clay
rbclay@ncsu.edu
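
One way to test whether cache=none is taking effect on the host side is to look at the open flags of the file descriptor QEMU holds on the disk image, which the kernel exposes in /proc/PID/fdinfo/N. The following is a minimal sketch under stated assumptions: the helper names are made up, and the O_DIRECT bit is taken to be octal 040000, which is its value on x86 and x86_64 Linux (other architectures differ).

```shell
#!/bin/sh
# fd_flags FDINFO_FILE: print the octal "flags:" value from a
# /proc/PID/fdinfo/N file.
fd_flags() {
    awk '$1 == "flags:" { print $2 }' "$1"
}

# has_o_direct FLAGS: succeed if the octal FLAGS value has O_DIRECT set.
# ASSUMPTION: O_DIRECT is octal 040000 (x86 / x86_64 Linux only).
has_o_direct() {
    [ $(( 0$1 & 040000 )) -ne 0 ]
}

# report_image_fds PROC_PID_DIR IMAGE
# For every descriptor under PROC_PID_DIR/fd that points at IMAGE, report
# whether it was opened with O_DIRECT. Hypothetical helper for illustration.
report_image_fds() {
    proc_pid_dir=$1
    image=$2
    for fd in "$proc_pid_dir"/fd/*; do
        [ "$(readlink "$fd" 2>/dev/null)" = "$image" ] || continue
        n=${fd##*/}
        if has_o_direct "$(fd_flags "$proc_pid_dir/fdinfo/$n")"; then
            echo "fd $n: O_DIRECT"
        else
            echo "fd $n: buffered"
        fi
    done
}

# Example, run as root on the host, using the PID and image path from the ps line:
#   report_image_fds /proc/5110 /path/to/disk.img
```

A "buffered" result for the image would suggest the host page cache is in the write path despite the cache=none setting; an "O_DIRECT" result would point the investigation elsewhere, for example at which threads actually submit the I/O and whether they are in the cgroup.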