* FW: cgroup blkio.weight working, but not for KVM guests
       [not found] <022401cdac8d$32565fa0$97031ee0$@ncsu.edu>
@ 2012-10-22 13:36 ` Ben Clay
  2012-10-23 12:35   ` Stefan Hajnoczi
  0 siblings, 1 reply; 5+ messages in thread
From: Ben Clay @ 2012-10-22 13:36 UTC (permalink / raw)
  To: kvm

Forwarding this to the KVM general list. I doubt you folks can help me with
libvirt, but I was wondering if there's some way to verify if the cache=none
parameter is being respected for my KVM guest's disk image, or if there are
any other configuration/debug steps appropriate for KVM + virtio + cgroup.

Thanks.

Ben Clay
rbclay@ncsu.edu

From: Ben Clay [mailto:rbclay@ncsu.edu]
Sent: Wednesday, October 17, 2012 11:31 AM
To: libvirt-users@redhat.com
Subject: cgroup blkio.weight working, but not for KVM guests

I'm running libvirt 0.10.2 and qemu-kvm-1.2.0, both compiled from source, on
CentOS 6. I've got a working blkio cgroup hierarchy which I'm attaching
guests to using the following XML guest configs:

VM1 (foreground):

  <cputune>
    <shares>2048</shares>
  </cputune>
  <blkiotune>
    <weight>1000</weight>
  </blkiotune>

VM2 (background):

  <cputune>
    <shares>2</shares>
  </cputune>
  <blkiotune>
    <weight>100</weight>
  </blkiotune>

I've tested write throughput on the host using cgexec and dd, demonstrating
that libvirt has correctly set up the cgroups:

  cgexec -g blkio:libvirt/qemu/foreground time dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096 &
  cgexec -g blkio:libvirt/qemu/background time dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096 &

Snap from iotop, showing an 8:1 ratio (should be 10:1, but 8:1 is
acceptable):

  Total DISK READ: 0.00 B/s | Total DISK WRITE: 91.52 M/s
   TID  PRIO  USER  DISK READ  DISK WRITE  SWAPIN   IO>     COMMAND
  9602  be/4  root  0.00 B/s   10.71 M/s   0.00 %  98.54 %  dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096
  9601  be/4  root  0.00 B/s   80.81 M/s   0.00 %  97.76 %  dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096

Further, checking the task list inside each cgroup shows the guest's main
PID, plus those of the virtio kernel threads. It's hard to tell if all the
virtio kernel threads are listed, but all the ones I've hunted down appear
to be there.

However, when running the same dd commands inside the guests, I get
roughly-equal performance, nowhere near the ~8:1 relative bandwidth
enforcement I get from the host:

(background ctrl-c'd right after foreground finishes, both started within 1s
of each other)

  [ben@foreground ~]$ dd if=/dev/zero of=trash1.img oflag=direct bs=1M count=4096
  4096+0 records in
  4096+0 records out
  4294967296 bytes (4.3 GB) copied, 104.645 s, 41.0 MB/s

  [ben@background ~]$ dd if=/dev/zero of=trash2.img oflag=direct bs=1M count=4096
  ^C4052+0 records in
  4052+0 records out
  4248829952 bytes (4.2 GB) copied, 106.318 s, 40.0 MB/s

I thought based on this statement:

  Currently, the Block I/O subsystem does not work for buffered write
  operations. It is primarily targeted at direct I/O, although it works for
  buffered read operations.

from this page:

  https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Subsystems_and_Tunable_Parameters.html

that this problem might be due to host-side buffering, but I have that
explicitly disabled in my guest configs:

  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type="file" device="disk">
      <driver name="qemu" type="raw" cache="none"/>
      <source file="/path/to/disk.img"/>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x04" function="0x0"/>
    </disk>

Here is the qemu line from ps, showing that it's clearly being passed
through from the guest XML config:

  root 5110 20.8 4.3 4491352 349312 ? Sl 11:58 0:38 /usr/bin/qemu-kvm
  -name background -S -M pc-1.2 -enable-kvm -m 2048 -smp
  2,sockets=2,cores=1,threads=1 -uuid ea632741-c7be-36ab-bd69-da3cbe505b38
  -no-user-config -nodefaults -chardev
  socket,id=charmonitor,path=/var/lib/libvirt/qemu/background.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
  -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
  file=/path/to/disk.img,if=none,id=drive-virtio-disk0,format=raw,cache=none
  -device
  virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
  -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=22 -device
  virtio-net-pci,netdev=hostnet0,id=net0,mac=00:11:22:33:44:55,bus=pci.0,addr=0x3
  -chardev pty,id=charserial0 -device
  isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0
  -vnc 127.0.0.1:1 -vga cirrus -device
  virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

For fun I tried a few different cache options to try to force a bypass of
the host buffer cache, including writethrough and directsync, but the number
of virtio kernel threads appeared to explode (especially for directsync) and
the throughput dropped quite low: ~50% of none for writethrough and ~5% for
directsync.

With cache=none, when I generate write loads inside the VMs, I do see growth
in the host's buffer cache. Further, if I use non-direct I/O inside the VMs,
and inflate the balloon (forcing the guest's buffer cache to flush), I don't
see a corresponding drop in background throughput.

Is it possible that the cache="none" directive is not being respected? Since
cgroups is working for host-side processes I think my blkio subsystem is
correctly set up (using cfq, group_isolation=1 etc). Maybe I miscompiled
qemu, without some needed direct I/O support?

Has anyone seen this before?

Ben Clay
rbclay@ncsu.edu

^ permalink raw reply	[flat|nested] 5+ messages in thread
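As an aside, the task-list check mentioned in the message above can be made
more systematic by comparing the cgroup's tasks file against the full thread
list of the QEMU process and the guest's vhost kernel threads. This is only a
sketch: the mount point and the cgroup/guest name "foreground" are
illustrative and need to be adjusted to the host's actual layout.

  # adjust to the actual blkio mount point, e.g. /cgroup/blkio or /sys/fs/cgroup/blkio
  CG=/cgroup/blkio/libvirt/qemu/foreground

  # every task ID currently charged to this cgroup
  cat $CG/tasks

  # all threads of the qemu-kvm process plus any vhost kernel threads,
  # for comparison against the list above
  ps -eLf | grep -E 'qemu-kvm|vhost-' | grep -v grep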
* Re: FW: cgroup blkio.weight working, but not for KVM guests
  2012-10-22 13:36 ` FW: cgroup blkio.weight working, but not for KVM guests Ben Clay
@ 2012-10-23 12:35   ` Stefan Hajnoczi
  2012-10-23 22:48     ` Ben Clay
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2012-10-23 12:35 UTC (permalink / raw)
  To: Ben Clay; +Cc: kvm

On Mon, Oct 22, 2012 at 07:36:34AM -0600, Ben Clay wrote:
> Forwarding this to the KVM general list. I doubt you folks can help me with
> libvirt, but I was wondering if there's some way to verify if the cache=none
> parameter is being respected for my KVM guest's disk image, or if there are
> any other configuration/debug steps appropriate for KVM + virtio + cgroup.

Here's how you can double-check the O_DIRECT flag:

Find the QEMU process PID on the host:

  ps aux | grep qemu

Then find the file descriptor of the image file which the QEMU process has
open:

  ls -l /proc/$PID/fd

Finally look at the file descriptor flags to confirm it is O_DIRECT:

  grep ^flags: /proc/$PID/fdinfo/$FD

Note the flags field is in octal and you're looking for:

  #define O_DIRECT        00040000        /* direct disk access hint */

Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread
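For convenience, those steps can be rolled into a small shell sketch. It
assumes a single qemu-kvm process on the host and an image path containing
"disk.img"; adjust the pattern to the actual file name.

  PID=$(pgrep -f qemu-kvm | head -n 1)
  for FD in /proc/$PID/fd/*; do
      # print the flags only for the descriptor pointing at the image file
      if readlink "$FD" | grep -q disk.img; then
          grep ^flags: /proc/$PID/fdinfo/$(basename "$FD")
      fi
  done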
* RE: FW: cgroup blkio.weight working, but not for KVM guests
  2012-10-23 12:35   ` Stefan Hajnoczi
@ 2012-10-23 22:48     ` Ben Clay
  2012-10-24  6:10       ` Stefan Hajnoczi
  0 siblings, 1 reply; 5+ messages in thread
From: Ben Clay @ 2012-10-23 22:48 UTC (permalink / raw)
  To: 'Stefan Hajnoczi'; +Cc: kvm

Stefan-

Thanks for the hand-holding, it looks like the disk file is indeed open with
O_DIRECT:

  [root@host ~]# grep ^flags: /proc/$PID/fdinfo/$FD
  flags:  02140002

Since this is not an issue, I guess another source of problems could be that
all the virtio threads attached to this domain are not being placed within
the cgroup. I will look through libvirt to see if they're setting the
guest's process's cgroup classification as sticky (I can't imagine they
wouldn't be), but this raises another question: are virtio kernel threads
child processes of the guest's main process?

Are you aware of any other factor which I should be considering here?

I reran the dd tests inside the guest with iflag=fullblock set to make sure
the guest buffer cache wasn't messing with throughput values (based on this:
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=5929322ccb1f9d27c1b07b746d37419d17a7cbf6),
and got the same results listed earlier.

Thanks again!

Ben Clay
rbclay@ncsu.edu

-----Original Message-----
From: Stefan Hajnoczi [mailto:stefanha@gmail.com]
Sent: Tuesday, October 23, 2012 6:35 AM
To: Ben Clay
Cc: kvm@vger.kernel.org
Subject: Re: FW: cgroup blkio.weight working, but not for KVM guests

On Mon, Oct 22, 2012 at 07:36:34AM -0600, Ben Clay wrote:
> Forwarding this to the KVM general list. I doubt you folks can help
> me with libvirt, but I was wondering if there's some way to verify if
> the cache=none parameter is being respected for my KVM guest's disk
> image, or if there are any other configuration/debug steps appropriate
> for KVM + virtio + cgroup.

Here's how you can double-check the O_DIRECT flag:

Find the QEMU process PID on the host:

  ps aux | grep qemu

Then find the file descriptor of the image file which the QEMU process has
open:

  ls -l /proc/$PID/fd

Finally look at the file descriptor flags to confirm it is O_DIRECT:

  grep ^flags: /proc/$PID/fdinfo/$FD

Note the flags field is in octal and you're looking for:

  #define O_DIRECT        00040000        /* direct disk access hint */

Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread
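For reference, the O_DIRECT bit can be pulled straight out of the octal flags
value shown above with shell arithmetic (bash treats a number with a leading
0 as octal); a non-zero result means the flag is set:

  $ printf '%o\n' $(( 02140002 & 00040000 ))
  40000

So 02140002 does include the 00040000 O_DIRECT bit, which matches the
conclusion that the disk file is open with O_DIRECT.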
* Re: FW: cgroup blkio.weight working, but not for KVM guests
  2012-10-23 22:48     ` Ben Clay
@ 2012-10-24  6:10       ` Stefan Hajnoczi
  2012-10-25 17:13         ` Avi Kivity
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2012-10-24  6:10 UTC (permalink / raw)
  To: Ben Clay; +Cc: kvm

On Tue, Oct 23, 2012 at 04:48:13PM -0600, Ben Clay wrote:
> Since this is not an issue, I guess another source of problems could be
> that all the virtio threads attached to this domain are not being placed
> within the cgroup. I will look through libvirt to see if they're setting
> the guest's process's cgroup classification as sticky (I can't imagine
> they wouldn't be), but this raises another question: are virtio kernel
> threads child processes of the guest's main process?

Virtio kernel threads? Depending on the qemu-kvm -drive
...,aio=native|threads setting you should see either:

1. For aio=native QEMU uses the Linux AIO API. I think this results in
   kernel threads that process I/O on behalf of the userspace process.

2. For aio=threads QEMU uses its own userspace threadpool to call
   preadv(2)/pwritev(2). These threads are spawned from QEMU's "iothread"
   event loop.

I suggest you try switching between aio=native and aio=threads to check if
this causes the result you have been seeing.

> Are you aware of any other factor which I should be considering here?

No, but I haven't played with the cgroups blkio controller much.

Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread
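For what it's worth, the aio mode can be selected either directly on the
qemu-kvm command line or, if the libvirt build in use supports it, through
the io attribute of the disk's driver element. The snippets below are a
sketch based on the disk definition quoted earlier in the thread, not a
verified configuration:

  <!-- libvirt form: io="native" or io="threads" -->
  <driver name="qemu" type="raw" cache="none" io="native"/>

  # equivalent qemu-kvm -drive form
  -drive file=/path/to/disk.img,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native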
* Re: FW: cgroup blkio.weight working, but not for KVM guests
  2012-10-24  6:10       ` Stefan Hajnoczi
@ 2012-10-25 17:13         ` Avi Kivity
  0 siblings, 0 replies; 5+ messages in thread
From: Avi Kivity @ 2012-10-25 17:13 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Ben Clay, kvm

On 10/24/2012 08:10 AM, Stefan Hajnoczi wrote:
> 1. For aio=native QEMU uses the Linux AIO API. I think this results in
>    kernel threads that process I/O on behalf of the userspace process.

No, the request is submitted directly from io_submit(), and completion sets
the eventfd from irq context. Usually no threads are involved.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-10-25 17:13 UTC | newest]
Thread overview: 5+ messages
[not found] <022401cdac8d$32565fa0$97031ee0$@ncsu.edu>
2012-10-22 13:36 ` FW: cgroup blkio.weight working, but not for KVM guests Ben Clay
2012-10-23 12:35 ` Stefan Hajnoczi
2012-10-23 22:48 ` Ben Clay
2012-10-24 6:10 ` Stefan Hajnoczi
2012-10-25 17:13 ` Avi Kivity