From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: KVM with hugepages generate huge load with two guests
Date: Fri, 1 Oct 2010 19:30:48 -0300
Message-ID: <20101001223048.GA31596@amt.cnet>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Dmitry Golubev
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:43162 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751268Ab0JAW4H (ORCPT ); Fri, 1 Oct 2010 18:56:07 -0400
Content-Disposition: inline
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, Sep 30, 2010 at 12:07:15PM +0300, Dmitry Golubev wrote:
> Hi,
>
> I am not sure what's really happening, but every few hours
> (unpredictable) two virtual machines (Linux 2.6.32) start to generate
> huge cpu loads. It looks like some kind of loop is unable to complete
> or something...
>
> So the idea is:
>
> 1. I have two linux 2.6.32 x64 (openvz, proxmox project) guests
> running on linux 2.6.35 x64 (ubuntu maverick) host with a Q6600
> Core2Quad on qemu-kvm 0.12.5 and libvirt 0.8.3 and one more small
> 32bit linux virtual machine (16MB of ram) with a router inside (I
> doubt it contributes to the problem).
>
> 2. All these machines use hugetlbfs. The server has 8GB of RAM, I
> reserved 3696 huge pages (page size is 2MB) on the server, and I am
> running the main guests each having 3550MB of virtual memory. The
> third guest, as I wrote before, takes 16MB of virtual memory.
>
> 3. Once run, the guests reserve huge pages for themselves normally. As
> mem-prealloc is default, they grab all the memory they should have,
> leaving 6 pages unreserved (HugePages_Free - HugePages_Rsvd = 6) at
> all times - so as I understand they should not want to get any more,
> right?
>
> 4. All virtual machines run perfectly normally without any disturbances
> for a few hours.
> They do not, however, use all their memory, so maybe
> the issue arises when they pass some kind of a threshold.
>
> 5. At some point in time both guests exhibit cpu load over the top
> (16-24). At the same time, the host works perfectly well, showing a
> load of 8 and that both kvm processes use CPU equally and fully. This
> point in time is unpredictable - it can be anything from one to twenty
> hours, but it will be less than a day. Sometimes the load disappears
> in a moment, but usually it stays like that, and everything works
> extremely slowly (even a 'ps' command takes some 2-5 minutes).
>
> 6. If I am patient, I can start rebooting the guest systems - once
> they have restarted, everything returns to normal. If I destroy one of
> the guests (virsh destroy), the other one starts working normally at
> once (!).
>
> I am relatively new to kvm and I am absolutely lost here. I have not
> experienced such problems before, but recently I upgraded from ubuntu
> lucid (I think it was linux 2.6.32, qemu-kvm 0.12.3 and libvirt 0.7.5)
> and started to use hugepages. These two virtual machines are not
> normally run on the same host system (I have a corosync/pacemaker
> cluster with drbd storage), but when one of the hosts is not
> available, they start running on the same host. That is the reason I
> have not noticed this earlier.
>
> Unfortunately, I don't have any spare hardware to experiment and this
> is a production system, so my debugging options are rather limited.
>
> Do you have any ideas what could be wrong?

Is there swapping activity on the host when this happens?
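
One way to check is to watch the pswpin and pswpout counters in
/proc/vmstat on the host while the guests are misbehaving (the si/so
columns of "vmstat 1" show the same thing). A minimal sketch follows;
the helper names parse_vmstat, swap_delta, read_vmstat and watch are
made up for illustration, and the 5 second interval is an arbitrary
assumption, not a recommendation.

```python
import time


def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Return (pages swapped in, pages swapped out) between two snapshots."""
    return (after["pswpin"] - before["pswpin"],
            after["pswpout"] - before["pswpout"])


def read_vmstat(path="/proc/vmstat"):
    """Take one snapshot of the host's counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


def watch(interval=5.0, samples=12):
    """Print the swap-in/swap-out page counts for each sampling interval."""
    prev = read_vmstat()
    for _ in range(samples):
        time.sleep(interval)
        cur = read_vmstat()
        swapped_in, swapped_out = swap_delta(prev, cur)
        print(f"{time.strftime('%H:%M:%S')} "
              f"pswpin +{swapped_in} pswpout +{swapped_out}")
        prev = cur
```

Calling watch() on the host during one of the load spikes and seeing
non-zero deltas would mean the host is swapping at that moment; deltas
that stay at zero would point away from swap as the cause.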