From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 30 Jun 2015 14:10:29 +0200
From: Andrew Jones
Message-ID: <20150630121029.GE3016@hawk.localdomain>
References: <55918593.6090703@huawei.com>
In-Reply-To: <55918593.6090703@huawei.com>
Subject: Re: [Qemu-devel] QEMU + KVM PSCI and VCPU creation / destruction
To: Claudio Fontana
Cc: Peter Maydell, "qemu-devel@nongnu.org"

On Mon, Jun 29, 2015 at 07:51:15PM +0200, Claudio Fontana wrote:
> Hello,
>
> While heavily testing PSCI on QEMU+KVM during OSv enablement, I encountered,
> among others, the following issue:
>
> I am running a test in which I boot an OS at EL1 under KVM, then boot a
> secondary VCPU, then immediately call PSCI for a SYSTEM_RESET (reboot).
>
> This loops infinitely, or, as a matter of fact, until I run out of memory
> in the Foundation Model.
>
> Now, before submitting another support request for the Model, I checked the
> code for the handling of PSCI. It turns out that KVM handles the HVC and then
> sets an exit reason for QEMU to check, which in turn sets
> system_reset_requested to true, which causes a qemu_system_reset.
>
> In there I see the calls to qemu_devices_reset() and
> cpu_synchronize_all_post_reset(), but is the VCPU actually destroyed?
> Is the VM destroyed?
> Or are new resources allocated at the next boot, whenever PSCI asks for
> another VCPU to be booted via KVM_CREATE_VCPU etc.?
>
> If the resources associated with the VCPU (and VM?) are not freed, isn't
> this always going to cause a leak in the host?
>
> After around 3 hours of continuous PSCI secondary boot followed by
> SYSTEM_RESET I run out of memory on the host.

I can reproduce this with kvm-unit-tests. I don't see much sign of a
leak using tcg with x86 for the host, nor with using kvm on an aarch64
host. However, using tcg on an aarch64 host shows memory getting used
pretty quickly. I didn't test with arm, only aarch64.

I would just provide the test, which is only

int main(void)
{
	assert(cpumask_weight(&cpu_present_mask) > 1);
	smp_boot_secondary(1, halt);
	psci_sys_reset();
	return 0;
}

but, in order to cleanly handle system reset, I ended up tweaking the
framework in several places. So I've created a branch for this, which
is here

https://github.com/rhdrjones/kvm-unit-tests/commits/arm/test-reset

On an aarch64 host, just do

./configure
make
export QEMU=
$QEMU -machine virt,accel=tcg \
	-device virtio-serial-device \
	-device virtconsole,chardev=ctd \
	-chardev testdev,id=ctd \
	-display none -serial stdio \
	-kernel arm/secondary-leak.flat \
	-smp 2

I've expanded the command line because

arm/run arm/secondary-leak.flat -smp 2

would use kvm by default on the aarch64 host, and we need tcg. (Maybe
I'll add an arm/run command-line switch for that someday.)

The test will never output anything and never exit. While it's running,
just check 'free' several times in another terminal to see that memory
is going away, and then hit ^C on the test whenever you want to quit.

drew