From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mark Rutland Subject: Re: kvm/arm64: use-after-free in kvm_unmap_hva_handler/unmap_stage2_pmds Date: Tue, 18 Apr 2017 10:08:46 +0100 Message-ID: <20170418090845.GC17866@leverpostej> References: <20f6c994-d83e-7a6f-9f13-f10287211a6c@arm.com> <9f473bb9-d0eb-6803-1263-75ffef0301fe@redhat.com> <1050c9d8-5813-5df9-29e5-3ab6e61b5de6@arm.com> <88715300-ef58-e7bd-81f5-95e0b9c9c533@arm.com> <20170413155045.GA8387@e107814-lin.cambridge.arm.com> <20170418083230.GA17866@leverpostej> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 7975C40BBB for ; Tue, 18 Apr 2017 05:06:38 -0400 (EDT) Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 5+VV9Em6EGiS for ; Tue, 18 Apr 2017 05:06:36 -0400 (EDT) Received: from foss.arm.com (foss.arm.com [217.140.101.70]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 1040C40190 for ; Tue, 18 Apr 2017 05:06:35 -0400 (EDT) Content-Disposition: inline In-Reply-To: <20170418083230.GA17866@leverpostej> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu To: "Suzuki K. Poulose" Cc: kvm@vger.kernel.org, Marc Zyngier , Andrey Konovalov , will.deacon@arm.com, linux-kernel@vger.kernel.org, kcc@google.com, syzkaller@googlegroups.com, linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com, Paolo Bonzini , kvmarm@lists.cs.columbia.edu, dvyukov@google.com List-Id: kvmarm@lists.cs.columbia.edu T24gVHVlLCBBcHIgMTgsIDIwMTcgYXQgMDk6MzI6MzFBTSArMDEwMCwgTWFyayBSdXRsYW5kIHdy b3RlOgo+IEhpIFN1enVraSwKPiAKPiBPbiBUaHUsIEFwciAxMywgMjAxNyBhdCAwNDo1MDo0NlBN ICswMTAwLCBTdXp1a2kgSy4gUG91bG9zZSB3cm90ZToKPiA+IGt2bTogSG9sZCByZWZlcmVuY2Ug dG8gdGhlIHVzZXIgYWRkcmVzcyBzcGFjZQo+ID4gCj4gPiBUaGUgY29yZSBLVk0gY29kZSwgdXNl cyBtbWdyYWIvbW1kcm9wIHRvIHBpbiB0aGUgbW0gc3RydWN0IG9mIHRoZSB1c2VyCj4gPiBhcHBs aWNhdGlvbi4gbW1ncmFiIG9ubHkgZ3VhcmFudGVlcyB0aGF0IHRoZSBtbSBzdHJ1Y3QgaXMgYXZh aWxhYmxlLAo+ID4gd2hpbGUgdGhlICJyZWFsIGFkZHJlc3Mgc3BhY2UiIChzZWUgRG9jdW1lbnRh dGlvbi92bS9hY3RpdmVfbW0udHh0KSBtYXkKPiA+IGJlIGRlc3Ryb3llZC4gU2luY2UgdGhlIEtW TSBkZXBlbmRzIG9uIHRoZSB1c2VyIHNwYWNlIHBhZ2UgdGFibGVzIGZvcgo+ID4gdGhlIEd1ZXN0 IHBhZ2VzLCB3ZSBzaG91bGQgaW5zdGVhZCBkbyBhbiBtbWdldC9tbXB1dC4gRXZlbiB0aG91Z2gK PiA+IG1tZ2V0L21tcHV0IGlzIG5vdCBlbmNvdXJhZ2VkIGZvciB1c2VzIHdpdGggdW5ib3VuZGVk IHRpbWUsIHRoZSBLVk0KPiA+IGlzIGZpbmUgdG8gZG8gc28sIGFzIHdlIGFyZSBkb2luZyBpdCBm cm9tIHRoZSBjb250ZXh0IG9mIHRoZSBzYW1lIHByb2Nlc3MuCj4gPiAKPiA+IFRoaXMgYWxzbyBw cmV2ZW50cyB0aGUgcmFjZSBjb25kaXRpb24gd2hlcmUgbW11X25vdGlmaWVyX3JlbGVhc2UoKSBj b3VsZAo+ID4gYmUgY2FsbGVkIGluIHBhcmFsbGVsIGFuZCBvbmUgaW5zdGFuY2UgY291bGQgZW5k IHVwIHVzaW5nIGEgZnJlZSdkIGt2bQo+ID4gaW5zdGFuY2UuCj4gPiAKPiA+IENjOiBNYXJrIFJ1 dGxhbmQgPG1hcmsucnV0bGFuZEBhcm0uY29tPgo+ID4gQ2M6IFBhb2xvIEJvbnppbiA8cGJvbnpp bmlAcmVkaGF0LmNvbT4KPiA+IENjOiBSYWRpbSBLcsSNbcOhxZkgPHJrcmNtYXJAcmVkaGF0LmNv bT4KPiA+IENjOiBNYXJjIFp5bmdpZXIgPG1hcmMuenluZ2llckBhcm0uY29tPgo+ID4gQ2M6IENo cmlzdG9mZmVyIERhbGwgPGNocmlzdG9mZmVyLmRhbGxAbGluYXJvLm9yZz4KPiA+IENjOiBhbmRy ZXlrbnZsQGdvb2dsZS5jb20KPiA+IFNpZ25lZC1vZmYtYnk6IFN1enVraSBLIFBvdWxvc2UgPHN1 enVraS5wb3Vsb3NlQGFybS5jb20+Cj4gPiAtLS0KPiA+ICB2aXJ0L2t2bS9rdm1fbWFpbi5jIHwg NiArKystLS0KPiA+ICAxIGZpbGUgY2hhbmdlZCwgMyBpbnNlcnRpb25zKCspLCAzIGRlbGV0aW9u cygtKQo+ID4gCj4gPiBkaWZmIC0tZ2l0IGEvdmlydC9rdm0va3ZtX21haW4uYyBiL3ZpcnQva3Zt L2t2bV9tYWluLmMKPiA+IGluZGV4IDg4MjU3YjMuLjU1NTcxMmUgMTAwNjQ0Cj4gPiAtLS0gYS92 aXJ0L2t2bS9rdm1fbWFpbi5jCj4gPiArKysgYi92aXJ0L2t2bS9rdm1fbWFpbi5jCj4gPiBAQCAt NjEzLDcgKzYxMyw3IEBAIHN0YXRpYyBzdHJ1Y3Qga3ZtICprdm1fY3JlYXRlX3ZtKHVuc2lnbmVk IGxvbmcgdHlwZSkKPiA+ICAJCXJldHVybiBFUlJfUFRSKC1FTk9NRU0pOwo+ID4gIAo+ID4gIAlz cGluX2xvY2tfaW5pdCgma3ZtLT5tbXVfbG9jayk7Cj4gPiAtCW1tZ3JhYihjdXJyZW50LT5tbSk7 Cj4gPiArCW1tZ2V0KGN1cnJlbnQtPm1tKTsKPiA+ICAJa3ZtLT5tbSA9IGN1cnJlbnQtPm1tOwo+ ID4gIAlrdm1fZXZlbnRmZF9pbml0KGt2bSk7Cj4gPiAgCW11dGV4X2luaXQoJmt2bS0+bG9jayk7 Cj4gPiBAQCAtNjg1LDcgKzY4NSw3IEBAIHN0YXRpYyBzdHJ1Y3Qga3ZtICprdm1fY3JlYXRlX3Zt KHVuc2lnbmVkIGxvbmcgdHlwZSkKPiA+ICAJZm9yIChpID0gMDsgaSA8IEtWTV9BRERSRVNTX1NQ QUNFX05VTTsgaSsrKQo+ID4gIAkJa3ZtX2ZyZWVfbWVtc2xvdHMoa3ZtLCBrdm0tPm1lbXNsb3Rz W2ldKTsKPiA+ICAJa3ZtX2FyY2hfZnJlZV92bShrdm0pOwo+ID4gLQltbWRyb3AoY3VycmVudC0+ bW0pOwo+ID4gKwltbXB1dChjdXJyZW50LT5tbSk7Cj4gPiAgCXJldHVybiBFUlJfUFRSKHIpOwo+ ID4gIH0KPiA+ICAKPiA+IEBAIC03NDcsNyArNzQ3LDcgQEAgc3RhdGljIHZvaWQga3ZtX2Rlc3Ry b3lfdm0oc3RydWN0IGt2bSAqa3ZtKQo+ID4gIAlrdm1fYXJjaF9mcmVlX3ZtKGt2bSk7Cj4gPiAg CXByZWVtcHRfbm90aWZpZXJfZGVjKCk7Cj4gPiAgCWhhcmR3YXJlX2Rpc2FibGVfYWxsKCk7Cj4g PiAtCW1tZHJvcChtbSk7Cj4gPiArCW1tcHV0KG1tKTsKPiA+ICB9Cj4gCj4gCj4gQXMgYSBoZWFk cy11cCwgSSdtIHNlZWluZyB3aGF0IGxvb2tzIHRvIGJlIGEgS1ZNIG1lbW9yeSBsZWFrIHdpdGgg dGhpcwo+IHBhdGNoIGFwcGxpZWQgYXRvcCBvZiBuZXh0LTIwMTcwNDExLgo+IAo+IEkgZG9uJ3Qg eWV0IGtub3cgaWYgdGhpcyBpcyBhIHByb2JsZW0gd2l0aCBuZXh0LTIwMTcwNDExIG9yIHRoaXMg cGF0Y2gKPiBpbiBwYXJ0aWN1bGFyIC0tIEkgd2lsbCB0cnkgdG8gdHJhY2sgdGhhdCBkb3duLiBJ biB0aGUgbWVhbiB0aW1lLCBpbmZvCj4gZHVtcCBiZWxvdy4KPiAKPiBJIGxlZnQgc3l6a2FsbGVy IHJ1bm5pbmcgb3ZlciB0aGUgd2Vla2VuZCB1c2luZyB0aGlzIGtlcm5lbCBvbiB0aGUgaG9zdCwK PiBhbmQgT09NIGtpY2tlZCBpbiBhZnRlciBpdCBoYWQgYmVlbiBydW5uaW5nIGZvciBhIHNob3J0 IHdoaWxlLiBBbG1vc3QKPiBhbGwgb2YgbXkgbWVtb3J5IGlzIGluIHVzZSwgYnV0IGp1ZGdpbmcg YnkgdG9wLCBhbG1vc3Qgbm9uZSBvZiB0aGlzIGlzCj4gYXNzb2NpYXRlZCB3aXRoIHByb2Nlc3Nl cy4KCkZXSVcsIGl0IHNlZW1zIGVhc3kgZW5vdWdoIHRvIHRyaWdnZXIgdGhpcyBsZWFrIHdpdGgg a3ZtdG9vbC4gU3RhcnQgYQpndWVzdCB0aGF0J2xsIGFsbG9jYXRlIGEgdG9ubmUgb2YgbWVtb3J5 OgoKJCBsa3ZtIHJ1biAtLWNvbnNvbGUgdmlydGlvIC1tIDQwOTYgLS1rZXJuZWwgSW1hZ2UgXAog IC1wICJtZW10ZXN0PTEgY29uc29sZT1odmMwLDM4NDAwIGVhcmx5Y29uPXVhcnQsbW1pbywweDNm OCIKCi4uLiB0aGVuIGtpbGwgdGhpcyB3aXRoIGEgU0lHS0lMTCBmcm9tIHlvdXIgZmF2b3VyaXRl IHByb2Nlc3MgbWFuYWdlbWVudAp0b29sLgoKQWxzbywgaWYgdGhlIGd1ZXN0IGV4aXRzIG5vcm1h bGx5LCBldmVyeXRoaW5nIGFwcGVhcnMgdG8gYmUgY2xlYW5lZCB1cApjb3JyZWN0bHkuIFNvIGl0 IGxvb2tzIGxpa2UgdGhpcyBpcyBpbiBzb21lIHdheSByZWxhdGVkIHRvIG91ciBmYXRhbApzaWdu YWwgaGFuZGxpbmcuCgpUaGFua3MsCk1hcmsuCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fCmt2bWFybSBtYWlsaW5nIGxpc3QKa3ZtYXJtQGxpc3RzLmNzLmNv bHVtYmlhLmVkdQpodHRwczovL2xpc3RzLmNzLmNvbHVtYmlhLmVkdS9tYWlsbWFuL2xpc3RpbmZv L2t2bWFybQo= From mboxrd@z Thu Jan 1 00:00:00 1970 From: mark.rutland@arm.com (Mark Rutland) Date: Tue, 18 Apr 2017 10:08:46 +0100 Subject: kvm/arm64: use-after-free in kvm_unmap_hva_handler/unmap_stage2_pmds In-Reply-To: <20170418083230.GA17866@leverpostej> References: <20f6c994-d83e-7a6f-9f13-f10287211a6c@arm.com> <9f473bb9-d0eb-6803-1263-75ffef0301fe@redhat.com> <1050c9d8-5813-5df9-29e5-3ab6e61b5de6@arm.com> <88715300-ef58-e7bd-81f5-95e0b9c9c533@arm.com> <20170413155045.GA8387@e107814-lin.cambridge.arm.com> <20170418083230.GA17866@leverpostej> Message-ID: <20170418090845.GC17866@leverpostej> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Tue, Apr 18, 2017 at 09:32:31AM +0100, Mark Rutland wrote: > Hi Suzuki, > > On Thu, Apr 13, 2017 at 04:50:46PM +0100, Suzuki K. Poulose wrote: > > kvm: Hold reference to the user address space > > > > The core KVM code, uses mmgrab/mmdrop to pin the mm struct of the user > > application. mmgrab only guarantees that the mm struct is available, > > while the "real address space" (see Documentation/vm/active_mm.txt) may > > be destroyed. Since the KVM depends on the user space page tables for > > the Guest pages, we should instead do an mmget/mmput. Even though > > mmget/mmput is not encouraged for uses with unbounded time, the KVM > > is fine to do so, as we are doing it from the context of the same process. > > > > This also prevents the race condition where mmu_notifier_release() could > > be called in parallel and one instance could end up using a free'd kvm > > instance. > > > > Cc: Mark Rutland > > Cc: Paolo Bonzin > > Cc: Radim Kr?m?? > > Cc: Marc Zyngier > > Cc: Christoffer Dall > > Cc: andreyknvl at google.com > > Signed-off-by: Suzuki K Poulose > > --- > > virt/kvm/kvm_main.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > index 88257b3..555712e 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -613,7 +613,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > > return ERR_PTR(-ENOMEM); > > > > spin_lock_init(&kvm->mmu_lock); > > - mmgrab(current->mm); > > + mmget(current->mm); > > kvm->mm = current->mm; > > kvm_eventfd_init(kvm); > > mutex_init(&kvm->lock); > > @@ -685,7 +685,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) > > kvm_free_memslots(kvm, kvm->memslots[i]); > > kvm_arch_free_vm(kvm); > > - mmdrop(current->mm); > > + mmput(current->mm); > > return ERR_PTR(r); > > } > > > > @@ -747,7 +747,7 @@ static void kvm_destroy_vm(struct kvm *kvm) > > kvm_arch_free_vm(kvm); > > preempt_notifier_dec(); > > hardware_disable_all(); > > - mmdrop(mm); > > + mmput(mm); > > } > > > As a heads-up, I'm seeing what looks to be a KVM memory leak with this > patch applied atop of next-20170411. > > I don't yet know if this is a problem with next-20170411 or this patch > in particular -- I will try to track that down. In the mean time, info > dump below. > > I left syzkaller running over the weekend using this kernel on the host, > and OOM kicked in after it had been running for a short while. Almost > all of my memory is in use, but judging by top, almost none of this is > associated with processes. FWIW, it seems easy enough to trigger this leak with kvmtool. Start a guest that'll allocate a tonne of memory: $ lkvm run --console virtio -m 4096 --kernel Image \ -p "memtest=1 console=hvc0,38400 earlycon=uart,mmio,0x3f8" ... then kill this with a SIGKILL from your favourite process management tool. Also, if the guest exits normally, everything appears to be cleaned up correctly. So it looks like this is in some way related to our fatal signal handling. Thanks, Mark. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755262AbdDRJJe (ORCPT ); Tue, 18 Apr 2017 05:09:34 -0400 Received: from foss.arm.com ([217.140.101.70]:54172 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755550AbdDRJJN (ORCPT ); Tue, 18 Apr 2017 05:09:13 -0400 Date: Tue, 18 Apr 2017 10:08:46 +0100 From: Mark Rutland To: "Suzuki K. Poulose" Cc: kvm@vger.kernel.org, Marc Zyngier , Andrey Konovalov , will.deacon@arm.com, linux-kernel@vger.kernel.org, kcc@google.com, syzkaller@googlegroups.com, catalin.marinas@arm.com, dvyukov@google.com, Paolo Bonzini , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org Subject: Re: kvm/arm64: use-after-free in kvm_unmap_hva_handler/unmap_stage2_pmds Message-ID: <20170418090845.GC17866@leverpostej> References: <20f6c994-d83e-7a6f-9f13-f10287211a6c@arm.com> <9f473bb9-d0eb-6803-1263-75ffef0301fe@redhat.com> <1050c9d8-5813-5df9-29e5-3ab6e61b5de6@arm.com> <88715300-ef58-e7bd-81f5-95e0b9c9c533@arm.com> <20170413155045.GA8387@e107814-lin.cambridge.arm.com> <20170418083230.GA17866@leverpostej> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20170418083230.GA17866@leverpostej> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 18, 2017 at 09:32:31AM +0100, Mark Rutland wrote: > Hi Suzuki, > > On Thu, Apr 13, 2017 at 04:50:46PM +0100, Suzuki K. Poulose wrote: > > kvm: Hold reference to the user address space > > > > The core KVM code, uses mmgrab/mmdrop to pin the mm struct of the user > > application. mmgrab only guarantees that the mm struct is available, > > while the "real address space" (see Documentation/vm/active_mm.txt) may > > be destroyed. Since the KVM depends on the user space page tables for > > the Guest pages, we should instead do an mmget/mmput. Even though > > mmget/mmput is not encouraged for uses with unbounded time, the KVM > > is fine to do so, as we are doing it from the context of the same process. > > > > This also prevents the race condition where mmu_notifier_release() could > > be called in parallel and one instance could end up using a free'd kvm > > instance. > > > > Cc: Mark Rutland > > Cc: Paolo Bonzin > > Cc: Radim Krčmář > > Cc: Marc Zyngier > > Cc: Christoffer Dall > > Cc: andreyknvl@google.com > > Signed-off-by: Suzuki K Poulose > > --- > > virt/kvm/kvm_main.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > index 88257b3..555712e 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -613,7 +613,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > > return ERR_PTR(-ENOMEM); > > > > spin_lock_init(&kvm->mmu_lock); > > - mmgrab(current->mm); > > + mmget(current->mm); > > kvm->mm = current->mm; > > kvm_eventfd_init(kvm); > > mutex_init(&kvm->lock); > > @@ -685,7 +685,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) > > kvm_free_memslots(kvm, kvm->memslots[i]); > > kvm_arch_free_vm(kvm); > > - mmdrop(current->mm); > > + mmput(current->mm); > > return ERR_PTR(r); > > } > > > > @@ -747,7 +747,7 @@ static void kvm_destroy_vm(struct kvm *kvm) > > kvm_arch_free_vm(kvm); > > preempt_notifier_dec(); > > hardware_disable_all(); > > - mmdrop(mm); > > + mmput(mm); > > } > > > As a heads-up, I'm seeing what looks to be a KVM memory leak with this > patch applied atop of next-20170411. > > I don't yet know if this is a problem with next-20170411 or this patch > in particular -- I will try to track that down. In the mean time, info > dump below. > > I left syzkaller running over the weekend using this kernel on the host, > and OOM kicked in after it had been running for a short while. Almost > all of my memory is in use, but judging by top, almost none of this is > associated with processes. FWIW, it seems easy enough to trigger this leak with kvmtool. Start a guest that'll allocate a tonne of memory: $ lkvm run --console virtio -m 4096 --kernel Image \ -p "memtest=1 console=hvc0,38400 earlycon=uart,mmio,0x3f8" ... then kill this with a SIGKILL from your favourite process management tool. Also, if the guest exits normally, everything appears to be cleaned up correctly. So it looks like this is in some way related to our fatal signal handling. Thanks, Mark.