From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoffer Dall <cdall@linaro.org>
To: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: pbonzini@redhat.com, christoffer.dall@linaro.org,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	marc.zyngier@arm.com, mark.rutland@arm.com, andreyknvl@google.com,
	rkrcmar@redhat.com
Subject: Re: [PATCH 1/2] kvm: Fix mmu_notifier release race
Date: Tue, 25 Apr 2017 17:37:06 +0200
Message-ID: <20170425153706.GK4104@cbox>
In-Reply-To: <1493028624-29837-2-git-send-email-suzuki.poulose@arm.com>
References: <1493028624-29837-1-git-send-email-suzuki.poulose@arm.com>
	<1493028624-29837-2-git-send-email-suzuki.poulose@arm.com>

On Mon, Apr 24, 2017 at 11:10:23AM +0100, Suzuki K Poulose wrote:
> KVM uses mmu_notifier (wherever available) to keep track
> of the changes to the mm of the guest. The guest shadow page
> tables are released when the VM exits via mmu_notifier->ops.release().
> There is a rare chance that the mmu_notifier->release could be
> called more than once via two different paths, which could end
> up in use-after-free of kvm instance (such as [0]).
>
> e.g.:
>
> thread A                                        thread B
> -------                                         --------------
>
>  get_signal->                                   kvm_destroy_vm()->
>  do_exit->                                        mmu_notifier_unregister->
>  exit_mm->                                        kvm_arch_flush_shadow_all()->
>  exit_mmap->                                      spin_lock(&kvm->mmu_lock)
>  mmu_notifier_release->                           ....
>   kvm_arch_flush_shadow_all()->                   .....
>   ... spin_lock(&kvm->mmu_lock)                   .....
>                                                   spin_unlock(&kvm->mmu_lock)
>                                                 kvm_arch_free_vm()
>    *** use after free of kvm ***
>
> This patch attempts to solve the problem by holding a reference to the KVM
> for the mmu_notifier, which is dropped only from notifier->ops.release().
> This will ensure that the KVM struct is available till we reach the
> kvm_mmu_notifier_release, and the kvm_destroy_vm is called only from/after
> it. So, we can unregister the notifier with no_release option and hence
> avoid the race above. However, we need to make sure that the KVM is
> freed only after the mmu_notifier has finished processing the notifier due to
> the following possible path of execution:
>
> mmu_notifier_release -> kvm_mmu_notifier_release -> kvm_put_kvm ->
>   kvm_destroy_vm -> kvm_arch_free_vm
>
> [0] http://lkml.kernel.org/r/CAAeHK+x8udHKq9xa1zkTO6ax5E8Dk32HYWfaT05FMchL2cr48g@mail.gmail.com
>
> Fixes: commit 85db06e514422 ("KVM: mmu_notifiers release method")
> Reported-by: andreyknvl@google.com
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: andreyknvl@google.com
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Tested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

This looks good to me, but we should have some KVM generic experts look
at it as well.

Reviewed-by: Christoffer Dall <cdall@linaro.org>

> ---
>  include/linux/kvm_host.h |  1 +
>  virt/kvm/kvm_main.c      | 59 ++++++++++++++++++++++++++++++++++++++++++------
>  2 files changed, 53 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d025074..561e968 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -424,6 +424,7 @@ struct kvm {
>  	struct mmu_notifier mmu_notifier;
>  	unsigned long mmu_notifier_seq;
>  	long mmu_notifier_count;
> +	struct rcu_head mmu_notifier_rcu;
>  #endif
>  	long tlbs_dirty;
>  	struct list_head devices;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 88257b3..2c3fdd4 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -471,6 +471,7 @@ static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
>  	idx = srcu_read_lock(&kvm->srcu);
>  	kvm_arch_flush_shadow_all(kvm);
>  	srcu_read_unlock(&kvm->srcu, idx);
> +	kvm_put_kvm(kvm);
>  }
>
>  static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
> @@ -486,8 +487,46 @@ static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
>
>  static int kvm_init_mmu_notifier(struct kvm *kvm)
>  {
> +	int rc;
>  	kvm->mmu_notifier.ops = &kvm_mmu_notifier_ops;
> -	return mmu_notifier_register(&kvm->mmu_notifier, current->mm);
> +	rc = mmu_notifier_register(&kvm->mmu_notifier, current->mm);
> +	/*
> +	 * We hold a reference to KVM here to make sure that the KVM
> +	 * doesn't get free'd before ops->release() completes.
> +	 */
> +	if (!rc)
> +		kvm_get_kvm(kvm);
> +	return rc;
> +}
> +
> +static void kvm_free_vm_rcu(struct rcu_head *rcu)
> +{
> +	struct kvm *kvm = container_of(rcu, struct kvm, mmu_notifier_rcu);
> +	kvm_arch_free_vm(kvm);
> +}
> +
> +static void kvm_flush_shadow_mmu(struct kvm *kvm)
> +{
> +	/*
> +	 * We hold a reference to kvm instance for mmu_notifier and is
> +	 * only released when ops->release() is called via exit_mmap path.
> +	 * So, when we reach here ops->release() has been called already, which
> +	 * flushes the shadow page tables. Hence there is no need to call the
> +	 * release() again when we unregister the notifier. However, we need
> +	 * to delay freeing up the kvm until the release() completes, since
> +	 * we could reach here via :
> +	 *  kvm_mmu_notifier_release() -> kvm_put_kvm() -> kvm_destroy_vm()
> +	 */
> +	mmu_notifier_unregister_no_release(&kvm->mmu_notifier, kvm->mm);
> +}
> +
> +static void kvm_free_vm(struct kvm *kvm)
> +{
> +	/*
> +	 * Wait until the mmu_notifier has finished the release().
> +	 * See comments above in kvm_flush_shadow_mmu.
> +	 */
> +	mmu_notifier_call_srcu(&kvm->mmu_notifier_rcu, kvm_free_vm_rcu);
>  }
>
>  #else  /* !(CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER) */
> @@ -497,6 +536,16 @@ static int kvm_init_mmu_notifier(struct kvm *kvm)
>  	return 0;
>  }
>
> +static void kvm_flush_shadow_mmu(struct kvm *kvm)
> +{
> +	kvm_arch_flush_shadow_all(kvm);
> +}
> +
> +static void kvm_free_vm(struct kvm *kvm)
> +{
> +	kvm_arch_free_vm(kvm);
> +}
> +
>  #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */
>
>  static struct kvm_memslots *kvm_alloc_memslots(void)
> @@ -733,18 +782,14 @@ static void kvm_destroy_vm(struct kvm *kvm)
>  		kvm->buses[i] = NULL;
>  	}
>  	kvm_coalesced_mmio_free(kvm);
> -#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
> -	mmu_notifier_unregister(&kvm->mmu_notifier, kvm->mm);
> -#else
> -	kvm_arch_flush_shadow_all(kvm);
> -#endif
> +	kvm_flush_shadow_mmu(kvm);
>  	kvm_arch_destroy_vm(kvm);
>  	kvm_destroy_devices(kvm);
>  	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
>  		kvm_free_memslots(kvm, kvm->memslots[i]);
>  	cleanup_srcu_struct(&kvm->irq_srcu);
>  	cleanup_srcu_struct(&kvm->srcu);
> -	kvm_arch_free_vm(kvm);
> +	kvm_free_vm(kvm);
>  	preempt_notifier_dec();
>  	hardware_disable_all();
>  	mmdrop(mm);
> --
> 2.7.4
>
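
For readers of the archive, the reference-holding idea described in the
commit message can be sketched in user-space C. This is only an
illustration under stated assumptions, not the kernel code: struct vm,
vm_create(), vm_get(), vm_put(), vm_register_notifier() and
vm_notifier_release() are hypothetical names standing in for struct kvm,
kvm_get_kvm(), kvm_put_kvm(), kvm_init_mmu_notifier() and
kvm_mmu_notifier_release(). The sketch does not model the deferred free
through mmu_notifier_call_srcu().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm; not the kernel definition. */
struct vm {
	atomic_int users;     /* reference count */
	bool shadow_flushed;  /* set once the "shadow page tables" are flushed */
	int *freed_counter;   /* bumped when the object is freed (for testing) */
};

/* Allocate a VM object holding the creator's single reference. */
static struct vm *vm_create(int *freed_counter)
{
	struct vm *vm = calloc(1, sizeof(*vm));

	if (!vm)
		return NULL;
	atomic_init(&vm->users, 1);
	vm->freed_counter = freed_counter;
	return vm;
}

/* Take an additional reference. */
static void vm_get(struct vm *vm)
{
	atomic_fetch_add(&vm->users, 1);
}

/* Drop a reference; the last put frees the object exactly once. */
static void vm_put(struct vm *vm)
{
	if (atomic_fetch_sub(&vm->users, 1) == 1) {
		(*vm->freed_counter)++;
		free(vm);
	}
}

/* Registering the notifier takes a reference on the notifier's behalf. */
static void vm_register_notifier(struct vm *vm)
{
	vm_get(vm);
}

/*
 * The release callback flushes first and only then drops the notifier's
 * reference, so the object cannot be freed while the flush is running.
 */
static void vm_notifier_release(struct vm *vm)
{
	vm->shadow_flushed = true;
	vm_put(vm);
}
```

With this ordering, the owner dropping its own reference first does not
free the object. The free happens only after the release callback has
finished with it, which is the property the patch relies on to avoid the
use-after-free shown in the thread A / thread B diagram.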