From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7BE94C5DF60 for ; Wed, 6 Nov 2019 00:12:47 +0000 (UTC) Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 04796214D8 for ; Wed, 6 Nov 2019 00:12:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 04796214D8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 4776Rx2RgSzF5Gg for ; Wed, 6 Nov 2019 11:12:45 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=intel.com (client-ip=192.55.52.93; helo=mga11.intel.com; envelope-from=sean.j.christopherson@intel.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=pass (p=none dis=none) header.from=intel.com Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4775mk44m9zF4QM for ; Wed, 6 Nov 2019 10:42:14 +1100 (AEDT) X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Nov 2019 15:42:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,271,1569308400"; d="scan'208";a="200541599" Received: from sjchrist-coffee.jf.intel.com (HELO linux.intel.com) ([10.54.74.41]) by fmsmga008.fm.intel.com with ESMTP; 05 Nov 2019 15:42:08 -0800 Date: Tue, 5 Nov 2019 15:42:08 -0800 From: Sean Christopherson To: Dan Williams Subject: Re: [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes Message-ID: <20191105234208.GH23297@linux.intel.com> References: <20191024120938.11237-4-david@redhat.com> <01adb4cb-6092-638c-0bab-e61322be7cf5@redhat.com> <613f3606-748b-0e56-a3ad-1efaffa1a67b@redhat.com> <20191105160000.GC8128@linux.intel.com> <20191105231316.GE23297@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Mailman-Approved-At: Wed, 06 Nov 2019 11:05:37 +1100 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-hyperv@vger.kernel.org, Michal Hocko , Radim =?utf-8?B?S3LEjW3DocWZ?= , KVM list , David Hildenbrand , KarimAllah Ahmed , Dave Hansen , Alexander Duyck , Michal Hocko , Linux MM , Pavel Tatashin , Paul Mackerras , "H. Peter Anvin" , Wanpeng Li , Alexander Duyck , "K. Y. Srinivasan" , Thomas Gleixner , Kees Cook , devel@driverdev.osuosl.org, Stefano Stabellini , Stephen Hemminger , "Aneesh Kumar K.V" , Joerg Roedel , X86 ML , YueHaibing , "Matthew Wilcox \(Oracle\)" , Mike Rapoport , Peter Zijlstra , Ingo Molnar , Vlastimil Babka , Anthony Yznaga , Oscar Salvador , "Isaac J. Manjarres" , Matt Sickler , Juergen Gross , Anshuman Khandual , Haiyang Zhang , Sasha Levin , kvm-ppc@vger.kernel.org, Qian Cai , Alex Williamson , Mike Rapoport , Borislav Petkov , Nicholas Piggin , Andy Lutomirski , xen-devel , Boris Ostrovsky , Vitaly Kuznetsov , Allison Randal , Jim Mattson , Mel Gorman , Adam Borowski , Cornelia Huck , Pavel Tatashin , Linux Kernel Mailing List , Johannes Weiner , Paolo Bonzini , Andrew Morton , linuxppc-dev Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On Tue, Nov 05, 2019 at 03:30:00PM -0800, Dan Williams wrote: > On Tue, Nov 5, 2019 at 3:13 PM Sean Christopherson > wrote: > > > > On Tue, Nov 05, 2019 at 03:02:40PM -0800, Dan Williams wrote: > > > On Tue, Nov 5, 2019 at 12:31 PM David Hildenbrand wrote: > > > > > The scarier code (for me) is transparent_hugepage_adjust() and > > > > > kvm_mmu_zap_collapsible_spte(), as I don't at all understand the > > > > > interaction between THP and _PAGE_DEVMAP. > > > > > > > > The x86 KVM MMU code is one of the ugliest code I know (sorry, but it > > > > had to be said :/ ). Luckily, this should be independent of the > > > > PG_reserved thingy AFAIKs. > > > > > > Both transparent_hugepage_adjust() and kvm_mmu_zap_collapsible_spte() > > > are honoring kvm_is_reserved_pfn(), so again I'm missing where the > > > page count gets mismanaged and leads to the reported hang. > > > > When mapping pages into the guest, KVM gets the page via gup(), which > > increments the page count for ZONE_DEVICE pages. But KVM puts the page > > using kvm_release_pfn_clean(), which skips put_page() if PageReserved() > > and so never puts its reference to ZONE_DEVICE pages. > > Oh, yeah, that's busted. > > > My transparent_hugepage_adjust() and kvm_mmu_zap_collapsible_spte() > > comments were for a post-patch/series scenario wheren PageReserved() is > > no longer true for ZONE_DEVICE pages. > > Ah, ok, for that David is preserving kvm_is_reserved_pfn() returning > true for ZONE_DEVICE because pfn_to_online_page() will fail for > ZONE_DEVICE. But David's proposed fix for the above refcount bug is to omit the patch so that KVM no longer treats ZONE_DEVICE pages as reserved. That seems like the right thing to do, including for thp_adjust(), e.g. it would naturally let KVM use 2mb pages for the guest when a ZONE_DEVICE page is mapped with a huge page (2mb or above) in the host. The only hiccup is figuring out how to correctly transfer the reference. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A2B1C5DF60 for ; Tue, 5 Nov 2019 23:42:31 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 210922087E for ; Tue, 5 Nov 2019 23:42:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 210922087E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1iS8Sl-0002BO-Ba; Tue, 05 Nov 2019 23:42:15 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1iS8Sk-0002BJ-3Y for xen-devel@lists.xenproject.org; Tue, 05 Nov 2019 23:42:14 +0000 X-Inumbo-ID: e32be1ee-0025-11ea-a1a5-12813bfff9fa Received: from mga05.intel.com (unknown [192.55.52.43]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id e32be1ee-0025-11ea-a1a5-12813bfff9fa; Tue, 05 Nov 2019 23:42:11 +0000 (UTC) X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Nov 2019 15:42:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,271,1569308400"; d="scan'208";a="200541599" Received: from sjchrist-coffee.jf.intel.com (HELO linux.intel.com) ([10.54.74.41]) by fmsmga008.fm.intel.com with ESMTP; 05 Nov 2019 15:42:08 -0800 Date: Tue, 5 Nov 2019 15:42:08 -0800 From: Sean Christopherson To: Dan Williams Message-ID: <20191105234208.GH23297@linux.intel.com> References: <20191024120938.11237-4-david@redhat.com> <01adb4cb-6092-638c-0bab-e61322be7cf5@redhat.com> <613f3606-748b-0e56-a3ad-1efaffa1a67b@redhat.com> <20191105160000.GC8128@linux.intel.com> <20191105231316.GE23297@linux.intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Subject: Re: [Xen-devel] [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: linux-hyperv@vger.kernel.org, Michal Hocko , Radim =?utf-8?B?S3LEjW3DocWZ?= , KVM list , David Hildenbrand , KarimAllah Ahmed , Benjamin Herrenschmidt , Dave Hansen , Alexander Duyck , Michal Hocko , Paul Mackerras , Linux MM , Pavel Tatashin , Paul Mackerras , Michael Ellerman , "H. Peter Anvin" , Wanpeng Li , Alexander Duyck , "K. Y. Srinivasan" , Thomas Gleixner , Kees Cook , devel@driverdev.osuosl.org, Stefano Stabellini , Stephen Hemminger , "Aneesh Kumar K.V" , Joerg Roedel , X86 ML , YueHaibing , "Matthew Wilcox \(Oracle\)" , Mike Rapoport , Peter Zijlstra , Ingo Molnar , Vlastimil Babka , Anthony Yznaga , Oscar Salvador , "Isaac J. Manjarres" , Matt Sickler , Juergen Gross , Anshuman Khandual , Haiyang Zhang , Sasha Levin , kvm-ppc@vger.kernel.org, Qian Cai , Alex Williamson , Mike Rapoport , Borislav Petkov , Nicholas Piggin , Andy Lutomirski , xen-devel , Boris Ostrovsky , Vitaly Kuznetsov , Allison Randal , Jim Mattson , Christophe Leroy , Mel Gorman , Adam Borowski , Cornelia Huck , Pavel Tatashin , Linux Kernel Mailing List , Johannes Weiner , Paolo Bonzini , Andrew Morton , linuxppc-dev Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" T24gVHVlLCBOb3YgMDUsIDIwMTkgYXQgMDM6MzA6MDBQTSAtMDgwMCwgRGFuIFdpbGxpYW1zIHdy b3RlOgo+IE9uIFR1ZSwgTm92IDUsIDIwMTkgYXQgMzoxMyBQTSBTZWFuIENocmlzdG9waGVyc29u Cj4gPHNlYW4uai5jaHJpc3RvcGhlcnNvbkBpbnRlbC5jb20+IHdyb3RlOgo+ID4KPiA+IE9uIFR1 ZSwgTm92IDA1LCAyMDE5IGF0IDAzOjAyOjQwUE0gLTA4MDAsIERhbiBXaWxsaWFtcyB3cm90ZToK PiA+ID4gT24gVHVlLCBOb3YgNSwgMjAxOSBhdCAxMjozMSBQTSBEYXZpZCBIaWxkZW5icmFuZCA8 ZGF2aWRAcmVkaGF0LmNvbT4gd3JvdGU6Cj4gPiA+ID4gPiBUaGUgc2NhcmllciBjb2RlIChmb3Ig bWUpIGlzIHRyYW5zcGFyZW50X2h1Z2VwYWdlX2FkanVzdCgpIGFuZAo+ID4gPiA+ID4ga3ZtX21t dV96YXBfY29sbGFwc2libGVfc3B0ZSgpLCBhcyBJIGRvbid0IGF0IGFsbCB1bmRlcnN0YW5kIHRo ZQo+ID4gPiA+ID4gaW50ZXJhY3Rpb24gYmV0d2VlbiBUSFAgYW5kIF9QQUdFX0RFVk1BUC4KPiA+ ID4gPgo+ID4gPiA+IFRoZSB4ODYgS1ZNIE1NVSBjb2RlIGlzIG9uZSBvZiB0aGUgdWdsaWVzdCBj b2RlIEkga25vdyAoc29ycnksIGJ1dCBpdAo+ID4gPiA+IGhhZCB0byBiZSBzYWlkIDovICkuIEx1 Y2tpbHksIHRoaXMgc2hvdWxkIGJlIGluZGVwZW5kZW50IG9mIHRoZQo+ID4gPiA+IFBHX3Jlc2Vy dmVkIHRoaW5neSBBRkFJS3MuCj4gPiA+Cj4gPiA+IEJvdGggdHJhbnNwYXJlbnRfaHVnZXBhZ2Vf YWRqdXN0KCkgYW5kIGt2bV9tbXVfemFwX2NvbGxhcHNpYmxlX3NwdGUoKQo+ID4gPiBhcmUgaG9u b3Jpbmcga3ZtX2lzX3Jlc2VydmVkX3BmbigpLCBzbyBhZ2FpbiBJJ20gbWlzc2luZyB3aGVyZSB0 aGUKPiA+ID4gcGFnZSBjb3VudCBnZXRzIG1pc21hbmFnZWQgYW5kIGxlYWRzIHRvIHRoZSByZXBv cnRlZCBoYW5nLgo+ID4KPiA+IFdoZW4gbWFwcGluZyBwYWdlcyBpbnRvIHRoZSBndWVzdCwgS1ZN IGdldHMgdGhlIHBhZ2UgdmlhIGd1cCgpLCB3aGljaAo+ID4gaW5jcmVtZW50cyB0aGUgcGFnZSBj b3VudCBmb3IgWk9ORV9ERVZJQ0UgcGFnZXMuICBCdXQgS1ZNIHB1dHMgdGhlIHBhZ2UKPiA+IHVz aW5nIGt2bV9yZWxlYXNlX3Bmbl9jbGVhbigpLCB3aGljaCBza2lwcyBwdXRfcGFnZSgpIGlmIFBh Z2VSZXNlcnZlZCgpCj4gPiBhbmQgc28gbmV2ZXIgcHV0cyBpdHMgcmVmZXJlbmNlIHRvIFpPTkVf REVWSUNFIHBhZ2VzLgo+IAo+IE9oLCB5ZWFoLCB0aGF0J3MgYnVzdGVkLgo+IAo+ID4gTXkgdHJh bnNwYXJlbnRfaHVnZXBhZ2VfYWRqdXN0KCkgYW5kIGt2bV9tbXVfemFwX2NvbGxhcHNpYmxlX3Nw dGUoKQo+ID4gY29tbWVudHMgd2VyZSBmb3IgYSBwb3N0LXBhdGNoL3NlcmllcyBzY2VuYXJpbyB3 aGVyZW4gUGFnZVJlc2VydmVkKCkgaXMKPiA+IG5vIGxvbmdlciB0cnVlIGZvciBaT05FX0RFVklD RSBwYWdlcy4KPiAKPiBBaCwgb2ssIGZvciB0aGF0IERhdmlkIGlzIHByZXNlcnZpbmcga3ZtX2lz X3Jlc2VydmVkX3BmbigpIHJldHVybmluZwo+IHRydWUgZm9yIFpPTkVfREVWSUNFIGJlY2F1c2Ug cGZuX3RvX29ubGluZV9wYWdlKCkgd2lsbCBmYWlsIGZvcgo+IFpPTkVfREVWSUNFLgoKQnV0IERh dmlkJ3MgcHJvcG9zZWQgZml4IGZvciB0aGUgYWJvdmUgcmVmY291bnQgYnVnIGlzIHRvIG9taXQg dGhlIHBhdGNoCnNvIHRoYXQgS1ZNIG5vIGxvbmdlciB0cmVhdHMgWk9ORV9ERVZJQ0UgcGFnZXMg YXMgcmVzZXJ2ZWQuICBUaGF0IHNlZW1zCmxpa2UgdGhlIHJpZ2h0IHRoaW5nIHRvIGRvLCBpbmNs dWRpbmcgZm9yIHRocF9hZGp1c3QoKSwgZS5nLiBpdCB3b3VsZApuYXR1cmFsbHkgbGV0IEtWTSB1 c2UgMm1iIHBhZ2VzIGZvciB0aGUgZ3Vlc3Qgd2hlbiBhIFpPTkVfREVWSUNFIHBhZ2UgaXMKbWFw cGVkIHdpdGggYSBodWdlIHBhZ2UgKDJtYiBvciBhYm92ZSkgaW4gdGhlIGhvc3QuICBUaGUgb25s eSBoaWNjdXAgaXMKZmlndXJpbmcgb3V0IGhvdyB0byBjb3JyZWN0bHkgdHJhbnNmZXIgdGhlIHJl ZmVyZW5jZS4KCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f Clhlbi1kZXZlbCBtYWlsaW5nIGxpc3QKWGVuLWRldmVsQGxpc3RzLnhlbnByb2plY3Qub3JnCmh0 dHBzOi8vbGlzdHMueGVucHJvamVjdC5vcmcvbWFpbG1hbi9saXN0aW5mby94ZW4tZGV2ZWw= From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFDF5C5DF60 for ; Tue, 5 Nov 2019 23:42:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6E1702087E for ; Tue, 5 Nov 2019 23:42:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6E1702087E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0471C6B0007; Tue, 5 Nov 2019 18:42:16 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F11536B0008; Tue, 5 Nov 2019 18:42:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB2EE6B000A; Tue, 5 Nov 2019 18:42:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0034.hostedemail.com [216.40.44.34]) by kanga.kvack.org (Postfix) with ESMTP id C0F386B0007 for ; Tue, 5 Nov 2019 18:42:15 -0500 (EST) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with SMTP id 68B3A180AD81A for ; Tue, 5 Nov 2019 23:42:15 +0000 (UTC) X-FDA: 76123849830.09.band48_309a7e1258e34 X-HE-Tag: band48_309a7e1258e34 X-Filterd-Recvd-Size: 6352 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by imf37.hostedemail.com (Postfix) with ESMTP for ; Tue, 5 Nov 2019 23:42:13 +0000 (UTC) X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Nov 2019 15:42:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,271,1569308400"; d="scan'208";a="200541599" Received: from sjchrist-coffee.jf.intel.com (HELO linux.intel.com) ([10.54.74.41]) by fmsmga008.fm.intel.com with ESMTP; 05 Nov 2019 15:42:08 -0800 Date: Tue, 5 Nov 2019 15:42:08 -0800 From: Sean Christopherson To: Dan Williams Cc: David Hildenbrand , Linux Kernel Mailing List , Linux MM , Michal Hocko , Andrew Morton , kvm-ppc@vger.kernel.org, linuxppc-dev , KVM list , linux-hyperv@vger.kernel.org, devel@driverdev.osuosl.org, xen-devel , X86 ML , Alexander Duyck , Alexander Duyck , Alex Williamson , Allison Randal , Andy Lutomirski , "Aneesh Kumar K.V" , Anshuman Khandual , Anthony Yznaga , Benjamin Herrenschmidt , Borislav Petkov , Boris Ostrovsky , Christophe Leroy , Cornelia Huck , Dave Hansen , Haiyang Zhang , "H. Peter Anvin" , Ingo Molnar , "Isaac J. Manjarres" , Jim Mattson , Joerg Roedel , Johannes Weiner , Juergen Gross , KarimAllah Ahmed , Kees Cook , "K. Y. Srinivasan" , "Matthew Wilcox (Oracle)" , Matt Sickler , Mel Gorman , Michael Ellerman , Michal Hocko , Mike Rapoport , Mike Rapoport , Nicholas Piggin , Oscar Salvador , Paolo Bonzini , Paul Mackerras , Paul Mackerras , Pavel Tatashin , Pavel Tatashin , Peter Zijlstra , Qian Cai , Radim =?utf-8?B?S3LEjW3DocWZ?= , Sasha Levin , Stefano Stabellini , Stephen Hemminger , Thomas Gleixner , Vitaly Kuznetsov , Vlastimil Babka , Wanpeng Li , YueHaibing , Adam Borowski Subject: Re: [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes Message-ID: <20191105234208.GH23297@linux.intel.com> References: <20191024120938.11237-4-david@redhat.com> <01adb4cb-6092-638c-0bab-e61322be7cf5@redhat.com> <613f3606-748b-0e56-a3ad-1efaffa1a67b@redhat.com> <20191105160000.GC8128@linux.intel.com> <20191105231316.GE23297@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Nov 05, 2019 at 03:30:00PM -0800, Dan Williams wrote: > On Tue, Nov 5, 2019 at 3:13 PM Sean Christopherson > wrote: > > > > On Tue, Nov 05, 2019 at 03:02:40PM -0800, Dan Williams wrote: > > > On Tue, Nov 5, 2019 at 12:31 PM David Hildenbrand wrote: > > > > > The scarier code (for me) is transparent_hugepage_adjust() and > > > > > kvm_mmu_zap_collapsible_spte(), as I don't at all understand the > > > > > interaction between THP and _PAGE_DEVMAP. > > > > > > > > The x86 KVM MMU code is one of the ugliest code I know (sorry, but it > > > > had to be said :/ ). Luckily, this should be independent of the > > > > PG_reserved thingy AFAIKs. > > > > > > Both transparent_hugepage_adjust() and kvm_mmu_zap_collapsible_spte() > > > are honoring kvm_is_reserved_pfn(), so again I'm missing where the > > > page count gets mismanaged and leads to the reported hang. > > > > When mapping pages into the guest, KVM gets the page via gup(), which > > increments the page count for ZONE_DEVICE pages. But KVM puts the page > > using kvm_release_pfn_clean(), which skips put_page() if PageReserved() > > and so never puts its reference to ZONE_DEVICE pages. > > Oh, yeah, that's busted. > > > My transparent_hugepage_adjust() and kvm_mmu_zap_collapsible_spte() > > comments were for a post-patch/series scenario wheren PageReserved() is > > no longer true for ZONE_DEVICE pages. > > Ah, ok, for that David is preserving kvm_is_reserved_pfn() returning > true for ZONE_DEVICE because pfn_to_online_page() will fail for > ZONE_DEVICE. But David's proposed fix for the above refcount bug is to omit the patch so that KVM no longer treats ZONE_DEVICE pages as reserved. That seems like the right thing to do, including for thp_adjust(), e.g. it would naturally let KVM use 2mb pages for the guest when a ZONE_DEVICE page is mapped with a huge page (2mb or above) in the host. The only hiccup is figuring out how to correctly transfer the reference.