From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Subject: [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap
Date: Thu, 24 Oct 2019 14:09:37 +0200
Message-Id: <20191024120938.11237-10-david@redhat.com>
In-Reply-To: <20191024120938.11237-1-david@redhat.com>
References: <20191024120938.11237-1-david@redhat.com>
List-Id: Linux on PowerPC Developers Mail List
Cc: linux-hyperv@vger.kernel.org, Michal Hocko, Radim Krčmář,
 kvm@vger.kernel.org, David Hildenbrand, KarimAllah Ahmed, Dave Hansen,
 Alexander Duyck, Michal Hocko, linux-mm@kvack.org, Pavel Tatashin,
 Paul Mackerras, "H. Peter Anvin", Wanpeng Li, Alexander Duyck,
 "K. Y. Srinivasan", Dan Williams, Kees Cook, devel@driverdev.osuosl.org,
 Stefano Stabellini, Stephen Hemminger, "Aneesh Kumar K.V", Joerg Roedel,
 x86@kernel.org, YueHaibing, "Matthew Wilcox \(Oracle\)", Mike Rapoport,
 Peter Zijlstra, Ingo Molnar, Vlastimil Babka, Anthony Yznaga,
 Oscar Salvador, "Isaac J. Manjarres", Matt Sickler, Juergen Gross,
 Anshuman Khandual, Haiyang Zhang, Sasha Levin, kvm-ppc@vger.kernel.org,
 Qian Cai, Alex Williamson, Mike Rapoport, Borislav Petkov,
 Nicholas Piggin, Andy Lutomirski, xen-devel@lists.xenproject.org,
 Boris Ostrovsky, Vitaly Kuznetsov, Allison Randal, Jim Mattson,
 Mel Gorman, Cornelia Huck, Pavel Tatashin, Sean Christopherson,
 Thomas Gleixner, Johannes Weiner, Paolo Bonzini, Andrew Morton,
 linuxppc-dev@lists.ozlabs.org

Everything should be prepared to stop setting pages PG_reserved when
initializing the memmap on memory hotplug. Most importantly, we
stop marking ZONE_DEVICE pages PG_reserved.

a) We made sure that any code that relied on PG_reserved to detect
   ZONE_DEVICE memory will no longer rely on PG_reserved (especially,
   by relying on pfn_to_online_page() for now). Details can be found
   below.
b) We made sure that memory blocks with holes cannot be offlined and
   therefore also not onlined. We have quite some code that relies on
   memory holes being marked PG_reserved. This is no longer an issue.

generic_online_page() still calls __free_pages_core(), which performs
__ClearPageReserved(p). AFAIKS, this should not hurt.

It is worth noting that the users of online_page_callback_t might see a
change. E.g., until now, pages not freed to the buddy by the HyperV
balloon were set PG_reserved until freed via generic_online_page(). Now,
they would look like ordinarily allocated pages (refcount == 1). This
callback is used by the XEN balloon and the HyperV balloon. To not
introduce any silent errors, keep marking the pages PG_reserved. We can
most probably stop doing that, but have to double check if there are
issues (e.g., offlining code aborts right away in has_unmovable_pages()
when it runs into a PageReserved(page)).

Update the documentation at various places in the MM core.

There are three PageReserved() users that might be affected by this change.
 - drivers/staging/gasket/gasket_page_table.c:gasket_release_page()
   -> We might (unlikely) call SetPageDirty() on a ZONE_DEVICE page
   -> I assume "we don't care"
 - drivers/staging/kpc2000/kpc_dma/fileops.c:transfer_complete_cb()
   -> We might (unlikely) call SetPageDirty() on a ZONE_DEVICE page
   -> I assume "we don't care"
 - mm/usercopy.c: check_page_span()
   -> According to Dan, non-HMM ZONE_DEVICE usage excluded this code since
      commit 52f476a323f9 ("libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY
      overhead")
   -> It is unclear whether we really cared about ZONE_DEVICE here (HMM) or
      simply about "PG_reserved". The worst thing that could happen is a
      false negative with CONFIG_HARDENED_USERCOPY that we should be able
      to identify easily.
   -> There is a discussion to rip out that code completely
   -> I assume "not relevant" / "we don't care"

I audited the other PageReserved() users. They don't affect ZONE_DEVICE:
 - mm/page_owner.c:pagetypeinfo_showmixedcount_print()
   -> Never called for ZONE_DEVICE, (+ pfn_to_online_page(pfn))
 - mm/page_owner.c:init_pages_in_zone()
   -> Never called for ZONE_DEVICE (!populated_zone(zone))
 - mm/page_ext.c:free_page_ext()
   -> Only a BUG_ON(PageReserved(page)), not relevant
 - mm/page_alloc.c:has_unmovable_pages()
   -> Not relevant for ZONE_DEVICE
 - mm/page_alloc.c:pfn_range_valid_contig()
   -> pfn_to_online_page() already guards us
 - mm/mempolicy.c:queue_pages_pte_range()
   -> vm_normal_page() checks against pte_devmap()
 - mm/memory-failure.c:hwpoison_user_mappings()
   -> Not reached via memory_failure() due to pfn_to_online_page()
   -> Also not reached indirectly via memory_failure_hugetlb()
 - mm/hugetlb.c:gather_bootmem_prealloc()
   -> Only a WARN_ON(PageReserved(page)), not relevant
 - kernel/power/snapshot.c:saveable_highmem_page()
   -> pfn_to_online_page() already guards us
 - kernel/power/snapshot.c:saveable_page()
   -> pfn_to_online_page() already guards us
 - fs/proc/task_mmu.c:can_gather_numa_stats()
   -> vm_normal_page() checks against pte_devmap()
 - fs/proc/task_mmu.c:can_gather_numa_stats_pmd()
   -> vm_normal_page_pmd() checks against pmd_devmap()
 - fs/proc/page.c:stable_page_flags()
   -> The reserved bit is simply copied, irrelevant
 - drivers/firmware/memmap.c:release_firmware_map_entry()
   -> Really only a check to detect bootmem. Not relevant for ZONE_DEVICE
 - arch/ia64/kernel/mca_drv.c
 - arch/mips/mm/init.c
 - arch/mips/mm/ioremap.c
 - arch/nios2/mm/ioremap.c
 - arch/parisc/mm/ioremap.c
 - arch/sparc/mm/tlb.c
 - arch/xtensa/mm/cache.c
   -> No ZONE_DEVICE support
 - arch/powerpc/mm/init_64.c:vmemmap_free()
   -> Special-cases memmap on altmap
   -> Only a check for bootmem
 - arch/x86/kernel/alternative.c:__text_poke()
   -> Only a WARN_ON(!PageReserved(pages[0])) to verify it is bootmem
 - arch/x86/mm/init_64.c
   -> Only a check for bootmem

Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Sasha Levin
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Andrew Morton
Cc: Alexander Duyck
Cc: Pavel Tatashin
Cc: Vlastimil Babka
Cc: Johannes Weiner
Cc: Anthony Yznaga
Cc: Michal Hocko
Cc: Oscar Salvador
Cc: Dan Williams
Cc: Mel Gorman
Cc: Mike Rapoport
Cc: Anshuman Khandual
Cc: Matt Sickler
Cc: Kees Cook
Suggested-by: Michal Hocko
Signed-off-by: David Hildenbrand
---
 drivers/hv/hv_balloon.c    |  6 ++++++
 drivers/xen/balloon.c      |  7 +++++++
 include/linux/page-flags.h |  8 +-------
 mm/memory_hotplug.c        | 17 +++++++----------
 mm/page_alloc.c            | 11 -----------
 5 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index c722079d3c24..3214b0ef5247 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -670,6 +670,12 @@ static struct notifier_block hv_memory_nb = {
 /* Check if the particular page is backed and can be onlined and online it. */
 static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
 {
+	/*
+	 * TODO: The core used to mark the pages reserved. Most probably
+	 * we can stop doing that now.
+	 */
+	__SetPageReserved(pg);
+
 	if (!has_pfn_is_backed(has, page_to_pfn(pg))) {
 		if (!PageOffline(pg))
 			__SetPageOffline(pg);
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 4f2e78a5e4db..af69f057913a 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -374,6 +374,13 @@ static void xen_online_page(struct page *page, unsigned int order)
 	mutex_lock(&balloon_mutex);
 	for (i = 0; i < size; i++) {
 		p = pfn_to_page(start_pfn + i);
+		/*
+		 * TODO: The core used to mark the pages reserved. Most probably
+		 * we can stop doing that now. However, especially
+		 * alloc_xenballooned_pages() left PG_reserved set
+		 * on pages that can get mapped to user space.
+		 */
+		__SetPageReserved(p);
 		balloon_append(p);
 	}
 	mutex_unlock(&balloon_mutex);
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 3b8e5c5f7e1f..e9a7465219d1 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -30,24 +30,18 @@
  * - Pages falling into physical memory gaps - not IORESOURCE_SYSRAM. Trying
  *   to read/write these pages might end badly. Don't touch!
  * - The zero page(s)
- * - Pages not added to the page allocator when onlining a section because
- *   they were excluded via the online_page_callback() or because they are
- *   PG_hwpoison.
  * - Pages allocated in the context of kexec/kdump (loaded kernel image,
  *   control pages, vmcoreinfo)
  * - MMIO/DMA pages. Some architectures don't allow to ioremap pages that are
  *   not marked PG_reserved (as they might be in use by somebody else who does
  *   not respect the caching strategy).
- * - Pages part of an offline section (struct pages of offline sections should
- *   not be trusted as they will be initialized when first onlined).
  * - MCA pages on ia64
  * - Pages holding CPU notes for POWER Firmware Assisted Dump
- * - Device memory (e.g. PMEM, DAX, HMM)
  * Some PG_reserved pages will be excluded from the hibernation image.
  * PG_reserved does in general not hinder anybody from dumping or swapping
  * and is no longer required for remap_pfn_range(). ioremap might require it.
  * Consequently, PG_reserved for a page mapped into user space can indicate
- * the zero page, the vDSO, MMIO pages or device memory.
+ * the zero page, the vDSO, or MMIO pages.
  *
  * The PG_private bitflag is set on pagecache pages if they contain filesystem
  * specific data (which is normally at page->private). It can be used by
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8d81730cf036..2714edce98dd 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -501,9 +501,7 @@ static void __remove_section(unsigned long pfn, unsigned long nr_pages,
  * @altmap: alternative device page map or %NULL if default memmap is used
  *
  * Generic helper function to remove section mappings and sysfs entries
- * for the section of the memory we are removing. Caller needs to make
- * sure that pages are marked reserved and zones are adjust properly by
- * calling offline_pages().
+ * for the section of the memory we are removing.
  */
 void __remove_pages(unsigned long pfn, unsigned long nr_pages,
 		    struct vmem_altmap *altmap)
@@ -584,9 +582,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	int order;
 
 	/*
-	 * Online the pages. The callback might decide to keep some pages
-	 * PG_reserved (to add them to the buddy later), but we still account
-	 * them as being online/belonging to this zone ("present").
+	 * Online the pages. The callback might decide to not free some pages
+	 * (to add them to the buddy later), but we still account them as
+	 * being online/belonging to this zone ("present").
 	 */
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
 		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
@@ -659,8 +657,7 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
 }
 /*
  * Associate the pfn range with the given zone, initializing the memmaps
- * and resizing the pgdat/zone data to span the added pages. After this
- * call, all affected pages are PG_reserved.
+ * and resizing the pgdat/zone data to span the added pages.
  */
 void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap)
@@ -684,8 +681,8 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	/*
 	 * TODO now we have a visible range of pages which are not associated
 	 * with their zone properly. Not nice but set_pfnblock_flags_mask
-	 * expects the zone spans the pfn range. All the pages in the range
-	 * are reserved so nobody should be touching them so we should be safe
+	 * expects the zone spans the pfn range. The sections are not yet
+	 * marked online so nobody should be touching the memmap.
 	 */
 	memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
 			MEMMAP_HOTPLUG, altmap);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f9488efff680..aa6ecac27b68 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5927,8 +5927,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 
 		page = pfn_to_page(pfn);
 		__init_single_page(page, pfn, zone, nid);
-		if (context == MEMMAP_HOTPLUG)
-			__SetPageReserved(page);
 
 		/*
 		 * Mark the block movable so that blocks are reserved for
@@ -5980,15 +5978,6 @@ void __ref memmap_init_zone_device(struct zone *zone,
 
 		__init_single_page(page, pfn, zone_idx, nid);
 
-		/*
-		 * Mark page reserved as it will need to wait for onlining
-		 * phase for it to be fully associated with a zone.
-		 *
-		 * We can use the non-atomic __set_bit operation for setting
-		 * the flag as we are still initializing the pages.
-		 */
-		__SetPageReserved(page);
-
 		/*
 		 * ZONE_DEVICE pages union ->lru with a ->pgmap back pointer
 		 * and zone_device_data. It is a bug if a ZONE_DEVICE page is
-- 
2.21.0
cywgc3RydWN0IHZtZW1fYWx0bWFwICphbHRtYXApCkBAIC02ODQsOCArNjgxLDggQEAgdm9pZCBf X3JlZiBtb3ZlX3Bmbl9yYW5nZV90b196b25lKHN0cnVjdCB6b25lICp6b25lLCB1bnNpZ25lZCBs b25nIHN0YXJ0X3BmbiwKIAkvKgogCSAqIFRPRE8gbm93IHdlIGhhdmUgYSB2aXNpYmxlIHJhbmdl IG9mIHBhZ2VzIHdoaWNoIGFyZSBub3QgYXNzb2NpYXRlZAogCSAqIHdpdGggdGhlaXIgem9uZSBw cm9wZXJseS4gTm90IG5pY2UgYnV0IHNldF9wZm5ibG9ja19mbGFnc19tYXNrCi0JICogZXhwZWN0 cyB0aGUgem9uZSBzcGFucyB0aGUgcGZuIHJhbmdlLiBBbGwgdGhlIHBhZ2VzIGluIHRoZSByYW5n ZQotCSAqIGFyZSByZXNlcnZlZCBzbyBub2JvZHkgc2hvdWxkIGJlIHRvdWNoaW5nIHRoZW0gc28g d2Ugc2hvdWxkIGJlIHNhZmUKKwkgKiBleHBlY3RzIHRoZSB6b25lIHNwYW5zIHRoZSBwZm4gcmFu Z2UuIFRoZSBzZWN0aW9ucyBhcmUgbm90IHlldAorCSAqIG1hcmtlZCBvbmxpbmUgc28gbm9ib2R5 IHNob3VsZCBiZSB0b3VjaGluZyB0aGUgbWVtbWFwLgogCSAqLwogCW1lbW1hcF9pbml0X3pvbmUo bnJfcGFnZXMsIG5pZCwgem9uZV9pZHgoem9uZSksIHN0YXJ0X3BmbiwKIAkJCU1FTU1BUF9IT1RQ TFVHLCBhbHRtYXApOwpkaWZmIC0tZ2l0IGEvbW0vcGFnZV9hbGxvYy5jIGIvbW0vcGFnZV9hbGxv Yy5jCmluZGV4IGY5NDg4ZWZmZjY4MC4uYWE2ZWNhYzI3YjY4IDEwMDY0NAotLS0gYS9tbS9wYWdl X2FsbG9jLmMKKysrIGIvbW0vcGFnZV9hbGxvYy5jCkBAIC01OTI3LDggKzU5MjcsNiBAQCB2b2lk IF9fbWVtaW5pdCBtZW1tYXBfaW5pdF96b25lKHVuc2lnbmVkIGxvbmcgc2l6ZSwgaW50IG5pZCwg dW5zaWduZWQgbG9uZyB6b25lLAogCiAJCXBhZ2UgPSBwZm5fdG9fcGFnZShwZm4pOwogCQlfX2lu aXRfc2luZ2xlX3BhZ2UocGFnZSwgcGZuLCB6b25lLCBuaWQpOwotCQlpZiAoY29udGV4dCA9PSBN RU1NQVBfSE9UUExVRykKLQkJCV9fU2V0UGFnZVJlc2VydmVkKHBhZ2UpOwogCiAJCS8qCiAJCSAq IE1hcmsgdGhlIGJsb2NrIG1vdmFibGUgc28gdGhhdCBibG9ja3MgYXJlIHJlc2VydmVkIGZvcgpA QCAtNTk4MCwxNSArNTk3OCw2IEBAIHZvaWQgX19yZWYgbWVtbWFwX2luaXRfem9uZV9kZXZpY2Uo c3RydWN0IHpvbmUgKnpvbmUsCiAKIAkJX19pbml0X3NpbmdsZV9wYWdlKHBhZ2UsIHBmbiwgem9u ZV9pZHgsIG5pZCk7CiAKLQkJLyoKLQkJICogTWFyayBwYWdlIHJlc2VydmVkIGFzIGl0IHdpbGwg bmVlZCB0byB3YWl0IGZvciBvbmxpbmluZwotCQkgKiBwaGFzZSBmb3IgaXQgdG8gYmUgZnVsbHkg YXNzb2NpYXRlZCB3aXRoIGEgem9uZS4KLQkJICoKLQkJICogV2UgY2FuIHVzZSB0aGUgbm9uLWF0 b21pYyBfX3NldF9iaXQgb3BlcmF0aW9uIGZvciBzZXR0aW5nCi0JCSAqIHRoZSBmbGFnIGFzIHdl 
IGFyZSBzdGlsbCBpbml0aWFsaXppbmcgdGhlIHBhZ2VzLgotCQkgKi8KLQkJX19TZXRQYWdlUmVz ZXJ2ZWQocGFnZSk7Ci0KIAkJLyoKIAkJICogWk9ORV9ERVZJQ0UgcGFnZXMgdW5pb24gLT5scnUg d2l0aCBhIC0+cGdtYXAgYmFjayBwb2ludGVyCiAJCSAqIGFuZCB6b25lX2RldmljZV9kYXRhLiAg SXQgaXMgYSBidWcgaWYgYSBaT05FX0RFVklDRSBwYWdlIGlzCi0tIAoyLjIxLjAKCgpfX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwpYZW4tZGV2ZWwgbWFpbGlu ZyBsaXN0Clhlbi1kZXZlbEBsaXN0cy54ZW5wcm9qZWN0Lm9yZwpodHRwczovL2xpc3RzLnhlbnBy b2plY3Qub3JnL21haWxtYW4vbGlzdGluZm8veGVuLWRldmVs From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E870ACA9EAF for ; Thu, 24 Oct 2019 12:14:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 934242166E for ; Thu, 24 Oct 2019 12:14:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="T4nUTKuo" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 934242166E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 45DD46B026A; Thu, 24 Oct 2019 08:14:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 40DF86B026B; Thu, 24 Oct 2019 08:14:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2D51B6B026C; Thu, 24 Oct 2019 08:14:08 -0400 (EDT) 
X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0180.hostedemail.com [216.40.44.180]) by kanga.kvack.org (Postfix) with ESMTP id F2F8E6B026A for ; Thu, 24 Oct 2019 08:14:07 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with SMTP id 878FC5DD0 for ; Thu, 24 Oct 2019 12:14:07 +0000 (UTC) X-FDA: 76078570134.10.roll32_23ed820e0301e X-HE-Tag: roll32_23ed820e0301e X-Filterd-Recvd-Size: 17312 Received: from us-smtp-delivery-1.mimecast.com (us-smtp-2.mimecast.com [205.139.110.61]) by imf26.hostedemail.com (Postfix) with ESMTP for ; Thu, 24 Oct 2019 12:14:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1571919246; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BdHwj2SGut21P5mUR9vWZhfFecp/N83MWMVJ97x7s00=; b=T4nUTKuop9UVBTKArpbw/r4KDrADT1Mw4HZKgwSBG11GnyoBMVGQl/eq0jYD+n/mdG4J4Z NHmnQWAyvFIzQRPXmt5gjIrc3kqHt7Qp5f3pR66gzd/qwNpYfGv/0zbCRBU3Qb8higoJqT 4jaApMQWE5WvnLdHFSZ0Ehx08Fa9vEk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-376-cghgf-YzMZOgToF4kkZ_Ag-1; Thu, 24 Oct 2019 08:14:04 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id BEA09801E5C; Thu, 24 Oct 2019 12:13:57 +0000 (UTC) Received: from t460s.redhat.com (ovpn-116-141.ams2.redhat.com [10.36.116.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7969C8088; Thu, 24 Oct 2019 12:13:39 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, 
David Hildenbrand , Michal Hocko , Andrew Morton , kvm-ppc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org, linux-hyperv@vger.kernel.org, devel@driverdev.osuosl.org, xen-devel@lists.xenproject.org, x86@kernel.org, Alexander Duyck , Alexander Duyck , Alex Williamson , Allison Randal , Andy Lutomirski , "Aneesh Kumar K.V" , Anshuman Khandual , Anthony Yznaga , Benjamin Herrenschmidt , Borislav Petkov , Boris Ostrovsky , Christophe Leroy , Cornelia Huck , Dan Williams , Dave Hansen , Haiyang Zhang , "H. Peter Anvin" , Ingo Molnar , "Isaac J. Manjarres" , Jim Mattson , Joerg Roedel , Johannes Weiner , Juergen Gross , KarimAllah Ahmed , Kees Cook , "K. Y. Srinivasan" , "Matthew Wilcox (Oracle)" , Matt Sickler , Mel Gorman , Michael Ellerman , Michal Hocko , Mike Rapoport , Mike Rapoport , Nicholas Piggin , Oscar Salvador , Paolo Bonzini , Paul Mackerras , Paul Mackerras , Pavel Tatashin , Pavel Tatashin , Peter Zijlstra , Qian Cai , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Sasha Levin , Sean Christopherson , Stefano Stabellini , Stephen Hemminger , Thomas Gleixner , Vitaly Kuznetsov , Vlastimil Babka , Wanpeng Li , YueHaibing Subject: [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap Date: Thu, 24 Oct 2019 14:09:37 +0200 Message-Id: <20191024120938.11237-10-david@redhat.com> In-Reply-To: <20191024120938.11237-1-david@redhat.com> References: <20191024120938.11237-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-MC-Unique: cghgf-YzMZOgToF4kkZ_Ag-1 X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Everything should be prepared to stop setting pages PG_reserved when initializing the memmap on memory hotplug. 
Most importantly, we stop marking ZONE_DEVICE pages PG_reserved.

a) We made sure that any code that relied on PG_reserved to detect
   ZONE_DEVICE memory will no longer rely on PG_reserved (especially,
   by relying on pfn_to_online_page() for now). Details can be found
   below.
b) We made sure that memory blocks with holes cannot be offlined and
   therefore also not onlined. We have quite a bit of code that relies
   on memory holes being marked PG_reserved. This is no longer an issue.

generic_online_page() still calls __free_pages_core(), which performs
__ClearPageReserved(p). AFAIKS, this should not hurt.

It is worth noting that the users of online_page_callback_t might see a
change. E.g., until now, pages not freed to the buddy by the HyperV
balloon were set PG_reserved until freed via generic_online_page(). Now,
they would look like ordinarily allocated pages (refcount == 1). This
callback is used by the XEN balloon and the HyperV balloon. To not
introduce any silent errors, keep marking the pages PG_reserved. We can
most probably stop doing that, but have to double check if there are
issues (e.g., offlining code aborts right away in has_unmovable_pages()
when it runs into a PageReserved(page)).

Update the documentation at various places in the MM core.

There are three PageReserved() users that might be affected by this change.
- drivers/staging/gasket/gasket_page_table.c:gasket_release_page()
  -> We might (unlikely) call SetPageDirty() on a ZONE_DEVICE page
  -> I assume "we don't care"
- drivers/staging/kpc2000/kpc_dma/fileops.c:transfer_complete_cb()
  -> We might (unlikely) call SetPageDirty() on a ZONE_DEVICE page
  -> I assume "we don't care"
- mm/usercopy.c:check_page_span()
  -> According to Dan, non-HMM ZONE_DEVICE usage excluded this code since
     commit 52f476a323f9 ("libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY
     overhead")
  -> It is unclear whether we really cared about ZONE_DEVICE here (HMM) or
     simply about "PG_reserved".
     The worst thing that could happen is a false negative with
     CONFIG_HARDENED_USERCOPY, which we should be able to identify easily.
  -> There is a discussion to rip out that code completely
  -> I assume "not relevant" / "we don't care"

I audited the other PageReserved() users. They don't affect ZONE_DEVICE:
- mm/page_owner.c:pagetypeinfo_showmixedcount_print()
  -> Never called for ZONE_DEVICE, (+ pfn_to_online_page(pfn))
- mm/page_owner.c:init_pages_in_zone()
  -> Never called for ZONE_DEVICE (!populated_zone(zone))
- mm/page_ext.c:free_page_ext()
  -> Only a BUG_ON(PageReserved(page)), not relevant
- mm/page_alloc.c:has_unmovable_pages()
  -> Not relevant for ZONE_DEVICE
- mm/page_alloc.c:pfn_range_valid_contig()
  -> pfn_to_online_page() already guards us
- mm/mempolicy.c:queue_pages_pte_range()
  -> vm_normal_page() checks against pte_devmap()
- mm/memory-failure.c:hwpoison_user_mappings()
  -> Not reached via memory_failure() due to pfn_to_online_page()
  -> Also not reached indirectly via memory_failure_hugetlb()
- mm/hugetlb.c:gather_bootmem_prealloc()
  -> Only a WARN_ON(PageReserved(page)), not relevant
- kernel/power/snapshot.c:saveable_highmem_page()
  -> pfn_to_online_page() already guards us
- kernel/power/snapshot.c:saveable_page()
  -> pfn_to_online_page() already guards us
- fs/proc/task_mmu.c:can_gather_numa_stats()
  -> vm_normal_page() checks against pte_devmap()
- fs/proc/task_mmu.c:can_gather_numa_stats_pmd()
  -> vm_normal_page_pmd() checks against pmd_devmap()
- fs/proc/page.c:stable_page_flags()
  -> The reserved bit is simply copied, irrelevant
- drivers/firmware/memmap.c:release_firmware_map_entry()
  -> really only a check to detect bootmem.
     Not relevant for ZONE_DEVICE
- arch/ia64/kernel/mca_drv.c
- arch/mips/mm/init.c
- arch/mips/mm/ioremap.c
- arch/nios2/mm/ioremap.c
- arch/parisc/mm/ioremap.c
- arch/sparc/mm/tlb.c
- arch/xtensa/mm/cache.c
  -> No ZONE_DEVICE support
- arch/powerpc/mm/init_64.c:vmemmap_free()
  -> Special-cases memmap on altmap
  -> Only a check for bootmem
- arch/x86/kernel/alternative.c:__text_poke()
  -> Only a WARN_ON(!PageReserved(pages[0])) to verify it is bootmem
- arch/x86/mm/init_64.c
  -> Only a check for bootmem

Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Sasha Levin
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Andrew Morton
Cc: Alexander Duyck
Cc: Pavel Tatashin
Cc: Vlastimil Babka
Cc: Johannes Weiner
Cc: Anthony Yznaga
Cc: Michal Hocko
Cc: Oscar Salvador
Cc: Dan Williams
Cc: Mel Gorman
Cc: Mike Rapoport
Cc: Anshuman Khandual
Cc: Matt Sickler
Cc: Kees Cook
Suggested-by: Michal Hocko
Signed-off-by: David Hildenbrand
---
 drivers/hv/hv_balloon.c    |  6 ++++++
 drivers/xen/balloon.c      |  7 +++++++
 include/linux/page-flags.h |  8 +-------
 mm/memory_hotplug.c        | 17 +++++++----------
 mm/page_alloc.c            | 11 -----------
 5 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index c722079d3c24..3214b0ef5247 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -670,6 +670,12 @@ static struct notifier_block hv_memory_nb = {
 /* Check if the particular page is backed and can be onlined and online it. */
 static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
 {
+	/*
+	 * TODO: The core used to mark the pages reserved. Most probably
+	 * we can stop doing that now.
+	 */
+	__SetPageReserved(pg);
+
 	if (!has_pfn_is_backed(has, page_to_pfn(pg))) {
 		if (!PageOffline(pg))
 			__SetPageOffline(pg);
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 4f2e78a5e4db..af69f057913a 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -374,6 +374,13 @@ static void xen_online_page(struct page *page, unsigned int order)
 	mutex_lock(&balloon_mutex);
 	for (i = 0; i < size; i++) {
 		p = pfn_to_page(start_pfn + i);
+		/*
+		 * TODO: The core used to mark the pages reserved. Most probably
+		 * we can stop doing that now. However, especially
+		 * alloc_xenballooned_pages() left PG_reserved set
+		 * on pages that can get mapped to user space.
+		 */
+		__SetPageReserved(p);
 		balloon_append(p);
 	}
 	mutex_unlock(&balloon_mutex);
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 3b8e5c5f7e1f..e9a7465219d1 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -30,24 +30,18 @@
  * - Pages falling into physical memory gaps - not IORESOURCE_SYSRAM. Trying
  *   to read/write these pages might end badly. Don't touch!
  * - The zero page(s)
- * - Pages not added to the page allocator when onlining a section because
- *   they were excluded via the online_page_callback() or because they are
- *   PG_hwpoison.
  * - Pages allocated in the context of kexec/kdump (loaded kernel image,
  *   control pages, vmcoreinfo)
  * - MMIO/DMA pages. Some architectures don't allow to ioremap pages that are
  *   not marked PG_reserved (as they might be in use by somebody else who does
  *   not respect the caching strategy).
- * - Pages part of an offline section (struct pages of offline sections should
- *   not be trusted as they will be initialized when first onlined).
  * - MCA pages on ia64
  * - Pages holding CPU notes for POWER Firmware Assisted Dump
- * - Device memory (e.g. PMEM, DAX, HMM)
  * Some PG_reserved pages will be excluded from the hibernation image.
  * PG_reserved does in general not hinder anybody from dumping or swapping
  * and is no longer required for remap_pfn_range(). ioremap might require it.
  * Consequently, PG_reserved for a page mapped into user space can indicate
- * the zero page, the vDSO, MMIO pages or device memory.
+ * the zero page, the vDSO, or MMIO pages.
  *
  * The PG_private bitflag is set on pagecache pages if they contain filesystem
  * specific data (which is normally at page->private). It can be used by
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8d81730cf036..2714edce98dd 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -501,9 +501,7 @@ static void __remove_section(unsigned long pfn, unsigned long nr_pages,
  * @altmap: alternative device page map or %NULL if default memmap is used
  *
  * Generic helper function to remove section mappings and sysfs entries
- * for the section of the memory we are removing. Caller needs to make
- * sure that pages are marked reserved and zones are adjust properly by
- * calling offline_pages().
+ * for the section of the memory we are removing.
  */
 void __remove_pages(unsigned long pfn, unsigned long nr_pages,
 		    struct vmem_altmap *altmap)
@@ -584,9 +582,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	int order;
 
 	/*
-	 * Online the pages. The callback might decide to keep some pages
-	 * PG_reserved (to add them to the buddy later), but we still account
-	 * them as being online/belonging to this zone ("present").
+	 * Online the pages. The callback might decide to not free some pages
+	 * (to add them to the buddy later), but we still account them as
+	 * being online/belonging to this zone ("present").
 	 */
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
 		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
@@ -659,8 +657,7 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
 }
 /*
  * Associate the pfn range with the given zone, initializing the memmaps
- * and resizing the pgdat/zone data to span the added pages. After this
- * call, all affected pages are PG_reserved.
+ * and resizing the pgdat/zone data to span the added pages.
  */
 void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap)
@@ -684,8 +681,8 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	/*
 	 * TODO now we have a visible range of pages which are not associated
 	 * with their zone properly. Not nice but set_pfnblock_flags_mask
-	 * expects the zone spans the pfn range. All the pages in the range
-	 * are reserved so nobody should be touching them so we should be safe
+	 * expects the zone spans the pfn range. The sections are not yet
+	 * marked online so nobody should be touching the memmap.
 	 */
 	memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
 			MEMMAP_HOTPLUG, altmap);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f9488efff680..aa6ecac27b68 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5927,8 +5927,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 
 		page = pfn_to_page(pfn);
 		__init_single_page(page, pfn, zone, nid);
-		if (context == MEMMAP_HOTPLUG)
-			__SetPageReserved(page);
 
 		/*
 		 * Mark the block movable so that blocks are reserved for
@@ -5980,15 +5978,6 @@ void __ref memmap_init_zone_device(struct zone *zone,
 
 		__init_single_page(page, pfn, zone_idx, nid);
 
-		/*
-		 * Mark page reserved as it will need to wait for onlining
-		 * phase for it to be fully associated with a zone.
-		 *
-		 * We can use the non-atomic __set_bit operation for setting
-		 * the flag as we are still initializing the pages.
-		 */
-		__SetPageReserved(page);
-
 		/*
 		 * ZONE_DEVICE pages union ->lru with a ->pgmap back pointer
 		 * and zone_device_data.  It is a bug if a ZONE_DEVICE page is
-- 
2.21.0