Date: Sun, 9 May 2021 08:59:29 +0300
From: Mike Rapoport
To: Kefeng Wang
Cc: David Hildenbrand, linux-arm-kernel@lists.infradead.org, Andrew Morton,
    Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier,
    Mark Rutland, Mike Rapoport, Will Deacon, kvmarm@lists.cs.columbia.edu,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: arm32: panic in move_freepages (Was [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid())
References: <0cb013e4-1157-f2fa-96ec-e69e60833f72@huawei.com>
    <24b37c01-fc75-d459-6e61-d67e8f0cf043@redhat.com>
    <82cfbb7f-dd4f-12d8-dc76-847f06172200@huawei.com>
    <33c67e13-dc48-9a2f-46d8-a532e17380fb@huawei.com>
In-Reply-To: <33c67e13-dc48-9a2f-46d8-a532e17380fb@huawei.com>

On Fri, May 07, 2021 at 08:34:52PM +0800, Kefeng Wang wrote:
>
>
> On 2021/5/7 18:30, Mike Rapoport wrote:
> > On Fri, May 07, 2021 at 03:17:08PM +0800, Kefeng Wang wrote:
> > >
> > > On 2021/5/6 20:47, Kefeng Wang wrote:
> > > >
> > > > > > > > no, the CONFIG_ARM_LPAE is not set, and yes with same panic at
> > > > > > > > move_freepages at
> > > > > > > >
> > > > > > > > start_pfn/end_pfn [de600, de7ff], [de600000, de7ff000]
> > > > > > > > :  pfn =de600, page =ef3cc000, page-flags = ffffffff,  pfn2phy = de600000
> > > > > > > >
> > > > > > > > > > __free_memory_core, range: 0xb0200000 - 0xc0000000, pfn: b0200 - b0200
> > > > > > > > > > __free_memory_core, range: 0xcc000000 - 0xdca00000, pfn: cc000 - b0200
> > > > > > > > > > __free_memory_core, range: 0xde700000 - 0xdea00000, pfn: de700 - b0200
> > > > > > >
> > > > > > > Hmm, [de600, de7ff] is not added to the free lists which is correct. But
> > > > > > > then it's unclear how the page for de600 gets to move_freepages()...
> > > > > > >
> > > > > > > Can't say I have any bright ideas to try here...
> > > > > >
> > > > > > Are we missing some checks (e.g., PageReserved()) that pfn_valid_within()
> > > > > > would have "caught" before?
> > > > >
> > > > > Unless I'm missing something the crash happens in __rmqueue_fallback():
> > > > >
> > > > > do_steal:
> > > > >     page = get_page_from_free_area(area, fallback_mt);
> > > > >
> > > > >     steal_suitable_fallback(zone, page, alloc_flags, start_migratetype,
> > > > >                             can_steal);
> > > > >         -> move_freepages()
> > > > >             -> BUG()
> > > > >
> > > > > So a page from free area should be sane as the freed range was never
> > > > > added to the free lists.
> > > >
> > > > Sorry for the late response due to the vacation.
> > > >
> > > > The pfn in range [de600, de7ff] won't be added into the free lists via
> > > > __free_memory_core(), but the pfn could be added into freelists via
> > > > free_highmem_page()
> > > >
> > > > I add some debug[1] in add_to_free_list(), we could see the calltrace
> > > >
> > > > free_highpages, range_pfn [b0200, c0000], range_addr [b0200000, c0000000]
> > > > free_highpages, range_pfn [cc000, dca00], range_addr [cc000000, dca00000]
> > > > free_highpages, range_pfn [de700, dea00], range_addr [de700000, dea00000]
> > > > add_to_free_list, ===> pfn = de700
> > > > ------------[ cut here ]------------
> > > > WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:900 add_to_free_list+0x8c/0xec
> > > > pfn = de700
> > > > Modules linked in:
> > > > CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #48
> > > > Hardware name: Hisilicon A9
> > > > [] (show_stack) from [] (dump_stack+0x9c/0xc0)
> > > > [] (dump_stack) from [] (__warn+0xc0/0xec)
> > > > [] (__warn) from [] (warn_slowpath_fmt+0x74/0xa4)
> > > > [] (warn_slowpath_fmt) from [] (add_to_free_list+0x8c/0xec)
> > > > [] (add_to_free_list) from [] (free_pcppages_bulk+0x200/0x278)
> > > > [] (free_pcppages_bulk) from [] (free_unref_page+0x58/0x68)
> > > > [] (free_unref_page) from [] (free_highmem_page+0xc/0x50)
> > > > [] (free_highmem_page) from [] (mem_init+0x21c/0x254)
> > > > [] (mem_init) from [] (start_kernel+0x258/0x5c0)
> > > > [] (start_kernel) from [<00000000>] (0x0)
> > > >
> > > > so any idea?
> > >
> > > If pfn = 0xde700, due to the pageblock_nr_pages = 0x200, then the
> > > start_pfn,end_pfn passed to move_freepages() will be [de600, de7ff],
> > > but the range of [de600,de700] without 'struct page' will lead to
> > > this panic when pfn_valid_within not enabled if no HOLES_IN_ZONE,
> > > and the same issue will occur in isolate_freepages_block(), maybe
> >
> > I think your analysis is correct except one minor detail. With the #ifdef
> > fix I've proposed earlier [1] the memmap for [0xde600, 0xde700] should not
> > be freed so there should be a struct page. Did you check what parts of the
> > memmap are actually freed with this patch applied?
> > Would you get a panic if you add
> >
> > dump_page(pfn_to_page(0xde600), "");
> >
> > say, in the end of memblock_free_all()?
>
> The memory is not continuous, see MEMBLOCK:
> memory size = 0x4c0fffff reserved size = 0x027ef058
> memory.cnt = 0xa
> memory[0x0] [0x80a00000-0x855fffff], 0x04c00000 bytes flags: 0x0
> memory[0x1] [0x86a00000-0x87dfffff], 0x01400000 bytes flags: 0x0
> memory[0x2] [0x8bd00000-0x8c4fffff], 0x00800000 bytes flags: 0x0
> memory[0x3] [0x8e300000-0x8ecfffff], 0x00a00000 bytes flags: 0x0
> memory[0x4] [0x90d00000-0xbfffffff], 0x2f300000 bytes flags: 0x0
> memory[0x5] [0xcc000000-0xdc9fffff], 0x10a00000 bytes flags: 0x0
> memory[0x6] [0xde700000-0xde9fffff], 0x00300000 bytes flags: 0x0
> ...
>
> The pfn_range [0xde600,0xde700] => addr_range [0xde600000,0xde700000]
> is not available memory, and we won't create memmap, so with or without
> your patch, we can't see the range in free_memmap(), right?

This is not available memory and we won't see the range in free_memmap(),
but we still should create memmap for it and that's what my patch tried to
do.

There are a lot of places in core mm that operate on pageblocks and
free_unused_memmap() should make sure that any pageblock has a valid memory
map.

Currently, that's not the case when SPARSEMEM=y and my patch tried to fix
it.

Can you please send a log with my patch applied and with the printing of
ranges that are freed in free_unused_memmap() you've used in previous
mails?

> > > there are some scene, so I select HOLES_IN_ZONE in ARCH_HISI(ARM) to solve
> > > this issue in our 5.10, should we select HOLES_IN_ZONE in all ARM or only in
> > > ARCH_HISI, any better solution? Thanks.
> >
> > I don't think that HOLES_IN_ZONE is the right solution. I believe that we
> > must keep the memory map aligned on pageblock boundaries. That's surely not the
> > case for SPARSEMEM as of now, and if my fix is not enough we need to find
> > where it went wrong.
> >
> > Besides, I'd say that if it is possible to update your firmware to make the
> > memory layout reported to the kernel less, hmm, esoteric, you would hit
> > less corner cases.
>
> Sorry, memory layout is customized and we can't change it, some memory is
> for special purposes by our production.

I understand that this memory cannot be used by Linux, but the firmware may
supply the kernel with the actual physical memory layout and then mark all
the special purpose memory that the kernel should not touch as reserved.

> > [1] https://lore.kernel.org/lkml/YIpY8TXCSc7Lfa2Z@kernel.org
> >

--
Sincerely yours,
Mike.
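
For reference, the pageblock arithmetic discussed in this thread can be
checked with a small userspace sketch. This is not kernel code: the demo_*
helpers are hypothetical names made up for illustration, and the sketch
assumes 4 KiB pages (PAGE_SHIFT of 12) together with the
pageblock_nr_pages = 0x200 value quoted above. It only shows why a bank
starting at pfn 0xde700 makes a pageblock-granular walk cover
[0xde600, 0xde7ff], including the hole below 0xde700.

```c
#include <stdbool.h>

/* Assumptions for illustration only: 4 KiB pages, 0x200 pages per pageblock. */
#define DEMO_PAGE_SHIFT          12
#define DEMO_PAGEBLOCK_NR_PAGES  0x200UL

/* First pfn of the pageblock that contains 'pfn' (round down to the block size). */
static inline unsigned long demo_pageblock_start(unsigned long pfn)
{
	return pfn & ~(DEMO_PAGEBLOCK_NR_PAGES - 1);
}

/* Last pfn of the pageblock that contains 'pfn'. */
static inline unsigned long demo_pageblock_end(unsigned long pfn)
{
	return demo_pageblock_start(pfn) + DEMO_PAGEBLOCK_NR_PAGES - 1;
}

/* Physical address of the first byte of page frame 'pfn'. */
static inline unsigned long long demo_pfn_to_phys(unsigned long pfn)
{
	return (unsigned long long)pfn << DEMO_PAGE_SHIFT;
}

/*
 * True if a memory bank beginning at 'bank_start_pfn' starts in the middle
 * of a pageblock, i.e. the pages below it in the same pageblock are a hole
 * that still needs a valid memory map for pageblock-wide walks.
 */
static inline bool demo_bank_starts_mid_pageblock(unsigned long bank_start_pfn)
{
	return demo_pageblock_start(bank_start_pfn) != bank_start_pfn;
}
```

With the layout from the MEMBLOCK dump, demo_pageblock_start(0xde700) is
0xde600 and demo_pageblock_end(0xde700) is 0xde7ff, which matches the
start_pfn/end_pfn pair reported at the panic. The bank at 0xcc000000 starts
on a pageblock boundary, so it does not have this problem.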