Subject: Re: arm32: panic in move_freepages (Was [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid())
From: Kefeng Wang
To: Mike Rapoport
CC: David Hildenbrand, Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon
Date: Fri, 7 May 2021 20:34:52 +0800
Message-ID: <33c67e13-dc48-9a2f-46d8-a532e17380fb@huawei.com>
References: <6ad2956c-70ae-c423-ed7d-88e94c88060f@huawei.com> <0cb013e4-1157-f2fa-96ec-e69e60833f72@huawei.com> <24b37c01-fc75-d459-6e61-d67e8f0cf043@redhat.com> <82cfbb7f-dd4f-12d8-dc76-847f06172200@huawei.com>

On 2021/5/7 18:30, Mike Rapoport wrote:
> On Fri, May 07, 2021 at 03:17:08PM +0800, Kefeng Wang wrote:
>>
>> On 2021/5/6 20:47, Kefeng Wang wrote:
>>>
>>>
>>>>>>> no, the CONFIG_ARM_LPAE is not set, and yes with same panic at
>>>>>>> move_freepages at
>>>>>>>
>>>>>>> start_pfn/end_pfn [de600, de7ff], [de600000, de7ff000]
>>>>>>> :  pfn =de600, page =ef3cc000, page-flags = ffffffff,  pfn2phy = de600000
>>>>>>>
>>>>>>>>> __free_memory_core, range: 0xb0200000 - 0xc0000000, pfn: b0200 - b0200
>>>>>>>>> __free_memory_core, range: 0xcc000000 - 0xdca00000, pfn: cc000 - b0200
>>>>>>>>> __free_memory_core, range: 0xde700000 - 0xdea00000, pfn: de700 - b0200
>>>>>>
>>>>>> Hmm, [de600, de7ff] is not added to the free lists which is correct. But
>>>>>> then it's unclear how the page for de600 gets to move_freepages()...
>>>>>>
>>>>>> Can't say I have any bright ideas to try here...
>>>>>
>>>>> Are we missing some checks (e.g., PageReserved()) that pfn_valid_within()
>>>>> would have "caught" before?
>>>>
>>>> Unless I'm missing something the crash happens in __rmqueue_fallback():
>>>>
>>>> do_steal:
>>>>     page = get_page_from_free_area(area, fallback_mt);
>>>>
>>>>     steal_suitable_fallback(zone, page, alloc_flags, start_migratetype,
>>>>                             can_steal);
>>>>         -> move_freepages()
>>>>             -> BUG()
>>>>
>>>> So a page from free area should be sane as the freed range was never added
>>>> it to the free lists.
>>>
>>> Sorry for the late response due to the vacation.
>>>
>>> The pfn in range [de600, de7ff] won't be added into the free lists via
>>> __free_memory_core(), but the pfn could be added into freelists via
>>> free_highmem_page()
>>>
>>> I add some debug[1] in add_to_free_list(), we could see the calltrace
>>>
>>> free_highpages, range_pfn [b0200, c0000], range_addr [b0200000, c0000000]
>>> free_highpages, range_pfn [cc000, dca00], range_addr [cc000000, dca00000]
>>> free_highpages, range_pfn [de700, dea00], range_addr [de700000, dea00000]
>>> add_to_free_list, ===> pfn = de700
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:900 add_to_free_list+0x8c/0xec
>>> pfn = de700
>>> Modules linked in:
>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #48
>>> Hardware name: Hisilicon A9
>>> [] (show_stack) from [] (dump_stack+0x9c/0xc0)
>>> [] (dump_stack) from [] (__warn+0xc0/0xec)
>>> [] (__warn) from [] (warn_slowpath_fmt+0x74/0xa4)
>>> [] (warn_slowpath_fmt) from [] (add_to_free_list+0x8c/0xec)
>>> [] (add_to_free_list) from [] (free_pcppages_bulk+0x200/0x278)
>>> [] (free_pcppages_bulk) from [] (free_unref_page+0x58/0x68)
>>> [] (free_unref_page) from [] (free_highmem_page+0xc/0x50)
>>> [] (free_highmem_page) from [] (mem_init+0x21c/0x254)
>>> [] (mem_init) from [] (start_kernel+0x258/0x5c0)
>>> [] (start_kernel) from [<00000000>] (0x0)
>>>
>>> so any idea?
>>
>> If pfn = 0xde700, due to the pageblock_nr_pages = 0x200, then the
>> start_pfn,end_pfn passed to move_freepages() will be [de600, de7ff],
>> but the range of [de600,de700] without 'struct page' will lead to
>> this panic when pfn_valid_within not enabled if no HOLES_IN_ZONE,
>> and the same issue will occur in isolate_freepages_block(), maybe
>
> I think your analysis is correct except one minor detail. With the #ifdef
> fix I've proposed earlier [1] the memmap for [0xde600, 0xde700] should not
> be freed so there should be a struct page.
> Did you check what parts of the
> memmap are actually freed with this patch applied?
> Would you get a panic if you add
>
> dump_page(pfn_to_page(0xde600), "");
>
> say, in the end of memblock_free_all()?

The memory is not contiguous, see MEMBLOCK:

 memory size = 0x4c0fffff reserved size = 0x027ef058
 memory.cnt = 0xa
 memory[0x0] [0x80a00000-0x855fffff], 0x04c00000 bytes flags: 0x0
 memory[0x1] [0x86a00000-0x87dfffff], 0x01400000 bytes flags: 0x0
 memory[0x2] [0x8bd00000-0x8c4fffff], 0x00800000 bytes flags: 0x0
 memory[0x3] [0x8e300000-0x8ecfffff], 0x00a00000 bytes flags: 0x0
 memory[0x4] [0x90d00000-0xbfffffff], 0x2f300000 bytes flags: 0x0
 memory[0x5] [0xcc000000-0xdc9fffff], 0x10a00000 bytes flags: 0x0
 memory[0x6] [0xde700000-0xde9fffff], 0x00300000 bytes flags: 0x0
 ...

The pfn_range [0xde600,0xde700] => addr_range [0xde600000,0xde700000]
is not available memory, so no memmap is created for it. With or without
your patch, we therefore won't see this range in free_memmap(), right?

>
>> there are some scene, so I select HOLES_IN_ZONE in ARCH_HISI(ARM) to solve
>> this issue in our 5.10, should we select HOLES_IN_ZONE in all ARM or only in
>> ARCH_HISI, any better solution? Thanks.
>
> I don't think that HOLES_IN_ZONE is the right solution. I believe that we
> must keep the memory map aligned on pageblock boundaries. That's surely not the
> case for SPARSEMEM as of now, and if my fix is not enough we need to find
> where it went wrong.
>
> Besides, I'd say that if it is possible to update your firmware to make the
> memory layout reported to the kernel less, hmm, esoteric, you would hit
> less corner cases.

Sorry, the memory layout is customized and we can't change it; some of the
memory is used for special purposes in our product.

>
> [1] https://lore.kernel.org/lkml/YIpY8TXCSc7Lfa2Z@kernel.org
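
For reference, the pageblock arithmetic discussed above can be checked with a
small user-space C sketch. It is not kernel code. It assumes 4 KiB pages
(which the pfn-to-address pairs in the logs imply), takes pageblock_nr_pages =
0x200 from the thread, and copies only the seven memblock regions shown in the
dump (the remaining entries are elided there; since memblock keeps regions
sorted by address, they should lie above 0xde9fffff). The names mem_range,
pfn_in_memory() and pageblock_hole_pages() are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SHIFT         12       /* assumption: 4 KiB pages (pfn de600 <-> addr de600000) */
#define PAGEBLOCK_NR_PAGES 0x200UL  /* value quoted in the thread */

/* Hypothetical type for this sketch; 'end' is an inclusive physical address. */
struct mem_range {
	unsigned long base;
	unsigned long end;
};

/* memory[0x0]..memory[0x6] copied from the dump; later entries are elided there. */
static const struct mem_range memory[] = {
	{ 0x80a00000UL, 0x855fffffUL },
	{ 0x86a00000UL, 0x87dfffffUL },
	{ 0x8bd00000UL, 0x8c4fffffUL },
	{ 0x8e300000UL, 0x8ecfffffUL },
	{ 0x90d00000UL, 0xbfffffffUL },
	{ 0xcc000000UL, 0xdc9fffffUL },
	{ 0xde700000UL, 0xde9fffffUL },
};

/* First pfn of the pageblock that contains 'pfn'. */
static unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* Last pfn (inclusive) of the pageblock that contains 'pfn'. */
static unsigned long pageblock_end(unsigned long pfn)
{
	return pageblock_start(pfn) + PAGEBLOCK_NR_PAGES - 1;
}

/* True if the page frame lies inside one of the listed memory regions. */
static bool pfn_in_memory(unsigned long pfn)
{
	unsigned long addr = pfn << PAGE_SHIFT;
	size_t i;

	for (i = 0; i < sizeof(memory) / sizeof(memory[0]); i++)
		if (addr >= memory[i].base && addr <= memory[i].end)
			return true;
	return false;
}

/* Number of pfns in the pageblock of 'pfn' that no listed region covers. */
static unsigned long pageblock_hole_pages(unsigned long pfn)
{
	unsigned long p, holes = 0;

	for (p = pageblock_start(pfn); p <= pageblock_end(pfn); p++)
		if (!pfn_in_memory(p))
			holes++;
	return holes;
}

#ifdef DEMO_MAIN
int main(void)
{
	unsigned long pfn = 0xde700UL;

	assert(pageblock_start(pfn) == 0xde600UL);
	printf("pageblock of pfn %lx: [%lx, %lx], pfns without memory: 0x%lx\n",
	       pfn, pageblock_start(pfn), pageblock_end(pfn),
	       pageblock_hole_pages(pfn));
	return 0;
}
#endif
```

Built with -DDEMO_MAIN, this reports the block [de600, de7ff] for pfn de700,
with 0x100 pfns (de600 through de6ff) not covered by any listed region. That
is the part of the pageblock that move_freepages() would walk without backing
memory in the layout above.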