Date: Sun, 1 Aug 2021 08:53:13 -0700
From: Catalin Marinas
To: Kefeng Wang
Cc: Will Deacon, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	kasan-dev@googlegroups.com, linux-mm@kvack.org, Greg Kroah-Hartman
Subject: Re: [PATCH v2 2/3] arm64: Support page mapping percpu first chunk allocator
Message-ID: <20210801155302.GA29188@arm.com>
References: <20210720025105.103680-1-wangkefeng.wang@huawei.com>
	<20210720025105.103680-3-wangkefeng.wang@huawei.com>
In-Reply-To: <20210720025105.103680-3-wangkefeng.wang@huawei.com>

On Tue, Jul 20, 2021 at 10:51:04AM +0800, Kefeng Wang wrote:
> Percpu embedded first chunk allocator is the firstly option, but it
> could fails on ARM64, eg,
>   "percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
>   "percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
>   "percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"
>
> then we could meet "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838",
> even the system could not boot successfully.
>
> Let's implement page mapping percpu first chunk allocator as a fallback
> to the embedding allocator to increase the robustness of the system.

It looks like x86, powerpc and sparc implement their own
setup_per_cpu_areas(). I had a quick look for commonalities, but I think
it would be a lot more hassle to make a generic version out of them
(powerpc looks the simplest though). I think we could add a generic
variant with the arm64 support and later migrate other architectures to
it if possible.

The patch looks ok to me otherwise, but I'd need an ack from Greg as it
touches drivers/.

BTW, do we need something similar for the non-NUMA
setup_per_cpu_areas()? I can see this patch only enables
NEED_PER_CPU_PAGE_FIRST_CHUNK if NUMA.

Leaving the rest of the patch below for Greg.

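For anyone reading the quoted "max_distance ... too large for vmalloc space" messages: a rough user-space sketch of the comparison that appears to trigger them. It assumes the embed allocator rejects the layout once the spread between the furthest per-CPU units exceeds 3/4 of the vmalloc space; that threshold is an assumption about mm/percpu.c, not something stated in this thread, and embed_distance_too_large() is a hypothetical helper rather than a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function.
 *
 * Returns true when max_distance is larger than 3/4 of vmalloc_total.
 * The 3/4 threshold is an assumption about what mm/percpu.c checks.
 * Inputs are expected to be well below 2^62 so that "* 3" cannot
 * overflow; the values quoted in this thread are below 2^47.
 */
static bool embed_distance_too_large(uint64_t max_distance,
				     uint64_t vmalloc_total)
{
	/* A zero-sized vmalloc space would make the comparison meaningless. */
	assert(vmalloc_total != 0);

	return max_distance > vmalloc_total * 3 / 4;
}
```

Under that assumption, all three value pairs from the quoted log are over the limit, for example embed_distance_too_large(0x5fcfdc640000, 0x781fefff0000) evaluates to true, and the third pair exceeds the whole vmalloc space, not just 3/4 of it.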
> Signed-off-by: Kefeng Wang
> ---
>  arch/arm64/Kconfig       |  4 ++
>  drivers/base/arch_numa.c | 82 +++++++++++++++++++++++++++++++++++-----
>  2 files changed, 76 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b5b13a932561..eacb5873ded1 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1045,6 +1045,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>  	def_bool y
>  	depends on NUMA
>
> +config NEED_PER_CPU_PAGE_FIRST_CHUNK
> +	def_bool y
> +	depends on NUMA
> +
>  source "kernel/Kconfig.hz"
>
>  config ARCH_SPARSEMEM_ENABLE
> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index 4cc4e117727d..563b2013b75a 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -14,6 +14,7 @@
>  #include
>
>  #include
> +#include
>
>  struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
>  EXPORT_SYMBOL(node_data);
> @@ -168,22 +169,83 @@ static void __init pcpu_fc_free(void *ptr, size_t size)
>  	memblock_free_early(__pa(ptr), size);
>  }
>
> +#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
> +static void __init pcpu_populate_pte(unsigned long addr)
> +{
> +	pgd_t *pgd = pgd_offset_k(addr);
> +	p4d_t *p4d;
> +	pud_t *pud;
> +	pmd_t *pmd;
> +
> +	p4d = p4d_offset(pgd, addr);
> +	if (p4d_none(*p4d)) {
> +		pud_t *new;
> +
> +		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> +		if (!new)
> +			goto err_alloc;
> +		p4d_populate(&init_mm, p4d, new);
> +	}
> +
> +	pud = pud_offset(p4d, addr);
> +	if (pud_none(*pud)) {
> +		pmd_t *new;
> +
> +		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> +		if (!new)
> +			goto err_alloc;
> +		pud_populate(&init_mm, pud, new);
> +	}
> +
> +	pmd = pmd_offset(pud, addr);
> +	if (!pmd_present(*pmd)) {
> +		pte_t *new;
> +
> +		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> +		if (!new)
> +			goto err_alloc;
> +		pmd_populate_kernel(&init_mm, pmd, new);
> +	}
> +
> +	return;
> +
> +err_alloc:
> +	panic("%s: Failed to allocate %lu bytes align=%lx from=%lx\n",
> +	      __func__, PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
> +}
> +#endif
> +
>  void __init setup_per_cpu_areas(void)
>  {
>  	unsigned long delta;
>  	unsigned int cpu;
> -	int rc;
> +	int rc = -EINVAL;
> +
> +	if (pcpu_chosen_fc != PCPU_FC_PAGE) {
> +		/*
> +		 * Always reserve area for module percpu variables. That's
> +		 * what the legacy allocator did.
> +		 */
> +		rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
> +					    PERCPU_DYNAMIC_RESERVE, PAGE_SIZE,
> +					    pcpu_cpu_distance,
> +					    pcpu_fc_alloc, pcpu_fc_free);
> +#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
> +		if (rc < 0)
> +			pr_warn("PERCPU: %s allocator failed (%d), falling back to page size\n",
> +				pcpu_fc_names[pcpu_chosen_fc], rc);
> +#endif
> +	}
>
> -	/*
> -	 * Always reserve area for module percpu variables. That's
> -	 * what the legacy allocator did.
> -	 */
> -	rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
> -				    PERCPU_DYNAMIC_RESERVE, PAGE_SIZE,
> -				    pcpu_cpu_distance,
> -				    pcpu_fc_alloc, pcpu_fc_free);
> +#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
> +	if (rc < 0)
> +		rc = pcpu_page_first_chunk(PERCPU_MODULE_RESERVE,
> +					   pcpu_fc_alloc,
> +					   pcpu_fc_free,
> +					   pcpu_populate_pte);
> +#endif
>  	if (rc < 0)
> -		panic("Failed to initialize percpu areas.");
> +		panic("Failed to initialize percpu areas (err=%d).", rc);
>
>  	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
>  	for_each_possible_cpu(cpu)
> --
> 2.26.2
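
The fallback order in the setup_per_cpu_areas() hunk can be mimicked in user space to see which return code would reach the final panic() check. This is only a sketch of the control flow: the enum, the function-pointer type and the stub allocators are hypothetical stand-ins, and passing a NULL page allocator is used here to model a build without CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for pcpu_chosen_fc; not the kernel's enum. */
enum fc_choice { FC_AUTO, FC_EMBED, FC_PAGE };

/* Hypothetical allocator type: 0 on success, negative errno on failure. */
typedef int (*first_chunk_fn)(void);

/*
 * Follows the order in the patched function: try the embed allocator
 * unless the page allocator was explicitly chosen, then fall back to
 * the page allocator if one is available. page == NULL models a build
 * without CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK. A negative return is
 * where the kernel code would panic.
 */
static int setup_first_chunk(enum fc_choice chosen, first_chunk_fn embed,
			     first_chunk_fn page)
{
	int rc = -EINVAL;

	assert(embed != NULL);

	if (chosen != FC_PAGE) {
		rc = embed();
		if (rc < 0 && page)
			fprintf(stderr,
				"embed allocator failed (%d), falling back to page size\n",
				rc);
	}

	if (rc < 0 && page)
		rc = page();

	return rc;
}

/* Stub allocators used only to exercise the sketch. */
static int stub_ok(void)   { return 0; }
static int stub_fail(void) { return -ENOMEM; }
```

One case worth noting in this model: with FC_PAGE chosen and no page allocator available, neither allocator runs and rc stays at its initial -EINVAL, so setup_first_chunk(FC_PAGE, stub_ok, NULL) returns -EINVAL.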