Date: Wed, 16 Mar 2022 20:11:46 +0800
From: Baoquan He
To: Zhen Lei
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
 "H . Peter Anvin", linux-kernel@vger.kernel.org, Dave Young, Vivek Goyal,
 Eric Biederman, kexec@lists.infradead.org, Catalin Marinas, Will Deacon,
 linux-arm-kernel@lists.infradead.org, Rob Herring, Frank Rowand,
 devicetree@vger.kernel.org, Jonathan Corbet, linux-doc@vger.kernel.org,
 Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, John Donnelly,
 Dave Kleikamp
Subject: Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
References: <20220227030717.1464-1-thunder.leizhen@huawei.com>
 <20220227030717.1464-4-thunder.leizhen@huawei.com>
In-Reply-To: <20220227030717.1464-4-thunder.leizhen@huawei.com>

On 02/27/22 at 11:07am, Zhen Lei wrote:
> From: Chen Zhou
>
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
>
> To solve these issues, change the behavior of crashkernel=X and
> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
> in DMA zone, and fall back to high allocation if it fails.
> We can also use "crashkernel=X,high" to select a region above DMA zone,
> which also tries to allocate at least 256M in DMA zone automatically.
> "crashkernel=Y,low" can be used to allocate specified size low memory.
>
> Signed-off-by: Chen Zhou
> Co-developed-by: Zhen Lei
> Signed-off-by: Zhen Lei
> ---
>  arch/arm64/kernel/machine_kexec.c      |   9 ++-
>  arch/arm64/kernel/machine_kexec_file.c |  12 ++-
>  arch/arm64/mm/init.c                   | 106 +++++++++++++++++++++++--
>  3 files changed, 115 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index e16b248699d5c3c..19c2d487cb08feb 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
>
>  	/* in reserved memory? */
>  	addr = __pfn_to_phys(pfn);
> -	if ((addr < crashk_res.start) || (crashk_res.end < addr))
> -		return false;
> +	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
> +		if (!crashk_low_res.end)
> +			return false;
> +
> +		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
> +			return false;
> +	}
>
>  	if (!kexec_crash_image)
>  		return true;
> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index 59c648d51848886..889951291cc0f9c 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>
>  	/* Exclude crashkernel region */
>  	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> +	if (ret)
> +		goto out;
> +
> +	if (crashk_low_res.end) {
> +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> +		if (ret)
> +			goto out;
> +	}
>
> -	if (!ret)
> -		ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
> +	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
>
> +out:
>  	kfree(cmem);
>  	return ret;
>  }
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 90f276d46b93bc6..30ae6638ff54c47 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -65,6 +65,44 @@ EXPORT_SYMBOL(memstart_addr);
>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>
>  #ifdef CONFIG_KEXEC_CORE
> +/* Current arm64 boot protocol requires 2MB alignment */
> +#define CRASH_ALIGN SZ_2M
> +
> +#define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
> +#define CRASH_ADDR_HIGH_MAX memblock.current_limit
> +
> +/*
> + * This is an empirical value in x86_64 and taken here directly. Please
> + * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
> + * details.
> + */
> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE \
> +	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
> +
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +	unsigned long long low_base;
> +
> +	/* passed with crashkernel=0,low ? */
> +	if (!low_size)
> +		return 0;
> +
> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> +	if (!low_base) {
> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
> +		low_base, low_base + low_size, low_size >> 20);
> +
> +	crashk_low_res.start = low_base;
> +	crashk_low_res.end = low_base + low_size - 1;
> +	insert_resource(&iomem_resource, &crashk_low_res);
> +
> +	return 0;
> +}
> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -75,30 +113,79 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_max = arm64_dma_phys_limit;
> +	unsigned long long crash_low_size;
> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>  	int ret;

Even though reverse xmas tree style is not enforced, this 'int ret;' is
really annoying to look at. Maybe move it down two lines.

> +	bool fixed_base, high = false;
> +	char *cmdline = boot_command_line;
>
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	/* crashkernel=X[@offset] */
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
> -	/* no crashkernel= or invalid value specified */
> -	if (ret || !crash_size)
> -		return;
> +	if (ret || !crash_size) {
> +		/* crashkernel=X,high */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/* crashkernel=Y,low */
> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> +		if (ret == -ENOENT)
> +			/*
> +			 * crashkernel=Y,low is not specified explicitly, use
> +			 * default size automatically.
> +			 */
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +		else if (ret)
> +			/* crashkernel=Y,low is specified but Y is invalid */
> +			return;
> +
> +		/* Mark crashkernel=X,high is specified */
> +		high = true;
> +		crash_max = CRASH_ADDR_HIGH_MAX;
> +	}
>
> +	fixed_base = !!crash_base;
>  	crash_size = PAGE_ALIGN(crash_size);
>
>  	/* User specifies base address explicitly. */

This is over commenting, can't see why it's needed.

> -	if (crash_base)
> +	if (fixed_base)
>  		crash_max = crash_base + crash_size;

Hi leizhen,

I made some changes to reserve_crashkernel(), since inline comments may be
slow. Please check and consider whether they can be taken.

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 30ae6638ff54..f96351da1e3e 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
  * This function reserves memory area given in "crashkernel=" kernel command
  * line parameter. The memory reserved is used by dump capture kernel when
  * primary kernel is crashing.
+ *
+ * NOTE: Reservation of crashkernel,low is special since its existence
+ * is not independent, need rely on the existence of crashkernel,high.
+ * Hence there are different cases for crashkernel,low reservation:
+ * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
+ * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
+ *    take the default crashkernel,low value;
+ * 3) crashkernel=X is specified, while fallback to get a memory region
+ *    in high memory, take the default crashkernel,low value;
+ * 4) crashkernel='invalid value',low is specified, failed the whole
+ *    crashkernel reservation and bail out.
  */
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
 	unsigned long long crash_low_size;
 	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
-	int ret;
 	bool fixed_base, high = false;
 	char *cmdline = boot_command_line;
+	int ret;

 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
 	if (ret || !crash_size) {
-		/* crashkernel=X,high */
 		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
 		if (ret || !crash_size)
 			return;

-		/* crashkernel=Y,low */
 		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
 		if (ret == -ENOENT)
-			/*
-			 * crashkernel=Y,low is not specified explicitly, use
-			 * default size automatically.
-			 */
+			/* case #2 of crashkernel,low reservation */
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
 		else if (ret)
-			/* crashkernel=Y,low is specified but Y is invalid */
+			/* case #4 of crashkernel,low reservation */
 			return;

-		/* Mark crashkernel=X,high is specified */
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
 	}
@@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
 	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);

-	/* User specifies base address explicitly. */
 	if (fixed_base)
 		crash_max = crash_base + crash_size;

@@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
 	}

 	if (crash_base >= SZ_4G) {
-		/*
-		 * For case crashkernel=X, low memory is not enough and fall
-		 * back to reserve specified size of memory above 4G, try to
-		 * allocate minimum required memory below 4G again.
-		 */
+		/* case #3 of crashkernel,low reservation */
 		if (!high)
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;

>
> -	/* Current arm64 boot protocol requires 2MB alignment */
> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> +retry:
> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>  					       crash_base, crash_max);
>  	if (!crash_base) {
> +		/*
> +		 * Attempt to fully allocate low memory failed, fall back
> +		 * to high memory, the minimum required low memory will be
> +		 * reserved later.
> +		 */
> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +			crash_max = CRASH_ADDR_HIGH_MAX;
> +			goto retry;
> +		}
> +
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
>  		return;
>  	}
>
> +	if (crash_base >= SZ_4G) {
> +		/*
> +		 * For case crashkernel=X, low memory is not enough and fall
> +		 * back to reserve specified size of memory above 4G, try to
> +		 * allocate minimum required memory below 4G again.
> +		 */
> +		if (!high)
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +
> +		if (reserve_crashkernel_low(crash_low_size)) {
> +			memblock_phys_free(crash_base, crash_size);
> +			return;
> +		}
> +	}
> +
>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>  		crash_base, crash_base + crash_size, crash_size >> 20);
>
> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>  	 * map. Inform kmemleak so that it won't try to access it.
>  	 */
>  	kmemleak_ignore_phys(crash_base);
> +	if (crashk_low_res.end)
> +		kmemleak_ignore_phys(crashk_low_res.start);
> +
>  	crashk_res.start = crash_base;
>  	crashk_res.end = crash_base + crash_size - 1;
>  	insert_resource(&iomem_resource, &crashk_res);
> --
> 2.25.1
>
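
For reference, the four crashkernel,low cases listed in the NOTE comment of
the proposed diff can be modeled in plain userspace C. This is only a sketch
under assumptions, not kernel code: model_low_size() and the MODEL_* macros
are made-up names, the default is simplified to the 256 MB floor of
DEFAULT_CRASH_KERNEL_LOW_SIZE (the swiotlb term is left out), and the result
of parsing crashkernel=Y,low is passed in as an int rather than obtained from
parse_crashkernel_low().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real default also accounts for swiotlb. */
#define MODEL_DEFAULT_LOW_SIZE	(256ULL << 20)
#define MODEL_SZ_4G		(1ULL << 32)

/*
 * Model of how the crashkernel,low size is chosen.
 *
 * high_given: crashkernel=X,high was parsed successfully
 * low_ret:    modeled result of parsing crashkernel=Y,low
 *             (0, -ENOENT, or another negative errno); only consulted
 *             when high_given is true, as in the patch
 * low_size:   Y, meaningful only when low_ret == 0
 * crash_base: physical base where the main region was allocated
 *
 * Returns 0 and stores the low size to reserve in *out (0 means no
 * separate low reservation), or a negative errno when the whole
 * reservation should bail out.
 */
static int model_low_size(bool high_given, int low_ret,
			  unsigned long long low_size,
			  unsigned long long crash_base,
			  unsigned long long *out)
{
	assert(out != NULL);
	*out = 0;

	if (high_given) {
		if (low_ret == -ENOENT)
			low_size = MODEL_DEFAULT_LOW_SIZE;	/* case 2 */
		else if (low_ret)
			return low_ret;				/* case 4 */
		/* otherwise case 1: keep the explicit Y */
	}

	/* Main region already sits below 4G: nothing more to reserve. */
	if (crash_base < MODEL_SZ_4G)
		return 0;

	/* crashkernel=X fell back to high memory. */
	if (!high_given)
		low_size = MODEL_DEFAULT_LOW_SIZE;		/* case 3 */

	*out = low_size;
	return 0;
}
```

For example, with only crashkernel=X given and the main region landing at 5G,
the model returns 0 and sets the low size to the default (case 3). With the
main region below 4G it leaves the low size at 0, meaning no separate low
reservation is made.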