From: Mike Rapoport <rppt@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>,
"guanghui.fgh" <guanghuifeng@linux.alibaba.com>,
Ard Biesheuvel <ardb@kernel.org>,
baolin.wang@linux.alibaba.com, akpm@linux-foundation.org,
david@redhat.com, jianyong.wu@arm.com, james.morse@arm.com,
quic_qiancai@quicinc.com, christophe.leroy@csgroup.eu,
jonathan@marek.ca, mark.rutland@arm.com,
thunder.leizhen@huawei.com, anshuman.khandual@arm.com,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, geert+renesas@glider.be,
linux-mm@kvack.org, yaohongbo@linux.alibaba.com,
alikernel-developer@linux.alibaba.com
Subject: Re: [PATCH v4] arm64: mm: fix linear mem mapping access performance degradation
Date: Tue, 5 Jul 2022 23:45:40 +0300
Message-ID: <YsSi9HAOOzbPYN+w@kernel.org>
In-Reply-To: <YsRvPTORdvIwzShL@arm.com>
On Tue, Jul 05, 2022 at 06:05:01PM +0100, Catalin Marinas wrote:
> On Tue, Jul 05, 2022 at 06:57:53PM +0300, Mike Rapoport wrote:
> > On Tue, Jul 05, 2022 at 04:34:09PM +0100, Catalin Marinas wrote:
> > > On Tue, Jul 05, 2022 at 06:02:02PM +0300, Mike Rapoport wrote:
> > > > +void __init remap_crashkernel(void)
> > > > +{
> > > > +#ifdef CONFIG_KEXEC_CORE
> > > > + phys_addr_t start, end, size;
> > > > + phys_addr_t aligned_start, aligned_end;
> > > > +
> > > > + if (can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE))
> > > > + return;
> > > > +
> > > > + if (!crashk_res.end)
> > > > + return;
> > > > +
> > > > + start = crashk_res.start & PAGE_MASK;
> > > > + end = PAGE_ALIGN(crashk_res.end);
> > > > +
> > > > + aligned_start = ALIGN_DOWN(crashk_res.start, PUD_SIZE);
> > > > + aligned_end = ALIGN(end, PUD_SIZE);
> > > > +
> > > > + /* Clear PUDs containing crash kernel memory */
> > > > + unmap_hotplug_range(__phys_to_virt(aligned_start),
> > > > + __phys_to_virt(aligned_end), false, NULL);
> > >
> > > What I don't understand is what happens if there's valid kernel data
> > > between aligned_start and crashk_res.start (or the other end of the
> > > range).
> >
> > Data shouldn't go anywhere :)
> >
> > There is
> >
> > + /* map area from PUD start to start of crash kernel with large pages */
> > + size = start - aligned_start;
> > + __create_pgd_mapping(swapper_pg_dir, aligned_start,
> > + __phys_to_virt(aligned_start),
> > + size, PAGE_KERNEL, early_pgtable_alloc, 0);
> >
> > and
> >
> > + /* map area from end of crash kernel to PUD end with large pages */
> > + size = aligned_end - end;
> > + __create_pgd_mapping(swapper_pg_dir, end, __phys_to_virt(end),
> > + size, PAGE_KERNEL, early_pgtable_alloc, 0);
> >
> > after the unmap, so after we tear down a part of a linear map we
> > immediately recreate it, just with a different page size.
> >
> > This all happens before SMP, so there is no concurrency at that point.
>
> That brief period of unmap worries me. The kernel text, data and stack
> are all in the vmalloc space but any other (memblock) allocation to this
> point may be in the unmapped range before and after the crashkernel
> reservation. The interrupts are off, so I think the only allocation and
> potential access that may go in this range is the page table itself. But
> it looks fragile to me.
I agree there is a chance that an allocation will land in the unmapped
range.

We can make sure this won't happen, though. We can cap the memblock
allocations with memblock_set_current_limit(aligned_end) or
memblock_reserve(aligned_start, aligned_end) until the mappings are
restored.
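
To make the window concrete, here is a minimal userspace sketch of the
address arithmetic from the quoted patch. It is not kernel code. The
SKETCH_* constants assume 4 KiB pages and 1 GiB PUDs, and
crash_remap_window(), struct remap_window and window_size() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration: 4 KiB pages, 1 GiB PUDs. */
#define SKETCH_PAGE_SIZE (UINT64_C(1) << 12)
#define SKETCH_PUD_SIZE  (UINT64_C(1) << 30)

/* Round x down or up to a power-of-two boundary a. */
static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }
static uint64_t align_up(uint64_t x, uint64_t a) { return (x + a - 1) & ~(a - 1); }

/* Hypothetical container for the four boundaries used in the patch. */
struct remap_window {
	uint64_t aligned_start; /* PUD-aligned start of the unmapped window */
	uint64_t start;         /* page-aligned start of the crash kernel */
	uint64_t end;           /* page-aligned end of the crash kernel */
	uint64_t aligned_end;   /* PUD-aligned end of the unmapped window */
};

/* Mirrors the start/end/aligned_* computation in remap_crashkernel(). */
static struct remap_window crash_remap_window(uint64_t res_start, uint64_t res_end)
{
	struct remap_window w;

	assert(res_start <= res_end);
	w.start = align_down(res_start, SKETCH_PAGE_SIZE);
	w.end = align_up(res_end, SKETCH_PAGE_SIZE);
	w.aligned_start = align_down(res_start, SKETCH_PUD_SIZE);
	w.aligned_end = align_up(w.end, SKETCH_PUD_SIZE);
	return w;
}

/* Length of the whole window, for an interface that takes (base, size). */
static uint64_t window_size(const struct remap_window *w)
{
	return w->aligned_end - w->aligned_start;
}
```

The two gaps that hold ordinary memory while the PUDs are cleared are
[aligned_start, start) and [end, aligned_end). Since memblock_reserve()
takes a base and a size rather than a start and an end, the reservation
variant would pass something like window_size() as its second argument.
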
> --
> Catalin
--
Sincerely yours,
Mike.