From: Qian Cai <quic_qiancai@quicinc.com>
To: Jianyong Wu <jianyong.wu@arm.com>
Cc: <catalin.marinas@arm.com>, <will@kernel.org>,
<anshuman.khandual@arm.com>, <akpm@linux-foundation.org>,
<ardb@kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>, <david@redhat.com>,
<gshan@redhat.com>, <justin.he@arm.com>, <nd@arm.com>
Subject: Re: [PATCH v2] arm64/mm: avoid fixmap race condition when create pud mapping
Date: Wed, 15 Dec 2021 09:13:37 -0500
Message-ID: <Ybn4EfweLqKtyW0+@fixkernel.com>
In-Reply-To: <20211210095432.51798-1-jianyong.wu@arm.com>
On Fri, Dec 10, 2021 at 05:54:32PM +0800, Jianyong Wu wrote:
> fixmap is a global resource and is used recursively when creating the pud
> mapping. It may lead to a race condition when alloc_init_pud is called
> concurrently.
>
> For example:
> alloc_init_pud is called during kernel_init. If a memory hotplug
> thread, which also calls alloc_init_pud, runs during
> kernel_init, a race for the fixmap occurs.
>
> The race condition flow can be:
>
> *************** begin **************
>
> kernel_init thread                    virtio-mem workqueue thread
> ==================                    ===========================
> alloc_init_pud(...)
>   pudp = pud_set_fixmap_offset(..)    alloc_init_pud(...)
>   ...                                 ...
>   READ_ONCE(*pudp) //OK!                pudp = pud_set_fixmap_offset(
>   ...                                 ...
>                                       pud_clear_fixmap() //fixmap break
>   READ_ONCE(*pudp) //CRASH!
>
> **************** end ***************
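
As a rough illustration of the interleaving in the diagram above, here is a
single-threaded userspace model, not kernel code. It assumes the PUD fixmap
can be treated as one shared slot; every model_* name and replay_race() are
hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one global slot stands in for the PUD fixmap entry. */
static const unsigned long *fixmap_slot;	/* NULL means "not mapped" */

/* Model of pud_set_fixmap_offset(): point the single shared slot at a table. */
static const unsigned long *model_set_fixmap(const unsigned long *table)
{
	fixmap_slot = table;
	return fixmap_slot;
}

/* Model of pud_clear_fixmap(): tear the shared mapping down. */
static void model_clear_fixmap(void)
{
	fixmap_slot = NULL;
}

/* A read through the fixmap is only valid while the slot still maps our table. */
static bool model_read_ok(const unsigned long *table)
{
	return fixmap_slot == table;
}

/* Replays the diagram step by step; returns true if A's second read is still valid. */
static bool replay_race(void)
{
	static const unsigned long table_a[1] = { 0xa };
	static const unsigned long table_b[1] = { 0xb };
	bool first_read_ok;

	model_set_fixmap(table_a);		/* A: pud_set_fixmap_offset() */
	first_read_ok = model_read_ok(table_a);	/* A: READ_ONCE(*pudp) //OK! */
	model_set_fixmap(table_b);		/* B: pud_set_fixmap_offset() */
	model_clear_fixmap();			/* B: pud_clear_fixmap() */

	/* A: the next READ_ONCE(*pudp) goes through a mapping that B removed. */
	return first_read_ok && model_read_ok(table_a);
}
```

In this model replay_race() returns false: the first read is fine, but the
second one no longer sees thread A's table, which corresponds to the
"//CRASH!" step in the diagram.
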
>
> Hence, a spin lock is introduced to protect the fixmap while creating the
> pgd mapping.
>
> Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>

I am afraid taking a spinlock there is a problem: the page-table allocation
inside the locked region can sleep, as the splat below shows.

node 0 deferred pages initialised in 2740ms
pgdatinit0 (176) used greatest stack depth: 59184 bytes left
devtmpfs: initialized
KASLR disabled due to lack of seed
BUG: sleeping function called from invalid context at mm/page_alloc.c:5151
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
1 lock held by swapper/0/1:
#0: ffff800009ea3278 (fixmap_lock){+.+.}-{2:2}, at: __create_pgd_mapping
alloc_init_pud at /usr/src/linux-next/arch/arm64/mm/mmu.c:340 (discriminator 4)
(inlined by) __create_pgd_mapping at /usr/src/linux-next/arch/arm64/mm/mmu.c:393 (discriminator 4)
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.16.0-rc5-next-20211214
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
__might_resched
__might_sleep
__alloc_pages
alloc_page_interleave
alloc_pages
__get_free_pages
__pgd_pgtable_alloc
__create_pgd_mapping
__phys_to_pte_val at /usr/src/linux-next/./arch/arm64/include/asm/pgtable.h:77
(inlined by) __pud_populate at /usr/src/linux-next/./arch/arm64/include/asm/pgalloc.h:25
(inlined by) alloc_init_cont_pmd at /usr/src/linux-next/arch/arm64/mm/mmu.c:277
(inlined by) alloc_init_pud at /usr/src/linux-next/arch/arm64/mm/mmu.c:358
(inlined by) __create_pgd_mapping at /usr/src/linux-next/arch/arm64/mm/mmu.c:393
map_entry_trampoline
map_entry_trampoline at /usr/src/linux-next/arch/arm64/mm/mmu.c:639
do_one_initcall
kernel_init_freeable
kernel_init
ret_from_fork
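
To make the splat above easier to read, here is a small userspace model of
the check that fired. It is a sketch, not kernel code: it assumes that taking
a spinlock raises the preempt count (consistent with "preempt_count: 1,
expected: 0" in the log) and that the page-table allocation path runs a
might_sleep()-style check. All model_* names and the two alloc_* helpers are
hypothetical.

```c
#include <assert.h>

/* Hypothetical userspace model of the per-task preempt count. */
static int model_preempt_count;
/* Number of "sleeping function called from invalid context" reports. */
static int model_splats;

/* Assumption: holding a spinlock means running with preemption disabled. */
static void model_spin_lock(void)   { model_preempt_count++; }
static void model_spin_unlock(void) { model_preempt_count--; }

/* Model of might_sleep(): complain when called in atomic context. */
static void model_might_sleep(void)
{
	if (model_preempt_count != 0)
		model_splats++;
}

/* Model of an allocation that may block, like the __alloc_pages() path in the trace. */
static void model_pgtable_alloc(void)
{
	model_might_sleep();
}

/* Mirrors the patched alloc_init_pud(): the allocation runs inside the locked region. */
static int alloc_under_spinlock(void)
{
	model_splats = 0;
	model_spin_lock();
	model_pgtable_alloc();
	model_spin_unlock();
	return model_splats;
}

/* The same allocation outside any atomic section, for comparison. */
static int alloc_without_spinlock(void)
{
	model_splats = 0;
	model_pgtable_alloc();
	return model_splats;
}
```

In the model, alloc_under_spinlock() records one report and
alloc_without_spinlock() records none, which is the same pattern as the trace:
__pgd_pgtable_alloc() ends up in __alloc_pages() while fixmap_lock is held.
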
> ---
> arch/arm64/mm/mmu.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index acfae9b41cc8..98ac09ae9588 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -63,6 +63,7 @@ static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss __maybe_unused;
> static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss __maybe_unused;
>
> static DEFINE_SPINLOCK(swapper_pgdir_lock);
> +static DEFINE_SPINLOCK(fixmap_lock);
>
> void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
> {
> @@ -329,6 +330,11 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
> }
> BUG_ON(p4d_bad(p4d));
>
> + /*
> + * fixmap is global resource, thus it needs to be protected by a lock
> + * in case of race condition.
> + */
> + spin_lock(&fixmap_lock);
> pudp = pud_set_fixmap_offset(p4dp, addr);
> do {
> pud_t old_pud = READ_ONCE(*pudp);
> @@ -359,6 +365,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
> } while (pudp++, addr = next, addr != end);
>
> pud_clear_fixmap();
> + spin_unlock(&fixmap_lock);
> }
>
> static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
> --
> 2.17.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel