From: Uladzislau Rezki <urezki@gmail.com>
To: Adrian Huang <adrianhuang0701@gmail.com>
Cc: urezki@gmail.com, ahuang12@lenovo.com, akpm@linux-foundation.org,
hch@infradead.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, peterz@infradead.org, sunjw10@lenovo.com
Subject: Re: [PATCH 1/1] mm/vmalloc: Add preempt point in purge_vmap_node() when enabling kasan
Date: Tue, 23 Jul 2024 12:50:36 +0200 [thread overview]
Message-ID: <Zp-K_A60DjlDhlRt@pc636> (raw)
In-Reply-To: <20240722115054.6295-1-ahuang12@lenovo.com>
> If we combine all tlb flush operations into one operation in the call path
> 'purge_vmap_node()->kasan_release_vmalloc()', the running time of
> drain_vmap_area_work() can be saved greately. The idea is from the
> flush_tlb_kernel_range() call in __purge_vmap_area_lazy().
> And, the soft lockup won't not be triggered. Please refer to the following patch.
> Here is the test result based on 6.10:
>
> [6.10 wo/ the patch below]
> 1. ftrace latency profiling (record a trace if the latency > 20s): Commands
> echo 20000000 > /sys/kernel/debug/tracing/tracing_thresh
> echo drain_vmap_area_work > /sys/kernel/debug/tracing/set_graph_function
> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> echo 1 > /sys/kernel/debug/tracing/tracing_on
>
> 2. Run `make -j $(nproc)` to compile the kernel source
>
> 3. Once the soft lockup is reproduced, check the ftace:
> cat /sys/kernel/debug/tracing/trace
> # tracer: function_graph
> #
> # CPU DURATION FUNCTION CALLS
> # | | | | | | |
> 76) $ 50412985 us | } /* __purge_vmap_area_lazy */
> 76) $ 50412997 us | } /* drain_vmap_area_work */
> 76) $ 29165911 us | } /* __purge_vmap_area_lazy */
> 76) $ 29165926 us | } /* drain_vmap_area_work */
> 91) $ 53629423 us | } /* __purge_vmap_area_lazy */
> 91) $ 53629434 us | } /* drain_vmap_area_work */
> 91) $ 28121014 us | } /* __purge_vmap_area_lazy */
> 91) $ 28121026 us | } /* drain_vmap_area_work */
>
>
> [6.10 w/ the patch below]
> 1. Repeat step 1-2 in "[6.10 wo/ the patch below]"
>
> 2. The soft lockup is not triggered and the ftrace log is empty.
> cat /sys/kernel/debug/tracing/trace
> # tracer: function_graph
> #
> # CPU DURATION FUNCTION CALLS
> # | | | | | | |
>
>
> 3. Setting 'tracing_thresh' to 10/5 seconds does not get any ftrace log.
>
> 4. Setting 'tracing_thresh' to 1 second gets ftrace log.
> cat /sys/kernel/tracing/trace
> # tracer: function_graph
> #
> # CPU DURATION FUNCTION CALLS
> # | | | | | | |
> 51) $ 1019695 us | } /* __purge_vmap_area_lazy */
> 51) $ 1019703 us | } /* drain_vmap_area_work */
> 198) $ 1018707 us | } /* __purge_vmap_area_lazy */
> 198) $ 1018718 us | } /* drain_vmap_area_work */
>
> 5. Run the following test_vmalloc command without any issues
> modprobe test_vmalloc nr_threads=$(nproc) run_test_mask=0x1 nr_pages=4
>
> Could you please test this patch in your VM environment?
>
It works great and does not generate the soft-lock-up splat :)
See below some comments:
> ---
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 70d6a8f6e25d..ddbf42a1a7b7 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -55,6 +55,9 @@ extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
> int kasan_populate_early_shadow(const void *shadow_start,
> const void *shadow_end);
>
> +#define KASAN_VMALLOC_PAGE_RANGE 0x1 /* Apply existing page range */
> +#define KASAN_VMALLOC_TLB_FLUSH 0x2 /* TLB flush */
> +
> #ifndef kasan_mem_to_shadow
> static inline void *kasan_mem_to_shadow(const void *addr)
> {
> @@ -511,7 +514,8 @@ void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
> int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
> void kasan_release_vmalloc(unsigned long start, unsigned long end,
> unsigned long free_region_start,
> - unsigned long free_region_end);
> + unsigned long free_region_end,
> + unsigned long flags);
>
> #else /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
>
> @@ -526,7 +530,8 @@ static inline int kasan_populate_vmalloc(unsigned long start,
> static inline void kasan_release_vmalloc(unsigned long start,
> unsigned long end,
> unsigned long free_region_start,
> - unsigned long free_region_end) { }
> + unsigned long free_region_end,
> + unsigned long flags) { }
>
> #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
>
> @@ -561,7 +566,8 @@ static inline int kasan_populate_vmalloc(unsigned long start,
> static inline void kasan_release_vmalloc(unsigned long start,
> unsigned long end,
> unsigned long free_region_start,
> - unsigned long free_region_end) { }
> + unsigned long free_region_end,
> + unsigned long flags) { }
>
> static inline void *kasan_unpoison_vmalloc(const void *start,
> unsigned long size,
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index d6210ca48dda..88d1c9dcb507 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -489,7 +489,8 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
> */
> void kasan_release_vmalloc(unsigned long start, unsigned long end,
> unsigned long free_region_start,
> - unsigned long free_region_end)
> + unsigned long free_region_end,
> + unsigned long flags)
> {
> void *shadow_start, *shadow_end;
> unsigned long region_start, region_end;
> @@ -522,12 +523,17 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
> __memset(shadow_start, KASAN_SHADOW_INIT, shadow_end - shadow_start);
> return;
> }
> - apply_to_existing_page_range(&init_mm,
> +
> +
> + if (flags & KASAN_VMALLOC_PAGE_RANGE)
> + apply_to_existing_page_range(&init_mm,
> (unsigned long)shadow_start,
> size, kasan_depopulate_vmalloc_pte,
> NULL);
> - flush_tlb_kernel_range((unsigned long)shadow_start,
> - (unsigned long)shadow_end);
> +
> + if (flags & KASAN_VMALLOC_TLB_FLUSH)
> + flush_tlb_kernel_range((unsigned long)shadow_start,
> + (unsigned long)shadow_end);
> }
> }
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e34ea860153f..d66e09135876 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2193,8 +2193,15 @@ static void purge_vmap_node(struct work_struct *work)
> struct vmap_area *va, *n_va;
> LIST_HEAD(local_list);
>
> + unsigned long start;
> + unsigned long end;
> +
> vn->nr_purged = 0;
>
> + start = list_first_entry(&vn->purge_list, struct vmap_area, list)->va_start;
> +
> + end = list_last_entry(&vn->purge_list, struct vmap_area, list)->va_end;
> +
> list_for_each_entry_safe(va, n_va, &vn->purge_list, list) {
> unsigned long nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
> unsigned long orig_start = va->va_start;
> @@ -2205,7 +2212,8 @@ static void purge_vmap_node(struct work_struct *work)
>
> if (is_vmalloc_or_module_addr((void *)orig_start))
> kasan_release_vmalloc(orig_start, orig_end,
> - va->va_start, va->va_end);
> + va->va_start, va->va_end,
> + KASAN_VMALLOC_PAGE_RANGE);
>
> atomic_long_sub(nr, &vmap_lazy_nr);
> vn->nr_purged++;
> @@ -2218,6 +2226,8 @@ static void purge_vmap_node(struct work_struct *work)
> list_add(&va->list, &local_list);
> }
>
> + kasan_release_vmalloc(start, end, start, end, KASAN_VMALLOC_TLB_FLUSH);
> +
>
Do we need it here? We just did the TLB flush for en entire range in the
__purge_vmap_area_lazy(). So, it is two times invoked and looks odd to me.
Am i missing something?
Thanks!
--
Uladzislau Rezki
next prev parent reply other threads:[~2024-07-23 10:50 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-05 13:08 [PATCH 1/1] mm/vmalloc: Add preempt point in purge_vmap_node() when enabling kasan Adrian Huang
2024-07-05 15:36 ` Uladzislau Rezki
2024-07-08 13:39 ` [External] " Adrian Huang12
2024-07-08 16:06 ` Uladzislau Rezki
2024-07-19 11:40 ` Uladzislau Rezki
2024-07-22 11:50 ` Adrian Huang
2024-07-22 12:40 ` Uladzislau Rezki
2024-07-23 10:50 ` Uladzislau Rezki [this message]
2024-07-24 12:46 ` Adrian Huang
2024-07-24 14:32 ` Uladzislau Rezki
2024-07-24 15:09 ` Uladzislau Rezki
2024-07-24 19:27 ` Uladzislau Rezki
2024-07-24 20:34 ` Uladzislau Rezki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zp-K_A60DjlDhlRt@pc636 \
--to=urezki@gmail.com \
--cc=adrianhuang0701@gmail.com \
--cc=ahuang12@lenovo.com \
--cc=akpm@linux-foundation.org \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=peterz@infradead.org \
--cc=sunjw10@lenovo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.