From: Namhyung Kim <namhyung@kernel.org>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
akpm@linux-foundation.org, david@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
mjguzik@gmail.com, luto@kernel.org, peterz@infradead.org,
acme@kernel.org, tglx@linutronix.de, willy@infradead.org,
raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
konrad.wilk@oracle.com
Subject: Re: [PATCH v5 07/14] perf bench mem: Allow chunking on a memory region
Date: Tue, 15 Jul 2025 13:17:26 -0700
Message-ID: <aHa3VgRA8qm8U9my@google.com>
In-Reply-To: <20250710005926.1159009-8-ankur.a.arora@oracle.com>

On Wed, Jul 09, 2025 at 05:59:19PM -0700, Ankur Arora wrote:
> There can be a significant gap in memset/memcpy performance depending
> on the size of the region being operated on.
>
> With chunk-size=4kb:
>
> $ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
>
> $ perf bench mem memset -p 4kb -k 4kb -s 4gb -l 10 -f x86-64-stosq
> # Running 'mem/memset' benchmark:
> # function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
> # Copying 4gb bytes ...
>
> 13.011655 GB/sec
>
> With chunk-size=1gb:
>
> $ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
>
> $ perf bench mem memset -p 4kb -k 1gb -s 4gb -l 10 -f x86-64-stosq
> # Running 'mem/memset' benchmark:
> # function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
> # Copying 4gb bytes ...
>
> 21.936355 GB/sec
>
> So, allow the user to specify the chunk-size.
>
> The default value is identical to the total size of the region, which
> preserves current behaviour.
>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>

Again, please update the documentation.  With that,

Reviewed-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung

> ---
> tools/perf/bench/mem-functions.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
> index e4d713587d45..412d18f2cb2e 100644
> --- a/tools/perf/bench/mem-functions.c
> +++ b/tools/perf/bench/mem-functions.c
> @@ -36,6 +36,7 @@
> static const char *size_str = "1MB";
> static const char *function_str = "all";
> static const char *page_size_str = "4KB";
> +static const char *chunk_size_str = "0";
> static unsigned int nr_loops = 1;
> static bool use_cycles;
> static int cycles_fd;
> @@ -49,6 +50,10 @@ static const struct option options[] = {
> "Specify page-size for mapping memory buffers. "
> "Available sizes: 4KB, 2MB, 1GB (case insensitive)"),
>
> + OPT_STRING('k', "chunk", &chunk_size_str, "0",
> + "Specify the chunk-size for each invocation. "
> + "Available units: B, KB, MB, GB and TB (case insensitive)"),
> +
> OPT_STRING('f', "function", &function_str, "all",
> "Specify the function to run, \"all\" runs all available functions, \"help\" lists them"),
>
> @@ -69,6 +74,7 @@ union bench_clock {
> struct bench_params {
> size_t size;
> size_t size_total;
> + size_t chunk_size;
> unsigned int nr_loops;
> unsigned int page_shift;
> };
> @@ -242,6 +248,14 @@ static int bench_mem_common(int argc, const char **argv, struct bench_mem_info *
> }
> p.size_total = (size_t)p.size * p.nr_loops;
>
> + p.chunk_size = (size_t)perf_atoll((char *)chunk_size_str);
> + if ((s64)p.chunk_size < 0 || (s64)p.chunk_size > (s64)p.size) {
> + fprintf(stderr, "Invalid chunk_size:%s\n", chunk_size_str);
> + return 1;
> + }
> + if (!p.chunk_size)
> + p.chunk_size = p.size;
> +
> page_size = (unsigned int)perf_atoll((char *)page_size_str);
> if (page_size != (1 << PAGE_SHIFT_4KB) &&
> page_size != (1 << PAGE_SHIFT_2MB) &&
> @@ -299,7 +313,8 @@ static int do_memcpy(const struct function *r, struct bench_params *p,
>
> clock_get(&start);
> for (unsigned int i = 0; i < p->nr_loops; ++i)
> - fn(dst, src, p->size);
> + for (size_t off = 0; off < p->size; off += p->chunk_size)
> + fn(dst + off, src + off, min(p->chunk_size, p->size - off));
> clock_get(&end);
>
> *rt = clock_diff(&start, &end);
> @@ -401,7 +416,8 @@ static int do_memset(const struct function *r, struct bench_params *p,
>
> clock_get(&start);
> for (unsigned int i = 0; i < p->nr_loops; ++i)
> - fn(dst, i, p->size);
> + for (size_t off = 0; off < p->size; off += p->chunk_size)
> + fn(dst + off, i, min(p->chunk_size, p->size - off));
> clock_get(&end);
>
> *rt = clock_diff(&start, &end);
> --
> 2.43.5
>
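
For readers following the thread, the chunking logic in the hunks above can be sketched outside of perf as a small standalone C fragment. This is only an illustration under stated assumptions: resolve_chunk_size() and chunked_memset() are hypothetical names that do not appear in the patch, plain memset() stands in for the benchmarked function pointer, and the zero-chunk guard is an addition for safety.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): a chunk size of 0 selects the
 * whole region, mirroring the patch's default. Returns 0 when the request
 * is invalid (chunk larger than the region) or when the region is empty.
 */
size_t resolve_chunk_size(size_t chunk_size, size_t size)
{
	if (chunk_size > size)
		return 0;
	return chunk_size ? chunk_size : size;
}

/*
 * Hypothetical helper (not in the patch): fill 'size' bytes at 'dst' in
 * pieces of at most 'chunk_size' bytes, clamping the final piece.
 * Returns the number of memset() calls issued.
 */
size_t chunked_memset(void *dst, int c, size_t size, size_t chunk_size)
{
	size_t calls = 0;

	/* Guard added for this sketch: a zero step would never terminate. */
	if (!chunk_size)
		return 0;

	for (size_t off = 0; off < size; off += chunk_size) {
		size_t remaining = size - off;
		size_t len = remaining < chunk_size ? remaining : chunk_size;

		memset((char *)dst + off, c, len);
		calls++;
	}
	return calls;
}
```

With a 10-byte buffer and a 4-byte chunk, the loop issues calls of 4, 4 and 2 bytes; with the chunk equal to the region size it degenerates to the single call the benchmark made before the patch.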