From: Namhyung Kim <namhyung@kernel.org>
To: James Clark <james.clark@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
Ian Rogers <irogers@google.com>, Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-perf-users@vger.kernel.org,
Ankur Arora <ankur.a.arora@oracle.com>
Subject: Re: [PATCH] perf bench: Add -t/--threads option to perf bench mem mmap
Date: Mon, 15 Dec 2025 22:23:04 -0800
Message-ID: <aUD6yGZ2bSHsCb2e@google.com>
In-Reply-To: <3c27128a-5cca-4c2f-a3f8-47b9a375bc7a@linaro.org>

Hi James,

On Tue, Dec 09, 2025 at 01:01:25PM +0000, James Clark wrote:
>
>
> On 07/12/2025 8:57 am, Namhyung Kim wrote:
> > So that it can measure overhead of mmap_lock and/or per-VMA lock
> > contention.
> >
> > $ perf bench mem mmap -f demand -l 1000 -t 1
> > # Running 'mem/mmap' benchmark:
> > # function 'demand' (Demand loaded mmap())
> > # Copying 1MB bytes ...
> >
> > 2.914503 GB/sec
> >
> > $ perf bench mem mmap -f demand -l 1000 -t 2
> > # Running 'mem/mmap' benchmark:
> > # function 'demand' (Demand loaded mmap())
> > # Copying 1MB bytes ...
> >
> > 888.769991 MB/sec
> >
> > $ perf bench mem mmap -f demand -l 1000 -t 3
> > # Running 'mem/mmap' benchmark:
> > # function 'demand' (Demand loaded mmap())
> > # Copying 1MB bytes ...
> >
> > 757.658220 MB/sec
> >
> > $ perf bench mem mmap -f demand -l 1000 -t 4
> > # Running 'mem/mmap' benchmark:
> > # function 'demand' (Demand loaded mmap())
> > # Copying 1MB bytes ...
> >
> > 316.410713 MB/sec
>
> Should this now say "MB/sec per thread" for nr_threads > 1? I think it could
> be interpreted either way without a label, but I see you divided by
> nr_threads in timeval2double().
Right, thanks for the review. I think we can print the "per thread" label
unconditionally, even when nr_threads is 1.
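
For reference, a minimal sketch of the per-thread arithmetic being discussed. It assumes the accumulated time is the sum of every thread's elapsed time, so dividing by the thread count gives the average per thread. Note that in the expression quoted below the trailing `/ nr_threads` binds only to the tv_usec term, because division has higher precedence than addition. The names `elapsed_as_posted`, `elapsed_per_thread` and `print_rate_per_thread` are hypothetical and not part of the patch, and USEC_PER_SEC is redefined locally so the snippet stands alone.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

#define USEC_PER_SEC 1000000L

/* Expression as written in the quoted hunk: only the tv_usec term is divided. */
double elapsed_as_posted(const struct timeval *ts, unsigned int nr_threads)
{
	return (double)ts->tv_sec + (double)ts->tv_usec / (double)USEC_PER_SEC / nr_threads;
}

/* Hypothetical variant: divide the whole accumulated time by the thread count. */
double elapsed_per_thread(const struct timeval *ts, unsigned int nr_threads)
{
	assert(nr_threads > 0);
	return ((double)ts->tv_sec + (double)ts->tv_usec / (double)USEC_PER_SEC) / nr_threads;
}

/* Hypothetical helper: print the rate with an explicit "per thread" label. */
void print_rate_per_thread(double bytes, double seconds)
{
	printf("%14lf MB/sec per thread\n", bytes / seconds / (1024.0 * 1024.0));
}
```

With 4.5 seconds accumulated over two threads, the first function yields 4.25 and the second 2.25, so the two forms only agree when the run takes less than one second in total.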
Thanks,
Namhyung
>
> >
> > Cc: Ankur Arora <ankur.a.arora@oracle.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > tools/perf/Documentation/perf-bench.txt | 4 ++
> > tools/perf/bench/mem-functions.c | 74 +++++++++++++++++++++----
> > 2 files changed, 67 insertions(+), 11 deletions(-)
> >
> > diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
> > index 1160224cb718392d..c5913cf59c988421 100644
> > --- a/tools/perf/Documentation/perf-bench.txt
> > +++ b/tools/perf/Documentation/perf-bench.txt
> > @@ -274,6 +274,10 @@ Repeat mmap() invocation this number of times.
> > --cycles::
> > Use perf's cpu-cycles event instead of gettimeofday syscall.
> > +-t::
> > +--threads=<NUM>::
> > +Create multiple threads to call mmap/munmap concurrently.
> > +
> > SUITES FOR 'numa'
> > ~~~~~~~~~~~~~~~~~
> > *mem*::
> > diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
> > index 2908a3a796c932d0..e7e7d0b41fc7720f 100644
> > --- a/tools/perf/bench/mem-functions.c
> > +++ b/tools/perf/bench/mem-functions.c
> > @@ -26,6 +26,7 @@
> > #include <errno.h>
> > #include <linux/time64.h>
> > #include <linux/log2.h>
> > +#include <pthread.h>
> > #define K 1024
> > @@ -41,6 +42,7 @@ static unsigned int nr_loops = 1;
> > static bool use_cycles;
> > static int cycles_fd;
> > static unsigned int seed;
> > +static unsigned int nr_threads = 1;
> > static const struct option bench_common_options[] = {
> > OPT_STRING('s', "size", &size_str, "1MB",
> > @@ -174,7 +176,7 @@ static void clock_accum(union bench_clock *a, union bench_clock *b)
> > static double timeval2double(struct timeval *ts)
> > {
> > - return (double)ts->tv_sec + (double)ts->tv_usec / (double)USEC_PER_SEC;
> > + return (double)ts->tv_sec + (double)ts->tv_usec / (double)USEC_PER_SEC / nr_threads;
> > }
> > #define print_bps(x) do { \
> > @@ -494,16 +496,27 @@ static void mmap_page_touch(void *dst, size_t size, unsigned int page_shift, boo
> > }
> > }
> > -static int do_mmap(const struct function *r, struct bench_params *p,
> > - void *src __maybe_unused, void *dst __maybe_unused,
> > - union bench_clock *accum)
> > +struct mmap_data {
> > + pthread_t id;
> > + const struct function *func;
> > + struct bench_params *params;
> > + union bench_clock result;
> > + unsigned int seed;
> > + int error;
> > +};
> > +
> > +static void *do_mmap_thread(void *arg)
> > {
> > + struct mmap_data *data = arg;
> > + const struct function *r = data->func;
> > + struct bench_params *p = data->params;
> > union bench_clock start, end, diff;
> > mmap_op_t fn = r->fn.mmap_op;
> > bool populate = strcmp(r->name, "populate") == 0;
> > + void *dst;
> > - if (p->seed)
> > - srand(p->seed);
> > + if (data->seed)
> > + srand(data->seed);
> > for (unsigned int i = 0; i < p->nr_loops; i++) {
> > clock_get(&start);
> > @@ -514,16 +527,53 @@ static int do_mmap(const struct function *r, struct bench_params *p,
> > fn(dst, p->size, p->page_shift, p->seed);
> > clock_get(&end);
> > diff = clock_diff(&start, &end);
> > - clock_accum(accum, &diff);
> > + clock_accum(&data->result, &diff);
> > bench_munmap(dst, p->size);
> > }
> > - return 0;
> > + return data;
> > out:
> > - printf("# Memory allocation failed - maybe size (%s) %s?\n", size_str,
> > - p->page_shift != PAGE_SHIFT_4KB ? "has insufficient hugepages" : "is too large");
> > - return -1;
> > + data->error = -ENOMEM;
> > + return NULL;
> > +}
> > +
> > +static int do_mmap(const struct function *r, struct bench_params *p,
> > + void *src __maybe_unused, void *dst __maybe_unused,
> > + union bench_clock *accum)
> > +{
> > + struct mmap_data *data;
> > + int error = 0;
> > +
> > + data = calloc(nr_threads, sizeof(*data));
> > + if (!data) {
> > + printf("# Failed to allocate thread resources\n");
> > + return -1;
> > + }
> > +
> > + for (unsigned int i = 0; i < nr_threads; i++) {
> > + data[i].func = r;
> > + data[i].params = p;
> > + if (p->seed)
> > + data[i].seed = p->seed + i;
> > +
> > + if (pthread_create(&data[i].id, NULL, do_mmap_thread, &data[i]) < 0)
> > + data[i].error = -errno;
> > + }
> > +
> > + for (unsigned int i = 0; i < nr_threads; i++) {
> > + pthread_join(data[i].id, NULL);
> > +
> > + clock_accum(accum, &data[i].result);
> > + error |= data[i].error;
> > + }
> > + free(data);
> > +
> > + if (error) {
> > + printf("# Memory allocation failed - maybe size (%s) %s?\n", size_str,
> > + p->page_shift != PAGE_SHIFT_4KB ? "has insufficient hugepages" : "is too large");
> > + }
> > + return error ? -1 : 0;
> > }
> > static const char * const bench_mem_mmap_usage[] = {
> > @@ -548,6 +598,8 @@ int bench_mem_mmap(int argc, const char **argv)
> > static const struct option bench_mmap_options[] = {
> > OPT_UINTEGER('r', "randomize", &seed,
> > "Seed to randomize page access offset."),
> > + OPT_UINTEGER('t', "threads", &nr_threads,
> > + "Number of threads to run concurrently (default: 1)."),
> > OPT_PARENT(bench_common_options),
> > OPT_END()
> > };
>
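
One note on the thread creation loop quoted above: pthread_create() reports failure through its return value (zero on success, a positive error number otherwise) and is not specified to set errno, so a `< 0` comparison would not catch a failed create. The following is a minimal sketch of one way to handle that, not the patch's code. It also joins only the threads that actually started. `struct worker`, `worker_fn` and `run_workers` are hypothetical names, and the workload is a placeholder for do_mmap_thread().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread bookkeeping, loosely modelled on struct mmap_data. */
struct worker {
	pthread_t id;
	bool started;		/* true only if pthread_create() succeeded */
	int error;		/* 0 or a negative errno-style code */
	unsigned long result;
};

/* Placeholder workload standing in for do_mmap_thread(). */
static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	w->result = 1;
	return NULL;
}

/* Start nr workers, join the ones that started, return 0 or the first error. */
int run_workers(unsigned int nr, unsigned long *total)
{
	struct worker *w;
	int error = 0;

	assert(nr > 0);
	w = calloc(nr, sizeof(*w));
	if (!w)
		return -ENOMEM;

	for (unsigned int i = 0; i < nr; i++) {
		/* The return value carries the error number; errno is not used. */
		int ret = pthread_create(&w[i].id, NULL, worker_fn, &w[i]);

		if (ret != 0) {
			fprintf(stderr, "# pthread_create failed: %s\n", strerror(ret));
			w[i].error = -ret;
			continue;
		}
		w[i].started = true;
	}

	*total = 0;
	for (unsigned int i = 0; i < nr; i++) {
		/* Joining a thread that was never created is undefined. */
		if (w[i].started) {
			pthread_join(w[i].id, NULL);
			*total += w[i].result;
		}
		if (w[i].error && !error)
			error = w[i].error;
	}
	free(w);
	return error;
}
```

Tracking a `started` flag per slot keeps the join loop safe when only some of the creates succeed, and keeping the first error avoids OR-ing unrelated error codes together.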