Date: Tue, 15 Jul 2025 13:20:15 -0700
From: Namhyung Kim
To: Ankur Arora
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	akpm@linux-foundation.org, david@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
	mjguzik@gmail.com, luto@kernel.org, peterz@infradead.org,
	acme@kernel.org, tglx@linutronix.de, willy@infradead.org,
	raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH v5 09/14] perf bench mem: Add mmap() workloads
References: <20250710005926.1159009-1-ankur.a.arora@oracle.com>
	<20250710005926.1159009-10-ankur.a.arora@oracle.com>
In-Reply-To: <20250710005926.1159009-10-ankur.a.arora@oracle.com>

On Wed, Jul 09, 2025 at 05:59:21PM -0700, Ankur Arora wrote:
> Add two mmap() workloads: one that eagerly populates a region and
> another that demand faults it in.
>
> The intent is to probe the memory subsystem performance incurred
> by mmap().

Maybe it'd be better to name it 'mmap', as the other tests are named
after the actual function.

Also, please update the documentation.

Thanks,
Namhyung

>
>   $ perf bench mem map -s 4gb -p 4kb -l 10 -f populate
>   # Running 'mem/map' benchmark:
>   # function 'populate' (Eagerly populated map)
>   # Copying 4gb bytes ...
>
>        1.811691 GB/sec
>
>   $ perf bench mem map -s 4gb -p 2mb -l 10 -f populate
>   # Running 'mem/map' benchmark:
>   # function 'populate' (Eagerly populated map)
>   # Copying 4gb bytes ...
>
>       12.272017 GB/sec
>
>   $ perf bench mem map -s 4gb -p 1gb -l 10 -f populate
>   # Running 'mem/map' benchmark:
>   # function 'populate' (Eagerly populated map)
>   # Copying 4gb bytes ...
>
>       17.085927 GB/sec
>
> Signed-off-by: Ankur Arora
> ---
>  tools/perf/bench/bench.h         |  1 +
>  tools/perf/bench/mem-functions.c | 96 ++++++++++++++++++++++++++++++++
>  tools/perf/builtin-bench.c       |  1 +
>  3 files changed, 98 insertions(+)
>
> diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
> index 9f736423af53..46484bb0eefb 100644
> --- a/tools/perf/bench/bench.h
> +++ b/tools/perf/bench/bench.h
> @@ -28,6 +28,7 @@ int bench_syscall_fork(int argc, const char **argv);
>  int bench_syscall_execve(int argc, const char **argv);
>  int bench_mem_memcpy(int argc, const char **argv);
>  int bench_mem_memset(int argc, const char **argv);
> +int bench_mem_map(int argc, const char **argv);
>  int bench_mem_find_bit(int argc, const char **argv);
>  int bench_futex_hash(int argc, const char **argv);
>  int bench_futex_wake(int argc, const char **argv);
> diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
> index 8a37da149327..ea62e3583a70 100644
> --- a/tools/perf/bench/mem-functions.c
> +++ b/tools/perf/bench/mem-functions.c
> @@ -40,6 +40,7 @@ static const char *chunk_size_str = "0";
>  static unsigned int nr_loops = 1;
>  static bool use_cycles;
>  static int cycles_fd;
> +static unsigned int seed;
>
>  static const struct option bench_common_options[] = {
>  	OPT_STRING('s', "size", &size_str, "1MB",
> @@ -81,6 +82,7 @@ struct bench_params {
>  	size_t chunk_size;
>  	unsigned int nr_loops;
>  	unsigned int page_shift;
> +	unsigned int seed;
>  };
>
>  struct bench_mem_info {
> @@ -98,6 +100,7 @@ typedef void (*mem_fini_t)(struct bench_mem_info *, struct bench_params *,
>  			   void **, void **);
>  typedef void *(*memcpy_t)(void *, const void *, size_t);
>  typedef void *(*memset_t)(void *, int, size_t);
> +typedef void (*map_op_t)(void *, size_t, unsigned int, bool);
>
>  struct function {
>  	const char *name;
> @@ -108,6 +111,7 @@ struct function {
>  		union {
>  			memcpy_t memcpy;
>  			memset_t memset;
> +			map_op_t map_op;
>  		};
>  	} fn;
>  };
> @@ -160,6 +164,14 @@ static union bench_clock clock_diff(union bench_clock *s, union bench_clock *e)
>  	return t;
>  }
>
> +static void clock_accum(union bench_clock *a, union bench_clock *b)
> +{
> +	if (use_cycles)
> +		a->cycles += b->cycles;
> +	else
> +		timeradd(&a->tv, &b->tv, &a->tv);
> +}
> +
>  static double timeval2double(struct timeval *ts)
>  {
>  	return (double)ts->tv_sec + (double)ts->tv_usec / (double)USEC_PER_SEC;
> @@ -270,6 +282,8 @@ static int bench_mem_common(int argc, const char **argv, struct bench_mem_info *
>  	}
>  	p.page_shift = ilog2(page_size);
>
> +	p.seed = seed;
> +
>  	if (!strncmp(function_str, "all", 3)) {
>  		for (i = 0; info->functions[i].name; i++)
>  			__bench_mem_function(info, &p, i);
> @@ -464,3 +478,85 @@ int bench_mem_memset(int argc, const char **argv)
>
>  	return bench_mem_common(argc, argv, &info);
>  }
> +
> +static void map_page_touch(void *dst, size_t size, unsigned int page_shift, bool random)
> +{
> +	unsigned long npages = size / (1 << page_shift);
> +	unsigned long offset = 0, r = 0;
> +
> +	for (unsigned long i = 0; i < npages; i++) {
> +		if (random)
> +			r = rand() % (1 << page_shift);
> +
> +		*((char *)dst + offset + r) = *(char *)(dst + offset + r) + i;
> +		offset += 1 << page_shift;
> +	}
> +}
> +
> +static int do_map(const struct function *r, struct bench_params *p,
> +		  void *src __maybe_unused, void *dst __maybe_unused,
> +		  union bench_clock *accum)
> +{
> +	union bench_clock start, end, diff;
> +	map_op_t fn = r->fn.map_op;
> +	bool populate = strcmp(r->name, "populate") == 0;
> +
> +	if (p->seed)
> +		srand(p->seed);
> +
> +	for (unsigned int i = 0; i < p->nr_loops; i++) {
> +		clock_get(&start);
> +		dst = bench_mmap(p->size, populate, p->page_shift);
> +		if (!dst)
> +			goto out;
> +
> +		fn(dst, p->size, p->page_shift, p->seed);
> +		clock_get(&end);
> +		diff = clock_diff(&start, &end);
> +		clock_accum(accum, &diff);
> +
> +		bench_munmap(dst, p->size);
> +	}
> +
> +	return 0;
> +out:
> +	printf("# Memory allocation failed - maybe size (%s) %s?\n", size_str,
> +	       p->page_shift != PAGE_SHIFT_4KB ? "has insufficient hugepages" : "is too large");
> +	return -1;
> +}
> +
> +static const char * const bench_mem_map_usage[] = {
> +	"perf bench mem map ",
> +	NULL
> +};
> +
> +static const struct function map_functions[] = {
> +	{ .name = "populate",
> +	  .desc = "Eagerly populated map",
> +	  .fn.map_op = map_page_touch },
> +
> +	{ .name = "demand",
> +	  .desc = "Demand loaded map",
> +	  .fn.map_op = map_page_touch },
> +
> +	{ .name = NULL, }
> +};
> +
> +int bench_mem_map(int argc, const char **argv)
> +{
> +	static const struct option bench_map_options[] = {
> +		OPT_UINTEGER('r', "randomize", &seed,
> +			     "Seed to randomize page RW offset with."),
> +		OPT_PARENT(bench_common_options),
> +		OPT_END()
> +	};
> +
> +	struct bench_mem_info info = {
> +		.functions = map_functions,
> +		.do_op = do_map,
> +		.usage = bench_mem_map_usage,
> +		.options = bench_map_options,
> +	};
> +
> +	return bench_mem_common(argc, argv, &info);
> +}
> diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
> index 2c1a9f3d847a..a20bd9882f0a 100644
> --- a/tools/perf/builtin-bench.c
> +++ b/tools/perf/builtin-bench.c
> @@ -65,6 +65,7 @@ static struct bench mem_benchmarks[] = {
>  	{ "memcpy", "Benchmark for memcpy() functions", bench_mem_memcpy },
>  	{ "memset", "Benchmark for memset() functions", bench_mem_memset },
>  	{ "find_bit", "Benchmark for find_bit() functions", bench_mem_find_bit },
> +	{ "map", "Benchmark for mmap() mappings", bench_mem_map },
>  	{ "all", "Run all memory access benchmarks", NULL },
>  	{ NULL, NULL, NULL }
>  };
> --
> 2.43.5
>
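
For anyone following the thread without the rest of the series at hand,
here is a rough standalone sketch of the two workloads the commit message
describes: a region that is populated eagerly at mmap() time, and one
that is faulted in on first touch. bench_mmap() is not part of this
patch, so sketch_mmap() below is a hypothetical stand-in that assumes an
anonymous private mapping with MAP_POPULATE for the eager case, and it
ignores hugepage handling entirely. sketch_page_touch() mirrors the
shape of map_page_touch() but is not the patch's code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/*
 * Hypothetical stand-in for the series' bench_mmap(): map an anonymous
 * private region and, if populate is set, ask the kernel to pre-fault it.
 * Returns NULL on failure.
 */
static void *sketch_mmap(size_t size, bool populate)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *p;

	if (populate)
		flags |= MAP_POPULATE;	/* Linux-specific: fault in at mmap() time */

	p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Write one byte in every page of the region. With random set, the byte
 * offset within each page comes from rand(); otherwise offset 0 is used.
 * Returns the number of pages touched.
 */
static size_t sketch_page_touch(void *dst, size_t size,
				unsigned int page_shift, bool random)
{
	size_t page_size = (size_t)1 << page_shift;
	size_t npages = size >> page_shift;
	char *base = dst;

	assert(dst != NULL);

	for (size_t i = 0; i < npages; i++) {
		size_t r = random ? (size_t)rand() % page_size : 0;

		/* In the demand case this store is what triggers the fault. */
		base[i * page_size + r] += (char)i;
	}
	return npages;
}
```

A caller would time sketch_mmap() plus sketch_page_touch() together, as
do_map() does with clock_get(), and then munmap() the region. With
populate set the fault cost lands inside mmap(); without it, the cost
lands in the touch loop.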