From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Tavian Barnes <tavianator@tavianator.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Kan Liang <kan.liang@linux.intel.com>,
John Garry <john.g.garry@oracle.com>,
James Clark <james.clark@linaro.org>, Leo Yan <leo.yan@linux.dev>,
Charlie Jenkins <charlie@rivosinc.com>,
Andi Kleen <ak@linux.intel.com>,
Veronika Molnarova <vmolnaro@redhat.com>,
Michael Petlan <mpetlan@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, coresight@lists.linaro.org
Subject: Re: [PATCH v1] perf sample: Make user_regs and intr_regs optional
Date: Wed, 12 Feb 2025 20:05:54 -0800
Message-ID: <Z61vosIP67VMUJAS@google.com>
In-Reply-To: <Z6uM48_IJa-AEzEI@google.com>

On Tue, Feb 11, 2025 at 09:46:11AM -0800, Namhyung Kim wrote:
> On Mon, Feb 10, 2025 at 08:43:40PM -0800, Ian Rogers wrote:
> > On Mon, Feb 10, 2025 at 6:51 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Mon, Feb 10, 2025 at 10:15:22AM -0800, Ian Rogers wrote:
> > > > On Mon, Jan 13, 2025 at 11:43 AM Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > The struct dump_regs contains 512 bytes of cache_regs, meaning the two
> > > > > values in perf_sample contribute 1088 bytes of its total 1384 bytes
> > > > > size. Initializing this much memory has a cost reported by Tavian
> > > > > Barnes <tavianator@tavianator.com> as about 2.5% when running `perf
> > > > > script --itrace=i0`:
> > > > > https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/
> > > > >
> > > > > Adrian Hunter <adrian.hunter@intel.com> replied that the zero
> > > > > initialization was necessary and couldn't simply be removed.
> > > > >
> > > > > This patch aims to strike a middle ground of still zeroing the
> > > > > > perf_sample, but removing 79% of its size by making user_regs and
> > > > > > intr_regs optional pointers to zalloc-ed memory. To support the
> > > > > > allocation, accessors are created for user_regs and intr_regs. To
> > > > > > support correct cleanup, perf_sample__init and perf_sample__exit
> > > > > > functions are created and added throughout the code base.
> > > >
> > > > Ping. Given the memory savings and performance wins it would be nice
> > > > to see this land. Andi Kleen commented on doing a reimplementation,
> > > > which is fine but out-of-scope of what I'm doing here.
> > >
> > > Yeah, I like the core of the change. Andi's concern is that it touches
> > > too many places. It'd be nice if we can do that without allocating
> > > memory for regs and eliminating the perf_sample__{init,exit}. But I'm
> > > > not sure if it's possible.
> >
> > Moving from no allocations to 2 possible allocations means there has
> > to be corresponding frees. Putting the frees into an __exit function
> > is the norm for this kind of cleanup. I don't see how you can move to
> > the approach presented without adding the frees and not introduce a
> > memory leak. I don't see what's actionable for me to do here.
>
> Right, I'm inclined to merge this patch. But I need to think a bit more
> about Andi's approach before that.

We could probably use a global (or per-thread) variable instead, but I
think that could become another pain point in the future. Using
perf_sample__{init,exit} will make potential future changes easier.
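
For readers following the thread, here is a minimal sketch of the
lazy-allocation pattern being discussed. It is not the code from the
patch: the names regs_dump_sketch, sample_sketch and sample_sketch__*
are hypothetical stand-ins, the field layout is an assumption chosen to
match the 512 bytes of cache_regs quoted above, and calloc() stands in
for perf's zalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the register dump: 64 * 8 = 512 bytes of cache. */
struct regs_dump_sketch {
	uint64_t abi;
	uint64_t mask;
	uint64_t *regs;
	uint64_t cache_regs[64];
	uint64_t cache_mask;
};

/* Hypothetical stand-in for perf_sample: two pointers replace two embedded structs. */
struct sample_sketch {
	uint64_t ip;
	struct regs_dump_sketch *user_regs;	/* NULL until first accessed */
	struct regs_dump_sketch *intr_regs;	/* NULL until first accessed */
};

/* Zeroing stays, but now covers only the small struct. */
static inline void sample_sketch__init(struct sample_sketch *s)
{
	memset(s, 0, sizeof(*s));
}

/* Accessor: allocate zeroed storage on first use; returns NULL if calloc fails. */
static inline struct regs_dump_sketch *sample_sketch__user_regs(struct sample_sketch *s)
{
	if (!s->user_regs)
		s->user_regs = calloc(1, sizeof(*s->user_regs));
	return s->user_regs;
}

static inline struct regs_dump_sketch *sample_sketch__intr_regs(struct sample_sketch *s)
{
	if (!s->intr_regs)
		s->intr_regs = calloc(1, sizeof(*s->intr_regs));
	return s->intr_regs;
}

/* Every init needs a matching exit, otherwise the lazy allocations leak. */
static inline void sample_sketch__exit(struct sample_sketch *s)
{
	free(s->user_regs);	/* free(NULL) is a no-op */
	free(s->intr_regs);
	s->user_regs = NULL;
	s->intr_regs = NULL;
}

/* Usage: only the register set that is actually touched gets allocated. */
static inline void sample_sketch__demo(void)
{
	struct sample_sketch s;

	sample_sketch__init(&s);
	struct regs_dump_sketch *u = sample_sketch__user_regs(&s);
	if (u)
		u->abi = 1;
	assert(s.intr_regs == NULL);
	sample_sketch__exit(&s);
}
```

The cost of this shape is the one raised earlier in the thread: every
place that declares a sample has to call the init/exit pair. A global
or per-thread buffer would avoid that, at the price of shared state.
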
Thanks,
Namhyung