From: Steven Rostedt <rostedt@goodmis.org>
To: Felix Moessbauer <felix.moessbauer@siemens.com>
Cc: linux-trace-devel@vger.kernel.org
Subject: Re: [PATCH 1/1] libtracecmd: fix memory leak on partial reverse iteration
Date: Fri, 21 Nov 2025 12:01:17 -0500 [thread overview]
Message-ID: <20251121120117.20e82d9e@gandalf.local.home> (raw)
In-Reply-To: <20251121134749.1530855-1-felix.moessbauer@siemens.com>
On Fri, 21 Nov 2025 14:47:49 +0100
Felix Moessbauer <felix.moessbauer@siemens.com> wrote:
> When calling tracecmd_iterate_events_reverse with a callback that does
> not always return 0, the trace is only partially iterated. By that, the
> non-iterated records are leaked, resulting in the error:
>
> 1 pages still allocated on cpu <cpu>
>
> We fix this by always iterating the remaining events on all selected
> CPUs. In the full iteration case, this stops on the first record as this
> is already zero. In the partial iteration case, all remaining records
> are freed, which is - by construction of the records list - at max a
> page size.
>
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
> Note, that this bug has been reported in [1].
>
> [1] https://lore.kernel.org/linux-trace-devel/0b4fb4b774a8754ed86a9212f0a253304a41245c.camel@siemens.com/
>
Thanks for the report. I actually found two bugs here.
> Best regards,
> Felix Moessbauer
> Siemens AG
>
> lib/trace-cmd/trace-input.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/lib/trace-cmd/trace-input.c b/lib/trace-cmd/trace-input.c
> index f2471c92..afdbd2aa 100644
> --- a/lib/trace-cmd/trace-input.c
> +++ b/lib/trace-cmd/trace-input.c
> @@ -2995,6 +2995,25 @@ static struct tep_record *next_last_event(struct tracecmd_input *handle,
> return record;
> }
>
> +static void free_last_events(struct tracecmd_input *handle,
> + struct tep_record **last_records,
> + cpu_set_t *cpu_set, int cpu_size,
> + int cpus)
> +{
> + struct tep_record *record;
> + int cpu;
> +
> + for (cpu = 0; cpu < cpus; cpu++) {
> + if (cpus && !CPU_ISSET_S(cpu, cpu_size, cpu_set))
> + continue;
> +
> + do {
> + record = next_last_event(handle, last_records, cpu);
> + tracecmd_free_record(record);
> + } while (record);
I just fixed it slightly different than what you proposed here. But I think
we can take your patch instead.
> + }
> +}
> +
> /**
> * tracecmd_iterate_events_reverse - iterate events over a given handle backwards
> * @handle: The handle to iterate over
> @@ -3057,6 +3076,7 @@ int tracecmd_iterate_events_reverse(struct tracecmd_input *handle,
> }
> } while (next_cpu >= 0 && ret == 0);
>
> + free_last_events(handle, records, cpus, cpu_size, max_cpus);
> free(records);
>
> return ret;
The second bug I found was that calling tracecmd_iterate_events_reverse()
again after a callback returned a ret value, and this time with cont = true,
it can skip a lot of records because the cursor of the last record read is
not saved. This now makes calling this function again with cont start with
the last record that was read (the one that the caller exited from).
The below patch fixes that too:
-- Steve
diff --git a/lib/trace-cmd/trace-input.c b/lib/trace-cmd/trace-input.c
index afdbd2aa98b6..8dfeff458485 100644
--- a/lib/trace-cmd/trace-input.c
+++ b/lib/trace-cmd/trace-input.c
@@ -3036,6 +3036,7 @@ int tracecmd_iterate_events_reverse(struct tracecmd_input *handle,
void *callback_data, bool cont)
{
unsigned long long last_timestamp = 0;
+ unsigned long long page_offset = 0;
struct tep_record **records;
struct tep_record *record;
int next_cpu;
@@ -3072,6 +3073,8 @@ int tracecmd_iterate_events_reverse(struct tracecmd_input *handle,
record = next_last_event(handle, records, next_cpu);;
ret = call_callbacks(handle, record, next_cpu,
callback, callback_data);
+ if (ret)
+ page_offset = record->offset;
tracecmd_free_record(record);
}
} while (next_cpu >= 0 && ret == 0);
@@ -3079,6 +3082,17 @@ int tracecmd_iterate_events_reverse(struct tracecmd_input *handle,
free_last_events(handle, records, cpus, cpu_size, max_cpus);
free(records);
+ /*
+ * If the callback exited out early, then set the cursor back
+ * to the location of that record so that if this gets called
+ * again with cont = true, it will continue where it left off.
+ */
+ if (page_offset) {
+ /* Set the record to the previous record that was read */
+ record = tracecmd_read_at(handle, page_offset - 4, NULL);
+ tracecmd_free_record(record);
+ }
+
return ret;
}
next prev parent reply other threads:[~2025-11-21 17:00 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-21 13:47 [PATCH 1/1] libtracecmd: fix memory leak on partial reverse iteration Felix Moessbauer
2025-11-21 17:01 ` Steven Rostedt [this message]
2025-11-24 9:14 ` MOESSBAUER, Felix
2025-11-24 16:58 ` Steven Rostedt
2025-11-21 17:11 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251121120117.20e82d9e@gandalf.local.home \
--to=rostedt@goodmis.org \
--cc=felix.moessbauer@siemens.com \
--cc=linux-trace-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).