From: Tanushree Shah <tshah@linux.ibm.com>
To: sashiko-reviews@lists.linux.dev,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ian Rogers <irogers@google.com>,
namhyung@kernel.org, Steven Rostedt <rostedt@goodmis.org>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [RFC PATCH 1/4] perf/trace-dat: Add trace.dat export infrastructure
Date: Thu, 25 Jun 2026 23:34:57 +0530 [thread overview]
Message-ID: <e7bec79c-09fc-41aa-973c-0af89362f860@linux.ibm.com> (raw)
In-Reply-To: <20260608131352.BA2831F00893@smtp.kernel.org>
Thank you for the detailed review. I have investigated all identified
issues and confirmed them. All will be addressed in v2.
On 08/06/26 18:43, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 10 potential issue(s) to consider:
> - [High] Sequential realloc failures cause a use-after-free, double-free, and memory leak.
> - [High] Unchecked realloc overwrite leads to array leak and a later NULL pointer dereference.
> - [High] Unchecked calloc failure leads to NULL pointer dereference in fwrite.
> - [High] Heap buffer overflow when processing abnormally large trace records.
> - [High] Unchecked get_tracing_file failure leads to NULL pointer dereference in fopen.
> - [High] Ftrace ring buffer format is incorrectly constructed for events >= 28 words, causing data corruption.
> - [High] Timestamp deltas larger than 27 bits are silently truncated
> - [Medium] Direct return on record allocation failure bypasses cleanup, leaking accumulated memory.
> - [Medium] The section_size for the flyrecord section is incorrectly calculated, violating the trace.dat format.
> - [Medium] The current event record is leaked if flushing the page or reallocating the batch array fails.
> --
>
> commit b7f763e536c02304fb52bceb9c78dd92220fa81a
> Author: Tanushree Shah <tshah@linux.ibm.com>
>
> perf/trace-dat: Add trace.dat export infrastructure
>
> Add new utility files util/trace-dat.c and util/trace-dat.h
> implementing the infrastructure for exporting perf.data tracepoints
> to trace.dat format compatible with trace-cmd and KernelShark.
>
>> diff --git a/tools/perf/util/trace-dat.c b/tools/perf/util/trace-dat.c
>> index 0000000000000..aa34a7b89b7a3
>> --- /dev/null
>> +++ b/tools/perf/util/trace-dat.c
> [ ... ]
>> +int trace_dat__collect_cpu_event(int cpu, unsigned long long ts,
>> + void *raw, unsigned int raw_size)
>> +{
> [ ... ]
>> + if (cpu_events->count >= cpu_events->capacity) {
>> + cpu_events->capacity = cpu_events->capacity ?
>> + cpu_events->capacity * 2 : INITIAL_EVENT_CAPACITY;
>> + cpu_events->events = realloc(cpu_events->events,
>> + cpu_events->capacity * sizeof(*cpu_events->events));
>> + if (!cpu_events->events)
>> + return -ENOMEM;
>
> [Severity: High]
> If realloc fails, it returns NULL and overwrites cpu_events->events, leaking
> the original memory block. Later, trace_dat__free_cpu_buffers() iterates over
> the events and dereferences trace_cpu_data[cpu].events[i].raw. Can this cause
> a segmentation fault during cleanup since events is NULL?
Reproduced with fault injection. Direct assignment causes pointer loss
on failure, leading to use-after-free in cleanup. Will use temporary
pointer pattern.
>
>> + }
> [ ... ]
>> +static int trace_dat__write_page(FILE *fp, unsigned long long base_ts,
>> + char **records, int *rec_sizes, int nr_recs)
>> +{
> [ ... ]
>> + page = calloc(1, trace_dat_page_size);
>> + if (!page)
>> + return -ENOMEM;
>> +
>> + for (i = 0; i < nr_recs; i++) {
>> + memcpy(page + offset, records[i], rec_sizes[i]);
>
> [Severity: High]
> When a trace event's size exceeds trace_dat_page_size - 16 bytes, the
> oversized event is queued into a new batch. Does this memcpy copy data past
> the end of the page allocation, causing a heap buffer overflow?
>
Records larger than page payload can overflow buffer. Will reject
oversized records before batching, matching upstream ftrace behavior.
>> + offset += rec_sizes[i];
>> + commit += rec_sizes[i];
>> + }
> [ ... ]
>> +static int trace_dat__write_cpu_dat(FILE *fp, int cpu, unsigned long long *file_offset_out)
>> +{
> [ ... ]
>> + for (i = 0; i < cpu_events->count; i++) {
>> + struct cpu_event *event = &cpu_events->events[i];
>> + unsigned long long time_delta = event->ts - base_ts;
>> + unsigned int data_len = event->raw_size;
>> + unsigned int words = (data_len + TRACE_DAT_WORD_ALIGN_MASK) / TRACE_DAT_WORD_SIZE;
>> + unsigned int type_len = words & TRACE_DAT_RECORD_TYPE_LEN_MASK;
>> + unsigned int hdr_word = ((time_delta & TRACE_DAT_RECORD_TIME_MASK) <<
>> + TRACE_DAT_RECORD_TIME_SHIFT) | type_len;
>
> [Severity: High]
> If the gap between events on a CPU exceeds 0x07FFFFFF, the ftrace
> specification requires emitting a TIME_EXTEND or TIME_STAMP record to advance
> base_ts. Does masking off the upper bits via time_delta &
> TRACE_DAT_RECORD_TIME_MASK silently truncate the timestamp and corrupt the
> chronological ordering?
Deltas >27 bits are silently masked. Will emit TIME_EXTEND records for
large gaps, matching ftrace ring buffer semantics.
>
> [Severity: High]
> The ftrace ring buffer specification uses the 5-bit type_len field to encode
> lengths up to 27 words. For 28 words or more, type_len must be 0, and the
> actual length must be written as a 32-bit word immediately following the
> header. Does applying words & 0x1F map large events into special types (like
> PADDING for 29) or to type_len = 0 without appending the required 32-bit
> length word?
Events ≥28 words incorrectly map to reserved types (29=PADDING,
30=TIME_EXTEND, 31=TIME_STAMP). Will implement proper encoding with
type_len=0 and separate length word for large events.
>
>> + int rec_size;
>> + char *rec;
>> +
>> + rec_size = TRACE_DAT_WORD_SIZE + data_len;
>> + if (rec_size % TRACE_DAT_WORD_SIZE)
>> + rec_size += TRACE_DAT_WORD_SIZE - (rec_size % TRACE_DAT_WORD_SIZE);
>> +
>> + rec = calloc(1, rec_size);
>> + if (!rec)
>> + return -ENOMEM;
>
> [Severity: Medium]
> If calloc fails, this directly returns -ENOMEM. Does this bypass the out_free
> cleanup label entirely, leaking the page_records and page_rec_sizes tracking
> arrays and any previously allocated record strings?
Failed calloc returns directly, leaking page_records arrays. Will use
proper error path with goto.
>
>> + memcpy(rec, &hdr_word, TRACE_DAT_WORD_SIZE);
>> + memcpy(rec + TRACE_DAT_WORD_SIZE, event->raw, data_len);
>> +
>> + if (page_size_used + rec_size > trace_dat_page_size -
>> + TRACE_DAT_RECORD_HEADER_SIZE) {
>> + ret = trace_dat__write_page(fp, base_ts,
>> + page_records, page_rec_sizes, nr_page_recs);
>> + for (j = 0; j < nr_page_recs; j++)
>> + free(page_records[j]);
>> + nr_page_recs = 0;
>> + page_size_used = 0;
>> + base_ts = event->ts;
>> + if (ret < 0)
>> + goto out_free;
>
> [Severity: Medium]
> If trace_dat__write_page fails and we jump to out_free, the newly allocated
> rec has not yet been appended to the page_records array. Is this memory
> permanently leaked since it is skipped by the cleanup loop?
>
Newly allocated rec not yet in page_records when flush fails. Will check
page fit before allocating record.
>> + }
>> +
>> + if (nr_page_recs >= page_cap) {
>> + char **tmp_records;
>> + int *tmp_sizes;
>> +
>> + page_cap = page_cap ? page_cap * 2 : INITIAL_PAGE_RECORD_CAPACITY;
>> + tmp_records = realloc(page_records, page_cap * sizeof(char *));
>> + tmp_sizes = realloc(page_rec_sizes, page_cap * sizeof(int));
>> + if (!tmp_records || !tmp_sizes) {
>> + ret = -ENOMEM;
>> + goto out_free;
>
> [Severity: High]
> If realloc for tmp_sizes fails, the code jumps to out_free without updating
> page_records to point to the newly allocated block in tmp_records. Does the
> cleanup path then iterate over the potentially freed page_records pointer,
> dereference it (use-after-free), and then call free(page_records) again
> (double-free)?
Partial realloc failure can cause use-after-free and double-free. Will
replace with malloc/memcpy/free pattern.
>
>> + }
> [ ... ]
>> +int trace_dat__write_options_section1(void)
>> +{
> [ ... ]
>> + /* TRACECLOCK option */
>> + opt_id = TRACE_DAT_OPTION_TRACECLOCK;
>> +
>> + path = get_tracing_file("trace_clock");
>> + clock_file = fopen(path, "r");
>
> [Severity: High]
> The get_tracing_file function internally uses asprintf and can return NULL on
> memory allocation failure. Is it possible for fopen to receive a NULL path,
> leading to a segmentation fault?
>
Can return NULL on asprintf failure. Passing NULL to fopen() causes
segmentation fault. Will add NULL check before fopen().
>> + put_tracing_file(path);
> [ ... ]
>> +int trace_dat__write_flyrecord_section(void)
>> +{
> [ ... ]
>> + /* Align to page boundary */
>> + after_header = ftell(trace_dat_fp);
>> + padding_needed = (trace_dat_page_size -
>> + (after_header % trace_dat_page_size)) % trace_dat_page_size;
>> +
>> + if (padding_needed > 0) {
>> + pad = calloc(1, padding_needed);
>> +
>> + if (!fwrite(pad, 1, padding_needed, trace_dat_fp)) {
>
> [Severity: High]
> Is pad passed directly to fwrite without checking if calloc succeeded? Can
> this lead to a segmentation fault if the allocation fails and fwrite receives
> a NULL buffer?
NULL pad can be passed to fwrite() causing segmentation fault. Will add
explicit NULL check.
>
>> + free(pad);
>> + ret = -EIO;
>> + goto cleanup;
>> + }
>> + free(pad);
>> + }
> [ ... ]
>> + /* fill section size */
>> + end_pos = ftell(trace_dat_fp);
>> +
>> + section_size = end_pos - flyrecord_start;
>
> [Severity: Medium]
> In the trace.dat specification, the section_size field must indicate the size
> of the payload following the 16-byte section header. By capturing
> flyrecord_start before the header is written, does the final section_size
> erroneously include the 16 bytes of the header itself, causing parsing tools
> to miscalculate the boundary?
>
>> + if (fseek(trace_dat_fp, section_size_pos, SEEK_SET) < 0 ||
>> + !fwrite(§ion_size, sizeof(unsigned long long), 1, trace_dat_fp)) {
>> + ret = -EIO;
>> + goto cleanup;
>> + }
>
Verified by comparing with trace-cmd. trace-cmd excludes the header in
the section size. Will fix by capturing position after writing header.
next prev parent reply other threads:[~2026-06-25 18:05 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-08 12:59 [RFC PATCH 0/4] perf: Add perf.data tracepoint events to trace.dat conversion Tanushree Shah
2026-06-08 12:59 ` [RFC PATCH 1/4] perf/trace-dat: Add trace.dat export infrastructure Tanushree Shah
2026-06-08 13:13 ` sashiko-bot
2026-06-25 18:04 ` Tanushree Shah [this message]
2026-06-08 12:59 ` [RFC PATCH 2/4] perf/trace-event: Write trace.dat metadata sections during parsing Tanushree Shah
2026-06-08 13:12 ` sashiko-bot
2026-06-30 18:12 ` Tanushree Shah
2026-06-08 12:59 ` [RFC PATCH 3/4] perf data-convert: Add perf.data to trace.dat conversion backend Tanushree Shah
2026-06-08 13:14 ` sashiko-bot
2026-06-08 12:59 ` [RFC PATCH 4/4] perf data: Add --to-trace-dat option for converting perf.data tracepoint events into trace.dat format Tanushree Shah
2026-06-08 13:12 ` sashiko-bot
2026-06-08 15:18 ` [RFC PATCH 0/4] perf: Add perf.data tracepoint events to trace.dat conversion Ian Rogers
2026-06-09 13:09 ` Tanushree Shah
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e7bec79c-09fc-41aa-973c-0af89362f860@linux.ibm.com \
--to=tshah@linux.ibm.com \
--cc=acme@kernel.org \
--cc=irogers@google.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=namhyung@kernel.org \
--cc=rostedt@goodmis.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox