Re: [RFC PATCH 2/4] perf/trace-event: Write trace.dat metadata sections during parsing

From: Tanushree Shah <tshah@linux.ibm.com>
To: sashiko-reviews@lists.linux.dev,
	Namhyung Kim <namhyung@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [RFC PATCH 2/4] perf/trace-event: Write trace.dat metadata sections during parsing
Date: Tue, 30 Jun 2026 23:42:51 +0530
Message-ID: <845b78ec-2385-4bb4-9ce1-900c266d2941@linux.ibm.com>
In-Reply-To: <20260608131224.B448A1F00898@smtp.kernel.org>

Thanks for the detailed review. I have verified the issues and will fix
them in v2.


On 08/06/26 18:42, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
> 
> New issues:
> - [High] Using `read()` directly instead of the subsystem's `do_read()` loop fails to handle short reads, causing stream desynchronization and uninitialized memory writes.
> - [High] Integer truncation in `do_read()` causes massive uninitialized heap memory dumps to the output file.
> - [Medium] Metadata fields and integers for the `trace.dat` file are written in the host's native byte order, causing cross-platform incompatibility.
> - [Medium] Error paths in `fwrite` operations return directly without cleaning up allocated resources, leading to memory and state leaks.
> - [Medium] Writing 2 bytes from a 1-byte `char` variable causes a stack out-of-bounds read and file corruption.
> 
> Pre-existing issues:
> - [High] Unbounded string accumulation in `read_string()` causes a stack buffer overflow.
> --
> 
> commit f1360b22eb92ea64822af70555a808fd9d67ae93
> Author: Tanushree Shah <tshah@linux.ibm.com>
> Date:   Mon Jun 8 12:59:50 2026 +0000
> 
>      perf/trace-event: Write trace.dat metadata sections during parsing
>      
>      Perf already captures the tracing metadata as a part of
>      data section in perf.data
>      
>      When trace_dat_fp is set, write trace.dat compatible metadata
>      sections using the perf provided raw buffers.
>      
>      Sections written:
>      - Initial format header (magic, version, endian, long_size,
>        page_size, compression, options_offset placeholder)
>      - Section 16: HEADER INFO (header_page + header_event)
>      - Section 17: FTRACE EVENT FORMATS
>      - Section 18: EVENT FORMATS (per system/event format files)
>      - Section 19: KALLSYMS
>      - Section 21: CMDLINES
>      - Section 15: STRINGS (written last after all sections)
>      
>      Signed-off-by: Tanushree Shah <tshah@linux.ibm.com>
> 
> This commit updates the trace-event parsing in perf to write trace.dat
> compatible metadata sections when trace_dat_fp is set. It outputs sections
> such as initial format headers, FTRACE EVENT FORMATS, EVENT FORMATS,
> KALLSYMS, and CMDLINES using perf's raw buffers.
> 
>> diff --git a/tools/perf/util/trace-event-read.c b/tools/perf/util/trace-event-read.c
>> index ecbbb93f01853..815577703c2e5 100644
>> --- a/tools/perf/util/trace-event-read.c
>> +++ b/tools/perf/util/trace-event-read.c
> 
> [ ... ]
> 
>> @@ -145,10 +146,9 @@ static char *read_string(void)
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but does read_string() allow a
> stack buffer overflow when accumulating characters?
> 
> In read_string():
>      buf[size++] = c;
> 
> If the input file contains a string longer than BUFSIZ without a null
> terminator, will this increment the size index and write past the bounds
> of the fixed-size array buf?
> 
Since it's a pre-existing issue and unrelated to this feature
addition, I will fix it in a separate patch.
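
For illustration, a bounded version of the accumulation could look roughly
like the sketch below. It is not the perf code: read_bounded_string() and
its signature are hypothetical, and it simply fails instead of writing past
the end of the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the perf implementation: read a NUL-terminated
 * string from fd one byte at a time into buf[0..cap-1].
 * Returns the string length (excluding the NUL), or -1 on a read error,
 * on EOF before the terminator, or when the string does not fit.
 */
static long read_bounded_string(int fd, char *buf, size_t cap)
{
	size_t len = 0;
	char c;

	assert(buf != NULL);

	for (;;) {
		ssize_t r = read(fd, &c, 1);

		if (r <= 0)
			return -1;	/* error or EOF before the terminator */
		if (len >= cap)
			return -1;	/* string does not fit in buf */
		buf[len] = c;
		if (c == '\0')
			return (long)len;
		len++;
	}
}
```

The bound is checked before every store, so the terminator itself also has
to fit inside the buffer.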
>>   static int read_proc_kallsyms(struct tep_handle *pevent)
>>   {
>>   	unsigned int size;
>> +	char *buf;
>>   
>>   	size = read4(pevent);
>> -	if (!size)
>> -		return 0;
> 
> [ ... ]
> 
>> +	buf = malloc(size);
>> +	if (buf == NULL)
>> +		return -1;
>> +	if (read(input_fd, buf, size) < 0) {
>> +		free(buf);
>> +		return -1;
>> +	}
>>   	trace_data_size += size;
>> +	/* Write kallsyms section with data */
>> +	if (trace_dat_fp) {
>> +		unsigned short section_id = TRACE_DAT_SECTION_KALLSYMS;
>> +		unsigned int string_id = STRID_KALLSYMS;
>> +		unsigned long long section_size = sizeof(unsigned int) + size;
>> +		unsigned short flags = 0;
>> +
>> +		trace_dat_kallsyms_offset = ftell(trace_dat_fp);
>> +		if (!fwrite(&section_id, sizeof(unsigned short), 1, trace_dat_fp) ||
>> +		    !fwrite(&flags, sizeof(unsigned short), 1, trace_dat_fp) ||
>> +		    !fwrite(&string_id, sizeof(unsigned int), 1, trace_dat_fp) ||
>> +		    !fwrite(&section_size, sizeof(unsigned long long), 1, trace_dat_fp) ||
>> +		    !fwrite(&size, sizeof(unsigned int), 1, trace_dat_fp) ||
>> +		    !fwrite(buf, 1, size, trace_dat_fp)) {
> 
> [Severity: High]
> Could using read() instead of do_read() here cause uninitialized memory to
> be written to the output file?
> 
> Unlike do_read(), read() can return a short read. If a short read occurs, the
> return value might still be greater than 0, bypassing the error check.
> 
> Will this leave the remainder of buf uninitialized, failing to advance the
> input_fd offset by the expected amount, and then dump the partially
> uninitialized heap memory into trace_dat_fp during the fwrite() call?
Yes, I will replace read() with do_read().
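
As a reminder of why a single read() is not enough, here is a minimal sketch
of a read-until-complete loop. read_full() is a hypothetical name used only
for this illustration; the actual fix would call the existing do_read().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly 'size' bytes from fd into buf.
 * Returns 0 on success, -1 on a read error or on EOF before 'size'
 * bytes have arrived, so the caller never sees a partially filled buffer.
 */
static int read_full(int fd, void *buf, size_t size)
{
	char *p = buf;

	assert(buf != NULL || size == 0);

	while (size > 0) {
		ssize_t r = read(fd, p, size);

		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (r == 0)
			return -1;	/* EOF before 'size' bytes arrived */
		p += r;
		size -= (size_t)r;
	}
	return 0;
}
```

A short read only advances the pointer and the remaining count; success is
reported once the whole buffer has been filled.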
> 
>> +			free(buf);
>> +			return -EIO;
>> +		}
>> +	}
>> +	free(buf);
>>   	return 0;
>>   }
> 
> [ ... ]
> 
>> @@ -209,6 +261,7 @@ static int read_header_files(struct tep_handle *pevent)
>>   
>>   	size = read8(pevent);
>>   
>> +	header_page_size = size;
>>   	header_page = malloc(size);
>>   	if (header_page == NULL)
>>   		return -1;
>> @@ -227,19 +280,59 @@ static int read_header_files(struct tep_handle *pevent)
>>   		 */
>>   		tep_set_long_size(pevent, tep_get_header_page_size(pevent));
>>   	}
>> -	free(header_page);
>>   
>> -	if (do_read(buf, 13) < 0)
>> +	if (do_read(buf, 13) < 0) {
>> +		free(header_page);
>>   		return -1;
>> +	}
> 
> [Severity: High]
> Can integer truncation in do_read() lead to massive uninitialized heap memory
> dumps here?
> 
> Functions like read_header_files() allocate memory using the 64-bit size read
> from the file:
> 
>      size = read8(pevent);
>      header_page = malloc(size);
> 
> They then call do_read(header_page, size). Because do_read() takes an int
> for the size argument, values over 4GB are truncated to their lower 32 bits.
> If size is 0x100000004, do_read() will only read 4 bytes.
> 
> Later in the function, fwrite() is called with the full 64-bit size:
> 
>      !fwrite(header_page, 1, header_page_size, trace_dat_fp)
> 
> Will this result in dumping 4GB of uninitialized heap memory into the
> output file?
Confirmed. do_read() takes an int, so 64-bit sizes are truncated. The
parameter type would need to change from int to size_t.
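
The arithmetic from the review can be checked with a small sketch. Both
helpers below are hypothetical and exist only for this illustration;
size_fits() shows one possible guard on top of the type change, not
something the patch currently does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The truncation described in the review assumes a 32-bit int. */
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

/* Low 32 bits of a 64-bit size: what is left after narrowing to 32 bits. */
static uint32_t low32(uint64_t size)
{
	return (uint32_t)size;
}

/*
 * Hypothetical guard: accept an on-disk 64-bit size only if it is within
 * a caller-chosen cap, so the malloc(), read and fwrite() lengths agree.
 */
static int size_fits(uint64_t size, size_t cap)
{
	return size <= cap;
}
```

With size = 0x100000004, low32() yields 4, which matches the 4-byte read
described in the review.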
> 
> [ ... ]
> 
>> @@ -298,8 +405,31 @@ static int read_ftrace_files(struct tep_handle *pevent)
>>   	int count;
>>   	int i;
>>   	int ret;
>> +	long section_size_pos = 0;
>> +	long count_pos = 0;
>> +	unsigned long long section_size = 0;
>> +	long end_pos;
>>   
>>   	count = read4(pevent);
>> +	/* Write ftrace formats section to trace.dat output file */
>> +	if (trace_dat_fp) {
>> +		unsigned short section_id = TRACE_DAT_SECTION_FTRACE;
>> +		unsigned short flags = 0;
>> +		unsigned int string_id = STRID_FTRACE_FORMATS;
>> +
>> +		trace_dat_ftrace_format_offset = ftell(trace_dat_fp);
>> +
>> +		if (!fwrite(&section_id, sizeof(unsigned short), 1, trace_dat_fp) ||
>> +		    !fwrite(&flags, sizeof(unsigned short), 1, trace_dat_fp) ||
>> +		    !fwrite(&string_id, sizeof(unsigned int), 1, trace_dat_fp))
> 
> [Severity: Medium]
> Should these multi-byte integers be byte-swapped to match the output file's
> endianness?
> 
> The trace.dat file endianness is explicitly declared to match the original
> perf.data file (endian = file_bigendian).
> 
> However, variables like section_id, flags, and string_id are written using
> fwrite(), which writes them in the host's native byte order. If a
> little-endian host is processing a big-endian capture file, won't this corrupt
> the section IDs and sizes in the resulting trace.dat file?
> 
>> +			return -EIO;
>> +		section_size_pos = ftell(trace_dat_fp);
>> +		if (!fwrite(&section_size, sizeof(unsigned long long), 1, trace_dat_fp))
>> +			return -EIO;
>> +		count_pos = ftell(trace_dat_fp);
>> +		if (!fwrite(&count, sizeof(unsigned int), 1, trace_dat_fp))
>> +			return -EIO;
>> +	}
> 
> [ ... ]
> 
>> @@ -363,6 +557,23 @@ static int read_saved_cmdline(struct tep_handle *pevent)
>>   		pr_debug("error reading saved cmdlines\n");
>>   		goto out;
>>   	}
>> +	/* Write cmdlines section with data */
>> +	if (trace_dat_fp) {
>> +		unsigned short section_id = TRACE_DAT_SECTION_CMDLINE;
>> +		unsigned short flags = 0;
>> +		unsigned int string_id = STRID_CMDLINES;
>> +		unsigned long long section_size = sizeof(unsigned long long) + size;
>> +
>> +		trace_dat_cmdline_offset = ftell(trace_dat_fp);
>> +		if (!fwrite(&section_id, sizeof(unsigned short), 1, trace_dat_fp) ||
>> +		    !fwrite(&flags, sizeof(unsigned short), 1, trace_dat_fp) ||
>> +		    !fwrite(&string_id, sizeof(unsigned int), 1, trace_dat_fp) ||
>> +		    !fwrite(&section_size, sizeof(unsigned long long), 1, trace_dat_fp) ||
>> +		    !fwrite(&size, sizeof(unsigned long long), 1, trace_dat_fp) ||
>> +		    !fwrite(buf, 1, size, trace_dat_fp))
>> +			return -EIO;
> 
> [Severity: Medium]
> Does returning directly bypass the cleanup block?
> 
> If fwrite() fails, this error path returns -EIO immediately instead of using
> the "goto out" pattern established in this function.
> 
> Will this cause the allocated memory in buf to be leaked, since free(buf)
> is skipped?

On an fwrite() error, I will close trace_dat_fp and set it to NULL to
disable further trace.dat writes, then continue normal parsing instead
of returning early. This ensures that the cleanup code runs and that the
primary parsing continues.
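
A minimal sketch of that error handling is shown below. trace_dat_write() is
a hypothetical wrapper, and the trace_dat_fp declared here is a local stand-in
for the patch's global, so the details in v2 may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the patch's output stream; NULL means export is disabled. */
static FILE *trace_dat_fp;

/*
 * Hypothetical wrapper: write 'len' bytes to trace_dat_fp. On failure,
 * close the stream and clear trace_dat_fp so later sections are skipped.
 * Returns 0 on success or when export is already disabled, -1 when this
 * call disabled it. The caller keeps parsing in every case.
 */
static int trace_dat_write(const void *data, size_t len)
{
	if (!trace_dat_fp)
		return 0;

	assert(data != NULL || len == 0);

	if (len && fwrite(data, 1, len, trace_dat_fp) != len) {
		fclose(trace_dat_fp);
		trace_dat_fp = NULL;
		return -1;
	}
	return 0;
}
```

Because the caller no longer returns on a write failure, the existing
"goto out" cleanup and free(buf) paths are reached as before.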
> 
>> +	}
>> +
>>   	buf[ret] = '\0';
>>   
>>   	parse_saved_cmdline(pevent, buf, size);
> 
> [ ... ]
> 
>> @@ -440,6 +657,28 @@ ssize_t trace_report(int fd, struct trace_event *tevent, bool __repipe)
>>   	tep_set_long_size(pevent, file_long_size);
>>   	tep_set_page_size(pevent, file_page_size);
>>   
>> +	/* Write initial file header to trace.dat */
>> +	if (trace_dat_fp) {
>> +		unsigned char endian = file_bigendian;
>> +		unsigned char long_size = file_long_size;
>> +		unsigned int page_size = file_page_size;
>> +		unsigned long long placeholder = 0;
>> +		char trace_dat_version = TRACE_DAT_VERSION;
>> +
>> +		if (!fwrite(magic_buf, 1, 10, trace_dat_fp) ||    /* magic + "tracing" */
>> +		    !fwrite(&trace_dat_version, 1, 2, trace_dat_fp) ||
> 
> [Severity: Medium]
> Can this write out-of-bounds stack memory to the output file?
> 
> The variable trace_dat_version is declared as a 1-byte char:
> 
>      char trace_dat_version = TRACE_DAT_VERSION;
> 
> However, the fwrite call asks to write 2 bytes:
> 
>      !fwrite(&trace_dat_version, 1, 2, trace_dat_fp)
> 
> Will this read 1 byte past the trace_dat_version variable on the stack,
> leaking uninitialized stack memory into the output file and potentially
> failing to write the expected null terminator for the version string?

Confirmed. The code writes 2 bytes from a 1-byte char. Fix: use a 2-byte
array for the version string, including the null terminator.
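
Roughly, the fix could look like the sketch below. write_version() is a
hypothetical helper and SKETCH_TRACE_DAT_VERSION is a placeholder value; the
real value comes from TRACE_DAT_VERSION in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder only; the patch's TRACE_DAT_VERSION supplies the real value. */
#define SKETCH_TRACE_DAT_VERSION '7'

/*
 * Hypothetical helper: write the version as a one-character,
 * NUL-terminated string, so both bytes passed to fwrite() belong to the
 * same array. Returns 0 on success, -1 on a short write.
 */
static int write_version(FILE *fp, char version)
{
	const char ver[2] = { version, '\0' };

	assert(fp != NULL);

	return fwrite(ver, 1, sizeof(ver), fp) == sizeof(ver) ? 0 : -1;
}
```

Using sizeof(ver) for the length keeps the fwrite() size tied to the array,
so the two cannot drift apart.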


> 
>> +		    !fwrite(&endian, 1, 1, trace_dat_fp) ||
>> +		    !fwrite(&long_size, 1, 1, trace_dat_fp) ||
>> +		    !fwrite(&page_size, sizeof(unsigned int), 1, trace_dat_fp) ||
>> +		    !fwrite("none", 1, 4, trace_dat_fp) ||
>> +		    !fwrite("\0", 1, 1, trace_dat_fp) ||
>> +		    !fwrite("\0", 1, 1, trace_dat_fp))
>> +			return -EIO;
>> +		trace_dat_options_offset = ftell(trace_dat_fp);
>> +		if (!fwrite(&placeholder, sizeof(unsigned long long), 1, trace_dat_fp))
>> +			return -EIO;
>> +	}
>> +
>>   	err = read_header_files(pevent);
>>   	if (err)
>>   		goto out;
>

Thread overview: 13+ messages
2026-06-08 12:59 [RFC PATCH 0/4] perf: Add perf.data tracepoint events to trace.dat conversion Tanushree Shah
2026-06-08 12:59 ` [RFC PATCH 1/4] perf/trace-dat: Add trace.dat export infrastructure Tanushree Shah
2026-06-08 13:13   ` sashiko-bot
2026-06-25 18:04     ` Tanushree Shah
2026-06-08 12:59 ` [RFC PATCH 2/4] perf/trace-event: Write trace.dat metadata sections during parsing Tanushree Shah
2026-06-08 13:12   ` sashiko-bot
2026-06-30 18:12     ` Tanushree Shah [this message]
2026-06-08 12:59 ` [RFC PATCH 3/4] perf data-convert: Add perf.data to trace.dat conversion backend Tanushree Shah
2026-06-08 13:14   ` sashiko-bot
2026-06-08 12:59 ` [RFC PATCH 4/4] perf data: Add --to-trace-dat option for converting perf.data tracepoint events into trace.dat format Tanushree Shah
2026-06-08 13:12   ` sashiko-bot
2026-06-08 15:18 ` [RFC PATCH 0/4] perf: Add perf.data tracepoint events to trace.dat conversion Ian Rogers
2026-06-09 13:09   ` Tanushree Shah
