From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1BB5277C81 for ; Tue, 26 May 2026 22:00:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779832812; cv=none; b=uuijcaasfcXg7ybOtZPFSvFuE4iRoRg8sR/Q5kyBpnXpVnKblFRBC5IQo/Un9VBdjB6oLISVP39fKufUBC9mGd6n8LNFXQPAbS2kSMOsrrKLjeesWWRcVvoB/03/MEZi/fX7GSokMsuh6y579v7av6oXgsadz6KaWYz4+WbX0No= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779832812; c=relaxed/simple; bh=LKAX4BCZug/06Y5htIL+hfqkVvftEWq3yxjiqy8JNfA=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=VzJbbyrnYKyDcA3vb8zP1qETSgpzJRntf685UpzhogfHh+7vP8kd0A8L4B/lfUQd1nH4uRpeqkktK853f3m4jMtFKzvJR59Rc8bSrZeAJqHjsUTipfki3YK4yaKQmGpTjhW8aK5qpcAz2h4WG9fAtVj5eh3uP19Y3MVOgrIobVE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=cXYcA0Ts; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="cXYcA0Ts" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C2AF21F000E9; Tue, 26 May 2026 22:00:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1779832809; bh=8csMbNv9LbCPXjegoPBezAE2AvaiAzYUPQGYUYKU9us=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=cXYcA0Ts+1BFYdVd1Ci9x1Ht4L8IBc4LLjuIArKkyYZZeoAUfJ0RHr6KytHdyymDh K7CgoYXDxvIo05Lu6z/7RdqNxYofFAgbji+pvQFkrl2JFywmubLVqAQ5QQcYYcQrEW O81Uvic7fNFe7CkQufHEIKgwr0Ov71Q7PD+KjaZWNOMZkDEmjEw6lcBjneWFQ00BG9 nr7cvxFhWh0WvgoLgq8WjLOjzAXFmPDU6U0xvFrwXQK0RbVv8pnL9K7Y6WBheNQ36V WAZzCNWCRgMo2tDCoZxJUPX9LTBcv4u+iTrzqoUmeh31lUW1JpG5vaBYGk+DnMzAje wSMTjK3prGrHg== From: sashiko-bot@kernel.org Subject: Re: [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records() Reply-To: sashiko-reviews@lists.linux.dev To: "Arnaldo Carvalho de Melo" Cc: linux-perf-users@vger.kernel.org In-Reply-To: <20260526211806.1193848-5-acme@kernel.org> References: <20260526211806.1193848-5-acme@kernel.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Tue, 26 May 2026 22:00:08 +0000 Message-Id: <20260526220008.C2AF21F000E9@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-perf-users@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 4 potential issue(= s) to consider: Pre-existing issues: - [Critical] Mishandling of multi-record compressed streams leads to header= size overflow, stream corruption, and out-of-bounds stack reads. - [High] AIO compressed records contain an uninitialized `data_size` field,= breaking decompression. - [High] `ZSTD_flushStream()` is invoked on potentially errored contexts an= d its return value is ignored, causing silent data loss. - [High] `zstd_decompress_stream` manually advances the output buffer point= ers without resetting `output.pos`, leading to double-increment and uniniti= alized memory gaps. -- commit e4a35c8e9f1ac44e063d21b4f97da413caa1b375 Author: Arnaldo Carvalho de Melo perf zstd: Fix compression error path in zstd_compress_stream_to_records() This commit fixes multiple issues in the error paths of zstd_compress_stream_to_records(). It properly resets the ZSTD compression context on failures instead of incorrectly copying uncompressed data, and ensures process_header() doesn't write past the output buffer. It also fixes dst_size underflows and adds an out_child_no_flush label to prevent retrying failed mmap reads. > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c > index cc601796b2c8a..f1877bac815d7 100644 > --- a/tools/perf/builtin-record.c > +++ b/tools/perf/builtin-record.c [ ... ] > diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c > index 57027e0ac7b65..ecda9deb53b73 100644 > --- a/tools/perf/util/zstd.c > +++ b/tools/perf/util/zstd.c [ ... ] > @@ -54,7 +55,13 @@ ssize_t zstd_compress_stream_to_records(struct zstd_da= ta *data, void *dst, size_ > =20 > while (input.pos < input.size) { > record =3D dst; [Severity: Critical] This is a pre-existing issue, but does returning the aggregate size of multiple compressed records from this loop cause problems for callers like record__pushfn()? If record__pushfn() assumes the output is a single record and assigns the total aggregate size to event->header.size, a total compressed size exceeding 65535 bytes could cause the 16-bit header.size to overflow and truncate. This truncation could cause an underflow in the subsequent padding calculation (padding =3D event->header.size - compressed), resulting in a massive size passed to record__write(&pad, padding) which would dump out-of-bounds stack memory into the trace file. > + /* process_header writes the event header into record */ > + if (dst_size < sizeof(struct perf_event_header)) > + goto reset; > size =3D process_header(record, 0); [Severity: High] This isn't a bug introduced by this patch, but does the process_header callback leave the 8-byte data_size field uninitialized in AIO mode? In AIO mode, record__aio_pushfn() calls zstd_compress() but does not manually set event->data_size. When process_header() initializes header.type and header.size, the data_size field appears to be left uninitialized. This causes perf_session__process_compressed_event() to fail decompression when reading perf.data. > + /* Output buffer full =E2=80=94 cannot fit even the record header */ > + if (size > dst_size) > + goto reset; > compressed +=3D size; > dst +=3D size; > dst_size -=3D size; [ ... ] > @@ -65,10 +72,18 @@ ssize_t zstd_compress_stream_to_records(struct zstd_d= ata *data, void *dst, size_ > if (ZSTD_isError(ret)) { [Severity: High] This is a pre-existing issue, but does calling ZSTD_flushStream() just before this error check, without verifying if ZSTD_compressStream() succeeded, violate the API contract? Also, the return value of ZSTD_flushStream() is discarded. If the output buffer fills up during flush, ZSTD returns > 0 to indicate more flushing is required. Because the loop only evaluates input.pos < input.size, if the input is exhausted but the flush is incomplete, the loop will exit and leave the remaining compressed data trapped in the internal context, dropping data. > pr_err("failed to compress %ld bytes: %s\n", > (long)src_size, ZSTD_getErrorName(ret)); > - memcpy(dst, src, src_size); > - return src_size; > + goto reset; > } > size =3D output.pos; [ ... ] > @@ -76,6 +91,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_da= ta *data, void *dst, size_ > } > =20 > return compressed; > + > +reset: > + /* Reset so the context is usable if the caller retries */ > + ret =3D ZSTD_initCStream(data->cstream, data->comp_level); > + if (ZSTD_isError(ret)) > + pr_err("failed to reset compression context: %s\n", > + ZSTD_getErrorName(ret)); > + return -1; > } > =20 > size_t zstd_decompress_stream(struct zstd_data *data, void *src, size_t = src_size, [Severity: High] This isn't a bug introduced by this patch, but does zstd_decompress_stream() fail to reset output.pos to 0 during each iteration? At the end of the decompression loop: output.dst =3D dst + output.pos; output.size =3D dst_size - output.pos; Without resetting output.pos =3D 0, the next iteration will write at (dst + pos) + pos, skipping memory and leaving uninitialized gaps. I see this was fixed later in the series in commit e38284f65bb96 ("perf zstd: Fix multi-iteration decompression and error handling"). --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260526211806.1193= 848-1-acme@kernel.org?part=3D4