From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0DAC42147E5 for ; Wed, 3 Jun 2026 01:16:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780449368; cv=none; b=Fu0SYqVzrcyhcvXsroBeQGXVrQ6hQv++hF8+8nl4DIpw8907PVzDb6EeN2nBmxtIrf9pa+U8FovySMbZvaUBVQJ2QXKL+SbcxnYXEmLv7RVT0TU/pvfI8GGIdcT6V7JdcWGO+kq0Cppsw7JkF4R0PSasNSkcnYxCZNlDuT4WdEg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780449368; c=relaxed/simple; bh=01bbdxlxE6InmTwxksKa0ArlDCXgCGlLaUIDmTDE+Fo=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=KVkT2J0boDn0Se6YCCjUgAZafZ1lyKkfz8R5nar5O/xywx3IU8rEZPFzs987jwYReNGksxLuhVoDTBSOmrhVCudhzMXtbNut8b2Vb+b1jl+eCGXZkthIc9AiHBB64pmletmF8zFW9IQ+fkR4eDRzXOlkEg8XIgVdf51lDRgy/ZY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=CAolILcl; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="CAolILcl" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 86CA41F00893; Wed, 3 Jun 2026 01:16:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780449366; bh=dUg3rUPZkJTttr7Vb2KYcMdA9gvpy8nu8f/dtdzQtO8=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=CAolILcl7F0s4e4rGJUZszPtrPH7Ay68TgxrVG0xdFMRcW/cbEQ1Vr0ll0r6VnqWh zQf83vZBMgQSjxD9zIQuWOMQYoKiBk1wbPGvNdw2iPatMIltYOxVM1Kbn2qRbEBSUB FINtQ3m/a02q43m3xYZJJaE9dkNR22UbBIo4XmejOUR79mjhaaSIKEUSGujPfiJBXC I5GrQI08+B+LCwYIEOw4VdJ23KLzvkbquVwBaqINHhWhbycePhbwjE2PblMCAXnM5D uMTcLV8stbKyg0cp5X7JZhpkRkixORti+A8K41swoKAnLNTfDsPCcuVTP0ThERFymO HFJU07MvRphaw== From: sashiko-bot@kernel.org Subject: Re: [PATCH 7/8] perf sched: Replace BUG_ON on invalid CPU with graceful skip Reply-To: sashiko-reviews@lists.linux.dev To: "Arnaldo Carvalho de Melo" Cc: linux-perf-users@vger.kernel.org In-Reply-To: <20260602235709.1541603-8-acme@kernel.org> References: <20260602235709.1541603-8-acme@kernel.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Wed, 03 Jun 2026 01:16:06 +0000 Message-Id: <20260603011606.86CA41F00893@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-perf-users@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 3 potential issue(= s) to consider: Pre-existing issues: - [High] Pre-existing NULL pointer dereference when machine__findnew_thread= fails in latency_runtime_event. - [High] Missing `thread__put()` for `sched_out` and `sched_in` on the erro= r path of `add_sched_out_event()`. - [High] Missing bounds check on `sample->cpu` causes out-of-bounds reads i= n `test_bit()` across multiple perf tools. -- commit a94415fdfe25f82cda0feb4b93da07a2fcbbe10d Author: Arnaldo Carvalho de Melo perf sched: Replace BUG_ON on invalid CPU with graceful skip This commit replaces BUG_ON assertions for invalid sample CPUs with gracefu= l=20 skips and warnings in latency_switch_event(), latency_runtime_event(), and= =20 map_switch_event(). It handles cases where PERF_SAMPLE_CPU is absent and=20 initialized to (u32)-1 by evsel__parse_sample(). > The central CPU validation in perf_session__deliver_event() intentionally > preserves the (u32)-1 sentinel for downstream tools like perf script > and perf inject, so leaf callbacks must handle it themselves. [Severity: High] This isn't a regression introduced by this patch, but do other leaf callbac= ks properly handle the (u32)-1 sentinel when using the --cpu list option? In process_sample_event() in tools/perf/builtin-annotate.c: if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap)) goto out_put; And similarly in hist_iter_cb_cmp() in tools/perf/builtin-diff.c: if (cpu_list && !test_bit(sample->cpu, cpu_bitmap)) { return -1; } And in process_sample_event() in tools/perf/builtin-report.c: if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap)) goto out_put; If sample->cpu is (u32)-1, could passing this to test_bit() compute an inva= lid bit offset and cause a deterministic out-of-bounds read and segmentation fa= ult? > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c > index 9ec8e049e19b0..81833d1694705 100644 > --- a/tools/perf/builtin-sched.c > +++ b/tools/perf/builtin-sched.c > @@ -1145,7 +1145,12 @@ static int latency_switch_event(struct perf_sched = *sched, > int cpu =3D sample->cpu, err =3D -1; > s64 delta; > =20 > - BUG_ON(cpu >=3D MAX_CPUS || cpu < 0); > + /* perf.data is untrusted input =E2=80=94 CPU may be absent or corrupte= d */ > + if (cpu >=3D MAX_CPUS || cpu < 0) { > + pr_warning("WARNING: at offset %#" PRIx64 ": out-of-bound sample CPU %= d, skipping sample\n", > + sample->file_offset, cpu); > + return 0; > + } [Severity: High] This isn't a regression introduced by this patch, but does the error path f= or add_sched_out_event() further down in latency_switch_event() leak thread references? If machine__findnew_thread() successfully acquires references for sched_out= and sched_in, and we later fail here: if (add_sched_out_event(out_events, prev_state, timestamp)) return -1; Could returning -1 directly bypass the thread__put() cleanup sequence at the out_put label and leak the thread objects? [ ... ] > @@ -1215,7 +1220,13 @@ static int latency_runtime_event(struct perf_sched= *sched, > if (thread =3D=3D NULL) > return -1; [Severity: High] This isn't a regression introduced by this patch, but could this cause a nu= ll pointer dereference before the validation check is reached? At the start of latency_runtime_event(), atoms is initialized by passing thread to thread_atoms_search(): struct thread *thread =3D machine__findnew_thread(machine, -1, pid); struct work_atoms *atoms =3D thread_atoms_search(&sched->atom_root, thr= ead, &sched->cmp_pid); ... if (thread =3D=3D NULL) return -1; If machine__findnew_thread() fails and returns null, would thread_atoms_sea= rch() dereference the null pointer before we reach this validation check? --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260602235709.1541= 603-1-acme@kernel.org?part=3D7