From mboxrd@z Thu Jan 1 00:00:00 1970
From: sashiko-bot@kernel.org
Subject: Re: [PATCH v8 16/17] perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues
Reply-To: sashiko@lists.linux.dev
To: "Ian Rogers"
Cc: linux-perf-users@vger.kernel.org
In-Reply-To: <20260502065935.291960-17-irogers@google.com>
References: <20260502065935.291960-17-irogers@google.com>
Content-Type: text/plain; charset=utf-8
Date: Sat, 02 May 2026 07:55:19 +0000
Message-Id: <20260502075519.CC583C19425@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: linux-perf-users@vger.kernel.org

Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:

- [High] Unconditional acquisition of a global mutex in high-frequency per-sample fast paths causes severe lock contention and a performance regression.
- [High] Unlocked reads of bitfields concurrent with locked writes cause data races and undefined behavior.
- [Medium] The symbol_bits_lock mutex is leaked, and potentially re-initialized, if symbol__init() fails.

--
commit 587302f6555ca9ddcc54f4761c10004005eaa78f
Author: Ian Rogers

perf symbol: Add setters for bitfields sharing a byte to avoid concurrent update issues

This commit introduces a global symbol_bits_lock and helper functions to
serialize updates to the ignore and annotate2 bitfields in struct symbol.
This aims to prevent lost updates caused by concurrent modifications to
bitfields sharing the same byte.

> diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
> index 9d3a4c779a41..7337ee956dc9 100644
> --- a/tools/perf/builtin-kwork.c
> +++ b/tools/perf/builtin-kwork.c
[ ...
]
> @@ -725,7 +725,7 @@ static void timehist_save_callchain(struct perf_kwork *kwork,
> 	if (sym) {
> 		if (!strcmp(sym->name, "__softirqentry_text_start") ||
> 		    !strcmp(sym->name, "__do_softirq"))
> -			sym->ignore = 1;
> +			symbol__set_ignore(sym, true);
> 	}
> 
> 	callchain_cursor_advance(cursor);

Since timehist_save_callchain() is a per-sample fast path that can be called
millions of times per run, could this unconditional acquisition of a global
mutex cause severe lock contention and a performance regression? Would it be
better to add a lockless check, such as if (!sym->ignore), to bypass the lock
once the flag is already set?

> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..655e95f660c2 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
[ ... ]
> @@ -2371,7 +2371,7 @@ static void save_task_callchain(struct perf_sched *sched,
> 	if (!strcmp(sym->name, "schedule") ||
> 	    !strcmp(sym->name, "__schedule") ||
> 	    !strcmp(sym->name, "preempt_schedule"))
> -		sym->ignore = 1;
> +		symbol__set_ignore(sym, true);
> }

Could the same fast-path lock contention occur here in save_task_callchain()?

> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index a70066d17729..1238a0d6ce6e 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
[ ... ]
> @@ -345,6 +348,20 @@ void symbol__delete(struct symbol *sym)
> 	free(((void *)sym) - symbol_conf.priv_size);
> }
> 
> +void symbol__set_ignore(struct symbol *sym, bool ignore)
> +{
> +	mutex_lock(&symbol_bits_lock);
> +	sym->ignore = ignore;
> +	mutex_unlock(&symbol_bits_lock);
> +}
> +
> +void symbol__set_annotate2(struct symbol *sym, bool annotate2)
> +{
> +	mutex_lock(&symbol_bits_lock);
> +	sym->annotate2 = annotate2;
> +	mutex_unlock(&symbol_bits_lock);
> +}

While the new symbol_bits_lock serializes writes to these bitfields, readers
of sym->ignore and sym->annotate2 remain unlocked throughout the codebase.
In C, adjacent bitfields share the same underlying memory location. Under the
C11 memory model, does reading a bitfield without synchronization while another
thread performs a read-modify-write on that same memory location constitute a
data race? Could this lead to torn reads or undefined behavior during
concurrent access?

> @@ -2415,6 +2432,8 @@ int symbol__init(struct perf_env *env)
> 	if (symbol_conf.initialized)
> 		return 0;
> 
> +	mutex_init(&symbol_bits_lock);
> +
> 	symbol_conf.priv_size = PERF_ALIGN(symbol_conf.priv_size, sizeof(u64));
> 
> 	symbol__elf_init();

If a subsequent initialization step fails, such as setup_parallelism_bitmap(),
this function returns -1 without calling mutex_destroy() on the error paths.
Because symbol_conf.initialized remains false, a subsequent call to
symbol__exit() will also return early and fail to clean up the mutex.
Additionally, if a caller retries symbol__init() after a failure, could this
cause mutex_init() to be invoked on an already-initialized mutex?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260502065935.291960-1-irogers@google.com?part=16