From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF1CBCDB474 for ; Mon, 23 Oct 2023 11:28:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229734AbjJWL2C (ORCPT ); Mon, 23 Oct 2023 07:28:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53238 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233969AbjJWL2B (ORCPT ); Mon, 23 Oct 2023 07:28:01 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62FDAF9 for ; Mon, 23 Oct 2023 04:27:59 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8849CC433C8; Mon, 23 Oct 2023 11:27:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1698060478; bh=DW9F2nQvS9rYlGDZmQZwgfQ/9V0/XhefLS1IbpFV+G0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iFYGqUB3jKWzjuXRyNGFZXe9UAb7f67mb1cJs/QPWf8qx8xTaDEfq09qfrp+/tDh5 UY/lq3pZMdAI+BJ/s5Isf/M4zH42ckFobu5MhuGJiPhX3ETNaMS43+60rey82nsoUv o19knfwOONkWdEzxxZroAyM54bumdx94nOxx3oCQ= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Budimir Markovic , "Peter Zijlstra (Intel)" Subject: [PATCH 6.1 168/196] perf: Disallow mis-matched inherited group reads Date: Mon, 23 Oct 2023 12:57:13 +0200 Message-ID: <20231023104833.178306443@linuxfoundation.org> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023104828.488041585@linuxfoundation.org> References: <20231023104828.488041585@linuxfoundation.org> User-Agent: quilt/0.67 X-stable: review X-Patchwork-Hint: ignore MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org 6.1-stable review patch. If anyone has any objections, please let me know. ------------------ From: Peter Zijlstra commit 32671e3799ca2e4590773fd0e63aaa4229e50c06 upstream. Because group consistency is non-atomic between parent (filedesc) and children (inherited) events, it is possible for PERF_FORMAT_GROUP read() to try and sum non-matching counter groups -- with non-sensical results. Add group_generation to distinguish the case where a parent group removes and adds an event and thus has the same number, but a different configuration of events as inherited groups. This became a problem when commit fa8c269353d5 ("perf/core: Invert perf_read_group() loops") flipped the order of child_list and sibling_list. Previously it would iterate the group (sibling_list) first, and for each sibling traverse the child_list. In this order, only the group composition of the parent is relevant. By flipping the order the group composition of the child (inherited) events becomes an issue and the mis-match in group composition becomes evident. That said; even prior to this commit, while reading of a group that is not equally inherited was not broken, it still made no sense. (Ab)use ECHILD as error return to indicate issues with child process group composition. Fixes: fa8c269353d5 ("perf/core: Invert perf_read_group() loops") Reported-by: Budimir Markovic Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20231018115654.GK33217@noisy.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman --- include/linux/perf_event.h | 1 + kernel/events/core.c | 39 +++++++++++++++++++++++++++++++++------ 2 files changed, 34 insertions(+), 6 deletions(-) --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -694,6 +694,7 @@ struct perf_event { /* The cumulative AND of all event_caps for events in this group. */ int group_caps; + unsigned int group_generation; struct perf_event *group_leader; struct pmu *pmu; void *pmu_private; --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1962,6 +1962,7 @@ static void perf_group_attach(struct per list_add_tail(&event->sibling_list, &group_leader->sibling_list); group_leader->nr_siblings++; + group_leader->group_generation++; perf_event__header_size(group_leader); @@ -2156,6 +2157,7 @@ static void perf_group_detach(struct per if (leader != event) { list_del_init(&event->sibling_list); event->group_leader->nr_siblings--; + event->group_leader->group_generation++; goto out; } @@ -5279,7 +5281,7 @@ static int __perf_read_group_add(struct u64 read_format, u64 *values) { struct perf_event_context *ctx = leader->ctx; - struct perf_event *sub; + struct perf_event *sub, *parent; unsigned long flags; int n = 1; /* skip @nr */ int ret; @@ -5289,6 +5291,33 @@ static int __perf_read_group_add(struct return ret; raw_spin_lock_irqsave(&ctx->lock, flags); + /* + * Verify the grouping between the parent and child (inherited) + * events is still in tact. + * + * Specifically: + * - leader->ctx->lock pins leader->sibling_list + * - parent->child_mutex pins parent->child_list + * - parent->ctx->mutex pins parent->sibling_list + * + * Because parent->ctx != leader->ctx (and child_list nests inside + * ctx->mutex), group destruction is not atomic between children, also + * see perf_event_release_kernel(). Additionally, parent can grow the + * group. + * + * Therefore it is possible to have parent and child groups in a + * different configuration and summing over such a beast makes no sense + * what so ever. + * + * Reject this. + */ + parent = leader->parent; + if (parent && + (parent->group_generation != leader->group_generation || + parent->nr_siblings != leader->nr_siblings)) { + ret = -ECHILD; + goto unlock; + } /* * Since we co-schedule groups, {enabled,running} times of siblings @@ -5322,8 +5351,9 @@ static int __perf_read_group_add(struct values[n++] = atomic64_read(&sub->lost_samples); } +unlock: raw_spin_unlock_irqrestore(&ctx->lock, flags); - return 0; + return ret; } static int perf_read_group(struct perf_event *event, @@ -5342,10 +5372,6 @@ static int perf_read_group(struct perf_e values[0] = 1 + leader->nr_siblings; - /* - * By locking the child_mutex of the leader we effectively - * lock the child list of all siblings.. XXX explain how. - */ mutex_lock(&leader->child_mutex); ret = __perf_read_group_add(leader, read_format, values); @@ -13267,6 +13293,7 @@ static int inherit_group(struct perf_eve !perf_get_aux_event(child_ctr, leader)) return -EINVAL; } + leader->group_generation = parent_event->group_generation; return 0; }