From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zide Chen
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Dapeng Mi, Zide Chen
Subject: [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
Date: Thu, 11 Jun 2026 09:00:32 -0700
Message-ID: <20260611160033.66760-8-zide.chen@intel.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260611160033.66760-1-zide.chen@intel.com>
References: <20260611160033.66760-1-zide.chen@intel.com>
Precedence: bulk
X-Mailing-List: linux-perf-users@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

In uncore_event_cpu_online(), uncore_box_ref() was called before
uncore_change_context(). uncore_box_ref() gates on box->cpu >= 0, but
box->cpu is still -1 at that point because uncore_change_context() has
not run yet.
As a result, the box is never initialized on the first CPU to come
online in a die, leaving it permanently uninitialized in the
single-CPU-per-die case. Thus, box->refcnt is one count below the true
value, and in the CPU offline path, the box will be torn down on the
second-to-last CPU.

In uncore_event_cpu_offline(), uncore_box_unref() was called after
uncore_change_context(), so box->cpu is already -1 when the collector
CPU goes offline, which prevents it from tearing down the box.

Fix by swapping the call order in both paths so that
uncore_box_{ref,unref}() runs at the point where box->cpu reflects the
correct context. Move allocate_boxes() out of uncore_box_ref() to
enable this reordering.

Fixes: c74443d92f68 ("perf/x86/uncore: Support per PMU cpumask")
Reviewed-by: Ian Rogers
Signed-off-by: Zide Chen
---
v3:
- Update changelog to mention moving allocate_boxes(). (Dapeng)
- Update title; the bug is not limited to CPU hotplug.
---
 arch/x86/events/intel/uncore.c | 50 ++++++++++++++++------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index feb8c3b0076b..b9ac2f7d31ca 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1580,9 +1580,15 @@ static int uncore_event_cpu_offline(unsigned int cpu)
 {
 	int die, target;
 
+	/* Clear the references */
+	die = topology_logical_die_id(cpu);
+	uncore_box_unref(uncore_msr_uncores, die);
+	uncore_box_unref(uncore_mmio_uncores, die);
+
 	/* Check if exiting cpu is used for collecting uncore events */
 	if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
-		goto unref;
+		return 0;
+
 	/* Find a new cpu to collect uncore events */
 	target = cpumask_any_but(topology_die_cpumask(cpu), cpu);
 
@@ -1595,16 +1601,10 @@ static int uncore_event_cpu_offline(unsigned int cpu)
 	uncore_change_context(uncore_msr_uncores, cpu, target);
 	uncore_change_context(uncore_mmio_uncores, cpu, target);
 	uncore_change_context(uncore_pci_uncores, cpu, target);
-
-unref:
-	/* Clear the references */
-	die = topology_logical_die_id(cpu);
-	uncore_box_unref(uncore_msr_uncores, die);
-	uncore_box_unref(uncore_mmio_uncores, die);
 	return 0;
 }
 
-static int allocate_boxes(struct intel_uncore_type **types,
+static void allocate_boxes(struct intel_uncore_type **types,
 			 unsigned int die, unsigned int cpu)
 {
 	struct intel_uncore_box *box, *tmp;
@@ -1621,8 +1621,10 @@ static int allocate_boxes(struct intel_uncore_type **types,
 			if (pmu->boxes[die] || uncore_pmu_broken(pmu))
 				continue;
 			box = uncore_alloc_box(type, cpu_to_node(cpu));
-			if (!box)
+			if (!box) {
+				uncore_pmu_set_broken(pmu);
 				goto cleanup;
+			}
 			box->pmu = pmu;
 			box->dieid = die;
 			list_add(&box->active_list, &allocated);
@@ -1633,14 +1635,13 @@ static int allocate_boxes(struct intel_uncore_type **types,
 		list_del_init(&box->active_list);
 		box->pmu->boxes[die] = box;
 	}
-	return 0;
+	return;
 
 cleanup:
 	list_for_each_entry_safe(box, tmp, &allocated, active_list) {
 		list_del_init(&box->active_list);
 		kfree(box);
 	}
-	return -ENOMEM;
 }
 
 static int uncore_box_ref(struct intel_uncore_type **types,
@@ -1649,11 +1650,7 @@ static int uncore_box_ref(struct intel_uncore_type **types,
 	struct intel_uncore_type *type;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	int i, ret;
-
-	ret = allocate_boxes(types, die, cpu);
-	if (ret)
-		return ret;
+	int i;
 
 	for (; *types; types++) {
 		type = *types;
@@ -1669,27 +1666,26 @@ static int uncore_box_ref(struct intel_uncore_type **types,
 
 static int uncore_event_cpu_online(unsigned int cpu)
 {
-	int die, target, msr_ret, mmio_ret;
+	int die, target;
 
 	die = topology_logical_die_id(cpu);
-	msr_ret = uncore_box_ref(uncore_msr_uncores, die, cpu);
-	mmio_ret = uncore_box_ref(uncore_mmio_uncores, die, cpu);
+	allocate_boxes(uncore_msr_uncores, die, cpu);
+	allocate_boxes(uncore_mmio_uncores, die, cpu);
 
 	/*
 	 * Check if there is an online cpu in the package
 	 * which collects uncore events already.
 	 */
 	target = cpumask_any_and(&uncore_cpu_mask, topology_die_cpumask(cpu));
-	if (target < nr_cpu_ids)
-		return 0;
-
-	cpumask_set_cpu(cpu, &uncore_cpu_mask);
-
-	if (!msr_ret)
+	if (target >= nr_cpu_ids) {
+		cpumask_set_cpu(cpu, &uncore_cpu_mask);
 		uncore_change_context(uncore_msr_uncores, -1, cpu);
-	if (!mmio_ret)
 		uncore_change_context(uncore_mmio_uncores, -1, cpu);
-	uncore_change_context(uncore_pci_uncores, -1, cpu);
+		uncore_change_context(uncore_pci_uncores, -1, cpu);
+	}
+
+	uncore_box_ref(uncore_msr_uncores, die, cpu);
+	uncore_box_ref(uncore_mmio_uncores, die, cpu);
 	return 0;
 }
 
-- 
2.54.0
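
The ordering problem described in the changelog can be reproduced in a
small user-space model. The sketch below is not kernel code. It assumes
only what the changelog states: that the ref/unref helpers do nothing
while box->cpu is -1, and that uncore_change_context() is what sets or
clears box->cpu. Every toy_* name is hypothetical and exists only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an uncore box; not the kernel's struct intel_uncore_box. */
struct toy_box {
	int cpu;          /* collector CPU, -1 while no CPU owns the box */
	int refcnt;       /* references taken by online CPUs */
	bool initialized; /* set on first reference, cleared on last unref */
};

/* Assumed gate: a reference only counts while a collector CPU is set. */
static void toy_box_ref(struct toy_box *box)
{
	if (box->cpu >= 0 && ++box->refcnt == 1)
		box->initialized = true;
}

static void toy_box_unref(struct toy_box *box)
{
	if (box->cpu >= 0 && --box->refcnt == 0)
		box->initialized = false;
}

/* Bring a CPU online; fixed_order selects the post-patch call order. */
static void toy_cpu_online(struct toy_box *box, int cpu, bool fixed_order)
{
	if (!fixed_order)
		toy_box_ref(box);	/* old order: box->cpu may still be -1 */
	if (box->cpu < 0)
		box->cpu = cpu;		/* models uncore_change_context(-1, cpu) */
	if (fixed_order)
		toy_box_ref(box);	/* new order: collector is already set */
}

/* Take a CPU offline; target is the replacement collector, or -1. */
static void toy_cpu_offline(struct toy_box *box, int cpu, int target,
			    bool fixed_order)
{
	if (fixed_order)
		toy_box_unref(box);	/* new order: box->cpu is still valid */
	if (box->cpu == cpu)
		box->cpu = target;	/* models uncore_change_context(cpu, target) */
	if (!fixed_order)
		toy_box_unref(box);	/* old order: box->cpu may already be -1 */
}

/* Single-CPU-per-die walk-through of both orderings. */
static void toy_demo(void)
{
	struct toy_box old_box = { .cpu = -1 };
	struct toy_box new_box = { .cpu = -1 };

	/* Old order: the only CPU comes online, box is never initialized. */
	toy_cpu_online(&old_box, 0, false);
	assert(!old_box.initialized && old_box.refcnt == 0);

	/* New order: the same CPU initializes the box and holds one ref. */
	toy_cpu_online(&new_box, 0, true);
	assert(new_box.initialized && new_box.refcnt == 1);

	/* New order: the last CPU going offline tears the box down. */
	toy_cpu_offline(&new_box, 0, -1, true);
	assert(!new_box.initialized && new_box.refcnt == 0);
}
```

Calling toy_demo() from a main() runs the single-CPU walk-through. With
two CPUs and the old order, the same model ends up with refcnt == 1 while
both CPUs are online, so the first offline already drops the count to
zero, which matches the "second-to-last CPU" teardown in the changelog.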