From: "Chen, Zide" <zide.chen@intel.com>
To: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Eranian Stephane <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH V2 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups
Date: Thu, 4 Jun 2026 10:41:19 -0500
Message-ID: <f4fd8bd6-bb61-401c-9e36-efaff86cf96c@intel.com>
In-Reply-To: <e1afadac-76d5-4b71-b5e4-ebae89bfc1b0@linux.intel.com>

On 6/3/2026 8:00 PM, Mi, Dapeng wrote:
>
> On 6/3/2026 11:09 PM, Chen, Zide wrote:
>>
>> On 6/2/2026 8:13 PM, Mi, Dapeng wrote:
>>> On 6/2/2026 10:16 PM, Chen, Zide wrote:
>>>> On 6/2/2026 4:52 AM, Mi, Dapeng wrote:
>>>>> On 6/2/2026 1:01 AM, Zide Chen wrote:
>>>>>> Fix typo UNCORE_BOX_FLAG_INITIATED to UNCORE_BOX_FLAG_INITIALIZED.
>>>>>>
>>>>>> Rename the 'id' parameter in uncore_box_{ref,unref}() to 'die' to
>>>>>> reflect its actual meaning and be consistent with other functions.
>>>>>>
>>>>>> Remove the incorrect atomic_inc(&box->refcnt) from
>>>>>> uncore_pci_pmu_register(): PCI boxes are not tracked by refcnt,
>>>>>> and this call incorrectly increments it on a per-die basis.
>>>>>>
>>>>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>>>>> ---
>>>>>> v2:
>>>>>> - Don't rename pmu->activeboxes and keep its semantics, because in the
>>>>>>   uncore_pci_remove() path, uncore_pci_pmu_unregister() won't be
>>>>>>   called for non-active boxes.
>>>>>> - Since pmu->activeboxes keeps its name, there is no need to rename
>>>>>>   box->refcnt to box->cpu_refcnt.
>>>>>> ---
>>>>>> arch/x86/events/intel/uncore.c | 11 +++++------
>>>>>> arch/x86/events/intel/uncore.h | 6 +++---
>>>>>> 2 files changed, 8 insertions(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>>>>> index b69b6a21d46b..d759888476c3 100644
>>>>>> --- a/arch/x86/events/intel/uncore.c
>>>>>> +++ b/arch/x86/events/intel/uncore.c
>>>>>> @@ -1170,7 +1170,6 @@ static int uncore_pci_pmu_register(struct pci_dev *pdev,
>>>>>>  	if (!box)
>>>>>>  		return -ENOMEM;
>>>>>>
>>>>>> -	atomic_inc(&box->refcnt);
>>>>> I'm not sure if we should remove this. uncore_box_ref() and
>>>>> uncore_box_unref() are only called for MSR or MMIO type uncore PMUs.
>>>>> For uncore PMUs of PCI type, box->refcnt is only increased here. All
>>>>> three kinds of uncore PMUs should behave consistently with respect to
>>>>> the refcnt.
>>>> box->refcnt tracks how many CPUs are active within this die. Here it is
>>>> incorrectly incremented on a per-die basis. Additionally, during the PCI
>>>> uncore enumeration or teardown process, there is no per-CPU context, so
>>>> box->refcnt is useless in PCI uncore.
>>>>
>>>> PCI uncore only needs pmu->activeboxes.
>>> Per my understanding, the main aim of refcnt is to prevent the box
>>> structure from being freed unexpectedly; it isn't directly coupled to CPUs.
>>> It just records how many users are using the box, regardless of whether
>>> the users are CPUs or something else. For uncore PMUs of MSR and MMIO
>>> type, CPUs can be seen as the users, while for uncore PMUs of PCI type, I
>>> suppose the dies (actually any CPU on the die) can be seen as the users.
>> No, box->refcnt is a refcnt for in-die CPUs, while pmu->activeboxes is a
>> per-die count. They are clear and distinct. That's also why I tried to
>> rename them to box->cpu_refcnt and pmu->die_refcnt in v1.
>>
>> If this line is not deleted, box->refcnt for PCI boxes is always <= 1,
>> which is confusing and useless.
>>
>>> If we delete this refcnt reference, then the box structure has no
>>> protection and could be freed by accident.
>> box->refcnt is not checked for any PCI box. Because there is one
>> pmu->boxes[die] per die, the code uses box->refcnt for MSR/MMIO boxes only:
>>
>> - box_ref(): first in-die CPU online — allocate and initialize the box.
>> - box_unref(): last in-die CPU offline — tear down the box.
>>
>> Uncore PCI PMU enumeration and hot plug/unplug do not involve any CPU
>> online/offline events, which is why box->refcnt is not used.
>
> No, the refcnt is still useful even for PCI-type uncore PMUs, especially
> from the software's perspective. A refcnt is the essential way to prevent
> the referenced structure from being freed unexpectedly. Although PCI-type
> uncore PMUs currently have only one user and the refcnt would be 1 most of
> the time, we can't ensure the code will always be right and that no code
> accidentally frees the box. We'd better keep this refcnt mechanism to avoid
> this situation. Thanks.
pmu->activeboxes is the refcnt used by PCI PMUs, whose online/offline
events operate on a per-die basis.

box->refcnt is the refcnt used by MSR/MMIO PMUs, where online/offline
events are per-CPU based.
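
For illustration only, the per-CPU counting described above for MSR/MMIO
boxes can be modelled in userspace C. This is a sketch, not the kernel
code: struct model_box, model_box_ref() and model_box_unref() are
hypothetical names, C11 atomics stand in for the kernel's atomic_t, and
a plain bool stands in for UNCORE_BOX_FLAG_INITIALIZED.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for struct intel_uncore_box; only the fields
 * needed to illustrate the counting are modelled.
 */
struct model_box {
	atomic_int refcnt;	/* number of online CPUs in this die */
	bool initialized;	/* stands in for UNCORE_BOX_FLAG_INITIALIZED */
};

/* A CPU in the die comes online: the first one initializes the box. */
static void model_box_ref(struct model_box *box)
{
	/* atomic_fetch_add() returns the value before the increment. */
	if (atomic_fetch_add(&box->refcnt, 1) == 0)
		box->initialized = true;
}

/* A CPU in the die goes offline: the last one tears the box down. */
static void model_box_unref(struct model_box *box)
{
	int old = atomic_fetch_sub(&box->refcnt, 1);

	assert(old > 0);	/* an unbalanced unref would be a bug */
	if (old == 1)
		box->initialized = false;
}
```

In this model the counter only ever changes from a CPU online/offline
event, which is the property the PCI register path does not have.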
In this particular case (uncore_pci_pmu_register(), prior to the
upcoming refactor), the path that increments box->refcnt also increments
pmu->activeboxes. This makes box->refcnt completely redundant for PCI
PMUs and effectively a misuse of a per-CPU field for a per-die operation.
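
As a rough userspace model of the per-die counting on the PCI side (an
illustrative sketch under assumptions, not the kernel implementation):
struct model_pci_pmu, model_pci_register(), model_pci_unregister() and
MODEL_MAX_DIES are hypothetical names, and the bool array stands in for
the pmu->boxes[die] pointers. Each die contributes at most one box, so a
per-box counter incremented on this path could never exceed 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_MAX_DIES 4	/* arbitrary limit for the sketch */

/* Hypothetical stand-in for the PCI-relevant part of intel_uncore_pmu. */
struct model_pci_pmu {
	atomic_int activeboxes;			/* dies with a live box */
	bool box_present[MODEL_MAX_DIES];	/* models pmu->boxes[die] */
	bool registered;			/* PMU registered with perf */
};

/* A PCI device for @die is probed: the first active box registers the PMU. */
static int model_pci_register(struct model_pci_pmu *pmu, int die)
{
	if (die < 0 || die >= MODEL_MAX_DIES || pmu->box_present[die])
		return -1;

	pmu->box_present[die] = true;
	if (atomic_fetch_add(&pmu->activeboxes, 1) == 0)
		pmu->registered = true;
	return 0;
}

/* The device for @die is removed: the last active box unregisters the PMU. */
static void model_pci_unregister(struct model_pci_pmu *pmu, int die)
{
	assert(die >= 0 && die < MODEL_MAX_DIES && pmu->box_present[die]);

	pmu->box_present[die] = false;
	if (atomic_fetch_sub(&pmu->activeboxes, 1) == 1)
		pmu->registered = false;
}
```

No CPU online/offline event appears anywhere in this model; the only
lifetime counter it needs is the per-die one.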
If I’m missing any scenarios where PCI PMU boxes still require
additional protection from box->refcnt, please help point them out.
>>
>>>>> Could we keep this and just decrease the refcnt in
>>>>> uncore_pci_pmu_unregister()? Thanks.
>>>>>
>>>>>
>>>>>>  	box->dieid = die;
>>>>>>  	box->pci_dev = pdev;
>>>>>>  	box->pmu = pmu;
>>>>>> @@ -1518,7 +1517,7 @@ static void uncore_change_context(struct intel_uncore_type **uncores,
>>>>>>  		uncore_change_type_ctx(*uncores, old_cpu, new_cpu);
>>>>>>  }
>>>>>>
>>>>>> -static void uncore_box_unref(struct intel_uncore_type **types, int id)
>>>>>> +static void uncore_box_unref(struct intel_uncore_type **types, int die)
>>>>>>  {
>>>>>>  	struct intel_uncore_type *type;
>>>>>>  	struct intel_uncore_pmu *pmu;
>>>>>> @@ -1529,7 +1528,7 @@ static void uncore_box_unref(struct intel_uncore_type **types, int id)
>>>>>>  		type = *types;
>>>>>>  		pmu = type->pmus;
>>>>>>  		for (i = 0; i < type->num_boxes; i++, pmu++) {
>>>>>> -			box = pmu->boxes[id];
>>>>>> +			box = pmu->boxes[die];
>>>>>>  			if (box && box->cpu >= 0 && atomic_dec_return(&box->refcnt) == 0)
>>>>>>  				uncore_box_exit(box);
>>>>>>  		}
>>>>>> @@ -1604,14 +1603,14 @@ static int allocate_boxes(struct intel_uncore_type **types,
>>>>>>  }
>>>>>>
>>>>>>  static int uncore_box_ref(struct intel_uncore_type **types,
>>>>>> -			  int id, unsigned int cpu)
>>>>>> +			  int die, unsigned int cpu)
>>>>>>  {
>>>>>>  	struct intel_uncore_type *type;
>>>>>>  	struct intel_uncore_pmu *pmu;
>>>>>>  	struct intel_uncore_box *box;
>>>>>>  	int i, ret;
>>>>>>
>>>>>> -	ret = allocate_boxes(types, id, cpu);
>>>>>> +	ret = allocate_boxes(types, die, cpu);
>>>>>>  	if (ret)
>>>>>>  		return ret;
>>>>>>
>>>>>> @@ -1619,7 +1618,7 @@ static int uncore_box_ref(struct intel_uncore_type **types,
>>>>>>  		type = *types;
>>>>>>  		pmu = type->pmus;
>>>>>>  		for (i = 0; i < type->num_boxes; i++, pmu++) {
>>>>>> -			box = pmu->boxes[id];
>>>>>> +			box = pmu->boxes[die];
>>>>>>  			if (box && box->cpu >= 0 && atomic_inc_return(&box->refcnt) == 1)
>>>>>>  				uncore_box_init(box);
>>>>>>  		}
>>>>>> diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
>>>>>> index c2e5ccb1d72c..bad5d8dec8e0 100644
>>>>>> --- a/arch/x86/events/intel/uncore.h
>>>>>> +++ b/arch/x86/events/intel/uncore.h
>>>>>> @@ -185,7 +185,7 @@ struct intel_uncore_box {
>>>>>> #define CFL_UNC_CBO_7_PERFEVTSEL0 0xf70
>>>>>> #define CFL_UNC_CBO_7_PER_CTR0 0xf76
>>>>>>
>>>>>> -#define UNCORE_BOX_FLAG_INITIATED 0
>>>>>> +#define UNCORE_BOX_FLAG_INITIALIZED 0
>>>>>> /* event config registers are 8-byte apart */
>>>>>> #define UNCORE_BOX_FLAG_CTL_OFFS8 1
>>>>>> /* CFL 8th CBOX has different MSR space */
>>>>>> @@ -559,7 +559,7 @@ static inline u64 uncore_read_counter(struct intel_uncore_box *box,
>>>>>>
>>>>>>  static inline void uncore_box_init(struct intel_uncore_box *box)
>>>>>>  {
>>>>>> -	if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
>>>>>> +	if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags)) {
>>>>>>  		if (box->pmu->type->ops->init_box)
>>>>>>  			box->pmu->type->ops->init_box(box);
>>>>>>  	}
>>>>>> @@ -567,7 +567,7 @@ static inline void uncore_box_init(struct intel_uncore_box *box)
>>>>>>
>>>>>>  static inline void uncore_box_exit(struct intel_uncore_box *box)
>>>>>>  {
>>>>>> -	if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
>>>>>> +	if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags)) {
>>>>>>  		if (box->pmu->type->ops->exit_box)
>>>>>>  			box->pmu->type->ops->exit_box(box);
>>>>>>  	}
Thread overview: 27+ messages
2026-06-01 17:01 [PATCH v2 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
2026-06-01 17:01 ` [PATCH V2 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure Zide Chen
2026-06-02 7:24 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups Zide Chen
2026-06-02 9:52 ` Mi, Dapeng
2026-06-02 14:16 ` Chen, Zide
2026-06-03 1:13 ` Mi, Dapeng
2026-06-03 15:09 ` Chen, Zide
2026-06-04 1:00 ` Mi, Dapeng
2026-06-04 15:41 ` Chen, Zide [this message]
2026-06-05 0:30 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 3/8] perf/x86/intel/uncore: Let init_box() callback report failures Zide Chen
2026-06-02 9:57 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 4/8] perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails Zide Chen
2026-06-03 1:24 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 5/8] perf/x86/intel/uncore: Factor out box setup code Zide Chen
2026-06-03 1:30 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state Zide Chen
2026-06-03 2:13 ` Mi, Dapeng
2026-06-03 15:46 ` Chen, Zide
2026-06-04 1:15 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering on CPU hotplug Zide Chen
2026-06-03 2:32 ` Mi, Dapeng
2026-06-03 16:40 ` Chen, Zide
2026-06-04 1:16 ` Mi, Dapeng
2026-06-01 17:01 ` [PATCH V2 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMU Zide Chen
2026-06-03 2:43 ` Mi, Dapeng