Re: [PATCH v2] perf/hx_arm_ni: Support uncore ARM NI-700 PMU

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Krzysztof Kozlowski <krzk@kernel.org>
To: "Yang Jialong 杨佳龙" <jialong.yang@shingroup.cn>,
	"Will Deacon" <will@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>
Cc: shenghui.qu@shingroup.cn, ke.zhao@shingroup.cn,
	zhijie.ren@shingroup.cn, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2] perf/hx_arm_ni: Support uncore ARM NI-700 PMU
Date: Wed, 31 Jan 2024 11:36:13 +0100	[thread overview]
Message-ID: <a399c8a5-5972-4e21-b3c0-9201f0f6b146@kernel.org> (raw)
In-Reply-To: <0C0EA95E5AC6D147+ff1001b7-d61b-4365-9a22-b3c4dfacbc53@shingroup.cn>

On 31/01/2024 11:07, Yang Jialong 杨佳龙 wrote:
> 
> 
> 在 2024/1/31 17:38, Krzysztof Kozlowski 写道:
>> On 31/01/2024 10:07, Yang Jialong 杨佳龙 wrote:
>>>
>>>
>>> 在 2024/1/31 15:59, Krzysztof Kozlowski 写道:
>>>> On 31/01/2024 08:08, JiaLong.Yang wrote:
>>>>> This code is based on uncore PMUs arm_smmuv3_pmu and arm-cmn.
>>>>> One ni-700 can have many clock domains. Each of them has only one PMU.
>>>>> Here one PMU corresponds to one 'struct ni_pmu' instance.
>>>>> PMU name will be ni_pmu_N_M, which N means different NI-700s and M means
>>>>> different PMU in one NI-700. If only one NI-700 found in NI-700, name will
>>>>> be ni_pmu_N.
>>>>> Node interface event name will be xxni_N_eventname, such as
>>>>> asni_0_rdreq_any. There are many kinds of type of nodes in one clock
>>>>> domain. Also means that there are many kinds of that in one PMU. So we
>>>>> distinguish them by xxni string. Besides, maybe there are many nodes
>>>>> have same type. So we have number N in event name.
>>>>> By ni_pmu_0_0/asni_0_rdreq_any/, we can pinpoint accurate bus traffic.
>>>>> Example1: perf stat -a -e ni_pmu_0_0/asni_0_rdreq_any/,ni_pmu_0_0/cycles/
>>>>> EXample2: perf stat -a -e ni_pmu_0_0/asni,id=0,event=0x0/
>>>>>
>>>>> Signed-off-by: JiaLong.Yang <jialong.yang@shingroup.cn>
>>>>> ---
>>>>> v1 --> v2:
>>>>> 1. Submit MAINTANER Documentation/ files seperately.
>>>>
>>>> SEPARATE PATCHES, not patchsets. You have now checkpatch warnings
>>>> because of this...
>>>
>>> ...OK. But the MAINTANER file changing should be given in which one
>>> patches.
>>> I will submit patch v3 after talking and your permission.
>>>
>>>>
>>>>> 2. Delete some useless info printing.
>>>>> 3. Change print from pr_xxx to dev_xxx.
>>>>> 4. Fix more than 75 length log info.
>>>>> 5. Fix dts attribute pccs-id.
>>>>> 6. Fix generic name according to DT specification.
>>>>> 7. Some indentation.
>>>>> 8. Del of_match_ptr macro.
>>>>>
>>>>>    drivers/perf/Kconfig     |   11 +
>>>>>    drivers/perf/Makefile    |    1 +
>>>>>    drivers/perf/hx_arm_ni.c | 1284 ++++++++++++++++++++++++++++++++++++++
>>>>>    3 files changed, 1296 insertions(+)
>>>>>    create mode 100644 drivers/perf/hx_arm_ni.c
>>>>>
>>>>> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
>>>>> index ec6e0d9194a1..95ef8b13730f 100644
>>>>> --- a/drivers/perf/Kconfig
>>>>> +++ b/drivers/perf/Kconfig
>>>>> @@ -241,4 +241,15 @@ config CXL_PMU
>>>>>    
>>>>>    	  If unsure say 'm'.
>>>>>    
>>>>> +config HX_ARM_NI_PMU
>>>>> +       tristate "HX ARM NI-700 PMU"
>>>>> +       depends on PPC_HX_C2000 && 64BIT
>>>>
>>>> 1. There is no PPC_HX_C2000.
>>>
>>> I have been used to using this macro. However this macro is not existed
>>> in mainline.
>>> I will replace it with ARM64. And del involved C code if OK.
>>>
>>> 64bit:
>>> __ffs(unsigned long) and __fls(unsigned long) will be wrong in 32bit. I
>>> pass a u64 argument.
>>
>> One thing is where the code is supposed to run, second thing is compile
>> testing.
>>
> 
> Now run on my company product, a 64bit PowerPC...
> But I think it's general for 64bit systems.
> 
>> Why do you use __ffs, not __ffs64 which takes u64 if you really want
>> only 64bit argument? unsigned long != u64, so your code is not
>> architecture independent. You claim you wrote it on purpose as
>> non-architecture-independent, but then I claim it's a bug. We are
>> supposed to write code which is portable, as much as possible, assuming
>> it does not affect readability.
>>
> 
> I write code in v5.18, there are __ffs64() and fls64(). Asymmetric.

Sorry, that's a no go.

That's some very, very old kernel. Do not develop on old kernels, but on
mainline. I also suspect that by basing your work on old kernel, you
duplicate a lot of issues already fixed.

> There are some difference in return val between __ffs() and ffs64().
> __ffs(0) and ffs64(0) will give different value.

__ffs64 calls __ffs, so why would results be different?

Anyway, that's not really excuse.


> 
> And I'm sure code run in 64bit. So I choose to use __ffs and __fls.
> 
> Maybe it could be compatbile with 32bit. But I don't have a environment 
> to test this.
>>
>>> struct ni_hw_perf_event will be big than limit.
>>> BUILD_BUG_ON(sizeof(struct ni_hw_perf_event) > offsetof(struct
>>> hw_perf_event, target));
>>
>> And why do you need to use any of such code? Please open one of hundreds
>> of other drivers which work correctly on 32 and 64-bit platforms.
>>
> 
> Code for 64bit.
> This code is to avoid struct ni_hw_perf_event is too big than struct 
> hw_perf_event::target.

1. Why would that matter? target is task_struct. It's size does not
matter. Maybe its offset matters, but not size.

2. So you claim that on 32-bit system the structure will be bigger than
on 64-bit system?

> I learn it from arm-cmn.c.

Are you copying patterns because they are good patterns or just because
you decided to copy?

> ni_hw_perf_event will replace hw_perf_event.
> I will put some useful information in it with less space and good field 
> names.
> But I can't exceed a limit.
> 
>>>
>>>> 2. Nothing justified dependency on 64bit. Drop or explain. Your previous
>>>> message did not provide real rationale.
>>>
>>> If ARM64, then drop.
>>
>> ...
>>
>> ...
>>
>>>>> +	/* Step2: Traverse all clock domains. */
>>>>> +	for (cd_idx = 0; cd_idx < ni->cd_num; cd_idx++) {
>>>>> +		cd = cd_arrays[cd_idx];
>>>>> +
>>>>> +		num = ni_child_number(cd);
>>>>> +		dev_dbg(dev, "The %dth clock domain has %d child nodes:", cd_idx, num);
>>>>> +
>>>>> +		/* Omit pmu node */
>>>>> +		ni_pmu = devm_kzalloc(dev, struct_size(ni_pmu, ev_src_nodes, num - 1),
>>>>> +				      GFP_KERNEL);
>>>>> +		ni_pmu->ev_src_num = num - 1;
>>>>> +
>>>>> +		if (!ni_pmu)
>>>>> +			return -ENOMEM;
>>>>> +
>>>>> +		num_idx = 0;
>>>>> +		for (nd_idx = 0; nd_idx < num; nd_idx++) {
>>>>> +			nd = ni_child_pointer(pbase, cd, nd_idx);
>>>>> +
>>>>> +			node.base = nd;
>>>>> +			node.node_type = ni_node_node_type(nd);
>>>>> +
>>>>> +			if (unlikely(ni_node_type(nd) == NI_PMU))
>>>>> +				ni_pmu->pmu_node = node;
>>>>> +			else
>>>>> +				ni_pmu->ev_src_nodes[num_idx++] = node;
>>>>> +			dev_dbg(dev, "  name: %s   id: %d", ni_node_name[node.type], node.id);
>>>>> +		}
>>>>> +
>>>>> +		ni_pmu->dev = dev;
>>>>> +		ni_pmu->ni = ni;
>>>>> +		ni->ni_pmus[cd_idx] = ni_pmu;
>>>>> +	}
>>>>> +
>>>>> +	devm_kfree(dev, cd_arrays);
>>>>
>>>> Why? If it is not device-lifetime then allocate with usual way.
>>>>
>>>
>>> No device-lifetime.
>>> Will allocate in stack.
>>
>> I was thinking about kzalloc. But if array is small, stack could be as well.
>>
> 
> If I have to return before devm_kfree because of wrong, I will have to use:
> 
> goto out;
> 
> out:
> kfree();
> 
> But if I use devm_kzalloc, I will not be worried about that. Even if no 

devm* is not for that purpose. devm is for device-managed allocations.
Device does not manage your allocation.

> device-lifetime.
> Isn't this a good way?

Then you want cleanup.h and use proper __free().

Best regards,
Krzysztof

next prev parent reply	other threads:[~2024-01-31 10:36 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-31  7:08 [PATCH v2] perf/hx_arm_ni: Support uncore ARM NI-700 PMU JiaLong.Yang
2024-01-31  7:59 ` Krzysztof Kozlowski
2024-01-31  9:07   ` Yang Jialong 杨佳龙
2024-01-31  9:38     ` Krzysztof Kozlowski
2024-01-31 10:07       ` Yang Jialong 杨佳龙
2024-01-31 10:36         ` Krzysztof Kozlowski [this message]
2024-02-01  3:00           ` Yang Jialong 杨佳龙
     [not found]           ` <cefbbefe-9b10-4ae4-9bb8-af6eb6916f6a@shingroup.cn>
2024-02-01  6:35             ` Yang Jialong 杨佳龙
2024-01-31 16:50 ` Robin Murphy
2024-02-01  2:40   ` Yang Jialong 杨佳龙
2024-02-02 20:11     ` Robin Murphy
2024-02-29  2:22       ` Yang Jialong 杨佳龙
     [not found]       ` <fb75eadf-c029-4da7-bda5-300b7d7f51c1@shingroup.cn>
2024-02-29  2:26         ` Yang Jialong 杨佳龙

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a399c8a5-5972-4e21-b3c0-9201f0f6b146@kernel.org \
    --to=krzk@kernel.org \
    --cc=jialong.yang@shingroup.cn \
    --cc=ke.zhao@shingroup.cn \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=shenghui.qu@shingroup.cn \
    --cc=will@kernel.org \
    --cc=zhijie.ren@shingroup.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox