From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76DE56136 for ; Thu, 16 Jan 2025 00:47:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.180.131 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736988465; cv=none; b=THBi5xcetPxAgH7twGmedTHc+YchA93hxOxeXR7tS5n0x4hGrgg7xs7w3RVk92Qm13sLFDiBVNnq+7Trm0XUtN94cwBVWfbMBQ3Yx/nyOYTdq8aZPDfsrF0A9qXjDQReeTpVcD0AWoHNevKtKLcgeJcYqC47qoAhC7WiUDHYsfg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736988465; c=relaxed/simple; bh=YoaDJtJAc45veq6gIcasNbYOgKM/lCkH4ALxCcp7w8Y=; h=Message-ID:Date:MIME-Version:Subject:To:CC:References:From: In-Reply-To:Content-Type; b=A57HWZRFWXCD+3kZkdsfLHBy9A6giaPkJsMFzwlf0Axxtqj3qUWerq4tAuwkeCSm5ROFzJGzNR2yaiHd4l+ozh/EGBOd918sQSYt0lDB3ARsm4xuCfTOKBF48FgLZ891oLSo/x6W9Ide1iUlbSLultspaDcDnlW0BcS0OLiPsyk= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=quicinc.com; spf=pass smtp.mailfrom=quicinc.com; dkim=pass (2048-bit key) header.d=quicinc.com header.i=@quicinc.com header.b=jWdktes6; arc=none smtp.client-ip=205.220.180.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=quicinc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=quicinc.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=quicinc.com header.i=@quicinc.com header.b="jWdktes6" Received: from pps.filterd (m0279872.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50FGoKOS030501; Thu, 16 Jan 2025 00:47:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=qcppdkim1; bh= N2fg4n7W6YxC1gBLutNGg/gs5TumLQw/CCcXwRpP1qg=; b=jWdktes6/AKsYlZJ /9fXpEzpMxYc7rKPQeyeCaTrkPXKKXmFxIAg1P0y7Kj2VeMJhqtVPk6trOc2Tv0F 2hXPKjWbzloMXAewzne4d9qGi8V5Q8j9wwZqwX6048LlepzXuzAQEI3yZ0AgZ239 qQWMoLOrpEPkpJQaiJqG0V+bwHaIOhK3XhvQTkUfXszb5OR0x//8cZSJFXFB3BJb 14qcsVqukoqzVTPRrSXaTaSCG3RlgWioHAl7opSFctltZLdNOcldqUjUIXl089c/ +z3fKbmGvQ0xHujoF05j/UlaWKGs1gZl9Ty8kQen1olVXuJc/fIgec4NYova+2Tl iAvrUw== Received: from nalasppmta02.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 446fgm169f-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 16 Jan 2025 00:47:36 +0000 (GMT) Received: from nalasex01a.na.qualcomm.com (nalasex01a.na.qualcomm.com [10.47.209.196]) by NALASPPMTA02.qualcomm.com (8.18.1.2/8.18.1.2) with ESMTPS id 50G0lZK7017442 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 16 Jan 2025 00:47:35 GMT Received: from [10.71.108.79] (10.80.80.8) by nalasex01a.na.qualcomm.com (10.47.209.196) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.9; Wed, 15 Jan 2025 16:47:35 -0800 Message-ID: <3b857797-489b-4dfe-af2c-6f25bb8d51ea@quicinc.com> Date: Wed, 15 Jan 2025 16:47:34 -0800 Precedence: bulk X-Mailing-List: linux-arm-msm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v4 8/9] drm/msm/dpu: rework core_perf debugfs overrides To: Dmitry Baryshkov CC: Rob Clark , Sean Paul , Marijn Suijten , David Airlie , Stephen Boyd , Simona Vetter , , , References: <20250106-dpu-perf-rework-v4-0-00b248349476@linaro.org> <20250106-dpu-perf-rework-v4-8-00b248349476@linaro.org> <86152d89-cf42-4e2f-a188-c401de9d47df@quicinc.com> <2qhm7oqpbgyoghpj3txjpmb4q3tmxd23cvrdqxicw3lpmwfcwg@m5mn5flvovmp> Content-Language: en-US From: Abhinav Kumar In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 7bit X-ClientProxiedBy: nasanex01b.na.qualcomm.com (10.46.141.250) To nalasex01a.na.qualcomm.com (10.47.209.196) X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: h-7jz9VA6rv5NdcBg_p6KJL8GLdAkNkp X-Proofpoint-ORIG-GUID: h-7jz9VA6rv5NdcBg_p6KJL8GLdAkNkp X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-15_11,2025-01-15_02,2024-11-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 malwarescore=0 impostorscore=0 mlxlogscore=999 bulkscore=0 suspectscore=0 clxscore=1015 spamscore=0 adultscore=0 priorityscore=1501 phishscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2411120000 definitions=main-2501160003 On 1/15/2025 4:35 PM, Dmitry Baryshkov wrote: > On Wed, Jan 15, 2025 at 11:51:20AM -0800, Abhinav Kumar wrote: >> >> >> On 1/15/2025 12:41 AM, Dmitry Baryshkov wrote: >>> On Tue, Jan 14, 2025 at 02:02:54PM -0800, Abhinav Kumar wrote: >>>> >>>> >>>> On 1/5/2025 7:07 PM, Dmitry Baryshkov wrote: >>>>> Currently debugfs provides separate 'modes' to override calculated >>>>> MDP_CLK rate and interconnect bandwidth votes. Change that to allow >>>>> overriding individual values (e.g. one can override just clock or just >>>>> average bandwidth vote). The maximum values allowed for those entries by >>>>> the platform can be read from the 'max_core_ab' and 'max_core_clk_rate' >>>>> files in debugfs. >>>>> >>>> >>>> Apart from the concern I highlighted in the previous patch, the only issue I >>>> have with this is that, this went from a one step process of using the >>>> "mode" this has become a two step one. >>>> >>>> There were essentially two modes we are talking about - "fixed" and >>>> "minimum" >>>> >>>> With respect to "fixed" this is totally fine because this is preserving that >>>> functionality because to be able to set the fixed mode the end user must >>>> know what values they want to try anyway. >>>> >>>> With respect to "minimum" mode, is where this approach is not that great. >>>> The end users of this can be non-display developers too such as our QA teams >>>> who might want to perform a first level of triage on the issues and route it >>>> accordingly. This is especially true for underruns and some performance lags >>>> as well. >>>> >>>> If you really dont like the term "modes", to preserve the "minimum" mode, >>>> how about just using a bool debugfs like "max_perf_params" which internally >>>> maxes out the max MDP clock and ab/ib params. >>> >>> That's what I'm trying to avoid - having an extra debugfs file which >>> overrides other files. It is much easier to work if there is no need to >>> switch modes, it is easy to overlook it. I think it should be fine to >>> use `cat max_foo > fix_foo` to override each of the params. After >>> renaming the threshold_high to max_core_ab the names of the debugfs >>> files match. >>> >> >> Its just a difference in interpretation IMO. >> >> the "fixed" mode is trying to given an option to incrementally try and see >> which value really works and also to see whether its the clock OR the >> bandwidth which is making the difference. So individual control of those. >> >> The "max" mode is trying to see if even the max values of everything cannot >> fix the problem. BTW, the max was maxing out BOTH the DPU clocks and BW. >> >> So this is not just 2 extra reads for the user but 3. (ab/ib/dpu_clk) if we >> drop "max" and use "fixed" for max as well and even for that the user has to >> refer the max DPU clock value. > > Yes, I understand that. However I still think that it's easier than > having a set of 'fix_foo' values which are silently ignored because of > the preselected mode. > > I can probably see an option: use your max_perf_params idea, but in a > form of a write-only file which immediately selects max values for clock > rate and both bandwidths. WDYT? > Sorry I am missing something here. This is the same thing I had in mind to have it as a bool when someone does echo 1 > max_perf_params, it will immediately max the values for clock rate and bandwidth. So to summarize, there are four nodes: 1) fix_core_ab_vote 2) fix_core_ib_vote 3) fix_core_clk_rate These individually control their respective params 4) max_perf_params - which maxes out all of the above Is this what you are referring to as well? >>>>> Signed-off-by: Dmitry Baryshkov >>>>> --- >>>>> drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 87 +++------------------------ >>>>> drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h | 10 --- >>>>> 2 files changed, 9 insertions(+), 88 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c >>>>> index 7ff3405c6867556a8dc776783b91f1da6c86ef3f..913eb4c01abe10c1ed84215fbbee50abd69e9317 100644 >>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c >>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c >>>>> @@ -17,20 +17,6 @@ >>>>> #include "dpu_crtc.h" >>>>> #include "dpu_core_perf.h" >>>>> -/** >>>>> - * enum dpu_perf_mode - performance tuning mode >>>>> - * @DPU_PERF_MODE_NORMAL: performance controlled by user mode client >>>>> - * @DPU_PERF_MODE_MINIMUM: performance bounded by minimum setting >>>>> - * @DPU_PERF_MODE_FIXED: performance bounded by fixed setting >>>>> - * @DPU_PERF_MODE_MAX: maximum value, used for error checking >>>>> - */ >>>>> -enum dpu_perf_mode { >>>>> - DPU_PERF_MODE_NORMAL, >>>>> - DPU_PERF_MODE_MINIMUM, >>>>> - DPU_PERF_MODE_FIXED, >>>>> - DPU_PERF_MODE_MAX >>>>> -}; >>>>> - >>>>> /** >>>>> * _dpu_core_perf_calc_bw() - to calculate BW per crtc >>>>> * @perf_cfg: performance configuration >>>>> @@ -215,18 +201,16 @@ static int _dpu_core_perf_crtc_update_bus(struct dpu_kms *kms, >>>>> if (!kms->num_paths) >>>>> return 0; >>>>> - if (kms->perf.perf_tune.mode == DPU_PERF_MODE_MINIMUM) { >>>>> - avg_bw = 0; >>>>> - peak_bw = 0; >>>>> - } else if (kms->perf.perf_tune.mode == DPU_PERF_MODE_FIXED) { >>>>> + dpu_core_perf_aggregate(crtc->dev, dpu_crtc_get_client_type(crtc), &perf); >>>>> + >>>>> + avg_bw = div_u64(perf.bw_ctl, 1000); /*Bps_to_icc*/ >>>>> + peak_bw = perf.max_per_pipe_ib; >>>>> + >>>>> + if (kms->perf.fix_core_ab_vote) >>>>> avg_bw = kms->perf.fix_core_ab_vote; >>>>> - peak_bw = kms->perf.fix_core_ib_vote; >>>>> - } else { >>>>> - dpu_core_perf_aggregate(crtc->dev, dpu_crtc_get_client_type(crtc), &perf); >>>>> - avg_bw = div_u64(perf.bw_ctl, 1000); /*Bps_to_icc*/ >>>>> - peak_bw = perf.max_per_pipe_ib; >>>>> - } >>>>> + if (kms->perf.fix_core_ib_vote) >>>>> + peak_bw = kms->perf.fix_core_ib_vote; >>>>> avg_bw /= kms->num_paths; >>>>> @@ -275,12 +259,9 @@ static u64 _dpu_core_perf_get_core_clk_rate(struct dpu_kms *kms) >>>>> struct drm_crtc *crtc; >>>>> struct dpu_crtc_state *dpu_cstate; >>>>> - if (kms->perf.perf_tune.mode == DPU_PERF_MODE_FIXED) >>>>> + if (kms->perf.fix_core_clk_rate) >>>>> return kms->perf.fix_core_clk_rate; >>>>> - if (kms->perf.perf_tune.mode == DPU_PERF_MODE_MINIMUM) >>>>> - return kms->perf.max_core_clk_rate; >>>>> - >>>>> clk_rate = 0; >>>>> drm_for_each_crtc(crtc, kms->dev) { >>>>> if (crtc->enabled) { >>>>> @@ -396,54 +377,6 @@ int dpu_core_perf_crtc_update(struct drm_crtc *crtc, >>>>> #ifdef CONFIG_DEBUG_FS >>>>> -static ssize_t _dpu_core_perf_mode_write(struct file *file, >>>>> - const char __user *user_buf, size_t count, loff_t *ppos) >>>>> -{ >>>>> - struct dpu_core_perf *perf = file->private_data; >>>>> - u32 perf_mode = 0; >>>>> - int ret; >>>>> - >>>>> - ret = kstrtouint_from_user(user_buf, count, 0, &perf_mode); >>>>> - if (ret) >>>>> - return ret; >>>>> - >>>>> - if (perf_mode >= DPU_PERF_MODE_MAX) >>>>> - return -EINVAL; >>>>> - >>>>> - if (perf_mode == DPU_PERF_MODE_FIXED) { >>>>> - DRM_INFO("fix performance mode\n"); >>>>> - } else if (perf_mode == DPU_PERF_MODE_MINIMUM) { >>>>> - /* run the driver with max clk and BW vote */ >>>>> - DRM_INFO("minimum performance mode\n"); >>>>> - } else if (perf_mode == DPU_PERF_MODE_NORMAL) { >>>>> - /* reset the perf tune params to 0 */ >>>>> - DRM_INFO("normal performance mode\n"); >>>>> - } >>>>> - perf->perf_tune.mode = perf_mode; >>>>> - >>>>> - return count; >>>>> -} >>>>> - >>>>> -static ssize_t _dpu_core_perf_mode_read(struct file *file, >>>>> - char __user *buff, size_t count, loff_t *ppos) >>>>> -{ >>>>> - struct dpu_core_perf *perf = file->private_data; >>>>> - int len; >>>>> - char buf[128]; >>>>> - >>>>> - len = scnprintf(buf, sizeof(buf), >>>>> - "mode %d\n", >>>>> - perf->perf_tune.mode); >>>>> - >>>>> - return simple_read_from_buffer(buff, count, ppos, buf, len); >>>>> -} >>>>> - >>>>> -static const struct file_operations dpu_core_perf_mode_fops = { >>>>> - .open = simple_open, >>>>> - .read = _dpu_core_perf_mode_read, >>>>> - .write = _dpu_core_perf_mode_write, >>>>> -}; >>>>> - >>>>> /** >>>>> * dpu_core_perf_debugfs_init - initialize debugfs for core performance context >>>>> * @dpu_kms: Pointer to the dpu_kms struct >>>>> @@ -472,8 +405,6 @@ int dpu_core_perf_debugfs_init(struct dpu_kms *dpu_kms, struct dentry *parent) >>>>> (u32 *)&perf->perf_cfg->min_llcc_ib); >>>>> debugfs_create_u32("min_dram_ib", 0400, entry, >>>>> (u32 *)&perf->perf_cfg->min_dram_ib); >>>>> - debugfs_create_file("perf_mode", 0600, entry, >>>>> - (u32 *)perf, &dpu_core_perf_mode_fops); >>>>> debugfs_create_u64("fix_core_clk_rate", 0600, entry, >>>>> &perf->fix_core_clk_rate); >>>>> debugfs_create_u32("fix_core_ib_vote", 0600, entry, >>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h >>>>> index 5e07119c14c6a9ed3413d0eaddbd93df5cc3f79d..9d8516ca32d162b1e277ec88067e5c21abeb2017 100644 >>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h >>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h >>>>> @@ -24,20 +24,11 @@ struct dpu_core_perf_params { >>>>> u64 core_clk_rate; >>>>> }; >>>>> -/** >>>>> - * struct dpu_core_perf_tune - definition of performance tuning control >>>>> - * @mode: performance mode >>>>> - */ >>>>> -struct dpu_core_perf_tune { >>>>> - u32 mode; >>>>> -}; >>>>> - >>>>> /** >>>>> * struct dpu_core_perf - definition of core performance context >>>>> * @perf_cfg: Platform-specific performance configuration >>>>> * @core_clk_rate: current core clock rate >>>>> * @max_core_clk_rate: maximum allowable core clock rate >>>>> - * @perf_tune: debug control for performance tuning >>>>> * @enable_bw_release: debug control for bandwidth release >>>>> * @fix_core_clk_rate: fixed core clock request in Hz used in mode 2 >>>>> * @fix_core_ib_vote: fixed core ib vote in bps used in mode 2 >>>>> @@ -47,7 +38,6 @@ struct dpu_core_perf { >>>>> const struct dpu_perf_cfg *perf_cfg; >>>>> u64 core_clk_rate; >>>>> u64 max_core_clk_rate; >>>>> - struct dpu_core_perf_tune perf_tune; >>>>> u32 enable_bw_release; >>>>> u64 fix_core_clk_rate; >>>>> u32 fix_core_ib_vote; >>>>> >>> >