* [Patch v3 0/2] Add support for _TFP and change throttle pctg
@ 2023-10-06 15:36 Sumit Gupta
2023-10-06 15:36 ` [Patch v3 1/2] ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support Sumit Gupta
2023-10-06 15:36 ` [Patch v3 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241 Sumit Gupta
0 siblings, 2 replies; 5+ messages in thread
From: Sumit Gupta @ 2023-10-06 15:36 UTC (permalink / raw)
To: rafael, rui.zhang, lenb, linux-acpi, linux-tegra
Cc: treding, jonathanh, bbasu, sumitg, sanjayc, ksitaraman, srikars,
jbrasen
This patch set adds two improvements that give finer control over the
impact of thermal throttling on performance.
1) Patch 1: Adds support for reading the "Thermal fast Sampling Period (_TFP)"
ACPI object and using it instead of the "Thermal Sampling Period (_TSP)" for
Passive cooling if both are present.
2) Patch 2: Adds support for reducing the CPUFREQ reduction percentage so
that throttling is not always done in steps of "20%" on the Tegra241 SoC.
Both patches can be applied independently.
---
v2[2] -> v3:
- Patch1: rebased on top of linux-next.
- Patch2: use __read_mostly for the cpufreq_thermal_* variables.
: add static to new function acpi_thermal_cpufreq_config_nvidia.
: add null function if CONFIG_HAVE_ARM_SMCCC_DISCOVERY undefined
: removed redundant parentheses.
v1[1] -> v2:
- Patch1: add ACPI spec section info in commit description and rebased.
- Patch2: add info about hardware in the commit description.
: switched CPUFREQ THERMAL tuning macros to static variables.
: update the tunings for Tegra241 SoC only using soc_id check.
Jeff Brasen (1):
ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support
Srikar Srimath Tirumala (1):
ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241
drivers/acpi/processor_thermal.c | 43 +++++++++++++++++++++++++++++---
drivers/acpi/thermal.c | 17 ++++++++-----
2 files changed, 51 insertions(+), 9 deletions(-)
[2] https://lore.kernel.org/lkml/20230913164659.9345-1-sumitg@nvidia.com/
[1] https://lore.kernel.org/lkml/20230817093011.1378-1-sumitg@nvidia.com/
--
2.17.1
* [Patch v3 1/2] ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support
2023-10-06 15:36 [Patch v3 0/2] Add support for _TFP and change throttle pctg Sumit Gupta
@ 2023-10-06 15:36 ` Sumit Gupta
2023-10-06 15:36 ` [Patch v3 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241 Sumit Gupta
1 sibling, 0 replies; 5+ messages in thread
From: Sumit Gupta @ 2023-10-06 15:36 UTC (permalink / raw)
To: rafael, rui.zhang, lenb, linux-acpi, linux-tegra
Cc: treding, jonathanh, bbasu, sumitg, sanjayc, ksitaraman, srikars,
jbrasen

From: Jeff Brasen <jbrasen@nvidia.com>

Add support of "Thermal fast Sampling Period (_TFP)" for Passive cooling.
As per [1], _TFP overrides the "Thermal Sampling Period (_TSP)" if both
are present in a Thermal zone.

[1] ACPI Specification 6.4 - section 11.4.17. _TFP (Thermal fast Sampling Period)

Signed-off-by: Jeff Brasen <jbrasen@nvidia.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---
 drivers/acpi/thermal.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
index 5c21079fbeb6..7eccb36c184a 100644
--- a/drivers/acpi/thermal.c
+++ b/drivers/acpi/thermal.c
@@ -90,7 +90,7 @@ struct acpi_thermal_passive {
 	struct acpi_thermal_trip trip;
 	unsigned long tc1;
 	unsigned long tc2;
-	unsigned long tsp;
+	unsigned long passive_delay;
 };
 
 struct acpi_thermal_active {
@@ -409,11 +409,16 @@ static bool passive_trip_params_init(struct acpi_thermal *tz)
 
 	tz->trips.passive.tc2 = tmp;
 
-	status = acpi_evaluate_integer(tz->device->handle, "_TSP", NULL, &tmp);
-	if (ACPI_FAILURE(status))
-		return false;
+	status = acpi_evaluate_integer(tz->device->handle, "_TFP", NULL, &tmp);
+	if (ACPI_FAILURE(status)) {
+		status = acpi_evaluate_integer(tz->device->handle, "_TSP", NULL, &tmp);
+		if (ACPI_FAILURE(status))
+			return false;
 
-	tz->trips.passive.tsp = tmp;
+		tz->trips.passive.passive_delay = tmp * 100;
+	} else {
+		tz->trips.passive.passive_delay = tmp;
+	}
 
 	return true;
 }
@@ -909,7 +914,7 @@ static int acpi_thermal_add(struct acpi_device *device)
 
 	acpi_trip = &tz->trips.passive.trip;
 	if (acpi_thermal_trip_valid(acpi_trip)) {
-		passive_delay = tz->trips.passive.tsp * 100;
+		passive_delay = tz->trips.passive.passive_delay;
 
 		trip->type = THERMAL_TRIP_PASSIVE;
 		trip->temperature = acpi_thermal_temp(tz, acpi_trip->temp_dk);
-- 
2.17.1
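The unit handling in patch 1 can be illustrated outside the kernel. The following is only a minimal sketch in plain C, not the driver code: `struct acpi_obj_result` and `pick_passive_delay_ms()` are hypothetical names standing in for the result of `acpi_evaluate_integer()`. It assumes, as the `tmp * 100` conversion in the patch implies, that _TSP is reported in tenths of a second while _TFP is already in milliseconds.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the outcome of evaluating one ACPI object. */
struct acpi_obj_result {
	bool present;			/* object exists and evaluated successfully */
	unsigned long long value;	/* raw integer returned by firmware */
};

/*
 * Choose the passive polling delay in milliseconds.
 * _TFP wins when present and is used as-is (milliseconds).
 * Otherwise fall back to _TSP, which is in tenths of a second,
 * so it is multiplied by 100. Returns false if neither is available.
 */
static bool pick_passive_delay_ms(struct acpi_obj_result tfp,
				  struct acpi_obj_result tsp,
				  unsigned long long *delay_ms)
{
	if (tfp.present) {
		*delay_ms = tfp.value;
		return true;
	}
	if (tsp.present) {
		*delay_ms = tsp.value * 100;
		return true;
	}
	return false;
}
```

With a _TFP of 250 and a _TSP of 10 both present, the sketch yields 250 ms; with only the _TSP it yields 1000 ms.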
* [Patch v3 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241
2023-10-06 15:36 [Patch v3 0/2] Add support for _TFP and change throttle pctg Sumit Gupta
2023-10-06 15:36 ` [Patch v3 1/2] ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support Sumit Gupta
@ 2023-10-06 15:36 ` Sumit Gupta
2023-10-06 15:52 ` Rafael J. Wysocki
1 sibling, 1 reply; 5+ messages in thread
From: Sumit Gupta @ 2023-10-06 15:36 UTC (permalink / raw)
To: rafael, rui.zhang, lenb, linux-acpi, linux-tegra
Cc: treding, jonathanh, bbasu, sumitg, sanjayc, ksitaraman, srikars,
jbrasen

From: Srikar Srimath Tirumala <srikars@nvidia.com>

Current implementation of processor_thermal performs software throttling
in fixed steps of "20%" which can be too coarse for some platforms.
We observed some performance gain after reducing the throttle percentage.
Change the CPUFREQ thermal reduction percentage and maximum thermal steps
to be configurable. Also, update the default values of both for Nvidia
Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
and accordingly the maximum number of thermal steps are increased as they
are derived from the reduction percentage.

Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---
 drivers/acpi/processor_thermal.c | 43 +++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
index b7c6287eccca..677ba8bc3fbc 100644
--- a/drivers/acpi/processor_thermal.c
+++ b/drivers/acpi/processor_thermal.c
@@ -26,7 +26,16 @@
  */
 
 #define CPUFREQ_THERMAL_MIN_STEP 0
-#define CPUFREQ_THERMAL_MAX_STEP 3
+
+static int cpufreq_thermal_max_step __read_mostly = 3;
+
+/*
+ * Minimum throttle percentage for processor_thermal cooling device.
+ * The processor_thermal driver uses it to calculate the percentage amount by
+ * which cpu frequency must be reduced for each cooling state. This is also used
+ * to calculate the maximum number of throttling steps or cooling states.
+ */
+static int cpufreq_thermal_pctg __read_mostly = 20;
 
 static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
 
@@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
 	if (!cpu_has_cpufreq(cpu))
 		return 0;
 
-	return CPUFREQ_THERMAL_MAX_STEP;
+	return cpufreq_thermal_max_step;
 }
 
 static int cpufreq_get_cur_state(unsigned int cpu)
@@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
 		if (!policy)
 			return -EINVAL;
 
-		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
+		max_freq = (policy->cpuinfo.max_freq *
+			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
 
 		cpufreq_cpu_put(policy);
 
@@ -126,10 +136,37 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
 	return 0;
 }
 
+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+#define SMCCC_SOC_ID_T241 0x036b0241
+
+static void acpi_thermal_cpufreq_config_nvidia(void)
+{
+	s32 soc_id = arm_smccc_get_soc_id_version();
+
+	/* Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) */
+	if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
+		return;
+
+	/* Reduce the CPUFREQ Thermal reduction percentage to 5% */
+	cpufreq_thermal_pctg = 5;
+
+	/*
+	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
+	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
+	 * the CPU performance doesn't become 0.
+	 */
+	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
+}
+#else
+static inline void acpi_thermal_cpufreq_config_nvidia(void) {}
+#endif
+
 void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
 {
 	unsigned int cpu;
 
+	acpi_thermal_cpufreq_config_nvidia();
+
 	for_each_cpu(cpu, policy->related_cpus) {
 		struct acpi_processor *pr = per_cpu(processors, cpu);
 		int ret;
-- 
2.17.1
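The throttling arithmetic in patch 2 can be checked with a small standalone sketch. This is an illustration in plain C under stated assumptions, not kernel code: the helper names `max_step_for_pctg()` and `throttled_max_freq_khz()` are hypothetical, and the formulas are copied from the diff above (`(100 / pctg) - 1` for the step limit, and `max_freq * (100 - state * pctg) / 100` for the frequency cap).

```c
/*
 * Highest cooling state for a given per-step reduction percentage,
 * using the derivation from the patch. Stopping one step short of
 * 100 / pctg keeps the frequency cap above zero.
 */
static int max_step_for_pctg(int pctg)
{
	return (100 / pctg) - 1;
}

/*
 * Frequency cap in kHz for a cooling state, mirroring the max_freq
 * expression in cpufreq_set_cur_state(). The caller is assumed to keep
 * state within [0, max_step_for_pctg(pctg)].
 */
static unsigned long throttled_max_freq_khz(unsigned long cpuinfo_max_freq_khz,
					    int state, int pctg)
{
	return (cpuinfo_max_freq_khz * (100 - state * pctg)) / 100;
}
```

For a 5% step the derived limit is 19 states, so a 3,000,000 kHz CPU is capped at 150,000 kHz (5% of maximum) in the deepest state. Note that the patch leaves the default for 20% at the existing maximum of 3 steps rather than deriving it from the formula, so the deepest default cap stays at 40% of the maximum frequency.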
* Re: [Patch v3 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241
2023-10-06 15:36 ` [Patch v3 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241 Sumit Gupta
@ 2023-10-06 15:52 ` Rafael J. Wysocki
2023-10-09 17:28 ` Sumit Gupta
0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2023-10-06 15:52 UTC (permalink / raw)
To: Sumit Gupta
Cc: rafael, rui.zhang, lenb, linux-acpi, linux-tegra, treding,
jonathanh, bbasu, sanjayc, ksitaraman, srikars, jbrasen

On Fri, Oct 6, 2023 at 5:36 PM Sumit Gupta <sumitg@nvidia.com> wrote:
>
> From: Srikar Srimath Tirumala <srikars@nvidia.com>
>
> Current implementation of processor_thermal performs software throttling
> in fixed steps of "20%" which can be too coarse for some platforms.
> We observed some performance gain after reducing the throttle percentage.
> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
> to be configurable. Also, update the default values of both for Nvidia
> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
> and accordingly the maximum number of thermal steps are increased as they
> are derived from the reduction percentage.
>
> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
> ---
>  drivers/acpi/processor_thermal.c | 43 +++++++++++++++++++++++++++++---
>  1 file changed, 40 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
> index b7c6287eccca..677ba8bc3fbc 100644
> --- a/drivers/acpi/processor_thermal.c
> +++ b/drivers/acpi/processor_thermal.c
> @@ -26,7 +26,16 @@
>   */
>
>  #define CPUFREQ_THERMAL_MIN_STEP 0
> -#define CPUFREQ_THERMAL_MAX_STEP 3
> +
> +static int cpufreq_thermal_max_step __read_mostly = 3;
> +
> +/*
> + * Minimum throttle percentage for processor_thermal cooling device.
> + * The processor_thermal driver uses it to calculate the percentage amount by
> + * which cpu frequency must be reduced for each cooling state. This is also used
> + * to calculate the maximum number of throttling steps or cooling states.
> + */
> +static int cpufreq_thermal_pctg __read_mostly = 20;
>
>  static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>
> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
>  	if (!cpu_has_cpufreq(cpu))
>  		return 0;
>
> -	return CPUFREQ_THERMAL_MAX_STEP;
> +	return cpufreq_thermal_max_step;
>  }
>
>  static int cpufreq_get_cur_state(unsigned int cpu)
> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>  		if (!policy)
>  			return -EINVAL;
>
> -		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
> +		max_freq = (policy->cpuinfo.max_freq *
> +			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>
>  		cpufreq_cpu_put(policy);
>
> @@ -126,10 +136,37 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>  	return 0;
>  }
>
> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> +#define SMCCC_SOC_ID_T241 0x036b0241
> +
> +static void acpi_thermal_cpufreq_config_nvidia(void)
> +{
> +	s32 soc_id = arm_smccc_get_soc_id_version();
> +
> +	/* Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) */
> +	if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
> +		return;
> +
> +	/* Reduce the CPUFREQ Thermal reduction percentage to 5% */
> +	cpufreq_thermal_pctg = 5;
> +
> +	/*
> +	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
> +	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
> +	 * the CPU performance doesn't become 0.
> +	 */
> +	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
> +}

Looks better now, but one more thing: This is introducing an
ARM-specific piece of code into an otherwise generic file and there is
drivers/acpi/arm64/ for ARM-specific code, so I would very much prefer
this piece of code to go there.

Of course, it won't be able to modify the static variables directly
then, but what if instead it defines functions to return the
appropriate values?

The variables in question could be initialized with the help of those
functions then.

> +#else
> +static inline void acpi_thermal_cpufreq_config_nvidia(void) {}
> +#endif
> +
>  void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
>  {
>  	unsigned int cpu;
>
> +	acpi_thermal_cpufreq_config_nvidia();
> +
>  	for_each_cpu(cpu, policy->related_cpus) {
>  		struct acpi_processor *pr = per_cpu(processors, cpu);
>  		int ret;
> --
> 2.17.1
>
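One possible shape of the suggestion above, sketched in standalone C: platform-specific code exposes functions that return the tuning values, and the generic code only calls them when initializing its own variables. Everything here is an assumption for illustration and not necessarily what a later revision of the series does: the names `arch_cpufreq_thermal_pctg()`, `arch_cpufreq_thermal_max_step()`, `struct thermal_tuning` and `thermal_tuning_init()` are hypothetical, and the SoC ID is passed as a parameter instead of being read with `arm_smccc_get_soc_id_version()` so that the sketch builds anywhere.

```c
#define SMCCC_SOC_ID_T241 0x036b0241

/*
 * Platform side (would sit with the ARM-specific code).
 * Hypothetical accessors that return values instead of writing
 * to variables owned by the generic file.
 */
static int arch_cpufreq_thermal_pctg(int soc_id)
{
	/* 5% steps on Tegra241, the generic 20% default elsewhere */
	return soc_id == SMCCC_SOC_ID_T241 ? 5 : 20;
}

static int arch_cpufreq_thermal_max_step(int soc_id)
{
	/* (100 / 5) - 1 = 19 steps on Tegra241, 3 elsewhere */
	return soc_id == SMCCC_SOC_ID_T241 ? (100 / 5) - 1 : 3;
}

/* Generic side: holds the tunables and fills them from the accessors. */
struct thermal_tuning {
	int pctg;	/* frequency reduction per cooling state, in percent */
	int max_step;	/* highest cooling state */
};

static struct thermal_tuning thermal_tuning_init(int soc_id)
{
	struct thermal_tuning t = {
		.pctg = arch_cpufreq_thermal_pctg(soc_id),
		.max_step = arch_cpufreq_thermal_max_step(soc_id),
	};

	return t;
}
```

The point of this split is that the generic file no longer needs an `#ifdef` block or any knowledge of SoC IDs; on other architectures the accessors could simply be stubs returning the defaults.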
* Re: [Patch v3 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241
2023-10-06 15:52 ` Rafael J. Wysocki
@ 2023-10-09 17:28 ` Sumit Gupta
0 siblings, 0 replies; 5+ messages in thread
From: Sumit Gupta @ 2023-10-09 17:28 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: rui.zhang, lenb, linux-acpi, linux-tegra, treding, jonathanh,
bbasu, sanjayc, ksitaraman, srikars, jbrasen, Sumit Gupta

On 06/10/23 21:22, Rafael J. Wysocki wrote:
> External email: Use caution opening links or attachments
>
>
> On Fri, Oct 6, 2023 at 5:36 PM Sumit Gupta <sumitg@nvidia.com> wrote:
>>
>> From: Srikar Srimath Tirumala <srikars@nvidia.com>
>>
>> Current implementation of processor_thermal performs software throttling
>> in fixed steps of "20%" which can be too coarse for some platforms.
>> We observed some performance gain after reducing the throttle percentage.
>> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
>> to be configurable. Also, update the default values of both for Nvidia
>> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
>> and accordingly the maximum number of thermal steps are increased as they
>> are derived from the reduction percentage.
>>
>> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>> ---
>>  drivers/acpi/processor_thermal.c | 43 +++++++++++++++++++++++++++++---
>>  1 file changed, 40 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
>> index b7c6287eccca..677ba8bc3fbc 100644
>> --- a/drivers/acpi/processor_thermal.c
>> +++ b/drivers/acpi/processor_thermal.c
>> @@ -26,7 +26,16 @@
>>   */
>>
>>  #define CPUFREQ_THERMAL_MIN_STEP 0
>> -#define CPUFREQ_THERMAL_MAX_STEP 3
>> +
>> +static int cpufreq_thermal_max_step __read_mostly = 3;
>> +
>> +/*
>> + * Minimum throttle percentage for processor_thermal cooling device.
>> + * The processor_thermal driver uses it to calculate the percentage amount by
>> + * which cpu frequency must be reduced for each cooling state. This is also used
>> + * to calculate the maximum number of throttling steps or cooling states.
>> + */
>> +static int cpufreq_thermal_pctg __read_mostly = 20;
>>
>>  static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>>
>> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
>>  	if (!cpu_has_cpufreq(cpu))
>>  		return 0;
>>
>> -	return CPUFREQ_THERMAL_MAX_STEP;
>> +	return cpufreq_thermal_max_step;
>>  }
>>
>>  static int cpufreq_get_cur_state(unsigned int cpu)
>> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>>  		if (!policy)
>>  			return -EINVAL;
>>
>> -		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
>> +		max_freq = (policy->cpuinfo.max_freq *
>> +			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>>
>>  		cpufreq_cpu_put(policy);
>>
>> @@ -126,10 +136,37 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>>  	return 0;
>>  }
>>
>> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
>> +#define SMCCC_SOC_ID_T241 0x036b0241
>> +
>> +static void acpi_thermal_cpufreq_config_nvidia(void)
>> +{
>> +	s32 soc_id = arm_smccc_get_soc_id_version();
>> +
>> +	/* Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) */
>> +	if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
>> +		return;
>> +
>> +	/* Reduce the CPUFREQ Thermal reduction percentage to 5% */
>> +	cpufreq_thermal_pctg = 5;
>> +
>> +	/*
>> +	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
>> +	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
>> +	 * the CPU performance doesn't become 0.
>> +	 */
>> +	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
>> +}
>
> Looks better now, but one more thing: This is introducing an
> ARM-specific piece of code into an otherwise generic file and there is
> drivers/acpi/arm64/ for ARM-specific code, so I would very much prefer
> this piece of code to go there.
>
> Of course, it won't be able to modify the static variables directly
> then, but what if instead it defines functions to return the
> appropriate values?
>
> The variables in question could be initialized with the help of those
> functions then.
>

Hi Rafael,

Thank you for the review!
I have made the suggested change and sent v4 [1].
Please let me know if it looks fine now or needs any further changes.

[1] https://lore.kernel.org/lkml/20231009171839.12267-1-sumitg@nvidia.com/

Best Regards,
Sumit Gupta

>> +#else
>> +static inline void acpi_thermal_cpufreq_config_nvidia(void) {}
>> +#endif
>> +
>>  void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
>>  {
>>  	unsigned int cpu;
>>
>> +	acpi_thermal_cpufreq_config_nvidia();
>> +
>>  	for_each_cpu(cpu, policy->related_cpus) {
>>  		struct acpi_processor *pr = per_cpu(processors, cpu);
>>  		int ret;
>> --
>> 2.17.1
>>