From: Zhang Rui <rui.zhang@intel.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: rjw@rjwysocki.net, daniel.lezcano@linaro.org,
linux-pm@vger.kernel.org, srinivas.pandruvada@linux.intel.com
Subject: Re: [PATCH 1/6] thermal/intel: Introduce Intel TCC library
Date: Fri, 09 Dec 2022 00:49:21 +0800 [thread overview]
Message-ID: <c827d96fcf50a39ebd219cd09d9b4bd1db2f0398.camel@intel.com> (raw)
In-Reply-To: <CAJZ5v0hOzCRnoH52EEMZm+CRDNKSYg2iZax2CnowzdMXjX8QYA@mail.gmail.com>
On Thu, 2022-12-08 at 14:49 +0100, Rafael J. Wysocki wrote:
> On Tue, Nov 8, 2022 at 4:31 AM Zhang Rui <rui.zhang@intel.com> wrote:
> > There are several different drivers that accesses the Intel TCC
> > (thermal control circuitry) MSRs, and each of them has its own
> > implementation for the same functionalities, e.g. getting the
> > current
> > temperature, getting the tj_max, and getting/setting the tj_max
> > offset.
> >
> > Introduce a library to unify the code for Intel CPU TCC MSR access.
> >
> > At the same time, ensure the temperature is got based on the
> > updated
> > tjmax value because tjmax can be changed at runtime for cases like
> > the Intel SST-PP (Intel Speed Select Technology - Performance
> > Profile)
> > level change.
> >
> > Signed-off-by: Zhang Rui <rui.zhang@intel.com>
>
> Nice series, overall I like it a lot, but I have some comments
> regarding this particular patch (below). Some of them are arguably
> minor, but at least one thing is more serious.
Hi, Rafael,
Thanks for reviewing the patch set.
>
> > ---
> > drivers/thermal/intel/Kconfig | 4 +
> > drivers/thermal/intel/Makefile | 1 +
> > drivers/thermal/intel/intel_tcc.c | 131
> > ++++++++++++++++++++++++++++++
> > include/linux/intel_tcc.h | 18 ++++
> > 4 files changed, 154 insertions(+)
> > create mode 100644 drivers/thermal/intel/intel_tcc.c
> > create mode 100644 include/linux/intel_tcc.h
> >
> > diff --git a/drivers/thermal/intel/Kconfig
> > b/drivers/thermal/intel/Kconfig
> > index f0c845679250..6b938c040d6e 100644
> > --- a/drivers/thermal/intel/Kconfig
> > +++ b/drivers/thermal/intel/Kconfig
> > @@ -12,6 +12,10 @@ config X86_THERMAL_VECTOR
> > def_bool y
> > depends on X86 && CPU_SUP_INTEL && X86_LOCAL_APIC
> >
> > +config INTEL_TCC
> > + bool
> > + depends on X86
> > +
> > config X86_PKG_TEMP_THERMAL
> > tristate "X86 package temperature thermal driver"
> > depends on X86_THERMAL_VECTOR
> > diff --git a/drivers/thermal/intel/Makefile
> > b/drivers/thermal/intel/Makefile
> > index 9a8d8054f316..5d8833c82ab6 100644
> > --- a/drivers/thermal/intel/Makefile
> > +++ b/drivers/thermal/intel/Makefile
> > @@ -2,6 +2,7 @@
> > #
> > # Makefile for various Intel thermal drivers.
> >
> > +obj-$(CONFIG_INTEL_TCC) += intel_tcc.o
> > obj-$(CONFIG_INTEL_POWERCLAMP) += intel_powerclamp.o
> > obj-$(CONFIG_X86_PKG_TEMP_THERMAL) += x86_pkg_temp_thermal.o
> > obj-$(CONFIG_INTEL_SOC_DTS_IOSF_CORE) += intel_soc_dts_iosf.o
> > diff --git a/drivers/thermal/intel/intel_tcc.c
> > b/drivers/thermal/intel/intel_tcc.c
> > new file mode 100644
> > index 000000000000..74b434914975
> > --- /dev/null
> > +++ b/drivers/thermal/intel/intel_tcc.c
> > @@ -0,0 +1,131 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * intel_tcc.c - Library for Intel TCC (thermal control circuitry)
> > MSR access
> > + * Copyright (c) 2022, Intel Corporation.
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/intel_tcc.h>
> > +#include <asm/msr.h>
> > +
> > +/**
> > + * intel_tcc_get_tjmax() - returns the default TCC activation
> > Temperature
> > + * @cpu: cpu that the MSR should be run on.
> > + * @tjmax: a valid pointer to where to store the Tjmax value
> > + *
> > + * Get the TjMax value, which is the default thermal throttling or
> > TCC
> > + * activation temperature in degrees C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +
>
> This extra empty line is not needed (and not desirable even). And
> same below.
Sure. will remove it.
>
> > +int intel_tcc_get_tjmax(int cpu, int *tjmax)
> > +{
> > + u32 eax, edx;
> > + int err;
> > +
> > + err = rdmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > + &eax, &edx);
>
> The current trend is to align the arguments after a line break with
> the first one, but I would just put them all on the same line (and
> below too).
I agree putting them in the same line looks prettier.
>
> > + if (err)
> > + return err;
> > +
> > + *tjmax = (eax >> 16) & 0xff;
>
> This means that the tjmax value cannot be negative.
>
> > +
> > + return *tjmax ? 0 : -EINVAL;
>
> So the return value of this function could be tjmax if positive or a
> negative error code otherwise. No return pointers needed.
I tried both. And I think I chose this solution just because it makes
the following cleanup patches in this series looks prettier.
I will try your suggestion, and if there is any other reason I wrote it
in this way, I will find it out again. :p
>
> And why do you want to return -EINVAL (rather than any other error
> code) if tjmax turns out to be 0?
Because x86_pkg_temp_thermal driver returns -EINVAL previously.
Any suggestions on this?
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_get_tjmax, INTEL_TCC);
> > +
> > +/**
> > + * intel_tcc_get_offset() - returns the TCC Offset value to Tjmax
> > + * @cpu: cpu that the MSR should be run on.
> > + * @offset: a valid pointer to where to store the offset value
> > + *
> > + * Get the TCC offset value to Tjmax. The effective thermal
> > throttling or TCC
> > + * activation temperature equals "Tjmax" - "TCC Offset", in
> > degrees C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +
> > +int intel_tcc_get_offset(int cpu, int *offset)
> > +{
> > + u32 eax, edx;
> > + int err;
> > +
> > + err = rdmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > + &eax, &edx);
> > + if (err)
> > + return err;
> > +
> > + *offset = (eax >> 24) & 0x3f;
>
> Well, offset cannot be negative here, so (again) the return value of
> this function could be interpreted as the offsent (if non-negative)
> or
> a negative error code on failure.
>
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_get_offset, INTEL_TCC);
> > +
> > +/**
> > + * intel_tcc_set_offset() - set the TCC offset value to Tjmax
> > + * @cpu: cpu that the MSR should be run on.
> > + * @offset: TCC offset value in degree C
>
> I think that this cannot be negative, so maybe say "in K" instead of
> "in degree C"?
degree C is the unit used in the Intel SDM, so better to keep this
comment aligned.
>
> And maybe it's better to pass u8 here?
sounds good, will do in next version.
>
> > + *
> > + * Set the TCC Offset value to Tjmax. The effective thermal
> > throttling or TCC
> > + * activation temperature equals "Tjmax" - "TCC Offset", in degree
> > C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +
> > +int intel_tcc_set_offset(int cpu, int offset)
> > +{
> > + u32 eax, edx;
> > + int err;
> > +
> > + if (offset > 0x3f)
> > + return -EINVAL;
> > +
> > + err = rdmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > + &eax, &edx);
> > + if (err)
> > + return err;
> > +
> > + if (eax & BIT(31))
> > + return -EPERM;
>
> Why -EPERM?
Bit 31 set means the MSR is locked, and os software cannot write it.
should I use -EACCES instead?
>
> > +
> > + eax &= ~(0x3f << 24);
> > + eax |= (offset << 24);
>
> The parens are not needed AFAICS.
>
Agreed.
> > +
> > + return wrmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > eax, edx);
>
> So is any protection against concurrent access needed here? Like
> what
> if two different callers invoke this function at the same time for
> the
> same CPU?
Given that the tcc offset is the only field that kernel code writes,
all the other bits won't change, so this is not a problem.
>
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_set_offset, INTEL_TCC);
> > +
> > +/**
> > + * intel_tcc_get_temp() - returns the current temperature
> > + * @cpu: cpu that the MSR should be run on.
> > + * @pkg: true: Package Thermal Sensor. false: Core Thermal Sensor.
> > + * @temp: a valid pointer to where to store the resulting
> > temperature
> > + *
> > + * Get the current temperature returned by the CPU core/package
> > level
> > + * thermal sensor, in degrees C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +int intel_tcc_get_temp(int cpu, bool pkg, int *temp)
> > +{
> > + u32 eax, edx;
> > + u32 msr = pkg ? MSR_IA32_PACKAGE_THERM_STATUS :
> > MSR_IA32_THERM_STATUS;
> > + int tjmax, err;
> > +
> > + err = intel_tcc_get_tjmax(cpu, &tjmax);
> > + if (err)
> > + return err;
>
> Well, what if somebody change tjmax on this cpu while this function
> is running?
tjmax is read only. Only firmware can change its value at runtime, say
this field is updated when SST-PP level is changed.
thanks,
rui
>
> > +
> > + err = rdmsr_safe_on_cpu(cpu, msr, &eax, &edx);
> > + if (err)
> > + return err;
> > +
> > + if (eax & 0x80000000) {
> > + *temp = tjmax - ((eax >> 16) & 0x7f);
> > + return 0;
> > + }
> > + return -EINVAL;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_get_temp, INTEL_TCC);
> > +
> > diff --git a/include/linux/intel_tcc.h b/include/linux/intel_tcc.h
> > new file mode 100644
> > index 000000000000..94f8ceab5dd0
> > --- /dev/null
> > +++ b/include/linux/intel_tcc.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * header for Intel TCC (thermal control circuitry) library
> > + *
> > + * Copyright (C) 2022 Intel Corporation.
> > + */
> > +
> > +#ifndef __INTEL_TCC_H__
> > +#define __INTEL_TCC_H__
> > +
> > +#include <linux/types.h>
> > +
> > +int intel_tcc_get_tjmax(int cpu, int *tjmax);
> > +int intel_tcc_get_offset(int cpu, int *offset);
> > +int intel_tcc_set_offset(int cpu, int offset);
> > +int intel_tcc_get_temp(int cpu, bool pkg, int *temp);
> > +
> > +#endif /* __INTEL_TCC_H__ */
> > --
next prev parent reply other threads:[~2022-12-08 16:49 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-08 3:33 [PATCH 0/6] thermal/intel: Introduce intel-tcc lib and enhance tjmax handling Zhang Rui
2022-11-08 3:33 ` [PATCH 1/6] thermal/intel: Introduce Intel TCC library Zhang Rui
2022-12-08 13:49 ` Rafael J. Wysocki
2022-12-08 14:07 ` Rafael J. Wysocki
2022-12-09 13:57 ` Zhang Rui
2022-12-08 16:49 ` Zhang Rui [this message]
2022-12-08 17:00 ` Rafael J. Wysocki
2022-12-09 13:32 ` Zhang Rui
2022-12-09 16:28 ` srinivas pandruvada
2022-12-11 7:50 ` Zhang Rui
2022-12-12 12:15 ` Rafael J. Wysocki
2022-12-11 7:23 ` Zhang Rui
2022-12-12 12:11 ` Rafael J. Wysocki
2022-12-13 1:38 ` Zhang Rui
2022-12-13 15:34 ` Rafael J. Wysocki
2022-12-14 16:15 ` Zhang Rui
2022-12-14 16:17 ` Rafael J. Wysocki
2022-12-14 16:19 ` Rafael J. Wysocki
2022-11-08 3:33 ` [PATCH 2/6] thermal/int340x/processor_thermal: Use " Zhang Rui
2022-11-08 3:33 ` [PATCH 3/6] thermal/intel/intel_soc_dts_iosf: " Zhang Rui
2022-11-08 3:33 ` [PATCH 4/6] thermal/intel/intel_tcc_cooling: " Zhang Rui
2022-11-08 3:33 ` [PATCH 5/6] thermal/x86_pkg_temp_thermal: " Zhang Rui
2022-11-08 3:33 ` [PATCH 6/6] thermal/x86_pkg_temp_thermal: Add support for handling dynamic tjmax Zhang Rui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c827d96fcf50a39ebd219cd09d9b4bd1db2f0398.camel@intel.com \
--to=rui.zhang@intel.com \
--cc=daniel.lezcano@linaro.org \
--cc=linux-pm@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=rjw@rjwysocki.net \
--cc=srinivas.pandruvada@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox