Re: [PATCH 1/6] thermal/intel: Introduce Intel TCC library

public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed

From: Zhang Rui <rui.zhang@intel.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: rjw@rjwysocki.net, daniel.lezcano@linaro.org,
	linux-pm@vger.kernel.org, srinivas.pandruvada@linux.intel.com
Subject: Re: [PATCH 1/6] thermal/intel: Introduce Intel TCC library
Date: Fri, 09 Dec 2022 00:49:21 +0800	[thread overview]
Message-ID: <c827d96fcf50a39ebd219cd09d9b4bd1db2f0398.camel@intel.com> (raw)
In-Reply-To: <CAJZ5v0hOzCRnoH52EEMZm+CRDNKSYg2iZax2CnowzdMXjX8QYA@mail.gmail.com>

On Thu, 2022-12-08 at 14:49 +0100, Rafael J. Wysocki wrote:
> On Tue, Nov 8, 2022 at 4:31 AM Zhang Rui <rui.zhang@intel.com> wrote:
> > There are several different drivers that accesses the Intel TCC
> > (thermal control circuitry) MSRs, and each of them has its own
> > implementation for the same functionalities, e.g. getting the
> > current
> > temperature, getting the tj_max, and getting/setting the tj_max
> > offset.
> > 
> > Introduce a library to unify the code for Intel CPU TCC MSR access.
> > 
> > At the same time, ensure the temperature is got based on the
> > updated
> > tjmax value because tjmax can be changed at runtime for cases like
> > the Intel SST-PP (Intel Speed Select Technology - Performance
> > Profile)
> > level change.
> > 
> > Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> 
> Nice series, overall I like it a lot, but I have some comments
> regarding this particular patch (below).  Some of them are arguably
> minor, but at least one thing is more serious.

Hi, Rafael,

Thanks for reviewing the patch set.
> 
> > ---
> >  drivers/thermal/intel/Kconfig     |   4 +
> >  drivers/thermal/intel/Makefile    |   1 +
> >  drivers/thermal/intel/intel_tcc.c | 131
> > ++++++++++++++++++++++++++++++
> >  include/linux/intel_tcc.h         |  18 ++++
> >  4 files changed, 154 insertions(+)
> >  create mode 100644 drivers/thermal/intel/intel_tcc.c
> >  create mode 100644 include/linux/intel_tcc.h
> > 
> > diff --git a/drivers/thermal/intel/Kconfig
> > b/drivers/thermal/intel/Kconfig
> > index f0c845679250..6b938c040d6e 100644
> > --- a/drivers/thermal/intel/Kconfig
> > +++ b/drivers/thermal/intel/Kconfig
> > @@ -12,6 +12,10 @@ config X86_THERMAL_VECTOR
> >         def_bool y
> >         depends on X86 && CPU_SUP_INTEL && X86_LOCAL_APIC
> > 
> > +config INTEL_TCC
> > +       bool
> > +       depends on X86
> > +
> >  config X86_PKG_TEMP_THERMAL
> >         tristate "X86 package temperature thermal driver"
> >         depends on X86_THERMAL_VECTOR
> > diff --git a/drivers/thermal/intel/Makefile
> > b/drivers/thermal/intel/Makefile
> > index 9a8d8054f316..5d8833c82ab6 100644
> > --- a/drivers/thermal/intel/Makefile
> > +++ b/drivers/thermal/intel/Makefile
> > @@ -2,6 +2,7 @@
> >  #
> >  # Makefile for various Intel thermal drivers.
> > 
> > +obj-$(CONFIG_INTEL_TCC)        += intel_tcc.o
> >  obj-$(CONFIG_INTEL_POWERCLAMP) += intel_powerclamp.o
> >  obj-$(CONFIG_X86_PKG_TEMP_THERMAL)     += x86_pkg_temp_thermal.o
> >  obj-$(CONFIG_INTEL_SOC_DTS_IOSF_CORE)  += intel_soc_dts_iosf.o
> > diff --git a/drivers/thermal/intel/intel_tcc.c
> > b/drivers/thermal/intel/intel_tcc.c
> > new file mode 100644
> > index 000000000000..74b434914975
> > --- /dev/null
> > +++ b/drivers/thermal/intel/intel_tcc.c
> > @@ -0,0 +1,131 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * intel_tcc.c - Library for Intel TCC (thermal control circuitry)
> > MSR access
> > + * Copyright (c) 2022, Intel Corporation.
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/intel_tcc.h>
> > +#include <asm/msr.h>
> > +
> > +/**
> > + * intel_tcc_get_tjmax() - returns the default TCC activation
> > Temperature
> > + * @cpu: cpu that the MSR should be run on.
> > + * @tjmax: a valid pointer to where to store the Tjmax value
> > + *
> > + * Get the TjMax value, which is the default thermal throttling or
> > TCC
> > + * activation temperature in degrees C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +
> 
> This extra empty line is not needed (and not desirable even).  And
> same below.

Sure. will remove it.
> 
> > +int intel_tcc_get_tjmax(int cpu, int *tjmax)
> > +{
> > +       u32 eax, edx;
> > +       int err;
> > +
> > +       err = rdmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > +                                       &eax, &edx);
> 
> The current trend is to align the arguments after a line break with
> the first one, but I would just put them all on the same line (and
> below too).

I agree putting them in the same line looks prettier.

> 
> > +       if (err)
> > +               return err;
> > +
> > +       *tjmax = (eax >> 16) & 0xff;
> 
> This means that the tjmax value cannot be negative.
> 
> > +
> > +       return *tjmax ? 0 : -EINVAL;
> 
> So the return value of this function could be tjmax if positive or a
> negative error code otherwise.  No return pointers needed.

I tried both. And I think I chose this solution just because it makes
the following cleanup patches in this series looks prettier.

I will try your suggestion, and if there is any other reason I wrote it
in this way, I will find it out again. :p

> 
> And why do you want to return -EINVAL (rather than any other error
> code) if tjmax turns out to be 0?

Because x86_pkg_temp_thermal driver returns -EINVAL previously.
Any suggestions on this?

> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_get_tjmax, INTEL_TCC);
> > +
> > +/**
> > + * intel_tcc_get_offset() - returns the TCC Offset value to Tjmax
> > + * @cpu: cpu that the MSR should be run on.
> > + * @offset: a valid pointer to where to store the offset value
> > + *
> > + * Get the TCC offset value to Tjmax. The effective thermal
> > throttling or TCC
> > + * activation temperature equals "Tjmax" - "TCC Offset", in
> > degrees C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +
> > +int intel_tcc_get_offset(int cpu, int *offset)
> > +{
> > +       u32 eax, edx;
> > +       int err;
> > +
> > +       err = rdmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > +                                       &eax, &edx);
> > +       if (err)
> > +               return err;
> > +
> > +       *offset = (eax >> 24) & 0x3f;
> 
> Well, offset cannot be negative here, so (again) the return value of
> this function could be interpreted as the offsent (if non-negative)
> or
> a negative error code on failure.
> 
> > +
> > +       return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_get_offset, INTEL_TCC);
> > +
> > +/**
> > + * intel_tcc_set_offset() - set the TCC offset value to Tjmax
> > + * @cpu: cpu that the MSR should be run on.
> > + * @offset: TCC offset value in degree C
> 
> I think that this cannot be negative, so maybe say "in K" instead of
> "in degree C"?

degree C is the unit used in the Intel SDM, so better to keep this
comment aligned.

> 
> And maybe it's better to pass u8 here?

sounds good, will do in next version.

> 
> > + *
> > + * Set the TCC Offset value to Tjmax. The effective thermal
> > throttling or TCC
> > + * activation temperature equals "Tjmax" - "TCC Offset", in degree
> > C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +
> > +int intel_tcc_set_offset(int cpu, int offset)
> > +{
> > +       u32 eax, edx;
> > +       int err;
> > +
> > +       if (offset > 0x3f)
> > +               return -EINVAL;
> > +
> > +       err = rdmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > +                                       &eax, &edx);
> > +       if (err)
> > +               return err;
> > +
> > +       if (eax & BIT(31))
> > +               return -EPERM;
> 
> Why -EPERM?

Bit 31 set means the MSR is locked, and os software cannot write it.
should I use -EACCES instead?

> 
> > +
> > +       eax &= ~(0x3f << 24);
> > +       eax |= (offset << 24);
> 
> The parens are not needed AFAICS.
> 
Agreed.

> > +
> > +       return wrmsr_safe_on_cpu(cpu, MSR_IA32_TEMPERATURE_TARGET,
> > eax, edx);
> 
> So is any protection against concurrent access needed here?  Like
> what
> if two different callers invoke this function at the same time for
> the
> same CPU?

Given that the tcc offset is the only field that kernel code writes,
all the other bits won't change, so this is not a problem.

> 
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_set_offset, INTEL_TCC);
> > +
> > +/**
> > + * intel_tcc_get_temp() - returns the current temperature
> > + * @cpu: cpu that the MSR should be run on.
> > + * @pkg: true: Package Thermal Sensor. false: Core Thermal Sensor.
> > + * @temp: a valid pointer to where to store the resulting
> > temperature
> > + *
> > + * Get the current temperature returned by the CPU core/package
> > level
> > + * thermal sensor, in degrees C.
> > + *
> > + * Return: On success returns 0, an error code otherwise
> > + */
> > +int intel_tcc_get_temp(int cpu, bool pkg, int *temp)
> > +{
> > +       u32 eax, edx;
> > +       u32 msr = pkg ? MSR_IA32_PACKAGE_THERM_STATUS :
> > MSR_IA32_THERM_STATUS;
> > +       int tjmax, err;
> > +
> > +       err = intel_tcc_get_tjmax(cpu, &tjmax);
> > +       if (err)
> > +               return err;
> 
> Well, what if somebody change tjmax on this cpu while this function
> is running?

tjmax is read only. Only firmware can change its value at runtime, say
this field is updated when SST-PP level is changed.

thanks,
rui
> 
> > +
> > +       err = rdmsr_safe_on_cpu(cpu, msr, &eax, &edx);
> > +       if (err)
> > +               return err;
> > +
> > +       if (eax & 0x80000000) {
> > +               *temp = tjmax - ((eax >> 16) & 0x7f);
> > +               return 0;
> > +       }
> > +       return -EINVAL;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(intel_tcc_get_temp, INTEL_TCC);
> > +
> > diff --git a/include/linux/intel_tcc.h b/include/linux/intel_tcc.h
> > new file mode 100644
> > index 000000000000..94f8ceab5dd0
> > --- /dev/null
> > +++ b/include/linux/intel_tcc.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + *  header for Intel TCC (thermal control circuitry) library
> > + *
> > + *  Copyright (C) 2022  Intel Corporation.
> > + */
> > +
> > +#ifndef __INTEL_TCC_H__
> > +#define __INTEL_TCC_H__
> > +
> > +#include <linux/types.h>
> > +
> > +int intel_tcc_get_tjmax(int cpu, int *tjmax);
> > +int intel_tcc_get_offset(int cpu, int *offset);
> > +int intel_tcc_set_offset(int cpu, int offset);
> > +int intel_tcc_get_temp(int cpu, bool pkg, int *temp);
> > +
> > +#endif /* __INTEL_TCC_H__ */
> > --

next prev parent reply	other threads:[~2022-12-08 16:49 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-08  3:33 [PATCH 0/6] thermal/intel: Introduce intel-tcc lib and enhance tjmax handling Zhang Rui
2022-11-08  3:33 ` [PATCH 1/6] thermal/intel: Introduce Intel TCC library Zhang Rui
2022-12-08 13:49   ` Rafael J. Wysocki
2022-12-08 14:07     ` Rafael J. Wysocki
2022-12-09 13:57       ` Zhang Rui
2022-12-08 16:49     ` Zhang Rui [this message]
2022-12-08 17:00       ` Rafael J. Wysocki
2022-12-09 13:32         ` Zhang Rui
2022-12-09 16:28           ` srinivas pandruvada
2022-12-11  7:50             ` Zhang Rui
2022-12-12 12:15               ` Rafael J. Wysocki
2022-12-11  7:23       ` Zhang Rui
2022-12-12 12:11         ` Rafael J. Wysocki
2022-12-13  1:38           ` Zhang Rui
2022-12-13 15:34             ` Rafael J. Wysocki
2022-12-14 16:15               ` Zhang Rui
2022-12-14 16:17                 ` Rafael J. Wysocki
2022-12-14 16:19           ` Rafael J. Wysocki
2022-11-08  3:33 ` [PATCH 2/6] thermal/int340x/processor_thermal: Use " Zhang Rui
2022-11-08  3:33 ` [PATCH 3/6] thermal/intel/intel_soc_dts_iosf: " Zhang Rui
2022-11-08  3:33 ` [PATCH 4/6] thermal/intel/intel_tcc_cooling: " Zhang Rui
2022-11-08  3:33 ` [PATCH 5/6] thermal/x86_pkg_temp_thermal: " Zhang Rui
2022-11-08  3:33 ` [PATCH 6/6] thermal/x86_pkg_temp_thermal: Add support for handling dynamic tjmax Zhang Rui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c827d96fcf50a39ebd219cd09d9b4bd1db2f0398.camel@intel.com \
    --to=rui.zhang@intel.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=srinivas.pandruvada@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox