* [PATCH] k10temp: temperature sensor for AMD Family 10h/11h CPUs [not found] <4AF91F70.10106@ladisch.de> @ 2009-11-20 8:15 ` Clemens Ladisch 2009-11-20 10:22 ` Serge Belyshev 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-20 8:15 UTC (permalink / raw) To: Andrew Morton; +Cc: lm-sensors, linux-kernel This adds a driver for the internal temperature sensor of AMD Family 10h and 11h CPUs. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> --- Documentation/hwmon/k10temp | 54 +++++++++++++++++ drivers/hwmon/Kconfig | 12 +++ drivers/hwmon/Makefile | 1 drivers/hwmon/k10temp.c | 135 ++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 202 insertions(+) --- /dev/null +++ linux-2.6/Documentation/hwmon/k10temp @@ -0,0 +1,54 @@ +Kernel driver k10temp +===================== + +Supported chips: +* AMD Family 10h processors: + Socket F: Quad-Core/Six-Core/Embedded AMD Opteron + Socket AM2+: Phenom II X3/X4 + Socket AM3: Quad-Core Opteron, Athlon/Phenom II X2/X3/X4, Sempron II + mobile: Athlon II, Sempron, Turion II + + Prefix: 'k10temp' + Addresses scanned: PCI space + Datasheets: see http://developer.amd.com/documentation/guides/ + #31116: BIOS and Kernel Developer's Guide For AMD Family 10h Processors + #41322: Revision Guide for AMD Family 10h Processors + +* AMD Family 11h processors: + mobile: Athlon (X2), Sempron (X2), Turion X2 (Ultra) + + Prefix: 'k10temp' + Addresses scanned: PCI space + Datasheets: see http://developer.amd.com/documentation/guides/ + #41256: BIOS and Kernel Developer's Guide For AMD Family 11h Processors + #41788: Revision Guide for AMD Family 11h Processors + +Author: Clemens Ladisch <clemens@ladisch.de> + +Description +----------- + +This driver permits reading of the internal temperature sensor of AMD +Family 10h and 11h processors. + +All these processors have a sensor, but on older revisions of Family 10h +processors, the sensor returns inconsistent values (erratum 319). The driver +refuses to load with these revisions (DR-BA, DR-B2, DR-B3: some Embedded +Opterons on Socket F; and Quad-Core Opteron, Phenom Triple/Quad-Core, and +Athon Dual-Core on Socket AM2+). All later revisions (RB-C2, BL-C2, DA-C2, +RB-C3, HY-D0) work fine; see the list above. + +There is one temperature value, available as temp1_input in sysfs. It is +measured in degrees Celsius with a resolution of 1/8th degree. Please note +that it is defined as a relative value; to quote the AMD manual: + + Tctl is the processor temperatur control value, used by the platform to + control cooling systems. Tctl is a non-physical temperature on an arbitrary + scale measured in degrees. It does _not_ represent an actual physical + temperature like die or case temperature. Instead, it specifies the + processor temperature relative to the point at which the system must supply + the maximum cooling for the processor's specified maximum case temperature + and maximum thermal power dissipation. + +The maximum value for Tctl is usually defined as 70 degrees, so, as a rule of +thumb, this value should not exceed 60 degrees. --- linux-2.6/drivers/hwmon/Kconfig +++ linux-2.6/drivers/hwmon/Kconfig @@ -222,6 +222,18 @@ config SENSORS_K8TEMP This driver can also be built as a module. If so, the module will be called k8temp. +config SENSORS_K10TEMP + tristate "AMD Phenom/Sempron/Turion/Opteron temperature sensor" + depends on X86 && PCI + help + If you say yes here you get support for the temperature + sensor inside your CPU. Supported are later revisions of + the AMD Family 10h and all revisions of the AMD Family 11h + microarchitectures. + + This driver can also be built as a module. If so, the module + will be called k10temp. + config SENSORS_AMS tristate "Apple Motion Sensor driver" depends on PPC_PMAC && !PPC64 && INPUT && ((ADB_PMU && I2C = y) || (ADB_PMU && !I2C) || I2C) && EXPERIMENTAL --- linux-2.6/drivers/hwmon/Makefile +++ linux-2.6/drivers/hwmon/Makefile @@ -53,6 +53,7 @@ obj-$(CONFIG_SENSORS_IBMAEM) += ibmaem.o obj-$(CONFIG_SENSORS_IBMPEX) += ibmpex.o obj-$(CONFIG_SENSORS_IT87) += it87.o obj-$(CONFIG_SENSORS_K8TEMP) += k8temp.o +obj-$(CONFIG_SENSORS_K10TEMP) += k10temp.o obj-$(CONFIG_SENSORS_LIS3LV02D) += lis3lv02d.o hp_accel.o obj-$(CONFIG_SENSORS_LIS3_SPI) += lis3lv02d.o lis3lv02d_spi.o obj-$(CONFIG_SENSORS_LM63) += lm63.o --- /dev/null +++ linux-2.6/drivers/hwmon/k10temp.c @@ -0,0 +1,135 @@ +/* + * k10temp.c - AMD Family 10h/11h CPUs hardware monitoring + * + * Copyright (c) 2009 Clemens Ladisch <clemens@ladisch.de> + * + * + * This driver is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License; either + * version 2 of the License, or (at your option) any later version. + * + * This driver is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this driver; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/err.h> +#include <linux/hwmon.h> +#include <linux/hwmon-sysfs.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/pci.h> +#include <asm/processor.h> + +MODULE_DESCRIPTION("AMD Family 10h/11h CPU core temperature monitor"); +MODULE_AUTHOR("Clemens Ladisch <clemens@ladisch.de>"); +MODULE_LICENSE("GPL"); + +#define REG_REPORTED_TEMPERATURE 0xa4 +#define CURTMP_TO_MILLIDEGREES(regval) (((regval) >> 21) * 125) + +static ssize_t show_temp(struct device *dev, + struct device_attribute *attr, char *buf) +{ + u32 regval; + + pci_read_config_dword(to_pci_dev(dev), + REG_REPORTED_TEMPERATURE, ®val); + return sprintf(buf, "%u\n", CURTMP_TO_MILLIDEGREES(regval)); +} + +static ssize_t show_name(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "k10temp\n"); +} + +static SENSOR_DEVICE_ATTR(temp1_input, S_IRUGO, show_temp, NULL, 0); +static DEVICE_ATTR(name, S_IRUGO, show_name, NULL); + +static bool __devinit has_erratum_319(void) +{ + /* + * Erratum 319: The thermal sensor of older Family 10h processors + * (B steppings) is unreliable. + */ + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; +} + +static int __devinit k10temp_probe(struct pci_dev *pdev, + const struct pci_device_id *id) +{ + struct device *hwmon_dev; + int err; + + if (has_erratum_319()) { + dev_err(&pdev->dev, + "unreliable CPU thermal sensor; monitoring disabled\n"); + err = -ENODEV; + goto exit; + } + + err = device_create_file(&pdev->dev, + &sensor_dev_attr_temp1_input.dev_attr); + if (err) + goto exit; + + err = device_create_file(&pdev->dev, &dev_attr_name); + if (err) + goto exit_remove1; + + hwmon_dev = hwmon_device_register(&pdev->dev); + if (IS_ERR(hwmon_dev)) { + err = PTR_ERR(hwmon_dev); + goto exit_remove2; + } + + dev_set_drvdata(&pdev->dev, hwmon_dev); + return 0; + +exit_remove2: + device_remove_file(&pdev->dev, &dev_attr_name); +exit_remove1: + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); +exit: + return err; +} + +static void __devexit k10temp_remove(struct pci_dev *pdev) +{ + hwmon_device_unregister(dev_get_drvdata(&pdev->dev)); + device_remove_file(&pdev->dev, &dev_attr_name); + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); + dev_set_drvdata(&pdev->dev, NULL); +} + +static struct pci_device_id k10temp_id_table[] = { + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_11H_NB_MISC) }, + {} +}; +MODULE_DEVICE_TABLE(pci, k10temp_id_table); + +static struct pci_driver k10temp_driver = { + .name = "k10temp", + .id_table = k10temp_id_table, + .probe = k10temp_probe, + .remove = __devexit_p(k10temp_remove), +}; + +static int __init k10temp_init(void) +{ + return pci_register_driver(&k10temp_driver); +} + +static void __exit k10temp_exit(void) +{ + pci_unregister_driver(&k10temp_driver); +} + +module_init(k10temp_init) +module_exit(k10temp_exit) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 8:15 ` [PATCH] k10temp: temperature sensor for AMD Family 10h/11h CPUs Clemens Ladisch @ 2009-11-20 10:22 ` Serge Belyshev 2009-11-20 10:44 ` [lm-sensors] " Jean Delvare 2009-11-20 10:47 ` [PATCH v2] " Clemens Ladisch 0 siblings, 2 replies; 26+ messages in thread From: Serge Belyshev @ 2009-11-20 10:22 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Andrew Morton, lm-sensors, linux-kernel > +All these processors have a sensor, but on older revisions of Family 10h > +processors, the sensor returns inconsistent values (erratum 319). The driver > +refuses to load with these revisions (DR-BA, DR-B2, DR-B3: some Embedded > +Opterons on Socket F; and Quad-Core Opteron, Phenom Triple/Quad-Core, and > +Athon Dual-Core on Socket AM2+). All later revisions (RB-C2, BL-C2, DA-C2, > +RB-C3, HY-D0) work fine; see the list above. Please note that erratum actually states that the sensor only "may report inconsistent values.", not that it is always broken. As evident by my own experience (tested with a userspace application), it actually works perfectly on all B3 stepping processors that I have. > +static bool __devinit has_erratum_319(void) > +{ > + /* > + * Erratum 319: The thermal sensor of older Family 10h processors > + * (B steppings) is unreliable. > + */ > + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; > +} ... > + if (has_erratum_319()) { > + dev_err(&pdev->dev, > + "unreliable CPU thermal sensor; monitoring disabled\n"); > + err = -ENODEV; > + goto exit; > + } So, please provide an alternative for those who have a working sensor on a revision B processor and want to use it. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [lm-sensors] [PATCH] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 10:22 ` Serge Belyshev @ 2009-11-20 10:44 ` Jean Delvare 2009-11-20 10:47 ` [PATCH v2] " Clemens Ladisch 1 sibling, 0 replies; 26+ messages in thread From: Jean Delvare @ 2009-11-20 10:44 UTC (permalink / raw) To: Serge Belyshev; +Cc: Clemens Ladisch, Andrew Morton, linux-kernel, lm-sensors On Fri, 20 Nov 2009 10:22:50 +0000, Serge Belyshev wrote: > > > +All these processors have a sensor, but on older revisions of Family 10h > > +processors, the sensor returns inconsistent values (erratum 319). The driver > > +refuses to load with these revisions (DR-BA, DR-B2, DR-B3: some Embedded > > +Opterons on Socket F; and Quad-Core Opteron, Phenom Triple/Quad-Core, and > > +Athon Dual-Core on Socket AM2+). All later revisions (RB-C2, BL-C2, DA-C2, > > +RB-C3, HY-D0) work fine; see the list above. > > Please note that erratum actually states that the sensor only "may report > inconsistent values.", not that it is always broken. As evident by my > own experience (tested with a userspace application), it actually works > perfectly on all B3 stepping processors that I have. We don't care that 5% of the CPU have working sensors. 95% [1] of theses CPUs have broken sensors and their users will ask us for help and we are fed up with this. So all these CPUs are blacklisted, period. > > +static bool __devinit has_erratum_319(void) > > +{ > > + /* > > + * Erratum 319: The thermal sensor of older Family 10h processors > > + * (B steppings) is unreliable. > > + */ > > + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; > > +} > ... > > + if (has_erratum_319()) { > > + dev_err(&pdev->dev, > > + "unreliable CPU thermal sensor; monitoring disabled\n"); > > + err = -ENODEV; > > + goto exit; > > + } > > So, please provide an alternative for those who have a working sensor on a > revision B processor and want to use it. The alternative already exists: you can rebuild the driver yourself without this check. Or yet another alternative: become the maintainer of the hwmon subsystem, and do lm-sensors user support for a couple years. Then you will be allowed to decide what goes in. And yet another one: instead of asking others to solve your very own problem, why don't you try and solve it yourself? I'm sure Clemens would welcome patches to his driver. Thank you very much. [1] Yes these numbers are totally made up. There is no reliable way to tell a broken sensor from a working one anyway. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v2] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 10:22 ` Serge Belyshev 2009-11-20 10:44 ` [lm-sensors] " Jean Delvare @ 2009-11-20 10:47 ` Clemens Ladisch 2009-11-20 11:30 ` [lm-sensors] " Jean Delvare 1 sibling, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-20 10:47 UTC (permalink / raw) To: Serge Belyshev; +Cc: Andrew Morton, lm-sensors, linux-kernel This adds a driver for the internal temperature sensor of AMD Family 10h and 11h CPUs. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> --- v2: Erratum 319 now results in a warning, not an error, because it cannot be reliably detected; Serge Belyshev reports that his CPU sensor works. Documentation/hwmon/k10temp | 50 ++++++++++++++++ drivers/hwmon/Kconfig | 11 +++ drivers/hwmon/Makefile | 1 drivers/hwmon/k10temp.c | 132 ++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 194 insertions(+) --- /dev/null +++ linux-2.6/Documentation/hwmon/k10temp @@ -0,0 +1,50 @@ +Kernel driver k10temp +===================== + +Supported chips: +* AMD Family 10h processors: + Socket F: Quad-Core/Six-Core/Embedded AMD Opteron + Socket AM2+: Quad-Core Opteron, Phenom (II) X3/X4 + Socket AM3: Quad-Core Opteron, Athlon/Phenom II X2/X3/X4, Sempron II + mobile: Athlon II, Sempron, Turion II + + Prefix: 'k10temp' + Addresses scanned: PCI space + Datasheets: see http://developer.amd.com/documentation/guides/ + #31116: BIOS and Kernel Developer's Guide For AMD Family 10h Processors + #41322: Revision Guide for AMD Family 10h Processors + +* AMD Family 11h processors: + mobile: Athlon (X2), Sempron (X2), Turion X2 (Ultra) + + Prefix: 'k10temp' + Addresses scanned: PCI space + Datasheets: see http://developer.amd.com/documentation/guides/ + #41256: BIOS and Kernel Developer's Guide For AMD Family 11h Processors + #41788: Revision Guide for AMD Family 11h Processors + +Author: Clemens Ladisch <clemens@ladisch.de> + +Description +----------- + +This driver permits reading of the internal temperature sensor of AMD +Family 10h and 11h processors. + +All these processors have a sensor, but on older revisions of Family 10h +processors, the sensor may return inconsistent values (erratum 319). + +There is one temperature value, available as temp1_input in sysfs. It is +measured in degrees Celsius with a resolution of 1/8th degree. Please note +that it is defined as a relative value; to quote the AMD manual: + + Tctl is the processor temperatur control value, used by the platform to + control cooling systems. Tctl is a non-physical temperature on an arbitrary + scale measured in degrees. It does _not_ represent an actual physical + temperature like die or case temperature. Instead, it specifies the + processor temperature relative to the point at which the system must supply + the maximum cooling for the processor's specified maximum case temperature + and maximum thermal power dissipation. + +The maximum value for Tctl is usually defined as 70 degrees, so, as a rule of +thumb, this value should not exceed 60 degrees. --- linux-2.6/drivers/hwmon/Kconfig +++ linux-2.6/drivers/hwmon/Kconfig @@ -222,6 +222,17 @@ config SENSORS_K8TEMP This driver can also be built as a module. If so, the module will be called k8temp. +config SENSORS_K10TEMP + tristate "AMD Phenom/Sempron/Turion/Opteron temperature sensor" + depends on X86 && PCI + help + If you say yes here you get support for the temperature + sensor inside your CPU. Supported are all revisions of + the AMD Family 10h and 11h microarchitectures. + + This driver can also be built as a module. If so, the module + will be called k10temp. + config SENSORS_AMS tristate "Apple Motion Sensor driver" depends on PPC_PMAC && !PPC64 && INPUT && ((ADB_PMU && I2C = y) || (ADB_PMU && !I2C) || I2C) && EXPERIMENTAL --- linux-2.6/drivers/hwmon/Makefile +++ linux-2.6/drivers/hwmon/Makefile @@ -53,6 +53,7 @@ obj-$(CONFIG_SENSORS_IBMAEM) += ibmaem.o obj-$(CONFIG_SENSORS_IBMPEX) += ibmpex.o obj-$(CONFIG_SENSORS_IT87) += it87.o obj-$(CONFIG_SENSORS_K8TEMP) += k8temp.o +obj-$(CONFIG_SENSORS_K10TEMP) += k10temp.o obj-$(CONFIG_SENSORS_LIS3LV02D) += lis3lv02d.o hp_accel.o obj-$(CONFIG_SENSORS_LIS3_SPI) += lis3lv02d.o lis3lv02d_spi.o obj-$(CONFIG_SENSORS_LM63) += lm63.o --- /dev/null +++ linux-2.6/drivers/hwmon/k10temp.c @@ -0,0 +1,132 @@ +/* + * k10temp.c - AMD Family 10h/11h CPUs hardware monitoring + * + * Copyright (c) 2009 Clemens Ladisch <clemens@ladisch.de> + * + * + * This driver is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License; either + * version 2 of the License, or (at your option) any later version. + * + * This driver is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this driver; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/err.h> +#include <linux/hwmon.h> +#include <linux/hwmon-sysfs.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/pci.h> +#include <asm/processor.h> + +MODULE_DESCRIPTION("AMD Family 10h/11h CPU core temperature monitor"); +MODULE_AUTHOR("Clemens Ladisch <clemens@ladisch.de>"); +MODULE_LICENSE("GPL"); + +#define REG_REPORTED_TEMPERATURE 0xa4 +#define CURTMP_TO_MILLIDEGREES(regval) (((regval) >> 21) * 125) + +static ssize_t show_temp(struct device *dev, + struct device_attribute *attr, char *buf) +{ + u32 regval; + + pci_read_config_dword(to_pci_dev(dev), + REG_REPORTED_TEMPERATURE, ®val); + return sprintf(buf, "%u\n", CURTMP_TO_MILLIDEGREES(regval)); +} + +static ssize_t show_name(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "k10temp\n"); +} + +static SENSOR_DEVICE_ATTR(temp1_input, S_IRUGO, show_temp, NULL, 0); +static DEVICE_ATTR(name, S_IRUGO, show_name, NULL); + +static bool __devinit has_erratum_319(void) +{ + /* + * Erratum 319: The thermal sensor of older Family 10h processors + * (B steppings) may be unreliable. + */ + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; +} + +static int __devinit k10temp_probe(struct pci_dev *pdev, + const struct pci_device_id *id) +{ + struct device *hwmon_dev; + int err; + + err = device_create_file(&pdev->dev, + &sensor_dev_attr_temp1_input.dev_attr); + if (err) + goto exit; + + err = device_create_file(&pdev->dev, &dev_attr_name); + if (err) + goto exit_remove1; + + hwmon_dev = hwmon_device_register(&pdev->dev); + if (IS_ERR(hwmon_dev)) { + err = PTR_ERR(hwmon_dev); + goto exit_remove2; + } + + if (has_erratum_319()) + dev_warn(&pdev->dev, "CPU thermal sensor may report" + " inconsistent values; check erratum 319\n"); + + dev_set_drvdata(&pdev->dev, hwmon_dev); + return 0; + +exit_remove2: + device_remove_file(&pdev->dev, &dev_attr_name); +exit_remove1: + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); +exit: + return err; +} + +static void __devexit k10temp_remove(struct pci_dev *pdev) +{ + hwmon_device_unregister(dev_get_drvdata(&pdev->dev)); + device_remove_file(&pdev->dev, &dev_attr_name); + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); + dev_set_drvdata(&pdev->dev, NULL); +} + +static struct pci_device_id k10temp_id_table[] = { + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_11H_NB_MISC) }, + {} +}; +MODULE_DEVICE_TABLE(pci, k10temp_id_table); + +static struct pci_driver k10temp_driver = { + .name = "k10temp", + .id_table = k10temp_id_table, + .probe = k10temp_probe, + .remove = __devexit_p(k10temp_remove), +}; + +static int __init k10temp_init(void) +{ + return pci_register_driver(&k10temp_driver); +} + +static void __exit k10temp_exit(void) +{ + pci_unregister_driver(&k10temp_driver); +} + +module_init(k10temp_init) +module_exit(k10temp_exit) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [lm-sensors] [PATCH v2] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 10:47 ` [PATCH v2] " Clemens Ladisch @ 2009-11-20 11:30 ` Jean Delvare 2009-11-20 11:56 ` Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-20 11:30 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Serge Belyshev, Andrew Morton, linux-kernel, lm-sensors On Fri, 20 Nov 2009 11:47:51 +0100, Clemens Ladisch wrote: > This adds a driver for the internal temperature sensor of AMD Family 10h > and 11h CPUs. > > Signed-off-by: Clemens Ladisch <clemens@ladisch.de> > --- > v2: Erratum 319 now results in a warning, not an error, because it cannot > be reliably detected; Serge Belyshev reports that his CPU sensor works. Nack. Unreliable sensors -> the default must be to NOT bind to these CPUs. Feel free to provide a way to force the bind to happen (and still print a big fat warning that this is a very bad idea), but do NOT make it the default. Otherwise your driver will _never_ make it into the kernel tree. As a side note I'd be curious to see what Serge calls "working sensors". We had several reports in the past of people who claimed that their sensors were working. After looking at their numbers, I had doubt this really was the case. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 11:30 ` [lm-sensors] " Jean Delvare @ 2009-11-20 11:56 ` Clemens Ladisch 2009-11-20 12:18 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-20 11:56 UTC (permalink / raw) To: Jean Delvare; +Cc: Serge Belyshev, Andrew Morton, linux-kernel, lm-sensors Jean Delvare wrote: > On Fri, 20 Nov 2009 11:47:51 +0100, Clemens Ladisch wrote: > > v2: Erratum 319 now results in a warning, not an error, because it cannot > > be reliably detected; Serge Belyshev reports that his CPU sensor works. > > Nack. Unreliable sensors -> the default must be to NOT bind to these > CPUs. ... Otherwise your driver will _never_ make it into the > kernel tree. Then how did the k8temp driver make it into the kernel tree? :-) > Feel free to provide a way to force the bind to happen (and still > print a big fat warning that this is a very bad idea), but do NOT make > it the default. Okay, I'll add a force parameter. I'd guess k8temp should get the same? Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 11:56 ` Clemens Ladisch @ 2009-11-20 12:18 ` Jean Delvare 2009-11-23 7:45 ` [PATCH v3] " Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-20 12:18 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Serge Belyshev, Andrew Morton, linux-kernel, lm-sensors On Fri, 20 Nov 2009 12:56:50 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Fri, 20 Nov 2009 11:47:51 +0100, Clemens Ladisch wrote: > > > v2: Erratum 319 now results in a warning, not an error, because it cannot > > > be reliably detected; Serge Belyshev reports that his CPU sensor works. > > > > Nack. Unreliable sensors -> the default must be to NOT bind to these > > CPUs. ... Otherwise your driver will _never_ make it into the > > kernel tree. > > Then how did the k8temp driver make it into the kernel tree? :-) For the K8 family, the history is the other way around: early models worked fine, while later models did not. So there was no objection to the k8temp driver initially, there really was no reason for it. It is much harder to remove a driver from the kernel tree than to add it. Likewise, it is much harder to reject specific models if you have been supporting them before, because some users may see it as a regression. This is the reason why I would like us to be much more cautious with the family 10h CPU support than we have been for the K8 family. Let us learn from our past errors. > > Feel free to provide a way to force the bind to happen (and still > > print a big fat warning that this is a very bad idea), but do NOT make > > it the default. > > Okay, I'll add a force parameter. > I'd guess k8temp should get the same? It would make sense, yes, even though you will get complaints from some users. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-20 12:18 ` Jean Delvare @ 2009-11-23 7:45 ` Clemens Ladisch 2009-11-23 13:51 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-23 7:45 UTC (permalink / raw) To: Andrew Morton; +Cc: Serge Belyshev, Jean Delvare, linux-kernel, lm-sensors This adds a driver for the internal temperature sensor of AMD Family 10h and 11h CPUs. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> --- v3: added 'force' parameter for CPUs with buggy sensor; more documentation Documentation/hwmon/k10temp | 57 ++++++++++++ drivers/hwmon/Kconfig | 12 ++ drivers/hwmon/Makefile | 1 drivers/hwmon/k10temp.c | 142 ++++++++++++++++++++++++++++++++ 4 files changed, 212 insertions(+) --- /dev/null +++ linux-2.6/Documentation/hwmon/k10temp @@ -0,0 +1,57 @@ +Kernel driver k10temp +===================== + +Supported chips: +* AMD Family 10h processors: + Socket F: Quad-Core/Six-Core/Embedded Opteron + Socket AM2+: Opteron, Phenom (II) X3/X4 + Socket AM3: Quad-Core Opteron, Athlon/Phenom II X2/X3/X4, Sempron II + Socket S1G3: Athlon II, Sempron, Turion II +* AMD Family 11h processors: + Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) + + Prefix: 'k10temp' + Addresses scanned: PCI space + Datasheets: + BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors: + http://support.amd.com/us/Processor_TechDocs/31116.pdf + BIOS and Kernel Developer's Guide (BKDG) for AMD Family 11h Processors: + http://support.amd.com/us/Processor_TechDocs/41256.pdf + Revision Guide for AMD Family 10h Processors: + http://support.amd.com/us/Processor_TechDocs/41322.pdf + Revision Guide for AMD Family 11h Processors: + http://support.amd.com/us/Processor_TechDocs/41788.pdf + AMD Family 11h Processor Power and Thermal Data Sheet for Notebooks: + http://support.amd.com/us/Processor_TechDocs/43373.pdf + AMD Family 10h Server and Workstation Processor Power and Thermal Data Sheet: + http://support.amd.com/us/Processor_TechDocs/43374.pdf + AMD Family 10h Desktop Processor Power and Thermal Data Sheet: + http://support.amd.com/us/Processor_TechDocs/43375.pdf + +Author: Clemens Ladisch <clemens@ladisch.de> + +Description +----------- + +This driver permits reading of the internal temperature sensor of AMD +Family 10h and 11h processors. + +All these processors have a sensor, but on older revisions of Family 10h +processors, the sensor may return inconsistent values (erratum 319). The +driver will refuse to load on these revisions unless you specify the +"force=1" module parameter. + +There is one temperature value, available as temp1_input in sysfs. It is +measured in degrees Celsius with a resolution of 1/8th degree. Please note +that it is defined as a relative value; to quote the AMD manual: + + Tctl is the processor temperatur control value, used by the platform to + control cooling systems. Tctl is a non-physical temperature on an + arbitrary scale measured in degrees. It does _not_ represent an actual + physical temperature like die or case temperature. Instead, it specifies + the processor temperature relative to the point at which the system must + supply the maximum cooling for the processor's specified maximum case + temperature and maximum thermal power dissipation. + +The maximum value for Tctl is defined as 70 degrees, so, as a rule of thumb, +this value should not exceed 60 degrees. --- linux-2.6/drivers/hwmon/Kconfig +++ linux-2.6/drivers/hwmon/Kconfig @@ -222,6 +222,18 @@ config SENSORS_K8TEMP This driver can also be built as a module. If so, the module will be called k8temp. +config SENSORS_K10TEMP + tristate "AMD Phenom/Sempron/Turion/Opteron temperature sensor" + depends on X86 && PCI + help + If you say yes here you get support for the temperature + sensor(s) inside your CPU. Supported are later revisions of + the AMD Family 10h and all revisions of the AMD Family 11h + microarchitectures. + + This driver can also be built as a module. If so, the module + will be called k10temp. + config SENSORS_AMS tristate "Apple Motion Sensor driver" depends on PPC_PMAC && !PPC64 && INPUT && ((ADB_PMU && I2C = y) || (ADB_PMU && !I2C) || I2C) && EXPERIMENTAL --- linux-2.6/drivers/hwmon/Makefile +++ linux-2.6/drivers/hwmon/Makefile @@ -53,6 +53,7 @@ obj-$(CONFIG_SENSORS_IBMAEM) += ibmaem.o obj-$(CONFIG_SENSORS_IBMPEX) += ibmpex.o obj-$(CONFIG_SENSORS_IT87) += it87.o obj-$(CONFIG_SENSORS_K8TEMP) += k8temp.o +obj-$(CONFIG_SENSORS_K10TEMP) += k10temp.o obj-$(CONFIG_SENSORS_LIS3LV02D) += lis3lv02d.o hp_accel.o obj-$(CONFIG_SENSORS_LIS3_SPI) += lis3lv02d.o lis3lv02d_spi.o obj-$(CONFIG_SENSORS_LM63) += lm63.o --- /dev/null +++ linux-2.6/drivers/hwmon/k10temp.c @@ -0,0 +1,142 @@ +/* + * k10temp.c - AMD Family 10h/11h processor hardware monitoring + * + * Copyright (c) 2009 Clemens Ladisch <clemens@ladisch.de> + * + * + * This driver is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License; either + * version 2 of the License, or (at your option) any later version. + * + * This driver is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this driver; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/err.h> +#include <linux/hwmon.h> +#include <linux/hwmon-sysfs.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/pci.h> +#include <asm/processor.h> + +MODULE_DESCRIPTION("AMD Family 10h/11h CPU core temperature monitor"); +MODULE_AUTHOR("Clemens Ladisch <clemens@ladisch.de>"); +MODULE_LICENSE("GPL"); + +static bool force; +module_param(force, bool, 0444); +MODULE_PARM_DESC(force, "force loading on processors with erratum 319"); + +#define REG_REPORTED_TEMPERATURE 0xa4 +#define CURTMP_TO_MILLIDEGREES(regval) (((regval) >> 21) * 125) + +static ssize_t show_temp(struct device *dev, + struct device_attribute *attr, char *buf) +{ + u32 regval; + + pci_read_config_dword(to_pci_dev(dev), + REG_REPORTED_TEMPERATURE, ®val); + return sprintf(buf, "%u\n", CURTMP_TO_MILLIDEGREES(regval)); +} + +static ssize_t show_name(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "k10temp\n"); +} + +static SENSOR_DEVICE_ATTR(temp1_input, S_IRUGO, show_temp, NULL, 0); +static DEVICE_ATTR(name, S_IRUGO, show_name, NULL); + +static bool __devinit has_erratum_319(void) +{ + /* + * Erratum 319: The thermal sensor of older Family 10h processors + * (B steppings) may be unreliable. + */ + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; +} + +static int __devinit k10temp_probe(struct pci_dev *pdev, + const struct pci_device_id *id) +{ + struct device *hwmon_dev; + int err; + + if (has_erratum_319() && !force) { + dev_err(&pdev->dev, + "unreliable CPU thermal sensor; monitoring disabled\n"); + err = -ENODEV; + goto exit; + } + + err = device_create_file(&pdev->dev, + &sensor_dev_attr_temp1_input.dev_attr); + if (err) + goto exit; + + err = device_create_file(&pdev->dev, &dev_attr_name); + if (err) + goto exit_remove1; + + hwmon_dev = hwmon_device_register(&pdev->dev); + if (IS_ERR(hwmon_dev)) { + err = PTR_ERR(hwmon_dev); + goto exit_remove2; + } + dev_set_drvdata(&pdev->dev, hwmon_dev); + + if (has_erratum_319() && force) + dev_warn(&pdev->dev, + "unreliable CPU thermal sensor; check erratum 319\n"); + return 0; + +exit_remove2: + device_remove_file(&pdev->dev, &dev_attr_name); +exit_remove1: + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); +exit: + return err; +} + +static void __devexit k10temp_remove(struct pci_dev *pdev) +{ + hwmon_device_unregister(dev_get_drvdata(&pdev->dev)); + device_remove_file(&pdev->dev, &dev_attr_name); + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); + dev_set_drvdata(&pdev->dev, NULL); +} + +static struct pci_device_id k10temp_id_table[] = { + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_11H_NB_MISC) }, + {} +}; +MODULE_DEVICE_TABLE(pci, k10temp_id_table); + +static struct pci_driver k10temp_driver = { + .name = "k10temp", + .id_table = k10temp_id_table, + .probe = k10temp_probe, + .remove = __devexit_p(k10temp_remove), +}; + +static int __init k10temp_init(void) +{ + return pci_register_driver(&k10temp_driver); +} + +static void __exit k10temp_exit(void) +{ + pci_unregister_driver(&k10temp_driver); +} + +module_init(k10temp_init) +module_exit(k10temp_exit) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-23 7:45 ` [PATCH v3] " Clemens Ladisch @ 2009-11-23 13:51 ` Jean Delvare 2009-11-23 15:29 ` Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-23 13:51 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Andrew Morton, Serge Belyshev, linux-kernel, lm-sensors Hi Clemens, On Mon, 23 Nov 2009 08:45:58 +0100, Clemens Ladisch wrote: > This adds a driver for the internal temperature sensor of AMD Family 10h > and 11h CPUs. > > Signed-off-by: Clemens Ladisch <clemens@ladisch.de> > --- > v3: added 'force' parameter for CPUs with buggy sensor; more documentation Review: > > Documentation/hwmon/k10temp | 57 ++++++++++++ > drivers/hwmon/Kconfig | 12 ++ > drivers/hwmon/Makefile | 1 > drivers/hwmon/k10temp.c | 142 ++++++++++++++++++++++++++++++++ > 4 files changed, 212 insertions(+) The name k10temp is a problem, as AMD insists that there is no such things as K10 and K11, but instead "family 10h" and "family 11h" processors. > > --- /dev/null > +++ linux-2.6/Documentation/hwmon/k10temp > @@ -0,0 +1,57 @@ > +Kernel driver k10temp > +===================== > + > +Supported chips: > +* AMD Family 10h processors: > + Socket F: Quad-Core/Six-Core/Embedded Opteron > + Socket AM2+: Opteron, Phenom (II) X3/X4 > + Socket AM3: Quad-Core Opteron, Athlon/Phenom II X2/X3/X4, Sempron II > + Socket S1G3: Athlon II, Sempron, Turion II > +* AMD Family 11h processors: > + Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) > + > + Prefix: 'k10temp' > + Addresses scanned: PCI space > + Datasheets: > + BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors: > + http://support.amd.com/us/Processor_TechDocs/31116.pdf > + BIOS and Kernel Developer's Guide (BKDG) for AMD Family 11h Processors: > + http://support.amd.com/us/Processor_TechDocs/41256.pdf > + Revision Guide for AMD Family 10h Processors: > + http://support.amd.com/us/Processor_TechDocs/41322.pdf > + Revision Guide for AMD Family 11h Processors: > + http://support.amd.com/us/Processor_TechDocs/41788.pdf > + AMD Family 11h Processor Power and Thermal Data Sheet for Notebooks: > + http://support.amd.com/us/Processor_TechDocs/43373.pdf > + AMD Family 10h Server and Workstation Processor Power and Thermal Data Sheet: > + http://support.amd.com/us/Processor_TechDocs/43374.pdf > + AMD Family 10h Desktop Processor Power and Thermal Data Sheet: > + http://support.amd.com/us/Processor_TechDocs/43375.pdf > + > +Author: Clemens Ladisch <clemens@ladisch.de> > + > +Description > +----------- > + > +This driver permits reading of the internal temperature sensor of AMD > +Family 10h and 11h processors. > + > +All these processors have a sensor, but on older revisions of Family 10h > +processors, the sensor may return inconsistent values (erratum 319). The > +driver will refuse to load on these revisions unless you specify the > +"force=1" module parameter. > + > +There is one temperature value, available as temp1_input in sysfs. It is > +measured in degrees Celsius with a resolution of 1/8th degree. Please note > +that it is defined as a relative value; to quote the AMD manual: > + > + Tctl is the processor temperatur control value, used by the platform to Spelling: temperature. > + control cooling systems. Tctl is a non-physical temperature on an > + arbitrary scale measured in degrees. It does _not_ represent an actual > + physical temperature like die or case temperature. Instead, it specifies > + the processor temperature relative to the point at which the system must > + supply the maximum cooling for the processor's specified maximum case > + temperature and maximum thermal power dissipation. > + > +The maximum value for Tctl is defined as 70 degrees, so, as a rule of thumb, > +this value should not exceed 60 degrees. Now I am puzzled. If the temperature value is on an arbitrary scale, then the value returned by the driver is essentially fake? I can already hear the users complain that your driver is reporting temperature values which are different from their on-board hardware monitoring chip and they want to know who is right. It will be the same mess as with coretemp :( Don't we have additional information about the actual maximum Tcase value for the different supported models, as we have in coretemp? That would be an acceptable workaround, I guess. If not, then it might be the right time to introduce a new interface for relative temperature values. This needs some work, as we must first define the interface, then implement support in libsensors and sensors, and other monitoring applications, and then convert the affected drivers. But apparently we will have to, as major CPU makers are not able to implement something as simple as an absolute temperature sensor :( > --- linux-2.6/drivers/hwmon/Kconfig > +++ linux-2.6/drivers/hwmon/Kconfig > @@ -222,6 +222,18 @@ config SENSORS_K8TEMP > This driver can also be built as a module. If so, the module > will be called k8temp. > > +config SENSORS_K10TEMP > + tristate "AMD Phenom/Sempron/Turion/Opteron temperature sensor" > + depends on X86 && PCI > + help > + If you say yes here you get support for the temperature > + sensor(s) inside your CPU. Supported are later revisions of > + the AMD Family 10h and all revisions of the AMD Family 11h > + microarchitectures. > + > + This driver can also be built as a module. If so, the module > + will be called k10temp. > + > config SENSORS_AMS > tristate "Apple Motion Sensor driver" > depends on PPC_PMAC && !PPC64 && INPUT && ((ADB_PMU && I2C = y) || (ADB_PMU && !I2C) || I2C) && EXPERIMENTAL > --- linux-2.6/drivers/hwmon/Makefile > +++ linux-2.6/drivers/hwmon/Makefile > @@ -53,6 +53,7 @@ obj-$(CONFIG_SENSORS_IBMAEM) += ibmaem.o > obj-$(CONFIG_SENSORS_IBMPEX) += ibmpex.o > obj-$(CONFIG_SENSORS_IT87) += it87.o > obj-$(CONFIG_SENSORS_K8TEMP) += k8temp.o > +obj-$(CONFIG_SENSORS_K10TEMP) += k10temp.o > obj-$(CONFIG_SENSORS_LIS3LV02D) += lis3lv02d.o hp_accel.o > obj-$(CONFIG_SENSORS_LIS3_SPI) += lis3lv02d.o lis3lv02d_spi.o > obj-$(CONFIG_SENSORS_LM63) += lm63.o > --- /dev/null > +++ linux-2.6/drivers/hwmon/k10temp.c > @@ -0,0 +1,142 @@ > +/* > + * k10temp.c - AMD Family 10h/11h processor hardware monitoring > + * > + * Copyright (c) 2009 Clemens Ladisch <clemens@ladisch.de> > + * > + * > + * This driver is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License; either > + * version 2 of the License, or (at your option) any later version. > + * > + * This driver is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > + * See the GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this driver; if not, see <http://www.gnu.org/licenses/>. > + */ > + > +#include <linux/err.h> > +#include <linux/hwmon.h> > +#include <linux/hwmon-sysfs.h> > +#include <linux/init.h> > +#include <linux/module.h> > +#include <linux/pci.h> > +#include <asm/processor.h> > + > +MODULE_DESCRIPTION("AMD Family 10h/11h CPU core temperature monitor"); > +MODULE_AUTHOR("Clemens Ladisch <clemens@ladisch.de>"); > +MODULE_LICENSE("GPL"); > + > +static bool force; > +module_param(force, bool, 0444); > +MODULE_PARM_DESC(force, "force loading on processors with erratum 319"); > + > +#define REG_REPORTED_TEMPERATURE 0xa4 > +#define CURTMP_TO_MILLIDEGREES(regval) (((regval) >> 21) * 125) I'd vastly prefer this to be an inline function. Or even not a function at all, after all you're only using it once. > + > +static ssize_t show_temp(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + u32 regval; > + > + pci_read_config_dword(to_pci_dev(dev), > + REG_REPORTED_TEMPERATURE, ®val); > + return sprintf(buf, "%u\n", CURTMP_TO_MILLIDEGREES(regval)); > +} > + > +static ssize_t show_name(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return sprintf(buf, "k10temp\n"); > +} > + > +static SENSOR_DEVICE_ATTR(temp1_input, S_IRUGO, show_temp, NULL, 0); There's no point in using SENSOR_DEVICE_ATTR() if you don't make use of the last parameter. DEVICE_ATTR is enough, and cheaper. And then you no longer need <linux/hwmon-sysfs.h>. > +static DEVICE_ATTR(name, S_IRUGO, show_name, NULL); > + > +static bool __devinit has_erratum_319(void) > +{ > + /* > + * Erratum 319: The thermal sensor of older Family 10h processors > + * (B steppings) may be unreliable. > + */ > + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; > +} > + > +static int __devinit k10temp_probe(struct pci_dev *pdev, > + const struct pci_device_id *id) > +{ > + struct device *hwmon_dev; > + int err; > + > + if (has_erratum_319() && !force) { > + dev_err(&pdev->dev, > + "unreliable CPU thermal sensor; monitoring disabled\n"); > + err = -ENODEV; > + goto exit; > + } > + > + err = device_create_file(&pdev->dev, > + &sensor_dev_attr_temp1_input.dev_attr); > + if (err) > + goto exit; > + > + err = device_create_file(&pdev->dev, &dev_attr_name); > + if (err) > + goto exit_remove1; > + > + hwmon_dev = hwmon_device_register(&pdev->dev); > + if (IS_ERR(hwmon_dev)) { > + err = PTR_ERR(hwmon_dev); > + goto exit_remove2; > + } > + dev_set_drvdata(&pdev->dev, hwmon_dev); > + > + if (has_erratum_319() && force) > + dev_warn(&pdev->dev, > + "unreliable CPU thermal sensor; check erratum 319\n"); > + return 0; > + > +exit_remove2: > + device_remove_file(&pdev->dev, &dev_attr_name); > +exit_remove1: > + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); > +exit: > + return err; > +} > + > +static void __devexit k10temp_remove(struct pci_dev *pdev) > +{ > + hwmon_device_unregister(dev_get_drvdata(&pdev->dev)); > + device_remove_file(&pdev->dev, &dev_attr_name); > + device_remove_file(&pdev->dev, &sensor_dev_attr_temp1_input.dev_attr); > + dev_set_drvdata(&pdev->dev, NULL); > +} > + > +static struct pci_device_id k10temp_id_table[] = { > + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) }, > + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_11H_NB_MISC) }, > + {} > +}; > +MODULE_DEVICE_TABLE(pci, k10temp_id_table); > + > +static struct pci_driver k10temp_driver = { > + .name = "k10temp", > + .id_table = k10temp_id_table, > + .probe = k10temp_probe, > + .remove = __devexit_p(k10temp_remove), > +}; > + > +static int __init k10temp_init(void) > +{ > + return pci_register_driver(&k10temp_driver); > +} > + > +static void __exit k10temp_exit(void) > +{ > + pci_unregister_driver(&k10temp_driver); > +} > + > +module_init(k10temp_init) > +module_exit(k10temp_exit) All the rest looks OK. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-23 13:51 ` Jean Delvare @ 2009-11-23 15:29 ` Clemens Ladisch 2009-11-23 19:05 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-23 15:29 UTC (permalink / raw) To: Jean Delvare; +Cc: Andrew Morton, Serge Belyshev, linux-kernel, lm-sensors Jean Delvare wrote: > On Mon, 23 Nov 2009 08:45:58 +0100, Clemens Ladisch wrote: >> Documentation/hwmon/k10temp | 57 ++++++++++++ >> drivers/hwmon/k10temp.c | 142 ++++++++++++++++++++++++++++++++ > > The name k10temp is a problem, as AMD insists that there is no such > things as K10 and K11, but instead "family 10h" and "family 11h" > processors. K10 was AMD's internal code name, and is widely used in practice. I'd like to keep this name since it is consistent with the older k8temp driver. What name would you propose instead? "amdfam10temp"? >> + control cooling systems. Tctl is a non-physical temperature on an >> + arbitrary scale measured in degrees. It does _not_ represent an actual >> + physical temperature like die or case temperature. Instead, it specifies >> + the processor temperature relative to the point at which the system must >> + supply the maximum cooling for the processor's specified maximum case >> + temperature and maximum thermal power dissipation. >> + >> +The maximum value for Tctl is defined as 70 degrees, so, as a rule of thumb, >> +this value should not exceed 60 degrees. > > Now I am puzzled. If the temperature value is on an arbitrary scale, > then the value returned by the driver is essentially fake? Yes (and it's near enough the absolute value to look plausible). > Don't we have additional information about the actual maximum Tcase > value for the different supported models, as we have in coretemp? For AMD, Tcase is the physical temperature. Did you mean Tctl? I'll add Tctl max (= "100% cooling, please") as temp1_max, and there's a register that contains the Tctl value at which the processor starts throttling, which could be exported as temp1_crit(_hyst), if I understand the lm-sensors documentation correctly. As for your other comments, I'll integrate them in the next version of the patch. > If not, then it might be the right time to introduce a new interface > for relative temperature values. This needs some work, as we must first > define the interface, then implement support in libsensors and sensors, > and other monitoring applications, and then convert the affected > drivers. But apparently we will have to, as major CPU makers are not > able to implement something as simple as an absolute temperature > sensor :( There still is the built-in diode to be read by the motherboard, but the internal sensor was never intended to be an absolute measurement but just as a means for controlling the cooling. Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-23 15:29 ` Clemens Ladisch @ 2009-11-23 19:05 ` Jean Delvare 2009-11-24 8:43 ` Clemens Ladisch 2009-11-24 8:43 ` [PATCH v4] " Clemens Ladisch 0 siblings, 2 replies; 26+ messages in thread From: Jean Delvare @ 2009-11-23 19:05 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Andrew Morton, Serge Belyshev, linux-kernel, lm-sensors On Mon, 23 Nov 2009 16:29:25 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Mon, 23 Nov 2009 08:45:58 +0100, Clemens Ladisch wrote: > >> Documentation/hwmon/k10temp | 57 ++++++++++++ > >> drivers/hwmon/k10temp.c | 142 ++++++++++++++++++++++++++++++++ > > > > The name k10temp is a problem, as AMD insists that there is no such > > things as K10 and K11, but instead "family 10h" and "family 11h" > > processors. > > K10 was AMD's internal code name, and is widely used in practice. > I'd like to keep this name since it is consistent with the older > k8temp driver. > > What name would you propose instead? "amdfam10temp"? Not very readable, I admit. "amd10temp" would do, I guess. But I agree it doesn't matter that much, it's only a driver name after all. > >> + control cooling systems. Tctl is a non-physical temperature on an > >> + arbitrary scale measured in degrees. It does _not_ represent an actual > >> + physical temperature like die or case temperature. Instead, it specifies > >> + the processor temperature relative to the point at which the system must > >> + supply the maximum cooling for the processor's specified maximum case > >> + temperature and maximum thermal power dissipation. > >> + > >> +The maximum value for Tctl is defined as 70 degrees, so, as a rule of thumb, > >> +this value should not exceed 60 degrees. > > > > Now I am puzzled. If the temperature value is on an arbitrary scale, > > then the value returned by the driver is essentially fake? > > Yes (and it's near enough the absolute value to look plausible). I don't know if it is a good or bad idea. Bad, I guess. > > Don't we have additional information about the actual maximum Tcase > > value for the different supported models, as we have in coretemp? > > For AMD, Tcase is the physical temperature. Did you mean Tctl? I meant the physical temperature when Tctl = 70. In other words, the offset between Tctl and the physical temperature. > I'll add Tctl max (= "100% cooling, please") as temp1_max, and there's Yes, good idea. > a register that contains the Tctl value at which the processor starts > throttling, which could be exported as temp1_crit(_hyst), if I > understand the lm-sensors documentation correctly. This is correct. > As for your other comments, I'll integrate them in the next version of > the patch. > > > If not, then it might be the right time to introduce a new interface > > for relative temperature values. This needs some work, as we must first > > define the interface, then implement support in libsensors and sensors, > > and other monitoring applications, and then convert the affected > > drivers. But apparently we will have to, as major CPU makers are not > > able to implement something as simple as an absolute temperature > > sensor :( > > There still is the built-in diode to be read by the motherboard, but the > internal sensor was never intended to be an absolute measurement but > just as a means for controlling the cooling. Still we use it for that purpose at the moment. Maybe we simply should not? -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-23 19:05 ` Jean Delvare @ 2009-11-24 8:43 ` Clemens Ladisch 2009-11-24 13:26 ` Jean Delvare 2009-11-24 8:43 ` [PATCH v4] " Clemens Ladisch 1 sibling, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-24 8:43 UTC (permalink / raw) To: Jean Delvare; +Cc: Serge Belyshev, linux-kernel, lm-sensors Jean Delvare wrote: > On Mon, 23 Nov 2009 16:29:25 +0100, Clemens Ladisch wrote: > > Jean Delvare wrote: > > > The name k10temp is a problem, as AMD insists that there is no such > > > things as K10 and K11, but instead "family 10h" and "family 11h" > > > processors. > > > > K10 was AMD's internal code name, and is widely used in practice. > > I'd like to keep this name since it is consistent with the older > > k8temp driver. > > > > What name would you propose instead? "amdfam10temp"? > > Not very readable, I admit. "amd10temp" would do, I guess. But I agree > it doesn't matter that much, it's only a driver name after all. In that case, I'll just keep it. :) > > > Don't we have additional information about the actual maximum Tcase > > > value for the different supported models, as we have in coretemp? > > > > For AMD, Tcase is the physical temperature. Did you mean Tctl? > > I meant the physical temperature when Tctl = 70. In other words, the > offset between Tctl and the physical temperature. The Power and Thermal datasheets have information like "Tcase Max: 55 °C to 71 °C". So this seems to be different for individual processors. It might be possible to get that information through SB-TSI, but AMD tries to keep that specification secret. > > There still is the built-in diode to be read by the motherboard, but the > > internal sensor was never intended to be an absolute measurement but > > just as a means for controlling the cooling. > > Still we use it for that purpose at the moment. Maybe we simply should > not? Well, the absolute measurements have essentially the same purpose, and would not make much sense without comparing them to some absolute limit. In any case, it might make more sense to show such values as something like "20 °C below maximum". Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-24 8:43 ` Clemens Ladisch @ 2009-11-24 13:26 ` Jean Delvare 2009-11-24 14:09 ` Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-24 13:26 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Serge Belyshev, linux-kernel, lm-sensors On Tue, 24 Nov 2009 09:43:29 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Mon, 23 Nov 2009 16:29:25 +0100, Clemens Ladisch wrote: > > > There still is the built-in diode to be read by the motherboard, but the > > > internal sensor was never intended to be an absolute measurement but > > > just as a means for controlling the cooling. > > > > Still we use it for that purpose at the moment. Maybe we simply should > > not? > > Well, the absolute measurements have essentially the same purpose, and > would not make much sense without comparing them to some absolute limit. Of course. The problem is that users don't know that the temperature isn't real, and then get puzzled when comparing with other temperature sensors in their system, which _do_ report physical temperatures. > In any case, it might make more sense to show such values as something > like "20 °C below maximum". I think so, yes. Now the difficulty is to come up with a suitable sysfs interface. Dropping the current interface altogether doesn't sound right as it will take time before a new version of libsensors is written and spread out and all applications add support for the new interface. In the meantime, I guess we want users to still see the approximate value. So ideally we would come up with an interface that adds up to the one we have currently. Future libsensors/applications could read the extra information to display the value in a different format so that the users see the difference. An idea I have about this is adding a sysfs file temp#_relative, which would contain the fake temperature value that is used as a reference for the thermal sensor in question. In the case of k10temp, the value would be 70000. So for example we would have: temp1_input: 46000 temp1_relative: 70000 Old applications would display this as 46°C while new ones would display "24°C below the limit". For coretemp this could be 85000 or 100000 (at least in the case where we don't know the limit for sure.) If there are limits exported (temp1_max, temp1_crit etc.) the same offset could be applied to them too. This approach has the advantage of backwards compatibility. It may not be considered flexible enough though... For example it does not support sensors with totally arbitrary scales (where 1000 != 1°C.) I seem to remember we've seen this in the past? If others have ideas about how we can support this, I'm all ears. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-24 13:26 ` Jean Delvare @ 2009-11-24 14:09 ` Clemens Ladisch 2009-11-24 20:11 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-24 14:09 UTC (permalink / raw) To: Jean Delvare; +Cc: Serge Belyshev, linux-kernel, lm-sensors Jean Delvare wrote: > On Tue, 24 Nov 2009 09:43:29 +0100, Clemens Ladisch wrote: > > In any case, it might make more sense to show such values as something > > like "20 °C below maximum". > > I think so, yes. Now the difficulty is to come up with a suitable sysfs > interface. Dropping the current interface altogether doesn't sound > right as it will take time before a new version of libsensors is > written and spread out and all applications add support for the new > interface. In the meantime, I guess we want users to still see the > approximate value. k10temp is a new driver, so no old systems would break if it introduced a new class of measurements with names like, e.g., reltemp#_*. We could add this in parallel with the old interface. If we wanted to. > So ideally we would come up with an interface that adds up to the one > we have currently. Future libsensors/applications could read the extra > information to display the value in a different format so that the > users see the difference. > > An idea I have about this is adding a sysfs file temp#_relative, which > would contain the fake temperature value that is used as a reference > for the thermal sensor in question. In the case of k10temp, the value > would be 70000. So for example we would have: > > temp1_input: 46000 > temp1_relative: 70000 > > Old applications would display this as 46°C while new ones would > display "24°C below the limit". In this case, temp1_relative is identical with temp1_max. In the general case, there always must be some kind of limit (whether "max" or "crit" or something else) against which the values are measured, otherwise a relative value would not make sense. This means that one of the already existing limit values must be the reference base, so we'd need just a mechanism to specify which of them is it, i.e., "temp1_relative_base: max". If we'd have "temp1_relative: 70000", the application would have to search among the limit values for one with the same value. (It doesn't really matter which one is the base, as all values are relative anyway, so we could just define that temp#_max is the base; so we'd have "temp1_relative: true"). > It may not be considered flexible enough though... For example it does > not support sensors with totally arbitrary scales (where 1000 != 1°C.) When the scale differs but is _known_, the driver can just rescale its internal register values to millidegrees. When we have some scale like "0%...100%", that should probably be exported as "pwm#_target" or something like that. Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-24 14:09 ` Clemens Ladisch @ 2009-11-24 20:11 ` Jean Delvare 2009-11-25 9:51 ` Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-24 20:11 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Serge Belyshev, linux-kernel, lm-sensors On Tue, 24 Nov 2009 15:09:57 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Tue, 24 Nov 2009 09:43:29 +0100, Clemens Ladisch wrote: > > > In any case, it might make more sense to show such values as something > > > like "20 °C below maximum". > > > > I think so, yes. Now the difficulty is to come up with a suitable sysfs > > interface. Dropping the current interface altogether doesn't sound > > right as it will take time before a new version of libsensors is > > written and spread out and all applications add support for the new > > interface. In the meantime, I guess we want users to still see the > > approximate value. > > k10temp is a new driver, so no old systems would break if it introduced > a new class of measurements with names like, e.g., reltemp#_*. I wasn't necessarily thinking of k10temp breaking a working system, more of k10temp just not working on current systems (if we don't opt for parallel implementations.) And there's also the coretemp case, where we report relative values in some cases; that case would fit in what we're discussing now, so we would modify an already existing driver. > We could add this in parallel with the old interface. If we wanted to. I think we want to, yes. I just don't know for how long. > > So ideally we would come up with an interface that adds up to the one > > we have currently. Future libsensors/applications could read the extra > > information to display the value in a different format so that the > > users see the difference. > > > > An idea I have about this is adding a sysfs file temp#_relative, which > > would contain the fake temperature value that is used as a reference > > for the thermal sensor in question. In the case of k10temp, the value > > would be 70000. So for example we would have: > > > > temp1_input: 46000 > > temp1_relative: 70000 > > > > Old applications would display this as 46°C while new ones would > > display "24°C below the limit". > > In this case, temp1_relative is identical with temp1_max. In the > general case, there always must be some kind of limit (whether "max" or > "crit" or something else) against which the values are measured, > otherwise a relative value would not make sense. I would love this to be true, but I can't see any reason why it would have to be. I can easily imagine a CPU specification reading: "This register contains a temperature value on an arbitrary scale; higher values mean higher temperatures." I'm not claiming it would be particularly useful... but if a CPU maker ever does this, don't we want to support that? (This is a real question, maybe we don't.) I really didn't expect the implementations we've seen in Intel Core or AMD family 10h processors. So I wouldn't be surprised if future implementations are different again. > This means that one of the already existing limit values must be the > reference base, so we'd need just a mechanism to specify which of them > is it, i.e., "temp1_relative_base: max". If we'd have > "temp1_relative: 70000", the application would have to search among the > limit values for one with the same value. I fail to see why the application would care about this at all. When in relative mode, all other values would be offset by the temp#_relative value. But that value itself would not be displayed (it has no physical value, otherwise we wouldn't be in absolute mode, would we?) > (It doesn't really matter which one is the base, as all values are > relative anyway, so we could just define that temp#_max is the base; > so we'd have "temp1_relative: true"). This is taking flexibility away from us, for no benefit that I can see. Am I missing something? (Additionally it wouldn't fit in libsensors as it exists today. This doesn't mean it can't be done, but the cost is higher, so it needs to bring an significant improvement.) > > It may not be considered flexible enough though... For example it does > > not support sensors with totally arbitrary scales (where 1000 != 1°C.) > > When the scale differs but is _known_, the driver can just rescale its > internal register values to millidegrees. Correct, and actually many drivers do that already. But the scale could also be totally unknown (as in my theoretical example above), only bounded by the register size (which would lead to pretty bad temp#_min and _max IMHO.) > When we have some scale like "0%...100%", that should probably be > exported as "pwm#_target" or something like that. That would be difficult, as we generally don't know which pwm output, if any, is related to a given CPU. ACPI may know more, but the ACPI interface to cooling zones and temperatures is very limited IMHO, and hard to connect to physical devices, I doubt we can get anything usable out of it. And anyway, deciding that cooling must be somehow proportional to temperature is pretty arbitrary and doesn't match what I've seen people do so far. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-24 20:11 ` Jean Delvare @ 2009-11-25 9:51 ` Clemens Ladisch 2009-11-26 20:44 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-25 9:51 UTC (permalink / raw) To: Jean Delvare; +Cc: Serge Belyshev, linux-kernel, lm-sensors Jean Delvare wrote: > On Tue, 24 Nov 2009 15:09:57 +0100, Clemens Ladisch wrote: > > This means that one of the already existing limit values must be the > > reference base, so we'd need just a mechanism to specify which of them > > is it, i.e., "temp1_relative_base: max". If we'd have > > "temp1_relative: 70000", the application would have to search among the > > limit values for one with the same value. > > I fail to see why the application would care about this at all. When in > relative mode, all other values would be offset by the temp#_relative > value. But that value itself would not be displayed (it has no physical > value, otherwise we wouldn't be in absolute mode, would we?) > ... > > > temp1_relative: true > > This is taking flexibility away from us, for no benefit that I can see. > Am I missing something? The application has to display something like "24 °C below the limit", so how should it know that the 70°C should be named "the limit"? To use an example, my CPU has these entries like these: temp1_input: 29875 temp1_max: 70000 temp1_crit: 95000 temp1_crit_hyst: 92500 How should these entries be displayed? (we know that: "40.1 °C below limit", "limit", "25 °C above limit" etc.) But what algorithm should the application (or libsensors) use to create those labels? If we have "temp1_relative: 70000", then this happens to be the "max" limit; but what if some CPU vendor decides to define, e.g., the value 0 as the "normal" operating temperatire, so that the entries would look like this: temp1_input: -1000 temp1_max: 40000 temp1_relative: 0 Should the values be labeled as "1 °C below normal" and "40 °C above normal", and how should the application know that 0 is to be labeled "normal"? It might make more sense to display the temperature just as "41 °C below max", in which case the actual value of temp1_relative is not used at all. "Relative" means that any value is meaningful only in comparison with other values/limits, so it does not make sense to declare one point on the scale as base. > Additionally it wouldn't fit in libsensors as it exists today. Then the best bet would probably be an entry like temp#_unit, with 0 = absolute °C (default); 1 = relative °C or °K; other values "unknown". Even if some silly scale is introduced later, applications that read this entry then know that they must not display a unit like °C for unknown unit specifications. Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-25 9:51 ` Clemens Ladisch @ 2009-11-26 20:44 ` Jean Delvare 2009-11-27 13:03 ` Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-26 20:44 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Serge Belyshev, linux-kernel, lm-sensors Hi Clemens, On Wed, 25 Nov 2009 10:51:38 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Tue, 24 Nov 2009 15:09:57 +0100, Clemens Ladisch wrote: > > > This means that one of the already existing limit values must be the > > > reference base, so we'd need just a mechanism to specify which of them > > > is it, i.e., "temp1_relative_base: max". If we'd have > > > "temp1_relative: 70000", the application would have to search among the > > > limit values for one with the same value. > > > > I fail to see why the application would care about this at all. When in > > relative mode, all other values would be offset by the temp#_relative > > value. But that value itself would not be displayed (it has no physical > > value, otherwise we wouldn't be in absolute mode, would we?) > > ... > > > > > temp1_relative: true > > > > This is taking flexibility away from us, for no benefit that I can see. > > Am I missing something? > > The application has to display something like "24 °C below the limit", > so how should it know that the 70°C should be named "the limit"? OK, I get your point now. We have to think about how applications will present the information to the user. Admittedly my proposal doesn't address this part of the problem. > To use an example, my CPU has these entries like these: > temp1_input: 29875 > temp1_max: 70000 > temp1_crit: 95000 > temp1_crit_hyst: 92500 > > How should these entries be displayed? > (we know that: "40.1 °C below limit", "limit", "25 °C above limit" etc.) > > But what algorithm should the application (or libsensors) use to create > those labels? If we have "temp1_relative: 70000", then this happens to > be the "max" limit; but what if some CPU vendor decides to define, e.g., > the value 0 as the "normal" operating temperatire, so that the entries > would look like this: > temp1_input: -1000 > temp1_max: 40000 > temp1_relative: 0 > Should the values be labeled as "1 °C below normal" and "40 °C above > normal", and how should the application know that 0 is to be labeled > "normal"? It might make more sense to display the temperature just as > "41 °C below max", in which case the actual value of temp1_relative is > not used at all. Except that there may be no temp1_max, just a temperature value relative to the "normal" operating point of the CPU. In that case we can't fallback to the max limit. Even your initial proposal doesn't work there yet: the hwmon interface has no standard name for "normal operating temperature", so we can't put that name in temp#_relative. And again there's the (potential) case where we don't know what the reported temperature value is relative to. I'm wondering if this would make sense to (ab)use the temp#_label string for that. Or maybe create a new label (temp#_relative_label or similar) but I'm not sure how we would integrate this into libsensors and applications. In particular I am worried about translation issues if we make the drivers too verbose. > "Relative" means that any value is meaningful only in comparison with > other values/limits, so it does not make sense to declare one point on > the scale as base. The _hardware_ does use one point of the scale as the base, not us. We have to deal with what the hardware implements. If the base has a meaning (normal operating temperature, or critical temperature, etc.) we have to let the user know somehow. > > Additionally it wouldn't fit in libsensors as it exists today. > > Then the best bet would probably be an entry like temp#_unit, with > 0 = absolute °C (default); 1 = relative °C or °K; other values > "unknown". Even if some silly scale is introduced later, applications > that read this entry then know that they must not display a unit like °C > for unknown unit specifications. This could work, yes. Note that current drivers and libsensors don't have/know about this file yet, and they generally use an absolute °C scale. So the absence of temp#_unit file would be interpreted exactly as if the file was there and contained value 0. (I'd rather name that file temp#_scale - but that's an implementation detail.) Thanks, -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-26 20:44 ` Jean Delvare @ 2009-11-27 13:03 ` Clemens Ladisch 2010-01-10 14:45 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2009-11-27 13:03 UTC (permalink / raw) To: Jean Delvare; +Cc: linux-kernel, lm-sensors Jean Delvare wrote: > On Wed, 25 Nov 2009 10:51:38 +0100, Clemens Ladisch wrote: > > temp1_input: -1000 > > temp1_max: 40000 > > temp1_relative: 0 > > Should the values be labeled as "1 °C below normal" and "40 °C above > > normal", and how should the application know that 0 is to be labeled > > "normal"? It might make more sense to display the temperature just as > > "41 °C below max", in which case the actual value of temp1_relative is > > not used at all. > > Except that there may be no temp1_max, just a temperature value > relative to the "normal" operating point of the CPU. In that case we > can't fallback to the max limit. > > Even your initial proposal doesn't work there yet: the hwmon interface > has no standard name for "normal operating temperature", so we can't put > that name in temp#_relative. [...] > If the base has a meaning (normal operating temperature, or critical > temperature, etc.) we have to let the user know somehow. I chose that example because "normal" does not exist; and it's a bad example because "normal" actually has a meaning. Better take the AMD CPUs: The base of all relative values is zero (by definition), _not_ 70000, and the meaning of that base is just "70 °C below the temperature at which the processor wants 100% cooling". This base value is meaningless for any monitoring purposes. If any point on the scale has a meaning, it should be reported with some temp#_whatever file. However, the base itself does not necessarily have any meaning. As long as we have some corresponding _max or _crit limit that can be used for comparisons, we do not need a base value. Only if there is no known predefined limit do we need a temp#_relative value. > Or maybe create a new label (temp#_relative_label or similar) but I'm > not sure how we would integrate this into libsensors and applications. > In particular I am worried about translation issues if we make the > drivers too verbose. All known CPUs with relative temperature scale also have known _max limits, and I don't think that a CPU with relative scale and both unknown _max and _crit will ever be designed. In other words, temp#_relative* is not needed at the moment. I think we should not try to define how the semantics of such an unknown scale can be described. > > > Additionally it wouldn't fit in libsensors as it exists today. > > > > Then the best bet would probably be an entry like temp#_unit, with > > 0 = absolute °C (default); 1 = relative °C or °K; other values > > "unknown". Even if some silly scale is introduced later, applications > > that read this entry then know that they must not display a unit like °C > > for unknown unit specifications. > > This could work, yes. Note that current drivers and libsensors don't > have/know about this file yet, and they generally use an absolute °C > scale. So the absence of temp#_unit file would be interpreted exactly > as if the file was there and contained value 0. > > (I'd rather name that file temp#_scale - but that's an implementation > detail.) Like this? --- a/Documentation/hwmon/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface @@ -314,6 +314,19 @@ temp_reset_history Reset temp_lowest and temp_highest for all sensors WO +temp[1-*]_scale Temperature scale type. + Integer + RO + 0: millidegrees Celsius (default if no _scale entry) + 1: relative millidegrees Celsius; see below + 2: millivolts; see below + other values: unknown + When scale=1 (relative), the temperature value 0 does not + correspond to zero degrees Celsius but to some unknown + temperature. In this case, temperate values should not be + interpreted or displayed as absolute values and make sense + only when compared to other values of the same channel. + Some chips measure temperature using external thermistors and an ADC, and report the temperature measurement as a voltage. Converting this voltage back to a temperature (or the other way around for limits) requires Hmm, which drivers use millivolt temperatures? Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-27 13:03 ` Clemens Ladisch @ 2010-01-10 14:45 ` Jean Delvare 2010-01-15 9:57 ` Clemens Ladisch 0 siblings, 1 reply; 26+ messages in thread From: Jean Delvare @ 2010-01-10 14:45 UTC (permalink / raw) To: Clemens Ladisch; +Cc: linux-kernel, lm-sensors Hi Clemens, Sorry for the late answer... On Fri, 27 Nov 2009 14:03:29 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Wed, 25 Nov 2009 10:51:38 +0100, Clemens Ladisch wrote: > > > temp1_input: -1000 > > > temp1_max: 40000 > > > temp1_relative: 0 > > > Should the values be labeled as "1 °C below normal" and "40 °C above > > > normal", and how should the application know that 0 is to be labeled > > > "normal"? It might make more sense to display the temperature just as > > > "41 °C below max", in which case the actual value of temp1_relative is > > > not used at all. > > > > Except that there may be no temp1_max, just a temperature value > > relative to the "normal" operating point of the CPU. In that case we > > can't fallback to the max limit. > > > > Even your initial proposal doesn't work there yet: the hwmon interface > > has no standard name for "normal operating temperature", so we can't put > > that name in temp#_relative. [...] > > If the base has a meaning (normal operating temperature, or critical > > temperature, etc.) we have to let the user know somehow. > > I chose that example because "normal" does not exist; and it's a bad > example because "normal" actually has a meaning. Indeed, we may want to add an standard name for this. > Better take the AMD CPUs: The base of all relative values is zero (by > definition), _not_ 70000, and the meaning of that base is just "70 °C > below the temperature at which the processor wants 100% cooling". This > base value is meaningless for any monitoring purposes. Correct. That being said, nothing prevents us from changing the base to whatever we want it to be, pretty much by definition of relative temperature reports. For example we could report 0 for temp1_max and (register value - 70000) for temp1_input. > If any point on the scale has a meaning, it should be reported with some > temp#_whatever file. However, the base itself does not necessarily have > any meaning. > > As long as we have some corresponding _max or _crit limit that can be > used for comparisons, we do not need a base value. Only if there is > no known predefined limit do we need a temp#_relative value. I agree (again, unless we always change the base to be one of these meaningful points... but I'm not sure if we want to do this.) > > Or maybe create a new label (temp#_relative_label or similar) but I'm > > not sure how we would integrate this into libsensors and applications. > > In particular I am worried about translation issues if we make the > > drivers too verbose. > > All known CPUs with relative temperature scale also have known _max > limits, Not correct. For some variants of the Intel Core, we do not known the max limit. I'm not even sure if the max limit value exists. > and I don't think that a CPU with relative scale and both > unknown _max and _crit will ever be designed. In other words, > temp#_relative* is not needed at the moment. I think we should not > try to define how the semantics of such an unknown scale can be > described. I agree that we shouldn't waste time designing an interface for something that doesn't exist. All I'm saying is that, if we are adding something to the interface today and it's not trivial, we should make it reasonably flexible so that we don't have to do it all again next year or so. Otherwise the interface could turn quite complex and unappealing. > > > > Additionally it wouldn't fit in libsensors as it exists today. > > > > > > Then the best bet would probably be an entry like temp#_unit, with > > > 0 = absolute °C (default); 1 = relative °C or °K; other values > > > "unknown". Even if some silly scale is introduced later, applications > > > that read this entry then know that they must not display a unit like °C > > > for unknown unit specifications. > > > > This could work, yes. Note that current drivers and libsensors don't > > have/know about this file yet, and they generally use an absolute °C > > scale. So the absence of temp#_unit file would be interpreted exactly > > as if the file was there and contained value 0. > > > > (I'd rather name that file temp#_scale - but that's an implementation > > detail.) > > Like this? > > --- a/Documentation/hwmon/sysfs-interface > +++ b/Documentation/hwmon/sysfs-interface > @@ -314,6 +314,19 @@ temp_reset_history > Reset temp_lowest and temp_highest for all sensors > WO > > +temp[1-*]_scale Temperature scale type. > + Integer > + RO > + 0: millidegrees Celsius (default if no _scale entry) > + 1: relative millidegrees Celsius; see below > + 2: millivolts; see below > + other values: unknown > + When scale=1 (relative), the temperature value 0 does not > + correspond to zero degrees Celsius but to some unknown > + temperature. In this case, temperate values should not be > + interpreted or displayed as absolute values and make sense > + only when compared to other values of the same channel. > + > Some chips measure temperature using external thermistors and an ADC, and > report the temperature measurement as a voltage. Converting this voltage > back to a temperature (or the other way around for limits) requires Maybe, yes. I am a little worried that older versions of libsensors will ignore this attribute. The good thing about this is that users will still get some value until they upgrade. The bad thing is that they will not know that the value isn't absolute. They are likely to get frightened by unexpected values and/or to complain to us about them. I am wondering if a totally separate channel type wouldn't be preferable. The pros and cons would be inverted of course: older versions of libsensors would have zero support for that, and all applications would have to be updated to support it, but at least the meaning of the value would be totally clear. This would come at the price of some code duplication both in libsensors and applications though. I guess it basically depends whether we want to consider a thermal margin as a "temperature measurement except that it's relative" or as something completely separate. Honestly, I've been thinking about this for some time now and I simply don't know what we'd rather do :( > Hmm, which drivers use millivolt temperatures? vt1211, vt8231, pc87360. These devices use thermistors for temperature measurements, they measure the voltage at the pin (exactly as for a voltage input) and leave it to user-space to turn the voltage value back into a temperature value. The exact conversion formula depends on the thermistor model and the value of the resistor used for the second part of the bridge. Note that virtually every chip monitoring voltages can be used to monitor temperature instead, using thermistors. And this was seen once already: http://article.gmane.org/gmane.linux.drivers.sensors/14564/ We do not have support for this at the moment though. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2010-01-10 14:45 ` Jean Delvare @ 2010-01-15 9:57 ` Clemens Ladisch 2010-01-15 13:31 ` Jean Delvare 0 siblings, 1 reply; 26+ messages in thread From: Clemens Ladisch @ 2010-01-15 9:57 UTC (permalink / raw) To: Jean Delvare; +Cc: linux-kernel, lm-sensors Jean Delvare wrote: > On Fri, 27 Nov 2009 14:03:29 +0100, Clemens Ladisch wrote: > > +temp[1-*]_scale Temperature scale type. > > + 0: millidegrees Celsius (default if no _scale entry) > > + 1: relative millidegrees Celsius; see below > > + 2: millivolts; see below > > + other values: unknown > > Maybe, yes. I am a little worried that older versions of libsensors > will ignore this attribute. The good thing about this is that users > will still get some value until they upgrade. The bad thing is that > they will not know that the value isn't absolute. They are likely to > get frightened by unexpected values and/or to complain to us about them. > > I am wondering if a totally separate channel type wouldn't be > preferable. The pros and cons would be inverted of course: older > versions of libsensors would have zero support for that, and all > applications would have to be updated to support it, but at least the > meaning of the value would be totally clear. This would come at the > price of some code duplication both in libsensors and applications > though. > > I guess it basically depends whether we want to consider a thermal > margin as a "temperature measurement except that it's relative" or as > something completely separate. It's different from the millidegree/millivolt issue; millivolts can be transparently converted by libsensors, while relative values must be handled/displayed differently by the application. So I think at least in the libsensors API, relative values should be different. (In any case, we should add temp#_scale at least for millivolts.) > Honestly, I've been thinking about this > for some time now and I simply don't know what we'd rather do :( The sysfs interface is a somewhat internal interface; I think the main question is whether old userspace (old libsensors or old apps using a new libsensors) should be able to see relative values without knowing that they are relative. Coretemp and k10temp already exist and show relative values. If we introduced a new channel for relative values now, we would still have problems with those drivers, so it might already be too late to avoid problems for old userspace. Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2010-01-15 9:57 ` Clemens Ladisch @ 2010-01-15 13:31 ` Jean Delvare 0 siblings, 0 replies; 26+ messages in thread From: Jean Delvare @ 2010-01-15 13:31 UTC (permalink / raw) To: Clemens Ladisch; +Cc: linux-kernel, lm-sensors Hi Clemens, On Fri, 15 Jan 2010 10:57:58 +0100, Clemens Ladisch wrote: > Jean Delvare wrote: > > On Fri, 27 Nov 2009 14:03:29 +0100, Clemens Ladisch wrote: > > > +temp[1-*]_scale Temperature scale type. > > > + 0: millidegrees Celsius (default if no _scale entry) > > > + 1: relative millidegrees Celsius; see below > > > + 2: millivolts; see below > > > + other values: unknown > > > > Maybe, yes. I am a little worried that older versions of libsensors > > will ignore this attribute. The good thing about this is that users > > will still get some value until they upgrade. The bad thing is that > > they will not know that the value isn't absolute. They are likely to > > get frightened by unexpected values and/or to complain to us about them. > > > > I am wondering if a totally separate channel type wouldn't be > > preferable. The pros and cons would be inverted of course: older > > versions of libsensors would have zero support for that, and all > > applications would have to be updated to support it, but at least the > > meaning of the value would be totally clear. This would come at the > > price of some code duplication both in libsensors and applications > > though. > > > > I guess it basically depends whether we want to consider a thermal > > margin as a "temperature measurement except that it's relative" or as > > something completely separate. > > It's different from the millidegree/millivolt issue; millivolts can be > transparently converted by libsensors, while relative values must be > handled/displayed differently by the application. So I think at least > in the libsensors API, relative values should be different. > > (In any case, we should add temp#_scale at least for millivolts.) I agree it would make sense. So far, libsensors worked without it, thanks to a large default configuration file that included a default voltage->temperature conversion for each affected chip. Now that we have opted for a much smaller configuration file, completely weird temperature values will be displayed by default. Ideally, libsensors would refuse to display temperature values that are reported in mV in the absence of a corresponding conversion formula in the configuration file. I'm not sure how easy that would be to implement in libsensors though. > > Honestly, I've been thinking about this > > for some time now and I simply don't know what we'd rather do :( > > The sysfs interface is a somewhat internal interface; NO, it is not. I would love to consider it internal, and we certainly discourage people from accessing the sysfs interface directly. However some applications _do_ access sysfs directly (fancontrol and pwmconfig come to mind but there are others.) On top of that, older and newer versions of libsensors are being used. So we have to preserve compatibility in every change we make to the sysfs interface. Thus you can't claim it is internal. > I think the main > question is whether old userspace (old libsensors or old apps using a > new libsensors) should be able to see relative values without knowing > that they are relative. Yes, this is the main question. There are pros and cons both ways, as I explained before. > Coretemp and k10temp already exist and show relative values. Not really. There is a significant difference between showing absolute values which may be incorrect because we are unsure of the base point (coretemp), showing relative values which are arbitrarily offset to look like absolute values (k10temp) and showing totally relative values (thermal margins actually.) In the former two cases, there can be some confusion for the user (for example when comparing two sensor values in the same machine) but I don't expect the user to be frightened. While showing "negative temperatures" would probably frighten the users. Showing values which decrease when temperature increases (which is still an option) would also be a lot more confusing than what we have today. As a matter of fact, we will have to handle the 3 cases I just described differently, at least to some extent, for compatibility reasons (see below.) > If we > introduced a new channel for relative values now, we would still have > problems with those drivers, so it might already be too late to avoid > problems for old userspace. We can't change old user-space by definition. All we can do is think of how it will react to sysfs interface changes. One thing you must keep in mind is that trading one form of imperfection for another is not considered OK. Users get used to the initial form of imperfection and will still see the change as a regression (I do software maintenance for a living if you couldn't tell.) This is especially true when things have been the way they are for long (which is the case of coretemp.) For k10temp we have a grace period because the driver is still new and its users are few. But this won't last. Concretely, this means that, just because coretemp doesn't always report correct temperature values, we can't change it to exclusively report raw thermal margins today. Whatever change we come up with, it shouldn't alter what older versions of libsensors would report. I am fine with your temp[1-*]_scale proposal, but this is only one piece of the puzzle. We must also define what newer versions of libsensors will do with the value in that file, both for older drivers such as coretemp (for which we can't change the interface too much for compatibility reasons) and for hypothetical new drivers reporting truly relative temperatures (with k10temp being an intermediate case.) I suspect we will have change the output of "sensors" too. So far we did not print any category names, under the assumption that the unit used for each value was sufficient. Now if °C can mean absolute temperature or (relative) thermal margin, this becomes ambiguous. I've created a new ticket on lm-sensors.org to track this issue: http://www.lm-sensors.org/ticket/2376 (Sorry, the site is awfully slow these days, when responsive at all...) I can create a wiki account for you if you want to keep participating in the lm-sensors project. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v4] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-23 19:05 ` Jean Delvare 2009-11-24 8:43 ` Clemens Ladisch @ 2009-11-24 8:43 ` Clemens Ladisch 2009-11-25 19:45 ` Andrew Morton 2009-11-27 15:43 ` Jean Delvare 1 sibling, 2 replies; 26+ messages in thread From: Clemens Ladisch @ 2009-11-24 8:43 UTC (permalink / raw) To: Andrew Morton, Jean Delvare; +Cc: Serge Belyshev, linux-kernel, lm-sensors This adds a driver for the internal temperature sensor of AMD Family 10h and 11h CPUs. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> --- v3: added 'force' parameter for CPUs with buggy sensor; more documentation v4: added max/crit values, other changes suggested by Jean Delvare Documentation/hwmon/k10temp | 60 +++++++++ drivers/hwmon/Kconfig | 12 + drivers/hwmon/Makefile | 1 drivers/hwmon/k10temp.c | 197 ++++++++++++++++++++++++++++++++ 4 files changed, 270 insertions(+) --- /dev/null +++ linux-2.6/Documentation/hwmon/k10temp @@ -0,0 +1,60 @@ +Kernel driver k10temp +===================== + +Supported chips: +* AMD Family 10h processors: + Socket F: Quad-Core/Six-Core/Embedded Opteron + Socket AM2+: Opteron, Phenom (II) X3/X4 + Socket AM3: Quad-Core Opteron, Athlon/Phenom II X2/X3/X4, Sempron II + Socket S1G3: Athlon II, Sempron, Turion II +* AMD Family 11h processors: + Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) + + Prefix: 'k10temp' + Addresses scanned: PCI space + Datasheets: + BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors: + http://support.amd.com/us/Processor_TechDocs/31116.pdf + BIOS and Kernel Developer's Guide (BKDG) for AMD Family 11h Processors: + http://support.amd.com/us/Processor_TechDocs/41256.pdf + Revision Guide for AMD Family 10h Processors: + http://support.amd.com/us/Processor_TechDocs/41322.pdf + Revision Guide for AMD Family 11h Processors: + http://support.amd.com/us/Processor_TechDocs/41788.pdf + AMD Family 11h Processor Power and Thermal Data Sheet for Notebooks: + http://support.amd.com/us/Processor_TechDocs/43373.pdf + AMD Family 10h Server and Workstation Processor Power and Thermal Data Sheet: + http://support.amd.com/us/Processor_TechDocs/43374.pdf + AMD Family 10h Desktop Processor Power and Thermal Data Sheet: + http://support.amd.com/us/Processor_TechDocs/43375.pdf + +Author: Clemens Ladisch <clemens@ladisch.de> + +Description +----------- + +This driver permits reading of the internal temperature sensor of AMD +Family 10h and 11h processors. + +All these processors have a sensor, but on older revisions of Family 10h +processors, the sensor may return inconsistent values (erratum 319). The +driver will refuse to load on these revisions unless you specify the +"force=1" module parameter. + +There is one temperature measurement value, available as temp1_input in +sysfs. It is measured in degrees Celsius with a resolution of 1/8th degree. +Please note that it is defined as a relative value; to quote the AMD manual: + + Tctl is the processor temperature control value, used by the platform to + control cooling systems. Tctl is a non-physical temperature on an + arbitrary scale measured in degrees. It does _not_ represent an actual + physical temperature like die or case temperature. Instead, it specifies + the processor temperature relative to the point at which the system must + supply the maximum cooling for the processor's specified maximum case + temperature and maximum thermal power dissipation. + +The maximum value for Tctl is available in the file temp1_max. + +If the BIOS has enabled hardware temperature control, the threshold at +which the processor will throttle itself to avoid damage is available in +temp1_crit and temp1_crit_hyst. --- linux-2.6/drivers/hwmon/Kconfig +++ linux-2.6/drivers/hwmon/Kconfig @@ -222,6 +222,18 @@ config SENSORS_K8TEMP This driver can also be built as a module. If so, the module will be called k8temp. +config SENSORS_K10TEMP + tristate "AMD Phenom/Sempron/Turion/Opteron temperature sensor" + depends on X86 && PCI + help + If you say yes here you get support for the temperature + sensor(s) inside your CPU. Supported are later revisions of + the AMD Family 10h and all revisions of the AMD Family 11h + microarchitectures. + + This driver can also be built as a module. If so, the module + will be called k10temp. + config SENSORS_AMS tristate "Apple Motion Sensor driver" depends on PPC_PMAC && !PPC64 && INPUT && ((ADB_PMU && I2C = y) || (ADB_PMU && !I2C) || I2C) && EXPERIMENTAL --- linux-2.6/drivers/hwmon/Makefile +++ linux-2.6/drivers/hwmon/Makefile @@ -53,6 +53,7 @@ obj-$(CONFIG_SENSORS_IBMAEM) += ibmaem.o obj-$(CONFIG_SENSORS_IBMPEX) += ibmpex.o obj-$(CONFIG_SENSORS_IT87) += it87.o obj-$(CONFIG_SENSORS_K8TEMP) += k8temp.o +obj-$(CONFIG_SENSORS_K10TEMP) += k10temp.o obj-$(CONFIG_SENSORS_LIS3LV02D) += lis3lv02d.o hp_accel.o obj-$(CONFIG_SENSORS_LIS3_SPI) += lis3lv02d.o lis3lv02d_spi.o obj-$(CONFIG_SENSORS_LM63) += lm63.o --- /dev/null +++ linux-2.6/drivers/hwmon/k10temp.c @@ -0,0 +1,197 @@ +/* + * k10temp.c - AMD Family 10h/11h processor hardware monitoring + * + * Copyright (c) 2009 Clemens Ladisch <clemens@ladisch.de> + * + * + * This driver is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License; either + * version 2 of the License, or (at your option) any later version. + * + * This driver is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this driver; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/err.h> +#include <linux/hwmon.h> +#include <linux/hwmon-sysfs.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/pci.h> +#include <asm/processor.h> + +MODULE_DESCRIPTION("AMD Family 10h/11h CPU core temperature monitor"); +MODULE_AUTHOR("Clemens Ladisch <clemens@ladisch.de>"); +MODULE_LICENSE("GPL"); + +static bool force; +module_param(force, bool, 0444); +MODULE_PARM_DESC(force, "force loading on processors with erratum 319"); + +#define REG_HARDWARE_THERMAL_CONTROL 0x64 +#define HTC_ENABLE 0x00000001 + +#define REG_REPORTED_TEMPERATURE 0xa4 + +#define REG_NORTHBRIDGE_CAPABILITIES 0xe8 +#define NB_CAP_HTC 0x00000400 + +static ssize_t show_temp(struct device *dev, + struct device_attribute *attr, char *buf) +{ + u32 regval; + + pci_read_config_dword(to_pci_dev(dev), + REG_REPORTED_TEMPERATURE, ®val); + return sprintf(buf, "%u\n", (regval >> 21) * 125); +} + +static ssize_t show_temp_max(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "%d\n", 70 * 1000); +} + +static ssize_t show_temp_crit(struct device *dev, + struct device_attribute *devattr, char *buf) +{ + struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); + int show_hyst = attr->index; + u32 regval; + int value; + + pci_read_config_dword(to_pci_dev(dev), + REG_HARDWARE_THERMAL_CONTROL, ®val); + value = ((regval >> 16) & 0x7f) * 500 + 52000; + if (show_hyst) + value -= ((regval >> 24) & 0xf) * 500; + return sprintf(buf, "%d\n", value); +} + +static ssize_t show_name(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "k10temp\n"); +} + +static DEVICE_ATTR(temp1_input, S_IRUGO, show_temp, NULL); +static DEVICE_ATTR(temp1_max, S_IRUGO, show_temp_max, NULL); +static SENSOR_DEVICE_ATTR(temp1_crit, S_IRUGO, show_temp_crit, NULL, 0); +static SENSOR_DEVICE_ATTR(temp1_crit_hyst, S_IRUGO, show_temp_crit, NULL, 1); +static DEVICE_ATTR(name, S_IRUGO, show_name, NULL); + +static bool __devinit has_erratum_319(void) +{ + /* + * Erratum 319: The thermal sensor of older Family 10h processors + * (B steppings) may be unreliable. + */ + return boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model <= 2; +} + +static int __devinit k10temp_probe(struct pci_dev *pdev, + const struct pci_device_id *id) +{ + struct device *hwmon_dev; + u32 reg_caps, reg_htc; + int err; + + if (has_erratum_319() && !force) { + dev_err(&pdev->dev, + "unreliable CPU thermal sensor; monitoring disabled\n"); + err = -ENODEV; + goto exit; + } + + err = device_create_file(&pdev->dev, &dev_attr_temp1_input); + if (err) + goto exit; + err = device_create_file(&pdev->dev, &dev_attr_temp1_max); + if (err) + goto exit_remove; + + pci_read_config_dword(pdev, REG_NORTHBRIDGE_CAPABILITIES, ®_caps); + pci_read_config_dword(pdev, REG_HARDWARE_THERMAL_CONTROL, ®_htc); + if ((reg_caps & NB_CAP_HTC) && (reg_htc & HTC_ENABLE)) { + err = device_create_file(&pdev->dev, + &sensor_dev_attr_temp1_crit.dev_attr); + if (err) + goto exit_remove; + err = device_create_file(&pdev->dev, + &sensor_dev_attr_temp1_crit_hyst.dev_attr); + if (err) + goto exit_remove; + } + + err = device_create_file(&pdev->dev, &dev_attr_name); + if (err) + goto exit_remove; + + hwmon_dev = hwmon_device_register(&pdev->dev); + if (IS_ERR(hwmon_dev)) { + err = PTR_ERR(hwmon_dev); + goto exit_remove; + } + dev_set_drvdata(&pdev->dev, hwmon_dev); + + if (has_erratum_319() && force) + dev_warn(&pdev->dev, + "unreliable CPU thermal sensor; check erratum 319\n"); + return 0; + +exit_remove: + device_remove_file(&pdev->dev, &dev_attr_name); + device_remove_file(&pdev->dev, &dev_attr_temp1_input); + device_remove_file(&pdev->dev, &dev_attr_temp1_max); + device_remove_file(&pdev->dev, + &sensor_dev_attr_temp1_crit.dev_attr); + device_remove_file(&pdev->dev, + &sensor_dev_attr_temp1_crit_hyst.dev_attr); +exit: + return err; +} + +static void __devexit k10temp_remove(struct pci_dev *pdev) +{ + hwmon_device_unregister(dev_get_drvdata(&pdev->dev)); + device_remove_file(&pdev->dev, &dev_attr_name); + device_remove_file(&pdev->dev, &dev_attr_temp1_input); + device_remove_file(&pdev->dev, &dev_attr_temp1_max); + device_remove_file(&pdev->dev, + &sensor_dev_attr_temp1_crit.dev_attr); + device_remove_file(&pdev->dev, + &sensor_dev_attr_temp1_crit_hyst.dev_attr); + dev_set_drvdata(&pdev->dev, NULL); +} + +static struct pci_device_id k10temp_id_table[] = { + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_11H_NB_MISC) }, + {} +}; +MODULE_DEVICE_TABLE(pci, k10temp_id_table); + +static struct pci_driver k10temp_driver = { + .name = "k10temp", + .id_table = k10temp_id_table, + .probe = k10temp_probe, + .remove = __devexit_p(k10temp_remove), +}; + +static int __init k10temp_init(void) +{ + return pci_register_driver(&k10temp_driver); +} + +static void __exit k10temp_exit(void) +{ + pci_unregister_driver(&k10temp_driver); +} + +module_init(k10temp_init) +module_exit(k10temp_exit) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-24 8:43 ` [PATCH v4] " Clemens Ladisch @ 2009-11-25 19:45 ` Andrew Morton 2009-11-26 7:46 ` Clemens Ladisch 2009-11-27 15:43 ` Jean Delvare 1 sibling, 1 reply; 26+ messages in thread From: Andrew Morton @ 2009-11-25 19:45 UTC (permalink / raw) To: Clemens Ladisch Cc: Jean Delvare, Serge Belyshev, linux-kernel, lm-sensors, x86 On Tue, 24 Nov 2009 09:43:39 +0100 Clemens Ladisch <clemens@ladisch.de> wrote: > This adds a driver for the internal temperature sensor of AMD Family 10h > and 11h CPUs. > > Signed-off-by: Clemens Ladisch <clemens@ladisch.de> > --- > v3: added 'force' parameter for CPUs with buggy sensor; more documentation > v4: added max/crit values, other changes suggested by Jean Delvare Pierre's happy now? > > ... > > +#define REG_HARDWARE_THERMAL_CONTROL 0x64 > +#define HTC_ENABLE 0x00000001 > + > +#define REG_REPORTED_TEMPERATURE 0xa4 > + > +#define REG_NORTHBRIDGE_CAPABILITIES 0xe8 > +#define NB_CAP_HTC 0x00000400 These definitions relate to register addresses within the AMD CPUs, yes? I wonder if these should be placed in some header file in arch/x86/include/asm/ where other such things are described. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-25 19:45 ` Andrew Morton @ 2009-11-26 7:46 ` Clemens Ladisch 0 siblings, 0 replies; 26+ messages in thread From: Clemens Ladisch @ 2009-11-26 7:46 UTC (permalink / raw) To: Andrew Morton; +Cc: Jean Delvare, Serge Belyshev, linux-kernel, lm-sensors, x86 Andrew Morton wrote: > Clemens Ladisch <clemens@ladisch.de> wrote: > > This adds a driver for the internal temperature sensor of AMD Family 10h > > and 11h CPUs. > > ... > > +#define REG_HARDWARE_THERMAL_CONTROL 0x64 > > +#define HTC_ENABLE 0x00000001 > > + > > +#define REG_REPORTED_TEMPERATURE 0xa4 > > + > > +#define REG_NORTHBRIDGE_CAPABILITIES 0xe8 > > +#define NB_CAP_HTC 0x00000400 > > These definitions relate to register addresses within the AMD CPUs, > yes? Yes, in the PCI configuration space of the internal northbridge. > I wonder if these should be placed in some header file in > arch/x86/include/asm/ where other such things are described. There is no such header. The files that also access PCI cfg registers: arch/x86/kernel/k8.c arch/x86/kernel/quirks.c arch/x86/oprofile/op_model_amd.c define their own private symbols or use magic numbers. Best regards, Clemens ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-24 8:43 ` [PATCH v4] " Clemens Ladisch 2009-11-25 19:45 ` Andrew Morton @ 2009-11-27 15:43 ` Jean Delvare 2009-11-28 7:48 ` Andrew Morton 1 sibling, 1 reply; 26+ messages in thread From: Jean Delvare @ 2009-11-27 15:43 UTC (permalink / raw) To: Clemens Ladisch; +Cc: Andrew Morton, Serge Belyshev, linux-kernel, lm-sensors Hi Clemens, On Tue, 24 Nov 2009 09:43:39 +0100, Clemens Ladisch wrote: > This adds a driver for the internal temperature sensor of AMD Family 10h > and 11h CPUs. > > Signed-off-by: Clemens Ladisch <clemens@ladisch.de> > --- > v3: added 'force' parameter for CPUs with buggy sensor; more documentation > v4: added max/crit values, other changes suggested by Jean Delvare > > Documentation/hwmon/k10temp | 60 +++++++++ > drivers/hwmon/Kconfig | 12 + > drivers/hwmon/Makefile | 1 > drivers/hwmon/k10temp.c | 197 ++++++++++++++++++++++++++++++++ > 4 files changed, 270 insertions(+) Looks alright to me. Acked-by: Jean Delvare <khali@linux-fr.org> If this is OK with you, I can add this patch to my hwmon tree and schedule it for merge in kernel 2.6.33. -- Jean Delvare ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v4] k10temp: temperature sensor for AMD Family 10h/11h CPUs 2009-11-27 15:43 ` Jean Delvare @ 2009-11-28 7:48 ` Andrew Morton 0 siblings, 0 replies; 26+ messages in thread From: Andrew Morton @ 2009-11-28 7:48 UTC (permalink / raw) To: Jean Delvare; +Cc: Clemens Ladisch, Serge Belyshev, linux-kernel, lm-sensors On Fri, 27 Nov 2009 16:43:33 +0100 Jean Delvare <khali@linux-fr.org> wrote: > Hi Clemens, > > On Tue, 24 Nov 2009 09:43:39 +0100, Clemens Ladisch wrote: > > This adds a driver for the internal temperature sensor of AMD Family 10h > > and 11h CPUs. > > > > Signed-off-by: Clemens Ladisch <clemens@ladisch.de> > > --- > > v3: added 'force' parameter for CPUs with buggy sensor; more documentation > > v4: added max/crit values, other changes suggested by Jean Delvare > > > > Documentation/hwmon/k10temp | 60 +++++++++ > > drivers/hwmon/Kconfig | 12 + > > drivers/hwmon/Makefile | 1 > > drivers/hwmon/k10temp.c | 197 ++++++++++++++++++++++++++++++++ > > 4 files changed, 270 insertions(+) > > Looks alright to me. > > Acked-by: Jean Delvare <khali@linux-fr.org> > > If this is OK with you, I can add this patch to my hwmon tree and > schedule it for merge in kernel 2.6.33. Sounds good, thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2010-01-15 13:31 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <4AF91F70.10106@ladisch.de>
2009-11-20 8:15 ` [PATCH] k10temp: temperature sensor for AMD Family 10h/11h CPUs Clemens Ladisch
2009-11-20 10:22 ` Serge Belyshev
2009-11-20 10:44 ` [lm-sensors] " Jean Delvare
2009-11-20 10:47 ` [PATCH v2] " Clemens Ladisch
2009-11-20 11:30 ` [lm-sensors] " Jean Delvare
2009-11-20 11:56 ` Clemens Ladisch
2009-11-20 12:18 ` Jean Delvare
2009-11-23 7:45 ` [PATCH v3] " Clemens Ladisch
2009-11-23 13:51 ` Jean Delvare
2009-11-23 15:29 ` Clemens Ladisch
2009-11-23 19:05 ` Jean Delvare
2009-11-24 8:43 ` Clemens Ladisch
2009-11-24 13:26 ` Jean Delvare
2009-11-24 14:09 ` Clemens Ladisch
2009-11-24 20:11 ` Jean Delvare
2009-11-25 9:51 ` Clemens Ladisch
2009-11-26 20:44 ` Jean Delvare
2009-11-27 13:03 ` Clemens Ladisch
2010-01-10 14:45 ` Jean Delvare
2010-01-15 9:57 ` Clemens Ladisch
2010-01-15 13:31 ` Jean Delvare
2009-11-24 8:43 ` [PATCH v4] " Clemens Ladisch
2009-11-25 19:45 ` Andrew Morton
2009-11-26 7:46 ` Clemens Ladisch
2009-11-27 15:43 ` Jean Delvare
2009-11-28 7:48 ` Andrew Morton
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox