* [PATCH] hwmon: coretemp: fix oops on cpu unplug @ 2012-04-30 13:18 Kirill A. Shutemov 2012-04-30 15:28 ` Guenter Roeck 2012-05-01 15:20 ` Guenter Roeck 0 siblings, 2 replies; 15+ messages in thread From: Kirill A. Shutemov @ 2012-04-30 13:18 UTC (permalink / raw) To: Fenghua Yu Cc: Andi Kleen, Jean Delvare, Guenter Roeck, lm-sensors, linux-kernel, Kirill A. Shutemov From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> coretemp tries to access core_data array beyond bounds on cpu unplug if core id of the cpu if more than NUM_REAL_CORES-1. BUG: unable to handle kernel NULL pointer dereference at 000000000000013c IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp] PGD 673e5a067 PUD 66e9b3067 PMD 0 Oops: 0000 [#1] SMP CPU 79 Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile] Pid: 3315, comm: set-cpus Tainted: G W 3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R RIP: 0010:[<ffffffffa00159af>] [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp] RSP: 0018:ffff880472fb3d48 EFLAGS: 00010246 RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246 RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001 R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800 R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000 FS: 00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000) Stack: ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034 Call Trace: [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10 [<ffffffff81059d30>] __cpu_notify+0x20/0x40 [<ffffffff815fa251>] _cpu_down+0x81/0x270 [<ffffffff815fa477>] cpu_down+0x37/0x50 [<ffffffff815fd6a3>] store_online+0x63/0xc0 [<ffffffff813c7078>] dev_attr_store+0x18/0x30 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170 [<ffffffff81180443>] vfs_write+0xb3/0x180 [<ffffffff8118076a>] sys_write+0x4a/0x90 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28 RIP [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp] RSP <ffff880472fb3d48> CR2: 000000000000013c Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> --- drivers/hwmon/coretemp.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c index 0d3141f..54a70fe 100644 --- a/drivers/hwmon/coretemp.c +++ b/drivers/hwmon/coretemp.c @@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu) indx = TO_ATTR_NO(cpu); + /* The core id is too big, just return */ + if (indx > MAX_CORE_DATA - 1) + return; + if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) coretemp_remove_core(pdata, &pdev->dev, indx); -- 1.7.9.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug 2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov @ 2012-04-30 15:28 ` Guenter Roeck 2012-04-30 16:19 ` Kirill A. Shutemov 2012-05-01 15:20 ` Guenter Roeck 1 sibling, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2012-04-30 15:28 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Mon, 2012-04-30 at 09:18 -0400, Kirill A. Shutemov wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > coretemp tries to access core_data array beyond bounds on cpu unplug if > core id of the cpu if more than NUM_REAL_CORES-1. > > BUG: unable to handle kernel NULL pointer dereference at 000000000000013c > IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp] > PGD 673e5a067 PUD 66e9b3067 PMD 0 > Oops: 0000 [#1] SMP > CPU 79 > Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile] > > Pid: 3315, comm: set-cpus Tainted: G W 3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R > RIP: 0010:[<ffffffffa00159af>] [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp] > RSP: 0018:ffff880472fb3d48 EFLAGS: 00010246 > RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff > RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246 > RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001 > R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800 > R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000 > FS: 00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000) > Stack: > ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005 > 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14 > ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034 > Call Trace: > [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70 > [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10 > [<ffffffff81059d30>] __cpu_notify+0x20/0x40 > [<ffffffff815fa251>] _cpu_down+0x81/0x270 > [<ffffffff815fa477>] cpu_down+0x37/0x50 > [<ffffffff815fd6a3>] store_online+0x63/0xc0 > [<ffffffff813c7078>] dev_attr_store+0x18/0x30 > [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170 > [<ffffffff81180443>] vfs_write+0xb3/0x180 > [<ffffffff8118076a>] sys_write+0x4a/0x90 > [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b > Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28 > RIP [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp] > RSP <ffff880472fb3d48> > CR2: 000000000000013c > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > --- > drivers/hwmon/coretemp.c | 4 ++++ > 1 files changed, 4 insertions(+), 0 deletions(-) > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c > index 0d3141f..54a70fe 100644 > --- a/drivers/hwmon/coretemp.c > +++ b/drivers/hwmon/coretemp.c > @@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu) > > indx = TO_ATTR_NO(cpu); > > + /* The core id is too big, just return */ > + if (indx > MAX_CORE_DATA - 1) > + return; > + > if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) > coretemp_remove_core(pdata, &pdev->dev, indx); > Hi, good catch. Couple of problems, though. First, what number of cores are we talking about ? We should probably increase NUM_REAL_CORES as well. Long term, we should get rid of the dependency to prevent that problem from happening again, but that is a different issue. Second, we'll need the same code in get_core_online(). Otherwise the platform device can be created for the new core (if the core is re-enabled) but will never be deleted. Thanks, Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug 2012-04-30 15:28 ` Guenter Roeck @ 2012-04-30 16:19 ` Kirill A. Shutemov 2012-04-30 16:59 ` Guenter Roeck 0 siblings, 1 reply; 15+ messages in thread From: Kirill A. Shutemov @ 2012-04-30 16:19 UTC (permalink / raw) To: Guenter Roeck Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote: > Hi, > > good catch. Couple of problems, though. > > First, what number of cores are we talking about ? We should probably > increase NUM_REAL_CORES as well. Long term, we should get rid of the > dependency to prevent that problem from happening again, but that is a > different issue. It triggered by another bug. Cores on my 10-core processor have ids 0, 1, 2, 8, 9, 16, 17, 18, 24, 25. :-/ No need to change NUM_REAL_CORES at the moment. > Second, we'll need the same code in get_core_online(). Otherwise the > platform device can be created for the new core (if the core is > re-enabled) but will never be deleted. It has already handled in create_core_data(). -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug 2012-04-30 16:19 ` Kirill A. Shutemov @ 2012-04-30 16:59 ` Guenter Roeck 0 siblings, 0 replies; 15+ messages in thread From: Guenter Roeck @ 2012-04-30 16:59 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Mon, 2012-04-30 at 12:19 -0400, Kirill A. Shutemov wrote: > On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote: > > Hi, > > > > good catch. Couple of problems, though. > > > > First, what number of cores are we talking about ? We should probably > > increase NUM_REAL_CORES as well. Long term, we should get rid of the > > dependency to prevent that problem from happening again, but that is a > > different issue. > > It triggered by another bug. Cores on my 10-core processor have ids 0, 1, > 2, 8, 9, 16, 17, 18, 24, 25. :-/ > No need to change NUM_REAL_CORES at the moment. > The problem, though, is that core_data[] is indexed by the core ID. So, NUM_REAL_CORES is a bit misleading. It should probably be named NUM_CORE_ID or something similar. > > Second, we'll need the same code in get_core_online(). Otherwise the > > platform device can be created for the new core (if the core is > > re-enabled) but will never be deleted. > > It has already handled in create_core_data(). > ... which is called from coretemp_add_core(), which does not return an error. Besides, it is called after the call to coretemp_device_add(). As a result, the platform device for the unsupported core(s) will be created, even if the core is not supported. I just don't think that is a good idea. Besides, if you'd have the check in get_core_online(), you would not even hit the problem in put_core_offline(), since the platform device would not exist and the function would bail out because of that. Thanks, Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug 2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov 2012-04-30 15:28 ` Guenter Roeck @ 2012-05-01 15:20 ` Guenter Roeck 2012-05-01 21:00 ` Kirill A. Shutemov 1 sibling, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2012-05-01 15:20 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > coretemp tries to access core_data array beyond bounds on cpu unplug if > core id of the cpu if more than NUM_REAL_CORES-1. > [ ... ] > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Looking at it again, you were right. Adding the check to get_core_online() doesn't really help as the platform device is per CPU, not per core. Applied. I'll submit a separate patch to increase NUM_REAL_CORES. Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug 2012-05-01 15:20 ` Guenter Roeck @ 2012-05-01 21:00 ` Kirill A. Shutemov 2012-05-01 21:25 ` Guenter Roeck 0 siblings, 1 reply; 15+ messages in thread From: Kirill A. Shutemov @ 2012-05-01 21:00 UTC (permalink / raw) To: Guenter Roeck Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote: > On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote: > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > coretemp tries to access core_data array beyond bounds on cpu unplug if > > core id of the cpu if more than NUM_REAL_CORES-1. > > > [ ... ] > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > Looking at it again, you were right. Adding the check to get_core_online() > doesn't really help as the platform device is per CPU, not per core. > > Applied. I'll submit a separate patch to increase NUM_REAL_CORES. Actually, the problem is bigger. Documentation/cputopology.txt: ==== 2) /sys/devices/system/cpu/cpuX/topology/core_id: the CPU core ID of cpuX. Typically it is the hardware platform's identifier (rather than the kernel's). The actual value is architecture and platform dependent. ==== We should not use core id as an index in an array since it's an arbitrary number (from kernel POV). -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug 2012-05-01 21:00 ` Kirill A. Shutemov @ 2012-05-01 21:25 ` Guenter Roeck 2012-05-02 15:10 ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov 0 siblings, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2012-05-01 21:25 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Tue, May 01, 2012 at 05:00:41PM -0400, Kirill A. Shutemov wrote: > On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote: > > On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote: > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > > > coretemp tries to access core_data array beyond bounds on cpu unplug if > > > core id of the cpu if more than NUM_REAL_CORES-1. > > > > > [ ... ] > > > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > Looking at it again, you were right. Adding the check to get_core_online() > > doesn't really help as the platform device is per CPU, not per core. > > > > Applied. I'll submit a separate patch to increase NUM_REAL_CORES. > > Actually, the problem is bigger. > > Documentation/cputopology.txt: > ==== > 2) /sys/devices/system/cpu/cpuX/topology/core_id: > > the CPU core ID of cpuX. Typically it is the hardware platform's > identifier (rather than the kernel's). The actual value is > architecture and platform dependent. > ==== > > We should not use core id as an index in an array since it's an arbitrary > number (from kernel POV). > Yes, we know we'll need a better fix going forward. Using a fixed size array is bad, no matter what context. Thanks, Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-01 21:25 ` Guenter Roeck @ 2012-05-02 15:10 ` Kirill A. Shutemov 2012-05-03 5:29 ` [lm-sensors] " R, Durgadoss 2012-05-03 11:18 ` [PATCH, v2] " Kirill A. Shutemov 0 siblings, 2 replies; 15+ messages in thread From: Kirill A. Shutemov @ 2012-05-02 15:10 UTC (permalink / raw) To: Fenghua Yu, Guenter Roeck Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Let's rework code to allow arbitrary number of cores on a CPU, not limited by hardcoded array size. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> --- drivers/hwmon/coretemp.c | 176 +++++++++++++++++++++++++++++++++------------- 1 files changed, 127 insertions(+), 49 deletions(-) diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c index 54a70fe..f5f9108 100644 --- a/drivers/hwmon/coretemp.c +++ b/drivers/hwmon/coretemp.c @@ -36,6 +36,8 @@ #include <linux/cpu.h> #include <linux/pci.h> #include <linux/smp.h> +#include <linux/list.h> +#include <linux/kref.h> #include <linux/moduleparam.h> #include <asm/msr.h> #include <asm/processor.h> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu */ #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) -#define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); * @valid: If this is 1, the current temperature is valid. */ struct temp_data { + struct list_head list; + struct kref refcount; + int id; int temp; int ttarget; int tjmax; @@ -101,7 +104,8 @@ struct temp_data { struct platform_data { struct device *hwmon_dev; u16 phys_proc_id; - struct temp_data *core_data[MAX_CORE_DATA]; + struct list_head temp_data_list; + spinlock_t temp_data_lock; struct device_attribute name_attr; }; @@ -114,6 +118,29 @@ struct pdev_entry { static LIST_HEAD(pdev_list); static DEFINE_MUTEX(pdev_list_mutex); +static void temp_data_release(struct kref *ref) +{ + struct temp_data *tdata = container_of(ref, struct temp_data, refcount); + + kfree(tdata); +} + +static struct temp_data *get_temp_data(struct platform_data *pdata, int id) +{ + struct temp_data *tdata; + + spin_lock(&pdata->temp_data_lock); + list_for_each_entry(tdata, &pdata->temp_data_list, list) + if (tdata->id == id) { + kref_get(&tdata->refcount); + goto out; + } + tdata = NULL; +out: + spin_unlock(&pdata->temp_data_lock); + return tdata; +} + static ssize_t show_name(struct device *dev, struct device_attribute *devattr, char *buf) { @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); - struct temp_data *tdata = pdata->core_data[attr->index]; + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; + + if (!tdata) + return -ENOENT; if (tdata->is_pkg_data) - return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); + ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); + else + ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id); - return sprintf(buf, "Core %u\n", tdata->cpu_core_id); + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_crit_alarm(struct device *dev, @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev, u32 eax, edx; struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); - struct temp_data *tdata = pdata->core_data[attr->index]; + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; + + if (!tdata) + return -ENOENT; rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx); - return sprintf(buf, "%d\n", (eax >> 5) & 1); + ret = sprintf(buf, "%d\n", (eax >> 5) & 1); + + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_tjmax(struct device *dev, @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; + + if (!tdata) + return -ENOENT; + + ret = sprintf(buf, "%d\n", tdata->tjmax); - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax); + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_ttarget(struct device *dev, @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget); + if (!tdata) + return -ENOENT; + + ret = sprintf(buf, "%d\n", tdata->ttarget); + + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_temp(struct device *dev, @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev, u32 eax, edx; struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); - struct temp_data *tdata = pdata->core_data[attr->index]; + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret = -EAGAIN; + + if (!tdata) + return -ENOENT; mutex_lock(&tdata->update_lock); @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev, } mutex_unlock(&tdata->update_lock); - return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN; + + if (tdata->valid) + ret = sprintf(buf, "%d\n", tdata->temp); + + kref_put(&tdata->refcount, temp_data_release); + return ret; } static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id, @@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu, tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL); if (!tdata) return NULL; + kref_init(&tdata->refcount); tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS : MSR_IA32_THERM_STATUS; @@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu, return tdata; } -static int __cpuinit create_core_data(struct platform_device *pdev, +static int __cpuinit create_temp_data(struct platform_device *pdev, unsigned int cpu, int pkg_flag) { struct temp_data *tdata; @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev, */ attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu); - if (attr_no > MAX_CORE_DATA - 1) - return -ERANGE; - /* * Provide a single set of attributes for all HT siblings of a core * to avoid duplicate sensors (the processor ID and core ID of all @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev, * Skip if a HT sibling of this core is already registered. * This is not an error. */ - if (pdata->core_data[attr_no] != NULL) - return 0; + spin_lock(&pdata->temp_data_lock); + list_for_each_entry(tdata, &pdata->temp_data_list, list) { + if (tdata->id == attr_no) { + spin_unlock(&pdata->temp_data_lock); + return 0; + } + } + spin_unlock(&pdata->temp_data_lock); tdata = init_temp_data(cpu, pkg_flag); if (!tdata) @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev, } } - pdata->core_data[attr_no] = tdata; + tdata->id = attr_no; + + spin_lock(&pdata->temp_data_lock); + list_add(&tdata->list, &pdata->temp_data_list); + spin_unlock(&pdata->temp_data_lock); /* Create sysfs interfaces */ err = create_core_attrs(tdata, &pdev->dev, attr_no); if (err) - goto exit_free; + goto exit_list_del; return 0; +exit_list_del: + spin_lock(&pdata->temp_data_lock); + list_del(&tdata->list); + spin_unlock(&pdata->temp_data_lock); exit_free: - pdata->core_data[attr_no] = NULL; kfree(tdata); return err; } @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag) if (!pdev) return; - err = create_core_data(pdev, cpu, pkg_flag); + err = create_temp_data(pdev, cpu, pkg_flag); if (err) dev_err(&pdev->dev, "Adding Core %u failed\n", cpu); } -static void coretemp_remove_core(struct platform_data *pdata, - struct device *dev, int indx) +static void coretemp_remove_core(struct temp_data *tdata, struct device *dev) { int i; - struct temp_data *tdata = pdata->core_data[indx]; /* Remove the sysfs attributes */ for (i = 0; i < tdata->attr_size; i++) device_remove_file(dev, &tdata->sd_attrs[i].dev_attr); - kfree(pdata->core_data[indx]); - pdata->core_data[indx] = NULL; + list_del(&tdata->list); + kref_put(&tdata->refcount, temp_data_release); } static int __devinit coretemp_probe(struct platform_device *pdev) @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev) goto exit_free; pdata->phys_proc_id = pdev->id; + INIT_LIST_HEAD(&pdata->temp_data_list); + spin_lock_init(&pdata->temp_data_lock); platform_set_drvdata(pdev, pdata); pdata->hwmon_dev = hwmon_device_register(&pdev->dev); @@ -557,11 +634,12 @@ exit_free: static int __devexit coretemp_remove(struct platform_device *pdev) { struct platform_data *pdata = platform_get_drvdata(pdev); - int i; + struct temp_data *cur, *tmp; - for (i = MAX_CORE_DATA - 1; i >= 0; --i) - if (pdata->core_data[i]) - coretemp_remove_core(pdata, &pdev->dev, i); + spin_lock(&pdata->temp_data_lock); + list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list) + coretemp_remove_core(cur, &pdev->dev); + spin_unlock(&pdata->temp_data_lock); device_remove_file(&pdev->dev, &pdata->name_attr); hwmon_device_unregister(pdata->hwmon_dev); @@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu) static bool __cpuinit is_any_core_online(struct platform_data *pdata) { - int i; - - /* Find online cores, except pkgtemp data */ - for (i = MAX_CORE_DATA - 1; i >= 0; --i) { - if (pdata->core_data[i] && - !pdata->core_data[i]->is_pkg_data) { - return true; - } - } - return false; + struct temp_data *tdata; + bool ret = true; + + spin_lock(&pdata->temp_data_lock); + list_for_each_entry(tdata, &pdata->temp_data_list, list) + if (!tdata->is_pkg_data) + goto out; + ret = false; +out: + spin_unlock(&pdata->temp_data_lock); + return ret; } static void __cpuinit get_core_online(unsigned int cpu) @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu) static void __cpuinit put_core_offline(unsigned int cpu) { - int i, indx; + int i, attr_no; struct platform_data *pdata; struct platform_device *pdev = coretemp_get_pdev(cpu); + struct temp_data *tdata; /* If the physical CPU device does not exist, just return */ if (!pdev) @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu) pdata = platform_get_drvdata(pdev); - indx = TO_ATTR_NO(cpu); - - /* The core id is too big, just return */ - if (indx > MAX_CORE_DATA - 1) - return; + attr_no = TO_ATTR_NO(cpu); - if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) - coretemp_remove_core(pdata, &pdev->dev, indx); + tdata = get_temp_data(pdata, attr_no); + if (tdata->cpu == cpu) + coretemp_remove_core(tdata, &pdev->dev); + kref_put(&tdata->refcount, temp_data_release); /* * If a HT sibling of a core is taken offline, but another HT sibling -- 1.7.9.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* RE: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-02 15:10 ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov @ 2012-05-03 5:29 ` R, Durgadoss 2012-05-03 10:04 ` Kirill A. Shutemov 2012-05-03 11:18 ` [PATCH, v2] " Kirill A. Shutemov 1 sibling, 1 reply; 15+ messages in thread From: R, Durgadoss @ 2012-05-03 5:29 UTC (permalink / raw) To: Kirill A. Shutemov, Yu, Fenghua, Guenter Roeck Cc: Andi Kleen, linux-kernel@vger.kernel.org, lm-sensors@lm-sensors.org Hi, > -----Original Message----- > From: lm-sensors-bounces@lm-sensors.org [mailto:lm-sensors-bounces@lm- > sensors.org] On Behalf Of Kirill A. Shutemov > Sent: Wednesday, May 02, 2012 8:41 PM > To: Yu, Fenghua; Guenter Roeck > Cc: Andi Kleen; linux-kernel@vger.kernel.org; Kirill A. Shutemov; lm-sensors@lm- > sensors.org > Subject: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size > array for temp data > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > Let's rework code to allow arbitrary number of cores on a CPU, not > limited by hardcoded array size. > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > --- > drivers/hwmon/coretemp.c | 176 +++++++++++++++++++++++++++++++++--- > ---------- > 1 files changed, 127 insertions(+), 49 deletions(-) > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c > index 54a70fe..f5f9108 100644 > --- a/drivers/hwmon/coretemp.c > +++ b/drivers/hwmon/coretemp.c > @@ -36,6 +36,8 @@ > #include <linux/cpu.h> > #include <linux/pci.h> > #include <linux/smp.h> > +#include <linux/list.h> > +#include <linux/kref.h> > #include <linux/moduleparam.h> > #include <asm/msr.h> > #include <asm/processor.h> > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); > MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp > */ > -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu > */ > #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ > #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ > #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) > -#define MAX_CORE_DATA (NUM_REAL_CORES + > BASE_SYSFS_ATTR_NO) > > #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) > #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees > Celsius"); > * @valid: If this is 1, the current temperature is valid. > */ > struct temp_data { > + struct list_head list; > + struct kref refcount; > + int id; > int temp; > int ttarget; > int tjmax; > @@ -101,7 +104,8 @@ struct temp_data { > struct platform_data { > struct device *hwmon_dev; > u16 phys_proc_id; > - struct temp_data *core_data[MAX_CORE_DATA]; > + struct list_head temp_data_list; > + spinlock_t temp_data_lock; > struct device_attribute name_attr; > }; > > @@ -114,6 +118,29 @@ struct pdev_entry { > static LIST_HEAD(pdev_list); > static DEFINE_MUTEX(pdev_list_mutex); > > +static void temp_data_release(struct kref *ref) > +{ > + struct temp_data *tdata = container_of(ref, struct temp_data, refcount); > + > + kfree(tdata); > +} > + > +static struct temp_data *get_temp_data(struct platform_data *pdata, int id) > +{ > + struct temp_data *tdata; > + > + spin_lock(&pdata->temp_data_lock); > + list_for_each_entry(tdata, &pdata->temp_data_list, list) > + if (tdata->id == id) { > + kref_get(&tdata->refcount); > + goto out; > + } > + tdata = NULL; > +out: > + spin_unlock(&pdata->temp_data_lock); > + return tdata; > +} > + > static ssize_t show_name(struct device *dev, > struct device_attribute *devattr, char *buf) > { > @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev, > { > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > - struct temp_data *tdata = pdata->core_data[attr->index]; > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > + > + if (!tdata) > + return -ENOENT; > > if (tdata->is_pkg_data) > - return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); > + ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); > + else > + ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id); > > - return sprintf(buf, "Core %u\n", tdata->cpu_core_id); > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_crit_alarm(struct device *dev, > @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev, > u32 eax, edx; > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > - struct temp_data *tdata = pdata->core_data[attr->index]; > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > + > + if (!tdata) > + return -ENOENT; > > rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx); > > - return sprintf(buf, "%d\n", (eax >> 5) & 1); > + ret = sprintf(buf, "%d\n", (eax >> 5) & 1); > + > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_tjmax(struct device *dev, > @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev, > { > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > + > + if (!tdata) > + return -ENOENT; > + > + ret = sprintf(buf, "%d\n", tdata->tjmax); > > - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax); > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_ttarget(struct device *dev, > @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev, > { > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > > - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget); > + if (!tdata) > + return -ENOENT; > + > + ret = sprintf(buf, "%d\n", tdata->ttarget); > + > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_temp(struct device *dev, > @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev, > u32 eax, edx; > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > - struct temp_data *tdata = pdata->core_data[attr->index]; > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret = -EAGAIN; > + > + if (!tdata) > + return -ENOENT; > > mutex_lock(&tdata->update_lock); > > @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev, > } > > mutex_unlock(&tdata->update_lock); > - return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN; > + > + if (tdata->valid) > + ret = sprintf(buf, "%d\n", tdata->temp); > + > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id, > @@ -412,6 +478,7 @@ static struct temp_data __cpuinit > *init_temp_data(unsigned int cpu, > tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL); > if (!tdata) > return NULL; > + kref_init(&tdata->refcount); > > tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS : > > MSR_IA32_THERM_STATUS; > @@ -423,7 +490,7 @@ static struct temp_data __cpuinit > *init_temp_data(unsigned int cpu, > return tdata; > } > > -static int __cpuinit create_core_data(struct platform_device *pdev, > +static int __cpuinit create_temp_data(struct platform_device *pdev, > unsigned int cpu, int pkg_flag) > { > struct temp_data *tdata; > @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct > platform_device *pdev, > */ > attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu); > > - if (attr_no > MAX_CORE_DATA - 1) > - return -ERANGE; > - > /* > * Provide a single set of attributes for all HT siblings of a core > * to avoid duplicate sensors (the processor ID and core ID of all > @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct > platform_device *pdev, > * Skip if a HT sibling of this core is already registered. > * This is not an error. > */ > - if (pdata->core_data[attr_no] != NULL) > - return 0; > + spin_lock(&pdata->temp_data_lock); > + list_for_each_entry(tdata, &pdata->temp_data_list, list) { > + if (tdata->id == attr_no) { > + spin_unlock(&pdata->temp_data_lock); > + return 0; > + } > + } > + spin_unlock(&pdata->temp_data_lock); > > tdata = init_temp_data(cpu, pkg_flag); > if (!tdata) > @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct > platform_device *pdev, > } > } > > - pdata->core_data[attr_no] = tdata; > + tdata->id = attr_no; > + > + spin_lock(&pdata->temp_data_lock); > + list_add(&tdata->list, &pdata->temp_data_list); > + spin_unlock(&pdata->temp_data_lock); > > /* Create sysfs interfaces */ > err = create_core_attrs(tdata, &pdev->dev, attr_no); > if (err) > - goto exit_free; > + goto exit_list_del; > > return 0; > +exit_list_del: > + spin_lock(&pdata->temp_data_lock); > + list_del(&tdata->list); > + spin_unlock(&pdata->temp_data_lock); > exit_free: > - pdata->core_data[attr_no] = NULL; > kfree(tdata); > return err; > } > @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int > cpu, int pkg_flag) > if (!pdev) > return; > > - err = create_core_data(pdev, cpu, pkg_flag); > + err = create_temp_data(pdev, cpu, pkg_flag); > if (err) > dev_err(&pdev->dev, "Adding Core %u failed\n", cpu); > } > > -static void coretemp_remove_core(struct platform_data *pdata, > - struct device *dev, int indx) > +static void coretemp_remove_core(struct temp_data *tdata, struct device > *dev) > { > int i; > - struct temp_data *tdata = pdata->core_data[indx]; > > /* Remove the sysfs attributes */ > for (i = 0; i < tdata->attr_size; i++) > device_remove_file(dev, &tdata->sd_attrs[i].dev_attr); > > - kfree(pdata->core_data[indx]); > - pdata->core_data[indx] = NULL; > + list_del(&tdata->list); > + kref_put(&tdata->refcount, temp_data_release); > } > > static int __devinit coretemp_probe(struct platform_device *pdev) > @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct > platform_device *pdev) > goto exit_free; > > pdata->phys_proc_id = pdev->id; > + INIT_LIST_HEAD(&pdata->temp_data_list); > + spin_lock_init(&pdata->temp_data_lock); > platform_set_drvdata(pdev, pdata); > > pdata->hwmon_dev = hwmon_device_register(&pdev->dev); > @@ -557,11 +634,12 @@ exit_free: > static int __devexit coretemp_remove(struct platform_device *pdev) > { > struct platform_data *pdata = platform_get_drvdata(pdev); > - int i; > + struct temp_data *cur, *tmp; > > - for (i = MAX_CORE_DATA - 1; i >= 0; --i) > - if (pdata->core_data[i]) > - coretemp_remove_core(pdata, &pdev->dev, i); > + spin_lock(&pdata->temp_data_lock); > + list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list) > + coretemp_remove_core(cur, &pdev->dev); > + spin_unlock(&pdata->temp_data_lock); > > device_remove_file(&pdev->dev, &pdata->name_attr); > hwmon_device_unregister(pdata->hwmon_dev); > @@ -641,16 +719,17 @@ static void __cpuinit > coretemp_device_remove(unsigned int cpu) > > static bool __cpuinit is_any_core_online(struct platform_data *pdata) > { > - int i; > - > - /* Find online cores, except pkgtemp data */ > - for (i = MAX_CORE_DATA - 1; i >= 0; --i) { > - if (pdata->core_data[i] && > - !pdata->core_data[i]->is_pkg_data) { > - return true; > - } > - } > - return false; > + struct temp_data *tdata; > + bool ret = true; > + > + spin_lock(&pdata->temp_data_lock); > + list_for_each_entry(tdata, &pdata->temp_data_list, list) > + if (!tdata->is_pkg_data) > + goto out; > + ret = false; > +out: > + spin_unlock(&pdata->temp_data_lock); > + return ret; > } > > static void __cpuinit get_core_online(unsigned int cpu) > @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu) > > static void __cpuinit put_core_offline(unsigned int cpu) > { > - int i, indx; > + int i, attr_no; > struct platform_data *pdata; > struct platform_device *pdev = coretemp_get_pdev(cpu); > + struct temp_data *tdata; > > /* If the physical CPU device does not exist, just return */ > if (!pdev) > @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu) > > pdata = platform_get_drvdata(pdev); > > - indx = TO_ATTR_NO(cpu); > - > - /* The core id is too big, just return */ > - if (indx > MAX_CORE_DATA - 1) > - return; > + attr_no = TO_ATTR_NO(cpu); > > - if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) > - coretemp_remove_core(pdata, &pdev->dev, indx); > + tdata = get_temp_data(pdata, attr_no); > + if (tdata->cpu == cpu) The get_temp_data can return a NULL. So, you might want to do, if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash. In general, why are we using spin_locks instead of mutex_locks, for list manipulations .. ? Thanks, Durga ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-03 5:29 ` [lm-sensors] " R, Durgadoss @ 2012-05-03 10:04 ` Kirill A. Shutemov 0 siblings, 0 replies; 15+ messages in thread From: Kirill A. Shutemov @ 2012-05-03 10:04 UTC (permalink / raw) To: R, Durgadoss Cc: Yu, Fenghua, Guenter Roeck, Andi Kleen, linux-kernel@vger.kernel.org, lm-sensors@lm-sensors.org [-- Attachment #1: Type: text/plain, Size: 986 bytes --] On Thu, May 03, 2012 at 05:29:43AM +0000, R, Durgadoss wrote: > > @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu) > > > > pdata = platform_get_drvdata(pdev); > > > > - indx = TO_ATTR_NO(cpu); > > - > > - /* The core id is too big, just return */ > > - if (indx > MAX_CORE_DATA - 1) > > - return; > > + attr_no = TO_ATTR_NO(cpu); > > > > - if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) > > - coretemp_remove_core(pdata, &pdev->dev, indx); > > + tdata = get_temp_data(pdata, attr_no); > > + if (tdata->cpu == cpu) > > The get_temp_data can return a NULL. So, you might want to do, > if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash. Good catch, thanks. > In general, why are we using spin_locks instead of mutex_locks, > for list manipulations .. ? I don't think it matters here. Contention is low and nobody sleeps, but okay, I'll change it to mutex. -- Kirill A. Shutemov [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-02 15:10 ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov 2012-05-03 5:29 ` [lm-sensors] " R, Durgadoss @ 2012-05-03 11:18 ` Kirill A. Shutemov 2012-05-04 5:41 ` Guenter Roeck 1 sibling, 1 reply; 15+ messages in thread From: Kirill A. Shutemov @ 2012-05-03 11:18 UTC (permalink / raw) To: Fenghua Yu, Guenter Roeck, Durgadoss R Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Let's rework code to allow arbitrary number of cores on a CPU, not limited by hardcoded array size. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> --- v2: - fix NULL pointer dereference. Thanks to R, Durgadoss; - use mutex instead of spinlock for list locking. --- drivers/hwmon/coretemp.c | 178 +++++++++++++++++++++++++++++++++------------- 1 files changed, 129 insertions(+), 49 deletions(-) diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c index 54a70fe..1c66131 100644 --- a/drivers/hwmon/coretemp.c +++ b/drivers/hwmon/coretemp.c @@ -36,6 +36,8 @@ #include <linux/cpu.h> #include <linux/pci.h> #include <linux/smp.h> +#include <linux/list.h> +#include <linux/kref.h> #include <linux/moduleparam.h> #include <asm/msr.h> #include <asm/processor.h> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu */ #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) -#define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); * @valid: If this is 1, the current temperature is valid. */ struct temp_data { + struct list_head list; + struct kref refcount; + int id; int temp; int ttarget; int tjmax; @@ -101,7 +104,8 @@ struct temp_data { struct platform_data { struct device *hwmon_dev; u16 phys_proc_id; - struct temp_data *core_data[MAX_CORE_DATA]; + struct list_head temp_data_list; + struct mutex temp_data_lock; struct device_attribute name_attr; }; @@ -114,6 +118,29 @@ struct pdev_entry { static LIST_HEAD(pdev_list); static DEFINE_MUTEX(pdev_list_mutex); +static void temp_data_release(struct kref *ref) +{ + struct temp_data *tdata = container_of(ref, struct temp_data, refcount); + + kfree(tdata); +} + +static struct temp_data *get_temp_data(struct platform_data *pdata, int id) +{ + struct temp_data *tdata; + + mutex_lock(&pdata->temp_data_lock); + list_for_each_entry(tdata, &pdata->temp_data_list, list) + if (tdata->id == id) { + kref_get(&tdata->refcount); + goto out; + } + tdata = NULL; +out: + mutex_unlock(&pdata->temp_data_lock); + return tdata; +} + static ssize_t show_name(struct device *dev, struct device_attribute *devattr, char *buf) { @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); - struct temp_data *tdata = pdata->core_data[attr->index]; + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; + + if (!tdata) + return -ENOENT; if (tdata->is_pkg_data) - return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); + ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); + else + ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id); - return sprintf(buf, "Core %u\n", tdata->cpu_core_id); + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_crit_alarm(struct device *dev, @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev, u32 eax, edx; struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); - struct temp_data *tdata = pdata->core_data[attr->index]; + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; + + if (!tdata) + return -ENOENT; rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx); - return sprintf(buf, "%d\n", (eax >> 5) & 1); + ret = sprintf(buf, "%d\n", (eax >> 5) & 1); + + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_tjmax(struct device *dev, @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; + + if (!tdata) + return -ENOENT; + + ret = sprintf(buf, "%d\n", tdata->tjmax); - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax); + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_ttarget(struct device *dev, @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret; - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget); + if (!tdata) + return -ENOENT; + + ret = sprintf(buf, "%d\n", tdata->ttarget); + + kref_put(&tdata->refcount, temp_data_release); + return ret; } static ssize_t show_temp(struct device *dev, @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev, u32 eax, edx; struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct platform_data *pdata = dev_get_drvdata(dev); - struct temp_data *tdata = pdata->core_data[attr->index]; + struct temp_data *tdata = get_temp_data(pdata, attr->index); + ssize_t ret = -EAGAIN; + + if (!tdata) + return -ENOENT; mutex_lock(&tdata->update_lock); @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev, } mutex_unlock(&tdata->update_lock); - return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN; + + if (tdata->valid) + ret = sprintf(buf, "%d\n", tdata->temp); + + kref_put(&tdata->refcount, temp_data_release); + return ret; } static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id, @@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu, tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL); if (!tdata) return NULL; + kref_init(&tdata->refcount); tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS : MSR_IA32_THERM_STATUS; @@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu, return tdata; } -static int __cpuinit create_core_data(struct platform_device *pdev, +static int __cpuinit create_temp_data(struct platform_device *pdev, unsigned int cpu, int pkg_flag) { struct temp_data *tdata; @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev, */ attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu); - if (attr_no > MAX_CORE_DATA - 1) - return -ERANGE; - /* * Provide a single set of attributes for all HT siblings of a core * to avoid duplicate sensors (the processor ID and core ID of all @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev, * Skip if a HT sibling of this core is already registered. * This is not an error. */ - if (pdata->core_data[attr_no] != NULL) - return 0; + mutex_lock(&pdata->temp_data_lock); + list_for_each_entry(tdata, &pdata->temp_data_list, list) { + if (tdata->id == attr_no) { + mutex_unlock(&pdata->temp_data_lock); + return 0; + } + } + mutex_unlock(&pdata->temp_data_lock); tdata = init_temp_data(cpu, pkg_flag); if (!tdata) @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev, } } - pdata->core_data[attr_no] = tdata; + tdata->id = attr_no; + + mutex_lock(&pdata->temp_data_lock); + list_add(&tdata->list, &pdata->temp_data_list); + mutex_unlock(&pdata->temp_data_lock); /* Create sysfs interfaces */ err = create_core_attrs(tdata, &pdev->dev, attr_no); if (err) - goto exit_free; + goto exit_list_del; return 0; +exit_list_del: + mutex_lock(&pdata->temp_data_lock); + list_del(&tdata->list); + mutex_unlock(&pdata->temp_data_lock); exit_free: - pdata->core_data[attr_no] = NULL; kfree(tdata); return err; } @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag) if (!pdev) return; - err = create_core_data(pdev, cpu, pkg_flag); + err = create_temp_data(pdev, cpu, pkg_flag); if (err) dev_err(&pdev->dev, "Adding Core %u failed\n", cpu); } -static void coretemp_remove_core(struct platform_data *pdata, - struct device *dev, int indx) +static void coretemp_remove_core(struct temp_data *tdata, struct device *dev) { int i; - struct temp_data *tdata = pdata->core_data[indx]; /* Remove the sysfs attributes */ for (i = 0; i < tdata->attr_size; i++) device_remove_file(dev, &tdata->sd_attrs[i].dev_attr); - kfree(pdata->core_data[indx]); - pdata->core_data[indx] = NULL; + list_del(&tdata->list); + kref_put(&tdata->refcount, temp_data_release); } static int __devinit coretemp_probe(struct platform_device *pdev) @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev) goto exit_free; pdata->phys_proc_id = pdev->id; + INIT_LIST_HEAD(&pdata->temp_data_list); + mutex_init(&pdata->temp_data_lock); platform_set_drvdata(pdev, pdata); pdata->hwmon_dev = hwmon_device_register(&pdev->dev); @@ -557,11 +634,12 @@ exit_free: static int __devexit coretemp_remove(struct platform_device *pdev) { struct platform_data *pdata = platform_get_drvdata(pdev); - int i; + struct temp_data *cur, *tmp; - for (i = MAX_CORE_DATA - 1; i >= 0; --i) - if (pdata->core_data[i]) - coretemp_remove_core(pdata, &pdev->dev, i); + mutex_lock(&pdata->temp_data_lock); + list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list) + coretemp_remove_core(cur, &pdev->dev); + mutex_unlock(&pdata->temp_data_lock); device_remove_file(&pdev->dev, &pdata->name_attr); hwmon_device_unregister(pdata->hwmon_dev); @@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu) static bool __cpuinit is_any_core_online(struct platform_data *pdata) { - int i; - - /* Find online cores, except pkgtemp data */ - for (i = MAX_CORE_DATA - 1; i >= 0; --i) { - if (pdata->core_data[i] && - !pdata->core_data[i]->is_pkg_data) { - return true; - } - } - return false; + struct temp_data *tdata; + bool ret = true; + + mutex_lock(&pdata->temp_data_lock); + list_for_each_entry(tdata, &pdata->temp_data_list, list) + if (!tdata->is_pkg_data) + goto out; + ret = false; +out: + mutex_unlock(&pdata->temp_data_lock); + return ret; } static void __cpuinit get_core_online(unsigned int cpu) @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu) static void __cpuinit put_core_offline(unsigned int cpu) { - int i, indx; + int i, attr_no; struct platform_data *pdata; struct platform_device *pdev = coretemp_get_pdev(cpu); + struct temp_data *tdata; /* If the physical CPU device does not exist, just return */ if (!pdev) @@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu) pdata = platform_get_drvdata(pdev); - indx = TO_ATTR_NO(cpu); + attr_no = TO_ATTR_NO(cpu); - /* The core id is too big, just return */ - if (indx > MAX_CORE_DATA - 1) - return; - - if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) - coretemp_remove_core(pdata, &pdev->dev, indx); + tdata = get_temp_data(pdata, attr_no); + if (tdata) { + if (tdata->cpu == cpu) + coretemp_remove_core(tdata, &pdev->dev); + kref_put(&tdata->refcount, temp_data_release); + } /* * If a HT sibling of a core is taken offline, but another HT sibling -- 1.7.9.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-03 11:18 ` [PATCH, v2] " Kirill A. Shutemov @ 2012-05-04 5:41 ` Guenter Roeck 2012-05-04 6:46 ` Kirill A. Shutemov 0 siblings, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2012-05-04 5:41 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > Let's rework code to allow arbitrary number of cores on a CPU, not > limited by hardcoded array size. > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > --- > v2: > - fix NULL pointer dereference. Thanks to R, Durgadoss; > - use mutex instead of spinlock for list locking. > --- > drivers/hwmon/coretemp.c | 178 +++++++++++++++++++++++++++++++++------------- > 1 files changed, 129 insertions(+), 49 deletions(-) > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c > index 54a70fe..1c66131 100644 > --- a/drivers/hwmon/coretemp.c > +++ b/drivers/hwmon/coretemp.c > @@ -36,6 +36,8 @@ > #include <linux/cpu.h> > #include <linux/pci.h> > #include <linux/smp.h> > +#include <linux/list.h> > +#include <linux/kref.h> > #include <linux/moduleparam.h> > #include <asm/msr.h> > #include <asm/processor.h> > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); > MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ > -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu */ > #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ > #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ > #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) > -#define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) > > #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) > #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > * @valid: If this is 1, the current temperature is valid. > */ > struct temp_data { > + struct list_head list; > + struct kref refcount; Hi, the kref is not needed. The attribute access functions don't need to be protected since the attributes for a core are deleted before the core data itself is deleted. So it is not neccessary to hold a lock while accessing/using temp_data in the attribute access functions. All you need is to hold a mutex while you are manipulating or walking the list. Thanks, Guenter > + int id; > int temp; > int ttarget; > int tjmax; > @@ -101,7 +104,8 @@ struct temp_data { > struct platform_data { > struct device *hwmon_dev; > u16 phys_proc_id; > - struct temp_data *core_data[MAX_CORE_DATA]; > + struct list_head temp_data_list; > + struct mutex temp_data_lock; > struct device_attribute name_attr; > }; > > @@ -114,6 +118,29 @@ struct pdev_entry { > static LIST_HEAD(pdev_list); > static DEFINE_MUTEX(pdev_list_mutex); > > +static void temp_data_release(struct kref *ref) > +{ > + struct temp_data *tdata = container_of(ref, struct temp_data, refcount); > + > + kfree(tdata); > +} > + > +static struct temp_data *get_temp_data(struct platform_data *pdata, int id) > +{ > + struct temp_data *tdata; > + > + mutex_lock(&pdata->temp_data_lock); > + list_for_each_entry(tdata, &pdata->temp_data_list, list) > + if (tdata->id == id) { > + kref_get(&tdata->refcount); > + goto out; > + } > + tdata = NULL; > +out: > + mutex_unlock(&pdata->temp_data_lock); > + return tdata; > +} > + > static ssize_t show_name(struct device *dev, > struct device_attribute *devattr, char *buf) > { > @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev, > { > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > - struct temp_data *tdata = pdata->core_data[attr->index]; > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > + > + if (!tdata) > + return -ENOENT; > > if (tdata->is_pkg_data) > - return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); > + ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id); > + else > + ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id); > > - return sprintf(buf, "Core %u\n", tdata->cpu_core_id); > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_crit_alarm(struct device *dev, > @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev, > u32 eax, edx; > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > - struct temp_data *tdata = pdata->core_data[attr->index]; > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > + > + if (!tdata) > + return -ENOENT; > > rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx); > > - return sprintf(buf, "%d\n", (eax >> 5) & 1); > + ret = sprintf(buf, "%d\n", (eax >> 5) & 1); > + > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_tjmax(struct device *dev, > @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev, > { > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > + > + if (!tdata) > + return -ENOENT; > + > + ret = sprintf(buf, "%d\n", tdata->tjmax); > > - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax); > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_ttarget(struct device *dev, > @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev, > { > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret; > > - return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget); > + if (!tdata) > + return -ENOENT; > + > + ret = sprintf(buf, "%d\n", tdata->ttarget); > + > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static ssize_t show_temp(struct device *dev, > @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev, > u32 eax, edx; > struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); > struct platform_data *pdata = dev_get_drvdata(dev); > - struct temp_data *tdata = pdata->core_data[attr->index]; > + struct temp_data *tdata = get_temp_data(pdata, attr->index); > + ssize_t ret = -EAGAIN; > + > + if (!tdata) > + return -ENOENT; > > mutex_lock(&tdata->update_lock); > > @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev, > } > > mutex_unlock(&tdata->update_lock); > - return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN; > + > + if (tdata->valid) > + ret = sprintf(buf, "%d\n", tdata->temp); > + > + kref_put(&tdata->refcount, temp_data_release); > + return ret; > } > > static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id, > @@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu, > tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL); > if (!tdata) > return NULL; > + kref_init(&tdata->refcount); > > tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS : > MSR_IA32_THERM_STATUS; > @@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu, > return tdata; > } > > -static int __cpuinit create_core_data(struct platform_device *pdev, > +static int __cpuinit create_temp_data(struct platform_device *pdev, > unsigned int cpu, int pkg_flag) > { > struct temp_data *tdata; > @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev, > */ > attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu); > > - if (attr_no > MAX_CORE_DATA - 1) > - return -ERANGE; > - > /* > * Provide a single set of attributes for all HT siblings of a core > * to avoid duplicate sensors (the processor ID and core ID of all > @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev, > * Skip if a HT sibling of this core is already registered. > * This is not an error. > */ > - if (pdata->core_data[attr_no] != NULL) > - return 0; > + mutex_lock(&pdata->temp_data_lock); > + list_for_each_entry(tdata, &pdata->temp_data_list, list) { > + if (tdata->id == attr_no) { > + mutex_unlock(&pdata->temp_data_lock); > + return 0; > + } > + } > + mutex_unlock(&pdata->temp_data_lock); > > tdata = init_temp_data(cpu, pkg_flag); > if (!tdata) > @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev, > } > } > > - pdata->core_data[attr_no] = tdata; > + tdata->id = attr_no; > + > + mutex_lock(&pdata->temp_data_lock); > + list_add(&tdata->list, &pdata->temp_data_list); > + mutex_unlock(&pdata->temp_data_lock); > > /* Create sysfs interfaces */ > err = create_core_attrs(tdata, &pdev->dev, attr_no); > if (err) > - goto exit_free; > + goto exit_list_del; > > return 0; > +exit_list_del: > + mutex_lock(&pdata->temp_data_lock); > + list_del(&tdata->list); > + mutex_unlock(&pdata->temp_data_lock); > exit_free: > - pdata->core_data[attr_no] = NULL; > kfree(tdata); > return err; > } > @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag) > if (!pdev) > return; > > - err = create_core_data(pdev, cpu, pkg_flag); > + err = create_temp_data(pdev, cpu, pkg_flag); > if (err) > dev_err(&pdev->dev, "Adding Core %u failed\n", cpu); > } > > -static void coretemp_remove_core(struct platform_data *pdata, > - struct device *dev, int indx) > +static void coretemp_remove_core(struct temp_data *tdata, struct device *dev) > { > int i; > - struct temp_data *tdata = pdata->core_data[indx]; > > /* Remove the sysfs attributes */ > for (i = 0; i < tdata->attr_size; i++) > device_remove_file(dev, &tdata->sd_attrs[i].dev_attr); > > - kfree(pdata->core_data[indx]); > - pdata->core_data[indx] = NULL; > + list_del(&tdata->list); > + kref_put(&tdata->refcount, temp_data_release); > } > > static int __devinit coretemp_probe(struct platform_device *pdev) > @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev) > goto exit_free; > > pdata->phys_proc_id = pdev->id; > + INIT_LIST_HEAD(&pdata->temp_data_list); > + mutex_init(&pdata->temp_data_lock); > platform_set_drvdata(pdev, pdata); > > pdata->hwmon_dev = hwmon_device_register(&pdev->dev); > @@ -557,11 +634,12 @@ exit_free: > static int __devexit coretemp_remove(struct platform_device *pdev) > { > struct platform_data *pdata = platform_get_drvdata(pdev); > - int i; > + struct temp_data *cur, *tmp; > > - for (i = MAX_CORE_DATA - 1; i >= 0; --i) > - if (pdata->core_data[i]) > - coretemp_remove_core(pdata, &pdev->dev, i); > + mutex_lock(&pdata->temp_data_lock); > + list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list) > + coretemp_remove_core(cur, &pdev->dev); > + mutex_unlock(&pdata->temp_data_lock); > > device_remove_file(&pdev->dev, &pdata->name_attr); > hwmon_device_unregister(pdata->hwmon_dev); > @@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu) > > static bool __cpuinit is_any_core_online(struct platform_data *pdata) > { > - int i; > - > - /* Find online cores, except pkgtemp data */ > - for (i = MAX_CORE_DATA - 1; i >= 0; --i) { > - if (pdata->core_data[i] && > - !pdata->core_data[i]->is_pkg_data) { > - return true; > - } > - } > - return false; > + struct temp_data *tdata; > + bool ret = true; > + > + mutex_lock(&pdata->temp_data_lock); > + list_for_each_entry(tdata, &pdata->temp_data_list, list) > + if (!tdata->is_pkg_data) > + goto out; > + ret = false; > +out: > + mutex_unlock(&pdata->temp_data_lock); > + return ret; > } > > static void __cpuinit get_core_online(unsigned int cpu) > @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu) > > static void __cpuinit put_core_offline(unsigned int cpu) > { > - int i, indx; > + int i, attr_no; > struct platform_data *pdata; > struct platform_device *pdev = coretemp_get_pdev(cpu); > + struct temp_data *tdata; > > /* If the physical CPU device does not exist, just return */ > if (!pdev) > @@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu) > > pdata = platform_get_drvdata(pdev); > > - indx = TO_ATTR_NO(cpu); > + attr_no = TO_ATTR_NO(cpu); > > - /* The core id is too big, just return */ > - if (indx > MAX_CORE_DATA - 1) > - return; > - > - if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu) > - coretemp_remove_core(pdata, &pdev->dev, indx); > + tdata = get_temp_data(pdata, attr_no); > + if (tdata) { > + if (tdata->cpu == cpu) > + coretemp_remove_core(tdata, &pdev->dev); > + kref_put(&tdata->refcount, temp_data_release); > + } > > /* > * If a HT sibling of a core is taken offline, but another HT sibling > -- > 1.7.9.1 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-04 5:41 ` Guenter Roeck @ 2012-05-04 6:46 ` Kirill A. Shutemov 2012-05-04 13:34 ` Guenter Roeck 0 siblings, 1 reply; 15+ messages in thread From: Kirill A. Shutemov @ 2012-05-04 6:46 UTC (permalink / raw) To: Guenter Roeck Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 2773 bytes --] On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote: > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote: > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > Let's rework code to allow arbitrary number of cores on a CPU, not > > limited by hardcoded array size. > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > --- > > v2: > > - fix NULL pointer dereference. Thanks to R, Durgadoss; > > - use mutex instead of spinlock for list locking. > > --- > > drivers/hwmon/coretemp.c | 178 +++++++++++++++++++++++++++++++++------------- > > 1 files changed, 129 insertions(+), 49 deletions(-) > > > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c > > index 54a70fe..1c66131 100644 > > --- a/drivers/hwmon/coretemp.c > > +++ b/drivers/hwmon/coretemp.c > > @@ -36,6 +36,8 @@ > > #include <linux/cpu.h> > > #include <linux/pci.h> > > #include <linux/smp.h> > > +#include <linux/list.h> > > +#include <linux/kref.h> > > #include <linux/moduleparam.h> > > #include <asm/msr.h> > > #include <asm/processor.h> > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); > > MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > > > #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ > > -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu */ > > #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ > > #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ > > #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) > > -#define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) > > > > #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) > > #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > * @valid: If this is 1, the current temperature is valid. > > */ > > struct temp_data { > > + struct list_head list; > > + struct kref refcount; > > Hi, > > the kref is not needed. The attribute access functions don't > need to be protected since the attributes for a core are deleted > before the core data itself is deleted. So it is not neccessary > to hold a lock while accessing/using temp_data in the attribute > access functions. All you need is to hold a mutex while you are > manipulating or walking the list. Without kref, what prevents following situation: CPU-A CPU-B tdata = get_temp_data(); coretemp_remove_core() { device_remove_file(); kfree(tdata); } <tdata dereference> ? -- Kirill A. Shutemov [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-04 6:46 ` Kirill A. Shutemov @ 2012-05-04 13:34 ` Guenter Roeck 2012-05-04 13:42 ` Kirill A. Shutemov 0 siblings, 1 reply; 15+ messages in thread From: Guenter Roeck @ 2012-05-04 13:34 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote: > On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote: > > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote: > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > > > Let's rework code to allow arbitrary number of cores on a CPU, not > > > limited by hardcoded array size. > > > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > --- > > > v2: > > > - fix NULL pointer dereference. Thanks to R, Durgadoss; > > > - use mutex instead of spinlock for list locking. > > > --- > > > drivers/hwmon/coretemp.c | 178 +++++++++++++++++++++++++++++++++------------- > > > 1 files changed, 129 insertions(+), 49 deletions(-) > > > > > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c > > > index 54a70fe..1c66131 100644 > > > --- a/drivers/hwmon/coretemp.c > > > +++ b/drivers/hwmon/coretemp.c > > > @@ -36,6 +36,8 @@ > > > #include <linux/cpu.h> > > > #include <linux/pci.h> > > > #include <linux/smp.h> > > > +#include <linux/list.h> > > > +#include <linux/kref.h> > > > #include <linux/moduleparam.h> > > > #include <asm/msr.h> > > > #include <asm/processor.h> > > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); > > > MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > > > > > #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ > > > -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu */ > > > #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ > > > #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ > > > #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) > > > -#define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) > > > > > > #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) > > > #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) > > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > > * @valid: If this is 1, the current temperature is valid. > > > */ > > > struct temp_data { > > > + struct list_head list; > > > + struct kref refcount; > > > > Hi, > > > > the kref is not needed. The attribute access functions don't > > need to be protected since the attributes for a core are deleted > > before the core data itself is deleted. So it is not neccessary > > to hold a lock while accessing/using temp_data in the attribute > > access functions. All you need is to hold a mutex while you are > > manipulating or walking the list. > > Without kref, what prevents following situation: > > CPU-A CPU-B > tdata = get_temp_data(); > coretemp_remove_core() { > device_remove_file(); > kfree(tdata); > } > <tdata dereference> > The remove function requires a semaphore which is held by the access function, so device_remove_file() will only proceed after CPU-A is done with the sysfs access. Guenter ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data 2012-05-04 13:34 ` Guenter Roeck @ 2012-05-04 13:42 ` Kirill A. Shutemov 0 siblings, 0 replies; 15+ messages in thread From: Kirill A. Shutemov @ 2012-05-04 13:42 UTC (permalink / raw) To: Guenter Roeck Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors@lm-sensors.org, linux-kernel@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 3381 bytes --] On Fri, May 04, 2012 at 06:34:19AM -0700, Guenter Roeck wrote: > On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote: > > On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote: > > > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote: > > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > > > > > Let's rework code to allow arbitrary number of cores on a CPU, not > > > > limited by hardcoded array size. > > > > > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > --- > > > > v2: > > > > - fix NULL pointer dereference. Thanks to R, Durgadoss; > > > > - use mutex instead of spinlock for list locking. > > > > --- > > > > drivers/hwmon/coretemp.c | 178 +++++++++++++++++++++++++++++++++------------- > > > > 1 files changed, 129 insertions(+), 49 deletions(-) > > > > > > > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c > > > > index 54a70fe..1c66131 100644 > > > > --- a/drivers/hwmon/coretemp.c > > > > +++ b/drivers/hwmon/coretemp.c > > > > @@ -36,6 +36,8 @@ > > > > #include <linux/cpu.h> > > > > #include <linux/pci.h> > > > > #include <linux/smp.h> > > > > +#include <linux/list.h> > > > > +#include <linux/kref.h> > > > > #include <linux/moduleparam.h> > > > > #include <asm/msr.h> > > > > #include <asm/processor.h> > > > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444); > > > > MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > > > > > > > #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ > > > > -#define NUM_REAL_CORES 16 /* Number of Real cores per cpu */ > > > > #define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ > > > > #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ > > > > #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) > > > > -#define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) > > > > > > > > #define TO_PHYS_ID(cpu) (cpu_data(cpu).phys_proc_id) > > > > #define TO_CORE_ID(cpu) (cpu_data(cpu).cpu_core_id) > > > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); > > > > * @valid: If this is 1, the current temperature is valid. > > > > */ > > > > struct temp_data { > > > > + struct list_head list; > > > > + struct kref refcount; > > > > > > Hi, > > > > > > the kref is not needed. The attribute access functions don't > > > need to be protected since the attributes for a core are deleted > > > before the core data itself is deleted. So it is not neccessary > > > to hold a lock while accessing/using temp_data in the attribute > > > access functions. All you need is to hold a mutex while you are > > > manipulating or walking the list. > > > > Without kref, what prevents following situation: > > > > CPU-A CPU-B > > tdata = get_temp_data(); > > coretemp_remove_core() { > > device_remove_file(); > > kfree(tdata); > > } > > <tdata dereference> > > > The remove function requires a semaphore which is held by the access function, > so device_remove_file() will only proceed after CPU-A is done with the sysfs access. Understood. I'll update the patch. -- Kirill A. Shutemov [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-05-04 13:41 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov 2012-04-30 15:28 ` Guenter Roeck 2012-04-30 16:19 ` Kirill A. Shutemov 2012-04-30 16:59 ` Guenter Roeck 2012-05-01 15:20 ` Guenter Roeck 2012-05-01 21:00 ` Kirill A. Shutemov 2012-05-01 21:25 ` Guenter Roeck 2012-05-02 15:10 ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov 2012-05-03 5:29 ` [lm-sensors] " R, Durgadoss 2012-05-03 10:04 ` Kirill A. Shutemov 2012-05-03 11:18 ` [PATCH, v2] " Kirill A. Shutemov 2012-05-04 5:41 ` Guenter Roeck 2012-05-04 6:46 ` Kirill A. Shutemov 2012-05-04 13:34 ` Guenter Roeck 2012-05-04 13:42 ` Kirill A. Shutemov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox