From: "lihuisong (C)" <lihuisong@huawei.com>
To: Guenter Roeck <linux@roeck-us.net>, <linux-hwmon@vger.kernel.org>,
<linux-kernel@vger.kernel.org>
Cc: <jdelvare@suse.com>, <liuyonglong@huawei.com>,
<zhanjie9@hisilicon.com>, <zhenglifeng1@huawei.com>
Subject: Re: [PATCH v1 1/4] hwmon: (acpi_power_meter) Fix using uninitialized variables
Date: Thu, 12 Dec 2024 11:00:46 +0800 [thread overview]
Message-ID: <2068a450-7752-c47b-edfc-cb2a00ac4402@huawei.com> (raw)
In-Reply-To: <1ce7718c-0bf8-4009-9240-fc6e2363ed54@roeck-us.net>
在 2024/12/12 9:51, Guenter Roeck 写道:
> On 11/26/24 19:43, lihuisong (C) wrote:
>> Hi Guenter,
>>
>> How about the modification as below? But driver doesn't know what the
>> time is to set resource->power_alarm to false.
>>
> It's a start, but incomplete because power_alarm must be reset.
>
> See below.
>
>> 在 2024/11/27 0:19, Guenter Roeck 写道:
>>> On 11/25/24 23:03, lihuisong (C) wrote:
>>>>
>>>> 在 2024/11/26 12:04, Guenter Roeck 写道:
>>>>> On 11/25/24 17:56, lihuisong (C) wrote:
>>>>>> Hi Guente,
>>>>>>
>>>>>> Thanks for your timely review.
>>>>>>
>>>>>> 在 2024/11/26 0:03, Guenter Roeck 写道:
>>>>>>> On 11/25/24 01:34, Huisong Li wrote:
>>>>>>>> The 'power1_alarm' attribute uses the 'power' and 'cap' in the
>>>>>>>> acpi_power_meter_resource structure. However, these two fields
>>>>>>>> are just
>>>>>>>> updated when user query 'power' and 'cap' attribute, or
>>>>>>>> hardware enforced
>>>>>>>> limit. If user directly query the 'power1_alarm' attribute
>>>>>>>> without queryng
>>>>>>>> above two attributes, driver will use the uninitialized
>>>>>>>> variables to judge.
>>>>>>>> In addition, the 'power1_alarm' attribute needs to update power
>>>>>>>> and cap to
>>>>>>>> show the real state.
>>>>>>>>
>>>>>>>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>>>>>>>> ---
>>>>>>>> drivers/hwmon/acpi_power_meter.c | 10 ++++++++++
>>>>>>>> 1 file changed, 10 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/hwmon/acpi_power_meter.c
>>>>>>>> b/drivers/hwmon/acpi_power_meter.c
>>>>>>>> index 2f1c9d97ad21..4c3314e35d30 100644
>>>>>>>> --- a/drivers/hwmon/acpi_power_meter.c
>>>>>>>> +++ b/drivers/hwmon/acpi_power_meter.c
>>>>>>>> @@ -396,6 +396,9 @@ static ssize_t show_val(struct device *dev,
>>>>>>>> struct acpi_device *acpi_dev = to_acpi_device(dev);
>>>>>>>> struct acpi_power_meter_resource *resource =
>>>>>>>> acpi_dev->driver_data;
>>>>>>>> u64 val = 0;
>>>>>>>> + int ret;
>>>>>>>> +
>>>>>>>> + guard(mutex)(&resource->lock);
>>>>>>>> switch (attr->index) {
>>>>>>>> case 0:
>>>>>>>> @@ -423,6 +426,13 @@ static ssize_t show_val(struct device *dev,
>>>>>>>> val = 0;
>>>>>>>> break;
>>>>>>>> case 6:
>>>>>>>> + ret = update_meter(resource);
>>>>>>>> + if (ret)
>>>>>>>> + return ret;
>>>>>>>> + ret = update_cap(resource);
>>>>>>>> + if (ret)
>>>>>>>> + return ret;
>>>>>>>> +
>>>>>>>> if (resource->power > resource->cap)
>>>>>>>> val = 1;
>>>>>>>> else
>>>>>>>
>>>>>>>
>>>>>>> While technically correct, the implementation of this attribute
>>>>>>> defeats its
>>>>>>> purpose. It is supposed to reflect the current status as
>>>>>>> reported by the
>>>>>>> hardware. A real fix would be to use the associated notification
>>>>>>> to set or
>>>>>>> reset a status flag, and to report the current value of that
>>>>>>> flag as reported
>>>>>>> by the hardware.
>>>>>> I know what you mean.
>>>>>> The Notify(power_meter, 0x83) is supposed to meet your proposal
>>>>>> IIUC.
>>>>>> It's good, but it depands on hardware support notification.
>>>>>>>
>>>>>>> If there is no notification support, the attribute should not
>>>>>>> even exist,
>>>>>>> unless there is a means to retrieve its value from ACPI (the
>>>>>>> status itself,
>>>>>>> not by comparing temperature values).
>>>>>> Currently, the 'power1_alarm' attribute is created just when
>>>>>> platform support the power meter meassurement(bit0 of the
>>>>>> supported capabilities in _PMC).
>>>>>> And it doesn't see if the platform support notifications.
>>>>>> From the current implementation of this driver, this sysfs can
>>>>>> also reflect the status by comparing power and cap,
>>>>>> which is good to the platform that support hardware limit from
>>>>>> some out-of-band mechanism but doesn't support any notification.
>>>>>>
>>>>>
>>>>> The point is that this can also be done from userspace. Hardware
>>>>> monitoring drivers
>>>>> are supposed to provide hardware attributes, not software
>>>>> attributes derived from it.
>>>>>
>>>> So this 'power1_alarm' attribute can be exposed when platform
>>>> supports hardware enforced limit and notifcations when the hardware
>>>> limit is enforced, right?
>>>> If so, we have to change the condition that driver creates this
>>>> sysfs interface.
>>>
>>> This isn't about enforcing anything, it is about reporting an alarm
>>> if the power consumed exceeds the maximum configured.
>>>
>> -->
>>
>> index 2f1c9d97ad21..b436ebd863e6
>> --- a/drivers/hwmon/acpi_power_meter.c
>> +++ b/drivers/hwmon/acpi_power_meter.c
>> @@ -84,6 +84,7 @@ struct acpi_power_meter_resource {
>> u64 power;
>> u64 cap;
>> u64 avg_interval;
>> + bool power_alarm;
>> int sensors_valid;
>> unsigned long sensors_last_updated;
>> struct sensor_device_attribute sensors[NUM_SENSORS];
>> @@ -396,6 +397,9 @@ static ssize_t show_val(struct device *dev,
>> struct acpi_device *acpi_dev = to_acpi_device(dev);
>> struct acpi_power_meter_resource *resource =
>> acpi_dev->driver_data;
>> u64 val = 0;
>> + int ret;
>> +
>> + guard(mutex)(&resource->lock);
>>
>> switch (attr->index) {
>> case 0:
>> @@ -423,10 +427,21 @@ static ssize_t show_val(struct device *dev,
>> val = 0;
>> break;
>> case 6:
>> - if (resource->power > resource->cap)
>> - val = 1;
>> - else
>> - val = 0;
>> + /* report alarm status based on the notification if
>> support. */
>> + if (resource->caps.flags & POWER_METER_CAN_NOTIFY) {
>> + val = resource->power_alarm;
>> + } else {
>> + ret = update_meter(resource);
>> + if (ret)
>> + return ret;
>> + ret = update_cap(resource);
>> + if (ret)
>> + return ret;
>> + if (resource->power > resource->cap)
>> + val = 1;
>> + else
>> + val = 0;
>> + }
>
> It would have to be something like
>
> ret = update_meter(resource);
> if (ret)
> return ret;
>
> val = resource->power_alarm || resource->power > resource->cap;
> /* clear alarm if no longer active */
> resource->power_alarm &= resource->power > resource->cap;
>
> This ensures that alarms are cached if supported, and that cached
> values are
> reported at once. It is far from perfect but the best I can think of
> since
> there is no notification that the alarm is cleared.
>
Indeed, since there is no notification that the alarm is cleared, driver
have to compare 'power' and 'cap' to clear it anyway.
If platform support notify to OSPM, driver also need to update 'power'
to show this alarm status.
In this case, no need to update 'cap' which can be updated by nofity
0x82 event, right? But this also depands on the initialization of the
"resource->cap" the probe phase needs to add.
For the platform doesn't support notify, driver have to update 'cap' and
'power' to show this status, right?
But considering above two cases, directly to update 'power' and 'cap' is
simple to handle this without more switch case.
what do you think, Guenter?
>
>> break;
>> case 7:
>> case 8:
>> @@ -853,6 +868,7 @@ static void acpi_power_meter_notify(struct
>> acpi_device *device, u32 event)
>> sysfs_notify(&device->dev.kobj, NULL,
>> POWER_AVG_INTERVAL_NAME);
>> break;
>> case METER_NOTIFY_CAPPING:
>> + resource->power_alarm = true;
>> sysfs_notify(&device->dev.kobj, NULL,
>> POWER_ALARM_NAME);
>> dev_info(&device->dev, "Capping in progress.\n");
>> break;
>>
>>> .
>>
>
>
> .
next prev parent reply other threads:[~2024-12-12 3:00 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-25 9:34 [PATCH v1 0/4] hwmon: (acpi_power_meter) Some trival optimizations Huisong Li
2024-11-25 9:34 ` [PATCH v1 1/4] hwmon: (acpi_power_meter) Fix using uninitialized variables Huisong Li
2024-11-25 16:03 ` Guenter Roeck
2024-11-26 1:56 ` lihuisong (C)
2024-11-26 4:04 ` Guenter Roeck
2024-11-26 7:03 ` lihuisong (C)
2024-11-26 16:19 ` Guenter Roeck
2024-11-27 2:29 ` lihuisong (C)
2024-11-27 3:43 ` lihuisong (C)
2024-12-11 7:41 ` lihuisong (C)
2024-12-12 1:51 ` Guenter Roeck
2024-12-12 3:00 ` lihuisong (C) [this message]
2024-12-19 3:45 ` lihuisong (C)
2024-12-19 3:50 ` Guenter Roeck
2024-12-20 6:00 ` lihuisong (C)
2024-11-25 9:34 ` [PATCH v1 2/4] hwmon: (acpi_power_meter) Fix update the power trip points on failure Huisong Li
2024-11-25 15:22 ` Guenter Roeck
2024-11-26 1:59 ` lihuisong (C)
2024-11-25 9:34 ` [PATCH v1 3/4] hwmon: (acpi_power_meter) Remove redundant 'sensors_valid' variable Huisong Li
2024-11-25 15:38 ` Guenter Roeck
2024-11-26 2:25 ` lihuisong (C)
2024-11-25 9:34 ` [PATCH v1 4/4] hwmon: (acpi_power_meter) Add the print of no notification that hardware limit is enforced Huisong Li
2024-11-25 16:13 ` Guenter Roeck
2024-11-26 3:15 ` lihuisong (C)
2024-11-26 4:06 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2068a450-7752-c47b-edfc-cb2a00ac4402@huawei.com \
--to=lihuisong@huawei.com \
--cc=jdelvare@suse.com \
--cc=linux-hwmon@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=liuyonglong@huawei.com \
--cc=zhanjie9@hisilicon.com \
--cc=zhenglifeng1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox