Linux Hardware Monitor development
From: "lihuisong (C)" <lihuisong@huawei.com>
To: Guenter Roeck <linux@roeck-us.net>, <linux-hwmon@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Cc: <jdelvare@suse.com>, <liuyonglong@huawei.com>,
	<zhanjie9@hisilicon.com>, <zhenglifeng1@huawei.com>
Subject: Re: [PATCH v1 1/4] hwmon: (acpi_power_meter) Fix using uninitialized variables
Date: Thu, 19 Dec 2024 11:45:21 +0800
Message-ID: <b2ea68c5-2bcb-3d3a-565a-94a35a48629e@huawei.com>
In-Reply-To: <2068a450-7752-c47b-edfc-cb2a00ac4402@huawei.com>


On 2024/12/12 11:00, lihuisong (C) wrote:
>
> On 2024/12/12 9:51, Guenter Roeck wrote:
>> On 11/26/24 19:43, lihuisong (C) wrote:
>>> Hi Guenter,
>>>
>>> How about the modification below? However, the driver doesn't know
>>> when to reset resource->power_alarm to false.
>>>
>> It's a start, but incomplete because power_alarm must be reset.
>>
>> See below.
>>
>>> On 2024/11/27 0:19, Guenter Roeck wrote:
>>>> On 11/25/24 23:03, lihuisong (C) wrote:
>>>>>
>>>>> On 2024/11/26 12:04, Guenter Roeck wrote:
>>>>>> On 11/25/24 17:56, lihuisong (C) wrote:
>>>>>>> Hi Guenter,
>>>>>>>
>>>>>>> Thanks for your timely review.
>>>>>>>
>>>>>>> On 2024/11/26 0:03, Guenter Roeck wrote:
>>>>>>>> On 11/25/24 01:34, Huisong Li wrote:
>>>>>>>>> The 'power1_alarm' attribute uses the 'power' and 'cap' fields of the
>>>>>>>>> acpi_power_meter_resource structure. However, these two fields are
>>>>>>>>> only updated when the user queries the 'power' and 'cap' attributes
>>>>>>>>> or when the hardware enforces a limit. If the user queries the
>>>>>>>>> 'power1_alarm' attribute directly, without first querying those two
>>>>>>>>> attributes, the driver uses uninitialized variables to decide the
>>>>>>>>> alarm state. In addition, the 'power1_alarm' attribute needs to
>>>>>>>>> update power and cap to show the real state.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/hwmon/acpi_power_meter.c | 10 ++++++++++
>>>>>>>>>   1 file changed, 10 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c
>>>>>>>>> index 2f1c9d97ad21..4c3314e35d30 100644
>>>>>>>>> --- a/drivers/hwmon/acpi_power_meter.c
>>>>>>>>> +++ b/drivers/hwmon/acpi_power_meter.c
>>>>>>>>> @@ -396,6 +396,9 @@ static ssize_t show_val(struct device *dev,
>>>>>>>>>       struct acpi_device *acpi_dev = to_acpi_device(dev);
>>>>>>>>>       struct acpi_power_meter_resource *resource = acpi_dev->driver_data;
>>>>>>>>>       u64 val = 0;
>>>>>>>>> +    int ret;
>>>>>>>>> +
>>>>>>>>> +    guard(mutex)(&resource->lock);
>>>>>>>>>         switch (attr->index) {
>>>>>>>>>       case 0:
>>>>>>>>> @@ -423,6 +426,13 @@ static ssize_t show_val(struct device *dev,
>>>>>>>>>               val = 0;
>>>>>>>>>           break;
>>>>>>>>>       case 6:
>>>>>>>>> +        ret = update_meter(resource);
>>>>>>>>> +        if (ret)
>>>>>>>>> +            return ret;
>>>>>>>>> +        ret = update_cap(resource);
>>>>>>>>> +        if (ret)
>>>>>>>>> +            return ret;
>>>>>>>>> +
>>>>>>>>>           if (resource->power > resource->cap)
>>>>>>>>>               val = 1;
>>>>>>>>>           else
>>>>>>>>
>>>>>>>>
>>>>>>>> While technically correct, the implementation of this attribute defeats
>>>>>>>> its purpose. It is supposed to reflect the current status as reported by
>>>>>>>> the hardware. A real fix would be to use the associated notification to
>>>>>>>> set or reset a status flag, and to report the current value of that flag
>>>>>>>> as reported by the hardware.
>>>>>>> I know what you mean.
>>>>>>> IIUC, Notify(power_meter, 0x83) is meant to serve that purpose.
>>>>>>> That works, but it depends on the hardware supporting notifications.
>>>>>>>>
>>>>>>>> If there is no notification support, the attribute should not even exist,
>>>>>>>> unless there is a means to retrieve its value from ACPI (the status itself,
>>>>>>>> not by comparing temperature values).
>>>>>>> Currently, the 'power1_alarm' attribute is created only when the
>>>>>>> platform supports power meter measurement (bit 0 of the supported
>>>>>>> capabilities in _PMC). The driver doesn't check whether the platform
>>>>>>> supports notifications.
>>>>>>> In the current implementation, this sysfs attribute can also reflect
>>>>>>> the status by comparing power and cap, which is useful on platforms
>>>>>>> that enforce a hardware limit through some out-of-band mechanism but
>>>>>>> don't support any notification.
>>>>>>>
>>>>>>
>>>>>> The point is that this can also be done from userspace. Hardware monitoring
>>>>>> drivers are supposed to provide hardware attributes, not software attributes
>>>>>> derived from them.
>>>>>>
>>>>> So the 'power1_alarm' attribute should be exposed only when the platform
>>>>> supports a hardware enforced limit and sends notifications when that
>>>>> limit is enforced, right?
>>>>> If so, we have to change the condition under which the driver creates
>>>>> this sysfs interface.
>>>>
>>>> This isn't about enforcing anything, it is about reporting an alarm
>>>> if the power consumed exceeds the maximum configured.
>>>>
>>> -->
>>>
>>> index 2f1c9d97ad21..b436ebd863e6
>>> --- a/drivers/hwmon/acpi_power_meter.c
>>> +++ b/drivers/hwmon/acpi_power_meter.c
>>> @@ -84,6 +84,7 @@ struct acpi_power_meter_resource {
>>>          u64             power;
>>>          u64             cap;
>>>          u64             avg_interval;
>>> +       bool            power_alarm;
>>>          int                     sensors_valid;
>>>          unsigned long           sensors_last_updated;
>>>          struct sensor_device_attribute  sensors[NUM_SENSORS];
>>> @@ -396,6 +397,9 @@ static ssize_t show_val(struct device *dev,
>>>          struct acpi_device *acpi_dev = to_acpi_device(dev);
>>>          struct acpi_power_meter_resource *resource = acpi_dev->driver_data;
>>>          u64 val = 0;
>>> +       int ret;
>>> +
>>> +       guard(mutex)(&resource->lock);
>>>
>>>          switch (attr->index) {
>>>          case 0:
>>> @@ -423,10 +427,21 @@ static ssize_t show_val(struct device *dev,
>>>                          val = 0;
>>>                  break;
>>>          case 6:
>>> -               if (resource->power > resource->cap)
>>> -                       val = 1;
>>> -               else
>>> -                       val = 0;
>>> +               /* Report the alarm status from the notification if supported. */
>>> +               if (resource->caps.flags & POWER_METER_CAN_NOTIFY) {
>>> +                       val = resource->power_alarm;
>>> +               } else {
>>> +                       ret = update_meter(resource);
>>> +                       if (ret)
>>> +                               return ret;
>>> +                       ret = update_cap(resource);
>>> +                       if (ret)
>>> +                               return ret;
>>> +                       if (resource->power > resource->cap)
>>> +                               val = 1;
>>> +                       else
>>> +                               val = 0;
>>> +               }
>>
>> It would have to be something like
>>
>>         ret = update_meter(resource);
>>         if (ret)
>>             return ret;
>>
>>         val = resource->power_alarm || resource->power > resource->cap;
>>         /* clear alarm if no longer active */
>>         resource->power_alarm &= resource->power > resource->cap;
>>
>> This ensures that alarms are cached if supported, and that cached values
>> are reported at once. It is far from perfect but the best I can think of
>> since there is no notification that the alarm is cleared.
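
For reference, the latch-and-clear behaviour of the snippet above can be
modelled in plain userspace C. This is only a sketch under stated
assumptions: struct meter_model, model_notify_capping() and
model_read_alarm() are hypothetical stand-ins rather than driver code, and
the new_power argument takes the place of the real update_meter() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of acpi_power_meter_resource. */
struct meter_model {
	uint64_t power;		/* last power reading */
	uint64_t cap;		/* configured power cap */
	bool power_alarm;	/* latched by the capping notification */
};

/* Models the METER_NOTIFY_CAPPING branch: latch the alarm. */
static void model_notify_capping(struct meter_model *m)
{
	assert(m != NULL);
	m->power_alarm = true;
}

/*
 * Models a read of power1_alarm. new_power is an assumption that stands
 * in for the value update_meter() would fetch from the hardware.
 */
static int model_read_alarm(struct meter_model *m, uint64_t new_power)
{
	bool over;
	int val;

	assert(m != NULL);
	m->power = new_power;
	over = m->power > m->cap;

	/* Report a latched alarm or a currently exceeded cap. */
	val = m->power_alarm || over;

	/* Clear the latch once the cap is no longer exceeded. */
	m->power_alarm = m->power_alarm && over;

	return val;
}
```

With cap set to 100, a notification followed by two reads at power 50
reports the alarm on the first read and clears it for the second, while
reads at power 150 keep reporting the alarm.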
>>
> Indeed, since there is no notification that the alarm is cleared, the
> driver has to compare 'power' and 'cap' to clear it anyway.
> If the platform supports notifying OSPM, the driver still needs to update
> 'power' to show the alarm status.
> In that case there is no need to update 'cap', which can be updated by the
> notify 0x82 event, right? But this also depends on initializing
> "resource->cap" in the probe phase, which would need to be added.
> If the platform doesn't support notifications, the driver has to update
> both 'cap' and 'power' to show the status, right?
>
> Considering both cases, simply updating 'power' and 'cap' directly handles
> this without adding more branches.
> What do you think, Guenter?
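
To illustrate the concern about "resource->cap", here is a small userspace
sketch. It assumes the cap field starts out as zero until something
refreshes it; struct cap_model and both helpers are hypothetical names used
only for this illustration, and hw_power / hw_cap stand in for values that
update_meter() and update_cap() would read from the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached power and cap values. */
struct cap_model {
	uint64_t power;	/* cached power reading */
	uint64_t cap;	/* cached cap, assumed zero until refreshed */
};

/* Compare against whatever cap happens to be cached. */
static bool alarm_without_cap_refresh(const struct cap_model *m)
{
	return m->power > m->cap;
}

/* Refresh both values first, then compare (the simpler option). */
static bool alarm_with_refresh(struct cap_model *m, uint64_t hw_power,
			       uint64_t hw_cap)
{
	m->power = hw_power;
	m->cap = hw_cap;
	return m->power > m->cap;
}
```

With a cached cap of 0 and a power reading of 80, the first helper reports
an alarm even though the real cap might be 100; refreshing both values
before comparing avoids that case.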

Hi Guenter,

What do you think? Looking forward to your reply.😁

/Huisong Li

>>
>>>                  break;
>>>          case 7:
>>>          case 8:
>>> @@ -853,6 +868,7 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
>>>                  sysfs_notify(&device->dev.kobj, NULL, POWER_AVG_INTERVAL_NAME);
>>>                  break;
>>>          case METER_NOTIFY_CAPPING:
>>> +               resource->power_alarm = true;
>>>                  sysfs_notify(&device->dev.kobj, NULL, POWER_ALARM_NAME);
>>>                  dev_info(&device->dev, "Capping in progress.\n");
>>>                  break;
>>>
