linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eduardo Valentin <edubezval@gmail.com>
To: Kapileshwar Singh <kapileshwar.singh@arm.com>
Cc: Zhang Rui <rui.zhang@intel.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Javi Merino <Javi.Merino@arm.com>,
	Punit Agrawal <Punit.Agrawal@arm.com>
Subject: Re: Fix: Bind Cooling Device Weight
Date: Mon, 5 Jan 2015 17:31:08 -0400	[thread overview]
Message-ID: <20150105213106.GB31536@developer> (raw)
In-Reply-To: <5498488A.6080209@arm.com>

[-- Attachment #1: Type: text/plain, Size: 17111 bytes --]

On Mon, Dec 22, 2014 at 04:36:26PM +0000, Kapileshwar Singh wrote:
> Many thanks for looking through this! Had a few observations regarding
> the same:
> On 22/12/14 03:23, Zhang Rui wrote:
> > On Thu, 2014-12-18 at 16:44 +0000, Kapileshwar Singh wrote:
> >> Thanks for Reviewing this!
> >>
> >> On 18/12/14 13:43, Eduardo Valentin wrote:
> >>> On Thu, Dec 18, 2014 at 12:19:00PM +0000, Kapileshwar Singh wrote:
> >>>> The Problem:
> >>>>
> >>>> In the current code, the weight of the cooling device, which is read as
> >>>> contribution from the device tree in of-thermal.c as:
> >>>>
> >>>>
> >>>>         ret = of_property_read_u32(np, "contribution", &prop);
> >>>>         if (ret == 0)
> >>>>                 __tbp->usage = prop;
> >>>>
> >>>> This is only stored in the private devicedata as:
> >>>>
> >>>>         ((__thernal_zone *)cdev->devdata)->__tbp->usage
> >>>>
> >>>> As of-thermal.c specifies its "bind" operation and does not populate
> >>>> tzd->tzp->tbp, this causes an erroneous access in the fair-share
> >>>> governor when it tries to access the weight:
> >>>>
> >>>> instance->target = get_target_state(tz, cdev,
> >>>>                                         tzp->tbp[i].weight,
> >>>> cur_trip_level);
> >>>>
> >>>>
> >>>> The Proposed solution:
> >>>>
> >>>> The proposed solution has the following changes:
> >>>>
> >>>>  1. Passing the weight as an argument to thermal_zone_bind_cooling_device
> >>>>
> >>>>  2. Storing the weight in the thermal_instance data structure created in
> >>>> the thermal_zone_bind_cooling_device function
> >>>>
> >>>>  3. Changing the accesses in the governor accordingly.
> >>> Shouldn't we simply:
> >>> 1. In of-thermal, populate tzd->tzp->tbp by passing a tbp with correct
> >>> tbp's in the registration call:
> >>>     zone = thermal_zone_device_register(child->name, tz->ntrips,
> >>>                                             0,
> >>>                                             tz,
> >>>                                             ops,
> >>> /* This guy needs to be filled properly */  tzp,
> >>>                                             tz->passive_delay,
> >>>                                             tz->polling_delay);
> >> The reason why I think this might not work is because we do not have the cooling device information (pointer) when we register the thermal zone device.
> >>
> > and I think that is why .match() callback in struct thermal_bind_params
> > is introduced.
> of-thermal defines a bind function and stores the binding information in
> a private data structure.
> 
> Looking at the code where a cooling device is bound
> there seem to be two options:
> 
> 1. There is a bind operation defined that figures out a set of bind
> parameters and the cooling device associated with them (ops->bind)
>    and calls back into thermal_zone_bind_cooling_device.
> 
> 2. Or there can be bind parameters (tbp) defined that specify a match
> operation to match the cooling device to the thermal zone device.
> 
> 1 and 2 being mutually exclusive as follows (in thermal-core.c: bind_tz
> and bind_cdev:
> 
>                 if (pos->ops->bind) {
>                         ret = pos->ops->bind(pos, cdev);
>                         if (ret)
>                                 print_bind_err_msg(pos, cdev, ret);
>                         continue;
>                 }
> 
>                 tzp = pos->tzp;
>                 if (!tzp || !tzp->tbp)
>                         continue;
> 
>                 for (i = 0; i < tzp->num_tbps; i++) {
>                         if (tzp->tbp[i].cdev || !tzp->tbp[i].match)
>                                 continue;
>                         if (tzp->tbp[i].match(pos, cdev))
>                                 continue;
>                         tzp->tbp[i].cdev = cdev;
>                         __bind(pos, tzp->tbp[i].trip_mask, cdev,
>                                tzp->tbp[i].binding_limits);
>                 }
> 
> the match function wont be called if a tz->ops->bind is defined.

The above description is a correct understanding. You are right.

> 
> The fix suggested above would involve the removal of the
> __thermal_zone and __thermal_bind_params private data structures and
> also removing the bind operation from of-thermal.c.
> Would that be a direction we would like to proceed in?

well, we would need to see the code to answer that :-) But sounds like a
sane idea, yes.

> 
> I also think since thermal_instance is populated by
> thermal_zone_bind_cooling_device and represents a "bound" configuration
> for both the branches (ops->bind and .match()) it could make sense to
> add weight as a part of the thermal_instance as something
> that is a part of the "bound" configuration.
> 

here we jump into a different subject. Why would the weight be instance
specific and not platform specific (as currently coded)?


> I do realize that I could be completely wrong here and would appreciate
> if you can point me towards the right direction.
> 

there is no issue, the right direction is open discussion, as we are
doing already.

> Regards,
> KP
> 
> 
> thanks, rui
> >>> 2. Add a proper check in the governor to avoid accessing thermal zones
> >>> with missing data.
> >>>
> >>> I know there is a check in place:
> >>>         if (!tz->tzp || !tz->tzp->tbp)
> >>>               return -EINVAL;
> >>>
> >>> which based in your description seams to be failing. Here, I need more
> >>> info from your side describing what exactly you are seeing. Can you
> >>> please post the kernel log when the failure happens? Does it output a
> >>> kernel segfault or is the governor simply not working?
> >> There is no crash as such, it is more of a semantic failure where the weight being read from the device tree is not passed to the bind parameters.
> >>
> >> Cheers!
> >> KP
> >>
> >>
> >>> I would expect, with the current code, the governor be silent and
> >>> non-functional, which needs to be fixed too.
> >>>>  I am not sure of what default value should be assigned to the weight
> >>>> member in the instance data structure and would like to leave this open
> >>>> to discussion.
> >>>>
> >>>> Suggested Patch (Not Signed off):
> >>>>
> >>>> diff --git a/drivers/thermal/db8500_thermal.c
> >>>> b/drivers/thermal/db8500_thermal.c
> >>>> index 1e3b3bf9f993..e3ccc2218eb3 100644
> >>>> --- a/drivers/thermal/db8500_thermal.c
> >>>> +++ b/drivers/thermal/db8500_thermal.c
> >>>> @@ -76,7 +76,7 @@ static int db8500_cdev_bind(struct thermal_zone_device
> >>>> *thermal,
> >>>>          upper = lower = i > max_state ? max_state : i;
> >>>>
> >>>>          ret = thermal_zone_bind_cooling_device(thermal, i, cdev,
> >>>> -            upper, lower);
> >>>> +            upper, lower, THERMAL_WEIGHT_DEFAULT);
> >>>>
> >>> I think spreading such parameter, which is known to be part of tbp, is
> >>> not a good thing to do. Can we avoid that?
> >>>
> >>>
> >>> Cheers, Eduardo
> >>>>          dev_info(&cdev->device, "%s bind to %d: %d-%s\n", cdev->type,
> >>>>              i, ret, ret ? "fail" : "succeed");
> >>>> diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
> >>>> index 6e0a3fbfae86..c3b25187b467 100644
> >>>> --- a/drivers/thermal/fair_share.c
> >>>> +++ b/drivers/thermal/fair_share.c
> >>>> @@ -109,7 +109,7 @@ static int fair_share_throttle(struct
> >>>> thermal_zone_device *tz, int trip)
> >>>>              continue;
> >>>>
> >>>>          instance->target = get_target_state(tz, cdev,
> >>>> -                    tzp->tbp[i].weight, cur_trip_level);
> >>>> +                    instance->weight, cur_trip_level);
> >>>>
> >>>>          instance->cdev->updated = false;
> >>>>          thermal_cdev_update(cdev);
> >>>> diff --git a/drivers/thermal/imx_thermal.c b/drivers/thermal/imx_thermal.c
> >>>> index 5a1f1070b702..74d0c9951eef 100644
> >>>> --- a/drivers/thermal/imx_thermal.c
> >>>> +++ b/drivers/thermal/imx_thermal.c
> >>>> @@ -307,7 +307,8 @@ static int imx_bind(struct thermal_zone_device *tz,
> >>>>
> >>>>      ret = thermal_zone_bind_cooling_device(tz, IMX_TRIP_PASSIVE, cdev,
> >>>>                             THERMAL_NO_LIMIT,
> >>>> -                           THERMAL_NO_LIMIT);
> >>>> +                           THERMAL_NO_LIMIT
> >>>> +                           THERMAL_WEIGHT_DEFAULT);
> >>>>      if (ret) {
> >>>>          dev_err(&tz->device,
> >>>>              "binding zone %s with cdev %s failed:%d\n",
> >>>> diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
> >>>> index e032b9bf4085..cf4036cbb671 100644
> >>>> --- a/drivers/thermal/of-thermal.c
> >>>> +++ b/drivers/thermal/of-thermal.c
> >>>> @@ -157,7 +157,8 @@ static int of_thermal_bind(struct
> >>>> thermal_zone_device *thermal,
> >>>>              ret = thermal_zone_bind_cooling_device(thermal,
> >>>>                          tbp->trip_id, cdev,
> >>>>                          tbp->max,
> >>>> -                        tbp->min);
> >>>> +                        tbp->min,
> >>>> +                        tbp->usage);
> >>>>              if (ret)
> >>>>                  return ret;
> >>>>          }
> >>>> diff --git a/drivers/thermal/samsung/exynos_thermal_common.c
> >>>> b/drivers/thermal/samsung/exynos_thermal_common.c
> >>>> index b6be572704a4..d653798b519f 100644
> >>>> --- a/drivers/thermal/samsung/exynos_thermal_common.c
> >>>> +++ b/drivers/thermal/samsung/exynos_thermal_common.c
> >>>> @@ -163,7 +163,7 @@ static int exynos_bind(struct thermal_zone_device
> >>>> *thermal,
> >>>>          case MONITOR_ZONE:
> >>>>          case WARN_ZONE:
> >>>>              if (thermal_zone_bind_cooling_device(thermal, i, cdev,
> >>>> -                                level, 0)) {
> >>>> +                                level, 0, THERMAL_WEIGHT_DEFAULT)) {
> >>>>                  dev_err(data->dev,
> >>>>                      "error unbinding cdev inst=%d\n", i);
> >>>>                  ret = -EINVAL;
> >>>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> >>>> index 4921e084c20b..199866864ef1 100644
> >>>> --- a/drivers/thermal/thermal_core.c
> >>>> +++ b/drivers/thermal/thermal_core.c
> >>>> @@ -277,7 +277,8 @@ static void print_bind_err_msg(struct
> >>>> thermal_zone_device *tz,
> >>>>
> >>>>  static void __bind(struct thermal_zone_device *tz, int mask,
> >>>>              struct thermal_cooling_device *cdev,
> >>>> -            unsigned long *limits)
> >>>> +            unsigned long *limits,
> >>>> +            unsigned int weight)
> >>>>  {
> >>>>      int i, ret;
> >>>>
> >>>> @@ -292,7 +293,8 @@ static void __bind(struct thermal_zone_device *tz,
> >>>> int mask,
> >>>>                  upper = limits[i * 2 + 1];
> >>>>              }
> >>>>              ret = thermal_zone_bind_cooling_device(tz, i, cdev,
> >>>> -                                   upper, lower);
> >>>> +                                   upper, lower,
> >>>> +                                   weight);
> >>>>              if (ret)
> >>>>                  print_bind_err_msg(tz, cdev, ret);
> >>>>          }
> >>>> @@ -339,7 +341,8 @@ static void bind_cdev(struct thermal_cooling_device
> >>>> *cdev)
> >>>>                  continue;
> >>>>              tzp->tbp[i].cdev = cdev;
> >>>>              __bind(pos, tzp->tbp[i].trip_mask, cdev,
> >>>> -                   tzp->tbp[i].binding_limits);
> >>>> +                   tzp->tbp[i].binding_limits,
> >>>> +                   tzp->tbp[i].weight);
> >>>>          }
> >>>>      }
> >>>>
> >>>> @@ -378,7 +381,8 @@ static void bind_tz(struct thermal_zone_device *tz)
> >>>>                  continue;
> >>>>              tzp->tbp[i].cdev = pos;
> >>>>              __bind(tz, tzp->tbp[i].trip_mask, pos,
> >>>> -                   tzp->tbp[i].binding_limits);
> >>>> +                   tzp->tbp[i].binding_limits,
> >>>> +                   tzp->tbp[i].weight);
> >>>>          }
> >>>>      }
> >>>>  exit:
> >>>> @@ -770,7 +774,8 @@ passive_store(struct device *dev, struct
> >>>> device_attribute *attr,
> >>>>                  thermal_zone_bind_cooling_device(tz,
> >>>>                          THERMAL_TRIPS_NONE, cdev,
> >>>>                          THERMAL_NO_LIMIT,
> >>>> -                        THERMAL_NO_LIMIT);
> >>>> +                        THERMAL_NO_LIMIT,
> >>>> +                        THERMAL_WEIGHT_DEFAULT);
> >>>>          }
> >>>>          mutex_unlock(&thermal_list_lock);
> >>>>          if (!tz->passive_delay)
> >>>> @@ -1009,7 +1014,9 @@ thermal_cooling_device_trip_point_show(struct
> >>>> device *dev,
> >>>>   * @lower:    the Minimum cooling state can be used for this trip point.
> >>>>   *        THERMAL_NO_LIMIT means no lower limit,
> >>>>   *        and the cooling device can be in cooling state 0.
> >>>> - *
> >>>> + * @weight:    The weight of the cooling device to be bound to the
> >>>> +        thermal zone. THERMAL_WEIGHT_DEFAULT for default value
> >>>> +
> >>>>   * This interface function bind a thermal cooling device to the certain
> >>>> trip
> >>>>   * point of a thermal zone device.
> >>>>   * This function is usually called in the thermal zone device .bind
> >>>> callback.
> >>>> @@ -1019,7 +1026,8 @@ thermal_cooling_device_trip_point_show(struct
> >>>> device *dev,
> >>>>  int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
> >>>>                       int trip,
> >>>>                       struct thermal_cooling_device *cdev,
> >>>> -                     unsigned long upper, unsigned long lower)
> >>>> +                     unsigned long upper, unsigned long lower,
> >>>> +                     unsigned int weight)
> >>>>  {
> >>>>      struct thermal_instance *dev;
> >>>>      struct thermal_instance *pos;
> >>>> @@ -1062,6 +1070,7 @@ int thermal_zone_bind_cooling_device(struct
> >>>> thermal_zone_device *tz,
> >>>>      dev->upper = upper;
> >>>>      dev->lower = lower;
> >>>>      dev->target = THERMAL_NO_TARGET;
> >>>> +    dev->weight = weight;
> >>>>
> >>>>      result = get_idr(&tz->idr, &tz->lock, &dev->id);
> >>>>      if (result)
> >>>> diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
> >>>> index b907be823527..1c61abe7801f 100644
> >>>> --- a/drivers/thermal/thermal_core.h
> >>>> +++ b/drivers/thermal/thermal_core.h
> >>>> @@ -48,6 +48,7 @@ struct thermal_instance {
> >>>>      struct device_attribute attr;
> >>>>      struct list_head tz_node; /* node in tz->thermal_instances */
> >>>>      struct list_head cdev_node; /* node in cdev->thermal_instances */
> >>>> +    unsigned int weight; /* The weight of the cooling device */
> >>>>  };
> >>>>
> >>>>  int thermal_register_governor(struct thermal_governor *);
> >>>> diff --git a/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
> >>>> b/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
> >>>> index 9eec26dc0448..772549ec9bd8 100644
> >>>> --- a/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
> >>>> +++ b/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
> >>>> @@ -147,7 +147,8 @@ static int ti_thermal_bind(struct
> >>>> thermal_zone_device *thermal,
> >>>>      return thermal_zone_bind_cooling_device(thermal, 0, cdev,
> >>>>      /* bind with min and max states defined by cpu_cooling */
> >>>>                          THERMAL_NO_LIMIT,
> >>>> -                        THERMAL_NO_LIMIT);
> >>>> +                        THERMAL_NO_LIMIT,
> >>>> +                        THERMAL_WEIGHT_DEFAULT);
> >>>>  }
> >>>>
> >>>>  /* Unbind callback functions for thermal zone */
> >>>> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> >>>> index fc4fdcbbdecc..674795ba0900 100644
> >>>> --- a/include/linux/thermal.h
> >>>> +++ b/include/linux/thermal.h
> >>>> @@ -40,6 +40,9 @@
> >>>>  /* No upper/lower limit requirement */
> >>>>  #define THERMAL_NO_LIMIT    ((u32)~0)
> >>>>
> >>>> +/* Default weight of a bound cooling device */
> >>>> +#define THERMAL_WEIGHT_DEFAULT 1
> >>>> +
> >>>>  /* Unit conversion macros */
> >>>>  #define KELVIN_TO_CELSIUS(t)    (long)(((long)t-2732 >= 0) ?    \
> >>>>                  ((long)t-2732+5)/10 : ((long)t-2732-5)/10)
> >>>> @@ -375,7 +378,8 @@ void thermal_zone_device_unregister(struct
> >>>> thermal_zone_device *);
> >>>>
> >>>>  int thermal_zone_bind_cooling_device(struct thermal_zone_device *, int,
> >>>>                       struct thermal_cooling_device *,
> >>>> -                     unsigned long, unsigned long);
> >>>> +                     unsigned long, unsigned long,
> >>>> +                     unsigned int);
> >>>>  int thermal_zone_unbind_cooling_device(struct thermal_zone_device *, int,
> >>>>                         struct thermal_cooling_device *);
> >>>>  void thermal_zone_device_update(struct thermal_zone_device *);
> >>>>
> >>>>
> >>
> >
> >
> 
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

  reply	other threads:[~2015-01-05 21:31 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-18 12:19 Fix: Bind Cooling Device Weight Kapileshwar Singh
2014-12-18 13:43 ` Eduardo Valentin
2014-12-18 16:44   ` Kapileshwar Singh
2014-12-22  3:23     ` Zhang Rui
2014-12-22 16:36       ` Kapileshwar Singh
2015-01-05 21:31         ` Eduardo Valentin [this message]
2015-01-06 16:38           ` Kapileshwar Singh
2015-01-06 18:37             ` Eduardo Valentin
2015-01-07 14:08               ` Kapileshwar Singh
2014-12-22  3:18   ` Zhang Rui
2014-12-29 10:40     ` Javi Merino
2015-01-05 21:24       ` Eduardo Valentin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150105213106.GB31536@developer \
    --to=edubezval@gmail.com \
    --cc=Javi.Merino@arm.com \
    --cc=Punit.Agrawal@arm.com \
    --cc=kapileshwar.singh@arm.com \
    --cc=linux-pm@vger.kernel.org \
    --cc=rui.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).