From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753843AbcCRXgL (ORCPT ); Fri, 18 Mar 2016 19:36:11 -0400 Received: from mga09.intel.com ([134.134.136.24]:35603 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751948AbcCRXgF (ORCPT ); Fri, 18 Mar 2016 19:36:05 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,357,1455004800"; d="scan'208";a="672139731" From: "Pandruvada, Srinivas" To: "labbott@redhat.com" , "gregkh@linuxfoundation.org" CC: "ammdispose-arch@yahoo.com" , "linux-kernel@vger.kernel.org" , "Chen, Yu C" , "Zhang, Rui" , "manuelkrause@netscape.net" , "linux-pm@vger.kernel.org" , "frolvlad@gmail.com" , "szegadlo@poczta.onet.pl" , "prash.n.rao@gmail.com" , "javi.merino@arm.com" , "morpheusxyz123@yahoo.de" Subject: Re: [REGRESSION] 774ac8b7eff6 ("Thermal: initialize thermal zone device correctly") causes performance drop Thread-Topic: [REGRESSION] 774ac8b7eff6 ("Thermal: initialize thermal zone device correctly") causes performance drop Thread-Index: AQHRf+LSfAyVGhXqhkWfvACrwJfakp9gQKIAgAAR9oA= Date: Fri, 18 Mar 2016 23:36:00 +0000 Message-ID: <1458343981.2619.2.camel@intel.com> References: <56E9DDED.7000805@redhat.com> <20160316224634.GA3580@kroah.com> <56E9F387.9080703@redhat.com> <1458173863.4703.18.camel@intel.com> <56EC811C.6020404@redhat.com> In-Reply-To: <56EC811C.6020404@redhat.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.254.18.109] Content-Type: text/plain; charset="utf-8" Content-ID: <212C6C1D9127744A88C1893FFAB3A952@intel.com> MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id u2INaGsI028913 On Fri, 2016-03-18 at 15:28 -0700, Laura Abbott wrote: > (bringing this back to the main thread) > > On 03/16/2016 05:20 PM, Pandruvada, Srinivas wrote: > > On Wed, 2016-03-16 at 17:00 -0700, Laura Abbott wrote: > > > On 03/16/2016 03:46 PM, Greg Kroah-Hartman wrote: > > > > On Wed, Mar 16, 2016 at 03:27:57PM -0700, Laura Abbott wrote: > > > > > Hi, > > > > > > > > > > Fedora received a bug report (https://bugzilla.redhat.com/sho > > > > > w_bu > > > > > g.cgi?id=1317190) > > > > > of a major performance drop on various bench marks and > > > > > general > > > > > system > > > > > sluggishness with the 4.4.4 kernel update. The benchmarks > > > > > were > > > > > showing > > > > > a reduction to about 18% performance (not minor). > > > > > > > > > > Bisection showed the first bad commit was > > > > > > > > > > commit 774ac8b7eff69e0786970157de2157e68b22f456 > > > > > Author: Zhang Rui > > > > > Date:   Fri Oct 30 16:31:47 2015 +0800 > > > > > > > > > >       Thermal: initialize thermal zone device correctly > > > > >       commit bb431ba26c5cd0a17c941ca6c3a195a3a6d5d461 > > > > > upstream. > > > > >       After thermal zone device registered, as we have not > > > > > read > > > > > any > > > > >       temperature before, thus tz->temperature should not be > > > > > 0, > > > > >       which actually means 0C, and thermal trend is not > > > > > available. > > > > >       In this case, we need specially handling for the first > > > > >       thermal_zone_device_update(). > > > > >       Both thermal core framework and step_wise governor is > > > > >       enhanced to handle this. And since the step_wise > > > > > governor > > > > >       is the only one that uses trends, so it's the only > > > > > thermal > > > > >       governor that needs to be updated. > > > > >       Tested-by: Manuel Krause > > > > >       Tested-by: szegad > > > > >       Tested-by: prash > > > > >       Tested-by: amish > > > > >       Tested-by: Matthias > > > > >       Reviewed-by: Javi Merino > > > > >       Signed-off-by: Zhang Rui > > > > >       Signed-off-by: Chen Yu > > > > >       Signed-off-by: Greg Kroah-Hartman > > > > on.or > > > > > g> > > > > > > > > > > > > > > > > > > > > Reverting this plus to other commits in the series > > > > > (a67208e94d94 > > > > > "Thermal: handle thermal zone device properly during system > > > > > sleep" > > > > > and 27f356149d59 "Thermal: do thermal zone update after a > > > > > cooling > > > > > device registered") confirmed the performance was back to > > > > > normal. > > > > > > > > > > Bugzilla has the full discussion but this comment from one of > > > > > the > > > > > reporters sums it up: > > > > > > > > > > "In 4.4.3 and prior, my 2.40 MHz processor would fluctuate > > > > > between > > > > > 1000 and 3400 MHz.  In 4.4.4, the processor would fluctuate > > > > > between > > > > > 400 and 700 MHz, according to /proc/cpuinfo. > > > > > > > > > > Setting > > > > > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor > > > > > to > > > > > performance, instead of the default "powersave" forces the > > > > > CPU to > > > > > 2400 MHz, and improves performance greatly, but still not to > > > > > the > > > > > same level as in 4.4.3." > > > > > > > > > > Any ideas? > > > > > > > > Is this same "slowdown" also seen in 4.5? > > > > > > > > thanks, > > > > > > > > greg k-h > > > > > > > > > > Yes, the same issue is seen on 4.5 according to the reporter. > > What does it show here when performance drops? > > grep . /sys/devices/system/cpu/intel_pstate/* > > > > Is the problem still occurs if you set > > /sys/class/thermal/thermal_zone*/mode to "disabled" > > > > Thanks, > > Srinivas > > > > A separate thread was started which gave this insight: > > "I think > the problem is your device has a passive trip temp of 0 > /sys/class/thermal/thermal_zone0/trip_point_2_temp:0 > /sys/class/thermal/thermal_zone0/trip_point_2_type:passive > > Which triggers a false throttle = true. I think we should this trip > as > invalid in the case of > if (tz->temperature >= trip_temp) {} check > in thermal_zone_trip_update()." > > So would something like the following work? > > diff --git a/drivers/thermal/step_wise.c > b/drivers/thermal/step_wise.c > index ea9366a..1228797 100644 > --- a/drivers/thermal/step_wise.c > +++ b/drivers/thermal/step_wise.c > @@ -143,7 +143,7 @@ static void thermal_zone_trip_update(struct > thermal_zone_device *tz, int trip) >    >          trend = get_tz_trend(tz, trip); >    > -       if (tz->temperature >= trip_temp) { > +       if (trip_type != THERMAL_TRIPS_NONE && tz->temperature >= > trip_temp) { if (trip_temp && tz->temperature >= trip_temp) {                  throttle = true; >                  trace_thermal_zone_trip(tz, trip, trip_type); >          } > I think Rui is working on some change. Thanks, Srinivas > (completely untested, no idea if I'm even close) > > Thanks, > Laura