From: Ville Syrjälä
Subject: Re: [PATCH v2 05/14] drm/i915: Rewrite vlv_find_best_dpll()
Date: Fri, 27 Sep 2013 16:01:18 +0300
Message-ID: <20130927130118.GE14385@intel.com>
References: <1380047191-3359-1-git-send-email-ville.syrjala@linux.intel.com> <1380047191-3359-6-git-send-email-ville.syrjala@linux.intel.com> <87a9izd2hc.fsf@gaia.fi.intel.com>
In-Reply-To: <87a9izd2hc.fsf@gaia.fi.intel.com>
To: Mika Kuoppala
Cc: intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On Thu, Sep 26, 2013 at 06:30:55PM +0300, Mika Kuoppala wrote:
> ville.syrjala@linux.intel.com writes:
>
> > From: Ville Syrjälä
> >
> > Rewrite vlv_find_best_dpll() to use intel_clock_t rather than
> > an army of local variables.
> >
> > Also extract the code to calculate the derived values into
> > vlv_clock().
> >
> > v2: Split up the earlier fixes, extract vlv_clock()
> >
> > Signed-off-by: Ville Syrjälä
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 72 ++++++++++++++++--------------------
> >  1 file changed, 31 insertions(+), 41 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index f646fea..c5f0794 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -438,6 +438,14 @@ static void i9xx_clock(int refclk, intel_clock_t *clock)
> >  	clock->dot = clock->vco / clock->p;
> >  }
> >  
> > +static void vlv_clock(int refclk, intel_clock_t *clock)
> > +{
> > +	clock->m = clock->m1 * clock->m2;
> > +	clock->p = clock->p1 * clock->p2;
> > +	clock->vco = refclk * clock->m / clock->n;
> > +	clock->dot = clock->vco / clock->p;
> > +}
> > +
> >  /**
> >   * Returns whether any output on the specified pipe is of the specified type
> >   */
> > @@ -670,66 +678,48 @@ vlv_find_best_dpll(const intel_limit_t *limit, struct drm_crtc *crtc,
> >  		   int target, int refclk, intel_clock_t *match_clock,
> >  		   intel_clock_t *best_clock)
> >  {
> > -	u32 p1, p2, m1, m2, vco, bestn, bestm1, bestm2, bestp1, bestp2;
> > -	u32 m, n, fastclk;
> > -	u32 updrate, minupdate, p;
> > +	intel_clock_t clock;
> > +	u32 minupdate = 19200;
> >  	unsigned int bestppm = 1000000;
> > -	int dotclk, flag;
> >  
> > -	flag = 0;
> > -	dotclk = target * 1000;
> > -	fastclk = dotclk / (2*100);
> > -	updrate = 0;
> > -	minupdate = 19200;
> > -	n = p = p1 = p2 = m = m1 = m2 = vco = bestn = 0;
> > -	bestm1 = bestm2 = bestp1 = bestp2 = 0;
> > +	target *= 5; /* fast clock */
> >  
> >  	/* based on hardware requirement, prefer smaller n to precision */
> > -	for (n = limit->n.min; n <= ((refclk) / minupdate); n++) {
> > -		updrate = refclk / n;
> > -		for (p1 = limit->p1.max; p1 > limit->p1.min; p1--) {
> > -			for (p2 = limit->p2.p2_fast+1; p2 > 0; p2--) {
> > -				if (p2 > 10)
> > -					p2 = p2 - 1;
> > -				p = p1 * p2;
> > +	for (clock.n = limit->n.min; clock.n <= ((refclk) / minupdate); clock.n++) {
> > +		for (clock.p1 = limit->p1.max; clock.p1 > limit->p1.min; clock.p1--) {
> > +			for (clock.p2 = limit->p2.p2_fast+1; clock.p2 > 0; clock.p2--) {
> > +				if (clock.p2 > 10)
> > +					clock.p2--;
> > +				clock.p = clock.p1 * clock.p2;
> >  				/* based on hardware requirement, prefer bigger m1,m2 values */
>
> Is this comment valid as we seem to start from m1.min?

We try to find the closest m2 for the given m1, n, p1 and p2 anyway,
and since we start with large p dividers, m1*m2 will come out as
something big to compensate. Starting with a small n does mean m2
doesn't come out as large as it could, but I guess having a small n is
considered more important than having a large m.

The bestppm comparison guarantees that we prefer an earlier result
unless the new ppm is more than 10 better, and since we start with
small n and large p, it should do what we want.

Then there's the ppm < 100 comparison, which is a bit different. It
means we favor anything that is considered good enough (ppm < 100) as
long as the p divider increases, and hence the VCO frequency increases.
That would seem to be in line with the other stated goals of big m and
small n.

>
> > -				for (m1 = limit->m1.min; m1 <= limit->m1.max; m1++) {
> > +				for (clock.m1 = limit->m1.min; clock.m1 <= limit->m1.max; clock.m1++) {
> >  					unsigned int ppm, diff;
> >  
> > -					m2 = DIV_ROUND_CLOSEST(fastclk * p * n, refclk * m1);
> > -					m = m1 * m2;
> > -					vco = updrate * m;
> > +					clock.m2 = DIV_ROUND_CLOSEST(target * clock.p * clock.n,
> > +								     refclk * clock.m1);
> >  
> > -					if (vco < limit->vco.min || vco >= limit->vco.max)
> > +					vlv_clock(refclk, &clock);
> > +
>
> > +					if (clock.vco < limit->vco.min ||
> > +					    clock.vco >= limit->vco.max)
> >  						continue;
>
> Can intel_PLL_is_valid() be used here instead of just checking the vco?

We'd need to modify intel_PLL_is_valid() a bit to skip the m1<=m2
check, and we'd also need to skip the 'm' and 'p' divider check, or
populate the m and p min/max with something that makes sense.

It would do the clock.dot min/max check that we're currently missing
from this function, and I guess it would allow easier debugging since
it has the INTELPllInvalid() macro for that purpose. So it would seem
to be a good idea to use it.

>
> >  
> > -					diff = abs(vco / p - fastclk);
> > -					ppm = div_u64(1000000ULL * diff, fastclk);
> > -					if (ppm < 100 && ((p1 * p2) > (bestp1 * bestp2))) {
> > +					diff = abs(clock.dot - target);
> > +					ppm = div_u64(1000000ULL * diff, target);
> > +
> > +					if (ppm < 100 && clock.p > best_clock->p) {
> >  						bestppm = 0;
> > -						flag = 1;
> > +						*best_clock = clock;
> >  					}
> > +
> >  					if (bestppm >= 10 && ppm < bestppm - 10) {
> >  						bestppm = ppm;
> > -						flag = 1;
> > -					}
> > -					if (flag) {
> > -						bestn = n;
> > -						bestm1 = m1;
> > -						bestm2 = m2;
> > -						bestp1 = p1;
> > -						bestp2 = p2;
> > -						flag = 0;
> > +						*best_clock = clock;
> >  					}
> >  				}
> >  			}
> >  		}
> >  	}
> > -	best_clock->n = bestn;
> > -	best_clock->m1 = bestm1;
> > -	best_clock->m2 = bestm2;
> > -	best_clock->p1 = bestp1;
> > -	best_clock->p2 = bestp2;
> >  
> >  	return true;
> >  }
> > -- 
> > 1.8.1.5
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
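
For anyone who wants to poke at the selection rules discussed above
outside the driver, here is a rough standalone sketch. It is not the
driver code: struct sketch_clock and the sketch_* helpers are made-up
names standing in for intel_clock_t, vlv_clock() and the two
comparisons in the inner loop, and plain 64-bit division stands in for
div_u64(). It only assumes the arithmetic shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for intel_clock_t; only the fields used here. */
struct sketch_clock {
	int n, m1, m2, p1, p2;	/* dividers */
	int m, p, vco, dot;	/* derived values */
};

/* Same arithmetic as vlv_clock() in the patch: fill in derived values. */
static void sketch_vlv_clock(int refclk, struct sketch_clock *c)
{
	assert(c->n > 0 && c->p1 > 0 && c->p2 > 0);
	c->m = c->m1 * c->m2;
	c->p = c->p1 * c->p2;
	c->vco = refclk * c->m / c->n;
	c->dot = c->vco / c->p;
}

/* Deviation of the achieved fast clock from the target, in ppm. */
static unsigned int sketch_ppm(const struct sketch_clock *c, int target)
{
	uint64_t diff = (uint64_t)abs(c->dot - target);

	return (unsigned int)(1000000ULL * diff / (uint64_t)target);
}

/*
 * The two acceptance rules from the inner loop:
 *  1) anything "good enough" (ppm < 100) wins if it has a larger p;
 *  2) otherwise a candidate wins only if it beats bestppm by more than 10.
 * Returns true if cand replaced *best.
 */
static bool sketch_consider(struct sketch_clock *best, unsigned int *bestppm,
			    const struct sketch_clock *cand, unsigned int ppm)
{
	bool taken = false;

	if (ppm < 100 && cand->p > best->p) {
		*bestppm = 0;
		*best = *cand;
		taken = true;
	}
	if (*bestppm >= 10 && ppm < *bestppm - 10) {
		*bestppm = ppm;
		*best = *cand;
		taken = true;
	}
	return taken;
}
```

Feeding it an exact match with p = 4 followed by a 50 ppm candidate
with p = 8 shows the second one winning through rule 1, which is the
"good enough with a bigger p divider" behaviour described above. Once
bestppm has been forced to 0, rule 2 can no longer fire, so only a
larger p can displace the current best.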