From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752542AbcAZVZk (ORCPT ); Tue, 26 Jan 2016 16:25:40 -0500 Received: from down.free-electrons.com ([37.187.137.238]:37510 "EHLO mail.free-electrons.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752499AbcAZVZg (ORCPT ); Tue, 26 Jan 2016 16:25:36 -0500 Date: Tue, 26 Jan 2016 22:25:32 +0100 From: Maxime Ripard To: Marcus Weseloh Cc: linux-sunxi , Chen-Yu Tsai , devicetree , Ian Campbell , Kumar Gala , "Mailing List, Arm" , linux-kernel@vger.kernel.org, linux-spi@vger.kernel.org, Mark Brown , Mark Rutland , Pawel Moll , Rob Herring Subject: Re: [PATCH v6 2/3] spi: sun4i: Fix clock calculations to be predictable and never exceed the requested rate Message-ID: <20160126212532.GU7908@lukather> References: <1451145186-14235-1-git-send-email-mweseloh42@gmail.com> <1451145186-14235-3-git-send-email-mweseloh42@gmail.com> <20151227210946.GL30359@lukather> <20160110181428.GH9631@lukather> <20160117185143.GL4581@lukather> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="vy3x0i7GCC7mAH4A" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --vy3x0i7GCC7mAH4A Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jan 18, 2016 at 10:40:59AM +0100, Marcus Weseloh wrote: > Hi, >=20 > 2016-01-17 19:51 GMT+01:00 Maxime Ripard : > > On Sun, Jan 10, 2016 at 10:11:11PM +0100, Marcus Weseloh wrote: > >> >> >> - /* Ensure that we have a parent clock fast enough */ > >> >> >> + /* > >> >> >> + * Ensure that the parent clock is set to twice the max sp= eed > >> >> >> + * of the spi device (possibly rounded up by the clk drive= r) > >> >> >> + */ > >> >> >> mclk_rate =3D clk_get_rate(sspi->mclk); > >> >> >> - if (mclk_rate < (2 * tfr->speed_hz)) { > >> >> >> - clk_set_rate(sspi->mclk, 2 * tfr->speed_hz); > >> >> >> + if (spi->max_speed_hz !=3D sspi->cur_max_speed || > >> >> >> + mclk_rate !=3D sspi->cur_mclk) { > >> >> > > >> >> > Do you need to cache the values? As far as I'm aware, you end up = doing > >> >> > the computation all the time anyway. > >> >> > >> >> By caching the values we optimize the case when a single SPI slave > >> >> device (or even multiple slave devices with the same max_speed) are > >> >> used multiple times in a row. In that case, we're saving two calls: > >> >> clk_set_rate and clk_get_rate. I was unsure about how expensive the > >> >> clk_* calls were, so I thought it would be safer use caching. But > >> >> maybe it's not worth the extra code? > >> > > >> > Unless you can point that there's a significant performance > >> > difference, I'm not sure it's worth it. > >> > >> I did actually notice a significant transfer latency when a new mod0 > >> clock frequency is set, probably due to the __delay in > >> drivers/clk/sunxi/clk-factors.c (line 147). So my feeling is that the > >> caching is worth it... at least for the case when there are two slave > >> devices with different transfer speeds accessing the same SPI module. > > > > I'm not sure we even need that delay in the first place, at least not > > for all the clocks using factor. > > > > AFAIK, the only case where it's useful is for PLL, and all of them > > have a bit you can busy-wait on that will let you know when the clock > > has stabilized. >=20 > OK, I didn't know that. Look for the lock bit in the PLLs section in the datasheet > Does that mean you would like me to drop the caching from this patch > and prepare a patch to remove the __delay from clk-factors? Yeah, you could add a lock bit field to the clk_factors structure, busy-wait on it. Otherwise, just go on (hence removing the delay for those). Be aware that it might conflict with the clk-factors reworking Chen-Yu posted yesterday, so you might want to synchronise with him. > >> >> [...] > >> >> >> - div =3D mclk_rate / (2 * tfr->speed_hz); > >> >> >> - if (div <=3D (SUN4I_CLK_CTL_CDR2_MASK + 1)) { > >> >> >> - if (div > 0) > >> >> >> - div--; > >> >> >> - > >> >> >> + div =3D DIV_ROUND_UP(mclk_rate, 2 * tfr->speed_hz) - 1; > >> >> > > >> >> > Isn't it exactly the same thing as mclk_rate / (2 * tfr->speed_hz= ) ? > >> >> > >> >> It is quite often, but not in all cases. The plain division rounds = to > >> >> the nearest integer, so it rounds down sometimes. Consider the > >> >> following case: we have a slow SPI device with a spi-max-frequency = of > >> >> 50kHz. Our clock driver can't find a clock as slow as 100kHz, so it > >> >> sets mclk to 214,285Hz. > >> >> > >> >> Using the old calculation we get: 214,285 / (2 * 50,000) =3D 2, so = div =3D > >> >> 1 as the old code subtracts 1 two lines further down > >> >> The new calculation results in: DIV_ROUND_UP(214,285, 2 * 50,000) = =3D > >> >> 3, so div =3D 2 if we add the -1 > >> > > >> > Except that you have that extra - 1 after your DIV_ROUND_UP > >> > calculation in the line you add. so you have to remove 1 from that > >> > line above, and then 1 again when we set the register, which ends up > >> > being the exact same thing, or am I missing something? > >> > >> The -1 after the DIV_ROUND_UP is already the -1 that is needed to set > >> the register. There's no need for another -1 after that (and there > >> isn't one in the code). > > > > I was probably hallucinating :) My bad. > > > > That being said, I still have a hard time figuring out what would not > > be solved simply by removing the div--, which seems to match what your > > commit log says. >=20 > I'm probably not doing a good job explaining the change. Let me try it > with a few examples. Consider the following three approaches: >=20 > A (original, unpatched version): > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > div =3D mclk_rate / (2 * tfr->speed_hz); > if (div <=3D (SUN4I_CLK_CTL_CDR2_MASK + 1)) { > if (div > 0) > div--; > } else { > ... > } >=20 > B (original version, but with div-- removed): > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D > div =3D mclk_rate / (2 * tfr->speed_hz); > if (div <=3D (SUN4I_CLK_CTL_CDR2_MASK + 1)) { > ... > } else { > ... > } >=20 > C (version after this patch): > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > div =3D DIV_ROUND_UP(mclk_rate, 2 * tfr->speed_hz) - 1; > if (div <=3D SUN4I_CLK_CTL_CDR2_MASK) { Ah yes, you're removing the +1 from here too. Sorry for missing that obvious change.... > ... > } else { > ... > } >=20 > And now the following values for mclk, tfr->speed and the resulting > div and SPI_CLK >=20 > tfr->speed_hz =3D 50,000 > mclk =3D 214,285 > A: div =3D 1, SPI_CLK =3D 53,571(!) > B: div =3D 2, SPI_CLK =3D 35,714 > C: div =3D 2, SPI_CLK =3D 35,714 >=20 > tfr->speed_hz =3D 50,000 > mclk =3D 200,000 > A: div =3D 1, SPI_CLK =3D 50,000 > B: div =3D 2, SPI_CLK =3D 33,333(!) > C: div =3D 1, SPI_CLK =3D 50,000 >=20 > tfr->speed_hz =3D 650,000 > mclk =3D 1,6000,000 > A: div =3D 11, SPI_CLK =3D 666,667(!) > B: div =3D 12, SPI_CLK =3D 615,385 > C: div =3D 12, SPI_CLK =3D 615,385 >=20 > tfr->speed_hz =3D 1,000,000 > mclk =3D 2,000,000 > A: div =3D 0, SPI_CLK =3D 1,000,000 > B: div =3D 1, SPI_CLK =3D 500,000(!) > C: div =3D 0, SPI_CLK =3D 1,000,000 >=20 > I hope that makes it a little bit easier to understand the change. Of > course, the change only makes sense if you agree that the acutal SPI > transfer speed should never exceed the requested speed. Which I think > is the right approach... but maybe you think otherwise? No, my bad for being so picky about it, while missing the point... Thanks for your awesome explanation :) Maxime --=20 Maxime Ripard, Free Electrons Embedded Linux, Kernel and Android engineering http://free-electrons.com --vy3x0i7GCC7mAH4A Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJWp+RMAAoJEBx+YmzsjxAgWMsQAKz97i8nyjdd/yqII9ceH753 A2PjWIl5CmdxJi1+lHuaZ3REBh96Nhf73Xv+xyceejsSNVjDhAmkrL+MbOzln6lW R0nCn6llQZBll5T3pskNw/YoSssjjCHqg/e09HcgHgR8QSGyCEHPk8qBAhqAEYvF 0pKyq0Ru412APkxSeoDFpfX8yJqSt4xcXjTwURmWi9em5W1prmWHbvp4zS3v0iWB cBqzI+rDYuJAmN1hjkI07qA8a+sZN0Dgdup8fAARurFPOkNQCydpfp3+qaIHEjP7 EOdYOjOxtGUHoeMGTC72ZhcuOiMLOTpLMV7XnjmGwQZzw5ay/g++4sHZcuSmhf3Z /QXPJMaXVOCv3zqZjNlLfWc23XnVOQtPCWHqzkL+WS6htbN58yPh9MIkyABX/7/i spnlm8X/mnx6YVPAafGaeH1Zy82MZjVq2QNBV89TFxaNBb5zIohFqyBvhPQURy5Q I9YNAUCuUYU3rUP0BpzeCHdsXHH6ds74rVGnR63bjdS1zQBy4u/2L8y+voymaVW8 f2je/D7Wbd34sfKugIEufkGUpOEyqkQRiN7wYOGJdVF1evqJM0UupWctw26yTPmQ 7f0Vrc8guqfY7wO3TtfHR2GfUa7tHIpzXLjHT4k9Ioki+c6MhBAu5HvAfmxwmpYM 9WktBK8jcPKv9oQawkKY =uq7B -----END PGP SIGNATURE----- --vy3x0i7GCC7mAH4A--