From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: David Airlie <airlied@linux.ie>,
dri-devel@lists.freedesktop.org,
linux-renesas-soc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] drm: rcar-du: calculate DPLLCR to be more small jitter
Date: Wed, 13 Dec 2017 11:09:18 +0200 [thread overview]
Message-ID: <4708231.IEM7dkWZuQ@avalon> (raw)
In-Reply-To: <87374oz9f9.wl%kuninori.morimoto.gx@renesas.com>
Hello Morimoto-san,
Thank you for the patch.
On Wednesday, 6 December 2017 08:05:38 EET Kuninori Morimoto wrote:
> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>
> In general, PLL has VCO (= Voltage controlled oscillator),
> one of the very important electronic feature called as "jitter"
> is related to this VCO.
> In academic generalism, VCO should be maximum to be more small jitter.
> In high frequency clock, jitter will be large impact.
> Thus, selecting Hi VCO is general theory.
>
> fin fvco fout fclkout
> in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out
> +-> | | |
> | |
> +-----------------[1/N]<-------------+
>
> fclkout = fvco / P / FDPLL -- (1)
>
> In PD, it will loop until fin/M = fvco/P/N
>
> fvco = fin * P * N / M -- (2)
>
> (1) + (2) indicates, fclkout = fin * N / M / FDPLL
> In this device, N = (n + 1), M = (m + 1), P = 2, thus
>
> fclkout = fin * (n + 1) / (m + 1) / FDPLL
>
> This is the datasheet formula.
> One note here is that it should be 2000 < fvco < 4096MHz
> To be smaller jitter, fvco should be maximum,
> in other words, N as large as possible, M as small as possible driver
> should select. Here, basically M=1.
> This patch do it.
>
> Reported-by: HIROSHI INOSE <hiroshi.inose.rb@renesas.com>
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> ---
> v1 -> v2
>
> - tidyup typo on git-log "fout" -> "fclkout"
> - tidyup for loop terminate condition 40 -> 38 for n
>
> drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 36 ++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c index b492063..57479c9 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> @@ -125,8 +125,40 @@ static void rcar_du_dpll_divider(struct rcar_du_crtc
> *rcrtc, unsigned int m;
> unsigned int n;
>
> - for (n = 39; n < 120; n++) {
> - for (m = 0; m < 4; m++) {
> + /*
> + * fin fvco fout fclkout
> + * in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out
> + * +-> | | |
> + * | |
> + * +-----------------[1/N]<-------------+
> + *
> + * fclkout = fvco / P / FDPLL -- (1)
> + *
> + * fin/M = fvco/P/N
> + *
> + * fvco = fin * P * N / M -- (2)
> + *
> + * (1) + (2) indicates
> + *
> + * fclkout = fin * N / M / FDPLL
> + *
> + * NOTES
> + * N = (n + 1), M = (m + 1), P = 2
> + * 2000 < fvco < 4096Mhz
Are you sure that the fvco constraint is really 2kHz, and not 2GHz ? 2kHz -
4GHz would be a surprisingly large range.
> + * Basically M=1
Why is this ?
> + * To be small jitter,
> + * N : as large as possible
> + * M : as small as possible
> + */
> + for (m = 0; m < 4; m++) {
> + for (n = 119; n > 38; n--) {
> + unsigned long long fvco = input * 2 * (n + 1) / (m + 1);
This code runs for Gen3 only, so unsigned long would be enough. The rest of
the function already relies on native support for 64-bit calculation. If you
wanted to run this on a 32-bit CPU, you would likely need to do_div() for the
division, and convert input to u64 to avoid integer overflows, otherwise the
calculation will be performed on 32-bit before a final conversion to 64-bit.
> + if ((fvco < 2000) ||
> + (fvco > 4096000000ll))
No need for the inner parentheses, and you can write both conditions on a
single line. Furthemore 4096 MHz will fit in a 32-bit number, so there's no
need for the ll.
> + continue;
> +
I think you can then drop the output >= 4000000000 check inside the inner
fdpll loop, as the output frequency can't be higher than 4GHz if the VCO
frequency isn't.
> for (fdpll = 1; fdpll < 32; fdpll++) {
> unsigned long output;
The output frequency on the line below can be calculated with
output = fvco / 2 / (fdpll + 1)
to avoid the multiplication by (n + 1) and division by (m + 1).
If we wanted to optimize even more we could compute and operatate on fout
instead of fvco, that would remove the * 2 and / 2.
This patch seems to be a good first step in case of multiple possible exact
frequency matches. However, when the PLL can't achieve an exact match, we
might still end up with a high M value when a lower value could produce an
output frequency close enough to the desired value. I wonder if this function
should also take a frequency tolerance as an input parameter, and compute the
M, N and FDPLL values that will produce an output frequency within the
tolerance with M as small as possible. This can be done as a separate patch.
And while we're discussing PLL calculation, the three nested loops will run a
total of 10044 iterations :-/ That's a lot, and should be optimized if
possible. With the outer loop operating on N an easy optimization would have
been to compute fin * N in a local variable to avoid redoing the
multiplication for every value of M, but that's not possible anymore with the
outer loop operating on M.
--
Regards,
Laurent Pinchart
WARNING: multiple messages have this Message-ID (diff)
From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: David Airlie <airlied@linux.ie>,
linux-renesas-soc@vger.kernel.org, linux-kernel@vger.kernel.org,
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2] drm: rcar-du: calculate DPLLCR to be more small jitter
Date: Wed, 13 Dec 2017 11:09:18 +0200 [thread overview]
Message-ID: <4708231.IEM7dkWZuQ@avalon> (raw)
In-Reply-To: <87374oz9f9.wl%kuninori.morimoto.gx@renesas.com>
Hello Morimoto-san,
Thank you for the patch.
On Wednesday, 6 December 2017 08:05:38 EET Kuninori Morimoto wrote:
> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>
> In general, PLL has VCO (= Voltage controlled oscillator),
> one of the very important electronic feature called as "jitter"
> is related to this VCO.
> In academic generalism, VCO should be maximum to be more small jitter.
> In high frequency clock, jitter will be large impact.
> Thus, selecting Hi VCO is general theory.
>
> fin fvco fout fclkout
> in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out
> +-> | | |
> | |
> +-----------------[1/N]<-------------+
>
> fclkout = fvco / P / FDPLL -- (1)
>
> In PD, it will loop until fin/M = fvco/P/N
>
> fvco = fin * P * N / M -- (2)
>
> (1) + (2) indicates, fclkout = fin * N / M / FDPLL
> In this device, N = (n + 1), M = (m + 1), P = 2, thus
>
> fclkout = fin * (n + 1) / (m + 1) / FDPLL
>
> This is the datasheet formula.
> One note here is that it should be 2000 < fvco < 4096MHz
> To be smaller jitter, fvco should be maximum,
> in other words, N as large as possible, M as small as possible driver
> should select. Here, basically M=1.
> This patch do it.
>
> Reported-by: HIROSHI INOSE <hiroshi.inose.rb@renesas.com>
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> ---
> v1 -> v2
>
> - tidyup typo on git-log "fout" -> "fclkout"
> - tidyup for loop terminate condition 40 -> 38 for n
>
> drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 36 ++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c index b492063..57479c9 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
> @@ -125,8 +125,40 @@ static void rcar_du_dpll_divider(struct rcar_du_crtc
> *rcrtc, unsigned int m;
> unsigned int n;
>
> - for (n = 39; n < 120; n++) {
> - for (m = 0; m < 4; m++) {
> + /*
> + * fin fvco fout fclkout
> + * in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out
> + * +-> | | |
> + * | |
> + * +-----------------[1/N]<-------------+
> + *
> + * fclkout = fvco / P / FDPLL -- (1)
> + *
> + * fin/M = fvco/P/N
> + *
> + * fvco = fin * P * N / M -- (2)
> + *
> + * (1) + (2) indicates
> + *
> + * fclkout = fin * N / M / FDPLL
> + *
> + * NOTES
> + * N = (n + 1), M = (m + 1), P = 2
> + * 2000 < fvco < 4096Mhz
Are you sure that the fvco constraint is really 2kHz, and not 2GHz ? 2kHz -
4GHz would be a surprisingly large range.
> + * Basically M=1
Why is this ?
> + * To be small jitter,
> + * N : as large as possible
> + * M : as small as possible
> + */
> + for (m = 0; m < 4; m++) {
> + for (n = 119; n > 38; n--) {
> + unsigned long long fvco = input * 2 * (n + 1) / (m + 1);
This code runs for Gen3 only, so unsigned long would be enough. The rest of
the function already relies on native support for 64-bit calculation. If you
wanted to run this on a 32-bit CPU, you would likely need to do_div() for the
division, and convert input to u64 to avoid integer overflows, otherwise the
calculation will be performed on 32-bit before a final conversion to 64-bit.
> + if ((fvco < 2000) ||
> + (fvco > 4096000000ll))
No need for the inner parentheses, and you can write both conditions on a
single line. Furthemore 4096 MHz will fit in a 32-bit number, so there's no
need for the ll.
> + continue;
> +
I think you can then drop the output >= 4000000000 check inside the inner
fdpll loop, as the output frequency can't be higher than 4GHz if the VCO
frequency isn't.
> for (fdpll = 1; fdpll < 32; fdpll++) {
> unsigned long output;
The output frequency on the line below can be calculated with
output = fvco / 2 / (fdpll + 1)
to avoid the multiplication by (n + 1) and division by (m + 1).
If we wanted to optimize even more we could compute and operatate on fout
instead of fvco, that would remove the * 2 and / 2.
This patch seems to be a good first step in case of multiple possible exact
frequency matches. However, when the PLL can't achieve an exact match, we
might still end up with a high M value when a lower value could produce an
output frequency close enough to the desired value. I wonder if this function
should also take a frequency tolerance as an input parameter, and compute the
M, N and FDPLL values that will produce an output frequency within the
tolerance with M as small as possible. This can be done as a separate patch.
And while we're discussing PLL calculation, the three nested loops will run a
total of 10044 iterations :-/ That's a lot, and should be optimized if
possible. With the outer loop operating on N an easy optimization would have
been to compute fin * N in a local variable to avoid redoing the
multiplication for every value of M, but that's not possible anymore with the
outer loop operating on M.
--
Regards,
Laurent Pinchart
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
next prev parent reply other threads:[~2017-12-13 9:09 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-06 6:05 [PATCH v2] drm: rcar-du: calculate DPLLCR to be more small jitter Kuninori Morimoto
2017-12-13 9:09 ` Laurent Pinchart [this message]
2017-12-13 9:09 ` Laurent Pinchart
2017-12-14 2:10 ` Kuninori Morimoto
2017-12-14 8:17 ` Laurent Pinchart
2017-12-14 8:42 ` Geert Uytterhoeven
2017-12-15 0:13 ` Kuninori Morimoto
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4708231.IEM7dkWZuQ@avalon \
--to=laurent.pinchart@ideasonboard.com \
--cc=airlied@linux.ie \
--cc=dri-devel@lists.freedesktop.org \
--cc=kuninori.morimoto.gx@renesas.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-renesas-soc@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.