From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Keith Packard" Subject: Re: [PATCH] vulkan: Add VK_GOOGLE_display_timing extension (x11+display, anv+radv) [v6] Date: Sat, 17 Nov 2018 13:34:59 -0800 Message-ID: <87lg5r75h8.fsf@keithp.com> References: <20181115180623.32390-1-keithp@keithp.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0822338292==" Return-path: In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: mesa-dev-bounces@lists.freedesktop.org Sender: "mesa-dev" To: Michel =?utf-8?Q?D=C3=A4nzer?= Cc: mesa-dev@lists.freedesktop.org, dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org --===============0822338292== Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Michel D=C3=A4nzer writes: Thanks for taking time to review this patch! >> + int64_t refresh =3D (int64_t) refresh_timing.refreshDura= tion; >> + int64_t frames =3D (delta_nsec + refresh/2) / refresh; > > desiredPresentTime has "no sooner than" semantics, so I think this should= be > > int64_t frames =3D (delta_nsec + refresh-1) / refresh; Hrm. You're certainly right that we want to make sure to not hit the wrong frame, and we need to be very careful with this computation. And that turns out to be 'hard'. With a na=C3=AFve computation of frame times: future_frame_time =3D past_frame_time + n * refresh If the reported refresh is longer than the actual interval, due to rounding of that value or clock skew, this computation might select frame n+1 if the driver uses a later frame for its basis than the application: desiredPresentTime =3D application_past_frame_time + n * refresh delta_nsec =3D (desiredPresentTime - driver_past_frame_time); frames =3D (delta_nsec + refresh-1) / refresh; If 'driver_past_frame_time' was sampled 'm' frames after 'application_past_frame time', and 'refresh' is longer than the actual frame time: desiredPresentTime > driver_past_frame_time + m * refresh Because desiredPresentTime is *past* our estimate of the beginning of the frame the application wants, and because we're rounding the selected frame up, we end up targeting one frame too late. Now, if we use my value for 'frames', then we hit the right frame using this value, as long as the error is less than 1/2 frame time: desiredPresentTime > driver_past_frame_time + m * refresh desiredPresentTime < driver_past_frame_time + m * refresh + refresh= /2 delta_nsec > m * refresh delta_nsec < m * refresh + refresh /=20 frames > (m * refresh + refresh/2) / refresh frames < (m * refresh + refresh) / refresh With this computation, frames =3D m, which is the desired result. So at least you can see where my code came from. But, it's clearly wrong according to the spec, as you'll see in the next section. An application can attempt to compensate for this by using an earlier time; a slightly less na=C3=AFve computation might look like: future_frame_time =3D past_frame_time + 1 + (n-1) * refresh This makes 'future_frame_time' be the earliest possible time that should target the desired frame, given the 'no sooner than' semantics in the spec. If the reported refresh is shorter than the actual interval, this computation might hit frame (n-1). Ok, so now we make the application even 'smarter' by having it compute a time in the center of the target frame: future_frame_time =3D past_frame_time + (refresh/2) + (n-1) * refre= sh With your suggested code, this will hit the desired frame unless the error in the frame time is more than 1/2 of the refresh interval, which seems pretty good. Ok, so what can we do? I think we start with what we know: * driver_past_frame_time >=3D application_past_frame_time Because all application frame time information comes from the driver, we just need to use the latest possible frame time in the driver to keep this true. Now, what will cause errors in the 'refresh' value? 'refresh' error is a combination of rounding error and CPU vs GPU clock skew. * Rounding error. This is always less than 1ns. * Clock skew is related to the performance of a couple of crystals in the system. Even cheap crystals provide significantly better than 100ppm (parts per million) performance. At 30Hz, refresh_interval is 33.3ms, or 33,333,333ns, so each crystals will have a maximum error of 3300ns; combine two and we've a maximum error of 6600ns. As you can see, the rounding error is lost in the noise here, unless we find a system that uses CLOCK_MONOTONIC for display timing. It'll take 5000 frames before that error reaches a frame time. As long as the application_past_frame_time is within 2500 frames of the driver_past_frame_time, the error in the future_frame_time estimate will be within one-half frame, and our application will work reliably using the 'smarter' computation of future frame time. I would prefer to let applications use the initial na=C3=AFve future_frame_time estimate, as I think that could also work with variable refresh timing, but that would require a fairly complicated change in the specification. >> + timing->target_msc =3D swapchain->frame_msc + frames; >> + } >> + } > > Note that MSC based timing won't work well with variable refresh rate. > In the long term, support for PresentOptionUST should be implemented and > used. Agreed. Given the above discussion, I think that will have to wait for a more sophisticated specification for what 'desiredPresentTime' means, as I think the current specification makes it "impossible" to provide an actual desiredPresentTime to the interface without that causing occasional incorrect frame selection. > What is this code for? At least with X11 Present, shouldn't it be > sufficient to simply pass the target MSC value to x11_present_to_x11? This is the direct display back end which uses DRM interfaces in the kernel. Those interfaces do not support queuing a flip for anything other than the next frame. I'll update this patch to correct the computation of the next frame from 'nearest' to 'no sooner than'. Following the spec is certainly better than making applications simpler as any application which does follow the spec is going to get the wrong answer... =2D-=20 =2Dkeith --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEw4O3eCVWE9/bQJ2R2yIaaQAAABEFAlvwiYMACgkQ2yIaaQAA ABFDRhAApM4qDj9u3xJ35szxf3aeVvULahzoJdznm/wCDjOBkoUIA5IwufnLxpA3 kFqygmuvPjdl11QGU9YiQW7GQlsBlni6Nlp8tqXxRi+OD/aQelyEWsUGcVlhLsuS BtNP+hoeE2OyrDuswC3lVCbBPsY9zzTR2iLBooK4mN+KsUEc9uOnbXs1UVS2D1fr fqHOB+fDP6vJyeRpOGk5NOgQfI95sN/nm3UaTENjDMk3SC1eQIb7O4Ikuh75c1gv P1FHh2dWSDrOfzIra4e39IbFnkuNU7UGHLbHAL+xr9irJfhcz8/Tg3mPPehGmZuc JVdxdx0HlKccimNySz64YNAVlCasTdVDUcfsrXZ9DjajowQ2hre75UiqyFubhM/F x5ryIGnyAh+iUZ1fc4fQELcux6nGoUY/88N8c+Kso6l6b4q2mn3acN6mtUnxbpbf Sqm2X1DUmUcVdylycwePC+A8vShX6TFpBMVZ86mGxbasFiaOkbOwPdDesGdKdfpD TO5Kj031ulfMpRTNI8RJ9tDV1YNw6qR80pbe7mVcgYxkn5r8m08eM+LjOgKCA6M4 CkKYV71ZIfXLH62qm5NDwCgYaMO1aT46x39TBkpdRYWH/VekrfBOH//dStoCqR3N YsWsqq6Gnb3k6A4ivY5My70aB5sMug8uSywkDav6VfTjdrnm24M= =TnwL -----END PGP SIGNATURE----- --=-=-=-- --===============0822338292== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KbWVzYS1kZXYg bWFpbGluZyBsaXN0Cm1lc2EtZGV2QGxpc3RzLmZyZWVkZXNrdG9wLm9yZwpodHRwczovL2xpc3Rz LmZyZWVkZXNrdG9wLm9yZy9tYWlsbWFuL2xpc3RpbmZvL21lc2EtZGV2Cg== --===============0822338292==--