From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Anholt Subject: Re: [PATCH v2 3/3] drm/vc4: Add a load tracker to prevent HVS underflow errors Date: Thu, 08 Nov 2018 09:53:43 -0800 Message-ID: <87k1ln8nh4.fsf@anholt.net> References: <20181025124546.22145-1-boris.brezillon@bootlin.com> <20181025124546.22145-4-boris.brezillon@bootlin.com> <87bm7bm3k8.fsf@anholt.net> <20181105123648.4fe6d838@bbrezillon> <8736sbo7qu.fsf@anholt.net> <20181108175058.4038de03@bbrezillon> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1576942922==" Return-path: Received: from anholt.net (anholt.net [50.246.234.109]) by gabe.freedesktop.org (Postfix) with ESMTP id 3AA346E59C for ; Thu, 8 Nov 2018 17:53:46 +0000 (UTC) In-Reply-To: <20181108175058.4038de03@bbrezillon> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Boris Brezillon Cc: dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org --===============1576942922== Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Boris Brezillon writes: > On Thu, 08 Nov 2018 08:26:49 -0800 > Eric Anholt wrote: > >> >> > +static void vc4_plane_calc_load(struct drm_plane_state *state) >> >> > +{ >> >> > + unsigned int hvs_load_shift, vrefresh, i; >> >> > + struct drm_framebuffer *fb =3D state->fb; >> >> > + struct vc4_plane_state *vc4_state; >> >> > + struct drm_crtc_state *crtc_state; >> >> > + unsigned int vscale_factor; >> >> > + >> >> > + vc4_state =3D to_vc4_plane_state(state); >> >> > + crtc_state =3D drm_atomic_get_existing_crtc_state(state->state, >> >> > + state->crtc); >> >> > + vrefresh =3D drm_mode_vrefresh(&crtc_state->adjusted_mode); >> >> > + >> >> > + /* The HVS is able to process 2 pixels/cycle when scaling the sou= rce, >> >> > + * 4 pixels/cycle otherwise. >> >> > + * Alpha blending step seems to be pipelined and it's always oper= ating >> >> > + * at 4 pixels/cycle, so the limiting aspect here seems to be the >> >> > + * scaler block. >> >> > + * HVS load is expressed in clk-cycles/sec (AKA Hz). >> >> > + */ >> >> > + if (vc4_state->x_scaling[0] !=3D VC4_SCALING_NONE || >> >> > + vc4_state->x_scaling[1] !=3D VC4_SCALING_NONE || >> >> > + vc4_state->y_scaling[0] !=3D VC4_SCALING_NONE || >> >> > + vc4_state->y_scaling[1] !=3D VC4_SCALING_NONE) >> >> > + hvs_load_shift =3D 1; >> >> > + else >> >> > + hvs_load_shift =3D 2; >> >> > + >> >> > + vc4_state->membus_load =3D 0; >> >> > + vc4_state->hvs_load =3D 0; >> >> > + for (i =3D 0; i < fb->format->num_planes; i++) { >> >> > + unsigned long pixels_load;=20=20=20=20 >> >>=20 >> >> I'm scared any time I see longs. Do you want 32 or 64 bits here?=20= =20 >> > >> > I just assumed a 32bit unsigned var would be enough, so unsigned long >> > seemed just fine. I can use u32 or u64 if you prefer.=20=20 >>=20 >> Yes, please. See also Maxime's recent trouble with a 64-bit kernel. > > Will use u32 then. > >>=20 >> >> > + /* Even if the bandwidth/plane required for a single frame is >> >> > + * >> >> > + * vc4_state->src_w[i] * vc4_state->src_h[i] * cpp * vrefresh >> >> > + * >> >> > + * when downscaling, we have to read more pixels per line in >> >> > + * the time frame reserved for a single line, so the bandwidth >> >> > + * demand can be punctually higher. To account for that, we >> >> > + * calculate the down-scaling factor and multiply the plane >> >> > + * load by this number. We're likely over-estimating the read >> >> > + * demand, but that's better than under-estimating it. >> >> > + */ >> >> > + vscale_factor =3D DIV_ROUND_UP(vc4_state->src_h[i], >> >> > + vc4_state->crtc_h); >> >> > + pixels_load =3D vc4_state->src_w[i] * vc4_state->src_h[i] * >> >> > + vscale_factor;=20=20=20=20 >> >>=20 >> >> If we're upscaling (common for video, right?), aren't we under-counti= ng >> >> the cost? You need to scale/colorspace-convert crtc_w * crtc_h at 2 >> >> pixels per cycle.=20=20 >> > >> > That's not entirely clear to me. I'm not sure what the scaler does when >> > upscaling. Are the same pixels read several times from the memory? If >> > that's the case, then the membus load should indeed be based on the >> > crtc_w,h.=20=20 >>=20 >> I'm going to punt on this question because that would be a *lot* of >> verilog tracing to figure out for me (and I'm not sure I'd even trust >> what I came up with). >>=20 >> > Also, when the spec says the HVS can process 4pixels/cycles, is it 4 >> > input pixels or 4 output pixels per cycle?=20=20 >>=20 >> Well, it's 4 pixels per cycle when not scaling, so both :) > > Sorry, I meant 2pixels/cycle :). > >>=20 >> I think the scaling pipeline is doing two output pixels per cycle. >> Nothing else would make sense to me. > > Okay, so the HVS load should be based on crtc_w/h and the membus load > should be based on src_w/h and the scaling ratio, right? That sounds about right to me. --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEE/JuuFDWp9/ZkuCBXtdYpNtH8nugFAlvkeCcACgkQtdYpNtH8 nuhfFw/9FaozTmJlm2cyIFVEn2+HO/bfWXmdG6XQ08gjyWHKfdFfg5fGR56WIW5q OzDLWZduFiFuT8yPSvjbmh8u20XpQAZnnXsT5DbsMxb4RI5x15x4tLtRa4FJYAIL yOXSCmB8NZ5pZwkLwule+bZF3n5pJ7bH12cEgKOQmkNMuyn68XZlT0v94b5IjAiW UWxEDh+XkgRSoF7m7ejLavUoKpge5cOaeNRlLmDSWXWVntzMSr9L6BKV7UgzKB8U A9JSF4kI/9HB78cp9M7oBMFu0Je1rE7TmsBAjTXaoJkvzTTQZxCXzdtFvBRL9z2K exV/8Vm0hDdP+oFWg/dO4TO0/kc1T6RN08NcWvBnwlK/J6NTj5jfspquHe6zdyIh urNFrZZnb44vuGdBXP7uSCZDO+zGvfr3iXZXz0+9zxoxo/b7Iw3BLqlvVJusp75E npwNVismErhP68UwIkx92t2BeZcOC/vOvlTzNJCRelzv3rTzljZampQ+dr74LqKn 5nQiB9FlH0y8uux4v7KYRPMfgI+0wlUyF3ZEDjU5+VuWPwyrmirfTJJFhQOxxM1+ e+13n1umGyY+VJvzJog92ORkvtaN2EfkKNzuFLCfk8oOsB/gaKkBAoCULdN0uTIR wRzuJWWDyrV2gnRXL16bipCFhdGNBKy1lGjJ8iUySMD0gtLjNVs= =Eo6u -----END PGP SIGNATURE----- --=-=-=-- --===============1576942922== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVsCg== --===============1576942922==--