From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Anholt Subject: Re: [PATCH v2 3/3] drm/vc4: Add a load tracker to prevent HVS underflow errors Date: Thu, 08 Nov 2018 08:26:49 -0800 Message-ID: <8736sbo7qu.fsf@anholt.net> References: <20181025124546.22145-1-boris.brezillon@bootlin.com> <20181025124546.22145-4-boris.brezillon@bootlin.com> <87bm7bm3k8.fsf@anholt.net> <20181105123648.4fe6d838@bbrezillon> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0772148160==" Return-path: Received: from anholt.net (anholt.net [50.246.234.109]) by gabe.freedesktop.org (Postfix) with ESMTP id 727806E4E8 for ; Thu, 8 Nov 2018 16:26:52 +0000 (UTC) In-Reply-To: <20181105123648.4fe6d838@bbrezillon> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Boris Brezillon Cc: dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org --===============0772148160== Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Boris Brezillon writes: > Hi Eric, > > On Tue, 30 Oct 2018 16:12:55 -0700 > Eric Anholt wrote: >> > +static void vc4_load_tracker_destroy_state(struct drm_private_obj *ob= j, >> > + struct drm_private_state *state) >> > +{ >> > + struct vc4_load_tracker_state *load_state; >> > + >> > + load_state =3D to_vc4_load_tracker_state(state); >> > + kfree(load_state); >> > +}=20=20 >>=20 >> Optional: just kfree(state) for simplicity. > > Hm, not sure that's a good idea. kfree(state) works as long as > drm_private_state is the first field in vc4_load_tracker_state, but it > sounds a bit fragile. > > I can do > > kfree(to_vc4_load_tracker_state(state)); > > if you prefer. I said optional for that reason :) Just keep it as is. >> > +static void vc4_plane_calc_load(struct drm_plane_state *state) >> > +{ >> > + unsigned int hvs_load_shift, vrefresh, i; >> > + struct drm_framebuffer *fb =3D state->fb; >> > + struct vc4_plane_state *vc4_state; >> > + struct drm_crtc_state *crtc_state; >> > + unsigned int vscale_factor; >> > + >> > + vc4_state =3D to_vc4_plane_state(state); >> > + crtc_state =3D drm_atomic_get_existing_crtc_state(state->state, >> > + state->crtc); >> > + vrefresh =3D drm_mode_vrefresh(&crtc_state->adjusted_mode); >> > + >> > + /* The HVS is able to process 2 pixels/cycle when scaling the source, >> > + * 4 pixels/cycle otherwise. >> > + * Alpha blending step seems to be pipelined and it's always operati= ng >> > + * at 4 pixels/cycle, so the limiting aspect here seems to be the >> > + * scaler block. >> > + * HVS load is expressed in clk-cycles/sec (AKA Hz). >> > + */ >> > + if (vc4_state->x_scaling[0] !=3D VC4_SCALING_NONE || >> > + vc4_state->x_scaling[1] !=3D VC4_SCALING_NONE || >> > + vc4_state->y_scaling[0] !=3D VC4_SCALING_NONE || >> > + vc4_state->y_scaling[1] !=3D VC4_SCALING_NONE) >> > + hvs_load_shift =3D 1; >> > + else >> > + hvs_load_shift =3D 2; >> > + >> > + vc4_state->membus_load =3D 0; >> > + vc4_state->hvs_load =3D 0; >> > + for (i =3D 0; i < fb->format->num_planes; i++) { >> > + unsigned long pixels_load;=20=20 >>=20 >> I'm scared any time I see longs. Do you want 32 or 64 bits here? > > I just assumed a 32bit unsigned var would be enough, so unsigned long > seemed just fine. I can use u32 or u64 if you prefer. Yes, please. See also Maxime's recent trouble with a 64-bit kernel. >> > + /* Even if the bandwidth/plane required for a single frame is >> > + * >> > + * vc4_state->src_w[i] * vc4_state->src_h[i] * cpp * vrefresh >> > + * >> > + * when downscaling, we have to read more pixels per line in >> > + * the time frame reserved for a single line, so the bandwidth >> > + * demand can be punctually higher. To account for that, we >> > + * calculate the down-scaling factor and multiply the plane >> > + * load by this number. We're likely over-estimating the read >> > + * demand, but that's better than under-estimating it. >> > + */ >> > + vscale_factor =3D DIV_ROUND_UP(vc4_state->src_h[i], >> > + vc4_state->crtc_h); >> > + pixels_load =3D vc4_state->src_w[i] * vc4_state->src_h[i] * >> > + vscale_factor;=20=20 >>=20 >> If we're upscaling (common for video, right?), aren't we under-counting >> the cost? You need to scale/colorspace-convert crtc_w * crtc_h at 2 >> pixels per cycle. > > That's not entirely clear to me. I'm not sure what the scaler does when > upscaling. Are the same pixels read several times from the memory? If > that's the case, then the membus load should indeed be based on the > crtc_w,h. I'm going to punt on this question because that would be a *lot* of verilog tracing to figure out for me (and I'm not sure I'd even trust what I came up with). > Also, when the spec says the HVS can process 4pixels/cycles, is it 4 > input pixels or 4 output pixels per cycle? Well, it's 4 pixels per cycle when not scaling, so both :) I think the scaling pipeline is doing two output pixels per cycle. Nothing else would make sense to me. --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEE/JuuFDWp9/ZkuCBXtdYpNtH8nugFAlvkY8oACgkQtdYpNtH8 nuj4hA/+LmseQCj6SfivaAzw5rit6G/hN5pgaY5NKFD2DnTnt/RFaQZKz2IW45bB T/fQKPaGrR0CKwtJ1QwsB/611jTdZbmK92oE9mB/8DEWUomiaSaBYKJH5D5fizPJ bVK/HjP6Oyg8k9SWSDDAMM+7D8jFUJRiJsJwqBy82FKg2iGdLZr/13yj43lr3HeJ Q6SyoBHJzGGd9JEA7mO77pF+3IJeirwc704XuLq7tmZq/fg7HDCOJR3BsfPZc1+N fHz82DJXd7Hmlc7jcT2kzup0jSKMNfrTIIG0eMl052xIN3hZrGM8Yofl0e/wfOJk ffJZuilDzV60MyIiTr5Hi86TVEGnzjvgowQzgOSGfIZv81s1PrBb4EPD8SeAneB+ Qs+WbOiE6CA/D2VRJLg2XkHaCgmEG+WZ5nrtlsnt/b0wdnojMPUV5yujZ5an2kiX 1DIB/pxq8CiWyl2zoSS4HlAjQRLpPOxYUnadeMTRL+nXXBMsWlKNugi1hMd2QdZ7 YzEIlen+bc8FVQmfU/sbIMuYCLNE7yA5gLwXMF7TcnJXWdQH/gy+f1rJjq1N9NIq JiQvUBtbRoAarJxx0SUtWrbIC66o6qkpth3EGEUBlWaUO6GAgOyvU3Z4+91evVwT xU3c1zNEcaCKfDfUmnFdw702RebqNbHbtjvBer9ecyBZjCvpuHg= =eBXh -----END PGP SIGNATURE----- --=-=-=-- --===============0772148160== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KZHJpLWRldmVs IG1haWxpbmcgbGlzdApkcmktZGV2ZWxAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVsCg== --===============0772148160==--