From mboxrd@z Thu Jan 1 00:00:00 1970 From: Grigori Goronzy Subject: Re: [PATCH] drm/radeon: Adding UVD handle basis fps estimation v2 Date: Fri, 15 Aug 2014 17:20:16 +0200 Message-ID: <53EE2530.2030203@chown.ath.cx> References: <1407411210-1939-1-git-send-email-deathsimple@vodafone.de> <53E39C0A.6090109@vodafone.de> <53E88811.9030302@vodafone.de> <53E9E5A8.8010004@vodafone.de> <53EDC943.3020306@amd.com> <53EE151B.6050105@amd.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1618184595==" Return-path: Received: from pygmy.kinoho.net (pygmy.kinoho.net [134.0.27.24]) by gabe.freedesktop.org (Postfix) with ESMTP id D36F86E7FC for ; Fri, 15 Aug 2014 08:20:21 -0700 (PDT) In-Reply-To: <53EE151B.6050105@amd.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: =?UTF-8?B?Q2hyaXN0aWFuIEvDtm5pZw==?= , Marco Benatto Cc: "Peter.Fruehberger@gmail.com" , Maling list - DRI developers List-Id: dri-devel@lists.freedesktop.org This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --===============1618184595== Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="dX1huxlUbHMdDeAKxruAex3GAIXCmKbjX" This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --dX1huxlUbHMdDeAKxruAex3GAIXCmKbjX Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 15.08.2014 16:11, Christian K=C3=B6nig wrote: > Hi Marco, >=20 > the problem with an CS ioctl flag is that we sometimes don't know how > much SCLK/MCLK boost is needed, for example when we do post processing > in the player using OpenGL and UVD decoding with VDPAU. In this case > VDPAU don't has the slightest idea how high SCLK/MCLK must be and so > can't give that info to the kernel either. > Maybe it's an acceptable workaround to simply disable dynamic UVD state selection in case the UVD states only have a single power level. That will avoid the performance issues on affected systems, while still allowing dynamic UVD states on systems that have a saner DPM table setup. I think it is mosly older systems that suffer from this. Best regards Grigori > Regards, > Christian. >=20 > Am 15.08.2014 um 15:21 schrieb Marco Benatto: >> Hey all, >> >> I also had a talk with Alex yesterday about post-processing issues >> when using dynamic UVD profiles and a chamge on CS ioctl >> including a flag to let user mode driver tell to the kernel which >> performance requirement it wants for post processing. A commom >> point for both discussion is to stablish the default values for these >> profiles, but probably this ioctl change would be more impacting/compl= ex >> to implement than a sysfs entry. >> >> If a sysfs entry is anough for now I can handle the code to create it >> and, with your help, the code to setup the UVD profile requested >> through it. >> >> Is there any suggestion? >> >> Thanks all for your help, >> >> >> On Fri, Aug 15, 2014 at 5:48 AM, Christian K=C3=B6nig >> > wrote: >> >> Hi guys, >> >> to make a long story short every time I watch a movie my laptop >> start to heat up because we always select the standard UVD power >> profile without actually measuring if that is necessary. >> >> Marco came up with a patch that seems to reliable measure the fps >> send down to the kernel and so together with knowing the frame >> size of the video should allow us to select the right UVD power >> profile. >> >> The problem is that Alex (unnoticed by me) completely disabled >> selecting the UVD profiles because of some issues with advanced >> post processing discussed on IRC. The problem seems to be that the= >> lower UVD profiles have a to low SCLK/MCLK to handle the 3D load >> that comes with scaling, deinterlacing etc... >> >> I unfortunately don't have time for it, cause this only affects >> the hardware generations R600-SI and not the newest one CIK. So >> could you guys stick together and come up with a solution? >> Something like a sysfs entry that let's us select the minimum UVD >> power level allowed? >> >> I think Marco is happy to come up with a patch, we just need to >> know what's really needed and what should be the default values. >> I'm happy to review everything that comes out of it, just don't >> have time to do it myself. >> >> Happy discussion and thanks in advance, >> Christian. >> >> Am 12.08.2014 um 15:05 schrieb Alex Deucher: >> >> On Tue, Aug 12, 2014 at 6:00 AM, Christian K=C3=B6nig >> > wro= te: >> >> Am 11.08.2014 um 16:52 schrieb Alex Deucher: >> >> On Mon, Aug 11, 2014 at 5:08 AM, Christian K=C3=B6nig >> > > wrote: >> >> Am 07.08.2014 um 21:43 schrieb Alex Deucher: >> >> On Thu, Aug 7, 2014 at 11:32 AM, Christian K=C3= =B6nig >> > > wrote: >> >> Am 07.08.2014 um 16:32 schrieb Alex Deuche= r: >> >> On Thu, Aug 7, 2014 at 7:33 AM, >> Christian K=C3=B6nig >> > > >> wrote: >> >> From: Marco A Benatto >> > > >> >> Adding a Frames Per Second >> estimation logic on UVD handles >> when it has being used. This >> estimation is per handle basis >> and will help on DPM profile >> calculation. >> >> v2 (chk): fix timestamp type, move= >> functions around and >> cleanup code a bit. >> >> Will this really help much? I thought= >> the problem was mainly due to >> sclk and mclk for post processing. >> >> >> It should at least handle the UVD side for= >> upclocking when you get a >> lot >> of >> streams / fps. And at on my NI the patch >> seems to do exactly that. >> >> Switching sclk and mclk for post >> processing is a different task, and I >> actually have no idea what to do with them= =2E >> >> At this point we always choose the plain UVD >> state anyway so this >> patch would only take effect if we re-enabled >> the dynamic UVD state >> selection. >> >> >> Hui? I thought we already re-enabled the dynamic >> UVD state selection, but >> double checking this I found it disabled again. >> >> What was the problem with that? Looks like I >> somehow missed the >> discussion >> around it. >> >> We did, but after doing so a number of people >> complained about a >> regression on IRC because when apps like xmbc enabled >> post processing, >> performance went down. >> >> >> That's strange, from my experience the different UVD >> performance states only >> affect UVDs dclk/vclk, not sclk/mclk. I need to get the >> DPM dumps to >> confirms this. >> >> The sclks and mclks are usually different as well, especially >> on APUs. >> I can send you some examples. >> >> You not off hand remember who complained on IRC? Finding >> something in the >> IRC logs is like searching for a needle in a haystack. >> >> I don't remember off hand. I think zgreg was involved in some= >> of the >> discussions. >> >> Alex >> >> Thanks, >> Christian. >> >> >> Alex >> >> >> Christian. >> >> >> For the post processing, we probably need a >> hint we can >> pass to the driver in the CS ioctl to denote >> what state we need. >> Although if we did that, this could would >> largely be moot. That said, >> newer asics support dynamic UVD clocks so we >> really only need >> something like that for older asics and I >> guess VCE. >> >> Alex >> >> Christian. >> >> >> Alex >> >> Signed-off-by: Marco A Benatto >> > > >> Signed-off-by: Christian K=C3=B6ni= g >> > >= >> --- >> =20 >> drivers/gpu/drm/radeon/radeon.h=20 >> | 10 ++++++ >> =20 >> drivers/gpu/drm/radeon/radeon_uvd= =2Ec >> | 64 >> +++++++++++++++++++++++++++++++++-= --- >> 2 files changed, 68 >> insertions(+), 6 deletions(-) >> >> diff --git >> a/drivers/gpu/drm/radeon/radeon.h >> b/drivers/gpu/drm/radeon/radeon.h >> index 9e1732e..e92f6cb 100644 >> --- a/drivers/gpu/drm/radeon/radeo= n.h >> +++ b/drivers/gpu/drm/radeon/radeo= n.h >> @@ -1617,6 +1617,15 @@ int >> radeon_pm_get_type_index(struct >> radeon_device >> *rdev, >> #define >> RADEON_UVD_STACK_SIZE (1024*1024)= >> #define RADEON_UVD_HEAP_SIZE = >> (1024*1024) >> >> +#define RADEON_UVD_FPS_EVENTS_MAX= 8 >> +#define RADEON_UVD_DEFAULT_FPS 60= >> + >> +struct radeon_uvd_fps { >> + uint64_t timestamp;= >> + uint8_t event_inde= x; >> + uint8_t =20 >> events[RADEON_UVD_FPS_EVENTS_MAX]= ; >> +}; >> + >> struct radeon_uvd { >> struct radeon_bo =20 >> *vcpu_bo; >> void =20 >> *cpu_addr; >> @@ -1626,6 +1635,7 @@ struct >> radeon_uvd { >> struct drm_file = >> *filp[RADEON_MAX_UVD_HANDLES]; >> unsigned =20 >> img_size[RADEON_MAX_UVD_HANDLES]= ; >> struct delayed_work = >> idle_work; >> + struct radeon_uvd_fps=20 >> fps_info[RADEON_MAX_UVD_HANDLES];= >> }; >> >> int radeon_uvd_init(struct >> radeon_device *rdev); >> diff --git >> a/drivers/gpu/drm/radeon/radeon_uv= d.c >> b/drivers/gpu/drm/radeon/radeon_uv= d.c >> index 6bf55ec..ef5667a 100644 >> --- >> a/drivers/gpu/drm/radeon/radeon_uv= d.c >> +++ >> b/drivers/gpu/drm/radeon/radeon_uv= d.c >> @@ -237,6 +237,51 @@ void >> radeon_uvd_force_into_uvd_segment(= struct >> radeon_bo *rbo) >> rbo->placement.lpfn =3D= >> (256 * 1024 * 1024) >> PAGE_SHIFT;= >> } >> >> +static void >> radeon_uvd_fps_clear_events(struct= >> radeon_device *rdev, >> int >> idx) >> +{ >> + struct radeon_uvd_fps *fps= >> =3D &rdev->uvd.fps_info[idx]; >> + unsigned i; >> + >> + fps->timestamp =3D jiffies= _64; >> + fps->event_index =3D 0; >> + for (i =3D 0; i < >> RADEON_UVD_FPS_EVENTS_MAX; i++) >> + fps->events[i] =3D= 0; >> +} >> + >> +static void >> radeon_uvd_fps_note_event(struct >> radeon_device *rdev, >> int >> idx) >> +{ >> + struct radeon_uvd_fps *fps= >> =3D &rdev->uvd.fps_info[idx]; >> + uint64_t timestamp =3D >> jiffies_64; >> + unsigned rate =3D 0; >> + >> + uint8_t index =3D >> fps->event_index++; >> + fps->event_index %=3D >> RADEON_UVD_FPS_EVENTS_MAX; >> + >> + rate =3D div64_u64(HZ, >> max(timestamp - fps->timestamp, >> 1ULL)); >> + >> + fps->timestamp =3D timesta= mp; >> + fps->events[index] =3D >> min(rate, 120u); >> +} >> + >> +static unsigned >> radeon_uvd_estimate_fps(struct >> radeon_device *rdev, >> int >> idx) >> +{ >> + struct radeon_uvd_fps *fps= >> =3D &rdev->uvd.fps_info[idx]; >> + unsigned i, valid =3D 0, >> count =3D 0; >> + >> + for (i =3D 0; i < >> RADEON_UVD_FPS_EVENTS_MAX; i++) { >> + /* We should >> ignore zero values */ >> + if (fps->events[i]= >> !=3D 0) { >> + count +=3D= >> fps->events[i]; >> + valid++; >> + } >> + } >> + >> + if (valid > 0) >> + return count / val= id; >> + else >> + return >> RADEON_UVD_DEFAULT_FPS; >> +} >> + >> void >> radeon_uvd_free_handles(struct >> radeon_device *rdev, struct >> drm_file *filp) >> { >> int i, r; >> @@ -419,8 +464,10 @@ static int >> radeon_uvd_cs_msg(struct >> radeon_cs_parser >> *p, struct radeon_bo *bo, >> >> /* create or decode, >> validate the handle */ >> for (i =3D 0; i < >> RADEON_MAX_UVD_HANDLES; ++i) { >> - if >> (atomic_read(&p->rdev->uvd.handles= [i]) >> =3D=3D handle) >> + if >> (atomic_read(&p->rdev->uvd.handles= [i]) >> =3D=3D handle) >> { >> + =20 >> radeon_uvd_fps_note_event(p->rdev= , i); >> return= 0; >> + } >> } >> >> /* handle not found >> try to alloc a new one */ >> @@ -428,6 +475,7 @@ static int >> radeon_uvd_cs_msg(struct >> radeon_cs_parser >> *p, struct radeon_bo *bo, >> if >> (!atomic_cmpxchg(&p->rdev->uvd.han= dles[i], >> 0, >> handle)) { >> =20 >> p->rdev->uvd.filp[i] =3D p->filp; >> =20 >> p->rdev->uvd.img_size[i] =3D img_s= ize; >> + =20 >> radeon_uvd_fps_clear_events(p->rd= ev, >> i); >> return= 0; >> } >> } >> @@ -763,7 +811,7 @@ int >> radeon_uvd_get_destroy_msg(struct >> radeon_device >> *rdev, int ring, >> static void >> radeon_uvd_count_handles(struct >> radeon_device *rdev, >> =20 >> unsigned *sd, unsigned *hd)= >> { >> - unsigned i; >> + unsigned i, fps_rate =3D 0= ; >> >> *sd =3D 0; >> *hd =3D 0; >> @@ -772,10 +820,13 @@ static void >> radeon_uvd_count_handles(struct >> radeon_device *rdev, >> if >> (!atomic_read(&rdev->uvd.handles[i= ])) >> contin= ue; >> >> - if >> (rdev->uvd.img_size[i] >=3D 720*57= 6) >> - ++(*hd); >> - else >> - ++(*sd); >> + fps_rate =3D >> radeon_uvd_estimate_fps(rdev, i); >> + >> + if >> (rdev->uvd.img_size[i] >=3D 720*57= 6) { >> + (*hd) +=3D= >> fps_rate > 30 ? 1 : 2; >> + } else { >> + (*sd) +=3D= >> fps_rate > 30 ? 1 : 2; >> + } >> } >> } >> >> @@ -805,6 +856,7 @@ void >> radeon_uvd_note_usage(struct >> radeon_device >> *rdev) >> set_clocks &=3D >> schedule_delayed_work(&rdev->uvd.i= dle_work, >> >> msecs_to_jiffies(UVD_IDLE_TIMEOUT_= MS)); >> >> + >> if >> ((rdev->pm.pm_method =3D=3D >> PM_METHOD_DPM) && >> rdev->pm.dpm_enabled) { >> unsigned hd =3D= >> 0, sd =3D 0; >> =20 >> radeon_uvd_count_handles(rdev, >> &sd, &hd); >> -- >> 1.9.1 >> >> >> >> >> >> --=20 >> Marco Antonio Benatto >> Linux user ID:#506236 >=20 --dX1huxlUbHMdDeAKxruAex3GAIXCmKbjX Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJT7iUxAAoJEOpmYE8Sf8H2fT4P/0iwcnz5WctZEuvLddv/u6Mq wzKGsvhBk6EUX+eTUwgT7aK8IQyv16Q6Gq+T82iu6RWOIezvlkN0wov09LJM2iNS HchxpeyeyEpFEDbqwfd8LMfyeyp6Uu3e9XNkepkRdj8KtFPJatoRHC0XElIZJ+HJ jqken7rE2ykvTAvQN/XAEGP1qvfK6N7A6jr6VzD0Lj9KS3xab4zlAGoapEjTMzOE 58EMaxVwDU/aNikl8TMW8suM6AXMQY9R9UOEZgDPK3GmoUHwJkod0B0EQWnKQrD6 UqkdSLyy5Gz1ixjLGLBF6dbPbFwIZWnaF1mSPbxQdMaMCEk70AYgdzTsolLMOsxx /dYYJwBct0qd6KZxXupWe/fD+0H1PBfIQLNmBAxjfpDnDYyIbgdFiBOg7hdRC2L2 wRwHZAV7Csfj4QQANKguhxTZNW41OrSCqQtk8SldJCkPVdo+urWCK+9+vpD+QmMg M3Vo4y8DJD/gJAhY5Z9yOIurrDFIzC2zN6LI92lyozdjax6rVLpz355/TmT8idaH 4ZXKVXhoFAK4KrwntqWL1ydezluFyRHcJ0MW6KifYmTfZ8oBtkzB1HSvU9CRV2Ff +9cHbphqfxbxxK7b26jX6mq/8bJrfkq75Pulx4nUb7zF2ZqONHbMrk05AiELwG9U sc/dil6DfhsHrEOG4hN1 =9E6z -----END PGP SIGNATURE----- --dX1huxlUbHMdDeAKxruAex3GAIXCmKbjX-- --===============1618184595== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel --===============1618184595==--