From mboxrd@z Thu Jan 1 00:00:00 1970 From: Grigori Goronzy Subject: Re: [PATCH] drm/radeon: Adding UVD handle basis fps estimation v2 Date: Fri, 15 Aug 2014 18:52:48 +0200 Message-ID: <53EE3AE0.20803@chown.ath.cx> References: <1407411210-1939-1-git-send-email-deathsimple@vodafone.de> <53E39C0A.6090109@vodafone.de> <53E88811.9030302@vodafone.de> <53E9E5A8.8010004@vodafone.de> <53EDC943.3020306@amd.com> <53EE151B.6050105@amd.com> <53EE2530.2030203@chown.ath.cx> <53EE2823.7030101@chown.ath.cx> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1712187825==" Return-path: Received: from pygmy.kinoho.net (pygmy.kinoho.net [134.0.27.24]) by gabe.freedesktop.org (Postfix) with ESMTP id 8142E6E190 for ; Fri, 15 Aug 2014 09:52:52 -0700 (PDT) In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Marco Benatto Cc: "Peter.Fruehberger@gmail.com" , =?windows-1252?Q?Christian_K=F6?= =?windows-1252?Q?nig?= , Maling list - DRI developers List-Id: dri-devel@lists.freedesktop.org This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --===============1712187825== Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Tp2fh6garGMaJPjELLDmF592bPqU496LI" This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --Tp2fh6garGMaJPjELLDmF592bPqU496LI Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable On 15.08.2014 18:43, Marco Benatto wrote: > Looking at radeon_uvd source code and AFAIK it's currently disable > since commit dca5086a90c9ec64f4e0de801a659508202b0640 > > Please, correct me if I'm wrong. > No, that's correct. It was just a misunderstanding. > Maybe we can check for which system we're running and take a decision > in what we use (if sysfs entry or UVD streams) to determine which UVD > state can be set, like Grigori suggested. >=20 > What do you think? > Seems like a good idea to me. Systems that have "good" 3D power levels for all UVD power states can choose the power state automatically according to active streams. Maybe the sysfs entry can still be used for overriding the automatic state decision. Systems which have "bad" (slow) 3D power levels in UVD power states should never choose dynamically. They should use the generic UVD power state by default and users can override it with the sysfs entry. Grigori >=20 >=20 >=20 > On Fri, Aug 15, 2014 at 12:32 PM, Grigori Goronzy > wrote: >=20 > On 15.08.2014 17:26, Alex Deucher wrote: > > On Fri, Aug 15, 2014 at 11:20 AM, Grigori Goronzy > > wrote: > >> On 15.08.2014 16:11, Christian K=F6nig wrote: > >>> Hi Marco, > >>> > >>> the problem with an CS ioctl flag is that we sometimes don't > know how > >>> much SCLK/MCLK boost is needed, for example when we do post > processing > >>> in the player using OpenGL and UVD decoding with VDPAU. In this= case > >>> VDPAU don't has the slightest idea how high SCLK/MCLK must be a= nd so > >>> can't give that info to the kernel either. > >>> > >> > >> Maybe it's an acceptable workaround to simply disable dynamic UV= D > state > >> selection in case the UVD states only have a single power level.= That > >> will avoid the performance issues on affected systems, while sti= ll > >> allowing dynamic UVD states on systems that have a saner DPM tab= le > >> setup. I think it is mosly older systems that suffer from this. > >> > > > > That is exactly what we do now. > > >=20 > Is it? In 3.17-wip, dynamic UVD state selection (according to activ= e > streams) is still completely disabled. It will always use the gener= ic > UVD state. In fact wasn't it reverted again because of the performa= nce > issues on some systems? >=20 > Grigori >=20 > > Alex > > > > > >> Best regards > >> Grigori > >> > >>> Regards, > >>> Christian. > >>> > >>> Am 15.08.2014 um 15:21 schrieb Marco Benatto: > >>>> Hey all, > >>>> > >>>> I also had a talk with Alex yesterday about post-processing is= sues > >>>> when using dynamic UVD profiles and a chamge on CS ioctl > >>>> including a flag to let user mode driver tell to the kernel wh= ich > >>>> performance requirement it wants for post processing. A commom= > >>>> point for both discussion is to stablish the default values fo= r > these > >>>> profiles, but probably this ioctl change would be more > impacting/complex > >>>> to implement than a sysfs entry. > >>>> > >>>> If a sysfs entry is anough for now I can handle the code to > create it > >>>> and, with your help, the code to setup the UVD profile request= ed > >>>> through it. > >>>> > >>>> Is there any suggestion? > >>>> > >>>> Thanks all for your help, > >>>> > >>>> > >>>> On Fri, Aug 15, 2014 at 5:48 AM, Christian K=F6nig > >>>> > >= > > wrote: > >>>> > >>>> Hi guys, > >>>> > >>>> to make a long story short every time I watch a movie my l= aptop > >>>> start to heat up because we always select the standard UVD= > power > >>>> profile without actually measuring if that is necessary. > >>>> > >>>> Marco came up with a patch that seems to reliable measure > the fps > >>>> send down to the kernel and so together with knowing the f= rame > >>>> size of the video should allow us to select the right UVD = power > >>>> profile. > >>>> > >>>> The problem is that Alex (unnoticed by me) completely disa= bled > >>>> selecting the UVD profiles because of some issues with adv= anced > >>>> post processing discussed on IRC. The problem seems to be > that the > >>>> lower UVD profiles have a to low SCLK/MCLK to handle the 3= D > load > >>>> that comes with scaling, deinterlacing etc... > >>>> > >>>> I unfortunately don't have time for it, cause this only af= fects > >>>> the hardware generations R600-SI and not the newest one CI= K. So > >>>> could you guys stick together and come up with a solution?= > >>>> Something like a sysfs entry that let's us select the > minimum UVD > >>>> power level allowed? > >>>> > >>>> I think Marco is happy to come up with a patch, we just ne= ed to > >>>> know what's really needed and what should be the default > values. > >>>> I'm happy to review everything that comes out of it, just = don't > >>>> have time to do it myself. > >>>> > >>>> Happy discussion and thanks in advance, > >>>> Christian. > >>>> > >>>> Am 12.08.2014 um 15:05 schrieb Alex Deucher: > >>>> > >>>> On Tue, Aug 12, 2014 at 6:00 AM, Christian K=F6nig > >>>> >> wrote: > >>>> > >>>> Am 11.08.2014 um 16:52 schrieb Alex Deucher: > >>>> > >>>> On Mon, Aug 11, 2014 at 5:08 AM, Christian K=F6= nig > >>>> > >>>> >> wrote: > >>>> > >>>> Am 07.08.2014 um 21:43 schrieb Alex Deuche= r: > >>>> > >>>> On Thu, Aug 7, 2014 at 11:32 AM, > Christian K=F6nig > >>>> > >>>> >> wrote: > >>>> > >>>> Am 07.08.2014 um 16:32 schrieb Ale= x > Deucher: > >>>> > >>>> On Thu, Aug 7, 2014 at 7:33 AM= , > >>>> Christian K=F6nig > >>>> > >>>> >> > >>>> wrote: > >>>> > >>>> From: Marco A Benatto > >>>> =20 > > >>>> =20 > >> > >>>> > >>>> Adding a Frames Per Second= > >>>> estimation logic on UVD ha= ndles > >>>> when it has being used. Th= is > >>>> estimation is per handle b= asis > >>>> and will help on DPM profi= le > >>>> calculation. > >>>> > >>>> v2 (chk): fix timestamp > type, move > >>>> functions around and > >>>> cleanup code= > a bit. > >>>> > >>>> Will this really help much? I= > thought > >>>> the problem was mainly due to > >>>> sclk and mclk for post process= ing. > >>>> > >>>> > >>>> It should at least handle the UVD > side for > >>>> upclocking when you get a > >>>> lot > >>>> of > >>>> streams / fps. And at on my NI the= > patch > >>>> seems to do exactly that. > >>>> > >>>> Switching sclk and mclk for post > >>>> processing is a different task, an= d I > >>>> actually have no idea what to do > with them. > >>>> > >>>> At this point we always choose the > plain UVD > >>>> state anyway so this > >>>> patch would only take effect if we > re-enabled > >>>> the dynamic UVD state > >>>> selection. > >>>> > >>>> > >>>> Hui? I thought we already re-enabled the > dynamic > >>>> UVD state selection, but > >>>> double checking this I found it disabled a= gain. > >>>> > >>>> What was the problem with that? Looks like= I > >>>> somehow missed the > >>>> discussion > >>>> around it. > >>>> > >>>> We did, but after doing so a number of people > >>>> complained about a > >>>> regression on IRC because when apps like xmbc > enabled > >>>> post processing, > >>>> performance went down. > >>>> > >>>> > >>>> That's strange, from my experience the different U= VD > >>>> performance states only > >>>> affect UVDs dclk/vclk, not sclk/mclk. I need to ge= t the > >>>> DPM dumps to > >>>> confirms this. > >>>> > >>>> The sclks and mclks are usually different as well, > especially > >>>> on APUs. > >>>> I can send you some examples. > >>>> > >>>> You not off hand remember who complained on IRC? > Finding > >>>> something in the > >>>> IRC logs is like searching for a needle in a hayst= ack. > >>>> > >>>> I don't remember off hand. I think zgreg was involved= > in some > >>>> of the > >>>> discussions. > >>>> > >>>> Alex > >>>> > >>>> Thanks, > >>>> Christian. > >>>> > >>>> > >>>> Alex > >>>> > >>>> > >>>> Christian. > >>>> > >>>> > >>>> For the post processing, we probably n= eed a > >>>> hint we can > >>>> pass to the driver in the CS ioctl to > denote > >>>> what state we need. > >>>> Although if we did that, this could wo= uld > >>>> largely be moot. That said, > >>>> newer asics support dynamic UVD clocks= > so we > >>>> really only need > >>>> something like that for older asics an= d I > >>>> guess VCE. > >>>> > >>>> Alex > >>>> > >>>> Christian. > >>>> > >>>> > >>>> Alex > >>>> > >>>> Signed-off-by: Marco A Ben= atto > >>>> =20 > > >>>> =20 > >> > >>>> Signed-off-by: Christian K= =F6nig > >>>> > >>>> =20 > = >> > >>>> --- > >>>> > >>>> =20 > drivers/gpu/drm/radeon/radeon.h > >>>> | 10 ++++++ > >>>> > >>>> =20 > drivers/gpu/drm/radeon/radeon_uvd.c > >>>> | 64 > >>>> =20 > +++++++++++++++++++++++++++++++++---- > >>>> 2 files changed, 68 > >>>> insertions(+), 6 deletions= (-) > >>>> > >>>> diff --git > >>>> =20 > a/drivers/gpu/drm/radeon/radeon.h > >>>> =20 > b/drivers/gpu/drm/radeon/radeon.h > >>>> index 9e1732e..e92f6cb 100= 644 > >>>> --- > a/drivers/gpu/drm/radeon/radeon.h > >>>> +++ > b/drivers/gpu/drm/radeon/radeon.h > >>>> @@ -1617,6 +1617,15 @@ int= > >>>> radeon_pm_get_type_index(s= truct > >>>> radeon_device > >>>> *rdev, > >>>> #define > >>>> RADEON_UVD_STACK_SIZE=20 > (1024*1024) > >>>> #define > RADEON_UVD_HEAP_SIZE > >>>> (1024*1024) > >>>> > >>>> +#define > RADEON_UVD_FPS_EVENTS_MAX 8 > >>>> +#define > RADEON_UVD_DEFAULT_FPS 60 > >>>> + > >>>> +struct radeon_uvd_fps { > >>>> + uint64_t =20 > timestamp; > >>>> + uint8_t =20 > event_index; > >>>> + uint8_t > >>>> =20 > events[RADEON_UVD_FPS_EVENTS_MAX]; > >>>> +}; > >>>> + > >>>> struct radeon_uvd { > >>>> struct radeon_= bo > >>>> *vcpu_bo; > >>>> void > >>>> *cpu_addr; > >>>> @@ -1626,6 +1635,7 @@ stru= ct > >>>> radeon_uvd { > >>>> struct drm_fil= e > >>>> *filp[RADEON_MAX_UVD_HAND= LES]; > >>>> unsigned > >>>> =20 > img_size[RADEON_MAX_UVD_HANDLES]; > >>>> struct delayed= _work > >>>> idle_work; > >>>> + struct radeon_uvd_= fps > >>>> =20 > fps_info[RADEON_MAX_UVD_HANDLES]; > >>>> }; > >>>> > >>>> int radeon_uvd_init(s= truct > >>>> radeon_device *rdev); > >>>> diff --git > >>>> =20 > a/drivers/gpu/drm/radeon/radeon_uvd.c > >>>> =20 > b/drivers/gpu/drm/radeon/radeon_uvd.c > >>>> index 6bf55ec..ef5667a 100= 644 > >>>> --- > >>>> =20 > a/drivers/gpu/drm/radeon/radeon_uvd.c > >>>> +++ > >>>> =20 > b/drivers/gpu/drm/radeon/radeon_uvd.c > >>>> @@ -237,6 +237,51 @@ void > >>>> =20 > radeon_uvd_force_into_uvd_segment(struct > >>>> radeon_bo *rbo) > >>>> =20 > rbo->placement.lpfn =3D > >>>> (256 * 1024 * 1024) >> > PAGE_SHIFT; > >>>> } > >>>> > >>>> +static void > >>>> =20 > radeon_uvd_fps_clear_events(struct > >>>> radeon_device *rdev, > >>>> int > >>>> idx) > >>>> +{ > >>>> + struct > radeon_uvd_fps *fps > >>>> =3D &rdev->uvd.fps_info[id= x]; > >>>> + unsigned i; > >>>> + > >>>> + fps->timestamp =3D= > jiffies_64; > >>>> + fps->event_index =3D= 0; > >>>> + for (i =3D 0; i < > >>>> RADEON_UVD_FPS_EVENTS_MAX;= i++) > >>>> + =20 > fps->events[i] =3D 0; > >>>> +} > >>>> + > >>>> +static void > >>>> =20 > radeon_uvd_fps_note_event(struct > >>>> radeon_device *rdev, > >>>> int > >>>> idx) > >>>> +{ > >>>> + struct > radeon_uvd_fps *fps > >>>> =3D &rdev->uvd.fps_info[id= x]; > >>>> + uint64_t timestamp= =3D > >>>> jiffies_64; > >>>> + unsigned rate =3D = 0; > >>>> + > >>>> + uint8_t index =3D > >>>> fps->event_index++; > >>>> + fps->event_index %= =3D > >>>> RADEON_UVD_FPS_EVENTS_MAX;= > >>>> + > >>>> + rate =3D div64_u64= (HZ, > >>>> max(timestamp - fps->times= tamp, > >>>> 1ULL)); > >>>> + > >>>> + fps->timestamp =3D= > timestamp; > >>>> + fps->events[index]= =3D > >>>> min(rate, 120u); > >>>> +} > >>>> + > >>>> +static unsigned > >>>> radeon_uvd_estimate_fps(st= ruct > >>>> radeon_device *rdev, > >>>> int > >>>> idx) > >>>> +{ > >>>> + struct > radeon_uvd_fps *fps > >>>> =3D &rdev->uvd.fps_info[id= x]; > >>>> + unsigned i, valid = =3D 0, > >>>> count =3D 0; > >>>> + > >>>> + for (i =3D 0; i < > >>>> RADEON_UVD_FPS_EVENTS_MAX;= > i++) { > >>>> + /* We shou= ld > >>>> ignore zero values */ > >>>> + if > (fps->events[i] > >>>> !=3D 0) { > >>>> + =20 > count +=3D > >>>> fps->events[i]; > >>>> + =20 > valid++; > >>>> + } > >>>> + } > >>>> + > >>>> + if (valid > 0) > >>>> + return > count / valid; > >>>> + else > >>>> + return > >>>> RADEON_UVD_DEFAULT_FPS; > >>>> +} > >>>> + > >>>> void > >>>> radeon_uvd_free_handles(st= ruct > >>>> radeon_device *rdev, struc= t > >>>> drm_file *filp) > >>>> { > >>>> int i, r; > >>>> @@ -419,8 +464,10 @@ stati= c int > >>>> radeon_uvd_cs_msg(struct > >>>> radeon_cs_parser > >>>> *p, struct radeon_bo *bo, > >>>> > >>>> /* create or > decode, > >>>> validate the handle */ > >>>> for (i =3D 0; = i < > >>>> RADEON_MAX_UVD_HANDLES; ++= i) { > >>>> - if > >>>> =20 > (atomic_read(&p->rdev->uvd.handles[i]) > >>>> =3D=3D handle) > >>>> + if > >>>> =20 > (atomic_read(&p->rdev->uvd.handles[i]) > >>>> =3D=3D handle) > >>>> { > >>>> + > >>>> =20 > radeon_uvd_fps_note_event(p->rdev, i); > >>>> = > return 0; > >>>> + } > >>>> } > >>>> > >>>> /* handle not = found > >>>> try to alloc a new one */ > >>>> @@ -428,6 +475,7 @@ static= int > >>>> radeon_uvd_cs_msg(struct > >>>> radeon_cs_parser > >>>> *p, struct radeon_bo *bo, > >>>> if > >>>> =20 > (!atomic_cmpxchg(&p->rdev->uvd.handles[i], > >>>> 0, > >>>> handle)) { > >>>> > >>>> p->rdev->uvd.filp[i] =3D p= ->filp; > >>>> > >>>> p->rdev->uvd.img_size[i] =3D= > img_size; > >>>> + > >>>> =20 > radeon_uvd_fps_clear_events(p->rdev, > >>>> i); > >>>> = > return 0; > >>>> } > >>>> } > >>>> @@ -763,7 +811,7 @@ int > >>>> =20 > radeon_uvd_get_destroy_msg(struct > >>>> radeon_device > >>>> *rdev, int ring, > >>>> static void > >>>> radeon_uvd_count_handles(s= truct > >>>> radeon_device *rdev, > >>>> > >>>> unsigned *sd, > unsigned *hd) > >>>> { > >>>> - unsigned i; > >>>> + unsigned i, > fps_rate =3D 0; > >>>> > >>>> *sd =3D 0; > >>>> *hd =3D 0; > >>>> @@ -772,10 +820,13 @@ > static void > >>>> radeon_uvd_count_handles(s= truct > >>>> radeon_device *rdev, > >>>> if > >>>> =20 > (!atomic_read(&rdev->uvd.handles[i])) > >>>> = > continue; > >>>> > >>>> - if > >>>> (rdev->uvd.img_size[i] >=3D= > 720*576) > >>>> - =20 > ++(*hd); > >>>> - else > >>>> - =20 > ++(*sd); > >>>> + fps_rate =3D= > >>>> =20 > radeon_uvd_estimate_fps(rdev, i); > >>>> + > >>>> + if > >>>> (rdev->uvd.img_size[i] >=3D= > 720*576) { > >>>> + =20 > (*hd) +=3D > >>>> fps_rate > 30 ? 1 : 2; > >>>> + } else { > >>>> + =20 > (*sd) +=3D > >>>> fps_rate > 30 ? 1 : 2; > >>>> + } > >>>> } > >>>> } > >>>> > >>>> @@ -805,6 +856,7 @@ void > >>>> radeon_uvd_note_usage(stru= ct > >>>> radeon_device > >>>> *rdev) > >>>> set_clocks &=3D= > >>>> =20 > schedule_delayed_work(&rdev->uvd.idle_work, > >>>> > >>>> =20 > msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS)); > >>>> > >>>> + > >>>> if > >>>> ((rdev->pm.pm_method =3D=3D= > >>>> PM_METHOD_DPM) && > >>>> rdev->pm.dpm_enabled) { > >>>> =20 > unsigned hd =3D > >>>> 0, sd =3D 0; > >>>> > >>>> radeon_uvd_count_handles(r= dev, > >>>> &sd, &hd); > >>>> -- > >>>> 1.9.1 > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> -- > >>>> Marco Antonio Benatto > >>>> Linux user ID:#506236 > >>> > >> > >> >=20 >=20 >=20 >=20 >=20 > --=20 > Marco Antonio Benatto > Linux user ID:#506236 --Tp2fh6garGMaJPjELLDmF592bPqU496LI Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJT7jrgAAoJEOpmYE8Sf8H2eOYQAJ0KjPUtxV/JQUj5vtQ94e7K 3iK39zAZUNf+M3M0asEf3FK6vXgKFIOwjDf6U3oSOEs0TLONbcH+0Wq1HXoqe4nn 2AhK1g+eD+rrepZlIrFPHqlUYCmmeLgODu+waJHzufBdrwTSosY2t8QbR9RCCxqZ BE5FavpUTEQiosVLy01L2xZzIdtKConIBN+kaN+XCJkxN9fIl+dl9zCw9bRelpnH xfgl6IQmyznISxc1zfm8CeQunpGxJeDJ56b5TZ9EXLPj+VobomHHFrUsrfKZZk1Z p550lWlTYcdqFupLJc+yV+jgzWZ4QjN/pJMw9JEkReXTekWgBDj99WfDiHHDT3ct 1KN+8gwDF+IUO+y57aTcYwZOXE+b6CV5MNwSaauN5IG4l31do59RCssjiVdaVUji 1QsfUVzgLlhJR4OLGKZP2Cyz8PwZOHK4c/2kXdhk65fddnl8bv0u4QE4FRHuH5np neBk7BnmUPFAKYo+xU0TULlHHWa8tfKaULPiHQgWYAIdfxR9BTxpme3sIKVkSr0s 6513aKdyExyOdDKCwn4MLYZjsaR0cz9tBsto2HpAfQm+Gn6gieDKLLJb/OipmOxh yWT8rrmKbYkfRcd3KtgG6F/jWpT7Plh5oMrRn75NvPbhY0P9HhwrnbxhcFbROD2b kQpF/as3l6ZkMmRNeI3/ =p5ho -----END PGP SIGNATURE----- --Tp2fh6garGMaJPjELLDmF592bPqU496LI-- --===============1712187825== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel --===============1712187825==--