From: John Brooks <john-xq/Ko7C6e2Bl57MIdRCFDg@public.gmane.org>
To: "Christian König" <deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
Cc: "Marek Olšák" <maraeo-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
"David Airlie" <airlied-cv59FeDIM0c@public.gmane.org>,
"Felix Kuehling" <felix.kuehling-5C7GfCeVMHo@public.gmane.org>,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
"Michel Dänzer" <michel-otUistvHUpPR7s880joybQ@public.gmane.org>
Subject: Re: [PATCH 0/9] Visible VRAM Management Improvements
Date: Sat, 24 Jun 2017 21:50:05 +0000
Message-ID: <20170624215005.GA16913@kitsune.fastquake.com>
In-Reply-To: <644cf9b4-e22b-eab1-a505-b0e1f9850f82-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
On Sat, Jun 24, 2017 at 08:20:22PM +0200, Christian König wrote:
> On 24.06.2017 at 01:16, John Brooks wrote:
> >On Fri, Jun 23, 2017 at 05:02:58PM -0400, Felix Kuehling wrote:
> >>Hi John,
> >>
> >>I haven't read your patches. Just a question based on the cover letter.
> >>
> >>I understand that visible VRAM is the biggest pain point. But could the
> >>same reasoning make sense for invisible VRAM? That is, doing all the
> >>migrations to VRAM in a workqueue?
> >>
> >>Regards,
> >> Felix
> >>
> >I don't see why not. In theory, all non-essential buffer moves could be done
> >this way, and it would be relatively trivial to extend it to that.
> >
> >But I wanted to limit the scope of my changes, at least for this series.
> >Testing takes a long time and I wanted to focus those testing efforts as much
> >as possible, produce something well-tested (I hope), and get feedback on this
> >limited application of the concept before expanding its reach.
>
> Yeah, sorry I have to say that but the whole approach is utterly nonsense.
>
> What happens is that the delayed BO can only be moved AFTER the command
> submission which wants it to be in VRAM.
>
> So you use the BO in a CS and *then* move it to where the CS wants it to be,
> no matter if the BO is then needed there or not.
>
> Regards,
> Christian.
>
I'm aware of the effect it has. The BO won't be in VRAM for the current command
submission, but it'll be there for a future one. If a BO is used at a given
time, it's likely to be used again soon, in which case you'll come out ahead on
latency even if the GPU has to read it from GTT a few times. In any case, it's
never going to hurt as much as coming to a full stop to wait for a worst-case
BO move that needs a lot of evictions.
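
To make that trade-off concrete, here is a toy model. It is only a sketch, not
driver code: the function names and every number in it are made up for
illustration, and it assumes the GTT read penalty is a fixed cost per
submission.

```python
def blocking_cost_ms(move_stall_ms: float) -> float:
    """Immediate move: the CS waits for the move (and any evictions) to finish."""
    return move_stall_ms


def deferred_cost_ms(gtt_penalty_ms: float, submissions_before_move: int) -> float:
    """Delayed move: the CS proceeds at once, but the BO is read from GTT
    for every submission that happens before the worker has moved it."""
    return gtt_penalty_ms * submissions_before_move


def deferral_wins(move_stall_ms: float, gtt_penalty_ms: float,
                  submissions_before_move: int) -> bool:
    """True when the accumulated GTT penalty is smaller than the one-off stall."""
    return (deferred_cost_ms(gtt_penalty_ms, submissions_before_move)
            < blocking_cost_ms(move_stall_ms))


# Hypothetical values: a 20 ms eviction-heavy move vs. three slower GTT reads.
print(deferral_wins(move_stall_ms=20.0, gtt_penalty_ms=0.5,
                    submissions_before_move=3))
```

Under this model the delayed move only loses when the stall it avoids is
cheaper than the GTT reads it causes, which is not the worst-case eviction
scenario described above.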

Feel free to correct my understanding; you'd certainly know any of this better
than I do. But my tests indicate that immediate forced moves during CS cause
stalls, and that the average framerate with delayed moves is almost the same
(within ~2%) as with immediate ones, and about 9% higher than with no forced
moves during CS at all.

DiRT Rally average framerates:

With the whole patch set (n=3):
    89.56

Without it (drm-next-4.13 5ac55629d6b3fcde69f46aa772c6e83be0bdcbbf) (n=3):
    91.16 (+stalls)

Patches 1 and 3 only, and with GTT set as the only busy placement for
CPU_ACCESS_REQUIRED BOs in amdgpu_cs_bo_validate (n=3):
    82.15
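
For reference, the relative differences between those averages can be checked
with a few lines of Python. This is just arithmetic on the three figures
above; the variable names are mine.

```python
def percent_change(new: float, base: float) -> float:
    """Relative change of `new` with respect to `base`, in percent."""
    return (new - base) / base * 100.0


full_series = 89.56      # whole patch set (delayed moves)
baseline = 91.16         # drm-next-4.13 without the series (immediate moves)
no_forced_moves = 82.15  # patches 1 and 3 only, GTT as the only busy placement

vs_baseline = percent_change(full_series, baseline)
vs_no_forced = percent_change(full_series, no_forced_moves)

print(f"delayed vs. immediate moves: {vs_baseline:+.2f}%")
print(f"delayed vs. no forced moves: {vs_no_forced:+.2f}%")
```

That works out to a bit under 2% below the baseline and about 9% above the
no-forced-moves case.
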
John
>
> >
> >John
> >
> >>On 17-06-23 01:39 PM, John Brooks wrote:
> >>>This patch series is intended to improve performance when limited CPU-visible
> >>>VRAM is under pressure.
> >>>
> >>>Moving BOs into visible VRAM is essentially a housekeeping task. It's faster to
> >>>access them in VRAM than GTT, but it isn't a hard requirement for them to be in
> >>>VRAM. As such, it is unnecessary to spend valuable time blocking on this in the
> >>>page fault handler or during command submission. Doing so translates directly
> >>>into a longer frame time (ergo stalls and stuttering).
> >>>
> >>>The problem worsens when attempting to move BOs into visible VRAM when it is
> >>>full. This takes much longer than a simple move because other BOs have to be
> >>>evicted, which involves finding and then moving potentially hundreds of other
> >>>BOs, which is very time consuming. In the case of limited visible VRAM, it's
> >>>important to do this sometime to keep the contents of visible VRAM fresh, but
> >>>it does not need to be a blocking operation. If visible VRAM is full, the BO
> >>>can be read from GTT in the meantime and the BO can be moved to VRAM later.
> >>>
> >>>Thus, I have made it so that neither the command submission code nor page fault
> >>>handler spends time evicting BOs from visible VRAM, and instead this is
> >>>deferred to a workqueue function that's queued when CS requests BOs flagged
> >>>AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED.
> >>>
> >>>Speaking of CPU_ACCESS_REQUIRED, I've changed the handling of that flag so that
> >>>the kernel driver can clear it later even if it was set by userspace. This is
> >>>because the userspace graphics library can't know whether the application
> >>>really needs it to be CPU_ACCESS_REQUIRED forever. The kernel driver can't know
> >>>either, but it does know when page faults occur, and if a BO doesn't appear to
> >>>have any page faults when it's moved somewhere inaccessible, the flag can be
> >>>removed and it doesn't have to take up space in CPU-visible memory anymore.
> >>>This change was based on IRC discussions with Michel.
> >>>
> >>>Patch 7 fixes a problem with BO moverate throttling that causes visible VRAM
> >>>moves to not be throttled if total VRAM isn't full enough.
> >>>
> >>>I've also added a vis_vramlimit module parameter for debugging purposes. It's
> >>>similar to the vramlimit parameter except it limits only visible VRAM.
> >>>
> >>>I have tested this patch set with the two games I know to be affected by
> >>>visible VRAM pressure: DiRT Rally and Dying Light. It practically eliminates
> >>>eviction-related stuttering in DiRT Rally as well as very low performance if
> >>>visible VRAM is limited to 64MB. It also fixes severely low framerates that
> >>>occurred in some areas of Dying Light. All my testing was done with an R9 290
> >>>with 4GB of visible VRAM with an Intel i7 4790.
> >>>
> >>>--
> >>>John Brooks (Frogging101)
> >>>
> >>>_______________________________________________
> >>>amd-gfx mailing list
> >>>amd-gfx@lists.freedesktop.org
> >>>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>