AMD-GFX Archive on lore.kernel.org
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Marek Olšák" <maraeo@gmail.com>
Cc: amd-gfx mailing list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM
Date: Mon, 23 Jan 2023 10:31:28 +0100
Message-ID: <4992933e-ad45-5f7a-b7af-39c6d0948321@gmail.com>
In-Reply-To: <CAAxE2A6JcREmKKmh1n0xSgkOZq77kpnzC-27-srunLKduyAwiw@mail.gmail.com>


Let's do this as a value in fdinfo.

This way we can easily extend whatever the kernel wants to display as 
statistics in the userspace HUD.
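
As a rough sketch of the userspace side, a HUD or tool could read such values by parsing the "key: value" text of /proc/<pid>/fdinfo/<fd>. The example below parses a sample string rather than a real file. The `amd-evicted-vram` key is an assumption for illustration (nothing in this thread defines it), and `parse_fdinfo` is a hypothetical helper, not an existing API.

```python
# Minimal sketch: parse fdinfo-style "key: value" text into a dict.
# Values of the form "<number> [KiB|MiB]" are converted to bytes;
# everything else is kept as a string.

_UNITS = {"": 1, "KiB": 1024, "MiB": 1024 ** 2}


def parse_fdinfo(text):
    stats = {}
    for line in text.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key = key.strip()
        value = value.strip()
        parts = value.split()
        is_number = len(parts) in (1, 2) and parts[0].isdigit()
        unit = parts[1] if len(parts) == 2 else ""
        if is_number and unit in _UNITS:
            stats[key] = int(parts[0]) * _UNITS[unit]
        else:
            stats[key] = value
    return stats


# Sample input; "amd-evicted-vram" is a hypothetical key name.
SAMPLE = (
    "drm-driver:\tamdgpu\n"
    "drm-memory-vram:\t2048 KiB\n"
    "amd-evicted-vram:\t512 KiB\n"
)

stats = parse_fdinfo(SAMPLE)
print(stats)
```

New statistics would then show up as additional keys without any change to the parser, which is the extensibility argument made above.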

Regards,
Christian.

On 21.01.23 at 01:45, Marek Olšák wrote:
> We badly need a way to query evicted memory usage. It's essential for 
> investigating performance problems and it uncovered the buddy 
> allocator disaster. Please either suggest an alternative, suggest 
> changes, or review. We need it ASAP.
>
> Thanks,
> Marek
>
> On Tue, Jan 10, 2023 at 11:55 AM Marek Olšák <maraeo@gmail.com> wrote:
>
>     On Tue, Jan 10, 2023 at 11:23 AM Christian König
>     <ckoenig.leichtzumerken@gmail.com> wrote:
>
>         On 10.01.23 at 16:28, Marek Olšák wrote:
>>         On Wed, Jan 4, 2023 at 9:51 AM Christian König
>>         <ckoenig.leichtzumerken@gmail.com> wrote:
>>
>>             On 04.01.23 at 00:08, Marek Olšák wrote:
>>>             I see about the access now, but did you even look at the
>>>             patch?
>>
>>             I did look at the patch, but I haven't fully understood
>>             yet what you are trying to do here.
>>
>>
>>         First and foremost, it returns the evicted size of VRAM and
>>         visible VRAM, and returns visible VRAM usage. It should be
>>         obvious which stat includes the size of another.
>>
>>
>>>             Because what the patch does isn't even exposed to common
>>>             drm code, such as the preferred domain and visible VRAM
>>>             placement, so it can't be in fdinfo right now.
>>>
>>>             Or do you even know what fdinfo contains? Because it
>>>             contains nothing useful. It only has VRAM and GTT usage,
>>>             which we already have in the INFO ioctl, so it has
>>>             nothing that we need. We mainly need the eviction
>>>             information and visible VRAM information now. Everything
>>>             else is a bonus.
>>
>>             Well the main question is what are you trying to get from
>>             that information? The eviction list for example is
>>             completely meaningless to userspace, that stuff is only
>>             temporary and will be cleared on the next CS again.
>>
>>
>>         I don't know what you mean. The returned eviction stats look
>>         correct and are stable (they don't change much). You can
>>         suggest changes if you think some numbers are not reported
>>         correctly.
>>
>>
>>             What we could expose is the VRAM over-commit value, i.e.
>>             how much of the BOs which were supposed to be in VRAM is in
>>             GTT now. I think that's what you are looking for here, right?
>>
>>
>>         The VRAM overcommit value is "evicted_vram".
>>
>>
>>>             Also, it's undesirable to open and parse a text file if
>>>             we can just call an ioctl.
>>
>>             Well I see the reasoning for that, but I also see why
>>             other drivers do a lot of the stuff we have as IOCTL as
>>             separate files in sysfs, fdinfo or debugfs.
>>
>>             Especially repeating all the static information which
>>             was already available under sysfs in the INFO IOCTL was
>>             a design mistake as far as I can see. Just compare what
>>             AMDGPU and the KFD code are doing to what for example i915
>>             is doing.
>>
>>             Same for things like debug information about a process.
>>             The fdinfo stuff can be queried from external tools (gdb,
>>             gputop, umr etc...) as well, which makes that interface
>>             preferable.
>>
>>
>>         Nothing uses fdinfo in Mesa. No driver uses sysfs in Mesa
>>         except drm shims, noop drivers, and Intel for perf metrics.
>>         sysfs itself is an unusable mess for the PCIe query and is
>>         missing information.
>>
>>         I'm not against exposing more stuff through sysfs and fdinfo
>>         for tools, but I don't see any reason why drivers should use
>>         it (other than for slowing down queries and initialization).
>
>         That's what I'm asking: Is this for some tool or to make some
>         driver decision based on it?
>
>         If you just want the numbers for displaying them then I think
>         it would be better to put this into fdinfo together with the
>         other existing stuff there.
>
>
>         If you want to make allocation decisions based on this then we
>         should have that as an IOCTL or, even better, as an mmap() page
>         shared between kernel and userspace. But in this case I would
>         also calculate the numbers completely differently.
>
>         See we have at least the following things in the kernel:
>         1. The eviction list in the VM.
>             Those are the BOs which are currently evicted and which we
>         try to move back in on the next CS.
>
>         2. The VRAM over commit value.
>             In other words how much more VRAM than available has the
>         application tried to allocate?
>
>         3. The visible VRAM usage by this application.
>
>         The end goal is that the eviction list will go away, e.g. we
>         will always have stable allocations based on allocations of
>         other applications and not constantly swap things in and out.
>
>         When you now expose the eviction list to userspace we will be
>         stuck with this interface forever.
>
>
>     It's for the GALLIUM HUD.
>
>     The only missing thing is the size of all evicted VRAM
>     allocations, and the size of all evicted visible VRAM allocations.
>
>     1. No list is exposed. Only sums of buffer sizes are exposed.
>     Also, the eviction list has no meaning here. All lists are treated
>     equally, and mem_type is compared with preferred_domains to
>     determine where buffers are and where they should be.
>
>     2. I'm not interested in the overcommit value. I'm only interested
>     in knowing the number of bytes of evicted VRAM right now. It can
>     be as variable as the CPU load, but in practice it shouldn't be
>     because PCIe doesn't have the bandwidth to move things quickly.
>
>     3. Yes, that's true.
>
>     Marek
>
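
The comparison described in point 1 of the quoted message (sum buffer sizes, compare `mem_type` with `preferred_domains`) can be sketched as follows. This is an illustrative model only, not the patch's code: `evicted_vram` is the name used in the thread, while the `Buffer` class, the `cpu_visible` flag, and the other two stat names are assumptions made for this example. It also assumes the evicted visible VRAM size is counted as part of `evicted_vram`.

```python
from dataclasses import dataclass

VRAM = "vram"
GTT = "gtt"


@dataclass
class Buffer:
    size: int                  # size in bytes
    preferred_domains: set     # where the buffer should be
    mem_type: str              # where the buffer currently is
    cpu_visible: bool = False  # hypothetical: meant for CPU-visible VRAM


def vm_stats(buffers):
    """Sum buffer sizes per category; no list of buffers is exposed."""
    stats = {"evicted_vram": 0, "evicted_visible_vram": 0, "visible_vram": 0}
    for bo in buffers:
        wants_vram = VRAM in bo.preferred_domains
        if wants_vram and bo.mem_type != VRAM:
            # Should be in VRAM but currently lives elsewhere.
            stats["evicted_vram"] += bo.size
            if bo.cpu_visible:
                # Assumed to be a subset of evicted_vram.
                stats["evicted_visible_vram"] += bo.size
        elif bo.mem_type == VRAM and bo.cpu_visible:
            stats["visible_vram"] += bo.size
    return stats


EXAMPLE = [
    Buffer(64, {VRAM}, VRAM),                   # resident, not counted
    Buffer(32, {VRAM}, GTT),                    # evicted
    Buffer(16, {VRAM}, GTT, cpu_visible=True),  # evicted, visible
    Buffer(8, {VRAM}, VRAM, cpu_visible=True),  # resident, visible
    Buffer(4, {GTT}, GTT),                      # prefers GTT, not counted
]

print(vm_stats(EXAMPLE))
```

Every buffer is treated the same way regardless of which internal list it is on, so only the three sums leave the function.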
