From: Alex Williamson <alex.williamson@redhat.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>,
Alex Deucher <alexander.deucher@amd.com>,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
Maxim Levitsky <mlevitsk@redhat.com>
Subject: Re: Couple of issues with amdgpu on my WX4100
Date: Mon, 4 Jan 2021 11:43:35 -0700
Message-ID: <20210104114335.3f87ff27@omen.home>
In-Reply-To: <ea539e21-aed3-8f23-74b2-5a214fa9fdb2@amd.com>

On Mon, 4 Jan 2021 18:39:33 +0100
Christian König <christian.koenig@amd.com> wrote:

> On 04.01.21 at 17:45, Alex Williamson wrote:
> > On Mon, 4 Jan 2021 12:34:34 +0100
> > Christian König <christian.koenig@amd.com> wrote:
> >
> >> Hi Maxim,
> >>
> >> I can't help with the display related stuff. Probably the best approach to
> >> get this fixed would be to open up a bug report for this on FDO.
> >>
> >> But I'm the one who implemented the resizeable BAR support and your
> >> analysis of the problem sounds about correct to me.
> >>
> >> The reason why this works on Linux is most likely because we restore the
> >> BAR size on resume (and maybe during initial boot as well).
> >>
> >> See this patch for reference:
> >>
> >> commit d3252ace0bc652a1a244455556b6a549f969bf99
> >> Author: Christian König <ckoenig.leichtzumerken@gmail.com>
> >> Date: Fri Jun 29 19:54:55 2018 -0500
> >>
> >> PCI: Restore resized BAR state on resume
> >>
> >> Resize BARs after resume to the expected size again.
> >>
> >> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199959
> >> Fixes: d6895ad39f3b ("drm/amdgpu: resize VRAM BAR for CPU access v6")
> >> Fixes: 276b738deb5b ("PCI: Add resizable BAR infrastructure")
> >> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> >> CC: stable@vger.kernel.org # v4.15+
> >>
> >>
> >> It should be trivial to add this to the reset module as well. Most
> >> likely even completely vendor independent since I'm not sure what a bus
> >> reset will do to this configuration and restoring it all the time should
> >> be the most defensive approach.
> > Hmm, this should already be used by the bus/slot reset path:
> >
> > pci_bus_restore_locked()/pci_slot_restore_locked()
> >  pci_dev_restore()
> >   pci_restore_state()
> >    pci_restore_rebar_state()
> >
> > VFIO support for resizeable BARs has been on my todo list, but I don't
> > have access to any systems that have both a capable device and >4G
> > decoding enabled in the BIOS. If we have a consistent view of the BAR
> > size after the BARs are expanded, I'm not sure why it doesn't just
> > work. FWIW, QEMU currently hides the REBAR capability from the guest
> > because the kernel driver doesn't support emulation through config
> > space (ie. it's read-only, which the spec doesn't support).
>
> In this case the guest shouldn't be able to change the config at all and
> I have no idea what's going wrong here.
>
> > AIUI, resource allocation can fail when enabling REBAR support, which
> > is a problem if the failure occurs on the host but not the guest since
> > we have no means via the hardware protocol to expose such a condition.
> > Therefore the model I was considering for vfio-pci would be to simply
> > pre-enable REBAR at the max size.
>
> That's a rather bad idea. See our GPUs for example return way more than
> they actually need.
>
> E.g. a Polaris usually returns 4GiB even when only 2GiB are installed,
> because 4GiB is just the maximum amount of RAM you can put together with
> the ASIC on a board.

Would the driver fail or misbehave if the BAR is sized larger than the
amount of memory on the card or is memory size determined independently
of BAR size?

> Some devices even return a mask of all 1 even when they need only 2MiB,
> resulting in nearly 1TiB of wasted address space with this approach.

Ugh. I'm afraid to ask why a device with a 2MiB BAR would implement a
REBAR capability, but I guess we really can't make any assumptions
about the breadth of SKUs that an ASIC might support (or the sanity of
its designers).

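To make the size encoding concrete, here is a minimal userspace sketch
of how the supported-sizes bitmap decodes. It assumes the layout where
bits 4..23 of the REBAR capability register advertise sizes from 1MiB
(bit 4) doubling up to 512GiB (bit 23). The macro and function names
are made up for illustration and are not the kernel's helpers.

```c
#include <stdint.h>

/* Assumed layout: bits 4..23 hold one bit per supported size. */
#define REBAR_CAP_SIZES_MASK  0x00FFFFF0u
#define REBAR_CAP_SIZES_SHIFT 4

/* Size index n encodes a BAR of 2^(n + 20) bytes, so index 0 is 1MiB. */
static uint64_t rebar_size_to_bytes(unsigned int size)
{
	return 1ULL << (size + 20);
}

/*
 * Return the largest size index advertised in a capability register
 * value, or -1 if no size bit is set. (Hypothetical helper.)
 */
static int rebar_max_size(uint32_t cap)
{
	uint32_t sizes = (cap & REBAR_CAP_SIZES_MASK) >> REBAR_CAP_SIZES_SHIFT;
	int size = -1;

	while (sizes) {
		size++;
		sizes >>= 1;
	}
	return size;
}
```

Under that assumption, a device that only needs 2MiB would ideally
advertise just size index 1, while a register with every size bit set
makes index 19 (512GiB) the maximum, which is what "pre-enable at the
max size" would pick.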
We could probe to determine the maximum size the host can support and
potentially emulate the capability to remove sizes that we can't
allocate. But the capability protocol gives the device no way to
reject a size it advertises as supported, so I'm nervous about how we
can guarantee the resources are available when the user re-configures
the device. That might mean we'd need to reserve the resources, up to
what the host can support, regardless of what the device can actually
use. I'm not sure how else to know how much to reserve without
device-specific code in vfio-pci. Thanks,

Alex
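
For illustration only, the "emulate the capability to remove sizes
that we can't allocate" idea could look roughly like the sketch below.
It assumes the supported-sizes bitmap sits in bits 4..23 of the
capability register and that the host limit has already been probed
somehow. rebar_cap_limit() and host_max_size are hypothetical names,
not existing vfio-pci code.

```c
#include <stdint.h>

/* Assumed layout: bits 4..23 hold one bit per supported size. */
#define REBAR_CAP_SIZES_MASK  0x00FFFFF0u
#define REBAR_CAP_SIZES_SHIFT 4
#define REBAR_NUM_SIZES       20

/*
 * Return a copy of the capability register value with every size
 * larger than host_max_size cleared. host_max_size is a size index,
 * where index n means 2^(n + 20) bytes.
 */
static uint32_t rebar_cap_limit(uint32_t cap, unsigned int host_max_size)
{
	uint32_t sizes = (cap & REBAR_CAP_SIZES_MASK) >> REBAR_CAP_SIZES_SHIFT;
	uint32_t allowed;

	if (host_max_size >= REBAR_NUM_SIZES - 1)
		allowed = (1u << REBAR_NUM_SIZES) - 1;     /* keep everything */
	else
		allowed = (1u << (host_max_size + 1)) - 1; /* indexes 0..max */

	sizes &= allowed;

	/* Preserve the bits outside the size bitmap unchanged. */
	return (cap & ~REBAR_CAP_SIZES_MASK) | (sizes << REBAR_CAP_SIZES_SHIFT);
}
```

This only trims what the user sees. It does nothing to guarantee that
the remaining sizes can actually be allocated later, which is the
reservation problem described above.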