From: James Jones <jajones@nvidia.com>
To: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Cc: linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
Mary Guillemard <mary@mary.zone>,
Faith Ekstrand <faith.ekstrand@collabora.com>,
Lyude Paul <lyude@redhat.com>, Danilo Krummrich <dakr@kernel.org>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
nouveau@lists.freedesktop.org
Subject: Re: [PATCH v4 2/5] drm/nouveau/uvmm: Allow larger pages
Date: Mon, 3 Nov 2025 17:12:09 -0800
Message-ID: <15cc6de2-d1af-4504-a08d-1278329a2113@nvidia.com>
In-Reply-To: <CAA+WOBtmbPHigscFQCFgDo=9WSM6V-JMXGCO7orP=01XOqTPHQ@mail.gmail.com>

On 11/3/25 15:53, Mohamed Ahmed wrote:
> Thanks a lot for the shout out! Looking more at things, the logic here
> is actually redundant. It was originally copied directly from the bo
> allocation code to stay on the safer side (the idea back then was to
> make the bo and vmm sides match exactly). We aren't at risk of having
> an aligned address that lands in the wrong memory type, because the bo
> allocation code (nouveau_bo.c:321) forces anything with the GART flag
> to a 4K page size; anything that gets a larger page size is exclusively
> VRAM. Additionally, objects marked VRAM-only currently aren't evicted
> to host memory except under high memory pressure, and in that case the
> context is paused until the objects in question are paged back in, so
> we don't have to worry about memory placement there either.
>
> The memory placement check in the vmm code could be removed, but I am
> leaning towards leaving it as-is, just to stay on the safer side. It is
> also worth keeping for the future: one of the things we want to
> investigate is the full set of memory placement rules, because the
> "only 4K is allowed for host memory" limit that nouveau imposes is a
> source of many pains in userspace (originally thought to be a HW
> limitation, but it seems it's actually not), and having the checks on
> both the bo and vmm paths would be a helpful starting point for that.
OK, thanks for the explanation. I'm fine with leaving the check as-is in
that case.
Given that, for the series:
Reviewed-by: James Jones <jajones@nvidia.com>
Thanks,
-James
> Thanks a lot again,
> Mohamed
>
> On Fri, Oct 31, 2025 at 7:01 PM James Jones <jajones@nvidia.com> wrote:
>>
>> On 10/31/25 03:49, Mohamed Ahmed wrote:
>>> From: Mary Guillemard <mary@mary.zone>
>>>
>>> Now that everything in UVMM knows about the variable page shift, we can
>>> select larger values.
>>>
>>> The proposed approach relies on nouveau_bo::page unless it would cause
>>> alignment issues, in which case we fall back to searching for an
>>> appropriate shift.
>>>
>>> Signed-off-by: Mary Guillemard <mary@mary.zone>
>>> Co-developed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
>>> Signed-off-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
>>> ---
>>> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 60 +++++++++++++++++++++++++-
>>> 1 file changed, 58 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>> index 2cd0835b05e8..ab8933b88337 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>> @@ -454,6 +454,62 @@ op_unmap_prepare_unwind(struct drm_gpuva *va)
>>> drm_gpuva_insert(va->vm, va);
>>> }
>>>
>>> +static bool
>>> +op_map_aligned_to_page_shift(const struct drm_gpuva_op_map *op, u8 page_shift)
>>> +{
>>> + u64 non_page_bits = (1ULL << page_shift) - 1;
>>> +
>>> + return (op->va.addr & non_page_bits) == 0 &&
>>> + (op->va.range & non_page_bits) == 0 &&
>>> + (op->gem.offset & non_page_bits) == 0;
>>> +}
>>> +
>>> +static u8
>>> +select_page_shift(struct nouveau_uvmm *uvmm, struct drm_gpuva_op_map *op)
>>> +{
>>> + struct nouveau_bo *nvbo = nouveau_gem_object(op->gem.obj);
>>> +
>>> + /* nouveau_bo_fixup_align() guarantees that the page size will be aligned
>>> + * for most cases, but it can't handle cases where userspace allocates with
>>> + * a size and then binds with a smaller granularity. So in order to avoid
>>> + * breaking old userspace, we need to ensure that the VA is actually
>>> + * aligned before using it, and if it isn't, then we downgrade to the first
>>> + * granularity that will fit, which is optimal from a correctness and
>>> + * performance perspective.
>>> + */
>>> + if (op_map_aligned_to_page_shift(op, nvbo->page))
>>> + return nvbo->page;
>>> +
>>> + struct nouveau_mem *mem = nouveau_mem(nvbo->bo.resource);
>>> + struct nvif_vmm *vmm = &uvmm->vmm.vmm;
>>> + int i;
>>> +
>>> + /* If the given granularity doesn't fit, let's find one that will fit. */
>>> + for (i = 0; i < vmm->page_nr; i++) {
>>> + /* Ignore anything that is bigger or identical to the BO preference. */
>>> + if (vmm->page[i].shift >= nvbo->page)
>>> + continue;
>>> +
>>> + /* Skip incompatible domains. */
>>> + if ((mem->mem.type & NVIF_MEM_VRAM) && !vmm->page[i].vram)
>>> + continue;
>>> + if ((mem->mem.type & NVIF_MEM_HOST) &&
>>> + (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
>>> + continue;
>>
>> This logic doesn't seem correct. I'm not sure why there's a need to
>> limit the page size on the host memory type, but assuming there is due
>> to nouveau architecture or HW limitations I'm not aware of, it should be
>> applied universally, not just when falling back due to misaligned
>> addresses. You can get lucky and have aligned addresses regardless of
>> the target page size. Hence, this check would need to precede the above
>> early-out for the case where op_map_aligned_to_page_shift() succeeds.
>>
>> Thanks,
>> -James
>>
>>> + /* If it fits, return the proposed shift. */
>>> + if (op_map_aligned_to_page_shift(op, vmm->page[i].shift))
>>> + return vmm->page[i].shift;
>>> + }
>>> +
>>> + /* If we get here then nothing can reconcile the requirements. This should never
>>> + * happen.
>>> + */
>>> + WARN_ON(1);
>>> +
>>> + return PAGE_SHIFT;
>>> +}
>>> +
>>> static void
>>> nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
>>> struct nouveau_uvma_prealloc *new,
>>> @@ -506,7 +562,7 @@ nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
>>> if (vmm_get_range)
>>> nouveau_uvmm_vmm_put(uvmm, vmm_get_start,
>>> vmm_get_range,
>>> - PAGE_SHIFT);
>>> + select_page_shift(uvmm, &op->map));
>>> break;
>>> }
>>> case DRM_GPUVA_OP_REMAP: {
>>> @@ -599,7 +655,7 @@ op_map_prepare(struct nouveau_uvmm *uvmm,
>>>
>>> uvma->region = args->region;
>>> uvma->kind = args->kind;
>>> - uvma->page_shift = PAGE_SHIFT;
>>> + uvma->page_shift = select_page_shift(uvmm, op);
>>>
>>> drm_gpuva_map(&uvmm->base, &uvma->va, op);
>>>
>>
Thread overview: 13+ messages
2025-10-31 10:49 [PATCH v4 0/5] drm/nouveau: Enable variable page sizes and compression Mohamed Ahmed
2025-10-31 10:49 ` [PATCH v4 1/5] drm/nouveau/uvmm: Prepare for larger pages Mohamed Ahmed
2025-10-31 10:49 ` [PATCH v4 2/5] drm/nouveau/uvmm: Allow " Mohamed Ahmed
2025-10-31 17:01 ` James Jones
2025-11-03 23:53 ` Mohamed Ahmed
2025-11-04 1:12 ` James Jones [this message]
2025-11-05 22:50 ` Danilo Krummrich
2025-11-08 19:30 ` Mary Guillemard
2025-11-05 22:51 ` Danilo Krummrich
2025-10-31 10:49 ` [PATCH v4 3/5] drm/nouveau/mmu/gp100: Remove unused/broken support for compression Mohamed Ahmed
2025-10-31 10:49 ` [PATCH v4 4/5] drm/nouveau/mmu/tu102: Add support for compressed kinds Mohamed Ahmed
2025-10-31 10:49 ` [PATCH v4 5/5] drm/nouveau/drm: Bump the driver version to 1.4.1 to report new features Mohamed Ahmed
2025-10-31 14:18 ` [PATCH v4 0/5] drm/nouveau: Enable variable page sizes and compression Mary Guillemard