Re: [PATCH] drm/amdgpu: extend mtype override to non-contiguous pages

public inbox for amd-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed

From: "Christian König" <christian.koenig@amd.com>
To: Philip Yang <yangp@amd.com>,
	"Chen, Xiaogang" <xiaogang.chen@amd.com>,
	Philip Yang <Philip.Yang@amd.com>,
	amd-gfx@lists.freedesktop.org
Cc: Felix.Kuehling@amd.com, Kent.Russell@amd.com, Andrew.Martin@amd.com
Subject: Re: [PATCH] drm/amdgpu: extend mtype override to non-contiguous pages
Date: Fri, 10 Apr 2026 13:26:09 +0200	[thread overview]
Message-ID: <92c191d7-cdca-403f-a622-ddbf0b90ed11@amd.com> (raw)
In-Reply-To: <8ab95e17-11bf-4ea1-b408-c740c6cefc88@amd.com>

On 4/9/26 23:56, Philip Yang wrote:
> On 2026-04-08 04:04, Christian König wrote:
>> On 4/7/26 21:40, Chen, Xiaogang wrote:
>>> On 4/7/2026 8:38 AM, Philip Yang wrote:
>>>> On multi-socket MI300A APU systems, system memory pages mapped to the
>>>> closest GPU must use MTYPE_RW instead of MTYPE_NC to maintain correct
>>>> cache coherence. The existing mtype override in amdgpu_vm_pte_update_flags()
>>>> excluded non-contiguous page mappings from the override. This caused
>>>> incorrect MTYPE_NC for scattered local pages, leading to cache coherence
>>>> issues.
>>>>
>>>> The override applies to both contiguous and non-contiguous mappings.
>>>> When pages_addr is set, resolve the physical address via
>>>> pages_addr[addr >> PAGE_SHIFT] before passing it to the override
>>>> callback for NUMA node lookup.
>>>>
>>>> Introduce amdgpu_vm_addr_contiguous() helper that, on MI300A, treats
>>>> pages on different NUMA nodes as non-contiguous even if their DMA
>>>> addresses are adjacent. This ensures amdgpu_vm_update_range() splits
>>>> page table updates at NUMA node boundaries so each batch gets the
>>>> correct mtype override.
>>>>
>>>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c    | 48 +++++++++++++++++++----
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 14 +++++--
>>>>  2 files changed, 50 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 63156289ae7f..f8fcbf079bf4 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -1099,6 +1099,34 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params *params,
>>>>  	}
>>>>  }
>>>>  
>>>> +/**
>>>> + * amdgpu_vm_addr_contiguous - check if two DMA addresses are contiguous
>>>> + *
>>>> + * @adev: amdgpu_device pointer
>>>> + * @addr: current DMA address
>>>> + * @addr_next: next DMA address to check against
>>>> + * @contiguous: current contiguity state of the range being built
>>>> + *
>>>> + * Check whether @addr and @addr_next are physically contiguous. On APU
>>>> + * platforms with multiple NUMA nodes (e.g. MI300A), a NUMA node boundary
>>>> + * also breaks contiguity so that each contiguous batch stays within a
>>>> + * single NUMA node for correct MTYPE override selection.
>>>> + *
>>>> + * Returns:
>>>> + * true if @addr_next continues the current contiguous range, false otherwise.
>>>> + */
>>> We can use pfn_to_nid or page_to_nid to get which noma(id) the backing memory is at. pfn_to_nid uses pfn from physical address. You use dma_addr_t that is device dependent. It is not always same as physical address of RAM.
>> Yeah that here won't work at all.
>>
>>> ttm_tt also has
>>>
>>> /** @pages: Array of pages backing the data. */ struct page **pages;
>>>
>>> I think using the pages to get numa id by page_to_nid is more appropriate.
>> That array isn't filled in for imported pages.
>>
>> As far as I can see the whole approach won't work reliable. For imports we only know the dma_addr and not the struct page nor the pfn.
> if adev->ram_is_direct_mapped, then dma_addr equals to pfn, we can use dma_addr to call pfn_to_nid.
> I will add ram_is_direct_mapped condition, this is currently inside override function, and add fast path change suggested by Felix in next version.

That is seriously not something we can do. This relies an specific HW behavior in common code and is clearly a no-go from my side.

Regards,
Christian.

> 
> Regards,
> Philip  
>> I think we need to re-iterate the whole idea of MTYPE override.
>>
>> Regards,
>> Christian.
>>
>>> Regards
>>>
>>> Xiaogang
>>>
>>>> +static inline bool amdgpu_vm_addr_contiguous(struct amdgpu_device *adev, dma_addr_t addr,
>>>> +					     dma_addr_t addr_next, bool contiguous)
>>>> +{
>>>> +	if (!adev->gmc.is_app_apu || !page_is_ram(addr >> PAGE_SHIFT))
>>>> +		return (addr + PAGE_SIZE) == addr_next;
>>>> +
>>>> +	if (pfn_to_nid(addr >> PAGE_SHIFT) != pfn_to_nid(addr_next >> PAGE_SHIFT))
>>>> +		return !contiguous;
>>>> +
>>>> +	return (addr + PAGE_SIZE) == addr_next;
>>>> +}
>>>> +
>>>>  /**
>>>>   * amdgpu_vm_update_range - update a range in the vm page table
>>>>   *
>>>> @@ -1198,22 +1226,26 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>>>  				uint64_t pfn = cursor.start >> PAGE_SHIFT;
>>>>  				uint64_t count;
>>>>  
>>>> -				contiguous = pages_addr[pfn + 1] ==
>>>> -					pages_addr[pfn] + PAGE_SIZE;
>>>> +				contiguous = amdgpu_vm_addr_contiguous(adev,
>>>> +								       pages_addr[pfn],
>>>> +								       pages_addr[pfn + 1],
>>>> +								       contiguous);
>>>>  
>>>> -				tmp = num_entries /
>>>> -					AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>>> +				tmp = num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>>>  				for (count = 2; count < tmp; ++count) {
>>>>  					uint64_t idx = pfn + count;
>>>>  
>>>> -					if (contiguous != (pages_addr[idx] ==
>>>> -					    pages_addr[idx - 1] + PAGE_SIZE))
>>>> +					if (contiguous != amdgpu_vm_addr_contiguous(adev,
>>>> +									pages_addr[idx - 1],
>>>> +									pages_addr[idx],
>>>> +									contiguous))
>>>>  						break;
>>>>  				}
>>>> +
>>>>  				if (!contiguous)
>>>>  					count--;
>>>> -				num_entries = count *
>>>> -					AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>>> +
>>>> +				num_entries = count * AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>>>  			}
>>>>  
>>>>  			if (!contiguous) {
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>>>> index 31a437ce9570..9e1607fb3b2e 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>>>> @@ -708,13 +708,19 @@ static void amdgpu_vm_pte_update_flags(struct amdgpu_vm_update_params *params,
>>>>  		amdgpu_vm_pte_update_noretry_flags(adev, &flags);
>>>>  
>>>>  	/* APUs mapping system memory may need different MTYPEs on different
>>>> -	 * NUMA nodes. Only do this for contiguous ranges that can be assumed
>>>> -	 * to be on the same NUMA node.
>>>> +	 * NUMA nodes. Both contiguous and non-contiguous ranges are handled
>>>> +	 * since amdgpu_vm_update_range ensures updates don't span NUMA
>>>> +	 * node boundaries.
>>>>  	 */
>>>>  	if ((flags & AMDGPU_PTE_SYSTEM) && (adev->flags & AMD_IS_APU) &&
>>>>  	    adev->gmc.gmc_funcs->override_vm_pte_flags &&
>>>> -	    num_possible_nodes() > 1 && !params->pages_addr && params->allow_override)
>>>> -		amdgpu_gmc_override_vm_pte_flags(adev, params->vm, addr, &flags);
>>>> +	    num_possible_nodes() > 1 && params->allow_override) {
>>>> +		if (params->pages_addr)
>>>> +			amdgpu_gmc_override_vm_pte_flags(adev, params->vm,
>>>> +					params->pages_addr[addr >> PAGE_SHIFT], &flags);
>>>> +		else
>>>> +			amdgpu_gmc_override_vm_pte_flags(adev, params->vm, addr, &flags);
>>>> +	}
>>>>  
>>>>  	params->vm->update_funcs->update(params, pt, pe, addr, count, incr,
>>>>  					 flags);
>

next prev parent reply	other threads:[~2026-04-10 11:26 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-07 13:38 [PATCH] drm/amdgpu: extend mtype override to non-contiguous pages Philip Yang
2026-04-07 19:40 ` Chen, Xiaogang
2026-04-07 22:24   ` Philip Yang
2026-04-08  8:04   ` Christian König
2026-04-09 21:56     ` Philip Yang
2026-04-10 11:26       ` Christian König [this message]
2026-04-07 23:06 ` Felix Kuehling
2026-04-09 22:32   ` Philip Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=92c191d7-cdca-403f-a622-ddbf0b90ed11@amd.com \
    --to=christian.koenig@amd.com \
    --cc=Andrew.Martin@amd.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=Kent.Russell@amd.com \
    --cc=Philip.Yang@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=xiaogang.chen@amd.com \
    --cc=yangp@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox