Re: [PATCH v5 1/8] drm/gpusvm: fix hmm_pfn_to_map_order() usage

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Matthew Auld <matthew.auld@intel.com>
To: Francois Dugast <francois.dugast@intel.com>,
	Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Subject: Re: [PATCH v5 1/8] drm/gpusvm: fix hmm_pfn_to_map_order() usage
Date: Tue, 2 Sep 2025 11:23:23 +0100	[thread overview]
Message-ID: <78720cf5-022a-40c4-b5ea-db4ac811bd29@intel.com> (raw)
In-Reply-To: <aLa0ejSifS16yA31@fdugast-desk>

On 02/09/2025 10:10, Francois Dugast wrote:
> On Mon, Aug 18, 2025 at 10:39:10AM -0700, Matthew Brost wrote:
>> On Mon, Aug 18, 2025 at 04:21:54PM +0100, Matthew Auld wrote:
>>> Handle the case where the hmm range partially covers a huge page (like
>>> 2M), otherwise we can potentially end up doing something nasty like
>>> mapping memory which is outside the range, and maybe not even mapped by
>>> the mm. Fix is based on the xe userptr code, which in a future patch
>>> will directly use gpusvm, so needs alignment here.
>>>
>>> v2:
>>>    - Add kernel-doc (Matt B)
>>>    - s/fls/ilog2/ (Thomas)
>>>
>>> Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/drm_gpusvm.c | 33 +++++++++++++++++++++++++++++++--
>>>   1 file changed, 31 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
>>> index 661306da6b2d..50a36e7b71e2 100644
>>> --- a/drivers/gpu/drm/drm_gpusvm.c
>>> +++ b/drivers/gpu/drm/drm_gpusvm.c
>>> @@ -708,6 +708,35 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
>>>   	return range;
>>>   }
>>>   
>>> +/**
>>> + * drm_gpusvm_hmm_pfn_to_order() - Get the largest CPU mapping order.
>>> + * @hmm_pfn: The current hmm_pfn.
>>> + * @hmm_pfn_index: Index of the @hmm_pfn within the pfn array.
>>> + * @npages: Number of pages within the pfn array i.e the hmm range size.
>>> + *
>>> + * To allow skipping PFNs with the same flags (like when they belong to
>>> + * the same huge PTE) when looping over the pfn array, take a given a hmm_pfn,
>>> + * and return the largest order that will fit inside the CPU PTE, but also
>>> + * crucially accounting for the original hmm range boundaries.
>>> + *
>>> + * Return: The largest order that will safely fit within the size of the hmm_pfn
>>> + * CPU PTE.
>>> + */
>>> +static unsigned int drm_gpusvm_hmm_pfn_to_order(unsigned long hmm_pfn,
>>> +						unsigned long hmm_pfn_index,
>>> +						unsigned long npages)
>>> +{
>>> +	unsigned long size;
>>> +
>>> +	size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
>>> +	size -= (hmm_pfn & ~HMM_PFN_FLAGS) & (size - 1);
>>> +	hmm_pfn_index += size;
>>> +	if (hmm_pfn_index > npages)
>>> +		size -= (hmm_pfn_index - npages);
>>
>> Hmm, okay. On the core MM side, we’ve discussed updating HMM to populate
>> PFNs only at order granularity. If that lands, this code could break in
>> some odd cases. That argues for leaving HMM as-is for now. For the
>> moment, this code works, but we should keep an eye on HMM—or
>> future-proof it by populating all potentially absent entries.
>>
> 
> For reference, this is the patch which mentions possibly only sparsely
> populating HMM PFNs in the future:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-hmm-populate-pfns-from-pmd-swap-entry.patch

Thanks for the link. I don't immediately see the connection with this 
change and the linked patch. With that linked change we are still 
basically using the PTE size when encoding the hmm pfn or did something 
else change? So say if we hmm fault a range partially covering that huge 
PTE, the hmm pfn order will still be that of the huge PTE to facilitate 
potentially skipping entries later?

For example say we have 2M PTE and you hmm fault only the final 4K page 
at the end, the order of the hmm pfn will still be 2M, right? Taking 
that same example but now we hmm fault the final 4K page within the 2M 
PTE plus also the 4K page after it, so just 8K range. Without this patch 
in get_pages() it will incorrectly think it is mapping a full 2M page 
when looking at the order of the first hmm pfn and skip mapping that 
second 4K page, in addition it just created a 2M mapping instead of 8K. 
With this patch it will take into account the hmm range and instead give 
you 4K + 4K.

Sorry if I'm totally misunderstanding something here.

> 
> Francois
>   
>> Matt
>>
>>> +
>>> +	return ilog2(size);
>>> +}
>>> +
>>>   /**
>>>    * drm_gpusvm_check_pages() - Check pages
>>>    * @gpusvm: Pointer to the GPU SVM structure
>>> @@ -766,7 +795,7 @@ static bool drm_gpusvm_check_pages(struct drm_gpusvm *gpusvm,
>>>   			err = -EFAULT;
>>>   			goto err_free;
>>>   		}
>>> -		i += 0x1 << hmm_pfn_to_map_order(pfns[i]);
>>> +		i += 0x1 << drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
>>>   	}
>>>   
>>>   err_free:
>>> @@ -1342,7 +1371,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>>>   	for (i = 0, j = 0; i < npages; ++j) {
>>>   		struct page *page = hmm_pfn_to_page(pfns[i]);
>>>   
>>> -		order = hmm_pfn_to_map_order(pfns[i]);
>>> +		order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
>>>   		if (is_device_private_page(page) ||
>>>   		    is_device_coherent_page(page)) {
>>>   			if (zdd != page->zone_device_data && i > 0) {
>>> -- 
>>> 2.50.1
>>>

next prev parent reply	other threads:[~2025-09-02 10:23 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-18 15:21 [PATCH v5 0/8] Replace xe_hmm with gpusvm Matthew Auld
2025-08-18 15:21 ` [PATCH v5 1/8] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
2025-08-18 17:39   ` Matthew Brost
2025-09-02  9:10     ` Francois Dugast
2025-09-02 10:23       ` Matthew Auld [this message]
2025-08-18 15:21 ` [PATCH v5 2/8] drm/gpusvm: use more selective dma dir in get_pages() Matthew Auld
2025-08-18 15:21 ` [PATCH v5 3/8] drm/gpusvm: pull out drm_gpusvm_pages substructure Matthew Auld
2025-08-18 15:21 ` [PATCH v5 4/8] drm/gpusvm: refactor core API to use pages struct Matthew Auld
2025-08-18 15:21 ` [PATCH v5 5/8] drm/gpusvm: export drm_gpusvm_pages API Matthew Auld
2025-08-18 15:21 ` [PATCH v5 6/8] drm/xe/vm: split userptr bits into separate file Matthew Auld
2025-08-18 15:22 ` [PATCH v5 7/8] drm/xe/userptr: replace xe_hmm with gpusvm Matthew Auld
2025-08-19  0:18   ` kernel test robot
2025-08-18 15:22 ` [PATCH v5 8/8] drm/xe/pt: unify xe_pt_svm_pre_commit with userptr Matthew Auld
2025-08-18 17:04 ` ✗ CI.checkpatch: warning for Replace xe_hmm with gpusvm (rev5) Patchwork
2025-08-18 17:05 ` ✓ CI.KUnit: success " Patchwork
2025-08-18 18:10 ` ✓ Xe.CI.BAT: " Patchwork
2025-08-19 10:40 ` ✓ Xe.CI.Full: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=78720cf5-022a-40c4-b5ea-db4ac811bd29@intel.com \
    --to=matthew.auld@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=francois.dugast@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.