From: Francois Dugast <francois.dugast@intel.com>
To: Balbir Singh <balbirs@nvidia.com>
Cc: <akpm@linux-foundation.org>, <airlied@gmail.com>,
	<apopple@nvidia.com>, <christian.koenig@amd.com>,
	<jgg@nvidia.com>, <leonro@nvidia.com>, <matthew.brost@intel.com>,
	<mm-commits@vger.kernel.org>, <mpenttil@redhat.com>,
	<thomas.hellstrom@linux.intel.com>, <ziy@nvidia.com>
Subject: Re: [PATCH] mm/hmm: populate PFNs from PMD swap entry
Date: Tue, 2 Sep 2025 14:53:08 +0200
Message-ID: <aLbotHsTPaqk50fK@fdugast-desk>
In-Reply-To: <9481c30d-2232-4d29-a38f-6a87b0648238@nvidia.com>

On Tue, Sep 02, 2025 at 09:30:13PM +1000, Balbir Singh wrote:
> On 9/2/25 21:17, Francois Dugast wrote:
> > Once support for THP migration of zone device pages is enabled, device
> > private swap entries will be found during the walk not only for PTEs but
> > also for PMDs.
> > 
> > Therefore, it is necessary to extend to PMDs the special handling which is
> > already in place for PTEs when device private pages are owned by the
> > caller: instead of faulting or skipping the range, the correct behavior is
> > to use the swap entry to populate HMM PFNs.
> > 
> > This change is a prerequisite to make use of device-private THP in drivers
> > using drivers/gpu/drm/drm_pagemap, such as xe.
> > 
> > Even though subsequent PFNs can be inferred when handling large order
> > PFNs, the PFN list is still fully populated because this is currently
> > expected by HMM users. In case this changes in the future, that is all HMM
> > users support a sparsely populated PFN list, the for() loop can be made to
> > skip remaining PFNs for the current order. A quick test shows the loop
> > takes about 10 ns, roughly 20 times faster than without this optimization.
> > 
> > Link: https://lkml.kernel.org/r/20250829080505.1020155-1-francois.dugast@intel.com
> > Signed-off-by: Francois Dugast <francois.dugast@intel.com>
> > Cc: Jason Gunthorpe <jgg@nvidia.com>
> > Cc: Leon Romanovsky <leonro@nvidia.com>
> > Cc: Zi Yan <ziy@nvidia.com>
> > Cc: Alistair Popple <apopple@nvidia.com>
> > Cc: Balbir Singh <balbirs@nvidia.com>
> > Cc: David Airlie <airlied@gmail.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Mika Penttilä <mpenttil@redhat.com>
> > Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> >  mm/hmm.c | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> > 
> > diff --git a/mm/hmm.c b/mm/hmm.c
> > index d545e2494994..d449fc4647d7 100644
> > --- a/mm/hmm.c
> > +++ b/mm/hmm.c
> > @@ -355,6 +355,29 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
> >  	}
> >  
> >  	if (!pmd_present(pmd)) {
> > +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> > +		swp_entry_t entry = pmd_to_swp_entry(pmd);
> > +
> > +		if (is_device_private_entry(entry) &&
> > +		    pfn_swap_entry_folio(entry)->pgmap->owner ==
> > +		    range->dev_private_owner) {
> > +			unsigned long cpu_flags = HMM_PFN_VALID |
> > +				hmm_pfn_flags_order(PMD_SHIFT - PAGE_SHIFT);
> > +			unsigned long pfn = swp_offset_pfn(entry);
> > +			unsigned long i;
> > +
> > +			if (is_writable_device_private_entry(entry))
> > +				cpu_flags |= HMM_PFN_WRITE;
> > +
> > +			for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
> > +				hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
> > +				hmm_pfns[i] |= pfn | cpu_flags;
> > +			}
> > +
> 
> Can you add a comment here about why this is added? Why would there be a disconnect
> between HMM users and the API? I assume you are referring to drivers that are
> not yet aware of large folios.

Yes, I am. Will do.

> 
> > +			return 0;
> > +		}
> > +#endif  /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
> > +
> >  		if (hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, 0))
> >  			return -EFAULT;
> >  		return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
> 
> Other than that, based on the assumption that my patches are not a pre-requisite
> for this
> 
> Acked-by: Balbir Singh <balbirs@nvidia.com>

Thanks.

> 
> Thanks,
> Balbir
> 
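
For readers following the loop in the hunk above, here is a minimal
userspace sketch of the same fill pattern. It is not the kernel code:
the DEMO_* flag values, the 64-bit layout, and the helper name
demo_fill_pmd() are assumptions made up for illustration, and the real
definitions live in include/linux/hmm.h. It only shows how every entry
covered by the PMD gets its own consecutive PFN plus the shared
valid/order/write flags, while caller-supplied input flags are kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PMD_SHIFT  21
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Stand-in flag layout, NOT the kernel's actual HMM_PFN_* values. */
#define DEMO_PFN_VALID     (1ULL << 63)
#define DEMO_PFN_WRITE     (1ULL << 62)
#define DEMO_PFN_REQ_FAULT (1ULL << 61)       /* example input flag */
#define DEMO_ORDER_SHIFT   56                 /* order stored in bits 56..60 */
#define DEMO_INOUT_FLAGS   DEMO_PFN_REQ_FAULT /* bits preserved across the fill */

/*
 * Hypothetical helper modelling the patch's for() loop: fill one entry
 * per page in [addr, end) starting from base_pfn.
 */
static void demo_fill_pmd(uint64_t *pfns, uint64_t addr, uint64_t end,
			  uint64_t base_pfn, int writable)
{
	uint64_t flags = DEMO_PFN_VALID |
		((uint64_t)(PMD_SHIFT - PAGE_SHIFT) << DEMO_ORDER_SHIFT);
	uint64_t pfn = base_pfn;
	size_t i;

	assert(addr < end);

	if (writable)
		flags |= DEMO_PFN_WRITE;

	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
		pfns[i] &= DEMO_INOUT_FLAGS;	/* drop stale output bits */
		pfns[i] |= pfn | flags;		/* per-page PFN + shared flags */
	}
}
```

With a 2 MiB range such as [0x200000, 0x400000) this writes 512 entries.
A sparse variant, as discussed in the commit message, would write only
entry 0 and let the consumer infer the rest from the order bits.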
