linux-mm.kvack.org archive mirror
From: Mike Kravetz <mike.kravetz@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-ia64@vger.kernel.org,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] hugetlb: simplify hugetlb handling in follow_page_mask
Date: Tue, 30 Aug 2022 09:52:09 -0700
Message-ID: <Yw5AOZ/Kc5f3UP+s@monkey>
In-Reply-To: <608934d4-466d-975e-6458-34a91ccb4669@redhat.com>

On 08/30/22 10:11, David Hildenbrand wrote:
> On 30.08.22 01:40, Mike Kravetz wrote:
> > During discussions of this series [1], it was suggested that hugetlb
> > handling code in follow_page_mask could be simplified.  At the beginning
> 
> Feel free to use a Suggested-by if you consider it appropriate.
> 
> > of follow_page_mask, there currently is a call to follow_huge_addr which
> > 'may' handle hugetlb pages.  ia64 is the only architecture which provides
> > a follow_huge_addr routine that does not return error.  Instead, at each
> > level of the page table a check is made for a hugetlb entry.  If a hugetlb
> > entry is found, a call to a routine associated with that entry is made.
> > 
> > Currently, there are two checks for hugetlb entries at each page table
> > level.  The first check is of the form:
> > 	if (p?d_huge())
> > 		page = follow_huge_p?d();
> > the second check is of the form:
> > 	if (is_hugepd())
> > 		page = follow_huge_pd().
> 
> BTW, what about all this hugepd stuff in mm/pagewalk.c?
> 
> Isn't this all dead code as we're essentially routing all hugetlb VMAs
> via walk_hugetlb_range? [yes, all that hugepd stuff in generic code that
> overcomplicates stuff has been annoying me for a long time]

I am 'happy' to look at cleaning up that code next.  Perhaps I will just
create a cleanup series.

I just wanted to focus on eliminating the two callouts in generic code mentioned
above: follow_huge_p?d() and follow_huge_pd().
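
For illustration only, the intended shape of the change can be modelled in
userspace C: one decision on the vma at the top of the walk, instead of two
hugetlb checks at every page table level.  Everything named mock_* below is
a hypothetical stand-in and not kernel code; the is_hugetlb flag is assumed
to play the role of an is_vm_hugetlb_page(vma) test.

```c
#include <stdbool.h>

/* Which walker handled the lookup; purely illustrative. */
enum walker { WALK_GENERIC, WALK_HUGETLB };

/* Hypothetical stand-in for struct vm_area_struct. */
struct mock_vma {
	bool is_hugetlb;	/* assumed equivalent of is_vm_hugetlb_page(vma) */
};

/* Stand-in for the proposed hugetlb_follow_page_mask(). */
static enum walker mock_hugetlb_follow(const struct mock_vma *vma)
{
	(void)vma;
	return WALK_HUGETLB;
}

/* Stand-in for the generic walk, now free of per-level hugetlb checks. */
static enum walker mock_generic_walk(const struct mock_vma *vma)
{
	(void)vma;
	return WALK_GENERIC;
}

/* Single dispatch point at the start of the lookup. */
static enum walker mock_follow_page_mask(const struct mock_vma *vma)
{
	if (vma->is_hugetlb)
		return mock_hugetlb_follow(vma);
	return mock_generic_walk(vma);
}
```

With this shape the generic walker never needs to know about p?d_huge() or
is_hugepd() entries, because hugetlb vmas never reach it.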

Really looking for input from Aneesh and Naoya as they added much of the
code that is being removed here.

> > 
> > We can replace these checks, as well as the special handling routines
> > such as follow_huge_p?d() and follow_huge_pd() with a single routine to
> > handle hugetlb vmas.
> > 
> > A new routine hugetlb_follow_page_mask is called for hugetlb vmas at the
> > beginning of follow_page_mask.  hugetlb_follow_page_mask will use the
> > existing routine huge_pte_offset to walk page tables looking for hugetlb
> > entries.  huge_pte_offset can be overwritten by architectures, and already
> > handles special cases such as hugepd entries.
> > 
> > [1] https://lore.kernel.org/linux-mm/cover.1661240170.git.baolin.wang@linux.alibaba.com/
> > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> [...]
> 
> > +static struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> > +				unsigned long address, unsigned int flags)
> > +{
> > +	/* should never happen, but do not want to BUG */
> > +	return ERR_PTR(-EINVAL);
> 
> Should there be a WARN_ON_ONCE() instead or could we use a BUILD_BUG_ON()?
> 

OK, I will look into adding one of these.  I would prefer a BUILD_BUG_ON().
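
As a rough userspace sketch of the difference between the two options: a
build-time check rejects the code at compile time, while a warn-once check
still builds and reports the first time the path is reached at runtime.  The
macro and helper below (MY_BUILD_BUG_ON, warn_on_once, stub_follow) are
hypothetical approximations built on C11 _Static_assert, not the kernel's
definitions.  Whether a build-time check is usable in the stub depends on
the compiler being able to discard every call to it, which this sketch does
not attempt to show.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Build-time check: compilation fails if cond is a true constant expression. */
#define MY_BUILD_BUG_ON(cond) _Static_assert(!(cond), "BUILD_BUG_ON: " #cond)

static int warn_count;	/* number of warnings actually printed */

/* Runtime check: prints at most once per 'warned' flag, returns cond. */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical stub that should never be reached. */
static int stub_follow(void)
{
	static bool warned;

	/* Evaluated by the compiler only; this condition is false, so it builds. */
	MY_BUILD_BUG_ON(sizeof(char) != 1);

	/* Evaluated on every call, but reports only the first time. */
	warn_on_once(true, &warned, "stub_follow() reached");
	return -EINVAL;
}
```

Calling stub_follow() twice returns -EINVAL both times and prints a single
warning.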

> > +}
> 
> 
> [...]
> 
> > @@ -851,10 +814,15 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
> >  
> >  	ctx->page_mask = 0;
> >  
> > -	/* make this handle hugepd */
> > -	page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
> > -	if (!IS_ERR(page)) {
> > -		WARN_ON_ONCE(flags & (FOLL_GET | FOLL_PIN));
> > +	/*
> > +	 * Call hugetlb_follow_page_mask for hugetlb vmas as it will use
> > +	 * special hugetlb page table walking code.  This eliminates the
> > +	 * need to check for hugetlb entries in the general walking code.
> > +	 */
> 
> Maybe also comment that ordinary GUP never ends up in here and instead
> directly uses follow_hugetlb_page(). This is for follow_page() handling
> only.
> 
> [my suggestion to rename follow_hugetlb_page() still stands ;) ]

Will update the comment in v2.

I think renaming follow_hugetlb_page() belongs in a separate patch, perhaps
as part of a larger cleanup series.  I will not forget. :)

> 
> Numbers speak for themselves.
> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 

Thanks,
-- 
Mike Kravetz


