All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: muchun.song@linux.dev, akpm@linux-foundation.org,
	rientjes@google.com, fvdl@google.com, jthoughton@google.com,
	david@redhat.com, isaku.yamahata@intel.com,
	zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com,
	tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk,
	jgg@nvidia.com, jhubbard@nvidia.com, seanjc@google.com,
	pbonzini@redhat.com, erdemaktas@google.com,
	vannapurve@google.com, pgonda@google.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 2/3] mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv()
Date: Tue, 5 Nov 2024 13:46:39 -0500	[thread overview]
Message-ID: <ZypoDzm2XdfnG1if@x1n> (raw)
In-Reply-To: <3d1946d01f63104de913c0979b5a596e2add1672.1728684491.git.ackerleytng@google.com>

On Fri, Oct 11, 2024 at 11:22:37PM +0000, Ackerley Tng wrote:
> With the addition of the chg parameter, vma_has_reserves() no longer
> just determines whether the vma has reserves.
> 
> The comment in the vma->vm_flags & VM_NORESERVE block indicates that
> this function actually computes whether or not the reserved count
> should be decremented.
> 
> This refactoring also takes into account the allocation's request
> parameter avoid_reserve, which helps to further simplify the calling
> function alloc_hugetlb_folio().
> 
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

I wonder if this patch could be merged with the 3rd, IIUC this can
fundamentally be seen as a movement of what patch 3 was removed.

Furthermore, I do feel like should_use_hstate_resv() could be redundant on
its own on many things.

Let me try to justify.  Firstly, after 3 patches applied, now it looks like
this (I removed all comments to make things shorter..):

static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg,
				   bool avoid_reserve)
{
	if (avoid_reserve)
		return false;

	if (vma->vm_flags & VM_NORESERVE) {
		if (vma->vm_flags & VM_MAYSHARE && chg == 0)
			return true;
		else
			return false;
	}

	if (vma->vm_flags & VM_MAYSHARE) {
		if (chg)
			return false;
		else
			return true;
	}

	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
		if (chg)
			return false;
		else
			return true;
	}

	return false;
}

Then let's look at chg==0 processing all above: what does it say?  I
suppose it means "we don't need another global reservation".  It means if
chg==0 we always will use an existing reservation.  From math POV it also
is true, so it can already be moved out ahead, IIUC, like this:

static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg,
				   bool avoid_reserve)
{
	if (avoid_reserve)
		return false;

        if (chg == 0)
                return true;

	if (vma->vm_flags & VM_NORESERVE)
                return false;

	if (vma->vm_flags & VM_MAYSHARE)
                return false;

	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER))
                return false;

	return false;             <--------------------- [1]
}

Move on.  If I read it right, above [1] is exactly for avoid_reserve==1
case, where it basically says "it's !NORESERVE, private, and it's not the
vma resv owner, either fork() or CoW".  If my reading is correct, it means
after your patch 2, [1] should never be reachable at all.. I would hope
adding a panic() right above would be ok.

And irrelevant of whether my understanding is correct.. math-wise above is
also already the same as:

static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg,
				   bool avoid_reserve)
{
	if (avoid_reserve)
		return false;

        if (chg == 0)
                return true;

	return false;
}

Then it makes a lot more sense now, because afaict, gbl_chg is exactly what
should_use_hstate_resv() is looking for.. only except avoid_reserve==true.

Would above make sense to you?

In short, it's about whether a patch on top of your series would further
simply this whole thing, like below:

===8<===
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 60e72214d5bf..4b1c5c4ed7be 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1245,80 +1245,6 @@ void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
 	hugetlb_dup_vma_private(vma);
 }
 
-/*
- * Returns true if this allocation should use (debit) hstate reservations, based on
- *
- * @vma: VMA config
- * @chg: Whether the page requirement can be satisfied using subpool reservations
- * @avoid_reserve: Whether allocation was requested to avoid using reservations
- */
-static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg,
-				   bool avoid_reserve)
-{
-	if (avoid_reserve)
-		return false;
-
-	if (vma->vm_flags & VM_NORESERVE) {
-		/*
-		 * This address is already reserved by other process(chg == 0),
-		 * so, we should decrement reserved count. Without decrementing,
-		 * reserve count remains after releasing inode, because this
-		 * allocated page will go into page cache and is regarded as
-		 * coming from reserved pool in releasing step.  Currently, we
-		 * don't have any other solution to deal with this situation
-		 * properly, so add work-around here.
-		 */
-		if (vma->vm_flags & VM_MAYSHARE && chg == 0)
-			return true;
-		else
-			return false;
-	}
-
-	/* Shared mappings always use reserves */
-	if (vma->vm_flags & VM_MAYSHARE) {
-		/*
-		 * We know VM_NORESERVE is not set.  Therefore, there SHOULD
-		 * be a region map for all pages.  The only situation where
-		 * there is no region map is if a hole was punched via
-		 * fallocate.  In this case, there really are no reserves to
-		 * use.  This situation is indicated if chg != 0.
-		 */
-		if (chg)
-			return false;
-		else
-			return true;
-	}
-
-	/*
-	 * Only the process that called mmap() has reserves for private
-	 * mappings. A child process with MAP_PRIVATE mappings created by their
-	 * parent have no page reserves.
-	 */
-	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
-		/*
-		 * Like the shared case above, a hole punch or truncate
-		 * could have been performed on the private mapping.
-		 * Examine the value of chg to determine if reserves
-		 * actually exist or were previously consumed.
-		 * Very Subtle - The value of chg comes from a previous
-		 * call to vma_needs_reserves().  The reserve map for
-		 * private mappings has different (opposite) semantics
-		 * than that of shared mappings.  vma_needs_reserves()
-		 * has already taken this difference in semantics into
-		 * account.  Therefore, the meaning of chg is the same
-		 * as in the shared case above.  Code could easily be
-		 * combined, but keeping it separate draws attention to
-		 * subtle differences.
-		 */
-		if (chg)
-			return false;
-		else
-			return true;
-	}
-
-	return false;
-}
-
 static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
 	int nid = folio_nid(folio);
@@ -3255,7 +3181,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 	}
 
-	use_hstate_resv = should_use_hstate_resv(vma, gbl_chg, avoid_reserve);
+	use_hstate_resv = avoid_reserve || !gbl_chg;
 
 	/*
 	 * charge_cgroup_reservation if this allocation is not consuming a
===8<===

Thanks,

-- 
Peter Xu



  reply	other threads:[~2024-11-05 18:46 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-11 23:22 [RFC PATCH 0/3] Reduce dependence on vmas deep in hugetlb allocation code Ackerley Tng
2024-10-11 23:22 ` [RFC PATCH 1/3] mm: hugetlb: Simplify logic in dequeue_hugetlb_folio_vma() Ackerley Tng
2024-10-30 14:31   ` Sean Christopherson
2024-11-05 17:10   ` Peter Xu
2024-11-06 10:13   ` Oscar Salvador
2024-10-11 23:22 ` [RFC PATCH 2/3] mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv() Ackerley Tng
2024-11-05 18:46   ` Peter Xu [this message]
2024-11-11  9:19     ` Oscar Salvador
2024-10-11 23:22 ` [RFC PATCH 3/3] mm: hugetlb: Remove unnecessary check for avoid_reserve Ackerley Tng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZypoDzm2XdfnG1if@x1n \
    --to=peterx@redhat.com \
    --cc=ackerleytng@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=erdemaktas@google.com \
    --cc=fan.du@intel.com \
    --cc=fvdl@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=jthoughton@google.com \
    --cc=jun.miao@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=muchun.song@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=pgonda@google.com \
    --cc=quic_eberman@quicinc.com \
    --cc=rientjes@google.com \
    --cc=roypat@amazon.co.uk \
    --cc=seanjc@google.com \
    --cc=tabba@google.com \
    --cc=vannapurve@google.com \
    --cc=zhiquan1.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.