From: Mel Gorman <mgorman@techsingularity.net>
To: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	Shakeel Butt <shakeelb@google.com>, Wei Xu <weixugc@google.com>,
	Greg Thelen <gthelen@google.com>, Hugh Dickins <hughd@google.com>,
	David Rientjes <rientjes@google.com>
Subject: Re: [PATCH v2] mm/page_alloc: call check_new_pages() while zone spinlock is not held
Date: Wed, 9 Mar 2022 12:32:45 +0000
Message-ID: <20220309123245.GI15701@techsingularity.net>
In-Reply-To: <CANn89iLmwT4XQ6JPi4C7dO+Q2O_j7HK19-TAo4nA1NUf8ZSLBw@mail.gmail.com>

On Tue, Mar 08, 2022 at 03:49:48PM -0800, Eric Dumazet wrote:
> On Mon, Mar 7, 2022 at 1:15 AM Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Mar 04, 2022 at 09:02:15AM -0800, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > For high order pages not using pcp, rmqueue() is currently calling
> > > the costly check_new_pages() while zone spinlock is held,
> > > and hard irqs masked.
> > >
> > > This is not needed, we can release the spinlock sooner to reduce
> > > zone spinlock contention.
> > >
> > > Note that after this patch, we call __mod_zone_freepage_state()
> > > before deciding to leak the page because it is in bad state.
> > >
> > > v2: We need to keep interrupts disabled to call __mod_zone_freepage_state()
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Cc: Mel Gorman <mgorman@techsingularity.net>
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Michal Hocko <mhocko@kernel.org>
> > > Cc: Shakeel Butt <shakeelb@google.com>
> > > Cc: Wei Xu <weixugc@google.com>
> > > Cc: Greg Thelen <gthelen@google.com>
> > > Cc: Hugh Dickins <hughd@google.com>
> > > Cc: David Rientjes <rientjes@google.com>
> >
> > Ok, this is only more expensive in the event pages on the free list have
> > been corrupted, which is already very unlikely, so thanks!
> >
> > Acked-by: Mel Gorman <mgorman@techsingularity.net>
> >
> 
> One remaining question is:
> 
> After your patch ("mm/page_alloc: allow high-order pages to be stored
> on the per-cpu lists"),
> do we want to change check_pcp_refill()/check_new_pcp() to check all pages,
> and not only the head ?
> 

We should because it was an oversight. Thanks for pointing that out.

> Or was it a conscious choice of yours ?
> (I presume part of the performance gains came from
> not having to bring ~7 cache lines per 32KB chunk on x86)
> 

There will be a performance penalty due to the check but it's a correctness
vs performance issue.
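
As a rough sanity check of the "~7 cache lines per 32KB chunk" estimate
quoted above, the arithmetic can be sketched as follows. The constants
are assumptions, not taken from this thread: 4 KiB base pages, a 64-byte
struct page and 64-byte cache lines, as is common on x86-64. The helper
name is made up for illustration.

```c
/*
 * Assumed values (hypothetical, config dependent): 4 KiB base pages,
 * 64-byte struct page, 64-byte cache lines.
 */
#define MODEL_PAGE_SIZE        4096u
#define MODEL_STRUCT_PAGE_SIZE 64u
#define MODEL_CACHE_LINE_SIZE  64u

/*
 * Extra cache lines of struct page metadata touched when the tail pages
 * of a chunk are checked in addition to the head page.
 */
unsigned int extra_cache_lines(unsigned int chunk_bytes)
{
	unsigned int nr_pages = chunk_bytes / MODEL_PAGE_SIZE;
	unsigned int tail_pages = nr_pages - 1; /* head is touched anyway */

	return tail_pages * MODEL_STRUCT_PAGE_SIZE / MODEL_CACHE_LINE_SIZE;
}
```

Under those assumptions a 32 KiB chunk spans 8 pages, so checking the
7 tail pages touches 7 additional struct page cache lines.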

How about the patch below? It's boot tested only.

--8<--
mm/page_alloc: check high-order pages for corruption during PCP operations

Eric Dumazet pointed out that commit 44042b449872 ("mm/page_alloc: allow
high-order pages to be stored on the per-cpu lists") only checks the head
page during PCP refill and allocation operations. This was an oversight
and all pages should be checked. This will incur a small performance
penalty but it's necessary for correctness.

Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/page_alloc.c | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3589febc6d31..2920344fa887 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2342,23 +2342,36 @@ static inline int check_new_page(struct page *page)
 	return 1;
 }
 
+static bool check_new_pages(struct page *page, unsigned int order)
+{
+	int i;
+	for (i = 0; i < (1 << order); i++) {
+		struct page *p = page + i;
+
+		if (unlikely(check_new_page(p)))
+			return true;
+	}
+
+	return false;
+}
+
 #ifdef CONFIG_DEBUG_VM
 /*
  * With DEBUG_VM enabled, order-0 pages are checked for expected state when
  * being allocated from pcp lists. With debug_pagealloc also enabled, they are
  * also checked when pcp lists are refilled from the free lists.
  */
-static inline bool check_pcp_refill(struct page *page)
+static inline bool check_pcp_refill(struct page *page, unsigned int order)
 {
 	if (debug_pagealloc_enabled_static())
-		return check_new_page(page);
+		return check_new_pages(page, order);
 	else
 		return false;
 }
 
-static inline bool check_new_pcp(struct page *page)
+static inline bool check_new_pcp(struct page *page, unsigned int order)
 {
-	return check_new_page(page);
+	return check_new_pages(page, order);
 }
 #else
 /*
@@ -2366,32 +2379,19 @@ static inline bool check_new_pcp(struct page *page)
  * when pcp lists are being refilled from the free lists. With debug_pagealloc
  * enabled, they are also checked when being allocated from the pcp lists.
  */
-static inline bool check_pcp_refill(struct page *page)
+static inline bool check_pcp_refill(struct page *page, unsigned int order)
 {
-	return check_new_page(page);
+	return check_new_pages(page, order);
 }
-static inline bool check_new_pcp(struct page *page)
+static inline bool check_new_pcp(struct page *page, unsigned int order)
 {
 	if (debug_pagealloc_enabled_static())
-		return check_new_page(page);
+		return check_new_pages(page, order);
 	else
 		return false;
 }
 #endif /* CONFIG_DEBUG_VM */
 
-static bool check_new_pages(struct page *page, unsigned int order)
-{
-	int i;
-	for (i = 0; i < (1 << order); i++) {
-		struct page *p = page + i;
-
-		if (unlikely(check_new_page(p)))
-			return true;
-	}
-
-	return false;
-}
-
 inline void post_alloc_hook(struct page *page, unsigned int order,
 				gfp_t gfp_flags)
 {
@@ -3037,7 +3037,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 		if (unlikely(page == NULL))
 			break;
 
-		if (unlikely(check_pcp_refill(page)))
+		if (unlikely(check_pcp_refill(page, order)))
 			continue;
 
 		/*
@@ -3641,7 +3641,7 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 		page = list_first_entry(list, struct page, lru);
 		list_del(&page->lru);
 		pcp->count -= 1 << order;
-	} while (check_new_pcp(page));
+	} while (check_new_pcp(page, order));
 
 	return page;
 }
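
To illustrate what the patch changes, here is a minimal userspace
sketch of a head-only check versus a check of every page in a
high-order block. It is a model only, not kernel code: struct
model_page and the model_* helpers are hypothetical stand-ins, and the
single "corrupted" flag replaces the real state inspected by
check_new_page().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: one flag instead of real state. */
struct model_page {
	bool corrupted;
};

/* Models check_new_page(): true means the page is in a bad state. */
bool model_check_page(const struct model_page *page)
{
	return page->corrupted;
}

/* Head-only check, as the PCP paths did before the patch. */
bool model_check_head(const struct model_page *page, unsigned int order)
{
	(void)order; /* the order was not taken into account */
	return model_check_page(page);
}

/* Full check, mirroring the loop in check_new_pages(). */
bool model_check_all(const struct model_page *page, unsigned int order)
{
	size_t nr_pages = (size_t)1 << order;

	for (size_t i = 0; i < nr_pages; i++) {
		if (model_check_page(page + i))
			return true;
	}

	return false;
}
```

For an order-3 block whose sixth page is marked corrupted, the head-only
model reports the block as clean while the full model flags it, which is
the gap Eric pointed out.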


