From: Prathu Baronia <prathubaronia2011@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Prathu Baronia <prathu.baronia@oneplus.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
chintan.pandya@oneplus.com,
"glider@google.com" <glider@google.com>,
Andrey Konovalov <andreyknvl@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org
Subject: [PATCH 1/1] mm: Optimizing hugepage zeroing in arm64
Date: Thu, 21 Jan 2021 22:21:51 +0530
Message-ID: <20210121165153.17828-2-prathu.baronia@oneplus.com>
In-Reply-To: <20210121165153.17828-1-prathu.baronia@oneplus.com>

In !HIGHMEM configurations, especially on 64-bit architectures, pages do
not need a temporary mapping. k(map|unmap)_atomic() therefore amounts to
nothing more than a few barrier() calls. For a 2MB hugepage,
clear_huge_page() calls each of them 512 times, once to map and once to
unmap every subpage, which adds up to 2048 barrier calls in total. This
is worth optimizing: simply getting the VADDR from the page does the job.
We profiled clear_huge_page() using ftrace and observed a 62% reduction
in its mean time (349.5 us down to 131 us).
Setup:-
The data below was collected on Qualcomm's SM7250 SoC with THP enabled
(kernel v4.19.113), with only CPU-0 (Cortex-A55) and CPU-7 (Cortex-A76)
switched on and set to max frequency, and with DDR set to the perf governor.
FTRACE Data:-
Base data:-
Number of iterations: 48
Mean of allocation time: 349.5 us
std deviation: 74.5 us
v1 data:-
Number of iterations: 48
Mean of allocation time: 131 us
std deviation: 32.7 us
The following simple userspace experiment, which allocates 100MB (BUF_SZ)
of pages and writes to each of them, gave us good insight: we observed an
improvement of 42% in allocation and writing timings.
-------------------------------------------------------------
Test code snippet
-------------------------------------------------------------
clock_start();
buf = malloc(BUF_SZ); /* Allocate 100 MB of memory */
for (i = 0; i < BUF_SZ_PAGES; i++) {
	/* Touch one int per page so that every page gets faulted in */
	*((int *)(buf + (i * PAGE_SIZE))) = 1;
}
clock_end();
-------------------------------------------------------------
Malloc test timings for 100MB anon allocation:-
Base data:-
Number of iterations: 100
Mean of allocation time: 31831 us
std deviation: 4286 us
v1 data:-
Number of iterations: 100
Mean of allocation time: 18193 us
std deviation: 4915 us
Reported-by: Chintan Pandya <chintan.pandya@oneplus.com>
Signed-off-by: Prathu Baronia <prathu.baronia@oneplus.com>
---
arch/arm64/include/asm/page.h | 3 +++
arch/arm64/mm/copypage.c | 8 ++++++++
2 files changed, 11 insertions(+)
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..8f9d005a11bb 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -35,6 +35,9 @@ void copy_highpage(struct page *to, struct page *from);
#define clear_user_page(page, vaddr, pg) clear_page(page)
#define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
+#define clear_user_highpage clear_user_highpage
+void clear_user_highpage(struct page *page, unsigned long vaddr);
+
typedef struct page *pgtable_t;
extern int pfn_valid(unsigned long);
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index b5447e53cd73..7f5943c6fc12 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -44,3 +44,11 @@ void copy_user_highpage(struct page *to, struct page *from,
flush_dcache_page(to);
}
EXPORT_SYMBOL_GPL(copy_user_highpage);
+
+inline void clear_user_highpage(struct page *page, unsigned long vaddr)
+{
+ void *addr = page_address(page);
+
+ clear_user_page(addr, vaddr, page);
+}
+EXPORT_SYMBOL_GPL(clear_user_highpage);
--
2.17.1