From: Peter Zijlstra <peterz@infradead.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>,
	Will Deacon <will@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, eranian@google.com, ak@linux.intel.com,
	dave.hansen@intel.com, kirill.shutemov@linux.intel.com,
	benh@kernel.crashing.org, paulus@samba.org,
	David Miller <davem@davemloft.net>,
	vbabka@suse.cz
Subject: Re: [PATCH V9 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE
Date: Wed, 11 Nov 2020 18:22:53 +0100
Message-ID: <20201111172253.GG2628@hirez.programming.kicks-ass.net>
In-Reply-To: <20201111163848.GU17076@casper.infradead.org>

On Wed, Nov 11, 2020 at 04:38:48PM +0000, Matthew Wilcox wrote:
> On Wed, Nov 11, 2020 at 04:57:24PM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 11, 2020 at 03:30:22PM +0000, Matthew Wilcox wrote:
> > > This confuses me.  Why only special-case hugetlbfs pages here?  Should
> > > they really be treated differently from THP?  If you want to consider
> > > that we might be mapping a page that's twice as big as a PUD entry and
> > > this is only half of it, then the simple way is:
> > > 
> > > 	if (pud_leaf(pud)) {
> > > #ifdef pud_page
> > > 		page = compound_head(pud_page(*pud));
> > > 		return page_size(page);
> > 
> > Also; this is 'wrong'. The purpose of this function is to return the
> > hardware TLB size of a given address. The above will return the compound
> > size, for any random compound page, which can be compound for myriad reasons.
> 
> Oh, then the whole thing is overly-complicated.  This should just be
> 
> 	if (pud_leaf(pud))
> 		return PUD_SIZE;

But that doesn't handle non-pagetable aligned hugetlb sizes. Granted,
that's unlikely at the PUD level, but why be inconsistent?

So we really want:

	if (p*d_leaf(p*d)) {
		if (!'special') {
			page = p*d_page(p*d);
			if (PageHuge(page))
				return page_size(compound_head(page));
		}
		return P*D_SIZE;
	}

That gets us:

  - regular page-table aligned large-pages
  - 'funny' hugetlb sizes

The only thing it doesn't get us is kernel usage of 'funny' sizes,
which is why that function is weak (arm64, powerpc32, sparc64 have funny
sizes and at the very least arm64 uses them for kernel mappings too).
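
As a rough illustration of the per-level logic above, here is a small
userspace C model. It is only a sketch: struct model_entry,
model_level_size() and their fields are hypothetical stand-ins for the
p*d_leaf(), 'special', PageHuge() and page_size(compound_head()) checks,
not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one page-table entry; not a kernel type. */
struct model_entry {
	bool leaf;              /* models p*d_leaf(): entry maps memory directly */
	bool special;           /* models the 'special' check (no struct page to inspect) */
	bool hugetlb;           /* models PageHuge() on the mapped page */
	uint64_t compound_size; /* models page_size(compound_head(page)) */
};

/*
 * Return the modelled hardware page size for one page-table level,
 * or 0 when the entry is not a leaf and the caller has to descend
 * to the next level.
 */
static uint64_t model_level_size(const struct model_entry *e, uint64_t level_size)
{
	assert(level_size != 0);

	if (!e->leaf)
		return 0;

	/* Only hugetlb pages may report a 'funny' (non-level) size. */
	if (!e->special && e->hugetlb)
		return e->compound_size;

	/* Everything else, THP included, reports the level's own size. */
	return level_size;
}
```

In this model a hugetlb leaf with a compound_size of 64K (the
contiguous-PTE hugetlb size on arm64 with 4K pages) reports 64K rather
than the 4K level size, while a special or non-hugetlb leaf reports
whatever level size is passed in.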

Now, when you add !PMD THP sizes (presumably for architectures that have
'funny' sizes, otherwise what's the point), then you get to add '||
PageTransHuge()' to the above PageHuge() (and fix PageTransHuge() to
actually do what it claims it does).

Arguably we could fix arm64 with something like the below, but then, I'd
have to audit powerpc32 and sparc64 again to see if I can make that work
for them too -- not today.
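
The patch below relies on a generic fallback for pte_cont(), so that
architectures without a contiguous hint compile the extra test away.
A minimal userspace sketch of that pattern follows; model_pte_t,
MODEL_PTE_CONT, MODEL_ARCH_HAS_CONT and model_pte_size() are made-up
names for illustration only, and the bit position is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE type and contiguous-hint bit; not kernel definitions. */
typedef struct { uint64_t val; } model_pte_t;
#define MODEL_PTE_CONT	(1ULL << 52)

/* An "architecture" with a contiguous hint would provide its own test. */
#ifdef MODEL_ARCH_HAS_CONT
#define model_pte_cont(pte)	(((pte).val & MODEL_PTE_CONT) != 0)
#endif

/* Generic fallback: without an arch definition the hint is never set. */
#ifndef model_pte_cont
#define model_pte_cont(pte)	(false)
#endif

/*
 * Models the changed condition: report the compound size for hugetlb
 * pages or contiguous mappings, otherwise the base page size.
 */
static uint64_t model_pte_size(model_pte_t pte, bool hugetlb,
			       uint64_t compound_size, uint64_t base_size)
{
	if (hugetlb || model_pte_cont(pte))
		return compound_size;
	return base_size;
}
```

Built without MODEL_ARCH_HAS_CONT, the hint bit is ignored and only the
hugetlb flag selects the compound size; defining MODEL_ARCH_HAS_CONT
before the fallback makes the same call site honour the bit.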

---

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7003,6 +7003,10 @@ static u64 perf_virt_to_phys(u64 virt)
 
 #ifdef CONFIG_MMU
 
+#ifndef pte_cont
+#define pte_cont(pte)	(false)
+#endif
+
 /*
  * Return the MMU page size of a given virtual address.
  *
@@ -7077,7 +7081,7 @@ __weak u64 arch_perf_get_page_size(struc
 
 	if (!pte_devmap(pte) && !pte_special(pte)) {
 		page = pte_page(pte);
-		if (PageHuge(page)) {
+		if (PageHuge(page) || pte_cont(pte)) {
 			u64 size = page_size(compound_head(page));
 			pte_unmap(ptep);
 			return size;
