From: Nicholas Piggin <npiggin@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au
Subject: Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE
Date: Wed, 04 Aug 2021 15:14:22 +1000
Message-ID: <1628053302.0qclx0xcj9.astroid@bobo.none>
In-Reply-To: <20210803143725.615186-1-aneesh.kumar@linux.ibm.com>
Excerpts from Aneesh Kumar K.V's message of August 4, 2021 12:37 am:
> With shared mapping, even though we are unmapping a large range, the kernel
> will force a TLB flush with ptl lock held to avoid the race mentioned in
> commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts")
> This results in the kernel issuing a high number of TLB flushes even for a large
> range. This can be improved by making sure the kernel switches to a PID-based flush if the
> kernel is unmapping a 2M range.
It would be good to have a bit more description here.
In any patch that changes a heuristic like this, I would like to see
some justification or reasoning that could be refuted or used as a
supporting argument if we ever wanted to change the heuristic later.
Ideally with some of the obvious downsides listed as well.
This "improves" things here, but what if it hurts things elsewhere? How
would we come in later and decide to change it back?
THP flushes for example, I think now they'll do PID flushes (if they
have to be broadcast, which they will tend to be when khugepaged does
them). So now that might increase jitter for THP and cause it to be a
loss for more workloads.
So where do you notice this? What's the benefit?
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> arch/powerpc/mm/book3s64/radix_tlb.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
> index aefc100d79a7..21d0f098e43b 100644
> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
> @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
> * invalidating a full PID, so it has a far lower threshold to change from
> * individual page flushes to full-pid flushes.
> */
> -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
> +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
> static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
>
> static inline void __radix__flush_tlb_range(struct mm_struct *mm,
> @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
> if (fullmm)
> flush_pid = true;
> else if (type == FLUSH_TYPE_GLOBAL)
> - flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> + flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
Arguably >= is nicer than > here, but this shouldn't be in the same
patch as the value change.
> else
> flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
And it should change every comparison to be consistent, including the
local ceiling check just above. Although I'm not sure it's worth changing
at all, even though I highly doubt any administrator would be tweaking this.
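To make the combined effect of the two hunks concrete, here is a minimal userspace sketch, not kernel code. It assumes 64K base pages (a page shift of 16) and a 2M PMD_SIZE; the `example_*` names are hypothetical and exist only for this illustration. It compares the old global check (ceiling 33 with `>`) against the patched one (ceiling 32 with `>=`).

```c
#include <stdbool.h>

/*
 * Assumptions for illustration only: 64K base pages and a 2M PMD_SIZE.
 * None of these names exist in the kernel.
 */
#define EXAMPLE_PAGE_SHIFT 16
#define EXAMPLE_PMD_SIZE   (2UL << 20)

/* Number of base pages covered by the range [start, end). */
static unsigned long example_nr_pages(unsigned long start, unsigned long end)
{
	return (end - start) >> EXAMPLE_PAGE_SHIFT;
}

/* Check before the patch: ceiling of 33, strict comparison. */
static bool example_flush_pid_old(unsigned long nr_pages)
{
	return nr_pages > 33;
}

/* Check after the patch: ceiling of 32, inclusive comparison. */
static bool example_flush_pid_new(unsigned long nr_pages)
{
	return nr_pages >= 32;
}
```

Under those assumptions a 2M range is exactly 32 pages, so the old check keeps per-page flushes while the patched one switches to a PID flush. The value change and the operator change each lower the effective threshold by one page (from 34 pages to 32), which is part of why they would be easier to evaluate as separate patches.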
Thanks,
Nick
> /*
> @@ -1335,7 +1335,7 @@ static void __radix__flush_tlb_range_psize(struct mm_struct *mm,
> if (fullmm)
> flush_pid = true;
> else if (type == FLUSH_TYPE_GLOBAL)
> - flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> + flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
> else
> flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
>
> @@ -1505,7 +1505,7 @@ void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
> continue;
>
> nr_pages = (end - start) >> def->shift;
> - flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> + flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>
> /*
> * If the number of pages spanning the range is above
> --
> 2.31.1
>
>