From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
	Puvichakravarthy Ramachandran <puvichakravarthy@in.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com
Subject: Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE
Date: Thu, 12 Aug 2021 18:50:11 +0530
Message-ID: <c157f9c9-d340-24f7-1aa0-40bbd4e1386e@linux.ibm.com>
In-Reply-To: <87fsven7yv.fsf@mpe.ellerman.id.au>

On 8/12/21 6:19 PM, Michael Ellerman wrote:
> "Puvichakravarthy Ramachandran" <puvichakravarthy@in.ibm.com> writes:
>>> With a shared mapping, even though we are unmapping a large range, the kernel
>>> will force a TLB flush with the ptl lock held to avoid the race mentioned in
>>> commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts").
>>> This results in the kernel issuing a high number of TLB flushes even for a large
>>> range. This can be improved by making sure the kernel switches to a PID-based
>>> flush when it is unmapping a range of 2M (PMD_SIZE) or more.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>>   arch/powerpc/mm/book3s64/radix_tlb.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
>>> index aefc100d79a7..21d0f098e43b 100644
>>> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
>>> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
>>> @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
>>>    * invalidating a full PID, so it has a far lower threshold to change from
>>>    * individual page flushes to full-pid flushes.
>>>    */
>>> -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
>>> +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
>>>   static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
>>>
>>>   static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>>> @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>>>        if (fullmm)
>>>                flush_pid = true;
>>>        else if (type == FLUSH_TYPE_GLOBAL)
>>> -             flush_pid = nr_pages > tlb_single_page_flush_ceiling;
>>> +             flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>>>        else
>>>                flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
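
To make the effect of the two hunks above concrete: with a 64K base page size
(the usual configuration here, and an assumption for this sketch), a PMD-sized
2MB unmap covers exactly 32 pages. Under the old check (default ceiling 33,
strict '>') such an unmap stays on per-page tlbies; under the new check
(default ceiling 32, '>=') it is upgraded to a single PID flush. A minimal,
illustrative user-space sketch of just that decision (not kernel code):

#include <stdbool.h>
#include <stdio.h>

/* Illustration only: with 64K pages (assumption), PMD_SIZE / PAGE_SIZE == 32. */
#define PMD_NR_PAGES	32UL

/* Old behaviour of the FLUSH_TYPE_GLOBAL branch: default ceiling 33, strict '>'. */
static bool flush_pid_old(unsigned long nr_pages)
{
	return nr_pages > 33;
}

/* New behaviour: default ceiling 32 and '>=', so a 2MB unmap gets a PID flush. */
static bool flush_pid_new(unsigned long nr_pages)
{
	return nr_pages >= 32;
}

int main(void)
{
	printf("2MB unmap (32 pages): old=%d new=%d\n",
	       flush_pid_old(PMD_NR_PAGES), flush_pid_new(PMD_NR_PAGES));
	return 0;
}
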
>>
>> Additional details on the test environment: this was tested on a 2-node,
>> 8-socket Power10 system. The LPAR had 105 cores and spanned all of the sockets.
>>
>> # perf stat -I 1000 -a -e cycles,instructions -e
>> "{cpu/config=0x030008,name=PM_EXEC_STALL/}" -e
>> "{cpu/config=0x02E01C,name=PM_EXEC_STALL_TLBIE/}" ./tlbie -i 10 -c 1  -t 1
>>   Rate of work: = 176
>> #           time             counts unit events
>>       1.029206442         4198594519      cycles
>>       1.029206442         2458254252      instructions              # 0.59 insn per cycle
>>       1.029206442         3004031488      PM_EXEC_STALL
>>       1.029206442         1798186036      PM_EXEC_STALL_TLBIE
>>   Rate of work: = 181
>>       2.054288539         4183883450      cycles
>>       2.054288539         2472178171      instructions              # 0.59 insn per cycle
>>       2.054288539         3014609313      PM_EXEC_STALL
>>       2.054288539         1797851642      PM_EXEC_STALL_TLBIE
>>   Rate of work: = 180
>>       3.078306883         4171250717      cycles
>>       3.078306883         2468341094      instructions              # 0.59 insn per cycle
>>       3.078306883         2993036205      PM_EXEC_STALL
>>       3.078306883         1798181890      PM_EXEC_STALL_TLBIE
>> .
>> .
>>
>> # cat /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling
>> 34
>>
>> # echo 32 > /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling
>>
>> # perf stat -I 1000 -a -e cycles,instructions -e
>> "{cpu/config=0x030008,name=PM_EXEC_STALL/}" -e
>> "{cpu/config=0x02E01C,name=PM_EXEC_STALL_TLBIE/}" ./tlbie -i 10 -c 1  -t 1
>>   Rate of work: = 313
>> #           time             counts unit events
>>       1.030310506         4206071143      cycles
>>       1.030310506         4314716958      instructions              # 1.03 insn per cycle
>>       1.030310506         2157762167      PM_EXEC_STALL
>>       1.030310506          110825573      PM_EXEC_STALL_TLBIE
>>   Rate of work: = 322
>>       2.056034068         4331745630      cycles
>>       2.056034068         4531658304      instructions              # 1.05 insn per cycle
>>       2.056034068         2288971361      PM_EXEC_STALL
>>       2.056034068          111267927      PM_EXEC_STALL_TLBIE
>>   Rate of work: = 321
>>       3.081216434         4327050349      cycles
>>       3.081216434         4379679508      instructions              # 1.01 insn per cycle
>>       3.081216434         2252602550      PM_EXEC_STALL
>>       3.081216434          110974887      PM_EXEC_STALL_TLBIE
> 
> 
> What is the tlbie test actually doing?
> 
> Does it do anything to measure the cost of refilling after the full mm flush?
> 


That is essentially

for (i = 0; i < iterations; i++) {
	shmat()    /* attach the shared memory segment */
	fillshm()  /* touch every page of it */
	shmdt()    /* detach it, which unmaps the range */
}

for a 256MB range. So it is not really a fair benchmark, because it does not
account for the impact of throwing away the full PID's translations. But even
then, the TLBIE stalls are an important data point?
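
For reference, a self-contained version of that loop might look like the
sketch below. It is an assumption-based reconstruction, not the actual tlbie
test: the segment size follows the 256MB figure above, while the iteration
count and error handling are arbitrary.

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SEG_SIZE	(256UL << 20)	/* 256MB, matching the range described above */

int main(void)
{
	int i, shmid = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);

	if (shmid < 0) {
		perror("shmget");
		return 1;
	}

	for (i = 0; i < 100; i++) {			/* iteration count is arbitrary */
		char *p = shmat(shmid, NULL, 0);	/* shmat(): map the segment */

		if (p == (void *)-1) {
			perror("shmat");
			break;
		}
		memset(p, 1, SEG_SIZE);			/* fillshm(): touch every page */
		shmdt(p);				/* shmdt(): unmap, driving the range flushes */
	}

	shmctl(shmid, IPC_RMID, NULL);
	return 0;
}

The detach path is where the forced flushes described in the commit message
occur, which is what the PM_EXEC_STALL_TLBIE counts above are picking up.
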

-aneesh



Thread overview: 10+ messages
2021-08-06  7:56 [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE Puvichakravarthy Ramachandran
2021-08-12 12:49 ` Michael Ellerman
2021-08-12 13:20   ` Aneesh Kumar K.V [this message]
2021-08-16  7:03     ` Michael Ellerman
  -- strict thread matches above, loose matches on Subject: below --
2021-08-06  5:22 Puvichakravarthy Ramachandran
2021-08-03 14:37 Aneesh Kumar K.V
2021-08-04  5:14 ` Nicholas Piggin
2021-08-04  6:39   ` Nicholas Piggin
2021-08-04  7:34     ` Peter Zijlstra
2021-08-04  6:59   ` Michael Ellerman
