From: Paolo Bonzini <pbonzini@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: dgibson@redhat.com, "qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Tom Musta <tommusta@gmail.com>
Subject: Re: [Qemu-devel] [PATCH 02/17] ppc: avoid excessive TLB flushing
Date: Thu, 28 Aug 2014 21:35:27 +0200
Message-ID: <53FF847F.7060400@redhat.com>
In-Reply-To: <CAFEAcA8JBpC3p9jKfRYB9q3N3JYqo5HD-v6Pe8rPdBh++nz3sQ@mail.gmail.com>

On 28/08/2014 19:30, Peter Maydell wrote:
> On 28 August 2014 18:14, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> PowerPC TCG flushes the TLB on every IR/DR change, which basically
>> means on every user<->kernel context switch.  Use the 6-element
>> TLB array as a cache, where each MMU index is mapped to a different
>> state of the IR/DR/PR/HV bits.
>>
>> This brings the number of TLB flushes down from ~900000 to ~50000
>> for starting up the Debian installer, which is in line with x86
>> and gives a ~10% performance improvement.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  cputlb.c                    | 19 +++++++++++++++++
>>  hw/ppc/spapr_hcall.c        |  6 +++++-
>>  include/exec/exec-all.h     |  5 +++++
>>  target-ppc/cpu.h            |  4 +++-
>>  target-ppc/excp_helper.c    |  6 +-----
>>  target-ppc/helper_regs.h    | 52 +++++++++++++++++++++++++++++++--------------
>>  target-ppc/translate_init.c |  5 +++++
>>  7 files changed, 74 insertions(+), 23 deletions(-)
>>
>> diff --git a/cputlb.c b/cputlb.c
>> index afd3705..17e1b03 100644
>> --- a/cputlb.c
>> +++ b/cputlb.c
>> @@ -67,6 +67,25 @@ void tlb_flush(CPUState *cpu, int flush_global)
>>      tlb_flush_count++;
>>  }
>>
>> +void tlb_flush_idx(CPUState *cpu, int mmu_idx)
>> +{
>> +    CPUArchState *env = cpu->env_ptr;
>> +
>> +#if defined(DEBUG_TLB)
>> +    printf("tlb_flush_idx %d:\n", mmu_idx);
>> +#endif
>> +    /* must reset current TB so that interrupts cannot modify the
>> +       links while we are modifying them */
>> +    cpu->current_tb = NULL;
>> +
>> +    memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[mmu_idx]));
>> +    memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
>> +
>> +    env->tlb_flush_addr = -1;
>> +    env->tlb_flush_mask = 0;
> 
> Isn't this going to break huge page support? Consider
> the case:
>  * set up huge pages in one TLB index (causing tlb_flush_addr
>    and tlb_flush_mask to be set to cover that range)
>  * switch to a different TLB index
>  * tlb_flush_idx() for that index (causing flush_addr/mask to
>    be reset)
>  * switch back to first TLB index
>  * do tlb_flush_page for an address inside the huge-page
>     region
> 
> I think you need the flush addr/mask to be per-TLB-index
> if you want this to work.

Yes, you're right.
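
To make the failure case concrete, here is a toy model of the per-index
bookkeeping suggested above.  It is only a sketch, not the cputlb.c code:
all toy_* names and the ToyTLBState layout are made up for illustration,
and unlike the real large-page tracking it records a single range per
index instead of widening the mask to cover several.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NB_MMU_MODES 6   /* matches the 6-element TLB array in the patch */

/* Hypothetical state: one large-page range per MMU index. */
typedef struct {
    uint64_t flush_addr[TOY_NB_MMU_MODES]; /* -1 means no large page recorded */
    uint64_t flush_mask[TOY_NB_MMU_MODES];
} ToyTLBState;

/* Flushing one index resets only that index's large-page range. */
static void toy_flush_idx(ToyTLBState *s, int idx)
{
    assert(idx >= 0 && idx < TOY_NB_MMU_MODES);
    s->flush_addr[idx] = (uint64_t)-1;
    s->flush_mask[idx] = 0;
}

/* Record a large page for one index; size must be a power of two. */
static void toy_add_large_page(ToyTLBState *s, int idx,
                               uint64_t vaddr, uint64_t size)
{
    uint64_t mask = ~(size - 1);

    assert(idx >= 0 && idx < TOY_NB_MMU_MODES);
    assert(size != 0 && (size & (size - 1)) == 0);
    s->flush_addr[idx] = vaddr & mask;
    s->flush_mask[idx] = mask;
}

/* Nonzero if a single-page flush at addr must be widened for this index. */
static int toy_in_large_page(const ToyTLBState *s, int idx, uint64_t addr)
{
    assert(idx >= 0 && idx < TOY_NB_MMU_MODES);
    return (addr & s->flush_mask[idx]) == s->flush_addr[idx];
}

/* The sequence from the review: huge page in index 0, flush index 1,
   then flush a page inside the huge-page region through index 0. */
static int toy_scenario(void)
{
    ToyTLBState s;

    for (int i = 0; i < TOY_NB_MMU_MODES; i++) {
        toy_flush_idx(&s, i);
    }
    toy_add_large_page(&s, 0, 0x200000, 0x200000);  /* 2 MiB page */
    toy_flush_idx(&s, 1);
    return toy_in_large_page(&s, 0, 0x210000);
}
```

In this model toy_scenario() returns 1, i.e. index 0 still knows about its
huge page after index 1 has been flushed.  With the single shared
tlb_flush_addr/tlb_flush_mask pair from the posted patch, the flush of
index 1 would have cleared that information.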

> Personally I would put the "implement new feature in core
> code" in a separate patch from "use new feature in PPC code".

This too, of course.  The patches aren't quite ready; I wanted to post
early because the speedups are very appealing to me.

> Does PPC hardware do lots of TLB flushes on user-kernel
> transitions, or does it have some sort of info in the TLB
> entry about whether it should match or not?

The IR and DR bits simply enable or disable paging for instructions and
data, respectively.  I suppose real hardware simply does not use the TLB
when paging is disabled.

IIRC each user->kernel transition disables paging, and then the kernel
can re-enable it (optionally only on data).  So the transition is
user->kernel unpaged->kernel paged, and the kernel unpaged->kernel paged
part is what triggers the TLB flush.  (Something like this---Alex
explained it to me a year ago when I asked why tlb_flush was always the
top function in the profile of qemu-system-ppc*).
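
As a rough illustration of why caching by bit state helps here, the toy
model below tags each of the six TLB slots with the IR/DR/PR/HV state it
was filled under and only flushes when no slot matches.  This is a sketch
under my own assumptions, not what the patch does line by line: the
TOY_* flag values are arbitrary (not the real MSR bit positions), and the
round-robin eviction is just the simplest policy to write down.

```c
#include <stdint.h>
#include <string.h>

#define TOY_NB_SLOTS 6

/* Illustrative flags only; these are not the real MSR bit positions. */
enum { TOY_IR = 1, TOY_DR = 2, TOY_PR = 4, TOY_HV = 8 };

typedef struct {
    uint8_t tag[TOY_NB_SLOTS];    /* bit state each slot was filled under */
    uint8_t valid[TOY_NB_SLOTS];
    int next_victim;              /* round-robin eviction pointer */
    int flushes;                  /* number of per-slot flushes so far */
} ToySlotCache;

/* Return the slot to use for this bit state, flushing one slot on a miss. */
static int toy_select_slot(ToySlotCache *c, uint8_t state)
{
    for (int i = 0; i < TOY_NB_SLOTS; i++) {
        if (c->valid[i] && c->tag[i] == state) {
            return i;             /* hit: reuse the slot, no flush needed */
        }
    }

    int victim = c->next_victim;  /* miss: recycle one slot */
    c->next_victim = (victim + 1) % TOY_NB_SLOTS;
    c->tag[victim] = state;
    c->valid[victim] = 1;
    c->flushes++;                 /* only this one slot is flushed */
    return victim;
}

/* Count flushes over repeated user -> kernel unpaged -> kernel paged trips. */
static int toy_count_flushes(int round_trips)
{
    ToySlotCache c;

    memset(&c, 0, sizeof(c));
    for (int i = 0; i < round_trips; i++) {
        toy_select_slot(&c, TOY_IR | TOY_DR | TOY_PR);  /* user */
        toy_select_slot(&c, 0);                         /* kernel, unpaged */
        toy_select_slot(&c, TOY_IR | TOY_DR);           /* kernel, paged */
    }
    return c.flushes;
}
```

In this model toy_count_flushes(1000) is 3: each of the three states
misses once and then keeps hitting its own slot, whereas flushing on every
IR/DR change would pay for a flush on each transition.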

Paolo
