* Re: [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-01-31 10:57 ` Peter Maydell
0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2017-01-31 10:57 UTC (permalink / raw)
To: Alex Bennée
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, bamvor.zhangjian,
open list:ARM
On 27 January 2017 at 10:39, Alex Bennée <alex.bennee@linaro.org> wrote:
> Some helpers may trigger an immediate exit of the cpu_loop. If this
> happens the PC need to be rectified to ensure the restart will begin
> on the next instruction.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> target/arm/cpu.h | 3 ++-
> target/arm/translate-a64.c | 4 ++++
> target/arm/translate.c | 4 ++++
> 3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index f56a96c675..1b0670ae11 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
> #define ARM_CP_NZCV (ARM_CP_SPECIAL | (3 << 8))
> #define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | (4 << 8))
> #define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | (5 << 8))
> -#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
> +#define ARM_CP_EXIT_PC (ARM_CP_SPECIAL | (6 << 8))
> +#define ARM_LAST_SPECIAL ARM_CP_EXIT_PC
There's a comment above this list of defines that documents
what all the flags mean; can you add an entry to it for the
new flag?
> /* Used only as a terminator for ARMCPRegInfo lists */
> #define ARM_CP_SENTINEL 0xffff
> /* Mask of only the flag bits in a type field */
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 05162f335e..a3f37d8bec 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
> tcg_rt = cpu_reg(s, rt);
> gen_helper_dc_zva(cpu_env, tcg_rt);
> return;
> + case ARM_CP_EXIT_PC:
> + /* The helper may exit the cpu_loop so ensure PC is correct */
> + gen_a64_set_pc_im(s->pc);
> + break;
This will work, but it's a little odd because it breaks the
existing invariant that cp helpers never throw exceptions
(except in the access function).
Does single-stepping (of the emulated architectural
debug step, and gdbstub singlestep) work across one of
these instructions?
Should we also set dc->is_jmp to force ending the TB here?
This is probably a question answered in the rest of the series,
but why do we need the helper to be able to longjump out to the
top level? Can't we just have the helper do its work and then
end the TB with tcg_gen_exit_tb(0) so we return to the top level
loop in the normal way?
> default:
> break;
> }
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 444a24c2b6..7bd18cd25d 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
> gen_set_pc_im(s, s->pc);
> s->is_jmp = DISAS_WFI;
> return 0;
> + case ARM_CP_EXIT_PC:
> + /* The helper may exit the cpu_loop so ensure PC is correct */
> + gen_set_pc_im(s, s->pc);
> + break;
Do we also need to gen_set_condexec() ?
> default:
> break;
> }
> --
> 2.11.0
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-01-31 10:57 ` Peter Maydell
0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2017-01-31 10:57 UTC (permalink / raw)
To: Alex Bennée
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, bamvor.zhangjian,
open list:ARM
On 27 January 2017 at 10:39, Alex Bennée <alex.bennee@linaro.org> wrote:
> Some helpers may trigger an immediate exit of the cpu_loop. If this
> happens the PC need to be rectified to ensure the restart will begin
> on the next instruction.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> target/arm/cpu.h | 3 ++-
> target/arm/translate-a64.c | 4 ++++
> target/arm/translate.c | 4 ++++
> 3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index f56a96c675..1b0670ae11 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
> #define ARM_CP_NZCV (ARM_CP_SPECIAL | (3 << 8))
> #define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | (4 << 8))
> #define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | (5 << 8))
> -#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
> +#define ARM_CP_EXIT_PC (ARM_CP_SPECIAL | (6 << 8))
> +#define ARM_LAST_SPECIAL ARM_CP_EXIT_PC
There's a comment above this list of defines that documents
what all the flags mean; can you add an entry to it for the
new flag?
> /* Used only as a terminator for ARMCPRegInfo lists */
> #define ARM_CP_SENTINEL 0xffff
> /* Mask of only the flag bits in a type field */
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 05162f335e..a3f37d8bec 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
> tcg_rt = cpu_reg(s, rt);
> gen_helper_dc_zva(cpu_env, tcg_rt);
> return;
> + case ARM_CP_EXIT_PC:
> + /* The helper may exit the cpu_loop so ensure PC is correct */
> + gen_a64_set_pc_im(s->pc);
> + break;
This will work, but it's a little odd because it breaks the
existing invariant that cp helpers never throw exceptions
(except in the access function).
Does single-stepping (of the emulated architectural
debug step, and gdbstub singlestep) work across one of
these instructions?
Should we also set dc->is_jmp to force ending the TB here?
This is probably a question answered in the rest of the series,
but why do we need the helper to be able to longjump out to the
top level? Can't we just have the helper do its work and then
end the TB with tcg_gen_exit_tb(0) so we return to the top level
loop in the normal way?
> default:
> break;
> }
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 444a24c2b6..7bd18cd25d 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
> gen_set_pc_im(s, s->pc);
> s->is_jmp = DISAS_WFI;
> return 0;
> + case ARM_CP_EXIT_PC:
> + /* The helper may exit the cpu_loop so ensure PC is correct */
> + gen_set_pc_im(s, s->pc);
> + break;
Do we also need to gen_set_condexec() ?
> default:
> break;
> }
> --
> 2.11.0
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
2017-01-31 10:57 ` [Qemu-devel] " Peter Maydell
@ 2017-02-02 11:03 ` Alex Bennée
-1 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-02-02 11:03 UTC (permalink / raw)
To: Peter Maydell
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, bamvor.zhangjian,
open list:ARM
Peter Maydell <peter.maydell@linaro.org> writes:
> On 27 January 2017 at 10:39, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Some helpers may trigger an immediate exit of the cpu_loop. If this
>> happens the PC need to be rectified to ensure the restart will begin
>> on the next instruction.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>> target/arm/cpu.h | 3 ++-
>> target/arm/translate-a64.c | 4 ++++
>> target/arm/translate.c | 4 ++++
>> 3 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
>> index f56a96c675..1b0670ae11 100644
>> --- a/target/arm/cpu.h
>> +++ b/target/arm/cpu.h
>> @@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
>> #define ARM_CP_NZCV (ARM_CP_SPECIAL | (3 << 8))
>> #define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | (4 << 8))
>> #define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | (5 << 8))
>> -#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
>> +#define ARM_CP_EXIT_PC (ARM_CP_SPECIAL | (6 << 8))
>> +#define ARM_LAST_SPECIAL ARM_CP_EXIT_PC
>
> There's a comment above this list of defines that documents
> what all the flags mean; can you add an entry to it for the
> new flag?
Sure - given your comment bellow maybe AP_CP_SYNC_REGS is a better name?
>
>> /* Used only as a terminator for ARMCPRegInfo lists */
>> #define ARM_CP_SENTINEL 0xffff
>> /* Mask of only the flag bits in a type field */
>> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
>> index 05162f335e..a3f37d8bec 100644
>> --- a/target/arm/translate-a64.c
>> +++ b/target/arm/translate-a64.c
>> @@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>> tcg_rt = cpu_reg(s, rt);
>> gen_helper_dc_zva(cpu_env, tcg_rt);
>> return;
>> + case ARM_CP_EXIT_PC:
>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>> + gen_a64_set_pc_im(s->pc);
>> + break;
>
> This will work, but it's a little odd because it breaks the
> existing invariant that cp helpers never throw exceptions
> (except in the access function).
We don't throw an exception but we do exit the CPU loop which has work
waiting for it.
>
> Does single-stepping (of the emulated architectural
> debug step, and gdbstub singlestep) work across one of
> these instructions?
I'll have to test but I don't see why not. The instruction is fully
executed we just ensure we have exited the run loop to process the flush
before we get to the next instruction/
> Should we also set dc->is_jmp to force ending the TB here?
Probably - there is no reason to keep translating as the next
instruction will be in its own block.
> This is probably a question answered in the rest of the series,
> but why do we need the helper to be able to longjump out to the
> top level? Can't we just have the helper do its work and then
> end the TB with tcg_gen_exit_tb(0) so we return to the top level
> loop in the normal way?
Well I guess this is a philosophical question. The cputlb API is
offering the guarantee that when an *_all_cpus_synced() flush is done
everything will be complete with respect to all vCPUS. This is reliant
on the source vCPU executing an exclusive safe work which ensures all
other vCPUs have halted and therefor will have run their safe work
before returning to execution.
If ARM wanted to it could call the *_all_cpus() variant, schedule its
own exclusive safe work (a null function - as cputlb will have scheduled
the flush) and exit the TB in the usual way. In fact this is the
mechanism ARM could use if it wanted to defer the sync point to a later
DMB instruction.
I haven't implemented it yet as the flush stuff only comes up high in
the perf runs with my aggressive TLB flush microbenchmarks.
However I'm wary of having a _synched() variant which will only work
correctly if the guest also does a bunch of other steps.
>
>> default:
>> break;
>> }
>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>> index 444a24c2b6..7bd18cd25d 100644
>> --- a/target/arm/translate.c
>> +++ b/target/arm/translate.c
>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>> gen_set_pc_im(s, s->pc);
>> s->is_jmp = DISAS_WFI;
>> return 0;
>> + case ARM_CP_EXIT_PC:
>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>> + gen_set_pc_im(s, s->pc);
>> + break;
>
> Do we also need to gen_set_condexec() ?
Do we? This isn't an exception so we don't need to resolve the condition
flags as long as there is enough information preserved so the next TB
can resolve if it needs to.
I'm afraid my knowledge of the condition code is "it is deferred until
it is needed as it is an expensive calculation". I'll dig into the
implementation.
>
>> default:
>> break;
>> }
>> --
>> 2.11.0
>
> thanks
> -- PMM
--
Alex Bennée
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-02-02 11:03 ` Alex Bennée
0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-02-02 11:03 UTC (permalink / raw)
To: Peter Maydell
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, bamvor.zhangjian,
open list:ARM
Peter Maydell <peter.maydell@linaro.org> writes:
> On 27 January 2017 at 10:39, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Some helpers may trigger an immediate exit of the cpu_loop. If this
>> happens the PC need to be rectified to ensure the restart will begin
>> on the next instruction.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>> target/arm/cpu.h | 3 ++-
>> target/arm/translate-a64.c | 4 ++++
>> target/arm/translate.c | 4 ++++
>> 3 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
>> index f56a96c675..1b0670ae11 100644
>> --- a/target/arm/cpu.h
>> +++ b/target/arm/cpu.h
>> @@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
>> #define ARM_CP_NZCV (ARM_CP_SPECIAL | (3 << 8))
>> #define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | (4 << 8))
>> #define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | (5 << 8))
>> -#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
>> +#define ARM_CP_EXIT_PC (ARM_CP_SPECIAL | (6 << 8))
>> +#define ARM_LAST_SPECIAL ARM_CP_EXIT_PC
>
> There's a comment above this list of defines that documents
> what all the flags mean; can you add an entry to it for the
> new flag?
Sure - given your comment bellow maybe AP_CP_SYNC_REGS is a better name?
>
>> /* Used only as a terminator for ARMCPRegInfo lists */
>> #define ARM_CP_SENTINEL 0xffff
>> /* Mask of only the flag bits in a type field */
>> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
>> index 05162f335e..a3f37d8bec 100644
>> --- a/target/arm/translate-a64.c
>> +++ b/target/arm/translate-a64.c
>> @@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>> tcg_rt = cpu_reg(s, rt);
>> gen_helper_dc_zva(cpu_env, tcg_rt);
>> return;
>> + case ARM_CP_EXIT_PC:
>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>> + gen_a64_set_pc_im(s->pc);
>> + break;
>
> This will work, but it's a little odd because it breaks the
> existing invariant that cp helpers never throw exceptions
> (except in the access function).
We don't throw an exception but we do exit the CPU loop which has work
waiting for it.
>
> Does single-stepping (of the emulated architectural
> debug step, and gdbstub singlestep) work across one of
> these instructions?
I'll have to test but I don't see why not. The instruction is fully
executed we just ensure we have exited the run loop to process the flush
before we get to the next instruction/
> Should we also set dc->is_jmp to force ending the TB here?
Probably - there is no reason to keep translating as the next
instruction will be in its own block.
> This is probably a question answered in the rest of the series,
> but why do we need the helper to be able to longjump out to the
> top level? Can't we just have the helper do its work and then
> end the TB with tcg_gen_exit_tb(0) so we return to the top level
> loop in the normal way?
Well I guess this is a philosophical question. The cputlb API is
offering the guarantee that when an *_all_cpus_synced() flush is done
everything will be complete with respect to all vCPUS. This is reliant
on the source vCPU executing an exclusive safe work which ensures all
other vCPUs have halted and therefor will have run their safe work
before returning to execution.
If ARM wanted to it could call the *_all_cpus() variant, schedule its
own exclusive safe work (a null function - as cputlb will have scheduled
the flush) and exit the TB in the usual way. In fact this is the
mechanism ARM could use if it wanted to defer the sync point to a later
DMB instruction.
I haven't implemented it yet as the flush stuff only comes up high in
the perf runs with my aggressive TLB flush microbenchmarks.
However I'm wary of having a _synched() variant which will only work
correctly if the guest also does a bunch of other steps.
>
>> default:
>> break;
>> }
>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>> index 444a24c2b6..7bd18cd25d 100644
>> --- a/target/arm/translate.c
>> +++ b/target/arm/translate.c
>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>> gen_set_pc_im(s, s->pc);
>> s->is_jmp = DISAS_WFI;
>> return 0;
>> + case ARM_CP_EXIT_PC:
>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>> + gen_set_pc_im(s, s->pc);
>> + break;
>
> Do we also need to gen_set_condexec() ?
Do we? This isn't an exception so we don't need to resolve the condition
flags as long as there is enough information preserved so the next TB
can resolve if it needs to.
I'm afraid my knowledge of the condition code is "it is deferred until
it is needed as it is an expensive calculation". I'll dig into the
implementation.
>
>> default:
>> break;
>> }
>> --
>> 2.11.0
>
> thanks
> -- PMM
--
Alex Bennée
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
2017-02-02 11:03 ` [Qemu-devel] " Alex Bennée
@ 2017-02-02 11:31 ` Peter Maydell
-1 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2017-02-02 11:31 UTC (permalink / raw)
To: Alex Bennée
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
On 2 February 2017 at 11:03, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>> Does single-stepping (of the emulated architectural
>> debug step, and gdbstub singlestep) work across one of
>> these instructions?
>
> I'll have to test but I don't see why not. The instruction is fully
> executed we just ensure we have exited the run loop to process the flush
> before we get to the next instruction/
The reason I ask is that the single-stepping code path involves
doing some work at the tail end of the translate:
if (unlikely(cs->singlestep_enabled || dc->ss_active)
&& dc->is_jmp != DISAS_EXC) {
/* do some stuff */
}
The other things that jump out of the normal code flow are:
* exceptions (where we don't want to do finished-the-step
work anyway as the insn hasn't executed)
* SWI (hopefully we single step SWI right but maybe not)
* YIELD, WFE (which are special cased so that they do the
actual work only at the end of the gen_intermediate_code
function and only if not single-stepping, so they're
no-ops on singlestep)
You've introduced a new item to this list which isn't
handled by the singlestep code.
>> This is probably a question answered in the rest of the series,
>> but why do we need the helper to be able to longjump out to the
>> top level? Can't we just have the helper do its work and then
>> end the TB with tcg_gen_exit_tb(0) so we return to the top level
>> loop in the normal way?
>
> Well I guess this is a philosophical question. The cputlb API is
> offering the guarantee that when an *_all_cpus_synced() flush is done
> everything will be complete with respect to all vCPUS. This is reliant
> on the source vCPU executing an exclusive safe work which ensures all
> other vCPUs have halted and therefor will have run their safe work
> before returning to execution.
>
> If ARM wanted to it could call the *_all_cpus() variant, schedule its
> own exclusive safe work (a null function - as cputlb will have scheduled
> the flush) and exit the TB in the usual way. In fact this is the
> mechanism ARM could use if it wanted to defer the sync point to a later
> DMB instruction.
>
> I haven't implemented it yet as the flush stuff only comes up high in
> the perf runs with my aggressive TLB flush microbenchmarks.
>
> However I'm wary of having a _synched() variant which will only work
> correctly if the guest also does a bunch of other steps.
Well, with the implementation as it is you need to do a bunch
of extra steps to handle all the corner cases (condexec,
single stepping) that would be handled for you if you exited
the TB in the normal way rather than longjumping out of it...
IME longjumping out should be reserved for "we don't want to
continue executing whatever other generated code we have after
this" situations. Here we know definitely what we're going to
want to do, so it would be better to generate code that
arranged to leave the TB in the usual way.
>>> default:
>>> break;
>>> }
>>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>>> index 444a24c2b6..7bd18cd25d 100644
>>> --- a/target/arm/translate.c
>>> +++ b/target/arm/translate.c
>>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>>> gen_set_pc_im(s, s->pc);
>>> s->is_jmp = DISAS_WFI;
>>> return 0;
>>> + case ARM_CP_EXIT_PC:
>>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>>> + gen_set_pc_im(s, s->pc);
>>> + break;
>>
>> Do we also need to gen_set_condexec() ?
>
> Do we? This isn't an exception so we don't need to resolve the condition
> flags as long as there is enough information preserved so the next TB
> can resolve if it needs to.
Your longjump is effectively skipping the normal "end of the TB" code,
which is what usually does the set_condexec for you. At the end of a
TB the expectation is that everything's been sync'd back to the CPU
state structure.
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [Qemu-devel] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-02-02 11:31 ` Peter Maydell
0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2017-02-02 11:31 UTC (permalink / raw)
To: Alex Bennée
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
On 2 February 2017 at 11:03, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>> Does single-stepping (of the emulated architectural
>> debug step, and gdbstub singlestep) work across one of
>> these instructions?
>
> I'll have to test but I don't see why not. The instruction is fully
> executed we just ensure we have exited the run loop to process the flush
> before we get to the next instruction/
The reason I ask is that the single-stepping code path involves
doing some work at the tail end of the translate:
if (unlikely(cs->singlestep_enabled || dc->ss_active)
&& dc->is_jmp != DISAS_EXC) {
/* do some stuff */
}
The other things that jump out of the normal code flow are:
* exceptions (where we don't want to do finished-the-step
work anyway as the insn hasn't executed)
* SWI (hopefully we single step SWI right but maybe not)
* YIELD, WFE (which are special cased so that they do the
actual work only at the end of the gen_intermediate_code
function and only if not single-stepping, so they're
no-ops on singlestep)
You've introduced a new item to this list which isn't
handled by the singlestep code.
>> This is probably a question answered in the rest of the series,
>> but why do we need the helper to be able to longjump out to the
>> top level? Can't we just have the helper do its work and then
>> end the TB with tcg_gen_exit_tb(0) so we return to the top level
>> loop in the normal way?
>
> Well I guess this is a philosophical question. The cputlb API is
> offering the guarantee that when an *_all_cpus_synced() flush is done
> everything will be complete with respect to all vCPUS. This is reliant
> on the source vCPU executing an exclusive safe work which ensures all
> other vCPUs have halted and therefor will have run their safe work
> before returning to execution.
>
> If ARM wanted to it could call the *_all_cpus() variant, schedule its
> own exclusive safe work (a null function - as cputlb will have scheduled
> the flush) and exit the TB in the usual way. In fact this is the
> mechanism ARM could use if it wanted to defer the sync point to a later
> DMB instruction.
>
> I haven't implemented it yet as the flush stuff only comes up high in
> the perf runs with my aggressive TLB flush microbenchmarks.
>
> However I'm wary of having a _synched() variant which will only work
> correctly if the guest also does a bunch of other steps.
Well, with the implementation as it is you need to do a bunch
of extra steps to handle all the corner cases (condexec,
single stepping) that would be handled for you if you exited
the TB in the normal way rather than longjumping out of it...
IME longjumping out should be reserved for "we don't want to
continue executing whatever other generated code we have after
this" situations. Here we know definitely what we're going to
want to do, so it would be better to generate code that
arranged to leave the TB in the usual way.
>>> default:
>>> break;
>>> }
>>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>>> index 444a24c2b6..7bd18cd25d 100644
>>> --- a/target/arm/translate.c
>>> +++ b/target/arm/translate.c
>>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>>> gen_set_pc_im(s, s->pc);
>>> s->is_jmp = DISAS_WFI;
>>> return 0;
>>> + case ARM_CP_EXIT_PC:
>>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>>> + gen_set_pc_im(s, s->pc);
>>> + break;
>>
>> Do we also need to gen_set_condexec() ?
>
> Do we? This isn't an exception so we don't need to resolve the condition
> flags as long as there is enough information preserved so the next TB
> can resolve if it needs to.
Your longjump is effectively skipping the normal "end of the TB" code,
which is what usually does the set_condexec for you. At the end of a
TB the expectation is that everything's been sync'd back to the CPU
state structure.
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
2017-02-02 11:31 ` [Qemu-devel] " Peter Maydell
@ 2017-02-02 12:17 ` Alex Bennée
-1 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-02-02 12:17 UTC (permalink / raw)
To: Peter Maydell
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
Peter Maydell <peter.maydell@linaro.org> writes:
> On 2 February 2017 at 11:03, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>> Does single-stepping (of the emulated architectural
>>> debug step, and gdbstub singlestep) work across one of
>>> these instructions?
>>
>> I'll have to test but I don't see why not. The instruction is fully
>> executed we just ensure we have exited the run loop to process the flush
>> before we get to the next instruction/
>
> The reason I ask is that the single-stepping code path involves
> doing some work at the tail end of the translate:
>
> if (unlikely(cs->singlestep_enabled || dc->ss_active)
> && dc->is_jmp != DISAS_EXC) {
> /* do some stuff */
> }
>
> The other things that jump out of the normal code flow are:
> * exceptions (where we don't want to do finished-the-step
> work anyway as the insn hasn't executed)
> * SWI (hopefully we single step SWI right but maybe not)
> * YIELD, WFE (which are special cased so that they do the
> actual work only at the end of the gen_intermediate_code
> function and only if not single-stepping, so they're
> no-ops on singlestep)
>
> You've introduced a new item to this list which isn't
> handled by the singlestep code.
>
>>> This is probably a question answered in the rest of the series,
>>> but why do we need the helper to be able to longjump out to the
>>> top level? Can't we just have the helper do its work and then
>>> end the TB with tcg_gen_exit_tb(0) so we return to the top level
>>> loop in the normal way?
>>
>> Well I guess this is a philosophical question. The cputlb API is
>> offering the guarantee that when an *_all_cpus_synced() flush is done
>> everything will be complete with respect to all vCPUS. This is reliant
>> on the source vCPU executing an exclusive safe work which ensures all
>> other vCPUs have halted and therefor will have run their safe work
>> before returning to execution.
>>
>> If ARM wanted to it could call the *_all_cpus() variant, schedule its
>> own exclusive safe work (a null function - as cputlb will have scheduled
>> the flush) and exit the TB in the usual way. In fact this is the
>> mechanism ARM could use if it wanted to defer the sync point to a later
>> DMB instruction.
>>
>> I haven't implemented it yet as the flush stuff only comes up high in
>> the perf runs with my aggressive TLB flush microbenchmarks.
>>
>> However I'm wary of having a _synched() variant which will only work
>> correctly if the guest also does a bunch of other steps.
>
> Well, with the implementation as it is you need to do a bunch
> of extra steps to handle all the corner cases (condexec,
> single stepping) that would be handled for you if you exited
> the TB in the normal way rather than longjumping out of it...
> IME longjumping out should be reserved for "we don't want to
> continue executing whatever other generated code we have after
> this" situations. Here we know definitely what we're going to
> want to do, so it would be better to generate code that
> arranged to leave the TB in the usual way.
OK I can certainly see the logic in exiting the "clean" way. I guess it
really depends on how the other guests are going to handle the case. It
would be nice if there was some mechanism by which the cputlb code could
be sure whatever has just called a synched function really is going to
exit the loop.
Paolo, Richard,
Any ideas? Do the other guests have similar mechanisms?
This would mean removing the QEMU_NORETURN from the *_synched functions
(but keeping their scheduling of the safe work) and documenting the
guest translations should be exiting their TBs after this instruction.
>
>>>> default:
>>>> break;
>>>> }
>>>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>>>> index 444a24c2b6..7bd18cd25d 100644
>>>> --- a/target/arm/translate.c
>>>> +++ b/target/arm/translate.c
>>>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>>>> gen_set_pc_im(s, s->pc);
>>>> s->is_jmp = DISAS_WFI;
>>>> return 0;
>>>> + case ARM_CP_EXIT_PC:
>>>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>>>> + gen_set_pc_im(s, s->pc);
>>>> + break;
>>>
>>> Do we also need to gen_set_condexec() ?
>>
>> Do we? This isn't an exception so we don't need to resolve the condition
>> flags as long as there is enough information preserved so the next TB
>> can resolve if it needs to.
>
> Your longjump is effectively skipping the normal "end of the TB" code,
> which is what usually does the set_condexec for you. At the end of a
> TB the expectation is that everything's been sync'd back to the CPU
> state structure.
Hmm so as long as the tlb flush helpers don't set ARM_CP_SUPPRESS_TB_END
things should just work normally? Is shouldn't matter if the TB with the
flush is chained to a new TB as the exit_request test should fire before
any more state changing operations happen?
--
Alex Bennée
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [Qemu-devel] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-02-02 12:17 ` Alex Bennée
0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-02-02 12:17 UTC (permalink / raw)
To: Peter Maydell
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
Peter Maydell <peter.maydell@linaro.org> writes:
> On 2 February 2017 at 11:03, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>> Does single-stepping (of the emulated architectural
>>> debug step, and gdbstub singlestep) work across one of
>>> these instructions?
>>
>> I'll have to test but I don't see why not. The instruction is fully
>> executed we just ensure we have exited the run loop to process the flush
>> before we get to the next instruction/
>
> The reason I ask is that the single-stepping code path involves
> doing some work at the tail end of the translate:
>
> if (unlikely(cs->singlestep_enabled || dc->ss_active)
> && dc->is_jmp != DISAS_EXC) {
> /* do some stuff */
> }
>
> The other things that jump out of the normal code flow are:
> * exceptions (where we don't want to do finished-the-step
> work anyway as the insn hasn't executed)
> * SWI (hopefully we single step SWI right but maybe not)
> * YIELD, WFE (which are special cased so that they do the
> actual work only at the end of the gen_intermediate_code
> function and only if not single-stepping, so they're
> no-ops on singlestep)
>
> You've introduced a new item to this list which isn't
> handled by the singlestep code.
>
>>> This is probably a question answered in the rest of the series,
>>> but why do we need the helper to be able to longjump out to the
>>> top level? Can't we just have the helper do its work and then
>>> end the TB with tcg_gen_exit_tb(0) so we return to the top level
>>> loop in the normal way?
>>
>> Well I guess this is a philosophical question. The cputlb API is
>> offering the guarantee that when an *_all_cpus_synced() flush is done
>> everything will be complete with respect to all vCPUS. This is reliant
>> on the source vCPU executing an exclusive safe work which ensures all
>> other vCPUs have halted and therefor will have run their safe work
>> before returning to execution.
>>
>> If ARM wanted to it could call the *_all_cpus() variant, schedule its
>> own exclusive safe work (a null function - as cputlb will have scheduled
>> the flush) and exit the TB in the usual way. In fact this is the
>> mechanism ARM could use if it wanted to defer the sync point to a later
>> DMB instruction.
>>
>> I haven't implemented it yet as the flush stuff only comes up high in
>> the perf runs with my aggressive TLB flush microbenchmarks.
>>
>> However I'm wary of having a _synched() variant which will only work
>> correctly if the guest also does a bunch of other steps.
>
> Well, with the implementation as it is you need to do a bunch
> of extra steps to handle all the corner cases (condexec,
> single stepping) that would be handled for you if you exited
> the TB in the normal way rather than longjumping out of it...
> IME longjumping out should be reserved for "we don't want to
> continue executing whatever other generated code we have after
> this" situations. Here we know definitely what we're going to
> want to do, so it would be better to generate code that
> arranged to leave the TB in the usual way.
OK I can certainly see the logic in exiting the "clean" way. I guess it
really depends on how the other guests are going to handle the case. It
would be nice if there was some mechanism by which the cputlb code could
be sure whatever has just called a synched function really is going to
exit the loop.
Paolo, Richard,
Any ideas? Do the other guests have similar mechanisms?
This would mean removing the QEMU_NORETURN from the *_synched functions
(but keeping their scheduling of the safe work) and documenting the
guest translations should be exiting their TBs after this instruction.
>
>>>> default:
>>>> break;
>>>> }
>>>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>>>> index 444a24c2b6..7bd18cd25d 100644
>>>> --- a/target/arm/translate.c
>>>> +++ b/target/arm/translate.c
>>>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>>>> gen_set_pc_im(s, s->pc);
>>>> s->is_jmp = DISAS_WFI;
>>>> return 0;
>>>> + case ARM_CP_EXIT_PC:
>>>> + /* The helper may exit the cpu_loop so ensure PC is correct */
>>>> + gen_set_pc_im(s, s->pc);
>>>> + break;
>>>
>>> Do we also need to gen_set_condexec() ?
>>
>> Do we? This isn't an exception so we don't need to resolve the condition
>> flags as long as there is enough information preserved so the next TB
>> can resolve if it needs to.
>
> Your longjump is effectively skipping the normal "end of the TB" code,
> which is what usually does the set_condexec for you. At the end of a
> TB the expectation is that everything's been sync'd back to the CPU
> state structure.
Hmm so as long as the tlb flush helpers don't set ARM_CP_SUPPRESS_TB_END
things should just work normally? Is shouldn't matter if the TB with the
flush is chained to a new TB as the exit_request test should fire before
any more state changing operations happen?
--
Alex Bennée
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
2017-02-02 12:17 ` [Qemu-devel] " Alex Bennée
@ 2017-02-02 12:48 ` Peter Maydell
-1 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2017-02-02 12:48 UTC (permalink / raw)
To: Alex Bennée
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
On 2 February 2017 at 12:17, Alex Bennée <alex.bennee@linaro.org> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
>> Your longjump is effectively skipping the normal "end of the TB" code,
>> which is what usually does the set_condexec for you. At the end of a
>> TB the expectation is that everything's been sync'd back to the CPU
>> state structure.
>
> Hmm so as long as the tlb flush helpers don't set ARM_CP_SUPPRESS_TB_END
> things should just work normally?
If they're register writes, yes -- we end the TB on writes but not
on reads (on the assumption that only writes might modify state that
requires a TB end). If the TLB instructions are reads then we can
add a "force TB end" flag.
> Is shouldn't matter if the TB with the
> flush is chained to a new TB as the exit_request test should fire before
> any more state changing operations happen?
I think so, but again if you know that you're always going to exit
the TB then there's no point setting it up so it might chain.
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-02-02 12:48 ` Peter Maydell
0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2017-02-02 12:48 UTC (permalink / raw)
To: Alex Bennée
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
On 2 February 2017 at 12:17, Alex Bennée <alex.bennee@linaro.org> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
>> Your longjump is effectively skipping the normal "end of the TB" code,
>> which is what usually does the set_condexec for you. At the end of a
>> TB the expectation is that everything's been sync'd back to the CPU
>> state structure.
>
> Hmm so as long as the tlb flush helpers don't set ARM_CP_SUPPRESS_TB_END
> things should just work normally?
If they're register writes, yes -- we end the TB on writes but not
on reads (on the assumption that only writes might modify state that
requires a TB end). If the TLB instructions are reads then we can
add a "force TB end" flag.
> Is shouldn't matter if the TB with the
> flush is chained to a new TB as the exit_request test should fire before
> any more state changing operations happen?
I think so, but again if you know that you're always going to exit
the TB then there's no point setting it up so it might chain.
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
2017-02-02 12:48 ` [Qemu-devel] " Peter Maydell
@ 2017-02-02 13:25 ` Alex Bennée
-1 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-02-02 13:25 UTC (permalink / raw)
To: Peter Maydell
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
Peter Maydell <peter.maydell@linaro.org> writes:
> On 2 February 2017 at 12:17, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>> Your longjump is effectively skipping the normal "end of the TB" code,
>>> which is what usually does the set_condexec for you. At the end of a
>>> TB the expectation is that everything's been sync'd back to the CPU
>>> state structure.
>>
>> Hmm so as long as the tlb flush helpers don't set ARM_CP_SUPPRESS_TB_END
>> things should just work normally?
>
> If they're register writes, yes -- we end the TB on writes but not
> on reads (on the assumption that only writes might modify state that
> requires a TB end). If the TLB instructions are reads then we can
> add a "force TB end" flag.
>
>> Is shouldn't matter if the TB with the
>> flush is chained to a new TB as the exit_request test should fire before
>> any more state changing operations happen?
>
> I think so, but again if you know that you're always going to exit
> the TB then there's no point setting it up so it might chain.
So setting s->is_jmp = DISAS_JUMP ensures that?
--
Alex Bennée
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
@ 2017-02-02 13:25 ` Alex Bennée
0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-02-02 13:25 UTC (permalink / raw)
To: Peter Maydell
Cc: MTTCG Devel, QEMU Developers, KONRAD Frédéric,
Alvise Rigo, Emilio G. Cota, Pranith Kumar, Nikunj A Dadhania,
Mark Burton, Paolo Bonzini, Jan Kiszka, Fedorov Sergey,
Richard Henderson, Claudio Fontana, Bamvor Zhang Jian,
open list:ARM
Peter Maydell <peter.maydell@linaro.org> writes:
> On 2 February 2017 at 12:17, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>> Your longjump is effectively skipping the normal "end of the TB" code,
>>> which is what usually does the set_condexec for you. At the end of a
>>> TB the expectation is that everything's been sync'd back to the CPU
>>> state structure.
>>
>> Hmm so as long as the tlb flush helpers don't set ARM_CP_SUPPRESS_TB_END
>> things should just work normally?
>
> If they're register writes, yes -- we end the TB on writes but not
> on reads (on the assumption that only writes might modify state that
> requires a TB end). If the TLB instructions are reads then we can
> add a "force TB end" flag.
>
>> Is shouldn't matter if the TB with the
>> flush is chained to a new TB as the exit_request test should fire before
>> any more state changing operations happen?
>
> I think so, but again if you know that you're always going to exit
> the TB then there's no point setting it up so it might chain.
So setting s->is_jmp = DISAS_JUMP ensures that?
--
Alex Bennée
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Qemu-devel] [PATCH v8 00/25] Remaining MTTCG Base patches and ARM enablement
@ 2017-01-27 10:38 Alex Bennée
2017-01-27 10:39 ` [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC Alex Bennée
0 siblings, 1 reply; 14+ messages in thread
From: Alex Bennée @ 2017-01-27 10:38 UTC (permalink / raw)
To: mttcg, qemu-devel, fred.konrad, a.rigo, cota, bobby.prani, nikunj
Cc: mark.burton, pbonzini, jan.kiszka, serge.fdrv, rth, peter.maydell,
claudio.fontana, bamvor.zhangjian, Alex Bennée
Hi,
All of the changes in this revision are addressing comments from v7
posted last week. A new pre-cursor patch was added:
cputlb and arm/sparc targets: convert mmuidx flushes from varg to
bitmap
To change the cputlb API to use a bitmap instead of vargs. This has
generated quite a bit of churn in the ARM target but it is pretty
mechanical.
I also folded the BQL irq protection patches from v7 into:
tcg: drop global lock during TCG code execution
This is required to keep the series bisectable although the BQL safety
is only really relevant to guests using MTTCG. I didn't think it was
worth making the asserts conditional on parallel_cpus although it does
mean this patch gets a little bigger.
The other big change was to:
cputlb: introduce tlb_flush_*_all_cpus[_synced]
Where I replaced the wait flag with an expanded set of API calls. The
*_synced variants which are marked as QEMU_NORETURN to make their
behaviour clear.
The series applies to origin/master as of today and you can find my
tree at:
https://github.com/stsquad/qemu/tree/mttcg/base-patches-v8
There is the usual collection of r-b tags and minor merge/re-base
fixes all documented in the --- sections of the commit messages.
In terms of merging strategy I would appreciate some thoughts. While I
think the series is ready to go I appreciate it is quite a chunk to
merge in one go. That said an early merge gives us plenty of time to
shake out any lingering issues before feature freeze.
I guess the key decider is that we are happy the design provides for
solutions for any other things we come across?
Cheers,
Alex
Alex Bennée (19):
docs: new design document multi-thread-tcg.txt
tcg: move TCG_MO/BAR types into own file
tcg: add kick timer for single-threaded vCPU emulation
tcg: rename tcg_current_cpu to tcg_current_rr_cpu
tcg: remove global exit_request
tcg: enable tb_lock() for SoftMMU
tcg: enable thread-per-vCPU
cputlb: add assert_cpu_is_self checks
cputlb: tweak qemu_ram_addr_from_host_nofail reporting
cputlb and arm/sparc targets: convert mmuidx flushes from varg to
bitmap
cputlb: add tlb_flush_by_mmuidx async routines
cputlb: atomically update tlb fields used by tlb_reset_dirty
cputlb: introduce tlb_flush_*_all_cpus[_synced]
target-arm/powerctl: defer cpu reset work to CPU context
target-arm: don't generate WFE/YIELD calls for MTTCG
target-arm/cpu.h: make ARM_CP defined consistent
target-arm: introduce ARM_CP_EXIT_PC
target-arm: ensure all cross vCPUs TLB flushes complete
tcg: enable MTTCG by default for ARM on x86 hosts
Jan Kiszka (1):
tcg: drop global lock during TCG code execution
KONRAD Frederic (2):
tcg: add options for enabling MTTCG
cputlb: introduce tlb_flush_* async work.
Pranith Kumar (3):
mttcg: translate-all: Enable locking debug in a debug build
mttcg: Add missing tb_lock/unlock() in cpu_exec_step()
tcg: handle EXCP_ATOMIC exception for system emulation
configure | 6 +
cpu-exec-common.c | 3 -
cpu-exec.c | 41 ++--
cpus.c | 343 ++++++++++++++++++++++++++-------
cputlb.c | 465 +++++++++++++++++++++++++++++++++++++--------
docs/multi-thread-tcg.txt | 350 ++++++++++++++++++++++++++++++++++
exec.c | 12 +-
hw/core/irq.c | 1 +
hw/i386/kvmvapic.c | 4 +-
hw/intc/arm_gicv3_cpuif.c | 3 +
hw/ppc/ppc.c | 16 +-
hw/ppc/spapr.c | 3 +
include/exec/cputlb.h | 2 -
include/exec/exec-all.h | 130 +++++++++++--
include/qom/cpu.h | 16 ++
include/sysemu/cpus.h | 2 +
memory.c | 2 +
qemu-options.hx | 20 ++
qom/cpu.c | 10 +
target/arm/arm-powerctl.c | 146 ++++++++------
target/arm/cpu.h | 73 ++++---
target/arm/helper.c | 385 ++++++++++++++++++-------------------
target/arm/op_helper.c | 50 ++++-
target/arm/translate-a64.c | 26 ++-
target/arm/translate.c | 46 +++--
target/arm/translate.h | 4 +-
target/i386/smm_helper.c | 7 +
target/s390x/misc_helper.c | 5 +-
target/sparc/ldst_helper.c | 8 +-
tcg/i386/tcg-target.h | 16 ++
tcg/tcg-mo.h | 45 +++++
tcg/tcg.h | 27 +--
translate-all.c | 66 ++-----
translate-common.c | 21 +-
vl.c | 49 ++++-
35 files changed, 1818 insertions(+), 585 deletions(-)
create mode 100644 docs/multi-thread-tcg.txt
create mode 100644 tcg/tcg-mo.h
--
2.11.0
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
2017-01-27 10:38 [Qemu-devel] [PATCH v8 00/25] Remaining MTTCG Base patches and ARM enablement Alex Bennée
@ 2017-01-27 10:39 ` Alex Bennée
0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-01-27 10:39 UTC (permalink / raw)
To: mttcg, qemu-devel, fred.konrad, a.rigo, cota, bobby.prani, nikunj
Cc: mark.burton, pbonzini, jan.kiszka, serge.fdrv, rth, peter.maydell,
claudio.fontana, bamvor.zhangjian, Alex Bennée,
open list:ARM
Some helpers may trigger an immediate exit of the cpu_loop. If this
happens the PC need to be rectified to ensure the restart will begin
on the next instruction.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/cpu.h | 3 ++-
target/arm/translate-a64.c | 4 ++++
target/arm/translate.c | 4 ++++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f56a96c675..1b0670ae11 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
#define ARM_CP_NZCV (ARM_CP_SPECIAL | (3 << 8))
#define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | (4 << 8))
#define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | (5 << 8))
-#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
+#define ARM_CP_EXIT_PC (ARM_CP_SPECIAL | (6 << 8))
+#define ARM_LAST_SPECIAL ARM_CP_EXIT_PC
/* Used only as a terminator for ARMCPRegInfo lists */
#define ARM_CP_SENTINEL 0xffff
/* Mask of only the flag bits in a type field */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 05162f335e..a3f37d8bec 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
tcg_rt = cpu_reg(s, rt);
gen_helper_dc_zva(cpu_env, tcg_rt);
return;
+ case ARM_CP_EXIT_PC:
+ /* The helper may exit the cpu_loop so ensure PC is correct */
+ gen_a64_set_pc_im(s->pc);
+ break;
default:
break;
}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 444a24c2b6..7bd18cd25d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
gen_set_pc_im(s, s->pc);
s->is_jmp = DISAS_WFI;
return 0;
+ case ARM_CP_EXIT_PC:
+ /* The helper may exit the cpu_loop so ensure PC is correct */
+ gen_set_pc_im(s, s->pc);
+ break;
default:
break;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
[parent not found: <20170127103505.18606-1-alex.bennee@linaro.org>]
* [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
[not found] <20170127103505.18606-1-alex.bennee@linaro.org>
@ 2017-01-27 10:35 ` Alex Bennée
0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2017-01-27 10:35 UTC (permalink / raw)
Cc: Alex Bennée, Peter Maydell, open list:ARM,
open list:All patches CC here
Some helpers may trigger an immediate exit of the cpu_loop. If this
happens the PC need to be rectified to ensure the restart will begin
on the next instruction.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/cpu.h | 3 ++-
target/arm/translate-a64.c | 4 ++++
target/arm/translate.c | 4 ++++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f56a96c675..1b0670ae11 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
#define ARM_CP_NZCV (ARM_CP_SPECIAL | (3 << 8))
#define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | (4 << 8))
#define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | (5 << 8))
-#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
+#define ARM_CP_EXIT_PC (ARM_CP_SPECIAL | (6 << 8))
+#define ARM_LAST_SPECIAL ARM_CP_EXIT_PC
/* Used only as a terminator for ARMCPRegInfo lists */
#define ARM_CP_SENTINEL 0xffff
/* Mask of only the flag bits in a type field */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 05162f335e..a3f37d8bec 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
tcg_rt = cpu_reg(s, rt);
gen_helper_dc_zva(cpu_env, tcg_rt);
return;
+ case ARM_CP_EXIT_PC:
+ /* The helper may exit the cpu_loop so ensure PC is correct */
+ gen_a64_set_pc_im(s->pc);
+ break;
default:
break;
}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 444a24c2b6..7bd18cd25d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
gen_set_pc_im(s, s->pc);
s->is_jmp = DISAS_WFI;
return 0;
+ case ARM_CP_EXIT_PC:
+ /* The helper may exit the cpu_loop so ensure PC is correct */
+ gen_set_pc_im(s, s->pc);
+ break;
default:
break;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-02-02 13:25 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-31 10:57 [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC Peter Maydell
2017-01-31 10:57 ` [Qemu-devel] " Peter Maydell
2017-02-02 11:03 ` Alex Bennée
2017-02-02 11:03 ` [Qemu-devel] " Alex Bennée
2017-02-02 11:31 ` Peter Maydell
2017-02-02 11:31 ` [Qemu-devel] " Peter Maydell
2017-02-02 12:17 ` Alex Bennée
2017-02-02 12:17 ` [Qemu-devel] " Alex Bennée
2017-02-02 12:48 ` Peter Maydell
2017-02-02 12:48 ` [Qemu-devel] " Peter Maydell
2017-02-02 13:25 ` Alex Bennée
2017-02-02 13:25 ` [Qemu-devel] " Alex Bennée
-- strict thread matches above, loose matches on Subject: below --
2017-01-27 10:38 [Qemu-devel] [PATCH v8 00/25] Remaining MTTCG Base patches and ARM enablement Alex Bennée
2017-01-27 10:39 ` [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC Alex Bennée
[not found] <20170127103505.18606-1-alex.bennee@linaro.org>
2017-01-27 10:35 ` Alex Bennée
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.