qemu-devel.nongnu.org archive mirror
* [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
@ 2022-07-12 19:25 Matheus Ferst
  2022-07-12 21:13 ` Daniel Henrique Barboza
From: Matheus Ferst @ 2022-07-12 19:25 UTC
  To: qemu-devel, qemu-ppc; +Cc: clg, danielhb413, david, groug, Matheus Ferst

When using "-machine none", env->tb_env is not allocated, causing the
segmentation fault reported in issue #85 (launchpad bug #811683). To
avoid this problem, check if the pointer != NULL before calling the
methods to print TBU/TBL/DECR.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
This patch fixes the reported problem, but may be an incomplete solution
since many other places dereference env->tb_env without checking for
NULL. AFAICS, "-machine none" is the only way to trigger this problem,
and I'm not familiar with the use-cases for this option.

Should we stop assuming env->tb_env != NULL and add checks everywhere?
Or should we find a way to provide Time Base/Decrementer for
"-machine none"?
---
 target/ppc/cpu_init.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 86ad28466a..7e96baac9f 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -7476,18 +7476,18 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
                  "%08x iidx %d didx %d\n",
                  env->msr, env->spr[SPR_HID0], env->hflags,
                  cpu_mmu_index(env, true), cpu_mmu_index(env, false));
-#if !defined(NO_TIMER_DUMP)
-    qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
+    if (env->tb_env) {
+        qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
 #if !defined(CONFIG_USER_ONLY)
-                 " DECR " TARGET_FMT_lu
+                     " DECR " TARGET_FMT_lu
 #endif
-                 "\n",
-                 cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
+                     "\n",
+                     cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
 #if !defined(CONFIG_USER_ONLY)
-                 , cpu_ppc_load_decr(env)
-#endif
-        );
+                     , cpu_ppc_load_decr(env)
 #endif
+            );
+    }
     for (i = 0; i < 32; i++) {
         if ((i & (RGPL - 1)) == 0) {
             qemu_fprintf(f, "GPR%02d", i);
-- 
2.25.1
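
The guard added above can be tried outside QEMU with a minimal, self-contained
sketch. ToyTimeBase, ToyCPUState and toy_dump_state are hypothetical stand-ins
invented for illustration; they are not QEMU's CPUPPCState, ppc_tb_t or
ppc_cpu_dump_state, and the real code reads the registers through
cpu_ppc_load_tbu/tbl/decr rather than from struct fields.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; QEMU's real types are different. */
typedef struct ToyTimeBase {
    uint64_t tb;      /* 64-bit time base value */
    uint32_t decr;    /* decrementer value */
} ToyTimeBase;

typedef struct ToyCPUState {
    uint64_t msr;
    ToyTimeBase *tb_env;  /* stays NULL when no machine allocates it */
} ToyCPUState;

/* Dump the toy state to f; returns the number of lines written. */
int toy_dump_state(const ToyCPUState *env, FILE *f)
{
    int lines = 0;

    assert(env != NULL && f != NULL);
    fprintf(f, "MSR %016" PRIx64 "\n", env->msr);
    lines++;

    /* Only touch the time base when it was actually set up. */
    if (env->tb_env) {
        uint32_t tbu = (uint32_t)(env->tb_env->tb >> 32);
        uint32_t tbl = (uint32_t)env->tb_env->tb;

        fprintf(f, "TB %08" PRIu32 " %08" PRIu32 " DECR %" PRIu32 "\n",
                tbu, tbl, env->tb_env->decr);
        lines++;
    }
    return lines;
}
```

Calling toy_dump_state() on a state whose tb_env is NULL prints only the MSR
line instead of dereferencing the pointer, which mirrors what the patch does
for the "-machine none" case.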




* Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
  2022-07-12 19:25 [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized Matheus Ferst
@ 2022-07-12 21:13 ` Daniel Henrique Barboza
  2022-07-13  2:21   ` David Gibson
From: Daniel Henrique Barboza @ 2022-07-12 21:13 UTC
  To: Matheus Ferst, qemu-devel, qemu-ppc; +Cc: clg, david, groug



On 7/12/22 16:25, Matheus Ferst wrote:
> When using "-machine none", env->tb_env is not allocated, causing the
> segmentation fault reported in issue #85 (launchpad bug #811683). To
> avoid this problem, check if the pointer != NULL before calling the
> methods to print TBU/TBL/DECR.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> This patch fixes the reported problem, but may be an incomplete solution
> since many other places dereference env->tb_env without checking for
> NULL. AFAICS, "-machine none" is the only way to trigger this problem,
> and I'm not familiar with the use-cases for this option.

The "none" machine type is mainly used by libvirt to do introspection
of the available options/capabilities of the QEMU binary. It starts a QEMU
process like the following:

./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
       -nographic -machine none,accel=kvm:tcg -daemonize

And then it uses QMP to probe the binary.

Aside from this libvirt usage I am not aware of anyone else using -machine
none extensively.


> 
> Should we stop assuming env->tb_env != NULL and add checks everywhere?
> Or should we find a way to provide Time Base/Decrementer for
> "-machine none"?
> ---

Are there other cases where env->tb_env can be NULL, aside from the case
reported in the bug?

I don't mind the bug fix, but I'm not fond of the idea of adding additional
checks because of this particular issue. I mean, the bug report uses the 'prep'
machine that Thomas removed years ago in b2ce76a0730. If there's no other
foreseeable problem that we care about with env->tb_env being NULL, IMO
let's fix the bug and move on.



Thanks,


Daniel




>   target/ppc/cpu_init.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 86ad28466a..7e96baac9f 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -7476,18 +7476,18 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
>                    "%08x iidx %d didx %d\n",
>                    env->msr, env->spr[SPR_HID0], env->hflags,
>                    cpu_mmu_index(env, true), cpu_mmu_index(env, false));
> -#if !defined(NO_TIMER_DUMP)
> -    qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
> +    if (env->tb_env) {
> +        qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
>   #if !defined(CONFIG_USER_ONLY)
> -                 " DECR " TARGET_FMT_lu
> +                     " DECR " TARGET_FMT_lu
>   #endif
> -                 "\n",
> -                 cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
> +                     "\n",
> +                     cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
>   #if !defined(CONFIG_USER_ONLY)
> -                 , cpu_ppc_load_decr(env)
> -#endif
> -        );
> +                     , cpu_ppc_load_decr(env)
>   #endif
> +            );
> +    }
>       for (i = 0; i < 32; i++) {
>           if ((i & (RGPL - 1)) == 0) {
>               qemu_fprintf(f, "GPR%02d", i);



* Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
  2022-07-12 21:13 ` Daniel Henrique Barboza
@ 2022-07-13  2:21   ` David Gibson
  2022-07-13 18:28     ` Matheus K. Ferst
From: David Gibson @ 2022-07-13  2:21 UTC
  To: Daniel Henrique Barboza; +Cc: Matheus Ferst, qemu-devel, qemu-ppc, clg, groug


On Tue, Jul 12, 2022 at 06:13:44PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 7/12/22 16:25, Matheus Ferst wrote:
> > When using "-machine none", env->tb_env is not allocated, causing the
> > segmentation fault reported in issue #85 (launchpad bug #811683). To
> > avoid this problem, check if the pointer != NULL before calling the
> > methods to print TBU/TBL/DECR.
> > 
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
> > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> > ---
> > This patch fixes the reported problem, but may be an incomplete solution
> > since many other places dereference env->tb_env without checking for
> > NULL. AFAICS, "-machine none" is the only way to trigger this problem,
> > and I'm not familiar with the use-cases for this option.
> 
> The "none" machine type is mainly used by libvirt to do introspection
> of the available options/capabilities of the QEMU binary. It starts a QEMU
> process like the following:
> 
> ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
>       -nographic -machine none,accel=kvm:tcg -daemonize
> 
> And then it uses QMP to probe the binary.
> 
> Aside from this libvirt usage I am not aware of anyone else using -machine
> none extensively.

Right.  -machine none basically cannot work as a real machine for
POWER (maybe some other CPUs as well).  At least the more modern POWER
CPUs simply cannot boot without a bunch of supporting board/system
level elements, and there's not really a sane way to encode those into
individual emulated devices at present (maybe ever).

One of those things is that POWER expects the timebases to be
synchronized across all CPUs in the system, which obviously can't be
done locally to a single CPU chip.  It requires system level
operations, which is why it's handled by the machine type.

[Example: a typical sequence which might be handled in hardware by
 low-level firmware would be to use machine-specific board-level
 registers to suspend the clock pulse to the CPUs which drives the
 timebase, then write the same value to the TB on each CPU, then
 (atomically) restart the clock pulse using board registers again]
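
The sequence in the bracketed example can be sketched as a toy model. This is
only an illustration under stated assumptions: ToyBoard, toy_board_tick and
toy_board_sync_timebases are hypothetical names, a single boolean stands in
for the board-level clock control registers, and none of it is taken from
QEMU or from real firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical board: one shared clock gate, one time base per CPU. */
typedef struct ToyBoard {
    bool clock_running;          /* stand-in for board clock registers */
    uint64_t tb[TOY_NR_CPUS];    /* per-CPU time base values */
} ToyBoard;

/* One clock pulse: every time base advances, but only if the clock runs. */
void toy_board_tick(ToyBoard *b)
{
    assert(b != NULL);
    if (!b->clock_running) {
        return;
    }
    for (size_t i = 0; i < TOY_NR_CPUS; i++) {
        b->tb[i]++;
    }
}

/* Suspend the clock, load the same value everywhere, restart the clock. */
void toy_board_sync_timebases(ToyBoard *b, uint64_t value)
{
    assert(b != NULL);
    b->clock_running = false;            /* no time base can advance now */
    for (size_t i = 0; i < TOY_NR_CPUS; i++) {
        b->tb[i] = value;
    }
    b->clock_running = true;             /* all CPUs resume together */
}
```

Because the clock gate lives in the board structure and not in any one CPU,
the sketch reflects why this step belongs to the machine type: after
toy_board_sync_timebases(), later ticks keep all per-CPU values equal.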
 
> > Should we stop assuming env->tb_env != NULL and add checks everywhere?
> > Or should we find a way to provide Time Base/Decrementer for
> > "-machine none"?
> > ---
> 
> Are there other cases where env->tb_env can be NULL, aside from the case
> reported in the bug?

If there are, I'd say that's a bug in the machine type.  Setting up
(and synchronizing) the timebase is part of the machine's job.

> I don't mind the bug fix, but I'm not fond of the idea of adding additional
> checks because of this particular issue. I mean, the bug report uses the 'prep'
> machine that Thomas removed years ago in b2ce76a0730. If there's no other
> foreseeable problem that we care about with env->tb_env being NULL, IMO
> let's fix the bug and move on.
> 
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> 
> 
> 
> >   target/ppc/cpu_init.c | 16 ++++++++--------
> >   1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> > index 86ad28466a..7e96baac9f 100644
> > --- a/target/ppc/cpu_init.c
> > +++ b/target/ppc/cpu_init.c
> > @@ -7476,18 +7476,18 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
> >                    "%08x iidx %d didx %d\n",
> >                    env->msr, env->spr[SPR_HID0], env->hflags,
> >                    cpu_mmu_index(env, true), cpu_mmu_index(env, false));
> > -#if !defined(NO_TIMER_DUMP)
> > -    qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
> > +    if (env->tb_env) {
> > +        qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
> >   #if !defined(CONFIG_USER_ONLY)
> > -                 " DECR " TARGET_FMT_lu
> > +                     " DECR " TARGET_FMT_lu
> >   #endif
> > -                 "\n",
> > -                 cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
> > +                     "\n",
> > +                     cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
> >   #if !defined(CONFIG_USER_ONLY)
> > -                 , cpu_ppc_load_decr(env)
> > -#endif
> > -        );
> > +                     , cpu_ppc_load_decr(env)
> >   #endif
> > +            );
> > +    }
> >       for (i = 0; i < 32; i++) {
> >           if ((i & (RGPL - 1)) == 0) {
> >               qemu_fprintf(f, "GPR%02d", i);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
  2022-07-13  2:21   ` David Gibson
@ 2022-07-13 18:28     ` Matheus K. Ferst
From: Matheus K. Ferst @ 2022-07-13 18:28 UTC
  To: David Gibson, Daniel Henrique Barboza; +Cc: qemu-devel, qemu-ppc, clg, groug

On 12/07/2022 23:21, David Gibson wrote:
> On Tue, Jul 12, 2022 at 06:13:44PM -0300, Daniel Henrique Barboza wrote:
>>
>>
>> On 7/12/22 16:25, Matheus Ferst wrote:
>>> When using "-machine none", env->tb_env is not allocated, causing the
>>> segmentation fault reported in issue #85 (launchpad bug #811683). To
>>> avoid this problem, check if the pointer != NULL before calling the
>>> methods to print TBU/TBL/DECR.
>>>
>>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
>>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
>>> ---
>>> This patch fixes the reported problem, but may be an incomplete solution
>>> since many other places dereference env->tb_env without checking for
>>> NULL. AFAICS, "-machine none" is the only way to trigger this problem,
>>> and I'm not familiar with the use-cases for this option.
>>
>> The "none" machine type is mainly used by libvirt to do introspection
>> of the available options/capabilities of the QEMU binary. It starts a QEMU
>> process like the following:
>>
>> ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
>>        -nographic -machine none,accel=kvm:tcg -daemonize
>>
>> And then it uses QMP to probe the binary.
>>
>> Aside from this libvirt usage I am not aware of anyone else using -machine
>> none extensively.
> 
> Right.  -machine none basically cannot work as a real machine for
> POWER (maybe some other CPUs as well).  At least the more modern POWER
> CPUs simply cannot boot without a bunch of supporting board/system
> level elements, and there's not really a sane way to encode those into
> individual emulated devices at present (maybe ever).
> 
> One of those things is that POWER expects the timebases to be
> synchronized across all CPUs in the system, which obviously can't be
> done locally to a single CPU chip.  It requires system level
> operations, which is why it's handled by the machine type.
> 
> [Example: a typical sequence which might be handled in hardware by
>   low-level firmware would be to use machine-specific board-level
>   registers to suspend the clock pulse to the CPUs which drives the
>   timebase, then write the same value to the TB on each CPU, then
>   (atomically) restart the clock pulse using board registers again]
>  

So I guess it's safe to assume that it's impossible to run code with 
"-machine none", and then there would be no reason to check for NULL in 
the mtspr/mfspr path, right?

>>> Should we stop assuming env->tb_env != NULL and add checks everywhere?
>>> Or should we find a way to provide Time Base/Decrementer for
>>> "-machine none"?
>>> ---
>>
>> Are there other cases where env->tb_env can be NULL, aside from the case
>> reported in the bug?
> 
> If there are, I'd say that's a bug in the machine type.  Setting up
> (and synchronizing) the timebase is part of the machine's job.
> 

With "-machine none", it seems that the only places where it could 
happen are:

i) Monitor code: there are some other places where env->tb_env is used, like 
monitor_get_tb{u,l} and monitor_get_decr, so commands like "p $tbu" or 
"p $decr" cause a segfault too.
ii) mtspr/mfspr: it shouldn't be a problem if it's not possible to run 
code without a machine.
iii) gdbstub: we're not reading or setting TB{U,L} from gdb, which may 
be an issue on its own, but not related to #85.

>> I don't mind the bug fix, but I'm not fond of the idea of adding additional
>> checks because of this particular issue. I mean, the bug report uses the 'prep'
>> machine that Thomas removed years ago in b2ce76a0730. If there's no other
>> foreseeable problem that we care about with env->tb_env being NULL, IMO
>> let's fix the bug and move on.
>>

I'll send a v2 fixing the other segfault in monitor, and then I guess we 
have a complete solution. Thanks Daniel and David for the feedback.

-- 
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Analista de Software
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>


