* [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
From: Matheus Ferst @ 2022-07-12 19:25 UTC
To: qemu-devel, qemu-ppc; +Cc: clg, danielhb413, david, groug, Matheus Ferst

When using "-machine none", env->tb_env is not allocated, causing the
segmentation fault reported in issue #85 (launchpad bug #811683). To
avoid this problem, check if the pointer != NULL before calling the
methods to print TBU/TBL/DECR.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
This patch fixes the reported problem, but may be an incomplete solution
since many other places dereference env->tb_env without checking for
NULL. AFAICS, "-machine none" is the only way to trigger this problem,
and I'm not familiar with the use-cases for this option.

Should we stop assuming env->tb_env != NULL and add checks everywhere?
Or should we find a way to provide Time Base/Decrementer for
"-machine none"?
---
 target/ppc/cpu_init.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 86ad28466a..7e96baac9f 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -7476,18 +7476,18 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
                  "%08x iidx %d didx %d\n",
                  env->msr, env->spr[SPR_HID0], env->hflags,
                  cpu_mmu_index(env, true), cpu_mmu_index(env, false));
-#if !defined(NO_TIMER_DUMP)
-    qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
+    if (env->tb_env) {
+        qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
 #if !defined(CONFIG_USER_ONLY)
-                 " DECR " TARGET_FMT_lu
+                     " DECR " TARGET_FMT_lu
 #endif
-                 "\n",
-                 cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
+                     "\n",
+                     cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
 #if !defined(CONFIG_USER_ONLY)
-                 , cpu_ppc_load_decr(env)
-#endif
-                 );
+                     , cpu_ppc_load_decr(env)
 #endif
+                     );
+    }
     for (i = 0; i < 32; i++) {
         if ((i & (RGPL - 1)) == 0) {
             qemu_fprintf(f, "GPR%02d", i);
-- 
2.25.1

* Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
From: Daniel Henrique Barboza @ 2022-07-12 21:13 UTC
To: Matheus Ferst, qemu-devel, qemu-ppc; +Cc: clg, david, groug

On 7/12/22 16:25, Matheus Ferst wrote:
> When using "-machine none", env->tb_env is not allocated, causing the
> segmentation fault reported in issue #85 (launchpad bug #811683). To
> avoid this problem, check if the pointer != NULL before calling the
> methods to print TBU/TBL/DECR.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> This patch fixes the reported problem, but may be an incomplete solution
> since many other places dereference env->tb_env without checking for
> NULL. AFAICS, "-machine none" is the only way to trigger this problem,
> and I'm not familiar with the use-cases for this option.

The "none" machine type is mainly used by libvirt to do introspection
of the available options/capabilities of the QEMU binary. It starts a QEMU
process like the following:

./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
  -nographic -machine none,accel=kvm:tcg -daemonize

And then it uses QMP to probe the binary.

Aside from this libvirt usage I am not aware of anyone else using -machine
none extensively.

> 
> Should we stop assuming env->tb_env != NULL and add checks everywhere?
> Or should we find a way to provide Time Base/Decrementer for
> "-machine none"?
> ---

Are there other cases where env->tb_env can be NULL, aside from the case
reported in the bug?

I don't mind the bug fix, but I'm not fond of the idea of adding additional
checks because of this particular issue. I mean, the bug is using the 'prep'
machine that Thomas removed years ago in b2ce76a0730. If there's no other
foreseeable problem that we care about with env->tb_env being NULL, IMO
let's fix the bug and move on.


Thanks,


Daniel


>  target/ppc/cpu_init.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 86ad28466a..7e96baac9f 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -7476,18 +7476,18 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
>                   "%08x iidx %d didx %d\n",
>                   env->msr, env->spr[SPR_HID0], env->hflags,
>                   cpu_mmu_index(env, true), cpu_mmu_index(env, false));
> -#if !defined(NO_TIMER_DUMP)
> -    qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
> +    if (env->tb_env) {
> +        qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
>  #if !defined(CONFIG_USER_ONLY)
> -                 " DECR " TARGET_FMT_lu
> +                     " DECR " TARGET_FMT_lu
>  #endif
> -                 "\n",
> -                 cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
> +                     "\n",
> +                     cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
>  #if !defined(CONFIG_USER_ONLY)
> -                 , cpu_ppc_load_decr(env)
> -#endif
> -                 );
> +                     , cpu_ppc_load_decr(env)
>  #endif
> +                     );
> +    }
>      for (i = 0; i < 32; i++) {
>          if ((i & (RGPL - 1)) == 0) {
>              qemu_fprintf(f, "GPR%02d", i);

* Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
From: David Gibson @ 2022-07-13 2:21 UTC
To: Daniel Henrique Barboza; +Cc: Matheus Ferst, qemu-devel, qemu-ppc, clg, groug

On Tue, Jul 12, 2022 at 06:13:44PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 7/12/22 16:25, Matheus Ferst wrote:
> > When using "-machine none", env->tb_env is not allocated, causing the
> > segmentation fault reported in issue #85 (launchpad bug #811683). To
> > avoid this problem, check if the pointer != NULL before calling the
> > methods to print TBU/TBL/DECR.
> > 
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
> > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> > ---
> > This patch fixes the reported problem, but may be an incomplete solution
> > since many other places dereference env->tb_env without checking for
> > NULL. AFAICS, "-machine none" is the only way to trigger this problem,
> > and I'm not familiar with the use-cases for this option.
> 
> The "none" machine type is mainly used by libvirt to do introspection
> of the available options/capabilities of the QEMU binary. It starts a QEMU
> process like the following:
> 
> ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
>   -nographic -machine none,accel=kvm:tcg -daemonize
> 
> And then it uses QMP to probe the binary.
> 
> Aside from this libvirt usage I am not aware of anyone else using -machine
> none extensively.

Right.  -machine none basically cannot work as a real machine for
POWER (maybe some other CPUs as well).  At least the more modern POWER
CPUs simply cannot boot without a bunch of supporting board/system
level elements, and there's not really a sane way to encode those into
individual emulated devices at present (maybe ever).

One of those things is that POWER expects the timebases to be
synchronized across all CPUs in the system, which obviously can't be
done locally to a single CPU chip.  It requires system level
operations, which is why it's handled by the machine type.

[Example: a typical sequence which might be handled in hardware by
 low-level firmware would be to use machine-specific board-level
 registers to suspend the clock pulse to the CPUs which drives the
 timebase, then write the same value to the TB on each CPU, then
 (atomically) restart the clock pulse using board registers again]

> > Should we stop assuming env->tb_env != NULL and add checks everywhere?
> > Or should we find a way to provide Time Base/Decrementer for
> > "-machine none"?
> > ---
> 
> Are there other cases where env->tb_env can be NULL, aside from the case
> reported in the bug?

If there are, I'd say that's a bug in the machine type.  Setting up
(and synchronizing) the timebase is part of the machine's job.

> I don't mind the bug fix, but I'm not fond of the idea of adding additional
> checks because of this particular issue. I mean, the bug is using the 'prep'
> machine that Thomas removed years ago in b2ce76a0730. If there's no other
> foreseeable problem that we care about with env->tb_env being NULL, IMO
> let's fix the bug and move on.
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> 
> >  target/ppc/cpu_init.c | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> > index 86ad28466a..7e96baac9f 100644
> > --- a/target/ppc/cpu_init.c
> > +++ b/target/ppc/cpu_init.c
> > @@ -7476,18 +7476,18 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
> >                   "%08x iidx %d didx %d\n",
> >                   env->msr, env->spr[SPR_HID0], env->hflags,
> >                   cpu_mmu_index(env, true), cpu_mmu_index(env, false));
> > -#if !defined(NO_TIMER_DUMP)
> > -    qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
> > +    if (env->tb_env) {
> > +        qemu_fprintf(f, "TB %08" PRIu32 " %08" PRIu64
> >  #if !defined(CONFIG_USER_ONLY)
> > -                 " DECR " TARGET_FMT_lu
> > +                     " DECR " TARGET_FMT_lu
> >  #endif
> > -                 "\n",
> > -                 cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
> > +                     "\n",
> > +                     cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)
> >  #if !defined(CONFIG_USER_ONLY)
> > -                 , cpu_ppc_load_decr(env)
> > -#endif
> > -                 );
> > +                     , cpu_ppc_load_decr(env)
> >  #endif
> > +                     );
> > +    }
> >      for (i = 0; i < 32; i++) {
> >          if ((i & (RGPL - 1)) == 0) {
> >              qemu_fprintf(f, "GPR%02d", i);
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

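The bracketed example in David's message (suspend the clock pulse, write
the same value to the TB on each CPU, restart the clock pulse) can be
sketched in C roughly as follows. This is a toy model under stated
assumptions, not QEMU code and not a description of any real board:
SketchCPU, SketchBoard, NUM_CPUS, board_clock_pulse and
board_sync_timebases are hypothetical names, and a plain boolean stands in
for the machine-specific board registers.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CPUS 4 /* hypothetical CPU count for the sketch */

/* Hypothetical stand-in for one CPU: only its timebase register. */
typedef struct {
    uint64_t tb;
} SketchCPU;

/* Hypothetical board: a single clock line drives every CPU's timebase. */
typedef struct {
    bool tb_clock_running; /* stands in for a board-level register */
    SketchCPU cpus[NUM_CPUS];
} SketchBoard;

/* One clock pulse: every timebase ticks, but only while the clock runs. */
static void board_clock_pulse(SketchBoard *board)
{
    if (!board->tb_clock_running) {
        return;
    }
    for (int i = 0; i < NUM_CPUS; i++) {
        board->cpus[i].tb++;
    }
}

/*
 * Board-level synchronization: stop the clock so no timebase can tick,
 * load the same value into every CPU, then restart the clock for all
 * CPUs at once.
 */
static void board_sync_timebases(SketchBoard *board, uint64_t value)
{
    board->tb_clock_running = false;
    for (int i = 0; i < NUM_CPUS; i++) {
        board->cpus[i].tb = value;
    }
    board->tb_clock_running = true;
}
```

The point of the sketch is only that the synchronization needs state owned
by the board (the shared clock), which is why a lone CPU under
"-machine none" has nothing to do it with.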
* Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
From: Matheus K. Ferst @ 2022-07-13 18:28 UTC
To: David Gibson, Daniel Henrique Barboza; +Cc: qemu-devel, qemu-ppc, clg, groug

On 12/07/2022 23:21, David Gibson wrote:
> On Tue, Jul 12, 2022 at 06:13:44PM -0300, Daniel Henrique Barboza wrote:
>>
>>
>> On 7/12/22 16:25, Matheus Ferst wrote:
>>> When using "-machine none", env->tb_env is not allocated, causing the
>>> segmentation fault reported in issue #85 (launchpad bug #811683). To
>>> avoid this problem, check if the pointer != NULL before calling the
>>> methods to print TBU/TBL/DECR.
>>>
>>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
>>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
>>> ---
>>> This patch fixes the reported problem, but may be an incomplete solution
>>> since many other places dereference env->tb_env without checking for
>>> NULL. AFAICS, "-machine none" is the only way to trigger this problem,
>>> and I'm not familiar with the use-cases for this option.
>>
>> The "none" machine type is mainly used by libvirt to do introspection
>> of the available options/capabilities of the QEMU binary. It starts a QEMU
>> process like the following:
>>
>> ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
>>   -nographic -machine none,accel=kvm:tcg -daemonize
>>
>> And then it uses QMP to probe the binary.
>>
>> Aside from this libvirt usage I am not aware of anyone else using -machine
>> none extensively.
> 
> Right.  -machine none basically cannot work as a real machine for
> POWER (maybe some other CPUs as well).  At least the more modern POWER
> CPUs simply cannot boot without a bunch of supporting board/system
> level elements, and there's not really a sane way to encode those into
> individual emulated devices at present (maybe ever).
> 
> One of those things is that POWER expects the timebases to be
> synchronized across all CPUs in the system, which obviously can't be
> done locally to a single CPU chip.  It requires system level
> operations, which is why it's handled by the machine type.
> 
> [Example: a typical sequence which might be handled in hardware by
>  low-level firmware would be to use machine-specific board-level
>  registers to suspend the clock pulse to the CPUs which drives the
>  timebase, then write the same value to the TB on each CPU, then
>  (atomically) restart the clock pulse using board registers again]
> 

So I guess it's safe to assume that it's impossible to run code with
"-machine none", and then there would be no reason to check for NULL in
the mtspr/mfspr path, right?

>>> Should we stop assuming env->tb_env != NULL and add checks everywhere?
>>> Or should we find a way to provide Time Base/Decrementer for
>>> "-machine none"?
>>> ---
>>
>> Are there other cases where env->tb_env can be NULL, aside from the case
>> reported in the bug?
> 
> If there are, I'd say that's a bug in the machine type.  Setting up
> (and synchronizing) the timebase is part of the machine's job.
> 

With "-machine none", it seems that the only places where it could
happen are:
i) Monitor code: there are some other places where env->tb_env is used,
like monitor_get_tb{u,l} and monitor_get_decr, so commands like "p $tbu"
or "p $decr" cause a segfault too.
ii) mtspr/mfspr: it shouldn't be a problem if it's not possible to run
code without a machine.
iii) gdbstub: we're not reading or setting TB{U,L} from gdb, which may
be an issue on its own, but not related to #85.

>> I don't mind the bug fix, but I'm not fond of the idea of adding additional
>> checks because of this particular issue. I mean, the bug is using the 'prep'
>> machine that Thomas removed years ago in b2ce76a0730. If there's no other
>> foreseeable problem that we care about with env->tb_env being NULL, IMO
>> let's fix the bug and move on.
>>

I'll send a v2 fixing the other segfault in the monitor, and then I guess
we have a complete solution.

Thanks Daniel and David for the feedback.

-- 
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
Legal Notice - Disclaimer <https://www.eldorado.org.br/disclaimer.html>

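The monitor fix mentioned in the last message is not part of this thread,
so the following is only a minimal sketch of how the same NULL guard could
look for a dump routine and a monitor-style getter. Everything here is an
assumption made for illustration: SketchTb and SketchEnv are hypothetical
stand-ins for the real timebase and CPU state, the sketch_* functions are
not QEMU APIs, and returning 0 when no timebase exists is one possible
choice rather than what v2 necessarily does.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the timebase state and the CPU env. */
typedef struct {
    uint64_t tb;
} SketchTb;

typedef struct {
    SketchTb *tb_env; /* NULL when no machine set up a timebase */
} SketchEnv;

/* Like the real loaders, these assume env->tb_env is non-NULL. */
static uint32_t sketch_load_tbu(const SketchEnv *env)
{
    return (uint32_t)(env->tb_env->tb >> 32);
}

static uint32_t sketch_load_tbl(const SketchEnv *env)
{
    return (uint32_t)env->tb_env->tb;
}

/* Monitor-style getter: report 0 instead of dereferencing NULL. */
static uint64_t sketch_monitor_get_tbu(const SketchEnv *env)
{
    return env->tb_env ? sketch_load_tbu(env) : 0;
}

/*
 * Dump-style routine: format the TB line only when a timebase exists.
 * Returns the snprintf() result, or 0 (with an empty string in buf)
 * when there is nothing to print.
 */
static int sketch_dump_tb(const SketchEnv *env, char *buf, size_t len)
{
    if (!env->tb_env) {
        if (len > 0) {
            buf[0] = '\0';
        }
        return 0;
    }
    return snprintf(buf, len, "TB %08" PRIu32 " %08" PRIu32 "\n",
                    sketch_load_tbu(env), sketch_load_tbl(env));
}
```

Guarding only at the user-facing entry points (dump and monitor) matches
the direction agreed in the thread: the mtspr/mfspr path stays unchecked
because no guest code can run without a machine.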