From: "Nicholas Piggin" <npiggin@gmail.com>
To: "Pavel Dovgalyuk" <pavel.dovgalyuk@ispras.ru>,
"Daniel Henrique Barboza" <danielhb413@gmail.com>
Cc: "Cédric Le Goater" <clg@kaod.org>,
"David Gibson" <david@gibson.dropbear.id.au>,
"Greg Kurz" <groug@kaod.org>,
"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
"Pavel Dovgalyuk" <pavel.dovgaluk@ispras.ru>,
"Paolo Bonzini" <pbonzini@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events
Date: Sun, 06 Aug 2023 21:46:25 +1000 [thread overview]
Message-ID: <CULFQXOOUWDB.3GMPJXRWAWSDW@wheely> (raw)
In-Reply-To: <3be75aa3-780d-2d4d-a68c-1f8d1d000ee8@ispras.ru>
On Fri Aug 4, 2023 at 6:50 PM AEST, Pavel Dovgalyuk wrote:
> BTW, there is a function qemu_register_reset_nosnapshotload that can be
> used in similar cases.
> Can you just use it without changing the code of the reset handler?
I didn't know that, thanks for pointing it out. I'll take a closer look
at it before reposting.
Thanks,
Nick
>
> On 26.07.2023 21:35, Nicholas Piggin wrote:
> > spapr_machine_reset gets a random number to populate the device-tree
> > rng seed with. When loading a snapshot for record-replay, the machine
> > is reset again, and that tries to consume the random event record
> > again, crashing due to inconsistent record
> >
> > Fix this by saving the seed to populate the device tree with, and
> > skipping the rng on snapshot load.
> >
> > Cc: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > hw/ppc/spapr.c | 12 +++++++++---
> > include/hw/ppc/spapr.h | 1 +
> > 2 files changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 7d84244f03..ecfbdb0030 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1022,7 +1022,6 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt, bool reset)
> > {
> > MachineState *machine = MACHINE(spapr);
> > SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
> > - uint8_t rng_seed[32];
> > int chosen;
> >
> > _FDT(chosen = fdt_add_subnode(fdt, 0, "chosen"));
> > @@ -1100,8 +1099,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt, bool reset)
> > spapr_dt_ov5_platform_support(spapr, fdt, chosen);
> > }
> >
> > - qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
> > - _FDT(fdt_setprop(fdt, chosen, "rng-seed", rng_seed, sizeof(rng_seed)));
> > + _FDT(fdt_setprop(fdt, chosen, "rng-seed", spapr->fdt_rng_seed, 32));
> >
> > _FDT(spapr_dt_ovec(fdt, chosen, spapr->ov5_cas, "ibm,architecture-vec-5"));
> > }
> > @@ -1654,6 +1652,14 @@ static void spapr_machine_reset(MachineState *machine, ShutdownCause reason)
> > void *fdt;
> > int rc;
> >
> > + if (reason != SHUTDOWN_CAUSE_SNAPSHOT_LOAD) {
> > + /*
> > + * Record-replay snapshot load must not consume random, this was
> > + * already replayed from initial machine reset.
> > + */
> > + qemu_guest_getrandom_nofail(spapr->fdt_rng_seed, 32);
> > + }
> > +
> > pef_kvm_reset(machine->cgs, &error_fatal);
> > spapr_caps_apply(spapr);
> >
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index f47e8419a5..f4bd204d86 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -204,6 +204,7 @@ struct SpaprMachineState {
> > uint32_t fdt_size;
> > uint32_t fdt_initial_size;
> > void *fdt_blob;
> > + uint8_t fdt_rng_seed[32];
> > long kernel_size;
> > bool kernel_le;
> > uint64_t kernel_addr;
next prev parent reply other threads:[~2023-08-06 11:47 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-26 18:35 [PATCH 0/7] ppc: record-replay fixes and enablement Nicholas Piggin
2023-07-26 18:35 ` [PATCH 1/7] target/ppc: Fix CPU reservation migration for record-replay Nicholas Piggin
2023-07-26 18:35 ` [PATCH 2/7] target/ppc: Fix timebase reset with record-replay Nicholas Piggin
2023-07-26 18:35 ` [PATCH 3/7] spapr: Fix machine reset deadlock from replay-record Nicholas Piggin
2023-07-26 18:35 ` [PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events Nicholas Piggin
2023-07-31 11:40 ` Pavel Dovgalyuk
2023-08-04 8:50 ` Pavel Dovgalyuk
2023-08-06 11:46 ` Nicholas Piggin [this message]
2023-08-08 3:09 ` Nicholas Piggin
2023-08-08 3:52 ` Pavel Dovgalyuk
2023-08-09 9:25 ` Nicholas Piggin
2023-07-26 18:35 ` [PATCH 5/7] tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount Nicholas Piggin
2023-07-31 11:41 ` Pavel Dovgalyuk
2023-07-26 18:35 ` [PATCH 6/7] tests/avocado: reverse-debugging cope with re-executing breakpoints Nicholas Piggin
2023-07-31 12:08 ` Pavel Dovgalyuk
2023-07-26 18:35 ` [PATCH 7/7] tests/avocado: ppc64 reverse debugging tests for pseries and powernv Nicholas Piggin
2023-07-31 12:09 ` Pavel Dovgalyuk
-- strict thread matches above, loose matches on Subject: below --
2023-06-23 12:57 [PATCH 0/7] ppc: fix larx migration, fix record-replay Nicholas Piggin
2023-06-23 12:57 ` [PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events Nicholas Piggin
2023-06-26 8:07 ` Pavel Dovgalyuk
2023-06-26 10:04 ` Nicholas Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CULFQXOOUWDB.3GMPJXRWAWSDW@wheely \
--to=npiggin@gmail.com \
--cc=clg@kaod.org \
--cc=danielhb413@gmail.com \
--cc=david@gibson.dropbear.id.au \
--cc=groug@kaod.org \
--cc=harshpb@linux.ibm.com \
--cc=pavel.dovgaluk@ispras.ru \
--cc=pavel.dovgalyuk@ispras.ru \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.