All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Nicholas Piggin" <npiggin@gmail.com>
To: "Pavel Dovgalyuk" <pavel.dovgalyuk@ispras.ru>, <qemu-devel@nongnu.org>
Cc: qemu-ppc@nongnu.org,
	"Daniel Henrique Barboza" <danielhb413@gmail.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"David Gibson" <david@gibson.dropbear.id.au>,
	"Greg Kurz" <groug@kaod.org>,
	"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
	"John Snow" <jsnow@redhat.com>, "Cleber Rosa" <crosa@redhat.com>,
	"Pavel Dovgalyuk" <pavel.dovgaluk@ispras.ru>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
	"Beraldo Leal" <bleal@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Richard Henderson" <richard.henderson@linaro.org>
Subject: Re: [PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events
Date: Mon, 26 Jun 2023 20:04:22 +1000	[thread overview]
Message-ID: <CTMHWGLNST28.1P1TPVQWO5LR7@wheely> (raw)
In-Reply-To: <b1b7ab0d-caf2-46ca-eed2-7cdc87c0b600@ispras.ru>

On Mon Jun 26, 2023 at 6:07 PM AEST, Pavel Dovgalyuk wrote:
> e500 has the same problem, I think, according to this issue: 
> https://gitlab.com/qemu-project/qemu/-/issues/1634

Same symptoms. e500 looks like it does the dt build in
machine_init_done notifier, though. Maybe I miss something.
I'll take a look.

>
> Btw, ARM virt platform rebuilds fdt only at initialization phase, not 
> when reset.

I was actually wondering about keeping the same rng-seed across resets
to make the code even simpler, but decided to keep behaviour unchanged.
That seems like one downside.

> Isn't this behavior correct? Shouldn't PPC platforms do the similar thing?

I believe spapr does this for an spapr "feature" that rebuilds the fdt
at runtime ("client-architecture-support"), so reset has to build the
original one again.

Thanks,
Nick

>
> On 23.06.2023 15:57, Nicholas Piggin wrote:
> > spapr_machine_reset gets a random number to populate the device-tree
> > rng seed with. When loading a snapshot for record-replay, the machine
> > is reset again, and that tries to consume the random event record
> > again, crashing due to inconsistent record
> > 
> > Fix this by saving the seed to populate the device tree with, and
> > skipping the rng on snapshot load.
> > 
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >   hw/ppc/spapr.c         | 12 +++++++++---
> >   include/hw/ppc/spapr.h |  1 +
> >   2 files changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index d290acfa95..55948f233f 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1017,7 +1017,6 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt, bool reset)
> >   {
> >       MachineState *machine = MACHINE(spapr);
> >       SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
> > -    uint8_t rng_seed[32];
> >       int chosen;
> >   
> >       _FDT(chosen = fdt_add_subnode(fdt, 0, "chosen"));
> > @@ -1095,8 +1094,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt, bool reset)
> >           spapr_dt_ov5_platform_support(spapr, fdt, chosen);
> >       }
> >   
> > -    qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
> > -    _FDT(fdt_setprop(fdt, chosen, "rng-seed", rng_seed, sizeof(rng_seed)));
> > +    _FDT(fdt_setprop(fdt, chosen, "rng-seed", spapr->fdt_rng_seed, 32));
> >   
> >       _FDT(spapr_dt_ovec(fdt, chosen, spapr->ov5_cas, "ibm,architecture-vec-5"));
> >   }
> > @@ -1649,6 +1647,14 @@ static void spapr_machine_reset(MachineState *machine, ShutdownCause reason)
> >       void *fdt;
> >       int rc;
> >   
> > +    if (reason != SHUTDOWN_CAUSE_SNAPSHOT_LOAD) {
> > +        /*
> > +         * Record-replay snapshot load must not consume random, this was
> > +         * already replayed from initial machine reset.
> > +         */
> > +        qemu_guest_getrandom_nofail(spapr->fdt_rng_seed, 32);
> > +    }
> > +
> >       pef_kvm_reset(machine->cgs, &error_fatal);
> >       spapr_caps_apply(spapr);
> >   
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index f47e8419a5..f4bd204d86 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -204,6 +204,7 @@ struct SpaprMachineState {
> >       uint32_t fdt_size;
> >       uint32_t fdt_initial_size;
> >       void *fdt_blob;
> > +    uint8_t fdt_rng_seed[32];
> >       long kernel_size;
> >       bool kernel_le;
> >       uint64_t kernel_addr;



  reply	other threads:[~2023-06-26 10:05 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-23 12:57 [PATCH 0/7] ppc: fix larx migration, fix record-replay Nicholas Piggin
2023-06-23 12:57 ` [PATCH 1/7] target/ppc: Fix CPU reservation migration for record-replay Nicholas Piggin
2023-06-26  7:49   ` Pavel Dovgalyuk
2023-07-07  9:23   ` Daniel Henrique Barboza
2023-06-23 12:57 ` [PATCH 2/7] scripts/replay_dump.sh: Update to current rr record format Nicholas Piggin
2023-06-23 12:57 ` [PATCH 3/7] spapr: Fix machine reset deadlock from replay-record Nicholas Piggin
2023-06-23 12:57 ` [PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events Nicholas Piggin
2023-06-26  8:07   ` Pavel Dovgalyuk
2023-06-26 10:04     ` Nicholas Piggin [this message]
2023-06-23 12:57 ` [PATCH 5/7] target/ppc: Fix timebase reset with record-replay Nicholas Piggin
2023-06-26  7:52   ` Pavel Dovgalyuk
2023-06-23 12:57 ` [PATCH 6/7] tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount Nicholas Piggin
2023-06-23 12:57 ` [PATCH 7/7] tests/avocado: ppc64 pseries reverse debugging test Nicholas Piggin
2023-06-26  7:49   ` Pavel Dovgalyuk
2023-06-26  9:34     ` Nicholas Piggin
2023-07-21 13:55       ` Nicholas Piggin
  -- strict thread matches above, loose matches on Subject: below --
2023-07-26 18:35 [PATCH 0/7] ppc: record-replay fixes and enablement Nicholas Piggin
2023-07-26 18:35 ` [PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events Nicholas Piggin
2023-07-31 11:40   ` Pavel Dovgalyuk
2023-08-04  8:50   ` Pavel Dovgalyuk
2023-08-06 11:46     ` Nicholas Piggin
2023-08-08  3:09       ` Nicholas Piggin
2023-08-08  3:52         ` Pavel Dovgalyuk
2023-08-09  9:25           ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CTMHWGLNST28.1P1TPVQWO5LR7@wheely \
    --to=npiggin@gmail.com \
    --cc=bleal@redhat.com \
    --cc=clg@kaod.org \
    --cc=crosa@redhat.com \
    --cc=danielhb413@gmail.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=groug@kaod.org \
    --cc=harshpb@linux.ibm.com \
    --cc=jsnow@redhat.com \
    --cc=pavel.dovgaluk@ispras.ru \
    --cc=pavel.dovgalyuk@ispras.ru \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=wainersm@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.