From: Nicholas Piggin <npiggin@gmail.com>
To: qemu-devel@nongnu.org
Cc: "Nicholas Piggin" <npiggin@gmail.com>,
"Pavel Dovgalyuk" <Pavel.Dovgalyuk@ispras.ru>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"John Snow" <jsnow@redhat.com>, "Cleber Rosa" <crosa@redhat.com>,
"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
"Beraldo Leal" <bleal@redhat.com>,
"Michael Tokarev" <mjt@tls.msk.ru>
Subject: [PATCH v4 00/24] replay: fixes and new test cases
Date: Tue, 12 Mar 2024 03:40:02 +1000
Message-ID: <20240311174026.2177152-1-npiggin@gmail.com>
Since v3,
* Attacked the replay_linux.py bugs and found a bunch of gaps in
  networking that were causing the hangs.
* Also found several powerpc bugs that were causing problems on
  pseries.
* Added a ppc test to replay_linux.py now that it's working.
* Found several crash bugs in record/replay vs migration.
* Added snapshot and more stepping tests to reverse_debugging.py.
* Addressed comments on the auto-snapshot code.
* Added an auto-snapshot test case.
* "Solved" x86-64 issues in test cases by switching to q35, which
  seems to have fewer problems.
I will take the last 3 patches through the ppc tree, but they are
included here because powerpc is the only target that survives the
record-replay test with auto-snapshots at the moment.
Thanks,
Nick
Since v2, the fixes became less minor, so I renamed the series.
(https://lore.kernel.org/qemu-devel/20240125160835.480488-1-npiggin@gmail.com/#r)
* Found several more bugs (patches 5-8).
* Enabled the rr avocado test on pseries and aarch64 virt since they're
  passing here (and on gitlab, e.g.,
  https://gitlab.com/npiggin/qemu/-/jobs/6253787216,
  https://gitlab.com/npiggin/qemu/-/jobs/6253787218).
* Updated the replay-dump script per John's feedback.
x86-64 still has issues with the replay and reverse debugging tests.
replay_kernel.py seems to be timing dependent: after patch 5 it
passed 30/30 runs, then the following day 0/30, and I realized I had
several other QEMU instances hogging the CPU, which probably changed
the timings. So the first thing I would look at is timers and clocks.
pseries had some rounding issues in time calculations that meant
clocks and timers were not replayed exactly as they were recorded,
which caused hangs.
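As an illustration only (this is not code from the series, and the
cover letter does not spell out the exact calculation that was wrong),
the sketch below shows how a truncating nanosecond -> tick -> nanosecond
conversion can lose time. The 512 MHz tick rate and all function names
are assumptions made for the example. If one side of record/replay uses
the original value and the other uses the round-tripped value, a timer
deadline no longer matches what was recorded.

```python
# Hypothetical sketch, not QEMU code: truncating unit conversions drift.
NANOSECONDS_PER_SECOND = 1_000_000_000
TIMEBASE_HZ = 512_000_000  # assumed tick rate, chosen for illustration


def ns_to_ticks(ns: int) -> int:
    """Convert nanoseconds to timebase ticks, rounding down."""
    return ns * TIMEBASE_HZ // NANOSECONDS_PER_SECOND


def ticks_to_ns(ticks: int) -> int:
    """Convert timebase ticks back to nanoseconds, rounding down."""
    return ticks * NANOSECONDS_PER_SECOND // TIMEBASE_HZ


def round_trip_drift_ns(ns: int) -> int:
    """Nanoseconds lost by converting ns -> ticks -> ns."""
    return ns - ticks_to_ns(ns_to_ticks(ns))


if __name__ == "__main__":
    # A deadline that is not a whole number of ticks comes back
    # earlier than it went in, so it would not replay exactly.
    for deadline_ns in (1000, 1001, 1_000_000_001):
        print(deadline_ns, round_trip_drift_ns(deadline_ns))
```

A nonzero drift here is the kind of mismatch that makes timers and
clocks the first place to look when replay diverges from the recording.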
Thanks,
Nick
Nicholas Piggin (24):
  scripts/replay-dump.py: Update to current rr record format
  scripts/replay-dump.py: rejig decoders in event number order
  tests/avocado: excercise scripts/replay-dump.py in replay tests
  replay: allow runstate shutdown->running when replaying trace
  Revert "replay: stop us hanging in rr_wait_io_event"
  chardev: set record/replay on the base device of a muxed device
  replay: Fix migration use of clock
  replay: Fix migration replay_mutex locking
  virtio-net: Use replay_schedule_bh_event for bhs that affect machine
    state
  virtio-net: Use virtual time for RSC timers
  net: Use virtual time for net announce
  savevm: Fix load_snapshot error path crash
  tests/avocado: replay_linux.py remove the timeout expected guards
  tests/avocado/reverse_debugging.py: mark aarch64 and pseries as not
    flaky
  tests/avocado: reverse_debugging.py add test for x86-64 q35 machine
  tests/avocado: reverse_debugging.py verify addresses between record
    and replay
  tests/avocado: reverse_debugging.py stop VM before sampling icount
  tests/avocado: reverse_debugging reverse-step at the end of the trace
  tests/avocado: reverse_debugging.py add snapshot testing
  replay: simple auto-snapshot mode for record
  tests/avocado: reverse_debugging.py test auto-snapshot mode
  target/ppc: fix timebase register reset state
  spapr: Fix vpa dispatch count for record-replay
  tests/avocado: replay_linux.py add ppc64 pseries test
 docs/system/replay.rst             |   5 +
 include/hw/ppc/spapr_cpu_core.h    |   3 +
 include/sysemu/replay.h            |  16 ++-
 include/sysemu/runstate.h          |   1 +
 accel/tcg/tcg-accel-ops-rr.c       |   2 +-
 chardev/char.c                     |  71 ++++++++----
 hw/net/virtio-net.c                |  17 +--
 hw/ppc/ppc.c                       |  11 +-
 hw/ppc/spapr.c                     |  36 +-----
 hw/ppc/spapr_hcall.c               |  33 ++++++
 hw/ppc/spapr_rtas.c                |   1 +
 migration/migration.c              |  17 ++-
 migration/savevm.c                 |   1 +
 net/announce.c                     |   2 +-
 replay/replay-snapshot.c           |  57 ++++++++++
 replay/replay.c                    |  50 ++++----
 system/runstate.c                  |  31 ++++-
 system/vl.c                        |   9 ++
 target/ppc/machine.c               |   4 +
 qemu-options.hx                    |   9 +-
 scripts/replay-dump.py             | 167 ++++++++++++++++++---------
 tests/avocado/replay_kernel.py     |  11 ++
 tests/avocado/replay_linux.py      |  97 +++++++++++++++-
 tests/avocado/reverse_debugging.py | 176 ++++++++++++++++++++++++-----
 24 files changed, 635 insertions(+), 192 deletions(-)
--
2.42.0