All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: qemu-devel@nongnu.org,
	"Pavel Dovgalyuk" <Pavel.Dovgalyuk@ispras.ru>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"John Snow" <jsnow@redhat.com>, "Cleber Rosa" <crosa@redhat.com>,
	"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
	"Beraldo Leal" <bleal@redhat.com>,
	"Michael Tokarev" <mjt@tls.msk.ru>
Subject: Re: [PATCH v4 05/24] Revert "replay: stop us hanging in rr_wait_io_event"
Date: Tue, 12 Mar 2024 13:33:30 +0000	[thread overview]
Message-ID: <87v85ro9qt.fsf@draig.linaro.org> (raw)
In-Reply-To: <20240311174026.2177152-6-npiggin@gmail.com> (Nicholas Piggin's message of "Tue, 12 Mar 2024 03:40:07 +1000")

Nicholas Piggin <npiggin@gmail.com> writes:

> This reverts commit 1f881ea4a444ef36a8b6907b0b82be4b3af253a2.
>
> That commit causes reverse_debugging.py test failures, and does
> not seem to solve the root cause of the problem x86-64 still
> hangs in record/replay tests.

I'm still finding the reverse debugging tests failing with this series.

> The problem with short-cutting the iowait that was taken during
> record phase is that related events will not get consumed at the
> same points (e.g., reading the clock).
>
> A hang with zero icount always seems to be a symptom of an earlier
> problem that has caused the recording to become out of synch with
> the execution and consumption of events by replay.

Would it be possible to still detect the failure mode rather than a full
revert?

>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  include/sysemu/replay.h      |  5 -----
>  accel/tcg/tcg-accel-ops-rr.c |  2 +-
>  replay/replay.c              | 21 ---------------------
>  3 files changed, 1 insertion(+), 27 deletions(-)
>
> diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
> index f229b2109c..8102fa54f0 100644
> --- a/include/sysemu/replay.h
> +++ b/include/sysemu/replay.h
> @@ -73,11 +73,6 @@ int replay_get_instructions(void);
>  /*! Updates instructions counter in replay mode. */
>  void replay_account_executed_instructions(void);
>  
> -/**
> - * replay_can_wait: check if we should pause for wait-io
> - */
> -bool replay_can_wait(void);
> -
>  /* Processing clocks and other time sources */
>  
>  /*! Save the specified clock */
> diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> index 894e73e52c..a942442a33 100644
> --- a/accel/tcg/tcg-accel-ops-rr.c
> +++ b/accel/tcg/tcg-accel-ops-rr.c
> @@ -109,7 +109,7 @@ static void rr_wait_io_event(void)
>  {
>      CPUState *cpu;
>  
> -    while (all_cpu_threads_idle() && replay_can_wait()) {
> +    while (all_cpu_threads_idle()) {
>          rr_stop_kick_timer();
>          qemu_cond_wait_bql(first_cpu->halt_cond);
>      }
> diff --git a/replay/replay.c b/replay/replay.c
> index b8564a4813..895fa6b67a 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -451,27 +451,6 @@ void replay_start(void)
>      replay_enable_events();
>  }
>  
> -/*
> - * For none/record the answer is yes.
> - */
> -bool replay_can_wait(void)
> -{
> -    if (replay_mode == REPLAY_MODE_PLAY) {
> -        /*
> -         * For playback we shouldn't ever be at a point we wait. If
> -         * the instruction count has reached zero and we have an
> -         * unconsumed event we should go around again and consume it.
> -         */
> -        if (replay_state.instruction_count == 0 && replay_state.has_unread_data) {
> -            return false;
> -        } else {
> -            replay_sync_error("Playback shouldn't have to iowait");
> -        }
> -    }
> -    return true;
> -}
> -
> -
>  void replay_finish(void)
>  {
>      if (replay_mode == REPLAY_MODE_NONE) {

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


  reply	other threads:[~2024-03-12 13:34 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-11 17:40 [PATCH v4 00/24] replay: fixes and new test cases Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 01/24] scripts/replay-dump.py: Update to current rr record format Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 02/24] scripts/replay-dump.py: rejig decoders in event number order Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 03/24] tests/avocado: excercise scripts/replay-dump.py in replay tests Nicholas Piggin
2024-03-12 13:25   ` Alex Bennée
2024-03-11 17:40 ` [PATCH v4 04/24] replay: allow runstate shutdown->running when replaying trace Nicholas Piggin
2024-03-12 13:26   ` Alex Bennée
2024-03-11 17:40 ` [PATCH v4 05/24] Revert "replay: stop us hanging in rr_wait_io_event" Nicholas Piggin
2024-03-12 13:33   ` Alex Bennée [this message]
2024-03-12 14:03     ` Nicholas Piggin
2024-03-12 21:03       ` Alex Bennée
2024-03-13  5:27         ` Nicholas Piggin
2024-03-14  5:19         ` Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 06/24] chardev: set record/replay on the base device of a muxed device Nicholas Piggin
2024-03-12 12:39   ` Marc-André Lureau
2024-03-12 14:11     ` Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 07/24] replay: Fix migration use of clock Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 08/24] replay: Fix migration replay_mutex locking Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 09/24] virtio-net: Use replay_schedule_bh_event for bhs that affect machine state Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 10/24] virtio-net: Use virtual time for RSC timers Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 11/24] net: Use virtual time for net announce Nicholas Piggin
2024-03-12  9:09   ` Pavel Dovgalyuk
2024-03-12 11:05     ` Nicholas Piggin
2024-03-12 11:12       ` Pavel Dovgalyuk
2024-03-13  5:38         ` Nicholas Piggin
2024-03-13  7:09         ` Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 12/24] savevm: Fix load_snapshot error path crash Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 13/24] tests/avocado: replay_linux.py remove the timeout expected guards Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 14/24] tests/avocado/reverse_debugging.py: mark aarch64 and pseries as not flaky Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 15/24] tests/avocado: reverse_debugging.py add test for x86-64 q35 machine Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 16/24] tests/avocado: reverse_debugging.py verify addresses between record and replay Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 17/24] tests/avocado: reverse_debugging.py stop VM before sampling icount Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 18/24] tests/avocado: reverse_debugging reverse-step at the end of the trace Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 19/24] tests/avocado: reverse_debugging.py add snapshot testing Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 20/24] replay: simple auto-snapshot mode for record Nicholas Piggin
2024-03-12  9:00   ` Pavel Dovgalyuk
2024-03-12 10:43     ` Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 21/24] tests/avocado: reverse_debugging.py test auto-snapshot mode Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 22/24] target/ppc: fix timebase register reset state Nicholas Piggin
2024-03-12 13:24   ` Alex Bennée
2024-03-12 13:47     ` Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 23/24] spapr: Fix vpa dispatch count for record-replay Nicholas Piggin
2024-03-11 17:40 ` [PATCH v4 24/24] tests/avocado: replay_linux.py add ppc64 pseries test Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v85ro9qt.fsf@draig.linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=Pavel.Dovgalyuk@ispras.ru \
    --cc=bleal@redhat.com \
    --cc=crosa@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=mjt@tls.msk.ru \
    --cc=npiggin@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=wainersm@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.