All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>,
	 qemu-devel@nongnu.org, qemu-s390x@nongnu.org
Subject: Re: [RFC PATCH] tests/functional/s390x: Add reverse debugging test for s390x
Date: Mon, 01 Dec 2025 10:36:38 +0000	[thread overview]
Message-ID: <87ldjmv689.fsf@draig.linaro.org> (raw)
In-Reply-To: <6181bc6bd6b41f46a835cee58ab3215b8cefedb4.camel@linux.ibm.com> (Ilya Leoshkevich's message of "Sun, 30 Nov 2025 23:59:48 +0100")

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> On Sun, 2025-11-30 at 20:03 +0100, Ilya Leoshkevich wrote:
>> On Sun, 2025-11-30 at 19:32 +0100, Ilya Leoshkevich wrote:
>> > On Sun, 2025-11-30 at 16:47 +0000, Alex Bennée wrote:
>> > > Ilya Leoshkevich <iii@linux.ibm.com> writes:
>> > > 
>> > > > On Fri, 2025-11-28 at 18:25 +0100, Ilya Leoshkevich wrote:
>> > > > > On Fri, 2025-11-28 at 14:39 +0100, Thomas Huth wrote:
>> > > > > > From: Thomas Huth <thuth@redhat.com>
>> > > > > > 
>> > > > > > We just have to make sure that we can set the endianness to
>> > > > > > big
>> > > > > > endian,
>> > > > > > then we can also run this test on s390x.
>> > > > > > 
>> > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com>
>> > > > > > ---
>> > > > > >  Marked as RFC since it depends on the fix for this bug (so
>> > > > > > it
>> > > > > > cannot
>> > > > > >  be merged yet):
>> > > > > >  
>> > > > > > https://lore.kernel.org/qemu-devel/a0accce9-6042-4a7b-a7c7-218212818891@redhat.com
>> > > > > > /
>> > > > > > 
>> > > > > >  tests/functional/reverse_debugging.py        |  4 +++-
>> > > > > >  tests/functional/s390x/meson.build           |  1 +
>> > > > > >  tests/functional/s390x/test_reverse_debug.py | 21
>> > > > > > ++++++++++++++++++++
>> > > > > >  3 files changed, 25 insertions(+), 1 deletion(-)
>> > > > > >  create mode 100755
>> > > > > > tests/functional/s390x/test_reverse_debug.py
>> > > > > 
>> > > > > Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
>> > > > > 
>> > > > > 
>> > > > > I have a simple fix which helps with your original report,
>> > > > > but
>> > > > > not
>> > > > > with this test. I'm still investigating.
>> > > > > 
>> > > > > --- a/target/s390x/machine.c
>> > > > > +++ b/target/s390x/machine.c
>> > > > > @@ -52,6 +52,14 @@ static int cpu_pre_save(void *opaque)
>> > > > >          kvm_s390_vcpu_interrupt_pre_save(cpu);
>> > > > >      }
>> > > > >  
>> > > > > +    if (tcg_enabled()) {
>> > > > > +        /*
>> > > > > +         * Ensure symmetry with cpu_post_load() with respect
>> > > > > to
>> > > > > +         * CHECKPOINT_CLOCK_VIRTUAL.
>> > > > > +         */
>> > > > > +        tcg_s390_tod_updated(CPU(cpu), RUN_ON_CPU_NULL);
>> > > > > +    }
>> > > > > +
>> > > > >      return 0;
>> > > > >  }
>> > > > 
>> > > > Interestingly enough, this patch fails only under load, e.g.,
>> > > > if
>> > > > I
>> > > > run
>> > > > make check -j"$(nproc)" or if I run your test in isolation, but
>> > > > with
>> > > > stress-ng cpu in background. The culprit appears to be:
>> > > > 
>> > > > s390_tod_load()
>> > > >   qemu_s390_tod_set()
>> > > >     async_run_on_cpu(tcg_s390_tod_updated)
>> > > > 
>> > > > Depending on the system load, this additional
>> > > > tcg_s390_tod_updated()
>> > > > may or may not end up being called during handle_backward(). If
>> > > > it
>> > > > does, we get an infinite loop again, because now we need two
>> > > > checkpoints.
>> > > > 
>> > > > I have a feeling that this code may be violating some record-
>> > > > replay
>> > > > requirement, but I can't quite put my finger on it. For
>> > > > example,
>> > > > async_run_on_cpu() does not sound like something deterministic,
>> > > > but
>> > > > then again it just queues work for rr_cpu_thread_fn(), which is
>> > > > supposed to be deterministic.
>> > > 
>> > > The the async_run_on_cpu is called from the vcpu thread in
>> > > response
>> > > to a
>> > > deterministic event at a known point in time it should be fine.
>> > > If
>> > > it
>> > > came from another thread that is not synchronised via replay_lock
>> > > then
>> > > things will go wrong.
>> > > 
>> > > But this is a VM load save helper?
>> > 
>> > Yes, and it's called from the main thread. Either during
>> > initialization, or as a reaction to GDB packets.
>> > 
>> > Here is the call stack:
>> > 
>> >   qemu_loadvm_state()
>> >     qemu_loadvm_state_main()
>> >       qemu_loadvm_section_start_full()
>> >         vmstate_load()
>> >           vmstate_load_state()
>> >             cpu_post_load()
>> >               tcg_s390_tod_updated()
>> >                 update_ckc_timer()
>> >                   timer_mod()
>> >           s390_tod_load()
>> >             qemu_s390_tod_set()  # via tdc->set()
>> >               async_run_on_cpu(tcg_s390_tod_updated)
>> > 
>> > So you think we may have to take the replay lock around
>> > load_snapshot()? So that all async_run_on_cpu() calls it makes end
>> > up
>> > being handled by the vCPU thread deterministically.
>> 
>> To answer my own question: apparently this is already the case; at
>> least, the following does not cause any fallout:
>> 
>> diff --git a/include/system/replay.h b/include/system/replay.h
>> index 6859df09580..e1cd9b2f900 100644
>> --- a/include/system/replay.h
>> +++ b/include/system/replay.h
>> @@ -60,6 +60,7 @@ extern char *replay_snapshot;
>>  
>>  void replay_mutex_lock(void);
>>  void replay_mutex_unlock(void);
>> +bool replay_mutex_locked(void);
>>  
>>  static inline void replay_unlock_guard(void *unused)
>>  {
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 62cc2ce25cb..ba945d3a1ea 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -3199,6 +3199,8 @@ bool save_snapshot(const char *name, bool
>> overwrite, const char *vmstate,
>>      uint64_t vm_state_size;
>>      g_autoptr(GDateTime) now = g_date_time_new_now_local();
>>  
>> +    g_assert(replay_mutex_locked());
>> +
>>      GLOBAL_STATE_CODE();
>>  
>>      if (!migrate_can_snapshot(errp)) {
>> @@ -3390,6 +3392,8 @@ bool load_snapshot(const char *name, const char
>> *vmstate,
>>      int ret;
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>>  
>> +    g_assert(replay_mutex_locked());
>> +
>>      if (!migrate_can_snapshot(errp)) {
>>          return false;
>>      }
>> diff --git a/replay/replay-internal.h b/replay/replay-internal.h
>> index 75249b76936..30825a0753e 100644
>> --- a/replay/replay-internal.h
>> +++ b/replay/replay-internal.h
>> @@ -124,7 +124,6 @@ void replay_get_array_alloc(uint8_t **buf, size_t
>> *size);
>>   * synchronisation between vCPU and main-loop threads. */
>>  
>>  void replay_mutex_init(void);
>> -bool replay_mutex_locked(void);
>>  
>>  /*! Checks error status of the file. */
>>  void replay_check_error(void);
>
> I believe now I at least understand what the race is about:
>
> - cpu_post_load() fires the TOD timer immediately.
>
> - s390_tod_load() schedules work for firing the TOD timer.

Is this a duplicate of work then? Could we just rely on one or the
other? If you drop the cpu_post_load() tweak then the vmstate load
helper should still ensure everything works right?

> - If rr loop sees work and then timer, we get one timer callback.
>
> - If rr loop sees timer and then work, we get two timer callbacks.

If the timer is armed we should expect at least to execute a few
instructions before triggering the timer, unless it was armed ready
expired. 

> - Record and replay may diverge due to this race.
>
> - In this particular case divergence makes rr loop spin: it sees that
>   TOD timer has expired, but cannot invoke its callback, because there
>   is no recorded CHECKPOINT_CLOCK_VIRTUAL.
>
> - The order in which rr loop sees work and timer depends on whether
>   and when rr loop wakes up during load_snapshot().
>
> - rr loop may wake up after the main thread kicks the CPU and drops
>   the BQL, which may happen if it calls, e.g., qemu_cond_wait_bql().

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


  reply	other threads:[~2025-12-01 10:37 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-28 13:39 [RFC PATCH] tests/functional/s390x: Add reverse debugging test for s390x Thomas Huth
2025-11-28 17:25 ` Ilya Leoshkevich
2025-11-29 21:33   ` Ilya Leoshkevich
2025-11-30 16:47     ` Alex Bennée
2025-11-30 18:32       ` Ilya Leoshkevich
2025-11-30 19:03         ` Ilya Leoshkevich
2025-11-30 22:59           ` Ilya Leoshkevich
2025-12-01 10:36             ` Alex Bennée [this message]
2025-12-01 11:17               ` Ilya Leoshkevich
2025-12-01 11:57                 ` Ilya Leoshkevich
2025-12-01 12:43                 ` Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ldjmv689.fsf@draig.linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=iii@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.