Re: [RFC v2 PATCH] record-replay: support SMP target machine

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
To: Nicholas Piggin <npiggin@gmail.com>,
	Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
	Richard Henderson <richard.henderson@linaro.org>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [RFC v2 PATCH] record-replay: support SMP target machine
Date: Tue, 22 Aug 2023 07:44:43 +0300	[thread overview]
Message-ID: <48d32058-dd1b-a2ed-42f2-e21c099bc0e3@ispras.ru> (raw)
In-Reply-To: <20230811014700.39172-1-npiggin@gmail.com>

On 11.08.2023 04:47, Nicholas Piggin wrote:
> RR CPU switching is driven by timers and events so it is deterministic
> like everything else. Record a CPU switch event and use that to drive
> the CPU switch on replay.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> This is still in RFC phase because so far I've only really testd ppc
> pseries, and only with patches that are not yet upstream (but posted
> to list).
> 
> It works with smp 2, can step, reverse-step, reverse-continue, etc.
> throughout a Linux boot.

I still didn't have time to test it, but here are some comments.

> 
> One issue is reverse-step on one gdb thread (vCPU) only steps back one
> icount, so if another thread is currently running then it is that one
> which goes back one instruction and the selected thread doesn't move. I
> would call this a separate issue from the record-replay mechanism, which
> is in the replay-debugging policy. I think we could record in each vCPU
> an icount of the last instruction it executed before switching, then
> reverse step for that vCPU could replay to there. I think that's not so
> important yet until this mechanism is solid. But if you test and rsi is
> not going backwards, then check your other threads.
> 
> Thanks,
> Nick
> 
> 
>   accel/tcg/tcg-accel-ops-icount.c |  9 +++-
>   accel/tcg/tcg-accel-ops-rr.c     | 73 +++++++++++++++++++++++++++++---
>   include/exec/replay-core.h       |  3 ++
>   replay/replay-internal.h         |  1 +
>   replay/replay.c                  | 34 ++++++++++++++-
>   scripts/replay-dump.py           |  5 +++
>   softmmu/vl.c                     |  4 --
>   7 files changed, 115 insertions(+), 14 deletions(-)
> 
> diff --git a/accel/tcg/tcg-accel-ops-icount.c b/accel/tcg/tcg-accel-ops-icount.c
> index 3d2cfbbc97..c26782a56a 100644
> --- a/accel/tcg/tcg-accel-ops-icount.c
> +++ b/accel/tcg/tcg-accel-ops-icount.c
> @@ -93,10 +93,15 @@ void icount_handle_deadline(void)
>   int64_t icount_percpu_budget(int cpu_count)
>   {
>       int64_t limit = icount_get_limit();
> -    int64_t timeslice = limit / cpu_count;
> +    int64_t timeslice;
>   
> -    if (timeslice == 0) {
> +    if (replay_mode == REPLAY_MODE_PLAY) {
>           timeslice = limit;
> +    } else {
> +        timeslice = limit / cpu_count;
> +        if (timeslice == 0) {
> +            timeslice = limit;
> +        }
>       }
>   
>       return timeslice;
> diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> index 212d6f8df4..ce040a687e 100644
> --- a/accel/tcg/tcg-accel-ops-rr.c
> +++ b/accel/tcg/tcg-accel-ops-rr.c
> @@ -27,6 +27,7 @@
>   #include "qemu/lockable.h"
>   #include "sysemu/tcg.h"
>   #include "sysemu/replay.h"
> +#include "sysemu/reset.h"
>   #include "sysemu/cpu-timers.h"
>   #include "qemu/main-loop.h"
>   #include "qemu/notify.h"
> @@ -61,6 +62,22 @@ void rr_kick_vcpu_thread(CPUState *unused)
>   
>   static QEMUTimer *rr_kick_vcpu_timer;
>   static CPUState *rr_current_cpu;
> +static CPUState *rr_next_cpu;
> +static CPUState *rr_last_cpu;
> +
> +/*
> + * Reset the vCPU scheduler to the initial state.
> + */
> +static void record_replay_reset(void *param)
> +{
> +    if (rr_kick_vcpu_timer) {
> +        timer_del(rr_kick_vcpu_timer);
> +    }
> +    g_assert(!rr_current_cpu);
> +    rr_next_cpu = NULL;
> +    rr_last_cpu = first_cpu;
> +    current_cpu = NULL;
> +}
>   
>   static inline int64_t rr_next_kick_time(void)
>   {
> @@ -184,6 +201,8 @@ static void *rr_cpu_thread_fn(void *arg)
>       Notifier force_rcu;
>       CPUState *cpu = arg;
>   
> +    qemu_register_reset(record_replay_reset, NULL);
> +
>       assert(tcg_enabled());
>       rcu_register_thread();
>       force_rcu.notify = rr_force_rcu;
> @@ -238,14 +257,20 @@ static void *rr_cpu_thread_fn(void *arg)
>               cpu_budget = icount_percpu_budget(cpu_count);
>           }
>   
> +        if (!rr_next_cpu) {
> +            qatomic_set_mb(&rr_next_cpu, first_cpu);
> +        }
> +        cpu = rr_next_cpu;
> +
> +        if (cpu != rr_last_cpu) {
> +            replay_switch_cpu();
> +            qatomic_set_mb(&rr_last_cpu, cpu);
> +        }
> +
>           rr_start_kick_timer();
>   
>           replay_mutex_unlock();
>   
> -        if (!cpu) {
> -            cpu = first_cpu;
> -        }
> -
>           while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) {
>               /* Store rr_current_cpu before evaluating cpu_can_run().  */
>               qatomic_set_mb(&rr_current_cpu, cpu);
> @@ -284,7 +309,34 @@ static void *rr_cpu_thread_fn(void *arg)
>                   break;
>               }
>   
> -            cpu = CPU_NEXT(cpu);
> +            if (replay_mode == REPLAY_MODE_NONE) {
> +                cpu = CPU_NEXT(cpu);
> +            } else if (replay_mode == REPLAY_MODE_RECORD) {
> +                /*
> +                 * Exit the loop immediately so CPU switch events can be
> +                 * recorded. This may be able to be improved to record
> +                 * switch events here.
> +                 */
> +                cpu = CPU_NEXT(cpu);
> +                break;
> +            } else if (replay_mode == REPLAY_MODE_PLAY) {
> +                /*
> +                 * Play can exit from tcg_cpus_exec at different times than
> +                 * record, because icount budget is set to next non-insn
> +                 * event which could be an exception or something that
> +                 * tcg_cpus_exec can process internally if icount limit was
> +                 * not reached. So we don't necessarily switch CPU here,
> +		 * only if there is a switch in the trace.
> +                 */
> +                qemu_mutex_unlock_iothread();
> +                replay_mutex_lock();
> +                qemu_mutex_lock_iothread();
> +                if (replay_has_switch_cpu()) {
> +                    cpu = CPU_NEXT(cpu);
> +                }
> +                replay_mutex_unlock();
> +                break;
> +            }
>           } /* while (cpu && !cpu->exit_request).. */
>   
>           /* Does not need a memory barrier because a spurious wakeup is okay.  */
> @@ -294,9 +346,9 @@ static void *rr_cpu_thread_fn(void *arg)
>               qatomic_set_mb(&cpu->exit_request, 0);
>           }
>   
> -        if (all_cpu_threads_idle()) {
> -            rr_stop_kick_timer();
> +        qatomic_set(&rr_next_cpu, cpu);

This does not seem to be in the mainline.

>   
> +        if (all_cpu_threads_idle()) {
>               if (icount_enabled()) {
>                   /*
>   		 * When all cpus are sleeping (e.g in WFI), to avoid a deadlock
> @@ -304,6 +356,13 @@ static void *rr_cpu_thread_fn(void *arg)
>                    * timer.
>                    */
>                   qemu_notify_event();
> +            } else {
> +                /*
> +                 * icount timer won't expire when not executing instructions
> +                 * so no need to stop it. Avoiding restarts is also important
> +                 * for record-replay to avoid clock events.
> +                 */
> +                rr_stop_kick_timer();
>               }
>           }
>   
> diff --git a/include/exec/replay-core.h b/include/exec/replay-core.h
> index 244c77acce..543c129a1d 100644
> --- a/include/exec/replay-core.h
> +++ b/include/exec/replay-core.h
> @@ -52,6 +52,9 @@ void replay_gdb_attached(void);
>   
>   /* Interrupts and exceptions */
>   
> +bool replay_switch_cpu(void);
> +bool replay_has_switch_cpu(void);
> +
>   /* Called by exception handler to write or read exception processing events */
>   bool replay_exception(void);
>   /*
> diff --git a/replay/replay-internal.h b/replay/replay-internal.h
> index b6836354ac..95849e7461 100644
> --- a/replay/replay-internal.h
> +++ b/replay/replay-internal.h
> @@ -58,6 +58,7 @@ enum ReplayEvents {
>       /* some of greater codes are reserved for checkpoints */
>       EVENT_CHECKPOINT,
>       EVENT_CHECKPOINT_LAST = EVENT_CHECKPOINT + CHECKPOINT_COUNT - 1,
> +    EVENT_SWITCH_CPU,
>       /* end of log event */
>       EVENT_END,
>       EVENT_COUNT
> diff --git a/replay/replay.c b/replay/replay.c
> index 0f7d766efe..e2d77ab644 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -98,9 +98,41 @@ void replay_account_executed_instructions(void)
>       }
>   }
>   
> -bool replay_exception(void)
> +bool replay_switch_cpu(void)
> +{
> +    if (replay_mode == REPLAY_MODE_RECORD) {
> +        g_assert(replay_mutex_locked());
> +        replay_save_instructions();
> +        replay_put_event(EVENT_SWITCH_CPU);
> +        return true;
> +    } else if (replay_mode == REPLAY_MODE_PLAY) {
> +        bool res = replay_has_switch_cpu();
> +        if (res) {
> +            replay_finish_event();
> +        } else {
> +            g_assert_not_reached();
> +        }
> +        return res;
> +    }
> +
> +    return true;
> +}
> +
> +bool replay_has_switch_cpu(void)

Is this function really needed?

>   {
> +    bool res = false;
> +    if (replay_mode == REPLAY_MODE_PLAY) {
> +        g_assert(replay_mutex_locked());
> +        replay_account_executed_instructions();
> +        res = replay_next_event_is(EVENT_SWITCH_CPU);
> +    }
> +
> +    return res;
> +}
>   
> +
> +bool replay_exception(void)
> +{
>       if (replay_mode == REPLAY_MODE_RECORD) {
>           g_assert(replay_mutex_locked());
>           replay_save_instructions();
> diff --git a/softmmu/vl.c b/softmmu/vl.c
> index b0b96f67fa..946b22d1de 100644
> --- a/softmmu/vl.c
> +++ b/softmmu/vl.c
> @@ -1869,10 +1869,6 @@ static void qemu_apply_machine_options(QDict *qdict)
>           /* fall back to the -kernel/-append */
>           semihosting_arg_fallback(current_machine->kernel_filename, current_machine->kernel_cmdline);
>       }
> -
> -    if (current_machine->smp.cpus > 1) {
> -        replay_add_blocker("smp");
> -    }
>   }
>   
>   static void qemu_create_early_backends(void)

next prev parent reply	other threads:[~2023-08-22  4:45 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-11  1:47 [RFC v2 PATCH] record-replay: support SMP target machine Nicholas Piggin
2023-08-22  4:44 ` Pavel Dovgalyuk [this message]
2023-08-25  8:22   ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48d32058-dd1b-a2ed-42f2-e21c099bc0e3@ispras.ru \
    --to=pavel.dovgalyuk@ispras.ru \
    --cc=npiggin@gmail.com \
    --cc=pavel.dovgaluk@ispras.ru \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).