qemu-devel.nongnu.org archive mirror
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org, Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH v7] spapr: Add support for time base offset migration
Date: Wed, 07 May 2014 00:50:43 +1000
Message-ID: <5368F6C3.8000307@ozlabs.ru>
In-Reply-To: <1398940629-26415-1-git-send-email-aik@ozlabs.ru>

On 05/01/2014 08:37 PM, Alexey Kardashevskiy wrote:
> This allows guests to have a different timebase origin from the host.
> 
> This is needed for migration, where a guest can migrate from one host
> to another and the two hosts might have a different timebase origin.
> However, the timebase seen by the guest must not go backwards, and
> should go forwards only by a small amount corresponding to the time
> taken for the migration.
> 
> This is only supported for recent POWER hardware which has the TBU40
> (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
> 970.
> 
> This adds kvm_access_one_reg() to access a special register which is not
> in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch.
> 
> The feature must be present in the host kernel.
> 
> This bumps vmstate_spapr::version_id and enables new vmstate_ppc_timebase
> only for it. Since the vmstate_spapr::minimum_version_id remains
> unchanged, migration from older QEMU is supported but without
> vmstate_ppc_timebase.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v7:
> * migration_duration_ns forced to be between [0...1s]
> * s/tb/tb_remote/
> * time_of_the_day_ns is int64_t now as this is what get_clock_realtime()
> returns

Still bad? :)
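
For reference, the v7 clamping above can be read as the following minimal sketch. It is not the patch code: the function names are made up for illustration, and it assumes the clamped duration is meant to be added to the source timebase, as the comment in timebase_post_load() describes.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical helper: clamp the apparent migration time to [0, 1s]. */
static int64_t clamp_migration_ns(int64_t src_ns, int64_t dst_ns)
{
    int64_t diff = dst_ns - src_ns;

    if (diff < 0) {
        return 0;               /* destination clock is behind the source */
    }
    if (diff > NSEC_PER_SEC) {
        return NSEC_PER_SEC;    /* host clocks too far out of sync */
    }
    return diff;
}

/* Hypothetical helper: convert a clamped duration to timebase ticks. */
static uint64_t ns_to_tb_ticks(int64_t ns, uint32_t freq_hz)
{
    /* With ns <= 1e9 and a 32-bit freq the product fits in 64 bits. */
    assert(ns >= 0 && ns <= NSEC_PER_SEC);
    return (uint64_t)ns * freq_hz / NSEC_PER_SEC;
}

/* Guest timebase the destination should resume from. */
static uint64_t dest_guest_tb(uint64_t src_guest_tb, int64_t src_ns,
                              int64_t dst_ns, uint32_t freq_hz)
{
    int64_t ns = clamp_migration_ns(src_ns, dst_ns);

    return src_guest_tb + ns_to_tb_ticks(ns, freq_hz);
}
```

With a 512 MHz timebase, half a second of downtime moves the guest timebase forward by 256000000 ticks, while a destination clock that is behind the source leaves it unchanged, so the guest never sees the timebase go backwards.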


> 
> v6:
> * time_of_the_day is now time_of_the_day_ns and measured in ns instead of us
> * VMSTATE_PPC_TIMEBASE_V supports versions now
> 
> v5:
> * fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
> into timebase_post_load()
> * removed round_up(1<<24) as KVM is expected to do this anyway
> * removed @freq from migration stream
> * renamed PPCTimebaseOffset to PPCTimebase
> * CLOCKS_PER_SEC is used as a constant which is 1000000us/s (man clock)
> 
> v4:
> * made it a per-machine timebase offset rather than per-CPU
> 
> v3:
> * kvm_access_one_reg moved out to a separate patch
> * tb_offset and host_timebase were replaced with guest_timebase as
> the destination does not really care about the offset on the source
> 
> v2:
> * bumped the vmstate_ppc_cpu version
> * defined version for the env.tb_env field
> ---
>  hw/ppc/ppc.c           | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr.c         |  4 +--
>  include/hw/ppc/spapr.h |  1 +
>  target-ppc/cpu-qom.h   | 16 ++++++++++
>  target-ppc/kvm.c       |  5 ++++
>  trace-events           |  3 ++
>  6 files changed, 106 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index 71df471..bec82cd 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -29,9 +29,11 @@
>  #include "sysemu/cpus.h"
>  #include "hw/timer/m48t59.h"
>  #include "qemu/log.h"
> +#include "qemu/error-report.h"
>  #include "hw/loader.h"
>  #include "sysemu/kvm.h"
>  #include "kvm_ppc.h"
> +#include "trace.h"
>  
>  //#define PPC_DEBUG_IRQ
>  //#define PPC_DEBUG_TB
> @@ -49,6 +51,8 @@
>  #  define LOG_TB(...) do { } while (0)
>  #endif
>  
> +#define NSEC_PER_SEC    1000000000LL
> +
>  static void cpu_ppc_tb_stop (CPUPPCState *env);
>  static void cpu_ppc_tb_start (CPUPPCState *env);
>  
> @@ -829,6 +833,81 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t freq)
>      cpu_ppc_store_purr(cpu, 0x0000000000000000ULL);
>  }
>  
> +static void timebase_pre_save(void *opaque)
> +{
> +    PPCTimebase *tb = opaque;
> +    uint64_t ticks = cpu_get_real_ticks();
> +    PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
> +
> +    if (!first_ppc_cpu->env.tb_env) {
> +        error_report("No timebase object");
> +        return;
> +    }
> +
> +    tb->time_of_the_day_ns = get_clock_realtime();
> +    /*
> +     * tb_offset is only expected to be changed by migration so
> +     * there is no need to update it from KVM here
> +     */
> +    tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
> +}
> +
> +static int timebase_post_load(void *opaque, int version_id)
> +{
> +    PPCTimebase *tb_remote = opaque;
> +    CPUState *cpu;
> +    PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
> +    int64_t tb_off_adj, tb_off, ns_diff;
> +    int64_t migration_duration_ns, migration_duration_tb, guest_tb, host_ns;
> +    unsigned long freq;
> +
> +    if (!first_ppc_cpu->env.tb_env) {
> +        error_report("No timebase object");
> +        return -1;
> +    }
> +
> +    freq = first_ppc_cpu->env.tb_env->tb_freq;
> +    /*
> +     * Calculate timebase on the destination side of migration.
> +     * The destination timebase must be not less than the source timebase.
> +     * We try to adjust timebase by downtime if host clocks are not
> +     * too much out of sync (1 second for now).
> +     */
> +    host_ns = get_clock_realtime();
> +    ns_diff = MAX(0, host_ns - tb_remote->time_of_the_day_ns);
> +    migration_duration_ns = MIN(NSEC_PER_SEC, ns_diff);
> +    migration_duration_tb = muldiv64(migration_duration_ns, freq, NSEC_PER_SEC);
> +    guest_tb = tb_remote->guest_timebase + migration_duration_tb;
> +
> +    tb_off_adj = guest_tb - cpu_get_real_ticks();
> +
> +    tb_off = first_ppc_cpu->env.tb_env->tb_offset;
> +    trace_ppc_tb_adjust(tb_off, tb_off_adj, tb_off_adj - tb_off,
> +                        (tb_off_adj - tb_off) / freq);
> +
> +    /* Set new offset to all CPUs */
> +    CPU_FOREACH(cpu) {
> +        PowerPCCPU *pcpu = POWERPC_CPU(cpu);
> +        pcpu->env.tb_env->tb_offset = tb_off_adj;
> +    }
> +
> +    return 0;
> +}
> +
> +const VMStateDescription vmstate_ppc_timebase = {
> +    .name = "timebase",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .minimum_version_id_old = 1,
> +    .pre_save = timebase_pre_save,
> +    .post_load = timebase_post_load,
> +    .fields      = (VMStateField []) {
> +        VMSTATE_UINT64(guest_timebase, PPCTimebase),
> +        VMSTATE_INT64(time_of_the_day_ns, PPCTimebase),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  /* Set up (once) timebase frequency (in Hz) */
>  clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq)
>  {
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 451c473..297fc6f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -818,7 +818,7 @@ static int spapr_vga_init(PCIBus *pci_bus)
>  
>  static const VMStateDescription vmstate_spapr = {
>      .name = "spapr",
> -    .version_id = 1,
> +    .version_id = 2,
>      .minimum_version_id = 1,
>      .minimum_version_id_old = 1,
>      .fields      = (VMStateField []) {
> @@ -826,7 +826,7 @@ static const VMStateDescription vmstate_spapr = {
>  
>          /* RTC offset */
>          VMSTATE_UINT64(rtc_offset, sPAPREnvironment),
> -
> +        VMSTATE_PPC_TIMEBASE_V(tb, sPAPREnvironment, 2),
>          VMSTATE_END_OF_LIST()
>      },
>  };
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 5fdac1e..9f8bb89 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -29,6 +29,7 @@ typedef struct sPAPREnvironment {
>      target_ulong entry_point;
>      uint32_t next_irq;
>      uint64_t rtc_offset;
> +    struct PPCTimebase tb;
>      bool has_graphics;
>  
>      uint32_t epow_irq;
> diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
> index 47dc8e6..d926d93 100644
> --- a/target-ppc/cpu-qom.h
> +++ b/target-ppc/cpu-qom.h
> @@ -120,6 +120,22 @@ int ppc64_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
>                                 int cpuid, void *opaque);
>  #ifndef CONFIG_USER_ONLY
>  extern const struct VMStateDescription vmstate_ppc_cpu;
> +
> +typedef struct PPCTimebase {
> +    uint64_t guest_timebase;
> +    int64_t time_of_the_day_ns;
> +} PPCTimebase;
> +
> +extern const struct VMStateDescription vmstate_ppc_timebase;
> +
> +#define VMSTATE_PPC_TIMEBASE_V(_field, _state, _version) {            \
> +    .name       = (stringify(_field)),                                \
> +    .version_id = (_version),                                         \
> +    .size       = sizeof(PPCTimebase),                                \
> +    .vmsd       = &vmstate_ppc_timebase,                              \
> +    .flags      = VMS_STRUCT,                                         \
> +    .offset     = vmstate_offset_value(_state, _field, PPCTimebase),  \
> +}
>  #endif
>  
>  #endif
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 73dbb02..a8a1498 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -35,6 +35,7 @@
>  #include "hw/sysbus.h"
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> +#include "hw/ppc/ppc.h"
>  #include "sysemu/watchdog.h"
>  #include "trace.h"
>  
> @@ -890,6 +891,8 @@ int kvm_arch_put_registers(CPUState *cs, int level)
>                  DPRINTF("Warning: Unable to set VPA information to KVM\n");
>              }
>          }
> +
> +        kvm_set_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &env->tb_env->tb_offset);
>  #endif /* TARGET_PPC64 */
>      }
>  
> @@ -1133,6 +1136,8 @@ int kvm_arch_get_registers(CPUState *cs)
>                  DPRINTF("Warning: Unable to get VPA information from KVM\n");
>              }
>          }
> +
> +        kvm_get_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &env->tb_env->tb_offset);
>  #endif
>      }
>  
> diff --git a/trace-events b/trace-events
> index a5218ba..6627569 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1182,6 +1182,9 @@ spapr_iommu_get(uint64_t liobn, uint64_t ioba, uint64_t ret, uint64_t tce) "liob
>  spapr_iommu_xlate(uint64_t liobn, uint64_t ioba, uint64_t tce, unsigned perm, unsigned pgsize) "liobn=%"PRIx64" 0x%"PRIx64" -> 0x%"PRIx64" perm=%u mask=%x"
>  spapr_iommu_new_table(uint64_t liobn, void *tcet, void *table, int fd) "liobn=%"PRIx64" tcet=%p table=%p fd=%d"
>  
> +# hw/ppc/ppc.c
> +ppc_tb_adjust(uint64_t offs1, uint64_t offs2, int64_t diff, int64_t seconds) "adjusted from 0x%"PRIx64" to 0x%"PRIx64", diff %"PRId64" (%"PRId64"s)"
> +
>  # util/hbitmap.c
>  hbitmap_iter_skip_words(const void *hb, void *hbi, uint64_t pos, unsigned long cur) "hb %p hbi %p pos %"PRId64" cur 0x%lx"
>  hbitmap_reset(void *hb, uint64_t start, uint64_t count, uint64_t sbit, uint64_t ebit) "hb %p items %"PRIu64",%"PRIu64" bits %"PRIu64"..%"PRIu64
> 

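The compatibility claim in the commit message (version_id bumped to 2, minimum_version_id left at 1) relies on versioned fields being skipped when the incoming section is older than the field. Below is a minimal sketch of that rule. FieldDesc and both helpers are hypothetical, simplified stand-ins rather than the real VMState structures, and the table lists only the two fields visible in the spapr.c hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a versioned field descriptor. */
typedef struct {
    const char *name;
    int version_id;     /* first section version that carries the field */
} FieldDesc;

/* A field is read only if the incoming section is new enough for it. */
static bool field_in_stream(const FieldDesc *f, int stream_version)
{
    return f->version_id <= stream_version;
}

/* Count how many of the described fields a given section version holds. */
static size_t count_fields_in_stream(const FieldDesc *fields, size_t n,
                                     int stream_version)
{
    size_t count = 0;

    assert(fields != NULL);
    for (size_t i = 0; i < n; i++) {
        if (field_in_stream(&fields[i], stream_version)) {
            count++;
        }
    }
    return count;
}

/* Illustrative model of the "spapr" section after this patch. */
static const FieldDesc spapr_fields[] = {
    { "rtc_offset", 1 },
    { "tb",         2 },    /* added via VMSTATE_PPC_TIMEBASE_V(..., 2) */
};
```

In this model a version 1 stream from an older QEMU yields only rtc_offset and the timebase is left untouched, while a version 2 stream also carries tb.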

-- 
Alexey

Thread overview: 5+ messages
2014-05-01 10:37 [Qemu-devel] [PATCH v7] spapr: Add support for time base offset migration Alexey Kardashevskiy
2014-05-06 14:50 ` Alexey Kardashevskiy [this message]
2014-05-08 12:27   ` Alexander Graf
2014-05-15 15:52     ` Alexey Kardashevskiy
2014-05-16 13:08       ` Alexander Graf