From: Joao Martins <joao.m.martins@oracle.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, Juergen Gross <jgross@suse.com>,
Andy Lutomirski <luto@amacapital.net>
Subject: Re: [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page
Date: Thu, 26 Jan 2017 13:22:45 +0000 [thread overview]
Message-ID: <5889F825.7060508@oracle.com> (raw)
In-Reply-To: <7a8b31b1-c597-57d6-2721-2c06388def39@oracle.com>
On 01/25/2017 07:26 PM, Boris Ostrovsky wrote:
> On 01/25/2017 12:33 PM, Joao Martins wrote:
>> In order to support pvclock vdso on xen we need to setup the time
>> info page for vcpu 0 and register the page with Xen using the
>> VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall
>> will also forcefully update the pvti which will set some of the
>> necessary flags for vdso. Afterwards we check if it supports the
>> PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having
>> vdso/vsyscall support. And if so, it will set the cpu 0 pvti that
>> will be later on used when mapping the vdso image.
>>
>> The xen headers are also updated to include the new hypercall for
>> registering the secondary vcpu_time_info struct.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> ---
>> Changes since RFC:
>> (Comments from Boris and David)
>> * Remove Kconfig option
>> * Use get_zeroed_page/free/page
>> * Remove the hypercall availability check
>> * Unregister pvti with arg.addr.v = NULL if stable bit isn't supported.
>> (New)
>> * Set secondary copy on restore such that it works on migration.
>> * Drop global xen_clock variable and stash it locally on
>> xen_setup_vsyscall_time_info.
>> * WARN_ON(ret) if we fail to unregister the pvti.
>> ---
>> arch/x86/xen/enlighten.c | 2 ++
>> arch/x86/xen/time.c | 51 ++++++++++++++++++++++++++++++++++++++++++++
>> arch/x86/xen/xen-ops.h | 1 +
>> include/xen/interface/vcpu.h | 28 ++++++++++++++++++++++++
>> 4 files changed, 82 insertions(+)
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index 51ef952..15d271d 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -270,6 +270,8 @@ void xen_vcpu_restore(void)
>
> This is called for PV only. What about HVM?
The call is missing from xen_hvm_post_suspend(...), I will add it.
>
>> HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL))
>> BUG();
>> }
>> +
>> + xen_setup_vsyscall_time_info(0);
>
> Do we need to tear down time memory area on VCPU suspend?
I also missed that; otherwise I am leaking a page.
I could also rework this patch such that the initially allocated xen_clock page
is reused and simply register/unregister in save/restore paths. This would
probably mean adding one extra helper to register the vcpu_time info and perhaps
make xen_setup_vsyscall_time_info a bit simpler.
>> }
>>
>> static void __init xen_banner(void)
>> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
>> index 1e69956..e90f703 100644
>> --- a/arch/x86/xen/time.c
>> +++ b/arch/x86/xen/time.c
>> @@ -367,6 +367,56 @@ static const struct pv_time_ops xen_time_ops __initconst = {
>> .steal_clock = xen_steal_clock,
>> };
>>
>> +int xen_setup_vsyscall_time_info(int cpu)
>> +{
>> + struct pvclock_vsyscall_time_info *xen_clock;
>> + struct vcpu_register_time_memory_area t;
>> + struct pvclock_vcpu_time_info *pvti;
>> + unsigned long addr;
>> + u8 flags;
>> + int ret;
>> +
>> + addr = get_zeroed_page(GFP_KERNEL);
>> + if (!addr)
>> + return -ENOMEM;
>> +
>> + xen_clock = (struct pvclock_vsyscall_time_info *) addr;
>> + memset(xen_clock, 0, PAGE_SIZE);
>
> You don't really need addr
The reason I had this was to avoid to save one cast to unsigned long (on
free_page paths). But maybe it's not worth it and looking at the rest of the
x86/xen code, this doesn't seem to be the case. I will remove it.
> and there is no reason to memset the page to
> zero, given that you got it with get_zeroed_page().
Yeap, Fixed.
>
>> +
>> + t.addr.v = &xen_clock->pvti;
>> +
>> + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area,
>> + cpu, &t);
>> +
>> + if (ret) {
>> + pr_debug("xen: cannot register vcpu_time_info err %d\n", ret);
>
> pr_warn() would be more appropriate I think. You also have blank line
> before 'if'.
Fixed.
>> + free_page(addr);
>> + return ret;
>> + }
>> +
>> + pvti = &xen_clock->pvti;
>> + flags = pvti->flags;
>
> I don't think you need these, given that you only reference flags once
> below.
>
Indeed, fixed as well.
>> +
>> + if (!(flags & PVCLOCK_TSC_STABLE_BIT)) {
>> + t.addr.v = NULL;
>> + ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area,
>> + cpu, &t);
>> + if (!ret)
>> + free_page(addr);
>> +
>> + WARN_ON(ret);
>
> Did someone ask for WARN_ON()? (from your changelog it looks like it
> could have been either David or me). I think pr_warn() is sufficient,
> just like above.
Looking again at previous comments no, this WARN_ON wasn't asked. I can changed
it to the similar above as you are suggesting.
>
>> + pr_debug("xen: VCLOCK_PVCLOCK not supported\n");
>
> pr_notice()
Fixed it too.
>
> -boris
>
>> + return -ENOTSUPP;
>> + }
>> +
>> + pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
>> + pvclock_set_pvti_cpu0_va(xen_clock);
>> +
>> + xen_clocksource.archdata.vclock_mode = VCLOCK_PVCLOCK;
>> +
>> + return 0;
>> +}
>> +
>> static void __init xen_time_init(void)
>> {
>> int cpu = smp_processor_id();
>> @@ -393,6 +443,7 @@ static void __init xen_time_init(void)
>> setup_force_cpu_cap(X86_FEATURE_TSC);
>>
>> xen_setup_runstate_info(cpu);
>> + xen_setup_vsyscall_time_info(cpu);
>> xen_setup_timer(cpu);
>> xen_setup_cpu_clockevents();
>>
>> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
>> index ac0a2b0..4036d15 100644
>> --- a/arch/x86/xen/xen-ops.h
>> +++ b/arch/x86/xen/xen-ops.h
>> @@ -66,6 +66,7 @@ void __init xen_vmalloc_p2m_tree(void);
>> void xen_init_irq_ops(void);
>> void xen_setup_timer(int cpu);
>> void xen_setup_runstate_info(int cpu);
>> +int xen_setup_vsyscall_time_info(int cpu);
>> void xen_teardown_timer(int cpu);
>> u64 xen_clocksource_read(void);
>> void xen_setup_cpu_clockevents(void);
>> diff --git a/include/xen/interface/vcpu.h b/include/xen/interface/vcpu.h
>> index 98188c8..8da788c 100644
>> --- a/include/xen/interface/vcpu.h
>> +++ b/include/xen/interface/vcpu.h
>> @@ -178,4 +178,32 @@ DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_vcpu_info);
>>
>> /* Send an NMI to the specified VCPU. @extra_arg == NULL. */
>> #define VCPUOP_send_nmi 11
>> +
>> +/*
>> + * Register a memory location to get a secondary copy of the vcpu time
>> + * parameters. The master copy still exists as part of the vcpu shared
>> + * memory area, and this secondary copy is updated whenever the master copy
>> + * is updated (and using the same versioning scheme for synchronisation).
>> + *
>> + * The intent is that this copy may be mapped (RO) into userspace so
>> + * that usermode can compute system time using the time info and the
>> + * tsc. Usermode will see an array of vcpu_time_info structures, one
>> + * for each vcpu, and choose the right one by an existing mechanism
>> + * which allows it to get the current vcpu number (such as via a
>> + * segment limit). It can then apply the normal algorithm to compute
>> + * system time from the tsc.
>> + *
>> + * @extra_arg == pointer to vcpu_register_time_info_memory_area structure.
>> + */
>> +#define VCPUOP_register_vcpu_time_memory_area 13
>> +DEFINE_GUEST_HANDLE_STRUCT(vcpu_time_info_t);
>> +struct vcpu_register_time_memory_area {
>> + union {
>> + GUEST_HANDLE(vcpu_time_info_t) h;
>> + struct pvclock_vcpu_time_info *v;
>> + uint64_t p;
>> + } addr;
>> +};
>> +DEFINE_GUEST_HANDLE_STRUCT(vcpu_register_time_memory_area_t);
>> +
>> #endif /* __XEN_PUBLIC_VCPU_H__ */
>
>
next prev parent reply other threads:[~2017-01-26 13:20 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-25 17:33 [PATCH v1 0/3] x86/xen: pvclock vdso support Joao Martins
2017-01-25 17:33 ` [PATCH v1 1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va Joao Martins
2017-01-26 17:25 ` Andy Lutomirski
2017-01-26 17:25 ` Andy Lutomirski
2017-01-26 19:58 ` Joao Martins
2017-01-25 17:33 ` [PATCH v1 2/3] x86/xen/time: setup vcpu 0 time info page Joao Martins
2017-01-25 19:26 ` Boris Ostrovsky
2017-01-26 13:22 ` Joao Martins [this message]
2017-01-26 17:18 ` Andy Lutomirski
2017-01-25 17:33 ` [PATCH v1 3/3] MAINTAINERS: xen, kvm: track pvclock-abi.h changes Joao Martins
2017-01-26 17:25 ` Andy Lutomirski
2017-01-26 20:08 ` Joao Martins
2017-01-27 14:54 ` Juergen Gross
2017-01-27 14:54 ` Juergen Gross
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5889F825.7060508@oracle.com \
--to=joao.m.martins@oracle.com \
--cc=boris.ostrovsky@oracle.com \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.