From: Marc Zyngier <maz@kernel.org>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Nicolas Saenz Julienne <nsaenzju@redhat.com>,
linux-arm-kernel@lists.infradead.org, rostedt@goodmis.org,
james.morse@arm.com, alexandru.elisei@arm.com,
suzuki.poulose@arm.com, catalin.marinas@arm.com, will@kernel.org,
linux-kernel@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
mingo@redhat.com, nilal@redhat.com
Subject: Re: [RFC PATCH 2/2] KVM: arm64: export cntvoff in debugfs
Date: Fri, 19 Nov 2021 13:31:11 +0000
Message-ID: <87czmw6zmo.wl-maz@kernel.org>
In-Reply-To: <20211119125946.GA57544@fuller.cnet>

On Fri, 19 Nov 2021 12:59:46 +0000,
Marcelo Tosatti <mtosatti@redhat.com> wrote:
>
> On Fri, Nov 19, 2021 at 12:17:00PM +0000, Marc Zyngier wrote:
> > On Fri, 19 Nov 2021 10:21:18 +0000,
> > Nicolas Saenz Julienne <nsaenzju@redhat.com> wrote:
> > >
> > > While using cntvct as the raw clock for tracing, it's possible to
> > > synchronize host/guest traces just by knowing the virtual offset applied
> > > to the guest's virtual counter.
> > >
> > > This is also the case on x86 when TSC is available. The offset is
> > > exposed in debugfs as 'tsc-offset' on a per vcpu basis. So let's
> > > implement the same for arm64.
> >
> > How does this work with NV, where the guest hypervisor is in control
> > of the virtual offset? How does userspace know which vcpu to pick so
> > that it gets the right offset?
>
> On x86, the offsets for different vcpus are the same due to the logic at
> kvm_synchronize_tsc function:
>
> During guest vcpu creation, when the TSC-clock values are written
> in a short window of time (or the clock value is zero), the code uses
> the same TSC.
>
> This logic is problematic (since "short window of time" is a heuristic
> which can fail), and is being replaced by writing the same offset
> for each vCPU:
>
> commit 828ca89628bfcb1b8f27535025f69dd00eb55207
> Author: Oliver Upton <oupton@google.com>
> Date: Thu Sep 16 18:15:38 2021 +0000
>
> KVM: x86: Expose TSC offset controls to userspace
>
> To date, VMM-directed TSC synchronization and migration has been a bit
> messy. KVM has some baked-in heuristics around TSC writes to infer if
> the VMM is attempting to synchronize. This is problematic, as it depends
> on host userspace writing to the guest's TSC within 1 second of the last
> write.
>
> A much cleaner approach to configuring the guest's views of the TSC is to
> simply migrate the TSC offset for every vCPU. Offsets are idempotent,
> and thus not subject to change depending on when the VMM actually
> reads/writes values from/to KVM. The VMM can then read the TSC once with
> KVM_GET_CLOCK to capture a (realtime, host_tsc) pair at the instant when
> the guest is paused.
>
> So with that in place, the answer to
>
> How does userspace know which vcpu to pick so
> that it gets the right offset?
>
> is any vcpu, since the offsets are the same.

As I just said above, this assertion doesn't hold true once you have
nested virt, because the offset is per-cpu, and is adjusted to mean
different things on different hypervisors (some hypervisors expose
stolen time through it, for example).

What this patch does is expose a Linux-specific behaviour and try to
derive properties from it. It really doesn't work in general.

>
> > I also wonder why we need this when userspace already has direct
> > access to that information without any extra kernel support (read the
> > CNTVCT view of the vcpu using the ONEREG API, subtract it from the
> > host view of the counter, job done).
>
> If guest has access to the clock offset (between guest and host), then
> in the guest:
>
> clockval = hostclockval - clockoffset
>
> Adding "clockoffset" to that will retrieve the host clock.
>
> Is that what you mean?

No. The *VMM* (qemu, kvmtool, crosvm, insertyourfavouriteonehere)
already has access to it. Why do we need an extra interface?
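
For illustration only, the arithmetic amounts to something like the
sketch below. The helper names are hypothetical, and it assumes the VMM
has already sampled the vcpu's CNTVCT (through the ONEREG API) and the
host view of the counter as close together as it can; any delay between
the two samples ends up in the result. It also assumes a single
meaningful offset per vcpu, which is exactly what nested virt breaks.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive the virtual offset from two counter samples.
 * Architecturally CNTVCT = CNTPCT - CNTVOFF, so CNTVOFF = CNTPCT - CNTVCT.
 * Unsigned arithmetic wraps modulo 2^64, which matches the counter width
 * used here.
 */
static uint64_t cntvoff_from_samples(uint64_t host_cnt, uint64_t guest_cntvct)
{
	return host_cnt - guest_cntvct;
}

/*
 * Hypothetical helper: map a guest timestamp (taken from the guest's
 * virtual counter) onto the host counter timeline.
 */
static uint64_t guest_to_host_ts(uint64_t guest_ts, uint64_t cntvoff)
{
	return guest_ts + cntvoff;
}
```

Adding the derived offset to a guest timestamp yields the matching host
counter value, which is all that trace synchronization needs here.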
M.
--
Without deviation from the norm, progress is not possible.