From: Alexander Graf <graf@amazon.de>
To: Paolo Bonzini <pbonzini@redhat.com>, <milanpa@amazon.com>,
Milan Pandurov <milanpa@amazon.de>, <kvm@vger.kernel.org>
Cc: <rkrcmar@redhat.com>, <borntraeger@de.ibm.com>
Subject: Re: [PATCH 2/2] kvm: Add ioctl for gathering debug counters
Date: Mon, 20 Jan 2020 18:53:10 +0100 [thread overview]
Message-ID: <e77a2477-6010-ae1d-0afd-0c5498ba2117@amazon.de> (raw)
In-Reply-To: <668ea6d3-06ae-4586-9818-cdea094419fe@redhat.com>
On 18.01.20 00:38, Paolo Bonzini wrote:
> On 15/01/20 15:59, Alexander Graf wrote:
>> On 15.01.20 15:43, milanpa@amazon.com wrote:
>>>>> Let's expose new interface to userspace for garhering these
>>>>> statistics with one ioctl.
>>>>>
>>>>> Userspace application can read counter description once using
>>>>> KVM_GET_SUPPORTED_DEBUGFS_STAT and periodically invoke the
>>>>> KVM_GET_DEBUGFS_VALUES to get value update.
>>>>
>>>> This is an interface that requires a lot of logic and buffers from
>>>> user space to retrieve individual, explicit counters. What if I just
>>>> wanted to monitor the number of exits on every user space exit?
>>>
>>> In case we want to cover such latency sensitive use cases solution b),
>>> with mmap'ed structs you suggested, would be a way to go, IMO.
>>>
>>>> Also, we're suddenly making the debugfs names a full ABI, because
>>>> through this interface we only identify the individual stats through
>>>> their names. That means we can not remove stats or change their
>>>> names, because people may rely on them, no? Thining about this again,
>>>> maybe they already are an ABI because people rely on them in debugfs
>>>> though?
>
> In theory not, in practice I have treated them as a kind of "soft" ABI:
> if the meaning changes you should rename them, and removing them is
> fine, but you shouldn't for example change the unit of measure (which is
> not hard since they are all counters :) but perhaps you could have
> nanoseconds vs TSC cycles in the future for some counters).
>
>>>> I see two alternatives to this approach here:
>>>>
>>>> a) ONE_REG
>>>>
>>>> We can just add a new DEBUG arch in ONE_REG and expose ONE_REG per VM
>>>> as well (if we really have to?). That gives us explicit identifiers
>>>> for each stat with an explicit path to introduce new ones with very
>>>> unique identifiers.
> ONE_REG would force us to define constants for each counter, and would
> make it hard to retire them. I don't like this.
Why does it make it hard to retire them? We would just return -EINVAL on
retrieval, like we do for any other non-supported ONE_REG.
It's the same as a file not existing in debugfs/statfs. Or an entry in
the array of this patch to disappear.
>
>>>> b) part of the mmap'ed vcpu struct
>
> Same here. Even if we say the semantics of the struct would be exposed
> to userspace via KVM_GET_SUPPORTED_DEBUGFS_STAT, someone might end up
> getting this wrong and expecting a particular layout. Milan's proposed
> API has the big advantage of being hard to get wrong for userspace. And
> pushing the aggregation to userspace is not a huge chore, but it's still
> a chore.
>
> So unless someone has a usecase for latency-sensitive monitoring I'd
> prefer to keep it simple (usually these kind of stats can even make
> sense if you gather them over relatively large period of time, because
> then you'll probably use something else like tracepoints to actually
> pinpoint what's going on).
I tend to agree. Fetching them via an explicit call into the kernel is
definitely the safer route.
>
>>>> 2) vcpu counters
>>>>
>>>> Most of the counters count on vcpu granularity, but debugfs only
>>>> gives us a full VM view. Whatever we do to improve the situation, we
>>>> should definitely try to build something that allows us to get the
>>>> counters per vcpu (as well).
>>>>
>>>> The main purpose of these counters is monitoring. It can be quite
>>>> important to know that only a single vCPU is going wild, compared to
>>>> all of them for example.
>>>
>>> I agree, exposing per vcpu counters can be useful. I guess it didn't
>>> make much sense exposing them through debugfs so aggregation was done
>>> in kernel. However if we chose to go with approach 1-b) mmap counters
>>> struct in userspace, we could do this.
>>
>> The reason I dislike the debugfs/statfs approach is that it generates a
>> completely separate permission and access paths to the stats. That's
>> great for full system monitoring, but really bad when you have multiple
>> individual tenants on a single host.
>
> I agree, anything in sysfs is complementary to vmfd/vcpufd access.
Cool :).
So we only need to agree on ONE_REG vs. this patch mostly?
Alex
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
next prev parent reply other threads:[~2020-01-20 17:53 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-15 13:43 [PATCH 2/2] kvm: Add ioctl for gathering debug counters Milan Pandurov
2020-01-15 14:04 ` Alexander Graf
2020-01-15 14:43 ` milanpa
2020-01-15 14:59 ` Alexander Graf
2020-01-17 23:38 ` Paolo Bonzini
2020-01-20 17:53 ` Alexander Graf [this message]
2020-01-20 18:57 ` milanpa
2020-01-21 15:38 ` Alexander Graf
2020-01-23 12:08 ` Paolo Bonzini
2020-01-23 12:32 ` Alexander Graf
2020-01-23 14:19 ` Paolo Bonzini
2020-01-23 14:45 ` Alexander Graf
2020-01-23 14:50 ` Paolo Bonzini
2020-01-23 14:58 ` Alexander Graf
2020-01-23 15:05 ` Paolo Bonzini
2020-01-23 15:27 ` milanpa
2020-01-23 16:15 ` Paolo Bonzini
2020-01-23 18:31 ` milanpa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e77a2477-6010-ae1d-0afd-0c5498ba2117@amazon.de \
--to=graf@amazon.de \
--cc=borntraeger@de.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=milanpa@amazon.com \
--cc=milanpa@amazon.de \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox