From: "Mihai Donțu" <mdontu@bitdefender.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>,
"Jan Kiszka" <jan.kiszka@siemens.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Adalbert Lazar" <alazar@bitdefender.com>,
kvm@vger.kernel.org
Subject: Re: [RFC PATCH v2 1/1] kvm: Add documentation and ABI/API header for VM introspection
Date: Thu, 27 Jul 2017 20:06:25 +0300 [thread overview]
Message-ID: <1501175185.8856.9.camel@bitdefender.com> (raw)
In-Reply-To: <7104167e-0747-92fe-05df-1b7e1848d65f@redhat.com>
On Fri, 2017-07-07 at 18:52 +0200, Paolo Bonzini wrote:
> > +2. KVMI_GET_GUEST_INFO
> > +----------------------
> > +
> > +:Architectures: all
> > +:Versions: >= 1
> > +:Parameters: {}
> > +:Returns: ↴
> > +
> > +::
> > +
> > > > + struct kvmi_get_guest_info_reply {
> > > > + __s32 err;
> > > > + __u16 vcpu_count;
> > > > + __u16 padding;
> > > > + __u64 tsc_speed;
> > > > + };
> > +
> > +Returns the number of online vcpus, and the TSC frequency in HZ, if supported
> > +by the architecture (otherwise is 0).
>
> Can the TSC frequency be specified only if the guest is using TSC
> scaling? Defining the TSC frequency on older hosts is a bit tricky. 0
> would be the host.
>
> Maybe define the second padding to be
>
> __u16 arch;
>
> (0 = x86) followed by an arch-specific payload.
We use the TSC to compute some performance numbers, but only when we're
debugging reported issues. Should be OK if the TSC information is not
available. We'll manage in some other way.
> > +10. KVMI_GET_XSAVE_INFO
> > +-----------------------
> > +
> > +:Architectures: x86
> > +:Versions: >= 1
> > +:Parameters: ↴
> > +
> > +::
> > +
> > + struct kvmi_xsave_info {
> > + __u16 vcpu;
> > + __u16 padding[3];
> > + };
> > +
> > +:Returns: ↴
> > +
> > +::
> > +
> > + struct kvmi_xsave_info_reply {
> > + __s32 err;
> > + __u32 size;
> > + };
> > +
> > +Returns the xstate size for the specified vCPU.
>
> Could this be replaced by a generic CPUID accessor?
Absolutely. I wonder if we should only made certain leafs acessible,
part of the whole "make the least amount of information available"
security philosophy, though I'm not aware of any attacks that can be
mounted on the host just by knowing too many things about a guest.
> > +11. KVMI_GET_PAGE_ACCESS
> > +------------------------
> > +
> > +:Architectures: all
> > +:Versions: >= 1
> > +:Parameters: ↴
> > +
> > +::
> > +
> > + struct kvmi_get_page_access {
> > + __u16 vcpu;
> > + __u16 padding[3];
> > + __u64 gpa;
> > + };
> > +
> > +:Returns: ↴
> > +
> > +::
> > +
> > + struct kvmi_get_page_access_reply {
> > + __s32 err;
> > + __u32 access;
> > + };
> > +
> > +Returns the spte flags (rwx - present, write & user) for the specified
> > +vCPU and guest physical address.
> > +
> > +12. KVMI_SET_PAGE_ACCESS
> > +------------------------
> > +
> > +:Architectures: all
> > +:Versions: >= 1
> > +:Parameters: ↴
> > +
> > +::
> > +
> > + struct kvmi_set_page_access {
> > + __u16 vcpu;
> > + __u16 padding;
> > + __u32 access;
> > + __u64 gpa;
> > + };
> > +
> > +:Returns: ↴
> > +
> > +::
> > +
> > + struct kvmi_error_code {
> > + __s32 err;
> > + __u32 padding;
> > + };
> > +
> > +Sets the spte flags (rwx - present, write & user) - access - for the specified
> > +vCPU and guest physical address.
>
> rwx or pwu? I suppose RWX. Maybe #define the constants in the
> documentation.
>
> Also, it should be noted here that the spte flags are ANDed with
> whatever is provided by userspace, because there could be readonly
> memslots, and that some access combinations could fail (--x) or will
> surely fail (-wx for example).
We are closely looking into how KVM's MMU works and see how we'd go
about extending it so as to make everyone happy (qemu and a possible
introspection tool).
As for the permitted page access flags, we have (thus far) kept away
from invalid combinations on older arch-es (--x on pre-SandyBridge, for
example). However, it would be very nice to have arch-specific code
that does a validation ahead of time (ie. before the guest is re-
entered) so that an introspection tool can use an alternative approach.
> > +13. KVMI_INJECT_PAGE_FAULT
> > +--------------------------
> > +
> > +:Architectures: x86
> > +:Versions: >= 1
> > +:Parameters: ↴
> > +
> > +::
> > +
> > + struct kvmi_page_fault {
> > + __u16 vcpu;
> > + __u16 padding;
> > + __u32 error;
>
> error_code, I guess? Why not a generic inject exception message?
OK, we will rework the interface to be used for generic exception
injection. Maybe we can fold the breakpoint injection into it too.
> > + __u64 gva;
> > + };
> > +
> > +:Returns: ↴
> > +
> > +::
> > +
> > + struct kvmi_error_code {
> > + __s32 err;
> > + __u32 padding;
> > + };
> > +
> > +Injects a vCPU page fault with the specified guest virtual address and
> > +error code.
> > +
> > +5. KVMI_EVENT_USER_CALL
> > +-----------------------
>
> Please rename this to KVMI_EVENT_HCALL or HYPERCALL or VMCALL.
>
> > +
> > +:Architectures: x86
> > +:Versions: >= 1
> > +:Parameters: ↴
> > +
> > +::
> > +
> > + struct kvmi_event_x86;
> > +
> > +:Returns: ↴
> > +
> > +::
> > +
> > + struct kvmi_event_x86_reply;
> > +
> > +This event is sent on a user hypercall and the introspection has already
> > +already been enabled for this kind of event (KVMI_CONTROL_EVENTS).
> > +
> > +kvmi_event_x86 is sent to the introspector, which can respond with the
> > +KVMI_EVENT_ACTION_SET_REGS bit set in 'actions', instructing the host
> > +kernel to override the general purpose registers using the values from
> > +introspector (regs).
>
> Does KVMI_EVENT_ACTION_SET_REGS bypass the hypercall? Why is
> KVMI_EVENT_ACTION_ALLOW not allowed? As before, I'd prefer
> SKIP/RETRY/ALLOW/CRASH as the actions.
No, we use SET_REGS only as a means to communicate with the guest code
that did the hypercall. We don't specify an action because, in our
specific case, saw no use for it.
Apropos: if possible, we'd like to keep the Xen convention for this
hypercall. It doesn't appear to interfere with either KVM or Hyper-V
(rax = 34 [__HYPERVISOR_hvm_op], rbx/rdi = 24
[HVMOP_guest_request_vm_event] - depends on wether long mode is
enabled).
> > +7. KVMI_EVENT_TRAP
> > +------------------
> > +
> > +:Architectures: x86
> > +:Versions: >= 1
> > +:Parameters: ↴
> > +
> > +::
> > +
> > + struct kvmi_event_x86;
> > + struct kvmi_event_trap {
> > + __u32 vector;
> > + __u32 type;
> > + __u32 err;
>
> error_code?
>
> > + __u32 padding;
> > + __u64 cr2;
> > + };
> > +
> > +:Returns: ↴
> > +
> > +::
> > +
> > + struct kvmi_event_x86_reply;
> > +
> > +This event is sent if a trap will be delivered to the guest (page fault,
> > +breakpoint, etc.) and the introspection has already enabled the reports
> > +for this kind of event (KVMI_CONTROL_EVENTS).
> > +
> > +This is used to inform the introspector of all pending traps giving it
> > +a chance to determine if it should try again later in case a previous
> > +KVMI_INJECT_PAGE_FAULT/KVMI_INJECT_BREAKPOINT command has been overwritten
> > +by an interrupt picked up during guest reentry.
> > +
> > +kvmi_event_x86, exception/interrupt number (vector), exception/interrupt
> > +type, exception code (err) and CR2 are sent to the introspector, which can
> > +respond with the KVMI_EVENT_ACTION_SET_REGS bit set in 'actions', instructing
> > +the host kernel to override the general purpose registers using the values
> > +from introspector (regs).
>
> Here I think the actions could be RETRY/ALLOW/CRASH only, with "set
> regs" done as a separate command.
>
> Some x86 exceptions are faults, others are traps. Traps should not
> allow RETRY as an action. There should be an input telling the
> introspector if retrying is allowed.
>
> How are dr6/dr7 handled around breakpoints?
Afaik, the debug registers should remain untouched, but I will ask my
colleagues with better knowledge of this.
Thanks,
--
Mihai Donțu
next prev parent reply other threads:[~2017-07-27 17:06 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-07 14:34 [RFC PATCH v2 0/1] VM introspection Adalbert Lazar
2017-07-07 14:34 ` [RFC PATCH v2 1/1] kvm: Add documentation and ABI/API header for " Adalbert Lazar
2017-07-07 16:52 ` Paolo Bonzini
2017-07-10 15:32 ` alazar
2017-07-10 17:03 ` Paolo Bonzini
2017-07-11 16:48 ` Adalbert Lazar
2017-07-11 16:51 ` Paolo Bonzini
2017-07-13 5:57 ` Mihai Donțu
2017-07-13 7:32 ` Paolo Bonzini
2017-07-18 11:51 ` Mihai Donțu
2017-07-18 12:02 ` Mihai Donțu
2017-07-23 13:02 ` Paolo Bonzini
2017-07-26 17:04 ` Mihai Donțu
2017-07-26 17:25 ` Tamas K Lengyel
2017-07-27 14:41 ` Mihai Donțu
2017-07-27 13:33 ` Paolo Bonzini
2017-07-27 14:46 ` Mihai Donțu
2017-07-13 8:36 ` Mihai Donțu
2017-07-13 9:15 ` Paolo Bonzini
2017-07-27 16:23 ` Mihai Donțu
2017-07-27 16:52 ` Paolo Bonzini
2017-07-27 17:19 ` Mihai Donțu
2017-08-01 10:40 ` Paolo Bonzini
2017-08-01 16:33 ` Tamas K Lengyel
2017-08-01 20:47 ` Paolo Bonzini
2017-08-02 11:52 ` Mihai Donțu
2017-08-02 12:27 ` Paolo Bonzini
2017-08-02 13:32 ` Mihai Donțu
2017-08-02 13:51 ` Paolo Bonzini
2017-08-02 14:17 ` Mihai Donțu
2017-08-04 8:35 ` Paolo Bonzini
2017-08-04 15:29 ` Mihai Donțu
2017-08-04 15:37 ` Paolo Bonzini
2017-08-05 8:00 ` Andrei Vlad LUTAS
2017-08-07 12:18 ` Paolo Bonzini
2017-08-07 13:25 ` Mihai Donțu
2017-08-07 13:49 ` Paolo Bonzini
2017-08-07 14:12 ` Mihai Donțu
2017-08-07 15:56 ` Paolo Bonzini
2017-08-07 16:44 ` Mihai Donțu
2017-08-02 13:53 ` Mihai Donțu
2017-07-27 17:06 ` Mihai Donțu [this message]
2017-07-27 17:18 ` Paolo Bonzini
2017-07-07 17:29 ` [RFC PATCH v2 0/1] " Paolo Bonzini
2017-08-07 15:28 ` Mihai Donțu
2017-08-07 15:44 ` Paolo Bonzini
2017-07-12 14:09 ` Konrad Rzeszutek Wilk
2017-07-13 5:37 ` Mihai Donțu
2017-07-14 16:13 ` Konrad Rzeszutek Wilk
2017-07-18 8:55 ` Mihai Donțu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1501175185.8856.9.camel@bitdefender.com \
--to=mdontu@bitdefender.com \
--cc=alazar@bitdefender.com \
--cc=jan.kiszka@siemens.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox