From: Kyle Huey <me@kylehuey.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Kevin Tian <kevin.tian@intel.com>, Wei Liu <wei.liu2@citrix.com>,
Jun Nakajima <jun.nakajima@intel.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
xen-devel@lists.xen.org, Jan Beulich <JBeulich@suse.com>,
Robert O'Callahan <robert@ocallahan.org>
Subject: Re: [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting
Date: Sun, 23 Oct 2016 21:18:10 -0700
Message-ID: <CAP045AponwaQwotfiKR6_u0k41J8LHr3y6VfNyt4zpVcmTjEhg@mail.gmail.com>
In-Reply-To: <CAP045Ap46mTAG6YcxiYYdA_GX5Lx88+-XXjxe5qy_pr1msN=EQ@mail.gmail.com>

On Fri, Oct 21, 2016 at 8:52 AM, Kyle Huey <me@kylehuey.com> wrote:
> On Thu, Oct 20, 2016 at 7:40 AM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>> On 10/20/2016 10:11 AM, Andrew Cooper wrote:
>>> On 20/10/16 14:55, Kyle Huey wrote:
>>>>>> That said, rr currently does not work in Xen guests due to some PMU
>>>>>> issues that we haven't tracked down yet.
>>>>> Is this RR trying to use vPMU and it not functioning, or not
>>>>> specifically trying to use PMU facilities and getting stuck anyway?
>>>> The latter. rr relies on the values returned by the PMU (the retired
>>>> conditional branches counter in particular) being exactly the same
>>>> during the recording and replay phases. This is true when running on
>>>> bare metal, and when running inside a KVM guest, but when running in a
>>>> Xen HVM guest we see values that are off by a branch or two on a small
>>>> fraction of our tests. Since it works in KVM I suspect this is some
>>>> sort of issue with how Xen multiplexes the real PMU and events are
>>>> "leaking" between guests (or perhaps from Xen itself, though I don't
>>>> think the Xen kernel executes any ring 3 code). Even if that's
>>>> correct we're a long way from tracking it down and patching it though.
>>> Hmm. That is unfortunate, and does point towards a bug in Xen. Are
>>> these tests which notice the problem easy to run?
>>>
>>> Boris (CC'd) is the maintainer of that code. It has undergone quite a
>>> few changes recently.
>>
>> I am actually not the maintainer, I just break this code more often than
>> others.
>>
>> But yes, having a test case would make it much easier to understand what
>> and why is not working.
>>
>> Would something like
>>
>> wrmsr(PERFCTR,0);
>> wrmsr(EVNTSEL, XXX); //enable counter
>> // do something simple, with branches
>> wrmsr(EVNTSEL, YYY); // disable counter
>>
>> demonstrate the problem? (I assume we are talking about HVM guest)
>>
>> -boris
>>
>
> That is a good question. I'll see if I can reduce the problem down
> from "run Linux and run our tests inside it".

The anomalies we see appear to be related to, or at least triggerable
by, the performance monitoring interrupt. The following program runs
a loop of roughly 2^25 conditional branches. It takes one argument,
the number of conditional branches to program the PMI to trigger on.
The default is 50,000, and if you run the program with that it'll
produce the same value every time. If you drop it to 5000 or so
you'll probably see occasional off-by-one discrepancies. If you drop
it to 500 the performance counter values fluctuate wildly.

I'm not yet sure if this is specifically related to the PMI, or if it
can be caused by any interrupt and it's only how frequently the
interrupts occur that matters.

- Kyle
#define _GNU_SOURCE 1
#include <assert.h>
#include <fcntl.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
static struct perf_event_attr rcb_attr;
static uint64_t period;
static int fd;
/* Zero the counter, program the PMI period, then start counting. */
void counter_on(uint64_t ticks)
{
  int ret = ioctl(fd, PERF_EVENT_IOC_RESET, 0);
  assert(!ret);
  ret = ioctl(fd, PERF_EVENT_IOC_PERIOD, &ticks);
  assert(!ret);
  ret = ioctl(fd, PERF_EVENT_IOC_ENABLE, 1);
  assert(!ret);
}
/* Stop counting so the value read afterwards is stable. */
void counter_off(void)
{
  int ret = ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
  assert(!ret);
}
int64_t read_counter(void)
{
  int64_t val;
  ssize_t nread = read(fd, &val, sizeof(val));
  assert(nread == sizeof(val));
  return val;
}
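void do_test(void)
{
  int64_t counts;
  int i;
  /* Initialized and unsigned so the accumulation is well defined;
     volatile so the compiler cannot optimize the loop away. */
  volatile unsigned dummy = 0;

  counter_on(period);
  for (i = 0; i < (1 << 25); i++) {
    dummy += i % (1 << 10);
    dummy += i % (79 * (1 << 10));
  }
  counter_off();
  counts = read_counter();
  printf("Counted %lld conditional branches\n", (long long)counts);
}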
int main(int argc, const char* argv[])
{
  unsigned long long arg;

  memset(&rcb_attr, 0, sizeof(rcb_attr));
  rcb_attr.size = sizeof(rcb_attr);
  rcb_attr.type = PERF_TYPE_RAW;
  /* Intel retired conditional branches counter, ring 3 only */
  rcb_attr.config = 0x5101c4;
  rcb_attr.exclude_kernel = 1;
  rcb_attr.exclude_guest = 1;
  /* We'll change this later */
  rcb_attr.sample_period = 0xffffffff;
  /* open the counter for the calling thread, on any CPU */
  fd = syscall(__NR_perf_event_open, &rcb_attr, 0, -1, -1, 0);
  if (fd < 0) {
    printf("Failed to initialize counter\n");
    return -1;
  }
  signal(SIGALRM, SIG_IGN);
  if (fcntl(fd, F_SETFL, O_ASYNC) || fcntl(fd, F_SETSIG, SIGALRM)) {
    printf("Failed to make counter async\n");
    return -1;
  }
  counter_off();
  period = 50000;
  /* keep the default if the argument does not parse as a number */
  if (argc > 1 && sscanf(argv[1], "%llu", &arg) == 1) {
    period = arg;
  }
  printf("Period is %llu\n", (unsigned long long)period);
  do_test();
  return 0;
}