From: Yury Norov <ynorov@caviumnetworks.com>
To: Shih-Wei Li <shihwei@cs.columbia.edu>
Cc: Andrew Jones <drjones@redhat.com>,
kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
Christoffer Dall <christoffer.dall@linaro.org>,
Marc Zyngier <marc.zyngier@arm.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Christoffer Dall <cdall@cs.columbia.edu>
Subject: Re: [PATCH v3 2/2] arm64: add micro test
Date: Sat, 20 Jan 2018 14:26:31 +0300 [thread overview]
Message-ID: <20180120112229.3eomv6gpsjtzhzzg@yury-thinkpad> (raw)
In-Reply-To: <CAO+sbHHKyYHjGqmS5uM=C6bCnUQ2CbHmmFBTAb=WanQsWdLU2Q@mail.gmail.com>
On Fri, Jan 19, 2018 at 04:57:55PM -0500, Shih-Wei Li wrote:
> Thanks for the feedback about the mistakes in math and some issues in
> naming, print msg, and coding style. I'll be careful and try to avoid
> the same problems the next patch set. Sorry for all of the confusion.
>
> So we now skip the test when "sample == 0" happens over 1000 times.
> This is only due to the case that "cost is < 1/cntfrq" since it's not
> possible for the tick to overflow for that many times. Did I miss
> something here? I do agree that we should output better msgs to tell
> users that the cost of a certain test is constantly smaller than a
> tick.
>
> Also, because ticks overflow should rarely happen, I wonder if we can
> simply ignore the data set if it happens? If not, any thoughts on how
> to efficiently distinguish one case from the other?
It was my very first suggestion for you - don't drop 0-cost samples
and ignore counter overflow.
0-cost samples most probably indicate error in the test code.
Numerous (more than 1) overflows are most probably caused by broken
management of timer-counters on VM side. In both cases results of
the test are not trustworthy.
In case of broken timer-counter, too much things will get crazy, and
if you suspect it, I think it's much-much more important to write
separated test for timer counters.
> Thinking of the time granularity used in the output, given there's
> likely difference in hardware implementation as previously mentioned,
> should we output the cost in ticks (along with frequency) like I did
> in v1 & v2 instead, allowing the users to do the translation to the exact
> time based on the output?
Whatever you prefer.
Yury
> Thanks again,
> Shih-Wei
>
> On Thu, Jan 18, 2018 at 6:39 AM, Yury Norov <ynorov@caviumnetworks.com> wrote:
> > On Thu, Jan 18, 2018 at 11:37:37AM +0100, Andrew Jones wrote:
> >> On Thu, Jan 18, 2018 at 01:09:58PM +0300, Yury Norov wrote:
> >> > On Wed, Jan 17, 2018 at 04:46:36PM -0500, Shih-Wei Li wrote:
> >> > > Here we provide the support for measuring various micro level
> >> > > operations on arm64. We iterate each of the tests and output
> >> > > their average, minimum and maximum cost in microseconds.
> >> > > Instruction barriers were used before taking timestamps to
> >> > > avoid out-of-order execution or pipelining from skewing our
> >> > > measurements.
> >> > >
> >> > > The operations we currently supported and measured are mostly
> >> > > self-explanatory by their function name. For IPI, we measured the
> >> > > cost of sending IPI from a source VCPU to a target VCPU, until the
> >> > > target VCPU receives the IPI.
> >> > >
> >> > > Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu>
> >> > > Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
> >> > > ---
> >> > > arm/Makefile.common | 1 +
> >> > > arm/micro-test.c | 252 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> > > arm/unittests.cfg | 7 ++
> >> > > 3 files changed, 260 insertions(+)
> >> > > create mode 100644 arm/micro-test.c
> >> > >
> >> > > diff --git a/arm/Makefile.common b/arm/Makefile.common
> >> > > index 0a039cf..c7d5c27 100644
> >> > > --- a/arm/Makefile.common
> >> > > +++ b/arm/Makefile.common
> >> > > @@ -16,6 +16,7 @@ tests-common += $(TEST_DIR)/pmu.flat
> >> > > tests-common += $(TEST_DIR)/gic.flat
> >> > > tests-common += $(TEST_DIR)/psci.flat
> >> > > tests-common += $(TEST_DIR)/sieve.flat
> >> > > +tests-common += $(TEST_DIR)/micro-test.flat
> >> > >
> >> > > tests-all = $(tests-common) $(tests)
> >> > > all: directories $(tests-all)
> >> > > diff --git a/arm/micro-test.c b/arm/micro-test.c
> >> > > new file mode 100644
> >> > > index 0000000..407ce8b
> >> > > --- /dev/null
> >> > > +++ b/arm/micro-test.c
> >> > > @@ -0,0 +1,252 @@
> >> > > +/*
> >> > > + * Measure the cost of micro level operations.
> >> > > + *
> >> > > + * Copyright Columbia University
> >> > > + * Author: Shih-Wei Li <shihwei@cs.columbia.edu>
> >> > > + * Author: Christoffer Dall <cdall@cs.columbia.edu>
> >> > > + *
> >> > > + * This work is licensed under the terms of the GNU LGPL, version 2.
> >> > > + */
> >> > > +#include <asm/gic.h>
> >> > > +#include "libcflat.h"
> >> > > +#include <util.h>
> >> > > +#include <limits.h>
> >> > > +
> >> > > +static volatile bool ipi_received;
> >> > > +static volatile bool ipi_ready;
> >> > > +static volatile unsigned int cntfrq;
> >> > > +static volatile void *vgic_dist_addr;
> >> >
> >> > Volatiles considered harmful for kernel code. I think it is also correct
> >> > for your case:
> >> > https://lwn.net/Articles/234017/
> >> >
> >> > I would prefer use readl/writel accessors for interprocessor
> >> > communications which guaranties cache synchronization and proper
> >> > ordering.
> >> >
> >> > Also, ipi_received and ipi_ready are looking like synchronization
> >> > primitives. Did you consider using spinlocks instead?
> >> >
> >> > Also-also, cntfrq is written only once at init. What for you make it
> >> > volatile?
> >>
> >> We frequently take some shortcuts in kvm-unit-tests, using busy loops and
> >> volatile globals for synchronization, trying to keep things as simple as
> >> possible. I'm ok with the ipi_* synchronization variables being volatile,
> >> but agree the other two don't need to be.
> >>
> >> >
> >> > > +void (*write_eoir)(u32 irqstat);
> >> > > +
> >> > > +#define IPI_IRQ 1
> >> > > +
> >> > > +#define TRIES (1U << 28)
> >> > > +
> >> > > +#define MAX_FAILURES 1000
> >> > > +
> >> > > +/*
> >> > > + * The counter may not always start with zero, which means it could
> >> > > + * overflow after some period of time.
> >> > > + */
> >> > > +#define COUNT(c1, c2) \
> >> > > + ((c1) > (c2) ? 0 : (c2) - (c1))
> >> > > +
> >> > > +static uint64_t read_cc(void)
> >> > > +{
> >> > > + isb();
> >> > > + return read_sysreg(cntpct_el0);
> >> > > +}
> >> > > +
> >> > > +static void ipi_irq_handler(struct pt_regs *regs __unused)
> >> > > +{
> >> > > + u32 ack;
> >> > > + ipi_ready = false;
> >> > > + ipi_received = true;
> >> > > + ack = gic_read_iar();
> >> > > + gic_write_eoir(ack);
> >> > > + ipi_ready = true;
> >> > > +}
> >> > > +
> >> > > +static void ipi_test_secondary_entry(void *data __unused)
> >> > > +{
> >> > > + enum vector v = EL1H_IRQ;
> >> > > + install_irq_handler(v, ipi_irq_handler);
> >> > > +
> >> > > + gic_enable_defaults();
> >> > > +
> >> > > + local_irq_enable(); /* Enter small wait-loop */
> >> > > + ipi_ready = true;
> >> > > + while (true);
> >> > > +}
> >> > > +
> >> > > +static int test_init(void)
> >> > > +{
> >> > > + int v;
> >> > > +
> >> > > + v = gic_init();
> >> > > + if (v == 2) {
> >> > > + vgic_dist_addr = gicv2_dist_base();
> >> > > + write_eoir = gicv2_write_eoir;
> >> > > + } else if (v == 3) {
> >> > > + vgic_dist_addr = gicv3_dist_base();
> >> > > + write_eoir = gicv3_write_eoir;
> >> > > + } else {
> >> > > + printf("No supported gic present, skipping tests...\n");
> >> > > + return 0;
> >> > > + }
> >> > > +
> >> > > + ipi_ready = false;
> >> > > +
> >> > > + gic_enable_defaults();
> >> > > + on_cpu_async(1, ipi_test_secondary_entry, 0);
> >> > > +
> >> > > + cntfrq = get_cntfrq();
> >> > > + printf("Timer Frequency %d Hz (Output in microseconds)\n", cntfrq);
> >> > > +
> >> > > + return 1;
> >> > > +}
> >> > > +
> >> > > +static unsigned long ipi_test(void)
> >> > > +{
> >> > > + unsigned int tries = TRIES;
> >> > > + uint64_t c1, c2;
> >> > > +
> >> > > + while (!ipi_ready && tries--);
> >> > > + assert(ipi_ready);
> >> > > +
> >> > > + ipi_received = false;
> >> > > +
> >> > > + c1 = read_cc();
> >> > > +
> >> > > + gic_ipi_send_single(IPI_IRQ, 1);
> >> > > +
> >> > > + tries = TRIES;
> >> > > + while (!ipi_received && tries--);
> >> > > + assert(ipi_received);
> >> > > +
> >> > > + c2 = read_cc();
> >> > > + return COUNT(c1, c2);
> >> > > +}
> >> > > +
> >> > > +static unsigned long hvc_test(void)
> >> > > +{
> >> > > + uint64_t c1, c2;
> >> > > +
> >> > > + c1 = read_cc();
> >> > > + asm volatile("mov w0, #0x4b000000; hvc #0" ::: "w0");
> >> > > + c2 = read_cc();
> >> > > + return COUNT(c1, c2);
> >> > > +}
> >> > > +
> >> > > +static unsigned long mmio_read_user(void)
> >> > > +{
> >> > > + uint64_t c1, c2;
> >> > > + /*
> >> > > + * FIXME: Read device-id in virtio mmio here. This address
> >> > > + * needs to be updated in the future if any relevent
> >> > > + * changes in QEMU test-dev are made.
> >> > > + */
> >> > > + void *mmio_read_user_addr = (void*) 0x0a000008;
> >> >
> >> > Again, could you rename it? device_id_ptr sounds much better.
> >> >
> >> > > + c1 = read_cc();
> >> > > + readl(mmio_read_user_addr);
> >> > > + c2 = read_cc();
> >> > > + return COUNT(c1, c2);
> >> > > +}
> >> > > +
> >> > > +static unsigned long mmio_read_vgic(void)
> >> > > +{
> >> > > + uint64_t c1, c2;
> >> > > +
> >> > > + c1 = read_cc();
> >> > > + readl(vgic_dist_addr + GICD_IIDR);
> >> > > + c2 = read_cc();
> >> > > + return COUNT(c1, c2);
> >> > > +}
> >> > > +
> >> > > +static unsigned long eoi_test(void)
> >> > > +{
> >> > > + uint64_t c1, c2;
> >> > > +
> >> > > + u32 val = 1023; /* spurious IDs, writes to EOI are ignored */
> >> > > +
> >> > > + /* Avoid measuring assert(..) in gic_write_eoir */
> >> > > + c1 = read_cc();
> >> > > + write_eoir(val);
> >> > > + c2 = read_cc();
> >> > > +
> >> > > + return COUNT(c1, c2);
> >> > > +}
> >> > > +
> >> > > +struct exit_test {
> >> > > + const char *name;
> >> > > + unsigned long (*test_fn)(void);
> >> > > + bool run;
> >> > > +};
> >> > > +
> >> > > +static struct exit_test tests[] = {
> >> > > + {"hvc", hvc_test, true},
> >> > > + {"mmio_read_user", mmio_read_user, true},
> >> > > + {"mmio_read_vgic", mmio_read_vgic, true},
> >> > > + {"eoi", eoi_test, true},
> >> > > + {"ipi", ipi_test, true},
> >> > > +};
> >> > > +
> >> > > +static void get_us_output(const char *name,
> >> > > + unsigned long cycles)
> >> > > +{
> >> > > + unsigned int ns_per_cycle = 10^9U / cntfrq;
> >> > > + unsigned int ns, us, us_frac;
> >> >
> >> > '^' is 'xor', you probably mean 1000*1000*1000. And anyway,
> >> > ThunderX2 works at ~2GHz, so 10E9/get_cntfrq() is 0 for it.
> >>
> >> The counter (cntpct_el0) ticks that fast? We could start with picoseconds
> >> if necessary.
> >>
> >> >
> >> > For CPUs working at 500...1000 MHz, ns_per_cycle is 1,
> >> > For CPUs working at 1GHz+ it is 0. What's point in it?
> >>
> >> We're not looking at the CPU cycles, we're looking at counter ticks.
> >> Indeed we could rename cycles to ticks everywhere to make that more
> >> explicit.
> >
> > Ah, I don't say that counter ticks at CPU frequency on TX2, or
> > something like that. I was probably confused by name. But switching to
> > picoseconds, or even femtoseconds sounds reasonable anyway.
> >
> > Yury
next prev parent reply other threads:[~2018-01-20 11:26 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-17 21:46 [PATCH v3 0/2] Support micro operation measurement on arm64 Shih-Wei Li
2018-01-17 21:46 ` [PATCH v3 1/2] arm/arm64: add GICD_IIDR definition Shih-Wei Li
2018-01-17 21:46 ` [PATCH v3 2/2] arm64: add micro test Shih-Wei Li
2018-01-18 9:30 ` Andrew Jones
2018-01-18 10:09 ` Yury Norov
2018-01-18 10:37 ` Andrew Jones
2018-01-18 11:39 ` Yury Norov
2018-01-19 21:57 ` Shih-Wei Li
2018-01-20 11:26 ` Yury Norov [this message]
2018-01-22 8:48 ` Andrew Jones
2018-01-23 18:48 ` Shih-Wei Li
2018-01-23 19:54 ` Andrew Jones
2018-01-24 8:17 ` Christoffer Dall
2018-01-24 8:30 ` Andrew Jones
2018-01-24 9:23 ` Christoffer Dall
2018-01-25 2:55 ` Shih-Wei Li
2018-01-25 8:52 ` Christoffer Dall
2018-01-25 13:45 ` Shih-Wei Li
2018-01-18 8:45 ` [PATCH v3 0/2] Support micro operation measurement on arm64 Yury Norov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180120112229.3eomv6gpsjtzhzzg@yury-thinkpad \
--to=ynorov@caviumnetworks.com \
--cc=cdall@cs.columbia.edu \
--cc=christoffer.dall@linaro.org \
--cc=drjones@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=marc.zyngier@arm.com \
--cc=pbonzini@redhat.com \
--cc=shihwei@cs.columbia.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).