From: Andrew Jones <drjones@redhat.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <Mark.Rutland@arm.com>,
qemu-devel@nongnu.org,
Christoffer Dall <christoffer.dall@linaro.org>
Subject: Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer
Date: Wed, 20 Jan 2016 17:47:49 +0100
Message-ID: <20160120164749.GC3723@hawk.localdomain>
In-Reply-To: <569FB3B3.7040303@arm.com>

On Wed, Jan 20, 2016 at 04:20:03PM +0000, Marc Zyngier wrote:
> On 20/01/16 15:06, Andrew Jones wrote:
> > On Wed, Jan 20, 2016 at 02:28:05PM +0000, Marc Zyngier wrote:
> >> On 20/01/16 14:01, Andrew Jones wrote:
> >>> On Tue, Jan 19, 2016 at 07:48:14PM +0100, Andrew Jones wrote:
> >>>> On Tue, Jan 19, 2016 at 01:43:07PM +0000, Marc Zyngier wrote:
> >>>>>>> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> >>>>>> OK, CCing him. One thing I see is that without this change we're
> >>>>>> currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
> >>>>>> it's not true. Having that set may disable the oneshot capability
> >>>>>> necessary to switch to nohz mode? I'll just stop there with my
> >>>>>> speculation though, so Marc won't have to correct too much...
> >>>>>
> >>>>> You're spot on. See 82a5619 in the kernel tree. When I did a similar
> >>>>> change in kvmtool, I saw a massive reduction in the number of timer
> >>>>> interrupts injected (especially when the number of vcpus is relatively high).
> >>>>>
> >>>>> This also has interesting benefits when running on a model, where
> >>>>> you're trying to squeeze the last bits of "performance" from the monster...
> >>>>>
> >>>>
> >>>> Hmm, I'm probably testing this wrong, but I don't see any difference in
> >>>> the number of injected timer interrupts. My guest, which I boot with
> >>>> UEFI, has
> >>>>
> >>>> CONFIG_ARM_ARCH_TIMER=y
> >>>> CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
> >>>> CONFIG_ARM_TIMER_SP804=y
> >>>> CONFIG_HIGH_RES_TIMERS=y
> >>>> CONFIG_TICK_ONESHOT=y
> >>>> CONFIG_NO_HZ_COMMON=y
> >>>> # CONFIG_HZ_PERIODIC is not set
> >>>> CONFIG_NO_HZ_IDLE=y
> >>>> # CONFIG_NO_HZ_FULL is not set
> >>>> CONFIG_NO_HZ=y
> >>>> CONFIG_HZ_1000=y
> >>>> CONFIG_HZ=1000
> >>>>
> >>>> I've booted a guest using DT with and without this patch
> >>>>
> >>>> ---WITHOUT---
> >>>>
> >>>> # ls /proc/device-tree/timer
> >>>> compatible interrupts name
> >>>> # cat /proc/interrupts
> >>>>       CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
> >>>>   3:  6958  5766  5166  5187  5576  5129  4695  4398  GIC  27  Edge  arch_timer
> >>>> # sleep 120 && cat /proc/interrupts
> >>>>       CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
> >>>>   3:  7557  5986  5487  5265  6232  5868  5464  4438  GIC  27  Edge  arch_timer
> >>>>
> >>>> ---WITH---
> >>>>
> >>>> # ls /proc/device-tree/timer
> >>>> always-on compatible interrupts name
> >>>> # cat /proc/interrupts
> >>>>       CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
> >>>>   3:  7005  6080  4996  5391  5165  5257  4930  4844  GIC  27  Edge  arch_timer
> >>>> # sleep 120 && cat /proc/interrupts
> >>>>       CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
> >>>>   3:  7523  6505  5264  6717  5273  5391  5526  4901  GIC  27  Edge  arch_timer
> >>>>
> >>>>
> >>>>
> >>>> And kvm trace data has
> >>>>
> >>>> ---WITHOUT---
> >>>> $ grep kvm_timer_update_irq trace.out | wc -l
> >>>> 94336
> >>>> ---WITH---
> >>>> $ grep kvm_timer_update_irq trace.out | wc -l
> >>>> 95838
> >>>>
> >>>>
> >>>
> >>> Must be how I'm looking, because I just tried kvmtool with/without
> >>> Marc's patch that adds always-on, but don't see any reduction of
> >>> interrupts there either. I used a defconfig guest kernel. Also,
> >>> not that I think it should matter, but my host kernel is 4.4-rc4
> >>> based.
> >>>
> >>> I'd like to be able to see a difference with/without this always-on
> >>> patch, not because I don't think we should take it anyway, but because
> >>> I need a test case for the ACPI counterpart.
> >>
> >> I just ran a couple of quick tests, measuring interrupt rate (vmstat 1)
> >> on the host, with one VM (2 vcpus) idling, and I'm seeing the following
> >> thing:
> >>
> >> Without "always-on": ~380 interrupts per second
> >> With "always-on": ~40 interrupts per second
> >>
> >> This is with kvmtool, 32bit host (but none of that is arch specific anyway).
> >>
> >
> > For me (64bit host, one VM (8 vcpus)) of 100 'vmstat 1' samples I have the
> > following.
> >
> > Without "always-on": mean=56.370 sd=33.404 min=1 max=244
> > With "always-on": mean=51.580 sd=33.361 min=1 max=273
> >
> > I'm also using kvmtool, and my guest is idle.
> >
> > So a difference between 32 and 64bit hosts? Again, my guest config is
> > now just a defconfig. My host config is not, but I'm not sure what
> > options to look for other than what I wrote above, which are the same
> > for my host.
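
A minimal sketch of how a summary like the one above could be produced
from captured 'vmstat 1' output. The helper names (interrupts_column,
summarize) are made up for illustration, and it assumes the usual procps
layout where the interrupt rate is the "in" column. Whether the "sd"
quoted above is the sample or the population deviation isn't stated, so
statistics.stdev (sample) is only a guess.

```python
import statistics


def interrupts_column(vmstat_lines):
    """Collect the 'in' (interrupts/s) column from 'vmstat 1' output lines."""
    col = None
    values = []
    for line in vmstat_lines:
        fields = line.split()
        if "in" in fields and "cs" in fields:
            # Column-name header; vmstat repeats it periodically.
            col = fields.index("in")
        elif col is not None and fields and fields[0].isdigit():
            # Data row: every field is numeric, pick the 'in' column.
            values.append(int(fields[col]))
    return values


def summarize(samples):
    """Return mean, sample standard deviation, min and max of the samples."""
    return {
        "mean": statistics.mean(samples),
        "sd": statistics.stdev(samples),  # assumption: sample, not population
        "min": min(samples),
        "max": max(samples),
    }
```

The first row vmstat prints is an average since boot, so it is probably
worth dropping before calling summarize().
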
>
> Just tried on Seattle with a 64bit guest, and there is hardly any
> difference indeed. Both host and guest are "mostly" defconfig as well.
> So there is a kernel configuration difference.
>
> Running my 32bit guest on a 64bit host definitely shows a massive
> difference (with 8 vcpus):
>
> Without "always-on": ~1200 interrupts per second
> With "always-on": ~50 interrupts per second
>
> [Head scratching, poking Mark]
>
> Right, I now know what is going on: The arm64 kernel uses
> tick_setup_hrtimer_broadcast() so that it can still use the arch timer
> as a broadcast timer (forcing one CPU to remain on), while the 32bit
> kernel relies on the presence of a backup timer (sp804 anyone?) or the
> guarantee that the timer cannot go away (always-on).
>
> This is probably why I'm seeing such a gain with a 32bit guest, and none
> with a 64bit guest (the kernel already does the right thing). As to why
> there is such a difference between the two architectures, this is a
> story for another day...
>
Thanks Marc! I just confirmed with an AArch32 guest using QEMU, and the
patch we've hijacked for this thread. Without the patch I get a megaton
of interrupts (~14000/s). With the patch, after letting the guest chill
for a while, I'm getting ~150/s (8 vcpus).
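
For anyone wanting to derive per-second rates from inside the guest, a
rough sketch, assuming two snapshots of the arch_timer row of
/proc/interrupts taken a known number of seconds apart. The function
names are hypothetical, the sample rows are the WITH snapshots quoted
earlier in this thread (64-bit guest), and the 120 s interval is only
approximate.

```python
def parse_counts(row):
    """Extract the per-CPU counters from one /proc/interrupts row."""
    counts = []
    # Skip the leading "3:" label, then read integers until the first
    # non-numeric field (the interrupt controller name, e.g. "GIC").
    for field in row.split()[1:]:
        if not field.isdigit():
            break
        counts.append(int(field))
    return counts


def interrupt_rate(before_row, after_row, seconds):
    """Return (per-CPU deltas, total interrupts per second)."""
    before = parse_counts(before_row)
    after = parse_counts(after_row)
    deltas = [a - b for b, a in zip(before, after)]
    return deltas, sum(deltas) / seconds


# Rows copied from the WITH snapshots quoted earlier in this thread.
BEFORE = "3: 7005 6080 4996 5391 5165 5257 4930 4844 GIC 27 Edge arch_timer"
AFTER = "3: 7523 6505 5264 6717 5273 5391 5526 4901 GIC 27 Edge arch_timer"

if __name__ == "__main__":
    deltas, rate = interrupt_rate(BEFORE, AFTER, 120)
    print(deltas, round(rate, 1))
```

For those two rows this comes to 3432 interrupts over roughly 120 s,
i.e. about 28.6/s summed over the 8 vcpus.
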

I guess I don't have a test case for the ACPI code though. afaik we only
have UEFI for AArch64 guests, and we don't have ACPI boot without UEFI.
Or maybe I can hack time_init to remove the tick_setup_hrtimer_broadcast
call?
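
To spell out why that hack might give a test case, here is a toy model
of Marc's explanation above. It is not kernel code: the function and
parameter names are invented, and it reduces the clockevents logic to a
single yes/no question (can the per-CPU arch timer be used for a
oneshot/nohz tick?).

```python
def tick_can_go_oneshot(always_on, hrtimer_broadcast, backup_timer):
    """Toy model: may the per-CPU arch timer drive a oneshot (nohz) tick?"""
    # Without "always-on" the timer is assumed to stop in deep idle states,
    # which the kernel flags as CLOCK_EVT_FEAT_C3STOP.
    c3stop = not always_on
    if not c3stop:
        return True
    # A C3STOP timer needs something else to wake sleeping CPUs: either the
    # hrtimer-based broadcast (arm64) or a separate backup timer such as an
    # sp804 (what the 32-bit kernel relies on).
    return hrtimer_broadcast or backup_timer


# (always_on, hrtimer_broadcast, backup_timer) for the cases in this thread.
CASES = {
    "arm64 guest, no always-on": (False, True, False),
    "arm32 guest, no always-on": (False, False, False),
    "arm32 guest, always-on": (True, False, False),
    "arm64 guest, broadcast call removed, no always-on": (False, False, False),
}

if __name__ == "__main__":
    for name, args in CASES.items():
        print(name, "->", tick_can_go_oneshot(*args))
```

If the model is right, the last case is the one where always-on (or its
ACPI counterpart) would start to make a visible difference on an AArch64
guest.
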
drew