From: Cyril Bur <cyrilbur@gmail.com>
To: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, mpe@ellerman.id.au,
drjones@redhat.com, dzickus@redhat.com, mingo@kernel.org,
uobergfe@redhat.com, chaiw.fnst@cn.fujitsu.com, cl@linu.com,
fabf@skynet.be, atomlin@redhat.com, benzh@chromium.org,
heiko.carstens@de.ibm.com
Subject: Re: [PATCH 2/2] powerpc: add running_clock for powerpc to prevent spurious softlockup warnings
Date: Fri, 09 Jan 2015 14:22:04 +1100
Message-ID: <1420773724.2801.17.camel@cyril>
In-Reply-To: <20150107112024.76aa9217@mschwide>

On Wed, 2015-01-07 at 11:20 +0100, Martin Schwidefsky wrote:
> On Tue, 06 Jan 2015 13:44:01 +1100
> Cyril Bur <cyrilbur@gmail.com> wrote:
>
> > On Mon, 2015-01-05 at 14:10 -0800, Andrew Morton wrote:
> > > On Mon, 22 Dec 2014 16:06:04 +1100 Cyril Bur <cyrilbur@gmail.com> wrote:
> > >
> > > > On POWER8 virtualised kernels the VTB register can be read to have a view of
> > > > time that only increases while the guest is running. This will prevent guests
> > > > from seeing time jump if a guest is paused for significant amounts of time.
> > > >
> > > > On POWER7 and below virtualised kernels stolen time is subtracted from
> > > > sched_clock as a best effort approximation. This will not eliminate spurious
> > > > > warnings in the case of a suspended guest but may reduce the occurrence in the
> > > > > case of softlockups due to host overcommit.
> > > >
> > > > Bare metal kernels should avoid reading the VTB as KVM does not restore sane
> > > > values when not executing. sched_clock is returned in this case.
> > > >
> > > > --- a/arch/powerpc/kernel/time.c
> > > > +++ b/arch/powerpc/kernel/time.c
> > > > @@ -621,6 +621,30 @@ unsigned long long sched_clock(void)
> > > > >         return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
> > > > }
> > > >
> > > > +unsigned long long running_clock(void)
> > >
> > > Non-kvm kernels don't need this code. Is there some appropriate
> > > "#ifdef CONFIG_foo" we can wrap this in?
> > CONFIG_PSERIES would work, having said that typical compilation for a
> > powernv kernel almost always includes CONFIG_PSERIES (although it
> > doesn't need to)... still, your point is valid, will add in v2.
> > >
> > >
> > > > +{
> > > > + /*
> > > > + * Don't read the VTB as a host since KVM does not switch in host timebase
> > > > + * into the VTB when it takes a guest off the CPU, reading the VTB would
> > > > + * result in reading 'last switched out' guest VTB.
> > > > + */
> > > > +
> > > > + if (firmware_has_feature(FW_FEATURE_LPAR)) {
> > > > + if (cpu_has_feature(CPU_FTR_ARCH_207S))
> > > > + return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
> > > > +
> > > > + /* This is a next best approximation without a VTB. */
> > > > + return sched_clock() - cputime_to_nsecs(kcpustat_this_cpu->cpustat[CPUTIME_STEAL]);
> > >
> > > Why is this result dependent on FW_FEATURE_LPAR? It's all generic code.
> > Good point, the reason it ended up there is because I wanted to avoid
> > behaviour changes.
> > >
> > > In fact the kernel/sched/clock.c default implementation of
> > > running_clock() could use this expression. Would that be good or bad? :)
> > For power I'm almost certain it would be fine, on platforms which don't
> > do stolen time cpustat[CPUTIME_STEAL] should always be zero and if not
> > then the value should always be sane (although as mentioned in the
> > comment, not as accurate as using the VTB).
> >
> > Putting it in the default implementation could cause behavioural changes
> > for x86 and s390, I would want their views on doing that.
>
> I would prefer to make sched_clock do all the work. We have been thinking
> about steal time vs sched_clock as well, our solution would be to exchange
> the time source. Right now sched_clock is based on the TOD clock, the code
> that takes steal time into account would use the CPU timer instead.
> With the subtraction of kcpustat_this_cpu->cpustat[CPUTIME_STEAL] in
> common code we would have to add the same value in the sched_clock
> implementation as the steal time is already included in the CPU timer
> deltas.

Thanks for the quick reply, Martin.

Sounds like you've got ideas, and while I didn't fully grasp all of that,
I gather we'd best leave the common code as is.
>