From: Leonardo Bras <leobras.c@gmail.com>
To: Nicholas Piggin <npiggin@gmail.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Athira Rajeev <atrajeev@linux.vnet.ibm.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Frederic Weisbecker <frederic@kernel.org>,
Geert Uytterhoeven <geert+renesas@glider.be>,
Jordan Niethe <jniethe5@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Paul Mackerras <paulus@ozlabs.org>,
Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
kvm-ppc@vger.kernel.org
Subject: Re: [PATCH v2 1/1] powerpc/kvm: Save Timebase Offset to fix sched_clock() while running guest code.
Date: Mon, 08 Feb 2021 13:37:04 -0300 [thread overview]
Message-ID: <5f267a8aec5b8199a580c96ab2b1a3c27de4eb09.camel@gmail.com> (raw)
In-Reply-To: <1612579579.ztbklit4un.astroid@bobo.none>
Hello Nick,
On Sat, 2021-02-06 at 13:03 +1000, Nicholas Piggin wrote:
> Excerpts from Leonardo Bras's message of February 5, 2021 5:01 pm:
> > Hey Nick, thanks for reviewing :)
> >
> > On Fri, 2021-02-05 at 16:28 +1000, Nicholas Piggin wrote:
> > > Excerpts from Leonardo Bras's message of February 5, 2021 4:06 pm:
> > > > Before guest entry, TBU40 register is changed to reflect guest timebase.
> > > > After exitting guest, the register is reverted to it's original value.
> > > >
> > > > If one tries to get the timestamp from host between those changes, it
> > > > will present an incorrect value.
> > > >
> > > > An example would be trying to add a tracepoint in
> > > > kvmppc_guest_entry_inject_int(), which depending on last tracepoint
> > > > acquired could actually cause the host to crash.
> > > >
> > > > Save the Timebase Offset to PACA and use it on sched_clock() to always
> > > > get the correct timestamp.
> > >
> > > Ouch. Not sure how reasonable it is to half switch into guest registers
> > > and expect to call into the wider kernel, fixing things up as we go.
> > > What if mftb is used in other places?
> >
> > IIUC, the CPU is not supposed to call anything as host between guest
> > entry and guest exit, except guest-related cases, like
>
> When I say "call", I'm including tracing in that. If a function is not
> marked as no trace, then it will call into the tracing subsystem.
>
> > kvmppc_guest_entry_inject_int(), but anyway, if something calls mftb it
> > will still get the same value as before.
>
> Right, so it'll be out of whack again.
>
> > This is only supposed to change stuff that depends on sched_clock, like
> > Tracepoints, that can happen in those exceptions.
>
> If they depend on sched_clock that's one thing. Do they definitely have
> no dependencies on mftb from other calls?
We could change that on get_tb() or mftb() @ timebase.h, which would
have a broader reach, but would not reach any mftb from asm code.
> > > Especially as it doesn't seem like there is a reason that function _has_
> > > to be called after the timebase is switched to guest, that's just how
> > > the code is structured.
> >
> > Correct, but if called, like in rb routines, used by tracepoints, the
> > difference between last tb and current (lower) tb may cause the CPU to
> > trap PROGRAM exception, crashing host.
>
> Yes, so I agree with Michael any function that is involved when we begin
> to switch into guest context (or have not completed switching back to
> host going the other way) should be marked as no trace (noinstr even,
> perhaps).
Sure, that would avoid having to get paca->tb_offset for every mftb()
called, and avoid inconsistencies when different ways to get time are
used in code.
On the other hand, it would make it very hard to debug functions like
kvmppc_guest_entry_inject_int() as I am doing right now.
>
> > > As a local hack to work out a bug okay. If you really need it upstream
> > > could you put it under a debug config option?
> >
> > You mean something that is automatically selected whenever those
> > configs are enabled?
> >
> > CONFIG_TRACEPOINT && CONFIG_KVM_BOOK3S_HANDLER && CONFIG_PPC_BOOK3S_64
> >
> > Or something the user need to select himself in menuconfig?
>
> Yeah I meant a default n thing under powerpc kernel debugging somewhere.
So, IIUC all we can do is split this in 2 changes:
1 - Adding notrace to those functions
2 - Introducing a kernel debug config that reverts (1) and 'fixes' mftb
If that's correct, I have some ideas we can use.
For debug option, should we add the offset on get_tb() or mftb()?
Another option would be to adding this tb_offset only in the routines
used by tracing. But this could probably mean having to add a function
in arch-generic code, but still an option.
What do you think?
>
> Thanks,
> Nick
Thank you!
Leonardo Bras
next prev parent reply other threads:[~2021-02-08 16:39 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-05 6:06 [PATCH v2 1/1] powerpc/kvm: Save Timebase Offset to fix sched_clock() while running guest code Leonardo Bras
2021-02-05 6:28 ` Nicholas Piggin
2021-02-05 7:01 ` Leonardo Bras
2021-02-06 3:03 ` Nicholas Piggin
2021-02-08 16:37 ` Leonardo Bras [this message]
2021-02-05 23:32 ` Michael Ellerman
2021-02-05 13:09 ` Fabiano Rosas
2021-02-05 20:35 ` Leonardo Bras
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5f267a8aec5b8199a580c96ab2b1a3c27de4eb09.camel@gmail.com \
--to=leobras.c@gmail.com \
--cc=aneesh.kumar@linux.ibm.com \
--cc=atrajeev@linux.vnet.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=christophe.leroy@csgroup.eu \
--cc=frederic@kernel.org \
--cc=geert+renesas@glider.be \
--cc=jniethe5@gmail.com \
--cc=kvm-ppc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=npiggin@gmail.com \
--cc=paulus@ozlabs.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).