Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation

From: Alexander Graf <agraf@suse.de>
To: Bhushan Bharat-R65777 <R65777@freescale.com>
Cc: Wood Scott-B07421 <B07421@freescale.com>,
	"<kvm-ppc@vger.kernel.org>" <kvm-ppc@vger.kernel.org>,
	"<kvm@vger.kernel.org>" <kvm@vger.kernel.org>,
	"<bharatb.yadav@gmail.com>" <bharatb.yadav@gmail.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Kumar Gala <galak@kernel.crashing.org>,
	Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
Date: Tue, 17 Jul 2012 14:51:56 +0200	[thread overview]
Message-ID: <50055FEC.4020602@suse.de> (raw)
In-Reply-To: <6A3DF150A5B70D4F9B66A25E3F7C888D03DCB08D@039-SN2MPN1-023.039d.mgd.msft.net>

On 07/17/2012 11:57 AM, Bhushan Bharat-R65777 wrote:
>
>> -----Original Message-----
>> From: kvm-ppc-owner@vger.kernel.org [mailto:kvm-ppc-owner@vger.kernel.org] On
>> Behalf Of Alexander Graf
>> Sent: Tuesday, July 17, 2012 12:50 PM
>> To: Wood Scott-B07421
>> Cc: Bhushan Bharat-R65777; <kvm-ppc@vger.kernel.org>; <kvm@vger.kernel.org>;
>> <bharatb.yadav@gmail.com>; Bhushan Bharat-R65777; Benjamin Herrenschmidt; Kumar
>> Gala
>> Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
>>
>>
>>
>> On 17.07.2012, at 03:02, Scott Wood <scottwood@freescale.com> wrote:
>>
>>> On 07/16/2012 12:18 PM, Alexander Graf wrote:
>>>>> +/*
>>>>> + * Return the number of jiffies until the next timeout.  If the
>>>> timeout is
>>>>> + * longer than the NEXT_TIMER_MAX_DELTA, that
>>>> then?
>>>>
>>>>> return NEXT_TIMER_MAX_DELTA
>>>>> + * instead.
>>>> I can read code.
>>> Come on, it's not exactly x++; /* add one to x */
>>>
>>> It's faster to read code (as well as know the constraints within which
>>> you can modify it without having to spend a lot of time digesting all
>>> the callers' use cases) when you have a high level description of its
>>> interface contract, and can be selective about when to zoom in to the
>>> details.  Linux kernel code tends to be bad about this.
>> Yeah, not opposed to leave that part in :).
>>
>>>> The important piece of information in the comment is
>>>> missing: The reason.
>>> The reason for what?  Why you want to know the next timeout?  That's
>>> the caller's business.  Or why we use NEXT_TIMER_MAX_DELTA as the limit?
>> Why we use the limit. IIRC it was explained in the last thread, just didn't make
>> its way into the comment.
> Earlier we had a comment on the #define MAX_TIMEOUT (a new define added for a purpose, so the comment described the purpose).
> Now we use the generic #define NEXT_TIMER_MAX_DELTA (include/linux/timer.h), so I removed the comment.

Ah, ok. Just saying, if you comment on some mechanism, like you did 
here, please also include the reasoning behind it. For example

   Do foo if x is true.

isn't particularly helpful. However

   Do foo if x is true because the bar API will break with high values

is very helpful. It includes both the action and the reason for the code :).
Alternatively, this would be equivalent to me:

   /* bar API will break with high values */
   if (x)
     do(foo)

because in this case the code is the action description. Either variant 
works fine for me.
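
Applied to the timeout helper under discussion, a comment that carries both the action and the reason could look like the minimal sketch below. The names `MAX_TIMER_DELTA` and `next_timeout_ticks()` are hypothetical stand-ins, and the stated reason (the timer backend cannot be armed with larger deltas) is an assumption made for illustration, not the reasoning from the earlier thread.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's NEXT_TIMER_MAX_DELTA. */
#define MAX_TIMER_DELTA ((1UL << 30) - 1)

/* The clamp value is handed back as unsigned long, so it must fit. */
static_assert(MAX_TIMER_DELTA <= ULONG_MAX,
              "clamp value must fit the return type");

/*
 * Return the number of ticks from 'now' until 'expiry'.
 *
 * The result is clamped to MAX_TIMER_DELTA because (assumed here for
 * illustration) the timer backend cannot be armed with a larger delta.
 * A caller that hits the clamp simply re-arms the timer when it fires.
 */
unsigned long next_timeout_ticks(unsigned long long now,
                                 unsigned long long expiry)
{
    unsigned long long delta;

    if (expiry <= now)
        return 0; /* already expired: fire as soon as possible */

    delta = expiry - now;
    if (delta > MAX_TIMER_DELTA)
        return MAX_TIMER_DELTA;

    return (unsigned long)delta;
}
```

The comment states the interface contract first and the reason for the limit second, so a reader does not need to chase the callers to learn why the clamp exists.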

>
>>>>> +void kvmppc_watchdog_func(unsigned long data) {
>>>>> +    struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
>>>>> +    u32 tsr, new_tsr;
>>>>> +    int final;
>>>>> +
>>>>> +    do {
>>>>> +        new_tsr = tsr = vcpu->arch.tsr;
>>>>> +        final = 0;
>>>>> +
>>>>> +        /* Time out event */
>>>>> +        if (tsr & TSR_ENW) {
>>>>> +            if (tsr & TSR_WIS) {
>>>>> +                new_tsr = (tsr & ~TCR_WRC_MASK) |
>>>>> +                      (vcpu->arch.tcr & TCR_WRC_MASK);
>>>>> +                vcpu->arch.tcr &= ~TCR_WRC_MASK;
>>>> Can't we just poke the vcpu to exit the VM and do the above on its own?
>>> We've discussed this before.  TSR updates are done via atomics, and we
>>> send a request for the vcpu to act on the result.  This is how the
>>> decrementer works.
>>>
>>> http://www.spinics.net/lists/kvm-ppc/msg03169.html
>> Yeah, the major difference to the dec is the atomicity of the whole thing. Dec
>> changes one bit to enable the interrupt line. The final expiration is more
>> complex.
> Isn't setting TSR.WRS atomic here (cmpxchg() will handle this)?

Final expiration sets TCR. TSR should be ok.
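
To illustrate the split, here is a rough userspace model of a timeout handler that only ever modifies TSR, using a compare-and-exchange loop, and that reads TCR without writing it. It is a sketch under assumptions: `struct vcpu_model` and `watchdog_timeout()` are hypothetical, C11 atomics stand in for the kernel's cmpxchg(), and the bit values are assumed from the Book E register layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout assumed from the Book E TSR/TCR definitions. */
#define TSR_ENW      0x80000000u /* enable next watchdog */
#define TSR_WIS      0x40000000u /* watchdog interrupt status */
#define TSR_WRS_MASK 0x30000000u /* watchdog reset status */
#define TCR_WRC_MASK 0x30000000u /* watchdog reset control */

/* Hypothetical model of the relevant vcpu state. */
struct vcpu_model {
    _Atomic uint32_t tsr; /* may be updated outside vcpu context */
    uint32_t tcr;         /* written only by the vcpu thread */
};

/*
 * Advance the watchdog state by one timeout and return true on final
 * expiration. Only TSR is modified, and only via compare-and-exchange;
 * clearing TCR[WRC] is deliberately left to whoever handles the event.
 */
bool watchdog_timeout(struct vcpu_model *vcpu)
{
    uint32_t tsr, new_tsr, wrc;
    bool final;

    assert(vcpu != NULL);

    tsr = atomic_load(&vcpu->tsr);
    do {
        final = false;
        if (!(tsr & TSR_ENW)) {
            new_tsr = tsr | TSR_ENW;      /* first timeout */
        } else if (!(tsr & TSR_WIS)) {
            new_tsr = tsr | TSR_WIS;      /* second timeout */
        } else {
            wrc = vcpu->tcr & TCR_WRC_MASK;
            new_tsr = (tsr & ~TSR_WRS_MASK) | wrc;
            final = wrc != 0;             /* final expiration */
        }
        /* On failure 'tsr' is reloaded and the update is recomputed. */
    } while (!atomic_compare_exchange_weak(&vcpu->tsr, &tsr, new_tsr));

    return final;
}
```

Because the loop body has no side effects other than computing `new_tsr`, a retry after a lost race is harmless, which would not hold if the TCR write sat inside the loop.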

>
>>>> This is the watchdog expired case, right?
>>> Final expiration, yes.
>>>
>>>> I'd also prefer to have an
>>>> explicit event for the expiry than a special TSR check in the main loop.
>>> So check TSR[WRS] in update_timer_ints(), and have it queue a
>>> pseudoexception?
>> Or here.
> Do we mean defining a sudo IRQPRIO for final expiry?

We can also define an event that is sent through kvm_make_request. But 
yeah, IRQPRIO is probably easier. Not 100% sure which way is better 
though. Avi, any preferences?

>
>>> That would eliminate the need to change the runnable function.
>>>
>>>> Also call me sceptic on the reset of tcr. If our user space watchdog
>>>> event is "write a message", then we essentially want to hide the fact
>>>> that the watchdog expired from the guest, right? In that case, the
>>>> second time-out wouldn't do anything guest visible.
>>> This was probably copied straight out of the hardware documentation,
>>> which explicitly says TCR[WRC] gets set to zero on final expiration
>>> (as part of reset).  We should leave that part up to userspace.  It
>>> definitely shouldn't be done inside the cmpxchg loop (or from
>>> interrupt context -- only TSR gets the atomic treatment).  I don't
>>> think the read of TCR outside vcpu context is a problem, though.
>> Yeah, but it'd just make me less wary if only the vcpu thread itself accesses
>> vcpu internal registers that aren't irq state and thus designed for it (TSR).
>>
>> But yes, the most flexible way would probably be to do it from user space. Since
>> it'd happen from within the vcpu context of user space, we can also guarantee
>> that the TCR access is atomic.
> Yes, will move the tcr.wrc clearing to userspace.
>
>>>>> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) {
>>>>> -    return !(v->arch.shared->msr & MSR_WE) ||
>>>>> -           !!(v->arch.pending_exceptions) ||
>>>>> -           v->requests;
>>>>> +    bool ret = !(v->arch.shared->msr & MSR_WE) ||
>>>>> +           !!(v->arch.pending_exceptions) ||
>>>>> +           v->requests;
>>>>> +
>>>>> +    ret = ret || kvmppc_get_tsr_wrc(v);
>>>> Why do you need to declare the cpu as non-runnable when a watchdog
>>>> event occurred?
>>> It's the other way around -- it's always runnable when a watchdog exit
>>> is pending.  It's like a pending exception.
>> Ah, so yes, we should just shove it into pending_exceptions then.
> Pending_exception? You mean sudo again here as said earlier.

pseudo :). Yeah, I'm referring to above. No need to check 500 different 
conditions when we already have a bitmap that says "event is pending".
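
As a rough model of that idea: if final expiry is queued as one more bit in the pending bitmap, the runnable check needs no watchdog-specific term. Everything below is a hypothetical userspace sketch; the priority names, `struct vcpu_model`, `queue_irqprio()` and the `MSR_WE` value are assumptions, not the actual KVM definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_WE 0x00040000u /* wait state enable; value assumed */

/* Hypothetical priorities; the watchdog entry is the proposed addition. */
enum irqprio {
    PRIO_DECREMENTER,
    PRIO_WATCHDOG_FINAL,
    PRIO_MAX
};

/* Hypothetical model of the relevant vcpu state. */
struct vcpu_model {
    uint32_t msr;
    unsigned long pending_exceptions; /* one bit per irqprio */
    unsigned long requests;
};

/* Mark an event as pending by setting its bit in the bitmap. */
void queue_irqprio(struct vcpu_model *v, enum irqprio prio)
{
    assert(prio < PRIO_MAX);
    v->pending_exceptions |= 1UL << prio;
}

/*
 * Same shape as the existing check: the vcpu is runnable unless it is
 * in wait state with nothing pending. No watchdog special case needed.
 */
bool vcpu_runnable(const struct vcpu_model *v)
{
    return !(v->msr & MSR_WE) ||
           v->pending_exceptions != 0 ||
           v->requests != 0;
}
```

A vcpu in wait state becomes runnable as soon as `queue_irqprio(v, PRIO_WATCHDOG_FINAL)` is called, through the same path as any other pending exception.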


Alex
