Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Alexander Graf <agraf@suse.de>
To: Bhushan Bharat-R65777 <R65777@freescale.com>
Cc: Wood Scott-B07421 <B07421@freescale.com>,
	"<kvm-ppc@vger.kernel.org>" <kvm-ppc@vger.kernel.org>,
	"<kvm@vger.kernel.org>" <kvm@vger.kernel.org>,
	"<bharatb.yadav@gmail.com>" <bharatb.yadav@gmail.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Kumar Gala <galak@kernel.crashing.org>,
	Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
Date: Tue, 17 Jul 2012 12:51:56 +0000	[thread overview]
Message-ID: <50055FEC.4020602@suse.de> (raw)
In-Reply-To: <6A3DF150A5B70D4F9B66A25E3F7C888D03DCB08D@039-SN2MPN1-023.039d.mgd.msft.net>

On 07/17/2012 11:57 AM, Bhushan Bharat-R65777 wrote:
>
>> -----Original Message-----
>> From: kvm-ppc-owner@vger.kernel.org [mailto:kvm-ppc-owner@vger.kernel.org] On
>> Behalf Of Alexander Graf
>> Sent: Tuesday, July 17, 2012 12:50 PM
>> To: Wood Scott-B07421
>> Cc: Bhushan Bharat-R65777; <kvm-ppc@vger.kernel.org>; <kvm@vger.kernel.org>;
>> <bharatb.yadav@gmail.com>; Bhushan Bharat-R65777; Benjamin Herrenschmidt; Kumar
>> Gala
>> Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
>>
>>
>>
>> On 17.07.2012, at 03:02, Scott Wood <scottwood@freescale.com> wrote:
>>
>>> On 07/16/2012 12:18 PM, Alexander Graf wrote:
>>>>> +/*
>>>>> + * Return the number of jiffies until the next timeout.  If the
>>>> timeout is
>>>>> + * longer than the NEXT_TIMER_MAX_DELTA, that
>>>> then?
>>>>
>>>>> return NEXT_TIMER_MAX_DELTA
>>>>> + * instead.
>>>> I can read code.
>>> Come on, it's not exactly x++; /* add one to x */
>>>
>>> It's faster to read code (as well as know the constraints within which
>>> you can modify it without having to spend a lot of time digesting all
>>> the callers' use cases) when you have a high level description of its
>>> interface contract, and can be selective about when to zoom in to the
>>> details.  Linux kernel code tends to be bad about this.
>> Yeah, not opposed to leave that part in :).
>>
>>>> The important piece of information in the comment is
>>>> missing: The reason.
>>> The reason for what?  Why you want to know the next timeout?  That's
>>> the caller's business.  Or why we use NEXT_TIMER_MAX_DELTA as the limit?
>> Why we use the limit. IIRC it was explained in the last thread, just didn't make
>> its way into the comment.
> Earlier we have a comment on the #define MAX_TIMEOUT (new define added for a purpose, so the comment described the puspose).
> Now we uses the generic #define NEX_TIMER_MAX_DELTA (include/linux/timer.h), so removed the comment.

Ah, ok. Just saying, if you comment on some mechanism, like you did 
here, please also include the reasoning behind it. For example

   Do foo if x is true.

isn't particularly helpful. However

   Do foo if x is true because the bar API will break with high values

is very helpful. It includes the action and reason of the code :). 
Alternatively, to me the same as above would be

   /* bar API will break with high values */
   if (x)
     do(foo)

because in this case the code is the action description. Either variant 
works fine for me.

>
>>>>> +void kvmppc_watchdog_func(unsigned long data) {
>>>>> +    struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
>>>>> +    u32 tsr, new_tsr;
>>>>> +    int final;
>>>>> +
>>>>> +    do {
>>>>> +        new_tsr = tsr = vcpu->arch.tsr;
>>>>> +        final = 0;
>>>>> +
>>>>> +        /* Time out event */
>>>>> +        if (tsr & TSR_ENW) {
>>>>> +            if (tsr & TSR_WIS) {
>>>>> +                new_tsr = (tsr & ~TCR_WRC_MASK) |
>>>>> +                      (vcpu->arch.tcr & TCR_WRC_MASK);
>>>>> +                vcpu->arch.tcr &= ~TCR_WRC_MASK;
>>>> Can't we just poke the vcpu to exit the VM and do the above on its own?
>>> We've discussed this before.  TSR updates are done via atomics, and we
>>> send a request for the vcpu to act on the result.  This is how the
>>> decrementer works.
>>>
>>> http://www.spinics.net/lists/kvm-ppc/msg03169.html
>> Yeah, the major difference to the dec is the atomicity of the whole thing. Dec
>> changes one bit to enable the interrupt line. The final expiration is more
>> complex.
> Is not setting the TSR.WRS atomic here (cmpxchg() will handle this)?

Final expiration sets TCR. TSR should be ok.

>
>>>> This is the watdog expired case, right?
>>> Final expiration, yes.
>>>
>>>> I'd also prefer to have an
>>>> explicit event for the expiry than a special TSR check in the main loop.
>>> So check TSR[WRS] in update_timer_ints(), and have it queue a
>>> pseudoexception?
>> Or here.
> Do we mean define a sudo IROPRIO for final expiry.

We can also define an event that is sent through kvm_make_request. But 
yeah, IRQPRIO is probably easier. Not 100% sure which way is better 
though. Avi, any preferences?

>
>>> That would eliminate the need to change the runnable function.
>>>
>>>> Also call me sceptic on the reset of tcr. If our user space watchdog
>>>> event is "write a message", then we essentially want to hide the fact
>>>> that the watchdog expired from the guest, right? In that case, the
>>>> second time-out wouldn't do anything guest visible.
>>> This was probably copied straight out of the hardware documentation,
>>> which explicitly says TCR[WRC] gets set to zero on final expiration
>>> (as part of reset).  We should leave that part up to userspace.  It
>>> definitely shouldn't be done inside the cmpxchg loop (or from
>>> interrupt context -- only TSR gets the atomic treatment).  I don't
>>> think the read of TCR outside vcpu context is a problem, though.
>> Yeah, but it'd just make me less wary if only the vcpu thread itself accesses
>> vcpu internal registers that aren't irq state and thus designed for it (TSR).
>>
>> But yes, the most flexible way would probably be to do it from user space. Since
>> it'd happen from within the vcpu context of user space, we can also guarantee
>> that the TCR access is atomic.
> Yes, will move the tcr.wrc clearing to userspace.
>
>>>>> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) {
>>>>> -    return !(v->arch.shared->msr & MSR_WE) ||
>>>>> -           !!(v->arch.pending_exceptions) ||
>>>>> -           v->requests;
>>>>> +    bool ret = !(v->arch.shared->msr & MSR_WE) ||
>>>>> +           !!(v->arch.pending_exceptions) ||
>>>>> +           v->requests;
>>>>> +
>>>>> +    ret = ret || kvmppc_get_tsr_wrc(v);
>>>> Why do you need to declare the cpu as non-runnable when a watchdog
>>>> event occured?
>>> It's the other way around -- it's always runnable when a watchdog exit
>>> is pending.  It's like a pending exception.
>> Ah, so yes, we should just shove it into pending_exceptions then.
> Pending_exception? You mean sudo again here as said earlier.

pseudo :). Yeah, I'm referring to above. No need to check 500 different 
conditions when we already have a bitmap that says "event is pending".


Alex

WARNING: multiple messages have this Message-ID (diff)

From: Alexander Graf <agraf@suse.de>
To: Bhushan Bharat-R65777 <R65777@freescale.com>
Cc: Wood Scott-B07421 <B07421@freescale.com>,
	"<kvm-ppc@vger.kernel.org>" <kvm-ppc@vger.kernel.org>,
	"<kvm@vger.kernel.org>" <kvm@vger.kernel.org>,
	"<bharatb.yadav@gmail.com>" <bharatb.yadav@gmail.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Kumar Gala <galak@kernel.crashing.org>,
	Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
Date: Tue, 17 Jul 2012 14:51:56 +0200	[thread overview]
Message-ID: <50055FEC.4020602@suse.de> (raw)
In-Reply-To: <6A3DF150A5B70D4F9B66A25E3F7C888D03DCB08D@039-SN2MPN1-023.039d.mgd.msft.net>

On 07/17/2012 11:57 AM, Bhushan Bharat-R65777 wrote:
>
>> -----Original Message-----
>> From: kvm-ppc-owner@vger.kernel.org [mailto:kvm-ppc-owner@vger.kernel.org] On
>> Behalf Of Alexander Graf
>> Sent: Tuesday, July 17, 2012 12:50 PM
>> To: Wood Scott-B07421
>> Cc: Bhushan Bharat-R65777; <kvm-ppc@vger.kernel.org>; <kvm@vger.kernel.org>;
>> <bharatb.yadav@gmail.com>; Bhushan Bharat-R65777; Benjamin Herrenschmidt; Kumar
>> Gala
>> Subject: Re: [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation
>>
>>
>>
>> On 17.07.2012, at 03:02, Scott Wood <scottwood@freescale.com> wrote:
>>
>>> On 07/16/2012 12:18 PM, Alexander Graf wrote:
>>>>> +/*
>>>>> + * Return the number of jiffies until the next timeout.  If the
>>>> timeout is
>>>>> + * longer than the NEXT_TIMER_MAX_DELTA, that
>>>> then?
>>>>
>>>>> return NEXT_TIMER_MAX_DELTA
>>>>> + * instead.
>>>> I can read code.
>>> Come on, it's not exactly x++; /* add one to x */
>>>
>>> It's faster to read code (as well as know the constraints within which
>>> you can modify it without having to spend a lot of time digesting all
>>> the callers' use cases) when you have a high level description of its
>>> interface contract, and can be selective about when to zoom in to the
>>> details.  Linux kernel code tends to be bad about this.
>> Yeah, not opposed to leave that part in :).
>>
>>>> The important piece of information in the comment is
>>>> missing: The reason.
>>> The reason for what?  Why you want to know the next timeout?  That's
>>> the caller's business.  Or why we use NEXT_TIMER_MAX_DELTA as the limit?
>> Why we use the limit. IIRC it was explained in the last thread, just didn't make
>> its way into the comment.
> Earlier we have a comment on the #define MAX_TIMEOUT (new define added for a purpose, so the comment described the puspose).
> Now we uses the generic #define NEX_TIMER_MAX_DELTA (include/linux/timer.h), so removed the comment.

Ah, ok. Just saying, if you comment on some mechanism, like you did 
here, please also include the reasoning behind it. For example

   Do foo if x is true.

isn't particularly helpful. However

   Do foo if x is true because the bar API will break with high values

is very helpful. It includes the action and reason of the code :). 
Alternatively, to me the same as above would be

   /* bar API will break with high values */
   if (x)
     do(foo)

because in this case the code is the action description. Either variant 
works fine for me.

>
>>>>> +void kvmppc_watchdog_func(unsigned long data) {
>>>>> +    struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
>>>>> +    u32 tsr, new_tsr;
>>>>> +    int final;
>>>>> +
>>>>> +    do {
>>>>> +        new_tsr = tsr = vcpu->arch.tsr;
>>>>> +        final = 0;
>>>>> +
>>>>> +        /* Time out event */
>>>>> +        if (tsr & TSR_ENW) {
>>>>> +            if (tsr & TSR_WIS) {
>>>>> +                new_tsr = (tsr & ~TCR_WRC_MASK) |
>>>>> +                      (vcpu->arch.tcr & TCR_WRC_MASK);
>>>>> +                vcpu->arch.tcr &= ~TCR_WRC_MASK;
>>>> Can't we just poke the vcpu to exit the VM and do the above on its own?
>>> We've discussed this before.  TSR updates are done via atomics, and we
>>> send a request for the vcpu to act on the result.  This is how the
>>> decrementer works.
>>>
>>> http://www.spinics.net/lists/kvm-ppc/msg03169.html
>> Yeah, the major difference to the dec is the atomicity of the whole thing. Dec
>> changes one bit to enable the interrupt line. The final expiration is more
>> complex.
> Is not setting the TSR.WRS atomic here (cmpxchg() will handle this)?

Final expiration sets TCR. TSR should be ok.

>
>>>> This is the watdog expired case, right?
>>> Final expiration, yes.
>>>
>>>> I'd also prefer to have an
>>>> explicit event for the expiry than a special TSR check in the main loop.
>>> So check TSR[WRS] in update_timer_ints(), and have it queue a
>>> pseudoexception?
>> Or here.
> Do we mean define a sudo IROPRIO for final expiry.

We can also define an event that is sent through kvm_make_request. But 
yeah, IRQPRIO is probably easier. Not 100% sure which way is better 
though. Avi, any preferences?

>
>>> That would eliminate the need to change the runnable function.
>>>
>>>> Also call me sceptic on the reset of tcr. If our user space watchdog
>>>> event is "write a message", then we essentially want to hide the fact
>>>> that the watchdog expired from the guest, right? In that case, the
>>>> second time-out wouldn't do anything guest visible.
>>> This was probably copied straight out of the hardware documentation,
>>> which explicitly says TCR[WRC] gets set to zero on final expiration
>>> (as part of reset).  We should leave that part up to userspace.  It
>>> definitely shouldn't be done inside the cmpxchg loop (or from
>>> interrupt context -- only TSR gets the atomic treatment).  I don't
>>> think the read of TCR outside vcpu context is a problem, though.
>> Yeah, but it'd just make me less wary if only the vcpu thread itself accesses
>> vcpu internal registers that aren't irq state and thus designed for it (TSR).
>>
>> But yes, the most flexible way would probably be to do it from user space. Since
>> it'd happen from within the vcpu context of user space, we can also guarantee
>> that the TCR access is atomic.
> Yes, will move the tcr.wrc clearing to userspace.
>
>>>>> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) {
>>>>> -    return !(v->arch.shared->msr & MSR_WE) ||
>>>>> -           !!(v->arch.pending_exceptions) ||
>>>>> -           v->requests;
>>>>> +    bool ret = !(v->arch.shared->msr & MSR_WE) ||
>>>>> +           !!(v->arch.pending_exceptions) ||
>>>>> +           v->requests;
>>>>> +
>>>>> +    ret = ret || kvmppc_get_tsr_wrc(v);
>>>> Why do you need to declare the cpu as non-runnable when a watchdog
>>>> event occured?
>>> It's the other way around -- it's always runnable when a watchdog exit
>>> is pending.  It's like a pending exception.
>> Ah, so yes, we should just shove it into pending_exceptions then.
> Pending_exception? You mean sudo again here as said earlier.

pseudo :). Yeah, I'm referring to above. No need to check 500 different 
conditions when we already have a bitmap that says "event is pending".


Alex

next prev parent reply	other threads:[~2012-07-17 12:51 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-09 10:34 [PATCH 2/2 v2] KVM: PPC: booke: Add watchdog emulation Bharat Bhushan
2012-07-09 10:46 ` Bharat Bhushan
2012-07-16 17:18 ` Alexander Graf
2012-07-16 17:18   ` Alexander Graf
2012-07-17  1:02   ` Scott Wood
2012-07-17  1:02     ` Scott Wood
2012-07-17  7:20     ` Alexander Graf
2012-07-17  7:20       ` Alexander Graf
2012-07-17  9:57       ` Bhushan Bharat-R65777
2012-07-17 12:51         ` Alexander Graf [this message]
2012-07-17 12:51           ` Alexander Graf
2012-07-17 13:15           ` Bhushan Bharat-R65777
2012-07-17 14:01             ` Alexander Graf
2012-07-17 14:01               ` Alexander Graf
2012-07-17 14:13               ` Bhushan Bharat-R65777
2012-07-17 14:35                 ` Alexander Graf
2012-07-17 14:35                   ` Alexander Graf
2012-07-17 16:10                   ` Bhushan Bharat-R65777
2012-07-17 16:27                   ` Scott Wood
2012-07-17 16:27                     ` Scott Wood
2012-07-17 16:51                     ` Alexander Graf
2012-07-17 16:51                       ` Alexander Graf
2012-07-17 18:00                       ` Scott Wood
2012-07-17 18:00                         ` Scott Wood
2012-07-17 11:31     ` Bhushan Bharat-R65777
2012-07-17 11:31       ` Bhushan Bharat-R65777
2012-07-17 16:37       ` Scott Wood
2012-07-17 16:37         ` Scott Wood
2012-07-17 16:56         ` Bhushan Bharat-R65777
2012-07-17 16:56           ` Bhushan Bharat-R65777
2012-07-17 17:00           ` Scott Wood
2012-07-17 17:00             ` Scott Wood
2012-07-17 17:10             ` Bhushan Bharat-R65777
2012-07-17 17:10               ` Bhushan Bharat-R65777
2012-07-17 17:25               ` Scott Wood
2012-07-17 17:25                 ` Scott Wood
2012-07-17 17:29                 ` Bhushan Bharat-R65777
2012-07-17 17:29                   ` Bhushan Bharat-R65777

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50055FEC.4020602@suse.de \
    --to=agraf@suse.de \
    --cc=B07421@freescale.com \
    --cc=R65777@freescale.com \
    --cc=avi@redhat.com \
    --cc=benh@kernel.crashing.org \
    --cc=bharatb.yadav@gmail.com \
    --cc=galak@kernel.crashing.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.