From: Scott Wood <scottwood@freescale.com>
To: Alexander Graf <agraf@suse.de>
Cc: Bharat Bhushan <r65777@freescale.com>,
kvm-ppc@vger.kernel.org, kvm@vger.kernel.org,
Bharat Bhushan <Bharat.Bhushan@freescale.com>
Subject: Re: [PATCH 2/2] KVM: PPC: booke: Add watchdog emulation
Date: Fri, 06 Jul 2012 23:37:55 +0000 [thread overview]
Message-ID: <4FF776D3.1070701@freescale.com> (raw)
In-Reply-To: <19D972D8-56CB-4C79-9430-E1AFBA39D01F@suse.de>
On 07/06/2012 08:17 AM, Alexander Graf wrote:
> On 28.06.2012, at 08:17, Bharat Bhushan wrote:
>> +/*
>> + * The timer system can almost deal with LONG_MAX timeouts, except that
>> + * when you get very close to LONG_MAX, the slack added can cause overflow.
>> + *
>> + * LONG_MAX/2 is a conservative threshold, but it should be adequate for
>> + * any realistic use.
>> + */
>> +#define MAX_TIMEOUT (LONG_MAX/2)
>
> Should this really be in kvm code?
It looks like we can use NEXT_TIMER_MAX_DELTA for this.
>> + mask = 1ULL << (63 - period);
>> + tb = get_tb();
>> + if (tb & mask)
>> + nr_jiffies += mask;
>
> To be honest, you lost me here. nr_jiffies is jiffies, right? While
> mask is basically in timebase granularity. I suppose you're just
> reusing the variable here for no good reason - the compiler will
> gladly optimize things for you if you write things a bit more verbose
> :).
Probably due to the way do_div() works, but yeah, it's confusing. Maybe
something generic like "ticks", "interval", "remaining", etc. would be
better, with a comment on the do_div saying it's converting timebase
ticks into jiffies.
>> +static void arm_next_watchdog(struct kvm_vcpu *vcpu)
>> +{
>> + unsigned long nr_jiffies;
>> +
>> + nr_jiffies = watchdog_next_timeout(vcpu);
>> + if (nr_jiffies < MAX_TIMEOUT)
>> + mod_timer(&vcpu->arch.wdt_timer, jiffies + nr_jiffies);
>> + else
>> + del_timer(&vcpu->arch.wdt_timer);
>
> Can you del a timer that's not armed? Could that ever happen in this case?
"del_timer() deactivates a timer - this works on both active and
inactive timers."
> Also, could the timer possibly be running somewhere, so do we need del_timer_sync? Or don't we need to care?
This can be called in the context of the timer, so del_timer_sync()
would hang.
As for what would happen if a caller from a different context were to
race with a timer, I think you could end up with the timer armed based
on an old TCR. del_timer_sync() won't help though, unless you replace
mod_timer() with del_timer_sync() plus add_timer() (with a check to see
whether it's running in timer context). A better solution is probably
to use a spinlock in arm_next_watchdog().
>> +void kvmppc_watchdog_func(unsigned long data)
>> +{
>> + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
>> + u32 tsr, new_tsr;
>> + int final;
>> +
>> + do {
>> + new_tsr = tsr = vcpu->arch.tsr;
>> + final = 0;
>> +
>> + /* Time out event */
>> + if (tsr & TSR_ENW) {
>> + if (tsr & TSR_WIS) {
>> + new_tsr = (tsr & ~TCR_WRC_MASK) |
>> + (vcpu->arch.tcr & TCR_WRC_MASK);
>> + vcpu->arch.tcr &= ~TCR_WRC_MASK;
>> + final = 1;
>> + } else {
>> + new_tsr = tsr | TSR_WIS;
>> + }
>> + } else {
>> + new_tsr = tsr | TSR_ENW;
>> + }
>> + } while (cmpxchg(&vcpu->arch.tsr, tsr, new_tsr) != tsr);
>> +
>> + if (new_tsr & (TSR_WIS | TCR_WRC_MASK)) {
>> + smp_wmb();
>> + kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
>> + kvm_vcpu_kick(vcpu);
>> + }
>> +
>> + /*
>> + * Avoid getting a storm of timers if the guest sets
>> + * the period very short. We'll restart it if anything
>> + * changes.
>> + */
>> + if (!final)
>> + arm_next_watchdog(vcpu);
>
> Mind to explain this part a bit further?
The whole function, or some subset near the end?
The "if (!final)" check means that we stop running the timer after final
expiration, to prevent the host from being flooded with timers if the
guest sets a short period but does not have TCR set to exit to QEMU.
Timers will resume the next time TSR/TCR is updated.
>> @@ -1106,7 +1213,14 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
>> }
>>
>> if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR) {
>> + u32 old_tsr = vcpu->arch.tsr;
>> +
>> vcpu->arch.tsr = sregs->u.e.tsr;
>> +
>> + if ((old_tsr ^ vcpu->arch.tsr) &
>> + (TSR_ENW | TSR_WIS | TCR_WRC_MASK))
>> + arm_next_watchdog(vcpu);
>
> Why isn't this one guarded by vcpu->arch.watchdog_enable?
I'm not sure that any of them should be -- there's no reason for the
watchdog interrupt mechanism to be dependent on QEMU, only the
heavyweight exit on final expiration.
>> + spr_val &= ~TCR_WRC_MASK;
>> + kvmppc_set_tcr(vcpu,
>> + spr_val | (TCR_WRC_MASK & vcpu->arch.tcr));
>
> In fact, what you're trying to do here is keep TCR_WRC always on when it was enabled once. So all you need is the OR here. No need for the mask above.
WRC is a 2-bit field that is supposed to preserve its value once written
to be non-zero. Not that we actually do anything different based on the
specific non-zero value, but still we should implement the architected
semantics.
-Scott
WARNING: multiple messages have this Message-ID (diff)
From: Scott Wood <scottwood@freescale.com>
To: Alexander Graf <agraf@suse.de>
Cc: Bharat Bhushan <r65777@freescale.com>, <kvm-ppc@vger.kernel.org>,
<kvm@vger.kernel.org>,
Bharat Bhushan <Bharat.Bhushan@freescale.com>
Subject: Re: [PATCH 2/2] KVM: PPC: booke: Add watchdog emulation
Date: Fri, 6 Jul 2012 18:37:55 -0500 [thread overview]
Message-ID: <4FF776D3.1070701@freescale.com> (raw)
In-Reply-To: <19D972D8-56CB-4C79-9430-E1AFBA39D01F@suse.de>
On 07/06/2012 08:17 AM, Alexander Graf wrote:
> On 28.06.2012, at 08:17, Bharat Bhushan wrote:
>> +/*
>> + * The timer system can almost deal with LONG_MAX timeouts, except that
>> + * when you get very close to LONG_MAX, the slack added can cause overflow.
>> + *
>> + * LONG_MAX/2 is a conservative threshold, but it should be adequate for
>> + * any realistic use.
>> + */
>> +#define MAX_TIMEOUT (LONG_MAX/2)
>
> Should this really be in kvm code?
It looks like we can use NEXT_TIMER_MAX_DELTA for this.
>> + mask = 1ULL << (63 - period);
>> + tb = get_tb();
>> + if (tb & mask)
>> + nr_jiffies += mask;
>
> To be honest, you lost me here. nr_jiffies is jiffies, right? While
> mask is basically in timebase granularity. I suppose you're just
> reusing the variable here for no good reason - the compiler will
> gladly optimize things for you if you write things a bit more verbose
> :).
Probably due to the way do_div() works, but yeah, it's confusing. Maybe
something generic like "ticks", "interval", "remaining", etc. would be
better, with a comment on the do_div saying it's converting timebase
ticks into jiffies.
>> +static void arm_next_watchdog(struct kvm_vcpu *vcpu)
>> +{
>> + unsigned long nr_jiffies;
>> +
>> + nr_jiffies = watchdog_next_timeout(vcpu);
>> + if (nr_jiffies < MAX_TIMEOUT)
>> + mod_timer(&vcpu->arch.wdt_timer, jiffies + nr_jiffies);
>> + else
>> + del_timer(&vcpu->arch.wdt_timer);
>
> Can you del a timer that's not armed? Could that ever happen in this case?
"del_timer() deactivates a timer - this works on both active and
inactive timers."
> Also, could the timer possibly be running somewhere, so do we need del_timer_sync? Or don't we need to care?
This can be called in the context of the timer, so del_timer_sync()
would hang.
As for what would happen if a caller from a different context were to
race with a timer, I think you could end up with the timer armed based
on an old TCR. del_timer_sync() won't help though, unless you replace
mod_timer() with del_timer_sync() plus add_timer() (with a check to see
whether it's running in timer context). A better solution is probably
to use a spinlock in arm_next_watchdog().
>> +void kvmppc_watchdog_func(unsigned long data)
>> +{
>> + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
>> + u32 tsr, new_tsr;
>> + int final;
>> +
>> + do {
>> + new_tsr = tsr = vcpu->arch.tsr;
>> + final = 0;
>> +
>> + /* Time out event */
>> + if (tsr & TSR_ENW) {
>> + if (tsr & TSR_WIS) {
>> + new_tsr = (tsr & ~TCR_WRC_MASK) |
>> + (vcpu->arch.tcr & TCR_WRC_MASK);
>> + vcpu->arch.tcr &= ~TCR_WRC_MASK;
>> + final = 1;
>> + } else {
>> + new_tsr = tsr | TSR_WIS;
>> + }
>> + } else {
>> + new_tsr = tsr | TSR_ENW;
>> + }
>> + } while (cmpxchg(&vcpu->arch.tsr, tsr, new_tsr) != tsr);
>> +
>> + if (new_tsr & (TSR_WIS | TCR_WRC_MASK)) {
>> + smp_wmb();
>> + kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
>> + kvm_vcpu_kick(vcpu);
>> + }
>> +
>> + /*
>> + * Avoid getting a storm of timers if the guest sets
>> + * the period very short. We'll restart it if anything
>> + * changes.
>> + */
>> + if (!final)
>> + arm_next_watchdog(vcpu);
>
> Mind to explain this part a bit further?
The whole function, or some subset near the end?
The "if (!final)" check means that we stop running the timer after final
expiration, to prevent the host from being flooded with timers if the
guest sets a short period but does not have TCR set to exit to QEMU.
Timers will resume the next time TSR/TCR is updated.
>> @@ -1106,7 +1213,14 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
>> }
>>
>> if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR) {
>> + u32 old_tsr = vcpu->arch.tsr;
>> +
>> vcpu->arch.tsr = sregs->u.e.tsr;
>> +
>> + if ((old_tsr ^ vcpu->arch.tsr) &
>> + (TSR_ENW | TSR_WIS | TCR_WRC_MASK))
>> + arm_next_watchdog(vcpu);
>
> Why isn't this one guarded by vcpu->arch.watchdog_enable?
I'm not sure that any of them should be -- there's no reason for the
watchdog interrupt mechanism to be dependent on QEMU, only the
heavyweight exit on final expiration.
>> + spr_val &= ~TCR_WRC_MASK;
>> + kvmppc_set_tcr(vcpu,
>> + spr_val | (TCR_WRC_MASK & vcpu->arch.tcr));
>
> In fact, what you're trying to do here is keep TCR_WRC always on when it was enabled once. So all you need is the OR here. No need for the mask above.
WRC is a 2-bit field that is supposed to preserve its value once written
to be non-zero. Not that we actually do anything different based on the
specific non-zero value, but still we should implement the architected
semantics.
-Scott
next prev parent reply other threads:[~2012-07-06 23:37 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-28 6:17 [PATCH 2/2] KVM: PPC: booke: Add watchdog emulation Bharat Bhushan
2012-06-28 6:29 ` Bharat Bhushan
2012-07-06 13:17 ` Alexander Graf
2012-07-06 13:17 ` Alexander Graf
2012-07-06 23:37 ` Scott Wood [this message]
2012-07-06 23:37 ` Scott Wood
2012-07-07 7:50 ` Alexander Graf
2012-07-07 7:50 ` Alexander Graf
2012-07-09 5:13 ` Bhushan Bharat-R65777
2012-07-09 5:13 ` Bhushan Bharat-R65777
2012-07-09 8:49 ` Alexander Graf
2012-07-09 8:49 ` Alexander Graf
2012-07-09 14:28 ` Scott Wood
2012-07-09 14:28 ` Scott Wood
2012-07-09 14:36 ` Alexander Graf
2012-07-09 14:36 ` Alexander Graf
2012-07-09 14:44 ` Bhushan Bharat-R65777
2012-07-09 14:44 ` Bhushan Bharat-R65777
2012-07-09 17:09 ` Scott Wood
2012-07-09 17:09 ` Scott Wood
2012-07-09 17:29 ` Alexander Graf
2012-07-09 17:29 ` Alexander Graf
2012-07-09 6:43 ` Bhushan Bharat-R65777
2012-07-09 6:43 ` Bhushan Bharat-R65777
2012-07-09 9:11 ` Alexander Graf
2012-07-09 9:11 ` Alexander Graf
-- strict thread matches above, loose matches on Subject: below --
2012-07-20 4:59 Bharat Bhushan
2012-07-20 5:11 ` Bharat Bhushan
2012-07-20 4:59 ` Bhushan Bharat-R65777
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FF776D3.1070701@freescale.com \
--to=scottwood@freescale.com \
--cc=Bharat.Bhushan@freescale.com \
--cc=agraf@suse.de \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=r65777@freescale.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.