From: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>
To: Paul Mackerras <paulus@ozlabs.org>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>,
Michael Ellerman <mpe@ellerman.id.au>,
KVM-PPC <kvm-ppc@vger.kernel.org>, KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH 2/3] powerpc/book3s: Fix TB corruption in guest exit path on HMI interrupt.
Date: Sun, 15 May 2016 04:33:03 +0000 [thread overview]
Message-ID: <5737F92F.6090202@linux.vnet.ibm.com> (raw)
In-Reply-To: <20160318045132.GB1393@fergus.ozlabs.ibm.com>
On 03/18/2016 10:21 AM, Paul Mackerras wrote:
> On Thu, Jan 14, 2016 at 08:45:04AM +0530, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> When a guest is assigned to a core it converts the host Timebase (TB)
>> into guest TB by adding guest timebase offset before entering into
>> guest. During guest exit it restores the guest TB to host TB. This means
>> under certain conditions (Guest migration) host TB and guest TB can differ.
>>
>> When we get an HMI for TB related issues the opal HMI handler would
>> try fixing errors and restore the correct host TB value. With no guest
>> running, we don't have any issues. But with guest running on the core
>> we run into TB corruption issues.
>>
>> If we get an HMI while in the guest, the current HMI handler invokes opal
>> hmi handler before forcing guest to exit. The guest exit path subtracts
>> the guest TB offset from the current TB value which may have already
>> been restored with host value by opal hmi handler. This leads to incorrect
>> host and guest TB values.
>>
>> With split-core, things become more complex. With split-core, TB also gets
>> split and each subcore gets its own TB register. When a hmi handler fixes
>> a TB error and restores the TB value, it affects all the TB values of
>> sibling subcores on the same core. On TB errors all the thread in the core
>> gets HMI. With existing code, the individual threads call opal hmi handle
>> independently which can easily throw TB out of sync if we have guest
>> running on subcores. Hence we will need to co-ordinate with all the
>> threads before making opal hmi handler call followed by TB resync.
>>
>> This patch introduces a sibling subcore state structure (shared by all
>> threads in the core) in paca which holds information about whether sibling
>> subcores are in Guest mode or host mode. An array in_guest[] of size
>> MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
>> The subcore id is used as index into in_guest[] array. Only primary
>> thread entering/exiting the guest is responsible to set/unset its
>> designated array element.
>>
>> On TB error, we get HMI interrupt on every thread on the core. Upon HMI,
>> this patch will now force guest to vacate the core/subcore. Primary
>> thread from each subcore will then turn off its respective bit
>> from the above bitmap during the guest exit path just after the
>> guest->host partition switch is complete.
>>
>> All other threads that have just exited the guest OR were already in host
>> will wait until all other subcores clears their respective bit.
>> Once all the subcores turn off their respective bit, all threads will
>> will make call to opal hmi handler.
>>
>> It is not necessary that opal hmi handler would resync the TB value for
>> every HMI interrupts. It would do so only for the HMI caused due to
>> TB errors. For rest, it would not touch TB value. Hence to make things
>> simpler, primary thread would call TB resync explicitly once for each
>> core immediately after opal hmi handler instead of subtracting guest
>> offset from TB. TB resync call will restore the TB with host value.
>> Thus we can be sure about the TB state.
>>
>> One of the primary threads exiting the guest will take up the
>> responsibility of calling TB resync. It will use one of the top bits
>> (bit 63) from subcore state flags bitmap to make the decision. The first
>> primary thread (among the subcores) that is able to set the bit will
>> have to call the TB resync. Rest all other threads will wait until TB
>> resync is complete. Once TB resync is complete all threads will then
>> proceed.
Hi Paul,
I am extremely sorry for late reply. This mail somehow slipped under my
nose.
>
> This patch looks pretty good; the one nit that I can see is that you
> have a vcpu parameter to kvmppc_realmode_hmi_handler which is
> completely unused; which is just as well, because you get it from r9
> at the point where it is called, but at that point r9 may or may not
> contain a vcpu pointer, depending on the path we take to get there.
>
> Why not just remove the vcpu argument?
>
> Also, patch 3/3 seems like it is just something that was missed from
> this patch (2/3). Why not fold it in?
Agree. I have sent out v2 with above changes:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-May/143148.html
Thanks,
-Mahesh.
WARNING: multiple messages have this Message-ID (diff)
From: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>
To: Paul Mackerras <paulus@ozlabs.org>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>,
Michael Ellerman <mpe@ellerman.id.au>,
KVM-PPC <kvm-ppc@vger.kernel.org>, KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH 2/3] powerpc/book3s: Fix TB corruption in guest exit path on HMI interrupt.
Date: Sun, 15 May 2016 09:51:03 +0530 [thread overview]
Message-ID: <5737F92F.6090202@linux.vnet.ibm.com> (raw)
In-Reply-To: <20160318045132.GB1393@fergus.ozlabs.ibm.com>
On 03/18/2016 10:21 AM, Paul Mackerras wrote:
> On Thu, Jan 14, 2016 at 08:45:04AM +0530, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> When a guest is assigned to a core it converts the host Timebase (TB)
>> into guest TB by adding guest timebase offset before entering into
>> guest. During guest exit it restores the guest TB to host TB. This means
>> under certain conditions (Guest migration) host TB and guest TB can differ.
>>
>> When we get an HMI for TB related issues the opal HMI handler would
>> try fixing errors and restore the correct host TB value. With no guest
>> running, we don't have any issues. But with guest running on the core
>> we run into TB corruption issues.
>>
>> If we get an HMI while in the guest, the current HMI handler invokes opal
>> hmi handler before forcing guest to exit. The guest exit path subtracts
>> the guest TB offset from the current TB value which may have already
>> been restored with host value by opal hmi handler. This leads to incorrect
>> host and guest TB values.
>>
>> With split-core, things become more complex. With split-core, TB also gets
>> split and each subcore gets its own TB register. When a hmi handler fixes
>> a TB error and restores the TB value, it affects all the TB values of
>> sibling subcores on the same core. On TB errors all the thread in the core
>> gets HMI. With existing code, the individual threads call opal hmi handle
>> independently which can easily throw TB out of sync if we have guest
>> running on subcores. Hence we will need to co-ordinate with all the
>> threads before making opal hmi handler call followed by TB resync.
>>
>> This patch introduces a sibling subcore state structure (shared by all
>> threads in the core) in paca which holds information about whether sibling
>> subcores are in Guest mode or host mode. An array in_guest[] of size
>> MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
>> The subcore id is used as index into in_guest[] array. Only primary
>> thread entering/exiting the guest is responsible to set/unset its
>> designated array element.
>>
>> On TB error, we get HMI interrupt on every thread on the core. Upon HMI,
>> this patch will now force guest to vacate the core/subcore. Primary
>> thread from each subcore will then turn off its respective bit
>> from the above bitmap during the guest exit path just after the
>> guest->host partition switch is complete.
>>
>> All other threads that have just exited the guest OR were already in host
>> will wait until all other subcores clears their respective bit.
>> Once all the subcores turn off their respective bit, all threads will
>> will make call to opal hmi handler.
>>
>> It is not necessary that opal hmi handler would resync the TB value for
>> every HMI interrupts. It would do so only for the HMI caused due to
>> TB errors. For rest, it would not touch TB value. Hence to make things
>> simpler, primary thread would call TB resync explicitly once for each
>> core immediately after opal hmi handler instead of subtracting guest
>> offset from TB. TB resync call will restore the TB with host value.
>> Thus we can be sure about the TB state.
>>
>> One of the primary threads exiting the guest will take up the
>> responsibility of calling TB resync. It will use one of the top bits
>> (bit 63) from subcore state flags bitmap to make the decision. The first
>> primary thread (among the subcores) that is able to set the bit will
>> have to call the TB resync. Rest all other threads will wait until TB
>> resync is complete. Once TB resync is complete all threads will then
>> proceed.
Hi Paul,
I am extremely sorry for late reply. This mail somehow slipped under my
nose.
>
> This patch looks pretty good; the one nit that I can see is that you
> have a vcpu parameter to kvmppc_realmode_hmi_handler which is
> completely unused; which is just as well, because you get it from r9
> at the point where it is called, but at that point r9 may or may not
> contain a vcpu pointer, depending on the path we take to get there.
>
> Why not just remove the vcpu argument?
>
> Also, patch 3/3 seems like it is just something that was missed from
> this patch (2/3). Why not fold it in?
Agree. I have sent out v2 with above changes:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-May/143148.html
Thanks,
-Mahesh.
next prev parent reply other threads:[~2016-05-15 4:33 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-14 3:14 [PATCH 1/3] Powernv: Remove the usage of PACAR1 from opal wrappers Mahesh J Salgaonkar
2016-01-14 3:26 ` Mahesh J Salgaonkar
2016-01-14 3:14 ` Mahesh J Salgaonkar
2016-01-14 3:15 ` [PATCH 2/3] powerpc/book3s: Fix TB corruption in guest exit path on HMI interrupt Mahesh J Salgaonkar
2016-01-14 3:27 ` Mahesh J Salgaonkar
2016-03-18 4:51 ` Paul Mackerras
2016-03-18 4:51 ` Paul Mackerras
2016-05-15 4:21 ` Mahesh Jagannath Salgaonkar [this message]
2016-05-15 4:33 ` Mahesh Jagannath Salgaonkar
2016-01-14 3:15 ` [PATCH 3/3] KVM: PPC: Book3S HV: Fix soft lockups in KVM on HMI for time base errors Mahesh J Salgaonkar
2016-01-14 3:27 ` Mahesh J Salgaonkar
2016-03-18 3:33 ` [1/3] Powernv: Remove the usage of PACAR1 from opal wrappers Michael Ellerman
2016-03-18 3:33 ` Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5737F92F.6090202@linux.vnet.ibm.com \
--to=mahesh@linux.vnet.ibm.com \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=paulus@ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.