* [PULL 00/12] 3.13 patch queue 2013-12-18 for 3.13
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc; +Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov
Hi Paolo / Gleb,
This is my current patch queue for 3.13.
It fixes some grave issues we've only found after 3.13-rc1:
- Make the modularized HV/PR book3s kvm work well as modules
- Fix some race conditions
- Fix compilation with certain compilers (booke)
- Fix THP for book3s_hv
- Fix preemption for book3s_pr
Please pull.
Alex
The following changes since commit f080480488028bcc25357f85e8ae54ccc3bb7173:
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm (2013-11-15 13:51:36 +0900)
are available in the git repository at:
git://github.com/agraf/linux-2.6.git tags/signed-for-3.13
for you to fetch changes up to df9059bb64023da9f27e56a94a3e2b8f4b6336a9:
KVM: PPC: Book3S HV: Don't drop low-order page address bits (2013-12-18 11:30:35 +0100)
----------------------------------------------------------------
Patch queue for 3.13 - 2013-12-18
This fixes some grave issues we've only found after 3.13-rc1:
- Make the modularized HV/PR book3s kvm work well as modules
- Fix some race conditions
- Fix compilation with certain compilers (booke)
- Fix THP for book3s_hv
- Fix preemption for book3s_pr
Alexander Graf (4):
KVM: PPC: Book3S: PR: Don't clobber our exit handler id
KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
KVM: PPC: Book3S: PR: Enable interrupts earlier
Aneesh Kumar K.V (1):
powerpc: book3s: kvm: Don't abuse host r2 in exit path
Paul Mackerras (5):
KVM: PPC: Book3S HV: Fix physical address calculations
KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
KVM: PPC: Book3S HV: Don't drop low-order page address bits
Scott Wood (1):
powerpc/kvm/booke: Fix build break due to stack frame size warning
pingfan liu (1):
powerpc: kvm: fix rare but potential deadlock scene
----------------------------------------------------------------
arch/powerpc/include/asm/kvm_book3s.h | 4 ++++
arch/powerpc/include/asm/kvm_book3s_asm.h | 2 ++
arch/powerpc/include/asm/switch_to.h | 2 +-
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/process.c | 32 +++++++++++++++----------------
arch/powerpc/kvm/book3s_64_mmu_hv.c | 18 +++++++++++++----
arch/powerpc/kvm/book3s_hv.c | 24 +++++++++++++----------
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 9 +++++++--
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 23 ++++++++++++----------
arch/powerpc/kvm/book3s_interrupts.S | 19 ++++++++++--------
arch/powerpc/kvm/book3s_pr.c | 22 +++++++++++++++++++++
arch/powerpc/kvm/book3s_rmhandlers.S | 6 +-----
arch/powerpc/kvm/booke.c | 12 ++++++------
13 files changed, 112 insertions(+), 62 deletions(-)
* [PULL 01/12] KVM: PPC: Book3S HV: Fix physical address calculations
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Paul Mackerras
From: Paul Mackerras <paulus@samba.org>
This fixes a bug in kvmppc_do_h_enter() where the physical address
for a page can be calculated incorrectly if transparent huge pages
(THP) are active. Until THP came along, it was true that if we
encountered a large (16M) page in kvmppc_do_h_enter(), then the
associated memslot must be 16M aligned for both its guest physical
address and the userspace address, and the physical address
calculations in kvmppc_do_h_enter() assumed that. With THP, that
is no longer true.
In the case where we are using MMU notifiers and the page size that
we get from the Linux page tables is larger than the page being mapped
by the guest, we need to fill in some low-order bits of the physical
address. Without THP, these bits would be the same in the guest
physical address (gpa) and the host virtual address (hva). With THP,
they can be different, and we need to use the bits from hva rather
than gpa.
In the case where we are not using MMU notifiers, the host physical
address we get from the memslot->arch.slot_phys[] array already
includes the low-order bits down to the PAGE_SIZE level, even if
we are using large pages. Thus we can simplify the calculation in
this case to just add in the remaining bits in the case where
PAGE_SIZE is 64k and the guest is mapping a 4k page.
The same bug exists in kvmppc_book3s_hv_page_fault(). The basic fix
is to use psize (the page size from the HPTE) rather than pte_size
(the page size from the Linux PTE) when updating the HPTE low word
in r. That means that pfn needs to be computed to PAGE_SIZE
granularity even if the Linux PTE is a huge page PTE. That can be
arranged simply by doing the page_to_pfn() before setting page to
the head of the compound page. If psize is less than PAGE_SIZE,
then we need to make sure we only update the bits from PAGE_SIZE
upwards, in order not to lose any sub-page offset bits in r.
On the other hand, if psize is greater than PAGE_SIZE, we need to
make sure we don't bring in non-zero low order bits in pfn, hence
we mask (pfn << PAGE_SHIFT) with ~(psize - 1).
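To make the masking concrete, here is a minimal standalone sketch of the
composition above, with illustrative constants (it assumes 64-bit longs
and a 64k host page size; the real values come from the kernel headers):

#include <stdio.h>

#define PAGE_SHIFT  16                  /* assume 64k host pages */
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define HPTE_R_PP0  (1UL << 63)         /* top bit above the RPN field */

int main(void)
{
	unsigned long pfn   = 0x12345;  /* host pfn, PAGE_SIZE granularity */
	unsigned long psize = 4096;     /* guest HPTE maps a 4k page */
	unsigned long r     = 0x5123;   /* low word: sub-page offset + attrs */

	if (psize < PAGE_SIZE)
		psize = PAGE_SIZE;      /* preserve guest sub-page offset bits */

	r = (r & ~(HPTE_R_PP0 - psize)) | ((pfn << PAGE_SHIFT) & ~(psize - 1));
	printf("HPTE low word: 0x%lx\n", r);    /* 0x123455123 here */
	return 0;
}

Clamping psize keeps everything in r below PAGE_SIZE untouched (the 4k
sub-page offset plus the low attribute bits, 0x5123 above), while the
pfn supplies everything from PAGE_SIZE upwards.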
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_64_mmu_hv.c | 12 +++++++++---
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 4 ++--
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index f3ff587..47bbeaf 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -665,6 +665,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
return -EFAULT;
} else {
page = pages[0];
+ pfn = page_to_pfn(page);
if (PageHuge(page)) {
page = compound_head(page);
pte_size <<= compound_order(page);
@@ -689,7 +690,6 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
}
rcu_read_unlock_sched();
}
- pfn = page_to_pfn(page);
}
ret = -EFAULT;
@@ -707,8 +707,14 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
r = (r & ~(HPTE_R_W|HPTE_R_I|HPTE_R_G)) | HPTE_R_M;
}
- /* Set the HPTE to point to pfn */
- r = (r & ~(HPTE_R_PP0 - pte_size)) | (pfn << PAGE_SHIFT);
+ /*
+ * Set the HPTE to point to pfn.
+ * Since the pfn is at PAGE_SIZE granularity, make sure we
+ * don't mask out lower-order bits if psize < PAGE_SIZE.
+ */
+ if (psize < PAGE_SIZE)
+ psize = PAGE_SIZE;
+ r = (r & ~(HPTE_R_PP0 - psize)) | ((pfn << PAGE_SHIFT) & ~(psize - 1));
if (hpte_is_writable(r) && !write_ok)
r = hpte_make_readonly(r);
ret = RESUME_GUEST;
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 9c51544..fddbf98 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -225,6 +225,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
is_io = pa & (HPTE_R_I | HPTE_R_W);
pte_size = PAGE_SIZE << (pa & KVMPPC_PAGE_ORDER_MASK);
pa &= PAGE_MASK;
+ pa |= gpa & ~PAGE_MASK;
} else {
/* Translate to host virtual address */
hva = __gfn_to_hva_memslot(memslot, gfn);
@@ -238,13 +239,12 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
ptel = hpte_make_readonly(ptel);
is_io = hpte_cache_bits(pte_val(pte));
pa = pte_pfn(pte) << PAGE_SHIFT;
+ pa |= hva & (pte_size - 1);
}
}
if (pte_size < psize)
return H_PARAMETER;
- if (pa && pte_size > psize)
- pa |= gpa & (pte_size - 1);
ptel &= ~(HPTE_R_PP0 - psize);
ptel |= pa;
--
1.8.1.4
* [PULL 02/12] KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Paul Mackerras
From: Paul Mackerras <paulus@samba.org>
Some users have reported instances of the host hanging with secondary
threads of a core waiting for the primary thread to exit the guest,
and the primary thread stuck in nap mode. This prompted a review of
the memory barriers in the guest entry/exit code, and this is the
result. Most of these changes are the suggestions of Dean Burdick
<deanburdick@us.ibm.com>.
The barriers between updating napping_threads and reading the
entry_exit_count on the one hand, and updating entry_exit_count and
reading napping_threads on the other, need to be isync not lwsync,
since we need to ensure that either the napping_threads update or the
entry_exit_count update get seen. It is not sufficient to order the
load vs. lwarx, as lwsync does; we need to order the load vs. the
stwcx., so we need isync.
In addition, we need a full sync before sending IPIs to wake other
threads from nap, to ensure that the write to the entry_exit_count is
visible before the IPI occurs.
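The requirement has the classic store-buffering shape: each side stores
to one counter and then loads the other, and at least one side must
observe the other's store. A hedged sketch of the same protocol using
C11 seq_cst atomics instead of PPC assembly (names are illustrative):

#include <stdatomic.h>

static atomic_int napping_threads;   /* bumped by a thread about to nap */
static atomic_int entry_exit_count;  /* 0xff00 bits bumped on guest exit */

void about_to_nap(void)
{
	atomic_fetch_add(&napping_threads, 1);          /* store, then... */
	if (atomic_load(&entry_exit_count) >= 0x100) {
		/* ...load: someone is exiting, skip the nap */
	}
}

void exiting_guest(void)
{
	atomic_fetch_add(&entry_exit_count, 0x100);     /* store, then... */
	if (atomic_load(&napping_threads) != 0) {
		/* ...load: wake nappers (full sync before the IPI) */
	}
}

seq_cst forbids the outcome where both loads miss the other's store; on
Power, lwsync is too weak for that store-to-load ordering, which is why
the patch uses isync after the stwcx. and a full sync before the IPI.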
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bc8de75..bde28da 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -153,7 +153,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
13: b machine_check_fwnmi
-
/*
* We come in here when wakened from nap mode on a secondary hw thread.
* Relocation is off and most register values are lost.
@@ -224,6 +223,11 @@ kvm_start_guest:
/* Clear our vcpu pointer so we don't come back in early */
li r0, 0
std r0, HSTATE_KVM_VCPU(r13)
+ /*
+ * Make sure we clear HSTATE_KVM_VCPU(r13) before incrementing
+ * the nap_count, because once the increment to nap_count is
+ * visible we could be given another vcpu.
+ */
lwsync
/* Clear any pending IPI - we're an offline thread */
ld r5, HSTATE_XICS_PHYS(r13)
@@ -241,7 +245,6 @@ kvm_start_guest:
/* increment the nap count and then go to nap mode */
ld r4, HSTATE_KVM_VCORE(r13)
addi r4, r4, VCORE_NAP_COUNT
- lwsync /* make previous updates visible */
51: lwarx r3, 0, r4
addi r3, r3, 1
stwcx. r3, 0, r4
@@ -990,14 +993,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
*/
/* Increment the threads-exiting-guest count in the 0xff00
bits of vcore->entry_exit_count */
- lwsync
ld r5,HSTATE_KVM_VCORE(r13)
addi r6,r5,VCORE_ENTRY_EXIT
41: lwarx r3,0,r6
addi r0,r3,0x100
stwcx. r0,0,r6
bne 41b
- lwsync
+ isync /* order stwcx. vs. reading napping_threads */
/*
* At this point we have an interrupt that we have to pass
@@ -1030,6 +1032,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
sld r0,r0,r4
andc. r3,r3,r0 /* no sense IPI'ing ourselves */
beq 43f
+ /* Order entry/exit update vs. IPIs */
+ sync
mulli r4,r4,PACA_SIZE /* get paca for thread 0 */
subf r6,r4,r13
42: andi. r0,r3,1
@@ -1638,10 +1642,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_206)
bge kvm_cede_exit
stwcx. r4,0,r6
bne 31b
+ /* order napping_threads update vs testing entry_exit_count */
+ isync
li r0,1
stb r0,HSTATE_NAPPING(r13)
- /* order napping_threads update vs testing entry_exit_count */
- lwsync
mr r4,r3
lwz r7,VCORE_ENTRY_EXIT(r5)
cmpwi r7,0x100
--
1.8.1.4
* [PULL 03/12] KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Paul Mackerras
From: Paul Mackerras <paulus@samba.org>
Lockdep reported that there is a potential for deadlock because
vcpu->arch.tbacct_lock is not irq-safe, and is sometimes taken inside
the rq_lock (run-queue lock) in the scheduler, which is taken within
interrupts. The lockdep splat looks like:
======================================================
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
3.12.0-rc5-kvm+ #8 Not tainted
------------------------------------------------------
qemu-system-ppc/4803 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
(&(&vcpu->arch.tbacct_lock)->rlock){+.+...}, at: [<c0000000000947ac>] .kvmppc_core_vcpu_put_hv+0x2c/0xa0
and this task is already holding:
(&rq->lock){-.-.-.}, at: [<c000000000ac16c0>] .__schedule+0x180/0xaa0
which would create a new lock dependency:
(&rq->lock){-.-.-.} -> (&(&vcpu->arch.tbacct_lock)->rlock){+.+...}
but this new dependency connects a HARDIRQ-irq-safe lock:
(&rq->lock){-.-.-.}
... which became HARDIRQ-irq-safe at:
[<c00000000013797c>] .lock_acquire+0xbc/0x190
[<c000000000ac3c74>] ._raw_spin_lock+0x34/0x60
[<c0000000000f8564>] .scheduler_tick+0x54/0x180
[<c0000000000c2610>] .update_process_times+0x70/0xa0
[<c00000000012cdfc>] .tick_periodic+0x3c/0xe0
[<c00000000012cec8>] .tick_handle_periodic+0x28/0xb0
[<c00000000001ef40>] .timer_interrupt+0x120/0x2e0
[<c000000000002868>] decrementer_common+0x168/0x180
[<c0000000001c7ca4>] .get_page_from_freelist+0x924/0xc10
[<c0000000001c8e00>] .__alloc_pages_nodemask+0x200/0xba0
[<c0000000001c9eb8>] .alloc_pages_exact_nid+0x68/0x110
[<c000000000f4c3ec>] .page_cgroup_init+0x1e0/0x270
[<c000000000f24480>] .start_kernel+0x3e0/0x4e4
[<c000000000009d30>] .start_here_common+0x20/0x70
to a HARDIRQ-irq-unsafe lock:
(&(&vcpu->arch.tbacct_lock)->rlock){+.+...}
... which became HARDIRQ-irq-unsafe at:
... [<c00000000013797c>] .lock_acquire+0xbc/0x190
[<c000000000ac3c74>] ._raw_spin_lock+0x34/0x60
[<c0000000000946ac>] .kvmppc_core_vcpu_load_hv+0x2c/0x100
[<c00000000008394c>] .kvmppc_core_vcpu_load+0x2c/0x40
[<c000000000081000>] .kvm_arch_vcpu_load+0x10/0x30
[<c00000000007afd4>] .vcpu_load+0x64/0xd0
[<c00000000007b0f8>] .kvm_vcpu_ioctl+0x68/0x730
[<c00000000025530c>] .do_vfs_ioctl+0x4dc/0x7a0
[<c000000000255694>] .SyS_ioctl+0xc4/0xe0
[<c000000000009ee4>] syscall_exit+0x0/0x98
Some users have reported this deadlock occurring in practice, though
the reports have been primarily on 3.10.x-based kernels.
This fixes the problem by making tbacct_lock irq-safe.
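The shape of the fix, as a minimal sketch (the lock name and body are
illustrative; the locking APIs are the real kernel ones):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_tbacct_lock);

static void account_stolen_time(void)
{
	unsigned long flags;

	/*
	 * This lock now nests inside the hardirq-safe rq->lock, so it
	 * must itself be taken with interrupts disabled; otherwise an
	 * interrupt-driven inversion is possible, as lockdep reports.
	 */
	spin_lock_irqsave(&example_tbacct_lock, flags);
	/* ... update stolen_tb / busy_stolen ... */
	spin_unlock_irqrestore(&example_tbacct_lock, flags);
}

In paths where interrupts are known to be enabled, the patch uses the
cheaper spin_lock_irq()/spin_unlock_irq() pair instead of saving and
restoring flags.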
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_hv.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 072287f..31d9cfb 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -131,8 +131,9 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
+ unsigned long flags;
- spin_lock(&vcpu->arch.tbacct_lock);
+ spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
if (vc->runner == vcpu && vc->vcore_state != VCORE_INACTIVE &&
vc->preempt_tb != TB_NIL) {
vc->stolen_tb += mftb() - vc->preempt_tb;
@@ -143,19 +144,20 @@ static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
vcpu->arch.busy_stolen += mftb() - vcpu->arch.busy_preempt;
vcpu->arch.busy_preempt = TB_NIL;
}
- spin_unlock(&vcpu->arch.tbacct_lock);
+ spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
}
static void kvmppc_core_vcpu_put_hv(struct kvm_vcpu *vcpu)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
+ unsigned long flags;
- spin_lock(&vcpu->arch.tbacct_lock);
+ spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
if (vc->runner == vcpu && vc->vcore_state != VCORE_INACTIVE)
vc->preempt_tb = mftb();
if (vcpu->arch.state == KVMPPC_VCPU_BUSY_IN_HOST)
vcpu->arch.busy_preempt = mftb();
- spin_unlock(&vcpu->arch.tbacct_lock);
+ spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
}
static void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
@@ -486,11 +488,11 @@ static u64 vcore_stolen_time(struct kvmppc_vcore *vc, u64 now)
*/
if (vc->vcore_state != VCORE_INACTIVE &&
vc->runner->arch.run_task != current) {
- spin_lock(&vc->runner->arch.tbacct_lock);
+ spin_lock_irq(&vc->runner->arch.tbacct_lock);
p = vc->stolen_tb;
if (vc->preempt_tb != TB_NIL)
p += now - vc->preempt_tb;
- spin_unlock(&vc->runner->arch.tbacct_lock);
+ spin_unlock_irq(&vc->runner->arch.tbacct_lock);
} else {
p = vc->stolen_tb;
}
@@ -512,10 +514,10 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
core_stolen = vcore_stolen_time(vc, now);
stolen = core_stolen - vcpu->arch.stolen_logged;
vcpu->arch.stolen_logged = core_stolen;
- spin_lock(&vcpu->arch.tbacct_lock);
+ spin_lock_irq(&vcpu->arch.tbacct_lock);
stolen += vcpu->arch.busy_stolen;
vcpu->arch.busy_stolen = 0;
- spin_unlock(&vcpu->arch.tbacct_lock);
+ spin_unlock_irq(&vcpu->arch.tbacct_lock);
if (!dt || !vpa)
return;
memset(dt, 0, sizeof(struct dtl_entry));
@@ -1115,13 +1117,13 @@ static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
if (vcpu->arch.state != KVMPPC_VCPU_RUNNABLE)
return;
- spin_lock(&vcpu->arch.tbacct_lock);
+ spin_lock_irq(&vcpu->arch.tbacct_lock);
now = mftb();
vcpu->arch.busy_stolen += vcore_stolen_time(vc, now) -
vcpu->arch.stolen_logged;
vcpu->arch.busy_preempt = now;
vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
- spin_unlock(&vcpu->arch.tbacct_lock);
+ spin_unlock_irq(&vcpu->arch.tbacct_lock);
--vc->n_runnable;
list_del(&vcpu->arch.run_list);
}
--
1.8.1.4
* [PULL 04/12] KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Paul Mackerras
From: Paul Mackerras <paulus@samba.org>
Running a kernel with CONFIG_PROVE_RCU=y yields the following diagnostic:
===============================
[ INFO: suspicious RCU usage. ]
3.12.0-rc5-kvm+ #9 Not tainted
-------------------------------
include/linux/kvm_host.h:473 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-ppc/4831:
stack backtrace:
CPU: 28 PID: 4831 Comm: qemu-system-ppc Not tainted 3.12.0-rc5-kvm+ #9
Call Trace:
[c000000be462b2a0] [c00000000001644c] .show_stack+0x7c/0x1f0 (unreliable)
[c000000be462b370] [c000000000ad57c0] .dump_stack+0x88/0xb4
[c000000be462b3f0] [c0000000001315e8] .lockdep_rcu_suspicious+0x138/0x180
[c000000be462b480] [c00000000007862c] .gfn_to_memslot+0x13c/0x170
[c000000be462b510] [c00000000007d384] .gfn_to_hva_prot+0x24/0x90
[c000000be462b5a0] [c00000000007d420] .kvm_read_guest_page+0x30/0xd0
[c000000be462b630] [c00000000007d528] .kvm_read_guest+0x68/0x110
[c000000be462b6e0] [c000000000084594] .kvmppc_rtas_hcall+0x34/0x180
[c000000be462b7d0] [c000000000097934] .kvmppc_pseries_do_hcall+0x74/0x830
[c000000be462b880] [c0000000000990e8] .kvmppc_vcpu_run_hv+0xff8/0x15a0
[c000000be462b9e0] [c0000000000839cc] .kvmppc_vcpu_run+0x2c/0x40
[c000000be462ba50] [c0000000000810b4] .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[c000000be462bae0] [c00000000007b508] .kvm_vcpu_ioctl+0x478/0x730
[c000000be462bca0] [c00000000025532c] .do_vfs_ioctl+0x4dc/0x7a0
[c000000be462bd80] [c0000000002556b4] .SyS_ioctl+0xc4/0xe0
[c000000be462be30] [c000000000009ee4] syscall_exit+0x0/0x98
To fix this, we take the SRCU read lock around the kvmppc_rtas_hcall()
call.
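The pattern, sketched as a standalone function (srcu_read_lock(),
srcu_read_unlock() and kvm->srcu are the real APIs; the wrapper is
illustrative):

#include <linux/kvm_host.h>

/* Memslot lookups (gfn_to_memslot() underneath kvm_read_guest())
 * are only valid inside an SRCU read-side section on kvm->srcu. */
static int rtas_hcall_with_srcu(struct kvm_vcpu *vcpu)
{
	int rc, idx;

	idx = srcu_read_lock(&vcpu->kvm->srcu);
	rc = kvmppc_rtas_hcall(vcpu);   /* ends up in kvm_read_guest() */
	srcu_read_unlock(&vcpu->kvm->srcu, idx);

	return rc;
}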
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_hv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 31d9cfb..b51d5db 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -591,7 +591,9 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
if (list_empty(&vcpu->kvm->arch.rtas_tokens))
return RESUME_HOST;
+ idx = srcu_read_lock(&vcpu->kvm->srcu);
rc = kvmppc_rtas_hcall(vcpu);
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
if (rc == -ENOENT)
return RESUME_HOST;
--
1.8.1.4
* [PULL 05/12] powerpc: kvm: fix rare but potential deadlock scene
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
pingfan liu, Liu Ping Fan, stable
From: pingfan liu <qemulist@gmail.com>
kvmppc_hv_find_lock_hpte() is called from both virtmode and realmode,
which makes it possible to trigger a deadlock.
Consider the following scenario: two physical cpus cpuM and cpuN, and
two VM instances A and B, each with a group of vcpus.
If vcpu_A_1 takes bitlock X (HPTE_V_HVLOCK) on cpuM and is then
switched out, and vcpu_A_2 tries to take X in realmode on cpuN, then
cpuN is stuck spinning in realmode for a long time, because realmode
spinning cannot be preempted.
Things get even worse if the two spins cross:
bitlock X is held by a vcpu preempted on cpuM, bitlock Y by a vcpu
preempted on cpuN;
vcpu_B_2 spins on Y in realmode on cpuM, while
vcpu_A_2 spins on X in realmode on cpuN.
Now neither preempted lock holder can ever run again: deadlock.
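The fix keeps the virtmode caller pinned to its physical cpu for the
whole window in which the bitlock may be held, so a realmode spinner
elsewhere never waits on a preempted holder. A distilled sketch of the
virtmode path (helper and flag names are the real ones from the hunks
below; the wrapper is illustrative):

static int xlate_with_pin(struct kvm *kvm, gva_t eaddr, unsigned long slb_v)
{
	long index;

	preempt_disable();      /* don't get switched out holding the lock */
	index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
					 HPTE_V_VALID | HPTE_V_ABSENT);
	if (index < 0) {
		preempt_enable();
		return -ENOENT;
	}
	/* ... read the HPTE, then clear HPTE_V_HVLOCK ... */
	preempt_enable();
	return 0;
}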
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Reviewed-by: Paul Mackerras <paulus@samba.org>
CC: stable@vger.kernel.org
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 +++++-
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 4 ++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 47bbeaf..c5d1484 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -469,11 +469,14 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
slb_v = vcpu->kvm->arch.vrma_slb_v;
}
+ preempt_disable();
/* Find the HPTE in the hash table */
index = kvmppc_hv_find_lock_hpte(kvm, eaddr, slb_v,
HPTE_V_VALID | HPTE_V_ABSENT);
- if (index < 0)
+ if (index < 0) {
+ preempt_enable();
return -ENOENT;
+ }
hptep = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
v = hptep[0] & ~HPTE_V_HVLOCK;
gr = kvm->arch.revmap[index].guest_rpte;
@@ -481,6 +484,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
/* Unlock the HPTE */
asm volatile("lwsync" : : : "memory");
hptep[0] = v;
+ preempt_enable();
gpte->eaddr = eaddr;
gpte->vpage = ((v & HPTE_V_AVPN) << 4) | ((eaddr >> 12) & 0xfff);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index fddbf98..1931aa3 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -749,6 +749,10 @@ static int slb_base_page_shift[4] = {
20, /* 1M, unsupported */
};
+/* When called from virtmode, this func should be protected by
+ * preempt_disable(), otherwise, the holding of HPTE_V_HVLOCK
+ * can trigger deadlock issue.
+ */
long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr, unsigned long slb_v,
unsigned long valid)
{
--
1.8.1.4
* [PULL 06/12] KVM: PPC: Book3S: PR: Don't clobber our exit handler id
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc; +Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov
We call a C helper to save all svcpu fields into our vcpu. The C
ABI states that r12 is considered volatile. However, we keep our
exit handler id in r12 currently.
So we need to save it away to a place that is definitely preserved
across the C call; the fix below stores it into the vcpu struct.
This bug hasn't actually bitten anyone yet, because gcc happens to
generate code that doesn't need r12 at all, so the value stayed
intact across the call by sheer luck. But we can't rely on that.
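For intuition, here is the same save/reload pattern mapped onto plain C
(names are purely illustrative; in C the compiler spills caller-saved
values automatically, which is exactly the job the hand-written assembly
has to do by hand):

#include <stdio.h>

struct vcpu_example { unsigned int trap; };     /* stand-in for the vcpu */

static void copy_from_svcpu_stub(struct vcpu_example *v)
{
	/* a C call: free to clobber every volatile register, r12 included */
	(void)v;
}

static unsigned int exit_path(struct vcpu_example *v, unsigned int exit_id)
{
	v->trap = exit_id;          /* like "stw r12, VCPU_TRAP(r3)" */
	copy_from_svcpu_stub(v);    /* whatever held exit_id may be gone now */
	return v->trap;             /* like "lwz r5, VCPU_TRAP(r7)" */
}

int main(void)
{
	struct vcpu_example v = { 0 };
	printf("exit id after the call: %u\n", exit_path(&v, 42));
	return 0;
}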
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_interrupts.S | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_interrupts.S b/arch/powerpc/kvm/book3s_interrupts.S
index f4dd041..5e7cb32 100644
--- a/arch/powerpc/kvm/book3s_interrupts.S
+++ b/arch/powerpc/kvm/book3s_interrupts.S
@@ -132,9 +132,17 @@ kvm_start_lightweight:
*
*/
+ PPC_LL r3, GPR4(r1) /* vcpu pointer */
+
+ /*
+ * kvmppc_copy_from_svcpu can clobber volatile registers, save
+ * the exit handler id to the vcpu and restore it from there later.
+ */
+ stw r12, VCPU_TRAP(r3)
+
/* Transfer reg values from shadow vcpu back to vcpu struct */
/* On 64-bit, interrupts are still off at this point */
- PPC_LL r3, GPR4(r1) /* vcpu pointer */
+
GET_SHADOW_VCPU(r4)
bl FUNC(kvmppc_copy_from_svcpu)
nop
@@ -151,7 +159,6 @@ kvm_start_lightweight:
*/
ld r3, PACA_SPRG3(r13)
mtspr SPRN_SPRG3, r3
-
#endif /* CONFIG_PPC_BOOK3S_64 */
/* R7 = vcpu */
@@ -177,7 +184,7 @@ kvm_start_lightweight:
PPC_STL r31, VCPU_GPR(R31)(r7)
/* Pass the exit number as 3rd argument to kvmppc_handle_exit */
- mr r5, r12
+ lwz r5, VCPU_TRAP(r7)
/* Restore r3 (kvm_run) and r4 (vcpu) */
REST_2GPRS(3, r1)
--
1.8.1.4
* [PULL 07/12] KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc; +Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov
The kvmppc_copy_{to,from}_svcpu functions are publicly visible,
so we should also export them in a header for other C files to
consume.
So far we didn't need this because we only called them from asm code.
The next patch will introduce a C caller.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_book3s.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 4a594b7..bc23b1b 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -192,6 +192,10 @@ extern void kvmppc_load_up_vsx(void);
extern u32 kvmppc_alignment_dsisr(struct kvm_vcpu *vcpu, unsigned int inst);
extern ulong kvmppc_alignment_dar(struct kvm_vcpu *vcpu, unsigned int inst);
extern int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd);
+extern void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
+ struct kvm_vcpu *vcpu);
+extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
+ struct kvmppc_book3s_shadow_vcpu *svcpu);
static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
{
--
1.8.1.4
* [PULL 08/12] KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc; +Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov
As soon as we get back to our "highmem" handler in virtual address
space we may get preempted. Today the reason we can get preempted is
that we replay interrupts and all the lazy logic thinks we have
interrupts enabled.
However, it's not hard to make the code interruptible and that way
we can enable and handle interrupts even earlier.
This fixes random guest crashes that happened with CONFIG_PREEMPT=y
for me.
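Distilled, the sync becomes idempotent and preemption-aware (struct
names are illustrative; preempt_disable()/preempt_enable() are the real
kernel APIs):

#include <linux/preempt.h>
#include <linux/types.h>

struct svcpu_example { bool in_use; unsigned long gpr[14]; };
struct vcpu_example  { unsigned long gpr[14]; };

static void copy_from_svcpu_example(struct vcpu_example *vcpu,
				    struct svcpu_example *svcpu)
{
	int i;

	/* vcpu_put would call straight back into us because in_use
	 * hasn't been cleared yet, so keep preemption off here. */
	preempt_disable();

	/* The preempt notifier may already have synced this svcpu. */
	if (!svcpu->in_use)
		goto out;

	for (i = 0; i < 14; i++)
		vcpu->gpr[i] = svcpu->gpr[i];
	svcpu->in_use = false;
out:
	preempt_enable();
}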
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_book3s_asm.h | 1 +
arch/powerpc/kvm/book3s_pr.c | 22 ++++++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 0bd9348..412b2f3 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -106,6 +106,7 @@ struct kvmppc_host_state {
};
struct kvmppc_book3s_shadow_vcpu {
+ bool in_use;
ulong gpr[14];
u32 cr;
u32 xer;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index fe14ca3..5b9e906 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -66,6 +66,7 @@ static void kvmppc_core_vcpu_load_pr(struct kvm_vcpu *vcpu, int cpu)
struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
memcpy(svcpu->slb, to_book3s(vcpu)->slb_shadow, sizeof(svcpu->slb));
svcpu->slb_max = to_book3s(vcpu)->slb_shadow_max;
+ svcpu->in_use = 0;
svcpu_put(svcpu);
#endif
vcpu->cpu = smp_processor_id();
@@ -78,6 +79,9 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_PPC_BOOK3S_64
struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
+ if (svcpu->in_use) {
+ kvmppc_copy_from_svcpu(vcpu, svcpu);
+ }
memcpy(to_book3s(vcpu)->slb_shadow, svcpu->slb, sizeof(svcpu->slb));
to_book3s(vcpu)->slb_shadow_max = svcpu->slb_max;
svcpu_put(svcpu);
@@ -110,12 +114,26 @@ void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
svcpu->ctr = vcpu->arch.ctr;
svcpu->lr = vcpu->arch.lr;
svcpu->pc = vcpu->arch.pc;
+ svcpu->in_use = true;
}
/* Copy data touched by real-mode code from shadow vcpu back to vcpu */
void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
struct kvmppc_book3s_shadow_vcpu *svcpu)
{
+ /*
+ * vcpu_put would just call us again because in_use hasn't
+ * been updated yet.
+ */
+ preempt_disable();
+
+ /*
+ * Maybe we were already preempted and synced the svcpu from
+ * our preempt notifiers. Don't bother touching this svcpu then.
+ */
+ if (!svcpu->in_use)
+ goto out;
+
vcpu->arch.gpr[0] = svcpu->gpr[0];
vcpu->arch.gpr[1] = svcpu->gpr[1];
vcpu->arch.gpr[2] = svcpu->gpr[2];
@@ -139,6 +157,10 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
vcpu->arch.fault_dar = svcpu->fault_dar;
vcpu->arch.fault_dsisr = svcpu->fault_dsisr;
vcpu->arch.last_inst = svcpu->last_inst;
+ svcpu->in_use = false;
+
+out:
+ preempt_enable();
}
static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
--
1.8.1.4
* [PULL 09/12] KVM: PPC: Book3S: PR: Enable interrupts earlier
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc; +Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov
Now that the svcpu sync is interrupt aware we can enable interrupts
earlier in the exit code path again, moving 32bit and 64bit closer
together.
While at it, document the fact that we're always executing the exit
path with interrupts enabled so that the next person doesn't trap
over this.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_interrupts.S | 6 +-----
arch/powerpc/kvm/book3s_rmhandlers.S | 6 +-----
2 files changed, 2 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_interrupts.S b/arch/powerpc/kvm/book3s_interrupts.S
index 5e7cb32..f779450 100644
--- a/arch/powerpc/kvm/book3s_interrupts.S
+++ b/arch/powerpc/kvm/book3s_interrupts.S
@@ -129,6 +129,7 @@ kvm_start_lightweight:
* R12 = exit handler id
* R13 = PACA
* SVCPU.* = guest *
+ * MSR.EE = 1
*
*/
@@ -148,11 +149,6 @@ kvm_start_lightweight:
nop
#ifdef CONFIG_PPC_BOOK3S_64
- /* Re-enable interrupts */
- ld r3, HSTATE_HOST_MSR(r13)
- ori r3, r3, MSR_EE
- MTMSR_EERI(r3)
-
/*
* Reload kernel SPRG3 value.
* No need to save guest value as usermode can't modify SPRG3.
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index a38c4c9..c3c5231 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -153,15 +153,11 @@ _GLOBAL(kvmppc_entry_trampoline)
li r6, MSR_IR | MSR_DR
andc r6, r5, r6 /* Clear DR and IR in MSR value */
-#ifdef CONFIG_PPC_BOOK3S_32
/*
* Set EE in HOST_MSR so that it's enabled when we get into our
- * C exit handler function. On 64-bit we delay enabling
- * interrupts until we have finished transferring stuff
- * to or from the PACA.
+ * C exit handler function.
*/
ori r5, r5, MSR_EE
-#endif
mtsrr0 r7
mtsrr1 r6
RFI
--
1.8.1.4
* [PULL 10/12] powerpc/kvm/booke: Fix build break due to stack frame size warning
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Scott Wood
From: Scott Wood <scottwood@freescale.com>
Commit ce11e48b7fdd256ec68b932a89b397a790566031 ("KVM: PPC: E500: Add
userspace debug stub support") added "struct thread_struct" to the
stack of kvmppc_vcpu_run(). thread_struct is 1152 bytes on my build,
compared to 48 bytes for the recently-introduced "struct debug_reg".
Use the latter instead.
This fixes the following error:
cc1: warnings being treated as errors
arch/powerpc/kvm/booke.c: In function 'kvmppc_vcpu_run':
arch/powerpc/kvm/booke.c:760:1: error: the frame size of 1424 bytes is larger than 1024 bytes
make[2]: *** [arch/powerpc/kvm/booke.o] Error 1
make[1]: *** [arch/powerpc/kvm] Error 2
make[1]: *** Waiting for unfinished jobs....
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/switch_to.h | 2 +-
arch/powerpc/kernel/process.c | 32 ++++++++++++++++----------------
arch/powerpc/kvm/booke.c | 12 ++++++------
3 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 9ee1261..aace905 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -35,7 +35,7 @@ extern void giveup_vsx(struct task_struct *);
extern void enable_kernel_spe(void);
extern void giveup_spe(struct task_struct *);
extern void load_up_spe(struct task_struct *);
-extern void switch_booke_debug_regs(struct thread_struct *new_thread);
+extern void switch_booke_debug_regs(struct debug_reg *new_debug);
#ifndef CONFIG_SMP
extern void discard_lazy_cpu_state(void);
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 75c2d10..83530af 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -339,7 +339,7 @@ static void set_debug_reg_defaults(struct thread_struct *thread)
#endif
}
-static void prime_debug_regs(struct thread_struct *thread)
+static void prime_debug_regs(struct debug_reg *debug)
{
/*
* We could have inherited MSR_DE from userspace, since
@@ -348,22 +348,22 @@ static void prime_debug_regs(struct thread_struct *thread)
*/
mtmsr(mfmsr() & ~MSR_DE);
- mtspr(SPRN_IAC1, thread->debug.iac1);
- mtspr(SPRN_IAC2, thread->debug.iac2);
+ mtspr(SPRN_IAC1, debug->iac1);
+ mtspr(SPRN_IAC2, debug->iac2);
#if CONFIG_PPC_ADV_DEBUG_IACS > 2
- mtspr(SPRN_IAC3, thread->debug.iac3);
- mtspr(SPRN_IAC4, thread->debug.iac4);
+ mtspr(SPRN_IAC3, debug->iac3);
+ mtspr(SPRN_IAC4, debug->iac4);
#endif
- mtspr(SPRN_DAC1, thread->debug.dac1);
- mtspr(SPRN_DAC2, thread->debug.dac2);
+ mtspr(SPRN_DAC1, debug->dac1);
+ mtspr(SPRN_DAC2, debug->dac2);
#if CONFIG_PPC_ADV_DEBUG_DVCS > 0
- mtspr(SPRN_DVC1, thread->debug.dvc1);
- mtspr(SPRN_DVC2, thread->debug.dvc2);
+ mtspr(SPRN_DVC1, debug->dvc1);
+ mtspr(SPRN_DVC2, debug->dvc2);
#endif
- mtspr(SPRN_DBCR0, thread->debug.dbcr0);
- mtspr(SPRN_DBCR1, thread->debug.dbcr1);
+ mtspr(SPRN_DBCR0, debug->dbcr0);
+ mtspr(SPRN_DBCR1, debug->dbcr1);
#ifdef CONFIG_BOOKE
- mtspr(SPRN_DBCR2, thread->debug.dbcr2);
+ mtspr(SPRN_DBCR2, debug->dbcr2);
#endif
}
/*
@@ -371,11 +371,11 @@ static void prime_debug_regs(struct thread_struct *thread)
* debug registers, set the debug registers from the values
* stored in the new thread.
*/
-void switch_booke_debug_regs(struct thread_struct *new_thread)
+void switch_booke_debug_regs(struct debug_reg *new_debug)
{
if ((current->thread.debug.dbcr0 & DBCR0_IDM)
- || (new_thread->debug.dbcr0 & DBCR0_IDM))
- prime_debug_regs(new_thread);
+ || (new_debug->dbcr0 & DBCR0_IDM))
+ prime_debug_regs(new_debug);
}
EXPORT_SYMBOL_GPL(switch_booke_debug_regs);
#else /* !CONFIG_PPC_ADV_DEBUG_REGS */
@@ -683,7 +683,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
#endif /* CONFIG_SMP */
#ifdef CONFIG_PPC_ADV_DEBUG_REGS
- switch_booke_debug_regs(&new->thread);
+ switch_booke_debug_regs(&new->thread.debug);
#else
/*
* For PPC_BOOK3S_64, we use the hw-breakpoint interfaces that would
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 53e65a2..0591e05 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -681,7 +681,7 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
{
int ret, s;
- struct thread_struct thread;
+ struct debug_reg debug;
#ifdef CONFIG_PPC_FPU
struct thread_fp_state fp;
int fpexc_mode;
@@ -723,9 +723,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
#endif
/* Switch to guest debug context */
- thread.debug = vcpu->arch.shadow_dbg_reg;
- switch_booke_debug_regs(&thread);
- thread.debug = current->thread.debug;
+ debug = vcpu->arch.shadow_dbg_reg;
+ switch_booke_debug_regs(&debug);
+ debug = current->thread.debug;
current->thread.debug = vcpu->arch.shadow_dbg_reg;
kvmppc_fix_ee_before_entry();
@@ -736,8 +736,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
We also get here with interrupts enabled. */
/* Switch back to user space debug context */
- switch_booke_debug_regs(&thread);
- current->thread.debug = thread.debug;
+ switch_booke_debug_regs(&debug);
+ current->thread.debug = debug;
#ifdef CONFIG_PPC_FPU
kvmppc_save_guest_fp(vcpu);
--
1.8.1.4
* [PULL 11/12] powerpc: book3s: kvm: Don't abuse host r2 in exit path
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Aneesh Kumar K.V
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We don't use PACATOC for PR. Avoid updating HOST_R2 with PR
KVM mode when both HV and PR are enabled in the kernel. Without this we
get the below crash
(qemu)
Unable to handle kernel paging request for data at address 0xffffffffffff8310
Faulting instruction address: 0xc00000000001d5a4
cpu 0x2: Vector: 300 (Data Access) at [c0000001dc53aef0]
pc: c00000000001d5a4: .vtime_delta.isra.1+0x34/0x1d0
lr: c00000000001d760: .vtime_account_system+0x20/0x60
sp: c0000001dc53b170
msr: 8000000000009032
dar: ffffffffffff8310
dsisr: 40000000
current = 0xc0000001d76c62d0
paca = 0xc00000000fef1100 softe: 0 irq_happened: 0x01
pid = 4472, comm = qemu-system-ppc
enter ? for help
[c0000001dc53b200] c00000000001d760 .vtime_account_system+0x20/0x60
[c0000001dc53b290] c00000000008d050 .kvmppc_handle_exit_pr+0x60/0xa50
[c0000001dc53b340] c00000000008f51c kvm_start_lightweight+0xb4/0xc4
[c0000001dc53b510] c00000000008cdf0 .kvmppc_vcpu_run_pr+0x150/0x2e0
[c0000001dc53b9e0] c00000000008341c .kvmppc_vcpu_run+0x2c/0x40
[c0000001dc53ba50] c000000000080af4 .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[c0000001dc53bae0] c00000000007b4c8 .kvm_vcpu_ioctl+0x478/0x730
[c0000001dc53bca0] c0000000002140cc .do_vfs_ioctl+0x4ac/0x770
[c0000001dc53bd80] c0000000002143e8 .SyS_ioctl+0x58/0xb0
[c0000001dc53be30] c000000000009e58 syscall_exit+0x0/0x98
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_book3s_asm.h | 1 +
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 7 +++----
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 412b2f3..192917d 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -79,6 +79,7 @@ struct kvmppc_host_state {
ulong vmhandler;
ulong scratch0;
ulong scratch1;
+ ulong scratch2;
u8 in_guest;
u8 restore_hid5;
u8 napping;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 2ea5cc0..d3de010 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -576,6 +576,7 @@ int main(void)
HSTATE_FIELD(HSTATE_VMHANDLER, vmhandler);
HSTATE_FIELD(HSTATE_SCRATCH0, scratch0);
HSTATE_FIELD(HSTATE_SCRATCH1, scratch1);
+ HSTATE_FIELD(HSTATE_SCRATCH2, scratch2);
HSTATE_FIELD(HSTATE_IN_GUEST, in_guest);
HSTATE_FIELD(HSTATE_RESTORE_HID5, restore_hid5);
HSTATE_FIELD(HSTATE_NAPPING, napping);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bde28da..be4fa04a 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -754,15 +754,14 @@ kvmppc_interrupt_hv:
* guest CR, R12 saved in shadow VCPU SCRATCH1/0
* guest R13 saved in SPRN_SCRATCH0
*/
- /* abuse host_r2 as third scratch area; we get r2 from PACATOC(r13) */
- std r9, HSTATE_HOST_R2(r13)
+ std r9, HSTATE_SCRATCH2(r13)
lbz r9, HSTATE_IN_GUEST(r13)
cmpwi r9, KVM_GUEST_MODE_HOST_HV
beq kvmppc_bad_host_intr
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
cmpwi r9, KVM_GUEST_MODE_GUEST
- ld r9, HSTATE_HOST_R2(r13)
+ ld r9, HSTATE_SCRATCH2(r13)
beq kvmppc_interrupt_pr
#endif
/* We're now back in the host but in guest MMU context */
@@ -782,7 +781,7 @@ kvmppc_interrupt_hv:
std r6, VCPU_GPR(R6)(r9)
std r7, VCPU_GPR(R7)(r9)
std r8, VCPU_GPR(R8)(r9)
- ld r0, HSTATE_HOST_R2(r13)
+ ld r0, HSTATE_SCRATCH2(r13)
std r0, VCPU_GPR(R9)(r9)
std r10, VCPU_GPR(R10)(r9)
std r11, VCPU_GPR(R11)(r9)
--
1.8.1.4
* [PULL 12/12] KVM: PPC: Book3S HV: Don't drop low-order page address bits
From: Alexander Graf @ 2013-12-18 16:01 UTC
To: kvm-ppc
Cc: kvm@vger.kernel.org mailing list, Paolo Bonzini, Gleb Natapov,
Paul Mackerras
From: Paul Mackerras <paulus@samba.org>
Commit caaa4c804fae ("KVM: PPC: Book3S HV: Fix physical address
calculations") unfortunately resulted in some low-order address bits
getting dropped in the case where the guest is creating a 4k HPTE
and the host page size is 64k. By getting the low-order bits from
hva rather than gpa we miss out on bits 12 - 15 in this case, since
hva is at page granularity. This puts the missing bits back in.
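A worked example of the lost bits, with illustrative addresses and a
64k host page size:

#include <stdio.h>

#define PAGE_SHIFT 16                            /* 64k host pages */
#define PAGE_MASK  (~((1UL << PAGE_SHIFT) - 1))

int main(void)
{
	unsigned long pte_size = 1UL << PAGE_SHIFT; /* 64k Linux PTE */
	unsigned long hva = 0x3fff0000UL;  /* page granular: bits 0-15 zero */
	unsigned long gpa = 0x12345000UL;  /* guest 4k page, offset 0x5000 */
	unsigned long pa  = 0xabcd0000UL;  /* pte_pfn(pte) << PAGE_SHIFT */

	pa |= hva & (pte_size - 1);    /* contributes nothing here */
	pa |= gpa & ~PAGE_MASK;        /* restores bits 12-15 from gpa */
	printf("pa = 0x%lx\n", pa);    /* 0xabcd5000: 4k sub-page back */
	return 0;
}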
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 1931aa3..8689e2e 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -240,6 +240,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
is_io = hpte_cache_bits(pte_val(pte));
pa = pte_pfn(pte) << PAGE_SHIFT;
pa |= hva & (pte_size - 1);
+ pa |= gpa & ~PAGE_MASK;
}
}
--
1.8.1.4
* Re: [PULL 00/12] 3.13 patch queue 2013-12-18 for 3.13
From: Paolo Bonzini @ 2013-12-20 18:20 UTC
To: Alexander Graf; +Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Gleb Natapov
On 18/12/2013 17:01, Alexander Graf wrote:
> Hi Paolo / Gleb,
>
> This is my current patch queue for 3.13.
>
> It fixes some grave issues we've only found after 3.13-rc1:
>
> - Make the modularized HV/PR book3s kvm work well as modules
> - Fix some race conditions
> - Fix compilation with certain compilers (booke)
> - Fix THP for book3s_hv
> - Fix preemption for book3s_pr
>
> Please pull.
Pulled and sent to Linus, thanks.
Paolo