* [PATCH 22/27] KVM: PPC: PV assembler helpers
From: Alexander Graf @ 2010-07-29 12:48 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280407688-9815-1-git-send-email-agraf@suse.de>
When we hook an instruction we need to make sure we don't clobber any of
the registers at that point. So we write them out to scratch space in the
magic page. To make sure we don't fall into a race with another piece of
hooked code, we need to disable interrupts.
To make the later patches and code in general easier readable, let's introduce
a set of defines that save and restore r30, r31 and cr. Let's also define some
helpers to read the lower 32 bits of a 64 bit field on 32 bit systems.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kernel/kvm_emul.S | 29 +++++++++++++++++++++++++++++
1 files changed, 29 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index e0d4183..1dac72d 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -35,3 +35,32 @@ kvm_hypercall_start:
#define KVM_MAGIC_PAGE (-4096)
+#ifdef CONFIG_64BIT
+#define LL64(reg, offs, reg2) ld reg, (offs)(reg2)
+#define STL64(reg, offs, reg2) std reg, (offs)(reg2)
+#else
+#define LL64(reg, offs, reg2) lwz reg, (offs + 4)(reg2)
+#define STL64(reg, offs, reg2) stw reg, (offs + 4)(reg2)
+#endif
+
+#define SCRATCH_SAVE \
+ /* Enable critical section. We are critical if \
+ shared->critical == r1 */ \
+ STL64(r1, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0); \
+ \
+ /* Save state */ \
+ PPC_STL r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH1)(0); \
+ PPC_STL r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH2)(0); \
+ mfcr r31; \
+ stw r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH3)(0);
+
+#define SCRATCH_RESTORE \
+ /* Restore state */ \
+ PPC_LL r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH1)(0); \
+ lwz r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH3)(0); \
+ mtcr r30; \
+ PPC_LL r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH2)(0); \
+ \
+ /* Disable critical section. We are critical if \
+ shared->critical == r1 and r2 is always != r1 */ \
+ STL64(r2, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);
--
1.6.0.2
^ permalink raw reply related
* [PATCH 16/27] KVM: PPC: Generic KVM PV guest support
From: Alexander Graf @ 2010-07-29 12:47 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280407688-9815-1-git-send-email-agraf@suse.de>
We have all the hypervisor pieces in place now, but the guest parts are still
missing.
This patch implements basic awareness of KVM when running Linux as guest. It
doesn't do anything with it yet though.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v2 -> v3:
- Add hypercall stub
---
arch/powerpc/kernel/Makefile | 2 +-
arch/powerpc/kernel/asm-offsets.c | 15 +++++++++++++++
arch/powerpc/kernel/kvm.c | 3 +++
arch/powerpc/kernel/kvm_emul.S | 37 +++++++++++++++++++++++++++++++++++++
arch/powerpc/platforms/Kconfig | 10 ++++++++++
5 files changed, 66 insertions(+), 1 deletions(-)
create mode 100644 arch/powerpc/kernel/kvm_emul.S
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5ea853d..d8e29b4 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -125,7 +125,7 @@ ifneq ($(CONFIG_XMON)$(CONFIG_KEXEC),)
obj-y += ppc_save_regs.o
endif
-obj-$(CONFIG_KVM_GUEST) += kvm.o
+obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o
# Disable GCOV in odd or sensitive code
GCOV_PROFILE_prom_init.o := n
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index a55d47e..e3e740b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -465,6 +465,21 @@ int main(void)
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
#endif /* CONFIG_PPC_BOOK3S */
#endif
+
+#ifdef CONFIG_KVM_GUEST
+ DEFINE(KVM_MAGIC_SCRATCH1, offsetof(struct kvm_vcpu_arch_shared,
+ scratch1));
+ DEFINE(KVM_MAGIC_SCRATCH2, offsetof(struct kvm_vcpu_arch_shared,
+ scratch2));
+ DEFINE(KVM_MAGIC_SCRATCH3, offsetof(struct kvm_vcpu_arch_shared,
+ scratch3));
+ DEFINE(KVM_MAGIC_INT, offsetof(struct kvm_vcpu_arch_shared,
+ int_pending));
+ DEFINE(KVM_MAGIC_MSR, offsetof(struct kvm_vcpu_arch_shared, msr));
+ DEFINE(KVM_MAGIC_CRITICAL, offsetof(struct kvm_vcpu_arch_shared,
+ critical));
+#endif
+
#ifdef CONFIG_44x
DEFINE(PGD_T_LOG2, PGD_T_LOG2);
DEFINE(PTE_T_LOG2, PTE_T_LOG2);
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 4f85505..a5ece71 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -30,6 +30,9 @@
#include <asm/cacheflush.h>
#include <asm/disassemble.h>
+#define KVM_MAGIC_PAGE (-4096L)
+#define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
+
unsigned long kvm_hypercall(unsigned long *in,
unsigned long *out,
unsigned long nr)
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
new file mode 100644
index 0000000..e0d4183
--- /dev/null
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -0,0 +1,37 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2010
+ *
+ * Authors: Alexander Graf <agraf@suse.de>
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+
+/* Hypercall entry point. Will be patched with device tree instructions. */
+
+.global kvm_hypercall_start
+kvm_hypercall_start:
+ li r3, -1
+ nop
+ nop
+ nop
+ blr
+
+#define KVM_MAGIC_PAGE (-4096)
+
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index d1663db..1744349 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -21,6 +21,16 @@ source "arch/powerpc/platforms/44x/Kconfig"
source "arch/powerpc/platforms/40x/Kconfig"
source "arch/powerpc/platforms/amigaone/Kconfig"
+config KVM_GUEST
+ bool "KVM Guest support"
+ default y
+ ---help---
+ This option enables various optimizations for running under the KVM
+ hypervisor. Overhead for the kernel when not running inside KVM should
+ be minimal.
+
+ In case of doubt, say Y
+
config PPC_NATIVE
bool
depends on 6xx || PPC64
--
1.6.0.2
^ permalink raw reply related
* [PATCH 23/27] KVM: PPC: PV mtmsrd L=1
From: Alexander Graf @ 2010-07-29 12:48 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280407688-9815-1-git-send-email-agraf@suse.de>
The PowerPC ISA has a special instruction for mtmsr that only changes the EE
and RI bits, namely the L=1 form.
Since that one is reasonably often occuring and simple to implement, let's
go with this first. Writing EE=0 is always just a store. Doing EE=1 also
requires us to check for pending interrupts and if necessary exit back to the
hypervisor.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- use kvm_patch_ins_b
---
arch/powerpc/kernel/kvm.c | 45 ++++++++++++++++++++++++++++++++
arch/powerpc/kernel/kvm_emul.S | 56 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 239a70d..717ab0d 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -63,6 +63,7 @@
#define KVM_INST_MTSPR_DSISR 0x7c1203a6
#define KVM_INST_TLBSYNC 0x7c00046c
+#define KVM_INST_MTMSRD_L1 0x7c010164
static bool kvm_patching_worked = true;
static char kvm_tmp[1024 * 1024];
@@ -138,6 +139,43 @@ static u32 *kvm_alloc(int len)
return p;
}
+extern u32 kvm_emulate_mtmsrd_branch_offs;
+extern u32 kvm_emulate_mtmsrd_reg_offs;
+extern u32 kvm_emulate_mtmsrd_len;
+extern u32 kvm_emulate_mtmsrd[];
+
+static void kvm_patch_ins_mtmsrd(u32 *inst, u32 rt)
+{
+ u32 *p;
+ int distance_start;
+ int distance_end;
+ ulong next_inst;
+
+ p = kvm_alloc(kvm_emulate_mtmsrd_len * 4);
+ if (!p)
+ return;
+
+ /* Find out where we are and put everything there */
+ distance_start = (ulong)p - (ulong)inst;
+ next_inst = ((ulong)inst + 4);
+ distance_end = next_inst - (ulong)&p[kvm_emulate_mtmsrd_branch_offs];
+
+ /* Make sure we only write valid b instructions */
+ if (distance_start > KVM_INST_B_MAX) {
+ kvm_patching_worked = false;
+ return;
+ }
+
+ /* Modify the chunk to fit the invocation */
+ memcpy(p, kvm_emulate_mtmsrd, kvm_emulate_mtmsrd_len * 4);
+ p[kvm_emulate_mtmsrd_branch_offs] |= distance_end & KVM_INST_B_MASK;
+ p[kvm_emulate_mtmsrd_reg_offs] |= rt;
+ flush_icache_range((ulong)p, (ulong)p + kvm_emulate_mtmsrd_len * 4);
+
+ /* Patch the invocation */
+ kvm_patch_ins_b(inst, distance_start);
+}
+
static void kvm_map_magic_page(void *data)
{
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -211,6 +249,13 @@ static void kvm_check_ins(u32 *inst)
case KVM_INST_TLBSYNC:
kvm_patch_ins_nop(inst);
break;
+
+ /* Rewrites */
+ case KVM_INST_MTMSRD_L1:
+ /* We use r30 and r31 during the hook */
+ if (get_rt(inst_rt) < 30)
+ kvm_patch_ins_mtmsrd(inst, inst_rt);
+ break;
}
switch (_inst) {
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 1dac72d..10dc4a6 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -64,3 +64,59 @@ kvm_hypercall_start:
/* Disable critical section. We are critical if \
shared->critical == r1 and r2 is always != r1 */ \
STL64(r2, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);
+
+.global kvm_emulate_mtmsrd
+kvm_emulate_mtmsrd:
+
+ SCRATCH_SAVE
+
+ /* Put MSR & ~(MSR_EE|MSR_RI) in r31 */
+ LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+ lis r30, (~(MSR_EE | MSR_RI))@h
+ ori r30, r30, (~(MSR_EE | MSR_RI))@l
+ and r31, r31, r30
+
+ /* OR the register's (MSR_EE|MSR_RI) on MSR */
+kvm_emulate_mtmsrd_reg:
+ andi. r30, r0, (MSR_EE|MSR_RI)
+ or r31, r31, r30
+
+ /* Put MSR back into magic page */
+ STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+ /* Check if we have to fetch an interrupt */
+ lwz r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0)
+ cmpwi r31, 0
+ beq+ no_check
+
+ /* Check if we may trigger an interrupt */
+ andi. r30, r30, MSR_EE
+ beq no_check
+
+ SCRATCH_RESTORE
+
+ /* Nag hypervisor */
+ tlbsync
+
+ b kvm_emulate_mtmsrd_branch
+
+no_check:
+
+ SCRATCH_RESTORE
+
+ /* Go back to caller */
+kvm_emulate_mtmsrd_branch:
+ b .
+kvm_emulate_mtmsrd_end:
+
+.global kvm_emulate_mtmsrd_branch_offs
+kvm_emulate_mtmsrd_branch_offs:
+ .long (kvm_emulate_mtmsrd_branch - kvm_emulate_mtmsrd) / 4
+
+.global kvm_emulate_mtmsrd_reg_offs
+kvm_emulate_mtmsrd_reg_offs:
+ .long (kvm_emulate_mtmsrd_reg - kvm_emulate_mtmsrd) / 4
+
+.global kvm_emulate_mtmsrd_len
+kvm_emulate_mtmsrd_len:
+ .long (kvm_emulate_mtmsrd_end - kvm_emulate_mtmsrd) / 4
--
1.6.0.2
^ permalink raw reply related
* [PATCH 13/27] KVM: PPC: Magic Page Book3s support
From: Alexander Graf @ 2010-07-29 12:47 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280407688-9815-1-git-send-email-agraf@suse.de>
We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.
In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v2 -> v3:
- RMO -> PAM
- combine 32 and 64 real page magic override
- remove leftover goto point
- align hypercalls to in/out of ePAPR
---
arch/powerpc/include/asm/kvm_book3s.h | 1 +
arch/powerpc/kvm/book3s.c | 35 ++++++++++++++++++++++++++++++--
arch/powerpc/kvm/book3s_32_mmu.c | 16 +++++++++++++++
arch/powerpc/kvm/book3s_32_mmu_host.c | 2 +-
arch/powerpc/kvm/book3s_64_mmu.c | 30 +++++++++++++++++++++++++++-
arch/powerpc/kvm/book3s_64_mmu_host.c | 9 +------
6 files changed, 81 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index b5b1961..00cf8b0 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -130,6 +130,7 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
bool upper, u32 val);
extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu);
+extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
extern u32 kvmppc_trampoline_lowmem;
extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 0ed5376..eee97b5 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -419,6 +419,25 @@ void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
}
}
+pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+ ulong mp_pa = vcpu->arch.magic_page_pa;
+
+ /* Magic page override */
+ if (unlikely(mp_pa) &&
+ unlikely(((gfn << PAGE_SHIFT) & KVM_PAM) ==
+ ((mp_pa & PAGE_MASK) & KVM_PAM))) {
+ ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
+ pfn_t pfn;
+
+ pfn = (pfn_t)virt_to_phys((void*)shared_page) >> PAGE_SHIFT;
+ get_page(pfn_to_page(pfn));
+ return pfn;
+ }
+
+ return gfn_to_pfn(vcpu->kvm, gfn);
+}
+
/* Book3s_32 CPUs always have 32 bytes cache line size, which Linux assumes. To
* make Book3s_32 Linux work on Book3s_64, we have to make sure we trap dcbz to
* emulate 32 bytes dcbz length.
@@ -554,6 +573,13 @@ mmio:
static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
{
+ ulong mp_pa = vcpu->arch.magic_page_pa;
+
+ if (unlikely(mp_pa) &&
+ unlikely((mp_pa & KVM_PAM) >> PAGE_SHIFT == gfn)) {
+ return 1;
+ }
+
return kvm_is_visible_gfn(vcpu->kvm, gfn);
}
@@ -1257,6 +1283,7 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
struct kvmppc_vcpu_book3s *vcpu_book3s;
struct kvm_vcpu *vcpu;
int err = -ENOMEM;
+ unsigned long p;
vcpu_book3s = vmalloc(sizeof(struct kvmppc_vcpu_book3s));
if (!vcpu_book3s)
@@ -1274,8 +1301,10 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
if (err)
goto free_shadow_vcpu;
- vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
- if (!vcpu->arch.shared)
+ p = __get_free_page(GFP_KERNEL|__GFP_ZERO);
+ /* the real shared page fills the last 4k of our page */
+ vcpu->arch.shared = (void*)(p + PAGE_SIZE - 4096);
+ if (!p)
goto uninit_vcpu;
vcpu->arch.host_retip = kvm_return_point;
@@ -1322,7 +1351,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
{
struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
- free_page((unsigned long)vcpu->arch.shared);
+ free_page((unsigned long)vcpu->arch.shared & PAGE_MASK);
kvm_vcpu_uninit(vcpu);
kfree(vcpu_book3s->shadow_vcpu);
vfree(vcpu_book3s);
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index 449bce5..a7d121a 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -281,8 +281,24 @@ static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
struct kvmppc_pte *pte, bool data)
{
int r;
+ ulong mp_ea = vcpu->arch.magic_page_ea;
pte->eaddr = eaddr;
+
+ /* Magic page override */
+ if (unlikely(mp_ea) &&
+ unlikely((eaddr & ~0xfffULL) == (mp_ea & ~0xfffULL)) &&
+ !(vcpu->arch.shared->msr & MSR_PR)) {
+ pte->vpage = kvmppc_mmu_book3s_32_ea_to_vp(vcpu, eaddr, data);
+ pte->raddr = vcpu->arch.magic_page_pa | (pte->raddr & 0xfff);
+ pte->raddr &= KVM_PAM;
+ pte->may_execute = true;
+ pte->may_read = true;
+ pte->may_write = true;
+
+ return 0;
+ }
+
r = kvmppc_mmu_book3s_32_xlate_bat(vcpu, eaddr, pte, data);
if (r < 0)
r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte, data, true);
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 67b8c38..05e8c9e 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -147,7 +147,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
struct hpte_cache *pte;
/* Get host physical address for gpa */
- hpaddr = gfn_to_pfn(vcpu->kvm, orig_pte->raddr >> PAGE_SHIFT);
+ hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
if (kvm_is_error_hva(hpaddr)) {
printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
orig_pte->eaddr);
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index 58aa840..d7889ef 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -163,6 +163,22 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
bool found = false;
bool perm_err = false;
int second = 0;
+ ulong mp_ea = vcpu->arch.magic_page_ea;
+
+ /* Magic page override */
+ if (unlikely(mp_ea) &&
+ unlikely((eaddr & ~0xfffULL) == (mp_ea & ~0xfffULL)) &&
+ !(vcpu->arch.shared->msr & MSR_PR)) {
+ gpte->eaddr = eaddr;
+ gpte->vpage = kvmppc_mmu_book3s_64_ea_to_vp(vcpu, eaddr, data);
+ gpte->raddr = vcpu->arch.magic_page_pa | (gpte->raddr & 0xfff);
+ gpte->raddr &= KVM_PAM;
+ gpte->may_execute = true;
+ gpte->may_read = true;
+ gpte->may_write = true;
+
+ return 0;
+ }
slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu_book3s, eaddr);
if (!slbe)
@@ -445,6 +461,7 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid,
ulong ea = esid << SID_SHIFT;
struct kvmppc_slb *slb;
u64 gvsid = esid;
+ ulong mp_ea = vcpu->arch.magic_page_ea;
if (vcpu->arch.shared->msr & (MSR_DR|MSR_IR)) {
slb = kvmppc_mmu_book3s_64_find_slbe(to_book3s(vcpu), ea);
@@ -464,7 +481,7 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid,
break;
case MSR_DR|MSR_IR:
if (!slb)
- return -ENOENT;
+ goto no_slb;
*vsid = gvsid;
break;
@@ -477,6 +494,17 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid,
*vsid |= VSID_PR;
return 0;
+
+no_slb:
+ /* Catch magic page case */
+ if (unlikely(mp_ea) &&
+ unlikely(esid == (mp_ea >> SID_SHIFT)) &&
+ !(vcpu->arch.shared->msr & MSR_PR)) {
+ *vsid = VSID_REAL | esid;
+ return 0;
+ }
+
+ return -EINVAL;
}
static bool kvmppc_mmu_book3s_64_is_dcbz32(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 71c1f90..6cdd19a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -101,18 +101,13 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
struct kvmppc_sid_map *map;
/* Get host physical address for gpa */
- hpaddr = gfn_to_pfn(vcpu->kvm, orig_pte->raddr >> PAGE_SHIFT);
+ hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
if (kvm_is_error_hva(hpaddr)) {
printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n", orig_pte->eaddr);
return -EINVAL;
}
hpaddr <<= PAGE_SHIFT;
-#if PAGE_SHIFT == 12
-#elif PAGE_SHIFT == 16
- hpaddr |= orig_pte->raddr & 0xf000;
-#else
-#error Unknown page size
-#endif
+ hpaddr |= orig_pte->raddr & (~0xfffULL & ~PAGE_MASK);
/* and write the mapping ea -> hpa into the pt */
vcpu->arch.mmu.esid_to_vsid(vcpu, orig_pte->eaddr >> SID_SHIFT, &vsid);
--
1.6.0.2
^ permalink raw reply related
* [PATCH 19/27] KVM: PPC: PV tlbsync to nop
From: Alexander Graf @ 2010-07-29 12:48 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280407688-9815-1-git-send-email-agraf@suse.de>
With our current MMU scheme we don't need to know about the tlbsync instruction.
So we can just nop it out.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- use kvm_patch_ins
---
arch/powerpc/kernel/kvm.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 9ec572c..3258922 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -62,6 +62,8 @@
#define KVM_INST_MTSPR_DAR 0x7c1303a6
#define KVM_INST_MTSPR_DSISR 0x7c1203a6
+#define KVM_INST_TLBSYNC 0x7c00046c
+
static bool kvm_patching_worked = true;
static inline void kvm_patch_ins(u32 *inst, u32 new_inst)
@@ -98,6 +100,11 @@ static void kvm_patch_ins_stw(u32 *inst, long addr, u32 rt)
kvm_patch_ins(inst, KVM_INST_STW | rt | (addr & 0x0000fffc));
}
+static void kvm_patch_ins_nop(u32 *inst)
+{
+ kvm_patch_ins(inst, KVM_INST_NOP);
+}
+
static void kvm_map_magic_page(void *data)
{
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -166,6 +173,11 @@ static void kvm_check_ins(u32 *inst)
case KVM_INST_MTSPR_DSISR:
kvm_patch_ins_stw(inst, magic_var(dsisr), inst_rt);
break;
+
+ /* Nops */
+ case KVM_INST_TLBSYNC:
+ kvm_patch_ins_nop(inst);
+ break;
}
switch (_inst) {
--
1.6.0.2
^ permalink raw reply related
* [PATCH 25/27] KVM: PPC: PV wrteei
From: Alexander Graf @ 2010-07-29 12:48 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280407688-9815-1-git-send-email-agraf@suse.de>
On BookE the preferred way to write the EE bit is the wrteei instruction. It
already encodes the EE bit in the instruction.
So in order to get BookE some speedups as well, let's also PV'nize thati
instruction.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- use kvm_patch_ins_b
---
arch/powerpc/kernel/kvm.c | 50 ++++++++++++++++++++++++++++++++++++++++
arch/powerpc/kernel/kvm_emul.S | 41 ++++++++++++++++++++++++++++++++
2 files changed, 91 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 8ac57e2..e936817 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -67,6 +67,9 @@
#define KVM_INST_MTMSRD_L1 0x7c010164
#define KVM_INST_MTMSR 0x7c000124
+#define KVM_INST_WRTEEI_0 0x7c000146
+#define KVM_INST_WRTEEI_1 0x7c008146
+
static bool kvm_patching_worked = true;
static char kvm_tmp[1024 * 1024];
static int kvm_tmp_index;
@@ -221,6 +224,47 @@ static void kvm_patch_ins_mtmsr(u32 *inst, u32 rt)
kvm_patch_ins_b(inst, distance_start);
}
+#ifdef CONFIG_BOOKE
+
+extern u32 kvm_emulate_wrteei_branch_offs;
+extern u32 kvm_emulate_wrteei_ee_offs;
+extern u32 kvm_emulate_wrteei_len;
+extern u32 kvm_emulate_wrteei[];
+
+static void kvm_patch_ins_wrteei(u32 *inst)
+{
+ u32 *p;
+ int distance_start;
+ int distance_end;
+ ulong next_inst;
+
+ p = kvm_alloc(kvm_emulate_wrteei_len * 4);
+ if (!p)
+ return;
+
+ /* Find out where we are and put everything there */
+ distance_start = (ulong)p - (ulong)inst;
+ next_inst = ((ulong)inst + 4);
+ distance_end = next_inst - (ulong)&p[kvm_emulate_wrteei_branch_offs];
+
+ /* Make sure we only write valid b instructions */
+ if (distance_start > KVM_INST_B_MAX) {
+ kvm_patching_worked = false;
+ return;
+ }
+
+ /* Modify the chunk to fit the invocation */
+ memcpy(p, kvm_emulate_wrteei, kvm_emulate_wrteei_len * 4);
+ p[kvm_emulate_wrteei_branch_offs] |= distance_end & KVM_INST_B_MASK;
+ p[kvm_emulate_wrteei_ee_offs] |= (*inst & MSR_EE);
+ flush_icache_range((ulong)p, (ulong)p + kvm_emulate_wrteei_len * 4);
+
+ /* Patch the invocation */
+ kvm_patch_ins_b(inst, distance_start);
+}
+
+#endif
+
static void kvm_map_magic_page(void *data)
{
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -310,6 +354,12 @@ static void kvm_check_ins(u32 *inst)
}
switch (_inst) {
+#ifdef CONFIG_BOOKE
+ case KVM_INST_WRTEEI_0:
+ case KVM_INST_WRTEEI_1:
+ kvm_patch_ins_wrteei(inst);
+ break;
+#endif
}
}
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 8cd22f4..3199f65 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -204,3 +204,44 @@ kvm_emulate_mtmsr_orig_ins_offs:
.global kvm_emulate_mtmsr_len
kvm_emulate_mtmsr_len:
.long (kvm_emulate_mtmsr_end - kvm_emulate_mtmsr) / 4
+
+
+
+.global kvm_emulate_wrteei
+kvm_emulate_wrteei:
+
+ SCRATCH_SAVE
+
+ /* Fetch old MSR in r31 */
+ LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+ /* Remove MSR_EE from old MSR */
+ li r30, 0
+ ori r30, r30, MSR_EE
+ andc r31, r31, r30
+
+ /* OR new MSR_EE onto the old MSR */
+kvm_emulate_wrteei_ee:
+ ori r31, r31, 0
+
+ /* Write new MSR value back */
+ STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+ SCRATCH_RESTORE
+
+ /* Go back to caller */
+kvm_emulate_wrteei_branch:
+ b .
+kvm_emulate_wrteei_end:
+
+.global kvm_emulate_wrteei_branch_offs
+kvm_emulate_wrteei_branch_offs:
+ .long (kvm_emulate_wrteei_branch - kvm_emulate_wrteei) / 4
+
+.global kvm_emulate_wrteei_ee_offs
+kvm_emulate_wrteei_ee_offs:
+ .long (kvm_emulate_wrteei_ee - kvm_emulate_wrteei) / 4
+
+.global kvm_emulate_wrteei_len
+kvm_emulate_wrteei_len:
+ .long (kvm_emulate_wrteei_end - kvm_emulate_wrteei) / 4
--
1.6.0.2
^ permalink raw reply related
* [PATCH 1/7] KVM: PPC: Book3S_32 MMU debug compile fixes
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
Due to previous changes, the Book3S_32 guest MMU code didn't compile properly
when enabling debugging.
This patch repairs the broken code paths, making it possible to define DEBUG_MMU
and friends again.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_32_mmu.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index a7d121a..5bf4bf8 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -104,7 +104,7 @@ static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3
pteg = (vcpu_book3s->sdr1 & 0xffff0000) | hash;
dprintk("MMU: pc=0x%lx eaddr=0x%lx sdr1=0x%llx pteg=0x%x vsid=0x%x\n",
- vcpu_book3s->vcpu.arch.pc, eaddr, vcpu_book3s->sdr1, pteg,
+ kvmppc_get_pc(&vcpu_book3s->vcpu), eaddr, vcpu_book3s->sdr1, pteg,
sre->vsid);
r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
@@ -269,7 +269,7 @@ no_page_found:
dprintk_pte("KVM MMU: No PTE found (sdr1=0x%llx ptegp=0x%lx)\n",
to_book3s(vcpu)->sdr1, ptegp);
for (i=0; i<16; i+=2) {
- dprintk_pte(" %02d: 0x%x - 0x%x (0x%llx)\n",
+ dprintk_pte(" %02d: 0x%x - 0x%x (0x%x)\n",
i, pteg[i], pteg[i+1], ptem);
}
}
--
1.6.0.2
^ permalink raw reply related
* [PATCH 7/7] KVM: PPC: Move KVM trampolines before __end_interrupts
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
When using a relocatable kernel we need to make sure that the trampline code
and the interrupt handlers are both copied to low memory. The only way to do
this reliably is to put them in the copied section.
This patch should make relocated kernels work with KVM.
KVM-Stable-Tag
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kernel/exceptions-64s.S | 6 ++++++
arch/powerpc/kernel/head_64.S | 6 ------
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 3e423fb..a0f25fb 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -299,6 +299,12 @@ slb_miss_user_pseries:
b . /* prevent spec. execution */
#endif /* __DISABLED__ */
+/* KVM's trampoline code needs to be close to the interrupt handlers */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+#include "../kvm/book3s_rmhandlers.S"
+#endif
+
.align 7
.globl __end_interrupts
__end_interrupts:
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 844a44b..d3010a3 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -166,12 +166,6 @@ exception_marker:
#include "exceptions-64s.S"
#endif
-/* KVM trampoline code needs to be close to the interrupt handlers */
-
-#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
-#include "../kvm/book3s_rmhandlers.S"
-#endif
-
_GLOBAL(generic_secondary_thread_init)
mr r24,r3
--
1.6.0.2
^ permalink raw reply related
* [PATCH 4/7] KVM: PPC: Add book3s_32 tlbie flush acceleration
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
On Book3s_32 the tlbie instruction flushed effective addresses by the mask
0x0ffff000. This is pretty hard to reflect with a hash that hashes ~0xfff, so
to speed up that target we should also keep a special hash around for it.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_host.h | 4 +++
arch/powerpc/kvm/book3s_mmu_hpte.c | 40 ++++++++++++++++++++++++++++++----
2 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index fafc71a..bba3b9b 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -42,9 +42,11 @@
#define HPTEG_CACHE_NUM (1 << 15)
#define HPTEG_HASH_BITS_PTE 13
+#define HPTEG_HASH_BITS_PTE_LONG 12
#define HPTEG_HASH_BITS_VPTE 13
#define HPTEG_HASH_BITS_VPTE_LONG 5
#define HPTEG_HASH_NUM_PTE (1 << HPTEG_HASH_BITS_PTE)
+#define HPTEG_HASH_NUM_PTE_LONG (1 << HPTEG_HASH_BITS_PTE_LONG)
#define HPTEG_HASH_NUM_VPTE (1 << HPTEG_HASH_BITS_VPTE)
#define HPTEG_HASH_NUM_VPTE_LONG (1 << HPTEG_HASH_BITS_VPTE_LONG)
@@ -163,6 +165,7 @@ struct kvmppc_mmu {
struct hpte_cache {
struct hlist_node list_pte;
+ struct hlist_node list_pte_long;
struct hlist_node list_vpte;
struct hlist_node list_vpte_long;
struct rcu_head rcu_head;
@@ -293,6 +296,7 @@ struct kvm_vcpu_arch {
#ifdef CONFIG_PPC_BOOK3S
struct hlist_head hpte_hash_pte[HPTEG_HASH_NUM_PTE];
+ struct hlist_head hpte_hash_pte_long[HPTEG_HASH_NUM_PTE_LONG];
struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE];
struct hlist_head hpte_hash_vpte_long[HPTEG_HASH_NUM_VPTE_LONG];
int hpte_cache_count;
diff --git a/arch/powerpc/kvm/book3s_mmu_hpte.c b/arch/powerpc/kvm/book3s_mmu_hpte.c
index b643893..02c64ab 100644
--- a/arch/powerpc/kvm/book3s_mmu_hpte.c
+++ b/arch/powerpc/kvm/book3s_mmu_hpte.c
@@ -45,6 +45,12 @@ static inline u64 kvmppc_mmu_hash_pte(u64 eaddr)
return hash_64(eaddr >> PTE_SIZE, HPTEG_HASH_BITS_PTE);
}
+static inline u64 kvmppc_mmu_hash_pte_long(u64 eaddr)
+{
+ return hash_64((eaddr & 0x0ffff000) >> PTE_SIZE,
+ HPTEG_HASH_BITS_PTE_LONG);
+}
+
static inline u64 kvmppc_mmu_hash_vpte(u64 vpage)
{
return hash_64(vpage & 0xfffffffffULL, HPTEG_HASH_BITS_VPTE);
@@ -66,6 +72,11 @@ void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
index = kvmppc_mmu_hash_pte(pte->pte.eaddr);
hlist_add_head_rcu(&pte->list_pte, &vcpu->arch.hpte_hash_pte[index]);
+ /* Add to ePTE_long list */
+ index = kvmppc_mmu_hash_pte_long(pte->pte.eaddr);
+ hlist_add_head_rcu(&pte->list_pte_long,
+ &vcpu->arch.hpte_hash_pte_long[index]);
+
/* Add to vPTE list */
index = kvmppc_mmu_hash_vpte(pte->pte.vpage);
hlist_add_head_rcu(&pte->list_vpte, &vcpu->arch.hpte_hash_vpte[index]);
@@ -99,6 +110,7 @@ static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
spin_lock(&vcpu->arch.mmu_lock);
hlist_del_init_rcu(&pte->list_pte);
+ hlist_del_init_rcu(&pte->list_pte_long);
hlist_del_init_rcu(&pte->list_vpte);
hlist_del_init_rcu(&pte->list_vpte_long);
@@ -150,10 +162,28 @@ static void kvmppc_mmu_pte_flush_page(struct kvm_vcpu *vcpu, ulong guest_ea)
rcu_read_unlock();
}
-void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
+static void kvmppc_mmu_pte_flush_long(struct kvm_vcpu *vcpu, ulong guest_ea)
{
- u64 i;
+ struct hlist_head *list;
+ struct hlist_node *node;
+ struct hpte_cache *pte;
+
+ /* Find the list of entries in the map */
+ list = &vcpu->arch.hpte_hash_pte_long[
+ kvmppc_mmu_hash_pte_long(guest_ea)];
+ rcu_read_lock();
+
+ /* Check the list for matching entries and invalidate */
+ hlist_for_each_entry_rcu(pte, node, list, list_pte_long)
+ if ((pte->pte.eaddr & 0x0ffff000UL) == guest_ea)
+ invalidate_pte(vcpu, pte);
+
+ rcu_read_unlock();
+}
+
+void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
+{
dprintk_mmu("KVM: Flushing %d Shadow PTEs: 0x%lx & 0x%lx\n",
vcpu->arch.hpte_cache_count, guest_ea, ea_mask);
@@ -164,9 +194,7 @@ void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
kvmppc_mmu_pte_flush_page(vcpu, guest_ea);
break;
case 0x0ffff000:
- /* 32-bit flush w/o segment, go through all possible segments */
- for (i = 0; i < 0x100000000ULL; i += 0x10000000ULL)
- kvmppc_mmu_pte_flush(vcpu, guest_ea | i, ~0xfffUL);
+ kvmppc_mmu_pte_flush_long(vcpu, guest_ea);
break;
case 0:
/* Doing a complete flush -> start from scratch */
@@ -292,6 +320,8 @@ int kvmppc_mmu_hpte_init(struct kvm_vcpu *vcpu)
/* init hpte lookup hashes */
kvmppc_mmu_hpte_init_hash(vcpu->arch.hpte_hash_pte,
ARRAY_SIZE(vcpu->arch.hpte_hash_pte));
+ kvmppc_mmu_hpte_init_hash(vcpu->arch.hpte_hash_pte_long,
+ ARRAY_SIZE(vcpu->arch.hpte_hash_pte_long));
kvmppc_mmu_hpte_init_hash(vcpu->arch.hpte_hash_vpte,
ARRAY_SIZE(vcpu->arch.hpte_hash_vpte));
kvmppc_mmu_hpte_init_hash(vcpu->arch.hpte_hash_vpte_long,
--
1.6.0.2
^ permalink raw reply related
* [PATCH 2/7] KVM: PPC: RCU'ify the Book3s MMU
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
So far we've been running all code without locking of any sort. This wasn't
really an issue because I didn't see any parallel access to the shadow MMU
code coming.
But then I started to implement dirty bitmapping to MOL which has the video
code in its own thread, so suddenly we had the dirty bitmap code run in
parallel to the shadow mmu code. And with that came trouble.
So I went ahead and made the MMU modifying functions as parallelizable as
I could think of. I hope I didn't screw up too much RCU logic :-). If you
know your way around RCU and locking and what needs to be done when, please
take a look at this patch.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_host.h | 2 +
arch/powerpc/kvm/book3s_mmu_hpte.c | 78 ++++++++++++++++++++++++++--------
2 files changed, 61 insertions(+), 19 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e1da775..fafc71a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -165,6 +165,7 @@ struct hpte_cache {
struct hlist_node list_pte;
struct hlist_node list_vpte;
struct hlist_node list_vpte_long;
+ struct rcu_head rcu_head;
u64 host_va;
u64 pfn;
ulong slot;
@@ -295,6 +296,7 @@ struct kvm_vcpu_arch {
struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE];
struct hlist_head hpte_hash_vpte_long[HPTEG_HASH_NUM_VPTE_LONG];
int hpte_cache_count;
+ spinlock_t mmu_lock;
#endif
};
diff --git a/arch/powerpc/kvm/book3s_mmu_hpte.c b/arch/powerpc/kvm/book3s_mmu_hpte.c
index 4868d4a..b643893 100644
--- a/arch/powerpc/kvm/book3s_mmu_hpte.c
+++ b/arch/powerpc/kvm/book3s_mmu_hpte.c
@@ -60,68 +60,94 @@ void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
{
u64 index;
+ spin_lock(&vcpu->arch.mmu_lock);
+
/* Add to ePTE list */
index = kvmppc_mmu_hash_pte(pte->pte.eaddr);
- hlist_add_head(&pte->list_pte, &vcpu->arch.hpte_hash_pte[index]);
+ hlist_add_head_rcu(&pte->list_pte, &vcpu->arch.hpte_hash_pte[index]);
/* Add to vPTE list */
index = kvmppc_mmu_hash_vpte(pte->pte.vpage);
- hlist_add_head(&pte->list_vpte, &vcpu->arch.hpte_hash_vpte[index]);
+ hlist_add_head_rcu(&pte->list_vpte, &vcpu->arch.hpte_hash_vpte[index]);
/* Add to vPTE_long list */
index = kvmppc_mmu_hash_vpte_long(pte->pte.vpage);
- hlist_add_head(&pte->list_vpte_long,
- &vcpu->arch.hpte_hash_vpte_long[index]);
+ hlist_add_head_rcu(&pte->list_vpte_long,
+ &vcpu->arch.hpte_hash_vpte_long[index]);
+
+ spin_unlock(&vcpu->arch.mmu_lock);
+}
+
+static void free_pte_rcu(struct rcu_head *head)
+{
+ struct hpte_cache *pte = container_of(head, struct hpte_cache, rcu_head);
+ kmem_cache_free(hpte_cache, pte);
}
static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
{
+ /* pte already invalidated? */
+ if (hlist_unhashed(&pte->list_pte))
+ return;
+
dprintk_mmu("KVM: Flushing SPT: 0x%lx (0x%llx) -> 0x%llx\n",
pte->pte.eaddr, pte->pte.vpage, pte->host_va);
/* Different for 32 and 64 bit */
kvmppc_mmu_invalidate_pte(vcpu, pte);
+ spin_lock(&vcpu->arch.mmu_lock);
+
+ hlist_del_init_rcu(&pte->list_pte);
+ hlist_del_init_rcu(&pte->list_vpte);
+ hlist_del_init_rcu(&pte->list_vpte_long);
+
+ spin_unlock(&vcpu->arch.mmu_lock);
+
if (pte->pte.may_write)
kvm_release_pfn_dirty(pte->pfn);
else
kvm_release_pfn_clean(pte->pfn);
- hlist_del(&pte->list_pte);
- hlist_del(&pte->list_vpte);
- hlist_del(&pte->list_vpte_long);
-
vcpu->arch.hpte_cache_count--;
- kmem_cache_free(hpte_cache, pte);
+ call_rcu(&pte->rcu_head, free_pte_rcu);
}
static void kvmppc_mmu_pte_flush_all(struct kvm_vcpu *vcpu)
{
struct hpte_cache *pte;
- struct hlist_node *node, *tmp;
+ struct hlist_node *node;
int i;
+ rcu_read_lock();
+
for (i = 0; i < HPTEG_HASH_NUM_VPTE_LONG; i++) {
struct hlist_head *list = &vcpu->arch.hpte_hash_vpte_long[i];
- hlist_for_each_entry_safe(pte, node, tmp, list, list_vpte_long)
+ hlist_for_each_entry_rcu(pte, node, list, list_vpte_long)
invalidate_pte(vcpu, pte);
}
+
+ rcu_read_unlock();
}
static void kvmppc_mmu_pte_flush_page(struct kvm_vcpu *vcpu, ulong guest_ea)
{
struct hlist_head *list;
- struct hlist_node *node, *tmp;
+ struct hlist_node *node;
struct hpte_cache *pte;
/* Find the list of entries in the map */
list = &vcpu->arch.hpte_hash_pte[kvmppc_mmu_hash_pte(guest_ea)];
+ rcu_read_lock();
+
/* Check the list for matching entries and invalidate */
- hlist_for_each_entry_safe(pte, node, tmp, list, list_pte)
+ hlist_for_each_entry_rcu(pte, node, list, list_pte)
if ((pte->pte.eaddr & ~0xfffUL) == guest_ea)
invalidate_pte(vcpu, pte);
+
+ rcu_read_unlock();
}
void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
@@ -156,33 +182,41 @@ void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
static void kvmppc_mmu_pte_vflush_short(struct kvm_vcpu *vcpu, u64 guest_vp)
{
struct hlist_head *list;
- struct hlist_node *node, *tmp;
+ struct hlist_node *node;
struct hpte_cache *pte;
u64 vp_mask = 0xfffffffffULL;
list = &vcpu->arch.hpte_hash_vpte[kvmppc_mmu_hash_vpte(guest_vp)];
+ rcu_read_lock();
+
/* Check the list for matching entries and invalidate */
- hlist_for_each_entry_safe(pte, node, tmp, list, list_vpte)
+ hlist_for_each_entry_rcu(pte, node, list, list_vpte)
if ((pte->pte.vpage & vp_mask) == guest_vp)
invalidate_pte(vcpu, pte);
+
+ rcu_read_unlock();
}
/* Flush with mask 0xffffff000 */
static void kvmppc_mmu_pte_vflush_long(struct kvm_vcpu *vcpu, u64 guest_vp)
{
struct hlist_head *list;
- struct hlist_node *node, *tmp;
+ struct hlist_node *node;
struct hpte_cache *pte;
u64 vp_mask = 0xffffff000ULL;
list = &vcpu->arch.hpte_hash_vpte_long[
kvmppc_mmu_hash_vpte_long(guest_vp)];
+ rcu_read_lock();
+
/* Check the list for matching entries and invalidate */
- hlist_for_each_entry_safe(pte, node, tmp, list, list_vpte_long)
+ hlist_for_each_entry_rcu(pte, node, list, list_vpte_long)
if ((pte->pte.vpage & vp_mask) == guest_vp)
invalidate_pte(vcpu, pte);
+
+ rcu_read_unlock();
}
void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 guest_vp, u64 vp_mask)
@@ -206,21 +240,25 @@ void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 guest_vp, u64 vp_mask)
void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, ulong pa_start, ulong pa_end)
{
- struct hlist_node *node, *tmp;
+ struct hlist_node *node;
struct hpte_cache *pte;
int i;
dprintk_mmu("KVM: Flushing %d Shadow pPTEs: 0x%lx - 0x%lx\n",
vcpu->arch.hpte_cache_count, pa_start, pa_end);
+ rcu_read_lock();
+
for (i = 0; i < HPTEG_HASH_NUM_VPTE_LONG; i++) {
struct hlist_head *list = &vcpu->arch.hpte_hash_vpte_long[i];
- hlist_for_each_entry_safe(pte, node, tmp, list, list_vpte_long)
+ hlist_for_each_entry_rcu(pte, node, list, list_vpte_long)
if ((pte->pte.raddr >= pa_start) &&
(pte->pte.raddr < pa_end))
invalidate_pte(vcpu, pte);
}
+
+ rcu_read_unlock();
}
struct hpte_cache *kvmppc_mmu_hpte_cache_next(struct kvm_vcpu *vcpu)
@@ -259,6 +297,8 @@ int kvmppc_mmu_hpte_init(struct kvm_vcpu *vcpu)
kvmppc_mmu_hpte_init_hash(vcpu->arch.hpte_hash_vpte_long,
ARRAY_SIZE(vcpu->arch.hpte_hash_vpte_long));
+ spin_lock_init(&vcpu->arch.mmu_lock);
+
return 0;
}
--
1.6.0.2
^ permalink raw reply related
* [PATCH 5/7] KVM: PPC: Use MSR_DR for external load_up
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
Book3S_32 requires MSR_DR to be disabled during load_up_xxx while on Book3S_64
it's supposed to be enabled. I misread the code and disabled it in both cases,
potentially breaking the PS3 which has a really small RMA.
This patch makes KVM work on the PS3 again.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_rmhandlers.S | 28 +++++++++++++++++++---------
1 files changed, 19 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index 506d5c3..229d3d6 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -202,8 +202,25 @@ _GLOBAL(kvmppc_rmcall)
#if defined(CONFIG_PPC_BOOK3S_32)
#define STACK_LR INT_FRAME_SIZE+4
+
+/* load_up_xxx have to run with MSR_DR=0 on Book3S_32 */
+#define MSR_EXT_START \
+ PPC_STL r20, _NIP(r1); \
+ mfmsr r20; \
+ LOAD_REG_IMMEDIATE(r3, MSR_DR|MSR_EE); \
+ andc r3,r20,r3; /* Disable DR,EE */ \
+ mtmsr r3; \
+ sync
+
+#define MSR_EXT_END \
+ mtmsr r20; /* Enable DR,EE */ \
+ sync; \
+ PPC_LL r20, _NIP(r1)
+
#elif defined(CONFIG_PPC_BOOK3S_64)
#define STACK_LR _LINK
+#define MSR_EXT_START
+#define MSR_EXT_END
#endif
/*
@@ -215,19 +232,12 @@ _GLOBAL(kvmppc_load_up_ ## what); \
PPC_STLU r1, -INT_FRAME_SIZE(r1); \
mflr r3; \
PPC_STL r3, STACK_LR(r1); \
- PPC_STL r20, _NIP(r1); \
- mfmsr r20; \
- LOAD_REG_IMMEDIATE(r3, MSR_DR|MSR_EE); \
- andc r3,r20,r3; /* Disable DR,EE */ \
- mtmsr r3; \
- sync; \
+ MSR_EXT_START; \
\
bl FUNC(load_up_ ## what); \
\
- mtmsr r20; /* Enable DR,EE */ \
- sync; \
+ MSR_EXT_END; \
PPC_LL r3, STACK_LR(r1); \
- PPC_LL r20, _NIP(r1); \
mtlr r3; \
addi r1, r1, INT_FRAME_SIZE; \
blr
--
1.6.0.2
^ permalink raw reply related
* [PATCH 3/7] KVM: PPC: correctly check gfn_to_pfn() return value
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, Gleb Natapov, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
From: Gleb Natapov <gleb@redhat.com>
On failure gfn_to_pfn returns bad_page so use correct function to check
for that.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/kvm/book3s_32_mmu_host.c | 2 +-
arch/powerpc/kvm/book3s_64_mmu_host.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 05e8c9e..343452c 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -148,7 +148,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
/* Get host physical address for gpa */
hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
- if (kvm_is_error_hva(hpaddr)) {
+ if (is_error_pfn(hpaddr)) {
printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
orig_pte->eaddr);
return -EINVAL;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 6cdd19a..672b149 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -102,7 +102,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
/* Get host physical address for gpa */
hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
- if (kvm_is_error_hva(hpaddr)) {
+ if (is_error_pfn(hpaddr)) {
printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n", orig_pte->eaddr);
return -EINVAL;
}
--
1.6.0.2
^ permalink raw reply related
* [PATCH 0/7] Rest of my KVM-PPC patch queue
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
During the past few weeks a couple of fixes have gathered in my queue. This
is a dump of everything that is not related to the PV framework.
Please apply on top of the PV stuff.
Alexander Graf (6):
KVM: PPC: Book3S_32 MMU debug compile fixes
KVM: PPC: RCU'ify the Book3s MMU
KVM: PPC: Add book3s_32 tlbie flush acceleration
KVM: PPC: Use MSR_DR for external load_up
KVM: PPC: Make long relocations be ulong
KVM: PPC: Move KVM trampolines before __end_interrupts
Gleb Natapov (1):
KVM: PPC: correctly check gfn_to_pfn() return value
arch/powerpc/include/asm/kvm_book3s.h | 4 +-
arch/powerpc/include/asm/kvm_host.h | 6 ++
arch/powerpc/kernel/exceptions-64s.S | 6 ++
arch/powerpc/kernel/head_64.S | 6 --
arch/powerpc/kvm/book3s_32_mmu.c | 4 +-
arch/powerpc/kvm/book3s_32_mmu_host.c | 2 +-
arch/powerpc/kvm/book3s_64_mmu_host.c | 2 +-
arch/powerpc/kvm/book3s_mmu_hpte.c | 118 ++++++++++++++++++++++++++-------
arch/powerpc/kvm/book3s_rmhandlers.S | 32 ++++++---
9 files changed, 133 insertions(+), 47 deletions(-)
^ permalink raw reply
* [PATCH 6/7] KVM: PPC: Make long relocations be ulong
From: Alexander Graf @ 2010-07-29 13:04 UTC (permalink / raw)
To: kvm-ppc; +Cc: linuxppc-dev, KVM list
In-Reply-To: <1280408662-10328-1-git-send-email-agraf@suse.de>
On Book3S KVM we directly expose some asm pointers to C code as
variables. These need to be relocated and thus break on relocatable
kernels.
To make sure we can at least build, let's mark them as long instead
of u32 where 64bit relocations don't work.
This fixes the following build error:
WARNING: 2 bad relocations^M
> c000000000008590 R_PPC64_ADDR32 .text+0x4000000000008460^M
> c000000000008594 R_PPC64_ADDR32 .text+0x4000000000008598^M
Please keep in mind that actually using KVM on a relocated kernel
might still break. This only fixes the compile problem.
Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/powerpc/include/asm/kvm_book3s.h | 4 ++--
arch/powerpc/kvm/book3s_rmhandlers.S | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 00cf8b0..f04f516 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -132,8 +132,8 @@ extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu);
extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
-extern u32 kvmppc_trampoline_lowmem;
-extern u32 kvmppc_trampoline_enter;
+extern ulong kvmppc_trampoline_lowmem;
+extern ulong kvmppc_trampoline_enter;
extern void kvmppc_rmcall(ulong srr0, ulong srr1);
extern void kvmppc_load_up_fpu(void);
extern void kvmppc_load_up_altivec(void);
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index 229d3d6..2b9c908 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -252,10 +252,10 @@ define_load_up(vsx)
.global kvmppc_trampoline_lowmem
kvmppc_trampoline_lowmem:
- .long kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
+ PPC_LONG kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
.global kvmppc_trampoline_enter
kvmppc_trampoline_enter:
- .long kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
+ PPC_LONG kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
#include "book3s_segment.S"
--
1.6.0.2
^ permalink raw reply related
* Re: [PATCH v2 5/7] Add support for ramdisk on ppc32 for uImage-ppc and Elf-ppc
From: Matthew McClintock @ 2010-07-29 16:03 UTC (permalink / raw)
To: Simon Horman; +Cc: linuxppc-dev, kexec
In-Reply-To: <20100729083324.GC10021@verge.net.au>
On Jul 29, 2010, at 3:33 AM, Simon Horman wrote:
> On Tue, Jul 20, 2010 at 03:14:58PM -0500, Matthew McClintock wrote:
>> This fixes --reuseinitrd and --ramdisk option for ppc32 on
>> uImage-ppc and Elf. It works for normal kexec as well as for
>> kdump.
>>
>> When using --reuseinitrd you need to specifify retain_initrd
>> on the command line. Also, if you are doing kdump you need to make
>> sure your initrd lives in the crashdump region otherwise the
>> kdump kernel will not be able to access it. The --ramdisk option
>> should always work.
>
> Thanks, I have applied this change.
> I had to do a minor merge on the Makefile,
> could you verify that the result is correct?
>
Tested and looks good.
-M
^ permalink raw reply
* Re: [PATCH 0/2 v1.04] Add support for DWC OTG driver.
From: Fushen Chen @ 2010-07-29 16:26 UTC (permalink / raw)
To: Greg KH; +Cc: linuxppc-dev, linux-usb
In-Reply-To: <20100729030559.GA16667@suse.de>
[-- Attachment #1: Type: text/plain, Size: 715 bytes --]
> [PATCH 1/2 v1.04]
> 1. License information is under clarification.
I meant that APM is still working with Synopys to resolve the GPL License.
There is no result yet.
I'll change this line to "License issue is resolved." if that happens.
I modified other part of the patch according to other reviewer's comment.
Thanks,
Fushen
On Wed, Jul 28, 2010 at 8:05 PM, Greg KH <gregkh@suse.de> wrote:
> On Wed, Jul 28, 2010 at 05:28:41PM -0700, Fushen Chen wrote:
> > [PATCH 1/2 v1.04]
> > 1. License information is under clarification.
>
> What do you mean by this? I fail to see a change here, why just repost
> the same code again?
>
> What is being done to resolve the issues I outlined previously?
>
> greg k-h
>
[-- Attachment #2: Type: text/html, Size: 1084 bytes --]
^ permalink raw reply
* Re: [PATCH 4/6] regulator: Remove owner field from attribute initialization in regulator core driver
From: Mark Brown @ 2010-07-29 17:15 UTC (permalink / raw)
To: Guenter Roeck
Cc: Nick Cheng, Jani Nikula, Linus Walleij, James Smart, linuxppc-dev,
Eric W. Biederman, Greg Kroah-Hartman, linux-kernel, Chris Wright,
Tejun Heo, James E.J. Bottomley, Richard Purdie, Alex Iannicelli,
Dmitry Torokhov, Jean Delvare, Paul Mackerras, Benjamin Thery,
linux-scsi, Liam Girdwood
In-Reply-To: <1280380166-29196-5-git-send-email-guenter.roeck@ericsson.com>
On Wed, Jul 28, 2010 at 10:09:24PM -0700, Guenter Roeck wrote:
> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
^ permalink raw reply
* Re: [PATCH 0/2 v1.04] Add support for DWC OTG driver.
From: Greg KH @ 2010-07-29 17:22 UTC (permalink / raw)
To: Fushen Chen; +Cc: linuxppc-dev, linux-usb
In-Reply-To: <AANLkTik35G2Th-1jN3jtx3Mn-rPNNfe3CDQgCdijHG8z@mail.gmail.com>
On Thu, Jul 29, 2010 at 09:26:12AM -0700, Fushen Chen wrote:
> > [PATCH 1/2 v1.04]
> > 1. License information is under clarification.
>
> I meant that APM is still working with Synopys to resolve the GPL License.
> There is no result yet.
Then I would be very careful in posting the code like you have done. As
it is, the code is not something that can be legally posted or used in
any device, and you might be held liable for it :(
good luck,
greg k-h
^ permalink raw reply
* [PATCH] of: Provide default of_node_to_nid() implementation.
From: Grant Likely @ 2010-07-29 19:04 UTC (permalink / raw)
To: sfr, monstr, microblaze-uclinux, devicetree-discuss, linux-kernel,
linuxppc-dev, benh, sparclinux, davem
of_node_to_nid() is only relevant in a few architectures. Don't force
everyone to implement it anyway.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
---
v3: make -1 the default return value and let powerpc override it to 0 when
CONFIG_NUMA not set.
arch/microblaze/include/asm/topology.h | 10 ----------
arch/powerpc/include/asm/prom.h | 7 +++++++
arch/powerpc/include/asm/topology.h | 7 -------
arch/sparc/include/asm/prom.h | 3 +--
include/linux/of.h | 5 +++++
5 files changed, 13 insertions(+), 19 deletions(-)
diff --git a/arch/microblaze/include/asm/topology.h b/arch/microblaze/include/asm/topology.h
index 96bcea5..5428f33 100644
--- a/arch/microblaze/include/asm/topology.h
+++ b/arch/microblaze/include/asm/topology.h
@@ -1,11 +1 @@
#include <asm-generic/topology.h>
-
-#ifndef _ASM_MICROBLAZE_TOPOLOGY_H
-#define _ASM_MICROBLAZE_TOPOLOGY_H
-
-struct device_node;
-static inline int of_node_to_nid(struct device_node *device)
-{
- return 0;
-}
-#endif /* _ASM_MICROBLAZE_TOPOLOGY_H */
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index da7dd63..55bccc0 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -103,6 +103,13 @@ struct device_node *of_find_next_cache_node(struct device_node *np);
/* Get the MAC address */
extern const void *of_get_mac_address(struct device_node *np);
+#ifdef CONFIG_NUMA
+extern int of_node_to_nid(struct device_node *device);
+#else
+static inline int of_node_to_nid(struct device_node *device) { return 0; }
+#endif
+#define of_node_to_nid of_node_to_nid
+
/**
* of_irq_map_pci - Resolve the interrupt for a PCI device
* @pdev: the device whose interrupt is to be resolved
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 32adf72..09dd38c 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -41,8 +41,6 @@ static inline int cpu_to_node(int cpu)
cpu_all_mask : \
node_to_cpumask_map[node])
-int of_node_to_nid(struct device_node *device);
-
struct pci_bus;
#ifdef CONFIG_PCI
extern int pcibus_to_node(struct pci_bus *bus);
@@ -94,11 +92,6 @@ extern void sysfs_remove_device_from_node(struct sys_device *dev, int nid);
#else
-static inline int of_node_to_nid(struct device_node *device)
-{
- return 0;
-}
-
static inline void dump_numa_cpu_topology(void) {}
static inline int sysfs_add_device_to_node(struct sys_device *dev, int nid)
diff --git a/arch/sparc/include/asm/prom.h b/arch/sparc/include/asm/prom.h
index c82a7da..291f125 100644
--- a/arch/sparc/include/asm/prom.h
+++ b/arch/sparc/include/asm/prom.h
@@ -43,8 +43,7 @@ extern int of_getintprop_default(struct device_node *np,
extern int of_find_in_proplist(const char *list, const char *match, int len);
#ifdef CONFIG_NUMA
extern int of_node_to_nid(struct device_node *dp);
-#else
-#define of_node_to_nid(dp) (-1)
+#define of_node_to_nid of_node_to_nid
#endif
extern void prom_build_devicetree(void);
diff --git a/include/linux/of.h b/include/linux/of.h
index b0756f3..cad7cf0 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -146,6 +146,11 @@ static inline unsigned long of_read_ulong(const __be32 *cell, int size)
#define OF_BAD_ADDR ((u64)-1)
+#ifndef of_node_to_nid
+static inline int of_node_to_nid(struct device_node *np) { return -1; }
+#define of_node_to_nid of_node_to_nid
+#endif
+
extern struct device_node *of_find_node_by_name(struct device_node *from,
const char *name);
#define for_each_node_by_name(dn, name) \
^ permalink raw reply related
* [PATCH] of/address: Clean up function declarations
From: Grant Likely @ 2010-07-29 19:04 UTC (permalink / raw)
To: sfr, monstr, microblaze-uclinux, devicetree-discuss, linux-kernel,
linuxppc-dev, benh, sparclinux, davem
This patch moves the declaration of of_get_address(), of_get_pci_address(),
and of_pci_address_to_resource() out of arch code and into the common
linux/of_address header file.
This patch also fixes some of the asm/prom.h ordering issues. It still
includes some header files that it ideally shouldn't be, but at least the
ordering is consistent now so that of_* overrides work.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
---
arch/microblaze/include/asm/prom.h | 33 +++++++-------------
arch/powerpc/include/asm/prom.h | 49 +++++++----------------------
arch/powerpc/kernel/legacy_serial.c | 1 +
arch/powerpc/kernel/pci-common.c | 1 +
arch/powerpc/platforms/52xx/lite5200.c | 1 +
arch/powerpc/platforms/amigaone/setup.c | 3 +-
arch/powerpc/platforms/iseries/mf.c | 1 +
arch/powerpc/platforms/powermac/feature.c | 2 +
drivers/char/bsr.c | 1 +
drivers/net/fsl_pq_mdio.c | 1 +
drivers/net/xilinx_emaclite.c | 2 +
drivers/serial/uartlite.c | 1 +
drivers/spi/mpc512x_psc_spi.c | 1 +
drivers/spi/mpc52xx_psc_spi.c | 1 +
drivers/spi/xilinx_spi_of.c | 1 +
drivers/usb/gadget/fsl_qe_udc.c | 1 +
drivers/video/controlfb.c | 2 +
drivers/video/offb.c | 3 +-
include/linux/of_address.h | 32 +++++++++++++++++++
19 files changed, 74 insertions(+), 63 deletions(-)
diff --git a/arch/microblaze/include/asm/prom.h b/arch/microblaze/include/asm/prom.h
index cb9c3dd..101fa09 100644
--- a/arch/microblaze/include/asm/prom.h
+++ b/arch/microblaze/include/asm/prom.h
@@ -20,11 +20,6 @@
#ifndef __ASSEMBLY__
#include <linux/types.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/of_fdt.h>
-#include <linux/proc_fs.h>
-#include <linux/platform_device.h>
#include <asm/irq.h>
#include <asm/atomic.h>
@@ -52,25 +47,9 @@ extern void pci_create_OF_bus_map(void);
* OF address retreival & translation
*/
-/* Extract an address from a device, returns the region size and
- * the address space flags too. The PCI version uses a BAR number
- * instead of an absolute index
- */
-extern const u32 *of_get_address(struct device_node *dev, int index,
- u64 *size, unsigned int *flags);
-extern const u32 *of_get_pci_address(struct device_node *dev, int bar_no,
- u64 *size, unsigned int *flags);
-
-extern int of_pci_address_to_resource(struct device_node *dev, int bar,
- struct resource *r);
-
#ifdef CONFIG_PCI
extern unsigned long pci_address_to_pio(phys_addr_t address);
-#else
-static inline unsigned long pci_address_to_pio(phys_addr_t address)
-{
- return (unsigned long)-1;
-}
+#define pci_address_to_pio pci_address_to_pio
#endif /* CONFIG_PCI */
/* Parse the ibm,dma-window property of an OF node into the busno, phys and
@@ -99,8 +78,18 @@ extern const void *of_get_mac_address(struct device_node *np);
* resolving using the OF tree walking.
*/
struct pci_dev;
+struct of_irq;
extern int of_irq_map_pci(struct pci_dev *pdev, struct of_irq *out_irq);
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
+
+/* These includes are put at the bottom because they may contain things
+ * that are overridden by this file. Ideally they shouldn't be included
+ * by this file, but there are a bunch of .c files that currently depend
+ * on it. Eventually they will be cleaned up. */
+#include <linux/of_fdt.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+
#endif /* _ASM_MICROBLAZE_PROM_H */
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 55bccc0..ae26f2e 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -17,11 +17,6 @@
* 2 of the License, or (at your option) any later version.
*/
#include <linux/types.h>
-#include <linux/of_fdt.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/proc_fs.h>
-#include <linux/platform_device.h>
#include <asm/irq.h>
#include <asm/atomic.h>
@@ -49,41 +44,9 @@ extern void pci_create_OF_bus_map(void);
extern u64 of_translate_dma_address(struct device_node *dev,
const u32 *in_addr);
-/* Extract an address from a device, returns the region size and
- * the address space flags too. The PCI version uses a BAR number
- * instead of an absolute index
- */
-extern const u32 *of_get_address(struct device_node *dev, int index,
- u64 *size, unsigned int *flags);
-#ifdef CONFIG_PCI
-extern const u32 *of_get_pci_address(struct device_node *dev, int bar_no,
- u64 *size, unsigned int *flags);
-#else
-static inline const u32 *of_get_pci_address(struct device_node *dev,
- int bar_no, u64 *size, unsigned int *flags)
-{
- return NULL;
-}
-#endif /* CONFIG_PCI */
-
-#ifdef CONFIG_PCI
-extern int of_pci_address_to_resource(struct device_node *dev, int bar,
- struct resource *r);
-#else
-static inline int of_pci_address_to_resource(struct device_node *dev, int bar,
- struct resource *r)
-{
- return -ENOSYS;
-}
-#endif /* CONFIG_PCI */
-
#ifdef CONFIG_PCI
extern unsigned long pci_address_to_pio(phys_addr_t address);
-#else
-static inline unsigned long pci_address_to_pio(phys_addr_t address)
-{
- return (unsigned long)-1;
-}
+#define pci_address_to_pio pci_address_to_pio
#endif /* CONFIG_PCI */
/* Parse the ibm,dma-window property of an OF node into the busno, phys and
@@ -122,9 +85,19 @@ static inline int of_node_to_nid(struct device_node *device) { return 0; }
* resolving using the OF tree walking.
*/
struct pci_dev;
+struct of_irq;
extern int of_irq_map_pci(struct pci_dev *pdev, struct of_irq *out_irq);
extern void of_instantiate_rtc(void);
+/* These includes are put at the bottom because they may contain things
+ * that are overridden by this file. Ideally they shouldn't be included
+ * by this file, but there are a bunch of .c files that currently depend
+ * on it. Eventually they will be cleaned up. */
+#include <linux/of_fdt.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+
#endif /* __KERNEL__ */
#endif /* _POWERPC_PROM_H */
diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c
index 035ada5..c1fd0f9 100644
--- a/arch/powerpc/kernel/legacy_serial.c
+++ b/arch/powerpc/kernel/legacy_serial.c
@@ -4,6 +4,7 @@
#include <linux/serial_core.h>
#include <linux/console.h>
#include <linux/pci.h>
+#include <linux/of_address.h>
#include <linux/of_device.h>
#include <asm/io.h>
#include <asm/mmu.h>
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 5b38f6a..9021c4a 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -21,6 +21,7 @@
#include <linux/string.h>
#include <linux/init.h>
#include <linux/bootmem.h>
+#include <linux/of_address.h>
#include <linux/mm.h>
#include <linux/list.h>
#include <linux/syscalls.h>
diff --git a/arch/powerpc/platforms/52xx/lite5200.c b/arch/powerpc/platforms/52xx/lite5200.c
index 6d584f4..de55bc0 100644
--- a/arch/powerpc/platforms/52xx/lite5200.c
+++ b/arch/powerpc/platforms/52xx/lite5200.c
@@ -18,6 +18,7 @@
#include <linux/init.h>
#include <linux/pci.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/root_dev.h>
#include <linux/initrd.h>
#include <asm/time.h>
diff --git a/arch/powerpc/platforms/amigaone/setup.c b/arch/powerpc/platforms/amigaone/setup.c
index fb4eb0d..03aabc0 100644
--- a/arch/powerpc/platforms/amigaone/setup.c
+++ b/arch/powerpc/platforms/amigaone/setup.c
@@ -13,12 +13,13 @@
*/
#include <linux/kernel.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/seq_file.h>
#include <generated/utsrelease.h>
#include <asm/machdep.h>
#include <asm/cputable.h>
-#include <asm/prom.h>
#include <asm/pci-bridge.h>
#include <asm/i8259.h>
#include <asm/time.h>
diff --git a/arch/powerpc/platforms/iseries/mf.c b/arch/powerpc/platforms/iseries/mf.c
index d2c1d49..33e5fc7 100644
--- a/arch/powerpc/platforms/iseries/mf.c
+++ b/arch/powerpc/platforms/iseries/mf.c
@@ -30,6 +30,7 @@
#include <linux/init.h>
#include <linux/completion.h>
#include <linux/delay.h>
+#include <linux/proc_fs.h>
#include <linux/dma-mapping.h>
#include <linux/bcd.h>
#include <linux/rtc.h>
diff --git a/arch/powerpc/platforms/powermac/feature.c b/arch/powerpc/platforms/powermac/feature.c
index 9e1b9fd..75eec03 100644
--- a/arch/powerpc/platforms/powermac/feature.c
+++ b/arch/powerpc/platforms/powermac/feature.c
@@ -21,6 +21,8 @@
#include <linux/delay.h>
#include <linux/kernel.h>
#include <linux/sched.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/spinlock.h>
#include <linux/adb.h>
#include <linux/pmu.h>
diff --git a/drivers/char/bsr.c b/drivers/char/bsr.c
index 89d871e..9191713 100644
--- a/drivers/char/bsr.c
+++ b/drivers/char/bsr.c
@@ -23,6 +23,7 @@
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/of_platform.h>
+#include <linux/fs.h>
#include <linux/module.h>
#include <linux/cdev.h>
#include <linux/list.h>
diff --git a/drivers/net/fsl_pq_mdio.c b/drivers/net/fsl_pq_mdio.c
index b4c41d7..f53f850 100644
--- a/drivers/net/fsl_pq_mdio.c
+++ b/drivers/net/fsl_pq_mdio.c
@@ -35,6 +35,7 @@
#include <linux/mii.h>
#include <linux/phy.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_mdio.h>
#include <linux/of_platform.h>
diff --git a/drivers/net/xilinx_emaclite.c b/drivers/net/xilinx_emaclite.c
index d04c5b2..b2c2f39 100644
--- a/drivers/net/xilinx_emaclite.c
+++ b/drivers/net/xilinx_emaclite.c
@@ -20,7 +20,7 @@
#include <linux/skbuff.h>
#include <linux/io.h>
#include <linux/slab.h>
-
+#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_platform.h>
#include <linux/of_mdio.h>
diff --git a/drivers/serial/uartlite.c b/drivers/serial/uartlite.c
index 8acccd5..caf085d 100644
--- a/drivers/serial/uartlite.c
+++ b/drivers/serial/uartlite.c
@@ -21,6 +21,7 @@
#include <asm/io.h>
#if defined(CONFIG_OF) && (defined(CONFIG_PPC32) || defined(CONFIG_MICROBLAZE))
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_platform.h>
diff --git a/drivers/spi/mpc512x_psc_spi.c b/drivers/spi/mpc512x_psc_spi.c
index 1bb4315..10baac3 100644
--- a/drivers/spi/mpc512x_psc_spi.c
+++ b/drivers/spi/mpc512x_psc_spi.c
@@ -19,6 +19,7 @@
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/interrupt.h>
+#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/workqueue.h>
#include <linux/completion.h>
diff --git a/drivers/spi/mpc52xx_psc_spi.c b/drivers/spi/mpc52xx_psc_spi.c
index bd81ff9..66d1701 100644
--- a/drivers/spi/mpc52xx_psc_spi.c
+++ b/drivers/spi/mpc52xx_psc_spi.c
@@ -16,6 +16,7 @@
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/interrupt.h>
+#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/workqueue.h>
#include <linux/completion.h>
diff --git a/drivers/spi/xilinx_spi_of.c b/drivers/spi/xilinx_spi_of.c
index 87cda09..f53d3f6 100644
--- a/drivers/spi/xilinx_spi_of.c
+++ b/drivers/spi/xilinx_spi_of.c
@@ -29,6 +29,7 @@
#include <linux/io.h>
#include <linux/slab.h>
+#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/of_device.h>
#include <linux/of_spi.h>
diff --git a/drivers/usb/gadget/fsl_qe_udc.c b/drivers/usb/gadget/fsl_qe_udc.c
index 82506ca..9648b75 100644
--- a/drivers/usb/gadget/fsl_qe_udc.c
+++ b/drivers/usb/gadget/fsl_qe_udc.c
@@ -32,6 +32,7 @@
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/moduleparam.h>
+#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/dma-mapping.h>
#include <linux/usb/ch9.h>
diff --git a/drivers/video/controlfb.c b/drivers/video/controlfb.c
index 49fcbe8..c225dcc 100644
--- a/drivers/video/controlfb.c
+++ b/drivers/video/controlfb.c
@@ -40,6 +40,8 @@
#include <linux/vmalloc.h>
#include <linux/delay.h>
#include <linux/interrupt.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/fb.h>
#include <linux/init.h>
#include <linux/pci.h>
diff --git a/drivers/video/offb.c b/drivers/video/offb.c
index 46dda7d..cb163a5 100644
--- a/drivers/video/offb.c
+++ b/drivers/video/offb.c
@@ -19,13 +19,14 @@
#include <linux/mm.h>
#include <linux/vmalloc.h>
#include <linux/delay.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/interrupt.h>
#include <linux/fb.h>
#include <linux/init.h>
#include <linux/ioport.h>
#include <linux/pci.h>
#include <asm/io.h>
-#include <asm/prom.h>
#ifdef CONFIG_PPC64
#include <asm/pci-bridge.h>
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index cc567df..8aea06f 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -8,5 +8,37 @@ extern int of_address_to_resource(struct device_node *dev, int index,
struct resource *r);
extern void __iomem *of_iomap(struct device_node *device, int index);
+/* Extract an address from a device, returns the region size and
+ * the address space flags too. The PCI version uses a BAR number
+ * instead of an absolute index
+ */
+extern const u32 *of_get_address(struct device_node *dev, int index,
+ u64 *size, unsigned int *flags);
+
+#ifndef pci_address_to_pio
+static inline unsigned long pci_address_to_pio(phys_addr_t addr) { return -1; }
+#define pci_address_to_pio pci_address_to_pio
+#endif
+
+#ifdef CONFIG_PCI
+extern const u32 *of_get_pci_address(struct device_node *dev, int bar_no,
+ u64 *size, unsigned int *flags);
+extern int of_pci_address_to_resource(struct device_node *dev, int bar,
+ struct resource *r);
+#else /* CONFIG_PCI */
+static inline int of_pci_address_to_resource(struct device_node *dev, int bar,
+ struct resource *r)
+{
+ return -ENOSYS;
+}
+
+static inline const u32 *of_get_pci_address(struct device_node *dev,
+ int bar_no, u64 *size, unsigned int *flags)
+{
+ return NULL;
+}
+#endif /* CONFIG_PCI */
+
+
#endif /* __OF_ADDRESS_H */
^ permalink raw reply related
* Re: Commit 3da34aa brakes MSI support on MPC8308 (possibly all MPC83xx) [REPOST]
From: Wolfgang Denk @ 2010-07-29 21:20 UTC (permalink / raw)
To: galak, Kim Phillips; +Cc: linuxppc-dev, Ilya Yanok
In-Reply-To: <4C48B384.1020006@emcraft.com>
Dear Kumar & Kim,
any comments on this issue?
Thanks.
In message <4C48B384.1020006@emcraft.com> Ilya Yanok wrote:
> Hi Kumar, Kim, Josh, everybody,
>
> I hope to disturb you but I haven't got any reply for my first posting...
>
> I've found that MSI work correctly with older kernels on my MPC8308RDB
> board and don't work with newer ones. After bisecting I've found that
> the source of the problem is commit 3da34aa:
>
> commit 3da34aae03d498ee62f75aa7467de93cce3030fd
> Author: Kumar Gala <galak@kernel.crashing.org>
> Date: Tue May 12 15:51:56 2009 -0500
>
> powerpc/fsl: Support unique MSI addresses per PCIe Root Complex
>
> Its feasible based on how the PCI address map is setup that the region
> of PCI address space used for MSIs differs for each PHB on the same
> SoC.
>
> Instead of assuming that the address mappes to CCSRBAR 1:1 we read
> PEXCSRBAR (BAR0) for the PHB that the given pci_dev is on.
>
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
>
> I can see BAR0 initialization for 85xx/86xx hardware but not for 83xx
> neigher in the kernel nor in U-Boot (that makes me think that all 83xx
> can be affected).
> I'm not actually an PCI expert so I've just tried to write IMMR base
> address to the BAR0 register from the U-Boot to get the correct address
> but this doesn't help.
> Please direct me how to init 83xx PCIE controller to make it compatible
> with this patch.
>
> Kim, I think MPC8315E is affected too, could you please test it?
>
> Thanks in advance.
>
> Regards, Ilya.
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
A good aphorism is too hard for the tooth of time, and is not worn
away by all the centuries, although it serves as food for every
epoch. - Friedrich Wilhelm Nietzsche
_Miscellaneous Maxims and Opinions_ no. 168
^ permalink raw reply
* Re: [PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec
From: Michael Neuling @ 2010-07-30 0:08 UTC (permalink / raw)
To: Matt Evans; +Cc: linuxppc-dev, Milton Miller
In-Reply-To: <4C511216.30109@ozlabs.org>
In message <4C511216.30109@ozlabs.org> you wrote:
> When CPU hotplug is used, some CPUs may be offline at the time a kexec is
> performed. The subsequent kernel may expect these CPUs to be already running
,
> and will declare them stuck. On pseries, there's also a soft-offline (cede)
> state that CPUs may be in; this can also cause problems as the kexeced kernel
> may ask RTAS if they're online -- and RTAS would say they are. Again, stuck.
>
> This patch kicks each present offline CPU awake before the kexec, so that
> none are lost to these assumptions in the subsequent kernel.
There are a lot of cleanups in this patch. The change you are making
would be a lot clearer without all the additional cleanups in there. I
think I'd like to see this as two patches. One for cleanups and one for
the addition of wake_offline_cpus().
Other than that, I'm not completely convinced this is the functionality
we want. Do we really want to online these cpus? Why where they
offlined in the first place? I understand the stuck problem, but is the
solution to online them, or to change the device tree so that the second
kernel doesn't detect them as stuck?
Mikey
>
> Signed-off-by: Matt Evans <matt@ozlabs.org>
> ---
> v2: Added FIXME comment noting a possible problem with incorrectly
> started secondary CPUs, following feedback from Milton.
>
> arch/powerpc/kernel/machine_kexec_64.c | 55 ++++++++++++++++++++++++++++--
-
> 1 files changed, 49 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/mac
hine_kexec_64.c
> index 4fbb3be..37f805e 100644
> --- a/arch/powerpc/kernel/machine_kexec_64.c
> +++ b/arch/powerpc/kernel/machine_kexec_64.c
> @@ -15,6 +15,8 @@
> #include <linux/thread_info.h>
> #include <linux/init_task.h>
> #include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/cpu.h>
>
> #include <asm/page.h>
> #include <asm/current.h>
> @@ -181,7 +183,20 @@ static void kexec_prepare_cpus_wait(int wait_state)
> int my_cpu, i, notified=-1;
>
> my_cpu = get_cpu();
> - /* Make sure each CPU has atleast made it to the state we need */
> + /* Make sure each CPU has at least made it to the state we need.
> + *
> + * FIXME: There is a (slim) chance of a problem if not all of the CPUs
> + * are correctly onlined. If somehow we start a CPU on boot with RTAS
> + * start-cpu, but somehow that CPU doesn't write callin_cpu_map[] in
> + * time, the boot CPU will timeout. If it does eventually execute
> + * stuff, the secondary will start up (paca[].cpu_start was written) an
d
> + * get into a peculiar state. If the platform supports
> + * smp_ops->take_timebase(), the secondary CPU will probably be spinnin
g
> + * in there. If not (i.e. pseries), the secondary will continue on and
> + * try to online itself/idle/etc. If it survives that, we need to find
> + * these possible-but-not-online-but-should-be CPUs and chaperone them
> + * into kexec_smp_wait().
> + */
> for_each_online_cpu(i) {
> if (i == my_cpu)
> continue;
> @@ -189,9 +204,9 @@ static void kexec_prepare_cpus_wait(int wait_state)
> while (paca[i].kexec_state < wait_state) {
> barrier();
> if (i != notified) {
> - printk( "kexec: waiting for cpu %d (physical"
> - " %d) to enter %i state\n",
> - i, paca[i].hw_cpu_id, wait_state);
> + printk(KERN_INFO "kexec: waiting for cpu %d "
> + "(physical %d) to enter %i state\n",
> + i, paca[i].hw_cpu_id, wait_state);
> notified = i;
> }
> }
> @@ -199,9 +214,32 @@ static void kexec_prepare_cpus_wait(int wait_state)
> mb();
> }
>
> -static void kexec_prepare_cpus(void)
> +/*
> + * We need to make sure each present CPU is online. The next kernel will sc
an
> + * the device tree and assume primary threads are online and query secondary
> + * threads via RTAS to online them if required. If we don't online primary
> + * threads, they will be stuck. However, we also online secondary threads a
s we
> + * may be using 'cede offline'. In this case RTAS doesn't see the secondary
> + * threads as offline -- and again, these CPUs will be stuck.
> + *
> + * So, we online all CPUs that should be running, including secondary thread
s.
> + */
> +static void wake_offline_cpus(void)
> {
> + int cpu = 0;
>
> + for_each_present_cpu(cpu) {
> + if (!cpu_online(cpu)) {
> + printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
> + cpu);
> + cpu_up(cpu);
> + }
> + }
> +}
> +
> +static void kexec_prepare_cpus(void)
> +{
> + wake_offline_cpus();
> smp_call_function(kexec_smp_down, NULL, /* wait */0);
> local_irq_disable();
> mb(); /* make sure IRQs are disabled before we say they are */
> @@ -215,7 +253,10 @@ static void kexec_prepare_cpus(void)
> if (ppc_md.kexec_cpu_down)
> ppc_md.kexec_cpu_down(0, 0);
>
> - /* Before removing MMU mapings make sure all CPUs have entered real mod
e */
> + /*
> + * Before removing MMU mappings make sure all CPUs have entered real
> + * mode:
> + */
> kexec_prepare_cpus_wait(KEXEC_STATE_REAL_MODE);
>
> put_cpu();
> @@ -284,6 +325,8 @@ void default_machine_kexec(struct kimage *image)
> if (crashing_cpu == -1)
> kexec_prepare_cpus();
>
> + pr_debug("kexec: Starting switchover sequence.\n");
> +
> /* switch to a staticly allocated stack. Based on irq stack code.
> * XXX: the task struct will likely be invalid once we do the copy!
> */
> --
> 1.6.3.3
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
^ permalink raw reply
* Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.
From: Feng Kan @ 2010-07-30 0:14 UTC (permalink / raw)
To: Greg KH
Cc: linux-usb, David Daney, linuxppc-dev, Mark Miesfeld, Ted Chan,
Fushen Chen
In-Reply-To: <20100726231616.GA25342@suse.de>
Hi Greg:
We will change to a BSD 3 clause license header. Our legal counsel is
talking to Synopsis to make this change. We will resubmit once this
is in place. Please let me know if you have any additional concerns.
Feng Kan
Applied Micro
On Mon, Jul 26, 2010 at 4:16 PM, Greg KH <gregkh@suse.de> wrote:
> On Mon, Jul 26, 2010 at 04:05:49PM -0700, Feng Kan wrote:
>> Hi Greg:
>>
>> We are having our legal revisit this again. What would you advise us
>> to do at this point?
>
> I thought I was very clear below as to what is needed.
>
>> Disclose the agreement or have someone with legal authority reply this
>> thread.
>
> Neither will resolve the end issue, right?
>
>> Perhaps something in the header that states Applied Micro verified
>> with Synopsys to use this code for GPL purpose.
>
> No, that will just make it messier. =C2=A0Someone needs to delete all of =
the
> mess in the file, put the proper license information for what the code
> is being licensed under (whatever it is), and provide a signed-off-by
> from a person from Synopsys and APM that can speak for the company that
> they agree that the code can properly be placed into the Linux kernel.
>
> thanks,
>
> greg k-h
>
--=20
Feng Kan
^ permalink raw reply
* Re: [PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec
From: Matt Evans @ 2010-07-30 0:41 UTC (permalink / raw)
To: Michael Neuling; +Cc: linuxppc-dev, Milton Miller
In-Reply-To: <3296.1280448512@neuling.org>
Michael Neuling wrote:
> In message <4C511216.30109@ozlabs.org> you wrote:
>> When CPU hotplug is used, some CPUs may be offline at the time a kexec is
>> performed. The subsequent kernel may expect these CPUs to be already running
> ,
>> and will declare them stuck. On pseries, there's also a soft-offline (cede)
>> state that CPUs may be in; this can also cause problems as the kexeced kernel
>> may ask RTAS if they're online -- and RTAS would say they are. Again, stuck.
>>
>> This patch kicks each present offline CPU awake before the kexec, so that
>> none are lost to these assumptions in the subsequent kernel.
>
> There are a lot of cleanups in this patch. The change you are making
> would be a lot clearer without all the additional cleanups in there. I
> think I'd like to see this as two patches. One for cleanups and one for
> the addition of wake_offline_cpus().
Okay, I can split this. Typofixy-add-debug in one, wake_offline_cpus in another.
> Other than that, I'm not completely convinced this is the functionality
> we want. Do we really want to online these cpus? Why where they
> offlined in the first place? I understand the stuck problem, but is the
> solution to online them, or to change the device tree so that the second
> kernel doesn't detect them as stuck?
Well... There are two cases. If a CPU is soft-offlined on pseries, it must be woken from that cede loop (in platforms/pseries/hotplug-cpu.c) as we're replacing code under its feet. We could either special-case the wakeup from this cede loop to get that CPU to RTAS "stop-self" itself properly. (Kind of like a "wake to die".)
So that leaves hard-offline CPUs (perhaps including the above): I don't know why they might have been offlined. If it's something serious, like fire, they'd be removed from the present set too (and thus not be considered in this restarting case). We could add a mask to the CPU node to show which of the threads (if any) are running, and alter the startup code to start everything if this mask doesn't exist (non-kexec) or only online currently-running threads if the mask is present. That feels a little weird.
My reasoning for restarting everything was: The first time you boot, all of your present CPUs are started up. When you reboot, any CPUs you offlined for fun are restarted. Kexec is (in this non-crash sense) a user-initiated 'quick reboot', so I reasoned that it should look the same as a 'hard reboot' and your new invocation would have all available CPUs running as is usual.
Cheers,
Matt
^ permalink raw reply
* Re: [PATCH 0/2 v1.04] Add support for DWC OTG driver.
From: David Daney @ 2010-07-30 0:42 UTC (permalink / raw)
To: Fushen Chen; +Cc: linuxppc-dev, linux-usb, gregkh
In-Reply-To: <12803633233020-git-send-email-fchen@apm.com>
On 07/28/2010 05:28 PM, Fushen Chen wrote:
> [PATCH 1/2 v1.04]
.
.
.
PATCH 1/2 seems to not have made it to linux-usb@vger.kernel.org. I
suspect that a spam filter got it.
Could you remove whatever there is in the patch that triggers the
filter? Or failing that, change the filter so we can all see the patch?
Thanks,
David Daney
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox