* [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
To: kvm, linux-mips, kvmarm, linuxppc-dev
Cc: Juergen Gross, Huacai Chen, Janosch Frank, Christian Borntraeger,
Sean Christopherson, Anup Patel, David Hildenbrand,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
Alexandru Elisei, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-1-maz@kernel.org>
As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/mips/kvm/loongson_ipi.c | 4 ++--
arch/mips/kvm/mips.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/mips/kvm/loongson_ipi.c b/arch/mips/kvm/loongson_ipi.c
index 3681fc8fba38..5d53f32d837c 100644
--- a/arch/mips/kvm/loongson_ipi.c
+++ b/arch/mips/kvm/loongson_ipi.c
@@ -120,7 +120,7 @@ static int loongson_vipi_write(struct loongson_kvm_ipi *ipi,
s->status |= data;
irq.cpu = id;
irq.irq = 6;
- kvm_vcpu_ioctl_interrupt(kvm->vcpus[id], &irq);
+ kvm_vcpu_ioctl_interrupt(kvm_get_vcpu(kvm, id), &irq);
break;
case CORE0_CLEAR_OFF:
@@ -128,7 +128,7 @@ static int loongson_vipi_write(struct loongson_kvm_ipi *ipi,
if (!s->status) {
irq.cpu = id;
irq.irq = -6;
- kvm_vcpu_ioctl_interrupt(kvm->vcpus[id], &irq);
+ kvm_vcpu_ioctl_interrupt(kvm_get_vcpu(kvm, id), &irq);
}
break;
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index ceacca74f808..6228bf396d63 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -479,7 +479,7 @@ int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
if (irq->cpu == -1)
dvcpu = vcpu;
else
- dvcpu = vcpu->kvm->vcpus[irq->cpu];
+ dvcpu = kvm_get_vcpu(vcpu->kvm, irq->cpu);
if (intr == 2 || intr == 3 || intr == 4 || intr == 6) {
kvm_mips_callbacks->queue_io_int(dvcpu, irq);
--
2.30.2
^ permalink raw reply related
* [PATCH 0/5] KVM: Turn the vcpu array into an xarray
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
To: kvm, linux-mips, kvmarm, linuxppc-dev
Cc: Juergen Gross, Huacai Chen, Janosch Frank, Christian Borntraeger,
Sean Christopherson, Anup Patel, David Hildenbrand,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
Alexandru Elisei, Suzuki K Poulose
The kvm structure is pretty large. A large portion of it is the vcpu
array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
VMs. Of course, hardly anyone runs VMs this big, so this is often a
net waste of memory and cache locality.
A possible approach is to turn the fixed-size array into an xarray,
which results in a net code deletion after a bit of cleanup.
This series is on top of the current linux/master as it touches the
RISC-V implementation. Only tested on arm64.
Marc Zyngier (5):
KVM: Move wiping of the kvm->vcpus array to common code
KVM: mips: Use kvm_get_vcpu() instead of open-coded access
KVM: s390: Use kvm_get_vcpu() instead of open-coded access
KVM: x86: Use kvm_get_vcpu() instead of open-coded access
KVM: Convert the kvm->vcpus array to a xarray
arch/arm64/kvm/arm.c | 10 +---------
arch/mips/kvm/loongson_ipi.c | 4 ++--
arch/mips/kvm/mips.c | 23 ++---------------------
arch/powerpc/kvm/powerpc.c | 10 +---------
arch/riscv/kvm/vm.c | 10 +---------
arch/s390/kvm/kvm-s390.c | 26 ++++++--------------------
arch/x86/kvm/vmx/posted_intr.c | 2 +-
arch/x86/kvm/x86.c | 9 +--------
include/linux/kvm_host.h | 7 ++++---
virt/kvm/kvm_main.c | 33 ++++++++++++++++++++++++++-------
10 files changed, 45 insertions(+), 89 deletions(-)
--
2.30.2
^ permalink raw reply
* [PATCH 3/5] KVM: s390: Use kvm_get_vcpu() instead of open-coded access
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
To: kvm, linux-mips, kvmarm, linuxppc-dev
Cc: Juergen Gross, Huacai Chen, Janosch Frank, Christian Borntraeger,
Sean Christopherson, Anup Patel, David Hildenbrand,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
Alexandru Elisei, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-1-maz@kernel.org>
As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/s390/kvm/kvm-s390.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7af53b8788fa..4a0f62b03964 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4572,7 +4572,7 @@ int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
}
for (i = 0; i < online_vcpus; i++) {
- if (!is_vcpu_stopped(vcpu->kvm->vcpus[i]))
+ if (!is_vcpu_stopped(kvm_get_vcpu(vcpu->kvm, i)))
started_vcpus++;
}
@@ -4634,9 +4634,11 @@ int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
__disable_ibs_on_vcpu(vcpu);
for (i = 0; i < online_vcpus; i++) {
- if (!is_vcpu_stopped(vcpu->kvm->vcpus[i])) {
+ struct kvm_vcpu *tmp = kvm_get_vcpu(vcpu->kvm, i);
+
+ if (!is_vcpu_stopped(tmp)) {
started_vcpus++;
- started_vcpu = vcpu->kvm->vcpus[i];
+ started_vcpu = tmp;
}
}
--
2.30.2
^ permalink raw reply related
* [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
From: Marc Zyngier @ 2021-11-05 19:21 UTC (permalink / raw)
To: kvm, linux-mips, kvmarm, linuxppc-dev
Cc: Juergen Gross, Huacai Chen, Janosch Frank, Christian Borntraeger,
Sean Christopherson, Anup Patel, David Hildenbrand,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
Alexandru Elisei, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-1-maz@kernel.org>
As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/x86/kvm/vmx/posted_intr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..82a49720727d 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
if (!kvm_arch_has_assigned_device(kvm) ||
!irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+ !kvm_vcpu_apicv_active(kvm_get_vcpu(kvm, 0)))
return 0;
idx = srcu_read_lock(&kvm->irq_srcu);
--
2.30.2
^ permalink raw reply related
* [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
To: kvm, linux-mips, kvmarm, linuxppc-dev
Cc: Juergen Gross, Huacai Chen, Janosch Frank, Christian Borntraeger,
Sean Christopherson, Anup Patel, David Hildenbrand,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
Alexandru Elisei, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-1-maz@kernel.org>
All architectures have similar loops iterating over the vcpus,
freeing one vcpu at a time, and eventually wiping the reference
off the vcpus array. They are also inconsistently taking
the kvm->lock mutex when wiping the references from the array.
Make this code common, which will simplify further changes.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arm.c | 10 +---------
arch/mips/kvm/mips.c | 21 +--------------------
arch/powerpc/kvm/powerpc.c | 10 +---------
arch/riscv/kvm/vm.c | 10 +---------
arch/s390/kvm/kvm-s390.c | 18 +-----------------
arch/x86/kvm/x86.c | 9 +--------
include/linux/kvm_host.h | 2 +-
virt/kvm/kvm_main.c | 20 ++++++++++++++++++--
8 files changed, 25 insertions(+), 75 deletions(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f5490afe1ebf..75bb7215da03 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -175,19 +175,11 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
*/
void kvm_arch_destroy_vm(struct kvm *kvm)
{
- int i;
-
bitmap_free(kvm->arch.pmu_filter);
kvm_vgic_destroy(kvm);
- for (i = 0; i < KVM_MAX_VCPUS; ++i) {
- if (kvm->vcpus[i]) {
- kvm_vcpu_destroy(kvm->vcpus[i]);
- kvm->vcpus[i] = NULL;
- }
- }
- atomic_set(&kvm->online_vcpus, 0);
+ kvm_destroy_vcpus(kvm);
}
int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 562aa878b266..ceacca74f808 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -171,25 +171,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
return 0;
}
-void kvm_mips_free_vcpus(struct kvm *kvm)
-{
- unsigned int i;
- struct kvm_vcpu *vcpu;
-
- kvm_for_each_vcpu(i, vcpu, kvm) {
- kvm_vcpu_destroy(vcpu);
- }
-
- mutex_lock(&kvm->lock);
-
- for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
- kvm->vcpus[i] = NULL;
-
- atomic_set(&kvm->online_vcpus, 0);
-
- mutex_unlock(&kvm->lock);
-}
-
static void kvm_mips_free_gpa_pt(struct kvm *kvm)
{
/* It should always be safe to remove after flushing the whole range */
@@ -199,7 +180,7 @@ static void kvm_mips_free_gpa_pt(struct kvm *kvm)
void kvm_arch_destroy_vm(struct kvm *kvm)
{
- kvm_mips_free_vcpus(kvm);
+ kvm_destroy_vcpus(kvm);
kvm_mips_free_gpa_pt(kvm);
}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 35e9cccdeef9..492e4a4121cb 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -463,9 +463,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
void kvm_arch_destroy_vm(struct kvm *kvm)
{
- unsigned int i;
- struct kvm_vcpu *vcpu;
-
#ifdef CONFIG_KVM_XICS
/*
* We call kick_all_cpus_sync() to ensure that all
@@ -476,14 +473,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kick_all_cpus_sync();
#endif
- kvm_for_each_vcpu(i, vcpu, kvm)
- kvm_vcpu_destroy(vcpu);
+ kvm_destroy_vcpus(kvm);
mutex_lock(&kvm->lock);
- for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
- kvm->vcpus[i] = NULL;
-
- atomic_set(&kvm->online_vcpus, 0);
kvmppc_core_destroy_vm(kvm);
diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
index 26399df15b63..6af6cde295eb 100644
--- a/arch/riscv/kvm/vm.c
+++ b/arch/riscv/kvm/vm.c
@@ -46,15 +46,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
void kvm_arch_destroy_vm(struct kvm *kvm)
{
- int i;
-
- for (i = 0; i < KVM_MAX_VCPUS; ++i) {
- if (kvm->vcpus[i]) {
- kvm_vcpu_destroy(kvm->vcpus[i]);
- kvm->vcpus[i] = NULL;
- }
- }
- atomic_set(&kvm->online_vcpus, 0);
+ kvm_destroy_vcpus(kvm);
}
int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c6257f625929..7af53b8788fa 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2819,27 +2819,11 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
free_page((unsigned long)(vcpu->arch.sie_block));
}
-static void kvm_free_vcpus(struct kvm *kvm)
-{
- unsigned int i;
- struct kvm_vcpu *vcpu;
-
- kvm_for_each_vcpu(i, vcpu, kvm)
- kvm_vcpu_destroy(vcpu);
-
- mutex_lock(&kvm->lock);
- for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
- kvm->vcpus[i] = NULL;
-
- atomic_set(&kvm->online_vcpus, 0);
- mutex_unlock(&kvm->lock);
-}
-
void kvm_arch_destroy_vm(struct kvm *kvm)
{
u16 rc, rrc;
- kvm_free_vcpus(kvm);
+ kvm_destroy_vcpus(kvm);
sca_dispose(kvm);
kvm_s390_gisa_destroy(kvm);
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c1c4e2b05a63..498a43126615 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11302,15 +11302,8 @@ static void kvm_free_vcpus(struct kvm *kvm)
kvm_clear_async_pf_completion_queue(vcpu);
kvm_unload_vcpu_mmu(vcpu);
}
- kvm_for_each_vcpu(i, vcpu, kvm)
- kvm_vcpu_destroy(vcpu);
-
- mutex_lock(&kvm->lock);
- for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
- kvm->vcpus[i] = NULL;
- atomic_set(&kvm->online_vcpus, 0);
- mutex_unlock(&kvm->lock);
+ kvm_destroy_vcpus(kvm);
}
void kvm_arch_sync_events(struct kvm *kvm)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 60a35d9fe259..36967291b8c6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -725,7 +725,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu_by_id(struct kvm *kvm, int id)
if (WARN_ON_ONCE(!memslot->npages)) { \
} else
-void kvm_vcpu_destroy(struct kvm_vcpu *vcpu);
+void kvm_destroy_vcpus(struct kvm *kvm);
void vcpu_load(struct kvm_vcpu *vcpu);
void vcpu_put(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3f6d450355f0..d83553eeea21 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -435,7 +435,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
vcpu->last_used_slot = 0;
}
-void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
+static void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
{
kvm_dirty_ring_free(&vcpu->dirty_ring);
kvm_arch_vcpu_destroy(vcpu);
@@ -450,7 +450,23 @@ void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
free_page((unsigned long)vcpu->run);
kmem_cache_free(kvm_vcpu_cache, vcpu);
}
-EXPORT_SYMBOL_GPL(kvm_vcpu_destroy);
+
+void kvm_destroy_vcpus(struct kvm *kvm)
+{
+ unsigned int i;
+ struct kvm_vcpu *vcpu;
+
+ kvm_for_each_vcpu(i, vcpu, kvm)
+ kvm_vcpu_destroy(vcpu);
+
+ mutex_lock(&kvm->lock);
+ for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
+ kvm->vcpus[i] = NULL;
+
+ atomic_set(&kvm->online_vcpus, 0);
+ mutex_unlock(&kvm->lock);
+}
+EXPORT_SYMBOL_GPL(kvm_destroy_vcpus);
#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
--
2.30.2
^ permalink raw reply related
* [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
From: Marc Zyngier @ 2021-11-05 19:21 UTC (permalink / raw)
To: kvm, linux-mips, kvmarm, linuxppc-dev
Cc: Juergen Gross, Huacai Chen, Janosch Frank, Christian Borntraeger,
Sean Christopherson, Anup Patel, David Hildenbrand,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
Alexandru Elisei, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-1-maz@kernel.org>
At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
and is mostly empty in most cases (running 512 vcpu VMs is not that
common). This mean that we end-up with a 4kB block of unused memory
in the middle of the kvm structure.
Instead of wasting away this memory, let's use an xarray instead,
which gives us almost the same flexibility as a normal array, but
with a reduced memory usage with smaller VMs.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
include/linux/kvm_host.h | 5 +++--
virt/kvm/kvm_main.c | 15 +++++++++------
2 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 36967291b8c6..3933d825e28b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -29,6 +29,7 @@
#include <linux/refcount.h>
#include <linux/nospec.h>
#include <linux/notifier.h>
+#include <linux/xarray.h>
#include <asm/signal.h>
#include <linux/kvm.h>
@@ -552,7 +553,7 @@ struct kvm {
struct mutex slots_arch_lock;
struct mm_struct *mm; /* userspace tied to this vm */
struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM];
- struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
+ struct xarray vcpu_array;
/* Used to wait for completion of MMU notifiers. */
spinlock_t mn_invalidate_lock;
@@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu. */
smp_rmb();
- return kvm->vcpus[i];
+ return xa_load(&kvm->vcpu_array, i);
}
#define kvm_for_each_vcpu(idx, vcpup, kvm) \
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d83553eeea21..4c18d7911fa5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -461,7 +461,7 @@ void kvm_destroy_vcpus(struct kvm *kvm)
mutex_lock(&kvm->lock);
for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
- kvm->vcpus[i] = NULL;
+ xa_erase(&kvm->vcpu_array, i);
atomic_set(&kvm->online_vcpus, 0);
mutex_unlock(&kvm->lock);
@@ -1066,6 +1066,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
mutex_init(&kvm->slots_arch_lock);
spin_lock_init(&kvm->mn_invalidate_lock);
rcuwait_init(&kvm->mn_memslots_update_rcuwait);
+ xa_init(&kvm->vcpu_array);
INIT_LIST_HEAD(&kvm->devices);
@@ -3661,7 +3662,10 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
}
vcpu->vcpu_idx = atomic_read(&kvm->online_vcpus);
- BUG_ON(kvm->vcpus[vcpu->vcpu_idx]);
+ r = xa_insert(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, GFP_KERNEL_ACCOUNT);
+ BUG_ON(r == -EBUSY);
+ if (r)
+ goto unlock_vcpu_destroy;
/* Fill the stats id string for the vcpu */
snprintf(vcpu->stats_id, sizeof(vcpu->stats_id), "kvm-%d/vcpu-%d",
@@ -3671,15 +3675,14 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
kvm_get_kvm(kvm);
r = create_vcpu_fd(vcpu);
if (r < 0) {
+ xa_erase(&kvm->vcpu_array, vcpu->vcpu_idx);
kvm_put_kvm_no_destroy(kvm);
goto unlock_vcpu_destroy;
}
- kvm->vcpus[vcpu->vcpu_idx] = vcpu;
-
/*
- * Pairs with smp_rmb() in kvm_get_vcpu. Write kvm->vcpus
- * before kvm->online_vcpu's incremented value.
+ * Pairs with smp_rmb() in kvm_get_vcpu. Store the vcpu
+ * pointer before kvm->online_vcpu's incremented value.
*/
smp_wmb();
atomic_inc(&kvm->online_vcpus);
--
2.30.2
^ permalink raw reply related
* Re: [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
From: Sean Christopherson @ 2021-11-05 20:03 UTC (permalink / raw)
To: Marc Zyngier
Cc: Juergen Gross, Alexandru Elisei, Anup Patel, Janosch Frank, kvm,
Christian Borntraeger, Huacai Chen, David Hildenbrand, linux-mips,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
linuxppc-dev, kvmarm, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-5-maz@kernel.org>
On Fri, Nov 05, 2021, Marc Zyngier wrote:
> As we are about to change the way vcpus are allocated, mandate
> the use of kvm_get_vcpu() instead of open-coding the access.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/x86/kvm/vmx/posted_intr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..82a49720727d 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
>
> if (!kvm_arch_has_assigned_device(kvm) ||
> !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> + !kvm_vcpu_apicv_active(kvm_get_vcpu(kvm, 0)))
Huh. The existing code is decidedly odd. I think it might even be broken, as
it's not obvious that vCPU0 _must_ be created when e.g. kvm_arch_irq_bypass_add_producer()
is called.
An equivalent, safe check would be:
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..a3100591a9ca 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
if (!kvm_arch_has_assigned_device(kvm) ||
!irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+ !kvm_apicv_activated(kvm))
return 0;
idx = srcu_read_lock(&kvm->irq_srcu);
But I think even that is flawed, as APICv can be dynamically deactivated and
re-activated while the VM is running, and I don't see a path that re-updates
the IRTE when APICv is re-activated. So I think a more conservative check is
needed, e.g.
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..6cf5b2e86118 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
if (!kvm_arch_has_assigned_device(kvm) ||
!irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+ !irqchip_in_kernel(kvm) || !enable_apicv)
return 0;
idx = srcu_read_lock(&kvm->irq_srcu);
Paolo, am I missing something?
^ permalink raw reply related
* Re: [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
From: Sean Christopherson @ 2021-11-05 20:12 UTC (permalink / raw)
To: Marc Zyngier
Cc: Juergen Gross, Alexandru Elisei, Anup Patel, Janosch Frank, kvm,
Christian Borntraeger, Huacai Chen, David Hildenbrand, linux-mips,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
linuxppc-dev, kvmarm, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-2-maz@kernel.org>
On Fri, Nov 05, 2021, Marc Zyngier wrote:
> All architectures have similar loops iterating over the vcpus,
> freeing one vcpu at a time, and eventually wiping the reference
> off the vcpus array. They are also inconsistently taking
> the kvm->lock mutex when wiping the references from the array.
...
> +void kvm_destroy_vcpus(struct kvm *kvm)
> +{
> + unsigned int i;
> + struct kvm_vcpu *vcpu;
> +
> + kvm_for_each_vcpu(i, vcpu, kvm)
> + kvm_vcpu_destroy(vcpu);
> +
> + mutex_lock(&kvm->lock);
But why is kvm->lock taken here? Unless I'm overlooking an arch, everyone calls
this from kvm_arch_destroy_vm(), in which case this is the only remaining reference
to @kvm. And if there's some magic path for which that's not true, I don't see how
it can possibly be safe to call kvm_vcpu_destroy() without holding kvm->lock, or
how this would guarantee that all vCPUs have actually been destroyed before nullifying
the array.
> + for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> + kvm->vcpus[i] = NULL;
> +
> + atomic_set(&kvm->online_vcpus, 0);
> + mutex_unlock(&kvm->lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_destroy_vcpus);
^ permalink raw reply
* Re: [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
From: Sean Christopherson @ 2021-11-05 20:21 UTC (permalink / raw)
To: Marc Zyngier
Cc: Juergen Gross, Alexandru Elisei, Anup Patel, Janosch Frank, kvm,
Christian Borntraeger, Huacai Chen, David Hildenbrand, linux-mips,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
linuxppc-dev, kvmarm, Suzuki K Poulose
In-Reply-To: <20211105192101.3862492-6-maz@kernel.org>
On Fri, Nov 05, 2021, Marc Zyngier wrote:
> At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
> and is mostly empty in most cases (running 512 vcpu VMs is not that
> common). This mean that we end-up with a 4kB block of unused memory
> in the middle of the kvm structure.
Heh, x86 is now up to 1024 entries.
> Instead of wasting away this memory, let's use an xarray instead,
> which gives us almost the same flexibility as a normal array, but
> with a reduced memory usage with smaller VMs.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> @@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>
> /* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu. */
> smp_rmb();
> - return kvm->vcpus[i];
> + return xa_load(&kvm->vcpu_array, i);
> }
It'd be nice for this series to convert kvm_for_each_vcpu() to use xa_for_each()
as well. Maybe as a patch on top so that potential explosions from that are
isolated from the initiali conversion?
Or maybe even use xa_for_each_range() to cap at online_vcpus? That's technically
a functional change, but IMO it's easier to reason about iterating over a snapshot
of vCPUs as opposed to being able to iterate over vCPUs as their being added. In
practice I doubt it matters.
#define kvm_for_each_vcpu(idx, vcpup, kvm) \
xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, atomic_read(&kvm->online_vcpus))
^ permalink raw reply
* Re: Fwd: Fwd: X stopped working with 5.14 on iBook
From: Stanley Johnson @ 2021-11-06 1:22 UTC (permalink / raw)
To: Christophe Leroy
Cc: Christopher M. Riedl, linuxppc-dev, Riccardo Mottola, Finn Thain
In-Reply-To: <79ae1f49-f6b1-e9ad-977d-0cc7e553c7b9@csgroup.eu>
[-- Attachment #1: Type: text/plain, Size: 1631 bytes --]
> ...
> I think I found the issue.
> __get_user_sigset() is wrong for 32 bits.
> Could you change its content to return __get_user((u64)&dst->sig[0],
> (u64 __user *)&src->sig[0]);
> If it works, for the mainline also change unsafe_get_user_sigset()
> Christophe
Christophe,
Using the attached patches and git commands provided by Finn (thanks!), here are the results:
$ git reset --hard
$ git checkout d3ccc9781560
$ git am ../__get_user_sigset_cast.patch
$ cp ../dot-config-pmac-5.12.0-rc3-00034-gd3ccc9781560-USER_NS .config
$ make ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- -j4 clean olddefconfig vmlinux
$ grep USER_NS .config
CONFIG_USER_NS=y
$ strings vmlinux | grep Linux.version
Linux version 5.12.0-rc3-pmac-00035-g77e46218520e (johnson@ThinkPad) (powerpc-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.37) #146 SMP Fri Nov 5 18:57:46 MDT 2021
With this kernel, X works.
Trying mainline:
$ git reset --hard
$ git checkout v5.15
$ git am ../__get_user_sigset_cast_5.15.patch
$ cp ../dot-config-powermac-5.14 .config
$ ./scripts/config -e USER_NS
$ make ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- -j4 clean olddefconfig vmlinux
$ grep USER_NS .config
CONFIG_USER_NS=y
$ strings vmlinux | grep Linux.version
Linux version 5.15.0-pmac-00001-gbc0bc813b6ac (johnson@ThinkPad) (powerpc-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.37) #147 SMP Fri Nov 5 19:17:01 MDT 2021
With this kernel, X also works.
So it appears that your change has fixed the problem, at least on the G4 Cube; thanks!
-Stan
[-- Attachment #2: __get_user_sigset_cast.patch --]
[-- Type: application/octet-stream, Size: 1002 bytes --]
From a6cdd3e9a10d29df38fc6c844517ab28d6961227 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy@csgroup.eu>
Date: Sat, 6 Nov 2021 10:36:52 +1100
Subject: [PATCH] Re: Fwd: Fwd: X stopped working with 5.14 on iBook
I think I found the issue.
__get_user_sigset() is wrong for 32 bits.
Could you change its content to return __get_user(*(u64*)&dst->sig[0], (u64
__user *)&src->sig[0]);
If it works, for the mainline also change unsafe_get_user_sigset()
Christophe
diff --git a/arch/powerpc/kernel/signal.h b/arch/powerpc/kernel/signal.h
index 1393876f3814..d4d173a70913 100644
--- a/arch/powerpc/kernel/signal.h
+++ b/arch/powerpc/kernel/signal.h
@@ -23,7 +23,7 @@ static inline int __get_user_sigset(sigset_t *dst, const sigset_t __user *src)
{
BUILD_BUG_ON(sizeof(sigset_t) != sizeof(u64));
- return __get_user(dst->sig[0], (u64 __user *)&src->sig[0]);
+ return __get_user(*(u64*)&dst->sig[0], (u64 __user *)&src->sig[0]);
}
#ifdef CONFIG_VSX
[-- Attachment #3: __get_user_sigset_cast_5.15.patch --]
[-- Type: application/octet-stream, Size: 1259 bytes --]
From 5f80d4984f62f60a4bea748e8508d1baa171c0a5 Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy@csgroup.eu>
Date: Sat, 6 Nov 2021 10:36:52 +1100
Subject: [PATCH] Fwd: Fwd: X stopped working with 5.14 on iBook
I think I found the issue.
__get_user_sigset() is wrong for 32 bits.
Could you change its content to return __get_user(*(u64*)&dst->sig[0], (u64
__user *)&src->sig[0]);
If it works, for the mainline also change unsafe_get_user_sigset()
Christophe
diff --git a/arch/powerpc/kernel/signal.h b/arch/powerpc/kernel/signal.h
index 1f07317964e4..aba8a9e751e8 100644
--- a/arch/powerpc/kernel/signal.h
+++ b/arch/powerpc/kernel/signal.h
@@ -23,10 +23,10 @@ static inline int __get_user_sigset(sigset_t *dst, const sigset_t __user *src)
{
BUILD_BUG_ON(sizeof(sigset_t) != sizeof(u64));
- return __get_user(dst->sig[0], (u64 __user *)&src->sig[0]);
+ return __get_user(*(u64*)&dst->sig[0], (u64 __user *)&src->sig[0]);
}
#define unsafe_get_user_sigset(dst, src, label) \
- unsafe_get_user((dst)->sig[0], (u64 __user *)&(src)->sig[0], label)
+ unsafe_get_user(*(u64*)&(dst)->sig[0], (u64 __user *)&(src)->sig[0], label)
#ifdef CONFIG_VSX
extern unsigned long copy_vsx_to_user(void __user *to,
^ permalink raw reply related
* [powerpc:next] BUILD SUCCESS c12ab8dbc492b992e1ea717db933cee568780c47
From: kernel test robot @ 2021-11-06 1:38 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: c12ab8dbc492b992e1ea717db933cee568780c47 powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
elapsed time: 5687m
configs tested: 197
configs skipped: 4
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
i386 randconfig-c001-20211101
mips randconfig-c004-20211101
powerpc randconfig-c003-20211101
powerpc powernv_defconfig
arm shmobile_defconfig
mips ar7_defconfig
powerpc currituck_defconfig
sh landisk_defconfig
mips malta_defconfig
powerpc tqm8555_defconfig
arm cerfcube_defconfig
mips bcm63xx_defconfig
powerpc bamboo_defconfig
powerpc katmai_defconfig
mips decstation_r4k_defconfig
powerpc icon_defconfig
arc nsimosci_defconfig
sh ecovec24-romimage_defconfig
powerpc kmeter1_defconfig
m68k m5307c3_defconfig
mips mtx1_defconfig
mips malta_qemu_32r6_defconfig
powerpc ep8248e_defconfig
sh sh7785lcr_32bit_defconfig
h8300 h8s-sim_defconfig
powerpc mpc8540_ads_defconfig
powerpc mpc837x_mds_defconfig
arm spear13xx_defconfig
arm mvebu_v7_defconfig
mips loongson3_defconfig
mips maltasmvp_defconfig
m68k stmark2_defconfig
arc axs103_defconfig
arm pxa3xx_defconfig
sh sh03_defconfig
powerpc taishan_defconfig
ia64 gensparse_defconfig
arm jornada720_defconfig
mips sb1250_swarm_defconfig
powerpc bluestone_defconfig
arm moxart_defconfig
riscv allnoconfig
powerpc sam440ep_defconfig
m68k atari_defconfig
arm mmp2_defconfig
mips jmr3927_defconfig
arm ezx_defconfig
arc vdk_hs38_defconfig
powerpc mpc8272_ads_defconfig
mips xway_defconfig
sh apsh4ad0a_defconfig
xtensa audio_kc705_defconfig
powerpc pcm030_defconfig
arm versatile_defconfig
arm tegra_defconfig
riscv defconfig
mips gcw0_defconfig
arm corgi_defconfig
sh shmin_defconfig
arm s3c2410_defconfig
powerpc ppc6xx_defconfig
powerpc acadia_defconfig
arm oxnas_v6_defconfig
powerpc tqm8xx_defconfig
m68k hp300_defconfig
parisc generic-64bit_defconfig
arm colibri_pxa270_defconfig
xtensa alldefconfig
arm spear6xx_defconfig
mips allmodconfig
powerpc wii_defconfig
mips vocore2_defconfig
sh magicpanelr2_defconfig
s390 debug_defconfig
powerpc mpc834x_itx_defconfig
arm h5000_defconfig
arc alldefconfig
sh hp6xx_defconfig
arm cm_x300_defconfig
powerpc linkstation_defconfig
sh se7712_defconfig
mips ath79_defconfig
mips maltaaprp_defconfig
sh lboxre2_defconfig
sh se7206_defconfig
powerpc mpc8313_rdb_defconfig
s390 alldefconfig
ia64 zx1_defconfig
powerpc ep88xc_defconfig
arm spitz_defconfig
ia64 generic_defconfig
arm omap1_defconfig
sh sh2007_defconfig
sh sh7785lcr_defconfig
m68k multi_defconfig
sh se7705_defconfig
m68k m5475evb_defconfig
powerpc cm5200_defconfig
arm mvebu_v5_defconfig
microblaze mmu_defconfig
arm aspeed_g5_defconfig
powerpc gamecube_defconfig
sh se7780_defconfig
mips decstation_defconfig
arm ep93xx_defconfig
powerpc microwatt_defconfig
arm orion5x_defconfig
arm randconfig-c002-20211101
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
s390 allmodconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
i386 debian-10.3
mips allyesconfig
powerpc allyesconfig
powerpc allnoconfig
powerpc allmodconfig
x86_64 randconfig-a012-20211101
x86_64 randconfig-a015-20211101
x86_64 randconfig-a016-20211101
x86_64 randconfig-a013-20211101
x86_64 randconfig-a011-20211101
x86_64 randconfig-a014-20211101
i386 randconfig-a016-20211101
i386 randconfig-a014-20211101
i386 randconfig-a015-20211101
i386 randconfig-a013-20211101
i386 randconfig-a011-20211101
i386 randconfig-a012-20211101
arc randconfig-r043-20211101
riscv randconfig-r042-20211101
s390 randconfig-r044-20211101
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel-8.3-kselftests
um x86_64_defconfig
um i386_defconfig
x86_64 allyesconfig
x86_64 defconfig
x86_64 rhel-8.3
x86_64 rhel-8.3-func
x86_64 kexec
clang tested configs:
mips randconfig-c004-20211101
arm randconfig-c002-20211101
i386 randconfig-c001-20211101
s390 randconfig-c005-20211101
powerpc randconfig-c003-20211101
riscv randconfig-c006-20211101
x86_64 randconfig-c007-20211101
mips randconfig-c004-20211102
arm randconfig-c002-20211102
i386 randconfig-c001-20211102
s390 randconfig-c005-20211102
powerpc randconfig-c003-20211102
riscv randconfig-c006-20211102
x86_64 randconfig-c007-20211102
x86_64 randconfig-a004-20211101
x86_64 randconfig-a006-20211101
x86_64 randconfig-a001-20211101
x86_64 randconfig-a002-20211101
x86_64 randconfig-a003-20211101
x86_64 randconfig-a005-20211101
i386 randconfig-a005-20211101
i386 randconfig-a001-20211101
i386 randconfig-a003-20211101
i386 randconfig-a004-20211101
i386 randconfig-a006-20211101
i386 randconfig-a002-20211101
hexagon randconfig-r041-20211101
hexagon randconfig-r045-20211101
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:topic/ppc-kvm] BUILD SUCCESS 235cee162459d96153d63651ce7ff51752528c96
From: kernel test robot @ 2021-11-06 1:37 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git topic/ppc-kvm
branch HEAD: 235cee162459d96153d63651ce7ff51752528c96 KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling
elapsed time: 5687m
configs tested: 193
configs skipped: 93
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
i386 randconfig-c001-20211101
powerpc randconfig-c003-20211101
powerpc allyesconfig
powerpc powernv_defconfig
arm shmobile_defconfig
mips ar7_defconfig
powerpc currituck_defconfig
sh landisk_defconfig
mips malta_defconfig
powerpc tqm8555_defconfig
arm cerfcube_defconfig
mips bcm63xx_defconfig
powerpc bamboo_defconfig
powerpc katmai_defconfig
mips decstation_r4k_defconfig
powerpc icon_defconfig
arc nsimosci_defconfig
sh ecovec24-romimage_defconfig
powerpc kmeter1_defconfig
m68k m5307c3_defconfig
mips mtx1_defconfig
mips malta_qemu_32r6_defconfig
powerpc ep8248e_defconfig
sh sh7785lcr_32bit_defconfig
h8300 h8s-sim_defconfig
powerpc mpc8540_ads_defconfig
powerpc mpc837x_mds_defconfig
arm spear13xx_defconfig
arm mvebu_v7_defconfig
mips loongson3_defconfig
mips maltasmvp_defconfig
m68k stmark2_defconfig
arc axs103_defconfig
arm pxa3xx_defconfig
sh sh03_defconfig
powerpc taishan_defconfig
ia64 gensparse_defconfig
arm jornada720_defconfig
mips sb1250_swarm_defconfig
powerpc bluestone_defconfig
arm moxart_defconfig
s390 allmodconfig
riscv allnoconfig
powerpc sam440ep_defconfig
m68k atari_defconfig
arm mmp2_defconfig
mips jmr3927_defconfig
arm ezx_defconfig
arc vdk_hs38_defconfig
powerpc mpc8272_ads_defconfig
mips xway_defconfig
sh apsh4ad0a_defconfig
xtensa audio_kc705_defconfig
powerpc pcm030_defconfig
arm versatile_defconfig
arm tegra_defconfig
riscv defconfig
mips gcw0_defconfig
arm corgi_defconfig
sh shmin_defconfig
arm s3c2410_defconfig
powerpc ppc6xx_defconfig
powerpc acadia_defconfig
arm oxnas_v6_defconfig
powerpc tqm8xx_defconfig
m68k hp300_defconfig
parisc generic-64bit_defconfig
arm colibri_pxa270_defconfig
xtensa alldefconfig
arm spear6xx_defconfig
mips allmodconfig
powerpc wii_defconfig
mips vocore2_defconfig
sh magicpanelr2_defconfig
s390 debug_defconfig
powerpc mpc834x_itx_defconfig
arm h5000_defconfig
arc alldefconfig
sh hp6xx_defconfig
m68k q40_defconfig
arm aspeed_g5_defconfig
sh titan_defconfig
mips capcella_defconfig
powerpc tqm8548_defconfig
arm cm_x300_defconfig
powerpc linkstation_defconfig
sh se7712_defconfig
mips ath79_defconfig
mips maltaaprp_defconfig
sh lboxre2_defconfig
sh se7206_defconfig
powerpc mpc8313_rdb_defconfig
s390 alldefconfig
ia64 zx1_defconfig
powerpc ep88xc_defconfig
arm spitz_defconfig
sh se7705_defconfig
m68k m5475evb_defconfig
powerpc cm5200_defconfig
arm mvebu_v5_defconfig
microblaze mmu_defconfig
powerpc gamecube_defconfig
sh se7780_defconfig
mips decstation_defconfig
arm ep93xx_defconfig
powerpc microwatt_defconfig
arm orion5x_defconfig
arm randconfig-c002-20211101
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
i386 debian-10.3
mips allyesconfig
powerpc allmodconfig
powerpc allnoconfig
x86_64 randconfig-a012-20211101
x86_64 randconfig-a015-20211101
x86_64 randconfig-a016-20211101
x86_64 randconfig-a013-20211101
x86_64 randconfig-a011-20211101
x86_64 randconfig-a014-20211101
i386 randconfig-a016-20211101
i386 randconfig-a014-20211101
i386 randconfig-a015-20211101
i386 randconfig-a013-20211101
i386 randconfig-a011-20211101
i386 randconfig-a012-20211101
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel-8.3-kselftests
um x86_64_defconfig
um i386_defconfig
x86_64 defconfig
x86_64 rhel-8.3
x86_64 rhel-8.3-func
x86_64 kexec
x86_64 allyesconfig
clang tested configs:
mips randconfig-c004-20211101
arm randconfig-c002-20211101
i386 randconfig-c001-20211101
s390 randconfig-c005-20211101
powerpc randconfig-c003-20211101
riscv randconfig-c006-20211101
x86_64 randconfig-c007-20211101
mips randconfig-c004-20211102
arm randconfig-c002-20211102
i386 randconfig-c001-20211102
s390 randconfig-c005-20211102
powerpc randconfig-c003-20211102
riscv randconfig-c006-20211102
x86_64 randconfig-c007-20211102
x86_64 randconfig-a004-20211101
x86_64 randconfig-a006-20211101
x86_64 randconfig-a001-20211101
x86_64 randconfig-a002-20211101
x86_64 randconfig-a003-20211101
x86_64 randconfig-a005-20211101
i386 randconfig-a005-20211101
i386 randconfig-a001-20211101
i386 randconfig-a003-20211101
i386 randconfig-a004-20211101
i386 randconfig-a006-20211101
i386 randconfig-a002-20211101
hexagon randconfig-r041-20211101
hexagon randconfig-r045-20211101
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:fixes-test] BUILD SUCCESS 11c41994b1c3a7cb08c87883cc19d469258882b6
From: kernel test robot @ 2021-11-06 1:50 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 11c41994b1c3a7cb08c87883cc19d469258882b6 KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()
elapsed time: 740m
configs tested: 32
configs skipped: 53
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
nios2 defconfig
nds32 allnoconfig
arc allyesconfig
nds32 defconfig
csky defconfig
alpha defconfig
alpha allyesconfig
nios2 allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
parisc allyesconfig
s390 defconfig
powerpc allyesconfig
powerpc allnoconfig
powerpc allmodconfig
riscv nommu_k210_defconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
um x86_64_defconfig
um i386_defconfig
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:merge] BUILD SUCCESS f855455dee0b970f3a9d930ec9b3a2a7138c0f5e
From: kernel test robot @ 2021-11-06 2:50 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: f855455dee0b970f3a9d930ec9b3a2a7138c0f5e Automatic merge of 'next' into merge (2021-11-05 22:19)
elapsed time: 797m
configs tested: 53
configs skipped: 3
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
s390 allmodconfig
parisc allyesconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
i386 debian-10.3
i386 allyesconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allnoconfig
powerpc allmodconfig
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
um x86_64_defconfig
um i386_defconfig
x86_64 allyesconfig
x86_64 rhel-8.3-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 rhel-8.3-func
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* Re: [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
From: Marc Zyngier @ 2021-11-06 11:17 UTC (permalink / raw)
To: Sean Christopherson
Cc: Juergen Gross, Alexandru Elisei, Anup Patel, Janosch Frank, kvm,
Christian Borntraeger, Huacai Chen, David Hildenbrand, linux-mips,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
linuxppc-dev, kvmarm, Suzuki K Poulose
In-Reply-To: <YYWQHBwD4nBLo9qi@google.com>
On Fri, 05 Nov 2021 20:12:12 +0000,
Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Nov 05, 2021, Marc Zyngier wrote:
> > All architectures have similar loops iterating over the vcpus,
> > freeing one vcpu at a time, and eventually wiping the reference
> > off the vcpus array. They are also inconsistently taking
> > the kvm->lock mutex when wiping the references from the array.
>
> ...
>
> > +void kvm_destroy_vcpus(struct kvm *kvm)
> > +{
> > + unsigned int i;
> > + struct kvm_vcpu *vcpu;
> > +
> > + kvm_for_each_vcpu(i, vcpu, kvm)
> > + kvm_vcpu_destroy(vcpu);
> > +
> > + mutex_lock(&kvm->lock);
>
> But why is kvm->lock taken here? Unless I'm overlooking an arch,
> everyone calls this from kvm_arch_destroy_vm(), in which case this
> is the only remaining reference to @kvm. And if there's some magic
> path for which that's not true, I don't see how it can possibly be
> safe to call kvm_vcpu_destroy() without holding kvm->lock, or how
> this would guarantee that all vCPUs have actually been destroyed
> before nullifying the array.
I asked myself the same question two years ago, and couldn't really
understand the requirement. However, x86 does just that, so I
preserved the behaviour.
If you too believe that this is just wrong, I'm happy to drop the
locking altogether. If that breaks someone's flow, they'll shout soon
enough.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply
* Re: [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
From: Marc Zyngier @ 2021-11-06 11:48 UTC (permalink / raw)
To: Sean Christopherson
Cc: Juergen Gross, Alexandru Elisei, Anup Patel, Janosch Frank, kvm,
Christian Borntraeger, Huacai Chen, David Hildenbrand, linux-mips,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
linuxppc-dev, kvmarm, Suzuki K Poulose
In-Reply-To: <YYWSUJ1qzhfqjQow@google.com>
On Fri, 05 Nov 2021 20:21:36 +0000,
Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Nov 05, 2021, Marc Zyngier wrote:
> > At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
> > and is mostly empty in most cases (running 512 vcpu VMs is not that
> > common). This mean that we end-up with a 4kB block of unused memory
> > in the middle of the kvm structure.
>
> Heh, x86 is now up to 1024 entries.
Humph. I don't want to know whether people are actually using that in
practice. The only time I create VMs with 512 vcpus is to check
whether it still works...
>
> > Instead of wasting away this memory, let's use an xarray instead,
> > which gives us almost the same flexibility as a normal array, but
> > with a reduced memory usage with smaller VMs.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > @@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
> >
> > /* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu. */
> > smp_rmb();
> > - return kvm->vcpus[i];
> > + return xa_load(&kvm->vcpu_array, i);
> > }
>
> It'd be nice for this series to convert kvm_for_each_vcpu() to use
> xa_for_each() as well. Maybe as a patch on top so that potential
> explosions from that are isolated from the initiali conversion?
>
> Or maybe even use xa_for_each_range() to cap at online_vcpus?
> That's technically a functional change, but IMO it's easier to
> reason about iterating over a snapshot of vCPUs as opposed to being
> able to iterate over vCPUs as their being added. In practice I
> doubt it matters.
>
> #define kvm_for_each_vcpu(idx, vcpup, kvm) \
> xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, atomic_read(&kvm->online_vcpus))
>
I think that's already the behaviour of this iterator (we stop at the
first empty slot capped to online_vcpus. The only change in behaviour
is that vcpup currently holds a pointer to the last vcpu in no empty
slot has been encountered. xa_for_each{,_range}() would set the
pointer to NULL at all times.
I doubt anyone relies on that, but it is probably worth eyeballing
some of the use cases...
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply
* Re: [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access
From: Philippe Mathieu-Daudé @ 2021-11-06 15:56 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvm, David Hildenbrand, Atish Patra, Paul Mackerras,
Claudio Imbrenda, kvmarm, Janosch Frank, Huacai Chen,
Christian Borntraeger, Aleksandar Markovic, kernel-team,
Suzuki K Poulose, Nicholas Piggin, Alexandru Elisei,
Juergen Gross, Sean Christopherson, Anup Patel,
open list:BROADCOM NVRAM DRIVER, James Morse, Paolo Bonzini,
linuxppc-dev
In-Reply-To: <20211105192101.3862492-3-maz@kernel.org>
On Fri, Nov 5, 2021 at 9:14 PM Marc Zyngier <maz@kernel.org> wrote:
>
> As we are about to change the way vcpus are allocated, mandate
> the use of kvm_get_vcpu() instead of open-coding the access.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/mips/kvm/loongson_ipi.c | 4 ++--
> arch/mips/kvm/mips.c | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
^ permalink raw reply
* Re: [PATCH v2 27/45] mfd: ntxec: Use devm_register_power_handler()
From: Jonathan Neuschäfer @ 2021-11-06 20:54 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: Ulf Hansson, Rich Felker, linux-ia64, Tomer Maimon,
Santosh Shilimkar, Rafael J . Wysocki, Boris Ostrovsky,
Catalin Marinas, Linus Walleij, Dave Hansen, x86, Tali Perry,
Thierry Reding, Guo Ren, Pavel Machek, H. Peter Anvin,
linux-riscv, Vincent Chen, Will Deacon, Greg Ungerer,
Stefano Stabellini, Benjamin Fair, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Lee Jones, Helge Deller,
Daniel Lezcano, Russell King, linux-csky, Jonathan Hunter,
Tony Lindgren, Chen-Yu Tsai, Ingo Molnar, Geert Uytterhoeven,
xen-devel, linux-mips, Guenter Roeck, Len Brown, Albert Ou,
linux-omap, Jonathan Neuschäfer, Vladimir Zapolskiy,
linux-acpi, linux-m68k, Mark Brown, Borislav Petkov, Greentime Hu,
Paul Walmsley, linux-tegra, Thomas Gleixner, Andy Shevchenko,
Nancy Yuen, linux-arm-kernel, Juergen Gross, Thomas Bogendoerfer,
linux-parisc, Nick Hu, Avi Fishman, Patrick Venture, linux-pm,
Liam Girdwood, linux-kernel, Palmer Dabbelt, Philipp Zabel,
Paul Mackerras, Andrew Morton, linuxppc-dev, openbmc,
Joshua Thompson
In-Reply-To: <20211027211715.12671-28-digetx@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4481 bytes --]
Hi,
On Thu, Oct 28, 2021 at 12:16:57AM +0300, Dmitry Osipenko wrote:
> Use devm_register_power_handler() that replaces global pm_power_off
> variable and allows to register multiple power-off handlers. It also
> provides restart-handler support, i.e. all in one API.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
When I boot with (most of) this patchset applied, I get the warning at
kernel/reboot.c:187:
/*
* Handler must have unique priority. Otherwise call order is
* determined by registration order, which is unreliable.
*/
WARN_ON(!atomic_notifier_has_unique_priority(&restart_handler_list, nb));
As the NTXEC driver doesn't specify a priority, I think this is an issue
to be fixed elsewhere.
Other than that, it works and looks good, as far as I can tell.
For this patch:
Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Best regards,
Jonathan
---
Full Oops log:
[ 3.523294] ------------[ cut here ]------------
[ 3.528193] WARNING: CPU: 0 PID: 1 at kernel/reboot.c:187 register_restart_handler+0x4c/0x58
[ 3.536975] Modules linked in:
[ 3.540312] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-00021-gcb24c628b307 #622
[ 3.548214] Hardware name: Freescale i.MX50 (Device Tree Support)
[ 3.554357] [<c0111540>] (unwind_backtrace) from [<c010cdd0>] (show_stack+0x10/0x14)
[ 3.562183] [<c010cdd0>] (show_stack) from [<c0bf240c>] (dump_stack_lvl+0x58/0x70)
[ 3.569824] [<c0bf240c>] (dump_stack_lvl) from [<c0127604>] (__warn+0xd4/0x154)
[ 3.577191] [<c0127604>] (__warn) from [<c0bec844>] (warn_slowpath_fmt+0x74/0xa8)
[ 3.584727] [<c0bec844>] (warn_slowpath_fmt) from [<c01593c8>] (register_restart_handler+0x4c/0x58)
[ 3.593823] [<c01593c8>] (register_restart_handler) from [<c08676c8>] (__watchdog_register_device+0x13c/0x27c)
[ 3.603889] [<c08676c8>] (__watchdog_register_device) from [<c0867868>] (watchdog_register_device+0x60/0xb4)
[ 3.613764] [<c0867868>] (watchdog_register_device) from [<c08678f8>] (devm_watchdog_register_device+0x3c/0x84)
[ 3.623898] [<c08678f8>] (devm_watchdog_register_device) from [<c1146454>] (imx2_wdt_probe+0x254/0x2ac)
[ 3.633346] [<c1146454>] (imx2_wdt_probe) from [<c06feb74>] (platform_probe+0x58/0xb8)
[ 3.641314] [<c06feb74>] (platform_probe) from [<c06fb2f8>] (call_driver_probe+0x24/0x108)
[ 3.649636] [<c06fb2f8>] (call_driver_probe) from [<c06fbe08>] (really_probe.part.0+0xa8/0x358)
[ 3.658384] [<c06fbe08>] (really_probe.part.0) from [<c06fc1c4>] (__driver_probe_device+0x94/0x208)
[ 3.667470] [<c06fc1c4>] (__driver_probe_device) from [<c06fc368>] (driver_probe_device+0x30/0xc8)
[ 3.676468] [<c06fc368>] (driver_probe_device) from [<c06fcb0c>] (__driver_attach+0xe0/0x1c4)
[ 3.685032] [<c06fcb0c>] (__driver_attach) from [<c06f9a20>] (bus_for_each_dev+0x74/0xc0)
[ 3.693253] [<c06f9a20>] (bus_for_each_dev) from [<c06faeb8>] (bus_add_driver+0x100/0x208)
[ 3.701563] [<c06faeb8>] (bus_add_driver) from [<c06fd8a0>] (driver_register+0x88/0x118)
[ 3.709696] [<c06fd8a0>] (driver_register) from [<c06fe920>] (__platform_driver_probe+0x44/0xdc)
[ 3.718522] [<c06fe920>] (__platform_driver_probe) from [<c01022ac>] (do_one_initcall+0x78/0x388)
[ 3.727444] [<c01022ac>] (do_one_initcall) from [<c1101708>] (do_initcalls+0xcc/0x110)
[ 3.735413] [<c1101708>] (do_initcalls) from [<c110198c>] (kernel_init_freeable+0x1ec/0x250)
[ 3.743896] [<c110198c>] (kernel_init_freeable) from [<c0bfe724>] (kernel_init+0x10/0x128)
[ 3.752224] [<c0bfe724>] (kernel_init) from [<c010011c>] (ret_from_fork+0x14/0x38)
[ 3.759844] Exception stack(0xc40adfb0 to 0xc40adff8)
[ 3.764933] dfa0: 00000000 00000000 00000000 00000000
[ 3.773143] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3.781351] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 3.788347] irq event stamp: 143613
[ 3.792102] hardirqs last enabled at (143623): [<c01a3ebc>] __up_console_sem+0x50/0x60
[ 3.800397] hardirqs last disabled at (143632): [<c01a3ea8>] __up_console_sem+0x3c/0x60
[ 3.808491] softirqs last enabled at (143612): [<c0101518>] __do_softirq+0x2f8/0x5b0
[ 3.816591] softirqs last disabled at (143603): [<c01307dc>] __irq_exit_rcu+0x160/0x1d8
[ 3.825014] ---[ end trace 7f6709d2c89774b4 ]---
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH] powerpc/pseries: use slab context cpumask allocation in CPU hotplug init
From: Sachin Sant @ 2021-11-07 8:59 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev
In-Reply-To: <20211105132923.1582514-1-npiggin@gmail.com>
> On 05-Nov-2021, at 6:59 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>
> Slab is up at this point, using the bootmem allocator triggers a
> warning. Switch to using the regular cpumask allocator.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
Fixes the warning for me. Thanks
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
>
> This only matters when CONFIG_CPUMASK_OFFNODE=y, which has not been
> possible before on powerpc.
>
> Thanks,
> Nick
>
> arch/powerpc/platforms/pseries/hotplug-cpu.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index d646c22e94ab..78a70ba60d24 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -1016,12 +1016,13 @@ static int __init pseries_cpu_hotplug_init(void)
> /* Processors can be added/removed only on LPAR */
> if (firmware_has_feature(FW_FEATURE_LPAR)) {
> for_each_node(node) {
> - alloc_bootmem_cpumask_var(&node_recorded_ids_map[node]);
> + if (!alloc_cpumask_var_node(&node_recorded_ids_map[node],
> + GFP_KERNEL, node))
> + return -ENOMEM;
>
> /* Record ids of CPU added at boot time */
> - cpumask_or(node_recorded_ids_map[node],
> - node_recorded_ids_map[node],
> - cpumask_of_node(node));
> + cpumask_copy(node_recorded_ids_map[node],
> + cpumask_of_node(node));
> }
>
> of_reconfig_notifier_register(&pseries_smp_nb);
> --
> 2.23.0
>
^ permalink raw reply
* Re: [PATCH v2 27/45] mfd: ntxec: Use devm_register_power_handler()
From: Dmitry Osipenko @ 2021-11-07 16:53 UTC (permalink / raw)
To: Jonathan Neuschäfer
Cc: Ulf Hansson, Rich Felker, linux-ia64, Tomer Maimon,
Santosh Shilimkar, Rafael J . Wysocki, Boris Ostrovsky,
Catalin Marinas, Linus Walleij, Dave Hansen, x86, Tali Perry,
Thierry Reding, Guo Ren, Pavel Machek, H. Peter Anvin,
linux-riscv, Vincent Chen, Will Deacon, Greg Ungerer,
Stefano Stabellini, Benjamin Fair, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Lee Jones, Helge Deller,
Daniel Lezcano, Russell King, linux-csky, Jonathan Hunter,
Tony Lindgren, Chen-Yu Tsai, Ingo Molnar, Geert Uytterhoeven,
xen-devel, linux-mips, Guenter Roeck, Len Brown, Albert Ou,
linux-omap, Vladimir Zapolskiy, linux-acpi, linux-m68k,
Mark Brown, Borislav Petkov, Greentime Hu, Paul Walmsley,
linux-tegra, Thomas Gleixner, Andy Shevchenko, Nancy Yuen,
linux-arm-kernel, Juergen Gross, Thomas Bogendoerfer,
linux-parisc, Nick Hu, Avi Fishman, Patrick Venture, linux-pm,
Liam Girdwood, linux-kernel, Palmer Dabbelt, Philipp Zabel,
Paul Mackerras, Andrew Morton, linuxppc-dev, openbmc,
Joshua Thompson
In-Reply-To: <YYbqlmOM95q7Hbjo@latitude>
06.11.2021 23:54, Jonathan Neuschäfer пишет:
> Hi,
>
> On Thu, Oct 28, 2021 at 12:16:57AM +0300, Dmitry Osipenko wrote:
>> Use devm_register_power_handler() that replaces global pm_power_off
>> variable and allows to register multiple power-off handlers. It also
>> provides restart-handler support, i.e. all in one API.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>
> When I boot with (most of) this patchset applied, I get the warning at
> kernel/reboot.c:187:
>
> /*
> * Handler must have unique priority. Otherwise call order is
> * determined by registration order, which is unreliable.
> */
> WARN_ON(!atomic_notifier_has_unique_priority(&restart_handler_list, nb));
>
> As the NTXEC driver doesn't specify a priority, I think this is an issue
> to be fixed elsewhere.
>
> Other than that, it works and looks good, as far as I can tell.
>
>
> For this patch:
>
> Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
> Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Thank you. You have conflicting restart handlers, apparently NTXEC
driver should have higher priority than the watchdog driver. It should
be a common problem for the watchdog drivers, I will lower watchdog's
default priority to fix it.
^ permalink raw reply
* Re: [PATCH v2 27/45] mfd: ntxec: Use devm_register_power_handler()
From: Guenter Roeck @ 2021-11-07 17:08 UTC (permalink / raw)
To: Dmitry Osipenko, Jonathan Neuschäfer
Cc: Ulf Hansson, Rich Felker, Thomas Gleixner, Tomer Maimon,
Santosh Shilimkar, Rafael J . Wysocki, Boris Ostrovsky,
Catalin Marinas, Linus Walleij, Dave Hansen, x86, Tali Perry,
Thierry Reding, Guo Ren, Pavel Machek, H. Peter Anvin, linux-ia64,
linux-riscv, Vincent Chen, Will Deacon, Greg Ungerer,
Stefano Stabellini, Benjamin Fair, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Helge Deller, Daniel Lezcano,
Russell King, linux-csky, Jonathan Hunter, Tony Lindgren,
Chen-Yu Tsai, Ingo Molnar, Geert Uytterhoeven, xen-devel,
linux-mips, Len Brown, Albert Ou, linux-omap, Vladimir Zapolskiy,
linux-acpi, linux-m68k, Mark Brown, Borislav Petkov, Greentime Hu,
Paul Walmsley, linux-tegra, Lee Jones, Andy Shevchenko,
Nancy Yuen, linux-arm-kernel, Juergen Gross, Thomas Bogendoerfer,
linux-parisc, Nick Hu, Avi Fishman, Patrick Venture, linux-pm,
Liam Girdwood, linux-kernel, Palmer Dabbelt, Philipp Zabel,
Paul Mackerras, Andrew Morton, linuxppc-dev, openbmc,
Joshua Thompson
In-Reply-To: <be0c74c6-05a9-cad5-c285-6626d05f8860@gmail.com>
On 11/7/21 8:53 AM, Dmitry Osipenko wrote:
> 06.11.2021 23:54, Jonathan Neuschäfer пишет:
>> Hi,
>>
>> On Thu, Oct 28, 2021 at 12:16:57AM +0300, Dmitry Osipenko wrote:
>>> Use devm_register_power_handler() that replaces global pm_power_off
>>> variable and allows to register multiple power-off handlers. It also
>>> provides restart-handler support, i.e. all in one API.
>>>
>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>> ---
>>
>> When I boot with (most of) this patchset applied, I get the warning at
>> kernel/reboot.c:187:
>>
>> /*
>> * Handler must have unique priority. Otherwise call order is
>> * determined by registration order, which is unreliable.
>> */
>> WARN_ON(!atomic_notifier_has_unique_priority(&restart_handler_list, nb));
>>
>> As the NTXEC driver doesn't specify a priority, I think this is an issue
>> to be fixed elsewhere.
>>
>> Other than that, it works and looks good, as far as I can tell.
>>
>>
>> For this patch:
>>
>> Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>> Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>
> Thank you. You have conflicting restart handlers, apparently NTXEC
> driver should have higher priority than the watchdog driver. It should
> be a common problem for the watchdog drivers, I will lower watchdog's
> default priority to fix it.
>
The watchdog subsystem already uses "0" as default priority, which was
intended as priority of last resort for restart handlers. I do not see
a reason to change that.
Guenter
^ permalink raw reply
* Re: [PATCH v2 27/45] mfd: ntxec: Use devm_register_power_handler()
From: Dmitry Osipenko @ 2021-11-07 17:16 UTC (permalink / raw)
To: Guenter Roeck, Jonathan Neuschäfer
Cc: Ulf Hansson, Rich Felker, Thomas Gleixner, Tomer Maimon,
Santosh Shilimkar, Rafael J . Wysocki, Boris Ostrovsky,
Catalin Marinas, Linus Walleij, Dave Hansen, x86, Tali Perry,
Thierry Reding, Guo Ren, Pavel Machek, H. Peter Anvin, linux-ia64,
linux-riscv, Vincent Chen, Will Deacon, Greg Ungerer,
Stefano Stabellini, Benjamin Fair, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Helge Deller, Daniel Lezcano,
Russell King, linux-csky, Jonathan Hunter, Tony Lindgren,
Chen-Yu Tsai, Ingo Molnar, Geert Uytterhoeven, xen-devel,
linux-mips, Len Brown, Albert Ou, linux-omap, Vladimir Zapolskiy,
linux-acpi, linux-m68k, Mark Brown, Borislav Petkov, Greentime Hu,
Paul Walmsley, linux-tegra, Lee Jones, Andy Shevchenko,
Nancy Yuen, linux-arm-kernel, Juergen Gross, Thomas Bogendoerfer,
linux-parisc, Nick Hu, Avi Fishman, Patrick Venture, linux-pm,
Liam Girdwood, linux-kernel, Palmer Dabbelt, Philipp Zabel,
Paul Mackerras, Andrew Morton, linuxppc-dev, openbmc,
Joshua Thompson
In-Reply-To: <9a22c22d-94b1-f519-27a2-ae0b8bbf6e99@roeck-us.net>
07.11.2021 20:08, Guenter Roeck пишет:
> On 11/7/21 8:53 AM, Dmitry Osipenko wrote:
>> 06.11.2021 23:54, Jonathan Neuschäfer пишет:
>>> Hi,
>>>
>>> On Thu, Oct 28, 2021 at 12:16:57AM +0300, Dmitry Osipenko wrote:
>>>> Use devm_register_power_handler() that replaces global pm_power_off
>>>> variable and allows to register multiple power-off handlers. It also
>>>> provides restart-handler support, i.e. all in one API.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>> ---
>>>
>>> When I boot with (most of) this patchset applied, I get the warning at
>>> kernel/reboot.c:187:
>>>
>>> /*
>>> * Handler must have unique priority. Otherwise call order is
>>> * determined by registration order, which is unreliable.
>>> */
>>> WARN_ON(!atomic_notifier_has_unique_priority(&restart_handler_list,
>>> nb));
>>>
>>> As the NTXEC driver doesn't specify a priority, I think this is an issue
>>> to be fixed elsewhere.
>>>
>>> Other than that, it works and looks good, as far as I can tell.
>>>
>>>
>>> For this patch:
>>>
>>> Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>>> Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>>
>> Thank you. You have conflicting restart handlers, apparently NTXEC
>> driver should have higher priority than the watchdog driver. It should
>> be a common problem for the watchdog drivers, I will lower watchdog's
>> default priority to fix it.
>>
>
> The watchdog subsystem already uses "0" as default priority, which was
> intended as priority of last resort for restart handlers. I do not see
> a reason to change that.
Right, I meant that watchdog drivers which use restart handler set the
level to the default 128 [1]. Although, maybe it's a problem only for
i.MX drivers in practice, I'll take a closer look at the other drivers.
[1]
https://elixir.bootlin.com/linux/latest/source/drivers/watchdog/imx2_wdt.c#L318
^ permalink raw reply
* Re: [PATCH v2 27/45] mfd: ntxec: Use devm_register_power_handler()
From: Guenter Roeck @ 2021-11-07 17:32 UTC (permalink / raw)
To: Dmitry Osipenko, Jonathan Neuschäfer
Cc: Ulf Hansson, Rich Felker, Thomas Gleixner, Tomer Maimon,
Santosh Shilimkar, Rafael J . Wysocki, Boris Ostrovsky,
Catalin Marinas, Linus Walleij, Dave Hansen, x86, Tali Perry,
Thierry Reding, Guo Ren, Pavel Machek, H. Peter Anvin, linux-ia64,
linux-riscv, Vincent Chen, Will Deacon, Greg Ungerer,
Stefano Stabellini, Benjamin Fair, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Helge Deller, Daniel Lezcano,
Russell King, linux-csky, Jonathan Hunter, Tony Lindgren,
Chen-Yu Tsai, Ingo Molnar, Geert Uytterhoeven, xen-devel,
linux-mips, Len Brown, Albert Ou, linux-omap, Vladimir Zapolskiy,
linux-acpi, linux-m68k, Mark Brown, Borislav Petkov, Greentime Hu,
Paul Walmsley, linux-tegra, Lee Jones, Andy Shevchenko,
Nancy Yuen, linux-arm-kernel, Juergen Gross, Thomas Bogendoerfer,
linux-parisc, Nick Hu, Avi Fishman, Patrick Venture, linux-pm,
Liam Girdwood, linux-kernel, Palmer Dabbelt, Philipp Zabel,
Paul Mackerras, Andrew Morton, linuxppc-dev, openbmc,
Joshua Thompson
In-Reply-To: <658cf796-e3b1-f816-1e15-9e9e08b8ade0@gmail.com>
On 11/7/21 9:16 AM, Dmitry Osipenko wrote:
> 07.11.2021 20:08, Guenter Roeck пишет:
>> On 11/7/21 8:53 AM, Dmitry Osipenko wrote:
>>> 06.11.2021 23:54, Jonathan Neuschäfer пишет:
>>>> Hi,
>>>>
>>>> On Thu, Oct 28, 2021 at 12:16:57AM +0300, Dmitry Osipenko wrote:
>>>>> Use devm_register_power_handler() that replaces global pm_power_off
>>>>> variable and allows to register multiple power-off handlers. It also
>>>>> provides restart-handler support, i.e. all in one API.
>>>>>
>>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>>> ---
>>>>
>>>> When I boot with (most of) this patchset applied, I get the warning at
>>>> kernel/reboot.c:187:
>>>>
>>>> /*
>>>> * Handler must have unique priority. Otherwise call order is
>>>> * determined by registration order, which is unreliable.
>>>> */
>>>> WARN_ON(!atomic_notifier_has_unique_priority(&restart_handler_list,
>>>> nb));
>>>>
>>>> As the NTXEC driver doesn't specify a priority, I think this is an issue
>>>> to be fixed elsewhere.
>>>>
>>>> Other than that, it works and looks good, as far as I can tell.
>>>>
>>>>
>>>> For this patch:
>>>>
>>>> Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>>>> Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>>>
>>> Thank you. You have conflicting restart handlers, apparently NTXEC
>>> driver should have higher priority than the watchdog driver. It should
>>> be a common problem for the watchdog drivers, I will lower watchdog's
>>> default priority to fix it.
>>>
>>
>> The watchdog subsystem already uses "0" as default priority, which was
>> intended as priority of last resort for restart handlers. I do not see
>> a reason to change that.
>
> Right, I meant that watchdog drivers which use restart handler set the
> level to the default 128 [1]. Although, maybe it's a problem only for
> i.MX drivers in practice, I'll take a closer look at the other drivers.
>
They don't have to do that. The default is priority 0. It is the decision
of the driver author to set the watchdog's restart priority. So it is wrong
to claim that this would be "a common problem for the watchdog drivers",
because it isn't. Presumably there was a reason for the driver author
to select the default priority of 128. If there is a platform which has
a better means to restart the system, it should select a priority of
129 or higher instead of affecting _all_ platforms using the imx watchdog
to reset the system.
Sure, you can negotiate that with the driver author, but the default should
really be to change the priority for less affected platforms.
Guenter
^ permalink raw reply
* Re: [PATCH v2 27/45] mfd: ntxec: Use devm_register_power_handler()
From: Dmitry Osipenko @ 2021-11-07 17:42 UTC (permalink / raw)
To: Guenter Roeck, Jonathan Neuschäfer
Cc: Ulf Hansson, Rich Felker, Thomas Gleixner, Tomer Maimon,
Santosh Shilimkar, Rafael J . Wysocki, Boris Ostrovsky,
Catalin Marinas, Linus Walleij, Dave Hansen, x86, Tali Perry,
Thierry Reding, Guo Ren, Pavel Machek, H. Peter Anvin, linux-ia64,
linux-riscv, Vincent Chen, Will Deacon, Greg Ungerer,
Stefano Stabellini, Benjamin Fair, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Helge Deller, Daniel Lezcano,
Russell King, linux-csky, Jonathan Hunter, Tony Lindgren,
Chen-Yu Tsai, Ingo Molnar, Geert Uytterhoeven, xen-devel,
linux-mips, Len Brown, Albert Ou, linux-omap, Vladimir Zapolskiy,
linux-acpi, linux-m68k, Mark Brown, Borislav Petkov, Greentime Hu,
Paul Walmsley, linux-tegra, Lee Jones, Andy Shevchenko,
Nancy Yuen, linux-arm-kernel, Juergen Gross, Thomas Bogendoerfer,
linux-parisc, Nick Hu, Avi Fishman, Patrick Venture, linux-pm,
Liam Girdwood, linux-kernel, Palmer Dabbelt, Philipp Zabel,
Paul Mackerras, Andrew Morton, linuxppc-dev, openbmc,
Joshua Thompson
In-Reply-To: <5a17fee3-4214-c2b9-abc1-ab9d6071591b@roeck-us.net>
07.11.2021 20:32, Guenter Roeck пишет:
> On 11/7/21 9:16 AM, Dmitry Osipenko wrote:
>> 07.11.2021 20:08, Guenter Roeck пишет:
>>> On 11/7/21 8:53 AM, Dmitry Osipenko wrote:
>>>> 06.11.2021 23:54, Jonathan Neuschäfer пишет:
>>>>> Hi,
>>>>>
>>>>> On Thu, Oct 28, 2021 at 12:16:57AM +0300, Dmitry Osipenko wrote:
>>>>>> Use devm_register_power_handler() that replaces global pm_power_off
>>>>>> variable and allows to register multiple power-off handlers. It also
>>>>>> provides restart-handler support, i.e. all in one API.
>>>>>>
>>>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>>>> ---
>>>>>
>>>>> When I boot with (most of) this patchset applied, I get the warning at
>>>>> kernel/reboot.c:187:
>>>>>
>>>>> /*
>>>>> * Handler must have unique priority. Otherwise call order is
>>>>> * determined by registration order, which is unreliable.
>>>>> */
>>>>> WARN_ON(!atomic_notifier_has_unique_priority(&restart_handler_list,
>>>>>
>>>>> nb));
>>>>>
>>>>> As the NTXEC driver doesn't specify a priority, I think this is an
>>>>> issue
>>>>> to be fixed elsewhere.
>>>>>
>>>>> Other than that, it works and looks good, as far as I can tell.
>>>>>
>>>>>
>>>>> For this patch:
>>>>>
>>>>> Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>>>>> Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
>>>>
>>>> Thank you. You have conflicting restart handlers, apparently NTXEC
>>>> driver should have higher priority than the watchdog driver. It should
>>>> be a common problem for the watchdog drivers, I will lower watchdog's
>>>> default priority to fix it.
>>>>
>>>
>>> The watchdog subsystem already uses "0" as default priority, which was
>>> intended as priority of last resort for restart handlers. I do not see
>>> a reason to change that.
>>
>> Right, I meant that watchdog drivers which use restart handler set the
>> level to the default 128 [1]. Although, maybe it's a problem only for
>> i.MX drivers in practice, I'll take a closer look at the other drivers.
>>
>
> They don't have to do that. The default is priority 0. It is the decision
> of the driver author to set the watchdog's restart priority. So it is wrong
> to claim that this would be "a common problem for the watchdog drivers",
> because it isn't. Presumably there was a reason for the driver author
> to select the default priority of 128. If there is a platform which has
> a better means to restart the system, it should select a priority of
> 129 or higher instead of affecting _all_ platforms using the imx watchdog
> to reset the system.
>
> Sure, you can negotiate that with the driver author, but the default should
> really be to change the priority for less affected platforms.
Yes, looks like there is no common problem for watchdog drivers.
Initially I was recalling that watchdog core uses 128 by default and
typed the message without verifying it. I see now that it's incorrect,
my bad.
EC drivers tend to use higher priority in general. Jonathan, could you
please confirm that NTXEC driver is a more preferable restart method
than the watchdog?
^ permalink raw reply
* Re: [PATCH 0/3] KEXEC_SIG with appended signature
From: Daniel Axtens @ 2021-11-07 22:18 UTC (permalink / raw)
To: Michal Suchánek, Mimi Zohar
Cc: Thiago Jung Bauermann, Rob Herring, Vasily Gorbik, linux-s390,
Heiko Carstens, linux-kernel, David Howells,
Lakshmi Ramasubramanian, Luis Chamberlain, keyrings,
Paul Mackerras, Frank van der Linden, Jessica Yu,
Alexander Gordeev, linuxppc-dev, Christian Borntraeger,
Hari Bathini
In-Reply-To: <20211105131401.GL11195@kunlun.suse.cz>
Michal Suchánek <msuchanek@suse.de> writes:
> On Fri, Nov 05, 2021 at 09:55:52PM +1100, Daniel Axtens wrote:
>> Michal Suchanek <msuchanek@suse.de> writes:
>>
>> > S390 uses appended signature for kernel but implements the check
>> > separately from module loader.
>> >
>> > Support for secure boot on powerpc with appended signature is planned -
>> > grub patches submitted upstream but not yet merged.
>>
>> Power Non-Virtualised / OpenPower already supports secure boot via kexec
>> with signature verification via IMA. I think you have now sent a
>> follow-up series that merges some of the IMA implementation, I just
>> wanted to make sure it was clear that we actually already have support
>
> So is IMA_KEXEC and KEXEC_SIG redundant?
>
> I see some architectures have both. I also see there is a lot of overlap
> between the IMA framework and the KEXEC_SIG and MODULE_SIg.
Mimi would be much better placed than me to answer this.
The limits of my knowledge are basically that signature verification for
modules and kexec kernels can be enforced by IMA policies.
For example a secure booted powerpc kernel with module support will have
the following IMA policy set at the arch level:
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
(in arch/powerpc/kernel/ima_arch.c)
Module signature enforcement can be set with either IMA (policy like
"appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig" )
or with CONFIG_MODULE_SIG_FORCE/module.sig_enforce=1.
Sometimes this leads to arguably unexpected interactions - for example
commit fa4f3f56ccd2 ("powerpc/ima: Fix secure boot rules in ima arch
policy"), so it might be interesting to see if we can make things easier
to understand. I suspect another amusing interaction is that if the IMA
verification is used, the signature could be in an xattr rather than an
appended signature.
Kind regards,
Daniel
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox