From mboxrd@z Thu Jan  1 00:00:00 1970
From: james.morse@arm.com (James Morse)
Date: Wed, 20 Apr 2016 12:19:38 +0100
Subject: [PATCH v7 07/16] arm64: kvm: allows kvm cpu hotplug
In-Reply-To: <20160420102944.GD1234@linaro.org>
References: <1459529620-22150-1-git-send-email-james.morse@arm.com>
 <1459529620-22150-8-git-send-email-james.morse@arm.com>
 <571656E9.5050402@arm.com>
 <57166CC9.4030804@arm.com>
 <20160420102944.GD1234@linaro.org>
Message-ID: <571765CA.1070009@arm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

Hi,

On 20/04/16 11:29, AKASHI Takahiro wrote:
> On Tue, Apr 19, 2016 at 06:37:13PM +0100, James Morse wrote:
>> On 19/04/16 17:03, Marc Zyngier wrote:
>>> On 01/04/16 17:53, James Morse wrote:
>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>> index b5384311dec4..962904a443be 100644
>>>> --- a/arch/arm/kvm/arm.c
>>>> +++ b/arch/arm/kvm/arm.c
>>>> @@ -591,7 +587,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>>  		/*
>>>>  		 * Re-check atomic conditions
>>>>  		 */
>>>> -		if (signal_pending(current)) {
>>>> +		if (unlikely(!__this_cpu_read(kvm_arm_hardware_enabled))) {
>>>> +			/* cpu has been torn down */
>>>> +			ret = 0;
>>>> +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>>>> +			run->fail_entry.hardware_entry_failure_reason
>>>> +				= (u64)-ENOEXEC;
>>>
>>> This hunk makes me feel a bit uneasy. Having to check something that
>>> critical on the entry path is at least a bit weird. If we've reset EL2
>>> already, it means that we must have forced an exit on the guest to do
>>> so.
>>
>> (To save anyone else digging: the story comes from v12 of the kexec
>> copy of this patch [0])
>>
>>> So why do we hand the control back to KVM (or anything else) once
>>> we've nuked a CPU? I'd expect it to be put on some back-burner, never
>>> to return in this lifetime...
>>
>> This looks like the normal reboot code being called in the middle of a
>> running system.
>> Kexec calls kernel_restart_prepare(), which kicks each reboot
>> notifier, one of which is kvm_reboot(), which calls:
>>> 	on_each_cpu(hardware_disable_nolock, NULL, 1);
>>
>> We have to give the CPU back as there may be other reboot notifiers,
>> and kexec hasn't yet shuffled onto the boot cpu.
>
> Right, and this kvm reboot notifier can be executed via IPI at any time
> while interrupts are enabled, so the check must be done between
> local_irq_disable() and local_irq_enable().

Good point, this makes it really nasty!

>> How about moving this check into handle_exit()[1]?
>> Currently this sees ARM_EXCEPTION_IRQ and tries to re-enter the guest;
>> we could test for kvm_rebooting, which is set by kvm's reboot notifier
>> ... but this would still break if another vcpu wakes from
>> cond_resched() and sprints towards __kvm_vcpu_run(). Can we move
>> cond_resched() to immediately before handle_exit()?
>
> I don't think it would break, as the reboot process is
> *one-directional* and any cpu will be torn down sooner or later.

I think we can schedule quite a few vcpus in the meantime though: the CPU
that called kvm_reboot() will wait until each cpu has returned from
hardware_disable_nolock(); eventually it will call
disable_non_boot_cpus(), and in the meantime the scheduler is free to
pick threads to run.

> I'm not quite sure about Marc's point, but if his concern is the
> overhead of checking the per-cpu kvm_arm_hardware_enabled, we could
> instead check kvm_rebooting.
> (And this check won't make sense for VHE-enabled platforms.)

Gah, VHE. I think Marc's suggestion to make it an exception returned from
the hyp-stub is best. I need to switch over to kexec to test it
though....

Thanks,

James