From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marc Zyngier Date: Fri, 04 Jun 2021 10:03:50 +0000 Subject: Re: [RFC][PATCH] kvm: add suspend pm-notifier Message-Id: <87tumeymih.wl-maz@kernel.org> List-Id: References: <20210603164315.682994-1-senozhatsky@chromium.org> <87v96uyq2v.wl-maz@kernel.org> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Sergey Senozhatsky Cc: Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Jim Mattson , Huacai Chen , Paul Mackerras , Christian Borntraeger , Suleiman Souhlal , x86@kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-s390@vger.kernel.org On Fri, 04 Jun 2021 10:20:28 +0100, Sergey Senozhatsky wrote: > > On (21/06/04 09:46), Marc Zyngier wrote: > [..] > > > +void kvm_arch_pm_notifier(struct kvm *kvm) > > > +{ > > > +#ifdef CONFIG_PM > > > + int c; > > > + > > > + mutex_lock(&kvm->lock); > > > + for (c = 0; c < kvm->created_vcpus; c++) { > > > + struct kvm_vcpu *vcpu = kvm->vcpus[c]; > > > + int r; > > > + > > > + if (!vcpu) > > > + continue; > > > > Wouldn't kvm_for_each_vcpu() avoid this kind of checks? > > Right, that's what I do in v2, which I haven't posted yet. > > [..] > > > +#include > > > + > > > #ifndef KVM_MAX_VCPU_ID > > > #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS > > > #endif > > > @@ -579,6 +581,10 @@ struct kvm { > > > pid_t userspace_pid; > > > unsigned int max_halt_poll_ns; > > > u32 dirty_ring_size; > > > + > > > +#ifdef CONFIG_PM > > > + struct notifier_block pm_notifier; > > > +#endif > > > > I'd certainly like to be able to opt out from this on architectures > > that do not implement anything useful in the PM callbacks. > > Well on the other hand PM-callbacks are harmless on those archs, they > won't overload the __weak function. I don't care much for the callbacks. But struct kvm is bloated enough, and I'd be happy not to have this structure embedded in it if I can avoid it. > > > Please consider making this an independent config option that individual > > archs can buy into. > > Sure, I can take a look into this, but how is this better than __weak > function? (that's a real question) Weak functions are OK AFAIC. More crud in struct kvm is what I'm not OK with. > > [..] > > > +#ifdef CONFIG_PM > > > +static int kvm_pm_notifier_call(struct notifier_block *bl, > > > + unsigned long state, > > > + void *unused) > > > +{ > > > + struct kvm *kvm = container_of(bl, struct kvm, pm_notifier); > > > + > > > + switch (state) { > > > + case PM_HIBERNATION_PREPARE: > > > + case PM_SUSPEND_PREPARE: > > > + kvm_arch_pm_notifier(kvm); > > > > How about passing the state to the notifier callback? I'd expect it to > > be useful to do something on resume too. > > For different states we can have different kvm_arch functions instead. > kvm_arch_pm_notifier() can be renamed to kvm_arch_suspend_notifier(), > so that we don't need to have `switch (state)` in every arch-code. Then > for resume/post resume states we can have kvm_arch_resume_notifier() > arch functions. I'd rather we keep an arch API that is similar to the one the rest of the kernel has, instead of a flurry of small helpers that need to grow each time someone adds a new PM state. A switch() in the arch-specific implementation is absolutely fine. > > > > + break; > > > + } > > > + return NOTIFY_DONE; > > > +} > > > + > > > +static void kvm_init_pm_notifier(struct kvm *kvm) > > > +{ > > > + kvm->pm_notifier.notifier_call = kvm_pm_notifier_call; > > > + kvm->pm_notifier.priority = INT_MAX; > > > > How is this priority determined? > > Nothing magical here. I want this to be executed first, before we suspend > ftrace, RCU and the like. Besides KVM is usually the last one to register > its PM callbacks, so there can be something on the notifier list that > returns NOTIFY_STOP_MASK in front of KVM PM-notifier list entry. Which begs the question: should arch-specific code be able to veto suspend and return an error itself? Always returning NOTIFY_DONE seems highly optimistic. M. -- Without deviation from the norm, progress is not possible.