* [PATCH kernel] KVM: PPC: Improve KVM reference counting @ 2019-02-21 3:44 Alexey Kardashevskiy 2019-02-21 6:26 ` Michael Ellerman 2019-02-22 9:41 ` Paul Mackerras 0 siblings, 2 replies; 4+ messages in thread From: Alexey Kardashevskiy @ 2019-02-21 3:44 UTC (permalink / raw) To: linuxppc-dev; +Cc: Alexey Kardashevskiy, kvm-ppc, David Gibson The anon fd's ops releases the KVM reference in the release hook. However we reference the KVM object after we create the fd so there is small window when the release function can be called and dereferenced the KVM object which potentially may free it. It is not a problem at the moment as the file is created and KVM is referenced under the KVM lock and the release function obtains the same lock before dereferencing the KVM (although the lock is not held when calling kvm_put_kvm()) but it is a fragile against future changes. This references the KVM object before creating a file. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- The original bug is described here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cfa393811 But in this case kvm_put_kvm() is called straight away with no locks before/after/around. --- arch/powerpc/kvm/book3s_64_vio.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 532ab797..d68c969 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -338,14 +338,15 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, } } + kvm_get_kvm(kvm); if (!ret) ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, stt, O_RDWR | O_CLOEXEC); - if (ret >= 0) { + if (ret >= 0) list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); - kvm_get_kvm(kvm); - } + else + kvm_put_kvm(kvm); mutex_unlock(&kvm->lock); -- 2.17.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH kernel] KVM: PPC: Improve KVM reference counting 2019-02-21 3:44 [PATCH kernel] KVM: PPC: Improve KVM reference counting Alexey Kardashevskiy @ 2019-02-21 6:26 ` Michael Ellerman 2019-02-22 1:33 ` Alexey Kardashevskiy 2019-02-22 9:41 ` Paul Mackerras 1 sibling, 1 reply; 4+ messages in thread From: Michael Ellerman @ 2019-02-21 6:26 UTC (permalink / raw) To: Alexey Kardashevskiy, linuxppc-dev Cc: Alexey Kardashevskiy, kvm-ppc, David Gibson Alexey Kardashevskiy <aik@ozlabs.ru> writes: > The anon fd's ops releases the KVM reference in the release hook. > However we reference the KVM object after we create the fd so there is > small window when the release function can be called and > dereferenced the KVM object which potentially may free it. dereference > > It is not a problem at the moment as the file is created and KVM is > referenced under the KVM lock and the release function obtains the same > lock before dereferencing the KVM (although the lock is not held when > calling kvm_put_kvm()) but it is a fragile against future changes. > > This references the KVM object before creating a file. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > > The original bug is described here: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cfa393811 > But in this case kvm_put_kvm() is called straight away with no locks before/after/around. > > --- > arch/powerpc/kvm/book3s_64_vio.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c > index 532ab797..d68c969 100644 > --- a/arch/powerpc/kvm/book3s_64_vio.c > +++ b/arch/powerpc/kvm/book3s_64_vio.c > @@ -338,14 +338,15 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, > } > } > > + kvm_get_kvm(kvm); > if (!ret) > ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, > stt, O_RDWR | O_CLOEXEC); > > - if (ret >= 0) { > + if (ret >= 0) > list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); > - kvm_get_kvm(kvm); > - } > + else > + kvm_put_kvm(kvm); > > mutex_unlock(&kvm->lock); This looks correct to me. But I feel like the logic could be cleaner, perhaps like this (patch below): mutex_lock(&kvm->lock); /* Check this LIOBN hasn't been previously allocated */ list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) { if (siter->liobn == args->liobn) { ret = -EBUSY; goto fail_unlock; } } kvm_get_kvm(kvm); ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, stt, O_RDWR | O_CLOEXEC); if (ret < 0) { kvm_put_kvm(kvm); goto fail_unlock; } list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); mutex_unlock(&kvm->lock); return ret; fail_unlock: mutex_unlock(&kvm->lock); fail: cheers diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 532ab79734c7..5d74602db0d0 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -296,7 +296,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, struct kvmppc_spapr_tce_table *stt = NULL; struct kvmppc_spapr_tce_table *siter; unsigned long npages, size = args->size; - int ret = -ENOMEM; + int ret; int i; if (!args->size || args->page_shift < 12 || args->page_shift > 34 || @@ -330,28 +330,30 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, mutex_lock(&kvm->lock); /* Check this LIOBN hasn't been previously allocated */ - ret = 0; list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) { if (siter->liobn == args->liobn) { ret = -EBUSY; - break; + goto fail_unlock; } } - if (!ret) - ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, - stt, O_RDWR | O_CLOEXEC); + kvm_get_kvm(kvm); + ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, + stt, O_RDWR | O_CLOEXEC); - if (ret >= 0) { - list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); - kvm_get_kvm(kvm); + if (ret < 0) { + kvm_put_kvm(kvm); + goto fail_unlock; } + list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); + mutex_unlock(&kvm->lock); - if (ret >= 0) - return ret; + return ret; + fail_unlock: + mutex_unlock(&kvm->lock); fail: for (i = 0; i < npages; i++) if (stt->pages[i]) ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH kernel] KVM: PPC: Improve KVM reference counting 2019-02-21 6:26 ` Michael Ellerman @ 2019-02-22 1:33 ` Alexey Kardashevskiy 0 siblings, 0 replies; 4+ messages in thread From: Alexey Kardashevskiy @ 2019-02-22 1:33 UTC (permalink / raw) To: Michael Ellerman, linuxppc-dev; +Cc: kvm-ppc, David Gibson On 21/02/2019 17:26, Michael Ellerman wrote: > Alexey Kardashevskiy <aik@ozlabs.ru> writes: > >> The anon fd's ops releases the KVM reference in the release hook. >> However we reference the KVM object after we create the fd so there is >> small window when the release function can be called and >> dereferenced the KVM object which potentially may free it. > dereference >> >> It is not a problem at the moment as the file is created and KVM is >> referenced under the KVM lock and the release function obtains the same >> lock before dereferencing the KVM (although the lock is not held when >> calling kvm_put_kvm()) but it is a fragile against future changes. >> >> This references the KVM object before creating a file. >> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >> --- >> >> The original bug is described here: >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cfa393811 >> But in this case kvm_put_kvm() is called straight away with no locks before/after/around. >> >> --- >> arch/powerpc/kvm/book3s_64_vio.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c >> index 532ab797..d68c969 100644 >> --- a/arch/powerpc/kvm/book3s_64_vio.c >> +++ b/arch/powerpc/kvm/book3s_64_vio.c >> @@ -338,14 +338,15 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, >> } >> } >> >> + kvm_get_kvm(kvm); >> if (!ret) >> ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, >> stt, O_RDWR | O_CLOEXEC); >> >> - if (ret >= 0) { >> + if (ret >= 0) >> list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); >> - kvm_get_kvm(kvm); >> - } >> + else >> + kvm_put_kvm(kvm); >> >> mutex_unlock(&kvm->lock); > > This looks correct to me. But I feel like the logic could be cleaner, > perhaps like this (patch below): And I feel that 1) your patch tries to hide what it actually does 2) having 2 unlocks for one lock is an invite for future bugs imho, I'd think the whole point of adding new gotos is exactly to have 1 unlock. > > mutex_lock(&kvm->lock); > > /* Check this LIOBN hasn't been previously allocated */ > list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) { > if (siter->liobn == args->liobn) { > ret = -EBUSY; > goto fail_unlock; > } > } > > kvm_get_kvm(kvm); > ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, > stt, O_RDWR | O_CLOEXEC); > > if (ret < 0) { > kvm_put_kvm(kvm); > goto fail_unlock; > } > > list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); > > mutex_unlock(&kvm->lock); > > return ret; > > fail_unlock: > mutex_unlock(&kvm->lock); > fail: > > > cheers > > diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c > index 532ab79734c7..5d74602db0d0 100644 > --- a/arch/powerpc/kvm/book3s_64_vio.c > +++ b/arch/powerpc/kvm/book3s_64_vio.c > @@ -296,7 +296,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, > struct kvmppc_spapr_tce_table *stt = NULL; > struct kvmppc_spapr_tce_table *siter; > unsigned long npages, size = args->size; > - int ret = -ENOMEM; > + int ret; > int i; > > if (!args->size || args->page_shift < 12 || args->page_shift > 34 || > @@ -330,28 +330,30 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, > mutex_lock(&kvm->lock); > > /* Check this LIOBN hasn't been previously allocated */ > - ret = 0; > list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) { > if (siter->liobn == args->liobn) { > ret = -EBUSY; > - break; > + goto fail_unlock; > } > } > > - if (!ret) > - ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, > - stt, O_RDWR | O_CLOEXEC); > + kvm_get_kvm(kvm); > + ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, > + stt, O_RDWR | O_CLOEXEC); > > - if (ret >= 0) { > - list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); > - kvm_get_kvm(kvm); > + if (ret < 0) { > + kvm_put_kvm(kvm); > + goto fail_unlock; > } > > + list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); > + > mutex_unlock(&kvm->lock); > > - if (ret >= 0) > - return ret; > + return ret; > > + fail_unlock: > + mutex_unlock(&kvm->lock); > fail: > for (i = 0; i < npages; i++) > if (stt->pages[i]) > -- Alexey ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH kernel] KVM: PPC: Improve KVM reference counting 2019-02-21 3:44 [PATCH kernel] KVM: PPC: Improve KVM reference counting Alexey Kardashevskiy 2019-02-21 6:26 ` Michael Ellerman @ 2019-02-22 9:41 ` Paul Mackerras 1 sibling, 0 replies; 4+ messages in thread From: Paul Mackerras @ 2019-02-22 9:41 UTC (permalink / raw) To: Alexey Kardashevskiy; +Cc: linuxppc-dev, kvm-ppc, David Gibson On Thu, Feb 21, 2019 at 02:44:14PM +1100, Alexey Kardashevskiy wrote: > The anon fd's ops releases the KVM reference in the release hook. > However we reference the KVM object after we create the fd so there is > small window when the release function can be called and > dereferenced the KVM object which potentially may free it. > > It is not a problem at the moment as the file is created and KVM is > referenced under the KVM lock and the release function obtains the same > lock before dereferencing the KVM (although the lock is not held when > calling kvm_put_kvm()) but it is a fragile against future changes. > > This references the KVM object before creating a file. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Thanks, applied to my kvm-ppc-next tree. Paul. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-02-22 9:51 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-02-21 3:44 [PATCH kernel] KVM: PPC: Improve KVM reference counting Alexey Kardashevskiy 2019-02-21 6:26 ` Michael Ellerman 2019-02-22 1:33 ` Alexey Kardashevskiy 2019-02-22 9:41 ` Paul Mackerras
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).