* [PATCH 0/2] module_refcounting and anonymous inodes
@ 2008-12-02 10:14 Christian Borntraeger
2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Christian Borntraeger @ 2008-12-02 10:14 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, LKML
Hello Avi,
here is the latest respin of my fixes for the kvm module unload problem:
[PATCH 1/2] anon_inodes: use fops->owner for module refcount
[PATCH 2/2] kvm: set owner of cpu and vm file operations
Both patches fix module reference counting problems and only matter for module
unload - nothing critical.
Tested on s390 and x86_32.
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH 1/2] anon_inodes: use fops->owner for module refcount 2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger @ 2008-12-02 10:16 ` Christian Borntraeger 2008-12-02 10:17 ` [PATCH 2/2] kvm: set owner of cpu and vm file operations Christian Borntraeger 2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger 2 siblings, 0 replies; 12+ messages in thread From: Christian Borntraeger @ 2008-12-02 10:16 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, LKML, Davide Libenzi There is an imbalance for anonymous inodes. If the fops->owner field is set, the module reference count of owner is decreases on release. ("filp_close" --> "__fput" ---> "fops_put") On the other hand, anon_inode_getfd does not increase the module reference count of owner. This causes two problems: - if owner is set, the module refcount goes negative - if owner is not set, the module can be unloaded while code is running This patch changes anon_inode_getfd to be symmetric regarding fops->owner handling. I have checked all existing users of anon_inode_getfd. Noone sets fops->owner, thats why nobody has seen the module refcount negative. The refcounting was tested with a patched and unpatched KVM module.(see patch 2/2) I also did an epoll_open/close test. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Davide Libenzi <davidel@xmailserver.org> --- fs/anon_inodes.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) Index: kvm/fs/anon_inodes.c =================================================================== --- kvm.orig/fs/anon_inodes.c +++ kvm/fs/anon_inodes.c @@ -79,9 +79,12 @@ int anon_inode_getfd(const char *name, c if (IS_ERR(anon_inode_inode)) return -ENODEV; + if (fops->owner && !try_module_get(fops->owner)) + return -ENOENT; + error = get_unused_fd_flags(flags); if (error < 0) - return error; + goto err_module; fd = error; /* @@ -128,6 +131,8 @@ err_dput: dput(dentry); err_put_unused_fd: put_unused_fd(fd); +err_module: + module_put(fops->owner); return error; } EXPORT_SYMBOL_GPL(anon_inode_getfd); ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] kvm: set owner of cpu and vm file operations 2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger 2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger @ 2008-12-02 10:17 ` Christian Borntraeger 2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger 2 siblings, 0 replies; 12+ messages in thread From: Christian Borntraeger @ 2008-12-02 10:17 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, LKML There is a race between a "close of the file descriptors" and module unload in the kvm module. You can easily trigger this problem by applying this debug patch: >--- kvm.orig/virt/kvm/kvm_main.c >+++ kvm/virt/kvm/kvm_main.c >@@ -648,10 +648,14 @@ void kvm_free_physmem(struct kvm *kvm) > kvm_free_physmem_slot(&kvm->memslots[i], NULL); > } > >+#include <linux/delay.h> > static void kvm_destroy_vm(struct kvm *kvm) > { > struct mm_struct *mm = kvm->mm; > >+ printk("off1\n"); >+ msleep(5000); >+ printk("off2\n"); > spin_lock(&kvm_lock); > list_del(&kvm->vm_list); > spin_unlock(&kvm_lock); and killing the userspace, followed by an rmmod. The problem is that kvm_destroy_vm can run while the module count is 0. That means, you can remove the module while kvm_destroy_vm is running. But kvm_destroy_vm is part of the module text. This causes a kerneloops. The race exists without the msleep but is much harder to trigger. This patch requires the fix for anon_inodes (anon_inodes: use fops->owner for module refcount). With this patch, we can set the owner of all anonymous KVM inodes file operations. The VFS will then control the KVM module refcount as long as there is an open file. kvm_destroy_vm will be called by the release function of the last closed file - before the VFS drops the module refcount. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> --- virt/kvm/kvm_main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) Index: kvm/virt/kvm/kvm_main.c =================================================================== --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1303,7 +1303,7 @@ static int kvm_vcpu_release(struct inode return 0; } -static const struct file_operations kvm_vcpu_fops = { +static struct file_operations kvm_vcpu_fops = { .release = kvm_vcpu_release, .unlocked_ioctl = kvm_vcpu_ioctl, .compat_ioctl = kvm_vcpu_ioctl, @@ -1697,7 +1697,7 @@ static int kvm_vm_mmap(struct file *file return 0; } -static const struct file_operations kvm_vm_fops = { +static struct file_operations kvm_vm_fops = { .release = kvm_vm_release, .unlocked_ioctl = kvm_vm_ioctl, .compat_ioctl = kvm_vm_ioctl, @@ -2061,6 +2061,8 @@ int kvm_init(void *opaque, unsigned int } kvm_chardev_ops.owner = module; + kvm_vm_fops.owner = module; + kvm_vcpu_fops.owner = module; r = misc_register(&kvm_dev); if (r) { ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger 2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger 2008-12-02 10:17 ` [PATCH 2/2] kvm: set owner of cpu and vm file operations Christian Borntraeger @ 2008-12-08 11:51 ` Christian Borntraeger 2008-12-08 11:57 ` Avi Kivity 2 siblings, 1 reply; 12+ messages in thread From: Christian Borntraeger @ 2008-12-08 11:51 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, LKML Am Dienstag, 2. Dezember 2008 schrieb Christian Borntraeger: > Hello Avi, > > here is the latest respin of my fixes for the kvm module unload problem: > > [PATCH 1/2] anon_inodes: use fops->owner for module refcount > [PATCH 2/2] kvm: set owner of cpu and vm file operations In the meantime patch 2 has a offset against the latest git. Should I resend the patch or will you apply it anyway? Christian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger @ 2008-12-08 11:57 ` Avi Kivity 2008-12-09 9:43 ` Sheng Yang 0 siblings, 1 reply; 12+ messages in thread From: Avi Kivity @ 2008-12-08 11:57 UTC (permalink / raw) To: Christian Borntraeger; +Cc: kvm, LKML Christian Borntraeger wrote: > Am Dienstag, 2. Dezember 2008 schrieb Christian Borntraeger: > >> Hello Avi, >> >> here is the latest respin of my fixes for the kvm module unload problem: >> >> [PATCH 1/2] anon_inodes: use fops->owner for module refcount >> [PATCH 2/2] kvm: set owner of cpu and vm file operations >> > > In the meantime patch 2 has a offset against the latest git. Should I resend > the patch or will you apply it anyway? > Applied the patches; thanks for the reminder. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-08 11:57 ` Avi Kivity @ 2008-12-09 9:43 ` Sheng Yang 2008-12-09 9:51 ` Avi Kivity 0 siblings, 1 reply; 12+ messages in thread From: Sheng Yang @ 2008-12-09 9:43 UTC (permalink / raw) To: kvm; +Cc: Avi Kivity, Christian Borntraeger, LKML On Monday 08 December 2008 19:57:14 Avi Kivity wrote: > Christian Borntraeger wrote: > > Am Dienstag, 2. Dezember 2008 schrieb Christian Borntraeger: > >> Hello Avi, > >> > >> here is the latest respin of my fixes for the kvm module unload problem: > >> > >> [PATCH 1/2] anon_inodes: use fops->owner for module refcount > >> [PATCH 2/2] kvm: set owner of cpu and vm file operations > > > > In the meantime patch 2 has a offset against the latest git. Should I > > resend the patch or will you apply it anyway? > > Applied the patches; thanks for the reminder. Should we push the first patch to 2.6.28? I got some trouble with the separate 2nd patch, for I am using Linus' tree and make KVM as modules, so the reference count reduced to negative now... (Oh Avi, I know you suggest to use in kernel rather than modules, but module is indeed convenient. :) ) -- regards Yang, Sheng ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-09 9:43 ` Sheng Yang @ 2008-12-09 9:51 ` Avi Kivity 2008-12-09 10:08 ` Christian Borntraeger 0 siblings, 1 reply; 12+ messages in thread From: Avi Kivity @ 2008-12-09 9:51 UTC (permalink / raw) To: Sheng Yang; +Cc: kvm, Christian Borntraeger, LKML Sheng Yang wrote: > Should we push the first patch to 2.6.28? It's not a recent regression, so no. > I got some trouble with the separate > 2nd patch, for I am using Linus' tree and make KVM as modules, so the > reference count reduced to negative now... (Oh Avi, I know you suggest to use > in kernel rather than modules, but module is indeed convenient. :) ) > Right, that would affect everyone. What we need is to hack the second patch for external modules on <2.6.29. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-09 9:51 ` Avi Kivity @ 2008-12-09 10:08 ` Christian Borntraeger 2008-12-09 10:15 ` Avi Kivity 0 siblings, 1 reply; 12+ messages in thread From: Christian Borntraeger @ 2008-12-09 10:08 UTC (permalink / raw) To: Avi Kivity; +Cc: Sheng Yang, kvm, LKML Am Dienstag, 9. Dezember 2008 schrieb Avi Kivity: > Sheng Yang wrote: > > Should we push the first patch to 2.6.28? > > It's not a recent regression, so no. > > > I got some trouble with the separate > > 2nd patch, for I am using Linus' tree and make KVM as modules, so the > > reference count reduced to negative now... (Oh Avi, I know you suggest to use > > in kernel rather than modules, but module is indeed convenient. :) ) > > > > Right, that would affect everyone. What we need is to hack the second > patch for external modules on <2.6.29. Oh this is tricky. Both patches belong together, patch 2 depends on patch 1. For base kernels which do not contain patch1, this additional (untested) patch would probably help: --- virt/kvm/kvm_main.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) Index: kvm/virt/kvm/kvm_main.c =================================================================== --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1501,9 +1501,15 @@ static struct file_operations kvm_vcpu_f */ static int create_vcpu_fd(struct kvm_vcpu *vcpu) { - int fd = anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, 0); - if (fd < 0) + int fd; + + if (!try_module_get(kvm_vcpu_fops.owner)) + return -ENOENT; + fd = anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, 0); + if (fd < 0) { kvm_put_kvm(vcpu->kvm); + module_put(kvm_vcpu_fops.owner); + } return fd; } @@ -1895,12 +1901,19 @@ static int kvm_dev_ioctl_create_vm(void) int fd; struct kvm *kvm; + if (!try_module_get(kvm_vm_fops.owner)) + return -ENOENT; + kvm = kvm_create_vm(); - if (IS_ERR(kvm)) + if (IS_ERR(kvm)) { + module_put(kvm_vm_fops.owner); return PTR_ERR(kvm); + } fd = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, 0); - if (fd < 0) + if (fd < 0) { kvm_put_kvm(kvm); + module_put(kvm_vm_fops.owner); + } return fd; } The problem is, how do you detect if the base kernel has patch1 applied? Christian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-09 10:08 ` Christian Borntraeger @ 2008-12-09 10:15 ` Avi Kivity 2008-12-09 13:22 ` Avi Kivity 0 siblings, 1 reply; 12+ messages in thread From: Avi Kivity @ 2008-12-09 10:15 UTC (permalink / raw) To: Christian Borntraeger; +Cc: Sheng Yang, kvm, LKML Christian Borntraeger wrote: > The problem is, how do you detect if the base kernel has patch1 applied? > In the external module compatibility kit, implement two versions of anon_inode_getfd(), and select the appropriate one according to kernel version (like we currently support 91 smp_call_function_single variants). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-09 10:15 ` Avi Kivity @ 2008-12-09 13:22 ` Avi Kivity 2008-12-10 2:05 ` Sheng Yang 0 siblings, 1 reply; 12+ messages in thread From: Avi Kivity @ 2008-12-09 13:22 UTC (permalink / raw) To: Christian Borntraeger; +Cc: Sheng Yang, kvm, LKML Avi Kivity wrote: > Christian Borntraeger wrote: >> The problem is, how do you detect if the base kernel has patch1 applied? >> > > In the external module compatibility kit, implement two versions of > anon_inode_getfd(), and select the appropriate one according to kernel > version (like we currently support 91 smp_call_function_single variants). > I committed something simpler: if running on 2.6.28 or earlier, I hack out the two lines your second patch adds.\ -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-09 13:22 ` Avi Kivity @ 2008-12-10 2:05 ` Sheng Yang 2008-12-10 9:12 ` Avi Kivity 0 siblings, 1 reply; 12+ messages in thread From: Sheng Yang @ 2008-12-10 2:05 UTC (permalink / raw) To: Avi Kivity; +Cc: Christian Borntraeger, kvm, LKML On Tuesday 09 December 2008 21:22:42 Avi Kivity wrote: > Avi Kivity wrote: > > Christian Borntraeger wrote: > >> The problem is, how do you detect if the base kernel has patch1 applied? > > > > In the external module compatibility kit, implement two versions of > > anon_inode_getfd(), and select the appropriate one according to kernel > > version (like we currently support 91 smp_call_function_single variants). > > I committed something simpler: if running on 2.6.28 or earlier, I hack > out the two lines your second patch adds.\ Good hack! > diff --git a/kernel/external-module-compat-comm.h > b/kernel/external-module-compat-comm.h index a089f62..d90522d 100644 > --- a/kernel/external-module-compat-comm.h > +++ b/kernel/external-module-compat-comm.h > @@ -682,3 +682,13 @@ static inline void cpumask_clear_cpu(int cpu, > cpumask_var_t mask) > > #endif > > +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28) But should it be 2,6,29?... -- regards Yang, Sheng > + > +#define IF_ANON_INODES_DOES_REFCOUNTS(x) > + > +#else > + > +#define IF_ANON_INODES_DOES_REFCOUNTS(x) x > + > +#endif > + ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes 2008-12-10 2:05 ` Sheng Yang @ 2008-12-10 9:12 ` Avi Kivity 0 siblings, 0 replies; 12+ messages in thread From: Avi Kivity @ 2008-12-10 9:12 UTC (permalink / raw) To: Sheng Yang; +Cc: Christian Borntraeger, kvm, LKML Sheng Yang wrote: >> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28) >> > > But should it be 2,6,29?... > > Thanks, fixed. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-12-10 9:13 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger 2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger 2008-12-02 10:17 ` [PATCH 2/2] kvm: set owner of cpu and vm file operations Christian Borntraeger 2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger 2008-12-08 11:57 ` Avi Kivity 2008-12-09 9:43 ` Sheng Yang 2008-12-09 9:51 ` Avi Kivity 2008-12-09 10:08 ` Christian Borntraeger 2008-12-09 10:15 ` Avi Kivity 2008-12-09 13:22 ` Avi Kivity 2008-12-10 2:05 ` Sheng Yang 2008-12-10 9:12 ` Avi Kivity
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).