* [PATCH 0/2] module_refcounting and anonymous inodes
@ 2008-12-02 10:14 Christian Borntraeger
2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Christian Borntraeger @ 2008-12-02 10:14 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, LKML
Hello Avi,
here is the latest respin of my fixes for the kvm module unload problem:
[PATCH 1/2] anon_inodes: use fops->owner for module refcount
[PATCH 2/2] kvm: set owner of cpu and vm file operations
Both patches fix module reference counting problems and only matter for module
unload - nothing critical.
Tested on s390 and x86_32.
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] anon_inodes: use fops->owner for module refcount
2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
@ 2008-12-02 10:16 ` Christian Borntraeger
2008-12-02 10:17 ` [PATCH 2/2] kvm: set owner of cpu and vm file operations Christian Borntraeger
2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
2 siblings, 0 replies; 12+ messages in thread
From: Christian Borntraeger @ 2008-12-02 10:16 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, LKML, Davide Libenzi
There is an imbalance for anonymous inodes. If the fops->owner field is set,
the module reference count of owner is decreases on release.
("filp_close" --> "__fput" ---> "fops_put")
On the other hand, anon_inode_getfd does not increase the module reference
count of owner. This causes two problems:
- if owner is set, the module refcount goes negative
- if owner is not set, the module can be unloaded while code is running
This patch changes anon_inode_getfd to be symmetric regarding fops->owner
handling.
I have checked all existing users of anon_inode_getfd. Noone sets fops->owner,
thats why nobody has seen the module refcount negative. The refcounting was
tested with a patched and unpatched KVM module.(see patch 2/2) I also did an
epoll_open/close test.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Davide Libenzi <davidel@xmailserver.org>
---
fs/anon_inodes.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Index: kvm/fs/anon_inodes.c
===================================================================
--- kvm.orig/fs/anon_inodes.c
+++ kvm/fs/anon_inodes.c
@@ -79,9 +79,12 @@ int anon_inode_getfd(const char *name, c
if (IS_ERR(anon_inode_inode))
return -ENODEV;
+ if (fops->owner && !try_module_get(fops->owner))
+ return -ENOENT;
+
error = get_unused_fd_flags(flags);
if (error < 0)
- return error;
+ goto err_module;
fd = error;
/*
@@ -128,6 +131,8 @@ err_dput:
dput(dentry);
err_put_unused_fd:
put_unused_fd(fd);
+err_module:
+ module_put(fops->owner);
return error;
}
EXPORT_SYMBOL_GPL(anon_inode_getfd);
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] kvm: set owner of cpu and vm file operations
2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger
@ 2008-12-02 10:17 ` Christian Borntraeger
2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
2 siblings, 0 replies; 12+ messages in thread
From: Christian Borntraeger @ 2008-12-02 10:17 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, LKML
There is a race between a "close of the file descriptors" and module
unload in the kvm module.
You can easily trigger this problem by applying this debug patch:
>--- kvm.orig/virt/kvm/kvm_main.c
>+++ kvm/virt/kvm/kvm_main.c
>@@ -648,10 +648,14 @@ void kvm_free_physmem(struct kvm *kvm)
> kvm_free_physmem_slot(&kvm->memslots[i], NULL);
> }
>
>+#include <linux/delay.h>
> static void kvm_destroy_vm(struct kvm *kvm)
> {
> struct mm_struct *mm = kvm->mm;
>
>+ printk("off1\n");
>+ msleep(5000);
>+ printk("off2\n");
> spin_lock(&kvm_lock);
> list_del(&kvm->vm_list);
> spin_unlock(&kvm_lock);
and killing the userspace, followed by an rmmod.
The problem is that kvm_destroy_vm can run while the module count
is 0. That means, you can remove the module while kvm_destroy_vm
is running. But kvm_destroy_vm is part of the module text. This
causes a kerneloops. The race exists without the msleep but is much
harder to trigger.
This patch requires the fix for anon_inodes (anon_inodes: use fops->owner
for module refcount).
With this patch, we can set the owner of all anonymous KVM inodes file
operations. The VFS will then control the KVM module refcount as long as there
is an open file. kvm_destroy_vm will be called by the release function of the
last closed file - before the VFS drops the module refcount.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
virt/kvm/kvm_main.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Index: kvm/virt/kvm/kvm_main.c
===================================================================
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -1303,7 +1303,7 @@ static int kvm_vcpu_release(struct inode
return 0;
}
-static const struct file_operations kvm_vcpu_fops = {
+static struct file_operations kvm_vcpu_fops = {
.release = kvm_vcpu_release,
.unlocked_ioctl = kvm_vcpu_ioctl,
.compat_ioctl = kvm_vcpu_ioctl,
@@ -1697,7 +1697,7 @@ static int kvm_vm_mmap(struct file *file
return 0;
}
-static const struct file_operations kvm_vm_fops = {
+static struct file_operations kvm_vm_fops = {
.release = kvm_vm_release,
.unlocked_ioctl = kvm_vm_ioctl,
.compat_ioctl = kvm_vm_ioctl,
@@ -2061,6 +2061,8 @@ int kvm_init(void *opaque, unsigned int
}
kvm_chardev_ops.owner = module;
+ kvm_vm_fops.owner = module;
+ kvm_vcpu_fops.owner = module;
r = misc_register(&kvm_dev);
if (r) {
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger
2008-12-02 10:17 ` [PATCH 2/2] kvm: set owner of cpu and vm file operations Christian Borntraeger
@ 2008-12-08 11:51 ` Christian Borntraeger
2008-12-08 11:57 ` Avi Kivity
2 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2008-12-08 11:51 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, LKML
Am Dienstag, 2. Dezember 2008 schrieb Christian Borntraeger:
> Hello Avi,
>
> here is the latest respin of my fixes for the kvm module unload problem:
>
> [PATCH 1/2] anon_inodes: use fops->owner for module refcount
> [PATCH 2/2] kvm: set owner of cpu and vm file operations
In the meantime patch 2 has a offset against the latest git. Should I resend
the patch or will you apply it anyway?
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
@ 2008-12-08 11:57 ` Avi Kivity
2008-12-09 9:43 ` Sheng Yang
0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-12-08 11:57 UTC (permalink / raw)
To: Christian Borntraeger; +Cc: kvm, LKML
Christian Borntraeger wrote:
> Am Dienstag, 2. Dezember 2008 schrieb Christian Borntraeger:
>
>> Hello Avi,
>>
>> here is the latest respin of my fixes for the kvm module unload problem:
>>
>> [PATCH 1/2] anon_inodes: use fops->owner for module refcount
>> [PATCH 2/2] kvm: set owner of cpu and vm file operations
>>
>
> In the meantime patch 2 has a offset against the latest git. Should I resend
> the patch or will you apply it anyway?
>
Applied the patches; thanks for the reminder.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-08 11:57 ` Avi Kivity
@ 2008-12-09 9:43 ` Sheng Yang
2008-12-09 9:51 ` Avi Kivity
0 siblings, 1 reply; 12+ messages in thread
From: Sheng Yang @ 2008-12-09 9:43 UTC (permalink / raw)
To: kvm; +Cc: Avi Kivity, Christian Borntraeger, LKML
On Monday 08 December 2008 19:57:14 Avi Kivity wrote:
> Christian Borntraeger wrote:
> > Am Dienstag, 2. Dezember 2008 schrieb Christian Borntraeger:
> >> Hello Avi,
> >>
> >> here is the latest respin of my fixes for the kvm module unload problem:
> >>
> >> [PATCH 1/2] anon_inodes: use fops->owner for module refcount
> >> [PATCH 2/2] kvm: set owner of cpu and vm file operations
> >
> > In the meantime patch 2 has a offset against the latest git. Should I
> > resend the patch or will you apply it anyway?
>
> Applied the patches; thanks for the reminder.
Should we push the first patch to 2.6.28? I got some trouble with the separate
2nd patch, for I am using Linus' tree and make KVM as modules, so the
reference count reduced to negative now... (Oh Avi, I know you suggest to use
in kernel rather than modules, but module is indeed convenient. :) )
--
regards
Yang, Sheng
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-09 9:43 ` Sheng Yang
@ 2008-12-09 9:51 ` Avi Kivity
2008-12-09 10:08 ` Christian Borntraeger
0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-12-09 9:51 UTC (permalink / raw)
To: Sheng Yang; +Cc: kvm, Christian Borntraeger, LKML
Sheng Yang wrote:
> Should we push the first patch to 2.6.28?
It's not a recent regression, so no.
> I got some trouble with the separate
> 2nd patch, for I am using Linus' tree and make KVM as modules, so the
> reference count reduced to negative now... (Oh Avi, I know you suggest to use
> in kernel rather than modules, but module is indeed convenient. :) )
>
Right, that would affect everyone. What we need is to hack the second
patch for external modules on <2.6.29.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-09 9:51 ` Avi Kivity
@ 2008-12-09 10:08 ` Christian Borntraeger
2008-12-09 10:15 ` Avi Kivity
0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2008-12-09 10:08 UTC (permalink / raw)
To: Avi Kivity; +Cc: Sheng Yang, kvm, LKML
Am Dienstag, 9. Dezember 2008 schrieb Avi Kivity:
> Sheng Yang wrote:
> > Should we push the first patch to 2.6.28?
>
> It's not a recent regression, so no.
>
> > I got some trouble with the separate
> > 2nd patch, for I am using Linus' tree and make KVM as modules, so the
> > reference count reduced to negative now... (Oh Avi, I know you suggest to
use
> > in kernel rather than modules, but module is indeed convenient. :) )
> >
>
> Right, that would affect everyone. What we need is to hack the second
> patch for external modules on <2.6.29.
Oh this is tricky. Both patches belong together, patch 2 depends on patch 1.
For base kernels which do not contain patch1, this additional (untested) patch
would probably help:
---
virt/kvm/kvm_main.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
Index: kvm/virt/kvm/kvm_main.c
===================================================================
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -1501,9 +1501,15 @@ static struct file_operations kvm_vcpu_f
*/
static int create_vcpu_fd(struct kvm_vcpu *vcpu)
{
- int fd = anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, 0);
- if (fd < 0)
+ int fd;
+
+ if (!try_module_get(kvm_vcpu_fops.owner))
+ return -ENOENT;
+ fd = anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, 0);
+ if (fd < 0) {
kvm_put_kvm(vcpu->kvm);
+ module_put(kvm_vcpu_fops.owner);
+ }
return fd;
}
@@ -1895,12 +1901,19 @@ static int kvm_dev_ioctl_create_vm(void)
int fd;
struct kvm *kvm;
+ if (!try_module_get(kvm_vm_fops.owner))
+ return -ENOENT;
+
kvm = kvm_create_vm();
- if (IS_ERR(kvm))
+ if (IS_ERR(kvm)) {
+ module_put(kvm_vm_fops.owner);
return PTR_ERR(kvm);
+ }
fd = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, 0);
- if (fd < 0)
+ if (fd < 0) {
kvm_put_kvm(kvm);
+ module_put(kvm_vm_fops.owner);
+ }
return fd;
}
The problem is, how do you detect if the base kernel has patch1 applied?
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-09 10:08 ` Christian Borntraeger
@ 2008-12-09 10:15 ` Avi Kivity
2008-12-09 13:22 ` Avi Kivity
0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-12-09 10:15 UTC (permalink / raw)
To: Christian Borntraeger; +Cc: Sheng Yang, kvm, LKML
Christian Borntraeger wrote:
> The problem is, how do you detect if the base kernel has patch1 applied?
>
In the external module compatibility kit, implement two versions of
anon_inode_getfd(), and select the appropriate one according to kernel
version (like we currently support 91 smp_call_function_single variants).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-09 10:15 ` Avi Kivity
@ 2008-12-09 13:22 ` Avi Kivity
2008-12-10 2:05 ` Sheng Yang
0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-12-09 13:22 UTC (permalink / raw)
To: Christian Borntraeger; +Cc: Sheng Yang, kvm, LKML
Avi Kivity wrote:
> Christian Borntraeger wrote:
>> The problem is, how do you detect if the base kernel has patch1 applied?
>>
>
> In the external module compatibility kit, implement two versions of
> anon_inode_getfd(), and select the appropriate one according to kernel
> version (like we currently support 91 smp_call_function_single variants).
>
I committed something simpler: if running on 2.6.28 or earlier, I hack
out the two lines your second patch adds.\
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-09 13:22 ` Avi Kivity
@ 2008-12-10 2:05 ` Sheng Yang
2008-12-10 9:12 ` Avi Kivity
0 siblings, 1 reply; 12+ messages in thread
From: Sheng Yang @ 2008-12-10 2:05 UTC (permalink / raw)
To: Avi Kivity; +Cc: Christian Borntraeger, kvm, LKML
On Tuesday 09 December 2008 21:22:42 Avi Kivity wrote:
> Avi Kivity wrote:
> > Christian Borntraeger wrote:
> >> The problem is, how do you detect if the base kernel has patch1 applied?
> >
> > In the external module compatibility kit, implement two versions of
> > anon_inode_getfd(), and select the appropriate one according to kernel
> > version (like we currently support 91 smp_call_function_single variants).
>
> I committed something simpler: if running on 2.6.28 or earlier, I hack
> out the two lines your second patch adds.\
Good hack!
> diff --git a/kernel/external-module-compat-comm.h
> b/kernel/external-module-compat-comm.h index a089f62..d90522d 100644
> --- a/kernel/external-module-compat-comm.h
> +++ b/kernel/external-module-compat-comm.h
> @@ -682,3 +682,13 @@ static inline void cpumask_clear_cpu(int cpu,
> cpumask_var_t mask)
>
> #endif
>
> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
But should it be 2,6,29?...
--
regards
Yang, Sheng
> +
> +#define IF_ANON_INODES_DOES_REFCOUNTS(x)
> +
> +#else
> +
> +#define IF_ANON_INODES_DOES_REFCOUNTS(x) x
> +
> +#endif
> +
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] module_refcounting and anonymous inodes
2008-12-10 2:05 ` Sheng Yang
@ 2008-12-10 9:12 ` Avi Kivity
0 siblings, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2008-12-10 9:12 UTC (permalink / raw)
To: Sheng Yang; +Cc: Christian Borntraeger, kvm, LKML
Sheng Yang wrote:
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
>>
>
> But should it be 2,6,29?...
>
>
Thanks, fixed.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-12-10 9:13 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-02 10:14 [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
2008-12-02 10:16 ` [PATCH 1/2] anon_inodes: use fops->owner for module refcount Christian Borntraeger
2008-12-02 10:17 ` [PATCH 2/2] kvm: set owner of cpu and vm file operations Christian Borntraeger
2008-12-08 11:51 ` [PATCH 0/2] module_refcounting and anonymous inodes Christian Borntraeger
2008-12-08 11:57 ` Avi Kivity
2008-12-09 9:43 ` Sheng Yang
2008-12-09 9:51 ` Avi Kivity
2008-12-09 10:08 ` Christian Borntraeger
2008-12-09 10:15 ` Avi Kivity
2008-12-09 13:22 ` Avi Kivity
2008-12-10 2:05 ` Sheng Yang
2008-12-10 9:12 ` Avi Kivity
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).