* [PATCH 1/2 v4.9.y] KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
@ 2017-09-12 5:41 Paul Mackerras
2017-09-12 5:42 ` [PATCH 2/2 v4.9.y] KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list Paul Mackerras
2017-10-02 9:43 ` Patch "KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()" " gregkh
0 siblings, 2 replies; 4+ messages in thread
From: Paul Mackerras @ 2017-09-12 5:41 UTC (permalink / raw)
To: stable
commit 47c5310a8dbe7c2cb9f0083daa43ceed76c257fa upstream, with part
of commit edd03602d97236e8fea13cd76886c576186aa307 folded in.
Nixiaoming pointed out that there is a memory leak in
kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd()
fails; the memory allocated for the kvmppc_spapr_tce_table struct
is not freed, and nor are the pages allocated for the iommu
tables. In addition, we have already incremented the process's
count of locked memory pages, and this doesn't get restored on
error.
David Hildenbrand pointed out that there is a race in that the
function checks early on that there is not already an entry in the
stt->iommu_tables list with the same LIOBN, but an entry with the
same LIOBN could get added between then and when the new entry is
added to the list.
This fixes all three problems. To simplify things, we now call
anon_inode_getfd() before placing the new entry in the list. The
check for an existing entry is done while holding the kvm->lock
mutex, immediately before adding the new entry to the list.
Finally, on failure we now call kvmppc_account_memlimit to
decrement the process's count of locked memory pages.
[paulus@ozlabs.org - folded in that part of edd03602d972 ("KVM:
PPC: Book3S HV: Protect updates to spapr_tce_tables list", 2017-08-28)
which restructured the code that 47c5310a8dbe modified, to avoid
a build failure caused by the absence of put_unused_fd().]
Fixes: 54738c097163 ("KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode")
Fixes: f8626985c7c2 ("KVM: PPC: Account TCE-containing pages in locked_vm")
Reported-by: Nixiaoming <nixiaoming@huawei.com>
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
arch/powerpc/kvm/book3s_64_vio.c | 54 +++++++++++++++++++++++-----------------
1 file changed, 31 insertions(+), 23 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index c379ff5a4438..7c1cb9dcd9a0 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -150,6 +150,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
struct kvm_create_spapr_tce_64 *args)
{
struct kvmppc_spapr_tce_table *stt = NULL;
+ struct kvmppc_spapr_tce_table *siter;
unsigned long npages, size;
int ret = -ENOMEM;
int i;
@@ -157,24 +158,16 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
if (!args->size)
return -EINVAL;
- /* Check this LIOBN hasn't been previously allocated */
- list_for_each_entry(stt, &kvm->arch.spapr_tce_tables, list) {
- if (stt->liobn == args->liobn)
- return -EBUSY;
- }
-
size = args->size;
npages = kvmppc_tce_pages(size);
ret = kvmppc_account_memlimit(kvmppc_stt_pages(npages), true);
- if (ret) {
- stt = NULL;
- goto fail;
- }
+ if (ret)
+ return ret;
stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
GFP_KERNEL);
if (!stt)
- goto fail;
+ goto fail_acct;
stt->liobn = args->liobn;
stt->page_shift = args->page_shift;
@@ -188,24 +181,39 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
goto fail;
}
- kvm_get_kvm(kvm);
-
mutex_lock(&kvm->lock);
- list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
+
+ /* Check this LIOBN hasn't been previously allocated */
+ ret = 0;
+ list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) {
+ if (siter->liobn == args->liobn) {
+ ret = -EBUSY;
+ break;
+ }
+ }
+
+ if (!ret)
+ ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
+ stt, O_RDWR | O_CLOEXEC);
+
+ if (ret >= 0) {
+ list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
+ kvm_get_kvm(kvm);
+ }
mutex_unlock(&kvm->lock);
- return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
- stt, O_RDWR | O_CLOEXEC);
+ if (ret >= 0)
+ return ret;
-fail:
- if (stt) {
- for (i = 0; i < npages; i++)
- if (stt->pages[i])
- __free_page(stt->pages[i]);
+ fail:
+ for (i = 0; i < npages; i++)
+ if (stt->pages[i])
+ __free_page(stt->pages[i]);
- kfree(stt);
- }
+ kfree(stt);
+ fail_acct:
+ kvmppc_account_memlimit(kvmppc_stt_pages(npages), false);
return ret;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2 v4.9.y] KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
2017-09-12 5:41 [PATCH 1/2 v4.9.y] KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() Paul Mackerras
@ 2017-09-12 5:42 ` Paul Mackerras
2017-10-02 9:43 ` Patch "KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list" has been added to the 4.9-stable tree gregkh
2017-10-02 9:43 ` Patch "KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()" " gregkh
1 sibling, 1 reply; 4+ messages in thread
From: Paul Mackerras @ 2017-09-12 5:42 UTC (permalink / raw)
To: stable
commit edd03602d97236e8fea13cd76886c576186aa307 upstream.
Al Viro pointed out that while one thread of a process is executing
in kvm_vm_ioctl_create_spapr_tce(), another thread could guess the
file descriptor returned by anon_inode_getfd() and close() it before
the first thread has added it to the kvm->arch.spapr_tce_tables list.
That highlights a more general problem: there is no mutual exclusion
between writers to the spapr_tce_tables list, leading to the
possibility of the list becoming corrupted, which could cause a
host kernel crash.
To fix the mutual exclusion problem, we add a mutex_lock/unlock
pair around the list_del_rce in kvm_spapr_tce_release().
If another thread does guess the file descriptor returned by the
anon_inode_getfd() call in kvm_vm_ioctl_create_spapr_tce() and closes
it, its call to kvm_spapr_tce_release() will not do any harm because
it will have to wait until the first thread has released kvm->lock.
The other things that the second thread could do with the guessed
file descriptor are to mmap it or to pass it as a parameter to a
KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM device fd. An mmap
call won't cause any harm because kvm_spapr_tce_mmap() and
kvm_spapr_tce_fault() don't access the spapr_tce_tables list or
the kvmppc_spapr_tce_table.list field, and the fields that they do use
have been properly initialized by the time of the anon_inode_getfd()
call.
The KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl calls
kvm_spapr_tce_attach_iommu_group(), which scans the spapr_tce_tables
list looking for the kvmppc_spapr_tce_table struct corresponding to
the fd given as the parameter. Either it will find the new entry
or it won't; if it doesn't, it just returns an error, and if it
does, it will function normally. So, in each case there is no
harmful effect.
[paulus@ozlabs.org - moved parts of the upstream patch into the backport
of 47c5310a8dbe, adjusted this commit message accordingly.]
Fixes: 366baf28ee3f ("KVM: PPC: Use RCU for arch.spapr_tce_tables")
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
arch/powerpc/kvm/book3s_64_vio.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 7c1cb9dcd9a0..da2a7eccb10a 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -129,8 +129,11 @@ static int kvm_spapr_tce_mmap(struct file *file, struct vm_area_struct *vma)
static int kvm_spapr_tce_release(struct inode *inode, struct file *filp)
{
struct kvmppc_spapr_tce_table *stt = filp->private_data;
+ struct kvm *kvm = stt->kvm;
+ mutex_lock(&kvm->lock);
list_del_rcu(&stt->list);
+ mutex_unlock(&kvm->lock);
kvm_put_kvm(stt->kvm);
--
2.11.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Patch "KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()" has been added to the 4.9-stable tree
2017-09-12 5:41 [PATCH 1/2 v4.9.y] KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() Paul Mackerras
2017-09-12 5:42 ` [PATCH 2/2 v4.9.y] KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list Paul Mackerras
@ 2017-10-02 9:43 ` gregkh
1 sibling, 0 replies; 4+ messages in thread
From: gregkh @ 2017-10-02 9:43 UTC (permalink / raw)
To: paulus, david, gregkh, nixiaoming; +Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
kvm-ppc-book3s-fix-race-and-leak-in-kvm_vm_ioctl_create_spapr_tce.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From paulus@ozlabs.org Mon Oct 2 11:15:01 2017
From: Paul Mackerras <paulus@ozlabs.org>
Date: Tue, 12 Sep 2017 15:41:49 +1000
Subject: KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
To: stable@vger.kernel.org
Message-ID: <20170912054149.3vrf3mcbtp4jodr6@oak.ozlabs.ibm.com>
Content-Disposition: inline
From: Paul Mackerras <paulus@ozlabs.org>
commit 47c5310a8dbe7c2cb9f0083daa43ceed76c257fa upstream, with part
of commit edd03602d97236e8fea13cd76886c576186aa307 folded in.
Nixiaoming pointed out that there is a memory leak in
kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd()
fails; the memory allocated for the kvmppc_spapr_tce_table struct
is not freed, and nor are the pages allocated for the iommu
tables. In addition, we have already incremented the process's
count of locked memory pages, and this doesn't get restored on
error.
David Hildenbrand pointed out that there is a race in that the
function checks early on that there is not already an entry in the
stt->iommu_tables list with the same LIOBN, but an entry with the
same LIOBN could get added between then and when the new entry is
added to the list.
This fixes all three problems. To simplify things, we now call
anon_inode_getfd() before placing the new entry in the list. The
check for an existing entry is done while holding the kvm->lock
mutex, immediately before adding the new entry to the list.
Finally, on failure we now call kvmppc_account_memlimit to
decrement the process's count of locked memory pages.
[paulus@ozlabs.org - folded in that part of edd03602d972 ("KVM:
PPC: Book3S HV: Protect updates to spapr_tce_tables list", 2017-08-28)
which restructured the code that 47c5310a8dbe modified, to avoid
a build failure caused by the absence of put_unused_fd().]
Fixes: 54738c097163 ("KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode")
Fixes: f8626985c7c2 ("KVM: PPC: Account TCE-containing pages in locked_vm")
Reported-by: Nixiaoming <nixiaoming@huawei.com>
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/powerpc/kvm/book3s_64_vio.c | 56 ++++++++++++++++++++++-----------------
1 file changed, 32 insertions(+), 24 deletions(-)
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -150,6 +150,7 @@ long kvm_vm_ioctl_create_spapr_tce(struc
struct kvm_create_spapr_tce_64 *args)
{
struct kvmppc_spapr_tce_table *stt = NULL;
+ struct kvmppc_spapr_tce_table *siter;
unsigned long npages, size;
int ret = -ENOMEM;
int i;
@@ -157,24 +158,16 @@ long kvm_vm_ioctl_create_spapr_tce(struc
if (!args->size)
return -EINVAL;
- /* Check this LIOBN hasn't been previously allocated */
- list_for_each_entry(stt, &kvm->arch.spapr_tce_tables, list) {
- if (stt->liobn == args->liobn)
- return -EBUSY;
- }
-
size = args->size;
npages = kvmppc_tce_pages(size);
ret = kvmppc_account_memlimit(kvmppc_stt_pages(npages), true);
- if (ret) {
- stt = NULL;
- goto fail;
- }
+ if (ret)
+ return ret;
stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
GFP_KERNEL);
if (!stt)
- goto fail;
+ goto fail_acct;
stt->liobn = args->liobn;
stt->page_shift = args->page_shift;
@@ -188,24 +181,39 @@ long kvm_vm_ioctl_create_spapr_tce(struc
goto fail;
}
- kvm_get_kvm(kvm);
-
mutex_lock(&kvm->lock);
- list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
- mutex_unlock(&kvm->lock);
+ /* Check this LIOBN hasn't been previously allocated */
+ ret = 0;
+ list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) {
+ if (siter->liobn == args->liobn) {
+ ret = -EBUSY;
+ break;
+ }
+ }
+
+ if (!ret)
+ ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
+ stt, O_RDWR | O_CLOEXEC);
+
+ if (ret >= 0) {
+ list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
+ kvm_get_kvm(kvm);
+ }
- return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
- stt, O_RDWR | O_CLOEXEC);
+ mutex_unlock(&kvm->lock);
-fail:
- if (stt) {
- for (i = 0; i < npages; i++)
- if (stt->pages[i])
- __free_page(stt->pages[i]);
+ if (ret >= 0)
+ return ret;
- kfree(stt);
- }
+ fail:
+ for (i = 0; i < npages; i++)
+ if (stt->pages[i])
+ __free_page(stt->pages[i]);
+
+ kfree(stt);
+ fail_acct:
+ kvmppc_account_memlimit(kvmppc_stt_pages(npages), false);
return ret;
}
Patches currently in stable-queue which might be from paulus@ozlabs.org are
queue-4.9/kvm-ppc-book3s-fix-race-and-leak-in-kvm_vm_ioctl_create_spapr_tce.patch
queue-4.9/kvm-ppc-book3s-hv-protect-updates-to-spapr_tce_tables-list.patch
^ permalink raw reply [flat|nested] 4+ messages in thread
* Patch "KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list" has been added to the 4.9-stable tree
2017-09-12 5:42 ` [PATCH 2/2 v4.9.y] KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list Paul Mackerras
@ 2017-10-02 9:43 ` gregkh
0 siblings, 0 replies; 4+ messages in thread
From: gregkh @ 2017-10-02 9:43 UTC (permalink / raw)
To: paulus, david, gregkh; +Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
kvm-ppc-book3s-hv-protect-updates-to-spapr_tce_tables-list.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From paulus@ozlabs.org Mon Oct 2 11:15:17 2017
From: Paul Mackerras <paulus@ozlabs.org>
Date: Tue, 12 Sep 2017 15:42:38 +1000
Subject: KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
To: stable@vger.kernel.org
Message-ID: <20170912054238.ori4majx46kxpz5m@oak.ozlabs.ibm.com>
Content-Disposition: inline
From: Paul Mackerras <paulus@ozlabs.org>
commit edd03602d97236e8fea13cd76886c576186aa307 upstream.
Al Viro pointed out that while one thread of a process is executing
in kvm_vm_ioctl_create_spapr_tce(), another thread could guess the
file descriptor returned by anon_inode_getfd() and close() it before
the first thread has added it to the kvm->arch.spapr_tce_tables list.
That highlights a more general problem: there is no mutual exclusion
between writers to the spapr_tce_tables list, leading to the
possibility of the list becoming corrupted, which could cause a
host kernel crash.
To fix the mutual exclusion problem, we add a mutex_lock/unlock
pair around the list_del_rce in kvm_spapr_tce_release().
If another thread does guess the file descriptor returned by the
anon_inode_getfd() call in kvm_vm_ioctl_create_spapr_tce() and closes
it, its call to kvm_spapr_tce_release() will not do any harm because
it will have to wait until the first thread has released kvm->lock.
The other things that the second thread could do with the guessed
file descriptor are to mmap it or to pass it as a parameter to a
KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM device fd. An mmap
call won't cause any harm because kvm_spapr_tce_mmap() and
kvm_spapr_tce_fault() don't access the spapr_tce_tables list or
the kvmppc_spapr_tce_table.list field, and the fields that they do use
have been properly initialized by the time of the anon_inode_getfd()
call.
The KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl calls
kvm_spapr_tce_attach_iommu_group(), which scans the spapr_tce_tables
list looking for the kvmppc_spapr_tce_table struct corresponding to
the fd given as the parameter. Either it will find the new entry
or it won't; if it doesn't, it just returns an error, and if it
does, it will function normally. So, in each case there is no
harmful effect.
[paulus@ozlabs.org - moved parts of the upstream patch into the backport
of 47c5310a8dbe, adjusted this commit message accordingly.]
Fixes: 366baf28ee3f ("KVM: PPC: Use RCU for arch.spapr_tce_tables")
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/powerpc/kvm/book3s_64_vio.c | 3 +++
1 file changed, 3 insertions(+)
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -129,8 +129,11 @@ static int kvm_spapr_tce_mmap(struct fil
static int kvm_spapr_tce_release(struct inode *inode, struct file *filp)
{
struct kvmppc_spapr_tce_table *stt = filp->private_data;
+ struct kvm *kvm = stt->kvm;
+ mutex_lock(&kvm->lock);
list_del_rcu(&stt->list);
+ mutex_unlock(&kvm->lock);
kvm_put_kvm(stt->kvm);
Patches currently in stable-queue which might be from paulus@ozlabs.org are
queue-4.9/kvm-ppc-book3s-fix-race-and-leak-in-kvm_vm_ioctl_create_spapr_tce.patch
queue-4.9/kvm-ppc-book3s-hv-protect-updates-to-spapr_tce_tables-list.patch
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-10-02 9:43 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-12 5:41 [PATCH 1/2 v4.9.y] KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() Paul Mackerras
2017-09-12 5:42 ` [PATCH 2/2 v4.9.y] KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list Paul Mackerras
2017-10-02 9:43 ` Patch "KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list" has been added to the 4.9-stable tree gregkh
2017-10-02 9:43 ` Patch "KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()" " gregkh
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).