* [PATCH 4.14 2/3] KVM: do not allow mapping valid but non-reference-counted pages
2021-08-03 13:55 [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
@ 2021-08-03 13:55 ` Ovidiu Panait
2021-08-04 8:51 ` Paolo Bonzini
2021-08-03 13:55 ` [PATCH 4.14 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Ovidiu Panait @ 2021-08-03 13:55 UTC (permalink / raw)
To: stable; +Cc: pbonzini
From: Nicholas Piggin <npiggin@gmail.com>
commit f8be156be163a052a067306417cd0ff679068c97 upstream.
It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.
Fix this by only taking a reference on valid pages if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).
This addresses CVE-2021-22543.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
virt/kvm/kvm_main.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4e23d0b4b810..469361d01116 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1485,6 +1485,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
return true;
}
+static int kvm_try_get_pfn(kvm_pfn_t pfn)
+{
+ if (kvm_is_reserved_pfn(pfn))
+ return 1;
+ return get_page_unless_zero(pfn_to_page(pfn));
+}
+
static int hva_to_pfn_remapped(struct vm_area_struct *vma,
unsigned long addr, bool *async,
bool write_fault, bool *writable,
@@ -1534,13 +1541,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
* Whoever called remap_pfn_range is also going to call e.g.
* unmap_mapping_range before the underlying pages are freed,
* causing a call to our MMU notifier.
+ *
+ * Certain IO or PFNMAP mappings can be backed with valid
+ * struct pages, but be allocated without refcounting e.g.,
+ * tail pages of non-compound higher order allocations, which
+ * would then underflow the refcount when the caller does the
+ * required put_page. Don't allow those pages here.
*/
- kvm_get_pfn(pfn);
+ if (!kvm_try_get_pfn(pfn))
+ r = -EFAULT;
out:
pte_unmap_unlock(ptep, ptl);
*p_pfn = pfn;
- return 0;
+
+ return r;
}
/*
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH 4.14 2/3] KVM: do not allow mapping valid but non-reference-counted pages
2021-08-03 13:55 ` [PATCH 4.14 2/3] KVM: do not allow mapping valid but non-reference-counted pages Ovidiu Panait
@ 2021-08-04 8:51 ` Paolo Bonzini
0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2021-08-04 8:51 UTC (permalink / raw)
To: Ovidiu Panait, stable
On 03/08/21 15:55, Ovidiu Panait wrote:
> From: Nicholas Piggin <npiggin@gmail.com>
>
> commit f8be156be163a052a067306417cd0ff679068c97 upstream.
>
> It's possible to create a region which maps valid but non-refcounted
> pages (e.g., tail pages of non-compound higher order allocations). These
> host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
> of APIs, which take a reference to the page, which takes it from 0 to 1.
> When the reference is dropped, this will free the page incorrectly.
>
> Fix this by only taking a reference on valid pages if it was non-zero,
> which indicates it is participating in normal refcounting (and can be
> released with put_page).
>
> This addresses CVE-2021-22543.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> Tested-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
> virt/kvm/kvm_main.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 4e23d0b4b810..469361d01116 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1485,6 +1485,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
> return true;
> }
>
> +static int kvm_try_get_pfn(kvm_pfn_t pfn)
> +{
> + if (kvm_is_reserved_pfn(pfn))
> + return 1;
> + return get_page_unless_zero(pfn_to_page(pfn));
> +}
> +
> static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> unsigned long addr, bool *async,
> bool write_fault, bool *writable,
> @@ -1534,13 +1541,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> * Whoever called remap_pfn_range is also going to call e.g.
> * unmap_mapping_range before the underlying pages are freed,
> * causing a call to our MMU notifier.
> + *
> + * Certain IO or PFNMAP mappings can be backed with valid
> + * struct pages, but be allocated without refcounting e.g.,
> + * tail pages of non-compound higher order allocations, which
> + * would then underflow the refcount when the caller does the
> + * required put_page. Don't allow those pages here.
> */
> - kvm_get_pfn(pfn);
> + if (!kvm_try_get_pfn(pfn))
> + r = -EFAULT;
>
> out:
> pte_unmap_unlock(ptep, ptl);
> *p_pfn = pfn;
> - return 0;
> +
> + return r;
> }
>
> /*
>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 4.14 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped()
2021-08-03 13:55 [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
2021-08-03 13:55 ` [PATCH 4.14 2/3] KVM: do not allow mapping valid but non-reference-counted pages Ovidiu Panait
@ 2021-08-03 13:55 ` Ovidiu Panait
2021-08-04 8:51 ` Paolo Bonzini
2021-08-04 8:51 ` [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn Paolo Bonzini
2021-08-06 6:26 ` Greg KH
3 siblings, 1 reply; 7+ messages in thread
From: Ovidiu Panait @ 2021-08-03 13:55 UTC (permalink / raw)
To: stable; +Cc: pbonzini
From: Sean Christopherson <seanjc@google.com>
commit a9545779ee9e9e103648f6f2552e73cfe808d0f4 upstream
Use kvm_pfn_t, a.k.a. u64, for the local 'pfn' variable when retrieving
a so called "remapped" hva/pfn pair. In theory, the hva could resolve to
a pfn in high memory on a 32-bit kernel.
This bug was inadvertantly exposed by commit bd2fae8da794 ("KVM: do not
assume PTE is writable after follow_pfn"), which added an error PFN value
to the mix, causing gcc to comlain about overflowing the unsigned long.
arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function ‘hva_to_pfn_remapped’:
include/linux/kvm_host.h:89:30: error: conversion from ‘long long unsigned int’
to ‘long unsigned int’ changes value from
‘9218868437227405314’ to ‘2’ [-Werror=overflow]
89 | #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2)
| ^
virt/kvm/kvm_main.c:1935:9: note: in expansion of macro ‘KVM_PFN_ERR_RO_FAULT’
Cc: stable@vger.kernel.org
Fixes: add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210208201940.1258328-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
virt/kvm/kvm_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 469361d01116..36b9f2b29071 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1497,7 +1497,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
bool write_fault, bool *writable,
kvm_pfn_t *p_pfn)
{
- unsigned long pfn;
+ kvm_pfn_t pfn;
pte_t *ptep;
spinlock_t *ptl;
int r;
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH 4.14 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped()
2021-08-03 13:55 ` [PATCH 4.14 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
@ 2021-08-04 8:51 ` Paolo Bonzini
0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2021-08-04 8:51 UTC (permalink / raw)
To: Ovidiu Panait, stable
On 03/08/21 15:55, Ovidiu Panait wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> commit a9545779ee9e9e103648f6f2552e73cfe808d0f4 upstream
>
> Use kvm_pfn_t, a.k.a. u64, for the local 'pfn' variable when retrieving
> a so called "remapped" hva/pfn pair. In theory, the hva could resolve to
> a pfn in high memory on a 32-bit kernel.
>
> This bug was inadvertantly exposed by commit bd2fae8da794 ("KVM: do not
> assume PTE is writable after follow_pfn"), which added an error PFN value
> to the mix, causing gcc to comlain about overflowing the unsigned long.
>
> arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function ‘hva_to_pfn_remapped’:
> include/linux/kvm_host.h:89:30: error: conversion from ‘long long unsigned int’
> to ‘long unsigned int’ changes value from
> ‘9218868437227405314’ to ‘2’ [-Werror=overflow]
> 89 | #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2)
> | ^
> virt/kvm/kvm_main.c:1935:9: note: in expansion of macro ‘KVM_PFN_ERR_RO_FAULT’
>
> Cc: stable@vger.kernel.org
> Fixes: add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Message-Id: <20210208201940.1258328-1-seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
> virt/kvm/kvm_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 469361d01116..36b9f2b29071 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1497,7 +1497,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> bool write_fault, bool *writable,
> kvm_pfn_t *p_pfn)
> {
> - unsigned long pfn;
> + kvm_pfn_t pfn;
> pte_t *ptep;
> spinlock_t *ptl;
> int r;
>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn
2021-08-03 13:55 [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
2021-08-03 13:55 ` [PATCH 4.14 2/3] KVM: do not allow mapping valid but non-reference-counted pages Ovidiu Panait
2021-08-03 13:55 ` [PATCH 4.14 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
@ 2021-08-04 8:51 ` Paolo Bonzini
2021-08-06 6:26 ` Greg KH
3 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2021-08-04 8:51 UTC (permalink / raw)
To: Ovidiu Panait, stable
On 03/08/21 15:55, Ovidiu Panait wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.
>
> In order to convert an HVA to a PFN, KVM usually tries to use
> the get_user_pages family of functinso. This however is not
> possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.
>
> In doing this however KVM loses the information on whether the
> PFN is writable. That is usually not a problem because the main
> use of VM_IO vmas with KVM is for BARs in PCI device assignment,
> however it is a bug. To fix it, use follow_pte and check pte_write
> while under the protection of the PTE lock. The information can
> be used to fail hva_to_pfn_remapped or passed back to the
> caller via *writable.
>
> Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
> up page faults before giving up", 2016-07-05); however, even older version
> have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
> Handle vma regions with no backing page", 2008-07-20), as they also did
> not check whether the PFN was writable.
>
> Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
> Reported-by: David Stevens <stevensd@google.com>
> Cc: 3pvd@google.com
> Cc: Jann Horn <jannh@google.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [OP: backport to 4.14, adjust follow_pte() -> follow_pte_pmd()]
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
> virt/kvm/kvm_main.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 547ae59199db..4e23d0b4b810 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1491,9 +1491,11 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> kvm_pfn_t *p_pfn)
> {
> unsigned long pfn;
> + pte_t *ptep;
> + spinlock_t *ptl;
> int r;
>
> - r = follow_pfn(vma, addr, &pfn);
> + r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
> if (r) {
> /*
> * get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
> @@ -1508,14 +1510,19 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> if (r)
> return r;
>
> - r = follow_pfn(vma, addr, &pfn);
> + r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
> if (r)
> return r;
> + }
>
> + if (write_fault && !pte_write(*ptep)) {
> + pfn = KVM_PFN_ERR_RO_FAULT;
> + goto out;
> }
>
> if (writable)
> - *writable = true;
> + *writable = pte_write(*ptep);
> + pfn = pte_pfn(*ptep);
>
> /*
> * Get a reference here because callers of *hva_to_pfn* and
> @@ -1530,6 +1537,8 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> */
> kvm_get_pfn(pfn);
>
> +out:
> + pte_unmap_unlock(ptep, ptl);
> *p_pfn = pfn;
> return 0;
> }
>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn
2021-08-03 13:55 [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
` (2 preceding siblings ...)
2021-08-04 8:51 ` [PATCH 4.14 1/3] KVM: do not assume PTE is writable after follow_pfn Paolo Bonzini
@ 2021-08-06 6:26 ` Greg KH
3 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2021-08-06 6:26 UTC (permalink / raw)
To: Ovidiu Panait; +Cc: stable, pbonzini
On Tue, Aug 03, 2021 at 04:55:19PM +0300, Ovidiu Panait wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.
>
> In order to convert an HVA to a PFN, KVM usually tries to use
> the get_user_pages family of functinso. This however is not
> possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.
>
> In doing this however KVM loses the information on whether the
> PFN is writable. That is usually not a problem because the main
> use of VM_IO vmas with KVM is for BARs in PCI device assignment,
> however it is a bug. To fix it, use follow_pte and check pte_write
> while under the protection of the PTE lock. The information can
> be used to fail hva_to_pfn_remapped or passed back to the
> caller via *writable.
>
> Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
> up page faults before giving up", 2016-07-05); however, even older version
> have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
> Handle vma regions with no backing page", 2008-07-20), as they also did
> not check whether the PFN was writable.
>
> Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
> Reported-by: David Stevens <stevensd@google.com>
> Cc: 3pvd@google.com
> Cc: Jann Horn <jannh@google.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [OP: backport to 4.14, adjust follow_pte() -> follow_pte_pmd()]
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
> virt/kvm/kvm_main.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
All now queued up, thanks.
greg k-h
^ permalink raw reply [flat|nested] 7+ messages in thread