* [PATCH v2 1/2] KVM: PPC: Book3S HV: move kvmppc_svm_page_out up
2020-07-27 19:24 [PATCH v2 0/2] Rework secure memslot dropping Ram Pai
@ 2020-07-27 19:24 ` Ram Pai
2020-07-27 19:24 ` [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping Ram Pai
2020-07-28 5:52 ` [PATCH v2 0/2] Rework secure memslot dropping Paul Mackerras
2 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2020-07-27 19:24 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev
Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
sukadev, bauerman, david
From: Laurent Dufour <ldufour@linux.ibm.com>
kvmppc_svm_page_out() will need to be called by kvmppc_uvmem_drop_pages(),
so move it earlier in this file.
Furthermore, it will be useful to call this function while already holding
kvm->arch.uvmem_lock, so prefix the original function with __, remove the
locking from it, and introduce a wrapper which calls that function with the
lock held.
There is no functional change.
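To illustrate the pattern (a minimal, generic sketch with placeholder names,
not the actual kernel functions -- see the diff below for the real
__kvmppc_svm_page_out()/kvmppc_svm_page_out() pair):

static int __do_work(struct kvm *kvm)
{
	/* Caller must hold kvm->arch.uvmem_lock. */
	/* ... do the real work here ... */
	return 0;
}

static inline int do_work(struct kvm *kvm)
{
	int ret;

	mutex_lock(&kvm->arch.uvmem_lock);
	ret = __do_work(kvm);
	mutex_unlock(&kvm->arch.uvmem_lock);

	return ret;
}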
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 166 ++++++++++++++++++++-----------------
1 file changed, 90 insertions(+), 76 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 5b917ea..565f24b 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -497,6 +497,96 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
}
/*
+ * Provision a new page on HV side and copy over the contents
+ * from secure memory using UV_PAGE_OUT uvcall.
+ * Caller must hold kvm->arch.uvmem_lock.
+ */
+static int __kvmppc_svm_page_out(struct vm_area_struct *vma,
+ unsigned long start,
+ unsigned long end, unsigned long page_shift,
+ struct kvm *kvm, unsigned long gpa)
+{
+ unsigned long src_pfn, dst_pfn = 0;
+ struct migrate_vma mig;
+ struct page *dpage, *spage;
+ struct kvmppc_uvmem_page_pvt *pvt;
+ unsigned long pfn;
+ int ret = U_SUCCESS;
+
+ memset(&mig, 0, sizeof(mig));
+ mig.vma = vma;
+ mig.start = start;
+ mig.end = end;
+ mig.src = &src_pfn;
+ mig.dst = &dst_pfn;
+ mig.src_owner = &kvmppc_uvmem_pgmap;
+
+ /* The requested page is already paged-out, nothing to do */
+ if (!kvmppc_gfn_is_uvmem_pfn(gpa >> page_shift, kvm, NULL))
+ return ret;
+
+ ret = migrate_vma_setup(&mig);
+ if (ret)
+ return -1;
+
+ spage = migrate_pfn_to_page(*mig.src);
+ if (!spage || !(*mig.src & MIGRATE_PFN_MIGRATE))
+ goto out_finalize;
+
+ if (!is_zone_device_page(spage))
+ goto out_finalize;
+
+ dpage = alloc_page_vma(GFP_HIGHUSER, vma, start);
+ if (!dpage) {
+ ret = -1;
+ goto out_finalize;
+ }
+
+ lock_page(dpage);
+ pvt = spage->zone_device_data;
+ pfn = page_to_pfn(dpage);
+
+ /*
+ * This function is used in two cases:
+ * - When HV touches a secure page, for which we do UV_PAGE_OUT
+ * - When a secure page is converted to shared page, we *get*
+ * the page to essentially unmap the device page. In this
+ * case we skip page-out.
+ */
+ if (!pvt->skip_page_out)
+ ret = uv_page_out(kvm->arch.lpid, pfn << page_shift,
+ gpa, 0, page_shift);
+
+ if (ret == U_SUCCESS)
+ *mig.dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED;
+ else {
+ unlock_page(dpage);
+ __free_page(dpage);
+ goto out_finalize;
+ }
+
+ migrate_vma_pages(&mig);
+
+out_finalize:
+ migrate_vma_finalize(&mig);
+ return ret;
+}
+
+static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end,
+ unsigned long page_shift,
+ struct kvm *kvm, unsigned long gpa)
+{
+ int ret;
+
+ mutex_lock(&kvm->arch.uvmem_lock);
+ ret = __kvmppc_svm_page_out(vma, start, end, page_shift, kvm, gpa);
+ mutex_unlock(&kvm->arch.uvmem_lock);
+
+ return ret;
+}
+
+/*
* Drop device pages that we maintain for the secure guest
*
* We first mark the pages to be skipped from UV_PAGE_OUT when there
@@ -866,82 +956,6 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
return ret;
}
-/*
- * Provision a new page on HV side and copy over the contents
- * from secure memory using UV_PAGE_OUT uvcall.
- */
-static int kvmppc_svm_page_out(struct vm_area_struct *vma,
- unsigned long start,
- unsigned long end, unsigned long page_shift,
- struct kvm *kvm, unsigned long gpa)
-{
- unsigned long src_pfn, dst_pfn = 0;
- struct migrate_vma mig;
- struct page *dpage, *spage;
- struct kvmppc_uvmem_page_pvt *pvt;
- unsigned long pfn;
- int ret = U_SUCCESS;
-
- memset(&mig, 0, sizeof(mig));
- mig.vma = vma;
- mig.start = start;
- mig.end = end;
- mig.src = &src_pfn;
- mig.dst = &dst_pfn;
- mig.src_owner = &kvmppc_uvmem_pgmap;
-
- mutex_lock(&kvm->arch.uvmem_lock);
- /* The requested page is already paged-out, nothing to do */
- if (!kvmppc_gfn_is_uvmem_pfn(gpa >> page_shift, kvm, NULL))
- goto out;
-
- ret = migrate_vma_setup(&mig);
- if (ret)
- goto out;
-
- spage = migrate_pfn_to_page(*mig.src);
- if (!spage || !(*mig.src & MIGRATE_PFN_MIGRATE))
- goto out_finalize;
-
- if (!is_zone_device_page(spage))
- goto out_finalize;
-
- dpage = alloc_page_vma(GFP_HIGHUSER, vma, start);
- if (!dpage) {
- ret = -1;
- goto out_finalize;
- }
-
- lock_page(dpage);
- pvt = spage->zone_device_data;
- pfn = page_to_pfn(dpage);
-
- /*
- * This function is used in two cases:
- * - When HV touches a secure page, for which we do UV_PAGE_OUT
- * - When a secure page is converted to shared page, we *get*
- * the page to essentially unmap the device page. In this
- * case we skip page-out.
- */
- if (!pvt->skip_page_out)
- ret = uv_page_out(kvm->arch.lpid, pfn << page_shift,
- gpa, 0, page_shift);
-
- if (ret == U_SUCCESS)
- *mig.dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED;
- else {
- unlock_page(dpage);
- __free_page(dpage);
- goto out_finalize;
- }
-
- migrate_vma_pages(&mig);
-out_finalize:
- migrate_vma_finalize(&mig);
-out:
- mutex_unlock(&kvm->arch.uvmem_lock);
- return ret;
-}
/*
* Fault handler callback that gets called when HV touches any page that
--
1.8.3.1
* [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-27 19:24 [PATCH v2 0/2] Rework secure memslot dropping Ram Pai
2020-07-27 19:24 ` [PATCH v2 1/2] KVM: PPC: Book3S HV: move kvmppc_svm_page_out up Ram Pai
@ 2020-07-27 19:24 ` Ram Pai
2020-07-28 5:52 ` [PATCH v2 0/2] Rework secure memslot dropping Paul Mackerras
2 siblings, 0 replies; 10+ messages in thread
From: Ram Pai @ 2020-07-27 19:24 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev
Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
sukadev, bauerman, david
From: Laurent Dufour <ldufour@linux.ibm.com>
When a secure memslot is dropped, all the pages backed in the secure
device (i.e. really backed by secure memory managed by the Ultravisor)
should be paged out to normal pages. Previously, this was
achieved by triggering the page fault mechanism, which calls
kvmppc_svm_page_out() on each page.
This can't work when hot unplugging a memory slot because the memory
slot is flagged as invalid and gfn_to_pfn() then does not try to access
the page, so the page fault mechanism is not triggered.
Since the final goal is to call kvmppc_svm_page_out(), it seems
simpler to call it directly instead of triggering such a mechanism. This
way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
memslot.
Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
the call is made to __kvmppc_svm_page_out() instead. As
__kvmppc_svm_page_out() needs the vma pointer to migrate the pages,
the VMA is fetched lazily, to avoid calling find_vma() for every
page. In addition, the mmap_sem is held in read mode during
that time, not in write mode, since the virtual memory layout is not
impacted, and kvm->arch.uvmem_lock prevents concurrent operations
on the secure device.
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[modified the changelog description]
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
[modified check on the VMA in kvmppc_uvmem_drop_pages]
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 52 +++++++++++++++++++++++++-------------
1 file changed, 35 insertions(+), 17 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 565f24b..0d49e34 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -594,35 +594,53 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
* fault on them, do fault time migration to replace the device PTEs in
* QEMU page table with normal PTEs from newly allocated pages.
*/
-void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
+void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
struct kvm *kvm, bool skip_page_out)
{
int i;
struct kvmppc_uvmem_page_pvt *pvt;
- unsigned long pfn, uvmem_pfn;
- unsigned long gfn = free->base_gfn;
+ struct page *uvmem_page;
+ struct vm_area_struct *vma = NULL;
+ unsigned long uvmem_pfn, gfn;
+ unsigned long addr;
+
+ mmap_read_lock(kvm->mm);
+
+ addr = slot->userspace_addr;
- for (i = free->npages; i; --i, ++gfn) {
- struct page *uvmem_page;
+ gfn = slot->base_gfn;
+ for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) {
+
+ /* Fetch the VMA if addr is not in the latest fetched one */
+ if (!vma || addr >= vma->vm_end) {
+ vma = find_vma_intersection(kvm->mm, addr, addr+1);
+ if (!vma) {
+ pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
+ break;
+ }
+ }
mutex_lock(&kvm->arch.uvmem_lock);
- if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
+
+ if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
+ uvmem_page = pfn_to_page(uvmem_pfn);
+ pvt = uvmem_page->zone_device_data;
+ pvt->skip_page_out = skip_page_out;
+ pvt->remove_gfn = true;
+
+ if (__kvmppc_svm_page_out(vma, addr, addr + PAGE_SIZE,
+ PAGE_SHIFT, kvm, pvt->gpa))
+ pr_err("Can't page out gpa:0x%lx addr:0x%lx\n",
+ pvt->gpa, addr);
+ } else {
+ /* Remove the shared flag if any */
kvmppc_gfn_remove(gfn, kvm);
- mutex_unlock(&kvm->arch.uvmem_lock);
- continue;
}
- uvmem_page = pfn_to_page(uvmem_pfn);
- pvt = uvmem_page->zone_device_data;
- pvt->skip_page_out = skip_page_out;
- pvt->remove_gfn = true;
mutex_unlock(&kvm->arch.uvmem_lock);
-
- pfn = gfn_to_pfn(kvm, gfn);
- if (is_error_noslot_pfn(pfn))
- continue;
- kvm_release_pfn_clean(pfn);
}
+
+ mmap_read_unlock(kvm->mm);
}
unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
--
1.8.3.1
* Re: [PATCH v2 0/2] Rework secure memslot dropping
2020-07-27 19:24 [PATCH v2 0/2] Rework secure memslot dropping Ram Pai
2020-07-27 19:24 ` [PATCH v2 1/2] KVM: PPC: Book3S HV: move kvmppc_svm_page_out up Ram Pai
2020-07-27 19:24 ` [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping Ram Pai
@ 2020-07-28 5:52 ` Paul Mackerras
2 siblings, 0 replies; 10+ messages in thread
From: Paul Mackerras @ 2020-07-28 5:52 UTC (permalink / raw)
To: Ram Pai
Cc: ldufour, cclaudio, kvm-ppc, bharata, sathnaga, aneesh.kumar,
sukadev, linuxppc-dev, bauerman, david
On Mon, Jul 27, 2020 at 12:24:27PM -0700, Ram Pai wrote:
> From: Laurent Dufour <ldufour@linux.ibm.com>
>
> When doing memory hotplug on a secure VM, the secure pages are not
> properly cleaned from the secure device when dropping the memslot. This
> silent error then prevents the SVM from rebooting properly after the
> following sequence of commands is run in the Qemu monitor:
>
> device_add pc-dimm,id=dimm1,memdev=mem1
> device_del dimm1
> device_add pc-dimm,id=dimm1,memdev=mem1
>
> At reboot time, when the kernel boots again and switches to secure
> mode, the page_in fails for the pages in the memslot because the
> cleanup was not done properly: the memslot is flagged as invalid
> during the hot unplug and thus the page fault mechanism is not
> triggered.
>
> To prevent that, instead of relying on the page fault mechanism to
> trigger the page-out of the secure pages when the memslot is dropped,
> it seems simpler to directly call the function doing the page-out. This
> way the state of the memslot does not interfere with the page-out process.
>
> This series applies on top of Ram's series titled:
> "[v6 0/5] Migrate non-migrated pages of a SVM."
Thanks, series applied to my kvm-ppc-next branch and pull request sent.
Paul.
* [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-21 10:42 Laurent Dufour
@ 2020-07-21 10:42 ` Laurent Dufour
2020-07-21 21:37 ` Ram Pai
2020-07-23 3:36 ` Bharata B Rao
0 siblings, 2 replies; 10+ messages in thread
From: Laurent Dufour @ 2020-07-21 10:42 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel, kvm-ppc, mpe, paulus
Cc: sukadev, linuxram, bauerman, bharata
When a secure memslot is dropped, all the pages backed in the secure device
(aka really backed by secure memory by the Ultravisor) should be paged out
to a normal page. Previously, this was achieved by triggering the page
fault mechanism which is calling kvmppc_svm_page_out() on each pages.
This can't work when hot unplugging a memory slot because the memory slot
is flagged as invalid and gfn_to_pfn() is then not trying to access the
page, so the page fault mechanism is not triggered.
Since the final goal is to make a call to kvmppc_svm_page_out() it seems
simpler to directly calling it instead of triggering such a mechanism. This
way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
memslot.
Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
the call to __kvmppc_svm_page_out() is made.
As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
addition, the mmap_sem is help in read mode during that time, not in write
mode since the virual memory layout is not impacted, and
kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++----------
1 file changed, 37 insertions(+), 17 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 5a4b02d3f651..ba5c7c77cc3a 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -624,35 +624,55 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
* fault on them, do fault time migration to replace the device PTEs in
* QEMU page table with normal PTEs from newly allocated pages.
*/
-void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
+void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
struct kvm *kvm, bool skip_page_out)
{
int i;
struct kvmppc_uvmem_page_pvt *pvt;
- unsigned long pfn, uvmem_pfn;
- unsigned long gfn = free->base_gfn;
+ struct page *uvmem_page;
+ struct vm_area_struct *vma = NULL;
+ unsigned long uvmem_pfn, gfn;
+ unsigned long addr, end;
+
+ mmap_read_lock(kvm->mm);
+
+ addr = slot->userspace_addr;
+ end = addr + (slot->npages * PAGE_SIZE);
- for (i = free->npages; i; --i, ++gfn) {
- struct page *uvmem_page;
+ gfn = slot->base_gfn;
+ for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) {
+
+ /* Fetch the VMA if addr is not in the latest fetched one */
+ if (!vma || (addr < vma->vm_start || addr >= vma->vm_end)) {
+ vma = find_vma_intersection(kvm->mm, addr, end);
+ if (!vma ||
+ vma->vm_start > addr || vma->vm_end < end) {
+ pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
+ break;
+ }
+ }
mutex_lock(&kvm->arch.uvmem_lock);
- if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
+
+ if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
+ uvmem_page = pfn_to_page(uvmem_pfn);
+ pvt = uvmem_page->zone_device_data;
+ pvt->skip_page_out = skip_page_out;
+ pvt->remove_gfn = true;
+
+ if (__kvmppc_svm_page_out(vma, addr, addr + PAGE_SIZE,
+ PAGE_SHIFT, kvm, pvt->gpa))
+ pr_err("Can't page out gpa:0x%lx addr:0x%lx\n",
+ pvt->gpa, addr);
+ } else {
+ /* Remove the shared flag if any */
kvmppc_gfn_remove(gfn, kvm);
- mutex_unlock(&kvm->arch.uvmem_lock);
- continue;
}
- uvmem_page = pfn_to_page(uvmem_pfn);
- pvt = uvmem_page->zone_device_data;
- pvt->skip_page_out = skip_page_out;
- pvt->remove_gfn = true;
mutex_unlock(&kvm->arch.uvmem_lock);
-
- pfn = gfn_to_pfn(kvm, gfn);
- if (is_error_noslot_pfn(pfn))
- continue;
- kvm_release_pfn_clean(pfn);
}
+
+ mmap_read_unlock(kvm->mm);
}
unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
--
2.27.0
* Re: [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-21 10:42 ` [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping Laurent Dufour
@ 2020-07-21 21:37 ` Ram Pai
2020-07-22 7:18 ` Laurent Dufour
2020-07-23 3:36 ` Bharata B Rao
1 sibling, 1 reply; 10+ messages in thread
From: Ram Pai @ 2020-07-21 21:37 UTC (permalink / raw)
To: Laurent Dufour
Cc: linux-kernel, kvm-ppc, bharata, paulus, sukadev, linuxppc-dev,
bauerman
On Tue, Jul 21, 2020 at 12:42:02PM +0200, Laurent Dufour wrote:
> When a secure memslot is dropped, all the pages backed in the secure device
> (aka really backed by secure memory by the Ultravisor) should be paged out
> to a normal page. Previously, this was achieved by triggering the page
> fault mechanism which is calling kvmppc_svm_page_out() on each pages.
>
> This can't work when hot unplugging a memory slot because the memory slot
> is flagged as invalid and gfn_to_pfn() is then not trying to access the
> page, so the page fault mechanism is not triggered.
>
> Since the final goal is to make a call to kvmppc_svm_page_out() it seems
> simpler to directly calling it instead of triggering such a mechanism. This
^^ call directly instead of triggering..
> way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
> memslot.
>
> Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
> the call to __kvmppc_svm_page_out() is made.
> As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
> VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
> addition, the mmap_sem is help in read mode during that time, not in write
^^ held
> mode since the virual memory layout is not impacted, and
> kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
>
> Cc: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
RP
* Re: [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-21 21:37 ` Ram Pai
@ 2020-07-22 7:18 ` Laurent Dufour
0 siblings, 0 replies; 10+ messages in thread
From: Laurent Dufour @ 2020-07-22 7:18 UTC (permalink / raw)
To: Ram Pai, paulus
Cc: linux-kernel, kvm-ppc, bharata, sukadev, linuxppc-dev, bauerman
On 21/07/2020 23:37, Ram Pai wrote:
> On Tue, Jul 21, 2020 at 12:42:02PM +0200, Laurent Dufour wrote:
>> When a secure memslot is dropped, all the pages backed in the secure device
>> (aka really backed by secure memory by the Ultravisor) should be paged out
>> to a normal page. Previously, this was achieved by triggering the page
>> fault mechanism which is calling kvmppc_svm_page_out() on each pages.
>>
>> This can't work when hot unplugging a memory slot because the memory slot
>> is flagged as invalid and gfn_to_pfn() is then not trying to access the
>> page, so the page fault mechanism is not triggered.
>>
>> Since the final goal is to make a call to kvmppc_svm_page_out() it seems
>> simpler to directly calling it instead of triggering such a mechanism. This
> ^^ call directly instead of triggering..
>
>> way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
>> memslot.
>>
>> Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
>> the call to __kvmppc_svm_page_out() is made.
>> As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
>> VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
>> addition, the mmap_sem is help in read mode during that time, not in write
> ^^ held
>
>> mode since the virual memory layout is not impacted, and
>> kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
>>
>> Cc: Ram Pai <linuxram@us.ibm.com>
>
> Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Thanks for reviewing this series.
Regarding the wordsmithing, Paul, could you manage that when pulling the series?
Thanks,
Laurent.
* Re: [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-21 10:42 ` [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping Laurent Dufour
2020-07-21 21:37 ` Ram Pai
@ 2020-07-23 3:36 ` Bharata B Rao
2020-07-23 12:32 ` Laurent Dufour
1 sibling, 1 reply; 10+ messages in thread
From: Bharata B Rao @ 2020-07-23 3:36 UTC (permalink / raw)
To: Laurent Dufour
Cc: linuxram, linux-kernel, kvm-ppc, paulus, sukadev, linuxppc-dev,
bauerman
On Tue, Jul 21, 2020 at 12:42:02PM +0200, Laurent Dufour wrote:
> When a secure memslot is dropped, all the pages backed in the secure device
> (aka really backed by secure memory by the Ultravisor) should be paged out
> to a normal page. Previously, this was achieved by triggering the page
> fault mechanism which is calling kvmppc_svm_page_out() on each pages.
>
> This can't work when hot unplugging a memory slot because the memory slot
> is flagged as invalid and gfn_to_pfn() is then not trying to access the
> page, so the page fault mechanism is not triggered.
>
> Since the final goal is to make a call to kvmppc_svm_page_out() it seems
> simpler to directly calling it instead of triggering such a mechanism. This
> way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
> memslot.
>
> Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
> the call to __kvmppc_svm_page_out() is made.
> As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
> VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
> addition, the mmap_sem is help in read mode during that time, not in write
> mode since the virual memory layout is not impacted, and
> kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
>
> Cc: Ram Pai <linuxram@us.ibm.com>
> Cc: Bharata B Rao <bharata@linux.ibm.com>
> Cc: Paul Mackerras <paulus@ozlabs.org>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 17 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index 5a4b02d3f651..ba5c7c77cc3a 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -624,35 +624,55 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
> * fault on them, do fault time migration to replace the device PTEs in
> * QEMU page table with normal PTEs from newly allocated pages.
> */
> -void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> +void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
> struct kvm *kvm, bool skip_page_out)
> {
> int i;
> struct kvmppc_uvmem_page_pvt *pvt;
> - unsigned long pfn, uvmem_pfn;
> - unsigned long gfn = free->base_gfn;
> + struct page *uvmem_page;
> + struct vm_area_struct *vma = NULL;
> + unsigned long uvmem_pfn, gfn;
> + unsigned long addr, end;
> +
> + mmap_read_lock(kvm->mm);
> +
> + addr = slot->userspace_addr;
We typically use gfn_to_hva() for that, but that won't work for a
memslot that is already marked INVALID, which is the case here.
I think it is ok to access slot->userspace_addr of an INVALID memslot
here, but just thought of explicitly bringing this up.
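For reference, a simplified sketch of the gfn-to-HVA mapping involved
(illustrative only; the real gfn_to_hva() helper additionally refuses
memslots flagged KVM_MEMSLOT_INVALID, which is why the loop below derives
the address from slot->userspace_addr directly):

static unsigned long gfn_to_hva_sketch(const struct kvm_memory_slot *slot,
				       unsigned long gfn)
{
	/* Assumes gfn falls inside the memslot. */
	return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE;
}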
> + end = addr + (slot->npages * PAGE_SIZE);
>
> - for (i = free->npages; i; --i, ++gfn) {
> - struct page *uvmem_page;
> + gfn = slot->base_gfn;
> + for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) {
> +
> + /* Fetch the VMA if addr is not in the latest fetched one */
> + if (!vma || (addr < vma->vm_start || addr >= vma->vm_end)) {
> + vma = find_vma_intersection(kvm->mm, addr, end);
> + if (!vma ||
> + vma->vm_start > addr || vma->vm_end < end) {
> + pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
> + break;
> + }
> + }
In Ram's series, kvmppc_memslot_page_merge() also walks the VMAs spanning
the memslot, but it uses a different logic for the same. Why can't these
two cases use the same method to walk the VMAs? Is there anything subtly
different between the two cases?
Regards,
Bharata.
* Re: [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-23 3:36 ` Bharata B Rao
@ 2020-07-23 12:32 ` Laurent Dufour
2020-07-23 14:06 ` Laurent Dufour
0 siblings, 1 reply; 10+ messages in thread
From: Laurent Dufour @ 2020-07-23 12:32 UTC (permalink / raw)
To: bharata, linuxram
Cc: linux-kernel, kvm-ppc, paulus, sukadev, linuxppc-dev, bauerman
On 23/07/2020 05:36, Bharata B Rao wrote:
> On Tue, Jul 21, 2020 at 12:42:02PM +0200, Laurent Dufour wrote:
>> When a secure memslot is dropped, all the pages backed in the secure device
>> (aka really backed by secure memory by the Ultravisor) should be paged out
>> to a normal page. Previously, this was achieved by triggering the page
>> fault mechanism which is calling kvmppc_svm_page_out() on each pages.
>>
>> This can't work when hot unplugging a memory slot because the memory slot
>> is flagged as invalid and gfn_to_pfn() is then not trying to access the
>> page, so the page fault mechanism is not triggered.
>>
>> Since the final goal is to make a call to kvmppc_svm_page_out() it seems
>> simpler to directly calling it instead of triggering such a mechanism. This
>> way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
>> memslot.
>>
>> Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
>> the call to __kvmppc_svm_page_out() is made.
>> As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
>> VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
>> addition, the mmap_sem is help in read mode during that time, not in write
>> mode since the virual memory layout is not impacted, and
>> kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
>>
>> Cc: Ram Pai <linuxram@us.ibm.com>
>> Cc: Bharata B Rao <bharata@linux.ibm.com>
>> Cc: Paul Mackerras <paulus@ozlabs.org>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>> arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++----------
>> 1 file changed, 37 insertions(+), 17 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
>> index 5a4b02d3f651..ba5c7c77cc3a 100644
>> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
>> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
>> @@ -624,35 +624,55 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
>> * fault on them, do fault time migration to replace the device PTEs in
>> * QEMU page table with normal PTEs from newly allocated pages.
>> */
>> -void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
>> +void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
>> struct kvm *kvm, bool skip_page_out)
>> {
>> int i;
>> struct kvmppc_uvmem_page_pvt *pvt;
>> - unsigned long pfn, uvmem_pfn;
>> - unsigned long gfn = free->base_gfn;
>> + struct page *uvmem_page;
>> + struct vm_area_struct *vma = NULL;
>> + unsigned long uvmem_pfn, gfn;
>> + unsigned long addr, end;
>> +
>> + mmap_read_lock(kvm->mm);
>> +
>> + addr = slot->userspace_addr;
>
> We typically use gfn_to_hva() for that, but that won't work for a
> memslot that is already marked INVALID which is the case here.
> I think it is ok to access slot->userspace_addr here of an INVALID
> memslot, but just thought of explictly bringing this up.
Which is explicitly mentioned above in the patch's description:
This can't work when hot unplugging a memory slot because the memory slot
is flagged as invalid and gfn_to_pfn() is then not trying to access the
page, so the page fault mechanism is not triggered.
>
>> + end = addr + (slot->npages * PAGE_SIZE);
>>
>> - for (i = free->npages; i; --i, ++gfn) {
>> - struct page *uvmem_page;
>> + gfn = slot->base_gfn;
>> + for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) {
>> +
>> + /* Fetch the VMA if addr is not in the latest fetched one */
>> + if (!vma || (addr < vma->vm_start || addr >= vma->vm_end)) {
>> + vma = find_vma_intersection(kvm->mm, addr, end);
>> + if (!vma ||
>> + vma->vm_start > addr || vma->vm_end < end) {
>> + pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
>> + break;
>> + }
>> + }
>
> In Ram's series, kvmppc_memslot_page_merge() also walks the VMAs spanning
> the memslot, but it uses a different logic for the same. Why can't these
> two cases use the same method to walk the VMAs? Is there anything subtly
> different between the two cases?
This is probably doable. At the time I wrote that patch,
kvmppc_memslot_page_merge() was not yet introduced, AFAIR.
That being said, it would help a lot to factorize that code... I'll let Ram
deal with that ;)
Cheers,
Laurent.
* Re: [PATCH v2 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
2020-07-23 12:32 ` Laurent Dufour
@ 2020-07-23 14:06 ` Laurent Dufour
0 siblings, 0 replies; 10+ messages in thread
From: Laurent Dufour @ 2020-07-23 14:06 UTC (permalink / raw)
To: bharata, linuxram
Cc: linux-kernel, kvm-ppc, paulus, sukadev, linuxppc-dev, bauerman
On 23/07/2020 14:32, Laurent Dufour wrote:
> On 23/07/2020 05:36, Bharata B Rao wrote:
>> On Tue, Jul 21, 2020 at 12:42:02PM +0200, Laurent Dufour wrote:
>>> When a secure memslot is dropped, all the pages backed in the secure device
>>> (aka really backed by secure memory by the Ultravisor) should be paged out
>>> to a normal page. Previously, this was achieved by triggering the page
>>> fault mechanism which is calling kvmppc_svm_page_out() on each pages.
>>>
>>> This can't work when hot unplugging a memory slot because the memory slot
>>> is flagged as invalid and gfn_to_pfn() is then not trying to access the
>>> page, so the page fault mechanism is not triggered.
>>>
>>> Since the final goal is to make a call to kvmppc_svm_page_out() it seems
>>> simpler to directly calling it instead of triggering such a mechanism. This
>>> way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
>>> memslot.
>>>
>>> Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
>>> the call to __kvmppc_svm_page_out() is made.
>>> As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
>>> VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
>>> addition, the mmap_sem is help in read mode during that time, not in write
>>> mode since the virual memory layout is not impacted, and
>>> kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
>>>
>>> Cc: Ram Pai <linuxram@us.ibm.com>
>>> Cc: Bharata B Rao <bharata@linux.ibm.com>
>>> Cc: Paul Mackerras <paulus@ozlabs.org>
>>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>>> ---
>>> arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++----------
>>> 1 file changed, 37 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> b/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> index 5a4b02d3f651..ba5c7c77cc3a 100644
>>> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> @@ -624,35 +624,55 @@ static inline int kvmppc_svm_page_out(struct
>>> vm_area_struct *vma,
>>> * fault on them, do fault time migration to replace the device PTEs in
>>> * QEMU page table with normal PTEs from newly allocated pages.
>>> */
>>> -void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
>>> +void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
>>> struct kvm *kvm, bool skip_page_out)
>>> {
>>> int i;
>>> struct kvmppc_uvmem_page_pvt *pvt;
>>> - unsigned long pfn, uvmem_pfn;
>>> - unsigned long gfn = free->base_gfn;
>>> + struct page *uvmem_page;
>>> + struct vm_area_struct *vma = NULL;
>>> + unsigned long uvmem_pfn, gfn;
>>> + unsigned long addr, end;
>>> +
>>> + mmap_read_lock(kvm->mm);
>>> +
>>> + addr = slot->userspace_addr;
>>
>> We typically use gfn_to_hva() for that, but that won't work for a
>> memslot that is already marked INVALID which is the case here.
>> I think it is ok to access slot->userspace_addr here of an INVALID
>> memslot, but just thought of explictly bringing this up.
>
> Which explicitly mentioned above in the patch's description:
>
> This can't work when hot unplugging a memory slot because the memory slot
> is flagged as invalid and gfn_to_pfn() is then not trying to access the
> page, so the page fault mechanism is not triggered.
>
>>
>>> + end = addr + (slot->npages * PAGE_SIZE);
>>> - for (i = free->npages; i; --i, ++gfn) {
>>> - struct page *uvmem_page;
>>> + gfn = slot->base_gfn;
>>> + for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) {
>>> +
>>> + /* Fetch the VMA if addr is not in the latest fetched one */
>>> + if (!vma || (addr < vma->vm_start || addr >= vma->vm_end)) {
>>> + vma = find_vma_intersection(kvm->mm, addr, end);
>>> + if (!vma ||
>>> + vma->vm_start > addr || vma->vm_end < end) {
>>> + pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
>>> + break;
>>> + }
>>> + }
>>
>> In Ram's series, kvmppc_memslot_page_merge() also walks the VMAs spanning
>> the memslot, but it uses a different logic for the same. Why can't these
>> two cases use the same method to walk the VMAs? Is there anything subtly
>> different between the two cases?
>
> This is probably doable. At the time I wrote that patch, the
> kvmppc_memslot_page_merge() was not yet introduced AFAIR.
>
> This being said, I'd help a lot to factorize that code... I let Ram dealing with
> that ;)
Indeed, I don't think this is relevant: the loop in kvmppc_memslot_page_merge()
makes one call (to ksm_madvise()) per VMA, while this code makes one
call per page of the VMA, which is completely different.
I don't think merging the two would be a good idea.
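To make the contrast concrete, the two loop shapes look roughly like this
(simplified sketches, not the actual code):

	/* per-VMA walk, kvmppc_memslot_page_merge()-style: one call per VMA */
	for (addr = start; addr < end; addr = vma->vm_end) {
		vma = find_vma_intersection(kvm->mm, addr, end);
		if (!vma)
			break;
		/* one ksm_madvise() call covering the whole VMA */
	}

	/* per-page walk, kvmppc_uvmem_drop_pages()-style: VMA cached lazily */
	for (addr = start; addr < end; addr += PAGE_SIZE) {
		if (!vma || addr >= vma->vm_end)
			vma = find_vma_intersection(kvm->mm, addr, addr + 1);
		if (!vma)
			break;
		/* one __kvmppc_svm_page_out() call for this single page */
	}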
Cheers,
Laurent.