* [PULL 0/8] KVM: s390: memory management and migration
@ 2014-04-09 10:44 Christian Borntraeger
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Christian Borntraeger
Marcelo, Gleb, (Paolo,)
The following changes since commit 7cbb39d4d4d530dff12f2ff06ed6c85c504ba91a:
Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm (2014-04-02 14:50:10 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git tags/kvm-s390-20140409
for you to fetch changes up to 384ee3e2a18893f9de84ce2f00cf786ad81fe08e:
KVM: s390: Add proper dirty bitmap support to S390 kvm. (2014-04-09 11:12:15 +0200)
----------------------------------------------------------------
here are two changes for KVM on s390:
- Newer Linux versions don't use the storage keys for dirty and
reference tracking. We can optimize the guest handling for
those guests for faults as well as page-in and page-out by
simply not caring about the guest visible storage key. We trap
guest storage key instructions to enable those keys only on demand.
- Migration bitmap: Until now s390 never provided a proper dirty
bitmap. Let's provide a proper migration bitmap for s390. We also
change the user dirty tracking to a fault based mechanism. This
makes the host completely independent from the storage keys. Long
term this will allow us to back guest memory with large pages.
Please note:
The first patch, commit 60bfdc8cd150 (KVM: s390: also set guest pages
back to stable on kexec/kdump), is only included to avoid a compile
error in the calls to page_table_reset_pgste. This commit already made
its way into Linus' git as commit id 1b6a19b34d54d3d56b9042 and we need
to adapt the new call as well.
The merge conflict between Linus' tree and the kvm tree is easy to solve.
(kvm tree has the final result)
All patches that touch s390 memory management are either ACKed or
written by the s390 maintainers.
----------------------------------------------------------------
Christian Borntraeger (1):
KVM: s390: also set guest pages back to stable on kexec/kdump
Dominik Dingel (5):
KVM: s390: Adding skey bit to mmu context
KVM: s390: Clear storage keys
KVM: s390: Allow skeys to be enabled for the current process
KVM: s390: Don't enable skeys by default
KVM: s390/mm: new gmap_test_and_clear_dirty function
Jason J. Herne (1):
KVM: s390: Add proper dirty bitmap support to S390 kvm.
Martin Schwidefsky (1):
KVM: s390/mm: use software dirty bit detection for user dirty tracking
arch/s390/include/asm/kvm_host.h | 3 +
arch/s390/include/asm/mmu.h | 2 +
arch/s390/include/asm/mmu_context.h | 1 +
arch/s390/include/asm/pgalloc.h | 3 +-
arch/s390/include/asm/pgtable.h | 169 +++++++++++++++++-------------------
arch/s390/kvm/diag.c | 8 +-
arch/s390/kvm/kvm-s390.c | 52 ++++++++++-
arch/s390/kvm/priv.c | 14 +++
arch/s390/kvm/trace.h | 14 +++
arch/s390/mm/pgtable.c | 88 ++++++++++++++++---
virt/kvm/kvm_main.c | 2 -
11 files changed, 248 insertions(+), 108 deletions(-)
* [PULL 1/8] KVM: s390: also set guest pages back to stable on kexec/kdump
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Christian Borntraeger
We need to reset the usage state of the pages on kexec/kdump,
which use subcode 0 and 1. We will only do the cmma reset in
the kernel; everything else is done in userspace as before.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/kvm/diag.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 03a05ff..08dfc83 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -167,6 +167,10 @@ static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
VCPU_EVENT(vcpu, 5, "diag ipl functions, subcode %lx", subcode);
switch (subcode) {
+ case 0:
+ case 1:
+ page_table_reset_pgste(current->mm, 0, TASK_SIZE);
+ return -EOPNOTSUPP;
case 3:
vcpu->run->s390_reset_flags = KVM_S390_RESET_CLEAR;
page_table_reset_pgste(current->mm, 0, TASK_SIZE);
--
1.8.4.2
* [PULL 2/8] KVM: s390: Adding skey bit to mmu context
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Dominik Dingel, Christian Borntraeger
From: Dominik Dingel <dingel@linux.vnet.ibm.com>
For lazy storage key handling, we need a mechanism to track if the
process ever issued a storage key operation.
This patch adds the basic infrastructure for making the storage
key handling optional, but still leaves it enabled for now by default.
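As an illustration only (not part of the patch), the gating idea can be sketched in plain userspace C. The names `toy_context`, `toy_use_skey` and `toy_update_all` are hypothetical stand-ins for `mm_context_t`, `mm_use_skey()` and the pgste helpers; the counter merely models how often a real storage key access would happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_context_t; only the two flags matter here. */
struct toy_context {
	unsigned int has_pgste:1;
	unsigned int use_skey:1;
};

/* Same shape as mm_use_skey(): report whether storage keys are in use. */
static inline bool toy_use_skey(const struct toy_context *ctx)
{
	return ctx->use_skey;
}

/*
 * Toy version of a key-synchronising helper. It returns 1 and bumps
 * *key_reads when it would have touched the storage key, 0 otherwise.
 */
static inline int toy_update_all(const struct toy_context *ctx,
				 bool pte_invalid,
				 unsigned long *key_reads)
{
	assert(key_reads != NULL);
	if (!toy_use_skey(ctx) || pte_invalid)
		return 0;	/* early exit: no storage key access */
	(*key_reads)++;		/* stands in for page_get_storage_key() */
	return 1;
}
```

With `use_skey` cleared, every call takes the early exit, which is the saving the lazy handling aims for.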
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/include/asm/mmu.h | 2 ++
arch/s390/include/asm/mmu_context.h | 1 +
arch/s390/include/asm/pgtable.h | 41 ++++++++++++++++++++++++-------------
3 files changed, 30 insertions(+), 14 deletions(-)
diff --git a/arch/s390/include/asm/mmu.h b/arch/s390/include/asm/mmu.h
index ff132ac..665cfb3 100644
--- a/arch/s390/include/asm/mmu.h
+++ b/arch/s390/include/asm/mmu.h
@@ -14,6 +14,8 @@ typedef struct {
unsigned long vdso_base;
/* The mmu context has extended page tables. */
unsigned int has_pgste:1;
+ /* The mmu context uses storage keys. */
+ unsigned int use_skey:1;
} mm_context_t;
#define INIT_MM_CONTEXT(name) \
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 38149b6..768e05a 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -22,6 +22,7 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.asce_bits |= _ASCE_TYPE_REGION3;
#endif
mm->context.has_pgste = 0;
+ mm->context.use_skey = 1;
mm->context.asce_limit = STACK_TOP_MAX;
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 50a75d9..818ab0f 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -466,6 +466,16 @@ static inline int mm_has_pgste(struct mm_struct *mm)
#endif
return 0;
}
+
+static inline int mm_use_skey(struct mm_struct *mm)
+{
+#ifdef CONFIG_PGSTE
+ if (mm->context.use_skey)
+ return 1;
+#endif
+ return 0;
+}
+
/*
* pgd/pmd/pte query functions
*/
@@ -699,12 +709,13 @@ static inline void pgste_set(pte_t *ptep, pgste_t pgste)
#endif
}
-static inline pgste_t pgste_update_all(pte_t *ptep, pgste_t pgste)
+static inline pgste_t pgste_update_all(pte_t *ptep, pgste_t pgste,
+ struct mm_struct *mm)
{
#ifdef CONFIG_PGSTE
unsigned long address, bits, skey;
- if (pte_val(*ptep) & _PAGE_INVALID)
+ if (!mm_use_skey(mm) || pte_val(*ptep) & _PAGE_INVALID)
return pgste;
address = pte_val(*ptep) & PAGE_MASK;
skey = (unsigned long) page_get_storage_key(address);
@@ -729,10 +740,11 @@ static inline pgste_t pgste_update_all(pte_t *ptep, pgste_t pgste)
}
-static inline pgste_t pgste_update_young(pte_t *ptep, pgste_t pgste)
+static inline pgste_t pgste_update_young(pte_t *ptep, pgste_t pgste,
+ struct mm_struct *mm)
{
#ifdef CONFIG_PGSTE
- if (pte_val(*ptep) & _PAGE_INVALID)
+ if (!mm_use_skey(mm) || pte_val(*ptep) & _PAGE_INVALID)
return pgste;
/* Get referenced bit from storage key */
if (page_reset_referenced(pte_val(*ptep) & PAGE_MASK))
@@ -741,13 +753,14 @@ static inline pgste_t pgste_update_young(pte_t *ptep, pgste_t pgste)
return pgste;
}
-static inline void pgste_set_key(pte_t *ptep, pgste_t pgste, pte_t entry)
+static inline void pgste_set_key(pte_t *ptep, pgste_t pgste, pte_t entry,
+ struct mm_struct *mm)
{
#ifdef CONFIG_PGSTE
unsigned long address;
unsigned long nkey;
- if (pte_val(entry) & _PAGE_INVALID)
+ if (!mm_use_skey(mm) || pte_val(entry) & _PAGE_INVALID)
return;
VM_BUG_ON(!(pte_val(*ptep) & _PAGE_INVALID));
address = pte_val(entry) & PAGE_MASK;
@@ -870,7 +883,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
if (mm_has_pgste(mm)) {
pgste = pgste_get_lock(ptep);
pgste_val(pgste) &= ~_PGSTE_GPS_ZERO;
- pgste_set_key(ptep, pgste, entry);
+ pgste_set_key(ptep, pgste, entry, mm);
pgste_set_pte(ptep, entry);
pgste_set_unlock(ptep, pgste);
} else {
@@ -1028,7 +1041,7 @@ static inline int ptep_test_and_clear_user_dirty(struct mm_struct *mm,
if (mm_has_pgste(mm)) {
pgste = pgste_get_lock(ptep);
- pgste = pgste_update_all(ptep, pgste);
+ pgste = pgste_update_all(ptep, pgste, mm);
dirty = !!(pgste_val(pgste) & PGSTE_HC_BIT);
pgste_val(pgste) &= ~PGSTE_HC_BIT;
pgste_set_unlock(ptep, pgste);
@@ -1048,7 +1061,7 @@ static inline int ptep_test_and_clear_user_young(struct mm_struct *mm,
if (mm_has_pgste(mm)) {
pgste = pgste_get_lock(ptep);
- pgste = pgste_update_young(ptep, pgste);
+ pgste = pgste_update_young(ptep, pgste, mm);
young = !!(pgste_val(pgste) & PGSTE_HR_BIT);
pgste_val(pgste) &= ~PGSTE_HR_BIT;
pgste_set_unlock(ptep, pgste);
@@ -1159,7 +1172,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
pte_val(*ptep) = _PAGE_INVALID;
if (mm_has_pgste(mm)) {
- pgste = pgste_update_all(&pte, pgste);
+ pgste = pgste_update_all(&pte, pgste, mm);
pgste_set_unlock(ptep, pgste);
}
return pte;
@@ -1182,7 +1195,7 @@ static inline pte_t ptep_modify_prot_start(struct mm_struct *mm,
ptep_flush_lazy(mm, address, ptep);
if (mm_has_pgste(mm)) {
- pgste = pgste_update_all(&pte, pgste);
+ pgste = pgste_update_all(&pte, pgste, mm);
pgste_set(ptep, pgste);
}
return pte;
@@ -1196,7 +1209,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
if (mm_has_pgste(mm)) {
pgste = pgste_get(ptep);
- pgste_set_key(ptep, pgste, pte);
+ pgste_set_key(ptep, pgste, pte, mm);
pgste_set_pte(ptep, pte);
pgste_set_unlock(ptep, pgste);
} else
@@ -1223,7 +1236,7 @@ static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
if ((pgste_val(pgste) & _PGSTE_GPS_USAGE_MASK) ==
_PGSTE_GPS_USAGE_UNUSED)
pte_val(pte) |= _PAGE_UNUSED;
- pgste = pgste_update_all(&pte, pgste);
+ pgste = pgste_update_all(&pte, pgste, vma->vm_mm);
pgste_set_unlock(ptep, pgste);
}
return pte;
@@ -1255,7 +1268,7 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
pte_val(*ptep) = _PAGE_INVALID;
if (!full && mm_has_pgste(mm)) {
- pgste = pgste_update_all(&pte, pgste);
+ pgste = pgste_update_all(&pte, pgste, mm);
pgste_set_unlock(ptep, pgste);
}
return pte;
--
1.8.4.2
* [PULL 3/8] KVM: s390: Clear storage keys
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Dominik Dingel, Christian Borntraeger
From: Dominik Dingel <dingel@linux.vnet.ibm.com>
page_table_reset_pgste() already does a complete page table walk to
reset the pgste. Enhance it to initialize the storage keys to
PAGE_DEFAULT_KEY if requested by the caller. This will be used
for lazy storage key handling.
Let's adapt the current code (diag 308) so that it does not clear the keys.
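A rough userspace sketch of the extended walk, for illustration only: the multi-level page table is flattened into an array, `toy_page` and `toy_reset_pgste` are hypothetical names, and the default key value of 0 is an assumption made for this toy model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DEFAULT_KEY	0x00u	/* assumed value, stands in for PAGE_DEFAULT_KEY */

/* Hypothetical flattened "page table": one entry per page. */
struct toy_page {
	bool invalid;		/* models _PAGE_INVALID */
	bool writable;		/* models _PAGE_WRITE */
	unsigned int usage;	/* models the pgste usage state */
	unsigned char skey;	/* models the real storage key */
};

/*
 * Reset the usage state of every page; if init_skey is set, also
 * initialise the storage key of valid, writable pages.
 * Returns the number of keys that were initialised.
 */
static inline size_t toy_reset_pgste(struct toy_page *pages, size_t n,
				     bool init_skey)
{
	size_t keys_reset = 0;

	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		pages[i].usage = 0;	/* always done, as before the patch */
		if (!init_skey)
			continue;
		/* skip invalid and not writable pages */
		if (pages[i].invalid || !pages[i].writable)
			continue;
		pages[i].skey = TOY_DEFAULT_KEY;
		keys_reset++;
	}
	return keys_reset;
}
```

Callers that pass `false`, like the diag 308 paths, keep the old behaviour of touching only the usage state.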
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/include/asm/pgalloc.h | 3 ++-
arch/s390/kvm/diag.c | 6 +++---
arch/s390/mm/pgtable.c | 38 +++++++++++++++++++++++++++-----------
3 files changed, 32 insertions(+), 15 deletions(-)
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 884017c..9e18a61 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -22,7 +22,8 @@ unsigned long *page_table_alloc(struct mm_struct *, unsigned long);
void page_table_free(struct mm_struct *, unsigned long *);
void page_table_free_rcu(struct mmu_gather *, unsigned long *);
-void page_table_reset_pgste(struct mm_struct *, unsigned long, unsigned long);
+void page_table_reset_pgste(struct mm_struct *, unsigned long, unsigned long,
+ bool init_skey);
int set_guest_storage_key(struct mm_struct *mm, unsigned long addr,
unsigned long key, bool nq);
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 08dfc83..44dcfa8 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -169,15 +169,15 @@ static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
switch (subcode) {
case 0:
case 1:
- page_table_reset_pgste(current->mm, 0, TASK_SIZE);
+ page_table_reset_pgste(current->mm, 0, TASK_SIZE, false);
return -EOPNOTSUPP;
case 3:
vcpu->run->s390_reset_flags = KVM_S390_RESET_CLEAR;
- page_table_reset_pgste(current->mm, 0, TASK_SIZE);
+ page_table_reset_pgste(current->mm, 0, TASK_SIZE, false);
break;
case 4:
vcpu->run->s390_reset_flags = 0;
- page_table_reset_pgste(current->mm, 0, TASK_SIZE);
+ page_table_reset_pgste(current->mm, 0, TASK_SIZE, false);
break;
default:
return -EOPNOTSUPP;
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 796c932..5f5d643 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -878,8 +878,8 @@ static inline void page_table_free_pgste(unsigned long *table)
__free_page(page);
}
-static inline unsigned long page_table_reset_pte(struct mm_struct *mm,
- pmd_t *pmd, unsigned long addr, unsigned long end)
+static inline unsigned long page_table_reset_pte(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long addr, unsigned long end, bool init_skey)
{
pte_t *start_pte, *pte;
spinlock_t *ptl;
@@ -890,6 +890,22 @@ static inline unsigned long page_table_reset_pte(struct mm_struct *mm,
do {
pgste = pgste_get_lock(pte);
pgste_val(pgste) &= ~_PGSTE_GPS_USAGE_MASK;
+ if (init_skey) {
+ unsigned long address;
+
+ pgste_val(pgste) &= ~(PGSTE_ACC_BITS | PGSTE_FP_BIT |
+ PGSTE_GR_BIT | PGSTE_GC_BIT);
+
+ /* skip invalid and not writable pages */
+ if (pte_val(*pte) & _PAGE_INVALID ||
+ !(pte_val(*pte) & _PAGE_WRITE)) {
+ pgste_set_unlock(pte, pgste);
+ continue;
+ }
+
+ address = pte_val(*pte) & PAGE_MASK;
+ page_set_storage_key(address, PAGE_DEFAULT_KEY, 1);
+ }
pgste_set_unlock(pte, pgste);
} while (pte++, addr += PAGE_SIZE, addr != end);
pte_unmap_unlock(start_pte, ptl);
@@ -897,8 +913,8 @@ static inline unsigned long page_table_reset_pte(struct mm_struct *mm,
return addr;
}
-static inline unsigned long page_table_reset_pmd(struct mm_struct *mm,
- pud_t *pud, unsigned long addr, unsigned long end)
+static inline unsigned long page_table_reset_pmd(struct mm_struct *mm, pud_t *pud,
+ unsigned long addr, unsigned long end, bool init_skey)
{
unsigned long next;
pmd_t *pmd;
@@ -908,14 +924,14 @@ static inline unsigned long page_table_reset_pmd(struct mm_struct *mm,
next = pmd_addr_end(addr, end);
if (pmd_none_or_clear_bad(pmd))
continue;
- next = page_table_reset_pte(mm, pmd, addr, next);
+ next = page_table_reset_pte(mm, pmd, addr, next, init_skey);
} while (pmd++, addr = next, addr != end);
return addr;
}
-static inline unsigned long page_table_reset_pud(struct mm_struct *mm,
- pgd_t *pgd, unsigned long addr, unsigned long end)
+static inline unsigned long page_table_reset_pud(struct mm_struct *mm, pgd_t *pgd,
+ unsigned long addr, unsigned long end, bool init_skey)
{
unsigned long next;
pud_t *pud;
@@ -925,14 +941,14 @@ static inline unsigned long page_table_reset_pud(struct mm_struct *mm,
next = pud_addr_end(addr, end);
if (pud_none_or_clear_bad(pud))
continue;
- next = page_table_reset_pmd(mm, pud, addr, next);
+ next = page_table_reset_pmd(mm, pud, addr, next, init_skey);
} while (pud++, addr = next, addr != end);
return addr;
}
-void page_table_reset_pgste(struct mm_struct *mm,
- unsigned long start, unsigned long end)
+void page_table_reset_pgste(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool init_skey)
{
unsigned long addr, next;
pgd_t *pgd;
@@ -944,7 +960,7 @@ void page_table_reset_pgste(struct mm_struct *mm,
next = pgd_addr_end(addr, end);
if (pgd_none_or_clear_bad(pgd))
continue;
- next = page_table_reset_pud(mm, pgd, addr, next);
+ next = page_table_reset_pud(mm, pgd, addr, next, init_skey);
} while (pgd++, addr = next, addr != end);
up_read(&mm->mmap_sem);
}
--
1.8.4.2
* [PULL 4/8] KVM: s390: Allow skeys to be enabled for the current process
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Dominik Dingel, Christian Borntraeger
From: Dominik Dingel <dingel@linux.vnet.ibm.com>
Introduce a new function s390_enable_skey(), which enables storage key
handling via setting the use_skey flag in the mmu context.
This function is only useful within the context of kvm.
Note that enabling storage keys will cause a one-time hiccup when
walking the page table; however, it saves us special effort for cases
like clear reset while making it possible for us to conform to the
architecture.
s390_enable_skey() takes the page table lock so that concurrent calls
from multiple vcpus do not reset the storage keys more than once.
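The check-and-set pattern can be sketched in userspace C as follows. This is an illustration under assumptions, not the kernel code: a pthread mutex is swapped in for the page table spinlock, and `toy_mm`, `toy_enable_skey` and the `resets` counter are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-mm state used by s390_enable_skey(). */
struct toy_mm {
	pthread_mutex_t lock;	/* plays the role of page_table_lock */
	bool use_skey;
	unsigned int resets;	/* how often the full key reset ran */
};

/* Returns true if this caller performed the one-time key reset. */
static inline bool toy_enable_skey(struct toy_mm *mm)
{
	assert(mm != NULL);
	pthread_mutex_lock(&mm->lock);
	if (mm->use_skey) {
		/* another caller already enabled storage keys */
		pthread_mutex_unlock(&mm->lock);
		return false;
	}
	mm->use_skey = true;
	pthread_mutex_unlock(&mm->lock);

	/* the expensive page table walk runs after the lock is dropped */
	mm->resets++;
	return true;
}
```

Only the flag test and update happen under the lock; the caller that flips the flag is the only one that goes on to do the walk.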
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/include/asm/pgtable.h | 1 +
arch/s390/mm/pgtable.c | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 818ab0f..bc4eb35 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1691,6 +1691,7 @@ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
extern int vmem_add_mapping(unsigned long start, unsigned long size);
extern int vmem_remove_mapping(unsigned long start, unsigned long size);
extern int s390_enable_sie(void);
+extern void s390_enable_skey(void);
/*
* No page table caches to initialise
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 5f5d643..25e4a2a 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -1368,6 +1368,29 @@ int s390_enable_sie(void)
}
EXPORT_SYMBOL_GPL(s390_enable_sie);
+/*
+ * Enable storage key handling from now on and initialize the storage
+ * keys with the default key.
+ */
+void s390_enable_skey(void)
+{
+ /*
+ * To avoid races between multiple vcpus, ending in calling
+ * page_table_reset twice or more,
+ * the page_table_lock is taken for serialization.
+ */
+ spin_lock(&current->mm->page_table_lock);
+ if (mm_use_skey(current->mm)) {
+ spin_unlock(&current->mm->page_table_lock);
+ return;
+ }
+
+ current->mm->context.use_skey = 1;
+ spin_unlock(&current->mm->page_table_lock);
+ page_table_reset_pgste(current->mm, 0, TASK_SIZE, true);
+}
+EXPORT_SYMBOL_GPL(s390_enable_skey);
+
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
int pmdp_clear_flush_young(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp)
--
1.8.4.2
* [PULL 5/8] KVM: s390: Don't enable skeys by default
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Dominik Dingel, Christian Borntraeger
From: Dominik Dingel <dingel@linux.vnet.ibm.com>
The first invocation of storage key operations on a given cpu will be intercepted.
On these intercepts we will enable storage keys for the guest and remove the
previously added intercepts.
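A small userspace model of this arm-once, disarm-on-first-use scheme, for illustration only. The three `ICTL_*` values are taken from the patch; `toy_vcpu`, `ICTL_SKEY_ALL` and the `skey_enables` counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Intercept-control bits as defined in the patch. */
#define ICTL_ISKE 0x00004000u
#define ICTL_SSKE 0x00002000u
#define ICTL_RRBE 0x00001000u
/* Hypothetical convenience mask, not in the patch. */
#define ICTL_SKEY_ALL (ICTL_ISKE | ICTL_SSKE | ICTL_RRBE)

struct toy_vcpu {
	uint32_t ictl;			/* models sie_block->ictl */
	unsigned int skey_enables;	/* calls that reached the enable step */
};

/* vcpu setup: arm the intercepts for all three key instructions. */
static inline void toy_vcpu_setup(struct toy_vcpu *vcpu)
{
	assert(vcpu != NULL);
	vcpu->ictl |= ICTL_SKEY_ALL;
}

/* First key instruction on this vcpu: enable keys, then disarm. */
static inline bool toy_skey_check_enable(struct toy_vcpu *vcpu)
{
	assert(vcpu != NULL);
	if (!(vcpu->ictl & ICTL_SKEY_ALL))
		return false;		/* intercepts already removed */
	vcpu->skey_enables++;		/* stands in for s390_enable_skey() */
	vcpu->ictl &= ~ICTL_SKEY_ALL;
	return true;
}
```

Other bits in `ictl`, such as `ICTL_LPSW`, are left untouched by both helpers.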
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/include/asm/kvm_host.h | 3 +++
arch/s390/include/asm/mmu_context.h | 2 +-
arch/s390/kvm/kvm-s390.c | 1 +
arch/s390/kvm/priv.c | 14 ++++++++++++++
arch/s390/kvm/trace.h | 14 ++++++++++++++
5 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 154b600..a993b6f 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -89,6 +89,9 @@ struct kvm_s390_sie_block {
__u16 lctl; /* 0x0044 */
__s16 icpua; /* 0x0046 */
#define ICTL_LPSW 0x00400000
+#define ICTL_ISKE 0x00004000
+#define ICTL_SSKE 0x00002000
+#define ICTL_RRBE 0x00001000
__u32 ictl; /* 0x0048 */
__u32 eca; /* 0x004c */
__u8 icptcode; /* 0x0050 */
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 768e05a..00a6b74 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -22,7 +22,7 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.asce_bits |= _ASCE_TYPE_REGION3;
#endif
mm->context.has_pgste = 0;
- mm->context.use_skey = 1;
+ mm->context.use_skey = 0;
mm->context.asce_limit = STACK_TOP_MAX;
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index b3ecb8f..b767ec9 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -465,6 +465,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->ecb2 = 8;
vcpu->arch.sie_block->eca = 0xC1002001U;
vcpu->arch.sie_block->fac = (int) (long) vfacilities;
+ vcpu->arch.sie_block->ictl |= ICTL_ISKE | ICTL_SSKE | ICTL_RRBE;
if (kvm_enabled_cmma()) {
cbrl = alloc_page(GFP_KERNEL | __GFP_ZERO);
if (cbrl) {
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 476e9e2..8a63e99 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -147,8 +147,21 @@ static int handle_store_cpu_address(struct kvm_vcpu *vcpu)
return 0;
}
+static void __skey_check_enable(struct kvm_vcpu *vcpu)
+{
+ if (!(vcpu->arch.sie_block->ictl & (ICTL_ISKE | ICTL_SSKE | ICTL_RRBE)))
+ return;
+
+ s390_enable_skey();
+ trace_kvm_s390_skey_related_inst(vcpu);
+ vcpu->arch.sie_block->ictl &= ~(ICTL_ISKE | ICTL_SSKE | ICTL_RRBE);
+}
+
+
static int handle_skey(struct kvm_vcpu *vcpu)
{
+ __skey_check_enable(vcpu);
+
vcpu->stat.instruction_storage_key++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
@@ -618,6 +631,7 @@ static int handle_pfmf(struct kvm_vcpu *vcpu)
}
if (vcpu->run->s.regs.gprs[reg1] & PFMF_SK) {
+ __skey_check_enable(vcpu);
if (set_guest_storage_key(current->mm, useraddr,
vcpu->run->s.regs.gprs[reg1] & PFMF_KEY,
vcpu->run->s.regs.gprs[reg1] & PFMF_NQ))
diff --git a/arch/s390/kvm/trace.h b/arch/s390/kvm/trace.h
index e8e7213..a4bf7d7 100644
--- a/arch/s390/kvm/trace.h
+++ b/arch/s390/kvm/trace.h
@@ -30,6 +30,20 @@
TP_printk("%02d[%016lx-%016lx]: " p_str, __entry->id, \
__entry->pswmask, __entry->pswaddr, p_args)
+TRACE_EVENT(kvm_s390_skey_related_inst,
+ TP_PROTO(VCPU_PROTO_COMMON),
+ TP_ARGS(VCPU_ARGS_COMMON),
+
+ TP_STRUCT__entry(
+ VCPU_FIELD_COMMON
+ ),
+
+ TP_fast_assign(
+ VCPU_ASSIGN_COMMON
+ ),
+ VCPU_TP_PRINTK("%s", "first instruction related to skeys on vcpu")
+ );
+
TRACE_EVENT(kvm_s390_major_guest_pfault,
TP_PROTO(VCPU_PROTO_COMMON),
TP_ARGS(VCPU_ARGS_COMMON),
--
1.8.4.2
* [PULL 6/8] KVM: s390/mm: use software dirty bit detection for user dirty tracking
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Dominik Dingel, Christian Borntraeger
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
Switch the user dirty bit detection used for migration from the hardware
provided host change-bit in the storage key/pgste to a fault based
detection method. Turns out that this is actually faster than the
old storage key operations.
As a bonus, this reduces the dependency of the host on the storage key
to a point where it becomes possible to enable the RCP bypass for KVM
guests. In the future this might be used for backing guests with large
pages.
The fault based dirty detection will only indicate changes caused
by accesses via the guest address space. The hardware based method
can detect all changes, even those caused by I/O or accesses via the
kernel page table. The KVM/qemu code needs to take this into account.
Dirty tracking for vhost and I/O has the same challenge on x86, so
everything is already in place (vhost log, fallback to non-dataplane..)
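The fault-based scheme can be modelled in a few lines of userspace C. This is a simplified sketch under assumptions: `toy_pte` collapses the pte and pgste into one struct, and the helper names are hypothetical; only the protect, fault, mark-dirty cycle is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified pte/pgste pair for one guest page. */
struct toy_pte {
	bool write_protected;	/* models _PAGE_PROTECT */
	bool user_dirty;	/* models PGSTE_UC_BIT */
	unsigned int faults;	/* protection faults taken so far */
};

/* Guest write: traps at most once per tracking round. */
static inline void toy_guest_write(struct toy_pte *pte)
{
	assert(pte != NULL);
	if (pte->write_protected) {
		pte->faults++;			/* protection fault */
		pte->write_protected = false;	/* writable pte is installed */
		pte->user_dirty = true;		/* ... which sets user-dirty */
	}
}

/* Models ptep_test_and_clear_user_dirty(): report, clear and re-arm. */
static inline bool toy_test_and_clear_user_dirty(struct toy_pte *pte)
{
	bool dirty;

	assert(pte != NULL);
	dirty = pte->user_dirty;
	pte->user_dirty = false;
	if (dirty)
		pte->write_protected = true;	/* next write faults again */
	return dirty;
}
```

A write that bypasses `toy_guest_write()`, for example I/O, never sets `user_dirty` in this model, which mirrors the limitation described above.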
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
[Dominik Dingel: fixups]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/include/asm/pgtable.h | 135 +++++++++++++++++-----------------------
arch/s390/mm/pgtable.c | 6 +-
2 files changed, 59 insertions(+), 82 deletions(-)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index bc4eb35..d75eb76 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -309,7 +309,8 @@ extern unsigned long MODULES_END;
#define PGSTE_HC_BIT 0x00200000UL
#define PGSTE_GR_BIT 0x00040000UL
#define PGSTE_GC_BIT 0x00020000UL
-#define PGSTE_IN_BIT 0x00008000UL /* IPTE notify bit */
+#define PGSTE_UC_BIT 0x00008000UL /* user dirty (migration) */
+#define PGSTE_IN_BIT 0x00004000UL /* IPTE notify bit */
#else /* CONFIG_64BIT */
@@ -391,7 +392,8 @@ extern unsigned long MODULES_END;
#define PGSTE_HC_BIT 0x0020000000000000UL
#define PGSTE_GR_BIT 0x0004000000000000UL
#define PGSTE_GC_BIT 0x0002000000000000UL
-#define PGSTE_IN_BIT 0x0000800000000000UL /* IPTE notify bit */
+#define PGSTE_UC_BIT 0x0000800000000000UL /* user dirty (migration) */
+#define PGSTE_IN_BIT 0x0000400000000000UL /* IPTE notify bit */
#endif /* CONFIG_64BIT */
@@ -720,16 +722,6 @@ static inline pgste_t pgste_update_all(pte_t *ptep, pgste_t pgste,
address = pte_val(*ptep) & PAGE_MASK;
skey = (unsigned long) page_get_storage_key(address);
bits = skey & (_PAGE_CHANGED | _PAGE_REFERENCED);
- if (!(pgste_val(pgste) & PGSTE_HC_BIT) && (bits & _PAGE_CHANGED)) {
- /* Transfer dirty + referenced bit to host bits in pgste */
- pgste_val(pgste) |= bits << 52;
- page_set_storage_key(address, skey ^ bits, 0);
- } else if (!(pgste_val(pgste) & PGSTE_HR_BIT) &&
- (bits & _PAGE_REFERENCED)) {
- /* Transfer referenced bit to host bit in pgste */
- pgste_val(pgste) |= PGSTE_HR_BIT;
- page_reset_referenced(address);
- }
/* Transfer page changed & referenced bit to guest bits in pgste */
pgste_val(pgste) |= bits << 48; /* GR bit & GC bit */
/* Copy page access key and fetch protection bit to pgste */
@@ -740,19 +732,6 @@ static inline pgste_t pgste_update_all(pte_t *ptep, pgste_t pgste,
}
-static inline pgste_t pgste_update_young(pte_t *ptep, pgste_t pgste,
- struct mm_struct *mm)
-{
-#ifdef CONFIG_PGSTE
- if (!mm_use_skey(mm) || pte_val(*ptep) & _PAGE_INVALID)
- return pgste;
- /* Get referenced bit from storage key */
- if (page_reset_referenced(pte_val(*ptep) & PAGE_MASK))
- pgste_val(pgste) |= PGSTE_HR_BIT | PGSTE_GR_BIT;
-#endif
- return pgste;
-}
-
static inline void pgste_set_key(pte_t *ptep, pgste_t pgste, pte_t entry,
struct mm_struct *mm)
{
@@ -770,23 +749,30 @@ static inline void pgste_set_key(pte_t *ptep, pgste_t pgste, pte_t entry,
* key C/R to 0.
*/
nkey = (pgste_val(pgste) & (PGSTE_ACC_BITS | PGSTE_FP_BIT)) >> 56;
+ nkey |= (pgste_val(pgste) & (PGSTE_GR_BIT | PGSTE_GC_BIT)) >> 48;
page_set_storage_key(address, nkey, 0);
#endif
}
-static inline void pgste_set_pte(pte_t *ptep, pte_t entry)
+static inline pgste_t pgste_set_pte(pte_t *ptep, pgste_t pgste, pte_t entry)
{
- if (!MACHINE_HAS_ESOP &&
- (pte_val(entry) & _PAGE_PRESENT) &&
- (pte_val(entry) & _PAGE_WRITE)) {
- /*
- * Without enhanced suppression-on-protection force
- * the dirty bit on for all writable ptes.
- */
- pte_val(entry) |= _PAGE_DIRTY;
- pte_val(entry) &= ~_PAGE_PROTECT;
+ if ((pte_val(entry) & _PAGE_PRESENT) &&
+ (pte_val(entry) & _PAGE_WRITE) &&
+ !(pte_val(entry) & _PAGE_INVALID)) {
+ if (!MACHINE_HAS_ESOP) {
+ /*
+ * Without enhanced suppression-on-protection force
+ * the dirty bit on for all writable ptes.
+ */
+ pte_val(entry) |= _PAGE_DIRTY;
+ pte_val(entry) &= ~_PAGE_PROTECT;
+ }
+ if (!(pte_val(entry) & _PAGE_PROTECT))
+ /* This pte allows write access, set user-dirty */
+ pgste_val(pgste) |= PGSTE_UC_BIT;
}
*ptep = entry;
+ return pgste;
}
/**
@@ -884,7 +870,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
pgste = pgste_get_lock(ptep);
pgste_val(pgste) &= ~_PGSTE_GPS_ZERO;
pgste_set_key(ptep, pgste, entry, mm);
- pgste_set_pte(ptep, entry);
+ pgste = pgste_set_pte(ptep, pgste, entry);
pgste_set_unlock(ptep, pgste);
} else {
if (!(pte_val(entry) & _PAGE_INVALID) && MACHINE_HAS_EDAT1)
@@ -1030,45 +1016,6 @@ static inline pte_t pte_mkhuge(pte_t pte)
}
#endif
-/*
- * Get (and clear) the user dirty bit for a pte.
- */
-static inline int ptep_test_and_clear_user_dirty(struct mm_struct *mm,
- pte_t *ptep)
-{
- pgste_t pgste;
- int dirty = 0;
-
- if (mm_has_pgste(mm)) {
- pgste = pgste_get_lock(ptep);
- pgste = pgste_update_all(ptep, pgste, mm);
- dirty = !!(pgste_val(pgste) & PGSTE_HC_BIT);
- pgste_val(pgste) &= ~PGSTE_HC_BIT;
- pgste_set_unlock(ptep, pgste);
- return dirty;
- }
- return dirty;
-}
-
-/*
- * Get (and clear) the user referenced bit for a pte.
- */
-static inline int ptep_test_and_clear_user_young(struct mm_struct *mm,
- pte_t *ptep)
-{
- pgste_t pgste;
- int young = 0;
-
- if (mm_has_pgste(mm)) {
- pgste = pgste_get_lock(ptep);
- pgste = pgste_update_young(ptep, pgste, mm);
- young = !!(pgste_val(pgste) & PGSTE_HR_BIT);
- pgste_val(pgste) &= ~PGSTE_HR_BIT;
- pgste_set_unlock(ptep, pgste);
- }
- return young;
-}
-
static inline void __ptep_ipte(unsigned long address, pte_t *ptep)
{
unsigned long pto = (unsigned long) ptep;
@@ -1108,6 +1055,36 @@ static inline void ptep_flush_lazy(struct mm_struct *mm,
atomic_sub(0x10000, &mm->context.attach_count);
}
+/*
+ * Get (and clear) the user dirty bit for a pte.
+ */
+static inline int ptep_test_and_clear_user_dirty(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep)
+{
+ pgste_t pgste;
+ pte_t pte;
+ int dirty;
+
+ if (!mm_has_pgste(mm))
+ return 0;
+ pgste = pgste_get_lock(ptep);
+ dirty = !!(pgste_val(pgste) & PGSTE_UC_BIT);
+ pgste_val(pgste) &= ~PGSTE_UC_BIT;
+ pte = *ptep;
+ if (dirty && (pte_val(pte) & _PAGE_PRESENT)) {
+ pgste = pgste_ipte_notify(mm, ptep, pgste);
+ __ptep_ipte(addr, ptep);
+ if (MACHINE_HAS_ESOP || !(pte_val(pte) & _PAGE_WRITE))
+ pte_val(pte) |= _PAGE_PROTECT;
+ else
+ pte_val(pte) |= _PAGE_INVALID;
+ *ptep = pte;
+ }
+ pgste_set_unlock(ptep, pgste);
+ return dirty;
+}
+
#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
@@ -1127,7 +1104,7 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
pte = pte_mkold(pte);
if (mm_has_pgste(vma->vm_mm)) {
- pgste_set_pte(ptep, pte);
+ pgste = pgste_set_pte(ptep, pgste, pte);
pgste_set_unlock(ptep, pgste);
} else
*ptep = pte;
@@ -1210,7 +1187,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
if (mm_has_pgste(mm)) {
pgste = pgste_get(ptep);
pgste_set_key(ptep, pgste, pte, mm);
- pgste_set_pte(ptep, pte);
+ pgste = pgste_set_pte(ptep, pgste, pte);
pgste_set_unlock(ptep, pgste);
} else
*ptep = pte;
@@ -1291,7 +1268,7 @@ static inline pte_t ptep_set_wrprotect(struct mm_struct *mm,
pte = pte_wrprotect(pte);
if (mm_has_pgste(mm)) {
- pgste_set_pte(ptep, pte);
+ pgste = pgste_set_pte(ptep, pgste, pte);
pgste_set_unlock(ptep, pgste);
} else
*ptep = pte;
@@ -1316,7 +1293,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
ptep_flush_direct(vma->vm_mm, address, ptep);
if (mm_has_pgste(vma->vm_mm)) {
- pgste_set_pte(ptep, entry);
+ pgste = pgste_set_pte(ptep, pgste, entry);
pgste_set_unlock(ptep, pgste);
} else
*ptep = entry;
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 25e4a2a..442a266 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -827,6 +827,7 @@ void gmap_do_ipte_notify(struct mm_struct *mm, pte_t *pte)
}
spin_unlock(&gmap_notifier_lock);
}
+EXPORT_SYMBOL_GPL(gmap_do_ipte_notify);
static inline int page_table_with_pgste(struct page *page)
{
@@ -859,8 +860,7 @@ static inline unsigned long *page_table_alloc_pgste(struct mm_struct *mm,
atomic_set(&page->_mapcount, 0);
table = (unsigned long *) page_to_phys(page);
clear_table(table, _PAGE_INVALID, PAGE_SIZE/2);
- clear_table(table + PTRS_PER_PTE, PGSTE_HR_BIT | PGSTE_HC_BIT,
- PAGE_SIZE/2);
+ clear_table(table + PTRS_PER_PTE, 0, PAGE_SIZE/2);
return table;
}
@@ -1000,7 +1000,7 @@ int set_guest_storage_key(struct mm_struct *mm, unsigned long addr,
/* changing the guest storage key is considered a change of the page */
if ((pgste_val(new) ^ pgste_val(old)) &
(PGSTE_ACC_BITS | PGSTE_FP_BIT | PGSTE_GR_BIT | PGSTE_GC_BIT))
- pgste_val(new) |= PGSTE_HC_BIT;
+ pgste_val(new) |= PGSTE_UC_BIT;
pgste_set_unlock(ptep, new);
pte_unmap_unlock(*ptep, ptl);
--
1.8.4.2
* [PULL 7/8] KVM: s390/mm: new gmap_test_and_clear_dirty function
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Dominik Dingel, Christian Borntraeger
From: Dominik Dingel <dingel@linux.vnet.ibm.com>
For live migration, KVM needs to test and clear the dirty bit of guest pages.
Let's provide gmap_test_and_clear_dirty, which calls
ptep_test_and_clear_user_dirty under the page table lock, in
architecture-specific memory management code, since we need proper locking.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
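The test-and-clear-under-lock pattern described above can be sketched in
userspace C roughly as follows. This is only an illustration, not kernel code:
`struct fake_pte`, `FAKE_UC_BIT`, and the `atomic_flag` spinlock standing in
for the page table lock are all assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-page "user dirty" bit. */
#define FAKE_UC_BIT 0x1UL

/* Hypothetical page entry: a lock plus a status word. */
struct fake_pte {
	atomic_flag lock;      /* stand-in for the page table lock */
	unsigned long status;  /* stand-in for the per-page status bits */
};

/*
 * Return whether the dirty bit was set and clear it, with both steps
 * done while holding the lock so no concurrent update is lost.
 */
static bool fake_test_and_clear_dirty(struct fake_pte *pte)
{
	bool dirty;

	/* Spin until the lock is acquired. */
	while (atomic_flag_test_and_set_explicit(&pte->lock,
						 memory_order_acquire))
		;
	dirty = (pte->status & FAKE_UC_BIT) != 0;
	pte->status &= ~FAKE_UC_BIT;
	atomic_flag_clear_explicit(&pte->lock, memory_order_release);
	return dirty;
}
```

A second call on the same entry reports "clean" until something sets the bit
again, which is the property a migration loop relies on.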
arch/s390/include/asm/pgtable.h | 2 ++
arch/s390/mm/pgtable.c | 21 +++++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index d75eb76..ab65035 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -838,6 +838,8 @@ unsigned long __gmap_fault(unsigned long address, struct gmap *);
unsigned long gmap_fault(unsigned long address, struct gmap *);
void gmap_discard(unsigned long from, unsigned long to, struct gmap *);
void __gmap_zap(unsigned long address, struct gmap *);
+bool gmap_test_and_clear_dirty(unsigned long address, struct gmap *);
+
void gmap_register_ipte_notifier(struct gmap_notifier *);
void gmap_unregister_ipte_notifier(struct gmap_notifier *);
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 442a266..8979847 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -1391,6 +1391,27 @@ void s390_enable_skey(void)
}
EXPORT_SYMBOL_GPL(s390_enable_skey);
+/*
+ * Test and reset if a guest page is dirty
+ */
+bool gmap_test_and_clear_dirty(unsigned long address, struct gmap *gmap)
+{
+ pte_t *pte;
+ spinlock_t *ptl;
+ bool dirty = false;
+
+ pte = get_locked_pte(gmap->mm, address, &ptl);
+ if (unlikely(!pte))
+ return false;
+
+ if (ptep_test_and_clear_user_dirty(gmap->mm, address, pte))
+ dirty = true;
+
+ spin_unlock(ptl);
+ return dirty;
+}
+EXPORT_SYMBOL_GPL(gmap_test_and_clear_dirty);
+
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
int pmdp_clear_flush_young(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp)
--
1.8.4.2
* [PULL 8/8] KVM: s390: Add proper dirty bitmap support to S390 kvm.
From: Christian Borntraeger @ 2014-04-09 10:44 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky, Jason J. Herne, Dominik Dingel,
Christian Borntraeger
From: "Jason J. Herne" <jjherne@us.ibm.com>
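Replace the kvm_vm_ioctl_get_dirty_log() stub and add
kvm_s390_sync_dirty_log() to construct the KVM dirty_bitmap from the s390
per-page dirty state. Also clear the full dirty_bitmap, sized by
kvm_dirty_bitmap_bytes(), after it has been reported, and allocate the bitmap
on s390 by removing the CONFIG_S390 guard in kvm_create_dirty_bitmap().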
Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
CC: Dominik Dingel <dingel@linux.vnet.ibm.com>
[Dominik Dingel: use gmap_test_and_clear_dirty, locking fixes]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
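As a rough userspace sketch of how a dirty bitmap can be built from per-page
dirty state: the loop visits exactly npages pages, test-and-clears each page's
flag, and sets the matching bit. The names `struct fake_slot`,
`fake_page_test_and_clear`, and `fake_sync_dirty_log` are hypothetical and only
model the idea; they are not the KVM API.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical memory slot covering pages [base_gfn, base_gfn + npages). */
struct fake_slot {
	unsigned long base_gfn;
	unsigned long npages;
	bool *page_dirty;            /* per-page dirty flags, npages entries */
	unsigned long *dirty_bitmap; /* one bit per page in the slot */
};

/* Read and reset the dirty flag of one page (index relative to the slot). */
static bool fake_page_test_and_clear(struct fake_slot *slot, unsigned long idx)
{
	bool dirty = slot->page_dirty[idx];

	slot->page_dirty[idx] = false;
	return dirty;
}

/* Fold the per-page flags into the bitmap; returns the number of dirty pages. */
static unsigned long fake_sync_dirty_log(struct fake_slot *slot)
{
	unsigned long idx, count = 0;

	for (idx = 0; idx < slot->npages; idx++) {
		if (!fake_page_test_and_clear(slot, idx))
			continue;
		slot->dirty_bitmap[idx / BITS_PER_WORD] |=
			1UL << (idx % BITS_PER_WORD);
		count++;
	}
	return count;
}
```

Because the per-page flags are cleared as they are read, a caller that also
zeroes the bitmap after reporting it sees only pages dirtied since the
previous pass.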
arch/s390/kvm/kvm-s390.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++-
virt/kvm/kvm_main.c | 2 --
2 files changed, 50 insertions(+), 3 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index b767ec9..346a347 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -11,6 +11,7 @@
* Christian Borntraeger <borntraeger@de.ibm.com>
* Heiko Carstens <heiko.carstens@de.ibm.com>
* Christian Ehrhardt <ehrhardt@de.ibm.com>
+ * Jason J. Herne <jjherne@us.ibm.com>
*/
#include <linux/compiler.h>
@@ -179,6 +180,25 @@ int kvm_dev_ioctl_check_extension(long ext)
return r;
}
+static void kvm_s390_sync_dirty_log(struct kvm *kvm,
+ struct kvm_memory_slot *memslot)
+{
+ gfn_t cur_gfn, last_gfn;
+ unsigned long address;
+ struct gmap *gmap = kvm->arch.gmap;
+
+ down_read(&gmap->mm->mmap_sem);
+ /* Loop over all guest pages */
+ last_gfn = memslot->base_gfn + memslot->npages;
+ for (cur_gfn = memslot->base_gfn; cur_gfn < last_gfn; cur_gfn++) {
+ address = gfn_to_hva_memslot(memslot, cur_gfn);
+
+ if (gmap_test_and_clear_dirty(address, gmap))
+ mark_page_dirty(kvm, cur_gfn);
+ }
+ up_read(&gmap->mm->mmap_sem);
+}
+
/* Section: vm related */
/*
* Get (and clear) the dirty memory log for a memory slot.
@@ -186,7 +206,36 @@ int kvm_dev_ioctl_check_extension(long ext)
int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
struct kvm_dirty_log *log)
{
- return 0;
+ int r;
+ unsigned long n;
+ struct kvm_memory_slot *memslot;
+ int is_dirty = 0;
+
+ mutex_lock(&kvm->slots_lock);
+
+ r = -EINVAL;
+ if (log->slot >= KVM_USER_MEM_SLOTS)
+ goto out;
+
+ memslot = id_to_memslot(kvm->memslots, log->slot);
+ r = -ENOENT;
+ if (!memslot->dirty_bitmap)
+ goto out;
+
+ kvm_s390_sync_dirty_log(kvm, memslot);
+ r = kvm_get_dirty_log(kvm, log, &is_dirty);
+ if (r)
+ goto out;
+
+ /* Clear the dirty log */
+ if (is_dirty) {
+ n = kvm_dirty_bitmap_bytes(memslot);
+ memset(memslot->dirty_bitmap, 0, n);
+ }
+ r = 0;
+out:
+ mutex_unlock(&kvm->slots_lock);
+ return r;
}
static int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 56baae8..7facdb1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -637,14 +637,12 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
*/
static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
{
-#ifndef CONFIG_S390
unsigned long dirty_bytes = 2 * kvm_dirty_bitmap_bytes(memslot);
memslot->dirty_bitmap = kvm_kvzalloc(dirty_bytes);
if (!memslot->dirty_bitmap)
return -ENOMEM;
-#endif /* !CONFIG_S390 */
return 0;
}
--
1.8.4.2
* Re: [PULL 0/8] KVM: s390: memory management and migration
From: Christian Borntraeger @ 2014-04-09 19:52 UTC (permalink / raw)
To: Paolo Bonzini, Marcelo Tosatti, Gleb Natapov
Cc: KVM, linux-s390, Cornelia Huck, Heiko Carstens,
Martin Schwidefsky
On 09/04/14 12:44, Christian Borntraeger wrote:
> Marcelo, Gleb, (Paolo,)
>
> The following changes since commit 7cbb39d4d4d530dff12f2ff06ed6c85c504ba91a:
>
> Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm (2014-04-02 14:50:10 -0700)
>
> are available in the git repository at:
>
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git tags/kvm-s390-20140409
>
> for you to fetch changes up to 384ee3e2a18893f9de84ce2f00cf786ad81fe08e:
>
> KVM: s390: Add proper dirty bitmap support to S390 kvm. (2014-04-09 11:12:15 +0200)
>
The buildbot rightly complains about a link error for allnoconfig.
PULLv2 will follow shortly, with the following diff:
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -1022,6 +1022,11 @@ static inline unsigned long *page_table_alloc_pgste(struct mm_struct *mm,
return NULL;
}
+void page_table_reset_pgste(struct mm_struct *mm, unsigned long start,
+ unsigned long end, bool init_skey)
+{
+}
+
static inline void page_table_free_pgste(unsigned long *table)
{
}
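The fix above appears to follow the usual pattern of supplying an empty stub
when a feature is compiled out, so that callers still link. A minimal sketch of
that pattern, with the hypothetical switch `FAKE_CONFIG_PGSTE` and hypothetical
function names that are not taken from the kernel:

```c
#include <stdbool.h>

/* FAKE_CONFIG_PGSTE is a hypothetical build switch used only in this sketch. */
#ifdef FAKE_CONFIG_PGSTE
static unsigned long fake_reset_calls;

/* Feature enabled: do the real work (modelled here as a counter). */
void fake_reset_pgste(unsigned long start, unsigned long end, bool init_skey)
{
	(void)start;
	(void)end;
	(void)init_skey;
	fake_reset_calls++;
}
#else
/* Feature disabled: an empty stub keeps every caller linkable. */
void fake_reset_pgste(unsigned long start, unsigned long end, bool init_skey)
{
	(void)start;
	(void)end;
	(void)init_skey;
}
#endif

/* A caller that is built in both configurations. */
int fake_caller(void)
{
	fake_reset_pgste(0, 4096, true);
	return 0;
}
```

Without the stub in the disabled branch, `fake_caller` would reference an
undefined symbol and the final link would fail.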