* [PATCH v2 0/2] KVM: s390: fix storage attributes migration @ 2018-01-18 12:56 Claudio Imbrenda 2018-01-18 12:56 ` [PATCH v2 1/2] KVM: s390x: some utility functions for migration Claudio Imbrenda 2018-01-18 12:56 ` [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots Claudio Imbrenda 0 siblings, 2 replies; 10+ messages in thread From: Claudio Imbrenda @ 2018-01-18 12:56 UTC (permalink / raw) To: kvm; +Cc: borntraeger, cohuck, pbonzini, david, schwidefsky, frankja This patchset fixes the migration of storage attributes. Each memory slot has a bitmap, with 2 bits per page, used to keep track of dirty pages during migration. We only use one bit, the other would be wasted. With this patch, the second bit is now used to keep track of dirty storage attributes. This means that we do not need anymore to allocate and manage the additional bitmap, and no extra work is needed to keep track of memory slots. The code does get a little bit more complicated when accounting for the memory slots, but overall it should be more robust. The first patch simply introduces two auxiliary functions, while the second patch does the actual job. v1 -> v2 * renamed _cmma_bitmap to cmma_bitmap and moved it to kvm-s390.h * renamed and/or removed some variables to improve readability * added some comments inline * simplified the code flow to improve readability Claudio Imbrenda (2): KVM: s390x: some utility functions for migration KVM: s390: Fix storage attributes migration with memory slots arch/s390/include/asm/kvm_host.h | 9 +- arch/s390/kvm/kvm-s390.c | 297 ++++++++++++++++++++++++--------------- arch/s390/kvm/kvm-s390.h | 6 + arch/s390/kvm/priv.c | 26 ++-- 4 files changed, 206 insertions(+), 132 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 10+ messages in thread
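The trick the cover letter describes (reusing the otherwise wasted second half of the per-memslot dirty bitmap for storage attributes) is easy to model outside the kernel. The sketch below is a standalone userspace approximation, not kernel code: struct toy_memslot, bitmap_bytes() and the example slot size are made-up stand-ins for struct kvm_memory_slot, kvm_dirty_bitmap_bytes() and real guest memory, but the rounding and the pointer arithmetic mirror the cmma_bitmap() helper introduced in patch 1.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* toy stand-in for struct kvm_memory_slot */
struct toy_memslot {
	unsigned long npages;
	unsigned long *dirty_bitmap;	/* one allocation, two bitmaps */
};

/* same rounding as kvm_dirty_bitmap_bytes(): full longs per bitmap */
static unsigned long bitmap_bytes(const struct toy_memslot *s)
{
	return ((s->npages + BITS_PER_LONG - 1) / BITS_PER_LONG) *
	       sizeof(unsigned long);
}

/* second half of the allocation, as cmma_bitmap() in patch 1 computes it */
static unsigned long *cmma_bitmap(const struct toy_memslot *s)
{
	return s->dirty_bitmap + bitmap_bytes(s) / sizeof(unsigned long);
}

int main(void)
{
	struct toy_memslot slot = { .npages = 1000 };

	slot.dirty_bitmap = calloc(2, bitmap_bytes(&slot));
	if (!slot.dirty_bitmap)
		return 1;

	printf("%lu bytes per bitmap, CMMA half starts %td longs in\n",
	       bitmap_bytes(&slot), cmma_bitmap(&slot) - slot.dirty_bitmap);

	free(slot.dirty_bitmap);
	return 0;
}

Both halves live in the single dirty_bitmap allocation that already exists per memslot, which is why the series needs no extra allocation and no extra bookkeeping when memslots are added or removed.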
* [PATCH v2 1/2] KVM: s390x: some utility functions for migration 2018-01-18 12:56 [PATCH v2 0/2] KVM: s390: fix storage attributes migration Claudio Imbrenda @ 2018-01-18 12:56 ` Claudio Imbrenda 2018-01-22 9:08 ` Christian Borntraeger 2018-01-18 12:56 ` [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots Claudio Imbrenda 1 sibling, 1 reply; 10+ messages in thread From: Claudio Imbrenda @ 2018-01-18 12:56 UTC (permalink / raw) To: kvm; +Cc: borntraeger, cohuck, pbonzini, david, schwidefsky, frankja These are some utilty functions that will be used later on for storage attributes migration. Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> --- arch/s390/kvm/kvm-s390.c | 32 ++++++++++++++++++++++++++++++++ arch/s390/kvm/kvm-s390.h | 6 ++++++ 2 files changed, 38 insertions(+) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 6f17031..5a69334 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -1512,6 +1512,38 @@ static long kvm_s390_set_skeys(struct kvm *kvm, struct kvm_s390_skeys *args) #define KVM_S390_CMMA_SIZE_MAX ((u32)KVM_S390_SKEYS_MAX) /* + * Similar to gfn_to_memslot, but returns a memslot also when the address falls + * in a hole. In that case a memslot near the hole is returned. + */ +static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) +{ + struct kvm_memslots *slots = kvm_memslots(kvm); + int start = 0, end = slots->used_slots; + int slot = atomic_read(&slots->lru_slot); + struct kvm_memory_slot *memslots = slots->memslots; + + if (gfn >= memslots[slot].base_gfn && + gfn < memslots[slot].base_gfn + memslots[slot].npages) + return slot; + + while (start < end) { + slot = start + (end - start) / 2; + + if (gfn >= memslots[slot].base_gfn) + end = slot; + else + start = slot + 1; + } + + if (gfn >= memslots[start].base_gfn && + gfn < memslots[start].base_gfn + memslots[start].npages) { + atomic_set(&slots->lru_slot, start); + } + + return start; +} + +/* * This function searches for the next page with dirty CMMA attributes, and * saves the attributes in the buffer up to either the end of the buffer or * until a block of at least KVM_S390_MAX_BIT_DISTANCE clean bits is found; diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 04a3e91..b86c1ad 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -183,6 +183,12 @@ static inline int kvm_s390_user_cpu_state_ctrl(struct kvm *kvm) return kvm->arch.user_cpu_state_ctrl != 0; } +static inline unsigned long *cmma_bitmap(struct kvm_memory_slot *slot) +{ + return slot->dirty_bitmap + + kvm_dirty_bitmap_bytes(slot) / sizeof(*slot->dirty_bitmap); +} + /* implemented in interrupt.c */ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu); -- 2.7.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/2] KVM: s390x: some utility functions for migration 2018-01-18 12:56 ` [PATCH v2 1/2] KVM: s390x: some utility functions for migration Claudio Imbrenda @ 2018-01-22 9:08 ` Christian Borntraeger 2018-01-22 9:12 ` Christian Borntraeger ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Christian Borntraeger @ 2018-01-22 9:08 UTC (permalink / raw) To: Claudio Imbrenda, kvm, Paolo Bonzini, Radim Krčmář Cc: cohuck, pbonzini, david, schwidefsky, frankja On 01/18/2018 01:56 PM, Claudio Imbrenda wrote: > These are some utilty functions that will be used later on for storage > attributes migration. > > Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> > --- > arch/s390/kvm/kvm-s390.c | 32 ++++++++++++++++++++++++++++++++ > arch/s390/kvm/kvm-s390.h | 6 ++++++ > 2 files changed, 38 insertions(+) > > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > index 6f17031..5a69334 100644 > --- a/arch/s390/kvm/kvm-s390.c > +++ b/arch/s390/kvm/kvm-s390.c > @@ -1512,6 +1512,38 @@ static long kvm_s390_set_skeys(struct kvm *kvm, struct kvm_s390_skeys *args) > #define KVM_S390_CMMA_SIZE_MAX ((u32)KVM_S390_SKEYS_MAX) > > /* > + * Similar to gfn_to_memslot, but returns a memslot also when the address falls gfn_to_memslot returns a memslot, this returns an int? > + * in a hole. In that case a memslot near the hole is returned. Can you please clarify that statement? Will it be the slot that is closest in terms of bytes or what? Maybe also provide a use case and an example > + */ > +static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) > +{ > + struct kvm_memslots *slots = kvm_memslots(kvm); > + int start = 0, end = slots->used_slots; > + int slot = atomic_read(&slots->lru_slot); > + struct kvm_memory_slot *memslots = slots->memslots; > + > + if (gfn >= memslots[slot].base_gfn && > + gfn < memslots[slot].base_gfn + memslots[slot].npages) > + return slot; > + > + while (start < end) { > + slot = start + (end - start) / 2; > + > + if (gfn >= memslots[slot].base_gfn) > + end = slot; > + else > + start = slot + 1; > + } > + > + if (gfn >= memslots[start].base_gfn && > + gfn < memslots[start].base_gfn + memslots[start].npages) { > + atomic_set(&slots->lru_slot, start); > + } Another question for Paolo/Radim. In case we need such function, would it be better in common code (kvm_main.c) > + > + return start; > +} > + > +/* > * This function searches for the next page with dirty CMMA attributes, and > * saves the attributes in the buffer up to either the end of the buffer or > * until a block of at least KVM_S390_MAX_BIT_DISTANCE clean bits is found; > diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h > index 04a3e91..b86c1ad 100644 > --- a/arch/s390/kvm/kvm-s390.h > +++ b/arch/s390/kvm/kvm-s390.h > @@ -183,6 +183,12 @@ static inline int kvm_s390_user_cpu_state_ctrl(struct kvm *kvm) > return kvm->arch.user_cpu_state_ctrl != 0; > } > > +static inline unsigned long *cmma_bitmap(struct kvm_memory_slot *slot) > +{ > + return slot->dirty_bitmap + > + kvm_dirty_bitmap_bytes(slot) / sizeof(*slot->dirty_bitmap); > +} > + Hmmm, in virt/kvm/kvm_main.c we do have in kvm_get_dirty_log_protect n = kvm_dirty_bitmap_bytes(memslot); dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long); Does it make sense to have that helper common (and call it maybe unsigned long *kvm_get_shadow_dirty(struct kvm_memory_slot *slot) ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/2] KVM: s390x: some utility functions for migration 2018-01-22 9:08 ` Christian Borntraeger @ 2018-01-22 9:12 ` Christian Borntraeger 2018-01-22 9:49 ` Claudio Imbrenda 2018-01-25 16:37 ` Radim Krčmář 2 siblings, 0 replies; 10+ messages in thread From: Christian Borntraeger @ 2018-01-22 9:12 UTC (permalink / raw) To: Claudio Imbrenda, kvm, Paolo Bonzini, Radim Krčmář Cc: cohuck, david, schwidefsky, frankja On 01/22/2018 10:08 AM, Christian Borntraeger wrote: >> +static inline unsigned long *cmma_bitmap(struct kvm_memory_slot *slot) >> +{ >> + return slot->dirty_bitmap + >> + kvm_dirty_bitmap_bytes(slot) / sizeof(*slot->dirty_bitmap); >> +} >> + > > Hmmm, in virt/kvm/kvm_main.c we do have in kvm_get_dirty_log_protect > > > n = kvm_dirty_bitmap_bytes(memslot); > > dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long); FWIW, I find this variant easier to read. > > > Does it make sense to have that helper common (and call it maybe > > unsigned long *kvm_get_shadow_dirty(struct kvm_memory_slot *slot) > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/2] KVM: s390x: some utility functions for migration 2018-01-22 9:08 ` Christian Borntraeger 2018-01-22 9:12 ` Christian Borntraeger @ 2018-01-22 9:49 ` Claudio Imbrenda 2018-01-25 16:37 ` Radim Krčmář 2 siblings, 0 replies; 10+ messages in thread From: Claudio Imbrenda @ 2018-01-22 9:49 UTC (permalink / raw) To: Christian Borntraeger Cc: kvm, Paolo Bonzini, Radim Krčmář, cohuck, david, schwidefsky, frankja On Mon, 22 Jan 2018 10:08:47 +0100 Christian Borntraeger <borntraeger@de.ibm.com> wrote: [...] > > /* > > + * Similar to gfn_to_memslot, but returns a memslot also when the > > address falls > > gfn_to_memslot returns a memslot, this returns an int? true, I have to update the comment > > + * in a hole. In that case a memslot near the hole is returned. > > Can you please clarify that statement? Will it be the slot that is > closest in terms of bytes or what? Maybe also provide a use case and > an example I think that just any of the two memslots (before and after the hole) can be returned, not necessarily the closest one. The logic that uses this function takes this into account. Maybe I can change the description to "the index of any of the memoslots bordering the hole is returned" > > + */ > > +static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) > > +{ > > + struct kvm_memslots *slots = kvm_memslots(kvm); > > + int start = 0, end = slots->used_slots; > > + int slot = atomic_read(&slots->lru_slot); > > + struct kvm_memory_slot *memslots = slots->memslots; > > + > > + if (gfn >= memslots[slot].base_gfn && > > + gfn < memslots[slot].base_gfn + memslots[slot].npages) > > + return slot; > > + > > + while (start < end) { > > + slot = start + (end - start) / 2; > > + > > + if (gfn >= memslots[slot].base_gfn) > > + end = slot; > > + else > > + start = slot + 1; > > + } > > + > > + if (gfn >= memslots[start].base_gfn && > > + gfn < memslots[start].base_gfn + > > memslots[start].npages) { > > + atomic_set(&slots->lru_slot, start); > > + } > > Another question for Paolo/Radim. In case we need such function, would > it be better in common code (kvm_main.c) > > + > > + return start; > > > > +} > > + > > +/* > > * This function searches for the next page with dirty CMMA > > attributes, and > > * saves the attributes in the buffer up to either the end of the > > buffer or > > * until a block of at least KVM_S390_MAX_BIT_DISTANCE clean bits > > is found; diff --git a/arch/s390/kvm/kvm-s390.h > > b/arch/s390/kvm/kvm-s390.h index 04a3e91..b86c1ad 100644 > > --- a/arch/s390/kvm/kvm-s390.h > > +++ b/arch/s390/kvm/kvm-s390.h > > @@ -183,6 +183,12 @@ static inline int > > kvm_s390_user_cpu_state_ctrl(struct kvm *kvm) return > > kvm->arch.user_cpu_state_ctrl != 0; } > > > > +static inline unsigned long *cmma_bitmap(struct kvm_memory_slot > > *slot) +{ > > + return slot->dirty_bitmap + > > + kvm_dirty_bitmap_bytes(slot) / > > sizeof(*slot->dirty_bitmap); +} > > + > > Hmmm, in virt/kvm/kvm_main.c we do have in kvm_get_dirty_log_protect > > > n = kvm_dirty_bitmap_bytes(memslot); > > dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long); > > > Does it make sense to have that helper common (and call it maybe > > unsigned long *kvm_get_shadow_dirty(struct kvm_memory_slot *slot) I hadn't noticed it; it surely makes sense to share the code. ^ permalink raw reply [flat|nested] 10+ messages in thread
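To make the "memslot bordering the hole" behaviour concrete, here is a standalone model of the binary search used by the proposed gfn_to_memslot_approx(). The two-slot layout is invented for illustration, and the model assumes (as the real search_memslots() does) that the memslots array is kept sorted by base_gfn in descending order.

#include <stdio.h>

struct toy_slot { unsigned long base_gfn, npages; };

/* sorted by base_gfn in descending order, as in struct kvm_memslots */
static struct toy_slot slots[] = {
	{ .base_gfn = 0x200, .npages = 0x100 },	/* covers 0x200..0x2ff */
	{ .base_gfn = 0x000, .npages = 0x100 },	/* covers 0x000..0x0ff */
};
static const int used_slots = 2;

/* same binary search as the proposed gfn_to_memslot_approx() */
static int approx(unsigned long gfn)
{
	int start = 0, end = used_slots, slot;

	while (start < end) {
		slot = start + (end - start) / 2;
		if (gfn >= slots[slot].base_gfn)
			end = slot;
		else
			start = slot + 1;
	}
	return start;
}

int main(void)
{
	/* 0x150 falls in the hole 0x100..0x1ff between the two slots */
	printf("gfn 0x150 -> slot index %d (base_gfn 0x%lx)\n",
	       approx(0x150), slots[approx(0x150)].base_gfn);
	/* 0x250 lies inside the first slot */
	printf("gfn 0x250 -> slot index %d (base_gfn 0x%lx)\n",
	       approx(0x250), slots[approx(0x250)].base_gfn);
	return 0;
}

With this layout a gfn in the hole at 0x100..0x1ff yields index 1, i.e. the slot just below the hole; the caller in patch 2 then notices the gfn is not actually covered and steps to the neighbouring slot.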
* Re: [PATCH v2 1/2] KVM: s390x: some utility functions for migration 2018-01-22 9:08 ` Christian Borntraeger 2018-01-22 9:12 ` Christian Borntraeger 2018-01-22 9:49 ` Claudio Imbrenda @ 2018-01-25 16:37 ` Radim Krčmář 2018-01-25 18:15 ` Claudio Imbrenda 2 siblings, 1 reply; 10+ messages in thread From: Radim Krčmář @ 2018-01-25 16:37 UTC (permalink / raw) To: Christian Borntraeger Cc: Claudio Imbrenda, kvm, Paolo Bonzini, cohuck, david, schwidefsky, frankja 2018-01-22 10:08+0100, Christian Borntraeger: > On 01/18/2018 01:56 PM, Claudio Imbrenda wrote: > > These are some utilty functions that will be used later on for storage > > attributes migration. > > > > Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> > > --- > > arch/s390/kvm/kvm-s390.c | 32 ++++++++++++++++++++++++++++++++ > > arch/s390/kvm/kvm-s390.h | 6 ++++++ > > 2 files changed, 38 insertions(+) > > > > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > > index 6f17031..5a69334 100644 > > --- a/arch/s390/kvm/kvm-s390.c > > +++ b/arch/s390/kvm/kvm-s390.c > > @@ -1512,6 +1512,38 @@ static long kvm_s390_set_skeys(struct kvm *kvm, struct kvm_s390_skeys *args) > > #define KVM_S390_CMMA_SIZE_MAX ((u32)KVM_S390_SKEYS_MAX) > > > > /* > > + * Similar to gfn_to_memslot, but returns a memslot also when the address falls > > gfn_to_memslot returns a memslot, this returns an int? > > > + * in a hole. In that case a memslot near the hole is returned. > > Can you please clarify that statement? Will it be the slot that is closest > in terms of bytes or what? Maybe also provide a use case and an example > > + */ > > +static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) > > +{ > > + struct kvm_memslots *slots = kvm_memslots(kvm); > > + int start = 0, end = slots->used_slots; > > + int slot = atomic_read(&slots->lru_slot); > > + struct kvm_memory_slot *memslots = slots->memslots; > > + > > + if (gfn >= memslots[slot].base_gfn && > > + gfn < memslots[slot].base_gfn + memslots[slot].npages) > > + return slot; > > + > > + while (start < end) { > > + slot = start + (end - start) / 2; > > + > > + if (gfn >= memslots[slot].base_gfn) > > + end = slot; > > + else > > + start = slot + 1; > > + } > > + > > + if (gfn >= memslots[start].base_gfn && > > + gfn < memslots[start].base_gfn + memslots[start].npages) { > > + atomic_set(&slots->lru_slot, start); > > + } > > Another question for Paolo/Radim. In case we need such function, would > it be better in common code (kvm_main.c) Please keep it in s390 and don't do it in the best case. The function doesn't look reusable. If a gfn isn't in a slot, then we shouldn't be using slots to work with it. I don't understand why CMMA need adresses in the gaps, so I can't provide a good idea here -- maybe we can add slots where needed? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/2] KVM: s390x: some utility functions for migration 2018-01-25 16:37 ` Radim Krčmář @ 2018-01-25 18:15 ` Claudio Imbrenda 0 siblings, 0 replies; 10+ messages in thread From: Claudio Imbrenda @ 2018-01-25 18:15 UTC (permalink / raw) To: Radim Krčmář Cc: Christian Borntraeger, kvm, Paolo Bonzini, cohuck, david, schwidefsky, frankja On Thu, 25 Jan 2018 17:37:22 +0100 Radim Krčmář <rkrcmar@redhat.com> wrote: > 2018-01-22 10:08+0100, Christian Borntraeger: > > On 01/18/2018 01:56 PM, Claudio Imbrenda wrote: > > > These are some utilty functions that will be used later on for > > > storage attributes migration. > > > > > > Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> > > > --- > > > arch/s390/kvm/kvm-s390.c | 32 ++++++++++++++++++++++++++++++++ > > > arch/s390/kvm/kvm-s390.h | 6 ++++++ > > > 2 files changed, 38 insertions(+) > > > > > > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > > > index 6f17031..5a69334 100644 > > > --- a/arch/s390/kvm/kvm-s390.c > > > +++ b/arch/s390/kvm/kvm-s390.c > > > @@ -1512,6 +1512,38 @@ static long kvm_s390_set_skeys(struct kvm > > > *kvm, struct kvm_s390_skeys *args) #define KVM_S390_CMMA_SIZE_MAX > > > ((u32)KVM_S390_SKEYS_MAX) > > > > > > /* > > > + * Similar to gfn_to_memslot, but returns a memslot also when > > > the address falls > > > > gfn_to_memslot returns a memslot, this returns an int? > > > > > + * in a hole. In that case a memslot near the hole is returned. > > > > Can you please clarify that statement? Will it be the slot that is > > closest in terms of bytes or what? Maybe also provide a use case > > and an example > > > + */ > > > +static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) > > > +{ > > > + struct kvm_memslots *slots = kvm_memslots(kvm); > > > + int start = 0, end = slots->used_slots; > > > + int slot = atomic_read(&slots->lru_slot); > > > + struct kvm_memory_slot *memslots = slots->memslots; > > > + > > > + if (gfn >= memslots[slot].base_gfn && > > > + gfn < memslots[slot].base_gfn + > > > memslots[slot].npages) > > > + return slot; > > > + > > > + while (start < end) { > > > + slot = start + (end - start) / 2; > > > + > > > + if (gfn >= memslots[slot].base_gfn) > > > + end = slot; > > > + else > > > + start = slot + 1; > > > + } > > > + > > > + if (gfn >= memslots[start].base_gfn && > > > + gfn < memslots[start].base_gfn + > > > memslots[start].npages) { > > > + atomic_set(&slots->lru_slot, start); > > > + } > > > > Another question for Paolo/Radim. In case we need such function, > > would it be better in common code (kvm_main.c) > > Please keep it in s390 and don't do it in the best case. > The function doesn't look reusable. If a gfn isn't in a slot, then we > shouldn't be using slots to work with it. > > I don't understand why CMMA need adresses in the gaps, so I can't > provide a good idea here -- maybe we can add slots where needed? We actually don't need addresses in the gap. What we want instead is the first address after the gap, and that is not possible if we simply return a NULL when we hit a hole, like gfn_to_memslot would do. That's because the current code returns (almost) only the dirty values, so if userspace starts at a "clean" address, the first clean values are skipped, and the start address updated accordingly. If the start address is in a hole, we want to skip the hole and start at the next valid address, but to do that we need a slot next to the hole. 
an alternative would be to limit the interface to work inside memslots, thus returning an error when the starting address is in a hole, like we are already doing in the "peek" case. This will require changes in userspace. Not sure we want to do that, although until the recent patch from Christian, multi-slot guests were broken anyway. ^ permalink raw reply [flat|nested] 10+ messages in thread
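The "skip the hole and start at the next valid address" step described above is the hole handling at the top of kvm_s390_next_dirty_cmma() in patch 2. A minimal standalone rendering of just that step, reusing the same invented two-slot layout as the earlier sketch:

#include <stdio.h>

struct toy_slot { unsigned long base_gfn, npages; };

/* descending base_gfn order, hole at 0x100..0x1ff */
static struct toy_slot slots[] = {
	{ 0x200, 0x100 },
	{ 0x000, 0x100 },
};

int main(void)
{
	unsigned long cur = 0x150;	/* starting gfn, inside the hole */
	int idx = 1;			/* what the approximate lookup returns */

	/*
	 * The hole case: the returned slot ends below cur, so step to the
	 * previous index (the next higher address range) and resume from
	 * its base_gfn, i.e. the first valid address after the gap.
	 */
	if (slots[idx].base_gfn + slots[idx].npages <= cur) {
		idx--;
		cur = slots[idx].base_gfn;
	}
	printf("resume scanning at gfn 0x%lx (slot %d)\n", cur, idx);
	return 0;
}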
* [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots 2018-01-18 12:56 [PATCH v2 0/2] KVM: s390: fix storage attributes migration Claudio Imbrenda 2018-01-18 12:56 ` [PATCH v2 1/2] KVM: s390x: some utility functions for migration Claudio Imbrenda @ 2018-01-18 12:56 ` Claudio Imbrenda 2018-01-19 15:41 ` Christian Borntraeger 2018-01-22 14:26 ` Christian Borntraeger 1 sibling, 2 replies; 10+ messages in thread From: Claudio Imbrenda @ 2018-01-18 12:56 UTC (permalink / raw) To: kvm; +Cc: borntraeger, cohuck, pbonzini, david, schwidefsky, frankja This is a fix for several issues that were found in the original code for storage attributes migration. Now no bitmap is allocated to keep track of dirty storage attributes; the extra bits of the per-memslot bitmap that are always present anyway are now used for this purpose. The code has also been refactored a little to improve readability. Fixes: 190df4a212a ("KVM: s390: CMMA tracking, ESSA emulation, migration mode") Fixes: 4036e3874a1 ("KVM: s390: ioctls to get and set guest storage attributes") Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> --- arch/s390/include/asm/kvm_host.h | 9 +- arch/s390/kvm/kvm-s390.c | 265 ++++++++++++++++++++++----------------- arch/s390/kvm/priv.c | 26 ++-- 3 files changed, 168 insertions(+), 132 deletions(-) diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index 819838c..2ca73b0 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -763,12 +763,6 @@ struct kvm_s390_vsie { struct page *pages[KVM_MAX_VCPUS]; }; -struct kvm_s390_migration_state { - unsigned long bitmap_size; /* in bits (number of guest pages) */ - atomic64_t dirty_pages; /* number of dirty pages */ - unsigned long *pgste_bitmap; -}; - struct kvm_arch{ void *sca; int use_esca; @@ -796,7 +790,8 @@ struct kvm_arch{ struct kvm_s390_vsie vsie; u8 epdx; u64 epoch; - struct kvm_s390_migration_state *migration_state; + atomic_t migration_mode; + atomic64_t cmma_dirty_pages; /* subset of available cpu features enabled by user space */ DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS); struct kvm_s390_gisa *gisa; diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 5a69334..85448a2 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -770,53 +770,37 @@ static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req) */ static int kvm_s390_vm_start_migration(struct kvm *kvm) { - struct kvm_s390_migration_state *mgs; - struct kvm_memory_slot *ms; - /* should be the only one */ struct kvm_memslots *slots; - unsigned long ram_pages; + struct kvm_memory_slot *sl; + unsigned long ram_pages = 0; int slotnr; /* migration mode already enabled */ - if (kvm->arch.migration_state) + if (atomic_read(&kvm->arch.migration_mode)) return 0; - slots = kvm_memslots(kvm); if (!slots || !slots->used_slots) return -EINVAL; - mgs = kzalloc(sizeof(*mgs), GFP_KERNEL); - if (!mgs) - return -ENOMEM; - kvm->arch.migration_state = mgs; - - if (kvm->arch.use_cmma) { + if (!kvm->arch.use_cmma) { + atomic_set(&kvm->arch.migration_mode, 1); + return 0; + } + /* mark all the pages in active slots as dirty */ + for (slotnr = 0; slotnr < slots->used_slots; slotnr++) { + sl = slots->memslots + slotnr; /* - * Get the last slot. They should be sorted by base_gfn, so the - * last slot is also the one at the end of the address space. - * We have verified above that at least one slot is present. 
+ * The second half of the bitmap is only used on x86, + * and would be wasted otherwise, so we put it to good + * use here to keep track of the state of the storage + * attributes. */ - ms = slots->memslots + slots->used_slots - 1; - /* round up so we only use full longs */ - ram_pages = roundup(ms->base_gfn + ms->npages, BITS_PER_LONG); - /* allocate enough bytes to store all the bits */ - mgs->pgste_bitmap = vmalloc(ram_pages / 8); - if (!mgs->pgste_bitmap) { - kfree(mgs); - kvm->arch.migration_state = NULL; - return -ENOMEM; - } - - mgs->bitmap_size = ram_pages; - atomic64_set(&mgs->dirty_pages, ram_pages); - /* mark all the pages in active slots as dirty */ - for (slotnr = 0; slotnr < slots->used_slots; slotnr++) { - ms = slots->memslots + slotnr; - bitmap_set(mgs->pgste_bitmap, ms->base_gfn, ms->npages); - } - - kvm_s390_sync_request_broadcast(kvm, KVM_REQ_START_MIGRATION); + memset(cmma_bitmap(sl), 0xff, kvm_dirty_bitmap_bytes(sl)); + ram_pages += sl->npages; } + atomic64_set(&kvm->arch.cmma_dirty_pages, ram_pages); + atomic_set(&kvm->arch.migration_mode, 1); + kvm_s390_sync_request_broadcast(kvm, KVM_REQ_START_MIGRATION); return 0; } @@ -826,19 +810,12 @@ static int kvm_s390_vm_start_migration(struct kvm *kvm) */ static int kvm_s390_vm_stop_migration(struct kvm *kvm) { - struct kvm_s390_migration_state *mgs; - /* migration mode already disabled */ - if (!kvm->arch.migration_state) + if (!atomic_read(&kvm->arch.migration_mode)) return 0; - mgs = kvm->arch.migration_state; - kvm->arch.migration_state = NULL; - - if (kvm->arch.use_cmma) { + atomic_set(&kvm->arch.migration_mode, 0); + if (kvm->arch.use_cmma) kvm_s390_sync_request_broadcast(kvm, KVM_REQ_STOP_MIGRATION); - vfree(mgs->pgste_bitmap); - } - kfree(mgs); return 0; } @@ -868,7 +845,7 @@ static int kvm_s390_vm_set_migration(struct kvm *kvm, static int kvm_s390_vm_get_migration(struct kvm *kvm, struct kvm_device_attr *attr) { - u64 mig = (kvm->arch.migration_state != NULL); + u64 mig = atomic_read(&kvm->arch.migration_mode); if (attr->attr != KVM_S390_VM_MIGRATION_STATUS) return -ENXIO; @@ -1543,6 +1520,108 @@ static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) return start; } +static int kvm_s390_peek_cmma(struct kvm *kvm, struct kvm_s390_cmma_log *args, + u8 *res, unsigned long bufsize) +{ + unsigned long pgstev, cur_gfn, hva; + + cur_gfn = args->start_gfn; + args->count = 0; + while (args->count < bufsize) { + hva = gfn_to_hva(kvm, cur_gfn); + /* + * We return an error if the first value was invalid, but we + * return successfully if at least one value was copied. + */ + if (kvm_is_error_hva(hva)) + return args->count ? 
0 : -EFAULT; + if (get_pgste(kvm->mm, hva, &pgstev) < 0) + pgstev = 0; + res[args->count++] = (pgstev >> 24) & 0x43; + cur_gfn++; + } + + return 0; +} + +static unsigned long kvm_s390_next_dirty_cmma(struct kvm *kvm, + unsigned long cur) +{ + struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memory_slot *sl; + unsigned long ofs; + int slotidx; + + slotidx = gfn_to_memslot_approx(kvm, cur); + sl = slots->memslots + slotidx; + + if (sl->base_gfn + sl->npages <= cur) { + slotidx--; + /* If we are above the highest slot, wrap around */ + if (slotidx < 0) + slotidx = slots->used_slots - 1; + + sl = slots->memslots + slotidx; + cur = sl->base_gfn; + } + ofs = find_next_bit(cmma_bitmap(sl), sl->npages, cur - sl->base_gfn); + while ((slotidx > 0) && (ofs >= sl->npages)) { + slotidx--; + sl = slots->memslots + slotidx; + ofs = find_next_bit(cmma_bitmap(sl), sl->npages, 0); + } + return sl->base_gfn + ofs; +} + +static int kvm_s390_get_cmma(struct kvm *kvm, struct kvm_s390_cmma_log *args, + u8 *res, unsigned long bufsize) +{ + unsigned long mem_end, cur_gfn, next_gfn, hva, pgstev; + struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memory_slot *sl; + + cur_gfn = kvm_s390_next_dirty_cmma(kvm, args->start_gfn); + sl = gfn_to_memslot(kvm, cur_gfn); + args->count = 0; + args->start_gfn = cur_gfn; + if (!sl) + return 0; + next_gfn = kvm_s390_next_dirty_cmma(kvm, cur_gfn + 1); + mem_end = slots->memslots[0].base_gfn + slots->memslots[0].npages; + + while (args->count < bufsize) { + hva = gfn_to_hva(kvm, cur_gfn); + if (kvm_is_error_hva(hva)) + break; + /* decrement only if we actually flipped the bit to 0 */ + if (test_and_clear_bit(cur_gfn - sl->base_gfn, cmma_bitmap(sl))) + atomic64_dec(&kvm->arch.cmma_dirty_pages); + if (get_pgste(kvm->mm, hva, &pgstev) < 0) + pgstev = 0; + /* save the value */ + res[args->count++] = (pgstev >> 24) & 0x43; + /* + * if the next bit is too far away, stop. 
+ * if we reached the previous "next", find the next one + */ + if (next_gfn > cur_gfn + KVM_S390_MAX_BIT_DISTANCE) + break; + if (cur_gfn == next_gfn) + next_gfn = kvm_s390_next_dirty_cmma(kvm, cur_gfn + 1); + /* reached the end of memory or of the buffer, stop */ + if ((next_gfn >= mem_end) || + (next_gfn - args->start_gfn >= bufsize)) + break; + cur_gfn++; + if (cur_gfn - sl->base_gfn >= sl->npages) { + sl = gfn_to_memslot(kvm, cur_gfn); + if (!sl) + break; + } + } + return 0; +} + /* * This function searches for the next page with dirty CMMA attributes, and * saves the attributes in the buffer up to either the end of the buffer or @@ -1554,97 +1633,53 @@ static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) static int kvm_s390_get_cmma_bits(struct kvm *kvm, struct kvm_s390_cmma_log *args) { - struct kvm_s390_migration_state *s = kvm->arch.migration_state; - unsigned long bufsize, hva, pgstev, i, next, cur; - int srcu_idx, peek, r = 0, rr; - u8 *res; + unsigned long bufsize; + int srcu_idx, peek, migr, ret; + u8 *values; - cur = args->start_gfn; - i = next = pgstev = 0; + migr = atomic_read(&kvm->arch.migration_mode); if (unlikely(!kvm->arch.use_cmma)) return -ENXIO; /* Invalid/unsupported flags were specified */ - if (args->flags & ~KVM_S390_CMMA_PEEK) + if (unlikely(args->flags & ~KVM_S390_CMMA_PEEK)) return -EINVAL; /* Migration mode query, and we are not doing a migration */ peek = !!(args->flags & KVM_S390_CMMA_PEEK); - if (!peek && !s) + if (unlikely(!peek && !migr)) return -EINVAL; /* CMMA is disabled or was not used, or the buffer has length zero */ bufsize = min(args->count, KVM_S390_CMMA_SIZE_MAX); - if (!bufsize || !kvm->mm->context.use_cmma) { + if (unlikely(!bufsize || !kvm->mm->context.use_cmma)) { memset(args, 0, sizeof(*args)); return 0; } - - if (!peek) { - /* We are not peeking, and there are no dirty pages */ - if (!atomic64_read(&s->dirty_pages)) { - memset(args, 0, sizeof(*args)); - return 0; - } - cur = find_next_bit(s->pgste_bitmap, s->bitmap_size, - args->start_gfn); - if (cur >= s->bitmap_size) /* nothing found, loop back */ - cur = find_next_bit(s->pgste_bitmap, s->bitmap_size, 0); - if (cur >= s->bitmap_size) { /* again! (very unlikely) */ - memset(args, 0, sizeof(*args)); - return 0; - } - next = find_next_bit(s->pgste_bitmap, s->bitmap_size, cur + 1); + /* We are not peeking, and there are no dirty pages */ + if (!peek && !atomic64_read(&kvm->arch.cmma_dirty_pages)) { + memset(args, 0, sizeof(*args)); + return 0; } - res = vmalloc(bufsize); - if (!res) + values = vmalloc(bufsize); + if (!values) return -ENOMEM; - args->start_gfn = cur; - down_read(&kvm->mm->mmap_sem); srcu_idx = srcu_read_lock(&kvm->srcu); - while (i < bufsize) { - hva = gfn_to_hva(kvm, cur); - if (kvm_is_error_hva(hva)) { - r = -EFAULT; - break; - } - /* decrement only if we actually flipped the bit to 0 */ - if (!peek && test_and_clear_bit(cur, s->pgste_bitmap)) - atomic64_dec(&s->dirty_pages); - r = get_pgste(kvm->mm, hva, &pgstev); - if (r < 0) - pgstev = 0; - /* save the value */ - res[i++] = (pgstev >> 24) & 0x43; - /* - * if the next bit is too far away, stop. 
- * if we reached the previous "next", find the next one - */ - if (!peek) { - if (next > cur + KVM_S390_MAX_BIT_DISTANCE) - break; - if (cur == next) - next = find_next_bit(s->pgste_bitmap, - s->bitmap_size, cur + 1); - /* reached the end of the bitmap or of the buffer, stop */ - if ((next >= s->bitmap_size) || - (next >= args->start_gfn + bufsize)) - break; - } - cur++; - } + if (peek) + ret = kvm_s390_peek_cmma(kvm, args, values, bufsize); + else + ret = kvm_s390_get_cmma(kvm, args, values, bufsize); srcu_read_unlock(&kvm->srcu, srcu_idx); up_read(&kvm->mm->mmap_sem); - args->count = i; - args->remaining = s ? atomic64_read(&s->dirty_pages) : 0; - rr = copy_to_user((void __user *)args->values, res, args->count); - if (rr) - r = -EFAULT; + args->remaining = migr ? atomic64_read(&kvm->arch.cmma_dirty_pages) : 0; - vfree(res); - return r; + if (copy_to_user((void __user *)args->values, values, args->count)) + ret = -EFAULT; + + vfree(values); + return ret; } /* @@ -2088,10 +2123,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kvm_s390_destroy_adapters(kvm); kvm_s390_clear_float_irqs(kvm); kvm_s390_vsie_destroy(kvm); - if (kvm->arch.migration_state) { - vfree(kvm->arch.migration_state->pgste_bitmap); - kfree(kvm->arch.migration_state); - } KVM_EVENT(3, "vm 0x%pK destroyed", kvm); } diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c index 572496c..7ae4893 100644 --- a/arch/s390/kvm/priv.c +++ b/arch/s390/kvm/priv.c @@ -954,9 +954,11 @@ static int handle_pfmf(struct kvm_vcpu *vcpu) return 0; } -static inline int do_essa(struct kvm_vcpu *vcpu, const int orc) +/* + * Must be called with relevant read locks held (kvm->mm->mmap_sem, kvm->srcu) + */ +static inline int __do_essa(struct kvm_vcpu *vcpu, const int orc) { - struct kvm_s390_migration_state *ms = vcpu->kvm->arch.migration_state; int r1, r2, nappended, entries; unsigned long gfn, hva, res, pgstev, ptev; unsigned long *cbrlo; @@ -1007,9 +1009,11 @@ static inline int do_essa(struct kvm_vcpu *vcpu, const int orc) } if (orc) { - /* increment only if we are really flipping the bit to 1 */ - if (!test_and_set_bit(gfn, ms->pgste_bitmap)) - atomic64_inc(&ms->dirty_pages); + struct kvm_memory_slot *s = gfn_to_memslot(vcpu->kvm, gfn); + + /* Increment only if we are really flipping the bit */ + if (s && !test_and_set_bit(gfn - s->base_gfn, cmma_bitmap(s))) + atomic64_inc(&vcpu->kvm->arch.cmma_dirty_pages); } return nappended; @@ -1038,7 +1042,7 @@ static int handle_essa(struct kvm_vcpu *vcpu) : ESSA_SET_STABLE_IF_RESIDENT)) return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); - if (likely(!vcpu->kvm->arch.migration_state)) { + if (likely(!atomic_read(&vcpu->kvm->arch.migration_mode))) { /* * CMMA is enabled in the KVM settings, but is disabled in * the SIE block and in the mm_context, and we are not doing @@ -1066,10 +1070,16 @@ static int handle_essa(struct kvm_vcpu *vcpu) /* Retry the ESSA instruction */ kvm_s390_retry_instr(vcpu); } else { - /* Account for the possible extra cbrl entry */ - i = do_essa(vcpu, orc); + int srcu_idx; + + down_read(&vcpu->kvm->mm->mmap_sem); + srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); + i = __do_essa(vcpu, orc); + srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx); + up_read(&vcpu->kvm->mm->mmap_sem); if (i < 0) return i; + /* Account for the possible extra cbrl entry */ entries += i; } vcpu->arch.sie_block->cbrlo &= PAGE_MASK; /* reset nceo */ -- 2.7.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
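For context, the interface all of this backs is the KVM_S390_GET_CMMA_BITS vm ioctl. Below is a rough sketch of how userspace might drive it; the loop structure, buffer size and the pull_cmma_values() name are illustrative only (QEMU's real migration code is more involved), and vm_fd is assumed to be an already open KVM VM file descriptor.

#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define CMMA_BUF_SIZE (1 << 20)

static int pull_cmma_values(int vm_fd)
{
	struct kvm_s390_cmma_log log = {};
	unsigned char *buf = malloc(CMMA_BUF_SIZE);

	if (!buf)
		return -1;

	log.start_gfn = 0;
	do {
		log.count = CMMA_BUF_SIZE;
		/* 0: consume dirty attributes; KVM_S390_CMMA_PEEK: read only */
		log.flags = 0;
		log.values = (unsigned long)buf;
		if (ioctl(vm_fd, KVM_S390_GET_CMMA_BITS, &log) < 0)
			break;
		/*
		 * buf[0..log.count-1] now holds one attribute byte per page,
		 * for consecutive pages starting at the (possibly updated)
		 * log.start_gfn; a real caller would forward them to the
		 * destination here.
		 */
		log.start_gfn += log.count;
	} while (log.remaining && log.count);

	free(buf);
	return 0;
}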
* Re: [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots 2018-01-18 12:56 ` [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots Claudio Imbrenda @ 2018-01-19 15:41 ` Christian Borntraeger 2018-01-22 14:26 ` Christian Borntraeger 1 sibling, 0 replies; 10+ messages in thread From: Christian Borntraeger @ 2018-01-19 15:41 UTC (permalink / raw) To: Claudio Imbrenda, kvm; +Cc: cohuck, pbonzini, david, schwidefsky, frankja On 01/18/2018 01:56 PM, Claudio Imbrenda wrote: > This is a fix for several issues that were found in the original code > for storage attributes migration. > > Now no bitmap is allocated to keep track of dirty storage attributes; > the extra bits of the per-memslot bitmap that are always present anyway > are now used for this purpose. > > The code has also been refactored a little to improve readability. > > Fixes: 190df4a212a ("KVM: s390: CMMA tracking, ESSA emulation, migration mode") > Fixes: 4036e3874a1 ("KVM: s390: ioctls to get and set guest storage attributes") > Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> [..] > + /* mark all the pages in active slots as dirty */ > + for (slotnr = 0; slotnr < slots->used_slots; slotnr++) { > + sl = slots->memslots + slotnr; > /* > - * Get the last slot. They should be sorted by base_gfn, so the > - * last slot is also the one at the end of the address space. > - * We have verified above that at least one slot is present. Can you please rebase on top of 4.15-rc8 (or master)? This code has already some fixes. [...] > +static unsigned long kvm_s390_next_dirty_cmma(struct kvm *kvm, > + unsigned long cur) > +{ > + struct kvm_memslots *slots = kvm_memslots(kvm); > + struct kvm_memory_slot *sl; > + unsigned long ofs; > + int slotidx; > + > + slotidx = gfn_to_memslot_approx(kvm, cur); > + sl = slots->memslots + slotidx; > + > + if (sl->base_gfn + sl->npages <= cur) { > + slotidx--; > + /* If we are above the highest slot, wrap around */ > + if (slotidx < 0) > + slotidx = slots->used_slots - 1; > + > + sl = slots->memslots + slotidx; > + cur = sl->base_gfn; > + } > + ofs = find_next_bit(cmma_bitmap(sl), sl->npages, cur - sl->base_gfn); > + while ((slotidx > 0) && (ofs >= sl->npages)) { > + slotidx--; > + sl = slots->memslots + slotidx; > + ofs = find_next_bit(cmma_bitmap(sl), sl->npages, 0); > + } > + return sl->base_gfn + ofs; > +} > + > +static int kvm_s390_get_cmma(struct kvm *kvm, struct kvm_s390_cmma_log *args, > + u8 *res, unsigned long bufsize) > +{ > + unsigned long mem_end, cur_gfn, next_gfn, hva, pgstev; > + struct kvm_memslots *slots = kvm_memslots(kvm); > + struct kvm_memory_slot *sl; > + > + cur_gfn = kvm_s390_next_dirty_cmma(kvm, args->start_gfn); > + sl = gfn_to_memslot(kvm, cur_gfn); > + args->count = 0; > + args->start_gfn = cur_gfn; > + if (!sl) > + return 0; > + next_gfn = kvm_s390_next_dirty_cmma(kvm, cur_gfn + 1); > + mem_end = slots->memslots[0].base_gfn + slots->memslots[0].npages; > + > + while (args->count < bufsize) { > + hva = gfn_to_hva(kvm, cur_gfn); > + if (kvm_is_error_hva(hva)) > + break; > + /* decrement only if we actually flipped the bit to 0 */ > + if (test_and_clear_bit(cur_gfn - sl->base_gfn, cmma_bitmap(sl))) > + atomic64_dec(&kvm->arch.cmma_dirty_pages); > + if (get_pgste(kvm->mm, hva, &pgstev) < 0) > + pgstev = 0; > + /* save the value */ > + res[args->count++] = (pgstev >> 24) & 0x43; > + /* > + * if the next bit is too far away, stop. 
> + * if we reached the previous "next", find the next one > + */ > + if (next_gfn > cur_gfn + KVM_S390_MAX_BIT_DISTANCE) > + break; > + if (cur_gfn == next_gfn) > + next_gfn = kvm_s390_next_dirty_cmma(kvm, cur_gfn + 1); > + /* reached the end of memory or of the buffer, stop */ > + if ((next_gfn >= mem_end) || > + (next_gfn - args->start_gfn >= bufsize)) > + break; > + cur_gfn++; > + if (cur_gfn - sl->base_gfn >= sl->npages) { > + sl = gfn_to_memslot(kvm, cur_gfn); > + if (!sl) > + break; > + } > + } I have to say that this code does not get easier to understand, I fear that I wont be able to review this is time for 4.15. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots 2018-01-18 12:56 ` [PATCH v2 2/2] KVM: s390: Fix storage attributes migration with memory slots Claudio Imbrenda 2018-01-19 15:41 ` Christian Borntraeger @ 2018-01-22 14:26 ` Christian Borntraeger 1 sibling, 0 replies; 10+ messages in thread From: Christian Borntraeger @ 2018-01-22 14:26 UTC (permalink / raw) To: Claudio Imbrenda, kvm; +Cc: cohuck, pbonzini, david, schwidefsky, frankja On 01/18/2018 01:56 PM, Claudio Imbrenda wrote: > This is a fix for several issues that were found in the original code > for storage attributes migration. > > Now no bitmap is allocated to keep track of dirty storage attributes; > the extra bits of the per-memslot bitmap that are always present anyway > are now used for this purpose. > > The code has also been refactored a little to improve readability. > > Fixes: 190df4a212a ("KVM: s390: CMMA tracking, ESSA emulation, migration mode") > Fixes: 4036e3874a1 ("KVM: s390: ioctls to get and set guest storage attributes") > Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> > --- > arch/s390/include/asm/kvm_host.h | 9 +- > arch/s390/kvm/kvm-s390.c | 265 ++++++++++++++++++++++----------------- > arch/s390/kvm/priv.c | 26 ++-- > 3 files changed, 168 insertions(+), 132 deletions(-) > > diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h > index 819838c..2ca73b0 100644 > --- a/arch/s390/include/asm/kvm_host.h > +++ b/arch/s390/include/asm/kvm_host.h > @@ -763,12 +763,6 @@ struct kvm_s390_vsie { > struct page *pages[KVM_MAX_VCPUS]; > }; > > -struct kvm_s390_migration_state { > - unsigned long bitmap_size; /* in bits (number of guest pages) */ > - atomic64_t dirty_pages; /* number of dirty pages */ > - unsigned long *pgste_bitmap; > -}; > - > struct kvm_arch{ > void *sca; > int use_esca; > @@ -796,7 +790,8 @@ struct kvm_arch{ > struct kvm_s390_vsie vsie; > u8 epdx; > u64 epoch; > - struct kvm_s390_migration_state *migration_state; > + atomic_t migration_mode; > + atomic64_t cmma_dirty_pages; > /* subset of available cpu features enabled by user space */ > DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS); > struct kvm_s390_gisa *gisa; > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > index 5a69334..85448a2 100644 > --- a/arch/s390/kvm/kvm-s390.c > +++ b/arch/s390/kvm/kvm-s390.c > @@ -770,53 +770,37 @@ static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req) > */ > static int kvm_s390_vm_start_migration(struct kvm *kvm) > { > - struct kvm_s390_migration_state *mgs; > - struct kvm_memory_slot *ms; > - /* should be the only one */ > struct kvm_memslots *slots; > - unsigned long ram_pages; > + struct kvm_memory_slot *sl; > + unsigned long ram_pages = 0; > int slotnr; > > /* migration mode already enabled */ > - if (kvm->arch.migration_state) > + if (atomic_read(&kvm->arch.migration_mode)) > return 0; Being atomic does not help here, because you just use read and write (and not an atomic op). We are protected by a mutex, though. So maybe just use a normal bool variable and add a comment. Please also have a look at KVM: s390: add proper locking for CMMA migration bitmap which changes the locking. Do we want to keep some of these changes (e.g. use the slots_lock)? Or do we want to schedule that fix for stable and use your patches only for 4.16 and later? [...] 
> + atomic64_set(&kvm->arch.cmma_dirty_pages, ram_pages); > + atomic_set(&kvm->arch.migration_mode, 1); > + kvm_s390_sync_request_broadcast(kvm, KVM_REQ_START_MIGRATION); > return 0; > } > > @@ -826,19 +810,12 @@ static int kvm_s390_vm_start_migration(struct kvm *kvm) > */ > static int kvm_s390_vm_stop_migration(struct kvm *kvm) > { > - struct kvm_s390_migration_state *mgs; > - > /* migration mode already disabled */ > - if (!kvm->arch.migration_state) > + if (!atomic_read(&kvm->arch.migration_mode)) > return 0; > - mgs = kvm->arch.migration_state; > - kvm->arch.migration_state = NULL; > - > - if (kvm->arch.use_cmma) { > + atomic_set(&kvm->arch.migration_mode, 0); > + if (kvm->arch.use_cmma) > kvm_s390_sync_request_broadcast(kvm, KVM_REQ_STOP_MIGRATION); > - vfree(mgs->pgste_bitmap); > - } > - kfree(mgs); > return 0; > } > > @@ -868,7 +845,7 @@ static int kvm_s390_vm_set_migration(struct kvm *kvm, > static int kvm_s390_vm_get_migration(struct kvm *kvm, > struct kvm_device_attr *attr) > { > - u64 mig = (kvm->arch.migration_state != NULL); > + u64 mig = atomic_read(&kvm->arch.migration_mode); > > if (attr->attr != KVM_S390_VM_MIGRATION_STATUS) > return -ENXIO; > @@ -1543,6 +1520,108 @@ static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) > return start; > } > > +static int kvm_s390_peek_cmma(struct kvm *kvm, struct kvm_s390_cmma_log *args, > + u8 *res, unsigned long bufsize) > +{ > + unsigned long pgstev, cur_gfn, hva; > + > + cur_gfn = args->start_gfn; > + args->count = 0; > + while (args->count < bufsize) { > + hva = gfn_to_hva(kvm, cur_gfn); > + /* > + * We return an error if the first value was invalid, but we > + * return successfully if at least one value was copied. > + */ > + if (kvm_is_error_hva(hva)) > + return args->count ? 0 : -EFAULT; > + if (get_pgste(kvm->mm, hva, &pgstev) < 0) > + pgstev = 0; > + res[args->count++] = (pgstev >> 24) & 0x43; > + cur_gfn++; > + } > + > + return 0; > +} > + > +static unsigned long kvm_s390_next_dirty_cmma(struct kvm *kvm, > + unsigned long cur) > +{ > + struct kvm_memslots *slots = kvm_memslots(kvm); > + struct kvm_memory_slot *sl; > + unsigned long ofs; > + int slotidx; > + > + slotidx = gfn_to_memslot_approx(kvm, cur); > + sl = slots->memslots + slotidx; > + > + if (sl->base_gfn + sl->npages <= cur) { > + slotidx--; > + /* If we are above the highest slot, wrap around */ > + if (slotidx < 0) > + slotidx = slots->used_slots - 1; > + > + sl = slots->memslots + slotidx; > + cur = sl->base_gfn; > + } > + ofs = find_next_bit(cmma_bitmap(sl), sl->npages, cur - sl->base_gfn); > + while ((slotidx > 0) && (ofs >= sl->npages)) { > + slotidx--; > + sl = slots->memslots + slotidx; > + ofs = find_next_bit(cmma_bitmap(sl), sl->npages, 0); > + } > + return sl->base_gfn + ofs; > +} > + > +static int kvm_s390_get_cmma(struct kvm *kvm, struct kvm_s390_cmma_log *args, > + u8 *res, unsigned long bufsize) > +{ > + unsigned long mem_end, cur_gfn, next_gfn, hva, pgstev; > + struct kvm_memslots *slots = kvm_memslots(kvm); > + struct kvm_memory_slot *sl; > + > + cur_gfn = kvm_s390_next_dirty_cmma(kvm, args->start_gfn); > + sl = gfn_to_memslot(kvm, cur_gfn); > + args->count = 0; > + args->start_gfn = cur_gfn; > + if (!sl) > + return 0; > + next_gfn = kvm_s390_next_dirty_cmma(kvm, cur_gfn + 1); > + mem_end = slots->memslots[0].base_gfn + slots->memslots[0].npages; > + > + while (args->count < bufsize) { > + hva = gfn_to_hva(kvm, cur_gfn); > + if (kvm_is_error_hva(hva)) > + break; > + /* decrement only if we actually flipped the bit to 0 */ > + if 
(test_and_clear_bit(cur_gfn - sl->base_gfn, cmma_bitmap(sl))) > + atomic64_dec(&kvm->arch.cmma_dirty_pages); > + if (get_pgste(kvm->mm, hva, &pgstev) < 0) > + pgstev = 0; > + /* save the value */ > + res[args->count++] = (pgstev >> 24) & 0x43; > + /* > + * if the next bit is too far away, stop. > + * if we reached the previous "next", find the next one > + */ > + if (next_gfn > cur_gfn + KVM_S390_MAX_BIT_DISTANCE) > + break; > + if (cur_gfn == next_gfn) > + next_gfn = kvm_s390_next_dirty_cmma(kvm, cur_gfn + 1); > + /* reached the end of memory or of the buffer, stop */ > + if ((next_gfn >= mem_end) || > + (next_gfn - args->start_gfn >= bufsize)) > + break; > + cur_gfn++; > + if (cur_gfn - sl->base_gfn >= sl->npages) { > + sl = gfn_to_memslot(kvm, cur_gfn); > + if (!sl) > + break; > + } > + } > + return 0; > +} > + > /* > * This function searches for the next page with dirty CMMA attributes, and > * saves the attributes in the buffer up to either the end of the buffer or > @@ -1554,97 +1633,53 @@ static int gfn_to_memslot_approx(struct kvm *kvm, gfn_t gfn) > static int kvm_s390_get_cmma_bits(struct kvm *kvm, > struct kvm_s390_cmma_log *args) > { > - struct kvm_s390_migration_state *s = kvm->arch.migration_state; > - unsigned long bufsize, hva, pgstev, i, next, cur; > - int srcu_idx, peek, r = 0, rr; > - u8 *res; > + unsigned long bufsize; > + int srcu_idx, peek, migr, ret; > + u8 *values; > > - cur = args->start_gfn; > - i = next = pgstev = 0; > + migr = atomic_read(&kvm->arch.migration_mode); > > if (unlikely(!kvm->arch.use_cmma)) > return -ENXIO; > /* Invalid/unsupported flags were specified */ > - if (args->flags & ~KVM_S390_CMMA_PEEK) > + if (unlikely(args->flags & ~KVM_S390_CMMA_PEEK)) > return -EINVAL; > /* Migration mode query, and we are not doing a migration */ > peek = !!(args->flags & KVM_S390_CMMA_PEEK); > - if (!peek && !s) > + if (unlikely(!peek && !migr)) > return -EINVAL; > /* CMMA is disabled or was not used, or the buffer has length zero */ > bufsize = min(args->count, KVM_S390_CMMA_SIZE_MAX); > - if (!bufsize || !kvm->mm->context.use_cmma) { > + if (unlikely(!bufsize || !kvm->mm->context.use_cmma)) { > memset(args, 0, sizeof(*args)); > return 0; > } > - > - if (!peek) { > - /* We are not peeking, and there are no dirty pages */ > - if (!atomic64_read(&s->dirty_pages)) { > - memset(args, 0, sizeof(*args)); > - return 0; > - } > - cur = find_next_bit(s->pgste_bitmap, s->bitmap_size, > - args->start_gfn); > - if (cur >= s->bitmap_size) /* nothing found, loop back */ > - cur = find_next_bit(s->pgste_bitmap, s->bitmap_size, 0); > - if (cur >= s->bitmap_size) { /* again! 
(very unlikely) */ > - memset(args, 0, sizeof(*args)); > - return 0; > - } > - next = find_next_bit(s->pgste_bitmap, s->bitmap_size, cur + 1); > + /* We are not peeking, and there are no dirty pages */ > + if (!peek && !atomic64_read(&kvm->arch.cmma_dirty_pages)) { > + memset(args, 0, sizeof(*args)); > + return 0; > } > > - res = vmalloc(bufsize); > - if (!res) > + values = vmalloc(bufsize); > + if (!values) > return -ENOMEM; > > - args->start_gfn = cur; > - > down_read(&kvm->mm->mmap_sem); > srcu_idx = srcu_read_lock(&kvm->srcu); > - while (i < bufsize) { > - hva = gfn_to_hva(kvm, cur); > - if (kvm_is_error_hva(hva)) { > - r = -EFAULT; > - break; > - } > - /* decrement only if we actually flipped the bit to 0 */ > - if (!peek && test_and_clear_bit(cur, s->pgste_bitmap)) > - atomic64_dec(&s->dirty_pages); > - r = get_pgste(kvm->mm, hva, &pgstev); > - if (r < 0) > - pgstev = 0; > - /* save the value */ > - res[i++] = (pgstev >> 24) & 0x43; > - /* > - * if the next bit is too far away, stop. > - * if we reached the previous "next", find the next one > - */ > - if (!peek) { > - if (next > cur + KVM_S390_MAX_BIT_DISTANCE) > - break; > - if (cur == next) > - next = find_next_bit(s->pgste_bitmap, > - s->bitmap_size, cur + 1); > - /* reached the end of the bitmap or of the buffer, stop */ > - if ((next >= s->bitmap_size) || > - (next >= args->start_gfn + bufsize)) > - break; > - } > - cur++; > - } > + if (peek) > + ret = kvm_s390_peek_cmma(kvm, args, values, bufsize); > + else > + ret = kvm_s390_get_cmma(kvm, args, values, bufsize); > srcu_read_unlock(&kvm->srcu, srcu_idx); > up_read(&kvm->mm->mmap_sem); > - args->count = i; > - args->remaining = s ? atomic64_read(&s->dirty_pages) : 0; > > - rr = copy_to_user((void __user *)args->values, res, args->count); > - if (rr) > - r = -EFAULT; > + args->remaining = migr ? 
atomic64_read(&kvm->arch.cmma_dirty_pages) : 0; > > - vfree(res); > - return r; > + if (copy_to_user((void __user *)args->values, values, args->count)) > + ret = -EFAULT; > + > + vfree(values); > + return ret; > } > > /* > @@ -2088,10 +2123,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm) > kvm_s390_destroy_adapters(kvm); > kvm_s390_clear_float_irqs(kvm); > kvm_s390_vsie_destroy(kvm); > - if (kvm->arch.migration_state) { > - vfree(kvm->arch.migration_state->pgste_bitmap); > - kfree(kvm->arch.migration_state); > - } > KVM_EVENT(3, "vm 0x%pK destroyed", kvm); > } > > diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c > index 572496c..7ae4893 100644 > --- a/arch/s390/kvm/priv.c > +++ b/arch/s390/kvm/priv.c > @@ -954,9 +954,11 @@ static int handle_pfmf(struct kvm_vcpu *vcpu) > return 0; > } > > -static inline int do_essa(struct kvm_vcpu *vcpu, const int orc) > +/* > + * Must be called with relevant read locks held (kvm->mm->mmap_sem, kvm->srcu) > + */ > +static inline int __do_essa(struct kvm_vcpu *vcpu, const int orc) > { > - struct kvm_s390_migration_state *ms = vcpu->kvm->arch.migration_state; > int r1, r2, nappended, entries; > unsigned long gfn, hva, res, pgstev, ptev; > unsigned long *cbrlo; > @@ -1007,9 +1009,11 @@ static inline int do_essa(struct kvm_vcpu *vcpu, const int orc) > } > > if (orc) { > - /* increment only if we are really flipping the bit to 1 */ > - if (!test_and_set_bit(gfn, ms->pgste_bitmap)) > - atomic64_inc(&ms->dirty_pages); > + struct kvm_memory_slot *s = gfn_to_memslot(vcpu->kvm, gfn); > + > + /* Increment only if we are really flipping the bit */ > + if (s && !test_and_set_bit(gfn - s->base_gfn, cmma_bitmap(s))) > + atomic64_inc(&vcpu->kvm->arch.cmma_dirty_pages); > } > > return nappended; > @@ -1038,7 +1042,7 @@ static int handle_essa(struct kvm_vcpu *vcpu) > : ESSA_SET_STABLE_IF_RESIDENT)) > return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); > > - if (likely(!vcpu->kvm->arch.migration_state)) { > + if (likely(!atomic_read(&vcpu->kvm->arch.migration_mode))) { > /* > * CMMA is enabled in the KVM settings, but is disabled in > * the SIE block and in the mm_context, and we are not doing > @@ -1066,10 +1070,16 @@ static int handle_essa(struct kvm_vcpu *vcpu) > /* Retry the ESSA instruction */ > kvm_s390_retry_instr(vcpu); > } else { > - /* Account for the possible extra cbrl entry */ > - i = do_essa(vcpu, orc); > + int srcu_idx; > + > + down_read(&vcpu->kvm->mm->mmap_sem); > + srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); > + i = __do_essa(vcpu, orc); > + srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx); > + up_read(&vcpu->kvm->mm->mmap_sem); > if (i < 0) > return i; > + /* Account for the possible extra cbrl entry */ > entries += i; > } > vcpu->arch.sie_block->cbrlo &= PAGE_MASK; /* reset nceo */ > ^ permalink raw reply [flat|nested] 10+ messages in thread
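Christian's point about the atomic flag could be addressed roughly as in the fragment below. This is only a sketch of the idea, not the actual follow-up patch; in particular, relying on kvm->slots_lock to serialize the start/stop paths is an assumption taken from the separate CMMA locking fix he mentions.

struct kvm_arch {
	/* ... other fields unchanged ... */

	/*
	 * Only flipped in kvm_s390_vm_{start,stop}_migration(); if those
	 * paths are serialized by an existing lock (e.g. kvm->slots_lock),
	 * a plain flag with a comment is enough and an atomic_t buys
	 * nothing, since it is only ever read and written, never used in
	 * a read-modify-write operation.
	 */
	bool migration_mode;
	atomic64_t cmma_dirty_pages;

	/* ... */
};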