Re: [Qemu-devel] [qemu-s390x] [PATCH for-4.2 v4 1/2] kvm: s390: split too big memory section on several memslots

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Igor Mammedov <imammedo@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: thuth@redhat.com, cohuck@redhat.com, qemu-devel@nongnu.org,
	borntraeger@de.ibm.com, qemu-s390x@nongnu.org,
	pbonzini@redhat.com
Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH for-4.2 v4 1/2] kvm: s390: split too big memory section on several memslots
Date: Wed, 7 Aug 2019 11:55:59 +0200	[thread overview]
Message-ID: <20190807115559.660f3e6c@redhat.com> (raw)
In-Reply-To: <b90f1fc0-782c-b454-b999-48e88fac4cb9@redhat.com>

On Wed, 7 Aug 2019 09:54:27 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 06.08.19 11:48, Igor Mammedov wrote:
> > Max memslot size supported by kvm on s390 is 8Tb,
> > move logic of splitting RAM in chunks upto 8T to KVM code.
> > 
> > This way it will hide KVM specific restrictions in KVM code
> > and won't affect baord level design decisions. Which would allow
> > us to avoid misusing memory_region_allocate_system_memory() API
> > and eventually use a single hostmem backend for guest RAM.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> > v4:
> >   * fix compilation issue
> >           (Christian Borntraeger <borntraeger@de.ibm.com>)
> >   * advance HVA along with GPA in kvm_set_phys_mem()
> >           (Christian Borntraeger <borntraeger@de.ibm.com>)
> > 
> > patch prepares only KVM side for switching to single RAM memory region
> > another patch will take care of  dropping manual RAM partitioning in
> > s390 code.
> > ---
> >  include/sysemu/kvm_int.h   |  1 +
> >  accel/kvm/kvm-all.c        | 80 +++++++++++++++++++++++---------------
> >  hw/s390x/s390-virtio-ccw.c |  9 -----
> >  target/s390x/kvm.c         | 12 ++++++
> >  4 files changed, 62 insertions(+), 40 deletions(-)
> > 
> > diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
> > index 31df465fdc..7f7520bce2 100644
> > --- a/include/sysemu/kvm_int.h
> > +++ b/include/sysemu/kvm_int.h
> > @@ -41,4 +41,5 @@ typedef struct KVMMemoryListener {
> >  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
> >                                    AddressSpace *as, int as_id);
> >  
> > +void kvm_set_max_memslot_size(hwaddr max_slot_size);
> >  #endif
> > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > index f450f25295..d87f855ea4 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -138,6 +138,7 @@ bool kvm_direct_msi_allowed;
> >  bool kvm_ioeventfd_any_length_allowed;
> >  bool kvm_msi_use_devid;
> >  static bool kvm_immediate_exit;
> > +static hwaddr kvm_max_slot_size = ~0;
> >  
> >  static const KVMCapabilityInfo kvm_required_capabilites[] = {
> >      KVM_CAP_INFO(USER_MEMORY),
> > @@ -951,6 +952,14 @@ kvm_check_extension_list(KVMState *s, const KVMCapabilityInfo *list)
> >      return NULL;
> >  }
> >  
> > +void kvm_set_max_memslot_size(hwaddr max_slot_size)
> > +{
> > +    g_assert(
> > +        ROUND_UP(max_slot_size, qemu_real_host_page_size) == max_slot_size
> > +    );
> > +    kvm_max_slot_size = max_slot_size;
> > +}
> > +
> >  static void kvm_set_phys_mem(KVMMemoryListener *kml,
> >                               MemoryRegionSection *section, bool add)
> >  {
> > @@ -958,7 +967,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
> >      int err;
> >      MemoryRegion *mr = section->mr;
> >      bool writeable = !mr->readonly && !mr->rom_device;
> > -    hwaddr start_addr, size;
> > +    hwaddr start_addr, size, slot_size;
> >      void *ram;
> >  
> >      if (!memory_region_is_ram(mr)) {
> > @@ -983,41 +992,50 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
> >      kvm_slots_lock(kml);
> >  
> >      if (!add) {
> > -        mem = kvm_lookup_matching_slot(kml, start_addr, size);
> > -        if (!mem) {
> > -            goto out;
> > -        }
> > -        if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> > -            kvm_physical_sync_dirty_bitmap(kml, section);
> > -        }
> > +        do {
> > +            slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size;
> > +            mem = kvm_lookup_matching_slot(kml, start_addr, slot_size);
> > +            if (!mem) {
> > +                goto out;  
> 
> I wonder if this can trigger for the first, but not the second slot (or
> the other way around). In that case you would want to continue the loop
> (incrementing counters). But most probably there would something be
> wrong in the caller if that would happen.

I couldn't come up with scenario where it would be possible
(unless flatview rendering is broken)

> 
> > +            }
> > +            if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> > +                kvm_physical_sync_dirty_bitmap(kml, section);
> > +            }
> >  
> > -        /* unregister the slot */
> > -        g_free(mem->dirty_bmap);
> > -        mem->dirty_bmap = NULL;
> > -        mem->memory_size = 0;
> > -        mem->flags = 0;
> > -        err = kvm_set_user_memory_region(kml, mem, false);
> > -        if (err) {
> > -            fprintf(stderr, "%s: error unregistering slot: %s\n",
> > -                    __func__, strerror(-err));
> > -            abort();
> > -        }
> > +            /* unregister the slot */
> > +            g_free(mem->dirty_bmap);
> > +            mem->dirty_bmap = NULL;
> > +            mem->memory_size = 0;
> > +            mem->flags = 0;
> > +            err = kvm_set_user_memory_region(kml, mem, false);
> > +            if (err) {
> > +                fprintf(stderr, "%s: error unregistering slot: %s\n",
> > +                        __func__, strerror(-err));
> > +                abort();
> > +            }
> > +            start_addr += slot_size;
> > +        } while ((size -= slot_size));  
> 
> NIT: I think you can drop parentheses - but I would really prefer to not
> perform computations in the condition.
sure, I'll move computation within the loop

> >          goto out;
> >      }
> >  
> >      /* register the new slot */
> > -    mem = kvm_alloc_slot(kml);
> > -    mem->memory_size = size;
> > -    mem->start_addr = start_addr;
> > -    mem->ram = ram;
> > -    mem->flags = kvm_mem_flags(mr);
> > -
> > -    err = kvm_set_user_memory_region(kml, mem, true);
> > -    if (err) {
> > -        fprintf(stderr, "%s: error registering slot: %s\n", __func__,
> > -                strerror(-err));
> > -        abort();
> > -    }
> > +    do {
> > +        slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size;
> > +        mem = kvm_alloc_slot(kml);
> > +        mem->memory_size = slot_size;
> > +        mem->start_addr = start_addr;
> > +        mem->ram = ram;
> > +        mem->flags = kvm_mem_flags(mr);
> > +
> > +        err = kvm_set_user_memory_region(kml, mem, true);
> > +        if (err) {
> > +            fprintf(stderr, "%s: error registering slot: %s\n", __func__,
> > +                    strerror(-err));
> > +            abort();
> > +        }
> > +        start_addr += slot_size;
> > +        ram += slot_size;
> > +    } while ((size -= slot_size));  
> 
> dito
> 
> One note:
> 
> KVMState stores the number of slots in "nr_slots". We export that via
> kvm_get_max_memslots().
>
> E.g., spapr uses that to compare it against "machine->ram_slots".
this patch shouldn't affect spapr/arm or x86 machines as they do not have
limits on memslot size.

> Later (esp. for s390x), kvm_get_max_memslots() can no longer be compared to
> ram_slots directly. Could be that a ram slot would map to multiple KVM
> memory slots. There would be no easy way to detect if KVM is able to
> deal with "machine->ram_slots" as defined by the user, until the sizes
> of the slots are known.

If I recall correctly about kvm_foo_slots() APIs,
assumption 1 memory region == 1 memslot didn't/doesn't hold true
in all cases (ex: x86) and it's only best effort to let us compare
the number of apples to oranges on a tree and works for current
usecases.

From hotplug point of view kvm_has_free_slot() would be more important,
to allow for graceful abort. If s390 would ever need to hotplug
RAM MemoryRegion, anyway APIs should be changed to account for
1:N split when actual dependency arises.

next prev parent reply	other threads:[~2019-08-07  9:56 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-06  9:48 [Qemu-devel] [PATCH for-4.2 v4 0/2] s390: stop abusing memory_region_allocate_system_memory() Igor Mammedov
2019-08-06  9:48 ` [Qemu-devel] [PATCH for-4.2 v4 1/2] kvm: s390: split too big memory section on several memslots Igor Mammedov
2019-08-07  7:54   ` [Qemu-devel] [qemu-s390x] " David Hildenbrand
2019-08-07  9:55     ` Igor Mammedov [this message]
2019-08-07 10:20       ` David Hildenbrand
2019-08-07 15:32   ` [Qemu-devel] [PATCH for-4.2 v5 " Igor Mammedov
2019-08-20 16:07     ` Cornelia Huck
2019-08-27 12:56       ` Igor Mammedov
2019-08-29  6:47         ` Christian Borntraeger
2019-08-29 12:04           ` Igor Mammedov
2019-08-29 12:07             ` [Qemu-devel] [qemu-s390x] " Christian Borntraeger
2019-08-29 12:31               ` Igor Mammedov
2019-08-29 12:41                 ` Christian Borntraeger
2019-08-30  9:41                   ` Igor Mammedov
2019-08-30 16:19                     ` Christian Borntraeger
2019-09-02 13:49                       ` Igor Mammedov
2019-09-03  6:57                         ` Christian Borntraeger
2019-09-16 13:16                           ` Igor Mammedov
2019-08-06  9:48 ` [Qemu-devel] [PATCH for-4.2 v4 2/2] s390: do not call memory_region_allocate_system_memory() multiple times Igor Mammedov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190807115559.660f3e6c@redhat.com \
    --to=imammedo@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.