* [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on
@ 2015-06-03 12:22 Igor Mammedov
  2015-06-03 12:22 ` [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov
  ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Igor Mammedov @ 2015-06-03 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mst

When more than ~50 pc-dimm devices are hotplugged with vhost enabled, QEMU
will assert in vhost_commit() because the backend refuses to accept that
many memory ranges.

This series introduces a reserved-HVA MemoryRegion container into which all
hotplugged memory is remapped, and passes the single container range to
vhost instead of one memory range per hotplugged pc-dimm device.

It is an alternative to raising the backend's supported memory-region
limit; the backend-side approach I have come up with so far requires quite
a bit more code.

With this approach it would be possible to extend the scheme to initial
memory later and provide a single range for all RAM to vhost, which should
speed up its hot path by replacing the current GPA<->HVA lookup loop with
an offset calculation.

Igor Mammedov (2):
  memory: introduce MemoryRegion container with reserved HVA range
  pc: fix QEMU crashing when more than ~50 memory hotplugged

 exec.c                    | 13 +++++++++++
 hw/i386/pc.c              |  4 ++--
 hw/virtio/vhost.c         | 15 ++++++++++---
 include/exec/cpu-common.h |  1 +
 include/exec/memory.h     | 42 ++++++++++++++++++++++++++++++++++--
 memory.c                  | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 123 insertions(+), 7 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 12+ messages in thread
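[Editorial sketch] The hot-path win described in the cover letter can be illustrated outside QEMU: once every hotplugged block lives inside one reserved HVA window, GPA-to-HVA translation for that window becomes a bounds check plus an offset, instead of a walk over per-region descriptors. The struct and function names below are illustrative only, not QEMU's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch, not QEMU code: one descriptor covers the whole
 * hotplug window, so translation is a bounds check plus an offset. */
typedef struct {
    uint64_t  gpa_base;  /* guest-physical start of the hotplug window */
    uintptr_t hva_base;  /* host-virtual start of the reserved range   */
    uint64_t  size;
} HvaRange;

static void *gpa_to_hva(const HvaRange *r, uint64_t gpa)
{
    if (gpa < r->gpa_base || gpa - r->gpa_base >= r->size) {
        return NULL;  /* outside the hotplug window */
    }
    return (void *)(r->hva_base + (gpa - r->gpa_base));
}
```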
* [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range
  2015-06-03 12:22 [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
@ 2015-06-03 12:22 ` Igor Mammedov
  2015-06-03 12:45   ` Paolo Bonzini
  2015-06-03 12:22 ` [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged Igor Mammedov
  2015-06-03 15:05 ` [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Michael S. Tsirkin
  2 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2015-06-03 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mst

This patch adds the memory_region_init_rsvd_hva() and
memory_region_find_rsvd_hva() APIs for allocating and looking up a
reserved-HVA MemoryRegion.

A MemoryRegion with a reserved HVA range provides a linear 1:1 HVA->GPA
mapping for the RAM MemoryRegions that are added as subregions inside it.

It will be used for memory hotplug and vhost integration, reducing all
hotplugged MemoryRegions down to one memory-range descriptor, which makes
it possible to overcome vhost's limit on the number of allowed memory
ranges.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 exec.c                    | 13 +++++++++++
 include/exec/cpu-common.h |  1 +
 include/exec/memory.h     | 42 ++++++++++++++++++++++++++++++++++--
 memory.c                  | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index e19ab22..e01cd8f 100644
--- a/exec.c
+++ b/exec.c
@@ -1325,6 +1325,19 @@ static int memory_try_enable_merging(void *addr, size_t len)
     return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
+void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva)
+{
+    RAMBlock *block = find_ram_block(addr);
+
+    assert(block);
+    block->host = mremap(block->host, block->used_length,
+                         block->used_length,
+                         MREMAP_MAYMOVE | MREMAP_FIXED, new_hva);
+    memory_try_enable_merging(block->host, block->used_length);
+    qemu_ram_setup_dump(block->host, block->used_length);
+}
+
+
 /* Only legal before guest might have detected the memory size: e.g. on
  * incoming migration, or right after reset.
  *
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 43428bd..a9f73d6 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -60,6 +60,7 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value);
 typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);
 
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
+void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva);
 /* This should not be used by devices.
  */
 MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
 void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index b61c84f..9b0858e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -174,6 +174,7 @@ struct MemoryRegion {
     bool terminates;
     bool romd_mode;
     bool ram;
+    bool rsvd_hva;
     bool skip_dump;
     bool readonly; /* For RAM regions */
     bool enabled;
@@ -281,6 +282,23 @@ void memory_region_init(MemoryRegion *mr,
                         struct Object *owner,
                         const char *name,
                         uint64_t size);
+/**
+ * memory_region_init_rsvd_hva: Initialize a reserved HVA memory region
+ *
+ * A container for RAM memory regions. When a subregion is added with
+ * memory_region_add_subregion(), the subregion's backing host memory
+ * will be remapped into the HVA range reserved by this region.
+ *
+ * @mr: the #MemoryRegion to be initialized
+ * @owner: the object that tracks the region's reference count
+ * @name: used for debugging; not visible to the user or ABI
+ * @size: size of the region; any subregions beyond this size will be clipped
+ */
+void memory_region_init_rsvd_hva(MemoryRegion *mr,
+                                 struct Object *owner,
+                                 const char *name,
+                                 uint64_t size);
 
 /**
  * memory_region_ref: Add 1 to a memory region's reference count
@@ -620,8 +638,8 @@ int memory_region_get_fd(MemoryRegion *mr);
  * memory_region_get_ram_ptr: Get a pointer into a RAM memory region.
  *
  * Returns a host pointer to a RAM memory region (created with
- * memory_region_init_ram() or memory_region_init_ram_ptr()). Use with
- * care.
+ * memory_region_init_ram(), memory_region_init_ram_ptr() or
+ * memory_region_init_rsvd_hva()). Use with care.
  *
  * @mr: the memory region being queried.
  */
@@ -1014,6 +1032,26 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size);
 
 /**
+ * memory_region_find_rsvd_hva: finds the parent MemoryRegion with a
+ * reserved HVA range and translates it into a #MemoryRegionSection.
+ *
+ * Locates the first parent #MemoryRegion of @mr that is
+ * of reserved HVA type.
+ *
+ * Returns a #MemoryRegionSection that describes the reserved HVA
+ * memory region:
+ * .@mr is the found memory region;
+ * .@offset_within_address_space is the offset of the found memory
+ * region relative to the address space that contains it;
+ * .@offset_within_region is the offset of @mr relative to the
+ * found memory region;
+ * .@size is the size of the found memory region.
+ *
+ * @mr: a MemoryRegion whose HVA parent is looked up
+ */
+MemoryRegionSection memory_region_find_rsvd_hva(MemoryRegion *mr);
+
+/**
  * address_space_sync_dirty_bitmap: synchronize the dirty log for all memory
  *
  * Synchronizes the dirty page log for an entire address space.
diff --git a/memory.c b/memory.c
index 03c536b..70564dc 100644
--- a/memory.c
+++ b/memory.c
@@ -25,6 +25,7 @@
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
 #include "sysemu/sysemu.h"
+#include <sys/mman.h>
 
 //#define DEBUG_UNASSIGNED
 
@@ -939,6 +940,17 @@ void memory_region_init(MemoryRegion *mr,
     }
 }
 
+void memory_region_init_rsvd_hva(MemoryRegion *mr,
+                                 Object *owner,
+                                 const char *name,
+                                 uint64_t size)
+{
+    memory_region_init(mr, owner, name, size);
+    mr->rsvd_hva = true;
+    mr->ram_addr = (ram_addr_t)mmap(0, memory_region_size(mr),
+        PROT_NONE, MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+}
+
 static void memory_region_get_addr(Object *obj, Visitor *v, void *opaque,
                                    const char *name, Error **errp)
 {
@@ -1514,6 +1526,10 @@ int memory_region_get_fd(MemoryRegion *mr)
 
 void *memory_region_get_ram_ptr(MemoryRegion *mr)
 {
+    if (mr->rsvd_hva) {
+        return (void *)mr->ram_addr;
+    }
+
     if (mr->alias) {
         return memory_region_get_ram_ptr(mr->alias) + mr->alias_offset;
     }
@@ -1742,6 +1758,17 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
     assert(!subregion->container);
     subregion->container = mr;
     subregion->addr = offset;
+
+    if (mr->ram) {
+        MemoryRegionSection rsvd_hva = memory_region_find_rsvd_hva(mr);
+
+        if (rsvd_hva.mr) {
+            qemu_ram_remap_hva(mr->ram_addr,
+                               memory_region_get_ram_ptr(rsvd_hva.mr) +
+                               rsvd_hva.offset_within_region);
+        }
+    }
+
     memory_region_update_container_subregions(subregion);
 }
 
@@ -1884,6 +1911,34 @@ bool memory_region_is_mapped(MemoryRegion *mr)
     return mr->container ? true : false;
 }
 
+MemoryRegionSection memory_region_find_rsvd_hva(MemoryRegion *mr)
+{
+    MemoryRegionSection ret = { .mr = NULL };
+    MemoryRegion *hva_container = NULL;
+    hwaddr addr = 0;
+    MemoryRegion *root;
+
+    /* walk up the container chain, accumulating the offset of @mr */
+    for (root = mr; ; root = root->container) {
+        if (!hva_container && root->rsvd_hva) {
+            hva_container = root;
+            ret.offset_within_region = addr;
+        }
+        if (!root->container) {
+            break;
+        }
+        addr += root->addr;
+    }
+
+    ret.address_space = memory_region_to_address_space(root);
+    if (!ret.address_space || !hva_container) {
+        return ret;
+    }
+
+    ret.mr = hva_container;
+    ret.offset_within_address_space = addr;
+    ret.size = int128_make64(memory_region_size(ret.mr));
+    return ret;
+}
+
 MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size)
 {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread
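[Editorial sketch] The mechanism this patch builds on — a PROT_NONE, MAP_NORESERVE reservation, then mremap() of an existing block to a fixed offset inside it — can be exercised standalone on Linux. The sketch below mirrors what memory_region_init_rsvd_hva() and qemu_ram_remap_hva() do; sizes and names are chosen arbitrarily for the demo and are not QEMU's.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Linux-only demo of the reservation + remap trick used by the patch.
 * Returns 0 on success, -1 on any failure. */
static int rsvd_hva_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);

    /* 1. Reserve an inaccessible container range; no memory committed. */
    void *rsvd = mmap(NULL, 16 * page, PROT_NONE,
                      MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (rsvd == MAP_FAILED) {
        return -1;
    }

    /* 2. A "RAM block" allocated independently, holding some data. */
    char *block = mmap(NULL, page, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (block == MAP_FAILED) {
        return -1;
    }
    strcpy(block, "guest data");

    /* 3. Move it to a fixed offset inside the reservation, as
     *    qemu_ram_remap_hva() does when a subregion is added.
     *    MREMAP_FIXED implicitly unmaps the overlapped target pages. */
    char *target = (char *)rsvd + 4 * page;
    char *moved = mremap(block, page, page,
                         MREMAP_MAYMOVE | MREMAP_FIXED, target);
    if (moved != target || strcmp(moved, "guest data") != 0) {
        return -1;  /* contents must survive the move */
    }
    return 0;
}
```

This also shows why Paolo notes below that the approach is necessarily Linux-only: both MAP_NORESERVE semantics and mremap() are Linux-specific.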
* Re: [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range 2015-06-03 12:22 ` [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov @ 2015-06-03 12:45 ` Paolo Bonzini 2015-06-03 13:43 ` Igor Mammedov 0 siblings, 1 reply; 12+ messages in thread From: Paolo Bonzini @ 2015-06-03 12:45 UTC (permalink / raw) To: Igor Mammedov, qemu-devel; +Cc: mst On 03/06/2015 14:22, Igor Mammedov wrote: > Patch adds memory_region_init_rsvd_hva() and > memory_region_find_rsvd_hva() API to allocate and lookup > reserved HVA MemoryRegion. > > MemoryRegion with reserved HVA range will be used for > providing linear 1:1 HVA->GVA mapping for RAM MemoryRegion-s > that is added as subregions inside it. > > It will be used for memory hotplug and vhost integration > reducing all hotplugged MemoryRegions down to one > memory range descriptor, which allows to overcome > vhost's limitation on number of allowed memory ranges. > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> Very nice patches. Really. About the naming, I would use memory_region_init_hva_range and memory_region_find_hva_range. Also, it has to be Linux-only (due to MAP_NORESERVE and mremap). Finally, instead of overloading ram_addr, just add "void *rsvd_hva" and make it NULL for other kinds of regions. This matches what we do for e.g. iommu_ops. One question: > void *memory_region_get_ram_ptr(MemoryRegion *mr) > { > + if (mr->rsvd_hva) { > + return (void *)mr->ram_addr; > + } > + I see how this is "nice" and I don't object to it, but is it also necessary? 
Paolo > --- > exec.c | 13 +++++++++++ > include/exec/cpu-common.h | 1 + > include/exec/memory.h | 42 ++++++++++++++++++++++++++++++++++-- > memory.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 109 insertions(+), 2 deletions(-) > > diff --git a/exec.c b/exec.c > index e19ab22..e01cd8f 100644 > --- a/exec.c > +++ b/exec.c > @@ -1325,6 +1325,19 @@ static int memory_try_enable_merging(void *addr, size_t len) > return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE); > } > > +void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva) > +{ > + RAMBlock *block = find_ram_block(addr); > + > + assert(block); > + block->host = mremap(block->host, block->used_length, > + block->used_length, > + MREMAP_MAYMOVE | MREMAP_FIXED, new_hva); > + memory_try_enable_merging(block->host, block->used_length); > + qemu_ram_setup_dump(block->host, block->used_length); > +} > + > + > /* Only legal before guest might have detected the memory size: e.g. on > * incoming migration, or right after reset. > * > diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h > index 43428bd..a9f73d6 100644 > --- a/include/exec/cpu-common.h > +++ b/include/exec/cpu-common.h > @@ -60,6 +60,7 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value); > typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr); > > void qemu_ram_remap(ram_addr_t addr, ram_addr_t length); > +void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva); > /* This should not be used by devices. 
*/ > MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr); > void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev); > diff --git a/include/exec/memory.h b/include/exec/memory.h > index b61c84f..9b0858e 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -174,6 +174,7 @@ struct MemoryRegion { > bool terminates; > bool romd_mode; > bool ram; > + bool rsvd_hva; > bool skip_dump; > bool readonly; /* For RAM regions */ > bool enabled; > @@ -281,6 +282,23 @@ void memory_region_init(MemoryRegion *mr, > struct Object *owner, > const char *name, > uint64_t size); > +/** > + * memory_region_init_rsvd_hva: Initialize a reserved HVA memory region > + * > + * The container for RAM memory regions. > + * When adding subregion with memory_region_add_subregion(), subregion's > + * backing host memory will be remapped to inside the reserved by this > + * region HVA. > + * > + * @mr: the #MemoryRegion to be initialized > + * @owner: the object that tracks the region's reference count > + * @name: used for debugging; not visible to the user or ABI > + * @size: size of the region; any subregions beyond this size will be clipped > + */ > +void memory_region_init_rsvd_hva(MemoryRegion *mr, > + struct Object *owner, > + const char *name, > + uint64_t size); > > /** > * memory_region_ref: Add 1 to a memory region's reference count > @@ -620,8 +638,8 @@ int memory_region_get_fd(MemoryRegion *mr); > * memory_region_get_ram_ptr: Get a pointer into a RAM memory region. > * > * Returns a host pointer to a RAM memory region (created with > - * memory_region_init_ram() or memory_region_init_ram_ptr()). Use with > - * care. > + * memory_region_init_ram() or memory_region_init_ram_ptr()) or > + * memory_region_init_rsvd_hva(). Use with care. > * > * @mr: the memory region being queried. 
> */ > @@ -1014,6 +1032,26 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr, > hwaddr addr, uint64_t size); > > /** > + * memory_region_find_rsvd_hva: finds a parent MemoryRegion with > + * reserved HVA and translates it into a #MemoryRegionSection. > + * > + * Locates the first parent #MemoryRegion of @mr that is > + * of reserved HVA type. > + * > + * Returns a #MemoryRegionSection that describes a reserved HVA > + * memory region. > + * .@offset_within_address_space is offset of found > + * (in the .@mr field) memory region relative to the address > + * space that contains it. > + * .@offset_within_region is offset of @mr relative > + * to the returned region (in the .@mr field). > + * .@size is size of found memory region > + * > + * @mr: a MemoryRegion whose HVA pernt is looked up > + */ > +MemoryRegionSection memory_region_find_rsvd_hva(MemoryRegion *mr); > + > +/** > * address_space_sync_dirty_bitmap: synchronize the dirty log for all memory > * > * Synchronizes the dirty page log for an entire address space. 
> diff --git a/memory.c b/memory.c > index 03c536b..70564dc 100644 > --- a/memory.c > +++ b/memory.c > @@ -25,6 +25,7 @@ > #include "exec/memory-internal.h" > #include "exec/ram_addr.h" > #include "sysemu/sysemu.h" > +#include <sys/mman.h> > > //#define DEBUG_UNASSIGNED > > @@ -939,6 +940,17 @@ void memory_region_init(MemoryRegion *mr, > } > } > > +void memory_region_init_rsvd_hva(MemoryRegion *mr, > + Object *owner, > + const char *name, > + uint64_t size) > +{ > + memory_region_init(mr, owner, name, size); > + mr->rsvd_hva = true; > + mr->ram_addr = (ram_addr_t)mmap(0, memory_region_size(mr), > + PROT_NONE, MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); > +} > + > static void memory_region_get_addr(Object *obj, Visitor *v, void *opaque, > const char *name, Error **errp) > { > @@ -1514,6 +1526,10 @@ int memory_region_get_fd(MemoryRegion *mr) > > void *memory_region_get_ram_ptr(MemoryRegion *mr) > { > + if (mr->rsvd_hva) { > + return (void *)mr->ram_addr; > + } > + > if (mr->alias) { > return memory_region_get_ram_ptr(mr->alias) + mr->alias_offset; > } > @@ -1742,6 +1758,17 @@ static void memory_region_add_subregion_common(MemoryRegion *mr, > assert(!subregion->container); > subregion->container = mr; > subregion->addr = offset; > + > + if (mr->ram) { > + MemoryRegionSection rsvd_hva = memory_region_find_rsvd_hva(mr); > + > + if (rsvd_hva.mr) { > + qemu_ram_remap_hva(mr->ram_addr, > + memory_region_get_ram_ptr(rsvd_hva.mr) + > + rsvd_hva.offset_within_region); > + } > + } > + > memory_region_update_container_subregions(subregion); > } > > @@ -1884,6 +1911,34 @@ bool memory_region_is_mapped(MemoryRegion *mr) > return mr->container ? 
true : false; > } > > +MemoryRegionSection memory_region_find_rsvd_hva(MemoryRegion *mr) > +{ > + MemoryRegionSection ret = { .mr = NULL }; > + MemoryRegion *hva_container = NULL; > + hwaddr addr = mr->addr; > + MemoryRegion *root; > + > + addr += mr->addr; > + for (root = mr; root->container; ) { > + if (!hva_container && root->rsvd_hva) { > + hva_container = root; > + ret.offset_within_region = addr; > + } > + root = root->container; > + addr += root->addr; > + } > + > + ret.address_space = memory_region_to_address_space(root); > + if (!ret.address_space || !hva_container) { > + return ret; > + } > + > + ret.mr = hva_container; > + ret.offset_within_address_space = addr; > + ret.size = int128_make64(memory_region_size(ret.mr)); > + return ret; > +} > + > MemoryRegionSection memory_region_find(MemoryRegion *mr, > hwaddr addr, uint64_t size) > { > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range 2015-06-03 12:45 ` Paolo Bonzini @ 2015-06-03 13:43 ` Igor Mammedov 0 siblings, 0 replies; 12+ messages in thread From: Igor Mammedov @ 2015-06-03 13:43 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, mst On Wed, 03 Jun 2015 14:45:32 +0200 Paolo Bonzini <pbonzini@redhat.com> wrote: > > > On 03/06/2015 14:22, Igor Mammedov wrote: > > Patch adds memory_region_init_rsvd_hva() and > > memory_region_find_rsvd_hva() API to allocate and lookup > > reserved HVA MemoryRegion. > > > > MemoryRegion with reserved HVA range will be used for > > providing linear 1:1 HVA->GVA mapping for RAM MemoryRegion-s > > that is added as subregions inside it. > > > > It will be used for memory hotplug and vhost integration > > reducing all hotplugged MemoryRegions down to one > > memory range descriptor, which allows to overcome > > vhost's limitation on number of allowed memory ranges. > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > Very nice patches. Really. > > About the naming, I would use memory_region_init_hva_range and > memory_region_find_hva_range. > > Also, it has to be Linux-only (due to MAP_NORESERVE and mremap). > > Finally, instead of overloading ram_addr, just add "void *rsvd_hva" and > make it NULL for other kinds of regions. This matches what we do for > e.g. iommu_ops. sure , I'll fix it up. > > One question: > > > void *memory_region_get_ram_ptr(MemoryRegion *mr) > > { > > + if (mr->rsvd_hva) { > > + return (void *)mr->ram_addr; > > + } > > + > > I see how this is "nice" and I don't object to it, but is it also necessary? I've tried to reuse memory_region_get_ram_ptr(), what is other alternative? 
something like memory_region_get_hva_range_ptr() > > Paolo > > > --- > > exec.c | 13 +++++++++++ > > include/exec/cpu-common.h | 1 + > > include/exec/memory.h | 42 ++++++++++++++++++++++++++++++++++-- > > memory.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ > > 4 files changed, 109 insertions(+), 2 deletions(-) > > > > diff --git a/exec.c b/exec.c > > index e19ab22..e01cd8f 100644 > > --- a/exec.c > > +++ b/exec.c > > @@ -1325,6 +1325,19 @@ static int memory_try_enable_merging(void *addr, size_t len) > > return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE); > > } > > > > +void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva) > > +{ > > + RAMBlock *block = find_ram_block(addr); > > + > > + assert(block); > > + block->host = mremap(block->host, block->used_length, > > + block->used_length, > > + MREMAP_MAYMOVE | MREMAP_FIXED, new_hva); > > + memory_try_enable_merging(block->host, block->used_length); > > + qemu_ram_setup_dump(block->host, block->used_length); > > +} > > + > > + > > /* Only legal before guest might have detected the memory size: e.g. on > > * incoming migration, or right after reset. > > * > > diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h > > index 43428bd..a9f73d6 100644 > > --- a/include/exec/cpu-common.h > > +++ b/include/exec/cpu-common.h > > @@ -60,6 +60,7 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value); > > typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr); > > > > void qemu_ram_remap(ram_addr_t addr, ram_addr_t length); > > +void qemu_ram_remap_hva(ram_addr_t addr, void *new_hva); > > /* This should not be used by devices. 
*/ > > MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr); > > void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev); > > diff --git a/include/exec/memory.h b/include/exec/memory.h > > index b61c84f..9b0858e 100644 > > --- a/include/exec/memory.h > > +++ b/include/exec/memory.h > > @@ -174,6 +174,7 @@ struct MemoryRegion { > > bool terminates; > > bool romd_mode; > > bool ram; > > + bool rsvd_hva; > > bool skip_dump; > > bool readonly; /* For RAM regions */ > > bool enabled; > > @@ -281,6 +282,23 @@ void memory_region_init(MemoryRegion *mr, > > struct Object *owner, > > const char *name, > > uint64_t size); > > +/** > > + * memory_region_init_rsvd_hva: Initialize a reserved HVA memory region > > + * > > + * The container for RAM memory regions. > > + * When adding subregion with memory_region_add_subregion(), subregion's > > + * backing host memory will be remapped to inside the reserved by this > > + * region HVA. > > + * > > + * @mr: the #MemoryRegion to be initialized > > + * @owner: the object that tracks the region's reference count > > + * @name: used for debugging; not visible to the user or ABI > > + * @size: size of the region; any subregions beyond this size will be clipped > > + */ > > +void memory_region_init_rsvd_hva(MemoryRegion *mr, > > + struct Object *owner, > > + const char *name, > > + uint64_t size); > > > > /** > > * memory_region_ref: Add 1 to a memory region's reference count > > @@ -620,8 +638,8 @@ int memory_region_get_fd(MemoryRegion *mr); > > * memory_region_get_ram_ptr: Get a pointer into a RAM memory region. > > * > > * Returns a host pointer to a RAM memory region (created with > > - * memory_region_init_ram() or memory_region_init_ram_ptr()). Use with > > - * care. > > + * memory_region_init_ram() or memory_region_init_ram_ptr()) or > > + * memory_region_init_rsvd_hva(). Use with care. > > * > > * @mr: the memory region being queried. 
> > */ > > @@ -1014,6 +1032,26 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr, > > hwaddr addr, uint64_t size); > > > > /** > > + * memory_region_find_rsvd_hva: finds a parent MemoryRegion with > > + * reserved HVA and translates it into a #MemoryRegionSection. > > + * > > + * Locates the first parent #MemoryRegion of @mr that is > > + * of reserved HVA type. > > + * > > + * Returns a #MemoryRegionSection that describes a reserved HVA > > + * memory region. > > + * .@offset_within_address_space is offset of found > > + * (in the .@mr field) memory region relative to the address > > + * space that contains it. > > + * .@offset_within_region is offset of @mr relative > > + * to the returned region (in the .@mr field). > > + * .@size is size of found memory region > > + * > > + * @mr: a MemoryRegion whose HVA pernt is looked up > > + */ > > +MemoryRegionSection memory_region_find_rsvd_hva(MemoryRegion *mr); > > + > > +/** > > * address_space_sync_dirty_bitmap: synchronize the dirty log for all memory > > * > > * Synchronizes the dirty page log for an entire address space. 
> > diff --git a/memory.c b/memory.c > > index 03c536b..70564dc 100644 > > --- a/memory.c > > +++ b/memory.c > > @@ -25,6 +25,7 @@ > > #include "exec/memory-internal.h" > > #include "exec/ram_addr.h" > > #include "sysemu/sysemu.h" > > +#include <sys/mman.h> > > > > //#define DEBUG_UNASSIGNED > > > > @@ -939,6 +940,17 @@ void memory_region_init(MemoryRegion *mr, > > } > > } > > > > +void memory_region_init_rsvd_hva(MemoryRegion *mr, > > + Object *owner, > > + const char *name, > > + uint64_t size) > > +{ > > + memory_region_init(mr, owner, name, size); > > + mr->rsvd_hva = true; > > + mr->ram_addr = (ram_addr_t)mmap(0, memory_region_size(mr), > > + PROT_NONE, MAP_NORESERVE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); > > +} > > + > > static void memory_region_get_addr(Object *obj, Visitor *v, void *opaque, > > const char *name, Error **errp) > > { > > @@ -1514,6 +1526,10 @@ int memory_region_get_fd(MemoryRegion *mr) > > > > void *memory_region_get_ram_ptr(MemoryRegion *mr) > > { > > + if (mr->rsvd_hva) { > > + return (void *)mr->ram_addr; > > + } > > + > > if (mr->alias) { > > return memory_region_get_ram_ptr(mr->alias) + mr->alias_offset; > > } > > @@ -1742,6 +1758,17 @@ static void memory_region_add_subregion_common(MemoryRegion *mr, > > assert(!subregion->container); > > subregion->container = mr; > > subregion->addr = offset; > > + > > + if (mr->ram) { > > + MemoryRegionSection rsvd_hva = memory_region_find_rsvd_hva(mr); > > + > > + if (rsvd_hva.mr) { > > + qemu_ram_remap_hva(mr->ram_addr, > > + memory_region_get_ram_ptr(rsvd_hva.mr) + > > + rsvd_hva.offset_within_region); > > + } > > + } > > + > > memory_region_update_container_subregions(subregion); > > } > > > > @@ -1884,6 +1911,34 @@ bool memory_region_is_mapped(MemoryRegion *mr) > > return mr->container ? 
true : false; > > } > > > > +MemoryRegionSection memory_region_find_rsvd_hva(MemoryRegion *mr) > > +{ > > + MemoryRegionSection ret = { .mr = NULL }; > > + MemoryRegion *hva_container = NULL; > > + hwaddr addr = mr->addr; > > + MemoryRegion *root; > > + > > + addr += mr->addr; > > + for (root = mr; root->container; ) { > > + if (!hva_container && root->rsvd_hva) { > > + hva_container = root; > > + ret.offset_within_region = addr; > > + } > > + root = root->container; > > + addr += root->addr; > > + } > > + > > + ret.address_space = memory_region_to_address_space(root); > > + if (!ret.address_space || !hva_container) { > > + return ret; > > + } > > + > > + ret.mr = hva_container; > > + ret.offset_within_address_space = addr; > > + ret.size = int128_make64(memory_region_size(ret.mr)); > > + return ret; > > +} > > + > > MemoryRegionSection memory_region_find(MemoryRegion *mr, > > hwaddr addr, uint64_t size) > > { > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 12:22 [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
  2015-06-03 12:22 ` [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov
@ 2015-06-03 12:22 ` Igor Mammedov
  2015-06-03 12:48   ` Paolo Bonzini
  2015-06-03 15:05 ` [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Michael S. Tsirkin
  2 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2015-06-03 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mst

QEMU asserts in vhost after hitting the vhost backend's limit on the
number of supported memory regions.

Instead of increasing the limit in the backends, describe all hotplugged
memory to vhost as one contiguous range, using the container that
implements a linear 1:1 HVA->GPA mapping. This avoids both increasing the
VHOST_MEMORY_MAX_NREGIONS limit in the kernel and refactoring the current
region lookup algorithm into a faster/more scalable data structure. The
same applies to vhost-user, which has an even lower limit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c      |  4 ++--
 hw/virtio/vhost.c | 15 ++++++++++++---
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 1eb1db0..c722339 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1306,8 +1306,8 @@ FWCfgState *pc_memory_init(MachineState *machine,
         exit(EXIT_FAILURE);
     }
 
-    memory_region_init(&pcms->hotplug_memory, OBJECT(pcms),
-                       "hotplug-memory", hotplug_mem_size);
+    memory_region_init_rsvd_hva(&pcms->hotplug_memory, OBJECT(pcms),
+                                "hotplug-memory", hotplug_mem_size);
     memory_region_add_subregion(system_memory, pcms->hotplug_memory_base,
                                 &pcms->hotplug_memory);
 }
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54851b7..44cfaab 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -380,6 +380,7 @@ static void vhost_set_memory(MemoryListener *listener,
     bool log_dirty = memory_region_is_logging(section->mr);
     int s = offsetof(struct vhost_memory, regions) +
         (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
+    MemoryRegionSection rsvd_hva;
     void *ram;
 
     dev->mem = g_realloc(dev->mem, s);
@@ -388,17 +389,25 @@ static void vhost_set_memory(MemoryListener *listener,
         add = false;
     }
 
+    rsvd_hva = memory_region_find_rsvd_hva(section->mr);
+    if (rsvd_hva.mr) {
+        start_addr = rsvd_hva.offset_within_address_space;
+        size = int128_get64(rsvd_hva.size);
+        ram = memory_region_get_ram_ptr(rsvd_hva.mr);
+    } else {
+        ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
+    }
+
     assert(size);
 
     /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
-    ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
     if (add) {
-        if (!vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
+        if (!rsvd_hva.mr && !vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
             /* Region exists with same address. Nothing to do. */
             return;
         }
     } else {
-        if (!vhost_dev_find_reg(dev, start_addr, size)) {
+        if (!rsvd_hva.mr && !vhost_dev_find_reg(dev, start_addr, size)) {
             /* Removing region that we don't access. Nothing to do. */
             return;
         }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread
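[Editorial sketch] The effect on vhost's region table can be sketched with hypothetical types (this is not the real struct vhost_memory handling): once the container's single descriptor covers the whole hotplug window, every further DIMM falls inside an already-registered range and adds nothing, so the backend's region limit is never approached.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch: ranges already covered by a registered region
 * (e.g. the reserved-HVA container) do not grow the table. */
typedef struct { uint64_t gpa, size; uintptr_t hva; } Region;
typedef struct { Region regs[64]; unsigned n; } RegionTable;

static void table_add(RegionTable *t, uint64_t gpa, uint64_t size, uintptr_t hva)
{
    for (unsigned i = 0; i < t->n; i++) {
        const Region *r = &t->regs[i];
        if (gpa >= r->gpa && gpa + size <= r->gpa + r->size) {
            return;  /* covered by an existing descriptor: nothing to do */
        }
    }
    t->regs[t->n++] = (Region){ gpa, size, hva };
}
```

Registering the hotplug window first and then any number of DIMMs inside it leaves the table at a single entry, which is the behavior the `!rsvd_hva.mr &&` guards above preserve.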
* Re: [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 12:22 ` [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged Igor Mammedov
@ 2015-06-03 12:48   ` Paolo Bonzini
  2015-06-03 14:05     ` Igor Mammedov
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2015-06-03 12:48 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: mst

On 03/06/2015 14:22, Igor Mammedov wrote:
> QEMU asserts in vhost due to hitting the vhost backend limit
> on the number of supported memory regions.
> 
> Instead of increasing the limit in backends, describe all hotplugged
> memory as one contiguous range to vhost, which implements a linear
> 1:1 HVA->GPA mapping in the backend.
> This avoids increasing the VHOST_MEMORY_MAX_NREGIONS limit
> in the kernel and refactoring the current region lookup algorithm
> into a faster/scalable data structure. The same applies to
> vhost-user, which has an even lower limit.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/i386/pc.c      |  4 ++--
>  hw/virtio/vhost.c | 15 ++++++++++++---
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 1eb1db0..c722339 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1306,8 +1306,8 @@ FWCfgState *pc_memory_init(MachineState *machine,
>              exit(EXIT_FAILURE);
>          }
> 
> -        memory_region_init(&pcms->hotplug_memory, OBJECT(pcms),
> -                           "hotplug-memory", hotplug_mem_size);
> +        memory_region_init_rsvd_hva(&pcms->hotplug_memory, OBJECT(pcms),
> +                                    "hotplug-memory", hotplug_mem_size);
>          memory_region_add_subregion(system_memory, pcms->hotplug_memory_base,
>                                      &pcms->hotplug_memory);
>      }
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 54851b7..44cfaab 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -380,6 +380,7 @@ static void vhost_set_memory(MemoryListener *listener,
>      bool log_dirty = memory_region_is_logging(section->mr);
>      int s = offsetof(struct vhost_memory, regions) +
>          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
> +    MemoryRegionSection rsvd_hva;
>      void *ram;
> 
>      dev->mem = g_realloc(dev->mem, s);
> @@ -388,17 +389,25 @@ static void vhost_set_memory(MemoryListener *listener,
>          add = false;
>      }
> 
> +    rsvd_hva = memory_region_find_rsvd_hva(section->mr);
> +    if (rsvd_hva.mr) {
> +        start_addr = rsvd_hva.offset_within_address_space;
> +        size = int128_get64(rsvd_hva.size);
> +        ram = memory_region_get_ram_ptr(rsvd_hva.mr);
> +    } else {
> +        ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
> +    }

I don't think this is needed.

What _could_ be useful is to merge adjacent ranges even if they are
partly unmapped, but your patch doesn't do that.

But converting to a "reserved HVA" range should have been done already
in memory_region_add_subregion_common.

Am I missing something? (And I see now that my request about
memory_region_get_ram_ptr is linked to this bit of your patch.)

Paolo

>      assert(size);
> 
>      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> -    ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
>      if (add) {
> -        if (!vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
> +        if (!rsvd_hva.mr && !vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
>              /* Region exists with same address. Nothing to do. */
>              return;
>          }
>      } else {
> -        if (!vhost_dev_find_reg(dev, start_addr, size)) {
> +        if (!rsvd_hva.mr && !vhost_dev_find_reg(dev, start_addr, size)) {
>              /* Removing region that we don't access. Nothing to do. */
>              return;
>          }

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 12:48 ` Paolo Bonzini
@ 2015-06-03 14:05   ` Igor Mammedov
  2015-06-03 15:08     ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2015-06-03 14:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, mst

On Wed, 03 Jun 2015 14:48:46 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 03/06/2015 14:22, Igor Mammedov wrote:
> > QEMU asserts in vhost due to hitting the vhost backend limit
> > on the number of supported memory regions.
> > 
> > Instead of increasing the limit in backends, describe all hotplugged
> > memory as one contiguous range to vhost, which implements a linear
> > 1:1 HVA->GPA mapping in the backend.
> > This avoids increasing the VHOST_MEMORY_MAX_NREGIONS limit
> > in the kernel and refactoring the current region lookup algorithm
> > into a faster/scalable data structure. The same applies to
> > vhost-user, which has an even lower limit.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> >  hw/i386/pc.c      |  4 ++--
> >  hw/virtio/vhost.c | 15 ++++++++++++---
> >  2 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 1eb1db0..c722339 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1306,8 +1306,8 @@ FWCfgState *pc_memory_init(MachineState *machine,
> >              exit(EXIT_FAILURE);
> >          }
> > 
> > -        memory_region_init(&pcms->hotplug_memory, OBJECT(pcms),
> > -                           "hotplug-memory", hotplug_mem_size);
> > +        memory_region_init_rsvd_hva(&pcms->hotplug_memory, OBJECT(pcms),
> > +                                    "hotplug-memory", hotplug_mem_size);
> >          memory_region_add_subregion(system_memory, pcms->hotplug_memory_base,
> >                                      &pcms->hotplug_memory);
> >      }
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 54851b7..44cfaab 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -380,6 +380,7 @@ static void vhost_set_memory(MemoryListener *listener,
> >      bool log_dirty = memory_region_is_logging(section->mr);
> >      int s = offsetof(struct vhost_memory, regions) +
> >          (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
> > +    MemoryRegionSection rsvd_hva;
> >      void *ram;
> > 
> >      dev->mem = g_realloc(dev->mem, s);
> > @@ -388,17 +389,25 @@ static void vhost_set_memory(MemoryListener *listener,
> >          add = false;
> >      }
> > 
> > +    rsvd_hva = memory_region_find_rsvd_hva(section->mr);
> > +    if (rsvd_hva.mr) {
> > +        start_addr = rsvd_hva.offset_within_address_space;
> > +        size = int128_get64(rsvd_hva.size);
> > +        ram = memory_region_get_ram_ptr(rsvd_hva.mr);
> > +    } else {
> > +        ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
> > +    }
> 
> I don't think this is needed.
> 
> What _could_ be useful is to merge adjacent ranges even if they are
> partly unmapped, but your patch doesn't do that.

merging/splitting for adjacent regions is done by the following
vhost_dev_(un)assign_memory(), but it doesn't cover cases with
gaps in between.

Trying to make merging/splitting work with gaps might be more
complicated (I haven't tried, though) than just passing the whole
rsvd_hva range, which is known in advance.

Moreover, if/when initial memory is also converted to rsvd_hva
(aliasing stopped me there for now), we could throw away all
this merging and just keep a single rsvd_hva range for all RAM here.

> 
> But converting to a "reserved HVA" range should have been done already
> in memory_region_add_subregion_common.
> 
> Am I missing something? (And I see now that my request about
> memory_region_get_ram_ptr is linked to this bit of your patch.)
> 
> Paolo
> 
> >      assert(size);
> > 
> >      /* Optimize no-change case. At least cirrus_vga does this a lot at this time. */
> > -    ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
> >      if (add) {
> > -        if (!vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
> > +        if (!rsvd_hva.mr && !vhost_dev_cmp_memory(dev, start_addr, size, (uintptr_t)ram)) {
> >              /* Region exists with same address. Nothing to do. */
> >              return;
> >          }
> >      } else {
> > -        if (!vhost_dev_find_reg(dev, start_addr, size)) {
> > +        if (!rsvd_hva.mr && !vhost_dev_find_reg(dev, start_addr, size)) {
> >              /* Removing region that we don't access. Nothing to do. */
> >              return;
> >          }

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 14:05 ` Igor Mammedov
@ 2015-06-03 15:08   ` Paolo Bonzini
  2015-06-03 15:23     ` Igor Mammedov
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2015-06-03 15:08 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, mst

On 03/06/2015 16:05, Igor Mammedov wrote:
>>> > > +    rsvd_hva = memory_region_find_rsvd_hva(section->mr);
>>> > > +    if (rsvd_hva.mr) {
>>> > > +        start_addr = rsvd_hva.offset_within_address_space;
>>> > > +        size = int128_get64(rsvd_hva.size);
>>> > > +        ram = memory_region_get_ram_ptr(rsvd_hva.mr);
>>> > > +    } else {
>>> > > +        ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
>>> > > +    }
>> > 
>> > I don't think this is needed.
>> > 
>> > What _could_ be useful is to merge adjacent ranges even if they are
>> > partly unmapped, but your patch doesn't do that.
> merging/splitting for adjacent regions is done by the following
> vhost_dev_(un)assign_memory(), but it doesn't cover cases with
> gaps in between.
> 
> Trying to make merging/splitting work with gaps might be more
> complicated (I haven't tried, though) than just passing the whole
> rsvd_hva range, which is known in advance.
> 
> Moreover, if/when initial memory is also converted to rsvd_hva
> (aliasing stopped me there for now), we could throw away all
> this merging and just keep a single rsvd_hva range for all RAM here.

Understood now. This should still be a separate patch. I'm much more
confident with the other two (e.g. what happens if a malicious guest
writes to memory that is still MAP_NORESERVE), so feel free to post
those without the RFC tag. But the vhost one really needs mst's eyes.

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 15:08 ` Paolo Bonzini
@ 2015-06-03 15:23   ` Igor Mammedov
  2015-06-03 16:11     ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2015-06-03 15:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, mst

On Wed, 03 Jun 2015 17:08:00 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 03/06/2015 16:05, Igor Mammedov wrote:
> >>> > > +    rsvd_hva = memory_region_find_rsvd_hva(section->mr);
> >>> > > +    if (rsvd_hva.mr) {
> >>> > > +        start_addr = rsvd_hva.offset_within_address_space;
> >>> > > +        size = int128_get64(rsvd_hva.size);
> >>> > > +        ram = memory_region_get_ram_ptr(rsvd_hva.mr);
> >>> > > +    } else {
> >>> > > +        ram = memory_region_get_ram_ptr(section->mr) + section->offset_within_region;
> >>> > > +    }
> >> > 
> >> > I don't think this is needed.
> >> > 
> >> > What _could_ be useful is to merge adjacent ranges even if they are
> >> > partly unmapped, but your patch doesn't do that.
> > merging/splitting for adjacent regions is done by the following
> > vhost_dev_(un)assign_memory(), but it doesn't cover cases with
> > gaps in between.
> > 
> > Trying to make merging/splitting work with gaps might be more
> > complicated (I haven't tried, though) than just passing the whole
> > rsvd_hva range, which is known in advance.
> > 
> > Moreover, if/when initial memory is also converted to rsvd_hva
> > (aliasing stopped me there for now), we could throw away all
> > this merging and just keep a single rsvd_hva range for all RAM here.
> 
> Understood now. This should still be a separate patch. I'm much more
> confident with the other two (e.g. what happens if a malicious guest
> writes to memory that is still MAP_NORESERVE),

it should get SIGSEGV due to the access to PROT_NONE memory.

> so feel free to post
> those without the RFC tag. But the vhost one really needs mst's eyes.

ok, I'll split it out.

> 
> Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 15:23 ` Igor Mammedov
@ 2015-06-03 16:11   ` Paolo Bonzini
  2015-06-03 16:30     ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2015-06-03 16:11 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, mst

On 03/06/2015 17:23, Igor Mammedov wrote:
>> > Understood now. This should still be a separate patch. I'm much more
>> > confident with the other two (e.g. what happens if a malicious guest
>> > writes to memory that is still MAP_NORESERVE),
> it should get SIGSEGV due to the access to PROT_NONE memory.

QEMU doesn't get the SEGV if you do address_space_rw or
address_space_map to unallocated space, because the empty area in the
container is treated as MMIO.

But what does vhost do if you tell it to treat the whole block as a
single huge lump?

Paolo

>> > so feel free to post
>> > those without the RFC tag. But the vhost one really needs mst's eyes.
> ok, I'll split it out.

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged
  2015-06-03 16:11 ` Paolo Bonzini
@ 2015-06-03 16:30   ` Michael S. Tsirkin
  0 siblings, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2015-06-03 16:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Igor Mammedov, qemu-devel

On Wed, Jun 03, 2015 at 06:11:29PM +0200, Paolo Bonzini wrote:
> 
> 
> On 03/06/2015 17:23, Igor Mammedov wrote:
> >> > Understood now. This should still be a separate patch. I'm much more
> >> > confident with the other two (e.g. what happens if a malicious guest
> >> > writes to memory that is still MAP_NORESERVE),
> > it should get SIGSEGV due to the access to PROT_NONE memory.
> 
> QEMU doesn't get the SEGV if you do address_space_rw or
> address_space_map to unallocated space, because the empty area in the
> container is treated as MMIO.
> 
> But what does vhost do if you tell it to treat the whole block as a
> single huge lump?
> 
> Paolo

Guest can make vhost attempt reading or writing it. vhost will do
copy from/to user.

> >> > so feel free to post
> >> > those without the RFC tag. But the vhost one really needs mst's eyes.
> > ok, I'll split it out.

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on
  2015-06-03 12:22 [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
  2015-06-03 12:22 ` [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov
  2015-06-03 12:22 ` [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged Igor Mammedov
@ 2015-06-03 15:05 ` Michael S. Tsirkin
  2 siblings, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2015-06-03 15:05 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: pbonzini, qemu-devel

On Wed, Jun 03, 2015 at 02:22:35PM +0200, Igor Mammedov wrote:
> When more than ~50 pc-dimm devices are hotplugged with
> vhost enabled, QEMU will assert in vhost_commit()
> due to the backend refusing to accept too many memory ranges.
> 
> The series introduces a reserved-HVA MemoryRegion container
> into which all hotplugged memory is remapped, and passes
> the single container range to vhost instead of a separate
> memory range for each hotplugged pc-dimm device.
> 
> It's an alternative to raising the backends' supported memory
> regions limit, since the backend-side approach I've come up
> with is quite a bit more code so far.
> 
> With this approach it would be possible to extend it to
> initial memory later and provide a single range for all
> RAM to vhost, which should speed up its hot path by replacing
> the current GPA<->HVA lookup loop with an offset calculation.

I'm a bit too busy for a thorough review so it'll take a bit of time.
From a quick look, it's nice.

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> Igor Mammedov (2):
>   memory: introduce MemoryRegion container with reserved HVA range
>   pc: fix QEMU crashing when more than ~50 memory hotplugged
> 
>  exec.c                    | 13 +++++++++++
>  hw/i386/pc.c              |  4 ++--
>  hw/virtio/vhost.c         | 15 ++++++++++---
>  include/exec/cpu-common.h |  1 +
>  include/exec/memory.h     | 42 ++++++++++++++++++++++++++++++++++--
>  memory.c                  | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 123 insertions(+), 7 deletions(-)
> 
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-06-03 16:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2015-06-03 12:22 [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Igor Mammedov
2015-06-03 12:22 ` [Qemu-devel] [RFC 1/2] memory: introduce MemoryRegion container with reserved HVA range Igor Mammedov
2015-06-03 12:45   ` Paolo Bonzini
2015-06-03 13:43     ` Igor Mammedov
2015-06-03 12:22 ` [Qemu-devel] [RFC 2/2] pc: fix QEMU crashing when more than ~50 memory hotplugged Igor Mammedov
2015-06-03 12:48   ` Paolo Bonzini
2015-06-03 14:05     ` Igor Mammedov
2015-06-03 15:08       ` Paolo Bonzini
2015-06-03 15:23         ` Igor Mammedov
2015-06-03 16:11           ` Paolo Bonzini
2015-06-03 16:30             ` Michael S. Tsirkin
2015-06-03 15:05 ` [Qemu-devel] [RFC 0/2] Fix QEMU crash during memory hotplug with vhost=on Michael S. Tsirkin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).