From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Burakov, Anatoly"
Subject: Re: [PATCH 3/3] net/virtio-user: fix memory hotplug support in vhost-kernel
Date: Fri, 7 Sep 2018 13:24:05 +0100
Message-ID:
References: <20180905042852.6212-1-tiwei.bie@intel.com>
 <20180905042852.6212-4-tiwei.bie@intel.com>
 <20180907113744.GA22511@debian>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: maxime.coquelin@redhat.com, zhihong.wang@intel.com, dev@dpdk.org,
 seanbh@gmail.com, stable@dpdk.org
To: Tiwei Bie
Return-path:
In-Reply-To: <20180907113744.GA22511@debian>
Content-Language: en-US
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 07-Sep-18 12:37 PM, Tiwei Bie wrote:
> On Fri, Sep 07, 2018 at 10:44:22AM +0100, Burakov, Anatoly wrote:
>> On 05-Sep-18 5:28 AM, Tiwei Bie wrote:
>>> It's possible to have much more hugepage backed memory regions
>>> than what vhost-kernel supports due to the memory hotplug, which
>>> may cause problems. A better solution is to have the virtio-user
>>> pass all the memory ranges reserved by DPDK to vhost-kernel.
>>>
>>> Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Tiwei Bie
>>> ---
>>>  drivers/net/virtio/virtio_user/vhost_kernel.c | 38 +++++++++----------
>>>  1 file changed, 18 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
>>> index 897fee0af..9338166d9 100644
>>> --- a/drivers/net/virtio/virtio_user/vhost_kernel.c
>>> +++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
>>> @@ -70,41 +70,41 @@ static uint64_t vhost_req_user_to_kernel[] = {
>>>  	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
>>>  };
>>>  
>>> -struct walk_arg {
>>> -	struct vhost_memory_kernel *vm;
>>> -	uint32_t region_nr;
>>> -};
>>>  static int
>>> -add_memory_region(const struct rte_memseg_list *msl __rte_unused,
>>> -		const struct rte_memseg *ms, size_t len, void *arg)
>>> +add_memseg_list(const struct rte_memseg_list *msl, void *arg)
>>>  {
>>> -	struct walk_arg *wa = arg;
>>> +	struct vhost_memory_kernel *vm = arg;
>>>  	struct vhost_memory_region *mr;
>>>  	void *start_addr;
>>> +	uint64_t len;
>>>  
>>> -	if (wa->region_nr >= max_regions)
>>> +	if (vm->nregions >= max_regions)
>>>  		return -1;
>>>  
>>> -	mr = &wa->vm->regions[wa->region_nr++];
>>> -	start_addr = ms->addr;
>>> +	start_addr = msl->base_va;
>>> +	len = msl->page_sz * msl->memseg_arr.len;
>>> +
>>> +	mr = &vm->regions[vm->nregions++];
>>>  	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
>>>  	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
>>>  	mr->memory_size = len;
>>> -	mr->mmap_offset = 0;
>>> +	mr->mmap_offset = 0; /* flags_padding */
>>> +
>>> +	PMD_DRV_LOG(DEBUG, "index=%u addr=%p len=%" PRIu64,
>>> +		vm->nregions - 1, start_addr, len);
>>>  
>>>  	return 0;
>>>  }
>>>  
>>> -/* By default, vhost kernel module allows 64 regions, but DPDK allows
>>> - * 256 segments. As a relief, below function merges those virtually
>>> - * adjacent memsegs into one region.
>>> +/* By default, vhost kernel module allows 64 regions, but DPDK may
>>> + * have much more memory regions. Below function will treat each
>>> + * contiguous memory space reserved by DPDK as one region.
>>>   */
>>>  static struct vhost_memory_kernel *
>>>  prepare_vhost_memory_kernel(void)
>>>  {
>>>  	struct vhost_memory_kernel *vm;
>>> -	struct walk_arg wa;
>>>  
>>>  	vm = malloc(sizeof(struct vhost_memory_kernel) +
>>>  			max_regions *
>>> @@ -112,20 +112,18 @@ prepare_vhost_memory_kernel(void)
>>>  	if (!vm)
>>>  		return NULL;
>>>  
>>> -	wa.region_nr = 0;
>>> -	wa.vm = vm;
>>> +	vm->nregions = 0;
>>> +	vm->padding = 0;
>>>  
>>>  	/*
>>>  	 * The memory lock has already been taken by memory subsystem
>>>  	 * or virtio_user_start_device().
>>>  	 */
>>> -	if (rte_memseg_contig_walk_thread_unsafe(add_memory_region, &wa) < 0) {
>>> +	if (rte_memseg_list_walk_thread_unsafe(add_memseg_list, vm) < 0) {
>>>  		free(vm);
>>>  		return NULL;
>>>  	}
>>>  
>>> -	vm->nregions = wa.region_nr;
>>> -	vm->padding = 0;
>>>  	return vm;
>>>  }
>>>
>>
>> Doesn't that assume single file segments mode?
>
> This is to find out the VA ranges reserved by memory subsystem.
> Why does it need to assume single file segments mode?

If you are not in single-file segments mode, each individual page in a 
VA-contiguous area will be behind a different fd - so it will be part of 
a different region, would it not?
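
To make the concern concrete, here is a rough sketch (mine, not from your 
patch) that dumps the fd backing each allocated memseg. It assumes an 
rte_memseg_get_fd_thread_unsafe() accessor is available, so treat the fd 
lookup as illustrative rather than as the exact API:

#include <stdio.h>

#include <rte_eal.h>
#include <rte_memory.h>

/* called once per allocated memseg; the walk already holds the memory
 * lock, hence the thread-unsafe fd accessor */
static int
print_seg_fd(const struct rte_memseg_list *msl,
		const struct rte_memseg *ms, void *arg __rte_unused)
{
	int fd = rte_memseg_get_fd_thread_unsafe(ms); /* -1 if none */

	printf("list base_va=%p seg addr=%p fd=%d\n",
			msl->base_va, ms->addr, fd);
	return 0;
}

int
main(int argc, char **argv)
{
	if (rte_eal_init(argc, argv) < 0)
		return 1;

	rte_memseg_walk(print_seg_fd, NULL);
	return 0;
}

With --single-file-segments every page in a list shares one fd; without 
it, each page reports its own - that is the case I have in mind above.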

>
>
>>
>> --
>> Thanks,
>> Anatoly
>

-- 
Thanks,
Anatoly