From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jonas Pfefferle
Subject: [PATCH v2] vfio: fix sPAPR IOMMU DMA window size
Date: Tue, 8 Aug 2017 10:28:10 +0200
Message-ID: <1502180890-20076-1-git-send-email-jpf@zurich.ibm.com>
Cc: dev@dpdk.org, aik@ozlabs.ru, Jonas Pfefferle
To: anatoly.burakov@intel.com
List-Id: DPDK patches and discussions

The DMA window size needs to be big enough to span the physical
addresses of all memory segments. We do not need multiple levels of
IOMMU tables as we already span ~70TB of physical memory with 16MB
hugepages.

Signed-off-by: Jonas Pfefferle
---
v2:
* roundup to next power of 2 function without loop.

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 42 +++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e..a3f9977 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -722,6 +722,35 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return 0;
 }
 
+static inline int
+clz64(uint64_t val)
+{
+	return val ? __builtin_clzll(val) : 64;
+}
+
+static inline bool
+is_power_of_2(uint64_t value)
+{
+	if (!value)
+		return false;
+
+	return !(value & (value - 1));
+}
+
+static inline uint64_t
+roundup_next_pow2(uint64_t value)
+{
+	uint8_t nlz = clz64(value);
+
+	if (is_power_of_2(value))
+		return value;
+
+	if (!nlz)
+		return 0;
+
+	return 1ULL << (64 - nlz);
+}
+
 static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
@@ -759,10 +788,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		return -1;
 	}
 
-	/* calculate window size based on number of hugepages configured */
-	create.window_size = rte_eal_get_physmem_size();
+	/* physical pages are sorted descending i.e. ms[0].phys_addr is max */
+	/* create DMA window from 0 to max(phys_addr + len) */
+	/* sPAPR requires window size to be a power of 2 */
+	create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len);
 	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
-	create.levels = 2;
+	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
 	if (ret) {
@@ -771,6 +802,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		return -1;
 	}
 
+	if (create.start_addr != 0) {
+		RTE_LOG(ERR, EAL, " DMA window start address != 0\n");
+		return -1;
+	}
+
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
 	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
 		struct vfio_iommu_type1_dma_map dma_map;
-- 
2.7.4
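
For reference, the rounding helper and the window-size computation from this
patch can be exercised outside DPDK. The following is a minimal standalone
sketch, not the patch itself: spapr_window_size() and spapr_page_shift() are
hypothetical wrappers introduced here for illustration, and it assumes a
GCC-compatible compiler for the __builtin_clzll/__builtin_ctzll builtins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count leading zero bits; __builtin_clzll(0) is undefined, so handle 0. */
static inline int
clz64(uint64_t val)
{
	return val ? __builtin_clzll(val) : 64;
}

static inline bool
is_power_of_2(uint64_t value)
{
	return value != 0 && !(value & (value - 1));
}

/*
 * Round up to the next power of 2 without a loop.
 * Returns 0 if the result would not fit in 64 bits.
 */
static inline uint64_t
roundup_next_pow2(uint64_t value)
{
	int nlz = clz64(value);

	if (is_power_of_2(value))
		return value;

	if (!nlz)
		return 0;

	return 1ULL << (64 - nlz);
}

/*
 * Hypothetical wrapper: window covering [0, phys_addr + len) for the
 * segment with the highest physical address, rounded to a power of 2.
 * Assumes phys_addr + len does not overflow 64 bits.
 */
static inline uint64_t
spapr_window_size(uint64_t phys_addr, uint64_t len)
{
	return roundup_next_pow2(phys_addr + len);
}

/* Hypothetical wrapper: page shift derived from the hugepage size. */
static inline int
spapr_page_shift(uint64_t hugepage_sz)
{
	assert(is_power_of_2(hugepage_sz));
	return __builtin_ctzll(hugepage_sz);
}
```

As a made-up example, a segment at physical address 0x2000000000 with length
0x40000000 ends at 0x2040000000, so spapr_window_size() rounds the window up
to 0x4000000000 (256 GiB), and a 16MB hugepage size gives a page shift of 24.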