From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e38.co.us.ibm.com (e38.co.us.ibm.com [32.97.110.159]) (using TLSv1 with cipher CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id E32B91A04D3 for ; Sat, 3 Oct 2015 03:16:28 +1000 (AEST) Received: from localhost by e38.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 2 Oct 2015 11:16:26 -0600 Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 49C2F19D8041 for ; Fri, 2 Oct 2015 11:04:29 -0600 (MDT) Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com [9.17.195.85]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t92HF2Mf55378064 for ; Fri, 2 Oct 2015 10:15:02 -0700 Received: from d03av05.boulder.ibm.com (localhost [127.0.0.1]) by d03av05.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t92HGEKf002073 for ; Fri, 2 Oct 2015 11:16:16 -0600 Date: Fri, 2 Oct 2015 10:16:06 -0700 From: Nishanth Aravamudan To: Matthew Wilcox Cc: Keith Busch , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Alexey Kardashevskiy , David Gibson , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org Subject: [PATCH 0/2] Fix NVMe driver support on Power with 32-bit DMA Message-ID: <20151002171606.GA41011@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the PRP entries will match the device's page size, and that the DMA aligment matches the kernel's page aligment. On Power, the the IOMMU page size, as mentioned above, can be 4K, while the device can have a page size of 8K, while the kernel has a page size of 64K. This eventually trips the BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple of 4K but not 8K (e.g., 0xF000). In this particular case, and generally, we want to use the IOMMU's page size for the default device page size, rather than the kernel's page size. This series consists of two patches, one of which exposes the IOMMU's page shift on Power (currently only the page size is exposed, and it seems unnecessary to ilog2 that value in the driver). The second patch leverages this value on Power in the NVMe driver. With these patches, a NVMe device survives our internal hardware exerciser; the kernel BUGs within a few seconds without the patch.