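From: nacc@linux.vnet.ibm.com (Nishanth Aravamudan)
To: Matthew Wilcox <willy@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>,
    Benjamin Herrenschmidt <benh@kernel.crashing.org>,
    Paul Mackerras <paulus@samba.org>,
    Michael Ellerman <mpe@ellerman.id.au>,
    Alexey Kardashevskiy <aik@ozlabs.ru>,
    David Gibson <david@gibson.dropbear.id.au>,
    Christoph Hellwig <hch@infradead.org>,
    "David S. Miller" <davem@davemloft.net>,
    linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org
Subject: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA
Date: Fri, 23 Oct 2015 13:57:49 -0700
Message-ID: <20151023205749.GD10197@linux.vnet.ibm.com>
In-Reply-To: <20151023205420.GA10197@linux.vnet.ibm.com>
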
[Sorry, subject should have been 0/7!]
On 23.10.2015 [13:54:20 -0700], Nishanth Aravamudan wrote:
> We received a bug report recently when DDW (64-bit direct DMA on Power)
> is not enabled for NVMe devices. In that case, we fall back to 32-bit
> DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
> Entries).
>
> The NVMe device driver, though, assumes that the DMA alignment for the
> PRP entries will match the device's page size, and that the DMA alignment
> matches the kernel's page alignment. On Power, the IOMMU page size,
> as mentioned above, can be 4K, while the device can have a page size of
> 8K, while the kernel has a page size of 64K. This eventually trips the
> BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
> of 4K but not 8K (e.g., 0xF000).
>
> In this particular case, and generally, we want to use the IOMMU's page
> size for the default device page size, rather than the kernel's page
> size.
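
The arithmetic behind the failing case can be checked with a small userspace sketch. This is not the driver's code: `is_page_multiple` is a hypothetical helper, and the shifts 12, 13 and 16 simply stand for the 4K IOMMU page, the 8K device page and the 64K kernel page mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the NVMe driver): returns 1 when
 * len is a whole multiple of a (1 << shift)-byte page, 0 otherwise.
 */
static int is_page_multiple(uint64_t len, unsigned int shift)
{
    assert(shift < 64);
    uint64_t mask = (UINT64_C(1) << shift) - 1;
    return (len & mask) == 0;
}
```

For the example length 0xF000 (15 * 4K), `is_page_multiple(0xF000, 12)` is 1, while the same call with shift 13 or 16 is 0, which is the mismatch the quoted text describes.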
>
> This series consists of seven patches:
>
> 1) add a generic dma_get_page_shift implementation that just returns
> PAGE_SHIFT
> 2) override the generic implementation on Power to use the IOMMU table's
> page shift if available
> 3) allow further specific overriding on Power with machdep platform
> overrides
> 4) use the machdep override on pseries, as the DDW code puts the TCE
> shift in a special property and there is no IOMMU table available
> 5) move some sparc code around to make IOMMU_PAGE_SHIFT available in
> include/asm
> 6) override the generic implementation on sparc to use IOMMU_PAGE_SHIFT
> 7) leverage the new API in the NVMe driver
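
The layering described in the list above can be pictured with a userspace mock. Everything here is an assumption made for illustration: the `mock_` types and functions, the order in which the overrides are consulted, and the shift values are hypothetical and are not taken from the patches themselves.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SHIFT 16 /* stand-in for a 64K kernel PAGE_SHIFT */

/* Hypothetical stand-in for an IOMMU table that records its page shift. */
struct mock_iommu_table {
    unsigned int page_shift;
};

/* Hypothetical stand-in for a device with optional override hooks. */
struct mock_device {
    const struct mock_iommu_table *table;                        /* may be NULL */
    unsigned int (*platform_shift)(const struct mock_device *);  /* may be NULL */
};

/* Hypothetical platform hook; the value 24 is purely illustrative. */
static unsigned int mock_platform_shift(const struct mock_device *dev)
{
    (void)dev;
    return 24;
}

/*
 * Assumed lookup order: platform hook first, then the IOMMU table,
 * then the generic fallback that just returns the kernel page shift.
 */
static unsigned int mock_dma_get_page_shift(const struct mock_device *dev)
{
    if (dev->platform_shift)
        return dev->platform_shift(dev);
    if (dev->table)
        return dev->table->page_shift;
    return MOCK_PAGE_SHIFT;
}
```

A device with neither hook gets the generic value, a device with a 4K table gets 12, and a device with the platform hook gets whatever that hook reports; a driver would then size its device pages from the returned shift rather than from the kernel page size.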
>
> With these patches, an NVMe device survives our internal hardware
> exerciser; the kernel BUGs within a few seconds without the patch.
>
> arch/powerpc/include/asm/dma-mapping.h | 3 +++
> arch/powerpc/include/asm/machdep.h | 3 ++-
> arch/powerpc/kernel/dma.c | 11 +++++++++++
> arch/powerpc/platforms/pseries/iommu.c | 36 ++++++++++++++++++++++++++++++++++++
> arch/sparc/include/asm/dma-mapping.h | 8 ++++++++
> arch/sparc/include/asm/iommu_common.h | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> arch/sparc/kernel/iommu.c | 2 +-
> arch/sparc/kernel/iommu_common.h | 51 ---------------------------------------------------
> arch/sparc/kernel/pci_psycho.c | 2 +-
> arch/sparc/kernel/pci_sabre.c | 2 +-
> arch/sparc/kernel/pci_schizo.c | 2 +-
> arch/sparc/kernel/pci_sun4v.c | 2 +-
> arch/sparc/kernel/psycho_common.c | 2 +-
> arch/sparc/kernel/sbus.c | 3 +--
> drivers/block/nvme-core.c | 3 ++-
> include/linux/dma-mapping.h | 7 +++++++
> 16 files changed, 127 insertions(+), 61 deletions(-)
>
> v1 -> v2:
> Based upon feedback from Christoph Hellwig, rather than using an
> arch-specific hack, expose the DMA page shift via a generic DMA API and
> override it on Power as needed.
> v2 -> v3:
> Based upon feedback from Christoph Hellwig, put the generic
> implementation in include/linux/dma-mapping.h, since not all archs use
> include/asm-generic/dma-mapping-common.h.
> Add sparc implementation, as that arch seems to have a different IOMMU
> page size.