From: Aurelien Jarno <aurelien@aurel32.net>
To: Huacai Chen <chenhc@lemote.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
John Crispin <john@phrozen.org>,
"Steven J. Hill" <Steven.Hill@imgtec.com>,
linux-mips@linux-mips.org, Fuxin Zhang <zhangfx@lemote.com>,
Zhangjin Wu <wuzhangjin@gmail.com>,
Hongliang Tao <taohl@lemote.com>, Hua Yan <yanh@lemote.com>
Subject: Re: [PATCH V16 08/12] MIPS: Loongson: Add swiotlb to support big memory (>4GB)
Date: Sat, 11 Jan 2014 16:24:12 +0100 [thread overview]
Message-ID: <20140111152412.GA20717@ohm.rr44.fr> (raw)
In-Reply-To: <CAAhV-H7WFdt-4jYG5qPV36UWJQnSfkSa2J-3CAs2+QLqwHVhuA@mail.gmail.com>
On Fri, Jan 10, 2014 at 03:45:13PM +0800, Huacai Chen wrote:
> On Fri, Jan 10, 2014 at 6:08 AM, Aurelien Jarno <aurelien@aurel32.net>wrote:
>
> > On Thu, Jan 09, 2014 at 06:33:48PM +0800, Huacai Chen wrote:
> > > On Thu, Jan 9, 2014 at 6:58 AM, Aurelien Jarno <aurelien@aurel32.net>
> > wrote:
> > >
> > > > On Wed, Jan 08, 2014 at 10:44:24AM +0800, Huacai Chen wrote:
> > > > > This is probably a workaround because Loongson doesn't support DMA
> > > > > address above 4GB. If memory is more than 4GB, CONFIG_SWIOTLB and
> > > > > ZONE_DMA32 should be selected. In this way, DMA pages are allocated
> > > > > below 4GB preferably.
> > > > >
> > > > > However, CONFIG_SWIOTLB+ZONE_DMA32 is not enough, so, we provide a
> > > > > platform-specific dma_map_ops::set_dma_mask() to make sure each
> > > > > driver's dma_mask and coherent_dma_mask is below 32-bit.
> > > > >
> > > > > Signed-off-by: Huacai Chen <chenhc@lemote.com>
> > > > > Signed-off-by: Hongliang Tao <taohl@lemote.com>
> > > > > Signed-off-by: Hua Yan <yanh@lemote.com>
> > > > > ---
> > > > > arch/mips/include/asm/dma-mapping.h | 5 +
> > > > > .../mips/include/asm/mach-loongson/dma-coherence.h | 22 +++-
> > > > > arch/mips/loongson/common/Makefile | 5 +
> > > > > arch/mips/loongson/common/dma-swiotlb.c | 165
> > > > ++++++++++++++++++++
> > > > > 4 files changed, 196 insertions(+), 1 deletions(-)
> > > > > create mode 100644 arch/mips/loongson/common/dma-swiotlb.c
> > > > >
> > > > > diff --git a/arch/mips/include/asm/dma-mapping.h
> > > > b/arch/mips/include/asm/dma-mapping.h
> > > > > index 84238c5..06412aa 100644
> > > > > --- a/arch/mips/include/asm/dma-mapping.h
> > > > > +++ b/arch/mips/include/asm/dma-mapping.h
> > > > > @@ -49,9 +49,14 @@ static inline int dma_mapping_error(struct device
> > > > *dev, u64 mask)
> > > > > static inline int
> > > > > dma_set_mask(struct device *dev, u64 mask)
> > > > > {
> > > > > + struct dma_map_ops *ops = get_dma_ops(dev);
> > > > > +
> > > > > if(!dev->dma_mask || !dma_supported(dev, mask))
> > > > > return -EIO;
> > > > >
> > > > > + if (ops->set_dma_mask)
> > > > > + return ops->set_dma_mask(dev, mask);
> > > > > +
> > > >
> > > > I think that with the changes I propose in loongson_dma_alloc_coherent,
> > > > this should be useless.
> > > >
> > > > > *dev->dma_mask = mask;
> > > > >
> > > > > return 0;
> > > > > diff --git a/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > b/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > > index aeb2c05..6a90275 100644
> > > > > --- a/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > > +++ b/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > > @@ -11,24 +11,40 @@
> > > > > #ifndef __ASM_MACH_LOONGSON_DMA_COHERENCE_H
> > > > > #define __ASM_MACH_LOONGSON_DMA_COHERENCE_H
> > > > >
> > > > > +#ifdef CONFIG_SWIOTLB
> > > > > +#include <linux/swiotlb.h>
> > > > > +#endif
> > > > > +
> > > > > struct device;
> > > > >
> > > > > +extern dma_addr_t phys_to_dma(struct device *dev, phys_addr_t
> > paddr);
> > > > > +extern phys_addr_t dma_to_phys(struct device *dev, dma_addr_t
> > daddr);
> > > > > static inline dma_addr_t plat_map_dma_mem(struct device *dev, void
> > > > *addr,
> > > > > size_t size)
> > > > > {
> > > > > +#ifdef CONFIG_CPU_LOONGSON3
> > > > > + return virt_to_phys(addr);
> > > > > +#else
> > > > > return virt_to_phys(addr) | 0x80000000;
> > > > > +#endif
> > > > > }
> > > > >
> > > > > static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
> > > > > struct page *page)
> > > > > {
> > > > > +#ifdef CONFIG_CPU_LOONGSON3
> > > > > + return page_to_phys(page);
> > > > > +#else
> > > > > return page_to_phys(page) | 0x80000000;
> > > > > +#endif
> > > > > }
> > > > >
> > > > > static inline unsigned long plat_dma_addr_to_phys(struct device
> > *dev,
> > > > > dma_addr_t dma_addr)
> > > > > {
> > > > > -#if defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
> > > > > +#if defined(CONFIG_CPU_LOONGSON3) && defined(CONFIG_64BIT)
> > > > > + return dma_addr;
> > > > > +#elif defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
> > > > > return (dma_addr > 0x8fffffff) ? dma_addr : (dma_addr &
> > > > 0x0fffffff);
> > > > > #else
> > > > > return dma_addr & 0x7fffffff;
> > > > > @@ -55,7 +71,11 @@ static inline int plat_dma_supported(struct device
> > > > *dev, u64 mask)
> > > > >
> > > > > static inline int plat_device_is_coherent(struct device *dev)
> > > > > {
> > > > > +#ifdef CONFIG_DMA_NONCOHERENT
> > > > > return 0;
> > > > > +#else
> > > > > + return 1;
> > > > > +#endif /* CONFIG_DMA_NONCOHERENT */
> > > > > }
> > > > >
> > > > > #endif /* __ASM_MACH_LOONGSON_DMA_COHERENCE_H */
> > > > > diff --git a/arch/mips/loongson/common/Makefile
> > > > b/arch/mips/loongson/common/Makefile
> > > > > index 9e4484c..0bb9cc9 100644
> > > > > --- a/arch/mips/loongson/common/Makefile
> > > > > +++ b/arch/mips/loongson/common/Makefile
> > > > > @@ -26,3 +26,8 @@ obj-$(CONFIG_CS5536) += cs5536/
> > > > > #
> > > > >
> > > > > obj-$(CONFIG_LOONGSON_SUSPEND) += pm.o
> > > > > +
> > > > > +#
> > > > > +# Big Memory (SWIOTLB) Support
> > > > > +#
> > > > > +obj-$(CONFIG_SWIOTLB) += dma-swiotlb.o
> > > > > diff --git a/arch/mips/loongson/common/dma-swiotlb.c
> > > > b/arch/mips/loongson/common/dma-swiotlb.c
> > > > > new file mode 100644
> > > > > index 0000000..9d5451b
> > > > > --- /dev/null
> > > > > +++ b/arch/mips/loongson/common/dma-swiotlb.c
> > > > > @@ -0,0 +1,165 @@
> > > > > +#include <linux/mm.h>
> > > > > +#include <linux/init.h>
> > > > > +#include <linux/dma-mapping.h>
> > > > > +#include <linux/scatterlist.h>
> > > > > +#include <linux/swiotlb.h>
> > > > > +#include <linux/bootmem.h>
> > > > > +
> > > > > +#include <asm/bootinfo.h>
> > > > > +#include <dma-coherence.h>
> > > > > +
> > > > > +static void *loongson_dma_alloc_coherent(struct device *dev, size_t
> > > > size,
> > > > > + dma_addr_t *dma_handle, gfp_t gfp, struct dma_attrs
> > *attrs)
> > > > > +{
> > > > > + void *ret;
> > > > > +
> > > > > + if (dma_alloc_from_coherent(dev, size, dma_handle, &ret))
> > > > > + return ret;
> > > > > +
> > > > > + /* ignore region specifiers */
> > > > > + gfp &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
> > > > > +
> > > > > +#ifdef CONFIG_ISA
> > > > > + if (dev == NULL)
> > > > > + gfp |= __GFP_DMA;
> > > > > + else
> > > > > +#endif
> > > > > +#ifdef CONFIG_ZONE_DMA
> > > > > + if (dev->coherent_dma_mask < DMA_BIT_MASK(32))
> > > > > + gfp |= __GFP_DMA;
> > > > > + else
> > > > > +#endif
> > > > > +#ifdef CONFIG_ZONE_DMA32
> > > > > + /* Loongson-3 only support DMA in the memory below 4GB now */
> > > > > + if (dev->coherent_dma_mask < DMA_BIT_MASK(40))
> > > >
> > > > You don't want the if here. If the device has coherent_dma_mask with
> > > > the highest bit being the 32th one or higher, you still need
> > > > __GFP_DMA32 to make sure the memory will be allocated below 4GB, as
> > > > the Loongson 3 CPU doesn't support more than 32-bit DMA.
> > > >
> > > 32-bit DMA is probably not because of Loongson-3, but because of the
> > > north-bridge configuration. In our recent experiments, we can enable
> > > 64-bit DMA by update PMON (but it need more time to prove the stability).
> > > So we keep the set_dma_mask() for future. Moreover, we need a correct
> > > dma_mask to check the DMAablity in lib/swiotlb.c.
> > >
> > > DMA_BIT_MASK(40) comes from Radeon GPU, if DMA-40 is enabled, it doesn't
> > > need to alloc pages in DMA32 zone.
> >
>
> > The description of the patch says that a Loongson system only supports
> > DMA below 4GB. We doesn't care here if the limit comes from the
> > north-bridge configuration or of the CPU. So if the limit should be 4GB,
> > __GFP_DMA32 should always be set. If there is no limit, it should be set
> > if (dev->coherent_dma_mask < DMA_BIT_MASK(64)) as the DMA32 zone is the
> > biggest zone usable when the address doesn't fit in 64 bit.
> >
> > In any case it is not really consistent to not limit the addresses to
> > 32 bits here, but limit them later in set_dma_mask().
> >
> I still think set_dma_mask() is needed. Of course we can set GFP_DMA32 in
> when alloc a page, but DMA32 zone is limited, what will happen if alloc a
> page in DMA32 zone fails? In current design, if it fails, we will alloc a
> page above 4GB and use swiotlb to bounce the page.
Yes it is exactly how swiotlb should work. The memory should be
allocated *if possible* in the DMA32 zone, and if it fails swiotlb is
used to bounce the page. However bouncing the page has a cost that
should be avoided. Not specifying GFP_DMA32, means that the memory
might be allocated anywhere, and *if we are lucky* it will be in the
DAM32 zone where not bounce buffer is needed.
> Moreover, DMA include alloc()/free() API and map_page()/unmap_page() API.
> For alloc()/free(), GFP_DMA32 can resove most of problems. However, for
> map_page()/unmap_page(), the page is probably allocated before map_page()
> and without GFP_DMA32, so cannot do DMA directly (need swiotlb to bounce).
> In this situation, device's dma_mask is used to distinguish whether bounce
> is needed (see dma_capable() which defined in
> arch/mips/include/asm/dma-mapping.h and used in lib/swiotlb.c).
Thanks for your explanation, you have convinced me that a set_dma_mask
is indeed needed. That said I do wonder if it won't be better
performance wise to replace dma_set_mask by something like:
static inline int dma_set_mask(struct device *dev, u64 mask)
{
return get_dma_ops(dev)->set_dma_mask(dev, mask);
}
And to provide a default set_dma_mask function in mm/dma-default.c.
Does anybody have an opinion about that?
> Maybe my commit message is not clear, the fact is Loongson-3A can only do
> DMA *directly* below 4GB, but can use memory above 4GB to do DMA indirectly
> (via swiotlb).
As said above, even if the use of swiotlb is possible, it has a cost
(the memory has to be copied after the DMA transfer) that should be
avoided as much as possible. That's why GFP_DMA32 should always be
specified in loongson_dma_alloc_coherent.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net
next prev parent reply other threads:[~2014-01-11 15:25 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-08 2:44 [PATCH V16 00/12] MIPS: Add Loongson-3 based machines support Huacai Chen
2014-01-08 2:44 ` [PATCH V16 01/12] MIPS: Loongson: Add basic Loongson-3 definition Huacai Chen
2014-01-08 19:58 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 02/12] MIPS: Loongson: Add basic Loongson-3 CPU support Huacai Chen
2014-01-08 19:58 ` Aurelien Jarno
[not found] ` <CAAhV-H4tx=sCk=iUwuCfnCS+rbmtu5Y_UcpAn6JXDoobA+OGrQ@mail.gmail.com>
2014-01-09 21:36 ` Aurelien Jarno
2014-01-12 9:03 ` Huacai Chen
2014-01-12 9:57 ` Aurelien Jarno
2014-01-12 12:10 ` Aaro Koskinen
2014-01-18 1:54 ` John Crispin
2014-01-08 2:44 ` [PATCH V16 03/12] MIPS: Loongson 3: Add Lemote-3A machtypes definition Huacai Chen
2014-01-08 19:58 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 04/12] MIPS: Loongson: Add UEFI-like firmware interface (LEFI) support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
2014-01-09 12:43 ` Alex Smith
2014-01-09 12:43 ` Alex Smith
[not found] ` <CAAhV-H64BXsw5CBL-KW1eqXkYcadhHF2NeBH9YmWQz046Lpzzw@mail.gmail.com>
2014-01-09 21:37 ` Aurelien Jarno
2014-01-09 12:37 ` Alex Smith
2014-01-09 12:37 ` Alex Smith
2014-01-08 2:44 ` [PATCH V16 05/12] MIPS: Loongson 3: Add HT-linked PCI support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
[not found] ` <CAAhV-H57tDmYByjVwhf3teFZkGowR4E9+OO1vO0kP3iAKTNJVw@mail.gmail.com>
2014-01-09 21:38 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 06/12] MIPS: Loongson 3: Add IRQ init and dispatch support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
2014-01-09 12:52 ` Alex Smith
2014-01-09 12:52 ` Alex Smith
2014-01-08 2:44 ` [PATCH V16 07/12] MIPS: Loongson 3: Add serial port support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 08/12] MIPS: Loongson: Add swiotlb to support big memory (>4GB) Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
[not found] ` <CAAhV-H4h43N2OR4znwVv3miVbGkWJLapdgr9Jou1j4R8-9TRyA@mail.gmail.com>
2014-01-09 22:08 ` Aurelien Jarno
[not found] ` <CAAhV-H7WFdt-4jYG5qPV36UWJQnSfkSa2J-3CAs2+QLqwHVhuA@mail.gmail.com>
2014-01-11 15:24 ` Aurelien Jarno [this message]
2014-01-09 12:56 ` Alex Smith
2014-01-09 12:56 ` Alex Smith
[not found] ` <CAAhV-H7ZO0gNzQ5wQ-yD=NiP2AJrc3-bWLXHo-HDngf27c9+gQ@mail.gmail.com>
2014-01-11 15:25 ` Aurelien Jarno
2014-01-12 9:12 ` Huacai Chen
2014-01-12 9:57 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 09/12] MIPS: Loongson: Add Loongson-3 Kconfig options Huacai Chen
2014-01-09 13:07 ` Alex Smith
2014-01-09 13:07 ` Alex Smith
2014-01-11 15:24 ` Aurelien Jarno
2014-01-13 3:02 ` Huacai Chen
2014-01-13 4:30 ` Aurelien Jarno
2014-01-13 10:15 ` Huacai Chen
2014-01-13 10:38 ` Aurelien Jarno
2014-01-13 12:15 ` Huacai Chen
2014-01-08 2:44 ` [PATCH V16 10/12] MIPS: Loongson 3: Add Loongson-3 SMP support Huacai Chen
2014-01-09 13:19 ` Alex Smith
2014-01-09 13:19 ` Alex Smith
2014-01-08 2:44 ` [PATCH V16 11/12] MIPS: Loongson 3: Add CPU hotplug support Huacai Chen
2014-01-11 15:24 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 12/12] MIPS: Loongson: Add a Loongson-3 default config file Huacai Chen
2014-01-08 7:26 ` [PATCH V16 00/12] MIPS: Add Loongson-3 based machines support John Crispin
2014-01-18 9:38 ` Andreas Barth
[not found] ` <0466fa9d60b91233d2157d5ce0b51333.squirrel@mail.lemote.com>
2014-02-06 23:27 ` Andreas Barth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140111152412.GA20717@ohm.rr44.fr \
--to=aurelien@aurel32.net \
--cc=Steven.Hill@imgtec.com \
--cc=chenhc@lemote.com \
--cc=john@phrozen.org \
--cc=linux-mips@linux-mips.org \
--cc=ralf@linux-mips.org \
--cc=taohl@lemote.com \
--cc=wuzhangjin@gmail.com \
--cc=yanh@lemote.com \
--cc=zhangfx@lemote.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox