From: Aurelien Jarno <aurelien@aurel32.net>
To: Huacai Chen <chenhc@lemote.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
John Crispin <john@phrozen.org>,
"Steven J. Hill" <Steven.Hill@imgtec.com>,
linux-mips@linux-mips.org, Fuxin Zhang <zhangfx@lemote.com>,
Zhangjin Wu <wuzhangjin@gmail.com>,
Hongliang Tao <taohl@lemote.com>, Hua Yan <yanh@lemote.com>
Subject: Re: [PATCH V16 08/12] MIPS: Loongson: Add swiotlb to support big memory (>4GB)
Date: Sat, 11 Jan 2014 16:24:12 +0100 [thread overview]
Message-ID: <20140111152412.GA20717@ohm.rr44.fr> (raw)
In-Reply-To: <CAAhV-H7WFdt-4jYG5qPV36UWJQnSfkSa2J-3CAs2+QLqwHVhuA@mail.gmail.com>
On Fri, Jan 10, 2014 at 03:45:13PM +0800, Huacai Chen wrote:
> On Fri, Jan 10, 2014 at 6:08 AM, Aurelien Jarno <aurelien@aurel32.net>wrote:
>
> > On Thu, Jan 09, 2014 at 06:33:48PM +0800, Huacai Chen wrote:
> > > On Thu, Jan 9, 2014 at 6:58 AM, Aurelien Jarno <aurelien@aurel32.net>
> > wrote:
> > >
> > > > On Wed, Jan 08, 2014 at 10:44:24AM +0800, Huacai Chen wrote:
> > > > > This is probably a workaround because Loongson doesn't support DMA
> > > > > address above 4GB. If memory is more than 4GB, CONFIG_SWIOTLB and
> > > > > ZONE_DMA32 should be selected. In this way, DMA pages are allocated
> > > > > below 4GB preferably.
> > > > >
> > > > > However, CONFIG_SWIOTLB+ZONE_DMA32 is not enough, so, we provide a
> > > > > platform-specific dma_map_ops::set_dma_mask() to make sure each
> > > > > driver's dma_mask and coherent_dma_mask is below 32-bit.
> > > > >
> > > > > Signed-off-by: Huacai Chen <chenhc@lemote.com>
> > > > > Signed-off-by: Hongliang Tao <taohl@lemote.com>
> > > > > Signed-off-by: Hua Yan <yanh@lemote.com>
> > > > > ---
> > > > > arch/mips/include/asm/dma-mapping.h | 5 +
> > > > > .../mips/include/asm/mach-loongson/dma-coherence.h | 22 +++-
> > > > > arch/mips/loongson/common/Makefile | 5 +
> > > > > arch/mips/loongson/common/dma-swiotlb.c | 165
> > > > ++++++++++++++++++++
> > > > > 4 files changed, 196 insertions(+), 1 deletions(-)
> > > > > create mode 100644 arch/mips/loongson/common/dma-swiotlb.c
> > > > >
> > > > > diff --git a/arch/mips/include/asm/dma-mapping.h
> > > > b/arch/mips/include/asm/dma-mapping.h
> > > > > index 84238c5..06412aa 100644
> > > > > --- a/arch/mips/include/asm/dma-mapping.h
> > > > > +++ b/arch/mips/include/asm/dma-mapping.h
> > > > > @@ -49,9 +49,14 @@ static inline int dma_mapping_error(struct device
> > > > *dev, u64 mask)
> > > > > static inline int
> > > > > dma_set_mask(struct device *dev, u64 mask)
> > > > > {
> > > > > + struct dma_map_ops *ops = get_dma_ops(dev);
> > > > > +
> > > > > if(!dev->dma_mask || !dma_supported(dev, mask))
> > > > > return -EIO;
> > > > >
> > > > > + if (ops->set_dma_mask)
> > > > > + return ops->set_dma_mask(dev, mask);
> > > > > +
> > > >
> > > > I think that with the changes I propose in loongson_dma_alloc_coherent,
> > > > this should be useless.
> > > >
> > > > > *dev->dma_mask = mask;
> > > > >
> > > > > return 0;
> > > > > diff --git a/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > b/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > > index aeb2c05..6a90275 100644
> > > > > --- a/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > > +++ b/arch/mips/include/asm/mach-loongson/dma-coherence.h
> > > > > @@ -11,24 +11,40 @@
> > > > > #ifndef __ASM_MACH_LOONGSON_DMA_COHERENCE_H
> > > > > #define __ASM_MACH_LOONGSON_DMA_COHERENCE_H
> > > > >
> > > > > +#ifdef CONFIG_SWIOTLB
> > > > > +#include <linux/swiotlb.h>
> > > > > +#endif
> > > > > +
> > > > > struct device;
> > > > >
> > > > > +extern dma_addr_t phys_to_dma(struct device *dev, phys_addr_t
> > paddr);
> > > > > +extern phys_addr_t dma_to_phys(struct device *dev, dma_addr_t
> > daddr);
> > > > > static inline dma_addr_t plat_map_dma_mem(struct device *dev, void
> > > > *addr,
> > > > > size_t size)
> > > > > {
> > > > > +#ifdef CONFIG_CPU_LOONGSON3
> > > > > + return virt_to_phys(addr);
> > > > > +#else
> > > > > return virt_to_phys(addr) | 0x80000000;
> > > > > +#endif
> > > > > }
> > > > >
> > > > > static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
> > > > > struct page *page)
> > > > > {
> > > > > +#ifdef CONFIG_CPU_LOONGSON3
> > > > > + return page_to_phys(page);
> > > > > +#else
> > > > > return page_to_phys(page) | 0x80000000;
> > > > > +#endif
> > > > > }
> > > > >
> > > > > static inline unsigned long plat_dma_addr_to_phys(struct device
> > *dev,
> > > > > dma_addr_t dma_addr)
> > > > > {
> > > > > -#if defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
> > > > > +#if defined(CONFIG_CPU_LOONGSON3) && defined(CONFIG_64BIT)
> > > > > + return dma_addr;
> > > > > +#elif defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
> > > > > return (dma_addr > 0x8fffffff) ? dma_addr : (dma_addr &
> > > > 0x0fffffff);
> > > > > #else
> > > > > return dma_addr & 0x7fffffff;
> > > > > @@ -55,7 +71,11 @@ static inline int plat_dma_supported(struct device
> > > > *dev, u64 mask)
> > > > >
> > > > > static inline int plat_device_is_coherent(struct device *dev)
> > > > > {
> > > > > +#ifdef CONFIG_DMA_NONCOHERENT
> > > > > return 0;
> > > > > +#else
> > > > > + return 1;
> > > > > +#endif /* CONFIG_DMA_NONCOHERENT */
> > > > > }
> > > > >
> > > > > #endif /* __ASM_MACH_LOONGSON_DMA_COHERENCE_H */
> > > > > diff --git a/arch/mips/loongson/common/Makefile
> > > > b/arch/mips/loongson/common/Makefile
> > > > > index 9e4484c..0bb9cc9 100644
> > > > > --- a/arch/mips/loongson/common/Makefile
> > > > > +++ b/arch/mips/loongson/common/Makefile
> > > > > @@ -26,3 +26,8 @@ obj-$(CONFIG_CS5536) += cs5536/
> > > > > #
> > > > >
> > > > > obj-$(CONFIG_LOONGSON_SUSPEND) += pm.o
> > > > > +
> > > > > +#
> > > > > +# Big Memory (SWIOTLB) Support
> > > > > +#
> > > > > +obj-$(CONFIG_SWIOTLB) += dma-swiotlb.o
> > > > > diff --git a/arch/mips/loongson/common/dma-swiotlb.c
> > > > b/arch/mips/loongson/common/dma-swiotlb.c
> > > > > new file mode 100644
> > > > > index 0000000..9d5451b
> > > > > --- /dev/null
> > > > > +++ b/arch/mips/loongson/common/dma-swiotlb.c
> > > > > @@ -0,0 +1,165 @@
> > > > > +#include <linux/mm.h>
> > > > > +#include <linux/init.h>
> > > > > +#include <linux/dma-mapping.h>
> > > > > +#include <linux/scatterlist.h>
> > > > > +#include <linux/swiotlb.h>
> > > > > +#include <linux/bootmem.h>
> > > > > +
> > > > > +#include <asm/bootinfo.h>
> > > > > +#include <dma-coherence.h>
> > > > > +
> > > > > +static void *loongson_dma_alloc_coherent(struct device *dev, size_t
> > > > size,
> > > > > + dma_addr_t *dma_handle, gfp_t gfp, struct dma_attrs
> > *attrs)
> > > > > +{
> > > > > + void *ret;
> > > > > +
> > > > > + if (dma_alloc_from_coherent(dev, size, dma_handle, &ret))
> > > > > + return ret;
> > > > > +
> > > > > + /* ignore region specifiers */
> > > > > + gfp &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
> > > > > +
> > > > > +#ifdef CONFIG_ISA
> > > > > + if (dev == NULL)
> > > > > + gfp |= __GFP_DMA;
> > > > > + else
> > > > > +#endif
> > > > > +#ifdef CONFIG_ZONE_DMA
> > > > > + if (dev->coherent_dma_mask < DMA_BIT_MASK(32))
> > > > > + gfp |= __GFP_DMA;
> > > > > + else
> > > > > +#endif
> > > > > +#ifdef CONFIG_ZONE_DMA32
> > > > > + /* Loongson-3 only support DMA in the memory below 4GB now */
> > > > > + if (dev->coherent_dma_mask < DMA_BIT_MASK(40))
> > > >
> > > > You don't want the if here. If the device has coherent_dma_mask with
> > > > the highest bit being the 32th one or higher, you still need
> > > > __GFP_DMA32 to make sure the memory will be allocated below 4GB, as
> > > > the Loongson 3 CPU doesn't support more than 32-bit DMA.
> > > >
> > > 32-bit DMA is probably not because of Loongson-3, but because of the
> > > north-bridge configuration. In our recent experiments, we can enable
> > > 64-bit DMA by update PMON (but it need more time to prove the stability).
> > > So we keep the set_dma_mask() for future. Moreover, we need a correct
> > > dma_mask to check the DMAablity in lib/swiotlb.c.
> > >
> > > DMA_BIT_MASK(40) comes from Radeon GPU, if DMA-40 is enabled, it doesn't
> > > need to alloc pages in DMA32 zone.
> >
>
> > The description of the patch says that a Loongson system only supports
> > DMA below 4GB. We doesn't care here if the limit comes from the
> > north-bridge configuration or of the CPU. So if the limit should be 4GB,
> > __GFP_DMA32 should always be set. If there is no limit, it should be set
> > if (dev->coherent_dma_mask < DMA_BIT_MASK(64)) as the DMA32 zone is the
> > biggest zone usable when the address doesn't fit in 64 bit.
> >
> > In any case it is not really consistent to not limit the addresses to
> > 32 bits here, but limit them later in set_dma_mask().
> >
> I still think set_dma_mask() is needed. Of course we can set GFP_DMA32 in
> when alloc a page, but DMA32 zone is limited, what will happen if alloc a
> page in DMA32 zone fails? In current design, if it fails, we will alloc a
> page above 4GB and use swiotlb to bounce the page.
Yes it is exactly how swiotlb should work. The memory should be
allocated *if possible* in the DMA32 zone, and if it fails swiotlb is
used to bounce the page. However bouncing the page has a cost that
should be avoided. Not specifying GFP_DMA32, means that the memory
might be allocated anywhere, and *if we are lucky* it will be in the
DAM32 zone where not bounce buffer is needed.
> Moreover, DMA include alloc()/free() API and map_page()/unmap_page() API.
> For alloc()/free(), GFP_DMA32 can resove most of problems. However, for
> map_page()/unmap_page(), the page is probably allocated before map_page()
> and without GFP_DMA32, so cannot do DMA directly (need swiotlb to bounce).
> In this situation, device's dma_mask is used to distinguish whether bounce
> is needed (see dma_capable() which defined in
> arch/mips/include/asm/dma-mapping.h and used in lib/swiotlb.c).
Thanks for your explanation, you have convinced me that a set_dma_mask
is indeed needed. That said I do wonder if it won't be better
performance wise to replace dma_set_mask by something like:
static inline int dma_set_mask(struct device *dev, u64 mask)
{
return get_dma_ops(dev)->set_dma_mask(dev, mask);
}
And to provide a default set_dma_mask function in mm/dma-default.c.
Does anybody have an opinion about that?
> Maybe my commit message is not clear, the fact is Loongson-3A can only do
> DMA *directly* below 4GB, but can use memory above 4GB to do DMA indirectly
> (via swiotlb).
As said above, even if the use of swiotlb is possible, it has a cost
(the memory has to be copied after the DMA transfer) that should be
avoided as much as possible. That's why GFP_DMA32 should always be
specified in loongson_dma_alloc_coherent.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net
next prev parent reply other threads:[~2014-01-11 15:25 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-08 2:44 [PATCH V16 00/12] MIPS: Add Loongson-3 based machines support Huacai Chen
2014-01-08 2:44 ` [PATCH V16 01/12] MIPS: Loongson: Add basic Loongson-3 definition Huacai Chen
2014-01-08 19:58 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 02/12] MIPS: Loongson: Add basic Loongson-3 CPU support Huacai Chen
2014-01-08 19:58 ` Aurelien Jarno
[not found] ` <CAAhV-H4tx=sCk=iUwuCfnCS+rbmtu5Y_UcpAn6JXDoobA+OGrQ@mail.gmail.com>
2014-01-09 21:36 ` Aurelien Jarno
2014-01-12 9:03 ` Huacai Chen
2014-01-12 9:57 ` Aurelien Jarno
2014-01-12 12:10 ` Aaro Koskinen
2014-01-18 1:54 ` John Crispin
2014-01-08 2:44 ` [PATCH V16 03/12] MIPS: Loongson 3: Add Lemote-3A machtypes definition Huacai Chen
2014-01-08 19:58 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 04/12] MIPS: Loongson: Add UEFI-like firmware interface (LEFI) support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
2014-01-09 12:43 ` Alex Smith
2014-01-09 12:43 ` Alex Smith
[not found] ` <CAAhV-H64BXsw5CBL-KW1eqXkYcadhHF2NeBH9YmWQz046Lpzzw@mail.gmail.com>
2014-01-09 21:37 ` Aurelien Jarno
2014-01-09 12:37 ` Alex Smith
2014-01-09 12:37 ` Alex Smith
2014-01-08 2:44 ` [PATCH V16 05/12] MIPS: Loongson 3: Add HT-linked PCI support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
[not found] ` <CAAhV-H57tDmYByjVwhf3teFZkGowR4E9+OO1vO0kP3iAKTNJVw@mail.gmail.com>
2014-01-09 21:38 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 06/12] MIPS: Loongson 3: Add IRQ init and dispatch support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
2014-01-09 12:52 ` Alex Smith
2014-01-09 12:52 ` Alex Smith
2014-01-08 2:44 ` [PATCH V16 07/12] MIPS: Loongson 3: Add serial port support Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 08/12] MIPS: Loongson: Add swiotlb to support big memory (>4GB) Huacai Chen
2014-01-08 22:58 ` Aurelien Jarno
[not found] ` <CAAhV-H4h43N2OR4znwVv3miVbGkWJLapdgr9Jou1j4R8-9TRyA@mail.gmail.com>
2014-01-09 22:08 ` Aurelien Jarno
[not found] ` <CAAhV-H7WFdt-4jYG5qPV36UWJQnSfkSa2J-3CAs2+QLqwHVhuA@mail.gmail.com>
2014-01-11 15:24 ` Aurelien Jarno [this message]
2014-01-09 12:56 ` Alex Smith
2014-01-09 12:56 ` Alex Smith
[not found] ` <CAAhV-H7ZO0gNzQ5wQ-yD=NiP2AJrc3-bWLXHo-HDngf27c9+gQ@mail.gmail.com>
2014-01-11 15:25 ` Aurelien Jarno
2014-01-12 9:12 ` Huacai Chen
2014-01-12 9:57 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 09/12] MIPS: Loongson: Add Loongson-3 Kconfig options Huacai Chen
2014-01-09 13:07 ` Alex Smith
2014-01-09 13:07 ` Alex Smith
2014-01-11 15:24 ` Aurelien Jarno
2014-01-13 3:02 ` Huacai Chen
2014-01-13 4:30 ` Aurelien Jarno
2014-01-13 10:15 ` Huacai Chen
2014-01-13 10:38 ` Aurelien Jarno
2014-01-13 12:15 ` Huacai Chen
2014-01-08 2:44 ` [PATCH V16 10/12] MIPS: Loongson 3: Add Loongson-3 SMP support Huacai Chen
2014-01-09 13:19 ` Alex Smith
2014-01-09 13:19 ` Alex Smith
2014-01-08 2:44 ` [PATCH V16 11/12] MIPS: Loongson 3: Add CPU hotplug support Huacai Chen
2014-01-11 15:24 ` Aurelien Jarno
2014-01-08 2:44 ` [PATCH V16 12/12] MIPS: Loongson: Add a Loongson-3 default config file Huacai Chen
2014-01-08 7:26 ` [PATCH V16 00/12] MIPS: Add Loongson-3 based machines support John Crispin
2014-01-18 9:38 ` Andreas Barth
[not found] ` <0466fa9d60b91233d2157d5ce0b51333.squirrel@mail.lemote.com>
2014-02-06 23:27 ` Andreas Barth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140111152412.GA20717@ohm.rr44.fr \
--to=aurelien@aurel32.net \
--cc=Steven.Hill@imgtec.com \
--cc=chenhc@lemote.com \
--cc=john@phrozen.org \
--cc=linux-mips@linux-mips.org \
--cc=ralf@linux-mips.org \
--cc=taohl@lemote.com \
--cc=wuzhangjin@gmail.com \
--cc=yanh@lemote.com \
--cc=zhangfx@lemote.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.