* Re: [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver
2022-02-03 8:36 [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver Arnd Bergmann
@ 2022-02-03 8:47 ` Ard Biesheuvel
2022-02-08 12:49 ` Arnd Bergmann
2022-02-08 10:19 ` Greg KH
2022-04-06 6:32 ` Christoph Hellwig
2 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2022-02-03 8:47 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Russell King, Arnd Bergmann, Linus Walleij, Christoph Hellwig,
Laurentiu Tudor, linux-usb, Alan Stern, Linux ARM,
Linux Kernel Mailing List
On Thu, 3 Feb 2022 at 09:38, Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> The sa1111 platform is one of the two remaining users of the old Arm
> specific "dmabounce" code, which is an earlier implementation of the
> generic swiotlb.
>
> Linus Walleij submitted a patch that removes dmabounce support from
> the ixp4xx, and I had a look at the other user, which is the sa1111
> companion chip.
>
> Looking at how dmabounce is used, I could narrow it down to one driver
> one three machines:
>
> - dmabounce is only initialized on assabet/neponset, jornada720 and
> badge4, which are the platforms that have an sa1111 and support
> DMA on it.
>
> - All three of these suffer from "erratum #7" that requires only
> doing DMA to half the memory sections based on one of the address
> lines, in addition, the neponset also can't DMA to the RAM that
> is connected to sa1111 itself.
>
> - the pxa lubbock machine also has sa1111, but does not support DMA
> on it and does not set dmabounce.
>
> - only the OHCI and audio devices on sa1111 support DMA, but as
> there is no audio driver for this hardware, only OHCI remains.
>
> In the OHCI code, I noticed that two other platforms already have
> a local bounce buffer support in the form of the "local_mem"
> allocator. Specifically, TMIO and SM501 use this on a few other ARM
> boards with 16KB or 128KB of local SRAM that can be accessed from the
> OHCI and from the CPU.
>
> While this is not the same problem as on sa1111, I could not find a
> reason why we can't re-use the existing implementation but replace the
> physical SRAM address mapping with a locally allocated DMA buffer.
>
> There are two main downsides:
>
> - rather than using a dynamically sized pool, this buffer needs
> to be allocated at probe time using a fixed size. Without
> having any idea of what it should be, I picked a size of
> 64KB, which is between what the other two OHCI front-ends use
> in their SRAM. If anyone has a better idea what that size
> is reasonable, this can be trivially changed.
>
I suppose this is a problem if the driver falls back to ordinary DRAM
once the allocation runs out?
> - Previously, only USB transfers to unaddressable memory needed
> to go through the bounce buffer, now all of them do, which may
> impact runtime performance for USB endpoints that do a lot of
> transfers.
>
> On the upside, the local_mem support uses write-combining buffers,
> which should be a bit faster for transfers to the device compared to
> normal uncached coherent memory as used in dmabounce.
>
Talking from past experience using this trick on a NXP ARM9 SoC ~10
years ago, using on-chip SRAM for USB DMA likely results in a
significant performance boost, even without write combining, although
the exact scenario obviously matters.
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> Cc: linux-usb@vger.kernel.org
> Acked-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Changes in v2:
>
> - drop check for assabet, as bounce buffers are required on
> all sa1100 machines
> - select CONFIG_ZONE_DMA again
> - update comments and changelog text based on discussion
> ---
> arch/arm/common/Kconfig | 2 +-
> arch/arm/common/sa1111.c | 64 ----------------------------------
> drivers/usb/core/hcd.c | 17 +++++++--
> drivers/usb/host/ohci-sa1111.c | 25 +++++++++++++
> 4 files changed, 40 insertions(+), 68 deletions(-)
>
> diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
> index c8e198631d41..bc158fd227e1 100644
> --- a/arch/arm/common/Kconfig
> +++ b/arch/arm/common/Kconfig
> @@ -1,7 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
> config SA1111
> bool
> - select DMABOUNCE if !ARCH_PXA
> + select ZONE_DMA if ARCH_SA1100
>
> config DMABOUNCE
> bool
> diff --git a/arch/arm/common/sa1111.c b/arch/arm/common/sa1111.c
> index 7df003b149c6..a00915883f78 100644
> --- a/arch/arm/common/sa1111.c
> +++ b/arch/arm/common/sa1111.c
> @@ -1391,70 +1391,9 @@ void sa1111_driver_unregister(struct sa1111_driver *driver)
> }
> EXPORT_SYMBOL(sa1111_driver_unregister);
>
> -#ifdef CONFIG_DMABOUNCE
> -/*
> - * According to the "Intel StrongARM SA-1111 Microprocessor Companion
> - * Chip Specification Update" (June 2000), erratum #7, there is a
> - * significant bug in the SA1111 SDRAM shared memory controller. If
> - * an access to a region of memory above 1MB relative to the bank base,
> - * it is important that address bit 10 _NOT_ be asserted. Depending
> - * on the configuration of the RAM, bit 10 may correspond to one
> - * of several different (processor-relative) address bits.
> - *
> - * This routine only identifies whether or not a given DMA address
> - * is susceptible to the bug.
> - *
> - * This should only get called for sa1111_device types due to the
> - * way we configure our device dma_masks.
> - */
> -static int sa1111_needs_bounce(struct device *dev, dma_addr_t addr, size_t size)
> -{
> - /*
> - * Section 4.6 of the "Intel StrongARM SA-1111 Development Module
> - * User's Guide" mentions that jumpers R51 and R52 control the
> - * target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
> - * SDRAM bank 1 on Neponset). The default configuration selects
> - * Assabet, so any address in bank 1 is necessarily invalid.
> - */
> - return (machine_is_assabet() || machine_is_pfs168()) &&
> - (addr >= 0xc8000000 || (addr + size) >= 0xc8000000);
> -}
> -
> -static int sa1111_notifier_call(struct notifier_block *n, unsigned long action,
> - void *data)
> -{
> - struct sa1111_dev *dev = to_sa1111_device(data);
> -
> - switch (action) {
> - case BUS_NOTIFY_ADD_DEVICE:
> - if (dev->dev.dma_mask && dev->dma_mask < 0xffffffffUL) {
> - int ret = dmabounce_register_dev(&dev->dev, 1024, 4096,
> - sa1111_needs_bounce);
> - if (ret)
> - dev_err(&dev->dev, "failed to register with dmabounce: %d\n", ret);
> - }
> - break;
> -
> - case BUS_NOTIFY_DEL_DEVICE:
> - if (dev->dev.dma_mask && dev->dma_mask < 0xffffffffUL)
> - dmabounce_unregister_dev(&dev->dev);
> - break;
> - }
> - return NOTIFY_OK;
> -}
> -
> -static struct notifier_block sa1111_bus_notifier = {
> - .notifier_call = sa1111_notifier_call,
> -};
> -#endif
> -
> static int __init sa1111_init(void)
> {
> int ret = bus_register(&sa1111_bus_type);
> -#ifdef CONFIG_DMABOUNCE
> - if (ret == 0)
> - bus_register_notifier(&sa1111_bus_type, &sa1111_bus_notifier);
> -#endif
> if (ret == 0)
> platform_driver_register(&sa1111_device_driver);
> return ret;
> @@ -1463,9 +1402,6 @@ static int __init sa1111_init(void)
> static void __exit sa1111_exit(void)
> {
> platform_driver_unregister(&sa1111_device_driver);
> -#ifdef CONFIG_DMABOUNCE
> - bus_unregister_notifier(&sa1111_bus_type, &sa1111_bus_notifier);
> -#endif
> bus_unregister(&sa1111_bus_type);
> }
>
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index 3c7c64ff3c0a..8417baedc89c 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1260,7 +1260,8 @@ void usb_hcd_unlink_urb_from_ep(struct usb_hcd *hcd, struct urb *urb)
> EXPORT_SYMBOL_GPL(usb_hcd_unlink_urb_from_ep);
>
> /*
> - * Some usb host controllers can only perform dma using a small SRAM area.
> + * Some usb host controllers can only perform dma using a small SRAM area,
> + * or have restrictions on addressable DRAM.
> * The usb core itself is however optimized for host controllers that can dma
> * using regular system memory - like pci devices doing bus mastering.
> *
> @@ -3095,8 +3096,18 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd, phys_addr_t phys_addr,
> if (IS_ERR(hcd->localmem_pool))
> return PTR_ERR(hcd->localmem_pool);
>
> - local_mem = devm_memremap(hcd->self.sysdev, phys_addr,
> - size, MEMREMAP_WC);
> + /*
> + * if a physical SRAM address was passed, map it, otherwise
> + * allocate system memory as a buffer.
> + */
> + if (phys_addr)
> + local_mem = devm_memremap(hcd->self.sysdev, phys_addr,
> + size, MEMREMAP_WC);
> + else
> + local_mem = dmam_alloc_attrs(hcd->self.sysdev, size, &dma,
> + GFP_KERNEL,
> + DMA_ATTR_WRITE_COMBINE);
> +
> if (IS_ERR(local_mem))
> return PTR_ERR(local_mem);
>
> diff --git a/drivers/usb/host/ohci-sa1111.c b/drivers/usb/host/ohci-sa1111.c
> index 137f66f6977f..0da2badf0658 100644
> --- a/drivers/usb/host/ohci-sa1111.c
> +++ b/drivers/usb/host/ohci-sa1111.c
> @@ -206,6 +206,31 @@ static int ohci_hcd_sa1111_probe(struct sa1111_dev *dev)
> goto err1;
> }
>
> + /*
> + * According to the "Intel StrongARM SA-1111 Microprocessor Companion
> + * Chip Specification Update" (June 2000), erratum #7, there is a
> + * significant bug in the SA1111 SDRAM shared memory controller. If
> + * an access to a region of memory above 1MB relative to the bank base,
> + * it is important that address bit 10 _NOT_ be asserted. Depending
> + * on the configuration of the RAM, bit 10 may correspond to one
> + * of several different (processor-relative) address bits.
> + *
> + * Section 4.6 of the "Intel StrongARM SA-1111 Development Module
> + * User's Guide" mentions that jumpers R51 and R52 control the
> + * target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
> + * SDRAM bank 1 on Neponset). The default configuration selects
> + * Assabet, so any address in bank 1 is necessarily invalid.
> + *
> + * As a workaround, use a bounce buffer in addressable memory
> + * as local_mem, relying on ZONE_DMA to provide an area that
> + * fits within the above constraints.
> + *
> + * SZ_64K is an estimate for what size this might need.
> + */
> + ret = usb_hcd_setup_local_mem(hcd, 0, 0, SZ_64K);
> + if (ret)
> + goto err1;
> +
> if (!request_mem_region(hcd->rsrc_start, hcd->rsrc_len, hcd_name)) {
> dev_dbg(&dev->dev, "request_mem_region failed\n");
> ret = -EBUSY;
> --
> 2.29.2
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver
2022-02-03 8:47 ` Ard Biesheuvel
@ 2022-02-08 12:49 ` Arnd Bergmann
2022-02-08 13:35 ` Ard Biesheuvel
0 siblings, 1 reply; 7+ messages in thread
From: Arnd Bergmann @ 2022-02-08 12:49 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Russell King, Arnd Bergmann, Linus Walleij, Christoph Hellwig,
Laurentiu Tudor, USB list, Alan Stern, Linux ARM,
Linux Kernel Mailing List
On Thu, Feb 3, 2022 at 9:47 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> On Thu, 3 Feb 2022 at 09:38, Arnd Bergmann <arnd@kernel.org> wrote:
>
> > There are two main downsides:
> >
> > - rather than using a dynamically sized pool, this buffer needs
> > to be allocated at probe time using a fixed size. Without
> > having any idea of what it should be, I picked a size of
> > 64KB, which is between what the other two OHCI front-ends use
> > in their SRAM. If anyone has a better idea what that size
> > is reasonable, this can be trivially changed.
> >
>
> I suppose this is a problem if the driver falls back to ordinary DRAM
> once the allocation runs out?
From what I can tell, there is no such fallback. If the localmem_pool
runs out, the allocation fails, which may cause other problems, but
it never falls back to the wrong DMA address.
> > - Previously, only USB transfers to unaddressable memory needed
> > to go through the bounce buffer, now all of them do, which may
> > impact runtime performance for USB endpoints that do a lot of
> > transfers.
> >
> > On the upside, the local_mem support uses write-combining buffers,
> > which should be a bit faster for transfers to the device compared to
> > normal uncached coherent memory as used in dmabounce.
> >
>
> Talking from past experience using this trick on a NXP ARM9 SoC ~10
> years ago, using on-chip SRAM for USB DMA likely results in a
> significant performance boost, even without write combining, although
> the exact scenario obviously matters.
Right, that makes sense, but it won't help here because there is
no SRAM. One detail I noticed is that the localmem pool normally
gets mapped as WC, which is what I did in the new code as well, but
dma_alloc_flags(..., DMA_ATTR_WRITE_COMBINE) does not always
honor this flag. I think it will do it here because a GFP_KERNEL
allocation should be served by the remap_allocator, while
GFP_ATOMIC allocations would be served by pool_allocator_alloc(),
which ignores the flag.
Arnd
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver
2022-02-08 12:49 ` Arnd Bergmann
@ 2022-02-08 13:35 ` Ard Biesheuvel
0 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2022-02-08 13:35 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Russell King, Arnd Bergmann, Linus Walleij, Christoph Hellwig,
Laurentiu Tudor, USB list, Alan Stern, Linux ARM,
Linux Kernel Mailing List
On Tue, 8 Feb 2022 at 13:49, Arnd Bergmann <arnd@kernel.org> wrote:
>
> On Thu, Feb 3, 2022 at 9:47 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > On Thu, 3 Feb 2022 at 09:38, Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > > There are two main downsides:
> > >
> > > - rather than using a dynamically sized pool, this buffer needs
> > > to be allocated at probe time using a fixed size. Without
> > > having any idea of what it should be, I picked a size of
> > > 64KB, which is between what the other two OHCI front-ends use
> > > in their SRAM. If anyone has a better idea what that size
> > > is reasonable, this can be trivially changed.
> > >
> >
> > I suppose this is a problem if the driver falls back to ordinary DRAM
> > once the allocation runs out?
>
> From what I can tell, there is no such fallback. If the localmem_pool
> runs out, the allocation fails, which may cause other problems, but
> it never falls back to the wrong DMA address.
>
OK that is the least bad outcome I suppose.
> > > - Previously, only USB transfers to unaddressable memory needed
> > > to go through the bounce buffer, now all of them do, which may
> > > impact runtime performance for USB endpoints that do a lot of
> > > transfers.
> > >
> > > On the upside, the local_mem support uses write-combining buffers,
> > > which should be a bit faster for transfers to the device compared to
> > > normal uncached coherent memory as used in dmabounce.
> > >
> >
> > Talking from past experience using this trick on a NXP ARM9 SoC ~10
> > years ago, using on-chip SRAM for USB DMA likely results in a
> > significant performance boost, even without write combining, although
> > the exact scenario obviously matters.
>
> Right, that makes sense, but it won't help here because there is
> no SRAM. One detail I noticed is that the localmem pool normally
> gets mapped as WC, which is what I did in the new code as well, but
> dma_alloc_flags(..., DMA_ATTR_WRITE_COMBINE) does not always
> honor this flag. I think it will do it here because a GFP_KERNEL
> allocation should be served by the remap_allocator, while
> GFP_ATOMIC allocations would be served by pool_allocator_alloc(),
> which ignores the flag.
>
Ah yes, ignore me. For some reason, I thought this was about on-chip SRAM.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver
2022-02-03 8:36 [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver Arnd Bergmann
2022-02-03 8:47 ` Ard Biesheuvel
@ 2022-02-08 10:19 ` Greg KH
2022-04-06 6:32 ` Christoph Hellwig
2 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2022-02-08 10:19 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Russell King, Arnd Bergmann, Linus Walleij, Christoph Hellwig,
Laurentiu Tudor, linux-usb, Alan Stern, linux-arm-kernel,
linux-kernel
On Thu, Feb 03, 2022 at 09:36:33AM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> The sa1111 platform is one of the two remaining users of the old Arm
> specific "dmabounce" code, which is an earlier implementation of the
> generic swiotlb.
>
> Linus Walleij submitted a patch that removes dmabounce support from
> the ixp4xx, and I had a look at the other user, which is the sa1111
> companion chip.
>
> Looking at how dmabounce is used, I could narrow it down to one driver
> one three machines:
>
> - dmabounce is only initialized on assabet/neponset, jornada720 and
> badge4, which are the platforms that have an sa1111 and support
> DMA on it.
>
> - All three of these suffer from "erratum #7" that requires only
> doing DMA to half the memory sections based on one of the address
> lines, in addition, the neponset also can't DMA to the RAM that
> is connected to sa1111 itself.
>
> - the pxa lubbock machine also has sa1111, but does not support DMA
> on it and does not set dmabounce.
>
> - only the OHCI and audio devices on sa1111 support DMA, but as
> there is no audio driver for this hardware, only OHCI remains.
>
> In the OHCI code, I noticed that two other platforms already have
> a local bounce buffer support in the form of the "local_mem"
> allocator. Specifically, TMIO and SM501 use this on a few other ARM
> boards with 16KB or 128KB of local SRAM that can be accessed from the
> OHCI and from the CPU.
>
> While this is not the same problem as on sa1111, I could not find a
> reason why we can't re-use the existing implementation but replace the
> physical SRAM address mapping with a locally allocated DMA buffer.
>
> There are two main downsides:
>
> - rather than using a dynamically sized pool, this buffer needs
> to be allocated at probe time using a fixed size. Without
> having any idea of what it should be, I picked a size of
> 64KB, which is between what the other two OHCI front-ends use
> in their SRAM. If anyone has a better idea what that size
> is reasonable, this can be trivially changed.
>
> - Previously, only USB transfers to unaddressable memory needed
> to go through the bounce buffer, now all of them do, which may
> impact runtime performance for USB endpoints that do a lot of
> transfers.
>
> On the upside, the local_mem support uses write-combining buffers,
> which should be a bit faster for transfers to the device compared to
> normal uncached coherent memory as used in dmabounce.
>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> Cc: linux-usb@vger.kernel.org
> Acked-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> Changes in v2:
>
> - drop check for assabet, as bounce buffers are required on
> all sa1100 machines
> - select CONFIG_ZONE_DMA again
> - update comments and changelog text based on discussion
> ---
> arch/arm/common/Kconfig | 2 +-
> arch/arm/common/sa1111.c | 64 ----------------------------------
> drivers/usb/core/hcd.c | 17 +++++++--
> drivers/usb/host/ohci-sa1111.c | 25 +++++++++++++
> 4 files changed, 40 insertions(+), 68 deletions(-)
>
> diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
> index c8e198631d41..bc158fd227e1 100644
> --- a/arch/arm/common/Kconfig
> +++ b/arch/arm/common/Kconfig
> @@ -1,7 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
> config SA1111
> bool
> - select DMABOUNCE if !ARCH_PXA
> + select ZONE_DMA if ARCH_SA1100
>
> config DMABOUNCE
> bool
> diff --git a/arch/arm/common/sa1111.c b/arch/arm/common/sa1111.c
> index 7df003b149c6..a00915883f78 100644
> --- a/arch/arm/common/sa1111.c
> +++ b/arch/arm/common/sa1111.c
> @@ -1391,70 +1391,9 @@ void sa1111_driver_unregister(struct sa1111_driver *driver)
> }
> EXPORT_SYMBOL(sa1111_driver_unregister);
>
> -#ifdef CONFIG_DMABOUNCE
> -/*
> - * According to the "Intel StrongARM SA-1111 Microprocessor Companion
> - * Chip Specification Update" (June 2000), erratum #7, there is a
> - * significant bug in the SA1111 SDRAM shared memory controller. If
> - * an access to a region of memory above 1MB relative to the bank base,
> - * it is important that address bit 10 _NOT_ be asserted. Depending
> - * on the configuration of the RAM, bit 10 may correspond to one
> - * of several different (processor-relative) address bits.
> - *
> - * This routine only identifies whether or not a given DMA address
> - * is susceptible to the bug.
> - *
> - * This should only get called for sa1111_device types due to the
> - * way we configure our device dma_masks.
> - */
> -static int sa1111_needs_bounce(struct device *dev, dma_addr_t addr, size_t size)
> -{
> - /*
> - * Section 4.6 of the "Intel StrongARM SA-1111 Development Module
> - * User's Guide" mentions that jumpers R51 and R52 control the
> - * target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
> - * SDRAM bank 1 on Neponset). The default configuration selects
> - * Assabet, so any address in bank 1 is necessarily invalid.
> - */
> - return (machine_is_assabet() || machine_is_pfs168()) &&
> - (addr >= 0xc8000000 || (addr + size) >= 0xc8000000);
> -}
> -
> -static int sa1111_notifier_call(struct notifier_block *n, unsigned long action,
> - void *data)
> -{
> - struct sa1111_dev *dev = to_sa1111_device(data);
> -
> - switch (action) {
> - case BUS_NOTIFY_ADD_DEVICE:
> - if (dev->dev.dma_mask && dev->dma_mask < 0xffffffffUL) {
> - int ret = dmabounce_register_dev(&dev->dev, 1024, 4096,
> - sa1111_needs_bounce);
> - if (ret)
> - dev_err(&dev->dev, "failed to register with dmabounce: %d\n", ret);
> - }
> - break;
> -
> - case BUS_NOTIFY_DEL_DEVICE:
> - if (dev->dev.dma_mask && dev->dma_mask < 0xffffffffUL)
> - dmabounce_unregister_dev(&dev->dev);
> - break;
> - }
> - return NOTIFY_OK;
> -}
> -
> -static struct notifier_block sa1111_bus_notifier = {
> - .notifier_call = sa1111_notifier_call,
> -};
> -#endif
> -
> static int __init sa1111_init(void)
> {
> int ret = bus_register(&sa1111_bus_type);
> -#ifdef CONFIG_DMABOUNCE
> - if (ret == 0)
> - bus_register_notifier(&sa1111_bus_type, &sa1111_bus_notifier);
> -#endif
> if (ret == 0)
> platform_driver_register(&sa1111_device_driver);
> return ret;
> @@ -1463,9 +1402,6 @@ static int __init sa1111_init(void)
> static void __exit sa1111_exit(void)
> {
> platform_driver_unregister(&sa1111_device_driver);
> -#ifdef CONFIG_DMABOUNCE
> - bus_unregister_notifier(&sa1111_bus_type, &sa1111_bus_notifier);
> -#endif
> bus_unregister(&sa1111_bus_type);
> }
>
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index 3c7c64ff3c0a..8417baedc89c 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1260,7 +1260,8 @@ void usb_hcd_unlink_urb_from_ep(struct usb_hcd *hcd, struct urb *urb)
> EXPORT_SYMBOL_GPL(usb_hcd_unlink_urb_from_ep);
>
> /*
> - * Some usb host controllers can only perform dma using a small SRAM area.
> + * Some usb host controllers can only perform dma using a small SRAM area,
> + * or have restrictions on addressable DRAM.
> * The usb core itself is however optimized for host controllers that can dma
> * using regular system memory - like pci devices doing bus mastering.
> *
> @@ -3095,8 +3096,18 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd, phys_addr_t phys_addr,
> if (IS_ERR(hcd->localmem_pool))
> return PTR_ERR(hcd->localmem_pool);
>
> - local_mem = devm_memremap(hcd->self.sysdev, phys_addr,
> - size, MEMREMAP_WC);
> + /*
> + * if a physical SRAM address was passed, map it, otherwise
> + * allocate system memory as a buffer.
> + */
> + if (phys_addr)
> + local_mem = devm_memremap(hcd->self.sysdev, phys_addr,
> + size, MEMREMAP_WC);
> + else
> + local_mem = dmam_alloc_attrs(hcd->self.sysdev, size, &dma,
> + GFP_KERNEL,
> + DMA_ATTR_WRITE_COMBINE);
> +
> if (IS_ERR(local_mem))
> return PTR_ERR(local_mem);
>
> diff --git a/drivers/usb/host/ohci-sa1111.c b/drivers/usb/host/ohci-sa1111.c
> index 137f66f6977f..0da2badf0658 100644
> --- a/drivers/usb/host/ohci-sa1111.c
> +++ b/drivers/usb/host/ohci-sa1111.c
> @@ -206,6 +206,31 @@ static int ohci_hcd_sa1111_probe(struct sa1111_dev *dev)
> goto err1;
> }
>
> + /*
> + * According to the "Intel StrongARM SA-1111 Microprocessor Companion
> + * Chip Specification Update" (June 2000), erratum #7, there is a
> + * significant bug in the SA1111 SDRAM shared memory controller. If
> + * an access to a region of memory above 1MB relative to the bank base,
> + * it is important that address bit 10 _NOT_ be asserted. Depending
> + * on the configuration of the RAM, bit 10 may correspond to one
> + * of several different (processor-relative) address bits.
> + *
> + * Section 4.6 of the "Intel StrongARM SA-1111 Development Module
> + * User's Guide" mentions that jumpers R51 and R52 control the
> + * target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
> + * SDRAM bank 1 on Neponset). The default configuration selects
> + * Assabet, so any address in bank 1 is necessarily invalid.
> + *
> + * As a workaround, use a bounce buffer in addressable memory
> + * as local_mem, relying on ZONE_DMA to provide an area that
> + * fits within the above constraints.
> + *
> + * SZ_64K is an estimate for what size this might need.
> + */
> + ret = usb_hcd_setup_local_mem(hcd, 0, 0, SZ_64K);
> + if (ret)
> + goto err1;
> +
> if (!request_mem_region(hcd->rsrc_start, hcd->rsrc_len, hcd_name)) {
> dev_dbg(&dev->dev, "request_mem_region failed\n");
> ret = -EBUSY;
> --
> 2.29.2
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver
2022-02-03 8:36 [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver Arnd Bergmann
2022-02-03 8:47 ` Ard Biesheuvel
2022-02-08 10:19 ` Greg KH
@ 2022-04-06 6:32 ` Christoph Hellwig
2022-04-06 7:38 ` Arnd Bergmann
2 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-04-06 6:32 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Russell King, Arnd Bergmann, Linus Walleij, Christoph Hellwig,
Laurentiu Tudor, linux-usb, Alan Stern, linux-arm-kernel,
linux-kernel
Just curious, where does this stand currently?
On Thu, Feb 03, 2022 at 09:36:33AM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> The sa1111 platform is one of the two remaining users of the old Arm
> specific "dmabounce" code, which is an earlier implementation of the
> generic swiotlb.
>
> Linus Walleij submitted a patch that removes dmabounce support from
> the ixp4xx, and I had a look at the other user, which is the sa1111
> companion chip.
>
> Looking at how dmabounce is used, I could narrow it down to one driver
> one three machines:
>
> - dmabounce is only initialized on assabet/neponset, jornada720 and
> badge4, which are the platforms that have an sa1111 and support
> DMA on it.
>
> - All three of these suffer from "erratum #7" that requires only
> doing DMA to half the memory sections based on one of the address
> lines, in addition, the neponset also can't DMA to the RAM that
> is connected to sa1111 itself.
>
> - the pxa lubbock machine also has sa1111, but does not support DMA
> on it and does not set dmabounce.
>
> - only the OHCI and audio devices on sa1111 support DMA, but as
> there is no audio driver for this hardware, only OHCI remains.
>
> In the OHCI code, I noticed that two other platforms already have
> a local bounce buffer support in the form of the "local_mem"
> allocator. Specifically, TMIO and SM501 use this on a few other ARM
> boards with 16KB or 128KB of local SRAM that can be accessed from the
> OHCI and from the CPU.
>
> While this is not the same problem as on sa1111, I could not find a
> reason why we can't re-use the existing implementation but replace the
> physical SRAM address mapping with a locally allocated DMA buffer.
>
> There are two main downsides:
>
> - rather than using a dynamically sized pool, this buffer needs
> to be allocated at probe time using a fixed size. Without
> having any idea of what it should be, I picked a size of
> 64KB, which is between what the other two OHCI front-ends use
> in their SRAM. If anyone has a better idea what that size
> is reasonable, this can be trivially changed.
>
> - Previously, only USB transfers to unaddressable memory needed
> to go through the bounce buffer, now all of them do, which may
> impact runtime performance for USB endpoints that do a lot of
> transfers.
>
> On the upside, the local_mem support uses write-combining buffers,
> which should be a bit faster for transfers to the device compared to
> normal uncached coherent memory as used in dmabounce.
>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> Cc: linux-usb@vger.kernel.org
> Acked-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Changes in v2:
>
> - drop check for assabet, as bounce buffers are required on
> all sa1100 machines
> - select CONFIG_ZONE_DMA again
> - update comments and changelog text based on discussion
> ---
> arch/arm/common/Kconfig | 2 +-
> arch/arm/common/sa1111.c | 64 ----------------------------------
> drivers/usb/core/hcd.c | 17 +++++++--
> drivers/usb/host/ohci-sa1111.c | 25 +++++++++++++
> 4 files changed, 40 insertions(+), 68 deletions(-)
>
> diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
> index c8e198631d41..bc158fd227e1 100644
> --- a/arch/arm/common/Kconfig
> +++ b/arch/arm/common/Kconfig
> @@ -1,7 +1,7 @@
> # SPDX-License-Identifier: GPL-2.0
> config SA1111
> bool
> - select DMABOUNCE if !ARCH_PXA
> + select ZONE_DMA if ARCH_SA1100
>
> config DMABOUNCE
> bool
> diff --git a/arch/arm/common/sa1111.c b/arch/arm/common/sa1111.c
> index 7df003b149c6..a00915883f78 100644
> --- a/arch/arm/common/sa1111.c
> +++ b/arch/arm/common/sa1111.c
> @@ -1391,70 +1391,9 @@ void sa1111_driver_unregister(struct sa1111_driver *driver)
> }
> EXPORT_SYMBOL(sa1111_driver_unregister);
>
> -#ifdef CONFIG_DMABOUNCE
> -/*
> - * According to the "Intel StrongARM SA-1111 Microprocessor Companion
> - * Chip Specification Update" (June 2000), erratum #7, there is a
> - * significant bug in the SA1111 SDRAM shared memory controller. If
> - * an access to a region of memory above 1MB relative to the bank base,
> - * it is important that address bit 10 _NOT_ be asserted. Depending
> - * on the configuration of the RAM, bit 10 may correspond to one
> - * of several different (processor-relative) address bits.
> - *
> - * This routine only identifies whether or not a given DMA address
> - * is susceptible to the bug.
> - *
> - * This should only get called for sa1111_device types due to the
> - * way we configure our device dma_masks.
> - */
> -static int sa1111_needs_bounce(struct device *dev, dma_addr_t addr, size_t size)
> -{
> - /*
> - * Section 4.6 of the "Intel StrongARM SA-1111 Development Module
> - * User's Guide" mentions that jumpers R51 and R52 control the
> - * target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
> - * SDRAM bank 1 on Neponset). The default configuration selects
> - * Assabet, so any address in bank 1 is necessarily invalid.
> - */
> - return (machine_is_assabet() || machine_is_pfs168()) &&
> - (addr >= 0xc8000000 || (addr + size) >= 0xc8000000);
> -}
> -
> -static int sa1111_notifier_call(struct notifier_block *n, unsigned long action,
> - void *data)
> -{
> - struct sa1111_dev *dev = to_sa1111_device(data);
> -
> - switch (action) {
> - case BUS_NOTIFY_ADD_DEVICE:
> - if (dev->dev.dma_mask && dev->dma_mask < 0xffffffffUL) {
> - int ret = dmabounce_register_dev(&dev->dev, 1024, 4096,
> - sa1111_needs_bounce);
> - if (ret)
> - dev_err(&dev->dev, "failed to register with dmabounce: %d\n", ret);
> - }
> - break;
> -
> - case BUS_NOTIFY_DEL_DEVICE:
> - if (dev->dev.dma_mask && dev->dma_mask < 0xffffffffUL)
> - dmabounce_unregister_dev(&dev->dev);
> - break;
> - }
> - return NOTIFY_OK;
> -}
> -
> -static struct notifier_block sa1111_bus_notifier = {
> - .notifier_call = sa1111_notifier_call,
> -};
> -#endif
> -
> static int __init sa1111_init(void)
> {
> int ret = bus_register(&sa1111_bus_type);
> -#ifdef CONFIG_DMABOUNCE
> - if (ret == 0)
> - bus_register_notifier(&sa1111_bus_type, &sa1111_bus_notifier);
> -#endif
> if (ret == 0)
> platform_driver_register(&sa1111_device_driver);
> return ret;
> @@ -1463,9 +1402,6 @@ static int __init sa1111_init(void)
> static void __exit sa1111_exit(void)
> {
> platform_driver_unregister(&sa1111_device_driver);
> -#ifdef CONFIG_DMABOUNCE
> - bus_unregister_notifier(&sa1111_bus_type, &sa1111_bus_notifier);
> -#endif
> bus_unregister(&sa1111_bus_type);
> }
>
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index 3c7c64ff3c0a..8417baedc89c 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1260,7 +1260,8 @@ void usb_hcd_unlink_urb_from_ep(struct usb_hcd *hcd, struct urb *urb)
> EXPORT_SYMBOL_GPL(usb_hcd_unlink_urb_from_ep);
>
> /*
> - * Some usb host controllers can only perform dma using a small SRAM area.
> + * Some usb host controllers can only perform dma using a small SRAM area,
> + * or have restrictions on addressable DRAM.
> * The usb core itself is however optimized for host controllers that can dma
> * using regular system memory - like pci devices doing bus mastering.
> *
> @@ -3095,8 +3096,18 @@ int usb_hcd_setup_local_mem(struct usb_hcd *hcd, phys_addr_t phys_addr,
> if (IS_ERR(hcd->localmem_pool))
> return PTR_ERR(hcd->localmem_pool);
>
> - local_mem = devm_memremap(hcd->self.sysdev, phys_addr,
> - size, MEMREMAP_WC);
> + /*
> + * if a physical SRAM address was passed, map it, otherwise
> + * allocate system memory as a buffer.
> + */
> + if (phys_addr)
> + local_mem = devm_memremap(hcd->self.sysdev, phys_addr,
> + size, MEMREMAP_WC);
> + else
> + local_mem = dmam_alloc_attrs(hcd->self.sysdev, size, &dma,
> + GFP_KERNEL,
> + DMA_ATTR_WRITE_COMBINE);
> +
> if (IS_ERR(local_mem))
> return PTR_ERR(local_mem);
>
> diff --git a/drivers/usb/host/ohci-sa1111.c b/drivers/usb/host/ohci-sa1111.c
> index 137f66f6977f..0da2badf0658 100644
> --- a/drivers/usb/host/ohci-sa1111.c
> +++ b/drivers/usb/host/ohci-sa1111.c
> @@ -206,6 +206,31 @@ static int ohci_hcd_sa1111_probe(struct sa1111_dev *dev)
> goto err1;
> }
>
> + /*
> + * According to the "Intel StrongARM SA-1111 Microprocessor Companion
> + * Chip Specification Update" (June 2000), erratum #7, there is a
> + * significant bug in the SA1111 SDRAM shared memory controller. If
> + * an access to a region of memory above 1MB relative to the bank base,
> + * it is important that address bit 10 _NOT_ be asserted. Depending
> + * on the configuration of the RAM, bit 10 may correspond to one
> + * of several different (processor-relative) address bits.
> + *
> + * Section 4.6 of the "Intel StrongARM SA-1111 Development Module
> + * User's Guide" mentions that jumpers R51 and R52 control the
> + * target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
> + * SDRAM bank 1 on Neponset). The default configuration selects
> + * Assabet, so any address in bank 1 is necessarily invalid.
> + *
> + * As a workaround, use a bounce buffer in addressable memory
> + * as local_mem, relying on ZONE_DMA to provide an area that
> + * fits within the above constraints.
> + *
> + * SZ_64K is an estimate for what size this might need.
> + */
> + ret = usb_hcd_setup_local_mem(hcd, 0, 0, SZ_64K);
> + if (ret)
> + goto err1;
> +
> if (!request_mem_region(hcd->rsrc_start, hcd->rsrc_len, hcd_name)) {
> dev_dbg(&dev->dev, "request_mem_region failed\n");
> ret = -EBUSY;
> --
> 2.29.2
>
---end quoted text---
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] [v2] ARM: sa1100/assabet: move dmabounce hack to ohci driver
2022-04-06 6:32 ` Christoph Hellwig
@ 2022-04-06 7:38 ` Arnd Bergmann
0 siblings, 0 replies; 7+ messages in thread
From: Arnd Bergmann @ 2022-04-06 7:38 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Russell King, Arnd Bergmann, Linus Walleij, Laurentiu Tudor,
USB list, Alan Stern, Linux ARM, Linux Kernel Mailing List
On Wed, Apr 6, 2022 at 8:32 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> Just curious, where does this stand currently?
I am not aware of any remaining problems, but I am waiting for Russell to review
it again. Russell, can you have a look now, so we can try to get this into 4.19?
Let me know if you need me to resend the patch, or have a look at the archive[1]
link for the current version [1].
Arnd
[1] https://lore.kernel.org/all/20220203083658.559803-1-arnd@kernel.org/
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread