* arm64 memcpy_{from|to}io and memset_io
From: Radha Mohan @ 2015-10-14 6:12 UTC
To: linux-arm-kernel
Hi,
I see that memcpy_{from|to}io and memset_io are not implemented in an
optimized manner. I guess they are just copied from
arch/arm/include/asm/io.h, where there could be problems with different
implementations.
Do we still need these to use byte writes?
Can we convert them to use a more optimized memcpy?
We have some drivers, such as a framebuffer driver, that use these
functions and end up writing byte by byte. This causes very poor VGA
performance.
Let me know if there are any concerns about converting these to use memcpy.
I can send a patch.
regards,
Radha Mohan
* arm64 memcpy_{from|to}io and memset_io
From: Arnd Bergmann @ 2015-10-14 8:17 UTC
To: linux-arm-kernel
On Tuesday 13 October 2015 23:12:18 Radha Mohan wrote:
> Hi,
> I see that memcpy_{from|to}io and memset_io are not implemented in an
> optimized manner. I guess they are just copied from
> arch/arm/include/asm/io.h, where there could be problems with different
> implementations.
> Do we still need these to use byte writes?
No.
> Can we convert them to use a more optimized memcpy?
Yes.
> We have some drivers, such as a framebuffer driver, that use these
> functions and end up writing byte by byte. This causes very poor VGA
> performance.
>
> Let me know if there are any concerns about converting these to use memcpy.
> I can send a patch.
A few things to watch out for:
- you cannot use a static inline to do the job, because gcc might
  replace a plain memcpy() with unaligned pointer dereferences
  that are not allowed on __iomem
- when providing an external implementation of the functions, make sure
  they honor the alignment as well
- I think you need the same barriers that readl/writel have, but only
  at the start/end of the loop, not in the middle.
Arnd
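
[Editorial sketch, not part of the original message.] The three points above can be illustrated with a userspace approximation. Everything named sketch_* is hypothetical, and atomic_thread_fence() is only a stand-in for whatever barriers readl/writel use on a given architecture; it is an assumption, not the kernel's implementation. In a real tree the function would live in its own .c file rather than in a header as a static inline.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the I/O barriers; not a kernel API. */
static inline void sketch_io_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Meant to be an external (out-of-line) function, so the compiler cannot
 * replace the body with a plain memcpy() that may access the device
 * region through unaligned pointers.
 */
void sketch_memcpy_toio(volatile void *to, const void *from, size_t count)
{
	volatile uint8_t *dst = to;
	const uint8_t *src = from;

	sketch_io_barrier();		/* once, before the copy loop */

	/* Byte stores until the device-side pointer is 8-byte aligned. */
	while (count && ((uintptr_t)dst & 7)) {
		*dst++ = *src++;
		count--;
	}

	/* Naturally aligned 64-bit stores, with no barrier in between. */
	while (count >= 8) {
		uint64_t v;

		memcpy(&v, src, sizeof(v));	/* source is ordinary memory */
		*(volatile uint64_t *)dst = v;
		dst += 8;
		src += 8;
		count -= 8;
	}

	/* Byte stores for the tail. */
	while (count) {
		*dst++ = *src++;
		count--;
	}

	sketch_io_barrier();		/* once, after the copy loop */
}
```

The source side is loaded through memcpy() into a local variable so that only the device-side pointer has to be aligned; whether a real implementation would also require the memory-side pointer to be aligned is a separate design choice.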
* arm64 memcpy_{from|to}io and memset_io
From: Catalin Marinas @ 2015-10-14 16:12 UTC
To: linux-arm-kernel
On Tue, Oct 13, 2015 at 11:12:18PM -0700, Radha Mohan wrote:
> I see that memcpy_{from|to}io and memset_io are not implemented in an
> optimized manner. I guess they are just copied from
> arch/arm/include/asm/io.h, where there could be problems with different
> implementations.
I think you may be looking at an older kernel version. In the latest
mainline, memcpy_*io functions are more optimised in the sense that they
use 64-bit accesses if the alignment permits.
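
[Editorial sketch, not part of the original message.] The idea of using 64-bit accesses only when the alignment permits can be sketched in userspace as follows. This is not the mainline source; the function name, the extra stats parameter and the struct are hypothetical and exist only to make the access pattern visible.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counters, for illustration only. */
struct sketch_io_stats {
	size_t byte_reads;
	size_t qword_reads;
};

void sketch_memcpy_fromio(void *to, const volatile void *from, size_t count,
			  struct sketch_io_stats *stats)
{
	uint8_t *dst = to;
	const volatile uint8_t *src = from;

	/* Byte reads until the device-side pointer is 8-byte aligned. */
	while (count && ((uintptr_t)src & 7)) {
		*dst++ = *src++;
		count--;
		stats->byte_reads++;
	}

	/* Naturally aligned 64-bit reads for the bulk of the buffer. */
	while (count >= 8) {
		uint64_t v = *(const volatile uint64_t *)src;

		memcpy(dst, &v, sizeof(v));	/* destination is ordinary memory */
		dst += 8;
		src += 8;
		count -= 8;
		stats->qword_reads++;
	}

	/* Byte reads for whatever is left. */
	while (count) {
		*dst++ = *src++;
		count--;
		stats->byte_reads++;
	}
}
```

For a 19-byte copy that starts 5 bytes past an 8-byte boundary, this performs 3 byte reads followed by 2 64-bit reads, instead of 19 byte reads.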
> Do we still need these to use byte writes?
No but see above.
> Can we convert them to use a more optimized memcpy?
There is a risk to converting them to something like memcpy() as the
latter does not guarantee aligned accesses. Alignment is mandatory for
Device memory access.
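
[Editorial sketch, not part of the original message.] The same alignment rule applies to memset_io. A minimal userspace sketch, assuming a hypothetical sketch_memset_io() and an 8-byte access width, replicates the fill byte into a 64-bit word and stores that word only at naturally aligned addresses:

```c
#include <stddef.h>
#include <stdint.h>

void sketch_memset_io(volatile void *dst, int c, size_t count)
{
	volatile uint8_t *p = dst;
	/* Replicate the fill byte into all eight byte lanes of a 64-bit word. */
	uint64_t qc = (uint8_t)c * UINT64_C(0x0101010101010101);

	/* Byte stores until the pointer is 8-byte aligned. */
	while (count && ((uintptr_t)p & 7)) {
		*p++ = (uint8_t)c;
		count--;
	}

	/* Naturally aligned 64-bit stores. */
	while (count >= 8) {
		*(volatile uint64_t *)p = qc;
		p += 8;
		count -= 8;
	}

	/* Byte stores for the tail. */
	while (count) {
		*p++ = (uint8_t)c;
		count--;
	}
}
```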
> We have some drivers, such as a framebuffer driver, that use these
> functions and end up writing byte by byte. This causes very poor VGA
> performance.
You probably have an old kernel version.
--
Catalin
* arm64 memcpy_{from|to}io and memset_io
From: Radha Mohan @ 2015-10-14 16:16 UTC
To: linux-arm-kernel
On Wed, Oct 14, 2015 at 9:12 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Tue, Oct 13, 2015 at 11:12:18PM -0700, Radha Mohan wrote:
>> I see that memcpy_{from|to}io and memset_io are not implemented in an
>> optimized manner. I guess they are just copied from
>> arch/arm/include/asm/io.h, where there could be problems with different
>> implementations.
>
> I think you may be looking at an older kernel version. In the latest
> mainline, memcpy_*io functions are more optimised in the sense that they
> use 64-bit accesses if the alignment permits.
>
>> Do we still need these to use byte writes?
>
> No but see above.
>
>> Can we convert them to use a more optimized memcpy?
>
> There is a risk to converting them to something like memcpy() as the
> latter does not guarantee aligned accesses. Alignment is mandatory for
> Device memory access.
>
>> We have some drivers, such as a framebuffer driver, that use these
>> functions and end up writing byte by byte. This causes very poor VGA
>> performance.
>
> You probably have an old kernel version.
Yes, I was alternating between old and new kernels. The newer implementation
is much better. I will try that. Thanks.
>
> --
> Catalin