* memcpy alignment
@ 2015-12-15 4:11 Jon Masters
2015-12-15 9:34 ` Catalin Marinas
0 siblings, 1 reply; 9+ messages in thread
From: Jon Masters @ 2015-12-15 4:11 UTC (permalink / raw)
To: linux-arm-kernel
Hi Folks,
What's supposed to happen if the natural alignment of the pointers dst
and src is not the same? We don't seem to handle that case today, and
instead will base our access data type on the source alignment only.
Jon.
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 4:11 memcpy alignment Jon Masters
@ 2015-12-15 9:34 ` Catalin Marinas
2015-12-15 15:24 ` Jon Masters
0 siblings, 1 reply; 9+ messages in thread
From: Catalin Marinas @ 2015-12-15 9:34 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote:
> What's supposed to happen if the natural alignment of the pointers dst
> and src is not the same? We don't seem to handle that case today, and
> instead will base our access data type on the source alignment only.
The hardware takes care of the other one. The memcpy behaviour in the
kernel is the same as the glibc one (the Cortex Strings library).
--
Catalin
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 9:34 ` Catalin Marinas
@ 2015-12-15 15:24 ` Jon Masters
2015-12-15 15:32 ` Russell King - ARM Linux
0 siblings, 1 reply; 9+ messages in thread
From: Jon Masters @ 2015-12-15 15:24 UTC (permalink / raw)
To: linux-arm-kernel
On 12/15/2015 04:34 AM, Catalin Marinas wrote:
> On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote:
>> What's supposed to happen if the natural alignment of the pointers dst
>> and src is not the same? We don't seem to handle that case today, and
>> instead will base our access data type on the source alignment only.
>
> The hardware takes care of the other one. The memcpy behaviour in the
> kernel is the same as the glibc one (the Cortex Strings library).
Not if you're dealing with Device memory. I accept that one could always
ensure that there's a non-Device mapping (I got the other replies) but I
don't think it's unreasonable to expect a memcpy to work in the presence
of misaligned addresses either.
Jon.
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 15:24 ` Jon Masters
@ 2015-12-15 15:32 ` Russell King - ARM Linux
2015-12-15 15:43 ` Jon Masters
0 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2015-12-15 15:32 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Dec 15, 2015 at 10:24:13AM -0500, Jon Masters wrote:
> On 12/15/2015 04:34 AM, Catalin Marinas wrote:
> > On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote:
> >> What's supposed to happen if the natural alignment of the pointers dst
> >> and src is not the same? We don't seem to handle that case today, and
> >> instead will base our access data type on the source alignment only.
> >
> > The hardware takes care of the other one. The memcpy behaviour in the
> > kernel is the same as the glibc one (the Cortex Strings library).
>
> Not if you're dealing with Device memory. I accept that one could always
> ensure that there's a non-Device mapping (I got the other replies) but I
> don't think it's unreasonable to expect a memcpy to work in the presence
> of misaligned addresses either.
Using memcpy() on memory returned from functions which setup IO/MMIO
mappings has always been outlawed. It's the whole reason we have
things like memcpy_toio(), memcpy_fromio() and memset_io(), which
date back a few decades now, and they exist for exactly these kinds
of reasons.
If you get an __iomem pointer, then you must respect that it
essentially can not be non-dereferenced, and you must use one of the
standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/
memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API
contract you implicitly signed up to by using something like ioremap()
or other mapping which gives you an iomem mapping.
--
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 15:32 ` Russell King - ARM Linux
@ 2015-12-15 15:43 ` Jon Masters
2015-12-15 15:52 ` Mark Rutland
2015-12-15 16:09 ` Leif Lindholm
0 siblings, 2 replies; 9+ messages in thread
From: Jon Masters @ 2015-12-15 15:43 UTC (permalink / raw)
To: linux-arm-kernel
On 12/15/2015 10:32 AM, Russell King - ARM Linux wrote:
> On Tue, Dec 15, 2015 at 10:24:13AM -0500, Jon Masters wrote:
>> On 12/15/2015 04:34 AM, Catalin Marinas wrote:
>>> On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote:
>>>> What's supposed to happen if the natural alignment of the pointers dst
>>>> and src is not the same? We don't seem to handle that case today, and
>>>> instead will base our access data type on the source alignment only.
>>>
>>> The hardware takes care of the other one. The memcpy behaviour in the
>>> kernel is the same as the glibc one (the Cortex Strings library).
>>
>> Not if you're dealing with Device memory. I accept that one could always
>> ensure that there's a non-Device mapping (I got the other replies) but I
>> don't think it's unreasonable to expect a memcpy to work in the presence
>> of misaligned addresses either.
>
> Using memcpy() on memory returned from functions which setup IO/MMIO
> mappings has always been outlawed. It's the whole reason we have
> things like memcpy_toio(), memcpy_fromio() and memset_io(), which
> date back a few decades now, and they exist for exactly these kinds
> of reasons.
>
> If you get an __iomem pointer, then you must respect that it
> essentially can not be non-dereferenced, and you must use one of the
> standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/
> memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API
> contract you implicitly signed up to by using something like ioremap()
> or other mapping which gives you an iomem mapping.
Thanks Russell. If it's definitely never allowed then the existing x86
code needs to be fixed to use an IO access function in that case. I get
that those accessors are there for this reason, but I wanted to make
sure that we don't ever expect to touch Device memory any other way (for
example, conflicting mappings between a VM and hypervisor). I am certain
there's other non-ACPI code that is going to have this happen :)
Jon.
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 15:43 ` Jon Masters
@ 2015-12-15 15:52 ` Mark Rutland
2015-12-15 16:09 ` Leif Lindholm
1 sibling, 0 replies; 9+ messages in thread
From: Mark Rutland @ 2015-12-15 15:52 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote:
> On 12/15/2015 10:32 AM, Russell King - ARM Linux wrote:
> > On Tue, Dec 15, 2015 at 10:24:13AM -0500, Jon Masters wrote:
> >> On 12/15/2015 04:34 AM, Catalin Marinas wrote:
> >>> On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote:
> >>>> What's supposed to happen if the natural alignment of the pointers dst
> >>>> and src is not the same? We don't seem to handle that case today, and
> >>>> instead will base our access data type on the source alignment only.
> >>>
> >>> The hardware takes care of the other one. The memcpy behaviour in the
> >>> kernel is the same as the glibc one (the Cortex Strings library).
> >>
> >> Not if you're dealing with Device memory. I accept that one could always
> >> ensure that there's a non-Device mapping (I got the other replies) but I
> >> don't think it's unreasonable to expect a memcpy to work in the presence
> >> of misaligned addresses either.
> >
> > Using memcpy() on memory returned from functions which setup IO/MMIO
> > mappings has always been outlawed. It's the whole reason we have
> > things like memcpy_toio(), memcpy_fromio() and memset_io(), which
> > date back a few decades now, and they exist for exactly these kinds
> > of reasons.
> >
> > If you get an __iomem pointer, then you must respect that it
> > essentially can not be non-dereferenced, and you must use one of the
> > standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/
> > memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API
> > contract you implicitly signed up to by using something like ioremap()
> > or other mapping which gives you an iomem mapping.
>
> Thanks Russell. If it's definitely never allowed then the existing x86
> code needs to be fixed to use an IO access function in that case. I get
> that those accessors are there for this reason, but I wanted to make
> sure that we don't ever expect to touch Device memory any other way (for
> example, conflicting mappings between a VM and hypervisor). I am certain
> there's other non-ACPI code that is going to have this happen :)
It's important to note that much of the ACPI code was written prior to
the availability of memremap.
Depending on the case, the better solution may be to use memremap. I
would expect this to be the case for any static tables of information,
at least.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 15:43 ` Jon Masters
2015-12-15 15:52 ` Mark Rutland
@ 2015-12-15 16:09 ` Leif Lindholm
2015-12-15 16:28 ` Jon Masters
1 sibling, 1 reply; 9+ messages in thread
From: Leif Lindholm @ 2015-12-15 16:09 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote:
> > If you get an __iomem pointer, then you must respect that it
> > essentially can not be non-dereferenced, and you must use one of the
> > standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/
> > memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API
> > contract you implicitly signed up to by using something like ioremap()
> > or other mapping which gives you an iomem mapping.
>
> Thanks Russell. If it's definitely never allowed then the existing x86
> code needs to be fixed to use an IO access function in that case. I get
> that those accessors are there for this reason, but I wanted to make
> sure that we don't ever expect to touch Device memory any other way (for
> example, conflicting mappings between a VM and hypervisor). I am certain
> there's other non-ACPI code that is going to have this happen :)
A lot of code that has never run on anything other than x86 will have
such issues.
Tracking the use of page_is_ram() around the kernel, looking at what
it does for different architectures, and looking at how its (not
formalised) semantics are interpreted can also be quite unsettling.
/
Leif
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 16:09 ` Leif Lindholm
@ 2015-12-15 16:28 ` Jon Masters
2015-12-15 16:47 ` Måns Rullgård
0 siblings, 1 reply; 9+ messages in thread
From: Jon Masters @ 2015-12-15 16:28 UTC (permalink / raw)
To: linux-arm-kernel
On 12/15/2015 11:09 AM, Leif Lindholm wrote:
> On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote:
>>> If you get an __iomem pointer, then you must respect that it
>>> essentially can not be non-dereferenced, and you must use one of the
>>> standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/
>>> memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API
>>> contract you implicitly signed up to by using something like ioremap()
>>> or other mapping which gives you an iomem mapping.
>>
>> Thanks Russell. If it's definitely never allowed then the existing x86
>> code needs to be fixed to use an IO access function in that case. I get
>> that those accessors are there for this reason, but I wanted to make
>> sure that we don't ever expect to touch Device memory any other way (for
>> example, conflicting mappings between a VM and hypervisor). I am certain
>> there's other non-ACPI code that is going to have this happen :)
>
> A lot of code that has never run on anything other than x86 will have
> such issues.
>
> Tracking the use of page_is_ram() around the kernel, looking at what
> it does for different architectures, and looking at how its (not
> formalised) semantics are interpreted can also be quite unsettling.
Yeah. That was the reason I didn't just change the existing initrd code
in the first place (wanted to leave it as is). I *did not know* memcpy
to/from Device memory was explicitly banned (and I get why, and I do
know one is supposed to map Device memory as such, etc. etc.) for this
reason. However it's actually very reasonable to demand correctness
going in. If it were hacked/kludged to paper over the situation I
describe it would stand even less chance of being fixed.
I would /separately/ note that there's an inefficiency in that the
existing code relies upon assumed equal alignment between src/dst so the
hardware is probably doing a lot of silent unaligned writes.
Jon.
^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment
2015-12-15 16:28 ` Jon Masters
@ 2015-12-15 16:47 ` Måns Rullgård
0 siblings, 0 replies; 9+ messages in thread
From: Måns Rullgård @ 2015-12-15 16:47 UTC (permalink / raw)
To: linux-arm-kernel
Jon Masters <jcm@redhat.com> writes:
> On 12/15/2015 11:09 AM, Leif Lindholm wrote:
>> On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote:
>>>> If you get an __iomem pointer, then you must respect that it
>>>> essentially can not be non-dereferenced, and you must use one of the
>>>> standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/
>>>> memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API
>>>> contract you implicitly signed up to by using something like ioremap()
>>>> or other mapping which gives you an iomem mapping.
>>>
>>> Thanks Russell. If it's definitely never allowed then the existing x86
>>> code needs to be fixed to use an IO access function in that case. I get
>>> that those accessors are there for this reason, but I wanted to make
>>> sure that we don't ever expect to touch Device memory any other way (for
>>> example, conflicting mappings between a VM and hypervisor). I am certain
>>> there's other non-ACPI code that is going to have this happen :)
>>
>> A lot of code that has never run on anything other than x86 will have
>> such issues.
>>
>> Tracking the use of page_is_ram() around the kernel, looking at what
>> it does for different architectures, and looking at how its (not
>> formalised) semantics are interpreted can also be quite unsettling.
>
> Yeah. That was the reason I didn't just change the existing initrd code
> in the first place (wanted to leave it as is). I *did not know* memcpy
> to/from Device memory was explicitly banned (and I get why, and I do
> know one is supposed to map Device memory as such, etc. etc.) for this
> reason.
Additionally to alignment constraints, IO regions often allow only
certain access sizes, and memcpy() doesn't make any promises as to what
it might do.
> I would /separately/ note that there's an inefficiency in that the
> existing code relies upon assumed equal alignment between src/dst so the
> hardware is probably doing a lot of silent unaligned writes.
This is the most efficient way. Manually shifting things around to get
both reads and writes aligned costs more than letting the hardware
handle one side. An unaligned store typically costs on average one
cycle more than an aligned store. On some hardware (e.g. Cortex-A8), an
unaligned store within a cache line is free but one that crosses cache
lines needs several extra cycles.
--
M?ns Rullg?rd
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-12-15 16:47 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-15 4:11 memcpy alignment Jon Masters
2015-12-15 9:34 ` Catalin Marinas
2015-12-15 15:24 ` Jon Masters
2015-12-15 15:32 ` Russell King - ARM Linux
2015-12-15 15:43 ` Jon Masters
2015-12-15 15:52 ` Mark Rutland
2015-12-15 16:09 ` Leif Lindholm
2015-12-15 16:28 ` Jon Masters
2015-12-15 16:47 ` Måns Rullgård
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).