* memcpy alignment @ 2015-12-15 4:11 Jon Masters 2015-12-15 9:34 ` Catalin Marinas 0 siblings, 1 reply; 9+ messages in thread From: Jon Masters @ 2015-12-15 4:11 UTC (permalink / raw) To: linux-arm-kernel Hi Folks, What's supposed to happen if the natural alignment of the pointers dst and src is not the same? We don't seem to handle that case today, and instead will base our access data type on the source alignment only. Jon. ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 4:11 memcpy alignment Jon Masters @ 2015-12-15 9:34 ` Catalin Marinas 2015-12-15 15:24 ` Jon Masters 0 siblings, 1 reply; 9+ messages in thread From: Catalin Marinas @ 2015-12-15 9:34 UTC (permalink / raw) To: linux-arm-kernel On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote: > What's supposed to happen if the natural alignment of the pointers dst > and src is not the same? We don't seem to handle that case today, and > instead will base our access data type on the source alignment only. The hardware takes care of the other one. The memcpy behaviour in the kernel is the same as the glibc one (the Cortex Strings library). -- Catalin ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 9:34 ` Catalin Marinas @ 2015-12-15 15:24 ` Jon Masters 2015-12-15 15:32 ` Russell King - ARM Linux 0 siblings, 1 reply; 9+ messages in thread From: Jon Masters @ 2015-12-15 15:24 UTC (permalink / raw) To: linux-arm-kernel On 12/15/2015 04:34 AM, Catalin Marinas wrote: > On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote: >> What's supposed to happen if the natural alignment of the pointers dst >> and src is not the same? We don't seem to handle that case today, and >> instead will base our access data type on the source alignment only. > > The hardware takes care of the other one. The memcpy behaviour in the > kernel is the same as the glibc one (the Cortex Strings library). Not if you're dealing with Device memory. I accept that one could always ensure that there's a non-Device mapping (I got the other replies) but I don't think it's unreasonable to expect a memcpy to work in the presence of misaligned addresses either. Jon. ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 15:24 ` Jon Masters @ 2015-12-15 15:32 ` Russell King - ARM Linux 2015-12-15 15:43 ` Jon Masters 0 siblings, 1 reply; 9+ messages in thread From: Russell King - ARM Linux @ 2015-12-15 15:32 UTC (permalink / raw) To: linux-arm-kernel On Tue, Dec 15, 2015 at 10:24:13AM -0500, Jon Masters wrote: > On 12/15/2015 04:34 AM, Catalin Marinas wrote: > > On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote: > >> What's supposed to happen if the natural alignment of the pointers dst > >> and src is not the same? We don't seem to handle that case today, and > >> instead will base our access data type on the source alignment only. > > > > The hardware takes care of the other one. The memcpy behaviour in the > > kernel is the same as the glibc one (the Cortex Strings library). > > Not if you're dealing with Device memory. I accept that one could always > ensure that there's a non-Device mapping (I got the other replies) but I > don't think it's unreasonable to expect a memcpy to work in the presence > of misaligned addresses either. Using memcpy() on memory returned from functions which setup IO/MMIO mappings has always been outlawed. It's the whole reason we have things like memcpy_toio(), memcpy_fromio() and memset_io(), which date back a few decades now, and they exist for exactly these kinds of reasons. If you get an __iomem pointer, then you must respect that it essentially can not be non-dereferenced, and you must use one of the standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/ memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API contract you implicitly signed up to by using something like ioremap() or other mapping which gives you an iomem mapping. -- RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 15:32 ` Russell King - ARM Linux @ 2015-12-15 15:43 ` Jon Masters 2015-12-15 15:52 ` Mark Rutland 2015-12-15 16:09 ` Leif Lindholm 0 siblings, 2 replies; 9+ messages in thread From: Jon Masters @ 2015-12-15 15:43 UTC (permalink / raw) To: linux-arm-kernel On 12/15/2015 10:32 AM, Russell King - ARM Linux wrote: > On Tue, Dec 15, 2015 at 10:24:13AM -0500, Jon Masters wrote: >> On 12/15/2015 04:34 AM, Catalin Marinas wrote: >>> On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote: >>>> What's supposed to happen if the natural alignment of the pointers dst >>>> and src is not the same? We don't seem to handle that case today, and >>>> instead will base our access data type on the source alignment only. >>> >>> The hardware takes care of the other one. The memcpy behaviour in the >>> kernel is the same as the glibc one (the Cortex Strings library). >> >> Not if you're dealing with Device memory. I accept that one could always >> ensure that there's a non-Device mapping (I got the other replies) but I >> don't think it's unreasonable to expect a memcpy to work in the presence >> of misaligned addresses either. > > Using memcpy() on memory returned from functions which setup IO/MMIO > mappings has always been outlawed. It's the whole reason we have > things like memcpy_toio(), memcpy_fromio() and memset_io(), which > date back a few decades now, and they exist for exactly these kinds > of reasons. > > If you get an __iomem pointer, then you must respect that it > essentially can not be non-dereferenced, and you must use one of the > standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/ > memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API > contract you implicitly signed up to by using something like ioremap() > or other mapping which gives you an iomem mapping. Thanks Russell. If it's definitely never allowed then the existing x86 code needs to be fixed to use an IO access function in that case. I get that those accessors are there for this reason, but I wanted to make sure that we don't ever expect to touch Device memory any other way (for example, conflicting mappings between a VM and hypervisor). I am certain there's other non-ACPI code that is going to have this happen :) Jon. ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 15:43 ` Jon Masters @ 2015-12-15 15:52 ` Mark Rutland 2015-12-15 16:09 ` Leif Lindholm 1 sibling, 0 replies; 9+ messages in thread From: Mark Rutland @ 2015-12-15 15:52 UTC (permalink / raw) To: linux-arm-kernel On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote: > On 12/15/2015 10:32 AM, Russell King - ARM Linux wrote: > > On Tue, Dec 15, 2015 at 10:24:13AM -0500, Jon Masters wrote: > >> On 12/15/2015 04:34 AM, Catalin Marinas wrote: > >>> On Mon, Dec 14, 2015 at 11:11:02PM -0500, Jon Masters wrote: > >>>> What's supposed to happen if the natural alignment of the pointers dst > >>>> and src is not the same? We don't seem to handle that case today, and > >>>> instead will base our access data type on the source alignment only. > >>> > >>> The hardware takes care of the other one. The memcpy behaviour in the > >>> kernel is the same as the glibc one (the Cortex Strings library). > >> > >> Not if you're dealing with Device memory. I accept that one could always > >> ensure that there's a non-Device mapping (I got the other replies) but I > >> don't think it's unreasonable to expect a memcpy to work in the presence > >> of misaligned addresses either. > > > > Using memcpy() on memory returned from functions which setup IO/MMIO > > mappings has always been outlawed. It's the whole reason we have > > things like memcpy_toio(), memcpy_fromio() and memset_io(), which > > date back a few decades now, and they exist for exactly these kinds > > of reasons. > > > > If you get an __iomem pointer, then you must respect that it > > essentially can not be non-dereferenced, and you must use one of the > > standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/ > > memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API > > contract you implicitly signed up to by using something like ioremap() > > or other mapping which gives you an iomem mapping. > > Thanks Russell. If it's definitely never allowed then the existing x86 > code needs to be fixed to use an IO access function in that case. I get > that those accessors are there for this reason, but I wanted to make > sure that we don't ever expect to touch Device memory any other way (for > example, conflicting mappings between a VM and hypervisor). I am certain > there's other non-ACPI code that is going to have this happen :) It's important to note that much of the ACPI code was written prior to the availability of memremap. Depending on the case, the better solution may be to use memremap. I would expect this to be the case for any static tables of information, at least. Thanks, Mark. ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 15:43 ` Jon Masters 2015-12-15 15:52 ` Mark Rutland @ 2015-12-15 16:09 ` Leif Lindholm 2015-12-15 16:28 ` Jon Masters 1 sibling, 1 reply; 9+ messages in thread From: Leif Lindholm @ 2015-12-15 16:09 UTC (permalink / raw) To: linux-arm-kernel On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote: > > If you get an __iomem pointer, then you must respect that it > > essentially can not be non-dereferenced, and you must use one of the > > standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/ > > memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API > > contract you implicitly signed up to by using something like ioremap() > > or other mapping which gives you an iomem mapping. > > Thanks Russell. If it's definitely never allowed then the existing x86 > code needs to be fixed to use an IO access function in that case. I get > that those accessors are there for this reason, but I wanted to make > sure that we don't ever expect to touch Device memory any other way (for > example, conflicting mappings between a VM and hypervisor). I am certain > there's other non-ACPI code that is going to have this happen :) A lot of code that has never run on anything other than x86 will have such issues. Tracking the use of page_is_ram() around the kernel, looking at what it does for different architectures, and looking at how its (not formalised) semantics are interpreted can also be quite unsettling. / Leif ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 16:09 ` Leif Lindholm @ 2015-12-15 16:28 ` Jon Masters 2015-12-15 16:47 ` Måns Rullgård 0 siblings, 1 reply; 9+ messages in thread From: Jon Masters @ 2015-12-15 16:28 UTC (permalink / raw) To: linux-arm-kernel On 12/15/2015 11:09 AM, Leif Lindholm wrote: > On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote: >>> If you get an __iomem pointer, then you must respect that it >>> essentially can not be non-dereferenced, and you must use one of the >>> standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/ >>> memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API >>> contract you implicitly signed up to by using something like ioremap() >>> or other mapping which gives you an iomem mapping. >> >> Thanks Russell. If it's definitely never allowed then the existing x86 >> code needs to be fixed to use an IO access function in that case. I get >> that those accessors are there for this reason, but I wanted to make >> sure that we don't ever expect to touch Device memory any other way (for >> example, conflicting mappings between a VM and hypervisor). I am certain >> there's other non-ACPI code that is going to have this happen :) > > A lot of code that has never run on anything other than x86 will have > such issues. > > Tracking the use of page_is_ram() around the kernel, looking at what > it does for different architectures, and looking at how its (not > formalised) semantics are interpreted can also be quite unsettling. Yeah. That was the reason I didn't just change the existing initrd code in the first place (wanted to leave it as is). I *did not know* memcpy to/from Device memory was explicitly banned (and I get why, and I do know one is supposed to map Device memory as such, etc. etc.) for this reason. However it's actually very reasonable to demand correctness going in. If it were hacked/kludged to paper over the situation I describe it would stand even less chance of being fixed. I would /separately/ note that there's an inefficiency in that the existing code relies upon assumed equal alignment between src/dst so the hardware is probably doing a lot of silent unaligned writes. Jon. ^ permalink raw reply [flat|nested] 9+ messages in thread
* memcpy alignment 2015-12-15 16:28 ` Jon Masters @ 2015-12-15 16:47 ` Måns Rullgård 0 siblings, 0 replies; 9+ messages in thread From: Måns Rullgård @ 2015-12-15 16:47 UTC (permalink / raw) To: linux-arm-kernel Jon Masters <jcm@redhat.com> writes: > On 12/15/2015 11:09 AM, Leif Lindholm wrote: >> On Tue, Dec 15, 2015 at 10:43:03AM -0500, Jon Masters wrote: >>>> If you get an __iomem pointer, then you must respect that it >>>> essentially can not be non-dereferenced, and you must use one of the >>>> standard kernel accessors (read[bwl]/ioread*/write[bwl]/iowrite*/ >>>> memcpy_fromio/memcpy_toio/memset_io) to access it. That's the API >>>> contract you implicitly signed up to by using something like ioremap() >>>> or other mapping which gives you an iomem mapping. >>> >>> Thanks Russell. If it's definitely never allowed then the existing x86 >>> code needs to be fixed to use an IO access function in that case. I get >>> that those accessors are there for this reason, but I wanted to make >>> sure that we don't ever expect to touch Device memory any other way (for >>> example, conflicting mappings between a VM and hypervisor). I am certain >>> there's other non-ACPI code that is going to have this happen :) >> >> A lot of code that has never run on anything other than x86 will have >> such issues. >> >> Tracking the use of page_is_ram() around the kernel, looking at what >> it does for different architectures, and looking at how its (not >> formalised) semantics are interpreted can also be quite unsettling. > > Yeah. That was the reason I didn't just change the existing initrd code > in the first place (wanted to leave it as is). I *did not know* memcpy > to/from Device memory was explicitly banned (and I get why, and I do > know one is supposed to map Device memory as such, etc. etc.) for this > reason. Additionally to alignment constraints, IO regions often allow only certain access sizes, and memcpy() doesn't make any promises as to what it might do. > I would /separately/ note that there's an inefficiency in that the > existing code relies upon assumed equal alignment between src/dst so the > hardware is probably doing a lot of silent unaligned writes. This is the most efficient way. Manually shifting things around to get both reads and writes aligned costs more than letting the hardware handle one side. An unaligned store typically costs on average one cycle more than an aligned store. On some hardware (e.g. Cortex-A8), an unaligned store within a cache line is free but one that crosses cache lines needs several extra cycles. -- M?ns Rullg?rd ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-12-15 16:47 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-12-15 4:11 memcpy alignment Jon Masters 2015-12-15 9:34 ` Catalin Marinas 2015-12-15 15:24 ` Jon Masters 2015-12-15 15:32 ` Russell King - ARM Linux 2015-12-15 15:43 ` Jon Masters 2015-12-15 15:52 ` Mark Rutland 2015-12-15 16:09 ` Leif Lindholm 2015-12-15 16:28 ` Jon Masters 2015-12-15 16:47 ` Måns Rullgård
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).