* mm: get_user_pages_fast() @ 2013-11-05 22:43 Michael McTernan 2013-11-06 10:19 ` Will Deacon 2013-11-06 10:54 ` Russell King - ARM Linux 0 siblings, 2 replies; 7+ messages in thread From: Michael McTernan @ 2013-11-05 22:43 UTC (permalink / raw) To: linux-arm-kernel Hi all, I think there's a problem on ARM with futex calls to FUTEX_WAIT and similar for addresses that have been mapped to userspace from /dev/mem or a device which sets VM_IO or VM_PFNMAP. This can break pthread_mutex_lock() if placed on such mapped memory and attributed as process-shared. The cause looks to be that ARM doesn't yet implement get_user_pages_fast(), which is called in get_futex_key() for a non-process private futex. On ARM the weak wrapper function in mm/util.c is used as a compatibility fallback. This fallback takes the mm->mmap_sem lock, and calls through to get_user_pages(). Unfortunately get_user_pages() then returns -EFAULT for pages marked with VM_IO | VM_PFNMAP set. Memory which has been mapped to userspace through /dev/mem is correctly flagged in this way. get_user_pages_fast() on other architectures, as implemented in arch/.../gup.c doesn't appear have the same checks for VM_IO or VM_PFNMAP, so futex functionality can work. Implementing get_user_pages_fast() to the same spec on ARM should fix this. The weak get_user_pages_fast() may then also need fixing or eventually removing since its behaviour isn't the same as get_user_pages_fast() in all cases and may be causing bugs where it us used. Bugs in locking are not very nice. I've seen a couple of patches go by that provide get_user_pages_fast() on ARM, but none yet accepted. I therefore would like to highlight this particular case for context, and also to perhaps generate some movement on this issue, which I also reported here: https://bugzilla.kernel.org/show_bug.cgi?id=64321 For reference, here are some previous patches and discussion on adding this function: http://lists.infradead.org/pipermail/linux-arm-kernel/2013-April/162115.html https://lkml.org/lkml/2013/10/3/529 http://patches.linaro.org/21130/ Kind Regards, Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
* mm: get_user_pages_fast() 2013-11-05 22:43 mm: get_user_pages_fast() Michael McTernan @ 2013-11-06 10:19 ` Will Deacon 2013-11-06 10:54 ` Russell King - ARM Linux 1 sibling, 0 replies; 7+ messages in thread From: Will Deacon @ 2013-11-06 10:19 UTC (permalink / raw) To: linux-arm-kernel On Tue, Nov 05, 2013 at 10:43:15PM +0000, Michael McTernan wrote: > Hi all, Hi Mike, > I think there's a problem on ARM with futex calls to FUTEX_WAIT and > similar for addresses that have been mapped to userspace from /dev/mem > or a device which sets VM_IO or VM_PFNMAP. This can break > pthread_mutex_lock() if placed on such mapped memory and attributed as > process-shared. > > The cause looks to be that ARM doesn't yet implement > get_user_pages_fast(), which is called in get_futex_key() for a > non-process private futex. [...] > I've seen a couple of patches go by that provide get_user_pages_fast() > on ARM, but none yet accepted. I therefore would like to highlight this > particular case for context, and also to perhaps generate some movement > on this issue, which I also reported here: > > https://bugzilla.kernel.org/show_bug.cgi?id=64321 The latest patches from Steve are here: http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/205951.html but we need some feedback from the MM guys regarding the atomics in patch 2 (since we don't require IPIs for TLB invalidation). I'd also like to see some numbers comparing the two schemes (since we could add a dummy IPI to our huge TLB flushing code if it's cheaper). Will ^ permalink raw reply [flat|nested] 7+ messages in thread
* mm: get_user_pages_fast() 2013-11-05 22:43 mm: get_user_pages_fast() Michael McTernan 2013-11-06 10:19 ` Will Deacon @ 2013-11-06 10:54 ` Russell King - ARM Linux 2013-11-06 11:07 ` Will Deacon 2013-11-06 19:38 ` Michael McTernan 1 sibling, 2 replies; 7+ messages in thread From: Russell King - ARM Linux @ 2013-11-06 10:54 UTC (permalink / raw) To: linux-arm-kernel On Tue, Nov 05, 2013 at 10:43:15PM +0000, Michael McTernan wrote: > I think there's a problem on ARM with futex calls to FUTEX_WAIT and > similar for addresses that have been mapped to userspace from /dev/mem > or a device which sets VM_IO or VM_PFNMAP. This can break > pthread_mutex_lock() if placed on such mapped memory and attributed as > process-shared. Firstly, please stop this madness. Placing futexes in IO memory is not supported, period. You can't map random bits of /dev/mem and expect this stuff to work. (a) you're generally not allowed to map kernel memory via /dev/mem, so that rules out all kernel managed memory. (b) you are allowed to map memory outside of that, but you get it as strongly ordered memory. (b) means that the load/store exclusives, which userspace mutexes will use on ARMv6+, will not work correctly. (they're not supported to strongly ordered memory by the architecture.) In other words, don't put mutexes in memory you've remapped from /dev/mem. In fact, as an application, you should not be mapping /dev/mem at all. ^ permalink raw reply [flat|nested] 7+ messages in thread
* mm: get_user_pages_fast() 2013-11-06 10:54 ` Russell King - ARM Linux @ 2013-11-06 11:07 ` Will Deacon 2013-11-06 19:38 ` Michael McTernan 1 sibling, 0 replies; 7+ messages in thread From: Will Deacon @ 2013-11-06 11:07 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 06, 2013 at 10:54:00AM +0000, Russell King - ARM Linux wrote: > On Tue, Nov 05, 2013 at 10:43:15PM +0000, Michael McTernan wrote: > > I think there's a problem on ARM with futex calls to FUTEX_WAIT and > > similar for addresses that have been mapped to userspace from /dev/mem > > or a device which sets VM_IO or VM_PFNMAP. This can break > > pthread_mutex_lock() if placed on such mapped memory and attributed as > > process-shared. > > Firstly, please stop this madness. Placing futexes in IO memory is not > supported, period. You can't map random bits of /dev/mem and expect > this stuff to work. Yikes, I didn't notice that when I read the original post! > (a) you're generally not allowed to map kernel memory via /dev/mem, so > that rules out all kernel managed memory. > (b) you are allowed to map memory outside of that, but you get it as > strongly ordered memory. > > (b) means that the load/store exclusives, which userspace mutexes will > use on ARMv6+, will not work correctly. (they're not supported to > strongly ordered memory by the architecture.) > > In other words, don't put mutexes in memory you've remapped from /dev/mem. > In fact, as an application, you should not be mapping /dev/mem at all. Completely agreed. As an aside, fastgup is still something we want on ARM (it is required to support futexes on transparent huge pages, which is what I originally assumed this was about), so my previous comments about Steve's patches still stand. Will ^ permalink raw reply [flat|nested] 7+ messages in thread
* mm: get_user_pages_fast() 2013-11-06 10:54 ` Russell King - ARM Linux 2013-11-06 11:07 ` Will Deacon @ 2013-11-06 19:38 ` Michael McTernan 2013-11-06 19:57 ` Russell King - ARM Linux 1 sibling, 1 reply; 7+ messages in thread From: Michael McTernan @ 2013-11-06 19:38 UTC (permalink / raw) To: linux-arm-kernel On 11/06/2013 10:54 AM, Russell King - ARM Linux wrote: > Firstly, please stop this madness. In the context of an embedded system, where memory is shared between CPUs, DSP, FPGAs and some peripherals, and where the SCU is maintaining coherency on selected areas, this is a feature of the platform, not madness. > Placing futexes in IO memory is not supported, period. Except the _PRIVATE futex calls do work fine in IO memory since they don't walk the process page tables to determine the 'key' from the address. These a limited to only sync between threads in the same processes though. > You can't map random bits of /dev/mem and expect > this stuff to work. Nothing I am mapping is random. > (b) you are allowed to map memory outside of that, but you get it as > strongly ordered memory. > > (b) means that the load/store exclusives, which userspace mutexes will > use on ARMv6+, will not work correctly. > (they're not supported to strongly ordered memory by the architecture.) I don't see this stated in the ARM Architecture Reference Manual. Instead I see it saying that it is implementation dependent whether LDREX and STREX work on strongly ordered memory (section A3.4.5). > In other words, don't put mutexes in memory you've remapped from /dev/mem. > In fact, as an application, you should not be mapping /dev/mem at all. It's certainly more standard to map the mutexes() out of something like file backed mmap(), but that isn't as elegant for my application. To give a bit of background, what we have is some memory at a know address which controls a peripheral. Part of the peripheral addresses are buffers and registers, but some is set aside for use by software. Being able to map this whole section into userspace makes for really fast fast-paths where the driver is basically outside the kernel. Also keeping driver state variables in the peripheral RAM means things can be restarted if the Linux processes crash or die, or even if the whole of Linux were to be restarted for some reason. So it's more than convenient to have the mutexes work there, though as an easier goal I'll probably just get the futex calls to work for my applications. Hopefully this doesn't sound as mad as first assumed & thanks for your time. Kind Regards, Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
* mm: get_user_pages_fast() 2013-11-06 19:38 ` Michael McTernan @ 2013-11-06 19:57 ` Russell King - ARM Linux 2013-11-06 22:31 ` Michael McTernan 0 siblings, 1 reply; 7+ messages in thread From: Russell King - ARM Linux @ 2013-11-06 19:57 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 06, 2013 at 07:38:33PM +0000, Michael McTernan wrote: > On 11/06/2013 10:54 AM, Russell King - ARM Linux wrote: >> Firstly, please stop this madness. > > In the context of an embedded system, where memory is shared between > CPUs, DSP, FPGAs and some peripherals, and where the SCU is maintaining > coherency on selected areas, this is a feature of the platform, not > madness. Please expose the memory in a saner way rather than re-using /dev/mem. /dev/mem is all well and good, but it has side effects with the way it maps memory. Having your own driver gives you control over that. It's also better practice because it avoids a whole load of security issues. > > Placing futexes in IO memory is not supported, period. > > Except the _PRIVATE futex calls do work fine in IO memory since they > don't walk the process page tables to determine the 'key' from the > address. These a limited to only sync between threads in the same > processes though. Sorry, I should have said mutexes. The point here is that userspace pthreads can ldrex/strex on that memory before issuing any futex calls. > > You can't map random bits of /dev/mem and expect > > this stuff to work. > > Nothing I am mapping is random. As far as I'm concerned, it's random areas of system memory - random in the sense that it's not well defined what it is and what attributes it should have in the page tables for things like memory type etc. I somehow doubt that you've thought about this at all, you just decided that opening /dev/mem is a solution to your problem. >> (b) means that the load/store exclusives, which userspace mutexes will >> use on ARMv6+, will not work correctly. >> (they're not supported to strongly ordered memory by the architecture.) > > I don't see this stated in the ARM Architecture Reference Manual. > Instead I see it saying that it is implementation dependent whether > LDREX and STREX work on strongly ordered memory (section A3.4.5). Here's the latest wording: It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be performed to a memory region with the Device or Strongly-ordered memory attribute. Unless the implementation documentation explicitly states that LDREX and STREX operations to a memory region with the Device or Strongly-ordered attribute are permitted, the effect of such operations is UNPREDICTABLE. In other words, it is only supported where it is explicitly documented that it works on these memory types; otherwise it must be assumed that they do not work... However, that concern goes away if you provide your own device driver in Linux to provide _just_ a mechanism to map this memory. Plus, you can restrict it to just being able to map this specific memory without exposing the entire system RAM to anything which happens to exploit your application. >> In other words, don't put mutexes in memory you've remapped from /dev/mem. >> In fact, as an application, you should not be mapping /dev/mem at all. > > It's certainly more standard to map the mutexes() out of something like > file backed mmap(), but that isn't as elegant for my application. > > To give a bit of background, what we have is some memory at a know > address which controls a peripheral. Part of the peripheral addresses > are buffers and registers, but some is set aside for use by software. > Being able to map this whole section into userspace makes for really > fast fast-paths where the driver is basically outside the kernel. Also > keeping driver state variables in the peripheral RAM means things can be > restarted if the Linux processes crash or die, or even if the whole of > Linux were to be restarted for some reason. So it's more than > convenient to have the mutexes work there, though as an easier goal I'll > probably just get the futex calls to work for my applications. So, what happens if Linux crashes having taken a pthread mutex in this region, and your application then comes along and tries to regain that lock? Do you always re-initialise the mutexes (you technically have to if you read the man page for pthread_mutex_lock() which tells you about a whole raft of undefined behaviours concerning the pthread mutex implementation.) ^ permalink raw reply [flat|nested] 7+ messages in thread
* mm: get_user_pages_fast() 2013-11-06 19:57 ` Russell King - ARM Linux @ 2013-11-06 22:31 ` Michael McTernan 0 siblings, 0 replies; 7+ messages in thread From: Michael McTernan @ 2013-11-06 22:31 UTC (permalink / raw) To: linux-arm-kernel On 11/06/2013 07:57 PM, Russell King - ARM Linux wrote: > Please expose the memory in a saner way rather than re-using /dev/mem. > /dev/mem is all well and good, but it has side effects with the way it > maps memory. Having your own driver gives you control over that. It's > also better practice because it avoids a whole load of security issues. Sure - that's where I'm heading now. The security isn't an issue for this application, though I accept it's good practice. >>> Placing futexes in IO memory is not supported, period. >> >> Except the _PRIVATE futex calls do work fine in IO memory since they >> don't walk the process page tables to determine the 'key' from the >> address. ... > Sorry, I should have said mutexes. The point here is that userspace > pthreads can ldrex/strex on that memory before issuing any futex calls. Mutexes do access the memory, yes. There is still inconsistency in futex behaviour, with _PRIVATE calls working for IO memory, non-private not. This isn't ARM specific though. >>> You can't map random bits of /dev/mem and expect >>> this stuff to work. >> >> Nothing I am mapping is random. > > As far as I'm concerned, it's random areas of system memory - random in > the sense that it's not well defined what it is and what attributes it > should have in the page tables for things like memory type etc. I > somehow doubt that you've thought about this at all, you just decided > that opening /dev/mem is a solution to your problem. I didn't want to bore the list with gory details, but I didn't earlier state that I added a /dev/shmem and defined memory regions and types for the areas, with the required attributes for sharability and caching - so it isn't really random, though I see how it sounds like that from my original post. /dev/shmem is a close cousin of /dev/mem, and has picked up the VM_PFNMAP and VM_IO flags which cause me inconvenience. >>> (b) means that the load/store exclusives, which userspace mutexes will >>> use on ARMv6+, will not work correctly. >>> (they're not supported to strongly ordered memory by the architecture.) >> >> I don't see this stated in the ARM Architecture Reference Manual. >> Instead I see it saying that it is implementation dependent whether >> LDREX and STREX work on strongly ordered memory (section A3.4.5). > > Here's the latest wording: > > It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be > performed to a memory region with the Device or Strongly-ordered memory > attribute. Unless the implementation documentation explicitly states > that LDREX and STREX operations to a memory region with the Device or > Strongly-ordered attribute are permitted, the effect of such operations > is UNPREDICTABLE. > > In other words, it is only supported where it is explicitly documented > that it works on these memory types; otherwise it must be assumed that > they do not work... What you've quoted is not the same as your first statement that "they're not supported to strongly ordered memory by the architecture." It's implementation defined, which means it is a possibility. > So, what happens if Linux crashes having taken a pthread mutex in this > region, and your application then comes along and tries to regain that > lock? Well there's a whole restart procedure needed to pick up state from the memory area and get back to sensible operation. Re-initialising pthread_mutexes and some other counters and indices is all part of that - for a simple enough FIFO type interface it's not that difficult to cover potentially corrupt state variables, though getting good testing coverage is hard. Fuzzing the restart state helps a lot, though the key is to keep the state space as small as possible. Kind Regards, Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-11-06 22:31 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-11-05 22:43 mm: get_user_pages_fast() Michael McTernan 2013-11-06 10:19 ` Will Deacon 2013-11-06 10:54 ` Russell King - ARM Linux 2013-11-06 11:07 ` Will Deacon 2013-11-06 19:38 ` Michael McTernan 2013-11-06 19:57 ` Russell King - ARM Linux 2013-11-06 22:31 ` Michael McTernan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox