* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() [not found] ` <20191013195949.GM26530@ZenIV.linux.org.uk> @ 2019-10-15 18:08 ` Al Viro 2019-10-15 18:08 ` Al Viro ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Al Viro @ 2019-10-15 18:08 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch [futex folks and linux-arch Cc'd] On Sun, Oct 13, 2019 at 08:59:49PM +0100, Al Viro wrote: > Re plotting: how strongly would you object against passing the range to > user_access_end()? Powerpc folks have a very close analogue of stac/clac, > currently buried inside their __get_user()/__put_user()/etc. - the same > places where x86 does, including futex.h and friends. > > And there it's even costlier than on x86. It would obviously be nice > to lift it at least out of unsafe_get_user()/unsafe_put_user() and > move into user_access_begin()/user_access_end(); unfortunately, in > one subarchitecture they really want it the range on the user_access_end() > side as well. That's obviously not fatal (they can bloody well save those > into thread_info at user_access_begin()), but right now we have relatively > few user_access_end() callers, so the interface changes are still possible. > > Other architectures with similar stuff are riscv (no arguments, same > as for stac/clac), arm (uaccess_save_and_enable() on the way in, > return value passed to uaccess_restore() on the way out) and s390 > (similar to arm, but there it's needed only to deal with nesting, > and I'm not sure it actually can happen). > > It would be nice to settle the API while there are not too many users > outside of arch/x86; changing it later will be a PITA and we definitely > have architectures that do potentially costly things around the userland > memory access; user_access_begin()/user_access_end() is in the right > place to try and see if they fit there... Another question: right now we have if (!access_ok(uaddr, sizeof(u32))) return -EFAULT; ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); if (ret) return ret; in kernel/futex.c. Would there be any objections to moving access_ok() inside the instances and moving pagefault_disable()/pagefault_enable() outside? Reasons: * on x86 that would allow folding access_ok() with STAC into user_access_begin(). The same would be doable on other usual suspects (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their STAC counterparts. * pagefault_disable()/pagefault_enable() pair is universal on all architectures, really meant to by the nature of the beast and lifting it into kernel/futex.c would get the same situation as with futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside the primitive (also foldable into user_access_begin(), at that). * access_ok() would be closer to actual memory access (and out of the generic code). Comments? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 18:08 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro @ 2019-10-15 18:08 ` Al Viro 2019-10-15 19:00 ` Linus Torvalds 2019-10-16 12:12 ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro 2 siblings, 0 replies; 20+ messages in thread From: Al Viro @ 2019-10-15 18:08 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch [futex folks and linux-arch Cc'd] On Sun, Oct 13, 2019 at 08:59:49PM +0100, Al Viro wrote: > Re plotting: how strongly would you object against passing the range to > user_access_end()? Powerpc folks have a very close analogue of stac/clac, > currently buried inside their __get_user()/__put_user()/etc. - the same > places where x86 does, including futex.h and friends. > > And there it's even costlier than on x86. It would obviously be nice > to lift it at least out of unsafe_get_user()/unsafe_put_user() and > move into user_access_begin()/user_access_end(); unfortunately, in > one subarchitecture they really want it the range on the user_access_end() > side as well. That's obviously not fatal (they can bloody well save those > into thread_info at user_access_begin()), but right now we have relatively > few user_access_end() callers, so the interface changes are still possible. > > Other architectures with similar stuff are riscv (no arguments, same > as for stac/clac), arm (uaccess_save_and_enable() on the way in, > return value passed to uaccess_restore() on the way out) and s390 > (similar to arm, but there it's needed only to deal with nesting, > and I'm not sure it actually can happen). > > It would be nice to settle the API while there are not too many users > outside of arch/x86; changing it later will be a PITA and we definitely > have architectures that do potentially costly things around the userland > memory access; user_access_begin()/user_access_end() is in the right > place to try and see if they fit there... Another question: right now we have if (!access_ok(uaddr, sizeof(u32))) return -EFAULT; ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); if (ret) return ret; in kernel/futex.c. Would there be any objections to moving access_ok() inside the instances and moving pagefault_disable()/pagefault_enable() outside? Reasons: * on x86 that would allow folding access_ok() with STAC into user_access_begin(). The same would be doable on other usual suspects (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their STAC counterparts. * pagefault_disable()/pagefault_enable() pair is universal on all architectures, really meant to by the nature of the beast and lifting it into kernel/futex.c would get the same situation as with futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside the primitive (also foldable into user_access_begin(), at that). * access_ok() would be closer to actual memory access (and out of the generic code). Comments? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 18:08 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro 2019-10-15 18:08 ` Al Viro @ 2019-10-15 19:00 ` Linus Torvalds 2019-10-15 19:00 ` Linus Torvalds 2019-10-15 19:40 ` Al Viro 2019-10-16 12:12 ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro 2 siblings, 2 replies; 20+ messages in thread From: Linus Torvalds @ 2019-10-15 19:00 UTC (permalink / raw) To: Al Viro Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 11:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > Another question: right now we have > if (!access_ok(uaddr, sizeof(u32))) > return -EFAULT; > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > if (ret) > return ret; > in kernel/futex.c. Would there be any objections to moving access_ok() > inside the instances and moving pagefault_disable()/pagefault_enable() outside? I think we should remove all the "atomic" versions, and just make the rule be that if you want atomic, you surround it with pagefault_disable()/pagefault_enable(). That covers not just the futex ops (where "atomic" is actually somewhat ambiguous - the ops themselves are atomic too, so the naming might stay, although arguably the "futex" part makes that pointless too), but also copy_to_user_inatomic() and the powerpc version of __get_user_inatomic(). So we'd aim to get rid of all the "inatomic" ones entirely. Same ultimately probably goes for the NMI versions. We should just make it be a rule that we can use all of the user access functions with pagefault_{dis,en}able() around them, and they'll be "safe" to use in atomic context. One issue with the NMI versions is that they actually want to avoid the current value of set_fs(). So copy_from_user_nmi() (at least on x86) is special in that it does if (__range_not_ok(from, n, TASK_SIZE)) return n; instead of access_ok() because of that issue. NMI also has some other issues (nmi_uaccess_okay() on x86, at least), but those *probably* could be handled at page fault time instead. Anyway, NMI is so special that I'd suggest leaving it for later, but the non-NMI atomic accesses I would suggest you clean up at the same time. I think the *only* reason we have the "inatomic()" versions is that the regular ones do that "might_fault()" testing unconditionally, and might_fault() _used_ to be just a might_sleep() - so it's not about functionality per se, it's about "we have this sanity check that we need to undo". We've already made "might_fault()" look at pagefault_disabled(), so I think a lot of the reasons for inatomic are entirely historical. Linus ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 19:00 ` Linus Torvalds @ 2019-10-15 19:00 ` Linus Torvalds 2019-10-15 19:40 ` Al Viro 1 sibling, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2019-10-15 19:00 UTC (permalink / raw) To: Al Viro Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 11:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > Another question: right now we have > if (!access_ok(uaddr, sizeof(u32))) > return -EFAULT; > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > if (ret) > return ret; > in kernel/futex.c. Would there be any objections to moving access_ok() > inside the instances and moving pagefault_disable()/pagefault_enable() outside? I think we should remove all the "atomic" versions, and just make the rule be that if you want atomic, you surround it with pagefault_disable()/pagefault_enable(). That covers not just the futex ops (where "atomic" is actually somewhat ambiguous - the ops themselves are atomic too, so the naming might stay, although arguably the "futex" part makes that pointless too), but also copy_to_user_inatomic() and the powerpc version of __get_user_inatomic(). So we'd aim to get rid of all the "inatomic" ones entirely. Same ultimately probably goes for the NMI versions. We should just make it be a rule that we can use all of the user access functions with pagefault_{dis,en}able() around them, and they'll be "safe" to use in atomic context. One issue with the NMI versions is that they actually want to avoid the current value of set_fs(). So copy_from_user_nmi() (at least on x86) is special in that it does if (__range_not_ok(from, n, TASK_SIZE)) return n; instead of access_ok() because of that issue. NMI also has some other issues (nmi_uaccess_okay() on x86, at least), but those *probably* could be handled at page fault time instead. Anyway, NMI is so special that I'd suggest leaving it for later, but the non-NMI atomic accesses I would suggest you clean up at the same time. I think the *only* reason we have the "inatomic()" versions is that the regular ones do that "might_fault()" testing unconditionally, and might_fault() _used_ to be just a might_sleep() - so it's not about functionality per se, it's about "we have this sanity check that we need to undo". We've already made "might_fault()" look at pagefault_disabled(), so I think a lot of the reasons for inatomic are entirely historical. Linus ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 19:00 ` Linus Torvalds 2019-10-15 19:00 ` Linus Torvalds @ 2019-10-15 19:40 ` Al Viro 2019-10-15 19:40 ` Al Viro 2019-10-15 20:18 ` Al Viro 1 sibling, 2 replies; 20+ messages in thread From: Al Viro @ 2019-10-15 19:40 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 12:00:34PM -0700, Linus Torvalds wrote: > On Tue, Oct 15, 2019 at 11:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > Another question: right now we have > > if (!access_ok(uaddr, sizeof(u32))) > > return -EFAULT; > > > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > > if (ret) > > return ret; > > in kernel/futex.c. Would there be any objections to moving access_ok() > > inside the instances and moving pagefault_disable()/pagefault_enable() outside? > > I think we should remove all the "atomic" versions, and just make the > rule be that if you want atomic, you surround it with > pagefault_disable()/pagefault_enable(). Umm... I thought about that, but ended up with "it documents the intent" - pagefault_disable() might be implicit (e.g. done by kmap_atomic()) or several levels up the call chain. Not sure. > That covers not just the futex ops (where "atomic" is actually > somewhat ambiguous - the ops themselves are atomic too, so the naming > might stay, although arguably the "futex" part makes that pointless > too), but also copy_to_user_inatomic() and the powerpc version of > __get_user_inatomic(). Eh? copy_to_user_inatomic() doesn't exist; __copy_to_user_inatomic() does, but... arch/mips/kernel/unaligned.c:1307: res = __copy_to_user_inatomic(addr, fpr, sizeof(*fpr)); drivers/gpu/drm/i915/i915_gem.c:313: unwritten = __copy_to_user_inatomic(user_data, lib/test_kasan.c:510: unused = __copy_to_user_inatomic(usermem, kmem, size + 1); mm/maccess.c:98: ret = __copy_to_user_inatomic((__force void __user *)dst, src, size); these are all callers it has left anywhere and I'm certainly going to kill it. Now, __copy_from_user_inatomic() has a lot more callers left... Frankly, the messier part of API is the nocache side of things. Consider e.g. this: /* platform specific: cacheless copy */ static void cacheless_memcpy(void *dst, void *src, size_t n) { /* * Use the only available X64 cacheless copy. Add a __user cast * to quiet sparse. The src agument is already in the kernel so * there are no security issues. The extra fault recovery machinery * is not invoked. */ __copy_user_nocache(dst, (void __user *)src, n, 0); } or this static void ntb_memcpy_tx(struct ntb_queue_entry *entry, void __iomem *offset) { #ifdef ARCH_HAS_NOCACHE_UACCESS /* * Using non-temporal mov to improve performance on non-cached * writes, even though we aren't actually copying from user space. */ __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len); #else memcpy_toio(offset, entry->buf, entry->len); #endif /* Ensure that the data is fully copied out before setting the flags */ wmb(); ntb_tx_copy_callback(entry, NULL); } "user" part is bollocks in both cases; moreover, I really wonder about that ifdef in ntb one - ARCH_HAS_NOCACHE_UACCESS is x86-only *at* *the* *moment* and it just so happens that ..._toio() doesn't require anything special on x86. Have e.g. arm grow nocache stuff and the things will suddenly break, won't they? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 19:40 ` Al Viro @ 2019-10-15 19:40 ` Al Viro 2019-10-15 20:18 ` Al Viro 1 sibling, 0 replies; 20+ messages in thread From: Al Viro @ 2019-10-15 19:40 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 12:00:34PM -0700, Linus Torvalds wrote: > On Tue, Oct 15, 2019 at 11:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > Another question: right now we have > > if (!access_ok(uaddr, sizeof(u32))) > > return -EFAULT; > > > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > > if (ret) > > return ret; > > in kernel/futex.c. Would there be any objections to moving access_ok() > > inside the instances and moving pagefault_disable()/pagefault_enable() outside? > > I think we should remove all the "atomic" versions, and just make the > rule be that if you want atomic, you surround it with > pagefault_disable()/pagefault_enable(). Umm... I thought about that, but ended up with "it documents the intent" - pagefault_disable() might be implicit (e.g. done by kmap_atomic()) or several levels up the call chain. Not sure. > That covers not just the futex ops (where "atomic" is actually > somewhat ambiguous - the ops themselves are atomic too, so the naming > might stay, although arguably the "futex" part makes that pointless > too), but also copy_to_user_inatomic() and the powerpc version of > __get_user_inatomic(). Eh? copy_to_user_inatomic() doesn't exist; __copy_to_user_inatomic() does, but... arch/mips/kernel/unaligned.c:1307: res = __copy_to_user_inatomic(addr, fpr, sizeof(*fpr)); drivers/gpu/drm/i915/i915_gem.c:313: unwritten = __copy_to_user_inatomic(user_data, lib/test_kasan.c:510: unused = __copy_to_user_inatomic(usermem, kmem, size + 1); mm/maccess.c:98: ret = __copy_to_user_inatomic((__force void __user *)dst, src, size); these are all callers it has left anywhere and I'm certainly going to kill it. Now, __copy_from_user_inatomic() has a lot more callers left... Frankly, the messier part of API is the nocache side of things. Consider e.g. this: /* platform specific: cacheless copy */ static void cacheless_memcpy(void *dst, void *src, size_t n) { /* * Use the only available X64 cacheless copy. Add a __user cast * to quiet sparse. The src agument is already in the kernel so * there are no security issues. The extra fault recovery machinery * is not invoked. */ __copy_user_nocache(dst, (void __user *)src, n, 0); } or this static void ntb_memcpy_tx(struct ntb_queue_entry *entry, void __iomem *offset) { #ifdef ARCH_HAS_NOCACHE_UACCESS /* * Using non-temporal mov to improve performance on non-cached * writes, even though we aren't actually copying from user space. */ __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len); #else memcpy_toio(offset, entry->buf, entry->len); #endif /* Ensure that the data is fully copied out before setting the flags */ wmb(); ntb_tx_copy_callback(entry, NULL); } "user" part is bollocks in both cases; moreover, I really wonder about that ifdef in ntb one - ARCH_HAS_NOCACHE_UACCESS is x86-only *at* *the* *moment* and it just so happens that ..._toio() doesn't require anything special on x86. Have e.g. arm grow nocache stuff and the things will suddenly break, won't they? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 19:40 ` Al Viro 2019-10-15 19:40 ` Al Viro @ 2019-10-15 20:18 ` Al Viro 2019-10-15 20:18 ` Al Viro 1 sibling, 1 reply; 20+ messages in thread From: Al Viro @ 2019-10-15 20:18 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 08:40:12PM +0100, Al Viro wrote: > or this > static void ntb_memcpy_tx(struct ntb_queue_entry *entry, void __iomem *offset) > { > #ifdef ARCH_HAS_NOCACHE_UACCESS > /* > * Using non-temporal mov to improve performance on non-cached > * writes, even though we aren't actually copying from user space. > */ > __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len); > #else > memcpy_toio(offset, entry->buf, entry->len); > #endif > > /* Ensure that the data is fully copied out before setting the flags */ > wmb(); > > ntb_tx_copy_callback(entry, NULL); > } > "user" part is bollocks in both cases; moreover, I really wonder about that > ifdef in ntb one - ARCH_HAS_NOCACHE_UACCESS is x86-only *at* *the* *moment* > and it just so happens that ..._toio() doesn't require anything special on > x86. Have e.g. arm grow nocache stuff and the things will suddenly break, > won't they? Incidentally, there are two callers of __copy_from_user_inatomic_nocache() in generic code: lib/iov_iter.c:792: __copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len, lib/iov_iter.c:849: if (__copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len, Neither is done under under pagefault_disable(), AFAICS. This one drivers/gpu/drm/qxl/qxl_ioctl.c:189: unwritten = __copy_from_user_inatomic_nocache probably is - it has something called qxl_bo_kmap_atomic_page() called just prior, which would seem to imply kmap_atomic() somewhere in it. The same goes for drivers/gpu/drm/i915/i915_gem.c:500: unwritten = __copy_from_user_inatomic_nocache((void __force *)vaddr + offset, So we have 5 callers anywhere. Two are not "inatomic" in any sense; source is in userspace and we want nocache behaviour. Two _are_ done into a page that had been fed through kmap_atomic(); the source is, again, in userland. And the last one is complete BS - it wants memcpy_toio_nocache() and abuses this thing. Incidentally, in case of fault i915 caller ends up unmapping the page, mapping it non-atomic (with kmap?) and doing plain copy_from_user(), nocache be damned. qxl, OTOH, whines and fails all the way to userland... ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-15 20:18 ` Al Viro @ 2019-10-15 20:18 ` Al Viro 0 siblings, 0 replies; 20+ messages in thread From: Al Viro @ 2019-10-15 20:18 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 08:40:12PM +0100, Al Viro wrote: > or this > static void ntb_memcpy_tx(struct ntb_queue_entry *entry, void __iomem *offset) > { > #ifdef ARCH_HAS_NOCACHE_UACCESS > /* > * Using non-temporal mov to improve performance on non-cached > * writes, even though we aren't actually copying from user space. > */ > __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len); > #else > memcpy_toio(offset, entry->buf, entry->len); > #endif > > /* Ensure that the data is fully copied out before setting the flags */ > wmb(); > > ntb_tx_copy_callback(entry, NULL); > } > "user" part is bollocks in both cases; moreover, I really wonder about that > ifdef in ntb one - ARCH_HAS_NOCACHE_UACCESS is x86-only *at* *the* *moment* > and it just so happens that ..._toio() doesn't require anything special on > x86. Have e.g. arm grow nocache stuff and the things will suddenly break, > won't they? Incidentally, there are two callers of __copy_from_user_inatomic_nocache() in generic code: lib/iov_iter.c:792: __copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len, lib/iov_iter.c:849: if (__copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len, Neither is done under under pagefault_disable(), AFAICS. This one drivers/gpu/drm/qxl/qxl_ioctl.c:189: unwritten = __copy_from_user_inatomic_nocache probably is - it has something called qxl_bo_kmap_atomic_page() called just prior, which would seem to imply kmap_atomic() somewhere in it. The same goes for drivers/gpu/drm/i915/i915_gem.c:500: unwritten = __copy_from_user_inatomic_nocache((void __force *)vaddr + offset, So we have 5 callers anywhere. Two are not "inatomic" in any sense; source is in userspace and we want nocache behaviour. Two _are_ done into a page that had been fed through kmap_atomic(); the source is, again, in userland. And the last one is complete BS - it wants memcpy_toio_nocache() and abuses this thing. Incidentally, in case of fault i915 caller ends up unmapping the page, mapping it non-atomic (with kmap?) and doing plain copy_from_user(), nocache be damned. qxl, OTOH, whines and fails all the way to userland... ^ permalink raw reply [flat|nested] 20+ messages in thread
* [RFC] change of calling conventions for arch_futex_atomic_op_inuser() 2019-10-15 18:08 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro 2019-10-15 18:08 ` Al Viro 2019-10-15 19:00 ` Linus Torvalds @ 2019-10-16 12:12 ` Al Viro 2019-10-16 12:12 ` Al Viro 2019-10-16 12:24 ` Thomas Gleixner 2 siblings, 2 replies; 20+ messages in thread From: Al Viro @ 2019-10-16 12:12 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 07:08:46PM +0100, Al Viro wrote: > [futex folks and linux-arch Cc'd] > Another question: right now we have > if (!access_ok(uaddr, sizeof(u32))) > return -EFAULT; > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > if (ret) > return ret; > in kernel/futex.c. Would there be any objections to moving access_ok() > inside the instances and moving pagefault_disable()/pagefault_enable() outside? > > Reasons: > * on x86 that would allow folding access_ok() with STAC into > user_access_begin(). The same would be doable on other usual suspects > (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their > STAC counterparts. > * pagefault_disable()/pagefault_enable() pair is universal on > all architectures, really meant to by the nature of the beast and > lifting it into kernel/futex.c would get the same situation as with > futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside > the primitive (also foldable into user_access_begin(), at that). > * access_ok() would be closer to actual memory access (and > out of the generic code). > > Comments? FWIW, completely untested patch follows; just the (semimechanical) conversion of calling conventions, no per-architecture followups included. Could futex folks ACK/NAK that in principle? commit 7babb6ad28cb3e80977fb6bd0405e3f81a943161 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Tue Oct 15 16:54:41 2019 -0400 arch_futex_atomic_op_inuser(): move access_ok() in and pagefault_disable() - out Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff --git a/arch/alpha/include/asm/futex.h b/arch/alpha/include/asm/futex.h index bfd3c01038f8..da67afd578fd 100644 --- a/arch/alpha/include/asm/futex.h +++ b/arch/alpha/include/asm/futex.h @@ -31,7 +31,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -53,8 +54,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h index 9d0d070e6c22..607d1c16d4dd 100644 --- a/arch/arc/include/asm/futex.h +++ b/arch/arc/include/asm/futex.h @@ -75,10 +75,12 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + #ifndef CONFIG_ARC_HAS_LLSC preempt_disable(); /* to guarantee atomic r-m-w of futex op */ #endif - pagefault_disable(); switch (op) { case FUTEX_OP_SET: @@ -101,7 +103,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); #ifndef CONFIG_ARC_HAS_LLSC preempt_enable(); #endif diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h index 83c391b597d4..e133da303a98 100644 --- a/arch/arm/include/asm/futex.h +++ b/arch/arm/include/asm/futex.h @@ -134,10 +134,12 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret, tmp; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + #ifndef CONFIG_SMP preempt_disable(); #endif - pagefault_disable(); switch (op) { case FUTEX_OP_SET: @@ -159,7 +161,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); #ifndef CONFIG_SMP preempt_enable(); #endif diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h index 6cc26a127819..97f6a63810ec 100644 --- a/arch/arm64/include/asm/futex.h +++ b/arch/arm64/include/asm/futex.h @@ -48,7 +48,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr) int oldval = 0, ret, tmp; u32 __user *uaddr = __uaccess_mask_ptr(_uaddr); - pagefault_disable(); + if (!access_ok(_uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -75,8 +76,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/hexagon/include/asm/futex.h b/arch/hexagon/include/asm/futex.h index cb635216a732..8693dc5ae9ec 100644 --- a/arch/hexagon/include/asm/futex.h +++ b/arch/hexagon/include/asm/futex.h @@ -36,7 +36,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -62,8 +63,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/ia64/include/asm/futex.h b/arch/ia64/include/asm/futex.h index 2e106d462196..1db26b432d8c 100644 --- a/arch/ia64/include/asm/futex.h +++ b/arch/ia64/include/asm/futex.h @@ -50,7 +50,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -74,8 +75,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/microblaze/include/asm/futex.h b/arch/microblaze/include/asm/futex.h index 8c90357e5983..86131ed84c9a 100644 --- a/arch/microblaze/include/asm/futex.h +++ b/arch/microblaze/include/asm/futex.h @@ -34,7 +34,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -56,8 +57,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h index b83b0397462d..86f224548651 100644 --- a/arch/mips/include/asm/futex.h +++ b/arch/mips/include/asm/futex.h @@ -88,7 +88,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -115,8 +116,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/nds32/include/asm/futex.h b/arch/nds32/include/asm/futex.h index 5213c65c2e0b..60b7ab74ed92 100644 --- a/arch/nds32/include/asm/futex.h +++ b/arch/nds32/include/asm/futex.h @@ -66,8 +66,9 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; - pagefault_disable(); switch (op) { case FUTEX_OP_SET: __futex_atomic_op("move %0, %3", ret, oldval, tmp, uaddr, @@ -93,8 +94,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/openrisc/include/asm/futex.h b/arch/openrisc/include/asm/futex.h index fe894e6331ae..865e9cd0d97b 100644 --- a/arch/openrisc/include/asm/futex.h +++ b/arch/openrisc/include/asm/futex.h @@ -35,7 +35,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -57,8 +58,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h index 50662b6cb605..6e2e4d10e3c8 100644 --- a/arch/parisc/include/asm/futex.h +++ b/arch/parisc/include/asm/futex.h @@ -40,11 +40,10 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) u32 tmp; _futex_spin_lock_irqsave(uaddr, &flags); - pagefault_disable(); ret = -EFAULT; if (unlikely(get_user(oldval, uaddr) != 0)) - goto out_pagefault_enable; + goto out_unlock; ret = 0; tmp = oldval; @@ -72,8 +71,7 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) if (ret == 0 && unlikely(put_user(tmp, uaddr) != 0)) ret = -EFAULT; -out_pagefault_enable: - pagefault_enable(); +out_unlock: _futex_spin_unlock_irqrestore(uaddr, &flags); if (!ret) diff --git a/arch/powerpc/include/asm/futex.h b/arch/powerpc/include/asm/futex.h index eea28ca679db..d6e32b32f452 100644 --- a/arch/powerpc/include/asm/futex.h +++ b/arch/powerpc/include/asm/futex.h @@ -35,8 +35,9 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; allow_write_to_user(uaddr, sizeof(*uaddr)); - pagefault_disable(); switch (op) { case FUTEX_OP_SET: @@ -58,8 +59,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - *oval = oldval; prevent_write_to_user(uaddr, sizeof(*uaddr)); diff --git a/arch/riscv/include/asm/futex.h b/arch/riscv/include/asm/futex.h index 4ad6409c4647..84574acfb927 100644 --- a/arch/riscv/include/asm/futex.h +++ b/arch/riscv/include/asm/futex.h @@ -40,7 +40,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret = 0; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -67,8 +68,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/s390/include/asm/futex.h b/arch/s390/include/asm/futex.h index 5e97a4353147..3c18a48baf44 100644 --- a/arch/s390/include/asm/futex.h +++ b/arch/s390/include/asm/futex.h @@ -28,8 +28,10 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, int oldval = 0, newval, ret; mm_segment_t old_fs; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + old_fs = enable_sacf_uaccess(); - pagefault_disable(); switch (op) { case FUTEX_OP_SET: __futex_atomic_op("lr %2,%5\n", @@ -54,7 +56,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, default: ret = -ENOSYS; } - pagefault_enable(); disable_sacf_uaccess(old_fs); if (!ret) diff --git a/arch/sh/include/asm/futex.h b/arch/sh/include/asm/futex.h index 3190ec89df81..b39cda09fb95 100644 --- a/arch/sh/include/asm/futex.h +++ b/arch/sh/include/asm/futex.h @@ -34,8 +34,6 @@ static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 oldval, newval, prev; int ret; - pagefault_disable(); - do { ret = get_user(oldval, uaddr); @@ -67,8 +65,6 @@ static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, ret = futex_atomic_cmpxchg_inatomic(&prev, uaddr, oldval, newval); } while (!ret && prev != oldval); - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/sparc/include/asm/futex_64.h b/arch/sparc/include/asm/futex_64.h index 0865ce77ec00..72de967318d7 100644 --- a/arch/sparc/include/asm/futex_64.h +++ b/arch/sparc/include/asm/futex_64.h @@ -38,8 +38,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, if (unlikely((((unsigned long) uaddr) & 0x3UL))) return -EINVAL; - pagefault_disable(); - switch (op) { case FUTEX_OP_SET: __futex_cas_op("mov\t%4, %1", ret, oldval, uaddr, oparg); @@ -60,8 +58,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h index 13c83fe97988..6bcd1c1486d9 100644 --- a/arch/x86/include/asm/futex.h +++ b/arch/x86/include/asm/futex.h @@ -47,7 +47,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret, tem; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -70,8 +71,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/xtensa/include/asm/futex.h b/arch/xtensa/include/asm/futex.h index 0c4457ca0a85..271cfcf8a841 100644 --- a/arch/xtensa/include/asm/futex.h +++ b/arch/xtensa/include/asm/futex.h @@ -72,7 +72,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, #if XCHAL_HAVE_S32C1I || XCHAL_HAVE_EXCLUSIVE int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -99,8 +100,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/include/asm-generic/futex.h b/include/asm-generic/futex.h index 02970b11f71f..f4c3470480c7 100644 --- a/include/asm-generic/futex.h +++ b/include/asm-generic/futex.h @@ -34,7 +34,6 @@ arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr) u32 tmp; preempt_disable(); - pagefault_disable(); ret = -EFAULT; if (unlikely(get_user(oldval, uaddr) != 0)) @@ -67,7 +66,6 @@ arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr) ret = -EFAULT; out_pagefault_enable: - pagefault_enable(); preempt_enable(); if (ret == 0) diff --git a/kernel/futex.c b/kernel/futex.c index bd18f60e4c6c..2cc8a35109da 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1662,10 +1662,9 @@ static int futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr) oparg = 1 << oparg; } - if (!access_ok(uaddr, sizeof(u32))) - return -EFAULT; - + pagefault_disable(); ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); + pagefault_enable(); if (ret) return ret; ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [RFC] change of calling conventions for arch_futex_atomic_op_inuser() 2019-10-16 12:12 ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro @ 2019-10-16 12:12 ` Al Viro 2019-10-16 12:24 ` Thomas Gleixner 1 sibling, 0 replies; 20+ messages in thread From: Al Viro @ 2019-10-16 12:12 UTC (permalink / raw) To: Linus Torvalds Cc: Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Tue, Oct 15, 2019 at 07:08:46PM +0100, Al Viro wrote: > [futex folks and linux-arch Cc'd] > Another question: right now we have > if (!access_ok(uaddr, sizeof(u32))) > return -EFAULT; > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > if (ret) > return ret; > in kernel/futex.c. Would there be any objections to moving access_ok() > inside the instances and moving pagefault_disable()/pagefault_enable() outside? > > Reasons: > * on x86 that would allow folding access_ok() with STAC into > user_access_begin(). The same would be doable on other usual suspects > (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their > STAC counterparts. > * pagefault_disable()/pagefault_enable() pair is universal on > all architectures, really meant to by the nature of the beast and > lifting it into kernel/futex.c would get the same situation as with > futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside > the primitive (also foldable into user_access_begin(), at that). > * access_ok() would be closer to actual memory access (and > out of the generic code). > > Comments? FWIW, completely untested patch follows; just the (semimechanical) conversion of calling conventions, no per-architecture followups included. Could futex folks ACK/NAK that in principle? commit 7babb6ad28cb3e80977fb6bd0405e3f81a943161 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Tue Oct 15 16:54:41 2019 -0400 arch_futex_atomic_op_inuser(): move access_ok() in and pagefault_disable() - out Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff --git a/arch/alpha/include/asm/futex.h b/arch/alpha/include/asm/futex.h index bfd3c01038f8..da67afd578fd 100644 --- a/arch/alpha/include/asm/futex.h +++ b/arch/alpha/include/asm/futex.h @@ -31,7 +31,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -53,8 +54,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h index 9d0d070e6c22..607d1c16d4dd 100644 --- a/arch/arc/include/asm/futex.h +++ b/arch/arc/include/asm/futex.h @@ -75,10 +75,12 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + #ifndef CONFIG_ARC_HAS_LLSC preempt_disable(); /* to guarantee atomic r-m-w of futex op */ #endif - pagefault_disable(); switch (op) { case FUTEX_OP_SET: @@ -101,7 +103,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); #ifndef CONFIG_ARC_HAS_LLSC preempt_enable(); #endif diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h index 83c391b597d4..e133da303a98 100644 --- a/arch/arm/include/asm/futex.h +++ b/arch/arm/include/asm/futex.h @@ -134,10 +134,12 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret, tmp; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + #ifndef CONFIG_SMP preempt_disable(); #endif - pagefault_disable(); switch (op) { case FUTEX_OP_SET: @@ -159,7 +161,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); #ifndef CONFIG_SMP preempt_enable(); #endif diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h index 6cc26a127819..97f6a63810ec 100644 --- a/arch/arm64/include/asm/futex.h +++ b/arch/arm64/include/asm/futex.h @@ -48,7 +48,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr) int oldval = 0, ret, tmp; u32 __user *uaddr = __uaccess_mask_ptr(_uaddr); - pagefault_disable(); + if (!access_ok(_uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -75,8 +76,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/hexagon/include/asm/futex.h b/arch/hexagon/include/asm/futex.h index cb635216a732..8693dc5ae9ec 100644 --- a/arch/hexagon/include/asm/futex.h +++ b/arch/hexagon/include/asm/futex.h @@ -36,7 +36,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -62,8 +63,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/ia64/include/asm/futex.h b/arch/ia64/include/asm/futex.h index 2e106d462196..1db26b432d8c 100644 --- a/arch/ia64/include/asm/futex.h +++ b/arch/ia64/include/asm/futex.h @@ -50,7 +50,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -74,8 +75,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/microblaze/include/asm/futex.h b/arch/microblaze/include/asm/futex.h index 8c90357e5983..86131ed84c9a 100644 --- a/arch/microblaze/include/asm/futex.h +++ b/arch/microblaze/include/asm/futex.h @@ -34,7 +34,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -56,8 +57,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h index b83b0397462d..86f224548651 100644 --- a/arch/mips/include/asm/futex.h +++ b/arch/mips/include/asm/futex.h @@ -88,7 +88,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -115,8 +116,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/nds32/include/asm/futex.h b/arch/nds32/include/asm/futex.h index 5213c65c2e0b..60b7ab74ed92 100644 --- a/arch/nds32/include/asm/futex.h +++ b/arch/nds32/include/asm/futex.h @@ -66,8 +66,9 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; - pagefault_disable(); switch (op) { case FUTEX_OP_SET: __futex_atomic_op("move %0, %3", ret, oldval, tmp, uaddr, @@ -93,8 +94,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/openrisc/include/asm/futex.h b/arch/openrisc/include/asm/futex.h index fe894e6331ae..865e9cd0d97b 100644 --- a/arch/openrisc/include/asm/futex.h +++ b/arch/openrisc/include/asm/futex.h @@ -35,7 +35,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -57,8 +58,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h index 50662b6cb605..6e2e4d10e3c8 100644 --- a/arch/parisc/include/asm/futex.h +++ b/arch/parisc/include/asm/futex.h @@ -40,11 +40,10 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) u32 tmp; _futex_spin_lock_irqsave(uaddr, &flags); - pagefault_disable(); ret = -EFAULT; if (unlikely(get_user(oldval, uaddr) != 0)) - goto out_pagefault_enable; + goto out_unlock; ret = 0; tmp = oldval; @@ -72,8 +71,7 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) if (ret == 0 && unlikely(put_user(tmp, uaddr) != 0)) ret = -EFAULT; -out_pagefault_enable: - pagefault_enable(); +out_unlock: _futex_spin_unlock_irqrestore(uaddr, &flags); if (!ret) diff --git a/arch/powerpc/include/asm/futex.h b/arch/powerpc/include/asm/futex.h index eea28ca679db..d6e32b32f452 100644 --- a/arch/powerpc/include/asm/futex.h +++ b/arch/powerpc/include/asm/futex.h @@ -35,8 +35,9 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; allow_write_to_user(uaddr, sizeof(*uaddr)); - pagefault_disable(); switch (op) { case FUTEX_OP_SET: @@ -58,8 +59,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - *oval = oldval; prevent_write_to_user(uaddr, sizeof(*uaddr)); diff --git a/arch/riscv/include/asm/futex.h b/arch/riscv/include/asm/futex.h index 4ad6409c4647..84574acfb927 100644 --- a/arch/riscv/include/asm/futex.h +++ b/arch/riscv/include/asm/futex.h @@ -40,7 +40,8 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { int oldval = 0, ret = 0; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -67,8 +68,6 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/s390/include/asm/futex.h b/arch/s390/include/asm/futex.h index 5e97a4353147..3c18a48baf44 100644 --- a/arch/s390/include/asm/futex.h +++ b/arch/s390/include/asm/futex.h @@ -28,8 +28,10 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, int oldval = 0, newval, ret; mm_segment_t old_fs; + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; + old_fs = enable_sacf_uaccess(); - pagefault_disable(); switch (op) { case FUTEX_OP_SET: __futex_atomic_op("lr %2,%5\n", @@ -54,7 +56,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, default: ret = -ENOSYS; } - pagefault_enable(); disable_sacf_uaccess(old_fs); if (!ret) diff --git a/arch/sh/include/asm/futex.h b/arch/sh/include/asm/futex.h index 3190ec89df81..b39cda09fb95 100644 --- a/arch/sh/include/asm/futex.h +++ b/arch/sh/include/asm/futex.h @@ -34,8 +34,6 @@ static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 oldval, newval, prev; int ret; - pagefault_disable(); - do { ret = get_user(oldval, uaddr); @@ -67,8 +65,6 @@ static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, ret = futex_atomic_cmpxchg_inatomic(&prev, uaddr, oldval, newval); } while (!ret && prev != oldval); - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/sparc/include/asm/futex_64.h b/arch/sparc/include/asm/futex_64.h index 0865ce77ec00..72de967318d7 100644 --- a/arch/sparc/include/asm/futex_64.h +++ b/arch/sparc/include/asm/futex_64.h @@ -38,8 +38,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, if (unlikely((((unsigned long) uaddr) & 0x3UL))) return -EINVAL; - pagefault_disable(); - switch (op) { case FUTEX_OP_SET: __futex_cas_op("mov\t%4, %1", ret, oldval, uaddr, oparg); @@ -60,8 +58,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h index 13c83fe97988..6bcd1c1486d9 100644 --- a/arch/x86/include/asm/futex.h +++ b/arch/x86/include/asm/futex.h @@ -47,7 +47,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, { int oldval = 0, ret, tem; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -70,8 +71,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/arch/xtensa/include/asm/futex.h b/arch/xtensa/include/asm/futex.h index 0c4457ca0a85..271cfcf8a841 100644 --- a/arch/xtensa/include/asm/futex.h +++ b/arch/xtensa/include/asm/futex.h @@ -72,7 +72,8 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, #if XCHAL_HAVE_S32C1I || XCHAL_HAVE_EXCLUSIVE int oldval = 0, ret; - pagefault_disable(); + if (!access_ok(uaddr, sizeof(u32))) + return -EFAULT; switch (op) { case FUTEX_OP_SET: @@ -99,8 +100,6 @@ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, ret = -ENOSYS; } - pagefault_enable(); - if (!ret) *oval = oldval; diff --git a/include/asm-generic/futex.h b/include/asm-generic/futex.h index 02970b11f71f..f4c3470480c7 100644 --- a/include/asm-generic/futex.h +++ b/include/asm-generic/futex.h @@ -34,7 +34,6 @@ arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr) u32 tmp; preempt_disable(); - pagefault_disable(); ret = -EFAULT; if (unlikely(get_user(oldval, uaddr) != 0)) @@ -67,7 +66,6 @@ arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr) ret = -EFAULT; out_pagefault_enable: - pagefault_enable(); preempt_enable(); if (ret == 0) diff --git a/kernel/futex.c b/kernel/futex.c index bd18f60e4c6c..2cc8a35109da 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1662,10 +1662,9 @@ static int futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr) oparg = 1 << oparg; } - if (!access_ok(uaddr, sizeof(u32))) - return -EFAULT; - + pagefault_disable(); ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); + pagefault_enable(); if (ret) return ret; ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [RFC] change of calling conventions for arch_futex_atomic_op_inuser() 2019-10-16 12:12 ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro 2019-10-16 12:12 ` Al Viro @ 2019-10-16 12:24 ` Thomas Gleixner 2019-10-16 12:24 ` Thomas Gleixner 1 sibling, 1 reply; 20+ messages in thread From: Thomas Gleixner @ 2019-10-16 12:24 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Wed, 16 Oct 2019, Al Viro wrote: > On Tue, Oct 15, 2019 at 07:08:46PM +0100, Al Viro wrote: > > [futex folks and linux-arch Cc'd] > > > Another question: right now we have > > if (!access_ok(uaddr, sizeof(u32))) > > return -EFAULT; > > > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > > if (ret) > > return ret; > > in kernel/futex.c. Would there be any objections to moving access_ok() > > inside the instances and moving pagefault_disable()/pagefault_enable() outside? > > > > Reasons: > > * on x86 that would allow folding access_ok() with STAC into > > user_access_begin(). The same would be doable on other usual suspects > > (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their > > STAC counterparts. > > * pagefault_disable()/pagefault_enable() pair is universal on > > all architectures, really meant to by the nature of the beast and > > lifting it into kernel/futex.c would get the same situation as with > > futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside > > the primitive (also foldable into user_access_begin(), at that). > > * access_ok() would be closer to actual memory access (and > > out of the generic code). > > > > Comments? > > FWIW, completely untested patch follows; just the (semimechanical) conversion > of calling conventions, no per-architecture followups included. Could futex > folks ACK/NAK that in principle? Makes sense and does not change any of the futex semantics. Go wild. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC] change of calling conventions for arch_futex_atomic_op_inuser() 2019-10-16 12:24 ` Thomas Gleixner @ 2019-10-16 12:24 ` Thomas Gleixner 0 siblings, 0 replies; 20+ messages in thread From: Thomas Gleixner @ 2019-10-16 12:24 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, linux-fsdevel, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-arch On Wed, 16 Oct 2019, Al Viro wrote: > On Tue, Oct 15, 2019 at 07:08:46PM +0100, Al Viro wrote: > > [futex folks and linux-arch Cc'd] > > > Another question: right now we have > > if (!access_ok(uaddr, sizeof(u32))) > > return -EFAULT; > > > > ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr); > > if (ret) > > return ret; > > in kernel/futex.c. Would there be any objections to moving access_ok() > > inside the instances and moving pagefault_disable()/pagefault_enable() outside? > > > > Reasons: > > * on x86 that would allow folding access_ok() with STAC into > > user_access_begin(). The same would be doable on other usual suspects > > (arm, arm64, ppc, riscv, s390), bringing access_ok() next to their > > STAC counterparts. > > * pagefault_disable()/pagefault_enable() pair is universal on > > all architectures, really meant to by the nature of the beast and > > lifting it into kernel/futex.c would get the same situation as with > > futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside > > the primitive (also foldable into user_access_begin(), at that). > > * access_ok() would be closer to actual memory access (and > > out of the generic code). > > > > Comments? > > FWIW, completely untested patch follows; just the (semimechanical) conversion > of calling conventions, no per-architecture followups included. Could futex > folks ACK/NAK that in principle? Makes sense and does not change any of the futex semantics. Go wild. Thanks, tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20191006222046.GA18027@roeck-us.net>]
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() [not found] <20191006222046.GA18027@roeck-us.net> @ 2019-10-07 19:21 ` Linus Torvalds 2019-10-07 19:21 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Linus Torvalds @ 2019-10-07 19:21 UTC (permalink / raw) To: Guenter Roeck, Michael Cree Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch [-- Attachment #1: Type: text/plain, Size: 905 bytes --] On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote: > > this patch causes all my sparc64 emulations to stall during boot. It causes > all alpha emulations to crash with [1a] and [1b] when booting from a virtual > disk, and one of the xtensa emulations to crash with [2]. So I think your alpha emulation environment may be broken, because Michael Cree reports that it works for him on real hardware, but he does see the kernel unaligned count being high. But regardless, this is my current fairly minimal patch that I think should fix the unaligned issue, while still giving the behavior we want on x86. I hope Al can do something nicer, but I think this is "acceptable". I'm running this now on x86, and I verified that x86-32 code generation looks sane too, but it woudl be good to verify that this makes the alignment issue go away on other architectures. Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 4604 bytes --] arch/x86/include/asm/uaccess.h | 23 ++++++++++++++++++++++ fs/readdir.c | 44 ++---------------------------------------- include/linux/uaccess.h | 6 ++++-- 3 files changed, 29 insertions(+), 44 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 35c225ede0e4..61d93f062a36 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -734,5 +734,28 @@ do { \ if (unlikely(__gu_err)) goto err_label; \ } while (0) +/* + * We want the unsafe accessors to always be inlined and use + * the error labels - thus the macro games. + */ +#define unsafe_copy_loop(dst, src, len, type, label) \ + while (len >= sizeof(type)) { \ + unsafe_put_user(*(type *)src,(type __user *)dst,label); \ + dst += sizeof(type); \ + src += sizeof(type); \ + len -= sizeof(type); \ + } + +#define unsafe_copy_to_user(_dst,_src,_len,label) \ +do { \ + char __user *__ucu_dst = (_dst); \ + const char *__ucu_src = (_src); \ + size_t __ucu_len = (_len); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label); \ +} while (0) + #endif /* _ASM_X86_UACCESS_H */ diff --git a/fs/readdir.c b/fs/readdir.c index 19bea591c3f1..6e2623e57b2e 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -27,53 +27,13 @@ /* * Note the "unsafe_put_user() semantics: we goto a * label for errors. - * - * Also note how we use a "while()" loop here, even though - * only the biggest size needs to loop. The compiler (well, - * at least gcc) is smart enough to turn the smaller sizes - * into just if-statements, and this way we don't need to - * care whether 'u64' or 'u32' is the biggest size. - */ -#define unsafe_copy_loop(dst, src, len, type, label) \ - while (len >= sizeof(type)) { \ - unsafe_put_user(get_unaligned((type *)src), \ - (type __user *)dst, label); \ - dst += sizeof(type); \ - src += sizeof(type); \ - len -= sizeof(type); \ - } - -/* - * We avoid doing 64-bit copies on 32-bit architectures. They - * might be better, but the component names are mostly small, - * and the 64-bit cases can end up being much more complex and - * put much more register pressure on the code, so it's likely - * not worth the pain of unaligned accesses etc. - * - * So limit the copies to "unsigned long" size. I did verify - * that at least the x86-32 case is ok without this limiting, - * but I worry about random other legacy 32-bit cases that - * might not do as well. - */ -#define unsafe_copy_type(dst, src, len, type, label) do { \ - if (sizeof(type) <= sizeof(unsigned long)) \ - unsafe_copy_loop(dst, src, len, type, label); \ -} while (0) - -/* - * Copy the dirent name to user space, and NUL-terminate - * it. This should not be a function call, since we're doing - * the copy inside a "user_access_begin/end()" section. */ #define unsafe_copy_dirent_name(_dst, _src, _len, label) do { \ char __user *dst = (_dst); \ const char *src = (_src); \ size_t len = (_len); \ - unsafe_copy_type(dst, src, len, u64, label); \ - unsafe_copy_type(dst, src, len, u32, label); \ - unsafe_copy_type(dst, src, len, u16, label); \ - unsafe_copy_type(dst, src, len, u8, label); \ - unsafe_put_user(0, dst, label); \ + unsafe_put_user(0, dst+len, label); \ + unsafe_copy_to_user(dst, src, len, label); \ } while (0) diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index e47d0522a1f4..d4ee6e942562 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -355,8 +355,10 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count); #ifndef user_access_begin #define user_access_begin(ptr,len) access_ok(ptr, len) #define user_access_end() do { } while (0) -#define unsafe_get_user(x, ptr, err) do { if (unlikely(__get_user(x, ptr))) goto err; } while (0) -#define unsafe_put_user(x, ptr, err) do { if (unlikely(__put_user(x, ptr))) goto err; } while (0) +#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0) +#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e) +#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e) +#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(__copy_to_user(d,s,l),e) static inline unsigned long user_access_save(void) { return 0UL; } static inline void user_access_restore(unsigned long flags) { } #endif ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-07 19:21 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Linus Torvalds @ 2019-10-07 19:21 ` Linus Torvalds 2019-10-07 20:29 ` Guenter Roeck 2019-10-07 23:27 ` Guenter Roeck 2 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2019-10-07 19:21 UTC (permalink / raw) To: Guenter Roeck, Michael Cree Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch [-- Attachment #1: Type: text/plain, Size: 905 bytes --] On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote: > > this patch causes all my sparc64 emulations to stall during boot. It causes > all alpha emulations to crash with [1a] and [1b] when booting from a virtual > disk, and one of the xtensa emulations to crash with [2]. So I think your alpha emulation environment may be broken, because Michael Cree reports that it works for him on real hardware, but he does see the kernel unaligned count being high. But regardless, this is my current fairly minimal patch that I think should fix the unaligned issue, while still giving the behavior we want on x86. I hope Al can do something nicer, but I think this is "acceptable". I'm running this now on x86, and I verified that x86-32 code generation looks sane too, but it woudl be good to verify that this makes the alignment issue go away on other architectures. Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 4604 bytes --] arch/x86/include/asm/uaccess.h | 23 ++++++++++++++++++++++ fs/readdir.c | 44 ++---------------------------------------- include/linux/uaccess.h | 6 ++++-- 3 files changed, 29 insertions(+), 44 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 35c225ede0e4..61d93f062a36 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -734,5 +734,28 @@ do { \ if (unlikely(__gu_err)) goto err_label; \ } while (0) +/* + * We want the unsafe accessors to always be inlined and use + * the error labels - thus the macro games. + */ +#define unsafe_copy_loop(dst, src, len, type, label) \ + while (len >= sizeof(type)) { \ + unsafe_put_user(*(type *)src,(type __user *)dst,label); \ + dst += sizeof(type); \ + src += sizeof(type); \ + len -= sizeof(type); \ + } + +#define unsafe_copy_to_user(_dst,_src,_len,label) \ +do { \ + char __user *__ucu_dst = (_dst); \ + const char *__ucu_src = (_src); \ + size_t __ucu_len = (_len); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label); \ + unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label); \ +} while (0) + #endif /* _ASM_X86_UACCESS_H */ diff --git a/fs/readdir.c b/fs/readdir.c index 19bea591c3f1..6e2623e57b2e 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -27,53 +27,13 @@ /* * Note the "unsafe_put_user() semantics: we goto a * label for errors. - * - * Also note how we use a "while()" loop here, even though - * only the biggest size needs to loop. The compiler (well, - * at least gcc) is smart enough to turn the smaller sizes - * into just if-statements, and this way we don't need to - * care whether 'u64' or 'u32' is the biggest size. - */ -#define unsafe_copy_loop(dst, src, len, type, label) \ - while (len >= sizeof(type)) { \ - unsafe_put_user(get_unaligned((type *)src), \ - (type __user *)dst, label); \ - dst += sizeof(type); \ - src += sizeof(type); \ - len -= sizeof(type); \ - } - -/* - * We avoid doing 64-bit copies on 32-bit architectures. They - * might be better, but the component names are mostly small, - * and the 64-bit cases can end up being much more complex and - * put much more register pressure on the code, so it's likely - * not worth the pain of unaligned accesses etc. - * - * So limit the copies to "unsigned long" size. I did verify - * that at least the x86-32 case is ok without this limiting, - * but I worry about random other legacy 32-bit cases that - * might not do as well. - */ -#define unsafe_copy_type(dst, src, len, type, label) do { \ - if (sizeof(type) <= sizeof(unsigned long)) \ - unsafe_copy_loop(dst, src, len, type, label); \ -} while (0) - -/* - * Copy the dirent name to user space, and NUL-terminate - * it. This should not be a function call, since we're doing - * the copy inside a "user_access_begin/end()" section. */ #define unsafe_copy_dirent_name(_dst, _src, _len, label) do { \ char __user *dst = (_dst); \ const char *src = (_src); \ size_t len = (_len); \ - unsafe_copy_type(dst, src, len, u64, label); \ - unsafe_copy_type(dst, src, len, u32, label); \ - unsafe_copy_type(dst, src, len, u16, label); \ - unsafe_copy_type(dst, src, len, u8, label); \ - unsafe_put_user(0, dst, label); \ + unsafe_put_user(0, dst+len, label); \ + unsafe_copy_to_user(dst, src, len, label); \ } while (0) diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index e47d0522a1f4..d4ee6e942562 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -355,8 +355,10 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count); #ifndef user_access_begin #define user_access_begin(ptr,len) access_ok(ptr, len) #define user_access_end() do { } while (0) -#define unsafe_get_user(x, ptr, err) do { if (unlikely(__get_user(x, ptr))) goto err; } while (0) -#define unsafe_put_user(x, ptr, err) do { if (unlikely(__put_user(x, ptr))) goto err; } while (0) +#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0) +#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e) +#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e) +#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(__copy_to_user(d,s,l),e) static inline unsigned long user_access_save(void) { return 0UL; } static inline void user_access_restore(unsigned long flags) { } #endif ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-07 19:21 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Linus Torvalds 2019-10-07 19:21 ` Linus Torvalds @ 2019-10-07 20:29 ` Guenter Roeck 2019-10-07 20:29 ` Guenter Roeck 2019-10-07 23:27 ` Guenter Roeck 2 siblings, 1 reply; 20+ messages in thread From: Guenter Roeck @ 2019-10-07 20:29 UTC (permalink / raw) To: Linus Torvalds Cc: Michael Cree, Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch On Mon, Oct 07, 2019 at 12:21:25PM -0700, Linus Torvalds wrote: > On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote: > > > > this patch causes all my sparc64 emulations to stall during boot. It causes > > all alpha emulations to crash with [1a] and [1b] when booting from a virtual > > disk, and one of the xtensa emulations to crash with [2]. > > So I think your alpha emulation environment may be broken, because > Michael Cree reports that it works for him on real hardware, but he > does see the kernel unaligned count being high. > Yes, that possibility always exists, unfortunately. > But regardless, this is my current fairly minimal patch that I think > should fix the unaligned issue, while still giving the behavior we > want on x86. I hope Al can do something nicer, but I think this is > "acceptable". > > I'm running this now on x86, and I verified that x86-32 code > generation looks sane too, but it woudl be good to verify that this > makes the alignment issue go away on other architectures. > > Linus I started a complete test run with the patch applied. I'll let you know how it went after it is complete - it should be done in a couple of hours. Guenter ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-07 20:29 ` Guenter Roeck @ 2019-10-07 20:29 ` Guenter Roeck 0 siblings, 0 replies; 20+ messages in thread From: Guenter Roeck @ 2019-10-07 20:29 UTC (permalink / raw) To: Linus Torvalds Cc: Michael Cree, Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch On Mon, Oct 07, 2019 at 12:21:25PM -0700, Linus Torvalds wrote: > On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote: > > > > this patch causes all my sparc64 emulations to stall during boot. It causes > > all alpha emulations to crash with [1a] and [1b] when booting from a virtual > > disk, and one of the xtensa emulations to crash with [2]. > > So I think your alpha emulation environment may be broken, because > Michael Cree reports that it works for him on real hardware, but he > does see the kernel unaligned count being high. > Yes, that possibility always exists, unfortunately. > But regardless, this is my current fairly minimal patch that I think > should fix the unaligned issue, while still giving the behavior we > want on x86. I hope Al can do something nicer, but I think this is > "acceptable". > > I'm running this now on x86, and I verified that x86-32 code > generation looks sane too, but it woudl be good to verify that this > makes the alignment issue go away on other architectures. > > Linus I started a complete test run with the patch applied. I'll let you know how it went after it is complete - it should be done in a couple of hours. Guenter ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-07 19:21 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Linus Torvalds 2019-10-07 19:21 ` Linus Torvalds 2019-10-07 20:29 ` Guenter Roeck @ 2019-10-07 23:27 ` Guenter Roeck 2019-10-07 23:27 ` Guenter Roeck 2019-10-08 6:28 ` Geert Uytterhoeven 2 siblings, 2 replies; 20+ messages in thread From: Guenter Roeck @ 2019-10-07 23:27 UTC (permalink / raw) To: Linus Torvalds, Michael Cree Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch On 10/7/19 12:21 PM, Linus Torvalds wrote: > On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote: >> >> this patch causes all my sparc64 emulations to stall during boot. It causes >> all alpha emulations to crash with [1a] and [1b] when booting from a virtual >> disk, and one of the xtensa emulations to crash with [2]. > > So I think your alpha emulation environment may be broken, because > Michael Cree reports that it works for him on real hardware, but he > does see the kernel unaligned count being high. > > But regardless, this is my current fairly minimal patch that I think > should fix the unaligned issue, while still giving the behavior we > want on x86. I hope Al can do something nicer, but I think this is > "acceptable". > > I'm running this now on x86, and I verified that x86-32 code > generation looks sane too, but it woudl be good to verify that this > makes the alignment issue go away on other architectures. > > Linus > Test results look good. Feel free to add Tested-by: Guenter Roeck <linux@roeck-us.net> to your patch. Build results: total: 158 pass: 154 fail: 4 Failed builds: arm:allmodconfig m68k:defconfig mips:allmodconfig sparc64:allmodconfig Qemu test results: total: 391 pass: 390 fail: 1 Failed tests: ppc64:mpc8544ds:ppc64_e5500_defconfig:nosmp:initrd This is with "regulator: fixed: Prevent NULL pointer dereference when !CONFIG_OF" applied as well. The other failures are unrelated. arm: arch/arm/crypto/aes-ce-core.S:299: Error: selected processor does not support `movw ip,:lower16:.Lcts_permute_table' in ARM mode Fix is pending in crypto tree. m68k: c2p_iplan2.c:(.text+0x98): undefined reference to `c2p_unsupported' I don't know the status. mips: drivers/staging/octeon/ethernet-defines.h:30:38: error: 'CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE' undeclared and other similar errors. I don't know the status. ppc64: powerpc64-linux-ld: mm/page_alloc.o:(.toc+0x18): undefined reference to `node_reclaim_distance' Reported against offending patch earlier today. sparc64: drivers/watchdog/cpwd.c:500:19: error: 'compat_ptr_ioctl' undeclared here Oops. I'll need to look into that. Looks like the patch to use a new infrastructure made it into the kernel but the infrastructure itself didn't make it after all. Guenter ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-07 23:27 ` Guenter Roeck @ 2019-10-07 23:27 ` Guenter Roeck 2019-10-08 6:28 ` Geert Uytterhoeven 1 sibling, 0 replies; 20+ messages in thread From: Guenter Roeck @ 2019-10-07 23:27 UTC (permalink / raw) To: Linus Torvalds, Michael Cree Cc: Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch On 10/7/19 12:21 PM, Linus Torvalds wrote: > On Sun, Oct 6, 2019 at 3:20 PM Guenter Roeck <linux@roeck-us.net> wrote: >> >> this patch causes all my sparc64 emulations to stall during boot. It causes >> all alpha emulations to crash with [1a] and [1b] when booting from a virtual >> disk, and one of the xtensa emulations to crash with [2]. > > So I think your alpha emulation environment may be broken, because > Michael Cree reports that it works for him on real hardware, but he > does see the kernel unaligned count being high. > > But regardless, this is my current fairly minimal patch that I think > should fix the unaligned issue, while still giving the behavior we > want on x86. I hope Al can do something nicer, but I think this is > "acceptable". > > I'm running this now on x86, and I verified that x86-32 code > generation looks sane too, but it woudl be good to verify that this > makes the alignment issue go away on other architectures. > > Linus > Test results look good. Feel free to add Tested-by: Guenter Roeck <linux@roeck-us.net> to your patch. Build results: total: 158 pass: 154 fail: 4 Failed builds: arm:allmodconfig m68k:defconfig mips:allmodconfig sparc64:allmodconfig Qemu test results: total: 391 pass: 390 fail: 1 Failed tests: ppc64:mpc8544ds:ppc64_e5500_defconfig:nosmp:initrd This is with "regulator: fixed: Prevent NULL pointer dereference when !CONFIG_OF" applied as well. The other failures are unrelated. arm: arch/arm/crypto/aes-ce-core.S:299: Error: selected processor does not support `movw ip,:lower16:.Lcts_permute_table' in ARM mode Fix is pending in crypto tree. m68k: c2p_iplan2.c:(.text+0x98): undefined reference to `c2p_unsupported' I don't know the status. mips: drivers/staging/octeon/ethernet-defines.h:30:38: error: 'CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE' undeclared and other similar errors. I don't know the status. ppc64: powerpc64-linux-ld: mm/page_alloc.o:(.toc+0x18): undefined reference to `node_reclaim_distance' Reported against offending patch earlier today. sparc64: drivers/watchdog/cpwd.c:500:19: error: 'compat_ptr_ioctl' undeclared here Oops. I'll need to look into that. Looks like the patch to use a new infrastructure made it into the kernel but the infrastructure itself didn't make it after all. Guenter ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-07 23:27 ` Guenter Roeck 2019-10-07 23:27 ` Guenter Roeck @ 2019-10-08 6:28 ` Geert Uytterhoeven 2019-10-08 6:28 ` Geert Uytterhoeven 1 sibling, 1 reply; 20+ messages in thread From: Geert Uytterhoeven @ 2019-10-08 6:28 UTC (permalink / raw) To: Guenter Roeck Cc: Linus Torvalds, Michael Cree, Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch, Bartlomiej Zolnierkiewicz On Tue, Oct 8, 2019 at 1:30 AM Guenter Roeck <linux@roeck-us.net> wrote: > m68k: > > c2p_iplan2.c:(.text+0x98): undefined reference to `c2p_unsupported' > > I don't know the status. Fall-out from the (non)inline optimization. Patch available: https://lore.kernel.org/lkml/20190927094708.11563-1-geert@linux-m68k.org/ Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() 2019-10-08 6:28 ` Geert Uytterhoeven @ 2019-10-08 6:28 ` Geert Uytterhoeven 0 siblings, 0 replies; 20+ messages in thread From: Geert Uytterhoeven @ 2019-10-08 6:28 UTC (permalink / raw) To: Guenter Roeck Cc: Linus Torvalds, Michael Cree, Linux Kernel Mailing List, Alexander Viro, linux-fsdevel, linux-arch, Bartlomiej Zolnierkiewicz On Tue, Oct 8, 2019 at 1:30 AM Guenter Roeck <linux@roeck-us.net> wrote: > m68k: > > c2p_iplan2.c:(.text+0x98): undefined reference to `c2p_unsupported' > > I don't know the status. Fall-out from the (non)inline optimization. Patch available: https://lore.kernel.org/lkml/20190927094708.11563-1-geert@linux-m68k.org/ Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2019-10-16 12:24 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CAHk-=wgOWxqwqCFuP_Bw=Hxxf9njeHJs0OLNGNc63peNd=kRqw@mail.gmail.com>
[not found] ` <20191010195504.GI26530@ZenIV.linux.org.uk>
[not found] ` <CAHk-=wgWRQo0m7TUCK4T_J-3Vqte+p-FWzvT3CB1jJHgX-KctA@mail.gmail.com>
[not found] ` <20191011001104.GJ26530@ZenIV.linux.org.uk>
[not found] ` <CAHk-=wgg3jzkk-jObm1FLVYGS8JCTiKppEnA00_QX7Wsm5ieLQ@mail.gmail.com>
[not found] ` <20191013181333.GK26530@ZenIV.linux.org.uk>
[not found] ` <CAHk-=wgrWGyACBM8N8KP7Pu_2VopuzM4A12yQz6Eo=X2Jpwzcw@mail.gmail.com>
[not found] ` <20191013191050.GL26530@ZenIV.linux.org.uk>
[not found] ` <CAHk-=wjJNE9hOKuatqh6SFf4nd65LG4ZR3gQSgg+rjSpVxe89w@mail.gmail.com>
[not found] ` <20191013195949.GM26530@ZenIV.linux.org.uk>
2019-10-15 18:08 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Al Viro
2019-10-15 18:08 ` Al Viro
2019-10-15 19:00 ` Linus Torvalds
2019-10-15 19:00 ` Linus Torvalds
2019-10-15 19:40 ` Al Viro
2019-10-15 19:40 ` Al Viro
2019-10-15 20:18 ` Al Viro
2019-10-15 20:18 ` Al Viro
2019-10-16 12:12 ` [RFC] change of calling conventions for arch_futex_atomic_op_inuser() Al Viro
2019-10-16 12:12 ` Al Viro
2019-10-16 12:24 ` Thomas Gleixner
2019-10-16 12:24 ` Thomas Gleixner
[not found] <20191006222046.GA18027@roeck-us.net>
2019-10-07 19:21 ` [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user() Linus Torvalds
2019-10-07 19:21 ` Linus Torvalds
2019-10-07 20:29 ` Guenter Roeck
2019-10-07 20:29 ` Guenter Roeck
2019-10-07 23:27 ` Guenter Roeck
2019-10-07 23:27 ` Guenter Roeck
2019-10-08 6:28 ` Geert Uytterhoeven
2019-10-08 6:28 ` Geert Uytterhoeven
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).