From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Chris Wright <chrisw@sous-sol.org>
Cc: linux-kernel@vger.kernel.org, stable@kernel.org, jejb@kernel.org,
	Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Domenico Andreoli <cavokz@gmail.com>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@elte.hu>,
	Lennert Buytenhek <buytenh@wantstofly.org>,
	Riku Voipio <riku.voipio@movial.fi>,
	Greg Kroah-Hartman <gregkh@suse.de>
Subject: Re: [patch 33/76] futex: runtime enable pi and robust functionality
Date: Thu, 27 Mar 2008 14:28:33 +1100
Message-ID: <1206588514.6926.76.camel@pasglop>
In-Reply-To: <20080321224400.674465387@sous-sol.org>


On Fri, 2008-03-21 at 15:43 -0700, Chris Wright wrote:
> plain text document attachment
> (futex-runtime-enable-pi-and-robust-functionality.patch)
> -stable review patch.  If anyone has any objections, please let us know.
> ---------------------
> 
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Not all architectures implement futex_atomic_cmpxchg_inatomic().  The default
> implementation returns -ENOSYS, which is currently not handled inside of the
> futex guts.

I think that breaks some embedded PowerPC platforms. It's also very
fishy: this early in boot we haven't activated an mm yet, so it's like
trying to access user memory from a kernel thread. It might happen to
give you -EFAULT on x86, but you are making pretty big assumptions about
how the low-level mm works on other architectures.

Ben.

> Futex PI calls and robust list exits with a held futex result in an endless
> loop in the futex code on architectures which have no support.
> 
> Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would
> add a fair amount of extra if/else constructs to the already complex code.  It
> is also not possible to disable the robust feature before user space tries to
> register robust lists.
> 
> Compile time disabling is not a good idea either, as there are already
> architectures with runtime detection of futex_atomic_cmpxchg_inatomic support.
> 
> Detect the functionality at runtime instead by calling
> cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization
> code.  This is guaranteed to fail, but the call of
> futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled.
> 
> On architectures which use the asm-generic implementation or have runtime
> CPU feature detection, a -ENOSYS return value disables the PI/robust features.
> 
> On architectures with a working implementation the call returns -EFAULT and
> the PI/robust features are enabled.
> 
> The relevant syscalls return -ENOSYS and the robust list exit code is blocked,
> when the detection fails.
> 
> Fixes http://lkml.org/lkml/2008/2/11/149
> Originally reported by: Lennert Buytenhek
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Ingo Molnar <mingo@elte.hu>
> Cc: Lennert Buytenhek <buytenh@wantstofly.org>
> Cc: Riku Voipio <riku.voipio@movial.fi>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> ---
>  include/linux/futex.h |    1 +
>  kernel/futex.c        |   38 ++++++++++++++++++++++++++++++++++----
>  kernel/futex_compat.c |    9 +++++++++
>  3 files changed, 44 insertions(+), 4 deletions(-)
> 
> --- a/include/linux/futex.h
> +++ b/include/linux/futex.h
> @@ -153,6 +153,7 @@ union futex_key {
>  #ifdef CONFIG_FUTEX
>  extern void exit_robust_list(struct task_struct *curr);
>  extern void exit_pi_state_list(struct task_struct *curr);
> +extern int futex_cmpxchg_enabled;
>  #else
>  static inline void exit_robust_list(struct task_struct *curr)
>  {
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -60,6 +60,8 @@
>  
>  #include "rtmutex_common.h"
>  
> +int __read_mostly futex_cmpxchg_enabled;
> +
>  #define FUTEX_HASHBITS (CONFIG_BASE_SMALL ? 4 : 8)
>  
>  /*
> @@ -466,6 +468,8 @@ void exit_pi_state_list(struct task_stru
>  	struct futex_hash_bucket *hb;
>  	union futex_key key;
>  
> +	if (!futex_cmpxchg_enabled)
> +		return;
>  	/*
>  	 * We are a ZOMBIE and nobody can enqueue itself on
>  	 * pi_state_list anymore, but we have to be careful
> @@ -1854,6 +1858,8 @@ asmlinkage long
>  sys_set_robust_list(struct robust_list_head __user *head,
>  		    size_t len)
>  {
> +	if (!futex_cmpxchg_enabled)
> +		return -ENOSYS;
>  	/*
>  	 * The kernel knows only one size for now:
>  	 */
> @@ -1878,6 +1884,9 @@ sys_get_robust_list(int pid, struct robu
>  	struct robust_list_head __user *head;
>  	unsigned long ret;
>  
> +	if (!futex_cmpxchg_enabled)
> +		return -ENOSYS;
> +
>  	if (!pid)
>  		head = current->robust_list;
>  	else {
> @@ -1980,6 +1989,9 @@ void exit_robust_list(struct task_struct
>  	unsigned long futex_offset;
>  	int rc;
>  
> +	if (!futex_cmpxchg_enabled)
> +		return;
> +
>  	/*
>  	 * Fetch the list head (which was registered earlier, via
>  	 * sys_set_robust_list()):
> @@ -2034,7 +2046,7 @@ void exit_robust_list(struct task_struct
>  long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
>  		u32 __user *uaddr2, u32 val2, u32 val3)
>  {
> -	int ret;
> +	int ret = -ENOSYS;
>  	int cmd = op & FUTEX_CMD_MASK;
>  	struct rw_semaphore *fshared = NULL;
>  
> @@ -2062,13 +2074,16 @@ long do_futex(u32 __user *uaddr, int op,
>  		ret = futex_wake_op(uaddr, fshared, uaddr2, val, val2, val3);
>  		break;
>  	case FUTEX_LOCK_PI:
> -		ret = futex_lock_pi(uaddr, fshared, val, timeout, 0);
> +		if (futex_cmpxchg_enabled)
> +			ret = futex_lock_pi(uaddr, fshared, val, timeout, 0);
>  		break;
>  	case FUTEX_UNLOCK_PI:
> -		ret = futex_unlock_pi(uaddr, fshared);
> +		if (futex_cmpxchg_enabled)
> +			ret = futex_unlock_pi(uaddr, fshared);
>  		break;
>  	case FUTEX_TRYLOCK_PI:
> -		ret = futex_lock_pi(uaddr, fshared, 0, timeout, 1);
> +		if (futex_cmpxchg_enabled)
> +			ret = futex_lock_pi(uaddr, fshared, 0, timeout, 1);
>  		break;
>  	default:
>  		ret = -ENOSYS;
> @@ -2123,8 +2138,23 @@ static struct file_system_type futex_fs_
>  
>  static int __init init(void)
>  {
> +	u32 curval;
>  	int i;
>  
> +	/*
> +	 * This will fail and we want it. Some arch implementations do
> +	 * runtime detection of the futex_atomic_cmpxchg_inatomic()
> +	 * functionality. We want to know that before we call in any
> +	 * of the complex code paths. Also we want to prevent
> +	 * registration of robust lists in that case. NULL is
> +	 * guaranteed to fault and we get -EFAULT on functional
> +	 * implementation, the non functional ones will return
> +	 * -ENOSYS.
> +	 */
> +	curval = cmpxchg_futex_value_locked(NULL, 0, 0);
> +	if (curval == -EFAULT)
> +		futex_cmpxchg_enabled = 1;
> +
>  	for (i = 0; i < ARRAY_SIZE(futex_queues); i++) {
>  		plist_head_init(&futex_queues[i].chain, &futex_queues[i].lock);
>  		spin_lock_init(&futex_queues[i].lock);
> --- a/kernel/futex_compat.c
> +++ b/kernel/futex_compat.c
> @@ -54,6 +54,9 @@ void compat_exit_robust_list(struct task
>  	compat_long_t futex_offset;
>  	int rc;
>  
> +	if (!futex_cmpxchg_enabled)
> +		return;
> +
>  	/*
>  	 * Fetch the list head (which was registered earlier, via
>  	 * sys_set_robust_list()):
> @@ -115,6 +118,9 @@ asmlinkage long
>  compat_sys_set_robust_list(struct compat_robust_list_head __user *head,
>  			   compat_size_t len)
>  {
> +	if (!futex_cmpxchg_enabled)
> +		return -ENOSYS;
> +
>  	if (unlikely(len != sizeof(*head)))
>  		return -EINVAL;
>  
> @@ -130,6 +136,9 @@ compat_sys_get_robust_list(int pid, comp
>  	struct compat_robust_list_head __user *head;
>  	unsigned long ret;
>  
> +	if (!futex_cmpxchg_enabled)
> +		return -ENOSYS;
> +
>  	if (!pid)
>  		head = current->compat_robust_list;
>  	else {
> 
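
For readers following the thread, here is a minimal userspace sketch of the
probe-once pattern the quoted patch describes: call the compare-and-exchange
helper with a NULL address at init, treat -EFAULT as "implemented" and -ENOSYS
as "not implemented", and gate the dependent entry points on the result. All
names below (cmpxchg_fn, cmpxchg_unsupported, cmpxchg_supported, futex_probe,
sketch_set_robust_list) are hypothetical and are not kernel symbols. The
"supported" variant checks for NULL explicitly, so the sketch does not model
the concern raised above, namely that in a real kernel the result of a NULL
access at init time depends on how each architecture's low-level mm behaves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an arch-specific cmpxchg on a user address. */
typedef int (*cmpxchg_fn)(uint32_t *uaddr, uint32_t oldval, uint32_t newval);

/* Models the generic fallback: the operation is not implemented at all. */
int cmpxchg_unsupported(uint32_t *uaddr, uint32_t oldval, uint32_t newval)
{
	(void)uaddr;
	(void)oldval;
	(void)newval;
	return -ENOSYS;
}

/*
 * Models a working implementation. Assumption: a NULL address is reported
 * as -EFAULT. Here that is an explicit check; in a kernel it would come
 * from the fault handling path instead.
 */
int cmpxchg_supported(uint32_t *uaddr, uint32_t oldval, uint32_t newval)
{
	uint32_t cur;

	if (uaddr == NULL)
		return -EFAULT;

	cur = *uaddr;
	if (cur == oldval)
		*uaddr = newval;
	return (int)cur;	/* previous value, as in the quoted code */
}

/* Set once by the probe, read by every gated entry point. */
int cmpxchg_enabled;

/* Probe with NULL: only a functional implementation reports -EFAULT. */
void futex_probe(cmpxchg_fn arch_cmpxchg)
{
	int ret = arch_cmpxchg(NULL, 0, 0);

	cmpxchg_enabled = (ret == -EFAULT);
}

/* Gated entry point: refuse up front when the probe found no support. */
long sketch_set_robust_list(void)
{
	if (!cmpxchg_enabled)
		return -ENOSYS;
	return 0;
}
```

Calling futex_probe(cmpxchg_unsupported) leaves cmpxchg_enabled at 0, so
sketch_set_robust_list() returns -ENOSYS. Calling
futex_probe(cmpxchg_supported) sets it to 1 and the entry point proceeds. The
single flag keeps the complex code paths free of per-call error handling,
which is the trade-off the commit message argues for.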


