Date: Sat, 21 Jul 2007 18:18:14 +0400
From: Oleg Nesterov
To: Ingo Molnar
Cc: Andrew Morton, Alexey Kuznetsov, Eric Dumazet, Steven Rostedt,
	Thomas Gleixner, Ulrich Drepper, linux-kernel@vger.kernel.org,
	Arnaldo Carvalho de Melo
Subject: Re: [PATCH] pi-futex: set PF_EXITING without taking ->pi_lock
Message-ID: <20070721141814.GA1013@tv-sign.ru>
References: <20070721115712.GA871@tv-sign.ru> <20070721123159.GB1769@elte.hu>
In-Reply-To: <20070721123159.GB1769@elte.hu>

On 07/21, Ingo Molnar wrote:
>
> * Oleg Nesterov wrote:
>
> > It is a bit annoying that do_exit() takes ->pi_lock to set PF_EXITING.
> > All we need is to synchronize with lookup_pi_state() which saw this task
> > without PF_EXITING under ->pi_lock.
> >
> > Change do_exit() to use spin_unlock_wait().
> >
> > Signed-off-by: Oleg Nesterov
>
> Acked-by: Ingo Molnar

Thanks!

> > -	spin_lock_irq(&tsk->pi_lock);
> > -	tsk->flags |= PF_EXITING;
> > -	spin_unlock_irq(&tsk->pi_lock);
> > +	smp_mb();
> > +	spin_unlock_wait(&tsk->pi_lock);
>
> hm, isn't spin_unlock_wait() an SMP barrier in itself?

No, it is only a compiler barrier() due to cpu_relax().

> (if not then it should be.)

I think you are right, I can't imagine a valid usage of
spin_unlock_wait() without a barrier.
For example, from net/dccp/ccid.c:

	static void ccids_write_lock(void)
	{
		spin_lock(&ccids_lock);
		while (atomic_read(&ccids_lockct) != 0) {
			spin_unlock(&ccids_lock);
			yield();
			spin_lock(&ccids_lock);
		}
	}

	static inline void ccids_read_lock(void)
	{
		atomic_inc(&ccids_lockct);
		spin_unlock_wait(&ccids_lock);
	}

This looks racy: in theory, the atomic_inc() and the spin_unlock_wait()
could be re-ordered. However, in this particular case we have an
"optimized" smp_mb__after_atomic_inc(), so perhaps it is good that the
caller can choose the "right" barrier by hand.

Oleg.