From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753763Ab1A0W0z (ORCPT ); Thu, 27 Jan 2011 17:26:55 -0500 Received: from mx1.redhat.com ([209.132.183.28]:30298 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752195Ab1A0W0y (ORCPT ); Thu, 27 Jan 2011 17:26:54 -0500 Date: Thu, 27 Jan 2011 23:18:56 +0100 From: Oleg Nesterov To: Peter Zijlstra Cc: Frederic Weisbecker , Ingo Molnar , Alan Stern , Arnaldo Carvalho de Melo , Paul Mackerras , Prasad , Roland McGrath , linux-kernel@vger.kernel.org Subject: Re: Q: perf_install_in_context/perf_event_enable are racy? Message-ID: <20110127221856.GA10539@redhat.com> References: <20110121130323.GA12900@elte.hu> <1295617185.28776.273.camel@laptop> <20110121142616.GA31165@redhat.com> <1295622304.28776.293.camel@laptop> <20110121204014.GA2870@nowhere> <20110124114234.GA12166@redhat.com> <20110126175322.GA28617@redhat.com> <1296134077.15234.161.camel@laptop> <20110127165712.GC25060@redhat.com> <1296148294.15234.242.camel@laptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1296148294.15234.242.camel@laptop> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/27, Peter Zijlstra wrote: > > On Thu, 2011-01-27 at 17:57 +0100, Oleg Nesterov wrote: > > > > > With, however, things are more interesting. 2 seems to be adequately > > > covered by the patch I just send, the IPI will bail and the next > > > sched-in of the relevant task will pick matters up. 1 otoh doesn't seem > > > covered, the IPI will bail, leaving us stranded. > > > > Hmm, yes... Strangely, I missed that when I was thinking about in_ctxsw. > > > > Perhaps, we can change task_oncpu_function_call() so that it returns > > -EAGAIN in case it hits in_ctxsw != 0? If the caller sees -EAGAIN, it > > should always retry even if !ctx->is_active. > > That would be very easy to do, we can pass a return value through the > task_function_call structure. Yes. Perhaps task_oncpu_function_call() should retry itself to simplify the callers. I wonder if we should also retry if rq->curr != p... Oh. You know, I am starting to think I will never understand this. Forget about the problems with __ARCH_WANT_INTERRUPT_ON_CTXSW. perf_install_in_context() does task_oncpu_function_call() and then // ctx->is_active == 0 /* * The lock prevents that this context is scheduled in so we * can add the event safely, if it the call above did not * succeed. */ if (list_empty(&event->group_entry)) add_event_to_ctx(event, ctx); This assumes that the task is not running. Why? Because (I guess) we assume that either task_oncpu_function_call() should see task_curr() == T, or if the task becomes running after that it should see the new ->perf_event_ctxp[ctxn] != NULL. And I do not see how we can prove this. If find_get_context() sets the new context, the only guarantee we have is: perf_event_exit_task() can't miss this context. The task, however, can be scheduled in and miss the new value in perf_event_ctxp[]. And, task_oncpu_function_call() can equally miss rq->curr == task. But. I think this all falls into the absolutely theoretical category, and in the worst case nothing really bad can happen, just event_sched_in() will be delayed until this task reshedules. So, I think your patch should fix all problems with schedule. Just it needs the couple of changes we already discussed: - finish_task_switch() should clear rq->in_ctxsw before local_irq_enable() - task_oncpu_function_call() (or its callers) should always retry the "if (task_curr(p))" code if ->in_ctxsw is true. If you think we have other problems here please don't tell me, I already got lost ;) Oleg.