From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4CC1197C.5040103@hitachi.com>
Date: Fri, 22 Oct 2010 13:56:28 +0900
From: Masami Hiramatsu
Organization: Systems Development Lab., Hitachi, Ltd., Japan
To: Peter Zijlstra
Cc: Steven Rostedt, Jason Baron, Ingo Molnar, LKML, Andrew Morton,
 Frederic Weisbecker, Thomas Gleixner, "H. Peter Anvin",
 Arnaldo Carvalho de Melo, 2nddept-manager@sdl.hitachi.co.jp
Subject: Re: [PATCH][GIT PULL] tracing: Fix compile issue for trace_sched_wakeup.c
References: <1287508282.16971.386.camel@gandalf.stny.rr.com>
 <20101019184111.GA17266@elte.hu> <20101020154045.GA18353@elte.hu>
 <20101020164324.GC7348@redhat.com> <4CBFAC70.30602@hitachi.com>
 <1287645744.3488.57.camel@twins>
 <1287658862.16971.569.camel@gandalf.stny.rr.com>
 <1287659008.3488.102.camel@twins>
In-Reply-To: <1287659008.3488.102.camel@twins>

(2010/10/21 20:03), Peter Zijlstra wrote:
> On Thu, 2010-10-21 at 07:01 -0400, Steven Rostedt wrote:
>> On Thu, 2010-10-21 at 09:22 +0200, Peter Zijlstra wrote:
>>> On Thu, 2010-10-21 at 11:58 +0900, Masami Hiramatsu wrote:
>>>
>>>> It seems there can be a bug in stop_machine() routine under
>>>> heavy use. usually that is called just once at a time, but jump
>>>> label and optprobe might call it heavily (thousands times?).
>>>> So some racy situation can be happen easily.
>>>
>>> There are people doing hotplug stress testing, that too results in heavy
>>> stop_machine usage.
>>
>> But with hotplug, isn't there a bit more time between stop machine
>> calls? That is, you need to do a bit of work to bring down or up a CPU,
>> and that will slow down the number of stop machine calls together.
>>
>> Here, we do a simple change and call stop machine() several times.
>>
>> Although, I agree, I do not think the bug is in stop machine itself, but
>> perhaps the way we are using it might have some niche anomaly that we
>> are hitting.
>
> Possibly, but wouldn't it make sense to batch up the work and simply
> call stop_machine only once? I mean, if you already know you're going to
> do this...
>

Yeah, here is what I tried:

http://sourceware.org/ml/systemtap/2010-q2/msg00294.html

I agree that the crash will just disappear with this API, but the bug
will only be hidden; it will still remain inside the kernel.

Anyway, this batch patching is needed from a performance viewpoint too,
so I'll rework it.

Thank you,

-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
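
For illustration only, the batching idea discussed in this thread (queue up
the pending text modifications, then stop the machine once for the whole
batch instead of once per modification) might be sketched in user-space C as
below. Everything here is an assumption made for the example:
fake_stop_machine(), queue_patch(), flush_patches() and struct patch_req are
hypothetical names, not the kernel's stop_machine() implementation and not
the API from the patch linked above. The sketch only counts how often the
expensive "stop" step runs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 64

/* Hypothetical request: write one byte at one address. */
struct patch_req {
	unsigned char *addr;
	unsigned char new_byte;
};

static struct patch_req pending[MAX_PENDING];
static size_t npending;
static unsigned long stop_calls;	/* how often the "machine" was stopped */

/*
 * Hypothetical stand-in for stop_machine(): it only counts the call and
 * runs fn(data) directly. No CPUs are actually stopped here.
 */
static int fake_stop_machine(int (*fn)(void *), void *data)
{
	stop_calls++;
	return fn(data);
}

/* Apply every queued request in one pass, then empty the queue. */
static int apply_pending(void *unused)
{
	size_t i;

	(void)unused;
	for (i = 0; i < npending; i++)
		*pending[i].addr = pending[i].new_byte;
	npending = 0;
	return 0;
}

/* Queue one modification without stopping anything. Returns -1 if full. */
static int queue_patch(unsigned char *addr, unsigned char new_byte)
{
	assert(addr != NULL);
	if (npending == MAX_PENDING)
		return -1;
	pending[npending].addr = addr;
	pending[npending].new_byte = new_byte;
	npending++;
	return 0;
}

/* Stop once for the whole batch; do nothing if the queue is empty. */
static int flush_patches(void)
{
	if (npending == 0)
		return 0;
	return fake_stop_machine(apply_pending, NULL);
}
```

With this shape, a caller that queues N modifications and then calls
flush_patches() pays for one stop instead of N. As noted in the reply above,
that reduces the load but would not by itself explain or fix a race that
shows up under heavy use.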