Date: Wed, 17 Aug 2016 16:57:09 +0200
From: Peter Zijlstra
To: Steven Rostedt
Cc: Thomas Gleixner, Ingo Molnar, Alexander Shishkin,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] ftrace / perf 'recursion'
Message-ID: <20160817145709.GS30192@twins.programming.kicks-ass.net>
References: <20160817091953.GH7141@twins.programming.kicks-ass.net>
	<20160817103306.GI7141@twins.programming.kicks-ass.net>
	<20160817105716.GJ7141@twins.programming.kicks-ass.net>
	<20160817094932.22a9e768@gandalf.local.home>
	<20160817140612.GR30192@twins.programming.kicks-ass.net>
	<20160817102559.726742bf@gandalf.local.home>
In-Reply-To: <20160817102559.726742bf@gandalf.local.home>

On Wed, Aug 17, 2016 at 10:25:59AM -0400, Steven Rostedt wrote:
> > > Also, it will prevent any tracing of NMIs that occur in there.
> >
> > It should not, see how I only mark the IRQ bit, not the NMI bit.
>
> Ah, I didn't look deep at what you set there. Maybe that would work.
> Still pretty hacky.

Sure :-)

> > > I would really like to keep this fix within perf if possible. If
> > > anything, the flag should just tell the perf function handler not to
> > > trace, this shouldn't stop all function handlers.
> >
> > Well, my thinking was that there's a reason most of irq_work is already
> > notrace. kernel/irq_work.c has CC_FLAGS_FTRACE removed. That seems to
> > suggest that tracing irq_work is a problem.
>
> Well, you were the one that added that ;-)

OK, I suppose I can do the same for perf only, which is basically the
first patch on this thread. And then remove the notrace muck for
irq_work.c.

> Are you calling a signal to userspace via the irq work? Maybe we should
> have a kernel thread that does that instead. That way, the irq works
> can be suspended until the kernel thread gets to run. Then even though
> the waking of the thread will cause more events, it will be spaced out
> enough not to cause an irq work storm.

Nah, that'd wreck the desired semantics.

We could maybe use a task_work for the signal cruft though, and only
generate the signal on the return to userspace. But I'm not sure that
will cure the problem. We'd still need the irq_work to wake tasks stuck
in poll() and friends.

And once we're over the watermark, every new event will trigger that
wakeup, and the wakeup will generate a new event etc..
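
The feedback loop described above (over the watermark, each event queues
a wakeup, and a traced wakeup is itself an event) can be illustrated with
a toy userspace model. This is only a sketch under stated assumptions: it
is not kernel code, and struct perf_model, model_event(),
model_run_irq_work() and model_drain() are hypothetical names invented
for the illustration. The watermark check is simplified to a plain
counter comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the storm; none of this is kernel API. */
struct perf_model {
	unsigned long events;      /* events recorded so far */
	unsigned long watermark;   /* wakeup threshold (simplified) */
	unsigned long wakeups;     /* wakeups performed */
	bool irq_work_pending;     /* a deferred wakeup is queued */
	bool wakeup_is_traced;     /* does the wakeup path emit an event? */
};

/* Record one event; once over the watermark, every event queues a wakeup. */
static void model_event(struct perf_model *m)
{
	m->events++;
	if (m->events > m->watermark)
		m->irq_work_pending = true;
}

/* Run the deferred wakeup. If the wakeup is traced, it records an event. */
static void model_run_irq_work(struct perf_model *m)
{
	m->irq_work_pending = false;
	m->wakeups++;
	if (m->wakeup_is_traced)
		model_event(m);    /* this re-queues the wakeup */
}

/*
 * Run pending wakeups until none is queued, giving up after max_rounds.
 * Returns the number of wakeups that were run.
 */
static unsigned long model_drain(struct perf_model *m, unsigned long max_rounds)
{
	unsigned long rounds = 0;

	assert(max_rounds > 0);
	while (m->irq_work_pending && rounds < max_rounds) {
		model_run_irq_work(m);
		rounds++;
	}
	return rounds;
}
```

In this model, with wakeup_is_traced set, model_drain() never settles
and always runs into max_rounds with a wakeup still pending. With it
clear (the wakeup path not generating events), a single round is enough.
The model says nothing about which mechanism should suppress the event;
it only shows why the wakeup has to stay out of the event stream.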