From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753168AbeAQUGC (ORCPT ); Wed, 17 Jan 2018 15:06:02 -0500
Received: from mail-qt0-f193.google.com ([209.85.216.193]:44743 "EHLO
	mail-qt0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753129AbeAQUGA (ORCPT );
	Wed, 17 Jan 2018 15:06:00 -0500
X-Google-Smtp-Source: ACJfBot2LCaQEDIUNVzyAyMLhYdjD945PiVqTpVTpap4JyFNCBibjDHmSPZwR3aHBRKnMRUjxEim7w==
Date: Wed, 17 Jan 2018 12:05:51 -0800
From: Tejun Heo
To: Steven Rostedt
Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	akpm@linux-foundation.org, linux-mm@kvack.org, Cong Wang,
	Dave Hansen, Johannes Weiner, Mel Gorman, Michal Hocko,
	Vlastimil Babka, Peter Zijlstra, Linus Torvalds, Jan Kara,
	Mathieu Desnoyers, Tetsuo Handa, rostedt@rostedt.homelinux.com,
	Byungchul Park, Pavel Machek, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Message-ID: <20180117200551.GW3460072@devbig577.frc2.facebook.com>
References: <20180111045817.GA494@jagdpanzerIV>
	<20180111093435.GA24497@linux.suse>
	<20180111103845.GB477@jagdpanzerIV>
	<20180111112908.50de440a@vmware.local.home>
	<20180111203057.5b1a8f8f@gandalf.local.home>
	<20180111215547.2f66a23a@gandalf.local.home>
	<20180116194456.GS3460072@devbig577.frc2.facebook.com>
	<20180117091208.ezvuhumnsarz5thh@pathway.suse.cz>
	<20180117151509.GT3460072@devbig577.frc2.facebook.com>
	<20180117121251.7283a56e@gandalf.local.home>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180117121251.7283a56e@gandalf.local.home>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Steven.

On Wed, Jan 17, 2018 at 12:12:51PM -0500, Steven Rostedt wrote:
> From what I gathered, you said an OOM would trigger, and then the
> network console would not be able to allocate memory and it would
> trigger a printk too, and cause an infinite amount of printks.

Yeah, it falls into a back-and-forth loop between the OOM code and
the netconsole path.

> This could very well be a great place to force offloading. If a printk
> is called from within a printk, at the same context (normal, softirq,
> irq or NMI), then we should trigger the offloading.

I was thinking more of a timeout-based approach (i.e. if stuck for
longer than X or for more than X messages, offload), but if the local
feedback loop is the only thing we're missing after your improvements,
detecting that specific condition definitely works and is likely a
better approach in terms of message delivery guarantee.

> +static void kick_offload_thread(void)
> +{
> +	/*
> +	 * Consoles are triggering printks, offload the printks
> +	 * to another CPU to hopefully avoid a lockup.
> +	 */
> +}
...
> @@ -2333,6 +2390,7 @@ void console_unlock(void)
>  
>  	for (;;) {
>  		struct printk_log *msg;
> +		bool offload;
>  		size_t ext_len = 0;
>  		size_t len;
>  
> @@ -2393,15 +2451,20 @@ void console_unlock(void)
>  		 * waiter waiting to take over.
>  		 */
>  		console_lock_spinning_enable();
> +		offload = recursion_check_start();
>  
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(ext_text, ext_len, text, len);
>  		start_critical_timings();
>  
> +		recursion_check_finish(offload);
> +
>  		if (console_lock_spinning_disable_and_check()) {
>  			printk_safe_exit_irqrestore(flags);
>  			return;
>  		}
> +		if (offload)
> +			kick_offload_thread();

Yeah, something like this would definitely work.

Thanks a lot.

-- 
tejun
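
For readers following the thread, the feedback loop and the
recursion-triggered offload discussed above can be modelled in userspace.
This is only a rough sketch under stated assumptions, not the code from
the quoted patch: the bodies of recursion_check_start() and
recursion_check_finish() are not shown in the mail, so every demo_* name,
the per-context flag arrays and the counters below are hypothetical. The
console stand-in mimics netconsole under OOM by logging one new message
on every write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context ids; the kernel would derive the context itself. */
enum demo_ctx { DEMO_NORMAL, DEMO_SOFTIRQ, DEMO_IRQ, DEMO_NMI, DEMO_NR_CTX };

int pending;                    /* messages waiting in the log buffer */
int written;                    /* messages pushed to the console */
int offload_kicks;              /* times the flusher handed off its work */
bool in_console[DEMO_NR_CTX];   /* a console write is running in this context */
bool recursed[DEMO_NR_CTX];     /* a printk arrived during that console write */

/* Queue a message; note when it was issued from inside a console write. */
void demo_printk(enum demo_ctx ctx)
{
	pending++;
	if (in_console[ctx])
		recursed[ctx] = true;
}

/* Stand-in for a console whose every write fails and logs a warning. */
void demo_console_write(enum demo_ctx ctx)
{
	written++;
	demo_printk(ctx);
}

/* Assumption: a real version would wake a thread on another CPU. */
void demo_kick_offload_thread(void)
{
	offload_kicks++;
}

/* Flush pending messages, but stop as soon as the console feeds the log. */
void demo_console_unlock(enum demo_ctx ctx)
{
	assert(!in_console[ctx]);	/* the flusher itself is not re-entered */

	while (pending > 0) {
		pending--;
		recursed[ctx] = false;
		in_console[ctx] = true;
		demo_console_write(ctx);
		in_console[ctx] = false;

		if (recursed[ctx]) {
			/* Without this exit the loop would never drain. */
			demo_kick_offload_thread();
			return;
		}
	}
}
```

In this model, calling demo_printk(DEMO_NORMAL) followed by
demo_console_unlock(DEMO_NORMAL) writes one message, detects the printk
issued by the console, kicks the offload once and returns with one
message still pending for whoever takes over. Without the recursed check
the while loop would spin forever, which is the lockup described above.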