From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752845AbcJACY2 (ORCPT ); Fri, 30 Sep 2016 22:24:28 -0400 Received: from mail-pf0-f194.google.com ([209.85.192.194]:33760 "EHLO mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750864AbcJACYT (ORCPT ); Fri, 30 Sep 2016 22:24:19 -0400 Date: Sat, 1 Oct 2016 11:24:15 +0900 From: Sergey Senozhatsky To: Petr Mladek , Jan Kara Cc: Andrew Morton , Tejun Heo , Calvin Owens , Thomas Gleixner , Mel Gorman , Steven Rostedt , linux-kernel@vger.kernel.org, Sergey Senozhatsky , Sergey Senozhatsky Subject: Re: [RFC][PATCHv2 3/7] printk: introduce per-cpu alt_print seq buffer Message-ID: <20161001022415.GA527@swordfish> References: <20160930151758.8965-1-sergey.senozhatsky@gmail.com> <20160930151758.8965-4-sergey.senozhatsky@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160930151758.8965-4-sergey.senozhatsky@gmail.com> User-Agent: Mutt/1.7.0 (2016-08-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On (10/01/16 00:17), Sergey Senozhatsky wrote: [..] > +void alt_printk_enter(void) > +{ > + unsigned long flags; - > + local_irq_save(flags); > + if (!(this_cpu_read(alt_printk_ctx) & ALT_PRINTK_CONTEXT_MASK)) > + this_cpu_write(alt_printk_irq_flags, flags); > + this_cpu_inc(alt_printk_ctx); > +} [..] > +void alt_printk_exit(void) > +{ > + this_cpu_dec(alt_printk_ctx); > + if (!(this_cpu_read(alt_printk_ctx) & ALT_PRINTK_CONTEXT_MASK)) > + local_irq_restore(this_cpu_read(alt_printk_irq_flags)); > +} a few words why did I make it so complex here. it may look that nested alt_printk_enter()-s are not really possible; but it's not exactly the case. and the reasons being is that printk exports too many symbols. basically, the only 'safe' printk variant at the moment is printk() vprintk_func() because it 'respects' current printk context... and that's it. the reset of them will call vprintk_emit() regardless the context. this patch set fixes (in 0001) vprintk() vprintk() vprintk_func() but we still have EXPORT_SYMBOL(vprintk_emit) EXPORT_SYMBOL(printk_emit) EXPORT_SYMBOL_GPL(vprintk_default) any of those called from alt_printk_log_store() will deadlock us. printk() vprintk_emit() alt_printk_enter() raw_spin_lock(&logbuf_lock) ... printk() << first recursion vprintk_alt() alt_printk_log_store() vprintk_default() ---> vprintk_emit() << second recursion raw_spin_lock(&logbuf_lock) << deadlock ... alt_printk_exit() our vprintk_alt()->alt_printk_log_store() seems to be *more or less* safe, for the time being. there are many WARN_ON_* on vprintk_alt()->alt_printk_log_store()->vsnprintf() path but they all seem to be calling printk(): - WARN_ON_ONCE() from vsnprintf() - WARN_ONCE() from vsnprintf()->format_decode() - WARN_ON vsnprintf()->set_field_width() - WARN_ON from vsnprintf()->set_precision() - WARN_ON from vsnprintf()->pointer()->flags_string() a side note, some of these WARNs are... 'funny'. e.g., to deadlock a system it's enough to just pass an unsupported flag in format string. vsnprintf() will WARN_ONCE(1, "Please remove unsupported %%%c in format string\n", *fmt) but the problem is that we are already in printk(): printk() raw_spin_lock(&logbuf_lock) text_len = vscnprintf(text, sizeof(textbuf), fmt, args) WARN_ONCE(1, "Please remove unsupported ...) printk() raw_spin_lock(&logbuf_lock) << deadlock with alt_printk enabled we are cool now. [ 175.226161] BUG: recent printk recursion! [ 175.226170] ------------[ cut here ]------------ [ 175.226173] WARNING: CPU: 3 PID: 361 at lib/vsprintf.c:1900 format_decode+0x22c/0x308 [ 175.226175] Please remove unsupported %{ in format string [ 175.226182] CPU: 3 PID: 361 Comm: bash [ 175.226194] Call Trace: [ 175.226197] [] dump_stack+0x68/0x92 [ 175.226199] [] __warn+0xb8/0xd3 [ 175.226202] [] warn_slowpath_fmt+0x46/0x4e [ 175.226204] [] format_decode+0x22c/0x308 [ 175.226207] [] vsnprintf+0x89/0x481 [ 175.226209] [] vscnprintf+0xd/0x26 [ 175.226212] [] vprintk_emit+0xcc/0x271 [ 175.226214] [] vprintk_default+0x18/0x1a [ 175.226217] [] vprintk_func+0x82/0x89 [ 175.226219] [] printk+0x3e/0x46 [..] [ 175.226244] [] entry_SYSCALL_64_fastpath+0x18/0xad [ 175.226249] ---[ end trace ]--- speaking of %p. FORMAT_TYPE_PTR is big and sometimes seem to be platform dependent. there are tons of things that can go wrong there. for instance: - vsnprintf() pointer() symbol_string() sprint_backtrace() __sprint_symbol() kallsyms_lookup() get_symbol_pos() BUG_ON - vsnprintf() pointer() symbol_string() sprint_backtrace() __sprint_symbol() kallsyms_lookup() module_address_lookup() __module_address() BUG_ON - vsnprintf() pointer() symbol_string() sprint_backtrace() __sprint_symbol() kallsyms_lookup() module_address_lookup() module_assert_mutex_or_preempt() WARN_ON_ONCE() and probably even more. for example, drivers/base/core.c has its own printk() implementation and calls vprintk_emit() directly return vprintk_emit(0, level, hdrlen ? hdr : NULL, hdrlen, fmt, args) so as a precautions measure I've decided to enable alt_printk_enter() nesting. but can drop it. -ss