From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755437AbdKGAXJ (ORCPT ); Mon, 6 Nov 2017 19:23:09 -0500 Received: from mail-qt0-f196.google.com ([209.85.216.196]:57211 "EHLO mail-qt0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933873AbdKGAW5 (ORCPT ); Mon, 6 Nov 2017 19:22:57 -0500 X-Google-Smtp-Source: ABhQp+QScYVH0E5/8xh/XovvPRF8kf+IeuR1Fsrlxzv7qIerFZvy0yZYa6+fRlknRt2IvEJ+/rbJfg== Date: Mon, 6 Nov 2017 16:22:54 -0800 From: Tejun Heo To: Sergey Senozhatsky Cc: Petr Mladek , Steven Rostedt , linux-kernel@vger.kernel.org, kernel-team@fb.com, Linus Torvalds , Andrew Morton Subject: Re: [PATCH 2/2] Subject: printk: Don't trap random context in infinite log_buf flush Message-ID: <20171107002254.GI3252168@devbig577.frc2.facebook.com> References: <20171102135146.GN3252168@devbig577.frc2.facebook.com> <20171102135258.GO3252168@devbig577.frc2.facebook.com> <20171104042408.GB539@tigerII.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20171104042408.GB539@tigerII.localdomain> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, On Sat, Nov 04, 2017 at 01:24:08PM +0900, Sergey Senozhatsky wrote: > thanks for the patch set. we are currently looking at another approach: > lkml.kernel.org/r/20171102134515.6eef16de@gandalf.local.home > > would you be interested in taking a look? Hmm... It took some tweaking but the code at the end locks up the machine w/ Steven's patch applied and it's not that contrived a case (e.g. out of memory messages from packet tx/rx paths while OOM is in progress). > there are some concerns, like a huge number of printk-s happening while > console_sem is locked. e.g. console_lock()/console_unlock() on one of the > CPUs, or console_lock(); printk(); ... printk(); console_unlock(); Unless we make all messages fully synchronous, I don't think there's a good solution for that and I don't think we wanna make everything fully synchronous. The information value of messages go down pretty fast in a message deluge and as long as we stay synchronous in the beginning, information loss usually isn't the problem. The current code can lose a lot of messages but the way it loses messages usually doesn't lead to loss of important information. The proposed patch doesn't materially change when and what we lose. It just moves the flush infinite loop into a safe context. Thanks. #include #include #include #include #include #include static bool in_printk; static bool stop_testing; static struct hrtimer printk_timer; static ktime_t timer_interval; static enum hrtimer_restart printk_timerfn(struct hrtimer *timer) { int i; if (READ_ONCE(in_printk)) for (i = 0; i < 1000; i++) printk("%-80s\n", "XXX TIMER"); hrtimer_forward_now(&printk_timer, timer_interval); return READ_ONCE(stop_testing) ? HRTIMER_NORESTART : HRTIMER_RESTART; } static void preempt_printk_workfn(struct work_struct *work) { int i; hrtimer_init(&printk_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); printk_timer.function = printk_timerfn; timer_interval = ktime_set(0, NSEC_PER_MSEC); hrtimer_start(&printk_timer, timer_interval, HRTIMER_MODE_REL); while (!READ_ONCE(stop_testing)) { preempt_disable(); WRITE_ONCE(in_printk, true); for (i = 0; i < 100; i++) printk("%-80s\n", "XXX PREEMPT"); WRITE_ONCE(in_printk, false); preempt_enable(); msleep(1); } } static DECLARE_WORK(preempt_printk_work, preempt_printk_workfn); static int __init test_init(void) { queue_work_on(0, system_wq, &preempt_printk_work); return 0; } static void __exit test_exit(void) { WRITE_ONCE(stop_testing, true); flush_work(&preempt_printk_work); hrtimer_cancel(&printk_timer); } module_init(test_init); module_exit(test_exit); MODULE_LICENSE("GPL");