Date: Mon, 18 Dec 2017 21:35:41 +0900
From: Sergey Senozhatsky
To: Petr Mladek, Steven Rostedt
Cc: Tejun Heo, Sergey Senozhatsky, Jan Kara, Andrew Morton,
	Peter Zijlstra, Rafael Wysocki, Pavel Machek, Tetsuo Handa,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread
Message-ID: <20171218123541.GC31274@jagdpanzerIV>
In-Reply-To: <20171218103452.GB31274@jagdpanzerIV>

On (12/18/17 19:36), Sergey Senozhatsky wrote:
[..]
> it takes call_console_drivers() 0.01+ of a second to print some of
> the messages [I think we can ignore raw_spin_lock(&console_owner_lock)
> and fully blame call_console_drivers()]. so vprintk_emit() seems to be
> gazillion times faster and i_do_printks can add tons of messages while
> some other process prints just one.
>
> to be more precise, I see from the traces that i_do_printks can add 1100
> messages to the logbuf while call_console_drivers() prints just one.
>
> systemd-udevd-671 owns the lock and sets console_owner. i_do_printks
> keeps adding printks. there is kworker/0:1-135 just ahead of
> i_do_printks-1992, and it registers itself as the console_sem waiter.
>
>  systemd-udevd-671 [003] d..3   66.334866: offloading: set console_owner :0
>    kworker/0:1-135 [000] d..2   66.335999: offloading: vprintk_emit()->trylock FAIL will spin? :1
>  i_do_printks-1992 [002] d..2   66.345474: offloading: vprintk_emit()->trylock FAIL will spin? :0   x 1100
>  ...
>  systemd-udevd-671 [003] d..3   66.345917: offloading: cleaar console_owner waiter != NULL :1
>
> i_do_printks-1992 finishes printing [it does a limited number of printks],
> so it does not compete for console_sem anymore, and it is other tasks
> that have to flush the pending messages stored by i_do_printks-1992 :(

even in this case, the task that took over printing had to flush a logbuf
backlog worth 1100 x 0.01s == 10+ seconds of printing, which is enough to
cause problems. if there are 2200 messages in the logbuf, then that will
be 2200 x 0.01s == 20+ seconds of printing. if the printing task is in an
atomic context, then you can probably imagine what will happen.

numbers don't lie. if we have enough tasks competing for console_sem,
then the task that actually fills up the logbuf may never call
call_console_drivers(). so the lockups are still very much possible,
*in my particular case*.

	-ss