From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754888AbdLOCKh (ORCPT ); Thu, 14 Dec 2017 21:10:37 -0500
Received: from mail-pg0-f65.google.com ([74.125.83.65]:42914 "EHLO
	mail-pg0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754872AbdLOCKa (ORCPT );
	Thu, 14 Dec 2017 21:10:30 -0500
X-Google-Smtp-Source: ACJfBotu7uR4spnhX4vQd++TRfLaDvbjDHqLJJ0s7r0f+LW5HXN5Dl0YIhGnk5dUUyvr2M+d1iCLWg==
Date: Fri, 15 Dec 2017 11:10:24 +0900
From: Sergey Senozhatsky
To: Steven Rostedt
Cc: Tejun Heo, Sergey Senozhatsky, Petr Mladek, Jan Kara, Andrew Morton,
	Peter Zijlstra, Rafael Wysocki, Pavel Machek, Tetsuo Handa,
	linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread
Message-ID: <20171215021024.GA11199@jagdpanzerIV>
References: <20171204134825.7822-1-sergey.senozhatsky@gmail.com>
	<20171214142709.trgl76hbcdwaczzd@pathway.suse.cz>
	<20171214152551.GY3919388@devbig577.frc2.facebook.com>
	<20171214125506.52a7e5fa@gandalf.local.home>
	<20171214181153.GZ3919388@devbig577.frc2.facebook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171214181153.GZ3919388@devbig577.frc2.facebook.com>
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On (12/14/17 10:11), Tejun Heo wrote:
> Hey, Steven.
>
> On Thu, Dec 14, 2017 at 12:55:06PM -0500, Steven Rostedt wrote:
> > Yes! Please create a reproducer, because I still don't believe there is
> > one. And it's all hand waving until there's an actual report that we can
> > lock up the system with my approach.
>
> Yeah, will do, but out of curiosity, Sergey and I already described
> what the root problem was and you didn't really seem to take that.
> Is that because the explanation didn't make sense to you or us
> misunderstanding what your code does?

I second _everything_ that Tejun has said.


Steven, your approach works ONLY when we have the following preconditions:

 a) there is a CPU that is calling printk() from the 'safe' (non-atomic,
    etc) context

        what guarantees that? what happens if there is NO non-atomic
        CPU, or that non-atomic CPU simply misses the console_owner != false
        point? are we going to conclude

        "if printk() doesn't work for you, it's because you are holding it wrong"?


        what if that non-atomic CPU does not call printk(), but instead
        it does console_lock()/console_unlock()? why is there no handoff?

        CPU0                    CPU1 ~ CPU10
                                in atomic contexts [!]. ping-ponging console_sem
                                ownership to each other. while what they really
                                need to do is to simply up() and let CPU0
                                handle it.
                                printk
        console_lock()
         schedule()
                                ...
                                printk
                                printk
                                ...
                                printk
                                printk

                                up()

        // woken up
        console_unlock()

        why do we put the emphasis on fixing vprintk_printk()?


 b) non-atomic CPU sees console_owner set (which is set for a very short
    period of time)

        again. what if that non-atomic CPU does not see console_owner?
        "don't use printk()"?

 c) the task that is looping in console_unlock() sees non-atomic CPU when
    console_owner is set.


IOW, we need to have the right CPU (a) at the very right moment (b && c)
doing the very right thing.

   * and the "very right moment" is tiny and additionally depends
     on a foreign CPU [the one that is looping in console_unlock()].


a simple question - how is that going to work for everyone? are we
"fixing" a small fraction of possible use-cases?


Steven, I thought we reached an agreement [**] that the solution we should
be working on is a combination of printk_kthread and console_sem hand
off. Simply because it adds the missing "there is a non-atomic CPU wishing
to console_unlock()" thing.
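The preconditions (a) and (b) and the CPU0 / CPU1 ~ CPU10 timeline can be sketched as a toy user-space model. This is not kernel code and does not reproduce any real patch: `Caller`, `takes_over`, `safe_takeover`, `diagram_timeline` and `replay` are hypothetical names invented for the illustration, and the model simply assumes that only a printk() caller arriving while console_owner is set can be handed console_sem, while a console_lock() caller sleeps until up().

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    cpu: int
    atomic: bool   # True if the CPU is in atomic context
    entry: str     # "printk" or "console_lock" (labels for this model only)


def takes_over(caller: Caller, owner_set: bool) -> bool:
    # Model assumption: only a printk() caller that arrives while
    # console_owner is set can be handed console_sem. A console_lock()
    # caller just sleeps until up().
    return caller.entry == "printk" and owner_set


def safe_takeover(caller: Caller, owner_set: bool) -> bool:
    # The handoff only helps when the new owner is non-atomic.
    return takes_over(caller, owner_set) and not caller.atomic


def diagram_timeline():
    # CPU0: non-atomic, sleeps in console_lock().
    # CPU1 ~ CPU10: atomic printk() callers arriving while the owner is set.
    events = [(Caller(1, True, "printk"), True),
              (Caller(0, False, "console_lock"), True)]
    events += [(Caller(cpu, True, "printk"), True) for cpu in range(2, 11)]
    return events


def replay(events):
    # Return (all handoff targets, non-atomic handoff targets) in order.
    owners = [c.cpu for c, owner_set in events if takes_over(c, owner_set)]
    safe = [c.cpu for c, owner_set in events if safe_takeover(c, owner_set)]
    return owners, safe


owners, safe = replay(diagram_timeline())
print("handoff targets:", owners)
print("non-atomic handoff targets:", safe)
```

In this model ownership only ever moves between CPU1 ~ CPU10, and the list of non-atomic handoff targets stays empty, because CPU0 entered through console_lock() rather than printk().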
	lkml.kernel.org/r/20171108162813.GA983427@devbig577.frc2.facebook.com

	https://marc.info/?l=linux-kernel&m=151011840830776&w=2
	https://marc.info/?l=linux-kernel&m=151015141407368&w=2
	https://marc.info/?l=linux-kernel&m=151018900919386&w=2
	https://marc.info/?l=linux-kernel&m=151019815721161&w=2
	https://marc.info/?l=linux-kernel&m=151020275921953&w=2
	** https://marc.info/?l=linux-kernel&m=151020404622181&w=2
	** https://marc.info/?l=linux-kernel&m=151020565222469&w=2

what am I missing?

	-ss