From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753262AbbIWIho (ORCPT ); Wed, 23 Sep 2015 04:37:44 -0400 Received: from mx2.suse.de ([195.135.220.15]:60692 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752527AbbIWIhl (ORCPT ); Wed, 23 Sep 2015 04:37:41 -0400 Date: Wed, 23 Sep 2015 10:37:37 +0200 From: Jan Kara To: Andrew Morton Cc: Jan Kara , LKML , pmladek@suse.com, rostedt@goodmis.org, Gavin Hu , KY Srinivasan , Jan Kara Subject: Re: [PATCH 3/4] kernel: Avoid softlockups in stop_machine() during heavy printing Message-ID: <20150923083737.GC23181@quack.suse.cz> References: <1439998711-7013-1-git-send-email-jack@suse.com> <1439998711-7013-4-git-send-email-jack@suse.com> <20150918151521.c5c8caa59a0e254fdd713337@linux-foundation.org> <20150922105502.GK9028@quack.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150922105502.GK9028@quack.suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 22-09-15 12:55:02, Jan Kara wrote: > > > +{ > > > + bool retry; > > > + unsigned long flags; > > > + > > > + while (1) { > > > + raw_spin_lock_irqsave(&logbuf_lock, flags); > > > + retry = console_seq != log_next_seq; > > > + raw_spin_unlock_irqrestore(&logbuf_lock, flags); > > > > Does this lock/unlock do anything useful? Forgot to comment on this: console_seq and log_next_seq are updated under logbuf_lock. Also they are 64-bit so on 32-bit archs their updates are non-atomic. So although in practice the check will likely work fine without logbuf_lock, I prefer taking the lock to save reader some pondering and the code isn't performance sensitive in any way. > > > + if (!retry || console_suspended) > > > + break; > > > + /* Cycle console_sem to wait for outstanding printing */ > > > + console_lock(); > > > + console_unlock(); > > > + } > > > +} > > > + > > > +/* > > > * Prevent further output on the passed console device so that (for example) > > > * serial drivers can disable console output before suspending a port, and can > > > * re-enable output afterwards. > > > diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c > > > index fd643d8c4b42..016d34621d2e 100644 > > > --- a/kernel/stop_machine.c > > > +++ b/kernel/stop_machine.c > > > @@ -21,6 +21,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > > > > /* > > > * Structure to determine completion condition and record errors. May > > > @@ -543,6 +544,14 @@ int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus) > > > return ret; > > > } > > > > > > + /* > > > + * If there are lots of outstanding messages, printing them can take a > > > + * long time and all cpus would be spinning waiting for the printing to > > > + * finish thus triggering NMI watchdog, RCU lockups etc. Wait for the > > > + * printing here to avoid these. > > > + */ > > > + console_flush(); > > > > This is pretty pointless if num_possible_cpus==1. I'd suggest setting > > printk_offload_chars=0 in this case, add some early bale-out into > > console_flush(). Or something along those lines. > > > > And make console_flush() go away altogether if CONFIG_SMP=n - it's > > pointless bloat. > > Sure, I'll do that. I've implemented all this except for early bail out from console_flush() when num_possible_cpus==1 - that doesn't seem very useful since we'll just check that the buffer is empty and bail out anyway... Honza -- Jan Kara SUSE Labs, CR