From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kees Cook
Subject: Re: Regression bisected to f2f84b05e02b (bug: consolidate warn_slowpath_fmt() usage)
Date: Thu, 11 Jun 2020 22:07:05 -0700
Message-ID: <202006112201.3B20AB28DC@keescook>
References: <20200602024804.GA3776630@p50-ethernet.mattst88.com> <202006021052.E52618F@keescook> <20200612044757.GA10703@tower>
Mime-Version: 1.0
Return-path:
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google;
        h=date:from:to:subject:message-id:references:mime-version
         :content-disposition:in-reply-to;
        bh=E/c/tVNZnZXtqPcvM/cRJXYwg6T3yZDLqtUKUDRTdt4=;
        b=O6iMlj7M7cPlzLPmVV55ccyxdmoDkLHEjx9YGi+rTx/SuEeA8L0Kjs9Ne+dGwMTwbq
         rc2ZpA3N/oYNyMbp1V99/7JChJws0nqKJfN0shrkRwbPVuIUABRpkiMLhLyjJLxaGdmP
         ovUHtZjYkmhLf2+ypJUlrY0Jy3jRXFdlAQ+h0=
Content-Disposition: inline
In-Reply-To: <20200612044757.GA10703@tower>
Sender: linux-alpha-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Michael Cree, Matt Turner, Linux-Arch, LKML, linux-alpha, Richard Henderson, Ivan Kokshaysky

On Fri, Jun 12, 2020 at 04:47:57PM +1200, Michael Cree wrote:
> On Thu, Jun 11, 2020 at 09:23:52PM -0700, Matt Turner wrote:
> > Since I noticed earlier that using maxcpus=1 on a 2-CPU system
> > prevented the system from hanging, I tried disabling CONFIG_SMP on my
> > 1-CPU system as well. In doing so, I discovered that the RCU torture
> > module (RCU_TORTURE_TEST) triggers some null pointer dereferences on
> > Alpha when CONFIG_SMP is set, but works successfully when CONFIG_SMP
> > is unset.
> >
> > That seems likely to be a symptom of the same underlying problem that
> > started this thread, don't you think? If so, I'll focus my attention
> > on that.
>
> I wonder if that is related to user space segfaults we are now seeing
> on SMP systems but not UP systems while building Alpha debian-ports.
> It's happening in the test-suites of builds of certain software
> (such as autogen and guile) but they always build successfully with
> the test suite passing on a UP system.
>
> When investigating I seem to recall it was a NULL (or near NULL)
> pointer dereference but couldn't make any sense of how it might
> have got into such an obviously wrong state.

By some miracle, I have avoided any experience with RCU bugs. ;) If the
RCU_TORTURE_TEST Oopses or the segfaults are repeatable and don't go
away with the WARN patch reverted, then perhaps it might be used to
bisect to something closer to the root cause?

Given the similarity to the SMP vs UP stuff and the RCU tests, I'd agree
that does seem like the best path to investigate.

-- 
Kees Cook