From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752832AbdK0NAi (ORCPT ); Mon, 27 Nov 2017 08:00:38 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:57110 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752754AbdK0NAg (ORCPT ); Mon, 27 Nov 2017 08:00:36 -0500 Date: Mon, 27 Nov 2017 14:00:28 +0100 From: Martin Schwidefsky To: Will Deacon Cc: Peter Zijlstra , Sebastian Ott , Ingo Molnar , Heiko Carstens , linux-kernel@vger.kernel.org Subject: Re: [bisected] system hang after boot In-Reply-To: <20171127125455.GC30679@arm.com> References: <20171122182659.GA22648@arm.com> <20171122202217.GO3326@worktop> <20171127114947.GA30679@arm.com> <20171127134918.4da71a78@mschwideX1> <20171127125455.GC30679@arm.com> X-Mailer: Claws Mail 3.13.2 (GTK+ 2.24.30; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 17112713-0016-0000-0000-00000505F2FB X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17112713-0017-0000-0000-00002841D287 Message-Id: <20171127140028.77cfb60a@mschwideX1> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-11-27_06:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1711270180 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 27 Nov 2017 12:54:56 +0000 Will Deacon wrote: > Hi Martin, > > On Mon, Nov 27, 2017 at 01:49:18PM +0100, Martin Schwidefsky wrote: > > On Mon, 27 Nov 2017 11:49:48 +0000 > > Will Deacon wrote: > > > On Wed, Nov 22, 2017 at 09:22:17PM +0100, Peter Zijlstra wrote: > > > > On Wed, Nov 22, 2017 at 06:26:59PM +0000, Will Deacon wrote: > > > > > > > > > Now, I can't see what the break_lock is doing here other than causing > > > > > problems. Is there a good reason for it, or can you just try removing it > > > > > altogether? Patch below. > > > > > > > > The main use is spin_is_contended(), which in turn ends up used in > > > > __cond_resched_lock() through spin_needbreak(). > > > > > > > > This allows better lock wait times for PREEMPT kernels on platforms > > > > where the lock implementation itself cannot provide 'contended' state. > > > > > > > > In that capacity the write-write race shouldn't be a problem though. > > > > > > I'm not sure why it isn't a problem: given that the break_lock variable > > > can read as 1 for a lock that is no longer contended and 0 for a lock that > > > is currently contended, then the __cond_resched_lock is likely to see a > > > value of 0 (i.e. spin_needbreak always return false) more often than no > > > since it's checked by the lock holder. > > > > Grepping for 'break_lock' the two locking blueprints are the only places > > where the field is written to. Unless I am blind, the associated unlock > > functions do *not* reset 'break_lock'. > > > > Without the raw_##op##_can_lock(lock) check the first of the blueprints > > now looks like this: > > > > void __lockfunc __raw_##op##_lock(locktype##_t *lock) \ > > { \ > > for (;;) { \ > > preempt_disable(); \ > > if (likely(do_raw_##op##_trylock(lock))) \ > > break; \ > > preempt_enable(); \ > > \ > > if (!(lock)->break_lock) \ > > (lock)->break_lock = 1; \ > > while ((lock)->break_lock) \ > > arch_##op##_relax(&lock->raw_lock); \ > > } \ > > (lock)->break_lock = 0; \ > > } \ > > > > All it takes to create an endless loop is two CPUs, the first acquired the > > lock and the second tries to get the lock. After the unsuccessful trylock > > of the second CPU, the first CPU releases the lock and never tries to take > > it again. The second CPU will be stuck in an endless loop. > > Yes, it basically relies on the lock holder never winning that race. > However, Peter's use-case just needs the lock-holder to be able to detect > contention (which is always best-effort anyway), so I think we can make that > "work" by removing the while loop above (see my subsequent diff sent to > Sebastian). Well, what race? The lock hold just has to hold the lock while another CPU tries to get it. There is no particular bad timing involved, just a little bit of contention is enough. And yes, I think removing the while loop on break_lock will work. > It's still questionable, because on a machine with store-buffers you really > want to order writes to break_lock against something else, but it might > happen to fall out depending on the details of the trylock() implementation. Even more, if the compiler "proves" that nobody writes to break_lock it can convert that to "while (1)" loop. > > I guess my best course of action is to remove GENERIC_LOCKBREAK from > > arch/s390/Kconfig to avoid this construct altogether. Let us see what > > breaks if I do that .. > > We could just consider ripping out GENERIC_LOCKBREAK entirely, but I was > hoping we could get a simpler fix in for now. I would opt for removing it entirely. -- blue skies, Martin. "Reality continues to ruin my life." - Calvin.