From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758279AbcATHIF (ORCPT ); Wed, 20 Jan 2016 02:08:05 -0500
Received: from e06smtp06.uk.ibm.com ([195.75.94.102]:39366 "EHLO
	e06smtp06.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1758250AbcATHHt (ORCPT );
	Wed, 20 Jan 2016 02:07:49 -0500
X-IBM-Helo: d06dlp03.portsmouth.uk.ibm.com
X-IBM-MailFrom: heiko.carstens@de.ibm.com
X-IBM-RcptTo: kvm@vger.kernel.org;linux-kernel@vger.kernel.org;linux-s390@vger.kernel.org
Date: Wed, 20 Jan 2016 08:07:40 +0100
From: Heiko Carstens
To: Tejun Heo
Cc: Christian Borntraeger, Peter Zijlstra,
	"linux-kernel@vger.kernel.org >> Linux Kernel Mailing List",
	linux-s390, KVM list, Oleg Nesterov, "Paul E. McKenney"
Subject: Re: regression 4.4: deadlock in with cgroup percpu_rwsem
Message-ID: <20160120070740.GA3395@osiris>
References: <56978452.6010606@de.ibm.com>
	<20160114195630.GA3520@mtj.duckdns.org>
	<5698A023.9070703@de.ibm.com>
	<56990C9E.7020801@de.ibm.com>
	<20160118183205.GW6357@twins.programming.kicks-ass.net>
	<569D3370.6040503@de.ibm.com>
	<20160119095518.GC3528@osiris>
	<569E9032.3070903@de.ibm.com>
	<20160119193845.GT3520@mtj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160119193845.GT3520@mtj.duckdns.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16012007-0025-0000-0000-00000594FF08
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 19, 2016 at 02:38:45PM -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Jan 19, 2016 at 08:36:18PM +0100, Christian Borntraeger wrote:
> > No, its not a task_struct. Activating some more debug information did indeed
> > revealed several other issues (overwritten redzones etc). Unfortunately I
> > only saw the broken things after the facts, so I do not know which code did that.
> > When I disabled the cgroup controllers in libvirt I was no longer able to trigger
> > the bugs. Still trying to narrow things down.
>
> Hmmm... that's worrying. CONFIG_DEBUG_PAGEALLOC sometimes can catch
> these sort of bugs red-handed. Might worth trying.

Christian, just so you don't get surprised like I did:
CONFIG_DEBUG_PAGEALLOC by now requires the additional kernel parameter
"debug_pagealloc=on" to be active. Building the kernel with the config
option alone is no longer enough.
That change was introduced a year ago, so it was probably only me who
wasn't aware of it :)
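
For anyone who wants to double-check a test system before chasing the
corruption, here is a rough sketch of the two conditions mentioned above:
the config option is built in, and "debug_pagealloc=on" is on the kernel
command line. The function names are made up for illustration, and the
matching is deliberately simplistic (only the literal "on" value from this
mail is recognized), so treat it as an assumption-laden helper rather than
a mirror of what the kernel's own parameter parsing does.

```python
def config_enables_debug_pagealloc(config_text: str) -> bool:
    """Return True if the kernel config text has CONFIG_DEBUG_PAGEALLOC=y.

    config_text is the content of a kernel .config-style file, passed in
    as a string (hypothetical helper, not a kernel interface).
    """
    return any(
        line.strip() == "CONFIG_DEBUG_PAGEALLOC=y"
        for line in config_text.splitlines()
    )


def debug_pagealloc_requested(cmdline: str) -> bool:
    """Return True if the command line contains the token debug_pagealloc=on.

    Assumption: only the exact "on" spelling quoted in the mail counts.
    """
    return "debug_pagealloc=on" in cmdline.split()


def pagealloc_debugging_active(config_text: str, cmdline: str) -> bool:
    """Both conditions must hold: option built in AND enabled at boot."""
    return (
        config_enables_debug_pagealloc(config_text)
        and debug_pagealloc_requested(cmdline)
    )
```

On a running system the command line string would typically come from
/proc/cmdline, and the config text from wherever the build's .config is
kept; where that lives depends on the distribution and kernel build.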