Date: Tue, 24 May 2016 17:53:32 +0300
From: Mika Westerberg
To: Andrea Arcangeli
Cc: "Kirill A. Shutemov", linux-kernel@vger.kernel.org
Subject: Re: v4.6 kernel BUG at mm/rmap.c:1101!
Message-ID: <20160524145332.GF1789@lahna.fi.intel.com>
References: <20160523140638.GA1738@lahna.fi.intel.com>
 <20160523150826.GA20829@redhat.com>
 <20160524081223.GE1712@lahna.fi.intel.com>
 <20160524140809.GG20829@redhat.com>
In-Reply-To: <20160524140809.GG20829@redhat.com>

On Tue, May 24, 2016 at 04:08:09PM +0200, Andrea Arcangeli wrote:
> On Tue, May 24, 2016 at 11:12:23AM +0300, Mika Westerberg wrote:
> > Hmm, the kernel shipped with Fedora 23 has that enabled:
> >
> > lahna % grep CONFIG_DEBUG_VM /boot/config-4.4.9-300.fc23.x86_64
> > CONFIG_DEBUG_VM=y
> > # CONFIG_DEBUG_VM_VMACACHE is not set
> > # CONFIG_DEBUG_VM_RB is not set
>
> Yes, it would have been more accurate to say "enterprise", not just
> "production".

Fair enough.

> It's great to run Fedora with CONFIG_DEBUG_VM=y and I'd recommend to
> keep it that way, so it contributes to stronger runtime validation of
> the VM invariants.
>
> I keep CONFIG_DEBUG_VM=y on all my systems too of course.
>
> Also note the RHEL debug kernel also has CONFIG_DEBUG_VM=y enabled,
> but only the debug kernel.
>
> In general while testing new kernels with new VM modifications it's
> a good idea to set CONFIG_DEBUG_VM=y, if you can afford the occasional
> false positive like in this case and it's not an enterprise production
> kernel, where clearly all testing should have already happened before
> it became "enterprise" ready in the first place, so we can save a
> few cycles.
>
> Lately we got VM_WARN_ON too and I added to my tree recently:
>
> +#define VM_WARN_ON_PAGE(cond, page)                                    \
> +       do {                                                            \
> +               if (unlikely(cond)) {                                   \
> +                       dump_page(page, "VM_WARN_ON_PAGE(" __stringify(cond)")");\
> +                       __WARN();                                       \
> +               }                                                       \
> +       } while (0)
>
> So we could convert some... to reduce the pain of a false positive,
> but in cases like the one that triggered I'm not sure it'd be a good
> idea to switch it to a WARN_ON as it may be a sign of memory
> corruption if the assert fails (after the patch) and keeping going
> after memory corruption can actually do more harm than good.
>
> One thing to keep =n however is CONFIG_DEBUG_VM_RB=n, that one is
> expensive and that's why it has its own separate knob to be able to
> disable it while keeping CONFIG_DEBUG_VM=y. IIRC I originally kept it
> under #if 0... so I wouldn't recommend enabling VM_RB on production
> (it's too much overhead), that's a nice validation but for development
> only.

Understood. Thanks for the thorough explanation :)
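
For readers following the warn-versus-BUG trade-off discussed above, here is
a minimal userspace sketch of the pattern, not kernel code. Every name in it
(SKETCH_DEBUG_VM, struct sketch_page, sketch_dump_page, SKETCH_WARN_ON_PAGE,
SKETCH_BUG_ON_PAGE, sketch_check_page) is hypothetical and only loosely
mirrors the quoted macro: the warn variant dumps the page state and keeps
running, the bug variant dumps and aborts, and both compile away when the
debug knob is off.

```c
/* Userspace sketch only: these names imitate, but are not, kernel macros. */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for CONFIG_DEBUG_VM; build with
 * -DSKETCH_DEBUG_VM=0 to compile the checks out. */
#ifndef SKETCH_DEBUG_VM
#define SKETCH_DEBUG_VM 1
#endif

/* Hypothetical, far simpler than the kernel's struct page. */
struct sketch_page {
        unsigned long flags;
        int mapcount;
};

/* Counts how many warn-style checks have fired so far. */
static int sketch_warn_count;

/* Print the page state plus the reason string to stderr. */
static void sketch_dump_page(const struct sketch_page *page, const char *reason)
{
        fprintf(stderr, "page:%p flags:%#lx mapcount:%d\n%s\n",
                (const void *)page, page->flags, page->mapcount, reason);
}

#if SKETCH_DEBUG_VM
/* Warn variant: report the failed check and keep going. */
#define SKETCH_WARN_ON_PAGE(cond, page)                                 \
        do {                                                            \
                if (cond) {                                             \
                        sketch_dump_page(page,                          \
                                "SKETCH_WARN_ON_PAGE(" #cond ")");      \
                        sketch_warn_count++;                            \
                }                                                       \
        } while (0)

/* Bug variant: report the failed check and stop immediately. */
#define SKETCH_BUG_ON_PAGE(cond, page)                                  \
        do {                                                            \
                if (cond) {                                             \
                        sketch_dump_page(page,                          \
                                "SKETCH_BUG_ON_PAGE(" #cond ")");       \
                        abort();                                        \
                }                                                       \
        } while (0)
#else
/* Checks compiled out: cond is still type-checked but never evaluated. */
#define SKETCH_WARN_ON_PAGE(cond, page) \
        do { (void)sizeof(cond); (void)sizeof(page); } while (0)
#define SKETCH_BUG_ON_PAGE(cond, page) \
        do { (void)sizeof(cond); (void)sizeof(page); } while (0)
#endif

/* Returns 1 if the (made-up) invariant holds, 0 otherwise. With the
 * warn variant the caller gets a chance to recover. */
static int sketch_check_page(const struct sketch_page *page)
{
        SKETCH_WARN_ON_PAGE(page->mapcount < 0, page);
        return page->mapcount >= 0;
}
```

Calling sketch_check_page() on a page with a negative mapcount prints the
dump and returns 0 while the program continues; swapping in
SKETCH_BUG_ON_PAGE would abort at that point instead, which is the safer
choice when a failed check may indicate memory corruption.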