Date: Wed, 20 Apr 2016 16:44:33 +0200
From: Peter Zijlstra
To: Minchan Kim
Cc: Ingo Molnar, linux-kernel@vger.kernel.org
Subject: Re: preempt_count overflow in CONFIG_PREEMPT
Message-ID: <20160420144433.GG3430@twins.programming.kicks-ass.net>
In-Reply-To: <20160419065843.GB12910@bbox>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 19, 2016 at 03:58:43PM +0900, Minchan Kim wrote:

> Migration tries to move objects from page A to page B.
> B is a newly allocated page, so it is empty.
>
> 1. freeze every object in page A:
>        for each object in the page
>            bit_spin_lock(object)
>
> 2. memcpy(B, A, PAGE_SIZE);
>
> 3. unfreeze every object in page A:
>        for each object in the page
>            bit_spin_unlock(object)
>
> 4. put_page(A);
>
> The logic is rather straightforward, I guess. :)
> Here, the problem is that, unlike object migration, page migration
> needs to block access to all objects in a page at once before step 2.
> So, if we are unlucky, we can increase preempt_count by 113 on every
> CPU, so preempt_count_add easily emits the spinlock count overflow
> warning in DEBUG_LOCKS_WARN_ON when we have multiple CPUs (my machine
> has 12 CPUs).
>
> I think there are several choices to fix it, but I'm not sure which is
> best, so I want to hear your opinion.
>
> 1. increase the preempt_count size?

Nope, 256 is way too many locks to be holding, especially spinlocks.
You get the most horrid latency spikes from that.

> 2. support bit_spin_lock_no_preempt/bit_spin_unlock_no_preempt?

Only if you really, really, really have to, but it would suck.

> 3. redesign the zsmalloc page migration locking granularity?
>
> I want to avoid 3 if possible because such a design will make the code
> very complicated and may hurt scalability and performance, I guess.

This really is your best option. You don't think O(nr_cpus) locking is
a scalability fail?

> I guess 8 bits for PREEMPT_BITS is too small considering the number of
> CPUs in recent computer systems?

Not really. Holding a lock (or even multiple, as you do) for each cpu
is a completely painful thing and doesn't scale.

> I hope I'm not alone in seeing this issue until now. :)

Very occasionally people run into this.. we try and convince them to
change their ways.