From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751305AbbLUNPO (ORCPT ); Mon, 21 Dec 2015 08:15:14 -0500
Received: from mx2.suse.de ([195.135.220.15]:33053 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751134AbbLUNPM (ORCPT ); Mon, 21 Dec 2015 08:15:12 -0500
Subject: Re: [PATCH] mempolicy: convert the shared_policy lock to a rwlock
To: Nathan Zimmer
References: <1447777078-135492-1-git-send-email-nzimmer@sgi.com>
Cc: Andrew Morton, Nadia Yvette Chambers, Naoya Horiguchi, Mel Gorman,
        "Aneesh Kumar K.V", linux-kernel@vger.kernel.org, linux-mm@kvack.org
From: Vlastimil Babka
Message-ID: <5677FB5D.7010805@suse.cz>
Date: Mon, 21 Dec 2015 14:15:09 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
        Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To: <1447777078-135492-1-git-send-email-nzimmer@sgi.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/17/2015 05:17 PM, Nathan Zimmer wrote:
> When running the SPECint_rate gcc on some very large boxes it was noticed
> that the system was spending lots of time in mpol_shared_policy_lookup.
> The gamess benchmark can also show it and is what I mostly used to chase
> down the issue, since I found the setup for that easier.
>
> To be clear, the binaries were on tmpfs because of disk I/O requirements.
> We then used text replication to avoid icache misses and having all the
> copies banging on the memory where the instruction code resides.
> This results in us hitting a bottleneck in mpol_shared_policy_lookup,
> since lookup is serialised by the shared_policy lock.
>
> I have only reproduced this on very large (3k+ cores) boxes.
> The problem starts showing up at just a few hundred ranks, getting worse
> until it threatens to livelock once it gets large enough.
> For example, on the gamess benchmark at 128 ranks this area consumes only
> ~1% of time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is
> over 90%.
>
> To alleviate the contention in this area I converted the spinlock to a
> rwlock. This allows the large number of lookups to happen simultaneously.
> The results were quite good, reducing this consumption at max ranks to
> around 2%.
>
> Acked-by: David Rientjes

Acked-by: Vlastimil Babka
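
For anyone reading the archive without the diff at hand, the idea in the quoted description can be sketched in userspace. This is only an illustration under stated assumptions, not the patch itself: it uses POSIX pthread_rwlock_t in place of the kernel's rwlock_t, a small fixed array in place of the real range structure, and the names shared_policy_sketch, sp_init, sp_insert and sp_lookup are hypothetical. The point is that lookups take the lock in shared mode and so no longer serialise against each other, while updates still take it exclusively.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_RANGES 16

/* Hypothetical stand-in for one policy range: [start, end) -> policy id. */
struct range_policy {
	unsigned long start;
	unsigned long end;
	int policy;
};

/* Hypothetical container; a flat array is an assumption made for brevity. */
struct shared_policy_sketch {
	pthread_rwlock_t lock;	/* was a mutually exclusive lock before */
	size_t nr;
	struct range_policy ranges[MAX_RANGES];
};

static void sp_init(struct shared_policy_sketch *sp)
{
	pthread_rwlock_init(&sp->lock, NULL);
	sp->nr = 0;
}

/* Update path: rare, takes the lock exclusively. Returns 0 or -1 if full. */
static int sp_insert(struct shared_policy_sketch *sp, unsigned long start,
		     unsigned long end, int policy)
{
	int ret = -1;

	assert(start < end);
	pthread_rwlock_wrlock(&sp->lock);
	if (sp->nr < MAX_RANGES) {
		sp->ranges[sp->nr++] =
			(struct range_policy){ start, end, policy };
		ret = 0;
	}
	pthread_rwlock_unlock(&sp->lock);
	return ret;
}

/* Lookup path: frequent, takes the lock shared so readers run in parallel.
 * Returns the policy id covering idx, or -1 if no range covers it. */
static int sp_lookup(struct shared_policy_sketch *sp, unsigned long idx)
{
	int policy = -1;

	pthread_rwlock_rdlock(&sp->lock);
	for (size_t i = 0; i < sp->nr; i++) {
		if (idx >= sp->ranges[i].start && idx < sp->ranges[i].end) {
			policy = sp->ranges[i].policy;
			break;
		}
	}
	pthread_rwlock_unlock(&sp->lock);
	return policy;
}
```

Whether a reader/writer lock wins in practice depends on the read-to-write ratio; the workload described above is lookup-dominated, which is the case where shared-mode acquisition helps most.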