From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753679Ab2I0IWZ (ORCPT ); Thu, 27 Sep 2012 04:22:25 -0400
Received: from casper.infradead.org ([85.118.1.10]:36836 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751962Ab2I0IWA convert rfc822-to-8bit (ORCPT );
	Thu, 27 Sep 2012 04:22:00 -0400
Message-ID: <1348734104.3292.5.camel@twins>
Subject: Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to
 3.6-rc5 on AMD chipsets - bisected
From: Peter Zijlstra
To: Linus Torvalds
Cc: Borislav Petkov, Mike Galbraith, Mel Gorman, Nikolay Ulyanitsky,
 linux-kernel@vger.kernel.org, Andreas Herrmann, Andrew Morton,
 Thomas Gleixner, Ingo Molnar, Suresh Siddha
Date: Thu, 27 Sep 2012 10:21:44 +0200
In-Reply-To:
References: <1348505683.11847.111.camel@twins>
 <1348511193.6951.44.camel@marge.simpson.net>
 <20120924192056.GB4082@liondog.tnic>
 <1348538258.7100.23.camel@marge.simpson.net>
 <1348574286.3881.40.camel@twins>
 <20120925131736.GA30652@x1.osrc.amd.com>
 <20120925170058.GC30158@x1.osrc.amd.com>
 <20120926163233.GA5339@x1.osrc.amd.com>
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
X-Mailer: Evolution 3.2.2-
Mime-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2012-09-26 at 11:19 -0700, Linus Torvalds wrote:
>
> For example, it starts with the maximum target scheduling domain, and
> works its way in over the scheduling groups within that domain. What
> the f*ck is the logic of that kind of crazy thing? It never makes
> sense to look at a biggest domain first.

That's about SMT, it was felt that you don't want SMT siblings first
because typically SMT siblings are somewhat under-powered compared to
actual cores.

Also, the whole scheduler topology thing doesn't have L2/L3 domains, it
only has the LLC domain, if you want more we'll need to fix that.
For now it's a fixed:

  SMT
  MC (llc)
  CPU (package/machine-for-!numa)
  NUMA

So in your patch, your for_each_domain() loop will really only do the
SMT/MC levels and prefer an SMT sibling over an idle core.
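
To make the difference concrete, here is a toy model in plain C. It is not kernel code, only a sketch under stated assumptions: one package with 8 CPUs, 2 SMT threads per core, and a single LLC shared by all cores. The names domain_span(), pick_idle_bottom_up() and pick_idle_core_first() are hypothetical and exist only for this illustration; the second function is a simplified stand-in for "look at the widest domain first and scan its groups", not the actual select_idle_sibling() logic.

```c
#include <assert.h>

/* Toy model of the fixed level order named above. Illustrative only. */
enum level { LVL_SMT, LVL_MC, LVL_CPU, LVL_NUMA, LVL_COUNT };

#define NR_CPUS          8u   /* assumption: 4 cores x 2 threads */
#define THREADS_PER_CORE 2u
#define ALL_CPUS         ((1u << NR_CPUS) - 1u)

/* Bitmask of the CPUs that share the given level with 'cpu'. */
static unsigned int domain_span(enum level lvl, unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    if (lvl == LVL_SMT) {
        unsigned int first = cpu - cpu % THREADS_PER_CORE;
        return ((1u << THREADS_PER_CORE) - 1u) << first;
    }
    /* Assumption: MC (llc), CPU and NUMA all span the whole toy machine. */
    return ALL_CPUS;
}

/* Lowest-numbered CPU set in 'mask', or -1 if the mask is empty. */
static int first_cpu(unsigned int mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return (int)cpu;
    return -1;
}

/* Walk SMT then MC, as a bottom-up domain loop would: the first idle
 * CPU found wins, so an idle SMT sibling beats an idle core. */
int pick_idle_bottom_up(unsigned int target, unsigned int idle_mask)
{
    for (int lvl = LVL_SMT; lvl <= LVL_MC; lvl++) {
        unsigned int hit = domain_span((enum level)lvl, target) & idle_mask;
        if (hit)
            return first_cpu(hit);
    }
    return -1;
}

/* Start from the LLC span and scan it core by core: prefer a core whose
 * threads are all idle, fall back to any idle CPU in the LLC. */
int pick_idle_core_first(unsigned int target, unsigned int idle_mask)
{
    unsigned int llc = domain_span(LVL_MC, target);

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu += THREADS_PER_CORE) {
        unsigned int core = domain_span(LVL_SMT, cpu);
        if ((core & llc) == core && (core & idle_mask) == core)
            return (int)cpu;
    }
    return first_cpu(llc & idle_mask);
}
```

With CPU 0 busy, its sibling CPU 1 idle, and core 1 (CPUs 2 and 3) fully idle, i.e. idle_mask 0x0E, pick_idle_bottom_up(0, 0x0E) returns 1 (the SMT sibling) while pick_idle_core_first(0, 0x0E) returns 2 (the idle core).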