From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932099AbbFIRNX (ORCPT ); Tue, 9 Jun 2015 13:13:23 -0400 Received: from casper.infradead.org ([85.118.1.10]:52545 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753153AbbFIRNQ convert rfc822-to-8bit (ORCPT ); Tue, 9 Jun 2015 13:13:16 -0400 Message-ID: <1433869987.1495.53.camel@twins> Subject: Re: [PATCH v2 3/4] x86, mwaitt: introduce mwaix delay with a configurable timer From: Peter Zijlstra To: Andy Lutomirski Cc: Thomas Gleixner , John Stultz , Aaron Lu , Tony Li , "Rafael J. Wysocki" , Suravee Suthikulanit , =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker , "linux-kernel@vger.kernel.org" , Borislav Petkov , Fengguang Wu , Len Brown , Ken Xue , Huang Rui , X86 ML Date: Tue, 09 Jun 2015 19:13:07 +0200 In-Reply-To: References: <1433819621-15093-1-git-send-email-ray.huang@amd.com> <1433819621-15093-4-git-send-email-ray.huang@amd.com> <20150609092952.GR3644@twins.programming.kicks-ass.net> Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7BIT X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2015-06-09 at 09:46 -0700, Andy Lutomirski wrote: > On Jun 9, 2015 2:30 AM, "Peter Zijlstra" wrote: > > How about you think instead and do something like: > > > > rdtsc(start); > > rdtsc_barrier(); > > Other way around. We really need a function static inline u64 > rdtsc_with_barrier(). So admittedly I have not actually looked at how the tsc barrier stuff works, but what? We don't care if the rdtsc goes up, we just want to make sure its done before going further. > > > > > for (;;) { > > delay = min(MWAIT_MAX_LOOPS, loops); > > > > __monitorx(&addr, 0, 0); > > mwaitx(delay, true); > > I don't like this hack. The compiler is entirely within is rights to > poke addr's cacheline (i.e. the stack) between the two instructions. > I'd suggest either making the thing a full cacheline long or using a > single asm statement. > > Also, "addr" is a bad name for a dummy variable that isn't an address > at all. How about "dummy"? Sure, and I like your question on why monitorx exists at all. But none of that was the point here, the main point being that if loops was too big, we should do multiple mwaitx invocations, not punt and busy loop. > > rdtsc_barrier(); > > rdtsc(end); > > rdtsc_barrier(); > > The second barrier is unnecessary. By virtue of the address dependency? > > > > loops -= end - start; > > if (loops <= 0) > > break; > > > > start = end; > > }