From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Shi
Subject: Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm
Date: Tue, 15 May 2012 21:28:50 +0800
Message-ID: <4FB25A12.1050506@intel.com>
References: <1337072138-8323-1-git-send-email-alex.shi@intel.com>
 <1337072138-8323-7-git-send-email-alex.shi@intel.com>
 <1337087170.27020.166.camel@laptop>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Luming Yu
Cc: Peter Zijlstra, Nick Piggin, tglx@linutronix.de, mingo@redhat.com,
 hpa@zytor.com, arnd@arndb.de, rostedt@goodmis.org, fweisbec@gmail.com,
 jeremy@goop.org, riel@redhat.com, luto@mit.edu, avi@redhat.com,
 len.brown@intel.com, dhowells@redhat.com, fenghua.yu@intel.com,
 borislav.petkov@amd.com, yinghai@kernel.org, ak@linux.intel.com,
 cpw@sgi.com, steiner@sgi.com, akpm@linux-foundation.org,
 penberg@kernel.org, hughd@google.com, rientjes@google.com,
 kosaki.motohiro@jp.fujitsu.com, n-horiguchi@ah.jp.nec.com, tj@kernel.org,
 oleg@redhat.com, axboe@kernel.dk, jmorris@namei.org,
 kamezawa.hiroyu@jp.fujitsu.com, viro@zeniv.linux.org.uk,
 linux-kernel@vger.kernel.org, yongjie.ren@intel.com,
 linux-arch@vger.kernel.org, jcm@jonmasters.org
List-Id: linux-arch.vger.kernel.org

On 05/15/2012 09:27 PM, Luming Yu wrote:

> On Tue, May 15, 2012 at 9:06 PM, Peter Zijlstra wrote:
>> On Tue, 2012-05-15 at 20:58 +0800, Luming Yu wrote:
>>>
>>>
>>> Both __native_flush_tlb() and __native_flush_tlb_single(...)
>>> introduced roughly 1 ns latency to tsc sampling executed in
>
> Fix typo, I just observed 1us with current tool, I would check if I
> can push the accuracy to nanoseconds level.
>
>>> stop_machine_context in two logical CPUs
>>
>> But you have to weight that against the cost of re-population, and
>
> Right, it's hard to detect, but I will try if I can get measurement
> done in a simple test tool to help people measure
> this kind of stuff in few minutes.
>
>> that's the difficult bit, since we have no clue how many tlb entries are
>> in use by the current cr3.
>>
>> It might be possible for intel to give us this information, I've asked
>> for something similar for cachelines.
>
> This is the official document
> http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
>

Please, such huge documents! and it also has no such info.

> Let me know if it can answer your question.
>
>>