From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gordan Bobic
Subject: Re: Virtualization Performance: Intel vs. AMD
Date: Sun, 15 Nov 2009 23:50:08 +0000
Message-ID: <4B0093B0.2070503@bobich.net>
References: <200911151054.57510.tfjellstrom@shaw.ca> <4B0080C2.1010309@bobich.net> <200911151603.40453.tfjellstrom@shaw.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
To: kvm@vger.kernel.org
Return-path:
Received: from 78-86-195-86.zone2.bethere.co.uk ([78.86.195.86]:39540 "EHLO sentinel1.shatteredsilicon.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751778AbZKOXuF (ORCPT ); Sun, 15 Nov 2009 18:50:05 -0500
Received: from ariia.shatteredsilicon.net (ariia.shatteredsilicon.net [10.2.3.1]) by sentinel1.shatteredsilicon.net (Postfix) with ESMTP id 4138AF87C4 for ; Sun, 15 Nov 2009 23:50:09 +0000 (GMT)
In-Reply-To: <200911151603.40453.tfjellstrom@shaw.ca>
Sender: kvm-owner@vger.kernel.org
List-ID:

Thomas Fjellstrom wrote:

>>>>> The Core i7 has hyperthreading, so you see 8 logical CPUs.
>>>> Are you saying the AMD processors do not have hyperthreading?
>>> Course not. Hyperthreading is dubious at best.
>> That's a rather questionable answer to a rather broad issue. SMT is
>> useful, especially on processors with deep pipelines (think Pentium 4 -
>> and in general, deeper pipelines tend to be required for higher clock
>> speeds), because it reduces the number of context switches. Context
>> switches are certainly one of the most expensive operations if not the
>> most expensive operation you can do on a processor, and typically
>> requires flushing the pipelines. Double the number of hardware threads,
>> and you halve the number of context switches.
>
> Hardware context switches aren't free either. And while it really has
> nothing to do with this discussion, the P4 arch was far from perfect (many
> would say, far from GOOD).

I actually disagree with a lot of criticism of P4.
The reason why its performance _appeared_ to be poor was that it was
more reliant on compilers doing their job well. Unfortunately, most
compilers generate very poor code, and most programmers aren't even
aware of the improvements that can be had in this area with a bit of
extra work and a decent compiler. Performance differences of 7x or more
aren't unheard of on Pentium 4 between, say, ICC and GCC generated code.
P4 wasn't a bad design - the compilers just weren't good enough to
leverage it to anywhere near its potential.

>> This typically isn't useful if your CPU is processing one
>> single-threaded application 99% of the time, but on a loaded server it
>> can make a significant difference to throughput.
>
> I'll buy that. Though you'll have to agree that the initial Hyperthread
> implementation in intel cpus was really bad. I hear good things about the
> latest version though.

As measured by what? A single-threaded desktop benchmark?

> But hey, if you can stick more cores in, or do what AMD is doing with its
> upcoming line, why not do that? Hyperthreading seems like more of a gimmick
> than anything.

If there weren't clear and quantifiable benefits, then IBM wouldn't be
putting it in its Power series of high-end processors, it wouldn't be in
the Xbox 360's Xenon (a PowerPC design), and Sun wouldn't be going
massively SMT in the Niagara SPARCs. Silicon die space is _expensive_ -
it wouldn't be getting wasted on gimmicks.

> What seems to help the most with the new Intel arch is the
> auto overclocking when some cores are idle. Far more of a performance
> improvement than Hyperthreading will ever be it seems.

Which is targeted at gamers and desktop enthusiasts who think that FPS
in Crysis is a meaningful measure of performance for most applications.
Server load profiles are a whole different ball game.

Anyway, let's get this back on topic for the list before we get told off
(of course, I'm more than happy to continue the discussion off list).

Gordan