From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mga04.intel.com ([192.55.52.120])
 by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
 (Exim 4.80) (envelope-from ) id 1fLsQL-0002IL-1u
 for speck@linutronix.de; Thu, 24 May 2018 17:45:06 +0200
Date: Thu, 24 May 2018 08:44:49 -0700
From: Andi Kleen
Subject: [MODERATED] Re: L1D-Fault KVM mitigation
Message-ID: <20180524154449.GP4486@tassilo.jf.intel.com>
References: <20180424090630.wlghmrpasn7v7wbn@suse.de>
 <20180424093537.GC4064@hirez.programming.kicks-ass.net>
 <1524563292.8691.38.camel@infradead.org>
 <20180424110445.GU4043@hirez.programming.kicks-ass.net>
 <1527068745.8186.89.camel@infradead.org>
 <20180524094526.GE12198@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
In-Reply-To: 
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: speck@linutronix.de
List-ID: 

> Now with the whole gang scheduling the numbers I heard through the
> grapevine are in the range of factor 130, i.e. 13k% for a simple boot from
> disk image. 13 minutes instead of 6 seconds...

That's unoptimized, and it's also for the extreme PIO case, which does
one exit for every dword (see the sketch in the P.S. below). With some
optimizations we hope to do better.

Also the PIO case is really not that interesting, although it would be
nice to get rid of the slowdown so that users don't have to fix their
boot loaders. But yes, PIO will suffer a bit; that's unavoidable. As
long as the slowdown is not too bad it should be acceptable. If it's a
problem they can always use some other mechanism to load the kernel.

>
> That's not surprising at all, though the magnitude is way higher than I
> expected. I don't see a realistic chance for vmexit heavy workloads to work
> with that synchronization thing at all, whether it's ucode assisted or not.

Nothing should be anywhere near as VMEXIT-intensive as PIO. The worst
realistic case is likely an I/O-intensive workload with lots of small
transactions. But even there you are nowhere near "one exit for every
2 bytes in a long loop" like PIO.

>
> The only workload types which will ever benefit from that co-scheduling
> stuff are CPU bound workloads which more or less never vmexit. But are
> those workloads really workloads which benefit from HT?

That's a very extreme conclusion, which I don't think is backed by your
data at all.

> Compute workloads
> tend to use floating point or vector instructions which are not really HT
> friendly.

There are plenty of compute workloads that benefit from HT: usually
everything that is not completely memory-bandwidth dominated per single
thread.

>
> Can the virt folks who know what runs on their clowdy offerings please shed
> some light on this? Has anyone made a proper analysis of clowd workloads
> and their behaviour on HT and their vmexit rates?

My understanding is that most cloud providers only sell whole cores, so
HT siblings never cross guest boundaries and they are already safe with
respect to other guests. Usually they use some other affinity mechanism
to reach a similar effect (see the second sketch in the P.S.).

But we still have to fix the general case, e.g. just for "someone runs
a KVM guest on a random system with HT on".

-Andi
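
P.S.: For reference, a minimal sketch of the boot loader pattern in
question. This is hypothetical illustration code, not taken from any
real loader: an ATA PIO sector read loops over the emulated data port,
and inside a guest every port access traps to the hypervisor, so one
512-byte sector costs 128 exits when done as dwords (256 as words).

    /* Hypothetical example: read one 512-byte sector from the legacy
     * ATA data port 0x1f0. Each inl() is a trapped port access, i.e.
     * one vmexit per dword transferred. */
    #include <stdint.h>

    static inline uint32_t inl(uint16_t port)
    {
            uint32_t val;

            asm volatile("inl %1, %0" : "=a" (val) : "Nd" (port));
            return val;
    }

    static void ata_pio_read_sector(uint32_t *buf)
    {
            int i;

            for (i = 0; i < 512 / 4; i++)
                    buf[i] = inl(0x1f0);    /* one vmexit per dword */
    }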
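
P.P.S.: And this is the kind of affinity mechanism I mean, as an
illustrative libvirt <cputune> snippet. The guest size and the CPU
numbering (CPU n and n+4 assumed to be HT siblings) are made-up
assumptions, not from any real deployment:

    <vcpu placement='static'>4</vcpu>
    <cputune>
      <!-- Pin vCPU pairs onto both HT siblings of the same physical
           core, so no core is ever shared with another guest. -->
      <vcpupin vcpu='0' cpuset='2'/>
      <vcpupin vcpu='1' cpuset='6'/>  <!-- HT sibling of CPU 2 -->
      <vcpupin vcpu='2' cpuset='3'/>
      <vcpupin vcpu='3' cpuset='7'/>  <!-- HT sibling of CPU 3 -->
    </cputune>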