From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail.linutronix.de (146.0.238.70:993)
	by crypto-ml.lab.linutronix.de with IMAP4-SSL
	for ; 26 Jun 2018 02:43:24 -0000
Received: from aserp2120.oracle.com ([141.146.126.78])
	by Galois.linutronix.de with esmtps (TLS1.2:RSA_AES_256_CBC_SHA256:256)
	(Exim 4.80)
	(envelope-from )
	id 1fXdww-0001Dy-FF
	for speck@linutronix.de; Tue, 26 Jun 2018 04:43:23 +0200
Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1])
	by aserp2120.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w5Q2dcIY166070
	for ; Tue, 26 Jun 2018 02:43:15 GMT
Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71])
	by aserp2120.oracle.com with ESMTP id 2ju1nrt7fd-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK)
	for ; Tue, 26 Jun 2018 02:43:15 +0000
Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72])
	by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id w5Q2hEHE014106
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK)
	for ; Tue, 26 Jun 2018 02:43:14 GMT
Received: from abhmp0005.oracle.com (abhmp0005.oracle.com [141.146.116.11])
	by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id w5Q2hDSp026706
	for ; Tue, 26 Jun 2018 02:43:14 GMT
Date: Mon, 25 Jun 2018 22:43:12 -0400
From: Konrad Rzeszutek Wilk
Subject: [MODERATED] Re: [PATCH v4 0/8] [PATCH v4] Further reworked KVM patches
Message-ID: <20180626024312.GA7087@char.us.oracle.com>
References: <20180623135414.091581758@localhost.localdomain>
MIME-Version: 1.0
In-Reply-To: <20180623135414.091581758@localhost.localdomain>
Content-Type: multipart/mixed; boundary="OgqxwSJOaUobr8KG"
Content-Disposition: inline
To: speck@linutronix.de
List-ID:

--OgqxwSJOaUobr8KG
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Jun 23, 2018 at 09:54:14AM -0400, speck for konrad.wilk_at_oracle.com wrote:
> Hi!
>
> This patchset has Paolo's patches + mine with the review comment
> addressed. That is:
> - Moved the warning when launching guests.
> - Make it warn if disallow_smt=0
> - Make it error if disallow_smt=1 and also fail ioctl:
>     ioctl(KVM_CREATE_VM) failed: 95 Operation not supported
>     qemu-system-x86_64: failed to initialize KVM: Operation not supported
> - Expand the blacklist of VMexit handlers that need L1 flushing.
>
>
> It also tacks on an optimization if the non-default vmentry_l1d_flush=2
> (always use L1D cache) parameter is used.
>
> I will send off on Monday a build to the performance team so will
> have a better feel for it - but in the meantime was wondering
> if other folks had done SPEC* type runs? I did some kernel compilations
> and the numbers were in the noise.

Attached is the git bundle and also all the patches, as it seems some
never made it across. I really am puzzled as to why, since I can see in
my maillog that they got sent across.

I will retry sending only the ones I did not see go across.

--OgqxwSJOaUobr8KG
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="kvm.l1tf.v4.bundle"
Content-Transfer-Encoding: base64

[base64-encoded git bundle data omitted]

--OgqxwSJOaUobr8KG
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="0001-x86-KVM-Warn-user-if-KVM-is-loaded-SMT-and-L1TF-CPU-.patch"

>From 9c99aed79870dee0cc2900622759188a73feed65 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk
Date: Wed, 20 Jun 2018 11:29:53 -0400
Subject: [PATCH v4 1/8] x86/KVM: Warn user if KVM is loaded SMT and L1TF CPU bug being present.

If the L1TF CPU bug is present we allow the KVM module to be loaded, as
the majority of users that use Linux and KVM have trusted guests and do
not want a broken setup.

Cloud vendors are the ones that are uncomfortable with CVE-2018-3615,
and as such they are the ones that should set disallow_smt to one.

Setting disallow_smt to one means that the system administrator also
needs to disable SMT (Hyper-Threading) in the BIOS, or via the 'nosmt'
command line parameter, or via /sys/devices/system/cpu/smt/control
(see commit 05736e4ac13c cpu/hotplug: Provide knobs to control SMT).

Other mitigations are to use task affinity, cpu sets, interrupt binding,
etc - anything to make sure that _only_ the same guest's vCPUs are
running on sibling threads.

Signed-off-by: Konrad Rzeszutek Wilk
---
v3:-Move it to vmx_vcpu_init (could also do it in kvm_vcpu_init but it
    seemed more prudent to do it in VMX handler).
   -Made it WARN if disallow_smt=0
   -Made it ERR if disallow_smt=1
   -Fixed the CVE number
---
 Documentation/admin-guide/kernel-parameters.txt |  6 ++++++
 arch/x86/kvm/vmx.c                              | 13 +++++++++++++
 kernel/cpu.c                                    |  1 +
 3 files changed, 20 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 8e29c4b6756f..d59b34d4e62a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1946,6 +1946,12 @@
 			[KVM,ARM] Allow use of GICv4 for direct injection of
 			LPIs.
 
+	kvm-intel.disallow_smt=[KVM] If the L1TF CPU bug is present and the
+			system has SMT (aka Hyper-Threading) enabled then
+			don't allow guests to be created.
+
+			Default is 0 (allow guests to be created).
+
 	kvm-intel.ept=	[KVM,Intel] Disable extended page tables
 			(virtualized MMU) support on capable Intel chips.
 			Default is 1 (enabled)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 559a12b6184d..f08e33fc28ac 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -71,6 +71,9 @@ static const struct x86_cpu_id vmx_cpu_id[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, vmx_cpu_id);
 
+static bool __read_mostly disallow_smt = false;
+module_param(disallow_smt, bool, S_IRUGO);
+
 static bool __read_mostly enable_vpid = 1;
 module_param_named(vpid, enable_vpid, bool, 0444);
 
@@ -10370,10 +10373,20 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	return ERR_PTR(err);
 }
 
+#define L1TF_MSG "SMT enabled with L1TF CPU bug present. Refer to CVE-2018-3620 for details.\n"
+
 static int vmx_vm_init(struct kvm *kvm)
 {
 	if (!ple_gap)
 		kvm->arch.pause_in_guest = true;
+
+	if (boot_cpu_has(X86_BUG_L1TF) && (cpu_smt_control == CPU_SMT_ENABLED)) {
+		if (disallow_smt) {
+			pr_err(L1TF_MSG);
+			return -EOPNOTSUPP;
+		}
+		pr_warn(L1TF_MSG);
+	}
 	return 0;
 }
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index d29fdd7e57bb..2d0129f41a2b 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -935,6 +935,7 @@ EXPORT_SYMBOL(cpu_down);
 
 #ifdef CONFIG_HOTPLUG_SMT
 enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
+EXPORT_SYMBOL_GPL(cpu_smt_control);
 
 static int __init smt_cmdline_disable(char *str)
 {
-- 
2.14.3


--OgqxwSJOaUobr8KG
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="0002-kvm-x86-mitigation-for-L1-cache-terminal-fault-vulne.patch"

>From 829231a02aff224a271716ec757c38a44ddda5ae Mon Sep 17 00:00:00 2001
From: Paolo Bonzini
Date: Sat, 23 Jun 2018 05:52:32 -0400
Subject: [PATCH v4 2/8] kvm: x86: mitigation for L1 cache terminal fault vulnerabilities

We add two mitigation modes for CVE-2018-3620, aka L1 terminal fault.
The two modes are "vmentry_l1d_flush=1" and "vmentry_l1d_flush=2".

"vmentry_l1d_flush=2" is simply doing an L1 cache flush on every VMENTER.
"vmentry_l1d_flush=1" is trying to avoid so many L1 cache flushes on
VMENTER and instead flushes only when the kind of code that was executed
in the hypervisor makes it necessary. The idea is based on Intel's
patches, but we treat all vmexits as safe unless they execute specific
code that is considered unsafe. There is no hardcoded list of "safe" exit
reasons; but vmexits are considered safe unless:
 - They trigger the emulator, which could be a good target for
   other speculative execution-based threats,
 - or the MMU, which can bring host page tables in the L1 cache.
 - In addition, executing userspace or another process will trigger a flush.
 - external interrupts
 - nested operations that require the MMU (see above). That is
   vmptrld, vmptrst, vmclear, vmwrite, vmread.
 - Also when handling invept, invvpid

The default is "vmentry_l1d_flush=1". The cost of "vmentry_l1d_flush=2"
is up to 2.5x more expensive vmexits on Haswell processors, and 30% on
Coffee Lake (for the latter, this is independent of whether microcode
or the generic flush code are used).

The mitigation does not in any way try to do anything about hyperthreading;
it is possible for a sibling thread to read data from the cache during a
vmexit, before the host completes the flush, or to read data from the cache
while a sibling runs. The suggestion there is to disable hyperthreading
unless you've configured your system to dedicate each core to a specific
guest.

Signed-off-by: Paolo Bonzini
Signed-off-by: Konrad Rzeszutek Wilk
---
v2: Add checks for X86_BUG_L1TF
   - Rework the commit description.
v3: change module parameter to S_IRUGO
    move the kvm_l1d_flush closer to VMENTER
    move the module parameter so it is in alphabetical order-ish.
    add two extra places that are used by handle_[vmptrld, vmptrst,
    vmclear, vmwrite, vmread, invept, invvpid]. Can be changed to be
    only specific handlers.
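
For anyone reviewing the three vmentry_l1d_flush modes described above,
here is a rough userspace model of the per-VMENTER decision. It is only a
sketch and not part of the patch: the struct and function names are made
up for illustration, 'applicable' stands in for the EPT / hypervisor /
X86_BUG_L1TF checks, and the way mode 1 consumes vcpu_unconfined is an
assumption based on the commit message rather than on code shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names do not exist in the kernel. */
struct vcpu_model {
	bool vcpu_unconfined;	/* set by "unsafe" vmexit handlers */
	bool flush_cache_req;	/* checked right before VMENTER */
};

/*
 * Decide whether the next VMENTER needs an L1D flush.
 * mode is the vmentry_l1d_flush value (0, 1 or 2); 'applicable' is
 * false when the mitigation does not apply to this host at all.
 */
bool model_prepare_guest_switch(struct vcpu_model *v, int mode,
				bool applicable)
{
	if (!applicable || mode == 0) {
		v->flush_cache_req = false;
	} else if (mode == 1) {
		/* Assumption: conditional mode consumes vcpu_unconfined. */
		v->flush_cache_req = v->vcpu_unconfined;
		v->vcpu_unconfined = false;
	} else {
		/* 2 and any other value: flush on every VMENTER. */
		v->flush_cache_req = true;
	}
	return v->flush_cache_req;
}

/* Small self-check walking through the three modes. */
void model_selfcheck(void)
{
	struct vcpu_model v = { .vcpu_unconfined = true };

	assert(model_prepare_guest_switch(&v, 1, true));	/* unsafe exit seen */
	assert(!model_prepare_guest_switch(&v, 1, true));	/* flag was consumed */
	assert(model_prepare_guest_switch(&v, 2, true));	/* always flush */
	assert(!model_prepare_guest_switch(&v, 0, true));	/* mitigation off */
	assert(!model_prepare_guest_switch(&v, 2, false));	/* not applicable */
}
```

Calling model_selfcheck() from a test harness exercises each mode once;
the point is just that mode 1 flushes only after a handler has marked the
vCPU as unconfined, while mode 2 flushes unconditionally.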
---
 Documentation/admin-guide/kernel-parameters.txt | 11 +++++
 arch/x86/include/asm/kvm_host.h                 |  8 ++++
 arch/x86/kvm/mmu.c                              |  1 +
 arch/x86/kvm/svm.c                              |  1 +
 arch/x86/kvm/vmx.c                              | 51 +++++++++++++++++++-
 arch/x86/kvm/x86.c                              | 63 ++++++++++++++++++++++++-
 6 files changed, 133 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d59b34d4e62a..b8f7a4ab693a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1919,6 +1919,17 @@
 	kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
 				   Default is false (don't support).
 
+	kvm.vmentry_l1d_flush=[KVM] Mitigation for L1 Terminal Fault CVE.
+			Valid arguments: 0, 1, 2
+
+			2 does an L1 cache flush on every VMENTER.
+			1 tries to avoid so many L1 cache flush on VMENTERs and instead
+			do it only if the kind of code that is executed would lead to
+			leaking host memory.
+			0 disables the mitigation
+
+			Default is 1 (do L1 cache flush in specific instances)
+
 	kvm.mmu_audit=	[KVM] This is a R/W parameter which allows audit
 			KVM MMU at runtime.
 			Default is 0 (off)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c13cd28d9d1b..78748925a370 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -713,6 +713,12 @@ struct kvm_vcpu_arch {
 
 	/* be preempted when it's in kernel-mode(cpl=0) */
 	bool preempted_in_kernel;
+
+	/* for L1 terminal fault vulnerability */
+	bool vcpu_unconfined;
+
+	/* must flush the L1 Data cache */
+	bool flush_cache_req;
 };
 
 struct kvm_lpage_info {
@@ -881,6 +887,7 @@ struct kvm_vcpu_stat {
 	u64 signal_exits;
 	u64 irq_window_exits;
 	u64 nmi_window_exits;
+	u64 l1d_flush;
 	u64 halt_exits;
 	u64 halt_successful_poll;
 	u64 halt_attempted_poll;
@@ -1449,6 +1456,7 @@ bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 
 void kvm_set_msi_irq(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
 		     struct kvm_lapic_irq *irq);
+void kvm_l1d_flush(void);
 
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d594690d8b95..4d4e3dc2494e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3840,6 +3840,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 {
 	int r = 1;
 
+	vcpu->arch.vcpu_unconfined = true;
 	switch (vcpu->arch.apf.host_apf_reason) {
 	default:
 		trace_kvm_page_fault(fault_address, error_code);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f059a73f0fd0..fffc447f5410 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5437,6 +5437,7 @@ static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa)
 
 static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
 {
+	vcpu->arch.flush_cache_req = false;
 }
 
 static inline void sync_cr8_to_lapic(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f08e33fc28ac..a51418429165 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -117,6 +117,9 @@ static u64 __read_mostly host_xss;
 static bool __read_mostly enable_pml = 1;
 module_param_named(pml, enable_pml, bool, S_IRUGO);
 
+static int __read_mostly vmentry_l1d_flush = 1;
+module_param(vmentry_l1d_flush, int, S_IRUGO);
+
 #define MSR_TYPE_R	1
 #define MSR_TYPE_W	2
 #define MSR_TYPE_RW	3
@@ -2621,6 +2624,45 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 				   vmx->guest_msrs[i].mask);
 }
 
+static void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu)
+{
+	vmx_save_host_state(vcpu);
+
+	if (!enable_ept || static_cpu_has(X86_FEATURE_HYPERVISOR) ||
+	    !static_cpu_has(X86_BUG_L1TF)) {
+		vcpu->arch.flush_cache_req = false;
+		return;
+	}
+
+	switch (vmentry_l1d_flush) {
+	case 0:
+		vcpu->arch.flush_cache_req = false;
+		break;
+	case 1:
+		/*
+		 * If vmentry_l1d_flush is 1, each vmexit handler is responsible for
+		 * setting vcpu->arch.vcpu_unconfined. Currently this happens in the
+		 * following cases:
+		 * - vmlaunch/vmresume: we do not want the cache to be cleared by a
+		 *   nested hypervisor *and* by KVM on bare metal, so we just do it
+		 *   on every nested entry. Nested hypervisors do not bother clearing
+		 *   the cache.
+		 * - anything that runs the emulator (the slow paths for EPT misconfig
+		 *   or I/O instruction)
+		 * - anything that can cause get_user_pages (EPT violation, and again
+		 *   the slow paths for EPT misconfig or I/O instruction)
+		 * - anything that can run code outside KVM (external interrupt,
+		 *   which can run interrupt handlers or irqs; or the sched_in
+		 *   preempt notifier)
+		 */
+		break;
+	case 2:
+	default:
+		vcpu->arch.flush_cache_req = true;
+		break;
+	}
+}
+
 static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 {
 	if (!vmx->host_state.loaded)
@@ -9754,6 +9796,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 			[ss]"i"(__KERNEL_DS),
 			[cs]"i"(__KERNEL_CS)
 			);
+		vcpu->arch.vcpu_unconfined = true;
 	}
 }
 STACK_FRAME_NON_STANDARD(vmx_handle_external_intr);
@@ -10011,6 +10054,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	evmcs_rsp = static_branch_unlikely(&enable_evmcs) ?
 		(unsigned long)&current_evmcs->host_rsp : 0;
 
+	if (vcpu->arch.flush_cache_req)
+		kvm_l1d_flush();
+
 	asm(
 		/* Store host registers */
 		"push %%" _ASM_DX "; push %%" _ASM_BP ";"
@@ -11824,6 +11870,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 		return ret;
 	}
 
+	/* Hide L1D cache contents from the nested guest. */
+	vmx->vcpu.arch.vcpu_unconfined = true;
+
 	/*
 	 * If we're entering a halted L2 vcpu and the L2 vcpu won't be woken
 	 * by event injection, halt vcpu.
@@ -12941,7 +12990,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.vcpu_free = vmx_free_vcpu,
 	.vcpu_reset = vmx_vcpu_reset,
 
-	.prepare_guest_switch = vmx_save_host_state,
+	.prepare_guest_switch = vmx_prepare_guest_switch,
 	.vcpu_load = vmx_vcpu_load,
 	.vcpu_put = vmx_vcpu_put,
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..4d2e4975f91d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -195,6 +195,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "irq_injections", VCPU_STAT(irq_injections) },
 	{ "nmi_injections", VCPU_STAT(nmi_injections) },
 	{ "req_event", VCPU_STAT(req_event) },
+	{ "l1d_flush", VCPU_STAT(l1d_flush) },
 	{ "mmu_shadow_zapped", VM_STAT(mmu_shadow_zapped) },
 	{ "mmu_pte_write", VM_STAT(mmu_pte_write) },
 	{ "mmu_pte_updated", VM_STAT(mmu_pte_updated) },
@@ -4799,6 +4800,8 @@ int kvm_read_guest_virt(struct kvm_vcpu *vcpu,
 {
 	u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
 
+	/* The gva_to_pa walker can pull in tons of pages. */
+	vcpu->arch.vcpu_unconfined = true;
 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, access,
 					  exception);
 }
@@ -4874,6 +4877,9 @@ static int emulator_write_std(struct x86_emulate_ctxt *ctxt, gva_t addr, void *v
 int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
 				unsigned int bytes, struct x86_exception *exception)
 {
+	/* kvm_write_guest_virt_system can pull in tons of pages. */
+	vcpu->arch.vcpu_unconfined = true;
+
 	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
 					   PFERR_WRITE_MASK, exception);
 }
@@ -6050,6 +6056,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 	bool writeback = true;
 	bool write_fault_to_spt = vcpu->arch.write_fault_to_shadow_pgtable;
 
+	vcpu->arch.vcpu_unconfined = true;
+
 	/*
 	 * Clear write_fault_to_shadow_pgtable here to ensure it is
 	 * never reused.
@@ -6533,10 +6541,49 @@ static struct notifier_block pvclock_gtod_notifier = {
 };
 #endif
 
+
+/*
+ * The L1D cache is 32 KiB on Skylake, but to flush it we have to read in
+ * 64 KiB because the replacement algorithm is not exactly LRU.
+ */
+#define L1D_CACHE_ORDER 4
+static void *__read_mostly empty_zero_pages;
+
+void kvm_l1d_flush(void)
+{
+	/* FIXME: could this be boot_cpu_data.x86_cache_size * 2? */
+	int size = PAGE_SIZE << L1D_CACHE_ORDER;
+
+	ASSERT(boot_cpu_has(X86_BUG_L1TF));
+
+	asm volatile(
+		/* First ensure the pages are in the TLB */
+		"xorl %%eax, %%eax\n\t"
+		"11: \n\t"
+		"movzbl (%0, %%" _ASM_AX "), %%ecx\n\t"
+		"addl $4096, %%eax\n\t"
+		"cmpl %%eax, %1\n\t"
+		"jne 11b\n\t"
+		"xorl %%eax, %%eax\n\t"
+		"cpuid\n\t"
+		/* Now fill the cache */
+		"xorl %%eax, %%eax\n\t"
+		"12:\n\t"
+		"movzbl (%0, %%" _ASM_AX "), %%ecx\n\t"
+		"addl $64, %%eax\n\t"
+		"cmpl %%eax, %1\n\t"
+		"jne 12b\n\t"
+		"lfence\n\t"
+		: : "r" (empty_zero_pages), "r" (size)
+		: "eax", "ebx", "ecx", "edx");
+}
+EXPORT_SYMBOL_GPL(kvm_l1d_flush);
+
 int kvm_arch_init(void *opaque)
 {
 	int r;
 	struct kvm_x86_ops *ops = opaque;
+	struct page *page;
 
 	if (kvm_x86_ops) {
 		printk(KERN_ERR "kvm: already loaded the other module\n");
@@ -6556,10 +6603,15 @@ int kvm_arch_init(void *opaque)
 	}
 
 	r = -ENOMEM;
+	page = alloc_pages(GFP_ATOMIC, L1D_CACHE_ORDER);
+	if (!page)
+		goto out;
+	empty_zero_pages = page_address(page);
+
 	shared_msrs = alloc_percpu(struct kvm_shared_msrs);
 	if (!shared_msrs) {
 		printk(KERN_ERR "kvm: failed to allocate percpu kvm_shared_msrs\n");
-		goto out;
+		goto out_free_zero_pages;
 	}
r = kvm_mmu_module_init(); @@ -6590,6 +6642,8 @@ int kvm_arch_init(void *opaque) return 0; +out_free_zero_pages: + free_pages((unsigned long)empty_zero_pages, L1D_CACHE_ORDER); out_free_percpu: free_percpu(shared_msrs); out: @@ -6614,6 +6668,7 @@ void kvm_arch_exit(void) #endif kvm_x86_ops = NULL; kvm_mmu_module_exit(); + free_pages((unsigned long)empty_zero_pages, L1D_CACHE_ORDER); free_percpu(shared_msrs); } @@ -7392,7 +7447,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) preempt_disable(); + vcpu->arch.flush_cache_req = vcpu->arch.vcpu_unconfined; kvm_x86_ops->prepare_guest_switch(vcpu); + vcpu->arch.vcpu_unconfined = false; + if (vcpu->arch.flush_cache_req) + vcpu->stat.l1d_flush++; /* * Disable IRQs before setting IN_GUEST_MODE. Posted interrupt @@ -7579,6 +7638,7 @@ static int vcpu_run(struct kvm_vcpu *vcpu) struct kvm *kvm = vcpu->kvm; vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); + vcpu->arch.vcpu_unconfined = true; for (;;) { if (kvm_vcpu_running(vcpu)) { @@ -8698,6 +8758,7 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) { + vcpu->arch.vcpu_unconfined = true; kvm_x86_ops->sched_in(vcpu, cpu); } -- 2.14.3 --OgqxwSJOaUobr8KG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0003-x86-KVM-Use-L1-cache-flush-before-VMENTER-if-availab.patch" >From f8f17165dd3470eda0d87bc8e77a07631819fa9c Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Wed, 20 Jun 2018 17:20:11 -0400 Subject: [PATCH v4 3/8] x86/KVM: Use L1 cache flush before VMENTER if available. 336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR (IA32_FLUSH_CMD aka 0x10B) which has similar write-only semantics to other MSRs defined in the document. The semantics of this MSR is to allow "finer granularity invalidation of caching structures than existing mechanisms like WBINVD. 
It will writeback and invalidate the L1 data cache, including all cachelines brought in by preceding instructions, without invalidating all caches (eg. L2 or LLC). Some processors may also invalidate the first level level instruction cache on a L1D_FLUSH command. The L1 data and instruction caches may be shared across the logical processors of a core." Hence, right before we do a VMENTER we need to flush the L1 data cache to prevent untrusted guests from reading host memory that is cached in the L1 data cache. A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199511 Signed-off-by: Paolo Bonzini Signed-off-by: Konrad Rzeszutek Wilk --- v3: Redo the commit --- arch/x86/include/asm/msr-index.h | 6 ++++++ arch/x86/kvm/x86.c | 10 ++++++++-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 68b2c3150de1..0e7517089b80 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -76,6 +76,12 @@ * control required. */ +#define MSR_IA32_FLUSH_CMD 0x0000010b +#define L1D_FLUSH (1 << 0) /* + * Writeback and invalidate the + * L1 data cache. + */ + #define MSR_IA32_BBL_CR_CTL 0x00000119 #define MSR_IA32_BBL_CR_CTL3 0x0000011e diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 4d2e4975f91d..f0f25d31e5e2 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6551,11 +6551,17 @@ static void *__read_mostly empty_zero_pages; void kvm_l1d_flush(void) { - /* FIXME: could this be boot_cpu_data.x86_cache_size * 2? */ - int size = PAGE_SIZE << L1D_CACHE_ORDER; + int size; ASSERT(boot_cpu_has(X86_BUG_L1TF)); + if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) { + wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH); + return; + } + + /* FIXME: could this be boot_cpu_data.x86_cache_size * 2? 
*/ + size = PAGE_SIZE << L1D_CACHE_ORDER; asm volatile( /* First ensure the pages are in the TLB */ "xorl %%eax, %%eax\n\t" -- 2.14.3 --OgqxwSJOaUobr8KG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0004-x86-KVM-VMX-Split-the-VMX-MSR-LOAD-structures-to-hav.patch" >From 1e0c2fce8c0acd05640d255456b8800f379a2553 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 20 Jun 2018 13:58:37 -0400 Subject: [PATCH v4 4/8] x86/KVM/VMX: Split the VMX MSR LOAD structures to have host/guest numbers. There is no semantic change, but with this change we can allow an unbalanced number of MSRs to be loaded on VMEXIT and VMENTER. That is, the number of MSRs to save or restore on VMEXIT or VMENTER may be different. Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/kvm/vmx.c | 65 +++++++++++++++++++++++++++++------------------------- 1 file changed, 35 insertions(+), 30 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index a51418429165..3649bf3b3b82 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -763,6 +763,11 @@ static inline int pi_test_sn(struct pi_desc *pi_desc) (unsigned long *)&pi_desc->control); } +struct vmx_msrs { + unsigned nr; + struct vmx_msr_entry val[NR_AUTOLOAD_MSRS]; +}; + struct vcpu_vmx { struct kvm_vcpu vcpu; unsigned long host_rsp; @@ -796,9 +801,8 @@ struct vcpu_vmx { struct loaded_vmcs *loaded_vmcs; bool __launched; /* temporary, used in vmx_vcpu_run */ struct msr_autoload { - unsigned nr; - struct vmx_msr_entry guest[NR_AUTOLOAD_MSRS]; - struct vmx_msr_entry host[NR_AUTOLOAD_MSRS]; + struct vmx_msrs guest; + struct vmx_msrs host; } msr_autoload; struct { int loaded; @@ -2395,18 +2399,18 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr) } break; } - - for (i = 0; i < m->nr; ++i) - if (m->guest[i].index == msr) + for (i = 0; i < m->guest.nr; ++i) + if (m->guest.val[i].index == msr) break; - if (i == m->nr) + if (i == m->guest.nr) return; - --m->nr; - 
m->guest[i] = m->guest[m->nr]; - m->host[i] = m->host[m->nr]; - vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->nr); - vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->nr); + --m->guest.nr; + --m->host.nr; + m->guest.val[i] = m->guest.val[m->guest.nr]; + m->host.val[i] = m->host.val[m->host.nr]; + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr); + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr); } static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx, @@ -2458,24 +2462,25 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, wrmsrl(MSR_IA32_PEBS_ENABLE, 0); } - for (i = 0; i < m->nr; ++i) - if (m->guest[i].index == msr) + for (i = 0; i < m->guest.nr; ++i) + if (m->guest.val[i].index == msr) break; if (i == NR_AUTOLOAD_MSRS) { printk_once(KERN_WARNING "Not enough msr switch entries. " "Can't add msr %x\n", msr); return; - } else if (i == m->nr) { - ++m->nr; - vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->nr); - vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->nr); + } else if (i == m->guest.nr) { + ++m->guest.nr; + ++m->host.nr; + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr); + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr); } - m->guest[i].index = msr; - m->guest[i].value = guest_val; - m->host[i].index = msr; - m->host[i].value = host_val; + m->guest.val[i].index = msr; + m->guest.val[i].value = guest_val; + m->host.val[i].index = msr; + m->host.val[i].value = host_val; } static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset) @@ -6284,9 +6289,9 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx) vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0); vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0); - vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host)); + vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val)); vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 0); - vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest)); + vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val)); if (vmcs_config.vmentry_ctrl & 
VM_ENTRY_LOAD_IA32_PAT) vmcs_write64(GUEST_IA32_PAT, vmx->vcpu.arch.pat); @@ -11286,10 +11291,10 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) * Set the MSR load/store lists to match L0's settings. */ vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0); - vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.nr); - vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host)); - vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.nr); - vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest)); + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr); + vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val)); + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); + vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val)); set_cr4_guest_host_mask(vmx); @@ -12393,8 +12398,8 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, vmx_segment_cache_clear(vmx); /* Update any VMCS fields that might have changed while L2 ran */ - vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.nr); - vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.nr); + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr); + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); vmcs_write64(TSC_OFFSET, vcpu->arch.tsc_offset); if (vmx->hv_deadline_tsc == -1) vmcs_clear_bits(PIN_BASED_VM_EXEC_CONTROL, -- 2.14.3 --OgqxwSJOaUobr8KG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0005-x86-KVM-VMX-Add-find_msr-helper-function.patch" >From f30e50346ac1bbc94a8f81670d0be8eacec78b3a Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 20 Jun 2018 20:11:39 -0400 Subject: [PATCH v4 5/8] x86/KVM/VMX: Add find_msr helper function .. to help find the MSR on either the guest or host MSR list. 
Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/kvm/vmx.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 3649bf3b3b82..f9d70b3e9fcd 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2376,9 +2376,20 @@ static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx, vm_exit_controls_clearbit(vmx, exit); } +static int find_msr(struct vmx_msrs *m, unsigned msr) +{ + unsigned int i; + + for (i = 0; i < m->nr; ++i) { + if (m->val[i].index == msr) + return i; + } + return -ENOENT; +} + static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr) { - unsigned i; + int i; struct msr_autoload *m = &vmx->msr_autoload; switch (msr) { @@ -2399,11 +2410,8 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr) } break; } - for (i = 0; i < m->guest.nr; ++i) - if (m->guest.val[i].index == msr) - break; - - if (i == m->guest.nr) + i = find_msr(&m->guest, msr); + if (i < 0) return; --m->guest.nr; --m->host.nr; @@ -2427,7 +2435,7 @@ static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx, static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, u64 guest_val, u64 host_val) { - unsigned i; + int i; struct msr_autoload *m = &vmx->msr_autoload; switch (msr) { @@ -2462,16 +2470,13 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, wrmsrl(MSR_IA32_PEBS_ENABLE, 0); } - for (i = 0; i < m->guest.nr; ++i) - if (m->guest.val[i].index == msr) - break; - + i = find_msr(&m->guest, msr); if (i == NR_AUTOLOAD_MSRS) { printk_once(KERN_WARNING "Not enough msr switch entries. 
" "Can't add msr %x\n", msr); return; - } else if (i == m->guest.nr) { - ++m->guest.nr; + } else if (i < 0) { + i = m->guest.nr++; ++m->host.nr; vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr); vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr); -- 2.14.3 --OgqxwSJOaUobr8KG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0006-x86-KVM-VMX-Seperate-the-VMX-AUTOLOAD-guest-host-num.patch" >From 14d70d5678a950aa213d8992e1fef845001c6d68 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 20 Jun 2018 22:00:47 -0400 Subject: [PATCH v4 6/8] x86/KVM/VMX: Seperate the VMX AUTOLOAD guest/host number accounting. We can now load a different number of MSRs depending on if we are doing VMEXIT vs VMENTER. Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/kvm/vmx.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index f9d70b3e9fcd..e7c69ef5b918 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2412,12 +2412,18 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr) } i = find_msr(&m->guest, msr); if (i < 0) - return; + goto skip_guest; --m->guest.nr; - --m->host.nr; m->guest.val[i] = m->guest.val[m->guest.nr]; - m->host.val[i] = m->host.val[m->host.nr]; vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr); + +skip_guest: + i = find_msr(&m->host, msr); + if (i < 0) + return; + + --m->host.nr; + m->host.val[i] = m->host.val[m->host.nr]; vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr); } @@ -2435,7 +2441,7 @@ static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx, static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, u64 guest_val, u64 host_val) { - int i; + int i, j; struct msr_autoload *m = &vmx->msr_autoload; switch (msr) { @@ -2471,21 +2477,24 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, } i = find_msr(&m->guest, msr); - if (i == NR_AUTOLOAD_MSRS) { + j = 
find_msr(&m->host, msr); + if (i == NR_AUTOLOAD_MSRS || j == NR_AUTOLOAD_MSRS) { printk_once(KERN_WARNING "Not enough msr switch entries. " "Can't add msr %x\n", msr); return; - } else if (i < 0) { + } + if (i < 0) { i = m->guest.nr++; - ++m->host.nr; vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr); + } + if (j < 0) { + j = m->host.nr++; vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr); } - m->guest.val[i].index = msr; m->guest.val[i].value = guest_val; - m->host.val[i].index = msr; - m->host.val[i].value = host_val; + m->host.val[j].index = msr; + m->host.val[j].value = host_val; } static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset) -- 2.14.3 --OgqxwSJOaUobr8KG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0007-x86-KVM-VMX-Add-framework-for-having-disjoint-amount.patch" >From a89a046765ad987bf7561b13db0829083f40a52b Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 20 Jun 2018 22:01:22 -0400 Subject: [PATCH v4 7/8] x86/KVM/VMX: Add framework for having disjoint amount of MSRs to save/restore Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/kvm/vmx.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index e7c69ef5b918..9b18848ccaba 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2439,9 +2439,9 @@ static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx, } static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, - u64 guest_val, u64 host_val) + u64 guest_val, u64 host_val, bool entry_only) { - int i, j; + int i, j = 0; struct msr_autoload *m = &vmx->msr_autoload; switch (msr) { @@ -2477,7 +2477,9 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, } i = find_msr(&m->guest, msr); - j = find_msr(&m->host, msr); + if (!entry_only) + j = find_msr(&m->host, msr); + if (i == NR_AUTOLOAD_MSRS || j == NR_AUTOLOAD_MSRS) { printk_once(KERN_WARNING "Not enough msr 
switch entries. " "Can't add msr %x\n", msr); @@ -2487,12 +2489,16 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr, i = m->guest.nr++; vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr); } + m->guest.val[i].index = msr; + m->guest.val[i].value = guest_val; + + if (entry_only) + return; + if (j < 0) { j = m->host.nr++; vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr); } - m->guest.val[i].index = msr; - m->guest.val[i].value = guest_val; m->host.val[j].index = msr; m->host.val[j].value = host_val; } @@ -2538,7 +2544,7 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset) guest_efer &= ~EFER_LME; if (guest_efer != host_efer) add_atomic_switch_msr(vmx, MSR_EFER, - guest_efer, host_efer); + guest_efer, host_efer, false); return false; } else { guest_efer &= ~ignore_bits; @@ -4031,7 +4037,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vcpu->arch.ia32_xss = data; if (vcpu->arch.ia32_xss != host_xss) add_atomic_switch_msr(vmx, MSR_IA32_XSS, - vcpu->arch.ia32_xss, host_xss); + vcpu->arch.ia32_xss, host_xss, false); else clear_atomic_switch_msr(vmx, MSR_IA32_XSS); break; @@ -9978,7 +9984,7 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) clear_atomic_switch_msr(vmx, msrs[i].msr); else add_atomic_switch_msr(vmx, msrs[i].msr, msrs[i].guest, - msrs[i].host); + msrs[i].host, false); } static void vmx_arm_hv_timer(struct kvm_vcpu *vcpu) -- 2.14.3 --OgqxwSJOaUobr8KG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0008-x86-KVM-VMX-Use-MSR-save-list-for-IA32_FLUSH_CMD-if-.patch" >From 1f42b4b6fb2f71db1ffd8b0be210425f0aba866c Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 20 Jun 2018 22:32:56 -0400 Subject: [PATCH v4 8/8] x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required. 
If the module parameter is to flush the L1D cache on every VMENTER then we can optimize by using the MSR save list to have the CPU poke the MSR with the proper value right at VMENTER boundary. Signed-off-by: Konrad Rzeszutek Wilk --- v3: Actually engage the MSR save list Move it to function that frobs VMCS --- arch/x86/kvm/vmx.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 9b18848ccaba..020145adc546 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2649,15 +2649,22 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu) vmx->guest_msrs[i].mask); } -static void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu) +static bool vmx_l1d_cache_flush_req(struct kvm_vcpu *vcpu) { - vmx_save_host_state(vcpu); - if (!enable_ept || static_cpu_has(X86_FEATURE_HYPERVISOR) || !static_cpu_has(X86_BUG_L1TF)) { vcpu->arch.flush_cache_req = false; - return; + return false; } + return true; +} + +static void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu) +{ + vmx_save_host_state(vcpu); + + if (!vmx_l1d_cache_flush_req(vcpu)) + return; switch (vmentry_l1d_flush) { case 0: @@ -6352,6 +6359,15 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx) vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg)); vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1); } + + /* + * If we enforce flushing the L1D cache on every VMENTER lets use the + * MSR save list. + */ + if (vmx_l1d_cache_flush_req(&vmx->vcpu)) + if (vmentry_l1d_flush == 2) + add_atomic_switch_msr(vmx, MSR_IA32_FLUSH_CMD, + L1D_FLUSH, 0, true); } static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) @@ -10079,7 +10095,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) evmcs_rsp = static_branch_unlikely(&enable_evmcs) ? (unsigned long)&current_evmcs->host_rsp : 0; - if (vcpu->arch.flush_cache_req) + if (vcpu->arch.flush_cache_req && vmentry_l1d_flush != 1) kvm_l1d_flush(); asm( -- 2.14.3 --OgqxwSJOaUobr8KG--
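For anyone following the list bookkeeping in patches 4-8 without the tree at hand, below is a rough user-space sketch of the idea: two independently counted lists (guest side loaded on VMENTER, host side loaded on VMEXIT), a lookup helper, swap-with-last removal, and an entry-only add as used for the flush MSR. This is not the kernel code. The names msr_list, list_find, list_remove, autoload_add, autoload_clear and the capacity NR_SLOTS are made up for illustration, there are no VMCS writes, and the "list is full" test is done against the count, which is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SLOTS 8	/* hypothetical capacity, stands in for NR_AUTOLOAD_MSRS */

struct msr_entry {
	uint32_t index;
	uint64_t value;
};

/* One list with its own count, analogous to struct vmx_msrs in patch 4. */
struct msr_list {
	unsigned int nr;
	struct msr_entry val[NR_SLOTS];
};

struct msr_autoload {
	struct msr_list guest;	/* loaded on VMENTER */
	struct msr_list host;	/* loaded on VMEXIT */
};

/* Return the slot holding @msr, or -1 if it is not in the list. */
static int list_find(const struct msr_list *m, uint32_t msr)
{
	for (unsigned int i = 0; i < m->nr; ++i)
		if (m->val[i].index == msr)
			return (int)i;
	return -1;
}

/* Remove @msr by moving the last entry into its slot; order is not kept. */
static void list_remove(struct msr_list *m, uint32_t msr)
{
	int i = list_find(m, msr);

	if (i < 0)
		return;
	--m->nr;
	m->val[i] = m->val[m->nr];
}

/*
 * Add or update @msr. With @entry_only set, only the guest (VMENTER) list
 * is touched, so the two counts may differ. Returns 0 on success and -1
 * when a new slot is needed but the list is already full.
 */
static int autoload_add(struct msr_autoload *a, uint32_t msr,
			uint64_t guest_val, uint64_t host_val, int entry_only)
{
	int i, j;

	assert(a != NULL);
	i = list_find(&a->guest, msr);
	j = entry_only ? 0 : list_find(&a->host, msr);

	if ((i < 0 && a->guest.nr == NR_SLOTS) ||
	    (!entry_only && j < 0 && a->host.nr == NR_SLOTS))
		return -1;

	if (i < 0)
		i = (int)a->guest.nr++;
	a->guest.val[i].index = msr;
	a->guest.val[i].value = guest_val;

	if (entry_only)
		return 0;

	if (j < 0)
		j = (int)a->host.nr++;
	a->host.val[j].index = msr;
	a->host.val[j].value = host_val;
	return 0;
}

/* Drop @msr from both lists; each side is looked up independently. */
static void autoload_clear(struct msr_autoload *a, uint32_t msr)
{
	list_remove(&a->guest, msr);
	list_remove(&a->host, msr);
}
```

With a zero-initialised struct msr_autoload, an entry-only add of 0x10b (the IA32_FLUSH_CMD index from patch 3) leaves guest.nr at 1 and host.nr at 0. A later regular add grows both lists, and clearing 0x10b shrinks only the guest side.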