From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
Subject: Re: [PATCH v3 0/6] add non-strict mode support for arm-smmu-v3
Date: Thu, 26 Jul 2018 11:44:25 +0800
Message-ID: <5B594399.1080404@huawei.com>
References: <1531376312-2192-1-git-send-email-thunder.leizhen@huawei.com>
To: Robin Murphy, Jean-Philippe Brucker, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel, LinuxArm
List-Id: iommu@lists.linux-foundation.org

On 2018/7/25 5:51, Robin Murphy wrote:
> On 2018-07-12 7:18 AM, Zhen Lei wrote:
>> v2 -> v3:
>> Add a bootup option "iommu_strict_mode" so that the administrator can choose which
>> mode is used. The first 5 patches have not changed.
>> +    iommu_strict_mode=    [arm-smmu-v3]
>> +        0 - strict mode (default)
>> +        1 - non-strict mode
>>
>> v1 -> v2:
>> Use the lowest bit of the iova parameter of io_pgtable_ops.unmap to pass the strict mode:
>> 0, IOMMU_STRICT;
>> 1, IOMMU_NON_STRICT;
>> Treat 0 as IOMMU_STRICT, so that the unmap operation stays compatible with
>> other IOMMUs which still use strict mode. In other words, this patch series
>> will not impact other IOMMU drivers. I tried adding a new quirk IO_PGTABLE_QUIRK_NON_STRICT
>> in io_pgtable_cfg.quirks, but it cannot pass the strict mode of the domain from the SMMUv3
>> driver to the io-pgtable module.
>
> What exactly is the issue there? We don't have any problem with other quirks like NO_DMA, and as I said before, by the time we're allocating the io-pgtable in arm_smmu_domain_finalise() we already know everything there is to know about the domain.

Because userspace can map/unmap and start devices that access memory through VFIO.
So an attacker can:
1. allocate memory
2. map
3. unmap
4. free memory
5. repeatedly access the just-freed memory through the just-unmapped iova;
   this attack may succeed if the freed memory is reused by others while the mapping is still in the TLB

But if only the root user can use VFIO, this is an unnecessary worry. Then both the normal case and VFIO will use the
same strict-mode setting, so that the new quirk IO_PGTABLE_QUIRK_NON_STRICT can easily be applied.

>
>> Add a new member domain_non_strict in struct iommu_dma_cookie; this member will only be
>> initialized when the related domain and IOMMU driver support non-strict mode.
>>
>> v1:
>> In general, an IOMMU unmap operation follows the steps below:
>> 1. remove the mapping of the specified iova range from the page table
>> 2. execute a tlbi command to invalidate the mapping cached in the TLB
>> 3. wait for the above tlbi operation to finish
>> 4. free the IOVA resource
>> 5. free the physical memory resource
>>
>> This may be a problem when unmap is very frequent: the combination of tlbi
>> and wait operations consumes a lot of time. A feasible method is to put off
>> the tlbi and iova-free operations; when a certain number has accumulated or
>> a specified time is reached, execute only one tlbi_all command to clean up
>> the TLB, then free the queued IOVAs. This is called non-strict mode.
>>
>> But it must be noted that, although the mapping has already been removed from
>> the page table, it may still exist in the TLB. And the freed physical memory
>> may also be reused by others. So an attacker can persistently access memory
>> through the just-freed IOVA, to obtain sensitive data or corrupt memory. So
>> VFIO should always choose strict mode.
>>
>> Some may consider putting off the physical memory free as well, which would still
>> follow strict mode. But for the map_sg cases, the memory allocation is not controlled
>> by the IOMMU APIs, so it is not enforceable.
>>
>> Fortunately, Intel and AMD have already applied the non-strict mode, and put the
>> queue_iova() operation into the common file dma-iommu.c, and my work is based
>> on it. The difference is that the arm-smmu-v3 driver calls the common IOMMU APIs to
>> unmap, but the Intel and AMD IOMMU drivers do not.
>>
>> Below is the performance data of strict vs non-strict for an NVMe device:
>> Random Read  IOPS: 146K(strict) vs 573K(non-strict)
>> Random Write IOPS: 143K(strict) vs 513K(non-strict)
>
> How does that compare to passthrough performance? One thing I'm not entirely clear about is what the realistic use-case for this is - even if invalidation were infinitely fast, enabling translation still typically has a fair impact on overall system performance in terms of latency, power, memory bandwidth, etc., so I can't help wonder what devices exist today for which performance is critical and robustness* is unimportant, yet have crippled addressing capabilities such that they can't just use passthrough.
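
For readers following the thread, the strict and non-strict unmap sequences from the quoted v1 notes can be modelled with a small user-space sketch in C. This is not the kernel code from the series: every name here (toy_iommu, toy_unmap_strict, toy_flush_queue, TOY_QUEUE_DEPTH and so on) is hypothetical, the queue depth of 4 is arbitrary, and the time-based flush mentioned in the notes is left out. It only counts how many invalidations each mode issues and shows when a stale-TLB window may exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
#define TOY_QUEUE_DEPTH 4 /* arbitrary: flush after this many deferred unmaps */

struct toy_iommu {
	unsigned long pending[TOY_QUEUE_DEPTH]; /* unmapped, not yet freed */
	size_t nr_pending;
	unsigned long tlbi_count;  /* invalidate + sync rounds issued */
	unsigned long freed_iovas; /* IOVAs handed back to the allocator */
};

/* Steps 2 and 3: issue one invalidation and wait for it to complete. */
static void toy_tlbi_and_sync(struct toy_iommu *mmu)
{
	mmu->tlbi_count++;
}

/* Strict mode: every unmap pays for its own invalidate + sync. */
static void toy_unmap_strict(struct toy_iommu *mmu, unsigned long iova)
{
	(void)iova;             /* step 1 (clear page-table entry) not modelled */
	toy_tlbi_and_sync(mmu); /* steps 2-3 */
	mmu->freed_iovas++;     /* step 4: IOVA is reusable right away */
}

/* Drain the queue: one invalidate-all covers every pending IOVA. */
static void toy_flush_queue(struct toy_iommu *mmu)
{
	if (mmu->nr_pending == 0)
		return;
	toy_tlbi_and_sync(mmu);
	mmu->freed_iovas += mmu->nr_pending;
	mmu->nr_pending = 0;
}

/* Non-strict mode: queue the IOVA and flush only when the queue is full. */
static void toy_unmap_non_strict(struct toy_iommu *mmu, unsigned long iova)
{
	assert(mmu->nr_pending < TOY_QUEUE_DEPTH);
	mmu->pending[mmu->nr_pending++] = iova;
	if (mmu->nr_pending == TOY_QUEUE_DEPTH)
		toy_flush_queue(mmu);
}

/* True while a stale TLB entry for an already-unmapped IOVA may exist. */
static bool toy_has_stale_window(const struct toy_iommu *mmu)
{
	return mmu->nr_pending > 0;
}
```

In this model, eight strict unmaps issue eight invalidations, while eight non-strict unmaps issue two. The cost is that toy_has_stale_window() stays true between an unmap and the next flush, which is the window the VFIO concern in this thread is about.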
I have no passthrough performance data yet; I will ask my team to measure it. But we have tested
global bypass: random read IOPS is 744K, and random write IOPS is the same as non-strict.

I'm also not clear. But I think in most cases the system does not need to run at full capacity, yet it
should have that ability. For example, a system's daily load may be only 30-50%, but the load may increase to 80%+
on a festival day.

Passthrough is not enough to support VFIO, and virtualization needs the latter.

>
> Robin.
>
>
> * I don't want to say "security" here, since I'm actually a lot less concerned about the theoretical malicious endpoint/wild write scenarios than the much more straightforward malfunctioning device and/or buggy driver causing use-after-free style memory corruption. Also, I'm sick of the word "security"...

OK, we really have no need to consider buggy devices.

>
>>
>> Zhen Lei (6):
>>    iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook
>>    iommu/dma: add support for non-strict mode
>>    iommu/amd: use default branch to deal with all non-supported
>>      capabilities
>>    iommu/io-pgtable-arm: add support for non-strict mode
>>    iommu/arm-smmu-v3: add support for non-strict mode
>>    iommu/arm-smmu-v3: add bootup option "iommu_strict_mode"
>>
>>   Documentation/admin-guide/kernel-parameters.txt | 12 +++++++
>>   drivers/iommu/amd_iommu.c                       |  4 +--
>>   drivers/iommu/arm-smmu-v3.c                     | 42 +++++++++++++++++++++++--
>>   drivers/iommu/dma-iommu.c                       | 25 +++++++++++++++
>>   drivers/iommu/io-pgtable-arm.c                  | 23 ++++++++------
>>   include/linux/iommu.h                           |  7 +++++
>>   6 files changed, 98 insertions(+), 15 deletions(-)
>>
>
> .
>

-- 
Thanks!
BestRegards
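
P.S. For reference, the v1 -> v2 idea quoted above (passing the mode in the lowest bit of the unmap iova) can be illustrated with a minimal C sketch. It assumes the iova is page aligned, so bit 0 is otherwise unused. The values 0 for IOMMU_STRICT and 1 for IOMMU_NON_STRICT come from the cover letter; the helper names and TOY_PAGE_SIZE are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode values as listed in the cover letter. */
#define IOMMU_STRICT     0UL
#define IOMMU_NON_STRICT 1UL

/* Assumption for this sketch: 4 KiB pages, so bit 0 of an iova is free. */
#define TOY_PAGE_SIZE 4096UL

/* Hypothetical helper: fold the mode into bit 0 of a page-aligned iova. */
static unsigned long toy_iova_pack(unsigned long iova, unsigned long mode)
{
	assert((iova & (TOY_PAGE_SIZE - 1)) == 0);
	return iova | (mode & 1UL);
}

/* Hypothetical helper: recover the real address on the unmap side. */
static unsigned long toy_iova_addr(unsigned long packed)
{
	return packed & ~1UL;
}

/* Hypothetical helper: true if the caller asked for non-strict unmap. */
static bool toy_iova_non_strict(unsigned long packed)
{
	return (packed & 1UL) == IOMMU_NON_STRICT;
}
```

A caller that never sets the bit passes a plain aligned iova, which decodes as IOMMU_STRICT. That is the compatibility property the v1 -> v2 notes rely on for other IOMMU drivers.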