From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mout.kundenserver.de ([212.227.126.131]:55473 "EHLO
	mout.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750990AbdABIro (ORCPT ); Mon, 2 Jan 2017 03:47:44 -0500
From: Arnd Bergmann
Subject: Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR
Date: Mon, 02 Jan 2017 09:44:46 +0100
Message-ID: <2736959.3MfCab47fD@wuerfel>
In-Reply-To: <20161227015413.187403-30-kirill.shutemov@linux.intel.com>
References: <20161227015413.187403-1-kirill.shutemov@linux.intel.com>
	<20161227015413.187403-30-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8BIT
Content-Type: text/plain; charset="UTF-8"
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: "Kirill A. Shutemov"
Cc: Linus Torvalds, Andrew Morton, x86@kernel.org, Thomas Gleixner,
	Ingo Molnar, "H. Peter Anvin", Andi Kleen, Dave Hansen,
	Andy Lutomirski, linux-arch@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Catalin Marinas, Will Deacon
Message-ID: <20170102084446.uWEp3_paRD2USzyXL0vEscWa5hAwGSdv-XXZokC3V9w@z>

On Tuesday, December 27, 2016 4:54:13 AM CET Kirill A. Shutemov wrote:
> This patch introduces a new rlimit resource to manage the maximum
> virtual address available for userspace to map.
>
> On x86, 5-level paging enables a 56-bit userspace virtual address space.
> Not all user space is ready to handle wide addresses. It's known that
> at least some JIT compilers use the high bits in pointers to encode
> their information. This collides with valid pointers under 5-level
> paging and leads to crashes.
>
> The patch aims to address this compatibility issue.
>
> MM would use min(RLIMIT_VADDR, TASK_SIZE) as the upper limit of the
> virtual address available for userspace to map.
>
> The default hard limit will be RLIM_INFINITY, which basically means that
> TASK_SIZE limits the available address space.
>
> The soft limit will also be RLIM_INFINITY everywhere except on machines
> with 5-level paging enabled. In that case, the soft limit would be
> (1UL << 47) - PAGE_SIZE. That is the current x86-64 TASK_SIZE_MAX with
> 4-level paging, which is known to be safe.
>
> The new rlimit resource would follow the usual semantics with regard to
> inheritance: preserved on fork(2) and exec(2). This has the potential to
> break applications if limits are set too wide or too narrow, but this is
> not uncommon for other resources (consider RLIMIT_DATA or RLIMIT_AS).
>
> As with other resources, you can set the limit lower than current usage.
> It would affect only future virtual address space allocations.
>
> Use-cases for the new rlimit:
>
>   - Bumping the soft limit to RLIM_INFINITY allows the current process
>     and all its children to use addresses above 47 bits.
>
>   - Bumping the soft limit to RLIM_INFINITY after fork(2) but before
>     exec(2) allows the child to use addresses above 47 bits.
>
>   - Lowering the hard limit to 47 bits would prevent the current process
>     and all its children from using addresses above 47 bits, unless a
>     process has CAP_SYS_RESOURCE.
>
>   - It can also be handy to lower the hard or soft limit to an arbitrary
>     address. User-mode emulation in QEMU may lower the limit to 32 bits
>     to emulate a 32-bit machine on a 64-bit host.
>
> TODO:
>   - port to non-x86;
>
> Not-yet-signed-off-by: Kirill A. Shutemov
> Cc: linux-api@vger.kernel.org

This seems to nicely address the same problem on arm64, which has
run into the same issue due to the various page table formats
that can currently be chosen at compile time.

I don't see how this interacts with the existing
PER_LINUX32/PER_LINUX32_3GB personality flags, but I assume you have
either already thought of that, or we can come up with a good way
to define what happens when conflicting settings are applied.

The two reasonable ways I can think of are to either use the
minimum of the two limits, or to make the personality syscall
set the soft rlimit and use whatever limit was last set.

	Arnd
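
To make the two options concrete, here is a rough userspace sketch of how they might differ. It is only an illustration under stated assumptions: struct vaddr_state, the helper functions, and the constants below are hypothetical names that do not come from the patch, and the 3 GB ceiling for PER_LINUX32_3GB and the 56-bit TASK_SIZE value are placeholders rather than the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these come from the patch. */
#define VADDR_INFINITY  UINT64_MAX                     /* plays the role of RLIM_INFINITY */
#define TASK_SIZE_5LVL  ((UINT64_C(1) << 56) - 4096)   /* illustrative 5-level TASK_SIZE */
#define LIMIT_47BIT     ((UINT64_C(1) << 47) - 4096)   /* (1UL << 47) - PAGE_SIZE */
#define LIMIT_3GB       UINT64_C(0xC0000000)           /* assumed PER_LINUX32_3GB ceiling */

/* Hypothetical per-task state: the two settings that may conflict. */
struct vaddr_state {
    uint64_t rlim_soft;        /* soft RLIMIT_VADDR */
    uint64_t personality_max;  /* ceiling implied by the personality flags */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Option 1: both settings persist; the effective limit is their minimum,
 * further capped by TASK_SIZE as in min(RLIMIT_VADDR, TASK_SIZE). */
static uint64_t effective_min(const struct vaddr_state *s, uint64_t task_size)
{
    assert(s != NULL);
    return min_u64(task_size, min_u64(s->rlim_soft, s->personality_max));
}

/* Option 2: the personality syscall simply rewrites the soft rlimit... */
static void personality_sets_rlimit(struct vaddr_state *s, uint64_t ceiling)
{
    assert(s != NULL);
    s->rlim_soft = ceiling;
}

/* ...and so does a setrlimit-style update, so whichever ran last wins. */
static void set_soft_rlimit(struct vaddr_state *s, uint64_t value)
{
    assert(s != NULL);
    s->rlim_soft = value;
}

static uint64_t effective_last_set(const struct vaddr_state *s, uint64_t task_size)
{
    assert(s != NULL);
    return min_u64(task_size, s->rlim_soft);
}
```

The visible difference in this sketch is what happens when the soft limit is raised after a restrictive personality has been selected: under option 1 the personality ceiling keeps applying, while under option 2 the later call replaces it and only TASK_SIZE still bounds the result.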