* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
@ 2016-08-16 18:32 Christopher Covington
  2016-08-17 10:30 ` Catalin Marinas
  0 siblings, 1 reply; 8+ messages in thread
From: Christopher Covington @ 2016-08-16 18:32 UTC
  To: linux-arm-kernel

Some userspace applications need to know the maximum virtual address they can
use (TASK_SIZE). There are several possible values for TASK_SIZE with the
arm64 kernel, and such applications are either making bad hard-coded
assumptions, or are guessing and checking using system calls like munmap(),
which may have other reasons for returning an error than TASK_SIZE being
exceeded.

To make correct functioning easy for userspace applications that need to know
the maximum virtual address they can use, communicate TASK_SIZE via the ELF
auxiliary vector, just like PAGE_SIZE is currently communicated.

Signed-off-by: Christopher Covington <cov@codeaurora.org>
---
Tested with the following commands:

LD_SHOW_AUXV=1 sleep 1 # GNU dynamic ld-linux*.so

hexdump -v -e '4/4 "%08x " "\n"' /proc/self/auxv | \
sed -r 's/0*([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+)/\1 0x\4\3/
s/^0 / NULL: /
s/^3 / PHDR: /
s/^4 / PHENT: /
s/^5 / PHNUM: /
s/^6 / PAGESZ: /
s/^7 / BASE: /
s/^8 / FLAGS: /
s/^9 / ENTRY: /
s/^b / UID: /
s/^c / EUID: /
s/^d / GID: /
s/^e / EGID: /
s/^f /PLATFORM: /
s/^10 / HWCAP: /
s/^11 / CLKTCK: /
s/^17 / SECURE: /
s/^19 / RANDOM: /
s/^1f / EXECFN: /
s/^21 / VDSO: /
s/^22 / TASKSZ: /' # compatible with static busybox
---
 arch/arm64/include/asm/elf.h         | 1 +
 arch/arm64/include/uapi/asm/auxvec.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index a55384f..3811795 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -145,6 +145,7 @@ typedef struct user_fpsimd_state elf_fpregset_t;
 do { \
 	NEW_AUX_ENT(AT_SYSINFO_EHDR, \
 		    (elf_addr_t)current->mm->context.vdso); \
+	NEW_AUX_ENT(AT_TASKSZ, TASK_SIZE); \
 } while (0)

 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES
diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
index 4cf0c17..595bfda 100644
--- a/arch/arm64/include/uapi/asm/auxvec.h
+++ b/arch/arm64/include/uapi/asm/auxvec.h
@@ -18,7 +18,8 @@

 /* vDSO location */
 #define AT_SYSINFO_EHDR 33
+#define AT_TASKSZ 34

-#define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO */
+#define AT_VECTOR_SIZE_ARCH 2 /* entries in ARCH_DLINFO */

 #endif
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
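
For illustration, a userspace consumer of the proposed entry might look like the minimal sketch below. It assumes a kernel carrying this patch: AT_TASKSZ and its value 34 are taken from the diff and are not defined by standard userspace headers, and the helper names task_size_from_auxv() and report_address_space() are hypothetical. getauxval() is the glibc interface (2.16 and later) for reading the auxiliary vector, and it returns 0 when the requested type is absent. Other architectures may already assign type 34 to an unrelated entry, so the lookup is only meaningful on arm64.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), glibc >= 2.16 */

/* Assumption: value proposed by this patch, not present in standard headers. */
#ifndef AT_TASKSZ
#define AT_TASKSZ 34
#endif

/*
 * Store the kernel-reported TASK_SIZE in *out and return true, or return
 * false (leaving *out untouched) when the entry is absent. getauxval()
 * returns 0 for types that are not in the auxiliary vector.
 */
static bool task_size_from_auxv(unsigned long *out)
{
    unsigned long val = getauxval(AT_TASKSZ);

    if (val == 0)
        return false;
    *out = val;
    return true;
}

/* Print the page size and, if available, the proposed TASK_SIZE entry. */
static void report_address_space(void)
{
    unsigned long task_size;

    printf("AT_PAGESZ: %lu\n", getauxval(AT_PAGESZ));
    if (task_size_from_auxv(&task_size))
        printf("AT_TASKSZ: 0x%lx\n", task_size);
    else
        printf("AT_TASKSZ: not provided by this kernel\n");
}
```

Calling report_address_space() from main() prints the page size and either the reported TASK_SIZE or a note that the entry is missing. A caller would still need a fallback for kernels without the entry.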

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-16 18:32 [PATCH] arm64: Expose TASK_SIZE to userspace via auxv Christopher Covington
@ 2016-08-17 10:30 ` Catalin Marinas
  2016-08-17 11:12 ` Christopher Covington
  0 siblings, 1 reply; 8+ messages in thread
From: Catalin Marinas @ 2016-08-17 10:30 UTC
  To: linux-arm-kernel

On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
> Some userspace applications need to know the maximum virtual address they can
> use (TASK_SIZE).

Just curious, what are the cases needing TASK_SIZE in user space?

-- 
Catalin

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-17 10:30 ` Catalin Marinas
@ 2016-08-17 11:12 ` Christopher Covington
  2016-08-18 12:00 ` Ard Biesheuvel
  2016-08-18 12:17 ` Richard Weinberger
  0 siblings, 2 replies; 8+ messages in thread
From: Christopher Covington @ 2016-08-17 11:12 UTC
  To: linux-arm-kernel

On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>> Some userspace applications need to know the maximum virtual address they can
>> use (TASK_SIZE).
>
>Just curious, what are the cases needing TASK_SIZE in user space?

Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases
I've run into. I've heard LuaJIT might have a similar situation. In general I
think making allocations from the top down is a shortcut for finding a large
unused region of memory.

Thanks,
Cov

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-17 11:12 ` Christopher Covington
@ 2016-08-18 12:00 ` Ard Biesheuvel
  2016-08-18 12:42 ` Catalin Marinas
  2016-08-18 12:17 ` Richard Weinberger
  1 sibling, 1 reply; 8+ messages in thread
From: Ard Biesheuvel @ 2016-08-18 12:00 UTC
  To: linux-arm-kernel

On 17 August 2016 at 13:12, Christopher Covington <cov@codeaurora.org> wrote:
> On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>>> Some userspace applications need to know the maximum virtual address they can
>>> use (TASK_SIZE).
>>
>>Just curious, what are the cases needing TASK_SIZE in user space?
>
> Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
> https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases
> I've run into. I've heard LuaJIT might have a similar situation. In general
> I think making allocations from the top down is a shortcut for finding a
> large unused region of memory.
>

One aspect of this that I would like to discuss is whether the current
practice makes sense, of tying TASK_SIZE to whatever the size of the
kernel VA space is.

I could imagine simply limiting the user VA space to 39-bits (or even
36-bits, depending on how deeply we care about 16 KB pages), and
implementing an arch specific hook (prctl() perhaps?) to increase
TASK_SIZE on demand.

That would not only give us a reliable way to check whether this is
supported (i.e., the prctl() would return error if it isn't), it also
allows for some optimizations, since a 48-bit VA kernel can run all
processes using 3 levels with relative ease (and switching between
4-level and 3-level processes would also be possible, but would either
require a TLB flush, or would result in this optimization being
disabled globally, whichever is less costly in terms of performance).

-- 
Ard.
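
The relation between the VA sizes and the level counts mentioned above can be worked out with a few lines of arithmetic. The sketch below assumes the usual arm64 layout in which page table entries are 8 bytes, so a full table of one page resolves page_shift - 3 address bits, and the top-level table may be only partially used (hence the rounding up). The function name pgtable_levels() is hypothetical and is not a kernel symbol.

```c
/*
 * Number of translation levels needed to cover va_bits of virtual address
 * space with pages of (1 << page_shift) bytes.
 *
 * The low page_shift bits are the offset within a page. Every table level
 * above that resolves at most (page_shift - 3) bits, because a table is one
 * page of 8-byte entries. The division rounds up since the top-level table
 * may have fewer entries than a full page.
 */
static unsigned pgtable_levels(unsigned va_bits, unsigned page_shift)
{
    unsigned bits_per_level = page_shift - 3;
    unsigned bits_to_resolve = va_bits - page_shift;

    return (bits_to_resolve + bits_per_level - 1) / bits_per_level;
}
```

With 4 KB pages (page_shift 12) this gives 3 levels for 39 bits and 4 levels for 48 bits. With 16 KB pages (page_shift 14) it gives 2 levels for 36 bits and 3 levels for 47 bits, which is presumably why 36 bits is the figure tied to 16 KB pages above.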

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-18 12:00 ` Ard Biesheuvel
@ 2016-08-18 12:42 ` Catalin Marinas
  2016-08-18 13:18 ` Ard Biesheuvel
  0 siblings, 1 reply; 8+ messages in thread
From: Catalin Marinas @ 2016-08-18 12:42 UTC
  To: linux-arm-kernel

On Thu, Aug 18, 2016 at 02:00:56PM +0200, Ard Biesheuvel wrote:
> On 17 August 2016 at 13:12, Christopher Covington <cov@codeaurora.org> wrote:
> > On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
> >>> Some userspace applications need to know the maximum virtual address they can
> >>> use (TASK_SIZE).
> >>
> >>Just curious, what are the cases needing TASK_SIZE in user space?
> >
> > Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the
> > specific cases I've run into. I've heard LuaJIT might have a similar
> > situation. In general I think making allocations from the top down
> > is a shortcut for finding a large unused region of memory.
>
> One aspect of this that I would like to discuss is whether the current
> practice makes sense, of tying TASK_SIZE to whatever the size of the
> kernel VA space is.

I'm fine with decoupling them as long as we can have sane
pgd/pud/pmd/pte macros. We rely on generic files like pgtable-nopud.h
etc. currently, so we would have to give up on that and do our own
checks. It's also worth testing any potential performance implication of
creating/tearing down large page tables with the new macros.

> I could imagine simply limiting the user VA space to 39-bits (or even
> 36-bits, depending on how deeply we care about 16 KB pages), and
> implementing an arch specific hook (prctl() perhaps?) to increase
> TASK_SIZE on demand.

As you stated below, switching TASK_SIZE on demand is problematic if you
actually want to switch the TCR_EL1.T0SZ. As per other recent
discussions, I'm not sure we can do it safely without full TLBI on
context switch. That's an aspect we'll have to sort out with 52-bit VA
but most likely we'll allow this range in T0SZ and only artificially
limit TASK_SIZE to smaller values so that we don't break any other
tasks. But then you won't gain much from a reduced number of page table
levels.

> That would not only give us a reliable way to check whether this is
> supported (i.e., the prctl() would return error if it isn't), it also
> allows for some optimizations, since a 48-bit VA kernel can run all
> processes using 3 levels with relative ease (and switching between
> 4-level and 3-level processes would also be possible, but would either
> require a TLB flush, or would result in this optimization being
> disabled globally, whichever is less costly in terms of performance).

I'm more for using 48-bit VA permanently for both user and kernel (and
52-bit VA at some point in the future, though limiting user space to
48-bit VA by default). But it would be good to get some benchmark
numbers on the impact to see whether it's still worth keeping the other
VA combinations around.

-- 
Catalin
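
The distinction drawn above, between the range the hardware translates (set by TCR_EL1.T0SZ) and the smaller limit the kernel may enforce as TASK_SIZE, can be made concrete with a small calculation. This is only a sketch: the architectural fact it relies on is that the TTBR0 region covers 2^(64 - T0SZ) bytes, while the function names and the task_size_fits() check are hypothetical and do not correspond to kernel code.

```c
#include <stdint.h>

/* T0SZ value that selects a TTBR0 region of 2^va_bits bytes. */
static unsigned t0sz_for_va_bits(unsigned va_bits)
{
    return 64 - va_bits;
}

/*
 * Size in bytes of the TTBR0 region selected by a given T0SZ.
 * Only meaningful for 1 <= t0sz <= 63 (a shift by 64 is undefined in C).
 */
static uint64_t ttbr0_region_size(unsigned t0sz)
{
    return (uint64_t)1 << (64 - t0sz);
}

/*
 * Hypothetical policy check: a software TASK_SIZE limit is compatible with
 * the hardware setting as long as it does not exceed the translated range.
 */
static int task_size_fits(uint64_t task_size, unsigned t0sz)
{
    return task_size <= ttbr0_region_size(t0sz);
}
```

For example, T0SZ is 16 for a 48-bit range and 25 for a 39-bit range. A task limited to 2^39 bytes fits under T0SZ = 16 without any register change, whereas a task that needs 2^48 bytes does not fit under T0SZ = 25, which is the case that would require switching T0SZ.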

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-18 12:42 ` Catalin Marinas
@ 2016-08-18 13:18 ` Ard Biesheuvel
  0 siblings, 0 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2016-08-18 13:18 UTC
  To: linux-arm-kernel

On 18 August 2016 at 14:42, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Thu, Aug 18, 2016 at 02:00:56PM +0200, Ard Biesheuvel wrote:
>> On 17 August 2016 at 13:12, Christopher Covington <cov@codeaurora.org> wrote:
>> > On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>> >>> Some userspace applications need to know the maximum virtual address they can
>> >>> use (TASK_SIZE).
>> >>
>> >>Just curious, what are the cases needing TASK_SIZE in user space?
>> >
>> > Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
>> > https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the
>> > specific cases I've run into. I've heard LuaJIT might have a similar
>> > situation. In general I think making allocations from the top down
>> > is a shortcut for finding a large unused region of memory.
>>
>> One aspect of this that I would like to discuss is whether the current
>> practice makes sense, of tying TASK_SIZE to whatever the size of the
>> kernel VA space is.
>
> I'm fine with decoupling them as long as we can have sane
> pgd/pud/pmd/pte macros. We rely on generic files like pgtable-nopud.h
> etc. currently, so we would have to give up on that and do our own
> checks. It's also worth testing any potential performance implication of
> creating/tearing down large page tables with the new macros.
>

Well, I don't think it is necessarily worth the trouble of rewriting
all that. My concern is that TASK_SIZE randomly increased to 48 bits
recently, merely because some Freescale SoCs cannot fit their RAM into
the linear mapping on a 39-bit VA kernel. This had nothing to do with
userland requirements. Do we know the userland requirements? What use
cases do we know about that require >39 bit userland VA space?

>> I could imagine simply limiting the user VA space to 39-bits (or even
>> 36-bits, depending on how deeply we care about 16 KB pages), and
>> implementing an arch specific hook (prctl() perhaps?) to increase
>> TASK_SIZE on demand.
>
> As you stated below, switching TASK_SIZE on demand is problematic if you
> actually want to switch the TCR_EL1.T0SZ. As per other recent
> discussions, I'm not sure we can do it safely without full TLBI on
> context switch. That's an aspect we'll have to sort out with 52-bit VA
> but most likely we'll allow this range in T0SZ and only artificially
> limit TASK_SIZE to smaller values so that we don't break any other
> tasks. But then you won't gain much from a reduced number of page table
> levels.
>

There are several ways to go about this. The 48-bit VA kernel could
run everything with 3 levels, and simply switch to 4 levels the moment
some process needs it. So we keep all the existing macros, but simply
point TTBR0_EL1 to the level 1 translation table rather than to the
level 0 table (and update T0SZ accordingly). So when the first 48 bit
VA userland process arrives (which may be never in many cases), we
either switch to 4 levels for everything (and the page tables are
already set up for that), or we do a TLB flush, but only when
switching from a 4-level task to a 3-level task or vice versa (but
this is messy so the first approach is probably more suitable).

So there are no associated space savings, only the TLB and cache
footprint gets optimized.

>> That would not only give us a reliable way to check whether this is
>> supported (i.e., the prctl() would return error if it isn't), it also
>> allows for some optimizations, since a 48-bit VA kernel can run all
>> processes using 3 levels with relative ease (and switching between
>> 4-level and 3-level processes would also be possible, but would either
>> require a TLB flush, or would result in this optimization being
>> disabled globally, whichever is less costly in terms of performance).
>
> I'm more for using 48-bit VA permanently for both user and kernel (and
> 52-bit VA at some point in the future, though limiting user space to
> 48-bit VA by default). But it would be good to get some benchmark
> numbers on the impact to see whether it's still worth keeping the other
> VA combinations around.
>

Of course, none of this complexity is justified if the performance
impact is negligible. I do wonder about the virt case, though.

-- 
Ard.

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-17 11:12 ` Christopher Covington
  2016-08-18 12:00 ` Ard Biesheuvel
@ 2016-08-18 12:17 ` Richard Weinberger
  2016-09-09 14:14 ` Christopher Covington
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2016-08-18 12:17 UTC
  To: linux-arm-kernel

On Wed, Aug 17, 2016 at 1:12 PM, Christopher Covington <cov@codeaurora.org> wrote:
> On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>>> Some userspace applications need to know the maximum virtual address they can
>>> use (TASK_SIZE).
>>
>>Just curious, what are the cases needing TASK_SIZE in user space?
>
> Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
> https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases
> I've run into. I've heard LuaJIT might have a similar situation. In general
> I think making allocations from the top down is a shortcut for finding a
> large unused region of memory.

I think this makes sense for all archs.
At least UserModeLinux on x86 also needs to know the bottom and top
addresses of the usable address space.
Currently it figures this out by scanning and catching SIGSEGV.

-- 
Thanks,
//richard
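
To show what guess-and-check probing looks like in practice, here is a rough sketch. It is not the code used by UML or by any project named in this thread; it swaps in a different, non-destructive technique: passing a placement hint to mmap() without MAP_FIXED and checking whether the kernel honored it. The function names hint_is_usable() and highest_usable_bit() are hypothetical. The sketch also illustrates the weakness of probing: a hint can be refused because the address is out of range or simply because something is already mapped there, and the caller cannot tell which.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stdbool.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Ask for one inaccessible page at addr as a hint only (no MAP_FIXED), so
 * existing mappings are never replaced. Returns true if the kernel placed
 * the page exactly at addr. The page is unmapped again before returning.
 */
static bool hint_is_usable(uintptr_t addr)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap((void *)addr, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return false;
    munmap(p, len);
    return p == (void *)addr;
}

/*
 * Probe power-of-two addresses starting at 1 GiB and return the highest
 * bit position whose address was accepted, or 0 if none was. This is a
 * coarse lower bound on the usable range, not the exact TASK_SIZE.
 */
static unsigned highest_usable_bit(void)
{
    unsigned best = 0;

    for (unsigned bit = 30; bit < sizeof(uintptr_t) * 8 - 1; bit++)
        if (hint_is_usable((uintptr_t)1 << bit))
            best = bit;
    return best;
}
```

A result of 46, for instance, would only say that an address of 2^46 was accepted. It says nothing definite about where the limit actually lies, which is the kind of uncertainty an explicit auxv entry would remove.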

* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-18 12:17 ` Richard Weinberger
@ 2016-09-09 14:14 ` Christopher Covington
  0 siblings, 0 replies; 8+ messages in thread
From: Christopher Covington @ 2016-09-09 14:14 UTC
  To: linux-arm-kernel

Hi Richard,

On 08/18/2016 08:17 AM, Richard Weinberger wrote:
> On Wed, Aug 17, 2016 at 1:12 PM, Christopher Covington
> <cov@codeaurora.org> wrote:
>> On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>> On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>>>> Some userspace applications need to know the maximum virtual address they can
>>>> use (TASK_SIZE).
>>>
>>> Just curious, what are the cases needing TASK_SIZE in user space?
>>
>> Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases
>> I've run into. I've heard LuaJIT might have a similar situation. In general
>> I think making allocations from the top down is a shortcut for finding a
>> large unused region of memory.
>
> I think this makes sense for all archs.
> At least UserModeLinux on x86 also needs to know the bottom and top
> addresses of the usable address space.
> Currently it figures this out by scanning and catching SIGSEGV.

For the bottom, can you use /proc/sys/vm/mmap_min_addr?

Cov
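
Reading the lower bound suggested above takes only a few lines. The sketch below assumes a Linux system with procfs mounted at /proc; the helper name read_mmap_min_addr() is hypothetical. Note that the value is the lowest address an unprivileged process may map, and a process with CAP_SYS_RAWIO is not bound by it, so a caller should treat it as a default floor rather than a hard architectural limit.

```c
#include <stdio.h>

/*
 * Read the vm.mmap_min_addr sysctl through procfs.
 * Returns 0 and stores the value in *out on success, or -1 (leaving *out
 * untouched) if the file cannot be opened or parsed.
 */
static int read_mmap_min_addr(unsigned long *out)
{
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    unsigned long val;
    int parsed;

    if (!f)
        return -1;
    parsed = fscanf(f, "%lu", &val);
    fclose(f);
    if (parsed != 1)
        return -1;
    *out = val;
    return 0;
}
```

A caller that gets -1 back, for example in a container without /proc, would still need some fallback such as assuming one page.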