* Re: [PATCH v6 00/10] Intel MPX support
  [not found] <1403084656-27284-1-git-send-email-qiaowei.ren@intel.com>
@ 2014-06-18 14:41 ` Dave Hansen
  [not found] ` <1403084656-27284-3-git-send-email-qiaowei.ren@intel.com>
  1 sibling, 0 replies; 22+ messages in thread
From: Dave Hansen @ 2014-06-18 14:41 UTC
To: Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar
Cc: x86, linux-kernel, Linux-MM

On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
> This patchset adds support for the Memory Protection Extensions
> (MPX) feature found in future Intel processors.

It's very important to note that this is a very different patch set than
the last one. The way we are freeing the unused bounds tables is
_completely_ different (9/10), and needs some very heavy mm reviews. I'm
sure Qiaowei will cc linux-mm@ next time.

We're also not asking that this be merged in its current state. The
32-bit binary on 64-bit kernel issue is a show stopper for merging, but
we're trying to post early and often.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
[parent not found: <1403084656-27284-3-git-send-email-qiaowei.ren@intel.com>]
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface [not found] ` <1403084656-27284-3-git-send-email-qiaowei.ren@intel.com> @ 2014-06-23 19:49 ` Andy Lutomirski 2014-06-23 20:03 ` Dave Hansen 2014-06-24 2:53 ` Ren, Qiaowei 0 siblings, 2 replies; 22+ messages in thread From: Andy Lutomirski @ 2014-06-23 19:49 UTC (permalink / raw) To: Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Dave Hansen Cc: x86, linux-kernel, Linux MM On 06/18/2014 02:44 AM, Qiaowei Ren wrote: > This patch adds one MPX specific mmap interface, which only handles > mpx related maps, including bounds table and bounds directory. > > In order to track MPX specific memory usage, this interface is added > to stick new vm_flag VM_MPX in the vma_area_struct when create a > bounds table or bounds directory. I imagine the linux-mm people would want to think about any new vm flag. Why is this needed? > > Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> > --- > arch/x86/Kconfig | 4 +++ > arch/x86/include/asm/mpx.h | 38 ++++++++++++++++++++++++++++ > arch/x86/mm/Makefile | 2 + > arch/x86/mm/mpx.c | 58 ++++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 102 insertions(+), 0 deletions(-) > create mode 100644 arch/x86/include/asm/mpx.h > create mode 100644 arch/x86/mm/mpx.c > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index 25d2c6f..0194790 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -237,6 +237,10 @@ config HAVE_INTEL_TXT > def_bool y > depends on INTEL_IOMMU && ACPI > > +config X86_INTEL_MPX > + def_bool y > + depends on CPU_SUP_INTEL > + > config X86_32_SMP > def_bool y > depends on X86_32 && SMP > diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h > new file mode 100644 > index 0000000..5725ac4 > --- /dev/null > +++ b/arch/x86/include/asm/mpx.h > @@ -0,0 +1,38 @@ > +#ifndef _ASM_X86_MPX_H > +#define _ASM_X86_MPX_H > + > +#include <linux/types.h> > +#include <asm/ptrace.h> > + > +#ifdef CONFIG_X86_64 > + > +/* upper 28 
bits [47:20] of the virtual address in 64-bit used to > + * index into bounds directory (BD). > + */ > +#define MPX_BD_ENTRY_OFFSET 28 > +#define MPX_BD_ENTRY_SHIFT 3 > +/* bits [19:3] of the virtual address in 64-bit used to index into > + * bounds table (BT). > + */ > +#define MPX_BT_ENTRY_OFFSET 17 > +#define MPX_BT_ENTRY_SHIFT 5 > +#define MPX_IGN_BITS 3 > + > +#else > + > +#define MPX_BD_ENTRY_OFFSET 20 > +#define MPX_BD_ENTRY_SHIFT 2 > +#define MPX_BT_ENTRY_OFFSET 10 > +#define MPX_BT_ENTRY_SHIFT 4 > +#define MPX_IGN_BITS 2 > + > +#endif > + > +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) > +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) > + > +#define MPX_BNDSTA_ERROR_CODE 0x3 > + > +unsigned long mpx_mmap(unsigned long len); > + > +#endif /* _ASM_X86_MPX_H */ > diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile > index 6a19ad9..ecfdc46 100644 > --- a/arch/x86/mm/Makefile > +++ b/arch/x86/mm/Makefile > @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o > obj-$(CONFIG_NUMA_EMU) += numa_emulation.o > > obj-$(CONFIG_MEMTEST) += memtest.o > + > +obj-$(CONFIG_X86_INTEL_MPX) += mpx.o > diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c > new file mode 100644 > index 0000000..546c5d1 > --- /dev/null > +++ b/arch/x86/mm/mpx.c > @@ -0,0 +1,58 @@ > +#include <linux/kernel.h> > +#include <linux/syscalls.h> > +#include <asm/mpx.h> > +#include <asm/mman.h> > +#include <linux/sched/sysctl.h> > + > +/* > + * this is really a simplified "vm_mmap". it only handles mpx > + * related maps, including bounds table and bounds directory. > + * > + * here we can stick new vm_flag VM_MPX in the vma_area_struct > + * when create a bounds table or bounds directory, in order to > + * track MPX specific memory. 
> + */
> +unsigned long mpx_mmap(unsigned long len)
> +{
> +	unsigned long ret;
> +	unsigned long addr, pgoff;
> +	struct mm_struct *mm = current->mm;
> +	vm_flags_t vm_flags;
> +
> +	/* Only bounds table and bounds directory can be allocated here */
> +	if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
> +		return -EINVAL;
> +
> +	down_write(&mm->mmap_sem);
> +
> +	/* Too many mappings? */
> +	if (mm->map_count > sysctl_max_map_count) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	/* Obtain the address to map to. we verify (or select) it and ensure
> +	 * that it represents a valid section of the address space.
> +	 */
> +	addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
> +	if (addr & ~PAGE_MASK) {
> +		ret = addr;
> +		goto out;
> +	}
> +
> +	vm_flags = VM_READ | VM_WRITE | VM_MPX |
> +			mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
> +
> +	/* Make bounds tables and bounds directory unlocked. */
> +	if (vm_flags & VM_LOCKED)
> +		vm_flags &= ~VM_LOCKED;

Why? I would expect MCL_FUTURE to lock these.

--Andy
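As a cross-check on the macros quoted above, here is a minimal userspace sketch (not kernel code) that recomputes the table sizes and the index extraction from the constants in the quoted mpx.h hunk. The struct and function names are made up for illustration. The 64-bit bit ranges come from the patch comments; the 32-bit split is inferred by analogy and is an assumption, as is reading MPX_IGN_BITS as the log2 of the pointer-slot size.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the quoted arch/x86/include/asm/mpx.h hunk. */
struct mpx_layout {
	unsigned bd_entry_offset;	/* MPX_BD_ENTRY_OFFSET: BD index bits */
	unsigned bd_entry_shift;	/* MPX_BD_ENTRY_SHIFT: log2(BD entry size) */
	unsigned bt_entry_offset;	/* MPX_BT_ENTRY_OFFSET: BT index bits */
	unsigned bt_entry_shift;	/* MPX_BT_ENTRY_SHIFT: log2(BT entry size) */
	unsigned ign_bits;		/* MPX_IGN_BITS: ignored low address bits */
};

const struct mpx_layout mpx_layout_64 = { 28, 3, 17, 5, 3 };
const struct mpx_layout mpx_layout_32 = { 20, 2, 10, 4, 2 };

/* Same formula as MPX_BD_SIZE_BYTES. */
uint64_t mpx_bd_size_bytes(const struct mpx_layout *l)
{
	assert(l->bd_entry_offset + l->bd_entry_shift < 64);
	return UINT64_C(1) << (l->bd_entry_offset + l->bd_entry_shift);
}

/* Same formula as MPX_BT_SIZE_BYTES. */
uint64_t mpx_bt_size_bytes(const struct mpx_layout *l)
{
	assert(l->bt_entry_offset + l->bt_entry_shift < 64);
	return UINT64_C(1) << (l->bt_entry_offset + l->bt_entry_shift);
}

/* BD index: bits [47:20] of the address on 64-bit, per the patch comment. */
uint64_t mpx_bd_index(const struct mpx_layout *l, uint64_t addr)
{
	unsigned low = l->bt_entry_offset + l->ign_bits;

	return (addr >> low) & ((UINT64_C(1) << l->bd_entry_offset) - 1);
}

/* BT index: bits [19:3] of the address on 64-bit, per the patch comment. */
uint64_t mpx_bt_index(const struct mpx_layout *l, uint64_t addr)
{
	return (addr >> l->ign_bits) & ((UINT64_C(1) << l->bt_entry_offset) - 1);
}

/*
 * Assumption: one BT entry covers one pointer slot of (1 << ign_bits) bytes,
 * so this is bounds-table bytes per byte of tracked pointer storage.
 */
uint64_t mpx_bt_bytes_per_pointer_byte(const struct mpx_layout *l)
{
	return (UINT64_C(1) << l->bt_entry_shift) / (UINT64_C(1) << l->ign_bits);
}
```

With the 64-bit constants this gives a 2 GiB bounds directory (1 << 31) and 4 MiB bounds tables (1 << 22); with the 32-bit constants, 4 MiB and 16 KiB. Under the stated assumption, each 8-byte pointer slot is backed by a 32-byte table entry, a ratio of 4.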
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-23 19:49 ` [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface Andy Lutomirski
@ 2014-06-23 20:03 ` Dave Hansen
  2014-06-23 20:06 ` Andy Lutomirski
  2014-06-24 2:53 ` Ren, Qiaowei
  1 sibling, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-23 20:03 UTC
To: Andy Lutomirski, Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar
Cc: x86, linux-kernel, Linux MM

On 06/23/2014 12:49 PM, Andy Lutomirski wrote:
> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>> This patch adds one MPX specific mmap interface, which only handles
>> mpx related maps, including bounds table and bounds directory.
>>
>> In order to track MPX specific memory usage, this interface is added
>> to stick new vm_flag VM_MPX in the vma_area_struct when create a
>> bounds table or bounds directory.
>
> I imagine the linux-mm people would want to think about any new vm flag.
> Why is this needed?

These tables can take huge amounts of memory. In the worst-case
scenario, the tables can be 4x the size of the data structure being
tracked. IOW, a 1-page structure can require 4 bounds-table pages.

My expectation is that folks using MPX are going to be keen on figuring
out how much memory is being dedicated to it. With this feature, plus
some grepping in /proc/$pid/smaps one could take a pretty good stab at it.

I know VM flags are scarce, and I'm open to other ways to skin this cat.
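The "grepping in /proc/$pid/smaps" idea above could look roughly like the following userspace sketch. It sums the Rss of every VMA whose VmFlags line carries a given two-letter mnemonic. The thread does not say which mnemonic VM_MPX would be exported under, so the caller passes it in; "mp" in the note below is purely a placeholder assumption. The function names are hypothetical, and the parser relies on VmFlags being the last field of each smaps entry.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the space-separated token list on this line contains `want`. */
static int has_flag(const char *flags, const char *want)
{
	size_t n = strlen(want);
	const char *p = flags;

	while (*p) {
		while (*p == ' ' || *p == '\t')
			p++;
		const char *start = p;
		while (*p && *p != ' ' && *p != '\t' && *p != '\n')
			p++;
		if ((size_t)(p - start) == n && strncmp(start, want, n) == 0)
			return 1;
		if (*p == '\n')
			break;		/* stop at the end of the VmFlags line */
	}
	return 0;
}

/*
 * Walk smaps text line by line. Remember the most recent "Rss:" value and
 * add it to the total once the entry's "VmFlags:" line shows the flag.
 */
long sum_rss_kb_with_flag(const char *smaps_text, const char *flag)
{
	long total = 0, pending_rss = 0;
	const char *line = smaps_text;

	while (line && *line) {
		long kb;

		if (sscanf(line, "Rss: %ld kB", &kb) == 1) {
			pending_rss = kb;
		} else if (strncmp(line, "VmFlags:", 8) == 0) {
			if (has_flag(line + 8, flag))
				total += pending_rss;
			pending_rss = 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return total;
}
```

A caller would read /proc/<pid>/smaps into a NUL-terminated buffer and call sum_rss_kb_with_flag(buf, "mp"), substituting whatever mnemonic the kernel actually exports for the flag.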
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-23 20:03 ` Dave Hansen
@ 2014-06-23 20:06 ` Andy Lutomirski
  2014-06-23 20:28 ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Lutomirski @ 2014-06-23 20:06 UTC
To: Dave Hansen
Cc: Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM

On Mon, Jun 23, 2014 at 1:03 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 06/23/2014 12:49 PM, Andy Lutomirski wrote:
>> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>>> This patch adds one MPX specific mmap interface, which only handles
>>> mpx related maps, including bounds table and bounds directory.
>>>
>>> In order to track MPX specific memory usage, this interface is added
>>> to stick new vm_flag VM_MPX in the vma_area_struct when create a
>>> bounds table or bounds directory.
>>
>> I imagine the linux-mm people would want to think about any new vm flag.
>> Why is this needed?
>
> These tables can take huge amounts of memory. In the worst-case
> scenario, the tables can be 4x the size of the data structure being
> tracked. IOW, a 1-page structure can require 4 bounds-table pages.
>
> My expectation is that folks using MPX are going to be keen on figuring
> out how much memory is being dedicated to it. With this feature, plus
> some grepping in /proc/$pid/smaps one could take a pretty good stab at it.
>
> I know VM flags are scarce, and I'm open to other ways to skin this cat.
>

Can the new vm_operation "name" be use for this? The magic "always
written to core dumps" feature might need to be reconsidered.

There's also arch_vma_name, but I just finished removing for x86, and
I'd be a little sad to see it come right back.

--Andy
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-23 20:06 ` Andy Lutomirski
@ 2014-06-23 20:28 ` Dave Hansen
  2014-06-23 21:04 ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-23 20:28 UTC
To: Andy Lutomirski
Cc: Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM

On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
> Can the new vm_operation "name" be use for this? The magic "always
> written to core dumps" feature might need to be reconsidered.

One thing I'd like to avoid is an MPX vma getting merged with a non-MPX
vma. I don't see any code to prevent two VMAs with different
vm_ops->names from getting merged. That seems like a bit of a design
oversight for ->name. Right?

Thinking out loud a bit... There are also some more complicated but more
performant cleanup mechanisms that I'd like to go after in the future.
Given a page, we might want to figure out if it is an MPX page or not.
I wonder if we'll ever collide with some other user of vm_ops->name. It
looks fairly narrowly used at the moment, but would this keep us from
putting these pages on, say, a tmpfs mount? Doesn't look that way at
the moment.
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface 2014-06-23 20:28 ` Dave Hansen @ 2014-06-23 21:04 ` Andy Lutomirski 2014-06-24 5:53 ` Ren, Qiaowei 0 siblings, 1 reply; 22+ messages in thread From: Andy Lutomirski @ 2014-06-23 21:04 UTC (permalink / raw) To: Dave Hansen Cc: Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM On Mon, Jun 23, 2014 at 1:28 PM, Dave Hansen <dave.hansen@intel.com> wrote: > On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >> Can the new vm_operation "name" be use for this? The magic "always >> written to core dumps" feature might need to be reconsidered. > > One thing I'd like to avoid is an MPX vma getting merged with a non-MPX > vma. I don't see any code to prevent two VMAs with different > vm_ops->names from getting merged. That seems like a bit of a design > oversight for ->name. Right? AFAIK there are no ->name users that don't also set ->close, for exactly that reason. I'd be okay with adding a check for ->name, too. Hmm. If MPX vmas had a real struct file attached, this would all come for free. Maybe vmas with non-default vm_ops and file != NULL should never be mergeable? > > Thinking out loud a bit... There are also some more complicated but more > performant cleanup mechanisms that I'd like to go after in the future. > Given a page, we might want to figure out if it is an MPX page or not. > I wonder if we'll ever collide with some other user of vm_ops->name. It > looks fairly narrowly used at the moment, but would this keep us from > putting these pages on, say, a tmpfs mount? Doesn't look that way at > the moment. You could always check the vm_ops pointer to see if it's MPX. One feature I've wanted: a way to have special per-process vmas that can be easily found. For example, I want to be able to efficiently find out where the vdso and vvar vmas are. I don't think this is currently supported. 
--Andy
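To make the merging concern in the exchange above concrete, here is a toy userspace model, not kernel code. Every type and name in it is hypothetical. It treats two adjacent VMAs as mergeable only when their flags and their ops pointers match, which is a simplification of the idea of "adding a check for ->name"; the real kernel merge test has more conditions. It also shows the "check the vm_ops pointer" test for recognising an MPX VMA.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; all names are hypothetical. */
struct toy_vm_ops {
	const char *name;		/* e.g. "[mpx]" */
};

struct toy_vma {
	unsigned long start, end;	/* covers [start, end) */
	unsigned long flags;
	const struct toy_vm_ops *ops;	/* NULL for plain anonymous memory */
};

#define TOY_VM_MPX 0x1UL		/* hypothetical marker bit */

/* Identify an MPX VMA by comparing its ops pointer, as suggested above. */
bool toy_is_mpx(const struct toy_vma *v, const struct toy_vm_ops *mpx_ops)
{
	return v->ops == mpx_ops;
}

/*
 * Simplified merge rule: adjacent, identical flags, identical ops.
 * Either a distinct flag bit or a distinct ops pointer is enough to keep
 * an MPX VMA from merging with a non-MPX neighbour in this model.
 */
bool toy_can_merge(const struct toy_vma *a, const struct toy_vma *b)
{
	return a->end == b->start && a->flags == b->flags && a->ops == b->ops;
}
```

In this model two neighbouring MPX VMAs can still merge with each other, while an MPX VMA next to ordinary anonymous memory cannot.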
* RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface 2014-06-23 21:04 ` Andy Lutomirski @ 2014-06-24 5:53 ` Ren, Qiaowei 2014-06-24 23:55 ` Andy Lutomirski 0 siblings, 1 reply; 22+ messages in thread From: Ren, Qiaowei @ 2014-06-24 5:53 UTC (permalink / raw) To: Andy Lutomirski, Hansen, Dave Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM On 2014-06-24, Andy Lutomirski wrote: >> On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >>> Can the new vm_operation "name" be use for this? The magic "always >>> written to core dumps" feature might need to be reconsidered. >> >> One thing I'd like to avoid is an MPX vma getting merged with a >> non-MPX vma. I don't see any code to prevent two VMAs with >> different vm_ops->names from getting merged. That seems like a bit >> of a design oversight for ->name. Right? > > AFAIK there are no ->name users that don't also set ->close, for > exactly that reason. I'd be okay with adding a check for ->name, too. > > Hmm. If MPX vmas had a real struct file attached, this would all come > for free. Maybe vmas with non-default vm_ops and file != NULL should > never be mergeable? > >> >> Thinking out loud a bit... There are also some more complicated but >> more performant cleanup mechanisms that I'd like to go after in the future. >> Given a page, we might want to figure out if it is an MPX page or not. >> I wonder if we'll ever collide with some other user of vm_ops->name. >> It looks fairly narrowly used at the moment, but would this keep us >> from putting these pages on, say, a tmpfs mount? Doesn't look that >> way at the moment. > > You could always check the vm_ops pointer to see if it's MPX. > > One feature I've wanted: a way to have special per-process vmas that > can be easily found. For example, I want to be able to efficiently > find out where the vdso and vvar vmas are. I don't think this is currently supported. 
>
Andy, if you add a check for ->name to avoid the MPX vmas merged with non-MPX vmas, I guess the work flow should be as follow (use _install_special_mapping to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
    ......
    static struct vm_special_mapping mpx_mapping = {
        .name = "[mpx]",
        .pages = no_pages,
    };

    .......
    vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
    ......
}

Then, we could check the ->name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface 2014-06-24 5:53 ` Ren, Qiaowei @ 2014-06-24 23:55 ` Andy Lutomirski 2014-06-25 1:40 ` Ren, Qiaowei 0 siblings, 1 reply; 22+ messages in thread From: Andy Lutomirski @ 2014-06-24 23:55 UTC (permalink / raw) To: Ren, Qiaowei Cc: Hansen, Dave, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei <qiaowei.ren@intel.com> wrote: > On 2014-06-24, Andy Lutomirski wrote: >>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >>>> Can the new vm_operation "name" be use for this? The magic "always >>>> written to core dumps" feature might need to be reconsidered. >>> >>> One thing I'd like to avoid is an MPX vma getting merged with a >>> non-MPX vma. I don't see any code to prevent two VMAs with >>> different vm_ops->names from getting merged. That seems like a bit >>> of a design oversight for ->name. Right? >> >> AFAIK there are no ->name users that don't also set ->close, for >> exactly that reason. I'd be okay with adding a check for ->name, too. >> >> Hmm. If MPX vmas had a real struct file attached, this would all come >> for free. Maybe vmas with non-default vm_ops and file != NULL should >> never be mergeable? >> >>> >>> Thinking out loud a bit... There are also some more complicated but >>> more performant cleanup mechanisms that I'd like to go after in the future. >>> Given a page, we might want to figure out if it is an MPX page or not. >>> I wonder if we'll ever collide with some other user of vm_ops->name. >>> It looks fairly narrowly used at the moment, but would this keep us >>> from putting these pages on, say, a tmpfs mount? Doesn't look that >>> way at the moment. >> >> You could always check the vm_ops pointer to see if it's MPX. >> >> One feature I've wanted: a way to have special per-process vmas that >> can be easily found. 
For example, I want to be able to efficiently
>> find out where the vdso and vvar vmas are. I don't think this is currently supported.
>>
> Andy, if you add a check for ->name to avoid the MPX vmas merged with non-MPX vmas, I guess the work flow should be as follow (use _install_special_mapping to get a new vma):
>
> unsigned long mpx_mmap(unsigned long len)
> {
>     ......
>     static struct vm_special_mapping mpx_mapping = {
>         .name = "[mpx]",
>         .pages = no_pages,
>     };
>
>     .......
>     vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
>     ......
> }
>
> Then, we could check the ->name to see if the VMA is MPX specific. Right?

Does this actually create a vma backed with real memory? Doesn't this
need to go through anon_vma or something? _install_special_mapping
completely prevents merging.

Possibly silly question: would it make more sense to just create one
giant vma for the MPX tables and only populate pieces of it as needed?
This wouldn't work for 32-bit code, but maybe we don't care. (I see no
reason why it couldn't work for x32, though.)

(I don't really understand how anonymous memory works at all. I'm not
an mm person.)

--Andy
* RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-24 23:55 ` Andy Lutomirski
@ 2014-06-25 1:40 ` Ren, Qiaowei
  2014-06-25 21:04 ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Ren, Qiaowei @ 2014-06-25 1:40 UTC
To: Andy Lutomirski
Cc: Hansen, Dave, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM

On 2014-06-25, Andy Lutomirski wrote:
> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei <qiaowei.ren@intel.com>
> wrote:
>> On 2014-06-24, Andy Lutomirski wrote:
>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>>>> Can the new vm_operation "name" be use for this? The magic
>>>>> "always written to core dumps" feature might need to be reconsidered.
>>>>
>>>> One thing I'd like to avoid is an MPX vma getting merged with a
>>>> non-MPX vma. I don't see any code to prevent two VMAs with
>>>> different vm_ops->names from getting merged. That seems like a
>>>> bit of a design oversight for ->name. Right?
>>>
>>> AFAIK there are no ->name users that don't also set ->close, for
>>> exactly that reason. I'd be okay with adding a check for ->name, too.
>>>
>>> Hmm. If MPX vmas had a real struct file attached, this would all
>>> come for free. Maybe vmas with non-default vm_ops and file != NULL
>>> should never be mergeable?
>>>
>>>>
>>>> Thinking out loud a bit... There are also some more complicated
>>>> but more performant cleanup mechanisms that I'd like to go after in the future.
>>>> Given a page, we might want to figure out if it is an MPX page or not.
>>>> I wonder if we'll ever collide with some other user of vm_ops->name.
>>>> It looks fairly narrowly used at the moment, but would this keep
>>>> us from putting these pages on, say, a tmpfs mount? Doesn't look
>>>> that way at the moment.
>>>
>>> You could always check the vm_ops pointer to see if it's MPX.
>>>
>>> One feature I've wanted: a way to have special per-process vmas that
>>> can be easily found. For example, I want to be able to efficiently
>>> find out where the vdso and vvar vmas are. I don't think this is
>>> currently supported.
>>>
>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>> with
> non-MPX vmas, I guess the work flow should be as follow (use
> _install_special_mapping to get a new vma):
>>
>> unsigned long mpx_mmap(unsigned long len) {
>>     ......
>>     static struct vm_special_mapping mpx_mapping = {
>>         .name = "[mpx]",
>>         .pages = no_pages,
>>     };
>>
>>     ....... vma = _install_special_mapping(mm, addr, len, vm_flags,
>> &mpx_mapping); ......
>> }
>>
>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>
> Does this actually create a vma backed with real memory? Doesn't this
> need to go through anon_vma or something? _install_special_mapping
> completely prevents merging.
>
Hmm, _install_special_mapping should completely prevent merging, even among MPX vmas.

So, could you tell me how to set MPX specific ->name to the vma when it is created? Seems like that I could not find such interface.

Thanks,
Qiaowei
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface 2014-06-25 1:40 ` Ren, Qiaowei @ 2014-06-25 21:04 ` Andy Lutomirski 2014-06-25 21:05 ` Andy Lutomirski 2014-06-25 21:43 ` Dave Hansen 0 siblings, 2 replies; 22+ messages in thread From: Andy Lutomirski @ 2014-06-25 21:04 UTC (permalink / raw) To: Ren, Qiaowei Cc: Hansen, Dave, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei <qiaowei.ren@intel.com> wrote: > On 2014-06-25, Andy Lutomirski wrote: >> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei <qiaowei.ren@intel.com> >> wrote: >>> On 2014-06-24, Andy Lutomirski wrote: >>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >>>>>> Can the new vm_operation "name" be use for this? The magic >>>>>> "always written to core dumps" feature might need to be reconsidered. >>>>> >>>>> One thing I'd like to avoid is an MPX vma getting merged with a >>>>> non-MPX vma. I don't see any code to prevent two VMAs with >>>>> different vm_ops->names from getting merged. That seems like a >>>>> bit of a design oversight for ->name. Right? >>>> >>>> AFAIK there are no ->name users that don't also set ->close, for >>>> exactly that reason. I'd be okay with adding a check for ->name, too. >>>> >>>> Hmm. If MPX vmas had a real struct file attached, this would all >>>> come for free. Maybe vmas with non-default vm_ops and file != NULL >>>> should never be mergeable? >>>> >>>>> >>>>> Thinking out loud a bit... There are also some more complicated >>>>> but more performant cleanup mechanisms that I'd like to go after in the future. >>>>> Given a page, we might want to figure out if it is an MPX page or not. >>>>> I wonder if we'll ever collide with some other user of vm_ops->name. >>>>> It looks fairly narrowly used at the moment, but would this keep >>>>> us from putting these pages on, say, a tmpfs mount? Doesn't look >>>>> that way at the moment. 
>>>>
>>>> You could always check the vm_ops pointer to see if it's MPX.
>>>>
>>>> One feature I've wanted: a way to have special per-process vmas that
>>>> can be easily found. For example, I want to be able to efficiently
>>>> find out where the vdso and vvar vmas are. I don't think this is
>>>> currently supported.
>>>>
>>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>>> with
>> non-MPX vmas, I guess the work flow should be as follow (use
>> _install_special_mapping to get a new vma):
>>>
>>> unsigned long mpx_mmap(unsigned long len) {
>>>     ......
>>>     static struct vm_special_mapping mpx_mapping = {
>>>         .name = "[mpx]",
>>>         .pages = no_pages,
>>>     };
>>>
>>>     ....... vma = _install_special_mapping(mm, addr, len, vm_flags,
>>> &mpx_mapping); ......
>>> }
>>>
>>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>>
>> Does this actually create a vma backed with real memory? Doesn't this
>> need to go through anon_vma or something? _install_special_mapping
>> completely prevents merging.
>>
> Hmm, _install_special_mapping should completely prevent merging, even among MPX vmas.
>
> So, could you tell me how to set MPX specific ->name to the vma when it is created? Seems like that I could not find such interface.

You may need to add one.

I'd suggest posting a new thread to linux-mm describing what you need
and asking how to do it.

--Andy
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface 2014-06-25 21:04 ` Andy Lutomirski @ 2014-06-25 21:05 ` Andy Lutomirski 2014-06-25 21:45 ` Dave Hansen 2014-06-25 21:43 ` Dave Hansen 1 sibling, 1 reply; 22+ messages in thread From: Andy Lutomirski @ 2014-06-25 21:05 UTC (permalink / raw) To: Ren, Qiaowei Cc: Hansen, Dave, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM On Wed, Jun 25, 2014 at 2:04 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei <qiaowei.ren@intel.com> wrote: >> On 2014-06-25, Andy Lutomirski wrote: >>> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei <qiaowei.ren@intel.com> >>> wrote: >>>> On 2014-06-24, Andy Lutomirski wrote: >>>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >>>>>>> Can the new vm_operation "name" be use for this? The magic >>>>>>> "always written to core dumps" feature might need to be reconsidered. >>>>>> >>>>>> One thing I'd like to avoid is an MPX vma getting merged with a >>>>>> non-MPX vma. I don't see any code to prevent two VMAs with >>>>>> different vm_ops->names from getting merged. That seems like a >>>>>> bit of a design oversight for ->name. Right? >>>>> >>>>> AFAIK there are no ->name users that don't also set ->close, for >>>>> exactly that reason. I'd be okay with adding a check for ->name, too. >>>>> >>>>> Hmm. If MPX vmas had a real struct file attached, this would all >>>>> come for free. Maybe vmas with non-default vm_ops and file != NULL >>>>> should never be mergeable? >>>>> >>>>>> >>>>>> Thinking out loud a bit... There are also some more complicated >>>>>> but more performant cleanup mechanisms that I'd like to go after in the future. >>>>>> Given a page, we might want to figure out if it is an MPX page or not. >>>>>> I wonder if we'll ever collide with some other user of vm_ops->name. 
>>>>>> It looks fairly narrowly used at the moment, but would this keep
>>>>>> us from putting these pages on, say, a tmpfs mount? Doesn't look
>>>>>> that way at the moment.
>>>>>
>>>>> You could always check the vm_ops pointer to see if it's MPX.
>>>>>
>>>>> One feature I've wanted: a way to have special per-process vmas that
>>>>> can be easily found. For example, I want to be able to efficiently
>>>>> find out where the vdso and vvar vmas are. I don't think this is
>>>>> currently supported.
>>>>>
>>>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>>>> with
>>> non-MPX vmas, I guess the work flow should be as follow (use
>>> _install_special_mapping to get a new vma):
>>>>
>>>> unsigned long mpx_mmap(unsigned long len) {
>>>>     ......
>>>>     static struct vm_special_mapping mpx_mapping = {
>>>>         .name = "[mpx]",
>>>>         .pages = no_pages,
>>>>     };
>>>>
>>>>     ....... vma = _install_special_mapping(mm, addr, len, vm_flags,
>>>> &mpx_mapping); ......
>>>> }
>>>>
>>>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>>>
>>> Does this actually create a vma backed with real memory? Doesn't this
>>> need to go through anon_vma or something? _install_special_mapping
>>> completely prevents merging.
>>>
>> Hmm, _install_special_mapping should completely prevent merging, even among MPX vmas.
>>
>> So, could you tell me how to set MPX specific ->name to the vma when it is created? Seems like that I could not find such interface.
>
> You may need to add one.
>
> I'd suggest posting a new thread to linux-mm describing what you need
> and asking how to do it.

Hmm. the memfd_create thing may be able to do this for you. If you
created a per-mm memfd and mapped it, it all just might work.

--Andy
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-25 21:05 ` Andy Lutomirski
@ 2014-06-25 21:45 ` Dave Hansen
  2014-06-26 22:19 ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-25 21:45 UTC
To: Andy Lutomirski, Ren, Qiaowei
Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM

On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
> Hmm. the memfd_create thing may be able to do this for you. If you
> created a per-mm memfd and mapped it, it all just might work.

memfd_create() seems to bring a fair amount of baggage along (the fd
part :) if all we want is a marker. Really, all we need is _a_ bit, and
some way to plumb to userspace the RSS values of VMAs with that bit set.

Creating and mmap()'ing a fd seems a rather roundabout way to get there.
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-25 21:45 ` Dave Hansen
@ 2014-06-26 22:19 ` Andy Lutomirski
  2014-06-26 22:58 ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Lutomirski @ 2014-06-26 22:19 UTC
To: Dave Hansen
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML, linux-kernel@vger.kernel.org, Linux MM

On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>> Hmm. the memfd_create thing may be able to do this for you. If you
>> created a per-mm memfd and mapped it, it all just might work.
>
> memfd_create() seems to bring a fair amount of baggage along (the fd
> part :) if all we want is a marker. Really, all we need is _a_ bit, and
> some way to plumb to userspace the RSS values of VMAs with that bit set.
>
> Creating and mmap()'ing a fd seems a rather roundabout way to get there.

Hmm. So does VM_MPX, though. If this stuff were done entirely in
userspace, then memfd_create would be exactly the right solution, I
think.

Would it work to just scan the bound directory to figure out how many
bound tables exist?

--Andy
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-26 22:19 ` Andy Lutomirski
@ 2014-06-26 22:58 ` Dave Hansen
  2014-06-26 23:15 ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-26 22:58 UTC
To: Andy Lutomirski
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On 06/26/2014 03:19 PM, Andy Lutomirski wrote:
> On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>>> Hmm. the memfd_create thing may be able to do this for you. If you
>>> created a per-mm memfd and mapped it, it all just might work.
>>
>> memfd_create() seems to bring a fair amount of baggage along (the fd
>> part :) if all we want is a marker. Really, all we need is _a_ bit, and
>> some way to plumb to userspace the RSS values of VMAs with that bit set.
>>
>> Creating and mmap()'ing a fd seems a rather roundabout way to get there.
>
> Hmm. So does VM_MPX, though. If this stuff were done entirely in
> userspace, then memfd_create would be exactly the right solution, I
> think.
>
> Would it work to just scan the bound directory to figure out how many
> bound tables exist?

Theoretically, perhaps.

Practically, the bounds directory is 2GB, and it is likely to be very
sparse. You would have to walk the page tables finding where pages were
mapped, then search the mapped pages for bounds table entries.

Assuming that it was aligned and minimally populated, that's a *MINIMUM*
search looking for a PGD entry, then you have to look at 512 PUD
entries. A full search would have to look at half a million ptes.
That's just finding out how sparse the first level of the tables are
before you've looked at a byte of actual data, and if they were empty.

We could keep another, parallel, data structure that handles this better
than the hardware tables. Like, say, an rbtree that stores ranges
of virtual addresses. We could call them vm_area_somethings ... wait a
sec... we have a structure like that. ;)

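The "half a million ptes" figure above can be checked with a little arithmetic. The following is only a sketch: it assumes 4 KB base pages and 512 entries per page-table page (the usual x86-64 layout), and the constant and function names are hypothetical, not taken from the patch.

```c
#include <stdint.h>

/* 2 GB bounds directory, as stated in the message above. */
#define BOUNDS_DIR_BYTES        (2ULL << 30)
/* Assumptions for illustration: 4 KB pages, 512 entries per table page. */
#define ASSUMED_PAGE_BYTES      4096ULL
#define ASSUMED_ENTRIES_PER_PT  512ULL

/* Number of PTEs needed to map `bytes` of virtual address space. */
uint64_t ptes_covering(uint64_t bytes)
{
    return bytes / ASSUMED_PAGE_BYTES;
}

/* Number of page-table pages that hold those PTEs (rounded up). */
uint64_t pte_pages_covering(uint64_t bytes)
{
    uint64_t ptes = ptes_covering(bytes);

    return (ptes + ASSUMED_ENTRIES_PER_PT - 1) / ASSUMED_ENTRIES_PER_PT;
}
```

Under those assumptions, `ptes_covering(BOUNDS_DIR_BYTES)` is 524288, which matches the "half a million" in the message, spread over 1024 page-table pages.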
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-26 22:58 ` Dave Hansen
@ 2014-06-26 23:15 ` Andy Lutomirski
  2014-06-27  0:19 ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Lutomirski @ 2014-06-26 23:15 UTC
To: Dave Hansen
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On Thu, Jun 26, 2014 at 3:58 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 06/26/2014 03:19 PM, Andy Lutomirski wrote:
>> On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>>> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>>>> Hmm. the memfd_create thing may be able to do this for you. If you
>>>> created a per-mm memfd and mapped it, it all just might work.
>>>
>>> memfd_create() seems to bring a fair amount of baggage along (the fd
>>> part :) if all we want is a marker. Really, all we need is _a_ bit, and
>>> some way to plumb to userspace the RSS values of VMAs with that bit set.
>>>
>>> Creating and mmap()'ing a fd seems a rather roundabout way to get there.
>>
>> Hmm. So does VM_MPX, though. If this stuff were done entirely in
>> userspace, then memfd_create would be exactly the right solution, I
>> think.
>>
>> Would it work to just scan the bound directory to figure out how many
>> bound tables exist?
>
> Theoretically, perhaps.
>
> Practically, the bounds directory is 2GB, and it is likely to be very
> sparse. You would have to walk the page tables finding where pages were
> mapped, then search the mapped pages for bounds table entries.
>
> Assuming that it was aligned and minimally populated, that's a *MINIMUM*
> search looking for a PGD entry, then you have to look at 512 PUD
> entries. A full search would have to look at half a million ptes.
> That's just finding out how sparse the first level of the tables are
> before you've looked at a byte of actual data, and if they were empty.
>
> We could keep another, parallel, data structure that handles this better
> than the hardware tables. Like, say, an rbtree that stores ranges
> of virtual addresses. We could call them vm_area_somethings ... wait a
> sec... we have a structure like that. ;)
>
>

So here's my mental image of how I might do this if I were doing it
entirely in userspace: I'd create a file or memfd for the bound tables
and another for the bound directory. These files would be *huge*: the
bound directory file would be 2GB and the bounds table file would be
2^48 bytes or whatever it is. (Maybe even bigger?)

Then I'd just map pieces of those files wherever they'd need to be,
and I'd make the mappings sparse. I suspect that you don't actually
want a vma for each piece of bound table that gets mapped -- the space
of vmas could end up incredibly sparse. So I'd at least map (in the
vma sense, not the pte sense) an entire bound table at a time. And
I'd probably just map the bound directory in one big piece.

Then I'd populate it in the fault handler.

This is almost what the code is doing, I think, modulo the files.

This has one killer problem: these mappings need to be private (cowed
on fork). So memfd is no good. There's got to be an easyish way to
modify the mm code to allow anonymous maps with vm_ops. Maybe a new
mmap_region parameter or something? Maybe even a special anon_vma,
but I don't really understand how those work.

Also, egads: what happens when a bound table entry is associated with
a MAP_SHARED page?

--Andy

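The "map it in one big piece and populate it on fault" idea above can be illustrated from userspace. This is a minimal sketch, not what the patch does: it assumes Linux with glibc, uses a plain private anonymous mapping with `MAP_NORESERVE` instead of the files discussed in the message, and the helper name `resident_pages_after_one_touch` is made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* MAP_ANONYMOUS, MAP_NORESERVE, mincore() on glibc */
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `len` bytes of private anonymous address
 * space, write one byte in the middle, and return how many pages the
 * kernel reports as resident afterwards (-1 on error).
 */
long resident_pages_after_one_touch(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len < (size_t)page)
        return -1;

    unsigned char *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
                                 -1, 0);
    if (region == MAP_FAILED)
        return -1;

    /* The first write is what makes the kernel back this page with memory. */
    region[len / 2] = 1;

    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    long resident = -1;

    /* mincore() sets bit 0 of vec[i] when page i is resident. */
    if (vec != NULL && mincore(region, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;
    }

    free(vec);
    munmap(region, len);
    return resident;
}
```

Calling it with, say, 64 MB should report only a small number of resident pages: one with 4 KB pages, or more if transparent huge pages back the touched address. The rest of the single VMA stays unpopulated, which is the sparseness the message relies on.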
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-26 23:15 ` Andy Lutomirski
@ 2014-06-27  0:19 ` Dave Hansen
  2014-06-27  0:26 ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-27 0:19 UTC
To: Andy Lutomirski
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
> So here's my mental image of how I might do this if I were doing it
> entirely in userspace: I'd create a file or memfd for the bound tables
> and another for the bound directory. These files would be *huge*: the
> bound directory file would be 2GB and the bounds table file would be
> 2^48 bytes or whatever it is. (Maybe even bigger?)
>
> Then I'd just map pieces of those files wherever they'd need to be,
> and I'd make the mappings sparse. I suspect that you don't actually
> want a vma for each piece of bound table that gets mapped -- the space
> of vmas could end up incredibly sparse. So I'd at least map (in the
> vma sense, not the pte sense) an entire bound table at a time. And
> I'd probably just map the bound directory in one big piece.
>
> Then I'd populate it in the fault handler.
>
> This is almost what the code is doing, I think, modulo the files.
>
> This has one killer problem: these mappings need to be private (cowed
> on fork). So memfd is no good.

This essentially uses the page cache's radix tree as a parallel data
structure in order to keep a vaddr->mpx_vma map. That's not a bad idea,
but it is a parallel data structure that does not handle copy-on-write
very well.

I'm pretty sure we need the semantics that anonymous memory provides.

> There's got to be an easyish way to
> modify the mm code to allow anonymous maps with vm_ops. Maybe a new
> mmap_region parameter or something? Maybe even a special anon_vma,
> but I don't really understand how those work.

Yeah, we very well might end up having to go down that path.

> Also, egads: what happens when a bound table entry is associated with
> a MAP_SHARED page?

Bounds table entries are for pointers. Do we keep pointers inside of
MAP_SHARED-mapped things? :)

* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-27  0:19 ` Dave Hansen
@ 2014-06-27  0:26 ` Andy Lutomirski
  2014-06-27 17:34 ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Lutomirski @ 2014-06-27 0:26 UTC
To: Dave Hansen
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>> So here's my mental image of how I might do this if I were doing it
>> entirely in userspace: I'd create a file or memfd for the bound tables
>> and another for the bound directory. These files would be *huge*: the
>> bound directory file would be 2GB and the bounds table file would be
>> 2^48 bytes or whatever it is. (Maybe even bigger?)
>>
>> Then I'd just map pieces of those files wherever they'd need to be,
>> and I'd make the mappings sparse. I suspect that you don't actually
>> want a vma for each piece of bound table that gets mapped -- the space
>> of vmas could end up incredibly sparse. So I'd at least map (in the
>> vma sense, not the pte sense) an entire bound table at a time. And
>> I'd probably just map the bound directory in one big piece.
>>
>> Then I'd populate it in the fault handler.
>>
>> This is almost what the code is doing, I think, modulo the files.
>>
>> This has one killer problem: these mappings need to be private (cowed
>> on fork). So memfd is no good.
>
> This essentially uses the page cache's radix tree as a parallel data
> structure in order to keep a vaddr->mpx_vma map. That's not a bad idea,
> but it is a parallel data structure that does not handle copy-on-write
> very well.
>
> I'm pretty sure we need the semantics that anonymous memory provides.
>
>> There's got to be an easyish way to
>> modify the mm code to allow anonymous maps with vm_ops. Maybe a new
>> mmap_region parameter or something? Maybe even a special anon_vma,
>> but I don't really understand how those work.
>
> Yeah, we very well might end up having to go down that path.
>
>> Also, egads: what happens when a bound table entry is associated with
>> a MAP_SHARED page?
>
> Bounds table entries are for pointers. Do we keep pointers inside of
> MAP_SHARED-mapped things? :)

Sure, if it's MAP_SHARED | MAP_ANONYMOUS. For example:

    struct thing {
      struct thing *next;
    };

    struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
    storage[0].next = &storage[1];
    fork();

I'm not suggesting that this needs to *work* in the first incarnation of this :)

--Andy

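The fragment in the message above elides the mmap() arguments. A compilable version might look like the sketch below. Everything beyond the original `struct thing`, `storage`, and the `MAP_SHARED | MAP_ANONYMOUS` flags is an assumption added for illustration: the `value` field, the function name, and the choice to have the child write through the stored pointer.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* MAP_ANONYMOUS on glibc */
#endif
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct thing {
    struct thing *next;
    int value; /* hypothetical payload, not in the original example */
};

/*
 * Store a pointer inside a shared anonymous mapping, fork, and let the
 * child write through that pointer. Returns the value the parent reads
 * back afterwards, or -1 on error.
 */
int shared_pointer_visible_after_fork(void)
{
    struct thing *storage = mmap(NULL, 2 * sizeof(*storage),
                                 PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (storage == MAP_FAILED)
        return -1;

    storage[0].next = &storage[1]; /* the pointer itself lives in shared memory */
    storage[1].value = 0;

    int seen = -1;
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the mapping sits at the same address, so the pointer is valid. */
        storage[0].next->value = 42;
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        seen = storage[1].value; /* parent observes the child's write */

    munmap(storage, 2 * sizeof(*storage));
    return seen;
}
```

The pointer in `storage[0].next` is shared by both processes, so any bounds metadata keyed on that pointer's location is exactly the case the question about MAP_SHARED pages is raising.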
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-27  0:26 ` Andy Lutomirski
@ 2014-06-27 17:34 ` Dave Hansen
  2014-06-27 17:42 ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-27 17:34 UTC
To: Andy Lutomirski
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On 06/26/2014 05:26 PM, Andy Lutomirski wrote:
> On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>>> Also, egads: what happens when a bound table entry is associated with
>>> a MAP_SHARED page?
>>
>> Bounds table entries are for pointers. Do we keep pointers inside of
>> MAP_SHARED-mapped things? :)
>
> Sure, if it's MAP_SHARED | MAP_ANONYMOUS. For example:
>
> struct thing {
>   struct thing *next;
> };
>
> struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
> storage[0].next = &storage[1];
> fork();
>
> I'm not suggesting that this needs to *work* in the first incarnation of this :)

I'm not sure I'm seeing the issue.

I'm claiming that we need COW behavior for the bounds tables, at least
by default. If userspace knows enough about the ways that it is using
the tables and knows how to share them, let it go to town. The kernel
will permit this kind of usage model, but we simply won't be helping
with the management of the tables when userspace creates them.

You've demonstrated a case where userspace might theoretically want to
share bounds tables (although I think it's pretty dangerous). It's
equally theoretically possible that userspace might *not* want to share
the tables, for instance if one process narrowed the bounds and the
other did not.

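The difference between the COW default and explicit sharing argued for above can be seen with ordinary anonymous memory. The sketch below is only an analogy: `struct fake_bounds` is a made-up stand-in and not the hardware bounds-table entry layout, and the function name is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* MAP_ANONYMOUS on glibc */
#endif
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Made-up stand-in for a bounds entry; NOT the real MPX layout. */
struct fake_bounds {
    unsigned long lower;
    unsigned long upper;
};

/*
 * Map one fake_bounds with MAP_PRIVATE or MAP_SHARED (passed in
 * `sharing_flag`), fork, let the child narrow the upper bound, and
 * return the upper bound the parent sees afterwards (0 on error).
 */
unsigned long parent_upper_after_child_narrows(int sharing_flag)
{
    struct fake_bounds *b = mmap(NULL, sizeof(*b), PROT_READ | PROT_WRITE,
                                 sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED)
        return 0;

    b->lower = 0;
    b->upper = 4096;

    unsigned long seen = 0;
    pid_t pid = fork();
    if (pid == 0) {
        b->upper = 64; /* child narrows the bounds */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        seen = b->upper;

    munmap(b, sizeof(*b));
    return seen;
}
```

With `MAP_PRIVATE` the child's write lands in its own copy-on-write page and the parent still sees 4096; with `MAP_SHARED` the parent sees the narrowed value 64. That is the "one process narrowed the bounds and the other did not" situation, and why private (COW) behavior is the safer default.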
* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-27 17:34 ` Dave Hansen
@ 2014-06-27 17:42 ` Dave Hansen
  2014-06-27 18:57 ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2014-06-27 17:42 UTC
To: Andy Lutomirski
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On 06/27/2014 10:34 AM, Dave Hansen wrote:
> I'm claiming that we need COW behavior for the bounds tables, at least
> by default. If userspace knows enough about the ways that it is using
> the tables and knows how to share them, let it go to town. The kernel
> will permit this kind of usage model, but we simply won't be helping
> with the management of the tables when userspace creates them.

Actually, this is another reason we need to mark VMAs as being
MPX-related explicitly instead of inferring it from the tables. If
userspace does something really specialized like this, the kernel does
not want to confuse these VMAs with the ones it created.

* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-27 17:42 ` Dave Hansen
@ 2014-06-27 18:57 ` Andy Lutomirski
  0 siblings, 0 replies; 22+ messages in thread
From: Andy Lutomirski @ 2014-06-27 18:57 UTC
To: Dave Hansen
Cc: Ren, Qiaowei, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On Fri, Jun 27, 2014 at 10:42 AM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 06/27/2014 10:34 AM, Dave Hansen wrote:
>> I'm claiming that we need COW behavior for the bounds tables, at least
>> by default. If userspace knows enough about the ways that it is using
>> the tables and knows how to share them, let it go to town. The kernel
>> will permit this kind of usage model, but we simply won't be helping
>> with the management of the tables when userspace creates them.
>
> Actually, this is another reason we need to mark VMAs as being
> MPX-related explicitly instead of inferring it from the tables. If
> userspace does something really specialized like this, the kernel does
> not want to confuse these VMAs with the ones it created.
>

Good point.

--Andy

* Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-25 21:04 ` Andy Lutomirski
  2014-06-25 21:05 ` Andy Lutomirski
@ 2014-06-25 21:43 ` Dave Hansen
  1 sibling, 0 replies; 22+ messages in thread
From: Dave Hansen @ 2014-06-25 21:43 UTC
To: Andy Lutomirski, Ren, Qiaowei
Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86 ML,
    linux-kernel@vger.kernel.org, Linux MM

On 06/25/2014 02:04 PM, Andy Lutomirski wrote:
> On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei <qiaowei.ren@intel.com> wrote:
>> Hmm, _install_special_mapping should completely prevent merging, even among MPX vmas.
>>
>> So, could you tell me how to set MPX specific ->name to the vma when it is created? Seems like that I could not find such interface.
>
> You may need to add one.
>
> I'd suggest posting a new thread to linux-mm describing what you need
> and asking how to do it.

I shared this with Qiaowei privately, but might as well repeat myself
here in case anyone wants to set me straight.

Most of the interfaces that set vm_ops do it in the file_operations
->mmap op. Nobody sets ->vm_ops on anonymous VMAs, so we're in
uncharted territory.

My suggestion: you can either plumb a new API down into mmap_region()
to get the VMA or set ->vm_ops, or just call find_vma() after
mmap_region() or get_unmapped_area() and set it manually. Just make
sure you still have mmap_sem held over the whole thing.

I think I prefer just setting ->vm_ops directly, even though it's a wee
bit of a hack to create something just to look it up a moment later.
Oh, well.

* RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
  2014-06-23 19:49 ` [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface Andy Lutomirski
  2014-06-23 20:03 ` Dave Hansen
@ 2014-06-24  2:53 ` Ren, Qiaowei
  1 sibling, 0 replies; 22+ messages in thread
From: Ren, Qiaowei @ 2014-06-24 2:53 UTC
To: Andy Lutomirski, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
    Hansen, Dave
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Linux MM

On 2014-06-24, Andy Lutomirski wrote:
>> +    /* Make bounds tables and bounds directory unlocked. */
>> +    if (vm_flags & VM_LOCKED)
>> +        vm_flags &= ~VM_LOCKED;
>
> Why? I would expect MCL_FUTURE to lock these.
>

Andy, I was just a little confused about LOCKED & POPULATE earlier and I
thought VM_LOCKED was not necessary for MPX specific bounds tables.

Now, this check should be removed, and there should be an mm_populate()
call for the VM_LOCKED case after mmap_region():

    if (!IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED))
        mm_populate(addr, len);

Thanks,
Qiaowei

end of thread, newest message ~2014-06-27 18:57 UTC

Thread overview: 22+ messages
[not found] <1403084656-27284-1-git-send-email-qiaowei.ren@intel.com>
2014-06-18 14:41 ` [PATCH v6 00/10] Intel MPX support Dave Hansen
[not found] ` <1403084656-27284-3-git-send-email-qiaowei.ren@intel.com>
2014-06-23 19:49 ` [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface Andy Lutomirski
2014-06-23 20:03 ` Dave Hansen
2014-06-23 20:06 ` Andy Lutomirski
2014-06-23 20:28 ` Dave Hansen
2014-06-23 21:04 ` Andy Lutomirski
2014-06-24  5:53 ` Ren, Qiaowei
2014-06-24 23:55 ` Andy Lutomirski
2014-06-25  1:40 ` Ren, Qiaowei
2014-06-25 21:04 ` Andy Lutomirski
2014-06-25 21:05 ` Andy Lutomirski
2014-06-25 21:45 ` Dave Hansen
2014-06-26 22:19 ` Andy Lutomirski
2014-06-26 22:58 ` Dave Hansen
2014-06-26 23:15 ` Andy Lutomirski
2014-06-27  0:19 ` Dave Hansen
2014-06-27  0:26 ` Andy Lutomirski
2014-06-27 17:34 ` Dave Hansen
2014-06-27 17:42 ` Dave Hansen
2014-06-27 18:57 ` Andy Lutomirski
2014-06-25 21:43 ` Dave Hansen
2014-06-24  2:53 ` Ren, Qiaowei