From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: [PATCH 0/2] Tracking user space vDSO remaping
Date: Fri, 20 Mar 2015 16:53:26 +0100
Message-ID:
Return-path:
Sender: owner-linux-mm@kvack.org
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

CRIU recreates the process memory layout by remapping the checkpointee's
memory areas on top of the current process (criu). This includes remapping
the vDSO to the address it had at checkpoint time.

However, some architectures like powerpc keep a reference to the vDSO base
address in order to build the signal return stack frame, which calls the
vDSO sigreturn service. So once the vDSO has been moved, this reference is
no longer valid and signal frames built afterwards are not usable.

This patch series introduces a new mm hook, 'arch_remap', which is called
when mremap is done and the mm lock is still held. The second patch adds
vDSO remap and unmap tracking to the powerpc architecture.

Laurent Dufour (2):
  mm: Introducing arch_remap hook
  powerpc/mm: Tracking vDSO remap

 arch/powerpc/include/asm/mmu_context.h   | 35 ++++++++++++++++++++++++++++++++++-
 arch/s390/include/asm/mmu_context.h      |  6 ++++++
 arch/um/include/asm/mmu_context.h        |  5 +++++
 arch/unicore32/include/asm/mmu_context.h |  6 ++++++
 arch/x86/include/asm/mmu_context.h       |  6 ++++++
 include/asm-generic/mm_hooks.h           |  6 ++++++
 mm/mremap.c                              |  9 +++++++--
 7 files changed, 70 insertions(+), 3 deletions(-)

--
1.9.1

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH 1/2] mm: Introducing arch_remap hook Date: Fri, 20 Mar 2015 16:53:27 +0100 Message-ID: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> References: Return-path: In-Reply-To: In-Reply-To: References: Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). To no break the build, this patch adds the empty hook definition to the architectures that were not using the generic hook's definition. 
Signed-off-by: Laurent Dufour
---
 arch/s390/include/asm/mmu_context.h      | 6 ++++++
 arch/um/include/asm/mmu_context.h        | 5 +++++
 arch/unicore32/include/asm/mmu_context.h | 6 ++++++
 arch/x86/include/asm/mmu_context.h       | 6 ++++++
 include/asm-generic/mm_hooks.h           | 6 ++++++
 mm/mremap.c                              | 9 +++++++--
 6 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 8fb3802f8fad..ddd861a490ba 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif /* __S390_MMU_CONTEXT_H */
diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h
index 941527e507f7..f499b017c1f9 100644
--- a/arch/um/include/asm/mmu_context.h
+++ b/arch/um/include/asm/mmu_context.h
@@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 				     struct vm_area_struct *vma)
 {
 }
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
 /*
  * end asm-generic/mm_hooks.h functions
  */
diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h
index 1cb5220afaf9..39a0a553172e 100644
--- a/arch/unicore32/include/asm/mmu_context.h
+++ b/arch/unicore32/include/asm/mmu_context.h
@@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 883f6b933fa4..75cb71f4be1e 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
 	mpx_notify_unmap(mm, vma, start, end);
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index 866aa461efa5..e507f4783a5b 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -26,4 +26,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif /* _ASM_GENERIC_MM_HOOKS_H */
diff --git a/mm/mremap.c b/mm/mremap.c
index 57dadc025c64..6a409ca09425 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -25,6 +25,7 @@
 #include
 #include
+#include
 #include "internal.h"
@@ -286,8 +287,12 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 		old_len = new_len;
 		old_addr = new_addr;
 		new_addr = -ENOMEM;
-	} else if (vma->vm_file && vma->vm_file->f_op->mremap)
-		vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+	} else {
+		if (vma->vm_file && vma->vm_file->f_op->mremap)
+			vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+		arch_remap(mm, old_addr, old_addr+old_len,
+			   new_addr, new_addr+new_len);
+	}
 
 	/* Conceal VM_ACCOUNT so old reservation is not undone */
 	if (vm_flags & VM_ACCOUNT) {
--
1.9.1

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH 2/2] powerpc/mm: Tracking vDSO remap Date: Fri, 20 Mar 2015 16:53:28 +0100 Message-ID: <462eda8901babf0a08b5ef642684ae1c6303bd5b.1426866405.git.ldufour@linux.vnet.ibm.com> References: Return-path: In-Reply-To: In-Reply-To: References: Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. 
Signed-off-by: Laurent Dufour
---
 arch/powerpc/include/asm/mmu_context.h | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 73382eba02dc..ce7fc93518ee 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -8,7 +8,6 @@
 #include
 #include
 #include
-#include
 #include
 
 /*
@@ -109,5 +108,39 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
 #endif
 }
 
+static inline void arch_dup_mmap(struct mm_struct *oldmm,
+				 struct mm_struct *mm)
+{
+}
+
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+			      struct vm_area_struct *vma,
+			      unsigned long start, unsigned long end)
+{
+	if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
+		mm->context.vdso_base = 0;
+}
+
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+				     struct vm_area_struct *vma)
+{
+}
+
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+	/*
+	 * mremap don't allow moving multiple vma so we can limit the check
+	 * to old_start == vdso_base.
+	 */
+	if (old_start == mm->context.vdso_base)
+		mm->context.vdso_base = new_start;
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
--
1.9.1

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Richard Weinberger Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook Date: Sat, 21 Mar 2015 00:19:38 +0100 Message-ID: <550CAB0A.8070402@nod.at> References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Sender: owner-linux-mm@kvack.org To: Laurent Dufour , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Am 20.03.2015 um 16:53 schrieb Laurent Dufour: > Some architecture would like to be triggered when a memory area is moved > through the mremap system call. > > This patch is introducing a new arch_remap mm hook which is placed in the > path of mremap, and is called before the old area is unmapped (and the > arch_unmap hook is called). > > To no break the build, this patch adds the empty hook definition to the > architectures that were not using the generic hook's definition. Just wanted to point out that I like that new hook as UserModeLinux can benefit from it. UML has the concept of stub pages where the UML host process can inject commands to guest processes. Currently we play nasty games in the TLB code to make all this work. arch_unmap() could make this stuff more clear and less error prone. Thanks, //richard -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ingo Molnar Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook Date: Mon, 23 Mar 2015 09:52:09 +0100 Message-ID: <20150323085209.GA28965@gmail.com> References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Received: from mail-wi0-f182.google.com ([209.85.212.182]:33974 "EHLO mail-wi0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752007AbbCWIwO (ORCPT ); Mon, 23 Mar 2015 04:52:14 -0400 Content-Disposition: inline In-Reply-To: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org * Laurent Dufour wrote: > Some architecture would like to be triggered when a memory area is moved > through the mremap system call. > > This patch is introducing a new arch_remap mm hook which is placed in the > path of mremap, and is called before the old area is unmapped (and the > arch_unmap hook is called). > > To no break the build, this patch adds the empty hook definition to the > architectures that were not using the generic hook's definition. 
> > Signed-off-by: Laurent Dufour > --- > arch/s390/include/asm/mmu_context.h | 6 ++++++ > arch/um/include/asm/mmu_context.h | 5 +++++ > arch/unicore32/include/asm/mmu_context.h | 6 ++++++ > arch/x86/include/asm/mmu_context.h | 6 ++++++ > include/asm-generic/mm_hooks.h | 6 ++++++ > mm/mremap.c | 9 +++++++-- > 6 files changed, 36 insertions(+), 2 deletions(-) > > diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h > index 8fb3802f8fad..ddd861a490ba 100644 > --- a/arch/s390/include/asm/mmu_context.h > +++ b/arch/s390/include/asm/mmu_context.h > @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > { > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif /* __S390_MMU_CONTEXT_H */ > diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h > index 941527e507f7..f499b017c1f9 100644 > --- a/arch/um/include/asm/mmu_context.h > +++ b/arch/um/include/asm/mmu_context.h > @@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > struct vm_area_struct *vma) > { > } > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > /* > * end asm-generic/mm_hooks.h functions > */ > diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h > index 1cb5220afaf9..39a0a553172e 100644 > --- a/arch/unicore32/include/asm/mmu_context.h > +++ b/arch/unicore32/include/asm/mmu_context.h > @@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > { > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif > diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h > index 883f6b933fa4..75cb71f4be1e 100644 > --- a/arch/x86/include/asm/mmu_context.h > +++ b/arch/x86/include/asm/mmu_context.h > @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, > mpx_notify_unmap(mm, vma, start, end); > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif /* _ASM_X86_MMU_CONTEXT_H */ So instead of spreading these empty prototypes around mmu_context.h files, why not add something like this to the PPC definition: #define __HAVE_ARCH_REMAP and define the empty prototype for everyone else? It's a bit like how the __HAVE_ARCH_PTEP_* namespace works. That should shrink this patch considerably. Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook Date: Mon, 23 Mar 2015 10:11:18 +0100 Message-ID: <550FD8B6.305@linux.vnet.ibm.com> References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> <20150323085209.GA28965@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150323085209.GA28965@gmail.com> Sender: owner-linux-mm@kvack.org To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org On 23/03/2015 09:52, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> Some architecture would like to be triggered when a memory area is moved >> through the mremap system call. >> >> This patch is introducing a new arch_remap mm hook which is placed in the >> path of mremap, and is called before the old area is unmapped (and the >> arch_unmap hook is called). >> >> To no break the build, this patch adds the empty hook definition to the >> architectures that were not using the generic hook's definition. >> >> Signed-off-by: Laurent Dufour >> --- >> arch/s390/include/asm/mmu_context.h | 6 ++++++ >> arch/um/include/asm/mmu_context.h | 5 +++++ >> arch/unicore32/include/asm/mmu_context.h | 6 ++++++ >> arch/x86/include/asm/mmu_context.h | 6 ++++++ >> include/asm-generic/mm_hooks.h | 6 ++++++ >> mm/mremap.c | 9 +++++++-- >> 6 files changed, 36 insertions(+), 2 deletions(-) >> >> diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h >> index 8fb3802f8fad..ddd861a490ba 100644 >> --- a/arch/s390/include/asm/mmu_context.h >> +++ b/arch/s390/include/asm/mmu_context.h >> @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> { >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif /* __S390_MMU_CONTEXT_H */ >> diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h >> index 941527e507f7..f499b017c1f9 100644 >> --- a/arch/um/include/asm/mmu_context.h >> +++ b/arch/um/include/asm/mmu_context.h >> @@ -27,6 +27,11 
@@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> struct vm_area_struct *vma) >> { >> } >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> /* >> * end asm-generic/mm_hooks.h functions >> */ >> diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h >> index 1cb5220afaf9..39a0a553172e 100644 >> --- a/arch/unicore32/include/asm/mmu_context.h >> +++ b/arch/unicore32/include/asm/mmu_context.h >> @@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> { >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif >> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h >> index 883f6b933fa4..75cb71f4be1e 100644 >> --- a/arch/x86/include/asm/mmu_context.h >> +++ b/arch/x86/include/asm/mmu_context.h >> @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, >> mpx_notify_unmap(mm, vma, start, end); >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif /* _ASM_X86_MMU_CONTEXT_H */ > > So instead of spreading these empty prototypes around mmu_context.h > files, why not add something like this to the PPC definition: > > #define __HAVE_ARCH_REMAP > > and define the empty prototype for everyone else? It's a bit like how > the __HAVE_ARCH_PTEP_* namespace works. > > That should shrink this patch considerably. My idea was to mimic the MMU hook's definition. This new hook is in the continuity of what have been done for arch_dup_mmap, arch_exit_mmap, arch_unmap and arch_bprm_mm_init. 
Do you think this one needs to be done differently?

Thanks,
Laurent.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: [PATCH v2 0/2] Tracking user space vDSO remaping
Date: Wed, 25 Mar 2015 12:06:34 +0100
Message-ID:
References: <20150323085209.GA28965@gmail.com>
Return-path:
In-Reply-To: <20150323085209.GA28965@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

CRIU recreates the process memory layout by remapping the checkpointee's
memory areas on top of the current process (criu). This includes remapping
the vDSO to the address it had at checkpoint time.

However, some architectures like powerpc keep a reference to the vDSO base
address in order to build the signal return stack frame, which calls the
vDSO sigreturn service. So once the vDSO has been moved, this reference is
no longer valid and signal frames built afterwards are not usable.

This patch series introduces a new mm hook, 'arch_remap', which is called
when mremap is done and the mm lock is still held. The second patch adds
vDSO remap and unmap tracking to the powerpc architecture.

Changes in v2:
--------------
- Following Ingo Molnar's advice, the call to arch_remap is now enabled
  through the __HAVE_ARCH_REMAP macro. This considerably reduces the first
  patch.

Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- mm/mremap.c | 11 +++++++++-- 2 files changed, 44 insertions(+), 3 deletions(-) -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH v2 1/2] mm: Introducing arch_remap hook Date: Wed, 25 Mar 2015 12:06:35 +0100 Message-ID: References: Return-path: In-Reply-To: In-Reply-To: References: <20150323085209.GA28965@gmail.com> Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). 
The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Date: Wed, 25 Mar 2015 12:06:36 +0100 Message-ID: <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> References: Return-path: In-Reply-To: In-Reply-To: References: <20150323085209.GA28965@gmail.com> Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..be5dca3f7826 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) + mm->context.vdso_base = 0; +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + /* + * mremap don't allow moving multiple vma so we can limit the check + * to old_start == vdso_base. 
+ */ + if (old_start == mm->context.vdso_base) + mm->context.vdso_base = new_start; +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ingo Molnar Subject: Re: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Date: Wed, 25 Mar 2015 13:11:19 +0100 Message-ID: <20150325121118.GA2542@gmail.com> References: <20150323085209.GA28965@gmail.com> <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> Sender: owner-linux-mm@kvack.org To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org * Laurent Dufour wrote: > Some processes (CRIU) are moving the vDSO area using the mremap system > call. As a consequence the kernel reference to the vDSO base address is > no more valid and the signal return frame built once the vDSO has been > moved is not pointing to the new sigreturn address. > > This patch handles vDSO remapping and unmapping. 
> > Signed-off-by: Laurent Dufour > --- > arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- > 1 file changed, 35 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h > index 73382eba02dc..be5dca3f7826 100644 > --- a/arch/powerpc/include/asm/mmu_context.h > +++ b/arch/powerpc/include/asm/mmu_context.h > @@ -8,7 +8,6 @@ > #include > #include > #include > -#include > #include > > /* > @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, > #endif > } > > +static inline void arch_dup_mmap(struct mm_struct *oldmm, > + struct mm_struct *mm) > +{ > +} > + > +static inline void arch_exit_mmap(struct mm_struct *mm) > +{ > +} > + > +static inline void arch_unmap(struct mm_struct *mm, > + struct vm_area_struct *vma, > + unsigned long start, unsigned long end) > +{ > + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) > + mm->context.vdso_base = 0; > +} > + > +static inline void arch_bprm_mm_init(struct mm_struct *mm, > + struct vm_area_struct *vma) > +{ > +} > + > +#define __HAVE_ARCH_REMAP > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > + /* > + * mremap don't allow moving multiple vma so we can limit the check > + * to old_start == vdso_base. s/mremap don't allow moving multiple vma mremap() doesn't allow moving multiple vmas right? Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: Re: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Date: Wed, 25 Mar 2015 14:25:16 +0100 Message-ID: <5512B73C.5050509@linux.vnet.ibm.com> References: <20150323085209.GA28965@gmail.com> <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> <20150325121118.GA2542@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150325121118.GA2542@gmail.com> Sender: linux-kernel-owner@vger.kernel.org To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org On 25/03/2015 13:11, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> Some processes (CRIU) are moving the vDSO area using the mremap system >> call. As a consequence the kernel reference to the vDSO base address is >> no more valid and the signal return frame built once the vDSO has been >> moved is not pointing to the new sigreturn address. >> >> This patch handles vDSO remapping and unmapping. 
>>
>> Signed-off-by: Laurent Dufour
>> ---
>>  arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++-
>>  1 file changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
>> index 73382eba02dc..be5dca3f7826 100644
>> --- a/arch/powerpc/include/asm/mmu_context.h
>> +++ b/arch/powerpc/include/asm/mmu_context.h
>> @@ -8,7 +8,6 @@
>>  #include
>>  #include
>>  #include
>> -#include
>>  #include
>>
>>  /*
>> @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
>>  #endif
>>  }
>>
>> +static inline void arch_dup_mmap(struct mm_struct *oldmm,
>> +                                 struct mm_struct *mm)
>> +{
>> +}
>> +
>> +static inline void arch_exit_mmap(struct mm_struct *mm)
>> +{
>> +}
>> +
>> +static inline void arch_unmap(struct mm_struct *mm,
>> +                        struct vm_area_struct *vma,
>> +                        unsigned long start, unsigned long end)
>> +{
>> +        if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
>> +                mm->context.vdso_base = 0;
>> +}
>> +
>> +static inline void arch_bprm_mm_init(struct mm_struct *mm,
>> +                                     struct vm_area_struct *vma)
>> +{
>> +}
>> +
>> +#define __HAVE_ARCH_REMAP
>> +static inline void arch_remap(struct mm_struct *mm,
>> +                              unsigned long old_start, unsigned long old_end,
>> +                              unsigned long new_start, unsigned long new_end)
>> +{
>> +        /*
>> +         * mremap don't allow moving multiple vma so we can limit the check
>> +         * to old_start == vdso_base.
>
> s/mremap don't allow moving multiple vma
>   mremap() doesn't allow moving multiple vmas
>
> right?

Sure you're right.

I'll provide a v3 fixing that comment.

Thanks,
Laurent.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: [PATCH v3 0/2] Tracking user space vDSO remaping
Date: Wed, 25 Mar 2015 14:53:50 +0100
Message-ID:
References: <20150325121118.GA2542@gmail.com>
Return-path:
In-Reply-To: <20150325121118.GA2542@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu). This includes remapping
the vDSO to the place it has at checkpoint time.

However some architectures like powerpc are keeping a reference to the vDSO
base address to build the signal return stack frame by calling the vDSO
sigreturn service. So once the vDSO has been moved, this reference is no
more valid and the signal frame built later are not usable.

This patch serie is introducing a new mm hook 'arch_remap' which is called
when mremap is done and the mm lock still hold. The next patch is adding the
vDSO remap and unmap tracking to the powerpc architecture.

Changes in v3:
--------------
- Fixed grammatical error in a comment of the second patch.
  Thanks again, Ingo.

Changes in v2:
--------------
- Following the Ingo Molnar's advice, enabling the call to arch_remap through
  the __HAVE_ARCH_REMAP macro. This reduces considerably the first patch.

Laurent Dufour (2):
  mm: Introducing arch_remap hook
  powerpc/mm: Tracking vDSO remap

 arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++-
 mm/mremap.c                            | 11 +++++++++--
 2 files changed, 44 insertions(+), 3 deletions(-)

-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Wed, 25 Mar 2015 14:53:52 +0100
Message-ID:
References:
Return-path:
In-Reply-To:
In-Reply-To:
References: <20150325121118.GA2542@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

Some processes (CRIU) are moving the vDSO area using the mremap system
call. As a consequence the kernel reference to the vDSO base address is
no more valid and the signal return frame built once the vDSO has been
moved is not pointing to the new sigreturn address.

This patch handles vDSO remapping and unmapping.

Signed-off-by: Laurent Dufour
---
 arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 73382eba02dc..7d315c1898d4 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -8,7 +8,6 @@
 #include
 #include
 #include
-#include
 #include

 /*
@@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
 #endif
 }

+static inline void arch_dup_mmap(struct mm_struct *oldmm,
+                                 struct mm_struct *mm)
+{
+}
+
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+                        struct vm_area_struct *vma,
+                        unsigned long start, unsigned long end)
+{
+        if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
+                mm->context.vdso_base = 0;
+}
+
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+                                     struct vm_area_struct *vma)
+{
+}
+
+#define __HAVE_ARCH_REMAP
+static inline void arch_remap(struct mm_struct *mm,
+                              unsigned long old_start, unsigned long old_end,
+                              unsigned long new_start, unsigned long new_end)
+{
+        /*
+         * mremap() doesn't allow moving multiple vmas so we can limit the
+         * check to old_start == vdso_base.
+         */
+        if (old_start == mm->context.vdso_base)
+                mm->context.vdso_base = new_start;
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: [PATCH v3 1/2] mm: Introducing arch_remap hook
Date: Wed, 25 Mar 2015 14:53:51 +0100
Message-ID:
References:
Return-path:
Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:44132 "EHLO e06smtp11.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752797AbbCYNyL (ORCPT ); Wed, 25 Mar 2015 09:54:11 -0400
Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:54:09 -0000
In-Reply-To:
In-Reply-To:
References: <20150325121118.GA2542@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org

Some architecture would like to be triggered when a memory area is moved
through the mremap system call.

This patch is introducing a new arch_remap mm hook which is placed in the
path of mremap, and is called before the old area is unmapped (and the
arch_unmap hook is called).

The architectures which need to call this hook should define
__HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap
service with the following prototype:

void arch_remap(struct mm_struct *mm,
                unsigned long old_start, unsigned long old_end,
                unsigned long new_start, unsigned long new_end);

Signed-off-by: Laurent Dufour
---
 mm/mremap.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 57dadc025c64..bafc234db45c 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -25,6 +25,7 @@

 #include
 #include
+#include

 #include "internal.h"

@@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma,
                old_len = new_len;
                old_addr = new_addr;
                new_addr = -ENOMEM;
-       } else if (vma->vm_file && vma->vm_file->f_op->mremap)
-               vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+       } else {
+               if (vma->vm_file && vma->vm_file->f_op->mremap)
+                       vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+#ifdef __HAVE_ARCH_REMAP
+               arch_remap(mm, old_addr, old_addr+old_len,
+                          new_addr, new_addr+new_len);
+#endif
+       }

        /* Conceal VM_ACCOUNT so old reservation is not undone */
        if (vm_flags & VM_ACCOUNT) {
-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Wed, 25 Mar 2015 19:33:16 +0100
Message-ID: <20150325183316.GA9090@gmail.com>
References: <20150325121118.GA2542@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from mail-wg0-f51.google.com ([74.125.82.51]:33345 "EHLO mail-wg0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752929AbbCYSdW (ORCPT ); Wed, 25 Mar 2015 14:33:22 -0400
Content-Disposition: inline
In-Reply-To:
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Laurent Dufour
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H.
 Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org

* Laurent Dufour wrote:

> +static inline void arch_unmap(struct mm_struct *mm,
> +                        struct vm_area_struct *vma,
> +                        unsigned long start, unsigned long end)
> +{
> +        if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
> +                mm->context.vdso_base = 0;
> +}

So AFAICS PowerPC can have multi-page vDSOs, right?

So what happens if I munmap() the middle or end of the vDSO? The above
condition only seems to cover unmaps that affect the first page. I
think 'affects any page' ought to be the right condition? (But I know
nothing about PowerPC so I might be wrong.)

> +#define __HAVE_ARCH_REMAP
> +static inline void arch_remap(struct mm_struct *mm,
> +                              unsigned long old_start, unsigned long old_end,
> +                              unsigned long new_start, unsigned long new_end)
> +{
> +        /*
> +         * mremap() doesn't allow moving multiple vmas so we can limit the
> +         * check to old_start == vdso_base.
> +         */
> +        if (old_start == mm->context.vdso_base)
> +                mm->context.vdso_base = new_start;
> +}

mremap() doesn't allow moving multiple vmas, but it allows the
movement of multi-page vmas and it also allows partial mremap()s,
where it will split up a vma.

In particular, what happens if an mremap() is done with
old_start == vdso_base, but a shorter end than the end of the vDSO?
(i.e.
a partial mremap() with fewer pages than the vDSO size)

Thanks,

        Ingo

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Wed, 25 Mar 2015 19:36:47 +0100
Message-ID: <20150325183647.GA9331@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from mail-wg0-f53.google.com ([74.125.82.53]:35296 "EHLO mail-wg0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750819AbbCYSgw (ORCPT ); Wed, 25 Mar 2015 14:36:52 -0400
Content-Disposition: inline
In-Reply-To: <20150325183316.GA9090@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Laurent Dufour
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org

* Ingo Molnar wrote:

> > +#define __HAVE_ARCH_REMAP
> > +static inline void arch_remap(struct mm_struct *mm,
> > +                              unsigned long old_start, unsigned long old_end,
> > +                              unsigned long new_start, unsigned long new_end)
> > +{
> > +        /*
> > +         * mremap() doesn't allow moving multiple vmas so we can limit the
> > +         * check to old_start == vdso_base.
> > +         */
> > +        if (old_start == mm->context.vdso_base)
> > +                mm->context.vdso_base = new_start;
> > +}
>
> mremap() doesn't allow moving multiple vmas, but it allows the
> movement of multi-page vmas and it also allows partial mremap()s,
> where it will split up a vma.

I.e. mremap() supports the shrinking (and growing) of vmas.
In that case mremap() will unmap the end of the vma and will shrink the
remaining vDSO vma.

Doesn't that result in a non-working vDSO that should zero out
vdso_base?

Thanks,

        Ingo

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Benjamin Herrenschmidt
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 08:09:57 +1100
Message-ID: <1427317797.6468.86.camel@kernel.crashing.org>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20150325183316.GA9090@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Ingo Molnar
Cc: Laurent Dufour, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

On Wed, 2015-03-25 at 19:33 +0100, Ingo Molnar wrote:
> * Laurent Dufour wrote:
>
> > +static inline void arch_unmap(struct mm_struct *mm,
> > +                        struct vm_area_struct *vma,
> > +                        unsigned long start, unsigned long end)
> > +{
> > +        if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
> > +                mm->context.vdso_base = 0;
> > +}
>
> So AFAICS PowerPC can have multi-page vDSOs, right?
>
> So what happens if I munmap() the middle or end of the vDSO? The above
> condition only seems to cover unmaps that affect the first page. I
> think 'affects any page' ought to be the right condition? (But I know
> nothing about PowerPC so I might be wrong.)

You are right, we have at least two pages.
>
> > +#define __HAVE_ARCH_REMAP
> > +static inline void arch_remap(struct mm_struct *mm,
> > +                              unsigned long old_start, unsigned long old_end,
> > +                              unsigned long new_start, unsigned long new_end)
> > +{
> > +        /*
> > +         * mremap() doesn't allow moving multiple vmas so we can limit the
> > +         * check to old_start == vdso_base.
> > +         */
> > +        if (old_start == mm->context.vdso_base)
> > +                mm->context.vdso_base = new_start;
> > +}
>
> mremap() doesn't allow moving multiple vmas, but it allows the
> movement of multi-page vmas and it also allows partial mremap()s,
> where it will split up a vma.
>
> In particular, what happens if an mremap() is done with
> old_start == vdso_base, but a shorter end than the end of the vDSO?
> (i.e. a partial mremap() with fewer pages than the vDSO size)

Is there a way to forbid splitting ? Does x86 deal with that case at all
or it doesn't have to for some other reason ?

Cheers,
Ben.

> Thanks,
>
>       Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Benjamin Herrenschmidt
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 08:11:07 +1100
Message-ID: <1427317867.6468.87.camel@kernel.crashing.org>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20150325183647.GA9331@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Ingo Molnar
Cc: Laurent Dufour, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
> * Ingo Molnar wrote:
>
> > > +#define __HAVE_ARCH_REMAP
> > > +static inline void arch_remap(struct mm_struct *mm,
> > > +                              unsigned long old_start, unsigned long old_end,
> > > +                              unsigned long new_start, unsigned long new_end)
> > > +{
> > > +        /*
> > > +         * mremap() doesn't allow moving multiple vmas so we can limit the
> > > +         * check to old_start == vdso_base.
> > > +         */
> > > +        if (old_start == mm->context.vdso_base)
> > > +                mm->context.vdso_base = new_start;
> > > +}
> >
> > mremap() doesn't allow moving multiple vmas, but it allows the
> > movement of multi-page vmas and it also allows partial mremap()s,
> > where it will split up a vma.
>
> I.e. mremap() supports the shrinking (and growing) of vmas. In that
> case mremap() will unmap the end of the vma and will shrink the
> remaining vDSO vma.
>
> Doesn't that result in a non-working vDSO that should zero out
> vdso_base?

Right. Now we can't completely prevent the user from shooting itself in
the foot I suppose, though there is a legit usage scenario which is to
move the vDSO around which it would be nice to support. I think it's
reasonable to put the onus on the user here to do the right thing.

Cheers,
Ben.

> Thanks,
>
>       Ingo

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 10:43:30 +0100
Message-ID: <20150326094330.GA15407@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <1427317867.6468.87.camel@kernel.crashing.org>
Sender: owner-linux-mm@kvack.org
To: Benjamin Herrenschmidt
Cc: Laurent Dufour, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H.
 Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

* Benjamin Herrenschmidt wrote:

> On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
> > * Ingo Molnar wrote:
> >
> > > > +#define __HAVE_ARCH_REMAP
> > > > +static inline void arch_remap(struct mm_struct *mm,
> > > > +                              unsigned long old_start, unsigned long old_end,
> > > > +                              unsigned long new_start, unsigned long new_end)
> > > > +{
> > > > +        /*
> > > > +         * mremap() doesn't allow moving multiple vmas so we can limit the
> > > > +         * check to old_start == vdso_base.
> > > > +         */
> > > > +        if (old_start == mm->context.vdso_base)
> > > > +                mm->context.vdso_base = new_start;
> > > > +}
> > >
> > > mremap() doesn't allow moving multiple vmas, but it allows the
> > > movement of multi-page vmas and it also allows partial mremap()s,
> > > where it will split up a vma.
> >
> > I.e. mremap() supports the shrinking (and growing) of vmas. In that
> > case mremap() will unmap the end of the vma and will shrink the
> > remaining vDSO vma.
> >
> > Doesn't that result in a non-working vDSO that should zero out
> > vdso_base?
>
> Right. Now we can't completely prevent the user from shooting itself
> in the foot I suppose, though there is a legit usage scenario which
> is to move the vDSO around which it would be nice to support. I
> think it's reasonable to put the onus on the user here to do the
> right thing.

I argue we should use the right condition to clear vdso_base: if the
vDSO gets at least partially unmapped. Otherwise there's little point
in the whole patch: either correctly track whether the vDSO is OK, or
don't ...

There's also the question of mprotect(): can users mprotect() the vDSO
on PowerPC?

Thanks,

        Ingo

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 10:48:44 +0100
Message-ID: <20150326094844.GB15407@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <1427317797.6468.86.camel@kernel.crashing.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <1427317797.6468.86.camel@kernel.crashing.org>
Sender: linux-kernel-owner@vger.kernel.org
To: Benjamin Herrenschmidt
Cc: Laurent Dufour, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

* Benjamin Herrenschmidt wrote:

> > > +#define __HAVE_ARCH_REMAP
> > > +static inline void arch_remap(struct mm_struct *mm,
> > > +                              unsigned long old_start, unsigned long old_end,
> > > +                              unsigned long new_start, unsigned long new_end)
> > > +{
> > > +        /*
> > > +         * mremap() doesn't allow moving multiple vmas so we can limit the
> > > +         * check to old_start == vdso_base.
> > > +         */
> > > +        if (old_start == mm->context.vdso_base)
> > > +                mm->context.vdso_base = new_start;
> > > +}
> >
> > mremap() doesn't allow moving multiple vmas, but it allows the
> > movement of multi-page vmas and it also allows partial mremap()s,
> > where it will split up a vma.
> >
> > In particular, what happens if an mremap() is done with
> > old_start == vdso_base, but a shorter end than the end of the vDSO?
> > (i.e. a partial mremap() with fewer pages than the vDSO size)
>
> Is there a way to forbid splitting ? Does x86 deal with that case at
> all or it doesn't have to for some other reason ?

So we use _install_special_mapping() - maybe PowerPC does that too?
That adds VM_DONTEXPAND which ought to prevent some - but not all - of
the VM API weirdnesses.

On x86 we'll just dump core if someone unmaps the vdso.

Thanks,

        Ingo

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 11:13:53 +0100
Message-ID: <5513DBE1.4070404@linux.vnet.ibm.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <1427317797.6468.86.camel@kernel.crashing.org> <20150326094844.GB15407@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Return-path:
Received: from e06smtp16.uk.ibm.com ([195.75.94.112]:53386 "EHLO e06smtp16.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751552AbbCZKOG (ORCPT ); Thu, 26 Mar 2015 06:14:06 -0400
Received: from /spool/local by e06smtp16.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 10:14:05 -0000
In-Reply-To: <20150326094844.GB15407@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Ingo Molnar, Benjamin Herrenschmidt
Cc: Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H.
 Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org

On 26/03/2015 10:48, Ingo Molnar wrote:
>
> * Benjamin Herrenschmidt wrote:
>
>>>> +#define __HAVE_ARCH_REMAP
>>>> +static inline void arch_remap(struct mm_struct *mm,
>>>> +                              unsigned long old_start, unsigned long old_end,
>>>> +                              unsigned long new_start, unsigned long new_end)
>>>> +{
>>>> +        /*
>>>> +         * mremap() doesn't allow moving multiple vmas so we can limit the
>>>> +         * check to old_start == vdso_base.
>>>> +         */
>>>> +        if (old_start == mm->context.vdso_base)
>>>> +                mm->context.vdso_base = new_start;
>>>> +}
>>>
>>> mremap() doesn't allow moving multiple vmas, but it allows the
>>> movement of multi-page vmas and it also allows partial mremap()s,
>>> where it will split up a vma.
>>>
>>> In particular, what happens if an mremap() is done with
>>> old_start == vdso_base, but a shorter end than the end of the vDSO?
>>> (i.e. a partial mremap() with fewer pages than the vDSO size)
>>
>> Is there a way to forbid splitting ? Does x86 deal with that case at
>> all or it doesn't have to for some other reason ?
>
> So we use _install_special_mapping() - maybe PowerPC does that too?
> That adds VM_DONTEXPAND which ought to prevent some - but not all - of
> the VM API weirdnesses.

The same is done on PowerPC. So calling mremap() to extend the vDSO is
failing but splitting it or unmapping a part of it is allowed but lead
to an unusable vDSO.

> On x86 we'll just dump core if someone unmaps the vdso.

On PowerPC, you'll get the same result.

Should we prevent the user to break its vDSO ?

Thanks,
Laurent.
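
[Editorial sketch, not part of the posted series.] The 'affects any page'
condition Ingo asks for, and the partial-mremap() case raised above, could
be modelled in plain user-space C roughly as follows. This is only an
illustration under stated assumptions: struct vdso_ctx, its vdso_size
field, and the model_unmap()/model_remap() helpers are hypothetical names
invented here. The patch under review tracks vdso_base only, and the vDSO
size is not available in mm->context in the posted code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the powerpc mm context. vdso_size is an
 * assumption made for this sketch; the posted patch only has vdso_base.
 */
struct vdso_ctx {
	unsigned long vdso_base;	/* 0 means "no usable vDSO" */
	unsigned long vdso_size;	/* length of the vDSO mapping, in bytes */
};

/* True if [start, end) shares at least one byte with the vDSO. */
static bool range_hits_vdso(const struct vdso_ctx *ctx,
			    unsigned long start, unsigned long end)
{
	if (!ctx->vdso_base)
		return false;
	return start < ctx->vdso_base + ctx->vdso_size &&
	       ctx->vdso_base < end;
}

/* Model of arch_unmap(): any overlap invalidates the vDSO reference. */
static void model_unmap(struct vdso_ctx *ctx,
			unsigned long start, unsigned long end)
{
	if (range_hits_vdso(ctx, start, end))
		ctx->vdso_base = 0;
}

/* Model of arch_remap(): follow a whole-vDSO move, drop a partial one. */
static void model_remap(struct vdso_ctx *ctx,
			unsigned long old_start, unsigned long old_end,
			unsigned long new_start)
{
	if (!range_hits_vdso(ctx, old_start, old_end))
		return;
	if (old_start == ctx->vdso_base &&
	    old_end - old_start == ctx->vdso_size)
		ctx->vdso_base = new_start;
	else
		ctx->vdso_base = 0;
}
```

With a two-page vDSO, unmapping only the second page clears vdso_base in
this model, whereas the condition in the posted arch_unmap() would leave
it set. Moving the whole range updates the base, and moving only the first
page clears it.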

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 11:37:33 +0100
Message-ID: <5513E16D.1030101@linux.vnet.ibm.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20150326094330.GA15407@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Ingo Molnar, Benjamin Herrenschmidt
Cc: Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
List-Id: linux-arch.vger.kernel.org

On 26/03/2015 10:43, Ingo Molnar wrote:
>
> * Benjamin Herrenschmidt wrote:
>
>> On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
>>> * Ingo Molnar wrote:
>>>
>>>>> +#define __HAVE_ARCH_REMAP
>>>>> +static inline void arch_remap(struct mm_struct *mm,
>>>>> +                              unsigned long old_start, unsigned long old_end,
>>>>> +                              unsigned long new_start, unsigned long new_end)
>>>>> +{
>>>>> +        /*
>>>>> +         * mremap() doesn't allow moving multiple vmas so we can limit the
>>>>> +         * check to old_start == vdso_base.
>>>>> +         */
>>>>> +        if (old_start == mm->context.vdso_base)
>>>>> +                mm->context.vdso_base = new_start;
>>>>> +}
>>>>
>>>> mremap() doesn't allow moving multiple vmas, but it allows the
>>>> movement of multi-page vmas and it also allows partial mremap()s,
>>>> where it will split up a vma.
>>>
>>> I.e. mremap() supports the shrinking (and growing) of vmas.
>>> In that case mremap() will unmap the end of the vma and will shrink
>>> the remaining vDSO vma.
>>>
>>> Doesn't that result in a non-working vDSO that should zero out
>>> vdso_base?
>>
>> Right. Now we can't completely prevent the user from shooting itself
>> in the foot I suppose, though there is a legit usage scenario which
>> is to move the vDSO around which it would be nice to support. I
>> think it's reasonable to put the onus on the user here to do the
>> right thing.
>
> I argue we should use the right condition to clear vdso_base: if the
> vDSO gets at least partially unmapped. Otherwise there's little point
> in the whole patch: either correctly track whether the vDSO is OK, or
> don't ...

That's a good option, but it may be hard to achieve in the case the vDSO
area has been splitted in multiple pieces.

Not sure there is a right way to handle that, here this is a best
effort, allowing a process to unmap its vDSO and having the sigreturn
call done through the stack area (it has to make it executable).

Anyway I'll dig into that, assuming that the vdso_base pointer should be
clear if a part of the vDSO is moved or unmapped. The patch will be
larger since I'll have to get the vDSO size which is private to the
vdso.c file.

> There's also the question of mprotect(): can users mprotect() the vDSO
> on PowerPC?

Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and
certainly all the other architectures.
Furthermore, if it is done on a partial part of the vDSO it is splitting
the vma...

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Thu, 26 Mar 2015 15:17:31 +0100
Message-ID: <20150326141730.GA23060@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> <5513E16D.1030101@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from mail-wi0-f170.google.com ([209.85.212.170]:33888 "EHLO mail-wi0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752490AbbCZORk (ORCPT ); Thu, 26 Mar 2015 10:17:40 -0400
Content-Disposition: inline
In-Reply-To: <5513E16D.1030101@linux.vnet.ibm.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Laurent Dufour
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org

* Laurent Dufour wrote:

> > I argue we should use the right condition to clear vdso_base: if
> > the vDSO gets at least partially unmapped. Otherwise there's
> > little point in the whole patch: either correctly track whether
> > the vDSO is OK, or don't ...
>
> That's a good option, but it may be hard to achieve in the case the
> vDSO area has been splitted in multiple pieces.
>
> Not sure there is a right way to handle that, here this is a best
> effort, allowing a process to unmap its vDSO and having the
> sigreturn call done through the stack area (it has to make it
> executable).
> > Anyway I'll dig into that, assuming that the vdso_base pointer > should be clear if a part of the vDSO is moved or unmapped. The > patch will be larger since I'll have to get the vDSO size which is > private to the vdso.c file. At least for munmap() I don't think that's a worry: once unmapped (even if just partially), vdso_base becomes zero and won't ever be set again. So no need to track the zillion pieces, should there be any: Humpty Dumpty won't be whole again, right? > > There's also the question of mprotect(): can users mprotect() the > > vDSO on PowerPC? > > Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and > certainly all the other architectures. Furthermore, if it is done on > a partial part of the vDSO it is splitting the vma... btw., CRIU's main purpose here is to reconstruct a vDSO that was originally randomized, but whose address must now be reproduced as-is, right? In that sense detecting the 'good' mremap() as your patch does should do the trick and is certainly not objectionable IMHO - I was just wondering whether we could make a perfect job very simply. Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Date: Thu, 26 Mar 2015 15:32:06 +0100 Message-ID: <55141866.6080007@linux.vnet.ibm.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> <5513E16D.1030101@linux.vnet.ibm.com> <20150326141730.GA23060@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150326141730.GA23060@gmail.com> Sender: owner-linux-mm@kvack.org To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org On 26/03/2015 15:17, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >>> I argue we should use the right condition to clear vdso_base: if >>> the vDSO gets at least partially unmapped. Otherwise there's >>> little point in the whole patch: either correctly track whether >>> the vDSO is OK, or don't ... >> >> That's a good option, but it may be hard to achieve in the case the >> vDSO area has been splitted in multiple pieces. >> >> Not sure there is a right way to handle that, here this is a best >> effort, allowing a process to unmap its vDSO and having the >> sigreturn call done through the stack area (it has to make it >> executable). >> >> Anyway I'll dig into that, assuming that the vdso_base pointer >> should be clear if a part of the vDSO is moved or unmapped. The >> patch will be larger since I'll have to get the vDSO size which is >> private to the vdso.c file. > > At least for munmap() I don't think that's a worry: once unmapped > (even if just partially), vdso_base becomes zero and won't ever be set > again. > > So no need to track the zillion pieces, should there be any: Humpty > Dumpty won't be whole again, right? My idea is to clear vdso_base if at least part of the vdso is unmap. But since some part of the vdso may have been moved and unmapped later, to be complete, the patch has to handle partial mremap() of the vDSO too. Otherwise such a scenario will not be detected: new_area = mremap(vdso_base + page_size, ....); munmap(new_area,...); >>> There's also the question of mprotect(): can users mprotect() the >>> vDSO on PowerPC? 
>> >> Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and >> certainly all the other architectures. Furthermore, if it is done on >> a partial part of the vDSO it is splitting the vma... > > btw., CRIU's main purpose here is to reconstruct a vDSO that was > originally randomized, but whose address must now be reproduced as-is, > right? You're right, CRIU has to move the vDSO to the same address it has at checkpoint time. > In that sense detecting the 'good' mremap() as your patch does should > do the trick and is certainly not objectionable IMHO - I was just > wondering whether we could make a perfect job very simply. I'd try to address the perfect job, this may complexify the patch, especially because the vdso's size is not recorded in the PowerPC mm_context structure. Not sure it is a good idea to extend that structure.. Thanks, Laurent. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH v4 1/2] mm: Introducing arch_remap hook Date: Thu, 26 Mar 2015 18:37:52 +0100 Message-ID: References: Return-path: In-Reply-To: In-Reply-To: References: <20150326141730.GA23060@gmail.com> Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Some architecture would like to be triggered when a memory area is moved through the mremap system call. 
This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH v4 0/2] Tracking user space vDSO remaping Date: Thu, 26 Mar 2015 18:37:51 +0100 Message-ID: References: <20150326141730.GA23060@gmail.com> Return-path: In-Reply-To: <20150326141730.GA23060@gmail.com> Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org CRIU is recreating the process memory layout by remapping the checkpointee memory area on top of the current process (criu). This includes remapping the vDSO to the place it has at checkpoint time. However some architectures like powerpc are keeping a reference to the vDSO base address to build the signal return stack frame by calling the vDSO sigreturn service. So once the vDSO has been moved, this reference is no more valid and the signal frame built later are not usable. This patch serie is introducing a new mm hook 'arch_remap' which is called when mremap is done and the mm lock still hold. The next patch is adding the vDSO remap and unmap tracking to the powerpc architecture. Changes in v4: -------------- - Reviewing the PowerPC part of the patch to handle partial unmap and remap of the vDSO. Changes in v3: -------------- - Fixed grammatical error in a comment of the second patch. Thanks again, Ingo. Changes in v2: -------------- - Following the Ingo Molnar's advice, enabling the call to arch_remap through the __HAVE_ARCH_REMAP macro. This reduces considerably the first patch. Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++- arch/powerpc/kernel/vdso.c | 39 ++++++++++++++++++++++++++++++++++ mm/mremap.c | 11 ++++++++-- 3 files changed, 79 insertions(+), 3 deletions(-) -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Date: Thu, 26 Mar 2015 18:37:53 +0100 Message-ID: <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> References: Return-path: In-Reply-To: In-Reply-To: References: <20150326141730.GA23060@gmail.com> Sender: owner-linux-mm@kvack.org To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org Some processes (CRIU) move the vDSO area using the mremap system call. As a consequence, the kernel's reference to the vDSO base address is no longer valid, and the signal return frame built once the vDSO has been moved does not point to the new sigreturn address. This patch handles vDSO remapping and unmapping. Partially moving or unmapping the vDSO invalidates it from the kernel's point of view. 
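To make the rule above concrete, here is a minimal user-space sketch of the tracking logic, not the kernel code itself. The struct vdso_track type, its vdso_len field and the track_vdso_*() names are assumptions made for illustration; the patch keeps only vdso_base in mm->context and derives the size from vdso32_pages/vdso64_pages. The sketch also swaps the patch's three-clause range check for a single interval-overlap test, which should be equivalent for non-empty ranges.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm state; only vdso_base mirrors the patch. */
struct vdso_track {
	unsigned long vdso_base;	/* 0 means "no usable vDSO" */
	unsigned long vdso_len;		/* assumed to be known; not stored by the patch */
};

/* True when [a_start, a_end) and [b_start, b_end) share at least one byte. */
static inline bool ranges_overlap(unsigned long a_start, unsigned long a_end,
				  unsigned long b_start, unsigned long b_end)
{
	return a_start < b_end && b_start < a_end;
}

/* Follow a whole-vDSO move; invalidate on any partial move or unmap. */
static inline void track_vdso_remap(struct vdso_track *t,
				    unsigned long old_start, unsigned long old_end,
				    unsigned long new_start, unsigned long new_end)
{
	unsigned long vdso_start = t->vdso_base;
	unsigned long vdso_end = vdso_start + t->vdso_len;

	if (!vdso_start)
		return;		/* already invalidated, nothing to track */

	if (!ranges_overlap(old_start, old_end, vdso_start, vdso_end))
		return;		/* the operation does not touch the vDSO */

	if (old_start == vdso_start && old_end == vdso_end &&
	    (old_end - old_start) == (new_end - new_start))
		t->vdso_base = new_start;	/* moved as a whole */
	else
		t->vdso_base = 0;		/* partially moved or unmapped */
}

/* An unmap is modelled as a remap to the empty range [0, 0). */
static inline void track_vdso_unmap(struct vdso_track *t,
				    unsigned long start, unsigned long end)
{
	track_vdso_remap(t, start, end, 0, 0);
}
```

With a three-page vDSO at 0x100000, moving the whole range to 0x200000 updates vdso_base, while unmapping only its middle page afterwards resets vdso_base to 0.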
Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++- arch/powerpc/kernel/vdso.c | 39 ++++++++++++++++++++++++++++++++++ 2 files changed, 70 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..67734ce8be67 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,36 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +extern void arch_vdso_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end); +static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + arch_vdso_remap(mm, start, end, 0, 0); +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + arch_vdso_remap(mm, old_start, old_end, new_start, new_end); +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 305eb0d9b768..a11b5d8f36d6 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -283,6 +283,45 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) return rc; } +void arch_vdso_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + unsigned long vdso_end, vdso_start; + + if (!mm->context.vdso_base) + 
return; + vdso_start = mm->context.vdso_base; + +#ifdef CONFIG_PPC64 + /* Calling is_32bit_task() implies that we are dealing with the + * current process memory. If there is a call path where mm is not + * owned by the current task, then we'll have need to store the + * vDSO size in the mm->context. + */ + BUG_ON(current->mm != mm); + if (is_32bit_task()) + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); + else + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); +#else + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); +#endif + vdso_end += (1<<PAGE_SHIFT); + + /* Check if the vDSO is in the range of the remapped area */ + if ((vdso_start <= old_start && old_start < vdso_end) || + (vdso_start < old_end && old_end <= vdso_end) || + (old_start <= vdso_start && vdso_start < old_end)) { + /* Update vdso_base if the vDSO is entirely moved. */ + if (old_start == vdso_start && old_end == vdso_end && + (old_end - old_start) == (new_end - new_start)) + mm->context.vdso_base = new_start; + else + mm->context.vdso_base = 0; + } +} + const char *arch_vma_name(struct vm_area_struct *vma) { if (vma->vm_mm && vma->vm_start == vma->vm_mm->context.vdso_base) -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ingo Molnar Subject: Re: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Date: Thu, 26 Mar 2015 19:55:50 +0100 Message-ID: <20150326185550.GA25547@gmail.com> References: <20150326141730.GA23060@gmail.com> <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> Sender: owner-linux-mm@kvack.org To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org * Laurent Dufour wrote: > +{ > + unsigned long vdso_end, vdso_start; > + > + if (!mm->context.vdso_base) > + return; > + vdso_start = mm->context.vdso_base; > + > +#ifdef CONFIG_PPC64 > + /* Calling is_32bit_task() implies that we are dealing with the > + * current process memory. If there is a call path where mm is not > + * owned by the current task, then we'll have need to store the > + * vDSO size in the mm->context. > + */ > + BUG_ON(current->mm != mm); > + if (is_32bit_task()) > + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); > + else > + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); > +#else > + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); > +#endif > + vdso_end += (1<<PAGE_SHIFT); > + > + /* Check if the vDSO is in the range of the remapped area */ > + if ((vdso_start <= old_start && old_start < vdso_end) || > + (vdso_start < old_end && old_end <= vdso_end) || > + (old_start <= vdso_start && vdso_start < old_end)) { > + /* Update vdso_base if the vDSO is entirely moved. */ > + if (old_start == vdso_start && old_end == vdso_end && > + (old_end - old_start) == (new_end - new_start)) > + mm->context.vdso_base = new_start; > + else > + mm->context.vdso_base = 0; > + } > +} Oh my, that really looks awfully complex, as you predicted, and right in every mremap() call. I'm fine with your original, imperfect, KISS approach. Sorry about this detour ... Reviewed-by: Ingo Molnar Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Benjamin Herrenschmidt Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Date: Fri, 27 Mar 2015 10:23:03 +1100 Message-ID: <1427412183.6468.148.camel@kernel.crashing.org> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150326094330.GA15407@gmail.com> Sender: owner-linux-mm@kvack.org To: Ingo Molnar Cc: Laurent Dufour , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org List-Id: linux-arch.vger.kernel.org On Thu, 2015-03-26 at 10:43 +0100, Ingo Molnar wrote: > * Benjamin Herrenschmidt wrote: > > > On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: > > > * Ingo Molnar wrote: > > > > > > > > +#define __HAVE_ARCH_REMAP > > > > > +static inline void arch_remap(struct mm_struct *mm, > > > > > + unsigned long old_start, unsigned long old_end, > > > > > + unsigned long new_start, unsigned long new_end) > > > > > +{ > > > > > + /* > > > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > > > + * check to old_start == vdso_base. 
> > > > > + */ > > > > > + if (old_start == mm->context.vdso_base) > > > > > + mm->context.vdso_base = new_start; > > > > > +} > > > > > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > > > movement of multi-page vmas and it also allows partial mremap()s, > > > > where it will split up a vma. > > > > > > I.e. mremap() supports the shrinking (and growing) of vmas. In that > > > case mremap() will unmap the end of the vma and will shrink the > > > remaining vDSO vma. > > > > > > Doesn't that result in a non-working vDSO that should zero out > > > vdso_base? > > > > Right. Now we can't completely prevent the user from shooting itself > > in the foot I suppose, though there is a legit usage scenario which > > is to move the vDSO around which it would be nice to support. I > > think it's reasonable to put the onus on the user here to do the > > right thing. > > I argue we should use the right condition to clear vdso_base: if the > vDSO gets at least partially unmapped. Otherwise there's little point > in the whole patch: either correctly track whether the vDSO is OK, or > don't ... Well, if we are going to clear it at all yes, we should probably be a bit smarter about it. My point however was we probably don't need to be super robust about dealing with any crazy scenario userspace might conceive. > There's also the question of mprotect(): can users mprotect() the vDSO > on PowerPC? Nothing prevents it. But here too, I wouldn't bother. The user might be doing on purpose expecting to catch the resulting signal for example (though arguably a signal from a sigreturn frame is ... odd). Cheers, Ben. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Laurent Dufour Subject: Re: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Date: Fri, 27 Mar 2015 12:02:13 +0100 Message-ID: <551538B5.2030507@linux.vnet.ibm.com> References: <20150326141730.GA23060@gmail.com> <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> <20150326185550.GA25547@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Return-path: Received: from e06smtp17.uk.ibm.com ([195.75.94.113]:46438 "EHLO e06smtp17.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751284AbbC0LCX (ORCPT ); Fri, 27 Mar 2015 07:02:23 -0400 Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 27 Mar 2015 11:02:21 -0000 In-Reply-To: <20150326185550.GA25547@gmail.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org On 26/03/2015 19:55, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> +{ >> + unsigned long vdso_end, vdso_start; >> + >> + if (!mm->context.vdso_base) >> + return; >> + vdso_start = mm->context.vdso_base; >> + >> +#ifdef CONFIG_PPC64 >> + /* Calling is_32bit_task() implies that we are dealing with the >> + * current process memory. If there is a call path where mm is not >> + * owned by the current task, then we'll have need to store the >> + * vDSO size in the mm->context. 
>> + */ >> + BUG_ON(current->mm != mm); >> + if (is_32bit_task()) >> + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); >> + else >> + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); >> +#else >> + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); >> +#endif >> + vdso_end += (1<<PAGE_SHIFT); >> + >> + /* Check if the vDSO is in the range of the remapped area */ >> + if ((vdso_start <= old_start && old_start < vdso_end) || >> + (vdso_start < old_end && old_end <= vdso_end) || >> + (old_start <= vdso_start && vdso_start < old_end)) { >> + /* Update vdso_base if the vDSO is entirely moved. */ >> + if (old_start == vdso_start && old_end == vdso_end && >> + (old_end - old_start) == (new_end - new_start)) >> + mm->context.vdso_base = new_start; >> + else >> + mm->context.vdso_base = 0; >> + } >> +} > > Oh my, that really looks awfully complex, as you predicted, and right > in every mremap() call. I do agree, that's awfully complex ;) > I'm fine with your original, imperfect, KISS approach. Sorry about > this detour ... > > Reviewed-by: Ingo Molnar No problem, so let's stay with the v3 version of the patch. Thanks for the Reviewed-by statement which, I guess, applies to v3 too. Should I resend v3? Thanks, Laurent. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christopher Covington Subject: Re: [PATCH 0/2] Tracking user space vDSO remaping Date: Wed, 2 Mar 2016 07:13:10 -0500 Message-ID: <56D6D8D6.6060306@codeaurora.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: Sender: owner-linux-mm@kvack.org To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, criu@openvz.org, "linux-arm-kernel@lists.infradead.org" , Will Deacon , Laura Abbott , David Brown List-Id: linux-arch.vger.kernel.org Hi, On 03/20/2015 11:53 AM, Laurent Dufour wrote: > CRIU is recreating the process memory layout by remapping the checkpointee > memory area on top of the current process (criu). This includes remapping > the vDSO to the place it has at checkpoint time. > > However some architectures like powerpc are keeping a reference to the vDSO > base address to build the signal return stack frame by calling the vDSO > sigreturn service. So once the vDSO has been moved, this reference is no > more valid and the signal frame built later are not usable. > > This patch serie is introducing a new mm hook 'arch_remap' which is called > when mremap is done and the mm lock still hold. The next patch is adding the > vDSO remap and unmap tracking to the powerpc architecture. > > Laurent Dufour (2): > mm: Introducing arch_remap hook > powerpc/mm: Tracking vDSO remap > > arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++- > arch/s390/include/asm/mmu_context.h | 6 ++++++ > arch/um/include/asm/mmu_context.h | 5 +++++ > arch/unicore32/include/asm/mmu_context.h | 6 ++++++ > arch/x86/include/asm/mmu_context.h | 6 ++++++ > include/asm-generic/mm_hooks.h | 6 ++++++ > mm/mremap.c | 9 ++++++-- > 7 files changed, 70 insertions(+), 3 deletions(-) We would like to be able to remap/unmap the VDSO on arm and arm64 as well. When I proposed a patch with mmu_context.h and mmu-arch-hooks.h changes to arm64 that were nearly identical to those done to powerpc, Will Deacon reasonably suggested [1] attempting to combine the code and provide generic VDSO accessors. 
Unfortunately, I have no prior experience with generic MM code. Can anyone advise on how to get started with that? 1. http://www.spinics.net/lists/linux-arm-msm/msg18441.html Thanks, Christopher Covington -- Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp13.uk.ibm.com ([195.75.94.109]:46132 "EHLO e06smtp13.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751281AbbCTPxo (ORCPT ); Fri, 20 Mar 2015 11:53:44 -0400 Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 20 Mar 2015 15:53:42 -0000 From: Laurent Dufour Subject: [PATCH 0/2] Tracking user space vDSO remaping Date: Fri, 20 Mar 2015 16:53:26 +0100 Message-ID: Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150320155326.8_fu9mxiBfLgBgRjSnh-UmVnn4nHeSJQarSLm7XRaPk@z> CRIU is recreating the process memory layout by remapping the checkpointee memory area on top of the current process (criu). This includes remapping the vDSO to the place it has at checkpoint time. 
However some architectures like powerpc are keeping a reference to the vDSO base address to build the signal return stack frame by calling the vDSO sigreturn service. So once the vDSO has been moved, this reference is no more valid and the signal frame built later are not usable. This patch serie is introducing a new mm hook 'arch_remap' which is called when mremap is done and the mm lock still hold. The next patch is adding the vDSO remap and unmap tracking to the powerpc architecture. Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++- arch/s390/include/asm/mmu_context.h | 6 ++++++ arch/um/include/asm/mmu_context.h | 5 +++++ arch/unicore32/include/asm/mmu_context.h | 6 ++++++ arch/x86/include/asm/mmu_context.h | 6 ++++++ include/asm-generic/mm_hooks.h | 6 ++++++ mm/mremap.c | 9 ++++++-- 7 files changed, 70 insertions(+), 3 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:45821 "EHLO e06smtp11.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751834AbbCTPxr (ORCPT ); Fri, 20 Mar 2015 11:53:47 -0400 Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 20 Mar 2015 15:53:46 -0000 From: Laurent Dufour Subject: [PATCH 2/2] powerpc/mm: Tracking vDSO remap Date: Fri, 20 Mar 2015 16:53:28 +0100 Message-ID: <462eda8901babf0a08b5ef642684ae1c6303bd5b.1426866405.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150320155328.a79qPMkEM98E-d9ubeQ5BZAFYBzDEMJM5WWWJAs3Ea8@z> Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..ce7fc93518ee 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,39 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) + mm->context.vdso_base = 0; +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + /* + * mremap don't allow moving multiple vma so we can limit the check + * to old_start == vdso_base. 
+ */ + if (old_start == mm->context.vdso_base) + mm->context.vdso_base = new_start; +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp12.uk.ibm.com ([195.75.94.108]:38666 "EHLO e06smtp12.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751316AbbCTPxp (ORCPT ); Fri, 20 Mar 2015 11:53:45 -0400 Received: from /spool/local by e06smtp12.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 20 Mar 2015 15:53:44 -0000 From: Laurent Dufour Subject: [PATCH 1/2] mm: Introducing arch_remap hook Date: Fri, 20 Mar 2015 16:53:27 +0100 Message-ID: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150320155327.dXuyT8nePqM40P6YVjSCl8we3g5RX2VpuPuud6esnP8@z> Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). To no break the build, this patch adds the empty hook definition to the architectures that were not using the generic hook's definition. 
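The hook pattern described above can be sketched in user space as follows. This is only an illustration under stated assumptions: struct mm_sim, arch_remap_sim(), move_area_sim() and the SIM_ARCH_TRACKS_VDSO switch are hypothetical names, not kernel APIs. It shows an empty default hook that an architecture may override, called once the area has been moved and before the old range would be unmapped.

```c
#include <assert.h>

/* Hypothetical stand-in for struct mm_struct; only the field used here. */
struct mm_sim {
	unsigned long vdso_base;
};

#ifdef SIM_ARCH_TRACKS_VDSO
/* An architecture that cares overrides the hook (simple whole-move tracking). */
static inline void arch_remap_sim(struct mm_sim *mm,
				  unsigned long old_start, unsigned long old_end,
				  unsigned long new_start, unsigned long new_end)
{
	(void)old_end;
	(void)new_end;
	if (old_start == mm->vdso_base)
		mm->vdso_base = new_start;
}
#else
/* Default: an empty hook, so the generic move path builds everywhere. */
static inline void arch_remap_sim(struct mm_sim *mm,
				  unsigned long old_start, unsigned long old_end,
				  unsigned long new_start, unsigned long new_end)
{
	(void)mm;
	(void)old_start;
	(void)old_end;
	(void)new_start;
	(void)new_end;
}
#endif

/* Simplified model of the move path: relocate, call the hook, then unmap. */
static inline unsigned long move_area_sim(struct mm_sim *mm,
					  unsigned long old_addr,
					  unsigned long len,
					  unsigned long new_addr)
{
	/* (the real code relocates the mapping here) */
	arch_remap_sim(mm, old_addr, old_addr + len, new_addr, new_addr + len);
	/* (the old range would be unmapped only after the hook has run) */
	return new_addr;
}
```

Built without SIM_ARCH_TRACKS_VDSO, the hook does nothing and vdso_base is left untouched; defining the macro before the block selects the tracking variant.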
Signed-off-by: Laurent Dufour --- arch/s390/include/asm/mmu_context.h | 6 ++++++ arch/um/include/asm/mmu_context.h | 5 +++++ arch/unicore32/include/asm/mmu_context.h | 6 ++++++ arch/x86/include/asm/mmu_context.h | 6 ++++++ include/asm-generic/mm_hooks.h | 6 ++++++ mm/mremap.c | 9 +++++++-- 6 files changed, 36 insertions(+), 2 deletions(-) diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h index 8fb3802f8fad..ddd861a490ba 100644 --- a/arch/s390/include/asm/mmu_context.h +++ b/arch/s390/include/asm/mmu_context.h @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, { } +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ +} + #endif /* __S390_MMU_CONTEXT_H */ diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h index 941527e507f7..f499b017c1f9 100644 --- a/arch/um/include/asm/mmu_context.h +++ b/arch/um/include/asm/mmu_context.h @@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, struct vm_area_struct *vma) { } +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ +} /* * end asm-generic/mm_hooks.h functions */ diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h index 1cb5220afaf9..39a0a553172e 100644 --- a/arch/unicore32/include/asm/mmu_context.h +++ b/arch/unicore32/include/asm/mmu_context.h @@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, { } +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ +} + #endif diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 883f6b933fa4..75cb71f4be1e 100644 --- a/arch/x86/include/asm/mmu_context.h +++ 
b/arch/x86/include/asm/mmu_context.h @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, mpx_notify_unmap(mm, vma, start, end); } +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ +} + #endif /* _ASM_X86_MMU_CONTEXT_H */ diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h index 866aa461efa5..e507f4783a5b 100644 --- a/include/asm-generic/mm_hooks.h +++ b/include/asm-generic/mm_hooks.h @@ -26,4 +26,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, { } +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ +} + #endif /* _ASM_GENERIC_MM_HOOKS_H */ diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..6a409ca09425 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,12 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from a.ns.miles-group.at ([95.130.255.143]:65276 "EHLO radon.swed.at" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751361AbbCTXTt (ORCPT ); Fri, 20 Mar 2015 19:19:49 -0400 Message-ID: <550CAB0A.8070402@nod.at> Date: Sat, 21 Mar 2015 00:19:38 +0100 From: Richard Weinberger MIME-Version: 1.0 Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook 
References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> In-Reply-To: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Laurent Dufour , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150320231938.f5bUFErwpICbO3VCCXHOliJ_8BhOEhSSMWvWO2o2jZ0@z> Am 20.03.2015 um 16:53 schrieb Laurent Dufour: > Some architecture would like to be triggered when a memory area is moved > through the mremap system call. > > This patch is introducing a new arch_remap mm hook which is placed in the > path of mremap, and is called before the old area is unmapped (and the > arch_unmap hook is called). > > To no break the build, this patch adds the empty hook definition to the > architectures that were not using the generic hook's definition. Just wanted to point out that I like that new hook as UserModeLinux can benefit from it. UML has the concept of stub pages where the UML host process can inject commands to guest processes. Currently we play nasty games in the TLB code to make all this work. arch_unmap() could make this stuff more clear and less error prone. 
Thanks, //richard From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp14.uk.ibm.com ([195.75.94.110]:44198 "EHLO e06smtp14.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752182AbbCWJLg (ORCPT ); Mon, 23 Mar 2015 05:11:36 -0400 Received: from /spool/local by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 23 Mar 2015 09:11:34 -0000 Message-ID: <550FD8B6.305@linux.vnet.ibm.com> Date: Mon, 23 Mar 2015 10:11:18 +0100 From: Laurent Dufour MIME-Version: 1.0 Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> <20150323085209.GA28965@gmail.com> In-Reply-To: <20150323085209.GA28965@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org Message-ID: <20150323091118.x5NG65HBcz1ZBe_gw2uMnn2Fz8MSWAku2iiHt10MpdA@z> On 23/03/2015 09:52, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> Some architecture would like to be triggered when a memory area is moved >> through the mremap system call. >> >> This patch is introducing a new arch_remap mm hook which is placed in the >> path of mremap, and is called before the old area is unmapped (and the >> arch_unmap hook is called). >> >> To no break the build, this patch adds the empty hook definition to the >> architectures that were not using the generic hook's definition. 
>> >> Signed-off-by: Laurent Dufour >> --- >> arch/s390/include/asm/mmu_context.h | 6 ++++++ >> arch/um/include/asm/mmu_context.h | 5 +++++ >> arch/unicore32/include/asm/mmu_context.h | 6 ++++++ >> arch/x86/include/asm/mmu_context.h | 6 ++++++ >> include/asm-generic/mm_hooks.h | 6 ++++++ >> mm/mremap.c | 9 +++++++-- >> 6 files changed, 36 insertions(+), 2 deletions(-) >> >> diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h >> index 8fb3802f8fad..ddd861a490ba 100644 >> --- a/arch/s390/include/asm/mmu_context.h >> +++ b/arch/s390/include/asm/mmu_context.h >> @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> { >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif /* __S390_MMU_CONTEXT_H */ >> diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h >> index 941527e507f7..f499b017c1f9 100644 >> --- a/arch/um/include/asm/mmu_context.h >> +++ b/arch/um/include/asm/mmu_context.h >> @@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> struct vm_area_struct *vma) >> { >> } >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> /* >> * end asm-generic/mm_hooks.h functions >> */ >> diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h >> index 1cb5220afaf9..39a0a553172e 100644 >> --- a/arch/unicore32/include/asm/mmu_context.h >> +++ b/arch/unicore32/include/asm/mmu_context.h >> @@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> { >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif 
>> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h >> index 883f6b933fa4..75cb71f4be1e 100644 >> --- a/arch/x86/include/asm/mmu_context.h >> +++ b/arch/x86/include/asm/mmu_context.h >> @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, >> mpx_notify_unmap(mm, vma, start, end); >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif /* _ASM_X86_MMU_CONTEXT_H */ > > So instead of spreading these empty prototypes around mmu_context.h > files, why not add something like this to the PPC definition: > > #define __HAVE_ARCH_REMAP > > and define the empty prototype for everyone else? It's a bit like how > the __HAVE_ARCH_PTEP_* namespace works. > > That should shrink this patch considerably. My idea was to mimic the MMU hook's definition. This new hook is in the continuity of what have been done for arch_dup_mmap, arch_exit_mmap, arch_unmap and arch_bprm_mm_init. Do you think that there is a need to make this one in another way ? Thanks, Laurent. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:43024 "EHLO e06smtp11.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752058AbbCYLGv (ORCPT ); Wed, 25 Mar 2015 07:06:51 -0400 Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Wed, 25 Mar 2015 11:06:50 -0000
From: Laurent Dufour
Subject: [PATCH v2 0/2] Tracking user space vDSO remaping
Date: Wed, 25 Mar 2015 12:06:34 +0100
Message-ID:
In-Reply-To: <20150323085209.GA28965@gmail.com>
References: <20150323085209.GA28965@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org
Message-ID: <20150325110634.zqtGAGrF87heqNEPx2Bq_OtmBjT3w8WoFEN8EBgoLyA@z>

CRIU recreates the process memory layout by remapping the checkpointee
memory area on top of the current process (criu). This includes remapping
the vDSO to the place it had at checkpoint time.

However, some architectures, like powerpc, keep a reference to the vDSO
base address in order to build the signal return stack frame, which calls
the vDSO sigreturn service. Once the vDSO has been moved, this reference
is no longer valid and the signal frames built later are not usable.

This patch series introduces a new mm hook, 'arch_remap', which is called
when mremap is done and the mm lock is still held. The next patch adds
vDSO remap and unmap tracking to the powerpc architecture.

Changes in v2:
--------------
- Following Ingo Molnar's advice, the call to arch_remap is enabled
  through the __HAVE_ARCH_REMAP macro. This considerably reduces the
  first patch.

Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- mm/mremap.c | 11 +++++++++-- 2 files changed, 44 insertions(+), 3 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp15.uk.ibm.com ([195.75.94.111]:49696 "EHLO e06smtp15.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752276AbbCYLG5 (ORCPT ); Wed, 25 Mar 2015 07:06:57 -0400 Received: from /spool/local by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Mar 2015 11:06:55 -0000 From: Laurent Dufour Subject: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Date: Wed, 25 Mar 2015 12:06:36 +0100 Message-ID: <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: <20150323085209.GA28965@gmail.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150325110636.mmuKPSB7nI2SnS5iHWMa_txZej7AR3ZRZ01yHz2lSFw@z> Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. 
Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..be5dca3f7826 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) + mm->context.vdso_base = 0; +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + /* + * mremap don't allow moving multiple vma so we can limit the check + * to old_start == vdso_base. + */ + if (old_start == mm->context.vdso_base) + mm->context.vdso_base = new_start; +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp17.uk.ibm.com ([195.75.94.113]:57914 "EHLO e06smtp17.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752387AbbCYLGz (ORCPT ); Wed, 25 Mar 2015 07:06:55 -0400 Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Wed, 25 Mar 2015 11:06:53 -0000 From: Laurent Dufour Subject: [PATCH v2 1/2] mm: Introducing arch_remap hook Date: Wed, 25 Mar 2015 12:06:35 +0100 Message-ID: In-Reply-To: References: In-Reply-To: References: <20150323085209.GA28965@gmail.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150325110635.VANhbLVQXe-fRk0knZsKfkRmGVzVE6yjqTd6aApbwmA@z> Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). 
The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f48.google.com ([74.125.82.48]:36777 "EHLO mail-wg0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751673AbbCYMLZ (ORCPT ); Wed, 25 Mar 2015 08:11:25 -0400 Date: Wed, 25 Mar 2015 13:11:19 +0100 From: Ingo Molnar Subject: Re: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150325121118.GA2542@gmail.com> References: <20150323085209.GA28965@gmail.com> <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Laurent Dufour Cc: Benjamin 
Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org Message-ID: <20150325121119.iIDMJgnpgzkWuIGc-o0m_h_2d8D4PR36ZYlAfrxUeCg@z> * Laurent Dufour wrote: > Some processes (CRIU) are moving the vDSO area using the mremap system > call. As a consequence the kernel reference to the vDSO base address is > no more valid and the signal return frame built once the vDSO has been > moved is not pointing to the new sigreturn address. > > This patch handles vDSO remapping and unmapping. > > Signed-off-by: Laurent Dufour > --- > arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- > 1 file changed, 35 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h > index 73382eba02dc..be5dca3f7826 100644 > --- a/arch/powerpc/include/asm/mmu_context.h > +++ b/arch/powerpc/include/asm/mmu_context.h > @@ -8,7 +8,6 @@ > #include > #include > #include > -#include > #include > > /* > @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, > #endif > } > > +static inline void arch_dup_mmap(struct mm_struct *oldmm, > + struct mm_struct *mm) > +{ > +} > + > +static inline void arch_exit_mmap(struct mm_struct *mm) > +{ > +} > + > +static inline void arch_unmap(struct mm_struct *mm, > + struct vm_area_struct *vma, > + unsigned long start, unsigned long end) > +{ > + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) > + mm->context.vdso_base = 0; > +} > + > +static inline void arch_bprm_mm_init(struct mm_struct *mm, > + struct vm_area_struct *vma) > +{ > +} > + > +#define 
__HAVE_ARCH_REMAP
> +static inline void arch_remap(struct mm_struct *mm,
> +			      unsigned long old_start, unsigned long old_end,
> +			      unsigned long new_start, unsigned long new_end)
> +{
> +	/*
> +	 * mremap don't allow moving multiple vma so we can limit the check
> +	 * to old_start == vdso_base.

s/mremap don't allow moving multiple vma
  mremap() doesn't allow moving multiple vmas

right?

Thanks,

	Ingo

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from e06smtp10.uk.ibm.com ([195.75.94.106]:44134 "EHLO e06smtp10.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752793AbbCYNyJ (ORCPT ); Wed, 25 Mar 2015 09:54:09 -0400
Received: from /spool/local by e06smtp10.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:54:07 -0000
From: Laurent Dufour
Subject: [PATCH v3 0/2] Tracking user space vDSO remaping
Date: Wed, 25 Mar 2015 14:53:50 +0100
Message-ID:
In-Reply-To: <20150325121118.GA2542@gmail.com>
References: <20150325121118.GA2542@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Cc: cov@codeaurora.org, criu@openvz.org
Message-ID: <20150325135350.5d2bEovula0WZcEUraHlu1Eu6AHbYICZd6nANMgX07E@z>

CRIU recreates the process memory layout by remapping the checkpointee
memory area on top of the current process (criu). This includes remapping
the vDSO to the place it had at checkpoint time.

However, some architectures, like powerpc, keep a reference to the vDSO
base address in order to build the signal return stack frame, which calls
the vDSO sigreturn service. Once the vDSO has been moved, this reference
is no longer valid and the signal frames built later are not usable.

This patch series introduces a new mm hook, 'arch_remap', which is called
when mremap is done and the mm lock is still held. The next patch adds
vDSO remap and unmap tracking to the powerpc architecture.

Changes in v3:
--------------
- Fixed a grammatical error in a comment in the second patch. Thanks
  again, Ingo.

Changes in v2:
--------------
- Following Ingo Molnar's advice, the call to arch_remap is enabled
  through the __HAVE_ARCH_REMAP macro. This considerably reduces the
  first patch.

Laurent Dufour (2):
  mm: Introducing arch_remap hook
  powerpc/mm: Tracking vDSO remap

 arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++-
 mm/mremap.c                            | 11 +++++++++--
 2 files changed, 44 insertions(+), 3 deletions(-)

-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from e06smtp13.uk.ibm.com ([195.75.94.109]:56252 "EHLO e06smtp13.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752832AbbCYNyN (ORCPT ); Wed, 25 Mar 2015 09:54:13 -0400
Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:54:11 -0000
From: Laurent Dufour
Subject: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Date: Wed, 25 Mar 2015 14:53:52 +0100
Message-ID:
In-Reply-To:
References:
In-Reply-To:
References: <20150325121118.GA2542@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H.
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150325135352.7eT_0RKjGEOpYf7PWM5M3ab6T2BkZSus3H-WFfWGAEg@z> Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..7d315c1898d4 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) + mm->context.vdso_base = 0; +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + /* + * mremap() doesn't allow moving multiple vmas so we can limit the + * check to 
old_start == vdso_base.
+	 */
+	if (old_start == mm->context.vdso_base)
+		mm->context.vdso_base = new_start;
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from gate.crashing.org ([63.228.1.57]:42636 "EHLO gate.crashing.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751118AbbCYVfC (ORCPT ); Wed, 25 Mar 2015 17:35:02 -0400
Message-ID: <1427317797.6468.86.camel@kernel.crashing.org>
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
From: Benjamin Herrenschmidt
Date: Thu, 26 Mar 2015 08:09:57 +1100
In-Reply-To: <20150325183316.GA9090@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Ingo Molnar
Cc: Laurent Dufour , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
Message-ID: <20150325210957.4HylhcDAd1sadJL3oeWYD4s6yxjRtuJuUlFlrK4YsfI@z>

On Wed, 2015-03-25 at 19:33 +0100, Ingo Molnar wrote:
> * Laurent Dufour wrote:
>
> > +static inline void arch_unmap(struct mm_struct *mm,
> > +			struct vm_area_struct *vma,
> > +			unsigned long start, unsigned long end)
> > +{
> > +	if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
> > +		mm->context.vdso_base = 0;
> > +}
>
> So AFAICS PowerPC can have multi-page vDSOs, right?
>
> So what happens if I munmap() the middle or end of the vDSO? The above
> condition only seems to cover unmaps that affect the first page. I
> think 'affects any page' ought to be the right condition? (But I know
> nothing about PowerPC so I might be wrong.)

You are right, we have at least two pages.

>
> > +#define __HAVE_ARCH_REMAP
> > +static inline void arch_remap(struct mm_struct *mm,
> > +			      unsigned long old_start, unsigned long old_end,
> > +			      unsigned long new_start, unsigned long new_end)
> > +{
> > +	/*
> > +	 * mremap() doesn't allow moving multiple vmas so we can limit the
> > +	 * check to old_start == vdso_base.
> > +	 */
> > +	if (old_start == mm->context.vdso_base)
> > +		mm->context.vdso_base = new_start;
> > +}
>
> mremap() doesn't allow moving multiple vmas, but it allows the
> movement of multi-page vmas and it also allows partial mremap()s,
> where it will split up a vma.
>
> In particular, what happens if an mremap() is done with
> old_start == vdso_base, but a shorter end than the end of the vDSO?
> (i.e. a partial mremap() with fewer pages than the vDSO size)

Is there a way to forbid splitting ? Does x86 deal with that case at all
or it doesn't have to for some other reason ?

Cheers,
Ben.

> Thanks,
>
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from gate.crashing.org ([63.228.1.57]:57590 "EHLO gate.crashing.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750764AbbCYWfn (ORCPT ); Wed, 25 Mar 2015 18:35:43 -0400
Message-ID: <1427317867.6468.87.camel@kernel.crashing.org>
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
From: Benjamin Herrenschmidt
Date: Thu, 26 Mar 2015 08:11:07 +1100
In-Reply-To: <20150325183647.GA9331@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Ingo Molnar
Cc: Laurent Dufour , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
Message-ID: <20150325211107.KAahTP9Dj9XAqy5pwdZdUZSxeL_krTa_mwpholLaMK0@z>

On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
> * Ingo Molnar wrote:
>
> > > +#define __HAVE_ARCH_REMAP
> > > +static inline void arch_remap(struct mm_struct *mm,
> > > +			      unsigned long old_start, unsigned long old_end,
> > > +			      unsigned long new_start, unsigned long new_end)
> > > +{
> > > +	/*
> > > +	 * mremap() doesn't allow moving multiple vmas so we can limit the
> > > +	 * check to old_start == vdso_base.
> > > +	 */
> > > +	if (old_start == mm->context.vdso_base)
> > > +		mm->context.vdso_base = new_start;
> > > +}
> >
> > mremap() doesn't allow moving multiple vmas, but it allows the
> > movement of multi-page vmas and it also allows partial mremap()s,
> > where it will split up a vma.
>
> I.e. mremap() supports the shrinking (and growing) of vmas. In that
> case mremap() will unmap the end of the vma and will shrink the
> remaining vDSO vma.
>
> Doesn't that result in a non-working vDSO that should zero out
> vdso_base?

Right. Now we can't completely prevent the user from shooting itself in
the foot I suppose, though there is a legit usage scenario which is to
move the vDSO around which it would be nice to support. I think it's
reasonable to put the onus on the user here to do the right thing.

Cheers,
Ben.

> Thanks,
>
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wg0-f47.google.com ([74.125.82.47]:33473 "EHLO mail-wg0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750881AbbCZJng (ORCPT ); Thu, 26 Mar 2015 05:43:36 -0400
Date: Thu, 26 Mar 2015 10:43:30 +0100
From: Ingo Molnar
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
Message-ID: <20150326094330.GA15407@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1427317867.6468.87.camel@kernel.crashing.org>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Benjamin Herrenschmidt
Cc: Laurent Dufour , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org
Message-ID: <20150326094330.JaUr7toPts7zX7AfP6t_SjtFiO-lkbuAoXA7NJ96XPo@z>

* Benjamin Herrenschmidt wrote:

> On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
> > * Ingo Molnar wrote:
> >
> > > > +#define __HAVE_ARCH_REMAP
> > > > +static inline void arch_remap(struct mm_struct *mm,
> > > > +			      unsigned long old_start, unsigned long old_end,
> > > > +			      unsigned long new_start, unsigned long new_end)
> > > > +{
> > > > +	/*
> > > > +	 * mremap() doesn't allow moving multiple vmas so we can limit the
> > > > +	 * check to old_start == vdso_base.
> > > > +	 */
> > > > +	if (old_start == mm->context.vdso_base)
> > > > +		mm->context.vdso_base = new_start;
> > > > +}
> > >
> > > mremap() doesn't allow moving multiple vmas, but it allows the
> > > movement of multi-page vmas and it also allows partial mremap()s,
> > > where it will split up a vma.
> >
> > I.e. mremap() supports the shrinking (and growing) of vmas. In that
> > case mremap() will unmap the end of the vma and will shrink the
> > remaining vDSO vma.
> >
> > Doesn't that result in a non-working vDSO that should zero out
> > vdso_base?
>
> Right. Now we can't completely prevent the user from shooting itself
> in the foot I suppose, though there is a legit usage scenario which
> is to move the vDSO around which it would be nice to support. I
> think it's reasonable to put the onus on the user here to do the
> right thing.

I argue we should use the right condition to clear vdso_base: if the
vDSO gets at least partially unmapped. Otherwise there's little point
in the whole patch: either correctly track whether the vDSO is OK, or
don't ...
There's also the question of mprotect(): can users mprotect() the vDSO on PowerPC? Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:58159 "EHLO e06smtp11.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750843AbbCZKhs (ORCPT ); Thu, 26 Mar 2015 06:37:48 -0400 Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 10:37:47 -0000 Message-ID: <5513E16D.1030101@linux.vnet.ibm.com> Date: Thu, 26 Mar 2015 11:37:33 +0100 From: Laurent Dufour MIME-Version: 1.0 Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> In-Reply-To: <20150326094330.GA15407@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ingo Molnar , Benjamin Herrenschmidt Cc: Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org Message-ID: <20150326103733.8V9lo5CbQsRPED9TC_eqUWedmugC93wKLUQXjA_IEnY@z> On 26/03/2015 10:43, Ingo Molnar wrote: > > * Benjamin Herrenschmidt wrote: > >> On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: >>> * Ingo Molnar wrote: >>> >>>>> +#define __HAVE_ARCH_REMAP >>>>> +static inline void arch_remap(struct mm_struct *mm, >>>>> + unsigned long old_start, unsigned long old_end, >>>>> + unsigned long new_start, unsigned long new_end) >>>>> +{ >>>>> + /* >>>>> + * mremap() doesn't allow moving multiple vmas so we can limit the >>>>> + * check to old_start == vdso_base. >>>>> + */ >>>>> + if (old_start == mm->context.vdso_base) >>>>> + mm->context.vdso_base = new_start; >>>>> +} >>>> >>>> mremap() doesn't allow moving multiple vmas, but it allows the >>>> movement of multi-page vmas and it also allows partial mremap()s, >>>> where it will split up a vma. >>> >>> I.e. mremap() supports the shrinking (and growing) of vmas. In that >>> case mremap() will unmap the end of the vma and will shrink the >>> remaining vDSO vma. >>> >>> Doesn't that result in a non-working vDSO that should zero out >>> vdso_base? >> >> Right. Now we can't completely prevent the user from shooting itself >> in the foot I suppose, though there is a legit usage scenario which >> is to move the vDSO around which it would be nice to support. I >> think it's reasonable to put the onus on the user here to do the >> right thing. > > I argue we should use the right condition to clear vdso_base: if the > vDSO gets at least partially unmapped. Otherwise there's little point > in the whole patch: either correctly track whether the vDSO is OK, or > don't ... 
That's a good option, but it may be hard to achieve if the vDSO area has been split into multiple pieces. I'm not sure there is a right way to handle that; what we have here is a best effort, allowing a process to unmap its vDSO and have the sigreturn call done through the stack area (which it then has to make executable). Anyway, I'll dig into that, assuming that the vdso_base pointer should be cleared if any part of the vDSO is moved or unmapped. The patch will be larger since I'll have to get the vDSO size, which is private to the vdso.c file. > There's also the question of mprotect(): can users mprotect() the vDSO > on PowerPC? Yes, mprotect() on the vDSO is allowed on PowerPC, as it is on x86 and certainly on all the other architectures. Furthermore, if it is applied to only part of the vDSO, it splits the vma... From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp13.uk.ibm.com ([195.75.94.109]:45796 "EHLO e06smtp13.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752508AbbCZOcg (ORCPT ); Thu, 26 Mar 2015 10:32:36 -0400 Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Thu, 26 Mar 2015 14:32:34 -0000 Message-ID: <55141866.6080007@linux.vnet.ibm.com> Date: Thu, 26 Mar 2015 15:32:06 +0100 From: Laurent Dufour MIME-Version: 1.0 Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> <5513E16D.1030101@linux.vnet.ibm.com> <20150326141730.GA23060@gmail.com> In-Reply-To: <20150326141730.GA23060@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org Message-ID: <20150326143206.ujA6N_VYt_imH3WWv6wZZkRHLIf4QUdfiV4n26ukRzM@z> On 26/03/2015 15:17, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >>> I argue we should use the right condition to clear vdso_base: if >>> the vDSO gets at least partially unmapped. Otherwise there's >>> little point in the whole patch: either correctly track whether >>> the vDSO is OK, or don't ... >> >> That's a good option, but it may be hard to achieve in the case the >> vDSO area has been splitted in multiple pieces. >> >> Not sure there is a right way to handle that, here this is a best >> effort, allowing a process to unmap its vDSO and having the >> sigreturn call done through the stack area (it has to make it >> executable). 
>> >> Anyway I'll dig into that, assuming that the vdso_base pointer >> should be clear if a part of the vDSO is moved or unmapped. The >> patch will be larger since I'll have to get the vDSO size which is >> private to the vdso.c file. > > At least for munmap() I don't think that's a worry: once unmapped > (even if just partially), vdso_base becomes zero and won't ever be set > again. > > So no need to track the zillion pieces, should there be any: Humpty > Dumpty won't be whole again, right? My idea is to clear vdso_base if at least part of the vDSO is unmapped. But since some part of the vDSO may have been moved and then unmapped later, to be complete the patch has to handle partial mremap() of the vDSO too. Otherwise a scenario such as this one will not be detected: new_area = mremap(vdso_base + page_size, ....); munmap(new_area,...); >>> There's also the question of mprotect(): can users mprotect() the >>> vDSO on PowerPC? >> >> Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and >> certainly all the other architectures. Furthermore, if it is done on >> a partial part of the vDSO it is splitting the vma... > > btw., CRIU's main purpose here is to reconstruct a vDSO that was > originally randomized, but whose address must now be reproduced as-is, > right? You're right, CRIU has to move the vDSO to the same address it had at checkpoint time. > In that sense detecting the 'good' mremap() as your patch does should > do the trick and is certainly not objectionable IMHO - I was just > wondering whether we could make a perfect job very simply. I'll try to do the perfect job, though this may complicate the patch, especially because the vDSO's size is not recorded in the PowerPC mm_context structure. I'm not sure it is a good idea to extend that structure. Thanks, Laurent. 
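As a rough illustration of the rule discussed in this exchange (clear vdso_base as soon as any part of the vDSO is moved or unmapped, and update it only when the whole area moves), here is a minimal userspace sketch in C. It is a model, not kernel code: the function name track_vdso_base and the explicit vdso_len parameter are assumptions made for the example, and an unmap is modelled by passing 0 for both new_start and new_end.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the tracking rule; NOT the kernel code.
 * All ranges are half-open: [start, end).
 * Returns the value vdso_base should have after the operation.
 */
unsigned long track_vdso_base(unsigned long vdso_base, unsigned long vdso_len,
                              unsigned long old_start, unsigned long old_end,
                              unsigned long new_start, unsigned long new_end)
{
    unsigned long vdso_end;

    /* 0 means "no usable vDSO"; once cleared it is never set again. */
    if (!vdso_base)
        return 0;

    assert(vdso_len > 0);
    vdso_end = vdso_base + vdso_len;

    /* The touched range does not overlap the vDSO: nothing to do. */
    if (old_start >= vdso_end || vdso_base >= old_end)
        return vdso_base;

    /* The whole vDSO moved, with an unchanged length: follow it. */
    if (old_start == vdso_base && old_end == vdso_end &&
        (new_end - new_start) == vdso_len)
        return new_start;

    /* Partial move, partial unmap, or resize: invalidate. */
    return 0;
}
```

With a three-page vDSO at 0x1000, moving the whole area to 0x8000 yields 0x8000, while moving or unmapping only its middle page yields 0. The overlap test here uses the usual two-comparison form for half-open ranges, which should be equivalent for non-empty ranges to a longer check that enumerates each overlap case.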
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp12.uk.ibm.com ([195.75.94.108]:54621 "EHLO e06smtp12.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752058AbbCZRiC (ORCPT ); Thu, 26 Mar 2015 13:38:02 -0400 Received: from /spool/local by e06smtp12.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 17:38:00 -0000 From: Laurent Dufour Subject: [PATCH v4 0/2] Tracking user space vDSO remaping Date: Thu, 26 Mar 2015 18:37:51 +0100 Message-ID: In-Reply-To: <20150326141730.GA23060@gmail.com> References: <20150326141730.GA23060@gmail.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150326173751.1ZdVNhf7SKwdtA2FJ2Uj-Qj2NPBnZdMoso7eVKcE9Oo@z> CRIU recreates the process memory layout by remapping the checkpointee memory area on top of the current process (criu). This includes remapping the vDSO to the place it had at checkpoint time. However, some architectures like powerpc keep a reference to the vDSO base address in order to build the signal return stack frame, which calls the vDSO sigreturn service. So once the vDSO has been moved, this reference is no longer valid and the signal frames built later are not usable. This patch series introduces a new mm hook, 'arch_remap', which is called when mremap is done and the mm lock is still held. The next patch adds the vDSO remap and unmap tracking to the powerpc architecture. 
Changes in v4: -------------- - Reviewing the PowerPC part of the patch to handle partial unmap and remap of the vDSO. Changes in v3: -------------- - Fixed grammatical error in a comment of the second patch. Thanks again, Ingo. Changes in v2: -------------- - Following the Ingo Molnar's advice, enabling the call to arch_remap through the __HAVE_ARCH_REMAP macro. This reduces considerably the first patch. Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++- arch/powerpc/kernel/vdso.c | 39 ++++++++++++++++++++++++++++++++++ mm/mremap.c | 11 ++++++++-- 3 files changed, 79 insertions(+), 3 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp14.uk.ibm.com ([195.75.94.110]:38449 "EHLO e06smtp14.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752101AbbCZRiH (ORCPT ); Thu, 26 Mar 2015 13:38:07 -0400 Received: from /spool/local by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 17:38:05 -0000 From: Laurent Dufour Subject: [PATCH v4 1/2] mm: Introducing arch_remap hook Date: Thu, 26 Mar 2015 18:37:52 +0100 Message-ID: In-Reply-To: References: In-Reply-To: References: <20150326141730.GA23060@gmail.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150326173752.qL-bLPdFIaHBhQcNA_UWgSDTA-eqEfpunbSYoRuv8x8@z> Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:54991 "EHLO e06smtp11.uk.ibm.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751533AbbCZRiD (ORCPT ); Thu, 26 Mar 2015 13:38:03 -0400 Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 17:38:01 -0000 From: Laurent Dufour Subject: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Date: Thu, 26 Mar 2015 18:37:53 +0100 Message-ID: <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: <20150326141730.GA23060@gmail.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Message-ID: <20150326173753.H3lQocgq1CVd9I0T53YqBniJ7-yWuqdwHvh9tyCddrY@z> Some processes (CRIU) move the vDSO area using the mremap system call. As a consequence, the kernel reference to the vDSO base address is no longer valid, and the signal return frame built once the vDSO has been moved does not point to the new sigreturn address. This patch handles vDSO remapping and unmapping. Moving or unmapping only part of the vDSO invalidates it from the kernel's point of view. 
Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++- arch/powerpc/kernel/vdso.c | 39 ++++++++++++++++++++++++++++++++++ 2 files changed, 70 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..67734ce8be67 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,36 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +extern void arch_vdso_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end); +static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + arch_vdso_remap(mm, start, end, 0, 0); +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + arch_vdso_remap(mm, old_start, old_end, new_start, new_end); +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 305eb0d9b768..a11b5d8f36d6 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -283,6 +283,45 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) return rc; } +void arch_vdso_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + unsigned long vdso_end, vdso_start; + + if (!mm->context.vdso_base) + 
return; + vdso_start = mm->context.vdso_base; + +#ifdef CONFIG_PPC64 + /* Calling is_32bit_task() implies that we are dealing with the + * current process memory. If there is a call path where mm is not + * owned by the current task, then we'll have need to store the + * vDSO size in the mm->context. + */ + BUG_ON(current->mm != mm); + if (is_32bit_task()) + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); + else + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); +#else + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); +#endif + vdso_end += (1<<PAGE_SHIFT); + + /* Check if the vDSO is in the range of the remapped area */ + if ((vdso_start <= old_start && old_start < vdso_end) || + (vdso_start < old_end && old_end <= vdso_end) || + (old_start <= vdso_start && vdso_start < old_end)) { + /* Update vdso_base if the vDSO is entirely moved. */ + if (old_start == vdso_start && old_end == vdso_end && + (old_end - old_start) == (new_end - new_start)) + mm->context.vdso_base = new_start; + else + mm->context.vdso_base = 0; + } +} + const char *arch_vma_name(struct vm_area_struct *vma) { if (vma->vm_mm && vma->vm_start == vma->vm_mm->context.vdso_base) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f52.google.com ([74.125.82.52]:33914 "EHLO mail-wg0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752936AbbCZSz4 (ORCPT ); Thu, 26 Mar 2015 14:55:56 -0400 Date: Thu, 26 Mar 2015 19:55:50 +0100 From: Ingo Molnar Subject: Re: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326185550.GA25547@gmail.com> References: <20150326141730.GA23060@gmail.com> <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org Message-ID: <20150326185550.KOrutOh_uY-6WNfkzdJPnfC7ujwxuZRWZRb539bXZ04@z> * Laurent Dufour wrote: > +{ > + unsigned long vdso_end, vdso_start; > + > + if (!mm->context.vdso_base) > + return; > + vdso_start = mm->context.vdso_base; > + > +#ifdef CONFIG_PPC64 > + /* Calling is_32bit_task() implies that we are dealing with the > + * current process memory. If there is a call path where mm is not > + * owned by the current task, then we'll have need to store the > + * vDSO size in the mm->context. > + */ > + BUG_ON(current->mm != mm); > + if (is_32bit_task()) > + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); > + else > + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); > +#else > + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); > +#endif > + vdso_end += (1<<PAGE_SHIFT); > + > + /* Check if the vDSO is in the range of the remapped area */ > + if ((vdso_start <= old_start && old_start < vdso_end) || > + (vdso_start < old_end && old_end <= vdso_end) || > + (old_start <= vdso_start && vdso_start < old_end)) { > + /* Update vdso_base if the vDSO is entirely moved. */ > + if (old_start == vdso_start && old_end == vdso_end && > + (old_end - old_start) == (new_end - new_start)) > + mm->context.vdso_base = new_start; > + else > + mm->context.vdso_base = 0; > + } > +} Oh my, that really looks awfully complex, as you predicted, and right in every mremap() call. I'm fine with your original, imperfect, KISS approach. Sorry about this detour ... 
Reviewed-by: Ingo Molnar Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org ([63.228.1.57]:59729 "EHLO gate.crashing.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752101AbbC0AGE (ORCPT ); Thu, 26 Mar 2015 20:06:04 -0400 Message-ID: <1427412183.6468.148.camel@kernel.crashing.org> Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap From: Benjamin Herrenschmidt Date: Fri, 27 Mar 2015 10:23:03 +1100 In-Reply-To: <20150326094330.GA15407@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ingo Molnar Cc: Laurent Dufour , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org Message-ID: <20150326232303.XYEFCHQtNz1FUf33nfKxJtUN9_Jb0Y_imb5EDjknMmI@z> On Thu, 2015-03-26 at 10:43 +0100, Ingo Molnar wrote: > * Benjamin Herrenschmidt wrote: > > > On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: > > > * Ingo Molnar wrote: > > > > > > > > +#define __HAVE_ARCH_REMAP > > > > > +static inline void arch_remap(struct mm_struct *mm, > > > > > + unsigned long old_start, unsigned long old_end, > > > > > + unsigned long new_start, unsigned long new_end) > > > > > +{ > > > > > + /* > > > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > > > + * check to old_start == vdso_base. 
> > > > > + */ > > > > > + if (old_start == mm->context.vdso_base) > > > > > + mm->context.vdso_base = new_start; > > > > > +} > > > > > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > > > movement of multi-page vmas and it also allows partial mremap()s, > > > > where it will split up a vma. > > > > > > I.e. mremap() supports the shrinking (and growing) of vmas. In that > > > case mremap() will unmap the end of the vma and will shrink the > > > remaining vDSO vma. > > > > > > Doesn't that result in a non-working vDSO that should zero out > > > vdso_base? > > > > Right. Now we can't completely prevent the user from shooting itself > > in the foot I suppose, though there is a legit usage scenario which > > is to move the vDSO around which it would be nice to support. I > > think it's reasonable to put the onus on the user here to do the > > right thing. > > I argue we should use the right condition to clear vdso_base: if the > vDSO gets at least partially unmapped. Otherwise there's little point > in the whole patch: either correctly track whether the vDSO is OK, or > don't ... Well, if we are going to clear it at all then yes, we should probably be a bit smarter about it. My point, however, was that we probably don't need to be super robust about dealing with any crazy scenario userspace might conceive. > There's also the question of mprotect(): can users mprotect() the vDSO > on PowerPC? Nothing prevents it. But here too, I wouldn't bother. The user might be doing it on purpose, expecting to catch the resulting signal for example (though arguably a signal from a sigreturn frame is ... odd). Cheers, Ben. 
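For comparison, the simpler rule quoted at the top of this message (follow the vDSO only when the moved range starts exactly at vdso_base) can be modelled in a few lines of userspace C. This is only a sketch for illustration: kiss_vdso_base is a hypothetical name, and the model deliberately ignores range lengths, which is exactly the imperfection raised in the review.

```c
/*
 * Hypothetical userspace model of the simple rule; NOT the kernel code.
 * Returns the value vdso_base would have after an mremap() that moves
 * a range starting at old_start to new_start.
 */
unsigned long kiss_vdso_base(unsigned long vdso_base,
                             unsigned long old_start,
                             unsigned long new_start)
{
    /* Only the start address is compared; the length is never checked. */
    if (old_start == vdso_base)
        return new_start;

    /* Any other move, including one from the middle of the vDSO,
     * leaves the recorded base untouched. */
    return vdso_base;
}
```

A whole-vDSO move is tracked correctly, but a shorter mremap() that begins at vdso_base is treated the same way, and a move that begins inside the vDSO goes unnoticed. That trade-off is what the thread weighs against the cost of a full range check on every mremap() call.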
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.codeaurora.org ([198.145.29.96]:39758 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755176AbcCBMN0 (ORCPT ); Wed, 2 Mar 2016 07:13:26 -0500 Subject: Re: [PATCH 0/2] Tracking user space vDSO remaping References: From: Christopher Covington Message-ID: <56D6D8D6.6060306@codeaurora.org> Date: Wed, 2 Mar 2016 07:13:10 -0500 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, criu@openvz.org, "linux-arm-kernel@lists.infradead.org" , Will Deacon , Laura Abbott , David Brown Message-ID: <20160302121310.FQdB4rXZ-ILwaOvGAWh3YDsMc3WfXUT34y-qI2njy_U@z> Hi, On 03/20/2015 11:53 AM, Laurent Dufour wrote: > CRIU is recreating the process memory layout by remapping the checkpointee > memory area on top of the current process (criu). This includes remapping > the vDSO to the place it has at checkpoint time. > > However some architectures like powerpc are keeping a reference to the vDSO > base address to build the signal return stack frame by calling the vDSO > sigreturn service. So once the vDSO has been moved, this reference is no > more valid and the signal frame built later are not usable. > > This patch serie is introducing a new mm hook 'arch_remap' which is called > when mremap is done and the mm lock still hold. The next patch is adding the > vDSO remap and unmap tracking to the powerpc architecture. 
> > Laurent Dufour (2): > mm: Introducing arch_remap hook > powerpc/mm: Tracking vDSO remap > > arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++- > arch/s390/include/asm/mmu_context.h | 6 ++++++ > arch/um/include/asm/mmu_context.h | 5 +++++ > arch/unicore32/include/asm/mmu_context.h | 6 ++++++ > arch/x86/include/asm/mmu_context.h | 6 ++++++ > include/asm-generic/mm_hooks.h | 6 ++++++ > mm/mremap.c | 9 ++++++-- > 7 files changed, 70 insertions(+), 3 deletions(-) We would like to be able to remap/unmap the VDSO on arm and arm64 as well. When I proposed a patch with mmu_context.h and mmu-arch-hooks.h changes to arm64 that were nearly identical to those done to powerpc, Will Deacon reasonably suggested [1] attempting to combine the code and provide generic VDSO accessors. Unfortunately, I have no prior experience with generic MM code. Can anyone advise on how to get started with that? 1. http://www.spinics.net/lists/linux-arm-msm/msg18441.html Thanks, Christopher Covington -- Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. 
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Message-ID: <1427412183.6468.148.camel@kernel.crashing.org>
From: Benjamin Herrenschmidt
Date: Fri, 27 Mar 2015 10:23:03 +1100
In-Reply-To: <20150326094330.GA15407@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com>
Mime-Version: 1.0
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: linuxppc-dev-bounces+geert=linux-m68k.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
To: Ingo Molnar
Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org
List-ID:

On Thu, 2015-03-26 at 10:43 +0100, Ingo Molnar wrote:
> * Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> > On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
> > > * Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > > > > +#define __HAVE_ARCH_REMAP
> > > > > +static inline void arch_remap(struct mm_struct *mm,
> > > > > +			      unsigned long old_start, unsigned long old_end,
> > > > > +			      unsigned long new_start, unsigned long new_end)
> > > > > +{
> > > > > +	/*
> > > > > +	 * mremap() doesn't allow moving multiple vmas so we can limit the
> > > > > +	 * check to old_start == vdso_base.
> > > > > +	 */
> > > > > +	if (old_start == mm->context.vdso_base)
> > > > > +		mm->context.vdso_base = new_start;
> > > > > +}
> > > >
> > > > mremap() doesn't allow moving multiple vmas, but it allows the
> > > > movement of multi-page vmas and it also allows partial mremap()s,
> > > > where it will split up a vma.
> > >
> > > I.e. mremap() supports the shrinking (and growing) of vmas. In that
> > > case mremap() will unmap the end of the vma and will shrink the
> > > remaining vDSO vma.
> > >
> > > Doesn't that result in a non-working vDSO that should zero out
> > > vdso_base?
> >
> > Right. Now we can't completely prevent the user from shooting itself
> > in the foot I suppose, though there is a legit usage scenario which
> > is to move the vDSO around which it would be nice to support. I
> > think it's reasonable to put the onus on the user here to do the
> > right thing.
>
> I argue we should use the right condition to clear vdso_base: if the
> vDSO gets at least partially unmapped. Otherwise there's little point
> in the whole patch: either correctly track whether the vDSO is OK, or
> don't ...

Well, if we are going to clear it at all yes, we should probably be a
bit smarter about it. My point however was we probably don't need to be
super robust about dealing with any crazy scenario userspace might
conceive.

> There's also the question of mprotect(): can users mprotect() the vDSO
> on PowerPC?

Nothing prevents it. But here too, I wouldn't bother. The user might be
doing on purpose expecting to catch the resulting signal for example
(though arguably a signal from a sigreturn frame is ... odd).

Cheers,
Ben.

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Message-ID: <1427317867.6468.87.camel@kernel.crashing.org>
From: Benjamin Herrenschmidt
Date: Thu, 26 Mar 2015 08:11:07 +1100
In-Reply-To: <20150325183647.GA9331@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com>
Mime-Version: 1.0
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: linuxppc-dev-bounces+geert=linux-m68k.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
To: Ingo Molnar
Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H.
Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org
List-ID:

On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote:
> * Ingo Molnar <mingo@kernel.org> wrote:
>
> > > +#define __HAVE_ARCH_REMAP
> > > +static inline void arch_remap(struct mm_struct *mm,
> > > +			      unsigned long old_start, unsigned long old_end,
> > > +			      unsigned long new_start, unsigned long new_end)
> > > +{
> > > +	/*
> > > +	 * mremap() doesn't allow moving multiple vmas so we can limit the
> > > +	 * check to old_start == vdso_base.
> > > +	 */
> > > +	if (old_start == mm->context.vdso_base)
> > > +		mm->context.vdso_base = new_start;
> > > +}
> >
> > mremap() doesn't allow moving multiple vmas, but it allows the
> > movement of multi-page vmas and it also allows partial mremap()s,
> > where it will split up a vma.
>
> I.e. mremap() supports the shrinking (and growing) of vmas. In that
> case mremap() will unmap the end of the vma and will shrink the
> remaining vDSO vma.
>
> Doesn't that result in a non-working vDSO that should zero out
> vdso_base?

Right. Now we can't completely prevent the user from shooting itself in
the foot I suppose, though there is a legit usage scenario which is to
move the vDSO around which it would be nice to support. I think it's
reasonable to put the onus on the user here to do the right thing.

Cheers,
Ben.

> Thanks,
>
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Message-ID: <1427317797.6468.86.camel@kernel.crashing.org>
From: Benjamin Herrenschmidt
Date: Thu, 26 Mar 2015 08:09:57 +1100
In-Reply-To: <20150325183316.GA9090@gmail.com>
References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com>
Mime-Version: 1.0
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: linuxppc-dev-bounces+geert=linux-m68k.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap
To: Ingo Molnar
Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H.
Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org
List-ID:

On Wed, 2015-03-25 at 19:33 +0100, Ingo Molnar wrote:
> * Laurent Dufour <ldufour@linux.vnet.ibm.com> wrote:
>
> > +static inline void arch_unmap(struct mm_struct *mm,
> > +			struct vm_area_struct *vma,
> > +			unsigned long start, unsigned long end)
> > +{
> > +	if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
> > +		mm->context.vdso_base = 0;
> > +}
>
> So AFAICS PowerPC can have multi-page vDSOs, right?
>
> So what happens if I munmap() the middle or end of the vDSO? The above
> condition only seems to cover unmaps that affect the first page. I
> think 'affects any page' ought to be the right condition? (But I know
> nothing about PowerPC so I might be wrong.)

You are right, we have at least two pages.
>
> > +#define __HAVE_ARCH_REMAP
> > +static inline void arch_remap(struct mm_struct *mm,
> > +			      unsigned long old_start, unsigned long old_end,
> > +			      unsigned long new_start, unsigned long new_end)
> > +{
> > +	/*
> > +	 * mremap() doesn't allow moving multiple vmas so we can limit the
> > +	 * check to old_start == vdso_base.
> > +	 */
> > +	if (old_start == mm->context.vdso_base)
> > +		mm->context.vdso_base = new_start;
> > +}
>
> mremap() doesn't allow moving multiple vmas, but it allows the
> movement of multi-page vmas and it also allows partial mremap()s,
> where it will split up a vma.
>
> In particular, what happens if an mremap() is done with
> old_start == vdso_base, but a shorter end than the end of the vDSO?
> (i.e. a partial mremap() with fewer pages than the vDSO size)

Is there a way to forbid splitting ? Does x86 deal with that case at all
or it doesn't have to for some other reason ?

Cheers,
Ben.

> Thanks,
>
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from e06smtp11.uk.ibm.com (e06smtp11.uk.ibm.com [195.75.94.107]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 354451A2A8C for ; Sat, 21 Mar 2015 02:53:40 +1100 (AEDT)
Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Fri, 20 Mar 2015 15:53:37 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 65F9E1B08061 for ; Fri, 20 Mar 2015 15:53:58 +0000 (GMT) Received: from d06av04.portsmouth.uk.ibm.com (d06av04.portsmouth.uk.ibm.com [9.149.37.216]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2KFrYhl9830748 for ; Fri, 20 Mar 2015 15:53:34 GMT Received: from d06av04.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av04.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2KFrVt1005488 for ; Fri, 20 Mar 2015 09:53:33 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 0/2] Tracking user space vDSO remaping Date: Fri, 20 Mar 2015 16:53:26 +0100 Message-Id: Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , CRIU is recreating the process memory layout by remapping the checkpointee memory area on top of the current process (criu). This includes remapping the vDSO to the place it has at checkpoint time. However some architectures like powerpc are keeping a reference to the vDSO base address to build the signal return stack frame by calling the vDSO sigreturn service. So once the vDSO has been moved, this reference is no more valid and the signal frame built later are not usable. 
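
To make the failure mode concrete, here is a minimal user-space sketch (not kernel code). It assumes a simplified model in which the trampoline address stored in a signal frame is the cached vDSO base plus a fixed offset. The names struct mm_model, SIGTRAMP_OFFSET, sigtramp_addr() and sigtramp_is_valid() are hypothetical and exist only for illustration; the real powerpc signal code is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm state that caches the vDSO base. */
struct mm_model {
	unsigned long vdso_base;	/* cached start address of the vDSO */
};

/* Made-up offset of the sigreturn trampoline inside the vDSO (assumption). */
#define SIGTRAMP_OFFSET 0x400UL

/* Address the model would place in a signal frame as the return trampoline. */
unsigned long sigtramp_addr(const struct mm_model *mm)
{
	return mm->vdso_base + SIGTRAMP_OFFSET;
}

/*
 * True when the trampoline address derived from the cached base still lies
 * inside the range where the vDSO is really mapped.
 */
bool sigtramp_is_valid(const struct mm_model *mm,
		       unsigned long real_start, unsigned long len)
{
	unsigned long addr = sigtramp_addr(mm);

	return addr >= real_start && addr - real_start < len;
}
```

In this model, moving the vDSO without updating vdso_base leaves sigtramp_addr() pointing into the old range, which is the stale reference described above.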
This patch serie is introducing a new mm hook 'arch_remap' which is called when mremap is done and the mm lock still hold. The next patch is adding the vDSO remap and unmap tracking to the powerpc architecture. Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++- arch/s390/include/asm/mmu_context.h | 6 ++++++ arch/um/include/asm/mmu_context.h | 5 +++++ arch/unicore32/include/asm/mmu_context.h | 6 ++++++ arch/x86/include/asm/mmu_context.h | 6 ++++++ include/asm-generic/mm_hooks.h | 6 ++++++ mm/mremap.c | 9 ++++++-- 7 files changed, 70 insertions(+), 3 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp15.uk.ibm.com (e06smtp15.uk.ibm.com [195.75.94.111]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id DE1A11A2A8C for ; Sat, 21 Mar 2015 02:53:42 +1100 (AEDT) Received: from /spool/local by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 20 Mar 2015 15:53:39 -0000 Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id 0FC6417D8062 for ; Fri, 20 Mar 2015 15:54:04 +0000 (GMT) Received: from d06av04.portsmouth.uk.ibm.com (d06av04.portsmouth.uk.ibm.com [9.149.37.216]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2KFrcLd5308796 for ; Fri, 20 Mar 2015 15:53:38 GMT Received: from d06av04.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av04.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2KFrXBI005569 for ; Fri, 20 Mar 2015 09:53:37 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 1/2] mm: Introducing arch_remap hook Date: Fri, 20 Mar 2015 16:53:27 +0100 Message-Id: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). To no break the build, this patch adds the empty hook definition to the architectures that were not using the generic hook's definition. 
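
The ordering matters for an architecture that implements both hooks. Below is a small user-space model of it (a sketch, not kernel code). The two hook bodies mirror the checks used in patch 2/2 of this series; struct mm_model and model_move() are hypothetical names, and model_move() only stands in for the relevant part of a successful move.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the mm state holding the cached vDSO base. */
struct mm_model {
	unsigned long vdso_base;
};

/* Same condition as the powerpc arch_unmap() in patch 2/2. */
void model_arch_unmap(struct mm_model *mm,
		      unsigned long start, unsigned long end)
{
	if (start <= mm->vdso_base && mm->vdso_base < end)
		mm->vdso_base = 0;
}

/* Same condition as the powerpc arch_remap() in patch 2/2. */
void model_arch_remap(struct mm_model *mm,
		      unsigned long old_start, unsigned long old_end,
		      unsigned long new_start, unsigned long new_end)
{
	(void)old_end;		/* unused by this check */
	(void)new_end;		/* unused by this check */
	if (old_start == mm->vdso_base)
		mm->vdso_base = new_start;
}

/*
 * Hypothetical model of a successful move. With remap_first set, the hooks
 * run in the order this patch uses (arch_remap, then the unmap of the old
 * range); otherwise the order is swapped for comparison.
 */
void model_move(struct mm_model *mm, unsigned long old_addr,
		unsigned long new_addr, unsigned long len, bool remap_first)
{
	if (remap_first) {
		model_arch_remap(mm, old_addr, old_addr + len,
				 new_addr, new_addr + len);
		model_arch_unmap(mm, old_addr, old_addr + len);
	} else {
		model_arch_unmap(mm, old_addr, old_addr + len);
		model_arch_remap(mm, old_addr, old_addr + len,
				 new_addr, new_addr + len);
	}
}
```

With the patch's order the cached base follows the move. With the swapped order the unmap check clears the base first, and the remap check then no longer matches.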
Signed-off-by: Laurent Dufour
---
 arch/s390/include/asm/mmu_context.h      | 6 ++++++
 arch/um/include/asm/mmu_context.h        | 5 +++++
 arch/unicore32/include/asm/mmu_context.h | 6 ++++++
 arch/x86/include/asm/mmu_context.h       | 6 ++++++
 include/asm-generic/mm_hooks.h           | 6 ++++++
 mm/mremap.c                              | 9 +++++++--
 6 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 8fb3802f8fad..ddd861a490ba 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif /* __S390_MMU_CONTEXT_H */
diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h
index 941527e507f7..f499b017c1f9 100644
--- a/arch/um/include/asm/mmu_context.h
+++ b/arch/um/include/asm/mmu_context.h
@@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 				     struct vm_area_struct *vma)
 {
 }
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
 /*
  * end asm-generic/mm_hooks.h functions
  */
diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h
index 1cb5220afaf9..39a0a553172e 100644
--- a/arch/unicore32/include/asm/mmu_context.h
+++ b/arch/unicore32/include/asm/mmu_context.h
@@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 883f6b933fa4..75cb71f4be1e 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
 	mpx_notify_unmap(mm, vma, start, end);
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index 866aa461efa5..e507f4783a5b 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -26,4 +26,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+}
+
 #endif /* _ASM_GENERIC_MM_HOOKS_H */
diff --git a/mm/mremap.c b/mm/mremap.c
index 57dadc025c64..6a409ca09425 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -25,6 +25,7 @@
 
 #include
 #include
+#include
 
 #include "internal.h"
 
@@ -286,8 +287,12 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 		old_len = new_len;
 		old_addr = new_addr;
 		new_addr = -ENOMEM;
-	} else if (vma->vm_file && vma->vm_file->f_op->mremap)
-		vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+	} else {
+		if (vma->vm_file && vma->vm_file->f_op->mremap)
+			vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+		arch_remap(mm, old_addr, old_addr+old_len,
+			   new_addr, new_addr+new_len);
+	}
 
 	/* Conceal VM_ACCOUNT so old reservation is not undone */
 	if (vm_flags & VM_ACCOUNT) {
-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from e06smtp15.uk.ibm.com (e06smtp15.uk.ibm.com [195.75.94.111]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id DEE101A2A98 for ; Sat, 21 Mar 2015 02:53:45 +1100 (AEDT)
Received: from /spool/local by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Fri, 20 Mar 2015 15:53:42 -0000 Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id EDC8D1B08067 for ; Fri, 20 Mar 2015 15:54:04 +0000 (GMT) Received: from d06av04.portsmouth.uk.ibm.com (d06av04.portsmouth.uk.ibm.com [9.149.37.216]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2KFre5f11141622 for ; Fri, 20 Mar 2015 15:53:40 GMT Received: from d06av04.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av04.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2KFrau1005657 for ; Fri, 20 Mar 2015 09:53:40 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 2/2] powerpc/mm: Tracking vDSO remap Date: Fri, 20 Mar 2015 16:53:28 +0100 Message-Id: <462eda8901babf0a08b5ef642684ae1c6303bd5b.1426866405.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. 
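
As a rough illustration of what the unmap tracking covers, the user-space sketch below compares the base-only condition used by arch_unmap() in this patch with a hypothetical any-page overlap test. The vDSO length (vdso_len) is an assumption of the model, since this patch tracks only vdso_base, and both function names are made up for the example.

```c
#include <stdbool.h>

/*
 * Condition used by arch_unmap() in this patch: does the unmapped range
 * [start, end) contain the vDSO base address?
 */
bool unmap_hits_vdso_base(unsigned long vdso_base,
			  unsigned long start, unsigned long end)
{
	return start <= vdso_base && vdso_base < end;
}

/*
 * Hypothetical alternative: does [start, end) overlap any part of
 * [vdso_base, vdso_base + vdso_len)? vdso_len is assumed to be known.
 */
bool unmap_overlaps_vdso(unsigned long vdso_base, unsigned long vdso_len,
			 unsigned long start, unsigned long end)
{
	return start < vdso_base + vdso_len && vdso_base < end;
}
```

For a two-page vDSO, unmapping only the second page does not satisfy the first condition but does satisfy the second, so the two checks differ exactly when an unmap leaves the first page in place.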
Signed-off-by: Laurent Dufour
---
 arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 73382eba02dc..ce7fc93518ee 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -8,7 +8,6 @@
 #include
 #include
 #include
-#include
 #include
 
 /*
@@ -109,5 +108,39 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
 #endif
 }
 
+static inline void arch_dup_mmap(struct mm_struct *oldmm,
+				 struct mm_struct *mm)
+{
+}
+
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+			struct vm_area_struct *vma,
+			unsigned long start, unsigned long end)
+{
+	if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
+		mm->context.vdso_base = 0;
+}
+
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+				     struct vm_area_struct *vma)
+{
+}
+
+static inline void arch_remap(struct mm_struct *mm,
+			      unsigned long old_start, unsigned long old_end,
+			      unsigned long new_start, unsigned long new_end)
+{
+	/*
+	 * mremap don't allow moving multiple vma so we can limit the check
+	 * to old_start == vdso_base.
+	 */
+	if (old_start == mm->context.vdso_base)
+		mm->context.vdso_base = new_start;
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
-- 
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from radon.swed.at (a.ns.miles-group.at [95.130.255.143]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id EB6651A02EE for ; Sat, 21 Mar 2015 10:26:30 +1100 (AEDT)
Message-ID: <550CAB0A.8070402@nod.at>
Date: Sat, 21 Mar 2015 00:19:38 +0100
From: Richard Weinberger
MIME-Version: 1.0
To: Laurent Dufour , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook
References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com>
In-Reply-To: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com>
Content-Type: text/plain; charset=iso-8859-15
Cc: criu@openvz.org, cov@codeaurora.org
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

On 20.03.2015 at 16:53, Laurent Dufour wrote:
> Some architecture would like to be triggered when a memory area is moved
> through the mremap system call.
>
> This patch is introducing a new arch_remap mm hook which is placed in the
> path of mremap, and is called before the old area is unmapped (and the
> arch_unmap hook is called).
>
> To no break the build, this patch adds the empty hook definition to the
> architectures that were not using the generic hook's definition.
Just wanted to point out that I like that new hook as UserModeLinux can benefit from it. UML has the concept of stub pages where the UML host process can inject commands to guest processes. Currently we play nasty games in the TLB code to make all this work. arch_unmap() could make this stuff more clear and less error prone. Thanks, //richard From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-x233.google.com (mail-wg0-x233.google.com [IPv6:2a00:1450:400c:c00::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id CCD401A2B37 for ; Mon, 23 Mar 2015 19:52:16 +1100 (AEDT) Received: by wgbcc7 with SMTP id cc7so139972204wgb.0 for ; Mon, 23 Mar 2015 01:52:13 -0700 (PDT) Sender: Ingo Molnar Date: Mon, 23 Mar 2015 09:52:09 +0100 From: Ingo Molnar To: Laurent Dufour Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook Message-ID: <20150323085209.GA28965@gmail.com> References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Laurent Dufour wrote: > Some architecture would like to be triggered when a memory area is moved > through the mremap system call. 
> > This patch is introducing a new arch_remap mm hook which is placed in the > path of mremap, and is called before the old area is unmapped (and the > arch_unmap hook is called). > > To no break the build, this patch adds the empty hook definition to the > architectures that were not using the generic hook's definition. > > Signed-off-by: Laurent Dufour > --- > arch/s390/include/asm/mmu_context.h | 6 ++++++ > arch/um/include/asm/mmu_context.h | 5 +++++ > arch/unicore32/include/asm/mmu_context.h | 6 ++++++ > arch/x86/include/asm/mmu_context.h | 6 ++++++ > include/asm-generic/mm_hooks.h | 6 ++++++ > mm/mremap.c | 9 +++++++-- > 6 files changed, 36 insertions(+), 2 deletions(-) > > diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h > index 8fb3802f8fad..ddd861a490ba 100644 > --- a/arch/s390/include/asm/mmu_context.h > +++ b/arch/s390/include/asm/mmu_context.h > @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > { > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif /* __S390_MMU_CONTEXT_H */ > diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h > index 941527e507f7..f499b017c1f9 100644 > --- a/arch/um/include/asm/mmu_context.h > +++ b/arch/um/include/asm/mmu_context.h > @@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > struct vm_area_struct *vma) > { > } > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > /* > * end asm-generic/mm_hooks.h functions > */ > diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h > index 1cb5220afaf9..39a0a553172e 100644 > --- a/arch/unicore32/include/asm/mmu_context.h > +++ b/arch/unicore32/include/asm/mmu_context.h > @@ 
-97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > { > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif > diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h > index 883f6b933fa4..75cb71f4be1e 100644 > --- a/arch/x86/include/asm/mmu_context.h > +++ b/arch/x86/include/asm/mmu_context.h > @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, > mpx_notify_unmap(mm, vma, start, end); > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif /* _ASM_X86_MMU_CONTEXT_H */ So instead of spreading these empty prototypes around mmu_context.h files, why not add something like this to the PPC definition: #define __HAVE_ARCH_REMAP and define the empty prototype for everyone else? It's a bit like how the __HAVE_ARCH_PTEP_* namespace works. That should shrink this patch considerably. Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp14.uk.ibm.com (e06smtp14.uk.ibm.com [195.75.94.110]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id CA61C1A2B10 for ; Mon, 23 Mar 2015 20:11:33 +1100 (AEDT) Received: from /spool/local by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Mon, 23 Mar 2015 09:11:29 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id 7B39317D8063 for ; Mon, 23 Mar 2015 09:11:54 +0000 (GMT) Received: from d06av03.portsmouth.uk.ibm.com (d06av03.portsmouth.uk.ibm.com [9.149.37.213]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2N9BRix11665712 for ; Mon, 23 Mar 2015 09:11:27 GMT Received: from d06av03.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av03.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2N9BMHN011313 for ; Mon, 23 Mar 2015 03:11:27 -0600 Message-ID: <550FD8B6.305@linux.vnet.ibm.com> Date: Mon, 23 Mar 2015 10:11:18 +0100 From: Laurent Dufour MIME-Version: 1.0 To: Ingo Molnar Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> <20150323085209.GA28965@gmail.com> In-Reply-To: <20150323085209.GA28965@gmail.com> Content-Type: text/plain; charset=windows-1252 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 23/03/2015 09:52, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> Some architecture would like to be triggered when a memory area is moved >> through the mremap system call. 
>> >> This patch is introducing a new arch_remap mm hook which is placed in the >> path of mremap, and is called before the old area is unmapped (and the >> arch_unmap hook is called). >> >> To no break the build, this patch adds the empty hook definition to the >> architectures that were not using the generic hook's definition. >> >> Signed-off-by: Laurent Dufour >> --- >> arch/s390/include/asm/mmu_context.h | 6 ++++++ >> arch/um/include/asm/mmu_context.h | 5 +++++ >> arch/unicore32/include/asm/mmu_context.h | 6 ++++++ >> arch/x86/include/asm/mmu_context.h | 6 ++++++ >> include/asm-generic/mm_hooks.h | 6 ++++++ >> mm/mremap.c | 9 +++++++-- >> 6 files changed, 36 insertions(+), 2 deletions(-) >> >> diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h >> index 8fb3802f8fad..ddd861a490ba 100644 >> --- a/arch/s390/include/asm/mmu_context.h >> +++ b/arch/s390/include/asm/mmu_context.h >> @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> { >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif /* __S390_MMU_CONTEXT_H */ >> diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h >> index 941527e507f7..f499b017c1f9 100644 >> --- a/arch/um/include/asm/mmu_context.h >> +++ b/arch/um/include/asm/mmu_context.h >> @@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> struct vm_area_struct *vma) >> { >> } >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> /* >> * end asm-generic/mm_hooks.h functions >> */ >> diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h >> index 1cb5220afaf9..39a0a553172e 100644 >> --- a/arch/unicore32/include/asm/mmu_context.h 
>> +++ b/arch/unicore32/include/asm/mmu_context.h >> @@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, >> { >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif >> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h >> index 883f6b933fa4..75cb71f4be1e 100644 >> --- a/arch/x86/include/asm/mmu_context.h >> +++ b/arch/x86/include/asm/mmu_context.h >> @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, >> mpx_notify_unmap(mm, vma, start, end); >> } >> >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> +} >> + >> #endif /* _ASM_X86_MMU_CONTEXT_H */ > > So instead of spreading these empty prototypes around mmu_context.h > files, why not add something like this to the PPC definition: > > #define __HAVE_ARCH_REMAP > > and define the empty prototype for everyone else? It's a bit like how > the __HAVE_ARCH_PTEP_* namespace works. > > That should shrink this patch considerably. My idea was to mimic the MMU hook's definition. This new hook is in the continuity of what have been done for arch_dup_mmap, arch_exit_mmap, arch_unmap and arch_bprm_mm_init. Do you think that there is a need to make this one in another way ? Thanks, Laurent. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com [195.75.94.113]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 83D551A003D for ; Wed, 25 Mar 2015 22:06:51 +1100 (AEDT) Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
From: Laurent Dufour
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 0/2] Tracking user space vDSO remaping
Date: Wed, 25 Mar 2015 12:06:34 +0100
In-Reply-To: <20150323085209.GA28965@gmail.com>
References: <20150323085209.GA28965@gmail.com>
Cc: criu@openvz.org, cov@codeaurora.org
List-Id: Linux on PowerPC Developers Mail List

CRIU recreates the process memory layout by remapping the checkpointee's memory areas on top of the current process (criu). This includes remapping the vDSO to the address it had at checkpoint time.

However, some architectures such as powerpc keep a reference to the vDSO base address in order to build the signal return stack frame, which calls the vDSO sigreturn service. Once the vDSO has been moved, this reference is no longer valid and signal frames built afterwards are not usable.

This patch series introduces a new mm hook, 'arch_remap', which is called when mremap is done and the mm lock is still held. The second patch adds vDSO remap and unmap tracking to the powerpc architecture.

Changes in v2:
--------------
- Following Ingo Molnar's advice, the call to arch_remap is now enabled through the __HAVE_ARCH_REMAP macro. This considerably shrinks the first patch.

Laurent Dufour (2):
  mm: Introducing arch_remap hook
  powerpc/mm: Tracking vDSO remap

 arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++-
 mm/mremap.c                            | 11 +++++++++--
 2 files changed, 44 insertions(+), 3 deletions(-)

--
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Laurent Dufour
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H.
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 1/2] mm: Introducing arch_remap hook Date: Wed, 25 Mar 2015 12:06:35 +0100 Message-Id: In-Reply-To: References: In-Reply-To: References: <20150323085209.GA28965@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal 
VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp12.uk.ibm.com (e06smtp12.uk.ibm.com [195.75.94.108]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id A2FE71A11C2 for ; Wed, 25 Mar 2015 22:06:56 +1100 (AEDT) Received: from /spool/local by e06smtp12.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Mar 2015 11:06:53 -0000 Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 060911B08061 for ; Wed, 25 Mar 2015 11:07:16 +0000 (GMT) Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2PB6oBW9306542 for ; Wed, 25 Mar 2015 11:06:50 GMT Received: from d06av01.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2PB6kIC021809 for ; Wed, 25 Mar 2015 05:06:49 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Date: Wed, 25 Mar 2015 12:06:36 +0100 Message-Id: <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: <20150323085209.GA28965@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. 
Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..be5dca3f7826 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) + mm->context.vdso_base = 0; +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + /* + * mremap don't allow moving multiple vma so we can limit the check + * to old_start == vdso_base. + */ + if (old_start == mm->context.vdso_base) + mm->context.vdso_base = new_start; +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp12.uk.ibm.com (e06smtp12.uk.ibm.com [195.75.94.108]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 1774B1A0736 for ; Thu, 26 Mar 2015 00:25:27 +1100 (AEDT) Received: from /spool/local by e06smtp12.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:25:23 -0000 Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id C189D17D805F for ; Wed, 25 Mar 2015 13:25:49 +0000 (GMT) Received: from d06av12.portsmouth.uk.ibm.com (d06av12.portsmouth.uk.ibm.com [9.149.37.247]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2PDPMr610813766 for ; Wed, 25 Mar 2015 13:25:22 GMT Received: from d06av12.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av12.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2PDPI7F004627 for ; Wed, 25 Mar 2015 07:25:21 -0600 Message-ID: <5512B73C.5050509@linux.vnet.ibm.com> Date: Wed, 25 Mar 2015 14:25:16 +0100 From: Laurent Dufour MIME-Version: 1.0 To: Ingo Molnar Subject: Re: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap References: <20150323085209.GA28965@gmail.com> <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> <20150325121118.GA2542@gmail.com> In-Reply-To: <20150325121118.GA2542@gmail.com> Content-Type: text/plain; charset=windows-1252 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 25/03/2015 13:11, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> Some processes (CRIU) are moving the vDSO area using the mremap system >> call. 
As a consequence the kernel reference to the vDSO base address is >> no more valid and the signal return frame built once the vDSO has been >> moved is not pointing to the new sigreturn address. >> >> This patch handles vDSO remapping and unmapping. >> >> Signed-off-by: Laurent Dufour >> --- >> arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- >> 1 file changed, 35 insertions(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h >> index 73382eba02dc..be5dca3f7826 100644 >> --- a/arch/powerpc/include/asm/mmu_context.h >> +++ b/arch/powerpc/include/asm/mmu_context.h >> @@ -8,7 +8,6 @@ >> #include >> #include >> #include >> -#include >> #include >> >> /* >> @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, >> #endif >> } >> >> +static inline void arch_dup_mmap(struct mm_struct *oldmm, >> + struct mm_struct *mm) >> +{ >> +} >> + >> +static inline void arch_exit_mmap(struct mm_struct *mm) >> +{ >> +} >> + >> +static inline void arch_unmap(struct mm_struct *mm, >> + struct vm_area_struct *vma, >> + unsigned long start, unsigned long end) >> +{ >> + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) >> + mm->context.vdso_base = 0; >> +} >> + >> +static inline void arch_bprm_mm_init(struct mm_struct *mm, >> + struct vm_area_struct *vma) >> +{ >> +} >> + >> +#define __HAVE_ARCH_REMAP >> +static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> + /* >> + * mremap don't allow moving multiple vma so we can limit the check >> + * to old_start == vdso_base. > > s/mremap don't allow moving multiple vma > mremap() doesn't allow moving multiple vmas > > right? Sure you're right. I'll provide a v3 fixing that comment. Thanks, Laurent. 
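As a rough user-space illustration of the hook quoted above (a sketch only, not kernel code): the `struct mm_struct` and `struct mm_context` below are hypothetical stand-ins that model nothing but `vdso_base`, and `model_move()` and `base_after_move()` are made-up helpers that mimic how the end of move_vma() would call the hook when `__HAVE_ARCH_REMAP` is defined. The body of `arch_remap()` follows the v2 patch.

```c
/* Hypothetical user-space stand-ins for the kernel types; only vdso_base is modelled. */
struct mm_context { unsigned long vdso_base; };
struct mm_struct { struct mm_context context; };

/* Opt-in marker and hook body, following the quoted v2 patch. */
#define __HAVE_ARCH_REMAP
static inline void arch_remap(struct mm_struct *mm,
			      unsigned long old_start, unsigned long old_end,
			      unsigned long new_start, unsigned long new_end)
{
	(void)old_end;
	(void)new_end;
	/* Only a move that starts exactly at the vDSO base updates the reference. */
	if (old_start == mm->context.vdso_base)
		mm->context.vdso_base = new_start;
}

/*
 * Hypothetical reduction of the end of move_vma(): once the area has been
 * moved, call the hook only if the architecture defined __HAVE_ARCH_REMAP.
 */
static inline void model_move(struct mm_struct *mm,
			      unsigned long old_addr, unsigned long old_len,
			      unsigned long new_addr, unsigned long new_len)
{
#ifdef __HAVE_ARCH_REMAP
	arch_remap(mm, old_addr, old_addr + old_len,
		   new_addr, new_addr + new_len);
#else
	(void)mm; (void)old_addr; (void)old_len; (void)new_addr; (void)new_len;
#endif
}

/* Returns the tracked vDSO base after moving [old_addr, old_addr + len) to new_addr. */
static inline unsigned long base_after_move(unsigned long vdso_base,
					    unsigned long old_addr,
					    unsigned long new_addr,
					    unsigned long len)
{
	struct mm_struct mm = { .context = { .vdso_base = vdso_base } };

	model_move(&mm, old_addr, len, new_addr, len);
	return mm.context.vdso_base;
}
```

With these stand-ins, moving an area that starts at the tracked base updates `vdso_base` to the new address, while moving any other area leaves it untouched. The addresses and lengths passed in are arbitrary illustration values.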
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-x22d.google.com (mail-wi0-x22d.google.com [IPv6:2a00:1450:400c:c05::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 321D41A0961 for ; Wed, 25 Mar 2015 23:11:28 +1100 (AEDT) Received: by wixw10 with SMTP id w10so35607729wix.0 for ; Wed, 25 Mar 2015 05:11:23 -0700 (PDT) Sender: Ingo Molnar Date: Wed, 25 Mar 2015 13:11:19 +0100 From: Ingo Molnar To: Laurent Dufour Subject: Re: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150325121118.GA2542@gmail.com> References: <20150323085209.GA28965@gmail.com> <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Laurent Dufour wrote: > Some processes (CRIU) are moving the vDSO area using the mremap system > call. As a consequence the kernel reference to the vDSO base address is > no more valid and the signal return frame built once the vDSO has been > moved is not pointing to the new sigreturn address. > > This patch handles vDSO remapping and unmapping. 
> > Signed-off-by: Laurent Dufour > --- > arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- > 1 file changed, 35 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h > index 73382eba02dc..be5dca3f7826 100644 > --- a/arch/powerpc/include/asm/mmu_context.h > +++ b/arch/powerpc/include/asm/mmu_context.h > @@ -8,7 +8,6 @@ > #include > #include > #include > -#include > #include > > /* > @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, > #endif > } > > +static inline void arch_dup_mmap(struct mm_struct *oldmm, > + struct mm_struct *mm) > +{ > +} > + > +static inline void arch_exit_mmap(struct mm_struct *mm) > +{ > +} > + > +static inline void arch_unmap(struct mm_struct *mm, > + struct vm_area_struct *vma, > + unsigned long start, unsigned long end) > +{ > + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) > + mm->context.vdso_base = 0; > +} > + > +static inline void arch_bprm_mm_init(struct mm_struct *mm, > + struct vm_area_struct *vma) > +{ > +} > + > +#define __HAVE_ARCH_REMAP > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > + /* > + * mremap don't allow moving multiple vma so we can limit the check > + * to old_start == vdso_base. s/mremap don't allow moving multiple vma mremap() doesn't allow moving multiple vmas right? Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp14.uk.ibm.com (e06smtp14.uk.ibm.com [195.75.94.110]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 9DE841A1709 for ; Thu, 26 Mar 2015 00:54:10 +1100 (AEDT) Received: from /spool/local by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:54:07 -0000 Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 4631B1B0805F for ; Wed, 25 Mar 2015 13:54:30 +0000 (GMT) Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2PDs4k07602514 for ; Wed, 25 Mar 2015 13:54:04 GMT Received: from d06av01.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2PDrwGj002997 for ; Wed, 25 Mar 2015 07:54:03 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 1/2] mm: Introducing arch_remap hook Date: Wed, 25 Mar 2015 14:53:51 +0100 Message-Id: In-Reply-To: References: In-Reply-To: References: <20150325121118.GA2542@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). 
The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com [195.75.94.109]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id E821A1A15ED for ; Thu, 26 Mar 2015 00:54:05 +1100 (AEDT) Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
From: Laurent Dufour
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Jeff Dike, Richard Weinberger, Guan Xuetao, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, Arnd Bergmann, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 0/2] Tracking user space vDSO remaping
Date: Wed, 25 Mar 2015 14:53:50 +0100
In-Reply-To: <20150325121118.GA2542@gmail.com>
References: <20150325121118.GA2542@gmail.com>
Cc: criu@openvz.org, cov@codeaurora.org
List-Id: Linux on PowerPC Developers Mail List

CRIU recreates the process memory layout by remapping the checkpointee's memory areas on top of the current process (criu). This includes remapping the vDSO to the address it had at checkpoint time.

However, some architectures such as powerpc keep a reference to the vDSO base address in order to build the signal return stack frame, which calls the vDSO sigreturn service. Once the vDSO has been moved, this reference is no longer valid and signal frames built afterwards are not usable.

This patch series introduces a new mm hook, 'arch_remap', which is called when mremap is done and the mm lock is still held. The second patch adds vDSO remap and unmap tracking to the powerpc architecture.

Changes in v3:
--------------
- Fixed a grammatical error in a comment in the second patch. Thanks again, Ingo.

Changes in v2:
--------------
- Following Ingo Molnar's advice, the call to arch_remap is now enabled through the __HAVE_ARCH_REMAP macro. This considerably shrinks the first patch.

Laurent Dufour (2):
  mm: Introducing arch_remap hook
  powerpc/mm: Tracking vDSO remap

 arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++-
 mm/mremap.c                            | 11 +++++++++--
 2 files changed, 44 insertions(+), 3 deletions(-)

--
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com [195.75.94.113]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id BD1C71A0C02 for ; Thu, 26 Mar 2015 00:54:12 +1100 (AEDT)
Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:54:08 -0000 Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id CE9111B08067 for ; Wed, 25 Mar 2015 13:54:32 +0000 (GMT) Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2PDs6jW3277218 for ; Wed, 25 Mar 2015 13:54:06 GMT Received: from d06av01.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2PDs2Gq003182 for ; Wed, 25 Mar 2015 07:54:06 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Date: Wed, 25 Mar 2015 14:53:52 +0100 Message-Id: In-Reply-To: References: In-Reply-To: References: <20150325121118.GA2542@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some processes (CRIU) are moving the vDSO area using the mremap system call. As a consequence the kernel reference to the vDSO base address is no more valid and the signal return frame built once the vDSO has been moved is not pointing to the new sigreturn address. This patch handles vDSO remapping and unmapping. 
Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..7d315c1898d4 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) + mm->context.vdso_base = 0; +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + /* + * mremap() doesn't allow moving multiple vmas so we can limit the + * check to old_start == vdso_base. 
+ */ + if (old_start == mm->context.vdso_base) + mm->context.vdso_base = new_start; +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-x22b.google.com (mail-wg0-x22b.google.com [IPv6:2a00:1450:400c:c00::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id BF2241A0681 for ; Thu, 26 Mar 2015 05:33:24 +1100 (AEDT) Received: by wgra20 with SMTP id a20so37647674wgr.3 for ; Wed, 25 Mar 2015 11:33:21 -0700 (PDT) Sender: Ingo Molnar Date: Wed, 25 Mar 2015 19:33:16 +0100 From: Ingo Molnar To: Laurent Dufour Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150325183316.GA9090@gmail.com> References: <20150325121118.GA2542@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Laurent Dufour wrote: > +static inline void arch_unmap(struct mm_struct *mm, > + struct vm_area_struct *vma, > + unsigned long start, unsigned long end) > +{ > + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) > + mm->context.vdso_base = 0; > +} So AFAICS PowerPC can have multi-page vDSOs, right? So what happens if I munmap() the middle or end of the vDSO? The above condition only seems to cover unmaps that affect the first page. I think 'affects any page' ought to be the right condition? 
(But I know nothing about PowerPC so I might be wrong.) > +#define __HAVE_ARCH_REMAP > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > + /* > + * mremap() doesn't allow moving multiple vmas so we can limit the > + * check to old_start == vdso_base. > + */ > + if (old_start == mm->context.vdso_base) > + mm->context.vdso_base = new_start; > +} mremap() doesn't allow moving multiple vmas, but it allows the movement of multi-page vmas and it also allows partial mremap()s, where it will split up a vma. In particular, what happens if an mremap() is done with old_start == vdso_base, but a shorter end than the end of the vDSO? (i.e. a partial mremap() with fewer pages than the vDSO size) Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-x22d.google.com (mail-wi0-x22d.google.com [IPv6:2a00:1450:400c:c05::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id C424D1A0681 for ; Thu, 26 Mar 2015 05:36:54 +1100 (AEDT) Received: by wibgn9 with SMTP id gn9so52208624wib.1 for ; Wed, 25 Mar 2015 11:36:51 -0700 (PDT) Sender: Ingo Molnar Date: Wed, 25 Mar 2015 19:36:47 +0100 From: Ingo Molnar To: Laurent Dufour Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150325183647.GA9331@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <20150325183316.GA9090@gmail.com> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. 
Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Ingo Molnar wrote: > > +#define __HAVE_ARCH_REMAP > > +static inline void arch_remap(struct mm_struct *mm, > > + unsigned long old_start, unsigned long old_end, > > + unsigned long new_start, unsigned long new_end) > > +{ > > + /* > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > + * check to old_start == vdso_base. > > + */ > > + if (old_start == mm->context.vdso_base) > > + mm->context.vdso_base = new_start; > > +} > > mremap() doesn't allow moving multiple vmas, but it allows the > movement of multi-page vmas and it also allows partial mremap()s, > where it will split up a vma. I.e. mremap() supports the shrinking (and growing) of vmas. In that case mremap() will unmap the end of the vma and will shrink the remaining vDSO vma. Doesn't that result in a non-working vDSO that should zero out vdso_base? 
Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id DE1531A0688 for ; Thu, 26 Mar 2015 08:11:20 +1100 (AEDT) Message-ID: <1427317797.6468.86.camel@kernel.crashing.org> Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap From: Benjamin Herrenschmidt To: Ingo Molnar Date: Thu, 26 Mar 2015 08:09:57 +1100 In-Reply-To: <20150325183316.GA9090@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On Wed, 2015-03-25 at 19:33 +0100, Ingo Molnar wrote: > * Laurent Dufour wrote: > > > +static inline void arch_unmap(struct mm_struct *mm, > > + struct vm_area_struct *vma, > > + unsigned long start, unsigned long end) > > +{ > > + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) > > + mm->context.vdso_base = 0; > > +} > > So AFAICS PowerPC can have multi-page vDSOs, right? > > So what happens if I munmap() the middle or end of the vDSO? The above > condition only seems to cover unmaps that affect the first page. I > think 'affects any page' ought to be the right condition? (But I know > nothing about PowerPC so I might be wrong.) You are right, we have at least two pages. 
> > > +#define __HAVE_ARCH_REMAP > > +static inline void arch_remap(struct mm_struct *mm, > > + unsigned long old_start, unsigned long old_end, > > + unsigned long new_start, unsigned long new_end) > > +{ > > + /* > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > + * check to old_start == vdso_base. > > + */ > > + if (old_start == mm->context.vdso_base) > > + mm->context.vdso_base = new_start; > > +} > > mremap() doesn't allow moving multiple vmas, but it allows the > movement of multi-page vmas and it also allows partial mremap()s, > where it will split up a vma. > > In particular, what happens if an mremap() is done with > old_start == vdso_base, but a shorter end than the end of the vDSO? > (i.e. a partial mremap() with fewer pages than the vDSO size) Is there a way to forbid splitting ? Does x86 deal with that case at all or it doesn't have to for some other reason ? Cheers, Ben. > Thanks, > > Ingo > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id BAFA41A0688 for ; Thu, 26 Mar 2015 08:17:06 +1100 (AEDT) Message-ID: <1427317867.6468.87.camel@kernel.crashing.org> Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap From: Benjamin Herrenschmidt To: Ingo Molnar Date: Thu, 26 Mar 2015 08:11:07 +1100 In-Reply-To: <20150325183647.GA9331@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent 
Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: > * Ingo Molnar wrote: > > > > +#define __HAVE_ARCH_REMAP > > > +static inline void arch_remap(struct mm_struct *mm, > > > + unsigned long old_start, unsigned long old_end, > > > + unsigned long new_start, unsigned long new_end) > > > +{ > > > + /* > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > + * check to old_start == vdso_base. > > > + */ > > > + if (old_start == mm->context.vdso_base) > > > + mm->context.vdso_base = new_start; > > > +} > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > movement of multi-page vmas and it also allows partial mremap()s, > > where it will split up a vma. > > I.e. mremap() supports the shrinking (and growing) of vmas. In that > case mremap() will unmap the end of the vma and will shrink the > remaining vDSO vma. > > Doesn't that result in a non-working vDSO that should zero out > vdso_base? Right. Now we can't completely prevent the user from shooting itself in the foot I suppose, though there is a legit usage scenario which is to move the vDSO around which it would be nice to support. I think it's reasonable to put the onus on the user here to do the right thing. Cheers, Ben. 
> Thanks, > > Ingo > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-x22c.google.com (mail-wg0-x22c.google.com [IPv6:2a00:1450:400c:c00::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3FDBE1A0688 for ; Thu, 26 Mar 2015 20:48:52 +1100 (AEDT) Received: by wgra20 with SMTP id a20so57582835wgr.3 for ; Thu, 26 Mar 2015 02:48:48 -0700 (PDT) Sender: Ingo Molnar Date: Thu, 26 Mar 2015 10:48:44 +0100 From: Ingo Molnar To: Benjamin Herrenschmidt Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326094844.GB15407@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <1427317797.6468.86.camel@kernel.crashing.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <1427317797.6468.86.camel@kernel.crashing.org> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. 
Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Benjamin Herrenschmidt wrote: > > > +#define __HAVE_ARCH_REMAP > > > +static inline void arch_remap(struct mm_struct *mm, > > > + unsigned long old_start, unsigned long old_end, > > > + unsigned long new_start, unsigned long new_end) > > > +{ > > > + /* > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > + * check to old_start == vdso_base. > > > + */ > > > + if (old_start == mm->context.vdso_base) > > > + mm->context.vdso_base = new_start; > > > +} > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > movement of multi-page vmas and it also allows partial mremap()s, > > where it will split up a vma. > > > > In particular, what happens if an mremap() is done with > > old_start == vdso_base, but a shorter end than the end of the vDSO? > > (i.e. a partial mremap() with fewer pages than the vDSO size) > > Is there a way to forbid splitting ? Does x86 deal with that case at > all or it doesn't have to for some other reason ? So we use _install_special_mapping() - maybe PowerPC does that too? That adds VM_DONTEXPAND which ought to prevent some - but not all - of the VM API weirdnesses. On x86 we'll just dump core if someone unmaps the vdso. 
Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com [195.75.94.113]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 9E0581A0688 for ; Thu, 26 Mar 2015 21:14:04 +1100 (AEDT) Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 10:14:00 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id C954817D8042 for ; Thu, 26 Mar 2015 10:14:26 +0000 (GMT) Received: from d06av03.portsmouth.uk.ibm.com (d06av03.portsmouth.uk.ibm.com [9.149.37.213]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QADwRZ8061354 for ; Thu, 26 Mar 2015 10:13:58 GMT Received: from d06av03.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av03.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QADtxI024090 for ; Thu, 26 Mar 2015 04:13:58 -0600 Message-ID: <5513DBE1.4070404@linux.vnet.ibm.com> Date: Thu, 26 Mar 2015 11:13:53 +0100 From: Laurent Dufour MIME-Version: 1.0 To: Ingo Molnar , Benjamin Herrenschmidt Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <1427317797.6468.86.camel@kernel.crashing.org> <20150326094844.GB15407@gmail.com> In-Reply-To: <20150326094844.GB15407@gmail.com> Content-Type: text/plain; charset=windows-1252 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. 
Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 26/03/2015 10:48, Ingo Molnar wrote: > > * Benjamin Herrenschmidt wrote: > >>>> +#define __HAVE_ARCH_REMAP >>>> +static inline void arch_remap(struct mm_struct *mm, >>>> + unsigned long old_start, unsigned long old_end, >>>> + unsigned long new_start, unsigned long new_end) >>>> +{ >>>> + /* >>>> + * mremap() doesn't allow moving multiple vmas so we can limit the >>>> + * check to old_start == vdso_base. >>>> + */ >>>> + if (old_start == mm->context.vdso_base) >>>> + mm->context.vdso_base = new_start; >>>> +} >>> >>> mremap() doesn't allow moving multiple vmas, but it allows the >>> movement of multi-page vmas and it also allows partial mremap()s, >>> where it will split up a vma. >>> >>> In particular, what happens if an mremap() is done with >>> old_start == vdso_base, but a shorter end than the end of the vDSO? >>> (i.e. a partial mremap() with fewer pages than the vDSO size) >> >> Is there a way to forbid splitting ? Does x86 deal with that case at >> all or it doesn't have to for some other reason ? > > So we use _install_special_mapping() - maybe PowerPC does that too? > That adds VM_DONTEXPAND which ought to prevent some - but not all - of > the VM API weirdnesses. The same is done on PowerPC. So calling mremap() to extend the vDSO fails, but splitting it or unmapping part of it is allowed and leads to an unusable vDSO. > On x86 we'll just dump core if someone unmaps the vdso. On PowerPC, you'll get the same result. Should we prevent the user from breaking its vDSO? Thanks, Laurent. 
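[Editorial sketch, not part of the thread.] The difference between the v3 arch_unmap() condition and the "affects any page" condition raised earlier in the review can be modeled in plain user-space C. Everything below is an assumption made for illustration: struct vdso_range, both helper names, the 4 KiB page size and the addresses are hypothetical, and the size field does not exist in the PowerPC mm context discussed here (only vdso_base is recorded).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm vDSO bookkeeping; in the thread
 * the PowerPC mm context only records vdso_base, not the size. */
struct vdso_range {
	unsigned long base;	/* first byte of the vDSO mapping */
	unsigned long size;	/* length in bytes, assumed known here */
};

/* The v3 condition: true only if [start, end) covers the vDSO base. */
static bool hits_vdso_base(const struct vdso_range *v,
			   unsigned long start, unsigned long end)
{
	return start <= v->base && v->base < end;
}

/* "Affects any page": true if [start, end) overlaps the vDSO at all. */
static bool hits_any_vdso_page(const struct vdso_range *v,
			       unsigned long start, unsigned long end)
{
	return start < v->base + v->size && v->base < end;
}

/* Example with made-up addresses: a three-page vDSO, 4 KiB pages. */
void check_examples(void)
{
	const struct vdso_range v = { .base = 0x100000UL, .size = 0x3000UL };

	/* Unmapping only the middle page: missed by the v3 condition. */
	assert(!hits_vdso_base(&v, 0x101000UL, 0x102000UL));
	assert(hits_any_vdso_page(&v, 0x101000UL, 0x102000UL));

	/* A range that ends exactly where the vDSO starts: no overlap. */
	assert(!hits_any_vdso_page(&v, 0xfe000UL, 0x100000UL));

	/* A range that starts exactly at the vDSO end: no overlap. */
	assert(!hits_any_vdso_page(&v, 0x103000UL, 0x104000UL));
}
```

The overlap test treats both ranges as half-open intervals, so a neighbouring mapping that merely touches the vDSO boundary is not reported. Whether the kernel should store the vDSO size to make such a test possible is exactly the open question in the thread.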
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-x22c.google.com (mail-wi0-x22c.google.com [IPv6:2a00:1450:400c:c05::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 88A9D1A0688 for ; Thu, 26 Mar 2015 20:43:38 +1100 (AEDT) Received: by wibg7 with SMTP id g7so141398965wib.1 for ; Thu, 26 Mar 2015 02:43:35 -0700 (PDT) Sender: Ingo Molnar Date: Thu, 26 Mar 2015 10:43:30 +0100 From: Ingo Molnar To: Benjamin Herrenschmidt Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326094330.GA15407@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <1427317867.6468.87.camel@kernel.crashing.org> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Benjamin Herrenschmidt wrote: > On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: > > * Ingo Molnar wrote: > > > > > > +#define __HAVE_ARCH_REMAP > > > > +static inline void arch_remap(struct mm_struct *mm, > > > > + unsigned long old_start, unsigned long old_end, > > > > + unsigned long new_start, unsigned long new_end) > > > > +{ > > > > + /* > > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > > + * check to old_start == vdso_base. 
> > > > + */ > > > > + if (old_start == mm->context.vdso_base) > > > > + mm->context.vdso_base = new_start; > > > > +} > > > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > > movement of multi-page vmas and it also allows partial mremap()s, > > > where it will split up a vma. > > > > I.e. mremap() supports the shrinking (and growing) of vmas. In that > > case mremap() will unmap the end of the vma and will shrink the > > remaining vDSO vma. > > > > Doesn't that result in a non-working vDSO that should zero out > > vdso_base? > > Right. Now we can't completely prevent the user from shooting itself > in the foot I suppose, though there is a legit usage scenario which > is to move the vDSO around which it would be nice to support. I > think it's reasonable to put the onus on the user here to do the > right thing. I argue we should use the right condition to clear vdso_base: if the vDSO gets at least partially unmapped. Otherwise there's little point in the whole patch: either correctly track whether the vDSO is OK, or don't ... There's also the question of mprotect(): can users mprotect() the vDSO on PowerPC? Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com [195.75.94.113]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 394871A011B for ; Thu, 26 Mar 2015 21:37:45 +1100 (AEDT) Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Thu, 26 Mar 2015 10:37:41 -0000 Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 1F58F1B0806B for ; Thu, 26 Mar 2015 10:38:06 +0000 (GMT) Received: from d06av04.portsmouth.uk.ibm.com (d06av04.portsmouth.uk.ibm.com [9.149.37.216]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QAbdtY3342694 for ; Thu, 26 Mar 2015 10:37:39 GMT Received: from d06av04.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av04.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QAbZE9027874 for ; Thu, 26 Mar 2015 04:37:39 -0600 Message-ID: <5513E16D.1030101@linux.vnet.ibm.com> Date: Thu, 26 Mar 2015 11:37:33 +0100 From: Laurent Dufour MIME-Version: 1.0 To: Ingo Molnar , Benjamin Herrenschmidt Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> In-Reply-To: <20150326094330.GA15407@gmail.com> Content-Type: text/plain; charset=windows-1252 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. 
Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 26/03/2015 10:43, Ingo Molnar wrote: > > * Benjamin Herrenschmidt wrote: > >> On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: >>> * Ingo Molnar wrote: >>> >>>>> +#define __HAVE_ARCH_REMAP >>>>> +static inline void arch_remap(struct mm_struct *mm, >>>>> + unsigned long old_start, unsigned long old_end, >>>>> + unsigned long new_start, unsigned long new_end) >>>>> +{ >>>>> + /* >>>>> + * mremap() doesn't allow moving multiple vmas so we can limit the >>>>> + * check to old_start == vdso_base. >>>>> + */ >>>>> + if (old_start == mm->context.vdso_base) >>>>> + mm->context.vdso_base = new_start; >>>>> +} >>>> >>>> mremap() doesn't allow moving multiple vmas, but it allows the >>>> movement of multi-page vmas and it also allows partial mremap()s, >>>> where it will split up a vma. >>> >>> I.e. mremap() supports the shrinking (and growing) of vmas. In that >>> case mremap() will unmap the end of the vma and will shrink the >>> remaining vDSO vma. >>> >>> Doesn't that result in a non-working vDSO that should zero out >>> vdso_base? >> >> Right. Now we can't completely prevent the user from shooting itself >> in the foot I suppose, though there is a legit usage scenario which >> is to move the vDSO around which it would be nice to support. I >> think it's reasonable to put the onus on the user here to do the >> right thing. > > I argue we should use the right condition to clear vdso_base: if the > vDSO gets at least partially unmapped. Otherwise there's little point > in the whole patch: either correctly track whether the vDSO is OK, or > don't ... 
That's a good option, but it may be hard to achieve in the case the vDSO area has been splitted in multiple pieces. Not sure there is a right way to handle that, here this is a best effort, allowing a process to unmap its vDSO and having the sigreturn call done through the stack area (it has to make it executable). Anyway I'll dig into that, assuming that the vdso_base pointer should be clear if a part of the vDSO is moved or unmapped. The patch will be larger since I'll have to get the vDSO size which is private to the vdso.c file. > There's also the question of mprotect(): can users mprotect() the vDSO > on PowerPC? Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and certainly all the other architectures. Furthermore, if it is done on a partial part of the vDSO it is splitting the vma... From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-x231.google.com (mail-wg0-x231.google.com [IPv6:2a00:1450:400c:c00::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 40DCB1A0688 for ; Fri, 27 Mar 2015 01:17:42 +1100 (AEDT) Received: by wgs2 with SMTP id 2so66083225wgs.1 for ; Thu, 26 Mar 2015 07:17:39 -0700 (PDT) Sender: Ingo Molnar Date: Thu, 26 Mar 2015 15:17:31 +0100 From: Ingo Molnar To: Laurent Dufour Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326141730.GA23060@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> <5513E16D.1030101@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <5513E16D.1030101@linux.vnet.ibm.com> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. 
Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Laurent Dufour wrote: > > I argue we should use the right condition to clear vdso_base: if > > the vDSO gets at least partially unmapped. Otherwise there's > > little point in the whole patch: either correctly track whether > > the vDSO is OK, or don't ... > > That's a good option, but it may be hard to achieve in the case the > vDSO area has been splitted in multiple pieces. > > Not sure there is a right way to handle that, here this is a best > effort, allowing a process to unmap its vDSO and having the > sigreturn call done through the stack area (it has to make it > executable). > > Anyway I'll dig into that, assuming that the vdso_base pointer > should be clear if a part of the vDSO is moved or unmapped. The > patch will be larger since I'll have to get the vDSO size which is > private to the vdso.c file. At least for munmap() I don't think that's a worry: once unmapped (even if just partially), vdso_base becomes zero and won't ever be set again. So no need to track the zillion pieces, should there be any: Humpty Dumpty won't be whole again, right? > > There's also the question of mprotect(): can users mprotect() the > > vDSO on PowerPC? > > Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and > certainly all the other architectures. Furthermore, if it is done on > a partial part of the vDSO it is splitting the vma... btw., CRIU's main purpose here is to reconstruct a vDSO that was originally randomized, but whose address must now be reproduced as-is, right? 
In that sense detecting the 'good' mremap() as your patch does should do the trick and is certainly not objectionable IMHO - I was just wondering whether we could make a perfect job very simply. Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp15.uk.ibm.com (e06smtp15.uk.ibm.com [195.75.94.111]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 71EE31A0688 for ; Fri, 27 Mar 2015 01:32:34 +1100 (AEDT) Received: from /spool/local by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 14:32:30 -0000 Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by d06dlp02.portsmouth.uk.ibm.com (Postfix) with ESMTP id 7E7F02190077 for ; Thu, 26 Mar 2015 14:31:58 +0000 (GMT) Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QEW9pl131566 for ; Thu, 26 Mar 2015 14:32:10 GMT Received: from d06av01.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QEW7Pq011521 for ; Thu, 26 Mar 2015 08:32:09 -0600 Message-ID: <55141866.6080007@linux.vnet.ibm.com> Date: Thu, 26 Mar 2015 15:32:06 +0100 From: Laurent Dufour MIME-Version: 1.0 To: Ingo Molnar Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> <5513E16D.1030101@linux.vnet.ibm.com> <20150326141730.GA23060@gmail.com> In-Reply-To: <20150326141730.GA23060@gmail.com> Content-Type: text/plain; charset=windows-1252 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, 
user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 26/03/2015 15:17, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >>> I argue we should use the right condition to clear vdso_base: if >>> the vDSO gets at least partially unmapped. Otherwise there's >>> little point in the whole patch: either correctly track whether >>> the vDSO is OK, or don't ... >> >> That's a good option, but it may be hard to achieve in the case the >> vDSO area has been splitted in multiple pieces. >> >> Not sure there is a right way to handle that, here this is a best >> effort, allowing a process to unmap its vDSO and having the >> sigreturn call done through the stack area (it has to make it >> executable). >> >> Anyway I'll dig into that, assuming that the vdso_base pointer >> should be clear if a part of the vDSO is moved or unmapped. The >> patch will be larger since I'll have to get the vDSO size which is >> private to the vdso.c file. > > At least for munmap() I don't think that's a worry: once unmapped > (even if just partially), vdso_base becomes zero and won't ever be set > again. > > So no need to track the zillion pieces, should there be any: Humpty > Dumpty won't be whole again, right? My idea is to clear vdso_base if at least part of the vDSO is unmapped. But since some part of the vDSO may have been moved and unmapped later, to be complete, the patch has to handle partial mremap() of the vDSO too. 
Otherwise such a scenario will not be detected: new_area = mremap(vdso_base + page_size, ....); munmap(new_area,...); >>> There's also the question of mprotect(): can users mprotect() the >>> vDSO on PowerPC? >> >> Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and >> certainly all the other architectures. Furthermore, if it is done on >> a partial part of the vDSO it is splitting the vma... > > btw., CRIU's main purpose here is to reconstruct a vDSO that was > originally randomized, but whose address must now be reproduced as-is, > right? You're right, CRIU has to move the vDSO to the same address it has at checkpoint time. > In that sense detecting the 'good' mremap() as your patch does should > do the trick and is certainly not objectionable IMHO - I was just > wondering whether we could make a perfect job very simply. I'd try to address the perfect job, this may complexify the patch, especially because the vdso's size is not recorded in the PowerPC mm_context structure. Not sure it is a good idea to extend that structure.. Thanks, Laurent. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp15.uk.ibm.com (e06smtp15.uk.ibm.com [195.75.94.111]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 0A61F1A0145 for ; Fri, 27 Mar 2015 04:38:04 +1100 (AEDT) Received: from /spool/local by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Thu, 26 Mar 2015 17:38:01 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 067A71B08074 for ; Thu, 26 Mar 2015 17:38:24 +0000 (GMT) Received: from d06av10.portsmouth.uk.ibm.com (d06av10.portsmouth.uk.ibm.com [9.149.37.251]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QHbvHS1704206 for ; Thu, 26 Mar 2015 17:37:57 GMT Received: from d06av10.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av10.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QHbtG7032188 for ; Thu, 26 Mar 2015 11:37:57 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Date: Thu, 26 Mar 2015 18:37:53 +0100 Message-Id: <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> In-Reply-To: References: In-Reply-To: References: <20150326141730.GA23060@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some processes (CRIU) move the vDSO area using the mremap system call. As a consequence, the kernel's reference to the vDSO base address is no longer valid, and the signal return frame built once the vDSO has been moved does not point to the new sigreturn address. This patch handles vDSO remapping and unmapping. Partially moving or unmapping the vDSO invalidates it from the kernel's point of view. 
Signed-off-by: Laurent Dufour --- arch/powerpc/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++- arch/powerpc/kernel/vdso.c | 39 ++++++++++++++++++++++++++++++++++ 2 files changed, 70 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 73382eba02dc..67734ce8be67 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -8,7 +8,6 @@ #include #include #include -#include #include /* @@ -109,5 +108,36 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } +static inline void arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +extern void arch_vdso_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end); +static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + arch_vdso_remap(mm, start, end, 0, 0); +} + +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +} + +#define __HAVE_ARCH_REMAP +static inline void arch_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + arch_vdso_remap(mm, old_start, old_end, new_start, new_end); +} + #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 305eb0d9b768..a11b5d8f36d6 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -283,6 +283,45 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) return rc; } +void arch_vdso_remap(struct mm_struct *mm, + unsigned long old_start, unsigned long old_end, + unsigned long new_start, unsigned long new_end) +{ + unsigned long vdso_end, vdso_start; + + if (!mm->context.vdso_base) + 
return; + vdso_start = mm->context.vdso_base; + +#ifdef CONFIG_PPC64 + /* Calling is_32bit_task() implies that we are dealing with the + * current process memory. If there is a call path where mm is not + * owned by the current task, then we'll have need to store the + * vDSO size in the mm->context. + */ + BUG_ON(current->mm != mm); + if (is_32bit_task()) + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); + else + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); +#else + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); +#endif + vdso_end += (1<<PAGE_SHIFT); + + /* Check if the vDSO is in the range of the remapped area */ + if ((vdso_start <= old_start && old_start < vdso_end) || + (vdso_start < old_end && old_end <= vdso_end) || + (old_start <= vdso_start && vdso_start < old_end)) { + /* Update vdso_base if the vDSO is entirely moved. */ + if (old_start == vdso_start && old_end == vdso_end && + (old_end - old_start) == (new_end - new_start)) + mm->context.vdso_base = new_start; + else + mm->context.vdso_base = 0; + } +} + const char *arch_vma_name(struct vm_area_struct *vma) { if (vma->vm_mm && vma->vm_start == vma->vm_mm->context.vdso_base) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com [195.75.94.113]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 6D3341A06DA for ; Fri, 27 Mar 2015 04:38:05 +1100 (AEDT) Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Thu, 26 Mar 2015 17:38:01 -0000 Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id DD4FC1B08076 for ; Thu, 26 Mar 2015 17:38:22 +0000 (GMT) Received: from d06av10.portsmouth.uk.ibm.com (d06av10.portsmouth.uk.ibm.com [9.149.37.251]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QHbucR10486224 for ; Thu, 26 Mar 2015 17:37:56 GMT Received: from d06av10.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av10.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QHbt3C032151 for ; Thu, 26 Mar 2015 11:37:56 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v4 1/2] mm: Introducing arch_remap hook Date: Thu, 26 Mar 2015 18:37:52 +0100 Message-Id: In-Reply-To: References: In-Reply-To: References: <20150326141730.GA23060@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). 
The architectures which need to call this hook should define __HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap service with the following prototype: void arch_remap(struct mm_struct *mm, unsigned long old_start, unsigned long old_end, unsigned long new_start, unsigned long new_end); Signed-off-by: Laurent Dufour --- mm/mremap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/mremap.c b/mm/mremap.c index 57dadc025c64..bafc234db45c 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -25,6 +25,7 @@ #include #include +#include #include "internal.h" @@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma, old_len = new_len; old_addr = new_addr; new_addr = -ENOMEM; - } else if (vma->vm_file && vma->vm_file->f_op->mremap) - vma->vm_file->f_op->mremap(vma->vm_file, new_vma); + } else { + if (vma->vm_file && vma->vm_file->f_op->mremap) + vma->vm_file->f_op->mremap(vma->vm_file, new_vma); +#ifdef __HAVE_ARCH_REMAP + arch_remap(mm, old_addr, old_addr+old_len, + new_addr, new_addr+new_len); +#endif + } /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT) { -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com [195.75.94.109]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 64CDB1A0145 for ; Fri, 27 Mar 2015 04:38:06 +1100 (AEDT) Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Thu, 26 Mar 2015 17:38:03 -0000 Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 46FA01B08067 for ; Thu, 26 Mar 2015 17:38:22 +0000 (GMT) Received: from d06av10.portsmouth.uk.ibm.com (d06av10.portsmouth.uk.ibm.com [9.149.37.251]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QHbtuP10813884 for ; Thu, 26 Mar 2015 17:37:55 GMT Received: from d06av10.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av10.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QHbsba032124 for ; Thu, 26 Mar 2015 11:37:55 -0600 From: Laurent Dufour To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v4 0/2] Tracking user space vDSO remaping Date: Thu, 26 Mar 2015 18:37:51 +0100 Message-Id: In-Reply-To: <20150326141730.GA23060@gmail.com> References: <20150326141730.GA23060@gmail.com> Cc: criu@openvz.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , CRIU is recreating the process memory layout by remapping the checkpointee memory area on top of the current process (criu). This includes remapping the vDSO to the place it has at checkpoint time. However some architectures like powerpc are keeping a reference to the vDSO base address to build the signal return stack frame by calling the vDSO sigreturn service. So once the vDSO has been moved, this reference is no more valid and the signal frame built later are not usable. 
This patch serie is introducing a new mm hook 'arch_remap' which is called when mremap is done and the mm lock still hold. The next patch is adding the vDSO remap and unmap tracking to the powerpc architecture. Changes in v4: -------------- - Reviewing the PowerPC part of the patch to handle partial unmap and remap of the vDSO. Changes in v3: -------------- - Fixed grammatical error in a comment of the second patch. Thanks again, Ingo. Changes in v2: -------------- - Following the Ingo Molnar's advice, enabling the call to arch_remap through the __HAVE_ARCH_REMAP macro. This reduces considerably the first patch. Laurent Dufour (2): mm: Introducing arch_remap hook powerpc/mm: Tracking vDSO remap arch/powerpc/include/asm/mmu_context.h | 32 +++++++++++++++++++++++++++- arch/powerpc/kernel/vdso.c | 39 ++++++++++++++++++++++++++++++++++ mm/mremap.c | 11 ++++++++-- 3 files changed, 79 insertions(+), 3 deletions(-) -- 1.9.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-x22e.google.com (mail-wg0-x22e.google.com [IPv6:2a00:1450:400c:c00::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3C98B1A06DA for ; Fri, 27 Mar 2015 05:55:58 +1100 (AEDT) Received: by wgra20 with SMTP id a20so74579139wgr.3 for ; Thu, 26 Mar 2015 11:55:55 -0700 (PDT) Sender: Ingo Molnar Date: Thu, 26 Mar 2015 19:55:50 +0100 From: Ingo Molnar To: Laurent Dufour Subject: Re: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326185550.GA25547@gmail.com> References: <20150326141730.GA23060@gmail.com> <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, 
Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , * Laurent Dufour wrote: > +{ > + unsigned long vdso_end, vdso_start; > + > + if (!mm->context.vdso_base) > + return; > + vdso_start = mm->context.vdso_base; > + > +#ifdef CONFIG_PPC64 > + /* Calling is_32bit_task() implies that we are dealing with the > + * current process memory. If there is a call path where mm is not > + * owned by the current task, then we'll have need to store the > + * vDSO size in the mm->context. > + */ > + BUG_ON(current->mm != mm); > + if (is_32bit_task()) > + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); > + else > + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); > +#else > + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); > +#endif > + vdso_end += (1<<PAGE_SHIFT); > + > + /* Check if the vDSO is in the range of the remapped area */ > + if ((vdso_start <= old_start && old_start < vdso_end) || > + (vdso_start < old_end && old_end <= vdso_end) || > + (old_start <= vdso_start && vdso_start < old_end)) { > + /* Update vdso_base if the vDSO is entirely moved. */ > + if (old_start == vdso_start && old_end == vdso_end && > + (old_end - old_start) == (new_end - new_start)) > + mm->context.vdso_base = new_start; > + else > + mm->context.vdso_base = 0; > + } > +} Oh my, that really looks awfully complex, as you predicted, and right in every mremap() call. I'm fine with your original, imperfect, KISS approach. Sorry about this detour ...
Reviewed-by: Ingo Molnar Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 5E6361A04A3 for ; Fri, 27 Mar 2015 10:42:14 +1100 (AEDT) Message-ID: <1427412183.6468.148.camel@kernel.crashing.org> Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap From: Benjamin Herrenschmidt To: Ingo Molnar Date: Fri, 27 Mar 2015 10:23:03 +1100 In-Reply-To: <20150326094330.GA15407@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, Laurent Dufour , user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , cov@codeaurora.org, user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On Thu, 2015-03-26 at 10:43 +0100, Ingo Molnar wrote: > * Benjamin Herrenschmidt wrote: > > > On Wed, 2015-03-25 at 19:36 +0100, Ingo Molnar wrote: > > > * Ingo Molnar wrote: > > > > > > > > +#define __HAVE_ARCH_REMAP > > > > > +static inline void arch_remap(struct mm_struct *mm, > > > > > + unsigned long old_start, unsigned long old_end, > > > > > + unsigned long new_start, unsigned long new_end) > > > > > +{ > > > > > + /* > > > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > > > + * check to old_start == vdso_base. 
> > > > > + */ > > > > > + if (old_start == mm->context.vdso_base) > > > > > + mm->context.vdso_base = new_start; > > > > > +} > > > > > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > > > movement of multi-page vmas and it also allows partial mremap()s, > > > > where it will split up a vma. > > > > > > I.e. mremap() supports the shrinking (and growing) of vmas. In that > > > case mremap() will unmap the end of the vma and will shrink the > > > remaining vDSO vma. > > > > > > Doesn't that result in a non-working vDSO that should zero out > > > vdso_base? > > > > Right. Now we can't completely prevent the user from shooting itself > > in the foot I suppose, though there is a legit usage scenario which > > is to move the vDSO around which it would be nice to support. I > > think it's reasonable to put the onus on the user here to do the > > right thing. > > I argue we should use the right condition to clear vdso_base: if the > vDSO gets at least partially unmapped. Otherwise there's little point > in the whole patch: either correctly track whether the vDSO is OK, or > don't ... Well, if we are going to clear it at all yes, we should probably be a bit smarter about it. My point however was we probably don't need to be super robust about dealing with any crazy scenario userspace might conceive. > There's also the question of mprotect(): can users mprotect() the vDSO > on PowerPC? Nothing prevents it. But here too, I wouldn't bother. The user might be doing on purpose expecting to catch the resulting signal for example (though arguably a signal from a sigreturn frame is ... odd). Cheers, Ben. 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com [195.75.94.113]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 8DB971A02AC for ; Fri, 27 Mar 2015 22:02:24 +1100 (AEDT) Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 27 Mar 2015 11:02:20 -0000 Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id 8A6EE17D8068 for ; Fri, 27 Mar 2015 11:02:46 +0000 (GMT) Received: from d06av11.portsmouth.uk.ibm.com (d06av11.portsmouth.uk.ibm.com [9.149.37.252]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2RB2IK44719088 for ; Fri, 27 Mar 2015 11:02:18 GMT Received: from d06av11.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av11.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2RB2Gj4007685 for ; Fri, 27 Mar 2015 05:02:17 -0600 Message-ID: <551538B5.2030507@linux.vnet.ibm.com> Date: Fri, 27 Mar 2015 12:02:13 +0100 From: Laurent Dufour MIME-Version: 1.0 To: Ingo Molnar Subject: Re: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap References: <20150326141730.GA23060@gmail.com> <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> <20150326185550.GA25547@gmail.com> In-Reply-To: <20150326185550.GA25547@gmail.com> Content-Type: text/plain; charset=windows-1252 Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org, x86@kernel.org, user-mode-linux-devel@lists.sourceforge.net, Arnd Bergmann , Jeff Dike , "H. 
Peter Anvin" , linux-kernel@vger.kernel.org, criu@openvz.org, linux-mm@kvack.org, Ingo Molnar , Paul Mackerras , user-mode-linux-user@lists.sourceforge.net, Richard Weinberger , Thomas Gleixner , Guan Xuetao , linuxppc-dev@lists.ozlabs.org, cov@codeaurora.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 26/03/2015 19:55, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> +{ >> + unsigned long vdso_end, vdso_start; >> + >> + if (!mm->context.vdso_base) >> + return; >> + vdso_start = mm->context.vdso_base; >> + >> +#ifdef CONFIG_PPC64 >> + /* Calling is_32bit_task() implies that we are dealing with the >> + * current process memory. If there is a call path where mm is not >> + * owned by the current task, then we'll have need to store the >> + * vDSO size in the mm->context. >> + */ >> + BUG_ON(current->mm != mm); >> + if (is_32bit_task()) >> + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); >> + else >> + vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT); >> +#else >> + vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT); >> +#endif >> + vdso_end += (1<<PAGE_SHIFT); >> + >> + /* Check if the vDSO is in the range of the remapped area */ >> + if ((vdso_start <= old_start && old_start < vdso_end) || >> + (vdso_start < old_end && old_end <= vdso_end) || >> + (old_start <= vdso_start && vdso_start < old_end)) { >> + /* Update vdso_base if the vDSO is entirely moved. */ >> + if (old_start == vdso_start && old_end == vdso_end && >> + (old_end - old_start) == (new_end - new_start)) >> + mm->context.vdso_base = new_start; >> + else >> + mm->context.vdso_base = 0; >> + } >> +} > > Oh my, that really looks awfully complex, as you predicted, and right > in every mremap() call. I do agree, that's awfully complex ;) > I'm fine with your original, imperfect, KISS approach. Sorry about > this detour ... > > Reviewed-by: Ingo Molnar No problem, so let's stay on the v3 version of the patch.
Thanks for the Reviewed-by statement which, I guess, applies to the v3 too. Should I resend the v3? Thanks, Laurent. From mboxrd@z Thu Jan 1 00:00:00 1970 From: cov@codeaurora.org (Christopher Covington) Date: Wed, 2 Mar 2016 07:13:10 -0500 Subject: [PATCH 0/2] Tracking user space vDSO remaping In-Reply-To: References: Message-ID: <56D6D8D6.6060306@codeaurora.org> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org Hi, On 03/20/2015 11:53 AM, Laurent Dufour wrote: > CRIU is recreating the process memory layout by remapping the checkpointee > memory area on top of the current process (criu). This includes remapping > the vDSO to the place it has at checkpoint time. > > However some architectures like powerpc are keeping a reference to the vDSO > base address to build the signal return stack frame by calling the vDSO > sigreturn service. So once the vDSO has been moved, this reference is no > more valid and the signal frame built later are not usable. > > This patch serie is introducing a new mm hook 'arch_remap' which is called > when mremap is done and the mm lock still hold. The next patch is adding the > vDSO remap and unmap tracking to the powerpc architecture. > > Laurent Dufour (2): > mm: Introducing arch_remap hook > powerpc/mm: Tracking vDSO remap > > arch/powerpc/include/asm/mmu_context.h | 35 +++++++++++++++++++++++++++++++- > arch/s390/include/asm/mmu_context.h | 6 ++++++ > arch/um/include/asm/mmu_context.h | 5 +++++ > arch/unicore32/include/asm/mmu_context.h | 6 ++++++ > arch/x86/include/asm/mmu_context.h | 6 ++++++ > include/asm-generic/mm_hooks.h | 6 ++++++ > mm/mremap.c | 9 ++++++-- > 7 files changed, 70 insertions(+), 3 deletions(-) We would like to be able to remap/unmap the VDSO on arm and arm64 as well.
When I proposed a patch with mmu_context.h and mmu-arch-hooks.h changes to arm64 that were nearly identical to those done to powerpc, Will Deacon reasonably suggested [1] attempting to combine the code and provide generic VDSO accessors. Unfortunately, I have no prior experience with generic MM code. Can anyone advise on how to get started with that? 1. http://www.spinics.net/lists/linux-arm-msm/msg18441.html Thanks, Christopher Covington -- Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) by kanga.kvack.org (Postfix) with ESMTP id 1E5E86B0072 for ; Mon, 23 Mar 2015 04:52:15 -0400 (EDT) Received: by wixw10 with SMTP id w10so55214373wix.0 for ; Mon, 23 Mar 2015 01:52:14 -0700 (PDT) Received: from mail-we0-x22c.google.com (mail-we0-x22c.google.com. [2a00:1450:400c:c03::22c]) by mx.google.com with ESMTPS id lf5si182004wjb.111.2015.03.23.01.52.13 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 23 Mar 2015 01:52:13 -0700 (PDT) Received: by wegp1 with SMTP id p1so131768666weg.1 for ; Mon, 23 Mar 2015 01:52:12 -0700 (PDT) Date: Mon, 23 Mar 2015 09:52:09 +0100 From: Ingo Molnar Subject: Re: [PATCH 1/2] mm: Introducing arch_remap hook Message-ID: <20150323085209.GA28965@gmail.com> References: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <503499aae380db1c4673f146bcba6ad095021257.1426866405.git.ldufour@linux.vnet.ibm.com> Sender: owner-linux-mm@kvack.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H.
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org * Laurent Dufour wrote: > Some architecture would like to be triggered when a memory area is moved > through the mremap system call. > > This patch is introducing a new arch_remap mm hook which is placed in the > path of mremap, and is called before the old area is unmapped (and the > arch_unmap hook is called). > > To no break the build, this patch adds the empty hook definition to the > architectures that were not using the generic hook's definition. > > Signed-off-by: Laurent Dufour > --- > arch/s390/include/asm/mmu_context.h | 6 ++++++ > arch/um/include/asm/mmu_context.h | 5 +++++ > arch/unicore32/include/asm/mmu_context.h | 6 ++++++ > arch/x86/include/asm/mmu_context.h | 6 ++++++ > include/asm-generic/mm_hooks.h | 6 ++++++ > mm/mremap.c | 9 +++++++-- > 6 files changed, 36 insertions(+), 2 deletions(-) > > diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h > index 8fb3802f8fad..ddd861a490ba 100644 > --- a/arch/s390/include/asm/mmu_context.h > +++ b/arch/s390/include/asm/mmu_context.h > @@ -131,4 +131,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > { > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif /* __S390_MMU_CONTEXT_H */ > diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h > index 941527e507f7..f499b017c1f9 100644 > --- a/arch/um/include/asm/mmu_context.h > +++ b/arch/um/include/asm/mmu_context.h > @@ -27,6 +27,11 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > struct vm_area_struct *vma) > { > } > +static inline void 
arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > /* > * end asm-generic/mm_hooks.h functions > */ > diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h > index 1cb5220afaf9..39a0a553172e 100644 > --- a/arch/unicore32/include/asm/mmu_context.h > +++ b/arch/unicore32/include/asm/mmu_context.h > @@ -97,4 +97,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, > { > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif > diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h > index 883f6b933fa4..75cb71f4be1e 100644 > --- a/arch/x86/include/asm/mmu_context.h > +++ b/arch/x86/include/asm/mmu_context.h > @@ -172,4 +172,10 @@ static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma, > mpx_notify_unmap(mm, vma, start, end); > } > > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > +} > + > #endif /* _ASM_X86_MMU_CONTEXT_H */ So instead of spreading these empty prototypes around mmu_context.h files, why not add something like this to the PPC definition: #define __HAVE_ARCH_REMAP and define the empty prototype for everyone else? It's a bit like how the __HAVE_ARCH_PTEP_* namespace works. That should shrink this patch considerably. Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f43.google.com (mail-wg0-f43.google.com [74.125.82.43]) by kanga.kvack.org (Postfix) with ESMTP id 5A02D6B0038 for ; Wed, 25 Mar 2015 09:25:28 -0400 (EDT) Received: by wgs2 with SMTP id 2so27394591wgs.1 for ; Wed, 25 Mar 2015 06:25:27 -0700 (PDT) Received: from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com. [195.75.94.109]) by mx.google.com with ESMTPS id bi1si22077015wib.106.2015.03.25.06.25.26 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 25 Mar 2015 06:25:26 -0700 (PDT) Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:25:25 -0000 Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 5BB851B0805F for ; Wed, 25 Mar 2015 13:25:47 +0000 (GMT) Received: from d06av12.portsmouth.uk.ibm.com (d06av12.portsmouth.uk.ibm.com [9.149.37.247]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2PDPLKt65601784 for ; Wed, 25 Mar 2015 13:25:21 GMT Received: from d06av12.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av12.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2PDPI7B004627 for ; Wed, 25 Mar 2015 07:25:20 -0600 Message-ID: <5512B73C.5050509@linux.vnet.ibm.com> Date: Wed, 25 Mar 2015 14:25:16 +0100 From: Laurent Dufour MIME-Version: 1.0 Subject: Re: [PATCH v2 2/2] powerpc/mm: Tracking vDSO remap References: <20150323085209.GA28965@gmail.com> <25152b76585716dc635945c3455ab9b49e645f6d.1427280806.git.ldufour@linux.vnet.ibm.com> <20150325121118.GA2542@gmail.com> In-Reply-To: <20150325121118.GA2542@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , 
Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org On 25/03/2015 13:11, Ingo Molnar wrote: > > * Laurent Dufour wrote: > >> Some processes (CRIU) are moving the vDSO area using the mremap system >> call. As a consequence the kernel reference to the vDSO base address is >> no more valid and the signal return frame built once the vDSO has been >> moved is not pointing to the new sigreturn address. >> >> This patch handles vDSO remapping and unmapping. >> >> Signed-off-by: Laurent Dufour >> --- >> arch/powerpc/include/asm/mmu_context.h | 36 +++++++++++++++++++++++++++++++++- >> 1 file changed, 35 insertions(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h >> index 73382eba02dc..be5dca3f7826 100644 >> --- a/arch/powerpc/include/asm/mmu_context.h >> +++ b/arch/powerpc/include/asm/mmu_context.h >> @@ -8,7 +8,6 @@ >> #include >> #include >> #include >> -#include >> #include >> >> /* >> @@ -109,5 +108,40 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, >> #endif >> } >> >> +static inline void arch_dup_mmap(struct mm_struct *oldmm, >> + struct mm_struct *mm) >> +{ >> +} >> + >> +static inline void arch_exit_mmap(struct mm_struct *mm) >> +{ >> +} >> + >> +static inline void arch_unmap(struct mm_struct *mm, >> + struct vm_area_struct *vma, >> + unsigned long start, unsigned long end) >> +{ >> + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) >> + mm->context.vdso_base = 0; >> +} >> + >> +static inline void arch_bprm_mm_init(struct mm_struct *mm, >> + struct vm_area_struct *vma) >> +{ >> +} >> + >> +#define __HAVE_ARCH_REMAP >> 
+static inline void arch_remap(struct mm_struct *mm, >> + unsigned long old_start, unsigned long old_end, >> + unsigned long new_start, unsigned long new_end) >> +{ >> + /* >> + * mremap don't allow moving multiple vma so we can limit the check >> + * to old_start == vdso_base. > > s/mremap don't allow moving multiple vma > mremap() doesn't allow moving multiple vmas > > right? Sure you're right. I'll provide a v3 fixing that comment. Thanks, Laurent. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f177.google.com (mail-wi0-f177.google.com [209.85.212.177]) by kanga.kvack.org (Postfix) with ESMTP id E81706B006C for ; Wed, 25 Mar 2015 09:54:06 -0400 (EDT) Received: by wixw10 with SMTP id w10so75100922wix.0 for ; Wed, 25 Mar 2015 06:54:06 -0700 (PDT) Received: from e06smtp11.uk.ibm.com (e06smtp11.uk.ibm.com. [195.75.94.107]) by mx.google.com with ESMTPS id gp8si22754156wib.20.2015.03.25.06.54.04 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 25 Mar 2015 06:54:05 -0700 (PDT) Received: from /spool/local by e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Wed, 25 Mar 2015 13:54:04 -0000 Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 5BE781B08072 for ; Wed, 25 Mar 2015 13:54:28 +0000 (GMT) Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2PDs2EV11731262 for ; Wed, 25 Mar 2015 13:54:02 GMT Received: from d06av01.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2PDrwGd002997 for ; Wed, 25 Mar 2015 07:54:01 -0600 From: Laurent Dufour Subject: [PATCH v3 1/2] mm: Introducing arch_remap hook Date: Wed, 25 Mar 2015 14:53:51 +0100 Message-Id: In-Reply-To: References: In-Reply-To: References: <20150325121118.GA2542@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org Cc: cov@codeaurora.org, criu@openvz.org Some architecture would like to be triggered when a memory area is moved through the mremap system call. This patch is introducing a new arch_remap mm hook which is placed in the path of mremap, and is called before the old area is unmapped (and the arch_unmap hook is called). 
The architectures which need to call this hook should define
__HAVE_ARCH_REMAP in their asm/mmu_context.h and provide the arch_remap
service with the following prototype:

void arch_remap(struct mm_struct *mm,
                unsigned long old_start, unsigned long old_end,
                unsigned long new_start, unsigned long new_end);

Signed-off-by: Laurent Dufour
---
 mm/mremap.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 57dadc025c64..bafc234db45c 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -25,6 +25,7 @@
 #include
 #include
+#include
 #include "internal.h"
@@ -286,8 +287,14 @@ static unsigned long move_vma(struct vm_area_struct *vma,
         old_len = new_len;
         old_addr = new_addr;
         new_addr = -ENOMEM;
-    } else if (vma->vm_file && vma->vm_file->f_op->mremap)
-        vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+    } else {
+        if (vma->vm_file && vma->vm_file->f_op->mremap)
+            vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
+#ifdef __HAVE_ARCH_REMAP
+        arch_remap(mm, old_addr, old_addr+old_len,
+                   new_addr, new_addr+new_len);
+#endif
+    }
     /* Conceal VM_ACCOUNT so old reservation is not undone */
     if (vm_flags & VM_ACCOUNT) {
--
1.9.1

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f51.google.com (mail-wg0-f51.google.com [74.125.82.51]) by kanga.kvack.org (Postfix) with ESMTP id 924906B0032 for ; Wed, 25 Mar 2015 14:33:23 -0400 (EDT) Received: by wgra20 with SMTP id a20so37648658wgr.3 for ; Wed, 25 Mar 2015 11:33:23 -0700 (PDT) Received: from mail-wg0-x234.google.com (mail-wg0-x234.google.com.
[2a00:1450:400c:c00::234]) by mx.google.com with ESMTPS id ey12si6417211wid.87.2015.03.25.11.33.21 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 25 Mar 2015 11:33:21 -0700 (PDT) Received: by wgdm6 with SMTP id m6so37930987wgd.2 for ; Wed, 25 Mar 2015 11:33:21 -0700 (PDT) Date: Wed, 25 Mar 2015 19:33:16 +0100 From: Ingo Molnar Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150325183316.GA9090@gmail.com> References: <20150325121118.GA2542@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org * Laurent Dufour wrote: > +static inline void arch_unmap(struct mm_struct *mm, > + struct vm_area_struct *vma, > + unsigned long start, unsigned long end) > +{ > + if (start <= mm->context.vdso_base && mm->context.vdso_base < end) > + mm->context.vdso_base = 0; > +} So AFAICS PowerPC can have multi-page vDSOs, right? So what happens if I munmap() the middle or end of the vDSO? The above condition only seems to cover unmaps that affect the first page. I think 'affects any page' ought to be the right condition? (But I know nothing about PowerPC so I might be wrong.) 
> +#define __HAVE_ARCH_REMAP > +static inline void arch_remap(struct mm_struct *mm, > + unsigned long old_start, unsigned long old_end, > + unsigned long new_start, unsigned long new_end) > +{ > + /* > + * mremap() doesn't allow moving multiple vmas so we can limit the > + * check to old_start == vdso_base. > + */ > + if (old_start == mm->context.vdso_base) > + mm->context.vdso_base = new_start; > +} mremap() doesn't allow moving multiple vmas, but it allows the movement of multi-page vmas and it also allows partial mremap()s, where it will split up a vma. In particular, what happens if an mremap() is done with old_start == vdso_base, but a shorter end than the end of the vDSO? (i.e. a partial mremap() with fewer pages than the vDSO size) Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f176.google.com (mail-wi0-f176.google.com [209.85.212.176]) by kanga.kvack.org (Postfix) with ESMTP id A03156B0032 for ; Wed, 25 Mar 2015 14:36:53 -0400 (EDT) Received: by wibg7 with SMTP id g7so120384612wib.1 for ; Wed, 25 Mar 2015 11:36:53 -0700 (PDT) Received: from mail-wi0-x232.google.com (mail-wi0-x232.google.com. 
[2a00:1450:400c:c05::232]) by mx.google.com with ESMTPS id p8si5757906wjx.82.2015.03.25.11.36.51 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 25 Mar 2015 11:36:52 -0700 (PDT) Received: by wixw10 with SMTP id w10so81036556wix.0 for ; Wed, 25 Mar 2015 11:36:51 -0700 (PDT) Date: Wed, 25 Mar 2015 19:36:47 +0100 From: Ingo Molnar Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150325183647.GA9331@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150325183316.GA9090@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org * Ingo Molnar wrote: > > +#define __HAVE_ARCH_REMAP > > +static inline void arch_remap(struct mm_struct *mm, > > + unsigned long old_start, unsigned long old_end, > > + unsigned long new_start, unsigned long new_end) > > +{ > > + /* > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > + * check to old_start == vdso_base. > > + */ > > + if (old_start == mm->context.vdso_base) > > + mm->context.vdso_base = new_start; > > +} > > mremap() doesn't allow moving multiple vmas, but it allows the > movement of multi-page vmas and it also allows partial mremap()s, > where it will split up a vma. I.e. mremap() supports the shrinking (and growing) of vmas. In that case mremap() will unmap the end of the vma and will shrink the remaining vDSO vma. 
Doesn't that result in a non-working vDSO that should zero out vdso_base? Thanks, Ingo
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f49.google.com (mail-wg0-f49.google.com [74.125.82.49]) by kanga.kvack.org (Postfix) with ESMTP id 55C0C6B0032 for ; Thu, 26 Mar 2015 05:48:50 -0400 (EDT) Received: by wgs2 with SMTP id 2so57933662wgs.1 for ; Thu, 26 Mar 2015 02:48:49 -0700 (PDT) Received: from mail-wi0-x22c.google.com (mail-wi0-x22c.google.com. [2a00:1450:400c:c05::22c]) by mx.google.com with ESMTPS id qa2si27308343wic.10.2015.03.26.02.48.48 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 26 Mar 2015 02:48:49 -0700 (PDT) Received: by wiaa2 with SMTP id a2so14270462wia.0 for ; Thu, 26 Mar 2015 02:48:48 -0700 (PDT) Date: Thu, 26 Mar 2015 10:48:44 +0100 From: Ingo Molnar Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326094844.GB15407@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <1427317797.6468.86.camel@kernel.crashing.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1427317797.6468.86.camel@kernel.crashing.org> Sender: owner-linux-mm@kvack.org List-ID: To: Benjamin Herrenschmidt Cc: Laurent Dufour , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H.
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org * Benjamin Herrenschmidt wrote: > > > +#define __HAVE_ARCH_REMAP > > > +static inline void arch_remap(struct mm_struct *mm, > > > + unsigned long old_start, unsigned long old_end, > > > + unsigned long new_start, unsigned long new_end) > > > +{ > > > + /* > > > + * mremap() doesn't allow moving multiple vmas so we can limit the > > > + * check to old_start == vdso_base. > > > + */ > > > + if (old_start == mm->context.vdso_base) > > > + mm->context.vdso_base = new_start; > > > +} > > > > mremap() doesn't allow moving multiple vmas, but it allows the > > movement of multi-page vmas and it also allows partial mremap()s, > > where it will split up a vma. > > > > In particular, what happens if an mremap() is done with > > old_start == vdso_base, but a shorter end than the end of the vDSO? > > (i.e. a partial mremap() with fewer pages than the vDSO size) > > Is there a way to forbid splitting ? Does x86 deal with that case at > all or it doesn't have to for some other reason ? So we use _install_special_mapping() - maybe PowerPC does that too? That adds VM_DONTEXPAND which ought to prevent some - but not all - of the VM API weirdnesses. On x86 we'll just dump core if someone unmaps the vdso. Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f175.google.com (mail-wi0-f175.google.com [209.85.212.175]) by kanga.kvack.org (Postfix) with ESMTP id 12D7A6B0032 for ; Thu, 26 Mar 2015 06:14:03 -0400 (EDT) Received: by wibg7 with SMTP id g7so142334529wib.1 for ; Thu, 26 Mar 2015 03:14:02 -0700 (PDT) Received: from e06smtp12.uk.ibm.com (e06smtp12.uk.ibm.com. [195.75.94.108]) by mx.google.com with ESMTPS id gy8si9784215wib.118.2015.03.26.03.14.01 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 26 Mar 2015 03:14:01 -0700 (PDT) Received: from /spool/local by e06smtp12.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Mar 2015 10:14:00 -0000 Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by d06dlp03.portsmouth.uk.ibm.com (Postfix) with ESMTP id 84D8D1B08069 for ; Thu, 26 Mar 2015 10:14:24 +0000 (GMT) Received: from d06av03.portsmouth.uk.ibm.com (d06av03.portsmouth.uk.ibm.com [9.149.37.213]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2QADwfv64880858 for ; Thu, 26 Mar 2015 10:13:58 GMT Received: from d06av03.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av03.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2QADtxE024090 for ; Thu, 26 Mar 2015 04:13:57 -0600 Message-ID: <5513DBE1.4070404@linux.vnet.ibm.com> Date: Thu, 26 Mar 2015 11:13:53 +0100 From: Laurent Dufour MIME-Version: 1.0 Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <1427317797.6468.86.camel@kernel.crashing.org> <20150326094844.GB15407@gmail.com> In-Reply-To: <20150326094844.GB15407@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar , Benjamin Herrenschmidt Cc: Paul Mackerras , 
Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org

On 26/03/2015 10:48, Ingo Molnar wrote:
>
> * Benjamin Herrenschmidt wrote:
>
>>>> +#define __HAVE_ARCH_REMAP
>>>> +static inline void arch_remap(struct mm_struct *mm,
>>>> +                              unsigned long old_start, unsigned long old_end,
>>>> +                              unsigned long new_start, unsigned long new_end)
>>>> +{
>>>> +    /*
>>>> +     * mremap() doesn't allow moving multiple vmas so we can limit the
>>>> +     * check to old_start == vdso_base.
>>>> +     */
>>>> +    if (old_start == mm->context.vdso_base)
>>>> +        mm->context.vdso_base = new_start;
>>>> +}
>>>
>>> mremap() doesn't allow moving multiple vmas, but it allows the
>>> movement of multi-page vmas and it also allows partial mremap()s,
>>> where it will split up a vma.
>>>
>>> In particular, what happens if an mremap() is done with
>>> old_start == vdso_base, but a shorter end than the end of the vDSO?
>>> (i.e. a partial mremap() with fewer pages than the vDSO size)
>>
>> Is there a way to forbid splitting ? Does x86 deal with that case at
>> all or it doesn't have to for some other reason ?
>
> So we use _install_special_mapping() - maybe PowerPC does that too?
> That adds VM_DONTEXPAND which ought to prevent some - but not all - of
> the VM API weirdnesses.

The same is done on PowerPC. So calling mremap() to extend the vDSO
fails, but splitting it or unmapping a part of it is allowed and leads
to an unusable vDSO.

> On x86 we'll just dump core if someone unmaps the vdso.

On PowerPC, you'll get the same result.

Should we prevent the user from breaking its vDSO?

Thanks,
Laurent.
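To make the partial unmap and partial remap cases raised in this exchange concrete, here is a minimal user-space sketch of the two v3 checks quoted above. It is not kernel code: struct mm_model, model_unmap() and model_remap() are hypothetical stand-ins for mm->context.vdso_base, arch_unmap() and arch_remap(), and the addresses, the 4 KiB page size and the two-page vDSO used in the note below are assumptions chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the mm->context field used by the patch. */
struct mm_model {
	unsigned long vdso_base;	/* 0 means "no usable vDSO" */
};

/* Same condition as the quoted v3 arch_unmap(): the base is cleared
 * only when the unmapped range [start, end) contains vdso_base itself. */
static void model_unmap(struct mm_model *mm,
			unsigned long start, unsigned long end)
{
	assert(start < end);	/* empty ranges are not modelled */
	if (start <= mm->vdso_base && mm->vdso_base < end)
		mm->vdso_base = 0;
}

/* Same condition as the quoted v3 arch_remap(): only old_start is
 * compared, so the length of the moved range is never looked at. */
static void model_remap(struct mm_model *mm,
			unsigned long old_start, unsigned long old_end,
			unsigned long new_start, unsigned long new_end)
{
	(void)old_end;
	(void)new_end;
	if (old_start == mm->vdso_base)
		mm->vdso_base = new_start;
}
```

With vdso_base set to 0x100000 and an assumed two-page vDSO, model_unmap(&mm, 0x101000, 0x102000) leaves vdso_base untouched even though the second page is gone, and model_remap(&mm, 0x100000, 0x101000, 0x200000, 0x201000) moves vdso_base to 0x200000 although only the first page was moved. Those are the two situations the v3 checks do not detect.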
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f53.google.com (mail-wg0-f53.google.com [74.125.82.53]) by kanga.kvack.org (Postfix) with ESMTP id 48B096B006C for ; Thu, 26 Mar 2015 10:17:41 -0400 (EDT) Received: by wgs2 with SMTP id 2so66084321wgs.1 for ; Thu, 26 Mar 2015 07:17:40 -0700 (PDT) Received: from mail-wi0-x230.google.com (mail-wi0-x230.google.com. [2a00:1450:400c:c05::230]) by mx.google.com with ESMTPS id gz6si10115750wjc.142.2015.03.26.07.17.39 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 26 Mar 2015 07:17:39 -0700 (PDT) Received: by wibg7 with SMTP id g7so150310998wib.1 for ; Thu, 26 Mar 2015 07:17:39 -0700 (PDT) Date: Thu, 26 Mar 2015 15:17:31 +0100 From: Ingo Molnar Subject: Re: [PATCH v3 2/2] powerpc/mm: Tracking vDSO remap Message-ID: <20150326141730.GA23060@gmail.com> References: <20150325121118.GA2542@gmail.com> <20150325183316.GA9090@gmail.com> <20150325183647.GA9331@gmail.com> <1427317867.6468.87.camel@kernel.crashing.org> <20150326094330.GA15407@gmail.com> <5513E16D.1030101@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5513E16D.1030101@linux.vnet.ibm.com> Sender: owner-linux-mm@kvack.org List-ID: To: Laurent Dufour Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H.
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org * Laurent Dufour wrote: > > I argue we should use the right condition to clear vdso_base: if > > the vDSO gets at least partially unmapped. Otherwise there's > > little point in the whole patch: either correctly track whether > > the vDSO is OK, or don't ... > > That's a good option, but it may be hard to achieve in the case the > vDSO area has been splitted in multiple pieces. > > Not sure there is a right way to handle that, here this is a best > effort, allowing a process to unmap its vDSO and having the > sigreturn call done through the stack area (it has to make it > executable). > > Anyway I'll dig into that, assuming that the vdso_base pointer > should be clear if a part of the vDSO is moved or unmapped. The > patch will be larger since I'll have to get the vDSO size which is > private to the vdso.c file. At least for munmap() I don't think that's a worry: once unmapped (even if just partially), vdso_base becomes zero and won't ever be set again. So no need to track the zillion pieces, should there be any: Humpty Dumpty won't be whole again, right? > > There's also the question of mprotect(): can users mprotect() the > > vDSO on PowerPC? > > Yes, mprotect() the vDSO is allowed on PowerPC, as it is on x86, and > certainly all the other architectures. Furthermore, if it is done on > a partial part of the vDSO it is splitting the vma... btw., CRIU's main purpose here is to reconstruct a vDSO that was originally randomized, but whose address must now be reproduced as-is, right? 
In that sense detecting the 'good' mremap() as your patch does should do the trick and is certainly not objectionable IMHO - I was just wondering whether we could make a perfect job very simply. Thanks, Ingo
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f43.google.com (mail-wg0-f43.google.com [74.125.82.43]) by kanga.kvack.org (Postfix) with ESMTP id 001766B0032 for ; Fri, 27 Mar 2015 07:02:24 -0400 (EDT) Received: by wgra20 with SMTP id a20so94889529wgr.3 for ; Fri, 27 Mar 2015 04:02:24 -0700 (PDT) Received: from e06smtp17.uk.ibm.com (e06smtp17.uk.ibm.com. [195.75.94.113]) by mx.google.com with ESMTPS id lf5si2666108wjb.111.2015.03.27.04.02.22 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 27 Mar 2015 04:02:23 -0700 (PDT) Received: from /spool/local by e06smtp17.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
Violators will be prosecuted for from ; Fri, 27 Mar 2015 11:02:21 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by d06dlp02.portsmouth.uk.ibm.com (Postfix) with ESMTP id 18615219005C for ; Fri, 27 Mar 2015 11:02:06 +0000 (GMT) Received: from d06av11.portsmouth.uk.ibm.com (d06av11.portsmouth.uk.ibm.com [9.149.37.252]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t2RB2HLC59310332 for ; Fri, 27 Mar 2015 11:02:17 GMT Received: from d06av11.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av11.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t2RB2Gj0007685 for ; Fri, 27 Mar 2015 05:02:17 -0600 Message-ID: <551538B5.2030507@linux.vnet.ibm.com> Date: Fri, 27 Mar 2015 12:02:13 +0100 From: Laurent Dufour MIME-Version: 1.0 Subject: Re: [PATCH v4 2/2] powerpc/mm: Tracking vDSO remap References: <20150326141730.GA23060@gmail.com> <7fdae652993cf88bdd633d65e5a8f81c7ad8a1e3.1427390952.git.ldufour@linux.vnet.ibm.com> <20150326185550.GA25547@gmail.com> In-Reply-To: <20150326185550.GA25547@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Jeff Dike , Richard Weinberger , Guan Xuetao , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org, Arnd Bergmann , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, user-mode-linux-devel@lists.sourceforge.net, user-mode-linux-user@lists.sourceforge.net, linux-arch@vger.kernel.org, linux-mm@kvack.org, cov@codeaurora.org, criu@openvz.org

On 26/03/2015 19:55, Ingo Molnar wrote:
>
> * Laurent Dufour wrote:
>
>> +{
>> +    unsigned long vdso_end, vdso_start;
>> +
>> +    if (!mm->context.vdso_base)
>> +        return;
>> +    vdso_start = mm->context.vdso_base;
>> +
>> +#ifdef CONFIG_PPC64
>> +    /* Calling is_32bit_task() implies that we are dealing with the
>> +     * current process memory. If there is a call path where mm is not
>> +     * owned by the current task, then we'll have need to store the
>> +     * vDSO size in the mm->context.
>> +     */
>> +    BUG_ON(current->mm != mm);
>> +    if (is_32bit_task())
>> +        vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT);
>> +    else
>> +        vdso_end = vdso_start + (vdso64_pages << PAGE_SHIFT);
>> +#else
>> +    vdso_end = vdso_start + (vdso32_pages << PAGE_SHIFT);
>> +#endif
>> +    vdso_end += (1<<PAGE_SHIFT);
>> +
>> +    /* Check if the vDSO is in the range of the remapped area */
>> +    if ((vdso_start <= old_start && old_start < vdso_end) ||
>> +        (vdso_start < old_end && old_end <= vdso_end) ||
>> +        (old_start <= vdso_start && vdso_start < old_end)) {
>> +        /* Update vdso_base if the vDSO is entirely moved. */
>> +        if (old_start == vdso_start && old_end == vdso_end &&
>> +            (old_end - old_start) == (new_end - new_start))
>> +            mm->context.vdso_base = new_start;
>> +        else
>> +            mm->context.vdso_base = 0;
>> +    }
>> +}
>
> Oh my, that really looks awfully complex, as you predicted, and right
> in every mremap() call.

I do agree, that's awfully complex ;)

> I'm fine with your original, imperfect, KISS approach. Sorry about
> this detour ...
>
> Reviewed-by: Ingo Molnar

No problem, so let's stay on the v3 version of the patch.
Thanks for Reviewed-by statement which, I guess, applied to the v3 too.
Should I resend the v3 ?

Thanks,
Laurent.
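For readers comparing the two versions, the range test in the v4 hunk quoted above can be restated more compactly. The sketch below is a user-space illustration only, not the code that was posted or merged: ranges_overlap() and track_vdso_remap() are hypothetical names, the vDSO length is passed in as a parameter (in the v4 hunk it is derived from vdso32_pages or vdso64_pages plus one extra page), and all ranges are assumed to be non-empty and half-open. Under that assumption the three-way OR of the v4 hunk accepts exactly the same ranges as the usual two-comparison interval overlap test.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the half-open ranges [a_start, a_end) and [b_start, b_end)
 * share at least one byte. Both ranges are assumed to be non-empty. */
static bool ranges_overlap(unsigned long a_start, unsigned long a_end,
			   unsigned long b_start, unsigned long b_end)
{
	return a_start < b_end && b_start < a_end;
}

/* Returns the value vdso_base should take after a remap of
 * [old_start, old_end) to [new_start, new_end):
 *   - unchanged when the vDSO is not touched (or already untracked),
 *   - new_start when the vDSO is moved as a whole,
 *   - 0 when only a part of it is moved. */
static unsigned long track_vdso_remap(unsigned long vdso_base,
				      unsigned long vdso_len,
				      unsigned long old_start,
				      unsigned long old_end,
				      unsigned long new_start,
				      unsigned long new_end)
{
	unsigned long vdso_end = vdso_base + vdso_len;

	assert(old_start < old_end);	/* non-empty source range */

	if (!vdso_base ||
	    !ranges_overlap(old_start, old_end, vdso_base, vdso_end))
		return vdso_base;

	if (old_start == vdso_base && old_end == vdso_end &&
	    old_end - old_start == new_end - new_start)
		return new_start;

	return 0;
}
```

This keeps the behaviour of the quoted v4 logic (whole move updates the base, partial move clears it) while making the overlap condition easier to audit. It does not change the outcome of the thread, which settled on the simpler v3 check.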