All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>,
	riel@redhat.com, hugh@veritas.com,
	kosaki.motohiro@jp.fujitsu.com, linux-abi@vger.kernel.org
Subject: Re: [PATCH] - support inheritance of mlocks across fork/exec V2
Date: Mon, 08 Dec 2008 10:01:05 -0500	[thread overview]
Message-ID: <1228748465.22647.85.camel@lts-notebook> (raw)
In-Reply-To: <20081206220729.042a926e.akpm@linux-foundation.org>

On Sat, 2008-12-06 at 22:07 -0800, Andrew Morton wrote:
> (cc linux-abi.  Again.)

Sorry.

> 
> On Wed, 03 Dec 2008 14:04:29 -0500 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> 
> > 
> > 
> > Against;  2.6.28-rc7-mmotm-081203
> > 
> > V02:	rework vetting of flags argument as suggested by
> > 	Kosaki Motohiro.
> > 	enhance description as requested by Andrew Morton.
> > 
> > Add support for mlockall(MCL_INHERIT|MCL_RECURSIVE):
> > 
> > 	MCL_CURRENT[|MCL_FUTURE]|MCL_INHERIT - inherit memory locks
> > 		[vmas' VM_LOCKED flags] across fork(), and inherit
> > 		MCL_FUTURE behavior [mm's def_flags] across fork()
> > 		and exec().  Behaves as if child and/or new task
> > 		called mlockall(MCL_CURRENT|MCL_FUTURE) as first
> > 		instruction.
> > 
> > 	MCL_RECURSIVE - inherit MCL_CURRENT|MCL_FUTURE|MCL_INHERIT
> > 		[vmas' VM_LOCKED flags for fork() and mm's def_flags
> > 		and mcl_inherit across fork() and exec()] for all
> > 		future generations of calling process's descendants.
> > 		Behaves as if child and/or new task called
> > 		mlockall(MCL_CURRENT|MCL_FUTURE|MCL_INHERIT|MCL_RECURSIVE)
> > 		as the first instruction.
> > 
> > In support of a "lock prefix command"--e.g., mlock <cmd> <args> ...
> > Analogous to taskset(1) for cpu affinity or numactl(8) for numa memory
> > policy.
> > 
> > Together with patches to keep mlocked pages off the LRU, this will
> > allow users/admins to lock down applications without modifying them,
> > if their RLIMIT_MEMLOCK is sufficiently large, keeping their pages
> > off the LRU and out of consideration for reclaim.
> > 
> > Potentially useful, as well, in real-time environments to force
> > prefaulting and residency for applications that don't mlock themselves.
> > 
> > Jeff Sharkey at Montana State developed a similar patch for Linux
> > [link no longer accessible], but apparently he never submitted the patch.
> > 
> > I submitted an earlier version of this patch around a year ago.  I
> > resurrected it to test the unevictable lru/mlocked pages patches--
> > e.g., by "mlock -r make -j<N*nr_cpus> all".  This did shake out a few
> > races and vmstat accounting bugs, but NOT something I'd recommend as
> > general practice--for kernel builds, that is.
> > 
> > ----
> > 
> > Define MCL_INHERIT, MCL_RECURSIVE in <asm-*/mman.h>.
> > + x86 and  ia64 versions included.
> > + other arch can/will be created, if this patch deemed merge-worthy.
> > 
> > Similarly, I'll provide kernel man page update if/when needed.
> > 
> > Example "lock prefix command" in Documentation/vm/mlock.c
> > 
> >
> > ...
> >
> > --- linux-2.6.28-rc7-mmotm-081203.orig/mm/mlock.c	2008-12-03 10:33:11.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/mm/mlock.c	2008-12-03 10:33:29.000000000 -0500
> > @@ -573,15 +573,18 @@ asmlinkage long sys_munlock(unsigned lon
> >  static int do_mlockall(int flags)
> >  {
> >  	struct vm_area_struct * vma, * prev = NULL;
> > +	struct mm_struct *mm = current->mm;
> >  	unsigned int def_flags = 0;
> >  
> >  	if (flags & MCL_FUTURE)
> > -		def_flags = VM_LOCKED;
> > -	current->mm->def_flags = def_flags;
> > -	if (flags == MCL_FUTURE)
> > +		def_flags = VM_LOCKED;;
> 
> You get paid by the semicolon?

Nope.  Interestingly, ckeckpatch was fine with this.  [It did warn about
>80 char line in x86 mman.h, but there was already a much longer line,
so I left it.  Fixed now so that patch is squeaky clean, as far as
checkpatch is concerned.]

> 
> > +	mm->def_flags = def_flags;
> > +	if (flags & MCL_INHERIT)
> > +		mm->mcl_inherit = flags & (MCL_INHERIT | MCL_RECURSIVE);
> 
> aaaaaaarrrrrrggggggghhhhhhh!
> 
> So mm->mcl_inherit is _not_ a boolean meaning "this mm has MCL_INHERIT
> set", as I naively believed when I saw it.  Is it in fact a composite
> of both MCL_INHERIT and MCL_RECURSIVE, given a badly misleading name
> and a waffly comment.
> 
> Can we please make this clearer?

How about "mcl_inherit_mlocks_possibly_recursively" ?

Seriously, it made/makes sense to me.  If mcl_inherit is non-zero, it
WILL contain MCL_INHERIT, meaning that existing mlocks and any VM_LOCKED
in def_flags will be inherited across the next fork or exec [of course,
existing mlocks can't be inherited across exec as we discard the address
space].  It may also contain MCL_RECURSIVE--tested separately--meaning
that the contents of mcl_inherit, itself, will also be inherited across
the next and future forks/execs, until/unless a descendant invokes
mlockall() w/o MCL_INHERIT or MCL_RECURSIVE to clear out mcl_inherit.

This seemed a bit much to try to explain on line where mcl_inherit is
declared in mm_types.  More appropriate to a man page, I would have
thought.   How about a Documentation/vm doc?

Meanwhile, perhaps:  "mcl_inherit" -> "mcl_inherit_flags" or "mcl_flags"
or "mlock_flags", or such?

> 
> >
> > ...
> >
> > --- linux-2.6.28-rc7-mmotm-081203.orig/kernel/fork.c	2008-12-03 10:18:15.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/kernel/fork.c	2008-12-03 10:33:29.000000000 -0500
> > @@ -278,7 +278,8 @@ static int dup_mmap(struct mm_struct *mm
> >  	 */
> >  	down_write_nested(&mm->mmap_sem, SINGLE_DEPTH_NESTING);
> >  
> > -	mm->locked_vm = 0;
> > +	if (!mm->mcl_inherit)
> > +		mm->locked_vm = 0;
> >  	mm->mmap = NULL;
> >  	mm->mmap_cache = NULL;
> >  	mm->free_area_cache = oldmm->mmap_base;
> > @@ -316,7 +317,8 @@ static int dup_mmap(struct mm_struct *mm
> >  		if (IS_ERR(pol))
> >  			goto fail_nomem_policy;
> >  		vma_set_policy(tmp, pol);
> > -		tmp->vm_flags &= ~VM_LOCKED;
> > +		if (!mm->mcl_inherit)
> > +			tmp->vm_flags &= ~VM_LOCKED;
> >  		tmp->vm_mm = mm;
> >  		tmp->vm_next = NULL;
> >  		anon_vma_link(tmp);
> > @@ -406,6 +408,8 @@ __cacheline_aligned_in_smp DEFINE_SPINLO
> >  
> >  static struct mm_struct * mm_init(struct mm_struct * mm, struct task_struct *p)
> >  {
> > +	unsigned long def_flags = 0;
> > +
> >  	atomic_set(&mm->mm_users, 1);
> >  	atomic_set(&mm->mm_count, 1);
> >  	init_rwsem(&mm->mmap_sem);
> > @@ -422,9 +426,14 @@ static struct mm_struct * mm_init(struct
> >  	mm->free_area_cache = TASK_UNMAPPED_BASE;
> >  	mm->cached_hole_size = ~0UL;
> >  	mm_init_owner(mm, p);
> > +	if (current->mm && current->mm->mcl_inherit) {
> > +		def_flags = current->mm->def_flags & VM_LOCKED;
> > +		if (mm->mcl_inherit & MCL_RECURSIVE)
> > +			mm->mcl_inherit  = current->mm->mcl_inherit;
> > +	}
> 
> hm.  That code looks familiar.  Not worth a helper function?

If we could remove the copy in load_elf_binary(), point would be moot.

More below.

> 
> >  	if (likely(!mm_alloc_pgd(mm))) {
> > -		mm->def_flags = 0;
> > +		mm->def_flags = def_flags;
> >  		mmu_notifier_mm_init(mm);
> >  		return mm;
> >  	}
> > Index: linux-2.6.28-rc7-mmotm-081203/fs/binfmt_elf.c
> > ===================================================================
> > --- linux-2.6.28-rc7-mmotm-081203.orig/fs/binfmt_elf.c	2008-12-03 10:19:21.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/fs/binfmt_elf.c	2008-12-03 10:33:29.000000000 -0500
> > @@ -585,6 +585,7 @@ static int load_elf_binary(struct linux_
> >  	unsigned long reloc_func_desc = 0;
> >  	int executable_stack = EXSTACK_DEFAULT;
> >  	unsigned long def_flags = 0;
> > +	int mcl_inherit = 0;
> >  	struct {
> >  		struct elfhdr elf_ex;
> >  		struct elfhdr interp_elf_ex;
> > @@ -749,6 +750,13 @@ static int load_elf_binary(struct linux_
> >  		SET_PERSONALITY(loc->elf_ex);
> >  	}
> >  
> > +	/* Optionally inherit MCL_FUTURE state before destroying old mm */
> > +	if (current->mm && current->mm->mcl_inherit) {
> 
> umm, OK, it looks like current->mm can be null here.

Probably not here.  IIRC, it can be NULL [for init] up in mm_init() and
I probably just cut and pasted, instead of a helper function...

> 
> > +		def_flags = current->mm->def_flags & VM_LOCKED;
> 
> ooh, we finally used local variable def_flags for something.  That's
> been wasting space since 2.2.26 (or earlier).

def_flags was being used to clear out the current->mm->def_flags.  Don't
know why, tho'.  See below...
> 
> > +		if (current->mm->mcl_inherit & MCL_RECURSIVE)
> > +			mcl_inherit  = current->mm->mcl_inherit;
> > +	}
> > +
> >  	/* Flush all traces of the currently running executable */
> >  	retval = flush_old_exec(bprm);
> >  	if (retval)
> > @@ -757,6 +765,7 @@ static int load_elf_binary(struct linux_
> >  	/* OK, This is the point of no return */
> >  	current->flags &= ~PF_FORKNOEXEC;
> >  	current->mm->def_flags = def_flags;
> > +	current->mm->mcl_inherit = mcl_inherit;
> >  
> >  	/* Do this immediately, since STACK_TOP as used in setup_arg_pages
> >  	   may depend on the personality.  */
> 
> Why is this done in binfmt_elf.c?  Doesn't this preclude the operation
> of these new flags for other binary formats?

Because binfmt_elf.c is the only binary format loader [that I can see]
that takes it upon itself to reinitialize mm->def_flags, which has
already been set up in mm_init().  I wondered, myself, why this is the
case, but I didn't dare just remove it, lest it reintroduce some obscure
bug.  So, I duplicated my version of def_flags [and mcl_inherit]
initialization here. 

Now that you've raised the question:  why, indeed, does the elf format
loader reinit def_flags?  Can we just remove this?

> 
> > Index: linux-2.6.28-rc7-mmotm-081203/arch/x86/include/asm/mman.h
> > ===================================================================
> > --- linux-2.6.28-rc7-mmotm-081203.orig/arch/x86/include/asm/mman.h	2008-12-03 10:16:26.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/arch/x86/include/asm/mman.h	2008-12-03 10:33:29.000000000 -0500
> > @@ -16,5 +16,8 @@
> >  
> >  #define MCL_CURRENT	1		/* lock all current mappings */
> >  #define MCL_FUTURE	2		/* lock all future mappings */
> > +#define MCL_INHERIT	4		/* inherit mlocks across fork */
> > +					/* inherit '_FUTURE flag across fork/exec */
> > +#define MCL_RECURSIVE	8		/* inherit mlocks recursively */
> >  
> >  #endif /* _ASM_X86_MMAN_H */
> > Index: linux-2.6.28-rc7-mmotm-081203/include/linux/mm_types.h
> > ===================================================================
> > --- linux-2.6.28-rc7-mmotm-081203.orig/include/linux/mm_types.h	2008-12-03 10:18:01.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/include/linux/mm_types.h	2008-12-03 10:33:29.000000000 -0500
> > @@ -235,6 +235,8 @@ struct mm_struct {
> >  	unsigned int token_priority;
> >  	unsigned int last_interval;
> >  
> > +	int mcl_inherit;		/* inherit current/future locks */
> > +
> >  	unsigned long flags; /* Must use atomic bitops to access the bits */
> 
> I was thinking that the boolean mcl_inherit could become a bit in
> mm.flags.  That was before I unmisled myself about it.
> 
> Perhaps we could use two bits.  That might get a bit nasty.

I did look for someplace to add these flags, before deciding on a new
member.  The "Must use atomic bitops", and the fact that mm->flags seems
used soley for controlling core dumping [and defined in sched.h, no
less!], put me off from going that route.  Similarly, because def_flags
is an initializer for the vma's vm_flags, I didn't want to pollute that.
I figured the 'int' would fill a padding slot on 64-bit archs, but does
add to mm size for 32-bit archs.

Suggestions?
I'll hold off on a re-spin, pending resolution of some of these issues.

Thanks for the thorough review.

Lee



WARNING: multiple messages have this Message-ID (diff)
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>,
	riel@redhat.com, hugh@veritas.com,
	kosaki.motohiro@jp.fujitsu.com, linux-abi@vger.kernel.org
Subject: Re: [PATCH] - support inheritance of mlocks across fork/exec V2
Date: Mon, 08 Dec 2008 10:01:05 -0500	[thread overview]
Message-ID: <1228748465.22647.85.camel@lts-notebook> (raw)
In-Reply-To: <20081206220729.042a926e.akpm@linux-foundation.org>

On Sat, 2008-12-06 at 22:07 -0800, Andrew Morton wrote:
> (cc linux-abi.  Again.)

Sorry.

> 
> On Wed, 03 Dec 2008 14:04:29 -0500 Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> 
> > 
> > 
> > Against;  2.6.28-rc7-mmotm-081203
> > 
> > V02:	rework vetting of flags argument as suggested by
> > 	Kosaki Motohiro.
> > 	enhance description as requested by Andrew Morton.
> > 
> > Add support for mlockall(MCL_INHERIT|MCL_RECURSIVE):
> > 
> > 	MCL_CURRENT[|MCL_FUTURE]|MCL_INHERIT - inherit memory locks
> > 		[vmas' VM_LOCKED flags] across fork(), and inherit
> > 		MCL_FUTURE behavior [mm's def_flags] across fork()
> > 		and exec().  Behaves as if child and/or new task
> > 		called mlockall(MCL_CURRENT|MCL_FUTURE) as first
> > 		instruction.
> > 
> > 	MCL_RECURSIVE - inherit MCL_CURRENT|MCL_FUTURE|MCL_INHERIT
> > 		[vmas' VM_LOCKED flags for fork() and mm's def_flags
> > 		and mcl_inherit across fork() and exec()] for all
> > 		future generations of calling process's descendants.
> > 		Behaves as if child and/or new task called
> > 		mlockall(MCL_CURRENT|MCL_FUTURE|MCL_INHERIT|MCL_RECURSIVE)
> > 		as the first instruction.
> > 
> > In support of a "lock prefix command"--e.g., mlock <cmd> <args> ...
> > Analogous to taskset(1) for cpu affinity or numactl(8) for numa memory
> > policy.
> > 
> > Together with patches to keep mlocked pages off the LRU, this will
> > allow users/admins to lock down applications without modifying them,
> > if their RLIMIT_MEMLOCK is sufficiently large, keeping their pages
> > off the LRU and out of consideration for reclaim.
> > 
> > Potentially useful, as well, in real-time environments to force
> > prefaulting and residency for applications that don't mlock themselves.
> > 
> > Jeff Sharkey at Montana State developed a similar patch for Linux
> > [link no longer accessible], but apparently he never submitted the patch.
> > 
> > I submitted an earlier version of this patch around a year ago.  I
> > resurrected it to test the unevictable lru/mlocked pages patches--
> > e.g., by "mlock -r make -j<N*nr_cpus> all".  This did shake out a few
> > races and vmstat accounting bugs, but NOT something I'd recommend as
> > general practice--for kernel builds, that is.
> > 
> > ----
> > 
> > Define MCL_INHERIT, MCL_RECURSIVE in <asm-*/mman.h>.
> > + x86 and  ia64 versions included.
> > + other arch can/will be created, if this patch deemed merge-worthy.
> > 
> > Similarly, I'll provide kernel man page update if/when needed.
> > 
> > Example "lock prefix command" in Documentation/vm/mlock.c
> > 
> >
> > ...
> >
> > --- linux-2.6.28-rc7-mmotm-081203.orig/mm/mlock.c	2008-12-03 10:33:11.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/mm/mlock.c	2008-12-03 10:33:29.000000000 -0500
> > @@ -573,15 +573,18 @@ asmlinkage long sys_munlock(unsigned lon
> >  static int do_mlockall(int flags)
> >  {
> >  	struct vm_area_struct * vma, * prev = NULL;
> > +	struct mm_struct *mm = current->mm;
> >  	unsigned int def_flags = 0;
> >  
> >  	if (flags & MCL_FUTURE)
> > -		def_flags = VM_LOCKED;
> > -	current->mm->def_flags = def_flags;
> > -	if (flags == MCL_FUTURE)
> > +		def_flags = VM_LOCKED;;
> 
> You get paid by the semicolon?

Nope.  Interestingly, ckeckpatch was fine with this.  [It did warn about
>80 char line in x86 mman.h, but there was already a much longer line,
so I left it.  Fixed now so that patch is squeaky clean, as far as
checkpatch is concerned.]

> 
> > +	mm->def_flags = def_flags;
> > +	if (flags & MCL_INHERIT)
> > +		mm->mcl_inherit = flags & (MCL_INHERIT | MCL_RECURSIVE);
> 
> aaaaaaarrrrrrggggggghhhhhhh!
> 
> So mm->mcl_inherit is _not_ a boolean meaning "this mm has MCL_INHERIT
> set", as I naively believed when I saw it.  Is it in fact a composite
> of both MCL_INHERIT and MCL_RECURSIVE, given a badly misleading name
> and a waffly comment.
> 
> Can we please make this clearer?

How about "mcl_inherit_mlocks_possibly_recursively" ?

Seriously, it made/makes sense to me.  If mcl_inherit is non-zero, it
WILL contain MCL_INHERIT, meaning that existing mlocks and any VM_LOCKED
in def_flags will be inherited across the next fork or exec [of course,
existing mlocks can't be inherited across exec as we discard the address
space].  It may also contain MCL_RECURSIVE--tested separately--meaning
that the contents of mcl_inherit, itself, will also be inherited across
the next and future forks/execs, until/unless a descendant invokes
mlockall() w/o MCL_INHERIT or MCL_RECURSIVE to clear out mcl_inherit.

This seemed a bit much to try to explain on line where mcl_inherit is
declared in mm_types.  More appropriate to a man page, I would have
thought.   How about a Documentation/vm doc?

Meanwhile, perhaps:  "mcl_inherit" -> "mcl_inherit_flags" or "mcl_flags"
or "mlock_flags", or such?

> 
> >
> > ...
> >
> > --- linux-2.6.28-rc7-mmotm-081203.orig/kernel/fork.c	2008-12-03 10:18:15.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/kernel/fork.c	2008-12-03 10:33:29.000000000 -0500
> > @@ -278,7 +278,8 @@ static int dup_mmap(struct mm_struct *mm
> >  	 */
> >  	down_write_nested(&mm->mmap_sem, SINGLE_DEPTH_NESTING);
> >  
> > -	mm->locked_vm = 0;
> > +	if (!mm->mcl_inherit)
> > +		mm->locked_vm = 0;
> >  	mm->mmap = NULL;
> >  	mm->mmap_cache = NULL;
> >  	mm->free_area_cache = oldmm->mmap_base;
> > @@ -316,7 +317,8 @@ static int dup_mmap(struct mm_struct *mm
> >  		if (IS_ERR(pol))
> >  			goto fail_nomem_policy;
> >  		vma_set_policy(tmp, pol);
> > -		tmp->vm_flags &= ~VM_LOCKED;
> > +		if (!mm->mcl_inherit)
> > +			tmp->vm_flags &= ~VM_LOCKED;
> >  		tmp->vm_mm = mm;
> >  		tmp->vm_next = NULL;
> >  		anon_vma_link(tmp);
> > @@ -406,6 +408,8 @@ __cacheline_aligned_in_smp DEFINE_SPINLO
> >  
> >  static struct mm_struct * mm_init(struct mm_struct * mm, struct task_struct *p)
> >  {
> > +	unsigned long def_flags = 0;
> > +
> >  	atomic_set(&mm->mm_users, 1);
> >  	atomic_set(&mm->mm_count, 1);
> >  	init_rwsem(&mm->mmap_sem);
> > @@ -422,9 +426,14 @@ static struct mm_struct * mm_init(struct
> >  	mm->free_area_cache = TASK_UNMAPPED_BASE;
> >  	mm->cached_hole_size = ~0UL;
> >  	mm_init_owner(mm, p);
> > +	if (current->mm && current->mm->mcl_inherit) {
> > +		def_flags = current->mm->def_flags & VM_LOCKED;
> > +		if (mm->mcl_inherit & MCL_RECURSIVE)
> > +			mm->mcl_inherit  = current->mm->mcl_inherit;
> > +	}
> 
> hm.  That code looks familiar.  Not worth a helper function?

If we could remove the copy in load_elf_binary(), point would be moot.

More below.

> 
> >  	if (likely(!mm_alloc_pgd(mm))) {
> > -		mm->def_flags = 0;
> > +		mm->def_flags = def_flags;
> >  		mmu_notifier_mm_init(mm);
> >  		return mm;
> >  	}
> > Index: linux-2.6.28-rc7-mmotm-081203/fs/binfmt_elf.c
> > ===================================================================
> > --- linux-2.6.28-rc7-mmotm-081203.orig/fs/binfmt_elf.c	2008-12-03 10:19:21.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/fs/binfmt_elf.c	2008-12-03 10:33:29.000000000 -0500
> > @@ -585,6 +585,7 @@ static int load_elf_binary(struct linux_
> >  	unsigned long reloc_func_desc = 0;
> >  	int executable_stack = EXSTACK_DEFAULT;
> >  	unsigned long def_flags = 0;
> > +	int mcl_inherit = 0;
> >  	struct {
> >  		struct elfhdr elf_ex;
> >  		struct elfhdr interp_elf_ex;
> > @@ -749,6 +750,13 @@ static int load_elf_binary(struct linux_
> >  		SET_PERSONALITY(loc->elf_ex);
> >  	}
> >  
> > +	/* Optionally inherit MCL_FUTURE state before destroying old mm */
> > +	if (current->mm && current->mm->mcl_inherit) {
> 
> umm, OK, it looks like current->mm can be null here.

Probably not here.  IIRC, it can be NULL [for init] up in mm_init() and
I probably just cut and pasted, instead of a helper function...

> 
> > +		def_flags = current->mm->def_flags & VM_LOCKED;
> 
> ooh, we finally used local variable def_flags for something.  That's
> been wasting space since 2.2.26 (or earlier).

def_flags was being used to clear out the current->mm->def_flags.  Don't
know why, tho'.  See below...
> 
> > +		if (current->mm->mcl_inherit & MCL_RECURSIVE)
> > +			mcl_inherit  = current->mm->mcl_inherit;
> > +	}
> > +
> >  	/* Flush all traces of the currently running executable */
> >  	retval = flush_old_exec(bprm);
> >  	if (retval)
> > @@ -757,6 +765,7 @@ static int load_elf_binary(struct linux_
> >  	/* OK, This is the point of no return */
> >  	current->flags &= ~PF_FORKNOEXEC;
> >  	current->mm->def_flags = def_flags;
> > +	current->mm->mcl_inherit = mcl_inherit;
> >  
> >  	/* Do this immediately, since STACK_TOP as used in setup_arg_pages
> >  	   may depend on the personality.  */
> 
> Why is this done in binfmt_elf.c?  Doesn't this preclude the operation
> of these new flags for other binary formats?

Because binfmt_elf.c is the only binary format loader [that I can see]
that takes it upon itself to reinitialize mm->def_flags, which has
already been set up in mm_init().  I wondered, myself, why this is the
case, but I didn't dare just remove it, lest it reintroduce some obscure
bug.  So, I duplicated my version of def_flags [and mcl_inherit]
initialization here. 

Now that you've raised the question:  why, indeed, does the elf format
loader reinit def_flags?  Can we just remove this?

> 
> > Index: linux-2.6.28-rc7-mmotm-081203/arch/x86/include/asm/mman.h
> > ===================================================================
> > --- linux-2.6.28-rc7-mmotm-081203.orig/arch/x86/include/asm/mman.h	2008-12-03 10:16:26.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/arch/x86/include/asm/mman.h	2008-12-03 10:33:29.000000000 -0500
> > @@ -16,5 +16,8 @@
> >  
> >  #define MCL_CURRENT	1		/* lock all current mappings */
> >  #define MCL_FUTURE	2		/* lock all future mappings */
> > +#define MCL_INHERIT	4		/* inherit mlocks across fork */
> > +					/* inherit '_FUTURE flag across fork/exec */
> > +#define MCL_RECURSIVE	8		/* inherit mlocks recursively */
> >  
> >  #endif /* _ASM_X86_MMAN_H */
> > Index: linux-2.6.28-rc7-mmotm-081203/include/linux/mm_types.h
> > ===================================================================
> > --- linux-2.6.28-rc7-mmotm-081203.orig/include/linux/mm_types.h	2008-12-03 10:18:01.000000000 -0500
> > +++ linux-2.6.28-rc7-mmotm-081203/include/linux/mm_types.h	2008-12-03 10:33:29.000000000 -0500
> > @@ -235,6 +235,8 @@ struct mm_struct {
> >  	unsigned int token_priority;
> >  	unsigned int last_interval;
> >  
> > +	int mcl_inherit;		/* inherit current/future locks */
> > +
> >  	unsigned long flags; /* Must use atomic bitops to access the bits */
> 
> I was thinking that the boolean mcl_inherit could become a bit in
> mm.flags.  That was before I unmisled myself about it.
> 
> Perhaps we could use two bits.  That might get a bit nasty.

I did look for someplace to add these flags, before deciding on a new
member.  The "Must use atomic bitops", and the fact that mm->flags seems
used soley for controlling core dumping [and defined in sched.h, no
less!], put me off from going that route.  Similarly, because def_flags
is an initializer for the vma's vm_flags, I didn't want to pollute that.
I figured the 'int' would fill a padding slot on 64-bit archs, but does
add to mm size for 32-bit archs.

Suggestions?
I'll hold off on a re-spin, pending resolution of some of these issues.

Thanks for the thorough review.

Lee


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-12-08 15:01 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-24 21:21 [PATCH/RFC] - support inheritance of mlocks across fork/exec Lee Schermerhorn
2008-11-25  4:05 ` KOSAKI Motohiro
2008-11-25 16:21 ` Rik van Riel
2008-11-25 23:26 ` Andrew Morton
2008-12-03 19:04   ` [PATCH] - support inheritance of mlocks across fork/exec V2 Lee Schermerhorn
2008-12-03 19:04     ` Lee Schermerhorn
2008-12-04  1:57     ` KOSAKI Motohiro
2008-12-04  1:57       ` KOSAKI Motohiro
2008-12-07  6:07     ` Andrew Morton
2008-12-07  6:07       ` Andrew Morton
2008-12-08 15:01       ` Lee Schermerhorn [this message]
2008-12-08 15:01         ` Lee Schermerhorn
     [not found]       ` <20081206220729.042a926e.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2008-12-08 21:05         ` Lee Schermerhorn
2008-12-08 21:05           ` Lee Schermerhorn
2008-12-08 21:05           ` Lee Schermerhorn
2008-12-08 21:33           ` Matt Mackall
2008-12-08 21:33             ` Matt Mackall
2008-12-08 21:33             ` Matt Mackall
2008-12-09 19:40             ` Lee Schermerhorn
2008-12-09 19:40               ` Lee Schermerhorn
2008-12-09 19:40               ` Lee Schermerhorn
2008-12-09 20:41               ` Matt Mackall
2008-12-09 20:41                 ` Matt Mackall
2008-12-09 20:41                 ` Matt Mackall
2009-06-05  4:39     ` Jon Masters
2009-06-05  4:39       ` Jon Masters
2009-06-05  4:49       ` KOSAKI Motohiro
2009-06-05  4:49         ` KOSAKI Motohiro
2009-06-05  5:12         ` Jon Masters
2009-06-05  5:12           ` Jon Masters
2008-11-26  8:37 ` [PATCH/RFC] - support inheritance of mlocks across fork/exec KOSAKI Motohiro
2008-11-29 22:38   ` Lee Schermerhorn
2008-11-30  5:21     ` KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1228748465.22647.85.camel@lts-notebook \
    --to=lee.schermerhorn@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=hugh@veritas.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-abi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.