* __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread @ 2008-10-21 15:14 hooanon05 2008-10-21 15:46 ` Alan Cox 0 siblings, 1 reply; 9+ messages in thread From: hooanon05 @ 2008-10-21 15:14 UTC (permalink / raw) To: linux-kernel Hello all, When sysctl_overcommit_memory is set OVERCOMMIT_NEVER, __vm_enough_memory() refers current->mm. For example, # exportfs -i -o ... localhost:/tmpfs # mkdir /tmp/w # mount -o ... localhost:/tmpfs /tmp/w # yes > /tmp/w/fileA In this case, NFSD (kernel thread) calls shmem_file_write() or shmem_write_begin() and __vm_enough_memory() is called. But current->mm is NULL and the kernel crashes. If a user have to set OVERCOMMIT_NEVER, where should we fix? Junjiro R. Okajima ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-21 15:14 __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread hooanon05 @ 2008-10-21 15:46 ` Alan Cox 2008-10-21 23:10 ` hooanon05 0 siblings, 1 reply; 9+ messages in thread From: Alan Cox @ 2008-10-21 15:46 UTC (permalink / raw) To: hooanon05; +Cc: linux-kernel > In this case, NFSD (kernel thread) calls shmem_file_write() or > shmem_write_begin() and __vm_enough_memory() is called. But current->mm > is NULL and the kernel crashes. > If a user have to set OVERCOMMIT_NEVER, where should we fix? Calling into the file system code assuming that current->mm is NULL isn't safe and hasn't been for a very long time since someone added the 3% hack. The shmem case is actually a bit special so my thoughts are: Make security_vm_enough_memory() WARN() if current->mm = NULL Make security_vm_enough_memory_mm() WARN() if the passed mm = NULL Add security_vm_enough_memory_fs() which does not do the warning test All would still call security->ops->vm_enough_memory and then __vm_enough_memory() would skip the 3% adjustment when the passed mm was NULL Does that sound sensible ? Alan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-21 15:46 ` Alan Cox @ 2008-10-21 23:10 ` hooanon05 2008-10-22 5:35 ` hooanon05 2008-10-22 8:27 ` Alan Cox 0 siblings, 2 replies; 9+ messages in thread From: hooanon05 @ 2008-10-21 23:10 UTC (permalink / raw) To: Alan Cox; +Cc: linux-kernel Alan Cox: > Calling into the file system code assuming that current->mm is > NULL isn't safe and hasn't been for a very long time since someone added > the 3% hack. I guess - people don't care overcommit and leave it as default, so they don't meet the problem - people who cares overcommit has rich memory, and they don't meet the problem too. > The shmem case is actually a bit special so my thoughts are: > > Make security_vm_enough_memory() WARN() if current->mm = NULL > Make security_vm_enough_memory_mm() WARN() if the passed mm = NULL > Add security_vm_enough_memory_fs() which does not do the warning test > > All would still call security->ops->vm_enough_memory and then > __vm_enough_memory() would skip the 3% adjustment when the passed mm was > NULL > > Does that sound sensible ? In your first option, write() to the exported tmpfs will produce the warning on nfs server even if much memory is left. I don't think it is a good idea. I'd suggest to make __vm_enough_memory() would skip the 3% adjustment only. --- /src/linux-2.6/linux-2.6.27/mm/mmap.c 2008-10-10 07:13:53.000000000 +0900 +++ /tmp/mmap.c 2008-10-22 08:07:09.000000000 +0900 @@ -173,9 +173,10 @@ allowed -= allowed / 32; allowed += total_swap_pages; - /* Don't let a single process grow too big: + /* Don't let a single user process grow too big: leave 3% of the size of this process for other processes */ - allowed -= mm->total_vm / 32; + if (mm) + allowed -= mm->total_vm / 32; /* * cast `allowed' as a signed long because vm_committed_space Junjiro R. Okajima ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-21 23:10 ` hooanon05 @ 2008-10-22 5:35 ` hooanon05 2008-10-22 8:27 ` Alan Cox 1 sibling, 0 replies; 9+ messages in thread From: hooanon05 @ 2008-10-22 5:35 UTC (permalink / raw) To: Alan Cox, linux-kernel > I guess > - people don't care overcommit and leave it as default, so they don't > meet the problem > - people who cares overcommit has rich memory, and they don't meet the > problem too. Correction: the problem is unrelated to the memory size. I was sleepy when I wrote that. Regardless the memory size, when nfsd writes to tmpfs the system surely crashes. Junjiro R. Okajima ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-21 23:10 ` hooanon05 2008-10-22 5:35 ` hooanon05 @ 2008-10-22 8:27 ` Alan Cox 2008-10-22 11:26 ` hooanon05 1 sibling, 1 reply; 9+ messages in thread From: Alan Cox @ 2008-10-22 8:27 UTC (permalink / raw) To: hooanon05; +Cc: linux-kernel > In your first option, write() to the exported tmpfs will produce the > warning on nfs server even if much memory is left. I don't think it is a No - the shmem routine would use vm_enough_memory_fs which wouldn't care if current->mm is NULL, but the others would check. > good idea. > I'd suggest to make __vm_enough_memory() would skip the 3% adjustment > only. > > --- /src/linux-2.6/linux-2.6.27/mm/mmap.c 2008-10-10 07:13:53.000000000 +0900 > +++ /tmp/mmap.c 2008-10-22 08:07:09.000000000 +0900 > @@ -173,9 +173,10 @@ > allowed -= allowed / 32; > allowed += total_swap_pages; > > - /* Don't let a single process grow too big: > + /* Don't let a single user process grow too big: > leave 3% of the size of this process for other processes */ > - allowed -= mm->total_vm / 32; > + if (mm) > + allowed -= mm->total_vm / 32; Doing this means we lose the ability to trap incorrect calls to vm_enough_memory. Alan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-22 8:27 ` Alan Cox @ 2008-10-22 11:26 ` hooanon05 2008-10-22 12:09 ` Alan Cox 0 siblings, 1 reply; 9+ messages in thread From: hooanon05 @ 2008-10-22 11:26 UTC (permalink / raw) To: Alan Cox; +Cc: linux-kernel Alan Cox: > No - the shmem routine would use vm_enough_memory_fs which wouldn't care > if current->mm is NULL, but the others would check. Unfortunately I cannot imagine what new vm_enough_memory_fs() will be. Would you show me its draft or pseudo code? Junjiro R. Okajima ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-22 11:26 ` hooanon05 @ 2008-10-22 12:09 ` Alan Cox 2008-10-22 12:46 ` hooanon05 2008-10-22 13:37 ` James Morris 0 siblings, 2 replies; 9+ messages in thread From: Alan Cox @ 2008-10-22 12:09 UTC (permalink / raw) To: hooanon05; +Cc: linux-kernel On Wed, 22 Oct 2008 20:26:19 +0900 hooanon05@yahoo.co.jp wrote: > > Alan Cox: > > No - the shmem routine would use vm_enough_memory_fs which wouldn't care > > if current->mm is NULL, but the others would check. > > Unfortunately I cannot imagine what new vm_enough_memory_fs() will be. > Would you show me its draft or pseudo code? This is the patch I propose: nfsd: Fix vm overcommit crash From: Alan Cox <alan@redhat.com> Junjiro R. Okajima reported a problem where knfsd crashes if you are using it to export shmemfs objects and run strict overcommit. In this situation the current->mm based modifier to the overcommit goes through a NULL pointer. We could simply check for NULL and skip the modifier but we've caught other real bugs in the past from mm being NULL here - cases where we did need a valid mm set up (eg the exec bug about a year ago). To preserve the checks and get the logic we want shuffle the checking around and add a new helper to the vm_ security wrappers Also fix a current->mm reference in nommu that should use the passed mm --- include/linux/security.h | 1 + mm/mmap.c | 3 ++- mm/nommu.c | 3 ++- mm/shmem.c | 4 ++-- security/security.c | 9 +++++++++ 5 files changed, 16 insertions(+), 4 deletions(-) diff --git a/include/linux/security.h b/include/linux/security.h index f5c4a51..a2b8430 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1585,6 +1585,7 @@ int security_syslog(int type); int security_settime(struct timespec *ts, struct timezone *tz); int security_vm_enough_memory(long pages); int security_vm_enough_memory_mm(struct mm_struct *mm, long pages); +int security_vm_enough_memory_kern(long pages); int security_bprm_alloc(struct linux_binprm *bprm); void security_bprm_free(struct linux_binprm *bprm); void security_bprm_apply_creds(struct linux_binprm *bprm, int unsafe); diff --git a/mm/mmap.c b/mm/mmap.c index 74f4d15..de14ac2 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -175,7 +175,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) /* Don't let a single process grow too big: leave 3% of the size of this process for other processes */ - allowed -= mm->total_vm / 32; + if (mm) + allowed -= mm->total_vm / 32; /* * cast `allowed' as a signed long because vm_committed_space diff --git a/mm/nommu.c b/mm/nommu.c index 2696b24..7695dc8 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -1454,7 +1454,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) /* Don't let a single process grow too big: leave 3% of the size of this process for other processes */ - allowed -= current->mm->total_vm / 32; + if (mm) + allowed -= mm->total_vm / 32; /* * cast `allowed' as a signed long because vm_committed_space diff --git a/mm/shmem.c b/mm/shmem.c index d38d7e6..1677b3e 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -162,7 +162,7 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb) static inline int shmem_acct_size(unsigned long flags, loff_t size) { return (flags & VM_ACCOUNT)? - security_vm_enough_memory(VM_ACCT(size)): 0; + security_vm_enough_memory_kern(VM_ACCT(size)): 0; } static inline void shmem_unacct_size(unsigned long flags, loff_t size) @@ -180,7 +180,7 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size) static inline int shmem_acct_block(unsigned long flags) { return (flags & VM_ACCOUNT)? - 0: security_vm_enough_memory(VM_ACCT(PAGE_CACHE_SIZE)); + 0: security_vm_enough_memory_kern(VM_ACCT(PAGE_CACHE_SIZE)); } static inline void shmem_unacct_blocks(unsigned long flags, long pages) diff --git a/security/security.c b/security/security.c index 255b085..c0acfa7 100644 --- a/security/security.c +++ b/security/security.c @@ -198,14 +198,23 @@ int security_settime(struct timespec *ts, struct timezone *tz) int security_vm_enough_memory(long pages) { + WARN_ON(current->mm == NULL); return security_ops->vm_enough_memory(current->mm, pages); } int security_vm_enough_memory_mm(struct mm_struct *mm, long pages) { + WARN_ON(mm == NULL); return security_ops->vm_enough_memory(mm, pages); } +int security_vm_enough_memory_kern(long pages) +{ + /* If current->mm is a kernel thread then we will pass NULL, + for this specific case that is fine */ + return security_ops->vm_enough_memory(current->mm, pages); +} + int security_bprm_alloc(struct linux_binprm *bprm) { return security_ops->bprm_alloc_security(bprm); ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-22 12:09 ` Alan Cox @ 2008-10-22 12:46 ` hooanon05 2008-10-22 13:37 ` James Morris 1 sibling, 0 replies; 9+ messages in thread From: hooanon05 @ 2008-10-22 12:46 UTC (permalink / raw) To: Alan Cox; +Cc: linux-kernel Alan Cox: > This is the patch I propose: ::: > We could simply check for NULL and skip the modifier but we've caught other > real bugs in the past from mm being NULL here - cases where we did need a > valid mm set up (eg the exec bug about a year ago). OK. I didn't know there was an actual bug but now I understand why you want to insert WARN_ON(). I have nothing to say about your patch. Thank you Junjiro R. Okajima ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 2008-10-22 12:09 ` Alan Cox 2008-10-22 12:46 ` hooanon05 @ 2008-10-22 13:37 ` James Morris 1 sibling, 0 replies; 9+ messages in thread From: James Morris @ 2008-10-22 13:37 UTC (permalink / raw) To: Alan Cox; +Cc: hooanon05, linux-kernel, linux-security-module On Wed, 22 Oct 2008, Alan Cox wrote: [Adding lsm list to the cc] > On Wed, 22 Oct 2008 20:26:19 +0900 > hooanon05@yahoo.co.jp wrote: > > > > > Alan Cox: > > > No - the shmem routine would use vm_enough_memory_fs which wouldn't care > > > if current->mm is NULL, but the others would check. > > > > Unfortunately I cannot imagine what new vm_enough_memory_fs() will be. > > Would you show me its draft or pseudo code? > > This is the patch I propose: > > nfsd: Fix vm overcommit crash > > From: Alan Cox <alan@redhat.com> > > Junjiro R. Okajima reported a problem where knfsd crashes if you are using > it to export shmemfs objects and run strict overcommit. In this situation > the current->mm based modifier to the overcommit goes through a NULL > pointer. > > We could simply check for NULL and skip the modifier but we've caught other > real bugs in the past from mm being NULL here - cases where we did need a > valid mm set up (eg the exec bug about a year ago). > > To preserve the checks and get the logic we want shuffle the checking > around and add a new helper to the vm_ security wrappers > > Also fix a current->mm reference in nommu that should use the passed mm Looks ok to me. Acked-by: James Morris <jmorris@namei.org> > --- > > include/linux/security.h | 1 + > mm/mmap.c | 3 ++- > mm/nommu.c | 3 ++- > mm/shmem.c | 4 ++-- > security/security.c | 9 +++++++++ > 5 files changed, 16 insertions(+), 4 deletions(-) > > > diff --git a/include/linux/security.h b/include/linux/security.h > index f5c4a51..a2b8430 100644 > --- a/include/linux/security.h > +++ b/include/linux/security.h > @@ -1585,6 +1585,7 @@ int security_syslog(int type); > int security_settime(struct timespec *ts, struct timezone *tz); > int security_vm_enough_memory(long pages); > int security_vm_enough_memory_mm(struct mm_struct *mm, long pages); > +int security_vm_enough_memory_kern(long pages); > int security_bprm_alloc(struct linux_binprm *bprm); > void security_bprm_free(struct linux_binprm *bprm); > void security_bprm_apply_creds(struct linux_binprm *bprm, int unsafe); > diff --git a/mm/mmap.c b/mm/mmap.c > index 74f4d15..de14ac2 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -175,7 +175,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) > > /* Don't let a single process grow too big: > leave 3% of the size of this process for other processes */ > - allowed -= mm->total_vm / 32; > + if (mm) > + allowed -= mm->total_vm / 32; > > /* > * cast `allowed' as a signed long because vm_committed_space > diff --git a/mm/nommu.c b/mm/nommu.c > index 2696b24..7695dc8 100644 > --- a/mm/nommu.c > +++ b/mm/nommu.c > @@ -1454,7 +1454,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) > > /* Don't let a single process grow too big: > leave 3% of the size of this process for other processes */ > - allowed -= current->mm->total_vm / 32; > + if (mm) > + allowed -= mm->total_vm / 32; > > /* > * cast `allowed' as a signed long because vm_committed_space > diff --git a/mm/shmem.c b/mm/shmem.c > index d38d7e6..1677b3e 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -162,7 +162,7 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb) > static inline int shmem_acct_size(unsigned long flags, loff_t size) > { > return (flags & VM_ACCOUNT)? > - security_vm_enough_memory(VM_ACCT(size)): 0; > + security_vm_enough_memory_kern(VM_ACCT(size)): 0; > } > > static inline void shmem_unacct_size(unsigned long flags, loff_t size) > @@ -180,7 +180,7 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size) > static inline int shmem_acct_block(unsigned long flags) > { > return (flags & VM_ACCOUNT)? > - 0: security_vm_enough_memory(VM_ACCT(PAGE_CACHE_SIZE)); > + 0: security_vm_enough_memory_kern(VM_ACCT(PAGE_CACHE_SIZE)); > } > > static inline void shmem_unacct_blocks(unsigned long flags, long pages) > diff --git a/security/security.c b/security/security.c > index 255b085..c0acfa7 100644 > --- a/security/security.c > +++ b/security/security.c > @@ -198,14 +198,23 @@ int security_settime(struct timespec *ts, struct timezone *tz) > > int security_vm_enough_memory(long pages) > { > + WARN_ON(current->mm == NULL); > return security_ops->vm_enough_memory(current->mm, pages); > } > > int security_vm_enough_memory_mm(struct mm_struct *mm, long pages) > { > + WARN_ON(mm == NULL); > return security_ops->vm_enough_memory(mm, pages); > } > > +int security_vm_enough_memory_kern(long pages) > +{ > + /* If current->mm is a kernel thread then we will pass NULL, > + for this specific case that is fine */ > + return security_ops->vm_enough_memory(current->mm, pages); > +} > + > int security_bprm_alloc(struct linux_binprm *bprm) > { > return security_ops->bprm_alloc_security(bprm); > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-10-22 13:37 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-10-21 15:14 __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread hooanon05 2008-10-21 15:46 ` Alan Cox 2008-10-21 23:10 ` hooanon05 2008-10-22 5:35 ` hooanon05 2008-10-22 8:27 ` Alan Cox 2008-10-22 11:26 ` hooanon05 2008-10-22 12:09 ` Alan Cox 2008-10-22 12:46 ` hooanon05 2008-10-22 13:37 ` James Morris
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox