* [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail
From: Azat Khuzhin @ 2013-07-30 19:56 UTC
  To: linux-kernel; +Cc: Azat Khuzhin, Hugh Dickins, linux-mm

Otherwise, if there is no space left on the shmem device, there will be a
"Bus error" when the application tries to write to the address space that was
returned by mmap(2).

This patch also preserves the old behaviour if MAP_NORESERVE/VM_NORESERVE
is set.

So, with this patch, you will get the following:

a)
$ echo 2 >| /proc/sys/vm/overcommit_memory
....
mmap() = MAP_FAILED;
....

b)
....
mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE) = !MAP_FAILED;
write()
killed by SIGBUS
....

c)
$ echo 0 >| /proc/sys/vm/overcommit_memory
....
mmap() = !MAP_FAILED;
write()
killed by SIGBUS
....

Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
---
 mm/shmem.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index a87990c..965f4ba 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -32,6 +32,8 @@
 #include <linux/export.h>
 #include <linux/swap.h>
 #include <linux/aio.h>
+#include <linux/statfs.h>
+#include <linux/path.h>
 
 static struct vfsmount *shm_mnt;
 
@@ -1356,6 +1358,20 @@ out_nomem:
 
 static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 {
+	if (!(vma->vm_flags & VM_NORESERVE) &&
+	    sysctl_overcommit_memory == OVERCOMMIT_NEVER) {
+		struct inode *inode = file_inode(file);
+		struct kstatfs sbuf;
+		u64 size;
+
+		inode->i_sb->s_op->statfs(file->f_dentry, &sbuf);
+		size = sbuf.f_bfree * sbuf.f_bsize;
+
+		if (size < inode->i_size) {
+			return -ENOMEM;
+		}
+	}
+
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
 	return 0;
--
1.7.10.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail
From: Hugh Dickins @ 2013-07-31  6:32 UTC
  To: Azat Khuzhin; +Cc: linux-kernel, linux-mm

On Tue, 30 Jul 2013, Azat Khuzhin wrote:

> Otherwise, if there is no space left on the shmem device, there will be a
> "Bus error" when the application tries to write to the address space that was
> returned by mmap(2).
>
> This patch also preserves the old behaviour if MAP_NORESERVE/VM_NORESERVE
> is set.
>
> So, with this patch, you will get the following:
>
> a)
> $ echo 2 >| /proc/sys/vm/overcommit_memory
> ....
> mmap() = MAP_FAILED;
> ....
>
> b)
> ....
> mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE) = !MAP_FAILED;
> write()
> killed by SIGBUS
> ....
>
> c)
> $ echo 0 >| /proc/sys/vm/overcommit_memory
> ....
> mmap() = !MAP_FAILED;
> write()
> killed by SIGBUS
> ....
>
> Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>

Thanks for making the patch, but I'm afraid there are a number of
things wrong with it; and even if it were perfect, I would still be
reluctant to change the semantics of shmem_mmap() after all this time.

Some comments on your implementation below; but if getting SIGBUS from
a write to an mmapping, once the underlying filesystem (shmem/tmpfs or
any other) fills up, if that SIGBUS is troublesome for you, then please
try using fallocate() to allocate the space before accessing the mmapping.

> ---
>  mm/shmem.c |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index a87990c..965f4ba 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -32,6 +32,8 @@
>  #include <linux/export.h>
>  #include <linux/swap.h>
>  #include <linux/aio.h>
> +#include <linux/statfs.h>
> +#include <linux/path.h>

I'm surprised you need either of those: vfs.h should have already
included statfs.h, and I don't see what path.h would be for.

>
>  static struct vfsmount *shm_mnt;
>
> @@ -1356,6 +1358,20 @@ out_nomem:
>
>  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
>  {
> +	if (!(vma->vm_flags & VM_NORESERVE) &&
> +	    sysctl_overcommit_memory == OVERCOMMIT_NEVER) {

So, this would be a new and different usage of sysctl_overcommit_memory:
usually it applies to vm_committed_as accounting, but you're extending
it to affect tmpfs filesystem size accounting.  Hmm.

> +		struct inode *inode = file_inode(file);
> +		struct kstatfs sbuf;
> +		u64 size;
> +
> +		inode->i_sb->s_op->statfs(file->f_dentry, &sbuf);

You don't really need to go through ->statfs(), since that will arrive
at shmem_statfs().  Where you can see there will be a problem in the
case of an unlimited (max_blocks=0) mount - you will fail mmap() of
every file of non-0 size - and mmaps of 0-size files aren't much use!
But moving on from that case...

> +		size = sbuf.f_bfree * sbuf.f_bsize;
> +
> +		if (size < inode->i_size) {
> +			return -ENOMEM;

So, if your filesystem is full, mmap() of any (i_size>0) file in it will
fail?  I don't think that's what you want at all.  You seem to be assuming
that no pages of the file you're mmap()ing have been allocated yet: that
may be the case, but it's very often not so.

> +		}

And if we pass that test, there's still no assurance that you won't get
SIGBUS from accessing the mmapping: nothing has actually been reserved
here, and other activity on the system can gobble up all the remaining
space in the filesystem, or take vm_committed_as to its maximum.

> +	}
> +
>  	file_accessed(file);
>  	vma->vm_ops = &shmem_vm_ops;
>  	return 0;
> --
> 1.7.10.4

Please "man 2 fallocate" and use that instead.

Hugh
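
[Editor's sketch, not part of the thread] The fallocate() approach Hugh
recommends might look roughly like the following in user space. The helper
name map_reserved() is hypothetical, and the sketch assumes a Linux system
whose filesystem (tmpfs here) implements fallocate(2) with mode 0. The idea
is that running out of space is reported as an error from fallocate()
before the mapping is ever touched, instead of as SIGBUS on first write.

```c
#define _GNU_SOURCE		/* exposes fallocate() in <fcntl.h> */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the thread): reserve `length` bytes of
 * backing store for `fd`, then map the file shared and writable.
 * Returns the mapping, or NULL with errno set by the failing call
 * (for example ENOSPC when the filesystem cannot supply the blocks).
 */
static void *map_reserved(int fd, size_t length)
{
	void *addr;

	/* Mode 0 allocates the range and extends the file size if needed. */
	if (fallocate(fd, 0, 0, (off_t)length) != 0)
		return NULL;

	addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	return addr == MAP_FAILED ? NULL : addr;
}
```

A caller would typically obtain fd from shm_open(name, O_CREAT | O_RDWR, 0600)
and treat a NULL return as "not enough space", checking errno for the reason.
Unlike the size test in the patch, the allocation is actually performed, so
other activity on the filesystem cannot take the space away afterwards.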

* Re: [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail
From: Azat Khuzhin @ 2013-07-31  9:28 UTC
  To: Hugh Dickins; +Cc: open list, linux-mm

On Wed, Jul 31, 2013 at 10:32 AM, Hugh Dickins <hughd@google.com> wrote:
> On Tue, 30 Jul 2013, Azat Khuzhin wrote:
>
>> Otherwise, if there is no space left on the shmem device, there will be a
>> "Bus error" when the application tries to write to the address space that was
>> returned by mmap(2).
>>
>> This patch also preserves the old behaviour if MAP_NORESERVE/VM_NORESERVE
>> is set.
>>
>> So, with this patch, you will get the following:
>>
>> a)
>> $ echo 2 >| /proc/sys/vm/overcommit_memory
>> ....
>> mmap() = MAP_FAILED;
>> ....
>>
>> b)
>> ....
>> mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE) = !MAP_FAILED;
>> write()
>> killed by SIGBUS
>> ....
>>
>> c)
>> $ echo 0 >| /proc/sys/vm/overcommit_memory
>> ....
>> mmap() = !MAP_FAILED;
>> write()
>> killed by SIGBUS
>> ....
>>
>> Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
>
> Thanks for making the patch, but I'm afraid there are a number of
> things wrong with it; and even if it were perfect, I would still be
> reluctant to change the semantics of shmem_mmap() after all this time.

I was also thinking about this, but since it only changes behavior with
OVERCOMMIT_NEVER, I posted this patch.

>
> Some comments on your implementation below; but if getting SIGBUS from
> a write to an mmapping, once the underlying filesystem (shmem/tmpfs or
> any other) fills up, if that SIGBUS is troublesome for you, then please
> try using fallocate() to allocate the space before accessing the mmapping.

Oh, I forgot about fallocate().
Thanks for your comments, I will keep them in mind!

>
>> ---
>>  mm/shmem.c |   16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index a87990c..965f4ba 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -32,6 +32,8 @@
>>  #include <linux/export.h>
>>  #include <linux/swap.h>
>>  #include <linux/aio.h>
>> +#include <linux/statfs.h>
>> +#include <linux/path.h>
>
> I'm surprised you need either of those: vfs.h should have already
> included statfs.h, and I don't see what path.h would be for.
>
>>
>>  static struct vfsmount *shm_mnt;
>>
>> @@ -1356,6 +1358,20 @@ out_nomem:
>>
>>  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
>>  {
>> +	if (!(vma->vm_flags & VM_NORESERVE) &&
>> +	    sysctl_overcommit_memory == OVERCOMMIT_NEVER) {
>
> So, this would be a new and different usage of sysctl_overcommit_memory:
> usually it applies to vm_committed_as accounting, but you're extending
> it to affect tmpfs filesystem size accounting.  Hmm.
>
>> +		struct inode *inode = file_inode(file);
>> +		struct kstatfs sbuf;
>> +		u64 size;
>> +
>> +		inode->i_sb->s_op->statfs(file->f_dentry, &sbuf);
>
> You don't really need to go through ->statfs(), since that will arrive
> at shmem_statfs().  Where you can see there will be a problem in the
> case of an unlimited (max_blocks=0) mount - you will fail mmap() of
> every file of non-0 size - and mmaps of 0-size files aren't much use!
> But moving on from that case...

Nice catch, thanks!

>
>> +		size = sbuf.f_bfree * sbuf.f_bsize;
>> +
>> +		if (size < inode->i_size) {
>> +			return -ENOMEM;
>
> So, if your filesystem is full, mmap() of any (i_size>0) file in it will
> fail?  I don't think that's what you want at all.  You seem to be assuming
> that no pages of the file you're mmap()ing have been allocated yet: that
> may be the case, but it's very often not so.
>
>> +		}
>
> And if we pass that test, there's still no assurance that you won't get
> SIGBUS from accessing the mmapping: nothing has actually been reserved
> here, and other activity on the system can gobble up all the remaining
> space in the filesystem, or take vm_committed_as to its maximum.

Completely slipped my mind.

>
>> +	}
>> +
>>  	file_accessed(file);
>>  	vma->vm_ops = &shmem_vm_ops;
>>  	return 0;
>> --
>> 1.7.10.4
>
> Please "man 2 fallocate" and use that instead.
>
> Hugh

--
Respectfully
Azat Khuzhin
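
[Editor's sketch, not part of the thread] Two of the objections raised in
the review can be illustrated with a small user-space model of the size
test from the proposed hunk. The function name patch_rejects_mmap() is
hypothetical, and the model assumes, following Hugh's comment about
shmem_statfs(), that an unlimited (max_blocks=0) mount reports zero free
blocks. It is only arithmetic on the statfs fields, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check in the proposed shmem_mmap() hunk.
 * Returns true where the patched mmap() would have returned -ENOMEM.
 */
static bool patch_rejects_mmap(uint64_t f_bfree, uint64_t f_bsize,
			       uint64_t i_size)
{
	uint64_t size = f_bfree * f_bsize;	/* free bytes per statfs */

	return size < i_size;
}
```

With f_bfree == 0, any file with i_size > 0 is rejected. That covers both the
unlimited mount (under the assumption above) and a full filesystem whose file
already has all of its pages allocated and could be mapped safely. Passing the
check reserves nothing, which is the third objection and the reason
fallocate() is suggested instead.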

end of thread, other threads:[~2013-07-31  9:28 UTC | newest]

Thread overview: 3+ messages
2013-07-30 19:56 [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail Azat Khuzhin
2013-07-31  6:32 ` Hugh Dickins
2013-07-31  9:28   ` Azat Khuzhin