linux-mm.kvack.org archive mirror
* [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail
@ 2013-07-30 19:56 Azat Khuzhin
  2013-07-31  6:32 ` Hugh Dickins
  0 siblings, 1 reply; 3+ messages in thread
From: Azat Khuzhin @ 2013-07-30 19:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: Azat Khuzhin, Hugh Dickins, linux-mm

Otherwise, if there is no space left on the shmem device, the application
gets a "Bus error" when it tries to write to the address space that was
returned by mmap(2).

This patch also preserves the old behaviour if MAP_NORESERVE/VM_NORESERVE
is set.

So, with this patch, you will get the following:

a)
$ echo 2 >| /proc/sys/vm/overcommit_memory
  ....
  mmap() = MAP_FAILED;
  ....

b)
  ....
  mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE) = !MAP_FAILED;
  write()
  killed by SIGBUS
  ....

c)
$ echo 0 >| /proc/sys/vm/overcommit_memory
  ....
  mmap() = !MAP_FAILED;
  write()
  killed by SIGBUS
  ....

Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
---
 mm/shmem.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index a87990c..965f4ba 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -32,6 +32,8 @@
 #include <linux/export.h>
 #include <linux/swap.h>
 #include <linux/aio.h>
+#include <linux/statfs.h>
+#include <linux/path.h>
 
 static struct vfsmount *shm_mnt;
 
@@ -1356,6 +1358,20 @@ out_nomem:
 
 static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 {
+	if (!(vma->vm_flags & VM_NORESERVE) &&
+	    sysctl_overcommit_memory == OVERCOMMIT_NEVER) {
+		struct inode *inode = file_inode(file);
+		struct kstatfs sbuf;
+		u64 size;
+
+		inode->i_sb->s_op->statfs(file->f_dentry, &sbuf);
+		size = sbuf.f_bfree * sbuf.f_bsize;
+
+		if (size < inode->i_size) {
+			return -ENOMEM;
+		}
+	}
+
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
 	return 0;
-- 
1.7.10.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail
  2013-07-30 19:56 [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail Azat Khuzhin
@ 2013-07-31  6:32 ` Hugh Dickins
  2013-07-31  9:28   ` Azat Khuzhin
  0 siblings, 1 reply; 3+ messages in thread
From: Hugh Dickins @ 2013-07-31  6:32 UTC (permalink / raw)
  To: Azat Khuzhin; +Cc: linux-kernel, linux-mm

On Tue, 30 Jul 2013, Azat Khuzhin wrote:

> Otherwise, if there is no space left on the shmem device, the application
> gets a "Bus error" when it tries to write to the address space that was
> returned by mmap(2).
> 
> This patch also preserves the old behaviour if MAP_NORESERVE/VM_NORESERVE
> is set.
> 
> So, with this patch, you will get the following:
> 
> a)
> $ echo 2 >| /proc/sys/vm/overcommit_memory
>   ....
>   mmap() = MAP_FAILED;
>   ....
> 
> b)
>   ....
>   mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE) = !MAP_FAILED;
>   write()
>   killed by SIGBUS
>   ....
> 
> c)
> $ echo 0 >| /proc/sys/vm/overcommit_memory
>   ....
>   mmap() = !MAP_FAILED;
>   write()
>   killed by SIGBUS
>   ....
> 
> Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>

Thanks for making the patch, but I'm afraid there are a number of
things wrong with it; and even if it were perfect, I would still be
reluctant to change the semantics of shmem_mmap() after all this time.

Some comments on your implementation below; but if getting SIGBUS from
a write to an mmapping, once the underlying filesystem (shmem/tmpfs or
any other) fills up, is troublesome for you, then please try using
fallocate() to allocate the space before accessing the mmapping.
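
A minimal sketch of that approach, assuming the descriptor comes from
shm_open() or from a file on tmpfs. The helper name map_reserved() is
hypothetical, and posix_fallocate() is used here as the portable wrapper
(glibc tries fallocate(2) first where the filesystem supports it). If the
space cannot be allocated, the caller sees an error such as ENOSPC up
front, rather than a SIGBUS later because the filesystem filled up:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Allocate len bytes of backing store for fd, then map it shared.
 * Returns 0 and stores the mapping in *out on success, or an errno
 * value (e.g. ENOSPC when the filesystem is full) on failure.
 */
static int map_reserved(int fd, size_t len, void **out)
{
	int err;
	void *p;

	assert(len > 0 && out != NULL);

	/* posix_fallocate() returns the error number directly, not via errno. */
	err = posix_fallocate(fd, 0, (off_t)len);
	if (err != 0)
		return err;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return errno;

	*out = p;
	return 0;
}
```

A caller would check the return value before touching the mapping, and
treat a non-zero result as "not enough space" instead of writing to it.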

> ---
>  mm/shmem.c |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index a87990c..965f4ba 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -32,6 +32,8 @@
>  #include <linux/export.h>
>  #include <linux/swap.h>
>  #include <linux/aio.h>
> +#include <linux/statfs.h>
> +#include <linux/path.h>

I'm surprised you need either of those: vfs.h should have already
included statfs.h, and I don't see what path.h would be for.

>  
>  static struct vfsmount *shm_mnt;
>  
> @@ -1356,6 +1358,20 @@ out_nomem:
>  
>  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
>  {
> +	if (!(vma->vm_flags & VM_NORESERVE) &&
> +	    sysctl_overcommit_memory == OVERCOMMIT_NEVER) {

So, this would be a new and different usage of sysctl_overcommit_memory:
usually it applies to vm_committed_as accounting, but you're extending
it to affect tmpfs filesystem size accounting.  Hmm.

> +		struct inode *inode = file_inode(file);
> +		struct kstatfs sbuf;
> +		u64 size;
> +
> +		inode->i_sb->s_op->statfs(file->f_dentry, &sbuf);

You don't really need to go through ->statfs(), since that will arrive
at shmem_statfs().  Where you can see there will be a problem in the
case of an unlimited (max_blocks=0) mount - you will fail mmap() of
every file of non-0 size - and mmaps of 0-size files aren't much use!
But moving on from that case...

> +		size = sbuf.f_bfree * sbuf.f_bsize;
> +
> +		if (size < inode->i_size) {
> +			return -ENOMEM;

So, if your filesystem is full, mmap() of any (i_size>0) file in it will
fail?  I don't think that's what you want at all.  You seem to be assuming
that no pages of the file you're mmap()ing have been allocated yet: that
may be the case, but it's very often not so.
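
Both failure cases above can be seen by modelling the patch's arithmetic
in isolation. This is a sketch only: struct fs_stats and
patch_would_reject() are hypothetical stand-ins for struct kstatfs and the
check added to shmem_mmap(), and it assumes an unlimited mount reports
f_bfree as 0, as the comment about shmem_statfs() implies:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two kstatfs fields the patch reads. */
struct fs_stats {
	uint64_t f_bfree;	/* free blocks */
	uint64_t f_bsize;	/* block size in bytes */
};

/* Mirrors the patch's test: reject when free bytes < file size. */
static bool patch_would_reject(const struct fs_stats *st, uint64_t i_size)
{
	uint64_t free_bytes = st->f_bfree * st->f_bsize;

	return free_bytes < i_size;
}
```

With f_bfree == 0 (an unlimited mount under the assumption above, or a
full filesystem), every non-empty file is rejected, including one whose
pages are all allocated already and which needs no new blocks.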

> +		}

And if we pass that test, there's still no assurance that you won't get
SIGBUS from accessing the mmapping: nothing has actually been reserved
here, and other activity on the system can gobble up all the remaining
space in the filesystem, or take vm_committed_as to its maximum.

> +	}
> +
>  	file_accessed(file);
>  	vma->vm_ops = &shmem_vm_ops;
>  	return 0;
> -- 
> 1.7.10.4

Please "man 2 fallocate" and use that instead.

Hugh


* Re: [PATCH] mm: for shm_open()/mmap() with OVERCOMMIT_NEVER, return -1 if no memory avail
  2013-07-31  6:32 ` Hugh Dickins
@ 2013-07-31  9:28   ` Azat Khuzhin
  0 siblings, 0 replies; 3+ messages in thread
From: Azat Khuzhin @ 2013-07-31  9:28 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: open list, linux-mm

On Wed, Jul 31, 2013 at 10:32 AM, Hugh Dickins <hughd@google.com> wrote:
> On Tue, 30 Jul 2013, Azat Khuzhin wrote:
>
>> Otherwise, if there is no space left on the shmem device, the application
>> gets a "Bus error" when it tries to write to the address space that was
>> returned by mmap(2).
>>
>> This patch also preserves the old behaviour if MAP_NORESERVE/VM_NORESERVE
>> is set.
>>
>> So, with this patch, you will get the following:
>>
>> a)
>> $ echo 2 >| /proc/sys/vm/overcommit_memory
>>   ....
>>   mmap() = MAP_FAILED;
>>   ....
>>
>> b)
>>   ....
>>   mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE) = !MAP_FAILED;
>>   write()
>>   killed by SIGBUS
>>   ....
>>
>> c)
>> $ echo 0 >| /proc/sys/vm/overcommit_memory
>>   ....
>>   mmap() = !MAP_FAILED;
>>   write()
>>   killed by SIGBUS
>>   ....
>>
>> Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
>
> Thanks for making the patch, but I'm afraid there are a number of
> things wrong with it; and even if it were perfect, I would still be
> reluctant to change the semantics of shmem_mmap() after all this time.

I also thought about this, but since it only changes behavior with
OVERCOMMIT_NEVER, I posted this patch.

>
> Some comments on your implementation below; but if getting SIGBUS from
> a write to an mmapping, once the underlying filesystem (shmem/tmpfs or
> any other) fills up, is troublesome for you, then please try using
> fallocate() to allocate the space before accessing the mmapping.

Oh, I forgot about fallocate().
Thanks for your comments, I will keep them in mind!

>
>> ---
>>  mm/shmem.c |   16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index a87990c..965f4ba 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -32,6 +32,8 @@
>>  #include <linux/export.h>
>>  #include <linux/swap.h>
>>  #include <linux/aio.h>
>> +#include <linux/statfs.h>
>> +#include <linux/path.h>
>
> I'm surprised you need either of those: vfs.h should have already
> included statfs.h, and I don't see what path.h would be for.
>
>>
>>  static struct vfsmount *shm_mnt;
>>
>> @@ -1356,6 +1358,20 @@ out_nomem:
>>
>>  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
>>  {
>> +     if (!(vma->vm_flags & VM_NORESERVE) &&
>> +         sysctl_overcommit_memory == OVERCOMMIT_NEVER) {
>
> So, this would be a new and different usage of sysctl_overcommit_memory:
> usually it applies to vm_committed_as accounting, but you're extending
> it to affect tmpfs filesystem size accounting.  Hmm.
>
>> +             struct inode *inode = file_inode(file);
>> +             struct kstatfs sbuf;
>> +             u64 size;
>> +
>> +             inode->i_sb->s_op->statfs(file->f_dentry, &sbuf);
>
> You don't really need to go through ->statfs(), since that will arrive
> at shmem_statfs().  Where you can see there will be a problem in the
> case of an unlimited (max_blocks=0) mount - you will fail mmap() of
> every file of non-0 size - and mmaps of 0-size files aren't much use!
> But moving on from that case...

Nice catch, thanks!

>
>> +             size = sbuf.f_bfree * sbuf.f_bsize;
>> +
>> +             if (size < inode->i_size) {
>> +                     return -ENOMEM;
>
> So, if your filesystem is full, mmap() of any (i_size>0) file in it will
> fail?  I don't think that's what you want at all.  You seem to be assuming
> that no pages of the file you're mmap()ing have been allocated yet: that
> may be the case, but it's very often not so.
>
>> +             }
>
> And if we pass that test, there's still no assurance that you won't get
> SIGBUS from accessing the mmapping: nothing has actually been reserved
> here, and other activity on the system can gobble up all the remaining
> space in the filesystem, or take vm_committed_as to its maximum.

Completely slipped my mind.

>
>> +     }
>> +
>>       file_accessed(file);
>>       vma->vm_ops = &shmem_vm_ops;
>>       return 0;
>> --
>> 1.7.10.4
>
> Please "man 2 fallocate" and use that instead.
>
> Hugh



-- 
Respectfully
Azat Khuzhin

