From: Badari Pulavarty <pbadari@us.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Chris Wright <chrisw@osdl.org>, linux-mm <linux-mm@kvack.org>
Subject: [RFC][PATCH] OVERCOMMIT_ALWAYS extension
Date: Tue, 18 Oct 2005 09:05:02 -0700
Message-ID: <1129651502.23632.63.camel@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.61.0510171919150.6548@goblin.wat.veritas.com>

[-- Attachment #1: Type: text/plain, Size: 2827 bytes --]

On Mon, 2005-10-17 at 19:25 +0100, Hugh Dickins wrote:
> On Mon, 17 Oct 2005, Hugh Dickins wrote:
> > On Mon, 17 Oct 2005, Badari Pulavarty wrote:
> > > 
> > > I have been looking at possible ways to extend OVERCOMMIT_ALWAYS
> > > to avoid its abuse.
> > > 
> > > A few applications (databases) would like to overcommit
> > > memory (by creating shared memory segments larger than RAM+swap),
> > > but use only a portion of it at any given time and get rid
> > > of portions of it through madvise(DONTNEED) when needed.
> > > They want this especially to handle hotplug memory situations
> > > (where apps may not have a clear idea of how much memory the
> > > system has at the time the shared memory is created). Currently,
> > > they are using OVERCOMMIT_ALWAYS system-wide to do this, but
> > > that affects every other application on the system.
> > > 
> > > I am wondering if there is a better way to do this. A simple solution
> > > would be to add an IPC_OVERCOMMIT flag or add a CAP_SYS_ADMIN check
> > > for the overcommit. This way only the specific applications requesting
> > > this would be able to overcommit. I am worried about the overall
> > > effects it has on the system. But again, this can't be worse
> > > than system-wide OVERCOMMIT_ALWAYS, can it?
> > 
> > mmap has MAP_NORESERVE, without CAP_SYS_ADMIN or other restriction,
> > which exempts that mmap from security_vm_enough_memory checking -
> > unless current setting is OVERCOMMIT_NEVER, in which case
> > MAP_NORESERVE is ignored.
> 
> Having written that, it does seem rather odd that we have a flag
> anyone can set to evade that security_ checking.  It was okay when
> it was just vm_enough_memory, but now it's security_vm_enough_memory,
> I wonder if this is a significant oversight, and some CAP required.
> Might break things though.  CC'ed Chris.
> 
> Ah, there's a security_file_mmap earlier, which could reject the
> MAP_NORESERVE flag if it feels so inclined.  Perhaps you'll need
> to allow a similar opportunity for rejection in your approach.
> 
> Hugh
> 
> > So if you're content to move to the OVERCOMMIT_GUESS world, I
> > don't think you could be blamed for adding an IPC_NORESERVE which
> > behaves in the same way, without CAP_SYS_ADMIN restriction.
> > 
> > But if you want to move to OVERCOMMIT_NEVER, yet have a flag which
> > says overcommit now, you'll get into a tussle with NEVER-adherents.
> > 
> > Hugh
> 

Hugh,

As you suggested, here is the patch to add SHM_NORESERVE, which does the
same thing as MAP_NORESERVE. This flag is ignored for OVERCOMMIT_NEVER.
I decided to do SHM_NORESERVE instead of IPC_NORESERVE, just to limit
its scope.

BTW, there is a call to security_shm_alloc() earlier, which could
be modified to reject shmget() if it needs to.

Is this reasonable? Please review.

Thanks,
Badari



[-- Attachment #2: shm-noreserve.patch --]
[-- Type: text/x-patch, Size: 1357 bytes --]

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
--- linux-2.6.14-rc3.org/include/linux/shm.h	2005-10-18 08:44:28.000000000 -0700
+++ linux-2.6.14-rc3/include/linux/shm.h	2005-10-18 08:46:03.000000000 -0700
@@ -92,6 +92,7 @@ struct shmid_kernel /* private to the ke
 #define	SHM_DEST	01000	/* segment will be destroyed on last detach */
 #define SHM_LOCKED      02000   /* segment will not be swapped */
 #define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
+#define SHM_NORESERVE   010000  /* don't check for reservations */
 
 #ifdef CONFIG_SYSVIPC
 long do_shmat(int shmid, char __user *shmaddr, int shmflg, unsigned long *addr);
--- linux-2.6.14-rc3.org/ipc/shm.c	2005-10-17 16:57:40.000000000 -0700
+++ linux-2.6.14-rc3/ipc/shm.c	2005-10-18 08:55:50.000000000 -0700
@@ -212,8 +212,16 @@ static int newseg (key_t key, int shmflg
 		file = hugetlb_zero_setup(size);
 		shp->mlock_user = current->user;
 	} else {
+		int acctflag = VM_ACCOUNT;
+		/*
+		 * Do not skip accounting for OVERCOMMIT_NEVER, even if
+		 * it is asked for.
+		 */
+		if ((shmflg & SHM_NORESERVE) &&
+		    sysctl_overcommit_memory != OVERCOMMIT_NEVER)
+			acctflag = 0;
 		sprintf (name, "SYSV%08x", key);
-		file = shmem_file_setup(name, size, VM_ACCOUNT);
+		file = shmem_file_setup(name, size, acctflag);
 	}
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
