* Robust shared memory for unrelated processes
@ 2008-11-23 22:42 Chris Smowton
  2008-11-23 23:43 ` Serge E. Hallyn
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Smowton @ 2008-11-23 22:42 UTC
  To: linux-kernel

Hello all,

First of all, my apologies if this is the wrong list for this 
suggestion; I haven't posted here before so I might accidentally break 
some local conventions :)

With that said, my question/suggestion relates to sharing memory between 
processes which are *not* in a parent-child relationship.

Suppose for simplicity's sake that one wishes to share a sizable piece 
of memory with a single other process with which one is already in 
contact (say, by a Unix domain socket).

It seems to me that one cannot do this without introducing the risk of 
the shared section not being deallocated until the system next boots, 
for the following reasons:

1. Suppose I use SysV style SHM. Then I must find a free key, create a 
section with that key, and communicate that key to my partner process so 
that he can also open the section. I cannot issue an IPC_RMID during 
this time, as that will render the key immediately unavailable. If I am 
SIGKILL'd at any time between creating the section and receiving 
confirmation that my partner has opened it, the section will persist 
until reboot. This is a large window of opportunity and a very bad thing.

2. Suppose I use POSIX shared memory (i.e. shm_open and its brethren). 
Then the same problem exists, only keys are replaced by friendlier 
names. The situation is as bad as with SysV SHM.

3. Suppose now I get a bit cleverer; I use POSIX SHM, but I create and 
then immediately unlink my section, before sending the file descriptor 
over a Unix domain socket to my partner (using the ancillary control 
channel). This works, and does mean that I am able to create a shared 
section then immediately unlink it, whilst retaining the ability to 
allow processes to open the effectively anonymous shared section by 
sending them its file descriptor. This nearly accomplishes my goal of 
ensuring the shared section does get tidied up if its users are all 
SIGKILL'd; however, the section's creator still has to issue two 
calls: shm_open("/mysection", ...) followed by shm_unlink("/mysection"). 
The pair is not atomic, so a window still exists in which the section 
leaks if I am killed between the two calls.

This option would also work with a regular file residing in a tmpfs, 
since this is all Linux's implementation of shm_open does.

4. Alright, so what if I get still a little cleverer? I will try to use 
BSD-style shared memory, as those sections are anonymous and certainly 
cleaned up when the referring processes die. I open /dev/zero and mmap 
it appropriately, before sending its associated FD to my partner. 
Unfortunately this fails; my partner ends up with a private, zeroed 
block of memory and nothing is shared. Curiously, I can dup() the 
dev-zero file descriptor and share memory with my child processes, and 
sendmsg's documentation declares that it will effectively dup() a file 
descriptor which is passed across a unix domain socket, but this does 
not seem to hold for /dev/zero in particular.
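
One possible explanation for the /dev/zero behaviour in point 4, offered as a sketch rather than a definitive reading of the kernel: on Linux each MAP_SHARED mapping of /dev/zero is backed by its own fresh anonymous object, so the descriptor itself carries no shared state. Under that assumption even two mappings made from the same descriptor in one process do not alias, and children see the same memory only because fork() inherits the existing mapping. The helper name below is invented for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if two MAP_SHARED mappings of the same
 * /dev/zero descriptor are independent, 0 if they alias, -1 on error. */
int dev_zero_mappings_are_independent(void)
{
    const size_t len = 4096;
    int result = -1;
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return -1;

    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a != MAP_FAILED && b != MAP_FAILED) {
        a[0] = 'x';                     /* write through the first mapping */
        result = (b[0] == 0) ? 1 : 0;   /* does the second still read zero? */
    }
    if (a != MAP_FAILED)
        munmap(a, len);
    if (b != MAP_FAILED)
        munmap(b, len);
    close(fd);
    return result;
}
```

If the helper returns 1, a partner that receives the descriptor and maps it gets yet another independent region, which matches the "private, zeroed block" described above.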

Therefore, it seems that in order to permit sharing of memory with a 
process with which I do not have a parent-child relationship, one of the 
following needs to be the case:

1. It needs to be possible to atomically shm_open and shm_unlink, or
2. It needs to be possible to pass handles to /dev/zero over sockets 
like one can regular files and POSIX section handles (which are just 
files in a tmpfs), or
3. It needs to be possible for a general file to be atomically created 
and registered for deletion on closure of its last handle.
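
For reference, the create-then-unlink sequence and the descriptor passing that these wishes build on might look roughly like the sketch below. It is a minimal illustration, not a robust implementation: the section name, the function names, and the single-process socketpair() demonstration are all assumptions made for the example, and the gap between shm_open() and shm_unlink() is exactly the window that wish 1 asks to close.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <unistd.h>

/* Create a POSIX section and unlink its name at once. The two calls are
 * still separate, so a SIGKILL between them leaks the section. */
int create_unlinked_section(size_t size)
{
    char name[64];   /* hypothetical name, made unique per process */
    snprintf(name, sizeof name, "/robust-shm-demo-%ld", (long)getpid());
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    shm_unlink(name);                    /* the non-atomic second step */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Pass one descriptor over a Unix domain socket as SCM_RIGHTS data. */
int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctl.buf,
                          .msg_controllen = sizeof ctl.buf };
    memset(&ctl, 0, sizeof ctl);
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor sent by send_fd(); returns -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    int fd = -1;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctl.buf,
                          .msg_controllen = sizeof ctl.buf };
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}

/* Single-process demonstration: both socket ends live in this process. */
int demo_shared_section(void)
{
    const size_t len = 4096;
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    int fd = create_unlinked_section(len);
    if (fd < 0 || send_fd(sv[0], fd) < 0)
        return -1;
    int fd2 = recv_fd(sv[1]);
    if (fd2 < 0)
        return -1;
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);
    if (a == MAP_FAILED || b == MAP_FAILED)
        return -1;
    strcpy(a, "hello");                  /* write via the creator's mapping */
    int ok = strcmp(b, "hello") == 0;    /* read via the received descriptor */
    munmap(a, len); munmap(b, len);
    close(fd); close(fd2); close(sv[0]); close(sv[1]);
    return ok ? 0 : -1;
}
```

In a real deployment the two socket ends would belong to the two unrelated processes. Older C libraries may need -lrt at link time for shm_open().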

Does this seem valid? Or is there a means to achieve SHM between 
unrelated processes without the risk of leaking the memory?
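
For what it is worth, kernels released well after this thread (Linux 3.17 and later) provide memfd_create(2), which returns a descriptor for a tmpfs-backed file that never had a name, so there is no create-then-unlink window to be killed in. The following is a minimal sketch assuming such a kernel. The helper names and the "robust-section" label are invented, and the raw syscall is used only so the sketch also builds against C libraries that lack the wrapper.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U   /* value from the kernel's memfd header */
#endif

/* Create a nameless shared section in one step (assumes Linux >= 3.17). */
int create_anonymous_section(size_t size)
{
    int fd = (int)syscall(SYS_memfd_create, "robust-section", MFD_CLOEXEC);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the section through two descriptors and check that they alias.
 * dup() stands in here for passing the descriptor to a partner process. */
int anonymous_section_roundtrip(void)
{
    const size_t len = 4096;
    int fd = create_anonymous_section(len);
    if (fd < 0)
        return -1;
    int fd2 = dup(fd);
    if (fd2 < 0) {
        close(fd);
        return -1;
    }
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);
    int ok = 0;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        strcpy(a, "shared");
        ok = strcmp(b, "shared") == 0;
    }
    if (a != MAP_FAILED)
        munmap(a, len);
    if (b != MAP_FAILED)
        munmap(b, len);
    close(fd);
    close(fd2);
    return ok ? 0 : -1;
}
```

The memory is released once the last descriptor and the last mapping are gone, whichever way the holders exit.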

I'm reading the mailing list online rather than getting it delivered at 
the moment, so I'd appreciate any comments CC'd to cs448@cam.ac.uk :)

Thanks in advance to anyone willing to advise!

Chris


* Re: Robust shared memory for unrelated processes
  2008-11-23 22:42 Robust shared memory for unrelated processes Chris Smowton
@ 2008-11-23 23:43 ` Serge E. Hallyn
  0 siblings, 0 replies; 3+ messages in thread
From: Serge E. Hallyn @ 2008-11-23 23:43 UTC
  To: Chris Smowton; +Cc: linux-kernel

Quoting Chris Smowton (chris.smowton@cl.cam.ac.uk):
> Hello all,
>
> First of all, my apologies if this is the wrong list for this suggestion; I 
> haven't posted here before so I might accidentally break some local 
> conventions :)
>
> With that said, my question/suggestion relates to sharing memory between 
> processes which are *not* in a parent-child relationship.
>
> Suppose for simplicity's sake that one wishes to share a sizable piece of 
> memory with a single other process with which one is already in contact 
> (say, by a Unix domain socket).
>
> It seems to me that one cannot do this without introducing the risk of the 
> shared section not being deallocated until the system next boots, for the 
> following reasons:
>
> 1. Suppose I use SysV style SHM. Then I must find a free key, create a 
> section with that key, and communicate that key to my partner process so 
> that he can also open the section. I cannot issue an IPC_RMID during this 
> time, as that will render the key immediately unavailable. If I am 
> SIGKILL'd at any time between creating the section and receiving 
> confirmation that my partner has opened it, the section will persist until 
> reboot. This is a large window of opportunity and a very bad thing.

Ah, but if your app is started in a new IPC namespace, then when the
app dies, the namespace will be released and the section will be freed.

Now both of the processes will need to be started in the same child
ipc namespace, as you currently can't enter an existing ipcns.  If
that is a problem, I'm sure it can be addressed somehow.
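
A rough sketch of the namespace idea, under several assumptions: the process may call unshare(CLONE_NEWIPC) (this normally needs CAP_SYS_ADMIN), the key 0x5e12a9 is arbitrary and unused, and the function name is invented. The child creates a SysV section inside a fresh IPC namespace and exits without IPC_RMID. The parent then checks that the key is not visible in its own namespace. The raw syscall and the locally defined flag are there only to keep the example independent of feature-test macros.

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWIPC
#define CLONE_NEWIPC 0x08000000   /* value from the kernel's sched header */
#endif

#define DEMO_KEY ((key_t)0x5e12a9)   /* arbitrary key, assumed unused */

/* Returns 0 if the section stayed confined to the child's namespace,
 * 1 if the namespace could not be set up (for example, no privilege),
 * -1 on error or if the key is visible from the caller's namespace. */
int ipcns_section_is_private(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: enter a new IPC namespace, then create the section. */
        if (syscall(SYS_unshare, CLONE_NEWIPC) != 0)
            _exit(1);
        if (shmget(DEMO_KEY, 4096, IPC_CREAT | IPC_EXCL | 0600) < 0)
            _exit(1);
        _exit(0);   /* deliberately no IPC_RMID, as if killed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return 1;

    /* Parent: the key must not exist in the original namespace. */
    return shmget(DEMO_KEY, 0, 0) < 0 ? 0 : -1;
}
```

As noted above, both cooperating processes would have to be started inside the same child namespace for this to be useful.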

-serge


* Re: Robust shared memory for unrelated processes
       [not found] <bBwfI-3N6-13@gated-at.bofh.it>
@ 2008-11-24  1:19 ` Bodo Eggert
  0 siblings, 0 replies; 3+ messages in thread
From: Bodo Eggert @ 2008-11-24  1:19 UTC
  To: Chris Smowton, linux-kernel

Chris Smowton <chris.smowton@cl.cam.ac.uk> wrote:

> Suppose for simplicity's sake that one wishes to share a sizable piece
> of memory with a single other process with which one is already in
> contact (say, by a Unix domain socket).
> 
> It seems to me that one cannot do this without introducing the risk of
> the shared section not being deallocated until the system next boots,
[...]

> 3. It needs to be possible for a general file to be atomically created
> and registered for deletion on closure of its last handle.

The patch below gives you an autounlink mount option for tmpfs, which gets
rid of your shared file as soon as the last process closes it. Unfortunately
I don't remember whether link()ing the file will make it stay.

diff -X dontdiff -dpruN linux-2.6.24.pure/include/linux/shmem_fs.h linux-2.6.24.autounlink/include/linux/shmem_fs.h
--- linux-2.6.24.pure/include/linux/shmem_fs.h  2006-11-29 22:57:37.000000000 +0100
+++ linux-2.6.24.autounlink/include/linux/shmem_fs.h    2008-02-14 15:35:01.000000000 +0100
@@ -30,11 +30,14 @@ struct shmem_sb_info {
        unsigned long free_blocks;  /* How many are left for allocation */
        unsigned long max_inodes;   /* How many inodes are allowed */
        unsigned long free_inodes;  /* How many are left for allocation */
-       int policy;                 /* Default NUMA memory alloc policy */
-       nodemask_t policy_nodes;    /* nodemask for preferred and bind */
+       unsigned int  flags;
+       int           policy;       /* Default NUMA memory alloc policy */
+       nodemask_t    policy_nodes; /* nodemask for preferred and bind */
        spinlock_t    stat_lock;
 };
 
+#define TMPFS_FL_AUTOREMOVE 1
+
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 {
        return container_of(inode, struct shmem_inode_info, vfs_inode);
diff -X dontdiff -dpruN linux-2.6.24.pure/mm/shmem.c linux-2.6.24.autounlink/mm/shmem.c
--- linux-2.6.24.pure/mm/shmem.c        2008-01-25 15:09:39.000000000 +0100
+++ linux-2.6.24.autounlink/mm/shmem.c  2008-02-14 18:00:54.000000000 +0100
@@ -1747,31 +1747,41 @@ static int
 shmem_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
 {
        struct inode *inode = shmem_get_inode(dir->i_sb, mode, dev);
+       struct shmem_sb_info *sbinfo = SHMEM_SB(dir->i_sb);
        int error = -ENOSPC;
 
-       if (inode) {
-               error = security_inode_init_security(inode, dir, NULL, NULL,
-                                                    NULL);
-               if (error) {
-                       if (error != -EOPNOTSUPP) {
-                               iput(inode);
-                               return error;
-                       }
-               }
-               error = shmem_acl_init(inode, dir);
-               if (error) {
+       if (!inode)
+               return error;
+
+       error = security_inode_init_security(inode, dir, NULL, NULL,
+                                            NULL);
+       if (error) {
+               if (error != -EOPNOTSUPP) {
                        iput(inode);
                        return error;
                }
-               if (dir->i_mode & S_ISGID) {
-                       inode->i_gid = dir->i_gid;
-                       if (S_ISDIR(mode))
-                               inode->i_mode |= S_ISGID;
-               }
-               dir->i_size += BOGO_DIRENT_SIZE;
-               dir->i_ctime = dir->i_mtime = CURRENT_TIME;
-               d_instantiate(dentry, inode);
+       }
+       error = shmem_acl_init(inode, dir);
+       if (error) {
+               iput(inode);
+               return error;
+       }
+       if (dir->i_mode & S_ISGID) {
+               inode->i_gid = dir->i_gid;
+               if (S_ISDIR(mode))
+                       inode->i_mode |= S_ISGID;
+       }
+
+       dir->i_size += BOGO_DIRENT_SIZE;
+       dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+       d_instantiate(dentry, inode);
+       if ( S_ISDIR(mode)
+         || !(sbinfo->flags & TMPFS_FL_AUTOREMOVE))
+       {
                dget(dentry); /* Extra count - pin the dentry in core */
+       } else {
+               dir->i_size -= BOGO_DIRENT_SIZE;
+               drop_nlink(inode);
        }
        return error;
 }
@@ -1800,6 +1810,11 @@ static int shmem_link(struct dentry *old
        struct inode *inode = old_dentry->d_inode;
        struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 
+       /* In auto-unlink mode, the newly created link would be unlinked
+          immediately. We don't need to do anything here. */
+       if (sbinfo->flags & TMPFS_FL_AUTOREMOVE)
+               return 0;
+
        /*
         * No ordinary (disk based) filesystem counts links as inodes;
         * but each new link needs a new dentry, pinning lowmem, and
@@ -2095,6 +2110,7 @@ static const struct export_operations sh
 
 static int shmem_parse_options(char *options, int *mode, uid_t *uid,
        gid_t *gid, unsigned long *blocks, unsigned long *inodes,
+       unsigned int * flags,
        int *policy, nodemask_t *policy_nodes)
 {
        char *this_char, *value, *rest;
@@ -2120,8 +2136,18 @@ static int shmem_parse_options(char *opt
                        continue;
                if ((value = strchr(this_char,'=')) != NULL) {
                        *value++ = 0;
+
+               /* These options don't take arguments: */
+               } else if (!strcmp(this_char,"autounlink")) {
+                       *flags |= TMPFS_FL_AUTOREMOVE;
+                       continue;
+               } else if (!strcmp(this_char,"noautounlink")) {
+                       *flags &= ~TMPFS_FL_AUTOREMOVE;
+                       continue;
+
+               /* All other options need an argument */
                } else {
-                       printk(KERN_ERR
+                       printk(KERN_ERR 
                            "tmpfs: No value for mount option '%s'\n",
                            this_char);
                        return 1;
@@ -2192,10 +2218,12 @@ static int shmem_remount_fs(struct super
        nodemask_t policy_nodes = sbinfo->policy_nodes;
        unsigned long blocks;
        unsigned long inodes;
+       unsigned int sbflags;
        int error = -EINVAL;
 
+       sbflags = sbinfo->flags;
        if (shmem_parse_options(data, NULL, NULL, NULL, &max_blocks,
-                               &max_inodes, &policy, &policy_nodes))
+                               &max_inodes, &sbflags, &policy, &policy_nodes))
                return error;
 
        spin_lock(&sbinfo->stat_lock);
@@ -2221,6 +2249,7 @@ static int shmem_remount_fs(struct super
        sbinfo->free_blocks = max_blocks - blocks;
        sbinfo->max_inodes  = max_inodes;
        sbinfo->free_inodes = max_inodes - inodes;
+       sbinfo->flags = sbflags;
        sbinfo->policy = policy;
        sbinfo->policy_nodes = policy_nodes;
 out:
@@ -2247,6 +2276,7 @@ static int shmem_fill_super(struct super
        struct shmem_sb_info *sbinfo;
        unsigned long blocks = 0;
        unsigned long inodes = 0;
+       unsigned int flags = 0;
        int policy = MPOL_DEFAULT;
        nodemask_t policy_nodes = node_states[N_HIGH_MEMORY];
 
@@ -2262,7 +2292,7 @@ static int shmem_fill_super(struct super
                if (inodes > blocks)
                        inodes = blocks;
                if (shmem_parse_options(data, &mode, &uid, &gid, &blocks,
-                                       &inodes, &policy, &policy_nodes))
+                                       &inodes, &flags, &policy, &policy_nodes))
                        return -EINVAL;
        }
        sb->s_export_op = &shmem_export_ops;
@@ -2281,6 +2311,7 @@ static int shmem_fill_super(struct super
        sbinfo->free_blocks = blocks;
        sbinfo->max_inodes = inodes;
        sbinfo->free_inodes = inodes;
+       sbinfo->flags = flags;
        sbinfo->policy = policy;
        sbinfo->policy_nodes = policy_nodes;
 

