From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Hansen Subject: Re: [RFC][PATCH 22/27] sys_mknodat(): elevate write count for vfs_mknod/create() Date: Thu, 08 Jun 2006 08:23:22 -0700 Message-ID: <1149780202.4097.62.camel@localhost.localdomain> References: <20060608001013.0D041507@localhost.localdomain> <20060608001035.559326ED@localhost.localdomain> <20060608111652.GL11996@MAIL.13thfloor.at> Mime-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Cc: linux-fsdevel@vger.kernel.org, viro@ftp.linux.org.uk, hch@infradead.org, trond.myklebust@fys.uio.no Return-path: Received: from e2.ny.us.ibm.com ([32.97.182.142]:5844 "EHLO e2.ny.us.ibm.com") by vger.kernel.org with ESMTP id S964870AbWFHPYS (ORCPT ); Thu, 8 Jun 2006 11:24:18 -0400 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e2.ny.us.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id k58FOHkc001384 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 8 Jun 2006 11:24:17 -0400 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay04.pok.ibm.com (8.13.6/NCO/VER7.0) with ESMTP id k58FOHwZ251400 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 8 Jun 2006 11:24:17 -0400 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id k58FOGEL007139 for ; Thu, 8 Jun 2006 11:24:17 -0400 To: Herbert Poetzl In-Reply-To: <20060608111652.GL11996@MAIL.13thfloor.at> Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Thu, 2006-06-08 at 13:16 +0200, Herbert Poetzl wrote: > On Wed, Jun 07, 2006 at 05:10:35PM -0700, Dave Hansen wrote: > > This takes care of all of the direct callers of vfs_mknod(). > > Since a few of these cases also handle normal file creation > > as well, this also covers some calls to vfs_create(). > > > > Signed-off-by: Dave Hansen > > --- > > > > ipc/mqueue.c | 0 > > lxc-dave/fs/namei.c | 48 ++++++++++++++++++++++++++++---------------- > > lxc-dave/fs/nfsd/vfs.c | 4 +++ > > lxc-dave/net/unix/af_unix.c | 4 +++ > > 4 files changed, 39 insertions(+), 17 deletions(-) > > > > diff -puN fs/namei.c~elevate-writers-vfs_mknod-try2 fs/namei.c > > --- lxc/fs/namei.c~elevate-writers-vfs_mknod-try2 2006-06-07 16:53:25.000000000 -0700 > > +++ lxc-dave/fs/namei.c 2006-06-07 16:53:25.000000000 -0700 > > @@ -1852,26 +1852,40 @@ asmlinkage long sys_mknodat(int dfd, con > > > > if (!IS_POSIXACL(nd.dentry->d_inode)) > > mode &= ~current->fs->umask; > > - if (!IS_ERR(dentry)) { > > - switch (mode & S_IFMT) { > > - case 0: case S_IFREG: > > - error = vfs_create(nd.dentry->d_inode,dentry,mode,&nd); > > - break; > > - case S_IFCHR: case S_IFBLK: > > - error = vfs_mknod(nd.dentry->d_inode,dentry,mode, > > - new_decode_dev(dev)); > > - break; > > - case S_IFIFO: case S_IFSOCK: > > - error = vfs_mknod(nd.dentry->d_inode,dentry,mode,0); > > + if (IS_ERR(dentry)) > > + goto out_unlock; > > + error = mnt_want_write(nd.mnt); > > + if (error) > > + goto out_dput; > > + switch (mode & S_IFMT) { > > + case 0: case S_IFREG: > > + error = vfs_create(nd.dentry->d_inode,dentry,mode,&nd); > > + break; > > + case S_IFCHR: case S_IFBLK: > > hmm, could you outline some cases where the following > fails, but the previous (above) check succeeded? > (except for dubious locking issues) Are you asking about the vfs_create() call, or the second mnt_want_write() call? I appear to have munged two patches together here. One which elevates the count over the entire switch(), and the other which does the elevation only around the actual vfs_*() calls. It should only have the lower-level calls. > > + error = mnt_want_write(nd.mnt); > > + if (error) > > break; > > - case S_IFDIR: > > - error = -EPERM; > > + error = vfs_mknod(nd.dentry->d_inode,dentry,mode, > > + new_decode_dev(dev)); > > + mnt_drop_write(nd.mnt); > > + break; > > + case S_IFIFO: case S_IFSOCK: > > same here > > > + error = mnt_want_write(nd.mnt); > > + if (error) > > break; > > - default: > > - error = -EINVAL; > > - } > > - dput(dentry); > > + error = vfs_mknod(nd.dentry->d_inode,dentry,mode,0); > > + mnt_drop_write(nd.mnt); > > + break; > > + case S_IFDIR: > > + error = -EPERM; > > this will give -EPERM on a r/w filesystem, but > -EROFS on r/o, is that consistant with the current > behaviour? Nope. That's a bug, I think. I'll fix it up. > > + break; > > + default: > > + error = -EINVAL; > > same here for -EINVAL and -EROFS, maybe the > mnt_want_write() wants to go into vfs_mknod()? I think I tried that once, and it didn't quite work out. In this case, I think just giving vfs_mknod() the same treatment that I gave vfs_create() should do for now. -- Dave