* Re: [PATCH] [14/18] BKL-removal: Add unlocked_fasync [not found] ` <9Q5rE-3ZD-17@gated-at.bofh.it> @ 2008-01-27 11:14 ` Bodo Eggert 2008-01-27 16:31 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync v2 Andi Kleen 0 siblings, 1 reply; 5+ messages in thread From: Bodo Eggert @ 2008-01-27 11:14 UTC (permalink / raw) To: Andi Kleen, linux-kernel, linux-fsdevel, akpm > +++ linux/fs/fcntl.c > @@ -240,11 +240,15 @@ static int setfl(int fd, struct file * f > > lock_kernel(); > if ((arg ^ filp->f_flags) & FASYNC) { > - if (filp->f_op && filp->f_op->fasync) { > + if (filp->f_op && filp->f_op->unlocked_fasync) > + error = filp->f_op->unlocked_fasync(fd, filp, > + !!(arg & FASYNC)); > + else if (filp->f_op && filp->f_op->fasync) { > error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0); > if (error < 0) > goto out; No goto if you use unlocked_fasync? > } > + /* AK: no else error = -EINVAL here? */ > } > > filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK); > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] [14/18] BKL-removal: Add unlocked_fasync v2 2008-01-27 11:14 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync Bodo Eggert @ 2008-01-27 16:31 ` Andi Kleen 0 siblings, 0 replies; 5+ messages in thread From: Andi Kleen @ 2008-01-27 16:31 UTC (permalink / raw) To: Bodo Eggert; +Cc: Andi Kleen, linux-kernel, linux-fsdevel, akpm > No goto if you use unlocked_fasync? Indeed. There was another problem in the patch too. Here's an updated patch that also fixes another latent bug. The whole f_flags still seems to be somewhat fragile because the checks tend to happen without any lock, but that has not changed to the previous state. --- Add unlocked_fasync v2 Add a new fops entry point to allow fasync without BKL. While it's arguably unclear this entry point is called often enough for it really matters it was still relatively easy to do. And there are far less async users in the tree than ioctls so it's likely they can be all converted eventually and then the non unlocked async entry point could be dropped. There was still the problem of the actual flags change being protected against other setters of flags. Instead of using BKL for this use the i_mutex now. I also added a mutex_lock against one other flags change that was lockless and could potentially lose updates. Signed-off-by: Andi Kleen <ak@suse.de> --- Documentation/filesystems/vfs.txt | 5 ++++- fs/fcntl.c | 6 +++++- fs/ioctl.c | 5 ++++- include/linux/fs.h | 1 + 4 files changed, 14 insertions(+), 3 deletions(-) Index: linux/fs/fcntl.c =================================================================== --- linux.orig/fs/fcntl.c +++ linux/fs/fcntl.c @@ -238,18 +238,26 @@ static int setfl(int fd, struct file * f if (error) return error; - lock_kernel(); + /* AK: potential race here. Since test is unlocked fasync could + be called several times in a row with on. We hope the drivers + deal with it. */ if ((arg ^ filp->f_flags) & FASYNC) { - if (filp->f_op && filp->f_op->fasync) { - error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0); - if (error < 0) - goto out; + int on = !!(arg & FASYNC); + if (filp->f_op && filp->f_op->unlocked_fasync) + error = filp->f_op->unlocked_fasync(fd, filp, on); + else if (filp->f_op && filp->f_op->fasync) { + lock_kernel(); + error = filp->f_op->fasync(fd, filp, on); + unlock_kernel(); } + /* AK: no else error = -EINVAL here? */ + if (error < 0) + return error; } + mutex_lock(&filp->f_dentry->d_inode->i_mutex); filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK); - out: - unlock_kernel(); + mutex_unlock(&filp->f_dentry->d_inode->i_mutex); return error; } Index: linux/fs/ioctl.c =================================================================== --- linux.orig/fs/ioctl.c +++ linux/fs/ioctl.c @@ -105,10 +105,14 @@ int vfs_ioctl(struct file *filp, unsigne if(O_NONBLOCK != O_NDELAY) flag |= O_NDELAY; #endif + + mutex_lock(&filp->f_dentry->d_inode->i_mutex); if (on) filp->f_flags |= flag; else filp->f_flags &= ~flag; + + mutex_unlock(&filp->f_dentry->d_inode->i_mutex); break; case FIOASYNC: @@ -117,8 +121,13 @@ int vfs_ioctl(struct file *filp, unsigne flag = on ? FASYNC : 0; /* Did FASYNC state change ? */ + /* AK: potential race here: ->fasync could be called with on two times + in a row. We hope the drivers deal with it. */ if ((flag ^ filp->f_flags) & FASYNC) { - if (filp->f_op && filp->f_op->fasync) { + if (filp->f_op && filp->f_op->unlocked_fasync) + error = filp->f_op->unlocked_fasync(fd, + filp, on); + else if (filp->f_op && filp->f_op->fasync) { lock_kernel(); error = filp->f_op->fasync(fd, filp, on); unlock_kernel(); @@ -128,10 +137,12 @@ int vfs_ioctl(struct file *filp, unsigne if (error != 0) break; + mutex_lock(&filp->f_dentry->d_inode->i_mutex); if (on) filp->f_flags |= FASYNC; else filp->f_flags &= ~FASYNC; + mutex_unlock(&filp->f_dentry->d_inode->i_mutex); break; case FIOQSIZE: Index: linux/include/linux/fs.h =================================================================== --- linux.orig/include/linux/fs.h +++ linux/include/linux/fs.h @@ -1179,6 +1179,7 @@ struct file_operations { int (*fsync) (struct file *, struct dentry *, int datasync); int (*aio_fsync) (struct kiocb *, int datasync); int (*fasync) (int, struct file *, int); + int (*unlocked_fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int); unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long); Index: linux/Documentation/filesystems/vfs.txt =================================================================== --- linux.orig/Documentation/filesystems/vfs.txt +++ linux/Documentation/filesystems/vfs.txt @@ -769,6 +769,7 @@ struct file_operations { int (*fsync) (struct file *, struct dentry *, int datasync); int (*aio_fsync) (struct kiocb *, int datasync); int (*fasync) (int, struct file *, int); + int (*unlocked_fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *); ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *); @@ -828,7 +829,9 @@ otherwise noted. fsync: called by the fsync(2) system call fasync: called by the fcntl(2) system call when asynchronous - (non-blocking) mode is enabled for a file + (non-blocking) mode is enabled for a file. BKL hold + + unlocked_fasync: like fasync, but without BKL lock: called by the fcntl(2) system call for F_GETLK, F_SETLK, and F_SETLKW commands ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <9Qsob-7is-1@gated-at.bofh.it>]
[parent not found: <9QtDz-121-11@gated-at.bofh.it>]
[parent not found: <9QtWX-1qg-15@gated-at.bofh.it>]
[parent not found: <9Qugj-1PK-1@gated-at.bofh.it>]
* Re: [PATCH] [8/18] BKL-removal: Remove BKL from remote_llseek [not found] ` <9Qugj-1PK-1@gated-at.bofh.it> @ 2008-01-28 9:44 ` Bodo Eggert 0 siblings, 0 replies; 5+ messages in thread From: Bodo Eggert @ 2008-01-28 9:44 UTC (permalink / raw) To: Trond Myklebust, Andi Kleen, Steve French, swhiteho, sfrench, vandrove, linux-kernel Trond Myklebust <Trond.Myklebust@netapp.com> wrote: > On Mon, 2008-01-28 at 05:38 +0100, Andi Kleen wrote: >> On Monday 28 January 2008 05:13:09 Trond Myklebust wrote: >> > On Mon, 2008-01-28 at 03:58 +0100, Andi Kleen wrote: >> > > The problem is that it's not a race in who gets to do its thing first, >> > > but a parallel reader can actually see a corrupted value from the two >> > > independent words on 32bit (e.g. during a 4GB). And this could actually >> > > completely corrupt f_pos when it happens with two racing relative seeks >> > > or read/write()s >> > > >> > > I would consider that a bug. >> > >> > I disagree. The corruption occurs because this isn't a situation that is >> > allowed by either POSIX or SUSv2/v3. Exactly what spec are you referring >> > to here? >> >> No specific spec, just general quality of implementation. We normally don't >> have non thread safe system calls even if it was in theory allowed by some >> specification. > > We've had the existing implementation for quite some time. The arguments > against changing it have been the same all along: if your application > wants to share files between threads, the portability argument implies > that you should either use pread/pwrite or use a mutex or some other > form of synchronisation primitive in order to ensure that > lseek()/read()/write() do not overlap. Does anything in the kernel depend on f_pos being valid? E.g. is it possible to read beyond the EOF using this race, or to have files larger than the ulimit? If not, update the manpage and be done. ¢¢ - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] [0/18] Implement some low hanging BKL removal fruit in fs/* @ 2008-01-27 2:17 Andi Kleen 2008-01-27 2:17 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync Andi Kleen 0 siblings, 1 reply; 5+ messages in thread From: Andi Kleen @ 2008-01-27 2:17 UTC (permalink / raw) To: linux-kernel, linux-fsdevel, akpm [Andrew: I believe this is -mm material for .25] - Convert some more file systems (generally those who don't use the BKL for anything except mount) to use unlocked_bkl. - Implement BKL less fasync (see patch for the rationale) This is currently a separate entry point, but since the number of fasync users in the tree is relatively small I hope the older entry point can be removed then in the not too far future [help from other people converting more fasync users to unlocked_fasync would be appreciated] - Implement BKL less remote_llseek - While I was at it I also added a few missing compat ioctl handlers - Fix a few comments This fixes a lot of relatively trivial BKL users in fs/*. The main remaining non legacy offenders are now locks.c, nfs/nfsd and reiserfs. I believe BKL removal for all of those is being worked on by other people. Also a lot of "legacy" file systems still use it, but converting those does not seem to be very pressing. -Andi ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] [14/18] BKL-removal: Add unlocked_fasync 2008-01-27 2:17 [PATCH] [0/18] Implement some low hanging BKL removal fruit in fs/* Andi Kleen @ 2008-01-27 2:17 ` Andi Kleen 2008-01-27 7:05 ` KOSAKI Motohiro 0 siblings, 1 reply; 5+ messages in thread From: Andi Kleen @ 2008-01-27 2:17 UTC (permalink / raw) To: linux-kernel, linux-fsdevel, akpm Add a new fops entry point to allow fasync without BKL. While it's arguably unclear this entry point is called often enough for it really matters it was still relatively easy to do. And there are far less async users in the tree than ioctls so it's likely they can be all converted eventually and then the non unlocked async entry point could be dropped. Signed-off-by: Andi Kleen <ak@suse.de> --- Documentation/filesystems/vfs.txt | 5 ++++- fs/fcntl.c | 6 +++++- fs/ioctl.c | 5 ++++- include/linux/fs.h | 1 + 4 files changed, 14 insertions(+), 3 deletions(-) Index: linux/fs/fcntl.c =================================================================== --- linux.orig/fs/fcntl.c +++ linux/fs/fcntl.c @@ -240,11 +240,15 @@ static int setfl(int fd, struct file * f lock_kernel(); if ((arg ^ filp->f_flags) & FASYNC) { - if (filp->f_op && filp->f_op->fasync) { + if (filp->f_op && filp->f_op->unlocked_fasync) + error = filp->f_op->unlocked_fasync(fd, filp, + !!(arg & FASYNC)); + else if (filp->f_op && filp->f_op->fasync) { error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0); if (error < 0) goto out; } + /* AK: no else error = -EINVAL here? */ } filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK); Index: linux/fs/ioctl.c =================================================================== --- linux.orig/fs/ioctl.c +++ linux/fs/ioctl.c @@ -118,7 +118,10 @@ int vfs_ioctl(struct file *filp, unsigne /* Did FASYNC state change ? */ if ((flag ^ filp->f_flags) & FASYNC) { - if (filp->f_op && filp->f_op->fasync) { + if (filp->f_op && filp->f_op->unlocked_fasync) + error = filp->f_op->unlocked_fasync(fd, + filp, on); + else if (filp->f_op && filp->f_op->fasync) { lock_kernel(); error = filp->f_op->fasync(fd, filp, on); unlock_kernel(); Index: linux/include/linux/fs.h =================================================================== --- linux.orig/include/linux/fs.h +++ linux/include/linux/fs.h @@ -1179,6 +1179,7 @@ struct file_operations { int (*fsync) (struct file *, struct dentry *, int datasync); int (*aio_fsync) (struct kiocb *, int datasync); int (*fasync) (int, struct file *, int); + int (*unlocked_fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int); unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long); Index: linux/Documentation/filesystems/vfs.txt =================================================================== --- linux.orig/Documentation/filesystems/vfs.txt +++ linux/Documentation/filesystems/vfs.txt @@ -769,6 +769,7 @@ struct file_operations { int (*fsync) (struct file *, struct dentry *, int datasync); int (*aio_fsync) (struct kiocb *, int datasync); int (*fasync) (int, struct file *, int); + int (*unlocked_fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *); ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *); @@ -828,7 +829,9 @@ otherwise noted. fsync: called by the fsync(2) system call fasync: called by the fcntl(2) system call when asynchronous - (non-blocking) mode is enabled for a file + (non-blocking) mode is enabled for a file. BKL hold + + unlocked_fasync: like fasync, but without BKL lock: called by the fcntl(2) system call for F_GETLK, F_SETLK, and F_SETLKW commands ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] [14/18] BKL-removal: Add unlocked_fasync 2008-01-27 2:17 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync Andi Kleen @ 2008-01-27 7:05 ` KOSAKI Motohiro 0 siblings, 0 replies; 5+ messages in thread From: KOSAKI Motohiro @ 2008-01-27 7:05 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel, linux-fsdevel, akpm, kosaki.motohiro > Add a new fops entry point to allow fasync without BKL. While it's arguably > unclear this entry point is called often enough for it really matters > it was still relatively easy to do. And there are far less async users > in the tree than ioctls so it's likely they can be all converted > eventually and then the non unlocked async entry point could be dropped. Excellent! I like this patch :) ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2008-01-28 9:46 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <9Q5hR-3MI-9@gated-at.bofh.it>
[not found] ` <9Q5rE-3ZD-17@gated-at.bofh.it>
2008-01-27 11:14 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync Bodo Eggert
2008-01-27 16:31 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync v2 Andi Kleen
[not found] ` <9Qsob-7is-1@gated-at.bofh.it>
[not found] ` <9QtDz-121-11@gated-at.bofh.it>
[not found] ` <9QtWX-1qg-15@gated-at.bofh.it>
[not found] ` <9Qugj-1PK-1@gated-at.bofh.it>
2008-01-28 9:44 ` [PATCH] [8/18] BKL-removal: Remove BKL from remote_llseek Bodo Eggert
2008-01-27 2:17 [PATCH] [0/18] Implement some low hanging BKL removal fruit in fs/* Andi Kleen
2008-01-27 2:17 ` [PATCH] [14/18] BKL-removal: Add unlocked_fasync Andi Kleen
2008-01-27 7:05 ` KOSAKI Motohiro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).