public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3/4] Move FASYNC bit handling to f_op->fasync()
  2009-02-02 18:20 [PATCH/RFC] F_SETFL/Fasync BKL removal, now without unsightly global locks Jonathan Corbet
@ 2009-02-02 18:20 ` Jonathan Corbet
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-02 18:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro, Davide Libenzi,
	David Miller, Christoph Hellwig, linux-api, Matt Mackall,
	Alan Cox, Jonathan Corbet

Removing the BKL from FASYNC handling ran into the challenge of keeping the
setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
the underlying fasync() function.  Andi Kleen suggested moving the handling
of that bit into fasync(); this patch does exactly that.

As it happens, every fasync() implementation in the kernel with one
exception calls fasync_helper().  So, if we make fasync_helper() set the
FASYNC bit, we can avoid making any changes to the other fasync()
functions - as long as those functions, themselves, have proper locking.
Most fasync() implementations do nothing but call fasync_helper() - which
has its own lock - so they are easily verified as correct.  The BKL had
already been pushed down into the few exceptions.

The networking code has its own version of fasync_helper(), so that code
has been augmented with explicit FASYNC bit handling.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 fs/fcntl.c   |   21 ++++++++-------------
 fs/ioctl.c   |   11 +----------
 net/socket.c |    2 ++
 3 files changed, 11 insertions(+), 23 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 756119e..7b641d9 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -185,19 +185,14 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 		return error;
 
 	/*
-	 * We still need a lock here for now to keep multiple FASYNC calls
-	 * from racing with each other.
+	 * ->fasync() is responsible for setting the FASYNC bit.
 	 */
-	lock_kernel();
-	if ((arg ^ filp->f_flags) & FASYNC) {
-		if (filp->f_op && filp->f_op->fasync) {
-			error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0);
-			if (error < 0)
-				goto out;
-		}
-		change_bit(NR_FASYNC, &filp->f_flags);
+	if (((arg ^ filp->f_flags) & FASYNC) && filp->f_op &&
+			filp->f_op->fasync) {
+		error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0);
+		if (error < 0)
+			goto out;
 	}
-
 	/*
 	 * Now that we use bitops we have to tweak each bit individually.
 	 */
@@ -207,7 +202,6 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 	tweak_flags_bit(NR_O_DIRECT, arg & O_DIRECT, &filp->f_flags);
 	tweak_flags_bit(NR_O_NOATIME, arg & O_NOATIME, &filp->f_flags);
  out:
-	unlock_kernel();
 	return error;
 }
 
@@ -532,7 +526,7 @@ static DEFINE_RWLOCK(fasync_lock);
 static struct kmem_cache *fasync_cache __read_mostly;
 
 /*
- * fasync_helper() is used by some character device drivers (mainly mice)
+ * fasync_helper() is used by almost all character device drivers
  * to set up the fasync queue. It returns negative on error, 0 if it did
  * no changes and positive if it added/deleted the entry.
  */
@@ -548,6 +542,7 @@ int fasync_helper(int fd, struct file * filp, int on, struct fasync_struct **fap
 			return -ENOMEM;
 	}
 	write_lock_irq(&fasync_lock);
+	tweak_flags_bit(NR_FASYNC, on, &filp->f_flags);
 	for (fp = fapp; (fa = *fp) != NULL; fp = &fa->fa_next) {
 		if (fa->fa_file == filp) {
 			if(on) {
diff --git a/fs/ioctl.c b/fs/ioctl.c
index fc0db36..bc32cb2 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -427,17 +427,11 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
 	/* Did FASYNC state change ? */
 	if ((flag ^ filp->f_flags) & FASYNC) {
 		if (filp->f_op && filp->f_op->fasync)
+			/* fasync() adjusts filp->f_flags */
 			error = filp->f_op->fasync(fd, filp, on);
 		else
 			error = -ENOTTY;
 	}
-	if (error)
-		return error;
-
-	if (on)
-		set_bit(NR_FASYNC, &filp->f_flags);
-	else
-		clear_bit(NR_FASYNC, &filp->f_flags);
 	return error;
 }
 
@@ -505,10 +499,7 @@ int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
 		break;
 
 	case FIOASYNC:
-		/* BKL needed to avoid races tweaking f_flags */
-		lock_kernel();
 		error = ioctl_fioasync(fd, filp, argp);
-		unlock_kernel();
 		break;
 
 	case FIOQSIZE:
diff --git a/net/socket.c b/net/socket.c
index 35dd737..2b1e380 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1037,6 +1037,7 @@ static int sock_fasync(int fd, struct file *filp, int on)
 			break;
 
 	if (on) {
+		set_bit(NR_FASYNC, &filp->f_flags);
 		if (fa != NULL) {
 			write_lock_bh(&sk->sk_callback_lock);
 			fa->fa_fd = fd;
@@ -1053,6 +1054,7 @@ static int sock_fasync(int fd, struct file *filp, int on)
 		sock->fasync_list = fna;
 		write_unlock_bh(&sk->sk_callback_lock);
 	} else {
+		clear_bit(NR_FASYNC, &filp->f_flags);
 		if (fa != NULL) {
 			write_lock_bh(&sk->sk_callback_lock);
 			*prev = fa->fa_next;
-- 
1.6.1.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH/RFC] Fasync BKL removal
@ 2009-02-07 20:06 Jonathan Corbet
  2009-02-07 20:06 ` [PATCH 1/4] Rename struct file->f_ep_lock Jonathan Corbet
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-07 20:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro, Davide Libenzi,
	David Miller, Christoph Hellwig, Alan Cox, Matt Mackall

With any luck at all, this is my final attempt at removing the BKL from the
fasync() path and providing proper protection for struct file->f_flags.
This week's episode includes these patches:

1) Rename f_ep_lock to f_lock and move it out of CONFIG_EPOLL.  Epoll
   remains the only user, but the lock is now available for others.

2) Use f_lock to protect f_flags.  This provides better protection to the
   flags than we've had before and allows the removal of the BKL in places
   where its only purpose was protecting accesses to this field.

3) Move FASYNC bit handling into ->fasync(): this is how changes to that
   bit and calls to f_op->fasync() are kept atomic in the absence of the
   BKL.  Almost every FASYNC implementation uses fasync_helper(), so the
   actual bit manipulation is done there.  The one exception (for sockets)
   has been fixed up.  At this point, there is no more BKL in these paths.

4) Rationalize fasync_helper() return value handling.  Some drivers mapped
   positive return values from fasync_helper() onto zero, most others did
   not bother.  This optional cleanup patch moves that mapping into the VFS
   code, making things consistent and enabling the removal of some 50 lines
   of code.

Parts 3 and 4 are essentially the same as the last time around.

Once again, this code works for me and, if there's any regressions here,
the LTP test suite cannot find them.

Thanks to all who have looked at this interminable stream of patches...

jon



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] Rename struct file->f_ep_lock
  2009-02-07 20:06 [PATCH/RFC] Fasync BKL removal Jonathan Corbet
@ 2009-02-07 20:06 ` Jonathan Corbet
  2009-02-08  8:56   ` Christoph Hellwig
  2009-02-08 19:51   ` Davide Libenzi
  2009-02-07 20:06 ` [PATCH 2/4] Use f_lock to protect f_flags Jonathan Corbet
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-07 20:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro, Davide Libenzi,
	David Miller, Christoph Hellwig, Alan Cox, Matt Mackall,
	Jonathan Corbet

This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock.  For now,
epoll remains the only user, but a future patch will use it to protect
f_flags as well.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 fs/eventpoll.c            |   12 +++++++-----
 fs/file_table.c           |    1 +
 include/linux/eventpoll.h |    1 -
 include/linux/fs.h        |    2 +-
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index ba2f9ec..1a59e09 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -427,10 +427,10 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
 	ep_unregister_pollwait(ep, epi);
 
 	/* Remove the current item from the list of epoll hooks */
-	spin_lock(&file->f_ep_lock);
+	spin_lock(&file->f_lock);
 	if (ep_is_linked(&epi->fllink))
 		list_del_init(&epi->fllink);
-	spin_unlock(&file->f_ep_lock);
+	spin_unlock(&file->f_lock);
 
 	rb_erase(&epi->rbn, &ep->rbr);
 
@@ -549,7 +549,7 @@ void eventpoll_release_file(struct file *file)
 	struct epitem *epi;
 
 	/*
-	 * We don't want to get "file->f_ep_lock" because it is not
+	 * We don't want to get "file->f_lock" because it is not
 	 * necessary. It is not necessary because we're in the "struct file"
 	 * cleanup path, and this means that noone is using this file anymore.
 	 * So, for example, epoll_ctl() cannot hit here sicne if we reach this
@@ -558,6 +558,8 @@ void eventpoll_release_file(struct file *file)
 	 * will correctly serialize the operation. We do need to acquire
 	 * "ep->mtx" after "epmutex" because ep_remove() requires it when called
 	 * from anywhere but ep_free().
+	 *
+	 * Besides, ep_remove() acquires the lock, so we can't hold it here.
 	 */
 	mutex_lock(&epmutex);
 
@@ -800,9 +802,9 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 		goto error_unregister;
 
 	/* Add the current item to the list of active epoll hook for this file */
-	spin_lock(&tfile->f_ep_lock);
+	spin_lock(&tfile->f_lock);
 	list_add_tail(&epi->fllink, &tfile->f_ep_links);
-	spin_unlock(&tfile->f_ep_lock);
+	spin_unlock(&tfile->f_lock);
 
 	/*
 	 * Add the current item to the RB tree. All RB tree operations are
diff --git a/fs/file_table.c b/fs/file_table.c
index bbeeac6..aa1e180 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -127,6 +127,7 @@ struct file *get_empty_filp(void)
 	atomic_long_set(&f->f_count, 1);
 	rwlock_init(&f->f_owner.lock);
 	f->f_cred = get_cred(cred);
+	spin_lock_init(&f->f_lock);
 	eventpoll_init_file(f);
 	/* f->f_version: 0 */
 	return f;
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index f1e1d3c..f6856a5 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -61,7 +61,6 @@ struct file;
 static inline void eventpoll_init_file(struct file *file)
 {
 	INIT_LIST_HEAD(&file->f_ep_links);
-	spin_lock_init(&file->f_ep_lock);
 }
 
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6022f44..0d71633 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -842,6 +842,7 @@ struct file {
 #define f_dentry	f_path.dentry
 #define f_vfsmnt	f_path.mnt
 	const struct file_operations	*f_op;
+	spinlock_t		f_lock;  /* f_ep_links */
 	atomic_long_t		f_count;
 	unsigned int 		f_flags;
 	fmode_t			f_mode;
@@ -860,7 +861,6 @@ struct file {
 #ifdef CONFIG_EPOLL
 	/* Used by fs/eventpoll.c to link all the hooks to this file */
 	struct list_head	f_ep_links;
-	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
 	struct address_space	*f_mapping;
 #ifdef CONFIG_DEBUG_WRITECOUNT
-- 
1.6.1.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/4] Use f_lock to protect f_flags
  2009-02-07 20:06 [PATCH/RFC] Fasync BKL removal Jonathan Corbet
  2009-02-07 20:06 ` [PATCH 1/4] Rename struct file->f_ep_lock Jonathan Corbet
@ 2009-02-07 20:06 ` Jonathan Corbet
  2009-02-08  9:02   ` Christoph Hellwig
  2009-02-07 20:06 ` [PATCH 3/4] Move FASYNC bit handling to f_op->fasync() Jonathan Corbet
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-07 20:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro, Davide Libenzi,
	David Miller, Christoph Hellwig, Alan Cox, Matt Mackall,
	Jonathan Corbet

Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all.  This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock.  This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 drivers/char/tty_io.c             |    5 ++---
 drivers/usb/gadget/file_storage.c |    7 ++++++-
 fs/fcntl.c                        |    2 ++
 fs/ioctl.c                        |    7 ++++---
 fs/nfsd/vfs.c                     |    5 ++++-
 include/linux/fs.h                |    2 +-
 ipc/mqueue.c                      |    2 ++
 sound/core/oss/pcm_oss.c          |    2 ++
 sound/oss/au1550_ac97.c           |    2 ++
 sound/oss/audio.c                 |    2 ++
 sound/oss/sh_dac_audio.c          |    2 ++
 sound/oss/swarm_cs4297a.c         |    2 ++
 sound/oss/vwsnd.c                 |    2 ++
 13 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index bc84e12..224f271 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -2162,13 +2162,12 @@ static int fionbio(struct file *file, int __user *p)
 	if (get_user(nonblock, p))
 		return -EFAULT;
 
-	/* file->f_flags is still BKL protected in the fs layer - vomit */
-	lock_kernel();
+	spin_lock(&file->f_lock);
 	if (nonblock)
 		file->f_flags |= O_NONBLOCK;
 	else
 		file->f_flags &= ~O_NONBLOCK;
-	unlock_kernel();
+	spin_unlock(&file->f_lock);
 	return 0;
 }
 
diff --git a/drivers/usb/gadget/file_storage.c b/drivers/usb/gadget/file_storage.c
index b10fa31..26f5a08 100644
--- a/drivers/usb/gadget/file_storage.c
+++ b/drivers/usb/gadget/file_storage.c
@@ -1711,7 +1711,9 @@ static int do_write(struct fsg_dev *fsg)
 		curlun->sense_data = SS_WRITE_PROTECTED;
 		return -EINVAL;
 	}
+	spin_lock(&curlun->filp->f_lock);
 	curlun->filp->f_flags &= ~O_SYNC;	// Default is not to wait
+	spin_unlock(&curlun->filp->f_lock);
 
 	/* Get the starting Logical Block Address and check that it's
 	 * not too big */
@@ -1728,8 +1730,11 @@ static int do_write(struct fsg_dev *fsg)
 			curlun->sense_data = SS_INVALID_FIELD_IN_CDB;
 			return -EINVAL;
 		}
-		if (fsg->cmnd[1] & 0x08)	// FUA
+		if (fsg->cmnd[1] & 0x08) {	// FUA
+			spin_lock(&curlun->filp->f_lock);
 			curlun->filp->f_flags |= O_SYNC;
+			spin_unlock(&curlun->filp->f_lock);
+		}
 	}
 	if (lba >= curlun->num_sectors) {
 		curlun->sense_data = SS_LOGICAL_BLOCK_ADDRESS_OUT_OF_RANGE;
diff --git a/fs/fcntl.c b/fs/fcntl.c
index bd215cc..04df857 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -189,7 +189,9 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 		}
 	}
 
+	spin_lock(&filp->f_lock);
 	filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK);
+	spin_unlock(&filp->f_lock);
  out:
 	unlock_kernel();
 	return error;
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 240ec63..421aab4 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -404,10 +404,12 @@ static int ioctl_fionbio(struct file *filp, int __user *argp)
 	if (O_NONBLOCK != O_NDELAY)
 		flag |= O_NDELAY;
 #endif
+	spin_lock(&filp->f_lock);
 	if (on)
 		filp->f_flags |= flag;
 	else
 		filp->f_flags &= ~flag;
+	spin_unlock(&filp->f_lock);
 	return error;
 }
 
@@ -432,10 +434,12 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
 	if (error)
 		return error;
 
+	spin_lock(&filp->f_lock);
 	if (on)
 		filp->f_flags |= FASYNC;
 	else
 		filp->f_flags &= ~FASYNC;
+	spin_unlock(&filp->f_lock);
 	return error;
 }
 
@@ -499,10 +503,7 @@ int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
 		break;
 
 	case FIONBIO:
-		/* BKL needed to avoid races tweaking f_flags */
-		lock_kernel();
 		error = ioctl_fionbio(filp, argp);
-		unlock_kernel();
 		break;
 
 	case FIOASYNC:
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6e50aaa..c165a64 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -998,8 +998,11 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 
 	if (!EX_ISSYNC(exp))
 		stable = 0;
-	if (stable && !EX_WGATHER(exp))
+	if (stable && !EX_WGATHER(exp)) {
+		spin_lock(&file->f_lock);
 		file->f_flags |= O_SYNC;
+		spin_unlock(&file->f_lock);
+	}
 
 	/* Write the data. */
 	oldfs = get_fs(); set_fs(KERNEL_DS);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0d71633..137a92a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -842,7 +842,7 @@ struct file {
 #define f_dentry	f_path.dentry
 #define f_vfsmnt	f_path.mnt
 	const struct file_operations	*f_op;
-	spinlock_t		f_lock;  /* f_ep_links */
+	spinlock_t		f_lock;  /* f_ep_links, f_flags */
 	atomic_long_t		f_count;
 	unsigned int 		f_flags;
 	fmode_t			f_mode;
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 54b4077..a8ddadb 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1156,10 +1156,12 @@ SYSCALL_DEFINE3(mq_getsetattr, mqd_t, mqdes,
 	omqstat.mq_flags = filp->f_flags & O_NONBLOCK;
 	if (u_mqstat) {
 		audit_mq_getsetattr(mqdes, &mqstat);
+		spin_lock(&filp->f_lock);
 		if (mqstat.mq_flags & O_NONBLOCK)
 			filp->f_flags |= O_NONBLOCK;
 		else
 			filp->f_flags &= ~O_NONBLOCK;
+		spin_unlock(&filp->f_lock);
 
 		inode->i_atime = inode->i_ctime = CURRENT_TIME;
 	}
diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index e178366..3a28acb 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -1895,7 +1895,9 @@ static int snd_pcm_oss_set_fragment(struct snd_pcm_oss_file *pcm_oss_file, unsig
 
 static int snd_pcm_oss_nonblock(struct file * file)
 {
+	spin_lock(&file->f_lock);
 	file->f_flags |= O_NONBLOCK;
+	spin_unlock(&file->f_lock);
 	return 0;
 }
 
diff --git a/sound/oss/au1550_ac97.c b/sound/oss/au1550_ac97.c
index 81e1f44..4191acc 100644
--- a/sound/oss/au1550_ac97.c
+++ b/sound/oss/au1550_ac97.c
@@ -1627,7 +1627,9 @@ au1550_ioctl(struct inode *inode, struct file *file, unsigned int cmd,
 				    sizeof(abinfo)) ? -EFAULT : 0;
 
 	case SNDCTL_DSP_NONBLOCK:
+		spin_lock(&file->f_lock);
 		file->f_flags |= O_NONBLOCK;
+		spin_unlock(&file->f_lock);
 		return 0;
 
 	case SNDCTL_DSP_GETODELAY:
diff --git a/sound/oss/audio.c b/sound/oss/audio.c
index 89bd27a..b69c05b 100644
--- a/sound/oss/audio.c
+++ b/sound/oss/audio.c
@@ -433,7 +433,9 @@ int audio_ioctl(int dev, struct file *file, unsigned int cmd, void __user *arg)
 			return dma_ioctl(dev, cmd, arg);
 		
 		case SNDCTL_DSP_NONBLOCK:
+			spin_lock(&file->f_lock);
 			file->f_flags |= O_NONBLOCK;
+			spin_unlock(&file->f_lock);
 			return 0;
 
 		case SNDCTL_DSP_GETCAPS:
diff --git a/sound/oss/sh_dac_audio.c b/sound/oss/sh_dac_audio.c
index e5d4239..78cfb66 100644
--- a/sound/oss/sh_dac_audio.c
+++ b/sound/oss/sh_dac_audio.c
@@ -135,7 +135,9 @@ static int dac_audio_ioctl(struct inode *inode, struct file *file,
 		return put_user(AFMT_U8, (int *)arg);
 
 	case SNDCTL_DSP_NONBLOCK:
+		spin_lock(&file->f_lock);
 		file->f_flags |= O_NONBLOCK;
+		spin_unlock(&file->f_lock);
 		return 0;
 
 	case SNDCTL_DSP_GETCAPS:
diff --git a/sound/oss/swarm_cs4297a.c b/sound/oss/swarm_cs4297a.c
index 41562ec..1edab7b 100644
--- a/sound/oss/swarm_cs4297a.c
+++ b/sound/oss/swarm_cs4297a.c
@@ -2200,7 +2200,9 @@ static int cs4297a_ioctl(struct inode *inode, struct file *file,
 				    sizeof(abinfo)) ? -EFAULT : 0;
 
 	case SNDCTL_DSP_NONBLOCK:
+		spin_lock(&file->f_lock);
 		file->f_flags |= O_NONBLOCK;
+		spin_unlock(&file->f_lock);
 		return 0;
 
 	case SNDCTL_DSP_GETODELAY:
diff --git a/sound/oss/vwsnd.c b/sound/oss/vwsnd.c
index 78b8acc..187f727 100644
--- a/sound/oss/vwsnd.c
+++ b/sound/oss/vwsnd.c
@@ -2673,7 +2673,9 @@ static int vwsnd_audio_do_ioctl(struct inode *inode,
 
 	case SNDCTL_DSP_NONBLOCK:	/* _SIO  ('P',14) */
 		DBGX("SNDCTL_DSP_NONBLOCK\n");
+		spin_lock(&file->f_lock);
 		file->f_flags |= O_NONBLOCK;
+		spin_unlock(&file->f_lock);
 		return 0;
 
 	case SNDCTL_DSP_RESET:		/* _SIO  ('P', 0) */
-- 
1.6.1.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/4] Move FASYNC bit handling to f_op->fasync()
  2009-02-07 20:06 [PATCH/RFC] Fasync BKL removal Jonathan Corbet
  2009-02-07 20:06 ` [PATCH 1/4] Rename struct file->f_ep_lock Jonathan Corbet
  2009-02-07 20:06 ` [PATCH 2/4] Use f_lock to protect f_flags Jonathan Corbet
@ 2009-02-07 20:06 ` Jonathan Corbet
  2009-02-08  9:05   ` Christoph Hellwig
  2009-02-07 20:06 ` [PATCH 4/4] Rationalize fasync return values Jonathan Corbet
  2009-02-07 20:28 ` [PATCH/RFC] Fasync BKL removal Matt Mackall
  4 siblings, 1 reply; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-07 20:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro, Davide Libenzi,
	David Miller, Christoph Hellwig, Alan Cox, Matt Mackall,
	Jonathan Corbet

Removing the BKL from FASYNC handling ran into the challenge of keeping the
setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
the underlying fasync() function.  Andi Kleen suggested moving the handling
of that bit into fasync(); this patch does exactly that.  As a result, we
have a couple of internal API changes: fasync() must now manage the FASYNC
bit, and it will be called without the BKL held.

As it happens, every fasync() implementation in the kernel with one
exception calls fasync_helper().  So, if we make fasync_helper() set the
FASYNC bit, we can avoid making any changes to the other fasync()
functions - as long as those functions, themselves, have proper locking.
Most fasync() implementations do nothing but call fasync_helper() - which
has its own lock - so they are easily verified as correct.  The BKL had
already been pushed down into the rest.

The networking code has its own version of fasync_helper(), so that code
has been augmented with explicit FASYNC bit handling.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 fs/fcntl.c   |   29 ++++++++++++++++-------------
 fs/ioctl.c   |   13 +------------
 net/socket.c |    8 ++++++++
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 04df857..431bb64 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -141,7 +141,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
 	return ret;
 }
 
-#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | FASYNC | O_DIRECT | O_NOATIME)
+#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
 
 static int setfl(int fd, struct file * filp, unsigned long arg)
 {
@@ -177,23 +177,19 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 		return error;
 
 	/*
-	 * We still need a lock here for now to keep multiple FASYNC calls
-	 * from racing with each other.
+	 * ->fasync() is responsible for setting the FASYNC bit.
 	 */
-	lock_kernel();
-	if ((arg ^ filp->f_flags) & FASYNC) {
-		if (filp->f_op && filp->f_op->fasync) {
-			error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0);
-			if (error < 0)
-				goto out;
-		}
+	if (((arg ^ filp->f_flags) & FASYNC) && filp->f_op &&
+			filp->f_op->fasync) {
+		error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0);
+		if (error < 0)
+			goto out;
 	}
-
 	spin_lock(&filp->f_lock);
 	filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK);
 	spin_unlock(&filp->f_lock);
+
  out:
-	unlock_kernel();
 	return error;
 }
 
@@ -518,7 +514,7 @@ static DEFINE_RWLOCK(fasync_lock);
 static struct kmem_cache *fasync_cache __read_mostly;
 
 /*
- * fasync_helper() is used by some character device drivers (mainly mice)
+ * fasync_helper() is used by almost all character device drivers
  * to set up the fasync queue. It returns negative on error, 0 if it did
  * no changes and positive if it added/deleted the entry.
  */
@@ -557,6 +553,13 @@ int fasync_helper(int fd, struct file * filp, int on, struct fasync_struct **fap
 		result = 1;
 	}
 out:
+	/* Fix up FASYNC bit while still holding fasync_lock */
+	spin_lock(&filp->f_lock);
+	if (on)
+		filp->f_flags |= FASYNC;
+	else
+		filp->f_flags &= ~FASYNC;
+	spin_unlock(&filp->f_lock);
 	write_unlock_irq(&fasync_lock);
 	return result;
 }
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 421aab4..e8e89ed 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -427,19 +427,11 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
 	/* Did FASYNC state change ? */
 	if ((flag ^ filp->f_flags) & FASYNC) {
 		if (filp->f_op && filp->f_op->fasync)
+			/* fasync() adjusts filp->f_flags */
 			error = filp->f_op->fasync(fd, filp, on);
 		else
 			error = -ENOTTY;
 	}
-	if (error)
-		return error;
-
-	spin_lock(&filp->f_lock);
-	if (on)
-		filp->f_flags |= FASYNC;
-	else
-		filp->f_flags &= ~FASYNC;
-	spin_unlock(&filp->f_lock);
 	return error;
 }
 
@@ -507,10 +499,7 @@ int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
 		break;
 
 	case FIOASYNC:
-		/* BKL needed to avoid races tweaking f_flags */
-		lock_kernel();
 		error = ioctl_fioasync(fd, filp, argp);
-		unlock_kernel();
 		break;
 
 	case FIOQSIZE:
diff --git a/net/socket.c b/net/socket.c
index 35dd737..5770398 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1030,6 +1030,14 @@ static int sock_fasync(int fd, struct file *filp, int on)
 
 	lock_sock(sk);
 
+	/* Maintaining the FASYNC bit is our job now */
+	spin_lock(&filp->f_lock);
+	if (on)
+		filp->f_flags |= FASYNC;
+	else
+		filp->f_flags &= ~FASYNC;
+	spin_unlock(&filp->f_lock);
+
 	prev = &(sock->fasync_list);
 
 	for (fa = *prev; fa != NULL; prev = &fa->fa_next, fa = *prev)
-- 
1.6.1.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/4] Rationalize fasync return values
  2009-02-07 20:06 [PATCH/RFC] Fasync BKL removal Jonathan Corbet
                   ` (2 preceding siblings ...)
  2009-02-07 20:06 ` [PATCH 3/4] Move FASYNC bit handling to f_op->fasync() Jonathan Corbet
@ 2009-02-07 20:06 ` Jonathan Corbet
  2009-02-07 20:28 ` [PATCH/RFC] Fasync BKL removal Matt Mackall
  4 siblings, 0 replies; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-07 20:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro, Davide Libenzi,
	David Miller, Christoph Hellwig, Alan Cox, Matt Mackall,
	Jonathan Corbet

Most fasync implementations do something like:

     return fasync_helper(...);

But fasync_helper() will return a positive value at times - a feature used
in at least one place.  Thus, a number of other drivers do:

     err = fasync_helper(...);
     if (err < 0)
             return err;
     return 0;

In the interests of consistency and more concise code, it makes sense to
map positive return values onto zero where ->fasync() is called.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 drivers/char/sonypi.c              |    7 +------
 drivers/gpu/drm/drm_fops.c         |    6 +-----
 drivers/hid/usbhid/hiddev.c        |    5 +----
 drivers/ieee1394/dv1394.c          |    6 +-----
 drivers/input/evdev.c              |    5 +----
 drivers/input/joydev.c             |    5 +----
 drivers/input/mousedev.c           |    5 +----
 drivers/input/serio/serio_raw.c    |    4 +---
 drivers/net/wan/cosa.c             |    4 ++--
 drivers/platform/x86/sony-laptop.c |    7 +------
 drivers/scsi/sg.c                  |    4 +---
 fs/fcntl.c                         |    2 ++
 fs/ioctl.c                         |    2 +-
 fs/pipe.c                          |   17 +++--------------
 sound/core/control.c               |    7 ++-----
 sound/core/pcm_native.c            |    4 +---
 sound/core/timer.c                 |    6 +-----
 17 files changed, 22 insertions(+), 74 deletions(-)

diff --git a/drivers/char/sonypi.c b/drivers/char/sonypi.c
index f437443..fd3dced 100644
--- a/drivers/char/sonypi.c
+++ b/drivers/char/sonypi.c
@@ -888,12 +888,7 @@ found:
 
 static int sonypi_misc_fasync(int fd, struct file *filp, int on)
 {
-	int retval;
-
-	retval = fasync_helper(fd, filp, on, &sonypi_device.fifo_async);
-	if (retval < 0)
-		return retval;
-	return 0;
+	return fasync_helper(fd, filp, on, &sonypi_device.fifo_async);
 }
 
 static int sonypi_misc_release(struct inode *inode, struct file *file)
diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index b06a537..9ec9d5d 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -337,14 +337,10 @@ int drm_fasync(int fd, struct file *filp, int on)
 {
 	struct drm_file *priv = filp->private_data;
 	struct drm_device *dev = priv->minor->dev;
-	int retcode;
 
 	DRM_DEBUG("fd = %d, device = 0x%lx\n", fd,
 		  (long)old_encode_dev(priv->minor->device));
-	retcode = fasync_helper(fd, filp, on, &dev->buf_async);
-	if (retcode < 0)
-		return retcode;
-	return 0;
+	return fasync_helper(fd, filp, on, &dev->buf_async);
 }
 EXPORT_SYMBOL(drm_fasync);
 
diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index d73eea3..55828da 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -227,12 +227,9 @@ void hiddev_report_event(struct hid_device *hid, struct hid_report *report)
  */
 static int hiddev_fasync(int fd, struct file *file, int on)
 {
-	int retval;
 	struct hiddev_list *list = file->private_data;
 
-	retval = fasync_helper(fd, file, on, &list->fasync);
-
-	return retval < 0 ? retval : 0;
+	return fasync_helper(fd, file, on, &list->fasync);
 }
 
 
diff --git a/drivers/ieee1394/dv1394.c b/drivers/ieee1394/dv1394.c
index a329e6b..22e5487 100644
--- a/drivers/ieee1394/dv1394.c
+++ b/drivers/ieee1394/dv1394.c
@@ -1325,11 +1325,7 @@ static int dv1394_fasync(int fd, struct file *file, int on)
 
 	struct video_card *video = file_to_video_card(file);
 
-	int retval = fasync_helper(fd, file, on, &video->fasync);
-
-	if (retval < 0)
-		return retval;
-        return 0;
+	return fasync_helper(fd, file, on, &video->fasync);
 }
 
 static ssize_t dv1394_write(struct file *file, const char __user *buffer, size_t count, loff_t *ppos)
diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index ed8baa0..7a7a026 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -94,11 +94,8 @@ static void evdev_event(struct input_handle *handle,
 static int evdev_fasync(int fd, struct file *file, int on)
 {
 	struct evdev_client *client = file->private_data;
-	int retval;
-
-	retval = fasync_helper(fd, file, on, &client->fasync);
 
-	return retval < 0 ? retval : 0;
+	return fasync_helper(fd, file, on, &client->fasync);
 }
 
 static int evdev_flush(struct file *file, fl_owner_t id)
diff --git a/drivers/input/joydev.c b/drivers/input/joydev.c
index 6f23662..4224f01 100644
--- a/drivers/input/joydev.c
+++ b/drivers/input/joydev.c
@@ -159,12 +159,9 @@ static void joydev_event(struct input_handle *handle,
 
 static int joydev_fasync(int fd, struct file *file, int on)
 {
-	int retval;
 	struct joydev_client *client = file->private_data;
 
-	retval = fasync_helper(fd, file, on, &client->fasync);
-
-	return retval < 0 ? retval : 0;
+	return fasync_helper(fd, file, on, &client->fasync);
 }
 
 static void joydev_free(struct device *dev)
diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index ef99a7e..17fd6d4 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -403,12 +403,9 @@ static void mousedev_event(struct input_handle *handle,
 
 static int mousedev_fasync(int fd, struct file *file, int on)
 {
-	int retval;
 	struct mousedev_client *client = file->private_data;
 
-	retval = fasync_helper(fd, file, on, &client->fasync);
-
-	return retval < 0 ? retval : 0;
+	return fasync_helper(fd, file, on, &client->fasync);
 }
 
 static void mousedev_free(struct device *dev)
diff --git a/drivers/input/serio/serio_raw.c b/drivers/input/serio/serio_raw.c
index 06bbd0e..b03009b 100644
--- a/drivers/input/serio/serio_raw.c
+++ b/drivers/input/serio/serio_raw.c
@@ -58,10 +58,8 @@ static unsigned int serio_raw_no;
 static int serio_raw_fasync(int fd, struct file *file, int on)
 {
 	struct serio_raw_list *list = file->private_data;
-	int retval;
 
-	retval = fasync_helper(fd, file, on, &list->fasync);
-	return retval < 0 ? retval : 0;
+	return fasync_helper(fd, file, on, &list->fasync);
 }
 
 static struct serio_raw *serio_raw_locate(int minor)
diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c
index d80b72e..ce753e9 100644
--- a/drivers/net/wan/cosa.c
+++ b/drivers/net/wan/cosa.c
@@ -993,8 +993,8 @@ static struct fasync_struct *fasync[256] = { NULL, };
 static int cosa_fasync(struct inode *inode, struct file *file, int on)
 {
         int port = iminor(inode);
-        int rv = fasync_helper(inode, file, on, &fasync[port]);
-        return rv < 0 ? rv : 0;
+
+	return fasync_helper(inode, file, on, &fasync[port]);
 }
 #endif
 
diff --git a/drivers/platform/x86/sony-laptop.c b/drivers/platform/x86/sony-laptop.c
index 537959d..bc8996c 100644
--- a/drivers/platform/x86/sony-laptop.c
+++ b/drivers/platform/x86/sony-laptop.c
@@ -1917,12 +1917,7 @@ static struct sonypi_compat_s sonypi_compat = {
 
 static int sonypi_misc_fasync(int fd, struct file *filp, int on)
 {
-	int retval;
-
-	retval = fasync_helper(fd, filp, on, &sonypi_compat.fifo_async);
-	if (retval < 0)
-		return retval;
-	return 0;
+	return fasync_helper(fd, filp, on, &sonypi_compat.fifo_async);
 }
 
 static int sonypi_misc_release(struct inode *inode, struct file *file)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 8f0bd3f..b656cde 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1154,7 +1154,6 @@ sg_poll(struct file *filp, poll_table * wait)
 static int
 sg_fasync(int fd, struct file *filp, int mode)
 {
-	int retval;
 	Sg_device *sdp;
 	Sg_fd *sfp;
 
@@ -1163,8 +1162,7 @@ sg_fasync(int fd, struct file *filp, int mode)
 	SCSI_LOG_TIMEOUT(3, printk("sg_fasync: %s, mode=%d\n",
 				   sdp->disk->disk_name, mode));
 
-	retval = fasync_helper(fd, filp, mode, &sfp->async_qp);
-	return (retval < 0) ? retval : 0;
+	return fasync_helper(fd, filp, mode, &sfp->async_qp);
 }
 
 static int
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 431bb64..d865ca6 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -184,6 +184,8 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 		error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0);
 		if (error < 0)
 			goto out;
+		if (error > 0)
+			error = 0;
 	}
 	spin_lock(&filp->f_lock);
 	filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK);
diff --git a/fs/ioctl.c b/fs/ioctl.c
index e8e89ed..ac2d47e 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -432,7 +432,7 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
 		else
 			error = -ENOTTY;
 	}
-	return error;
+	return error < 0 ? error : 0;
 }
 
 static int ioctl_fsfreeze(struct file *filp)
diff --git a/fs/pipe.c b/fs/pipe.c
index 3a48ba5..900de67 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -667,10 +667,7 @@ pipe_read_fasync(int fd, struct file *filp, int on)
 	retval = fasync_helper(fd, filp, on, &inode->i_pipe->fasync_readers);
 	mutex_unlock(&inode->i_mutex);
 
-	if (retval < 0)
-		return retval;
-
-	return 0;
+	return retval;
 }
 
 
@@ -684,10 +681,7 @@ pipe_write_fasync(int fd, struct file *filp, int on)
 	retval = fasync_helper(fd, filp, on, &inode->i_pipe->fasync_writers);
 	mutex_unlock(&inode->i_mutex);
 
-	if (retval < 0)
-		return retval;
-
-	return 0;
+	return retval;
 }
 
 
@@ -701,16 +695,11 @@ pipe_rdwr_fasync(int fd, struct file *filp, int on)
 	mutex_lock(&inode->i_mutex);
 
 	retval = fasync_helper(fd, filp, on, &pipe->fasync_readers);
-
 	if (retval >= 0)
 		retval = fasync_helper(fd, filp, on, &pipe->fasync_writers);
 
 	mutex_unlock(&inode->i_mutex);
-
-	if (retval < 0)
-		return retval;
-
-	return 0;
+	return retval;
 }
 
 
diff --git a/sound/core/control.c b/sound/core/control.c
index 636b3b5..4b20fa2 100644
--- a/sound/core/control.c
+++ b/sound/core/control.c
@@ -1373,12 +1373,9 @@ EXPORT_SYMBOL(snd_ctl_unregister_ioctl_compat);
 static int snd_ctl_fasync(int fd, struct file * file, int on)
 {
 	struct snd_ctl_file *ctl;
-	int err;
+
 	ctl = file->private_data;
-	err = fasync_helper(fd, file, on, &ctl->fasync);
-	if (err < 0)
-		return err;
-	return 0;
+	return fasync_helper(fd, file, on, &ctl->fasync);
 }
 
 /*
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index a789efc..a75c194 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3246,9 +3246,7 @@ static int snd_pcm_fasync(int fd, struct file * file, int on)
 	err = fasync_helper(fd, file, on, &runtime->fasync);
 out:
 	unlock_kernel();
-	if (err < 0)
-		return err;
-	return 0;
+	return err;
 }
 
 /*
diff --git a/sound/core/timer.c b/sound/core/timer.c
index 7965320..3f0050d 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -1825,13 +1825,9 @@ static long snd_timer_user_ioctl(struct file *file, unsigned int cmd,
 static int snd_timer_user_fasync(int fd, struct file * file, int on)
 {
 	struct snd_timer_user *tu;
-	int err;
 
 	tu = file->private_data;
-	err = fasync_helper(fd, file, on, &tu->fasync);
-        if (err < 0)
-		return err;
-	return 0;
+	return fasync_helper(fd, file, on, &tu->fasync);
 }
 
 static ssize_t snd_timer_user_read(struct file *file, char __user *buffer,
-- 
1.6.1.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH/RFC] Fasync BKL removal
  2009-02-07 20:06 [PATCH/RFC] Fasync BKL removal Jonathan Corbet
                   ` (3 preceding siblings ...)
  2009-02-07 20:06 ` [PATCH 4/4] Rationalize fasync return values Jonathan Corbet
@ 2009-02-07 20:28 ` Matt Mackall
  4 siblings, 0 replies; 12+ messages in thread
From: Matt Mackall @ 2009-02-07 20:28 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-kernel, Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro,
	Davide Libenzi, David Miller, Christoph Hellwig, Alan Cox

On Sat, 2009-02-07 at 13:06 -0700, Jonathan Corbet wrote:
> With any luck at all, this is my final attempt at removing the BKL from the
> fasync() path and providing proper protection for struct file->f_flags.
> This week's episode includes these patches:

This all looks very pretty to me.

Acked-by: Matt Mackall <mpm@selenic.com>

-- 
http://selenic.com : development and support for Mercurial and Linux



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/4] Rename struct file->f_ep_lock
  2009-02-07 20:06 ` [PATCH 1/4] Rename struct file->f_ep_lock Jonathan Corbet
@ 2009-02-08  8:56   ` Christoph Hellwig
  2009-02-08 19:51   ` Davide Libenzi
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2009-02-08  8:56 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-kernel, Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro,
	Davide Libenzi, David Miller, Christoph Hellwig, Alan Cox,
	Matt Mackall

On Sat, Feb 07, 2009 at 01:06:54PM -0700, Jonathan Corbet wrote:
> This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock.  For now,
> epoll remains the only user, but a future patch will use it to protect
> f_flags as well.

Looks good to me.


Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] Use f_lock to protect f_flags
  2009-02-07 20:06 ` [PATCH 2/4] Use f_lock to protect f_flags Jonathan Corbet
@ 2009-02-08  9:02   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2009-02-08  9:02 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-kernel, Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro,
	Davide Libenzi, David Miller, Christoph Hellwig, Alan Cox,
	Matt Mackall

On Sat, Feb 07, 2009 at 01:06:55PM -0700, Jonathan Corbet wrote:
> Traditionally, changes to struct file->f_flags have been done under BKL
> protection, or with no protection at all.  This patch causes all f_flags
> changes after file open/creation time to be done under protection of
> f_lock.  This allows the removal of some BKL usage and fixes a number of
> longstanding (if microscopic) races.

Looks good to me.


Reviewed-by: Christoph Hellwig <hch@lst.de>

One comments only tangentially related to the patch:


> diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
> index bc84e12..224f271 100644
> --- a/drivers/char/tty_io.c
> +++ b/drivers/char/tty_io.c
> @@ -2162,13 +2162,12 @@ static int fionbio(struct file *file, int __user *p)
>  	if (get_user(nonblock, p))
>  		return -EFAULT;
>  
> -	/* file->f_flags is still BKL protected in the fs layer - vomit */
> -	lock_kernel();
> +	spin_lock(&file->f_lock);
>  	if (nonblock)
>  		file->f_flags |= O_NONBLOCK;
>  	else
>  		file->f_flags &= ~O_NONBLOCK;
> -	unlock_kernel();
> +	spin_unlock(&file->f_lock);
>  	return 0;
>  }

Why is this code there at all?  It's a duplicate of
fs/ioctl.c:ioctl_fionbio minus the sparc special case, and from looking
at the flow in fs/ioctl.c I'm pretty sure FIONBIO never gets handed to
the chardev ioctl methods..


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] Move FASYNC bit handling to f_op->fasync()
  2009-02-07 20:06 ` [PATCH 3/4] Move FASYNC bit handling to f_op->fasync() Jonathan Corbet
@ 2009-02-08  9:05   ` Christoph Hellwig
  2009-02-10  2:41     ` Jonathan Corbet
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2009-02-08  9:05 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-kernel, Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro,
	Davide Libenzi, David Miller, Christoph Hellwig, Alan Cox,
	Matt Mackall

On Sat, Feb 07, 2009 at 01:06:56PM -0700, Jonathan Corbet wrote:
> Removing the BKL from FASYNC handling ran into the challenge of keeping the
> setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
> the underlying fasync() function.  Andi Kleen suggested moving the handling
> of that bit into fasync(); this patch does exactly that.  As a result, we
> have a couple of internal API changes: fasync() must now manage the FASYNC
> bit, and it will be called without the BKL held.
> 
> As it happens, every fasync() implementation in the kernel with one
> exception calls fasync_helper().  So, if we make fasync_helper() set the
> FASYNC bit, we can avoid making any changes to the other fasync()
> functions - as long as those functions, themselves, have proper locking.
> Most fasync() implementations do nothing but call fasync_helper() - which
> has its own lock - so they are easily verified as correct.  The BKL had
> already been pushed down into the rest.
> 
> The networking code has its own version of fasync_helper(), so that code
> has been augmented with explicit FASYNC bit handling.

> +	/* Maintaining the FASYNC bit is our job now */

Not sure this comment is a good idea.  All comments of the style
something has changes and is this now get stale really soon.  If you
want to document that change in more details it should probably go
into Documentation/filesystems/Locking near the fasync part.


But the actual good changes look good, so:


Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/4] Rename struct file->f_ep_lock
  2009-02-07 20:06 ` [PATCH 1/4] Rename struct file->f_ep_lock Jonathan Corbet
  2009-02-08  8:56   ` Christoph Hellwig
@ 2009-02-08 19:51   ` Davide Libenzi
  1 sibling, 0 replies; 12+ messages in thread
From: Davide Libenzi @ 2009-02-08 19:51 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Linux Kernel Mailing List, Andi Kleen, Oleg Nesterov,
	Andrew Morton, Al Viro, David Miller, Christoph Hellwig, Alan Cox,
	Matt Mackall

On Sat, 7 Feb 2009, Jonathan Corbet wrote:

> This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock.  For now,
> epoll remains the only user, but a future patch will use it to protect
> f_flags as well.

Looks OK to me.


- Davide



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] Move FASYNC bit handling to f_op->fasync()
  2009-02-08  9:05   ` Christoph Hellwig
@ 2009-02-10  2:41     ` Jonathan Corbet
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Corbet @ 2009-02-10  2:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, Andi Kleen, Oleg Nesterov, Andrew Morton, Al Viro,
	Davide Libenzi, David Miller, Christoph Hellwig, Alan Cox,
	Matt Mackall

On Sun, 8 Feb 2009 10:05:50 +0100
Christoph Hellwig <hch@lst.de> wrote:

> > +	/* Maintaining the FASYNC bit is our job now */  
> 
> Not sure this comment is a good idea.  All comments of the style
> something has changes and is this now get stale really soon.  If you
> want to document that change in more details it should probably go
> into Documentation/filesystems/Locking near the fasync part.

Good point.  I've removed the comment and updated the doc file instead.

Thanks,

jon

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-02-10  2:42 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-07 20:06 [PATCH/RFC] Fasync BKL removal Jonathan Corbet
2009-02-07 20:06 ` [PATCH 1/4] Rename struct file->f_ep_lock Jonathan Corbet
2009-02-08  8:56   ` Christoph Hellwig
2009-02-08 19:51   ` Davide Libenzi
2009-02-07 20:06 ` [PATCH 2/4] Use f_lock to protect f_flags Jonathan Corbet
2009-02-08  9:02   ` Christoph Hellwig
2009-02-07 20:06 ` [PATCH 3/4] Move FASYNC bit handling to f_op->fasync() Jonathan Corbet
2009-02-08  9:05   ` Christoph Hellwig
2009-02-10  2:41     ` Jonathan Corbet
2009-02-07 20:06 ` [PATCH 4/4] Rationalize fasync return values Jonathan Corbet
2009-02-07 20:28 ` [PATCH/RFC] Fasync BKL removal Matt Mackall
  -- strict thread matches above, loose matches on Subject: below --
2009-02-02 18:20 [PATCH/RFC] F_SETFL/Fasync BKL removal, now without unsightly global locks Jonathan Corbet
2009-02-02 18:20 ` [PATCH 3/4] Move FASYNC bit handling to f_op->fasync() Jonathan Corbet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox