* [PATCH] Fix file lookup without ref
@ 2006-04-12 18:31 Dipankar Sarma
2006-04-13 0:53 ` Paul E. McKenney
0 siblings, 1 reply; 4+ messages in thread
From: Dipankar Sarma @ 2006-04-12 18:31 UTC (permalink / raw)
To: Andrew Morton; +Cc: Paul E.McKenney, linux-kernel
This patch fixes a problem with some places in the kernel where
we look up a file structure from the fd table but don't hold
a reference to the file. Those places cannot be lock-free.
These places aren't in the fast path, so it is not a problem.
I have tested this patch on powerpc and x86_64 using basic
tests and ltp. We should aim to merge this for 2.6.17.
Thanks
Dipankar
There are places in the kernel where we look up files in fd tables
and access the file structure without holding references to the file.
So, we need special care to avoid the race between
looking up files in the fd table and tearing down of the file
on another CPU. Otherwise, one might see a NULL f_dentry or
such torn down version of the file. This patch fixes those
special places where such a race may happen.
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
---
drivers/char/tty_io.c | 8 ++++++--
fs/locks.c | 9 +++++++--
fs/proc/base.c | 21 +++++++++++++++------
3 files changed, 28 insertions(+), 10 deletions(-)
diff -puN drivers/char/tty_io.c~fix-proc-fd-ops drivers/char/tty_io.c
--- linux-2.6.16-rcu/drivers/char/tty_io.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
+++ linux-2.6.16-rcu-dipankar/drivers/char/tty_io.c 2006-04-12 21:06:24.000000000 +0530
@@ -2706,7 +2706,11 @@ static void __do_SAK(void *arg)
}
task_lock(p);
if (p->files) {
- rcu_read_lock();
+ /*
+ * We don't take a ref to the file, so we must
+ * hold ->file_lock instead.
+ */
+ spin_lock(&p->files->file_lock);
fdt = files_fdtable(p->files);
for (i=0; i < fdt->max_fds; i++) {
filp = fcheck_files(p->files, i);
@@ -2721,7 +2725,7 @@ static void __do_SAK(void *arg)
break;
}
}
- rcu_read_unlock();
+ spin_unlock(&p->files->file_lock);
}
task_unlock(p);
} while_each_task_pid(session, PIDTYPE_SID, p);
diff -puN fs/locks.c~fix-proc-fd-ops fs/locks.c
--- linux-2.6.16-rcu/fs/locks.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
+++ linux-2.6.16-rcu-dipankar/fs/locks.c 2006-04-12 21:06:24.000000000 +0530
@@ -2212,7 +2212,12 @@ void steal_locks(fl_owner_t from)
lock_kernel();
j = 0;
- rcu_read_lock();
+
+ /*
+ * We are not taking a ref to the file structures, so
+ * we need to acquire ->file_lock.
+ */
+ spin_lock(&files->file_lock);
fdt = files_fdtable(files);
for (;;) {
unsigned long set;
@@ -2230,7 +2235,7 @@ void steal_locks(fl_owner_t from)
set >>= 1;
}
}
- rcu_read_unlock();
+ spin_unlock(&files->file_lock);
unlock_kernel();
}
EXPORT_SYMBOL(steal_locks);
diff -puN fs/proc/base.c~fix-proc-fd-ops fs/proc/base.c
--- linux-2.6.16-rcu/fs/proc/base.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
+++ linux-2.6.16-rcu-dipankar/fs/proc/base.c 2006-04-12 21:06:24.000000000 +0530
@@ -294,16 +294,20 @@ static int proc_fd_link(struct inode *in
files = get_files_struct(task);
if (files) {
- rcu_read_lock();
+ /*
+ * We are not taking a ref to the file structure, so we must
+ * hold ->file_lock.
+ */
+ spin_lock(&files->file_lock);
file = fcheck_files(files, fd);
if (file) {
*mnt = mntget(file->f_vfsmnt);
*dentry = dget(file->f_dentry);
- rcu_read_unlock();
+ spin_unlock(&files->file_lock);
put_files_struct(files);
return 0;
}
- rcu_read_unlock();
+ spin_unlock(&files->file_lock);
put_files_struct(files);
}
return -ENOENT;
@@ -1485,7 +1489,12 @@ static struct dentry *proc_lookupfd(stru
if (!files)
goto out_unlock;
inode->i_mode = S_IFLNK;
- rcu_read_lock();
+
+ /*
+ * We are not taking a ref to the file structure, so we must
+ * hold ->file_lock.
+ */
+ spin_lock(&files->file_lock);
file = fcheck_files(files, fd);
if (!file)
goto out_unlock2;
@@ -1493,7 +1502,7 @@ static struct dentry *proc_lookupfd(stru
inode->i_mode |= S_IRUSR | S_IXUSR;
if (file->f_mode & 2)
inode->i_mode |= S_IWUSR | S_IXUSR;
- rcu_read_unlock();
+ spin_unlock(&files->file_lock);
put_files_struct(files);
inode->i_op = &proc_pid_link_inode_operations;
inode->i_size = 64;
@@ -1503,7 +1512,7 @@ static struct dentry *proc_lookupfd(stru
return NULL;
out_unlock2:
- rcu_read_unlock();
+ spin_unlock(&files->file_lock);
put_files_struct(files);
out_unlock:
iput(inode);
_
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] Fix file lookup without ref
2006-04-12 18:31 [PATCH] Fix file lookup without ref Dipankar Sarma
@ 2006-04-13 0:53 ` Paul E. McKenney
0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2006-04-13 0:53 UTC (permalink / raw)
To: Dipankar Sarma; +Cc: Andrew Morton, linux-kernel
On Thu, Apr 13, 2006 at 12:01:06AM +0530, Dipankar Sarma wrote:
> This patch fixes a problem with some places in the kernel where
> we look up file structure from the fd table but don't hold
> a reference to the file. Those places cannot be lock-free.
> These places aren't in fast path, so it is not a problem.
> I have tested this patch on powerpc and x86_64 using basic
> tests and ltp. We should aim to merge this for 2.6.17.
>
> Thanks
> Dipankar
>
>
>
> There are places in the kernel where we look up files in fd tables
> and access the file structure without holding references to the file.
> So, we need special care to avoid the race between
> looking up files in the fd table and tearing down of the file
> on another CPU. Otherwise, one might see a NULL f_dentry or
> such torn down version of the file. This patch fixes those
> special places where such a race may happen.
Acked-by: <paulmck@us.ibm.com>
> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
> ---
>
>
> drivers/char/tty_io.c | 8 ++++++--
> fs/locks.c | 9 +++++++--
> fs/proc/base.c | 21 +++++++++++++++------
> 3 files changed, 28 insertions(+), 10 deletions(-)
>
> diff -puN drivers/char/tty_io.c~fix-proc-fd-ops drivers/char/tty_io.c
> --- linux-2.6.16-rcu/drivers/char/tty_io.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
> +++ linux-2.6.16-rcu-dipankar/drivers/char/tty_io.c 2006-04-12 21:06:24.000000000 +0530
> @@ -2706,7 +2706,11 @@ static void __do_SAK(void *arg)
> }
> task_lock(p);
> if (p->files) {
> - rcu_read_lock();
> + /*
> + * We don't take a ref to the file, so we must
> + * hold ->file_lock instead.
> + */
> + spin_lock(&p->files->file_lock);
> fdt = files_fdtable(p->files);
> for (i=0; i < fdt->max_fds; i++) {
> filp = fcheck_files(p->files, i);
> @@ -2721,7 +2725,7 @@ static void __do_SAK(void *arg)
> break;
> }
> }
> - rcu_read_unlock();
> + spin_unlock(&p->files->file_lock);
> }
> task_unlock(p);
> } while_each_task_pid(session, PIDTYPE_SID, p);
> diff -puN fs/locks.c~fix-proc-fd-ops fs/locks.c
> --- linux-2.6.16-rcu/fs/locks.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
> +++ linux-2.6.16-rcu-dipankar/fs/locks.c 2006-04-12 21:06:24.000000000 +0530
> @@ -2212,7 +2212,12 @@ void steal_locks(fl_owner_t from)
>
> lock_kernel();
> j = 0;
> - rcu_read_lock();
> +
> + /*
> + * We are not taking a ref to the file structures, so
> + * we need to acquire ->file_lock.
> + */
> + spin_lock(&files->file_lock);
> fdt = files_fdtable(files);
> for (;;) {
> unsigned long set;
> @@ -2230,7 +2235,7 @@ void steal_locks(fl_owner_t from)
> set >>= 1;
> }
> }
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> unlock_kernel();
> }
> EXPORT_SYMBOL(steal_locks);
> diff -puN fs/proc/base.c~fix-proc-fd-ops fs/proc/base.c
> --- linux-2.6.16-rcu/fs/proc/base.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
> +++ linux-2.6.16-rcu-dipankar/fs/proc/base.c 2006-04-12 21:06:24.000000000 +0530
> @@ -294,16 +294,20 @@ static int proc_fd_link(struct inode *in
>
> files = get_files_struct(task);
> if (files) {
> - rcu_read_lock();
> + /*
> + * We are not taking a ref to the file structure, so we must
> + * hold ->file_lock.
> + */
> + spin_lock(&files->file_lock);
> file = fcheck_files(files, fd);
> if (file) {
> *mnt = mntget(file->f_vfsmnt);
> *dentry = dget(file->f_dentry);
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> return 0;
> }
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> }
> return -ENOENT;
> @@ -1485,7 +1489,12 @@ static struct dentry *proc_lookupfd(stru
> if (!files)
> goto out_unlock;
> inode->i_mode = S_IFLNK;
> - rcu_read_lock();
> +
> + /*
> + * We are not taking a ref to the file structure, so we must
> + * hold ->file_lock.
> + */
> + spin_lock(&files->file_lock);
> file = fcheck_files(files, fd);
> if (!file)
> goto out_unlock2;
> @@ -1493,7 +1502,7 @@ static struct dentry *proc_lookupfd(stru
> inode->i_mode |= S_IRUSR | S_IXUSR;
> if (file->f_mode & 2)
> inode->i_mode |= S_IWUSR | S_IXUSR;
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> inode->i_op = &proc_pid_link_inode_operations;
> inode->i_size = 64;
> @@ -1503,7 +1512,7 @@ static struct dentry *proc_lookupfd(stru
> return NULL;
>
> out_unlock2:
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> out_unlock:
> iput(inode);
>
> _
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] Fix file lookup without ref
@ 2006-04-23 23:15 Suzanne Wood
2006-04-28 16:17 ` Dipankar Sarma
0 siblings, 1 reply; 4+ messages in thread
From: Suzanne Wood @ 2006-04-23 23:15 UTC (permalink / raw)
To: dipankar; +Cc: linux-kernel, paulmck
Do you mind explaining what you mean by "don't hold a reference"
in the places you replace rcu_read_lock() with spin_lock() in
settings with nested fcheck_files() or files_fdtable() which
in turn call rcu_dereference()? How, for example, are the
occurrences in proc_readfd() and tid_fd_revalidate() in
fs/proc/base.c different? tid_fd_revalidate() doesn't make
a local assignment and has the FASTCALL put_files_struct, but
is there reasoning that proc_readfd() isn't similar to steal_locks()
in fs/locks.c?
Thanks.
Suzanne
> From: Dipankar Sarma 2006-04-12 18:43:06
>
> This patch fixes a problem with some places in the kernel where
> we look up file structure from the fd table but don't hold
> a reference to the file. Those places cannot be lock-free.
> These places aren't in fast path, so it is not a problem.
> I have tested this patch on powerpc and x86_64 using basic
> tests and ltp. We should aim to merge this for 2.6.17.
>
> Thanks
> Dipankar
>
>
> There are places in the kernel where we look up files in fd tables
> and access the file structure without holding references to the file.
> So, we need special care to avoid the race between
> looking up files in the fd table and tearing down of the file
> on another CPU. Otherwise, one might see a NULL f_dentry or
> such torn down version of the file. This patch fixes those
> special places where such a race may happen.
>
> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
> ---
>
>
> drivers/char/tty_io.c | 8 ++++++--
> fs/locks.c | 9 +++++++--
> fs/proc/base.c | 21 +++++++++++++++------
> 3 files changed, 28 insertions(+), 10 deletions(-)
>
> diff -puN drivers/char/tty_io.c~fix-proc-fd-ops drivers/char/tty_io.c
> --- linux-2.6.16-rcu/drivers/char/tty_io.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
> +++ linux-2.6.16-rcu-dipankar/drivers/char/tty_io.c 2006-04-12 21:06:24.000000000 +0530
> @@ -2706,7 +2706,11 @@ static void __do_SAK(void *arg)
> }
> task_lock(p);
> if (p->files) {
> - rcu_read_lock();
> + /*
> + * We don't take a ref to the file, so we must
> + * hold ->file_lock instead.
> + */
> + spin_lock(&p->files->file_lock);
> fdt = files_fdtable(p->files);
> for (i=0; i < fdt->max_fds; i++) {
> filp = fcheck_files(p->files, i);
> @@ -2721,7 +2725,7 @@ static void __do_SAK(void *arg)
> break;
> }
> }
> - rcu_read_unlock();
> + spin_unlock(&p->files->file_lock);
> }
> task_unlock(p);
> } while_each_task_pid(session, PIDTYPE_SID, p);
> diff -puN fs/locks.c~fix-proc-fd-ops fs/locks.c
> --- linux-2.6.16-rcu/fs/locks.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
> +++ linux-2.6.16-rcu-dipankar/fs/locks.c 2006-04-12 21:06:24.000000000 +0530
> @@ -2212,7 +2212,12 @@ void steal_locks(fl_owner_t from)
>
> lock_kernel();
> j = 0;
> - rcu_read_lock();
> +
> + /*
> + * We are not taking a ref to the file structures, so
> + * we need to acquire ->file_lock.
> + */
> + spin_lock(&files->file_lock);
> fdt = files_fdtable(files);
> for (;;) {
> unsigned long set;
> @@ -2230,7 +2235,7 @@ void steal_locks(fl_owner_t from)
> set >>= 1;
> }
> }
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> unlock_kernel();
> }
> EXPORT_SYMBOL(steal_locks);
> diff -puN fs/proc/base.c~fix-proc-fd-ops fs/proc/base.c
> --- linux-2.6.16-rcu/fs/proc/base.c~fix-proc-fd-ops 2006-04-12 21:06:24.000000000 +0530
> +++ linux-2.6.16-rcu-dipankar/fs/proc/base.c 2006-04-12 21:06:24.000000000 +0530
> @@ -294,16 +294,20 @@ static int proc_fd_link(struct inode *in
>
> files = get_files_struct(task);
> if (files) {
> - rcu_read_lock();
> + /*
> + * We are not taking a ref to the file structure, so we must
> + * hold ->file_lock.
> + */
> + spin_lock(&files->file_lock);
> file = fcheck_files(files, fd);
> if (file) {
> *mnt = mntget(file->f_vfsmnt);
> *dentry = dget(file->f_dentry);
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> return 0;
> }
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> }
> return -ENOENT;
> @@ -1485,7 +1489,12 @@ static struct dentry *proc_lookupfd(stru
> if (!files)
> goto out_unlock;
> inode->i_mode = S_IFLNK;
> - rcu_read_lock();
> +
> + /*
> + * We are not taking a ref to the file structure, so we must
> + * hold ->file_lock.
> + */
> + spin_lock(&files->file_lock);
> file = fcheck_files(files, fd);
> if (!file)
> goto out_unlock2;
> @@ -1493,7 +1502,7 @@ static struct dentry *proc_lookupfd(stru
> inode->i_mode |= S_IRUSR | S_IXUSR;
> if (file->f_mode & 2)
> inode->i_mode |= S_IWUSR | S_IXUSR;
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> inode->i_op = &proc_pid_link_inode_operations;
> inode->i_size = 64;
> @@ -1503,7 +1512,7 @@ static struct dentry *proc_lookupfd(stru
> return NULL;
>
> out_unlock2:
> - rcu_read_unlock();
> + spin_unlock(&files->file_lock);
> put_files_struct(files);
> out_unlock:
> iput(inode);
>
> _
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] Fix file lookup without ref
2006-04-23 23:15 Suzanne Wood
@ 2006-04-28 16:17 ` Dipankar Sarma
0 siblings, 0 replies; 4+ messages in thread
From: Dipankar Sarma @ 2006-04-28 16:17 UTC (permalink / raw)
To: Suzanne Wood; +Cc: linux-kernel, paulmck
Hi Suzanne,
Sorry about the late reply, I have been offline for a while.
On Sun, Apr 23, 2006 at 04:15:18PM -0700, Suzanne Wood wrote:
> Do you mind explaining what you mean by "don't hold a reference"
> in the places you replace rcu_read_lock() with spin_lock() in
> settings with nested fcheck_files() or files_fdtable() which
> in turn call rcu_dereference()? How, for example, are the
Well, we use different methods of reference counting with
RCU based objects and fd table is one of those. With the
fd table, when you look up a file without holding
the fd table spinlock, the file structure you get may
be getting torn down on another CPU. We can safely
do this only if we *successfully* increment the reference
count of the file structure using the atomic_inc_not_zero()
primitive, which is based on cmpxchg. If atomic_inc_not_zero()
fails, we assume that the reference count of the file
structure has become zero and that it is being destroyed.
If atomic_inc_not_zero() was successful, then we
"hold" a reference to the file structure and it is
safe to access it.
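The increment-if-not-zero idea described above can be sketched in
user space with C11 atomics. This is only an illustration, not the
kernel implementation: struct obj, obj_get_not_zero() and obj_put()
are hypothetical names standing in for the file structure and its
reference helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object such as a file. */
struct obj {
	atomic_int count;	/* 0 means the object is being torn down */
};

/*
 * Take a reference only if the count is currently non-zero.
 * Returns true when a reference was taken, false when the
 * object was already on its way to being destroyed.
 */
static bool obj_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken with obj_get_not_zero(). */
static void obj_put(struct obj *o)
{
	atomic_fetch_sub(&o->count, 1);
}
```

A lock-free reader would use the object only when obj_get_not_zero()
returns true and would call obj_put() when done; on false it treats
the slot as empty.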
> occurrences in proc_readfd() and tid_fd_revalidate() in
> fs/proc/base.c different? tid_fd_revalidate() doesn't make
> a local assignment and has the FASTCALL put_files_struct, but
> is there reasoning that proc_readfd() isn't similar to steal_locks()
> in fs/locks.c?
In both proc_readfd() and tid_fd_revalidate(), we don't access
the file structure itself in the lock-free section. We just
check if the file exists or not in the fd table. Worst case,
we may see stale data in /proc.
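As a rough user-space illustration of why an existence check is safe
without a reference: the reader only loads the pointer from the table
and compares it against NULL, and never dereferences it. struct table,
NR_SLOTS and slot_in_use() are hypothetical names for this sketch, not
kernel interfaces.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical table size */

struct obj;		/* opaque here: never dereferenced */

struct table {
	_Atomic(struct obj *) slot[NR_SLOTS];
};

/*
 * Report whether a slot is populated. The object itself is not
 * touched, so a concurrent teardown can at worst make the answer
 * stale; it cannot cause an access to freed memory.
 */
static bool slot_in_use(struct table *t, size_t i)
{
	if (i >= NR_SLOTS)
		return false;
	return atomic_load(&t->slot[i]) != NULL;
}
```

Anything beyond this check, such as reading a field of the object,
would need either the lock or a successfully taken reference.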
Thanks
Dipankar
^ permalink raw reply [flat|nested] 4+ messages in thread