public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
@ 2006-04-11 16:36 David Howells
  2006-04-11 22:05 ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2006-04-11 16:36 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel


Make the kernel use atomic operations for files_stat.nr_files accounting
rather than using a spinlocked and interrupt-disabled critical section.  This
should improve reduce the latency of the critical section.

This patch is slightly problematic for a couple of reasons:

 (1) The sysctl code then implicitly casts the atomic value to an int, which
     I'm not sure is permissible.

 (2) linux/fs.h only includes asm/atomic.h if __KERNEL__ is defined, but the
     files_stat_struct is declared whether or not __KERNEL__ is defined.  To
     deal with this I've made the type of the nr_files member change to int if
     necessary.

     However, I'm not sure there's any need to present files_stat_struct to
     non-kernel code.

Additionally, I'm not sure that doing the accounting in the slab constructor
and destructor is a good idea: if ENFILE is returned, and then a whole bunch
of files are freed up thus permitting the next open to theoretically avoid
ENFILE, what's to say that the slab has actually recycled all those objects?

Under some conditions the destructors will only be invoked when the actual
slab pages are released, in which case the ENFILE condition won't necessarily
cease to be reported, even if it has actually gone away.

Consider where the files_cache has more than one object per slab, and after an
ENFILE condition a whole bunch of files are released, such as to leave all the
slab pages still allocated, with one struct file allocated in each...


Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat -p1 /tmp/file-nr.diff
 fs/file_table.c    |   25 +++++++++----------------
 include/linux/fs.h |    8 +++++++-
 2 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 1008050..67b0084 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -22,7 +22,8 @@
 
 /* sysctl tunables... */
 struct files_stat_struct files_stat = {
-	.max_files = NR_FILE
+	.max_files	= NR_FILE,
+	.nr_files	= ATOMIC_INIT(0),
 };
 
 EXPORT_SYMBOL(files_stat); /* Needed by unix.o */
@@ -33,26 +34,18 @@ EXPORT_SYMBOL(files_stat); /* Needed by 
 static DEFINE_SPINLOCK(filp_count_lock);
 
 /* slab constructors and destructors are called from arbitrary
- * context and must be fully threaded - use a local spinlock
- * to protect files_stat.nr_files
+ * context and must be fully threaded
  */
 void filp_ctor(void *objp, struct kmem_cache *cachep, unsigned long cflags)
 {
 	if ((cflags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) ==
-	    SLAB_CTOR_CONSTRUCTOR) {
-		unsigned long flags;
-		spin_lock_irqsave(&filp_count_lock, flags);
-		files_stat.nr_files++;
-		spin_unlock_irqrestore(&filp_count_lock, flags);
-	}
+	    SLAB_CTOR_CONSTRUCTOR)
+		atomic_inc(&files_stat.nr_files);
 }
 
 void filp_dtor(void *objp, struct kmem_cache *cachep, unsigned long dflags)
 {
-	unsigned long flags;
-	spin_lock_irqsave(&filp_count_lock, flags);
-	files_stat.nr_files--;
-	spin_unlock_irqrestore(&filp_count_lock, flags);
+	atomic_dec(&files_stat.nr_files);
 }
 
 static inline void file_free_rcu(struct rcu_head *head)
@@ -78,7 +71,7 @@ struct file *get_empty_filp(void)
 	/*
 	 * Privileged users can go above max_files
 	 */
-	if (files_stat.nr_files >= files_stat.max_files &&
+	if (atomic_read(&files_stat.nr_files) >= files_stat.max_files &&
 				!capable(CAP_SYS_ADMIN))
 		goto over;
 
@@ -101,10 +94,10 @@ struct file *get_empty_filp(void)
 
 over:
 	/* Ran out of filps - report that */
-	if (files_stat.nr_files > old_max) {
+	if (atomic_read(&files_stat.nr_files) > old_max) {
 		printk(KERN_INFO "VFS: file-max limit %d reached\n",
 					files_stat.max_files);
-		old_max = files_stat.nr_files;
+		old_max = atomic_read(&files_stat.nr_files);
 	}
 	goto fail;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cc9ecc0..533fd18 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -9,6 +9,9 @@
 #include <linux/config.h>
 #include <linux/limits.h>
 #include <linux/ioctl.h>
+#ifdef __KERNEL__
+#include <asm/atomic.h>
+#endif
 
 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -30,7 +33,11 @@
 
 /* And dynamically-tunable limits and defaults: */
 struct files_stat_struct {
+#ifdef __KERNEL__
+	atomic_t nr_files;	/* read only */
+#else
 	int nr_files;		/* read only */
+#endif
 	int nr_free_files;	/* read only */
 	int max_files;		/* tunable */
 };
@@ -218,7 +225,6 @@ extern int dir_notify_enable;
 #include <linux/sched.h>
 #include <linux/mutex.h>
 
-#include <asm/atomic.h>
 #include <asm/semaphore.h>
 #include <asm/byteorder.h>
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
  2006-04-11 16:36 [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq David Howells
@ 2006-04-11 22:05 ` Andrew Morton
  2006-04-12 10:56   ` David Howells
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-04-11 22:05 UTC (permalink / raw)
  To: David Howells; +Cc: torvalds, linux-kernel

David Howells <dhowells@redhat.com> wrote:
>
> Make the kernel use atomic operations for files_stat.nr_files accounting
> rather than using a spinlocked and interrupt-disabled critical section.

This code has all been redone in current kernels.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
  2006-04-11 22:05 ` Andrew Morton
@ 2006-04-12 10:56   ` David Howells
  2006-04-12 17:48     ` Trond Myklebust
  0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2006-04-12 10:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Howells, torvalds, linux-kernel

Andrew Morton <akpm@osdl.org> wrote:

> > Make the kernel use atomic operations for files_stat.nr_files accounting
> > rather than using a spinlocked and interrupt-disabled critical section.
> 
> This code has all been redone in current kernels.

Hmmm... So it has. Trond's tree hasn't caught up yet, which is a bit of a
problem:-/

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
  2006-04-12 10:56   ` David Howells
@ 2006-04-12 17:48     ` Trond Myklebust
  2006-04-12 18:19       ` David Howells
  0 siblings, 1 reply; 6+ messages in thread
From: Trond Myklebust @ 2006-04-12 17:48 UTC (permalink / raw)
  To: David Howells; +Cc: Andrew Morton, torvalds, linux-kernel

On Wed, 2006-04-12 at 11:56 +0100, David Howells wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> 
> > > Make the kernel use atomic operations for files_stat.nr_files accounting
> > > rather than using a spinlocked and interrupt-disabled critical section.
> > 
> > This code has all been redone in current kernels.
> 
> Hmmm... So it has. Trond's tree hasn't caught up yet, which is a bit of a
> problem:-/

I've been updating the NFS git tree on a daily basis. I'm not going to
begin pulling from the -mm tree, though.

Cheers,
  Trond


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
  2006-04-12 17:48     ` Trond Myklebust
@ 2006-04-12 18:19       ` David Howells
  2006-04-13  3:38         ` Trond Myklebust
  0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2006-04-12 18:19 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: David Howells, Andrew Morton, torvalds, linux-kernel

Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> I've been updating the NFS git tree on a daily basis. I'm not going to
> begin pulling from the -mm tree, though.

The thing to which I refer is in Linus's tree but wasn't in yours this
morning.  Unfortunately, this means my patch has to be different, depending on
whose tree I'm using... although your tree has been updated since then, so the
difference seems to have gone away.

I don't mean to be critical of your efforts, but the requirement imposed by
Andrew that I have to base my tree on yours makes things a little tricky
sometimes:-/

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
  2006-04-12 18:19       ` David Howells
@ 2006-04-13  3:38         ` Trond Myklebust
  0 siblings, 0 replies; 6+ messages in thread
From: Trond Myklebust @ 2006-04-13  3:38 UTC (permalink / raw)
  To: David Howells; +Cc: Andrew Morton, torvalds, linux-kernel

On Wed, 2006-04-12 at 19:19 +0100, David Howells wrote:
> Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> 
> > I've been updating the NFS git tree on a daily basis. I'm not going to
> > begin pulling from the -mm tree, though.
> 
> The thing to which I refer is in Linus's tree but wasn't in yours this
> morning.  Unfortunately, this means my patch has to be different, depending on
> whose tree I'm using... although your tree has been updated since then, so the
> difference seems to have gone away.
> 
> I don't mean to be critical of your efforts, but the requirement imposed by
> Andrew that I have to base my tree on yours makes things a little tricky
> sometimes:-/

You can always pull from Linus too. There is nothing within git that
forces you to wait for me to pull in his changes.

Cheers,
  Trond


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-04-13  3:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-11 16:36 [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq David Howells
2006-04-11 22:05 ` Andrew Morton
2006-04-12 10:56   ` David Howells
2006-04-12 17:48     ` Trond Myklebust
2006-04-12 18:19       ` David Howells
2006-04-13  3:38         ` Trond Myklebust

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox