* [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
@ 2006-04-11 16:36 David Howells
2006-04-11 22:05 ` Andrew Morton
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2006-04-11 16:36 UTC (permalink / raw)
To: torvalds, akpm; +Cc: linux-kernel
Make the kernel use atomic operations for files_stat.nr_files accounting
rather than using a spinlocked and interrupt-disabled critical section. This
should improve reduce the latency of the critical section.
This patch is slightly problematic for a couple of reasons:
(1) The sysctl code then implicitly casts the atomic value to an int, which
I'm not sure is permissible.
(2) linux/fs.h only includes asm/atomic.h if __KERNEL__ is defined, but the
files_stat_struct is declared whether or not __KERNEL__ is defined. To
deal with this I've made the type of the nr_files member change to int if
necessary.
However, I'm not sure there's any need to present files_stat_struct to
non-kernel code.
Additionally, I'm not sure that doing the accounting in the slab constructor
and destructor is a good idea: if ENFILE is returned, and then a whole bunch
of files are freed up thus permitting the next open to theoretically avoid
ENFILE, what's to say that the slab has actually recycled all those objects?
Under some conditions the destructors will only be invoked when the actual
slab pages are released, in which case the ENFILE condition won't necessarily
cease to be reported, even if it has actually gone away.
Consider where the files_cache has more than one object per slab, and after an
ENFILE condition a whole bunch of files are released, such as to leave all the
slab pages still allocated, with one struct file allocated in each...
Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat -p1 /tmp/file-nr.diff
fs/file_table.c | 25 +++++++++----------------
include/linux/fs.h | 8 +++++++-
2 files changed, 16 insertions(+), 17 deletions(-)
diff --git a/fs/file_table.c b/fs/file_table.c
index 1008050..67b0084 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -22,7 +22,8 @@
/* sysctl tunables... */
struct files_stat_struct files_stat = {
- .max_files = NR_FILE
+ .max_files = NR_FILE,
+ .nr_files = ATOMIC_INIT(0),
};
EXPORT_SYMBOL(files_stat); /* Needed by unix.o */
@@ -33,26 +34,18 @@ EXPORT_SYMBOL(files_stat); /* Needed by
static DEFINE_SPINLOCK(filp_count_lock);
/* slab constructors and destructors are called from arbitrary
- * context and must be fully threaded - use a local spinlock
- * to protect files_stat.nr_files
+ * context and must be fully threaded
*/
void filp_ctor(void *objp, struct kmem_cache *cachep, unsigned long cflags)
{
if ((cflags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) ==
- SLAB_CTOR_CONSTRUCTOR) {
- unsigned long flags;
- spin_lock_irqsave(&filp_count_lock, flags);
- files_stat.nr_files++;
- spin_unlock_irqrestore(&filp_count_lock, flags);
- }
+ SLAB_CTOR_CONSTRUCTOR)
+ atomic_inc(&files_stat.nr_files);
}
void filp_dtor(void *objp, struct kmem_cache *cachep, unsigned long dflags)
{
- unsigned long flags;
- spin_lock_irqsave(&filp_count_lock, flags);
- files_stat.nr_files--;
- spin_unlock_irqrestore(&filp_count_lock, flags);
+ atomic_dec(&files_stat.nr_files);
}
static inline void file_free_rcu(struct rcu_head *head)
@@ -78,7 +71,7 @@ struct file *get_empty_filp(void)
/*
* Privileged users can go above max_files
*/
- if (files_stat.nr_files >= files_stat.max_files &&
+ if (atomic_read(&files_stat.nr_files) >= files_stat.max_files &&
!capable(CAP_SYS_ADMIN))
goto over;
@@ -101,10 +94,10 @@ struct file *get_empty_filp(void)
over:
/* Ran out of filps - report that */
- if (files_stat.nr_files > old_max) {
+ if (atomic_read(&files_stat.nr_files) > old_max) {
printk(KERN_INFO "VFS: file-max limit %d reached\n",
files_stat.max_files);
- old_max = files_stat.nr_files;
+ old_max = atomic_read(&files_stat.nr_files);
}
goto fail;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cc9ecc0..533fd18 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -9,6 +9,9 @@
#include <linux/config.h>
#include <linux/limits.h>
#include <linux/ioctl.h>
+#ifdef __KERNEL__
+#include <asm/atomic.h>
+#endif
/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -30,7 +33,11 @@
/* And dynamically-tunable limits and defaults: */
struct files_stat_struct {
+#ifdef __KERNEL__
+ atomic_t nr_files; /* read only */
+#else
int nr_files; /* read only */
+#endif
int nr_free_files; /* read only */
int max_files; /* tunable */
};
@@ -218,7 +225,6 @@ extern int dir_notify_enable;
#include <linux/sched.h>
#include <linux/mutex.h>
-#include <asm/atomic.h>
#include <asm/semaphore.h>
#include <asm/byteorder.h>
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
2006-04-11 16:36 [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq David Howells
@ 2006-04-11 22:05 ` Andrew Morton
2006-04-12 10:56 ` David Howells
0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-04-11 22:05 UTC (permalink / raw)
To: David Howells; +Cc: torvalds, linux-kernel
David Howells <dhowells@redhat.com> wrote:
>
> Make the kernel use atomic operations for files_stat.nr_files accounting
> rather than using a spinlocked and interrupt-disabled critical section.
This code has all been redone in current kernels.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
2006-04-11 22:05 ` Andrew Morton
@ 2006-04-12 10:56 ` David Howells
2006-04-12 17:48 ` Trond Myklebust
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2006-04-12 10:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: David Howells, torvalds, linux-kernel
Andrew Morton <akpm@osdl.org> wrote:
> > Make the kernel use atomic operations for files_stat.nr_files accounting
> > rather than using a spinlocked and interrupt-disabled critical section.
>
> This code has all been redone in current kernels.
Hmmm... So it has. Trond's tree hasn't caught up yet, which is a bit of a
problem:-/
David
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
2006-04-12 10:56 ` David Howells
@ 2006-04-12 17:48 ` Trond Myklebust
2006-04-12 18:19 ` David Howells
0 siblings, 1 reply; 6+ messages in thread
From: Trond Myklebust @ 2006-04-12 17:48 UTC (permalink / raw)
To: David Howells; +Cc: Andrew Morton, torvalds, linux-kernel
On Wed, 2006-04-12 at 11:56 +0100, David Howells wrote:
> Andrew Morton <akpm@osdl.org> wrote:
>
> > > Make the kernel use atomic operations for files_stat.nr_files accounting
> > > rather than using a spinlocked and interrupt-disabled critical section.
> >
> > This code has all been redone in current kernels.
>
> Hmmm... So it has. Trond's tree hasn't caught up yet, which is a bit of a
> problem:-/
I've been updating the NFS git tree on a daily basis. I'm not going to
begin pulling from the -mm tree, though.
Cheers,
Trond
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
2006-04-12 17:48 ` Trond Myklebust
@ 2006-04-12 18:19 ` David Howells
2006-04-13 3:38 ` Trond Myklebust
0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2006-04-12 18:19 UTC (permalink / raw)
To: Trond Myklebust; +Cc: David Howells, Andrew Morton, torvalds, linux-kernel
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> I've been updating the NFS git tree on a daily basis. I'm not going to
> begin pulling from the -mm tree, though.
The thing to which I refer is in Linus's tree but wasn't in yours this
morning. Unfortunately, this means my patch has to be different, depending on
whose tree I'm using... although your tree has been updated since then, so the
difference seems to have gone away.
I don't mean to be critical of your efforts, but the requirement imposed by
Andrew that I have to base my tree on yours makes things a little tricky
sometimes:-/
David
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq
2006-04-12 18:19 ` David Howells
@ 2006-04-13 3:38 ` Trond Myklebust
0 siblings, 0 replies; 6+ messages in thread
From: Trond Myklebust @ 2006-04-13 3:38 UTC (permalink / raw)
To: David Howells; +Cc: Andrew Morton, torvalds, linux-kernel
On Wed, 2006-04-12 at 19:19 +0100, David Howells wrote:
> Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> > I've been updating the NFS git tree on a daily basis. I'm not going to
> > begin pulling from the -mm tree, though.
>
> The thing to which I refer is in Linus's tree but wasn't in yours this
> morning. Unfortunately, this means my patch has to be different, depending on
> whose tree I'm using... although your tree has been updated since then, so the
> difference seems to have gone away.
>
> I don't mean to be critical of your efforts, but the requirement imposed by
> Andrew that I have to base my tree on yours makes things a little tricky
> sometimes:-/
You can always pull from Linus too. There is nothing within git that
forces you to wait for me to pull in his changes.
Cheers,
Trond
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2006-04-13 3:38 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-11 16:36 [PATCH] Use atomic ops for file_nr accounting, not spinlock+irq David Howells
2006-04-11 22:05 ` Andrew Morton
2006-04-12 10:56 ` David Howells
2006-04-12 17:48 ` Trond Myklebust
2006-04-12 18:19 ` David Howells
2006-04-13 3:38 ` Trond Myklebust
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox