* Re: Fw: two 2.6.13-rc3-mm3 oddities
[not found] <20050803095644.78b58cb4.akpm@osdl.org>
@ 2005-08-08 14:05 ` Dipankar Sarma
2005-08-08 15:26 ` Hugh Dickins
2005-08-08 16:31 ` Manfred Spraul
0 siblings, 2 replies; 6+ messages in thread
From: Dipankar Sarma @ 2005-08-08 14:05 UTC (permalink / raw)
To: Andrew Morton; +Cc: hugh, Manfred Spraul, linux-kernel
I am ccing this to linux-kernel for a wider audience.
On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton wrote:
>
> Subject: two 2.6.13-rc3-mm3 oddities
>
> Just wanted to record a couple of oddities I noticed with 2.6.13-rc3-mm3
> (maybe there before: I hardly tested -mm1 and didn't even download -mm2),
> which have gone away in 2.6.13-rc4-mm1 - so of no great importance, but
> perhaps worth noting in case they resurface later.
>
> One time my tmpfs-and-looped-tmpfs-kernel-builds collapsed with lots of
> VFS: file-max limit 49778 reached
> messages, which I imagine was a side-effect of the struct file RCU
> patches you've dropped from -rc4-mm1, perhaps a grace period problem.
Hugh, could you please try this with the experimental patch below ?
Manfred, is it safe to decrement nr_files in file_free()
instead of the destructor ? I can't see any problem.
>
> And repeatably (after a couple of hours or so), on both i386 and x86_64,
> those tests hung waiting on page locks, as if there was a missing
> unlock_page (or a spurious SetPageLocked) somewhere. I didn't have
> time to do any bisection, but studying source diffs showed no likely
> candidates whatsoever (I did wonder about Nick's __ClearPageDirty,
> but reproduced the problem with that backed out). As I say, gone
> away (same tests on -rc4-mm1 have run two days), but curious.
Hugh, can you mail me the exact test you run ? I would like to run
it myself and see if I can reproduce it.
Thanks
Dipankar
nr_files need to decremented before the file structure is handed
over to RCU.
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
---
fs/file_table.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff -puN fs/file_table.c~files-fix-nr-files fs/file_table.c
--- linux-2.6.13-rc3-mm3-fixes/fs/file_table.c~files-fix-nr-files 2005-08-08 18:28:12.000000000 +0530
+++ linux-2.6.13-rc3-mm3-fixes-dipankar/fs/file_table.c 2005-08-08 18:32:31.000000000 +0530
@@ -48,10 +48,6 @@ void filp_ctor(void * objp, struct kmem_
void filp_dtor(void * objp, struct kmem_cache_s *cachep, unsigned long dflags)
{
- unsigned long flags;
- spin_lock_irqsave(&filp_count_lock, flags);
- files_stat.nr_files--;
- spin_unlock_irqrestore(&filp_count_lock, flags);
}
static inline void file_free_rcu(struct rcu_head *head)
@@ -62,6 +58,10 @@ static inline void file_free_rcu(struct
static inline void file_free(struct file *f)
{
+ unsigned long flags;
+ spin_lock_irqsave(&filp_count_lock, flags);
+ files_stat.nr_files--;
+ spin_unlock_irqrestore(&filp_count_lock, flags);
call_rcu(&f->f_rcuhead, file_free_rcu);
}
_
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities
2005-08-08 14:05 ` Fw: two 2.6.13-rc3-mm3 oddities Dipankar Sarma
@ 2005-08-08 15:26 ` Hugh Dickins
2005-08-08 16:31 ` Manfred Spraul
1 sibling, 0 replies; 6+ messages in thread
From: Hugh Dickins @ 2005-08-08 15:26 UTC (permalink / raw)
To: Dipankar Sarma; +Cc: Andrew Morton, Manfred Spraul, linux-kernel
On Mon, 8 Aug 2005, Dipankar Sarma wrote:
> On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton forwarded from Hugh:
> >
> > Subject: two 2.6.13-rc3-mm3 oddities
> >
> > One time my tmpfs-and-looped-tmpfs-kernel-builds collapsed with lots of
> > VFS: file-max limit 49778 reached
> > messages, which I imagine was a side-effect of the struct file RCU
> > patches you've dropped from -rc4-mm1, perhaps a grace period problem.
>
> Hugh, could you please try this with the experimental patch below ?
I'll give it a go. But first I ought to try to reproduce the problem
more easily - it happened after about two hours of a test which had run
for two or more hours (before hitting the apparently unrelated problem)
at least twice each on at least two machines. So until I can reproduce
the problem more easily, it'll take some while to have confidence that
your patch addresses it. Therefore worth a little effort at my end to
cut it down (and if I can't cut it down, I may have to solve the odder
locked page issue to get the original test to run long enough to hit
the file-max problem with any likelihood).
> Hugh, can you mail me the exact test you run ? I would like to run
> it myself and see if I can reproduce it.
It'll waste less of both our time if I at least try to reproduce it
more quickly first. Hold on...
Hugh
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities
2005-08-08 14:05 ` Fw: two 2.6.13-rc3-mm3 oddities Dipankar Sarma
2005-08-08 15:26 ` Hugh Dickins
@ 2005-08-08 16:31 ` Manfred Spraul
2005-08-08 16:46 ` Dipankar Sarma
1 sibling, 1 reply; 6+ messages in thread
From: Manfred Spraul @ 2005-08-08 16:31 UTC (permalink / raw)
To: dipankar; +Cc: Andrew Morton, hugh, linux-kernel
Dipankar Sarma wrote:
>Hugh, could you please try this with the experimental patch below ?
>Manfred, is it safe to decrement nr_files in file_free()
>instead of the destructor ? I can't see any problem.
>
>
>
The ctor/dtor are only called when new objects are created, not on every
kmem_cache_alloc/kmem_cache_free. Thus I would expect that the counter
becomes negative on builds without CONFIG_DEBUG_SLAB.
Thus increase in the ctor and decrease in file_free() is the wrong
thing. If you want to move the decrease from the dtor to file_free, then
you must move the increase, too.
But: IIRC the counters were moved to the ctor/dtor for performance
reasons, I'd guess mbligh ran into cache line trashing on the
filp_count_lock spinlock with reaim or something like that.
--
Manfred
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities
2005-08-08 16:31 ` Manfred Spraul
@ 2005-08-08 16:46 ` Dipankar Sarma
2005-08-08 17:25 ` Andrew Morton
0 siblings, 1 reply; 6+ messages in thread
From: Dipankar Sarma @ 2005-08-08 16:46 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Andrew Morton, linux-kernel, Hugh Dickins
On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote:
> Dipankar Sarma wrote:
>
> >Hugh, could you please try this with the experimental patch below ?
> >Manfred, is it safe to decrement nr_files in file_free()
> >instead of the destructor ? I can't see any problem.
> >
> >
> >
> The ctor/dtor are only called when new objects are created, not on every
> kmem_cache_alloc/kmem_cache_free. Thus I would expect that the counter
> becomes negative on builds without CONFIG_DEBUG_SLAB.
> Thus increase in the ctor and decrease in file_free() is the wrong
> thing. If you want to move the decrease from the dtor to file_free, then
> you must move the increase, too.
> But: IIRC the counters were moved to the ctor/dtor for performance
> reasons, I'd guess mbligh ran into cache line trashing on the
> filp_count_lock spinlock with reaim or something like that.
Ah, so the whole idea was to inc/dec nr_files less often so
that we reduce contention on filp_count_lock, right ? This however
causes skews nr_files by the size of the slab array, AFAICS.
Since we check nr_files before we allocate files from slab, the
check seems inaccurate.
Anyway, I guess, I need to look at scaling the file counting
first.
Thanks
Dipankar
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities
2005-08-08 16:46 ` Dipankar Sarma
@ 2005-08-08 17:25 ` Andrew Morton
2005-08-08 18:03 ` Dipankar Sarma
0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2005-08-08 17:25 UTC (permalink / raw)
To: dipankar; +Cc: manfred, linux-kernel, hugh
Dipankar Sarma <dipankar@in.ibm.com> wrote:
>
> On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote:
> > Dipankar Sarma wrote:
> >
> > >Hugh, could you please try this with the experimental patch below ?
> > >Manfred, is it safe to decrement nr_files in file_free()
> > >instead of the destructor ? I can't see any problem.
> > >
> > >
> > >
> > The ctor/dtor are only called when new objects are created, not on every
> > kmem_cache_alloc/kmem_cache_free. Thus I would expect that the counter
> > becomes negative on builds without CONFIG_DEBUG_SLAB.
> > Thus increase in the ctor and decrease in file_free() is the wrong
> > thing. If you want to move the decrease from the dtor to file_free, then
> > you must move the increase, too.
> > But: IIRC the counters were moved to the ctor/dtor for performance
> > reasons, I'd guess mbligh ran into cache line trashing on the
> > filp_count_lock spinlock with reaim or something like that.
>
> Ah, so the whole idea was to inc/dec nr_files less often so
> that we reduce contention on filp_count_lock, right ? This however
> causes skews nr_files by the size of the slab array, AFAICS.
> Since we check nr_files before we allocate files from slab, the
> check seems inaccurate.
>
> Anyway, I guess, I need to look at scaling the file counting
> first.
Something like vm_acct_memory() or percpu_counter would suit.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities
2005-08-08 17:25 ` Andrew Morton
@ 2005-08-08 18:03 ` Dipankar Sarma
0 siblings, 0 replies; 6+ messages in thread
From: Dipankar Sarma @ 2005-08-08 18:03 UTC (permalink / raw)
To: Andrew Morton; +Cc: manfred, linux-kernel, hugh
On Mon, Aug 08, 2005 at 10:25:59AM -0700, Andrew Morton wrote:
> Dipankar Sarma <dipankar@in.ibm.com> wrote:
> >
> > > But: IIRC the counters were moved to the ctor/dtor for performance
> > > reasons, I'd guess mbligh ran into cache line trashing on the
> > > filp_count_lock spinlock with reaim or something like that.
> >
> > Ah, so the whole idea was to inc/dec nr_files less often so
> > that we reduce contention on filp_count_lock, right ? This however
> > causes skews nr_files by the size of the slab array, AFAICS.
> > Since we check nr_files before we allocate files from slab, the
> > check seems inaccurate.
> >
> > Anyway, I guess, I need to look at scaling the file counting
> > first.
>
> Something like vm_acct_memory() or percpu_counter would suit.
Yes, that is what I am doing, except that because of the sysctl
stuff, I now have to wallow myself in /proc code.
Thanks
Dipankar
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2005-08-08 22:59 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20050803095644.78b58cb4.akpm@osdl.org>
2005-08-08 14:05 ` Fw: two 2.6.13-rc3-mm3 oddities Dipankar Sarma
2005-08-08 15:26 ` Hugh Dickins
2005-08-08 16:31 ` Manfred Spraul
2005-08-08 16:46 ` Dipankar Sarma
2005-08-08 17:25 ` Andrew Morton
2005-08-08 18:03 ` Dipankar Sarma
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox