* Re: Fw: two 2.6.13-rc3-mm3 oddities [not found] <20050803095644.78b58cb4.akpm@osdl.org> @ 2005-08-08 14:05 ` Dipankar Sarma 2005-08-08 15:26 ` Hugh Dickins 2005-08-08 16:31 ` Manfred Spraul 0 siblings, 2 replies; 6+ messages in thread From: Dipankar Sarma @ 2005-08-08 14:05 UTC (permalink / raw) To: Andrew Morton; +Cc: hugh, Manfred Spraul, linux-kernel I am ccing this to linux-kernel for a wider audience. On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton wrote: > > Subject: two 2.6.13-rc3-mm3 oddities > > Just wanted to record a couple of oddities I noticed with 2.6.13-rc3-mm3 > (maybe there before: I hardly tested -mm1 and didn't even download -mm2), > which have gone away in 2.6.13-rc4-mm1 - so of no great importance, but > perhaps worth noting in case they resurface later. > > One time my tmpfs-and-looped-tmpfs-kernel-builds collapsed with lots of > VFS: file-max limit 49778 reached > messages, which I imagine was a side-effect of the struct file RCU > patches you've dropped from -rc4-mm1, perhaps a grace period problem. Hugh, could you please try this with the experimental patch below ? Manfred, is it safe to decrement nr_files in file_free() instead of the destructor ? I can't see any problem. > > And repeatably (after a couple of hours or so), on both i386 and x86_64, > those tests hung waiting on page locks, as if there was a missing > unlock_page (or a spurious SetPageLocked) somewhere. I didn't have > time to do any bisection, but studying source diffs showed no likely > candidates whatsoever (I did wonder about Nick's __ClearPageDirty, > but reproduced the problem with that backed out). As I say, gone > away (same tests on -rc4-mm1 have run two days), but curious. Hugh, can you mail me the exact test you run ? I would like to run it myself and see if I can reproduce it. Thanks Dipankar nr_files need to decremented before the file structure is handed over to RCU. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> --- fs/file_table.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff -puN fs/file_table.c~files-fix-nr-files fs/file_table.c --- linux-2.6.13-rc3-mm3-fixes/fs/file_table.c~files-fix-nr-files 2005-08-08 18:28:12.000000000 +0530 +++ linux-2.6.13-rc3-mm3-fixes-dipankar/fs/file_table.c 2005-08-08 18:32:31.000000000 +0530 @@ -48,10 +48,6 @@ void filp_ctor(void * objp, struct kmem_ void filp_dtor(void * objp, struct kmem_cache_s *cachep, unsigned long dflags) { - unsigned long flags; - spin_lock_irqsave(&filp_count_lock, flags); - files_stat.nr_files--; - spin_unlock_irqrestore(&filp_count_lock, flags); } static inline void file_free_rcu(struct rcu_head *head) @@ -62,6 +58,10 @@ static inline void file_free_rcu(struct static inline void file_free(struct file *f) { + unsigned long flags; + spin_lock_irqsave(&filp_count_lock, flags); + files_stat.nr_files--; + spin_unlock_irqrestore(&filp_count_lock, flags); call_rcu(&f->f_rcuhead, file_free_rcu); } _ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities 2005-08-08 14:05 ` Fw: two 2.6.13-rc3-mm3 oddities Dipankar Sarma @ 2005-08-08 15:26 ` Hugh Dickins 2005-08-08 16:31 ` Manfred Spraul 1 sibling, 0 replies; 6+ messages in thread From: Hugh Dickins @ 2005-08-08 15:26 UTC (permalink / raw) To: Dipankar Sarma; +Cc: Andrew Morton, Manfred Spraul, linux-kernel On Mon, 8 Aug 2005, Dipankar Sarma wrote: > On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton forwarded from Hugh: > > > > Subject: two 2.6.13-rc3-mm3 oddities > > > > One time my tmpfs-and-looped-tmpfs-kernel-builds collapsed with lots of > > VFS: file-max limit 49778 reached > > messages, which I imagine was a side-effect of the struct file RCU > > patches you've dropped from -rc4-mm1, perhaps a grace period problem. > > Hugh, could you please try this with the experimental patch below ? I'll give it a go. But first I ought to try to reproduce the problem more easily - it happened after about two hours of a test which had run for two or more hours (before hitting the apparently unrelated problem) at least twice each on at least two machines. So until I can reproduce the problem more easily, it'll take some while to have confidence that your patch addresses it. Therefore worth a little effort at my end to cut it down (and if I can't cut it down, I may have to solve the odder locked page issue to get the original test to run long enough to hit the file-max problem with any likelihood). > Hugh, can you mail me the exact test you run ? I would like to run > it myself and see if I can reproduce it. It'll waste less of both our time if I at least try to reproduce it more quickly first. Hold on... Hugh ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities 2005-08-08 14:05 ` Fw: two 2.6.13-rc3-mm3 oddities Dipankar Sarma 2005-08-08 15:26 ` Hugh Dickins @ 2005-08-08 16:31 ` Manfred Spraul 2005-08-08 16:46 ` Dipankar Sarma 1 sibling, 1 reply; 6+ messages in thread From: Manfred Spraul @ 2005-08-08 16:31 UTC (permalink / raw) To: dipankar; +Cc: Andrew Morton, hugh, linux-kernel Dipankar Sarma wrote: >Hugh, could you please try this with the experimental patch below ? >Manfred, is it safe to decrement nr_files in file_free() >instead of the destructor ? I can't see any problem. > > > The ctor/dtor are only called when new objects are created, not on every kmem_cache_alloc/kmem_cache_free. Thus I would expect that the counter becomes negative on builds without CONFIG_DEBUG_SLAB. Thus increase in the ctor and decrease in file_free() is the wrong thing. If you want to move the decrease from the dtor to file_free, then you must move the increase, too. But: IIRC the counters were moved to the ctor/dtor for performance reasons, I'd guess mbligh ran into cache line trashing on the filp_count_lock spinlock with reaim or something like that. -- Manfred ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities 2005-08-08 16:31 ` Manfred Spraul @ 2005-08-08 16:46 ` Dipankar Sarma 2005-08-08 17:25 ` Andrew Morton 0 siblings, 1 reply; 6+ messages in thread From: Dipankar Sarma @ 2005-08-08 16:46 UTC (permalink / raw) To: Manfred Spraul; +Cc: Andrew Morton, linux-kernel, Hugh Dickins On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote: > Dipankar Sarma wrote: > > >Hugh, could you please try this with the experimental patch below ? > >Manfred, is it safe to decrement nr_files in file_free() > >instead of the destructor ? I can't see any problem. > > > > > > > The ctor/dtor are only called when new objects are created, not on every > kmem_cache_alloc/kmem_cache_free. Thus I would expect that the counter > becomes negative on builds without CONFIG_DEBUG_SLAB. > Thus increase in the ctor and decrease in file_free() is the wrong > thing. If you want to move the decrease from the dtor to file_free, then > you must move the increase, too. > But: IIRC the counters were moved to the ctor/dtor for performance > reasons, I'd guess mbligh ran into cache line trashing on the > filp_count_lock spinlock with reaim or something like that. Ah, so the whole idea was to inc/dec nr_files less often so that we reduce contention on filp_count_lock, right ? This however causes skews nr_files by the size of the slab array, AFAICS. Since we check nr_files before we allocate files from slab, the check seems inaccurate. Anyway, I guess, I need to look at scaling the file counting first. Thanks Dipankar ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities 2005-08-08 16:46 ` Dipankar Sarma @ 2005-08-08 17:25 ` Andrew Morton 2005-08-08 18:03 ` Dipankar Sarma 0 siblings, 1 reply; 6+ messages in thread From: Andrew Morton @ 2005-08-08 17:25 UTC (permalink / raw) To: dipankar; +Cc: manfred, linux-kernel, hugh Dipankar Sarma <dipankar@in.ibm.com> wrote: > > On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote: > > Dipankar Sarma wrote: > > > > >Hugh, could you please try this with the experimental patch below ? > > >Manfred, is it safe to decrement nr_files in file_free() > > >instead of the destructor ? I can't see any problem. > > > > > > > > > > > The ctor/dtor are only called when new objects are created, not on every > > kmem_cache_alloc/kmem_cache_free. Thus I would expect that the counter > > becomes negative on builds without CONFIG_DEBUG_SLAB. > > Thus increase in the ctor and decrease in file_free() is the wrong > > thing. If you want to move the decrease from the dtor to file_free, then > > you must move the increase, too. > > But: IIRC the counters were moved to the ctor/dtor for performance > > reasons, I'd guess mbligh ran into cache line trashing on the > > filp_count_lock spinlock with reaim or something like that. > > Ah, so the whole idea was to inc/dec nr_files less often so > that we reduce contention on filp_count_lock, right ? This however > causes skews nr_files by the size of the slab array, AFAICS. > Since we check nr_files before we allocate files from slab, the > check seems inaccurate. > > Anyway, I guess, I need to look at scaling the file counting > first. Something like vm_acct_memory() or percpu_counter would suit. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fw: two 2.6.13-rc3-mm3 oddities 2005-08-08 17:25 ` Andrew Morton @ 2005-08-08 18:03 ` Dipankar Sarma 0 siblings, 0 replies; 6+ messages in thread From: Dipankar Sarma @ 2005-08-08 18:03 UTC (permalink / raw) To: Andrew Morton; +Cc: manfred, linux-kernel, hugh On Mon, Aug 08, 2005 at 10:25:59AM -0700, Andrew Morton wrote: > Dipankar Sarma <dipankar@in.ibm.com> wrote: > > > > > But: IIRC the counters were moved to the ctor/dtor for performance > > > reasons, I'd guess mbligh ran into cache line trashing on the > > > filp_count_lock spinlock with reaim or something like that. > > > > Ah, so the whole idea was to inc/dec nr_files less often so > > that we reduce contention on filp_count_lock, right ? This however > > causes skews nr_files by the size of the slab array, AFAICS. > > Since we check nr_files before we allocate files from slab, the > > check seems inaccurate. > > > > Anyway, I guess, I need to look at scaling the file counting > > first. > > Something like vm_acct_memory() or percpu_counter would suit. Yes, that is what I am doing, except that because of the sysctl stuff, I now have to wallow myself in /proc code. Thanks Dipankar ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2005-08-08 22:59 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20050803095644.78b58cb4.akpm@osdl.org>
2005-08-08 14:05 ` Fw: two 2.6.13-rc3-mm3 oddities Dipankar Sarma
2005-08-08 15:26 ` Hugh Dickins
2005-08-08 16:31 ` Manfred Spraul
2005-08-08 16:46 ` Dipankar Sarma
2005-08-08 17:25 ` Andrew Morton
2005-08-08 18:03 ` Dipankar Sarma
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox