* What's the NFS OOM problem?
From: Xin Zhao @ 2006-08-08 22:24 UTC
To: linux-kernel; +Cc: linux-fsdevel

I often heard of the OOM problem in NFS, but don't know what it is.
Now I am developing an NFS-based system and found my system memory
(server side) is used too fast. I checked the code but didn't find
a memory leak. So I suspect I ran into the OOM issue.

Can someone help me and give me a brief description of the OOM issue?

Many many thanks!
-x

* Re: What's the NFS OOM problem?
From: H. Peter Anvin @ 2006-08-09 2:33 UTC
To: Xin Zhao; +Cc: linux-kernel, linux-fsdevel

Xin Zhao wrote:
> I often heard of the OOM problem in NFS, but don't know what it is.
> Now I am developing an NFS-based system and found my system memory
> (server side) is used too fast. I checked the code but didn't find
> a memory leak. So I suspect I ran into the OOM issue.
>
> Can someone help me and give me a brief description of the OOM issue?
>
> Many many thanks!

What I suspect you're talking about has to do with a network client
running out of memory and not being able to talk to the network. The
server isn't affected.

-hpa

* Re: What's the NFS OOM problem?
From: Willy Tarreau @ 2006-08-10 4:57 UTC
To: Xin Zhao; +Cc: linux-kernel, linux-fsdevel

On Tue, Aug 08, 2006 at 06:24:47PM -0400, Xin Zhao wrote:
> I often heard of the OOM problem in NFS, but don't know what it is.
> Now I am developing an NFS-based system and found my system memory
> (server side) is used too fast. I checked the code but didn't find
> a memory leak. So I suspect I ran into the OOM issue.

I simply think that your cache is filling while your clients access
a lot of files. That's expected. You might also get quite a bunch of
dentries cached which will not be accounted for in meminfo. Check
/proc/meminfo for the cache+buffer size, and check /proc/slabinfo for
the number of dentries. The usual way to ensure this is only cache is
to allocate a large amount of memory (let's say all the system RAM
provided that everything can get swapped), then free it. You'll see
a lot of free memory after that.

> Can someone help me and give me a brief description of the OOM issue?

I don't know about any OOM issue related to NFS. At most it might happen
on the client (eg: starting firefox from an NFS root) which might not have
enough memory for new network buffers, but I don't even know if it's
possible at all.

Regards,
Willy

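The /proc/meminfo part of the check Willy describes can be sketched as
follows. This is only a minimal sketch, assuming a Linux /proc/meminfo
with "Name: value kB" lines; the function names parse_meminfo and
cache_plus_buffers_kb are hypothetical, and the SAMPLE text is a
four-field excerpt used so the sketch also runs on machines without
/proc.

```python
# Hypothetical helpers: sum the page cache and buffer sizes from
# /proc/meminfo-style text. Not taken from any kernel tool.

SAMPLE = """\
MemTotal: 16154060 kB
MemFree: 17760 kB
Buffers: 16104 kB
Cached: 12956032 kB
"""


def parse_meminfo(text):
    """Parse 'Name: value kB' lines into a {name: kilobytes} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that have no 'Name:' prefix
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_plus_buffers_kb(fields):
    """Return Buffers + Cached, the memory that is (mostly) only cache."""
    return fields.get("Buffers", 0) + fields.get("Cached", 0)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the excerpt on non-Linux systems
    info = parse_meminfo(text)
    print("cache+buffers:", cache_plus_buffers_kb(info), "kB")
    print("free:", info.get("MemFree", 0), "kB")
```

If cache+buffers accounts for most of the "used" memory, the server is
probably just caching files rather than leaking.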
* Re: What's the NFS OOM problem?
From: Grant Coady @ 2006-08-10 21:53 UTC
To: Willy Tarreau; +Cc: Xin Zhao, linux-kernel, linux-fsdevel

On Thu, 10 Aug 2006 06:57:11 +0200, Willy Tarreau <w@1wt.eu> wrote:
>On Tue, Aug 08, 2006 at 06:24:47PM -0400, Xin Zhao wrote:
>> I often heard of the OOM problem in NFS, but don't know what it is.
>> Now I am developing an NFS-based system and found my system memory
>> (server side) is used too fast. I checked the code but didn't find
>> a memory leak. So I suspect I ran into the OOM issue.
>
>I simply think that your cache is filling while your clients access
>a lot of files. That's expected. You might also get quite a bunch of
>dentries cached which will not be accounted for in meminfo. Check
>/proc/meminfo for the cache+buffer size, and check /proc/slabinfo for
>the number of dentries. The usual way to ensure this is only cache is
>to allocate a large amount of memory (let's say all the system RAM
>provided that everything can get swapped), then free it. You'll see
>a lot of free memory after that.
>
>> Can someone help me and give me a brief description of the OOM issue?
>
>I don't know about any OOM issue related to NFS. At most it might happen
>on the client (eg: starting firefox from an NFS root) which might not have
>enough memory for new network buffers, but I don't even know if it's
>possible at all.

I once wrote a silly test script that put way too much work into
ksoftirqd, and the system slowed right down. It was some time ago and
I forget the details. You could see the problem by monitoring `top` on
both client and server, watching the thing choking. I didn't document
it; it seemed like a "don't do that" situation at the time.

Grant.

* Re: What's the NFS OOM problem?
From: Neil Brown @ 2006-08-11 0:33 UTC
To: Willy Tarreau; +Cc: Xin Zhao, linux-kernel, linux-fsdevel

On Thursday August 10, w@1wt.eu wrote:
>
> > Can someone help me and give me a brief description of the OOM issue?
>
> I don't know about any OOM issue related to NFS. At most it might happen
> on the client (eg: starting firefox from an NFS root) which might not have
> enough memory for new network buffers, but I don't even know if it's
> possible at all.

We've had reports of OOM problems with NFS at SuSE.
The common factors seem to be lots of memory (6G+) and very large
files.
Tuning down /proc/sys/vm/dirty_*ratio seems to avoid the problem,
but I'm not very close to understanding what the real problem is.

NeilBrown

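Before tuning the dirty ratios down it helps to know their current
values and roughly how much memory they permit. A minimal sketch,
assuming a Linux system that exposes /proc/sys/vm/dirty_ratio and
/proc/sys/vm/dirty_background_ratio; the helper names are hypothetical,
and applying the percentage to total memory is a simplification, since
the base the kernel actually uses varies between versions.

```python
from pathlib import Path

VM_DIR = Path("/proc/sys/vm")  # standard location of the vm sysctls


def read_ratio(name, vm_dir=VM_DIR):
    """Return the integer percentage stored in vm_dir/name, or None."""
    try:
        return int((vm_dir / name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None  # file missing, unreadable, or not a number


def dirty_threshold_kb(mem_total_kb, ratio_percent):
    """Approximate dirty limit: ratio_percent of total memory, in kB.

    Simplification: the kernel does not necessarily use MemTotal as
    the base for this percentage.
    """
    return mem_total_kb * ratio_percent // 100


if __name__ == "__main__":
    for name in ("dirty_ratio", "dirty_background_ratio"):
        print(name, "=", read_ratio(name))
    # Example: 16,000,000 kB of RAM at a ratio of 40.
    print("approx. limit:", dirty_threshold_kb(16_000_000, 40), "kB")
```

Lowering a ratio is done by writing a smaller number to the same file
as root; the sketch deliberately only reads.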
* Re: What's the NFS OOM problem?
From: Willy Tarreau @ 2006-08-11 3:57 UTC
To: Neil Brown; +Cc: Xin Zhao, linux-kernel, linux-fsdevel

On Fri, Aug 11, 2006 at 10:33:32AM +1000, Neil Brown wrote:
> On Thursday August 10, w@1wt.eu wrote:
> >
> > > Can someone help me and give me a brief description of the OOM issue?
> >
> > I don't know about any OOM issue related to NFS. At most it might happen
> > on the client (eg: starting firefox from an NFS root) which might not have
> > enough memory for new network buffers, but I don't even know if it's
> > possible at all.
>
> We've had reports of OOM problems with NFS at SuSE.
> The common factors seem to be lots of memory (6G+) and very large
> files.

Just out of curiosity, does it happen on 32bit or 64bit machines
(or both)?

> Tuning down /proc/sys/vm/dirty_*ratio seems to avoid the problem,
> but I'm not very close to understanding what the real problem is.

The most important is to be aware of it ;-)

> NeilBrown

Thanks for the info,
Willy

* Re: What's the NFS OOM problem?
From: Neil Brown @ 2006-08-11 4:24 UTC
To: Willy Tarreau; +Cc: Xin Zhao, linux-kernel, linux-fsdevel

On Friday August 11, w@1wt.eu wrote:
> On Fri, Aug 11, 2006 at 10:33:32AM +1000, Neil Brown wrote:
> > We've had reports of OOM problems with NFS at SuSE.
> > The common factors seem to be lots of memory (6G+) and very large
> > files.
>
> Just out of curiosity, does it happen on 32bit or 64bit machines
> (or both)?

Both.
If it was just 32bit I'd be blaming highmem in a flash. But it's not
that easy :-(

NeilBrown

* Re: What's the NFS OOM problem?
From: Peter Zijlstra @ 2006-08-11 8:48 UTC
To: Neil Brown; +Cc: Willy Tarreau, Xin Zhao, linux-kernel, linux-fsdevel

On Fri, 2006-08-11 at 10:33 +1000, Neil Brown wrote:
> On Thursday August 10, w@1wt.eu wrote:
> >
> > > Can someone help me and give me a brief description of the OOM issue?
> >
> > I don't know about any OOM issue related to NFS. At most it might happen
> > on the client (eg: starting firefox from an NFS root) which might not have
> > enough memory for new network buffers, but I don't even know if it's
> > possible at all.
>
> We've had reports of OOM problems with NFS at SuSE.
> The common factors seem to be lots of memory (6G+) and very large
> files.
> Tuning down /proc/sys/vm/dirty_*ratio seems to avoid the problem,
> but I'm not very close to understanding what the real problem is.

Would it not be related to mmap'ed files, where the client will not
properly track the dirty pages? This will make the reclaim code go crap
itself because suddenly not a single page is easily freeable anymore,
all pages are then found to be dirty and require writeback, which takes
more memory - ie. allocate network packets, and wait for proper answer.

Andrew is currently carrying some patches that will avoid this problem
by virtue of tracking dirtying of mmap'ed pages. With these patches
nr_dirty is properly incremented and the pdflush logic should kick in
and do its thing.

This would explain why lowering dirty_*ratio would sometimes help: that
would kick off the pdflush thread earlier, which would then detect the
previously unknown dirty pages.

* Re: What's the NFS OOM problem?
From: Neil Brown @ 2006-08-14 2:03 UTC
To: Peter Zijlstra; +Cc: Willy Tarreau, Xin Zhao, linux-kernel, linux-fsdevel

On Friday August 11, a.p.zijlstra@chello.nl wrote:
> On Fri, 2006-08-11 at 10:33 +1000, Neil Brown wrote:
> > On Thursday August 10, w@1wt.eu wrote:
> > >
> > > > Can someone help me and give me a brief description of the OOM issue?
> > >
> > > I don't know about any OOM issue related to NFS. At most it might happen
> > > on the client (eg: starting firefox from an NFS root) which might not have
> > > enough memory for new network buffers, but I don't even know if it's
> > > possible at all.
> >
> > We've had reports of OOM problems with NFS at SuSE.
> > The common factors seem to be lots of memory (6G+) and very large
> > files.
> > Tuning down /proc/sys/vm/dirty_*ratio seems to avoid the problem,
> > but I'm not very close to understanding what the real problem is.
>
> Would it not be related to mmap'ed files, where the client will not
> properly track the dirty pages? This will make the reclaim code go crap
> itself because suddenly not a single page is easily freeable anymore,
> all pages are then found to be dirty and require writeback, which takes
> more memory - ie. allocate network packets, and wait for proper answer.

I don't think mmap is being used, but I've asked for confirmation just
to be sure - thanks for the tip.

I have a reconstructed "/proc/meminfo" collected out of a crash-dump.
(Note that this is from a 2.6.5 based kernel, though the same symptom
has been reported on a 2.6.16 based kernel.) It is below, plus the
'Unstable' number (which 2.6.5 doesn't show normally).

It seems that 10Gig of the 16Gig is in 'writeback'.
It is my understanding that pages shouldn't stay in 'writeback' for
very long. They should get written and then (for nfs) moved to
Unstable.
The fact that 'Dirty' is zero suggests that there weren't any malloc
failures in starting writeback, so I don't think the system is actually
OOM at this point (MemFree is 17Meg which is 1/1000th of total ram, but
some thousands of pages).

But the machine has nevertheless seized up.

I'm thinking that the very large number of 'writeback' pages on the
inactive list is slowing down shrink_list and associated functions and
nothing is progressing very fast.

But I wonder why 'writeback' was allowed to get so high, and why it
stays so high.

Looking at balance_dirty_pages again, I see that it only really worries
about the number of dirty pages. i.e. once enough pages have been
written, it breaks out of the loop, even if there are heaps and heaps
of writeback pages....

So on this 16Gig machine with dirty_ratio at the default of '40', we
happily let 6.4Gig get dirty and then start writing it out in
balance_dirty_pages. It will then flush out 6Meg for every 4 Meg that
is written.
While nr_writeback stays high balance_dirty will keep flushing until
nr_dirty hits zero. Then it will just flush out all dirty pages every
time it is called, thus keeping nr_writeback high.
There should be a 100msec pause each time balance_dirty_pages is called
at this stage. It is called for every 4Meg of data, which would take a
lot longer than 100msec to go out via NFS....

Hmmm.. maybe balance_dirty_pages should wait for nr_writeback to drop
sometimes.
Currently it has to write at least sync_writeback_pages(). If it
cannot find that many to write, it stops. Maybe if it cannot find that
many, it should wait for nr_writeback to drop by the corresponding
number.
That would mean that if nr_pagewriteback got out-of-hand, writes would
be throttled until it came back in line.

But there is more to the story... (I hope you don't mind me rambling
on like this. It helps to have someone to explain the problem to).

When Writeback is really high, nfs doesn't make (much) progress in
getting it down again.
Apparently rpciod is using lots of CPU time and not sending many
packets on the network...

The crash dump shows rpciod and lots of other processes as Runnable,
with very simple stack-tops (I don't have full details on hand) so they
are probably all trying to get some free memory and so going very
slowly (because the inactive list is so long with very little usable on
it).
rpciod calls rpc_malloc which uses a mempool to avoid starvation, but
that doesn't avoid incredible slowness.

So here is my understanding of the problem that I am seeing:

 1/ balance_dirty_pages will allow nr_writeback to grow without bound.
    While it does work to decrease the number of dirty pages, it does
    nothing about decreasing the number of writeback pages. It should
    (in some situations) wait for the number to decrease
    (blk_congestion_wait isn't strong enough by itself).

 2/ When there is a very large number of Writeback pages on the
    inactive list, memory reclaim can go very slowly.
    Maybe Writeback pages shouldn't be on the inactive list?
    Either that or we need a strong limit on the number of Writeback
    pages.

Comments/corrections very welcome. I'll see if I can find a way to
verify any of this with the customer...

Thanks for listening,
NeilBrown


MemTotal:     16154060 kB
MemFree:         17760 kB
Buffers:         16104 kB
Cached:       12956032 kB
SwapCached:       1224 kB
Active:             52 kB
Inactive:     12972740 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     16154060 kB
LowFree:         17760 kB
SwapTotal:    25173844 kB
SwapFree:     25119512 kB
Dirty:               0 kB
Writeback:    10999176 kB
Mapped:           4316 kB
Slab:          3135240 kB
Committed_AS:   112984 kB
PageTables:       1436 kB
VmallocTotal: 536870911 kB
VmallocUsed:     12828 kB
VmallocChunk: 536858083 kB
Unstable       1326884 kB

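The first point of the analysis above (a throttle that watches only the
dirty count lets writeback grow without bound) can be illustrated with
a toy model. This is a sketch under stated assumptions, not the
kernel's balance_dirty_pages: the function simulate, its parameters and
the page counts are all hypothetical, and time advances in abstract
ticks.

```python
def simulate(ticks, limit, produce, drain, wait_on_writeback):
    """Return the peak number of pages under writeback (toy model).

    Each tick:
      1. the writer dirties `produce` pages unless it is throttled;
      2. once the throttle's backlog reaches `limit`, all dirty pages
         are submitted and become writeback pages;
      3. the slow network completes `drain` writeback pages.

    With wait_on_writeback=False the backlog counts dirty pages only,
    which mirrors the behaviour described in the message. With True it
    counts dirty plus writeback pages, the suggested fix.
    """
    dirty = writeback = peak = 0
    for _ in range(ticks):
        backlog = dirty + writeback if wait_on_writeback else dirty
        if backlog < limit:
            dirty += produce  # writer is allowed to run this tick
        backlog = dirty + writeback if wait_on_writeback else dirty
        if backlog >= limit:
            writeback += dirty  # submit every dirty page for writeback
            dirty = 0
        peak = max(peak, writeback)
        writeback = max(0, writeback - drain)  # network completes I/O
    return peak


if __name__ == "__main__":
    # Writer produces 10 pages per tick; the network drains only 2.
    for wait in (False, True):
        peak = simulate(ticks=1000, limit=100, produce=10, drain=2,
                        wait_on_writeback=wait)
        print("wait_on_writeback =", wait, "-> peak writeback:", peak)
```

In this model the dirty-only throttle never blocks the writer, so the
writeback count keeps climbing for as long as the writer outpaces the
network, while counting writeback pages in the backlog keeps the peak
near the limit.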
* Re: What's the NFS OOM problem?
From: Roger Heflin @ 2006-08-15 18:24 UTC
To: Neil Brown; +Cc: Willy Tarreau, Xin Zhao, linux-kernel, linux-fsdevel

Neil Brown wrote:
> On Thursday August 10, w@1wt.eu wrote:
>>> Can someone help me and give me a brief description of the OOM issue?
>> I don't know about any OOM issue related to NFS. At most it might happen
>> on the client (eg: starting firefox from an NFS root) which might not have
>> enough memory for new network buffers, but I don't even know if it's
>> possible at all.
>
> We've had reports of OOM problems with NFS at SuSE.
> The common factors seem to be lots of memory (6G+) and very large
> files.
> Tuning down /proc/sys/vm/dirty_*ratio seems to avoid the problem,
> but I'm not very close to understanding what the real problem is.
>
> NeilBrown

I have noticed on SLES kernels that when the dirty_*ratios are turned
down it still uses a lot more memory than it should for writeback
buffers. It makes me think that with the default setting of 40% it may
for some reason be using all of memory and deadlocking. It does not
seem like an NFS-only issue, as I believe I have duplicated it with a
fast lock setup.

Checking writeback in /proc/meminfo does indicate that a lot more
memory is being used for write cache than should be.

Roger

* Re: What's the NFS OOM problem?
From: Neil Brown @ 2006-08-17 5:04 UTC
To: Roger Heflin; +Cc: Willy Tarreau, Xin Zhao, linux-kernel, linux-fsdevel

On Tuesday August 15, rheflin@atipa.com wrote:
>
> I have noticed on SLES kernels that when the dirty_*ratios are turned
> down it still uses a lot more memory than it should for writeback
> buffers. It makes me think that with the default setting of 40% it may
> for some reason be using all of memory and deadlocking. It does not
> seem like an NFS-only issue, as I believe I have duplicated it with a
> fast lock setup.

We seem to have a little patch in SuSE kernels that might be making
the problem worse .... though I presume it was introduced for a
reason. I haven't managed to track what that reason was yet.

What is "a fast lock setup"?? I don't understand.

NeilBrown

* Re: What's the NFS OOM problem?
From: Roger Heflin @ 2006-08-17 13:29 UTC
To: Neil Brown; +Cc: Willy Tarreau, Xin Zhao, linux-kernel, linux-fsdevel

Neil Brown wrote:
> On Tuesday August 15, rheflin@atipa.com wrote:
>> I have noticed on SLES kernels that when the dirty_*ratios are turned
>> down it still uses a lot more memory than it should for writeback
>> buffers. It makes me think that with the default setting of 40% it may
>> for some reason be using all of memory and deadlocking. It does not
>> seem like an NFS-only issue, as I believe I have duplicated it with a
>> fast lock setup.
>
> We seem to have a little patch in SuSE kernels that might be making
> the problem worse .... though I presume it was introduced for a
> reason. I haven't managed to track what that reason was yet.
>
> What is "a fast lock setup"?? I don't understand.
>
> NeilBrown

I am not sure what I meant. I may have meant a fast disk setup, and
thought or typed the wrong thing.

The machine I duplicated it with had disks that would sustain
175MB/second (3 striped), 4 CPUs and 32GB of local RAM. The
2 CPU/4GB/100MB/second machine does not seem to have the issue.

Both machines are Opterons. I believe I duplicated it under SP2; I know
I duplicated it under SP3 and one of the post-SP3 kernels. It did not
occur under SP1. Turning down the dirty*ratios seems to make it go
away. When I get a chance I will retest on SP2 and see if it happens
there.

I do know (and this may be related) that if on a 32GB machine I
pagelock a large portion of RAM (say 28GB), that machine will deadlock
under high IO. The basic symptoms are similar to the writeback issue:
the machine responds to ping/sysrq, but logins fail, and any new
process creation fails.

Roger
