* NFS client (3.0.0-3.2.8) high system CPU
@ 2012-03-01 0:52 Steven Wilton
2012-03-01 14:00 ` Myklebust, Trond
0 siblings, 1 reply; 2+ messages in thread
From: Steven Wilton @ 2012-03-01 0:52 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org
Hi,
I've been trying to track down an issue we've started seeing on a bunch of NFS clients, where the percentage of CPU time spent in the system (kernel) has been increasing over the past month (it was previously roughly equal system/user time, while the system time is now around 3x the user time). The information that I have is as follows:
- After rebooting the server, the user/system CPU times look good (roughly equal)
- After 24 hours of heavy activity the system CPU time increases to around 3-4x the user CPU time
- If I run "echo 2 > /proc/sys/vm/drop_caches", the system CPU time drops back down to roughly the same as the user time (see the example below)
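For reference, the workaround amounts to something like the following; vmstat is just one way to watch the system-time column, and "echo 2" frees only the reclaimable dentry and inode caches (1 = page cache, 3 = both):

  vmstat 1 5                          # watch the "sy" (system) column before
  sync                                # flush dirty data first so reclaim is cheap
  echo 2 > /proc/sys/vm/drop_caches   # free reclaimable dentries and inodes
  vmstat 1 5                          # system time drops back toward user time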
The main difference that I can see in slabtop between a system running at high load and one at "normal" load is the number of nfs_inode_cache objects (as shown below). I tried increasing the ihash_entries and dhash_entries kernel parameters, but this did not fix the problem. I have not found any other suggestions on how to resolve issues caused by a large NFS inode cache.
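In case it helps anyone reproduce the comparison, the nfs_inode_cache figures below can also be tracked non-interactively; this is just one way to do it, and the hash table sizes in the boot-parameter example are illustrative only:

  # one-shot view of the NFS inode slab (name, active objects, total objects)
  grep nfs_inode_cache /proc/slabinfo | awk '{print $1, $2, $3}'
  # or
  slabtop -o | grep nfs_inode_cache

  # the hash table sizing mentioned above was done via kernel boot parameters, e.g.
  #   ihash_entries=2097152 dhash_entries=2097152
  # appended to the kernel command line in the boot loader (sizes are examples only)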
I have tried various kernels between 3.0.0 and 3.2.4, and the machines are currently running a 3.0.22 kernel. The machines have 8GB RAM, with 3 NFSv4 mounts and one NFSv3 mount; the majority of the files they access are on one of the NFSv4 mount points, a maildir-style mail spool.
I have increased /proc/sys/vm/vfs_cache_pressure to 10000, which has resolved the problem for now. However, I believe the reason we started seeing the issue is that we added a lot of extra users to the system, resulting in each client accessing a larger number of files. I am not confident that future growth will stay below whatever threshold we exceeded to cause the excessive system CPU load, since the problem seemed to appear at around 1,000,000 nfs_inode_cache entries in slabtop, and the NFS clients are currently floating between 500,000 and 900,000 nfs_inode_cache entries.
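For completeness, the vfs_cache_pressure change is just the standard sysctl (the kernel default is 100; values above 100 make the VM reclaim dentry/inode caches more aggressively), and something along these lines makes it persistent:

  # apply immediately
  sysctl -w vm.vfs_cache_pressure=10000
  # (equivalent to: echo 10000 > /proc/sys/vm/vfs_cache_pressure)

  # persist across reboots, e.g. in /etc/sysctl.conf or /etc/sysctl.d/*.conf
  vm.vfs_cache_pressure = 10000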
Help please :), and please let me know if I can provide any more information to assist in debugging.
Regards
Steven
Running slabtop on a system with high load looks like this:
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
2266249 1645904 72% 0.06K 38411 59 153644K size-64
1236390 1218245 98% 1.02K 412130 3 1648520K nfs_inode_cache
569180 569180 100% 0.19K 28459 20 113836K dentry
190512 184467 96% 0.03K 1701 112 6804K size-32
159026 83040 52% 0.10K 4298 37 17192K buffer_head
95520 76484 80% 0.12K 3184 30 12736K size-128
78001 63551 81% 0.55K 11143 7 44572K radix_tree_node
76956 62898 81% 0.17K 3498 22 13992K vm_area_struct
74690 58327 78% 0.05K 970 77 3880K anon_vma_chain
45680 36644 80% 0.19K 2284 20 9136K size-192
33571 27043 80% 0.06K 569 59 2276K anon_vma
15980 11500 71% 0.19K 799 20 3196K filp
12048 12046 99% 0.08K 251 48 1004K sysfs_dir_cache
12024 7628 63% 0.64K 2004 6 8016K proc_inode_cache
11856 6964 58% 0.30K 912 13 3648K nf_conntrack_ffffffff817fdd80
5745 5293 92% 0.79K 1149 5 4596K ext3_inode_cache
5385 3334 61% 0.25K 359 15 1436K tw_sock_TCP
5250 4540 86% 0.25K 350 15 1400K ip_dst_cache
4956 2259 45% 0.06K 84 59 336K tcp_bind_bucket
4180 3637 87% 0.19K 209 20 836K inet_peer_cache
3975 3909 98% 0.07K 75 53 300K Acpi-Operand
2852 2808 98% 0.04K 31 92 124K Acpi-Namespace
2600 1969 75% 0.19K 130 20 520K cred_jar
2385 1844 77% 0.07K 45 53 180K eventpoll_pwq
2370 1847 77% 0.12K 79 30 316K eventpoll_epi
1980 1562 78% 1.00K 495 4 1980K size-1024
Running slabtop on a system after dropping the caches looks like this:
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
298953 80291 26% 0.06K 5067 59 20268K size-64
134976 101836 75% 0.10K 3648 37 14592K buffer_head
77418 61207 79% 0.17K 3519 22 14076K vm_area_struct
74382 57858 77% 0.05K 966 77 3864K anon_vma_chain
69690 24347 34% 0.12K 2323 30 9292K size-128
50624 14700 29% 0.03K 452 112 1808K size-32
47800 47800 100% 0.19K 2390 20 9560K dentry
36834 36834 100% 1.02K 12278 3 49112K nfs_inode_cache
33394 26657 79% 0.06K 566 59 2264K anon_vma
21665 7895 36% 0.55K 3095 7 12380K radix_tree_node
19820 11786 59% 0.19K 991 20 3964K size-192
15040 10977 72% 0.19K 752 20 3008K filp
12883 7901 61% 0.30K 991 13 3964K nf_conntrack_ffffffff817fdd80
12096 12052 99% 0.08K 252 48 1008K sysfs_dir_cache
5250 4043 77% 0.25K 350 15 1400K ip_dst_cache
5115 3150 61% 0.25K 341 15 1364K tw_sock_TCP
4130 2219 53% 0.06K 70 59 280K tcp_bind_bucket
4120 3481 84% 0.19K 206 20 824K inet_peer_cache
3990 3987 99% 0.79K 798 5 3192K ext3_inode_cache
3975 3909 98% 0.07K 75 53 300K Acpi-Operand
2852 2808 98% 0.04K 31 92 124K Acpi-Namespace
2740 1849 67% 0.19K 137 20 548K cred_jar
2332 1784 76% 0.07K 44 53 176K eventpoll_pwq
2310 1765 76% 0.12K 77 30 308K eventpoll_epi
2055 1181 57% 0.25K 137 15 548K skbuff_head_cache
1986 1899 95% 0.64K 331 6 1324K proc_inode_cache
* Re: NFS client (3.0.0-3.2.8) high system CPU
2012-03-01 0:52 NFS client (3.0.0-3.2.8) high system CPU Steven Wilton
@ 2012-03-01 14:00 ` Myklebust, Trond
0 siblings, 0 replies; 2+ messages in thread
From: Myklebust, Trond @ 2012-03-01 14:00 UTC (permalink / raw)
To: Steven Wilton; +Cc: linux-nfs@vger.kernel.org
On Thu, 2012-03-01 at 08:52 +0800, Steven Wilton wrote:
> I've been trying to track down an issue we've started seeing on a bunch
> of NFS clients where the percentage of CPU time spent by the system has
> increased for the past month
[...]
> Help please :), and please let me know if I can provide any more
> information to assist in debugging.

This looks like more of a problem for the memory management folks rather
than the NFS team. The NFS client has very limited control over the
caching of inodes.

One thing that you might try doing is turning off readdirplus (using the
-o nordirplus mount option) and seeing if that causes fewer inodes to be
created in the first place.

Cheers
  Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
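A sketch of how the nordirplus suggestion would be applied, assuming a hypothetical server export and mount point; nfs(5) documents nordirplus as an NFSv2/v3 option that disables READDIRPLUS, and the filesystem has to be remounted for it to take effect:

  # /etc/fstab entry (server export and mount point are placeholders)
  nfsserver:/export/mail  /var/mail  nfs  vers=3,nordirplus,rw  0 0

  # or directly on the command line
  mount -t nfs -o vers=3,nordirplus nfsserver:/export/mail /var/mail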