linux-nfs.vger.kernel.org archive mirror
* Debian and Netapp NFSv4 locks/owner exhaustion
@ 2012-09-14  8:29 Richard Allen
  2012-09-14 12:04 ` Jim Rees
From: Richard Allen @ 2012-09-14  8:29 UTC (permalink / raw)
  To: linux-nfs

Hi all,

We're currently having a highly intermittent (and seemingly random)
problem with our 20 Debian clients, which run Courier POP3 and IMAP
daemons against a Netapp filer over NFSv4. The lock/owner objects on
the filer are being maxed out, which in turn drives the load on the
clients very high.

Netapp think that this is a bug in the NFS implementation of our
Debian kernel. Please see part of an email they sent us below:

---
"Thanks for providing all the data. The sync core and the lock status
data provided indicate that you have a Linux client bug, equivalent to
Red Hat Bugzilla 620502/621304.

https://bugzilla.redhat.com/show_bug.cgi?id=620502  2.6.18 kernel
https://bugzilla.redhat.com/show_bug.cgi?id=621304  2.6.32 kernel.

Basically there is a known Linux kernel bug where the client requests
a new owner ID every time a new lock request is submitted. There are
upstream Linux fixes that prevent the new lock owner IDs and reuse
existing ones.

Analysis from the sync core

As previously mentioned NFSv4 has 4 items StateID/ClientID and OwnerID 
limits.

StateID
(kgdb-amd64-7.4-54) p v4lock_states_free_count
$8 = {16789, 32768}
 >>>>>>There are free states available.

ClientID
(kgdb-amd64-7.4-54) p v4lock_clients_free[0]
$10 = {{cqh_first = 0xffffff04cd2cda00, cqh_last = 0xffffff04cd4caf00}
 >>> there are free clients

OwnerID
(kgdb-amd64-7.4-54) p v4lock_owners_free[0]
$9 = {{cqh_first = 0xffffffffa25f2640, cqh_last = 0xffffffffa25f2640}
 >>>there are no free owners

(kgdb-amd64-7.4-54) print_owner_htab_count
total:10240
 >>>>total owner objects in use:10240

(kgdb-amd64-7.4-54) p v4owner_table_size/2
$527 = 10240
 >>>>per node max owners value reached


Let's correlate this with the lock status output provided.

(fed15:samuell:/x/eng/cs-data/2003058003/20120821_sync_core/mailstorage04> 
grep "Free owners" mailstorage04-lock-v*
mailstorage04-lock-v-201208210605:Free owners 3; In-Use Owners 10237 *
mailstorage04-lock-v-201208210610:Free owners -3; In-Use Owners 10243 *
mailstorage04-lock-v-201208210615:Free owners -2; In-Use Owners 10242
mailstorage04-lock-v-201208210620:Free owners 0; In-Use Owners 10240
mailstorage04-lock-v-201208210625:Free owners 3; In-Use Owners 10237
mailstorage04-lock-v-201208210630:Free owners -7; In-Use Owners 10247
mailstorage04-lock-v-201208210635:Free owners 1; In-Use Owners 10239
mailstorage04-lock-v-201208210640:Free owners -3; In-Use Owners 10243
mailstorage04-lock-v-201208210645:Free owners 4; In-Use Owners 10236
mailstorage04-lock-v-201208210650:Free owners -1; In-Use Owners 10241
mailstorage04-lock-v-201208210655:Free owners 4; In-Use Owners 10236
mailstorage04-lock-v-201208210700:Free owners -1; In-Use Owners 10241
mailstorage04-lock-v-201208210705:Free owners -4; In-Use Owners 10244

As you can see, the filer is reporting a maximum of 10240 owners, and
in some samples it was oversubscribed.

The corrective action is to make sure you use a kernel release from
your distro that has the upstream patch.

The diff can be found here
https://bugzilla.redhat.com/attachment.cgi?id=436801&action=diff


Red Hat has provided an errata fix in the kernel patch:

http://rhn.redhat.com/errata/RHSA-2011-0542.html"
---
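
For anyone wanting to reproduce the oversubscription check on their own
lock status dumps, here is a minimal sketch. It assumes the "Free owners
N; In-Use Owners M" line format shown in the quote above and the
per-node limit of 10240 that Netapp reported. The function names are my
own and are not part of any Netapp tooling.

```python
import re

# Line format assumed from the lock status output quoted above.
LINE_RE = re.compile(r"Free owners (-?\d+); In-Use Owners (\d+)")


def parse_owner_lines(lines):
    """Return (label, free, in_use) for each line matching the quoted format."""
    samples = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            # The label is the dump file name that grep prints before the colon.
            label = line.split(":", 1)[0]
            samples.append((label, int(match.group(1)), int(match.group(2))))
    return samples


def oversubscribed(samples, max_owners=10240):
    """Return the samples whose in-use count exceeds the per-node limit."""
    return [sample for sample in samples if sample[2] > max_owners]


# Three lines copied from the quoted output, used here as sample input.
sample_lines = [
    "mailstorage04-lock-v-201208210605:Free owners 3; In-Use Owners 10237 *",
    "mailstorage04-lock-v-201208210610:Free owners -3; In-Use Owners 10243 *",
    "mailstorage04-lock-v-201208210620:Free owners 0; In-Use Owners 10240",
]

samples = parse_owner_lines(sample_lines)
for label, free, in_use in oversubscribed(samples):
    print(f"{label}: {in_use} owners in use ({-free} over the limit)")
```

In every sample in the quoted output the free and in-use counts sum to
10240, so a negative "Free owners" value is simply the amount by which
the filer is over its limit.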

Some info on our platform:

Clients:
20 machines running Squeeze 6.0.1, but with the backported kernel
3.2.0-0.bpo.1-amd64.
nfs-common 1:1.2.2-4
The application is courier IMAP and POP3.

Server:
Netapp FAS3170 running 8.0.1P1 7-Mode

The main question I have is whether the bugs Netapp mentioned in the 
Redhat kernel have been fixed in the backported Debian kernel we are 
running, and if so, what version the fixes have been introduced in (and 
if not, what version the fixes *will* be introduced in)?

Otherwise, if anyone has any other suggestions as to what else the
problem could be, I'd be happy to hear them.

Thanks,

Richard



* Re: Debian and Netapp NFSv4 locks/owner exhaustion
  2012-09-14  8:29 Debian and Netapp NFSv4 locks/owner exhaustion Richard Allen
@ 2012-09-14 12:04 ` Jim Rees
From: Jim Rees @ 2012-09-14 12:04 UTC (permalink / raw)
  To: Richard Allen; +Cc: linux-nfs

Richard Allen wrote:

  The main question I have is whether the bugs Netapp mentioned in the
  Redhat kernel have been fixed in the backported Debian kernel we are
  running, and if so, what version the fixes have been introduced in
  (and if not, what version the fixes *will* be introduced in)?

The way I would do this is to install the source package for the kernel
you're running, then look to see if the patch has been applied to it.  If
not you could try applying it yourself.
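
As a rough illustration of the "look to see if the patch has been
applied" step, here is a minimal sketch of one heuristic: collect the
lines a unified diff adds and check whether they all appear in the
source file. The helper names and the toy diff are hypothetical and
are not taken from the Red Hat patch. In practice, running
`patch --dry-run` against the unpacked source is a more reliable check,
since this heuristic ignores context and line order.

```python
def added_lines(diff_text):
    """Collect the non-blank lines a unified diff adds.

    Lines starting with '+++ ' are file headers, not additions.
    """
    lines = []
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++ "):
            content = line[1:].strip()
            if content:
                lines.append(content)
    return lines


def patch_looks_applied(diff_text, source_text):
    """Heuristic: True if every added line appears somewhere in the source."""
    source = {line.strip() for line in source_text.splitlines()}
    wanted = added_lines(diff_text)
    return bool(wanted) and all(line in source for line in wanted)


# Hypothetical toy diff and sources, for illustration only.
TOY_DIFF = """\
--- a/example.c
+++ b/example.c
@@ -1,4 +1,5 @@
 int lookup(void)
 {
+    reuse_owner();
     return 0;
 }
"""

UNPATCHED = "int lookup(void)\n{\n    return 0;\n}\n"
PATCHED = "int lookup(void)\n{\n    reuse_owner();\n    return 0;\n}\n"

print(patch_looks_applied(TOY_DIFF, UNPATCHED))
print(patch_looks_applied(TOY_DIFF, PATCHED))
```

A negative result here does not prove the fix is missing, because the
distro kernel may carry a reworked version of the same change.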

