From: Christoph Martin <martin@uni-mainz.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Andy Adamson <andros@netapp.com>,
Markus Tacke <tacke@uni-mainz.de>, <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] make nfsd_drc_max_mem configurable
Date: Mon, 6 Jul 2015 14:59:43 +0200 [thread overview]
Message-ID: <559A7BBF.2080201@uni-mainz.de> (raw)
In-Reply-To: <20150618161657.GC10305@fieldses.org>
Hi Bruce,
On 18.06.2015 18:16, J. Bruce Fields wrote:
>
>> The first time, we wanted to set up an NFS server for our HPC cluster.
>> We were wondering why we were only able to mount the filesystem on 380
>> of our ~700 nodes. It took us a long time to find out that we were
>> hitting the limit of the NFSv4.1 session cache. Since this machine had
>> 12G of RAM, the kernel reserved 12M for the cache, which results in 384
>> slots of 32k each:
>>
>> echo $(((12582912>>10)/32))
>> 384
>
> So each client is using 32k?
#define NFSD_SLOT_CACHE_SIZE 2048
/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32
#define NFSD_MAX_MEM_PER_SESSION \
(NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE)
So this would be 64k. Maybe I missed a factor of 2 somewhere. But the
calculation above matches what we observed in our tests.
>
> Might be interesting to take a look at the CREATE_SESSION call and reply
> in wireshark (especially the values of maxresponsesize_cached and
> maxrequests)--there might also be defaults there that need tweaking.
>
>> We patched the Red Hat 7 kernel to change NFSD_DRC_SIZE_SHIFT
>> from 10 to 7 to fix this problem.
>>
>> The second time, we installed a small Debian VM with 1G of RAM to act
>> as an NFSv4 referral server for the home and group directories on our
>> campus. Since the server only does NFS referrals, it does not really
>> need more memory than the 1G. But it could only serve about 30 clients
>> with this limitation of the session cache.
>>
>> I think it would be a good idea to make the amount of memory
>> configurable in nfsd. So I wrote this small patch to make drc_size
>> configurable when loading the nfsd kernel module.
>>
>> The patch uses the old value computed from NFSD_DRC_SIZE_SHIFT as the
>> lower limit. If the drc_size parameter passed to nfsd is higher than
>> roughly 1/1000 of the RAM, that value will be used.
>>
>> One might consider making NFSD_DRC_SIZE_SHIFT even higher to use less
>> memory in situations where it is not needed. I did not implement an
>> upper limit, but it might be important to have one.
>>
>> Please consider including this patch in the nfsd code.
>
> Looks good,
As far as I understand the code now, it would even be possible to change
the value of nfsd_drc_max_mem while nfsd is running, since the value is
only used in nfsd4_get_drc_mem in nfs4state.c. I don't see the limit
being enforced on the slab itself; it seems to apply only to the
accounted local usage of the slab.
> my one concern is that this covers only the size of the 4.1
> session cache. We may need to add some more limits in the future and
> might not want to require separate configuration of each limit.
>
> Maybe one or two more generic size parameters would be more useful?
> Like:
>
> - Maximum memory to devote to knfsd
> - Maximum memory to devote to a single client
>
We were discussing this and don't think that it is a good idea.
If you have only one limit per knfsd or per client, you don't know how
much memory you have to assign to the different memory pools, like the
DRC or others, because you can't know in advance whether the NFS server
will be used only for, say, NFSv3, NFSv4, NFSv4.1, or a mixture.
So if you think a global limit is necessary, you also need tunables for
distributing the available memory among the different protocols.
I found the following calls to kmem_cache_create:
nfs4state.c:2635: openowner_slab = kmem_cache_create("nfsd4_openowners",
nfs4state.c:2639: lockowner_slab = kmem_cache_create("nfsd4_lockowners",
nfs4state.c:2643: file_slab = kmem_cache_create("nfsd4_files",
nfs4state.c:2647: stateid_slab = kmem_cache_create("nfsd4_stateids",
nfs4state.c:2651: deleg_slab = kmem_cache_create("nfsd4_delegations",
nfscache.c:168: drc_slab = kmem_cache_create("nfsd_drc", sizeof(struct svc_cacherep),
At first glance, they all use different methods to manage this memory,
if they manage it at all.
Yours
Christoph
Thread overview: 4+ messages
2015-06-17 12:48 [PATCH] make nfsd_drc_max_mem configurable Christoph Martin
2015-06-18 16:16 ` J. Bruce Fields
2015-07-06 12:59 ` Christoph Martin [this message]
2015-08-20 9:30 ` Christoph Martin